Patent classifications
G06V10/85
Efficient black box adversarial attacks exploiting input data structure
Markov random field parameters are identified for use in covariance modeling of the correlation between gradient terms of a loss function of a neural network classifier. A subset of images is sampled, from a dataset of images, according to a normal distribution to estimate the gradient terms. Black-box gradient estimation is used to infer values of the parameters of the Markov random field according to the sampling. Fourier basis vectors are generated from the inferred values. An original image is perturbed using the Fourier basis vectors to obtain loss function values. An estimate of a gradient is obtained from the loss function values. An image perturbation is created using the estimated gradient. The image perturbation is added to an original input to generate a candidate adversarial input that maximizes loss in identifying the image by the classifier. The neural network classifier is queried to determine a classifier prediction for the candidate adversarial input.
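The perturb, estimate, and step sequence in this abstract can be illustrated with a minimal sketch. It makes several assumptions that are not in the abstract: the image is flattened to a 1-D vector with values in [0, 1], a fixed low-frequency cosine (DCT-II) basis stands in for the Fourier basis derived from the inferred Markov random field parameters, central finite differences are used for the estimate, and the update is a sign step. The function names `fourier_basis`, `estimate_gradient`, and `candidate_adversarial` are hypothetical.

```python
import numpy as np

def fourier_basis(n, k):
    """Return k orthonormal low-frequency cosine (DCT-II) vectors of length n.

    Assumption: a fixed basis, not one generated from inferred MRF parameters.
    """
    i = np.arange(n)
    rows = []
    for f in range(k):
        v = np.cos(np.pi * (i + 0.5) * f / n)
        rows.append(v / np.linalg.norm(v))
    return np.array(rows)

def estimate_gradient(loss_fn, x, basis, eps=1e-3):
    """Estimate the gradient of a black-box loss from loss values only.

    Each basis vector costs two queries (central difference); the result is
    the projection of the true gradient onto the span of the basis.
    """
    grad = np.zeros_like(x, dtype=float)
    for v in basis:
        directional = (loss_fn(x + eps * v) - loss_fn(x - eps * v)) / (2 * eps)
        grad += directional * v
    return grad

def candidate_adversarial(x, grad, step=0.05):
    """Add a sign-step perturbation that increases the loss, then clip to [0, 1]."""
    return np.clip(x + step * np.sign(grad), 0.0, 1.0)
```

In this sketch, `loss_fn` would wrap queries to the classifier; the candidate returned by `candidate_adversarial` is what would then be submitted to the classifier for a prediction.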
UNCERTAINTY-AWARE DEEP REINFORCEMENT LEARNING FOR ANATOMICAL LANDMARK DETECTION IN MEDICAL IMAGES
Described are techniques for uncertainty-aware anatomical landmark detection, using, for example, a deep reinforcement learning (DRL) anatomical landmark detection agent. For instance, a process can include generating one or more image features for an input medical image using a first sub-network of the anatomical landmark detection agent. A softmax layer of a second sub-network of the anatomical landmark detection agent can generate a plurality of discrete Q-value distributions for a set of allowable actions associated with movement of the agent within the medical image. An anatomical landmark location within the medical image can be predicted using the discrete Q-value distributions. An uncertainty can be determined for the predicted anatomical landmark location, based on an average full width half maximum (FWHM) calculated for the plurality of discrete Q-value distributions.
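As a rough illustration of the uncertainty measure, the sketch below computes a full width at half maximum for each discrete Q-value distribution and averages the widths across actions. The abstract does not say how the width is measured on a discrete distribution; here it is assumed to be the distance between the first and last support points whose probability is at least half the peak, with no interpolation. The names `fwhm` and `landmark_uncertainty` are hypothetical.

```python
import numpy as np

def fwhm(probs, support):
    """Width of the region where a discrete distribution is >= half its peak.

    Assumption: no interpolation between support points.
    """
    probs = np.asarray(probs, dtype=float)
    support = np.asarray(support, dtype=float)
    half_max = probs.max() / 2.0
    above = np.nonzero(probs >= half_max)[0]
    return float(support[above[-1]] - support[above[0]])

def landmark_uncertainty(q_dists, support):
    """Average the FWHM over the Q-value distributions (one per allowable action)."""
    return float(np.mean([fwhm(p, support) for p in q_dists]))
```

Under this definition a sharply peaked distribution gives a width of zero, and a flat distribution gives the full extent of the support, so a larger average indicates a less certain prediction.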
SYSTEM AND METHOD FOR ENABLING ROBOT TO PERCEIVE AND DETECT SOCIALLY INTERACTING GROUPS
This disclosure relates to a system and method for enabling a robot to perceive and detect socially interacting groups. Known systems have limited accuracy because they rely on rule-driven methods, and the few data-driven learning methods lack datasets with varied lighting, occlusion, and background conditions. The disclosed method and system detect the formation of a social group of people, or f-formation, in real time in a given scene. The system also detects outliers in the process, i.e., people who are visible but not part of the interacting group. This plays a key role in correct f-formation detection in a real-life crowded environment. Additionally, when a collocated robot plans to join the group, it has to determine a pose for itself along with detecting the formation. Thus, the system provides an approach angle for the robot, which helps it determine its final pose in a socially acceptable manner.
Systems and methods for a title quality scoring framework
A system including one or more processors and one or more non-transitory computer-readable media storing computing instructions configured to run on the one or more processors and perform receiving a title of an item associated with an online catalog; interpreting, using a natural language model, one or more attributes of a predetermined set of attributes; determining a first title quality score for the title based on a first rule; determining a second title quality score for the title based on a second rule; determining an aggregated title quality score for the title based on at least the first title quality score and the second title quality score; generating a content quality list for the title; and sending instructions to display, on a user interface of an electronic device, a content quality dashboard comprising the content quality list for the title of the item. Other embodiments are disclosed.
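The two-rule scoring and aggregation can be sketched as follows. The abstract does not specify the rules or the aggregation, so everything here is an assumption for illustration: a length rule, an attribute-coverage rule that uses plain substring matching in place of the natural language model, and a weighted mean as the aggregate. All function names and thresholds are hypothetical.

```python
def length_score(title, lo=20, hi=80):
    """Hypothetical first rule: 1.0 if the title length is within [lo, hi], else 0.0."""
    return 1.0 if lo <= len(title) <= hi else 0.0

def attribute_score(title, required):
    """Hypothetical second rule: fraction of required attributes found in the title.

    Assumption: case-insensitive substring matching replaces the
    natural language model mentioned in the abstract.
    """
    if not required:
        return 1.0
    lowered = title.lower()
    found = sum(1 for attr in required if attr.lower() in lowered)
    return found / len(required)

def aggregate(scores, weights=None):
    """Weighted mean of the per-rule scores (equal weights by default)."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)
```

A content quality list could then be built from whichever rules scored below 1.0, though the abstract does not describe its contents.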
Method and apparatus for defining a storyline based on path probabilities
A method, apparatus and computer-readable storage medium are provided to define a storyline based on path probabilities for a plurality of paths through the frames of a video. In the method, for a plurality of frames of a video, regions of a first frame and regions of a second, subsequent frame that have been viewed are identified. For each of at least one first-frame region of one or more regions of the first frame, the method determines a transition probability of transitioning from a respective first-frame region of the first frame to each of at least one second-frame region of a plurality of regions of the second frame. Based on the transition probabilities, the method determines a path probability for each of at least one of a plurality of paths through the frames of the video. The method additionally defines a storyline based on the path probabilities.
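One plausible reading of the path-probability step is that a path's probability is the product of the transition probabilities along it, and the storyline is the most probable path. The sketch below assumes exactly that, enumerates every path by brute force, and ignores any prior over the starting region; the abstract confirms none of these choices. The names `path_probabilities` and `storyline` are hypothetical.

```python
from itertools import product

def path_probabilities(transitions):
    """Map each path (one region index per frame) to its probability.

    transitions[t][i][j] is the probability of moving from region i of
    frame t to region j of frame t + 1. Assumption: a path's probability
    is the product of its transition probabilities.
    """
    sizes = [len(transitions[0])] + [len(T[0]) for T in transitions]
    probs = {}
    for path in product(*(range(s) for s in sizes)):
        p = 1.0
        for t, T in enumerate(transitions):
            p *= T[path[t]][path[t + 1]]
        probs[path] = p
    return probs

def storyline(transitions):
    """Return the most probable path through the frames."""
    probs = path_probabilities(transitions)
    return max(probs, key=probs.get)
```

Brute-force enumeration grows exponentially with the number of frames; a dynamic-programming search such as Viterbi would find the same maximum more cheaply for long videos.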
AUTOMATICALLY CLASSIFYING ANIMAL BEHAVIOR
Systems and methods are disclosed to objectively identify sub-second behavioral modules in the three-dimensional (3D) video data that represents the motion of a subject. Defining behavioral modules based upon structure in the 3D video data itself—rather than using a priori definitions for what should constitute a measurable unit of action—identifies a previously-unexplored sub-second regularity that defines a timescale upon which behavior is organized, yields important information about the components and structure of behavior, offers insight into the nature of behavioral change in the subject, and enables objective discovery of subtle alterations in patterned action. The systems and methods of the invention can be applied to drug or gene therapy classification, drug or gene therapy screening, disease study including early detection of the onset of a disease, toxicology research, side-effect study, learning and memory process study, anxiety study, and analysis in consumer behavior.
System and method for enabling robot to perceive and detect socially interacting groups
This disclosure relates to a system and method for enabling a robot to perceive and detect socially interacting groups. Known systems have limited accuracy because they rely on rule-driven methods, and the few data-driven learning methods lack datasets with varied lighting, occlusion, and background conditions. The disclosed method and system detect the formation of a social group of people, or f-formation, in real time in a given scene. The system also detects outliers in the process, i.e., people who are visible but not part of the interacting group. This plays a key role in correct f-formation detection in a real-life crowded environment. Additionally, when a collocated robot plans to join the group, it has to determine a pose for itself along with detecting the formation. Thus, the system provides an approach angle for the robot, which helps it determine its final pose in a socially acceptable manner.
METHODS AND APPARATUS FOR DISPLAYING, COMPRESSING AND/OR INDEXING INFORMATION RELATING TO A MEETING
A method of visualising a meeting between one or more participants on a display includes, in an electronic processing device, the steps of: determining a plurality of signals, each of the plurality of signals being at least partially indicative of the meeting; generating a plurality of features using the plurality of signals, the features being at least partially indicative of the signals; generating at least one of: at least one phase indicator associated with the plurality of features, the at least one phase indicator being indicative of a temporal segmentation of at least part of the meeting; and at least one event indicator associated with the plurality of features, the at least one event indicator being indicative of an event during the meeting. The method also includes the step of causing a representation indicative of the at least one phase indicator and/or the at least one event indicator to be displayed on the display to thereby provide visualisation of the meeting.
Computer interaction method, device, and program product
Embodiments of the present disclosure provide a computer interaction method, device, and program product. The method includes: acquiring, in response to triggering of an input to an electronic device, multiple images that present a given part of a user; determining a corresponding character sequence based on respective gestures of the given part in the multiple images, corresponding characters in the character sequence being selected from a predefined character set in which multiple characters respectively correspond to different gestures of the given part; and determining, based on the character sequence, a computer instruction to be input to the electronic device. With this solution, the user can conveniently and flexibly execute the input to the electronic device through a gesture of the given part (e.g., a hand).
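The mapping from gestures to characters to an instruction can be illustrated with two lookup tables. The gesture labels, character set, and instruction names below are all hypothetical, and the step that recognises a gesture in each image is assumed to have already produced a label per image.

```python
# Hypothetical predefined character set: each character corresponds to one gesture.
GESTURE_TO_CHAR = {"fist": "A", "open_palm": "B", "thumbs_up": "C"}

# Hypothetical mapping from character sequences to computer instructions.
SEQUENCE_TO_INSTRUCTION = {"AB": "open_menu", "CA": "close_window"}

def gestures_to_sequence(gestures):
    """Convert per-image gesture labels to a character sequence, skipping unknown labels."""
    return "".join(GESTURE_TO_CHAR[g] for g in gestures if g in GESTURE_TO_CHAR)

def sequence_to_instruction(sequence):
    """Look up the instruction for a character sequence; None if there is no match."""
    return SEQUENCE_TO_INSTRUCTION.get(sequence)
```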
ESTIMATING VIDEO RESOLUTION DELIVERED BY AN ENCRYPTED VIDEO STREAM
There is provided a method for estimating playout resolution of a video delivered to a client device by an encrypted video stream communicated over a network. The method selects a current chunk of the encrypted video stream comprising data packets expected to carry video data of the same level of playout resolution and determines values for a predetermined set of features indicative of conditions in the network. By accessing a pregenerated model, a corresponding set of state transition probabilities is obtained, defining a Markov chain whose states comprise the different levels of resolution. The determined state transition probabilities are then used to calculate, from a first probability distribution arising from a first or previous step in the Markov chain, a second probability distribution for the plurality of states of the Markov chain expected to result from the indicated network conditions.
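The final calculation is a single Markov chain step: the second distribution is the first distribution multiplied by the transition matrix. A minimal sketch follows, assuming a row-stochastic matrix whose entry (i, j) is the probability of moving from resolution level i to level j. How the pregenerated model maps network features to that matrix is not covered, and the function name `step_distribution` is hypothetical.

```python
import numpy as np

def step_distribution(prev_dist, transition):
    """Advance a distribution over resolution levels by one Markov chain step.

    Assumption: transition[i][j] is P(next = j | current = i), so each
    row must sum to 1.
    """
    prev = np.asarray(prev_dist, dtype=float)
    P = np.asarray(transition, dtype=float)
    if not np.allclose(P.sum(axis=1), 1.0):
        raise ValueError("each row of the transition matrix must sum to 1")
    return prev @ P
```

Applying the function once per chunk, each time with the matrix selected for that chunk's network conditions, would carry the distribution forward through the stream.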