Patent classifications
G06V10/464
METHODS AND ARRANGEMENTS FOR IDENTIFYING OBJECTS
In some arrangements, product packaging is digitally watermarked over most of its extent to facilitate high-throughput item identification at retail checkouts. Imagery captured by conventional or plenoptic cameras can be processed (e.g., by GPUs) to derive several different perspective-transformed views, further minimizing the need to manually reposition items for identification. Crinkles and other deformations in product packaging can be optically sensed, allowing such surfaces to be virtually flattened to aid identification. Piles of items can be 3D-modelled and virtually segmented into geometric primitives to aid identification, and to discover locations of obscured items. Other data (e.g., including data from sensors in aisles, shelves and carts, and gaze tracking for clues about visual saliency) can be used in assessing identification hypotheses about an item. Logos may be identified and used (or ignored) in product identification. A great variety of other features and arrangements are also detailed.
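The perspective-transformed views mentioned above amount to mapping image coordinates through a 3x3 homography. A minimal sketch of that mapping follows; the function name and the example matrices are illustrative assumptions, not taken from the patent.

```python
# Hypothetical sketch: deriving a perspective-transformed view by mapping
# coordinates through a 3x3 homography, as a GPU might do per-pixel.
# All names and matrix values here are illustrative.

def apply_homography(H, x, y):
    """Map point (x, y) through homography H (3x3 nested list),
    dividing out the projective coordinate w."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w  = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w

# The identity homography leaves points unchanged; a perspective term in
# the bottom row produces the keystone distortion that a tilted package
# presents to the camera, which an inverse homography can undo.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
tilt = [[1, 0, 0], [0, 1, 0], [0.001, 0, 1]]
```

Warping a whole image applies this mapping (or its inverse) at every pixel, which is the embarrassingly parallel workload the abstract assigns to GPUs.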
Method for processing a stream of video images
A method for processing a stream of video images to search for information therein, in particular to detect predefined objects and/or motion, comprising the steps of:
a) supplying at least one attention map in at least one space of the positions and scales of at least one image of the video stream,
b) selecting, in this space, points to be analyzed, making the selection depend at least on the values of the coefficients of the attention map at these points; at least some of the points to be analyzed are selected by random draw, with the probability of selecting a point depending on the value of the attention map at that point, a bias being introduced into the map to give a non-zero probability of selection at any point,
c) analyzing the selected points to search therein for said information,
d) updating the attention map, at least for the processing of the subsequent image, from at least the result of the analysis performed in c),
e) reiterating steps a) to d) for each new image of the video stream and/or for the current image on at least one different scale.
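Step b) above — a biased random draw over an attention map — can be sketched as weighted sampling where every point receives a small constant bias on top of its attention value, guaranteeing a non-zero selection probability everywhere. The function name, bias value, and map layout are illustrative assumptions.

```python
import random

def select_points(attention, bias=0.05, k=10, rng=None):
    """Randomly draw k points with probability proportional to the
    attention value plus a constant bias, so every point keeps a
    non-zero chance of selection (step b of the claim).
    `attention` maps (x, y) coordinates to attention coefficients."""
    rng = rng or random.Random(0)  # fixed seed for a reproducible sketch
    pts = list(attention.keys())
    weights = [attention[p] + bias for p in pts]
    return rng.choices(pts, weights=weights, k=k)

# High-attention points dominate the draw, but zero-attention points
# can still be selected thanks to the bias.
attention = {(0, 0): 0.9, (1, 0): 0.0, (0, 1): 0.1}
picked = select_points(attention, k=100)
```

Step d) would then raise or lower entries of `attention` according to what the analysis found at each drawn point, before the next frame is processed.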
Object authentication device and object authentication method
An object authentication device includes a speech recognition unit configured to obtain candidates for a speech recognition result for an input speech and a likelihood of the speech as a speech likelihood, an image model generation unit configured to obtain image models of a predetermined number of candidates for the speech recognition result in descending order of speech likelihoods, an image likelihood calculation unit configured to obtain an image likelihood based on an image model of an input image, and an object authentication unit configured to perform object authentication using the image likelihood, wherein vocabularies predicted through speech recognition are categorized and the image model is formed in association with a category.
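The pipeline above can be sketched as re-scoring the top-N speech hypotheses with an image likelihood. The product fusion rule and all names below are assumptions for illustration; the abstract only states that both likelihoods are used.

```python
def authenticate(speech_candidates, image_likelihood, top_n=3):
    """Take the top-N speech hypotheses by speech likelihood, score
    each against an image likelihood, and return the joint best.
    `speech_candidates` maps vocabulary words to speech likelihoods;
    `image_likelihood` is a callable returning the image model's
    likelihood for a word's category. Product fusion is an assumed
    combination rule, not specified by the patent."""
    ranked = sorted(speech_candidates.items(), key=lambda kv: -kv[1])[:top_n]
    return max(ranked, key=lambda kv: kv[1] * image_likelihood(kv[0]))[0]

# The speech recognizer slightly prefers "cup", but the image evidence
# strongly favors "cap", so the joint decision flips.
speech = {"cup": 0.9, "cap": 0.8, "cat": 0.1}
image = lambda word: 0.2 if word == "cup" else 0.9
```

Restricting image scoring to a predetermined number of speech candidates (in descending likelihood order) is what keeps the image-model stage cheap.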
MACHINE LEARNING-BASED PREDICTION OF PRECISE PERCEPTUAL VIDEO QUALITY
Systems and methods are disclosed for measuring the similarity between the input and the output of computing systems and communications channels. The disclosed techniques provide a low-complexity method for predicting a perceptual video quality (PVQ) score, which may be used to design and tune the performance of such computing systems and communications channels.
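A low-complexity predictor of this kind is often a learned linear (or shallow) model over cheap signal features. The sketch below assumes a linear form with a clamped MOS-like output range; the feature set, weights, and 1–5 scale are illustrative assumptions, not values from the disclosure.

```python
def predict_pvq(features, weights, intercept):
    """Low-complexity linear predictor of a perceptual video quality
    score from inexpensive signal features (e.g., bitrate, motion).
    The weights would come from machine-learning training; the values
    used here are purely illustrative."""
    score = intercept + sum(w * f for w, f in zip(weights, features))
    return max(1.0, min(5.0, score))  # clamp to an assumed 1..5 MOS-like scale
```

Because inference is a dot product plus a clamp, the predictor can run inline while tuning an encoder or channel, which is the use the abstract suggests.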
Analyzing content of digital images
Methods, apparatuses, and embodiments related to analyzing the content of digital images. A computer extracts multiple sets of visual features, which can be keypoints, based on an image of a selected object. Each of the multiple sets of visual features is extracted by a different visual feature extractor. The computer further extracts a visual word count vector based on the image of the selected object. An image query is executed based on the extracted visual features and the extracted visual word count vector to identify one or more candidate template objects of which the selected object may be an instance. When multiple candidate template objects are identified, a matching algorithm compares the selected object with the candidate template objects to determine a particular candidate template of which the selected object is an instance.
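The visual word count vector mentioned above is a bag-of-visual-words histogram: each descriptor is quantized to its nearest vocabulary word and the per-word counts form the query vector. A minimal sketch, with an assumed Euclidean quantizer and toy 2-D descriptors:

```python
def visual_word_counts(descriptors, vocabulary):
    """Quantize each descriptor to its nearest visual word (squared
    Euclidean distance) and return the word-count vector used for the
    image query. Real systems use high-dimensional descriptors and a
    large learned vocabulary; this sketch uses toy 2-D vectors."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        best = min(range(len(vocabulary)),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(d, vocabulary[i])))
        counts[best] += 1
    return counts

vocab = [(0.0, 0.0), (1.0, 1.0)]   # two illustrative visual words
descs = [(0.1, 0.0), (0.9, 1.1), (1.0, 1.0)]
```

The count vector gives a cheap first-pass retrieval signal; the keypoint-level features then support the finer matching step against the candidate templates.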
Visual-inertial positional awareness for autonomous and non-autonomous tracking
The described positional-awareness techniques employ visual-inertial sensory data-gathering and analysis hardware, presented with reference to specific example implementations. The improvements in sensor use, techniques, and hardware design can enable specific embodiments to provide positional awareness to machines with improved speed and accuracy.
Apparatus and method for re-identifying object in image processing
The apparatus includes: a weighted feature extractor configured to extract a weighted feature from an input image and generate a weighted descriptor to which the feature of a salient region is applied; a dictionary constructor configured to construct a dictionary composed of images with different characteristics of one object, using the weighted descriptor produced by the weighted feature extractor, and to store the dictionary in a database (DB); and a coefficient estimator and ID determiner configured to apply sparse representation, estimating a coefficient that allows the object to be reconstructed as accurately as possible from a few linear combinations of the candidate objects constituting the dictionary, and to perform identification using the error between the target and the reconstructed object.
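The identification step above picks the candidate whose dictionary entries reconstruct the target with the smallest error. The sketch below simplifies the sparse-coding stage to a single atom per candidate (closed-form least-squares coefficient); a real implementation would solve for a sparse combination of several atoms. All names are illustrative.

```python
def residual(target, atom):
    """Best single-coefficient reconstruction error of `target` from
    one dictionary atom: a = <x,d>/<d,d>, r = ||x - a*d||^2. This is a
    simplified stand-in for full sparse coding, which combines a few
    atoms per candidate."""
    num = sum(t * a for t, a in zip(target, atom))
    den = sum(a * a for a in atom)
    coef = num / den
    return sum((t - coef * a) ** 2 for t, a in zip(target, atom))

def identify(target, dictionary):
    """Return the candidate ID whose atom reconstructs the target with
    the smallest residual error (the claimed ID decision)."""
    return min(dictionary, key=lambda cid: residual(target, dictionary[cid]))

# Toy 2-D "descriptors": the target lies almost along candidate A's atom.
dictionary = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
target = [0.9, 0.1]
```

The weighting by salient-region features would enter here through the descriptors themselves, before any residual is computed.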
Systems and methods for identifying salient images
Image information defining an image may be accessed. The image may include one or more salient objects. A saliency map may be generated based on the image information. The saliency map may include one or more regions corresponding to the one or more salient objects. The one or more regions may be characterized by different levels of intensity than other regions of the saliency map. One or more salient regions around the one or more salient objects may be identified based on the saliency map. A saliency metric for the image may be generated based on one or more of (1) sizes of the one or more salient regions; (2) an amount of the one or more salient regions; and/or (3) histograms within the one or more salient regions.
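Combining region sizes and region count into a single saliency metric can be sketched as below. The specific combination rule (area fraction scaled by an inverse-count penalty) is an illustrative assumption; the abstract only lists the ingredients, and omits the histogram term for brevity.

```python
def saliency_metric(regions, image_area):
    """Combine (1) the total salient area as a fraction of the image and
    (2) the number of salient regions into one score. The weighting is
    an assumption: it favors a single dominant salient subject.
    `regions` is a list of (width, height) bounding boxes."""
    if not regions:
        return 0.0
    area_frac = sum(w * h for (w, h) in regions) / image_area
    count_penalty = 1.0 / len(regions)  # many small regions score lower
    return area_frac * count_penalty
```

A third term based on intensity histograms within each region (per the abstract) could multiply in a contrast score, rewarding regions that stand out sharply from their surroundings.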
Image processing apparatus, image processing method, and program to identify objects using image features
An image processing apparatus, an image processing method, and a program provide accurate collation even when an image contains a number of identical or similar subjects. The image processing apparatus generates, with respect to feature points to be detected from a first image, a first local feature amount group including local feature amounts representing feature amounts of local regions containing the respective feature points, and a first coordinate position information group including coordinate position information. The image processing apparatus clusters the feature points of the first image based on the first coordinate position information group. The image processing apparatus collates, in units of clusters, the first local feature amount group with a second local feature amount group formed from local feature amounts of feature points detected from a second image.
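The positional clustering step can be sketched with a coarse grid: feature points falling in the same cell form one cluster, and collation then runs per cluster so that repeated subjects in the first image do not confuse each other's matches. Grid clustering is an illustrative stand-in; the patent does not specify the clustering algorithm.

```python
def cluster_by_position(points, cell=100):
    """Group feature-point coordinates into coarse grid cells, a simple
    stand-in for the positional clustering in the abstract. Returns a
    mapping from cell index to the list of point indices it contains;
    collation against the second image would then proceed per cluster."""
    clusters = {}
    for idx, (x, y) in enumerate(points):
        key = (int(x // cell), int(y // cell))
        clusters.setdefault(key, []).append(idx)
    return clusters

# Two nearby keypoints land in one cluster; a distant one is separate,
# so two identical logos far apart are collated independently.
clusters = cluster_by_position([(10, 10), (20, 30), (250, 260)])
```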