Patent classifications
G06V10/464
Image representation method and processing device based on local PCA whitening
An image representation method and processing device based on local PCA whitening. A first mapping module maps words and features to a high-dimensional space. A principal component analysis module conducts principal component analysis in each corresponding word space to obtain a projection matrix. A VLAD computation module computes a VLAD image representation vector; a second mapping module maps the VLAD image representation vector to the high-dimensional space. A projection transformation module conducts projection transformation on the mapped VLAD image representation vector. A normalization processing module normalizes the features obtained by projection transformation to obtain a final image representation vector. The obtained image representation vector is first projected into the high-dimensional space, and projection transformation is then conducted using the precomputed projection matrix and the vectors corresponding to the words to obtain a low-dimensional vector; in this way, the vectors corresponding to the words remain consistent. The disclosed method and processing device achieve better robustness and higher performance.
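The VLAD-with-local-whitening pipeline described above can be sketched as follows. This is a minimal illustration under assumed names and a toy random codebook, not the patent's exact procedure: residuals of descriptors against their nearest visual word are whitened with a per-word PCA projection, accumulated, and L2-normalized.

```python
# Hypothetical sketch of VLAD with per-word PCA whitening.
# Codebook, descriptors, and dimensions are illustrative toy data.
import numpy as np

rng = np.random.default_rng(0)

K, d, n = 4, 8, 100                      # words, descriptor dim, descriptors
codebook = rng.normal(size=(K, d))       # assumed pre-trained visual words
descriptors = rng.normal(size=(n, d))    # assumed local image descriptors

# Assign each descriptor to its nearest word.
dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
assign = dists.argmin(axis=1)

# Accumulate whitened residuals per word ("local" PCA: one projection
# matrix per word space, learned from that word's residuals).
vlad = np.zeros((K, d))
for k in range(K):
    members = descriptors[assign == k]
    if len(members) < 2:
        continue                          # not enough samples for PCA
    residuals = members - codebook[k]
    cov = np.cov(residuals, rowvar=False) + 1e-6 * np.eye(d)  # ridge for stability
    eigvals, eigvecs = np.linalg.eigh(cov)
    proj = eigvecs / np.sqrt(eigvals)     # whitening projection matrix
    vlad[k] = (residuals @ proj).sum(axis=0)

# Concatenate and L2-normalize to obtain the final representation vector.
vlad = vlad.ravel()
vlad /= np.linalg.norm(vlad)
```

In a real system the per-word projection matrices would be computed offline from a training set, as the abstract's principal component analysis module describes, rather than from the query image's own residuals.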
System and method for visual event description and event analysis
A system and method are provided for analyzing a video. The method comprises: sampling the video to generate a plurality of spatio-temporal video volumes; clustering similar ones of the plurality of spatio-temporal video volumes to generate a low-level codebook of video volumes; analyzing the low-level codebook of video volumes to generate a plurality of ensembles of volumes surrounding pixels in the video; and clustering the plurality of ensembles of volumes by determining similarities between the ensembles of volumes, to generate at least one high-level codebook. Multiple high-level codebooks can be generated by repeating steps of the method. The method can further include performing visual event retrieval by using the at least one high-level codebook to make an inference from the video, for example comparing the video to a dataset and retrieving at least one similar video, labeling activities and events, and performing abnormal and normal event detection.
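The low-level codebook step above amounts to clustering similar volume descriptors and keeping the cluster centers as codewords. A minimal k-means sketch over toy flattened volume vectors (volume extraction itself is assumed and not shown; all names are illustrative):

```python
# Minimal k-means codebook construction over toy "video volume" vectors.
import numpy as np

def build_codebook(volumes, n_codewords, n_iters=10, seed=0):
    """Cluster flattened video volumes into a codebook of centroids."""
    rng = np.random.default_rng(seed)
    centroids = volumes[rng.choice(len(volumes), n_codewords, replace=False)]
    for _ in range(n_iters):
        # Assign each volume to its nearest codeword.
        d = np.linalg.norm(volumes[:, None] - centroids[None], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster empties.
        for k in range(n_codewords):
            if np.any(labels == k):
                centroids[k] = volumes[labels == k].mean(axis=0)
    return centroids, labels

rng = np.random.default_rng(1)
volumes = rng.normal(size=(60, 16))       # 60 toy 16-dim volume descriptors
codebook, labels = build_codebook(volumes, n_codewords=5)
```

The high-level codebooks described in the abstract would repeat the same idea one level up, clustering ensembles of these codeword assignments rather than raw volumes.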
ANALYZING CONTENT OF DIGITAL IMAGES
Methods, apparatuses, and embodiments related to analyzing the content of digital images. A computer extracts multiple sets of visual features, which can be keypoints, based on an image of a selected object. Each of the multiple sets of visual features is extracted by a different visual feature extractor. The computer further extracts a visual word count vector based on the image of the selected object. An image query is executed based on the extracted visual features and the extracted visual word count vector to identify one or more candidate template objects of which the selected object may be an instance. When multiple candidate template objects are identified, a matching algorithm compares the selected object with the candidate template objects to determine a particular candidate template of which the selected object is an instance.
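The visual word count vector mentioned above is, in essence, a bag-of-visual-words histogram: each extracted descriptor is quantized to its nearest codeword and the occurrences are counted. A hedged sketch with assumed names and toy data:

```python
# Illustrative visual word count vector (bag-of-visual-words histogram).
import numpy as np

def word_count_vector(descriptors, codebook):
    """Quantize each descriptor to its nearest codeword and count them."""
    d = np.linalg.norm(descriptors[:, None] - codebook[None], axis=2)
    words = d.argmin(axis=1)                       # nearest-codeword indices
    return np.bincount(words, minlength=len(codebook))

rng = np.random.default_rng(2)
codebook = rng.normal(size=(10, 32))     # assumed pre-trained codebook
descriptors = rng.normal(size=(200, 32)) # assumed keypoint descriptors
counts = word_count_vector(descriptors, codebook)
```

Such a count vector can be compared cheaply across a large database to shortlist candidate template objects before the slower keypoint-matching stage runs.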
IMAGE PROCESSING
An image processing system and method are provided for receiving an image with a set of feature points characteristic of the image and selecting each of the feature points to be a selected feature point. Moreover, a number of neighboring feature points associated with the selected feature point are identified, and a first hash is created that includes information associated with a first pair of neighboring feature points, the information being representative of the relative locations of these neighboring feature points to the selected feature point. Moreover, a second hash is created that includes information associated with a second pair of neighboring feature points, the information likewise being representative of the relative locations of these neighboring feature points to the selected feature point.
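One way to realize such a hash (an assumed scheme for illustration, not the patent's exact construction) is to quantize each neighbor's offset from the selected feature point and hash the sorted pair of offsets, which makes the key invariant to translation and to the order of the pair:

```python
# Illustrative pair hash over relative neighbor locations (assumed scheme).

def pair_hash(selected, neighbor_a, neighbor_b, cell=4.0):
    """Hash a pair of neighbors by quantized offsets from the selected point."""
    offsets = []
    for n in (neighbor_a, neighbor_b):
        dx = int((n[0] - selected[0]) // cell)   # quantized relative x
        dy = int((n[1] - selected[1]) // cell)   # quantized relative y
        offsets.append((dx, dy))
    return hash(tuple(sorted(offsets)))          # order-independent key

p = (10.0, 10.0)
h1 = pair_hash(p, (18.0, 6.0), (2.0, 14.0))
h2 = pair_hash(p, (2.0, 14.0), (18.0, 6.0))     # swapped pair -> same hash
# Same relative geometry at a different image location -> same hash.
h3 = pair_hash((110.0, 110.0), (118.0, 106.0), (102.0, 114.0))
```

Because only relative locations enter the key, matching images can look up the same hash buckets regardless of where in the frame the feature constellation appears.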
Systems and methods for identifying salient images
Image information defining an image may be accessed. The image may include one or more salient objects. A saliency map may be generated based on the image information. The saliency map may include one or more regions corresponding to the one or more salient objects. The one or more regions may be characterized by different levels of intensity than other regions of the saliency map. One or more salient regions around the one or more salient objects may be identified based on the saliency map. A saliency metric for the image may be generated based on one or more of (1) sizes of the one or more salient regions; (2) the number of the one or more salient regions; and/or (3) histograms within the one or more salient regions.
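A toy sketch of such a saliency metric follows. The particular combination (area fraction of the thresholded map weighted by how concentrated the intensity histogram is inside the salient regions) is an assumption for illustration, not the claimed formula:

```python
# Toy saliency metric from a thresholded saliency map (assumed weighting).
import numpy as np

def saliency_metric(saliency_map, threshold=0.5):
    """Score an image by salient area fraction and histogram concentration."""
    mask = saliency_map >= threshold
    if not mask.any():
        return 0.0
    area_fraction = mask.mean()                 # size of salient regions
    # Histogram within the salient regions: peakier histograms score higher.
    hist, _ = np.histogram(saliency_map[mask], bins=8, range=(0.0, 1.0))
    concentration = (hist / hist.sum()).max()
    return float(area_fraction * concentration)

smap = np.zeros((10, 10))
smap[2:6, 2:6] = 0.9                            # one bright salient region
score = saliency_metric(smap)
```

A production metric would likely use connected-component region counts as well, per item (2) of the abstract; that step is omitted here to keep the sketch dependency-free.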
MACHINE VISION SYSTEM FOR RECOGNIZING NOVEL OBJECTS
Described is a system for classifying novel objects in imagery. In operation, the system extracts salient patches from a plurality of unannotated images using a multi-layer network. Activations of the multi-layer network are clustered into key attributes, with the key attributes being displayed to a user on a display, thereby prompting the user to annotate the key attributes with class labels. An attribute database is then generated based on user-prompted annotations of the key attributes. A test image can then be passed through the system, allowing the system to classify at least one object in the test image by identifying an object class in the attribute database. Finally, a device can be caused to operate or maneuver based on the classification of the at least one object in the test image.
Deformable-surface tracking based augmented reality image generation
There are provided systems and methods for performing deformable-surface tracking based augmented reality image generation. In one implementation, such a system includes a hardware processor and a system memory storing an augmented reality three-dimensional image generator. The hardware processor is configured to execute the augmented reality three-dimensional image generator to receive image data corresponding to a two-dimensional surface, and to identify an image template corresponding to the two-dimensional surface based on the image data. In addition, the hardware processor is configured to execute the augmented reality three-dimensional image generator to determine a surface deformation of the two-dimensional surface. The hardware processor is further configured to execute the augmented reality three-dimensional image generator to generate an augmented reality three-dimensional image including at least one feature of the two-dimensional surface, based on the image template and the surface deformation of the two-dimensional surface.
Control method, information terminal, recording medium, and determination method
If a lesion included in a specification target image is a texture lesion, a probability image calculation unit calculates a probability value indicating a probability that each of a plurality of pixels of the specification target image is included in a lesion area. An output unit calculates, as a candidate area, an area including pixels whose probability values are equal to or larger than a first threshold in a probability image obtained from the probability image calculation unit and, as a modification area, an area including pixels whose probability values are within a certain probability range including the first threshold. An input unit detects an input from a user on a pixel in the modification area. A lesion area specification unit specifies a lesion area on the basis of the probability image, the candidate area, the modification area, and user operation information.
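The thresholding step above can be sketched directly: pixels at or above the first threshold form the candidate area, and pixels whose probability falls within a band around that threshold form the modification area the user may edit. The threshold and band values below are assumed for illustration:

```python
# Sketch of splitting a probability image into candidate and modification
# areas (threshold t1 and band width are illustrative assumptions).
import numpy as np

def split_areas(prob_image, t1=0.6, band=0.1):
    """Return candidate and modification masks from a probability image."""
    candidate = prob_image >= t1                       # likely lesion pixels
    modification = np.abs(prob_image - t1) <= band     # near-threshold pixels
    return candidate, modification

prob = np.array([[0.2, 0.55, 0.65],
                 [0.7, 0.9, 0.5]])
cand, mod = split_areas(prob)
```

Confining user interaction to the near-threshold modification area keeps edits focused on the ambiguous pixels, which is the point of the certain-probability-range band in the abstract.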
DRIVER ASSISTANCE SYSTEM FOR DETERMINING A POSITION OF A VEHICLE
The invention relates to a driver assistance system for determining a position of a vehicle. The driver assistance system comprises a processing unit and a first positioning system for providing first information about the position of the vehicle. The driver assistance system further comprises a second positioning system for providing second, visual information about the position of the vehicle. The second positioning system is configured to provide the second, visual information about the position of the vehicle based on a comparison between image data of an image generated by an on-board camera of the vehicle and image data of an image stored in a database using a visual bag of words technique. The processing unit is configured for determining the position of the vehicle based on the first information about the position of the vehicle and on the second, visual information about the position of the vehicle.
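The final fusion step (combining the first positioning fix with the visual place-recognition fix) could take many forms; one common choice, shown here purely as an assumed sketch, is an inverse-variance weighted average of the two position estimates:

```python
# Assumed fusion sketch: inverse-variance weighted average of a GNSS fix
# and a visual bag-of-words place-recognition fix. Not the patent's method.
import numpy as np

def fuse_positions(pos_gnss, var_gnss, pos_visual, var_visual):
    """Inverse-variance weighted fusion of two 2-D position estimates."""
    w1, w2 = 1.0 / var_gnss, 1.0 / var_visual
    return (w1 * np.asarray(pos_gnss) + w2 * np.asarray(pos_visual)) / (w1 + w2)

# Equal trust in both sources -> midpoint of the two fixes.
fused = fuse_positions((10.0, 20.0), 4.0, (12.0, 22.0), 4.0)
```

With unequal variances the fused position shifts toward the more trusted source, which matches the intuition of a processing unit weighing a noisy GNSS fix against a confident visual match (or vice versa).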