Patent classifications
G06V10/464
Large scale image recognition using global signatures and local feature information
Techniques are provided that include receiving a plurality of global signatures for a query image in response to an image recognition query, wherein some of the global signatures are generated using local descriptors corresponding to different cropped versions of the query image. A ranking order is determined for a plurality of document images based on nearest neighbor relations between document signatures corresponding to the document images and each of the global signatures for the query image. A subset of the document images is selected based on the determined ranking order. Additional document data corresponding to the selected subset is obtained, and a search result is generated based on a geometric verification between that additional document data and the query image.
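The ranking step above can be sketched as a nearest-neighbor scoring of document signatures against each query global signature. This is a minimal illustration using cosine similarity; the abstract does not fix a particular distance measure, so that choice is an assumption here.

```python
import numpy as np

def rank_documents(query_sigs, doc_sigs, top_k=3):
    """Rank document images by nearest-neighbor similarity between their
    signatures and each global signature of the query image.
    Hypothetical sketch: cosine similarity is assumed, not specified."""
    q = query_sigs / np.linalg.norm(query_sigs, axis=1, keepdims=True)
    d = doc_sigs / np.linalg.norm(doc_sigs, axis=1, keepdims=True)
    # For each document, keep its best similarity over all query signatures.
    scores = (q @ d.T).max(axis=0)
    order = np.argsort(-scores)  # descending ranking order
    return order[:top_k].tolist()
```

The selected subset would then proceed to the geometric-verification stage described above.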
Method of identifying objects based on region of interest and electronic device supporting the same
An electronic device includes a display and a processor functionally connected with the display. The processor is configured to output content including one or more objects through the display, receive user input specifying at least one point in the entire region of the content, determine a portion of the entire region with respect to the at least one point as a search region, obtain a saliency map associated with the content based on the search region, and determine a region of interest of the user based on the saliency map. Alternatively, the processor is configured to obtain an index map associated with the content by dividing the entire region of the content into similar regions according to a preset criterion, and to determine the region of interest of the user by overlapping the saliency map and the index map. Other embodiments are also possible.
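The overlap of the saliency map and the index map can be sketched as follows: restrict attention to a window around the user-specified point, then pick the index-map segment with the greatest saliency inside that window. The window radius and scoring are illustrative assumptions, not the device's actual logic.

```python
import numpy as np

def region_of_interest(saliency, index_map, point, radius=2):
    """Pick the index-map segment whose overlap with the saliency map,
    inside a search window around the touched point, is largest.
    Toy sketch; window shape and scoring are assumptions."""
    y, x = point
    # Search region: a square window around the user-specified point.
    window = np.zeros_like(saliency, dtype=bool)
    window[max(0, y - radius):y + radius + 1,
           max(0, x - radius):x + radius + 1] = True
    best_seg, best_score = None, -1.0
    for seg in np.unique(index_map[window]):
        # Total saliency of this segment within the search window.
        score = saliency[(index_map == seg) & window].sum()
        if score > best_score:
            best_seg, best_score = seg, score
    return int(best_seg)
```

The returned segment label identifies the region of interest in the index map.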
Optimizing 360-degree video streaming with video content analysis
Aspects of the subject disclosure may include, for example, a method performed by a processing system that includes: determining a present orientation of a display region presented at a first time on a display of a video viewer; predicting a future orientation of the display region occurring at a second time, based on collected data, to obtain a predicted orientation of the display region to be presented at the second time on the display of the video viewer; identifying, based on the predicted orientation of the display region, a first group of tiles from a video frame of a panoramic video being displayed by the video viewer, wherein the first group of tiles covers the display region in the video frame at the predicted orientation, and a plurality of objects moving in the video frame from the first time to the second time, wherein each object of the plurality of objects is located in a separate spatial region of the video frame at the second time, wherein a second group of tiles collectively covers the separate spatial regions, and wherein the tiles in the first group and the tiles in the second group are different; and facilitating wireless transmission of the first group of tiles and a second tile from the second group of tiles, for presentation at the video viewer at the second time. Other embodiments are disclosed.
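The prediction-and-selection idea above can be sketched with a constant-velocity extrapolation of the viewport yaw, followed by selection of the tiles covering the predicted display region plus tiles covering moving objects outside it. The tile grid, field-of-view width, and motion model here are illustrative assumptions, not the claimed method.

```python
def select_tiles(present_yaw, past_yaw, object_tiles=(), tiles_per_row=8, fov_tiles=2):
    """Predict the future viewport yaw (constant-velocity assumption),
    map it to a first group of tiles, and add tiles covering moving
    objects that fall outside the viewport. Hypothetical sketch."""
    predicted_yaw = present_yaw + (present_yaw - past_yaw)  # extrapolate head motion
    center = int(predicted_yaw / 360.0 * tiles_per_row) % tiles_per_row
    # First group: tiles covering the display region at the predicted orientation.
    viewport = {(center + d) % tiles_per_row for d in range(-fov_tiles, fov_tiles + 1)}
    # Second group: object-covering tiles not already in the first group.
    extras = set(object_tiles) - viewport
    return sorted(viewport), sorted(extras)
```

Both groups would then be transmitted for presentation at the second time.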
Object based image processing
A method includes determining, at an image processing device, object quality values for a plurality of objects represented in an image. The object quality values are based on portions of image data for the image. The object quality values include a blurriness value for each object and a color value for each object. The method includes accessing, via the image processing device, object category metrics associated with an object category. The object category metrics include a blurriness metric for each object and a color metric for each object. The method also includes performing, with the image processing device, a particular image processing operation for the image based on comparisons of the object quality values for each object to corresponding object category metrics.
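The comparison step above can be sketched by checking each object's blurriness and color values against its category metrics and choosing a processing operation accordingly. The threshold logic and operation names here are illustrative assumptions; the method only requires that the operation be based on such comparisons.

```python
def select_operation(objects, category_metrics):
    """Compare each object's quality values (blurriness, color) to its
    object-category metrics and pick a processing operation.
    Hypothetical thresholds and operation names."""
    ops = {}
    for name, (blurriness, color) in objects.items():
        m = category_metrics[name]
        if blurriness > m["blurriness"]:
            ops[name] = "sharpen"        # object blurrier than its category metric
        elif color < m["color"]:
            ops[name] = "color_correct"  # object color below its category metric
        else:
            ops[name] = "none"
    return ops
```

A real device would derive the quality values from portions of the image data rather than take them as inputs.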
Image searching apparatus, classifier training method, and recording medium
An image searching apparatus includes a processor and a memory. The processor is configured to: attach, to an image with a first correct label attached thereto, a second correct label, the first correct label being a correct label attached to each image included in an image dataset used for supervised training, the second correct label being a correct label based on a degree of similarity from a predetermined standpoint; execute main training processing to train a classifier by using the images and one of the first correct label and the second correct label; fine-tune a training state of the classifier, trained by the main training processing, by using the images and the other one of the first correct label and the second correct label; and search, by using the fine-tuned classifier, for images similar to a query image.
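The two-stage idea above (main training on one label set, fine-tuning on the other, then similarity search) can be sketched with a toy nearest-centroid model. The centroid representation and the fine-tuning update rule are stand-ins, not the apparatus's actual classifier.

```python
class TwoStageClassifier:
    """Toy sketch: main training fits class centroids from one label set;
    fine-tuning nudges them using the other label set; search returns the
    nearest class to a query. Assumed model, not the patent's."""

    def __init__(self):
        self.centroids = {}

    def fit(self, feats, labels):
        # Main training: centroid per label.
        groups = {}
        for f, y in zip(feats, labels):
            groups.setdefault(y, []).append(f)
        self.centroids = {y: [sum(dim) / len(dim) for dim in zip(*fs)]
                          for y, fs in groups.items()}

    def fine_tune(self, feats, labels, lr=0.5):
        # Fine-tuning: move centroids toward samples under the second labels.
        for f, y in zip(feats, labels):
            if y in self.centroids:
                self.centroids[y] = [c + lr * (x - c)
                                     for c, x in zip(self.centroids[y], f)]
            else:
                self.centroids[y] = list(f)

    def search(self, query):
        # Similarity search: nearest centroid to the query features.
        dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
        return min(self.centroids, key=lambda y: dist(self.centroids[y], query))
```

A real apparatus would fine-tune the classifier's internal training state (e.g. network weights), not just centroids.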
System and method for biometric authentication in connection with camera-equipped devices
The present invention relates generally to the use of biometric technology for authentication and identification, and more particularly to non-contact solutions for authenticating and identifying users via computers, such as mobile devices, to selectively permit or deny access to various resources. In the present invention, authentication and/or identification is performed using an image or a set of images of an individual's palm through a process involving the following key steps: (1) detecting the palm area using local classifiers; (2) extracting features from the region(s) of interest; and (3) computing a matching score against user models stored in a database, which can be augmented dynamically through a learning process.
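Step (3) above, the matching score against stored user models, can be sketched as follows. Cosine similarity and a fixed acceptance threshold are assumptions for illustration; the invention does not commit to a particular score function.

```python
import math

def matching_score(query_features, user_model):
    """Cosine similarity between extracted palm features and one stored
    user model. Assumed score function, for illustration only."""
    dot = sum(q * m for q, m in zip(query_features, user_model))
    nq = math.sqrt(sum(q * q for q in query_features))
    nm = math.sqrt(sum(m * m for m in user_model))
    return dot / (nq * nm)

def authenticate(query_features, user_models, threshold=0.9):
    """Accept the best-matching enrolled user if the score clears an
    assumed threshold; otherwise deny access."""
    best_user = max(user_models,
                    key=lambda u: matching_score(query_features, user_models[u]))
    score = matching_score(query_features, user_models[best_user])
    return (best_user, score) if score >= threshold else (None, score)
```

The user-model database could be updated with accepted samples to realize the dynamic learning process mentioned above.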
Deformable-surface tracking based augmented reality image generation
There are provided systems and methods for performing deformable-surface tracking based augmented reality image generation. In one implementation, such a system includes a hardware processor and a system memory storing an augmented reality three-dimensional image generator. The hardware processor is configured to execute the augmented reality three-dimensional image generator to receive image data corresponding to a two-dimensional surface, and to identify an image template corresponding to the two-dimensional surface based on the image data. In addition, the hardware processor is configured to execute the augmented reality three-dimensional image generator to determine a surface deformation of the two-dimensional surface. The hardware processor is further configured to execute the augmented reality three-dimensional image generator to generate an augmented reality three-dimensional image including at least one feature of the two-dimensional surface, based on the image template and the surface deformation of the two-dimensional surface.
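The generator's two inputs, an identified image template and a determined surface deformation, can be sketched with a toy nearest-template lookup and a per-point displacement warp. Both the descriptor representation and the deformation model are assumptions for illustration.

```python
def identify_template(surface_descriptor, templates):
    """Nearest-template lookup for the observed 2-D surface; a toy
    stand-in for the generator's template-identification step."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return min(templates, key=lambda name: dist(templates[name], surface_descriptor))

def apply_deformation(points, deformation):
    """Warp template feature points by a per-point displacement field,
    giving positions at which to render augmented 3-D features.
    Assumed deformation model."""
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(points, deformation)]
```

A full tracker would estimate the displacement field from the image data rather than take it as input.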
Global signatures for large-scale image recognition
Techniques are provided that include obtaining a vocabulary including a set of content indices that reference corresponding cells in a descriptor space based on an input set of descriptors. A plurality of local features of an image are identified based on the vocabulary, the local features being represented by a plurality of local descriptors. An associated visual word in the vocabulary is determined for each of the plurality of local descriptors. A plurality of global signatures for the image are generated based on the associated visual words, wherein some of the plurality of global signatures are generated using local descriptors corresponding to different cropped versions of the image, two or more of the different cropped versions of the image being centered at a same pixel location of the image, and an image recognition search is facilitated using the plurality of global signatures to search a document image dataset.
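The aggregation step above, from local descriptors and visual words to a global signature per crop, can be sketched with a VLAD-style sum of residuals to the nearest visual word. VLAD is a common choice for this kind of aggregation, but it is an assumption here; the patent's exact encoding may differ.

```python
import numpy as np

def global_signature(local_descs, vocabulary):
    """Aggregate local descriptors into a global signature: for each
    descriptor, accumulate its residual to the nearest visual word
    (VLAD-style sketch; assumed encoding)."""
    k, d = vocabulary.shape
    sig = np.zeros((k, d))
    for desc in local_descs:
        word = np.argmin(np.linalg.norm(vocabulary - desc, axis=1))
        sig[word] += desc - vocabulary[word]
    sig = sig.ravel()
    n = np.linalg.norm(sig)
    return sig / n if n else sig

def signatures_for_crops(crops_descs, vocabulary):
    """One global signature per cropped version of the image, e.g. crops
    centered at the same pixel location with different extents."""
    return [global_signature(d, vocabulary) for d in crops_descs]
```

The resulting set of signatures can then index the document image dataset for recognition search.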
Methods and systems for human action recognition using 3D integral imaging
The present disclosure includes systems and methods for detecting and recognizing human gestures or activity in a group of 3D volumes using integral imaging. In exemplary embodiments, a camera array including multiple image capturing devices captures multiple images of a point of interest in a 3D scene, and the images are used to reconstruct a group of 3D volumes of the point of interest. The system detects spatiotemporal interest points (STIPs) in the group of 3D volumes and assigns a descriptor to each STIP. The descriptors are quantized into a number of visual words to create clusters over the group of 3D volumes, and the system builds a histogram over the visual words. Each 3D volume is classified using the histogram, implying the classification of a human gesture in each 3D volume.
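The quantization-and-histogram step above is a standard bag-of-visual-words construction and can be sketched as follows; the distance measure and vocabulary are illustrative assumptions.

```python
def word_histogram(descriptors, vocabulary):
    """Quantize each STIP descriptor to its nearest visual word and count
    occurrences, producing the bag-of-visual-words histogram used to
    classify a 3D volume. Sketch with assumed squared-Euclidean distance."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    hist = [0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)), key=lambda i: dist(vocabulary[i], d))
        hist[nearest] += 1
    return hist
```

A classifier (e.g. an SVM, as is typical with such histograms) would then label each 3D volume with a gesture class.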