Patent classifications
G06V10/464
THREE-DIMENSIONAL FACIAL RECOGNITION METHOD AND SYSTEM
The present disclosure provides a three-dimensional facial recognition method and system. The method includes: performing pose estimation on an input binocular vision image pair by using a three-dimensional facial reference model, to obtain a pose parameter and a virtual image pair of the three-dimensional facial reference model with respect to the binocular vision image pair; reconstructing a facial depth image of the binocular vision image pair by using the virtual image pair as prior information; detecting, according to the pose parameter, a local grid scale-invariant feature descriptor corresponding to an interest point in the facial depth image; and generating a recognition result of the binocular vision image pair according to the detected local grid scale-invariant feature descriptor and training data having attached category annotations. The present disclosure can reduce computational costs and required storage space.
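The depth-reconstruction step above builds on standard binocular stereo geometry. As background, the sketch below shows only the textbook disparity-to-depth relation depth = f * B / d; the patent's prior-guided reconstruction from the virtual image pair is not reproduced, and all parameter values are illustrative.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Convert a binocular disparity (pixels) to depth (metres) via the
    standard stereo relation depth = f * B / d. Illustrative only; the
    patent's reconstruction additionally uses a virtual image pair as
    prior information, which is not modelled here."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# A point with 50 px disparity, 500 px focal length, 10 cm baseline.
print(depth_from_disparity(disparity_px=50, focal_px=500, baseline_m=0.1))  # 1.0
```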
Neural network patch aggregation and statistics
Neural network patch aggregation and statistical techniques are described. In one or more implementations, patches are generated from an image, e.g., randomly, and used to train a neural network. An aggregation of outputs of patches processed by the neural network may be used to label an image using an image descriptor, such as to label aesthetics of the image, classify the image, and so on. In another example, the patches may be used by the neural network to calculate statistics describing the patches, such as to describe statistics such as minimum, maximum, median, and average of activations of image characteristics of the individual patches. These statistics may also be used to support a variety of functionality, such as to label the image as described above.
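The per-patch statistics named above (minimum, maximum, median, average) can be sketched with plain aggregation. The activation values below are hypothetical stand-ins for neural-network outputs on randomly sampled patches; the network itself is not modelled.

```python
import statistics

def patch_statistics(activations):
    """Aggregate per-patch activation values into the summary statistics
    named in the abstract. `activations` holds one (hypothetical) scalar
    activation per image patch."""
    return {
        "min": min(activations),
        "max": max(activations),
        "median": statistics.median(activations),
        "mean": statistics.mean(activations),
    }

# Hypothetical activations for four patches of one image.
stats = patch_statistics([0.2, 0.8, 0.5, 0.1])
print(stats["mean"])  # 0.4
```

Such a fixed-length statistic vector can then feed a downstream labeler regardless of how many patches were sampled.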
VIRTUAL USER INPUT CONTROLS IN A MIXED REALITY ENVIRONMENT
A wearable display system can automatically recognize a physical remote, or a device that the remote serves, using computer vision techniques. The wearable system can generate a virtual remote with a virtual control panel viewable and interactable by a user of the wearable system. The virtual remote can emulate the functionality of the physical remote. The user can select a virtual remote for interaction, for example, by looking or pointing at the parent device or its remote control, or by selecting from a menu of known devices. The virtual remote may include a virtual button, which is associated with a volume in the physical space. The wearable system can detect that a virtual button is actuated by determining whether a portion of the user's body (e.g., the user's finger) has penetrated the volume associated with the virtual button.
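The actuation test described above reduces to a point-in-volume check. A minimal sketch, assuming the button volume is an axis-aligned box in physical space and the fingertip is a tracked (x, y, z) point; all names and dimensions are illustrative.

```python
from dataclasses import dataclass

@dataclass
class ButtonVolume:
    """Axis-aligned box (metres) associated with one virtual button.
    Field names are illustrative, not from the patent."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    z_min: float
    z_max: float

    def is_actuated(self, fingertip):
        """True if the tracked fingertip (x, y, z) penetrates the volume."""
        x, y, z = fingertip
        return (self.x_min <= x <= self.x_max
                and self.y_min <= y <= self.y_max
                and self.z_min <= z <= self.z_max)

# A 5 cm x 5 cm x 2 cm button volume; the fingertip is inside it.
volume = ButtonVolume(0.0, 0.05, 0.0, 0.05, 0.0, 0.02)
print(volume.is_actuated((0.02, 0.03, 0.01)))  # True
```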
Identifying consumer products in images
Systems and methods identify consumer products in images. Known consumer products are captured as grayscale or color images, which are converted to binary at varying thresholds. Connected components in the binary images are filtered by predetermined pixel attributes such as size, shape, solidity, and aspect ratio to identify image features. The image features are stored and searched for amongst image features similarly extracted from unknown images of consumer products. Correspondence between the features of the known and unknown images indicates whether a known consumer product is present.
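The thresholding and connected-component steps above can be sketched on a toy grayscale grid. This is a simplified stand-in, not the patent's implementation: it binarizes at a single threshold and labels 4-connected foreground regions with a breadth-first search; the size/shape/solidity filtering would follow on the returned pixel lists.

```python
from collections import deque

def connected_components(gray, threshold):
    """Binarize a grayscale grid at `threshold`, then label 4-connected
    foreground components. Returns one pixel list per component."""
    h, w = len(gray), len(gray[0])
    binary = [[1 if gray[r][c] >= threshold else 0 for c in range(w)]
              for r in range(h)]
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not seen[r][c]:
                queue, pixels = deque([(r, c)]), []
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(pixels)
    return components

# Toy 3x4 grayscale image: two bright regions above the 128 threshold.
image = [
    [0, 200, 200,   0],
    [0, 200,   0,   0],
    [0,   0,   0, 180],
]
print(len(connected_components(image, threshold=128)))  # 2
```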
Method and Apparatus for Detecting and Assessing Road Reflections
The invention relates to a method for detecting and assessing reflections on a road (7). A camera (2) is provided and is used to produce at least two digital images of at least one point (3) of the road, wherein the images are produced from different recording perspectives (A, B) of the camera (2). Diffuse reflection and specular reflection of the road (7) are then detected by assessing differences in the appearance of the at least one point (3) of the road in the at least two digital images using digital image processing algorithms. The road reflections are preferably assessed using an approximative approach. Road condition information is determined on the basis of the detected reflection, in particular information stating whether the road (7) is dry, wet, snow-covered or icy. The invention also relates to an apparatus (1) for carrying out the above-mentioned method and to a vehicle having such an apparatus (1).
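The physical idea above is that a diffuse (Lambertian) surface looks similar from both recording perspectives, while a specular surface (e.g. wet or icy) looks markedly different. A toy classifier under that assumption, with an illustrative relative-difference threshold that is not taken from the patent:

```python
def classify_reflection(intensity_a, intensity_b, tol=0.1):
    """Classify one road point from its brightness in two views.
    A diffuse surface reflects similarly in both perspectives; a large
    relative difference suggests specular reflection. `tol` is an
    illustrative assumption, not the patent's criterion."""
    diff = abs(intensity_a - intensity_b) / max(intensity_a, intensity_b, 1e-9)
    return "specular" if diff > tol else "diffuse"

print(classify_reflection(0.80, 0.78))  # diffuse (dry asphalt, say)
print(classify_reflection(0.90, 0.30))  # specular (e.g. a wet patch)
```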
Multiple Hypotheses Segmentation-Guided 3D Object Detection and Pose Estimation
A machine vision system and method uses captured depth data to improve the identification of a target object in a cluttered scene. A 3D-based object detection and pose estimation (ODPE) process is used to determine pose information of the target object. The system uses three different segmentation processes in sequence, where each subsequent segmentation process produces larger segments, in order to produce a plurality of segment hypotheses, each of which is expected to contain a large portion of the target object in the cluttered scene. Each segmentation hypothesis is used to mask the 3D point clouds of the captured depth data, and each masked region is individually submitted to the 3D-based ODPE.
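The masking step above can be sketched as selecting, per hypothesis, the subset of the captured point cloud that the hypothesis covers, and handing each subset to the ODPE stage separately. Index-set hypotheses and all names below are illustrative simplifications.

```python
def mask_point_cloud(points, segment_hypotheses):
    """Apply each segmentation hypothesis (modelled here as a set of point
    indices) as a mask over the captured 3D point cloud, yielding one
    masked region per hypothesis for the downstream ODPE stage."""
    return [[points[i] for i in sorted(hypothesis)]
            for hypothesis in segment_hypotheses]

cloud = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.1), (2.0, 2.0, 3.0)]
# Sequential segmentation stages yield progressively larger hypotheses.
hypotheses = [{0, 1}, {0, 1, 2}]
regions = mask_point_cloud(cloud, hypotheses)
print([len(r) for r in regions])  # [2, 3]
```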
Method and system for identifying books on a bookshelf
A method and system for identifying books located on a bookshelf. Photographs of the bookshelf are captured and processed to identify individual books. Processing involves segmenting the photograph into individual book spines and extracting and analyzing features of the book spines. Analysis may include database matching and/or optical character recognition. Book spines for which a match is not found are human labeled, and the label information is added to the database. User feedback is also used to update the database.
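The spine-segmentation step above can be sketched one-dimensionally: strong vertical edges in the shelf photograph tend to mark boundaries between adjacent spines. The sketch assumes a precomputed per-column edge-strength profile and a boundary threshold, both illustrative stand-ins for the patent's segmentation.

```python
def split_spines(column_profile, boundary_threshold):
    """Segment a shelf photograph into book-spine column ranges.
    `column_profile` is an (assumed precomputed) edge strength per image
    column; columns at or above `boundary_threshold` are treated as spine
    boundaries. Returns (start, end) column ranges, end-exclusive."""
    spines, start = [], 0
    for col, strength in enumerate(column_profile):
        if strength >= boundary_threshold:
            if col > start:
                spines.append((start, col))
            start = col + 1
    if start < len(column_profile):
        spines.append((start, len(column_profile)))
    return spines

# Toy profile: strong vertical edges at columns 3 and 7 separate three spines.
profile = [0, 0, 0, 9, 0, 0, 0, 9, 0, 0]
print(split_spines(profile, boundary_threshold=5))  # [(0, 3), (4, 7), (8, 10)]
```

Feature extraction, database matching, and OCR would then run on each returned column range independently.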
CLASSIFICATION OF SEVERITY OF PATHOLOGICAL CONDITION USING HYBRID IMAGE REPRESENTATION
A computer-implemented method obtains at least one image from which severity of a given pathological condition presented in the at least one image is to be classified. The method generates a hybrid image representation of the at least one obtained image. The hybrid image representation comprises a concatenation of a discriminative pathology histogram, a generative pathology histogram, and a fully connected representation of a trained baseline convolutional neural network. The hybrid image representation is used to train a classifier to classify the severity of the given pathological condition presented in the at least one image. One non-limiting example of a pathological condition whose severity can be classified with the above method is diabetic retinopathy.
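The hybrid representation described above is a straightforward concatenation of three feature vectors. A minimal sketch with tiny illustrative vectors; real pathology histograms and CNN fully-connected features would be far longer, and how each component is computed is not shown.

```python
def hybrid_representation(discriminative_hist, generative_hist, fc_features):
    """Concatenate the discriminative pathology histogram, the generative
    pathology histogram, and the fully connected CNN representation into
    one feature vector for the severity classifier."""
    return list(discriminative_hist) + list(generative_hist) + list(fc_features)

# Illustrative 2-, 2-, and 3-dimensional components.
vec = hybrid_representation([0.1, 0.9], [0.4, 0.6], [1.2, -0.3, 0.7])
print(len(vec))  # 7
```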
METHOD AND SYSTEM OF DETECTING AND RECOGNIZING A VEHICLE LOGO BASED ON SELECTIVE SEARCH
The invention discloses a method and a system of detecting and recognizing a vehicle logo based on Selective Search, the method comprising: positioning a vehicle plate on an original image of a vehicle to obtain a vehicle plate position; coarsely positioning a vehicle logo on the original image to obtain a coarse positioning image of the vehicle logo; selecting vehicle logo candidate areas in the coarse positioning image; performing target positioning in the vehicle logo candidate areas with Selective Search to obtain a set of target regions; training a vehicle logo location classifier with Spatial Pyramid Matching based on Sparse Coding (ScSPM) to determine the vehicle logo from the set of target regions and obtain a vehicle logo position; and training a multi-class vehicle logo recognition classifier with ScSPM to perform type-specific recognition of the vehicle logo and obtain a vehicle logo recognition result.
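The coarse-positioning step can be sketched with a common heuristic: the logo usually sits above the license plate, so a search region a few plate-heights tall is taken directly above the detected plate. The multiplier and box convention below are illustrative assumptions, not the patent's exact rule; the Selective Search and ScSPM stages are not reproduced.

```python
def coarse_logo_region(plate_box, scale=3.0):
    """Derive a coarse vehicle-logo search region from the detected plate.
    `plate_box` is (left, top, width, height) in pixels; the region is a
    box `scale` plate-heights tall directly above the plate (heuristic
    assumption, not the patent's exact rule)."""
    x, y, w, h = plate_box
    top = max(0, y - int(scale * h))
    return (x, top, w, y - top)

# Plate detected at (100, 300), 120 px wide, 30 px tall.
print(coarse_logo_region((100, 300, 120, 30)))  # (100, 210, 120, 90)
```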
EXTRACTING MOTION SALIENCY FEATURES FROM VIDEO USING A NEUROSYNAPTIC SYSTEM
Embodiments of the invention provide a method of visual saliency estimation comprising receiving an input video of image frames. Each image frame has one or more channels, and each channel has one or more pixels. The method further comprises, for each channel of each image frame, generating corresponding neural spiking data based on a pixel intensity of each pixel of the channel, generating a corresponding multi-scale data structure based on the corresponding neural spiking data, and extracting a corresponding map of features from the corresponding multi-scale data structure. The multi-scale data structure comprises one or more data layers, wherein each data layer represents a spike representation of pixel intensities of a channel at a corresponding scale. The method further comprises encoding each map of features extracted as neural spikes.
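The first step above, generating spiking data from pixel intensities, is commonly done by rate coding: brighter pixels emit proportionally more spikes within a time window. A minimal sketch under that assumption; the patent's neurosynaptic-hardware encoding and the multi-scale structure are not modelled.

```python
def intensity_to_spikes(channel, max_intensity=255, window=10):
    """Rate-code a channel's pixel intensities as spike counts over a
    `window`-tick interval: a pixel at full intensity emits `window`
    spikes, a black pixel none (illustrative rate coding, not the
    patent's hardware-specific encoding)."""
    return [[round(p / max_intensity * window) for p in row]
            for row in channel]

# One-row toy channel: black, mid-gray, and white pixels.
channel = [[0, 128, 255]]
print(intensity_to_spikes(channel))  # [[0, 5, 10]]
```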