Patent classifications
G06V10/449
PROCESS AND SYSTEM FOR COLOUR GRADING FOR DIAMONDS
A process, operable using a computerized system, grades the colour of a diamond using a pre-trained neural network to determine a colour grade. The computerized system includes an optical image acquisition device, a pre-trained neural network and an output module operably interconnected via a communication link. The process includes: (i) acquiring, via the optical image acquisition device, one or more optical images of at least a portion of a diamond; and (ii) in the pre-trained neural network, providing a regressive value associated with the colour grade of the diamond.
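As a rough illustration of step (ii), the sketch below converts a continuous regression output into a letter grade. The abstract does not say how the regressive value is scaled or converted, so the D-to-Z letter scale, the convention that 0.0 corresponds to D, and the name `grade_from_regression` are all assumptions made for illustration.

```python
import math
import string

# Assumed grading scale: letters D through Z (23 grades), D being the most colourless.
GRADES = string.ascii_uppercase[3:]

def grade_from_regression(value: float) -> str:
    """Map a continuous network output to the nearest letter grade.

    Assumption (not from the abstract): 0.0 means D, 1.0 means E, and so on.
    """
    index = math.floor(value + 0.5)               # round half up
    index = max(0, min(len(GRADES) - 1, index))   # clamp to the valid range
    return GRADES[index]
```

Treating the grade as a regression target rather than a set of unrelated classes keeps the ordering of the grades, so a prediction that lands between two grades can be resolved by rounding.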
Method and apparatus for providing unknown moving object detection
An approach is provided for an unknown moving object detection system. The approach, for instance, involves capturing a plurality of unknown object events indicating an unknown object detected by one or more computer vision systems. The approach also involves clustering the plurality of unknown object events into a plurality of clusters based on one or more clustering parameters. The approach further involves selecting at least one cluster of the plurality of clusters based on a selection criterion. The approach further involves determining at least one operating scenario for the one or more computer vision systems based on a combination of the one or more clustering parameters associated with the selected at least one cluster.
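The capture, cluster, select and determine steps can be sketched as follows. This is a minimal illustration, not the claimed method: the event fields (`x`, `y`, `hour`), the grid-and-time-bucket clustering, and the minimum-size selection criterion are hypothetical choices, since the abstract names neither the clustering parameters nor the selection criterion.

```python
from collections import defaultdict

def cluster_events(events, cell_size=100.0, hours_per_bucket=6):
    """Group events by (grid cell x, grid cell y, time-of-day bucket).

    The key of each cluster records the clustering parameters it shares.
    """
    clusters = defaultdict(list)
    for ev in events:
        key = (int(ev["x"] // cell_size),
               int(ev["y"] // cell_size),
               ev["hour"] // hours_per_bucket)
        clusters[key].append(ev)
    return clusters

def select_clusters(clusters, min_events=3):
    """Assumed selection criterion: keep clusters with at least min_events events."""
    return {key: evs for key, evs in clusters.items() if len(evs) >= min_events}

# Hypothetical unknown-object events reported by a computer vision system
events = [
    {"x": 10.0, "y": 20.0, "hour": 22},
    {"x": 35.0, "y": 60.0, "hour": 23},
    {"x": 80.0, "y": 15.0, "hour": 21},
    {"x": 510.0, "y": 420.0, "hour": 9},
]
selected = select_clusters(cluster_events(events))
# Each selected key is a combination of clustering parameters,
# read here as one candidate operating scenario.
scenarios = sorted(selected)
```

Here the three night-time events in the same grid cell form the only cluster that meets the size threshold, so that cell and time bucket would be flagged as the scenario in which the vision system most often fails to recognise an object.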
Deep learning based adaptive arithmetic coding and codelength regularization
A deep learning based compression (DLBC) system applies trained models to compress binary code of an input image to a target codelength. For a set of binary codes representing the quantized coefficients of an input image, the DLBC system applies a first model that is trained to predict feature probabilities based on the context of each bit of the binary codes. The DLBC system compresses the binary code via adaptive arithmetic coding based on the determined probability of each bit. The compressed binary code represents a balance between a reconstruction quality of a reconstruction of the input image and a target compression ratio of the compressed binary code.
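To show why per-bit probabilities matter, the sketch below computes the ideal codelength that an adaptive arithmetic coder approaches: each bit costs -log2(p) bits, where p is the probability the model assigned to it. A simple count-based context model stands in for the trained model described in the abstract; that substitution, the function name and the `context_size` parameter are assumptions for illustration.

```python
import math
from collections import defaultdict

def adaptive_codelength(bits, context_size=2):
    """Ideal arithmetic-coding length, in bits, under an adaptive context model."""
    # counts[context] = [zeros seen, ones seen], each starting at 1 (Laplace smoothing)
    counts = defaultdict(lambda: [1, 1])
    total = 0.0
    for i, bit in enumerate(bits):
        # Context: the preceding bits (a stand-in for the learned context)
        context = tuple(bits[max(0, i - context_size):i])
        zeros, ones = counts[context]
        p = (ones if bit else zeros) / (zeros + ones)  # probability assigned to this bit
        total += -math.log2(p)                         # cost of coding this bit
        counts[context][bit] += 1                      # adapt only after coding
    return total
```

Because the counts are updated only after each bit is coded, a decoder can rebuild the same probabilities from the bits it has already decoded. A highly predictable sequence such as 64 zeros costs far fewer than 64 bits under this model, which is the effect a better probability predictor amplifies.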
Method of analyzing a fingerprint
A method of analyzing a fingerprint, the method comprising the step of acquiring a fingerprint image (20) together with the following steps: performing filtering processing on the fingerprint image to estimate, for each pixel of the fingerprint image, a first frequency of the ridges (21) in the fingerprint, and using the first frequencies associated with the pixels of the fingerprint image to produce a first frequency map (22) of the fingerprint image; subdividing the fingerprint image into a plurality of windows each comprising a plurality of pixels, calculating a Fourier transform for each window in order to estimate a second frequency of the ridges for all of the pixels in said window, and using the second frequencies associated with the pixels of the windows to produce a second frequency map of the fingerprint image; and merging the first frequency map and the second frequency map in order to obtain a map of consolidated frequencies of the fingerprint image.
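The window-based estimate and the merging step can be sketched as below. This is an illustration under stated assumptions: the brute-force discrete Fourier transform, the choice of the strongest non-DC peak as the ridge frequency, and the per-pixel mean used in `merge_maps` are hypothetical, since the abstract does not specify how the peak is chosen or how the two maps are merged.

```python
import cmath
import math

def window_frequency(window):
    """Estimate the dominant ridge frequency (cycles per pixel) of a square window."""
    n = len(window)
    mean = sum(map(sum, window)) / (n * n)  # subtract the mean so the DC term cannot win
    best_mag, best_freq = 0.0, 0.0
    for u in range(n):
        for v in range(n):
            if (u, v) == (0, 0):
                continue
            acc = sum((window[y][x] - mean)
                      * cmath.exp(-2j * math.pi * (u * x + v * y) / n)
                      for y in range(n) for x in range(n))
            if abs(acc) > best_mag:
                # Indices above n/2 represent negative frequencies
                fu = u - n if u > n // 2 else u
                fv = v - n if v > n // 2 else v
                best_mag, best_freq = abs(acc), math.hypot(fu, fv) / n
    return best_freq

def merge_maps(first, second):
    """Assumed merge rule: per-pixel mean of the two frequency maps."""
    return [[(a + b) / 2 for a, b in zip(r1, r2)] for r1, r2 in zip(first, second)]

# Synthetic vertical ridges with a period of 4 pixels in an 8x8 window
window = [[math.cos(2 * math.pi * x / 4) for x in range(8)] for _ in range(8)]
freq = window_frequency(window)
```

For the synthetic window the estimate is one ridge every 4 pixels. In the method described, that single value would be assigned to every pixel of the window to build the second frequency map before merging it with the per-pixel map from the filtering step.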
POLY-SCALE KERNEL-WISE CONVOLUTION FOR HIGH-PERFORMANCE VISUAL RECOGNITION APPLICATIONS
Techniques related to poly-scale kernel-wise convolutional neural network layers are discussed. A poly-scale kernel-wise convolutional neural network layer is applied to an input volume to generate an output volume and includes filters, each having a number of filter kernels with the same sample rate and differing dilation rates, optionally in a repeating pattern of dilation rate groups within each of the filters, with the pattern of dilation rate groups offset between the filters of the poly-scale kernel-wise convolutional neural network layer.
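One possible reading of the repeating, offset pattern of dilation rates is sketched below. The cyclic rule `rates[(c + f) % len(rates)]`, the rate set (1, 2, 4) and both function names are assumptions for illustration; the abstract says only that the pattern repeats within a filter and is offset between filters.

```python
def dilation_table(num_filters, num_kernels, rates=(1, 2, 4)):
    """Return table[f][c]: the dilation rate of the kernel for input channel c in filter f.

    The pattern cycles through `rates` along the kernels of a filter and is
    shifted by one position from each filter to the next.
    """
    return [[rates[(c + f) % len(rates)] for c in range(num_kernels)]
            for f in range(num_filters)]

def dilated_taps(center, kernel_size, dilation):
    """Input positions sampled by a 1-D kernel of odd size centred at `center`."""
    half = kernel_size // 2
    return [center + k * dilation for k in range(-half, half + 1)]

table = dilation_table(num_filters=3, num_kernels=4)
# table == [[1, 2, 4, 1], [2, 4, 1, 2], [4, 1, 2, 4]]
taps = dilated_taps(center=10, kernel_size=3, dilation=2)
# taps == [8, 10, 12]
```

Every kernel keeps the same number of taps, so the parameter count matches an ordinary convolution, while each input channel is seen at several receptive-field sizes across the filters.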
RETINAL ENCODER FOR MACHINE VISION
A method is disclosed including: receiving raw image data corresponding to a series of raw images; processing the raw image data with an encoder to generate encoded data, where the encoder is characterized by an input/output transformation that substantially mimics the input/output transformation of one or more retinal cells of a vertebrate retina; and applying a first machine vision algorithm to data generated based at least in part on the encoded data.
METHODS AND SYSTEMS FOR ANNOTATION AND TRUNCATION OF MEDIA ASSETS
Methods and systems for improving the interactivity of media content are described. The methods and systems are particularly applicable to the e-learning space, which features unique problems in engaging with users, maintaining that engagement, and allowing users to alter media assets to their specific needs. To address these issues, and to improve the interactivity of media assets generally, the methods and systems described herein provide for annotation and truncation of media assets. More particularly, the methods and systems described herein provide features such as annotation guidance and video condensation.
Low-cost face recognition using Gaussian receptive field features
Methods and systems may provide for facial recognition of at least one input image utilizing hierarchical feature learning and pair-wise classification. Receptive field theory may be used on the input image to generate a pre-processed multi-channel image. Channels in the pre-processed image may be activated based on the amount of feature rich details within the channels. Similarly, local patches may be activated based on the discriminant features within the local patches. Features may be extracted from the local patches and the most discriminant features may be selected in order to perform feature matching on pair sets. The system may utilize patch feature pooling, pair-wise matching, and large-scale training in order to quickly and accurately perform facial recognition at a low cost for both system memory and computation.
Adaptive image cropping for face recognition
By adding a side network to a face recognition network, output of early convolution blocks may be used to determine relative bounding box values. The relative bounding box values may be used to refine existing bounding box values with a view to improving the generation, by the face recognition network, of embedding vectors.
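The refinement step can be pictured with the sketch below, which applies relative values to an existing box. The (x, y, w, h) box layout and the meaning of the four relative values are assumptions made for illustration; the abstract does not define how the side network's outputs are parameterised.

```python
def refine_box(box, rel):
    """Apply relative offsets to an existing (x, y, w, h) bounding box.

    Assumption: rel = (dx, dy, dw, dh), where dx and dy shift the corner by a
    fraction of the width and height, and dw and dh are fractional size changes.
    """
    x, y, w, h = box
    dx, dy, dw, dh = rel
    return (x + dx * w, y + dy * h, w * (1 + dw), h * (1 + dh))

# Hypothetical existing crop and side-network output
refined = refine_box((100, 50, 200, 100), (0.1, -0.1, 0.5, 0.0))
```

Expressing the correction relative to the current box keeps the side network's outputs independent of image resolution, and a zero vector leaves the existing crop unchanged.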
TILING FORMAT FOR CONVOLUTIONAL NEURAL NETWORKS
Systems, apparatuses, and methods for converting data to a tiling format when implementing convolutional neural networks are disclosed. A system includes at least a memory, a cache, a processor, and a plurality of compute units. The memory stores a first buffer and a second buffer in a linear format, where the first buffer stores convolutional filter data and the second buffer stores image data. The processor converts the first and second buffers from the linear format to third and fourth buffers, respectively, in a tiling format. The plurality of compute units load the tiling-formatted data from the third and fourth buffers in memory to the cache and then perform a convolutional filter operation on the tiling-formatted data. The system generates a classification of a first dataset based on a result of the convolutional filter operation.
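The conversion from a linear format to a tiling format can be sketched as a reordering of a row-major buffer so that the elements of each tile become contiguous. The function name, the 2-D layout and the tile dimensions are assumptions for illustration; the abstract does not give the actual tile shape or buffer layout.

```python
def to_tiled(linear, width, height, tile_w, tile_h):
    """Reorder a row-major buffer so that each tile's elements are contiguous."""
    assert width % tile_w == 0 and height % tile_h == 0, "sketch assumes exact tiling"
    tiled = []
    for ty in range(0, height, tile_h):          # tile rows, top to bottom
        for tx in range(0, width, tile_w):       # tiles within a row, left to right
            for y in range(ty, ty + tile_h):     # rows inside the tile
                row_start = y * width + tx
                tiled.extend(linear[row_start:row_start + tile_w])
    return tiled

# A 4x4 buffer holding 0..15 in row-major order, split into 2x2 tiles
tiled = to_tiled(list(range(16)), width=4, height=4, tile_w=2, tile_h=2)
# tiled == [0, 1, 4, 5, 2, 3, 6, 7, 8, 9, 12, 13, 10, 11, 14, 15]
```

With this layout, a compute unit working on one tile reads a single contiguous run of memory instead of several strided rows, which is the kind of cache-friendly access the conversion is meant to enable.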