Patent classifications
G06V10/449
Spatially preserving flattening in deep learning neural networks
Techniques for spatially preserving flattening in deep learning neural networks are provided. In one aspect, a spatially preserving flattening module includes: a predictor for generating image feature maps from at least one convolutional layer of a feature extraction phase of a deep learning neural network applied to input image data; an auto-encoder for producing encodings of the image feature maps that preserve location and shape information associated with objects in the input image data; and a flattener for concatenating the encodings of the image feature maps to form a spatially preserving flattened encoding vector. A deep learning neural network that includes the present spatially preserving flattening module is also provided, as is a method for spatially preserving flattening.
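The pipeline described above can be sketched in miniature. The abstract's auto-encoder is replaced here by a hypothetical hand-written encoder that keeps location (the activation centroid) and coarse shape (quadrant means) for each feature map; `encode_feature_map` and `flatten_preserving` are illustrative names, not from the patent, and the real module would learn its encodings rather than compute them by rule.

```python
def encode_feature_map(fmap):
    # Hypothetical stand-in for the patent's auto-encoder: summarise a 2-D
    # feature map (list of rows) while preserving spatial information.
    h, w = len(fmap), len(fmap[0])
    total = sum(v for row in fmap for v in row) or 1.0
    # Activation centroid: a crude descriptor of object location.
    cy = sum(r * v for r, row in enumerate(fmap) for v in row) / total
    cx = sum(c * v for row in fmap for c, v in enumerate(row)) / total
    # Quadrant means: a crude descriptor of object shape/extent.
    hh, hw = h // 2, w // 2
    quads = []
    for r0 in (0, hh):
        for c0 in (0, hw):
            block = [fmap[r][c] for r in range(r0, r0 + hh)
                                for c in range(c0, c0 + hw)]
            quads.append(sum(block) / len(block))
    return [cy, cx] + quads

def flatten_preserving(feature_maps):
    # The "flattener": concatenate per-map encodings into one
    # spatially preserving flattened encoding vector.
    vec = []
    for fmap in feature_maps:
        vec.extend(encode_feature_map(fmap))
    return vec
```

Unlike a plain `flatten()`, each segment of the output vector still carries where and how large the activation was in its source map.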
POLY-SCALE KERNEL-WISE CONVOLUTION FOR HIGH-PERFORMANCE VISUAL RECOGNITION APPLICATIONS
Techniques related to poly-scale kernel-wise convolutional neural network layers are discussed. A poly-scale kernel-wise convolutional neural network layer is applied to an input volume to generate an output volume. The layer includes filters that each have a number of filter kernels with the same sample rate but differing dilation rates, optionally arranged in a repeating pattern of dilation-rate groups within each filter, with the pattern of dilation-rate groups offset between the filters of the poly-scale kernel-wise convolutional neural network layer.
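A minimal 1-D sketch of the idea: each filter applies a different dilation rate per kernel, drawn from a repeating pattern that is offset between filters. The pattern rule `base_rates[(k + f) % len(base_rates)]` is one hypothetical concrete choice of the offset scheme the abstract describes, not the patent's exact formulation.

```python
def dilated_same_1d(x, kernel, dilation):
    # 'Same' 1-D cross-correlation: kernel taps spaced `dilation` apart,
    # zero-padded so the output length equals the input length.
    pad = (len(kernel) - 1) // 2 * dilation
    xp = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(kernel[k] * xp[i + k * dilation] for k in range(len(kernel)))
            for i in range(len(x))]

def poly_scale_layer(inputs, filters, base_rates=(1, 2, 4)):
    # inputs: list of input channels; filters: list of filters, each a list
    # of per-channel kernels. Kernel k of filter f uses dilation rate
    # base_rates[(k + f) % len(base_rates)]: the same repeating pattern of
    # dilation-rate groups within each filter, offset by one between
    # consecutive filters (a hypothetical concrete offset scheme).
    out = []
    for f, filt in enumerate(filters):
        acc = [0.0] * len(inputs[0])
        for k, kernel in enumerate(filt):
            d = base_rates[(k + f) % len(base_rates)]
            for i, v in enumerate(dilated_same_1d(inputs[k], kernel, d)):
                acc[i] += v
        out.append(acc)
    return out
```

Mixing dilation rates within one filter lets a single output channel aggregate features at several receptive-field scales, which is the "poly-scale" property the title refers to.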
Systems and Methods for Medical Image Diagnosis Using Machine Learning
Systems and methods for medical image diagnoses in accordance with embodiments of the invention are illustrated. One embodiment includes a method for evaluating multimedia content. The method includes steps for receiving multimedia content and identifying a set of one or more image frames for each of several target views from the received multimedia content. For each target view, the method includes steps for evaluating the corresponding set of image frames to generate an intermediate result. The method includes steps for determining a composite result based on the intermediate results for each of the several target views.
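The per-view evaluation and composite step can be outlined as follows. The frame scorer, the mean used for intermediate results, and the `max` used for the composite are hypothetical stand-ins for the machine learning models and aggregation the method actually uses; `evaluate_views` is an illustrative name.

```python
def evaluate_views(frames_by_view, evaluate_frame, combine=max):
    # frames_by_view: maps each target view to its set of image frames.
    # evaluate_frame: hypothetical stand-in for the ML model scoring a frame.
    intermediates = {}
    for view, frames in frames_by_view.items():
        scores = [evaluate_frame(frame) for frame in frames]
        # Intermediate result per target view (here: mean frame score).
        intermediates[view] = sum(scores) / len(scores)
    # Composite result across all target views (here: the maximum).
    return intermediates, combine(intermediates.values())
```

In practice each view would feed a view-specific classifier, but the control flow (frames per view, then intermediate result per view, then one composite) mirrors the claimed method.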