G06V30/194

Learning apparatus, operation program of learning apparatus, and operation method of learning apparatus
11594056 · 2023-02-28

A learning apparatus trains a machine learning model that performs semantic segmentation, that is, determines a plurality of classes in an input image pixel by pixel, by extracting, for each layer, features of the input image that occupy different spatial-frequency bands. A learning data analysis unit analyzes the frequency bands contained in an annotation image of the learning data. A learning method determination unit determines a learning method that uses the learning data, based on the analysis result from the learning data analysis unit. A learning unit trains the machine learning model on the learning data using the determined learning method.
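
One way to picture the frequency-band analysis of an annotation image is a simple multi-resolution decomposition. The sketch below is illustrative only and is not taken from the patent: the 2x2 averaging pyramid, the function names (`band_energies`, `dominant_band`) and the idea of picking the band with the most energy are all assumptions made for the example.

```python
def downsample(img):
    """Halve resolution by averaging each 2x2 block (assumes even dimensions)."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]


def upsample(img):
    """Double resolution by repeating every value in a 2x2 block."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.extend([wide, list(wide)])
    return out


def band_energies(annotation, levels):
    """Mean squared detail lost at each halving step, finest band first.

    Assumes both image dimensions are divisible by 2 ** levels.
    """
    energies = []
    current = [[float(v) for v in row] for row in annotation]
    for _ in range(levels):
        coarse = downsample(current)
        approx = upsample(coarse)
        h, w = len(current), len(current[0])
        detail = sum((current[y][x] - approx[y][x]) ** 2
                     for y in range(h) for x in range(w))
        energies.append(detail / (h * w))
        current = coarse
    return energies


def dominant_band(energies):
    """Index of the band holding the most energy (0 = finest detail)."""
    return max(range(len(energies)), key=lambda i: energies[i])


# Hypothetical annotation masks: a one-pixel checkerboard and a left/right split.
checkerboard = [[(x + y) % 2 for x in range(8)] for y in range(8)]
halves = [[0 if x < 4 else 1 for x in range(8)] for y in range(8)]
```

A learning method determination step could then, for instance, give extra weight to the layers that handle the band reported by `dominant_band`. That rule is a hypothetical stand-in; the abstract does not say which learning methods are chosen.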

Systems and methods for searching audiovisual data using latent codes from generative networks and models

Systems and methods for viewing, storing, transmitting, searching, and editing application-specific audiovisual content (or other unstructured data) are disclosed in which edge devices generate content on the fly from a partial set of instructions rather than merely accessing the content in its final or near-final form. An image processing architecture may include a generative model that may be a deep learning model. The generative model may include a latent space comprising a plurality of latent codes and a trained generator mapping. The trained generator mapping may convert points in the latent space to uncompressed data points, which in the case of audiovisual content may be generated image frames. The generative model may be capable of closely approximating (up to noise or perceptual error) most or all potential data points in the relevant compression application, which in the case of audiovisual content may be source images.
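
The relationship between latent codes, the generator mapping and search can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed system: the generator here is a fixed linear map, the index is a plain dictionary, and cosine similarity is just one plausible way to compare latent codes.

```python
import math


def generate(latent, weights):
    """Toy stand-in for a trained generator mapping: latent code -> pixel values."""
    return [sum(w * z for w, z in zip(row, latent)) for row in weights]


def cosine(a, b):
    """Cosine similarity between two latent codes (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index, query, top_k=1):
    """Return the names of the stored items whose latent codes best match the query."""
    ranked = sorted(index.items(), key=lambda kv: cosine(kv[1], query), reverse=True)
    return [name for name, _ in ranked[:top_k]]


# Hypothetical index: each frame is stored as a short latent code, not as pixels.
index = {"frame_a": [1.0, 0.0], "frame_b": [0.0, 1.0], "frame_c": [1.0, 1.0]}
# Hypothetical generator weights mapping a 2-value code to 3 "pixels".
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
```

Because only the latent codes are stored and compared, a device holding the generator can rebuild a matching frame with `generate(index[name], weights)` after the search, rather than fetching the frame itself.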

Fast image style transfers

Manipulating images using computationally expensive machine learning schemes can be implemented using server-generated models of the machine learning schemes that are transmitted to a client device for application. The schemes can include convolutional neural networks having a kernel comprising a plurality of low-rank matrices.
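
A kernel built from low-rank matrices can be applied as a sum of cheap one-dimensional passes. The sketch below assumes, for illustration, that each low-rank matrix is a rank-1 outer product of a column vector and a row vector; the abstract does not specify the factorization, and all function names here are made up for the example.

```python
def correlate2d(img, kernel):
    """Valid-mode 2D cross-correlation of a single-channel image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(ow)]
            for y in range(oh)]


def low_rank_kernel(factors):
    """Dense kernel formed as the sum of rank-1 matrices u * v^T."""
    u0, v0 = factors[0]
    kernel = [[0.0] * len(v0) for _ in u0]
    for u, v in factors:
        for i, a in enumerate(u):
            for j, b in enumerate(v):
                kernel[i][j] += a * b
    return kernel


def separable_correlate(img, factors):
    """Apply each rank-1 term as a vertical pass then a horizontal pass, and sum."""
    total = None
    for u, v in factors:
        vertical = correlate2d(img, [[a] for a in u])   # column vector u
        part = correlate2d(vertical, [list(v)])         # row vector v
        if total is None:
            total = part
        else:
            total = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(total, part)]
    return total


# Hypothetical rank-2 kernel made of two 3x3 rank-1 terms.
factors = [([1, 2, 1], [1, 0, -1]), ([1, 0, -1], [1, 2, 1])]
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
```

For a k-by-k kernel of rank R, the factored form costs about 2kR multiplications per output pixel instead of k squared, so it pays off when R is small relative to k. That saving is one reason such a model may be light enough to run on a client device.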

Using attributes for identifying imagery for selection

A system includes a computing device that includes a memory configured to store instructions. The system also includes a processor to execute the instructions to perform operations that include receiving data representing an image, the image being represented in the data by a collection of visual elements. Operations also include determining whether to select the image for presentation by one or more entities using a machine learning system, the machine learning system being trained using data representing a plurality of training images and data representing one or more attributes regarding image presentation by the one or more entities.
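
As a rough illustration of a selection decision that uses both image data and presentation attributes, the sketch below trains a small logistic regression on concatenated inputs. Everything specific in it is assumed for the example: the single image feature, the single entity attribute, the toy training rows and the 0.5 threshold are not from the source.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights with plain batch gradient descent."""
    n = len(samples[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += err
            for i, v in enumerate(x):
                grad_w[i] += err * v
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def should_select(image_features, entity_attributes, model):
    """Decide whether to select an image, given its features and entity attributes."""
    weights, bias = model
    x = list(image_features) + list(entity_attributes)
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x))) > 0.5


# Made-up training rows: [image feature, entity attribute], label 1 = selected.
samples = [[1.0, 1.0], [0.9, 0.8], [0.0, 0.0], [0.1, 0.2]]
labels = [1, 1, 0, 0]
model = train(samples, labels)
```

A call such as `should_select([0.9], [0.8], model)` then returns a yes/no decision for one image and one entity. A production system would use far richer visual features and a larger model; the point here is only that the attributes enter the model alongside the image data.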

Automatic decisioning over unstructured data

Automatic decisioning associated with unstructured data is disclosed. Unstructured data, such as that associated with comments of an underwriter regarding a credit decision, can be received. Text mining can be performed to extract features from the unstructured data. The extracted features can subsequently be provided as input to a machine learning model configured to return a prediction of a class associated with the unstructured data. The predicted class, such as approved or rejected, can subsequently be conveyed for display on a display device.
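
The pipeline described above (text mining, then a class prediction) can be sketched with a bag-of-words tokenizer and a multinomial naive Bayes classifier. This is an assumed, minimal realization rather than the disclosed one; the underwriter comments below are invented for the example.

```python
import math
import re
from collections import Counter, defaultdict


def extract_features(text):
    """Very simple text mining: lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(extract_features(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        tokens = extract_features(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented underwriter comments with their decisions.
comments = [
    "stable income and low debt",
    "strong credit history, stable employment",
    "missed payments and high debt",
    "recent default, unstable income",
]
decisions = ["approved", "approved", "rejected", "rejected"]
classifier = NaiveBayes().fit(comments, decisions)
```

Calling `classifier.predict(...)` on a new comment returns the predicted class, which is the value that would be conveyed for display.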

Method of detecting at least one element of interest visible in an input image by means of a convolutional neural network

A method of detecting at least one element of interest visible in an input image by means of a convolutional neural network (CNN), the method comprising the steps of: (a) extracting, by means of an ascending branch of a first subnetwork of said CNN, said first subnetwork being of feature pyramid network (FPN) type, a plurality of initial feature maps (C1, C2, C3, C4, C5) representative of the input image at different scales, said FPN further comprising a descending branch and lateral connections between the ascending branch and the descending branch, at least one lateral connection comprising an attention module; (b) generating, by means of said descending branch of the FPN, a plurality of enriched feature maps (P2, P3, P4, P5) also representative of the input image at different scales, each enriched feature map (P2, P3, P4, P5) incorporating the information from the initial feature maps (C1, C2, C3, C4, C5) of smaller or equal scale; (d) detecting at least one element of interest visible in the input image, by means of a second subnetwork of said CNN, referred to as the detection network, taking said enriched feature maps (P2, P3, P4, P5) as input.
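
The data flow of steps (a) and (b) can be sketched on single-channel maps. This is a structural illustration only, with several assumptions: 2x2 average pooling stands in for the convolutional stages, the attention module is a hypothetical sigmoid gate, lateral connections are drawn from C2 to C5 only, and the detection network of step (d) is omitted.

```python
import math


def avg_pool2x(fmap):
    """Halve resolution by averaging each 2x2 block (assumes even dimensions)."""
    return [[(fmap[y][x] + fmap[y][x + 1] + fmap[y + 1][x] + fmap[y + 1][x + 1]) / 4.0
             for x in range(0, len(fmap[0]), 2)]
            for y in range(0, len(fmap), 2)]


def upsample2x(fmap):
    """Double resolution by nearest-neighbour repetition."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.extend([wide, list(wide)])
    return out


def attention(fmap):
    """Hypothetical attention module: scale each value by a sigmoid gate of itself."""
    return [[v / (1.0 + math.exp(-v)) for v in row] for row in fmap]


def ascending_branch(image, depth=5):
    """Step (a): initial maps C1..C5, each at half the resolution of the previous."""
    maps, current = {}, image
    for level in range(1, depth + 1):
        current = avg_pool2x(current)
        maps[level] = current
    return maps


def descending_branch(initial):
    """Step (b): enriched maps, built from the coarsest level down.

    `initial` maps consecutive level numbers to feature maps. Each enriched map
    adds the attention-filtered lateral input to the upsampled coarser map.
    """
    levels = sorted(initial, reverse=True)
    enriched = {levels[0]: attention(initial[levels[0]])}
    for level in levels[1:]:
        lateral = attention(initial[level])
        top_down = upsample2x(enriched[level + 1])
        enriched[level] = [[a + b for a, b in zip(r1, r2)]
                           for r1, r2 in zip(lateral, top_down)]
    return enriched


# Hypothetical 32x32 constant image, giving C1 (16x16) down to C5 (1x1).
image = [[1.0] * 32 for _ in range(32)]
c_maps = ascending_branch(image)
p_maps = descending_branch({k: c_maps[k] for k in (2, 3, 4, 5)})
```

In this sketch each P map accumulates the contributions of its own C map and of every coarser one, which mirrors the statement that an enriched map incorporates the initial maps of smaller or equal scale.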
