G06V10/449

DATA COMPRESSION FOR MACHINE LEARNING TASKS
20180174047 · 2018-06-21

A machine learning (ML) task system trains a neural network model that learns a compressed representation of acquired data and performs an ML task using the compressed representation. The neural network model is trained to generate a compressed representation that balances the objectives of achieving a target codelength and achieving high accuracy of the output of the performed ML task. During deployment, an encoder portion and a task portion of the neural network model are separately deployed. A first system acquires data, applies the encoder portion to generate a compressed representation, performs an encoding process to generate compressed codes, and transmits the compressed codes. A second system regenerates the compressed representation from the compressed codes and applies the task portion to determine the output of the ML task.
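The split deployment described above can be sketched in a few lines. The linear encoder, linear task head, and 8x quantization scale below are illustrative assumptions, not the patent's trained models:

```python
# Minimal sketch of split deployment: system 1 encodes and quantizes,
# system 2 regenerates the representation and runs the task head.
import numpy as np

rng = np.random.default_rng(0)
W_enc = rng.normal(size=(4, 16))    # "encoder portion": 16-dim input -> 4-dim representation
W_task = rng.normal(size=(3, 4))    # "task portion": representation -> 3 class logits

def sender(x):
    """First system: acquire data, encode, quantize to compressed codes."""
    z = W_enc @ x                                # compressed representation
    codes = np.clip(np.round(z * 8), -127, 127)  # coarse scalar quantization
    return codes.astype(np.int8)                 # "transmitted" compressed codes

def receiver(codes):
    """Second system: regenerate the representation and run the ML task."""
    z_hat = codes.astype(np.float64) / 8.0       # approximate representation
    logits = W_task @ z_hat
    return int(np.argmax(logits))                # task output, e.g. a class label

x = rng.normal(size=16)
label = receiver(sender(x))
```

Only the int8 codes cross the wire, so the transmission cost is set by the representation size and quantization scale rather than by the raw input.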

AUTOENCODING IMAGE RESIDUALS FOR IMPROVING UPSAMPLED IMAGES
20180174275 · 2018-06-21

An enhanced encoder system generates residual bitstreams representing additional image information that can be used by an image enhancement system to improve a low quality image. The enhanced encoder system upsamples a low quality image and compares the upsampled image to a true high quality image to determine image inaccuracies that arise due to the upsampling process. The enhanced encoder system encodes the information describing the image inaccuracies using a trained encoder model as the residual bitstream. The image enhancement system upsamples the same low quality image to obtain a prediction of a high quality image that can include image inaccuracies. Given the residual bitstream, the image enhancement system decodes the residual bitstream using a trained decoder model and uses the additional image information to improve the predicted high quality image. The image enhancement system can provide an improved, high quality image for display.
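The residual pipeline might be sketched as follows, with coarse scalar quantization standing in for the trained encoder and decoder models (the 16x scale and nearest-neighbor upsampler are assumptions):

```python
# Sketch: upsample a low-quality image, encode the residual against the
# true high-quality image, and use the decoded residual to improve the prediction.
import numpy as np

rng = np.random.default_rng(1)
hi = rng.random((8, 8))          # "true" high-quality image
lo = hi[::2, ::2]                # low-quality (downsampled) image

def upsample(img):
    """Nearest-neighbor upsampling, a stand-in for the real upsampler."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

pred = upsample(lo)              # predicted high-quality image
residual = hi - pred             # inaccuracies introduced by upsampling

# stand-ins for the trained encoder/decoder: a quantization round-trip
bitstream = np.round(residual * 16).astype(np.int8)   # "residual bitstream"
decoded = bitstream / 16.0

improved = pred + decoded        # enhanced image for display
err_before = float(np.abs(hi - pred).mean())
err_after = float(np.abs(hi - improved).mean())
```

After adding the decoded residual, the remaining error is bounded by the quantization step, well below the raw upsampling error.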

DEEP LEARNING BASED ON IMAGE ENCODING AND DECODING
20180176570 · 2018-06-21

A deep learning based compression (DLBC) system trains multiple models that, when deployed, generate a compressed binary encoding of an input image that achieves a target reconstruction quality and a target compression ratio. The applied models identify structures of an input image, quantize the input image to a target bit precision, and compress the binary code of the input image via adaptive arithmetic coding to a target codelength. During training, the DLBC system reconstructs the input image from the compressed binary encoding and determines the loss in quality from the encoding process. Thus, the models can be continually trained to, when applied to an input image, minimize the loss in reconstruction quality that arises due to the encoding process while also achieving the target compression ratio.
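The quantization-to-target-bit-precision step, and the reconstruction loss it introduces, can be illustrated with a toy scalar quantizer (the `quantize`/`dequantize` helpers are hypothetical stand-ins, not the DLBC models):

```python
# Sketch: quantize coefficients to a target bit precision and measure
# the reconstruction distortion the encoding process introduces.
import numpy as np

def quantize(coeffs, bits):
    """Map coefficients in [0, 1) to integer codes at a target bit precision."""
    levels = 2 ** bits
    return np.floor(coeffs * levels).astype(np.int64)

def dequantize(codes, bits):
    """Reconstruct by placing each value at the center of its quantization bin."""
    levels = 2 ** bits
    return (codes + 0.5) / levels

rng = np.random.default_rng(2)
img = rng.random((4, 4))
codes = quantize(img, 6)                         # 6-bit precision: codes in 0..63
recon = dequantize(codes, 6)
distortion = float(np.abs(img - recon).max())    # loss from the encoding process
```

Raising the bit precision halves the worst-case distortion per added bit, which is the quality/codelength trade-off the training objective balances.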

DEEP LEARNING BASED ADAPTIVE ARITHMETIC CODING AND CODELENGTH REGULARIZATION
20180176576 · 2018-06-21

A deep learning based compression (DLBC) system applies trained models to compress the binary code of an input image to a target codelength. For a set of binary codes representing the quantized coefficients of an input image, the DLBC system applies a first model that is trained to predict feature probabilities based on the context of each bit of the binary codes. The DLBC system compresses the binary code via adaptive arithmetic coding based on the determined probability of each bit. The compressed binary code represents a balance between a reconstruction quality of a reconstruction of the input image and a target compression ratio of the compressed binary code.
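The benefit of context-based bit probabilities can be seen from the ideal arithmetic-coding codelength, -log2 p per bit. The `contextual` predictor below is a toy stand-in for the trained probability model:

```python
# Sketch: an arithmetic coder spends about -log2 p(bit) bits per symbol,
# so a good context model shortens the code well below 1 bit/symbol.
import math

def ideal_codelength(bits, probs):
    """Sum of -log2 p(bit) over the bitstream, where probs gives P(bit == 1)."""
    return sum(-math.log2(p if b == 1 else 1.0 - p)
               for b, p in zip(bits, probs))

bits = [1, 1, 0, 1, 1, 1, 0, 1]
uniform = [0.5] * 8                                   # no context model: 1 bit per symbol
contextual = [0.9 if b == 1 else 0.1 for b in bits]   # a toy well-calibrated predictor

len_uniform = ideal_codelength(bits, uniform)      # exactly 8.0 bits
len_context = ideal_codelength(bits, contextual)   # about 1.2 bits in total
```

The better the model's per-bit probabilities match the data, the closer the compressed codelength gets to the entropy of the binary codes.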

ADAPTIVE COMPRESSION BASED ON CONTENT
20180176578 · 2018-06-21

A compression system trains a machine-learned encoder and decoder. The encoder can be deployed by a sender system to encode content for transmission to a receiver system, and the decoder can be deployed by the receiver system to decode the encoded content and reconstruct the original content. The encoder receives content and generates a tensor as a compact representation of the content. The content may be, for example, images, videos, or text. The decoder receives a tensor and generates a reconstructed version of the content. In one embodiment, the compression system trains one or more encoding components such that the encoder can adaptively encode different degrees of information for regions in the content that are associated with characteristic objects, such as human faces, texts, or buildings.
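Adaptive allocation of bits by region can be sketched with an importance map; the map, bit budgets, and `adaptive_quantize` helper are illustrative assumptions, not the trained encoding components:

```python
# Sketch: spend more quantization bits where an importance map flags
# characteristic objects (e.g. a detected face), fewer on the background.
import numpy as np

def adaptive_quantize(tensor, importance, base_bits=2, extra_bits=4):
    """Quantize each cell with more levels where importance is high."""
    bits = base_bits + np.round(importance * extra_bits).astype(int)
    levels = 2.0 ** bits
    return np.floor(tensor * levels) / levels, bits

rng = np.random.default_rng(3)
content = rng.random((4, 4))
importance = np.zeros((4, 4))
importance[1:3, 1:3] = 1.0               # e.g. a detected face region

recon, bits = adaptive_quantize(content, importance)
err = np.abs(content - recon)
face_err = float(err[1:3, 1:3].mean())   # finely quantized region
bg_err = float(err[importance == 0].mean())
```

The face region gets 6-bit codes while the background gets 2-bit codes, so reconstruction error concentrates in the regions that matter least.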

USING GENERATIVE ADVERSARIAL NETWORKS IN COMPRESSION
20180174052 · 2018-06-21

The compression system trains a machine-learned encoder and decoder through an autoencoder architecture. The encoder can be deployed by a sender system to encode content for transmission to a receiver system, and the decoder can be deployed by the receiver system to decode the encoded content and reconstruct the original content. The encoder is coupled to receive content and output a tensor as a compact representation of the content. The content may be, for example, images, videos, or text. The decoder is coupled to receive a tensor representing content and output a reconstructed version of the content. The compression system trains the autoencoder with a discriminator to reduce compression artifacts in the reconstructed content. The discriminator is coupled to receive input content and output a discrimination prediction of whether the input content is the original or a reconstructed version of the content.
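The discriminator's role can be illustrated with the standard GAN binary cross-entropy losses; the probability values below are toy numbers, not outputs of the patented system:

```python
# Sketch: the discriminator is pushed to score originals near 1 and
# reconstructions near 0, while the autoencoder is rewarded for fooling it.
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: push D(original) -> 1 and D(reconstruction) -> 0."""
    return -math.log(d_real) - math.log(1.0 - d_fake)

def generator_adversarial_loss(d_fake):
    """Autoencoder term that rewards reconstructions the discriminator accepts."""
    return -math.log(d_fake)

# toy discriminator outputs on an original and a reconstructed image
loss_d = discriminator_loss(d_real=0.9, d_fake=0.2)
loss_g = generator_adversarial_loss(d_fake=0.2)
```

Because the generator loss falls as the discriminator is fooled, the autoencoder is driven toward reconstructions without the tell-tale compression artifacts the discriminator learns to spot.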

Method and apparatus for determining target region in video frame for target acquisition
09990546 · 2018-06-05

An example target acquisition method includes obtaining, according to a global feature of each video frame of a plurality of video frames, a target pre-estimated position of each scale in the video frame; clustering the target pre-estimated positions in each video frame to obtain corresponding target candidate regions; and determining a target actual region in the video frame according to all the target candidate regions in each video frame, in combination with confidence levels of the target candidate regions and corresponding scale processing. The techniques of the present disclosure quickly and effectively acquire one or multiple targets and, in particular, accurately distinguish and acquire multiple targets.
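The clustering of pre-estimated positions into candidate regions might look roughly like the greedy merge below; the radius, point set, and algorithm are illustrative assumptions, as the abstract does not specify the clustering method:

```python
# Sketch: merge per-scale pre-estimated target positions that fall close
# together into candidate region centers.
import numpy as np

def cluster_positions(points, radius=5.0):
    """Greedy clustering: attach each point to the first cluster whose
    running mean is within `radius`, else start a new cluster."""
    clusters = []
    for p in points:
        for c in clusters:
            if np.linalg.norm(p - np.mean(c, axis=0)) <= radius:
                c.append(p)
                break
        else:
            clusters.append([p])
    return [np.mean(c, axis=0) for c in clusters]

# pre-estimated positions from several scales: two noisy targets
points = np.array([[10.0, 10.0], [11.0, 9.0], [12.0, 11.0],
                   [50.0, 50.0], [49.0, 51.0]])
centers = cluster_positions(points)   # one center per distinct target
```

Each cluster center becomes a target candidate region, which downstream confidence scoring can promote to a target actual region.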

Low-power always-on face detection, tracking, recognition and/or analysis using events-based vision sensor

Techniques disclosed herein utilize a vision sensor that integrates a special-purpose camera with dedicated computer vision (CV) computation hardware and a dedicated low-power microprocessor for the purposes of detecting, tracking, recognizing, and/or analyzing subjects, objects, and scenes in the view of the camera. The vision sensor processes the information retrieved from the camera using the included low-power microprocessor and sends events (or indications that one or more reference occurrences have occurred, and, possibly, associated data) to the main processor only when needed or as defined and configured by the application. This allows the general-purpose microprocessor (which is typically relatively high-speed and high-power to support a variety of applications) to stay in a low-power state (e.g., a sleep mode) most of the time, becoming active only when events are received from the vision sensor.
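The event-gated wake-up pattern can be sketched as a generator that emits events only on reference occurrences; the `face_detected` predicate below is a hypothetical stand-in for the dedicated CV hardware:

```python
# Sketch: the vision sensor scans every frame but surfaces an event to the
# main processor only when a configured reference occurrence fires.
def vision_sensor(frames, face_detected):
    """Low-power path: yield an event only for frames with a detection."""
    for i, frame in enumerate(frames):
        if face_detected(frame):          # dedicated CV hardware stand-in
            yield {"frame": i, "event": "face"}

# toy stream: 100 frames, a face visible every 25th frame
wakeups = list(vision_sensor(range(100), lambda f: f % 25 == 0))
# the main processor wakes len(wakeups) times instead of once per frame
```

The main processor's duty cycle is thus proportional to the event rate, not the frame rate, which is where the power saving comes from.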

Method and System for Optimization of a Human-Machine Team for Geographic Region Digitization

A method that includes receiving a set of one or more images, each having one or more sets of pixels; receiving a ground truth value indicating that a vertex point is associated with a transition between two regions in a respective set of pixels; identifying a machine placement candidate vertex point for a first set of pixels; determining a set of one or more selected candidate features, from a set of candidate features, that maximizes an objective function measuring the accuracy of the identified machine placement candidate vertex point compared to the respective ground truth for a respective set of pixels; updating a set of one or more basis features by adding the set of one or more selected candidate features that maximizes the objective function; and training a machine learning model based on the updated set of one or more basis features to identify additional vertex points for a transition.
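The candidate-feature selection loop is essentially greedy forward selection against the objective function; the feature names, scores, and additive toy objective below are assumptions for illustration:

```python
# Sketch: repeatedly add to the basis set the candidate feature that most
# improves the objective, stopping when no candidate helps.
def greedy_select(candidates, objective, rounds=2):
    """Forward selection over candidate features against an objective."""
    basis = []
    for _ in range(rounds):
        remaining = [f for f in candidates if f not in basis]
        if not remaining:
            break
        best = max(remaining, key=lambda f: objective(basis + [f]))
        if objective(basis + [best]) <= objective(basis):
            break                         # no candidate improves the objective
        basis.append(best)
    return basis

# toy objective: vertex-placement accuracy as a function of the feature set
scores = {"curvature": 0.3, "gradient": 0.5, "color": 0.1}
objective = lambda feats: sum(scores[f] for f in feats)
basis = greedy_select(list(scores), objective)
```

The resulting basis set is what the machine learning model is then trained on to place additional vertex points.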

Digital makeup palette

An augmented reality system for makeup includes a makeup objective unit including computation circuitry operably coupled to a graphical user interface configured to generate one or more instances of user-selectable makeup objectives and to receive user-selected makeup objective information; a makeup palette unit operably coupled to the makeup objective unit, the makeup palette unit including computation circuitry configured to generate at least one digital makeup palette for a digital makeup product; and a makeup objective visualization unit including computation circuitry configured to analyze a user's face to determine one or more of face shape, facial landmarks, skin tone, hair color, eye color, lip shape, eyelid shape, hair style, and lighting, and to automatically create one or more instances of a custom virtual try-on for a user in accordance with the user-selected makeup objective information and the at least one digital makeup palette generated based on the analysis of the user's face.