G06T9/002

Machine learning based video compression

Systems and methods for compressing target content are disclosed. In one embodiment, a system may include non-transient electronic storage and one or more physical computer processors configured by machine-readable instructions to: obtain the target content comprising one or more frames, wherein a given frame comprises one or more features; obtain a conditioned network; and generate decoded target content by applying the conditioned network to the target content.
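The claimed flow (obtain frames, obtain a conditioned network, apply it to produce decoded content) can be sketched minimally. The linear "network" and the frame data below are illustrative stand-ins, not the patent's actual model:

```python
# Minimal sketch: target content as frames of feature values, a "conditioned
# network" as a fixed affine map (standing in for trained weights), and
# decoded content produced by applying the network to every frame.

def obtain_target_content():
    # Each frame is a list of feature values (toy data).
    return [[0.2, 0.5, 0.9], [0.1, 0.4, 0.8]]

def obtain_conditioned_network():
    # Conditioning is represented here by fixed scale/bias parameters.
    scale, bias = 0.5, 0.1
    def network(frame):
        return [scale * x + bias for x in frame]
    return network

def generate_decoded_content(network, frames):
    return [network(frame) for frame in frames]

frames = obtain_target_content()
net = obtain_conditioned_network()
decoded = generate_decoded_content(net, frames)
```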

Method and data processing system for lossy image or video encoding, transmission and decoding

A method for lossy image or video encoding, transmission and decoding, the method comprising the steps of: receiving an input image at a first computer system; encoding the input image using a first trained neural network to produce a latent representation; performing a quantization process on the latent representation to produce a quantized latent; entropy encoding the quantized latent using a probability distribution, wherein the probability distribution is defined using a tensor network; transmitting the entropy encoded quantized latent to a second computer system; entropy decoding the entropy encoded quantized latent using the probability distribution to retrieve the quantized latent; and decoding the quantized latent using a second trained neural network to produce an output image, wherein the output image is an approximation of the input image.
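The pipeline of this claim (NN encode, quantize, entropy code with a probability model, transmit, entropy decode, NN decode) can be sketched end to end. The "neural networks" below are toy linear maps, the entropy coder is a simple prefix-code stand-in, and a plain probability table replaces the patent's tensor-network model:

```python
# Toy end-to-end sketch of the claimed encode/transmit/decode pipeline.

def encoder_network(image):          # first trained network (toy): scale down
    return [x / 4.0 for x in image]

def decoder_network(latent):         # second trained network (toy): scale up
    return [x * 4.0 for x in latent]

def quantize(latent):
    return [round(x) for x in latent]

# Probability model over quantized symbols (stand-in for the tensor network),
# with a prefix code whose lengths match the probabilities.
prob = {0: 0.5, 1: 0.25, 2: 0.25}
code = {0: "0", 1: "10", 2: "11"}
decode_table = {v: k for k, v in code.items()}

def entropy_encode(symbols):
    return "".join(code[s] for s in symbols)

def entropy_decode(bits):
    symbols, buf = [], ""
    for b in bits:
        buf += b
        if buf in decode_table:
            symbols.append(decode_table[buf])
            buf = ""
    return symbols

image = [4.0, 8.0, 4.0, 0.0]
latent = encoder_network(image)
q = quantize(latent)
bitstream = entropy_encode(q)        # "transmitted" to the second system
output = decoder_network(entropy_decode(bitstream))
```

With lossless toy data the round trip is exact; in the real lossy setting the quantization step makes the output only an approximation of the input.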

Generating modified digital images utilizing a global and spatial autoencoder

The present disclosure relates to systems, methods, and non-transitory computer readable media for generating a modified digital image from extracted spatial and global codes. For example, the disclosed systems can utilize a global and spatial autoencoder to extract spatial codes and global codes from digital images. The disclosed systems can further utilize the global and spatial autoencoder to generate a modified digital image by combining extracted spatial and global codes in various ways for various applications such as style swapping, style blending, and attribute editing.
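The style-swapping application described above can be sketched as follows. The "encoders" and "decoder" are toy functions standing in for the trained global and spatial autoencoder, with the spatial code as per-pixel deviation and the global code as mean intensity:

```python
# Sketch of style swapping: take the spatial code (layout) from one image and
# the global code (appearance) from another, then decode the combination.

def extract_spatial_code(image):
    # Per-pixel structure: deviation of each pixel from the image mean.
    mean = sum(image) / len(image)
    return [x - mean for x in image]

def extract_global_code(image):
    # Image-wide appearance: the mean intensity, in this toy version.
    return sum(image) / len(image)

def decode(spatial_code, global_code):
    return [d + global_code for d in spatial_code]

content = [0.1, 0.9, 0.5, 0.5]   # supplies layout (spatial code)
style = [0.7, 0.7, 0.7, 0.7]     # supplies appearance (global code)

swapped = decode(extract_spatial_code(content), extract_global_code(style))
```

Style blending would interpolate between two global codes before decoding instead of taking one wholesale.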

Encoder and decoder for encoding and decoding images

There are disclosed techniques for encoding and/or decoding multiple image information, e.g. through recurrent neural network based stereo compression. In an image encoder (1), a primary block (100) encodes a primary image information, and a secondary block (300) encodes a secondary image information. States for the primary block are transformed onto states for the secondary block at a transformation block (200), which takes into account correspondence information (e.g. disparity information) between the primary image information and the secondary image information. In an image decoder (1b), a primary block (100) decodes an encoded version of a primary image information, and a secondary block (300) decodes an encoded version of a secondary image information. States for the primary block are transformed onto states for the secondary block at a transformation block (200), which takes into account the correspondence information (e.g. disparity information) between the primary image information and the secondary image information.
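The transformation block (200) can be sketched as a disparity-guided warp of the primary coder's states onto the secondary view. States are 1-D lists and disparity is an integer shift per position here; real systems operate on dense recurrent state tensors:

```python
# Sketch of the state transformation: each secondary-view position pulls the
# recurrent state from its corresponding primary-view position, as given by
# the per-position disparity (correspondence information).

def transform_states(primary_state, disparity):
    n = len(primary_state)
    secondary_state = [0.0] * n
    for i in range(n):
        src = i + disparity[i]        # corresponding position in primary view
        if 0 <= src < n:              # out-of-range correspondences stay zero
            secondary_state[i] = primary_state[src]
    return secondary_state

primary_state = [1.0, 2.0, 3.0, 4.0]
disparity = [1, 1, 1, -3]             # toy correspondence between the views

secondary_state = transform_states(primary_state, disparity)
```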

Shared Training of Neural Networks for Data Reduction and Object Detection of Image Data
20220415033 · 2022-12-29

A method for configuring an object detection system includes providing annotated training data comprising image data with defined assignments to at least one object, and training a neural network having a first neural sub-network that compresses the image data. The first neural sub-network is connected to at least one further neural sub-network, which is configured to detect an object from the compressed training data. The first neural sub-network is parameterized in such a manner that the object is detected by the at least one further sub-network with a defined quality. The neural sub-networks are trained jointly.
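The joint training described above can be reduced to a toy example: a first sub-network (one weight, w1) compresses the input, a further sub-network (one weight, w2) detects from the compressed representation, and both weights receive gradient updates from the same detection loss. Real networks and annotated image data are replaced by scalars:

```python
# Joint training sketch: the detection loss drives gradients through BOTH the
# detection sub-network (w2) and the compression sub-network (w1), so the
# compressed representation is shaped for detection quality.

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (image feature, object label)
w1, w2 = 0.5, 0.5                            # compression / detection weights
lr = 0.01

for _ in range(500):
    for x, label in data:
        z = w1 * x            # first sub-network: compressed representation
        pred = w2 * z         # further sub-network: detection from z
        err = pred - label
        g1 = 2 * err * w2 * x # gradient into the compression sub-network
        g2 = 2 * err * z      # gradient into the detection sub-network
        w1 -= lr * g1
        w2 -= lr * g2

final_loss = sum((w2 * w1 * x - y) ** 2 for x, y in data)
```

After training, the product w1*w2 approaches the target mapping (2.0 here), illustrating how the compression stage is parameterized so detection reaches a defined quality.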

Low-Power Fast-Response Machine Learning Variable Image Compression

Computing devices, such as mobile computing devices, have access to one or more image sensors that can capture images and video with multiple subjects. Some of these subjects may vary in priority for various tasks, so it may be desired to increase or decrease the compression applied to each subject in order to store the image data more efficiently. Low-power, fast-response machine learning logic can be configured to generate a plurality of inference data. Inference data can be associated with the type, motion, and/or priority of the subjects as desired. This inference data can be utilized along with other subject data to generate one or more variable compression regions within the image data. The image data can be subsequently processed to compress different areas of the image based on a desired application. The variably compressed image can reduce file sizes and allow for more efficient storage and processing.
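The variable-compression idea can be sketched as priority-driven quantization: inference data assigns a priority to each region, and low-priority regions are quantized more coarsely than high-priority ones. The region layout, priorities, and quantization steps below are illustrative:

```python
# Sketch: inference data maps regions to priorities; compression strength
# (quantization step) is chosen per region from its priority.

def compress_region(pixels, priority):
    # Higher priority -> finer quantization step -> better fidelity.
    step = 1 if priority == "high" else 32
    return [round(p / step) * step for p in pixels]

image_regions = {
    "face":       [200, 201, 199, 198],   # subject of interest
    "background": [90, 95, 100, 105],
}
inference = {"face": "high", "background": "low"}  # from the ML logic

compressed = {
    name: compress_region(pixels, inference[name])
    for name, pixels in image_regions.items()
}
```

The high-priority region survives intact while the background collapses to a few levels, which is where the file-size reduction comes from.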

Machine learning model development

A method of machine learning model development includes building an autoencoder including an encoder trained to map an input into a latent representation, and a decoder trained to map the latent representation to a reconstruction of the input. The method includes building an artificial neural network classifier including the encoder, and a classification layer partially trained to perform a classification in which the class to which the input belongs is predicted based on the latent representation. Neural network inversion is applied to the classification layer to find inverted latent representations within a decision boundary between classes, where the result of the classification is ambiguous, and inverted inputs are obtained from the inverted latent representations. Each inverted input is labeled with the class that is its ground truth, thereby producing added training data for the classification, and the classification layer is further trained using the added training data.
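The inversion step can be illustrated on a 1-D latent: a linear classification layer is inverted to find the latent where its output is ambiguous (probability 0.5), that latent is decoded back to an input, and the input is labeled with its ground truth to form an added training example. All components below are toy stand-ins for the trained networks:

```python
import math

# Sketch of decision-boundary inversion for generating added training data.

w, b = 2.0, -1.0                      # classification layer on the latent

def classify(z):
    # Sigmoid output: 0.5 exactly on the decision boundary.
    return 1.0 / (1.0 + math.exp(-(w * z + b)))

def invert_classifier():
    # Ambiguity: sigmoid output 0.5  <=>  w*z + b = 0, solvable in closed form
    # for this toy layer (real inversion is an optimization).
    return -b / w

def decoder(z):                       # latent -> reconstructed input (toy)
    return 3.0 * z

def ground_truth(x):                  # oracle labeling of the inverted input
    return 1 if x >= 1.0 else 0

z_amb = invert_classifier()
x_inv = decoder(z_amb)
added_example = (x_inv, ground_truth(x_inv))
```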

COMPUTER-IMPLEMENTED METHODS AND SYSTEMS FOR PROVISION OF A CORRECTION ALGORITHM FOR AN X-RAY IMAGE AND FOR CORRECTION OF AN X-RAY IMAGE, X-RAY FACILITY, COMPUTER PROGRAM, AND ELECTRONICALLY READABLE DATA MEDIUM
20220405897 · 2022-12-22

A computer-implemented method for provision of a correction algorithm for an x-ray image that was recorded with an x-ray source emitting an x-ray radiation field, a filter facility spatially modulating an x-ray radiation dose, and an x-ray detector is provided. The correction algorithm includes a trained first processing function that, from first input data that includes at least one first physical parameter describing the x-ray radiation field and/or the measurement and at least one second physical parameter describing the spatial modulation of the filter facility, determines first output data. The first output data includes a mask for brightness compensation with regard to the spatial modulation of the filter facility in the x-ray image. The method includes providing first training data, providing an autoencoder for masks, and training of the autoencoder using the first training data. The method also includes determining an assignment rule, and providing the trained first processing function.
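The provision method can be sketched with the autoencoder for masks reduced to a hand-set decoder over 1-D masks, and the assignment rule reduced to a fixed mapping from the physical parameters (a dose parameter and a filter-modulation parameter) to a latent code whose decoding yields the brightness-compensation mask. All values are illustrative:

```python
# Sketch: physical parameters -> (assignment rule) -> latent code ->
# (mask decoder) -> brightness-compensation mask.

def decode_mask(latent):
    # Latent = (base gain, edge factor); decoder builds a 5-pixel mask that
    # compensates more strongly at the filtered edges of the field.
    base, edge = latent
    return [base * edge, base, base, base, base * edge]

def assignment_rule(dose_param, filter_param):
    # Stand-in for the trained first processing function: stronger filter
    # modulation -> stronger edge compensation; higher dose -> lower gain.
    base = 1.0 / dose_param
    edge = 1.0 + filter_param
    return (base, edge)

mask = decode_mask(assignment_rule(2.0, 0.5))
```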

Method, Device And Non-Transitory Computer-Readable Storage Medium For Processing A Sequence Of Top View Image Frames

The invention relates to a method for processing a sequence of low-resolution top view image frames {I_t^LR}, t = 0..T, of a same terrestrial location, each top view image frame having pixels and pixel values, comprising the following steps: choosing (S0) one top view image frame I_0^LR, called the reference frame, among the top view image frames; estimating (S1) motion fields F_{t→0} (ME) between each top view image frame I_t^LR, t = 1..T, and the reference frame I_0^LR by using a first neural network (NN1); encoding (S1′) the top view image frames to produce convolutional features {J_t^LR}, t = 1..T, extracted respectively from the top view image frames by using a second neural network (NN2); aggregating (S2, S3, shift-and-add, SPMC) pixels of the convolutional features {J_t^LR} to positions {J_t^HR} in a high-resolution grid (HRG) using the motion fields F_{t→0} to obtain aggregated features J^HR; and decoding (S4) the aggregated features J^HR by a decoder network (DN) to produce a super-resolved image Î_0^SR.
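The central aggregation step (shift-and-add onto a high-resolution grid) can be sketched in 1-D. Motion is given directly instead of estimated by NN1, and the feature extraction/decoding stages (NN2, DN) are omitted so the placement logic stands alone:

```python
# Shift-and-add sketch: low-resolution frames with known sub-pixel motion
# relative to the reference are placed onto a 2x high-resolution grid and
# averaged per position.

hr_signal = [10, 20, 30, 40, 50, 60, 70, 80]    # unknown target signal

# Two LR frames: frame 0 samples even HR positions; frame 1 is shifted by
# half an LR pixel and samples odd HR positions.
frames = [hr_signal[0::2], hr_signal[1::2]]
motions = [0, 1]   # offset of each frame's samples on the HR grid

def shift_and_add(frames, motions, hr_len):
    acc = [0.0] * hr_len
    cnt = [0] * hr_len
    for frame, off in zip(frames, motions):
        for i, v in enumerate(frame):
            pos = 2 * i + off          # position on the 2x HR grid
            acc[pos] += v
            cnt[pos] += 1
    return [a / c if c else 0.0 for a, c in zip(acc, cnt)]

super_resolved = shift_and_add(frames, motions, len(hr_signal))
```

With complementary sub-pixel shifts the LR frames jointly cover every HR position, which is exactly what makes multi-frame super-resolution possible.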

CONTENT-ADAPTIVE ONLINE TRAINING METHOD AND APPARATUS FOR DEBLOCKING IN BLOCK-WISE IMAGE COMPRESSION
20220405979 · 2022-12-22

Aspects of the disclosure provide a method, an apparatus, and a non-transitory computer-readable storage medium for video decoding. The apparatus includes processing circuitry that reconstructs blocks of an image from a coded video bitstream. The processing circuitry decodes first deblocking information in the coded video bitstream including a first deblocking parameter of a deep neural network (DNN) in a video decoder. The first deblocking parameter of the DNN is an updated parameter that has been previously determined by a content-adaptive training process. The processing circuitry determines the DNN for a first boundary region comprising a subset of samples in the reconstructed blocks based on the first deblocking parameter included in the first deblocking information. The processing circuitry deblocks the first boundary region based on the determined DNN corresponding to the first deblocking parameter.
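The decoder-side flow can be sketched as follows: an updated deblocking parameter is read from a (simulated) bitstream, the deblocking "DNN" is configured with it, and the boundary region between two reconstructed blocks is filtered. The DNN is reduced to a single-parameter blend for illustration:

```python
# Sketch: content-adapted deblocking parameter -> configured deblocker ->
# filtered boundary samples between two reconstructed blocks.

bitstream = {"deblock_strength": 0.5}   # first deblocking information (toy)

def make_deblocker(strength):
    # Stand-in for determining the DNN from the first deblocking parameter.
    def deblock(left_edge, right_edge):
        mid = (left_edge + right_edge) / 2.0
        new_left = (1 - strength) * left_edge + strength * mid
        new_right = (1 - strength) * right_edge + strength * mid
        return new_left, new_right
    return deblock

# Two reconstructed blocks with a visible discontinuity at their boundary.
block_a = [100, 100, 100, 100]
block_b = [140, 140, 140, 140]

deblock = make_deblocker(bitstream["deblock_strength"])
block_a[-1], block_b[0] = deblock(block_a[-1], block_b[0])
```

Because the strength is signaled per content rather than fixed, the encoder can tune it during the content-adaptive training process and the decoder simply applies whatever value arrives.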