G06T9/002

NON-LINEAR QUANTIZATION WITH SUBSTITUTION IN NEURAL IMAGE COMPRESSION
20220405978 · 2022-12-22

Method, apparatus, and non-transitory storage medium for end-to-end neural image compression using non-linear quantization with substitution, including receiving one or more input images, generating, from each input image, an associated substitute image using a neural-network-based substitute feature generator, compressing the substitute image, quantizing the compressed substitute image with a non-linear quantizer to obtain a quantized representation of the input image with higher compression performance, and entropy encoding the quantized substitute image using a neural-network-based encoder to generate a compressed representation of the quantized representation.
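
As a rough illustration of what a non-linear quantizer does, the sketch below applies mu-law companding before uniform rounding, so that reconstruction levels cluster near zero. The abstract does not say which non-linearity is used; mu-law, the constant MU, the level count, and the function names are assumptions chosen for illustration only, and the neural substitute generator and entropy coder are omitted.

```python
import math

MU = 255.0  # companding strength; an assumed value, not taken from the abstract


def compress(x: float) -> float:
    """Mu-law companding: maps [-1, 1] onto [-1, 1], stretching values near zero."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y: float) -> float:
    """Inverse of compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantize(x: float, levels: int = 17) -> int:
    """Map x in [-1, 1] to an integer index in [0, levels - 1]."""
    x = max(-1.0, min(1.0, x))  # clamp out-of-range inputs
    y = compress(x)
    return round((y + 1.0) / 2.0 * (levels - 1))


def dequantize(index: int, levels: int = 17) -> float:
    """Map an index back to its reconstruction value in [-1, 1]."""
    y = index / (levels - 1) * 2.0 - 1.0
    return expand(y)
```

With 17 levels, an input of 0.01 is reconstructed as roughly 0.0118, whereas a uniform 17-level quantizer over the same range would map it to 0.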

Systems and methods for processing audiovisual data using latent codes from generative networks and models

Systems and methods for viewing, storing, transmitting, searching, and editing application-specific audiovisual content (or other unstructured data) are disclosed in which edge devices generate content on the fly from a partial set of instructions rather than merely accessing the content in its final or near-final form. An image processing architecture may include a generative model that may be a deep learning model. The generative model may include a latent space comprising a plurality of latent codes and a trained generator mapping. The trained generator mapping may convert points in the latent space to uncompressed data points, which in the case of audiovisual content may be generated image frames. The generative model may be capable of closely approximating (up to noise or perceptual error) most or all potential data points in the relevant compression application, which in the case of audiovisual content may be source images.

ITERATIVE TRAINING OF NEURAL NETWORKS FOR INTRA PREDICTION
20220398455 · 2022-12-15

Iterative training of neural networks for video coding and decoding using intra prediction is provided that finds a tradeoff between extreme genericity and extreme specialization to a codec for the trained neural networks. At the first iteration, the set of neural networks is trained following a partitioning approach. Then, for several iterations, the set of neural networks is inserted into the codec, and pairs of a block and its context are extracted from the partitioning of images via the codec with a single additional neural network-based mode; the neural networks are then retrained on these pairs. This way, from the second iteration onward, the neural networks learn an intra prediction that diverges from that in the codec while still being valuable for the codec in terms of rate-distortion performance.

IMAGE PROCESSING METHOD AND DEVICE, NEUTRAL NETWORK AND TRAINING METHOD THEREOF, STORAGE MEDIUM
20220398783 · 2022-12-15

Disclosed are an image processing method, an image processing device, a neural network and a training method thereof, and a storage medium. The image processing method includes: obtaining an input image; performing a segmentation process on the input image via a first encoding-decoding network, to obtain a first output feature map and a first segmented image; concatenating the first output feature map with at least one selected from the group consisting of the input image and the first segmented image, to obtain an input of a second encoding-decoding network; and performing a segmentation process on the input of the second encoding-decoding network via the second encoding-decoding network, to obtain a second segmented image. The first encoding-decoding network and the second encoding-decoding network together form a neural network.

Iterative media object compression algorithm optimization using decoupled calibration of perceptual quality algorithms

One or more multi-stage optimization iterations are performed with respect to a compression algorithm. A given iteration comprises a first stage in which hyper-parameters of a perceptual quality algorithm are tuned independently of the compression algorithm. A second stage of the iteration comprises tuning hyper-parameters of the compression algorithm using a set of perceptual quality scores generated by the tuned perceptual quality algorithm. The final stage of the iteration comprises performing a compression quality evaluation test on the tuned compression algorithm.

Unified referring video object segmentation network
11526698 · 2022-12-13

Systems and methods for video object segmentation are described. Embodiments of systems and methods may receive a referral expression and a video comprising a plurality of image frames, generate a first image mask based on the referral expression and a first image frame of the plurality of image frames, generate a second image mask based on the referral expression, the first image frame, the first image mask, and a second image frame of the plurality of image frames, and generate annotation information for the video including the first image mask overlaid on the first image frame and the second image mask overlaid on the second image frame.

NEURAL FRAME EXTRAPOLATION RENDERING MECHANISM
20220392116 · 2022-12-08

A mechanism is described for image frame rendering. An apparatus of embodiments, as described herein, includes one or more processors to receive a plurality of past image frames including a plurality of pixels, receive a predicted optical flow, generate a predicted frame and a confidence map associated with the predicted frame based on the plurality of past image frames and the predicted optical flow, render a first set of the plurality of pixels in the predicted frame based on the confidence map, and add the rendered pixels to the predicted frame to generate a final frame.
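
The final compositing step can be sketched as follows, under the assumption that the pixels selected for rendering are those whose confidence falls below a threshold. The abstract does not state the selection rule; the threshold, the `render_pixel` callback, and the list-of-lists frame layout are all hypothetical, and the frame prediction itself is not shown.

```python
from typing import Callable, List

Frame = List[List[float]]


def compose_final_frame(
    predicted: Frame,
    confidence: Frame,
    render_pixel: Callable[[int, int], float],
    threshold: float = 0.5,  # assumed cut-off, not specified in the abstract
) -> Frame:
    """Keep confident predicted pixels; render the rest and merge them in."""
    final = [row[:] for row in predicted]  # copy so the input frame is not mutated
    for y, conf_row in enumerate(confidence):
        for x, conf in enumerate(conf_row):
            if conf < threshold:
                # Low confidence: replace the extrapolated value with a rendered one.
                final[y][x] = render_pixel(x, y)
    return final
```

Only the low-confidence pixels invoke the renderer, which is where an approach like this would save work relative to rendering the full frame.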

On Padding Methods For Neural Network-Based In-Loop Filter
20220394309 · 2022-12-08

A method implemented by a video coding apparatus. The method includes determining, in real time, padding dimensions for padding samples to be applied to a video unit of a video for in-loop filtering, wherein d_1, d_2, d_3, and d_4 represent the padding dimensions corresponding to top, bottom, left, and right boundaries of the video unit, respectively; and performing a conversion between a video unit and a bitstream of the video based on the padding dimensions that were determined. A corresponding video coding apparatus and non-transitory computer-readable recording medium are also disclosed.
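
To make the four padding dimensions concrete, the sketch below pads a 2-D block of samples by a given number of rows on top and bottom and columns on left and right. The abstract does not say how the padding samples are filled; repeating the nearest edge sample is an assumption, as are the function name and the list-of-lists block layout.

```python
from typing import List

Block = List[List[int]]


def pad_video_unit(unit: Block, d1: int, d2: int, d3: int, d4: int) -> Block:
    """Pad a 2-D sample block by d1 rows on top, d2 rows on the bottom,
    d3 columns on the left, and d4 columns on the right.

    Padding samples repeat the nearest edge sample (an assumed fill rule).
    """
    if not unit or not unit[0]:
        raise ValueError("unit must be a non-empty 2-D block")
    if min(d1, d2, d3, d4) < 0:
        raise ValueError("padding dimensions must be non-negative")
    # Horizontal padding: repeat the first and last sample of every row.
    rows = [[row[0]] * d3 + row + [row[-1]] * d4 for row in unit]
    # Vertical padding: repeat the first and last (already widened) rows.
    top = [rows[0][:] for _ in range(d1)]
    bottom = [rows[-1][:] for _ in range(d2)]
    return top + rows + bottom
```

For example, padding a 2x2 unit with dimensions (1, 2, 1, 0) yields a block of 5 rows and 3 columns.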

STABLE POSE ESTIMATION WITH ANALYSIS BY SYNTHESIS
20220392099 · 2022-12-08

One embodiment of the present invention sets forth a technique for generating a pose estimation model. The technique includes generating one or more trained components included in the pose estimation model based on a first set of training images and a first set of labeled poses associated with the first set of training images, wherein each labeled pose includes a first set of positions on a left side of an object and a second set of positions on a right side of the object. The technique also includes training the pose estimation model based on a set of reconstructions of a second set of training images, wherein the set of reconstructions is generated by the pose estimation model from a set of predicted poses outputted by the one or more trained components.

DATA COMPRESSION AND DECOMPRESSION SYSTEM AND METHOD THEREOF
20220392117 · 2022-12-08

Both a high compression ratio and a high processing speed are achievable. In a data compression and decompression system that includes a parallel processing device performing a plurality of processes in parallel, the parallel processing device divides original data into a plurality of data pieces of a predetermined unit size. The parallel processing device performs coding processes on the plurality of data pieces in parallel using a predetermined model to create a plurality of coded data. The parallel processing device creates compressed data of the original data from the plurality of coded data.
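
The divide, code in parallel, and combine flow can be sketched with the Python standard library. This is a minimal illustration only: zlib stands in for the abstract's unspecified "predetermined model", a thread pool stands in for the parallel processing device, and the chunk size and length-prefixed container layout are assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List

CHUNK_SIZE = 64 * 1024  # the "predetermined unit"; this value is an assumption


def split(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Divide the original data into fixed-size pieces (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_parallel(data: bytes, size: int = CHUNK_SIZE) -> bytes:
    """Code every piece in parallel, then pack the coded pieces into one blob."""
    chunks = split(data, size)
    with ThreadPoolExecutor() as pool:
        coded = list(pool.map(zlib.compress, chunks))  # map preserves input order
    # Container: piece count, then each coded piece with a 4-byte length prefix.
    out = [len(coded).to_bytes(4, "big")]
    for piece in coded:
        out.append(len(piece).to_bytes(4, "big"))
        out.append(piece)
    return b"".join(out)


def decompress_parallel(blob: bytes) -> bytes:
    """Unpack the container and decode every piece in parallel."""
    count = int.from_bytes(blob[:4], "big")
    pos, coded = 4, []
    for _ in range(count):
        n = int.from_bytes(blob[pos:pos + 4], "big")
        pos += 4
        coded.append(blob[pos:pos + n])
        pos += n
    with ThreadPoolExecutor() as pool:
        return b"".join(pool.map(zlib.decompress, coded))
```

Smaller pieces expose more parallelism but give each coder less context, which is the ratio-versus-speed tension the abstract refers to.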