H04N19/48

Using generative adversarial networks in compression

The compression system trains a machine-learned encoder and decoder through an autoencoder architecture. The encoder can be deployed by a sender system to encode content for transmission to a receiver system, and the decoder can be deployed by the receiver system to decode the encoded content and reconstruct the original content. The encoder is coupled to receive content and output a tensor as a compact representation of the content. The content may be, for example, images, videos, or text. The decoder is coupled to receive a tensor representing content and output a reconstructed version of the content. The compression system trains the autoencoder with a discriminator to reduce compression artifacts in the reconstructed content. The discriminator is coupled to receive one or more items of input content and output a discrimination prediction of whether the input content is the original or the reconstructed version of the content.
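
The adversarial training objective described above can be sketched as two loss functions: one for the autoencoder (reconstruction plus a term rewarding reconstructions the discriminator accepts as original) and one for the discriminator. This is a minimal sketch of the generic GAN loss structure, not the patented training procedure; the function names and the weighting `lam` are illustrative.

```python
import numpy as np

def autoencoder_loss(x, x_hat, d_pred_fake, lam=0.1):
    """Encoder/decoder objective: reconstruction error plus an adversarial
    term that rewards reconstructions scored as 'original' (d_pred_fake -> 1).
    The weight `lam` is a hypothetical hyperparameter."""
    recon = np.mean((x - x_hat) ** 2)
    adversarial = -np.mean(np.log(d_pred_fake + 1e-12))
    return recon + lam * adversarial

def discriminator_loss(d_pred_real, d_pred_fake):
    """Binary cross-entropy: score original content as 1, reconstructions as 0."""
    eps = 1e-12
    return (-np.mean(np.log(d_pred_real + eps))
            - np.mean(np.log(1.0 - d_pred_fake + eps)))
```

A perfect reconstruction that fully fools the discriminator drives both terms of the autoencoder loss toward zero, while a discriminator that cannot tell the two apart (predictions near 0.5) pays a high discrimination loss.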

INVERSE TRANSFORMATION USING PRUNING FOR VIDEO CODING
20230244747 · 2023-08-03

A method for decoding an encoded video bit stream in a video decoder is provided that includes determining a scan pattern type for a transform block to be decoded, decoding a column position X and a row position Y of a last non-zero coefficient in the transform block from the encoded video bit stream, selecting a column-row inverse transform order when the scan pattern type is a first type, selecting a row-column inverse transform order when the scan pattern type is a second type, and performing one dimensional (1D) inverse discrete cosine transformation (IDCT) computations according to the selected transform order to inversely transform the transform block to generate a residual block.
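
The pruning in this abstract relies on the last-significant-coefficient position (X, Y): lines of the block known to be all zero can skip their 1D IDCT pass. A minimal sketch, assuming an orthonormal DCT-II basis and that all coefficients beyond column X and row Y are zero; function names and the boolean order flag are illustrative, not the claimed syntax.

```python
import numpy as np

def idct_matrix(n):
    """Orthonormal DCT-II basis; rows index frequency, so x = C.T @ X inverts."""
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * i / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def pruned_idct2(coeffs, last_x, last_y, column_row_order):
    """2D inverse transform as two 1D passes, skipping all-zero lines.

    Assumes (for the sketch) that coefficients are zero beyond column
    `last_x` and row `last_y`, so those lines need no first-pass IDCT.
    """
    n = coeffs.shape[0]
    C = idct_matrix(n)
    out = coeffs.astype(float).copy()
    if column_row_order:
        for c in range(last_x + 1):      # columns beyond last_x are all zero
            out[:, c] = C.T @ out[:, c]
        for r in range(n):               # every row may now be non-zero
            out[r, :] = C.T @ out[r, :]
    else:
        for r in range(last_y + 1):      # rows beyond last_y are all zero
            out[r, :] = C.T @ out[r, :]
        for c in range(n):
            out[:, c] = C.T @ out[:, c]
    return out
```

Either order yields the same residual block; the scan-pattern-dependent choice in the method controls which dimension gets the pruned first pass.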

Methods and Apparatuses for Coding Video Data with Adaptive Secondary Transform Signaling Depending on TB Level Syntax
20220159300 · 2022-05-19

Video processing methods and apparatuses are provided for a video encoding or decoding system with conditional secondary transform signaling. The video encoding system determines and applies a transform operation to residuals of a transform block to generate final transform coefficients, and adaptively signals a secondary transform index according to a position of a last significant coefficient in the transform block. A value of the secondary transform index is determined according to the transform operation. The video decoding system parses last significant coefficient position syntax of each transform block in the current block from a video bitstream to determine a position of a last significant coefficient for each transform block, and, when the positions of the last significant coefficients indicate so, infers that an inverse secondary transform is not applied to the current block; otherwise, the video decoding system determines an inverse transform operation by parsing a secondary transform index from the video bitstream.
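
The decoder-side branch described above can be sketched as a single inference rule: skip parsing the secondary transform index when every transform block's last significant coefficient position rules the secondary transform out. The function name, the position threshold, and the `parse_index` callback are hypothetical stand-ins for the actual bitstream syntax.

```python
def decode_st_index(last_sig_positions, parse_index, pos_threshold=0):
    """Hypothetical decoder rule: when every transform block's last
    significant coefficient sits at or below `pos_threshold` in scan order,
    infer that no inverse secondary transform applies (index 0) and skip
    parsing; otherwise read the index from the bitstream via `parse_index`."""
    if all(pos <= pos_threshold for pos in last_sig_positions):
        return 0
    return parse_index()
```

Inferring rather than signaling in the common case is what saves bits: the index only costs bitstream space when the coefficient positions leave the question open.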

Method and apparatus for video coding
11190794 · 2021-11-30

Aspects of the disclosure provide methods and apparatuses for video encoding/decoding. In some examples, an apparatus for video decoding includes processing circuitry that can decode coded information of a transform block (TB) from a coded video bitstream. The coded information can indicate a region of the TB on which a secondary transform is applied. The region can include a first sub-region having transform coefficients calculated by the secondary transform and a second sub-region. The processing circuitry can determine, for a transform coefficient in the TB, whether a neighboring transform coefficient used to determine the transform coefficient is located in the second sub-region. When the neighboring transform coefficient is determined to be located in the second sub-region, the processing circuitry can determine the transform coefficient according to a default value for the neighboring transform coefficient and reconstruct a sample in the TB based on the transform coefficient for the sample.
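
The default-value substitution this abstract describes can be sketched as a bounds check on the neighbor lookup: positions in the second (zero-out) sub-region carry no decoded values, so a default is returned instead. A minimal sketch; the sub-region bounds, function name, and default of 0 are illustrative assumptions.

```python
import numpy as np

def neighbor_coeff(tb, r, c, first_rows, first_cols, default=0):
    """Fetch the neighboring transform coefficient at (r, c), substituting
    `default` when the position falls in the second sub-region, whose
    values the bitstream never carries (bounds here are illustrative)."""
    if 0 <= r < first_rows and 0 <= c < first_cols:
        return int(tb[r, c])
    return default
```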

Iterative IDCT with adaptive non-linear filtering
11234022 · 2022-01-25

A method includes obtaining respective filtered pixels for pixels of a reconstructed image; and obtaining an edge-preserved image using the respective filtered pixels. Obtaining the respective filtered pixels includes, for each pixel of the reconstructed image, obtaining a respective filtered pixel by selecting a pixel patch including the pixel and first neighboring pixels of the pixel that are at relative neighboring locations with respect to the pixel; calculating respective weights for the first neighboring pixels; and filtering the pixel using the first neighboring pixels and their respective weights to obtain the respective filtered pixel. Calculating the respective weights includes, for each neighboring pixel of the first neighboring pixels, forming a neighboring patch including the neighboring pixel and second neighboring pixels, and calculating a neighboring patch distance between the pixel patch and the neighboring patch; and calculating a respective weight using the neighboring patch distance.
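
The patch-distance weighting described above resembles a non-local-means step. This sketch filters a single pixel under that interpretation; the window sizes and the smoothing strength `h` are assumptions, and edges are handled by clamping rather than whatever boundary rule the method actually uses.

```python
import numpy as np

def filter_pixel(img, y, x, half=1, patch_half=1, h=10.0):
    """Filter one pixel by weighting each neighbor with the distance between
    the pixel's patch and that neighbor's patch (hypothetical parameters)."""
    rows, cols = img.shape

    def patch(cy, cx):
        ys = np.clip(np.arange(cy - patch_half, cy + patch_half + 1), 0, rows - 1)
        xs = np.clip(np.arange(cx - patch_half, cx + patch_half + 1), 0, cols - 1)
        return img[np.ix_(ys, xs)].astype(float)

    ref = patch(y, x)                       # the pixel patch
    num = den = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            ny = min(max(y + dy, 0), rows - 1)
            nx = min(max(x + dx, 0), cols - 1)
            d2 = np.mean((ref - patch(ny, nx)) ** 2)   # neighboring patch distance
            w = np.exp(-d2 / (h * h))                  # weight from the distance
            num += w * img[ny, nx]
            den += w
    return num / den
```

Similar patches get weights near 1 and dissimilar ones near 0, which is what preserves edges while smoothing flat regions.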

BLOCK-BASED PICTURE FUSION FOR CONTEXTUAL SEGMENTATION AND PROCESSING
20210360249 · 2021-11-18

An encoder includes circuitry configured to receive a video frame, partition the video frame into blocks, determine a first area within the video frame including a first grouping of a first subset of the blocks, determine a first average measure of information of the first area, and encode the video frame, the encoding including controlling a quantization parameter based on the first average measure of information of the first area. Related apparatus, systems, techniques and articles are also described.
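
One plausible "average measure of information" for an area is the Shannon entropy of its pixel histogram, with the quantization parameter then adjusted around a base value. This sketch assumes that reading; the entropy choice, base QP, and linear mapping are illustrative, not taken from the claims.

```python
import numpy as np

def block_entropy(block):
    """Shannon entropy (bits/pixel) of the block's value histogram,
    one possible measure of information."""
    hist = np.bincount(block.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def qp_for_area(blocks, base_qp=32, scale=2.0):
    """Hypothetical mapping: lower the QP (finer quantization) for areas
    whose average entropy is high, raise it for flat, low-information areas."""
    avg = np.mean([block_entropy(b) for b in blocks])
    return int(round(base_qp - scale * (avg - 4.0)))
```

A flat area (entropy 0) gets a coarser QP than the base, while a detailed area earns a finer one, matching the rate-control intent of the abstract.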

Adaptive DCT sharpener
11178430 · 2021-11-16

Methods are provided for sharpening or otherwise modifying compressed images without decompressing and re-encoding the images. An overall image quality is determined based on the source of the compressed image, the quantization table of the compressed image, or some other factor(s), and a set of scaling factors corresponding to the image quality is selected. The selected scaling factors are then applied to corresponding quantization factors of the image's quantization table or other parameters of the compressed image that describe the image contents of the compressed image. The scaling factors of a given set of scaling factors can be determined by a machine learning process that involves training the scaling factors based on training images determined by decompressing and then sharpening or otherwise modifying a source set of compressed images. These methods can provide improvements with respect to encoded image size and computational cost of the image modification method.
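
The core operation, scaling the quantization table so dequantization amplifies chosen frequency bands without touching the stored coefficients, can be sketched in a few lines. The scaling factors here are illustrative placeholders, not the machine-learned values the method describes, and the clamp range assumes 8-bit JPEG-style tables.

```python
import numpy as np

def sharpen_quant_table(qtable, scaling_factors):
    """Apply per-frequency scaling factors to a quantization table; the
    compressed coefficients stay as-is, but dequantizing with the scaled
    table boosts (or attenuates) the corresponding frequencies without
    decompressing and re-encoding the image."""
    scaled = np.rint(qtable.astype(float) * scaling_factors)
    return np.clip(scaled, 1, 255).astype(np.uint8)
```

Because only the 64-entry table is rewritten, the cost is constant per image, which is the computational advantage the abstract claims over decode-sharpen-re-encode pipelines.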