H04N19/44

Data preprocessing and data augmentation in frequency domain

Methods and systems are provided for implementing preprocessing operations and augmentation operations upon image datasets transformed to frequency domain representations, including decoding images of an image dataset to generate a frequency domain representation of the image dataset; performing a resizing operation based on resizing factors on the image dataset in a frequency domain representation; performing a reshaping operation based on reshaping factors on the image dataset in a frequency domain representation; and performing a cropping operation on the image dataset in a frequency domain representation. The methods and systems may further include performing an augmentation operation on the image dataset in a frequency domain representation. Methods and systems of the present disclosure may free learning models from computational overhead caused by transforming image datasets into frequency domain representations. Furthermore, computational overhead caused by inverse transformation operations is also alleviated.
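As a rough illustration of the operations named in this abstract, the sketch below assumes the frequency domain representation is a grid of 8x8 blockwise DCT coefficients, as in JPEG. The function names, the block size, the low-frequency truncation used for resizing, and the reading of "reshaping" as flattening each block into a channel axis are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

BLOCK = 8  # assumed block size (JPEG-style); not stated in the abstract


def dct_matrix(n=BLOCK):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def to_frequency_domain(image):
    """Blockwise 2-D DCT; returns an array of shape (H/8, W/8, 8, 8)."""
    h, w = image.shape
    d = dct_matrix()
    blocks = image.reshape(h // BLOCK, BLOCK, w // BLOCK, BLOCK).swapaxes(1, 2)
    return d @ blocks @ d.T


def crop_blocks(coeffs, top, left, height, width):
    """Crop at block granularity: selecting blocks needs no inverse transform."""
    return coeffs[top:top + height, left:left + width]


def downscale_blocks(coeffs, keep):
    """Resize by keeping the keep x keep low-frequency corner of each block.

    The factor keep / BLOCK preserves mean intensity when the result is
    later inverted with a keep x keep orthonormal DCT.
    """
    return coeffs[:, :, :keep, :keep] * (keep / BLOCK)


def reshape_to_channels(coeffs):
    """Hypothetical reshaping: one channel per frequency component."""
    hb, wb = coeffs.shape[:2]
    return coeffs.reshape(hb, wb, -1)
```

All three operations act directly on coefficient arrays, so a pipeline built this way never returns to the pixel domain between decoding and model input.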

Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device

A three-dimensional data encoding method encodes a plurality of three-dimensional points, and includes: selecting one prediction mode from two or more prediction modes for calculating a predicted value of an attribute information item of a first three-dimensional point, in accordance with attribute information items of one or more second three-dimensional points in vicinity of the first three-dimensional point; calculating the predicted value by the selected prediction mode; calculating, as a prediction residual, a difference between a value of the attribute information item of the first three-dimensional point and the calculated predicted value; and generating a first bit stream that includes the selected prediction mode, the prediction residual, and a number of the two or more prediction modes.
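The encode steps listed in this abstract can be sketched as follows. The concrete prediction modes (mode 0 as the integer average of the neighbors, mode k as the attribute of the k-th neighbor), the selection criterion (smallest absolute residual), and the dictionary standing in for the bit stream are assumptions for illustration only; the abstract does not define them.

```python
from dataclasses import dataclass


@dataclass
class Point3D:
    position: tuple  # (x, y, z)
    attribute: int   # e.g. a colour component or reflectance value


def predict(mode, neighbors):
    """Predicted attribute value under a given (assumed) prediction mode."""
    if mode == 0:
        # Assumed mode 0: integer average of all neighboring attributes.
        return sum(n.attribute for n in neighbors) // len(neighbors)
    # Assumed mode k >= 1: copy the attribute of the k-th neighbor.
    return neighbors[mode - 1].attribute


def encode_point(target, neighbors):
    """Select a mode, compute the residual, and record what is signalled."""
    num_modes = len(neighbors) + 1
    # Assumed selection rule: the mode giving the smallest absolute residual.
    mode = min(range(num_modes),
               key=lambda m: abs(target.attribute - predict(m, neighbors)))
    residual = target.attribute - predict(mode, neighbors)
    return {"num_modes": num_modes, "mode": mode, "residual": residual}


def decode_point(record, neighbors):
    """Rebuild the attribute from the signalled mode and residual."""
    return predict(record["mode"], neighbors) + record["residual"]
```

Signalling the number of modes alongside the selected mode lets a decoder know how many candidates the mode index ranges over before it interprets that index.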

Encoding apparatus and encoding method, decoding apparatus and decoding method
11716487 · 2023-08-01

There is provided an encoding apparatus, an encoding method, a decoding apparatus, and a decoding method that make it possible to acquire two-dimensional image data of a viewpoint corresponding to a predetermined display image generation method, together with depth image data, independently of the viewpoint used at image pickup. A conversion unit generates, from three-dimensional data of an image pickup object, two-dimensional image data of a plurality of viewpoints corresponding to a predetermined display image generation method and depth image data indicating the position of each pixel in the depthwise direction of the image pickup object. An encoding unit encodes the two-dimensional image data and the depth image data generated by the conversion unit. A transmission unit transmits the two-dimensional image data and the depth image data encoded by the encoding unit. The present disclosure can be applied, for example, to an encoding apparatus and the like.
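The role of the conversion unit can be illustrated with a minimal sketch that renders one viewpoint from three-dimensional data into a colour image and a matching depth image. The pinhole camera model, the `focal` parameter, and the point format `(x, y, z, colour)` in camera coordinates are assumptions for illustration; the abstract does not describe how the conversion is performed.

```python
def render_view(points, focal, width, height):
    """Project 3-D points into a 2-D colour image and a depth image.

    points: iterable of (x, y, z, colour) in the camera's coordinate system,
            with z > 0 meaning "in front of the camera" (assumed convention).
    Returns (colour, depth) as height x width nested lists.
    """
    colour = [[None] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    for x, y, z, c in points:
        if z <= 0:
            continue  # behind the camera: not visible from this viewpoint
        # Pinhole projection with the principal point at the image centre.
        u = int(round(focal * x / z + width / 2))
        v = int(round(focal * y / z + height / 2))
        # Z-buffer test: keep only the nearest point per pixel.
        if 0 <= u < width and 0 <= v < height and z < depth[v][u]:
            depth[v][u] = z
            colour[v][u] = c
    return colour, depth
```

Rendering a plurality of viewpoints would repeat this call after transforming the points into each viewpoint's coordinate system, which is what makes the output independent of the cameras used at image pickup.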

Method of efficient signalling of CBF flags

A method comprising: obtaining a bitstream, wherein the bitstream comprises a transform unit syntax and a coding unit syntax; the transform unit syntax includes a value of a first flag and a value of a second flag related to, respectively, a first chroma transform block and a second chroma transform block of a current transform unit or of a current sub-transform unit within the current transform unit; the first or second flag specifies whether the first or second chroma transform block, respectively, contains at least one transform coefficient level not equal to 0; and the coding unit syntax includes a value of a third flag specifying whether a transform tree structure is present; and deriving a value of a fourth flag based on the values of the first, second, and third flags, wherein the fourth flag specifies whether a luma transform block contains at least one transform coefficient level not equal to 0.
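One way such a derivation could work is sketched below. The inference rule used here (if a transform tree is present but both chroma blocks are empty, the luma block must carry a coefficient, so its flag need not be parsed) is an assumption modelled on common codec practice, not a statement of the claimed method, and it ignores the sub-transform-unit case. The function and parameter names are hypothetical.

```python
def derive_luma_cbf(cbf_cb, cbf_cr, cu_cbf, parse_luma_flag):
    """Derive the luma coded-block flag (the "fourth flag").

    cbf_cb, cbf_cr:  first and second flags (chroma blocks have nonzero levels)
    cu_cbf:          third flag (a transform tree structure is present)
    parse_luma_flag: callable that reads the flag from the bitstream; it is
                     invoked only when the value cannot be inferred.
    """
    if not cu_cbf:
        # No transform tree: there is no residual at all.
        return 0
    if not cbf_cb and not cbf_cr:
        # Assumed rule: a residual exists but not in chroma, so it is in luma.
        return 1
    # Otherwise the value cannot be inferred and is read explicitly.
    return parse_luma_flag()
```

Under this assumed rule the flag costs no bits in the first two cases, which is where a signalling saving would come from.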

DECODER, ENCODER AND METHODS FOR MIXING NAL UNITS OF DIFFERENT NAL UNIT TYPES IN VIDEO STREAMS

The present invention is concerned with decoders, encoders and corresponding methods for handling video data streams (11) comprising a first sub-bitstream (11-1) and a second sub-bitstream (11-2). The concept described herein provides solutions for mixing, within an access unit (30, 31, 32), different NAL units (301, 302, 3030, 304) of different NAL unit types. For example, IRAP NAL unit types may be mixed with different IRAP NAL unit types or non-IRAP NAL unit types, and non-IRAP NAL unit types may be mixed with different non-IRAP NAL unit types.

VIDEO DECODING METHOD, VIDEO ENCODING METHOD, AND RELATED APPARATUSES
20230024834 · 2023-01-26

A video decoding method includes: performing entropy decoding processing on an encoding block of a video image frame to obtain a quantized coefficient block of residual data corresponding to the encoding block; calculating quantization coefficients in the quantized coefficient block to obtain an implicitly derived index value; determining a transformation mode of the encoding block according to the implicitly derived index value and a value of an index identifier included in the encoding block; and performing inverse transformation processing on an inverse quantization result of the quantized coefficient block based on the transformation mode of the encoding block.
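The combination of an implicitly derived index with an explicit index identifier can be sketched as below. The derivation rule (parity of the sum of absolute coefficient values), the one-bit explicit index, and the table of transform mode names are all hypothetical choices made for illustration; the abstract does not specify how the coefficients are calculated or which modes are available.

```python
# Hypothetical mapping from (explicit index, implicit index) to a transform mode.
TRANSFORM_MODES = {
    (0, 0): "DCT2_DCT2",
    (0, 1): "DST7_DST7",
    (1, 0): "DCT8_DST7",
    (1, 1): "DST7_DCT8",
}


def implicit_index(quantized_block):
    """Assumed rule: parity of the sum of absolute quantized coefficients."""
    return sum(abs(c) for row in quantized_block for c in row) % 2


def select_transform_mode(quantized_block, explicit_index):
    """Combine the signalled index with the implicitly derived one."""
    return TRANSFORM_MODES[(explicit_index, implicit_index(quantized_block))]
```

Because the implicit index is computed from coefficients the decoder already has, it adds no bits to the stream; only the explicit index is transmitted in this sketch.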