H04N19/19

METHOD FOR INTER PREDICTION AND DEVICE THEREFOR, AND METHOD FOR MOTION COMPENSATION AND DEVICE THEREFOR

Provided are an inter prediction method and a motion compensation method. The inter prediction method includes: performing inter prediction on a current image by using a long-term reference image stored in a decoded picture buffer; determining residual data and a motion vector of the current image generated via the inter prediction; and determining least significant bit (LSB) information as a long-term reference index indicating the long-term reference image by dividing picture order count (POC) information of the long-term reference image into most significant bit (MSB) information and the LSB information.
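The MSB/LSB split described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical 4-bit LSB width; in practice the width is signaled in the bitstream (e.g., HEVC's log2_max_pic_order_cnt_lsb_minus4), and the exact signaling is defined by the claims, not this sketch.

```python
LSB_BITS = 4  # hypothetical LSB width; the real width is signaled in the bitstream

def split_poc(poc: int) -> tuple[int, int]:
    """Split a long-term reference picture's POC into (MSB info, LSB info).

    The LSB information serves as the long-term reference index.
    """
    lsb = poc & ((1 << LSB_BITS) - 1)   # low bits: the long-term reference index
    msb = poc >> LSB_BITS               # high bits: signaled separately or inferred
    return msb, lsb

def merge_poc(msb: int, lsb: int) -> int:
    """Decoder side: reconstruct the full POC from MSB and LSB information."""
    return (msb << LSB_BITS) | lsb

msb, lsb = split_poc(33)        # 33 = 0b10_0001 → MSB 2, LSB 1
assert merge_poc(msb, lsb) == 33  # round-trips losslessly
```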

Motion-compensated compression of dynamic voxelized point clouds

Disclosed herein are exemplary embodiments of innovations in the area of point cloud encoding and decoding. Example embodiments can reduce the computational complexity and/or computational resource usage during 3D video encoding by selectively encoding one or more 3D-point-cloud blocks using an inter-frame coding (e.g., motion compensation) technique that allows previously encoded/decoded frames to be used in predicting the current frames being encoded. Alternatively, one or more 3D-point-cloud blocks can be encoded using an intra-frame encoding approach. The selection of which encoding mode to use can be based, for example, on a threshold that is evaluated relative to rate-distortion performance for both intra-frame and inter-frame encoding. Still further, embodiments of the disclosed technology can use one or more voxel-distortion-correction filters to correct distortion errors that may occur during voxel compression. Such filters are uniquely adapted for the particular challenges presented when compressing 3D image data. Corresponding decoding techniques are also disclosed.
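The per-block mode decision described above can be sketched with a conventional Lagrangian cost J = D + λ·R and a ratio threshold. The function names, λ value, and threshold form below are illustrative assumptions, not the patent's exact rule.

```python
def rd_cost(distortion: float, rate: float, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lam * R."""
    return distortion + lam * rate

def choose_block_mode(intra: tuple[float, float], inter: tuple[float, float],
                      lam: float = 0.5, threshold: float = 1.0) -> str:
    """Pick inter-frame (motion-compensated) coding for a 3D-point-cloud block
    when its RD cost beats intra coding by the threshold factor."""
    j_intra = rd_cost(*intra, lam)
    j_inter = rd_cost(*inter, lam)
    return "inter" if j_inter < threshold * j_intra else "intra"

# A mostly static block: inter prediction is cheap and accurate, so it wins.
print(choose_block_mode(intra=(10.0, 100.0), inter=(4.0, 20.0)))  # → inter
```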

Receptive-field-conforming convolutional models for video coding
11310498 · 2022-04-19

An apparatus for encoding a block of a picture includes a convolutional neural network (CNN) for determining a block partitioning of the block, the block having an N×N size and a smallest partition determined by the CNN being of size S×S. The CNN includes feature extraction layers; a concatenation layer that receives, from the feature extraction layers, first feature maps of the block, where each first feature map of the first feature maps is of the smallest possible partition size S×S of the block; and at least one classifier that is configured to infer partition decisions for sub-blocks of size (αS)×(αS) of the block, where α is a power of 2.
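The classifier geometry described above can be illustrated with concrete numbers. This toy sketch assumes N = 64 and S = 8 (values chosen for illustration only): for each α that is a power of 2 with α·S ≤ N, one classifier emits a partition decision per (αS)×(αS) sub-block.

```python
N, S = 64, 8  # assumed block size and smallest CNN-determined partition size

def decisions_per_level(alpha: int) -> int:
    """Number of (alpha*S) x (alpha*S) sub-blocks in an N x N block,
    i.e. the number of partition decisions one classifier must infer."""
    side = N // (alpha * S)
    return side * side

for alpha in (1, 2, 4):  # alpha is a power of 2
    print(alpha, decisions_per_level(alpha))  # 1→64, 2→16, 4→4 decisions
```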

Method and apparatus for applying deep learning techniques in video coding, restoration and video quality analysis (VQA)

Video quality analysis may be used in many multimedia transmission and communication applications, such as encoder optimization, stream selection, and/or video reconstruction. An objective VQA metric that accurately reflects the quality of processed video relative to a source unprocessed video may take into account both spatial measures and temporal, motion-based measures when evaluating the processed video. Temporal measures may include differential motion metrics indicating the difference between the frame differences of a plurality of frames of the processed video and those of a corresponding plurality of frames of the source video. In addition, neural networks and deep learning techniques can be used to develop additional improved VQA metrics that take into account both spatial and temporal aspects of the processed and unprocessed videos.
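A differential motion metric of the kind described above can be sketched as follows: take frame-to-frame differences of both clips and compare them. The mean-absolute formulation here is an illustrative assumption, not the patent's specific metric.

```python
import numpy as np

def differential_motion(source: np.ndarray, processed: np.ndarray) -> float:
    """source/processed: (T, H, W) grayscale clips. Returns the mean absolute
    difference between the two clips' frame-difference (motion) signals;
    lower means the processed clip's motion matches the source better."""
    src_motion = np.diff(source.astype(float), axis=0)    # source frame differences
    proc_motion = np.diff(processed.astype(float), axis=0)  # processed frame differences
    return float(np.mean(np.abs(proc_motion - src_motion)))

# An unprocessed clip matches its own motion exactly:
clip = np.random.default_rng(0).integers(0, 256, size=(5, 8, 8))
print(differential_motion(clip, clip))  # → 0.0
```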

METHOD AND SYSTEMS FOR OPTIMIZED CONTENT ENCODING
20220103832 · 2022-03-31

An encoder (e.g., a Versatile Video Coding (VVC) encoder, etc.) may make real-time encoding decisions to vary a bitrate of a content item (e.g., high-resolution video, streaming content, a movie, a show/program, etc.) by upsampling and/or downsampling portions (e.g., frames/slices, groups of pictures (GOPs), coding units (CUs), coding tree units (CTUs), etc.) based on a cost function that utilizes a versatile Lagrangian multiplier (denoted as λ to denote the versatility). The versatile Lagrangian multiplier not only accounts for the effects of quantization (e.g., a quantization parameter (QP), etc.) on portions of the content item, but is also based on adjustment parameters associated with content resolution and/or playback quality of experience (QoE). The versatile Lagrangian multiplier enables a substantial decrease in transmission bitrate for portions of a content item without reducing visual presentation quality, keeping end-user quality of experience (QoE) at an optimal level.
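A cost function of the shape described above can be sketched with the common form J = D + λ·R, where λ is scaled by adjustment factors. The QP-dependent base below follows the well-known HEVC-style formula 0.85·2^((QP−12)/3); the resolution and QoE adjustment parameters are hypothetical stand-ins, not the patent's exact terms.

```python
def versatile_lambda(qp: float, resolution_adj: float = 1.0,
                     qoe_adj: float = 1.0) -> float:
    """Lagrangian multiplier scaled by hypothetical resolution/QoE factors."""
    base = 0.85 * 2 ** ((qp - 12) / 3.0)  # QP-dependent base (HEVC-style)
    return base * resolution_adj * qoe_adj

def cost(distortion: float, rate: float, lam: float) -> float:
    """Rate-distortion cost J = D + lam * R used for the encoding decision."""
    return distortion + lam * rate

# Downsampling a portion raises distortion but cuts rate; the adjusted lambda
# decides which trade-off wins:
lam = versatile_lambda(qp=27, resolution_adj=1.2, qoe_adj=0.9)
print(cost(distortion=100.0, rate=50.0, lam=lam))
```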

VIDEO STREAM ADAPTIVE FILTERING FOR BITRATE REDUCTION
20220078446 · 2022-03-10

Adaptive filtering is applied to a video stream for bitrate reduction. A first copy of the input video stream is encoded to a reference bitstream. Each of a number of candidate filters is applied to each frame of a second copy of the input video stream to produce a filtered second copy of the input video stream. The filtered second copy is encoded to a candidate bitstream. A cost value for the candidate filter is determined based on a distortion value and the bitrate difference between the candidate bitstream and the reference bitstream. The candidate bitstream corresponding to the candidate filter with the lowest of the cost values is selected as the output bitstream, which is then output or stored. Processing the input video stream using the adaptive filter before encoding may result in bitrate reduction, thereby improving compression, decompression, and other performance.
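The candidate-filter search described above can be sketched as a loop: encode the reference, encode each filtered copy, and keep the candidate with the lowest cost. The `encode` and `distortion` callables and the λ weight are hypothetical stand-ins for a real encoder and metric.

```python
def select_filter(frames, filters, encode, distortion, lam=0.1):
    """Return (cost, filter, candidate_bits) for the lowest-cost candidate.

    encode(frames) -> bitstream size; distortion(orig, filtered) -> float.
    Cost trades distortion against bitrate saved relative to the reference.
    """
    ref_bits = encode(frames)                     # reference bitstream size
    best = None
    for f in filters:
        filtered = [f(frame) for frame in frames]  # apply filter to every frame
        cand_bits = encode(filtered)
        c = distortion(frames, filtered) + lam * (cand_bits - ref_bits)
        if best is None or c < best[0]:
            best = (c, f, cand_bits)
    return best

# Toy demo: the "encoder" is just the summed magnitude of the samples.
frames = [1.0, 2.0, 3.0]
best_cost, best_filter, bits = select_filter(
    frames, [lambda x: x, lambda x: x / 2],
    encode=lambda fs: sum(fs),
    distortion=lambda a, b: sum(abs(x - y) for x, y in zip(a, b)))
print(best_cost, bits)  # identity filter wins here: zero distortion, no rate change
```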

Bit rate control method and video processing device

A bit rate control method includes the following operations: receiving a first target bit of a video to be coded; determining a second target bit for first coding tree units (CTUs) in CTUs of the video according to the first target bit; determining a fourth target bit of at least one fourth CTU in the CTUs according to an actual bit of at least one second CTU in the CTUs and a third target bit of at least one third CTU in the CTUs, in which the at least one second CTU is completely coded, the at least one third CTU is not completely coded, and a coding of the at least one fourth CTU is not started; and sequentially adjusting at least one coding parameter for coding the video according to the second target bit, the third target bit, and the fourth target bit.
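The reallocation step described above, recomputing the budget for not-yet-started CTUs from the actual bits of completed CTUs and the targets of CTUs still being coded, can be sketched as follows. The even split and the function/parameter names are illustrative assumptions, not the patent's exact allocation.

```python
def remaining_target(frame_target: int, actual_bits_done: list[int],
                     targets_in_progress: list[int], n_remaining: int) -> float:
    """Per-CTU target for CTUs whose coding has not started: what is left of
    the frame budget after subtracting bits actually spent on completed CTUs
    and the targets assigned to CTUs not yet completely coded."""
    left = frame_target - sum(actual_bits_done) - sum(targets_in_progress)
    return left / n_remaining if n_remaining else 0.0

# 1000-bit frame budget, two CTUs done (120 + 90 bits actually spent),
# one CTU in flight with a 100-bit target, seven CTUs not yet started:
print(remaining_target(1000, [120, 90], [100], 7))  # 690 bits over 7 CTUs
```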