Patent classifications
H04N19/19
METHODS, SYSTEMS, AND APPARATUSES FOR PROCESSING VIDEO BY ADAPTIVE RATE DISTORTION OPTIMIZATION
Systems and methods are described herein for processing video. An encoder implementing the systems and methods described herein may receive video data comprising a plurality of frames and may partition each frame of the plurality of frames into a plurality of coding units. The encoder may then partition a coding unit into two or more prediction units. The encoder may determine, based on one or more coding parameters, a target bit rate, and characteristics of a human visual system (HVS), a coding mode for each of the two or more prediction units to minimize distortion in the encoded bitstream. The encoder may then determine a residual signal comprising a difference between each of the two or more prediction units and each of one or more corresponding prediction areas in a previously encoded frame and then generate an encoded bitstream comprising the residual signal.
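The mode decision and residual computation described above can be illustrated with the conventional Lagrangian cost J = D + lam * R. This is a minimal sketch rather than the claimed method: the abstract does not give a cost formula, and the `Candidate` structure, the sample values, the bit counts, and the `lam` values are hypothetical. The HVS weighting and target bit rate mentioned in the abstract are not modeled here; in practice they might influence the distortion measure or the choice of `lam`.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate coding mode for a prediction unit (hypothetical structure)."""
    name: str
    prediction: list  # predicted sample values for the prediction unit
    bits: int         # assumed number of bits needed to code this mode


def residual(block, prediction):
    """Residual signal: sample-wise difference between a block and its prediction."""
    return [b - p for b, p in zip(block, prediction)]


def ssd(block, prediction):
    """Sum of squared differences, used here as the distortion measure (an assumption)."""
    return sum(r * r for r in residual(block, prediction))


def choose_mode(block, candidates, lam):
    """Return the candidate that minimizes J = D + lam * R."""
    return min(candidates, key=lambda c: ssd(block, c.prediction) + lam * c.bits)


block = [10, 12, 14, 16]
candidates = [
    Candidate("intra_dc", [13, 13, 13, 13], bits=4),
    Candidate("inter_mv", [10, 12, 14, 15], bits=12),
]

best_low = choose_mode(block, candidates, lam=1.0)   # rate is cheap
best_high = choose_mode(block, candidates, lam=5.0)  # rate is expensive
```

With a small `lam` the accurate but costly inter candidate is selected; with a large `lam` the cheaper intra candidate is selected, which is the trade-off a target bit rate would steer.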
Methods and Apparatuses of Frequency Domain Mode Decision in Video Encoding Systems
Video encoding methods and apparatuses for frequency domain mode decision include receiving residual data of a current block, testing multiple coding modes on the residual data, calculating a distortion associated with each of the coding modes in a frequency domain, performing a mode decision to select a best coding mode from the tested coding modes according to the distortion calculated in the frequency domain, and encoding the current block based on the best coding mode.
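One reason distortion can be calculated in the frequency domain is that an orthonormal transform preserves squared error, so the distortion measured on transform coefficients equals the distortion measured on reconstructed samples. The sketch below checks this for a 1-D orthonormal DCT with uniform quantization. The transform choice, the quantizer, and the sample values are assumptions made for illustration; the abstract does not specify them.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out


def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out


def quantize(coeffs, step):
    """Uniform quantization followed by dequantization."""
    return [round(c / step) * step for c in coeffs]


def freq_distortion(coeffs, step):
    """Squared error measured directly on the transform coefficients."""
    return sum((c - q) ** 2 for c, q in zip(coeffs, quantize(coeffs, step)))


residual = [5.0, -3.0, 2.0, 7.0, -1.0, 0.0, 4.0, -6.0]
coeffs = dct(residual)
d_freq = freq_distortion(coeffs, step=4.0)

# Spatial-domain reference: reconstruct the samples, then compare sample by sample.
recon = idct(quantize(coeffs, 4.0))
d_spatial = sum((r - s) ** 2 for r, s in zip(residual, recon))
```

Because `d_freq` and `d_spatial` agree up to floating-point error, a mode decision that compares `d_freq` across candidate modes can skip the inverse transform for the modes it rejects.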
Systems and methods for dynamic early termination of mode decision in hardware video encoders
An example system may include a primary mode decision module, included in a hardware video encoding pipeline, that (1) receives video data for encoding in accordance with a video encoding standard, and (2) identifies, from an initial set of prediction modes supported by the video encoding standard, a primary set of prediction modes for encoding the video data in accordance with the video encoding standard. The example system may also include a secondary mode decision module that (1) determines, for each prediction mode included in the primary set of prediction modes and based on the video data, a cost associated with the prediction mode, and (2) selects, from the primary set of prediction modes and based on the determined costs associated with the prediction modes included in the primary set of prediction modes, a prediction mode for encoding of the video data by the hardware video encoding pipeline.
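The two-stage structure described above can be sketched in software as a cheap pruning pass followed by a more expensive cost evaluation over the surviving modes. This is an illustrative sketch only: the abstract does not state which cost measures the hardware modules use, so the SAD ranking in the primary stage, the SSD-plus-rate cost in the secondary stage, and all mode names, predictions, and bit counts are assumptions.

```python
def sad(block, prediction):
    """Sum of absolute differences: a cheap measure for the pruning stage."""
    return sum(abs(b - p) for b, p in zip(block, prediction))


def ssd(block, prediction):
    """Sum of squared differences: the distortion term for the final cost."""
    return sum((b - p) ** 2 for b, p in zip(block, prediction))


def primary_decision(block, predictions, keep):
    """Rank every supported mode by SAD and keep the `keep` best as the primary set."""
    ranked = sorted(predictions, key=lambda mode: sad(block, predictions[mode]))
    return ranked[:keep]


def secondary_decision(block, predictions, bits, primary_set, lam):
    """Compute a cost for each mode in the primary set and select the cheapest."""
    costs = {mode: ssd(block, predictions[mode]) + lam * bits[mode]
             for mode in primary_set}
    return min(costs, key=costs.get), costs


block = [8, 9, 10, 11]
predictions = {
    "dc": [9, 9, 10, 10],
    "horizontal": [8, 8, 8, 8],
    "vertical": [8, 9, 10, 12],
    "planar": [0, 0, 0, 0],
}
bits = {"dc": 2, "horizontal": 3, "vertical": 6, "planar": 4}

primary_set = primary_decision(block, predictions, keep=2)
best, costs = secondary_decision(block, predictions, bits, primary_set, lam=1.0)
```

Here the primary stage keeps `vertical` and `dc`, and the secondary stage selects `dc` because its lower rate outweighs its slightly higher distortion.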
MOTION-COMPENSATED COMPRESSION OF DYNAMIC VOXELIZED POINT CLOUDS
Disclosed herein are exemplary embodiments of innovations in the area of point cloud encoding and decoding. Example embodiments can reduce the computational complexity and/or computational resource usage during 3D video encoding by selectively encoding one or more 3D-point-cloud blocks using an inter-frame coding (e.g., motion compensation) technique that allows for previously encoded/decoded frames to be used in predicting current frames being encoded. Alternatively, one or more 3D-point-cloud blocks can be encoded using an intra-frame encoding approach. The selection of which encoding mode to use can be based, for example, on a threshold that is evaluated relative to rate-distortion performance for both intra-frame and inter-frame encoding. Still further, embodiments of the disclosed technology can use one or more voxel-distortion-correction filters to correct distortion errors that may occur during voxel compression. Such filters are uniquely adapted for the particular challenges presented when compressing 3D image data. Corresponding decoding techniques are also disclosed.
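One possible reading of the threshold-based selection is sketched below: a block is coded in inter-frame mode only when its rate-distortion cost beats the intra-frame cost by more than a threshold. The abstract does not define how the threshold is applied, so the comparison rule, the cost form J = D + lam * R, and all numbers are assumptions for illustration.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost J = D + lam * R (assumed form)."""
    return distortion + lam * bits


def select_block_mode(intra, inter, lam, threshold):
    """Pick a mode for one block.

    `intra` and `inter` are (distortion, bits) tuples for the two options.
    Inter coding is chosen only if it lowers the cost by more than `threshold`.
    """
    j_intra = rd_cost(intra[0], intra[1], lam)
    j_inter = rd_cost(inter[0], inter[1], lam)
    return "inter" if j_intra - j_inter > threshold else "intra"


intra = (40.0, 100)  # lower distortion, more bits
inter = (55.0, 30)   # higher distortion, fewer bits

mode_loose = select_block_mode(intra, inter, lam=0.5, threshold=10.0)
mode_strict = select_block_mode(intra, inter, lam=0.5, threshold=25.0)
```

With these hypothetical numbers the intra cost is 90 and the inter cost is 70, so the block is inter-coded under the looser threshold and intra-coded under the stricter one.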
Machine learning based rate-distortion optimizer for video compression
Systems and techniques are described for data encoding using a machine learning approach to generate a distortion prediction D̂ and a predicted bit rate R̂, and to use D̂ and R̂ to perform rate-distortion optimization (RDO). For example, a video encoder can generate the distortion prediction D̂ and a bit rate residual prediction based on outputs of one or more neural networks in response to the one or more neural networks receiving a residual portion of a block of a video frame as input. The video encoder can determine a bit rate metadata prediction based on metadata associated with a mode of compression, and determine R̂ to be the sum of the bit rate residual prediction and the bit rate metadata prediction. The video encoder can determine a rate-distortion cost prediction Ĵ as a function of D̂ and R̂, and can determine a prediction mode for compressing the block based on Ĵ.
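The way the predicted quantities combine can be sketched as follows. The predicted bit rate (`r_hat`) is the sum of a residual part and a metadata part, and the predicted cost (`j_hat`) is computed from the predicted distortion (`d_hat`) and `r_hat`. Several pieces here are assumptions: the abstract says only that the cost is a function of the two predictions, so the Lagrangian form is a guess at one common choice, and the neural networks are replaced by a toy stand-in function that is not a trained model. The metadata bit table and residual values are hypothetical.

```python
def predict_distortion_and_residual_bits(residual):
    """Toy stand-in for the neural networks (NOT a trained model).

    Returns (d_hat, r_res_hat): a predicted distortion and a predicted
    number of bits for the residual, from simple hand-written heuristics.
    """
    energy = sum(r * r for r in residual)
    nonzero = sum(1 for r in residual if r != 0)
    return 0.1 * energy, 2.0 * nonzero


def predict_metadata_bits(mode):
    """Predicted bits for mode metadata, from a hypothetical lookup table."""
    table = {"intra": 3.0, "inter": 8.0}
    return table[mode]


def predicted_cost(residual, mode, lam):
    """j_hat = d_hat + lam * r_hat, where r_hat = r_res_hat + r_meta_hat."""
    d_hat, r_res_hat = predict_distortion_and_residual_bits(residual)
    r_hat = r_res_hat + predict_metadata_bits(mode)
    return d_hat + lam * r_hat


# Residual portion of the same block under each candidate mode (made-up values).
residuals = {"intra": [4, -2, 0, 3], "inter": [1, 0, 0, -1]}

costs = {mode: predicted_cost(res, mode, lam=1.0) for mode, res in residuals.items()}
best_mode = min(costs, key=costs.get)
```

The point of the sketch is the data flow: no candidate is actually encoded, and the mode is chosen from predicted values alone.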
Mode complexity based coding strategy selection
A system may receive an input stream for a coding operation. The system may determine available coding modes for the coding operation. The system may include coding selection logic that may determine a coding mode based on the available coding modes. The coding selection logic may use the selected coding mode to determine a coding strategy. The selection logic may send an indication of the selected coding mode and coding strategy to coding logic to support execution of the coding operation, which may use the selected coding mode and coding strategy.
METHOD OF TRANSCODING VIDEO DATA WITH FUSION OF CODING UNITS, COMPUTER PROGRAM, TRANSCODING MODULE AND TELECOMMUNICATIONS EQUIPMENT ASSOCIATED THEREWITH
Method of transcoding video data between a first and a second format (F1, F2), the method comprising a step of decoding the binary stream (F_B1) providing decoded video data, data representative of the coding structure of the frames in the first format (F1) and, for all or some of the first coding units, prediction data, and a step of re-encoding in the course of which the decoded video data are encoded in the second format (F2). During the re-encoding step, an intermediate coding structure is constructed, comprising intermediate coding units constructed so as to correspond to the fusion of one or more first coding units, prediction data are allocated to each of the intermediate coding units, and the decoded video data are re-encoded in the second format (F2) as a function of the intermediate coding structure.
METHODS, ENCODERS AND DECODERS FOR CODING OF VIDEO SEQUENCING
Methods, encoders (110) and decoders (120) for encoding frames of a video sequence into an encoded representation of the video sequence are disclosed. The encoder (110) encodes (203) frames into a first set of encoded units, while specifying at least one residual parameter in one or more of the first set of encoded units. The encoder (110) encodes (204) frames into a second set of encoded units, while refraining from specifying the at least one residual parameter. The encoder (110) encodes (203) frames into a first set of encoded units, wherein each frame has a first level of fidelity. The encoder (110) encodes (204) frames into a second set of encoded units, wherein each frame has a second level of fidelity, wherein the second level is less than the first level. The decoder (120) decodes (212, 213), while obtaining a first or a second level of fidelity for each frame. When the second level is less than the first level, the decoder (120) enhances (216) a second set of frames towards obtaining the first level of fidelity for each frame of the second set. Corresponding computer programs and carriers therefor are also disclosed.