Patent classifications
H04N19/19
RECEPTIVE-FIELD-CONFORMING CONVOLUTION MODELS FOR VIDEO CODING
Convolutional neural networks (CNNs) that determine a mode decision (e.g., block partitioning) for encoding a block include feature extraction layers and multiple classifiers. A non-overlapping convolution operation is performed at a feature extraction layer by setting a stride value equal to a kernel size. The block has an N×N size, and a smallest partition output for the block has an S×S size. Classification layers of each classifier receive feature maps having a feature dimension. An initial classification layer receives the feature maps as an output of a final feature extraction layer. Each classifier infers partition decisions for sub-blocks of size (αS)×(αS) of the block, wherein α is a power of 2 and α = 2, . . . , N/S, by applying, at some successive classification layers, a 1×1 kernel to reduce respective feature dimensions; and outputting by a last layer of the classification layers an output corresponding to an N/(αS)×N/(αS)×1 output map.
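The two size relationships in this abstract can be illustrated with a minimal sketch. It assumes a single-channel square block and a single square kernel; the function names are hypothetical and the code is not taken from the patent. The first function performs a convolution whose stride equals the kernel size, so every output sample covers a disjoint patch; the second lists the side length N/(αS) of the output map for each α.

```python
import numpy as np


def non_overlapping_conv(block, kernel):
    """Convolve with stride == kernel size: each output sees a disjoint k x k patch."""
    n = block.shape[0]
    k = kernel.shape[0]
    assert block.shape == (n, n) and kernel.shape == (k, k) and n % k == 0
    # tiles[i, a, j, b] == block[i * k + a, j * k + b]
    tiles = block.reshape(n // k, k, n // k, k)
    # Weight each tile by the kernel and sum over the in-tile axes.
    return np.einsum("iajb,ab->ij", tiles, kernel)


def classifier_output_sizes(n, s):
    """Map each alpha (powers of 2 from 2 to N/S) to the output side N/(alpha*S)."""
    sizes = {}
    alpha = 2
    while alpha <= n // s:
        sizes[alpha] = n // (alpha * s)
        alpha *= 2
    return sizes


if __name__ == "__main__":
    demo = np.arange(16, dtype=np.float64).reshape(4, 4)
    print(non_overlapping_conv(demo, np.ones((2, 2))))
    print(classifier_output_sizes(64, 8))
```

For a 64×64 block with an 8×8 smallest partition, the sketch yields one classifier per α in {2, 4, 8}, with output maps of side 4, 2, and 1 respectively.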
METHOD AND APPARATUS FOR POINT CLOUD COMPRESSION
Aspects of the disclosure provide methods and apparatuses for point cloud compression and decompression. In some examples, an apparatus for point cloud compression/decompression includes processing circuitry. For example, the processing circuitry in the apparatus for point cloud encoding receives an occupancy map for a point cloud. The occupancy map is indicative of a background portion and a foreground portion for a coding block in an image that is generated based on the point cloud. Then, the processing circuitry devaluates distortions in the background portion of the coding block during an optimization process that results in a coding option for the coding block, and encodes the coding block according to the coding option.
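One way to picture devaluing background distortion is a rate-distortion cost in which squared error is weighted per pixel by the occupancy map. The sketch below is an assumption for illustration only: the function name, the `background_weight` parameter, and the use of sum of squared errors with a Lagrangian cost D + λR are not stated in the abstract.

```python
import numpy as np


def masked_rd_cost(original, reconstructed, occupancy, rate_bits, lmbda,
                   background_weight=0.0):
    """Lagrangian cost D + lambda * R with background distortion down-weighted.

    occupancy is nonzero for foreground (occupied) pixels. background_weight
    is a hypothetical knob: 0.0 ignores background error entirely.
    """
    err = (original.astype(np.float64) - reconstructed.astype(np.float64)) ** 2
    weights = np.where(occupancy.astype(bool), 1.0, background_weight)
    distortion = float((weights * err).sum())
    return distortion + lmbda * rate_bits


if __name__ == "__main__":
    orig = np.array([[10, 10], [10, 10]])
    recon = np.array([[12, 10], [10, 14]])
    occ = np.array([[1, 0], [0, 0]])
    print(masked_rd_cost(orig, recon, occ, rate_bits=8, lmbda=0.5))
```

An encoder using a cost of this shape would compare candidate coding options by this value, so that an option with large error confined to unoccupied pixels is not penalized for it.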
Content-aware predictive bitrate ladder
Methods, systems, and apparatuses may encode a media content item based on metadata from previous encoding. The encoding may also generate encoding metadata, which may comprise a qualitative or quantitative characterization of the encoded media content item. A prediction engine may, based on this metadata, determine new encoding settings for the same or a different video resolution. The prediction engine may cause an encoded media content item to be stored and may cause encoding of the media content item using the new encoding settings.
ESCAPE CODING FOR COEFFICIENT LEVELS
As part of bypass decoding syntax elements for a set of coefficients in response to reaching a maximum number of regular coded bins, a video decoder is configured to receive a prefix value for a transform coefficient; decode the prefix value using Golomb-Rice coding; in response to a length of the prefix value being equal to a threshold value, receive a suffix value for the transform coefficient; decode the suffix value using exponential Golomb coding; and determine a level value for the transform coefficient based on the decoded prefix value and the decoded suffix value.
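The prefix/suffix structure described here can be sketched as follows. This is a simplified illustration under stated assumptions, not the binarization of the patent or of any video coding standard: the prefix is a Golomb-Rice code whose unary part is capped at `threshold` ones, the suffix is an order-0 exponential Golomb code, bits are represented as a string of "0"/"1" characters, and all function and parameter names are hypothetical.

```python
def encode_level(value, rice_k, threshold):
    """Encode a non-negative level as a capped Golomb-Rice prefix plus optional suffix."""
    q = value >> rice_k
    if q < threshold:
        # Regular Golomb-Rice: q ones, a terminating zero, then rice_k remainder bits.
        rem = value & ((1 << rice_k) - 1)
        rem_bits = format(rem, "b").zfill(rice_k) if rice_k else ""
        return "1" * q + "0" + rem_bits
    # Escape: prefix of exactly `threshold` ones, then an order-0 exp-Golomb suffix.
    escape = value - (threshold << rice_k)
    code = format(escape + 1, "b")
    return "1" * threshold + "0" * (len(code) - 1) + code


def decode_level(bits, rice_k, threshold):
    """Decode one level; returns (value, number_of_bits_consumed)."""
    pos = 0
    q = 0
    while q < threshold and bits[pos] == "1":
        q += 1
        pos += 1
    if q < threshold:
        pos += 1  # skip the terminating zero
        rem = int(bits[pos:pos + rice_k], 2) if rice_k else 0
        pos += rice_k
        return (q << rice_k) + rem, pos
    # Prefix length reached the threshold, so a suffix follows.
    zeros = 0
    while bits[pos] == "0":
        zeros += 1
        pos += 1
    suffix = int(bits[pos:pos + zeros + 1], 2) - 1
    pos += zeros + 1
    return (threshold << rice_k) + suffix, pos


if __name__ == "__main__":
    for level in (3, 8, 10):
        code = encode_level(level, rice_k=1, threshold=4)
        print(level, code, decode_level(code, rice_k=1, threshold=4))
```

Capping the prefix bounds the worst-case prefix length, while the exponential Golomb suffix keeps large levels representable with a code length that grows only logarithmically.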
Hybrid Motion-Compensated Neural Network with Side-Information Based Video Coding
A hybrid apparatus for coding a video stream includes a first encoder. The first encoder includes a neural network having at least one hidden layer, and the neural network receives source data from the video stream at a first hidden layer of the at least one hidden layer, receives side information correlated with the source data at the first hidden layer, and generates guided information using the source data and the side information. The first encoder outputs the guided information and the side information for a decoder to reconstruct the source data.
RATE/DISTORTION/RDCOST MODELING WITH MACHINE LEARNING
A method for encoding a block of a video stream includes generating, using pixel values of the block, block features for the block; for each candidate encoding mode of candidate encoding modes, generating, using the block features and that candidate encoding mode as inputs to a machine-learning module, a respective encoding cost; selecting, based on the respective encoding costs, a predetermined number of the candidate encoding modes; selecting, based on the respective encoding costs of the selected candidate encoding modes, a best mode for encoding the block; and encoding, in a compressed bitstream, the block using the best mode.
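The two-stage selection in this abstract (shortlist a fixed number of modes by predicted cost, then pick the best of the shortlist) can be sketched as below. The `cost_model` callable stands in for the machine-learning module and is purely hypothetical, as is the optional `refine_cost` hook, which is an assumption allowing the final choice to use a different cost measure than the first stage.

```python
import heapq


def select_mode(block_features, candidate_modes, cost_model, shortlist_size,
                refine_cost=None):
    """Return (best_mode, shortlist) using predicted encoding costs.

    cost_model(block_features, mode) -> float is a stand-in for the ML module.
    refine_cost(mode) -> float is optional; by default the predicted costs are reused.
    """
    # Stage 1: predict a cost for every candidate mode.
    predicted = {mode: cost_model(block_features, mode) for mode in candidate_modes}
    # Stage 2: keep a predetermined number of the cheapest modes.
    shortlist = heapq.nsmallest(shortlist_size, candidate_modes,
                                key=predicted.__getitem__)
    # Stage 3: choose the best mode among the shortlisted ones.
    final_cost = refine_cost if refine_cost is not None else predicted.__getitem__
    best = min(shortlist, key=final_cost)
    return best, shortlist


if __name__ == "__main__":
    # Toy cost model: modes closer to the feature sum are "cheaper".
    toy_model = lambda features, mode: abs(sum(features) - mode)
    print(select_mode([1, 2], [0, 1, 2, 3, 4, 5], toy_model, shortlist_size=3))
```

The point of the shortlist is to limit how many modes reach the final comparison, which matters when that comparison is more expensive than a model prediction.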
SPEEDUP TECHNIQUES FOR RATE DISTORTION OPTIMIZED QUANTIZATION
Techniques for selecting a coding mode for an image coding process are described. Coding modes can be selected through a coding mode transition state machine, a re-quantization process, selection of an optimal transform size, by skipping some quantization parameters, or by performing motion search.