Patent classifications
H04N19/90
METHOD AND APPARATUS FOR TRANSFORM-BASED IMAGE ENCODING/DECODING
The present invention relates to a method and apparatus for encoding and decoding a video image based on a transform. The method for decoding a video includes: determining a transform mode of a current block; inverse-transforming residual data of the current block according to the transform mode of the current block; and rearranging the inverse-transformed residual data of the current block according to the transform mode of the current block, wherein the transform mode includes at least one of SDST (Shuffling Discrete Sine Transform), SDCT (Shuffling Discrete Cosine Transform), DST (Discrete Sine Transform), or DCT (Discrete Cosine Transform).
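The decoding order described above (inverse transform first, then rearrangement) can be illustrated with a minimal one-dimensional sketch. The abstract does not say what the rearrangement is, so the `shuffle` function below (a simple reversal of the sample order) is a hypothetical stand-in, as are the function names and the choice of an orthonormal DST-VII basis.

```python
import math

def dst7_matrix(n):
    """Orthonormal DST-VII basis; row k holds the k-th basis function."""
    scale = math.sqrt(4.0 / (2 * n + 1))
    return [[scale * math.sin(math.pi * (2 * k + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for k in range(n)]

def forward(x, t):
    # Project the samples onto each basis row.
    return [sum(t[k][j] * x[j] for j in range(len(x))) for k in range(len(x))]

def inverse(c, t):
    # The basis is orthonormal, so the inverse is the transpose.
    return [sum(t[k][j] * c[k] for k in range(len(c))) for j in range(len(c))]

def shuffle(x):
    # Hypothetical rearrangement (assumption): reverse the sample order.
    # A reversal is its own inverse, so the decoder reuses it.
    return x[::-1]

def encode_row(residual, mode):
    t = dst7_matrix(len(residual))
    data = shuffle(residual) if mode == "SDST" else residual
    return forward(data, t)

def decode_row(coeffs, mode):
    t = dst7_matrix(len(coeffs))
    data = inverse(coeffs, t)                          # step 1: inverse transform
    return shuffle(data) if mode == "SDST" else data   # step 2: rearrange
```

Calling `decode_row(encode_row(residual, "SDST"), "SDST")` should return the original residual up to floating-point error; the same holds for the plain `"DST"` mode, where no rearrangement takes place.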
ENCODING METHOD AND DEVICE THEREFOR, AND DECODING METHOD AND DEVICE THEREFOR
A video decoding method includes: determining, based on an area of a current block, whether a multi-prediction combination mode, which predicts the current block by combining prediction results obtained according to a plurality of prediction modes, is applied to the current block; when the multi-prediction combination mode is applied to the current block, determining the plurality of prediction modes to be applied to the current block; generating a plurality of prediction blocks of the current block according to the plurality of prediction modes; and determining a combined prediction block of the current block by combining the plurality of prediction blocks according to respective weights.
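As a rough illustration of the final combining step, the sketch below blends prediction blocks with integer weights and rounding. The area rule (`min_area=64`) and both function names are hypothetical; the abstract only states that the decision depends on the block area and does not give a threshold or its direction.

```python
def combination_allowed(width, height, min_area=64):
    # Hypothetical rule (assumption): blocks smaller than min_area skip the mode.
    return width * height >= min_area

def combine_predictions(blocks, weights):
    """Weighted average of equally sized prediction blocks, rounded to nearest."""
    total = sum(weights)
    rows, cols = len(blocks[0]), len(blocks[0][0])
    return [[(sum(w * b[r][c] for b, w in zip(blocks, weights)) + total // 2) // total
             for c in range(cols)]
            for r in range(rows)]
```

For example, combining one block with weight 1 and another with weight 3 gives each output sample three quarters of its value from the second block.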
Machine-learned in-loop predictor for video compression
A compression system trains a compression model for an encoder and decoder. In one embodiment, the compression model includes a machine-learned in-loop flow predictor that generates a flow prediction from previously reconstructed frames. The machine-learned flow predictor is coupled to receive a set of previously reconstructed frames and output a flow prediction for a target frame that is an estimation of the flow for the target frame. In particular, since the flow prediction can be generated by the decoder using the set of previously reconstructed frames, the encoder may transmit a flow delta that indicates a difference between the flow prediction and the actual flow for the target frame, instead of transmitting the flow itself. In this manner, the encoder can transmit a significantly smaller number of bits to the receiver, improving computational efficiency.
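The flow-delta idea can be sketched in a few lines, assuming both sides run the same deterministic predictor on the same reconstructed frames. The `toy_predictor` below is only a placeholder for the machine-learned model (it guesses a single shift from the position of the brightest sample), and the one-dimensional frames and all names are assumptions made for illustration.

```python
def toy_predictor(frames):
    # Placeholder for the learned flow predictor (assumption): estimate one
    # global shift from where the brightest sample moved between two frames.
    prev, last = frames[-2], frames[-1]
    shift = last.index(max(last)) - prev.index(max(prev))
    return [shift] * len(last)

def encode_flow(actual_flow, reconstructed_frames, predictor):
    # Encoder transmits only the difference from the shared prediction.
    predicted = predictor(reconstructed_frames)
    return [a - p for a, p in zip(actual_flow, predicted)]

def decode_flow(flow_delta, reconstructed_frames, predictor):
    # Decoder rebuilds the same prediction and adds the received delta.
    predicted = predictor(reconstructed_frames)
    return [p + d for p, d in zip(predicted, flow_delta)]
```

When the prediction is close to the actual flow, most delta entries are zero or small, which is what makes the delta cheaper to transmit than the flow itself.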
Hybrid digital-analog modulation for transmission of video data
A method for encoding video data comprises generating coefficients based on video data; generating coefficient vectors, wherein each of the coefficient vectors includes n of the coefficients; for each of the coefficient vectors, determining an amplitude value for the coefficient vector based on a mapping pattern, wherein for each respective allowed coefficient vector in a plurality of allowed coefficient vectors: the mapping pattern maps the respective allowed coefficient vector to a respective amplitude value in a plurality of amplitude values, and the respective amplitude value is adjacent in an n-dimensional space to at least one other amplitude value in the plurality of amplitude values that is adjacent to the respective amplitude value in a monotonic number line of the amplitude values; and modulating an analog signal based on the amplitude values for the coefficient vectors.
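One possible reading of the mapping pattern, for n = 2, is a boustrophedon ("snake") scan: coefficient vectors that are neighbours on the amplitude number line are also neighbours in the two-dimensional coefficient space. The sketch below is an illustration under that assumption, not the mapping claimed in the patent; the function names and the square grid of allowed vectors are hypothetical.

```python
def snake_amplitude(vec, size):
    """Map an allowed vector (a, b), with 0 <= a, b < size, to an amplitude index."""
    a, b = vec
    # Even rows run left to right, odd rows right to left.
    col = b if a % 2 == 0 else size - 1 - b
    return a * size + col

def inverse_snake(amplitude, size):
    """Recover the coefficient vector from an amplitude index."""
    a, col = divmod(amplitude, size)
    b = col if a % 2 == 0 else size - 1 - col
    return (a, b)
```

With this pattern, stepping the amplitude index by one always changes exactly one coefficient by one, so a small amplitude error would decode to a nearby coefficient vector rather than a distant one.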
End-to-end neural network based video coding
Systems and techniques are described herein for processing video data using a neural network system. For instance, a process can include generating, by a first convolutional layer of an encoder sub-network of the neural network system, output values associated with a luminance channel of a frame. The process can include generating, by a second convolutional layer of the encoder sub-network, output values associated with at least one chrominance channel of the frame. The process can include generating a combined representation of the frame by combining the output values associated with the luminance channel of the frame and the output values associated with the at least one chrominance channel of the frame. The process can include generating encoded video data based on the combined representation of the frame.
Point cloud compression
A system comprises an encoder configured to compress attribute information for a point cloud and/or a decoder configured to decompress compressed attribute information for the point cloud. Attribute values for at least one starting point are included in a compressed attribute information file, and attribute correction values used to correct predicted attribute values are included in the compressed attribute information file. Attribute values are predicted based, at least in part, on attribute values of neighboring points and distances between a particular point for which an attribute value is being predicted and the neighboring points. The predicted attribute values are compared to attribute values of a point cloud prior to compression to determine attribute correction values. A decoder follows a prediction process similar to that of the encoder and corrects predicted values using attribute correction values included in a compressed attribute information file.
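The prediction-and-correction loop can be sketched as follows, assuming inverse-distance weighting over the k nearest already-coded points, integer-rounded corrections, and point positions that the decoder already knows. None of these specifics come from the abstract; the names `predict`, `encode`, and `decode` and the parameter `k` are hypothetical.

```python
import math

def predict(position, coded, k=3):
    """Inverse-distance-weighted average over the k nearest coded points.

    Assumes all positions are distinct, so no distance is zero.
    """
    nearest = sorted(coded, key=lambda pc: math.dist(position, pc[0]))[:k]
    weights = [1.0 / math.dist(position, pos) for pos, _ in nearest]
    return sum(w * val for w, (_, val) in zip(weights, nearest)) / sum(weights)

def encode(points):
    """points: list of (position, attribute). Returns (start value, corrections)."""
    start = points[0][1]
    coded = [(points[0][0], start)]
    corrections = []
    for pos, actual in points[1:]:
        predicted = predict(pos, coded)
        corr = round(actual - predicted)        # quantized correction value
        corrections.append(corr)
        coded.append((pos, predicted + corr))   # track what the decoder will see
    return start, corrections

def decode(positions, start, corrections):
    """Mirror the encoder: predict from decoded neighbours, then apply the correction."""
    coded = [(positions[0], start)]
    for pos, corr in zip(positions[1:], corrections):
        coded.append((pos, predict(pos, coded) + corr))
    return [val for _, val in coded]
```

The encoder predicts from its own reconstructed values rather than the original ones, which keeps its predictions in step with the decoder's; with rounded corrections, each decoded attribute stays within 0.5 of the original.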