Patent classifications
H04N19/19
Method for Optimizing Second Coding
A method for optimizing a second coding is provided. The method includes: setting a quantization parameter for the start frame of a video sequence according to the range of the coder's input quantization parameter QP.sub.0; performing a first coding with a simplified method and calculating a frame-level temporal impact factor k.sub.i of the current frame and a block-level temporal impact factor k.sub.B,j for each 16×16 pixel block in the current frame; restoring the coder's reference list information after the first coding is completed and then determining whether a scene switch has occurred; and performing the second coding, setting quantization parameters with different strategies according to whether the scene is switched.
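The abstract does not define how the temporal impact factors are computed; the sketch below is a hypothetical interpretation in which k.sub.B,j is modeled as the ratio of a 16×16 block's inter-frame difference energy to the frame average (all names and the ratio formulation are assumptions, not the patented formula):

```python
import numpy as np

def temporal_impact_factors(prev_frame, cur_frame, block=16):
    """Illustrative sketch: a frame-level factor k_i (here, the mean absolute
    inter-frame difference) and per-block factors k_Bj (each block's mean
    difference relative to the frame mean)."""
    h, w = cur_frame.shape
    diff = np.abs(cur_frame.astype(float) - prev_frame.astype(float))
    frame_energy = diff.mean()
    k_blocks = {}
    for y in range(0, h, block):
        for x in range(0, w, block):
            b = diff[y:y + block, x:x + block]
            k_blocks[(y // block, x // block)] = b.mean() / max(frame_energy, 1e-9)
    return frame_energy, k_blocks
```

A block that changes more than the frame average gets a factor above 1, which a second coding pass could map to a lower (finer) quantization parameter for that block.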
Media Object Compression/Decompression with Adaptive Processing for Block-Level Sub-Errors and/or Decomposed Block-Level Sub-Errors
A system comprises an encoder configured to compress media objects using a compression loop that includes a residual decomposition component that decomposes a residual signal for a block of the media object being compressed into multiple sub-error signals. The encoder is further configured to enable different transformation and/or quantization processes to be specified to be applied to different ones of the sub-errors. A corresponding decoder is configured to apply inverse transformation/quantization processing to the sub-error signals, based on the transformation/quantization processes that were applied at the encoder. The decoder then recreates a residual signal from the processed sub-error signals and uses the re-created residual signal to correct predicted values at the decoder.
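The decomposition and per-sub-error quantization described above can be illustrated with a toy example. This sketch (the mean/detail split and the scalar quantizer are assumed placeholders, not the patented transforms) shows a residual split into two sub-errors that are quantized with different step sizes and then recombined at the decoder:

```python
import numpy as np

def split_residual(residual):
    """Decompose a block residual into a smooth (mean) sub-error and a
    detail sub-error; the two sub-errors sum back to the original residual."""
    smooth = np.full_like(residual, residual.mean())
    detail = residual - smooth
    return smooth, detail

def quantize(x, step):
    """Uniform scalar quantizer used here as a stand-in process."""
    return np.round(x / step) * step

def encode_decode(residual, step_smooth, step_detail):
    """Apply a different quantization process to each sub-error, then
    recreate the residual from the processed sub-errors (decoder side)."""
    smooth, detail = split_residual(residual.astype(float))
    q_smooth = quantize(smooth, step_smooth)
    q_detail = quantize(detail, step_detail)
    return q_smooth + q_detail
```

Coarser quantization of one sub-error (e.g., a larger `step_detail`) trades bits for distortion independently of the other, which is the flexibility the abstract describes.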
METHOD AND APPARATUS FOR ENCODING/DECODING THE GEOMETRY OF A POINT CLOUD REPRESENTING A 3D OBJECT
At least one embodiment provides a method comprising encoding or decoding a coding model information representative of an encoding of points of a point cloud, said encoding being defined from at least one point belonging to a bounding box encompassing said points of the point cloud.
Receptive-field-conforming convolution models for video coding
Convolutional neural networks (CNN) that determine a mode decision (e.g., block partitioning) for encoding a block include feature extraction layers and multiple classifiers. A non-overlapping convolution operation is performed at a feature extraction layer by setting the stride value equal to the kernel size. The block has an N×N size, and the smallest partition output for the block has an S×S size. The classification layers of each classifier receive feature maps having a feature dimension; the initial classification layer receives the feature maps output by the final feature extraction layer. Each classifier infers partition decisions for sub-blocks of size (αS)×(αS) of the block, where α is a power of 2 and α = 2, …, N/S, by applying, at some successive classification layers, a 1×1 kernel to reduce the respective feature dimensions, and outputting, by the last classification layer, an output corresponding to an N/(αS)×N/(αS)×1 output map.
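The key structural idea, a convolution whose stride equals its kernel size so that no input pixels are shared between output positions, can be sketched directly (a minimal single-channel version, not the patented network):

```python
import numpy as np

def nonoverlap_conv(x, kernel):
    """Non-overlapping convolution: stride equals kernel size, so an
    N x N input yields an (N/k) x (N/k) output and each input pixel
    contributes to exactly one output position."""
    k = kernel.shape[0]
    n = x.shape[0]
    out = np.empty((n // k, n // k))
    for i in range(n // k):
        for j in range(n // k):
            out[i, j] = np.sum(x[i * k:(i + 1) * k, j * k:(j + 1) * k] * kernel)
    return out
```

This matches the receptive-field-conforming property in the title: an output cell corresponding to a sub-block depends only on the pixels of that sub-block.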
Method based on global rate-distortion optimization for rate control in video coding
A method for video coding with global rate-distortion optimization-based rate control (RC) is provided. Using this video coding method, an RC scheme for high dynamic range (HDR) content in High Efficiency Video Coding (HEVC) is obtained. Considering the characteristics of HDR image content, a rate-distortion (R-D) model based on the HDR Visual Difference Predictor 2 (HDR-VDP-2) metric is provided for performance optimization. In the optimization process, λ is used directly, rather than the bit rate, to obtain the globally optimal solution. Finally, a model parameter estimation method is used to reduce errors. The video coding method of the present invention is verified to achieve a bit rate reduction on average.
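Working directly in the λ domain typically relies on a hyperbolic R-λ model. The sketch below uses the widely cited HEVC-style model λ = α·bpp^β and the common QP = 4.2005·ln λ + 13.7122 mapping; the α and β values are illustrative defaults, and this is a generic λ-domain sketch rather than the patent's HDR-VDP-2-based model:

```python
import math

def bpp_for_lambda(lmbda, alpha=3.2, beta=-1.367):
    """Invert the hyperbolic R-lambda model lambda = alpha * bpp**beta
    to get the bits-per-pixel implied by a chosen lambda."""
    return (lmbda / alpha) ** (1.0 / beta)

def qp_from_lambda(lmbda):
    """Common HEVC-style lambda-to-QP mapping."""
    return 4.2005 * math.log(lmbda) + 13.7122
```

Choosing λ first and deriving bpp and QP from it (rather than allocating bits first) is what lets the optimization work on the global R-D trade-off directly.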
METHOD AND APPARATUS FOR APPLYING DEEP LEARNING TECHNIQUES IN VIDEO CODING, RESTORATION AND VIDEO QUALITY ANALYSIS (VQA)
Video quality analysis (VQA) may be used in many multimedia transmission and communication applications, such as encoder optimization, stream selection, and/or video reconstruction. An objective VQA metric that accurately reflects the quality of a processed video relative to the unprocessed source video may take into account both spatial measures and temporal, motion-based measures when evaluating the processed video. Temporal measures may include differential motion metrics indicating the difference between the frame difference of a plurality of frames of the processed video and that of the corresponding plurality of frames of the source video. In addition, neural networks and deep learning techniques can be used to develop further improved VQA metrics that take into account both spatial and temporal aspects of the processed and unprocessed videos.
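The differential motion idea, comparing the processed video's frame-to-frame changes against the source's, can be sketched as a difference-of-differences score (the mean-absolute pooling here is an assumed choice, not necessarily the metric in the application):

```python
import numpy as np

def differential_motion(src_frames, proc_frames):
    """Temporal VQA sketch: for each consecutive frame pair, compute the
    frame difference in the source and in the processed video, then score
    how much the processed motion deviates from the source motion."""
    scores = []
    for t in range(1, len(src_frames)):
        d_src = src_frames[t].astype(float) - src_frames[t - 1]
        d_proc = proc_frames[t].astype(float) - proc_frames[t - 1]
        scores.append(np.abs(d_proc - d_src).mean())
    return float(np.mean(scores))
```

A score of 0 means the processed video reproduces the source's motion exactly, even if individual frames differ by a constant offset, which is what distinguishes this temporal measure from a purely spatial one.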
Method and apparatus for video coding
A method of processing point cloud data at a decoder can include receiving three-dimensional (3D) coordinates of a set of points of a point cloud including first points and a current point. Each of the first points can be associated with a reconstructed attribute value. A group of neighboring points of the current point can be determined from the first points. A first index is received, indicating a reconstructed attribute value selected from the reconstructed attribute values of the neighboring points. The reconstructed attribute value indicated by the first index is determined based on a rate-distortion decision function. An attribute distance for each of the neighboring points can be determined based on the reconstructed attribute values of the neighboring points.
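The encoder-side selection implied by the rate-distortion decision function can be illustrated with a toy cost model (the squared-error distortion and the index-proportional rate term below are assumptions for illustration, not the decision function used in the patent):

```python
def select_predictor(neighbor_attrs, actual, lmbda=0.5):
    """RD-style selection sketch: pick the index of the neighbor attribute
    that minimizes distortion + lambda * rate, where rate is a toy model
    in which later indices cost more bits to signal."""
    best_idx, best_cost = 0, float("inf")
    for i, attr in enumerate(neighbor_attrs):
        distortion = (attr - actual) ** 2
        rate = i + 1
        cost = distortion + lmbda * rate
        if cost < best_cost:
            best_idx, best_cost = i, cost
    return best_idx
```

The decoder never re-runs this search; it simply receives the winning index and looks up the corresponding reconstructed attribute value, which keeps the decoder simple.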
Guiding Decoder-Side Optimization of Neural Network Filter
Optimization of a neural network, for example in a video codec at the decoder side, may be guided to limit overfitting. The encoder may encode video(s) with different qualities for different frames in the video. Low-quality frames may be used as both input and ground truth during optimization. High-quality frames may be used to optimize the neural network so that higher-quality versions of lower-quality inputs may be predicted. The neural network may be trained to make such predictions by making a prediction based on a constructed low-quality input for which the corresponding high-quality version is known, comparing the prediction to the high-quality version, and fine-tuning the neural network to improve its ability to predict a high-quality version of a low-quality input. To limit overfitting, the neural network may be trained, concurrently or in an alternating fashion, with low-quality input for which a higher-quality version is known.
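The alternating training regime can be sketched with a deliberately tiny model: a single scalar "filter" weight fine-tuned by gradient descent, alternating a supervised pass (predict the high-quality frame from the low-quality one) with a self-supervised pass (the low-quality frame as its own target). The scalar model and the fixed alternation schedule are illustrative assumptions, not the patented training procedure:

```python
import numpy as np

def finetune_filter(w, lq, hq, lr=0.01, steps=200):
    """Overfitting-limited fine-tuning sketch: learn w so that w * lq
    approximates hq, while alternating passes where lq is its own target
    pull w back toward the identity and limit overfitting to hq."""
    for step in range(steps):
        if step % 2 == 0:
            x, y = lq, hq  # supervised pass: predict HQ from LQ
        else:
            x, y = lq, lq  # regularizing pass: LQ as its own ground truth
        grad = 2 * np.mean((w * x - y) * x)  # d/dw of mean squared error
        w -= lr * grad
    return w
```

Because the two passes pull toward different optima, the converged weight lands between them, which is the overfitting-limiting effect the abstract describes.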