Patent classifications
H04N19/154
Method and device for encoding intra prediction mode for image prediction unit, and method and device for decoding intra prediction mode for image prediction unit
Methods and apparatuses for encoding and decoding an intra prediction mode of a prediction unit of a chrominance component based on an intra prediction mode of a prediction unit of a luminance component are provided. When the intra prediction mode of the prediction unit of the luminance component is the same as an intra prediction mode in the intra prediction mode candidate group of the prediction unit of the chrominance component, the candidate group is reconstructed by excluding or replacing the intra prediction mode that matches the intra prediction mode of the prediction unit of the luminance component. The intra prediction mode of the prediction unit of the chrominance component is then encoded by using the reconstructed intra prediction mode candidate group.
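The candidate-group reconstruction described in this abstract can be illustrated with a minimal Python sketch. The mode names, the function names, and the choice to signal the chrominance mode as an index into the rebuilt list are all assumptions made for illustration; the abstract does not specify them.

```python
def rebuild_chroma_candidates(candidates, luma_mode, replacement=None):
    """Return a chroma candidate list that no longer contains luma_mode.

    If a replacement mode is given (hypothetical) and is not already a
    candidate, it takes the place of the duplicate; otherwise the
    duplicate is simply excluded.
    """
    rebuilt = []
    for mode in candidates:
        if mode == luma_mode:
            if replacement is not None and replacement not in candidates:
                rebuilt.append(replacement)
            continue  # the duplicate itself is never kept
        rebuilt.append(mode)
    return rebuilt


def encode_chroma_mode(chroma_mode, candidates, luma_mode, replacement=None):
    """Signal the chroma mode as its index in the rebuilt candidate list."""
    rebuilt = rebuild_chroma_candidates(candidates, luma_mode, replacement)
    return rebuilt.index(chroma_mode)


# Hypothetical mode names, used only for this example.
candidates = ["PLANAR", "VERTICAL", "HORIZONTAL", "DC"]
print(rebuild_chroma_candidates(candidates, "VERTICAL"))
print(rebuild_chroma_candidates(candidates, "VERTICAL", replacement="DIAGONAL"))
```

Excluding the duplicate shortens the list, so fewer index values need to be signalled; replacing it keeps the list length fixed while still avoiding a redundant entry.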
Deep learning based on image encoding and decoding
A deep learning based compression (DLBC) system trains multiple models that, when deployed, generate a compressed binary encoding of an input image that achieves a reconstruction quality and a target compression ratio. The applied models identify structures of an input image, quantize the input image to a target bit precision, and compress the binary code of the input image via adaptive arithmetic coding to a target codelength. During training, the DLBC system reconstructs the input image from the compressed binary encoding and determines the loss in quality from the encoding process. Thus, the models can be continually trained to, when applied to an input image, minimize the loss in reconstruction quality that arises due to the encoding process while also achieving the target compression ratio.
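Two of the steps named in this abstract, quantization to a target bit precision and adaptive arithmetic coding to a codelength, can be illustrated with a small Python sketch. It is not the DLBC system's implementation: the uniform quantizer, the count-based adaptive model, and both function names are assumptions, and the second function only estimates the ideal codelength an adaptive arithmetic coder would approach rather than producing a bitstream.

```python
import math


def quantize(values, bits):
    """Uniformly quantize values in [0, 1] to integers of the given bit precision."""
    levels = (1 << bits) - 1
    return [min(levels, max(0, round(v * levels))) for v in values]


def adaptive_code_length(symbols, alphabet_size):
    """Estimate the codelength in bits under a simple adaptive frequency model.

    Each symbol costs -log2(p), where p comes from counts that start at 1
    and are updated after every symbol (an assumed, illustrative model).
    """
    counts = [1] * alphabet_size
    total = alphabet_size
    bits = 0.0
    for s in symbols:
        bits += -math.log2(counts[s] / total)
        counts[s] += 1
        total += 1
    return bits


symbols = quantize([0.1, 0.12, 0.11, 0.9, 0.1, 0.13], bits=2)
print(symbols, adaptive_code_length(symbols, alphabet_size=4))
```

Lowering the bit precision shrinks the alphabet, and a more skewed symbol distribution lowers the estimated codelength, which is the trade-off a system of this kind tunes against reconstruction quality.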
OPTIMAL FORMAT SELECTION FOR VIDEO PLAYERS BASED ON PREDICTED VISUAL QUALITY USING MACHINE LEARNING
A system and methods are disclosed for optimal format selection for video players based on visual quality. The method includes generating a plurality of reference transcoded versions of a reference video, obtaining quality scores for frames of the plurality of reference transcoded versions of the reference video, generating a first training input comprising a set of color attributes, spatial attributes, and temporal attributes of the frames of the reference video, and generating a first target output for the first training input, wherein the first target output comprises the quality scores for the frames of the plurality of reference transcoded versions of the reference video. The method further includes providing the training data to train a machine learning model on (i) a set of training inputs comprising the first training input and (ii) a set of target outputs comprising the first target output.
DMVR USING DECIMATED PREDICTION BLOCK
The present disclosure provides an inter prediction method, comprising the steps of obtaining an initial motion vector and a reference picture for bi-prediction; obtaining sets of candidate sample positions in the reference picture according to the initial motion vector and candidate motion vectors, wherein each candidate motion vector is derived from the initial motion vector and a respective motion vector offset, and wherein each set of candidate sample positions corresponds to a respective candidate motion vector; obtaining a respective set of sample positions from each set of candidate sample positions; computing a matching cost for each candidate motion vector using the respective set of sample positions; obtaining a refined motion vector based on the computed matching cost of each candidate motion vector; and obtaining prediction values for a current block based on the refined motion vector.
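The refinement loop in this abstract can be sketched in Python as follows. Several details are assumptions rather than statements from the abstract: the cost is a sum of absolute differences (SAD) between blocks from two reference pictures displaced by mirrored motion vectors, the decimation keeps every second row, motion vectors are integer-valued, and all function and parameter names are hypothetical.

```python
def fetch_rows(picture, x, y, width, height, row_step):
    """Fetch a block, keeping only every row_step-th row (the decimated positions)."""
    return [picture[y + r][x:x + width] for r in range(0, height, row_step)]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def refine_mv(ref0, ref1, pos, size, init_mv, offsets, row_step=2):
    """Pick the candidate motion vector with the lowest decimated matching cost.

    Assumes mirrored displacement: +mv into ref0 and -mv into ref1.
    The caller must keep all displaced blocks inside the pictures.
    """
    x, y = pos
    w, h = size
    best_mv, best_cost = None, None
    for dx, dy in offsets:
        mvx, mvy = init_mv[0] + dx, init_mv[1] + dy  # candidate = initial + offset
        block0 = fetch_rows(ref0, x + mvx, y + mvy, w, h, row_step)
        block1 = fetch_rows(ref1, x - mvx, y - mvy, w, h, row_step)
        cost = sad(block0, block1)
        if best_cost is None or cost < best_cost:
            best_mv, best_cost = (mvx, mvy), cost
    return best_mv, best_cost


# Synthetic pictures built so that the candidate (1, 0) matches exactly.
ref0 = [[c * c + 3 * r for c in range(12)] for r in range(12)]
ref1 = [[(c + 2) ** 2 + 3 * r for c in range(12)] for r in range(12)]
offsets = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
print(refine_mv(ref0, ref1, (4, 4), (4, 4), (0, 0), offsets))
```

With `row_step=2` each cost evaluation touches half as many samples as a full-block SAD, which is the motivation for computing the cost on a decimated set of positions.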
APPARATUS, METHOD, AND COMPUTER READABLE MEDIUM
Provided is an apparatus including: an image acquisition unit configured to acquire a captured image; a compression unit configured to compress the captured image to generate a compressed image; an evaluation acquisition unit configured to acquire, from a user, an evaluation according to visibility of the compressed image; and a learning processing unit configured to perform learning processing of a model that outputs, in response to input of a new captured image, a compression parameter value to be applied in compression of the captured image, by using learning data including the evaluation, a captured image corresponding to the compressed image targeted for the evaluation, and a compression parameter value applied in generation of the compressed image.
CONTENT-ADAPTIVE ONLINE TRAINING WITH IMAGE SUBSTITUTION IN NEURAL IMAGE COMPRESSION
Aspects of the disclosure provide a method and an apparatus for video encoding. The apparatus includes processing circuitry configured to perform an iterative update of sample values of a plurality of samples in an initial input image. The iterative update includes generating a coded representation of a final input image based on the final input image by an encoding neural network (NN) and at least one training module. The final input image has been updated from the initial input image by a number of iterations of the iterative update. The iterative update includes generating a reconstructed image of the final input image based on the coded representation of the final input image by a decoding NN. One of a rate-distortion loss for the final input image or the number of iterations of the iterative update satisfies a pre-determined condition. An encoded image corresponding to the final input image is generated.
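The iterative update of sample values described in this abstract can be illustrated with a toy Python sketch. Everything concrete here is an assumption: `encode` and `decode` are simple linear stand-ins for the encoding and decoding neural networks, the rate term is a squared-magnitude proxy, the gradient is taken by finite differences, and the hyperparameter values are arbitrary. Only the overall shape follows the abstract: the input samples are updated until the rate-distortion loss or the iteration count satisfies a pre-determined condition.

```python
def encode(image):
    """Hypothetical stand-in for the encoding NN."""
    return [0.5 * v for v in image]


def decode(code):
    """Hypothetical stand-in for the decoding NN (deliberately lossy)."""
    return [1.8 * c for c in code]


def rd_loss(image, original, lam):
    """Rate proxy plus lam times the distortion against the original image."""
    code = encode(image)
    recon = decode(code)
    rate = sum(c * c for c in code)
    distortion = sum((r - o) ** 2 for r, o in zip(recon, original))
    return rate + lam * distortion


def iterative_update(original, lam=4.0, step=0.05, max_iters=50,
                     target_loss=0.0, eps=1e-4):
    """Update the input samples by gradient descent on the R-D loss.

    Stops when the loss reaches target_loss or after max_iters iterations.
    """
    image = list(original)
    iters = 0
    while iters < max_iters and rd_loss(image, original, lam) > target_loss:
        grad = []
        for i in range(len(image)):
            up = image[:i] + [image[i] + eps] + image[i + 1:]
            down = image[:i] + [image[i] - eps] + image[i + 1:]
            grad.append((rd_loss(up, original, lam)
                         - rd_loss(down, original, lam)) / (2 * eps))
        image = [v - step * g for v, g in zip(image, grad)]
        iters += 1
    return image, iters


original = [1.0, 2.0, 3.0]
final_image, iters = iterative_update(original)
print(iters, rd_loss(original, original, 4.0), rd_loss(final_image, original, 4.0))
```

The encoded output would then be produced from `final_image` rather than from the original samples, which is the substitution the title refers to.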
APPARATUS, MONITORING SYSTEM, METHOD, AND COMPUTER-READABLE MEDIUM
Provided is an apparatus comprising: an image acquisition unit configured to acquire a captured image; a compression unit configured to compress the captured image to generate compressed data; a reproduction unit configured to generate, from the compressed data, a reproduced image that reproduces the captured image; an evaluation acquisition unit configured to acquire an evaluation corresponding to a degree of approximation between the reproduced image and the captured image; and a learning processing unit configured to perform learning processing of a model configured to output, in response to an input of a new captured image, a compression parameter value to be applied in compression of the captured image, by using learning data including the evaluation, a captured image corresponding to the evaluation, and a compression parameter value applied in compression of the captured image.