Patent classifications
H04N19/154
DEEP PALETTE PREDICTION
Example embodiments allow for training of encoders (e.g., artificial neural networks (ANNs)) to generate a color palette based on an input image. The color palette can then be used to generate, from the input image, a quantized, reduced color depth image that corresponds to the input image. Differences between a plurality of such input images and corresponding quantized images are used to train the encoder. Encoders trained in this manner are especially suited for generating color palettes used to convert images into different reduced color depth image file formats. Such an encoder also offers benefits in memory use and computational cost relative to the median-cut algorithm or other methods for producing reduced color depth color palettes for images.
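The training signal described above (the difference between an input image and its palette-quantized version) can be sketched in a few lines. This is a minimal NumPy illustration, not the patented system: the palette here is a random stand-in for an ANN's predicted colors, and the nearest-color rule and MSE loss are assumptions about the quantization and difference measure.

```python
import numpy as np

def quantize_to_palette(image, palette):
    """Map each pixel to its nearest palette color (Euclidean in RGB).

    image:   (H, W, 3) float array
    palette: (K, 3) float array, e.g. an encoder's predicted colors
    """
    pixels = image.reshape(-1, 3)                                  # (H*W, 3)
    dists = np.linalg.norm(pixels[:, None, :] - palette[None, :, :], axis=-1)
    nearest = np.argmin(dists, axis=1)                             # index per pixel
    return palette[nearest].reshape(image.shape)

def reconstruction_loss(image, palette):
    """Mean squared difference between the image and its quantized
    version: the kind of signal used to train the palette encoder."""
    return float(np.mean((image - quantize_to_palette(image, palette)) ** 2))

rng = np.random.default_rng(0)
img = rng.random((8, 8, 3))
pal = rng.random((4, 3))     # stand-in for an ANN-predicted 4-color palette
loss = reconstruction_loss(img, pal)
```

Minimizing this loss over many images pushes the encoder toward palettes whose quantized output stays close to the source.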
HIGH DYNAMIC RANGE HDR VIDEO PROCESSING METHOD, ENCODING DEVICE, AND DECODING DEVICE
This application provides a high dynamic range HDR video processing method, an encoding device, and a decoding device. The method includes: obtaining dynamic metadata of an Nth HDR video frame according to a dynamic metadata generation algorithm; calculating a tone-mapping curve parameter of the Nth HDR video frame based on the dynamic metadata of the Nth HDR video frame; generating a tone-mapping curve based on the curve parameter; determining, according to a quality assessment algorithm, the distortion D′ caused by the tone-mapping curve; comparing D′ with a threshold D_T to determine a mode used by the Nth HDR video frame, where the mode is either an automatic mode or a director mode; and determining metadata of the Nth HDR video frame based on the determined mode used by the Nth HDR video frame.
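The mode decision described above can be sketched compactly. This is an illustrative stand-in, not the claimed method: the power-law tone curve and the plain MSE distortion are assumptions in place of the abstract's unspecified curve parameterization and quality assessment algorithm.

```python
import numpy as np

def tone_curve(x, gamma):
    """Toy tone-mapping curve: a simple power law (an assumption)."""
    return np.power(x, gamma)

def select_mode(frame_luma, gamma, d_threshold):
    """Apply the tone curve, measure distortion D' (here plain MSE, a
    stand-in for the quality assessment algorithm), then compare D'
    against the threshold D_T to pick the per-frame mode."""
    mapped = tone_curve(frame_luma, gamma)
    d_prime = float(np.mean((mapped - frame_luma) ** 2))
    mode = "automatic" if d_prime <= d_threshold else "director"
    return mode, d_prime
```

Frames whose automatically generated curve keeps distortion within D_T stay in automatic mode; frames that exceed it fall back to director (manually tuned) metadata.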
HARDWARE PIPELINES FOR RATE-DISTORTION OPTIMIZATION (RDO) THAT SUPPORT MULTIPLE CODECS
A disclosed system may include a hardware distortion data pipeline that may include (1) a quantization module that generates a quantized data set, (2) an inverse quantization module that generates, from the quantized data set, an inverse quantized data set by executing an inverse quantization of the quantized data set, and (3) an inverse transformation module that generates an inversely transformed data set by executing an inverse transformation of the inverse quantized data set. The system may also include a hardware determination pipeline that determines a distortion metric based on the inversely transformed data set and the residual frame data set, and a hardware token rate pipeline that determines, based on the quantized data set, a token rate for an encoding of the residual frame data set via a video encoding pipeline. Various other methods, systems, and computer-readable media are also disclosed.
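The stages of the distortion pipeline map naturally onto a small software model. The following sketch is illustrative only: it skips the (inverse) transform stage, and the token-rate proxy (counting nonzero quantized coefficients) and the Lagrangian cost are assumptions standing in for the hardware token rate pipeline.

```python
import numpy as np

def rdo_cost(residual, qstep, lam=1.0):
    """Toy software model of the abstract's pipeline stages:
    quantize, inverse-quantize, then measure distortion and a token
    rate, combined into a rate-distortion cost."""
    q = np.round(residual / qstep)       # quantization module
    recon = q * qstep                    # inverse quantization module
    distortion = float(np.sum((residual - recon) ** 2))
    rate = int(np.count_nonzero(q))      # crude token-rate proxy
    return distortion + lam * rate, distortion, rate
```

An encoder evaluating candidate modes or quantization steps would pick the candidate minimizing this combined cost, which is what dedicated RDO hardware accelerates across multiple codecs.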
Constraint-modified selection of video encoding configurations
A video to be encoded to a plurality of different target encodings for bandwidth adaptive serving is received. The video is encoded into a plurality of different candidate encodings using different candidate encoding parameters. A quality metric is determined for each of the plurality of different candidate encodings. One or more different target quality metrics are selected for a first portion of the different target encodings based at least in part on one or more specified constraints for one or more target devices. One or more different target quality metrics are selected for a second portion of the different target encodings based at least in part on the determined quality metrics of the different candidate encodings. Based at least in part on the selected different target quality metrics for the first portion and the second portion, the plurality of different target encodings of the video is generated.
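The two-part target selection can be sketched as follows. This is a loose illustration of the idea, not the claimed procedure: the per-device quality cap and the even sampling of candidate qualities are invented stand-ins for the abstract's "specified constraints" and quality-driven selection.

```python
def select_target_qualities(candidate_qualities, device_caps, n_targets):
    """Pick target quality levels for an encoding ladder.

    First portion: one target per device constraint, clamped to a
    per-device quality cap (an illustrative stand-in for the
    abstract's specified constraints for target devices).
    Second portion: targets spread across the measured qualities of
    the candidate encodings.
    """
    qs = sorted(candidate_qualities)
    constrained = [min(cap, qs[-1]) for cap in device_caps]
    remaining = n_targets - len(constrained)
    # evenly sample the remaining targets from the candidate qualities
    step = max(1, len(qs) // max(remaining, 1))
    data_driven = qs[::step][:remaining]
    return constrained + data_driven
```

The final ladder mixes constraint-driven rungs (serving limited devices) with data-driven rungs (covering the measured quality range for bandwidth-adaptive serving).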
Measuring video quality of experience based on decoded frame rate
Techniques are described for determining quality of experience (QoE) rate information for streaming video. For example, QoE rates can be calculated by a client while receiving and decoding an encoded video stream. The QoE rates can be calculated based on the number of video stalls that occur at the client while decoding the encoded video stream during a plurality of time periods. Determining whether a video stall occurs during a given time period involves comparing an encoded frame rate to a decoded frame rate. Indications of the QoE rates can be output.
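The stall-counting comparison described above reduces to a few lines. A minimal sketch, assuming the QoE rate is simply the fraction of stall-free periods; the tolerance parameter is an added convenience, not something the abstract specifies.

```python
def qoe_rate(encoded_fps, decoded_fps_per_period, tolerance=0.0):
    """Count a stall in any period where the decoded frame rate falls
    below the encoded frame rate (less a tolerance), then report the
    fraction of stall-free periods as a simple QoE rate."""
    stalls = sum(1 for fps in decoded_fps_per_period
                 if fps < encoded_fps - tolerance)
    return 1.0 - stalls / len(decoded_fps_per_period)
```

A client decoding a 30 fps stream that dropped to 20 fps in one of four measurement periods would report a QoE rate of 0.75.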
Compressing weight updates for decoder-side neural networks
A method, apparatus, and computer program product are provided for training a neural network or providing a pre-trained neural network with the weight-updates being compressible using at least a weight-update compression loss function and/or task loss function. The weight-update compression loss function can comprise a weight-update vector defined as a latest weight vector minus an initial weight vector before training. A pre-trained neural network can be compressed by pruning one or more small-valued weights. The training of the neural network can consider the compressibility of the neural network, for instance, using a compression loss function, such as a task loss and/or a weight-update compression loss. The compressed neural network can be applied within a decoding loop of an encoder side or in a post-processing stage, as well as at a decoder side.
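The weight-update vector, the pruning of small-valued weights, and the combined loss described above can be sketched as below. The L1 penalty as the weight-update compression loss is an assumption; the abstract does not fix a particular compression measure.

```python
import numpy as np

def weight_update(latest, initial):
    """The abstract's weight-update vector: the latest weight vector
    minus the initial (pre-training) weight vector."""
    return latest - initial

def prune_small(update, threshold):
    """Zero out small-magnitude entries so the update compresses well."""
    return np.where(np.abs(update) < threshold, 0.0, update)

def total_loss(task_loss, update, lam=0.01):
    """Combined objective: task loss plus a weight-update compression
    penalty (L1 here, an illustrative choice)."""
    return task_loss + lam * float(np.sum(np.abs(update)))
```

Training against such a combined loss yields updates that are sparse and cheap to transmit to a decoder-side neural network, whether it runs in the decoding loop or as a post-processing stage.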