Patent classifications
H04N19/147
Method and apparatus for encoding/decoding video signal by using edge-adaptive graph-based transform
The present invention provides a method for encoding a video signal based on an Edge Adaptive Graph-Based Transform (EA-GBT), the method including: detecting a step edge or a ramp edge from a residual signal; generating a graph signal based on at least one of the step edge or the ramp edge; obtaining an EA-GBT coefficient by performing the EA-GBT for the graph signal; quantizing the EA-GBT coefficient; and entropy-encoding the quantized EA-GBT coefficient.
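The first two steps of this abstract (edge detection and graph construction) can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the difference threshold, and the weak link weight of 0.1 are all assumptions, and only step edges on a 1-D residual row are handled. A graph-based transform basis is commonly taken from the eigenvectors of such a Laplacian; that eigendecomposition is omitted here.

```python
def detect_step_edges(residual, threshold):
    """Return each index i where |residual[i+1] - residual[i]| exceeds threshold."""
    return [i for i in range(len(residual) - 1)
            if abs(residual[i + 1] - residual[i]) > threshold]


def line_graph_laplacian(n, edge_positions, weak_weight=0.1):
    """Laplacian of an n-node line graph.

    The link between nodes i and i+1 gets weak_weight (an assumed value)
    when a step edge was detected at i, and weight 1.0 otherwise.
    """
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w = weak_weight if i in edge_positions else 1.0
        lap[i][i] += w
        lap[i + 1][i + 1] += w
        lap[i][i + 1] -= w
        lap[i + 1][i] -= w
    return lap


# Hypothetical residual row with one sharp jump between samples 2 and 3.
residual = [2, 3, 2, 40, 41, 39, 40, 42]
edges = detect_step_edges(residual, threshold=20)
laplacian = line_graph_laplacian(len(residual), set(edges))
```

Weakening the link across the detected edge keeps the transform from smoothing over the discontinuity, which is the usual motivation for adapting the graph to the edge.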
Methods for generating video- and audience-specific encoding ladders with audio and video just-in-time transcoding
A method including: populating an encoding ladder with a subset of bitrate-resolution pairs, from a set of bitrate-resolution pairs, based on a distribution of audience bandwidths; receiving a first request for a first playback segment, at a first bitrate-resolution pair in the encoding ladder, in the video from a first device; in response to determining an absence of video segments, at the first bitrate-resolution pair and corresponding to the segment, in a first rendition cache: identifying a first set of mezzanine segments, in the video, corresponding to the first playback segment; assigning the first set of mezzanine segments to a set of workers for transcoding into a first set of video segments according to the first bitrate-resolution pair; storing the first set of video segments in the first rendition cache; and based on the first request, releasing the first set of video segments to the first device.
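The flow described above can be sketched roughly as below. Everything named here is hypothetical: `populate_ladder` uses a simple "count the viewers each rung would serve" heuristic that the abstract does not specify, `RenditionCache` is an in-memory dictionary, the worker set is a thread pool, and `fake_transcode` stands in for a real transcoder.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def populate_ladder(candidates, audience_kbps, rungs):
    """Keep the `rungs` (bitrate_kbps, resolution) pairs that are the best fit
    (highest bitrate not exceeding the bandwidth) for the most viewers."""
    ordered = sorted(candidates)
    counts = Counter()
    for bandwidth in audience_kbps:
        fitting = [pair for pair in ordered if pair[0] <= bandwidth]
        if fitting:
            counts[fitting[-1]] += 1
    return sorted(pair for pair, _ in counts.most_common(rungs))


class RenditionCache:
    """Transcodes mezzanine segments only on a cache miss (just in time)."""

    def __init__(self, transcode, workers=2):
        self._transcode = transcode
        self._workers = workers
        self._store = {}
        self.transcode_calls = 0

    def get(self, pair, mezzanine_segments):
        key = (pair, tuple(mezzanine_segments))
        if key not in self._store:
            self.transcode_calls += 1
            # Fan the mezzanine segments out to a pool of workers.
            with ThreadPoolExecutor(max_workers=self._workers) as pool:
                self._store[key] = list(pool.map(
                    lambda seg: self._transcode(seg, pair), mezzanine_segments))
        return self._store[key]


def fake_transcode(segment, pair):
    """Placeholder for a real transcoder; returns a label only."""
    return f"{segment}@{pair[0]}k-{pair[1]}"


candidates = [(400, "360p"), (1200, "540p"), (2500, "720p"), (5000, "1080p")]
audience = [500, 600, 1300, 1500, 1400, 3000, 2600, 2700, 2800, 6000]
ladder = populate_ladder(candidates, audience, rungs=3)

cache = RenditionCache(fake_transcode)
first = cache.get(ladder[1], ["m0", "m1"])   # miss: transcodes, then stores
second = cache.get(ladder[1], ["m0", "m1"])  # hit: served from the cache
```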
TECHNIQUES OF MULTI-HYPOTHESIS MOTION COMPENSATION
The present disclosure describes techniques for coding and decoding video in which a plurality of coding hypotheses are developed for an input pixel block of frame content. Each coding hypothesis may include generation of prediction data for the input pixel block according to a respective prediction search. The input pixel block may be coded with reference to a prediction block formed from prediction data derived according to a plurality of hypotheses. Data of the coded pixel block may be transmitted over a channel to a decoder, along with data identifying a number of the hypotheses used during the coding. At a decoder, an inverse process may be performed, which may include generation of a counterpart prediction block from prediction data derived according to the hypothesis identified with the coded pixel block data, then decoding of the coded pixel block according to the prediction data.
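A minimal sketch of the encode/decode symmetry described above, under several assumptions not stated in the abstract: the hypotheses are combined by an equal-weight rounded average, the residual is sent losslessly (no transform or quantization), and the decoder can regenerate the same per-hypothesis predictions from its own reference data. All function names are hypothetical.

```python
def combine_hypotheses(predictions):
    """Average equally sized prediction blocks sample by sample, with rounding."""
    count = len(predictions)
    rows, cols = len(predictions[0]), len(predictions[0][0])
    return [[(sum(p[r][c] for p in predictions) + count // 2) // count
             for c in range(cols)] for r in range(rows)]


def encode_block(block, predictions):
    """Code the block as a residual against the combined prediction and
    record how many hypotheses were used."""
    pred = combine_hypotheses(predictions)
    residual = [[b - p for b, p in zip(brow, prow)]
                for brow, prow in zip(block, pred)]
    return {"num_hypotheses": len(predictions), "residual": residual}


def decode_block(coded, available_predictions):
    """Rebuild the counterpart prediction from the signaled number of
    hypotheses, then add the residual back."""
    used = available_predictions[:coded["num_hypotheses"]]
    pred = combine_hypotheses(used)
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, coded["residual"])]


block = [[10, 12], [14, 16]]
hypothesis_a = [[8, 12], [13, 18]]
hypothesis_b = [[11, 13], [15, 15]]

coded = encode_block(block, [hypothesis_a, hypothesis_b])
decoded = decode_block(coded, [hypothesis_a, hypothesis_b])
```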
Dynamic codec adaptation
Embodiments are described for dynamically adapting video encoding to maintain a nearly stable frame rate based on processor capabilities and bandwidth, for example, by varying a quantization parameter. The quality of the encoded video can be varied to maintain the nearly constant frame rate, which may be measured from the number of encoded video frames being transmitted over a network interface.
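One way to picture the adaptation loop is a small controller that nudges the quantization parameter (QP) whenever the measured frame rate drifts from the target. This is only a sketch: the function name, step size, tolerance, and the QP bounds of 10 to 51 are assumptions, not values from the abstract.

```python
def adapt_qp(qp, measured_fps, target_fps,
             tolerance=1.0, step=2, qp_min=10, qp_max=51):
    """Return the next QP given the frame rate measured at the network interface.

    A higher QP means coarser quantization (lower quality, fewer bits per
    frame); a lower QP means finer quantization.
    """
    if measured_fps < target_fps - tolerance:
        qp += step   # falling behind: trade quality for frame rate
    elif measured_fps > target_fps + tolerance:
        qp -= step   # headroom available: recover quality
    return max(qp_min, min(qp_max, qp))


# Hypothetical measurements against a 30 fps target.
qp_after_drop = adapt_qp(30, measured_fps=24, target_fps=30)
qp_when_stable = adapt_qp(30, measured_fps=30, target_fps=30)
qp_at_ceiling = adapt_qp(50, measured_fps=10, target_fps=30)
```

The tolerance band keeps the controller from changing quality on every small fluctuation in the measured rate.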
Fast multi-rate encoding for adaptive HTTP streaming
According to embodiments of the disclosure, information of higher and lower quality encoded video segments is used to limit Rate-Distortion Optimization (RDO) for each Coding Tree Unit (CTU). A method first encodes the highest bit-rate segment and subsequently uses it to encode the lowest bit-rate video segment. Block structure and selected reference frame of both highest and lowest bit-rate video segments are used to predict and shorten the RDO process for each CTU in middle bit-rates. The method delays just one frame using parallel processing. This approach provides time-complexity reduction compared to the reference software for middle bit-rates while degradation is negligible.
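As a rough illustration of how two already-encoded renditions could shorten the search for a middle bit-rate, the sketch below restricts the candidate partition depths for a block to the interval spanned by the depths chosen in the highest and lowest bit-rate encodes. This bounding rule, the function names, and the cost table are assumptions made for illustration; the abstract does not spell out the exact rule, and reference-frame reuse is not shown.

```python
def depth_search_range(depth_in_highest, depth_in_lowest):
    """Candidate depths for a middle bit-rate: only those between the depths
    chosen by the highest and lowest bit-rate encodes (inclusive)."""
    low, high = sorted((depth_in_highest, depth_in_lowest))
    return range(low, high + 1)


def choose_depth(rd_cost_by_depth, depth_in_highest, depth_in_lowest):
    """Pick the cheapest depth among the restricted candidates only."""
    candidates = depth_search_range(depth_in_highest, depth_in_lowest)
    return min(candidates, key=lambda depth: rd_cost_by_depth[depth])


# Hypothetical rate-distortion costs for depths 0..3 of one block.
rd_costs = {0: 9.0, 1: 5.0, 2: 4.0, 3: 3.5}

searched = list(depth_search_range(depth_in_highest=2, depth_in_lowest=1))
chosen = choose_depth(rd_costs, depth_in_highest=2, depth_in_lowest=1)
```

Here only two of the four depths are evaluated. The restricted search can miss the global minimum (depth 3 in this table), which is the kind of small quality loss the abstract describes as negligible.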
Method for optimizing structure similarity index in video coding
The present disclosure provides a computer-implemented method for encoding video. The method includes: generating training data based on one or more video sequences, the training data including a structure similarity index comprising at least one of structure similarity index (SSIM) or multi-scale-structural similarity index (MS-SSIM); training a rate-distortion optimization (RDO) model using the training data; and processing the one or more video sequences using the rate-distortion optimization model.
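For orientation, the sketch below computes a single-window SSIM between two sample lists using the commonly cited constants (0.01 and 0.03 times the dynamic range, squared), and plugs it into a hand-written rate-distortion cost. It is a simplification under stated assumptions: real SSIM is usually averaged over local windows, MS-SSIM is not shown, and `rd_cost` with its `lam` multiplier is a hypothetical stand-in for the trained RDO model the abstract refers to.

```python
def _mean(values):
    return sum(values) / len(values)


def _covariance(a, b):
    """Population covariance; _covariance(a, a) is the variance of a."""
    mean_a, mean_b = _mean(a), _mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def ssim(x, y, dynamic_range=255):
    """Single-window SSIM over two equally long sample lists."""
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mean_x, mean_y = _mean(x), _mean(y)
    numerator = (2 * mean_x * mean_y + c1) * (2 * _covariance(x, y) + c2)
    denominator = ((mean_x * mean_x + mean_y * mean_y + c1)
                   * (_covariance(x, x) + _covariance(y, y) + c2))
    return numerator / denominator


def rd_cost(original, reconstructed, bits, lam):
    """Hypothetical cost: distortion taken as (1 - SSIM) plus lam times rate."""
    return (1.0 - ssim(original, reconstructed)) + lam * bits


original = [52, 55, 61, 66, 70, 61, 64, 73]
distorted = [50, 56, 60, 68, 69, 63, 62, 75]

score_identical = ssim(original, original)
score_distorted = ssim(original, distorted)
```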
VIDEO CODING USING MAPPED TRANSFORMS AND SCANNING MODES
A video encoder may transform residual data by using a transform selected from a group of transforms. The transform is applied to the residual data to create a two-dimensional array of transform coefficients. A scanning mode is selected to scan the transform coefficients in the two-dimensional array into a one-dimensional array of transform coefficients. The combination of transform and scanning mode may be selected from a subset of combinations that is based on an intra-prediction mode. The scanning mode may also be selected based on the transform used to create the two-dimensional array. The transforms and/or scanning modes used may be signaled to a video decoder.
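The scanning step can be illustrated with three common scan orders that turn a two-dimensional coefficient array into a one-dimensional one. The table mapping an intra-prediction mode to a scan is purely hypothetical (the mode names and pairings are assumptions for illustration), and the transform selection itself is not shown.

```python
def scan(block, mode):
    """Serialize a 2-D coefficient block into a 1-D list using the named scan."""
    rows, cols = len(block), len(block[0])
    if mode == "horizontal":   # row by row
        return [value for row in block for value in row]
    if mode == "vertical":     # column by column
        return [block[r][c] for c in range(cols) for r in range(rows)]
    if mode == "zigzag":       # anti-diagonals, alternating direction
        order = sorted(
            ((r, c) for r in range(rows) for c in range(cols)),
            key=lambda rc: (rc[0] + rc[1],
                            rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))
        return [block[r][c] for r, c in order]
    raise ValueError(f"unknown scan mode: {mode}")


# Hypothetical mapping from intra-prediction mode to scanning mode.
SCAN_FOR_INTRA_MODE = {
    "vertical_prediction": "horizontal",
    "horizontal_prediction": "vertical",
    "dc_prediction": "zigzag",
}

coefficients = [[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]]

zigzag_order = scan(coefficients, SCAN_FOR_INTRA_MODE["dc_prediction"])
vertical_order = scan(coefficients, SCAN_FOR_INTRA_MODE["horizontal_prediction"])
```

Because both sides can derive the scan from the intra-prediction mode, a decoder applying the same lookup would need explicit signaling only when the encoder picks a combination outside the mapped subset.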