Patent classifications
H04N19/176
VIDEO ENCODING METHOD AND APPARATUS, VIDEO DECODING METHOD AND APPARATUS, AND DEVICE
A video encoding method includes: decoding a target coding unit to obtain a quantization coefficient matrix corresponding to the target coding unit; determining first reference information according to a quantization coefficient in the quantization coefficient matrix; obtaining a value of a target flag bit corresponding to the first reference information, the target flag bit being a flag bit of a target sub-block position, the target sub-block position being a position of a sub-block that requires processing of residual data in a coding unit; and determining the target sub-block position of the target coding unit according to the value of the target flag bit. A flag bit of a sub-block transform position or a transform skip sub-block position in a target coding unit is implicitly indicated using a quantization coefficient in a quantization coefficient matrix corresponding to a coding unit.
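The implicit signalling idea in this abstract can be illustrated with a small sketch. The abstract does not say how the first reference information is computed or how flag values map to positions; the parity rule and the position names below are assumptions chosen only for illustration.

```python
from typing import List

def reference_info(coeffs: List[List[int]]) -> int:
    """Hypothetical 'first reference information': sum of absolute coefficient levels."""
    return sum(abs(c) for row in coeffs for c in row)

def target_flag(coeffs: List[List[int]]) -> int:
    """Assumed rule: the flag bit is the parity of the reference information."""
    return reference_info(coeffs) & 1

def target_sub_block_position(coeffs: List[List[int]]) -> str:
    """Assumed mapping from flag value to sub-block position (names are hypothetical)."""
    return "right_half" if target_flag(coeffs) else "left_half"

# Quantization coefficient matrix of a 4x4 coding unit (made-up values).
even_cu = [[5, -2, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]  # |sum| = 8
odd_cu = [[5, -2, 0, 0], [2, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]   # |sum| = 9
```

Because the decoder derives the flag from coefficients it has already parsed, no explicit flag bit has to be written to the bitstream in this sketch.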
VIDEO DECODING METHOD, VIDEO ENCODING METHOD, RELATED DEVICES, AND STORAGE MEDIUM
A video decoding method, a video encoding method, related devices, and a storage medium are provided. The video decoding method includes: determining a current string to be decoded from a current coding unit of a current image; based on the current string being a unit vector string and the current string including a first pixel, determining a reference pixel of the first pixel from a historical decoding unit in the current image, the historical decoding unit being a decoded coding unit adjacent to the current coding unit in the current image, the reference pixel of the first pixel being adjacent to the first pixel in the current image; and acquiring a predicted value of the first pixel based on a reconstructed value of the reference pixel of the first pixel to obtain a decoded image.
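A minimal sketch of unit vector string prediction follows. It assumes the unit vector points to the pixel directly above, so each pixel in the string copies the reconstructed value of its upper neighbour. The abstract only requires the reference pixel to be adjacent, so the direction and the function name are assumptions.

```python
from typing import List, Tuple

def decode_unit_vector_string(recon: List[List[int]],
                              positions: List[Tuple[int, int]]) -> List[List[int]]:
    """Predict each (row, col) in the string from the pixel directly above it.

    `recon` holds reconstructed values of the current image; rows above the
    current coding unit belong to an already decoded (historical) unit.
    Positions are processed in string order, so a pixel may reference a value
    written earlier in the same string.
    """
    for r, c in positions:
        recon[r][c] = recon[r - 1][c]  # assumed unit vector: one pixel up
    return recon

# Row 0 comes from the adjacent decoded unit; rows 1-2 are the current unit.
image = [[1, 2, 3], [0, 0, 0], [0, 0, 0]]
decode_unit_vector_string(image, [(1, 0), (1, 1), (1, 2), (2, 0)])
```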
Latency Reduction For Reordering Prediction Candidates
For each prediction candidate of a set of one or more prediction candidates of the current block, a video coder computes a matching cost between a set of reference pixels of the prediction candidate in a reference picture and a set of neighboring pixels of a current block in a current picture. The video coder identifies a subset of the reference pictures as major reference pictures based on a distribution of the prediction candidates among the reference pictures of the current picture. A bounding block is defined for each major reference picture, the bounding block encompassing at least portions of multiple sets of reference pixels for multiple prediction candidates. The video coder assigns an index to each prediction candidate based on the computed matching cost of the set of prediction candidates. A selection of a prediction candidate is signaled by using the assigned index of the selected prediction candidate.
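The cost-based reordering step can be sketched as follows. The sketch assumes the matching cost is a sum of absolute differences (SAD) and that a lower cost earns a smaller index; the abstract names neither choice. It omits the major reference picture and bounding block steps.

```python
from typing import Dict, List

def sad(a: List[int], b: List[int]) -> int:
    """Sum of absolute differences between two equally sized pixel sets."""
    return sum(abs(x - y) for x, y in zip(a, b))

def assign_indices(neighbor_pixels: List[int],
                   candidates: Dict[str, List[int]]) -> Dict[str, int]:
    """Rank candidates by matching cost; the cheapest candidate gets index 0.

    `candidates` maps a hypothetical candidate name to its reference pixels
    in the reference picture.
    """
    costs = {name: sad(ref, neighbor_pixels) for name, ref in candidates.items()}
    ordered = sorted(costs, key=lambda name: costs[name])
    return {name: index for index, name in enumerate(ordered)}

neighbors = [10, 12, 14, 16]          # neighboring pixels of the current block
candidates = {
    "A": [10, 12, 15, 16],            # cost 1
    "B": [0, 0, 0, 0],                # cost 52
    "C": [10, 12, 14, 16],            # cost 0
}
indices = assign_indices(neighbors, candidates)
```

Since the decoder can compute the same costs from already reconstructed pixels, both sides arrive at the same ordering, and only the assigned index of the selected candidate needs to be signaled.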
Video encoding mode selection by a hierarchy of machine learning models
Techniques for training and using machine learning models for video encoding mode selection are described. According to some embodiments, a computer-implemented method includes receiving a live video at a content delivery service, extracting one or more features for a plurality of macroblocks of a frame of the live video, determining an encoding mode from a plurality of encoding modes for each of the plurality of macroblocks of the frame with a machine learning model based at least in part on an input of the one or more features, performing a real time encode of the frame of the live video based at least in part on the determined encoding modes to generate an encoded frame by the content delivery service, and transmitting the encoded frame from the content delivery service to a viewer device.
Architecture to adapt cumulative distribution functions for mode decision in video encoding
A mode decision component is configured to determine the costs of different modes for a selected partition of a frame of a video using estimated compression coding data, which is calculated before the corresponding actual compression coding data is calculated based on another partition immediately prior to the selected partition in a partition processing order. The estimated compression coding data is determined based on previously calculated compression coding data calculated based on a completed partition prior to the selected partition in the partition processing order. The mode decision component is configured to use the determined costs to select one of the modes. An encoder component is configured to use the selected mode to encode the selected partition by using the corresponding actual compression coding data calculated based on the other partition immediately prior to the selected partition in the partition processing order.
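The sketch below illustrates mode decision with estimated coding data. It assumes the cost is a rate-distortion cost (distortion plus lambda times rate) and that the rate is estimated from symbol probabilities carried over from an earlier completed partition. The abstract speaks only of "costs" and "compression coding data", so these details and all names are hypothetical.

```python
import math
from typing import Dict, Tuple

def symbol_bits(prob: float) -> float:
    """Ideal code length in bits for a symbol with the given probability."""
    return -math.log2(prob)

def select_mode(modes: Dict[str, Tuple[float, str]],
                estimated_probs: Dict[str, float],
                lam: float) -> str:
    """Pick the mode with the lowest assumed cost: distortion + lam * rate.

    `modes` maps a mode name to (distortion, symbol to be coded).
    `estimated_probs` stands in for the estimated compression coding data,
    taken from a completed earlier partition rather than the immediately
    preceding one, so the decision need not wait for that partition.
    """
    def cost(name: str) -> float:
        distortion, symbol = modes[name]
        return distortion + lam * symbol_bits(estimated_probs[symbol])
    return min(modes, key=cost)

modes = {"intra": (100.0, "a"), "inter": (90.0, "b")}
stale_probs = {"a": 0.5, "b": 0.125}   # 1 bit and 3 bits respectively
```

With these made-up numbers, a small lambda favours the lower-distortion mode and a large lambda favours the cheaper-to-code mode.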
Video coding method on basis of secondary transform, and device for same
A video decoding method according to the present document is characterized by comprising: a step for deriving transform coefficients through inverse quantization on the basis of quantized transform coefficients for a target block; a step for deriving modified transform coefficients on the basis of an inverse reduced secondary transform (RST) of the transform coefficients; and a step for generating a reconstructed picture on the basis of residual samples for the target block on the basis of an inverse primary transform of the modified transform coefficients, wherein the inverse RST using a transform kernel matrix is performed on transform coefficients of the upper-left 4×4 region of an 8×8 region of the target block, and the modified transform coefficients of the upper-left 4×4 region, upper-right 4×4 region, and lower-left 4×4 region of the 8×8 region are derived through the inverse RST.
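The shape of the inverse RST described here (16 input coefficients from the upper-left 4×4 region, 48 output coefficients covering three 4×4 regions of the 8×8 area) can be sketched as a matrix-vector product. The kernel values and the row-major output order below are assumptions for illustration, not the kernel or scan order of any standard.

```python
from typing import List

def inverse_rst_8x8(coeffs_4x4: List[List[int]],
                    kernel: List[List[int]]) -> List[List[int]]:
    """Map 16 coefficients to 48 modified coefficients inside an 8x8 region.

    `kernel` has 48 rows and 16 columns (hypothetical values). The lower-right
    4x4 region is left at zero because the inverse RST does not produce it.
    """
    vec = [c for row in coeffs_4x4 for c in row]                       # 16 inputs
    out = [sum(k * v for k, v in zip(krow, vec)) for krow in kernel]   # 48 outputs
    block = [[0] * 8 for _ in range(8)]
    # Assumed output order: row-major over the 8x8 region, skipping lower-right 4x4.
    positions = [(r, c) for r in range(8) for c in range(8)
                 if not (r >= 4 and c >= 4)]
    for (r, c), value in zip(positions, out):
        block[r][c] = value
    return block

# Toy kernel: output i copies input i % 16, just to make the data flow visible.
toy_kernel = [[1 if col == row % 16 else 0 for col in range(16)] for row in range(48)]
toy_input = [[3, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
modified = inverse_rst_8x8(toy_input, toy_kernel)
```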
Video decoding method and apparatus and video encoding method and apparatus
Provided is a video decoding method including determining a displacement vector per unit time of pixels of a current block in a horizontal direction or a vertical direction, the pixels including a pixel adjacent to an inside of a boundary of the current block, by using values about reference pixels included in a first reference block and a second reference block, without using a stored value about a pixel located outside boundaries of the first reference block and the second reference block; and obtaining a prediction block of the current block by performing block-unit motion compensation and pixel group unit motion compensation on the current block by using a gradient value in the horizontal direction or the vertical direction of a first corresponding reference pixel in the first reference block which corresponds to a current pixel included in a current pixel group in the current block, a gradient value in the horizontal direction or the vertical direction of a second corresponding reference pixel in the second reference block which corresponds to the current pixel, a pixel value of the first corresponding reference pixel, a pixel value of the second corresponding reference pixel, and a displacement vector per unit time of the current pixel in the horizontal direction or the vertical direction. In this regard, the current pixel group may include at least one pixel.
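The per-pixel refinement can be sketched from the inputs the abstract lists: two reference pixel values, their gradients, and a displacement vector. The combining formula below is one common form of bi-directional optical flow and is an assumption, as is the sign convention. The gradient helper clamps coordinates so that no value outside the reference block is read.

```python
from typing import List

def clamp(i: int, lo: int, hi: int) -> int:
    return max(lo, min(i, hi))

def gradient_x(block: List[List[int]], r: int, c: int) -> float:
    """Horizontal gradient using only pixels inside the block (edges are clamped)."""
    last = len(block[0]) - 1
    left = block[r][clamp(c - 1, 0, last)]
    right = block[r][clamp(c + 1, 0, last)]
    return (right - left) / 2

def refine_pixel(p0: float, p1: float,
                 gx0: float, gx1: float, gy0: float, gy1: float,
                 vx: float, vy: float) -> float:
    """Assumed refinement: average of both references plus a gradient correction."""
    return (p0 + p1 + vx * (gx0 - gx1) + vy * (gy0 - gy1)) / 2

row_block = [[10, 20, 40]]
edge_gradient = gradient_x(row_block, 0, 0)      # uses pixel 0 twice instead of reading outside
refined = refine_pixel(100, 104, 2, -2, 0, 0, 0.5, 0)
```

With a zero displacement vector the function reduces to a plain average of the two reference pixels, which corresponds to block-unit bi-prediction without the pixel-group correction.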