Patent classifications
H04N19/107
VIDEO ENCODING METHOD AND APPARATUS, VIDEO DECODING METHOD AND APPARATUS, AND DEVICE
A video encoding method includes: decoding a target coding unit to obtain a quantization coefficient matrix corresponding to the target coding unit; determining first reference information according to a quantization coefficient in the quantization coefficient matrix; obtaining a value of a target flag bit corresponding to the first reference information, the target flag bit being a flag bit of a target sub-block position, the target sub-block position being a position of a sub-block that requires processing of residual data in a coding unit; and determining the target sub-block position of the target coding unit according to the value of the target flag bit. A flag bit of a sub-block transform position or a transform skip sub-block position in a target coding unit is implicitly indicated using a quantization coefficient in a quantization coefficient matrix corresponding to a coding unit.
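The implicit-signaling idea above can be sketched as follows: a flag that would otherwise be written to the bitstream is instead derived from the quantization coefficients themselves. The parity rule and the position labels below are illustrative assumptions, not the patented derivation.

```python
# Hypothetical sketch: derive a sub-block position flag implicitly from a
# quantization coefficient matrix instead of signaling it in the bitstream.
# The parity rule below is an illustrative assumption, not the claimed scheme.

def derive_target_flag(quant_matrix):
    """Derive first reference information from coefficients, then map it to a flag value."""
    # First reference information: here, the sum of all non-zero coefficients.
    reference = sum(c for row in quant_matrix for c in row if c != 0)
    # Target flag bit: here, the parity of that sum (0 or 1).
    return reference & 1

def target_subblock_position(quant_matrix):
    """Map the flag to a sub-block position (0 = first half, 1 = second half)."""
    flag = derive_target_flag(quant_matrix)
    return "second half" if flag else "first half"

matrix = [[3, 0, 1],
          [0, 2, 0],
          [1, 0, 0]]
print(target_subblock_position(matrix))  # coefficient sum 7, flag 1
```

Because both encoder and decoder compute the same function of the coefficients, no explicit flag bit needs to be transmitted.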
Scene-aware video encoder system and method
Embodiments of the present disclosure disclose a scene-aware video encoder system. The scene-aware encoder system transforms a sequence of video frames of a video of a scene into a spatio-temporal scene graph. The spatio-temporal scene graph includes nodes representing one or multiple static and dynamic objects in the scene. Each node of the spatio-temporal scene graph describes the appearance, location, and/or motion of one of the objects (static or dynamic) at different time instances. The nodes of the spatio-temporal scene graph are embedded into a latent space using a spatio-temporal transformer encoding different combinations of different nodes of the spatio-temporal scene graph corresponding to different spatio-temporal volumes of the scene. Each node of the different nodes encoded in each of the combinations is weighted with an attention score determined as a function of similarities of the spatio-temporal locations of the different nodes in the combination.
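The location-based attention weighting described above can be sketched as follows: each node carries a spatio-temporal location (x, y, t), and nodes in a combination are weighted by a softmax over their proximity to the other nodes. The node representation and the distance-based similarity are assumptions for illustration, not the disclosed scoring function.

```python
import math

# Illustrative sketch: weight scene-graph nodes by the similarity of their
# spatio-temporal locations. Locations and the negative-mean-distance
# similarity are assumptions for illustration only.

def attention_scores(locations):
    """Score each node by its spatio-temporal proximity to the others."""
    def dist(a, b):
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
    # Similarity of each node to the rest of the combination
    # (negative mean distance: closer nodes are more similar).
    sims = [-sum(dist(loc, other) for other in locations) / (len(locations) - 1)
            for loc in locations]
    # Softmax so the attention scores form a distribution over the nodes.
    exps = [math.exp(s) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]

# Three nodes: two close together in space-time, one far away.
locs = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0)]
scores = attention_scores(locs)
assert abs(sum(scores) - 1.0) < 1e-9
assert scores[2] < scores[0]  # the spatio-temporal outlier gets less attention
```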
Method and apparatus for improved compound orthonormal transform
A method of controlling residual coding for decoding or encoding of a video sequence is performed by at least one processor and includes determining whether a small transform size of a primary transform is to be used for the residual coding of a coded block of the video sequence. The method further includes: based on the small transform size of the primary transform being determined to be used, identifying, as the primary transform, a first transform set including discrete sine transform (DST)-4 and discrete cosine transform (DCT)-4; based on the small transform size of the primary transform being determined not to be used, identifying, as the primary transform, a second transform set including DST-7 and DCT-8; and performing the residual coding of the coded block using the identified primary transform.
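The selection rule above reduces to a single branch on transform size. A minimal sketch, assuming a size threshold of 4 (the threshold and string labels are illustrative, not from the claims):

```python
# Minimal sketch of the transform-set selection described above. The size
# threshold and the string labels are assumptions for illustration.

SMALL_SET = ("DST-4", "DCT-4")
LARGE_SET = ("DST-7", "DCT-8")

def select_primary_transform_set(block_size, small_size_threshold=4):
    """Pick the primary transform set for residual coding of a coded block."""
    use_small = block_size <= small_size_threshold
    return SMALL_SET if use_small else LARGE_SET

assert select_primary_transform_set(4) == ("DST-4", "DCT-4")
assert select_primary_transform_set(8) == ("DST-7", "DCT-8")
```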
Encoder, decoder, encoding method, and decoding method
A decoder comprises circuitry and memory. The circuitry, using the memory, in operation, determines a number of first pixels and a number of second pixels used in a deblocking filter process, wherein the first pixels are located at an upper side of a block boundary and the second pixels are located at a lower side of the block boundary, and performs the deblocking filter process on the block boundary. The number of the first pixels and the number of the second pixels are selected from among candidates, wherein the candidates include at least 4 and a value M larger than 4. In response to a location of the block boundary being a predetermined location, the number of the first pixels used in the deblocking filter process is limited to 4.
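The constraint above can be sketched as follows. This is a hedged illustration: the "predetermined location" is assumed here to be a horizontal CTU boundary (a common reason to cap the upper-side count is limiting the line buffer above such a boundary), and the CTU size, M, and the large-block condition are assumptions, not the claimed values.

```python
# Hedged sketch of the pixel-count selection above. CTU size, M = 8, and the
# large-block condition are illustrative assumptions, not the claimed rule.

CTU_SIZE = 128  # assumed CTU height

def deblocking_pixel_counts(boundary_y, large_block, m=8):
    """Return (upper_count, lower_count) of pixels used by the deblocking filter."""
    # Candidates per side are {4, M}; M is used for large blocks.
    upper = m if large_block else 4
    lower = m if large_block else 4
    # At a predetermined location (here, a horizontal CTU boundary), the
    # upper-side count is limited to 4, capping the line buffer above it.
    if boundary_y % CTU_SIZE == 0:
        upper = 4
    return upper, lower

assert deblocking_pixel_counts(64, large_block=True) == (8, 8)
assert deblocking_pixel_counts(128, large_block=True) == (4, 8)
```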
Selective resolution video encoding method, computer device, and readable storage medium
This application relates to a video encoding method performed at a computer device. The method includes: obtaining an input video frame; determining a processing parameter corresponding to the input video frame; selecting, from candidate processing manners according to the processing parameter, a target processing manner corresponding to the input video frame, the candidate processing manners comprising a full-resolution processing manner and a downsampling processing manner; and encoding the input video frame according to the target processing manner, to obtain encoded data corresponding to the input video frame. Therefore, the target processing manner of the input video frame can be flexibly selected, and the input video frame is encoded according to the target processing manner, to adaptively adjust a resolution of the input video frame, and improve video encoding quality.
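The adaptive-resolution selection above can be sketched as a threshold decision followed by optional downsampling. The "complexity" processing parameter, the threshold, and the stub encoder are assumptions for illustration, not the disclosed criteria.

```python
# Illustrative sketch of the selection above: a processing parameter (here a
# hypothetical per-frame "complexity" estimate) picks between the
# full-resolution and downsampling processing manners before encoding.
# The threshold and the 2x downsampling factor are assumptions.

def select_processing_manner(frame_complexity, threshold=0.5):
    """Choose the target processing manner for an input video frame."""
    return "downsample" if frame_complexity > threshold else "full_resolution"

def encode_frame(frame, manner):
    """Stub encoder: the downsampling manner halves each dimension first."""
    if manner == "downsample":
        frame = [row[::2] for row in frame[::2]]
    return frame  # stand-in for the encoded data

frame = [[0] * 4 for _ in range(4)]
manner = select_processing_manner(frame_complexity=0.8)
encoded = encode_frame(frame, manner)
assert manner == "downsample"
assert len(encoded) == 2 and len(encoded[0]) == 2
```

Hard-to-compress frames are encoded at reduced resolution, while simple frames keep full resolution, which is one way to trade spatial detail for bitrate frame by frame.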
Caching and clearing mechanism for deep convolutional neural networks
An apparatus includes circuitry configured to: partition an input tensor into one or more block tensors; partition at least one of the block tensors into one or more continuation bands, the one or more continuation bands being associated with a caching counter having a value; store the one or more continuation bands in a cache managed using a cache manager; retrieve, prior to a convolution or pooling operation on a current block tensor, the one or more continuation bands of a previous block tensor from the cache that are adjacent to the current block tensor; concatenate the retrieved continuation bands with the current block tensor; apply the convolution or pooling operation on the current block tensor after the concatenation; decrease the respective caching counter value of the retrieved continuation bands; and clear the continuation bands from the cache when their respective caching counters reach a value of zero.
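The counter-driven clearing mechanism above can be sketched with a small cache manager: each stored band knows how many neighboring block tensors still need it, and it is evicted as soon as that count reaches zero. The dict-based cache and the band keys are assumptions for illustration.

```python
# Sketch of the caching/clearing mechanism described above, assuming each
# continuation band is needed by a known number of neighboring block tensors
# (its caching counter). The dict cache and key scheme are assumptions.

class ContinuationBandCache:
    def __init__(self):
        self._cache = {}  # band_key -> (band_data, caching_counter)

    def store(self, key, band, counter):
        """Store a continuation band with its initial caching counter."""
        self._cache[key] = (band, counter)

    def retrieve(self, key):
        """Fetch a band, decrease its counter, and clear it at zero."""
        band, counter = self._cache[key]
        counter -= 1
        if counter == 0:
            del self._cache[key]       # no further block tensors need it
        else:
            self._cache[key] = (band, counter)
        return band

    def __contains__(self, key):
        return key in self._cache

cache = ContinuationBandCache()
cache.store("block0/bottom", [1, 2, 3], counter=2)
assert cache.retrieve("block0/bottom") == [1, 2, 3]
assert "block0/bottom" in cache        # still needed by one more block tensor
cache.retrieve("block0/bottom")
assert "block0/bottom" not in cache    # counter hit zero, band cleared
```

Tying eviction to a reference count means the cache holds each band exactly as long as adjacent block tensors remain to be processed, with no separate cleanup pass.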