H04N19/55

Method and device for processing video signal on basis of inter prediction
11695948 · 2023-07-04

The disclosure provides a method for processing a video signal and an apparatus therefor. Specifically, a method of processing a video signal based on inter prediction comprises: configuring a merge list based on a neighboring block of a current block; adding a history-based merge candidate included in a history-based merge candidate list to the merge list when the number of merge candidates included in the merge list is smaller than a first predetermined number; obtaining a merge index indicating a merge candidate used for inter prediction of the current block within the merge list; and generating a prediction block of the current block based on motion information of the merge candidate indicated by the merge index, wherein the step of adding the history-based merge candidate to the merge list comprises checking whether a second predetermined number of history-based merge candidates within the history-based merge candidate list have the same motion information as a merge candidate included in the merge list.
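
The list-filling step described above can be illustrated with a minimal sketch. It is not the claimed method: the names `MotionInfo`, `add_history_candidates`, `first_limit` and `num_pruned` are hypothetical, and two choices are assumptions made only for illustration: history candidates are visited from most recent to oldest, and only the first `num_pruned` visited candidates are compared against the merge list.

```python
from typing import List, NamedTuple


class MotionInfo(NamedTuple):
    """Hypothetical motion information: one motion vector plus a reference index."""
    mv_x: int
    mv_y: int
    ref_idx: int


def add_history_candidates(merge_list: List[MotionInfo],
                           history: List[MotionInfo],
                           first_limit: int,
                           num_pruned: int) -> List[MotionInfo]:
    """Append history-based candidates while the merge list is shorter than first_limit.

    Assumption: `history` is ordered oldest to newest and is visited newest first.
    Assumption: only the first `num_pruned` visited candidates are checked for
    identical motion information; later ones are appended without the check.
    """
    out = list(merge_list)
    for visited, cand in enumerate(reversed(history)):
        if len(out) >= first_limit:
            break
        if visited < num_pruned and cand in out:
            continue  # same motion information already present, skip it
        out.append(cand)
    return out


a, b = MotionInfo(1, 0, 0), MotionInfo(0, 2, 0)
c, d = MotionInfo(3, 3, 1), MotionInfo(-1, 4, 0)
result = add_history_candidates([a, b], [c, a, d], first_limit=4, num_pruned=2)
```

Limiting the comparison to a fixed number of history candidates bounds the worst-case number of comparisons per block, which is the apparent motivation for the second predetermined number.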

Tiling for video based point cloud compression

A method for point cloud encoding includes generating, for a three-dimensional (3D) point cloud, video frames and atlas frames that include pixels representing information about the 3D point cloud, wherein atlas tiles represent partitions in the atlas frames and video tiles represent partitions in the video frames. The method also includes setting a value for a syntax element according to relationships between sizes of the video tiles and sizes of the atlas tiles. The method further includes encoding the video frames and the atlas frames to generate video sub-bitstreams and an atlas sub-bitstream, respectively. Additionally, the method includes generating a bitstream based on the atlas sub-bitstream, the video sub-bitstreams, and the syntax element, and transmitting the bitstream.

Using unrefined motion vectors for performing decoder-side motion vector derivation

A device for decoding video data includes a memory configured to store video data; and one or more processors implemented in circuitry and configured to: determine a deterministic bounding box from which to retrieve reference samples of reference pictures of video data for performing decoder-side motion vector derivation (DMVD) for a current block of the video data; derive a motion vector for the current block according to DMVD using the reference samples within the deterministic bounding box; form a prediction block using the motion vector; and decode the current block using the prediction block.
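
One way to picture a deterministic bounding box is sketched below. It is an illustration under stated assumptions, not the device's actual procedure: the box is assumed to be derived from the block position, the block size, an initial (unrefined) motion vector and a fixed `margin`, and samples requested outside the box are assumed to be replaced by the nearest sample inside it. The function names and the `margin` parameter are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), inclusive


def bounding_box(x: int, y: int, w: int, h: int,
                 mv: Tuple[int, int], margin: int) -> Box:
    """Box around the block displaced by an integer motion vector.

    Assumption: `margin` covers the search range plus any interpolation taps.
    """
    mvx, mvy = mv
    return (x + mvx - margin, y + mvy - margin,
            x + mvx + w - 1 + margin, y + mvy + h - 1 + margin)


def fetch_clamped(ref: List[List[int]], box: Box, px: int, py: int) -> int:
    """Read one reference sample, clamping the position to the box and the picture."""
    x0, y0, x1, y1 = box
    cx = min(max(px, x0), x1)
    cy = min(max(py, y0), y1)
    cx = min(max(cx, 0), len(ref[0]) - 1)
    cy = min(max(cy, 0), len(ref) - 1)
    return ref[cy][cx]


# 8x8 reference picture whose sample value encodes its position.
ref = [[r * 8 + c for c in range(8)] for r in range(8)]
box = bounding_box(x=4, y=4, w=2, h=2, mv=(1, 0), margin=1)
```

Because the box depends only on values known before refinement starts, the set of reference samples to fetch is fixed in advance, whichever motion vector the search finally selects.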

Gradual Decoding Refresh In Video Coding
20250234045 · 2025-07-17

A method of decoding a coded video bitstream implemented by a video decoder is provided. The method includes the video decoder receiving the coded video bitstream, the coded video bitstream containing a gradual decoding refresh (GDR) flag corresponding to a coded video sequence (CVS); determining, by the video decoder, whether a GDR picture is present in the CVS based on a value of the GDR flag; initiating, by the video decoder, decoding of the CVS at the GDR picture when the value of the GDR flag indicates that the GDR picture is present; and generating, by the video decoder, an image according to the CVS as decoded. A corresponding method of encoding implemented by a video encoder is also disclosed.
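
The decoder-side decision can be sketched as follows. This is a simplified model rather than real bitstream parsing: the classes `Picture` and `CodedVideoSequence`, their fields, and the convention that a flag value of 1 means "a GDR picture is present" are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Picture:
    order: int            # hypothetical position of the picture in the CVS
    is_gdr: bool = False  # True if this picture is a GDR picture


@dataclass
class CodedVideoSequence:
    gdr_flag: int             # assumed: 1 means a GDR picture is present
    pictures: List[Picture]


def decode_start_index(cvs: CodedVideoSequence) -> Optional[int]:
    """Return the index of the GDR picture at which decoding may start, if any."""
    if cvs.gdr_flag != 1:
        return None  # flag says no GDR picture, so the pictures are not scanned
    for idx, pic in enumerate(cvs.pictures):
        if pic.is_gdr:
            return idx
    return None


cvs = CodedVideoSequence(gdr_flag=1,
                         pictures=[Picture(0), Picture(1, is_gdr=True), Picture(2)])
start = decode_start_index(cvs)
```

Signalling the flag at the sequence level lets a decoder decide whether a GDR entry point exists without inspecting every picture.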

Refinement of internal sub-blocks of a coding unit

Motion information for an internal sub-block of a larger block can be derived for use in encoding or decoding the video block or a coding unit by using the motion information for sub-blocks on the left or top edge of the coding block. The left column and the top row of edge sub-blocks have motion information, such as motion vectors, derived using techniques such as template matching. The motion vectors of these edge sub-blocks are used in deriving the motion vectors of internal sub-blocks, which leads to better prediction and improved coding efficiency. In another embodiment, other motion information for internal sub-blocks is derived from corresponding information of the edge sub-blocks.
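
A minimal sketch of propagating edge motion vectors to internal sub-blocks is shown below. The abstract does not state the derivation rule, so the rule used here is an assumption: each internal vector is the component-wise floor average of the top-edge vector in the same column and the left-edge vector in the same row. The function name `fill_internal_mvs` is hypothetical, and the edge vectors are taken as given inputs (for example, results of template matching).

```python
from typing import List, Tuple

MV = Tuple[int, int]


def fill_internal_mvs(top_row: List[MV], left_col: List[MV]) -> List[List[MV]]:
    """Build a grid of sub-block motion vectors from the two edges.

    top_row[j] is the vector of the top-edge sub-block in column j and
    left_col[i] the vector of the left-edge sub-block in row i; the corner
    sub-block is expected to appear in both (top_row[0] == left_col[0]).
    Assumption: internal vectors are the floor average of their two edge vectors.
    """
    rows, cols = len(left_col), len(top_row)
    grid: List[List[MV]] = [[(0, 0)] * cols for _ in range(rows)]
    grid[0] = list(top_row)
    for i in range(rows):
        grid[i][0] = left_col[i]
    for i in range(1, rows):
        for j in range(1, cols):
            tx, ty = top_row[j]
            lx, ly = left_col[i]
            grid[i][j] = ((tx + lx) // 2, (ty + ly) // 2)
    return grid


grid = fill_internal_mvs(top_row=[(0, 0), (4, 2), (8, 4)],
                         left_col=[(0, 0), (2, 6)])
```

A distance-weighted combination would be an equally plausible rule; the plain average is used only to keep the sketch short.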

Method and apparatus for encoding and transmitting at least a spatial part of a video sequence

When encoding and transmitting video data comprising regions of interest, different usages of the regions of interest imply different kinds of combination of regions of interest at decoding. By studying how encoding mechanisms that depend on data from other tile sets affect each kind of combination, it is possible to define a plurality of tile set coding dependency levels. Each tile set coding dependency level is linked to a set of constraints on encoding. These sets of constraints have different impacts on the possibilities allowed when combining the different regions of interest. It is therefore possible, based on a desired usage, to select an encoding with minimal restrictions, as defined by a given tile set coding dependency level, compatible with the desired usage. Accordingly, the encoding efficiency is improved, for a given usage, compared to a solution in which complete tile independence is used.
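
The selection step can be sketched as a lookup over a table of levels. Everything concrete here is hypothetical: the abstract names neither the levels nor the usages, so `DependencyLevel`, its fields, and the example level and usage names are placeholders, and "minimal restrictions" is modelled simply as the smallest constraint count.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class DependencyLevel(NamedTuple):
    name: str                         # placeholder label for a dependency level
    num_constraints: int              # how many encoding constraints it imposes
    supported_usages: FrozenSet[str]  # region-of-interest usages it allows


def select_level(levels: List[DependencyLevel],
                 usage: str) -> Optional[DependencyLevel]:
    """Pick the least constrained level that still supports the desired usage."""
    compatible = [lv for lv in levels if usage in lv.supported_usages]
    if not compatible:
        return None
    return min(compatible, key=lambda lv: lv.num_constraints)


# Placeholder table: more constraints allow more ways of combining regions.
levels = [
    DependencyLevel("level_a", 1, frozenset({"usage_1"})),
    DependencyLevel("level_b", 3, frozenset({"usage_1", "usage_2"})),
    DependencyLevel("level_c", 5, frozenset({"usage_1", "usage_2", "usage_3"})),
]
chosen = select_level(levels, "usage_2")
```

Choosing the least constrained compatible level leaves the encoder the most prediction freedom, which is where the efficiency gain over full tile independence would come from.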

Coding schemes for virtual reality (VR) sequences
11527015 · 2022-12-13

An improved method is provided for coding video that includes Virtual Reality (VR) sequences; it enables more efficient encoding by organizing the VR sequence as a single 2D block structure. In the method, reference picture and subpicture lists are created and extended to account for coding of the VR sequence. To further improve coding efficiency, reference indexing can be provided for the temporal and spatial difference between a current VR picture block and the reference pictures and subpictures for the VR sequence. Further, because the reference subpictures may not have the proper orientation once the subpictures are organized into the VR sequence, the reference subpictures are reoriented so that their orientations match the current VR subpicture orientations.
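
The reorientation step can be sketched for the simple case where orientations differ by multiples of 90 degrees. That restriction, the encoding of an orientation as a number of clockwise quarter turns, and the names `rotate_cw` and `reorient` are assumptions for illustration; the abstract does not say how orientation is represented.

```python
from typing import List

Block = List[List[int]]


def rotate_cw(block: Block) -> Block:
    """Rotate a 2D sample array by 90 degrees clockwise."""
    return [list(row) for row in zip(*block[::-1])]


def reorient(ref: Block, ref_orient: int, cur_orient: int) -> Block:
    """Rotate a reference subpicture so its orientation matches the current one.

    Assumption: orientations are counted in clockwise quarter turns (0 to 3).
    """
    turns = (cur_orient - ref_orient) % 4
    out = [list(row) for row in ref]
    for _ in range(turns):
        out = rotate_cw(out)
    return out


ref_sub = [[1, 2],
           [3, 4]]
aligned = reorient(ref_sub, ref_orient=0, cur_orient=1)
```

After alignment, motion-compensated prediction can treat the reference subpicture as if it had been captured in the current subpicture's orientation.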