H04N19/177

ENCODING A VIDEO FRAME AS A REFERENCE FRAME BASED ON A SCENE CHANGE HINT AT A CLOUD GAMING SERVER
20230138708 · 2023-05-04

A method for encoding includes executing game logic built on a game engine of a video game at a cloud gaming server to generate video frames. The method includes executing scene change logic to predict a scene change in the video frames based on game state collected during execution of the game logic. The method includes identifying a range of video frames that is predicted to include the scene change. The method includes generating a scene change hint using the scene change logic, wherein the scene change hint identifies the range of video frames, and wherein the range of video frames includes a first video frame. The method includes delivering the first video frame to an encoder. The method includes sending the scene change hint from the scene change logic to the encoder. The method includes encoding the first video frame as an I-frame based on the scene change hint.
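
The hint-driven frame-type decision described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `SceneChangeHint` and `choose_frame_types`, the periodic `gop_length` fallback, and the choice to treat the first frame of the hinted range as the "first video frame" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SceneChangeHint:
    """Hypothetical hint: an inclusive range of frame indices predicted to contain a scene change."""
    first_frame: int
    last_frame: int


def choose_frame_types(num_frames: int,
                       hint: Optional[SceneChangeHint] = None,
                       gop_length: int = 60) -> List[str]:
    """Return "I" or "P" for each frame index.

    A frame becomes an I-frame when it is the first frame of the hinted range,
    or when no I-frame has been emitted for gop_length frames (assumed fallback).
    """
    frame_types: List[str] = []
    last_i: Optional[int] = None
    for n in range(num_frames):
        hinted = hint is not None and n == hint.first_frame
        periodic = last_i is None or n - last_i >= gop_length
        if hinted or periodic:
            frame_types.append("I")
            last_i = n
        else:
            frame_types.append("P")
    return frame_types
```

In this sketch the encoder does not inspect pixel data to detect the scene change; it relies only on the hint, which is the point of generating the hint from game state before the frames reach the encoder.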

METHOD FOR ALIGNMENT ACROSS LAYERS IN CODED VIDEO STREAM
20230132814 · 2023-05-04

A method, computer program, and computer system are provided for aligning across layers in a coded video stream. A video bitstream having multiple layers is decoded. One or more subpicture regions are identified from among the multiple layers of the decoded video bitstream, the subpicture regions including a background region and one or more foreground subpicture regions. An enhanced subpicture is decoded and displayed based on a determination that a foreground subpicture region is selected. The background region is decoded and displayed based on a determination that no foreground subpicture region is selected.
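
The selection step in this abstract (show the enhanced foreground subpicture if one is selected, otherwise the background) can be sketched as follows. The `Subpicture` record and the `pick_subpicture` function are hypothetical names, and the sketch leaves out the actual layered decoding entirely.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Subpicture:
    """Hypothetical description of one subpicture region found in the layered bitstream."""
    region_id: str
    layer_id: int
    is_background: bool


def pick_subpicture(subpictures: Sequence[Subpicture],
                    selected_region_id: Optional[str] = None) -> Subpicture:
    """Return the region to decode and display.

    If a foreground region is selected, return that (enhanced) subpicture;
    otherwise fall back to the background region.
    """
    if selected_region_id is not None:
        for sp in subpictures:
            if not sp.is_background and sp.region_id == selected_region_id:
                return sp
    for sp in subpictures:
        if sp.is_background:
            return sp
    raise ValueError("bitstream description contains no background region")
```

An unknown or missing selection falls through to the background region, which matches the "was not selected" branch of the abstract.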

PICTURE METADATA FOR VARIABLE FRAME-RATE VIDEO

Metadata and methods for variable-frame rate (VFR) video playback are presented. Proposed metadata include syntax parameters related to the presentation time duration, picture source type (e.g., original, duplicate, or interpolated), picture position in a scene (e.g., first, last, or in the middle), and motion-related information with respect to a previous picture. A decoder may use these metadata to apply appropriate frame-rate conversion techniques to reduce artifacts during VFR playback.
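
To make the four metadata categories concrete, the sketch below models them as a Python record and shows one way a decoder might consult them. The field names, the `motion_magnitude` summary, the threshold, and the decision policy in `choose_conversion` are illustrative assumptions, not the syntax or behavior defined in the application.

```python
from dataclasses import dataclass
from enum import Enum


class SourceType(Enum):
    ORIGINAL = "original"
    DUPLICATE = "duplicate"
    INTERPOLATED = "interpolated"


class ScenePosition(Enum):
    FIRST = "first"
    MIDDLE = "middle"
    LAST = "last"


@dataclass(frozen=True)
class PictureMetadata:
    """Hypothetical per-picture metadata mirroring the categories in the abstract."""
    duration_ticks: int            # presentation time duration
    source_type: SourceType        # original, duplicate, or interpolated
    scene_position: ScenePosition  # first, last, or in the middle of a scene
    motion_magnitude: float        # assumed summary of motion versus the previous picture


def choose_conversion(meta: PictureMetadata, motion_threshold: float = 1.0) -> str:
    """Pick a frame-rate conversion technique (assumed policy).

    Avoid interpolating from the previous picture across a scene boundary,
    repeat pictures when motion is low, and interpolate otherwise.
    """
    if meta.scene_position is ScenePosition.FIRST:
        return "repeat"
    if meta.motion_magnitude < motion_threshold:
        return "repeat"
    return "interpolate"
```

The scene-position check illustrates why such metadata can reduce artifacts: blending the first picture of a new scene with the last picture of the old one would produce a visible ghost frame.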

SINGLE LAYER HIGH DYNAMIC RANGE CODING WITH STANDARD DYNAMIC RANGE BACKWARD COMPATIBILITY

A method for transforming high dynamic range (HDR) video data into standard dynamic range (SDR) video data and encoding the SDR video data so that the HDR video data may be recovered at the decoder includes generating a tone map describing a transformation applied to the HDR video data to generate the SDR video data. The generated tone map describes the transformation as the multiplication of each HDR pixel in the HDR video data by a scalar to generate the SDR video data. The tone map is then modeled as a reshaping transfer function and the HDR video data is processed by the reshaping transfer function to generate the SDR video data. The reshaping transfer function is then inverted and described in a self-referential metadata structure. The SDR video data is then encoded including the metadata structure defining the inverse reshaping transfer function.
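
The chain described here (scalar tone map, reshaping transfer function, inverse carried as metadata) can be sketched on a single luminance channel. Everything specific below is an assumption for illustration: the Reinhard-style scalar `1 / (1 + x)`, the piecewise-linear model with nine pivots, and the use of plain pivot lists in place of the self-referential metadata structure.

```python
from bisect import bisect_right
from typing import Callable, List, Tuple


def interp(x: float, xs: List[float], ys: List[float]) -> float:
    """Piecewise-linear interpolation through (xs[i], ys[i]); xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def build_reshaper(scalar: Callable[[float], float],
                   hdr_max: float,
                   pivots: int = 9) -> Tuple[List[float], List[float]]:
    """Sample the tone map sdr = hdr * scalar(hdr) at evenly spaced pivots.

    The resulting pivot lists model the reshaping transfer function. The
    function must be strictly increasing for the inverse to exist.
    """
    xs = [hdr_max * k / (pivots - 1) for k in range(pivots)]
    ys = [x * scalar(x) for x in xs]
    return xs, ys


def reinhard_scalar(x: float) -> float:
    """Assumed example scalar; maps HDR value x to x / (1 + x) after multiplication."""
    return 1.0 / (1.0 + x)


hdr_pivots, sdr_pivots = build_reshaper(reinhard_scalar, hdr_max=8.0)

# Encoder side: forward reshaping turns an HDR sample into an SDR sample.
sdr_value = interp(3.3, hdr_pivots, sdr_pivots)

# Decoder side: the inverse reshaping function is the same pivot set with axes swapped.
recovered_hdr = interp(sdr_value, sdr_pivots, hdr_pivots)
```

Because the inverse of a monotonic piecewise-linear function is again piecewise linear over the swapped pivots, the decoder in this sketch recovers the HDR sample up to floating-point error using only the pivot lists that would travel with the SDR stream.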

SELECTION OF MOTION VECTOR PRECISION

Approaches to selection of motion vector (“MV”) precision during video encoding are presented. These approaches can facilitate compression that is effective in terms of rate-distortion performance and/or computational efficiency. For example, a video encoder determines an MV precision for a unit of video from among multiple MV precisions, which include one or more fractional-sample MV precisions and integer-sample MV precision. The video encoder can identify a set of MV values having a fractional-sample MV precision, then select the MV precision for the unit based at least in part on prevalence of MV values (within the set) having a fractional part of zero. Or, the video encoder can perform rate-distortion analysis, where the rate-distortion analysis is biased towards the integer-sample MV precision. Or, the video encoder can collect information about the video and select the MV precision for the unit based at least in part on the collected information.
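
The first of the three approaches, choosing precision from the prevalence of MV values whose fractional part is zero, can be sketched as below. The quarter-sample storage convention, the 0.9 threshold, and the name `select_mv_precision` are assumptions for illustration; the abstract does not fix any of them.

```python
from typing import Sequence, Tuple


def select_mv_precision(mvs_quarter: Sequence[Tuple[int, int]],
                        threshold: float = 0.9) -> str:
    """Choose "integer" or "quarter" MV precision for a unit of video.

    mvs_quarter holds (dx, dy) motion vectors in quarter-sample units, so a
    component has a fractional part of zero when it is a multiple of 4.
    Integer precision is chosen when the share of such MVs reaches threshold.
    """
    if not mvs_quarter:
        return "quarter"
    integer_count = sum(
        1 for dx, dy in mvs_quarter if dx % 4 == 0 and dy % 4 == 0
    )
    prevalence = integer_count / len(mvs_quarter)
    return "integer" if prevalence >= threshold else "quarter"
```

Content such as screen capture tends to move by whole samples, so most of its MVs land on multiples of 4 in this representation and the sketch switches to the cheaper integer-sample signaling.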
