H04N19/33

MOTION COMPENSATION FOR NEURAL NETWORK ENHANCED IMAGES

A device includes a memory and one or more processors. The memory is configured to store instructions. The one or more processors are configured to execute the instructions to apply a neural network to a first image to generate an enhanced image. The one or more processors are also configured to execute the instructions to adjust at least a portion of a high-frequency component of the enhanced image based on a motion compensation operation to generate an adjusted high-frequency image component. The one or more processors are further configured to execute the instructions to combine a low-frequency component of the enhanced image and the adjusted high-frequency image component to generate an adjusted enhanced image.
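
As a rough illustration of the band split and recombination described in this abstract, the sketch below works on a 1-D signal in place of an image. The box-blur low-pass filter, the integer `shift` used as motion, and the blending `weight` are all assumptions made for illustration; the abstract does not specify the filter, the motion model, or how the high-frequency component is adjusted.

```python
def box_blur(signal, radius=1):
    """Low-pass filter: mean over a window that is clamped at the edges."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def split_bands(signal, radius=1):
    """Return (low, high) so that low[i] + high[i] reproduces signal[i]."""
    low = box_blur(signal, radius)
    high = [s - l for s, l in zip(signal, low)]
    return low, high


def motion_compensate(prev, shift):
    """Shift prev by an integer motion, repeating the edge samples."""
    n = len(prev)
    return [prev[min(max(i - shift, 0), n - 1)] for i in range(n)]


def adjust_enhanced(enhanced, prev_enhanced, shift, weight=0.5, radius=1):
    """Blend the current high band with a motion-compensated previous high band.

    Assumption: a linear blend stands in for the adjustment; the low band
    of the current enhanced signal is left untouched.
    """
    low, high = split_bands(enhanced, radius)
    _, prev_high = split_bands(prev_enhanced, radius)
    predicted_high = motion_compensate(prev_high, shift)
    adjusted_high = [(1 - weight) * h + weight * p
                     for h, p in zip(high, predicted_high)]
    return [l + h for l, h in zip(low, adjusted_high)]
```

With `weight=0.0` the function returns the enhanced signal unchanged (up to floating-point rounding), which is a convenient sanity check on the split and recombination.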

Adaptive video quality
11563959 · 2023-01-24

A method for encoding a first stream of video data comprising a plurality of frames of video, the method comprising, for one or more of the plurality of frames of video, the steps of: encoding a frame of the video data in a hierarchical arrangement, the hierarchical arrangement comprising a base layer of video data and a first enhancement layer of video data, said first enhancement layer of video data comprising a plurality of sub-layers of enhancement data, such that, when encoded: the base layer of video data comprises data which, when decoded, renders the frame at a first, base level of quality; and each sub-layer of enhancement data comprises data which, when decoded with the base layer, renders the frame at a higher level of quality than the base level of quality; and wherein the step of encoding the sub-layers of enhancement data comprises: quantizing the enhancement data at a determined initial level of quantization, thereby creating a set of quantized enhancement data; associating with each of the plurality of sub-layers a respective notional quantization level; and allocating, for each of the plurality of sub-layers, a subset of the set of quantized enhancement data based on the respective notional quantization level.
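
One possible reading of the sub-layer allocation can be sketched as follows. The enhancement data is quantized once at an initial step, and each sub-layer then receives the part of each quantized value that its notional level can express. Expressing the notional levels as integer multiples of the initial step (`multipliers`, ordered coarse to fine) and using truncating division for the split are assumptions made for this sketch; the abstract does not state the allocation rule.

```python
def quantize(values, step):
    """Quantize enhancement values once, at the initial quantization step."""
    return [round(v / step) for v in values]


def trunc_div(a, b):
    """Integer division that truncates toward zero."""
    q = abs(a) // b
    return q if a >= 0 else -q


def allocate_sublayers(quantized, multipliers):
    """Split quantized values across sub-layers.

    multipliers: hypothetical notional levels as multiples of the initial
    step, ordered coarse to fine; the last entry should be 1 so that all
    sub-layers together carry the full quantized value.
    """
    remaining = list(quantized)
    layers = []
    for m in multipliers:
        layer = [trunc_div(r, m) for r in remaining]
        remaining = [r - a * m for r, a in zip(remaining, layer)]
        layers.append(layer)
    return layers


def reconstruct(layers, multipliers, step, count):
    """Rebuild enhancement values from the first `count` sub-layers."""
    total = [0] * len(layers[0])
    for layer, m in zip(layers[:count], multipliers[:count]):
        total = [t + a * m for t, a in zip(total, layer)]
    return [t * step for t in total]
```

Decoding more sub-layers refines the reconstruction, and decoding all of them recovers the values at the initial quantization level.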

ENCODING AND DECODING A SEQUENCE OF PICTURES

An apparatus for decoding a sequence of pictures from a data stream is configured for decoding a picture of the sequence by: deriving a residual transform signal of the picture from the data stream; combining the residual transform signal with a buffered transform signal of a previous picture of the sequence so as to obtain a transform signal of the picture, the transform signal representing the picture in spectral components; and subjecting the transform signal to a spectral-to-spatial transformation, wherein the buffered transform signal comprises a selection out of spectral components representing the previous picture.
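
The decoding loop in this abstract can be sketched with a 1-D orthonormal DCT standing in for the spectral transform. The choice of DCT, the use of a 1-D list as a "picture", the `keep` set that selects which spectral components are buffered, and the `encode_residual` helper (an encoder-side counterpart added only so the example round-trips) are all assumptions for illustration.

```python
import math


def dct(x):
    """Orthonormal DCT-II (spatial to spectral)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct above (spectral to spatial)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1 / n)
        s += sum(c[k] * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out


def select_components(coeffs, keep):
    """Keep only the chosen spectral components; zero the rest."""
    return [c if k in keep else 0.0 for k, c in enumerate(coeffs)]


def encode_residual(picture, buffered):
    """Hypothetical encoder side: residual in the transform domain."""
    return [c - b for c, b in zip(dct(picture), buffered)]


def decode_picture(residual, buffered, keep):
    """Combine residual with the buffered signal, then inverse transform.

    Returns the decoded picture and the buffer for the next picture, which
    holds only the selected spectral components of this picture.
    """
    transform = [r + b for r, b in zip(residual, buffered)]
    return idct(transform), select_components(transform, keep)
```

Buffering only a selection of components (here, whichever indices are in `keep`) means the prediction for the next picture uses part of the previous spectrum, and the residual carries the rest.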

SIGNALING DECODED PICTURE BUFFER SIZE IN MULTI-LOOP SCALABLE VIDEO CODING
20230232025 · 2023-07-20

A method for encoding a video sequence in a scalable video encoder to generate a scalable bitstream is provided that includes encoding the video sequence in a first layer encoder of the scalable video encoder to generate a first sub-bitstream, encoding the video sequence in a second layer encoder of the scalable video encoder to generate a second sub-bitstream, wherein portions of the video sequence being encoded in the second layer encoder are predicted using reference portions of the video sequence encoded in the first layer encoder, combining the first sub-bitstream and the second sub-bitstream to generate the scalable bitstream, and signaling in the scalable bitstream an indication of a maximum decoded picture buffer (DPB) size needed for decoding the second sub-bitstream and the first sub-bitstream when the second sub-bitstream is a target sub-bitstream for decoding.
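
The signaling step can be pictured with a small data model. Everything named here is hypothetical: the `SubBitstream` fields, the dictionary layout of the combined bitstream, and in particular the rule that the signaled maximum DPB size is the sum of the two layers' buffer needs. The abstract only says that an indication of the maximum DPB size for decoding both sub-bitstreams is signaled, not how it is derived.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubBitstream:
    layer_id: int
    payload: bytes
    dpb_pictures: int                    # hypothetical per-layer buffer need
    ref_layer_id: Optional[int] = None   # layer used for inter-layer prediction


def build_scalable_bitstream(base, enhancement):
    """Combine two sub-bitstreams and attach the DPB size indication.

    Assumption: with multi-loop decoding both layers are fully decoded,
    so the sketch signals the sum of their buffer needs for the target layer.
    """
    max_dpb = base.dpb_pictures + enhancement.dpb_pictures
    return {
        "sub_bitstreams": [base, enhancement],
        "max_dpb_size": {enhancement.layer_id: max_dpb},
    }


def can_decode(stream, target_layer_id, available_pictures):
    """Decoder-side check against the signaled maximum DPB size."""
    return stream["max_dpb_size"][target_layer_id] <= available_pictures
```

A decoder could use such an indication to decide, before decoding starts, whether it has enough picture buffers to target the second layer.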

Method for parameter set reference constraints in coded video stream

A method and apparatus comprise computer code configured to cause one or more processors to: obtain video data comprising data of a plurality of semantically independent source pictures; determine, among the video data, whether references are associated with any of a first access unit (AU) and a second AU according to at least one picture order count (POC) signal value included with the video data; and output a first quantity of the references set to the first AU and a second quantity of the references set to the second AU based on the at least one POC signal value.

INFORMATION PROCESSING DEVICE AND METHOD

The present disclosure relates to an information processing device and a method capable of more easily reproducing 3D data using spatial scalability.

2D data, obtained by two-dimensionally converting a point cloud that represents an object having a three-dimensional shape as a set of points and that corresponds to spatial scalability, is encoded. A bitstream is generated that includes a sub-bitstream obtained by encoding the point cloud corresponding to one or more layers of the spatial scalability. Spatial scalability information regarding the spatial scalability of the sub-bitstream is generated, and a file that stores the generated bitstream and the generated spatial scalability information is generated. The present disclosure can be applied to, for example, an information processing device, an information processing method, or the like.

DYNAMIC RESOLUTION CHANGE HINTS FOR ADAPTIVE STREAMING
20230224532 · 2023-07-13

An example device for retrieving media data includes a memory configured to store video data; a video decoder configured to decode the video data; and one or more processors implemented in circuitry and configured to: determine that a media presentation includes first video data at a first spatial resolution and second video data at a second spatial resolution, the second spatial resolution being different than the first spatial resolution; receive a first portion of the first video data at the first spatial resolution for a first playback time; send the first portion of the first video data at the first spatial resolution to the video decoder; receive a second portion of the second video data at the second spatial resolution for a second playback time later than the first playback time; and send the second portion of the second video data at the second spatial resolution to the video decoder.
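
The client-side flow in this abstract can be sketched as a loop that picks a segment per playback time and hands it to the decoder, switching spatial resolution partway through. The `Segment` type, the `RecordingDecoder` stand-in, and the `schedule` of (playback time, resolution) pairs are hypothetical names for illustration; the abstract does not describe how the resolution for each playback time is chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start_time: float
    resolution: tuple   # (width, height)
    data: bytes


class RecordingDecoder:
    """Stand-in for a video decoder; records what it is sent."""

    def __init__(self):
        self.received = []

    def decode(self, segment):
        self.received.append((segment.start_time, segment.resolution))


def play(presentation, schedule, decoder):
    """Send one segment per scheduled playback time to the decoder.

    presentation: dict mapping resolution -> list of Segment
    schedule: list of (playback_time, resolution) pairs
    """
    if len(presentation) < 2:
        raise ValueError("expected video data at two or more resolutions")
    for playback_time, resolution in sorted(schedule):
        segment = next(s for s in presentation[resolution]
                       if s.start_time == playback_time)
        decoder.decode(segment)
```

For example, scheduling `(0.0, (1280, 720))` followed by `(2.0, (1920, 1080))` sends a first portion at the first resolution and a later portion at the second resolution to the same decoder.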
