H04N19/177

Neural network based image set compression
11496769 · 2022-11-08

Techniques for coding sets of images with neural networks include transforming a first image of a set of images into coefficients with an encoder neural network, encoding a group of the coefficients as an integer patch index into a coding table whose entries each hold a vector of coefficients, and storing a collection of patch indices as a first coded image. The encoder neural network may be configured with encoder weights determined by training jointly with corresponding decoder weights of a decoder neural network on the set of images.
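The patch-index step can be illustrated with a minimal sketch. It assumes the encoder network has already produced the coefficient groups, and it assumes that each group is mapped to the nearest table entry by squared Euclidean distance; the abstract does not state the selection rule, and every name below is hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode_groups(groups, table):
    """Replace each coefficient group with the index of its nearest table entry.

    Nearest-entry selection is an assumption made for this sketch.
    """
    return [
        min(range(len(table)), key=lambda i: squared_distance(g, table[i]))
        for g in groups
    ]

def decode_indices(indices, table):
    """Look each patch index back up in the coding table."""
    return [table[i] for i in indices]

# Hypothetical coding table with three entries of two coefficients each.
coding_table = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
# Hypothetical encoder output: three groups of coefficients.
coefficient_groups = [[0.9, 1.2], [-0.8, 1.7], [0.1, -0.2]]

coded_image = encode_groups(coefficient_groups, coding_table)
reconstructed = decode_indices(coded_image, coding_table)
print(coded_image)  # [1, 2, 0]
```

The coded image here is only a list of small integers; the coefficient vectors themselves live in the table, which can be shared across the whole image set.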

METHOD FOR ENCODING OF A VIDEO STREAM
20230099056 · 2023-03-30

In a method for encoding a first video stream, a temporal sequence of pictures is generated. To do so, a synchronization signal can be used, which can be derived from a second video stream independently of the first video stream. Alternatively, a second video stream that is independent of the first video stream can be encoded on the same principle as the first video stream.

Single layer high dynamic range coding with standard dynamic range backward compatibility

A method for transforming high dynamic range (HDR) video data into standard dynamic range (SDR) video data and encoding the SDR video data so that the HDR video data may be recovered at the decoder includes generating a tone map describing a transformation applied to the HDR video data to generate the SDR video data. The generated tone map describes the transformation as the multiplication of each HDR pixel in the HDR video data by a scalar to generate the SDR video data. The tone map is then modeled as a reshaping transfer function and the HDR video data is processed by the reshaping transfer function to generate the SDR video data. The reshaping transfer function is then inverted and described in a self-referential metadata structure. The SDR video data is then encoded including the metadata structure defining the inverse reshaping transfer function.
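The relationship between the reshaping transfer function, its inverse, and the per-pixel scalar can be sketched as follows. This is an illustration under stated assumptions: the reshaping function is taken to be a monotonic piecewise-linear curve, the pivot values are invented, and the metadata structure itself is not modeled.

```python
def piecewise_linear(x, xs, ys):
    """Evaluate the piecewise-linear curve through the points (xs[i], ys[i])."""
    if x <= xs[0]:
        return ys[0]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])
    return ys[-1]

# Hypothetical pivot points (luminance in nits); not taken from the source.
HDR_PIVOTS = [0.0, 100.0, 1000.0]
SDR_PIVOTS = [0.0, 60.0, 100.0]

def forward_reshape(hdr):
    """Map an HDR value to SDR with the assumed reshaping curve."""
    return piecewise_linear(hdr, HDR_PIVOTS, SDR_PIVOTS)

def inverse_reshape(sdr):
    """Recover the HDR value; for a monotonic curve, swapping the pivots inverts it."""
    return piecewise_linear(sdr, SDR_PIVOTS, HDR_PIVOTS)

def tone_map_scalar(hdr):
    """Scalar that, multiplied by the HDR value, yields the SDR value."""
    return forward_reshape(hdr) / hdr if hdr > 0 else 0.0

hdr_pixel = 550.0
sdr_pixel = forward_reshape(hdr_pixel)
recovered = inverse_reshape(sdr_pixel)
```

In this sketch only the swapped pivot lists would need to travel with the SDR stream for a decoder to apply `inverse_reshape`.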

Hierarchical Video Encoders

A computer-implemented method for generating video representations with a hierarchical video encoder includes: obtaining a video that includes a plurality of frames; processing each of the frames with a machine-learned frame-level encoder model to generate a respective plurality of frame representations; determining a plurality of segment representations, each representative of a video segment that includes one or more of the frames, based at least in part on the frame representations; processing the segment representations with a machine-learned segment-level encoder model to generate a plurality of contextualized segment representations; determining a video representation based at least in part on the contextualized segment representations; and providing the video representation as an output.
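The data flow of the hierarchy can be sketched with simple stand-ins for the two machine-learned models. Everything here is an assumption for illustration: frames are flat lists of pixel values, the frame-level "model" returns a mean and a maximum, segments are fixed-size runs of frames, and averaging is used wherever representations are combined.

```python
def mean_vectors(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def frame_encoder(frame):
    """Stand-in for the frame-level model: [mean, max] of the pixel values."""
    return [sum(frame) / len(frame), float(max(frame))]

def segment_encoder(segment_reps):
    """Stand-in for the segment-level model.

    Blends each segment with the mean of all segments, so that every
    output depends on the whole video (a crude form of contextualization).
    """
    context = mean_vectors(segment_reps)
    return [[0.5 * s + 0.5 * c for s, c in zip(seg, context)]
            for seg in segment_reps]

def encode_video(frames, frames_per_segment):
    frame_reps = [frame_encoder(f) for f in frames]
    # Segment representation: mean of the frame representations it covers.
    segment_reps = [
        mean_vectors(frame_reps[i:i + frames_per_segment])
        for i in range(0, len(frame_reps), frames_per_segment)
    ]
    contextualized = segment_encoder(segment_reps)
    # Video representation: mean of the contextualized segments.
    return mean_vectors(contextualized)

video = [[0, 2], [2, 4], [4, 6], [6, 8]]  # four tiny hypothetical frames
video_representation = encode_video(video, frames_per_segment=2)
print(video_representation)  # [4.0, 5.0]
```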

Encoding and Decoding Video Content

In an example method, a system receives a plurality of frames of a video, and generates a data structure representing the video and representing a plurality of temporal layers. Generating the data structure includes: (i) determining a plurality of quality levels for presenting the video, where each of the quality levels corresponds to a different respective sampling period for sampling the frames of the video, (ii) assigning, based on the sampling periods, each of the frames to a respective one of the temporal layers of the data structure, and (iii) indicating, in the data structure, one or more relationships between (a) at least one of the frames assigned to at least one of the temporal layers of the data structure, and (b) at least another one of the frames assigned to at least another one of the temporal layers of the data structure. Further, the system outputs the data structure.
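Steps (i) to (iii) can be sketched as follows. The sketch assumes sampling periods listed coarsest first (for example 4, 2, 1), assigns each frame to the first layer whose period divides its index, and records as its relationship the nearest earlier frame in the same or a lower layer. The reference rule and all names are assumptions; the abstract does not specify them.

```python
def assign_layers(num_frames, periods):
    """Map each frame index to a temporal layer.

    `periods` is ordered coarsest first and must end with 1, so that
    every frame index matches at least one period.
    """
    return {
        i: next(k for k, p in enumerate(periods) if i % p == 0)
        for i in range(num_frames)
    }

def assign_references(layers):
    """For each frame, pick the nearest earlier frame in the same or a lower layer.

    With this (assumed) rule, dropping the highest layers never removes
    a frame that a remaining frame refers to.
    """
    refs = {}
    for i, layer in layers.items():
        candidates = [j for j in range(i) if layers[j] <= layer]
        refs[i] = max(candidates) if candidates else None
    return refs

layers = assign_layers(8, periods=[4, 2, 1])
refs = assign_references(layers)
print(layers)  # {0: 0, 1: 2, 2: 1, 3: 2, 4: 0, 5: 2, 6: 1, 7: 2}
print(refs)    # {0: None, 1: 0, 2: 0, 3: 2, 4: 0, 5: 4, 6: 4, 7: 6}
```

Playing back only layer 0 yields every fourth frame, layers 0 and 1 together yield every second frame, and all three layers yield the full frame rate.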

ENHANCED WI-FI SENSING MEASUREMENT SETUP AND SENSING TRIGGER FRAME FOR RESPONDER-TO-RESPONDER SENSING
20230033468 · 2023-02-02

This disclosure describes systems, methods, and devices related to responder-to-responder Wi-Fi sensing between station devices. A device may cause to send, during a trigger frame sounding phase of a responder-to-responder Wi-Fi sensing, a sensing responder-to-responder sounding trigger frame to a first station device and a second station device, the sounding trigger frame causing the first station device to send a responder-to-responder null data packet (NDP) to the second station device; cause to send, during a reporting phase of the responder-to-responder Wi-Fi sensing, a sensing report trigger frame to the second station device; and identify, during the reporting phase, a sensing measurement report from the second station device based on the sensing report trigger frame, wherein the sensing measurement report is indicative of measurements of the responder-to-responder NDP.

METHOD AND DEVICE USING HIGH LAYER SYNTAX ARCHITECTURE FOR CODING AND DECODING
20230086585 · 2023-03-23

A method of and a device for decoding a coded picture coded according to a video codec technology or standard that uses a syntax structure including a Picture Header and at least one Picture Parameter Set (PPS) are provided. The method includes decoding, by a decoder, the Picture Header, the Picture Header including transient information pertaining to a plurality of Coding Units of the coded picture, and the transient information of the Picture Header including at least one reference to the at least one PPS, and further including at least one first syntax element pertaining to an aspect of the video codec technology or standard for decoding. The method further includes activating, by the decoder, a PPS of the at least one PPS that is decoded, the PPS including a second syntax element pertaining to the aspect of the video codec technology or standard for decoding.
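The reference from a Picture Header to a PPS, and the activation of that PPS, can be sketched as below. The field names `init_qp` and `qp_delta` are hypothetical stand-ins for the second and first syntax elements, and combining them by addition is an assumption of this sketch; the abstract does not say how the two elements interact.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class PPS:
    pps_id: int
    init_qp: int   # hypothetical "second syntax element"

@dataclass
class PictureHeader:
    pps_id: int    # reference to a previously decoded PPS
    qp_delta: int  # hypothetical "first syntax element" for the same aspect

@dataclass
class Decoder:
    decoded_pps: Dict[int, PPS] = field(default_factory=dict)
    active_pps: Optional[PPS] = None

    def store_pps(self, pps: PPS) -> None:
        """Keep a decoded PPS available for later activation."""
        self.decoded_pps[pps.pps_id] = pps

    def decode_picture_header(self, header: PictureHeader) -> int:
        """Activate the PPS the header refers to and return the picture-level value.

        Adding the two elements is an assumption made for illustration.
        """
        self.active_pps = self.decoded_pps[header.pps_id]
        return self.active_pps.init_qp + header.qp_delta

decoder = Decoder()
decoder.store_pps(PPS(pps_id=0, init_qp=26))
decoder.store_pps(PPS(pps_id=1, init_qp=32))
picture_qp = decoder.decode_picture_header(PictureHeader(pps_id=1, qp_delta=-2))
```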