Patent classifications
H04N19/179
Constraint-modified selection of video encoding configurations
A video to be encoded to a plurality of different target encodings for bandwidth adaptive serving is received. The video is encoded into a plurality of different candidate encodings using different candidate encoding parameters. A quality metric is determined for each of the plurality of different candidate encodings. One or more different target quality metrics are selected for a first portion of the different target encodings based at least in part on one or more specified constraints for one or more target devices. One or more different target quality metrics are selected for a second portion of the different target encodings based at least in part on the determined quality metrics of the different candidate encodings. Based at least in part on the selected different target quality metrics for the first portion and the second portion, the plurality of different target encodings of the video is generated.
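The two-portion selection described in this abstract can be sketched as follows. This is an illustrative reading, not the patented implementation: the `Candidate` type, the clamping rule for constrained targets, and the even spacing of the remaining targets are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    bitrate_kbps: int
    quality: float  # a quality metric, e.g. a VMAF-like score in [0, 100]

def select_target_qualities(candidates, device_constraints, n_free_targets):
    """Select target quality metrics for two portions of the encoding set.

    Portion 1: one target per device constraint, clamped into the range of
    qualities actually observed among the candidate encodings.
    Portion 2: n_free_targets values spaced evenly between the worst and
    best observed candidate qualities.
    """
    qualities = sorted(c.quality for c in candidates)
    lo, hi = qualities[0], qualities[-1]
    # First portion: driven by specified constraints for target devices.
    portion1 = [min(max(q, lo), hi) for q in device_constraints]
    # Second portion: driven by the determined candidate quality metrics.
    step = (hi - lo) / (n_free_targets + 1)
    portion2 = [lo + step * (i + 1) for i in range(n_free_targets)]
    return portion1, portion2
```

Each selected target quality would then drive one target encoding, e.g. by picking the cheapest candidate parameters that reach that quality.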
Scene-aware video encoder system and method
Embodiments of the present disclosure disclose a scene-aware video encoder system. The scene-aware encoder system transforms a sequence of video frames of a video of a scene into a spatio-temporal scene graph. The spatio-temporal scene graph includes nodes representing one or multiple static and dynamic objects in the scene. Each node of the spatio-temporal scene graph describes an appearance, a location, and/or a motion of each of the objects (static and dynamic) at different time instances. The nodes of the spatio-temporal scene graph are embedded into a latent space using a spatio-temporal transformer encoding different combinations of different nodes of the spatio-temporal scene graph corresponding to different spatio-temporal volumes of the scene. Each node of the different nodes encoded in each of the combinations is weighted with an attention score determined as a function of similarities of spatio-temporal locations of the different nodes in the combination.
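The location-similarity attention described above can be sketched minimally. The Gaussian similarity kernel over squared spatio-temporal distance and the normalisation are assumptions for illustration, not the disclosed transformer.

```python
import math

def dist2(a, b):
    """Squared spatio-temporal distance between two (x, y, t) locations."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def attention_scores(locations, temperature=1.0):
    """Weight each scene-graph node by how similar its (x, y, t) location
    is to the other nodes' locations in the combination.

    Similarity falls off with squared distance (a Gaussian kernel is an
    assumed choice); scores are normalised to sum to 1.
    """
    n = len(locations)
    scores = []
    for i in range(n):
        sim = sum(
            math.exp(-dist2(locations[i], locations[j]) / temperature)
            for j in range(n) if j != i
        )
        scores.append(sim)
    total = sum(scores)
    return [s / total for s in scores]
```

Nodes clustered in the same spatio-temporal volume thus receive higher attention than isolated nodes.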
Signaling parameter value information in a parameter set to reduce the amount of data contained in an encoded video bitstream
A method performed by a decoder for decoding a bitstream comprising a picture parameter set, PPS, and a first set of slices. The method includes obtaining the picture parameter set. The method also includes decoding a syntax element included in the picture parameter set to obtain an indicator value. The decoder is configured such that if the indicator value is set to a first value then the decoder determines that a picture header included in the bitstream comprises a parameter value corresponding to a particular parameter, otherwise the decoder determines that each slice included in the first set of slices comprises a parameter value corresponding to the particular parameter. If the picture header comprises the parameter value corresponding to the particular parameter, then this parameter value is used to decode slice data of each slice included in the first set of slices.
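The branching behaviour of the decoder described above can be sketched as follows. The dictionary-based picture header and slices, and the parameter name, are hypothetical stand-ins for bitstream syntax structures.

```python
def resolve_parameter(pps_indicator, picture_header, slices, name):
    """Return the parameter value to use for each slice in the set.

    If the PPS syntax element decodes to the first value (1 here), the
    picture header carries the parameter once and it is shared by every
    slice; otherwise each slice carries its own copy of the parameter.
    """
    if pps_indicator == 1:
        value = picture_header[name]
        return [value] * len(slices)
    return [s[name] for s in slices]
```

Signalling the value once in the picture header instead of in every slice is what reduces the amount of data in the encoded bitstream.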
Event/object-of-interest centric timelapse video generation on camera device with the assistance of neural network input
An apparatus including an interface and a processor. The interface may be configured to receive pixel data generated by a capture device. The processor may be configured to generate video frames in response to the pixel data, perform computer vision operations on the video frames to detect objects, perform a classification of the objects detected based on characteristics of the objects, determine whether the classification of the objects corresponds to a user-defined event and generate encoded video frames from the video frames. The encoded video frames may be communicated to a cloud storage service. The encoded video frames may comprise a first sample of the video frames selected at a first rate when the user-defined event is not detected and a second sample of the video frames selected at a second rate while the user-defined event is detected. The second rate may be greater than the first rate.
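The dual-rate sampling described above can be sketched with per-frame event flags. Expressing the rates as frame strides (a smaller stride during an event means a higher sampling rate) is an assumption for illustration.

```python
def timelapse_select(frames, event_flags, normal_stride, event_stride):
    """Select a first sample of frames at a low rate outside events and a
    second, higher-rate sample while a user-defined event is detected.

    event_stride < normal_stride, so more frames are kept during events.
    """
    selected, count = [], 0
    for frame, in_event in zip(frames, event_flags):
        stride = event_stride if in_event else normal_stride
        if count % stride == 0:
            selected.append(frame)
        count += 1
    return selected
```

In the apparatus, `event_flags` would come from the classification of detected objects against the user-defined event.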
HIGH FRAME RATE-LOW FRAME RATE TRANSMISSION TECHNIQUE
A method for transmitting video content segments includes providing Low Frame Rate (LFR) and High Frame Rate (HFR) encoding mode designations for video content segments having static scenes and scenes with motion, respectively. Each video content segment is encoded in accordance with its encoding mode designation and then transmitted with its encoding mode designation to enable retrieval and decoding by a decoder. Encoded video content appears as LFR content, allowing it to be processed as LFR content by equipment unaware of the present encoding.
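The per-segment designation step can be sketched as a simple classifier over a motion metric. The metric itself and the threshold are assumptions; the patent does not specify how motion is measured.

```python
def designate_modes(motion_scores, threshold=0.1):
    """Assign an encoding mode designation to each video content segment:
    LFR for static scenes, HFR for scenes with motion.

    motion_scores is an assumed per-segment motion metric in [0, 1].
    """
    return ["HFR" if m > threshold else "LFR" for m in motion_scores]
```

Each segment would then be encoded at the frame rate its designation implies and transmitted together with that designation.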
SMART SMALL FORM-FACTOR PLUGGABLE (SFP) TRANSCEIVER
Approaches for processing video in a smart small form-factor pluggable (SFP) transceiver. The smart SFP transceiver may dynamically select, from a plurality of codecs available to the smart SFP transceiver, an appropriate codec for use in processing the video prior to the video being transmitted over a link. The selection of the codec may be based, at least in part, on assessed environmental attributes. The smart SFP transceiver may then use the codec selected by the smart SFP transceiver to process the video, e.g., the video may be encoded or compressed, or timing information may be generated.
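A possible codec-selection policy based on assessed environmental attributes could look like the following. The specific attributes (link bandwidth, CPU headroom), the codec names, the preference order, and the thresholds are all assumptions for illustration.

```python
def select_codec(available, link_bandwidth_mbps, cpu_headroom):
    """Dynamically pick a codec from those available to the transceiver.

    Policy (assumed): pass video through uncompressed when the link is
    fast enough, prefer the stronger codec when there is CPU headroom to
    run it, and otherwise fall back to a lighter codec.
    """
    if link_bandwidth_mbps >= 1000 and "raw" in available:
        return "raw"
    if cpu_headroom > 0.5 and "h265" in available:
        return "h265"
    if "h264" in available:
        return "h264"
    return available[0]
```

The point of the sketch is that the selection is re-evaluated from measured attributes rather than fixed at provisioning time.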
Video signal encoding/decoding method and device therefor
A video decoding method according to the present invention may comprise: a step for determining whether to divide a current block into a plurality of sub-blocks; a step for determining an intra prediction mode for the current block; and a step for performing intra prediction for each sub-block on the basis of the intra prediction mode, when the current block is divided into the plurality of sub-blocks.
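The division step can be sketched as follows. A quad-split into four equal sub-blocks is an assumed partition for illustration (codecs also allow horizontal or vertical splits); what the abstract fixes is that every sub-block reuses the intra prediction mode determined for the current block.

```python
def split_into_subblocks(x, y, w, h, split):
    """Return the regions on which intra prediction is performed.

    If no split is requested, the current block is predicted whole;
    otherwise it is divided (here: a quad-split, an assumed choice) and
    each sub-block is predicted with the block's single intra mode.
    """
    if not split:
        return [(x, y, w, h)]
    hw, hh = w // 2, h // 2
    return [(x, y, hw, hh), (x + hw, y, hw, hh),
            (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
```

Predicting each sub-block separately lets later sub-blocks use reconstructed samples from earlier ones as reference, while signalling only one intra mode.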
Method and apparatus for encapsulating images or sequences of images with proprietary information in a file
A method of encapsulating entities in a file, wherein the method comprises, for at least one entity: generating a grouping data structure associated with at least one of the entities and indicating that the at least one of the entities belongs to a same group; and encapsulating the grouping data structure and the entities in the file; wherein the grouping data structure is a proprietary grouping data structure comprising a universally unique identifier identifying the type of the proprietary grouping.
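A minimal serialisation of such a grouping data structure could look like the following. The size/type box framing is borrowed from common media-file conventions, but the exact payload layout here (16-byte UUID followed by 32-bit entity IDs) is a simplified stand-in, not the actual file-format syntax.

```python
import struct
import uuid

def make_uuid_grouping_box(group_uuid, entity_ids):
    """Serialise a proprietary grouping data structure.

    Layout (assumed): 32-bit big-endian size, 4-byte type 'uuid', the
    16-byte universally unique identifier naming the proprietary grouping
    type, then one 32-bit ID per grouped entity.
    """
    payload = group_uuid.bytes + b"".join(
        struct.pack(">I", e) for e in entity_ids
    )
    size = 8 + len(payload)
    return struct.pack(">I4s", size, b"uuid") + payload
```

A reader that does not recognise the UUID can skip the whole structure by its size field, which is what makes proprietary groupings safe to mix with standard ones.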