H04N19/179

Methods for generating video- and audience-specific encoding ladders with audio and video just-in-time transcoding

A method including: populating an encoding ladder with a subset of bitrate-resolution pairs, from a set of bitrate-resolution pairs, based on a distribution of audience bandwidths; receiving a first request for a first playback segment, at a first bitrate-resolution pair in the encoding ladder, in the video from a first device; in response to determining an absence of video segments, at the first bitrate-resolution pair and corresponding to the first playback segment, in a first rendition cache: identifying a first set of mezzanine segments, in the video, corresponding to the first playback segment; assigning the first set of mezzanine segments to a set of workers for transcoding into a first set of video segments according to the first bitrate-resolution pair; storing the first set of video segments in the first rendition cache; and based on the first request, releasing the first set of video segments to the first device.
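One way to read the ladder-population step is as a quantile match: pick the bitrate-resolution pairs whose bitrates best cover the observed audience bandwidth distribution. A minimal sketch of that idea (function names and the quantile heuristic are hypothetical, not taken from the claim text):

```python
# Sketch: populate an "encoding ladder" with a subset of bitrate-resolution
# pairs so the chosen bitrates track quantiles of the audience bandwidth
# distribution. All names and the quantile rule are illustrative.

def populate_ladder(pairs, audience_bandwidths, rungs=3):
    """pairs: list of (bitrate_kbps, resolution), sorted by bitrate.
    audience_bandwidths: observed per-viewer bandwidths in kbps."""
    bws = sorted(audience_bandwidths)
    ladder = []
    for i in range(rungs):
        # Target the (i+1)/(rungs+1) quantile of audience bandwidth.
        q = bws[int(len(bws) * (i + 1) / (rungs + 1))]
        # Keep the highest-bitrate pair that still fits under that bandwidth.
        fitting = [p for p in pairs if p[0] <= q]
        if fitting:
            best = max(fitting, key=lambda p: p[0])
            if best not in ladder:
                ladder.append(best)
    return ladder

pairs = [(400, "360p"), (1200, "720p"), (3500, "1080p"), (8000, "4K")]
audience = [500, 900, 1500, 2500, 4000, 6000, 9000, 12000]
ladder = populate_ladder(pairs, audience)
print(ladder)  # [(1200, '720p'), (3500, '1080p'), (8000, '4K')]
```

The cache-miss path in the claim (transcode mezzanine segments on demand, store them, then release them) would then only ever produce renditions at these selected pairs.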

Fast multi-rate encoding for adaptive HTTP streaming

According to embodiments of the disclosure, information from higher- and lower-quality encoded video segments is used to limit Rate-Distortion Optimization (RDO) for each Coding Tree Unit (CTU). The method first encodes the highest-bitrate segment and then uses it to encode the lowest-bitrate video segment. The block structure and selected reference frames of both the highest- and lowest-bitrate video segments are used to predict and shorten the RDO process for each CTU at the middle bitrates. With parallel processing, the method introduces a delay of just one frame. This approach reduces time complexity relative to the reference software for middle bitrates, while quality degradation is negligible.
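The core shortcut can be sketched as bounding the split-depth search for a middle-bitrate CTU by the depths already chosen at the two extreme bitrates. The bounding rule below is an illustrative simplification, not the paper's exact scheme:

```python
# Sketch: for each CTU at a middle bitrate, restrict the RDO depth search
# to the range spanned by the depths chosen at the highest- and
# lowest-bitrate renditions of the same CTU. Depth 0..3 mirrors HEVC's
# four CU split depths; the bounding rule itself is illustrative.

def limited_depth_range(depth_high, depth_low):
    """Depths searched for a middle bitrate, bounded by the depths chosen
    at the highest-bitrate (depth_high) and lowest-bitrate (depth_low)
    encodings of the same CTU."""
    lo, hi = min(depth_high, depth_low), max(depth_high, depth_low)
    return list(range(lo, hi + 1))

def rdo_depths_skipped(depth_high, depth_low, full_range=4):
    """How many split depths are skipped versus a full 0..3 search."""
    return full_range - len(limited_depth_range(depth_high, depth_low))

print(limited_depth_range(3, 1))  # [1, 2, 3]: only depths 1..3 searched
print(rdo_depths_skipped(2, 2))   # 3: both extremes agree, one depth left
```

When the two extreme renditions agree on a CTU's structure, the middle-bitrate search collapses to a single candidate, which is where most of the time-complexity reduction comes from.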

Techniques for optimizing encoding tasks
11539966 · 2022-12-27

In various embodiments, a shot collation application causes multiple encoding instances to encode a source video sequence that includes at least two shot sequences. The shot collation application assigns a first shot sequence to a first chunk. Subsequently, the shot collation application determines that a second shot sequence does not meet a collation criterion with respect to the first chunk. Consequently, the shot collation application assigns the second shot sequence or a third shot sequence derived from the second shot sequence to a second chunk. The shot collation application causes a first encoding instance to independently encode each shot sequence assigned to the first chunk. Similarly, the shot collation application causes a second encoding instance to independently encode each shot sequence assigned to the second chunk. Finally, a chunk assembler combines the first encoded chunk and the second encoded chunk to generate an encoded video sequence.
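The collation logic above can be sketched as a simple packing loop: shots accumulate into the current chunk until a collation criterion would be violated, then a new chunk starts. Here the criterion is a hypothetical maximum chunk duration; the patent leaves the criterion general:

```python
# Sketch: collate shot sequences into chunks for independent encoding
# instances. A shot that would push the current chunk past the (assumed)
# duration budget starts a new chunk instead.

def collate_shots(shot_durations, max_chunk_seconds=60):
    """shot_durations: per-shot durations in seconds.
    Returns chunks as lists of shot indices."""
    chunks, current, current_len = [], [], 0.0
    for i, dur in enumerate(shot_durations):
        if current and current_len + dur > max_chunk_seconds:
            chunks.append(current)          # close the current chunk
            current, current_len = [], 0.0
        current.append(i)                   # assign this shot to the chunk
        current_len += dur
    if current:
        chunks.append(current)
    return chunks

# Shots of 40s, 30s, 20s, 50s against a 60s chunk budget:
chunks = collate_shots([40, 30, 20, 50])
print(chunks)  # [[0], [1, 2], [3]]
```

Each resulting chunk maps to one encoding instance, and a chunk assembler concatenates the encoded chunks back into a single encoded video sequence.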

Game application providing scene change hint for encoding at a cloud gaming server

A method for encoding including executing game logic built on a game engine of a video game at a cloud gaming server to generate video frames. The method including executing scene change logic to predict a scene change in the video frames based on game state collected during execution of the game logic. The method including identifying a range of video frames that is predicted to include the scene change. The method including generating a scene change hint using the scene change logic, wherein the scene change hint identifies the range of video frames, wherein the range of video frames includes a first video frame. The method including delivering the first video frame to an encoder. The method including sending the scene change hint from the scene change logic to the encoder. The method including encoding the first video frame as an I-frame based on the scene change hint.
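The encoder-side handling of the hint can be sketched as a per-frame decision: frames inside the hinted range are candidates for an I-frame, and (in this simplified reading) the first frame of the range is encoded as the I-frame. Function and field names are illustrative, not from the patent:

```python
# Sketch: an encoder consuming a scene-change hint from game logic.
# hint_range is the (start, end) frame range the scene-change logic
# predicted to contain the cut; the "first frame of the range becomes
# the I-frame" rule is a simplifying assumption.

def encode_frame(frame_index, hint_range):
    """hint_range: (start, end) inclusive frame range from the hint,
    or None when no hint was sent."""
    if hint_range is not None:
        start, end = hint_range
        if start <= frame_index <= end and frame_index == start:
            return "I-frame"   # cut predicted here: encode intra
    return "P-frame"           # otherwise encode predictively

print(encode_frame(120, (120, 124)))  # I-frame
print(encode_frame(121, (120, 124)))  # P-frame
print(encode_frame(50, None))         # P-frame
```

Forcing the I-frame at the predicted cut avoids the bitrate spike and quality drop of a mispredicted inter-coded frame straddling the scene change.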

APPARATUS, A METHOD AND A COMPUTER PROGRAM FOR OMNIDIRECTIONAL VIDEO

There are disclosed various methods, apparatuses and computer program products for video encoding and decoding. In some embodiments the method for video encoding comprises obtaining compressed volumetric video data representing a three-dimensional scene or object (71); capsulating the compressed volumetric video data into a data structure (72); obtaining data of a two-dimensional projection of at least a part of the three-dimensional scene as seen from a certain viewport (73); and including the data of the two-dimensional projection into the data structure (74).
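Steps 71 through 74 amount to packing two payloads into one container: the compressed volumetric data, plus one or more viewport-specific 2-D projections. A minimal sketch of that container (the dataclass and field names are hypothetical; a real implementation would use an ISOBMFF-style box structure):

```python
# Sketch: capsulate compressed volumetric video data (steps 71/72) and
# viewport-specific 2-D projections (steps 73/74) into one container
# record. All names are illustrative stand-ins for a file-format box.

from dataclasses import dataclass, field

@dataclass
class VolumetricContainer:
    volumetric_payload: bytes                        # compressed 3-D scene
    projections: dict = field(default_factory=dict)  # viewport id -> 2-D data

    def add_projection(self, viewport_id, image_bytes):
        # Include the 2-D projection as seen from this viewport.
        self.projections[viewport_id] = image_bytes

box = VolumetricContainer(volumetric_payload=b"compressed-scene")
box.add_projection("front", b"projection-bytes")
print(sorted(box.projections))  # ['front']
```

Carrying the pre-rendered 2-D projection alongside the volumetric data lets a receiver without a volumetric decoder still present the scene from the chosen viewport.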

Content adaptive encoding

The described technology is generally directed towards developing an adaptive bitrate stack (ladder) on a per-title basis. Variable bitrate encodings are used to obtain complexity information for a title and per-frame scores for those encodings; another encoding provides scene data. The complexity information is analyzed and processed based on the scene data to determine scene-based (e.g., objective and/or subjective quality) scores, which are used to score the encodings. These results are used to derive a candidate stack comprising resolutions and bitrates that yield desirable quality. The candidate stack is evaluated by encoding the title with it, and the resulting encodings are evaluated to select one resolution from any duplicate resolutions for a bitrate (e.g., based on relative quality), yielding a pruned, final ladder that is associated with the title as the adaptive bitrate stack used for streaming that title's content.
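The final pruning step can be sketched directly: when the candidate stack contains more than one resolution at the same bitrate, keep only the one with the better measured quality. The entries and scores below are illustrative; the abstract leaves the quality metric open (objective and/or subjective):

```python
# Sketch: prune a candidate stack so each bitrate keeps only its
# highest-quality resolution. Tuples are (bitrate_kbps, resolution,
# quality_score); all values are hypothetical.

def prune_stack(candidates):
    best = {}
    for bitrate, res, score in candidates:
        if bitrate not in best or score > best[bitrate][1]:
            best[bitrate] = (res, score)
    # Final ladder sorted by bitrate, one resolution per bitrate.
    return [(b, res) for b, (res, _) in sorted(best.items())]

candidates = [
    (1500, "720p", 82.0),
    (1500, "1080p", 78.5),   # duplicate bitrate, lower quality: pruned
    (3000, "1080p", 90.1),
]
final_ladder = prune_stack(candidates)
print(final_ladder)  # [(1500, '720p'), (3000, '1080p')]
```

Note the pruning can keep the lower resolution at a given bitrate when upscaling the lower resolution scores better than the more heavily compressed higher one.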

Multivariate Rate Control for Transcoding Video Content
20230101806 · 2023-03-30

A learning model is trained for rate-distortion behavior prediction against a corpus of a video hosting platform and used to determine optimal bitrate allocations for video data given video content complexity across the corpus of the video hosting platform. Complexity features of the video data are processed using the learning model to determine a rate-distortion cluster prediction for the video data, and transcoding parameters for transcoding the video data are selected based on that prediction. The rate-distortion clusters are modeled during the training of the learning model, such as based on rate-distortion curves of video data of the corpus of the video hosting platform and based on classifications of such video data. This approach minimizes total corpus egress and/or storage while further maintaining uniformity in the delivered quality of videos by the video hosting platform.
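The lookup at transcode time reduces to two steps: map a video's complexity features to a rate-distortion cluster, then read off the transcoding parameters tuned for that cluster. The nearest-centroid "model" and parameter table below are hypothetical stand-ins for the trained learning model described above:

```python
# Sketch: predict a rate-distortion cluster from complexity features and
# select transcoding parameters for it. Centroids, feature meanings, and
# bitrates are illustrative; the real system uses a trained learning model.

CLUSTER_CENTROIDS = {       # cluster id -> (spatial, temporal) complexity
    "low":  (0.2, 0.1),
    "mid":  (0.5, 0.5),
    "high": (0.9, 0.8),
}
CLUSTER_PARAMS = {          # cluster id -> illustrative target bitrate (kbps)
    "low": 800, "mid": 2500, "high": 6000,
}

def predict_cluster(features):
    """Nearest centroid in (spatial, temporal) complexity space."""
    def dist(cluster):
        cx, cy = CLUSTER_CENTROIDS[cluster]
        return (features[0] - cx) ** 2 + (features[1] - cy) ** 2
    return min(CLUSTER_CENTROIDS, key=dist)

def transcode_bitrate(features):
    return CLUSTER_PARAMS[predict_cluster(features)]

print(transcode_bitrate((0.85, 0.9)))  # 6000: high-complexity cluster
```

Because simple content lands in clusters with low bitrate allocations, the corpus-wide effect is lower egress and storage at roughly uniform delivered quality.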