H04N19/87

Dynamic compression of audio-visual data

Disclosed are techniques for dynamic compression of audio-visual data, including a method for reducing a size of media content, comprising identifying a scene to be captured by a capture device, wherein the scene comprises a plurality of objects, and determining whether at least one of: (i) an available storage in the capture device to store a digital media file associated with the scene; and (ii) an available network bandwidth to transfer the digital media file is below an associated threshold. In response to the threshold determination, some embodiments may further comprise analyzing the plurality of objects to determine which objects can be redeveloped by a GAN regeneration module to a threshold quality level and which objects cannot be redeveloped with the GAN regeneration module to the threshold quality level, generating an optimized capture plan based on the analyzing, and encoding the scene pursuant to the optimized capture plan.

Metrics and messages to improve experience for 360-degree adaptive streaming
11166072 · 2021-11-02

A method for receiving and displaying media content may be provided. The method may include requesting a set of DASH video segments that are associated with various viewports and qualities. The method may include displaying the DASH video segments. The method may include determining a latency metric based on a time difference between the display of a DASH video segment and one of: a device beginning to move, the device ceasing to move, the device determining that the device has begun to move, the device determining that the device has stopped moving, or the display of a different DASH video segment. The different DASH video segment may be associated with one or more of a different quality or a different viewport.

Using motion compensated temporal filter (MCTF) statistics for scene change detection when a fade, dissolve or cut occurs

A method is provided to better detect a scene change to provide a prediction to an encoder to enable more efficient encoding. The method uses a Motion Compensated Temporal Filter (MCTF) that provides motion estimation and is located prior to an encoder. The MCTF provides a Motion Compensated Residual (MCR) used to detect the scene change transition. When a scene is relatively stable, the MCR score is also relatively stable. However, when a scene transition is in process, the MCR score behavior changes. Algorithmically, the MCR score is used by comparing the sliding mean of the MCR score to the sliding median. This comparison highlights the transition points. In the case of a scene cut, the MCR score exhibits a distinct spike. In the case of a fade or dissolve, the MCR score exhibits a transitional period of degradation followed by recovery. By implementing the above detection using the MCR, the location of the I-pictures in the downstream encoding process can be accurately determined for the encoder.
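
The mean-versus-median comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not state the window length or the decision rule, so the function name `detect_transitions`, the `window` and `ratio` parameters, and the ratio test itself are all assumptions made for illustration, not the patented method.

```python
from statistics import mean, median


def detect_transitions(mcr_scores, window=5, ratio=1.5):
    """Return frame indices where the sliding mean exceeds the sliding median.

    Assumption: a transition is flagged when mean / median > ratio.
    The abstract only says the two statistics are compared.
    """
    flagged = []
    for i in range(window - 1, len(mcr_scores)):
        recent = mcr_scores[i - window + 1 : i + 1]
        med = median(recent)
        avg = mean(recent)
        # An outlier barely moves the median but pulls the mean upward,
        # so a large mean/median ratio marks a departure from a stable scene.
        if med > 0 and avg / med > ratio:
            flagged.append(i)
    return flagged


# Stable scene, one spike at frame 10 (a hypothetical scene cut), stable again.
scores = [10] * 10 + [100] + [10] * 10
print(detect_transitions(scores))
```

With a window of five frames, the single spike stays inside the window for five consecutive positions, so a practical detector would likely report only the first flagged index of each run as the transition point.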

SYSTEMS AND METHODS FOR COMPRESSING VIDEO
20220417524 · 2022-12-29

Systems, methods, and apparatuses are described for compressing digital content. The digital content may comprise a plurality of frames. The plurality of frames may comprise a plurality of crossfade frames. A first boundary frame of the crossfade frames may be determined. A second boundary frame of the crossfade frames may be determined. At least a portion of the crossfade frames may be coded as inter-predicted frames using a weighting factor and based on the first boundary frame or the second boundary frame.
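
As a rough illustration of why a weighting factor helps with crossfades, the sketch below models a crossfade frame as a weighted blend of the two boundary frames and estimates the weight by least squares. The blend model, the use of both boundary frames at once, and the names `estimate_weight` and `predict` are assumptions for illustration; the abstract states only that a weighting factor and the first or second boundary frame are used.

```python
def estimate_weight(frame, first_boundary, second_boundary):
    """Least-squares weight w for frame ~= w * first + (1 - w) * second.

    Frames are flat lists of pixel intensities (an assumed representation).
    """
    num = 0.0
    den = 0.0
    for f, a, b in zip(frame, first_boundary, second_boundary):
        num += (f - b) * (a - b)
        den += (a - b) ** 2
    if den == 0:
        # Boundary frames are identical, so any weight predicts equally well.
        return 0.0
    return num / den


def predict(first_boundary, second_boundary, w):
    """Weighted prediction of a crossfade frame from the boundary frames."""
    return [w * a + (1 - w) * b for a, b in zip(first_boundary, second_boundary)]


first = [100, 50, 200, 0]
second = [0, 150, 100, 80]
crossfade = [25, 125, 125, 60]  # 25% of first blended with 75% of second

w = estimate_weight(crossfade, first, second)
residual = [f - p for f, p in zip(crossfade, predict(first, second, w))]
print(w, residual)
```

When the blend model holds, the residual left for the encoder is close to zero, which is the intuition behind coding crossfade frames as inter-predicted frames rather than as independent pictures.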

Video management

The disclosure relates to a method of processing a sequence of image frames to reduce its length. One implementation may involve extracting coefficients (e.g., Discrete Cosine Transform coefficients) from components of individual frames, and comparing the resulting coefficients for sequential frames to identify frames having the least change from a prior frame. Also, scene change values for each frame may be calculated and placed in a sorted list to facilitate identification of frames for removal. Frame removal may be conducted in rounds, where a group of pictures (GOP) may only have one frame removed for any given round.
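
The round-based removal described in this abstract might look like the following sketch. It assumes the per-frame scene change values have already been computed (for example from DCT coefficient differences), that GOPs are fixed-size groups addressed by original frame index, and that every frame is eligible for removal; the names `removal_round` and `shorten` are hypothetical.

```python
def removal_round(change_values, gop_size, removed):
    """Pick at most one frame per GOP, preferring the smallest change value."""
    # Sorted list of (scene change value, frame index) for frames still present.
    ranked = sorted(
        (value, index)
        for index, value in enumerate(change_values)
        if index not in removed
    )
    used_gops = set()
    picked = []
    for value, index in ranked:
        gop = index // gop_size
        if gop not in used_gops:
            # Each GOP gives up only one frame in any given round.
            used_gops.add(gop)
            picked.append(index)
    return picked


def shorten(change_values, gop_size, frames_to_remove):
    """Run rounds until the requested number of frames has been removed."""
    removed = set()
    while len(removed) < frames_to_remove:
        picked = removal_round(change_values, gop_size, removed)
        if not picked:
            break  # nothing left to remove
        for index in picked:
            if len(removed) == frames_to_remove:
                break
            removed.add(index)
    return sorted(removed)


values = [9, 1, 2, 8, 7, 3, 6, 5]  # two GOPs of four frames each
print(shorten(values, gop_size=4, frames_to_remove=3))
```

Limiting each GOP to one removal per round spreads the dropped frames across the sequence instead of concentrating them in a single low-motion stretch.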

Optimizing encoding operations when generating encoded versions of a media title

In various embodiments, a sequence-based encoding application partitions a set of shot sequences associated with a media title into multiple clusters based on at least one feature that characterizes media content and/or encoded media content associated with the media title. The clusters include at least a first cluster and a second cluster. The sequence-based encoding application encodes a first shot sequence using a first operating point to generate a first encoded shot sequence. The first shot sequence and the first operating point are associated with the first cluster. By contrast, the sequence-based encoding application encodes a second shot sequence using a second operating point to generate a second encoded shot sequence. The second shot sequence and the second operating point are associated with the second cluster. Subsequently, the sequence-based encoding application generates an encoded media sequence based on the first encoded shot sequence and the second encoded shot sequence.

VIDEO PROCESSING METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
20230362416 · 2023-11-09

This application discloses a video processing method performed by a computer device. The method includes: inputting a target video and a video mask to an encoding model for feature extraction to obtain a first video feature vector and a second video feature vector; determining an index distribution of the first quantization feature vector in a discrete hidden space composed on the basis of the first quantization feature vector; determining a second quantization feature vector in the discrete hidden space on the basis of the second video feature vector and the index distribution of the first quantization feature vector; and inputting the first quantization feature vector and the second quantization feature vector to a decoding model to obtain a reconstructed video, the reconstructed video referring to a video with a content of the masked region of the target video filled in accordance with the second quantization feature vector.