H04N19/142

Spatiotemporal prediction for bidirectionally predictive (B) pictures and motion vector prediction for multi-picture reference motion compensation

Several improvements for use with Bidirectionally Predictive (B) pictures within a video sequence are provided. In certain improvements, Direct Mode encoding and/or Motion Vector Prediction are enhanced using spatial prediction techniques. In other improvements, Motion Vector Prediction takes temporal distance and subblock information into account, for example, to predict more accurately. Such improvements, and others presented herein, significantly improve the performance of any applicable video coding system/logic.
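As a rough illustration of the two ideas named in this abstract, the sketch below shows a spatial predictor (a component-wise median of neighboring motion vectors) and a rescaling of a motion vector by temporal distance. The function names, the choice of three neighbors, and the sample values are assumptions made for illustration; the abstract does not specify the exact rules used.

```python
from statistics import median


def median_mv_predictor(left, above, above_right):
    """Spatial predictor: component-wise median of three neighboring motion vectors."""
    xs, ys = zip(left, above, above_right)
    return (median(xs), median(ys))


def scale_by_temporal_distance(mv, target_distance, source_distance):
    """Rescale a motion vector that spans source_distance pictures so that it
    spans target_distance pictures (distances are signed picture offsets)."""
    if source_distance == 0:
        raise ValueError("source_distance must be non-zero")
    factor = target_distance / source_distance
    return (mv[0] * factor, mv[1] * factor)


# Motion vectors are (x, y) displacements; the values are illustrative only.
predictor = median_mv_predictor((1, 2), (3, -4), (2, 8))
scaled = scale_by_temporal_distance((8, -4), target_distance=1, source_distance=2)
```

The median is taken per component, so the predicted vector need not equal any single neighbor. Real codecs typically use fixed-point arithmetic for the scaling step; floating point is used here only to keep the sketch short.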

INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING SYSTEM, AND INFORMATION PROCESSING METHOD
20230095186 · 2023-03-30 ·

An information processing device (10, 10A) includes a band requesting unit (162, 162A), an adjusting unit (164, 164A), and a transmitting unit (140). The band requesting unit (162, 162A) requests, according to a bandwidth necessary for transmitting information including a moving image, a use reservation of the bandwidth. The adjusting unit (164, 164A) adjusts, according to a result of the request by the band requesting unit and a reserved bandwidth, an information amount of information to be transmitted. The transmitting unit (140) converts the information with the adjusted information amount into a transmission signal and transmits the transmission signal.
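The adjusting step can be pictured with a minimal sketch, assuming bandwidth is expressed in kbit/s and that the information amount is adjusted simply by capping the send rate at the reserved bandwidth. The function name and the returned scale factor are hypothetical and are not taken from the application.

```python
def adjust_information_amount(required_kbps, reserved_kbps):
    """Return (target_kbps, scale), where scale is the fraction of the
    originally required rate that fits into the reserved bandwidth."""
    if required_kbps <= 0:
        raise ValueError("required_kbps must be positive")
    # Never send more than was reserved, and never more than is needed.
    target_kbps = min(required_kbps, max(reserved_kbps, 0))
    return target_kbps, target_kbps / required_kbps


# The request asked for 8000 kbit/s but only 6000 kbit/s was reserved.
target, scale = adjust_information_amount(8000, 6000)
```

A sender could pass the resulting scale to its encoder, for example to lower the bitrate or resolution before the transmitting unit converts the data into a transmission signal.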

VIDEO ENCODING OPTIMIZATION FOR MACHINE LEARNING CONTENT CATEGORIZATION
20230095541 · 2023-03-30 ·

Systems, apparatuses, and methods for performing machine learning content categorization leveraging video encoding pre-processing are disclosed. A system includes at least a motion vector unit and a machine learning (ML) engine. The motion vector unit pre-processes a frame to determine if there is temporal locality with previous frames. If the objects of the scene have not changed by a threshold amount, then the ML engine does not process the frame, saving computational resources that would typically be used. Otherwise, if there is a change of scene or other significant changes, then the ML engine is activated to process the frame. The ML engine can then generate a QP map and/or perform content categorization analysis on this frame and a subset of the other frames of the video sequence.
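The gating decision described above might look like the following sketch. It assumes that the amount of change is measured as the mean magnitude of the frame's motion vectors and that a fixed threshold is used; both the metric and the default threshold are assumptions, since the abstract only speaks of a "threshold amount".

```python
import math


def mean_motion_magnitude(motion_vectors):
    """Average length of the (dx, dy) motion vectors of one frame."""
    if not motion_vectors:
        return 0.0
    total = sum(math.hypot(dx, dy) for dx, dy in motion_vectors)
    return total / len(motion_vectors)


def should_run_ml_engine(motion_vectors, threshold=2.0):
    """Activate the ML engine only when the frame changed by more than the threshold."""
    return mean_motion_magnitude(motion_vectors) > threshold


static_frame = [(0, 0), (1, 0)]   # nearly unchanged content
moving_frame = [(3, 4), (6, 8)]   # large displacements
```

With these inputs, the nearly static frame would be skipped and the moving frame would be handed to the ML engine.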

CONTENT ADAPTIVE ENCODING
20230085928 · 2023-03-23 ·

The described technology is generally directed towards developing an adaptive bitrate stack (ladder) on a per-title basis. Variable bitrate encodings are used to obtain complexity information for a title and per-frame scores for the encodings; another encoding provides scene data. The complexity information is analyzed and processed based on the scene data to determine scene-based (e.g., objective and/or subjective quality) scores, which are used to determine scores for the encodings. The results are used to derive a candidate stack, comprising various resolutions and bitrates that provide desirable results. The candidate stack is evaluated by encoding the title using the candidate stack. These encodings are evaluated to select one resolution from any duplicate resolutions for a bitrate (e.g., based on relative quality), resulting in a pruned, final ladder that is associated with the title as the adaptive bitrate stack to be used for streaming that title's content.
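The final pruning step can be sketched as follows, assuming each candidate is a (bitrate, frame height, quality score) tuple and that the highest score wins when several resolutions share a bitrate. The tuple layout, the function name, and the sample scores are hypothetical.

```python
def prune_ladder(candidates):
    """Keep one resolution per bitrate: the one with the highest quality score.

    candidates: iterable of (bitrate_kbps, height, quality_score) tuples.
    Returns a list of (bitrate_kbps, height) sorted by ascending bitrate.
    """
    best = {}
    for bitrate, height, score in candidates:
        if bitrate not in best or score > best[bitrate][1]:
            best[bitrate] = (height, score)
    return [(bitrate, best[bitrate][0]) for bitrate in sorted(best)]


candidate_stack = [
    (3000, 720, 88.0),
    (3000, 1080, 91.5),
    (1500, 720, 80.0),
    (1500, 540, 82.0),
]
final_ladder = prune_ladder(candidate_stack)
```

In this example the lower resolution wins at 1500 kbit/s, which reflects the per-title idea: the best resolution for a given bitrate depends on the content.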

AI PREDICTION FOR VIDEO COMPRESSION
20220345715 · 2022-10-27 ·

A method is proposed for encoding a first image within a first set of images, in which the first image is divided into blocks and each block is encoded according to one of a plurality of coding modes. For a current block of the first image, the method comprises determining, on the basis of at least one second image that is distinct from the first image and was previously encoded according to an encoding sequence of the images of the first set, a prediction of a feature of the current block in one or more third images of the first set that are distinct from the first image and not yet encoded according to the encoding sequence. The prediction is then used to encode the current block while minimizing a rate-distortion criterion.
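Mode decisions of this kind are commonly made by minimizing a Lagrangian cost of the form J = D + λR, where D is the distortion and R is the rate in bits. The sketch below shows only that selection step, with hypothetical mode names and cost values; how the predicted feature from future images would alter D, R or λ is not stated in the abstract and is not modeled here.

```python
def choose_coding_mode(candidates, lagrange_multiplier):
    """Return the mode whose cost J = D + lambda * R is smallest.

    candidates: mapping of mode name -> (distortion, rate_bits).
    """
    def cost(mode):
        distortion, rate_bits = candidates[mode]
        return distortion + lagrange_multiplier * rate_bits

    return min(candidates, key=cost)


# Illustrative (distortion, rate) pairs for one block.
modes = {"intra": (100.0, 40), "inter": (120.0, 10), "skip": (200.0, 1)}
```

A larger multiplier penalizes rate more heavily, so the choice shifts toward cheaper modes as the multiplier grows.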

METHOD AND APPARATUS FOR TEMPORAL FILTER IN VIDEO CODING

Aspects of the disclosure provide methods and apparatuses for video processing. In some examples, an apparatus for video processing includes processing circuitry. The processing circuitry determines one or more parameters of a temporal filter based on video contents in an uncompressed video. The uncompressed video includes a sequence of frames. Then, the processing circuitry applies the temporal filter with the determined one or more parameters to a first pixel in a first frame to determine a filtered value for the first pixel based on the first pixel in the first frame and second pixels in a group of reference frames for the first frame. Further, the processing circuitry encodes a filtered video that includes the filtered value for the first pixel in the first frame to generate a coded video bitstream that carries the filtered video.
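A minimal sketch of such a filter for a single pixel is given below. It assumes the filtered value is a weighted average of the current pixel and the co-located reference pixels, with weights that shrink as a reference pixel differs more from the current one. The parameters `strength` and `sigma` stand in for the content-dependent filter parameters and are hypothetical; the weighting formula is an assumption and not the one from the disclosure.

```python
import math


def temporal_filter_pixel(current, references, strength=0.5, sigma=10.0):
    """Blend a pixel with co-located pixels from reference frames.

    current: pixel value in the frame being filtered.
    references: co-located pixel values from the reference frames.
    strength: maximum weight of any single reference pixel.
    sigma: controls how quickly the weight decays with pixel difference.
    """
    weighted_sum = float(current)
    weight_total = 1.0  # the current pixel always has weight 1
    for ref in references:
        diff = ref - current
        weight = strength * math.exp(-(diff * diff) / (2.0 * sigma * sigma))
        weighted_sum += weight * ref
        weight_total += weight
    return weighted_sum / weight_total


filtered = temporal_filter_pixel(100, [110])
```

Because dissimilar reference pixels receive small weights, the filter reduces noise in static regions while leaving moving content largely untouched.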