H04N19/166

System and method for compressing video for streaming video game content to remote clients

Methods for hosting online video games are provided. One method includes generating a plurality of video frames and initiating a sending of each one of the plurality of video frames to a client. Each of the video frames that is sent is compressed. The method includes stopping the compression and sending of video frames when one of the plurality of video frames is taking longer than a frame time to compress and send. A frame time is defined as one over a frame rate, and stopping the compression of video frames includes ignoring the video frames by an encoder. The method includes continuing to compress and send audio data to the client when one or more of the plurality of video frames are not sent to the client. The client is configured to display a received video frame for more than one frame time when a subsequent video frame is not received.
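
The frame-time rule in this abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the patented implementation: the function names, the per-frame cost list, and the rule "a frame is ignored if the encoder is still busy when it is generated" are all hypothetical choices made for illustration. Audio is not modeled; per the abstract it would continue to be compressed and sent regardless of the video decision.

```python
def frame_time(frame_rate_hz):
    """Frame time is one over the frame rate, in seconds."""
    return 1.0 / frame_rate_hz


def schedule_frames(costs, frame_rate_hz):
    """Decide, per generated frame, whether the encoder handles or ignores it.

    costs[i] is a hypothetical time in seconds needed to compress and send
    frame i. Frame i is generated at time i * frame_time. If an earlier frame
    is still being compressed and sent at that moment (it took longer than
    one frame time), frame i is ignored by the encoder.
    """
    ft = frame_time(frame_rate_hz)
    busy_until = 0.0
    decisions = []
    for i, cost in enumerate(costs):
        generated_at = i * ft
        if generated_at < busy_until:
            decisions.append("ignored")   # encoder skips this frame entirely
        else:
            decisions.append("sent")
            busy_until = generated_at + cost
    return decisions


# At 50 frames per second the frame time is 20 ms. The second frame takes
# 45 ms, so the two frames generated while it is in flight are ignored.
print(schedule_frames([0.010, 0.045, 0.010, 0.010, 0.010], 50))
# -> ['sent', 'sent', 'ignored', 'ignored', 'sent']
```

On the client side, the two ignored frames would correspond to the last received frame staying on screen for more than one frame time.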

Method and apparatus for streaming data

A terminal for receiving streaming data may: receive information on a plurality of different quality versions of an image content; request from a server, based on the information, a version of the image content from among the plurality of different quality versions; when the requested version of the image content and artificial intelligence (AI) data corresponding to the requested version are received, determine, based on the AI data, whether to perform AI upscaling on the received version of the image content; and, based on a result of the determination, perform AI upscaling on the received version of the image content through an upscaling deep neural network (DNN) that is trained jointly with a downscaling DNN of the server.
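
The terminal-side decision flow described above can be sketched as follows. Everything named here is an assumption for illustration: the `Version` and `AIData` fields, the bandwidth-based selection rule, and the upscaling criterion are hypothetical, since the abstract does not specify what the AI data contains or how a version is chosen.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    name: str
    height: int          # decoded picture height in pixels
    bitrate_kbps: int


@dataclass
class AIData:
    # Hypothetical fields; the abstract only says AI data accompanies a version.
    ai_downscaled: bool
    original_height: int


def choose_version(versions: List[Version], bandwidth_kbps: int) -> Version:
    """Pick the highest-bitrate version that fits the measured bandwidth,
    falling back to the lowest-bitrate version if none fits."""
    affordable = [v for v in versions if v.bitrate_kbps <= bandwidth_kbps]
    if affordable:
        return max(affordable, key=lambda v: v.bitrate_kbps)
    return min(versions, key=lambda v: v.bitrate_kbps)


def should_ai_upscale(version: Version, ai_data: Optional[AIData]) -> bool:
    """Upscale only if AI data is present, says the server AI-downscaled the
    content, and the original was larger than what was received."""
    if ai_data is None or not ai_data.ai_downscaled:
        return False
    return ai_data.original_height > version.height


versions = [Version("480p", 480, 1000),
            Version("720p", 720, 2500),
            Version("1080p", 1080, 5000)]
picked = choose_version(versions, bandwidth_kbps=3000)
print(picked.name, should_ai_upscale(picked, AIData(True, 1080)))
```

The upscaling DNN itself is left out; in the abstract it is a network trained jointly with the server's downscaling DNN.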

Method and device for performing artificial intelligence encoding and artificial intelligence decoding

An artificial intelligence (AI) encoding apparatus, including a memory configured to store instructions; and at least one processor configured to execute the instructions to: obtain an original image, previously-encoded frame information, and network environment information; obtain deblocking filter setting information, based on the original image, the previously-encoded frame information, and the network environment information; perform deblocking filtering on the original image, based on the deblocking filter setting information, to obtain a deblocking-filtered original image; obtain an AI-downscaled first image by providing the deblocking-filtered original image to a downscaling deep neural network (DNN); generate image data by performing first encoding on the AI-downscaled first image; and transmit the deblocking filter setting information, AI data including information related to the AI downscaling, and the image data.

Cluster-based dependency signaling

The signaling of inter-layer dependencies between layers of a multi-layered data stream is described. A good compromise between an overly strict restriction of the potential diversity of inter-layer dependencies on the one hand and overly complex signaling of the inter-layer dependencies on the other hand is achieved by describing the inter-layer dependencies by way of a first inter-dependency syntax structure indicating inter-dependencies between pairs of different values representable by a base layer-ID and a second inter-dependency syntax structure indicating inter-dependencies between pairs of different values representable by an extension layer-ID, the base layer-ID and extension layer-ID indexing the layers with which the portions of the multi-layer data stream are associated. In accordance with this concept, emphasis may be shifted between increased diversity of the signalable inter-layer dependencies on the one hand and reduced side-information overhead for signaling the inter-layer dependencies on the other hand.
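
The overhead trade-off can be made concrete with a rough sketch. It assumes one possible reading of the abstract, not the actual claimed syntax: each layer is indexed by a (base layer-ID, extension layer-ID) pair, each syntax structure is a square matrix of flags, and a layer depends on another only if both matrices flag the corresponding ID pairs. The function names and this combination rule are hypothetical.

```python
def depends(first, second, layer_a, layer_b):
    """Return True if layer_a depends on layer_b under the assumed rule.

    first[i][j]  flags a dependency between base layer-ID values i and j.
    second[i][j] flags a dependency between extension layer-ID values i and j.
    Each layer is a (base_id, extension_id) pair.
    """
    base_a, ext_a = layer_a
    base_b, ext_b = layer_b
    return bool(first[base_a][base_b] and second[ext_a][ext_b])


def flag_counts(num_base_values, num_ext_values):
    """Compare the number of flags needed to signal every layer pair directly
    with the number needed for the two smaller structures."""
    per_layer_pair = (num_base_values * num_ext_values) ** 2
    two_structures = num_base_values ** 2 + num_ext_values ** 2
    return per_layer_pair, two_structures


first = [[1, 0],
         [1, 1]]    # base value 1 may depend on base value 0
second = [[0, 0],
          [1, 0]]   # extension value 1 may depend on extension value 0

print(depends(first, second, (1, 1), (0, 0)))   # True
print(depends(first, second, (0, 1), (1, 0)))   # False
print(flag_counts(4, 4))                        # (256, 32)
```

With 4 base values and 4 extension values, signaling every pair of the 16 layers directly would take 256 flags, while the two structures take 32. The cost is that not every dependency pattern among the 16 layers can be expressed.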

QoE-based adaptive acquisition and transmission method for VR video

The present application discloses a QoE-based adaptive acquisition and transmission method for VR video, comprising the following steps: (1) capturing, by respective cameras in a VR video acquisition system, original videos with the same bit rate level, and compressing each original video with different bit rate levels; (2) selecting, by a server, a bit rate level for each original video for transmission, and synthesizing all of the transmitted original videos into a complete VR video; (3) performing, by the server, a segmentation process on the synthesized VR video, and compressing each video block into different quality levels; and (4) selecting, by the server, a quality level and a modulation and coding scheme (MCS) for each video block according to real-time viewing angle information of users and downlink channel bandwidth information in a feedback channel, and transmitting each video block to a client.
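
The per-block quality selection in the last step can be sketched as a simple greedy allocation. This is a stand-in technique chosen for illustration, not the optimization the application describes: the viewing weights, the bitrate table, the single bandwidth budget, and the function name are all hypothetical, and MCS selection is left out entirely.

```python
def select_quality_levels(view_weights, level_bitrates, bandwidth):
    """Greedy sketch of per-block quality selection.

    view_weights[i]   hypothetical importance of block i, e.g. derived from
                      users' viewing angles (higher means more likely viewed).
    level_bitrates[k] bitrate cost of quality level k, strictly increasing.
    bandwidth         total downlink budget in the same units.

    Every block starts at the lowest level; the block with the best
    weight-per-extra-bit ratio is upgraded until nothing more fits.
    """
    levels = [0] * len(view_weights)
    used = level_bitrates[0] * len(view_weights)
    if used > bandwidth:
        raise ValueError("bandwidth too small for the lowest quality level")
    while True:
        best = None
        for i, weight in enumerate(view_weights):
            nxt = levels[i] + 1
            if nxt >= len(level_bitrates):
                continue                      # block already at top quality
            extra = level_bitrates[nxt] - level_bitrates[levels[i]]
            if used + extra > bandwidth:
                continue                      # upgrade would exceed budget
            gain = weight / extra
            if best is None or gain > best[0]:
                best = (gain, i, extra)
        if best is None:
            return levels
        _, i, extra = best
        levels[i] += 1
        used += extra


# Three blocks; the first is most likely to be in the users' field of view.
print(select_quality_levels([0.7, 0.2, 0.1], [1, 2, 4], bandwidth=8))
# -> [2, 1, 1]
```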

Machine-learned in-loop predictor for video compression

A compression system trains a compression model for an encoder and decoder. In one embodiment, the compression model includes a machine-learned in-loop flow predictor that generates a flow prediction from previously reconstructed frames. The machine-learned flow predictor is coupled to receive a set of previously reconstructed frames and output a flow prediction for a target frame that is an estimation of the flow for the target frame. In particular, since the flow prediction can be generated by the decoder using the set of previously reconstructed frames, the encoder may transmit a flow delta that indicates a difference between the flow prediction and the actual flow for the target frame, instead of transmitting the flow itself. In this manner, the encoder can transmit a significantly smaller number of bits to the receiver, improving computational efficiency.
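
The flow-delta idea can be shown with a toy example. This is only a sketch: the predictor below is a trivial placeholder that reuses the most recent flow field, standing in for the machine-learned predictor, which is not specified in the abstract. Flow fields are represented as lists of integer (dx, dy) vectors, and all function names are hypothetical.

```python
def predict_flow(previous_flows):
    """Placeholder predictor: assume motion continues unchanged.
    A learned model would take reconstructed frames as input instead."""
    return list(previous_flows[-1])


def encode_flow_delta(actual_flow, predicted_flow):
    """Encoder side: transmit only the difference from the prediction."""
    return [(ax - px, ay - py)
            for (ax, ay), (px, py) in zip(actual_flow, predicted_flow)]


def decode_flow(predicted_flow, flow_delta):
    """Decoder side: the same prediction plus the received delta
    recovers the actual flow."""
    return [(px + dx, py + dy)
            for (px, py), (dx, dy) in zip(predicted_flow, flow_delta)]


history = [[(3, 1), (0, -2)]]            # flow of the previous frame
actual = [(4, 1), (0, -1)]               # true flow of the target frame

prediction = predict_flow(history)       # computed identically on both sides
delta = encode_flow_delta(actual, prediction)
print(delta)                             # [(1, 0), (0, 1)]
print(decode_flow(prediction, delta) == actual)   # True
```

Because the decoder can form the same prediction from data it already has, only the delta needs to be transmitted, and a good predictor keeps that delta small.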

Methods for compressing video for streaming video game content to remote clients

Computer-implemented methods for hosting online video games are provided. One method includes generating a plurality of video frames. The method includes initiating a sending of each one of the plurality of video frames to a client. Each of the video frames that is sent is compressed. The method includes stopping the compression and sending of video frames when one of the plurality of video frames is taking longer than a frame time to compress and send. A frame time is defined as one over a frame rate, and stopping the compression of video frames includes ignoring said video frames by an encoder. The method includes continuing to compress and send audio data to the client when one or more of said plurality of video frames are not sent to the client.