Patent classifications
H04N19/90
MACHINE-LEARNED IN-LOOP PREDICTOR FOR VIDEO COMPRESSION
A compression system trains a compression model for an encoder and decoder. In one embodiment, the compression model includes a machine-learned in-loop flow predictor that generates a flow prediction from previously reconstructed frames. The machine-learned flow predictor is coupled to receive a set of previously reconstructed frames and output a flow prediction for a target frame, which is an estimate of the flow for the target frame. Because the decoder can generate the same flow prediction from the set of previously reconstructed frames, the encoder may transmit a flow delta that indicates the difference between the flow prediction and the actual flow for the target frame, instead of transmitting the flow itself. In this manner, the encoder can transmit significantly fewer bits to the receiver, improving compression efficiency.
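The flow-delta idea can be illustrated with a toy sketch. Everything here is an assumption made for illustration: frames are 1-D lists, the "flow" is a single global integer displacement, and `predict_flow` is a hand-written stand-in (constant-motion extrapolation) for the machine-learned predictor, which the abstract does not specify.

```python
def best_shift(ref, cur, max_shift=3):
    """Integer displacement d that best maps ref onto cur (cur[i] ~ ref[i - d])."""
    n = len(ref)

    def cost(d):
        idx = [i for i in range(n) if 0 <= i - d < n]
        return sum(abs(cur[i] - ref[i - d]) for i in idx) / len(idx)

    # Ties are broken toward the smallest displacement magnitude.
    return min(range(-max_shift, max_shift + 1), key=lambda d: (cost(d), abs(d)))


def predict_flow(reconstructed):
    """Hypothetical stand-in for the learned predictor: assume the motion
    between the last two reconstructed frames simply continues."""
    return best_shift(reconstructed[-2], reconstructed[-1])


def encode_flow_delta(reconstructed, target):
    """Encoder: send only the difference between actual and predicted flow."""
    actual = best_shift(reconstructed[-1], target)
    return actual - predict_flow(reconstructed)


def decode_flow(reconstructed, delta):
    """Decoder: rebuild the flow from its own prediction plus the received delta."""
    return predict_flow(reconstructed) + delta


frames = [[0, 0, 9, 0, 0, 0, 0, 0], [0, 0, 0, 9, 0, 0, 0, 0]]
target = [0, 0, 0, 0, 0, 9, 0, 0]
delta = encode_flow_delta(frames, target)
flow = decode_flow(frames, delta)
```

Both sides call the same `predict_flow` on the same reconstructed frames, so only `delta` needs to be transmitted; when the prediction is good, the delta is small and should be cheaper to entropy-code than the full flow.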
MULTIPLE NEURAL NETWORK MODELS FOR FILTERING DURING VIDEO CODING
An example device for filtering decoded video data includes one or more processors configured to execute a neural network filtering unit to: receive data from one or more other units of the device, the data from the one or more other units of the device being different than data for a decoded picture of video data, and wherein to receive the data from the one or more other units of the device, the one or more processors are configured to execute the neural network filtering unit to receive boundary strength data from a deblocking unit of the device; determine one or more neural network models to be used to filter a portion of the decoded picture; and filter the portion of the decoded picture using the one or more neural network models and the data from the one or more other units of the device, including the boundary strength data.
METHOD AND APPARATUS FOR ADAPTIVE IMAGE COMPRESSION WITH FLEXIBLE HYPERPRIOR MODEL BY META LEARNING
A method of adaptive neural image compression with a hyperprior model by meta-learning is performed by at least one processor and includes generating a statistic feature, based on an input image and a hyperparameter, and generating a first shared feature and an estimated adaptive encoding parameter, encoding the input image to obtain a signal encoded image, based on the generated first shared feature and the generated estimated adaptive encoding parameter, generating a second shared feature and an estimated adaptive hyper encoding parameter, generating a hyper feature, based on the signal encoded image, the generated second shared feature, and the generated estimated adaptive hyper encoding parameter, and compressing the obtained signal encoded image, the generated statistic feature, and the generated hyper feature.
COMPRESSING IMAGE-TO-IMAGE MODELS WITH AVERAGE SMOOTHING
System and methods for compressing image-to-image models. Generative Adversarial Networks (GANs) have achieved success in generating high-fidelity images. An image compression system and method adds a novel variant to class-dependent parameters (CLADE), referred to as CLADE-Avg, which recovers the image quality without introducing extra computational cost. An extra layer of average smoothing is performed between the parameter and normalization layers. Compared to CLADE, this image compression system and method smooths abrupt boundaries, and introduces more possible values for the scaling and shift. In addition, the kernel size for the average smoothing can be selected as a hyperparameter, such as a 3×3 kernel size. This method does not introduce extra multiplications but only addition, and thus does not introduce much computational overhead, as the division can be absorbed into the parameters after training.
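A minimal sketch of the average-smoothing step, under stated assumptions: the per-pixel parameter map is built by looking up a per-class value from a label map, borders are handled by edge replication (not stated in the abstract), and the names `smoothed_param_map`, `labels`, and `params` are hypothetical. The parameters are pre-divided by the kernel area once, so the per-pixel work is additions only.

```python
def smoothed_param_map(labels, params, k=3):
    """Look up a per-class parameter at each pixel, then sum it over a k x k window.

    params are divided by k*k up front, so the window sum equals the window
    average and the inner loop needs additions only.
    """
    h, w = len(labels), len(labels[0])
    r = k // 2
    scaled = [p / (k * k) for p in params]  # division absorbed into the parameters
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)  # edge replication (assumption)
                    xx = min(max(x + dx, 0), w - 1)
                    total += scaled[labels[yy][xx]]
            out[y][x] = total
    return out


# Two classes separated by a vertical boundary; class 0 scale = 0, class 1 scale = 9.
labels = [[0, 0, 1, 1] for _ in range(3)]
gamma_map = smoothed_param_map(labels, [0.0, 9.0])
```

Without smoothing, the scale would jump from 0 to 9 at the class boundary; after smoothing, each row ramps through intermediate values, which is the "more possible values for the scaling and shift" effect the abstract describes.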
Pixel storage for graphical frame buffers
A device implementing the subject pixel storage for graphical frame buffers may include at least one processor configured to obtain a plurality of data units containing a plurality of pixels stored in memory, each of the plurality of data units including a first pixel of the plurality of pixels packed in succession with at least a portion of a second pixel of the plurality of pixels, in which the plurality of pixels is represented by a number of bits, obtain a group of pixels from the plurality of pixels, and store the group of pixels using a targeted number of bits. A method and computer program product implementing the subject pixel storage for graphical frame buffers is also provided.
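The phrase "a first pixel ... packed in succession with at least a portion of a second pixel" can be pictured as bit-packing pixels whose width is not a multiple of the storage unit. The sketch below is only an illustration: the 10-bit width, the byte-sized data units, the little-endian bit order, and the function names are all assumptions, not details from the abstract.

```python
def pack_pixels(pixels, bits):
    """Pack fixed-width pixels back to back into bytes (little-endian bit order)."""
    acc, nbits, out = 0, 0, bytearray()
    mask = (1 << bits) - 1
    for p in pixels:
        acc |= (p & mask) << nbits  # append the pixel above the bits still pending
        nbits += bits
        while nbits >= 8:           # flush every complete byte
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:                       # flush a final partial byte, zero padded
        out.append(acc & 0xFF)
    return bytes(out)


def unpack_pixels(data, bits, count):
    """Recover `count` pixels of width `bits` from packed bytes."""
    acc = int.from_bytes(data, "little")
    mask = (1 << bits) - 1
    return [(acc >> (i * bits)) & mask for i in range(count)]


pixels = [1023, 0, 512, 5]          # four 10-bit pixels
packed = pack_pixels(pixels, 10)    # 40 bits fit in exactly 5 bytes
restored = unpack_pixels(packed, 10, len(pixels))
```

Here the first byte holds the low 8 bits of the first pixel, and the second byte holds its remaining 2 bits followed by the first 6 bits of the second pixel, so pixels straddle data-unit boundaries.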
Surface normal vector processing mechanism
- Jill Boyce
- Scott Janus
- Itay Kaufman
- Archie Sharma
- Stanley Baran
- Michael Apodaca
- Prasoonkumar Surti
- Srikanth Potluri
- Barnan Das
- Hugues Labbe
- Jong Dae Oh
- Gokcen CILINGIR
- Maria Bortman
- Tzach Ashkenazi
- Jonathan Distler
- Atul Divekar
- Mayuresh M. Varerkar
- Narayan Biswal
- Nilesh V. Shah
- Atsuo Kuwahara
- Kai Xiao
- Jason Tanner
- Jeffrey Tripp
An apparatus to facilitate processing video bit stream data is disclosed. The apparatus includes one or more processors to encode surface normals data with point cloud geometry data included in the video bit stream data for reconstruction of objects within the video bit stream data based on the surface normals data and a memory communicatively coupled to the one or more processors.
DEEP LOOP FILTER BY TEMPORAL DEFORMABLE CONVOLUTION
A method, apparatus and storage medium for performing video coding are provided. The method includes obtaining a plurality of image frames in a video sequence; determining a feature map for each of the plurality of image frames and determining an offset map based on the feature map; determining an aligned feature map by performing a temporal deformable convolution (TDC) on the feature map and the offset map; and generating a plurality of aligned frames based on the aligned feature map.
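The core operation of a deformable convolution, sampling the input at learned fractional offsets, can be shown with a 1-D toy. This is a sketch under heavy simplification: real temporal deformable convolution operates on multi-channel 2-D feature maps from several frames with offsets produced by a network, whereas here the offsets are given by hand and all names are hypothetical.

```python
import math


def sample_linear(feat, pos):
    """Linearly interpolate feat at a fractional position; zero outside the signal."""
    lo = math.floor(pos)
    frac = pos - lo

    def at(i):
        return feat[i] if 0 <= i < len(feat) else 0.0

    return (1 - frac) * at(lo) + frac * at(lo + 1)


def deformable_conv1d(feat, offsets, weights):
    """1-D deformable convolution: kernel tap k at output i samples the input at
    i + (k - r) + offsets[i][k] instead of the fixed grid position i + (k - r)."""
    r = len(weights) // 2
    out = []
    for i in range(len(feat)):
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w * sample_linear(feat, i + (k - r) + offsets[i][k])
        out.append(acc)
    return out


# A neighbouring frame's features, shifted one sample to the right of the target's.
neighbour = [0.0, 0.0, 10.0, 20.0]
offsets = [[0.0, 1.0, 0.0] for _ in neighbour]   # offset map: look one sample right
aligned = deformable_conv1d(neighbour, offsets, [0.0, 1.0, 0.0])
```

With an offset of one sample on the centre tap, the output pulls the neighbour's features back into the target's coordinates, which is the alignment role the offset map plays in the abstract.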
PRUNING METHODS AND APPARATUSES FOR NEURAL NETWORK BASED VIDEO CODING
A pruning method of neural network based video coding of a current block of a picture of a video sequence is performed by at least one processor and includes categorizing parameters of a neural network into groups, setting a first index to indicate that a first group of the groups is to be pruned, and a second index to indicate that a second group of the groups is not to be pruned, and transmitting, to a decoder, the set first index and the set second index. Based on the transmitted first index and the transmitted second index, the current block is processed using the parameters of which the first group of the groups is pruned.
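The group-and-signal scheme can be sketched as follows. The grouping rule (contiguous chunks), the pruning criterion (mean absolute magnitude below a threshold), and all function names are assumptions for illustration; the abstract only says that parameters are categorized into groups and that per-group indices are transmitted to the decoder.

```python
def group_parameters(params, group_size):
    """Split a flat parameter list into contiguous groups (grouping rule is an assumption)."""
    return [params[i:i + group_size] for i in range(0, len(params), group_size)]


def choose_prune_indices(groups, threshold):
    """Encoder side: index 1 marks a group to prune, index 0 a group to keep.
    The magnitude criterion is a hypothetical choice."""
    return [1 if sum(abs(p) for p in g) / len(g) < threshold else 0 for g in groups]


def apply_pruning(groups, indices):
    """Decoder side: zero out every group whose signaled index is 1."""
    return [[0.0] * len(g) if flag else list(g) for g, flag in zip(groups, indices)]


params = [0.01, -0.02, 0.9, -1.1, 0.03, 0.0]
groups = group_parameters(params, 2)
indices = choose_prune_indices(groups, threshold=0.1)  # transmitted to the decoder
pruned = apply_pruning(groups, indices)
```

Only the short `indices` list needs to be signaled; the decoder then processes the current block with the pruned parameter set.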