Patent classifications
H04N19/19
ENHANCEMENT PROCESS FOR VIDEO CODING FOR MACHINES
Systems, devices, and methods for performing video coding for machines (VCM) image enhancement, including obtaining a coded image from a coded bitstream; obtaining enhancement parameters corresponding to the coded image; decoding the coded image using a VCM decoding module to generate a decoded image; generating an enhanced image using an enhancement module based on the decoded image and the enhancement parameters, wherein the enhancement parameters are optimized for one of a human vision VCM task, a machine vision VCM task, and a human-machine hybrid vision VCM task; and providing at least one of the decoded image and the enhanced image to at least one of a human vision module and a machine vision module for performing the one of the human vision VCM task, the machine vision VCM task, and the human-machine hybrid vision VCM task.
DATA PROCESSING METHOD AND APPARATUS
Embodiments of this disclosure provide a data processing method and apparatus. The method includes: acquiring M candidate quantized state chains of a transform block in multimedia data; acquiring N syntax elements corresponding to a transform coefficient i in the transform block, and acquiring fixed probability models respectively corresponding to K syntax elements; performing context modeling for (N-K) syntax elements according to adjacent coding coefficients of the transform coefficient i, to obtain target probability models respectively corresponding to the (N-K) syntax elements; determining a coefficient rate distortion cost of the transform coefficient i according to the fixed probability models, the target probability models, and a quantization reconstruction value of the transform coefficient i; and determining path rate distortion costs respectively corresponding to the M candidate quantized state chains according to a coefficient rate distortion cost of each transform coefficient.
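The cost bookkeeping described in this abstract can be illustrated with a minimal sketch. It assumes a squared-error distortion, a rate estimated as -log2(p) per syntax element, and a Lagrangian weight `lam`; these choices and all function names are hypothetical and are not taken from the disclosure.

```python
import math

def coefficient_rd_cost(original, reconstructed, fixed_probs, context_probs, lam):
    """RD cost of one coefficient: squared error plus lam times estimated bits.

    fixed_probs   -- probabilities from the fixed models (K syntax elements)
    context_probs -- probabilities from the context-derived models (N-K elements)
    """
    distortion = (original - reconstructed) ** 2
    rate = sum(-math.log2(p) for p in list(fixed_probs) + list(context_probs))
    return distortion + lam * rate

def path_rd_costs(chains, lam):
    """Sum the coefficient costs along each candidate quantized state chain."""
    return [sum(coefficient_rd_cost(o, r, f, c, lam) for o, r, f, c in chain)
            for chain in chains]

def best_chain(chains, lam):
    """Return (index of the cheapest chain, list of all path costs)."""
    costs = path_rd_costs(chains, lam)
    return min(range(len(costs)), key=costs.__getitem__), costs
```

Each chain here is a list of `(original, reconstructed, fixed_probs, context_probs)` tuples, one per transform coefficient. A larger `lam` shifts the choice toward chains that spend fewer bits at the price of more distortion.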
Method and apparatus for encoding and decoding HDR images
To encode High Dynamic Range (HDR) images, the HDR images can be converted to Low Dynamic Range (LDR) images through a tone mapping operation, and the LDR images can be encoded with an LDR encoder. The present principles formulate a rate distortion minimization problem when designing the tone mapping curve. In particular, the tone mapping curve is formulated as a function of the probability distribution function of the HDR images to be encoded and a Lagrangian multiplier that depends on encoding parameters. At the decoder, based on the parameters indicative of the tone mapping function, an inverse tone mapping function can be derived to reconstruct HDR images from decoded LDR images.
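The decoder-side step, deriving an inverse tone mapping function from signalled curve parameters, can be sketched as follows. The sketch assumes the curve is sent as strictly increasing piecewise-linear knots; the knot representation, the example values, and the function names are hypothetical, and the rate-distortion design of the curve itself is not shown.

```python
from bisect import bisect_right

def interp(x, xs, ys):
    """Piecewise-linear interpolation through (xs, ys); xs strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1          # segment with xs[i] <= x < xs[i + 1]
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

def tone_map(hdr_value, hdr_knots, ldr_knots):
    """Encoder side: map an HDR value (e.g. log luminance) to an LDR value."""
    return interp(hdr_value, hdr_knots, ldr_knots)

def inverse_tone_map(ldr_value, hdr_knots, ldr_knots):
    """Decoder side: invert the monotone curve by swapping the knot axes."""
    return interp(ldr_value, ldr_knots, hdr_knots)

# Hypothetical curve parameters: log10 luminance knots and 8-bit LDR knots.
HDR_KNOTS = [0.0, 1.0, 2.0, 4.0]
LDR_KNOTS = [0.0, 64.0, 192.0, 255.0]
```

Because the curve is monotone, swapping the two knot lists is enough to invert it; quantization of the LDR values in a real codec would make the round trip approximate rather than exact.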
Intra-Prediction Mode Concept for Block-Wise Picture Coding
An apparatus for block-wise decoding a picture from a data stream and/or encoding a picture into a data stream, the apparatus supporting at least one intra-prediction mode according to which the intra-prediction signal for a block of a predetermined size of the picture is determined by applying a first template of samples which neighbors the current block onto a neural network. The apparatus may be configured, for a current block differing from the predetermined size, to: resample a second template of samples neighboring the current block, so as to conform with the first template so as to obtain a resampled template; apply the resampled template of samples onto the neural network so as to obtain a preliminary intra-prediction signal; and resample the preliminary intra-prediction signal so as to conform with the current block so as to obtain the intra-prediction signal for the current block.
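The three steps for a block of a different size (resample the template, apply the network, resample the prediction) can be sketched as below. This is an illustration under stated assumptions: the template is flattened to a 1-D list, resampling is nearest-neighbor, and `toy_network` is a hypothetical placeholder that stands in for the trained neural network.

```python
def resample_1d(samples, new_len):
    """Nearest-neighbor resampling of a 1-D list to new_len entries."""
    old_len = len(samples)
    return [samples[(i * old_len) // new_len] for i in range(new_len)]

def resample_2d(block, new_h, new_w):
    """Nearest-neighbor resampling of a 2-D block given as a list of rows."""
    rows = [resample_1d(row, new_w) for row in block]
    old_h = len(rows)
    return [list(rows[(i * old_h) // new_h]) for i in range(new_h)]

def toy_network(template, size):
    """Placeholder for the trained network: fills the block with the template mean."""
    mean = sum(template) / len(template)
    return [[mean] * size for _ in range(size)]

def predict_block(template, block_size, net_block_size=4, net_template_len=8):
    """Predict a block_size x block_size block with a network built for another size."""
    resampled = resample_1d(template, net_template_len)       # conform to first template
    preliminary = toy_network(resampled, net_block_size)      # preliminary prediction
    return resample_2d(preliminary, block_size, block_size)   # conform to current block
```

The default sizes (a 4x4 network block with an 8-sample template) are arbitrary example values. A practical codec would likely use a smoother interpolation filter than nearest-neighbor.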
Using generative adversarial networks in compression
The compression system trains a machine-learned encoder and decoder through an autoencoder architecture. The encoder can be deployed by a sender system to encode content for transmission to a receiver system, and the decoder can be deployed by the receiver system to decode the encoded content and reconstruct the original content. The encoder is coupled to receive content and output a tensor as a compact representation of the content. The content may be, for example, images, videos, or text. The decoder is coupled to receive a tensor representing content and output a reconstructed version of the content. The compression system trains the autoencoder with a discriminator to reduce compression artifacts in the reconstructed content. The discriminator is coupled to receive one or more items of input content, and output a discrimination prediction that discriminates whether the input content is the original or reconstructed version of the content.
Method for processing transform coefficients
Methods, software products, digital cameras and other image processing systems process a set of transform coefficients. In at least one embodiment, the method comprises, for each block of transform coefficients representing an image: ordering the block's coefficients into a sequence; encoding the ordered coefficients to yield a sequence of codewords, each codeword including one or more encoded coefficients; and dividing the sequence of codewords into two or more sub-sequences.
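The order, encode, and divide steps can be sketched as follows. The abstract does not name a scan order, an entropy code, or a splitting rule, so the zigzag scan, the (zero run, value) codewords, and the contiguous split used here are assumptions chosen for illustration, and all function names are hypothetical.

```python
def zigzag_order(block):
    """Return the coefficients of a square block in zigzag scan order."""
    n = len(block)
    # Visit anti-diagonals in turn, alternating the direction on each one.
    positions = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [block[r][c] for r, c in positions]

def run_length_codewords(seq):
    """Encode a sequence as (zero_run, value) codewords; trailing zeros become (run, 0)."""
    codewords, run = [], 0
    for value in seq:
        if value == 0:
            run += 1
        else:
            codewords.append((run, value))
            run = 0
    if run:
        codewords.append((run, 0))
    return codewords

def split_codewords(codewords, parts):
    """Divide the codeword sequence into at most `parts` contiguous sub-sequences."""
    size = max(1, -(-len(codewords) // parts))  # ceiling division, never zero
    return [codewords[i:i + size] for i in range(0, len(codewords), size)]
```

Each (zero_run, value) codeword covers one or more coefficients, which matches the "one or more encoded coefficients" wording. How the sub-sequences are then stored or transmitted is outside this sketch.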
Image processing device, image processing method, and program for determining a cost function for mode selection
An image processing device is described. The circuitry of the image processing device obtains an image that is generated on a basis of an incident light and a transfer function related to a conversion between the incident light and the image, and determines a cost function for prediction mode selection according to the transfer function. The cost function calculates a cost value based on a first parameter corresponding to a prediction residual code amount and a second parameter corresponding to a prediction mode code amount. The cost function is determined in a manner in favor of increasing the prediction residual code amount or decreasing the prediction mode code amount as a dynamic range of the transfer function increases. The circuitry determines a prediction mode for coding a coding unit of the image according to the determined cost function, and encodes the coding unit according to the determined prediction mode.
Iterative IDCT with adaptive non-linear filtering
A method includes obtaining respective filtered pixels for pixels of a reconstructed image; and obtaining an edge-preserved image using the respective filtered pixels. Obtaining the respective filtered pixels includes, for each pixel of the reconstructed image, obtaining a respective filtered pixel by selecting a pixel patch including the pixel and first neighboring pixels of the pixel that are at relative neighboring locations with respect to the pixel; calculating respective weights for the first neighboring pixels; and filtering the pixel using the respective weights of the first neighboring pixels and the first neighboring pixels to obtain the respective filtered pixel. Calculating the respective weights includes, for each neighboring pixel of the first neighboring pixels, forming a neighboring patch including the neighboring pixel and second neighboring pixels; calculating a neighboring patch distance between the pixel patch and the neighboring patch; and calculating a respective weight using the neighboring patch distance.
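The patch-based weighting described here resembles non-local-means filtering, which the following sketch uses as a stand-in. The exponential weight, the mean squared patch distance, the clamped borders, and the parameter names (`search`, `radius`, `strength`) are assumptions for illustration and are not taken from the patent.

```python
import math

def _patch(img, r, c, radius):
    """Flattened square patch around (r, c); coordinates are clamped at the borders."""
    h, w = len(img), len(img[0])
    return [img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)]

def filter_pixel(img, r, c, search=1, radius=1, strength=10.0):
    """Weighted average of the pixels near (r, c), weighted by patch similarity."""
    h, w = len(img), len(img[0])
    pixel_patch = _patch(img, r, c, radius)
    num = den = 0.0
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < h and 0 <= nc < w):
                continue
            neighbor_patch = _patch(img, nr, nc, radius)
            # Mean squared difference between the two patches.
            dist = sum((a - b) ** 2
                       for a, b in zip(pixel_patch, neighbor_patch)) / len(pixel_patch)
            weight = math.exp(-dist / (strength * strength))
            num += weight * img[nr][nc]
            den += weight
    return num / den  # den >= 1 because the pixel itself has weight 1

def filter_image(img, **kwargs):
    """Apply filter_pixel to every pixel of the image."""
    return [[filter_pixel(img, r, c, **kwargs) for c in range(len(img[0]))]
            for r in range(len(img))]
```

Neighbors whose surrounding patch looks different from the pixel's own patch, such as those across an edge, receive weights close to zero, which is what keeps edges sharp while flat regions are smoothed.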
DEPTH CODEC FOR REAL-TIME, HIGH-QUALITY LIGHT FIELD RECONSTRUCTION
Techniques to facilitate compression of depth data and real-time reconstruction of high-quality light fields. A parameter space of values for a line, pairs of endpoints on different sides of the line, and a palette index for each pixel of a pixel tile of a depth image is sampled. Values for the line, the pairs of endpoints, and the palette index that minimize an error are determined and stored.