Patent classifications
H04N19/90
APPARATUS, METHOD AND COMPUTER PROGRAM PRODUCT FOR OPTIMIZING PARAMETERS OF A COMPRESSED REPRESENTATION OF A NEURAL NETWORK
In example embodiments, an apparatus, a method, and a computer program product are provided. An example apparatus includes processing circuitry and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the processing circuitry, cause the apparatus at least to: overfit a neural network on each media item from a batch of media items for a number of iterations to obtain an overfitted neural network model for each media item; evaluate the overfitted neural network model on each media item to obtain evaluation errors; and update parameters of the neural network based on the evaluation errors.
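The per-item overfitting loop above can be sketched as a toy meta-update in plain Python. The "network" here is a single scalar weight fitted by gradient descent on a squared-error objective, and the averaging meta-update is an illustrative assumption, not the patented method:

```python
# Toy sketch: overfit a copy of the model per item, evaluate it on
# that item, then update the shared parameters. All names and the
# update rule are illustrative assumptions.

def overfit(weight, item, iters=50, lr=0.1):
    """Overfit a copy of the model on one media item."""
    w = weight
    for _ in range(iters):
        grad = 2 * (w - item)          # d/dw of (w - item)^2
        w -= lr * grad
    return w

def evaluate(w, item):
    """Evaluation error of the overfitted model on the same item."""
    return (w - item) ** 2

def meta_update(weight, batch, meta_lr=0.5):
    """Update the shared parameters using the overfitted models and
    their evaluation errors (here: move toward their mean)."""
    overfitted = [overfit(weight, item) for item in batch]
    errors = [evaluate(w, item) for w, item in zip(overfitted, batch)]
    weight += meta_lr * (sum(overfitted) / len(overfitted) - weight)
    return weight, errors

w, errs = meta_update(0.0, [1.0, 2.0, 3.0])
```

With a quadratic per-item loss, each overfitted copy converges to its item, so the evaluation errors shrink toward zero while the shared weight drifts toward the batch mean.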
Image decoding device, image encoding device, and image decoding method
A video encoding device (2) includes a side information determination section (21) and a side information encoding section (22). The side information determination section (21) sets a quantization parameter for each macroblock in such a manner that a difference between quantization parameters for each pair of macroblocks with successive encoding orders is equal to one of n difference values, and transforms the difference into one of n indices with respect to each pair. The side information encoding section (22) generates a binary sequence having a length corresponding to the size of the absolute value of the index. The total of absolute values of the n indices is smaller than the total of absolute values of the n difference values.
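The index mapping above can be illustrated in a few lines: frequent QP differences are assigned indices with small absolute values, and each index is binarized with a length proportional to its magnitude, so the total code length shrinks. The difference table and the signed unary-style binarization are illustrative assumptions, not the device's actual tables:

```python
# Map each QP difference to an index so that common differences get
# small |index|, then binarize with |index| proportional length.
# The n = 4 allowed differences below are an illustrative assumption.
DIFF_TO_INDEX = {0: 0, 2: 1, -2: -1, 6: 2}

def binarize(index):
    """|index| leading 1-bits, a terminating 0, then a sign bit."""
    bits = "1" * abs(index) + "0"
    if index != 0:
        bits += "1" if index < 0 else "0"
    return bits

def encode_qps(qps):
    """Encode the QP differences of successive blocks as one bit string."""
    out = []
    for prev, cur in zip(qps, qps[1:]):
        out.append(binarize(DIFF_TO_INDEX[cur - prev]))
    return "".join(out)
```

For the QP sequence 30, 30, 32, 30 the differences are 0, 2, -2 (total absolute value 4) but the indices are 0, 1, -1 (total absolute value 2), which is what makes the resulting binary sequence shorter.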
SYSTEMS AND METHODS FOR IMPROVING OBJECT TRACKING IN COMPRESSED FEATURE DATA IN CODING OF MULTI-DIMENSIONAL DATA
A method of compressing feature data includes: receiving feature data; performing spatial downsampling on the received feature data by applying a pixel unshuffle operation; and performing channel reduction on the spatially downsampled feature data by applying a non-linear two-dimensional convolution with an activation.
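The two steps can be sketched with NumPy. Pixel unshuffle (space-to-depth) trades spatial resolution for channels, and the channel reduction is shown here as a 1x1 convolution followed by ReLU, which is an illustrative stand-in for the non-linear two-dimensional convolution in the claim:

```python
import numpy as np

def pixel_unshuffle(x, r=2):
    """Space-to-depth: (C, H, W) -> (C*r*r, H//r, W//r)."""
    c, h, w = x.shape
    x = x.reshape(c, h // r, r, w // r, r)
    x = x.transpose(0, 2, 4, 1, 3)     # gather the r x r phases as channels
    return x.reshape(c * r * r, h // r, w // r)

def channel_reduce(x, weight):
    """1x1 convolution (weight: out_ch x in_ch) followed by ReLU."""
    y = np.einsum('oc,chw->ohw', weight, x)
    return np.maximum(y, 0.0)
```

A 1-channel 4x4 map becomes a 4-channel 2x2 map after unshuffling with r=2; a weight matrix of shape (2, 4) then halves the channel count.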
ON-DEVICE KNOWLEDGE EXTRACTION FROM VISUALLY RICH DOCUMENTS
Computer-based content understanding can include segmenting an image into a plurality of blocks, wherein each block includes textual information from the image. For each block of the plurality of blocks, encoded feature data is generated by encoding visual information of the block and visual information of one or more neighboring blocks from the plurality of blocks and encoded textual data is generated by encoding the textual information of the block and the textual information of the one or more neighboring blocks. Further, using an entity class prediction model, one or more tokens of the block are classified into one or more entity classes based on a combination of the encoded textual data and the encoded feature data. A plurality of entities can be extracted from the image based on the entity classes of the plurality of blocks.
Sub-diffraction imaging, coding and decoding of non-bleaching scatters
An image reconstruction method includes capturing a reference image of the specimen and capturing a set of original images based on the reference image. The method includes generating a set of analyzed images based on the set of original images by determining an intensity distribution for each pixel of each original image of the set and combining the intensity distributions at each pixel location across the set into an intermediate image. The method includes identifying an object in the intermediate image and, in response to identifying the object, determining an intensity value of the object in each original image of the set and generating an improved image of the object based on the determined intensity values. The method includes generating a final image including the improved image of the object and displaying the final image.
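The combine-then-localize steps can be sketched in plain Python. The choice of per-pixel variance as the combining statistic is an illustrative assumption; a fluctuating (non-bleaching) scatterer stands out in the variance image even when its mean intensity does not:

```python
# Combine per-pixel statistics across a set of captured frames into an
# intermediate image, then localize a candidate object in it. Variance
# as the combining statistic is an illustrative assumption.

def intermediate_image(frames):
    """Per-pixel variance across the frame set."""
    n = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [f[y][x] for f in frames]
            mean = sum(vals) / n
            out[y][x] = sum((v - mean) ** 2 for v in vals) / n
    return out

def brightest_pixel(img):
    """Locate the candidate object as the maximum of the image."""
    best = max((v, (y, x)) for y, row in enumerate(img)
               for x, v in enumerate(row))
    return best[1]
```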
Video compression using deep generative models
Certain aspects of the present disclosure are directed to methods and apparatus for compressing video content using deep generative models. One example method generally includes receiving video content for compression. The received video content is generally encoded into a latent code space through an encoder, which may be implemented by a first artificial neural network. A compressed version of the encoded video content is generally generated through a trained probabilistic model, which may be implemented by a second artificial neural network, and output for transmission.
Frequency adjustment for texture synthesis in video coding
The present disclosure relates to encoding and decoding video by employing texture coding. In particular, a texture region is identified within a video picture and a texture patch is determined for the texture region. Moreover, a first set of parameters is derived specifying weighting factors for reconstructing spectral coefficients of the texture region by fitting the texture region, in a spectral domain, to a first function of the texture patch, the first function being defined by the first set of parameters. The texture patch and the first set of parameters are then included in a bitstream, which is output by the encoder and provided to the decoder; the decoder reconstructs the texture based on the patch and the function applied to the patch.
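The spectral fit can be illustrated with a minimal sketch: derive one multiplicative weight per spectral coefficient that maps the patch spectrum onto the region spectrum. A plain 1-D DFT and a per-coefficient least-squares fit (which reduces to a ratio) are illustrative assumptions, not the codec's actual transform or parameterization:

```python
import cmath

def dft(x):
    """Naive 1-D discrete Fourier transform (illustrative)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def fit_weights(region, patch):
    """One weighting factor per spectral coefficient so that
    region_k ~= w_k * patch_k in the spectral domain."""
    R, P = dft(region), dft(patch)
    return [(r / p).real if abs(p) > 1e-12 else 0.0
            for r, p in zip(R, P)]
```

A region that is an amplified copy of the patch yields a constant weight vector, which is the degenerate case of the fit; the encoder would transmit these weights alongside the patch.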
IMAGE PROCESSING DEVICE, IMAGE PROCESSING METHOD, AND COMPUTER-READABLE RECORDING MEDIUM STORING IMAGE PROCESSING PROGRAM
An image processing device includes: a memory; and a processor coupled to the memory and configured to: calculate a degree of influence of each pixel of image data, the influence being exerted on a processing result when the image data is input to a deep learning model; reduce an information amount of intermediate information extracted from the deep learning model based on the degree of influence; and compress the intermediate information, the information amount of which has been reduced.
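The reduce-then-compress pipeline can be sketched in a few lines. Absolute activation magnitude stands in for the model-derived degree of influence, and zlib stands in for the actual compressor; both are illustrative assumptions:

```python
import struct
import zlib

def reduce_and_compress(features, keep_ratio=0.25):
    """Zero out the least influential entries of an intermediate
    feature vector, then compress the sparsified result.
    Influence score = |activation| (illustrative assumption)."""
    scores = sorted((abs(v) for v in features), reverse=True)
    cutoff = scores[max(0, int(len(scores) * keep_ratio) - 1)]
    reduced = [v if abs(v) >= cutoff else 0.0 for v in features]
    raw = struct.pack(f"{len(reduced)}f", *reduced)
    return zlib.compress(raw), reduced
```

Zeroing low-influence entries makes the packed buffer highly repetitive, which is what lets the final compression stage shrink it.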
Systems and methods for using pre-calculated block hashes for image block matching
A server accesses a previous frame of an image in a video, obtains a hash value for each pixel in the previous frame, and creates a hash map that stores each of the hash values. The server receives a current frame of the image and separates the current frame into a plurality of current blocks of pixels. The server calculates, using a hash function, a hash value for each of the current blocks of pixels. The server compares the hash values in the hash map with the hash values associated with the current frame and identifies a hash value in the hash map that matches a hash value in the current frame. The server compresses the current frame for transmission to a client using the identified matching hash values and pre-calculates a new hash map based on the current frame for use in compressing the next frame of the video.
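The matching scheme can be sketched as follows. For brevity the sketch hashes an aligned block grid of the previous frame rather than every pixel position as the abstract describes, and the block size and SHA-1 hash are illustrative assumptions:

```python
import hashlib

BLOCK = 2  # block edge length in pixels (illustrative)

def block_hash(frame, y, x):
    """Hash the BLOCK x BLOCK block whose top-left corner is (y, x)."""
    data = bytes(frame[y + dy][x + dx]
                 for dy in range(BLOCK) for dx in range(BLOCK))
    return hashlib.sha1(data).hexdigest()

def build_hash_map(frame):
    """Pre-calculated hash map over the previous frame's blocks."""
    h, w = len(frame), len(frame[0])
    return {block_hash(frame, y, x): (y, x)
            for y in range(0, h, BLOCK) for x in range(0, w, BLOCK)}

def compress_frame(prev_map, frame):
    """Emit ('ref', y, x) for blocks matched in the previous frame,
    ('raw', pixels) otherwise."""
    ops = []
    h, w = len(frame), len(frame[0])
    for y in range(0, h, BLOCK):
        for x in range(0, w, BLOCK):
            key = block_hash(frame, y, x)
            if key in prev_map:
                ops.append(("ref",) + prev_map[key])
            else:
                ops.append(("raw", [frame[y + dy][x + dx]
                                    for dy in range(BLOCK)
                                    for dx in range(BLOCK)]))
    return ops
```

A matched block is replaced by a short reference into the previous frame, and `build_hash_map` run on the current frame provides the pre-calculated map for the next one.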