G06T9/002

TASK-ORIENTED DYNAMIC MESH COMPRESSION USING OCCUPANCY NETWORKS
20230016302 · 2023-01-19

Methods, systems, and devices for efficiently compressing task-oriented dynamic meshes using occupancy networks are described herein. A single trained occupancy network model is able to reconstruct a mesh video using only a few additional points per input mesh frame. To optimize the compression of the model and points, the estimated rate needed to compress the occupancy network can be included in the loss function, which minimizes the number of bits needed to encode the model while the model reproduces the meshes as faithfully as possible. An adaptive subsampling per input mesh is added to optimize the mesh reconstruction and the compression of the N-point point clouds. To optimize the model for a particular task, a metric that takes the task into account is added to the cost function.
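As a rough sketch, the rate-aware, task-aware loss described in this abstract might combine three terms. The function name, the L1 rate proxy, and the weighting factors below are illustrative assumptions, not the patent's actual formulation:

```python
import numpy as np

def compression_loss(pred_occ, gt_occ, weights, task_pred, task_target,
                     lambda_rate=0.01, lambda_task=1.0):
    # Reconstruction term: binary cross-entropy between predicted and
    # ground-truth occupancy values for the sampled points.
    eps = 1e-7
    p = np.clip(pred_occ, eps, 1 - eps)
    recon = -np.mean(gt_occ * np.log(p) + (1 - gt_occ) * np.log(1 - p))
    # Rate proxy: estimated cost of encoding the network weights
    # (an L1 norm stands in for a real entropy-model estimate).
    rate = sum(np.abs(w).sum() for w in weights)
    # Task-aware term, e.g. error on the downstream task the mesh serves.
    task = np.mean((task_pred - task_target) ** 2)
    return recon + lambda_rate * rate + lambda_task * task
```

Minimizing this jointly trades mesh fidelity against model bit cost while steering reconstruction toward the target task.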

Policy-based system interface for a real-time autonomous system

Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphics processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.

Automatic Area Detection

An example computing platform is configured to (i) receive a two-dimensional (2D) image file comprising a construction drawing, (ii) generate, via semantic segmentation, a first set of polygons corresponding to respective areas of the 2D image file, (iii) generate, via instance segmentation, a second set of polygons corresponding to respective areas of the 2D image file, (iv) generate, via unsupervised image processing, a third set of polygons corresponding to respective areas of the 2D image file, (v) based on (a) overlap between polygons in the first, second, and third sets of polygons and (b) respective confidence scores for each of the overlapping polygons, determine a set of merged polygons corresponding to respective areas of the 2D image file, and (vi) cause a client station to display a visual representation of the 2D image file where each merged polygon is overlaid as a respective selectable region of the 2D image file.
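Step (v) above, merging overlapping polygons from the three segmentation passes by confidence, resembles non-maximum suppression. A minimal sketch, with axis-aligned boxes standing in for polygons and an assumed IoU threshold (the names and threshold are illustrative, not from the patent):

```python
def iou(a, b):
    # a, b: (x0, y0, x1, y1) axis-aligned boxes standing in for polygons.
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def merge_detections(candidates, iou_thresh=0.5):
    """candidates: list of (box, confidence) pooled from the semantic,
    instance, and unsupervised passes. Keeps the highest-confidence box
    among each overlapping group."""
    kept = []
    for box, conf in sorted(candidates, key=lambda c: -c[1]):
        if all(iou(box, k[0]) < iou_thresh for k in kept):
            kept.append((box, conf))
    return kept
```

A production system would operate on true polygon geometry and could fuse overlapping shapes rather than discard the lower-confidence ones.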

METHOD AND DEVICE OF SUPER RESOLUTION USING FEATURE MAP COMPRESSION

Disclosed are an image processing method and device using a line-wise operation. The image processing device, according to one embodiment, comprises: a receiver for receiving an image; a first convolution operator for generating a feature map by performing a convolution operation on the basis of the image; a compressor for compressing the feature map in units of at least one line; and a decompressor for reconstructing the feature map compressed in units of lines.
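The benefit of line-wise compression is that the decoder can reconstruct each row of the feature map as it arrives instead of buffering the whole map. A toy sketch of the idea, using per-row quantization plus zlib as a stand-in for whatever entropy coder the device actually uses (the scale factor and codec choice are assumptions):

```python
import zlib
import numpy as np

def compress_linewise(feature_map, scale=16.0):
    """Compress a 2-D feature map one row ("line") at a time."""
    lines = []
    for row in feature_map:
        q = np.clip(np.round(row * scale), -128, 127).astype(np.int8)  # quantize
        lines.append(zlib.compress(q.tobytes()))                       # entropy-code
    return lines

def decompress_linewise(lines, scale=16.0):
    rows = [np.frombuffer(zlib.decompress(b), dtype=np.int8).astype(np.float32) / scale
            for b in lines]
    return np.stack(rows)
```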

Unified shape representation

Techniques are described herein for generating and using a unified shape representation that encompasses features of different types of shape representations. In some embodiments, the unified shape representation is a unicode comprising a vector of embeddings and values for the embeddings. The embedding values are inferred, using a neural network that has been trained on different types of shape representations, based on a first representation of a three-dimensional (3D) shape. The first representation is received as input to the trained neural network and corresponds to a first type of shape representation. At least one embedding has a value dependent on a feature provided by a second type of shape representation and not provided by the first type of shape representation. The value of the at least one embedding is inferred based upon the first representation and in the absence of the second type of shape representation for the 3D shape.

METHOD AND APPARATUS FOR TEXT-TO-IMAGE GENERATION USING SELF-SUPERVISED DISCRIMINATOR TO EXTRACT IMAGE FEATURE

An apparatus for text-to-image generation that is self-supervised, based on a one-stage generative adversarial network, and uses a discriminator network that extracts an image feature may comprise: a text encoder that extracts a sentence vector from input text; a discriminator that determines, from the sentence vector and the image input from a generator, whether or not the image matches the text; and a decoder that is connected to an encoder inside the discriminator, wherein the decoder and the encoder form an autoencoder structure inside the discriminator.

Method and electronic device for analyzing image

A method for analyzing an image for anomaly detection includes obtaining a first image. The method also includes generating a second image by auto-encoding the first image. The method additionally includes extracting first and second feature vectors from the first and second images, respectively. The method further includes filtering each of the first and second feature vectors by using a filtering vector generated based on first distance values between first respective elements of the first and second feature vectors. Additionally, the method includes determining whether there is an anomaly in the first image based on second distance values between second respective elements of the filtered first and second feature vectors.
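The two-stage distance computation above can be sketched roughly as follows. The direction of the filtering step is an assumption here (elements with the largest first-pass distances are masked out before the second comparison), as are the quantile-based filter and the threshold values:

```python
import numpy as np

def anomaly_score(feat1, feat2, keep_ratio=0.8):
    """feat1/feat2: feature vectors from the original and auto-encoded image."""
    d = np.abs(feat1 - feat2)                    # first distance values
    thresh = np.quantile(d, keep_ratio)
    mask = d <= thresh                           # "filtering vector"
    # Second distance values, computed only on the filtered elements.
    return np.linalg.norm(feat1[mask] - feat2[mask])

def is_anomalous(feat1, feat2, score_thresh=1.0):
    return anomaly_score(feat1, feat2) > score_thresh
```

The intuition is that an autoencoder trained on normal images reconstructs anomalies poorly, so a large filtered feature distance flags the first image as anomalous.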

Image encoder using machine learning and data processing method of the image encoder
11694125 · 2023-07-04

An image encoder for outputting a bitstream by encoding an input image includes a predictive block, a machine learning based prediction enhancement (MLBE) block, and a subtractor. The predictive block is configured to generate a prediction block using data of a previous input block. The MLBE block is configured to transform the prediction block into an enhanced prediction block by applying a machine learning technique to the prediction block. The subtractor is configured to generate a residual block by subtracting pixel data of the enhanced prediction block from pixel data of a current input block.
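The three-block pipeline (predictive block, MLBE block, subtractor) can be sketched as below. The tiny linear model stands in for the trained network in the MLBE block; it and the sample-hold predictor are illustrative assumptions, not the patent's design:

```python
import numpy as np

def predict_block(prev_block):
    # Predictive block stand-in: reuse the co-located previous block.
    return prev_block.copy()

def mlbe_enhance(pred_block, weight=0.9, bias=0.0):
    # MLBE stand-in: a linear map in place of a trained network that
    # pulls the prediction closer to the current block.
    return weight * pred_block + bias

def encode_block(current_block, prev_block):
    pred = predict_block(prev_block)
    enhanced = mlbe_enhance(pred)
    residual = current_block - enhanced      # subtractor
    return residual
```

A smaller residual after enhancement means fewer bits in the output bitstream, which is the point of inserting the MLBE block between prediction and subtraction.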

Audio signal encoding and decoding method using learning model, training method of learning model, and encoder and decoder that perform the methods

An audio signal encoding and decoding method using a learning model, a training method of the learning model, and an encoder and decoder that perform the methods are disclosed. The audio signal decoding method may include extracting a first residual signal and a first linear prediction coefficient by decoding a bitstream received from an encoder, generating a first audio signal from the first residual signal using the first linear prediction coefficient, generating a second linear prediction coefficient and a second residual signal from the first audio signal, obtaining a third linear prediction coefficient by inputting the second linear prediction coefficient into a trained learning model, and generating a second audio signal from the second residual signal using the third linear prediction coefficient.
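The decoder-side flow can be sketched with a textbook all-pole LPC synthesis filter. The scalar "learning model" and the one-tap coefficients below are placeholders for the trained model and real LPC analysis, and the signals are illustrative:

```python
import numpy as np

def lpc_synthesize(residual, coeffs):
    # All-pole synthesis filter: x[n] = e[n] + sum_k a[k] * x[n-k].
    out = np.zeros_like(residual)
    for n in range(len(residual)):
        acc = residual[n]
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc += a * out[n - k]
        out[n] = acc
    return out

def enhance_coeffs(coeffs, model=lambda c: 0.95 * c):
    # Stand-in for the trained learning model that maps the second set of
    # coefficients to the enhanced third set.
    return model(coeffs)

# Decoder-side flow (signals are illustrative):
residual1 = np.array([1.0, 0.0, 0.0, 0.0])
lpc1 = np.array([0.5])
audio1 = lpc_synthesize(residual1, lpc1)   # first audio signal
# ... re-analysis of audio1 would yield lpc2 and residual2 ...
lpc3 = enhance_coeffs(lpc1)                # third (enhanced) coefficients
audio2 = lpc_synthesize(residual1, lpc3)   # second audio signal
```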

Unsupervised real-to-virtual domain unification for end-to-end highway driving
11543830 · 2023-01-03

An unsupervised real-to-virtual domain unification model for highway driving, or DU-drive, employs a conditional generative adversarial network to transform driving images in a real domain into their canonical representations in the virtual domain, from which vehicle control commands are predicted. When there are multiple real datasets, a real-to-virtual generator may be trained independently for each real domain, and a global predictor may be trained with data from multiple real domains. Qualitative experimental results show that the model can effectively transform real images to the virtual domain while keeping only the minimal sufficient information, and quantitative results verify that such a canonical representation can eliminate domain shift and boost the performance of the control command prediction task.