G06N3/063

Dynamic quantization for deep neural network inference system and method

A method for dynamically quantizing feature maps of a received image. The method includes convolving the image based on a predicted maximum value, a predicted minimum value, trained kernel weights, and the image data. The input data is quantized based on the predicted minimum and maximum values. The output of the convolution is computed into an accumulator and re-quantized, and the re-quantized value is output to an external memory. The predicted minimum and maximum values are computed from the previous minimum and maximum values using a weighted average or a predetermined formula. The initial minimum and maximum values are computed using known quantization methods and are used to initialize the predicted minimum and maximum values in the quantization process.
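
The range-prediction step described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `quantize`, `RangePredictor`, and `process_frame`, the 8-bit unsigned code range, and the smoothing factor `alpha` are all assumptions introduced for illustration, and the weighted average shown is only one possible reading of "a weighted average or a predetermined formula".

```python
def quantize(values, lo, hi, bits=8):
    """Map floats to unsigned integer codes using the range [lo, hi].

    Values outside the range are clamped to the nearest end code.
    """
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale


class RangePredictor:
    """Tracks predicted min/max values across frames (hypothetical helper)."""

    def __init__(self, init_min, init_max, alpha=0.9):
        # init_min / init_max stand in for the initial values that the
        # abstract says come from a known quantization method.
        self.lo = init_min
        self.hi = init_max
        self.alpha = alpha  # assumed weight given to the previous prediction

    def update(self, observed_min, observed_max):
        # Weighted average of the previous prediction and the new extremes.
        a = self.alpha
        self.lo = a * self.lo + (1 - a) * observed_min
        self.hi = a * self.hi + (1 - a) * observed_max
        return self.lo, self.hi


def process_frame(accumulator_output, predictor):
    """Re-quantize one frame's accumulator output, then refresh the prediction."""
    codes, _scale = quantize(accumulator_output, predictor.lo, predictor.hi)
    predictor.update(min(accumulator_output), max(accumulator_output))
    return codes
```

In this sketch the current frame is quantized with the range predicted from earlier frames, so no extra pass over the data is needed to find its true extremes before quantizing; the observed extremes only influence the next frame.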

Method and apparatus with neural network data input and output control
11580393 · 2023-02-14

A neural network deep learning data control apparatus includes: a memory; an encoding circuit configured to receive a data sequence, generate a compressed data sequence in which consecutive invalid bits in a bit string of the data sequence are compressed into a single bit of the compressed data sequence, generate a validity determination sequence indicating a valid bit and an invalid bit in a bit string of the compressed data sequence, and write the compressed data sequence and the validity determination sequence to the memory; and a decoding circuit configured to read the compressed data sequence and the validity determination sequence from the memory, and determine a bit in the bit string of the compressed data sequence set for transmission to a neural network circuit, based on the validity determination sequence, such that the neural network circuit omits an operation with respect to non-consecutive invalid bits.
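
A rough sketch of the encode/decode idea follows. It rests on several assumptions that the abstract does not confirm: each "bit" is treated as one data element, zero is treated as the invalid value, and the single slot that replaces a run of invalid elements stores the run length so that decoding is lossless. The function names `encode`, `decode`, and `valid_elements` are hypothetical.

```python
def encode(sequence):
    """Collapse each run of zeros into one slot and record slot validity.

    Returns (compressed, validity), where validity[i] is 1 if compressed[i]
    is real data and 0 if it is a run-length slot (an assumption of this sketch).
    """
    compressed, validity = [], []
    i = 0
    while i < len(sequence):
        if sequence[i] != 0:
            compressed.append(sequence[i])
            validity.append(1)
            i += 1
        else:
            run = 0
            while i < len(sequence) and sequence[i] == 0:
                run += 1
                i += 1
            compressed.append(run)
            validity.append(0)
    return compressed, validity


def decode(compressed, validity):
    """Rebuild the original sequence from the compressed form."""
    out = []
    for value, valid in zip(compressed, validity):
        if valid:
            out.append(value)
        else:
            out.extend([0] * value)
    return out


def valid_elements(compressed, validity):
    """Select only the slots that would be forwarded for computation."""
    return [v for v, ok in zip(compressed, validity) if ok]
```

Under these assumptions, the validity sequence lets a consumer skip the invalid slots without expanding them, which corresponds to the omitted operations mentioned in the abstract.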

Efficient inferencing with piecewise pointwise convolution

Certain aspects of the present disclosure provide techniques for performing piecewise pointwise convolution, comprising: performing a first piecewise pointwise convolution on a first subset of data received via a first branch input at a piecewise pointwise convolution layer of a convolutional neural network (CNN) model; performing a second piecewise pointwise convolution on a second subset of data received via a second branch input at the piecewise pointwise convolution layer; determining a piecewise pointwise convolution output by summing a result of the first piecewise pointwise convolution and a result of the second piecewise pointwise convolution; and providing the piecewise pointwise convolution output to a second layer of the CNN model.
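
The summation step can be sketched as below, assuming a pointwise (1x1) convolution is a per-pixel weighted sum over channels and that the two branches carry disjoint subsets of the input channels. The helper names `pointwise` and `piecewise_pointwise` and the list-of-lists data layout are illustrative assumptions, not taken from the disclosure.

```python
def pointwise(pixels, weights):
    """Apply a 1x1 convolution.

    pixels:  list of per-pixel channel vectors, e.g. [[c0, c1], ...]
    weights: weights[o][c] is the weight from input channel c to output channel o
    """
    return [
        [sum(w * x for w, x in zip(row, px)) for row in weights]
        for px in pixels
    ]


def piecewise_pointwise(branch_a, weights_a, branch_b, weights_b):
    """Convolve each branch with its own weight slice and sum the results."""
    out_a = pointwise(branch_a, weights_a)
    out_b = pointwise(branch_b, weights_b)
    return [
        [a + b for a, b in zip(px_a, px_b)]
        for px_a, px_b in zip(out_a, out_b)
    ]
```

With these definitions, summing the two partial results gives the same output as one pointwise convolution over the concatenated channels, so the branches never have to be concatenated in memory first.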

Method and apparatus to efficiently process and execute Artificial Intelligence operations
11580371 · 2023-02-14

A method, apparatus, and system are discussed to efficiently process and execute Artificial Intelligence operations. An integrated circuit has a tailored architecture to process and execute Artificial Intelligence operations, including computations for a neural network having weights with a sparse value. The integrated circuit contains at least a scheduler, one or more arithmetic logic units, and one or more random access memories configured to cooperate with each other to process and execute these computations for the neural network having weights with the sparse value.

Efficient convolutional engine
11580372 · 2023-02-14

A hardware architecture for implementing a convolutional neural network.

AUGMENTATION OF MULTIMODAL TIME SERIES DATA FOR TRAINING MACHINE-LEARNING MODELS
20230045548 · 2023-02-09

The present invention relates to training a predictive data-driven model for predicting an industrial time-dependent process. A data-driven generative model is introduced for modelling and generating complex sequential data comprising multiple modalities by learning a joint time-dependent representation of the different modalities. The model may be configured to handle any combination of missing modalities, which enables conditional generation based on known modalities and provides a high degree of control over the properties of the generated sequences.
