Patent classifications
G06F7/523
Artificial neural networks
The present disclosure relates to a neuron for an artificial neural network. The neuron includes: a first dot product engine operative to: receive a first set of weights; receive a set of inputs; and calculate the dot product of the set of inputs and the first set of weights to generate a first dot product engine output. The neuron further includes a second dot product engine operative to: receive a second set of weights; receive an input based on the first dot product engine output; and generate a second dot product engine output based on the product of the first dot product engine output and a weight of the second set of weights. The neuron further includes an activation function module arranged to generate a neuron output based on the second dot product engine output. The first dot product engine and the second dot product engine are structurally or functionally different.
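The two-stage structure described above can be sketched as follows. This is a minimal illustration, not the disclosed design: the ReLU activation, the `step` parameter, and modeling the "functionally different" second engine as a coarser, quantizing multiplier are all assumptions made for the example, and every function name is hypothetical.

```python
def dot(values, weights):
    """Plain dot product of two equal-length sequences."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))


def quantize(value, step):
    """Round to the nearest multiple of `step` (hypothetical coarse engine)."""
    return round(value / step) * step


def neuron(inputs, first_weights, second_weights, step=0.25):
    # First dot product engine: dot product of the inputs and the first weights.
    first_out = dot(inputs, first_weights)
    # Second engine: product of the first engine's output and one weight of
    # the second set. Quantizing the result is an assumption used here to
    # make the two engines functionally different.
    second_out = quantize(first_out * second_weights[0], step)
    # Activation function module: ReLU is an assumption, not from the abstract.
    return max(0.0, second_out)


if __name__ == "__main__":
    print(neuron([1.0, 2.0, 3.0], [0.5, 0.25, 1.0], [0.5]))
```

Keeping the second stage as a separate function makes it easy to swap in a different precision or a different hardware model without touching the first engine.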
Computer-implemented perceptual apparatus
A method for compressing a digital representation of a stimulus includes encoding the digital representation as a feature vector within a feature space. The method also includes multiplying the feature vector with a Jacobian that maps the feature space to a non-Euclidean perceptual space according to a perceptual system that is capable of perceiving the stimulus. This multiplication generates a perceptual vector within the non-Euclidean perceptual space. The method also includes applying an update operator to the perceptual vector to move the perceptual vector in the perceptual space to an updated vector such that the updated vector has a lower entropy than the perceptual vector. The method also includes rounding the updated vector into a compressed vector that is smaller than the feature vector.
Contiguous sparsity pattern neural networks
Methods, systems, and apparatus, including computer programs encoded on computer storage media, are described for using neural networks having contiguous sparsity patterns. One of the methods includes storing a first parameter matrix of a neural network having a contiguous sparsity pattern in storage associated with a computing device. The computing device performs an inference pass of the neural network to generate an output vector, including reading, from the storage associated with the computing device, one or more activation values from an input vector, reading, from the storage associated with the computing device, a block of non-zero parameter values, and multiplying each of the one or more activation values by one or more of the block of non-zero parameter values.
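The inference pass described above amounts to a block-sparse matrix-vector product. The sketch below assumes a hypothetical storage layout in which only the non-zero blocks are kept, each as a `(row_start, col_start, block)` tuple. The abstract does not specify a layout, so this is one possible reading.

```python
def block_sparse_matvec(blocks, x, n_rows):
    """Multiply a block-sparse matrix by the input vector x.

    `blocks` is a list of (row_start, col_start, block) tuples, where
    `block` is a dense list of rows holding contiguous non-zero parameters.
    This layout is an assumption made for illustration.
    """
    y = [0.0] * n_rows
    for row_start, col_start, block in blocks:
        width = len(block[0])
        # Read the activation values covered by this block once.
        activations = x[col_start:col_start + width]
        for r, params in enumerate(block):
            # Multiply each activation by the matching non-zero parameter.
            y[row_start + r] += sum(p * a for p, a in zip(params, activations))
    return y


if __name__ == "__main__":
    blocks = [
        (0, 0, [[1.0, 2.0], [3.0, 4.0]]),
        (2, 2, [[5.0, 6.0], [7.0, 8.0]]),
    ]
    print(block_sparse_matvec(blocks, [1.0, 1.0, 1.0, 1.0], 4))
```

Because each block is contiguous, the activations it needs form one slice of the input vector, so they can be read once and reused across every row of the block.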
Variable accuracy computing system
The present disclosure relates to a computing system. The computing system comprises a data input configured to receive an input data signal, a computation unit having an input coupled with the data input, the computation unit being operative to apply a weight to a signal received at its input to generate a weighted output signal, and a controller. The controller is configured to monitor a parameter of the input signal and/or a parameter of the output signal and to issue a control signal to the computation unit to control a level of accuracy of the weighted output signal based at least in part on the monitored parameter.
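The controller loop described above can be sketched as follows. Everything concrete here is an assumption for illustration: the monitored parameter is taken to be the input magnitude, "level of accuracy" is modeled as the number of fractional bits kept in the weighted output, and the class names, threshold, and bit widths are hypothetical.

```python
def quantize(value, frac_bits):
    """Keep `frac_bits` fractional bits (fixed-point style rounding)."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


class ComputationUnit:
    """Applies a weight to its input at a configurable accuracy."""

    def __init__(self, weight, frac_bits=8):
        self.weight = weight
        self.frac_bits = frac_bits

    def set_accuracy(self, frac_bits):
        # Stands in for the control signal issued by the controller.
        self.frac_bits = frac_bits

    def apply(self, x):
        return quantize(x * self.weight, self.frac_bits)


class Controller:
    """Monitors the input magnitude and selects the accuracy level."""

    def __init__(self, unit, threshold=0.1, low_bits=2, high_bits=8):
        self.unit = unit
        self.threshold = threshold
        self.low_bits = low_bits
        self.high_bits = high_bits

    def process(self, x):
        # Hypothetical policy: small inputs get the cheaper, coarser setting.
        small = abs(x) < self.threshold
        self.unit.set_accuracy(self.low_bits if small else self.high_bits)
        return self.unit.apply(x)


if __name__ == "__main__":
    controller = Controller(ComputationUnit(weight=0.3))
    print(controller.process(1.0), controller.process(0.05))
```

The abstract also allows the controller to monitor the output signal. That variant would inspect the value returned by `apply` and adjust the setting for the next sample.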
Logarithmic addition-accumulator circuitry, processing pipeline including same, and methods of operation
An integrated circuit including a plurality of logarithmic addition-accumulator circuits, connected in series, to, in operation, perform logarithmic addition and accumulate operations, wherein each logarithmic addition-accumulator circuit includes: (i) a logarithmic addition circuit to add a first input data and a filter weight data, each having the logarithmic data format, and to generate and output first sum data having a logarithmic data format, and (ii) an accumulator, coupled to the logarithmic addition circuit of the associated logarithmic addition-accumulator circuit, to add a second input data and the first sum data output by the associated logarithmic addition circuit to generate first accumulation data. The integrated circuit may further include first data format conversion circuitry, coupled to the output of each logarithmic addition circuit, to convert the data format of the first sum data to a floating point data format wherein the accumulator may be a floating point type.
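The logarithmic addition-accumulate operation relies on the identity log2(x) + log2(w) = log2(x * w): adding operands in a logarithmic format gives the logarithm of their product. The sketch below models one possible reading of the pipeline in software. The sign-plus-log2-magnitude encoding, the function names, and the exclusion of zero are assumptions for illustration, not details taken from the abstract.

```python
import math


def to_log(value):
    """Encode a non-zero number as (sign, log2 of magnitude)."""
    if value == 0:
        raise ValueError("zero has no finite logarithm in this simple format")
    return (1 if value > 0 else -1, math.log2(abs(value)))


def log_add(a, b):
    """Logarithmic addition: equals multiplying the underlying values."""
    return (a[0] * b[0], a[1] + b[1])


def to_float(log_value):
    """Format conversion from the logarithmic format to floating point."""
    sign, exponent = log_value
    return sign * 2.0 ** exponent


def stage(input_log, weight_log, accumulation_in):
    """One addition-accumulator stage with a floating point accumulator."""
    first_sum = log_add(input_log, weight_log)
    return accumulation_in + to_float(first_sum)


def pipeline(inputs, weights):
    """Stages connected in series, each feeding its accumulation onward."""
    accumulation = 0.0
    for x, w in zip(inputs, weights):
        accumulation = stage(to_log(x), to_log(w), accumulation)
    return accumulation


if __name__ == "__main__":
    print(pipeline([2.0, 4.0, -8.0], [0.5, 0.25, 2.0]))
```

Accumulating in floating point after the format conversion avoids having to add two numbers directly in the logarithmic domain, which has no simple closed form.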
Compute in memory architecture and dataflows for depth-wise separable convolution
Certain aspects of the present disclosure provide a method, including: storing a depthwise convolution kernel in a first one or more columns of a CIM array; storing a fused convolution kernel in a second one or more columns of the CIM array; storing pre-activations in one or more input data buffers associated with a plurality of rows of the CIM array; processing the pre-activations with the depthwise convolution kernel in order to generate depthwise output; modifying one or more of the pre-activations based on the depthwise output to generate modified pre-activations; and processing the modified pre-activations with the fused convolution kernel to generate fused output.
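The dataflow above can be sketched in software by treating the CIM array as a weight matrix whose columns each accumulate the row inputs multiplied by the stored weights. This is a rough model under stated assumptions: the function names are hypothetical, and because the abstract does not say how the pre-activations are modified, the `modify` step is left as a caller-supplied function.

```python
def cim_process(array, row_inputs, columns):
    """Each selected column sums row_inputs[r] * array[r][c] over all rows."""
    return [
        sum(row_inputs[r] * array[r][c] for r in range(len(row_inputs)))
        for c in columns
    ]


def depthwise_then_fused(array, pre_activations, dw_columns, fused_columns, modify):
    # Step 1: process the pre-activations with the depthwise kernel columns.
    depthwise_out = cim_process(array, pre_activations, dw_columns)
    # Step 2: modify the pre-activations based on the depthwise output.
    # The rule is not given in the abstract, so the caller supplies it.
    modified = modify(pre_activations, depthwise_out)
    # Step 3: process the modified pre-activations with the fused kernel.
    fused_out = cim_process(array, modified, fused_columns)
    return depthwise_out, fused_out


def overwrite_leading(pre_activations, depthwise_out):
    """Hypothetical rule: write depthwise outputs over the first buffers."""
    return list(depthwise_out) + list(pre_activations[len(depthwise_out):])


if __name__ == "__main__":
    # Column 0 holds a depthwise kernel, columns 1 and 2 a fused kernel.
    array = [[1, 2, 0], [0, 1, 1], [-1, 0, 3]]
    print(depthwise_then_fused(array, [3, 2, 1], [0], [1, 2], overwrite_leading))
```

Both kernels share the same rows, so the same input buffers drive both passes and only the column selection changes between them.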