Patent classifications
G06F2207/4828
Integrated pixel and two-terminal non-volatile memory cell and an array of cells for deep in-sensor, in-memory computing
Disclosed is a cell that integrates a pixel and a two-terminal non-volatile memory device. The cell can be selectively operated in write, read and functional computing modes. In the write mode, a first data value is stored the memory device. In the read mode, it is read from the memory device. In the functional computing mode, the pixel captures a second data value and a sensed change in an electrical parameter (e.g., voltage or current) on a bitline connected to the cell is a function of both the first and second data value. Also disclosed is an IC structure that includes an array of the cells and, when multiple cells in a given column are concurrently operated in the functional computing mode, the sensed total change in the electrical parameter on the bitline for the column is indicative of a result of a dot product computation.
TERNARY IN-MEMORY ACCELERATOR
A circuit of cells used as a memory array and capable of in-memory arithmetic is disclosed which includes a plurality of signed ternary processing, each signed ternary processing cell includes a first memory cell, adapted to hold a first digital value, a second memory cell, adapted to hold a second digital value, wherein a binary combination of the first digital value and the second digital value establishes a first signed ternary operand, a signed ternary input forming a second signed ternary operand, and a signed ternary output, wherein the signed ternary output represents a signed multiplication of the first signed ternary operand and the second signed ternary operand, a sense circuit adapted to output a subtraction result.
Eigenvalue decomposition with stochastic optimization
A computer-implemented method for Eigenpair computation is provided. The method includes computing, her a hardware processor, an Eigenvector and respective Eigenvalues of the Eigenvector of a matrix by using a modified Stochastic Optimization process including performing a matrix vector product on a Resistive Processing Unit (RPU) crossbar array operatively coupled to the hardware processor and performing a scalar vector product on a digital device operatively coupled to the hardware processor and representing, for each of an Eigenpair, an initial guess for the Eigenvector and the respective Eigenvalues. The computing step includes storing the matrix in the RPU crossbar array.
MULTIPLY AND ACCUMULATE CALCULATION DEVICE, NEUROMORPHIC DEVICE, AND MULTIPLY AND ACCUMULATE CALCULATION METHOD
A multiply and accumulate calculation device including a variable resistor array unit having a plurality of variable resistance elements, a reference array unit having a reference resistance element having a fixed resistance value, a signal input unit that generates an input signal from input data, and inputs the input signal to the variable and reference resistance elements, a first detection unit that detects a current flowing through the variable resistor array unit, based on the input signal applied to the variable resistance elements, a second detection unit that detects a current flowing through the reference array unit, based on the input signal applied to the reference resistance element, and a correction calculation unit that performs a predetermined calculation on the output from the first detection unit, based on the output from the second.
Fault-tolerant analog computing
A fault-tolerant analog computing device includes a crossbar array having a number l rows and a number n columns intersecting the l rows to form l×n memory locations. The l rows of the crossbar array receive an input signal as a vector of length l. The n columns output an output signal as a vector of length n that is a dot product of the input signal and the matrix values defined in the l×n memory locations. Each memory location is programmed with a matrix value. A first set of k columns of the n columns is programmed with continuous analog target matrix values with which the input signal is to be multiplied, where k<n. A second set of m columns of the n columns is programmed with continuous analog matrix values for detecting an error in the output signal that exceeds a threshold error value, where m<n.
TIME DOMAIN RATIOMETRIC READOUT INTERFACES FOR ANALOG MIXED-SIGNAL IN MEMORY COMPUTE CROSSBAR NETWORKS
A circuit configured to compute matrix multiply-and-add calculations that includes a digital-to-time converter configured to receive a digital input and output a signal proportional to the digital input and modulated in time-domain associated with a reference time, a memory including a crossbar network, wherein the memory is configured to receive the time modulated signal from the digital-to-time converter and output a weighted signal scaled in response to network weights of the crossbar network and the time modulated input signal, and an output interface in communication with the crossbar network and configured to receive its weighted output signal and output a digital value proportional to at least the reference time using a time-to-digital converter.
Memristor spiking architecture
A circuit for a neuron of a multi-stage compute process is disclosed. The circuit comprises a weighted charge packet (WCP) generator. The circuit may also include a voltage divider controlled by a programmable resistance component (e.g., a memristor). The WCP generator may also include a current mirror controlled via the voltage divider and arrival of an input spike signal to the neuron. WCPs may be created to represent the multiply function of a multiply accumulate processor. The WCPs may be supplied to a capacitor to accumulate and represent the accumulate function. The value of the WCP may be controlled by the length of the spike in signal times the current supplied through the current mirror. Spikes may be asynchronous. Memristive components may be electrically isolated from input spike signals so their programmed conductance is not affected. Positive and negative spikes and WCPs for accumulation may be supported.
METHOD AND APPARATUS WITH NEURAL NETWORK PROCESSING
A neural network device includes a shift register circuit, a control circuit, and a processing circuit. The shift register circuit includes registers configured to, in each cycle of cycles, transfer stored data to a next register and store new data received from a previous register to a current register. The control circuit is configured to sequentially input data of input activations included in an input feature map into the shift register circuit in a preset order. The processing circuit, includes crossbar array groups that receive input activations from at least one of the registers and perform a multiply-accumulate (MAC) operation with respect to the received input activation and weights, is configured to accumulate and add at least some operation results output from the crossbar array groups in a preset number of cycles to obtain an output activation in an output feature map.
PRODUCT-SUM OPERATION DEVICE, LOGICAL CALCULATION DEVICE, NEUROMORPHIC DEVICE, AND MULTIPLY-ACCUMULATE METHOD
A multiply-accumulate calculation device includes: multiple calculation units which generates output signals by multiplying an input signal corresponding to an input value and having a rising part, a signal part, and a falling part by a weight, and output the output signals; an accumulate calculation unit configured to calculate a sum of the output signals output from the plurality of multiple calculation units; and a correction unit configured to execute correction processing for correcting the sum of the output signals on the basis of a correction value including at least one of a first value incorporated into the sum by a current flowing into variable resistors of the multiple calculation units due to the rising part of the input signal, and a second value incorporated into the sum by a current flowing into the variable resistors of the multiple calculation units due to the falling part of the input signal.
Kernel transformation techniques to reduce power consumption of binary input, binary weight in-memory convolutional neural network inference engine
Techniques are presented for performing in-memory matrix multiplication operations for binary input, binary weight valued convolution neural network (CNN) inferencing. The weights of a filter are stored in pairs of memory cells of a storage class memory device, such as a ReRAM or phase change memory based devices. To reduce current consumption, the binary valued filters are transformed into ternary valued filters by taking sums and differences of binary valued filter pairs. The zero valued weights of the transformed filters are stored as a pair of high resistance state memory cells, reducing current consumption during convolution. The results of the in-memory multiplications are pair-wise combined to compensate for the filter transformations. To compensate for zero valued weights, a zero weight register stores the number of zero weights along each bit line and is used to initialize counter values for accumulating the multiplication operations.