G06F2207/4802

Pulse generation for updating crossbar arrays

Provided are embodiments for a computer-implemented method, a system, and a computer program product for updating an analog crossbar array. Embodiments include receiving a number, used in matrix multiplication, that is to be represented using pulse generation for a crossbar array, and receiving a bit length to represent the number. Embodiments also include selecting pulse positions in a pulse sequence having the bit length to represent the number, performing a computation using the selected pulse positions in the pulse sequence, and updating the crossbar array using the computation.
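As a rough illustration of representing a number by pulse positions within a sequence of a given bit length, the sketch below spreads the pulses evenly across the sequence. The even-spacing rule and the function names `select_pulse_positions` and `pulse_sequence` are assumptions made for this example; the abstract does not state how positions are actually chosen.

```python
def select_pulse_positions(number, bit_length):
    """Pick `number` distinct slots out of `bit_length`, spread evenly.

    Assumption for illustration: `number` is an integer in [0, bit_length].
    """
    if not 0 <= number <= bit_length:
        raise ValueError("number must lie between 0 and bit_length")
    # Integer division keeps positions distinct because number <= bit_length.
    return [(i * bit_length) // number for i in range(number)]


def pulse_sequence(number, bit_length):
    """Return a 0/1 list of length `bit_length` with a 1 at each selected slot."""
    positions = set(select_pulse_positions(number, bit_length))
    return [1 if t in positions else 0 for t in range(bit_length)]


if __name__ == "__main__":
    print(select_pulse_positions(3, 8))
    print(pulse_sequence(3, 8))
```

The count of ones in the sequence equals the represented number, so a longer bit length gives finer resolution at the cost of more pulse slots per update.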

PRODUCT-SUM OPERATION DEVICE, LOGICAL OPERATION DEVICE, NEUROMORPHIC DEVICE, AND PRODUCT-SUM OPERATION METHOD
20220092396 · 2022-03-24

A product-sum operation device including: product operation units generating output signals by multiplying input signals corresponding to input values; a current detection unit executing a current detecting process in which a current output from the product operation units with a predetermined time delay from input of the input signal and a current output from the product operation units at an interval thereafter are detected in a time span from a first transient response to before occurrence of a second transient response, the first transient response being due to charging of a parasitic capacitance of the product operation units by input of the input signal and the second transient response being due to discharging from the parasitic capacitance of the product operation units by input of the input signal; and a sum operation unit calculating a value relating to a total sum of the output signals based on the currents detected.

Differential mixed signal multiplier with three capacitors

A differential mixed-signal logic processor is provided. The differential mixed-signal logic processor includes a plurality of mixed-signal multiplier branches for multiplication of an analog value A and an N-bit digital value B. Each of the plurality of mixed-signal multiplier branches includes a first capacitor connected across a second capacitor and a third capacitor to provide a differential output across the second and third capacitors. A capacitance of the first capacitor is equal to half a capacitance of the second and third capacitors.

THROUGHPUT AND PRECISION-PROGRAMMABLE MULTIPLIER-ACCUMULATOR ARCHITECTURE
20220113942 · 2022-04-14

A method and circuit for performing multi-layer vector-matrix multiplication operations may include, at a first multiplier-accumulator (MAC) layer, converting a digital input vector using one-bit digital to analog converters (DACs); sequentially performing vector-matrix multiplication operations for the analog DAC signals; and sequentially performing an analog-to-digital (ADC) operation on outputs of the vector-matrix multiplication operations to generate binary partial output vectors. At a second MAC layer, the method and circuit may sequentially receive the binary partial output vectors from the first MAC layer at multi-bit DACs; and sequentially perform vector-matrix multiplication operations to generate a summed binary output for the second MAC layer.

SCALABLE, MULTI-PRECISION, SELF-CALIBRATED MULTIPLIER-ACCUMULATOR ARCHITECTURE
20220113941 · 2022-04-14

A method for performing vector-matrix multiplication may include converting a digital input vector comprising a plurality of binary-encoded values into a plurality of analog signals using a plurality of one-bit digital to analog converters (DACs); sequentially performing, using an analog vector matrix multiplier and based on bit-order, vector-matrix multiplication operations using a weighting matrix for the plurality of analog signals to generate analog outputs of the analog vector matrix multiplier; sequentially performing an analog-to-digital (ADC) operation on the analog outputs of the analog vector matrix multiplier to generate binary partial output vectors; and combining the binary partial output vectors to generate a result of the vector-matrix multiplication.
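The bit-order scheme described in this abstract can be sketched in software as follows. This is a minimal numerical model, assuming unsigned integer inputs, an integer weighting matrix, and ideal (noise-free) conversion; the function name `bit_serial_vmm` is hypothetical and the analog multiplier and ADC are replaced by exact arithmetic.

```python
def bit_serial_vmm(x, weights, n_bits):
    """Multiply vector x by matrix `weights` one input bit-plane at a time.

    x       : list of unsigned integers, each < 2**n_bits
    weights : list of rows, one row per element of x
    n_bits  : number of bits used to encode each input value
    """
    n_cols = len(weights[0])
    result = [0] * n_cols
    for b in range(n_bits):
        # One-bit "DAC" inputs: bit b of every input value.
        bits = [(v >> b) & 1 for v in x]
        # Partial output vector for this bit-plane (stands in for the
        # analog multiply followed by the ADC step).
        partial = [
            sum(bits[i] * weights[i][j] for i in range(len(x)))
            for j in range(n_cols)
        ]
        # Combine partial outputs, weighting each by its bit significance.
        result = [r + p * (1 << b) for r, p in zip(result, partial)]
    return result


if __name__ == "__main__":
    print(bit_serial_vmm([3, 5], [[1, 2], [3, 4]], n_bits=3))  # [18, 26]
```

Because each pass uses only one-bit inputs, the precision of the result is set by how many bit-planes are processed, which is the trade-off between throughput and precision that such architectures expose.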

SEMICONDUCTOR STORAGE DEVICE AND INFORMATION PROCESSOR
20220084581 · 2022-03-17

A semiconductor storage device has a plurality of memory cells that are arranged in a first direction and store first data, a plurality of first wiring pairs that are provided corresponding to the plurality of memory cells arranged in the first direction, and supply second data multiplied with the first data, a second wiring pair that is provided corresponding to two memory cells adjacent to each other in the first direction, and outputs multiplication data obtained by multiplying the first data stored in the two memory cells with the corresponding second data on the first wiring pair, and a third wiring pair in which potentials are changed depending on an addition result only when the addition result obtained by adding two multiplication data output to the second wiring pair to each other is not zero.

Non-Volatile Memory Accelerator for Artificial Neural Networks

A non-volatile memory (NVM) crossbar for an artificial neural network (ANN) accelerator is provided. The NVM crossbar includes row signal lines configured to receive input analog voltage signals, multiply-and-accumulate (MAC) column signal lines, a correction column signal line, a MAC cell disposed at each row signal line and MAC column signal line intersection, and a correction cell disposed at each row signal line and correction column signal line intersection. Each MAC cell includes one or more programmable NVM elements programmed to an ANN unipolar weight, and each correction cell includes one or more programmable NVM elements. Each MAC column signal line generates a MAC signal based on the input analog voltage signals and the respective MAC cells, and the correction column signal line generates a correction signal based on the input analog voltage signals and the correction cells. Each MAC signal is corrected based on the correction signal.
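One plausible reading of the correction column, sketched numerically: signed weights are shifted by a constant offset so that every programmed value is non-negative (unipolar), and a correction column programmed to that offset produces the term to subtract from each MAC signal. The offset mapping, the subtraction, and the name `corrected_mac` are assumptions for illustration; the abstract only states that each MAC signal is corrected based on the correction signal.

```python
def corrected_mac(inputs, signed_weights, offset):
    """Recover signed MAC results from unipolar cells plus a correction column.

    inputs         : list of input values, one per row signal line
    signed_weights : list of rows of signed weights (rows x columns)
    offset         : constant added to every weight so stored values are >= 0
    """
    # Unipolar weights as they would be programmed into the MAC cells.
    unipolar = [[w + offset for w in row] for row in signed_weights]
    if any(g < 0 for row in unipolar for g in row):
        raise ValueError("offset too small: unipolar weights must be >= 0")

    n_rows, n_cols = len(unipolar), len(unipolar[0])
    # Each MAC column accumulates input * unipolar weight.
    mac = [
        sum(inputs[i] * unipolar[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]
    # Correction column: every correction cell holds the offset.
    correction = sum(inputs[i] * offset for i in range(n_rows))
    # Subtracting the correction signal removes the offset contribution.
    return [m - correction for m in mac]


if __name__ == "__main__":
    print(corrected_mac([1, 2], [[-1, 2], [3, -2]], offset=3))  # [5, -2]
```

A single shared correction column suffices in this model because the offset term depends only on the inputs, not on which MAC column is being read.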

COMPUTATION SYSTEM
20220083846 · 2022-03-17

According to one embodiment, in a processing circuit of a computation system, a plurality of comparators corresponds to the respective columns, each including a first input node, a second input node, and an output node, the first input node receiving any one of the second signals, the second input node receiving a signal corresponding to a global reference signal provided to each second input node, the output node outputting a local signal. A global circuit is provided common to the plurality of comparators, the global circuit generating a global signal according to a plurality of the local signals, the global circuit generating the global reference signal by an SAR method according to the global signal. The processing circuit disables some of the plurality of comparators according to the local signals and the global signal.

SRAM-based process in memory system

Many signal processing, machine learning and scientific computing applications require a large number of multiply-accumulate (MAC) operations. This type of operation is demanding in both computation and memory. Process in memory has been proposed as a new technique that computes directly on a large array of data in place, to eliminate expensive data movement overhead. To enable parallel multi-bit MAC operations, both width- and level-modulating memory word lines are applied. To improve performance and provide tolerance against process-voltage-temperature variations, a delay-locked loop is used to generate fine unit pulses for driving memory word lines and a dual-ramp single-slope ADC is used to convert bit line outputs. The concept is prototyped in a 180 nm CMOS test chip made of four 320×64 compute-SRAMs, each supporting 128× parallel 5 b×5 b MACs with 32 5 b output ADCs and consuming 16.6 mW at 200 MHz.

Analog Dot Product Multiplier

A dot product multiplier for matrix operations for an A matrix of order 1×m with a coefficient B matrix of order m×m. Processing Elements (PEs) are arranged in an m×m array, the columns of the array summed to provide a dot product result. Each of the PEs contains a sign determiner and a plurality of analog multiplier cells, one multiplier cell for each value bit. The multipliers operate over four clock cycles, initializing a capacitor charge according to sign on a first clock phase, sharing charge on a second phase, canceling charge on a third phase, and outputting the resultant charge on a fourth phase, the resultant charge on each column representing the dot product for that column.