Patent classifications
G06F2207/4814
RECONFIGURABLE MULTIBIT ANALOG IN-MEMORY COMPUTING WITH COMPACT COMPUTATION
Systems, apparatuses and methods may provide for technology that includes a memory array to store multibit weight data and a capacitor ladder network to conduct multiply-accumulate (MAC) operations on first analog signals and the multibit weight data, the capacitor ladder network further to output second analog signals based on the MAC operations, wherein the capacitor ladder network is external to the memory array. In one example, the capacitor ladder network includes a plurality of switches and the technology includes a controller to selectively activate the plurality of switches based on a data format of the multibit weight data.
MEMORY DEVICE AND COMPUTING METHOD
A memory device and a computing method are provided. The memory device includes a memory array, comprising first and second memory blocks, and a comparator. The first memory block performs a multiplication and accumulation (MAC) operation according to a first weight matrix and a first input matrix to generate a first sum. The second memory block performs the MAC operation according to a second weight matrix and a second input matrix to generate a second sum. The comparator compares the first and second sums. In a first configuration, the values of the first and second input matrixes are the same and the values of the first and second weight matrixes are complements. In a second configuration, the values of the first and second input matrixes are complements and the values of the first and second weight matrixes are the same.
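The two configurations can be illustrated with a minimal Python sketch. It assumes binary (0/1) values, so that the complement of a bit b is 1 - b, and it uses a single row of each matrix. The function names mac, complement, and compare_blocks are hypothetical and do not come from the patent.

```python
def mac(weights, inputs):
    """Multiply-accumulate: sum of element-wise products."""
    return sum(w * x for w, x in zip(weights, inputs))


def complement(bits):
    """Bitwise complement of a list of 0/1 values (assumed binary data)."""
    return [1 - b for b in bits]


def compare_blocks(weights, inputs, configuration):
    """Return (first_sum, second_sum, comparator_output) for one row.

    configuration 1: same inputs, complementary weights.
    configuration 2: complementary inputs, same weights.
    """
    first_sum = mac(weights, inputs)
    if configuration == 1:
        second_sum = mac(complement(weights), inputs)
    elif configuration == 2:
        second_sum = mac(weights, complement(inputs))
    else:
        raise ValueError("configuration must be 1 or 2")
    # The comparator is modeled as a simple greater-than test (an assumption).
    return first_sum, second_sum, first_sum > second_sum


weights = [1, 0, 1, 1]
inputs = [1, 1, 0, 1]
print(compare_blocks(weights, inputs, 1))
print(compare_blocks(weights, inputs, 2))
```

Under these binary assumptions, the difference between the two sums in the first configuration equals the sum of x_i * (2 * w_i - 1), so the comparator output corresponds to the sign of a MAC with weights in {-1, +1}.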
Neural network semiconductor device and system using the same
A semiconductor device capable of performing product-sum operation is provided. The semiconductor device includes a first memory cell, a second memory cell, and an offset circuit. The semiconductor device retains first analog data and reference analog data in the first memory cell and the second memory cell, respectively. A potential corresponding to second analog data is applied to each of them as a selection signal, whereby current depending on the sum of products of the first analog data and the second analog data is obtained. The offset circuit includes a constant current circuit comprising a transistor and a capacitor. A first terminal of the transistor is electrically connected to a first gate of the transistor and a first terminal of the capacitor. A second gate of the transistor is electrically connected to a second terminal of the capacitor. A voltage between the first terminal and the second gate of the transistor is held in the capacitor, whereby a change in source-drain current of the transistor can be suppressed.
USING REDUCED READ ENERGY BASED ON THE PARTIAL-SUM
Embodiments include monitoring a partial sum of a multiply accumulate calculation for certain conditions. When the certain conditions are met, a reduced read energy is used to read out memory contents instead of the regular read energy used. The reduced read energy may be obtained by reducing a pre-charge voltage, withholding a pre-charge voltage or providing a ground signal, and/or by reducing voltage hold times (i.e., reducing the time a pre-charge voltage is provided and/or discharged).
Unit element for performing multiply-accumulate operations
The present invention provides an analog-digital hybrid architecture that performs 256 multiplications and additions at a time. The system comprises 256 Processing Elements (PE) (108) arranged in a matrix of 16 rows and 16 columns. The digital inputs (110) are converted to analog signals (114) using digital to analog converters (DAC) (102). Each PE (108) produces one analog output (115), which is the product of the analog input (114) and the digital weight input (112). The PE is implemented using either i) capacitors and switches or ii) resistors and switches. The outputs from multiple PEs (108) in a column are connected together to produce one analog MAC output (116). In a similar manner, the system produces 16 MAC outputs (118) corresponding to the 16 columns. Analog to digital converters (ADC) (104) are used to convert the analog MAC output (116) to digital form (118).
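The data flow described above (DAC, 16 x 16 PE array, column summation, ADC) can be sketched behaviorally in Python. This is an idealized model, not the patented circuit: the 4-bit input width, the 8-bit ADC, the reference voltage, the one-input-per-row wiring, and the function names dac, pe, mac_array, and adc are all assumptions made for illustration.

```python
ROWS, COLS = 16, 16  # 256 processing elements, as stated in the abstract


def dac(code, bits=4, v_ref=1.0):
    """Ideal DAC: map a digital code to a voltage (bit width is assumed)."""
    return v_ref * code / (2 ** bits - 1)


def pe(v_in, weight):
    """One processing element: analog input times digital weight."""
    return v_in * weight


def mac_array(input_codes, weights):
    """Return 16 analog column outputs; each sums the 16 PEs in that column.

    Assumes input i drives every PE in row i (wiring is an assumption).
    """
    analog_inputs = [dac(code) for code in input_codes]
    return [
        sum(pe(analog_inputs[r], weights[r][c]) for r in range(ROWS))
        for c in range(COLS)
    ]


def adc(voltage, full_scale, bits=8):
    """Ideal ADC: quantize a voltage to a clamped digital code."""
    code = round(voltage / full_scale * (2 ** bits - 1))
    return max(0, min(2 ** bits - 1, code))


# Usage: full-scale inputs and an identity weight matrix.
input_codes = [15] * ROWS
weights = [[1 if r == c else 0 for c in range(COLS)] for r in range(ROWS)]
analog_out = mac_array(input_codes, weights)
digital_out = [adc(v, full_scale=16.0) for v in analog_out]
print(digital_out)
```

Each call to mac_array performs 16 x 16 = 256 multiplications, matching the count given in the abstract; real hardware would add noise, mismatch, and quantization effects that this model ignores.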
System and method applied with computing-in-memory
A system includes a global generator and local generators. The global generator is coupled to a memory array, and is configured to generate global signals, according to a number of a computational output of the memory array. The local generators are coupled to the global generator and the memory array, and are configured to generate local signals, according to the global signals. Each one of the local generators includes a first reference circuit and a local current mirror. The first reference circuit is coupled to the global generator, and is configured to generate a first reference signal at a node, in response to a first global signal of the global signals. The local current mirror is coupled to the first reference circuit at the node, and is configured to generate the local signals, by mirroring a summation of at least one signal at the node.
Integer matrix multiplication based on mixed signal circuits
A multiply-accumulate device comprises a digital multiplication circuit and a mixed signal adder. The digital multiplication circuit is configured to input L m_1-bit multipliers and L m_2-bit multiplicands and configured to generate N one-bit multiplication outputs, each one-bit multiplication output corresponding to a result of a multiplication of one bit of one of the L m_1-bit multipliers and one bit of one of the L m_2-bit multiplicands. The mixed signal adder comprises one or more stages, at least one stage configured to input the N one-bit multiplication outputs, each stage comprising one or more inner product summation circuits; and a digital reduction stage coupled to an output of a last stage of the one or more stages and configured to generate an output of the multiply-accumulate device based on the L m_1-bit multipliers and the L m_2-bit multiplicands.
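The decomposition of an integer MAC into one-bit products can be sketched in Python as follows. The sketch assumes unsigned operands and computes every stage digitally, whereas the abstract describes a mixed signal adder for the summation; the function name bitwise_mac and the choice to form all L x m1 x m2 one-bit products are assumptions for illustration.

```python
def bitwise_mac(multipliers, multiplicands, m1, m2):
    """Compute sum(a * b) using only one-bit products.

    multipliers:   L unsigned integers of at most m1 bits
    multiplicands: L unsigned integers of at most m2 bits
    """
    if len(multipliers) != len(multiplicands):
        raise ValueError("operand lists must have the same length")
    total = 0
    for i in range(m1):
        for j in range(m2):
            # Inner-product summation: count the one-bit products that are 1
            # for bit i of each multiplier and bit j of each multiplicand.
            count = sum(
                ((a >> i) & 1) & ((b >> j) & 1)
                for a, b in zip(multipliers, multiplicands)
            )
            # Digital reduction: weight the count by 2^(i + j) and accumulate.
            total += count << (i + j)
    return total


a = [3, 5, 7]
b = [2, 4, 6]
print(bitwise_mac(a, b, m1=3, m2=3))  # 3*2 + 5*4 + 7*6 = 68
```

The identity behind the sketch is that a * b equals the sum over bit positions i and j of a_i * b_j * 2^(i + j), so summing the one-bit products per bit pair before shifting gives the same result as multiplying first.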
MAC operating device and method for processing machine learning algorithm
A MAC operating device is provided, comprising a plurality of operation circuits, each including an operation capacitor and a plurality of switches, and a division capacitor, wherein one end of the operation capacitor is connected to both a first operation switch connected to an input terminal and a first reset switch connected to a ground terminal, and the other end of the operation capacitor is connected to both a second operation switch connected to the division capacitor and a second reset switch connected to the ground terminal.
Layout structure for shared analog bus in unit element multiplier
A planar fabrication charge transfer capacitor for coupling charge from a Unit Element (UE) generates a positive charge first output V_PP and a positive charge second output V_NP, the first output coupled to a positive charge line comprising a continuous first planar conductor, a continuous second planar conductor parallel to the first planar conductor, and a continuous third planar conductor parallel to the first planar conductor and second planar conductor, the charge transfer capacitor comprising, in sequence: a first co-planar conductor segment, the first planar conductor, a second co-planar conductor segment, the second planar conductor, a third co-planar conductor segment, the third planar conductor, and a fourth coplanar conductor segment, the first and third coplanar conductor segments capacitively edge coupled to the UE first output V_PP, the second and fourth coplanar conductor segments capacitively edge coupled to the UE second output V_NP.
Computation apparatus and method using the same
A computation apparatus includes a plurality of memory cells and a plurality of sense amplifiers, in which each of the memory cells includes a memory circuit and a calculation circuit. The memory circuits of the memory cells are configured to receive input values from a plurality of word lines, generate a computation result based on the input values and output the computation result to a bit line. The calculation circuits of the memory cells are configured to receive calculation input values from a plurality of calculation word lines, generate calculation output values based on the calculation input values, and output the calculation output values to a plurality of calculation bit lines. The sense amplifiers are configured to sense the calculation output values from the calculation bit lines to generate sensed values, wherein a value of the computation result is determined based on the sensed values and the calculation output values.