Patent classifications
G11C13/0002
Area and power efficient implementations of modified backpropagation algorithm for asymmetric RPU devices
A device configured to implement an artificial intelligence deep neural network includes a first matrix and a second matrix. The first matrix resistive processing unit (“RPU”) array receives a first input vector along the rows of the first matrix RPU. A second matrix RPU array receives a second input vector along the rows of the second matrix RPU. A reference matrix RPU array receives an inverse of the first input vector along the rows of the reference matrix RPU and an inverse of the second input vector along the rows of the reference matrix RPU. A plurality of analog-to-digital converters are coupled to respective outputs of a plurality of summing junctions that receive respective column outputs of the first matrix RPU array, the second matrix RPU array, and the reference RPU array, and provide a digital value of the output of the plurality of summing junctions.
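The read-out described in this abstract can be illustrated numerically. The following is a minimal sketch, not the patented circuit: it assumes that the "inverse" of an input vector means the sign-inverted vector, that each column output is an ideal weighted sum of the row inputs, and that the converter is an ideal signed 8-bit ADC. The function names, the full-scale value, and the bit width are all hypothetical.

```python
def column_sums(weights, inputs):
    """Ideal analog read: each column output is the sum over rows of input * weight."""
    n_cols = len(weights[0])
    return [sum(x * row[c] for x, row in zip(inputs, weights)) for c in range(n_cols)]


def summed_columns(first, second, reference, x1, x2):
    """Model the summing junctions: add the column outputs of all three arrays.

    Assumption: the reference array is driven with the negated inputs,
    so its contribution subtracts the reference weights from both matrices.
    """
    neg_x1 = [-v for v in x1]
    neg_x2 = [-v for v in x2]
    parts = [
        column_sums(first, x1),
        column_sums(second, x2),
        column_sums(reference, neg_x1),
        column_sums(reference, neg_x2),
    ]
    return [sum(col) for col in zip(*parts)]


def adc(value, full_scale=4.0, bits=8):
    """Hypothetical ideal ADC: clip to +/- full_scale, then round to a signed code."""
    levels = 2 ** (bits - 1) - 1
    clipped = max(-full_scale, min(full_scale, value))
    return round(clipped / full_scale * levels)


def read_out(first, second, reference, x1, x2):
    """Digitize each summing-junction output with its own ADC."""
    return [adc(s) for s in summed_columns(first, second, reference, x1, x2)]
```

Under these assumptions the summed output equals (first − reference)ᵀ·x1 + (second − reference)ᵀ·x2, so a single shared reference array offsets both weight matrices before conversion.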
Neural network architecture
Various implementations are related to an apparatus with memory cells arranged in columns and rows, and the memory cells are accessible with a column control voltage for accessing the memory cells via the columns and a row control voltage for accessing the memory cells via the rows. The apparatus may include neural network circuitry having neuronal junctions that are configured to receive, record, and provide information related to incoming voltage spikes associated with input signals based on resistance through the neuronal junctions. The apparatus may include stochastic re-programmer circuitry that receives the incoming voltage spikes, receives the information provided by the neuronal junctions, and reconfigures the information recorded in the neuronal junctions based on the incoming voltage spikes associated with the input signals along with a programming control signal provided by the memory circuitry.
INTEGRATED CIRCUIT DEVICE AND METHODS
An integrated circuit (IC) device includes a substrate, and a memory array layer having a plurality of transistors. First through fourth gate contacts are arranged along a first axis, and coupled to underlying gates of the plurality of transistors. First through fifth source/drain contacts in the memory array layer extend along a second axis transverse to the first axis, and are coupled to underlying source/drains of the plurality of transistors. The gate contacts and the source/drain contacts are alternatingly arranged along the first axis. A source line extends along the first axis, and is coupled to the first and fifth source/drain contacts. First and second word lines extend along the first axis, the first word line is coupled to the first and third gate contacts, and the second word line is coupled to the second and fourth gate contacts.
APPARATUS AND METHOD WITH MULTIPLY-ACCUMULATE OPERATION
A multiply-accumulate (MAC) computation circuit includes: a bit-cell array configured to generate an analog output corresponding to a MAC operation result of an input signal; a first analog-to-digital conversion (ADC) circuit configured to determine an upper part of a digital output corresponding to the analog output; and a second ADC circuit configured to determine a lower part of the digital output based on a reference voltage corresponding to the upper part.
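The split between an upper and a lower conversion resembles a two-step (coarse/fine) ADC. The sketch below is an illustration under assumptions and is not taken from the patent: it assumes an ideal unipolar input in the range [0, v_ref), a 4-bit upper part, a 4-bit lower part, and that the "reference voltage corresponding to the upper part" is the lower edge of the coarse bin selected by the first conversion. The function name and all parameters are hypothetical.

```python
def coarse_fine_adc(v_in, v_ref=1.0, upper_bits=4, lower_bits=4):
    """Two-step conversion of v_in; returns (upper, lower, combined_code)."""
    if not 0.0 <= v_in < v_ref:
        raise ValueError("v_in must lie in [0, v_ref)")

    # First ADC: resolve the upper bits over the full input range.
    coarse_step = v_ref / (1 << upper_bits)
    upper = min(int(v_in / coarse_step), (1 << upper_bits) - 1)

    # Reference voltage corresponding to the upper part (assumed: bin lower edge).
    v_base = upper * coarse_step

    # Second ADC: resolve the lower bits within the selected coarse bin only.
    fine_step = coarse_step / (1 << lower_bits)
    lower = int((v_in - v_base) / fine_step)
    lower = max(0, min(lower, (1 << lower_bits) - 1))

    return upper, lower, (upper << lower_bits) | lower
```

For example, with the defaults an input of 0.7578125 (that is, 194/256 of full scale) yields an upper part of 12, a lower part of 2, and a combined code of 194. Each converter only needs 16 levels here, rather than one converter resolving all 256.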
ANALOG NEUROMORPHIC CIRCUIT IMPLEMENTED USING RESISTIVE MEMORIES
An analog neuromorphic circuit is disclosed, having input voltages applied to a plurality of inputs of the analog neuromorphic circuit. The circuit also includes a plurality of resistive memories that provide a resistance to each input voltage applied to each of the inputs so that each input voltage is multiplied in parallel by the corresponding resistance of each corresponding resistive memory to generate a corresponding current for each input voltage and each corresponding current is added in parallel. The circuit also includes at least one output signal that is generated from each of the input voltages multiplied in parallel with each of the corresponding currents for each of the input voltages added in parallel. The multiplying of each input voltage with each corresponding resistance is executed simultaneously with adding each corresponding current for each input voltage.
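As a rough numerical illustration of the parallel multiply-and-add described here, the sketch below models one output line. It assumes ideal resistive memories obeying Ohm's law, so each input voltage contributes a current equal to the voltage times the conductance 1/R of its memory, and the currents add on a shared node. The function name and the example values are hypothetical.

```python
def output_current(voltages, resistances):
    """Sum of per-input currents on one shared output node.

    Assumption: each resistive memory is an ideal resistor, so its
    current is V / R (equivalently V times the conductance 1 / R).
    """
    if len(voltages) != len(resistances):
        raise ValueError("one resistance is required per input voltage")
    return sum(v / r for v, r in zip(voltages, resistances))


# Three inputs in volts and three programmed resistances in ohms.
example_current = output_current([1.0, 0.5, 2.0], [1000.0, 500.0, 4000.0])
```

In hardware the per-input products and their sum settle together on the wire. The Python loop only reproduces the arithmetic, not that simultaneity.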
CIRCUIT DESIGN AND LAYOUT WITH HIGH EMBEDDED MEMORY DENSITY
Various embodiments of the present disclosure are directed towards a memory device. The memory device has a first transistor having a first source/drain and a second source/drain, where the first source/drain and the second source/drain are disposed in a semiconductor substrate. A dielectric structure is disposed over the semiconductor substrate. A first memory cell is disposed in the dielectric structure and over the semiconductor substrate, where the first memory cell has a first electrode and a second electrode, where the first electrode of the first memory cell is electrically coupled to the first source/drain of the first transistor. A second memory cell is disposed in the dielectric structure and over the semiconductor substrate, where the second memory cell has a first electrode and a second electrode, where the first electrode of the second memory cell is electrically coupled to the second source/drain of the first transistor.
Systems and methods for mapping matrix calculations to a matrix multiply accelerator
Systems and methods of configuring a fixed memory array of an integrated circuit with coefficients of one or more applications include identifying a utilization constraint type of the fixed memory array from a plurality of distinct utilization constraint types based on computing attributes of the one or more applications; identifying at least one coefficient mapping technique from a plurality of distinct coefficient mapping techniques that addresses the utilization constraint type; and configuring the fixed memory array according to the at least one coefficient mapping technique, wherein configuring the array includes at least setting within the array the coefficients of the one or more applications in an arrangement prescribed by the at least one coefficient mapping technique that optimizes a computational utilization of the fixed memory array.
Hardware accelerator with analog-content addressable memory (a-CAM) for decision tree computation
Examples described herein relate to a decision tree computation system in which a hardware accelerator for a decision tree is implemented in the form of an analog Content Addressable Memory (a-CAM) array. The hardware accelerator accesses a decision tree. The decision tree comprises multiple paths, and each path of the multiple paths includes a set of nodes. Each node of the decision tree is associated with a feature variable of multiple feature variables of the decision tree. The hardware accelerator combines multiple nodes among the set of nodes with a same feature variable into a combined single node. Wildcard values are substituted for feature variables that are not evaluated in a given path. Each combined single node associated with each feature variable is mapped to a corresponding column in the a-CAM array, and the multiple paths of the decision tree are mapped to rows of the a-CAM array.
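The path-to-row mapping in this abstract can be sketched in software. The code below is an illustrative model under assumptions, not the accelerator itself: it assumes every node is a threshold test of the form "feature < t" or "feature >= t", that an a-CAM cell stores a half-open range [lo, hi), and that a wildcard is the unbounded range. The names `path_to_row`, `build_acam`, `search`, and `EXAMPLE_PATHS` are hypothetical.

```python
import math

WILDCARD = (-math.inf, math.inf)  # range that matches any feature value


def path_to_row(conditions, n_features):
    """Combine all tests on the same feature in one path into a single range.

    conditions: list of (feature_index, op, threshold), op in {"<", ">="}.
    Features the path never tests keep the wildcard range.
    """
    row = [WILDCARD] * n_features
    for feat, op, thr in conditions:
        lo, hi = row[feat]
        if op == "<":
            hi = min(hi, thr)
        elif op == ">=":
            lo = max(lo, thr)
        else:
            raise ValueError(f"unsupported comparison: {op}")
        row[feat] = (lo, hi)
    return row


def build_acam(paths, n_features):
    """One row per root-to-leaf path, one column per feature variable."""
    return [(path_to_row(conds, n_features), label) for conds, label in paths]


def search(acam, x):
    """Return the label of the first row whose every cell range contains x."""
    for row, label in acam:
        if all(lo <= v < hi for v, (lo, hi) in zip(x, row)):
            return label
    return None


# Hypothetical two-feature tree, written out as its four root-to-leaf paths.
EXAMPLE_PATHS = [
    ([(0, "<", 5), (1, "<", 2)], "A"),
    ([(0, "<", 5), (1, ">=", 2), (0, "<", 3)], "B"),
    ([(0, "<", 5), (1, ">=", 2), (0, ">=", 3)], "C"),
    ([(0, ">=", 5)], "D"),
]
```

In the third path, feature 0 is tested twice (below 5, then at least 3). The two nodes collapse into the single cell range [3, 5), which is the node combination the abstract describes. The fourth path never tests feature 1, so that column holds a wildcard. A real a-CAM would compare all rows in parallel, whereas `search` scans them in order.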
Mixed conducting volatile memory element for accelerated writing of nonvolatile memristive device
An embodiment in the application may include an analog memory structure, and methods of writing to such a structure, including a volatile memory element in series with a non-volatile memory element. The analog memory structure may change resistance upon application of a voltage. This may enable accelerated writing of the analog memory structure.
SEMICONDUCTOR DEVICE AND ELECTRONIC DEVICE
A semiconductor device that can perform product-sum operation with low power is provided. The semiconductor device includes a switching circuit. The switching circuit includes first to fourth terminals. The switching circuit has a function of selecting one of the third terminal and the fourth terminal as the electrical connection destination of the first terminal, and selecting the other of the third terminal and the fourth terminal as the electrical connection destination of the second terminal, on the basis of first data. The switching circuit includes a first transistor and a second transistor each having a back gate. The switching circuit has a function of determining a signal-transmission speed between the first terminal and one of the third terminal and the fourth terminal and a signal-transmission speed between the second terminal and the other of the third terminal and the fourth terminal on the basis of potentials of the back gates. The potentials are determined by second data. When signals are input to the first terminal and the second terminal, a time lag between the signals output from the third terminal and the fourth terminal is determined by the first data and the second data.