G11C7/1006

INPUT CIRCUITRY FOR ANALOG NEURAL MEMORY IN A DEEP LEARNING ARTIFICIAL NEURAL NETWORK

Numerous embodiments of input circuitry for an analog neural memory in a deep learning artificial neural network are disclosed.

OUTPUT CIRCUITRY FOR ANALOG NEURAL MEMORY IN A DEEP LEARNING ARTIFICIAL NEURAL NETWORK
20230049032 · 2023-02-16

Numerous embodiments of output circuitry for an analog neural memory in a deep learning artificial neural network are disclosed. In some embodiments, a common mode circuit is used with differential cells, W+ and W−, that together store a weight, W. The common mode circuit can utilize current sources, variable resistors, or transistors as part of the structure for introducing a common mode voltage bias.
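The differential arrangement in this abstract can be illustrated with a small numeric sketch. It assumes, purely for illustration, that each cell contributes its stored value times an input and that the common mode bias is added equally to both branches; the function name and parameters are hypothetical and do not come from the application.

```python
def differential_output(w_plus, w_minus, inputs, common_mode=0):
    """Return the effective weighted sum for differential cell pairs.

    w_plus, w_minus: values stored in the W+ and W- cells (illustrative units).
    inputs: input activations applied to each cell pair.
    common_mode: bias added equally to both branches (an assumption here).
    """
    branch_plus = sum(wp * x for wp, x in zip(w_plus, inputs)) + common_mode
    branch_minus = sum(wm * x for wm, x in zip(w_minus, inputs)) + common_mode
    # The effective weight is W = W+ - W-, so a bias shared by both
    # branches cancels in the difference.
    return branch_plus - branch_minus


print(differential_output([3, 1], [1, 2], [1, 1]))
print(differential_output([3, 1], [1, 2], [1, 1], common_mode=5))
```

Both calls give the same result because the shared bias drops out of the subtraction, which is why a common mode term can shift the operating point without changing the stored weight.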

COMPUTING DEVICE, MEMORY CONTROLLER, AND METHOD FOR PERFORMING AN IN-MEMORY COMPUTATION

A method for performing an in-memory computation includes: storing data in memory cells of a memory array, the data including weights for computation; determining whether an update command to change at least one of the weights is received; in response to receiving the update command, performing a write operation on the memory array to update the at least one weight; and disabling the write operation on the memory array until receiving a next update command to change the at least one weight.
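The control flow described in this abstract can be sketched as a small software model. This is a minimal sketch under stated assumptions: the class name `WeightMemory`, its methods, and the use of an exception to model a disabled write are all hypothetical and are not taken from the application.

```python
class WeightMemory:
    """Toy model of a weight array whose writes are gated by update commands."""

    def __init__(self, weights):
        self._weights = list(weights)
        self._write_enabled = False  # writes stay disabled by default

    def handle_update(self, index, value):
        """Model an update command: enable the write, perform it, disable again."""
        self._write_enabled = True
        try:
            self._write(index, value)
        finally:
            self._write_enabled = False

    def _write(self, index, value):
        if not self._write_enabled:
            raise PermissionError("write disabled until the next update command")
        self._weights[index] = value

    def compute(self, inputs):
        """In-memory computation modeled as a dot product with the stored weights."""
        return sum(w * x for w, x in zip(self._weights, inputs))


memory = WeightMemory([1, 2, 3])
print(memory.compute([1, 1, 1]))
memory.handle_update(0, 5)
print(memory.compute([1, 1, 1]))
```

Outside of `handle_update`, any attempt to write is rejected, which mirrors the step of disabling the write operation until the next update command arrives.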

STENCIL DATA ACCESS FROM TILE MEMORY
20230049052 · 2023-02-16

A reconfigurable compute fabric of a system can include multiple nodes, and each node can include multiple, communicatively coupled tiles with respective processing and storage elements. In an example, a tile-based processor can be configured to perform operations comprising receiving a first stencil that defines input data for a first operation. The stencil can have a height corresponding to N rows in a main memory and a stencil width corresponding to M columns in the main memory. The processor can perform operations comprising establishing N buffers in a tile memory, each buffer having M buffer elements, and populating the M buffer elements of the N buffers using respective information, defined by the first stencil, from the main memory. Tile-based stencil operations can use information from the N buffers and provide compute results in an output array.
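The buffering scheme in this abstract can be approximated in software as follows. This is a minimal sketch, assuming main memory is a list of rows and that the stencil slides over every valid position; the function names `load_stencil` and `apply_stencil` are hypothetical.

```python
def load_stencil(main_memory, top, left, n_rows, m_cols):
    """Establish N buffers of M elements each, filled from main memory."""
    return [list(main_memory[top + r][left:left + m_cols]) for r in range(n_rows)]


def apply_stencil(main_memory, n_rows, m_cols, op):
    """Slide an N x M stencil over main memory and collect results in an output array."""
    out = []
    for top in range(len(main_memory) - n_rows + 1):
        out_row = []
        for left in range(len(main_memory[0]) - m_cols + 1):
            buffers = load_stencil(main_memory, top, left, n_rows, m_cols)
            out_row.append(op(buffers))
        out.append(out_row)
    return out


grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
window_sum = lambda buffers: sum(sum(row) for row in buffers)
print(apply_stencil(grid, 2, 2, window_sum))
```

A hardware tile would likely reuse buffer contents between neighboring stencil positions rather than reloading them; the sketch reloads each time only to keep the mapping from stencil to buffers explicit.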

NEUROMORPHIC HARDWARE APPARATUS BASED ON A RESISTIVE MEMORY ARRAY

A neuromorphic hardware apparatus based on a resistive memory array includes a resistive memory array in which a plurality of synaptic resistor elements are arranged. Each synaptic resistor element changes its resistance value in response to an applied voltage pulse and retains that resistance value for a predetermined time. The apparatus also includes a neuron circuit configured to receive an output signal from the resistive memory array and to output a voltage signal to another resistive memory array. The neuron circuit includes a temperature compensation unit, which compensates the output voltage of the resistive memory array on the basis of the operating temperature of the resistive memory array. Even when the resistive memory array produces an abnormal output because of its operating temperature, compensating the value input to the neuron circuit prevents an operation error from occurring.

MODULAR MEMORY ARCHITECTURE WITH GATED SUB-ARRAY OPERATION DEPENDENT ON STORED DATA CONTENT

A memory circuit includes an array of memory cells arranged with first word lines connected to a first sub-array storing less significant bits of data and second word lines connected to a second sub-array storing more significant bits of data. A row decoder circuit coupled to the first and second word lines generates word line signals. A word line gating circuit is configured to selectively gate passage of the word line signals to the second word lines for the second sub-array in response to assertion of a maximum value signal. A data modification circuit performs a mathematical operation on data read from the array of memory cells, and asserts the maximum value signal if the mathematical operation performed on the less significant bits of data from the first sub-array produces a maximum data value.

Digital compute-in-memory (DCIM) bit cell circuit layouts and DCIM arrays for multiple operations per column

Digital compute-in-memory (DCIM) bit cell circuit layouts and DCIM array circuits for multiple operations per column are disclosed. A DCIM bit cell array circuit including DCIM bit cell circuits comprising exemplary DCIM bit cell circuit layouts disposed in columns is configured to evaluate the results of multiple multiply operations per clock cycle. The DCIM bit cell circuits in the DCIM bit cell circuit layouts each couple to one of a plurality of column output lines in a column. In this regard, in each cycle of a system clock, each of the plurality of column output lines receives a result of a multiply operation of a DCIM bit cell circuit coupled to the column output line. The DCIM bit cell array circuit includes digital sense amplifiers coupled to each of the plurality of column output lines to reliably evaluate a result of a plurality of multiply operations per cycle.

Memory sense amplifier trimming

A memory device, such as an MRAM memory, includes a memory array with a plurality of bit cells. The memory array is configured to store trimming information and store user data. A sense amplifier is configured to read the trimming information from the memory array, and a trimming register is configured to receive the trimming information from the sense amplifier. The sense amplifier is configured to receive the trimming information from the trimming register so as to operate in a trimmed mode for reading the user data from the memory array.

Multi-port memory architecture for a systolic array

A memory architecture and a processing unit that incorporates the memory architecture and a systolic array. The memory architecture includes: memory array(s) with multi-port (MP) memory cells; first wordlines connected to the cells in each row; and, depending upon the embodiment, second wordlines connected to diagonals of cells or diagonals of sets of cells. Data from a data input matrix is written to the memory cells during first port write operations using the first wordlines and read out from the memory cells during second port read operations using the second wordlines. Due to the diagonal orientation of the second wordlines and due to additional features (e.g., additional rows of memory cells that store static zero data values or read data mask generators that generate read data masks), data read from the memory architecture and input directly into a systolic array is in the proper order, as specified by a data setup matrix.
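The effect of reading along diagonals can be illustrated with a short sketch. It assumes one common skewed ordering for systolic arrays, in which row i of the input matrix is delayed by i cycles and empty slots are filled with zeros; the actual data setup matrix is not given in the abstract, and the function name `diagonal_reads` is hypothetical.

```python
def diagonal_reads(matrix):
    """Return one read per cycle, each taken along a diagonal of the stored matrix.

    At cycle t, row i contributes matrix[i][t - i] when that column exists,
    and a zero otherwise (modeling the static zero values used as padding).
    """
    rows, cols = len(matrix), len(matrix[0])
    reads = []
    for t in range(rows + cols - 1):
        reads.append([
            matrix[i][t - i] if 0 <= t - i < cols else 0
            for i in range(rows)
        ])
    return reads


for cycle, values in enumerate(diagonal_reads([[1, 2], [3, 4]])):
    print(cycle, values)
```

Because each read already carries the per-row delay, the values can be fed to the systolic array without a separate reordering stage, which is the benefit the abstract attributes to the diagonal second wordlines.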

Bit string accumulation in multiple registers
11579843 · 2023-02-14

Methods, systems, and apparatuses related to performing bit string accumulation within a compute or memory device are described. A logic circuit with processing capability and a register within or near memory, for example, can perform multiple iterations of a recursive operation using several bit strings. Results of the various iterations may be written to the register, and subsequent iterations of the recursive operation using the bit strings may be performed. Results of the iterations of recursive operations may be accumulated within the register. Accumulated results may be written as data to another register or to memory that is external to or separate from the logic circuit.
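The accumulation loop in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions: the recursive operation is taken to be a multiply-accumulate, bit strings are modeled as non-negative Python integers, and the register has a fixed width that wraps on overflow. None of these specifics, nor the function name `accumulate`, come from the patent.

```python
def accumulate(a, b, iterations, width=16):
    """Accumulate a * b into a fixed-width register over several iterations."""
    mask = (1 << width) - 1
    register = 0
    for _ in range(iterations):
        # Each iteration reads the previous result from the register,
        # adds the new product, and writes the sum back.
        register = (register + a * b) & mask
    return register


def write_back(register_value, memory, address):
    """Copy the accumulated result to a separate memory (modeled as a dict)."""
    memory[address] = register_value
    return memory


result = accumulate(3, 4, iterations=5)
print(result)
print(write_back(result, {}, address=0))
```

Keeping the running value in the register and writing it out only once at the end is the point of the scheme: intermediate results never travel to external memory.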