Patent classifications
G06F7/5443
COMPLEX MULTIPLICATION CIRCUIT
A first multiplex circuit generates a first multiplex signal obtained by time-divisionally multiplexing a first real part and a first imaginary part of a first complex number. A second multiplex circuit generates a second multiplex signal obtained by time-divisionally multiplexing a second real part and a second imaginary part of a second complex number. A multiply-subtract operation circuit performs a multiply-subtract operation of the first and second multiplex signals. A third multiplex circuit generates a third multiplex signal obtained by time-divisionally multiplexing the first and second real parts. A fourth multiplex circuit generates a fourth multiplex signal obtained by time-divisionally multiplexing the first and second imaginary parts. A multiply-accumulate operation circuit performs a multiply-accumulate operation of the third and fourth multiplex signals. A fifth multiplex circuit generates a fifth multiplex signal obtained by time-divisionally multiplexing output values of the multiply-subtract operation circuit and the multiply-accumulate operation circuit.
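The dataflow in this abstract can be sketched in software as follows. This is a minimal illustration, not the claimed circuit: the function name and the use of two-element lists as "time slots" are hypothetical, and the slot ordering of the fourth multiplex signal is an assumption chosen so that the multiply-accumulate step yields the cross terms of the complex product.

```python
def complex_multiply_tdm(a_re, a_im, b_re, b_im):
    """Model (a_re + j*a_im) * (b_re + j*b_im) with time-multiplexed operands."""
    # First and second multiplex signals: real part in slot 0, imaginary part in slot 1.
    first = [a_re, a_im]
    second = [b_re, b_im]
    # Multiply-subtract: slot-0 product minus slot-1 product gives the real part.
    real = first[0] * second[0] - first[1] * second[1]

    # Third multiplex signal carries the two real parts.
    third = [a_re, b_re]
    # Fourth multiplex signal carries the two imaginary parts.
    # Assumed ordering: each real part is paired with the OTHER operand's imaginary part.
    fourth = [b_im, a_im]
    # Multiply-accumulate: sum of the slot products gives the imaginary part.
    imag = third[0] * fourth[0] + third[1] * fourth[1]

    # Fifth multiplex signal: the two results, one per time slot.
    return [real, imag]
```

For example, `complex_multiply_tdm(1, 2, 3, 4)` returns `[-5, 10]`, matching (1 + 2j)(3 + 4j) = -5 + 10j.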
INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING PROGRAM
A reservoir includes a common input layer, first and second output layers that output first and second readout values based on an input, a first partial reservoir including the input layer and the first output layer, and a second partial reservoir whose size between the input layer and the second output layer is larger than the size of the first partial reservoir. The training processing includes: first calculating a third output weight that reduces a difference between a first product sum value of a third readout value and a first output weight; and second calculating a fourth output weight that reduces a difference between a second product sum value of a fourth readout value and a second output weight and differential teaching data that is a difference between a third product sum value of the third readout value and the third output weight and the teaching data.
MEMORY ARRAY WITH PROGRAMMABLE NUMBER OF FILTERS
Aspects of the present disclosure are directed to devices and methods for performing MAC operations using a memory array as a compute-in-memory (CIM) device that can enable higher computational throughput, higher performance and lower energy consumption compared to computation using a processor outside of a memory array. In some embodiments, an activation architecture is provided using a bit cell array arranged in rows and columns to store charges that represent a weight value in a weight matrix. A read word line (RWL) may be repurposed to provide the input activation value to bit cells within a row of bit cells, while a read-bit line (RBL) is configured to receive multiplication products from bit cells arranged in a column. Some embodiments provide multiple sub-arrays or tiles of bit cell arrays.
COMPUTE-IN-MEMORY SYSTEMS AND METHODS WITH CONFIGURABLE INPUT AND SUMMING UNITS
A device includes a multiplication unit and a configurable summing unit. The multiplication unit is configured to receive data and weights for an Nth layer, where N is a positive integer. The multiplication unit is configured to multiply the data by the weights to provide multiplication results. The configurable summing unit is configured by Nth layer values to receive an Nth layer number of inputs and perform an Nth layer number of additions, and to sum the multiplication results and provide a configurable summing unit output.
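One way to picture the configurable summing unit is as a reduction whose group size is set per layer. The sketch below is an assumption-laden illustration: the name `configurable_sum` and the parameter `fan_in` (standing in for the "Nth layer number of inputs") are hypothetical and do not come from the abstract.

```python
def configurable_sum(data, weights, fan_in):
    """Multiply data by weights, then sum the products in groups of `fan_in`."""
    # Multiplication unit: one product per (data, weight) pair.
    products = [d * w for d, w in zip(data, weights)]
    # Configurable summing unit: `fan_in` is the layer-specific number of inputs
    # that each summation receives (hypothetical parameter name).
    return [sum(products[i:i + fan_in]) for i in range(0, len(products), fan_in)]
```

With `fan_in=2`, `configurable_sum([1, 2, 3, 4], [1, 1, 2, 2], 2)` returns `[3, 14]`; with `fan_in=4` the same inputs collapse to `[17]`.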
HIGH PERFORMANCE SYSTEMS AND METHODS FOR MODULAR MULTIPLICATION
A circuit system for performing modular reduction of a modular multiplication includes multiplier circuits that receive a first subset of coefficients that are generated by summing partial products of a multiplication operation that is part of the modular multiplication. The multiplier circuits multiply the coefficients in the first subset by constants that equal remainders of divisions to generate products. Adder circuits add a second subset of the coefficients and segments of bits of the products that are aligned with respective ones of the second subset of the coefficients to generate sums.
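The underlying arithmetic idea can be sketched as follows, assuming the coefficients are the word-wise column sums of a product, with coefficient i weighted by 2^(width*i). High-order coefficients are multiplied by precomputed constants equal to 2^(width*i) mod p and added to the low-order coefficients. The function name and parameters are hypothetical, and the sketch adds whole products rather than bit-aligned segments of them, so it shows the congruence only, not the claimed adder arrangement.

```python
def fold_reduce(coeffs, width, modulus, keep):
    """Return a value congruent (mod `modulus`) to sum(coeffs[i] << (width * i))."""
    low = coeffs[:keep]    # coefficients that are added directly
    high = coeffs[keep:]   # coefficients that are first multiplied by constants
    # Precomputed constants: remainders of 2**(width * i) divided by the modulus.
    constants = [pow(2, width * (keep + j), modulus) for j in range(len(high))]
    total = sum(c << (width * i) for i, c in enumerate(low))
    total += sum(c * k for c, k in zip(high, constants))
    return total  # partially reduced; a final small reduction may still be needed
```

The result is much shorter than the full product but is not guaranteed to be below the modulus, so a final correction step would normally follow.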
NEURAL NETWORK COMPUTING DEVICE AND COMPUTING METHOD THEREOF
A computing method performs a matrix multiply-and-accumulate computation using a flash memory array that includes word lines, bit lines and flash memory cells. The computing method includes the following steps: respectively storing a weight value in each of the flash memory cells, receiving a plurality of input voltages via the word lines, performing a computation on one of the input voltages and the weight value by each of the flash memory cells to obtain an output current, outputting the output currents of the flash memory cells via the bit lines, and accumulating the output currents of the flash memory cells connected to the same bit line of the bit lines to obtain a total output current. Each of the flash memory cells is an analog device, and each of the input voltages, each of the output currents and each of the weight values is an analog value.
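A simple numerical model of the steps above, assuming each cell's output current is the product of its input voltage and a stored conductance-like weight. The abstract only says the cell performs "a computation", so the product form is an assumption, and the function name is hypothetical.

```python
def bitline_currents(input_voltages, weights):
    """weights[r][c] is the value stored in the cell at word line r, bit line c."""
    n_cols = len(weights[0])
    totals = []
    for c in range(n_cols):
        # Each cell on this bit line contributes voltage * weight (assumed model);
        # the bit line accumulates the contributions of all rows.
        totals.append(sum(v * row[c] for v, row in zip(input_voltages, weights)))
    return totals
```

For instance, `bitline_currents([1.0, 2.0], [[0.5, 1.0], [0.25, 2.0]])` returns `[1.0, 5.0]`, which is the vector-matrix product of the inputs and the weight matrix.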
PROCESSING-IN-MEMORY (PIM) DEVICE
A PIM device includes a memory/arithmetic region including a plurality of memory banks and a plurality of MAC operators, the plurality of MAC operators including a first MAC operator, a peripheral region including a data input/output circuit, and a global data input/output (GIO) line capable of providing a data transmission path between the peripheral region and the memory/arithmetic region. The first MAC operator is configured to perform an element-wise multiplication (EWM) operation by performing a multiplication operation on first input data and second input data that are transmitted from first and second memory banks of the plurality of memory banks, respectively, to generate multiplication result data and transmitting the multiplication result data to a third memory bank. While the EWM operation is being performed, data transmission through the GIO line between the peripheral region and the memory/arithmetic region is blocked.
ACCELERATED CRYPTOGRAPHIC-RELATED PROCESSING WITH FRACTIONAL SCALING
Cryptographic-related processing is performed using an n-bit accelerator. The processing includes providing a binary operand to a multiply-and-accumulate unit of the n-bit accelerator. The multiply-and-accumulate unit performs an operation using the binary operand and a predetermined fractional constant F to obtain an operation result, and rounds the operation result by discarding x least-significant bits of the operation result to obtain a fractionally-scaled result, where x is a configurable number of bits to discard from the operation result, and the fractionally-scaled result facilitates performing the cryptographic-related processing.
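A minimal sketch of fractional scaling in integer arithmetic, assuming the "operation" is a multiplication by the constant F and that rounding is a plain right shift that drops the x least-significant bits. The function and parameter names are hypothetical.

```python
def fractional_scale(operand, frac_const, discard_bits):
    """Multiply by a fixed constant, then drop `discard_bits` low-order bits."""
    product = operand * frac_const   # assumed operation: multiplication by F
    return product >> discard_bits   # discard the x least-significant bits
```

With `frac_const=3` and `discard_bits=2`, the call scales by 3/4: `fractional_scale(100, 3, 2)` returns `75`.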
CROSS COUPLED CAPACITOR ANALOG IN-MEMORY PROCESSING DEVICE
A system for performing analog multiply-and-accumulate (MAC) operations employs at least one cross coupling capacitor processing unit (C3PU). A system includes a wordline to which an analog input voltage is applied, a voltage supply line having a supply voltage (VDD), a bitline, a clock signal line, a current integrator op-amp connected to the bitline and to the clock signal line, and a C3PU connected to the wordline. The C3PU includes a CMOS transistor and a capacitive unit. The capacitive unit includes a cross coupling capacitor and a gate capacitor. The cross coupling capacitor is connected between the wordline and the gate terminal of the CMOS transistor. The gate capacitor is connected between the gate terminal and ground. The CMOS transistor is configured to conduct a current that is proportional to voltage applied to the gate terminal.
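The capacitive unit described above behaves, to first order, like a capacitive voltage divider feeding the transistor gate. The sketch below assumes an ideal divider (parasitic and transistor gate capacitance ignored) and a linear transconductance `gm`; both the simplifications and the names are assumptions, not taken from the abstract.

```python
def c3pu_current(v_in, c_cross, c_gate, gm):
    """Current of one unit: divider sets the gate voltage, transistor converts it."""
    # Ideal capacitive divider between the wordline and ground.
    v_gate = v_in * c_cross / (c_cross + c_gate)
    # Transistor current taken as proportional to the gate voltage (linear model).
    return gm * v_gate


def bitline_current(v_inputs, c_crosses, c_gate, gm):
    """Sum of the unit currents sharing one bitline (the accumulate step)."""
    return sum(c3pu_current(v, c, c_gate, gm) for v, c in zip(v_inputs, c_crosses))
```

In this model the capacitor ratio acts as a multiplicative factor on the input voltage, and the shared bitline performs the accumulation that the integrator then reads out.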
MULTIPLICATION BY A RATIONAL IN HARDWARE WITH SELECTABLE ROUNDING MODE
A fixed logic circuit for performing multiplication of an input x by a constant rational p/q so as to calculate an output y according to a directed rounding or round-to-nearest rounding mode. Fixed logic hardware is derived comprising an addition array configured to operate on canonical signed digit (CSD) forms of binary values (a CSD array) so as to form an approximation of a multiplication of an input x [m−1:0] by a rational p/q. A truncated summation array of a finite sequence of most significant bits of an infinite CSD expansion of the rational p/q operating on the bits of the input x satisfies
Registers define a plurality of corrective constants for a respective plurality of rounding modes, and selection logic selects the respective corrective constant for that rounding mode in dependence on a rounding mode in which the truncated summation array is to operate.
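As a small worked illustration of the general idea, the sketch below multiplies an 8-bit input by 1/3 using a truncated fixed-point constant expressed in canonical signed digit form, then adds a mode-dependent corrective constant before shifting. The constant 171/512, the corrective values, and all names are assumptions chosen for this one example (p/q = 1/3, m = 8); they are not taken from the patent.

```python
def to_csd(n):
    """Canonical signed digit (non-adjacent) form of n >= 0, least significant first."""
    digits = []
    while n:
        if n & 1:
            d = 2 - (n & 3)   # +1 when n % 4 == 1, -1 when n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def csd_multiply(x, digits):
    """Shift-and-add/subtract array: sum of +/-(x << i) for each nonzero digit."""
    return sum(d * (x << i) for i, d in enumerate(digits) if d)


SHIFT = 9
CONST_DIGITS = to_csd(171)   # 171 / 2**9 approximates 1/3 from above
# Hypothetical corrective constants, one per rounding mode, valid for 0 <= x < 256.
CORRECTION = {"down": 0, "nearest": 256, "up": 341}


def times_one_third(x, mode):
    """Approximate x / 3 for an 8-bit x under the selected rounding mode."""
    return (csd_multiply(x, CONST_DIGITS) + CORRECTION[mode]) >> SHIFT
```

Here 171 has the signed-digit form 256 - 64 - 16 - 4 - 1, so the multiplication needs five shifted terms, and only the added constant changes between rounding modes.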