Patent classifications
G06F2207/481
ACCUMULATOR FOR DIGITAL COMPUTATION-IN-MEMORY ARCHITECTURES
Certain aspects provide an apparatus for performing machine learning tasks, and in particular, to computation-in-memory architectures. One aspect provides a method for in-memory computation. The method generally includes: accumulating, via each digital counter of a plurality of digital counters, output signals on a respective column of multiple columns of a memory, wherein a plurality of memory cells are on each of the multiple columns, the plurality of memory cells storing multiple bits representing weights of a neural network, wherein the plurality of memory cells of each of the multiple columns correspond to different word-lines of the memory; adding, via an adder circuit, output signals of the plurality of digital counters; and accumulating, via an accumulator, output signals of the adder circuit.
RATE DOMAIN NUMERICAL PROCESSING CIRCUIT AND METHOD
Rate domain numerical processing comprises receiving an input serial data stream on a single input wire in which a multi-valued number is represented as a rate of pulse events comprising pulses, null pulses or a combination thereof. The rate of pulse events in an output serial data stream is varied in accordance with an operand to perform a multi-valued arithmetic operation selected from multiplication, division, addition and subtraction or a combination thereof. The rate may be varied by scaling the rate by an operand, which may be implemented using a count and compare circuit topology. The output serial data stream is output on a single output wire. The rate domain operations, specifically multiplication and division are accomplished without the resource & power intensive binary multiplier and binary divisions circuits. The operations are implemented using simple registers, adders, accumulators, counters, comparators, and basic logic, which is far more SWaP-C efficient.
BIT-SERIAL MULTIPLIER FOR FPGA APPLICATIONS
A Field-Programmable Gate Array (FPGA) implementation of a multiplier topology can provide a considerable increase in computation performance and cost benefit as compared to other approaches, particularly for large bit widths (e.g., for multiplication of large-bit numbers). A lack of sufficient input/output (I/O) ports on the FPGA for a particular bit width can be remedied by implementing large-bit number multiplications in a bit-serial fashion. The bit-serial multiplier topologies described herein can provide a relatively small footprint as compared to other approaches. An FPGA-implemented bit-serial multiplier can improve operation of a computing system, for example, by offloading binary multiplication operations from a general-purpose processor.
IN-MEMORY MATRIX MULTIPLICATION WITH BINARY COMPLEMENT INPUTS
A matrix-vector multiplication device includes an input encoder that encodes an input vector into a binary complement format value and a binary true format value; a pulse generator that converts each encoded bit of the binary complement format value and each encoded bit of the binary true format value into a corresponding pulse signal; a crossbar array of weights, wherein each weight is encoded as a differential analog conductance of resistive memory devices, wherein the pulse generator simultaneously applies a pulse signal corresponding to a given encoded bit of the binary complement format value and a pulse signal corresponding to a given encoded bit of the binary true format value to corresponding resistive memory devices; an analog-to-digital converter that digitizes outputs of the crossbar array of weights to generate partial dot-product results; and a digital counter that computes a final dot-product result from the partial dot-product results.
Accumulator for digital computation-in-memory architectures
Certain aspects provide an apparatus for performing machine learning tasks, and in particular, to computation-in-memory architectures. One aspect provides a method for in-memory computation. The method generally includes: accumulating, via each digital counter of a plurality of digital counters, output signals on a respective column of multiple columns of a memory, wherein a plurality of memory cells are on each of the multiple columns, the plurality of memory cells storing multiple bits representing weights of a neural network, wherein the plurality of memory cells of each of the multiple columns correspond to different word-lines of the memory; adding, via an adder circuit, output signals of the plurality of digital counters; and accumulating, via an accumulator, output signals of the adder circuit.