G06F7/544

DATA TRANSFER WITH CONTINUOUS WEIGHTED PPM DURATION SIGNAL
20230046980 · 2023-02-16 ·

A computer-implemented method for processing signals is provided including advantageously generating a temporally continuous weighted pulse position modulation (CW PPM) duration signal from an input analog signal, converting the CW PPM duration signal to a memory access signal, executing a multiply and accumulate (MAC) operation with the memory access signal, and advantageously generating the input analog signal from a result of the MAC operation by an activation function (AF).

INTEGRATED CIRCUIT AND METHOD OF OPERATING SAME

An integrated circuit includes a first encoder, a compute in-memory (CIM) array and a de-encoder. The first encoder is configured to quantize a first received signal into a first signal. The first received signal has a first floating point number format. The first signal has an integer number format. The compute in-memory (CIM) array is coupled to the first encoder. The CIM array is configured to generate a CIM signal in response to at least the first signal. The CIM signal has the integer number format. The de-encoder is coupled to the CIM array, and is configured to generate a first output signal in response to the CIM signal. The first output signal has a second floating point number format.

TECHNIQUES FOR FAST DOT-PRODUCT COMPUTATION
20230053261 · 2023-02-16 · ·

Techniques are presented to improve the speed of calculating floating-point dot-products, such as in a floating point unit (FPU). Rather than determine the full maximum exponent initially and wait until the full individual shift amounts are calculated to right-shift each mantissa product, each product of exponents is divided into two fields, a high field and a low field. The low field is used as a fine-grained shift amount to right-shift each mantissa product as soon as the mantissa product is ready, while only hi field participates in the maximum exponent calculation. This allows a dot-product computation to be speed up in two ways: Right-shifting of the mantissa product can begin as soon as the mantissa products are calculated, without waiting for the maximum exponent calculation; and calculation of the maximum exponent is sped up because it is calculated only on the high fields of the exponent, not its full-width.

SEMICONDUCTOR DEVICE AND ELECTRONIC DEVICE

A semiconductor device that has low power consumption and is capable of performing arithmetic operation is provided. The semiconductor device includes first to third circuits and first and second cells. The first cell includes a first transistor, and the second cell includes a second transistor. The first and second transistors operate in a subthreshold region. The first cell is electrically connected to the first circuit, the first cell is electrically connected to the second and third circuits, and the second cell is electrically connected to the second and third circuits. The first cell sets current flowing from the first circuit to the first transistor to a first current, and the second cell sets current flowing from the second circuit to the second transistor to a second current. At this time, a potential corresponding to the second current is input to the first cell. Then, a sensor included in the third circuit supplies a third current to change a potential of the second wiring, whereby the first cell outputs a fourth current corresponding to the first current and the amount of change in the potential.

CIRCUITRY FOR PERFORMING A MULTIPLY-ACCUMULATE OPERATION

The present disclosure relates to circuitry for performing a multiply-accumulate (MAC) operation. The circuitry comprises a first multiplexer having a plurality of inputs for receiving a plurality of unary-coded input signals representing operands of the MAC operation and an output for outputting a multiplexer output signal representing a result of the MAC operation and a first vector quantizer configured to receive a plurality of weighting signals, each representing a proportion of a computation time period for which a respective one of the unary-coded input signals should be selected by the multiplexer and to output a first selector signal to the multiplexer to cause the multiplexer to select each of the input signals in accordance with the plurality of weighting signals.

Multiplication and accumulation (MAC) operator
11579870 · 2023-02-14 · ·

A MAC operator includes a plurality of multipliers, a plurality of floating-point to fixed-point converters, an adder tree, an accumulator, and a fixed-point to floating-point converter. Each of the plurality of multipliers may perform a multiplication operation on first data and second data of a single-precision floating-point (FP32) format to output multiplication result data of the FP 32 format. Each of the plurality of floating-point to fixed-point converters may convert the FP 32 format into a fixed-point format. The adder tree may perform a first addition operation on the data of the fixed-point format. The accumulator may perform an accumulation operation on the data output from the adder tree. And the fixed-point to floating-point converter may convert the data of the fixed-point format into data of the FP32 format.

Digital compute-in-memory (DCIM) bit cell circuit layouts and DCIM arrays for multiple operations per column

Digital compute-in-memory (DCIM) bit cell circuit layouts and DCIM array circuits for multiple operations per column are disclosed. A DCIM bit cell array circuit including DCIM bit cell circuits comprising exemplary DCIM bit cell circuit layouts disposed in columns is configured to evaluate the results of multiple multiply operations per clock cycle. The DCIM bit cell circuits in the DCIM bit cell circuit layouts each couples to one of a plurality of column output lines in a column. In this regard, in each cycle of a system clock, each of the plurality of column output lines receives a result of a multiply operation of a DCIM bit cell circuit coupled to the column output line. The DCIM bit cell array circuit includes digital sense amplifiers coupled to each of the plurality of column output lines to reliably evaluate a result of a plurality of multiply operations per cycle.

Grouped convolution using point-to-point connected channel convolution engines

A processor system comprises a plurality of processing elements. Each processing element includes a corresponding convolution processor unit configured to perform a portion of a groupwise convolution. The corresponding convolution processor unit determines multiplication results by multiplying each data element of a portion of data elements in a convolution data matrix with a corresponding data element in a corresponding groupwise convolution weight matrix. The portion of data elements in the convolution data matrix that are multiplied belong to different channels and different groups. For each specific channel of the different channels, the corresponding convolution processor unit sums together at least some of the multiplication results belonging to the same specific channel to determine a corresponding channel convolution result data element. The processing elements sum together a portion of the channel convolution result data elements from a group of different convolution processor units to determine a groupwise convolution result data element.

Bit string accumulation in multiple registers
11579843 · 2023-02-14 · ·

Methods, Systems, and apparatuses related to performing bit string accumulation within a compute or memory device are described. A logic circuit with processing capability and a register within or near memory, for example, can perform multiple iterations of a recursive operation using several bit strings. Results of the various iterations may be written to the register, and subsequent iterations of the recursive operation using the bit strings may be performed. Results of the iterations of recursive operations may be accumulated within the register. Accumulated results may be written as data to another register or to memory that is external to or separate from the logic circuit.

Processing apparatus and electronic device including the same

Provided are processing and an electronic device including the same. The processing apparatus includes a bit cell line comprising bit cells connected in series, a mirror circuit unit configured to generate a mirror current by replicating a current flowing through the bit cell line at a ratio, a charge charging unit configured to charge a voltage corresponding to the mirror current as the mirror current replicated by the mirror circuit unit is applied, and a voltage measuring unit configured to output a value corresponding to a multiply-accumulate (MAC) operation of weights and inputs applied to the bit cell line, based on the voltage charged by the charge charging unit.