G06F7/5095

SEMICONDUCTOR DEVICE
20240248684 · 2024-07-25

A semiconductor device according to one embodiment includes: an initial value setting unit configured to provide an initial value of a register that holds a cumulative value to be a result of a product-sum operation in a product-sum operation circuit; and an initial value canceling circuit configured to cancel the initial value contained in the cumulative value held by the register and output a final output value. The initial value setting unit sets a positive or negative value other than zero as the initial value.
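The preload-then-cancel idea in this abstract can be illustrated with a minimal software sketch. The function name, the argument names, and the use of plain subtraction as the canceling step are assumptions made for illustration; the abstract does not say how the canceling circuit is built.

```python
def product_sum_with_offset(pairs, initial_value):
    """Accumulate a*b over pairs in a register preloaded with a nonzero value,
    then cancel that value to recover the true product-sum.

    Hypothetical helper: names and the subtraction-based cancel are assumptions.
    """
    if initial_value == 0:
        raise ValueError("initial_value must be nonzero in this sketch")
    register = initial_value              # role of the initial value setting unit
    for a, b in pairs:
        register += a * b                 # product-sum operation on the register
    return register - initial_value       # role of the initial value canceling circuit


result = product_sum_with_offset([(1, 2), (3, 4), (-5, 6)], initial_value=-7)
```

Because the same value is added at the start and removed at the end, the final output equals the plain product-sum regardless of which nonzero initial value is chosen.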

Multiply accumulate (MAC) unit with split accumulator
12039290 · 2024-07-16

In a multiply accumulate (MAC) unit, an accumulator may be implemented in two or more stages. For example, a first accumulator may accumulate products from the multiplier of the MAC unit, and a second accumulator may periodically accumulate the running total of the first accumulator. Each time the first accumulator's running total is accumulated by the second accumulator, the first accumulator may be initialized to begin a new accumulation period. In one embodiment, the number of values accumulated by the first accumulator within an accumulation period may be a user-adjustable parameter. In one embodiment, the bit width of the input of the second accumulator may be greater than the bit width of the output of the first accumulator. In another embodiment, an adder may be shared between the first and second accumulators, and a multiplexor may switch the accumulation operations between the first and second accumulators.
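A two-stage accumulator of the kind described above might be modeled as follows. This is a sketch only: the function name, the `first_bits` width, and the overflow check are hypothetical, and the final fold of a partial period is a convenience of the model rather than something stated in the abstract.

```python
def split_accumulate(products, period, first_bits=16):
    """Two-stage accumulation: a narrow first stage is flushed into a wider
    second stage every `period` products (the user-adjustable parameter).

    Hypothetical model: `first_bits` is an assumed signed width of stage one.
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    limit = 1 << (first_bits - 1)         # signed range of the first stage
    first = 0                             # first accumulator (narrow)
    second = 0                            # second accumulator (wide; unbounded int here)
    count = 0
    for p in products:
        first += p
        if not -limit <= first < limit:
            raise OverflowError("first-stage accumulator overflowed; shorten the period")
        count += 1
        if count == period:
            second += first               # second stage absorbs the running total
            first = 0                     # first stage is initialized for a new period
            count = 0
    return second + first                 # fold in any partial last period


total = split_accumulate(range(1, 101), period=8)
```

A shorter period keeps the first-stage running total small, which is what allows that stage to be narrower than the second.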

ACCUMULATION SYSTEMS AND METHODS
20240256223 · 2024-08-01

Example accumulation systems and methods are described. In one implementation, data is received for processing. A multiplication operation is performed on the received data to generate multiplied data. An addition operation is performed on the multiplied data to generate a result. At least a portion of the least significant bits of the result are stored in a first region of an accumulation buffer of a convolution core. And, at least a portion of the remaining bits of the result are stored in a shared memory that is separate from the convolution core.
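The split of a result between a local accumulation buffer and a shared memory can be sketched as a bit-field split. The helper names and the `low_bits` parameter are assumptions for illustration; the abstract does not give the widths or the storage layout.

```python
def split_result(result, low_bits):
    """Split an integer result into its low `low_bits` bits (for the local
    accumulation buffer) and the remaining high bits (for shared memory).

    Hypothetical helper; widths and names are assumptions.
    """
    mask = (1 << low_bits) - 1
    low = result & mask                   # least significant bits, always non-negative
    high = result >> low_bits             # remaining bits (arithmetic shift keeps the sign)
    return low, high


def join_result(low, high, low_bits):
    """Recombine the two stored parts into the original result."""
    return (high << low_bits) | low


low, high = split_result(0x12345, low_bits=8)
```

In this model the round trip `join_result(*split_result(x, k), k)` returns `x` for negative values as well, because Python's right shift on integers is arithmetic.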

DATA ACCUMULATION APPARATUS AND METHOD, AND DIGITAL SIGNAL PROCESSING DEVICE

The present disclosure provides a data accumulation device and method, and a digital signal processing device. The device comprises: an accumulation tree module for accumulating input data in the form of a binary tree structure and outputting accumulated result data; a register module including a plurality of groups of registers and used for registering intermediate data generated by the accumulation tree module during an accumulation process and the accumulated result data; and a control circuit for generating a data gating signal to control the accumulation tree module to filter out the input data not required to be accumulated, and generating a flag signal to perform the following control: selecting a result obtained after adding one or more of the intermediate data stored in the register to the accumulated result as output data, or directly selecting the accumulated result as output data. Thus, a plurality of groups of input data can be rapidly accumulated to a group of sums in a clock cycle. At the same time, the accumulation device can flexibly select to simultaneously accumulate some data of the plurality of input data by means of a control signal.
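A gated binary accumulation tree with flag-controlled output selection might be modeled as below. This is a software sketch under assumptions: the function names, the list-based gate, and the way stored intermediates are passed in are all hypothetical, and the model ignores clocking and register grouping.

```python
def tree_accumulate(values, gate):
    """Sum `values` pairwise in a binary tree, zeroing inputs whose gate bit is 0.

    Returns the final sum and the list of intermediate levels, which stand in
    for the data held by the register module. Hypothetical model.
    """
    if len(values) != len(gate):
        raise ValueError("values and gate must have the same length")
    level = [v if g else 0 for v, g in zip(values, gate)]   # data gating
    if not level:
        return 0, []
    intermediates = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)               # pad odd levels so every node has two inputs
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        intermediates.append(list(level))
    return level[0], intermediates


def select_output(total, stored, flag):
    """Flag set: add stored intermediate data to the total; otherwise pass it through."""
    return total + sum(stored) if flag else total


total, levels = tree_accumulate([1, 2, 3, 4, 5, 6, 7, 8], [1] * 8)
```

Gating with `[1, 0, 1, 0, 1, 0, 1, 0]` would accumulate only the odd inputs, which mirrors the abstract's point that a control signal can select a subset of the inputs.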

Accumulation systems and methods

Example accumulation systems and methods are described. In one implementation, data is received for processing. A multiplication operation is performed on the received data to generate multiplied data. An addition operation is performed on the multiplied data to generate a result. At least a portion of the least significant bits of the result are stored in a first region of an accumulation buffer of a convolution core. And, at least a portion of the remaining bits of the result are stored in a shared memory that is separate from the convolution core.

Redundant representation of numeric value using overlap bits

A redundant representation is provided where an M-bit value represents a P-bit numeric value using a plurality of N-bit portions, where M>P>N. An anchor value identifies the significance of bits of each N-bit portion, and within a group of at least two adjacent N-bit portions, two or more overlap bits of a lower N-bit portion of the group have a same significance as two or more least significant bits of at least one upper N-bit portion of the group. A plurality of operation circuit units can perform a plurality of independent N-bit operations in parallel, each N-bit operation comprising computing a function of corresponding N-bit portions of at least two M-bit operand values having the redundant representation to generate a corresponding N-bit portion of an M-bit result value having the redundant representation. This enables fast associative processing of relatively long M-bit values in the time taken for performing an N-bit operation.
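The role of the overlap bits can be illustrated with a simplified model. Everything concrete here is an assumption: 16-bit portions with 4 overlap bits, four portions per value, non-negative values only, and an anchor fixed so that the lowest portion starts at bit 0. The abstract does not fix any of these choices.

```python
N = 16            # assumed width of each portion (lane)
V = 4             # assumed number of overlap bits at the top of each lane
W = N - V         # non-overlap bits carried by each lane
LANES = 4         # assumed number of lanes, so M = N * LANES = 64


def to_redundant(x):
    """Split a non-negative integer into lanes of W payload bits each;
    the V overlap bits of every lane start out as zero."""
    if not 0 <= x < 1 << (W * LANES):
        raise ValueError("value does not fit in this sketch's lanes")
    mask = (1 << W) - 1
    return [(x >> (i * W)) & mask for i in range(LANES)]


def lane_add(a, b):
    """Add lane by lane with no carry between lanes; carries collect in the
    overlap bits, which share significance with the next lane's low bits."""
    out = [ai + bi for ai, bi in zip(a, b)]
    if any(lane >= 1 << N for lane in out):
        raise OverflowError("overlap bits exhausted; overlap propagation needed")
    return out


def from_redundant(lanes):
    """Resolve the redundancy by summing lanes at their significances."""
    return sum(lane << (i * W) for i, lane in enumerate(lanes))


values = [0xFFFFFFFFFFFF] * 16
acc = to_redundant(0)
for v in values:
    acc = lane_add(acc, to_redundant(v))
```

In this model each lane addition is independent, so a long value is added in the time of one N-bit addition, and V overlap bits leave room for 2**V freshly converted operands before carries must be propagated.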

Floating-point calculation apparatus, program, and calculation apparatus

A floating-point calculation apparatus comprising: a selection part; an addition and subtraction calculation part; an output determination part; and a buffer management part configured to add, when it is determined that a buffer used to store an input value is not prepared, a buffer that corresponds to the input value, wherein when a number of significant digits of the result of performing an addition and subtraction calculation exceeds a number of significant digits of the buffer selected by the selection part, the addition and subtraction calculation part shifts right or shifts left part of the result of performing the addition and subtraction calculation and divides the result of performing the addition and subtraction calculation into values each being storable in one of a plurality of buffers.

REDUNDANT REPRESENTATION OF NUMERIC VALUE USING OVERLAP BITS
20170139673 · 2017-05-18 ·

A redundant representation is provided where an M-bit value represents a P-bit numeric value using a plurality of N-bit portions, where M>P>N. An anchor value identifies the significance of bits of each N-bit portion, and within a group of at least two adjacent N-bit portions, two or more overlap bits of a lower N-bit portion of the group have a same significance as two or more least significant bits of at least one upper N-bit portion of the group. A plurality of operation circuit units can perform a plurality of independent N-bit operations in parallel, each N-bit operation comprising computing a function of corresponding N-bit portions of at least two M-bit operand values having the redundant representation to generate a corresponding N-bit portion of an M-bit result value having the redundant representation. This enables fast associative processing of relatively long M-bit values in the time taken for performing an N-bit operation.

Coarse floating point accumulator circuit, and MAC processing pipelines including same
12282748 · 2025-04-22

An integrated circuit including a multiplier-accumulator circuit pipeline including a plurality of MAC circuits. Each MAC circuit includes: (A) a multiplier circuit to multiply first input data and filter weight data to generate and output first product data having a floating point data format, and (B) a coarse floating point accumulator circuit including: (1) an alignment shift circuit to shift at least one field of the first product data and generate shifted first product data, and (2) fixed point addition circuitry, coupled to the alignment shift circuit, to add second input data and the shifted first product data using the fixed point addition circuitry. The plurality of MAC circuits of the multiplier-accumulator circuit execution pipeline, in operation, each perform a plurality of multiply operations and accumulate operations to process the first input data and generate processed data therefrom.
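The align-then-add step of such an accumulator can be approximated in software as follows. This is a rough sketch: the fixed-point format (`frac_bits`), the significand width (`mant_bits`), truncation toward zero, and the function names are all assumptions, and Python floats stand in for whatever floating point format the hardware uses.

```python
import math


def align(product, frac_bits=16, mant_bits=24):
    """Alignment shift: shift the product's significand by its exponent so the
    value lands on a fixed-point grid with `frac_bits` fractional bits.

    Hypothetical model; bits shifted out on the right are truncated toward zero.
    """
    mantissa, exponent = math.frexp(product)      # product == mantissa * 2**exponent
    mant_int = int(mantissa * (1 << mant_bits))   # integer significand field
    shift = exponent - mant_bits + frac_bits
    if shift >= 0:
        return mant_int << shift
    sign = -1 if mant_int < 0 else 1
    return sign * (abs(mant_int) >> -shift)


def coarse_mac(inputs, weights, frac_bits=16):
    """Multiply in floating point, then accumulate with integer (fixed point) adds."""
    acc = 0
    for x, w in zip(inputs, weights):
        acc += align(x * w, frac_bits)            # fixed point addition
    return acc / (1 << frac_bits)                 # convert back for inspection


y = coarse_mac([1.5, 0.25, -0.5], [2.0, 4.0, 1.0])
```

The accumulation itself uses only integer addition; precision is lost only in the alignment shift, when low-order significand bits fall below the fixed-point grid.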

Split accumulator with a shared adder
12293163 · 2025-05-06

In a multiply accumulate (MAC) unit, an accumulator may be implemented in two or more stages. For example, a first accumulator may accumulate products from the multiplier of the MAC unit, and a second accumulator may periodically accumulate the running total of the first accumulator. Each time the first accumulator's running total is accumulated by the second accumulator, the first accumulator may be initialized to begin a new accumulation period. In one embodiment, the number of values accumulated by the first accumulator within an accumulation period may be a user-adjustable parameter. In one embodiment, the bit width of the input of the second accumulator may be greater than the bit width of the output of the first accumulator. In another embodiment, an adder may be shared between the first and second accumulators, and a multiplexor may switch the accumulation operations between the first and second accumulators.