G06F7/5324

Computer processor for higher precision computations using a mixed-precision decomposition of operations
11126428 · 2021-09-21 ·

Embodiments detailed herein relate to arithmetic operations on floating-point values. An exemplary processor includes decoding circuitry to decode an instruction, where the instruction specifies locations of a plurality of operands, the values of which are in a floating-point format. The exemplary processor further includes execution circuitry to execute the decoded instruction, where the execution includes to: convert the values for each operand, each value being converted into a plurality of lower precision values, where an exponent is to be stored for each operand; perform arithmetic operations among lower precision values converted from values for the plurality of the operands; and generate a floating-point value by converting a resulting value from the arithmetic operations into the floating-point format and store the floating-point value.
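
The decomposition described in this abstract can be illustrated in software. The following is a minimal sketch, not the claimed circuitry: it assumes float64 inputs, and the limb width `LIMB_BITS`, the limb count `NUM_LIMBS`, and the function names `decompose` and `multiply` are all hypothetical choices made for illustration. Each operand is split into several narrow integer pieces plus one stored exponent, the pieces are multiplied pairwise, and the accumulated result is converted back to floating point.

```python
import math

LIMB_BITS = 14   # hypothetical width of each lower-precision piece
NUM_LIMBS = 4    # 4 * 14 = 56 bits, enough for the 53-bit float64 significand
TOTAL_BITS = LIMB_BITS * NUM_LIMBS

def decompose(x):
    """Split a float into (sign, exponent, limbs), least-significant limb first."""
    sign = -1 if math.copysign(1.0, x) < 0 else 1
    mant, exp = math.frexp(abs(x))             # abs(x) == mant * 2**exp
    as_int = int(math.ldexp(mant, TOTAL_BITS))  # exact: mant has at most 53 bits
    mask = (1 << LIMB_BITS) - 1
    limbs = [(as_int >> (LIMB_BITS * i)) & mask for i in range(NUM_LIMBS)]
    return sign, exp, limbs

def multiply(x, y):
    """Multiply two floats using only narrow limb-by-limb products."""
    sx, ex, lx = decompose(x)
    sy, ey, ly = decompose(y)
    acc = 0
    for i, a in enumerate(lx):
        for j, b in enumerate(ly):
            # Each partial product is at most 28 bits wide before shifting.
            acc += (a * b) << (LIMB_BITS * (i + j))
    # One exponent was stored per operand; combine them and undo the scaling.
    scale = ex + ey - 2 * TOTAL_BITS
    return sx * sy * math.ldexp(float(acc), scale)
```

For finite inputs whose product neither overflows nor underflows, `multiply(x, y)` in this sketch agrees with the native `x * y`, because the integer accumulation is exact and the only rounding happens in the final conversion back to floating point.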

FAST DIGITAL MULTIPLY-ACCUMULATE (MAC) BY FAST DIGITAL MULTIPLICATION CIRCUIT
20210240447 · 2021-08-05 ·

Certain aspects provide methods and apparatus for multiplication of digital signals. In accordance with certain aspects, a multiplication circuit may be used to multiply a portion of a first digital input signal with a portion of a second digital input signal via a first multiplier circuit to generate a first multiplication signal, and multiply another portion of the first digital input signal with another portion of the second digital input signal via a second multiplier circuit to generate a second multiplication signal. A third multiplier circuit and multiple adder circuits may be used to generate an output of the multiplication circuit based on the first and second multiplication signals.
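
One well-known scheme that matches the shape of this description (two multipliers on the operand portions, a third multiplier, and several adders) is Karatsuba-style multiplication. The abstract does not name the technique, so the sketch below is an assumption for illustration only; the function name `split_multiply` and the parameter `half_bits` are hypothetical, and the inputs are taken to be unsigned integers.

```python
def split_multiply(a, b, half_bits=8):
    """Multiply two unsigned integers using three narrower multiplications."""
    mask = (1 << half_bits) - 1
    a_hi, a_lo = a >> half_bits, a & mask
    b_hi, b_lo = b >> half_bits, b & mask

    p_hi = a_hi * b_hi                      # first multiplier: high portions
    p_lo = a_lo * b_lo                      # second multiplier: low portions
    p_mid = (a_hi + a_lo) * (b_hi + b_lo)   # third multiplier: summed portions

    # Adders recover the cross term a_hi*b_lo + a_lo*b_hi from the three products.
    cross = p_mid - p_hi - p_lo
    return (p_hi << (2 * half_bits)) + (cross << half_bits) + p_lo
```

The design trades one of the four multiplications of the schoolbook method for a few extra additions and subtractions.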

NEURAL NETWORK DEVICE, NEURAL NETWORK SYSTEM, AND OPERATION METHOD EXECUTED BY NEURAL NETWORK DEVICE
20210303979 · 2021-09-30 ·

According to an embodiment, a neural network device includes a circuit configured to receive a first bit sequence representing a first value and output a second bit sequence representing a threefold value of the first value. The device includes a circuit configured to generate a fourth bit sequence based on the first and second bit sequences and two adjacent bits of a third bit sequence representing a second value, and output a fifth bit sequence representing a product of the first and second values based on the fourth bit sequence, and to generate a seventh bit sequence based on the first and second bit sequences and two adjacent bits of a sixth bit sequence representing a third value, and output an eighth bit sequence representing a product of the first and third values based on the seventh bit sequence.
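
The idea of computing the threefold value once and then consuming the multiplier two bits at a time can be sketched as a radix-4 multiplication. This is an illustrative model under stated assumptions, not the claimed circuit: operands are unsigned integers, and the names `shared_multiples`, `multiply_with`, and `y_bits` are hypothetical. The same precomputed multiples of the first value are reused for products with two different second operands, as in the abstract.

```python
def shared_multiples(x):
    """Return the multiples (0, x, 2x, 3x); 3x is computed once as 2x + x."""
    double = x << 1
    return (0, x, double, double + x)

def multiply_with(multiples, y, y_bits=8):
    """Multiply by y, selecting one precomputed multiple per pair of bits of y."""
    product = 0
    for shift in range(0, y_bits, 2):
        digit = (y >> shift) & 0b11          # two adjacent bits of y
        product += multiples[digit] << shift
    return product

multiples = shared_multiples(13)
first_product = multiply_with(multiples, 200)   # 13 * 200
second_product = multiply_with(multiples, 77)   # 13 * 77, reusing the same 3x
```

Processing two bits per step halves the number of partial products compared with a bit-at-a-time scheme, at the cost of the one extra addition needed to form the threefold value.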

Multiplier Circuitry

Various implementations described herein are related to a device having multiplier circuitry with an array of summation result cells that holds summation bit values for shifted arrays added together. The device may include latch circuitry having one or more gated elements disposed between the summation result cells, and the gated elements may be adapted to provide a portion of the summation bit values based on a gating signal.

COMPUTER PROCESSOR FOR HIGHER PRECISION COMPUTATIONS USING A MIXED-PRECISION DECOMPOSITION OF OPERATIONS
20210103444 · 2021-04-08 ·

Embodiments detailed herein relate to arithmetic operations on floating-point values. An exemplary processor includes decoding circuitry to decode an instruction, where the instruction specifies locations of a plurality of operands, the values of which are in a floating-point format. The exemplary processor further includes execution circuitry to execute the decoded instruction, where the execution includes to: convert the values for each operand, each value being converted into a plurality of lower precision values, where an exponent is to be stored for each operand; perform arithmetic operations among lower precision values converted from values for the plurality of the operands; and generate a floating-point value by converting a resulting value from the arithmetic operations into the floating-point format and store the floating-point value.

ACCUMULATOR HARDWARE
20230409287 · 2023-12-21 ·

Accumulator hardware logic includes first and second addition logic units and a store. The first addition logic unit comprises a first input, a second input and an output, each of the first and second inputs arranged to receive an input value in each clock cycle. The second addition logic unit comprises a first input that is connected directly to the output of the first addition logic unit. It also comprises a second input and an output. The store is arranged to store a result output by the second addition logic unit. The accumulator hardware logic further comprises shifting hardware and/or negation hardware positioned in a feedback path between the store and the second input of the second addition logic unit. The shifting hardware is configured to perform a shift by a fixed number of bit positions in a fixed direction.
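
The data path described here can be modelled cycle by cycle. The sketch below is a behavioural illustration only, with assumptions labelled: the class name `Accumulator`, the method name `clock`, the choice of a right shift as the fixed direction, and the use of unbounded Python integers are all hypothetical.

```python
class Accumulator:
    """Behavioural model: two chained adders, a store, and a feedback path."""

    def __init__(self, shift=0, negate=False):
        self.shift = shift      # fixed number of bit positions (assumed right shift)
        self.negate = negate    # whether the feedback path negates the stored value
        self.store = 0

    def clock(self, a, b):
        """Advance one clock cycle with two new input values."""
        first_sum = a + b                      # first addition logic unit
        feedback = self.store >> self.shift    # shifting hardware in the feedback path
        if self.negate:
            feedback = -feedback               # negation hardware in the feedback path
        self.store = first_sum + feedback      # second addition logic unit, then store
        return self.store
```

With `shift=0` and `negate=False` the model reduces to a plain running sum that absorbs two inputs per cycle.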

Processor with efficient arithmetic units

A processor includes a carry save array multiplier. The carry save array multiplier includes an array of cascaded partial product generators. The array of cascaded partial product generators is configured to generate an output value as a product of two operands presented at inputs of the multiplier. The array of cascaded partial product generators is also configured to generate an output value as a sum of two operands presented at inputs of the multiplier.

MULTI-PARTITIONING FOR COMBINATION OPERATIONS
20210049177 · 2021-02-18 ·

Systems and methods are disclosed for processing and executing queries against one or more datasets. As part of processing the query, the system determines whether the query is susceptible to a significantly imbalanced partition. If the query is susceptible to an imbalanced partition, the system monitors the query and determines whether to perform a multi-partitioning determination to avoid a significantly imbalanced partition.

COMPUTER PROCESSOR FOR HIGHER PRECISION COMPUTATIONS USING A MIXED-PRECISION DECOMPOSITION OF OPERATIONS
20210089303 · 2021-03-25 ·

Embodiments detailed herein relate to arithmetic operations on floating-point values. An exemplary processor includes decoding circuitry to decode an instruction, where the instruction specifies locations of a plurality of operands, the values of which are in a floating-point format. The exemplary processor further includes execution circuitry to execute the decoded instruction, where the execution includes to: convert the values for each operand, each value being converted into a plurality of lower precision values, where an exponent is to be stored for each operand; perform arithmetic operations among lower precision values converted from values for the plurality of the operands; and generate a floating-point value by converting a resulting value from the arithmetic operations into the floating-point format and store the floating-point value.

ARITHMETIC CIRCUIT
20210064340 · 2021-03-04 ·

An arithmetic circuit includes an LUT generation circuit (1) that, when coefficients c[n] (n=1, . . . , N) are paired two by two, outputs a value calculated for each of the pairs, and distributed arithmetic circuits (2-m) that calculate values z[m] of product-sum arithmetic, by which data x[m, n] of a data set X[m] containing M pairs of data x[m, n] are multiplied by the coefficients c[n] and the products are summed up, in parallel for each of the M pairs. The distributed arithmetic circuits (2-m) include a plurality of binomial distributed arithmetic circuits that calculate the value of binomial product-sum arithmetic for each of the pairs, based on a value obtained by pairing N data x[m, n] corresponding to the circuit two by two, a value obtained by pairing the coefficients c[n] two by two, and the value calculated by the LUT generation circuit (1), a summing circuit that sums up the calculated values, and a figure matching circuit that matches the number of decimal figures of the sum with a predetermined number of decimal figures.
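
The pairing of coefficients with a shared lookup table can be sketched in software as bit-serial distributed arithmetic. This is a minimal illustration under stated assumptions: data values are unsigned integers of `data_bits` bits, N is even, the names `build_luts`, `binomial_da`, and `product_sum` are hypothetical, and the figure matching step is omitted. The tables are built once from the coefficients and then reused for every data set X[m].

```python
def build_luts(coeffs):
    """For each coefficient pair (c0, c1), tabulate 0, c0, c1, and c0 + c1."""
    assert len(coeffs) % 2 == 0, "this sketch assumes an even number of coefficients"
    return [(0, c0, c1, c0 + c1)
            for c0, c1 in zip(coeffs[0::2], coeffs[1::2])]

def binomial_da(lut, x0, x1, data_bits):
    """Compute c0*x0 + c1*x1 without multiplying, one bit position at a time."""
    total = 0
    for k in range(data_bits):
        # Bit k of each data value selects which table entry to add.
        index = ((x0 >> k) & 1) | (((x1 >> k) & 1) << 1)
        total += lut[index] << k
    return total

def product_sum(luts, data, data_bits=8):
    """Sum the binomial results over all pairs to get z = sum(c[n] * x[n])."""
    pairs = zip(data[0::2], data[1::2])
    return sum(binomial_da(lut, x0, x1, data_bits)
               for lut, (x0, x1) in zip(luts, pairs))

luts = build_luts([3, -1, 4, 2])                 # shared across all data sets
z_first = product_sum(luts, [10, 20, 30, 40])    # 3*10 - 1*20 + 4*30 + 2*40
z_second = product_sum(luts, [1, 1, 1, 1])       # 3 - 1 + 4 + 2
```

Because the tables depend only on the coefficients, the cost of building them is paid once, and each additional data set needs only table lookups, shifts, and additions.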