Patent classifications
G06F7/5045
ADDER CIRCUIT USING LOOKUP TABLES
A four-input lookup table (LUT4) is modified to operate in a first mode as an ordinary LUT4 and in a second mode as a 1-bit adder providing a sum output and a carry output. A six-input lookup table (LUT6) is modified to operate in a first mode as an ordinary LUT6 with a single output and in a second mode as a 2-bit adder providing a sum output and a carry output. Both possible results for the two different possible carry inputs can be determined and selected between when the carry input is available, implementing a 2-bit carry-select adder when in the second mode and retaining the ability to operate as an ordinary LUT6 in the first mode. Using the novel LUT6 design in a circuit chip fabric allows a 2-bit adder slice to be built that efficiently makes use of the LUT6 without requiring additional logic blocks.
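The carry-select idea in this abstract can be illustrated in software. The sketch below is an assumption-laden model, not the patented circuit: it precomputes one lookup table per possible carry input for a 2-bit slice, then picks between the two results once the carry is known. The names `build_table`, `add2_carry_select`, and `chain_add` are hypothetical.

```python
def build_table(carry_in):
    """Precompute (2-bit sum, carry out) for every pair of 2-bit operands."""
    table = {}
    for a in range(4):
        for b in range(4):
            total = a + b + carry_in
            table[(a, b)] = (total & 0b11, total >> 2)
    return table


# One table per possible carry input, mirroring "both possible results".
TABLE_CIN0 = build_table(0)
TABLE_CIN1 = build_table(1)


def add2_carry_select(a, b, carry_in):
    """2-bit add: look up both candidate results, then select by carry_in."""
    result_if_0 = TABLE_CIN0[(a, b)]
    result_if_1 = TABLE_CIN1[(a, b)]
    return result_if_1 if carry_in else result_if_0


def chain_add(x, y, width_bits=8):
    """Chain 2-bit slices into a wider adder (width_bits assumed even)."""
    carry = 0
    result = 0
    for shift in range(0, width_bits, 2):
        s, carry = add2_carry_select((x >> shift) & 0b11, (y >> shift) & 0b11, carry)
        result |= s << shift
    return result, carry
```

In hardware both lookups happen in parallel before the carry arrives, so only the final selection sits on the carry path; the sequential Python loop models the result, not the timing.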
Processor and method for outer product accumulate operations
A processor and method for performing outer product and outer product accumulation operations on vector operands requiring large numbers of multiplies and accumulations are disclosed.
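For readers unfamiliar with the operation, an outer product accumulate adds the product u[i] * v[j] into every element acc[i][j] of an accumulator matrix. The following is a minimal software sketch of that arithmetic only; it makes no claim about the disclosed processor's design, and the function names are hypothetical.

```python
def outer_product_accumulate(acc, u, v):
    """Add the outer product of vectors u and v into matrix acc, in place."""
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            acc[i][j] += ui * vj
    return acc


def matmul_by_outer_products(a, b):
    """Multiply matrices by accumulating one outer product per inner index k."""
    rows, inner, cols = len(a), len(b), len(b[0])
    acc = [[0] * cols for _ in range(rows)]
    for k in range(inner):
        column_k = [a[i][k] for i in range(rows)]  # k-th column of a
        row_k = b[k]                               # k-th row of b
        outer_product_accumulate(acc, column_k, row_k)
    return acc
```

Each call performs len(u) * len(v) multiplies and accumulations, which is why the operation suits workloads such as matrix multiplication.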
METHOD FOR A STAGE OPTIMIZED HIGH SPEED ADDER
A method for fast parallel adder processing. The method includes receiving parallel inputs from a communications path, wherein each input comprises one bit, adding the inputs using a parallel structure, wherein the parallel structure is optimized to accelerate the addition by utilizing a characteristic that the inputs are one bit each, and transmitting the resulting outputs to a subsequent stage.
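The abstract does not say which parallel structure is used, so the sketch below is only one plausible reading: a reduction tree whose first stage exploits the one-bit inputs by using half-adder logic (XOR for the sum bit, AND for the carry bit) instead of general adders. The function name `count_ones_tree` is hypothetical.

```python
def count_ones_tree(bits):
    """Sum a list of 1-bit inputs with a pairwise reduction tree."""
    if any(b not in (0, 1) for b in bits):
        raise ValueError("every input must be a single bit (0 or 1)")

    # Stage 1: inputs are one bit each, so a half adder is enough per pair.
    level = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        level.append((a ^ b) | ((a & b) << 1))
    if len(bits) % 2:
        level.append(bits[-1])  # odd input passes through unchanged

    # Later stages: ordinary multi-bit additions, halving the count each time.
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt

    return level[0] if level else 0
```

The tree depth grows with the logarithm of the number of inputs, which is the usual motivation for a parallel structure over a sequential running sum.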
LOWER PRECISION OPERAND REPRESENTATION
Apparatuses, systems, and techniques to simulate high-precision calculations with a series expansion of lower precision tensor cores. In at least one embodiment, one or more multiplication operands of a first precision are represented by a sum of two or more operands of a different precision.
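One way to picture an operand represented as a sum of lower-precision operands is to split a double-precision value into a single-precision leading part plus a single-precision remainder. The sketch below assumes that particular split and uses Python's standard `struct` module to round to 32-bit floats; it does not model tensor cores, and the helper names are hypothetical.

```python
import struct


def to_f32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def split(x):
    """Represent x approximately as hi + lo, each storable in 32 bits."""
    hi = to_f32(x)
    lo = to_f32(x - hi)
    return hi, lo


def mul_split(a, b):
    """Multiply via partial products of the low-precision pieces.

    The partial products are accumulated in 64-bit arithmetic here, which is
    an assumption standing in for a wider hardware accumulator.
    """
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    return a_hi * b_hi + (a_hi * b_lo + a_lo * b_hi) + a_lo * b_lo
```

Using only the leading parts (`to_f32(a) * to_f32(b)`) loses roughly half the significant digits of a double; including the cross terms recovers most of them.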