Patent classifications
G06F2207/3816
Variable precision floating-point multiplier
Integrated circuits with specialized processing blocks are provided. The specialized processing blocks may include floating-point multiplier circuits that can be configured to support variable precision. A multiplier circuit may include a first carry-propagate adder (CPA), a second carry-propagate adder (CPA), and an associated rounding circuit. The first CPA may be wide enough to handle the required precision of the mantissa. In a bridged mode, the first CPA may borrow an additional bit from the second CPA while the rounding circuit will monitor the appropriate bits to select the proper multiplier output. A parallel prefix tree operable in a non-bridged mode or the bridged mode may be used to compute multiple multiplier outputs. The multiplier circuit may also include exponent and exception handling circuitry using various masks corresponding to the desired precision width.
OPTIMIZED COMPUTE HARDWARE FOR MACHINE LEARNING OPERATIONS
A processing cluster of a processing cluster array comprises a plurality of registers to store input values of vector input operands, the input values of at least some of the vector input operands having different bit lengths than those of other input values of other vector input operands, and a compute unit to execute a dot-product instruction with the vector input operands to perform a number of parallel multiply operations and an accumulate operation per 32-bit lane based on a bit length of the smallest-sized input value of a first vector input operand relative to the 32-bit lane.
ARITHMETIC OPERATION DEVICE AND ARITHMETIC OPERATION SYSTEM
Provided is an arithmetic operation device including a multiplying section in which multiplying units are divided and assigned to each of one or more groups such that each group includes one or more of the multiplying units according to a calculation precision mode, and each multiplying unit multiplies together an individual multiplier, which is a digit range of at least a portion of a multiplier for the group, and an individual multiplicand, which is a digit range of at least a portion of a multiplicand for the group, and an adding section in which adding units are divided and assigned to each of one or more groups such that each group includes one or more of the adding units according to the calculation precision mode, and the adding units add together each multiplication result obtained by each multiplying unit and output a product of the multiplier and the multiplicand.
METHOD AND DEVICE FOR FLOATING POINT REPRESENTATION WITH VARIABLE PRECISION
The present disclosure relates to a method of storing, by a load and store circuit or other processing means, a variable precision floating point value to a memory address of a memory, the method comprising: reducing the bit length of the variable precision floating point value to no more than a size limit, and storing the variable precision floating point value to one of a plurality of storage zones in the memory, each of the plurality of storage zones having a storage space equal to or greater than the size limit (MBB).
VARIABLE PRECISION FLOATING-POINT MULTIPLIER
Integrated circuits with specialized processing blocks are provided. The specialized processing blocks may include floating-point multiplier circuits that can be configured to support variable precision. A multiplier circuit may include a first carry-propagate adder (CPA), a second carry-propagate adder (CPA), and an associated rounding circuit. The first CPA may be wide enough to handle the required precision of the mantissa. In a bridged mode, the first CPA may borrow an additional bit from the second CPA while the rounding circuit will monitor the appropriate bits to select the proper multiplier output. A parallel prefix tree operable in a non-bridged mode or the bridged mode may be used to compute multiple multiplier outputs. The multiplier circuit may also include exponent and exception handling circuitry using various masks corresponding to the desired precision width.
Variable precision floating-point multiplier
Integrated circuits with specialized processing blocks are provided. The specialized processing blocks may include floating-point multiplier circuits that can be configured to support variable precision. A multiplier circuit may include a first carry-propagate adder (CPA), a second carry-propagate adder (CPA), and an associated rounding circuit. The first CPA may be wide enough to handle the required precision of the mantissa. In a bridged mode, the first CPA may borrow an additional bit from the second CPA while the rounding circuit will monitor the appropriate bits to select the proper multiplier output. A parallel prefix tree operable in a non-bridged mode or the bridged mode may be used to compute multiple multiplier outputs. The multiplier circuit may also include exponent and exception handling circuitry using various masks corresponding to the desired precision width.
Mixed-precision floating-point arithmetic circuitry in specialized processing blocks
The present embodiments relate to integrated circuits with circuitry that efficiently performs mixed-precision floating-point arithmetic operations. Such circuitry may be implemented in specialized processing blocks. The specialized processing blocks may include configurable interconnect circuitry to support a variety of different use modes. For example, the specialized processing blocks may implement fixed-point addition, floating-point addition, fixed-point multiplication, floating-point multiplication, sum of two multiplications in a first floating-point precision, with or without casting to a second floating-point precision and the latter followed by a subsequent addition in the second floating-point precision, if desired, just to name a few. In some embodiments, two or more specialized processing blocks may be arranged in a cascade chain and perform together more complex operations such as a recursive mode dot product of two vectors of floating-point numbers having a first floating-point precision and output the dot product in a second floating-point precision.
Bit string accumulation
Systems, apparatuses, and methods related to bit string accumulation are described. A method for bit string accumulation can include performing an iteration of a recursive operation using a first bit string and a second bit string and modifying a quantity of bits of a result of the iteration of the recursive operation, wherein the modified quantity of bits is less than a threshold quantity of bits. The method can further include writing a first value comprising the modified bits indicative of the result of the iteration of the recursive operation to a first register and writing a second value indicative of the factor corresponding to the result of the iteration of the recursive operation to a second register.
VARIABLE PRECISION FLOATING-POINT MULTIPLIER
Integrated circuits with specialized processing blocks are provided. The specialized processing blocks may include floating-point multiplier circuits that can be configured to support variable precision. A multiplier circuit may include a first carry-propagate adder (CPA), a second carry-propagate adder (CPA), and an associated rounding circuit. The first CPA may be wide enough to handle the required precision of the mantissa. In a bridged mode, the first CPA may borrow an additional bit from the second CPA while the rounding circuit will monitor the appropriate bits to select the proper multiplier output. A parallel prefix tree operable in a non-bridged mode or the bridged mode may be used to compute multiple multiplier outputs. The multiplier circuit may also include exponent and exception handling circuitry using various masks corresponding to the desired precision width.
DECOMPOSED FLOATING POINT MULTIPLICATION
Systems, apparatuses and methods may provide for technology that in response to an identification that one or more hardware units are to execute on a first type of data format, decomposes a first original floating point number to a plurality of first segmented floating point numbers that are to be equivalent to the first original floating point number. The technology may further in response to the identification, decompose a second original floating point number to a plurality of second segmented floating point numbers that are to be equivalent to the second original floating point number. The technology may further execute a multiplication operation on the first and second segmented floating point numbers to multiply the first segmented floating point numbers with the second segmented floating point numbers.