IPIQ

G06F7/49957

Mixed precision floating-point multiply-add operation

11275561 · 2022-03-15 ·

International Business Machines Corporation

An example computer-implemented method includes receiving a first value, a second value, a third value, and a fourth value, wherein the first value, the second value, the third value, and the fourth value are 16-bit or smaller precision floating-point numbers. The method further includes multiplying the first value and the second value to generate a first product, wherein the first product is a 32-bit floating-point number. The method further includes multiplying the third value and the fourth value to generate a second product, wherein the second product is a 32-bit floating-point number. The method further includes summing the first product and the second product to generate a summed value, wherein the summed value is a 32-bit floating-point number. The method further includes adding the summed value to an addend value to generate a result value, wherein the addend value and the result value are 32-bit floating-point numbers.

DYNAMIC DIRECTIONAL ROUNDING

20210232366 · 2021-07-29 ·

A method, computer readable medium, and system are disclosed for rounding floating point values. Dynamic directional rounding is a rounding technique for floating point operations. A floating point operation (addition, subtraction, multiplication, etc.) is performed on an operand to compute a floating point result. A sign (positive or negative) of the operand is identified. In one embodiment, the sign determines a direction in which the floating point result is rounded (towards negative or positive infinity). When used for updating parameters of a neural network during backpropagation, dynamic directional rounding ensures that rounding is performed in the direction of the gradient.

Arithmetic and logical operations in a multi-user network

11074100 · 2021-07-27 ·

Micron Technology, Inc.

Vijay S. Ramesh

Systems, apparatuses, and methods related to arithmetic and logical operations in a multi-user network are described. An agent may be provisioned with a pool of shared computing resources that includes circuitry to perform operations on data (e.g., one or more posit bit strings) in a multi-user network. The circuitry can perform operations on data to convert the data between one or more formats, such as floating-point and/or universal number (e.g., posit) formats, and can further perform arithmetic and/or logical operations on the converted data. The agent may receive a parameter corresponding to performance of an arithmetic operation and/or a logical operation using one or more posit bit strings and cause performance of the arithmetic operation and/or the logical operation using the one or more posit bit strings.

MIXED PRECISION FLOATING-POINT MULTIPLY-ADD OPERATION

20210182024 · 2021-06-17 ·

Dynamic directional rounding

10908878 · 2021-02-02 ·

Nvidia Corporation

PREPARE FOR SHORTER PRECISION (ROUND FOR REROUND) MODE IN A DECIMAL FLOATING-POINT INSTRUCTION

20210004206 · 2021-01-07 ·

An instruction is executed in round-for-reround mode wherein the permissible resultant value that is closest to and no greater in magnitude than the infinitely precise result is selected. If the selected value is not exact and the units digit of the selected value is either 0 or 5, then the digit is incremented by one and the selected value is delivered. In all other cases, the selected value is delivered.

FLOATING-POINT UNIT STOCHASTIC ROUNDING FOR ACCELERATED DEEP LEARNING

20200380370 · 2020-12-03 ·

Techniques in advanced deep learning provide improvements in one or more of accuracy, performance, and energy efficiency. An array of processing elements comprising a portion of a neural network accelerator performs flow-based computations on wavelets of data. Each processing element has a respective compute element and a respective routing element. Each compute element has a respective floating-point unit enabled to perform stochastic rounding, thus in some circumstances enabling reducing systematic bias in long dependency chains of floating-point computations. The long dependency chains of floating-point computations are performed, e.g., to train a neural network or to perform inference with respect to a trained neural network.

Round for reround mode in a decimal floating point instruction

10782932 · 2020-09-22 ·

International Business Machines Corporation

A round-for-reround mode (preferably in a BID encoded Decimal format) of a floating point instruction prepares a result for later rounding to a variable number of digits by detecting that the least significant digit may be a 0, and if so changing it to 1 when the trailing digits are not all 0. A subsequent reround instruction is then able to round the result to any number of digits at least 2 fewer than the number of digits of the result. An optional embodiment saves a tag indicating the fact that the low order digit of the result is 0 or 5 if the trailing bits are non-zero in a tag field rather than modify the result. Another optional embodiment also saves a half-way-and-above indicator when the trailing digits represent a decimal with a most significant digit having a value of 5. An optional subsequent reround instruction is able to round the result to any number of digits fewer or equal to the number of digits of the result using the saved tags.

Reduced floating-point precision arithmetic circuitry

10761805 · 2020-09-01 ·

Altera Corporation

Martin Langhammer

The present embodiments relate to performing reduced-precision floating-point arithmetic operations using specialized processing blocks with higher-precision floating-point arithmetic circuitry. A specialized processing block may receive four floating-point numbers that represent two single-precision floating-point numbers, each separated into an LSB portion and an MSB portion, or four half-precision floating-point numbers. A first partial product generator may generate a first partial product of first and second input signals, while a second partial product generator may generate a second partial product of third and fourth input signals. A compressor circuit may generate carry and sum vector signals based on the first and second partial products; and circuitry may anticipate rounding and normalization operations by generating in parallel based on the carry and sum vector signals at least two results when performing the single-precision floating-point operation and at least four results when performing the two half-precision floating-point operations.

ARITHMETIC AND LOGICAL OPERATIONS IN A MULTI-USER NETWORK

20200272506 · 2020-08-27 ·

Vijay S. Ramesh

Patent classifications

G06F7/49957