G06F5/012

METHOD AND SYSTEM FOR PROCESSING FLOATING POINT NUMBERS
20220050665 · 2022-02-17

A method and system for processing a set of ‘k’ floating-point numbers to perform addition and/or subtraction is disclosed. Each floating-point number comprises a mantissa (m_i) and an exponent (e_i). The method comprises receiving the set of ‘k’ floating-point numbers in a first format, each floating-point number in the first format comprising a mantissa (m_i) with a bit-length of ‘b’ bits. The method further comprises creating a set of ‘k’ numbers (y_i) based on the mantissas of the ‘k’ floating-point numbers, the numbers having a bit-length of ‘n’ bits obtained by adding both extra most-significant bits and extra least-significant bits to the bit-length ‘b’ of the mantissa (m_i). The method includes identifying a maximum exponent (e_max) among the exponents e_i, aligning the magnitude bits of the numbers (y_i) based on the maximum exponent (e_max), and processing the set of ‘k’ numbers concurrently.
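
The alignment step described in this abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: it assumes IEEE 754 double inputs, a hypothetical function name `aligned_sum`, a hypothetical parameter `extra_lsb` for the number of added least-significant bits, and it relies on Python's unbounded integers in place of explicit extra most-significant bits.

```python
import math

def aligned_sum(values, extra_lsb=3):
    """Sum floats by aligning integer mantissas to the largest exponent.

    Assumption: inputs are finite Python floats (IEEE 754 doubles).
    """
    parts = []
    for v in values:
        m, e = math.frexp(v)                  # v == m * 2**e, 0.5 <= |m| < 1
        # Signed 53-bit integer mantissa, so that v == mant * 2**(e - 53).
        parts.append((int(m * (1 << 53)), e - 53))

    e_max = max(e for _, e in parts)          # maximum exponent in the set

    total = 0
    for mant, e in parts:
        y = mant << extra_lsb                 # append extra least-significant bits
        # Align to e_max; bits shifted out are dropped (floor for negatives).
        total += y >> (e_max - e)

    # Python integers grow as needed, standing in for extra most-significant bits.
    return math.ldexp(total, e_max - extra_lsb)
```

For example, `aligned_sum([1.5, 2.25, -0.75])` returns `3.0`, and a term far smaller than the largest one is shifted out entirely during alignment.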

Neural network device for neural network operation, method of operating neural network device, and application processor including the neural network device

Provided are a neural network device for performing a neural network operation, a method of operating the neural network device, and an application processor including the neural network device. The neural network device includes a direct memory access (DMA) controller configured to receive floating-point data from a memory; a data converter configured to convert the floating-point data received through the DMA controller to integer-type data; and a processor configured to perform a neural network operation based on an integer operation by using the integer-type data provided from the data converter.
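
The abstract does not say how the data converter maps floating-point data to integers. As one possible illustration only, the following sketch assumes symmetric linear quantization with a per-tensor scale; the function names `quantize` and `int_dot` and the `bits` parameter are hypothetical.

```python
def quantize(values, bits=8):
    """Map floats to signed integers with one shared scale (assumed scheme)."""
    qmax = (1 << (bits - 1)) - 1              # 127 for 8 bits
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak else 1.0      # avoid dividing by zero
    return [round(v / scale) for v in values], scale

def int_dot(qa, scale_a, qb, scale_b):
    """Dot product accumulated in integers, rescaled to a float once at the end."""
    acc = sum(x * y for x, y in zip(qa, qb))  # integer-only multiply-accumulate
    return acc * scale_a * scale_b
```

The multiply-accumulate loop touches only integers; the two scales are applied once to the final accumulator.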

SCALING HALF-PRECISION FLOATING POINT TENSORS FOR TRAINING DEEP NEURAL NETWORKS
20220269931 · 2022-08-25

A graphics processor is described that includes a single instruction, multiple thread (SIMT) architecture including hardware multithreading. The multiprocessor can execute parallel threads of instructions associated with a command stream, where the multiprocessor includes a set of functional units to execute at least one of the parallel threads of the instructions. The set of functional units can include a mixed precision tensor processor to perform tensor computations. The functional units can also include circuitry to analyze statistics for output values of the tensor computations, determine a target format to convert the output values, the target format determined based on the statistics for the output values and a precision associated with a second layer of the neural network, and convert the output values to the target format.

STOCHASTIC ROUNDING FLOATING-POINT ADD INSTRUCTION USING ENTROPY FROM A REGISTER

Embodiments are directed to a computer implemented method for executing machine instructions in a central processing unit. The executing includes loading a first operand into a first operand register, and loading a second operand into a second operand register. The executing further includes shifting either the first operand or the second operand to form a shifted operand. The executing further includes adding or subtracting the first operand and the second operand to obtain a sum or a difference, and loading the sum or the difference having a least significant bit into a third register or a memory. The executing further includes performing a probability analysis on least significant bits of the shifted operand or the non-shifted operand, and initiating a rounding operation on the least significant bit of the sum or the difference based at least in part on the probability analysis.
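
A generic form of stochastic rounding can be sketched as follows. This is a textbook-style illustration rather than the instruction described above: it assumes a non-negative fixed-point integer, a hypothetical function name `stochastic_round`, and an `entropy` argument that stands in for random bits read from a register.

```python
def stochastic_round(value, drop_bits, entropy):
    """Drop the low `drop_bits` bits of a non-negative integer, rounding up at random.

    Assumption: `entropy` supplies uniformly random bits; only its low
    `drop_bits` bits are used.
    """
    mask = (1 << drop_bits) - 1
    discarded = value & mask                  # bits that will be lost
    kept = value >> drop_bits                 # truncated result
    # Round up with probability discarded / 2**drop_bits.
    if (entropy & mask) < discarded:
        kept += 1
    return kept
```

With `value = 0b10110` and `drop_bits = 2`, the exact quotient is 5.5; entropy values 0 and 1 give 6 while 2 and 3 give 5, so the rounded result averages to the exact value.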

BINARY FUSED MULTIPLY-ADD FLOATING-POINT CALCULATIONS

A binary fused multiply-add floating-point unit is configured to operate on an addend, a multiplier, and a multiplicand. The unit is configured to receive as the addend an unrounded result of a prior operation executed in the unit via an early result feedback path; to perform an alignment shift of the unrounded addend on an unrounded exponent and an unrounded mantissa; and to perform a rounding correction for the addend in parallel with the alignment shift, responsive to a rounding-up signal.

FIXED-POINT AND FLOATING-POINT ARITHMETIC OPERATOR CIRCUITS IN SPECIALIZED PROCESSING BLOCKS
20170322769 · 2017-11-09

The present embodiments relate to circuitry that efficiently performs floating-point arithmetic operations and fixed-point arithmetic operations. Such circuitry may be implemented in specialized processing blocks. If desired, the specialized processing blocks may include configurable interconnect circuitry to support a variety of different use modes. For example, the specialized processing block may efficiently perform, among other operations, a fixed-point or floating-point addition operation or a portion thereof, a fixed-point or floating-point multiplication operation or a portion thereof, or a fixed-point or floating-point multiply-add operation or a portion thereof. In some embodiments, two or more specialized processing blocks may be arranged in a cascade chain and together perform more complex operations, such as a recursive mode dot product of two vectors of floating-point numbers or a Radix-2 Butterfly circuit.

DATA PROCESSING SYSTEM AND METHOD
20220207047 · 2022-06-30

A data pipeline system includes a binary data extractor to receive a data portion identifier, extract a portion of a binary data item based on the data portion identifier, and output the portion of the binary data item. A data iterator provides a first data portion identifier to the binary data extractor, receives, from the binary data extractor, a first portion of the binary data item, determines a second data portion identifier, provides the second data portion identifier to the binary data extractor, receives, from the binary data extractor, a second portion of the binary data item, and outputs the second portion of the binary data item. A data converter receives, from the data iterator, the second portion of the binary data item and transforms, based on a data format specification, at least the second portion of the binary data item for processing by components of the data pipeline system.

Efficient Dual-path Floating-Point Arithmetic Operators
20220206747 · 2022-06-30

Systems and methods related to performing arithmetic operations on floating-point numbers are disclosed. Floating-point arithmetic circuitry is configured to receive two floating-point numbers. The floating-point arithmetic circuitry includes a first path configured to perform a first operation on the two floating-point numbers based at least in part on a difference in size between the two floating-point numbers. The floating-point arithmetic circuitry includes a second path configured to perform a second operation on the two floating-point numbers based at least in part on the difference in size between the two floating-point numbers. The first path and the second path diverge from each other after receipt of the floating-point numbers in the floating-point arithmetic circuitry and converge on a shared adder that is used for the first operation and the second operation.

Floating point multiply-add, accumulate unit with combined alignment circuits

A floating-point Multiply-Add, Accumulate Unit supports the BF16 format for Multiply-Accumulate operations and FP32 Single-Precision Addition complying with the IEEE 754 Standard. The Multiply-Accumulate unit uses a higher radix and a longer internal 2's complement significand representation to facilitate precision as well as comparison and operation with negative numbers. The addition can be performed using Carry-Save format to avoid long carry propagation and speed up the operation. The circuit uses early exponent comparison to shorten the accumulate pipeline stage. Operations including overflow detection, zero detection and sign extension are adopted for 2's complement and Carry-Save format.
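
The benefit of Carry-Save format mentioned here can be shown with a small sketch of standard carry-save addition on integers. It is a general illustration, not the circuit of this abstract; the function names `carry_save_add` and `accumulate` are hypothetical, and Python integers stand in for 2's complement words of unlimited width.

```python
def carry_save_add(a, b, c):
    """Reduce three addends to a (sum, carry) pair with no carry propagation."""
    partial_sum = a ^ b ^ c                        # per-bit sum, carries ignored
    carry = ((a & b) | (a & c) | (b & c)) << 1     # per-bit majority, shifted left
    return partial_sum, carry

def accumulate(values):
    """Accumulate in carry-save form; propagate carries only once at the end."""
    s, c = 0, 0
    for v in values:
        s, c = carry_save_add(s, c, v)
    return s + c                                   # single carry-propagating add
```

Each accumulation step uses only bitwise operations whose cost does not depend on operand width; the slow carry-propagating addition happens once, when the final value is needed.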

Hybrid matrix multiplication pipeline

Systems, apparatuses, and methods implementing a hybrid matrix multiplication pipeline are disclosed. A hybrid matrix multiplication pipeline is able to execute a plurality of different types of instructions in a plurality of different formats by reusing execution circuitry in an efficient manner. For a first type of instruction for source operand elements of a first size, the pipeline uses N multipliers to perform N multiplication operations on N different sets of operands, where N is a positive integer greater than one. For a second type of instruction for source operand elements of a second size, the N multipliers work in combination to perform a single multiplication operation on a single set of operands, where the second size is greater than the first size. The pipeline also shifts element product results in an efficient manner when implementing a dot product operation.
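
The idea of reusing N narrow multipliers for one wider multiplication can be sketched as below. The sizes are assumptions chosen for illustration (N = 4, unsigned 8-bit and 16-bit operands), and the names `mul8`, `four_narrow` and `one_wide` are hypothetical; the abstract does not specify these details.

```python
def mul8(a, b):
    """Model of one unsigned 8-bit by 8-bit multiplier."""
    assert 0 <= a < 256 and 0 <= b < 256
    return a * b

def four_narrow(pairs):
    """First mode: four independent 8-bit products, one per multiplier."""
    return [mul8(a, b) for a, b in pairs]

def one_wide(a, b):
    """Second mode: the same four multipliers form one 16-bit by 16-bit product."""
    a_hi, a_lo = a >> 8, a & 0xFF
    b_hi, b_lo = b >> 8, b & 0xFF
    # Partial products are shifted to their bit positions and summed.
    return ((mul8(a_hi, b_hi) << 16)
            + ((mul8(a_hi, b_lo) + mul8(a_lo, b_hi)) << 8)
            + mul8(a_lo, b_lo))
```

Both modes call `mul8` exactly four times; only the way operands are routed in and partial products are shifted and summed differs.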