Patent classifications
G06F7/49936
High performance floating-point adder with full in-line denormal/subnormal support
According to one general aspect, an apparatus may include a floating-point addition unit that includes a far path circuit, a close path circuit, and a final result selector circuit. The far path circuit may be configured to compute a far path result based upon either the addition or the subtraction of the two floating-point numbers regardless of whether the operands or the result include normal or denormal numbers. The close path circuit may be configured to compute a close path result based upon the subtraction of the two floating-point operands regardless of whether the operands or the result include normal or denormal numbers. The final result selector circuit may be configured to select between the far path result and the close path result based, at least in part, upon an amount of difference in the exponent portions of the two floating-point operands.
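The path selection described in this abstract can be illustrated with a minimal Python sketch. It assumes binary64 operands and the conventional dual-path rule (close path only for an effective subtraction with an exponent difference of at most 1). That threshold, and the names `fields`, `effective_exponent`, and `select_path`, are illustrative assumptions and are not taken from the abstract, which says only that the selection depends at least in part on the exponent difference.

```python
import struct

def fields(x: float):
    """Split a binary64 value into (sign, biased exponent, fraction) bit fields."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

def effective_exponent(biased: int) -> int:
    # Denormals store a biased exponent of 0 but are scaled like the
    # smallest normal exponent (1), so both are treated alike here.
    return max(biased, 1)

def select_path(a: float, b: float, subtract: bool = False) -> str:
    """Return "close" or "far" for a + b (or a - b when subtract is True).

    Assumed rule (hypothetical threshold): the close path handles effective
    subtractions whose exponents differ by at most 1; all else goes far.
    """
    sign_a, exp_a, _ = fields(a)
    sign_b, exp_b, _ = fields(b)
    if subtract:
        sign_b ^= 1  # a - b is a + (-b)
    effective_subtraction = sign_a != sign_b
    exp_diff = abs(effective_exponent(exp_a) - effective_exponent(exp_b))
    if effective_subtraction and exp_diff <= 1:
        return "close"
    return "far"
```

For example, `select_path(1.0, -1.5)` picks the close path, while `select_path(1024.0, -1.0)` picks the far path because the exponents differ by 10. A denormal operand paired with the smallest normal number is routed like any other near-equal pair.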
Processing denormal numbers in FMA hardware
A microprocessor includes FMA execution logic that determines whether to accumulate an accumulator operand C to the partial products of multiplier and multiplicand operands A and B in the partial product adder or in a second accumulation stage. The logic calculates an exponent delta, ExpDelta = Aexp + Bexp - Cexp, and determines the number of leading zeroes in C if C is denormal. The microprocessor accumulates C with the partial products of A and B when the accumulation of C to the product of A and B could result in mass cancellation, when ExpDelta is greater than or equal to K (where K is related to a width of a datapath in the partial product adder), and when C is denormal and its number of leading zeroes plus K exceeds ExpDelta. The strategic use of resources in the partial product adder and second accumulation stage reduces latency.
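The two quantities this abstract says the logic computes, the exponent delta and the leading-zero count of a denormal C, can be sketched as follows. The sketch assumes binary64 operands and unbiased exponents, and the helper names are hypothetical. The abstract does not say whether the exponents are biased, and the sketch does not attempt the accumulation decision itself. Zero, infinity, and NaN inputs are not handled.

```python
import struct

BIAS = 1023          # binary64 exponent bias (assumed operand format)
FRACTION_BITS = 52   # binary64 fraction field width

def fields(x: float):
    """Split a binary64 value into (sign, biased exponent, fraction) bit fields."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits >> 63, (bits >> FRACTION_BITS) & 0x7FF, bits & ((1 << FRACTION_BITS) - 1)

def unbiased_exponent(x: float) -> int:
    # A denormal has a biased exponent field of 0 but the scale of field value 1.
    _, biased, _ = fields(x)
    return max(biased, 1) - BIAS

def is_denormal(x: float) -> bool:
    _, biased, fraction = fields(x)
    return biased == 0 and fraction != 0

def leading_zeros(x: float) -> int:
    """Zeros ahead of the first set bit in the 52-bit fraction field of a denormal."""
    _, _, fraction = fields(x)
    return FRACTION_BITS - fraction.bit_length()

def exp_delta(a: float, b: float, c: float) -> int:
    """Aexp + Bexp - Cexp: the product's exponent relative to the accumulator's."""
    return unbiased_exponent(a) + unbiased_exponent(b) - unbiased_exponent(c)
```

For A = 2.0, B = 4.0, C = 8.0 the delta is 1 + 2 - 3 = 0, meaning the product and the accumulator are aligned. For the smallest denormal C, the fraction field holds a single set bit in the lowest position, so 51 leading zeroes precede it.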
Methods and apparatus for performing fixed-point normalization using floating-point functional blocks
An integrated circuit may include normalization circuitry that can be used when converting a fixed-point number to a floating-point number. The normalization circuitry may include at least a floating-point generation circuit that receives the fixed-point number and that creates a corresponding floating-point number. The normalization circuitry may then leverage an embedded digital signal processing (DSP) block on the integrated circuit to perform an arithmetic operation by removing the leading one from the created floating-point number. The resulting number may have a fractional component and an exponent value, which can then be used to derive the final normalized value.
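The idea of normalizing by subtracting the leading one can be imitated in software. The sketch below is only an analogy under stated assumptions: it uses binary64 in place of whatever format the DSP block supports, restricts the input to a positive integer below 2^52, and the function name `normalize_fixed_point` is hypothetical. The fixed-point bits are placed in the fraction field of a float in [1, 2), and 1.0 is then subtracted so that the floating-point subtractor performs the normalization.

```python
import struct

FRACTION_BITS = 52   # binary64 fraction width (assumed format)
BIAS = 1023

def normalize_fixed_point(x: int):
    """Return (msb_position, fraction_field) for a positive integer x < 2**52.

    msb_position is the bit index of the leading one in x; fraction_field
    holds the bits below that leading one, left-aligned in 52 bits.
    """
    if not 0 < x < (1 << FRACTION_BITS):
        raise ValueError("x must satisfy 0 < x < 2**52")
    # Floating-point generation: exponent field of 1.0, fraction = x,
    # so the created value is 1 + x * 2**-52.
    created_bits = (BIAS << FRACTION_BITS) | x
    created = struct.unpack("<d", struct.pack("<Q", created_bits))[0]
    # Removing the leading one: the subtraction is exact and the adder
    # renormalizes the result to x * 2**-52.
    reduced = created - 1.0
    reduced_bits = struct.unpack("<Q", struct.pack("<d", reduced))[0]
    exponent = ((reduced_bits >> FRACTION_BITS) & 0x7FF) - BIAS
    fraction = reduced_bits & ((1 << FRACTION_BITS) - 1)
    return exponent + FRACTION_BITS, fraction
```

For x = 5 (binary 101) the result is `(2, 1 << 50)`: the leading one sits at bit 2, and the remaining bits 01 appear at the top of the fraction field. The exponent of the difference therefore plays the role of a leading-one detector.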
Integer-based fused convolutional layer in a convolutional neural network
An example fused convolutional layer comprises: a comparator that receives a first zero point and a multiply-accumulation result; a first multiplexer, coupled to the comparator, that receives a plurality of power-of-two exponent values; a shift normalizer, coupled to the first multiplexer, that receives the multiply-accumulation result and the plurality of power-of-two exponent values and limits quantization of the multiply-accumulation result to a power-of-two scale; and a second multiplexer, coupled to an output of the shift normalizer and to the first multiplexer, that receives a second zero point and outputs an activation.
Apparatus and method for fixed point to floating point conversion and negative power of two detector
A data processing system 2 supports conversion of fixed point numbers to floating point numbers. The resulting floating point numbers may be subnormal. A first shifter 28 shifts input signals representing the fixed point number by a first shift amount that depends upon a leading zero count within an integer portion followed by a fractional portion of the fixed point number. A second shifter 30 shifts the input signals by a second shift amount that depends upon the variable point position within the fixed point number. A subnormal result detector 34 generates a selection signal upon detecting a combination of variable point position and leading zero count that corresponds to the floating point number having a subnormal value. Selection circuitry 32 selects the output of either the first shifter or the second shifter to form the significand, in dependence upon the selection signal generated by the subnormal result detector.
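The two-shifter arrangement can be modeled roughly in Python. The sketch assumes a 32-bit unsigned fixed-point input, a half-precision (binary16) result, and truncation instead of rounding. None of these choices comes from the abstract, and the names `fixed_to_half_fields` and `decode` are hypothetical. Both shifter outputs are computed unconditionally, as parallel hardware would, and the detector then chooses between them.

```python
WIDTH = 32   # assumed input width in bits
P = 11       # binary16 significand bits, including the implicit one
EMIN = -14   # binary16 minimum normal exponent
EMAX = 15    # binary16 maximum exponent
BIAS = 15

def fixed_to_half_fields(x: int, point: int):
    """Convert x * 2**-point to (biased_exponent, fraction_field), truncating."""
    if not 0 < x < (1 << WIDTH):
        raise ValueError("x must be a positive WIDTH-bit integer")
    lz = WIDTH - x.bit_length()              # leading zero count
    exponent = (WIDTH - 1 - lz) - point      # exponent if the result is normal
    if exponent > EMAX:
        raise OverflowError("value too large for binary16")
    # First shifter: amount depends on the leading zero count.
    normalized = x << lz                     # leading one now at bit WIDTH-1
    normal_sig = normalized >> (WIDTH - P)
    # Second shifter: amount depends only on the point position.
    shift = (P - 1 - EMIN) - point           # aligns x to units of 2**-24
    subnormal_sig = x << shift if shift >= 0 else x >> -shift
    # Subnormal detector: combines point position and leading zero count.
    is_subnormal = exponent < EMIN
    # Selection circuitry.
    if is_subnormal:
        return 0, subnormal_sig
    return exponent + BIAS, normal_sig & ((1 << (P - 1)) - 1)

def decode(biased: int, fraction: int) -> float:
    """Interpret binary16 fields as a Python float, for checking only."""
    if biased == 0:
        return fraction * 2.0 ** (EMIN - (P - 1))
    return (1 + fraction / (1 << (P - 1))) * 2.0 ** (biased - BIAS)
```

With `x = 1` and `point = 20` the value 2^-20 lies below the smallest normal binary16 number, so the second shifter's output is selected and the fields are `(0, 16)`. With `x = 3` and `point = 1` the value 1.5 is normal and the first shifter's output is used.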
Denormalization in multi-precision floating-point arithmetic circuitry
The present embodiments relate to integrated circuits with floating-point arithmetic circuitry that handles normalized and denormalized floating-point numbers. The floating-point arithmetic circuitry may include a normalization circuit and a rounding circuit, and may generate a first result in the form of a normalized, unrounded floating-point number and a second result in the form of a normalized, rounded floating-point number. If desired, the floating-point arithmetic circuitry may be implemented in specialized processing blocks.