Patent classifications
G06F7/49936
Look ahead normaliser
Apparatus includes hardware logic arranged to normalise an n-bit input number. The hardware logic comprises at least a first hardware logic stage, an intermediate hardware logic stage and a final hardware logic stage. Each stage comprises a left-shifting logic element; the first and intermediate stages each also comprise a plurality of OR-reduction logic elements, and the intermediate and final stages each also comprise one or more multiplexers. The OR-reduction logic elements operate on different subsets of bits from the number input to the particular stage. In the intermediate and final hardware logic stages, a first of the multiplexers selects an OR-reduction result received from a previous hardware logic stage, and the left-shifting logic element is arranged to perform left shifting on the updated binary number received from the immediately previous hardware logic stage, dependent upon the selected OR-reduction result.
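As a rough software model of the staged structure described above, the following Python sketch normalises an n-bit value with a sequence of halving left shifts. In each stage, the OR-reductions needed by the next stage are computed from that stage's input (one per possible shift outcome), and a multiplexer-style selection picks the applicable one once the current shift decision is known. The function names, the 16-bit default width and the power-of-two shift sizes are assumptions made for illustration; this is one possible reading of the abstract, not the claimed circuit.

```python
from typing import Tuple


def or_reduce(x: int, hi: int, lo: int) -> bool:
    """OR-reduce bits hi..lo (inclusive) of x: True if any of them is 1."""
    width = hi - lo + 1
    return ((x >> lo) & ((1 << width) - 1)) != 0


def normalise(x: int, n: int = 16) -> Tuple[int, int]:
    """Left-shift an n-bit value until bit n-1 is set.

    Returns (normalised value, total shift). n is assumed to be a power
    of two. An all-zero input stays zero and reports a shift of n - 1.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 2")
    if not 0 <= x < (1 << n):
        raise ValueError("x must fit in n bits")
    mask = (1 << n) - 1

    # Shift amounts per stage: n/2, n/4, ..., 1.
    steps = []
    step = n // 2
    while step >= 1:
        steps.append(step)
        step //= 2

    # First stage: its own decision comes from a direct OR-reduction.
    any_set = or_reduce(x, n - 1, n - steps[0])
    total = 0
    for i, s in enumerate(steps):
        shift_now = not any_set
        has_next = i + 1 < len(steps)
        if has_next:
            t = steps[i + 1]
            # Look ahead: both candidates use this stage's *input* bits.
            cand_no_shift = or_reduce(x, n - 1, n - t)
            cand_shift = or_reduce(x, n - s - 1, n - s - t)
        if shift_now:
            x = (x << s) & mask
            total += s
        if has_next:
            # Multiplexer: select the candidate matching this stage's shift.
            any_set = cand_shift if shift_now else cand_no_shift
    return x, total
```

For example, `normalise(0x00F0)` shifts by 8 in the first stage and not at all afterwards. The point of the look-ahead is that a stage's shift decision does not have to wait for an OR-reduction of the freshly shifted value.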
Method for determining a value of an integer scaling in a linking of input sets to output sets, and computer program product
The invention relates to a method for determining a value of an integer scaling in a linking of input sets to output sets, wherein the linking comprises operators, each of which has operator inputs and operator outputs that are at least partially linked to one another or to the input sets or to the output sets, by using a computer device having a processing unit, a memory unit, and an output unit. Representations of set objects are used to efficiently carry out rescaling operations within the linking, with up to infinitely large resolution sets. This procedure makes it possible to calculate resource-conserving integer scalings for a target system while taking secondary conditions into account.
Distributed batch normalization using estimates and rollback
A technique utilizing speculative execution and rollback for performing data parallel training of a neural network model is disclosed. Activations for a layer of the neural network model are normalized during a speculative normalization operation using estimated normalization parameters associated with a partial population of a set of training data allocated to a particular processor. Normalization parameters associated with the total population of the set of training data are generated by a distributed reduce operation in parallel with the speculative normalization operation. An optional rollback operation can revert the activations to a pre-normalization state if the estimated normalization parameters for the partial population are subsequently determined to be inaccurate compared to the normalization parameters for the population of the set of training data distributed across a plurality of processors.
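The control flow described above can be sketched in plain Python. The example normalizes one processor's shard with locally estimated statistics, then compares those estimates with the statistics of the whole population and rolls back if they differ too much. Several things here are assumptions for illustration only: the function names, the `tolerance` threshold, the use of a simple list concatenation as a stand-in for the distributed reduce (which would run in parallel on real hardware), and the choice to re-normalize with the population statistics after a rollback.

```python
import math
from statistics import fmean, pvariance
from typing import List, Sequence, Tuple


def normalize(values: Sequence[float], mean: float, var: float,
              eps: float = 1e-5) -> List[float]:
    """Standard batch-norm style normalization (no scale or shift terms)."""
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def speculative_batch_norm(shards: Sequence[Sequence[float]],
                           local_index: int,
                           tolerance: float = 0.1) -> Tuple[List[float], bool]:
    """Return (normalized activations, rolled_back) for one shard."""
    local = list(shards[local_index])  # kept as the pre-normalization state

    # Speculative step: estimate parameters from the partial population.
    est_mean, est_var = fmean(local), pvariance(local)
    speculative = normalize(local, est_mean, est_var)

    # Stand-in for the distributed reduce over all processors.
    population = [v for shard in shards for v in shard]
    true_mean, true_var = fmean(population), pvariance(population)

    inaccurate = (abs(est_mean - true_mean) > tolerance
                  or abs(est_var - true_var) > tolerance)
    if inaccurate:
        # Rollback: discard the speculative result and restart from the
        # saved activations (assumed here: redo with population statistics).
        return normalize(local, true_mean, true_var), True
    return speculative, False
```

When the shard is representative of the whole data set, the speculative result is kept and the cost of waiting for the reduce is avoided. The rollback path is only taken when the estimate turns out to be poor.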
Exact versus inexact decimal floating-point numbers and computation system
This disclosure presents an improved computer system and process that avoid the consequences of improper number conversion and of rounding errors. The process distinguishes between exact and inexact decimal floating-point numbers. If the result of a sequence of operations is exact, the user can trust that every decimal digit in the computed result is correct. On the other hand, if the input operands are inexact or the result cannot be computed exactly, a loss of significant digits occurs, and the user is warned of the loss. A novel representation is used for the inexact computed values. An estimate of the absolute error is also part of the representation.
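The exact/inexact distinction can be illustrated with Python's standard `decimal` module, whose arithmetic contexts raise an `Inexact` flag whenever a result had to be rounded. This is only an analogy using an existing library: the helper names and the 8-digit precision are assumptions, and the sketch does not model the disclosure's own representation or its absolute-error estimate.

```python
from decimal import Context, Decimal, Inexact
from typing import Tuple


def checked_add(a: str, b: str, precision: int = 8) -> Tuple[Decimal, bool]:
    """Add two decimal strings; report whether rounding occurred."""
    ctx = Context(prec=precision)  # a fresh context starts with flags cleared
    result = ctx.add(Decimal(a), Decimal(b))
    return result, bool(ctx.flags[Inexact])


def checked_divide(a: str, b: str, precision: int = 8) -> Tuple[Decimal, bool]:
    """Divide two decimal strings; report whether rounding occurred."""
    ctx = Context(prec=precision)
    result = ctx.divide(Decimal(a), Decimal(b))
    return result, bool(ctx.flags[Inexact])


if __name__ == "__main__":
    # 0.1 + 0.2 is representable in 8 digits, so the flag stays clear.
    print(checked_add("0.1", "0.2"))
    # 1 / 3 must be rounded to 8 digits, so the flag is set.
    print(checked_divide("1", "3"))
```

Operands are passed as strings so that they are converted to decimal without passing through binary floating point, which is itself a source of the improper conversions mentioned above.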
Systolic array including fused multiply accumulate with efficient prenormalization and extended dynamic range
Systems and methods are provided to perform multiply-accumulate operations of at least one normalized number in a systolic array. The systolic array can obtain a first input and detect that the first input is denormal. Based on determining the first input is denormal, the systolic array can generate a first normalized number by normalizing the first input. Processing elements of the systolic array can include a multiplier and an adder. The multiplier can multiply the first normalized number by a second normal or normalized number to generate a multiplier product and the adder can add an input partial sum to the multiplier product to generate an addition result.
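To make the prenormalization step concrete, here is a minimal Python sketch of a single processing element. It assumes a 16-bit format with 5 exponent bits and 10 mantissa bits (the IEEE 754 half-precision layout), which the abstract does not specify. A denormal input is detected from its all-zero exponent field and shifted left until its leading bit is set, with the exponent decremented below the format's normal minimum, which is where the extended dynamic range comes in. All names are hypothetical, infinities and NaNs are not handled, and exact `Fraction` arithmetic stands in for the hardware multiplier and adder.

```python
from fractions import Fraction
from typing import Tuple

EXP_BITS, MAN_BITS = 5, 10           # assumed half-precision layout
BIAS = (1 << (EXP_BITS - 1)) - 1     # 15


def prenormalize(bits: int) -> Tuple[int, int, int]:
    """Decode a 16-bit pattern into (sign, exponent, significand).

    The significand always has its leading bit (bit MAN_BITS) set, except
    for zero. The exponent is unbiased and may fall below 1 - BIAS.
    """
    sign = (bits >> (EXP_BITS + MAN_BITS)) & 1
    exp_field = (bits >> MAN_BITS) & ((1 << EXP_BITS) - 1)
    mantissa = bits & ((1 << MAN_BITS) - 1)
    if exp_field != 0:
        return sign, exp_field - BIAS, mantissa | (1 << MAN_BITS)
    if mantissa == 0:
        return sign, 0, 0            # signed zero
    # Denormal: shift left until the leading bit is set.
    exponent = 1 - BIAS
    while not (mantissa >> MAN_BITS) & 1:
        mantissa <<= 1
        exponent -= 1
    return sign, exponent, mantissa


def to_fraction(sign: int, exponent: int, significand: int) -> Fraction:
    """Exact value of a prenormalized number."""
    value = Fraction(significand) * Fraction(2) ** (exponent - MAN_BITS)
    return -value if sign else value


def processing_element(a_bits: int, b_bits: int,
                       partial_sum: Fraction) -> Fraction:
    """Multiply two (pre)normalized inputs and add the input partial sum."""
    product = to_fraction(*prenormalize(a_bits)) * to_fraction(*prenormalize(b_bits))
    return partial_sum + product
```

For instance, the smallest denormal pattern `0x0001` becomes a significand of `1 << 10` with exponent -24, so the multiplier never has to special-case a missing leading bit.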
Process for a Floating Point Dot Product Multiplier-Accumulator
A process for performing vector dot products receives a row vector and a column vector as floating point numbers in a format of sign plus exponent bits plus mantissa bits. The process generates a single dot product value by separately processing the sign bits, exponent bits, and mantissa bits to form a sign bit, a normalized mantissa formed by multiplying pairs of multiplicand elements, and exponent information including MAX_EXP and EXP_DIFF. A second pipeline stage receives the multiplied pairs of normalized mantissas, optionally performs an exponent adjustment, and pads, complements and shifts the normalized mantissas. The results are added in a series of stages until a single addition result remains, which is normalized using MAX_EXP to form the floating point output result.
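The roles of MAX_EXP and EXP_DIFF can be shown with a short Python sketch. Each product is split into a sign, an integer mantissa and an exponent; the largest product exponent plays the role of MAX_EXP, and each mantissa is complemented if negative and shifted right by its own EXP_DIFF before the integer addition. The 53-bit mantissa width (Python floats are IEEE 754 doubles), the helper names and the single summation loop in place of an adder tree are assumptions; the padding and optional exponent adjustment steps are not modelled, and inputs are assumed to be finite.

```python
import math
from typing import Sequence, Tuple

MANTISSA_BITS = 53  # assumed width per operand


def split(x: float) -> Tuple[bool, int, int]:
    """Return (negative, mantissa, exponent) with
    |x| == mantissa * 2**(exponent - MANTISSA_BITS)."""
    fraction, exponent = math.frexp(abs(x))
    return x < 0, int(math.ldexp(fraction, MANTISSA_BITS)), exponent


def dot_product(row: Sequence[float], column: Sequence[float]) -> float:
    if len(row) != len(column):
        raise ValueError("vectors must have the same length")

    # First stage: signs, mantissa products and exponent sums, separately.
    products = []
    for a, b in zip(row, column):
        neg_a, man_a, exp_a = split(a)
        neg_b, man_b, exp_b = split(b)
        if man_a and man_b:  # zero products contribute nothing
            products.append((neg_a != neg_b, man_a * man_b, exp_a + exp_b))
    if not products:
        return 0.0

    max_exp = max(exp for _, _, exp in products)        # MAX_EXP

    # Second stage: complement, align by EXP_DIFF, then add.
    total = 0
    for negative, mantissa, exp in products:
        exp_diff = max_exp - exp                        # EXP_DIFF
        signed = -mantissa if negative else mantissa
        total += signed >> exp_diff                     # low bits are dropped
    # Final normalization of the single addition result using MAX_EXP.
    return math.ldexp(float(total), max_exp - 2 * MANTISSA_BITS)
```

Because alignment discards the bits shifted out on the right, products much smaller than the largest one lose precision. That is the trade-off of aligning everything to a single MAX_EXP.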
Floating point dot-product operator with correct rounding
The disclosure relates to a hardware operator for dot-product computation, comprising a plurality of multipliers each receiving two multiplicands in the form of floating-point numbers encoded in a first precision format; an alignment circuit associated with each multiplier, configured to, based on the exponents of the corresponding multiplicands, convert the result of the multiplication into a respective fixed-point number having a sufficient number of bits to cover the full dynamic range of the multiplication; and a multi-adder configured to add without loss the fixed-point numbers provided by the multipliers, providing a sum in the form of a fixed-point number.
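The idea of converting each product to a fixed-point number wide enough for the full dynamic range, adding without loss and rounding only once can be sketched with Python's arbitrary-precision integers. The sketch assumes double-precision inputs (Python floats) rather than whatever first precision format the operator uses. Every finite double is an integer multiple of 2**-1074, so every product is an integer multiple of 2**-2148. The function names are hypothetical, and the final step relies on CPython's integer true division producing a correctly rounded float.

```python
import math
from typing import Sequence

FRACTION_BITS = 1074                      # doubles are multiples of 2**-1074
PRODUCT_FRACTION_BITS = 2 * FRACTION_BITS


def to_fixed(x: float) -> int:
    """Exact fixed-point image of x with FRACTION_BITS fractional bits."""
    if not math.isfinite(x):
        raise ValueError("only finite inputs are supported")
    numerator, denominator = x.as_integer_ratio()   # denominator is 2**k
    return numerator * ((1 << FRACTION_BITS) // denominator)


def exact_dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product with lossless accumulation and a single final rounding."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    total = 0
    for x, y in zip(a, b):
        total += to_fixed(x) * to_fixed(y)   # exact integer multiply-add
    # One rounding, from the exact fixed-point sum to the nearest double.
    return total / (1 << PRODUCT_FRACTION_BITS)
```

With the inputs `[1e16, 1.0, -1e16]` and `[1.0, 1.0, 1.0]`, a left-to-right floating-point sum loses the middle term, whereas `exact_dot` keeps it because no intermediate rounding takes place.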
Residue checking of entire normalizer output of an extended result
A method includes generating an extended result from a first operation circuitry having a result register bit width greater than a bus width associated with a residue check path of a second operation circuitry associated with a floating point unit. An extended result residue less a first portion residue of the extended result received from the residue check path is stored as a first partial result residue. The first partial result residue is compared with a first result residue of the second operation circuitry. The extended result residue less both the first partial result residue and a second portion residue of the extended result received from the residue check path, taken as a second partial result residue, is compared with a second result residue of the second operation circuitry.
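As background for the abstract above, the following Python sketch shows a generic modulo-3 residue check of a multiplication whose result is wider than the check path. The residue predicted from the operands is reduced by the weighted residue of each bus-width portion in turn, and the check passes if nothing remains. The modulus, the 32-bit widths and all names are assumptions, and the sketch illustrates the general technique rather than the specific sequence of partial residues and comparisons described in the abstract.

```python
from typing import List

MODULUS = 3  # assumed check modulus


def split_portions(value: int, total_bits: int, bus_bits: int) -> List[int]:
    """Cut value into bus-width portions, least significant first."""
    mask = (1 << bus_bits) - 1
    return [(value >> offset) & mask for offset in range(0, total_bits, bus_bits)]


def residue_check(a: int, b: int, product: int,
                  operand_bits: int = 32, bus_bits: int = 32) -> bool:
    """Check an extended (2 * operand_bits wide) product portion by portion."""
    # Residue the product should have, predicted from the operands alone.
    remaining = ((a % MODULUS) * (b % MODULUS)) % MODULUS
    portions = split_portions(product, 2 * operand_bits, bus_bits)
    for index, portion in enumerate(portions):
        # Each portion is weighted by 2**(index * bus_bits) modulo MODULUS.
        weight = pow(2, index * bus_bits, MODULUS)
        remaining = (remaining - (portion % MODULUS) * weight) % MODULUS
    return remaining == 0
```

Flipping any single bit of the product changes its value by a power of two, which is never a multiple of 3, so every single-bit error in the extended result makes `residue_check` return `False`.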
Using Fuzzy-Jbit location of floating-point multiply-accumulate results
Disclosed embodiments relate to performing floating-point (FP) arithmetic. In one example, a processor is to decode an instruction specifying locations of first, second, and third floating-point (FP) operands and an opcode calling for accumulating a FP product of the first and second FP operands with the third FP operand, and execution circuitry to, in a first cycle, generate the FP product having a Fuzzy-Jbit format comprising a sign bit, a 9-bit exponent, and a 25-bit mantissa having two possible positions for a Jbit and, in a second cycle, to accumulate the FP product with the third FP operand, while concurrently, based on Jbit positions of the FP product and the third FP operand, determining an exponent adjustment and a mantissa shift control of a result of the accumulation, wherein performing the exponent adjustment concurrently enhances an ability to perform the accumulation in one cycle.
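The reason a product's leading bit can sit in one of two positions follows from simple arithmetic, which this Python sketch demonstrates. Two normalized p-bit significands each lie in [2**(p-1), 2**p), so their product lies in [2**(2p-2), 2**(2p)) and its leading 1 is at bit 2p-2 or bit 2p-1. The 24-bit default precision and the function name are assumptions, and the sketch does not model the 9-bit exponent or the 25-bit mantissa of the Fuzzy-Jbit format itself.

```python
from typing import Tuple


def multiply_significands(a_sig: int, b_sig: int,
                          precision: int = 24) -> Tuple[int, int]:
    """Multiply two normalized significands.

    Returns (product, index of the product's leading 1 bit). Both inputs
    must have exactly `precision` bits with the top bit set.
    """
    low = 1 << (precision - 1)
    if not (low <= a_sig < 2 * low and low <= b_sig < 2 * low):
        raise ValueError("significands must be normalized")
    product = a_sig * b_sig
    leading_bit = product.bit_length() - 1
    # Only two positions are possible for the leading bit.
    assert leading_bit in (2 * precision - 2, 2 * precision - 1)
    return product, leading_bit
```

Leaving the product in this unnormalized form and tracking which of the two positions applies is what allows the exponent adjustment to be worked out alongside the accumulation instead of before it.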