G06F7/537

Small multiplier after initial approximation for operations with increasing precision
12217020 · 2025-02-04 · ·

In an aspect, a processor includes circuitry for iterative refinement approaches, e.g., Newton-Raphson, to evaluating functions, such as square root, reciprocal, and for division. The circuitry includes circuitry for producing an initial approximation; which can include a LookUp Table (LUT). LUT may produce an output that (with implementation-dependent processing) forms an initial approximation of a value, with a number of bits of precision. A limited-precision multiplier multiplies that initial approximation with another value; an output of the limited precision multiplier goes to a full precision multiplier circuit that performs remaining multiplications required for iteration(s) in the particular refinement process being implemented. For example, in division, the output being calculated is for a reciprocal of the divisor. The full-precision multiplier circuit requires a first number of clock cycles to complete, and both the small multiplier and the initial approximation circuitry complete within the first number of clock cycles.

Division and root computation with fast result formatting

Systems and methods relate to division of a dividend by a divisor, with fast result formatting. Counts of leading sign bits of the dividend and the divisor are determined. The dividend and the divisor are normalized based on their respective counts of leading sign bits to obtain a normalized dividend and a normalized divisor, respectively. An exact number of significant quotient bits of a quotient of the division, based on the normalized dividend, the normalized divisor, and the counts of leading sign bits of the dividend and the divisor and used to determine a correct position of a leading bit of the quotient based on this exact number. The quotient is developed by placing the leading bit at or near the correct position and appending less significant bits to the right of the leading bit. Thus, left-shifts in each iteration and large final shifts are avoided in formatting the result.

CIRCUITRY AND METHOD FOR PERFORMING DIVISION
20170199723 · 2017-07-13 ·

A data processing apparatus comprises signal receiving circuitry to receive a signal corresponding to a divide instruction that identifies a dividend x and a divisor d. Processing circuitry performs, in response to said divide instruction, a radix-N division algorithm to generate a result value q=x/d, where N is an integer power of 2 and greater than 1. Said division algorithm comprises a plurality of iterations, each of said plurality of iterations being performed by quotient digit calculation circuitry to determine a quotient value of that iteration q[i+1] based on a remainder value of a previous iteration rem[i]; and remainder calculation circuitry to determine a remainder value of that iteration rem[i+1] based on said quotient value of that iteration q[i+1] and said remainder value of said previous iteration rem[i]. Result calculation circuitry derives said result value q based on each quotient value selected by said digit selection circuitry for each of said plurality of iterations. For at least some of said plurality of iterations, said quotient digit calculation circuitry speculatively determines a set of candidate values before a quotient value of said previous iteration is known and, in response to said quotient value of said previous iteration becoming known, determines said quotient value of that iteration q[i+1] based on one of said candidate values.

Small Multiplier After Initial Approximation For Operations With Increasing Precision
20250156150 · 2025-05-15 ·

In an aspect, a processor includes circuitry for iterative refinement approaches, e.g., Newton-Raphson, to evaluating functions, such as square root, reciprocal, and for division. The circuitry includes circuitry for producing an initial approximation; which can include a LookUp Table (LUT). LUT may produce an output that (with implementation-dependent processing) forms an initial approximation of a value, with a number of bits of precision. A limited-precision multiplier multiplies that initial approximation with another value; an output of the limited precision multiplier goes to a full precision multiplier circuit that performs remaining multiplications required for iteration(s) in the particular refinement process being implemented. For example, in division, the output being calculated is for a reciprocal of the divisor. The full-precision multiplier circuit requires a first number of clock cycles to complete, and both the small multiplier and the initial approximation circuitry complete within the first number of clock cycles.

Small Multiplier After Initial Approximation For Operations With Increasing Precision
20250156150 · 2025-05-15 ·

In an aspect, a processor includes circuitry for iterative refinement approaches, e.g., Newton-Raphson, to evaluating functions, such as square root, reciprocal, and for division. The circuitry includes circuitry for producing an initial approximation; which can include a LookUp Table (LUT). LUT may produce an output that (with implementation-dependent processing) forms an initial approximation of a value, with a number of bits of precision. A limited-precision multiplier multiplies that initial approximation with another value; an output of the limited precision multiplier goes to a full precision multiplier circuit that performs remaining multiplications required for iteration(s) in the particular refinement process being implemented. For example, in division, the output being calculated is for a reciprocal of the divisor. The full-precision multiplier circuit requires a first number of clock cycles to complete, and both the small multiplier and the initial approximation circuitry complete within the first number of clock cycles.