G06F7/5375

COMBINED DIVIDE/SQUARE ROOT PROCESSING CIRCUITRY AND METHOD
20230017462 · 2023-01-19 ·

An apparatus comprises combined divide/square root processing circuitry to perform, in response to a divide instruction, a given radix-64 iteration of a radix-64 divide operation, and in response to a square root instruction, a given radix-64 iteration of a radix-64 square root operation; in which: the combined divide/square root processing circuitry comprises shared circuitry to generate at least one output value for the given radix-64 iteration on a same data path used for both the radix-64 divide operation and the radix-64 square root operation.

DIGIT-RECURRENCE SELECTION CONSTANTS
20230018977 · 2023-01-19 ·

A data processing apparatus to perform a digit-recurrence operation on an input value comprises receiver circuitry for receiving a remainder value of a previous iteration of the digit-recurrence operation. Comparison circuitry performs comparisons on most significant bits of the remainder value of the previous iteration of the digit-recurrence operation with each of a plurality of selection constants associated with available digits of a next digit of a result of the digit-recurrence operation and outputs the next digit of the result of the digit-recurrence operation based on the comparisons. Each of the selection constants is associated with one of the available digits and an input parameter. Storage circuitry stores a subset of the selection constants, the subset of the selection constants excluding an excluded selection constant from the selection constants, which is associated with an excluded digit from the available digits.

Methods and Apparatus for Quotient Digit Recoding in a High-Performance Arithmetic Unit
20230086090 · 2023-03-23 ·

A divider includes a digit recoder that recodes upper bits of a partial remainder into sets of lower-radix multiples without carry propagate addition. Elimination of the carry propagate adder makes computation of the quotient carry free and independent of the number of bits computed per cycle, thereby enabling a higher number of bits per cycle, as well as increased clock speeds.

Data processing apparatus having combined divide-square root circuitry
09785407 · 2017-10-10 · ·

A processing apparatus has combined divide-square root circuitry for performing a radix-N SRT divide algorithm and a radix-N SRT square root algorithm, where N is an integer power-of-2. The combined circuitry has shared remainder updating circuitry which performs remainder updates for a greater number of iterations per cycle for the SRT divide algorithm than for the SRT square root algorithm. This allows reduced circuit area while avoiding the SRT square root algorithm compromising the performance of the SRT divide algorithm.

Arithmetic operation in a data processing system

An arithmetic operation in a data processing unit, preferably by iterative digit accumulations, is proposed. An approximate result of the arithmetic operation is computed iteratively. Concurrently at least two supplementary values of the approximate result of the arithmetic operation are computed, and the final result selected from one of the values of the approximate result and the at least two supplementary values of the arithmetic operation depending on the results of the last iteration step.

Calculation of a number of iterations

Performing an arithmetic operation in a data processing unit, including calculating a number of iterations for performing the arithmetic operation with a given number of bits per iteration. The number of bits per iteration is a positive natural number. A number of consecutive digit positions of a digit in a sequence of bits represented in the data processing unit is counted. The length of the sequence is a multiple of the number of bits per iteration. A quotient of the number of consecutive digit positions divided by the number of bits per iteration is calculated, as well as a remainder of the division.

OPTIMIZED INTEGER DIVISION CIRCUIT

The disclosed embodiments relate to the design of an integer division circuit, which comprises: a dividend-input that receives an integer dividend A; a divisor-input that receives an integer divisor B; a quotient-output that outputs an integer quotient q; and a division engine that executes the Goldschmidt method to divide A by B to produce q. During a pre-processing operation, which commences executing before the Goldschmidt method starts executing, the division engine determines whether A<B. If A<B, the division engine sets q=0 without having to execute the Goldschmidt method.

DIVIDE/SQUARE-ROOT PIPELINE AND METHOD
20240296048 · 2024-09-05 · ·

An apparatus comprises a divide/square-root pipeline comprising: a plurality of divide/square-root iteration pipeline stages each to perform a respective iteration of a digit-recurrence divide or square root operation; and signal paths to supply outputs generated by one divide/square root iteration pipeline stage in one iteration as inputs to a subsequent divide/square root iteration pipeline stage of the divide/square-root pipeline for performing a subsequent iteration of the digit-recurrence divide or square root operation. The divide/square-root pipeline is capable of performing the digit-recurrence divide or square root operation on a floating-point operand to generate a floating-point result.

Division and root computation with fast result formatting

Systems and methods relate to division of a dividend by a divisor, with fast result formatting. Counts of leading sign bits of the dividend and the divisor are determined. The dividend and the divisor are normalized based on their respective counts of leading sign bits to obtain a normalized dividend and a normalized divisor, respectively. An exact number of significant quotient bits of a quotient of the division, based on the normalized dividend, the normalized divisor, and the counts of leading sign bits of the dividend and the divisor and used to determine a correct position of a leading bit of the quotient based on this exact number. The quotient is developed by placing the leading bit at or near the correct position and appending less significant bits to the right of the leading bit. Thus, left-shifts in each iteration and large final shifts are avoided in formatting the result.