Patent classifications
G06F7/5525
Methods for constructing lookup tables for division and square-root implementations
Control circuitry coupled to a multiply unit, which includes a plurality of stages, each of which may be configured to perform a corresponding arithmetic function, may be configured to retrieve a given entry from a lookup table dependent upon a first portion of a binary representation of an input operand. An error value of an error function evaluated dependent upon a lookup value in a given entry of the plurality of entries is included in a predetermined error range. The control circuitry may be further configured to determine an initial approximation of a result of an iterative arithmetic operation using the given entry and initiate the iterative arithmetic operation using the initial approximation and the input operand.
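The table-seeded iteration described in this abstract can be sketched in software. The following is a minimal illustration only, not the claimed circuit: the table size (`INDEX_BITS`), the midpoint seeding rule, the choice of reciprocal as the operation, and all function names are assumptions made for the example.

```python
import math

INDEX_BITS = 6  # assumed table size: 2**6 entries


def build_reciprocal_table(index_bits=INDEX_BITS):
    """Seed table for 1/m, m in [1, 2). Entry i covers [1 + i/N, 1 + (i+1)/N)."""
    n = 1 << index_bits
    # The reciprocal of the interval midpoint keeps the seed error balanced.
    return [1.0 / (1.0 + (i + 0.5) / n) for i in range(n)]


TABLE = build_reciprocal_table()


def max_seed_error(table):
    """Worst case of the error function |1 - m * seed| over all entries.

    The error is linear in m, so checking both interval endpoints suffices.
    """
    n = len(table)
    worst = 0.0
    for i, seed in enumerate(table):
        for m in (1.0 + i / n, 1.0 + (i + 1) / n):
            worst = max(worst, abs(1.0 - m * seed))
    return worst


def reciprocal(x, iterations=3):
    """Approximate 1/x for x > 0: table lookup, then Newton-Raphson steps."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    m, e = math.frexp(x)          # x = m * 2**e with m in [0.5, 1)
    m, e = m * 2.0, e - 1         # renormalise so that m is in [1, 2)
    index = int((m - 1.0) * len(TABLE))  # leading fraction bits of m
    y = TABLE[index]              # initial approximation of 1/m
    for _ in range(iterations):
        y = y * (2.0 - m * y)     # each step roughly squares the error
    return math.ldexp(y, -e)
```

With these assumed parameters every seed is accurate to better than 2**-7, so three iterations are enough to reach double precision. A real table would be sized against whatever error range the design predetermines.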
Check procedure for floating point operations
Method and computer system for implementing an operation on 1 floating point input, in accordance with a rounding mode, e.g. using a Newton-Raphson technique. The floating point result comprises a p-bit mantissa. An unrounded proposed mantissa result is determined using the Newton-Raphson technique, wherein a p-bit rounded proposed mantissa result, t, corresponds to a rounding of the unrounded proposed mantissa result in accordance with the rounding mode, with k leading zeroes. If an increment to the (mk)th bit of the unrounded result would affect the p-bit rounded result, then the input(s) and bits of the unrounded result are used to determine a check parameter which is indicative of a relationship between an exact result and the unrounded result if the (mk)th bit were incremented. The p-bit mantissa of the floating point result is determined, in dependence upon the check parameter, to be either t or t+1.
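The idea of checking only when an increment could change the rounded result can be illustrated with integers. This is a simplified analogue under stated assumptions, not the claimed method: the operation is a square root, the rounding mode is round toward zero, `guard_bits` stands for the extra low-order bits of the unrounded result, and `proposed_unrounded` is a hypothetical stand-in for the iterative datapath.

```python
import math


def proposed_unrounded(a):
    """Stand-in for an iterative datapath: returns u with u <= sqrt(a) < u + 2.

    Assumes 0 <= a < 2**52, so int(math.sqrt(a)) is the exact floor or one more.
    """
    return max(int(math.sqrt(a)) - 1, 0)


def round_down_with_check(a, u, guard_bits):
    """Return (mantissa, check_was_needed) for floor(sqrt(a)) >> guard_bits.

    Precondition: the exact floor of sqrt(a) is either u or u + 1.
    """
    t = u >> guard_bits                  # rounded proposed result
    mask = (1 << guard_bits) - 1
    if (u & mask) != mask:
        # Incrementing u cannot carry into the kept bits, so t is final.
        return t, False
    # Check parameter: sign of a - (u + 1)**2, computed exactly.
    check = a - (u + 1) ** 2
    return (t + 1 if check >= 0 else t), True


def sqrt_mantissa(a, guard_bits=3):
    """Convenience wrapper combining the two steps above."""
    result, _ = round_down_with_check(a, proposed_unrounded(a), guard_bits)
    return result
```

The exact comparison runs only when the discarded bits of `u` are all ones, which is the only case in which `t` and `t + 1` are both possible.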
Addition of Qubit States Using Entangled Quaternionic Roots of Single-Qubit Gates
Systems and methods are provided for performing addition of qubit states using entangled quaternionic roots of single qubit gates. A method includes identifying a single qubit gate in a quantum circuit of a quantum computer, wherein the quantum circuit includes two summand qubits entangled with a third qubit that stores a measurable non-linear sum of the two summand qubits. The method includes mapping the single qubit gate to a representative gate that is phase-equivalent to the single qubit gate, mapping the representative gate to a unit quaternion, calculating an nth root of the unit quaternion, and mapping the nth root of the unit quaternion to a unitary matrix that represents a fractional single qubit gate. The method includes configuring the quantum circuit to include at least one fractional single qubit gate representing the unitary matrix in place of the single qubit gate for performing the addition.
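The gate-to-quaternion-to-root mapping in this abstract can be sketched numerically. The sketch below covers only that mapping, not the entangled addition circuit. It assumes the convention that a unit quaternion (w, x, y, z) corresponds to the matrix w·I − i(x·X + y·Y + z·Z); the abstract does not state a convention, and all function names are hypothetical.

```python
import cmath
import math


def to_su2(u):
    """Phase-equivalent representative of a 2x2 unitary with determinant 1."""
    (a, b), (c, d) = u
    phase = cmath.sqrt(a * d - b * c)
    return ((a / phase, b / phase), (c / phase, d / phase))


def su2_to_quaternion(u):
    """Read (w, x, y, z) off the first row of w*I - i*(x*X + y*Y + z*Z)."""
    (a, b), _ = u
    return (a.real, -b.imag, -b.real, -a.imag)


def quaternion_root(q, n):
    """One nth root of a unit quaternion: same axis, angle divided by n."""
    w, x, y, z = q
    norm_v = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(norm_v, w)
    if norm_v < 1e-12:
        axis = (0.0, 0.0, 1.0)  # axis is arbitrary for q = +1 or q = -1
    else:
        axis = (x / norm_v, y / norm_v, z / norm_v)
    s = math.sin(theta / n)
    return (math.cos(theta / n), s * axis[0], s * axis[1], s * axis[2])


def quaternion_to_su2(q):
    """Inverse of su2_to_quaternion under the same assumed convention."""
    w, x, y, z = q
    return ((complex(w, -z), complex(-y, -x)),
            (complex(y, -x), complex(w, z)))


def matmul(p, q):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def fractional_gate(u, n):
    """Unitary whose nth power equals u up to a global phase."""
    return quaternion_to_su2(quaternion_root(su2_to_quaternion(to_su2(u)), n))
```

For example, `fractional_gate(((0, 1), (1, 0)), 2)` gives a matrix whose square is phase-equivalent to the Pauli X gate.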
METHOD AND PROCESSING APPARATUS FOR PERFORMING ARITHMETIC OPERATION
A method of performing an arithmetic operation by a processing apparatus includes determining a polynomial expression approximating an arithmetic operation to be performed on a variable; adaptively determining upper bits for addressing a look-up table (LUT) according to a variable section to which the variable belongs; obtaining coefficients of the polynomial expression from the LUT by addressing the LUT using a value of the upper bits; and performing the arithmetic operation by calculating a result value of the polynomial expression using the coefficients.
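A software sketch of section-dependent LUT addressing follows. It is an illustration under assumptions, not the claimed apparatus: the 12-bit input word, the target function `f`, the two sections with 6 and 3 address bits, and the first-degree polynomial are all choices made for the example.

```python
import math

WORD_BITS = 12  # assumed input width; v encodes x = v / 2**WORD_BITS in [0, 1)


def f(x):
    """Hypothetical target function: sharply curved near 0, flat near 1."""
    return math.exp(-6.0 * x)


def upper_bit_count(v):
    """Use more address bits in the section where f curves more strongly."""
    return 6 if v < (1 << (WORD_BITS - 1)) else 3


def build_lut():
    """Map (bit count, upper bits) to coefficients (c0, c1) of c0 + c1 * dx."""
    lut = {}
    scale = 1 << WORD_BITS
    for nbits in (6, 3):
        shift = WORD_BITS - nbits
        for upper in range(1 << nbits):
            start = upper << shift
            if upper_bit_count(start) != nbits:
                continue  # this segment lies in the other section
            x0 = start / scale
            x1 = (start + (1 << shift)) / scale
            slope = (f(x1) - f(x0)) / (x1 - x0)  # chord through both ends
            lut[(nbits, upper)] = (f(x0), slope)
    return lut


LUT = build_lut()


def approx(v):
    """Evaluate the piecewise polynomial for a WORD_BITS-wide integer v."""
    nbits = upper_bit_count(v)
    shift = WORD_BITS - nbits
    upper = v >> shift                   # LUT address
    lower = v & ((1 << shift) - 1)       # offset within the segment
    c0, c1 = LUT[(nbits, upper)]
    return c0 + c1 * (lower / (1 << WORD_BITS))
```

With these assumed sections the table holds 36 coefficient pairs instead of the 64 that a uniform 6-bit address would need, while the worst-case error stays below 0.005 across the whole input range.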
APPARATUS AND METHOD FOR PROCESSING RECIPROCAL SQUARE ROOT OPERATIONS
An apparatus and method for performing a reciprocal square root. For example, one embodiment of a processor comprises: a decoder to decode a reciprocal square root instruction to generate a decoded reciprocal square root instruction; a source register to store at least one packed input data element; a destination register to store a result data element; and reciprocal square root execution circuitry to execute the decoded reciprocal square root instruction, the reciprocal square root execution circuitry to use a first portion of the packed input data element as an index to a data structure containing a plurality of sets of coefficients to identify a first set of coefficients from the plurality of sets, the reciprocal square root execution circuitry to generate a reciprocal square root of the packed input data element using a combination of the coefficients and a second portion of the packed input data element.
APPARATUS AND METHOD FOR PROCESSING FRACTIONAL RECIPROCAL OPERATIONS
An apparatus and method for performing a reciprocal. For example, one embodiment of a processor comprises: a decoder to decode a reciprocal instruction to generate a decoded reciprocal instruction; a source register to store at least one packed input data element; a destination register to store a result data element; and reciprocal execution circuitry to execute the decoded reciprocal instruction, the reciprocal execution circuitry to use a first portion of the packed input data element as an index to a data structure containing a plurality of sets of coefficients to identify a first set of coefficients from the plurality of sets, the reciprocal execution circuitry to generate a reciprocal of the packed input data element using a combination of the coefficients and a second portion of the packed input data element.
Combinatorial Logic Circuits With Feedback
Combinatorial logic circuits with feedback, which include at least two combinatorial logic elements, are disclosed. At least one of the combinatorial logic elements receives an external input (i.e., from outside the circuit), at least one of the combinatorial logic elements receives an input that is feedback of the circuit output, and at least one of the combinatorial logic elements receives an input that is neither an external input nor an output of the circuit but rather is from another of the combinatorial logic elements and thus only implicit to the circuit. No staticizers are needed; the logic circuits effectively create implicit equations to perform functions that were previously thought to require sequential logic. The combinatorial logic circuits result in a stable output (in some instances after a brief period of time) due to the implicit equations, rather than achieving stability from an explicit expression of some input to the circuit.
Method and processing apparatus for performing arithmetic operation
A method of performing an arithmetic operation by a processing apparatus includes determining a polynomial expression approximating an arithmetic operation to be performed on a variable; adaptively determining upper bits for addressing a look-up table (LUT) according to a variable section to which the variable belongs; obtaining coefficients of the polynomial expression from the LUT by addressing the LUT using a value of the upper bits; and performing the arithmetic operation by calculating a result value of the polynomial expression using the coefficients.
SMALL MULTIPLIER AFTER INITIAL APPROXIMATION FOR OPERATIONS WITH INCREASING PRECISION
In an aspect, a processor includes circuitry for iterative refinement approaches, e.g., Newton-Raphson, for evaluating functions such as square root and reciprocal, and for division. The circuitry includes circuitry for producing an initial approximation, which can include a LookUp Table (LUT). The LUT may produce an output that (with implementation-dependent processing) forms an initial approximation of a value, with a number of bits of precision. A limited-precision multiplier multiplies that initial approximation with another value; an output of the limited-precision multiplier goes to a full-precision multiplier circuit that performs the remaining multiplications required for the iteration(s) in the particular refinement process being implemented. For example, in division, the output being calculated is for a reciprocal of the divisor. The full-precision multiplier circuit requires a first number of clock cycles to complete, and both the small multiplier and the initial approximation circuitry complete within the first number of clock cycles.
SYSTEM AND METHOD TO ACCELERATE MICROPROCESSOR OPERATIONS
Systems and methods are directed to accelerating operations associated with a microprocessor. Example embodiments improve the operations of the microprocessor by providing devices (e.g., integrated circuits, independent accelerators) configured to use reciprocal or reciprocal square root instructions. Such devices can be further configured to follow the reciprocal or reciprocal square root instructions with multiplication or other instructions to finish division, square root, or other complex operations.