G06F7/4812

APPARATUS AND METHOD FOR COMPLEX MULTIPLICATION

An embodiment of the invention is a processor including execution circuitry to calculate, in response to a decoded instruction, a result of a complex multiplication of a first complex number and a second complex number. The calculation includes a first operation to calculate a first term of a real component of the result and a first term of the imaginary component of the result. The calculation also includes a second operation to calculate a second term of the real component of the result and a second term of the imaginary component of the result. The processor also includes a decoder, a first source register, and a second source register. The decoder is to decode an instruction to generate the decoded instruction. The first source register is to provide the first complex number and the second source register is to provide the second complex number.
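
The two-operation split described above can be illustrated with a minimal sketch. The abstract does not say which partial products belong to which operation, so the assignment below, the function name `complex_multiply_two_ops`, and the `(real, imaginary)` tuple layout are assumptions made for illustration only.

```python
def complex_multiply_two_ops(a, b):
    """Multiply two complex numbers given as (real, imaginary) tuples.

    Hypothetical helper: the split of terms between the two operations
    is an assumption, not taken from the abstract.
    """
    a_re, a_im = a
    b_re, b_im = b

    # First operation: first term of the real and imaginary components.
    real = a_re * b_re
    imag = a_re * b_im

    # Second operation: second term of each component.
    real -= a_im * b_im
    imag += a_im * b_re

    return real, imag


# (1 + 2j) * (3 + 4j) = -5 + 10j
print(complex_multiply_two_ops((1, 2), (3, 4)))
```

Each operation produces one term for both output components, so a single pass over the first operand's real part and a single pass over its imaginary part are enough to build the full product.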

EFFICIENT FAULT COUNTERMEASURE THROUGH POLYNOMIAL EVALUATION

Various embodiments relate to a fault detection system and method for polynomial operations, including: selecting a plurality of evaluation points; evaluating a first polynomial at the plurality of evaluation points to produce first results; applying a first function to the first polynomial to produce a second polynomial; evaluating the second polynomial at the plurality of evaluation points to produce second results; evaluating a second scalar function on the first results to produce third results; comparing the second results to the third results; and performing a polynomial operation using the second polynomial when the second results match the third results.
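
The check described above might be sketched as follows. Everything concrete here is an assumption chosen for illustration: the first function is taken to be polynomial squaring, the second scalar function is squaring modulo a prime, and the modulus `Q`, the helper names, and the `fault` hook are hypothetical.

```python
import random

Q = 3329  # illustrative prime modulus (assumption, not from the abstract)


def poly_eval(coeffs, x, q=Q):
    """Evaluate a polynomial (lowest-degree coefficient first) at x mod q."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % q
    return acc


def poly_square(coeffs, q=Q):
    """Schoolbook squaring of a polynomial with coefficients mod q."""
    out = [0] * (2 * len(coeffs) - 1)
    for i, a in enumerate(coeffs):
        for j, b in enumerate(coeffs):
            out[i + j] = (out[i + j] + a * b) % q
    return out


def checked_square(f, num_points=3, fault=None):
    """Square f and verify the result at random evaluation points.

    `fault` is an optional hook that corrupts the computed polynomial,
    used only to demonstrate detection.
    """
    points = [random.randrange(Q) for _ in range(num_points)]
    first = [poly_eval(f, x) for x in points]    # first results
    g = poly_square(f)                           # second polynomial
    if fault is not None:
        g = fault(g)
    second = [poly_eval(g, x) for x in points]   # second results
    third = [(y * y) % Q for y in first]         # scalar function on first results
    if second != third:
        raise RuntimeError("fault detected")
    return g


print(checked_square([1, 2, 3]))  # [1, 4, 10, 12, 9]
```

The check works because evaluation at a point commutes with squaring: g(x) equals f(x) squared whenever g was computed correctly, so a mismatch at any point indicates a corrupted result.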

INTEGRATED CIRCUITS WITH SPECIALIZED PROCESSING BLOCKS FOR PERFORMING FLOATING-POINT FAST FOURIER TRANSFORMS AND COMPLEX MULTIPLICATION
20190121614 · 2019-04-25

Integrated circuits with specialized processing blocks are provided. A specialized processing block may include one real addition stage and one real multiplier stage. The multiplier stage may simultaneously feed its output to the addition stage and directly to an adjacent specialized processing block. The addition stage may also produce sum and difference outputs in parallel. A group of four such specialized processing blocks may be connected in a chain to implement a radix-2 fast Fourier transform (FFT) butterfly. Multiple radix-2 butterflies may be stacked to form higher-order radix butterflies. If desired, the specialized processing block may also be used to implement a complex multiply operation. Three or four specialized processing blocks may be chained together and, along with one or more adders outside the specialized processing blocks, used to generate the real and imaginary portions of a complex product.
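
As a rough software analogue of the arithmetic involved, the sketch below shows a complex multiply built from four real multiplications, a variant built from three (Gauss's well-known trick, used here as an assumption about how a three-block arrangement could work, not as the patented circuit), and a radix-2 butterfly that produces sum and difference outputs from one complex product. All function names are hypothetical.

```python
def complex_mul_4(a_re, a_im, b_re, b_im):
    """Complex product using four real multiplications."""
    return a_re * b_re - a_im * b_im, a_re * b_im + a_im * b_re


def complex_mul_3(a_re, a_im, b_re, b_im):
    """Complex product using three real multiplications (Gauss's trick)."""
    k1 = b_re * (a_re + a_im)
    k2 = a_re * (b_im - b_re)
    k3 = a_im * (b_re + b_im)
    return k1 - k3, k1 + k2


def radix2_butterfly(x0, x1, w):
    """Return (x0 + w*x1, x0 - w*x1) for (real, imaginary) tuples."""
    t_re, t_im = complex_mul_4(x1[0], x1[1], w[0], w[1])
    # Sum and difference are formed from the same product.
    return (x0[0] + t_re, x0[1] + t_im), (x0[0] - t_re, x0[1] - t_im)


print(complex_mul_4(1, 2, 3, 4))                  # (-5, 10)
print(complex_mul_3(1, 2, 3, 4))                  # (-5, 10)
print(radix2_butterfly((1, 0), (0, 1), (0, 1)))   # ((0, 0), (2, 0))
```

The three-multiplication form trades one multiplier for extra additions, which is one plausible reading of why adders outside the blocks are mentioned.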

APPARATUS AND METHOD FOR COMPLEX BY COMPLEX CONJUGATE MULTIPLICATION

An apparatus and method for multiplying packed real and imaginary components of complex numbers. For example, one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to select real and imaginary data elements in the first source register and second source register to multiply, the multiplier circuitry to multiply each selected imaginary data element in the first source register with a selected real data element in the second source register, and to multiply each selected real data element in the first source register with a selected imaginary data element in the second source register to generate a plurality of imaginary products, adder circuitry to add a first subset of the plurality of imaginary products and subtract a second subset of the plurality of imaginary products to generate a first temporary result and to add a third subset of the plurality of imaginary products and subtract a fourth subset of the plurality of imaginary products to generate a second temporary result, accumulation circuitry to combine the first temporary result with first data from a destination register to generate a first final result and to combine the second temporary result with second data from the destination register to generate a second final result and to store the first final result and second final result back in the destination register.
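
One way to read the add, subtract, and accumulate steps is as accumulating the imaginary part of each first-source element multiplied by the conjugate of the matching second-source element. The sketch below follows that reading for a single accumulator; the function name, the pairing of elements, and the use of plain Python integers rather than fixed-width packed lanes are all assumptions.

```python
def conj_mul_imag_accumulate(src1, src2, dest):
    """Accumulate Im(a * conj(b)) over paired (real, imaginary) elements.

    Hypothetical helper: lane selection and bit widths are not modelled.
    """
    temp = 0
    for (a_re, a_im), (b_re, b_im) in zip(src1, src2):
        temp += a_im * b_re  # imaginary products that are added
        temp -= a_re * b_im  # imaginary products that are subtracted
    # Accumulation: combine the temporary result with the destination data.
    return dest + temp


src1 = [(1, 2), (3, 4)]
src2 = [(5, 6), (7, 8)]
print(conj_mul_imag_accumulate(src1, src2, dest=10))  # 18
```

The abstract describes two temporary results and two final results; under this reading that would correspond to calling the helper once per destination element with different selected lanes.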

APPARATUS AND METHOD FOR PERFORMING MULTIPLICATION WITH ADDITION-SUBTRACTION OF REAL COMPONENT

An apparatus and method for performing a transform on complex data. For example, one embodiment of a processor comprises: multiplier circuitry to multiply packed real N-bit data elements in the first source register with packed real M-bit data elements in the second source register and to multiply packed imaginary N-bit data elements in the first source register with packed imaginary M-bit data elements in the second source register to generate at least four real products, adder circuitry to subtract a first selected real product from a second selected real product to generate a first temporary result and to subtract a third selected real product from a fourth selected real product to generate a second temporary result, the adder circuitry to add the first temporary result to a first packed N-bit data element from the third source register to generate a first pre-scaled result, to subtract the first temporary result from the first packed N-bit data element to generate a second pre-scaled result, to add the second temporary result to a second packed N-bit data element from the third source register to generate a third pre-scaled result, and to subtract the second temporary result from the second packed N-bit data element to generate a fourth pre-scaled result; scaling circuitry to scale the first, second, third and fourth pre-scaled results to a specified bit width to generate first, second, third, and fourth final results; and a destination register to store the first, second, third, and fourth final results in specified data element positions.
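
The data flow above can be approximated in a few lines. This is a sketch under stated assumptions: each lane holds one `(real, imaginary)` pair, the temporary result is the real part of the complex product, scaling is modelled as an arithmetic right shift followed by saturation, and the names `real_mul_addsub`, `saturate`, `shift`, and `bits` are hypothetical.

```python
def saturate(value, bits):
    """Clamp value to the signed range representable in `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def real_mul_addsub(src1, src2, src3, shift=0, bits=16):
    """For each lane, return scaled (c + t) and (c - t).

    t = a_re * b_re - a_im * b_im is the difference of two real products,
    and c is the matching element of the third source.
    """
    results = []
    for (a_re, a_im), (b_re, b_im), c in zip(src1, src2, src3):
        temp = a_re * b_re - a_im * b_im       # temporary result
        pre_add = c + temp                     # pre-scaled sum
        pre_sub = c - temp                     # pre-scaled difference
        results.append(saturate(pre_add >> shift, bits))
        results.append(saturate(pre_sub >> shift, bits))
    return results


print(real_mul_addsub([(3, 1), (2, 2)], [(4, 2), (5, 1)], [100, 50]))
# [110, 90, 58, 42]
```

Two lanes yield the four final results named in the abstract, in the order sum, difference, sum, difference.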

APPARATUS AND METHOD FOR MULTIPLICATION AND ACCUMULATION OF COMPLEX AND REAL PACKED DATA ELEMENTS

An apparatus and method for multiplying packed real and imaginary components of complex numbers. For example, one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to select real and imaginary data elements in the first source register and second source register to multiply, the multiplier circuitry to multiply each selected imaginary data element in the first source register with a selected real data element in the second source register, and to multiply each selected real data element in the first source register with a selected imaginary data element in the second source register to generate a plurality of imaginary products, adder circuitry to add a first subset of the plurality of imaginary products to generate a first temporary result and to add a second subset of the plurality of imaginary products to generate a second temporary result; negation circuitry to negate the first temporary result to generate a third temporary result and to negate the second temporary result to generate a fourth temporary result; accumulation circuitry to combine the third temporary result with first data from a destination register to generate a first final result and to combine the fourth temporary result with second data from the destination register to generate a second final result and to store the first final result and second final result back in the destination register.
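
A minimal sketch of the add, negate, and accumulate sequence follows, for a single accumulator. It assumes the summed imaginary products are those of an ordinary complex product, so the negated sum is minus the imaginary part; the function name and the use of unbounded Python integers in place of packed lanes are assumptions.

```python
def negated_imag_accumulate(src1, src2, dest):
    """Accumulate -Im(a * b) over paired (real, imaginary) elements.

    Hypothetical helper: lane selection and bit widths are not modelled.
    """
    temp = 0
    for (a_re, a_im), (b_re, b_im) in zip(src1, src2):
        temp += a_im * b_re + a_re * b_im  # sum of imaginary products
    negated = -temp                        # negation step
    return dest + negated                  # accumulation step


src1 = [(1, 2), (3, 4)]
src2 = [(5, 6), (7, 8)]
print(negated_imag_accumulate(src1, src2, dest=10))  # -58
```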

METHOD AND APPARATUS FOR CONCURRENT READING AND CALCULATION OF MIXED RADIX DFT/IDFT

A method for concurrent reading of mixed radix DFT/IDFT data, a method for concurrent calculation of mixed radix DFT/IDFT, an apparatus for concurrent reading of mixed radix DFT/IDFT data, and an apparatus for concurrent calculation of mixed radix DFT/IDFT are provided. The method for concurrent reading includes: configuring dual circulation parameters according to the number of points corresponding to the number of series to be computed and the number of points corresponding to the number of series accomplished; then comparing the maximum number of concurrently read data with the product of the number of points corresponding to the number of series accomplished; and, based on the result of that comparison, calculating the corresponding dual circulation parameters and concurrently reading data based on the calculated dual circulation parameters.

Apparatus and method for controlling operation
10061559 · 2018-08-28

Methods and apparatuses for performing arithmetic operations efficiently and quickly are described. Such arithmetic operations include, but are not limited to, multiplying 2N bit integers, multiplying multiple N-bit integers simultaneously, multiplying 2N bit complex numbers, and other multiplication operations involving coefficients, complex numbers, and complex conjugate numbers.

INTEGRATED CIRCUITS WITH SPECIALIZED PROCESSING BLOCKS FOR PERFORMING FLOATING-POINT FAST FOURIER TRANSFORMS AND COMPLEX MULTIPLICATION
20180088906 · 2018-03-29

Integrated circuits with specialized processing blocks are provided. A specialized processing block may include one real addition stage and one real multiplier stage. The multiplier stage may simultaneously feed its output to the addition stage and directly to an adjacent specialized processing block. The addition stage may also produce sum and difference outputs in parallel. A group of four such specialized processing blocks may be connected in a chain to implement a radix-2 fast Fourier transform (FFT) butterfly. Multiple radix-2 butterflies may be stacked to form higher-order radix butterflies. If desired, the specialized processing block may also be used to implement a complex multiply operation. Three or four specialized processing blocks may be chained together and, along with one or more adders outside the specialized processing blocks, used to generate the real and imaginary portions of a complex product.

Modular multiplication using look-up tables
09652200 · 2017-05-16

Various embodiments relate to a method, system, and non-transitory machine-readable medium encoded with instructions for execution by a processor for performing modular exponentiation, the non-transitory machine-readable medium including: instructions for iteratively calculating a modular exponentiation, b^d mod n, including: instructions for squaring a working value, c; and instructions for conditionally multiplying the working value, c, by a base value, b, dependent on a bit of an exponent, d, including: instructions for unconditionally multiplying the working value, c, by a lookup table entry associated with the base value.
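
The square-then-always-multiply loop can be sketched as below. The two-entry table (1 for a zero exponent bit, b for a one bit), the left-to-right bit order, and the function name `modexp_table` are assumptions; the abstract does not describe the table layout. Python integers are not constant-time, so this shows only the control flow, not a side-channel-hardened implementation.

```python
def modexp_table(b, d, n):
    """Compute b**d mod n with an unconditional table-driven multiply.

    Hypothetical layout: table[0] = 1 leaves the working value unchanged,
    table[1] = b applies the base, so the multiply runs on every bit.
    """
    table = (1 % n, b % n)
    c = 1 % n  # working value
    for i in reversed(range(d.bit_length())):
        c = (c * c) % n                     # square the working value
        c = (c * table[(d >> i) & 1]) % n   # multiply by the table entry
    return c


print(modexp_table(4, 13, 497))  # 445
```

Replacing the branch on the exponent bit with a table lookup keeps the sequence of operations the same for every bit value, which is the usual motivation for this style of loop.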