Patent classifications
G06F7/4806
Method and Apparatus for Configuring a Reduced Instruction Set Computer Processor Architecture to Execute a Fully Homomorphic Encryption Algorithm
Systems and methods for configuring a reduced instruction set computer processor architecture to execute fully homomorphic encryption (FHE) logic gates as a streaming topology. The method includes parsing sequential FHE logic gate code, transforming the FHE logic gate code into a set of code modules that each have in input and an output that is a function of the input and which do not pass control to other functions, creating a node wrapper around each code module, configuring at least one of the primary processing cores to implement the logic element equivalents of each element in a manner which operates in a streaming mode wherein data streams out of corresponding arithmetic logic units into the main memory and other ones of the plurality arithmetic logic units.
Iterative Estimation Hardware
A function estimation hardware logic unit may be implemented as part of an execution pipeline in a processor. The function estimation hardware logic unit is arranged to calculate, in hardware logic, an improved estimate of a function of an input value, d, where the function is given by
The hardware logic comprises a plurality of multipliers and adders arranged to implement a m.sup.th-order polynomial with coefficients that are rational numbers, where m is not equal to two and in various examples m is not equal to a power of two. In various examples i=1, i=2 or i=3. In various examples m=3.
EFFICIENT IMPLEMENTATION OF COMPLEX VECTOR FUSED MULTIPLY ADD AND COMPLEX VECTOR MULTIPLY
Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.
APPARATUS AND METHOD FOR MULTIPLICATION AND ACCUMULATION OF COMPLEX VALUES
An apparatus and method for multiplying packed signed words. For example, one embodiment of a processor comprises: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed signed words; a second source register to store a second plurality of packed signed words; execution circuitry to execute the decoded instruction, the execution circuitry comprising: multiplier circuitry to multiply each of a plurality of packed signed words from the first source register with corresponding packed signed words from the second source register to generate a plurality of products responsive to the decoded instruction, adder circuitry to add the products to generate a first result, and accumulation circuitry to combine the first result with an accumulated result to generate a final result comprising a third plurality of packed signed words, and to write the third plurality of packed signed words or a maximum value to a destination register.
APPARATUS AND METHOD FOR COMPLEX MULTIPLICATION
An embodiment of the invention is a processor including execution circuitry to calculate, in response to a decoded instruction, a result of a complex multiplication of a first complex number and a second complex number. The calculation includes a first operation to calculate a first term of a real component of the result and a first term of the imaginary component of the result. The calculation also includes a second operation to calculate a second term of the real component of the result and a second term of the imaginary component of the result. The processor also includes a decoder, a first source register, and a second source register. The decoder is to decode an instruction to generate the decoded instruction. The first source register is to provide the first complex number and the second source register is to provide the second complex number.
APPARATUS AND METHOD FOR MULTIPLICATION AND ACCUMULATION OF COMPLEX VALUES
An apparatus and method for accumulating complex numbers. For example, one embodiment of a processor comprises: a first source register to store a first plurality of real and imaginary components of a first set of complex numbers; a second source register to store a second plurality of real and imaginary components of a second set of complex numbers; wherein the real and imaginary components of the first and second plurality of are to be stored as packed data elements within the first and second source registers; and execution circuitry comprising: multiplier circuitry to multiply selected real and imaginary values from the first source register with selected real and imaginary values from the second source register to generate a first plurality of values, adder circuitry to add and subtract selected combinations of the first plurality of values to generate a second plurality of values, and accumulation circuitry to combine the second plurality of values with a third set of complex numbers stored in a destination register to generate an accumulated result, the accumulated result to be written to the destination register.
Apparatus and method for complex multiply and accumulate
An embodiment of the invention is a processor including execution circuitry to calculate, in response to a decoded instruction, a result of a complex multiply-accumulate of a first complex number, a second complex number, and a third complex number. The calculation includes a first operation to calculate a first term of a real component of the result and a first term of the imaginary component of the result. The calculation also includes a second operation to calculate a second term of the real component of the result and a second term of the imaginary component of the result. The processor also includes a decoder to decode an instruction to generate the decoded instruction and a first source register, a second source register, and a source and destination register to provide the first complex number, the second complex number, and the third complex number, respectively.
Apparatus and method for complex multiplication
An embodiment of the invention is a processor including execution circuitry to calculate, in response to a decoded instruction, a result of a complex multiplication of a first complex number and a second complex number. The calculation includes a first operation to calculate a first term of a real component of the result and a first term of the imaginary component of the result. The calculation also includes a second operation to calculate a second term of the real component of the result and a second term of the imaginary component of the result. The processor also includes a decoder, a first source register, and a second source register. The decoder is to decode an instruction to generate the decoded instruction. The first source register is to provide the first complex number and the second source register is to provide the second complex number.
Apparatus and method for complex multiplication
An embodiment of the invention is a processor including execution circuitry to calculate, in response to a decoded instruction, a result of a complex multiplication of a first complex number and a second complex number. The calculation includes a first operation to calculate a first term of a real component of the result and a first term of the imaginary component of the result. The calculation also includes a second operation to calculate a second term of the real component of the result and a second term of the imaginary component of the result. The processor also includes a decoder, a first source register, and a second source register. The decoder is to decode an instruction to generate the decoded instruction. The first source register is to provide the first complex number and the second source register is to provide the second complex number.
EFFICIENT IMPLEMENTATION OF COMPLEX VECTOR FUSED MULTIPLY ADD AND COMPLEX VECTOR MULTIPLY
Disclosed embodiments relate to efficient complex vector multiplication. In one example, an apparatus includes execution circuitry, responsive to an instruction having fields to specify multiplier, multiplicand, and summand complex vectors, to perform two operations: first, to generate a double-even multiplicand by duplicating even elements of the specified multiplicand, and to generate a temporary vector using a fused multiply-add (FMA) circuit having A, B, and C inputs set to the specified multiplier, the double-even multiplicand, and the specified summand, respectively, and second, to generate a double-odd multiplicand by duplicating odd elements of the specified multiplicand, to generate a swapped multiplier by swapping even and odd elements of the specified multiplier, and to generate a result using a second FMA circuit having its even product negated, and having A, B, and C inputs set to the swapped multiplier, the double-odd multiplicand, and the temporary vector, respectively.