Patent classifications
G06F7/4812
APPARATUS AND METHOD FOR COMPLEX MULTIPLICATION
An embodiment of the invention is a processor including execution circuitry to calculate, in response to a decoded instruction, a result of a complex multiplication of a first complex number and a second complex number. The calculation includes a first operation to calculate a first term of a real component of the result and a first term of the imaginary component of the result. The calculation also includes a second operation to calculate a second term of the real component of the result and a second term of the imaginary component of the result. The processor also includes a decoder, a first source register, and a second source register. The decoder is to decode an instruction to generate the decoded instruction. The first source register is to provide the first complex number and the second source register is to provide the second complex number.
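The two-operation split described in this abstract can be illustrated with a minimal sketch. The function name, the tuple representation of a complex number, and the choice of which product belongs to which operation are assumptions made for illustration; the abstract does not specify them.

```python
def complex_mul_two_ops(x, y):
    """Multiply x = (a, b) by y = (c, d), each a (real, imaginary) pair.

    Hypothetical split: (a + bi)(c + di) = (ac - bd) + (ad + bc)i.
    """
    a, b = x
    c, d = y
    # First operation: first term of the real and imaginary components.
    re = a * c
    im = a * d
    # Second operation: second term of each component, folded into the first.
    re -= b * d
    im += b * c
    return re, im


print(complex_mul_two_ops((1, 2), (3, 4)))
```

For example, multiplying 1 + 2i by 3 + 4i this way gives the pair (-5, 10), matching the direct expansion.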
Apparatus and method for complex by complex conjugate multiplication
An apparatus and method for multiplying packed real and imaginary components of complex numbers are described. A processor embodiment includes: a decoder to decode a first instruction to generate a decoded instruction; a first source register to store a first plurality of packed real and imaginary data elements; a second source register to store a second plurality of packed real and imaginary data elements; and execution circuitry to execute the decoded instruction. The execution circuitry includes: multiplier circuitry to select real and imaginary data elements in the first source register and second source register, multiply each selected imaginary data element in the first source register with a selected real data element in the second source register, and multiply each selected real data element in the first source register with a selected imaginary data element in the second source register to generate a plurality of imaginary products; adder circuitry to add a first subset of the plurality of imaginary products and subtract a second subset of the plurality of imaginary products to generate a first temporary result, and to add a third subset of the plurality of imaginary products and subtract a fourth subset of the plurality of imaginary products to generate a second temporary result; and accumulation circuitry to combine the first temporary result with first data from a destination register to generate a first final result, combine the second temporary result with second data from the destination register to generate a second final result, and store the first final result and second final result back in the destination register.
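As a rough illustration of the imaginary-product path in this abstract, the sketch below computes the imaginary part of x times the conjugate of y for each packed pair and accumulates it into a destination value. The interleaved [re, im, re, im, ...] layout, the one-accumulator-per-pair destination, and the function name are assumptions, not details taken from the patent.

```python
def conj_mul_imag_accumulate(src1, src2, dest):
    """Accumulate Im(x * conj(y)) per packed complex pair.

    src1, src2: interleaved lists [re0, im0, re1, im1, ...] (assumed layout).
    dest: one accumulator per complex pair (assumed layout).
    """
    out = []
    for i, acc in enumerate(dest):
        a, b = src1[2 * i], src1[2 * i + 1]  # x = a + bi
        c, d = src2[2 * i], src2[2 * i + 1]  # y = c + di
        # Imaginary products: im(x) * re(y) is added, re(x) * im(y) is subtracted.
        temp = b * c - a * d
        # Accumulation: combine the temporary result with the destination data.
        out.append(acc + temp)
    return out


print(conj_mul_imag_accumulate([1, 2, 3, 4], [5, 6, 7, 8], [10, 20]))
```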
N-POINT COMPLEX FOURIER TRANSFORM STRUCTURE HAVING ONLY 2N REAL MULTIPLIES, AND OTHER MATRIX MULTIPLY OPERATIONS
An integrated circuit chip implementing multiplication of an M×N element matrix with an N-element vector to obtain an M-element product by combining the vector with rows of bits of the same significance selected from the matrix one bit-row at a time to form partial products, exploiting the fact that the same potential combinations are needed for all bit-rows and all matrix rows to precompute all of the combinations once and for all, and combining selected partial products for different bit place-significance with a shift-and-add operation only once for each of the M product elements, thereby effectively using only M multiply-equivalent structures. An N-point Complex Fourier Transform can therefore be claimed which only needs 2N real multiplies and the product of an N×N matrix with another N×N matrix requires only N² multiplies.
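The precompute-then-shift-and-add idea can be sketched in software as follows. This is a simplified model that assumes non-negative integer matrix entries of a known bit width; the function names and the lookup-table representation are hypothetical and say nothing about the actual circuit.

```python
def precompute_combinations(vec):
    """Table of sums of every subset of vec, indexed by a bit mask."""
    n = len(vec)
    table = [0] * (1 << n)
    for mask in range(1, 1 << n):
        low = mask & -mask              # lowest set bit of the mask
        j = low.bit_length() - 1        # index of the element it selects
        table[mask] = table[mask ^ low] + vec[j]
    return table


def bit_serial_matvec(matrix, vec, bits):
    """Matrix-vector product using one table lookup per bit-row."""
    table = precompute_combinations(vec)  # computed once, shared by all rows
    result = []
    for row in matrix:
        acc = 0
        for k in range(bits):
            # Gather bit k of every entry in this row into a subset mask.
            mask = 0
            for j, entry in enumerate(row):
                mask |= ((entry >> k) & 1) << j
            # Shift-and-add the selected partial product.
            acc += table[mask] << k
        result.append(acc)
    return result


print(bit_serial_matvec([[1, 2], [3, 4]], [5, 6], bits=3))
```

The table has 2^N entries, so this sketch is only practical for small N; it is meant to show why the per-row work reduces to lookups, shifts, and adds.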
SPARSE ANTENNA ARRAYS FOR AUTOMOTIVE RADAR
An exemplary radar sensing system utilizing a sparse array antenna structure provides an enhanced angular resolution to detect multiple targets with improved accuracy beyond the abilities of conventional radar. The exemplary radar system uses sparsely located antenna array elements allowing improved FOV, angular resolution, beam width, and side lobes using fewer physical antenna elements. Sparse antenna arrays allow the use of physically larger elements, larger separation between transmitter and receiver elements to reduce mutual coupling, and fewer elements to reduce necessary computations.
APPARATUS AND METHOD FOR SCALING PRE-SCALED RESULTS OF COMPLEX MULTIPLY-ACCUMULATE OPERATIONS ON PACKED REAL AND IMAGINARY DATA ELEMENTS
An apparatus and method for performing a transform on complex data. For example, one embodiment of a processor comprises: multiplier circuitry to multiply packed real N-bit data elements in the first source register with packed real M-bit data elements in the second source register and to multiply packed imaginary N-bit data elements in the first source register with packed imaginary M-bit data elements in the second source register to generate at least four real products, adder circuitry to subtract a first selected real product from a second selected real product to generate a first temporary result and to subtract a third selected real product from a fourth selected real product to generate a second temporary result, the adder circuitry to add the first temporary result to a first packed N-bit data element from the third source register to generate a first pre-scaled result, to subtract the first temporary result from the first packed N-bit data element to generate a second pre-scaled result, to add the second temporary result to a second packed N-bit data element from the third source register to generate a third pre-scaled result, and to subtract the second temporary result from the second packed N-bit data element to generate a fourth pre-scaled result; scaling circuitry to scale the first, second, third and fourth pre-scaled results to a specified bit width to generate first, second, third, and fourth final results; and a destination register to store the first, second, third, and fourth final results in specified data element positions.
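One lane of the add/subtract-then-scale flow might look like the sketch below. The use of an arithmetic right shift as the scaling step, the order of the subtraction, and all names are assumptions; the abstract leaves the rounding, saturation, and element selection unspecified.

```python
def real_butterfly_scaled(x, w, acc, shift):
    """Return the two scaled results for one lane (illustrative only).

    x, w: (real, imaginary) integer pairs from the first and second sources.
    acc: integer element from the third source.
    shift: number of bits to drop when scaling (assumed scaling method).
    """
    a, b = x
    c, d = w
    # Real products: real * real and imaginary * imaginary.
    real_real = a * c
    imag_imag = b * d
    # Temporary result: one real product subtracted from the other.
    temp = real_real - imag_imag
    # Pre-scaled results: sum and difference with the third-source element.
    pre_sum = acc + temp
    pre_diff = acc - temp
    # Scaling: arithmetic right shift down to the target width.
    return pre_sum >> shift, pre_diff >> shift


print(real_butterfly_scaled((3, 1), (2, 4), acc=10, shift=1))
```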
INFORMATION PROCESSING APPARATUS, SECURE COMPUTATION METHOD, AND PROGRAM
An information processing apparatus comprises a partial modular exponentiation calculating part and a partial modular exponentiation synthesizing part. The partial modular exponentiation calculating part is given a base in plaintext and a modulo in plaintext and shared exponents and calculates a partial modular exponentiation that equals a set of shared values according to a modular exponentiation of the base raised by the shared exponent. The partial modular exponentiation synthesizing part calculates shared values of the modular exponentiation from the partial modular exponentiation that equals shared values relating to the modular exponentiation of a sum of shared exponents.
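The algebra behind combining partial modular exponentiations can be shown in the clear, assuming the exponent is split into additive integer shares. The sketch below is not a secure protocol: in the apparatus described, the synthesis step would operate on shared values, whereas here everything is plaintext and the function names are hypothetical.

```python
def partial_modexps(base, modulus, exponent_shares):
    """Each share holder raises the public base to its own exponent share."""
    return [pow(base, share, modulus) for share in exponent_shares]


def synthesize(partials, modulus):
    """Multiply the partial results: b^(e1 + e2 + ...) = b^e1 * b^e2 * ..."""
    result = 1
    for p in partials:
        result = (result * p) % modulus
    return result


shares = [5, 7, 11]                      # additive shares of the exponent 23
partials = partial_modexps(3, 1009, shares)
print(synthesize(partials, 1009) == pow(3, sum(shares), 1009))
```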
Method and system for elastic precision enhancement using dynamic shifting in neural networks
A multiplier for calculating a multiplication of a first fixed point number and a second fixed point number comprises a converter and a restoration circuit. The converter is configured to convert the first fixed point number to a sign, a mantissa, and an exponent. At least one of a bit width of the sign, a bit width of the mantissa, and a bit width of the exponent is dynamically configured based on a position of a layer associated with the first fixed point number in a neural network, a position of a pixel in an input feature map associated with the first fixed point number, and/or a channel associated with the first fixed point number. The restoration circuit is configured to calculate the multiplication based on the sign, the mantissa, the exponent, and the second fixed point number.
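A minimal software model of the converter and restoration circuit is sketched below, assuming integers stand in for fixed point values and that only the mantissa width is configurable. The names and the truncation behavior are assumptions; the abstract does not say how the widths are chosen per layer, pixel, or channel.

```python
def convert(x, mantissa_bits):
    """Split an integer into sign, mantissa, and exponent (truncating)."""
    sign = -1 if x < 0 else 1
    magnitude = abs(x)
    # Exponent = number of low-order bits dropped to fit the mantissa width.
    exponent = max(magnitude.bit_length() - mantissa_bits, 0)
    mantissa = magnitude >> exponent
    return sign, mantissa, exponent


def restore_multiply(sign, mantissa, exponent, y):
    """Multiply the converted value by y and shift back to full scale."""
    return sign * ((mantissa * y) << exponent)


s, m, e = convert(-200, mantissa_bits=4)
print((s, m, e), restore_multiply(s, m, e, 3))
```

With a 4-bit mantissa, -200 becomes sign -1, mantissa 12, exponent 4, so the product with 3 comes out as -576 rather than the exact -600. A wider mantissa reduces that error, which is the trade-off a dynamically configured width would tune.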
FAST FOURIER TRANSFORM CIRCUIT OF AUDIO PROCESSING DEVICE
A fast Fourier transform (FFT) circuit of an audio processing device configured to perform an N-point FFT and including a memory circuit and a butterfly operation unit circuit is provided. The butterfly operation unit circuit reads two-point input data from the memory circuit, performs a butterfly operation for the two-point input data according to a twiddle factor to generate two-point output data, and writes the two-point output data into the memory circuit. The butterfly operation unit circuit includes a multiplier and a plurality of adders/subtractors. The multiplier sequentially multiplies real or imaginary coefficients of one of the two-point input data by real or imaginary coefficients of the twiddle factor in multiple clock cycles. The multiplier performs a multiplication once in each clock cycle. The adders/subtractors perform addition/subtraction, such that the butterfly operation unit circuit generates the two-point output data.
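The single-multiplier butterfly can be modeled as a loop that issues one real multiplication per iteration, standing in for one clock cycle. The sketch assumes a radix-2 butterfly of the form a + w·b and a - w·b with four sequential multiplications; the abstract does not state the butterfly form or the cycle count, and the function name is hypothetical.

```python
def butterfly_single_multiplier(a, b, w):
    """Radix-2 butterfly using one real multiplication per loop iteration."""
    # Operand pairs fed to the single multiplier, one per "clock cycle".
    operands = [
        (b.real, w.real),
        (b.imag, w.imag),
        (b.real, w.imag),
        (b.imag, w.real),
    ]
    products = []
    for p, q in operands:
        products.append(p * q)  # exactly one multiplication per cycle
    rr, ii, ri, ir = products
    # Adders/subtractors assemble w * b, then the two outputs.
    t = complex(rr - ii, ri + ir)
    return a + t, a - t


print(butterfly_single_multiplier(1 + 1j, 2 + 0j, 0 - 1j))
```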
Multiplication Circuit, System on Chip, and Electronic Device
A multiplication circuit is provided. The circuit is configured to perform a multiplication operation on two pieces of data, A and B, and includes: an addition subcircuit configured to obtain logarithmic field data a and b that correspond to A and B, and perform an addition operation on a and b to obtain c, where c includes an integral part and a fractional part; an exponentiation operation subcircuit configured to perform an exponentiation operation in which a base is 2 and an exponent is the fractional part of c, to obtain an exponentiation operation result; a shift subcircuit configured to shift the exponentiation operation result based on the integral part of c to obtain a shift result; and an output subcircuit configured to output a product of A and B based on signs of a and b and with reference to the shift result.
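The four subcircuits map naturally onto a floating-point sketch. Here the logarithmic field data is taken to be the base-2 logarithm of the magnitude, with the sign tracked separately from the original operands; that encoding, the zero handling, and the function name are assumptions rather than details from the abstract.

```python
import math


def log_domain_multiply(A, B):
    """Approximate A * B by adding base-2 logarithms (illustrative model)."""
    if A == 0 or B == 0:
        return 0.0  # zero has no logarithm; handled separately (assumption)
    # Addition subcircuit: c = a + b in the logarithmic field.
    a = math.log2(abs(A))
    b = math.log2(abs(B))
    c = a + b
    integral = math.floor(c)
    fractional = c - integral
    # Exponentiation subcircuit: 2 raised to the fractional part of c.
    power = 2.0 ** fractional
    # Shift subcircuit: scale by 2**integral, i.e. a binary shift.
    shifted = math.ldexp(power, integral)
    # Output subcircuit: apply the sign of the product.
    sign = -1.0 if (A < 0) != (B < 0) else 1.0
    return sign * shifted


print(log_domain_multiply(6, -7))
```

Because the fractional part always lies in [0, 1), the exponentiation result lies in [1, 2), which is what lets a hardware design use a small lookup or approximation for that step.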