Patent classifications
G06F7/44
MATRIX MULTIPLICATION IN HARDWARE USING MODULAR MATH
A first group of modulo result matrices corresponding to modulo of elements of a first matrix by each of a plurality of moduli is stored. A second group of modulo result matrices corresponding to modulo of elements of a second matrix by each of the plurality of moduli is stored. It is determined whether an element operation of a multiplication of the first matrix with the second matrix can be performed using a first hardware multiplication module rather than a second hardware multiplication module. In response to a determination that the element operation can be performed using the first hardware multiplication module, the element operation is performed using the first hardware multiplication module including by multiplying one or more corresponding elements from the first group of modulo result matrices with one or more corresponding elements from the second group of modulo result matrices.
CRYPTOGRAPHIC PROCESSING DEVICE AND METHOD FOR CRYPTOGRAPHICALLY PROCESSING DATA
A cryptographic processing device for cryptographically processing data, having a memory configured to store a first operand and a second operand represented by the data to be cryptographically processed, wherein the first operand and the second operand each correspond to an indexed array of data words, and a cryptographic processor configured to determine, for cryptographically processing the data, a product of the first operand with the second operand by accumulating results of partial multiplications, each partial multiplication comprising the multiplication of a data word of the first operand with a data word of the second operand wherein the cryptographic processor is configured to perform the partial multiplications in successive blocks of partial multiplications, each block being associated with a result index range and a first operand index range and each block comprising all partial multiplications between data words of the first operand within the first operand index range with data words of the second operand such that a sum of indices of the data word of the first operand and of the data word of the second operand is within the result index range.
NEURAL NETWORK PROCESSOR USING DYADIC WEIGHT MATRIX AND OPERATION METHOD THEREOF
An neural network (NN) processor includes an input feature map buffer configured to store an input feature matrix, a weight buffer configured to store a weight matrix trained in a form of a, a transform circuit configured to perform a Walsh-Hadamard transform on an input feature vector obtained from the input feature matrix and a weight vector included in the weight matrix to output a transformed input feature vector and a transformed weight vector, and an arithmetic circuit configured to perform an element-wise multiplication (EWM) on the transformed input feature vector and the transformed weight vector.
NEURAL NETWORK PROCESSOR USING DYADIC WEIGHT MATRIX AND OPERATION METHOD THEREOF
An neural network (NN) processor includes an input feature map buffer configured to store an input feature matrix, a weight buffer configured to store a weight matrix trained in a form of a, a transform circuit configured to perform a Walsh-Hadamard transform on an input feature vector obtained from the input feature matrix and a weight vector included in the weight matrix to output a transformed input feature vector and a transformed weight vector, and an arithmetic circuit configured to perform an element-wise multiplication (EWM) on the transformed input feature vector and the transformed weight vector.
RF square-law circuit
A circuit includes a first transistor that conducts a first current responsive to a DC bias voltage and an RF signal. A second transistor conducts a second current responsive to the DC bias voltage. The first current and the second current are mirrored through a pair of current mirrors coupled together through a low-pass filter to filter the envelope of the RF signal.
PROCESSOR WITH EFFICIENT ARITHMETIC UNITS
A processor includes a carry save array multiplier. The carry save array multiplier includes an array of cascaded partial product generators. The array of cascaded partial product generators is configured to generate an output value as a product of two operands presented at inputs of the multiplier. The array of cascaded partial product generators is also configured to generate an output value as a sum of two operands presented at inputs of the multiplier.
PROCESSOR WITH EFFICIENT ARITHMETIC UNITS
A processor includes a carry save array multiplier. The carry save array multiplier includes an array of cascaded partial product generators. The array of cascaded partial product generators is configured to generate an output value as a product of two operands presented at inputs of the multiplier. The array of cascaded partial product generators is also configured to generate an output value as a sum of two operands presented at inputs of the multiplier.
Computing and summing up multiple products in a single multiplier
Methods, systems and computer program products for computing and summing up multiple products in a single multiplier are provided. Aspects include receiving a first number and a second number, creating partial products of the first number and the second number based on a multiplication of the first number and the second number, and reducing the number of partial products to create an intermediate result. Aspects also include receiving a third number and a fourth number, creating partial products of the third number and the fourth number based on a multiplication of the third number and the fourth number, creating a reduction tree and adding the intermediate result to the reduction tree. Aspects further include reducing the number of partial products in the reduction tree to create a second sum value and a second carry value and adding the second sum value and the second carry value to create a result.
Digital signal processing blocks with embedded arithmetic circuits
A specialized processing block on an integrated circuit includes a first and second arithmetic operator stage, an output coupled to another specialized processing block, and configurable interconnect circuitry which may be configured to route signals throughout the specialized processing block, including in and out of the first and second arithmetic operator stages. The configurable interconnect circuitry may further include multiplexer circuitry to route selected signals. The output of the specialized processing block that is coupled to another specialized processing block together with the configurable interconnect circuitry reduces the need to use resources outside the specialized processing block when implementing mathematical functions that require the use of more than one specialized processing block. An example for such mathematical functions include the implementation of vector (dot product) operations, FIR filters, or sum-of-product operations.
Digital signal processing blocks with embedded arithmetic circuits
A specialized processing block on an integrated circuit includes a first and second arithmetic operator stage, an output coupled to another specialized processing block, and configurable interconnect circuitry which may be configured to route signals throughout the specialized processing block, including in and out of the first and second arithmetic operator stages. The configurable interconnect circuitry may further include multiplexer circuitry to route selected signals. The output of the specialized processing block that is coupled to another specialized processing block together with the configurable interconnect circuitry reduces the need to use resources outside the specialized processing block when implementing mathematical functions that require the use of more than one specialized processing block. An example for such mathematical functions include the implementation of vector (dot product) operations, FIR filters, or sum-of-product operations.