Patent classifications
G06F7/5324
Multiple Mode Arithmetic Circuit
A tile of an FPGA includes a multiple mode arithmetic circuit. The multiple mode arithmetic circuit is configured by control signals to operate in an integer mode, a floating-point mode, or both. In some example embodiments, multiple integer modes (e.g., unsigned, two's complement, and sign-magnitude) are selectable, multiple floating-point modes (e.g., 16-bit mantissa and 8-bit sign, 8-bit mantissa and 6-bit sign, and 6-bit mantissa and 6-bit sign) are supported, or any suitable combination thereof. The tile may also fuse a memory circuit with the arithmetic circuits. Connections directly between multiple instances of the tile are also available, allowing multiple tiles to be treated as larger memories or arithmetic circuits. By using these connections, referred to as cascade inputs and outputs, the input and output bandwidth of the arithmetic circuit is further increased.
DEVICE AND METHOD FOR ACCELERATING MATRIX MULTIPLY OPERATIONS
A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.
Multiple Mode Arithmetic Circuit
A tile of an FPGA includes a multiple mode arithmetic circuit. The multiple mode arithmetic circuit is configured by control signals to operate in an integer mode, a floating-point mode, or both. In some example embodiments, multiple integer modes (e.g., unsigned, two's complement, and sign-magnitude) are selectable, multiple floating-point modes (e.g., 16-bit mantissa and 8-bit sign, 8-bit mantissa and 6-bit sign, and 6-bit mantissa and 6-bit sign) are supported, or any suitable combination thereof. The tile may also fuse a memory circuit with the arithmetic circuits. Connections directly between multiple instances of the tile are also available, allowing multiple tiles to be treated as larger memories or arithmetic circuits. By using these connections, referred to as cascade inputs and outputs, the input and output bandwidth of the arithmetic circuit is further increased.
MULTIPLE MULTIPLICATION ARRAYS
A data processing apparatus is provided. An A×B multiplier array has a group of logic gates clocked by a first clock signal, where A and B are both integers. A C×D multiplier array, separate from the A×B multiplier array, has second group of logic gates clocked by a second clock signal, where C and D are both integers. Addition circuitry performs an addition operation between a first at least partial product produced by the A×B multiplier array and a second at least partial product produced by the C×D multiplier array.
COMPUTATIONAL MEMORY
A processing device includes a two-dimensional array of processing elements, each processing element including an arithmetic logic unit to perform an operation. The device further includes interconnections among the two-dimensional array of processing elements to provide direct communication among neighboring processing elements of the two-dimensional array of processing elements. A processing element of the two-dimensional array of processing elements is connected to a first neighbor processing element that is immediately adjacent the processing element in a first dimension of the two-dimensional array. The processing element is further connected to a second neighbor processing element that is immediately adjacent the processing element in a second dimension of the two-dimensional array.
SIGNED MULTIWORD MULTIPLIER
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a hardware circuit configured as a signed multiword multiplier. The circuit includes a processing circuit that receives inputs that each have a respective bit-width. The processing circuit can represent at least one input as a signed multiword input based on the first input having a bit-width that exceeds a fixed bit-width of the hardware circuit. The circuit includes signed multipliers that are each configured to multiply signed inputs. Each signed multiplier includes multiplication circuitry configured to: receive the signed multiword input; receive a signed second input; and generate a signed output in response to multiplying the signed multiword input with the signed second input.
Reduced latency multiplier circuitry for very large numbers
An integrated circuit with a large multiplier is provided. The multiplier may be configured to receive large input operands with thousands of bits. The multiplier may be implemented using a multiplier decomposition scheme that is recursively flattened into multiple decomposition levels to expose a tree of adders. The adders may be collapsed into a merged pipelined structure, where partial sums are forwarded from one level to the next while bypassing intervening prefix networks. The final correct sum is not calculated until later. In accordance with the decomposition technique, the partial sums are successively halved, which allows the prefix networks to be smaller from one level to the next. This allows all sums to be calculated at approximately the same pipeline depth, which significantly reduces latency with no or limited pipeline balancing.
Device for performing multiply/accumulate operations
A circuit for performing multiply/accumulate operations evaluates a type of each value of a pair of input values. Signed values are split into sign and magnitude. One or more pairs of arguments are input to a multiplier such that the arguments have fewer bits than the magnitude of signed values or unsigned values. This may include splitting input values into multiple arguments and inputting multiple pairs of arguments to the multiplier for a single pair of input values.
Multi-partition operation in combination operations
In an environment where multiple datasets are to be combined, systems and methods are disclosed for allocating a group of data entries from at least one dataset into multiple partitions. For a particular partition, the subgroup in the partition can be combined with data entries from the other dataset. In some cases, groups of data entries from each dataset are assigned to different partitions. For a particular partition, a subgroup is duplicated, some of the data entries of the subgroup are reassigned to other partitions, the subgroup is reformed to include data entries from other partitions, and the reformed subgroup is combined with the subgroup from the other dataset(s).
Method for multiplying polynomials for a cryptographic operation
A method is provided for multiplying two polynomials. In the method, first and second polynomials are evaluated at 2t inputs, where t is greater than or equal to one, and where each input is a fixed power of two 2.sup.l/(2t) multiplied with a different power of a primitive root of unity, thereby creating 2 times 2t integers, where l is an integer such that 2.sup.l is at least as large as the largest coefficient of the resulting product multiplying the first and second polynomials. The 2 times 2t integers are then multiplied pairwise, and a modular reduction is performed to get 2t integers. A linear combination of the 2t integers multiplied with primitive roots of unity is computed to get 2t integers whose limbs in the base 2.sup.l-bit representation correspond to coefficients of the product of the first and second polynomials. The method can be implemented on a processor designed for performing RSA and/or ECC type cryptographic operations.