G06F7/49

FAST MODULAR MULTIPLICATION OF LARGE INTEGERS

In an approach, a processor receives a plurality of first operand values, where the first operand values are integer values. A processor adds, using binary addition, the plurality of first operand values resulting in a sum value S. A processor determines a single combined modular correction term D for a binary sum of all operand values based on leading bits of the sum value S. A processor performs a modular addition of S and D resulting in a modular sum of said plurality of said first operand values.
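As a rough illustration of the steps in this abstract, the sketch below sums the operands with ordinary addition, reads the bits of S above the modulus width, and folds them back with one correction term before a final modular addition. The function name `modular_sum`, the choice of the modulus bit length as the split point, and the way D is derived are assumptions made for illustration only; the abstract does not specify them. Operands are assumed to be non-negative.

```python
def modular_sum(operands, modulus):
    """Sum non-negative integers modulo `modulus` with a single correction step."""
    n = modulus.bit_length()          # assumed split point: width of the modulus
    mask = (1 << n) - 1

    s = sum(operands)                 # plain binary addition, no per-step reduction
    high = s >> n                     # leading bits of S above the n-bit field
    low = s & mask                    # remaining n low-order bits

    # Single combined correction term: the leading bits stand for high * 2^n,
    # so replace that weight by its residue. In hardware this could be a small
    # lookup table indexed by `high` (an assumption, not stated in the abstract).
    d = (high << n) % modulus

    # Modular addition of the low part and the correction term.
    # low < 2^n <= 2 * modulus and d < modulus, so at most two subtractions.
    t = low + d
    while t >= modulus:
        t -= modulus
    return t
```

For example, `modular_sum([250, 251, 252, 253], 255)` agrees with `(250 + 251 + 252 + 253) % 255`, while only one correction is computed for the whole sum.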

RECONFIGURABLE DIGITAL SIGNAL PROCESSING (DSP) VECTOR ENGINE

Systems and methods described herein may relate to providing dynamically configurable circuitry able to process data associated with a variety of matrix dimensions, one or more complex number operations, one or more real number operations, or both. Configurations may be applied to the configurable circuitry to program the configurable circuitry for a next operation. The configurable circuitry may process data according to a variety of operations based at least in part on operation of a repeated processing element coupled in a compute network of processing elements.

HIGH PERFORMANCE MERGE SORT WITH SCALABLE PARALLELIZATION AND FULL-THROUGHPUT REDUCTION
20200159492 · 2020-05-21

Disclosed herein is a novel multi-way merge network, referred to herein as a Hybrid Comparison Look Ahead Merge (HCLAM), which incurs significantly less resource consumption when scaled to handle larger problems. In addition, a parallelization scheme is disclosed, referred to herein as Parallelization by Radix Pre-sorter (PRaP), which enables an increase in streaming throughput of the merge network. Furthermore, a high-performance reduction scheme is disclosed to achieve full throughput.
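To make the terms concrete, here is a generic software model of what a multi-way merge computes, plus one simple way a radix pre-sort can split the work into independent merges. It is not the HCLAM or PRaP design, whose internals the abstract does not describe. The function names, the use of the most significant key bits for partitioning, and the parameters `key_bits` and `radix_bits` are all assumptions for illustration.

```python
import heapq


def multiway_merge(sorted_runs):
    """Merge any number of already-sorted runs into one sorted list."""
    return list(heapq.merge(*sorted_runs))


def radix_partitioned_merge(sorted_runs, key_bits, radix_bits):
    """Split keys by their top `radix_bits` bits, then merge each bucket on its own.

    Assumes integer keys in the range [0, 2**key_bits). Each bucket could be
    handled by a separate merge unit; here they are processed one after another.
    """
    shift = key_bits - radix_bits
    # buckets[b][i] holds the keys of run i whose top bits equal b.
    buckets = [[[] for _ in sorted_runs] for _ in range(1 << radix_bits)]
    for i, run in enumerate(sorted_runs):
        for key in run:
            buckets[key >> shift][i].append(key)  # order within a run is preserved

    merged = []
    for bucket_runs in buckets:                   # buckets are disjoint key ranges
        merged.extend(heapq.merge(*bucket_runs))
    return merged
```

Because each bucket covers a disjoint, increasing key range, concatenating the per-bucket results gives the same output as a single merge over all runs.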

NATIVE TERNARY RANDOM NUMBERS GENERATION
20200150929 · 2020-05-14

A system and method for random number generation are presented. A plurality of parameter values are generated by inspecting a plurality of cells in a memory array. Each parameter value in the plurality of parameter values is associated with a cell of the plurality of cells. A plurality of unstable cells in the plurality of cells are identified. Each unstable cell in the plurality of unstable cells is associated with a parameter value within a threshold value of an average of the plurality of parameter values. First, second, and third groups of cells in the plurality of unstable cells are identified and associated with values. The groups are determined based upon the parameter values associated with the cells in each group. A data stream is generated using the first group of cells, the second group of cells, and the third group of cells.
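The selection and grouping steps can be sketched as follows. This only models the bookkeeping: it takes one fixed list of measured parameter values, whereas a real device would draw its randomness from the physical behavior of the unstable cells. Splitting the unstable cells into rank-based thirds, assigning the values 0, 1, and 2 to the groups, and emitting one value per cell in address order are assumptions for illustration; the abstract does not say how the groups are bounded or how the stream is formed.

```python
from statistics import mean


def group_unstable_cells(params, threshold):
    """Return three lists of cell indices drawn from the unstable cells.

    A cell counts as unstable when its parameter value lies within
    `threshold` of the average over all cells.
    """
    avg = mean(params)
    unstable = [i for i, p in enumerate(params) if abs(p - avg) <= threshold]
    # Rank unstable cells by parameter value (ties broken by cell index).
    ranked = sorted(unstable, key=lambda i: (params[i], i))
    n = len(ranked)
    cuts = [0, n // 3, 2 * n // 3, n]   # assumed split: lowest, middle, highest third
    return [ranked[cuts[g]:cuts[g + 1]] for g in range(3)]


def trit_stream(params, threshold):
    """Emit one ternary digit per unstable cell, in cell-address order."""
    groups = group_unstable_cells(params, threshold)
    trit_of = {cell: g for g, cells in enumerate(groups) for cell in cells}
    return [trit_of[cell] for cell in sorted(trit_of)]
```

With `params = [10, 50, 49, 90, 51, 52, 48, 47]` and `threshold = 5`, cells 0 and 3 are excluded as stable and the remaining six cells are divided evenly among the three groups.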

Processor and Methods Configured to Provide a Low-Complexity Input/Output Pruning Fast Fourier Transform

In some embodiments, a circuit may include an input configured to receive a signal and a radix-r input/output pruning fast Fourier transform (FFT) processing element coupled to the input. The radix-r input/output pruning FFT processing element may be configured to remove FFT operations on input values of zero within the signal and to determine a discrete Fourier Transform (DFT) output having fewer output values than a number of input values of the signal.
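The effect of input and output pruning can be seen in a small model. The sketch below evaluates the DFT definition directly rather than a radix-r FFT, so it shows what is skipped, not how the described processing element is built. The name `pruned_dft` and the `wanted_bins` parameter are hypothetical.

```python
import cmath


def pruned_dft(signal, wanted_bins):
    """Compute only the requested DFT bins, skipping zero-valued inputs."""
    n = len(signal)
    # Input pruning: zero samples contribute nothing, so drop them up front.
    nonzero = [(t, x) for t, x in enumerate(signal) if x != 0]

    result = {}
    for k in wanted_bins:             # output pruning: only the bins asked for
        result[k] = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in nonzero
        )
    return result
```

For an 8-point signal with two nonzero samples and two requested bins, this performs 4 complex multiply-accumulates instead of the 64 needed for the unpruned direct DFT.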

Radix-2^3 Fast Fourier Transform for an Embedded Digital Signal Processor
20200142670 · 2020-05-07

In some embodiments, a circuit may include an input configured to receive a signal and a radix-2^3 fast Fourier transform (FFT) processing element coupled to the input. The radix-2^3 FFT processing element may be configured to control variation of twiddle factors during calculation of a complete FFT through a plurality of processing stages. The radix-2^3 FFT processing element may be configured to incorporate the twiddle factors and adder tree matrices of the calculation into a single stage.