Patent classifications
G06F7/505
Device and Method of Handling a Modular Multiplication
A modular operation device for handling a modular multiplication, comprises a controller, configured to divide a multiplicand into a plurality of multiplicand words, a multiplier into a plurality of multiplier words, and a modulus into a plurality of modulus words; a first plurality of processing elements, coupled to the controller, configured to compute a first plurality of updated carry results and a first plurality of updated sum results; a second plurality of processing elements, coupled to the controller, configured to compute a second plurality of updated carry results and a second plurality of updated sum results; and a reduction element, coupled to the controller, configured to compute a resulting remainder according to the second plurality of updated carry results and the second plurality of updated sum results.
Device and Method of Handling a Modular Multiplication
A modular operation device for handling a modular multiplication, comprises a controller, configured to divide a multiplicand into a plurality of multiplicand words, a multiplier into a plurality of multiplier words, and a modulus into a plurality of modulus words; a first plurality of processing elements, coupled to the controller, configured to compute a first plurality of updated carry results and a first plurality of updated sum results; a second plurality of processing elements, coupled to the controller, configured to compute a second plurality of updated carry results and a second plurality of updated sum results; and a reduction element, coupled to the controller, configured to compute a resulting remainder according to the second plurality of updated carry results and the second plurality of updated sum results.
IN-MEMORY BIT-SERIAL ADDITION SYSTEM
An in-memory vector addition method for a dynamic random access memory (DRAM) is disclosed which includes consecutively transposing two numbers across a plurality of rows of the DRAM, each number transposed across a fixed number of rows associated with a corresponding number of bits, assigning a scratch-pad including two consecutive bits for each bit of each number being added, two consecutive bits for carry-in (C.sub.in), and two consecutive bits for carry-out-bar (
Multiple accumulate busses in a systolic array
Systems and methods are provided to enable parallelized multiply-accumulate operations in a systolic array. Each column of the systolic array can include multiple busses enabling independent transmission of input partial sums along the respective bus. Each processing element of a given columnar bus can receive an input partial sum from a prior element of the given columnar bus, and perform arithmetic operations on the input partial sum. Each processing element can generate an output partial sum based on the arithmetic operations, provide the output partial sum to a next processing element of the given columnar bus, without the output partial sum being processed by a processing element of the column located between the two processing elements that uses a different columnar bus. Use of columnar busses can enable parallelization to increase speed or enable increased latency at individual processing elements.
Multiple accumulate busses in a systolic array
Systems and methods are provided to enable parallelized multiply-accumulate operations in a systolic array. Each column of the systolic array can include multiple busses enabling independent transmission of input partial sums along the respective bus. Each processing element of a given columnar bus can receive an input partial sum from a prior element of the given columnar bus, and perform arithmetic operations on the input partial sum. Each processing element can generate an output partial sum based on the arithmetic operations, provide the output partial sum to a next processing element of the given columnar bus, without the output partial sum being processed by a processing element of the column located between the two processing elements that uses a different columnar bus. Use of columnar busses can enable parallelization to increase speed or enable increased latency at individual processing elements.
QUANTUM CIRCUIT OPTIMIZATION USING WINDOWED QUANTUM ARITHMETIC
Methods, systems and apparatus for performing windowed quantum arithmetic. In one aspect, a method for performing a product addition operation includes: determining multiple entries of a lookup table, comprising, for each index in a first set of indices, multiplying the index value by a scalar for the product addition operation; for each index in a second set of indices, determining multiple address values, comprising extracting source register values corresponding to indices between i) the index in the second set of indices, and ii) the index in the second set of indices plus the predetermined window size; and adjusting values of a target quantum register based on the determined multiple entries of the lookup table and the determined multiple address values.
QUANTUM CIRCUIT OPTIMIZATION USING WINDOWED QUANTUM ARITHMETIC
Methods, systems and apparatus for performing windowed quantum arithmetic. In one aspect, a method for performing a product addition operation includes: determining multiple entries of a lookup table, comprising, for each index in a first set of indices, multiplying the index value by a scalar for the product addition operation; for each index in a second set of indices, determining multiple address values, comprising extracting source register values corresponding to indices between i) the index in the second set of indices, and ii) the index in the second set of indices plus the predetermined window size; and adjusting values of a target quantum register based on the determined multiple entries of the lookup table and the determined multiple address values.
Measurement based uncomputation for quantum circuit optimization
Methods and apparatus for optimizing a quantum circuit. In one aspect, a method includes identifying one or more sequences of operations in the quantum circuit that un-compute respective qubits on which the quantum circuit operates; generating an adjusted quantum circuit, comprising, for each identified sequence of operations in the quantum circuit, replacing the sequence of operations with an X basis measurement and a classically-controlled phase correction operation, wherein a result of the X basis measurement acts as a control for the classically-controlled correction phase operation; and executing the adjusted quantum circuit.
Measurement based uncomputation for quantum circuit optimization
Methods and apparatus for optimizing a quantum circuit. In one aspect, a method includes identifying one or more sequences of operations in the quantum circuit that un-compute respective qubits on which the quantum circuit operates; generating an adjusted quantum circuit, comprising, for each identified sequence of operations in the quantum circuit, replacing the sequence of operations with an X basis measurement and a classically-controlled phase correction operation, wherein a result of the X basis measurement acts as a control for the classically-controlled correction phase operation; and executing the adjusted quantum circuit.
Methods of manufacturing semiconductor devices by etching active fins using etching masks
In a method of manufacturing a semiconductor device, first to third active fins are formed on a substrate. Each of the first to third active fins extends in a first direction, and the second active fin, the first active fin, and the third active fin are disposed in this order in a second direction crossing the first direction. The second active fin is removed using a first etching mask covering the first and third active fins. The third active fin is removed using a second etching mask covering the first active fin and a portion of the substrate from which the second active fin is removed. A first gate structure is formed on the first active fin. A first source/drain layer is formed on a portion of the first active fin adjacent the first gate structure.