IPIQ

G06F7/729

Method of operating neural networks, corresponding network, apparatus and computer program product

11308406 · 2022-04-19 ·

Stmicroelectronics S.R.L.

A method of operating neural networks such as convolutional neural networks including, e.g., an input layer, an output layer and at least one intermediate layer between the input layer and the output layer, with the network layers including operating circuits performing arithmetic operations on input data to provide output data. The method includes: selecting a set of operating circuits in the network layers; performing arithmetic operations in the operating circuits of the selected set by performing Residue Number System (RNS) operations on RNS-converted input data, thereby obtaining RNS output data in the Residue Number System; and backward converting from the Residue Number System the RNS output data resulting from the RNS operations.
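
The forward conversion, RNS arithmetic, and backward conversion steps named in this abstract can be illustrated with a minimal sketch. The moduli set, the function names, and the use of the Chinese remainder theorem for the backward conversion are assumptions made for illustration; the abstract does not specify them, and inputs are assumed non-negative with a result below the product of the moduli.

```python
from math import prod

# Hypothetical pairwise-coprime moduli; dynamic range is 7*11*13*15 = 15015.
MODULI = (7, 11, 13, 15)

def to_rns(x, moduli=MODULI):
    """Forward conversion: one residue per modulus."""
    return tuple(x % m for m in moduli)

def rns_dot(xs, ws, moduli=MODULI):
    """Dot product computed independently in each residue channel."""
    acc = [0] * len(moduli)
    for x, w in zip(xs, ws):
        xr, wr = to_rns(x, moduli), to_rns(w, moduli)
        acc = [(a + p * q) % m for a, p, q, m in zip(acc, xr, wr, moduli)]
    return tuple(acc)

def from_rns(residues, moduli=MODULI):
    """Backward conversion via the Chinese remainder theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse (Python 3.8+)
    return total % M

result = from_rns(rns_dot([3, 5, 2], [4, 6, 7]))  # 3*4 + 5*6 + 2*7
```

Each residue channel works on small numbers and never exchanges carries with the others, which is the property that makes RNS attractive for parallel multiply-accumulate hardware.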

Secure transformation from a residue number system to a radix representation

11755288 · 2023-09-12 ·

Koninklijke Philips N.V.

An electronic calculating device (100) arranged to convert an input number (y) represented ((y.sub.1, y.sub.2, . . . , y.sub.k)) in a residue number system (RNS) to an output number represented in a radix representation ((e.sub.0, e.sub.1, . . . e.sub.s−1)), the calculating device comprising an input interface (110) arranged to receive the input number (y) represented in the residue number system, and a processor circuit (120) configured to iteratively update an intermediate number (ŷ) represented in the residue number system, wherein iterations produce the digits (e.sub.0, e.sub.1, . . . e.sub.s−1) in the radix representation with respect to the bases (b.sub.0, b.sub.1, . . . , b.sub.s−1), at least one iteration comprises computing the intermediate number modulo a base (b.sub.t) of the radix representation to obtain a digit (e.sub.t=(ŷ).sub.bt) of the radix representation, updating the intermediate number (ŷ←(ŷ−e.sub.t+F)/b.sub.t) by subtracting the digit from the intermediate number, adding an obfuscating number (F; F.sub.t), and dividing by the base (b.sub.t).

Applications of and techniques for quickly computing a modulo operation by a Mersenne or a Fermat number

11455145 · 2022-09-27 ·

Nvidia Corporation

Various embodiments include a modulo operation generator associated with a cache memory in a computer-based system. The modulo operation generator generates a first sum by performing an addition and/or a subtraction function on an input address. A first portion of the first sum is applied to a lookup table that generates a correction value. The correction value is then added to a second portion of the first sum to generate a second sum. The second sum is adjusted, as needed, to be less than the divisor. The adjusted second sum forms a residue value that identifies a cache memory slice in which the input data value corresponding to the input address is stored. By generating the residue value in this manner, the cache memory distributes input data values among the slices in a cache memory even when the number of slices is not a power of two.
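
As a rough illustration of why a Mersenne divisor makes the modulo cheap, the sketch below reduces an address modulo 2^k − 1 using only shifts, masks, and additions. It relies on the identity 2^k ≡ 1 (mod 2^k − 1). This is the generic chunk-folding technique, not the lookup-table circuit described in the abstract; the function name and the choice of seven slices are hypothetical.

```python
def mod_mersenne(x, k):
    """Return x mod (2**k - 1) for a non-negative integer x, without division.

    Because 2**k is congruent to 1 modulo 2**k - 1, the high bits of x can be
    folded onto the low k bits by addition until the value fits in k bits.
    """
    m = (1 << k) - 1
    while x > m:
        x = (x & m) + (x >> k)  # fold high part onto low part
    return 0 if x == m else x   # final adjustment: m itself is congruent to 0

# Hypothetical use: pick one of 7 cache slices (7 = 2**3 - 1) for an address.
slice_index = mod_mersenne(0x1F4A3C40, 3)
```

The final comparison plays the role of the abstract's adjustment step that brings the sum below the divisor.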

Hardware accelerator method, system and device

11442700 · 2022-09-13 ·

A system includes an addressable memory array, one or more processing cores, and an accelerator framework coupled to the addressable memory. The accelerator framework includes a Multiply ACcumulate (MAC) hardware accelerator cluster. The MAC hardware accelerator cluster has a binary-to-residual converter, which, in operation, converts binary inputs to a residual number system. Converting a binary input to the residual number system includes a reduction modulo 2.sup.m and a reduction modulo 2.sup.m−1, where m is a positive integer. A plurality of MAC hardware accelerators perform modulo 2.sup.m multiply-and-accumulate operations and modulo 2.sup.m−1 multiply-and-accumulate operations using the converted binary input. A residual-to-binary converter generates a binary output based on the output of the MAC hardware accelerators.
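
The two-modulus scheme in this abstract can be sketched in a few lines. The sketch assumes the moduli are 2^m and 2^m − 1 (reading "2.sup.m−1" as 2 to the power m, minus one), which are coprime, and it uses a closed-form reverse conversion that follows from the Chinese remainder theorem. All function names are hypothetical, and operands are assumed non-negative with a true result below 2^m · (2^m − 1).

```python
def to_residues(x, m):
    """Binary-to-residue conversion for moduli 2**m and 2**m - 1."""
    mask = (1 << m) - 1
    return x & mask, x % mask

def mac(pairs, m):
    """Multiply-accumulate carried out separately in both residue channels."""
    mask = (1 << m) - 1
    acc_pow2, acc_mers = 0, 0
    for a, b in pairs:
        a1, a2 = to_residues(a, m)
        b1, b2 = to_residues(b, m)
        acc_pow2 = (acc_pow2 + a1 * b1) & mask   # modulo 2**m
        acc_mers = (acc_mers + a2 * b2) % mask   # modulo 2**m - 1
    return acc_pow2, acc_mers

def to_binary(r_pow2, r_mers, m):
    """Residue-to-binary conversion.

    Writing x = r_mers + (2**m - 1) * t and reducing modulo 2**m gives
    t = (r_mers - r_pow2) mod 2**m, since 2**m - 1 is congruent to -1.
    """
    mask = (1 << m) - 1
    t = (r_mers - r_pow2) & mask
    return r_mers + mask * t

pairs = [(12, 34), (56, 78), (9, 10)]
total = to_binary(*mac(pairs, 8), 8)  # true sum of products is 4866
```

The modulo 2^m channel needs only truncation, and the modulo 2^m − 1 channel needs only end-around-carry addition in hardware, which is a common reason for choosing this pair.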

Residue number system in a photonic matrix accelerator

11836466 · 2023-12-05 ·

Lightmatter, Inc.

A photonic processor uses light signals and a residue number system (RNS) to perform calculations. The processor sums two or more values by shifting the phase of a light signal with phase shifters and reading out the summed phase with a coherent detector. Because phase winds back every 2π radians, the photonic processor performs addition modulo 2π. A photonic processor may use the summation of phases to perform dot products and correct erroneous residues. A photonic processor may use the RNS in combination with a positional number system (PNS) to extend the numerical range of the photonic processor, which may be used to accelerate homomorphic encryption (HE)-based deep learning.
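
The link between phase wrapping and modular addition can be mimicked numerically. The sketch below assumes a residue r modulo m is encoded as the phase 2πr/m, so that accumulating phases and reading the result modulo 2π yields the sum modulo m. The encoding and the function names are assumptions for illustration; the abstract does not give the mapping.

```python
import math

TWO_PI = 2 * math.pi

def encode(r, m):
    """Map a residue r (mod m) to a phase in [0, 2*pi)."""
    return TWO_PI * (r % m) / m

def add_phases(phases):
    """Accumulate phase shifts; the readout wraps every 2*pi radians."""
    return math.fmod(sum(phases), TWO_PI)

def decode(phase, m):
    """Round the detected phase to the nearest residue level."""
    return round(phase * m / TWO_PI) % m

m = 7
detected = add_phases([encode(r, m) for r in (5, 6, 4)])
residue_sum = decode(detected, m)  # (5 + 6 + 4) mod 7
```

Rounding in `decode` stands in for the detector's tolerance to small phase errors; larger errors would land on a neighbouring residue, which is where residue error correction becomes relevant.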

Multiple Sinusoid Signal Sub-Nyquist Sampling Method Based on Multi-channel Time Delay Sampling System

20210226830 · 2021-07-22 ·

The disclosure discloses a multiple sinusoid signal sub-Nyquist sampling method based on a multi-channel time delay sampling system. The method includes step 1: initializing; step 2: enabling multiple sinusoid signals x(t) to respectively enter N′ parallel sampling channels after the multiple sinusoid signals are divided, wherein a sampling time delay of adjacent channels is τ, and the number of sampling points of each channel is N; step 3: combining sampled data of each sampling channel to construct an autocorrelation matrix R.sub.xx, and estimating sampling signal parameters c.sub.m of each channel and a set of frequency parameters f̂.sub.m by utilizing the ESPRIT method; step 4: estimating signal amplitudes α.sub.m and another set of frequency parameters f.sub.m.sup.′ through the estimated parameters c.sub.m and the sampling time delay τ of each channel by utilizing the ESPRIT method; and step 5: reconstructing 2K frequency parameters f̂.sub.m through the two sets of estimated minimum frequency parameters f.sub.m and f.sub.m.sup.′ by utilizing a closed-form robust Chinese remainder theorem, and screening out K correct frequency parameters {f̂.sub.k}.sub.k=0.sup.K-1 through sampling rate parameters. The disclosure is configured to solve problems of frequency aliasing and image frequency aliasing occurring in real-valued multiple sinusoid signal sub-Nyquist sampling.

METHOD AND ARCHITECTURE FOR ACCELERATING DETERMINISTIC STOCHASTIC COMPUTING USING RESIDUE NUMBER SYSTEM

20210241085 · 2021-08-05 ·

Inaccuracy of computations is an important challenge with the Stochastic Computing (SC) paradigm. Deterministic approaches to SC have recently been proposed to produce completely accurate results with SC circuits. Instead of random bit-streams, the computations are performed on structured deterministic bit-streams. However, current deterministic methods take a large number of clock cycles to produce a correct result. This long processing time directly translates to very high energy consumption. This invention proposes a design methodology based on the Residue Number System (RNS) to mitigate the long processing time of the deterministic methods. Compared to the state-of-the-art deterministic methods of SC, the proposed approach delivers improvements in terms of processing time and energy consumption.

Performing processing using hardware counters in a computer system

11029921 · 2021-06-08 ·

International Business Machines Corporation

Performing processing using hardware counters in a computer system includes storing, in association with greatest common divisor (GCD) processing of the system, a first variable in a first redundant binary representation and a second variable in a second redundant binary representation. Each such redundant binary representation includes a respective sum term and a respective carry term, and a numerical value being represented by a redundant binary representation is equal to a sum of the sum and carry terms of the redundant binary representation. The process performs redundant arithmetic operations of the GCD processing on the first and second variables using hardware counter(s), of the computer system, that take input values in redundant binary representation form and provide output values in redundant binary representation form. The process uses output of the redundant arithmetic operations of the GCD processing to obtain an output GCD of integer inputs to the GCD processing.
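
The sum-plus-carry representation described here can be illustrated with a carry-save adder, used as a stand-in for the hardware counters of the abstract, whose internal design is not given. In the sketch, a value is held as a pair (sum, carry) whose ordinary sum is the represented number, and two such pairs are added without ever propagating a carry across the full word. Function names are hypothetical and inputs are assumed non-negative.

```python
def csa(x, y, z):
    """3:2 carry-save adder: returns (s, c) with s + c == x + y + z."""
    s = x ^ y ^ z                                # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1       # carries, shifted into place
    return s, c

def add_redundant(a, b):
    """Add two (sum, carry) pairs; the result is again a (sum, carry) pair."""
    s1, c1 = csa(a[0], a[1], b[0])
    return csa(s1, c1, b[1])

def value(r):
    """Collapse a redundant pair to the integer it represents."""
    return r[0] + r[1]

a = (5, 8)   # represents 13
b = (3, 4)   # represents 7
r = add_redundant(a, b)
```

Only `value` performs a carry-propagating addition, so a long chain of redundant operations pays that cost once, at the end.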

HARDWARE ACCELERATOR METHOD, SYSTEM AND DEVICE

20200310761 · 2020-10-01 ·

A system includes an addressable memory array, one or more processing cores, and an accelerator framework coupled to the addressable memory. The accelerator framework includes a Multiply ACcumulate (MAC) hardware accelerator cluster. The MAC hardware accelerator cluster has a binary-to-residual converter, which, in operation, converts binary inputs to a residual number system. Converting a binary input to the residual number system includes a reduction modulo 2.sup.m and a reduction modulo 2.sup.m−1, where m is a positive integer. A plurality of MAC hardware accelerators perform modulo 2.sup.m multiply-and-accumulate operations and modulo 2.sup.m−1 multiply-and-accumulate operations using the converted binary input. A residual-to-binary converter generates a binary output based on the output of the MAC hardware accelerators.

PERFORMING PROCESSING USING HARDWARE COUNTERS IN A COMPUTER SYSTEM

20200264842 · 2020-08-20 ·
