G06F7/722

HOMOMORPHIC ENCRYPTION FOR MACHINE LEARNING AND NEURAL NETWORKS USING HIGH-THROUGHPUT CRT EVALUATION

Embodiments are directed to homomorphic encryption for machine learning and neural networks using high-throughput Chinese remainder theorem (CRT) evaluation. An embodiment of an apparatus includes a hardware accelerator to receive a ciphertext generated by homomorphic encryption (HE) for evaluation, decompose coefficients of the ciphertext into a set of decomposed coefficients, multiply the decomposed coefficients using a set of smaller moduli determined based on a larger modulus, and convert results of the multiplying back to an original form corresponding to the larger modulus by performing a reverse Chinese remainder theorem (CRT) transform on the results of multiplying the decomposed coefficients.
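The decompose, multiply, and reverse-CRT flow described above can be illustrated with a minimal Python sketch on plain integers. It is not the claimed accelerator design: the function names and the small pairwise-coprime moduli are assumptions chosen for illustration, and a real HE evaluator would apply the same steps to polynomial coefficients.

```python
from math import prod

def to_residues(x, moduli):
    """Decompose x into one residue per small modulus."""
    return [x % m for m in moduli]

def from_residues(residues, moduli):
    """Reverse CRT: rebuild the value modulo the product of the moduli."""
    big_modulus = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        partial = big_modulus // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+);
        # it exists because the moduli are pairwise coprime.
        total += r * partial * pow(partial, -1, m)
    return total % big_modulus

def crt_multiply(a, b, moduli):
    """Multiply a and b modulo prod(moduli) using only small-modulus products."""
    ra = to_residues(a, moduli)
    rb = to_residues(b, moduli)
    rc = [(x * y) % m for x, y, m in zip(ra, rb, moduli)]
    return from_residues(rc, moduli)

# Illustrative moduli (assumed, pairwise coprime primes).
MODULI = [97, 101, 103]
```

Each per-modulus product is independent of the others, which is what makes the decomposed form attractive for parallel hardware.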

HOMOMORPHIC ENCRYPTION FOR MACHINE LEARNING AND NEURAL NETWORKS USING HIGH-THROUGHPUT CRT EVALUATION

Embodiments are directed to homomorphic encryption for machine learning and neural networks using high-throughput Chinese remainder theorem (CRT) evaluation. An embodiment of an apparatus includes a hardware accelerator to receive a ciphertext generated by homomorphic encryption (HE) for evaluation, decompose coefficients of the ciphertext into a set of decomposed coefficients, multiply the decomposed coefficients using a set of smaller moduli determined based on a larger modulus, and convert results of the multiplying back to an original form corresponding to the larger modulus.

TWIDDLE FACTOR GENERATING CIRCUIT FOR AN NTT PROCESSOR

A circuit for generating twiddle factors for an NTT processor. The circuit includes a cache management module, a modular multipliers bank, and a central controller. The cache management module includes a local controller and a cache memory in which operands are stored for calculating future twiddle factors. The modular multipliers bank includes an interconnection matrix at its input that distributes operands to the modular multiplier inputs. The circuit can be configured to minimise the size of the cache memory and/or reduce the latency of the twiddle factor sequence calculation. Finally, the generating circuit may include several calculation management modules sharing the same modular multipliers bank to generate sequences of twiddle factors on several finite fields.
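As a rough software analogy, twiddle factors are successive powers of a root of unity modulo a prime, and previously computed powers can be cached and reused to produce later ones. The sketch below uses a doubling schedule, which is an assumption made for illustration and not necessarily the schedule used by the described circuit; the function name and the example values (omega = 2, q = 17, n = 8) are also illustrative.

```python
def twiddle_factors(omega, n, q):
    """Return [omega**0, ..., omega**(n-1)] mod q by reusing cached powers."""
    cache = [1 % q]
    step = omega % q  # always omega**len(cache) mod q
    while len(cache) < n:
        # Every product below is independent of the others, so in hardware
        # they could be spread across a bank of modular multipliers.
        cache.extend((t * step) % q for t in list(cache))
        step = (step * step) % q
    return cache[:n]

# 2 is a primitive 8th root of unity modulo 17 (2**4 = 16 = -1 mod 17).
EXAMPLE = twiddle_factors(2, 8, 17)  # [1, 2, 4, 8, 16, 15, 13, 9]
```

The doubling schedule trades a larger cache for fewer sequential steps; a purely sequential schedule would keep one cached value and take n - 1 dependent multiplications.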

INTEGRATED CIRCUIT FOR MODULAR MULTIPLICATION OF TWO INTEGERS FOR A CRYPTOGRAPHIC METHOD, AND METHOD FOR THE CRYPTOGRAPHIC PROCESSING OF DATA BASED ON MODULAR MULTIPLICATION
20210243006 · 2021-08-05

Integrated circuits for modular multiplication of two integers for a cryptographic method, and methods for the cryptographic processing of data based on modular multiplication are herein disclosed. For example, an integrated circuit for modular multiplication of two integers for a cryptographic method has a processor that represents the integers to be multiplied in Montgomery representation with a specified Montgomery representation parameter and a specified modulus, and calculates the result of the modular multiplication of the integers to be multiplied in Montgomery representation iteratively from the least significant word to the most significant word, where for a word calculated in an iteration, the product of the word with a specified factor is added to the words of subsequent iterations, the specified factor given by the product of the negative inverse of the least significant word of the modulus and the modulus, without the least significant word of the product, plus one.
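For orientation, the following is a minimal sketch of the standard textbook word-wise Montgomery multiplication, processing one word per iteration from least significant to most significant. It does not implement the specific factor described in the abstract; the function names, the word size, and the example modulus are assumptions for illustration.

```python
def montgomery_multiply(a_bar, b_bar, n, w, num_words):
    """Return a_bar * b_bar * R**-1 mod n, where R = 2**(w * num_words).

    Assumes n is odd, n < R, and both inputs are below n.
    """
    base = 1 << w
    mask = base - 1
    # Negative inverse of the modulus modulo one word.
    n_prime = (-pow(n, -1, base)) % base
    t = a_bar * b_bar
    for _ in range(num_words):
        # Pick m so that the least significant word of t + m * n is zero,
        # then drop that word.
        m = ((t & mask) * n_prime) & mask
        t = (t + m * n) >> w
    return t - n if t >= n else t

def to_montgomery(x, n, w, num_words):
    """Map x into Montgomery representation x * R mod n."""
    return (x << (w * num_words)) % n

def from_montgomery(x_bar, n, w, num_words):
    """Map back by multiplying with 1, which removes one factor of R."""
    return montgomery_multiply(x_bar, 1, n, w, num_words)
```

A product is computed by converting both operands with `to_montgomery`, multiplying with `montgomery_multiply`, and converting the result back with `from_montgomery`.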

OPTIMIZATION TECHNIQUE FOR MODULAR MULTIPLICATION ALGORITHMS
20230401037 · 2023-12-14

Methods and apparatus for optimization techniques for modular multiplication algorithms. The optimization techniques may be applied to variants of modular multiplication algorithms, including variants of Montgomery multiplication algorithms and Barrett multiplication algorithms. The optimization techniques reduce the number of serial steps in Montgomery reduction and Barrett reduction. Modular multiplication operations involving products of integer inputs A and B may be performed in parallel to obtain a value C that is reduced to a residual RES. Modular multiplication and modular reduction operations may be performed in parallel. The number of serial steps in the modular reductions is reduced to L, where w is the digit size in bits and L = ⌈k/w⌉ is the number of digits in the operands.
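The digit count and a baseline reduction can be sketched as follows. This is the standard single-step Barrett reduction, not the optimized variant the abstract refers to, and it assumes that k denotes the operand length in bits, which the abstract does not state explicitly. The function names are illustrative.

```python
def num_digits(k, w):
    """Number of w-bit digits needed for a k-bit operand: ceil(k / w)."""
    return -(-k // w)

def barrett_reduce(c, n, k):
    """Return c mod n for 0 <= c < n**2, where n fits in k bits."""
    # Precomputed once per modulus: floor(4**k / n).
    mu = (1 << (2 * k)) // n
    # Quotient estimate; never too large, and too small by only a small amount.
    q = (c * mu) >> (2 * k)
    r = c - q * n
    # A few conditional subtractions correct the underestimate.
    while r >= n:
        r -= n
    return r
```

For example, with a 20-bit modulus and 8-bit digits, `num_digits(20, 8)` gives 3 digits per operand.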

Method, device, and system for task processing
11018864 · 2021-05-25

A number of RSA computing tasks that have different word lengths which are less than a maximum word length of an operand register are processed at the same time by combining a number of different word lengths to be equal to or less than the maximum word length of the operand register.
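One way to picture the combining step is as a packing problem: task word lengths are grouped so that each group's total does not exceed the operand register width. The sketch below uses a greedy first-fit-decreasing heuristic, which is an assumption made for illustration and not necessarily the grouping rule used in the patent; the function name is hypothetical.

```python
def pack_tasks(word_lengths, max_word_length):
    """Group task word lengths so each group's sum fits the register width."""
    batches = []
    for length in sorted(word_lengths, reverse=True):
        if length > max_word_length:
            raise ValueError("task does not fit in the operand register")
        for batch in batches:
            if sum(batch) + length <= max_word_length:
                batch.append(length)
                break
        else:
            # No existing batch has room, so start a new one.
            batches.append([length])
    return batches
```

With a 4096-bit register, tasks of 2048, 1024, and 1024 bits fill one batch, and two 512-bit tasks share a second batch.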

Performing processing using hardware counters in a computer system

Performing processing using hardware counters in a computer system includes storing, in association with greatest common divisor (GCD) processing of the system, a first variable in a first redundant binary representation and a second variable in a second redundant binary representation. Each such redundant binary representation includes a respective sum term and a respective carry term, and a numerical value being represented by a redundant binary representation is equal to a sum of the sum and carry terms of the redundant binary representation. The process performs redundant arithmetic operations of the GCD processing on the first and second variables using hardware counter(s), of the computer system, that take input values in redundant binary representation form and provide output values in redundant binary representation form. The process uses output of the redundant arithmetic operations of the GCD processing to obtain an output GCD of integer inputs to the GCD processing.
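The sum-plus-carry representation can be illustrated with a carry-save adder, in which a (3,2) counter compresses three inputs into a sum term and a carry term without propagating carries. The sketch below covers only addition of non-negative values in this form; it does not implement the GCD procedure itself, and the function names are assumptions.

```python
def carry_save_add(x, y, z):
    """Compress three non-negative integers into (sum, carry) with sum + carry == x + y + z."""
    s = x ^ y ^ z                            # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1   # majority bits become carries
    return s, c

def add_redundant(a, b):
    """Add two values held as (sum, carry) pairs; the result stays in redundant form."""
    s, c = carry_save_add(a[0], a[1], b[0])
    return carry_save_add(s, c, b[1])

def to_int(value):
    """Resolve a redundant (sum, carry) pair to its ordinary integer value."""
    return value[0] + value[1]
```

Only `to_int` needs a full carry-propagating addition, so a long chain of redundant operations pays that cost once at the end.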

REPEATED MODULO METHOD AND APPARATUS FOR SIZE-LIMITATION OF INTERIM CALCULATION RESULTS
20210111873 · 2021-04-15

A method and apparatus for limiting the size of large numbers during numeric calculations, such as during encryption and decryption calculations.
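A familiar example of keeping interim results small is square-and-multiply modular exponentiation, where a modulo operation follows every multiplication so that no intermediate value exceeds the square of the modulus. This is a generic sketch offered for illustration; the abstract does not specify the method at this level of detail, and the function name is an assumption.

```python
def modexp(base, exponent, modulus):
    """Compute base**exponent mod modulus, reducing after every multiplication."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:
            result = (result * base) % modulus  # interim value stays below modulus**2
        base = (base * base) % modulus
        exponent >>= 1
    return result
```

Without the repeated reduction, the intermediate value for a 2048-bit exponent would be far too large to store.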

Systems and Methods for Low Latency Modular Multiplication
20210117157 · 2021-04-22

An integrated circuit device includes multiplier circuitry configured to determine a plurality of columns of subproducts by multiplying a plurality of values. Each column of the plurality of columns includes one or more subproducts of a plurality of subproducts. The integrated circuit device also includes adder circuitry configured to determine a plurality of sums, each sum being a sum of one column of the plurality of columns. A first portion of the adder circuitry associated with a first column of the plurality of columns is configured to receive a first value and second value that are associated with the first column and a third value associated with a second column of the plurality of columns that differs from the first column. The third value is a carry-out value generated by a second portion of the adder circuitry associated with the second column of the plurality of columns.
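The column arrangement can be pictured in software as schoolbook multiplication organised by columns: every subproduct of digits i and j lands in column i + j, each column is summed, and the carry-out of one column feeds the next. This sketch resolves columns sequentially and is only an analogy for the described adder circuitry; the function name and the little-endian digit layout are assumptions.

```python
def column_multiply(a_digits, b_digits, base):
    """Multiply two little-endian digit lists by summing columns of subproducts."""
    n_cols = len(a_digits) + len(b_digits)
    columns = [[] for _ in range(n_cols)]
    for i, a in enumerate(a_digits):
        for j, b in enumerate(b_digits):
            columns[i + j].append(a * b)  # subproduct belongs to column i + j
    result = []
    carry = 0
    for col in columns:
        total = sum(col) + carry  # column sum plus carry-out of the previous column
        result.append(total % base)
        carry = total // base
    return result
```

For instance, multiplying 1234 by 5678 in base 10 takes the digit lists `[4, 3, 2, 1]` and `[8, 7, 6, 5]` and yields the digits of 7006652 in little-endian order.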

APPARATUS FOR PROCESSING MODULAR MULTIPLY OPERATION AND METHODS THEREOF
20200374103 · 2020-11-26

Disclosed is a ciphertext computation method. The ciphertext computation method includes: receiving a modular computation command for a plurality of ciphertexts; performing a modular computation for the plurality of ciphertexts by using a lookup table that stores information on a plurality of predetermined prime numbers; and outputting a result of the computation.