Low complexity conversion to Montgomery domain
11508263 · 2022-11-22
Assignee
Inventors
Cpc classification
G09C1/00
PHYSICS
H04L9/302
ELECTRICITY
H04L9/0841
ELECTRICITY
International classification
G06F21/00
PHYSICS
G09C1/00
PHYSICS
Abstract
Disclosed herein is an apparatus for calculating a cryptographic component R.sup.2 mod n for a cryptographic function, where n is a modulo number and R is a constant greater than n. The apparatus comprises a processor configured to set a start value to be equal to R mod n, perform b iterations of a shift and subtract operation on the start value to produce a base value, wherein the start value is set to be equal to the base value after each iteration, set a multiplication operand to be equal to the base value, and perform k iterations of a Montgomery modular multiplication of the multiplication operand with the multiplication operand to produce an intermediate result, wherein the multiplication operand is set to be equal to the intermediate result after each iteration, wherein the shift and subtract operation comprises determining a shifted start value which is equivalent to the start value multiplied by two, and subtracting n from the shifted start value if the shifted start value is greater than or equal to n.
Claims
1. A method for calculating, using a processor, a cryptographic component R.sup.2 mod n for a cryptographic function, where n is a modulo number and R is a constant greater than n, the method comprising: setting a start value to be equal to R mod n; performing b iterations of a shift and subtract operation on the start value to produce a base value, wherein the start value is set to be equal to the base value after each iteration; setting a multiplication operand to be equal to the base value; and performing k iterations of a Montgomery modular multiplication of the multiplication operand with the multiplication operand to produce an intermediate result, wherein the multiplication operand is set to be equal to the intermediate result after each iteration; wherein the shift and subtract operation comprises determining a shifted start value which is equivalent to the start value multiplied by two, and subtracting n from the shifted start value if the shifted start value is greater than or equal to n.
2. The method of claim 1, wherein: the cryptographic component is an integer equal to R.sup.2 mod n; R is of the form 2.sup.l; l is an integer; n is an integer which is less than R; and R is coprime with n.
3. The method of claim 1, wherein determining a shifted start value which is equivalent to the start value multiplied by two comprises performing a single bit left shift of the start value.
4. The method of claim 1, further comprising setting the value of b based on the value of R.
5. The method of claim 1, further comprising setting the value of b to be equal to the lowest integer value for which a single iteration of the Montgomery multiplication consumes less processing time than b iterations of the shift and subtract operation.
6. The method of claim 1, further comprising setting the value of k based on the value of b.
7. The method of claim 6, wherein the value of k is set based on the value of b to satisfy the relationship 2.sup.2.sup.
8. The method of claim 6, wherein the value of k is set based on the value of b to satisfy the equation R=2.sup.2.sup.
9. The method of claim 1, further comprising setting the value of b based on a calculation of the power consumed to calculate the cryptographic component, or based on the time taken to calculate the cryptographic component.
10. The method of claim 1, further comprising: determining an adjustment parameter indicative of a difference between the intermediate result and the cryptographic component; and performing Montgomery multiplication of the intermediate result with the adjustment parameter, to calculate the cryptographic component for the cryptographic function.
11. The method of claim 10, wherein determining the adjustment parameter comprises determining an excess parameter, the excess parameter being indicative of a difference between the intermediate result and the cryptographic component.
12. The method of claim 10, wherein the adjustment parameter is a function of an inverse of the excess parameter.
13. The method of claim 10, wherein determining the excess parameter comprises: determining an integer c for which 2.sup.2.sup.
14. The method of claim 10, wherein the adjustment parameter is equal to the inverse of the excess parameter multiplied by R mod n.
15. The method of claim 10, wherein the adjustment parameter is equal to the inverse of the excess parameter multiplied by R.
16. An apparatus for calculating a cryptographic component R.sup.2 mod n for a cryptographic function, where n is a modulo number and R is a constant greater than n, the apparatus comprising a processor configured to: set a start value to be equal to R mod n; perform b iterations of a shift and subtract operation on the start value to produce a base value, wherein the start value is set to be equal to the base value after each iteration; set a multiplication operand to be equal to the base value; and perform k iterations of a Montgomery modular multiplication of the multiplication operand with the multiplication operand to produce an intermediate result, wherein the multiplication operand is set to be equal to the intermediate result after each iteration; wherein the shift and subtract operation comprises determining a shifted start value which is equivalent to the start value multiplied by two, and subtracting n from the shifted start value if the shifted start value is greater than or equal to n.
17. The apparatus of claim 16, wherein the processor is further configured to: determine an adjustment parameter indicative of a difference between the intermediate result and the cryptographic component; and perform Montgomery multiplication of the intermediate result with the adjustment parameter, to calculate the cryptographic component for the cryptographic function.
18. The apparatus of claim 17, wherein the processor comprises: a shift and subtract unit having a first input, a second input, and an output; a Montgomery multiplication unit having a first input and a second input for receiving a first operand and a second operand, respectively, and having an output for supplying an intermediate result, the Montgomery multiplication unit configured to perform a Montgomery multiplication of the first operand with the second operand to produce the intermediate result; and a controller for controlling the values of the first input and the second input, the controller configured to perform the steps of, setting the input to the shift and subtract unit to be equal to a start value, iteratively setting the input to the shift and subtract unit to be equal to the output of the shift and subtract unit, for b shift and subtract operation iterations, setting the first input and the second input of the Montgomery multiplication unit to be equal to the output of the shift and subtract unit, and iteratively setting the first input and the second input to the intermediate multiplication result, for k Montgomery multiplication iterations.
19. The apparatus of claim 18, wherein the controller is further configured to perform the steps of: determining an adjustment parameter indicative of a difference between the intermediate result and the cryptographic component; and performing Montgomery multiplication of the intermediate result with the adjustment parameter, to calculate the cryptographic component for the cryptographic function.
20. An apparatus for calculating a cryptographic component R.sup.2 mod n for a cryptographic function, where n is a modulo number and R is a constant greater than n, the apparatus comprising: means for setting a start value to be equal to R mod n; means for performing b iterations of a shift and subtract operation on the start value to produce a base value, wherein the start value is set to be equal to the base value after each iteration; means for setting a multiplication operand to be equal to the base value; and means for performing k iterations of Montgomery modular multiplication of the multiplication operand with the multiplication operand to produce an intermediate result, wherein the multiplication operand is set to be equal to the intermediate result after each iteration; wherein the shift and subtract operation comprises determining a shifted start value which is equivalent to the start value multiplied by two, and subtracting n from the shifted start value if the shifted start value is greater than or equal to n.
Description
BRIEF DESCRIPTION OF DRAWINGS
(1) The technology will be described with reference to the following drawings, in which:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
DESCRIPTION OF EMBODIMENTS
(10) Cryptographic Processing Unit
(11)
(12) The cryptographic processing unit 102 comprises a control logic unit 104, which receives control and data input via signal 106. The control logic unit 104 outputs resulting data from cryptographic processing via data signal 106. The cryptographic processing unit 102 further comprises a Montgomery arithmetic logic unit (ALU) 110, which is configured to convert to and from the Montgomery domain, and to process within the Montgomery domain. The cryptographic processing unit 102 also comprises a cryptographic ALU 114, which is configured to provide cryptographic processing outside the Montgomery domain. The modules of the cryptographic processing unit 102 are clocked by clock signal 120.
(13) Calculating s.sup.e mod n
(14) In one embodiment, the cryptographic processor 102 is configured to perform cryptographic functions in accordance with the Rivest-Shamir-Adleman (RSA) cryptographic algorithm. The cryptographic processor 102 performs modular exponentiation, as part of the RSA algorithm, to compute z=s.sup.e mod n, where n is a large number which may be greater than 2020 bits long in a binary representation.
(15) To calculate z=s.sup.e mod n, the cryptographic processor 102 converts the operand s to a corresponding Montgomery domain representation, which is an alternative mathematical representation of the operand s. The cryptographic processor 102 then calculates the Montgomery domain representation of z within the Montgomery domain. The cryptographic processor 102 may then convert the Montgomery domain representation of z to the integer domain, to produce the desired result z.
(16) Converting to the Montgomery Domain
(17) Converting to and from the Montgomery domain is an additional step not performed in other modulo multiplication algorithms; however, when performing many multiplications in a row, as in modular exponentiation, intermediate results can be left in the Montgomery domain representation, and the initial and final conversions may become a negligible fraction of the overall computation.
(18) In one embodiment, the cryptographic processor 102 converts the operand s to the Montgomery domain by computing s′=sR mod n, for some R=2.sup.l>n, where l is greater than or equal to the number of bits of n. The cryptographic processor 102 calculates the value sR mod n by performing modulo multiplication of the operand (s=s mod n) with the conversion component (R.sup.2 mod n). The cryptographic processor then applies a Montgomery reduction function REDC ((s mod n)(R.sup.2 mod n)) to reduce the modulo multiplication product by a factor of R, to produce the result sR mod n.
(19) Montgomery Multiplication
(20) The combined operation of the modulo multiplication of two Montgomery domain operands, modulo n, and the subsequent application of the Montgomery reduction function to the resulting product of the modulo multiplication is called Montgomery multiplication.
(21) Montgomery multiplication is a function that can be performed by the Montgomery ALU 110 in a number of circumstances. For example, in one embodiment, the cryptographic processor 102 configures the Montgomery ALU 110 to perform the Montgomery multiplication function on two Montgomery domain operands aR mod n and bR mod n, as defined by the cryptographic processor control 104. In performing Montgomery multiplication, the Montgomery ALU 110 multiplies the Montgomery domain operands aR mod n and bR mod n within the Montgomery domain to produce product abR.sup.2 mod n. The Montgomery ALU 110 then applies the Montgomery reduction function to produce abR mod n, which is the Montgomery domain form of the desired product of operands a and b modulo n.
(22) For some cryptographic algorithms, the Montgomery ALU 110 then converts the Montgomery domain form of the product out of the Montgomery domain, by performing a second Montgomery reduction function. Alternatively, if the cryptographic algorithm performed by the cryptographic processor 102 specifies further modulo multiplication involving the product, the cryptographic processor 102 may keep the Montgomery domain form of the product for use as an operand of further Montgomery multiplication functions.
(23) Conversion Component
(24) In order to convert an operand to its corresponding Montgomery domain representation, the cryptographic processor 102 calculates the conversion component R.sup.2 mod n. The calculation of R.sup.2 mod n involves the determination of an integer value equal to R.sup.2 mod n. Such an integer is referred to as the “cryptographic component”, “conversion component” or simply “component” throughout this disclosure.
(25) Calculation of the Component Via Direct Modulo Reduction
(26) Processor 102 can calculate R.sup.2 mod n by performing direct modulo reduction, e.g. by applying a Euclidean algorithm. When R.sup.2 is significantly larger than n, direct modulo reduction may take many iterations.
(27) For small exponents, e, such as the exponents typically used for signature verification algorithms, the direct reduction of R.sup.2 mod n may take over half the time of the computation of s.sup.e mod n. Accordingly, the direct modulo reduction of R.sup.2 mod n can be quite computationally expensive.
(28) Embodiments of the present disclosure seek to ameliorate the issue of expensive calculation of the component R.sup.2 mod n by reducing the clock cycles consumed for the calculation of the component R.sup.2 mod n, compared to calculating R.sup.2 mod n via direct modulo reduction.
(29) Accordingly, embodiments of the present disclosure provide a method and apparatus for calculation of the component R.sup.2 mod n, through application of iterative Montgomery multiplication calculations, followed by an adjustment of the Montgomery multiplication product.
(30) Montgomery ALU Subsection
(31)
(32) Control logic block 202 provides control signals 214, 228 to control the function of other blocks within the subsection 200. Control logic block 202 also provides data values, via signals 224 and 226, to be stored in Register A 216 and Register B 218, respectively.
(33) Multiplexer 204 is a 3:2 multiplexer which maps three input signals 206, 208, 222 to two output signals 210, 212 in accordance with the value of the selection signal 214. The selection signal 214 is set by the control logic block 202 and indicates the mapping of one of the three input signals 206, 208, 222 to output signal 210. The selection signal 214 also indicates the mapping of one of the three input signals 206, 208, 222 to output signal 212. Input signal 206 is set by Register A 216. Input signal 208 is set by Register B 218.
(34) Montgomery multiplication block 220 operates to perform Montgomery multiplication on a first and a second operand. The first operand is provided, via multiplexer 204, on data signal 210, and the second operand is provided, via multiplexer 204, on data signal 212. Functional block 220 outputs the result of Montgomery multiplication of the first operand and the second operand on output signal 222. Control logic block 202, multiplexer 204 and Montgomery multiplication unit 220 are all clocked by clock signal 120.
(35) The Montgomery ALU 110 provides control signals and parameters to the control logic block 202 of the subsection 200, via signal 230. Parameters can comprise the values of R and n. The Montgomery ALU 110 provides the component R.sup.2 mod n, as output from the Montgomery multiplication on data signal 222, to logic units within the cryptographic processing unit 102. The component R.sup.2 mod n may then be used in cryptographic functions performed by the cryptographic processing unit 102.
First Embodiment—Base of 2R mod n
(36) In a first embodiment, the cryptographic component (R.sup.2 mod n) is calculated by determining an exponent 2.sup.k which raises 2 to a value greater than R, such that 2.sup.2.sup.
(37) The Montgomery ALU 110 then determines how much greater the intermediate multiplication result is compared to the component R.sup.2 mod n, based on the value of k, and defines an excess parameter based thereon. Then, the Montgomery ALU performs a Montgomery multiplication of the intermediate multiplication result and an adjustment parameter, where the adjustment parameter is a function of the inverse of the excess parameter, to produce the component R.sup.2 mod n.
(38)
(39) In step 1, 302, of method 300, the Montgomery ALU 110 determines the base value δ.sub.0 and stores the base value in Register A 216. In accordance with the first embodiment, the base value δ.sub.0 is set to 2R mod n=2.sup.2.sup.
(40) If R=2.sup.l is greater than n, and n is greater than 2.sup.l−1, then the base value δ.sub.0=2R mod n is equal to either 2R−2n or 2R−3n, both of which may be calculated. Since R>n, it follows that 2R-2n>0. On the other hand, since R/2<n, it follows that 2R-4n<0. Therefore, only 2R−2n or 2R−3n are candidates for 2R mod n. If n>2/3R, then 3n>2R, therefore 2R−3n<0. In this case, 2R mod n=2R−2n. On the other hand, if n<2/3R, then 3n<2R, therefore 2R−3n>0, and 2R mod n=2R−3n.
(41) In step 2, 304, of method 300, the Montgomery ALU 110 sets a first and a second Montgomery multiplication operand to be equal to the base value δ.sub.0, which was determined in step 1, 302.
(42) In step 3, 306, of method 300, the Montgomery ALU 110 iteratively performs Montgomery multiplication operations on the first and second operands to produce an intermediate multiplication result at each iteration. At the end of each iteration, the first and second operands are set to the intermediate multiplication result calculated via the Montgomery multiplication operation.
(43) A termination condition may be defined in accordance with different embodiments of the present disclosure. In one embodiment, the termination condition is met when the intermediate multiplication result is in the form yR mod n, wherein y is greater than R. In an alternative embodiment, the termination condition is met when the intermediate multiplication result is in the form yR mod n, wherein y satisfies the form y≤R≤y.sup.2.
(44) In accordance with the first embodiment described herein, the Montgomery ALU 110 determines a first integer exponent k for which 2.sup.2.sup.
(45) Specifically, the Montgomery ALU 110 performs the following loop:
for (i=1, to i=k,i++); δ.sub.i=Montgomery(δ.sub.i-1,δ.sub.i-1); end
(46) Once the Montgomery ALU 110 has performed the k.sup.th Montgomery multiplication 308, as determined by termination condition logic 310, the Montgomery ALU 110 proceeds to Step 4, 312.
(47) It is noted that for the abovementioned loop, after iteration i, the intermediate result δ.sub.i=2.sup.2.sup.
(48) In step 4, 312, of method 300, the Montgomery ALU 110 determines an excess parameter, which is an amount by which the penultimate result differs from the component R.sup.2 mod n. In accordance with the first embodiment, the Montgomery ALU 110 determines an integer c for which R=2.sup.2.sup.
(49) In step 5, 314, of method 300, the Montgomery ALU 110 determines an adjustment parameter, where the adjustment parameter is a function of the inverse of the excess parameter, of the form:
adjustment parameter=2.sup.−cR mod n
(50) Then the Montgomery ALU 110 determines the component R.sup.2 mod n, by performing a Montgomery multiplication of the penultimate result δ.sub.k with the adjustment parameter, in the form:
target value=Montgomery (δ.sub.k,2.sup.−cR mod n)
Calculating c
(51) If R=2.sup.l>n>2.sup.l−1, then n>2.sup.l−1=2.sup.−1R, and therefore 2.sup.−cR mod n=2.sup.−cR for any c>0. If c>0 and 2.sup.−cR>n, then the reduction 2.sup.−cR mod n is directly computed by the Montgomery ALU 110.
(52) Alternatively, in one embodiment, the Montgomery ALU 110 calculates the reduction 2.sup.−cR mod n as an intermediate step in the computation of the reduction 2R mod n. In this case, 2.sup.−cR=2.sup.l−c>n, therefore, the Montgomery ALU 110 determines the integer d such that 2.sup.l−d>n>2.sup.l−d−1. The value d is greater than or equal to c, therefore 2R mod n may be calculated by noting that 2R=.sub.2.sup.d+1R′ where R′=2.sup.l−d, and 2.sup.−cR=2.sup.d−cR′.
(53) Accordingly, the Montgomery ALU 110 calculates the reduction of both 2.sup.d+1R′ mod n and 2.sup.d−c R′ mod n by first computing 2R′ mod n and then proceeding to compute 2(2R′ mod n) mod n and by induction after computing 2.sup.kR′ mod n proceeding to compute 2.sup.k+1R′ mod n=2(2.sup.k R′ mod n) mod n for all k≤d.
(54) If d is a small integer, the full sequence of 2.sup.kR′ mod n may be computed in low complexity. Use of an R which satisfies 2.sup.−dR>n may occur in an embodiment which uses one value of R for all the possible values of n. The value of d may be greater than zero but not much larger. The reduction 2.sup.−cR mod n may be stored in Register B 218. When the termination condition is set to y<R, an excess 2.sup.c for which 2.sup.cR>n, so the Montgomery ALU 110 calculates 2.sup.c R mod n.
(55) Signal Diagram
(56)
(57) The first operand is an output signal 210 of the multiplexer 204. The second operand is the other output signal 212 of the multiplexer 204. On the first clock cycle, both of the multiplexer outputs are 2R mod n. Accordingly, the first and the second operands are 2R mod n. For the following clock cycles, up to k, the multiplexer outputs are the values fed back from the Montgomery unit.
(58) On the second clock cycle, the output 222 of the Montgomery multiplication unit 220 is equal to 2.sup.2.sup.
(59) On the k.sup.th clock cycle, one of the multiplexer outputs 210 is the intermediate result 2.sup.2.sup.
(60) On the last clock cycle (k+1), the output 222 of the Montgomery multiplication unit 220 is the component R.sup.2 mod n.
(61) Performance
(62) Embodiments of the present disclosure may be applied to simplify the computation of R.sup.2 mod n in RSA computations through the use of Montgomery multiplication. In particular, the first embodiment, as described above, can reduce the latency of the calculation of R.sup.2 mod n for RSA compared to the method of direct modulo reduction.
(63) For small exponents e of s.sup.e mod n, which are typically used in RSA signature verification, the direct modulo reduction of R.sup.2 mod n takes approximately two-thirds of the computation time for the calculation of s.sup.e mod n. In contrast, the method proposed herein takes approximately only one-third of the computation time, as exemplified by the following performance figures.
(64)
(65) Considering
(66) In a hardware simulation, the clock cycle consumption for the total computation of s.sup.e mod n (i.e. the RSA core) was reduced from 240,300 cycles, in which R.sup.2 mod n was calculated using direct modulo reduction, to 145,000 cycles, in which R.sup.2 mod n was calculated via an embodiment of the method 300.
Further Advantages
(67) Advantageously, the cryptographic architecture of an embodiment of the present disclosure may be more efficiently utilised because the method 300 can use the same Montgomery multiplication units which are used by the RSA multiplication. Accordingly, there may be a reduced requirement for dedicated computational units, which may not be fully utilised after the computation of R.sup.2 mod n is complete. The inclusion of dedicated computation units for the calculation of a specific cryptographic value is often undesired due to increased implementation footprint, energy consumption and/or design complexity.
(68) As noted above, although converting to the Montgomery domain consumes computation time, this computation time can be an acceptable overhead when performing many multiplications in a row, as in modular exponentiation, as intermediate results can be left in Montgomery domain representation, and the initial and final conversions may become a negligible fraction of the overall computation.
(69) If an exponent e is large (e.g. e>2.sup.16), then computation within the Montgomery domain is likely to be the preferred method for computing s.sup.e mod n, even for implementations that calculate R.sup.2 mod n via direct modulo reduction. On the other hand, if an exponent e is smaller, it may be more efficient for an implementation to compute s.sup.e mod n directly, without converting to and from the Montgomery domain, since the calculation of R.sup.2 mod n via direct modulo reduction, for conversion to the Montgomery domain, consumes a significant portion of the computation cycles for computing s.sup.e mod n. Advantageously, however, the improved method of calculating of R.sup.2 mod n, as described herein, reduces the computation cycles for converting to the Montgomery domain, thus making computing s.sup.e mod n in the Montgomery domain an efficient option for a wider range of exponents.
(70) An embodiment of the improved method 300 for R.sup.2 mod n described herein may also provide an advantage in the situation where e changes over time.
(71) In a device which uses Montgomery multipliers for calculation of cryptographic expressions other than R.sup.2 mod n, embodiments of the present disclosure can utilise the existing Montgomery multipliers of the device. Accordingly, it may be worthwhile implementing embodiments of the present disclosure for even small exponents, to take advantage of the hardware optimisation provided by the method's utilisation of existing Montgomery multipliers.
(72) Additionally, embodiments of the present disclosure may be advantageous for implementations with limited storage, particularly implementations that cannot store R.sup.2 mod n for future use and have to compute it each time.
(73) It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the above-described method of calculating the component R.sup.2 mod n, without departing from the broad general scope of the present disclosure. Specifically, other embodiments may utilise an alternative method for determining the number of Montgomery multiplication iterations performed by the Montgomery ALU 110. Additionally, other embodiments may determine an alternative base value than the base value described above. Variations on the method 300, as described above, are disclosed in relation to a second embodiment.
Second Embodiment—Base of 2.SUP.b .R
(74) According to a second embodiment of the present disclosure, the computation of R.sup.2 mod n does not require the computation of the base value 2R mod n in all situations. Instead, in accordance with this second embodiment, the Montgomery ALU 110, in step 302, sets the base value for the first iteration to 2.sup.bR for an integer b>1, under certain conditions, as detailed below.
(75) In one example, the cryptographic processing unit comprises an architecture in which the memory register for storing the first and second operands of the Montgomery multiplication is 2.sup.k wide, for some integer k. (For example, the memory register is 2048 bits wide, as is common for RSA implementations, and k=11.) Accordingly, n<2.sup.2.sup.
(76) In step 1, 302, of method 300, the Montgomery ALU 110 determines the base value δ.sub.0 and stores the base value in Register A 216. In accordance with the second embodiment, the Montgomery ALU 110 determines an integer b such that 1≤b≤2.sup.k−l−1.
(77) The Montgomery ALU 110 sets the base value as δ=2.sup.b R=2.sup.b+l. There is no need to compute mod n since δ=2.sup.b R<2.sup.2.sup.
(78) Accordingly, in step 304, the first and second operands for the first iteration of Montgomery multiplication are set to δ.sub.0=2.sup.b R.
(79) In step 3, 306, of method 300, the Montgomery ALU 110 iteratively performs Montgomery multiplication operations on the first and second operands to produce an intermediate result at each iteration. At the end of each iteration, the first and second operands are set to the intermediate result calculated via the Montgomery multiplication operation.
(80) In accordance with the second embodiment described herein, the Montgomery ALU 110 determines the first integer i for which 2.sup.2.sup.
(81) At the exiting of the while loop, as determined by the termination condition logic 310, the resulting δ is the penultimate result. The Montgomery ALU 110 then proceeds to Step 4, 312.
(82) In step 4, 312, of method 300, the Montgomery ALU 110 determines an excess parameter, which is an amount by which the penultimate result differs from the component R.sup.2 mod n. In accordance with the second embodiment, the Montgomery ALU 110 determines an integer c for which R=2.sup.2.sup.
(83) In step 5, 314, of method 300, the Montgomery ALU 110 determines an adjustment parameter, where the adjustment parameter is a function of the inverse of the excess parameter, of the form:
adjustment parameter=2.sup.−cR
(84) Then the Montgomery ALU 110 determines the value of component R.sup.2 mod n, by performing a Montgomery multiplication of the penultimate result with the adjustment parameter, in the form:
target value=Montgomery (δ,2.sup.−cR)
(85) Advantageously, this second embodiment may reduce the number of Montgomery multiplications required to compute R.sup.2 mod n, compared to the first embodiment, described above.
(86) Note that for b=1, this variant coincides with the first variant. For b=2, the second embodiment saves one Montgomery multiplication compared to the first embodiment. For b=4, the second embodiment saves two Montgomery multiplications compared to the first embodiment. In general, for b=2.sup.2.sup.
Third Embodiment—Two Stage Calculation of Conversion Component
(87) According to a third embodiment, the conversion component, otherwise known as the cryptographic component, is calculated via a two-stage process. The two-stage process comprises a first stage, comprising the calculation of a base value via an iterative application of a shift and subtract operation, and a second stage comprising the calculation of the conversion component via an iterative application of Montgomery multiplication of identical operands, starting with the base value. In some cases, as detailed below, the third embodiment may further comprise determining and applying an adjustment parameter to the result of the iteratively applied Montgomery multiplications.
(88) Advantageously, the third embodiment may reduce the processing time consumed in the computation of the conversion component R.sup.2 mod n, compared to the first and second embodiments described above.
(89) Shift and Subtract Operation
(90) According to the third embodiment, the Montgomery ALU 110 applies an operation referred to as a shift and subtract operation. In general terms, for an integer a, the value 2.sup.2a R mod n may be computed from 2.sup.a R mod n via at least two different methods. According to a first method, a processor performs one Montgomery multiplication of two identical operands, 2.sup.a R mod n, to calculate 2.sup.2a R mod n. According to a second method, a processor performs a operations, where the first operation computes 2.sup.a+1 R mod n by multiplying 2.sup.a R mod n by 2 (which may be implemented by a single bit shift to the left). If the result of shift is greater or equal to n, then the processor subtracts n to obtain 2.sup.a+1 R mod n. The processor continues the shift and subtract operation for a total of a iterations. In each iteration, the processor computes 2.sup.a+i+1 R mod n by shifting 2.sup.a+1 R mod n one bit to the left, and subtracting n if the shift resulted in a number greater than or equal to n.
(91) Conducting the shift and subtract operation on a start value comprises calculating a shifted start value, which is equivalent to the value of the start value multiplied by two, and subtracting n if the shifted start value is greater than or equal to n. The method of calculating the shifted start value may depend on the format in which the start value is represented, including endianness, and the logical units implemented in an embodiment. According to one embodiment, calculating the shifted start value, which is equivalent to the operand 2.sup.a R mod n multiplied by two, is implemented by a single bit shift of the start value to the left, wherein the most significant bit is located in the left most bit and the least significant bit is located in the right most bit. In another embodiment, in which the most significant bit is located in the right most bit, calculating the shifted start value is implemented as a single bit shift to the right.
(92) In another embodiment, calculating the shifted start value is implemented by connecting the bits of a RegisterA containing 2.sup.a R mod n to the bits of a RegisterB configured to contain 2.sup.a+1 R mod n, such that RegisterA(bit i) is connected to RegisterB(bit i+1). Alternative architectures may be implemented to determine the shifted start value, to effect the shift component of the shift and subtract operation.
(93) Calculating the Base Value
(94) According to the third embodiment, the Montgomery ALU 110 uses shift and subtract operations to calculate a base value 2.sup.b R mod n. More particularly, the Montgomery ALU 110 computes 2.sup.b R mod n, for an integer b, by performing b iterations of the shift and subtract operation, starting at start value 2R mod n.
(95) The Montgomery ALU 110 then computes 2.sup.2b R mod n, 2.sup.2.sup.
(96) Accordingly, the Montgomery ALU 110 computes 2.sup.2.sup.
(97) The Values of b and k
(98) The value of b may be set in accordance with the value of k, such that the application of b shift and subtract operations, followed by k Montgomery multiplications, produces an intermediate result which is either equal to the conversion component, or can be adjusted to be equal to the conversion component by being Montgomery multiplied with an adjustment parameter indicative of a difference between the intermediate result and the cryptographic component.
(99) An intermediate result which can be adjusted to be equal to the conversion component 2.sup.2.sup.
(100) Montgomery ALU Subsection for Third Embodiment
(101)
(102) Control logic block 602 provides control signals 604, 606 and 608 to control the function of other blocks within the subsection 600. Control logic block 602 also provides data values, via signals 612 and 614, to be stored in register 616 and register 618, respectively.
(103) Control logic 602 sets register 618, and register 618 stores the value R mod n. Shift and subtract block 622 performs the two-step shift and subtract operation. This operation comprises a one bit left shift, then a conditional subtraction of n, if the left shifted value is greater than or equal to n. The result of the two-step operation performed by shift and subtract block 622 is output on data signal 624.
(104) In accordance with the third embodiment, the shift and subtract operation is performed b times to produce an operand to be used by the Montgomery multiplier 638. The control logic 602 controls the selection signal 604 of the multiplexer 618 so that the output 624 of the shift and subtract block 622 is routed through the multiplexer 620 to the input of the shift and subtract block 622. Register A 626 is configured to store the output 624 of the shift and subtract block 622.
(105) In one embodiment, the control logic block 602 of the Montgomery ALU 110 determines the value of b. The control logic block 602 uses the value of b to control the function of the multiplexer 620 via control signal 604. Accordingly, the input signal to the shift and subtract block 622 is initialised to R mod n from register 618, and for subsequent iterations of the shift and subtract operation, the input signal to the shift and subtract block 622 is set to the shift and subtract output signal 624.
(106) In one embodiment, the control logic block 602 of the Montgomery ALU 110 determines the value of k. The control logic block 602 uses the value of k to control the function of the multiplexer 630 via control signal 608. Multiplexer 630 is a 3:2 multiplexer which maps three input signals, 628, 630 and 640, to two output signals, 634 and 636, in accordance with the value of the selection signal 608. The selection signal 608 is set by the control logic block 602 and indicates the mapping of one of the three input signals, 628, 630 and 640, to output signal 634. The selection signal 608 also indicates the mapping of one of the three input signals, 628, 630 and 640, to output signal 636. Input signal 628 is set by Register A 626. Input signal 632 is set by Register B 616.
(107) Montgomery multiplication block 638 operates to perform Montgomery multiplication on a first operand and a second operand. The first operand is provided, via multiplexer 630, on data signal 634, and the second operand is provided, via multiplexer 630, on data signal 636. Functional block 638 outputs the result of Montgomery multiplication of the first operand and the second operand on output signal 640. Register B 616 stores an adjustment parameter, 2.sup.−cR mod n, which may be Montgomery multiplied with the output of the Montgomery multiplication block 638 on signal 640.
(108) Control logic block 602, shift and subtract logic block 622, multiplexer 620, multiplexer 630 and Montgomery multiplication unit 638 are all clocked by clock signal 120.
(109) The Montgomery ALU 110 provides control signals and parameters to the control logic block 602 of the subsection 600, via signal 642. Parameters can comprise the values of R and n. The Montgomery ALU 110 provides the component R.sup.2 mod n, as output from the Montgomery multiplication block 638 on data signal 640, to logic units within the cryptographic processing unit 102. The component R.sup.2 mod n may then be used in cryptographic functions performed by the cryptographic processing unit 102.
Method for Third Embodiment
(110)
(111) In step 702, of method 700, the Montgomery ALU 110 determines the values of b and k. In one embodiment, the Montgomery ALU 110 determines the values of b and k by accessing parameters provided by the control logic unit 104. In one embodiment, the Montgomery ALU 110 determines the values of b and k by calculating the values of b and k. Determining the values of b and k is described in further depth below.
(112) In step 704, subsection 600 of the Montgomery ALU 110 performs b iterations of the shift and subtract operation, starting from R mod n, to calculate the base value δ.sub.0=2.sup.b R mod n. Accordingly, after step 704, the operands for the first iteration of Montgomery multiplication, as provided on signals 634 and 636, are set to δ.sub.0=2.sup.b R mod n.
(113) In step 706, of method 700, the Montgomery ALU 110 sets a Montgomery multiplication operand to be equal to the base value δ.sub.0, which was determined in step 704.
(114) The Montgomery ALU 110 performs k iterations of the Montgomery multiplication operation on the multiplication operand, which is provided on both signals 634 and 636. Accordingly, the multiplication operand is Montgomery multiplied with itself. An intermediate result is provided on signal 640 after each Montgomery multiplication operation. At the end of each iteration, the multiplication operand provided on signals 634 and 636 is set to the intermediate result, on signal 640, as calculated via the Montgomery multiplication operation.
(115) After performing k iterations of the Montgomery multiplication operation, the result on signal 640 is the penultimate result.
(116) If the values of b, k and R are in the form R=2.sup.l, where l is in the form l=b2.sup.k, then the penultimate result will be equal to the conversion component R.sup.2 mod n. Accordingly, in this situation there is no need to apply an adjustment parameter to the penultimate result. In decision 708, the Montgomery ALU 110 determines whether the values of b, k and R are in this form. If the values of b, k and R are in this form, the Montgomery ALU 110 does not perform step 710, and the target value, being the conversion component, is set to be the penultimate result.
(117) In step 710, the Montgomery ALU 110 determines an excess parameter, which is an amount by which the penultimate result differs from the component R.sup.2 mod n. In accordance with the second embodiment, the Montgomery ALU 110 determines an integer c for which R=2.sup.2.sup.
(118) Further, in step 710, the Montgomery ALU 110 determines an adjustment parameter, where the adjustment parameter is a function of the inverse of the excess parameter, of the form:
adjustment parameter=2.sup.−cR
(119) Then the Montgomery ALU 110 determines the value of component R.sup.2 mod n, by performing a Montgomery multiplication of the penultimate result with the adjustment parameter, in the form:
target value=Montgomery (δ,2.sup.−cR)
Choosing b and k
(120) As noted above, the values of b and k are correlative, meaning that there is a relationship between the value of b and value of k. The values of b and k may be configured to satisfy requirements or limitations of the Montgomery ALU 110. More specifically, the value of b may be set in conjunction with setting the value of k, such that the application of b shift and subtract operations, followed by k Montgomery multiplications produces an intermediate result which is either equal to the conversion component, or can be adjusted to be equal to the conversion component by being Montgomery multiplied with an adjustment parameter. An intermediate result which can be adjusted to be equal to the conversion component 2.sup.2.sup.
(121) In other words, the values of b and k may be selected by noting that, for each choice of b, the required number of operations to obtain 2.sup.2.sup.
(122)
(123) A Montgomery ALU may be configured with static values of either or both of b and k. Alternatively, control logic block 104 may dynamically configure a Montgomery ALU with values of b and k, or may dynamically set the values of b and k for a calculation of the conversion component. The values of b and k may be adjusted for the calculation of different conversion components. For example, the value of b may be adjusted upwards to increase the number of shift and subtract operations performed by the Montgomery ALU, and to decrease the number of Montgomery multiplications performed by the Montgomery ALU. Conversely, the value of b may be adjusted downwards to decrease the number of shift and subtract operations performed by the Montgomery ALU, and to increase the number of Montgomery multiplications performed by the Montgomery ALU.
(124) Accordingly an embodiment of the Montgomery ALU may be configured to set the values of b and k, or may be configured to calculate preferred values of b and k according to the requirements of the cryptographic processing unit 102. Accordingly, the values of b and k may be hardcoded, dynamically selected, or dynamically calculated.
(125) The selection of preferred values of b and k may depend on the architecture of the subsection 600 of the Montgomery ALU 110. Alternatively or additionally, the selection of the values of b and k may depend on the operational cost of performing the shift and subtract operation, and the operational cost of performing the Montgomery multiplication function, where operational cost may comprise the processing time consumed per operation, the number of clock cycles consumed per operation, or the power consumed per operation.
(126) In one embodiment, the Montgomery ALU 110 chooses the value of b as the first integer for which the Montgomery multiplication operation consumes less processing time than b iterations of the shift and subtract operation. Processing time may be measured in clock cycles, machine cycles, or units of time.
(127) In another embodiment, the Montgomery ALU 110 sets values b and k to avoid the need to apply an adjustment parameter in step 710. More particularly, R is in the form R=2.sup.l, where l is in the form l=b2.sup.k l, where b is an odd integer. Using this factorization, the Montgomery ALU 110 calculates the conversion component R.sup.2 mod n by b shift and subtract operations and k Montgomery multiplications, with no need to apply an adjustment parameter based on the excess parameter 2.sup.c, in step 710.
(128) For example, considering the situation in which R=2.sup.2048=2.sup.2.sup.
(129) According to another embodiment, the values of b and k may be determined empirically by calculating, for different values of b, the minimal value k, such that b.Math.2.sup.k≥2.sup.1. For each value of b, the Montgomery ALU 110 calculates the operational cost of computing b shift and subtract operations followed by k Montgomery multiplications and a further Montgomery multiplication to apply the adjustment parameter to reduce the intermediate result to the value of the conversion component. Accordingly, a value of b may be chosen to minimize the cost (in terms of processing time, power consumption, clock cycles or machine cycles) of b shift and subtract operations and k+1 Montgomery multiplications. In one embodiment, the Montgomery ALU sets the value of b based on a calculation of the processing time and/or power that would be consumed in the calculation of the conversion component for that value of b.
(130) According to another embodiment, the Montgomery ALU 110 calculates the maximal value k, such that b.Math.2.sup.k≤2.sup.1 for many values of b. The Montgomery ALU 110 calculates R.sup.2 mod n for each value of b by b shift and subtract operations and k Montgomery multiplications, which provides a value of 2.sup.b.Math.2.sup.
(131) Performance
(132)
(133) Considering
(134) Considering also
(135) Furthermore, the number of clock cycles consumed (107,257) to calculate the conversion component for key size of 2048 bits, via the method of the third embodiment with b=1024 is higher than the number of clock cycles consumed (52,728) to calculate the conversion component for key size 2048 bits via the method of the first embodiment (referring to
(136) It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the above-described embodiments, without departing from the broad general scope of the present disclosure. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.
(137) It will be appreciated by persons skilled in the art that the present invention is not limited to what has been particularly shown and described herein. Rather, the scope of the present invention is defined only by the claims that follow.