METHOD OF DEFENSE AGAINST CRYPTOSYSTEM TIMING ATTACK, ASSOCIATED CRYPTOSYSTEM PROCESSING CIRCUIT, AND ASSOCIATED ELECTRONIC DEVICE
20230261851 · 2023-08-17
Assignee
Inventors
Cpc classification
H04L9/3066
ELECTRICITY
H04L9/003
ELECTRICITY
H04L9/302
ELECTRICITY
International classification
H04L9/00
ELECTRICITY
Abstract
A method of defense against cryptosystem timing attack such as Rivest-Shamir-Adleman (RSA) cryptosystem timing attack, an associated cryptosystem processing circuit and an associated electronic device are provided. The method may include: utilizing a point double calculation circuit to perform a plurality of point double calculation operations related to a predetermined cryptosystem; utilizing a point add calculation circuit to perform a plurality of point add calculation operations related to the predetermined cryptosystem; and in response to there being no need to perform any point add calculation operation related to the predetermined cryptosystem, utilizing a dummy point add calculation circuit to perform a dummy point add calculation operation to emulate a calculation time of performing the any point add calculation operation, without changing a calculation result before performing the dummy point add calculation operation.
Claims
1. A method of defense against cryptosystem timing attack, comprising: utilizing a point double calculation circuit to perform a plurality of point double calculation operations related to a Rivest-Shamir-Adleman (RSA) cryptosystem; utilizing a point add calculation circuit to perform a plurality of point add calculation operations related to the RSA cryptosystem; and in response to there being no need to perform any point add calculation operation related to the RSA cryptosystem, utilizing a dummy point add calculation circuit to perform a dummy point add calculation operation to emulate a calculation time of performing the any point add calculation operation, without changing a calculation result before performing the dummy point add calculation operation.
2. The method of claim 1, wherein for any point A that is an integer on a number line, a double calculation operation among the plurality of point double calculation operations represents calculation of A.sup.2 mod N, wherein the N represents modulus, and the symbol “mod” represents a modulo operation.
3. The method of claim 1, wherein for any two points A and B that are integers on a number line, a point add calculation operation among the plurality of point add calculation operations represents calculation of A×B mod N, wherein the N represents modulus, the symbol “mod” represents a modulo operation, and the symbol “×” represents a multiplication operation.
4. The method of claim 1, wherein the dummy point add calculation operation comprises emulating a point add calculation operation through parameter substitution.
5. The method of claim 4, wherein the parameter substitution comprises multiplying a term in a process of Montgomery reduction by (R−N), wherein the N represents the modulus, and the R is a Montgomery multiplier.
6. The method of claim 5, wherein regarding calculation of Y=X.sup.e mod N, during calculating an intermediate value y=X.sup.MID, in response to a target bit being 1, the dummy point add calculation operation replaces calculation of y×X mod N with calculation of y×(R−N)×R.sup.−1 mod N to emulate the point add calculation operation, wherein the symbol “X” represents plain text, the symbol “Y” represents cryptograph, the symbol “e” represents a key, the symbol “MID” represents a positive integer corresponding to the intermediate value, the symbol “mod” represents a modulo operation, and the symbol “×” represents a multiplication operation.
7. The method of claim 1, wherein the dummy point add calculation operation comprises performing dummy write on a buffer through address swap to emulate a point add calculation operation through parameter substitution.
8. A cryptosystem processing circuit that operates according to the method of claim 1, the cryptosystem processing circuit comprising: a core circuit, arranged to control a plurality of cryptosystem processing operations that are related to the RSA cryptosystem in the cryptosystem processing circuit, wherein the plurality of cryptosystem processing operations comprise the plurality of point double calculation operations and the plurality of point add calculation operations; the point double calculation circuit, coupled to the core circuit, arranged to perform the plurality of point double calculation operations; the point add calculation circuit, coupled to the core circuit, arranged to perform the plurality of point add calculation operations; and the dummy point add calculation circuit, coupled to the core circuit, arranged to perform the dummy point add calculation operation.
9. An electronic device comprising the cryptosystem processing circuit of claim 8, further comprising: a processor, arranged to control operations of the electronic device; a memory, arranged to temporarily store information for the electronic device; and a communications interface circuit, arranged to perform communications operations for the electronic device.
10. An electronic device comprising the cryptosystem processing circuit of claim 8, wherein the electronic device comprises: a processor, arranged to control operations of the electronic device, wherein the core circuit is implemented by the processor; wherein the electronic device further comprises: a memory, arranged to temporarily store information for the electronic device; and a communications interface circuit, arranged to perform communications operations for the electronic device.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0010]
[0011]
[0012]
[0013]
[0014]
DETAILED DESCRIPTION
[0015]
[0016] According to this embodiment, the processor 110 may be arranged to control operations of the electronic device 100. Under control of the processor 110, the memory 120 may be arranged to temporarily store information for the electronic device 100 (e.g., for being used as a buffer), and the communications interface circuit 130 may perform communications operations for the electronic device 100, and more particularly, may be coupled to an external electronic device to communicate with the external electronic device (labeled as “To external electronic device” for brevity). In addition, the cryptosystem engine circuit 140 may provide the cryptosystem processing function, to make the electronic device 100 protect important data such as the user data according to a predetermined cryptosystem. Additionally, the storage module 150 may be arranged to store information, and more particularly, store the important data such as the user data.
[0017] For better comprehension, the electronic device 100 may represent a storage device (e.g., a universal serial bus (USB) flash drive or a solid state drive (SSD)), and the external electronic device may represent a control device that utilizes the electronic device 100 to store the user data (e.g., a desktop computer or a laptop computer), wherein the storage module 150 may comprise a storage medium that is arranged to store the user data (e.g., a flash memory), but the present invention is not limited thereto. In some embodiments, the type of the electronic device 100 and/or the architecture shown in
[0018] In the embodiment shown in
[0019] According to some embodiments, the predetermined cryptosystem may be an RSA cryptosystem and the cryptosystem engine circuit 140 may be an RSA cryptosystem engine circuit, but the present invention is not limited thereto. In some embodiments, the predetermined cryptosystem may be an elliptic curve cryptosystem (ECC), and the cryptosystem engine circuit 140 may be an ECC engine circuit.
[0020]
[0021] For better comprehension, the hardware structure of the cryptosystem processing circuit 200 shown in
[0022]
[0023] In Step S11, the cryptosystem processing circuit 200 (e.g., the core circuit 210) may utilize the point double calculation circuit 220 to perform the plurality of point double calculation operations related to the predetermined cryptosystem. For example, in a situation where the plurality of cryptosystem processing operations of the cryptosystem processing circuit 200 are a plurality of RSA cryptosystem processing operations related to the RSA cryptosystem, the plurality of point double calculation operations may represent a plurality of RSA point double calculation operations among the plurality of RSA cryptosystem processing operations. In some embodiments, for any point A that is an integer on a number line, a double calculation operation among the plurality of point double calculation operations may represent the calculation of A.sup.2 mod N. For example, for the RSA cryptosystem, A may be a number in a certain number field, and may also be referred to as a point in this number field.
[0024] In Step S12, the cryptosystem processing circuit 200 (e.g., the core circuit 210) may utilize the point add calculation circuit 230 to perform the plurality of point add calculation operations related to predetermined cryptosystem. For example, in a situation where the plurality of cryptosystem processing operations of the cryptosystem processing circuit 200 are the plurality of RSA cryptosystem processing operations related to the RSA cryptosystem, the plurality of point add calculation operations may represent a plurality of RSA point add calculation operations among the plurality of RSA cryptosystem processing operations. In some embodiments, for any two points A and B that are integers on the number line, a point add calculation operation among the plurality of point add calculation operations may represent the calculation of A×B mod N.
[0025] In Step S13, in response to there being no need to perform any point add calculation operation related to the predetermined cryptosystem, such as any RSA point add calculation operation related to the RSA cryptosystem, the cryptosystem processing circuit 200 (e.g., the core circuit 210) may utilize the dummy point add calculation circuit 240 to perform a dummy point add calculation operation to emulate a calculation time of performing the any point add calculation operation (e.g., the any RSA point add calculation operation), without changing a calculation result (e.g., an RSA calculation result) before performing the dummy point add calculation operation.
[0026] For better comprehension, the method may be illustrated with the working flow shown in
[0027] Regarding the plurality of cryptosystem processing operations, some implementation details are further described as follows. During decryption, the cryptosystem processing circuit 200 may need to perform the following operations multiple times:
A×B mod N;
where the N represents modulus, the symbol “mod” represents a modulo operation, the symbol “×” represents a multiplication operation, A, B and N can be integers, and A<N and B<N. In order to accelerate the calculation speed, the cryptosystem processing circuit 200 can perform conversion on associated parameters to perform calculation in a Montgomery field, and more particularly, can perform multiplication and Montgomery reduction simultaneously. The calculation as described above can be the calculation of any Montgomery step among a plurality of Montgomery steps of the Montgomery reduction (REDC) algorithm. Additionally, the normal field may be a prime field.
[0028] In some embodiments, the cryptosystem processing circuit 200 may calculate T=A×B×R.sup.−1 mod N. A certain parameter calculated in the Montgomery field may be set as a corresponding parameter calculated in the normal field that is multiplied by R. Therefore, returning to the normal field, the cryptosystem processing circuit 200 may calculate as follows:
T×R.sup.2 modN=((A×B×R.sup.−1)×R.sup.2)×R.sup.−1 modN=A×BmodN;
where the above-mentioned point double calculation operations may correspond to A=B, and the above-mentioned point add calculation operations may correspond to A≠B.
[0029] For better comprehension, the base r may be equal to any value among a set of predetermined values such as 2.sup.16, 2.sup.32, 2.sup.64, etc., K may represent the RSA width (measured in bits), and the respective bit widths of A, B, and N may be less than or equal to (K−1). For example:
r=2.sup.64;
K=512; and
K′=(512/64);
[0030] where K′ may represent the number of times the loop is executed when calculating in unit of the bit width of r. In addition, R=r.sup.K′, which can be expressed with the base r, and N<R. The notation “Σ.sub.i=0.sup.K−1” represents the summation with the index i changing from 0 to (K−1) (e.g., i=0, 1, . . . or (K−1)), and N can be expressed as follows:
N=Σ.sub.i=0.sup.K−1(n.sub.i×r.sup.i),n.sub.i belongs to{0,1, . . . ,64′hFFFFFFFFFFFFFFF};
where a value starting with the symbol “h” represents the value in hexadecimal format, and the symbol “64′h” represents that this value has 64 bits, so 64′hFFFFFFFFFFFFFFFF can be regarded as a value of 64 bits with each bit being equal to 1.
[0031] The commonly used bit width (e.g., the RSA width K) of the RSA cryptosystem is typically more regular, in particular, can be any value among 512 (e.g., R=2.sup.512), 1024 (e.g., R=2.sup.1024), 1536 (e.g., R=2.sup.1536), 2048 (e.g., R=2.sup.2048), 3072 (e.g., R=2.sup.3072), 4096 (e.g., R=2.sup.4096), etc. For ECC, R can be a power of 2 other than (N+1) bits (i.e., R≠2.sup.(N+1)), but is not limited thereto.
[0032] In addition, A, B and T can be expressed as follows:
A=Σ.sub.i=0.sup.K−1(a.sub.i×r.sup.i);
B=Σ.sub.i=0.sup.K−1(b.sub.i×r.sup.i);
T=A×B=A×Σ.sub.i=0.sup.K−1(b.sub.i×r.sup.i)=Σ.sub.i=0.sup.K−1(A×b.sub.i×r.sup.i);
where T.sub.i=(A×b.sub.i×r.sup.i). A and B can be very large numbers, and T=A×B can be an even larger number having many bits. For better comprehension, the calculation related to T can be expressed with pseudo-code as follows:
TABLE-US-00001 for(int i = 0; i <= (K′ − 1) ; i++) { //K′ = RSA_WIDTH/(r_width); T += b[i] * A; m = T.sub.L0 * (r − N.sub.L0).sup.−1 mod r; T += m * N; T /= r; if(T > N) return (T − N), else return T;
In the above pseudo-code, the symbol “*” may represent a multiplication operation and may be regarded as equivalent to the symbol “×”, the symbol “+=” may represent addition assignment, and the symbol “/=” may represent division assignment, b[i] may represent the above-mentioned and the text starting with the symbol “II” until the end of the same line represents the corresponding comment, where the comment “//K′=RSA_WIDTH/(r_width);” indicates that K′ is equal to the RSA width K (e.g., RSA_WIDTH) divided by the bit width of r (e.g., r_width). For example, when r=2.sup.64, T.sub.L0 may represent the value of the lowest 64 bits (e.g., the 63.sup.rd bit to the 0.sup.th bit) of T, and N.sub.L0 may represent the value of the lowest 64 bits (e.g., the 63.sup.rd bit to the 0.sup.th bit) of N. In addition, “if (T>N) return (T−N)” in the last line of the above pseudo-code may correspond to the calculation of (T mod N). In order to simplify the calculation, the modulo operation can be directly replaced with the subtraction in a situation where the calculation conforms to the Montgomery reduction algorithm, so it is preferably required that T<2N (rather than A<N and B<N which are generally required).
[0033] After each Montgomery step, the calculation result of (A×B×R.sup.−1 mod N), rather than (A×B mod N), can be obtained, where R is the Montgomery multiplier. For ease of operation and implementation, R can be set as a value of a power of 2, such as 2.sup.K (i.e., R=2.sup.K), where the RSA width K can also be referred to as the RSA bit width. Taking 512-bit RSA as an example, R=2.sup.512, that is, the 512.sup.th bit of R is 1, and the other bits such as the 511.sup.th bit to the 0.sup.th bit are 0. In addition, calculation of (X.sup.e mod N) is needed in the RSA cryptosystem, where the symbol “e” may represent a key, such as any of a private key and a public key.
[0034] The binary exponentiation algorithm or the modular exponentiation algorithm used in RSA scalar multiplication can be helpful on increasing the calculation speed, and more particularly, can be used for calculating X.sup.e mod N. Given that X, e and N are integers, and e≥0 and 0≤X<N, and (X.sup.e mod N) is required to be calculated. For example:
e=205=(11001101).sub.2=2.sup.7+2.sup.6+2.sup.3+2.sup.2+2.sup.0;
wherein the symbol “( ).sub.2” may represent a binary value. In the above-mentioned calculation, only 4 more modular multiplication operations are required to generate (X.sup.205 mod N):
X.sup.205=(X{circumflex over ( )}(2.sup.7)×X{circumflex over ( )}(2.sup.6)×X{circumflex over ( )}(2.sup.3)×X{circumflex over ( )}(2.sup.2)×X{circumflex over ( )}(2.sup.0))mod N;
for better comprehension, the above calculation may be expressed with pseudo-code as follows:
TABLE-US-00002 Integer fastExp2(Integer X, Integer e, Integer N) if(e == 0) return 1; f = X; for(i = (g − 1), (g − 2), ..., 0) if(β.sub.i == 0) f = f.sup.2 mod N; else f = f.sup.2X mod N; return f;
in the above-listed pseudo-code, the symbol “g” may represent the length of the binary representation {(β.sub.g-1, β.sub.g-2, . . . , β.sub.0} of e, the symbol “β.sub.i” may represent the value of any bit in the binary representation {β.sub.g-1, β.sub.g-2, . . . , β.sub.0} of e, and the symbol “f” may represent an intermediate value during the calculation. For example:
e=205={β.sub.7,β.sub.6,β.sub.5,β.sub.4,β.sub.3,β.sub.2,β.sub.1,β.sub.0};
wherein g=8 (i.e., the length of (11001101).sub.2), β.sub.0=β.sub.2=β.sub.3=β.sub.6=β.sub.7=1′b1 (i.e., single bit (1).sub.2), and β.sub.1=β.sub.4=β.sub.5=1′b0 (i.e., single bit (0).sub.2).
[0035] It is assumed that the method of scanning from the Most Significant Bit (MSB) to the Least Significant Bit (LSB) is utilized, and X.sup.205 may be expressed as follows:
X{circumflex over ( )}205=(((((((X{circumflex over ( )}2×X){circumflex over ( )}2){circumflex over ( )}2){circumflex over ( )}2)×X){circumflex over ( )}2×X){circumflex over ( )}2){circumflex over ( )}2×X;
in the above equation, “{circumflex over ( )}2” (i.e., the power of 2) may be taken as an example of the aforementioned point double calculation operation, and “×X” may be taken as an example of the aforementioned point add calculation operation. The above equation may be rewritten as:
X{circumflex over ( )}205=X{circumflex over ( )}(2{circumflex over ( )}7)×X{circumflex over ( )}(2{circumflex over ( )}6)×X{circumflex over ( )}(2{circumflex over ( )}3)×X{circumflex over ( )}(2{circumflex over ( )}2)×X; or
X{circumflex over ( )}205=X{circumflex over ( )}(2{circumflex over ( )}7+2{circumflex over ( )}6+2{circumflex over ( )}3+2{circumflex over ( )}2+2{circumflex over ( )}0).
It is noted that, the point double calculation operation is performed every time a scanning step of 1 bit is moved. When the 6.sup.th bit is scanned after the 7.sup.th bit, the cryptosystem processing circuit 200 may detect that the 6.sup.th bit is equal to 1, and may first perform the point double calculation operation (e.g., a square operation) and then perform the point add calculation operation; when the 5.sup.th bit is scanned after the 6.sup.th bit, the cryptosystem processing circuit 200 may detect that the 5.sup.th bit is equal to 0, and may only perform the point double calculation operation (e.g., a square operation); and the rest can be deduced by analogy. The e may be referred to as a private key. Since the private key is typically an exponent in the associated calculation (e.g., the calculation of (X.sup.e mod N)), if a person can know the calculation currently being performed is the point double calculation operation or the point add calculation operation according to the power difference, the person can easily guess the private key. The respective lengths of time to perform the respective calculation for respective points are different from each other. For example, the calculation of the point corresponding to bit 1 may comprise the point double calculation and the point add calculation, and the calculation of the point corresponding to bit 0 may only comprise the point double calculation (without any point add calculation), so executing the former can take longer time than executing the latter.
[0036] Therefore, the cryptosystem processing circuit 200 can be designed to maintain a constant length of time for performing calculation of the point corresponding to any bit (e.g., bit 0 or bit 1), in particular, regardless of whether the current bit is 0 or 1, always perform the point double operation (e.g., one of the plurality of point double operations) and then perform the point add operation (e.g., one of the plurality of point add operations, or the dummy point add operation), making it be impossible for any attacker to crack the private key simply according to the time difference. Based on the method shown in
[0037] Assume that the symbol “X” represents plain text such as a message before encryption, and the symbol “Y” represents the encrypted cryptograph. For example, regarding the calculation of Y=X.sup.e mod N, the cryptosystem processing circuit 200 can perform the dummy point add calculation operation as follows:
X′=X×R mod N=X×R.sup.2×(R−1)mod N;
where R.sup.2 mod N can be pre-calculated. Regarding the calculation of Y=X.sup.e mod N, during the calculation of an intermediate value y=X.sup.MID (e.g., the symbol “MID” represents a positive integer corresponding to the intermediate value y), if a target bit such as the next bit is 1, then the cryptosystem processing circuit 200 can perform the real point add calculation such as y×X mod N, otherwise (this means that the next bit is 0), the cryptosystem processing circuit 200 can perform the dummy point add calculation as follows:
y×(R−N)×R.sup.−1 mod N=y×R.sup.−1 mod N=y mod N;
(R−N)mod N=R mod N+N mod N=R mod N;
R×R.sup.−1 mod N=1;
but the present invention is not limited thereto.
[0038] Taking N having 512 bits as an example, R=2.sup.512, which is 1 bit more than N. When the calculation is based on a 512-bit range, then for unsigned subtraction, R−N=|0−N|, for example, R={1′b1, 512′h0} (that is, R 512.sup.th bit of R is 1, and the other bits such as 511.sup.th bit to 0.sup.th bit are 0), and N=512′h{X4, X3, X2, X1} (which is not equal to 0), for example:
TABLE-US-00003 N ={{64′h..., 64′h... }, {64′hAAAABBBB, 64′hCCCCDDDD}, {64′h..., 64′h... }, {64′h11112222, 64′h33334443}}; but the present invention is not limited thereto.
[0039] As can be seen from the above example, this calculation does not affect the actual result, which means that this calculation can achieve the same calculation result as that without performing any point add calculation. According to an embodiment, the cryptosystem processing circuit 200 may pre-calculate and store (R−N) for subsequent direct use. According to another embodiment, in a situation where (R−N) does not need to be pre-calculated, although it may be needed to increase a logic gate count, the cryptosystem processing circuit 200 may be equipped with a corresponding subtraction calculation unit for processing the calculation of (R−N) at any time, so implementing the cryptosystem processing circuit 200 based on the method does not significantly increase too many logic gates, where these logic gates may be a certain number of multiplexers (MUXs). The cryptosystem processing circuit 200 may subtract a current bit range of N with 0 while reading a bit range of N, for example, in the case of calculating once every 64 bits, the cryptosystem processing circuit 200 may read a bit range BIT[63:0], and read the next bit range BIT[127:64] next time, and the rest can be deduced by analogy. Given that in R, only the MSB is 1 and the remaining bits are 0, assuming that we use a 512-bit space to perform calculation (taking 512-bit RSA as an example), during calculating (R−N), the MSB is 1 and there is no need to calculate, sign bit is ignored, and unsigned subtraction is used, so just use (0−N) to represent (R−N) to get exactly the same result as that of (R−N). According to some embodiments, when the power consumption of the above-listed operations is slightly increased, the cryptosystem processing circuit 200 may also use the subtraction unit to perform dummy subtraction operations during performing the real point add calculation (e.g., one of the plurality of point add calculation operations), to make the power consumption be similar to that of the dummy point add calculation.
[0040]
[0041] As shown in the upper half of
[0042] As shown in the lower half of
[0043] According to some embodiments, the cryptosystem processing circuit 200 may actually perform the calculation of y×X×R.sup.−1 mod N, but not store the final result into any buffer or any memory, wherein in comparison with the real point add calculation, the power consumption may become slightly lower, but the present invention is not limited to this. For example, the cryptosystem processing circuit 200 may actually perform the calculation of y×X×R.sup.−1 mod N, and perform an address swap operation on the buffer, making the power consumption be closer to that of the power consumption of the real point add calculation. For brevity, similar descriptions for these embodiments are not repeated in detail here.
[0044]
[0045] Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.