SECURED PERFORMANCE OF A CRYPTOGRAPHIC PROCESS
20230082339 · 2023-03-16
Inventors
Cpc classification
H04L9/003
ELECTRICITY
International classification
Abstract
A method of performing a cryptographic process in a secured manner, wherein the cryptographic process generates output data based on input data, the generating of the output data involving generating a value y based on an amount of data x, the value y representing a combination, according to a linear transformation L, of respective outputs from a plurality of S-boxes S.sub.n (n=0, . . . , N−1) for integer N>1, wherein each S-box S.sub.n (n=0, . . . , N−1) implements a respective function H.sub.n that is either (a) the composition of a respective first function F.sub.n and a respective linear or affine second function G.sub.n so that H.sub.n=G.sub.n∘F.sub.n, or (b) the composition of a respective first function F.sub.n, a respective linear or affine second function G.sub.n and a respective third function W.sub.n so that H.sub.n=G.sub.n∘F.sub.n∘W.sub.n, wherein the method comprises: performing a first processing stage and a second processing stage to generate the value y based on the amount of data x, wherein: the first processing stage uses a plurality of first lookup tables to generate respective outputs, each output being based on at least part of the amount of data x, wherein, for each S-box S.sub.n (n=0, . . . , N−1), the respective first function F.sub.n is implemented by a corresponding first lookup table; and the second processing stage combines outputs from a plurality of second lookup tables to generate the value y, wherein the input to each second lookup table is formed from the output of a plurality of the first lookup tables, and wherein the set of second lookup tables is based on the second functions G.sub.n (n=0, . . . , N−1) and the linear transformation L.
Claims
1. A method of performing a cryptographic process in a secured manner, wherein the cryptographic process generates output data based on input data, the generating of the output data involving generating a value y based on an amount of data x, the value y representing a combination, according to a linear transformation L, of respective outputs from a plurality of S-boxes S.sub.n (n=0, . . . , N−1) for integer N>1, wherein each S-box S.sub.n (n=0, . . . , N−1) implements a respective function H.sub.n that is either (a) the composition of a respective first function F.sub.n and a respective linear or affine second function G.sub.n so that H.sub.n=G.sub.n∘F.sub.n, or (b) the composition of a respective first function F.sub.n, a respective linear or affine second function G.sub.n and a respective third function W.sub.n so that H.sub.n=G.sub.n∘F.sub.n∘W.sub.n, wherein the method comprises: performing a first processing stage and a second processing stage to generate the value y based on the amount of data x, wherein: the first processing stage uses a plurality of first lookup tables to generate respective outputs, each output being based on at least part of the amount of data x, wherein, for each S-box S.sub.n (n=0, . . . , N−1), the respective first function F.sub.n is implemented by a corresponding first lookup table; and the second processing stage combines outputs from a plurality of second lookup tables to generate the value y, wherein the input to each second lookup table is formed from the output of a plurality of the first lookup tables, and wherein the set of second lookup tables is based on the second functions G.sub.n (n=0, . . . , N−1) and the linear transformation L.
2. A method of generating a secured implementation of a cryptographic process, wherein the cryptographic process generates output data based on input data, the generating of the output data involving generating a value y based on an amount of data x, the value y representing a combination, according to a linear transformation L, of respective outputs from a plurality of S-boxes S.sub.n (n=0, . . . , N−1) for integer N>1, wherein each S-box S.sub.n (n=0, . . . , N−1) implements a respective function H.sub.n that is either (a) the composition of a respective first function F.sub.n and a respective linear or affine second function G.sub.n so that H.sub.n=G.sub.n∘F.sub.n, or (b) the composition of a respective first function F.sub.n, a respective linear or affine second function G.sub.n and a respective third function W.sub.n so that H.sub.n=G.sub.n∘F.sub.n∘W.sub.n, wherein the method comprises: implementing a first processing stage and a second processing stage that, together, are arranged to generate the value y based on the amount of data x, wherein: implementing the first processing stage comprises generating a plurality of first lookup tables that provide respective outputs, each output being based on at least part of the amount of data x, wherein, for each S-box S.sub.n (n=0, . . . , N−1), the respective first function F.sub.n is implemented by a corresponding first lookup table; and implementing the second processing stage comprises generating a plurality of second lookup tables, the second processing stage arranged to combine outputs from the plurality of second lookup tables to generate the value y, wherein the input to each second lookup table is formed from the output of a plurality of the first lookup tables, and wherein the set of second lookup tables is based on the second functions G.sub.n (n=0, . . . , N−1) and the linear transformation L.
3. The method of claim 1, wherein the outputs of the first lookup tables have a larger bit width than the inputs to the first lookup tables.
4. The method of claim 1, wherein the first lookup tables implement a corresponding obfuscation transformation that is undone by the plurality of second lookup tables.
5. The method of claim 1, wherein the output of each first lookup table being based on at least part of the amount of data x comprises the output of each first lookup table being based on a corresponding portion of bits of the amount of data x.
6. The method of claim 1, wherein the output of each first lookup table comprises the sum of a respective plurality of components, and wherein the input to each second lookup table is formed from one or more respective components of each of said plurality of the first lookup tables.
7. A system arranged to perform a cryptographic process in a secured manner, wherein the cryptographic process generates output data based on input data, the generating of the output data involving generating a value y based on an amount of data x, the value y representing a combination, according to a linear transformation L, of respective outputs from a plurality of S-boxes S.sub.n (n=0, . . . , N−1) for integer N>1, wherein each S-box S.sub.n (n=0, . . . , N−1) implements a respective function H.sub.n that is either (a) the composition of a respective first function F.sub.n and a respective linear or affine second function G.sub.n so that H.sub.n=G.sub.n∘F.sub.n, or (b) the composition of a respective first function F.sub.n, a respective linear or affine second function G.sub.n and a respective third function W.sub.n so that H.sub.n=G.sub.n∘F.sub.n∘W.sub.n, wherein the system comprises one or more processors configure to: perform a first processing stage and a second processing stage to generate the value y based on the amount of data x, wherein: the first processing stage uses a plurality of first lookup tables to generate respective outputs, each output being based on at least part of the amount of data x, wherein, for each S-box S.sub.n (n=0, . . . , N−1), the respective first function F.sub.n is implemented by a corresponding first lookup table; and the second processing stage combines outputs from a plurality of second lookup tables to generate the value y, wherein the input to each second lookup table is formed from the output of a plurality of the first lookup tables, and wherein the set of second lookup tables is based on the second functions G.sub.n (n=0, . . . , N−1) and the linear transformation L.
8. A system arranged to generate a secured implementation of a cryptographic process, wherein the cryptographic process generates output data based on input data, the generating of the output data involving generating a value y based on an amount of data x, the value y representing a combination, according to a linear transformation L, of respective outputs from a plurality of S-boxes S.sub.n (n=0, . . . , N−1) for integer N>1, wherein each S-box S.sub.n (n=0, . . . , N−1) implements a respective function H.sub.n that is either (a) the composition of a respective first function F.sub.n and a respective linear or affine second function G.sub.n so that H.sub.n=G.sub.n∘F.sub.n, or (b) the composition of a respective first function F.sub.n, a respective linear or affine second function G.sub.n and a respective third function W.sub.n so that H.sub.n=G.sub.n∘F.sub.n∘W.sub.n, wherein the system comprises one or more processors arranged to: implement a first processing stage and a second processing stage that, together, are arranged to generate the value y based on the amount of data x, wherein: implementing the first processing stage comprises generating a plurality of first lookup tables that provide respective outputs, each output being based on at least part of the amount of data x, wherein, for each S-box S.sub.n (n=0, . . . , N−1), the respective first function F.sub.n is implemented by a corresponding first lookup table; and implementing the second processing stage comprises generating a plurality of second lookup tables, the second processing stage arranged to combine outputs from the plurality of second lookup tables to generate the value y, wherein the input to each second lookup table is formed from the output of a plurality of the first lookup tables, and wherein the set of second lookup tables is based on the second functions G.sub.n (n=0, . . . , N−1) and the linear transformation L.
9. The system of claim 7, wherein the outputs of the first lookup tables have a larger bit width than the inputs to the first lookup tables.
10. The system of claim 7, wherein the first lookup tables implement a corresponding obfuscation transformation that is undone by the plurality of second lookup tables.
11. The system of claim 7, wherein the output of each first lookup table being based on at least part of the amount of data x comprises the output of each first lookup table being based on a corresponding portion of bits of the amount of data x.
12. The system of claim 7, wherein the output of each first lookup table comprises the sum of a respective plurality of components, and wherein the input to each second lookup table is formed from one or more respective components of each of said plurality of the first lookup tables.
13. (canceled)
14. (canceled)
15. The method of claim 2, wherein the outputs of the first lookup tables have a larger bit width than the inputs to the first lookup tables.
16. The method of claim 2, wherein the first lookup tables implement a corresponding obfuscation transformation that is undone by the plurality of second lookup tables.
17. The method of claim 2 wherein the output of each first lookup table being based on at least part of the amount of data x comprises the output of each first lookup table being based on a corresponding portion of bits of the amount of data x.
18. The method of claim 2, wherein the output of each first lookup table comprises the sum of a respective plurality of components, and wherein the input to each second lookup table is formed from one or more respective components of each of said plurality of the first lookup tables.
19. The system of claim 8, wherein the outputs of the first lookup tables have a larger bit width than the inputs to the first lookup tables.
20. The system of claim 8, wherein the first lookup tables implement a corresponding obfuscation transformation that is undone by the plurality of second lookup tables.
21. The system of claim 8, wherein the output of each first lookup table being based on at least part of the amount of data x comprises the output of each first lookup table being based on a corresponding portion of bits of the amount of data x.
22. The system of claim 8, wherein the output of each first lookup table comprises the sum of a respective plurality of components, and wherein the input to each second lookup table is formed from one or more respective components of each of said plurality of the first lookup tables.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
[0021]
[0022]
[0023]
[0024]
[0025]
[0026]
[0027]
[0028]
DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION
[0029] In the description that follows and in the figures, certain embodiments of the invention are described. However, it will be appreciated that the invention is not limited to the embodiments that are described and that some embodiments may not include all of the features that are described below. It will be evident, however, that various modifications and changes may be made herein without departing from the broader spirit and scope of the invention as set forth in the appended claims.
1—System Overview
[0030]
[0031] The storage medium 104 may be any form of non-volatile data storage device such as one or more of a hard disk drive, a magnetic disc, a solid-state-storage device, an optical disc, a ROM, etc. The storage medium 104 may store an operating system for the processor 108 to execute in order for the computer 102 to function. The storage medium 104 may also store one or more computer programs (or software or instructions or code).
[0032] The memory 106 may be any random access memory (storage unit or volatile storage medium) suitable for storing data and/or computer programs (or software or instructions or code).
[0033] The processor 108 may be any data processing unit suitable for executing one or more computer programs (such as those stored on the storage medium 104 and/or in the memory 106), some of which may be computer programs according to embodiments of the invention or computer programs that, when executed by the processor 108, cause the processor 108 to carry out a method according to an embodiment of the invention and configure the system 100 to be a system according to an embodiment of the invention. The processor 108 may comprise a single data processing unit or multiple data processing units operating in parallel, separately or in cooperation with each other. The processor 108, in carrying out data processing operations for embodiments of the invention, may store data to and/or read data from the storage medium 104 and/or the memory 106.
[0034] The interface 110 may be any unit for providing an interface to a device 122 external to, or removable from, the computer 102. The device 122 may be a data storage device, for example, one or more of an optical disc, a magnetic disc, a solid-state-storage device, etc. The device 122 may have processing capabilities—for example, the device may be a smart card. The interface 110 may therefore access data from, or provide data to, or interface with, the device 122 in accordance with one or more commands that it receives from the processor 108.
[0035] The user input interface 114 is arranged to receive input from a user, or operator, of the system 100. The user may provide this input via one or more input devices of the system 100, such as a mouse (or other pointing device) 126 and/or a keyboard 124, that are connected to, or in communication with, the user input interface 114. However, it will be appreciated that the user may provide input to the computer 102 via one or more additional or alternative input devices (such as a touch screen). The computer 102 may store the input received from the input devices via the user input interface 114 in the memory 106 for the processor 108 to subsequently access and process, or may pass it straight to the processor 108, so that the processor 108 can respond to the user input accordingly.
[0036] The user output interface 112 is arranged to provide a graphical/visual and/or audio output to a user, or operator, of the system 100. As such, the processor 108 may be arranged to instruct the user output interface 112 to form an image/video signal representing a desired graphical output, and to provide this signal to a monitor (or screen or display unit) 120 of the system 100 that is connected to the user output interface 112. Additionally or alternatively, the processor 108 may be arranged to instruct the user output interface 112 to form an audio signal representing a desired audio output, and to provide this signal to one or more speakers 121 of the system 100 that is connected to the user output interface 112.
[0037] Finally, the network interface 116 provides functionality for the computer 102 to download data from and/or upload data to one or more data communication networks.
[0038] It will be appreciated that the architecture of the system 100 illustrated in
2—Secured Implementation of Cryptographic Processes and S-Boxes
[0039] The SM4 encryption and decryption algorithms are well-known—details of SM4 can be found at http://www.gmbz.org.cn/upload/2018-04-04/1522788048733065051.pdf, the entire disclosure of which is incorporated herein by reference.
[0040] SM4 encryption operates on a 128-bit input d.sub.In and produces a corresponding 128-bit encrypted output d.sub.Out using a 128-bit encryption key. SM4 encryption involves performing a round 32 times—the input to the first round (round 0) is d.sub.In, and the input to the (r+1).sup.th round is the output of the preceding r.sup.th round (for r=0, 1, . . . , 30). The r.sup.th round (for r=0, 1, . . . , 31) makes use of a corresponding 32-bit round key k.sub.r that is derived from the 128-bit encryption key using a key expansion algorithm.
[0041]
[0048] The output of the last round is, therefore, the four 32-bit quantities v.sub.32, v.sub.33, v.sub.34, v.sub.35. The 128-bit encrypted output d.sub.Out is formed by reversing the order of these four 32-bit quantities, i.e. the 128-bit quantity represented by the concatenation of v.sub.35, v.sub.34, v.sub.33, v.sub.32.
[0049] SM4 decryption may be implemented similarly, as is well-known.
[0050] The S-boxes S.sub.0, S.sub.1, S.sub.2, S.sub.3 for SM4 encryption each implement the lookup table set out in Table 1 below. In particular, for an 8-bit input with hexadecimal representation αβ, the corresponding 8-bit value that is output/provided by the S-box is as per Table 1 below.
TABLE-US-00001 TABLE 1 β 0 1 2 3 4 5 6 7 8 9 A B C D E F α 0 D6 90 E9 FE CC E1 3D B7 16 B6 14 C2 28 FB 2C 5 1 2B 67 9A 76 2A BE 4 C3 AA 44 13 26 49 86 6 99 2 9C 42 50 F4 91 EF 98 7A 33 54 0B 43 ED CF AC 62 3 E4 B3 1C A9 C9 8 E8 95 80 DF 94 FA 75 8F 3F A6 4 47 7 A7 FC F3 73 17 BA 83 59 3C 19 E6 85 4F A8 5 68 6B 81 B2 71 64 DA 8B F8 EB 0F 4B 70 56 9D 35 6 1E 24 0E 5E 63 58 D1 A2 25 22 7C 3B 1 21 78 87 7 D4 0 46 57 9F D3 27 52 4C 36 2 E7 A0 C4 C8 9E 8 EA BF 8A D2 40 C7 38 B5 A3 F7 F2 CE F9 61 15 A1 9 E0 AE 5D A4 9B 34 1A 55 AD 93 32 30 F5 8C B1 E3 A 1D F6 E2 2E 82 66 CA 60 C0 29 23 AB 0D 53 4E 6F B D5 DB 37 45 DE FD 8E 2F 3 FF 6A 72 6D 6C 5B 51 C 8D 1B AF 92 BB DD BC 7F 11 D9 5C 41 1F 10 5A D8 D 0A C1 31 88 A5 CD 7B BD 2D 74 D0 12 B8 E5 B4 B0 E 89 69 97 4A 0C 96 77 7E 65 B9 F1 9 C5 6E C6 84 F 18 F0 7D EC 3A DC 4D 20 79 EE 5F 3E D7 CB 39 48
[0051] The S-box of Table 1 implements the S-box function H(x)=(A.sub.2((A.sub.1(x⊕C.sub.1)).sup.−1))⊕C.sub.2, where x, C.sub.1, C.sub.2∈.sub.2.sup.8 (i.e. are represented by respective 8×1 vectors of bits), and A.sub.1, A.sub.2 are 8×8 matrices over
.sub.2. It will be appreciated, of course, that there are other equivalent ways of mathematically representing the S-box function H(x).
[0052] If the 32-bit round key k.sub.r is viewed as a concatenation of four 8-bit subkeys k.sub.r,0, k.sub.r,1, k.sub.r,2, k.sub.r,3, then each of the subkeys k.sub.r,n (n=0, 1, 2, 3) may be implemented as part of the corresponding S-box S.sub.n. This results in four bespoke S-boxes for the r.sup.th round, namely S.sub.r,n (n=0, 1, 2, 3) that correspond, respectively, to the 8-bit subkeys k.sub.r,n. In particular, for any 8-bit input x, the S-box S.sub.r,n generates an 8-bit output y that equals the output of the standard S-box for SM4 encryption when provided with the 8-bit input x⊕k.sub.r,n. Thus, the S-box S.sub.r,n implements the S-box function H.sub.r,n(x)=(A.sub.2((A.sub.1(x⊕k.sub.r,n⊕C.sub.1)).sup.−1))⊕C.sub.2.
[0053] The AES encryption and decryption algorithms are well-known—details of AES are given in Federal Information Processing Standards Publication 197 (found at http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf), the entire disclosure of which is incorporated herein by reference.
[0054] AES encryption operates on a 128-bit input d.sub.In and produces a corresponding 128-bit encrypted output d.sub.Out. There are three variations of AES, known as AES-128, AES-192 and AES-256: for AES-n, the size of the encryption key is n bits. AES encryption involves performing a round a number of times, R—for AES-128, R=10; for AES-192, R=12; for AES-256, R=14. A key expansion algorithm is used to generate R+1 128-bit subkeys k.sub.r (r=0, 1, . . . , R). The r.sup.th round makes use of k.sub.r (r=1, 2, . . . , R). The input to AES encryption is d.sub.In, which gets XOR-ed with k.sub.0, following which the sequence of R rounds (rounds 1, 2, . . . , R) is performed.
[0055]
[0060] AES decryption may be implemented similarly, as is well-known.
[0061] The S-boxes S.sub.0, S.sub.1, . . . S.sub.15 for AES encryption each implement the lookup table set out in Table 2 below. In particular, for an 8-bit input with hexadecimal representation αβ, the corresponding 8-bit value that is output/provided by the S-box is as per Table 2 below.
TABLE-US-00002 TABLE 2 β 0 1 2 3 4 5 6 7 8 9 A B C D E F α 0 63 7c 77 7b f2 6b 6f c5 30 1 67 2b fe d7 ab 76 1 ca 82 c9 7d fa 59 47 f0 ad d4 a2 af 9c a4 72 C0 2 b7 fd 93 26 36 3f f7 cc 34 a5 e5 f1 71 d8 31 15 3 4 c7 23 c3 18 96 5 9a 7 12 80 e2 eb 27 b2 75 4 9 83 2c 1a 1b 6e 5a a0 52 3b d6 b3 29 e3 2f 84 5 53 d1 0 ed 20 fc b1 5b 6a cb be 39 4a 4c 58 cf 6 d0 ef aa fb 43 4d 33 85 45 f9 2 7f 50 3c 9f a8 7 51 a3 40 8f 92 9d 38 f5 be b6 da 21 10 ff f3 d2 8 cd 0c 13 ec 5f 97 44 17 c4 a7 7e 3d 64 5d 19 73 9 60 81 4f de 22 2a 90 88 46 ee b8 14 de 5e 0b db A e0 32 3a 0a 49 6 24 5c c2 d3 ac 62 91 95 e4 79 B e7 c8 37 6d 8d d5 4e a9 6c 56 f4 ea 65 7a ae 8 C ba 78 25 2e 1c a6 b4 c6 e8 dd 74 1f 4b bd 8b 8a D 70 3e b5 66 48 3 f6 0e 61 35 57 b9 86 c1 1d 9e E e1 f8 98 11 69 d9 8e 94 9b 1e 87 e9 ce 55 28 df F 8c a1 89 0d bf e6 42 68 41 99 2d 0f b0 54 bb 16
[0062] The S-box of Table 2 implements the S-box function H(x)=(A.sub.1(x.sup.−1))⊕C.sub.1, where x, C.sub.1∈.sub.2.sup.8 (i.e. are represented by respective 8×1 vectors of bits), and A.sub.1 is an 8×8 matrix over
.sub.2. It will be appreciated, of course, that there are other equivalent ways of mathematically representing the S-box function H(x).
[0063] If the 128-bit subkey k.sub.r−1 is viewed as a concatenation of sixteen 8-bit subkeys k.sub.r−1,0, k.sub.r−1,1, . . . , k.sub.r−1,15, (r=1, 2, . . . , R), then each of the subkeys k.sub.r−1,n (n=0, 1, . . . , 15) may be implemented as part of the corresponding S-box S, in the r.sup.th round. In other words, the initial XOR of the input d.sub.In with k.sub.0 may be implemented as part of the S-boxes for round 1. Likewise, the XOR at the end of the r.sup.th round (r=1, 2, . . . , R−1) may be implemented as part of the S-boxes for the following round, i.e. the (r+1).sup.th round. This results in sixteen bespoke S-boxes for the r.sup.th round, namely S.sub.r,n (n=0, 1, . . . , 15) that correspond, respectively, to the 8-bit subkeys k.sub.r−1,n. In particular, for any 8-bit input x, the S-box S.sub.r,n generates an 8-bit output y that equals the output of the standard S-box for AES encryption when provided with the 8-bit input x⊕k.sub.r−1,n. Thus, the S-box S.sub.r,n implements the S-box function H.sub.r,n(x)=(A.sub.1((x⊕k.sub.r−1,n).sup.−1))⊕C.sub.1.
[0064] As can be seen from the above, SM4 encryption, SM4 decryption, AES encryption and AES decryption may be viewed as a cryptographic process that comprises generating output data d.sub.Out based on input data d.sub.In. The generation of the output data involves generating an amount of data y based on an amount of data x, the amount of data y representing a combination, according to a linear transformation L, of N S-box outputs for some integer N>1. In some implementations of such a cryptographic process, the S-box may be implemented once and used N times to provide the N S-box outputs; in other implementations of such a cryptographic process, the S-box may be implemented separately more than once (e.g. N times), with the N S-box outputs being provided from the plurality of implementations of the S-boxes—this is particularly true when the S-boxes are different for each S-box output, e.g. when a key (or a part thereof) has been combined with the S-box, as discussed above. Regardless of the actual implementation, in the following this may be regarded as equivalent to using a plurality N of S-boxes S.sub.n (n=0, 1, . . . , N−1) for integer N>1, wherein each S-box S.sub.n (n=0, 1, . . . , N−1) implements a respective function H.sub.n (which may or may not be the same of the other functions H.sub.j for j≠n). As illustrated above, the generation of an amount of data y based on an amount of data x occurs in each round of SM4 encryption, SM4 decryption, AES encryption and AES decryption, but it will be appreciated that this need not be the case for other cryptographic processes. It will be appreciated that embodiments of the invention are not limited to SM4 encryption/decryption or AES encryption/decryption as the cryptographic process, but that other algorithms could be used instead (such as Serpent encryption/decryption). Based on the above, it will be appreciated that embodiments of the invention are particularly suited to cryptographic processes that are, or that involve use of, a substitution-permutation network (such networks being well-known, and more details of which can be found at https://en.wikipedia.org/wiki/Substitution%E2%80%93permutation_network, the entire disclosure of which is incorporated herein by reference).
[0065] More generally, then, suppose there are N S-boxes S.sub.n (n=0, 1, . . . , N−1), where N is an integer greater than 1. Each S-box S.sub.n (n=0, 1, . . . , N−1) implements, or represents, an S-box function H.sub.n, i.e. for each valid input x, for the S-box S.sub.n, the corresponding output from the S-box S.sub.n is H.sub.n(x.sub.n). Thus, the amount of data x may comprise (or provide or represent) the inputs x.sub.n for the S-boxes S.sub.n (n=0, 1, . . . , N−1), and the linear transformation L may operates on the outputs H.sub.n(x.sub.n) from the S-boxes S.sub.n (n=0, 1, . . . , N−1) to generate the amount of data y. For example (e.g. as in the SM4 encryption and decryption and AES encryption and decryption discussed above), the inputs x.sub.n (n=0, 1, . . . , N−1) may be formed from corresponding bits (e.g. blocks of consecutive bits) of the amount of data x.
[0066] Now, for each S-box S.sub.n (n=0, 1, . . . , N−1), the corresponding S-box function H.sub.n may be represented as a composition of a corresponding first function F.sub.n and a corresponding second function G.sub.n, so that H.sub.n=G.sub.n∘F.sub.n. In the following discussion and embodiments, the corresponding second function G.sub.n is an affine function/transformation or possibly a linear function/transformation. Indeed, the corresponding S-box function H.sub.n may be represented as a composition of more than two functions, which may be represented as a composition of a corresponding first function F.sub.n, a corresponding second function G.sub.n and a corresponding third function W.sub.n so that H.sub.n=G.sub.n∘F.sub.n∘W.sub.n. If, on the face of it, the corresponding S-box function H.sub.n does not appear to be representable as a composition of two or more functions, then note that H.sub.n=G.sub.n∘(G.sub.n.sup.−1∘H.sub.n) for any affine (or possibly linear) function G.sub.n having the same codomain as H.sub.n's codomain—thus, the corresponding S-box function H.sub.n may be represented as a composition of a corresponding first function, namely (G.sub.n.sup.−1∘H.sub.n), and a corresponding second function G.sub.n.
[0067] Thus, each S-box S.sub.n (n=0, 1, . . . , N−1) implements a respective function H.sub.n that is either (a) the composition of a respective first function F.sub.n and a respective second function G.sub.n so that H.sub.n=G.sub.n∘F.sub.n, or (b) the composition of a respective first function F.sub.n, a respective second function G.sub.n and a respective third function W.sub.n so that H.sub.n=G.sub.n∘F.sub.n∘W.sub.n. It will be appreciated that, for any S-box function H.sub.n, there may be multiple ways of writing H.sub.n as a composition of two or more functions.
[0068] For each S-box S.sub.n (n=0, 1, . . . , N−1), the corresponding S-box function H.sub.n may be an algebraic function, but this is not essential. Likewise, the corresponding first function F.sub.n (and, where used, the corresponding third function W.sub.n) may be algebraic functions, but this is not essential. As mentioned, the corresponding second function G.sub.n is an affine function/transformation (or possibly a linear function/transformation).
[0069]
[0070]
[0073] As discussed, the third functions W.sub.n (n=0, 1, . . . , N−1) are optional, hence they are shown in dotted lines in
[0074] In
[0075] Now, F.sub.n(x.sub.n) (n=0, 1, . . . , N−1) may be written, or represented as, a plurality of components (or parts), i.e. F.sub.n(x.sub.n)=Σ.sub.d=0.sup.D.sup..sub.2.sup.B.sup.
.sub.2.sup.B.sup.
.sub.2.sup.B.sup.
[0076] As a linear transformation, L may be represented as a matrix
so that the first processing stage and the second processing stage together implement
[0077] It will be appreciated that, in some embodiments, G.sub.n(F.sub.n(x.sub.n)) (n=0, 1, . . . , N−1) may be represented as B.sub.n×1 a vector (e.g. a vector from .sub.2.sup.B.sup.
, in which case each of l.sub.q,n (n, q=0, 1, . . . , N−1) may be an element of the field
.
[0078] Now:
[0079] Here,
to
Thus, each component e.sub.n,d (n=0, 1, . . . , N−1; d=0, 1, . . . , D.sub.n−1) contributes the vector
to the computation of the amount of data y, with y being the sum of these vector contributions (and potentially with the addition of
[0080] The set of components E={e.sub.n,d: n=0, 1, . . . , N−1; d=0, 1, . . . , D.sub.n−1} may be partitioned into a plurality of disjoint partitions, each having a respective plurality of the components e.sub.n,d. Let there be M such partitions (for integer M>1), namely E.sub.m (m=0, 1, . . . , M−1), where U.sub.m=0.sup.M−1E.sub.m=E and E.sub.m1∩E.sub.m2=Ø if m1≠m2. In some embodiments, the partitions have the same number of components; in other embodiments, some or all of the partitions may have different numbers of components from each other. Each partition E.sub.m contributions the vector
to the computation of the amount of data y.
[0081] Based on the above,
(or a representation thereof), i.e. the contribution that the partition E.sub.m makes to the computation of y. The outputs from the plurality of second lookup tables Ω.sub.m (m=0, 1, . . . , M−1) may then be combined (i.e. summed/added) to generate y (and potentially with the addition of
[0082] Thus, as can be seen from
[0085] In some embodiments, the second stage may be arranged to generate a masked version of the amount of data y. For example, the second stage may be arranged to generate the amount of data y+
[0086] In
[0087] A specific example of this is set out below, and is illustrated schematically in
H.sub.r,n(x)=(A.sub.2((A.sub.1(x⊕k.sub.r,n⊕C.sub.1)).sup.−1))⊕C.sub.2
[0088] In this case, one may write H.sub.r,n=G.sub.r,n∘F.sub.r,n, where F.sub.r,n(x)=(A.sub.1(x⊕k.sub.r,n⊕C.sub.1)).sup.−1 and G.sub.r,n(x)=A.sub.2x⊕C.sub.2.
[0089] The 128-bit input to the r.sup.th round comprises, or is treated as a concatenation of, four 32-bit quantities (or values): v.sub.r, v.sub.r+1, v.sub.r+2, v.sub.r+3. For the first processing stage shown in
[0090] The outputs F.sub.r,n(x.sub.r,n) from the lookup tables Φ.sub.r,n (n=0, 1, 2, 3) for the r.sup.th round are 8-bit values, each of which may be viewed as having three respective components, namely: e.sub.r,n,0 is the 8-bit value whose 3 most significant bits match those of F.sub.r,n(x.sub.r,n) and whose other bits are 0; e.sub.r,n,1 is the 8-bit value whose 2 middle bits match those of F.sub.r,n(x.sub.r,n) and whose other bits are 0; and e.sub.r,n,2 is the 8-bit value whose 3 least significant bits match those of F.sub.r,n(x.sub.r,n) and whose other bits are 0. Thus F.sub.r,n(x.sub.r,n)=e.sub.r,n,0+e.sub.r,n,1+e.sub.r,n,2. Of course, the way in which components are chosen/selected may change from one Type 1 table to another Type 1 table. Likewise, the way in which components are chosen/selected may change from one round to another round.
[0091] The set of components E.sub.r={e.sub.r,n,j: n=0, . . . , 3; j=0, 1, 2} could be partitioned in a variety of ways, but suppose that five partitions are used so that M=5, e.g. E.sub.r,0={e.sub.r,0,0, e.sub.r,1,2}, E.sub.r,1={e.sub.r,0,1, e.sub.r,1,0, e.sub.r,3,2}, E.sub.r,2={e.sub.r,0,2, e.sub.r,2,1}, E.sub.r,3={e.sub.r,2,0, e.sub.r,3,1} and E.sub.r,4={e.sub.r,1,1, e.sub.r,2,2, e.sub.r,3,0}. Of course, the way in which components are partitioned may change from one round to another round.
[0092] Then, for the r.sup.th round there will be five respective Type 2 lookup tables Ω.sub.r,m (m=0, . . . , M−1), where: [0093] (a) The input to Ω.sub.r,0 corresponds to, or is based on, e.sub.r,0,0 and e.sub.r,1,2 and, therefore, could be a 6-bit value formed from the 3 most significant bits of the output of Φ.sub.r,0 and the 3 least significant bits of the output of Φ.sub.r,1; the output of Ω.sub.r,0 would then be
[0098] As mentioned, each S-box S.sub.n (n=0, 1, . . . , N−1) implements a respective function H.sub.n that can be written as either (a) the composition of a respective first function F.sub.n and a respective second function G.sub.n so that H.sub.n=G.sub.n∘F.sub.n, or (b) the composition of a respective first function F.sub.n, a respective second function G.sub.n and a respective third function W.sub.n so that H.sub.n=G.sub.n∘F.sub.n∘W.sub.n. There may be multiple ways of writing H.sub.n as a composition of two or more functions. For example, as discussed above for the SM4 encryption example of
[0099] The invertible linear transformations T.sub.n (n=0, 1, . . . , N−1) could be any linear transformations and could, for example, be randomly generated. In some embodiments, T.sub.n1 is different from T.sub.n2 for some n1≠n2; in other embodiments, T.sub.n is the same for all n=0, 1, . . . , N−1. In embodiments that make use of the invertible linear transformations T.sub.n (n=0, 1, . . . , N−1), the first lookup tables Φ.sub.n (n=0, 1, . . . , N−1) implement a corresponding obfuscation transformation that is undone by the plurality of second lookup tables Ω.sub.m (m=0, 1, . . . , M−1).
[0100] An example of bit-expansion-function J.sub.n (n=0, 1, . . . , N−1) is as follows. Suppose B.sub.n,1=8 and B.sub.n,2=12. If the input to the bit-expansion function J.sub.n is z (as an 8-bit vector or element of .sub.2.sup.8), then let γ.sub.0 and γ.sub.1 be 4-bit values made from different bits of z, so that z can be reformed from γ.sub.0 and γ.sub.1 (e.g. γ.sub.0 is the value made from the 4 most significant bits of z and γ.sub.1 is the value made from the 4 least significant bits of z). J.sub.n may generate two 4-bit random numbers α.sub.1 and β.sub.1, and define two 4-bit numbers α.sub.0 and β.sub.0 as α.sub.0=γ.sub.0⊕α.sub.1 and β.sub.0=γ.sub.1⊕β.sub.1. Then J.sub.n(z)=(δ.sub.0, δ.sub.1, δ.sub.2), i.e. a triple of three 4-bit numbers, where δ.sub.0=α.sub.0⊕β.sub.0, δ.sub.1=α.sub.1⊕β.sub.1 and δ.sub.2=α.sub.1⊕β.sub.0. Here we note that γ.sub.0=δ.sub.0⊕δ.sub.2 and γ.sub.1=δ.sub.1⊕δ.sub.2, so that γ.sub.0 and γ.sub.1 (and hence z) may be recovered from (δ.sub.0, δ.sub.1, δ.sub.2), thereby defining the inverse mapping J.sub.n.sup.−1 over the codomain of J.sub.n. Thus, one could represent J.sub.n(z) with three 12-bit vectors or components, namely
(with 0, δ.sub.0, δ.sub.1, δ.sub.2 viewed here as 4-bit vectors), so that J.sub.n(z) (as an 12-bit vector or element of .sub.2.sup.12) is J.sub.n(z)=e.sub.0+e.sub.1+e.sub.2. It will be appreciated, of course, that other bit-expansion functions could be used instead. Regardless, in embodiments that make use of bit-expansion-functions J.sub.n (n=0, 1, . . . , N−1), the outputs of the first lookup tables Φ.sub.n (n=0, 1, . . . , N−1) have a larger bit width than the inputs to the first lookup tables Φ.sub.n (n=0, 1, . . . , N−1).
[0101]
[0102] At a step 702, the method 700 comprises performing a first processing stage. The first processing stage uses a plurality of first lookup tables Φ.sub.n (n=0, 1, . . . , N−1) to generate respective outputs, each output being based on at least part of the amount of data x. For each S-box S.sub.n (n=0, . . . , N−1), the respective first function F.sub.n is implemented by a corresponding first lookup table Φ.sub.n.
[0103] At a step 704, the method 700 comprises performing a second processing stage. The second processing stage combines outputs from a plurality of second lookup tables Ω.sub.m (m=0, 1, . . . , M−1) to generate the value y (and potentially with the addition of
[0104] The first processing stage and the second processing stage generate the value y based on the amount of data x.
[0105]
[0106] At a step 802, the method 800 comprises implementing a first processing stage. This involves generating a plurality of first lookup tables Φ.sub.n (n=0, 1, . . . , N−1) that provide respective outputs, each output being based on at least part of the amount of data x. For each S-box S.sub.n (n=0, . . . , N−1), the respective first function F.sub.n is implemented by a corresponding first lookup table Φ.sub.n.
[0107] At a step 804, the method 800 comprises performing a second processing stage. This involves generating a plurality of second lookup tables Ω.sub.m (m=0, 1, . . . , M−1). The second processing stage is arranged or configured to combine outputs from the plurality of second lookup tables Ω.sub.m (m=0, 1, . . . , M−1) to generate the value y. The input to each second lookup table Ω.sub.m (m=0, 1, . . . , M−1) is formed from the output of a plurality of the first lookup tables (namely the components e.sub.n,d∈E.sub.m). The set of second lookup tables is based on the second functions G.sub.n (n=0, . . . , N−1) and the linear transformation L.
[0108] The first processing stage and the second processing stage, together, are arranged to generate the value y based on the amount of data x.
4—Modifications
[0109] It will be appreciated that the methods described have been shown as individual steps carried out in a specific order. However, the skilled person will appreciate that these steps may be combined or carried out in a different order whilst still achieving the desired result.
[0110] It will be appreciated that embodiments of the invention may be implemented using a variety of different information processing systems. In particular, although the figures and the discussion thereof provide an exemplary computing system and methods, these are presented merely to provide a useful reference in discussing various aspects of the invention. Embodiments of the invention may be carried out on any suitable data processing device, such as a personal computer, laptop, personal digital assistant, mobile telephone, set top box, television, server computer, etc. Of course, the description of the systems and methods has been simplified for purposes of discussion, and they are just one of many different types of system and method that may be used for embodiments of the invention. It will be appreciated that the boundaries between logic blocks are merely illustrative and that alternative embodiments may merge logic blocks or elements, or may impose an alternate decomposition of functionality upon various logic blocks or elements.
[0111] It will be appreciated that the above-mentioned functionality may be implemented as one or more corresponding modules as hardware and/or software. For example, the above-mentioned functionality may be implemented as one or more software components for execution by a processor of the system. Alternatively, the above-mentioned functionality may be implemented as hardware, such as on one or more field-programmable-gate-arrays (FPGAs), and/or one or more application-specific-integrated-circuits (ASICs), and/or one or more digital-signal-processors (DSPs), and/or one or more graphical processing units (GPUs), and/or other hardware arrangements. Method steps implemented in flowcharts contained herein, or as described above, may each be implemented by corresponding respective modules; multiple method steps implemented in flowcharts contained herein, or as described above, may be implemented together by a single module.
[0112] It will be appreciated that, insofar as embodiments of the invention are implemented by a computer program, then one or more storage media and/or one or more transmission media storing or carrying the computer program form aspects of the invention. The computer program may have one or more program instructions, or program code, which, when executed by one or more processors (or one or more computers), carries out an embodiment of the invention. The term “program” as used herein, may be a sequence of instructions designed for execution on a computer system, and may include a subroutine, a function, a procedure, a module, an object method, an object implementation, an executable application, an applet, a servlet, source code, object code, byte code, a shared library, a dynamic linked library, and/or other sequences of instructions designed for execution on a computer system. The storage medium may be a magnetic disc (such as a hard drive or a floppy disc), an optical disc (such as a CD-ROM, a DVD-ROM or a BluRay disc), or a memory (such as a ROM, a RAM, EEPROM, EPROM, Flash memory or a portable/removable memory device), etc. The transmission medium may be a communications signal, a data broadcast, a communications link between two or more computers, etc.