Protection system and method
11733966 · 2023-08-22
Assignee
Inventors
Cpc classification
H04L9/003
ELECTRICITY
G06F7/501
PHYSICS
H04L2209/046
ELECTRICITY
International classification
G06F7/501
PHYSICS
Abstract
A device for executing a cryptographic operation on bit vectors is described. The execution of the cryptographic operation includes the execution of at least one arithmetic addition operation between a first operand and a second operand. Each operand comprises a set of components, each component corresponding to a given bit position of the operand. The device comprises a set of elementary adders, each elementary adder being associated with a given bit position of the operands and being configured to perform a bitwise addition between a component of the first operand at the given bit position and the corresponding component of the second operand at the given bit position using the carry generated by the computation performed by the elementary adder corresponding to the previous bit position. Each elementary adder has a sum output corresponding to the bitwise addition and a carry output, the result of the arithmetic addition operation being derived from the sum outputs provided by each elementary adder. The device is configured to apply a mask to each operand component input of at least some of the elementary adders using a masking logical operation, the mask being a random number.
Claims
1. A cryptographic system comprising a device for executing a cryptographic operation on bit vectors, the execution of said cryptographic operation comprising the execution of at least one arithmetic addition operation between a first operand and a second operand, each operand being an integer of a given bit size and representing a bit vector, each operand comprising a set of components, each component corresponding to a given bit position of the operand, the device comprising a set of elementary adders, each elementary adder being associated with a given bit position of the operands, wherein each elementary adder other than the elementary adder in a least significant bit position has a sum output corresponding to a bitwise addition and a carry output, wherein each elementary adder other than the elementary adder in the least significant bit position is configured to perform a bitwise addition between a component of the first operand at said given bit position and the corresponding component of the second operand at said given bit position using the carry generated by the bitwise addition performed by the elementary adder corresponding to a previous bit position, a result of the arithmetic addition operation being derived from the sum outputs provided by each elementary adder, wherein the device is configured to apply a mask to each operand component input of at least some of the elementary adders using a masking logical operation, and wherein a same mask is applied to each elementary adder.
2. The cryptographic system of claim 1, wherein said mask is random.
3. The cryptographic system of claim 1, wherein the masking logical operation used to apply a mask to each operand component input is a XOR logic operation between said mask and said operand component input.
4. The cryptographic system of claim 1, wherein each elementary adder is a full adder.
5. The cryptographic system of claim 1, wherein each elementary adder is a carry look-ahead adder.
6. A method, implemented in a cryptographic system, for executing a cryptographic operation on bit vectors, by a device comprising a set of elementary adders in said cryptographic system, the cryptographic operation being related to a cryptographic mechanism, the execution of said cryptographic operation, by said device, comprising the execution of at least one arithmetic addition operation between a first operand and a second operand, each operand being an integer of a given bit size and representing a bit vector, each operand comprising a set of components, each component corresponding to a given bit position of the operand, each elementary adder being associated with a given bit position of the operands, the method comprising, for each bit position of the operands, other than a least significant bit position, performing, by the elementary adder associated with said bit position, a bitwise addition providing a sum output and a carry output, wherein the step of performing a bitwise addition comprises, for each bit position of the operands, other than the least significant bit position, performing a bitwise addition between a component of the first operand at said bit position and the corresponding component of the second operand at said bit position, using the carry generated by the bitwise addition of the bit components of the operand at a previous bit position, a result of the arithmetic addition operation being derived from the sum outputs provided by each elementary adder, the step of performing a bitwise addition previously comprising applying a mask to each operand bit component using a masking logical operation, wherein a same mask is applied to each elementary adder.
7. The method of claim 6, wherein said mask is a random number.
8. A cryptographic system comprising a device for executing a cryptographic operation on bit vectors, the execution of said cryptographic operation comprising the execution of at least one arithmetic addition operation between a first operand and a second operand, each operand being an integer of a given bit size and representing a bit vector, each operand comprising a set of components, each component corresponding to a given bit position of the operand, the device comprising a set of elementary adders, each elementary adder being associated with a given bit position of the operands, wherein each elementary adder other than the elementary adder in a least significant bit position has a sum output corresponding to a bitwise addition and a carry output, wherein each elementary adder other than the elementary adder in the least significant bit position is configured to perform a bitwise addition between a component of the first operand at said given bit position and the corresponding component of the second operand at said given bit position using the carry generated by the bitwise addition performed by the elementary adder corresponding to a previous bit position, a result of the arithmetic addition operation being derived from the sum outputs provided by each elementary adder, wherein the device is configured to apply a mask to each operand component input of at least some of the elementary adders using a masking logical operation, wherein the device is configured to apply different masks to sequences of elementary adders, each sequence comprising connected elementary adders, wherein said set of elementary adders comprises at least two sequences of elementary adders, a different mask being applied to each sequence of elementary adders, and wherein the device comprises a mask switching unit arranged between the output of a previous sequence of elementary adders which is applied a mask and the input of a next sequence of elementary adders, and wherein the mask 
switching unit is configured to apply a new mask to the next sequence of elementary adders and to provide the carry output of the last elementary adder of the previous sequence of elementary adders, to a first elementary adder of the next sequence of elementary adders.
9. The cryptographic system of claim 8, wherein the mask switching unit comprises at least two XOR logical gates.
10. The cryptographic system of claim 9, wherein the mask switching unit comprises at least two XOR logical gates, a first XOR logical gate comprising a first XOR logical gate configured to receive the new mask and the carry output of the last elementary adder of a previous set of elementary adders, a second XOR logical gate receiving the mask of the previous set of elementary adders and the output of the first XOR logical gate, the output of the second XOR logical gate being connected to the input of a first elementary adder of a next set of elementary adders.
11. The cryptographic system of claim 8, wherein the masking logical operation used to apply a mask to each operand component input is a XOR logic operation between said mask and said operand component input.
12. The cryptographic system of claim 8, wherein each elementary adder is a full adder.
13. The cryptographic system of claim 8, wherein each elementary adder is a carry look-ahead adder.
14. A method, implemented in a cryptographic system, for executing a cryptographic operation on bit vectors, by a device comprising a set of elementary adders in said cryptographic system, the cryptographic operation being related to a cryptographic mechanism, the execution of said cryptographic operation, by said device, comprising the execution of at least one arithmetic addition operation between a first operand and a second operand, each operand being an integer of a given bit size and representing a bit vector, each operand comprising a set of components, each component corresponding to a given bit position of the operand, each elementary adder being associated with a given bit position of the operands, the method comprising, for each bit position of the operands, other than a least significant bit position, performing a bitwise addition, by the elementary adder associated with said bit position, providing a sum output and a carry output, wherein the step of performing a bitwise addition comprises, for each bit position of the operands, other than the least significant bit position, performing a bitwise addition between a component of the first operand at said bit position and the corresponding component of the second operand at said bit position, using the carry generated by the bitwise addition of the bit components of the operand at a previous bit position, a result of the arithmetic addition operation being derived from the sum outputs provided by each elementary adder, the step of performing a bitwise addition previously comprising applying a mask to each operand bit component using a masking logical operation, wherein the method comprises applying different masks to sequences of elementary adders, each sequence comprising connected elementary adders, wherein a different mask is applied to each sequence of elementary adders, and wherein the method comprises applying a mask to a previous sequence of elementary adders, and applying, by a mask switching 
unit, a new mask to the next sequence of elementary adders and providing the carry output of the last elementary adder of the previous sequence of elementary adders, to a first elementary adder of the next sequence of elementary adders.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate various embodiments of the invention and, together with the general description of the invention given above, and the detailed description of the embodiments given below, serve to explain the embodiments of the invention.
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13) Additionally, the detailed description is supplemented with an Exhibit 1: Exhibit 1 is an exemplary application of the masking method to mask the algorithm TEA (Tiny Encryption Algorithm).
(14) This Exhibit is placed apart for the purpose of clarifying the detailed description, and of enabling easier reference. It nevertheless forms an integral part of the description of the present invention. This applies to the drawings as well.
(15) A portion of the disclosure of this patent document may contain material which is subject to copyright protection.
DETAILED DESCRIPTION
(16) Referring to
(17) The cryptographic system 10 may comprise a cryptographic engine 11 configured to execute a cryptographic operation related to the cryptographic mechanism while protecting such execution from attacks. The cryptographic operation may be implemented by any cryptographic algorithm comprising Boolean and/or arithmetic operations, such as for example DES, AES, IDEA, RC5 or SHA. The cryptographic operation comprises at least one arithmetic addition a+b between a first binary operand a and a second binary operand b, the operands being integers of a given bit size, each representing a bit vector (also referred to as a “data block”). The bit vectors represent data blocks, such as intermediate states of the cryptographic algorithm.
(18) The operands may be of same or of different bit sizes. The following description of some embodiments will be made with reference to operands of same bit size n for illustration purpose only, although the skilled person will readily understand that the invention also applies to operands of different size (in such case, the smaller operand can be for example padded with zeros to apply the invention to operands having same bit width).
(19) In cryptographic operations, plain data is encrypted by chaining operations involving a secret (e.g., a key), in such a way that the result reveals little (unexploitable) information on the plain data and on the secret. Such operations may consist of linear and/or non-linear operations, which break down into the manipulation of data of intermediate bit width, such as nibbles (bit vectors of 4 bits), bytes (bit vectors of 8 bits), words (bit vectors of 16 bits), double words (bit vectors of 32 bits), quad-words (bit vectors of 64 bits), etc. The bit vectors (or ‘data blocks’) represented by operands a and b may be such words of limited bit width, which allow efficient use of the logic and arithmetic operations (e.g., assembly instructions) available in processors or in general purpose machines (e.g., hardwired look-up tables or “digital signal processors” embedded into Field Programmable Gate Arrays (FPGAs)).
(20) The cryptographic engine 11 may comprise an adder device 100 configured to execute each arithmetic addition a+b. The arithmetic addition may be defined over the integers (the set of integers is noted ℤ in mathematical notation) or over the ring of integers modulo 2.sup.n, noted ℤ/2.sup.nℤ.
(21) To facilitate understanding of some embodiments described hereinafter, the following definitions are provided.
(22) Considering two integers a=(a.sub.n−1; . . . ; a.sub.0).sub.2 and b=(b.sub.n−1; . . . ; b.sub.0).sub.2, each of the two integers a and b being an n-bit integer, represented as a string of bits, an arithmetic addition refers to a sequence of bitwise operations defined as follows, for i=0, . . . , n−1:
d.sub.i =a.sub.i⊕b.sub.i⊕c.sub.i (1)
(23) with:
c.sub.i+1=MAJ(a.sub.i,b.sub.i,c.sub.i) (2)
(24) where c.sub.0 is set to 0 when adding a and b (this operation is also referred to as ADD in assembly languages), or c.sub.0 is the value of the incoming carry in the case of a pipelined addition a+b+c.sub.0 (this operation is also referred to as ADDC in assembly languages).
(25) It should be noted that integers a=(a.sub.n−1; . . . ; a.sub.0) and b=(b.sub.n−1; . . . ; b.sub.0) both fit on n bits, meaning that 0≤a≤(2.sup.n−1) and 0≤b≤(2.sup.n−1).
(26) As the incoming carry c.sub.0 is a bit 0≤c.sub.0≤1, it comes:
0≤a+b+c.sub.0≤(2.sup.n−1)+(2.sup.n−1)+1=2.sup.(n+1)−1
(27) As a result, a+b+c.sub.0 fits on (n+1) bits.
(28) As a+b+c.sub.0=(c.sub.n; d.sub.n−1; . . . ; d.sub.0).sub.2 or a+b+c.sub.0=(d.sub.n; d.sub.n−1; . . . ; d.sub.0).sub.2 using the convention that d.sub.n=c.sub.n, the sum a+b+c.sub.0 can be expressed using the bit values:
d.sub.i(0≤i≤(n−1)), and
c.sub.i(0≤i≤n)
(29) In the following description of some embodiments, a+b+c.sub.0 will be denoted d, that is:
a+b+c.sub.0=d
(30) With:
d=(d.sub.n;d.sub.n−1; . . . ; d.sub.0).sub.2
(31) The operand bit size n may be any positive integer. In some embodiments, n may be large (i.e. n≫8). The arithmetic addition may be performed modulo a number N, for example N=2.sup.n, in which case the carry c.sub.n=d.sub.n has weight 2.sup.n=0 modulo N and can hence be dropped, leaving the result a+b=(d.sub.n−1, . . . , d.sub.0).sub.2 on n bits, which is compatible with other subsequent operations on the same data size n. The following description will be made with reference to a basic arithmetic addition without modulo operations, for illustrative purpose, although the skilled person will readily understand that the invention also applies to arithmetic additions modulo a number N that is a power of two (i.e., dropping the Most Significant Bits does not alter the result). Further, the skilled person will readily understand that the invention also applies to operations modulo a number which is not a power of two.
(32) In equation (2), MAJ denotes the majority function, defined as follows:
MAJ(a.sub.i,b.sub.i,c.sub.i)=(a.sub.i∧b.sub.i)∨(b.sub.i∧c.sub.i)∨(c.sub.i∧a.sub.i)=(a.sub.i∧b.sub.i)⊕(b.sub.i∧c.sub.i)⊕(c.sub.i∧a.sub.i) (3)
(33) Operators are defined as follows: “⊕” designates logical exclusive OR; “∧” designates logical AND; “∨” designates logical OR.
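The definitions of equations (1) to (3) can be illustrated in software. The following is a minimal sketch only, not the claimed circuit: the function names (`maj_or`, `maj_xor`, `ripple_add`, `bits_to_int`) are hypothetical and chosen for illustration, and bits are held in Python lists with the least significant bit first.

```python
def maj_or(a, b, c):
    """Majority of three bits, OR form of equation (3)."""
    return (a & b) | (b & c) | (c & a)


def maj_xor(a, b, c):
    """Majority of three bits, XOR form of equation (3)."""
    return (a & b) ^ (b & c) ^ (c & a)


def ripple_add(a, b, n, c0=0):
    """Add two n-bit integers bit by bit, following equations (1) and (2).

    Returns the sum bits d_0..d_{n-1} (least significant first)
    and the carries c_0..c_n.
    """
    d, c = [], [c0]
    for i in range(n):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        d.append(a_i ^ b_i ^ c[i])        # equation (1)
        c.append(maj_or(a_i, b_i, c[i]))  # equation (2)
    return d, c


def bits_to_int(bits):
    """Rebuild an integer from bits given least significant first."""
    return sum(bit << i for i, bit in enumerate(bits))
```

With the convention d.sub.n=c.sub.n, the (n+1)-bit result is obtained as `bits_to_int(d + [c[n]])`; keeping only `d` gives the result modulo 2.sup.n.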
(34) An addition between two binary operands a and b may be implemented using a logical circuit comprising a set of elementary adders for performing bitwise additions between the bit components a.sub.i and b.sub.i of the operands a and b. The result of the arithmetic addition operation is then derived from the sum outputs provided by each elementary adder 10. A memory 3 may be used to store the intermediary results (sum outputs) of each elementary adder 10.
(35) Embodiments of the invention provide an addition execution method and an adder device 100 for executing a cryptographic operation on data blocks, the execution of the cryptographic operation comprising the execution of at least one arithmetic addition operation between a first operand a and a second operand b. Each operand a or b may be an integer having a given bit size n and representing a data block. Each operand a or b comprises a set of components (total number of components is n), each component corresponding to a given bit position i of the operand.
(36)
(37) The adder device 100 comprises a set of elementary adders 10. In the example of
(38) In a conventional adder device, the (i+1)−th elementary adder performs directly a bitwise addition between the component a.sub.i of the first operand at the bit position i and the component b.sub.i of the second operand at the same bit position i, using a carry c.sub.i generated by the computation performed by the i-th elementary adder corresponding to the previous bit position i−1. Further, each conventional (i+1)−th elementary adder delivers a sum output d.sub.i corresponding to the bitwise addition and a carry output c.sub.i+1, the result of the arithmetic addition operation being derived from the sum outputs provided by each elementary adder 10.
(39) According to embodiments of the invention, the adder device 100 may be configured to apply a mask m (which is a bit) to each operand component input a.sub.i or b.sub.i of at least some of the elementary adders 10 using at least one masking logical operation. The component a′.sub.i thus corresponds to the operand component a.sub.i after masking with a mask m, and the component b′.sub.i corresponds to the operand component b.sub.i after masking with a mask m. The same mask may preferably be applied to a.sub.i and b.sub.i. Further, in the embodiment of
(40) In some embodiments, a trans-masking module may be used. According to some embodiments of the invention, the adder 100 may comprise at least one masking unit 5 configured to apply a mask m to at least some of the inputs a.sub.i, b.sub.i, c.sub.i of each elementary adder sub-circuit 10 using at least one masking logical operation. For example, it may occur that sensitive data fit on a smaller bit-width than the bit-width of the register, in particular on some generic computing platform, configured to both handle sensitive and non-sensitive data.
(41) The cryptographic engine 11 may be further configured to perform de-masking, the de-masking consisting in removing by a logical operation, such as XOR, the mask after the addition (with or without carry), so as to yield the correct addition value as if no mask had been used. Advantageously, when the input is masked, the arithmetic operations according to the invention have the property that the output is also masked. That is, several operations (arithmetic operations or other types of operations such as Boolean logic for example) may be carried out successively, whereby the manipulated data remains advantageously masked and thus protected against attacks. The skilled person will readily understand that the invention can also apply to compositions, be they sequential (one after the other) or parallel (same masked data feeding two independent operators). Accordingly, the invention can apply to complex operations, thereby extending addition to multiplication (e.g., bit-serial parallel implementation of multipliers) and/or to any arithmetic operations (powers, quotient and remainders, etc.).
(42) Each mask m may be a random bit. For improved masking, the bit distribution may be uniform (hence of maximal entropy). Each mask may be generated by a Random Number Generator (not shown), that may be provided by the same chip as the cryptosystem or via a separate chip such as a TPM (Trusted Platform Module), an HSM (Hardware Security Module) or a quantum source, whose function is to compute cryptographic-grade random numbers. The Random Number Generator may be provided with dedicated protection to secure the generated masks. Such secrecy of the masks guarantees the security of the masking scheme. Hence, protections such as sensors and shields may be deployed to ensure that masking bits remain confidential.
(43) The mask value may be generated periodically, for example at each clock cycle, or at each addition computation.
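For illustration only, drawing a fresh mask bit and applying it to every component of an operand can be sketched as follows. This sketch assumes Python's standard `secrets` module as a software stand-in for the Random Number Generator described above, and assumes XOR as the masking logical operation; the function names `fresh_mask` and `mask_operand_bits` are hypothetical.

```python
import secrets


def fresh_mask():
    """Return one uniformly distributed mask bit.

    Software stand-in (assumption) for the hardware Random Number Generator.
    """
    return secrets.randbits(1)


def mask_operand_bits(x, n, m):
    """XOR each of the n bit components of x with the same mask bit m."""
    all_ones = (1 << n) - 1
    return x ^ (all_ones if m else 0)
```

Since XOR is its own inverse, calling `mask_operand_bits` a second time with the same mask bit removes the mask. A new mask would be drawn with `fresh_mask()` for each addition, or at whatever period the embodiment uses.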
(44) In one embodiment, the same mask m may be applied to each masked input. In some embodiments, a mask may be applied to all the inputs a.sub.i, b.sub.i, and c.sub.0 of at least some of the n elementary adders 10.
(45) In a particular embodiment, a mask may be applied to all the inputs a.sub.i, b.sub.i, and c.sub.0 of all n elementary adders 10.
(46) In the embodiment of
(47) In one embodiment, the masking logical operation applied to each operand bit may be a XOR logical operation (also denoted by the operator “⊕”). In some embodiments, it is possible to swap the roles of the Boolean operator XOR and of the arithmetic operator ADD: the masking is then additive while the operation to be protected is Boolean. Further, in some embodiments, the masking unit 5 may apply additional masking to the Boolean masking according to embodiment of the invention. In particular, in one embodiment, it may apply a combination of the Boolean masking and further arithmetic masking, as depicted by
(48)
(49) In the embodiment of
(50)
(51) Indeed, the following properties are satisfied for all m in {0,1}:
d.sub.i⊕m=(a.sub.i⊕m)⊕(b.sub.i⊕m)⊕(c.sub.i⊕m) (4)
c.sub.i+1⊕m=MAJ(a.sub.i⊕m, b.sub.i⊕m, c.sub.i⊕m) (5)
(52) Where m denotes a mask.
(53) Equations (4) and (5) can be rewritten as follows:
d′.sub.i=a′.sub.i⊕b′.sub.i⊕c′.sub.i (6)
c′.sub.i+1=MAJ(a′.sub.i, b′.sub.i, c′.sub.i) (7)
(54) where a′.sub.i=(a.sub.i⊕m), b′.sub.i=(b.sub.i⊕m), c′.sub.i=(c.sub.i⊕m), c′.sub.i+1=(c.sub.i+1⊕m), d′.sub.i=(d.sub.i⊕m), and where a.sub.i, b.sub.i, c.sub.i, and c.sub.i+1 are linked by relationships (1) and (2).
(55) The carry delivered by each stage i=0 to (n−1) to the next stage is thus c.sub.i+1⊕m. Further, the sum output obtained from each stage i is d′.sub.i=d.sub.i⊕m, as a result of the properties of the XOR logical operator (expressed by equations (4) and (6)).
(56) Accordingly, each (i+1)−th elementary adder sub-circuit 10, for i=0 to (n−1), is configured to perform the operation described by equation (1) and (2) and receives as inputs a.sub.i⊕m, b.sub.i⊕m, c.sub.i⊕m and provides as outputs c.sub.i+1⊕m and d.sub.i⊕m. For i=0, the first elementary adder sub-circuit 10 receives as inputs a.sub.0⊕m, b.sub.0⊕m, and carry c.sub.0⊕m with c.sub.0=0.
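The behaviour of the masked stages can be checked in software. The sketch below is illustrative only and assumes a single mask bit m shared by all stages and XOR as the masking operation; the helper names (`to_bits`, `from_bits`, `masked_ripple_add`, `demask`) are hypothetical. Each stage sees only masked inputs and produces masked outputs, and the mask is removed only at the end.

```python
def maj(a, b, c):
    """Majority of three bits."""
    return (a & b) | (b & c) | (c & a)


def to_bits(x, n):
    """Bits of x, least significant first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Integer from bits given least significant first."""
    return sum(bit << i for i, bit in enumerate(bits))


def masked_ripple_add(a_masked, b_masked, m, c0=0):
    """Ripple-carry addition on masked bits a_i^m and b_i^m.

    Returns the masked sum bits d_i^m and the masked final carry c_n^m.
    """
    carry = c0 ^ m                           # masked incoming carry
    d_masked = []
    for a_i, b_i in zip(a_masked, b_masked):
        d_masked.append(a_i ^ b_i ^ carry)   # equals d_i ^ m
        carry = maj(a_i, b_i, carry)         # equals c_{i+1} ^ m
    return d_masked, carry


def demask(d_masked, carry_masked, m):
    """Remove the mask and return the (n+1)-bit sum as an integer."""
    return from_bits([d ^ m for d in d_masked] + [carry_masked ^ m])
```

For example, with n=4, the operands are masked as `[bit ^ m for bit in to_bits(a, 4)]` before being passed to `masked_ripple_add`, and `demask` recovers a+b for either value of m.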
(57)
(58) Such transparency advantage stems from the properties (4) and (5).
(59) Equation (4) is derived from equation (1) and is due to the associativity of the XOR operator.
(60) The property defined by equation (5) is inherent to the majority function MAJ. Its dual function is also the majority. The majority MAJ of ones (at least two ones amongst three bits) is the opposite of the majority of zeroes (at least two zeroes amongst three bits). This can be written:
MAJ(¬a.sub.i,¬b.sub.i,¬c.sub.i)=¬MAJ(a.sub.i,b.sub.i,c.sub.i) (8)
(61) As used herein, the operator ‘¬’, also called ‘neg’, returns the negated value, that is: ¬0=1 and ¬1=0.
(62) Equation (5) can be rewritten:
∀m∈{0,1}, c.sub.i+1⊕m=MAJ(a.sub.i⊕m, b.sub.i⊕m, c.sub.i⊕m)=MAJ(a.sub.i,b.sub.i,c.sub.i)⊕m (9)
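Because only four bits are involved, the self-duality property of equations (8) and (9) can be verified exhaustively. The following short sketch does so; the function name `check_self_duality` is hypothetical.

```python
from itertools import product


def maj(a, b, c):
    """Majority of three bits."""
    return (a & b) | (b & c) | (c & a)


def check_self_duality():
    """Return True if MAJ(a^m, b^m, c^m) == MAJ(a, b, c) ^ m for all bits."""
    return all(
        maj(a ^ m, b ^ m, c ^ m) == maj(a, b, c) ^ m
        for a, b, c, m in product((0, 1), repeat=4)
    )
```

The case m=1 corresponds to equation (8), since XOR with 1 is the negation ¬; the case m=0 is trivial.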
(63) Equation (7) shows that addition is transparent to masking with a mask m, provided that the carries are also inverted (not only the inputs a and b are inverted, but also the input carry).
(64) In some embodiments, each elementary adder sub-circuit 10 may be implemented as a Full adder.
(65)
(66) In some embodiments, different masks may be applied to different sequences or sets of elementary adders 10, each comprising one or more elementary adders. An elementary adder will be referred to hereinafter using an index i, with 0≤i≤n−1. The number of elementary adders using a mask m.sub.k will be denoted N.sub.k, hereinafter. Accordingly, a given mask m.sub.k may be applied to a sequence of elementary adder sub-circuits 10 while one or more different masks m.sub.k+1, . . . , m.sub.L−1 may be applied to other sequences of elementary adder sub-circuits 10 of the adder device 100. Each mask m.sub.0, m.sub.k, . . . , m.sub.L−1 may be determined randomly in some embodiments. The index k may range from k=0 to L−1. The number L of masks may range from L=1 (only one masking bit) to L=n (i.e. the masking bit is changed between each elementary adder 10). The different masks may be random and independent binary values. In one embodiment, each value m.sub.k+1 may correspond to an update of the mask m.sub.k, each update being generated periodically, the new value being applied to the current elementary adders 10 as long as no new mask value has been received. In some embodiments, the masks m.sub.k may be applied to sets of elementary adders comprising a same number of adders or to sets of elementary adders comprising different numbers of adders.
(67) In one embodiment, the sum of the numbers N.sub.k for k=0 to L−1 is equal to the number n of elementary adder sub-circuits 10:
Σ.sub.k=0.sup.L−1 N.sub.k=n (10)
(68) Advantageously, the number of adders with the same mask (i.e., the number N.sub.k) may be a power of two, for instance to match a machine word length (e.g., 8, 16, 32, or 64 bits).
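As an illustration of equation (10), the assignment of masks to stages can be derived from the list of sequence lengths N.sub.0, . . . , N.sub.L−1. The sketch below is an assumption-laden example, with a hypothetical function name `mask_index_per_stage`; it simply maps each stage index i to the index k of the mask it uses, and rejects lengths that do not sum to n.

```python
def mask_index_per_stage(sizes, n):
    """Return, for each stage i = 0..n-1, the index k of the mask m_k it uses.

    sizes holds N_0..N_{L-1}; equation (10) requires that they sum to n.
    """
    if sum(sizes) != n:
        raise ValueError("sequence sizes must sum to n (equation (10))")
    return [k for k, n_k in enumerate(sizes) for _ in range(n_k)]
```

For a 4-bit adder split into two sequences of two adders, `mask_index_per_stage([2, 2], 4)` yields `[0, 0, 1, 1]`.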
(69)
(70) A mask m.sub.k+1 may be applied to a current sequence 511 of elementary adders 10 using a switching sub-circuit 50 between the current sequence 511 of elementary adders 10 and the previous sequence 510 of elementary adders 10 masked with mask m.sub.k. The switching sub-circuit 50 (also referred to as a “transmasking” sub-circuit) may be configured to switch the masks from mask m.sub.k to mask m.sub.k+1, and apply the new mask m.sub.k+1 to the current sequence 511, while delivering the re-masked carry c″.sub.l=c′.sub.l⊕m.sub.k⊕m.sub.k+1=MAJ(a.sub.l−1,b.sub.l−1,c.sub.l−1)⊕m.sub.k+1 to the current sequence 511 of elementary adders 10.
(71) A mask m.sub.k+1 applied to a next sequence of elementary adder sub-circuits 10 may be predetermined randomly, for example by a random number generator. The switching sub-circuit 50 may thus be inserted between: the last elementary adder sub-circuit 10-l (l-th elementary adder sub-circuit) of the previous sequence 510 of elementary adder sub-circuits 10; and the first elementary adder 10-(l+1) ((l+1)-th elementary adder sub-circuit) of the current sequence 511 of elementary adder sub-circuits 10.
(72) The switching sub-circuit 50 may be configured to switch the mask value from previous mask m.sub.k to new mask m.sub.k+1 by performing logical operations. More specifically, the switching sub-circuit 50 may: receive as inputs the value of the mask m.sub.k applied to the previous sequence 510 of elementary adder sub-circuits 10 and the output c′.sub.l=c.sub.l⊕m.sub.k of the last elementary adder sub-circuit (l-th elementary adder sub-circuit) of the previous sequence 510 of elementary adder sub-circuits 10; receive a fresh mask m.sub.k+1; and deliver as output the value c.sub.l⊕m.sub.k+1, that is, the carry output of the last elementary adder sub-circuit of the previous sequence 510 re-masked with the new mask, this value being inputted to the first elementary adder sub-circuit ((l+1)-th elementary adder sub-circuit) of the current sequence 511 of elementary adder sub-circuits 10.
(73)
(74) As shown in
(75) The first mask m.sub.0 is applied to a sequence 401 of full adders comprising the first two full adders 10-1 and 10-2. The second mask m.sub.1 is applied to a sequence 402 of full adders, comprising the last two full adders 10-3 and 10-4.
(76) The mask m.sub.0 is initially applied to the first stage (first elementary adder 10-1) for the first bit i=0.
(77) The first Full Adder 10-1 receives as inputs a.sub.0⊕m.sub.0, b.sub.0⊕m.sub.0, and c.sub.0⊕m.sub.0=m.sub.0 (since c.sub.0=0), and provides as output d.sub.0⊕m.sub.0 and c′.sub.1=c.sub.1⊕m.sub.0 with:
d.sub.0⊕m.sub.0=(a.sub.0⊕m.sub.0)⊕(b.sub.0⊕m.sub.0)⊕(c.sub.0⊕m.sub.0) (11) with:
c′.sub.1=MAJ(a.sub.0, b.sub.0, c.sub.0)⊕m.sub.0 (12)
(78) Similarly, the second Full adder 10-2 receives as inputs a.sub.1⊕m.sub.0, b.sub.1⊕m.sub.0, c.sub.1⊕m.sub.0 and provides as output d.sub.1⊕m.sub.0 and c′.sub.2=c.sub.2⊕m.sub.0 with:
d.sub.1⊕m.sub.0=(a.sub.1⊕m.sub.0)⊕(b.sub.1⊕m.sub.0)⊕(c.sub.1⊕m.sub.0) (13) with:
c′.sub.2=MAJ(a.sub.1,b.sub.1,c.sub.1)⊕m.sub.0 (14)
(79) In the embodiment of
(80) The switching sub-circuit 50 may comprise two XOR logical gates 500 and 502. The first XOR gate 500 receives as input the mask m.sub.1 and the carry c′.sub.2 of the second Full adder 10-2 with c′.sub.2=c.sub.2⊕m.sub.0=MAJ(a.sub.1, b.sub.1, c.sub.1)⊕m.sub.0. The first XOR gate 500 thus performs the operation:
S′=m.sub.1⊕c′.sub.2=m.sub.1⊕c.sub.2⊕m.sub.0 (15)
(81) The second XOR gate 502 receives as input the mask m.sub.0 and the output S′ of the first XOR gate 500. The second XOR gate 502 accordingly performs the operation:
S″=m.sub.0⊕S′=m.sub.0⊕(m.sub.1⊕c′.sub.2)=m.sub.0⊕m.sub.1⊕c.sub.2⊕m.sub.0 (16)
(82) S″ is thus equal to c″.sub.2=c.sub.2⊕m.sub.1
(83) It should be noted that, in order to prevent c.sub.2 from appearing transiently unmasked, gates 500 and 502 should not be swapped.
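The two-gate mask switch of equations (15) and (16) can be sketched in software as below. This is an illustrative model only, with the hypothetical function name `switch_mask`; in hardware the concern is the order of the physical gates, which the order of the two statements mirrors.

```python
def switch_mask(carry_masked_old, m_old, m_new):
    """Re-mask a carry from m_old to m_new, following equations (15) and (16).

    carry_masked_old is c ^ m_old; the returned value is c ^ m_new.
    """
    # First XOR gate: add the new mask first, so the intermediate value
    # is c ^ m_old ^ m_new rather than the bare carry c.
    s1 = m_new ^ carry_masked_old
    # Second XOR gate: remove the old mask, leaving c ^ m_new.
    s2 = m_old ^ s1
    return s2
```

Performing the XOR with `m_old` first would make the intermediate value equal to the unmasked carry, which is the situation the remark on gate order is meant to avoid.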
(84) a.sub.2, b.sub.2 and c.sub.2 being masked by the same mask m.sub.1, equation (5) applies.
(85) The output S″ of the switching circuit 50 may be applied to the third full adder corresponding to the third bit (i=2).
(86) The third Full Adder 10-3 receives as inputs a.sub.2⊕m.sub.1, b.sub.2⊕m.sub.1, c.sub.2⊕m.sub.1 and provides as output d.sub.2⊕m.sub.1 and c″.sub.3=c.sub.3⊕m.sub.1 with:
d.sub.2⊕m.sub.1=(a.sub.2⊕m.sub.1)⊕(b.sub.2⊕m.sub.1)⊕(c.sub.2⊕m.sub.1) (17) with:
c″.sub.3=MAJ(a.sub.2, b.sub.2, c.sub.2)⊕m.sub.1 (18)
(87) Similarly, the fourth Full adder 10-4 (i=3) receives as inputs a.sub.3⊕m.sub.1, b.sub.3⊕m.sub.1, c.sub.3⊕m.sub.1 and provides as output d.sub.3⊕m.sub.1 with:
d.sub.3⊕m.sub.1=(a.sub.3⊕m.sub.1)⊕(b.sub.3⊕m.sub.1)⊕(c.sub.3⊕m.sub.1) (19)
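The 4-bit example with two masks can be reproduced end to end in software. The sketch below is a minimal model under stated assumptions: stages i=0 and i=1 use mask m.sub.0, stages i=2 and i=3 use mask m.sub.1, the mask switch is placed before stage i=2, and c.sub.0=0. The function names (`full_adder`, `masked_add_two_masks`, `unmask`) are hypothetical.

```python
def maj(a, b, c):
    """Majority of three bits."""
    return (a & b) | (b & c) | (c & a)


def full_adder(a, b, c):
    """One elementary adder: returns (sum bit, carry out)."""
    return a ^ b ^ c, maj(a, b, c)


def masked_add_two_masks(a, b, m0, m1):
    """4-bit masked addition: stages 0-1 under m0, stages 2-3 under m1."""
    masks = [m0, m0, m1, m1]
    carry = 0 ^ m0                       # c_0 = 0, masked with m0
    d_masked = []
    for i, m in enumerate(masks):
        if i == 2:                       # mask switch before the third stage
            carry = m0 ^ (m1 ^ carry)    # equations (15) and (16)
        a_i = ((a >> i) & 1) ^ m         # masked operand components
        b_i = ((b >> i) & 1) ^ m
        d_i, carry = full_adder(a_i, b_i, carry)
        d_masked.append(d_i)
    return d_masked, carry, masks


def unmask(d_masked, carry_masked, masks):
    """Remove the per-stage masks and return the 5-bit sum as an integer."""
    bits = [d ^ m for d, m in zip(d_masked, masks)]
    bits.append(carry_masked ^ masks[-1])   # final carry is masked with m1
    return sum(bit << i for i, bit in enumerate(bits))
```

For any 4-bit a and b and any mask bits, `unmask(*masked_add_two_masks(a, b, m0, m1))` returns a+b, which is the transparency property the example relies on.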
(88) It should be noted that the number of different masks applied to the full adders 10 of the n-bit adder 100 may vary depending on the application. One to n different masks may be applied, each k-th mask m.sub.k being implementable using a switching circuit 50 comprising logical gates arranged to switch the mask value from the mask m.sub.k to the mask m.sub.k+1. The number of different masks applied to the adder 100 allows adjusting the overall entropy of the masking, depending on the requirements of the application of the invention.
(89) The invention is not limited to the application of a mask m.sub.k to the inputs of the elementary adders 10 using a two-input XOR operator. Such XOR operator may have more than two inputs. Alternatively, it may be replaced with a mixture between Boolean XOR and arithmetic additions, as in the case of the hybrid Boolean and arithmetic masking scheme depicted in
(90) It should be noted that the invention is not limited to the use of switching circuit 50, as depicted in
(91) While the invention has been described in relation to a cryptographic operation of the addition type between two binary operands, the invention more generally applies to any cryptographic operation whose execution involves at least one addition between two binary operands, the computation of each addition being performed using the adder 100. The cryptographic operation may be for example a multiplication, a subtraction or a division, such operation being implemented using several add steps. For example, a binary multiplication operation PQ can be implemented as a sequence of Q elementary additions P+ . . . +P. In such embodiment, each elementary addition may be performed using the n-bit adder 100.
(92) Further, the invention is not limited to the use of elementary adders 10 of the type full adder for performing each bitwise addition operation. Further, the invention is not limited to the logic design of
(93) For example, the adder device 100 may be of the type “carry look-ahead adder”. The “carry look-ahead adder” implementation is based on the calculation of the carry signals in advance, based on the input signals. Such implementation is based on the fact that a carry signal will be generated in two cases for the addition of two bits A.sub.i and B.sub.i: when both bits A.sub.i and B.sub.i are equal to 1, or when one of the two bits is equal to 1 while the carry-in (carry of the previous stage) is 1.
(95) A CLA block exploits the signals P′.sub.i and G′.sub.i, which are defined such that:
P′.sub.i=a′.sub.i⊕b′.sub.i (20)
G′.sub.i=a′.sub.ib′.sub.i (21)
(96) The output sum and carry can thus be defined as:
d′.sub.i=P′.sub.i⊕c′.sub.i (22)
c′.sub.i+1=G′.sub.i+P′.sub.ic′.sub.i (23)
(97) A carry c′.sub.i+1 is generated whenever G′.sub.i=1, regardless of the input carry c′.sub.i (G′.sub.i is referred to as the “carry generate” signal).
(98) The input carry is propagated to the output carry (c′.sub.i+1=c′.sub.i), whenever P′.sub.i=1. The signal P′.sub.i is thus referred to as the carry propagate signal.
(99) The determination of the values P′.sub.i and G′.sub.i only depends on the input operand bits a′.sub.i and b′.sub.i. Accordingly, the signals P′.sub.i and G′.sub.i reach a steady-state value after the propagation through their associated gates.
(100) The masked-CLA adder 100 can be implemented using three levels:
A first level comprising the n elementary adders 10 of Full Adder type, each generating P′.sub.i and G′.sub.i signals and comprising an XOR gate and an AND gate for each pair of signals {P′.sub.i, G′.sub.i}. Output signals P′.sub.i and G′.sub.i may be valid after 1T, with T designating the propagation delay of one gate. The i-th carry output c′.sub.i may depend on the signals P′.sub.i and G′.sub.i, for i=1 to n−1, and on c′.sub.0. Each carry signal thus depends directly on c′.sub.0 rather than on its preceding carry signal. Each output carry may be implemented in a two-level circuit having a propagation delay of two gates (2T);
A second level formed by the Carry Look-Ahead (CLA) logic block 71, which consists of n two-level logic circuits. The CLA logic block 71 generates the carry signals c′.sub.i+1. Output signals c′.sub.i+1 of this level may be valid after 3T;
A third level using n XOR gates which generate the sum signals d′.sub.i from P′.sub.i and c′.sub.i (d′.sub.i=P′.sub.i⊕c′.sub.i), the next carry being c′.sub.i+1=G′.sub.i+P′.sub.ic′.sub.i. Output signals d′.sub.i of this level may be valid after 4T.
(101) A mask m.sub.k is applied to each input component bit a.sub.i, b.sub.i of the CLA adder (with i=0 to n−1), similarly to the embodiments described with reference to the previous figures. The mask may be the same for a set of full adders or for all full adders.
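The relations (20) to (23) applied to masked inputs can be illustrated by the following C sketch, which assumes a single mask bit m applied to all n positions and an input carry c.sub.0 equal to 0. The function name `masked_cla_add` is hypothetical. The carries are computed here by a loop over the recurrence (23); a hardware CLA block would flatten that recurrence into two-level logic, which this software model does not attempt to reproduce.

```c
#include <assert.h>
#include <stdint.h>

/* n-bit addition using propagate/generate signals on masked inputs.
 * P'_i = a'_i ^ b'_i, G'_i = a'_i & b'_i,
 * d'_i = P'_i ^ c'_i, c'_{i+1} = G'_i | (P'_i & c'_i).
 * Returns the masked sum (d ^ m on each of the n bit positions). */
uint32_t masked_cla_add(uint32_t a, uint32_t b, unsigned n, uint8_t m)
{
    assert(n >= 1 && n <= 32 && m <= 1);
    uint32_t full = (n == 32) ? 0xFFFFFFFFu : ((1u << n) - 1u);
    uint32_t mw = m ? full : 0u;      /* mask bit replicated on n positions */
    uint32_t am = (a ^ mw) & full;    /* masked operand a' */
    uint32_t bm = (b ^ mw) & full;    /* masked operand b' */
    uint32_t p = am ^ bm;             /* level 1: propagate signals */
    uint32_t g = am & bm;             /* level 1: generate signals */
    uint32_t carries = m;             /* bit 0 holds c'_0 = c_0 ^ m, c_0 = 0 */
    for (unsigned i = 0; i + 1 < n; i++) {   /* level 2: carry signals */
        uint32_t ci = (carries >> i) & 1u;
        uint32_t next = ((g >> i) & 1u) | (((p >> i) & 1u) & ci);
        carries |= next << (i + 1);
    }
    return (p ^ carries) & full;      /* level 3: masked sum bits */
}
```

XORing the returned value with the replicated mask gives `(a + b)` truncated to n bits, since G′.sub.i+P′.sub.ic′.sub.i is the majority of a′.sub.i, b′.sub.i and c′.sub.i.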
(102) It is an advantage of the invention to be implementable with minor changes at the Hardware Description Language level (such as VHDL level).
(103) In some applications of the invention related to the use of particular algorithms, the device 100 may be used to reduce a number a modulo another number b.
(105) It is assumed in the following description of some embodiments that a=(a.sub.n−1, . . . , a.sub.0).sub.2 and b=(b.sub.n−1, . . . , b.sub.0).sub.2 fit on n bits. The result of the reduction will be denoted by d=(d.sub.n−1, . . . , d.sub.0).sub.2, which is such that d=a mod b. The case where the MSB b.sub.n−1 of b is set to 1 is further considered. Accordingly:
(106) d=a−b if a≥b, and d=a otherwise (24)
(107) If b does not have its Most Significant Bit set, then the reduction may require more than one subtraction of b.
(108) Another case where a maximum of one subtraction is required is after the addition (even with carry c.sub.0=1) of two numbers a and a′ that are already reduced, i.e., 0≤a≤b−1 and 0≤a′≤b−1. Then:
0≤a+a′+1≤2b−1 (25)
(109) Thus, after one subtraction of b (i.e., if a+a′+1≥b), inequality (25) becomes 0≤a+a′+1−b≤b−1, which is reduced.
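The bound (25) and the single conditional subtraction can be illustrated by the following C sketch. It works on plain (unmasked) integers purely to show the arithmetic; the function name `add_mod_once` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* (a + a2 + c0) mod b for already-reduced a, a2 in [0, b-1] and c0 in {0, 1}.
 * Unmasked arithmetic, shown only to illustrate inequality (25). */
uint32_t add_mod_once(uint32_t a, uint32_t a2, uint32_t c0, uint32_t b)
{
    assert(b > 0 && a < b && a2 < b && c0 <= 1);
    uint64_t s = (uint64_t)a + a2 + c0;  /* 0 <= s <= 2b - 1 */
    if (s >= b) {
        s -= b;                          /* at most one subtraction is needed */
    }
    return (uint32_t)s;                  /* now 0 <= s <= b - 1 */
}
```

For instance, with b=7 the worst case 6+6+1=13 is brought back into range by one subtraction.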
(110) The reduction operation (25) can be carried out on masked data. The test a≥b can be achieved thanks to the adder depicted in
(111) It should be noted that:
b+¬b=(2.sup.n)−1 (26)
(112) hence:
a+¬b+1=2.sup.n+(a−b) (27)
(113) Therefore, a≥b if and only if there is a carry while adding a and ¬b with an input carry set to one.
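Equations (26) and (27) can be checked with the following C sketch, which tests a≥b by observing the carry out of a+¬b+1 and then performs the single conditional subtraction. It operates on unmasked n-bit values for illustration only; the names `geq_by_carry` and `reduce_once` are hypothetical, and `reduce_once` assumes that the most significant bit of b is set.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a >= b for n-bit values, using the carry-out of a + ~b + 1. */
int geq_by_carry(uint32_t a, uint32_t b, unsigned n)
{
    assert(n >= 1 && n <= 32);
    uint64_t full = (1ULL << n) - 1;
    assert(a <= full && b <= full);
    uint64_t nb = (~(uint64_t)b) & full;   /* n-bit complement of b */
    uint64_t s = (uint64_t)a + nb + 1;     /* = 2^n + (a - b), eq. (27) */
    return (int)((s >> n) & 1);            /* carry-out of the n-bit addition */
}

/* a mod b with one conditional subtraction; requires bit n-1 of b to be set. */
uint32_t reduce_once(uint32_t a, uint32_t b, unsigned n)
{
    assert(n >= 1 && n <= 32 && ((b >> (n - 1)) & 1u));
    uint64_t full = (1ULL << n) - 1;
    assert(a <= full && b <= full);
    uint64_t s = (uint64_t)a + ((~(uint64_t)b) & full) + 1;
    /* When the carry is set, the low n bits of s hold a - b. */
    return ((s >> n) & 1) ? (uint32_t)(s & full) : a;
}
```

The same addition therefore yields both the comparison result (the carry) and the reduced value (the low n bits), so no separate subtractor is needed in this sketch.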
(114) It should be noted that, unlike conventional approaches, the embodiments of the invention are homomorphic with a unique path. Accordingly, there is no need to accompany the adder 100 with additional logic computing a correction on the mask m in parallel. The mask m is thus sufficient to protect the chained Boolean and arithmetic operations from end to end.
(115) The invention may significantly improve the resistance of cryptosystems to non-invasive attacks when implemented in an embedded system such as a mobile device or a smartcard, or when implemented in an M2M platform and/or terminal in an IoT (Internet of Things) architecture.
(117) The method may mask the addition with one or more masks m.sub.0, . . . , m.sub.L−1, the masks being random and independent values. The masks may be generated periodically, as often as possible, at each execution of the method or at each clock cycle. The following description of the adding method will be made with reference to a masking based on a set of masks m.sub.0, . . . , m.sub.L−1.
(118) At step 800, a set of masks m.sub.0, . . . , m.sub.L−1 is received, the masks m.sub.k being randomly generated. The set of masks comprises at least one mask. Each mask m.sub.k is associated with a number N.sub.k representing the number of bit components to which the mask is to be applied.
(119) The following steps are iterated for k=0 to L−1 (block 802).
(120) At step 804, the mask m.sub.k is selected and an index p is set to 0. This mask will be applied to mask a first set of operand components a.sub.i, b.sub.i for the bit positions of the operands comprised between i=p to i=p+N.sub.k. The mask m.sub.k is thus iteratively applied to each bit component a.sub.p to a.sub.p+N.sub.
(121) At step 814, it is determined if index i has reached the value p+N.sub.k. If not, i is incremented and steps 808 to 814 are iterated for the new value of i. Otherwise, if index i has reached the value p+N.sub.k (i=p+N.sub.k), and if p+N.sub.k<n−1 (block 818), p is set to p+N.sub.k+1 at step 820, and at step 822, k is set to (k+1): the value of the mask m.sub.k is then switched to m.sub.k+1 and N.sub.k is set to N.sub.k+1. Further, at step 822, the stored value of c′.sub.i+1 is replaced by:
c″.sub.i+1=c.sub.i+1⊕m.sub.k+1
(122) Step 822 may comprise applying the following operation on c′.sub.i+1 to derive c″.sub.i+1:
c″.sub.i+1=c′.sub.i+1⊕m.sub.k+1⊕m.sub.k (28)
(123) Steps 806 to 814 are then iterated for the new values of p, m.sub.k, and N.sub.k.
(124) If it is determined in block 818 that i=p+N.sub.k≥n−1, no new iteration is performed and step 824 returns d′.sub.0 . . . d′.sub.n.
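The adding method with several masks can be sketched in C as follows. This is a simplified software model under stated assumptions rather than a transcription of the flowchart: the function name `masked_add_multi` and its parameters are hypothetical, each mask is a single bit, `count[k]` is taken to be the number of consecutive bit positions covered by `mask[k]`, and the input carry c.sub.0 is 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Masked n-bit addition with L mask bits; mask[k] covers count[k] consecutive
 * bit positions, starting from bit 0. Writes the masked sum bits to *dm and the
 * per-position mask word to *mw, so that the caller can unmask with dm ^ mw. */
void masked_add_multi(uint32_t a, uint32_t b, unsigned n,
                      const uint8_t *mask, const unsigned *count, size_t L,
                      uint32_t *dm, uint32_t *mw)
{
    assert(n >= 1 && n <= 32 && L >= 1);
    uint32_t out = 0, mword = 0;
    uint8_t c = mask[0];              /* masked carry-in: c_0 ^ m_0, c_0 = 0 */
    unsigned i = 0;
    for (size_t k = 0; k < L && i < n; k++) {
        uint8_t m = mask[k];
        assert(m <= 1);
        if (k > 0) {                  /* eq. (28): new mask first, then old */
            c = (uint8_t)(mask[k - 1] ^ (m ^ c));
        }
        for (unsigned j = 0; j < count[k] && i < n; j++, i++) {
            uint8_t am = (uint8_t)(((a >> i) & 1u) ^ m);
            uint8_t bm = (uint8_t)(((b >> i) & 1u) ^ m);
            out   |= (uint32_t)(am ^ bm ^ c) << i;           /* d_i ^ m */
            mword |= (uint32_t)m << i;
            c = (uint8_t)((am & bm) | (am & c) | (bm & c));  /* c_{i+1} ^ m */
        }
    }
    assert(i == n);                   /* the counts must cover all n bits */
    *dm = out;
    *mw = mword;
}
```

With `mask = {1, 0, 1}` and `count = {3, 2, 3}` on 8-bit operands, `dm ^ mw` equals `(a + b) & 0xFF`, the carry being re-masked each time the mask changes.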
(125) Although the adding method has been described, for simplification purposes only, according to an embodiment where the masks are all received initially in a first step 800, the skilled person will readily understand that, alternatively, the switching of the masks may be performed dynamically, the current mask m.sub.k being switched dynamically to a new mask value in response to the reception of a new mask value m.sub.k+1, the new mask value being applied for each (i-th) iteration to add the binary components a′.sub.i and b′.sub.i until a new mask value m.sub.k+2 is received. The mask value m.sub.k+1 may be reused N.sub.k+1 times, the value N.sub.k+1 representing the time needed to obtain a new mask m.sub.k+2. The new mask m.sub.k+2 may be advantageously independent from the previous mask. In such embodiments, the numbers N.sub.k are not predefined and correspond to the number of iterations of steps 808-814 performed until the new mask value m.sub.k+2 is received. The iterations can be done serially or in parallel.
(126) Exhibit 1, which is included per se in the present specification, provides an exemplary application of the masking method according to the invention to protect the TEA algorithm (TEA stands for Tiny Encryption Algorithm). Annotations delimited by “/*” and “*/” have been added in the code. Applying the masking method according to the embodiments of the invention thus protects the algorithm against attacks.
(127) The methods described herein can be implemented by computer program instructions supplied to the processor of any type of computer, to produce a machine with a processor that executes the instructions to implement the functions/acts specified herein. These computer program instructions may also be stored in a computer-readable medium that can direct a computer to function in a particular manner. To that end, the computer program instructions may be loaded onto a computer to cause the performance of a series of operational steps and thereby produce a computer implemented process such that the executed instructions provide processes for implementing the functions/acts specified herein.
(128) More generally, the adder device and adding method described herein may be implemented by various means in hardware, software, or a combination thereof.
(129) Embodiments of the invention provide efficient protection for cryptographic algorithms using at least one addition operation secured against non-invasive attacks, whether the cryptographic algorithm is based on Boolean and/or arithmetic operations.
(130) Although not limited to such embodiments, the invention is particularly adapted to large number libraries, where large integers are represented as a series of limbs, each limb being a machine word. For example, on a 32-bit machine, a 128-bit number a can be represented as a=Σ.sub.i=0.sup.3 a.sub.i2.sup.32i, where each 0≤a.sub.i<2.sup.32 is a limb.
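The limb representation can be illustrated by the following C sketch of a multi-limb addition, in which the carry out of each 32-bit limb addition is fed into the next limb. The function name `limb_add` is hypothetical and the additions are unmasked; in an embodiment, each limb addition could be delegated to the adder 100.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* r = a + b over `limbs` 32-bit limbs, least significant limb first.
 * Returns the carry out of the most significant limb. */
uint32_t limb_add(uint32_t *r, const uint32_t *a, const uint32_t *b, size_t limbs)
{
    assert(r != NULL && a != NULL && b != NULL);
    uint32_t carry = 0;
    for (size_t i = 0; i < limbs; i++) {
        uint64_t s = (uint64_t)a[i] + b[i] + carry;  /* limb addition with carry-in */
        r[i] = (uint32_t)s;                          /* low 32 bits: result limb */
        carry = (uint32_t)(s >> 32);                 /* carry into the next limb */
    }
    return carry;
}
```

A 128-bit number on a 32-bit machine uses four limbs, so `limb_add(r, a, b, 4)` adds two such numbers.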
(132) The processor 91 may include one or more devices that manipulate signals (analog or digital) based on operational instructions that are stored in the memory 92, such as microprocessors, micro-controllers, digital signal processors, microcomputers, central processing units, etc. Memory 92 may include a single memory device or a plurality of memory devices including, but not limited to, read-only memory (ROM), random access memory (RAM), volatile memory, non-volatile memory, static random-access memory (SRAM), dynamic random-access memory (DRAM), flash memory, cache memory, or any other device capable of storing information. Processor 91 may execute instructions directly or under the control of an operating system 920 that resides in memory 92. The operating system 920 may manage computing resources so that computer program code embodied as one or more computer software applications, such as an application 94 residing in memory 92, may have instructions executed by the processor 91. One or more data structures 924 may also reside in memory 92, and may be used by the processor 91, operating system 920, and/or application 94 to store or manipulate data. The data structures 924 may include data structures for securely storing the masks. Such secure storage may be a shared structure mutualized to protect also the cryptographic secret parameters.
(133) The I/O interface 97 may provide a machine interface that operatively couples the processor 91 to other devices and systems, such as the network 93 and/or external resource 84. The HMI 98 may be operatively coupled to the processor 91 of computer 900 in a known manner to allow a user of the computer 900 to interact directly with the computer 900. The HMI 98 may include any suitable audio and visual indicators capable of providing information to the user (video and/or alphanumeric displays, a touch screen, a speaker, etc.) and input devices and controls capable of accepting commands or input from the user and transmitting the entered input to the processor 91.
(134) While certain embodiments of the invention have been described mainly in relation to the execution of an arithmetic addition operation used for encryption/decryption of data, it should be noted that the invention is not limited to such application. For example, the invention may also be used in data signature applications for ensuring the authenticity of a digital document or message (for example in the field of files and software distributions, or for financial transactions).
(135) The invention may be applied to any type of cryptographic system executing at least one arithmetic addition as used in embedded systems such as smart cards, embedded secure devices, multimedia players, recorders, or mobile storage devices like memory cards and hard discs, the access to the embedded systems being monitored by the cryptosystem. The addition execution methods and devices may further be used in a wide range of communication and data processing applications such as in the car industry to ensure anti-theft protection, in service provider systems to secure access cards, in RFID™ tags and electronic keys, in mobile phone devices to authenticate the control and access to resources such as batteries and accessories, in the manufacturing of embedded devices and equipment to provide a protection of hardware and software algorithms against cloning and reverse engineering, in the banking industry to secure banking accounts and financial transactions, etc.
(136) In general, the routines executed to implement the embodiments of the invention, implemented as part of an operating system and/or a specific application, component, program, object, module or sequence of instructions, or even a subset thereof, may be referred to herein as “computer program code”, or simply “program code”. Program code typically comprises computer-readable instructions that are resident at various times in various memory and storage devices in a computer and that, when read and executed by one or more processors in a computer, cause that computer to perform the operations necessary to execute operations and/or elements embodying the various aspects of the embodiments of the invention. Computer-readable program instructions for carrying out operations of the embodiments of the invention may be, for example, assembly language or either source code or object code written in any combination of one or more programming languages.
(137) Various program code described herein may be identified based upon the application within which it is implemented in specific embodiments of the invention. However, it should be appreciated that any particular program nomenclature that follows is used merely for convenience, and thus the invention should not be limited to use solely in any specific application identified and/or implied by such nomenclature. Furthermore, given the generally endless number of manners in which computer programs may be organized into routines, procedures, methods, modules, objects, and the like, as well as the various manners in which program functionality may be allocated among various software layers that are resident within a typical computer (e.g., operating systems, libraries, APIs, applications, applets, etc.), it should be appreciated that the embodiments of the invention are not limited to the specific organization and allocation of program functionality described herein.
(138) The program code embodied in any of the applications/modules described herein is capable of being individually or collectively distributed as a program product in a variety of different forms. In particular, the program code may be distributed using a computer-readable storage medium having computer-readable program instructions thereon for causing a processor to carry out aspects of the embodiments of the invention.
(139) Computer-readable program instructions stored in a computer-readable medium may be used to direct a computer, other types of programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions that implement the functions, acts, and/or operations specified in the flow charts, sequence diagrams, and/or block diagrams. The computer program instructions may be provided to one or more processors of a general purpose computer, a special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the one or more processors, cause a series of computations to be performed to implement the functions, acts, and/or operations specified in the flow charts, sequence diagrams, and/or block diagrams.
(140) In certain alternative embodiments, the functions, acts, and/or operations specified in the flow charts, sequence diagrams, and/or block diagrams may be re-ordered, processed serially, and/or processed concurrently consistent with embodiments of the invention. Moreover, any of the flow charts, sequence diagrams, and/or block diagrams may include more or fewer blocks than those illustrated consistent with embodiments of the invention.
(141) While all of the disclosure has been illustrated by a description of various embodiments and while these embodiments have been described in considerable detail, it is not the intention of the Applicant to restrict or in any way limit the scope of the appended claims to such detail. Additional advantages and modifications will readily appear to those skilled in the art. The invention in its broader aspects is therefore not limited to the specific details, representative apparatus and method, and illustrative examples shown and described.
(142) Exhibit 1
(143) Exemplary code for applying the masking method to a TEA algorithm:
(144)
    #include <stdint.h>

    void encrypt (uint32_t* v, uint32_t* k) {
        uint32_t v0=v[0], v1=v[1], sum=0, i;           /* set up */
        uint32_t delta=0x9e3779b9;                     /* a key schedule constant */
        uint32_t k0=k[0], k1=k[1], k2=k[2], k3=k[3];   /* cache key */
        for (i=0; i < 32; i++) {                       /* basic cycle start */
            sum += delta;
            v0 += ((v1<<4) + k0) ^ (v1 + sum) ^ ((v1>>5) + k1);
            v1 += ((v0<<4) + k2) ^ (v0 + sum) ^ ((v0>>5) + k3);
        }                                              /* end cycle */
        v[0]=v0; v[1]=v1;
    }

    void decrypt (uint32_t* v, uint32_t* k) {
        uint32_t v0=v[0], v1=v[1], sum=0xC6EF3720, i;  /* set up */
        uint32_t delta=0x9e3779b9;                     /* a key schedule constant */
        uint32_t k0=k[0], k1=k[1], k2=k[2], k3=k[3];   /* cache key */
        for (i=0; i<32; i++) {                         /* basic cycle start */
            v1 -= ((v0<<4) + k2) ^ (v0 + sum) ^ ((v0>>5) + k3);
            v0 -= ((v1<<4) + k0) ^ (v1 + sum) ^ ((v1>>5) + k1);
            sum -= delta;
        }                                              /* end cycle */
        v[0]=v0; v[1]=v1;
    }
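As an illustration of how the encryption routine of Exhibit 1 might be computed on masked words, the following C sketch emulates, at word level, an adder in which the same mask bit m is applied to every bit position. It is a functional model under stated assumptions, not the annotated code of the specification and not a hardened implementation: the names `tea_ref`, `madd`, `mshl4`, `mshr5` and `tea_masked` are hypothetical, the mask word M is either 0 or 0xFFFFFFFF, and a software emulation of this kind does not by itself provide the side-channel protection of the hardware adder.

```c
#include <assert.h>
#include <stdint.h>

/* Reference TEA encryption (same computation as Exhibit 1). */
void tea_ref(uint32_t v[2], const uint32_t k[4])
{
    uint32_t v0 = v[0], v1 = v[1], sum = 0, delta = 0x9e3779b9u;
    for (int i = 0; i < 32; i++) {
        sum += delta;
        v0 += ((v1 << 4) + k[0]) ^ (v1 + sum) ^ ((v1 >> 5) + k[1]);
        v1 += ((v0 << 4) + k[2]) ^ (v0 + sum) ^ ((v0 >> 5) + k[3]);
    }
    v[0] = v0; v[1] = v1;
}

/* Masked addition: inputs x^M and y^M, output (x+y)^M, for M = 0 or 0xFFFFFFFF.
 * The masked carry-in (0 ^ m) is the low bit of M. */
uint32_t madd(uint32_t xm, uint32_t ym, uint32_t M) { return xm + ym + (M & 1u); }

/* Shifts move the mask with the data; re-mask the vacated bit positions. */
uint32_t mshl4(uint32_t xm, uint32_t M) { return (xm << 4) ^ (M & 0x0000000Fu); }
uint32_t mshr5(uint32_t xm, uint32_t M) { return (xm >> 5) ^ (M & 0xF8000000u); }

/* TEA encryption on words masked with the mask bit m (0 or 1). */
void tea_masked(uint32_t v[2], const uint32_t k[4], uint32_t m)
{
    assert(m <= 1);
    uint32_t M = m ? 0xFFFFFFFFu : 0u;
    uint32_t v0 = v[0] ^ M, v1 = v[1] ^ M;            /* mask the data */
    uint32_t sum = M, delta = 0x9e3779b9u ^ M;        /* masked 0 and delta */
    uint32_t k0 = k[0] ^ M, k1 = k[1] ^ M, k2 = k[2] ^ M, k3 = k[3] ^ M;
    for (int i = 0; i < 32; i++) {
        sum = madd(sum, delta, M);
        /* The XOR of three values masked by M is itself masked by M. */
        v0 = madd(v0, madd(mshl4(v1, M), k0, M) ^ madd(v1, sum, M)
                      ^ madd(mshr5(v1, M), k1, M), M);
        v1 = madd(v1, madd(mshl4(v0, M), k2, M) ^ madd(v0, sum, M)
                      ^ madd(mshr5(v0, M), k3, M), M);
    }
    v[0] = v0 ^ M; v[1] = v1 ^ M;                     /* unmask the result */
}
```

For both values of m, `tea_masked` produces the same ciphertext as `tea_ref`, because complementing both operands and the input carry of an addition complements its result, so no separate correction of the mask is computed.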