METHOD FOR EXECUTING A MACHINE CODE BY MEANS OF A MICROPROCESSOR
20220357944 · 2022-11-10
Assignee
Inventors
Cpc classification
G06F9/30185
PHYSICS
G06F21/556
PHYSICS
G06F9/223
PHYSICS
International classification
Abstract
A method for executing a machine code using a microprocessor includes, after an operation of decoding a current loaded instruction, constructing a mask from the signals generated by an instruction decoder in response to decoding of the current loaded instruction by the decoder. The constructed mask varies as a function of the current loaded instruction. Subsequently, before an operation of decoding a next loaded instruction, the next loaded instruction is unmasked using the constructed mask.
Claims
1. A method for executing a machine code using a microprocessor comprising a hardware pipeline for processing instructions, the hardware processing pipeline comprising an instruction loader, a decoder and an arithmetic logic unit, the method comprising, for each instruction of the machine code to be executed, successively: loading, using the instruction loader, an instruction designated by a program counter, to obtain a loaded instruction, then decoding the loaded instruction, using the decoder, to generate signals that configure the microprocessor to execute the loaded instruction, then, executing, using the arithmetic logic unit, the loaded instruction, wherein the method also comprises: after decoding a current loaded instruction, constructing a mask from signals generated by the decoder in response to decoding of the current loaded instruction, the constructed mask thus varying as a function of the current loaded instruction, then before decoding a next loaded instruction, unmasking the next loaded instruction using the constructed mask.
2. The method according to claim 1, wherein: in response to detection that a loaded instruction is a branch instruction that, when executed by the arithmetic logic unit, replaces a value of a program counter with a new value, the new value depending on operands of the branch instruction, the method comprises unmasking the next loaded instruction using a pre-recorded jump mask that is constant and identical for all branch instructions executed by the microprocessor, and in response to an absence of detection that the loaded instruction is a branch instruction, the method comprises unmasking the next loaded instruction using the constructed mask.
3. The method according to claim 1, wherein the method comprises: suspension of execution of the decoding operation for one or more cycles of a clock of the microprocessor, and in response to suspension of the execution of the decoding operation, the method comprises storing the constructed mask in a register, and when execution of the decoding operation restarts, the next loaded instruction is unmasked using the constructed mask stored in the register.
4. The method according to claim 1, comprising: in response to interruption of the execution of a first machine code, and to triggering of execution of a second machine code, saving an execution context of the first machine code, wherein saving the execution context of the first machine code comprises saving the constructed mask, then in response to restart of the execution of the first machine code, and to interruption of the execution of the second machine code, restoring the execution context of the first machine code, wherein restoring the execution context of the first machine code comprises writing, to a register of the microprocessor, the saved constructed mask, then unmasking the next loaded instruction following the restart of the execution of the first machine code using the constructed mask written to the register.
5. The method according to claim 1, wherein unmasking the instruction loaded using the constructed mask comprises XORing bits of the loaded instruction and bits of the constructed mask.
6. The method according to claim 1, wherein the signals generated by the decoder in response to decoding of the current loaded instruction are signals that vary as a function of an opcode of the current loaded instruction.
7. A machine code executable by a microprocessor, this machine code comprising at least one basic block comprising solely instructions to be systematically executed one after the other, wherein each instruction of the basic block, except for a first instruction, is masked using a mask constructed from an immediately preceding instruction in the basic block, the constructed mask being identical to a mask constructed after decoding the instruction that immediately preceded the instruction in question during execution of this machine code according to claim 1.
8. The machine code according to claim 7, wherein the first instruction of the basic block is masked using a same jump mask as that used during execution of the machine code.
9. A non-transitory data storage medium readable by a microprocessor, wherein this medium comprises a machine code according to claim 7.
10. A microprocessor, comprising a hardware pipeline for processing instructions, the hardware processing pipeline comprising an instruction loader, a decoder and an arithmetic logic unit, the hardware processing pipeline being configured, for each instruction of machine code to be executed, to successively execute: loading, using the instruction loader, an instruction designated by a program counter, to obtain a loaded instruction, then decoding the loaded instruction, using the decoder, to generate signals that configure the microprocessor to execute the loaded instruction, then, executing, using the arithmetic logic unit, the loaded instruction, wherein the hardware processing pipeline also comprises a hardware demasking module configured to: after decoding a current loaded instruction, construct a mask from signals generated by the decoder in response to decoding of the current loaded instruction, the constructed mask thus varying as a function of the current loaded instruction, then before decoding a next loaded instruction, unmask the next loaded instruction using the constructed mask.
11. A compiler configured to automatically convert a source code of a computer program into a binary code comprising a machine code executable by a microprocessor comprising a hardware instruction processing pipeline, the hardware processing pipeline comprising an instruction loader, a decoder and an arithmetic logic unit, wherein the compiler is able to automatically convert the source code into a machine code according to claim 7.
12. The compiler according to claim 11, wherein the compiler is configured to automatically insert an unconditional branch instruction that causes a unit jump, in the machine code, immediately before each instruction of the machine code that is a destination of a branch instruction that, when executed by the arithmetic logic unit, causes a jump of a plurality of instructions.
Description
[0016] The invention will be better understood on reading the following description, which is given solely by way of non-limiting example, with reference to the drawings, in which:
[0017]
[0018]
[0019]
[0020]
Section I: Notations and Definitions
[0021] In these figures, the same references have been used to designate the same elements. In the rest of this description, features and functions that are well known to those skilled in the art will not be described in detail.
[0022] In this description, the following definitions have been adopted.
[0023] A “program” designates a set of one or more pre-set functions that it is desired to have executed by a microprocessor.
[0024] A “source code” is a representation of the program in a computer language, not being able to be executed directly by a microprocessor and being intended to be converted, by a compiler, into a machine code able to be executed directly by the microprocessor.
[0025] A program or a code is said to be “able to be executed directly” or “directly executable” when it is able to be executed by a microprocessor without this microprocessor needing to compile it beforehand by way of a compiler or to interpret it by way of an interpreter.
[0026] An “instruction” denotes a machine instruction able to be executed by a microprocessor. Such an instruction consists:
[0027] of an opcode, or operation code, that codes the nature of the operation to be executed, and
[0028] of one or more operands defining the value(s) of the parameters of this operation.
[0029] The “value of an instruction” is a digital value obtained, using a bijective function, from the succession of “0's” and “1's” that code, in machine language, this instruction. This bijective function may be the identity function.
[0030] A “machine code” is a set of machine instructions. It typically is a file containing a sequence of bits with the value “0” or “1”, these bits coding the instructions to be executed by the microprocessor. The machine code is able to be executed directly by the microprocessor, that is to say without the need for a preliminary compilation or interpretation. The machine code comprises a succession of instructions organized one after another and that forms an ordered sequence of instructions in the machine code. The machine code starts with an initial instruction and ends with a final instruction. With respect to a given instruction I.sub.i of the machine code, the instruction I.sub.i−1 located on the side of the initial instruction is called the “previous instruction” and the instruction I.sub.i+1 located on the side of the final instruction is called the “following instruction”. The index “i” is the order number of instruction I.sub.i in the machine code. In this text, this machine code is divided into a sequence of basic blocks that are immediately consecutive or separated by data blocks.
[0031] A “binary code” is a file containing a sequence of bits bearing the value “0” or “1”. These bits code data and instructions to be executed by the microprocessor. The binary code thus comprises at least one machine code and also, in general, digital data processed by this machine code.
[0032] An “instruction stream” is a succession of instructions executed one after the other.
[0033] In this text, a “basic block” is a group of instructions of the machine code that are systematically executed one after the other. A basic block starts at a branch address and ends with a single explicit or implicit branch instruction. An explicit branch instruction is characterized by the explicit presence of an opcode in the machine code that codes the branch instruction. An implicit branch instruction corresponds to the case where execution of a previous basic block systematically continues with execution of a following basic block located, in the machine code, immediately after the previous basic block. In this case, given that in the absence of explicit branch instruction the instructions of the machine code are executed in order one after the other, it is not necessary to insert, at the end of the previous basic block, an explicit instruction to branch to the following basic block. In this description, the previous basic block is, in this case, said to end with an implicit branch instruction because this instruction is not explicitly coded into the machine code. In this case, the previous basic block ends just before the branch address of the following basic block.
[0034] In this patent application, the expression “branch instruction” designates an explicit branch instruction unless otherwise mentioned. The execution of a basic block thus systematically starts with the execution of the instruction located at its branch address and systematically ends with the execution of the branch instruction that ends this basic block. A basic block does not contain any other branch instructions than the one situated at the end of this basic block. The instructions of a basic block are thus systematically all executed by the microprocessor one after another in the order in which they are present in this basic block. The branch instruction, when it is executed, may systematically direct the control flow to the same branch address or, alternatively, to different branch addresses. The latter scenario occurs for example when, at the end of the executed basic block, the control flow is able to continue to a first and, alternatively, to a second basic block.
[0035] A “branch instruction” is an instruction that, when it is executed by the microprocessor, triggers a jump to the branch address of another basic block. This branch instruction therefore comprises at least as parameter the branch address of this other basic block. Typically, for this purpose, this instruction replaces the current value of the program counter with the value of the branch address. It is recalled that the program counter contains the address of the next instruction to be executed by the microprocessor. In the absence of a branch instruction, each time an instruction is executed, the program counter is incremented by the size of the instruction currently being executed. In the absence of a branch instruction, the instructions are systematically executed sequentially one after another in the order in which they are recorded in a main memory, i.e. in the order of their index “i”. The branch instruction may be unconditional, that is to say that the jump to the branch address is performed systematically as soon as this instruction is executed. An unconditional branch instruction is for example the instruction “JMP” in assembly language for microprocessors of the x86 series. The branch instruction may also be conditional, that is to say that the jump to the branch address is triggered when it is executed only if a particular condition is met. For example, a conditional branch instruction is a “JE”, “JA” or “JNE” instruction in assembly language. The branch instruction may equally be a call to a function. In this text, the term “branch instruction” denotes both direct and indirect branch instructions. A direct branch instruction is a branch instruction that directly contains the numerical value of the branch address. An indirect branch instruction is a branch instruction to a branch address contained in a memory or a register of the microprocessor. 
Thus, unlike a direct branch instruction, an indirect branch instruction does not directly contain the numerical value of the branch address.
[0036] A “branch address” is the address in the main memory where the first executed instruction of a basic block is located. Below, reference will be made to a branch address even with respect to basic blocks the first instruction of which is executed following execution of an implicit branch instruction.
[0037] The expression “execution of a function” is understood to designate execution of the instructions making up this function.
Section II: Examples of Embodiment
[0038]
[0039] The microprocessor 2 here comprises:
[0040] a hardware pipeline 10 for processing the instructions to be executed;
[0041] a set 12 of registers;
[0042] a control module 14; and
[0043] a data input/output interface 16.
[0044] The memory 4 is configured so as to store instructions of a binary code 30 of a program to be executed by the microprocessor 2. The memory 4 is a random access memory. The memory 4 is typically a volatile memory. The memory 4 may be a memory external to the microprocessor 2, as shown in
[0045] In this example of embodiment, the binary code 30 in particular comprises a machine code 32.
[0046] By way of illustration, the microprocessor 2 has an ARMv7 (Advanced RISC Machines, version 7) architecture and supports instruction sets such as Thumb1 and/or Thumb2. An instruction set exhaustively defines the syntax of the instructions that the microprocessor 2 is capable of executing. This instruction set therefore in particular defines all of the opcodes possible for an instruction. The syntax of an instruction is incorrect if it corresponds to none of the possible syntaxes of an instruction executable by the microprocessor 2. For example, if the bit range of an instruction I.sub.d that corresponds to the bit range used to code the opcode of the instruction contains a value that is different from all the possible values of an opcode, then its syntax is incorrect.
[0047] The pipeline 10 allows processing of an instruction of the machine code to begin while the previous instruction of this machine code is still being processed by the pipeline 10. Such processing pipelines are well-known and only elements of the pipeline 10 that are required to understand the invention will be described in detail.
[0048] The pipeline 10 typically comprises the following stages:
[0049] an instruction loader 18,
[0050] an instruction decoder 20, and
[0051] an arithmetic logic unit 22 that executes the instructions.
[0052] The loader 18 loads the next instruction to be executed by the unit 22 from the memory 4. More precisely, the loader 18 loads the instruction of the machine code 32 to which a program counter 26 points. Unless its value is modified by executing a branch instruction, the value of the program counter 26 is incremented by a regular increment on each cycle of a clock of the microprocessor. The regular increment is equal to the difference between the addresses of two immediately consecutive instructions in the machine code 32. This amount is called the “unit increment” below.
[0053] The decoder 20 decodes the instruction loaded by the loader 18 to obtain configuration signals that configure the microprocessor 2 so that it executes, in the next clock cycle, the loaded instruction. One of these configuration signals codes the nature of the operation to be executed by the unit 22. This configuration signal corresponds to the opcode of the loaded instruction. Other configuration signals indicate, for example, whether the loaded instruction is an instruction to load a datum from the memory 4 or to write a datum to the memory 4. These configuration signals are transmitted to the unit 22. Other configuration signals comprise the values of the loaded operands. Depending on the instruction to be executed, the signals are transmitted to the set 12 of registers or to the unit 22.
[0054] When the decoder 20 is unable to decode an instruction, it generates an error signal. Typically, this occurs if the syntax of the loaded instruction is incorrect.
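By way of non-limiting illustration, the behaviour of the decoder 20 when faced with an incorrect syntax may be sketched in Python as follows. The 16-bit instruction width, the split into an 8-bit opcode field and an 8-bit operand field, the opcode table and the names `decode` and `DecodeFault` are assumptions made solely for this sketch; they do not correspond to any real instruction set.

```python
# Hypothetical opcode table for an 8-bit opcode field (not a real ISA).
VALID_OPCODES = {0x20: "MOV", 0x18: "ADD", 0xE0: "B"}


class DecodeFault(Exception):
    """Stands in for the error signal generated by the decoder 20."""


def decode(instruction: int) -> dict:
    """Split an assumed 16-bit instruction into its configuration signals."""
    opcode = (instruction >> 8) & 0xFF   # assumed position of the opcode
    operand = instruction & 0xFF         # assumed position of the operand
    if opcode not in VALID_OPCODES:
        # The opcode bit range holds a value that matches no possible opcode.
        raise DecodeFault(f"incorrect syntax: opcode 0x{opcode:02X}")
    return {"opcode": opcode, "operand": operand}


signals = decode(0x2001)
```

In this sketch, a value such as 0xFF00 raises `DecodeFault` because 0xFF is absent from the assumed opcode table.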
[0055] The unit 22 executes the loaded instructions one after another. The unit 22 is also capable of storing the result of these executed instructions in one or more of the registers of the set 12.
[0056] In this description, “execution by the microprocessor 2” and “execution by the unit 22” will be used synonymously.
[0057] A given instruction I.sub.i of the machine code must successively be processed, in order, by the loader 18, the decoder 20 and the unit 22. In addition, the loader 18, the decoder 20 and the unit 22 are capable of working in parallel with one another. Thus, at a given time, the loader 18 may be in the process of loading the following instruction I.sub.i+1, the decoder 20 in the process of decoding the instruction I.sub.i and the unit 22 in the process of executing the previous instruction I.sub.i−1. The pipeline 10 thus allows at least three instructions of the machine code 32 to be processed in parallel.
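By way of non-limiting illustration, the parallel operation of the three stages may be sketched in Python as follows. The function name `pipeline_occupancy` and the convention that the loader 18 handles the instruction of index equal to the cycle number are assumptions of this sketch, which models an ideal pipeline with no suspension and no jump.

```python
def pipeline_occupancy(cycle: int) -> dict:
    """Index of the instruction held by each stage at a given clock cycle.

    Assumes an ideal pipeline in which the loader 18 handles instruction
    I_cycle; None means that the stage has not yet received an instruction.
    """
    def index_or_none(i: int):
        return i if i >= 0 else None

    return {
        "loader 18": index_or_none(cycle),        # loading I_(i+1)
        "decoder 20": index_or_none(cycle - 1),   # decoding I_i
        "unit 22": index_or_none(cycle - 2),      # executing I_(i-1)
    }


for cycle in range(4):
    print(cycle, pipeline_occupancy(cycle))
```

From the third cycle onwards, each stage holds a different instruction, which is the situation described above.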
[0058] In addition, the pipeline 10 comprises a hardware module 28 for unmasking the instructions loaded by the loader 18.
[0059] The module 28 is capable of automatically executing the following operations:
[0060] 1) in response to decoding, by the decoder 20, of an instruction I.sub.i, the module 28 constructs a mask M.sub.i,
[0061] 2) before starting to decode the following instruction I.sub.i+1, the module 28 unmasks the instruction I.sub.i+1, using a current mask M.sub.c chosen from the group made up of the constructed mask M.sub.i and of a jump mask MJ, before transmitting the unmasked instruction to the decoder 20.
[0062] To carry out operation 1) above, the module 28 implements and executes a pre-programmed function F.sub.CM(I.sub.i). To carry out operation 2) above, the module 28 implements and executes a pre-programmed demasking function F.sub.D(I*.sub.i+1, M.sub.c), where:
[0063] M.sub.c is the current mask chosen from the group made up of the constructed mask M.sub.i and of the jump mask MJ, and
[0064] I*.sub.i+1 is the masked instruction I.sub.i+1, i.e. the instruction such as it is before being unmasked by the module 28.
[0065] Below, the notation “I*” designates the masked instruction I and the notation “I”, without the symbol “*”, designates the instruction I in the clear, or the cleartext instruction, i.e. the result of the function F.sub.D(I*, M.sub.c).
[0066] These functions F.sub.CM( ) and F.sub.D( ) are secret functions. To this end, there is no machine code for executing these functions in a memory located outside of the microprocessor 2. Typically, they are implemented in hardware form inside the module 28.
[0067] The function F.sub.CM( ) is a function that constructs the mask M.sub.i from the configuration signals generated by the decoder 20 when the instruction I.sub.i is decoded. Here, the mask M.sub.i is coded on the same number of bits as the instruction I.sub.i.
[0068] In this embodiment, the function F.sub.D( ) is a function that combines each bit of the instruction I*.sub.i+1 with the bit located at the same position in the current mask M.sub.c. Here, the function F.sub.D( ) is defined by the following relationship: I.sub.i+1=I*.sub.i+1 XOR M.sub.c, where the symbol “XOR” designates the “EXCLUSIVE OR” logic operation.
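By way of non-limiting illustration, the bitwise relationship defining the function F.sub.D( ) may be sketched in Python as follows. The 16-bit width, the name `f_d` and the example values are assumptions made solely for this sketch.

```python
WIDTH = 16                      # assumed instruction width in bits
WORD_MASK = (1 << WIDTH) - 1


def f_d(masked_instruction: int, current_mask: int) -> int:
    """Bitwise XOR of the masked instruction I* with the current mask M_c."""
    return (masked_instruction ^ current_mask) & WORD_MASK


# XOR is its own inverse, so the same operation both masks and unmasks.
cleartext = 0x1C0A              # arbitrary example instruction value
mask = 0xA5A5                   # arbitrary example mask value
masked = f_d(cleartext, mask)
recovered = f_d(masked, mask)
```

Because the XOR operation is its own inverse, applying `f_d` twice with the same mask returns the cleartext value, so a tool that prepares the masked machine code could, under these assumptions, reuse the same operation.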
[0069] In this example of embodiment, the set 12 comprises general registers that are usable to store any type of data, and dedicated registers. In contrast to the general registers, the dedicated registers are dedicated to storing particular data that are generally automatically generated by the microprocessor 2.
[0070] The module 14 is configured so as to move data between the set 12 of registers and the interface 16. The interface 16 is notably able to acquire data and instructions, for example from the memory 4 and/or the medium 6 that are external to the microprocessor 2.
[0071] The microprocessor 2 here comprises a bus 24 that links the various components of the microprocessor 2 to one another.
[0072] The medium 6 is typically a non-volatile memory. It is for example an EEPROM or Flash memory. Here, it contains a backup copy 40 of the binary code 30. It is typically this copy 40 that is automatically copied to the memory 4 to restore the code 30, for example after a power failure or the like, or just before the execution of the code 30 starts.
[0073] The machine code 32 is formed of a sequence of basic blocks that have to be executed one after another.
[0074]
[0075] a register 30 in which is stored the mask M.sub.i constructed from the configuration signals generated by the decoder 20 at the end of decoding of the instruction I.sub.i,
[0076] a register 32 in which is stored the mask MJ,
[0077] a multiplexer 34 comprising two inputs for receiving the mask M.sub.i and the mask MJ, respectively,
[0078] a control circuit 36 that selects, depending on the various received signals, the current mask M.sub.c, from the masks M.sub.i and MJ, that will be delivered to the output of the multiplexer 34,
[0079] a logic gate 38 that XOR's the current mask M.sub.c delivered to the output of the multiplexer 34 and the instruction I.sub.i+1 loaded by the loader 18, and
[0080] a logic gate 40 that NAND's a signal S.sub.F and a signal S.sub.D and delivers, to an output, the result of this logic operation.
[0081] The signal S.sub.F is a Boolean signal that takes the value “1” when operation of the decoder 20 is interrupted for a plurality of clock cycles. Such an interruption is for example necessary when waiting for the result of other computations carried out, for example, by the unit 22. This particular case occurs for example when a conditional branch instruction is executed. Specifically, in this case, it is not possible to indicate to the loader 18 which is the next instruction to be loaded until it is known whether or not the execution of this conditional branch instruction will cause a jump of a plurality of instructions of the machine code 32.
[0082] The signal S.sub.D is a Boolean signal that takes the value “1” when decoding of the instruction has ended and the configuration signals are ready to be used to execute this instruction.
[0083] Thus, the signal output from the gate 40 takes the value “0” only when operation of the loader 18 and decoder 20 has been interrupted and configuration signals generated by the decoder 20 are ready to be used.
[0084] An input 44 of the register 30 is connected to the output of the gate 40. Provided that this input 44 receives a signal equal to “1”, in each clock cycle, the register 30 stores the value of the configuration signals generated by the decoder 20. The value stored in the register 30 is coded on as many bits as there are bits in the mask M.sub.i. Here, the value stored in the register 30 is the value of the mask M.sub.i constructed in response to decoding of the instruction I.sub.i by the decoder 20. In contrast, when the input 44 receives a signal equal to “0”, no new value is stored in the register 30. Thus, in the latter case, the last mask M.sub.i constructed is stored in memory for a plurality of clock cycles and in particular as long as the signal S.sub.F is equal to “1”.
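By way of non-limiting illustration, the interaction between the gate 40 and the register 30 may be sketched in Python as follows. The names `gate_40` and `Register30`, the method `clock` and the example values are assumptions of this sketch; the configuration signals are represented by a single integer.

```python
def gate_40(s_f: int, s_d: int) -> int:
    """NAND of the Boolean signals S_F and S_D."""
    return 0 if (s_f == 1 and s_d == 1) else 1


class Register30:
    """Holds the last constructed mask; stores a new value only when enabled."""

    def __init__(self, initial: int = 0):
        self.value = initial

    def clock(self, input_44: int, config_signals: int) -> None:
        # A new mask is stored only if the input 44 receives "1".
        if input_44 == 1:
            self.value = config_signals


register_30 = Register30()
register_30.clock(gate_40(s_f=0, s_d=1), 0x1234)  # normal cycle: mask stored
register_30.clock(gate_40(s_f=1, s_d=1), 0xFFFF)  # suspension: mask held
```

After the second call, the register still contains 0x1234: the last constructed mask is kept while the signal S.sub.F is equal to “1”.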
[0085] The circuit 36 selects the register 32 if the instruction decoded by the decoder 20 is an unconditional branch instruction. The circuit 36 also selects the register 32 if the instruction executed by the unit 22 is a conditional branch instruction that, when it is executed, causes a jump of a plurality of instructions of the machine code. To this end, the circuit 36 receives, on the one hand, a signal S.sub.20 generated by the decoder 20 and, on the other hand, a signal S.sub.22 generated by the unit 22. The signal S.sub.20 allows the circuit 36 to identify the instruction that has just been decoded by the decoder 20 and therefore to identify whether it is an unconditional branch instruction. For example, the signal S.sub.20 contains the opcode of the instruction that has just been decoded.
[0086] The signal S.sub.22 takes a particular value when the unit 22 has just executed a conditional branch instruction the condition of which was met. In this case, this causes a jump of a plurality of instructions of the machine code. In contrast, when this signal S.sub.22 does not take this particular value, this means that the condition of the conditional branch instruction was not met. In the latter case, the next instruction executed by the microprocessor is the instruction that immediately follows this conditional branch instruction in the machine code.
[0087] When the circuit 36 selects the register 32, then it controls the multiplexer 34 so that the latter delivers, to its output, the content of the register 32, i.e. the mask MJ. In contrast, when the circuit 36 selects the register 30, it is the mask M.sub.i that is delivered to the output of the multiplexer 34.
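By way of non-limiting illustration, the selection carried out by the circuit 36 and the multiplexer 34 may be sketched in Python as follows. The value given to the mask MJ, the function name `select_current_mask` and the reduction of the signal S.sub.20 to a Boolean indicating an unconditional branch are assumptions made solely for this sketch.

```python
MJ = 0x5A5A  # arbitrary value standing in for the pre-recorded jump mask


def select_current_mask(m_i: int, s_20_is_unconditional_branch: bool,
                        s_22: int) -> int:
    """Model of the circuit 36 driving the multiplexer 34.

    Returns the current mask M_c: the mask MJ (register 32) after an
    unconditional branch or a taken conditional branch, otherwise the
    constructed mask M_i (register 30).
    """
    select_register_32 = s_20_is_unconditional_branch or s_22 == 1
    return MJ if select_register_32 else m_i


sequential = select_current_mask(0x1234, False, 0)
after_unconditional_branch = select_current_mask(0x1234, True, 0)
after_taken_conditional_branch = select_current_mask(0x1234, False, 1)
```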
[0088]
[0089] The method starts with a step 150 of providing the binary code 30 to the memory 4. To do this, for example, the microprocessor 2 copies the copy 40 to the memory 4 in order to obtain the binary code 30 stored in the memory 4. Beforehand, this binary code 30 will have been generated by the compiler of
[0090] Next, in a phase 152, the microprocessor 2 executes the binary code 30 and, in particular, the machine code 32.
[0091] To do this, for each instruction I*.sub.i pointed to by the program counter 26, the pipeline 10 successively executes the following steps:
[0092] a step 154 of loading, by means of the loader 18, the instruction I*.sub.i pointed to by the current value of the program counter 26, then
[0093] a step 156 of unmasking, by means of the module 28, the loaded instruction I*.sub.i to obtain the cleartext instruction I.sub.i, then
[0094] a step 158 of decoding, by means of the decoder 20, the cleartext instruction I.sub.i, then
[0095] a step 160 of executing, by means of the unit 22, the decoded instruction I.sub.i.
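By way of non-limiting illustration, steps 154 to 160 may be sketched in Python for a single basic block as follows. The function F.sub.CM( ) is secret, so the function `f_cm` below is only a placeholder that returns the instruction word itself; the 16-bit width, the value of the mask MJ, the example values and the assumption that the basic block is entered by a jump are likewise assumptions of this sketch. The steps are applied sequentially rather than in parallel.

```python
WORD_MASK = 0xFFFF   # assumed 16-bit instructions
MJ = 0x5A5A          # arbitrary value standing in for the jump mask


def f_cm(cleartext_instruction: int) -> int:
    """Placeholder for the secret function F_CM (assumption of this sketch)."""
    return cleartext_instruction & WORD_MASK


def run_basic_block(masked_block: list) -> list:
    """Apply steps 154 to 160 in turn to each instruction of a basic block."""
    executed = []
    current_mask = MJ                                # block entered by a jump
    for masked in masked_block:                      # step 154: load I*_i
        clear = (masked ^ current_mask) & WORD_MASK  # step 156: unmask
        current_mask = f_cm(clear)                   # step 158: decode, build M_i
        executed.append(clear)                       # step 160: execute (recorded)
    return executed


masked_block = [0x7A5B, 0x0103, 0x3942]              # arbitrary masked values
executed = run_basic_block(masked_block)
```

Each instruction can be unmasked only once the preceding instruction has been decoded, since the mask depends on the decoded instruction.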
[0096] Steps 154, 158 and 160 are typically each executed in one clock cycle. In addition, they may be executed in parallel for various successive instructions of the machine code 32. Thus, the pipeline 10 is able to execute in parallel:
[0097] step 154 for an instruction I*.sub.i+1,
[0098] step 158 for an instruction I.sub.i, and
[0099] step 160 for an instruction I.sub.i−1.
[0100] Thus, the pipeline 10 allows one instruction to be executed per clock cycle. However, as indicated above, there are situations in which it is necessary to suspend execution of steps 154 and 158 for a plurality of clock cycles. In this case, the signal S.sub.F is set to “1” throughout the clock cycles in which execution of steps 154 and 158 is suspended.
[0101] At the end of step 158 of decoding the instruction I.sub.i, the decoder 20 transmits to the circuit 36 the signal S.sub.20. This signal S.sub.20 makes it possible to detect whether the instruction I.sub.i that has just been decoded is an unconditional branch instruction.
[0102] In addition, in step 158, if an instruction I.sub.i cannot be decoded because its syntax is incorrect, the method continues with a step 170 of signalling an execution fault. In step 170, the decoder 20 triggers signalling of an execution fault.
[0103] In response to such signalling, in a step 172, the microprocessor 2 implements one or more countermeasures. A wide range of countermeasures are possible. The countermeasures implemented may have very different degrees of severity. For example, the countermeasures that are implemented may range from simply displaying an error message without interrupting the normal execution of the machine code 32 to definitively disabling the microprocessor 2. The microprocessor 2 is considered to be disabled when it is definitively put into a state in which it is incapable of executing any machine code. Between these extreme degrees of severity, there are many other possible countermeasures, such as:
[0104] indicating detection of the faults via a human-machine interface,
[0105] immediately interrupting the execution of the machine code 32 and/or reinitializing it, and
[0106] deleting the machine code 32 from the memory 4 and/or deleting the backup copy 40 and/or deleting the secret data.
[0107] In step 160, if the executed instruction is a conditional branch instruction and if execution of this instruction causes a jump of a plurality of instructions, then the unit 22 generates a signal S.sub.22 that indicates to the circuit 36 that execution of this conditional branch instruction has caused a jump of a plurality of instructions.
[0108] To do this, typically, during execution of the conditional branch instruction, a condition is tested. If this condition is not met, execution of this conditional branch instruction causes no instruction jump. In this case, the program counter is simply incremented by a unit increment, i.e. by a single instruction. It is therefore the following instruction I.sub.i+1 of the machine code 32 that is executed. In this case, the signal S.sub.22 for example remains equal to “0”.
[0109] If, in contrast, the condition of the conditional branch instruction is met, execution of this instruction by the unit 22 causes a jump of a plurality of instructions of the machine code 32. In this case, typically, a new value is written to the register containing the program counter. The difference between this new value and the previous value corresponds to a plurality of unit increments of the program counter. In the latter case, the signal S.sub.22 for example takes the value “1”.
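By way of non-limiting illustration, the effect of a conditional branch instruction on the program counter and on the signal S.sub.22 may be sketched in Python as follows. The unit increment of 2, the addresses and the function name `execute_conditional_branch` are assumptions made solely for this sketch.

```python
UNIT_INCREMENT = 2  # assumed size of one instruction, in address units


def execute_conditional_branch(program_counter: int, branch_address: int,
                               condition_met: bool) -> tuple:
    """Return the new program counter value and the signal S_22."""
    if condition_met:
        # Jump of a plurality of instructions: the program counter is replaced.
        return branch_address, 1
    # No jump: the program counter advances by a single unit increment.
    return program_counter + UNIT_INCREMENT, 0


taken = execute_conditional_branch(0x100, 0x180, condition_met=True)
not_taken = execute_conditional_branch(0x100, 0x180, condition_met=False)
```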
[0110] Step 156 is executed between the end of execution of step 158 for the instruction I.sub.i and the start of execution of step 158 for the next instruction I.sub.p. The next instruction I.sub.p is:
[0111] either the following instruction I.sub.i+1 when execution of the instruction I.sub.i does not cause a jump of a plurality of instructions,
[0112] or another instruction of the machine code, different from the instruction I.sub.i+1, when execution of the instruction I.sub.i causes a jump of a plurality of instructions.
[0113] Step 156 mainly comprises an operation 164 of constructing the mask M.sub.i and an operation 162 of unmasking the next instruction I.sub.p loaded by the loader 18.
[0114] Operation 164 is executed each time the decoder 20 finishes decoding an instruction and, at the same time, the signal S.sub.F is equal to “0”.
[0115] In operation 164, the module 28 stores, in the register 30, the values of a predetermined set of configuration signals of the decoder 20. Preferably, this set of configuration signals contains:
[0116] the signal that codes the opcode of the instruction I.sub.i that has just been decoded, and
[0117] the signals that code the values of the operands of the instruction I.sub.i.
[0118] Here, storage of these configuration signals is triggered, for example, each time a clock signal ends unless the signal S.sub.F is equal to “1”. Thus, a new mask M.sub.i is constructed each time a new instruction I.sub.i is decoded by the decoder 20. This new mask M.sub.i varies as a function of the decoded instruction and, in particular, as a function of its opcode and of the values of its operands. Given that the instructions decoded one after another are generally different, the constructed mask M.sub.i is different on each execution of operation 164.
[0119] In the case where the signal S.sub.F is equal to “1”, no new configuration signal is stored in the register 30 and hence the register 30 contains the value of the last mask M.sub.i constructed.
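Operation 164 can be pictured with the following minimal Python sketch. It is only an illustration under stated assumptions: the class name `MaskBuilder`, the 32-bit mask width and the way the opcode and operand signals are concatenated are hypothetical, since the text does not specify how the configuration signals are laid out in the register 30.

```python
class MaskBuilder:
    """Toy model of the module 28 and of the register 30."""

    def __init__(self, initial_mask=0):
        self.register_30 = initial_mask

    def on_decode_done(self, opcode_signal, operand_signal, s_f):
        # Operation 164: store the decoder signals unless S_F equals 1.
        if s_f == 0:
            # Assumed layout: opcode in the upper 16 bits, operands in the lower 16.
            self.register_30 = ((opcode_signal << 16) | operand_signal) & 0xFFFFFFFF
        # When S_F equals 1, the register keeps the last constructed mask.
        return self.register_30
```

In this model, a call with `s_f=1` leaves the register unchanged, so the value returned is the mask constructed for the last instruction decoded while S.sub.F was equal to “0”.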
[0120] Operation 162 is executed each time the decoder 20 decodes a new loaded instruction I*.sub.p. It will be recalled here that, since each instruction of the machine code 32 is masked, loaded instructions are masked.
[0121] In operation 162, the circuit 36 first selects, from the masks M.sub.i and MJ, which are stored in the registers 30 and 32, the mask that must be used as current mask M.sub.c. More precisely, the circuit 36 selects the mask MJ only in the following two cases:
[0122] case 1: the signal S.sub.20 corresponds to the signal generated when the instruction I.sub.i that has just been decoded by the decoder 20 is an unconditional branch instruction, and
[0123] case 2: the received signal S.sub.22 is equal to “1”, this meaning that the branch instruction I.sub.i that has just been executed by the unit 22 caused a jump of a plurality of instructions.
[0124] In any case other than the above two, the circuit 36 systematically selects the mask M.sub.i. Here, to select the mask MJ, the circuit 36 controls the multiplexer 34 so that it delivers, to its output, the content of the register 32. Similarly, to select the mask M.sub.i, the circuit 36 controls the multiplexer 34 to deliver, this time round, the content of the register 30 to its output. Thus, the output of the multiplexer 34 delivers the current mask M.sub.c.
[0125] Next, the gate 38 executes the unmasking function F.sub.D(I*.sub.p; M.sub.c). Here, to do this, the gate 38 “XOR's” the bits of the mask M.sub.c delivered to the output of the multiplexer 34 and the corresponding bits of the loaded instruction I*.sub.p delivered by the loader 18 at the same time. The result of the function F.sub.D( ), i.e. the cleartext instruction I.sub.p, is delivered to the input of the decoder 20, to be decoded in the next clock cycle.
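Operation 162, i.e. selection of the current mask followed by the XOR unmasking, can be sketched in Python as follows. The value chosen for the mask MJ, the 32-bit instruction width and the function names `select_mask` and `unmask` are assumptions made for illustration; only the selection rule and the XOR come from the description.

```python
MJ = 0xA5A5A5A5  # assumed value of the constant jump mask held in the register 32


def select_mask(m_i, unconditional_branch_decoded, s22):
    """Model of the circuit 36 driving the multiplexer 34."""
    # MJ is selected only in the two cases listed in the description.
    if unconditional_branch_decoded or s22 == 1:
        return MJ
    return m_i


def unmask(masked_instruction, m_c):
    """Model of the gate 38: bitwise XOR of the loaded instruction and the mask."""
    return masked_instruction ^ m_c


# Usage: an instruction masked with M_i is recovered when M_i is selected.
cleartext = 0x00A00513
m_i = 0x00120034
recovered = unmask(cleartext ^ m_i, select_mask(m_i, False, 0))
```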
[0126] Thus, when the decoded instruction I*.sub.i is not a branch instruction, the next instruction I*.sub.p is unmasked using the mask M.sub.i, i.e. using a mask the value of which depends on the value of the previous instruction I.sub.i decoded. If at that point, the decoder 20 is subjected to a fault attack such that the configuration signals generated do not correspond to those expected, i.e. to those corresponding to correct decoding of the instruction I.sub.i, then the module 28 constructs a mask M.sub.D that is different from the expected mask M.sub.i. This mask M.sub.D is then used to unmask the next instruction I*.sub.p. As the mask M.sub.D is different from the expected mask M.sub.i, the unmasked instruction I.sub.D+1 obtained is different from the expected cleartext instruction I.sub.p. There are then two particular cases:
[0127] Case 1: The instruction I.sub.D+1 is an instruction the syntax of which is incorrect. In this case, the method continues with step 170.
[0128] Case 2: The instruction I.sub.D+1 is an instruction the syntax of which is correct. In this second case, decoding of the instruction I.sub.D+1 causes no error and the module 28 constructs a new mask M.sub.D+1 from the configuration signals generated in response to decoding of this instruction I.sub.D+1. Since the decoded instruction I.sub.D+1 is different from the expected instruction I.sub.p, the constructed mask M.sub.D+1 is different from the expected mask M.sub.p. Hence, execution of operation 162 generates an instruction I.sub.D+2 different from the expected instruction I.sub.p+1. At this stage, as has just been described above, the syntax of this instruction I.sub.D+2 may either be correct or incorrect. Now, it is practically certain that after a certain number of clock cycles, the instruction decoded by the decoder 20 will be an instruction the syntax of which is incorrect. It is therefore practically certain that, after a certain number of clock cycles, an execution fault will be signalled.
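The propagation of a wrong mask from one instruction to the next can be reproduced with a small Python simulation. This is a sketch under assumptions: the function `mask_from` is a hypothetical stand-in for the mask built from the decoder's configuration signals (an injective function is chosen so that two different instructions always give two different masks), branch handling is left out, and the check of instruction syntax is not modelled.

```python
MASK32 = 0xFFFFFFFF


def mask_from(instruction):
    # Hypothetical mask construction: multiplying by an odd constant
    # modulo 2**32 is a bijection, so different instructions give different masks.
    return (instruction * 0x9E3779B1) & MASK32


def mask_program(cleartext, first_mask):
    """Mask each instruction with the mask derived from the previous one."""
    masked, mask = [], first_mask
    for instr in cleartext:
        masked.append(instr ^ mask)
        mask = mask_from(instr)
    return masked


def run(masked, first_mask, fault_at=None, fault_bits=0):
    """Unmask the program; optionally corrupt the mask built at index fault_at."""
    decoded, mask = [], first_mask
    for index, word in enumerate(masked):
        instr = word ^ mask
        decoded.append(instr)
        mask = mask_from(instr)
        if index == fault_at:
            mask ^= fault_bits  # models a fault in the decoder's signals
    return decoded


program = [0x11, 0x22, 0x33, 0x44, 0x55]
masked = mask_program(program, first_mask=0)
clean = run(masked, first_mask=0)
faulty = run(masked, first_mask=0, fault_at=1, fault_bits=0x1)
```

In this model, `clean` equals `program`, whereas every instruction of `faulty` located after the faulted one differs from the expected instruction, which is the chain effect described in Case 2.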
[0129] When an instruction I.sub.i is the destination of a branch instruction, i.e. it is located at a branch address, there are two paths that may be taken to reach this instruction I.sub.i. The first path reaches the instruction I.sub.i by a jump of a plurality of instructions when an unconditional branch instruction is executed or when the condition of a conditional branch instruction is met. The second path corresponds to the case where instruction I.sub.i−1 is executed then the program counter incremented by the unit increment. Depending on the path taken, the instruction executed by the microprocessor just before execution of instruction I.sub.i is not the same. When the first path is taken, the previous instruction is a branch instruction. When the second path is taken, the previous instruction is the instruction I.sub.i−1.
[0130] Here, during compilation, before each instruction I.sub.i that is located at a branch address, an unconditional branch instruction is inserted. This unconditional branch instruction causes, when it is executed by the microprocessor 2, a jump of one instruction, i.e. the value of the program counter is replaced by a value incremented by a unit increment. Thus, in the case of the machine code 32 generated by such a compiler, the instruction I.sub.i−1 that precedes the instruction I.sub.i is systematically an unconditional branch instruction that causes a jump to the instruction I.sub.i when it is executed by the unit 22.
[0131] Thus, whatever the path taken to reach the instruction I.sub.i, the instruction executed just before this instruction I.sub.i is a branch instruction. In the case where it is the first path that is taken, the branch instruction may be an unconditional branch instruction or a conditional branch instruction. Here, the unconditional branch instruction is detected by the circuit 36 based on the signal S.sub.20 generated by the decoder 20. In the case of a conditional branch instruction, the fact that its execution causes a jump of a plurality of instructions is detected based on the signal S.sub.22 generated by the unit 22. It will be noted that in the latter case, after the conditional branch instruction has been decoded, operation of the loader 18 and of the decoder 20 is suspended. Specifically, until the conditional branch instruction is executed by the unit 22, it is not possible to know whether the condition is met or not and therefore to know which instruction will be loaded by the loader 18 next. Thus, operation of the loader 18 restarts solely after the conditional branch instruction has been executed, i.e. at a time at which the address of the next instruction I.sub.p to be executed is known.
[0132] The branch instruction executed when the first path is taken is different from the branch instruction executed when the second path is taken. However, in both cases, the circuit 36 selects the mask MJ as current mask M.sub.c to be used to unmask the instruction I.sub.i. As this mask MJ is the same for all the branch instructions, the instruction I*.sub.i may be correctly unmasked whatever the path taken to reach this instruction.
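The compile-time insertion of a one-instruction unconditional jump before each branch destination can be sketched as follows in Python. The representation of instructions as tuples, the mnemonic `"JMP"` and the function name `insert_fallthrough_jumps` are hypothetical and serve only to illustrate the principle.

```python
def insert_fallthrough_jumps(instructions, branch_targets):
    """Insert an unconditional jump of one instruction before every branch target.

    instructions   -- list of cleartext instructions (any representation)
    branch_targets -- set of indices of instructions located at a branch address
    Returns the new instruction list and a map from old to new indices.
    """
    output, new_index = [], {}
    for old, instr in enumerate(instructions):
        if old in branch_targets:
            # Hypothetical encoding of "jump to the next instruction".
            output.append(("JMP", +1))
        new_index[old] = len(output)
        output.append(instr)
    return output, new_index
```

After this pass, the instruction that precedes each branch destination in the sequential path is itself a branch, so both paths lead to selection of the mask MJ. In a real tool chain, the operands of the existing branch instructions would also have to be rewritten using the returned index map; this step is omitted here.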
[0133] Execution of the machine code 32 may be interrupted for a plurality of clock cycles. For example, this occurs when execution of the machine code 32 is interrupted to execute, instead, another machine code. In this case, in a step 180, the microprocessor 2 saves, to the set 12 or to the memory 4, the execution context of the machine code 32. The execution context comprises all the information required for the microprocessor 2 to be able to subsequently restart execution of this machine code 32 at the location of the instruction where its execution stopped. Step 180 in particular comprises saving the current value of the mask M.sub.i contained in the register 30 and of the current mask selected by the circuit 36.
[0134] After a plurality of clock cycles, execution of the machine code 32 by the microprocessor 2 is restarted from the location of the instruction where this execution was interrupted. At this point, in a step 182, the execution context is restored. This step 182 in particular comprises writing, to the register 30, the mask M.sub.i saved in step 180. It also comprises restoring the state of the circuit 36 so that the latter selects the same mask M.sub.c as that which would have been selected if execution of the machine code 32 had not been interrupted. When the interruption of execution of the machine code 32 is caused by the need to execute another machine code, in this case, this other machine code is executed between steps 180 and 182.
[0135] The same principle of saving the mask M.sub.i and the state of the circuit 36 is employed when the interruption of execution of the machine code 32 is caused by a hardware interruption.
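Steps 180 and 182 can be illustrated by the following Python sketch, which saves and restores only the part of the execution context that relates to the masks. The names `MaskState`, `save_context` and `restore_context`, and the representation of the state of the circuit 36 by a single Boolean, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MaskState:
    register_30: int  # last constructed mask M_i
    select_mj: bool   # assumed state of the circuit 36: True when MJ is the current mask


def save_context(state):
    """Step 180: copy the mask-related state to a save area."""
    return (state.register_30, state.select_mj)


def restore_context(saved):
    """Step 182: rebuild the mask-related state from the save area."""
    return MaskState(*saved)
```

Another machine code may freely modify the register 30 between the two calls; after `restore_context`, the mask selected for the next loaded instruction is the same as if no interruption had occurred.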
[0136]
[0141] Step c) consists in replacing each implicit branch instruction with an explicit branch instruction. Thus, following step c), the instruction executed before each noted instruction is systematically a branch instruction.
[0142] For example, to carry out step d), the compiler parses the machine code obtained at the end of step c) in ascending order of the instructions I.sub.i. For each cleartext instruction I.sub.i encountered, it constructs the mask M.sub.c to be used to obtain the corresponding masked instruction I*.sub.i. The mask M.sub.c constructed by the compiler 190 is the same as the mask constructed by the module 28 to unmask the instruction I*.sub.i on execution of the machine code by the microprocessor 2. To do this, for example, the compiler 190 implements a software emulator that reproduces the operation of the pipeline 10 and, in particular, of the decoder 20 and of the module 28. For example, for each instruction I.sub.i+1 of the cleartext machine code:
[0143] the emulator selects, as mask M.sub.c, the mask MJ if the previous instruction I.sub.i is a branch instruction, else
[0144] the emulator produces configuration signals that are the same as those generated by the decoder 20 when it has finished decoding the previous instruction then it constructs the mask M.sub.i from these signals, the constructed mask M.sub.c then being equal to the constructed mask M.sub.i.
[0145] Once the mask M.sub.c has been constructed, the compiler 190 obtains the masked instruction I*.sub.i+1 using the relationship I*.sub.i+1=F.sub.M(I.sub.i+1; M.sub.c), i.e. in this example using the following relationship: I*.sub.i+1=I.sub.i+1 XOR M.sub.c.
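The masking pass of step d) can be sketched in Python as follows. This is a simplified model and not the compiler 190 itself: the function `emulate_mask` is a hypothetical stand-in for the software emulator of the decoder 20 and of the module 28, the value of MJ is arbitrary, and the mask applied to the very first instruction (here MJ) is an assumption, since the passage above does not state it.

```python
MJ = 0xA5A5A5A5       # assumed value of the constant jump mask
MASK32 = 0xFFFFFFFF


def emulate_mask(instruction):
    # Hypothetical emulation of "decode the instruction, then build M_i".
    return (instruction * 0x9E3779B1) & MASK32


def mask_machine_code(cleartext, is_branch, initial_mask=MJ):
    """Return the masked instructions I*_{i+1} = I_{i+1} XOR M_c."""
    masked, m_c = [], initial_mask
    for instr in cleartext:
        masked.append(instr ^ m_c)
        # Mask for the following instruction: MJ after a branch,
        # otherwise the mask built from the instruction just processed.
        m_c = MJ if is_branch(instr) else emulate_mask(instr)
    return masked
```

The predicate `is_branch` is supplied by the caller, for example `lambda instr: (instr & 0x7F) == 0x6F` for a hypothetical encoding in which the low seven bits identify a jump.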
Section III: Variants
[0146] Variants of the Apparatus:
[0147] The memory 4 may also be a non-volatile memory. In this case, it is not necessary to copy the binary code 30 to this memory before launching its execution since it is already stored therein.
[0148] As a variant, the memory 4 may also be an internal memory integrated into the microprocessor 2. In the latter case, it is produced on the same substrate as the other elements of the microprocessor 2. Lastly, in other configurations, the memory 4 is composed of a plurality of memories certain of which are internal memories and others of which are external memories.
[0149] Variants of the Masking and Unmasking Operations:
[0150] In one simplified embodiment, the mask M.sub.i is constructed solely from the configuration signal that codes the opcode of the instruction I.sub.i or solely from the configuration signal that varies as a function of the one or more values of the operands of the instruction I.sub.i.
[0151] Other functions F.sub.D( ) are possible. For example, the function F.sub.D( ) is defined by the following relationship: I.sub.i=I*.sub.i modulo M.sub.c, where “modulo” is the modular-arithmetic operation that associates, with a pair (a, b) of integers, the remainder of the Euclidean division of a by b. The function F.sub.D( ) may also be defined by the following relationship: I.sub.i=I*.sub.i XOR M.sub.c XOR k.sub.s, where k.sub.s is a secret key known only to the microprocessor 2. If computational power allows, the function F.sub.D( ) may also be an encryption function, a symmetric encryption function for example. In the latter case, the mask M.sub.c is then what is better known as a “decryption key”. Each time the function F.sub.D( ) is modified, the function F.sub.M( ) must be changed accordingly.
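The variant that combines the mask with a secret key can be illustrated by the following Python sketch. The function names `f_m` and `f_d` and the numerical values are hypothetical; the sketch only shows that, with XOR, the masking and unmasking functions are the same operation and undo each other.

```python
def f_m(instruction, m_c, k_s):
    """Masking: I* = I XOR M_c XOR k_s."""
    return instruction ^ m_c ^ k_s


def f_d(masked_instruction, m_c, k_s):
    """Unmasking: I = I* XOR M_c XOR k_s."""
    return masked_instruction ^ m_c ^ k_s


# Usage with arbitrary example values.
instruction, m_c, k_s = 0x00A00513, 0x00120034, 0x5EC2E7C0
round_trip = f_d(f_m(instruction, m_c, k_s), m_c, k_s)
```

With a wrong mask or a wrong key, `f_d` returns a value different from the original instruction, since XOR with a non-zero difference always changes at least one bit.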
[0152] The function F.sub.D( ) may be different depending on whether the mask used to unmask the next instruction I*.sub.p is the mask MJ or the mask M.sub.i. For example, when the mask MJ is selected, the function “XOR” is replaced by another function, such as a decryption function that uses the bits of the mask MJ to decrypt the instruction I*.sub.p. The emulator of the compiler 190 must then be changed to take this replacement into account.
[0153] In the case of a machine code devoid of branch instructions, the mask MJ and its use are omitted.
[0154] As a variant, only certain instructions of the machine code are masked. For example, to do this, a specific instruction is added to the instruction set of the microprocessor 2. When this specific instruction is executed by the unit 22, it indicates to the microprocessor that the next T instructions are not masked instructions and need not therefore be unmasked. Typically, the number T is an integer greater than or equal to 1, 10 or 100.
[0155] According to another variant, it is a specific bit of a control or status register of the microprocessor 2 that indicates whether the loaded instruction is or is not a masked instruction. More precisely, when this specific bit takes a predetermined value, the loaded instruction is unmasked by the module 28. If this specific bit takes a value different from this predetermined value, then the loaded instruction is not unmasked by the module 28.
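The variant in which the next T instructions are left unmasked can be modelled by a simple counter, as in the following Python sketch. The class name `UnmaskGate` and its methods are hypothetical, and the XOR unmasking of the main embodiment is assumed.

```python
class UnmaskGate:
    """Toy model of an unmasking stage that can be bypassed for T instructions."""

    def __init__(self):
        self.clear_remaining = 0  # number of upcoming instructions that are not masked

    def skip_next(self, t):
        # Effect of the hypothetical specific instruction: the next t
        # loaded instructions are passed through without unmasking.
        self.clear_remaining = t

    def fetch(self, word, m_c):
        if self.clear_remaining > 0:
            self.clear_remaining -= 1
            return word        # instruction stored in cleartext
        return word ^ m_c      # masked instruction: XOR unmasking
```

The variant based on a bit of a control or status register can be modelled in the same way by replacing the counter with a Boolean flag tested on each fetch.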
[0156] The various embodiments described here may be combined together.
Section IV: Advantages of the Described Embodiments
[0157] The embodiments described here allow signalling of an execution fault to be triggered in case of modification of an instruction, of instructions being skipped or of an error in decoding the instruction. Such an execution fault is also signalled if the execution of a conditional branch instruction is disrupted. Specifically, in this case, the mask M.sub.c used to unmask the next instruction I*.sub.p is not the right one since, following execution of the previous conditional branch instruction, the intended path was not followed. Thus, the described method also allows a compromise of the control flow to be detected.
[0158] The embodiments described here also have the following advantages:
[0159] the module 28 is simple to implement,
[0160] the extra cost in terms of size of the machine code is very low since only additional unconditional branch instructions are added to the machine code,
[0161] the fact that the instructions of the machine code are unmasked only before they are decoded increases the robustness of the microprocessor 2 to side-channel attacks,
[0162] the instructions of the machine code are not stored in cleartext in the main memory, this making disassembly of this machine code more difficult.
[0163] Using the mask MJ as current mask M.sub.c each time the previous instruction is a branch instruction allows the described method to be implemented even in the case where the machine code comprises branch instructions.
[0164] Storing the constructed mask M.sub.i in the register 30 while operation of the loader 18 and of the decoder 20 is suspended allows the described method to be implemented even if operation of the loader 18 and of the decoder 20 is suspended for a plurality of clock cycles.
[0165] Saving the constructed mask M.sub.i in case of interruption of execution of the machine code allows execution of this machine code to be interrupted then to be restarted subsequently.
[0166] The fact that the function F.sub.D( ) is a simple “EXCLUSIVE OR” gate simplifies and accelerates unmasking.