Abstract
Method of enforcing control flow integrity (CFI) for a monolithic binary using static analysis by: marking evaluated functions as core functions by a chosen heuristic or empirically; generating a binary call graph; merging all function nodes of core functions as a node of highest privilege (set 0); merging all leaf functions in one node without privilege (set n); merging all nodes without privilege that reach functions of privilege i and setting the merged node privilege to i+1; checking if there is a node without privilege besides a trivial function; in a positive case, returning to merging all nodes without privilege and setting the merged node privilege to i+1; and in a negative case, setting the privilege of trivial functions as i+2.
Claims
1. A method of enforcing control flow integrity (CFI) for a monolithic binary using static analysis, the method comprising: performing, by at least one processor, operations to enforce the CFI while protecting essential functions and maintaining a behavior and a performance of the monolithic binary, the operations including: marking some evaluated functions as core functions by a chosen heuristic or empirically; generating a binary call graph; when the generated binary call graph is incomplete, then: creating a call graph from static analysis; setting a privilege 2 for functions with incomplete estimation; setting an ad hoc privilege for specific functions; and generating a policy; merging all function nodes of core functions as a node of highest privilege (set 0); merging all leaf functions in one node without privilege (set n); merging all nodes without privilege that reach functions of privilege i and setting the merged node privilege to i+1; checking whether there is a node without privilege besides a trivial function; and based on a result of the checking: in a positive case, returning to merging all nodes without privilege and setting the merged node privilege to i+1; and in a negative case, setting the privilege of trivial functions as i+2, wherein the privilege of each function is put before an entry point of each function.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The objectives and advantages of the current invention will become clearer through the following detailed description of the example and non-limitative pictures presented at the end of this document.
(2) FIG. 1 discloses an example of attack chain that starts at the vulnerable function f_n and goes until commit_creds(prepare_creds(0)) is executed.
(3) FIG. 2 discloses the same attack chain of FIG. 1 but with the difference that f_n1 calling f_n2 is not allowed anymore by CFI enforcement.
(4) FIG. 3 discloses an exemplar scenario where the attacker has as target calling f_0, he already has a chain formed but has to necessarily pass through f_1.
(5) FIG. 4 discloses a scenario similar to FIG. 3 but having two security layers before calling f_0.
(6) FIG. 5 discloses the general case of FIGS. 3 and 4 with n security layers.
(7) FIG. 6 discloses the indirect call policy of the present invention.
(8) FIG. 7 discloses a flowchart of the Policy Generation method.
(9) FIG. 8 discloses how the set 1 is constructed.
(10) FIG. 9 discloses an example of the merging algorithm for the set 1 before the actual merging.
(11) FIG. 10 discloses an example of the merging algorithm for the set 1 after the actual merging.
(12) FIG. 11 discloses an example of the merging algorithm for the leaf set before the actual merging.
(13) FIG. 12 discloses an example of the merging algorithm for the leaf set after the actual merging.
(14) FIG. 13 discloses the algorithm flowchart of the Policy Generation with incomplete call graph.
(15) FIG. 14 discloses a view of the monolithic binary with all the security layers and with the core set at the center.
DETAILED DESCRIPTION
(16) In Jump-Oriented Programming (JOP), the attacker will try to execute gadgets in a chain in order to perform exploitation. The attacker will simply overwrite the content of a register that points directly or indirectly to a function address. To avoid this, one first line of defense is done by enforcing that an instruction intended to call a function can only be used if the address is indeed pointing to the beginning of a function.
(17) However, as shown by FIG. 1, a simple attack chain can be constructed by an attacker in order to have, for example, root privilege by calling commit_creds(prepare_creds(0)) at the Linux kernel level. It is noted that this chain starts from a vulnerable function that is exploited and an attack chain is created to perform the calling. For the attacker, it does not matter if the program flow is correct or not as long as the exploitation goal is accomplished. Therefore, two things can be concluded. First, exploitation is still possible using JOP, only the number of available gadgets has been reduced. Second, it is easier for the attacker to build this chain from function f_n if he has a great number of possible functions f_n1 that can be used in the chain at intermediary steps.
(18) FIG. 2 discloses the same function chain used, but, at this time, the CFI enforcement does not let f_n1 call f_n2 and, therefore, the attack is avoided. In this scenario, the attacker will have to look for another function to use, which increases the overall complexity of the attack.
(19) FIG. 3 discloses an attack chain in the scenario that functions from set 0, the attack's target, can only be called by functions from set 1. In this case, the attacker will have to build a chain and necessarily add a function from set 1 and a function from set 0. FIG. 4 discloses a similar scenario but with three obligatory sets, which adds another layer of complexity for the attacker.
(20) FIG. 5 discloses a similar scenario when compared to FIGS. 3 and 4. In this general case, the attacker must create a chain, but it must go through all the n functions (n sets) in order to reach his target (set 0). This final target set possibly has critical functions of the binary. For example, if the analyzed binary was the kernel, this set could have functions related to processes credentials. Therefore, this set 0 will be called as the core function set, which, logically, has core functions. In the same FIG. 5, it is shown that if the attacker tries to jump from function f_n (set n, less privileged) to function f_2 (set 2, more privileged), then the execution is interrupted, and an error is raised.
(21) FIG. 6 discloses the method according to an embodiment of the present invention. All core functions are empirically marked and grouped at set 0. This set has the highest privilege and can access (call) a function from any other privilege level. The next set (set 1) has privilege to access the core set and all the functions below this privilege access (all privileges greater than 0). Set 2 follows the same pattern, it has access to set 1 and all the sets below including itself, but it does not have access to set 0. This will occur to every set until the last set is formed by only leaf functions that have no access rights. Summarizing, the access rule is that the most privileged set that a function of privilege level I can access is the one of privilege I1 (it can also access functions with privilege greater than I or equal). For example, a function with privilege 5 can call functions with privilege 4 or greater (which includes its own level, 5, and all levels greater than 5). In order to this enforcement occur, all indirect calls are instrumented with this privilege checking. Also, by applying the reverse flow of FIG. 6, a protection against ROP is made with the same sets. One important note about the privilege access is that although sets 0 and 1 can access exactly the same sets, the set 0 can only be accessed by the set 1 and by itself while the set 1 can be accessed by the sets 0, 1 and 2. Therefore, set 1 acts as a first layer of protection for set 0.
(22) The problem at this point is how to define which set each function belongs to or, in other words, the policy that must be applied. FIG. 7 describes the Policy Generation algorithm. Given the call graph of the binary and all marked core functions, recursively it is checked which functions call the target set at the moment (starting from 0 and incrementing at each step) until there are only leaf functions with the least access privilege. It is important to notice that some functions must have some calls not instrumented at all in order to keep the binary functionalities. This occurs when the calling party function has been implemented in a way that the instructions used for calling a function can be used in a context where the destination address is in the middle of a function instead of the entry point. This scenario is likely to happen if some functions are written directly in Assembly. Therefore, they are allowed to keep non-instrumented direct calls. As shown by FIG. 8, because a function with a non-instrumented call can go to any function, the components of the set 1 are made of functions that can access the core directly or indirectly in any context (instrumented or not). The merging algorithm used in the Policy Generation is shown in FIG. 9 (before) and FIG. 10 (after) for set 1 and FIG. 11 (before) and FIG. 12 (after) for the leaf functions set.
(23) One problem with the Policy Generation algorithm is that it relies on the call graph of the binary, which may be incomplete. In order to solve this problem and keep the binary functionalities as expected, an extension is made on the Policy Generation as described by FIG. 13. The first increment is that all functions with detected incomplete estimation that do not access any core functions is put at set 2. The second is that ad hoc rules are inserted. With this, the binary functionalities are kept because a function from set 1 can access any other, which keeps the binary functionalities regardless of the call graph, and, in a similar manner, a function from set 2 can access any other except from set 0. In this case, if a misclassified function from set 2 actually accesses a core function, then its privilege is enhanced to 1 in the ad hoc insertion phase. However, this is an unlikely scenario, because access of core functions is possibly supervised. Therefore, this improvement on the Policy Generation makes this enforcement method feasible.
(24) FIG. 14 discloses how the binary is seen with all the security layers added. The core functions are at the center only available to be called by themselves and by the previous layer.
(25) In order to enforce the policy for the JOP protection, the binary is instrumented in the following way: before every function entry point the function, privilege is inserted and before each instrumented indirect call this privilege is put after. One important note is that it is possible that the core functions do not have their privilege before the entry point to avoid any indirect call. In the instrumented call two things are done. The first is to check if the calling party can actually call the called party. If not, then raise an error. The second is to put the return address to the proper location if necessary. For the ROP protection, it is required to insert after every function call the privilege after and instrument the return instructions. This is needed because when a function returns it does not know if the call was directly or indirectly. Furthermore, it is inserted the function's privilege after the instrumented return instruction. Because usually in the return instruction only the target is kept as information, it is necessary to reserve a register to hold the calling party address and, therefore, verify if the return is allowed by the policy.
(26) The original idea of CFI is to check if every tuple of calling party-called party (source and target) is valid. However, if no relaxation is used, this approach has a high cost because it is necessary to search at runtime in a data structure for a specific tuple. Therefore, a more feasible CFI is needed if the security improvement is desired.
(27) By using the method of the present invention, the performance of the binary is maintained because the policy check is simple, it can be just a verification if the calling party privilege is less or equal than the called party privilege plus one. Therefore, the costly search in a data structure at runtime is avoided by inserting undefined instructions (data) in the binary.
(28) Although the present disclosure has been described in connection with certain preferred embodiments, it should be understood that it is not intended to limit the disclosure to those particular embodiments. Rather, it is intended to cover all alternatives, modifications and equivalents possible within the spirit and scope of the disclosure as defined by the appended claims.