Method For Protection From Cyber Attacks To A Vehicle Based Upon Time Analysis, And Corresponding Device

Abstract

A method for protection from cyber attacks in a CAN (Controller Area Network), of a vehicle including the steps of selecting periodic messages having a transmission periodicity, grouping the periodic messages, and performing an analysis of messages of the nodes that exchange the received periodic messages, which includes obtaining times of arrival at the respective nodes of a set of periodic messages that have the same message identifier, computing average-offset values over successive subsets, of a given number of messages, accumulating the average-offset values for each identifier to obtain accumulated-offset values, identifying linear parameters by computing an angular coefficient, of a regression, and an intercept, or identification error, computing a correlation coefficient of the average offset of pairs of messages identified as coming from the same node, determining whether the correlation coefficient is higher than a first given threshold, determining whether the angular coefficient between two consecutive messages with the same identifier is higher than a second given threshold, determining whether the intercept between two consecutive messages is higher than a third given threshold, and supplying the results of these determinations to a message-classification operation.

Claims

1. A method for protection from cyber attacks in a communication network, in particular a CAN (Controller Area Network), of a vehicle, comprising: a communication bus, in particular a CAN-bus, and a plurality of nodes associated to said communication bus in a signal-exchange relationship and associated at least in part to control units for controlling functions of the vehicle, said nodes exchanging messages passing between nodes of said plurality of nodes, and said messages being identified by respective message identifiers, said method including, at a control node associated to said communication bus, the steps of: selecting, from among the messages exchanged between the nodes, periodic messages having a transmission periodicity, grouping said periodic messages into respective groups according to the respective period, and performing a procedure of analysis of messages of the nodes that exchange said received periodic messages, which comprises, for each group of transmission periodicity: obtaining times of arrival at the respective nodes of a set of periodic messages that have the same message identifier, computing as a function of said arrival times average-offset values over successive subsets, of a given number of messages, of said set of received messages, accumulating said average-offset values for each identifier with respect to each successive subset to obtain accumulated-offset values for each successive subset and a respective identifier, identifying linear parameters by computing a regression over said accumulated-offset values for each successive subset and respective identifier, said computation comprising computing an angular coefficient, or slope, of the regression, and an intercept, or identification error, computing, on the basis of average-offset values obtained at the step of computing as a function of said arrival times average-offset values over successive subsets, a correlation coefficient (ρ) of the average offset of pairs of messages 
identified as coming from one and the same node, performing a first check to determine whether the correlation coefficient is higher than a first given threshold, performing a second check to determine whether the angular coefficient between two consecutive messages with the same identifier is higher than a second given threshold, performing a third check to determine whether the intercept between two consecutive messages is higher than a third given threshold, and supplying the results of said first check, said second check, and said third check to a message-classification operation, configured to supply a confirmation of classification of the messages according to the transmitting node and message identifier or an indication of classification error as a function of said results.

2. The method as set forth in claim 1, wherein, if the correlation coefficient is higher than a first given threshold, the classification operation indicates the node that is transmitting the messages as corresponding to the nominal node; if it is lower, it indicates a classification error and indicates the transmitting node as being different from the nominal node.

3. The method as set forth in claim 1, wherein, if the second check has a negative outcome, the classification operation indicates a masquerade attack.

4. The method as set forth in claim 1, wherein, if the third check has a negative outcome, the classification operation indicates a fabrication attack.

5. The method as set forth in claim 1, wherein said classification operation is an operation of decisional logic discrimination in which the result of the first check as to whether the correlation coefficient is higher than a first given threshold is evaluated first, and the result of the second check and/or the result of the third check are/is evaluated if the result of the first check is affirmative.

6. The method as set forth in claim 1, wherein information known a priori, concerning the topology of the network and/or the transmitting nodes and/or the number and type of identifier of the messages transmitted by each of said nodes, is accessible for performing the operations of the method.

7. The method as set forth in claim 1, which comprises an operation of filtering with white list, which accepts only the message identifiers actually present in said white list associated to the control node.

8. The method as set forth in claim 1, wherein said operation of measuring arrival times is performed by acquiring the timestamp of arrival of the messages.

9. A device for protection from cyber attacks in a communication network, in particular a CAN (Controller Area Network), of a vehicle, said network comprising: a communication bus, in particular a CAN-bus, and a plurality of nodes associated to said communication bus in a signal-exchange relationship and associated at least in part to control units for controlling functions of the vehicle, said nodes exchanging messages passing between nodes of said plurality of nodes, and said messages being identified by respective message identifiers, wherein said device is configured to operate according to the method according to one or more of claims as set forth in claim 1.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0023] The invention will be described with reference to the annexed drawings, which are provided purely by way of non-limiting example and in which:

[0024] FIG. 1 has already been described previously;

[0025] FIG. 2 represents a diagram that shows a device that implements the method described herein;

[0026] FIG. 3 represents a timing chart that shows messages transmitted over the communication network in the context of the method described herein;

[0027] FIG. 4 shows a flowchart representing an operating step of the method;

[0028] FIG. 5 shows a diagram that represents linear evolutions calculated by the aforesaid operating step of the method;

[0029] FIG. 6A and FIG. 6B are diagrams representing correlation functions calculated by the aforesaid operating step of the method in different operating conditions;

[0030] FIG. 7 is a schematic diagram of an embodiment of the method described herein; and

[0031] FIG. 8 is a diagram that represents schematically steps of the method described herein.

DETAILED DESCRIPTION OF THE INVENTION

[0032] According to the solution described herein, it is envisaged to insert one or more devices for protection from cyber attacks within the network 10 of the vehicle, in particular the CAN-bus, which implements the method for protection from cyber attacks described herein. This device for protection from cyber attacks may be additional to the existing network topology or else may be comprised in one of the existing nodes, in particular by configuring the microcontroller 14.

[0033] Each of the aforesaid devices may be responsible for analysis of the data traffic for a finite number of nodes of the network 10 of the vehicle, which in general describe a subnetwork of the entire communication architecture. For example, the subnetworks may have up to 18 nodes.

[0034] The purpose of the method and device described herein is to ensure that the communication on the CAN-bus corresponding to a specific subnetwork will not present anomalies such as the ones described previously.

[0035] In general, the method for protection from cyber attacks described herein envisages that for each vehicle network 10 there will be made available or accessible to a control node (device 20) a list of message identifiers ID, of the type indicated in Table 1, to be analysed. The list contains the information of which messages are periodic and consequently which messages are in actual fact analysed by the method.

[0036] The periodic messages that belong to the list of the messages supplied beforehand are grouped together on the basis of their periodicity in order to prevent erroneous classifications due to the fact that the time drift of some messages with different period and identifier ID (as well as the same node of origin) could be the same. Hence, a first operation of grouping or clustering by period is carried out upstream of the analysis of the time drift itself.

[0037] In other words, provided herein is a method for protection from cyber attacks in a communication network, in particular a CAN (Controller Area Network), of a vehicle, that comprises:

[0038] a communication bus, in particular a CAN-bus, and

[0039] a plurality of nodes associated to said communication bus in a signal-exchange relationship and associated at least in part to control units for controlling functions of the vehicle,

[0040] said nodes exchanging messages passing between nodes of said plurality of nodes, and

[0041] said messages being identified by respective message identifiers,

[0042] said method includes, at a control node associated to said communication bus, the steps of:

[0043] selecting, from among the messages exchanged between the nodes, periodic messages having a transmission periodicity,

[0044] grouping said periodic messages into respective groups according to the respective period, and

[0045] performing a procedure of analysis of messages of the nodes that exchange said received periodic messages, which comprises, for each group of transmission periodicity:

[0046] obtaining times of arrival at the respective nodes of a set of periodic messages that have the same message identifier,

[0047] computing, as a function of said arrival times, average-offset values over successive subsets, of a given number of messages, of said set of received messages,

[0048] accumulating said average-offset values for each identifier with respect to each successive subset to obtain accumulated-offset values for each successive subset and a respective identifier,

[0049] identifying linear parameters by computing a regression over said accumulated-offset values for each successive subset and respective identifier, said computation comprising computing an angular coefficient, or slope, of the regression, and an intercept, or identification error,

[0050] computing, on the basis of average-offset values obtained at the step of computing as a function of said arrival times average-offset values over successive subsets, a correlation coefficient of the average offset of pairs of messages identified as coming from one and the same node,

[0051] performing a first check to determine whether the correlation coefficient is higher than a first given threshold,

[0052] performing a second check to determine whether the angular coefficient between two consecutive messages with the same identifier is higher than a second given threshold,

[0053] performing a third check to determine whether the intercept between two consecutive messages is higher than a third given threshold, and

[0054] supplying the results of said first check, said second check, and said third check to a message-classification operation, configured to supply a confirmation of classification of the messages according to the transmitting node and message identifier or an indication of classification error as a function of said results.

[0055] In greater detail, FIG. 2 hence shows a device 20 for protection from attacks connected to the CAN-bus 10, together with the nodes 11, i.e., a control node for analysing the messages exchanged over the network 10. As has been said, the device 20 is a control node of the network that has available or accessible a list of message identifiers ID, of the type indicated in Table 1, to be analysed. The device 20 comprises a block 200 representing an operation of clustering and a block 300 representing a procedure of message analysis comprising a fingerprinting procedure 310, i.e., a procedure of identification of nodes on the basis of respective unique fingerprints, in particular fingerprints obtained from the respective time drifts, and an anomaly-detection procedure 320, which, on the basis of the node identifications, detects whether an anomaly, in particular an attack, is in progress.

[0056] As has been said, in one embodiment, the above device for protection from attacks 20 is comprised in a node with a structure similar to that of the nodes 11, and hence comprises a CAN transceiver 12 and a CAN controller 13 included in a microcontroller 14. In FIG. 2, these elements are not shown in detail; the device 20 is internally represented schematically via blocks that illustrate operations or procedures of the method for protection from attacks described herein. As has been said, the above method may be executed by a microcontroller 14 configured to execute the above method, downstream of reception of the messages by the transceiver 12 and downstream of the operations of the controller 13 (e.g., managing of the logic levels and serialization of the CAN-bus 10).

[0057] Hence, the method executed in the device 20 comprises, once messages have been received, for example through the modules 12 and 13, carrying out the aforementioned preliminary grouping or clustering operation, in block 200. In particular, the received messages, on the basis of their identifier ID and on the basis of the list of message identifiers ID and corresponding periods T, which is information available to the device 20, are divided into respective groups or clusters according to the period T.sub.1, . . . , T.sub.n.
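By way of non-limiting illustration, the grouping operation of block 200 may be sketched in Python as follows. The identifier values, the periods, and the names `PERIOD_BY_ID` and `cluster_by_period` are hypothetical and are not taken from Table 1; the sketch only assumes that a mapping from identifier ID to period T is available to the device 20.

```python
from collections import defaultdict

# Hypothetical identifier list: message ID -> nominal period in milliseconds.
# The values are illustrative only and are not taken from Table 1.
PERIOD_BY_ID = {0x101: 10, 0x102: 10, 0x2A0: 20, 0x3B4: 100}


def cluster_by_period(messages, period_by_id):
    """Group (msg_id, arrival_time) pairs into clusters keyed by period T.

    Identifiers missing from period_by_id are not listed as periodic
    and are skipped, since only periodic messages are analysed.
    """
    clusters = defaultdict(lambda: defaultdict(list))
    for msg_id, arrival_time in messages:
        period = period_by_id.get(msg_id)
        if period is None:
            continue
        clusters[period][msg_id].append(arrival_time)
    return clusters


received = [
    (0x101, 0.0100),
    (0x2A0, 0.0201),
    (0x102, 0.0103),
    (0x7FF, 0.0150),  # not in the list, hence ignored
    (0x101, 0.0200),
]
groups = cluster_by_period(received, PERIOD_BY_ID)
```

Each cluster then holds, per identifier, the arrival times that the analysis of block 300 processes separately for each period.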

[0058] Then, in block 300, for each group corresponding to a respective period T.sub.1, . . . , T.sub.n each message received at the device 20 is processed so as to take into account the time drift.

[0059] For each of the ECUs, i.e., for each node 11, of the CAN 10 on board the vehicle, the instants of transmission of each periodic message are determined on the basis of the clock signal defined by a clock with quartz crystal, present in the node 11. Following the NTP (Network Time Protocol) convention, denoted here by C.sub.true is the “true” clock signal, which represents at each instant the true time variable, and denoted by C.sub.i is another clock that is “untrue” in order to define the terms clock offset, clock frequency and clock skew as follows:

[0060] clock offset: this is the difference between the time variables given by C.sub.i and C.sub.true; in particular, we define “relative offset” as the difference between two consecutive C.sub.i;

[0061] clock frequency: this is the rate of variation of the untrue clock signal C.sub.i; in analytical terms, it is nothing but the time derivative of the untrue clock signal C.sub.i; and

[0062] clock skew: this is the difference between the frequency associated to the untrue clock signal C.sub.i and the frequency associated to C.sub.true; in particular, the relative skew is defined as the difference between two frequencies associated to consecutive C.sub.i.

[0063] If two clock signals have a relative offset and a skew of 0, then we say that they are synchronized. Otherwise, they are considered as non-synchronized. Since the CAN-bus, such as the bus 10, lacks synchronization of the clock signals in the respective nodes 11, it is considered as being non-synchronized. The offsets and skews of the clock of the non-synchronized nodes depend exclusively upon their local clocks; consequently, they are distinct from the others.

[0064] In particular, the timestamp proper to each ECU 11 includes a clock skew of its own. Through an in-depth analysis of the skew for each ECU 11 it is possible to classify the various ECUs in a CAN 10 with multiple nodes.

[0065] As shown in FIG. 3, considering an ECU A that transmits messages M.sub.0, . . . , M.sub.3 with period T, where T is here assumed to indicate also the value of T milliseconds, and an ECU R, receiver ECU, which periodically receives the above messages M.sub.0, . . . , M.sub.3, from the standpoint of the receiver ECU R, since only its timestamp is available, its clock is considered as if it were the “true” one, C.sub.true. On account of the asymmetry of the clock, the periodic messages M.sub.0, . . . , M.sub.3 are sent at moments with slight offsets from the ideal values (for example, T, 2T, 3T, . . . ). It is assumed for simplicity that a time t=0 is the moment when the first message M.sub.0 was sent by the ECU A, and O.sub.i is the offset of the clock of the ECU A when it sends the i-th message M.sub.i starting from t=0. In relation to FIG. 3, the index indicates the received message. Hence, after a network delay d.sub.i, the receiver ECU R receives that message and associates thereto a timestamp of arrival of iT+O.sub.i+d.sub.i+n.sub.i, where n.sub.i is the noise in the quantization of the timestamp of the receiver ECU R. Thus, the intervals between each arrival timestamp are T.sub.rx,i=T+ΔO.sub.i+Δd.sub.i+Δn.sub.i, where designated by ΔX.sub.i is the difference of the quantity X, for example O.sub.i or n.sub.i or d.sub.i, between step i and step i−1, and O.sub.0=0.

[0066] The working hypotheses are that the variation of the offset O.sub.i in a time step is negligible and the noise n.sub.i is a term of Gaussian noise with zero average so that an expected value μ.sub.T.sub.rx,i of the intervals of arrival timestamp can be estimated as:

μ.sub.T.sub.rx,i=E[T.sub.rx,i]=E[T+ΔO.sub.i+Δd.sub.i+Δn.sub.i]=T+E[ΔO.sub.i+Δd.sub.i+Δn.sub.i]≈T
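The approximation above can be checked numerically. The following Python sketch simulates arrival timestamps of the form iT+O.sub.i+d.sub.i+n.sub.i and verifies that the mean interval stays close to T. The linear growth of the offset, the constant delay, and all numeric values are assumptions made for illustration only; the function name `simulate_arrivals` is hypothetical.

```python
import random
import statistics


def simulate_arrivals(period, skew, count, delay=0.0002, noise_std=0.00001, seed=0):
    """Return arrival timestamps a_i = i*T + O_i + d_i + n_i at the receiver.

    Assumptions of this sketch: the transmitter offset grows linearly,
    O_i = skew * i * T (constant clock skew), the network delay d_i is
    constant, and n_i is zero-mean Gaussian noise.
    """
    rng = random.Random(seed)
    arrivals = []
    for i in range(count):
        offset = skew * i * period
        noise = rng.gauss(0.0, noise_std)
        arrivals.append(i * period + offset + delay + noise)
    return arrivals


T = 0.010  # 10 ms period, expressed in seconds
arrivals = simulate_arrivals(period=T, skew=100e-6, count=2000)
intervals = [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]
mean_interval = statistics.mean(intervals)  # stays within microseconds of T
```

With these values the per-step variation of the offset is about one microsecond, which is why the mean interval remains a usable estimate of the period.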

[0067] Since the lengths of the data of the CAN periodic messages, i.e., the DLCs (Data-Length Codes), are constant in time, for the moment it is considered that E[Δd.sub.i]=0, i.e., the average of the differences in the delays d.sub.i is considered as being zero. On the basis of the timestamp of arrival of the first message, d.sub.0+n.sub.0, and of the average of the timestamp intervals, μ.sub.T.sub.rx, the estimated instant of arrival of the i-th message is extrapolated and determined as iμ.sub.T.sub.rx+d.sub.0+n.sub.0, whereas the actual measured time of arrival is iT+O.sub.i+d.sub.i+n.sub.i. Since we are estimating the subsequent arrival times, the expected value μ.sub.T.sub.rx is given by the previous measurements. Given that the period T is constant in time and hence again the expected value μ.sub.T.sub.rx approximates the period T, the average difference between the estimated times and the measured times is given by E[i(T−μ.sub.T.sub.rx)+O.sub.i+Δd+Δn]≈E[O.sub.i]. That is, from the periodicity of the message, we can estimate the average offset of the clock, E[O.sub.i], which in actual fact will be different for different transmitters.

[0068] To estimate the clock skew, the messages in arrival are processed in batches of size N (for example, N=20), on which the average offset of the k-th batch, O.sub.avg[k], is calculated. This calculation is expressed via the following equation in closed form:

[00001] $\begin{matrix} O_{avg} [k] = \frac{1}{N - 1} \sum_{i = 2}^{N} [a_{i} - (a_{1} + (i - 1) μ_{T} [k - 1])] & (1) \end{matrix}$

where μ.sub.T[k−1] is the mean timestamp interval (mean inter-arrival time) of the previous batch, and the quantity in square brackets [a.sub.i−(a.sub.1+(i−1)μ.sub.T[k−1])] is the difference between the measured time of arrival a.sub.i and the estimated time of arrival for the i-th message (a.sub.1+(i−1)μ.sub.T[k−1]). When a mean offset value is calculated from the current batch k, its absolute value is added to the accumulated offset O.sub.acc[k] according to the recursive equation defined below:

O.sub.acc[k]=O.sub.acc[k−1]+|O.sub.avg[k]| (2)
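As a minimal sketch, Eq. (1) and the accumulation of the absolute batch average described in the text can be written in Python as follows. The function names and the sample values are hypothetical; the batch is assumed to hold the arrival times a.sub.1 to a.sub.N in order.

```python
def average_offset(arrivals, prev_mean_interval):
    """Eq. (1): mean difference between measured and estimated arrival times.

    arrivals holds a_1..a_N for the current batch; prev_mean_interval is
    mu_T[k-1], the mean timestamp interval of the previous batch.
    """
    n = len(arrivals)
    a1 = arrivals[0]
    total = 0.0
    for i in range(2, n + 1):  # i = 2..N, as in Eq. (1)
        estimated = a1 + (i - 1) * prev_mean_interval
        total += arrivals[i - 1] - estimated
    return total / (n - 1)


def accumulate(prev_accumulated, avg_offset):
    """Add the absolute value of the batch average to the running total."""
    return prev_accumulated + abs(avg_offset)


# Illustrative batch: messages arrive every 10.001 ms while the previous
# batch averaged exactly 10 ms (made-up numbers, in milliseconds).
batch = [i * 10.001 for i in range(5)]  # a_1..a_5
o_avg = average_offset(batch, prev_mean_interval=10.0)
o_acc = accumulate(0.0, o_avg)
```

In this example the deviations from the estimated arrival times are 0.001, 0.002, 0.003 and 0.004 ms, so the batch average is 0.0025 ms.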

[0069] It is possible to use also a different formulation of the average clock offset as shown by the following Eq. 3:

[00002] $\begin{matrix} O_{avg} [k] = \frac{1}{N} \sum_{i = 1}^{N} {\hat{O}}_{i} = \frac{1}{N} \sum_{i = 1}^{N} [T - (a_{i} - a_{i - 1})] = T - \frac{a_{N} - a_{0}}{N} & (3) \end{matrix}$

where a.sub.0 is the measured arrival timestamp of the last message of the most recently analysed batch (i.e., the batch of step k−1). This makes it possible to redefine the recursive equation that represents the evolution of the accumulated clock offset O.sub.acc[k], as in Eq. (4) below:

O.sub.acc[k]=O.sub.acc[k−1]+N|O.sub.avg[k]| (4)
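The alternative formulation of Eqs. (3) and (4) can be illustrated with the following Python sketch, which also checks that the per-message form of Eq. (3) and its closed form give the same value, since the differences a.sub.i−a.sub.i−1 telescope to a.sub.N−a.sub.0. Function names and numeric values are hypothetical, and a.sub.0 is assumed to be the last timestamp of the previous batch.

```python
def average_offset_alt(arrivals, last_prev_arrival, period):
    """Eq. (3): O_avg[k] = T - (a_N - a_0) / N.

    last_prev_arrival is a_0, assumed to be the final timestamp of the
    previous batch; arrivals holds a_1..a_N of the current batch.
    """
    n = len(arrivals)
    return period - (arrivals[-1] - last_prev_arrival) / n


def accumulate_alt(prev_accumulated, avg_offset, batch_size):
    """Eq. (4): accumulate N times the absolute average offset."""
    return prev_accumulated + batch_size * abs(avg_offset)


T = 10.0    # nominal period in milliseconds (illustrative)
a0 = 100.0  # last arrival of the previous batch (illustrative)
batch = [a0 + i * 10.002 for i in range(1, 21)]  # N = 20 arrivals

# Per-message offsets T - (a_i - a_{i-1}), averaged over the batch.
per_message = [T - (cur - prev) for prev, cur in zip([a0] + batch, batch)]
o_avg_sum = sum(per_message) / len(batch)

# Closed form of Eq. (3) and accumulation of Eq. (4).
o_avg = average_offset_alt(batch, a0, T)
o_acc = accumulate_alt(0.0, o_avg, len(batch))
```

The closed form needs only the first and last timestamps of each batch, which may be convenient where memory is limited.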

[0070] Taking again as reference the situation represented schematically in FIG. 3, if the ECU R were to determine the average offset of the clock for every N received messages, since it is derived with reference to the first message (of N messages), it would represent only the average of the most recent offsets. Consequently, to obtain the total amount of the offset sustained, i.e., the accumulated clock offset O.sub.acc[k], the absolute values of the average clock offsets O.sub.avg[k] must be added together: in Eq. (4), the absolute value of the average clock offset O.sub.avg[k] is multiplied by the batch size N and then added to the previous value of accumulated offset at step k−1 of the calculation procedure.

[0071] The slope of the accumulated clock offset O.sub.acc[k] hence represents the clock skew, which is practically constant (as is technically evident). This makes it possible to estimate the clock skew from the timestamps of arrival and hence to identify the message transmitter for detection of intrusions. For a given message identifier ID, the accumulated clock offset for the timestamps of arrival is obtained. Since the clock skew is constant, the dynamics of the accumulated clock offset is linear, and it can thus be recursively estimated with a linear-regression model. The problem of linear regression can be formulated as shown by Eq. 5 below:

O.sub.acc[k]=S[k]t[k]+e[k] (5)

[0072] At the generic k-th step of the calculation procedure, O.sub.acc[k] is the accumulated offset on the k-th batch of N messages analysed, S[k] is the regression parameter, t[k] is the time that has elapsed, and e[k] is the identification error. The regression parameter S[k] represents the slope of the linear model and hence the estimated skew of the clock. The identification error, e[k], represents the residue that is not explained by the model (the intercept). In the calculation procedure, the parameters O.sub.acc, S, t, O, μ, and e are updated every N messages, i.e., k·N messages are examined up to step k. To determine the unknown parameter, the regression parameter S, an “instantaneous” recursive-least-square (RLS) algorithm is used, which uses the residue as target function to minimize the sum of the squares of the modelling errors.

[0073] As shown in the flowchart of FIG. 4, which represents in detail the fingerprinting procedure 310, Eqs. (1)-(5) define an implementation of the fingerprinting operation 310 through an “instantaneous” recursive-least-square (RLS) algorithm, where, in a step 312, corresponding to implementation of Eq. (1), it is envisaged to compute an average-offset value O.sub.avg[k] for one or more intervals of a number N of received messages. Next, in a step 314, corresponding to implementation of Eq. (2) or, in the alternative formulation, Eq. (4), it is envisaged to compute a current accumulated offset value O.sub.acc[k] by adding to the previous accumulated value, calculated at the previous step k−1, the absolute value of the average offset O.sub.avg[k] on the current interval k, which in the formulation of Eq. (4) is multiplied by the number N of received messages.

[0074] In this way, the time drift, or clock offset, designated by O.sub.acc[k] is accumulated.

[0075] By accumulating values of clock offset as indicated by Eq. 3, there is an increment of clock offset, i.e., the accumulated clock offset O.sub.acc[k], which is substantially linear and hence describes graphically a straight line, which is substantially unique for each of the message identifiers of each cluster, calculated as a function of the period T.

[0076] In step 316 there is hence solved the problem of regression as in Eq. (5), by computing in particular the regression parameter S and the identification error e corresponding to the values of accumulated clock offset O.sub.acc[k] obtained in the previous steps.

[0077] Provided hereinafter is an example in pseudocode used for recursive calculation and updating of the parameters of the linear model. Present at points 23 and 24 are, respectively, Eqs. (1) and (2) (steps 312-314) of calculation of the accumulated clock offset that is entered into the procedure 300 of message analysis. A function SKEWUPDATE(t,e) updates the skew values (S[k]); in this function steps 3-5 correspond to the RLS algorithm. Steps 7-21 correspond to calculation of the timestamp intervals T.sub.n from the arrival times, a.sub.n−a.sub.n−1; step 22 corresponds to calculation of the average interval. In step 25, the identification error e[k] is computed as the difference between the accumulated offset and the straight line having as slope the skew S[k−1] at step k−1. Associated to the skew S[k], or regression parameter, is the least-square value of the function SKEWUPDATE(t,e).

[0078] 1. Initialize: S[0]=P[0]=δI
[0079] 2. function SKEWUPDATE(t,e) ⊳ RLS algorithm
[0080] 3.   [00003] $G [k] \leftarrow \frac{λ^{- 1} P [k - 1] t [k]}{1 + λ^{- 1} t^{2} [k] P [k - 1]}$
[0081] 4.   P[k]←λ.sup.−1(P[k−1]−G[k]t[k]P[k−1])
[0082] 5.   return S[k]←S[k−1]+G[k]e[k]
[0083] 6. end function
[0084] 7. for k-th step do
[0085] 8.   a.sub.0←timestamp of arrival of most recent received message
[0086] 9.   n←1
[0087] 10.   while n≤N do
[0088] 11.     if current time>>a.sub.n−1 then
[0089] 12.       /* it no longer receives the message */
[0090] 13.       a.sub.n, . . . , a.sub.N←significantly high values
[0091] 14.       T.sub.n, . . . , T.sub.N←significantly high values
[0092] 15.       break
[0093] 16.     else
[0094] 17.       a.sub.n←timestamp of arrival of n-th message
[0095] 18.       T.sub.n←a.sub.n−a.sub.n−1 ⊳ timestamp interval
[0096] 19.       n←n+1
[0097] 20.     end if
[0098] 21.   end while
[0099] 22.   [00004] $μ_{T} [k] \leftarrow \frac{1}{N - 1} \sum_{i = 1}^{N - 1} T_{i}$ ⊳ average timestamp interval
[0100] 23.   [00005] $O_{avg} [k] \leftarrow \frac{1}{N - 1} \sum_{i = 2}^{N} [a_{i} - (a_{1} + (i - 1) μ_{T} [k - 1])]$
[0101] 24.   O.sub.acc[k]←O.sub.acc[k−1]+|O.sub.avg[k]| ⊳ accumulated offset
[0102] 25.   e[k]←O.sub.acc[k]−S[k−1]t[k] ⊳ identification error
[0103] 26.   S[k]←SKEWUPDATE(t,e) ⊳ clock skew
[0104] 27. end for
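Purely as a non-limiting sketch, steps 3-5 and 25-26 of the pseudocode may be rendered in runnable Python for the scalar case. The class name `SkewEstimator`, the forgetting factor, the initial covariance, and the zero initial skew are assumptions chosen for illustration and are not values prescribed by the description.

```python
class SkewEstimator:
    """Scalar recursive least squares for O_acc[k] = S[k] * t[k] + e[k]."""

    def __init__(self, forgetting=0.9995, delta=1.0, initial_skew=0.0):
        self.lam = forgetting     # lambda, forgetting factor (assumed value)
        self.p = delta            # P[0], initial covariance (assumed value)
        self.skew = initial_skew  # S[0] (assumed starting point)

    def update(self, elapsed, accumulated_offset):
        """Compute the residual (step 25), then apply SKEWUPDATE (steps 3-5)."""
        # Step 25: identification error against the previous skew estimate.
        error = accumulated_offset - self.skew * elapsed
        # Step 3: gain G[k].
        gain = (self.p * elapsed / self.lam) / (
            1.0 + elapsed * elapsed * self.p / self.lam
        )
        # Step 4: covariance update P[k].
        self.p = (self.p - gain * elapsed * self.p) / self.lam
        # Step 5: skew update S[k].
        self.skew += gain * error
        return self.skew, error


# Synthetic check: an accumulated offset growing with a slope of 50
# (arbitrary units per second) should yield an estimate close to 50.
estimator = SkewEstimator()
true_skew = 50.0
for k in range(1, 201):
    t_k = 0.2 * k  # elapsed time at step k, in seconds
    estimator.update(t_k, true_skew * t_k)
estimated = estimator.skew
```

On noise-free linear input the estimate converges to the true slope within a few steps; the residual returned by `update` is the quantity that the anomaly detection later inspects as the intercept e[k].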

[0105] What is obtained in terms of accumulated clock offset appears in FIG. 5, which represents schematically the fact that for each ECU h, j, l there are as many straight lines as there are periodic messages, i.e., identifiers ID, that the respective ECU is expected to send on the communication bus. Given that each periodicity is analysed separately from the others (in the sense that the messages with a period T of 10 ms are analysed separately from those with a period of 20 ms or some other periodicity), from the analysis of the clock offset the result described schematically hereinafter is obtained. In FIG. 5, the notation ID.sub.r,h,[T.sub.i.sub.] denotes the straight line constructed as described previously via RLS, corresponding to the r-th identifier ID sent by the h-th ECU with periodicity T.sub.i, i.e., assigned by the clustering operation 200 to the cluster or group with periodicity T.sub.i. In order to prevent any erroneous classification in the case where the sheaves of straight lines are too close to one another and there could hence be some ambiguity in deciding the membership group, a correlation analysis (via the Pearson coefficient) is used, in so far as there exists technical evidence that, even though the accumulated clock offsets may be very close to one another, the variability with which the time drift persists remains unique for that specific ECU.

[0106] The procedure 310 further comprises, as shown in FIG. 4, computing in a step 318 a correlation index ρ of the average offset O.sub.avg of the messages. In this regard, FIG. 6A represents a diagram, where appearing on the abscissae is the average offset O.sub.avg of messages with identifier ID.sub.j sent by an ECU h, and appearing on the ordinates is the average offset O.sub.avg of messages with different identifier ID.sub.i, but with one and the same periodicity, sent by a different ECU k. In the case where two messages have the same periodicity, but come from different ECUs, even though in effect they have a similar associated clock offset, they are in any case distinguishable by computing the correlation index ρ of the average offset O.sub.avg in so far as the correlation is low in absolute value. Instead, the messages coming from the same device/node/ECU, as shown in FIG. 6B, where the identifiers, ID.sub.i, ID.sub.j, are different, the periodicities are equal, and the ECU is the same, the ECU h, have an accumulated clock offset that is very similar and a correlation index ρ that is very high (the experimental evidence indicates a value higher than 0.8).

[0107] Consequently, computed in step 318 are correlation indices ρ of pairs of messages with similar period, which hence belong to one and the same cluster obtained from the clustering operation 200, with different identifiers ID.sub.i, ID.sub.j, which in reception are found to come from one and the same ECU or node 11 (for example, the ECU h as in FIG. 6B); the indices are derived from the corresponding average-offset values O.sub.avg calculated in step 312.
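The correlation analysis of step 318 can be sketched in Python as follows, using the Pearson coefficient on two sequences of average offsets O.sub.avg and the value 0.8 mentioned above as threshold. The sample sequences are made-up numbers and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def same_transmitter(avg_offsets_i, avg_offsets_j, threshold=0.8):
    """First check: correlation of the average offsets above the threshold."""
    return pearson(avg_offsets_i, avg_offsets_j) > threshold


# Made-up per-batch average offsets for three identifiers of one cluster.
offsets_a = [1.0, 1.4, 0.9, 1.6, 1.2, 1.5]
offsets_b = [2.1, 2.9, 1.7, 3.3, 2.5, 3.0]  # varies together with offsets_a
offsets_c = [1.0, 0.9, 1.2, 1.1, 1.2, 1.2]  # varies independently

pair_ab = same_transmitter(offsets_a, offsets_b)
pair_ac = same_transmitter(offsets_a, offsets_c)
```

Here the first pair is treated as coming from one and the same ECU, whereas the second pair is not, even though the magnitudes of the offsets are comparable.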

[0108] The subsequent anomaly-detection procedure 320 is based on the analysis of the change of slope, i.e., S[k], and intercept, i.e., e[k], of the straight lines ID.sub.r,h,[T.sub.i.sub.], supplied in step 316, corresponding to which are in effect the anomalies/attacks described previously. In particular, a change of slope S[k] corresponds to a variation of the periodicity of the message with that specific identifier ID, i.e., for the straight line ID.sub.r,h[T.sub.i], the message M.sub.r of the group with period T.sub.i from the node or ECU h. If the period increases (the frequency at which it is sent decreases), then the slope S[k] decreases (the straight line is less “inclined”). Instead, if the period decreases (and the frequency increases), the straight line has a steeper slope S[k].

[0109] The joint analysis of the accumulation of clock offsets, and hence of slope S[k] and intercept e[k] of the straight line (supplied by step 316), and of the correlation index ρ between messages that apparently have the same origin makes it possible both to understand whether the communication network is under attack, consistently with the previous definitions of anomaly, and to understand from which ECU (node 11) a certain message with a specific identifier ID.sub.r comes, where r is the index of the message identifiers, to each value of r there corresponding a different identifier, in particular in the list of the identifiers allowed accessible to the device 20.

[0110] FIG. 7 hence shows a diagram of an embodiment of the method so far described.

[0111] Indicated by block 100 is a white-listing step, i.e., of application of a white-list filter, namely, a filter that allows only passage of the elements indicated in a list, the white list, as a step preliminary to steps 200 and 300. This filter makes it possible to accept only the message identifiers ID effectively present in the white list associated to the control node. Types of identifiers ID not belonging to the list may be discarded, recorded in special data structures, and possibly reported to the user through specific signals.
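A minimal sketch of the white-list filter of block 100 follows. The message representation (a dictionary with a hypothetical `"id"` field) and the function name `apply_white_list` are assumptions made for illustration only; the description does not specify a data format.

```python
def apply_white_list(messages, white_list):
    """Split messages into accepted and rejected according to a white list.

    `messages` is an iterable of dicts with an "id" key (hypothetical layout);
    `white_list` is the set of identifiers allowed at the control node.
    """
    accepted, rejected = [], []
    for msg in messages:
        if msg["id"] in white_list:
            accepted.append(msg)
        else:
            # Rejected identifiers may be discarded, logged, or reported.
            rejected.append(msg)
    return accepted, rejected
```

Keeping the rejected messages in a separate list, rather than dropping them silently, leaves room for the recording and reporting options mentioned above.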

[0112] Designated by 200 is then the clustering or grouping step that on the messages carries out the separation according to the period to obtain the groups of messages ID.sub.r,h,[T.sub.1.sub.], . . . , ID.sub.r,h,[T.sub.n.sub.].
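The grouping by period of step 200 might be sketched as follows. The description does not detail the clustering technique here, so this example swaps in a simple stand-in: the period of each identifier is estimated as the mean inter-arrival time and assigned to the nearest of a list of nominal periods. The names `estimate_period` and `group_by_period`, the `nominal_periods` list, and the 10% tolerance are all assumptions.

```python
from collections import defaultdict


def estimate_period(arrival_times):
    """Mean gap between consecutive arrival times (needs at least 2 samples)."""
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    return sum(gaps) / len(gaps)


def group_by_period(arrivals_by_id, nominal_periods, rel_tol=0.1):
    """Map each nominal period to the identifiers whose estimated period matches.

    Identifiers with fewer than two arrivals, or whose period is not within
    `rel_tol` of any nominal period, are left ungrouped (treated as aperiodic).
    """
    groups = defaultdict(list)
    for msg_id, times in arrivals_by_id.items():
        if len(times) < 2:
            continue
        period = estimate_period(times)
        nearest = min(nominal_periods, key=lambda p: abs(p - period))
        if abs(nearest - period) <= rel_tol * nearest:
            groups[nearest].append(msg_id)
    return dict(groups)
```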

[0113] The above groups of messages are supplied to the message-analysis procedure 300, which comprises the fingerprinting procedure 310 and the anomaly-detection procedure 320.

[0114] FIG. 8 shows in greater detail the above message-analysis procedure 300.

[0115] As has been said, the procedure 310 obtains, from the arrival times a.sub.i of the groups of messages ID.sub.r,h[T.sub.1.sub.], . . . , ID.sub.r,h,[T.sub.n.sub.], the respective slopes and intercepts, S[k], e[k] via Eqs. (1)-(2) (or alternatively Eq. (3) and (4)) and calculation of a regression (Eq. (5)).
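The chain from arrival times to slope and intercept can be illustrated with a hedged sketch. Eqs. (1)-(5) are not reproduced in this passage, so the formulas below are assumptions: the average offset of a batch is taken as the mean deviation of each arrival from its expected time (first arrival of the batch plus a multiple of the nominal period), the accumulated offset is the running sum of absolute average offsets, and the regression is an ordinary least-squares line over (batch end time, accumulated offset). All function names are hypothetical.

```python
def average_offsets(arrival_times, period, batch_size):
    """Return (batch end time, average offset) for each full batch."""
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    out = []
    for start in range(0, len(arrival_times) - batch_size + 1, batch_size):
        batch = arrival_times[start:start + batch_size]
        first = batch[0]
        # Deviation of each later arrival from its expected arrival time.
        devs = [a - (first + i * period) for i, a in enumerate(batch)][1:]
        out.append((batch[-1], sum(devs) / len(devs)))
    return out


def accumulate(avg_offsets):
    """Running sum of absolute average offsets, one point per batch."""
    total, acc = 0.0, []
    for t, offset in avg_offsets:
        total += abs(offset)
        acc.append((t, total))
    return acc


def fit_line(points):
    """Ordinary least-squares slope and intercept over (t, y) points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least 2 points")
    mt = sum(t for t, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((t - mt) ** 2 for t, _ in points)
    sxy = sum((t - mt) * (y - my) for t, y in points)
    slope = sxy / sxx
    return slope, my - slope * mt
```

With a sender whose clock runs slightly slow, so that every message arrives a constant amount later than the nominal period predicts, the accumulated offset grows linearly and the fitted slope acts as a per-node fingerprint.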

[0116] In the anomaly-detection procedure 320, it is next envisaged to perform, on the basis of the values of slope S[k] and intercept e[k], as well as the correlation values ρ, calculated on the received messages, downstream of the clustering procedure 200 and the whitelisting procedure 100, a classification, for example through a three-level decisional logic classifier.

[0117] Hence, the procedure 320 comprises a step of correlation analysis 350 on messages belonging to one and the same group of messages ID.sub.r,h,[T.sub.i.sub.]. As described in FIG. 11 in the above correlation analysis there is computed a correlation index ρ between the average offsets O.sub.avg of pairs of messages coming from one and the same control unit h in the group of messages with the same period T.sub.i.

[0118] Once the values of correlation ρ have been obtained in step 350, a testing step 355 checks whether the value of the correlation index ρ is higher than a threshold value, which can preferably be set, for example to 0.8. Supplied to the classifier 360 is the information on whether the testing step 355 has yielded a positive result or a negative result (e.g., yes/no, or pass/fail).

[0119] The procedure 320 further comprises a step 330 of comparison of slopes S[k] associated to consecutive messages belonging to one and the same group of messages ID.sub.r,h,[T.sub.i.sub.]. In particular, selected in step 330 are the values of slope of two consecutive messages. Then, in step 335 a check is made to verify whether their difference is greater than a set slope threshold STH.

[0120] Supplied to the classifier 360 is the information on whether the testing step 335 has yielded a positive result or a negative result (e.g., yes/no, or pass/fail).

[0121] The procedure 320 further comprises a step 340 of comparison of values of intercept e[k] associated to consecutive messages belonging to one and the same group of messages ID.sub.r,h,[T.sub.i.sub.].

[0122] In particular, selected in step 340 are the values of error or intercept e[k] of two consecutive messages received; then, in the testing step 345 a check is made to verify whether their difference is greater than a set intercept threshold ETH.

[0123] Supplied to the classifier 360 is the information on whether the testing step 345 has yielded a positive result or a negative result (e.g., yes/no, or pass/fail).
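The two threshold tests 335 and 345 can be sketched as below. The sketch assumes that "difference" means the absolute difference between the two most recent values, and the names `exceeds` and `timing_checks`, as well as the parameter names `s_th` and `e_th` for the thresholds STH and ETH, are illustrative.

```python
def exceeds(previous, current, threshold):
    """True when two consecutive values differ by more than the threshold."""
    return abs(current - previous) > threshold


def timing_checks(slopes, intercepts, s_th, e_th):
    """Compare the last two slopes and intercepts against their thresholds.

    Returns (slope_changed, intercept_changed); each flag is True when the
    corresponding difference is greater than its threshold.
    """
    if len(slopes) < 2 or len(intercepts) < 2:
        raise ValueError("need at least two consecutive values of each quantity")
    slope_changed = exceeds(slopes[-2], slopes[-1], s_th)
    intercept_changed = exceeds(intercepts[-2], intercepts[-1], e_th)
    return slope_changed, intercept_changed
```

The flags are expressed as "changed" rather than pass/fail so that the downstream classifier does not depend on which polarity is called positive.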

[0124] The classifier 360 is of a heuristic type and is configured, for example, via logic rules of an IF, . . . , THEN type. Other alternative embodiments of pattern-recognition classifier are possible, including neural networks. In alternative embodiments, the classifier 360 may comprise one or more further input quantities or information data.

[0125] In a preferred embodiment, supplied to the classifier 360 is first the result of the test 355 on the correlation index, and then the result of the other two tests 335 and 345, preferably first the test 335 and then test 345. It has been found that this order of the tests presents advantages in terms of classification accuracy, but it is clear that in variant embodiments it is possible to order the tests differently.

[0126] Hence, for example, if the correlation analysis 350-355 yields a negative result, or if the slopes of two straight lines nominally belonging to the same ECU are very different (steps 330-335), the classifier 360 records a classification error. This classification error can be reported to the user via a specific signal, or recorded in data structures to generate other types of events.

[0127] Hence, it is envisaged to supply the results of said first check 350, said second check 330, and said third check 340 to an operation of message classification, performed by the classifier 360, configured to supply a result RC comprising a confirmation of message classification according to the transmitting node 11, for example the ECU h, and the message identifier, ID, or an indication of classification error as a function of said results.

[0128] If the correlation coefficient ρ in the check 350 is higher than a first given threshold, the classification operation indicates the node that is transmitting the messages as corresponding to the nominal node; if it is lower, it records a classification error and indicates the transmitting node as being different from the nominal node.

[0129] If the second check 330 yields a negative result, i.e., there is a change in slope, the classification operation 360 indicates as result RC a masquerade attack.

[0130] If the third check 340 yields a negative result, the classification operation 360 indicates as result RC a fabrication attack.

[0131] As has been said, in one embodiment, the above classification operation 360 is an operation of decisional logic discrimination in which first the result of the first check 350 is evaluated, i.e., whether the correlation coefficient ρ is higher than a first given threshold, and the result of the second check and/or the result of the third check are/is evaluated if the result of the first check is affirmative.
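The decisional logic of the classification operation 360 might look like the following sketch, which evaluates the correlation check first and the slope and intercept checks only when the correlation check is affirmative, with the slope check before the intercept check. The function name `classify`, the string labels of the result RC, and the boolean inputs are assumptions; only the ordering and the mapping of outcomes follow the description.

```python
def classify(rho, slope_changed, intercept_changed, rho_threshold=0.8):
    """Three-level IF/THEN classifier returning a label for the result RC.

    rho               -- correlation coefficient of the average offsets
    slope_changed     -- True when the slope difference exceeded its threshold
    intercept_changed -- True when the intercept difference exceeded its threshold
    """
    # Level 1: correlation with other identifiers of the nominal node.
    if rho <= rho_threshold:
        return "classification_error"  # transmitter differs from nominal node
    # Level 2: change of slope, i.e. of the message periodicity.
    if slope_changed:
        return "masquerade_attack"
    # Level 3: change of intercept.
    if intercept_changed:
        return "fabrication_attack"
    return "confirmed"
```

For instance, `classify(0.95, False, False)` confirms the nominal node, while `classify(0.95, True, False)` reports a masquerade attack.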

[0132] The method described hence exploits the information known a priori to check the correlation between identifiers ID belonging to one and the same ECU, or node 11. By information known a priori, in so far as it is stored in the device 20 or in any case accessible thereto, is understood the information concerning the network topology, the number of ECUs, or more in general of transmitting nodes 11, and the number and type of identifiers ID transmitted by each of these ECUs.

[0133] Hence, from what has been described above, the advantages of the solution proposed emerge clearly.

[0134] The solution described advantageously makes it possible to operate as a virtual MAC, recognizing the behaviour of a specific device from the time drift of the periodic messages that the device itself sends over the network bus.

[0135] The invention has been described in an illustrative manner. It is to be understood that the terminology which has been used is intended to be in the nature of words of description rather than of limitation. Many modifications and variations of the invention are possible in light of the above teachings. Therefore, within the scope of the appended claims, the invention may be practiced other than as specifically described.

CPC classification

H04L2463/121
H04L2209/84
H04L12/40
H04L2012/40215
H04L63/1416
H04L2463/146
H04L63/0876
H04L2012/40273

International classification

H04L9/40
