Driving assistance method and system
11648935 · 2023-05-16
Assignee
Inventors
- David Sierra Gonzalez (Gijon, ES)
- Christian Laugier (Montbonnot Saint-Martin, FR)
- Jilles Steeve Dibangoye (Villeurbanne, FR)
- Alejandro Dizan Vasquez Govea (Palo Alto, CA, US)
- Nicolas Vignard (Brussels, BE)
CPC classification
B60W50/14
PERFORMING OPERATIONS; TRANSPORTING
B60W30/0956
PERFORMING OPERATIONS; TRANSPORTING
B60W30/0953
PERFORMING OPERATIONS; TRANSPORTING
G06V20/58
PHYSICS
G06V20/588
PHYSICS
G08G1/166
PHYSICS
B60W30/09
PERFORMING OPERATIONS; TRANSPORTING
G08G1/0962
PHYSICS
B60W2050/0028
PERFORMING OPERATIONS; TRANSPORTING
International classification
B60W30/09
PERFORMING OPERATIONS; TRANSPORTING
B60W30/095
PERFORMING OPERATIONS; TRANSPORTING
B60W50/00
PERFORMING OPERATIONS; TRANSPORTING
B60W50/14
PERFORMING OPERATIONS; TRANSPORTING
G06V20/56
PHYSICS
G06V20/58
PHYSICS
Abstract
A driving assistance system includes a sensor set, a data storage device, a data processor and an output device. The sensor set detects a set of road users and, for each road user, a current state including a current speed and a current position. The data storage device stores a finite plurality of behavioral models. The data processor assigns a behavioral model to each road user; probabilistically estimates, for each road user, a belief state comprising a set of alternative subsequent states and corresponding probabilities, each alternative subsequent state including a speed and a position, according to the behavioral model assigned to each road user; and determines a risk of collision of the road vehicle with a road user, based on the probabilistically estimated belief state of each road user. The output device outputs a driver warning signal or executes an avoidance action if the risk of collision exceeds a predetermined threshold.
Claims
1. A driving assistance method for a road vehicle, the driving assistance method comprising the steps of: detecting, within a traffic scene including the road vehicle, a set of road users and, for each road user of the set of road users, a current state including a current speed and a current position; assigning a dynamic behavioral model, from among a finite plurality of behavioral models, to each road user of the set of road users, wherein said dynamic behavioral model has at least one dynamic feature for taking into account a state of a road user other than the road user to which the dynamic behavioral model is assigned, the at least one dynamic feature of the dynamic behavioral model being applied to the road user to which the dynamic behavioral model is assigned; probabilistically estimating by a data processor, for each road user of the set of road users, a belief state for each time step of a plurality of successive subsequent time steps, each belief state comprising a set of alternative states and corresponding probabilities, each alternative state including a speed and a position, by using the dynamic behavioral model assigned to each road user of the set of road users, wherein a value of the at least one dynamic feature of said dynamic behavioral model is probability-weighted according to an occupancy probability distribution calculated on the basis of belief states of the set of road users for a preceding time step, to estimate a probability of an action from a set of alternative actions for each road user of the set of road users at the preceding time step, and then calculating each alternative state resulting from each action of the set of alternative subsequent actions for the corresponding road user; determining a risk of collision of the road vehicle with a road user of the set of road users, based on the probabilistically estimated belief state of each road user of the set of road users at a time step of the plurality of successive subsequent time 
steps; and outputting a driver warning signal and/or executing an avoidance action if the risk of collision exceeds a predetermined threshold.
2. The driving assistance method of claim 1, wherein the step of assigning a behavioral model, from among a finite plurality of behavioral models, to each road user of the set of road users is carried out, by a data processor, based on behavior by each road user of said set of road users prior to the current states.
3. The driving assistance method of claim 2, wherein, to assign to a road user a behavioral model from among the finite plurality of behavioral models, an aggregated probability of a prior trajectory of the road user, including successive prior states of the road user, is calculated using each behavioral model, and the behavioral model with the highest aggregated probability is selected.
4. The driving assistance method of claim 1, wherein each behavioral model of the finite plurality of behavioral models is associated with a cost function for calculating a cost of a subsequent action from a current state, and wherein the probability of an action from a set of alternative subsequent actions for a given road user is estimated as one minus the ratio of a cost of the action to the sum total of costs of the set of alternative subsequent actions, according to the cost function associated with a behavioral model assigned to the given road user.
5. The driving assistance method of claim 4, wherein the cost function comprises a dynamic cost component whose value is calculated by multiplying the probability-weighted value of the at least one dynamic feature by a corresponding component of a dynamic weight vector associated to a driving behavior.
6. The driving assistance method of claim 1, comprising a step of sorting the set of road users by order of decreasing driving priority, before the step of estimating a probability of each action from a set of alternative subsequent actions, which is successively carried out for each road user of the set of road users following the order of decreasing driving priority.
7. The driving assistance method of claim 1, wherein the at least one dynamic feature comprises a time-headway and/or a time-to-collision between the road user to which the dynamic behavioral model is assigned and another road user.
8. The driving assistance method of claim 1, wherein the dynamic behavioral model is a dynamic behavioral model learned from observed road user behavior using a machine learning algorithm.
9. The driving assistance method of claim 1, wherein the traffic scene comprises a multi-lane road, and the set of alternative subsequent actions for each road user of the set of road users comprises lane and/or speed changes.
10. A driving assistance system for a road vehicle, the driving assistance system comprising: a sensor set for detecting, within a traffic scene including the road vehicle, a set of road users and, for each road user of said set of road users, a current state including a current speed and a current position; a data storage device for a database comprising a finite plurality of behavioral models; a data processor, connected to the sensor set and to the data storage device, for: assigning a dynamic behavioral model, from among the finite plurality of behavioral models, to each road user of the set of road users, wherein said dynamic behavioral model has at least one dynamic feature for taking into account a state of a road user other than the road user to which the dynamic behavioral model is assigned, the at least one dynamic feature of the dynamic behavioral model being applied to the road user to which the dynamic behavioral model is assigned, probabilistically estimating, for each road user of the set of road users, a belief state for each time step of a plurality of successive subsequent time steps, said belief state comprising a set of alternative states and corresponding probabilities, each alternative state including a speed and a position, each belief state being probabilistically estimated by using the dynamic behavioral model assigned to each road user of the set of road users, wherein a value of the at least one dynamic feature of said dynamic behavioral model is probability-weighted according to an occupancy probability distribution calculated on the basis of belief states of the set of road users for a preceding time step, to estimate a probability of an action from a set of alternative actions for each road user of the set of road users at the preceding time step, and then calculating each alternative state resulting from each action of the set of alternative subsequent actions for the corresponding road user, and determining a risk of collision of the 
road vehicle with a road user of said set of road users, based on the probabilistically estimated belief state of each road user of the set of road users at a time step of the plurality of successive subsequent time steps; and an output device, connected to the data processor, for outputting a driver warning signal and/or executing an avoidance action if the risk of collision exceeds a predetermined threshold.
11. A road vehicle comprising a driving assistance system according to claim 10.
12. The driving assistance system of claim 10, wherein each behavioral model of the finite plurality of behavioral models is associated with a cost function for calculating a cost of a subsequent action from a current state, and wherein the cost function comprises a dynamic cost component whose value is calculated by multiplying the probability-weighted value of the at least one dynamic feature by a corresponding component of a dynamic weight vector associated to a driving behavior, and wherein the probability of an action from a set of alternative subsequent actions for a given road user is estimated as one minus the ratio of a cost of the action to the sum total of costs of the set of alternative subsequent actions, according to the cost function associated with a behavioral model assigned to the given road user.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The invention may be more completely understood in consideration of the following detailed description of an embodiment in connection with the accompanying drawings.
(6) While the invention is amenable to various modifications and alternative forms, specifics thereof have been shown by way of example in the drawings and will be described in detail. It should be understood, however, that the intention is not to limit aspects of the invention to the particular embodiment described. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the invention.
DETAILED DESCRIPTION
(7) For the following defined terms, these definitions shall be applied, unless a different definition is given in the claims or elsewhere in this specification.
(8) As used in this specification and the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the content clearly dictates otherwise. As used in this specification and the appended claims, the term “or” is generally employed in its sense including “and/or” unless the content clearly dictates otherwise.
(9) The following detailed description should be read with reference to the drawings in which similar elements in different drawings are numbered the same. The detailed description and the drawings, which are not necessarily to scale, depict illustrative embodiments and are not intended to limit the scope of the invention. The illustrative embodiments depicted are intended only as exemplary. Selected features of any illustrative embodiment may be incorporated into an additional embodiment unless clearly stated to the contrary.
(12) The obstacle tracker module 211 processes the incoming data from the inertial measurement unit 201, satellite navigation receiver 202, LIDAR 203 and radar 204 in order to identify and track obstacles, in particular mobile obstacles such as other road users in a traffic scene within an area at least longitudinally centered on the road vehicle 1. This can be performed, for instance, using any one of the alternative algorithms disclosed by Anna Petrovskaya, Mathias Perrollaz, Luciano Oliveira, Luciano Spinello, Rudolph Triebel, Alexandros Makris, John-David Yoder, Christian Laugier, Urbano Nunes and Pierre Bessière in “Awareness of Road Scene Participants for Autonomous Driving” in Chapter “Fully Autonomous Driving” of “Handbook of Intelligent Vehicles”, Vol. 2, Springer-Verlag London Ltd, 2012, edited by Azim Eskandarian.
(13) On a multi-lane road within a road network, the lane tracker module 212 processes the incoming data from the front camera 205 in order to identify and track the lanes of the multi-lane road, whereas the localization module 213 processes the same incoming data to localize the road vehicle 1 on one of those lanes and in the road network.
(14) The data processor 103 is also adapted to further process the output data from the first functional processing layer in order to predict a future traffic scene in the area centered on the road vehicle 1. This is illustrated on
(15) The observation management module 221 merges the output from the obstacle tracker module 211, lane tracker module 212 and localization module 213 so as to generate a grid G covering the road in an area centered on the road vehicle 1 at least in a direction of travel, which may be a longitudinal direction of the road, and locate the other road users in the current traffic scene on this grid G, which is divided into cells (x, y), wherein x is a longitudinal cell index corresponding to a longitudinal position on the road, and y is a lane index. The longitudinal resolution of the grid can be denoted by r.sub.x. The observation management module 221 allocates an index i to each road user of a finite set V of m road users, including road vehicle 1, within the area centered on road vehicle 1, and identifies its current position (x.sup.i, y.sup.i) on this grid G, as well as its current speed z.sup.i. The observation management module 221 also identifies as the current preferred speed z.sup.i.sub.• of each road user i the maximum speed observed for that road user i since the last change in the legal speed limit on the road. Together, the tuple ((x.sup.i, y.sup.i), z.sup.i, z.sup.i.sub.•) of position (x.sup.i, y.sup.i), speed z.sup.i and preferred speed z.sup.i.sub.• represents the state s.sup.i of each road user i.
(16) The behavioral model assignment module 222 assigns a behavioral model, selected from a finite plurality of behavioral models in a database stored in the data storage device 102, to each road user i. These behavioral models may be defined by cost functions, and more specifically by dynamic cost functions C.sup.i.sub.t(s.sup.i) with a static part C.sub.s.sup.i(s.sup.i)=θ.sub.s.sup.i·f.sub.s(s.sup.i) and a dynamic part C.sub.d,t.sup.i(s.sup.i)=θ.sub.d.sup.i·f.sub.d,t(s.sup.i), following the equation:
C.sub.t.sup.i(s.sup.i)=θ.sub.s.sup.i·f.sub.s(s.sup.i)+θ.sub.d.sup.i·f.sub.d,t(s.sup.i)
wherein f.sub.s(s.sup.i) is a time-invariant vector of static features of state s.sup.i, f.sub.d,t(s.sup.i) is a vector of dynamic features of state s.sup.i at timestep t, wherein timestep t can take the values {0, 1, . . . , T−1} within a time horizon T, and θ.sub.s.sup.i and θ.sub.d.sup.i are weight vectors to be respectively applied to the static and dynamic features in order to establish the cost of those features for a given type of driving behavior.
(17) The static features may notably comprise a lane preference, and a speed deviation. For a multi-lane road with n.sub.l lanes, the lane preference may be expressed as a vector of n.sub.l mutually exclusive binary features, whereas the speed deviation corresponds to a difference between speed z.sup.i and preferred speed z.sup.i.sub.•.
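As a concrete illustration of the linear cost function above, the following Python sketch evaluates C.sub.t.sup.i(s.sup.i) for a state on a three-lane road. This is not code from the patent: the feature layout, weight values and state encoding are hypothetical.

```python
# Hypothetical sketch of C_t^i(s^i) = theta_s . f_s(s^i) + theta_d . f_d,t(s^i).
# Feature layout and all numeric values are illustrative, not taken from the patent.

def static_features(state, n_lanes, preferred_speed):
    """f_s: one binary lane-preference feature per lane, plus the speed deviation."""
    x, y, z = state  # longitudinal cell index, lane index, speed
    lane_onehot = [1.0 if lane == y else 0.0 for lane in range(n_lanes)]
    return lane_onehot + [abs(z - preferred_speed)]

def cost(state, dynamic_features, theta_s, theta_d, n_lanes, preferred_speed):
    """Linear cost: static weights dotted with f_s, dynamic weights dotted with f_d,t."""
    f_s = static_features(state, n_lanes, preferred_speed)
    return (sum(w * f for w, f in zip(theta_s, f_s))
            + sum(w * f for w, f in zip(theta_d, dynamic_features)))

# Hypothetical weights: three lane weights plus a speed-deviation weight (static),
# and two dynamic weights (e.g. for inverse time-headway and inverse time-to-collision).
theta_s = [0.2, 0.0, 0.1, 1.0]
theta_d = [0.5, 0.5]
c = cost((10, 1, 25.0), [0.2, 0.1], theta_s, theta_d, n_lanes=3, preferred_speed=30.0)
# c = 5.0 (static part) + 0.15 (dynamic part) = 5.15
```

Lower costs correspond to states the modeled driver prefers; the weight vectors are what distinguishes one driving behavior from another.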
(18) The dynamic features of a road user i may notably comprise a time-headway ϕ(s.sup.i, s.sup.j) from a closest leading road user j on the same lane, a time-headway ϕ(s.sup.i, s.sup.k) to a closest trailing road user k on the same lane, a time-to-collision φ(s.sup.i, s.sup.j) with the closest leading road user j on the same lane and a time-to-collision φ(s.sup.i, s.sup.k) with the closest trailing road user k on the same lane.
(19) The time-headway ϕ(s.sup.i, s.sup.j) from a leading road user j can be defined as a time elapsed between the back of the leading road user j passing a point, and the front of road user i reaching the same point, and may be calculated according to the equation:
(20) ϕ(s.sup.i, s.sup.j)=r.sub.x·(x.sup.j−x.sup.i)/z.sup.i
(21) The time-headway ϕ(s.sup.i, s.sup.k) to a trailing road user k can be defined as a time elapsed between the back of road user i passing a point, and the front of the trailing road user k reaching the same point, and may be calculated according to the equation:
(22) ϕ(s.sup.i, s.sup.k)=r.sub.x·(x.sup.i−x.sup.k)/z.sup.k
(23) The time-to-collision φ(s.sup.i, s.sup.j) with a slower leading road user j can be defined as a time to elapse until the front of road user i hits the back of the leading road user j at their current speeds, and may be calculated according to the equation:
(24) φ(s.sup.i, s.sup.j)=r.sub.x·(x.sup.j−x.sup.i)/(z.sup.i−z.sup.j)
(25) The time-to-collision φ(s.sup.i, s.sup.k) with a faster trailing road user k can be defined as a time to elapse until the front of the trailing road user k hits the back of road user i at their current speeds, and may be calculated according to the equation:
(26) φ(s.sup.i, s.sup.k)=r.sub.x·(x.sup.i−x.sup.k)/(z.sup.k−z.sup.i)
(27) Small time-headway and/or time-to-collision values may indicate dangerous situations.
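The four dynamic features above can be sketched in Python as follows. This is a hedged illustration: positions are grid-cell indices converted to distance with the longitudinal resolution r.sub.x of paragraph (15), vehicle lengths are neglected, and the exact formulas of the omitted equations may differ.

```python
# Illustrative time-headway and time-to-collision features on the grid of
# paragraph (15): x positions are cell indices, r_x the longitudinal cell size,
# z speeds. Vehicle lengths are neglected (a simplifying assumption).

def time_headway_leading(x_i, z_i, x_j, r_x):
    """Time for road user i, at speed z_i, to cover the gap to leading user j."""
    return r_x * (x_j - x_i) / z_i

def time_headway_trailing(x_i, x_k, z_k, r_x):
    """Time for trailing user k, at speed z_k, to cover the gap to road user i."""
    return r_x * (x_i - x_k) / z_k

def time_to_collision_leading(x_i, z_i, x_j, z_j, r_x):
    """Time until i catches a slower leading user j; infinite if j is not slower."""
    return r_x * (x_j - x_i) / (z_i - z_j) if z_i > z_j else float("inf")

def time_to_collision_trailing(x_i, z_i, x_k, z_k, r_x):
    """Time until a faster trailing user k catches i; infinite if k is not faster."""
    return r_x * (x_i - x_k) / (z_k - z_i) if z_k > z_i else float("inf")

headway = time_headway_leading(x_i=10, z_i=20.0, x_j=30, r_x=2.0)  # 40 m gap at 20 m/s -> 2.0 s
ttc = time_to_collision_leading(10, 20.0, 30, 15.0, r_x=2.0)       # closing at 5 m/s -> 8.0 s
```

Small values of either feature flag a dangerous situation, which is why the behavioral models weight them.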
(28) The behavioral model database stored in the data storage device 102 in the present embodiment comprises a finite plurality of alternative dynamic behavioral models, each comprising a set of weight vectors θ.sub.s.sup.i and θ.sub.d.sup.i corresponding to a specific driving behavior. So, for instance, the database may contain a safe driver model, corresponding to a road user with a preference for high time-headway and/or time-to-collision values; an aggressive driver model, corresponding to a road user with a high tolerance for low time-headway and/or time-to-collision values and a preference for the left-most lane in a multi-lane road; an exiting driver model, corresponding to a road user aiming to take an oncoming exit from the multi-lane road, and therefore giving preference to merging into the right-most lane over maintaining its preferred speed; and an incoming driver model, corresponding to a road user adapting its speed in order to merge into the road lanes.
(29) Each dynamic behavioral model may have previously been learnt offline from observed road behavior, for example using a machine learning algorithm such as an Inverse Reinforcement Learning (IRL) algorithm. IRL algorithms aim to find a cost function corresponding to a behavioral model underlying a set of observed trajectories, wherein each trajectory may be defined as a sequence of states over a plurality of consecutive timesteps. The goal of an IRL algorithm is to find the weight vectors of the cost function for which the optimal policy obtained by solving the underlying planning problem would result in trajectories sufficiently similar to the observed trajectories according to a given statistic. IRL algorithms have already been shown to be a viable approach for learning cost functions describing driving behaviors, for example by P. Abbeel and A. Y. Ng, in “Apprenticeship learning via inverse reinforcement learning”, Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), Banff, Alberta, Canada, Jul. 4-8, 2004, and by S. Levine and V. Koltun, in “Continuous Inverse Optimal Control with Locally Optimal Examples”, Proceedings of the 29th International Conference on Machine Learning (ICML '12), 2012.
(30) An IRL algorithm that may in particular be used to learn the static and dynamic weight vectors θ.sub.s.sup.i and θ.sub.d.sup.i corresponding to a specific driving behavior is the Maximum Entropy Inverse Reinforcement Learning (ME IRL) algorithm disclosed by Brian D. Ziebart, Andrew Maas, J. Andrew Bagnell and Anind K. Dey in “Maximum Entropy Inverse Reinforcement Learning”, Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, AAAI 2008, Chicago, Ill., USA Jul. 13-17, 2008, pp. 1433-1438. A vehicle with sensors and a data processor may be driven naturally over roads, to observe and record trajectories of road users, from which the behavioral models may be learnt.
(31) The cost function associated with a behavioral model can be used to calculate a probability π.sub.t.sup.i(a.sup.i|s.sup.i) of a given road user i taking an action a.sup.i which will cause a transition Tr.sup.i(s.sup.i, a.sup.i) of the road user i from one state s.sup.i to another state s′.sup.i at time t. This probability may be calculated as one minus the ratio of the cost of state s′.sup.i at time t according to the cost function to the sum total of the costs of states that can be reached, from state s.sup.i, through each and every action available to the road user i from state s.sup.i at time t. For a dynamic cost function C.sup.i.sub.t(s.sup.i), this can be expressed according to the equation:
(32) π.sub.t.sup.i(a.sup.i|s.sup.i)=1−C.sub.t.sup.i(Tr.sup.i(s.sup.i, a.sup.i))/Σ.sub.a′.sup.i∈A.sup.i C.sub.t.sup.i(Tr.sup.i(s.sup.i, a′.sup.i))
wherein A.sup.i is the set of all actions a.sup.i available to the road user i from state s.sup.i at time t.
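The cost-to-probability rule can be sketched in Python as follows. One caveat, flagged here as an assumption: with more than two available actions, the raw scores 1 − C/ΣC sum to n − 1 rather than 1, so this sketch renormalizes them into a distribution, a step the text leaves implicit.

```python
# Sketch of the rule of paragraph (31): pi(a|s) = 1 - C(s'_a) / sum of costs over A.
# With more than two actions the raw scores sum to (n - 1), so we renormalize to
# obtain a probability distribution -- an assumption not spelled out in the text.

def action_probabilities(action_costs):
    """action_costs maps each available action to the cost of the state it reaches."""
    total = sum(action_costs.values())
    raw = {a: 1.0 - c / total for a, c in action_costs.items()}
    norm = sum(raw.values())
    return {a: r / norm for a, r in raw.items()}

# Cheaper actions come out more probable: with costs 1, 1 and 2, the two cheap
# actions each get probability 0.375 and the expensive one gets 0.25.
probs = action_probabilities({"keep_lane": 1.0, "change_left": 1.0, "slow_down": 2.0})
```

The key property is monotonicity: an action leading to a lower-cost state is always assigned a higher probability.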
(33) In the behavioral model assignment module 222, a behavioral model is selected from among the finite plurality of alternative behavioral models in the database stored in the data storage device 102, to be assigned to a given road user i in a current traffic scene, on the basis of a prior trajectory of the road user i. This prior trajectory comprises the successive states of this road user i perceived through the observation management module 221 over a plurality of timesteps, up to the current state s.sup.i. In order to select the behavioral model which best matches the prior trajectory of the road user i from among the finite plurality of alternative behavioral models in the database, the aggregated probability of this prior trajectory according to each behavioral model is calculated using the cost function associated with that behavioral model, and the behavioral model with the highest aggregated probability is selected. More specifically, to calculate the aggregated probability of a prior trajectory according to a behavioral model, the probability of each action throughout the prior trajectory is calculated using the cost function associated with that behavioral model, and these probabilities are multiplied together.
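The model-selection rule of this paragraph can be condensed into the following sketch. It is illustrative only: `action_prob` is a hypothetical callback standing in for the cost-function-based probability π.sub.t.sup.i(a.sup.i|s.sup.i), and the toy models and numbers are invented.

```python
# Sketch of the assignment rule: score each candidate behavioral model by the
# product of the action probabilities it assigns along the observed prior
# trajectory, then keep the highest-scoring model. `action_prob` is a
# hypothetical stand-in for the cost-function-based probability pi_t^i(a|s).

def assign_model(models, trajectory, action_prob):
    """trajectory is a list of (state, action) pairs observed for one road user."""
    best_model, best_score = None, -1.0
    for model in models:
        score = 1.0
        for state, action in trajectory:  # aggregated probability = product over timesteps
            score *= action_prob(model, state, action)
        if score > best_score:
            best_model, best_score = model, score
    return best_model

# Toy data: a driver who kept a large gap at every observed step matches the
# "safe" model (0.8 per step, product 0.512) better than the "aggressive" one (0.027).
table = {("safe", "keep_gap"): 0.8, ("aggressive", "keep_gap"): 0.3}
best = assign_model(["safe", "aggressive"],
                    [(0, "keep_gap"), (1, "keep_gap"), (2, "keep_gap")],
                    lambda m, s, a: table[(m, a)])
```

Because the score is a product of per-step probabilities, longer prior trajectories discriminate between models more sharply.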
(34) The scene prediction module 223 predicts the future evolution of the current traffic scene, as perceived through the observation management module 221, using the behavioral models assigned by the behavioral model assignment module 222 to the road users within that current traffic scene. In particular, the scene prediction module 223 may predict this future evolution throughout a plurality of timesteps from the current traffic scene at a time t=0 to a time t=T, according to the prediction algorithm illustrated on
(35) In a first step S300 in this algorithm, time index t is set to 0, and a belief state b.sup.i.sub.t, which is a probability distribution over the state space of each road user i at time t, is initialized for t=0 by setting, as the initial belief state b.sup.i.sub.0 for each road user i of the m road users in the traffic scene, a probability of 1 for the current state s.sup.i of road user i, as perceived through the observation management module 221. This initial belief state b.sup.i.sub.0 also corresponds to an initial occupancy probability distribution o.sub.0.sup.i(x.sup.i,y.sup.i), wherein an occupancy probability of 1 is set at the current position (x.sup.i, y.sup.i) of road user i, as perceived through the observation management module 221. After this initialization step S300, the m road users in the traffic scene are sorted by descending driving priority in the next step S301. This driving priority may be mainly allocated according to the longitudinal position of the road users, with a leading road user being allocated a higher driving priority than a trailing road user. However, the general and/or local rules of the road may also be taken into account in this driving priority allocation.
(36) After executing step S301, road user index i is set to 1 in step S302. Then, in step S303, for each available action a.sup.i from each possible state s.sup.i of road user i at time t, the value of the dynamic cost component associated with each dynamic feature for each possible location (x.sup.j, y.sup.j) of each other road user j at time t is calculated by multiplying the value of that dynamic feature by the corresponding component of dynamic weight vector θ.sub.d.sup.i according to the behavioral model assigned to road user i and probability-weighted according to the occupancy probability distribution o.sub.t.sup.j(x.sup.j,y.sup.j) of road user j. By “each possible state s.sup.i”, we understand in particular each state s.sup.i with a non-zero probability. In the calculation of dynamic features, such as time-headway and time-to-collision, of road user i with respect to a road user j at a location (x.sup.j, y.sup.j), a probability-weighted average of the speeds z.sup.j of the possible states s.sup.j in which road user j occupies that location (x.sup.j, y.sup.j) may be used as the speed of road user j.
(37) From the resulting set of values for the probability-weighted dynamic cost components, the highest value for each dynamic cost component is then selected to form a probability-weighted dynamic cost vector used in step S304 to calculate the probability π.sub.t.sup.i(a.sup.i|s.sup.i) of each available action a.sup.i from each possible state s.sup.i for road user i at time t.
(38) In the next step S305, the belief state b.sup.i.sub.t+1 for road user i at the next timestep t+1 is calculated based on the probability π.sub.t.sup.i(a.sup.i|s.sup.i) and transition Tr.sup.i(s.sup.i, a.sup.i) associated with each available action a.sup.i from each possible state s.sup.i for road user i at time t, according to the equation:
(39) b.sub.t+1.sup.i(s′.sup.i)=Σ.sub.s.sup.i Σ.sub.a.sup.i∈A.sup.i P(s′.sup.i|s.sup.i, a.sup.i)·π.sub.t.sup.i(a.sup.i|s.sup.i)·b.sub.t.sup.i(s.sup.i)
wherein the operator P(s′.sup.i|s.sup.i, a.sup.i)=1{s′.sup.i=Tr.sup.i(s.sup.i, a.sup.i)} is a binary indicator that only takes values 0 or 1, 1{⋅} being an indicator function.
(40) In the following step S306, the occupancy probability distribution o.sub.t+1.sup.i(x.sup.i, y.sup.i) for road user i at timestep t+1 is calculated on the basis of belief state b.sup.i.sub.t+1, according to the equation:
(41) o.sub.t+1.sup.i(x.sup.i, y.sup.i)=Σ.sub.s.sup.i b.sub.t+1.sup.i(s.sup.i)
wherein the sum extends over all states s.sup.i of road user i whose position is (x.sup.i, y.sup.i).
(42) In the next step S307, it is checked whether road user index i has reached the total number of road users m in the traffic scene. If not, road user index i is incremented by one in step S308, before looping back to step S303, so as to calculate the belief states and occupancy probability distributions for time step t+1 for all road users in the traffic scene. On the other hand, if the road user index i has reached the number m, the next step will be step S309, in which it will be checked whether time index t has reached timestep T−1. If not, time index t will be incremented by one in step S310, before looping back to step S301, so as to calculate the belief states and occupancy probability distributions for the next timestep for all road users in the traffic scene. On the other hand, if time index t has reached timestep T−1, the belief states and occupancy probability distributions for all timesteps from t=1 to t=T will have been calculated and the prediction algorithm ends.
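Steps S300 through S310 can be condensed into the following Python sketch. It is a simplified illustration, not the patented implementation: the cost-based action-probability computation of steps S303-S304 is abstracted into a `policy` callback, the priority sort of step S301 is omitted, and states are reduced to (x, y, z) tuples.

```python
# Minimal sketch of the prediction loop: propagate each road user's belief state
# forward in time through deterministic transitions Tr(s, a), matching the binary
# indicator P(s'|s, a) of paragraph (39). `policy(i, s, others)` stands in for the
# cost-based action probabilities of steps S303-S304.

def predict(initial_states, policy, transition, horizon):
    """beliefs[t][i] maps each possible state of road user i to its probability at t."""
    beliefs = [{i: {s: 1.0} for i, s in initial_states.items()}]  # S300: prob 1 on current state
    for t in range(horizon):                                       # S309/S310: time loop
        step = {}
        for i, belief in beliefs[-1].items():                      # S302/S307/S308: user loop
            new_belief = {}
            for s, p_s in belief.items():
                for a, p_a in policy(i, s, beliefs[-1]).items():   # S303/S304: action probs
                    s2 = transition(s, a)                          # S305: deterministic Tr(s, a)
                    new_belief[s2] = new_belief.get(s2, 0.0) + p_s * p_a
            step[i] = new_belief
        beliefs.append(step)
    return beliefs

def occupancy(belief):
    """S306: marginalize a belief over speed to get a position distribution."""
    occ = {}
    for (x, y, z), p in belief.items():
        occ[(x, y)] = occ.get((x, y), 0.0) + p
    return occ

# One road user starting in cell (0, 0) at speed 1; at each step it keeps speed or
# accelerates with equal probability, moving 1 or 2 cells forward.
policy = lambda i, s, others: {"keep": 0.5, "accelerate": 0.5}
transition = lambda s, a: (s[0] + (1 if a == "keep" else 2), s[1], s[2])
beliefs = predict({0: (0, 0, 1)}, policy, transition, horizon=2)
occ = occupancy(beliefs[2][0])  # position distribution after two timesteps
```

After two timesteps the single user's belief spreads over cells 2, 3 and 4 with probabilities 0.25, 0.5 and 0.25, which is exactly the occupancy distribution a collision-risk check would consume.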
(44) The time complexity of this algorithm grows linearly with respect to the time horizon T and with the number of available actions for each road user, and quadratically with respect to the number of road users and available states for each road user.
(45) Finally, the data processor 103 also comprises a collision risk evaluation module 230 adapted to process the future traffic scene prediction output by scene prediction module 223 in order to determine a risk of collision for road vehicle 1 and, if this risk of collision exceeds a predetermined threshold, output a driver warning signal through warning signal output device 104 and/or execute an avoidance action through driving command output device 105.
(46) Those skilled in the art will recognize that the present invention may be manifested in a variety of forms other than the specific embodiment described and contemplated herein. Accordingly, departure in form and detail may be made without departing from the scope of the present invention as described in the appended claims.