Predicted Destination by User Behavior Learning

Abstract

A system, method and non-transitory computer-readable medium for predicting a trip destination of a user based on user behavior learning are provided. Historical behaviors and a target behavior of the user are received from a feature processing layer, and the received historical behaviors and the target behavior are embedded with features including a time and a location to produce a context modeling layer. A user modeling layer is produced by embedding the context modeling layer. A trip destination is predicted based on historical trip data and target trip data in the user modeling layer.

Claims

1. A method for predicting a trip destination of a user comprising: receiving historical behaviors and a target behavior of the user from a feature processing layer; embedding the received historical behaviors and the target behavior with features including a time and a location to produce a context modeling layer; embedding the context modeling layer to produce a user modeling layer; and predicting the trip destination based on historical trip data and target trip data in the user modeling layer.

2. The method according to claim 1, further comprising: comparing an actual destination of the user to the predicted trip destination to determine an error.

3. The method according to claim 2, further comprising: updating the historical behaviors based on the actual destination.

4. The method according to claim 3, further comprising: repeating the method with the updated historical behaviors.

5. The method according to claim 1, further comprising: outputting the predicted trip destination to a vehicle of the user.

6. The method according to claim 1, wherein the time includes a day of the week and a time of day.

7. The method according to claim 1, further comprising providing a time-to-leave that reminds the user to leave at a particular time due to real time traffic.

8. The method according to claim 1, further comprising sending a reminder to the user to precondition a vehicle of the user before a trip to the trip destination.

9. The method according to claim 1, further comprising controlling a vehicle to automatically precondition the vehicle by heating or cooling an interior of the vehicle before a trip to the trip destination.

10. A non-transitory computer-readable medium storing a program that, when executed by a processor, causes the processor to perform a method comprising: receiving historical behaviors and a target behavior of the user from a feature processing layer; embedding the received historical behaviors and the target behavior with features including a time and a location to produce a context modeling layer; embedding the context modeling layer to produce a user modeling layer; and predicting the trip destination based on historical trip data and target trip data in the user modeling layer.

11. The non-transitory computer-readable medium according to claim 10, wherein the program causes the processor to compare an actual destination of the user to the predicted trip destination to determine an error.

12. The non-transitory computer-readable medium according to claim 10, wherein the program causes the processor to update the historical behaviors based on the actual destination.

13. The non-transitory computer-readable medium according to claim 12, wherein the program causes the processor to repeat the method with the updated historical behaviors.

14. The non-transitory computer-readable medium according to claim 10, wherein the program causes the processor to output the predicted trip destination to a vehicle of the user.

15. The non-transitory computer-readable medium according to claim 10, wherein the time includes a day of the week and a time of day.

16. The non-transitory computer-readable medium according to claim 10, wherein the program causes the processor to provide a time-to-leave that reminds the user to leave at a particular time due to real time traffic.

17. The non-transitory computer-readable medium according to claim 10, wherein the program causes the processor to send a reminder to the user to precondition a vehicle of the user by heating or cooling an interior of the vehicle before a trip to the trip destination.

18. The non-transitory computer-readable medium according to claim 10, wherein the program causes the processor to control a vehicle to automatically precondition the vehicle by heating or cooling an interior of the vehicle before a trip to the trip destination.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] FIG. 1 illustrates a schematic block diagram of an exemplary embodiment of a system according to the present invention.

[0012] FIG. 2 illustrates an exemplary embodiment of a method according to the present invention.

[0013] FIG. 3 illustrates an exemplary embodiment of user behavior learning for a predicted destination based on historical data.

[0014] FIG. 4 illustrates a high level representation of the algorithmic framework according to the present invention.

[0015] FIG. 5 illustrates an exemplary embodiment of behavior embedding for a trip according to the present invention.

DETAILED DESCRIPTION OF THE DRAWINGS

[0016] An objective of the present invention is to provide a predicted destination to a user for any given time and any departure location whenever it is requested by the user. According to exemplary embodiments of the present invention, the algorithmic framework can be used to provide a time-to-leave that reminds a user to leave at the right time due to real time traffic change, smart preconditioning that reminds the user to precondition a vehicle or automatically precondition the vehicle before the trip (e.g., heating/cooling), and smart search and navigation in which one or more destinations are recommend to the user when the user starts to drive based on the user's context.

[0017] At a high level, the system processes the data into features organized by a feature processing layer, a context modeling layer, a user modeling layer, and similarity measurement layers. The system is trainable based on the loss (error) defined by a prediction task, e.g., a predicted destination.

[0018] FIG. 1 illustrates a schematic block diagram of a system according to an exemplary embodiment of the present invention. The system may include, for example, a vehicle 100, a modeling server 110, a mobile device 120, and cloud storage 130. Each of these devices has its own processor and memory and a communication interface(s), wherein the processors are specifically programmed to perform the functions described herein. Telemetry data and the like may be received from the vehicle 100 and may be received from the mobile device 120. The mobile device 120 may be a smart phone, tablet computer or the like. Communication between the modeling server and the vehicle/mobile device may occur via cellular network, WiFi, Bluetooth, or the like. Data gathered from the vehicle 100 and the mobile device 120 may be transmitted to the modeling server 110 or transmitted directly to cloud storage 130.

[0019] FIG. 2 illustrates an exemplary embodiment of a method according to the present invention. In step S201, historical and target behaviors of a user are received from a feature processing layer. In step S202, context embedding for time (e.g., day of week, hour of day) and location from the received historical and target trip data is performed to produce a context modeling layer. In step S203, user behavior embedding is performed on the received context modeling layer to produce a user modeling layer. In step S204, a trip destination is predicted based on user behavior in the user modeling layer, which produces a similarity measurement layer including a probability of the predicted destination being the trip destination. In step S205, the output is compared to the ground truth of the actual destination reached by the user to produce a loss (error), which can be used to minimize the loss and improve the next prediction. During an initial learning phase, the processes may be repeated any desired number of times in order to produce an output with a desired level of accuracy.

[0020] FIG. 3 illustrates an exemplary embodiment of a user behavior learning process according to the present invention. Based on the trip data (e.g., departure times, routes, and destinations) obtained in step 301, destinations of the user are learned in step 302 (e.g., home, work, school, restaurant) and a score is assigned to each destination based on frequency of use or the like. In step 303, the user's routes are learned (e.g., personal, business, preferred routes) based on the learned destinations. Then, a predicted destination of the user can be determined in step 304 based on the learned routes, learned destination, and trip data of the user. The learned routes and destinations for the user are stored in a memory that is accessed by the system

[0021] As illustrated in FIG. 4, a feature processing layer 401 including historical behaviors and a target behavior are input to a feature embedding process 402 that includes day of the week, hour of day, and location. From this processing, a context modeling layer 403 is produced that is used for behavior embedding 404 to produce a user modeling layer 405. A similarity measurement layer 406 is produced based on the user modeling layer and the historical trips and target trip information. The output 407 of this process is compared to the ground truth 408 or actual destination to determine a loss (error) 409 between the target and actual destination.

[0022] The feature processing layer is further described below. A user trip is defined as visiting a certain location at a given temporal context. All visited locations L and temporal context C are modeled to construct the feature processing layer consisting of the raw input. Here, we further decomposed temporal context C into two features: day of week D and hour of day H.

[0023] In order to encode spatial and temporal information, representation learning is applied to generate an “embedding” vector for different objects. An embedding is a mapping of a discrete categorical variable to a vector of continuous numbers. Therefore, the semantic meaning can be enriched for objects such as locations, hour of day, and day of week. Normally, embedding can be trained in a data-driven framework to preserve the semantic meaning of objects. Here we embed features of location set L, day of week set D, and hour of day set H as follows:

[0024] E({L})=[[L.sub.1,1, L.sub.1,2, . . . , L.sub.1,S], . . . , [L.sub.Q,1, L.sub.Q,2, . . . , L.sub.Q,S]]

[0025] E({D})=[[D.sub.1,1, D.sub.1,2, . . . , D.sub.1,S], . . . , [D.sub.7,1, D.sub.7,2, . . . , D.sub.7,S]]

[0026] E({H})=[[H.sub.1,1, H.sub.1,2, . . . , H.sub.1,S], . . . , [H.sub.24,1, H.sub.24,2, . . . , H.sub.24,S]]

[0027] where S is the pre-defined feature size of embedding vector and Q is the size of locations.

[0028] Therefore, we can encode any trip (l, d, h) in which the user visited the l-th location on the d-th day of the week at the h-th hour of the day.

[0029] E(l, d, h)=(lookup.sub.lE ({L}), lookup.sub.dE ({D}), lookup.sub.hE({H}))

[0030] where lookup.sub.i(E) is an operation that extracts the i-th row from the embedding matrix E, and the extracted vector is of (S, 1)-size.

[0031] In the context modeling layer, our goal is, given processed contextual information E(l, d, h) of trip r, to construct its semantic embedding. We let E.sub.i=lookup.sub.i(E) represent the lookup i-th row operation for embedding matrix E, and l, d, h be the lookup index of location, day of week, and hour of day, respectively. Here we introduced the proposed embedding modeling as shown in FIG. 5. In FIG. 5, element 501 represents an (S, 3)-size matrix of trip data (l, d, h) in which the user visited the l-th location on the d-th day of the week at the h-th hour of the day. Element 502 represents a (3, 1)-size vector for behavior embedding, and element 503 represents the behavior embedding for trip r.

[0032] The detailed calculation of the behavior embedding for trip r is as follows:

r=(concatenate.sub.axis=1(E(L.sub.l),E(D.sub.d),E(H.sub.h))×w+b)

[0033] where concatenate.sub.axis=1( ) is an operation that concatenates the three (S, 1)-size vectors along the second axis to generate an (S, 3)-size matrix, and w and b are linear transformation parameters that need to be trained.

[0034] The user modeling layer is further described below. Given the user's trip record r through context modeling, we have user trip sequence R.sub.seq that consists of a sequence of user trip data ordered by timestamps. Our target is to learn the temporal pattern and regularity from the context data embedded in the feature, e.g. day of week and time of day, rather sequence order by time. Thus, we are able to precompute the user behavior offline, and reuse it for prediction purposes, which reduces the memory and computation cost.

[0035] Assuming the user has T number of trips, we concatenate all r along axis t to generate an (S, T)-size matrix, i.e., R.sub.seq=(r.sub.1r.sub.2 . . . r.sub.T).sub.t. R.sub.seq requires a sequential data collection due to the nature of the permutation variance. For example, (r.sub.1r.sub.2 . . . ).sub.t≠(r.sub.2r.sub.1 . . . ).sub.t as we swap any two position of behavior records.

[0036] Unlike conventional sequential modeling methods, we give all possible permutations to the network such that it learns a pattern to be permutation invariant as f (r.sub.1r.sub.2 . . . r.sub.T)=f(pi(r.sub.1r.sub.2 . . . r.sub.T)) for any permutation pi, which reduces computation cost. In practice, such learning can be done through the proposed attention-based network and completely offline. Therefore, the user trip history R.sub.hist can be generated as a set of given all permutations using elements of R.sub.seq, i.e., R.sub.hist={pi (R.sub.seq)} where pi represents all possible permutations. In practice, this operation can be simplified through shuffling.

[0037] Meanwhile, we modeled the target trip in a similar way as the trip record r. The contextual information of the target trip went through all the aforementioned layers to generate a target trip r.sub.tgt.

r.sub.tgt=(concatenate.sub.axis=1(E(L.sub.l),E(D.sub.d),E(H.sub.h))×w+b)

[0038] Given the user trip history R.sub.hist that is (S, T)-size, we can apply a self-attention-mechanism based network, such as the one developed by Google in 2017 (https://arxiv.org/abs/1706.03762) to model the user embedding U. Self-attention is an attention mechanism relating different positions of a single sequence in order to compute a representation of the sequence. The calculation of user embedding U is simplified as follows.

[00001] $f (Q, K_{i}) = Q^{T} K_{i}$ $a_{i} = softmax (f (Q, K_{i})) = \frac{\exp (f (Q, K_{i}))}{{.Math.}_{j} \exp (f (Q, K_{j}))}$ $U = {.Math.}_{i} a_{i} V_{i}$

[0039] where Q, K, V represents the query, key, and value, respectively, which are concepts used in the attention mechanism. Here Q=K=V=R.sub.hist. After calculation, the output is an (S, T)-size matrix that represents personal embedding based on the user's trip history.

[0040] The similarity measurement layer is further described below. Given the model user embedding U and target trip embedding r.sub.tgt, we can use two simple one-dense-layer neural networks to map two embeddings into common semantic space and then compute the similarity sim(U, r.sub.tgt). The output of each neural network is the following:

Z.sub.u=ReLU(U×w+b)

Z.sub.r.sub.tgt=ReLU(r.sub.tgt×w+b)

sim(U,r.sub.tgt)=dist(Z.sub.u,Z.sub.r.sub.tgt)

[0041] where ReLU is an activation function defined as the positive part of its argument relu(x)=max(0, x.sup.+), dist( )represents the distance measurement such as Euclidean distance, and w and b are the parameters that need to be trained. Therefore, we predict the target trip {tilde over (r)}.sub.tgt through choosing the highest similarity score among all candidates.

[0042] We explored the deployment of the proposed model on trip pattern prediction task that predicts which location user will visit at certain time given his/her trip history. The dataset includes user location tracking including driving. Raw features include, for example, the following: <user ID, location_gps_grid_ID, timestamp), 100 users, 778 locations through a 200 m×200 m grid by map segmentation, over a 20-week period.

[0043] We assume we have user trip records for week w that include the following:

[0044] I.sub.w=

{(visit location i.sub.0 at time t.sub.0), . . . , (visit location i.sub.T at time t.sub.T)}, t ∈w, where we aim to predict I.sub.w+i. We use the first 19 weeks as a set of data for training, where the data contains both location i and timestamp t information for the visit, and use the last week as the test set.

[0045] We applied I-best matching accuracy that is widely used in recommendation systems to measure the performance. Meanwhile, the parameter number and prediction time were reported to indicate the scalability. The following table shows a performance comparison with aforementioned prior art model “ATRank” and different kernel of proposed layers regarding the prediction accuracy and prediction time.

TABLE-US-00001 Prediction Trainable Processing Processing accuracy Parameters time Time Prediction (Top 1 (compared to Hardware (per (per Index Model target size Matching) baseline model) configuration week) target) 1 ATRank * 100 0.59 239,200 CPU: Intel 13.22 sec 6.7 ms (Baseline) users, Xeon E5-2690 2 ATRank * + 1 week, 0.70 221,156 GPU: Tesla 1.08 sec 6.7 ms Proposed 2,468 (−8%) V100-PCIE- Contrext targets 16 GB; Modeling RAM: 112 GB; Layer System: Linux 3 Proposed 0.72 158,256 Ubuntu 1.79 sec .sup. 0.9 sec Model (+22%) (−33%) (x7.4) (x7.4) * represents the model requiring sequential input that causes a heavy computational cost (such as collecting history data [t − Δt, t] based on the target timestamp t).

[0046] The results show that the algorithmic framework and model according to the present invention achieve better prediction performance with outstanding operation/computation efficiency.

[0047] In another exemplary embodiment of the present invention, a non-transitory computer-readable medium is encoded with a computer program that performs the above-described method. Common forms of non-transitory computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, or any other medium from which a computer can read.

[0048] The present invention provides a number of significant advantages over conventional systems and methods. As described above, the present invention provides a context-aware learning framework for predicted destination. We are able to leverage context information to model user trip patterns. Not only the final prediction retains an improved performance, but also the intermedia output such as object embedding and user embedding can be the critical features for other downstream tasks, e.g., segmentation.

[0049] The algorithmic framework according to the present invention has low complexity. Instead of jointly modeling history trips and a target trip, we are able to separate them, pre-calculate and store the user embedding offline, and estimate the target trip online only. This dramatically decreases the complexity as online target trip calculation is much cheaper than user embedding that has to go through the all historical trips in the database.

[0050] The present invention also provides rich semantic modeling. By using embeddings, we expand the capabilities of previous Natural Language Processing (NLP) methods by creating contextual representations based on the surrounding context which leads to richer semantic models. Further, the algorithm according to the present invention outperforms conventional models in both higher prediction performance and much lower computation cost.

[0051] The foregoing disclosure has been set forth merely to illustrate the invention and is not intended to be limiting. Since modifications of the disclosed embodiments incorporating the spirit and substance of the invention may occur to persons skilled in the art, the invention should be construed to include everything within the scope of the appended claims and equivalents thereof.

Predicted Destination by User Behavior Learning

Inventors

Cpc classification

Classification Explorer

G01C21/3667

PHYSICS

Classification Explorer

G05B13/0265

PHYSICS

Classification Explorer

G06N20/00

PHYSICS

Classification Explorer

G06N5/04

PHYSICS

Classification Explorer

G06N3/045

PHYSICS

Classification Explorer

G01C21/3484

PHYSICS

Classification Explorer

B60H1/00971

PERFORMING OPERATIONS; TRANSPORTING

Classification Explorer

B60H1/0073

PERFORMING OPERATIONS; TRANSPORTING

Classification Explorer

B60H1/00778

PERFORMING OPERATIONS; TRANSPORTING

Classification Explorer

G01C21/32

PHYSICS

Classification Explorer

B60H1/00771

PERFORMING OPERATIONS; TRANSPORTING

Classification Explorer

G01C21/3492

PHYSICS

International classification

Classification Explorer

G01C21/34

PHYSICS

Classification Explorer

B60H1/00

PERFORMING OPERATIONS; TRANSPORTING

Classification Explorer

G01C21/32

PHYSICS

Classification Explorer

G01C21/36

PHYSICS

Classification Explorer

G05B13/02

PHYSICS

Classification Explorer

G06N5/04

PHYSICS

Abstract

Claims

Description