Deep User Modeling by Behavior
20210231449 · 2021-07-29
Inventors
Cpc classification
G06F17/16
PHYSICS
G06F18/213
PHYSICS
International classification
G06F17/16
PHYSICS
Abstract
A system, method and non-transitory computer-readable medium are provided for deep user modeling of user behavior. According to the deep user modeling, user behavior vectors that represent historical user behaviors of a user are determined. Based on a concatenation of the user behavior vectors, a variable-length user behavior matrix is determined. The variable-length user behavior matrix is converted into a fixed-length embedding vector via a long short term memory network, and the fixed-length embedding vector is outputted to the user as a predicted target behavior.
Claims
1. A method for performing deep user modeling, comprising: determining user behavior vectors that represent historical user behaviors of a user; determining a variable-length user behavior matrix based on a concatenation of the user behavior vectors; converting the variable-length user behavior matrix into a fixed-length embedding vector via a long short term memory network; and outputting the fixed-length embedding vector to the user as a predicted target behavior.
2. The method according to claim 1, further comprising: updating the variable-length user behavior matrix based on the predicted target behavior.
3. The method according to claim 1, further comprising: guiding the user to a predicted destination in a vehicle based on the predicted target behavior.
4. The method according to claim 1, wherein the fixed-length embedding vector represents a user profile.
5. The method according to claim 1, further comprising: determining an error between the predicted target behavior and an actual user behavior.
6. The method according to claim 5, further comprising: updating the user behavior vectors based on the error.
7. A method for modeling behavior of a user, comprising: receiving user characteristics data of a user; transforming the user characteristics data into user behavior data based on an attention based framework; transforming the user behavior data into a predicted target of user behavior based on a long short term memory processing of the user behavior data; and outputting the predicted target to a mobile device or vehicle of the user.
8. The method according to claim 7, further comprising: determining an error between the predicted target and an actual user behavior.
9. The method according to claim 8, further comprising: updating the user behavior data based on the error.
10. A non-transitory computer-readable medium storing a program that, when executed by a processor, causes the processor to perform a method comprising: determining user behavior vectors that represent historical user behaviors of a user; determining a variable-length user behavior matrix based on a concatenation of the user behavior vectors; converting the variable-length user behavior matrix into a fixed-length embedding vector via a long short term memory network; and outputting the fixed-length embedding vector to the user as a predicted target behavior.
11. The non-transitory computer-readable medium according to claim 10, further comprising: updating the variable-length user behavior matrix based on the predicted target behavior.
12. The non-transitory computer-readable medium according to claim 10, further comprising: guiding the user to a predicted destination in a vehicle based on the predicted target behavior.
13. The non-transitory computer-readable medium according to claim 10, wherein the fixed-length embedding vector represents a user profile.
14. The non-transitory computer-readable medium according to claim 10, further comprising: determining an error between the predicted target behavior and an actual user behavior.
15. The non-transitory computer-readable medium according to claim 14, further comprising: updating the user behavior vectors based on the error.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0020]
[0021]
[0022]
[0023]
[0024]
[0025]
[0026]
DETAILED DESCRIPTION OF THE DRAWINGS
[0027]
[0028]
[0029] According to the proposed algorithm, user behaviors are input, and the output is a prediction of the likelihood of a target behavior occurring together with a user profile inference. The algorithm includes semantic modeling, in which objects (e.g., user interaction I, content O, and context C) are transformed into a semantic space. A transform is performed to provide a similarity measure between historical behaviors and the target behavior. The possible behaviors are ranked, and the most likely behavior, i.e., the one having the highest similarity to the historical behaviors, is selected as the target behavior. According to the algorithm, the user modeling is based on historical behavior learning, and an evaluation is performed using an N-best match (exact match: 1-best). The algorithm according to the present invention provides rich semantic modeling using discriminative training with a small similarity model and an online learning capability.
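By way of illustration only, the ranking step described above may be sketched as follows. The sketch assumes that behaviors are already embedded as vectors, that the history is summarized by its mean vector, and that cosine similarity serves as the similarity measure; none of these choices is specified in the disclosure, and the names `cosine`, `rank_candidates`, and the sample data are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_candidates(history, candidates, n_best=1):
    """Return the names of the n_best candidates, most similar first."""
    # Assumption: the history is summarized by the mean of its vectors.
    dim = len(history[0])
    summary = [sum(vec[d] for vec in history) / len(history) for d in range(dim)]
    scored = sorted(candidates.items(),
                    key=lambda item: cosine(summary, item[1]),
                    reverse=True)
    return [name for name, _ in scored[:n_best]]

# Hypothetical two-dimensional behavior embeddings.
history = [[1.0, 0.0], [0.9, 0.1]]
candidates = {"office": [1.0, 0.05], "gym": [0.0, 1.0], "cafe": [0.6, 0.6]}

top1 = rank_candidates(history, candidates)             # 1-best (exact match)
top2 = rank_candidates(history, candidates, n_best=2)   # 2-best
```

With `n_best=1` the function returns the single selected target behavior; a larger `n_best` corresponds to the N-best evaluation.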
[0030] We introduce a transfer learning method to leverage previous learnings from a pre-trained model and avoid starting from scratch for the user profile learning. The pre-trained model is based on a behavior learning model that is supervised and trained based on the loss defined by a prediction task, e.g., destination recommendation. User behavior is defined as taking a certain action on certain content in a given context. All user interactions I, content O, and contexts C are modeled to construct the feature modeling layer consisting of the raw input. In addition to the final prediction result, the embeddings of the objects are trained to yield the following matrices:
$$E(\{I\}) = [[I_{1,1}, I_{1,2}, \ldots, I_{1,H}], \ldots, [I_{Q,1}, I_{Q,2}, \ldots, I_{Q,H}]]$$
$$E(\{O\}) = [[O_{1,1}, O_{1,2}, \ldots, O_{1,H}], \ldots, [O_{K,1}, O_{K,2}, \ldots, O_{K,H}]]$$
$$E(\{C\}) = [[C_{1,1}, C_{1,2}, \ldots, C_{1,H}], \ldots, [C_{P,1}, C_{P,2}, \ldots, C_{P,H}]]$$
[0031] $$r = \operatorname{concatenate}_{\text{axis}=1}\big(E(I_q), E(O_k), E(C_p)\big) \times w + b$$
[0032] where H is the pre-defined feature size of the embedding vector; Q, K, and P are the sizes of user interaction, content, and context, respectively; w and b are pre-trained parameters; and r represents one behavior record based on user interaction $I_q$, content $O_k$, and context $C_p$.
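For illustration, the computation of one behavior record r may be sketched as below. The embedding values, the output width D = 2, and the function name `behavior_record` are hypothetical; the disclosure does not state the shape of w, so the sketch assumes w has one row per concatenated feature (3H rows) and b has D entries.

```python
def behavior_record(e_i, e_o, e_c, w, b):
    """Compute r = concatenate(e_i, e_o, e_c) x w + b for one behavior."""
    x = e_i + e_o + e_c  # list concatenation: a row vector of length 3H
    if len(w) != len(x):
        raise ValueError("w needs one row per concatenated feature")
    # Row vector times matrix: r[j] = sum_i x[i] * w[i][j] + b[j]
    return [sum(x[i] * w[i][j] for i in range(len(x))) + b[j]
            for j in range(len(b))]

# Hypothetical pre-trained embedding tables with H = 2.
E_I = [[1.0, 0.0], [0.0, 1.0]]   # Q = 2 user interactions
E_O = [[0.5, 0.5]]               # K = 1 content item
E_C = [[2.0, 0.0], [0.0, 2.0]]   # P = 2 contexts

# Hypothetical pre-trained parameters: w is (3H x D) with D = 2.
w = [[1.0, 0.0], [0.0, 1.0],
     [1.0, 0.0], [0.0, 1.0],
     [1.0, 0.0], [0.0, 1.0]]
b = [0.1, -0.1]

# One record: interaction q = 0, content k = 0, context p = 1.
r = behavior_record(E_I[0], E_O[0], E_C[1], w, b)
```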
[0033] In practice, the pre-trained model helps transfer previously learned knowledge and greatly decreases the computation time. The training can be done offline, and the learned embeddings can then be deployed as features to be fed into the proposed user profile learning framework.
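As a non-limiting sketch of how such features might be consumed, the following code runs a standard long short term memory cell over a variable-length sequence of behavior records and returns the final hidden state as a fixed-length embedding, as summarized in the abstract. The weights are randomly initialized and untrained, and the function names, the sizes (3 inputs, 4 hidden units), and the choice of the last hidden state as the embedding are assumptions made for illustration rather than details taken from the disclosure.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_lstm(input_size, hidden_size, seed=0):
    """Randomly initialized (untrained) weights and zero biases for four gates."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    # Each gate sees the concatenation [x_t, h_{t-1}].
    return {gate: (matrix(hidden_size, input_size + hidden_size),
                   [0.0] * hidden_size)
            for gate in ("input", "forget", "cell", "output")}

def affine(weights, bias, vec):
    """Matrix-vector product plus bias."""
    return [sum(w * v for w, v in zip(row, vec)) + b_j
            for row, b_j in zip(weights, bias)]

def lstm_embed(params, records):
    """Run the LSTM over the records and return the final hidden state."""
    hidden_size = len(params["input"][1])
    h = [0.0] * hidden_size   # hidden state
    c = [0.0] * hidden_size   # cell state
    for x in records:
        z = x + h             # concatenate current record and previous hidden state
        i = [sigmoid(v) for v in affine(*params["input"], z)]
        f = [sigmoid(v) for v in affine(*params["forget"], z)]
        g = [math.tanh(v) for v in affine(*params["cell"], z)]
        o = [sigmoid(v) for v in affine(*params["output"], z)]
        c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c, i, g)]
        h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h

params = make_lstm(input_size=3, hidden_size=4)
short_history = [[0.1, 0.2, 0.3], [0.0, 0.5, -0.2]]
long_history = short_history + [[0.3, -0.1, 0.4], [0.2, 0.2, 0.2], [-0.4, 0.1, 0.0]]

u_short = lstm_embed(params, short_history)   # 2 records -> 4 values
u_long = lstm_embed(params, long_history)     # 5 records -> 4 values
```

Both histories yield an embedding of the same length regardless of how many records they contain, which is the property the conversion step relies on.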
[0034]
[0035] As illustrated in
[0036] Because a user's behavior might drift over time due to either a non-recurrent event such as a vacation or a periodic event such as weekday/weekend routines, we propose a recursive representation of the user embedding that considers the delay of the past behaviors and the observed current behaviors. Let $U_t$ be the user embedding calculated based on user historical behaviors R.sub.t:t.sub.
$$U^{*}_{t+\Delta t} = \alpha \, U^{*}_{t} + (1-\alpha) \, U_{t+\Delta t}$$
[0037] where $U^{*}_{t}$ is the prediction value and $U_{t+\Delta t}$ is the observation value.
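The recursive update above may be illustrated with a short sketch. The function name `update_user_embedding`, the sample vectors, and the restriction of α to the interval [0, 1] are assumptions made for illustration; the disclosure does not state a range for α.

```python
def update_user_embedding(prev_estimate, observation, alpha):
    """Blend the previous estimate U*_t with the new observation U_{t+dt}."""
    if not 0.0 <= alpha <= 1.0:
        # Assumption: alpha acts as a weighting factor between 0 and 1.
        raise ValueError("alpha is assumed to lie in [0, 1]")
    return [alpha * p + (1.0 - alpha) * o
            for p, o in zip(prev_estimate, observation)]

# Hypothetical embeddings: the user's observed behavior shifts to a new pattern.
estimate = [1.0, 0.0]
for observed in ([0.0, 1.0], [0.0, 1.0]):
    estimate = update_user_embedding(estimate, observed, alpha=0.5)
```

A larger α keeps the estimate closer to the past value, while a smaller α lets it follow the newly observed behavior more quickly.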
[0038] In an experiment, we explored the deployment of the proposed model on a trip pattern prediction task that predicts which location a user will visit at a certain time given his/her trip history. The dataset consists of user location tracking data, including driving. Raw features of the experiment take the form <user ID, location_gps_grid_ID, timestamp>, covering 100 users and 1578 locations, obtained by segmenting the map into a 200 m×200 m grid, over a 6-month period. For the task, we assume that the user interaction for user u is the following:
[0039] $I_u = \{(\text{visit location } i_0 \text{ at time } t_0), \ldots, (\text{visit location } i_T \text{ at time } t_T)\}$, where each visit contains both the location i and the timestamp t. In the training set, we use the first k visits of $I_u$ to predict the (k+1)-th visit; in the test set, we use the first n−1 visits to predict the last one. To measure performance, we applied top-1 (1-best) matching accuracy, which is widely used in recommendation systems. Parameter count and response time were also reported to indicate scalability. We also evaluated our model in the online learning case for distributed training purposes.
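The data split and the accuracy measure described above may be sketched as follows. The function names and the sample visits are hypothetical, and the sketch assumes that the final visit of each user is held out of the training pairs so that it can serve as the test label; the disclosure does not state this explicitly.

```python
def training_pairs(visits):
    """Yield (first k visits, (k+1)-th visit) for k = 1 .. n - 2."""
    # Assumption: the last visit is reserved for testing and never used as a
    # training label.
    for k in range(1, len(visits) - 1):
        yield visits[:k], visits[k]

def held_out_pair(visits):
    """Return (first n - 1 visits, last visit) for evaluation."""
    return visits[:-1], visits[-1]

def top1_accuracy(predicted, actual):
    """Fraction of users whose single best prediction matches the actual visit."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)

# Hypothetical visit history of one user: (location grid ID, hour of day).
visits = [(17, 8), (42, 9), (99, 18), (17, 20)]

pairs = list(training_pairs(visits))
context, target = held_out_pair(visits)

# Hypothetical predictions for four users against their actual last locations.
accuracy = top1_accuracy([17, 42, 5, 99], [17, 42, 7, 99])
```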
[0040] We benchmarked the model performance based on different training scenarios (online or offline) and on whether transfer learning is enabled. The prediction accuracy and response time are both evaluated on the same test set across all indexed models. The results are shown in the following Table 1.
TABLE 1
| Index | Online Learning | Transfer Learning | Training Data | Model | Prediction accuracy (Top 1 Matching) | Trainable Parameters | Response time (seconds/100 users) |
|---|---|---|---|---|---|---|---|
| 1 | N | N | 6-month data | Baseline | 0.81 | 324,590 | 2.445 |
| 2 | N | Y | 6-month data | Pre-trained Baseline + LSTM | 0.83 | 456,174 | 0.309 |
| 3 | Y | Y | First 5-month data for offline training, last 1-month data for online training | Pre-trained Baseline + LSTM | 0.85 | 456,174 | 0.309 |
| 4 | N | Y | Last 1-month data | Pre-trained Baseline + LSTM | 0.76 | 456,174 | 0.309 |
[0041] As illustrated in Table 1, when both online learning and transfer learning are enabled (index 3), our proposed algorithm improves the top-1 prediction accuracy from 0.81 to 0.85 relative to the baseline (index 1) and decreases the response time from 2.445 to 0.309 seconds per 100 users.
[0042]
where the data points of user i, j, and k are shown in
[0043]
[0044]
[0045]
[0046] In another exemplary embodiment of the present invention, a non-transitory computer-readable medium is encoded with a computer program that performs the above-described method. Common forms of non-transitory computer-readable media include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, or any other medium from which a computer can read.
[0047] The present invention provides a number of significant advantages over conventional systems and methods. In particular, the present invention provides a unified algorithmic framework for user modeling based on user behavior that can be extended to serve as a feature for different services. The user model can be flexibly trained for different tasks driven by user behavior, e.g., a predicted destination driven by mobility behavior, a recommended feature driven by app usage behavior, etc. The semantics are enriched for users, which allows computation among users, e.g., user segmentation, user similarity based recommendation, and predictive modeling.
[0048] Also, the system and method according to the present invention have low complexity, which improves online service computation due to compact user modeling, and improve the user experience by leveraging personal context to achieve better prediction performance. The present invention also provides a solution to data sparsity. Additionally, the present invention enables transfer learning and online learning. The pre-trained model helps transfer previously learned knowledge and greatly decreases the computation time. Meanwhile, online learning enables distributed training, which addresses the computational scalability needed for large-scale datasets in real-world applications.
[0049] The foregoing disclosure has been set forth merely to illustrate the invention and is not intended to be limiting. Since modifications of the disclosed embodiments incorporating the spirit and substance of the invention may occur to persons skilled in the art, the invention should be construed to include everything within the scope of the appended claims and equivalents thereof.