FORECASTING APPARATUS, FORECASTING METHOD, AND STORAGE MEDIUM
20230229980 · 2023-07-20
Inventors
- Koki KAWABATA (Osaka, JP)
- Yasuko SAKURAI (Osaka, JP)
- Takato HONDA (Osaka, JP)
- Yasushi SAKURAI (Osaka, JP)
Cpc classification
G06F17/16
PHYSICS
G06N7/00
PHYSICS
International classification
G06Q10/04
PHYSICS
Abstract
A forecasting apparatus forecasts an event after a predetermined time, based on a current window being a part of time-series data in multidimension. The forecasting apparatus includes a non-linear transformation unit including a matrix for non-linear transformation, an observation matrix, and a seasonality setting unit. The non-linear transformation unit transforms the time-series data of the current window in a part of dimensions that are related to trends and the time-series data of the current window in a part of dimensions that are related to seasonal intensity into latent first data showing the trends and latent second data showing the seasonal intensity. The observation matrix includes a first observation matrix that reproduces the first data to first estimated data of an original number of dimensions, and a second observation matrix that, by use of seasonality information that has been set in the seasonality setting unit, reproduces the second data to second estimated data of an original number of dimensions, and adds the first estimated data and the second estimated data.
Claims
1. A forecasting apparatus that forecasts an event after a predetermined time by applying estimated data reproduced from time-series data in multidimension that passes through a current window, the forecasting apparatus comprising: a storage unit that sequentially stores the time-series data in the multidimension that passes through the current window; a non-linear transformation unit that, among the time-series data in the multidimension to be outputted from the storage unit, from the time-series data in a part of dimensions that are related to trends, non-linearly transforms and outputs latent first data showing the trends, and, among the time-series data in the multidimension to be outputted from the storage unit, from the time-series data in a part of dimensions that are related to seasonal intensity, linearly transforms and outputs latent second data showing the seasonal intensity; and an observation matrix unit that includes a first observation matrix that reproduces the first data to first estimated data of an original number of dimensions, and a second observation matrix that, by use of seasonality information that has been set in a seasonality setting unit, reproduces the second data to second estimated data of an original number of dimensions, as seasonality data, and further adds output of the first observation matrix and the second observation matrix and outputs as the estimated data.
2. The forecasting apparatus according to claim 1, comprising a model parameter estimation unit, wherein the model parameter estimation unit adjusts a parameter of the non-linear transformation unit, the first observation matrix, and the second observation matrix, and a setting content of the seasonality setting unit so as to minimize a difference between the estimated data being a result of addition of the first estimated data and the second estimated data, and the time-series data of the current window.
3. The forecasting apparatus according to claim 1, wherein the non-linear transformation unit, in a case in which the multidimension is d-dimensional, receives an input and sends an output of the time-series data for k-dimension (<d) that is obtained by combining the time-series data for kz-dimension related to the trends and the time-series data for kv-dimension related to the seasonal intensity.
4. The forecasting apparatus according to claim 1, wherein the non-linear transformation unit is configured by connecting in series a two-dimensional matrix that performs linear transformation and a three-dimensional tensor matrix that performs non-linear transformation.
5. The forecasting apparatus according to claim 1, further comprising a regime update unit, wherein: the time-series data is configured by an element of a keyword, a location, and elapsed time information; and the regime update unit divides the first observation matrix and the second observation matrix into multiple regimes at least with respect to the element of the location.
6. The forecasting apparatus according to claim 5, further comprising a regime addition unit, wherein: the regime update unit compares a sum of model description cost and data encoding cost, by applying principle of minimum description length, with respect to an original regime model and a divided new regime model; and the regime addition unit, in a case in which cost of the new regime model is lower, additionally registers a parameter configuring the new regime model in a parameter set storage unit.
7. A forecasting method that forecasts an event after a predetermined time by applying estimated data reproduced from time-series data in multidimension that passes through a current window, the forecasting method comprising: sequentially storing in a storage unit the time-series data in the multidimension that passes through the current window; among the time-series data in the multidimension to be outputted from the storage unit, from the time-series data in a part of dimensions that are related to trends, non-linearly transforming and outputting latent first data showing the trends, and, among the time-series data in the multidimension to be outputted from the storage unit, from the time-series data in a part of dimensions that are related to seasonal intensity, and linearly transforming and outputting latent second data showing the seasonal intensity; and reproducing the first data to first estimated data of an original number of dimensions by a first observation matrix, and, by use of seasonality information that has been set in a seasonality setting unit, reproducing the second data to second estimated data of an original number of dimensions by a second observation matrix, as seasonality data, and further adding output of the first observation matrix and the second observation matrix and outputting as the estimated data.
8. A non-transitory computer readable storage medium storing a program that causes a computer to implement, in forecasting an event after a predetermined time by applying estimated data reproduced from time-series data in multidimension that passes through a current window: sequentially storing in a storage unit the time-series data in the multidimension that passes through the current window; among the time-series data in the multidimension to be outputted from the storage unit, from the time-series data in a part of dimensions that are related to trends, non-linearly transforming and outputting latent first data showing the trends, and, among the time-series data in the multidimension to be outputted from the storage unit, from the time-series data in a part of dimensions that are related to seasonal intensity, and linearly transforming and outputting latent second data showing the seasonal intensity; and reproducing the first data to first estimated data of an original number of dimensions by a first observation matrix, and, by use of seasonality information that has been set in a seasonality setting unit, reproducing the second data to second estimated data of an original number of dimensions by a second observation matrix, as seasonality data, and further adding output of the first observation matrix and the second observation matrix and outputting as the estimated data.
Description
BRIEF DESCRIPTION OF DRAWINGS
[0020]
[0021]
[0022]
[0023]
[0024]
[0025]
[0026]
[0027]
[0028]
[0029]
[0030]
[0031]
[0032]
[0033]
[0034]
[0035]
[0036]
DESCRIPTION OF EMBODIMENTS
[0037] A large amount of time-series data that includes information on time evolution is able to be represented as a tensor, and is more specifically shown in Mathematical Formula 1,
Given a tensor stream X up to the current time point t.sub.c, which consists of elements at d.sub.l locations for d.sub.k key-words, i.e., X={x.sub.tij}.sub.t, i, j=1.sup.t.sup.
[0042] In other words, the forecasting apparatus (hereafter referred to as CubeCast and will be described below in
[0043] In addition, as shown in Mathematical Formula 1 and
[0044] Subsequently, Table 1 shows a list of main symbols used by CubeCast.
TABLE-US-00001 TABLE 1 Symbol Definition d.sub.k, d.sub.l Number of keywords and locations t.sub.c Current time point χ Tensor stream, i.e., χ ∈ χ.sub.t.sub.
.sup.t.sup.
.sup.k.sup.
.sup.k.sup.
Non-linear dynamical system, i.e,, A ∈
.sup.k×k,
∈
.sup.k×k×k, k = k.sub.z + k.sub.v
Observation matrix set for base trends, i.e.,
= {W.sub.1, . . . , W.sub.m}
Observation matrix set for seasonality, i.e.,
= {U.sub.1, . . . , U.sub.m} p Period of seasonality S Latent seasonal components, i.e., S ∈
.sup.p×k.sup.
.sup.t.sup.
,
,
} Θ Full parameter set, ie., Θ = {S, θ.sub.1, . . . , θ.sub.n}
Regime assignment set, ie.,
= {r.sub.1, . . . , r.sub.n}
[0045] The tensor stream X is considered as a three-dimensional tensor. This is indicated by X ∈ R.sup.tc×dl×dk. Herein, tc denotes the number of time points, and dl and dk respectively denote the number of locations and the number of keywords. An element x.sub.tij corresponds to a search volume at time point t of an i-th location of a j-th keyword. The overall objective is to achieve long-term forecasting of a tensor X while adapting to the latest trends. In addition, as shown in
(l.sub.s-STEP AHEAD FORECASTING). Given: a data stream X.sup.c={x.sup.tij}.sub.t, i, j=t.sub.
[0046] In addition, as shown in Mathematical Formula 2 and
[0047] CubeCast models a latent dynamic pattern served as a foundation of a tensor stream, in order to capture all the above components.
[0048]
[0049] The program storage unit 101 stores a processing program (such as an algorithm to be described below) that CubeCast executes. It is to be noted that the algorithm, by being read out to a main memory (not shown) and executed by a processor, functions as each part 10 to 40 of the calculation unit 1. The data stream storage unit 102 stores search data, and data for a period older than the current window X.sup.c is compressed to be able to be reproduced as needed. The current window storage unit 103 updates and stores time-series data of a current window X.sup.c period for every time point tc. The parameter set storage unit 104 stores various kinds of parameter sets that construct the mathematical model for reproducing estimated data from the time-series data.
[0050] The model parameter estimation unit 10 includes a non-linear transformation unit 11 into which the time-series data of the current window X.sup.c is inputted, latent spaces 12 and 13, a seasonality setting unit 14, and a latent space 15, and also includes observation matrices 16 and 17, and an estimated tensor space 18. It is to be noted that the latent spaces 12, 13, and the estimated tensor space 18 may be memories that are sequentially recorded as the time-series data, or may be employed for illustrative purposes. In addition, calculations in the non-linear transformation unit 11 and the observation matrices 16 and 17 may be either software calculations or hardware processing.
[0051] The non-linear transformation unit 11, as described below, latently retrieves (captures) large trends included in the time-series data searched (detected) in a d-dimension with k [[(]] dimension (k<d), and is configured by a two-dimensional matrix A and a three-dimensional tensor matrix B that are connected in series, as shown in
[0052] In addition, the observation matrices 16 and 17 are set up by not only one type for a location (a region, for example) but also multiple, for example, m weight matrices to reflect singularity with respect to the location, which maintains the accuracy of the forecast information (an event) to be reproduced through a model. For example, the observation matrices W1, W2, . . . , and Wm are provided in the observation matrix 16, and the observation matrices U1, U2, . . . , and Um are corresponded to the observation matrix 16, as shown in
[0053] The regime update unit 20 causes the estimated data reproduced in the estimated tensor space 18 to correspond to the information of the current window Xc, and totals each difference as a square error, and alternately corrects, from a total result, matrices W and U of the observation matrices 16 and 17 and the matrix A and the matrix B of the tensor of the non-linear transformation unit 11, and the value of the latent seasonality S, and enables both the trends and the seasonality to be adjusted with well balance so that the total of the difference may be within a predetermined threshold value.
[0054] In addition, the regime update unit 20 applies the principle of the minimum description length (MDL: minimum description length), for example, as an automatic evaluation method for automatically detecting an optimal model set (a regime). The details will be described below. The regime addition unit 30, as shown in an image diagram in
[0055] Subsequently, each function and operation that CubeCast has will be described. CubeCast has the following three functions (a) to (c).
[0056] (a) Non-linear latent dynamics: CubeCast employs a non-linear dynamical system to capture about complex dynamics (trends) in time series,
[0057] (b) Seasonality: The non-linear dynamical system is extended to handle seasonality that evolves over time, and
[0058] (c) Co-evolving patterns in tensor streams: An adaptive model that is able to describe both temporal and locational differences in tensor streams.
[0059] (a): Latent non-linear dynamics in a single location (place): The simplest case is first considered. For example, hypothetically, there is a single dynamical pattern to which d-dimensional time series are given, such as search volumes for a single country. It is to be noted that, in a basic model, the time series have latent activities and these are assumed to determine behavior of the time series that are actually observed. [0060] zt: kz-dimensional (<d) latent activity at time point t. [0061] et: d-dimensional time-series estimated event observed at time point t.
[0062] In other words, although the actual activity et is able to be observed, the latent activity zt is an unobservable vector that describes non-linear dynamics evolving over time. As shown in
z.sub.t+1=Az.sub.t+z.sub.t {circle around (×)}z.sub.t,
e.sub.t=Wz.sub.t, [Mathematical Formula 3]
[0063] Herein, the symbol “X is written in ∘” in the second term on the right side of the first equation in Mathematical Formula 3 is an operator that indicates the outer product of two vectors. A ∈ R.sup.kz×kz and B ∈ R.sup.kz×kz×kz describe linear/non-linear dynamical activities. W ∈ R.sup.d×kz indicates observation forecasting for obtaining an estimated event et from the latent activity zt.
[0064] (b) For latent seasonal variation: Mathematical Formula 3 being the basic model is extended, and seasonal/cyclic patterns are modeled in time series. More specifically, another latent factor of seasonality that may be able to interact with linear/non-linear activities over time is defined. For example, an intensifying seasonal pattern in conjunction with latent trends is represented. For this purpose, two additional types of latent activities are assumed here. [0065] vt: latent seasonal intensity at time point t, that is, vt ∈ R.sup.kv [0066] S: latent seasonality, that is, S ∈ R.sup.p×kv
[0067] Herein, kv indicates the number of dimensions of a latent space for seasonality, and p is a seasonal period.
[0068]
[0069] Herein, the symbol ∘ in the second equation is an operator that shows an element-wise product of two vectors. A and B are respectively extended to A ∈ R.sup.k×k and B ∈ R.sup.k×k×k. Herein, k=kz+kv. The estimated vector et is obtained by the observation matrix U ∈ R.sup.d×kv for a seasonal latent activity vv and seasonality S as well as W ∈ R.sup.d×kz for latent trends zt. Once the initial states z0 and v0 are obtained, the latent state at the next time point is able to be recursively generated by use of a single common dynamical system. As a result, the latent interaction between a tendency and seasonality is able to be extracted.
[0070] (c) A complete model with multiple locations (places): Multiple different activities in terms of locations are assumed. The equation (Mathematical Formula 4) being the non-linear dynamical system is further extended to enable both time-changing and location-specific patterns to be identified. Here, it is assumed that there is a three-dimensional tensor X. Specifically, the tensor needs to be divided along the second mode, that is, the location dl, into a set of m local groups (m<dl) in order to capture a location-specific activity. That is, a dynamical pattern at the i-th location is able to be described by one of the sets of the observation matrix Wi and the observation matrix Ui. Herein, i ∈ {1, . . . , m}, a single space given by A and B in Mathematical Formula 4 is shared. Accordingly, similar time series share a similar latent non-linear factor. As shown in
[0071] E ∈ R.sup.tc×dl×dk is defined as an estimation tensor of X ∈ R.sup.tc×dl×dk. In a case in which an observation vector x.sub.ti ∈ X for the i-th location at time t is modeled by using the j-th observation matrix, an estimated vector e.sub.ti ∈ E is described as the following Mathematical Formula 5. Subsequently, as shown in Mathematical Formula 6, θ is used as a parameter set of a single non-linear dynamical system and is represented by θ={A, B, W, U}.
(Single regime parameter set). Let θ be the parameter set of a single non-linear dynamical system, namely θ={A, ,
,
}, where
and
are sets of observation matrices for m local groups, i.e.,
={W.sub.1, . . . , W.sub.m} and
={U.sub.1, . . . , U.sub.m}. [Mathematical Formula 6]
[0073] Furthermore, in a case in which regime transition between clear latent dynamics is detected, n is denoted as the proper number of regimes up to the current time point. More specifically, as shown in
(Regime assignment set). Let be a full regime assignment set for Θ, namely,
={r.sub.1, . . . , r.sub.n}, where r.sub.i={r.sub.1, . . . , r.sub.j, . . . , r.sub.d.sub.
[0074] R is a complete regime assignment set of Θ, that is, R={r1, . . . , rn}, ri={r1, . . . , rj, . . . , rd1} is a set of integers dl for the i-th regime θ.sub.i. Therefore, rj ∈ {1, . . . , mi} is a local group index to which the j-th location, for example, a country, belongs. For example, in
[0075] Table 2 is an algorithm that shows a processing procedure of algorithm 1 CubeCast (Xc, Θ, R).
TABLE-US-00002 TABLE 2 Algorithm 1 CUBECAST (χ.sup.c, Θ, ) Input: (a) Current tensor χ.sup.c (b) Full parameter set Θ (c) Regime assignment set
Output: (a) l.sub.s -steps-ahead future values ε.sup.f (b) Updated full parameter set Θ′ (c) Updated regime assignment set
′ 1: /* (I) Estimate a new regime for given data */ 2: {θ, r} ←REGIMEESTIMATION (χ.sup.c, S); // S ϵ Θ 3: /* (II) Update model set and detect current dynamics */ 4: {Θ′,
′} ←REGIMECOMPRESSION (χ.sup.c, Θ,
, θ, r); 5: /* (III) Generate future values using a current regime */ 6:
′};
[0076] Based on Table 2, optimization algorithms for the real-time forecasting of co-evolving tensor streams will be described. In the above, a model based on a non-linear dynamical system has been proposed. In order to effectively and accurately forecast a future event by use of the model, the following two problems need to be addressed. Specifically, (a) forecasting a future event in real time while adaptively generating and switching regimes, and (b) automatic mining, and estimating multiple non-linear dynamics.
[0077] With respect to the problem (a), an effective way is needed to manage the entire model structure step-by-step so as to detect regime switching to another known/unknown regime. With respect to the problem (b), a criterion is needed to determine a sufficiently compressed model that is able to capture the underlying dynamics of data without any human intervention. CubeCast is a streaming algorithm that achieves such problems (a) and (b). The algorithm of Table 2 shows the overall procedure of CubeCast. The basic idea of the algorithm is a tensor encoding system. This updates all the components in a parameter set Θ while processing a current tensor X.sup.c.
[0078] More specifically, the algorithm is configured by the following elements (1) to (3).
[0079] (1) Regime Estimation: Estimating a non-linear dynamical system from zero. Namely, the current tensor X.sup.c is designated to estimate θ. In addition, regime assignment r is placed to divide the tensor and add a set of observation matrices in W and U to the regime θ.
[0080] (2) Regime Compression: Updating all the parameter set Θ and regime assignment set R by use of the current tensor X.sup.c and a newly estimated regime θ and regime assignment r for X.sup.c. In this step, the algorithm determines whether or not to employ a new regime θ and selects an optimal regime for X.sup.c. After the regime assignment set R is updated, the seasonality S is also updated.
[0081] (3) Finally, an is-step ahead future event tensor E.sup.f={e.sub.tij}.sup.te,dl,dk.sub.t, i, j=ts, 1, 1 according to Mathematical Formula 5 by use of the most suitable regime θ and the regime assignment r for X.sup.c selected by Regime Compression.
[0082] Subsequently, an automatic tensor summarization will be described. In this example, an objective function will be described with respect to the minimum description length (MDL) principle in order to automatically detect the optimal model set. According to the MDL principle, as shown in Mathematical Formula 8, the nature of a good summarization is determined by minimizing the sum of the model description cost and data encoding cost as follows.
[0083] Herein, <Θ′> represents the describing cost, and <X |Θ′> represents the cost of describing the data X to which the model θ′ is given. In other words, the above follows the assumption that the more data is able to be compressed, the more is able to be learned about the underlying pattern. Therefore, in the present algorithm, two costs that have a trade-off relationship to each other are proposed for the model.
[0084] Subsequently, the model description cost will be described. The class of the model parameter set that needs to be searched is parameterized by the number of latent states for trends and seasonality as well as the number of regimes. Once these numerical values are obtained, as shown in Mathematical Formula 9, the description complexity of the entire model with the following terms is calculated. [0085] The dimensionality of a tensor:
<t.sub.c>=log*(t.sub.c).sup.1, <d.sub.l>=log*(d.sub.l), <d.sub.k>=log*(d.sub.k) [0086] The dimensionality of latent components:
<k.sub.z>=log*(k.sub.z), <k.sub.v>=log*(k.sub.v), <p>=log*(p). [0087] Seasonality:
<S>=|S|.Math.(log(p)+log(k.sub.v)+c.sub.F)+log*(|S|). [0088] Single regime parameter set:
<θ>=<k.sub.z>+<A>+<B>+<W>+<U>. [Mathematical Formula 9]
[0089] Herein, |.Math.| describes the number of non-zero elements and cF denotes the floating point cost. The model description cost of each component in <θ> is defined as the following Mathematical Formula 10.
<A>=|A|.Math.(2.Math.log(k)+c.sub.F)+log*(|A|),
<>=|
|.Math.(3.Math.log(k)+c.sub.F)+log*(|
|),
<>=Σ.sub.i=1.sup.m|W.sub.i|.Math.(log(d.sub.k)+log(k.sub.z)+c.sub.F)+log*(|W.sub.i|),
<>=Σ.sub.i=1.sup.m|U.sub.i|.Math.(log(d.sub.k)+log(k.sub.v)+c.sub.F)+log*(|U.sub.i|). [Mathematical Formula 10]
[0090] Subsequently, the data encoding cost will be described. The data X is able to be encoded by use of 8 based on the publicly known Huffman coding. The coding scheme assigns the number of bits to each value in X. This is the negative log-likelihood under a Gaussian distribution with mean μ and variance σ.sup.2, which is represented by Mathematical Formula 11.
[0091] Herein, e.sub.t ij ∈ E shows a reconstruction value of x.sub.t ij ∈ X used in Mathematical Formula 5. Finally, the total encoding cost <X; Θ> is obtained as shown in Mathematical Formula 12.
[0092] Subsequently, regime estimation will be described. It is difficult to find the global optimal solution of Mathematical Formula 12 due to interdependent components in the model. In other words, (a) latent dynamical systems A and B, (b) matrices in the observation matrices W and U, and (c) seasonality S, and therefore, the optimal local pattern of components (a) and (b) are aimed to be found by first using a greedy approach. More specifically, the algorithm 2 of Regime Estimation to minimize an equation (Mathematical Formula 12) for a tensor X.sup.c is provided as shown in Table 3.
TABLE-US-00003 TABLE 3 Algorithm 2 REGIMEESTIMATION (χ.sup.c, S) Input: Current tensor χ.sup.c and seasonality S Output: Regime parameter set θ and regime assignment r 1: = ϕ;
= ϕ; r = {r.sub.i = 1|i = 1, . . . , d.sub.l}; 2:
* = ϕ;
* = ϕ; // candidate observation matrix set 3: /* Estimate a regime with a single local activity */ 4:
*; Push U into
*; 6: /* Estimate local activities */ 7: while
* and
* are not empty do 8: Pop an entry W.sub.0 from
*; Pop an entry U.sub.0 from
*; 9: θ ← {A,
,
.sub.F,
.sub.F}; //
.sub.F =
∪
* ∪ {W.sub.0} 10: Initialize r*; Initialize W.sub.1, W.sub.2, U.sub.1, U.sub.2; 11: θ* ← {A*,
*,
.sub.F*,
.sub.F*}; //A* = A,
* =
12: //
.sub.F* =
∪
* ∪ {W.sub.1, W.sub.2},
.sub.F* =
∪
* ∪ {U.sub.1, U.sub.2} 13: while < χ.sup.c; S, θ*, r* > is improved do 14: Estimate r*; 15: Estimate W.sub.1, W.sub.2, U.sub.1, U.sub.2; 16: Estimate A*,
*; 17: end while 18: if < χ.sup.c; S, θ*, r* > is less than < χ.sup.c; S, θ, r > then 19: Push {W.sub.1, W.sub.2} into
*; Push {U.sub.1, U.sub.2} into
*; 20: A ← A*;
←
*; r ← r* 21: else 22: Push W.sub.0 into
; Push U.sub.0 into
; 23: end if 24: end while 25: return {θ, r}; // θ = {A,
,
,
}
[0093] The algorithm 2 shown in Table 3 shows Regime Estimation in detail. First, the current tensor X.sup.c is regarded as a single regime. A discrete local pattern in the current tensor X.sup.c is searched by grouping similar dimensions in a target mode of the current tensor X.sup.c. The first goal is to estimate the optimal parameter θ={A, B, W, U}, to fix the seasonality S, and to minimize the total cost <X.sup.c; S, θ, r>.
[0094] As shown in
[0095] Next, a way to find a difference with respect to one of the aspects of a tensor will be described. An efficient stack-based algorithm that does not consider combinations of all candidates of rich attributes such as locations is proposed. W* and U* are stacks including a local activity candidate that is able to be further divided. The stacks are not empty, and the algorithm pops an entry {W.sub.0, U.sub.0}, and then divides a local group into two by generating {W.sub.1, U.sub.1} and {W.sub.2, U.sub.2}.
[0096] After the first local activity assignment r* for two candidate local groups is initialized, the following three procedures are iterated to estimate a new parameter set θ*.
[0097] (Procedure 1) A reconstruction error is minimized only by updating {W.sub.1, W.sub.2, U.sub.1, U.sub.2} ∈ θ*. (Procedure 2) A reconstruction error is minimized only by updating {A*, B*} ∈ θ*, and (Procedure 3) Based on a newly estimated parameter θ*, regime assignment in r* is rearranged only for the two candidate local groups.
[0098] The new assignment r*.sub.i ∈ r* of the i-th country in a divided local group, for example, is set to a local group index that minimizes the total cost <X.sup.c, .sub.i|A, B, W.sub.j, U.sub.j>, where j ∈ {1, 2}. This alternative procedure makes the latent dynamical system more sophisticated with respect to a divided activity. The update of A and B affects the model quality for the entire local groups, so that all observation matrices W.sub.F and U.sub.F may be used in every iteration. Finally, in a case in which the coding cost with newly estimated components θ* and r* is less than the cost with undivided components θ and r, the algorithm stores a new candidate pair in the stacks W* and U* (that is, m=m+1) and performs subsequent iteration processing. Otherwise, W.sub.0 and U.sub.0 are used as an optimal local group.
[0099] Subsequently, Regime Compression will be described. Actual applications are configured by several individual phases. Regime Compression that makes effective and efficient updating possible is employed so that the approach may be able to detect a next dynamical pattern. The main idea is to employ/update a regime when the total cost of X.sup.c is reduced. The overall Regime Compression algorithm is shown in Table 4 as Algorithm 3.
TABLE-US-00004 TABLE 4 Algorithm 3 REGIMECOMPRESSION (χ.sup.c, Θ, , θ, r) Input: (a) Current tensor χ.sup.c (b) Full parameter set Θ and regime assignment set
(c) Candidate regime θ and regime assignment r Output: Updated model set Θ* and regime assignment set
* 1: /* Search an optimal regime within Θ */ 2:
* ←
∪ r; 5: θ* ← θ; r* ← r; // Replace an optimal regime with a new regime 6: else 7: Θ* ← Θ;
* ←
; 8: end if 9: while < χ.sup.c; S, θ*, r* > is improved do 10: Estimate θ*; // θ* ϵ Θ* 11: Estimate S; // S ϵ Θ* 12: end while 13: return {Θ*,
*};
[0100] When a current tensor X.sup.c is given, an optimal regime is detected based on a previous model set {Θ, R} and a candidate regime {θ, r} estimated using Regime Estimation. A goal is to continue minimizing the total cost of X.sup.c when a model set is given. First, the algorithm searches for an optimal regime θ* ∈ Θ and r* ∈ R. As a result, the coding cost <X.sup.c |S, θ*, r*> is minimized.
[0101] In a case in which the total cost for X.sup.c is less than θ* due to a newly estimated θ, θ is added to Θ. This indicates that θ is a proper summarization for an additional pattern. Otherwise, X.sup.c will be described with θ*. After the algorithm updates regime shift dynamics R, the seasonality S and the current regime θ* are updated by use of the LM algorithm. More specifically, one component is alternately updated with another fixed component. The number kv of seasonal components that minimizes the reconstruction error is able to be found.
[0102] Before online forecasting is started, the number of seasonal components kv and the seasonality S need to be initialized. Therefore, two components are estimated based on independent component analysis (ICA). Specifically, a regime θ is first estimated by use of Regime Estimation. Herein, kv=0 and S=0. Subsequently, kv is varied as kv=1, 2, 3, . . . , and an appropriate number is determined so as to minimize the total cost <X; S, θ, r>. For each given kv, the ICA is applied to the matrix X ∈ R.sup.p×d that is reshaped from X, and an independent component is obtained as S. It is to be noted that, in the present embodiment, the computational time of CubeCast is O(n dldk) per time point. Herein, n is the number of regimes.
[0103] Subsequently, an experiment will be described. Table 5 describes a query (search keywords) of a dataset used for the experiment.
TABLE-US-00005 TABLE 5 ID Dataset Query #1 Apparel zara, uniqlo, h&m, gap, primark #2 Chatapps facebook, LINE, slack, snapchat, twitter, telegram, viber, whatsapp #3 Hobby soccer, baseball, basketball, running, yoga, crafts #4 LinuxOS debian, ubuntu, centos, redhat, fedora, opensuse, steamos, raspbian, kubuntu #5 PythonLib numpy, scipy, sklearn, matplotlib, plotly, tensorflow #6 Shoes booties, flats, heels, loafers, pumps, sandals, sneakers
[0104] Hereinafter, the performance of CubeCast on a real dataset will be described. The present experiment was conducted with respect to the evaluation of effectiveness, accuracy, and scalability. It is to be noted that the present experiment was conducted on an Intel Xeon W-2123 3.6 GHz quad core CPU with 128 GB of memory, running Linux (registered trademark). [0105] Dataset: six real event streams were used on Google (registered trademark) Trends. This contains weekly search volumes for keywords from Jan. 1, 2004 to Dec. 31, 2018 (14 years in total) from 236 countries. It is to be noted that, due to a significant amount of missing data, the top 50 countries were selected in order of the GDP scores of the countries. A value was normalized so that each sequence might have the same mean and variance (that is, z-normalization). [0106] Baseline: the following state-of-the-art algorithm was employed for modeling and forecasting time series as a comparative example method.
<Publicly Known Comparative Example Method>
[0107] (1) RegimeCast (see Patent Literature 1): Real-time forecasting method with multiple discrete non-linear dynamical systems. The number of latent states k=4, the model hierarchy h=2, and the model generation threshold ε=0.5.Math.∥Xc∥ was set up.
[0108] (2) SARIMA: A state space method for obtaining a seasonal element of time series. Based on AIC, the optimal number of parameters for the model was selected from {1, 2, 4, 8}.
[0109] (3) MLDS: Multilinear dynamical system (MLDS) that learns the multilinear projection of each dimension of a sequence of latent tensors. The ranks of the latent tensors {2, 4} and {4, 8} were varied.
[0110] (4) LSTM/GRU: An RNN-based model for time series. A two-layer LSTM/GRU was stacked to encode and decode/forecast parts each of which has 50 units. In addition, a dropout rate of 0.5 to the connection of the output layer was applied. In this learning step, Adam optimization and early stopping were used.
[0111] <Discussion of Experiment>
[0112] (1) Effectiveness
[0113] First, how CubeCast found a dynamical pattern and the structural change over time in a co-evolving tensor stream will be described.
[0114] Overall, the proposed model normally captured latent dynamical patterns for multiple countries and keywords. As shown in
[0115] (2) Accuracy
[0116]
[0117] (3) Scalability
[0118] Finally, the computational time needed by CubeCast for large tensor time series is evaluated by comparison with the comparative example method.
[0119]
[0120] As described above, CubeCast (the present method) has proposed an effective and efficient forecasting method for large time-evolving tensor series. The present method is able to recognize basic trends and seasonality in input observation by extracting the latent non-linear dynamical system. In addition, the present method was shown to have advantages such as being effective, automatic, and scalable, over the above comparative example method with respect to time series forecasting using real Google (registered trademark) search volume datasets. Being effective is to effectively capture complex non-linear dynamics for tensor time series when a long-term future value is forecast. Being automatic is to automatically recognize all components in regimes and the temporal/structural innovations of all the components in the regimes without the need of prior knowledge of data. Being scalable means that the computational time of CubeCast is independent of the time series length.
[0121] Moreover,
[0122] It is to be noted that the present invention is applicable to grouping of locations by country as well as by region, gender, and various other perspectives. In addition, the present invention may also be applicable to marketing and purchase motivation of consumers. Moreover, the present invention is applicable to human activities that include temporal periodicity in nature, in addition to social activities.
[0123] As described above, a forecasting apparatus according to the present invention, in the forecasting apparatus that forecasts an event after a predetermined time by applying estimated data reproduced from time-series data in multidimension that passes through a current window, preferably includes a storage unit that sequentially stores the time-series data in the multidimension that passes through the current window, a non-linear transformation unit that, among the time-series data in the multidimension to be outputted from the storage unit, from the time-series data in a part of dimensions that are related to trends, non-linearly transforms and outputs latent first data showing the trends, and, among the time-series data in the multidimension to be outputted from the storage unit, from the time-series data in a part of dimensions that are related to seasonal intensity, linearly transforms and outputs latent second data showing the seasonal intensity, an observation matrix unit that includes a first observation matrix that reproduces the first data to first estimated data of an original number of dimensions, and a second observation matrix that, by use of seasonality information that has been set in a seasonality setting unit, reproduces the second data to second estimated data of an original number of dimensions, as seasonality data, and further adds output of the first observation matrix and the second observation matrix and outputs as the estimated data.
[0124] In addition, a forecasting method according to the present invention, in the forecasting method that forecasts an event after a predetermined time by applying estimated data reproduced from time-series data in multidimension that passes through a current window, preferably includes sequentially storing in a storage unit the time-series data in the multidimension that passes through the current window, among the time-series data in the multidimension to be outputted from the storage unit, from time-series data in a part of dimensions that are related to trends, non-linearly transforming and outputting latent first data showing the trends, and, among the time-series data in the multidimension to be outputted from the storage unit, from the time-series data in a part of dimensions that are related to seasonal intensity, and linearly transforming and outputting latent second data showing the seasonal intensity, and reproducing the first data to first estimated data of an original number of dimensions by a first observation matrix, and, by use of seasonality information that has been set in a seasonality setting unit, reproducing the second data to second estimated data of an original number of dimensions by a second observation matrix, as seasonality data, and further adding output of the first observation matrix and the second observation matrix and outputting as the estimated data.
[0125] Moreover, a non-transitory computer readable storage medium storing a program according to the present invention causes a computer to preferably implement, in forecasting an event after a predetermined time by applying estimated data reproduced from time-series data in multidimension that passes through a current window, sequentially storing in a storage unit the time-series data in the multidimension that passes through the current window, among the time-series data in the multidimension to be outputted from the storage unit, from the time-series data in a part of dimensions that are related to trends, non-linearly transforming and outputting latent first data showing the trends, and, among the time-series data in the multidimension to be outputted from the storage unit, from the time-series data in a part of dimensions that are related to seasonal intensity, and linearly transforming and outputting latent second data showing the seasonal intensity, and reproducing the first data to first estimated data of an original number of dimensions by a first observation matrix, and, by use of seasonality information that has been set in a seasonality setting unit, reproducing the second data to second estimated data of an original number of dimensions by a second observation matrix, as seasonality data, and further adding output of the first observation matrix and the second observation matrix and outputting as the estimated data.
[0126] According to these inventions, from the time-series data of the current window in a part of dimensions that are related to trends and the time-series data of the current window in a part of dimensions that are related to seasonal intensity, by employing, for example, an adaptive non-linear dynamical system including a differential equation as the non-linear transformation unit, the latent first data showing the trends and the latent second data showing the seasonal intensity are extracted by non-linear transformation and by linear transformation, that is, transformed and generated. Then, the first data is reproduced by the first observation matrix to the first estimated data of the original number of dimensions, and the second data is reproduced by the second observation matrix to the second estimated data of the original number of dimensions, by use of seasonality information that has been set in the seasonality setting unit, and the first estimated data and the second estimated data are further added to output of the first observation matrix and the second observation matrix and outputted as the estimated data. Therefore, latent trends and seasonality are extracted from the time-series data and reproduced so that original time-series data may be estimated by a model of the data of the latent trends and the seasonality. The forecasting, since being performed by the model at this time, is performed effectively and highly accurately.
[0127] In addition, the present invention preferably includes a model parameter estimation unit, and the model parameter estimation unit preferably adjusts a parameter of the non-linear transformation unit, the first observation matrix, and the second observation matrix, and a setting content of the seasonality setting unit so as to minimize a difference between the estimated data being a result of addition of the first estimated data and the second estimated data, and the time-series data of the current window. According to this configuration, the estimated data being the result of addition of the first estimated data and the second estimated data may be approximated to the time-series data of the current window, and the model at this time is able to be used to improve forecasting accuracy.
[0128] Moreover, the non-linear transformation unit, in a case in which the multidimension is d-dimensional, preferably receives an input and sends an output of the time-series data for k-dimension (<d) that is obtained by combining the time-series data for kz-dimension related to the trends and the time-series data for kv-dimension related to the seasonal intensity. According to this configuration, latent data is captured in the smaller number of dimensions.
[0129] In addition, the non-linear transformation unit is preferably connected in series to the two-dimensional matrix that performs linear transformation and the three-dimensional tensor matrix that performs non-linear transformation. According to this configuration, the latent second data showing the seasonal intensity is captured in addition to the latent first data showing the trends with the matrix and the tensor matrix.
[0130] Moreover, the present invention preferably includes a regime update unit, and the time-series data is preferably configured by an element of a keyword, a location, and elapsed time information, and the regime update unit preferably divides the first observation matrix and the second observation matrix into multiple regimes at least with respect to the element of the location. According to this configuration, a new regime by use of a location and even further a keyword as the element is able to be increased, which makes it possible to improve forecasting accuracy. It is to be noted that the element of the location may include a place, a region, a country, and other social or physical distinction.
[0131] In addition, the present invention preferably includes a resume addition unit, and the regime update unit preferably compares a sum of model description cost and data encoding cost by applying principle of the minimum description length (MDL), with respect to an original regime model and a divided new regime model, and the regime addition unit, in a case in which cost of the new regime model is lower, preferably additionally registers a parameter configuring the new regime model in a parameter set storage unit. According to this configuration, determination to set a new regime model is made automatically.
REFERENCE SIGNS LIST
[0132] 100 forecasting apparatus [0133] 1 calculation unit [0134] 10 model parameter estimation unit [0135] 11 non-linear transformation unit [0136] 14 seasonality setting unit [0137] 16, 17 observation matrix [0138] 20 regime update unit [0139] 30 regime addition unit [0140] 40 forecasting unit [0141] 101 program storage unit [0142] 102 data stream storage unit [0143] 103 current window storage unit [0144] 104 parameter set storage unit