FEATURE SELECTION FOR A DEMAND FORECASTING SYSTEM

Abstract

Aspects of the present disclosure relate to a demand forecasting system. The demand forecasting system may include components for developing forecasting models, generating demand forecasts, and handling outputs of demand forecasting models. In some embodiments, the demand forecasting system may include a model training system and one or more components that can be used by the model training system to improve model performance.

Claims

1. A feature management system, the system comprising: a data storage system storing a plurality of forecasting models comprising a first item-specific forecasting model for forecasting demand for a first item of a plurality of items and a second item-specific forecasting model for forecasting demand for a second item of the plurality of items; a processor; and a memory storing instructions that, when executed by the processor, cause the feature management system to: access time series data for the first item and the second item; access a feature calendar, the feature calendar including a plurality of features, wherein each feature of the plurality of features is associated with a respective time period during which the feature is active; analyze the time series data for the first item and the feature calendar to identify, using a threshold value corresponding to feature relevance, a first set of features of the plurality of features for use in the first item-specific forecasting model; and analyze the time series data for the second item and the feature calendar to identify, using the threshold value corresponding to feature relevance, a second set of features of the plurality of features for use in the second item-specific forecasting model, wherein the first set of features includes a first feature that is not present in the second set of features, and the second set of features includes a second feature that is not present in the first set of features.

2. The feature management system of claim 1, wherein the first item-specific forecasting model and the second item-specific forecasting model are generalized additive mixed models (GAMM) comprising smooth terms, random effects, and fixed effects; wherein the first item-specific forecasting model and the second item-specific forecasting model include a common set of smooth terms and random effects; and wherein the first set of features and the second set of features are fixed effects.

3. The feature management system of claim 1, wherein the plurality of features are holidays.

4. The feature management system of claim 1, wherein identifying the first set of features of the plurality of features for use in the first item-specific forecasting model occurs at a first time; and wherein the first item-specific forecasting model is used, at a second time later than the first time, to generate a demand forecast for the first item using the first set of features.

5. The feature management system of claim 1, wherein the plurality of forecasting models comprises a channel-specific forecasting model for the first item, the channel-specific forecasting model being configured to generate demand forecasts for the first item for a digital demand channel; and wherein the instructions, when executed by the processor, further cause the feature management system to analyze the time series data for the first item and the feature calendar to identify, using the threshold value corresponding to feature relevance, a third set of features of the plurality of features for use in the channel-specific forecasting model, the third set of features being different from the first set of features.

6. The feature management system of claim 1, wherein the first item-specific forecasting model uses flags that indicate when each feature of the first set of features is active and inactive; and wherein the first item-specific forecasting model, when forecasting demand for the first item, uses features of the first set of features only during respective time periods during which each feature of the first set of features is active.

7. The feature management system of claim 1, wherein the first set of features includes more features than the second set of features; and wherein the first set of features and the second set of features include a common feature.

8. The feature management system of claim 1, wherein analyzing the time series data for the first item and the feature calendar to identify, using the threshold value corresponding to feature relevance, the first set of features of the plurality of features for use in the first item-specific forecasting model comprises: determining a group of items that includes the first item, the group of items sharing a common attribute; accessing aggregated time series data for the group of items; and analyzing the aggregated time series data and the feature calendar to identify, using the threshold value corresponding to feature relevance, group-level features of the plurality of features.

9. The feature management system of claim 8, wherein analyzing the time series data for the first item and the feature calendar to identify, using the threshold value corresponding to feature relevance, the first set of features of the plurality of features for use in the first item-specific forecasting model further comprises: determining that the time series data for the first item is insufficient to identify item-specific features for the item-specific forecasting model; and including only the group-level features in the first set of features.

10. The feature management system of claim 8, wherein analyzing the time series data for the first item and the feature calendar to identify, using the threshold value corresponding to feature relevance, the first set of features of the plurality of features for use in the first item-specific forecasting model further comprises combining the group-level features with item-specific features of the plurality of features to identify the first set of features.

11. The feature management system of claim 1, wherein analyzing the time series data for the first item and the feature calendar to identify, using the threshold value corresponding to feature relevance, the first set of features of the plurality of features for use in the first item-specific forecasting model comprises: decomposing the time series data for the first item to remove effects except the plurality of features, the plurality of features consisting of a set of fixed effects; and ranking the plurality of features by impact on the decomposed time series data.

12. The feature management system of claim 1, wherein the threshold value corresponds to a number of features of the plurality of features to include in the first set of features or corresponds to a number indicative of feature impact for predicting demand of the first item.

13. The feature management system of claim 1, wherein the time series data includes sales data or demand forecasts over days or weeks.

14. A demand forecasting system, the system comprising: a data storage system storing a plurality of forecasting models comprising a first forecasting model for forecasting demand for a first item of a plurality of items and a second forecasting model for forecasting demand for a second item of the plurality of items; a feature management system configured to: access time series data for the first item and the second item; access a feature calendar, the feature calendar including a plurality of features, wherein each feature of the plurality of features is associated with a respective time period during which the feature is active; analyze the feature calendar and the time series data for the first item and the second item to identify, using a threshold value corresponding to feature relevance, a first set of features of the plurality of features for use in the first forecasting model and a second set of features of the plurality of features for use in the second forecasting model, the first set of features being different from the second set of features; and determine a first hyperparameter for the first model and a second hyperparameter for the second model, the first hyperparameter being different from the second hyperparameter.

15. The demand forecasting system of claim 14, wherein the first forecasting model and the second forecasting model are generalized additive mixed models (GAMM); and wherein the first hyperparameter indicates that a smooth term or a random effect is not included in the first forecasting model, and wherein the second hyperparameter indicates that the smooth term or the random effect is included in the second forecasting model.

16. The demand forecasting system of claim 14, wherein determining the first hyperparameter of the first model comprises: using the time series data for the first item, select a plurality of hyperparameters of the first model to optimize; determine optimized values for the plurality of hyperparameters; determine combinations of optimized and default values for the plurality of hyperparameters; and select an optimal combination of optimized and default values from the combinations of optimized and default values, wherein the optimal combination comprises the first hyperparameter.

17. The demand forecasting system of claim 14, wherein the first hyperparameter and the second hyperparameter correspond to parameters for training the first forecasting model and the second forecasting model respectively.

18. A system for forecasting item demand, the system comprising: a data storage system storing a plurality of forecasting models comprising a first forecasting model for forecasting demand for a first item of a plurality of items and a second forecasting model for forecasting demand for a second item of the plurality of items; and a feature management system configured to: access time series data for the first item and the second item; access a feature calendar, the feature calendar including a plurality of features, wherein each feature of the plurality of features is associated with a respective time period during which the feature is active; analyze the feature calendar and the time series data for the first item and the second item to identify, using a threshold value corresponding to feature relevance, a first set of features of the plurality of features for use in the first forecasting model and a second set of features of the plurality of features for use in the second forecasting model, the first set of features being different from the second set of features; and a model training system configured to train the first forecasting model using the first set of features and to train the second forecasting model using the second set of features.

19. The system of claim 18, wherein the first forecasting model comprises features in addition to the first set of features, the features in addition to the first set of features including one or more promotion features; wherein the first forecasting model is configured to forecast demand for the first item, wherein forecasting the demand for the first item comprises applying a promotion system to determine an effect of the one or more promotion features, the promotion system being configured to: for each promotion of the one or more promotion features: determine a price change of the promotion by in part using a promotion price and a redemption rate; determine an elasticity of the item; and using the elasticity and the price change, determine a demand increase for the item.

20. The system of claim 19, wherein the feature management system is configured to update the first forecasting model by adding or removing a promotion of the one or more promotions; and wherein the second forecasting model includes features corresponding to different promotions than the one or more promotions of the first forecasting model.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

[0004] FIG. 1 illustrates an example network environment in which aspects of the present disclosure may be implemented.

[0005] FIG. 2 illustrates a block diagram of an example architecture of the forecasting system.

[0006] FIG. 3 illustrates a schematic representation of aspects of a forecasting model.

[0007] FIG. 4 illustrates a schematic diagram of a disaggregation of a demand forecast for an item.

[0008] FIG. 5 illustrates a schematic diagram of components that may be used in connection with one or more of the generation or use of synthetic data.

[0009] FIG. 6 is a flowchart of an example method according to aspects of the present disclosure.

[0010] FIG. 7 is a flowchart of an example method according to aspects of the present disclosure.

[0011] FIG. 8 illustrates a block diagram of an example architecture of a feature management system.

[0012] FIG. 9 is a flowchart of an example method according to aspects of the present disclosure.

[0013] FIG. 10 illustrates a schematic diagram of determining holidays for an example item.

[0014] FIG. 11 illustrates a schematic diagram of an example representation of an example technique for identifying feature importance.

[0015] FIG. 12 illustrates an example diagram depicting holiday feature importance.

[0016] FIG. 13 illustrates an example architecture of the hyperparameter tuning system.

[0017] FIG. 14 is a flowchart of an example method according to aspects of the present disclosure.

[0018] FIG. 15A illustrates an example table depicting example hyperparameter combinations.

[0019] FIG. 15B illustrates an example tree diagram for selecting hyperparameter values.

[0020] FIG. 16 illustrates an example architecture of the promotion system.

[0021] FIG. 17 is a flowchart of an example method according to aspects of the present disclosure.

[0022] FIG. 18 illustrates an example architecture of a similar item data pooling system.

[0023] FIG. 19 is a flowchart of an example method according to aspects of the present disclosure.

[0024] FIG. 20 illustrates an example architecture that includes a forecasting application programming interface.

[0025] FIG. 21 illustrates example aspects of a demand transfer engine.

[0026] FIG. 22 illustrates an example graph depicting items and substitutability scores.

[0027] FIG. 23 illustrates an example architecture of an artificial intelligence system.

[0028] FIG. 24 illustrates an example user interface.

[0029] FIG. 25 illustrates a block diagram of an example computing environment.

[0030] FIG. 26 illustrates an example user interface according to aspects of the present disclosure.

[0031] FIG. 27 illustrates an example user interface according to aspects of the present disclosure.

[0032] FIG. 28 illustrates an example user interface according to aspects of the present disclosure.

[0033] FIG. 29 illustrates an example user interface according to aspects of the present disclosure.

DETAILED DESCRIPTION

[0034] Aspects of the present disclosure relate to a system for forecasting demand, such as item demand over time. In some instances, demand over time is a complex stochastic process dependent on many variables, some of which may be unknown or unknowable. Given this complex stochastic nature of forecasting demand, computer models may be useful. Because computer systems can process, store, and exchange data in ways that go far beyond what humans can perform, computer models may forecast demand using methods that are distinct from any related human methods for forecasting data. That is, modeling complex stochastic processes, such as demand over time, is an endeavor that goes to the core of computer technology. Furthermore, as a result of computers' unique processing capabilities, computer models may ultimately derive much more accurate demand forecasts at much faster speeds than other techniques for forecasting demand. Thus, a demand forecasting system may use computer models; however, the effectiveness of such a forecasting system may depend on various factors, such as the quantity and quality of training data, the features used by the forecasting system, hyperparameters of the forecasting system, responsiveness to user inputs, or other factors.

[0035] In example aspects, the forecasting system disclosed herein may forecast demand for a plurality of items. For an example item, the forecasting system may forecast the demand for the example item at each of a plurality of locations (e.g., a plurality of individual stores or regions), may forecast the demand for the example item generally (e.g., across all locations), or may forecast demand for the example item across different fulfillment channels (e.g., a digital fulfillment process, a customer pickup fulfillment process, or an in-store fulfillment process). To do so, the forecasting system may apply one or more models that are trained on historical demand data associated with the example item, which may include historical sales for the example item. Accounting for this historical demand data and other considerations, the models may output a demand forecast. Example aspects of the forecasting system are illustrated in connection with FIG. 1.

[0036] FIG. 1 illustrates an example network environment 100 in which aspects of the present disclosure may be implemented. In the example shown, the environment 100 includes a forecasting system 102, retail locations 104, a digital platform 106, item data 108, sales data 110, forecast data 112, an external system 114, an administrator 120, a forecast consumer 122, and networks 124a-c. In some embodiments, one or more of the components illustrated in FIG. 1 may be associated with a common entity. For example, one or more of the components illustrated in FIG. 1 may be part of an information system of an organization or may be communicatively coupled with aspects of an information system of an entity. In some embodiments, the entity is a retailer. In some embodiments, the entity may be a manufacturer, logistics organization, software organization, or other type of entity.

[0037] The forecasting system 102 may be a collection of hardware and software configured to forecast demand. In some embodiments, the forecasting system 102 may use computer models to generate demand forecasts for items. In examples, the items are offered for sale by a retailer. In some embodiments, aspects of the forecasting system 102 may be distributed across different computing environments. In some embodiments, the forecasting system 102 may receive data from one or more of the components 104-118, may receive a configuration from the administrator 120 or the forecast consumer 122, and may output demand forecasts to the forecast consumer 122. In some embodiments, communications between the forecasting system 102 and other components of FIG. 1 may occur in real time in response to, for example, updates to data of the databases 108-112, inputs from a forecast consumer 122 or the administrator 120, or a timed workflow that triggers data exchanges among components of FIG. 1. An example architecture of the forecasting system 102 is illustrated and described in connection with FIG. 2.

[0038] The forecasting system 102 may be used to generate different types of demand forecasts at the item level, class level, or entire retail chain level, and for overall demand, digital demand (e.g., purchases via website or mobile application), or in-store demand. An example of such a modeling approach is described in U.S. Pat. No. 11,373,199, entitled Method and System for Generating Ensemble Demand Forecasts, the disclosure of which is hereby incorporated by reference in its entirety. In further examples, the demand forecasting model may be a chain-level model that forecasts overall demand for an item based on item attributes. Such a modeling approach is described in U.S. Pat. No. 11,182,808, entitled Method and System for Attributes Based Forecasting, the disclosure of which is hereby incorporated by reference in its entirety.

[0039] The retail locations 104 may be one or more physical locations associated with a retailer. In some embodiments, the forecasting system 102 may generate a demand forecast for a location or group of locations of the retail locations 104. In some instances, a customer may purchase or pick up an item from a retail location of the retail locations 104, which may be registered as demand for that item from that retail location at the time that the customer purchased or picked up the item. In some instances, an item may be shipped from a retail location of the retail locations 104, which may be registered as demand for that item at the time of purchase or shipment, and may or may not be registered as demand at that location. The retail locations 104 may include stores, warehouses, sortation centers, manufacturing facilities, shipping hubs, offices, or other locations that may be used by an entity associated with components of FIG. 1. When a retail location is a store, the retail location may include distinct physical areas and equipment to facilitate fulfillment of orders according to different fulfillment channels, such as a first area and equipment for shipping items from the store and a second area and equipment for providing items to customers who shop in the store. The store may include equipment, such as cameras, video processing software and hardware, automated scanning devices, robotic processing equipment, and other hardware for registering demand (e.g., item movement, purchases, or shipment) at the store and automatically providing data for that demand to backend systems, such as, for example, the item data 108, sales data 110, or forecasting system 102. In some embodiments, the retail locations 104 may be divided into groups, such as groups based on geography or location type. In some embodiments, the retail locations 104 may include computing systems via which the retail locations 104 may transfer data to and from other components of FIG. 1.

[0040] The digital platform 106 may include computer systems that can receive or track demand for items. For example, the digital platform 106 may include one or more components that facilitate the purchasing of items. In some embodiments, the digital platform 106 includes a website, such as a website of a retailer or a website communicatively coupled with a digital retailer system. In some embodiments, the digital platform 106 may include a mobile application.

[0041] The item data 108 may include data related to items associated with a retailer. For example, the items may be sold by the retailer. In some embodiments, the item data 108 may include data for items for which forecasts are generated by the forecasting system 102. In some embodiments, for a given item, the item data may include one or more of the following: an item name; an item description; an item price; an item inventory; an item vendor; one or more item images; one or more item attributes; seasonal characteristics of the item; one or more locations associated with the item; one or more classifications of the item; an indication of whether the item is a new item; or other data associated with the item. In some embodiments, the forecasting system 102 may use data of the item data 108 to forecast item demand and to determine item similarities.

[0042] The sales data 110 may include historical sales data for items, such as items of the item data 108. In some embodiments, the historical sales data for an item may correspond with historical demand for the item. However, in some instances, sales data 110 may not always accurately reflect demand. For example, if an item is out of stock during a time period at a location or group of locations, then the sales for that item might be low due to its unavailability, not due to a lack of demand. As another example, the sales data 110, which may be captured by computing systems of store locations or of a digital retail platform, may incorrectly record sales in certain instances. For example, coding errors may improperly attribute sales of one item to another item, or may result in programs or integrations that fail to accurately capture, transfer, or store sales data. As a result, the sales data 110 may be incorrect. As another example, an item may be new or limited and therefore the corresponding sales data may be zero or artificially low. Additional scenarios that render the sales data 110 an imperfect technical representation of demand are likewise possible. As described further in connection with at least the synthetic data generator 210 and FIGS. 5-7, aspects of the present disclosure may modify the sales data 110 using synthetically generated sales data to replace data that may be corrupted, incorrect, or otherwise unrepresentative of demand. In some embodiments, the sales data 110 includes separate data that represents actual sales and data that represents demand, which may include synthetically generated sales. In other embodiments, such data may be combined.
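By way of a non-limiting illustration, the following Python sketch shows one way in which sales observations that are unrepresentative of demand could be replaced while the actual sales are retained as separate data. The function name, the in-stock flags, and the use of the mean of in-stock observations as the replacement value are assumptions made only for this illustration; the sketch does not represent the synthetic data generator 210.

```python
from statistics import mean


def replace_unrepresentative_sales(sales, in_stock):
    """Return (actual_sales, demand_series) as two separate lists.

    `sales` holds units sold per period and `in_stock` holds a flag per
    period. Periods flagged as out of stock are replaced in the demand
    series by the mean of the in-stock observations. That replacement
    rule is a placeholder assumption for this sketch only.
    """
    if len(sales) != len(in_stock):
        raise ValueError("sales and in_stock must have the same length")

    observed = [units for units, ok in zip(sales, in_stock) if ok]
    if not observed:
        # Nothing reliable to estimate from, so leave the series unchanged.
        return list(sales), list(sales)

    estimate = mean(observed)
    demand = [units if ok else estimate for units, ok in zip(sales, in_stock)]
    return list(sales), demand


# Hypothetical four-period history with a stockout in the third period.
actual, demand = replace_unrepresentative_sales(
    [10, 12, 0, 14], [True, True, False, True]
)
```

In this sketch the actual sales are returned unmodified, so a downstream component may train on either series.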

[0043] In some embodiments, the sales data 110 may include sales data for an item at a particular location or group of locations. In some embodiments, the sales data 110 is time series data. For example, the sales data 110 may include historical sales data for an item at a location over time, and the sales data 110 may also include historical sales data over time for the item across a supply chain. Different granularities of time are possible. For example, the sales data 110 may include sales per day, per week, per month, per quarter, per year, per a combination thereof, or per another time metric. In some embodiments, the sales data 110 may include sales data generated at the digital platform 106, which may include sales at a website. In some embodiments, the sales data 110 may include synthetic data that is derived at least in part from historical sales data. In some embodiments, the sales data 110 is continuously updated as sales are registered at one or more of the retail locations 104 or the digital platform 106. For example, the sales data 110 may be updated in real time. In some embodiments, sales data 110 is provided to the forecasting system 102, and the forecasting system 102 may use the sales data 110 to train a forecasting model. In some embodiments, the sales data 110 is converted into a standardized format. For instance, sales generated at one of the retail locations 104 may have a different format than sales generated at the digital platform 106, and these formats may be standardized in the sales data 110.
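As a non-limiting illustration of the standardization described above, the following Python sketch converts records from two sources into one common record type and then aggregates them into a daily time series. The record layouts, field names, and function names are hypothetical and are chosen only for this illustration.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class SaleRecord:
    """Hypothetical standardized record for the sales data."""
    item_id: str
    source: str
    sale_date: date
    units: int


def from_store(raw):
    # Assumed store layout: {"sku", "store", "day" (ISO date), "qty"}.
    return SaleRecord(
        item_id=raw["sku"],
        source="store:" + raw["store"],
        sale_date=date.fromisoformat(raw["day"]),
        units=int(raw["qty"]),
    )


def from_digital(raw):
    # Assumed digital layout: {"itemId", "orderedAt" (ISO timestamp), "quantity"}.
    return SaleRecord(
        item_id=raw["itemId"],
        source="digital",
        sale_date=datetime.fromisoformat(raw["orderedAt"]).date(),
        units=int(raw["quantity"]),
    )


def daily_totals(records):
    """Aggregate standardized records into units per (item, day)."""
    totals = {}
    for record in records:
        key = (record.item_id, record.sale_date)
        totals[key] = totals.get(key, 0) + record.units
    return totals


records = [
    from_store({"sku": "A1", "store": "0042", "day": "2024-03-01", "qty": 3}),
    from_digital({"itemId": "A1", "orderedAt": "2024-03-01T14:05:00", "quantity": "2"}),
]
totals = daily_totals(records)
```

Once both sources share one record type, the same aggregation can produce daily, weekly, or other granularities without source-specific logic.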

[0044] The forecast data 112 may include current and past forecasts for items, such as items of the item data 108. In some embodiments, the forecast data 112 is time series data. For example, the forecast data 112 may include demand forecasts for one or more items over time, such as per day, per week, per month, per quarter, per year, per a combination thereof, or per another time metric. The forecast data 112 may be generated by the forecasting system 102. The forecast data 112 may be provided by the forecasting system 102 to the forecast consumer 122. In some embodiments, the forecast data 112 may be compared with the sales data 110 to determine an accuracy of the forecasting system 102. In some embodiments, the forecasting system 102 may check the forecast data 112 for anomalies.
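For illustration only, the following Python sketch compares forecasts with sales for the same periods and flags forecasts that look anomalous. The disclosure does not name a particular accuracy metric or anomaly rule; the weighted absolute percentage error, the `max_ratio` parameter, and both function names are assumptions of this sketch.

```python
def weighted_absolute_percentage_error(forecasts, actuals):
    """Sum of absolute errors divided by the sum of absolute actuals."""
    if len(forecasts) != len(actuals):
        raise ValueError("forecasts and actuals must have the same length")
    total_actual = sum(abs(a) for a in actuals)
    if total_actual == 0:
        raise ValueError("the metric is undefined when all actuals are zero")
    total_error = sum(abs(f - a) for f, a in zip(forecasts, actuals))
    return total_error / total_actual


def flag_anomalies(forecasts, actuals, max_ratio=3.0):
    """Return indices of forecasts that are negative or that exceed
    max_ratio times the largest historical actual (an assumed rule)."""
    ceiling = max_ratio * max(actuals)
    return [i for i, f in enumerate(forecasts) if f < 0 or f > ceiling]


# Hypothetical three-period comparison.
error = weighted_absolute_percentage_error([10, 20, 30], [12, 18, 30])
suspect = flag_anomalies([10, -1, 100], [12, 18, 30])
```

A lower error value indicates forecasts closer to the recorded sales; other metrics or anomaly rules may be substituted without changing the comparison structure.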

[0045] The external system 114 may include data, software, hardware, or infrastructure that may be communicatively coupled with the forecasting system 102. In the example shown, the external system 114 includes an external application 116 and external data 118. The external application 116 may be, for example, a software, platform, or infrastructure service that is useable by the forecasting system 102 to manage or process data. The external data 118 may include a remote data storage service.

[0046] The administrator 120 may be a person or system that configures aspects of the forecasting system 102. For example, the administrator 120 may be an engineer that updates the forecasting system 102, deploys the forecasting system 102, updates a parameter of a configuration file of the forecasting system 102, updates a hyperparameter for a model or process of the forecasting system 102, or performs other operations in connection with the forecasting system 102.

[0047] The forecast consumer 122 may be a system or person that receives data from the forecasting system 102. For example, the forecast consumer may receive demand forecasts from the forecasting system 102. In some embodiments, the forecast consumer 122 may be part of an inventory management system or a supply chain management system. In some embodiments, the forecast consumer 122 may be part of the digital platform 106 or one or more of the retail locations 104. In some embodiments, the forecast consumer 122 may, in response to receiving a forecast from the forecasting system 102, automatically adjust an inventory level for one or more items based on data of the forecast. For example, the forecast consumer 122 may initiate an inventory transfer after receiving a demand forecast from the forecasting system 102. In some embodiments, the forecasting system 102 provides demand forecasts for items to the forecast consumer 122 in real time, and the forecast consumer 122 may automatically rebalance inventory levels at the retail locations 104 based on the demand forecasts.

[0048] In some embodiments, the forecast consumer 122 may be an application that includes a graphical user interface (GUI). The application may be, for example, a web application. The GUI may include visualizations of data received from the forecasting system 102, and the GUI may include selectable input fields for providing data to the forecasting system 102. In some embodiments, a user may query the forecasting system 102 to generate a demand forecast by using components of the GUI. In some embodiments, the forecasting system 102 may, in response to receiving a query from a forecast consumer 122, use an artificial intelligence (AI) to determine an effect of the user's query and generate a recommendation based on the effect. In some embodiments, the forecast consumer 122 may access the item data 108, the sales data 110, or the forecast data 112.

[0049] In examples, the forecast consumer 122 may represent a user affiliated with an enterprise associated with the forecasting system 102. For example, the forecast consumer 122 may be an employee, such as a business analyst, data scientist, or the like who is seeking to generate forecasts regarding various types of business operations, such as forecasted demand for goods or services of the enterprise, forecasted supply issues within an enterprise supply chain, and the like.

[0050] Although the example of FIG. 1 depicts a single forecast consumer 122, the forecasting system 102 may be coupled with a plurality of forecast consumers, each of which may be a different type of system or may be a common type of system. Furthermore, the forecasting system 102 may simultaneously communicate with a plurality of forecast consumers 122.

[0051] FIG. 2 illustrates a block diagram of an example architecture of the forecasting system 102. The forecasting system 102 may include a plurality of subsystems, which may themselves include a plurality of subsystems. Systems and subsystems of the forecasting system 102 are described as having certain features and performing certain operations. However, as will be understood by those having skill in the art, features and operations associated with a first system in a first embodiment may be associated with a different system in a different embodiment. Moreover, such features and operations may overlap across multiple subsystems, and systems in addition to those illustrated in the example of FIG. 2 may be included in the forecasting system 102.

[0052] In the example shown, the forecasting system 102 includes a model development system 202, a model deployment system 204, and a model output system 206. Each of these systems may be a grouping of components that may be part of the forecasting system 102. In some embodiments, components of the systems 202-206 may use aspects of a common computing and data storage infrastructure. In some embodiments, components of the systems 202-206 may use different computing or data storage infrastructures. Although a component in FIG. 2 may be depicted as part of one of the systems 202-206, each of the components may belong to a different one of the systems 202-206.

[0053] In the example shown, the model development system 202, the model deployment system 204, and the model output system 206 may exchange data. For example, the forecasting models 222 may be trained using components of the model development system 202. The forecasting models 222 may be deployed and used in the model deployment system 204. In examples, other components of the model deployment system 204 may be used in connection with the forecasting models 222 as part of generating demand forecasts. The demand forecasts may be provided to the model output system 206 for storage, subsequent processing, or output to a forecast consumer. In some embodiments, data received or generated by the model output system 206 may be provided to the model development system 202 for refining the forecasting models 222, for generating new models, or for other purposes.

[0054] In some embodiments, components described in connection with FIG. 2 may be software. In some embodiments, one or more of the components may be a combination of software and hardware. Additionally, the data exchanges among components of the forecasting system 102 are not limited to the arrows depicted in the example of FIG. 2. Furthermore, the forecasting system 102 may include more or fewer components than those illustrated in connection with FIG. 1.

[0055] In the example shown, the model development system 202 includes the sales data 110, a model training system 208, a synthetic data generator 210, a feature management system 212, a hyperparameter tuning system 214, a promotion system 216, and a model management system 218. Example operations that may be performed by components of the model development system 202 include, but are not limited to, the following: data sourcing; data extraction, cleansing, and creation; data maintenance and updating; data pipeline development; feature development; data exploration and wrangling for model-specific features; model creation; business logic encoding; model evaluation; model selection; and other operations.

[0056] The model training system 208 may train the forecasting models 222. In some embodiments, the forecasting models 222 may be a plurality of models, such as a model for each item of a plurality of items in the item data 108. In such an embodiment, the model training system 208 may train each of the plurality of models. Training the forecasting models 222 may include various steps. For example, the model training system 208 may determine a plurality of features that are to be considered by the forecasting models 222, examples of which are further described below in connection with FIG. 3. In some embodiments, the model training system 208 may use the sales data 110 to train the forecasting models 222. Furthermore, using the sales data 110, the model training system 208 may update weights, parameters, intercepts, random effects, fixed effects, functions, or biases of the forecasting models 222, or any other modifiable aspect of the forecasting models 222, depending on the implementation of the forecasting models 222, as part of training. In some embodiments, the model training system 208 may periodically retrain the forecasting models 222.

[0057] In some embodiments, the forecasting models 222 are stored in one or more data storage systems. For instance, an example forecasting model may be represented at least in part by one or more files, such as configuration files and code that represent features, parameters, weights, and operations of the forecasting models. Such files may be stored in a data storage system. When modifying a model, such as by selecting features for a model or training a model, one or more files associated with the model may be accessed and updated. Moreover, when using a model, such as to forecast demand, the one or more files associated with the model may be accessed.

[0058] The synthetic data generator 210 may generate data that is used to train the forecasting models 222. For example, the synthetic data generator 210 may generate synthetic sales data. Synthetic sales data may be data corresponding to sales that did not actually occur. In some embodiments, the synthetic sales data may be added to the sales data 110. In some embodiments, to generate synthetic sales data, the synthetic data generator 210 may use data corresponding to actual sales from the sales data 110 and may use item data 108. In some embodiments, the synthetic sales data may be used to supplement flawed historical sales data, such as historical sales data for an item during a time associated with a stockout event. As another example, the synthetic data generator 210 may generate synthetic sales data for new items or items with insufficient historical sales data. Example aspects of the synthetic data generator 210 are described in connection with FIGS. 5-7.

[0059] The feature management system 212 may select features that are to be used by the forecasting models 222. For example, the feature management system 212 may select, from a plurality of potential features, a subset of features that are to be used by the forecasting models 222. The feature management system 212 may select different features for different forecasting models of the forecasting models 222. Additionally, the feature management system 212 may, in some embodiments, perform one or more operations associated with the hyperparameter tuning system 214. Example aspects of the feature management system 212 are described in connection with FIGS. 8-12.

[0060] The hyperparameter tuning system 214 may determine hyperparameters and their corresponding values for the forecasting models 222. Example hyperparameters may include a degree to which a feature is weighted in a forecasting model 222. Furthermore, a hyperparameter may include a setting that is used by the model training system 208 when training the forecasting models 222. Example aspects of the hyperparameter tuning system 214 are described in connection with FIGS. 13-15.

[0061] The promotion system 216 may determine the effects of promotions on demand. A promotion may include, for example, a discounted price for an item. As described further herein, there may be different types of promotions, which may have varying degrees of impact on demand forecasts. In some embodiments, the promotion system 216 may determine which promotions are to be considered by the forecasting models 222, and the promotion system 216 may determine the degree to which a promotion affects a demand forecast. In some embodiments, the promotion system 216 may determine a price effect of a promotion, and based on the price effect and an elasticity model, the promotion system 216 may determine an effect of the promotion on the demand forecast. Example aspects of the promotion system 216 are described in connection with FIGS. 16-17.

[0062] The model management system 218 may manage the forecasting models 222. For example, the model management system 218 may manage the deployment, storage, updating, versioning, and other aspects of the forecasting models 222.

[0063] In the example shown, the model deployment system 204 includes a forecasting model controller 220, forecasting models 222, a forecast disaggregator 224, and support models 226. In addition to performing operations related to model deployment, the model deployment system 204 may perform operations related to operating the forecasting models 222. In some embodiments, components of the model deployment system 204 may perform operations that include, but are not limited to, the following: model performance optimization; model versioning and governance; scoring and training automation; data drift evaluation and retraining; and other operations.

[0064] The forecasting model controller 220 may determine a time at which demand forecasts are generated using the forecasting models 222. For example, the forecasting model controller 220 may generate demand forecasts weekly by using the forecasting models 222. In some embodiments, the forecasting model controller 220 executes forecasting models 222 in parallel to simultaneously generate demand forecasts for a plurality of items. In some embodiments, the forecasting model controller 220 may execute, for a given item, one or more passes with a forecasting model of the forecasting models 222 to try to generate a demand forecast. For example, the forecasting model controller 220 may attempt to execute a forecasting model for the item. If that fails (e.g., due to data sparsity, due to a model timing out, or due to another reason), then the forecasting model controller 220 may attempt to execute a forecasting model for a cluster of items, where the cluster of items includes the item. If that fails, then the forecasting model controller 220 may attempt to execute a forecasting model of the forecasting models 222 for a supply chain. Other examples of using the forecasting models 222 to generate a demand forecast are likewise possible.
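
By way of a non-limiting illustration, the multi-pass behavior described above (item-level pass, then cluster-level pass, then chain-level pass) can be sketched as follows. The function names, the exception types used to signal failure, and the placeholder forecast value are hypothetical and are chosen only for this sketch; they are not part of the disclosure.

```python
from typing import Callable, Optional, Sequence

# A "pass" is any callable that returns a forecast or raises on failure.
ForecastPass = Callable[[str], float]


def forecast_with_fallback(item_id: str, passes: Sequence[ForecastPass]) -> Optional[float]:
    """Try each pass in order and return the first forecast that succeeds."""
    for run_pass in passes:
        try:
            return run_pass(item_id)
        except (ValueError, TimeoutError):
            # Hypothetical failure signals: sparse data or a model timing out.
            continue
    return None  # every pass failed


def item_model(item_id: str) -> float:
    # Stand-in for an item-specific model that fails due to data sparsity.
    raise ValueError("insufficient history for item-level model")


def cluster_model(item_id: str) -> float:
    # Stand-in for a cluster-level model that times out.
    raise TimeoutError("cluster-level model timed out")


def chain_model(item_id: str) -> float:
    # Stand-in for a supply-chain-level model; the value is a placeholder.
    return 120.0


result = forecast_with_fallback("item-1", [item_model, cluster_model, chain_model])
```

In this sketch the ordering of the passes is supplied by the caller, so an embodiment that uses a different fallback sequence would only change the list that is passed in.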

[0065] In some embodiments, the forecasting model controller 220 may include a load balancer that distributes computations of components of the forecasting system 102 across different computing systems. In some embodiments the forecasting model controller 220 identifies one or more items or batches of items for which to determine a demand forecast.

[0066] The forecasting models 222 may include one or more models for forecasting item demand. In some embodiments, the forecasting models 222 are trained using data generated or output by components of the model development system 202. In some embodiments, one or more of the forecasting models 222 may be a generalized additive mixed model (GAMM). In some embodiments, one or more of the forecasting models 222 may be a generalized additive model (GAM) or a different model. In some embodiments, one or more of the forecasting models 222 may be a machine learning model. In some instances, a first item may be associated with a first model type of the forecasting models 222 and a second item may be associated with a second model type of the forecasting models 222. In some embodiments, the forecasting models 222 may forecast demand for an item based on a plurality of item features. In examples, the item features may include one or more features from the item data 108. In some embodiments, the forecasting models 222 include location-based and time-based features for forecasting demand for a given item.

[0067] In some embodiments, the forecasting models 222 may include a plurality of location-specific models and may include a model for a group of locations (e.g., for a full supply chain). Thus, when forecasting demand for an item, a location-specific model may be executed to forecast item demand at a particular location, and a chain-level model may be applied to determine an overall item demand forecast. In some embodiments, the chain-level model includes an aggregation of location-specific models.

[0068] In some embodiments, the forecasting models 222 include item-specific forecasting models. In such an embodiment, an item-specific forecasting model of the forecasting models 222 may be trained using historical demand data for the item. In some embodiments, the forecasting models 222 may be associated with a list of items for which the forecasting models 222 are trained to generate a demand forecast. In some embodiments, the forecasting models 222 may be configured to generate a demand forecast at a particular time or a range of times.

[0069] The structures of features may vary within a given model of the forecasting models 222. For example, a forecasting model may include a combination of random effects, fixed effects, linear features, non-linear features, smooth terms, intercepts, or other types of features. In some embodiments, a fixed effect is a linear term and a smooth term is a non-linear term. In some embodiments, a smooth term models forecasting variability for groups of items. Different features may be modeled using different feature types. For example, a seasonality feature may be modeled as a smooth term, whereas a store feature may be modeled as a random effect, or vice versa. Additionally, depending on a model type, the features may vary. For example, an item-specific forecasting model that determines item forecasts for a digital or store demand may include different features than a store or chain-level model that determines demand at the store or chain level.

[0070] In some embodiments, a forecasting model of the forecasting models 222 includes an ensemble of forecasting models. For example, for a given item or group of items, the forecasting model may include different models depending on an item, location, or time granularity at which demand forecasts are being generated. Additionally, in some embodiments, the forecasting model may include a first item-specific model and also a plurality of forecasting models for similar items, thereby enabling the forecasting model to perform multiple passes if needed to forecast demand for the item, such as a first pass with the first item-specific forecasting model and a subsequent pass that aggregates and weighs demand forecasts from the plurality of forecasting models for the similar items.

[0071] Advantageously, for embodiments in which a GAMM model is used, different features, and different techniques for modelling features, may be altered or retained to improve model accuracy by capturing differences in the training data that are pertinent to each model, while still enabling efficient backend resource utilization, such as code re-use and shared storage, for model features that are common across models. By developing and managing models in this way, model accuracy may be improved without an associated increased demand for computer resources, and in fact, memory storage space and training compute times may be decreased in some instances by re-using components across models.

[0072] The forecast disaggregator 224 may disaggregate forecasts generated by the forecasting models 222. For example, for a given item, a forecasting model 222 may generate a demand forecast across a plurality of stores (e.g., a full supply chain or a region within a supply chain) during a given week for the item. The forecast disaggregator 224 may, for example, disaggregate this demand forecast into demand forecasts for each store of the plurality of stores. Furthermore, the forecast disaggregator 224 may disaggregate demand forecasts based on time, such as disaggregating a demand forecast from a week forecast to individual days. FIG. 4 illustrates an example of the forecast disaggregator 224 disaggregating a demand forecast.

[0073] The support models 226 may include one or more models that are used by one or more of the forecasting models 222, the model training system 208, or another component as part of forecasting demand or training the forecasting models 222. In the example shown, the support models 226 include an elasticity model 228, an embedding generator 230, an item clustering model 232, and an event model 234.

[0074] The validation tool 240 may evaluate demand forecasts output by the forecasting models 222. For example, the validation tool 240 may compare demand forecasts of the forecasting models 222 with actual sales data to evaluate a performance of the forecasting models 222. In some embodiments, the model training system 208 may use the validation tool 240 to monitor performance of the forecasting models 222 and evaluate when to retrain the forecasting models 222.

[0075] The anomaly detector 242 may detect anomalies in demand forecasts generated by the forecasting models 222. An anomaly may be a statistically significant deviation from an expected demand forecast. Examples of anomalies may include a percentage deviation from a historical value. For instance, if a demand forecast for an item is 30% higher than a historical demand value for the item, then the anomaly detector 242 may detect this deviation as an anomaly. In some embodiments, an anomaly may be manually flagged by a user. In some embodiments, the anomaly detector 242 may determine a cause of a difference between a demand forecast and an actual demand value. In some embodiments, an indication of an anomaly may be output to a user.
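
As a non-limiting illustration of the percentage-deviation example above, the following sketch flags a demand forecast that deviates from a historical value by at least a threshold fraction. The function name, the default threshold of 30%, and the handling of a zero historical value are assumptions made for this sketch only.

```python
def is_anomalous(forecast: float, historical: float, threshold: float = 0.30) -> bool:
    """Return True when the relative deviation from the historical value meets the threshold."""
    if historical == 0:
        # Assumption: with no historical demand, any nonzero forecast is flagged.
        return forecast != 0
    return abs(forecast - historical) / abs(historical) >= threshold


flag_high = is_anomalous(130.0, 100.0)  # forecast 30% above the historical value
flag_ok = is_anomalous(110.0, 100.0)    # forecast 10% above the historical value
```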

[0076] The forecast consumer interfaces 246 may include interfaces for interacting with systems that receive demand forecasts from the forecasting system 102, such as the forecast consumer 122. In the example shown, the forecast consumer interfaces 246 include a GUI 248 and an application programming interface (API) 250. In some embodiments, the GUI 248 may be part of an application that is provided by the forecasting system 102 or that is communicatively coupled with the forecasting system 102. In some embodiments, a forecast consumer interface may receive an input from a user. Based on the input, various operations may be performed by subsystems of the forecasting system 102, such as components illustrated in the example of FIG. 2. Components involved in generating a response to the input may be part of a response generation system. The response generation system may comprise systems of the forecasting system 102 and components external to the forecasting system 102 that can, among other things, receive a user input, generate a response, which may include a demand forecast, and output the response. Depending on the embodiment, and the type of input received, the response generation system may include or deploy different subsystems. In some embodiments, the response generation system is the AI system 252.

[0077] In some instances, an input requests a demand forecast for an item. In some instances, additional operations may be performed to augment the input (e.g., determine additional data that provides context to or supplements the input) as part of generating a demand forecast, or additional operations may be performed to augment a response, such as providing not only the demand forecast but also a recommendation or data corresponding to other effects associated with a demand forecast, such as effects on other items. Based on the input, the forecast consumer interface, or another component communicatively coupled therewith, may identify one or more subsystems that may be used to process the input or data associated with the input.

[0078] The API 250 may provide one or more functions of the forecasting system 102 to other applications. For example, API 250 may enable an application to query the forecasting system 102 for a forecast of an item. Furthermore, in some embodiments, the API 250 may provide one or more subfunctions, such as operations performed by components of the forecasting system 102, to other applications.

[0079] The artificial intelligence (AI) system 252 may be a system that uses components of the forecasting system 102 as part of a human-in-the-loop AI tool. In some embodiments, the AI system 252 is a generative AI system. In some embodiments, the AI system 252 is configured to receive an input from a user. The user may be using a system associated with a retailer. For example, the AI system 252 may receive an input corresponding to an inventory level, a price, a scenario, an event, or another input associated with data that may influence a demand forecast. In view of the input, the AI system 252 may determine an effect on a demand forecast. As part of doing so, the AI system 252 may use the forecasting models 222. In some embodiments, the AI system 252 may provide a recommendation based at least in part on the generated demand forecast. Example aspects of the AI system 252 are further described in connection with FIGS. 23-24.

[0080] FIG. 3 illustrates a schematic representation of aspects of a forecasting model of the forecasting models 222. In the example of FIG. 3, the forecasting model is a GAMM. FIG. 3 shows example features that may be considered by the forecasting model. In the example shown, results for the features are added. However, in other embodiments, different features may be considered, and the features may be combined in a different manner.

[0081] In the example shown, the element 302 represents a demand forecast for an item at a location l during a time t. The location l may be a single location or a group of locations. The location l may be a store. The time t may be a given week, such as the second week of March, or the time t may be a different span of time, such as a particular year, or a time from the present. In some embodiments, the demand forecast corresponds to expected sales. In some embodiments, the demand forecast is a logarithm of expected sales. In the example shown, the demand forecast is a summation of the features 304-314.

[0082] The element 304 represents an intercept. In some embodiments, the intercept represents a baseline value. In some embodiments, the intercept is an average weekly sale. In some embodiments, the intercept is an average of weekly sales for a group of stores to which the location l belongs.

[0083] The element 306 may represent a difference in average sales at the location l. In some embodiments, the element 306 is a random effect.

[0084] The element 308 may represent a seasonality factor. In some embodiments, the function f_1 is a cyclic spline function that models the effects of season on demand for the item.

[0085] The element 310 may represent the effect of a promotion on demand for the item. In some embodiments, the promotion lift is an offset. In some embodiments, the effect of the element 310 may be determined by the promotion system 216.

[0086] The element 312 represents holiday features. There may be a plurality of holiday features. For example, there may be a holiday feature for each holiday that is considered for the item. In some embodiments, the holiday features are linear terms.

[0087] The element 314 represents a trend feature. In some embodiments, the trend feature includes a time-weighted regression for capturing recent demand trends. The trend feature may account for demand changes within a predefined amount of previous time. In the example shown, the predefined amount of previous time is represented by the variable h. As an example, h may be a number of weeks, such as 4. The functions f_2, f_3, and so on may be splines.

[0088] As shown, there may be more features than those illustrated and described in connection with FIG. 3.
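
For illustration only, the additive structure of the elements 302-314 described above may be written in the following form. The symbols are assumptions introduced for this sketch and do not appear in FIG. 3 as reproduced here: \hat{y}_{l,t} denotes the demand forecast (element 302), \beta_0 the intercept (element 304), b_l the location effect (element 306), f_1 the seasonality function (element 308), p_{l,t} the promotion lift (element 310), \gamma_k and x_{k,t} the coefficient and indicator for the k-th holiday feature (element 312), and the final sum the trend feature over the previous h periods (element 314). The exact arguments of the trend functions are not specified in the description, so the lagged-demand form shown for that term is one possible reading.

```latex
\hat{y}_{l,t}
  = \underbrace{\beta_0}_{304}
  + \underbrace{b_l}_{306}
  + \underbrace{f_1(t)}_{308}
  + \underbrace{p_{l,t}}_{310}
  + \underbrace{\sum_{k} \gamma_k \, x_{k,t}}_{312}
  + \underbrace{\sum_{j=1}^{h} f_{j+1}\!\left(y_{l,t-j}\right)}_{314}
```

In embodiments in which the demand forecast is a logarithm of expected sales, \hat{y}_{l,t} would be read on the logarithmic scale, so that the additive terms correspond to multiplicative effects on expected sales.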

[0089] FIG. 4 illustrates a schematic diagram 400 illustrating a disaggregation of a demand forecast for an item. In the example shown, a forecasting model of the forecasting models 222 may generate an aggregate demand forecast for an item, such as the item-chain-week forecast 402, which represents a demand forecast for an item across all locations of a supply chain during a week. In some embodiments, the forecast disaggregator 224 may disaggregate the item-chain-week forecast 402 into a plurality of item-store-week forecasts 404, which may represent forecasted demand for the item for the week at each store of a plurality of stores. In some embodiments, the forecast disaggregator 224 may further disaggregate the demand forecasts. For example, for a given item-store-week forecast, the forecast disaggregator 224 may disaggregate the demand forecast into a plurality of item-store-day forecasts 406, which may represent the demand forecasts for the item at the store on each day of a week. As such, the forecasting system 102 may be configured to generate demand forecasts at a desired level of granularity for each item.
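
As a non-limiting sketch of the two-stage disaggregation described above, the following example splits an item-chain-week forecast into item-store-week forecasts and then splits one of those into item-store-day forecasts. Proportional allocation by weights is an assumption made for this sketch; the description does not specify how the forecast disaggregator 224 apportions demand, and the store names, weights, and forecast values are hypothetical.

```python
from typing import Dict


def disaggregate(total: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Split a total forecast across keys in proportion to the given weights."""
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    return {key: total * weight / weight_sum for key, weight in weights.items()}


# Item-chain-week forecast split into item-store-week forecasts.
store_weights = {"store_a": 3.0, "store_b": 1.0}  # hypothetical historical shares
store_week = disaggregate(400.0, store_weights)

# One item-store-week forecast split into item-store-day forecasts.
day_weights = {day: 1.0 for day in ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]}
store_day = disaggregate(store_week["store_a"], day_weights)
```

Because each stage allocates the full total, the disaggregated forecasts at every level sum back to the aggregate forecast from which they were derived, up to floating-point rounding.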

[0090] In example aspects, demand forecasting models are trained using historical sales data. However, in some instances, the historical sales data may be incomplete or unavailable. For example, a new item may not have any historical sales data. As another example, historical sales data may be lower than actual historical demand because of inventory constraints. That is, in some instances, the sales data for an item may be low because the item was out of stock, not because the demand was low.

[0091] In example aspects, systems and methods for generating synthetic data that may be used to train forecasting models are disclosed. In some embodiments, an item for which synthetic data may be generated may be one or more of the following: a new item; an item that is rarely sold; an item that experienced a stockout event; or another item for which historical sales data may not be predictive of future demand. For instance, for such items, the historical number of items sold may be a poor indicator of item demand. Use of sales data as being representative of demand in these situations could result in poor performance of forecasting models.

[0092] In example aspects, with respect to new items or other items for which historical sales data may be sparse, aspects of the present disclosure may generate sales data in a synthetic manner. In an example, similar items may be identified, and synthetic time series data may be generated based on data associated with the similar items. The synthetic time series data may be used to train forecasting models that forecast demand for the item. In some embodiments, once sales data associated with the item becomes available, that sales data can be analyzed to determine if it is representative of demand (e.g., lacks stockouts, is sufficiently non-sparse, etc.). The synthetic demand signal for an item may then be adjusted as needed using other statistical or machine learning models, such as forecasting models relating to price elasticity and promotion effects, at the desired granularity level (e.g., chain, region, store). In some embodiments, a combination of synthetic sales data and actual sales data may be used to train forecasting models.

[0093] In example aspects, with respect to stockout events, aspects of the present disclosure may synthetically alter the historical sales data such that it reflects what the sales would have been if there had not been stockouts. In some instances, such synthetically altered sales data may represent an unconstrained demand, which may be the metric that the forecasting models seek to forecast. Correcting for out-of-stock events may include first identifying a stockout event. For example, a statistical or machine learning approach can be used to determine a stockout event. As an example, the system may identify that the sales for an item are a threshold amount (e.g., 30%) below their typical sales amount. This may indicate that the item is out of stock. As another example, a stockout event may be identified by subtracting sales of an item from a known inventory level to identify that the inventory is sufficiently low, such that there may be a stockout. The stockout event may be identified at an item-location level of granularity or at a different level of granularity.
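
As a non-limiting illustration of the threshold-based example above, the following sketch flags periods in which sales fall more than a threshold fraction below a typical sales amount. Using the median of the series as the "typical" amount is an assumption made for this sketch, as are the function name and the sample values.

```python
from statistics import median
from typing import List


def flag_stockouts(weekly_sales: List[float], threshold: float = 0.30) -> List[bool]:
    """Flag weeks whose sales are more than `threshold` below the typical (median) sales."""
    typical = median(weekly_sales)  # assumption: median stands in for "typical sales"
    cutoff = typical * (1.0 - threshold)
    return [sales < cutoff for sales in weekly_sales]


# Hypothetical weekly sales for one item at one location; week 4 is unusually low.
flags = flag_stockouts([100, 105, 98, 20, 102])
```

A flagged week is only a candidate stockout under this sketch; an embodiment may combine such a flag with inventory data, as in the inventory-based example above, before treating the period as a stockout event.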

[0094] Furthermore, in example aspects, for a given item and a given time during which a stockout has been identified, the synthetic data generator 210 may generate synthetic sales data that represents the predicted sales data if the stockout had not occurred. There are various techniques for generating this synthetic sales data, as described, for example, in connection with the operation 704 of FIG. 7. Furthermore, in example aspects, the model training system 208 may update the forecasting models 222 by using data generated by the synthetic data generator 210, as described further in connection with FIGS. 6-7. As such, for a given item, rather than training the forecasting models 222 using sales data for an item that was affected by stockouts, the forecasting models 222 may be trained using corrected synthetic data that reflects sales as if there were no stockouts.

[0095] In example aspects, when creating synthetic time series data, there may be multiple ways to view the problem from a technical, statistical point of view. For example, synthetic time series data can be created either by generating time series data on the basis of a set of input signals, or by generating both time series data and the accompanying input signals that would be associated with that time series data. In the first case, some sufficient statistics representative of a data set may be provided (e.g., mean and variance of various features), and a posterior predictive distribution may be used to generate synthetic data representative of a dataset having such features, either using independent (non-past-viewing) data generation, or with an attempt to generate a cohesive time series (e.g., considering data at time t when generating data for time t+1). In the second case, demand or sales data is generated, along with the input data signals that would generate that data. Depending on the embodiment, the synthetic data generator 210 may implement one or both of the approaches as part of generating synthetic time series data. Furthermore, depending on the embodiment, either of the two approaches may use statistical models or machine learning models.
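
As a simplified, non-limiting sketch of the first case described above, the following example generates synthetic values from a provided mean and standard deviation in two ways: independently for each period, and as a cohesive series in which the value at time t+1 depends on the value at time t. A normal distribution and a first-order autoregressive recurrence are used here as stand-ins for a posterior predictive distribution; these choices, along with the function names and the parameter phi, are assumptions made for this sketch.

```python
import math
import random
from typing import List


def independent_series(mean: float, std: float, n: int, rng: random.Random) -> List[float]:
    """Independent (non-past-viewing) draws: each period ignores earlier periods."""
    return [rng.gauss(mean, std) for _ in range(n)]


def cohesive_series(mean: float, std: float, n: int, rng: random.Random,
                    phi: float = 0.7) -> List[float]:
    """Cohesive draws: the value at t+1 is pulled toward the value at t (AR(1))."""
    if n <= 0:
        return []
    # Scale the noise so the series keeps roughly the requested overall variance.
    noise_std = std * math.sqrt(1.0 - phi ** 2)
    series = [rng.gauss(mean, std)]
    for _ in range(n - 1):
        series.append(mean + phi * (series[-1] - mean) + rng.gauss(0.0, noise_std))
    return series


rng = random.Random(0)  # fixed seed so the sketch is repeatable
iid_weeks = independent_series(100.0, 10.0, 52, rng)
ar_weeks = cohesive_series(100.0, 10.0, 52, rng)
```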

[0096] FIG. 5 illustrates a schematic diagram 500 including components that may be used in connection with one or more of the generation or use of synthetic data. The diagram 500 includes item data 108, sales data 110, synthetic data generator 210, model training system 208, and forecasting models 222. Furthermore, the diagram 500 illustrates example data exchanges 502-510 between components of the diagram 500. In the example of FIG. 5, the data exchanges 502-510 are described as relating to a query item, which may be an item that is part of an item catalog or that may be part of an item catalog, and for which the synthetic data generator 210 may generate data. However, aspects of FIG. 5 may be applied to generate synthetic data for a plurality of different items.

[0097] In the example shown, the synthetic data generator 210 may receive query item sales data 502 from the sales data 110. The query item sales data 502 may include historical sales data for the query item. In some instances, the query item sales data 502 may be empty or sparse, because the item may be a new item. In some embodiments, query item sales data 502 may be for a particular location, for a group of locations, or for a digital sales channel. In some embodiments, the query item sales data 502 may be from a particular time, such as during a particular year or another range of time in the past. In some embodiments, the query item sales data 502 may include all available historical sales data for an item. Furthermore, the query item sales data 502 may also include sales data for items that are similar to the query item.

[0098] In the example shown, the synthetic data generator 210 may receive query item data 504 from the item data 108. The query item data 504 may include attributes or characteristics of the query item, such as, for example, an item classification, name, description, image, location, or channel. Furthermore, the synthetic data generator 210 may receive data for items that are or may be similar to the item for which the sales data 502 was received.

[0099] In some embodiments, the synthetic data generator 210, having received the query item sales data 502 and the query item data 504, may determine whether to generate synthetic data. To determine whether to generate synthetic data for an item, the synthetic data generator 210 may determine whether the historical sales data, which may be part of the sales data 502, is sufficient or insufficient. In some embodiments, insufficient historical sales data may include data corresponding to historical sales data for a new item or for an item that was out of stock during a time period. An example of determining that the historical sales data for an item is insufficient in the context of a new item is described in connection with the step 602 and with FIG. 6 more generally. An example of determining that the historical sales data for an item is insufficient in the context of an item that was out of stock is described in connection with the step 702 and FIG. 7 more generally. Other examples of insufficient historical sales data can likewise be addressed using aspects of the present disclosure and may include incorrectly recorded or captured sales data, anomalous sales data, or other incorrect sales data.

[0100] In the example shown, the synthetic data generator 210 may provide query item sales data 506 and synthetic query item sales data 508 to the model training system 208. The query item sales data 506 may be actual sales data of the sales data 110 for the query item. In some instances, the query item sales data 506 is the same as the query item sales data 502. In some instances, the query item sales data 506 is a subset of the query item sales data 502, from which sales data associated with a stockout event has been removed. The synthetic query item sales data 508 may be data generated by the synthetic data generator 210. In some embodiments, each of the query item sales data 506 and the synthetic query item sales data 508 may be time series data. In some embodiments, the query item sales data 506 and the synthetic query item sales data 508 may be non-overlapping time series data, in which the synthetic query item sales data 508 fills gaps in times of the query item sales data 506.

[0101] In the example shown, the model training system 208 may receive the data 506-508, and the model training system 208 may use the data 506-508 to train a model for forecasting demand for the query item. As shown by the element 510, the model training system 208 may add the model trained to forecast demand for the query item to the forecasting models 222. In some embodiments, the model training system 208 may update an existing model of the forecasting models 222 with the data 506-508, or only with the data 508.

[0102] FIGS. 6-7 are flowcharts that illustrate example methods for generating synthetic data. Although certain operations of FIGS. 6-7 are described as being performed by the synthetic data generator 210, one or more of the operations of FIGS. 6-7 may be performed by a different component or combination of components. In the example of FIG. 6, the synthetic data generator 210 generates synthetic data for an item that lacks sufficient sales history, such as a new item or a slow-selling item. In the example of FIG. 7, the synthetic data generator 210 generates synthetic data for an item in response to an identification of a stockout. Furthermore, there may be additional situations for which the synthetic data generator 210 may generate synthetic data, such as, for example, generating synthetic data to validate or test forecasting models, generating synthetic data to supplement existing data that may not be deficient, or generating synthetic data for another reason. In some embodiments, the synthetic sales data described in connection with FIGS. 6-7 may correspond to an unconstrained demand for an item during a time period. The unconstrained demand may represent hypothetical sales for an item during the time period if the item had been fully available during that time period. For example, this unconstrained demand may represent what the sales of the item would have been in the absence of an identified stockout event, or if the item had been sold by a given location during the time period.

[0103] FIG. 6 is a flowchart of an example method 600 for generating synthetic sales data. In the example of FIG. 6, synthetic sales data may be generated for a new item or for an item that otherwise lacks sufficient historical sales data to train a forecasting model.

[0104] In the example shown, the synthetic data generator 210 may identify an item that lacks sufficient sales history (step 602). In some embodiments, items that lack sufficient sales history may be items for which the associated sales history is insufficient to train a forecasting model with a threshold level of accuracy and confidence. In some embodiments, parameters for the forecasting model may not converge during training (e.g., may not be sufficiently stable or constant between training rounds) within an allocated training time or after using all available training instances, which may indicate that there is insufficient sales history to train the model. In some embodiments, an item lacking sufficient sales history may be identified after an attempt by the model training system 208 to generate a forecasting model for the item. In some embodiments, an item lacking sufficient sales history may be a new item, and it may be identified by the synthetic data generator 210 based on an indication that the item is new. In some embodiments, the synthetic data generator 210 determines a quantity of sales data available for an item and determines, based on this quantity, whether the sales data is sufficient to train a forecasting model. In some embodiments, the synthetic data generator 210 may determine a quantity of sales data for the item over time, from which a degree of data sparsity may be determined. Based on the degree of data sparsity, it may be determined whether the item has sufficient sales history.
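
As a non-limiting illustration, the quantity-based and sparsity-based determination of the step 602 could be sketched as follows. The function name, the weekly granularity, and both threshold values are hypothetical and are chosen only for illustration; the disclosure does not prescribe particular values.

```python
from typing import Optional, Sequence


def has_sufficient_history(
    weekly_units: Sequence[Optional[float]],
    min_observed_weeks: int = 26,   # assumed threshold, not from the disclosure
    max_sparsity: float = 0.5,      # assumed threshold, not from the disclosure
) -> bool:
    """Return True when an item's sales history looks sufficient for training.

    weekly_units holds one entry per week; None marks a week with no record.
    """
    if not weekly_units:
        return False  # a new item with no history at all
    observed = sum(1 for units in weekly_units if units is not None)
    # Degree of data sparsity: share of weeks with no recorded sales data.
    sparsity = 1.0 - observed / len(weekly_units)
    return observed >= min_observed_weeks and sparsity <= max_sparsity
```

In such a sketch, an item for which the function returns False would be a candidate for synthetic data generation.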

[0105] In some embodiments, the synthetic data generator 210 may analyze historical sales data for a plurality of items, and based on this analysis, identify items lacking sufficient sales history. In some embodiments, a user may manually provide an indication to the synthetic data generator 210 that identifies items for which there is insufficient sales history. In some embodiments, the synthetic data generator 210 may determine that an item has sufficient sales data for a first location, or group of locations, but not for a second location, or second group of locations. In other embodiments, the synthetic data generator 210 may determine that there is insufficient sales data for a certain fulfillment channel (e.g., in store sales) but sufficient sales data for a different fulfillment channel (e.g., digital demand).

[0106] In the example shown, the synthetic data generator 210 may identify similar items (step 604). For instance, for an example item identified during the step 602, the synthetic data generator 210 may identify similar items. In some embodiments, the similar items may be any items for which there may be sales data that can be used as part of generating synthetic sales data for the example item. As an example, the similar items may be part of a class to which the example item belongs. As another example, the similar items may include item characteristics that are similar to characteristics of the example item, such as, for example, a similar name, a similar description, a similar feature, a similar attribute, or another similar characteristic. As another example, the similar items may be items with similar demand patterns as the example item, such as sharing similar demand changes based on season, based on price changes, based on events, or based on other features that may affect demand. In some embodiments, the synthetic data generator 210 may perform an embedding-based search to identify similar items where respective embeddings for respective items may be generated using a transformer-based model that generates embeddings using one or more of titles, descriptions, images, reviews, classifications, metadata, or other data.
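
As a non-limiting illustration, an embedding-based search for similar items could be sketched as follows, assuming that an embedding vector has already been generated for each item. The use of cosine similarity, the function names, and the dictionary-based catalog are assumptions made for illustration; the disclosure does not specify a particular similarity measure.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_items(
    query_embedding: Sequence[float],
    catalog: Dict[str, Sequence[float]],  # hypothetical: item id -> embedding
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Rank catalog items by similarity to the query item's embedding."""
    scored = [
        (item_id, cosine_similarity(query_embedding, embedding))
        for item_id, embedding in catalog.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```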

[0107] In some embodiments, identifying similar items may include identifying the same item as the example item but at a different location. For example, the example item may not have any sales data at a certain location, but the example item may have sales data associated with sales at different locations. In some embodiments, the sales of the item at the different locations may be used to generate synthetic sales data for the item.

[0108] In the example shown, the synthetic data generator 210 may generate synthetic sales data (step 606). For example, the synthetic data generator 210 may generate synthetic sales data using sales data of similar items. In some embodiments, the synthetic data generator 210 may pool sales data of the similar items to generate synthetic sales data for the example item. For example, the synthetic data generator 210 may use an average, or a weighted average based on item similarity, of sales data for similar items to generate the synthetic sales data for the example item, example aspects of which are described in connection with FIG. 18.

[0109] As an example, the synthetic data generator 210 may identify five items that are similar to an example item. These five items may have sufficient data to train a forecasting model. As an example, these five items may have historical sales data for a certain week of a previous year. The sales of these five items during that week may be aggregated to generate synthetic sales data for the example item during the week. In some embodiments, aggregating the similar item sales data may include averaging the sales data or may include a weighted average of the sales data of the similar item sales, where sales data for items that are more similar to the example item are weighted more heavily than sales data for items that are relatively less similar to the example item.
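
As a non-limiting illustration, the similarity-weighted aggregation described above could be sketched as follows. The function name and the data layout (one equal-length sales series per similar item, and one non-negative similarity score per item) are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def pooled_synthetic_sales(
    similar_sales: Dict[str, Sequence[float]],  # item id -> sales per period
    similarity: Dict[str, float],               # item id -> similarity score
) -> List[float]:
    """Weighted average of similar-item sales, one value per time period."""
    total_weight = sum(similarity[item_id] for item_id in similar_sales)
    if total_weight <= 0:
        raise ValueError("at least one similar item needs a positive weight")
    n_periods = len(next(iter(similar_sales.values())))
    synthetic = []
    for t in range(n_periods):
        weighted = sum(
            similarity[item_id] * series[t]
            for item_id, series in similar_sales.items()
        )
        synthetic.append(weighted / total_weight)
    return synthetic
```

Setting every similarity score to the same value reduces the weighted average to a simple average of the similar item sales.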

[0110] As another example, the synthetic data generator 210 may generate synthetic sales data using sales data of the item at locations at which the item was available during the relevant time period. For example, if the item has insufficient sales data at a location during a time period because the item was unavailable during that time period (e.g., because it was not sold at the location during the time period, or because the item was out of stock), and if the item was sold at a different location during that time period, then sales data of the item at the different location may be used to generate synthetic sales data. In some embodiments, this process may include first selecting one or more similar locations and then adjusting the sales data from the one or more similar locations to account for any differences between the location and the one or more similar locations.

[0111] In the example shown, the synthetic data generator 210 may modify the sales data 110 to include the synthetically generated sales data for the item (step 607). For example, this may consist of the synthetic data generator 210 storing the synthetic sales data in a database that tracks demand or sales for the item, thereby updating the sales data 110 to include the synthetic sales data.

[0112] In the example shown, the model training system 208 may train a forecasting model using the synthetic sales data (step 608). The forecasting model may be from the forecasting models 222 described above in connection with FIG. 2. For example, the forecasting model may be an item-specific forecasting model that generates demand forecasts, at a location or at a plurality of locations, for the item for which synthetic sales data was generated. In some embodiments, if there is already a forecasting model for the example item, then the model training system 208 may update the existing forecasting model, whereas in some embodiments, the forecasting model may be trained from scratch using at least the synthetic sales data. In some embodiments, training the forecasting model may include updating model weights or parameters. In some embodiments, multiple models may be trained using the synthetic sales data. For example, both a location-specific, item-specific forecasting model and a multi-location, item-specific forecasting model (e.g., a model that forecasts demand for an item at a group of locations or across an entire supply chain) may be trained using the synthetic data.

[0113] In some embodiments, the model training system 208 may use a combination of synthetically generated sales data and actual sales data for the example item. For example, the example item may have interrupted or intermittent sales data, and the model training system 208 may use the interrupted or intermittent sales data in combination with the synthetically generated sales data to train the forecasting model. As another example, a new item may begin with no actual sales data. As such, initially training a forecasting model for the new item may consist only of using synthetic data. However, as the item is sold, the forecasting system 102 may receive actual sales data for the item. In some embodiments, the model training system 208 may combine the actual sales data with the synthetically generated data to train the forecasting model. As more actual sales data is received for the item, the model training system 208 may progressively phase out the use of synthetically generated sales data such that, once there is a sufficient amount of data, only actual sales data is used to train the forecasting model. An example of such a series of operations is illustrated in connection with FIG. 6.

[0114] In the example shown, a forecasting model may forecast demand (step 610). For example, the model trained during the step 608 may forecast item demand. For instance, the forecasting system 102 may receive a request to forecast demand for the item. The request may originate from a forecast consumer or may be scheduled. The request may pertain to a particular time period (e.g., a certain day or week) and location, which may be a particular store or a group of stores. The request may also pertain to a plurality of time periods (e.g., each week for a year). Based on the request, the forecasting system 102 may select a forecasting model that corresponds to the request from the forecasting models 222. For example, if the request is to forecast demand for a certain item, then the forecasting system 102 may select, from the forecasting models 222, an item-specific forecasting model associated with that item. Using parameters that were trained with the synthetically generated time series data, the forecasting model may process the request to generate a demand forecast, which may be output to the requesting application.

[0115] In the example shown, the forecasting system 102 may receive sales data (step 612). In some embodiments, the sales data may be, for example, actual sales data as opposed to synthetically generated sales data. For instance, for the example item, sales data registered at retail locations 104 or the digital platform 106 may be received at the sales data 110 and the forecasting system 102. In some embodiments, the sales data may be continuously received in real time, or in near real time, by the forecasting system 102.

[0116] In the example shown, the synthetic data generator 210 may again generate synthetic data for the example item after actual sales data has been received (returning to the step 606). In some embodiments, regeneration of synthetic sales data may occur periodically. In some embodiments, the synthetic data generator 210 may use the actual sales data for the example item as part of generating new synthetic data. For example, given the actual sales data, the synthetic data generator 210 may reassess and thereby regenerate sales quantities in the synthetic sales data. For example, if the volume of actual sales data is greater than forecasted, then the volume of sales in the synthetic sales data may be increased. Similarly, if the volume of actual sales data is less than forecasted, then the volume of sales in the synthetic sales data may be decreased.
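
As a non-limiting illustration, the regeneration of sales quantities could be sketched as a proportional rescaling, in which the synthetic sales are multiplied by the ratio of actual sales to forecasted sales over the periods for which both are available. The proportional rule and the function name are assumptions made for illustration; other adjustment rules are likewise possible.

```python
from typing import List, Sequence


def rescale_synthetic(
    synthetic: Sequence[float],
    actual: Sequence[float],    # actual sales over the comparison periods
    forecast: Sequence[float],  # forecasted sales over the same periods
) -> List[float]:
    """Scale synthetic sales up or down by the actual-to-forecast ratio."""
    total_forecast = sum(forecast)
    if total_forecast <= 0:
        return list(synthetic)  # no basis for adjustment; leave unchanged
    ratio = sum(actual) / total_forecast
    return [units * ratio for units in synthetic]
```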

[0117] In some embodiments, the model training system 208 may retrain the forecasting model for the example item. In some embodiments, retraining the forecasting model may use a combination of actual sales data and synthetic sales data, where the ratio of actual sales data to synthetic sales data used in training may increase as more actual sales data becomes available. For example, the model training system 208 may phase out the synthetic sales data with the actual sales data. This may include multiple iterations of retraining the forecasting model, where with each iteration, relatively more actual sales data and relatively less synthetic sales data is used to train the forecasting model. For example, at a first time of retraining the forecasting model, both actual and synthetic sales data may be used, but at a second, later time, only actual sales data may be used to train the forecasting model, thereby completing the phasing out of the synthetic training data.
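
As a non-limiting illustration, the phasing out of synthetic sales data could be sketched using per-row training weights, in which rows of actual sales data keep a full weight and rows of synthetic sales data receive a weight that decays as more actual sales data accumulates. The linear decay, the 52-week horizon, and the function names are assumptions made for illustration.

```python
from typing import List


def synthetic_weight(actual_weeks: int, weeks_needed: int = 52) -> float:
    """Weight given to synthetic rows; falls linearly from 1.0 to 0.0.

    weeks_needed is an assumed amount of actual history after which
    synthetic data is no longer used.
    """
    if weeks_needed <= 0:
        raise ValueError("weeks_needed must be positive")
    return max(0.0, 1.0 - actual_weeks / weeks_needed)


def training_weights(
    n_actual: int, n_synthetic: int, weeks_needed: int = 52
) -> List[float]:
    """Weights for the actual rows followed by the synthetic rows."""
    weight = synthetic_weight(n_actual, weeks_needed)
    return [1.0] * n_actual + [weight] * n_synthetic
```

In this sketch, a first retraining with 26 weeks of actual sales data would weight each synthetic row at 0.5, and a later retraining with 52 or more weeks would weight each synthetic row at 0.0, which completes the phase-out.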

[0118] FIG. 7 is a flowchart of an example method for generating synthetic data for an item that experiences a stockout. For exemplary purposes, operations of FIG. 7 are described as being performed with respect to an example item. However, as will be understood, operations of FIG. 7 may be performed for a plurality of items.

[0119] In the example shown, the synthetic data generator 210 may identify a stockout for an item (step 702). A stockout may occur when the inventory of an example item at a location is sufficiently low that it may affect purchasing of the example item at the location. For example, a stockout may be a period of time during which the inventory of the example item is zero or below a threshold amount such that it cannot be fully available to purchase at the location. For instance, if a first size of a clothing item is available at a store but a second size of the clothing item is not available at the store, then this may, in some instances, be considered a stockout for the clothing item. In some embodiments, a stockout is identified for an item across a plurality of locations. In some embodiments, a stockout is identified for a period of time, such as a month, week, day, or other level of granularity.

[0120] Depending on the embodiment, a different technique may be used to identify a stockout. In some embodiments, the synthetic data generator 210 may identify a stockout by using derived inventory levels. For example, the synthetic data generator 210 may determine an initial amount of inventory of the example item at a location. Based on sales data, restock events, and the initial inventory level, the synthetic data generator 210 may determine a current inventory level. In response to determining that the current inventory level is below a threshold amount, or in response to determining that the current inventory level is zero, the synthetic data generator 210 may determine that there is a stockout.
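
As a non-limiting illustration, deriving inventory levels from an initial inventory level, sales data, and restock events could be sketched as follows. The daily granularity, the function name, and the convention that a level at or below the threshold counts as a stockout are assumptions made for illustration.

```python
from typing import List, Sequence


def find_stockout_days(
    initial_inventory: int,
    daily_sales: Sequence[int],
    daily_restocks: Sequence[int],
    threshold: int = 0,  # assumed: a level at or below this counts as a stockout
) -> List[int]:
    """Return indices of days whose derived end-of-day inventory is too low."""
    level = initial_inventory
    stockout_days = []
    for day, (sold, restocked) in enumerate(zip(daily_sales, daily_restocks)):
        # Derived level: previous level plus restocks minus sales, floored at 0.
        level = max(0, level + restocked - sold)
        if level <= threshold:
            stockout_days.append(day)
    return stockout_days
```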

[0121] In some embodiments, the synthetic data generator 210 may determine a stockout by detecting an anomaly in sales data for the example item. Various techniques may be used to detect anomalies. For example, if the synthetic data generator 210 determines that the sales data for the example item is sufficiently below a forecasted amount, then this may be identified as a potential stockout. For example, based on a plurality of conditions that may affect sales (e.g., a location, trends, historical sales, a season, a price, holidays, a promotion, features of an item assortment, an external event, etc.), a demand may have been forecasted for the item during the time period. Then if the actual sales for the item during the time period are 25% (or another threshold percentage or nominal value) of the forecasted amount (or historical amount), then the synthetic data generator 210 may identify this as an anomaly and as a potential stockout that occurred during the time period.

[0122] As another example, the synthetic data generator 210 may compare sales during the time period with sales during one or more other time periods. If the sales during the time period are lower, by at least a threshold, than the sales during the one or more other time periods, this may be detected as an anomaly and a potential stockout. The threshold may be, for example, a nominal amount or a percentage. As an example, if sales for Item A are between 100 and 150 units for a plurality of weeks, and if the sales for Item A are 10 for a certain week, then it may be determined, given that the sales for the week are no more than 10% of the normal range, that an anomaly occurred during that week, which may indicate a stockout event. In some embodiments, other statistical methods can be used to identify anomalies in sales data that may be indicative of a stockout.
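
As a non-limiting illustration, comparing the sales of each week against the sales of the other weeks could be sketched as follows. The use of the median of the other weeks as the baseline, the 25% threshold ratio, and the function name are assumptions made for illustration.

```python
from statistics import median
from typing import List


def flag_potential_stockouts(
    weekly_sales: List[float],
    threshold_ratio: float = 0.25,  # assumed percentage threshold
) -> List[int]:
    """Return indices of weeks whose sales fall far below the other weeks."""
    flagged = []
    for week, units in enumerate(weekly_sales):
        others = weekly_sales[:week] + weekly_sales[week + 1:]
        if not others:
            continue  # nothing to compare against
        baseline = median(others)
        if baseline > 0 and units <= threshold_ratio * baseline:
            flagged.append(week)
    return flagged
```

Applied to weekly sales of 120, 100, 150, 10, and 130 units, this sketch flags only the week with 10 units, consistent with the Item A example above.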

[0123] In some embodiments, the synthetic data generator 210 may determine a stockout based on a user input. For example, a worker at a location may provide an indication that the example item is out of stock. As another example, a customer may provide feedback indicating that the example item is out of stock. Additionally, when the synthetic data generator 210 identifies a stockout, such as based on derived inventory levels or anomaly detection, the synthetic data generator 210 may provide data corresponding to the identified stockout to a user, who may verify that the stockout occurred. Other techniques for identifying stockouts are likewise possible.

[0124] In the example shown, the synthetic data generator 210 may generate synthetic sales data (step 704). There are various techniques that may be used by the synthetic data generator 210 for generating synthetic data for an item that is out of stock. For instance, example techniques for generating synthetic data, which may be used in the context of FIG. 7, are described above in connection with the step 606 of FIG. 6. Additionally, FIG. 7 illustrates additional example techniques that may be used to generate synthetic sales data, as illustrated by the steps of 706-712. In some embodiments, the synthetic data generator 210 may use one or more of the techniques described in connection with steps 706-712 to generate synthetic sales data. In some embodiments, the synthetic data generator 210 may use a combination of techniques for generating synthetic sales data.

[0125] In the example shown, the synthetic data generator 210 may use a forecasting model of the forecasting models 222 to generate synthetic data (step 706). In some embodiments, using the forecasting model to generate the synthetic data comprises determining quantities of sales that were most likely to have occurred during the time period given conditions (e.g., a price, location, season, event, trend, promotion, etc.) during that time period. For example, the synthetic data generator 210 may use the forecasting model trained to forecast demand for the example item to predict what the item sales would have been if there had not been a stockout event. For example, the synthetic data generator 210 may use a forecasting model to backcast what the sales would have been to generate synthetic sales data. In some embodiments, this may include simply using the forecasted demand for the item as the synthetic sales data. As an example, if it is determined that a stockout event occurred during the second week of March in 2024, the synthetic data generator 210 may use the forecasted demand for the example item during the second week of March in 2024 as synthetic data.

[0126] In the example shown, the synthetic data generator 210 may generate synthetic sales data using sales data of similar items (step 708). For example, the synthetic data generator 210 may identify similar items, and based at least in part on the sales data of the identified similar items, generate synthetic data for the example item. Example aspects of generating synthetic sales data using similar items are described above in connection with steps 604-606 of FIG. 6.

[0127] In the example shown, the synthetic data generator 210 may use a Bayesian model to generate synthetic sales data (step 710). For example, the Bayesian model may generate synthetic data for the example item based at least in part on sales of the item at other locations, sales of the item during different times from the stockout period, or sales of other items. In some embodiments, the Bayesian model may use other data instead of or in addition to sales data, such as data related to pricing, events, macroeconomic conditions, or other conditions that could have affected sales of the example item. By using such data, the Bayesian model may identify potential volumes of sales for the example item during the stockout period and may identify likelihoods of the potential volumes. In some embodiments, the synthetic data generator 210 may select the sales volumes having a greatest likelihood as part of the synthetic data. In some embodiments, when using a Bayesian approach, the synthetic data generator 210 may use a different model for each item or group of items, thereby advantageously improving, in some instances, the likelihood that the synthetic data accurately represents what sales would have been in the absence of a stockout.
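
As a non-limiting illustration, one simple Bayesian approach could be sketched with a Gamma prior on a Poisson sales rate, which is updated using sales observed at other locations or during other time periods. The disclosure does not specify the form of the Bayesian model; the Gamma-Poisson choice, the prior parameters, the upper bound on candidate volumes, and the function names are all assumptions made for illustration. Under these assumptions, the likelihood of each potential sales volume is given by a negative binomial posterior predictive distribution, and the volume having the greatest likelihood is selected.

```python
import math
from typing import Sequence, Tuple


def posterior_predictive_pmf(k: int, shape: float, rate: float) -> float:
    """Negative binomial predictive probability of selling k units."""
    p = rate / (rate + 1.0)
    log_pmf = (
        math.lgamma(k + shape) - math.lgamma(k + 1) - math.lgamma(shape)
        + shape * math.log(p) + k * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def most_likely_volume(
    observed: Sequence[int],   # e.g., sales of the item at other locations
    prior_shape: float = 2.0,  # assumed prior, not from the disclosure
    prior_rate: float = 0.1,   # assumed prior, not from the disclosure
    max_units: int = 500,      # assumed upper bound on candidate volumes
) -> Tuple[int, float]:
    """Return the candidate volume with the greatest likelihood."""
    # Conjugate update: Gamma(shape, rate) prior with Poisson observations.
    shape = prior_shape + sum(observed)
    rate = prior_rate + len(observed)
    likelihoods = {
        k: posterior_predictive_pmf(k, shape, rate)
        for k in range(max_units + 1)
    }
    best = max(likelihoods, key=likelihoods.get)
    return best, likelihoods[best]
```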

[0128] In the example shown, the synthetic data generator 210 may use a neural network to generate synthetic data (step 712). For example, the neural network may be trained to predict a demand for an item based on one or more factors that may influence demand, such as, for example, seasonality, price, location, promotion, trends, holidays, previous sales, or other factors. In such an embodiment, the synthetic data generator 210 may input, into the neural network, conditions that were present during the stockout event of the example item to predict what the demand would have been in the absence of a stockout. In some embodiments, the neural network may be a transformer-based neural network. In some embodiments, when using a neural network, the synthetic data generator 210 may use a common neural network for all items.

[0129] In the example shown, the synthetic data generator 210 may modify sales data using the synthetic sales data (step 714). In some embodiments, the synthetic data generator 210 may remove sales data for a time during which a stockout event occurred and replace the removed sales data with the synthetic sales data. For example, the synthetic data generator 210 may identify the time periods that were affected by the stockout event, such as one or more certain weeks. Then the synthetic data generator 210 may replace only the sales data for those time periods that were affected by the stockout event with the synthetic sales data while leaving other sales data for the item unchanged. In some embodiments, the synthetic data generator 210 may supplement the actual sales data with the synthetic sales data without removing actual sales data, even if at least some of the actual sales data may have been affected by a stockout.
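
As a non-limiting illustration, replacing only the sales data for the time periods affected by a stockout event could be sketched as follows. The function name and the representation of sales data as a mapping from a period label to a sales quantity are assumptions made for illustration.

```python
from typing import Dict, Set


def replace_stockout_periods(
    actual: Dict[str, float],     # period label -> actual sales
    synthetic: Dict[str, float],  # period label -> synthetic sales
    stockout_periods: Set[str],   # periods affected by the stockout event
) -> Dict[str, float]:
    """Return sales data in which only stockout periods use synthetic values."""
    merged = dict(actual)  # copy so the actual sales data is left untouched
    for period in stockout_periods:
        if period in synthetic:
            merged[period] = synthetic[period]
    return merged
```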

[0130] In the example shown, the model training system 208 may train the forecasting model for the example item using the synthetic data (step 716), example aspects of which are described in connection with the step 608 of FIG. 6. For example, the model training system 208 may use the sales data 110, which may have been modified to include the synthetic sales data, to train an item-specific demand forecasting model. In some embodiments, training the forecasting model includes using the synthetic data of the sales data 110 and also using data of the sales data 110 that was not synthetically generated, such as sales data during time periods for which no stockout event or other anomalous event was detected.

[0131] In the example shown, the forecasting model may forecast demand (step 718), example aspects of which are described in connection with the step 610 of FIG. 6. Moreover, the forecasting model may process, using parameters that were trained using the synthetically generated sales data for the stockout event, a demand forecasting request to generate a demand forecast that is provided to the requesting application.

[0132] In example aspects, there may be various features that may affect demand for an item. An example of such features may include holidays. For example, there may be increased demand for toys during Christmas and turkey during Thanksgiving. As such, demand forecasting models may more accurately forecast demand by accounting for features such as holidays. However, in the case of holidays, not every item is sensitive to every holiday. For example, the demand for flowers may be sensitive to Mother's Day but not to Halloween, and the demand for toothpaste may not be sensitive to any holiday. In some instances, it could be suboptimal to consider every holiday for every item, because doing so could create model instability and noise. For example, a forecasting model could, in some instances, improperly attribute changes of sales of an item to occurrence of a holiday, even if the item is not, in fact, sensitive to the holiday. In some instances, such improper attribution could result in model overfitting issues and degradation of forecasting accuracy.

[0133] In example aspects, a feature management system 212 is disclosed that may identify which holidays to consider for which items, thereby generating a list of zero or more holidays that are to be considered when generating demand forecasts for each item. For example, for a given item, with respect to the element 312 of FIG. 3, a forecasting model of the forecasting models 222 may consider the list of zero or more holidays for the item that have been identified as having an impact on demand for the example item.

[0134] In example aspects, the feature management system 212 may acquire features of sales and promotions, and a calendar of potentially relevant holidays and pre-holiday time periods. The feature management system 212 may use this data to identify important holidays for each item by modeling and clustering historical sales data. In some instances, seasonal items and non-seasonal items within a department are modeled and clustered separately.

[0135] In example aspects, the feature management system 212 may use one of various techniques for identifying relevant holidays for an item. For example, the feature management system 212 may use a regression model, a clustering model, a different type of model, or a combination thereof. In some instances, for a given item, the results of these techniques are combined to form a list of relevant holidays for an item. In some embodiments, the feature management system 212 may determine a list of relevant holidays for a group of items, such as an item cluster, department, or other category. In some embodiments, the list of relevant holidays is used by one or more of the forecasting model for the item or the model training system 208 used to train the forecasting model for the item. As such, the feature management system 212 may run independently from the forecasting model itself.
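
As a non-limiting illustration, selecting relevant holidays for an item using a threshold value corresponding to feature relevance could be sketched as follows. The sketch substitutes a simple relative lift measure (average sales while a holiday is active, compared with median sales while it is inactive) for the regression and clustering models described above; the lift measure, the threshold value, and the function name are assumptions made for illustration.

```python
from statistics import mean, median
from typing import Dict, List, Set


def select_features(
    sales: Dict[str, float],            # period label -> item sales
    feature_calendar: Dict[str, Set[str]],  # feature -> periods when active
    threshold: float = 0.2,             # assumed relevance threshold
) -> List[str]:
    """Return features whose relative sales lift meets the threshold."""
    selected = []
    for feature, active_periods in feature_calendar.items():
        on = [u for p, u in sales.items() if p in active_periods]
        off = [u for p, u in sales.items() if p not in active_periods]
        if not on or not off:
            continue  # not enough data to judge this feature
        baseline = median(off)  # median resists spikes from other holidays
        if baseline <= 0:
            continue
        lift = (mean(on) - baseline) / baseline
        if abs(lift) >= threshold:
            selected.append(feature)
    return selected
```

Because the same feature calendar and threshold are applied to the time series data of each item separately, different items may receive different sets of features, including an empty set.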

[0136] Aspects of the feature management system 212 may provide various technical advantages. In some embodiments, by selecting model features that are relevant on an item-by-item basis before performing inference, the forecasting models may not only be more accurate (e.g., by reducing noise associated with irrelevant features), but may also be trained faster (e.g., by training fewer features) and may also generate forecasts faster (e.g., by considering fewer features). Additionally, the feature management system 212 may expand the number of features that can be potentially considered by forecasting models 222. For example, by considering fewer, yet more relevant, holidays for each item, the number of potential holidays that may be considered can be expanded. For example, new holidays or cultural events may be potentially considered, since not every holiday is considered for every item. Yet still, in some embodiments, different features may be considered for different demand channels. For example, a given holiday may increase demand of items sold via the digital platform 106 but not for the retail locations 104, or vice-versa. Furthermore, in some embodiments, region-specific holiday impacts may be identified. For example, an item at a location in a first region may receive a larger increase in forecasted demand during a given holiday than that item at a location in a second, different region. As such, one or more of location-specific, region-specific, or channel-specific holiday effects may be identified and used by the forecasting models 222.

[0137] Yet still, the feature management system 212 may identify changes in demand pattern effects that may vary across different holidays. For example, changes in demand patterns due to a holiday may be defined as spanning particular weeks or days. The feature management system 212 may determine that different holidays have different time structures when it comes to how they relate to demand. For example, the feature management system 212 may determine that Christmas may cause a multi-week, progressive increase in demand but may not cause any increased demand on the day itself, whereas the Super Bowl may cause an increase in demand on only the day of the event and the day prior. As such, aspects of the feature management system 212 may refine the time periods for holidays to better reflect the holiday impact on demand.
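
As a non-limiting illustration, holiday-specific time structures could be represented in a feature calendar as follows. The class name, the field names, and the particular window lengths (for instance, a 21-day lead-up for Christmas) are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass(frozen=True)
class HolidayWindow:
    """A holiday feature and the days during which it is active."""
    name: str
    event_day: date
    days_before: int
    days_after: int = 0
    include_event_day: bool = True

    def active_days(self) -> List[date]:
        start = self.event_day - timedelta(days=self.days_before)
        span = self.days_before + self.days_after + 1
        days = [start + timedelta(days=i) for i in range(span)]
        if not self.include_event_day:
            days = [d for d in days if d != self.event_day]
        return days


# Multi-week lead-up with no effect on the day itself (assumed 21 days).
christmas = HolidayWindow(
    "Christmas", date(2024, 12, 25), days_before=21, include_event_day=False
)
# Effect on only the day of the event and the day prior.
super_bowl = HolidayWindow("Super Bowl", date(2024, 2, 11), days_before=1)
```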

[0138] Yet still, the feature management system 212 is not limited to identifying features that are holidays. For example, aspects of the feature management system 212 may also be used to identify which items are sensitive to other events or time periods. As an example, aspects of the feature management system 212 could be extended to identify which items are sensitive to a back-to-school time or which items are sensitive to certain weather events.

[0139] Yet still, aspects of the feature management system 212 may improve performance of the forecasting models 222 by identifying item-specific features to be considered by the forecasting models 222. Such improvements may include more accurate forecasting because only the most relevant holidays are considered for each item and more potential holidays may be considered. Such improvements may also include faster model execution times because fewer holidays are considered for each item. Other advantages associated with the feature management system 212 are likewise possible.

[0140] FIG. 8 illustrates a block diagram of an example architecture of the feature management system 212. In the example shown, the feature management system 212 includes components for selecting features to be considered by the forecasting models 222. As described in connection with FIG. 8, the feature management system 212 may determine holiday features for an example item at an example location. However, as will be understood, the feature management system 212 may determine other features that may impact demand apart from holiday features. Moreover, as will be understood, the feature management system may identify features for a plurality of items, for a plurality of locations, or both a plurality of items across a plurality of locations.

[0141] The feature data generator 802 may generate a plurality of features for the feature selection process or may process time series data such that it can be used to identify features for the feature selection process. The plurality of features may include potential features that may be included in demand forecasting models, such as the forecasting models 222. In some embodiments, the plurality of features includes only a subset of features of the forecasting models 222. For example, the plurality of features may include features whose relevance varies by a relatively large degree across items. For such features, it may be useful for model accuracy and speed to only include the features that are sufficiently relevant for an item, rather than including all such features. For example, a forecasting model for a toy may be improved by considering a holiday feature associated with Christmas, whereas a forecasting model for bananas may not be improved, and in fact may be made worse, by considering a feature associated with a Christmas holiday. The utility from a model accuracy and speed perspective of such a feature may vary greatly between forecasting models. In some embodiments, the plurality of features includes only fixed effects of a GAMM model, such as the fixed effects shown in connection with FIG. 3. Accordingly, different fixed effects may be selected for different forecasting models while a common set of other features, such as smooth terms, random effects, and other fixed effects, may be included across forecasting models, although the weights and applications of such other features may vary across models due to model-specific training processes. In some embodiments, the plurality of features from which sets of features are to be selected includes a certain type of fixed effects, such as holidays.

[0142] As shown, the feature data generator 802 may receive sales data 110 for the example item. Furthermore, the feature data generator 802 may receive item data 108 for the example item, such as an item profile or item attributes. Furthermore, the feature data generator 802 may receive forecast data 112. In some embodiments, the feature data generator 802 may perform scrubbing, imputation, or other adjustments to input data. In some embodiments, the feature data generator 802 may aggregate time series data across a plurality of items or locations, such that group-level or region-level features for a forecasting item may be identified. In some embodiments, the feature data generator 802 may apply the forecasting model 222. In some embodiments, the feature data generator 802 may perform aspects of the method 900 of FIG. 9.

[0143] The feature calendar generator 804 may generate a calendar of times during which a feature is applied. The features may correspond to the potential features that are generated or identified by the feature data generator 802. In some embodiments, the calendar may include a multi-dimensional array that maps a feature to a time period. During a feature's associated time period, it may be active and applied by a forecasting model of which the feature is part. Outside of the feature's associated time period, the feature may not be active, and therefore may not be applied by a forecasting model, even if the feature is part of the forecasting model. In some embodiments, when a forecasting model includes a feature with an associated time period corresponding to a feature calendar, the forecasting model may use a flag to determine whether the feature is active. For instance, for a time-based feature, the forecasting model may check the corresponding flag. If the flag is active, the forecasting model may use the feature when forecasting demand; if the flag is not active, the forecasting model may not use the feature when forecasting demand.
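As a non-limiting illustration, the flag check described above can be sketched as follows. The calendar contents, the dates, and the function names `is_active` and `active_features` are assumptions introduced only for this example and are not part of the disclosure.

```python
from datetime import date

# Hypothetical feature calendar: each feature maps to a list of
# (start, end) time periods, inclusive, during which it is active.
FEATURE_CALENDAR = {
    "christmas": [(date(2023, 12, 25), date(2023, 12, 25))],
    "christmas.minus": [(date(2023, 12, 18), date(2023, 12, 24))],
    "halloween": [(date(2023, 10, 31), date(2023, 10, 31))],
}


def is_active(feature, day, calendar=FEATURE_CALENDAR):
    """Return True if `day` falls inside any time period of `feature`."""
    return any(start <= day <= end for start, end in calendar.get(feature, []))


def active_features(model_features, day, calendar=FEATURE_CALENDAR):
    """Return the features of a model whose flag is active on `day`."""
    return [f for f in model_features if is_active(f, day, calendar)]
```

In this sketch, a model that includes both "christmas" and "halloween" would apply only "christmas" on December 25, because the flag for "halloween" is not active on that date.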

[0144] As an example, the feature calendar generator 804 may generate a calendar of holidays. In some embodiments, the feature calendar generator 804 may generate a calendar that represents overlapping holidays. In some embodiments, the feature calendar generator 804 may generate holidays that can be represented as a binary (i.e., the holiday is occurring, or it is not) or as a magnitude (e.g., the effect of a Christmas holiday may be lower on December 22nd than December 23rd). Moreover, the feature calendar generator 804 may define a holiday as spanning a day, week, or another granularity of time that captures the days for which the holiday may have an effect. In some embodiments, the feature calendar generator 804 may generate a custom feature calendar for each item or for each item-location combination. In some embodiments, the feature calendar generator 804 generates a calendar of potential holidays, and the potential holidays are customized on an item-by-item basis using the feature selectors 806 and the feature combiner 812.

[0145] The feature selectors 806 may determine which features, such as holidays, to consider for an item. In some embodiments, there may be one or more feature selectors that may be used by the feature management system. In the example shown, the feature selectors 806 include a regression model 808, a machine learning model 809, and a clustering model 810. In some embodiments, the feature selectors 806 may include additional feature selectors. In some embodiments, the feature selectors 806 may receive data from the feature data generator 802 and the feature calendar generator 804. For example, the feature selectors 806 may receive one or more of time series data from the feature data generator 802, such as sales data or forecast data, or a list of potential features to select for models. Additionally, the feature selectors 806 may receive a feature calendar from the feature calendar generator 804. In some embodiments, the feature selectors 806 may identify relevant features on an item-by-item basis. In some embodiments, the feature selectors 806 may use time series data for a group of items that includes a selected item to identify group-level features. For example, the regression model 808, machine learning model 809, and the clustering model 810 may each use this aggregated time series data to determine features that are applicable for a group, and these group-level features may be included in a set of features for an item.

[0146] The regression model 808 may be used to identify the impact of features in the sales data 110, in sales data as refined by the feature data generator 802, in demand forecasts, or in demand forecasts as refined by the feature data generator 802. In some embodiments, the regression model 808 includes a plurality of possible models and may execute a plurality of possible techniques for identifying feature importance. In some embodiments, the regression model 808 includes the machine learning model 809.

[0147] In some embodiments, the regression model 808 may perform one or more of a forward, backward, stepwise, or subset variable selection process. In some embodiments, the regression model 808 is a linear regression model that uses historical sales data or forecast data to identify relevant holidays for an item. The historical sales data, or a model trained using the historical sales data, may be decomposed. For example, the sales data may be decomposed to remove all variables that impact the time series data except for a set of potential features from which the feature selectors 806 are selecting features. As an example, decomposing the time series data may include removing effects such as long-term trends, pricing promotions, and other non-holiday effects. Additionally, seasonal effects and seasonal items may also be removed. As a result, the time series data may be converted such that it only reflects the effects of holidays and noise. Then, by comparing this decomposed sales data to the calendar of holidays from the feature calendar generator 804, it can be determined which holidays are responsible for the changes in demand for the item.
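As a non-limiting illustration, comparing decomposed sales data to a holiday calendar can be sketched as follows. The sketch assumes that the non-holiday effects have already been estimated as a `baseline` series, and it uses the mean residual during each holiday as a simplified stand-in for a fitted regression coefficient. The function name, the data values, and the 0/1 calendar layout are hypothetical.

```python
from statistics import mean


def holiday_effects(sales, baseline, calendar):
    """Estimate each holiday's effect as the mean residual while it is active.

    sales, baseline: equal-length lists indexed by time period.
    calendar: dict mapping a holiday name to a list of 0/1 flags per period.
    """
    # Residual after removing trend, promotions, and other non-holiday effects.
    residual = [s - b for s, b in zip(sales, baseline)]
    effects = {}
    for name, flags in calendar.items():
        active = [r for r, flag in zip(residual, flags) if flag]
        effects[name] = mean(active) if active else 0.0
    return effects


# Hypothetical data for six time periods.
sales = [100, 102, 150, 101, 99, 100]
baseline = [100, 100, 100, 100, 100, 100]
calendar = {
    "halloween": [0, 0, 1, 0, 0, 0],
    "labor_day": [0, 0, 0, 0, 1, 0],
}
effects = holiday_effects(sales, baseline, calendar)
```

Under these assumed values, the residual during "halloween" is large relative to the residual during "labor_day", which would suggest that only the former is responsible for a change in demand for the item.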

[0148] In some embodiments, the regression model 808 is a LASSO model and may shrink the coefficient of some features to zero, as controlled by lambda. In some embodiments, the LASSO model may fail if there is insufficient historical data for an item, or if the item is a slow seller with sparse sales data. In this case, a random forest algorithm may, in some embodiments, be used. The random forest algorithm may be an example of the machine learning model 809. In some embodiments, the random forest algorithm measures each feature by the percent increase in mean squared error if consideration of that feature is removed from the model. The features are then ranked by importance, and the features that may be selected are those that occur prior to an elbow in an importance curve, as shown, for example, in the example of FIG. 11.
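As a non-limiting illustration, selecting the features that occur prior to an elbow in an importance curve can be sketched as follows. The sketch assumes that the elbow is located at the largest drop between consecutive ranked importance values, which is only one of several possible heuristics. The function name and the importance values are hypothetical.

```python
def select_before_elbow(importance):
    """Rank features by importance and keep those before the largest drop.

    importance: dict mapping a feature name to an importance score, such as
    a percent increase in mean squared error.
    """
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < 2:
        return [name for name, _ in ranked]
    # Drop in importance between each feature and the next-ranked feature.
    drops = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    elbow = drops.index(max(drops))
    return [name for name, _ in ranked[: elbow + 1]]


# Hypothetical importance scores for five candidate holidays.
scores = {
    "christmas.minus": 30.0,
    "halloween": 26.0,
    "good_friday": 24.0,
    "labor_day": 5.0,
    "fathers_day": 4.0,
}
selected = select_before_elbow(scores)
```

Under these assumed scores, the largest drop occurs after the third-ranked holiday, so the first three holidays would be selected.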

[0149] The clustering model 810 may identify features for a group of items. A group of items may be, for example, two or more items that share a common attribute. Sharing a common attribute may include belonging to a common department, class, or other pre-defined category of items. Other example techniques for determining a group of items are described at least in connection with the step 604 of FIG. 6 and in connection with FIGS. 18-19. As such, if a particular item belonging to a group of items does not itself have sufficient historical sales data, a relevant feature for that item may nevertheless be identified if it belongs to a group of items that are sensitive to that feature. In some embodiments, normalized descriptive metrics for feature mean and standard deviations are calculated, and this data may be clustered. In some embodiments, a threshold parameter is applied to the clustering results, and the list of holidays with values exceeding it may be considered relevant features for the group of items. In some embodiments, clustering may be performed separately for seasonal and non-seasonal items. In some embodiments, seasonal items are sold only for a certain duration during the year. Excluding seasonal items from non-seasonal items before any clustering is performed can improve forecasting results. In some embodiments, an item may be considered non-seasonal if sold in more than ten stores for at least twenty-six weeks.
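As a non-limiting illustration, the example non-seasonal criterion described above can be sketched as follows. The sketch assumes one possible reading of the criterion, namely that an item is non-seasonal if there are at least twenty-six weeks in which it was sold in more than ten stores. The function name and input layout are hypothetical.

```python
def is_non_seasonal(stores_per_week, min_stores=10, min_weeks=26):
    """Return True if the item sold in more than `min_stores` stores
    during at least `min_weeks` weeks.

    stores_per_week: list with the number of stores selling the item
    in each week of the period under consideration.
    """
    qualifying_weeks = sum(1 for n in stores_per_week if n > min_stores)
    return qualifying_weeks >= min_weeks
```

Items for which this check returns False could then be clustered separately from the non-seasonal items.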

[0150] The feature combiner 812 may receive features from the feature selectors 806 that have been identified as important for the example item. In some embodiments, the feature combiner 812 performs a union of the features identified by different feature selectors 806. Additionally, in some embodiments, the feature combiner 812 may combine features identified by different passes of a single feature selector 806 using different data, such as item-specific time series data and aggregated time series data. In some embodiments, the feature combiner 812 selects a number of most important features identified by the feature selectors 806, where importance may correspond to degree of impact on a demand forecast. After selecting features, the feature management system 212 may provide the selected features to the forecasting model 222 or the model training system 208 to be included as part of the forecasting model for the example item. In some embodiments, the feature management system 212 may store the features as part of an item profile for the example item in the item data 108.

[0151] FIG. 9 is a flowchart of an example method 900 for determining model features. Although operations of the method 900 are described as being performed by the feature management system 212, one or more other components may perform operations of the method 900. Furthermore, although operations of the method 900 are described in connection with determining holiday features for an example item at a location, the method 900 may be extended to determining non-holiday features, for a plurality of items, or across a plurality of locations.

[0152] In the example shown, the feature management system 212 may begin a feature identification process (step 902). Beginning the feature identification process, which may include the steps 904-910 of FIG. 9, may occur as part of a scheduled process for identifying or updating features of the forecasting models. In some embodiments, the feature management system 212 may perform the feature identification process as part of updating forecasting models 222. The feature management system 212 may also begin the feature identification process for each new forecasting model, which may be created for each new item. In some embodiments, the feature management system 212 may periodically reperform the feature identification process. In some embodiments, the feature management system 212 may perform the feature identification process simultaneously for a plurality of items.

[0153] In the example shown, the feature management system 212 may access time series data (step 904). For example, the feature management system 212 may access time series data corresponding to an item or group of items for which a forecasting model is being generated. The time series data may include one or more of historical sales data, synthetic sales data, or forecasting data. For example, if the feature management system 212 is selecting features for an item-specific forecasting model, then the time series data may include sales data or forecasting data for that item. In addition, the feature management system 212 may also access item data for the item, such as data of the item data 108.

[0154] In the example shown, the feature management system 212 may generate a feature calendar (step 906), example aspects of which are described in connection with the feature calendar generator 804 of FIG. 8.

[0155] In the example shown, the feature management system 212 may select features for the example item (step 908). Selecting, or identifying, features for the example item may include analyzing the time series data for the item and analyzing a feature calendar. Depending on the embodiment, the feature management system 212 may use various selection approaches and models to identify features. Example techniques include applying one or more of the models 808, 809, or 810 in combination with a threshold value that corresponds to feature relevance. The threshold value may be a number of features to be selected. For example, the threshold value may be five, which may indicate that the five features of a plurality of features that are most predictive of item demand are selected, where different techniques may be used to determine how predictive of demand a given feature is, as described in connection with the models 808-810 and as described below. As another example, the threshold value may be a requirement of how predictive of demand a given feature is to be included as part of the forecasting model. If the predictive value of a given feature is greater than the threshold value, it may be selected; if not, it may not be selected.

[0156] Analyzing the time series data and feature calendar may include building multiple item-chain models with selected holidays removed and comparing the goodness-of-fit of these models while penalizing the model for over-fitting. In some embodiments, models may be compared, and therefore features may be compared to one another and to a threshold value, by using Akaike Information Criterion (AIC). In some embodiments, as part of selecting features, the feature management system 212 may measure forecast performance of various models (e.g., random forest, XGBoost, LASSO, or other models) to determine the importance of particular features in the prediction of demand for the example item. Furthermore, the feature management system 212 may test identified holiday features in linear and GAMM models using various feature selection methods (forward, backward, stepwise, and best subsets) to select the best model. By performing such a selection method, the features may be ranked, and the top X number of features may be selected, where X may correspond to a threshold value of number of features of a plurality of features to include in the forecasting model. Furthermore, the feature management system 212 may perform multiple passes to identify features of a plurality of features for a given model. For example, first item-specific features may be identified and then group-level features may be identified for a group of items that include the item. Item-specific features may be features that are identified at an item level, as opposed to a group level. Different items may share certain item-specific features, such as two items including a certain fixed effect, like a certain holiday. Additional example aspects of step 908 are described in connection with the feature selectors 806 and feature combiner 812 of FIG. 8, and in connection with FIGS. 11-12.
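As a non-limiting illustration, comparing models with selected holidays removed by using AIC can be sketched as follows. The sketch assumes least-squares models with Gaussian errors, for which AIC may be written, up to an additive constant, as n·ln(RSS/n) + 2k, where n is the number of observations, RSS is the residual sum of squares, and k is the number of fitted parameters. The function names and numeric values are hypothetical.

```python
import math


def aic_least_squares(rss, n, k):
    """AIC of a least-squares fit with Gaussian errors, up to a constant."""
    return n * math.log(rss / n) + 2 * k


def holidays_supported_by_aic(full, reduced, n):
    """Return holidays whose removal makes AIC worse (higher).

    full: (rss, k) of the model containing every candidate holiday.
    reduced: dict mapping a holiday to the (rss, k) of the model
    refit with that holiday removed.
    """
    full_aic = aic_least_squares(full[0], n, full[1])
    return [
        holiday
        for holiday, (rss, k) in reduced.items()
        if aic_least_squares(rss, n, k) > full_aic
    ]


# Hypothetical fits on 100 observations.
kept = holidays_supported_by_aic(
    full=(400.0, 5),
    reduced={"halloween": (900.0, 4), "labor_day": (402.0, 4)},
    n=100,
)
```

Under these assumed values, removing "labor_day" barely changes the fit while saving a parameter, so AIC improves and the holiday is not retained, whereas removing "halloween" degrades the fit enough that the holiday is retained.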

[0157] In the example shown, the feature management system 212 may combine features for the example item (step 910). For example, the feature management system 212 may combine features from multiple passes of feature identification for the forecasting model or may combine features from different feature selectors. In some embodiments, the feature management system 212 may determine whether to use item-specific features for the example item or to use group-level features for the example item. For example, holiday features may be determined at an item level using one or more of the regression model 808 or the machine learning model 809 and at a group level by using the clustering model 810. As another example, the regression model 808 may determine item-specific features using first time series data and may determine group-level features using aggregated time-series data.

[0158] In some embodiments, item-specific features may be determined if sufficient data is available. For example, the feature management system 212 may attempt to apply one or more of the feature selectors 806 to the item-specific time series data, and if the feature selectors fail to identify relevant features, then it may be determined that the time series data is insufficient. As another example, it may be determined that the time series data is insufficient if the item has been sold for less than two years. Other examples of determining whether the time series data is sufficient to identify item-specific features are described in connection with step 602 of FIG. 6.

[0159] In some embodiments, selecting features at the item level may enable detection of intra-group nuances for how different items of a group may be sensitive to features. Therefore, in some embodiments, if there is sufficient data to identify features at an item level, then item-specific features will be used. If sufficient data is not available, however, then group-level features may be used. In other embodiments, a combination of item-specific features and group-level features may be used. Example aspects of combining features are described in connection with the feature combiner 812 and in connection with FIG. 10.
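As a non-limiting illustration, the choice between item-specific features and group-level features can be sketched as follows. The sketch assumes that sufficiency is judged by two conditions drawn from the examples above: the feature selectors returned at least one item-specific feature, and the item has at least two years (104 weeks) of history. The function name and parameters are hypothetical.

```python
def choose_features(item_features, group_features, weeks_of_history, min_weeks=104):
    """Use item-specific features when data is sufficient, else group-level.

    item_features: features identified from item-specific time series data
    (empty if the feature selectors failed to identify any).
    group_features: features identified for the group containing the item.
    """
    sufficient = bool(item_features) and weeks_of_history >= min_weeks
    return list(item_features) if sufficient else list(group_features)
```

For example, an item with 30 weeks of history would receive its group-level features under this sketch, even if a feature selector returned item-specific candidates.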

[0160] In the example shown, the feature management system 212 may output features (step 912). Outputting the features may include updating a forecasting model, such as by modifying one or more files associated with the forecasting model, such that the model includes the features identified for the forecasting model during the steps 908-910. The forecasting model may then be trained using the identified features and may use the trained features as part of generating demand forecasts.

[0161] Although aspects of the method 900 of FIG. 9 were described in connection with identifying features for an example item, the method 900 may be used to determine features for a plurality of different items. For example, the method 900 may be used to identify features for a first item-specific forecasting model, a second item-specific forecasting model for another item, or additional forecasting models, such as chain or group-level forecasting models, location-specific forecasting models, channel-specific forecasting models, or combinations of such models. As a result, different forecasting models may include different sets of features of a plurality of features. For example, one forecasting model may include more features from a plurality of features than another model. For instance, a first item-specific forecasting model may include eight fixed effects associated with holidays, whereas a second item-specific forecasting model may include six fixed effects associated with holidays. Additionally, although there may, in some instances, be overlap between sets of features for different items (e.g., forecasting models for different items may include a common Christmas fixed effect), one forecasting model may include a fixed effect that is not present in another forecasting model, and vice-versa.

[0162] FIG. 10 illustrates a schematic diagram of determining holidays for an example item 1001. As an example, the item 1001 may be a particular type of a candy with a particular size, such as a 10-ounce bag of Peanut M&M's. In the example shown, the item 1001 may belong to a class 1003, which may be a group of items that includes the item 1001. Moreover, the class 1003 of items may belong to a department 1005 of items, which may be a group of items that includes the class 1003 and one or more additional classes of items. In an example, the hierarchy of the item 1001, class 1003, and department 1005 is part of a hierarchical categorization of items in a retail catalog. In the example of FIG. 10, the feature management system 212 may determine three sets of holiday features, the features 1002, the features 1004, and the features 1006.

[0163] As an example, the feature management system 212 may apply one or more of the regression model 808 or the machine learning model 809, along with a threshold value, to determine the model features 1002. In some embodiments, although the model features 1002 are determined based at least in part using data associated with one or more of the class 1003 or department 1005, the model features 1002 may also be determined at least in part using item data that is specific to the item 1001. In the example shown, the feature management system 212 may determine that the holidays of Christmas, Christmas.minus, Father's Day, Halloween, and Labor Day are sufficiently important holidays for the item 1001. Each of these holidays may correspond to one or more dates in a feature calendar, as described in connection with the feature calendar generator 804. The holiday of Christmas.minus may represent, for example, a day or a week prior to the Christmas holiday.

[0164] In the example shown, the feature management system 212 may apply the clustering model 810 to determine the clustering features 1004. The clustering features 1004 may represent relevant holidays for the department 1005, which includes the item 1001. In the example shown, the clustering features 1004 include Christmas, Christmas.minus, Good Friday, and Valentines.

[0165] In the example shown, the feature management system 212 may combine the features 1002 and 1004. For example, the feature management system 212 may apply the feature combiner 812 to combine the features 1002 and 1004. In the example shown, the feature management system may take a union of the features 1002 and 1004, resulting in a list of holiday features that includes holidays from each of the features 1002 and 1004. In some embodiments, the feature management system 212 may combine the features 1002 and 1004 in a different manner, such as by selecting only features that occur in both of the features 1002 and 1004 or selecting features from the features 1002 and 1004 that have a sufficiently high impact on demand. In some embodiments, the forecasting model for the item 1001 may use the features 1006 when forecasting demand for the item 1001.
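As a non-limiting illustration, the combination of the features 1002 and 1004 can be sketched as follows, using the holidays named in the example of FIG. 10. The function name `combine` and the `mode` parameter are hypothetical.

```python
def combine(item_features, group_features, mode="union"):
    """Combine two lists of features, preserving first-seen order."""
    if mode == "union":
        # dict.fromkeys removes duplicates while keeping insertion order.
        return list(dict.fromkeys(item_features + group_features))
    if mode == "intersection":
        group = set(group_features)
        return [f for f in item_features if f in group]
    raise ValueError(f"unknown mode: {mode}")


features_1002 = ["Christmas", "Christmas.minus", "Father's Day", "Halloween", "Labor Day"]
features_1004 = ["Christmas", "Christmas.minus", "Good Friday", "Valentines"]

union_features = combine(features_1002, features_1004)
common_features = combine(features_1002, features_1004, mode="intersection")
```

In this sketch, the union contains seven holidays, whereas selecting only features that occur in both lists yields Christmas and Christmas.minus.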

[0166] FIG. 11 illustrates an example diagram 1100 depicting an example representation of an example technique for identifying feature importance, which may correspond to a feature's impact on a demand forecast. In the example shown, a tree-based method (e.g., a random forest) is used for its by-product of feature importance shown in the diagram 1100. The x-axis 1102 of the diagram represents a number of variables (e.g., a number of holidays considered by a forecasting model). The y-axis 1104 represents a mean squared error (MSE) of the forecasting model. The line 1106 represents the mean squared error of the forecasting model with respect to the number of variables. The dashed line 1108 represents a cut-off point. In the example shown, the cut-off point is 8. The cut-off point may correspond to a threshold value used to identify features.

[0167] Using data depicted by the diagram 1100, each feature used in the forecasting model is measured by the percent increase in MSE if it is removed from the model. The most important variables may have the highest metric (e.g., may be represented by the variables on the left-hand side of the diagram 1100). In the example shown, the 8 features represented by the numbers 1-8 may be identified as sufficiently important for the item demand that is being forecasted to be included in the forecasting model. In some embodiments, data such as the data depicted by FIG. 11 may be used in combination with data like the data depicted by FIG. 12.
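As a non-limiting illustration, measuring a feature by the percent increase in MSE when it is removed can be sketched as follows. The sketch uses a toy additive predictor in place of a tree-based model; in practice, random forest implementations commonly estimate this metric by permuting a feature's values rather than by refitting without the feature. The function names, the predictor, and the data values are hypothetical.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def pct_increase_in_mse(actual, predict, features):
    """Percent increase in MSE when each feature is removed in turn.

    predict: callable taking a list of features and returning predictions.
    Assumes the MSE of the full model is greater than zero.
    """
    base = mse(actual, predict(features))
    scores = {}
    for feature in features:
        reduced = [f for f in features if f != feature]
        scores[feature] = 100.0 * (mse(actual, predict(reduced)) - base) / base
    return scores


# Toy additive predictor: a flat baseline plus per-period holiday effects.
BASELINE = 10.0
EFFECTS = {"halloween": [0, 10, 0, 0], "labor_day": [0, 0, 0, 2]}


def toy_predict(features):
    return [BASELINE + sum(EFFECTS[f][t] for f in features) for t in range(4)]


importance = pct_increase_in_mse([10, 20, 10, 13], toy_predict, list(EFFECTS))
```

Under these assumed values, removing "halloween" increases MSE far more than removing "labor_day", so "halloween" would appear further to the left in a ranking such as that of the diagram 1100.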

[0168] FIG. 12 illustrates an example diagram 1200 depicting holiday feature importance. In some embodiments, data of the diagram 1200 may be determined by applying one or more of the feature selectors 806 for an example item. The x-axis 1202 of the diagram 1200 represents a degree of an effect on demand for the example item based on a holiday. The y-axis 1204 represents a plurality of potential holidays to consider for the example item. In the example shown, the holidays are listed in order based on the magnitude of effect. For example, as shown, the holiday christmas.minus has the greatest effect on demand for the example item. As shown, some relevant holidays may increase demand for the example item (e.g., christmas.minus), whereas other relevant holidays may decrease demand for the example item (e.g., new year's day).

[0169] In some embodiments, the feature management system 212 may select the top X most impactful holiday features to consider when forecasting demand for the example item, an example of which is described in connection with the diagram 1100 of FIG. 11. For example, if the feature management system 212 selects the top three holidays having the greatest impact, then in the example of FIG. 12, the feature management system 212 may select christmas.minus, good friday, and new year's day. In some embodiments, the feature management system 212 may select all holidays that have a sufficiently large effect on demand. For example, if the feature management system 212 selects all holidays having an effect of at least 0.4, then the feature management system may select christmas.minus, good friday, new year's day, and christmas. Other variations for selecting relevant features are likewise possible.
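As a non-limiting illustration, the two selection variations described above can be sketched as follows. The effect values are hypothetical and are not read from FIG. 12; they are chosen only so that the ordering matches the example, with new year's day assigned a negative value because it may decrease demand. The sketch ranks holidays by the absolute value of their effect.

```python
def rank_by_magnitude(effects):
    """Return holiday names ordered by absolute effect, largest first."""
    return sorted(effects, key=lambda h: abs(effects[h]), reverse=True)


def select_top_x(effects, x):
    """Select the X holidays with the greatest magnitude of effect."""
    return rank_by_magnitude(effects)[:x]


def select_at_least(effects, threshold):
    """Select every holiday whose magnitude of effect meets the threshold."""
    return [h for h in rank_by_magnitude(effects) if abs(effects[h]) >= threshold]


# Hypothetical effect sizes.
effects = {
    "christmas.minus": 0.9,
    "good friday": 0.7,
    "new year's day": -0.6,
    "christmas": 0.45,
    "halloween": 0.2,
}
top_three = select_top_x(effects, 3)
large_effects = select_at_least(effects, 0.4)
```

Under these assumed values, the count-based threshold of three selects christmas.minus, good friday, and new year's day, and the magnitude threshold of 0.4 additionally selects christmas.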

[0170] In example aspects, a hyperparameter may be a parameter that influences the output or operation of a model but is not itself a parameter of the model. In the context of demand forecasting models, an example hyperparameter may be whether a model considers a particular factor, such as a seasonality factor, and the weight given to such a factor. In some instances, some hyperparameters may be optimal in the context of some items but not others. For example, considering individual item seasonality may improve forecasting accuracy for a grocery department but not for a beauty department.

[0171] In example aspects, a hyperparameter tuning system 214 may optimize hyperparameters for groups of items. In some embodiments, the hyperparameter tuning system 214 may use Bayesian hyperparameter optimization to find optimal hyperparameters within a search space. The Bayesian optimization may enable identification of which hyperparameters have greater or lesser importance. In some embodiments, hyperparameters identified as more important may then be analyzed to determine whether a default value or an optimized value should be used. For example, a grid of all options of hyperparameter combinations (default and optimized for each hyperparameter determined to be important) may be created, and the overall objective may be assessed for each option combination.
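As a non-limiting illustration, assessing a grid of default and optimized values for each important hyperparameter can be sketched as follows. The sketch does not perform Bayesian optimization; it assumes that optimized values have already been found and that a lower objective value is better. The hyperparameter names, values, and objective function are hypothetical.

```python
from itertools import product


def best_combination(defaults, optimized, objective):
    """Evaluate every default/optimized combination and return the best.

    defaults, optimized: dicts mapping a hyperparameter name to its value.
    objective: callable taking a dict of values and returning a score,
    where lower is better (for example, a forecast error).
    Returns a (score, values) tuple.
    """
    names = sorted(defaults)
    best = None
    for choice in product(("default", "optimized"), repeat=len(names)):
        values = {
            name: defaults[name] if source == "default" else optimized[name]
            for name, source in zip(names, choice)
        }
        score = objective(values)
        if best is None or score < best[0]:
            best = (score, values)
    return best


# Hypothetical hyperparameters and objective.
defaults = {"seasonality_weight": 1.0, "trend_damping": 0.5}
optimized = {"seasonality_weight": 0.7, "trend_damping": 0.9}


def toy_objective(values):
    return abs(values["seasonality_weight"] - 0.7) + abs(values["trend_damping"] - 0.5)


score, chosen = best_combination(defaults, optimized, toy_objective)
```

Under this assumed objective, the best combination uses the optimized value for one hyperparameter and the default value for the other, which illustrates why each option combination may be assessed rather than adopting all optimized values together. The grid contains 2 to the power of the number of important hyperparameters, so limiting the grid to important hyperparameters keeps it small.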

[0172] As described further in connection with FIGS. 13-14, the hyperparameter tuning system 214 may, in some embodiments, generally use two operations for identifying hyperparameters and their corresponding values. For example, the hyperparameter tuning system 214 may identify a set of optimal hyperparameters for a department of items. Continuing with the example, the hyperparameter tuning system 214 may then create groups of items in the department, and for each group of items, assign a set of hyperparameter values.

[0173] Aspects of the present disclosure related to the hyperparameter tuning system 214 may provide various technical advantages. For example, selecting and tuning hyperparameters using aspects of the hyperparameter tuning system 214 has been shown to result in an average accuracy improvement of 3-4%. Furthermore, the hyperparameter tuning system 214 may enable the forecasting models 222 to more accurately predict sales of highly seasonal products with limited life spans. Furthermore, the hyperparameter tuning system 214 may efficiently select appropriate hyperparameters for groups of items, even if different items within the group have different optimal hyperparameters. It can be difficult to select appropriate hyperparameters efficiently for each group of items. Yet still, the hyperparameter tuning system 214 is not specific to any particular hyperparameters. As such, it may be extended to include additional or different hyperparameters. As will be appreciated, these are only some of the advantages provided by aspects of the hyperparameter tuning system 214.

[0174] FIG. 13 illustrates an example architecture of the hyperparameter tuning system 214. In the example shown, the hyperparameter tuning system 214 includes a hyperparameter optimizer 1302, a hyperparameter contextualizer 1304, and hyperparameter default values 1306. In some embodiments, the hyperparameter tuning system 214 may be turned on and off by a user, and a user can override an output of the forecasting model. As such, even if the tuned hyperparameter values determined by the hyperparameter tuning system 214 are not used for a given forecast or forecast model, the technology can store the following forecasts: (1) a forecast from the demand forecasting model without tuned hyperparameters; (2) a forecast from the demand forecasting model with tuned hyperparameters; and (3) a user override forecast.

[0175] The hyperparameter optimizer 1302 may receive item data 108, sales data 110, and forecast data 112. In some embodiments, for a group of items, such as a department of items, the hyperparameter optimizer 1302 may identify hyperparameters of forecasting models that forecast demand for the group of items. Moreover, the hyperparameter optimizer 1302 may determine optimal values for the identified hyperparameters. Example operations that may be performed by the hyperparameter optimizer 1302 may include the steps 1404 and 1406 of FIG. 14, which are further described below.

[0176] The hyperparameter contextualizer 1304 may receive identifications of hyperparameters and their corresponding optimal values as determined by the hyperparameter optimizer 1302. In some embodiments, the hyperparameter contextualizer 1304 may also receive hyperparameter default values 1306. In some embodiments, whereas the hyperparameter optimizer 1302 determines optimal hyperparameter values for a group of items, such as a class or department, the hyperparameter contextualizer 1304 may determine, on a more granular level, whether to use a group-level optimal value or default value for each of a plurality of hyperparameters. In some embodiments, the hyperparameter contextualizer 1304 may perform steps 1408-1412 of the method 1400, which are further described below.

[0177] The hyperparameter default values 1306 are values for the hyperparameters that may be different from the values determined by the hyperparameter optimizer 1302. In some embodiments, the hyperparameter default values 1306 are the hyperparameter values that are used prior to performing a hyperparameter tuning process. In some embodiments, the hyperparameter default values 1306 are input by a user and as such, may represent knowledge regarding the demand forecasts and the forecasting models 222 that may not be accessible to the hyperparameter optimizer 1302.

[0178] FIG. 14 is a flowchart of an example method 1400 for identifying and tuning hyperparameters. Although aspects of the method 1400 are described as being performed for an example group of items, the method 1400 may be applied to a plurality of groups of items and a plurality of items across an item catalog. Although aspects of the method 1400 are described as being performed by the hyperparameter tuning system 214, other components may also perform operations of the method 1400.

[0179] In the example shown, the hyperparameter tuning system 214 may select items (step 1402). For example, the hyperparameter tuning system 214 may select a department of items, which may be associated with one or more forecasting models of the forecasting models 222. The hyperparameter tuning system 214 may identify and tune hyperparameters for these one or more forecasting models. In some embodiments, the selected items may be identified by a user. In some embodiments, the one or more models for the selected items may be experiencing a poor forecasting performance. In some embodiments, the hyperparameter tuning system 214 may be scheduled to periodically update hyperparameter values for groups of items.

[0180] In the example shown, the hyperparameter tuning system 214 may select hyperparameters associated with the one or more forecasting models (step 1404). For example, a given forecasting model may include a plurality of hyperparameters. A hyperparameter may indicate whether a potential factor is considered by the forecasting model and may include a weight given to the factor. From the potential hyperparameters, the hyperparameter tuning system 214 may select one or more hyperparameters to be optimized. Depending on the embodiment, different hyperparameters may be used. Example hyperparameters include the following: whether each item has its own seasonality; whether each item has its own trend; a residual parameter that decides the responsiveness of forecasts to week-over-week variations; demand elasticity; an intercept value or how an intercept value is determined; a pre-processing parameter, or a post-processing parameter. Other hyperparameters are likewise possible.

[0181] In the example shown, the hyperparameter tuning system 214 may determine optimal values for the selected hyperparameters (step 1406). The optimal values may be the values that optimize an objective function, such as minimizing a difference between a forecasted demand and an actual demand, when forecasting demand across the selected items. In some embodiments, to perform the optimization, the hyperparameter tuning system 214 may use an exploration-exploitation algorithm. For example, the hyperparameter tuning system 214 may execute experiments using different values for the selected hyperparameters to determine a quality of the values and to generate information that can be used to select hyperparameter values for subsequent experiments. In some embodiments, the hyperparameter tuning system 214 may use a Bayesian hyperparameter optimization algorithm. This algorithm may produce optimal values for the selected hyperparameters for the department of items.
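
The experiment loop described above can be illustrated with a minimal sketch. It assumes a simple epsilon-greedy search over a fixed list of candidate values, which is a stand-in for, and not an implementation of, the Bayesian optimization mentioned above. The function forecast_error, the hyperparameter it takes, and the candidate values are all hypothetical.

```python
import random

def forecast_error(residual_weight):
    # Hypothetical stand-in for running one forecasting experiment and
    # measuring the gap between forecasted and actual demand.
    return (residual_weight - 0.3) ** 2

def explore_exploit_search(candidates, objective, rounds=30, epsilon=0.3, seed=0):
    """Return the best (value, objective) pair found among the candidates."""
    rng = random.Random(seed)
    evaluated = {}  # candidate index -> objective value from its experiment
    for _ in range(rounds):
        if not evaluated or rng.random() < epsilon:
            # Explore: try a candidate chosen at random.
            idx = rng.randrange(len(candidates))
        else:
            # Exploit: try a neighbor of the best candidate seen so far.
            best_idx = min(evaluated, key=evaluated.get)
            idx = min(max(best_idx + rng.choice((-1, 1)), 0), len(candidates) - 1)
        if idx not in evaluated:
            evaluated[idx] = objective(candidates[idx])
    best_idx = min(evaluated, key=evaluated.get)
    return candidates[best_idx], evaluated[best_idx]

candidates = [i / 10 for i in range(11)]
best_value, best_error = explore_exploit_search(candidates, forecast_error)
```

Each experiment's result is recorded and reused when choosing the next value to try, which is the property the paragraph above attributes to an exploration-exploitation algorithm.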

[0182] In the example shown, the hyperparameter tuning system 214 may select hyperparameters to contextualize (step 1408). For example, the hyperparameter tuning system 214 may select hyperparameters that most impact demand forecasts. For example, the hyperparameters that most contribute to the objective value may be selected. In some embodiments, the hyperparameters may be selected from the group of hyperparameters for which optimal values were determined (e.g., at the step 1406), and the degree of importance of a given hyperparameter may be determined while determining the optimal hyperparameter values.

[0183] In the example shown, the hyperparameter tuning system 214 may determine hyperparameter default values (step 1410). For example, the hyperparameter tuning system 214 may retrieve the hyperparameter default values 1306 of FIG. 13.

[0184] In the example shown, the hyperparameter tuning system 214 may determine combinations of hyperparameter values (step 1412). For example, using the identified most important hyperparameters, the hyperparameter tuning system 214 may create sets of possible combinations for the default and optimal values for these hyperparameters. The default values may be the values determined at the step 1410, and the optimal values may be the values determined at the step 1406. An example of combinations of hyperparameters determined by the hyperparameter tuning system 214 is illustrated in the example of FIG. 15A.

[0185] In the example shown, the hyperparameter tuning system 214 may evaluate the combinations of hyperparameter values for the selected items (step 1414). For example, the hyperparameter tuning system 214 may group items of the selected items and determine, for each item group, whether to apply an optimal or default value for the evaluated hyperparameters. For example, referring to the diagram 1500 of FIG. 15A, the hyperparameter tuning system 214 may determine which of the four quadrants (i.e., which combination of default or optimal hyperparameter values) is to be applied to each item of the selected items.

[0186] For example, for each item in the selected items, the hyperparameter tuning system 214 may evaluate the objective function for each combination of hyperparameter values. For example, to evaluate the objective for the cell of {default, optimized} of the diagram 1500 of FIG. 15A, the default value of item trend responsiveness and the optimized value for residual parameter may be used, and when used, the forecasting error is calculated to determine an objective value for an item for that cell.

[0187] Furthermore, for every item of the selected items, the hyperparameter tuning system 214 may create a vector Y of evaluated objectives across all combinations of hyperparameter values and fit a multivariate regression tree to explain Y with important item-level features X. The features X may include features that are deemed to be important for grouping items, such as belonging to a particular class or having a particular attribute. In some embodiments, the regression tree that is used is a machine learning model that learns how to group the items of the department based on the identified features X and the values Y of evaluated objectives across the combinations on the grid, as described further in connection with FIG. 15B. For each group identified by the regression tree, the hyperparameter tuning system 214 may pick the combination of the grid having the hyperparameter setting with the maximum (or minimum, in some embodiments) average objective.
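
The final selection step can be sketched as follows. The sketch assumes that the item groups have already been produced (for example, by the regression tree, which is not implemented here) and that the objective is a forecasting error to be minimized. The item names, group labels, and error values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def best_combination_per_group(item_groups, objectives):
    """Pick, for each group, the combination with the lowest average error.

    item_groups: item -> group label (assumed to come from the tree).
    objectives:  item -> {combination label: evaluated objective}.
    """
    by_group = defaultdict(lambda: defaultdict(list))
    for item, group in item_groups.items():
        for combo, error in objectives[item].items():
            by_group[group][combo].append(error)
    return {
        group: min(combos, key=lambda c: mean(combos[c]))
        for group, combos in by_group.items()
    }

item_groups = {"item1": "N1", "item2": "N1", "item3": "N2"}
objectives = {
    "item1": {"default/default": 0.30, "default/optimized": 0.20,
              "optimized/default": 0.25, "optimized/optimized": 0.22},
    "item2": {"default/default": 0.28, "default/optimized": 0.18,
              "optimized/default": 0.30, "optimized/optimized": 0.24},
    "item3": {"default/default": 0.10, "default/optimized": 0.15,
              "optimized/default": 0.12, "optimized/optimized": 0.14},
}
selected = best_combination_per_group(item_groups, objectives)
```

In embodiments where the objective is to be maximized, min would be replaced with max.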

[0188] In the example shown, the hyperparameter tuning system 214 may output hyperparameter values (step 1416). For example, once the hyperparameters for groups of items are determined, the hyperparameter tuning system 214 may provide these hyperparameters to a main forecasting model, such as the forecasting models 222 for use when generating demand forecasts. In some embodiments, the hyperparameter tuning system 214 may provide the hyperparameters to the model training system 208 for use when training the forecasting models 222. In some embodiments, the hyperparameter tuning system 214 may store the hyperparameters.

[0189] FIG. 15A illustrates a diagram 1500 depicting example hyperparameter combinations. In the example shown, the diagram 1500 shows two example hyperparameters: seasonality and trend. Seasonality may be whether and to what extent an item demand is sensitive to changes in times of a year. Trend may be whether and to what extent an item demand is sensitive to recent changes (e.g., within the last one, two, or three weeks) in the item demand. Other hyperparameters are likewise possible. In the example shown, the hyperparameter tuning system 214 may have determined default and optimized values for each of seasonality and trend. Therefore, the hyperparameter tuning system 214 may determine whether to apply a default or optimized value for each of seasonality and trend to groups of items. As would be appreciated by those having skill in the art, if there are more hyperparameters to be evaluated for a certain group of items, there may be more combinations of default and optimized values that are to be evaluated. For example, if there are three hyperparameters to evaluate, then there may be eight combinations of default and optimized values to evaluate, if there are four hyperparameters to evaluate, then there may be sixteen combinations to evaluate, and so on.
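
As an illustration of how the number of combinations grows, the following minimal sketch enumerates every default/optimized combination for a set of hyperparameters. The hyperparameter values shown are hypothetical placeholders and are not taken from the disclosure.

```python
from itertools import product

def value_combinations(defaults, optimized):
    """Enumerate every default/optimized choice across the hyperparameters."""
    names = sorted(defaults)
    combos = []
    for choice in product(("default", "optimized"), repeat=len(names)):
        values = {
            name: defaults[name] if source == "default" else optimized[name]
            for name, source in zip(names, choice)
        }
        combos.append((choice, values))
    return combos

# Hypothetical values for the two hyperparameters of FIG. 15A.
defaults = {"seasonality": 1.0, "trend": 0.5}
optimized = {"seasonality": 0.6, "trend": 0.8}
grid = value_combinations(defaults, optimized)  # four combinations
```

With k hyperparameters the list has 2^k entries, which matches the counts of four, eight, and sixteen given above.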

[0190] FIG. 15B illustrates an example tree 1502 for determining which combination of hyperparameter values depicted in the diagram 1500 to apply to each subgroup of items for a group of items. For example, the tree depicts a group of N items, which may be, for example, a class, department, catalog, or other group of items. For the group of N items, the hyperparameter tuning system 214 may have determined that the hyperparameters of trend and seasonality will be contextualized, such that, for subgroups of the N items, different values (e.g., a default or optimized value) of the hyperparameters of seasonality and trend may be applied.

[0191] Moreover, the hyperparameter tuning system 214, or an administrator or engineer associated with the forecasting system 102, may have determined attributes X according to which the N items are to be divided into subgroups. The attributes X may be item attributes, as described further, for example, in connection with the item data 108 of FIG. 1. As shown by the leaves of the example tree 1502, the hyperparameter tuning system 214 may have separated the items N into 5 subgroups. For each of the subgroups, the hyperparameter tuning system 214 may then evaluate the combinations of hyperparameter values of the diagram 1500 to determine which hyperparameter values are to be used when forecasting demand for items within each of the identified item subgroups of N1-N5.

[0192] In some instances, when there is a promotion for an item, it may be expected that the sales for that item will increase. However, it may be challenging to determine the extent of that increase. For example, there are different types of promotions, and the effects of different promotions may vary. Moreover, some items are more sensitive to promotions than other items because some items may be more sensitive to price changes than others. As such, it can be challenging to accurately forecast the effects of promotions on demand.

[0193] In example aspects, a promotion system 216 determines an impact on a demand forecast for an item based on a promotion. For example, referring to the example of FIG. 3, when forecasting item demand, the promotion system 216 may determine a value for the promo lift factor 310. In some embodiments, when the promo lift is applied, it may increase the demand forecast for an item at a location. The promo lift factor may include a combination of an elasticity element and a price effect element.

[0194] In example aspects, for the elasticity element, the promotion system 216 may determine, on an item-by-item basis, a price elasticity, or a sensitivity to a price change, for the demand of an item. Elasticity may vary across items. For example, the demand for some items may be more sensitive to price changes than the demand for other items. For example, demand for luxury goods, such as snacks, may be more sensitive to price changes than demand for other goods, such as medicine. To determine an elasticity effect for an item, the promotion system 216 may use a demand curve model, which may be based on results of previous promotions or other price changes. In some embodiments, the greater the elasticity for an item, the greater the impact a promotion may have on demand.

[0195] In example aspects, for the price effect element, the promotion system 216 may determine a price change for an item due to a promotion. The price change may be the difference between the promotion price and a regular price. However, in some embodiments, given that some promotions have requirements, or are not available to all customers, the price change for a promotion may, in some instances, be different than a difference between a regular price and a promotion price. For example, the price change may vary depending not only on the discount amount but also on the redemption rate for the promotion. In some embodiments, the redemption rate may correspond to the percentage of item sales that receive the promotion price.

[0196] In example aspects, there may be various types of promotions. In some instances, the redemption rates may vary across different types of promotions. As such, in some embodiments, a price effect may vary across different types of promotions and the impact on demand may vary across different types of promotions.

[0197] In example aspects, the price effect for a given promotion includes its discount value and redemption rate. For example, if promotion X has a redemption rate of 50% and if the discount for promotion X is 10%, then the price effect for the promotion may be 5%. For each item to which the promotion applies, the price effect may be combined with the elasticity effect for the item to determine the promotional lift. For example, the elasticity for item Y may indicate that the demand for item Y increases by 1% for every 2.5% price decrease. If, for example, promotion X applies to item Y, then the demand boost for item Y due to promotion X may be 2% (a 1% increase in demand for every 2.5% decrease in price, applied to a 5% price effect).
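
The arithmetic of this example can be checked with a short sketch. It assumes that the price effect is the product of the discount and the redemption rate and that demand responds linearly to the price effect. The function names are illustrative only.

```python
def price_effect(discount_pct, redemption_rate):
    # Effective price decrease, in percent, averaged over all unit sales.
    return discount_pct * redemption_rate

def promo_lift(price_effect_pct, demand_pct_per_price_pct):
    # Assumes a linear demand response to the effective price decrease.
    return price_effect_pct * demand_pct_per_price_pct

effect_x = price_effect(discount_pct=10.0, redemption_rate=0.5)  # promotion X
lift_y = promo_lift(effect_x, demand_pct_per_price_pct=1.0 / 2.5)  # item Y
```

Here effect_x corresponds to the 5% price effect and lift_y to the 2% demand boost described above.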

[0198] In example aspects, the promotion system 216 may perform additional operations associated with promotions. For example, the promotion system 216 may account for a demand increase caused by no discount promotions, which may be an advertisement. In some embodiments, the promotion system 216 may determine an effect of stacked promotions, such as a 10% member-only promotion for all items combined with a BOGO promotion for a particular item. In some embodiments, the promotion system 216 may account for a cannibalization effect. For example, if item X is on promotion and item Y is not on promotion, then a demand for item Y may decrease, because customers may purchase item X instead of item Y. In some embodiments, the promotion system 216 may account for a way in which a promotion is transmitted. For example, a promotion that is mailed to customers may have a different impact than the same promotion that is provided to customers in a circular. As will be understood, these are only some of the features that may be enabled by use of the promotion system 216.

[0199] FIG. 16 illustrates an example architecture of the promotion system 216. In the example shown, the promotion system 216 includes promotions 1602, redemption rate calculator 1604, price change calculator 1606, and elasticity models 1608. In some embodiments, the promotion system 216 may receive data from one or more other components described in connection with FIG. 1 or 2, such as, for example, item data 108, sales data 110, or forecast data 112, and the promotion system 216 may use such data as part of determining an impact of promotions on a demand forecast. The promotion system 216 may include more or fewer components than those described in connection with FIG. 16.

[0200] The promotions 1602 may include data corresponding to promotions. The promotions 1602 may include data for past, current, and future promotions. In some embodiments, each promotion may be associated with a promotion type, a discount, an item or group of items, a time, a location or group of locations, a channel, an advertisement mechanism, and other data.

[0201] In some embodiments, there may be various types of promotions. An example promotion type is a price-based offer. A price-based offer may be a discount, such as 10% off an item. In some embodiments, to qualify for a price-based offer, the customer may only need to purchase the item. Another example promotion type is a basket promotion. A basket promotion may require a customer to take an additional purchasing action to receive the promo.

[0202] There may be various types of basket promotions, such as a buy-one-get-one (BOGO) promotion (e.g., Buy 1 Get 1 50% off), mandated multiples (Buy 3 and receive $2 off each), hurdle offers (spend $50 and save $10 or get a $10 gift card), or other types of promotions that require a customer to take an additional action to qualify for the promotion. Another example promotion type is a single-use promotion. Single-use promotions may be promotions that can only be used once by a customer, such as a coupon for 50% off a particular item. Another example promotion type is a mass promotion. A mass promotion may be a promotion that is available to every customer. Another example promotion type is a member-only promotion, which may only be available to a subset of customers, such as customers that are part of a membership program. Another example promotion type is a personalized promotion, which may only be used by a specific customer. Other promotion types, or variations of promotion types, may include the following: free item promotions, free gift card promotions, location-based promotions, and shipping promotions. Other types of promotions are likewise possible. Moreover, there may be promotions that are available in-store but not online, or vice-versa. There may also be regional or store-specific promotions. Furthermore, some promotions may overlap. For example, a given promotion can be both a member-only promotion and a price-based offer.

[0203] The redemption rate calculator 1604 may be a program that determines redemption rates for promotions. For a given promotion, the redemption rate may be the percentage of items sold during a promotion period for which a promotion discount is applied. For example, if a promotion is a single-use or member-only promotion, then the redemption rate calculator 1604 may determine that not every item sold during the promotion period will qualify for a promotion price. On the other hand, the redemption rate calculator 1604 may determine that, for mass promotions that are price-based offers, the redemption rate may be 100%, because each time that any customer purchases the item, the customer receives the discount. Therefore, 100% of the purchases result in the promotion being applied. For other promotions, however, the redemption rate may vary. For instance, the redemption rate for basket promotions may not be 100%. For example, if a customer must purchase 2 of an item to qualify for a basket promotion, then it can be inferred that not every customer purchasing the item will qualify for the promotion, because some customers may only buy 1 of the item.

[0204] There may be various data used by the redemption rate calculator 1604 to determine a redemption rate for a given promotion. In some embodiments, the redemption rate calculator 1604 may use historical promotion data, from which redemption rates from previous promotions may be derived and used as a basis for determining redemption rates for subsequent promotions. In some embodiments, the redemption rate calculator 1604 may use data input by a user.
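
One way a redemption rate might be derived from historical promotion data is sketched below. The sketch assumes that each past promotion is summarized by the units sold during its promotion period and the units that received the promotion price, and that the rates are pooled across promotions. The record layout and the pooling choice are assumptions rather than details of the disclosure.

```python
def historical_redemption_rate(past_promotions):
    """Estimate a redemption rate from (units_sold, units_redeemed) records.

    Returns None when no units were sold, so that a caller can fall back to
    another source, such as a value input by a user.
    """
    total_sold = sum(sold for sold, _ in past_promotions)
    total_redeemed = sum(redeemed for _, redeemed in past_promotions)
    if total_sold == 0:
        return None
    return total_redeemed / total_sold

# Hypothetical history for two earlier promotions of the same type.
history = [(200, 150), (300, 250)]
estimated_rate = historical_redemption_rate(history)
```

With this hypothetical history, 400 of 500 units received the promotion price, giving an estimated redemption rate of 80%.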

[0205] The price change calculator 1606 may determine an effect on price for a promotion. For example, the price change calculator 1606 may use a discount amount of a promotion and a redemption rate to determine an estimated price change for an item on promotion. For example, if the discount for a promotion is $5 (e.g., buy 1 item and get the second item for $5 less), and if the redemption rate for the promotion is 80%, then the price change calculator 1606 may determine that the price change for the item during the promotion is $4. In some embodiments, the price change calculator 1606 may determine a range of possible price changes for a given promotion.

[0206] The elasticity models 1608 may be models that determine a change in item demand due to a change in item price. The elasticity model 228 of FIG. 2 is an example of one of the elasticity models 1608. In some embodiments, the elasticity models 1608 include an elasticity model for each item. In some embodiments, a group of items may use a common elasticity model. In some embodiments, the elasticity models 1608 may be represented as demand curves. There may be various techniques for determining the elasticity models 1608. In some embodiments, the promotion system 216 may use data associated with historical price changes (e.g., due to a promotion or another reason) and the effect of such price changes on historical sales data. In some embodiments, the promotion system 216 may use price and sales data associated with different items to estimate an effect on sales of a price change on an item. In some embodiments, the promotion system 216 may update the elasticity models 1608 in response to receiving sales data associated with new promotions or in response to a user input updating the elasticity models.

[0207] FIG. 17 is a flowchart of an example method 1700 for determining an impact of a promotion. Although operations of the method 1700 are described as being performed by the promotion system 216, other components described herein may perform at least some of the operations of the method 1700. Furthermore, although the method 1700 is described in connection with an example item and an example promotion, aspects of the method 1700 may be applied to determine an impact of a plurality of promotions across a plurality of items.

[0208] In the example shown, the promotion system 216 may begin a method for determining an impact of an example promotion (step 1701). In some embodiments, the promotion system 216 may determine the impact of a promotion prior to, or as part of, training or executing forecasting models 222. In some embodiments, the promotion system 216 may determine a promotion impact when a promotion is defined (e.g., by a user) or when a promotion period begins. In some embodiments, if a promotion applies to one or more of a plurality of items, locations, or channels, then the promotion system 216 may simultaneously determine an impact of the promotion across the items, locations, and channels.

[0209] In the example shown, the promotion system 216 may determine a price change (step 1702). As described in connection with the promotions 1602, the redemption rate calculator 1604, and price change calculator 1606, the promotion system 216 may determine an estimated aggregate price change for the example promotion. In some embodiments, determining the price change may include one or more steps, as illustrated by the example of FIG. 17.

[0210] In the example shown, the promotion system 216 may determine a discount amount (step 1704). A discount amount may be defined by the promotion as a difference between a promotion price and a regular price. For example, if an item's regular price is $20, and if the promotion price is $17, then the discount amount is $3. In some instances, a promotion does not have a discount amount. In some embodiments, if a promotion applies across a group of items (e.g., save 10% on an order), then the discount amount may be proportioned across the items for which the promotion is applicable.

[0211] In the example shown, the promotion system 216 may determine a redemption rate (step 1706). Example aspects of determining a redemption rate are described in connection with the redemption rate calculator 1604 of FIG. 16.

[0212] In the example shown, the promotion system 216 may combine a discount amount and redemption rate (step 1708). In some embodiments, the promotion system 216 may multiply a discount amount and a redemption rate. Other techniques for combining the discount and redemption rate are likewise possible. In some embodiments, the promotion system 216 may execute one or more additional steps as part of determining a price change. For example, in some embodiments, the promotion system 216 may combine a plurality of price changes from a plurality of promotions that may be applied to the example item.

[0213] In the example shown, the promotion system 216 may determine an elasticity for the example item (step 1710). In some embodiments, the promotion system 216 may input one or more of a regular price, promotion price, or price change into an elasticity model for the example item to determine the item's elasticity at the promotion price. In some embodiments, the elasticity may be a ratio that represents a degree to which item demand is changed by a change in price.

[0214] In the example shown, the promotion system 216 may determine a promotion impact (step 1712). In some embodiments, the promotion impact is the marginal increase of demand for the example item due to the example promotion. In some embodiments, determining the promotion impact includes multiplying an elasticity and price change. In some embodiments, the promotion impact may be determined in a different manner. In some embodiments, the promotion system 216 may determine not only an increase to demand for an example item on promotion, but the promotion system 216 may determine a secondary effect on demand for other items, such as items that are not on promotion and that may be complementary or substitute items to the items on promotion.

[0215] In the example shown, the promotion system 216 may output the promotion impact (step 1714). For example, the promotion system 216 may output the promotion impact to a forecasting model of the forecasting models 222. For example, the promotion system 216 may output the marginal increase to an item demand due to the promotion as the promo lift 310.
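
The steps 1704-1712 can be tied together in a minimal sketch. It assumes that the discount amount and redemption rate are multiplied, that the elasticity is a single ratio (percent demand increase per percent effective price decrease) rather than the output of an elasticity model, and that the promotion impact is the product of the elasticity and the relative price change. The class and function names, the 80% redemption rate, and the elasticity of 0.4 are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Promotion:
    regular_price: float
    promotion_price: float
    redemption_rate: float  # fraction of unit sales receiving the promo price

def discount_amount(promo):
    # Step 1704: difference between the regular and promotion prices.
    return promo.regular_price - promo.promotion_price

def price_change(promo):
    # Steps 1706-1708: combine the discount amount and redemption rate.
    return discount_amount(promo) * promo.redemption_rate

def promotion_impact(promo, elasticity):
    # Steps 1710-1712: elasticity times the relative effective price change.
    relative_change = price_change(promo) / promo.regular_price
    return elasticity * relative_change

promo = Promotion(regular_price=20.0, promotion_price=17.0, redemption_rate=0.8)
impact = promotion_impact(promo, elasticity=0.4)
```

Under these assumptions, the $3 discount becomes an effective price change of $2.40, or 12% of the regular price, and the resulting fractional increase in demand is approximately 0.048 (4.8%).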

[0216] In example aspects, sets of potentially overlapping collections of similar items are generated for a given base item. From the similar items, data may be pooled that can be used to train a forecasting model for the given base item. For example, in a collection of six items denoted item A, item B, item D, item E, item F, and item G, the most similar items to item A may include a cluster of item A, item D, item B, and item F. The most similar items to item B may include item B, item D, item F, and item G. Similarly, other groups of items may be formed for others of the items. When building a model for item A, data may be pooled from its group of similar items. Such data pooling may provide more data when training forecasting models, which may result in more accurate and precise forecasting models.
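
The pooling described above can be illustrated with a minimal sketch. It assumes that sales history is stored as (week, units) pairs per item. The sales figures are hypothetical, and the similar-item groups follow the item A and item B example above.

```python
def pool_training_data(base_item, similar_items, sales_history):
    """Collect (item, week, units) rows from every item similar to base_item."""
    pooled = []
    for item in similar_items[base_item]:
        for week, units in sales_history.get(item, []):
            pooled.append((item, week, units))
    return pooled

# Overlapping groups: items D and F appear in both lists.
similar_items = {"A": ["A", "D", "B", "F"], "B": ["B", "D", "F", "G"]}
sales_history = {
    "A": [(1, 12), (2, 15)],
    "B": [(1, 7), (2, 9)],
    "D": [(1, 20), (2, 18)],
    "F": [(1, 5)],
    "G": [(1, 3), (2, 4)],
}
rows_for_a = pool_training_data("A", similar_items, sales_history)
```

In this sketch, the model for item A would be trained on seven rows rather than the two rows available for item A alone.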

[0217] FIG. 18 illustrates an example architecture of a similar item data pooling system 1802. In the example shown, the model training system 208 may use data of the similar item data pooling system 1802 as part of training forecasting models 222. However, the similar item data pooling system 1802 need not be part of the model training system 208. In some embodiments, the similar item data pooling system may determine a list of similar items for each item of a catalog, or for at least some items of an item catalog. In the example shown, the similar item data pooling system 1802 includes an item data extractor 1804, an embeddings generator 1806, a search and ranking system 1808, a similar item aggregator 1810, and item similarity data 1812. The similar item data pooling system 1802 may include more or fewer components than those illustrated in the example of FIG. 18.

[0218] The item data extractor 1804 may receive item data 108 and extract data from the item data 108. For example, the item data extractor 1804 may select item data that is to be used as part of generating item embeddings. In some embodiments, the item data extractor 1804 may select, for an item, an item title, description, categorization within an item catalog hierarchy, image, sales history, review, location availability, or other data.

[0219] The embeddings generator 1806 may generate embeddings associated with items. In some embodiments, the embeddings generator 1806 may include one or more machine learning models that generate embeddings. In some embodiments, the machine learning models may be transformer-based models. In some embodiments, the embeddings generator 1806 may generate embeddings based on one or more of text or image data. For example, for each item, the embeddings generator 1806 may generate embeddings for the item by inputting data extracted by the item data extractor 1804 into a machine learning model.

[0220] The search and ranking system 1808 may determine a group of similar items for an item and rank the similar items. In some embodiments, the search and ranking system 1808 determines similar items by searching for embeddings that are similar to embeddings generated for an item. In some embodiments, the search and ranking system 1808 may use a K-nearest neighbors algorithm to determine similar items. In some embodiments, similar items may be ranked based on embedding distances or similarities.

[0221] The similar item aggregator 1810 may create lists of similar items for each item of a plurality of items. In some embodiments, the lists of similar items may be ranked. In some embodiments, the similar item aggregator 1810 may store the lists of similar items as part of the item similarity data 1812, which may be used by the model training system 208 as part of training forecasting models 222.

[0222] FIG. 19 is a flowchart of an example method 1900 that may be part of training forecasting models using similar item data. Although described as being performed by the similar item data pooling system 1802 and the model training system 208, one or more other components may perform aspects of the method 1900.

[0223] In the example shown, the similar item data pooling system 1802 may generate a list of similar items for an item. In some embodiments, the similar item data pooling system 1802 may, for each item of an item catalog, generate a list of similar items (step 1902). In the example shown, generating a list of similar items for an item may include a plurality of steps. In some embodiments, the similar item data pooling system 1802 may use components of FIG. 18 as part of generating lists of similar items.

[0224] In the example shown, the similar item data pooling system 1802 may generate item embeddings for items (step 1904). In some embodiments, the similar item data pooling system 1802 may use the embeddings generator 1806. In some embodiments, the similar item data pooling system 1802 may determine whether embeddings have already been generated for a token for which embeddings are to be generated. A token may be, for example, a title, sentence, word, or group of words. If so, the similar item data pooling system 1802 may use the embeddings that were already generated. In some embodiments, embeddings for an item may be generated as new items are added to an item catalog.

[0225] In the example shown, the similar item data pooling system 1802 may search for similar embeddings for items (step 1906). For example, for each item, the similar item data pooling system 1802 may apply the search and ranking system 1808 to search for similar items. In some embodiments, the similar item data pooling system 1802 may, for a given item, limit the universe of items for which similar items are searched. For example, the similar item data pooling system 1802 may search for items belonging to a same group as the item. Other search optimizations are likewise possible.

[0226] In the example shown, the similar item data pooling system 1802 may rank similar items for the items (step 1908). For example, for a set of similar items to an item, the set of similar items may be ranked based on similarity to the item. In some embodiments, the ranking is performed by using a cosine similarity or a Euclidean distance between embeddings for the item and embeddings for the similar items.
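
As a minimal, non-limiting sketch of the search and ranking of steps 1906 and 1908, the following Python example ranks candidate items by cosine similarity to a query item and keeps the top k. The function names, the item identifiers, and the two-dimensional embeddings are assumptions made for illustration; an actual embodiment may use a different similarity measure, an approximate nearest neighbor index, or much larger embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b)


def rank_similar_items(item_id, embeddings, k=3):
    """Return the k items most similar to item_id, most similar first."""
    query = embeddings[item_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != item_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings keyed by hypothetical item identifiers.
embeddings = {
    "item_a": [1.0, 0.0],
    "item_b": [0.9, 0.1],
    "item_c": [0.0, 1.0],
    "item_d": [0.5, 0.5],
}
ranked = rank_similar_items("item_a", embeddings, k=2)
```

In this sketch the search is exhaustive over every other item; limiting the candidates to items in a same group, as described above, would only change which entries of `embeddings` are scanned.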

[0227] In the example shown, the model training system 208 may select an item (step 1910). For example, the model training system 208 may select an item for which a forecasting model of the forecasting models 222 is to be trained.

[0228] In the example shown, the model training system 208 may determine similar items for the selected item (step 1912). For example, the model training system 208 may select a list of similar items generated by the similar item data pooling system 1802.

[0229] In the example shown, the model training system 208 may determine whether to train a forecasting model for the selected item using similar item data (step 1914). In some embodiments, the model training system 208 may determine whether there is historical sales data associated with the selected item to train the forecasting model. If not, the model training system 208 may elect to use similar item data as part of training the forecasting model. In some embodiments, the model training system 208 may determine how many similar items to use as part of training the forecasting model. For example, the model training system 208 may elect to use three similar items, and then the model training system 208 may select a top three similar items from the list of similar items to the selected item. In some embodiments, the more historical sales data that is available for the selected item, the fewer similar items and the less similar item data may be used as part of training the forecasting model for the selected item.
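
One way the decision of step 1914 could be sketched is shown below. The linear schedule, the 52-week cutoff, and the names `similar_item_count` and `select_top_similar` are hypothetical choices for illustration only; the disclosure does not prescribe a particular rule for how the number of similar items decreases as history accumulates.

```python
import math


def similar_item_count(weeks_of_history, max_similar=3, full_history_weeks=52):
    """Number of similar items to pool, shrinking as the item's own history grows.

    Assumed rule: use max_similar items with no history, zero items once
    full_history_weeks of history exist, and a linear taper in between.
    """
    if weeks_of_history <= 0:
        return max_similar
    if weeks_of_history >= full_history_weeks:
        return 0  # Corresponds to the NO branch: train without similar item data.
    remaining_fraction = 1.0 - weeks_of_history / full_history_weeks
    return math.ceil(max_similar * remaining_fraction)


def select_top_similar(ranked_similar_items, weeks_of_history):
    """Take the top-ranked similar items allowed by the schedule above."""
    count = similar_item_count(weeks_of_history)
    return ranked_similar_items[:count]


ranked_list = ["item_b", "item_d", "item_c"]  # Hypothetical ranked list.
new_item_choice = select_top_similar(ranked_list, weeks_of_history=0)
maturing_item_choice = select_top_similar(ranked_list, weeks_of_history=39)
```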

[0230] In response to determining to not train the model using similar item data (e.g., taking the NO branch), the model training system 208 may proceed to train the forecasting model for the selected item without similar item data (step 1916). In response to determining to train the model using similar item data (e.g., taking the YES branch), the model training system 208 may proceed to select similar item data (step 1918).

[0231] In the example shown, the model training system 208 may select similar item data (step 1918). For example, the model training system 208 may select sales data for a determined number of similar items to the selected item. In some embodiments, the model training system 208 may select sales data for the top-ranked similar items. In some embodiments, however, the model training system 208 may select sales data for certain similar items. For example, the model training system 208 may first determine whether a given similar item itself has sufficient sales data. If not, the model training system 208 may not select sales data associated with the given similar item. In some embodiments, the model training system 208 may select similar items that are sufficiently different from one another. As such, the model training system 208 may select sales data that corresponds to different aspects of the selected item.
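
The selection of step 1918 may be illustrated, under stated assumptions, by a greedy pass over the ranked list that skips similar items with too little sales data and skips items that are nearly identical to an item already chosen. The thresholds `min_observations` and `max_pair_similarity`, the function name, and the sample data are all hypothetical.

```python
def select_similar_item_data(ranked_items, sales_history, pair_similarity, n,
                             min_observations=8, max_pair_similarity=0.95):
    """Pick up to n similar items that have enough data and differ from each other."""
    selected = []
    for item in ranked_items:
        # Skip similar items that themselves lack sufficient sales data.
        if len(sales_history.get(item, [])) < min_observations:
            continue
        # Skip items that are near-duplicates of an already selected item.
        if any(pair_similarity(item, chosen) > max_pair_similarity
               for chosen in selected):
            continue
        selected.append(item)
        if len(selected) == n:
            break
    return {item: sales_history[item] for item in selected}


# Hypothetical inputs: a ranked list, weekly unit sales, and pairwise similarities.
ranked_items = ["item_b", "item_c", "item_d", "item_e"]
sales_history = {
    "item_b": [5] * 10,
    "item_c": [4] * 10,
    "item_d": [3] * 2,   # Too little history to be useful.
    "item_e": [6] * 12,
}
similarities = {frozenset(("item_b", "item_c")): 0.99}


def pair_similarity(first, second):
    return similarities.get(frozenset((first, second)), 0.0)


pooled = select_similar_item_data(ranked_items, sales_history, pair_similarity, n=2)
```

Here `item_c` is passed over because it is almost identical to `item_b`, and `item_d` is passed over because it has only two observations, so the pooled data comes from `item_b` and `item_e`.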

[0232] In the example shown, the model training system 208 may train a forecasting model for the selected item by using the selected similar item data (step 1920). In some embodiments, the model training system 208 may use a combination of sales data associated with the selected item and sales data of the similar items. In some embodiments, operations of the method 1900 may be repeated. For example, the model training system 208 may repeat the steps 1910-1920 as part of retraining a forecasting model. In some embodiments, as a selected item accumulates more historical sales data as time passes, the model training system 208 may use progressively less data from similar items as part of training a forecasting model for the selected item.

[0233] FIG. 20 illustrates an example network environment in which a scoring server 2002 may be implemented. In example aspects, the forecasting system 102 includes a forecasting API that enables an external program to access the forecasting models 222. In some embodiments, the forecasting API is available on demand and implemented using the scoring server 2002, such that an external program can call the forecasting API to receive a demand forecast for an item or a group of items. In some embodiments, the forecasting API allows an external program to use the forecasting models 222 to generate scenario-based demand forecasts. For example, by using the forecasting API, a user may change one or more of a price, time frame, availability, assortment, etc., of an item or group of items and generate a demand forecast in view of the change. The forecasting API may be an example of the forecast consumer interfaces 246. Example aspects of the scoring server 2002 that may include the forecasting API are illustrated in the example of FIG. 20.

[0234] The scoring server 2002 includes a web application 2004, REST endpoints 2006, a scoring service 2007, and a forecast scoring library 2008. The scoring server 2002 may also include or be communicatively coupled with a model database 2010 and a feature database 2012. The scoring server 2002 may provide both a web application and HTTP/JSON endpoints associated with a forecasting API. The web application 2004 may be an application that allows users to directly query the scoring server 2002. The REST endpoints 2006 may parse requests, such as JSON requests, and return results provided by the scoring service 2007 and the forecast scoring library 2008. The web application 2004 and the REST endpoints 2006 may receive requests 2013 for demand forecasts and return results to a sender of the requests 2013. The requests 2013 may correspond to inputs of a user via the graphical user interface 248.

[0235] The scoring service 2007 may receive requests from the web application 2004 and the REST endpoints 2006 using a unified API for generating forecasts for various granularities of item, location, and time combinations. In some embodiments, the scoring service 2007 may route a request for a demand forecast for an item to an API corresponding to the item from among a plurality of APIs exposed by the forecast scoring library 2008. The forecast scoring library 2008 may generate demand forecasts responsive to the requests received from the scoring service 2007. In some embodiments, the forecast scoring library 2008 retrieves a model from the model database 2010 and features from the feature database 2012 corresponding to an item for which demand is to be forecasted responsive to the request 2013. In some embodiments, the forecast scoring library 2008 includes a plurality of model-specific APIs that are called by the scoring service 2007, or another application, for generating item-specific demand forecasts.

[0236] The model database 2010 may store data corresponding to the forecasting models 222. For example, the model database 2010 may store, for each model, parameters, learned weights, and data corresponding to features for the model. The feature database 2012 may include features of the forecasting models 222. For example, for each model stored in the model database 2010, the feature database 2012 may store features for the model. The features may include an identification of features used by the forecasting model and values for the features. In some embodiments, the request 2013 may override one or more of the feature values stored in the feature database 2012.
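
The override behavior described above may be sketched as follows. The JSON field names (`item_id`, `feature_overrides`), the in-memory dictionary standing in for the feature database 2012, and the feature names are assumptions for illustration and are not defined by the disclosure.

```python
import json

# Hypothetical stand-in for stored feature values, keyed by item identifier.
FEATURE_DB = {
    "item_123": {"price": 4.99, "on_promotion": 0, "store_count": 120},
}


def resolve_features(request_json, feature_db):
    """Merge stored feature values with any overrides carried by the request."""
    request = json.loads(request_json)
    stored = feature_db[request["item_id"]]
    overrides = request.get("feature_overrides", {})
    # Reject overrides for features the item's model does not use.
    unknown = set(overrides) - set(stored)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    # Build a new dict so the stored values are left unchanged.
    return {**stored, **overrides}


# A scenario request that lowers the price while keeping other features as stored.
request_json = json.dumps(
    {"item_id": "item_123", "feature_overrides": {"price": 3.99}}
)
features = resolve_features(request_json, FEATURE_DB)
```

Returning a merged copy, rather than writing the override back, keeps a scenario-based request from altering the values used by later requests.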

[0237] In some embodiments, the forecasting API enables a user to remove an item from an assortment and to generate demand forecasts for remaining items in the assortment. As such, the forecasting API may enable a user to predict a percentage of demand that is transferred to other items, a percentage of demand that is lost, and an identity of items to which demand is transferred. In some embodiments, to support the capability to forecast demand transfers in response to interactions with the forecasting API, the forecasting system 102 may include a demand transfer engine. Example aspects of a demand transfer engine 2100, which may be part of the forecasting system 102, are illustrated in the example of FIG. 21.

[0238] The demand transfer engine 2100 may measure the relative portion of demand that can transfer to other items when an item is removed from the assortment. In some embodiments, given a list of items and data from customers buying those items, the demand transfer engine 2100 may generate a graph with nodes representing items and the edge weights capturing the degree of substitutability between pairs of items. Larger edge weights may indicate that a pair of items is more likely to be substitutable than pairs of items with smaller edge weights. In some embodiments, the demand transfer engine 2100 uses scores between pairs of items to determine groups of substitutable items. These groups of substitutable items may be used to add a new or orphan item to the existing graph of substitution groups, such as by matching an attribute of a new or orphan item to an attribute determined for a group of substitutable items. Additionally, the demand transfer engine 2100 may use a substitution graph to generate demand transfer coefficients for pairs of items, thereby converting the undirected graph to a directed graph. In the example of FIG. 21, the demand transfer engine 2100 includes training data sets 2102, a demand transfer compute layer 2104, and demand transfer APIs 2106. In some embodiments, the AI system 252 queries the demand transfer engine 2100 as part of generating a response to a user query.

[0239] The training data sets 2102 include data that may be used by the demand transfer engine 2100 to, among other things, determine demand transfers among items, such as by generating substitutability graphs. In some embodiments, data of the training data sets 2102 may be retrieved from one or more of the item data 108 or the sales data 110. In the example shown, the training data sets 2102 include item lists 2108, customer transaction data 2110, and item attribute data 2112. The item lists 2108 may include lists of items for which the demand transfer engine 2100 determines demand transfer values, such as by generating a substitutability graph for items in the list of items. The customer transaction data 2110 may include sales data, such as from the sales data 110, corresponding to items in the item lists 2108. The item attribute data 2112 may include attributes, such as text attributes, image attributes, item metadata, or other data corresponding to items in the list of items.

[0240] The demand transfer compute layer 2104 includes operations that may be performed by the demand transfer engine 2100. In some embodiments, the demand transfer engine 2100 may perform one or more of the steps 2114, 2116, or 2118 for items in a list of items.

[0241] In the example shown, the demand transfer engine 2100 may determine association scores (step 2114). For example, the demand transfer engine 2100 may determine association scores for pairs of items in an item list. In some embodiments, an association score corresponds with a similarity score. There may be various techniques for determining an association score between a first item and a second item. As one example, the demand transfer engine 2100 may divide a probability that a customer bought both the first item and the second item by the product of the probability that a customer bought the first item and the probability that a customer bought the second item. As another example, the demand transfer engine 2100 may determine a Jaccard similarity between the first item and the second item.
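
As an illustrative sketch of step 2114, the two example scores may be computed from per-customer purchase sets as shown below. The sketch assumes that the first example divides the joint purchase probability by the product of the two individual purchase probabilities (a ratio sometimes called lift), and it treats each customer as the set of items that customer bought. The function name and the sample data are hypothetical.

```python
def association_scores(purchases, first, second):
    """Return (probability ratio, Jaccard similarity) for a pair of items.

    purchases is a list of sets, one set of purchased items per customer.
    """
    n = len(purchases)
    with_first = sum(1 for items in purchases if first in items)
    with_second = sum(1 for items in purchases if second in items)
    with_both = sum(1 for items in purchases if first in items and second in items)
    with_either = with_first + with_second - with_both

    # P(first and second) / (P(first) * P(second)); 0.0 if either item never sold.
    if with_first and with_second:
        ratio = (with_both / n) / ((with_first / n) * (with_second / n))
    else:
        ratio = 0.0

    # Customers who bought both, divided by customers who bought at least one.
    jaccard = with_both / with_either if with_either else 0.0
    return ratio, jaccard


# Hypothetical purchase sets for four customers.
purchases = [{"item_a", "item_b"}, {"item_a", "item_b"}, {"item_a"}, {"item_c"}]
ratio, jaccard = association_scores(purchases, "item_a", "item_b")
```

A ratio above one indicates that the two items appear together more often than independence would predict, which is consistent with treating scores at or below one as carrying no demand transfer.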

[0242] In the example shown, the demand transfer engine 2100 may evaluate demand transfer values (step 2116). For example, the demand transfer engine 2100 may evaluate a plurality of association scores determined at the step 2114. In some embodiments, evaluating the demand transfer values includes normalizing edge weights in the substitutability graph. In some embodiments, the normalized demand transfer is calculated by dividing the edge weight between a pair of items by the sum of all scores between pairs of items in the same group. Additionally, evaluating the demand transfer values may include determining item groups. For example, if an association score is less than or equal to one, then the demand transfer value may be set to zero. The graph 2200 illustrates an example substitutability graph with example association scores and groups of items.
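
The normalization of step 2116 may be sketched as follows. The sketch assumes that scores at or below one are zeroed before normalization, that only pairs within a same group are retained, and that group membership is already known; the function name, the data layout, and the sample scores are hypothetical.

```python
from collections import defaultdict


def normalize_demand_transfer(edge_weights, group_of, min_score=1.0):
    """Normalize pair scores by the total score of their substitution group.

    edge_weights maps (item, item) tuples to association scores, and
    group_of maps each item to its group identifier.
    """
    kept = {}
    for (first, second), score in edge_weights.items():
        if group_of[first] != group_of[second]:
            continue  # Only pairs within a same group are normalized.
        # Scores at or below the threshold carry no demand transfer.
        kept[(first, second)] = score if score > min_score else 0.0

    group_totals = defaultdict(float)
    for (first, _), weight in kept.items():
        group_totals[group_of[first]] += weight

    normalized = {}
    for pair, weight in kept.items():
        total = group_totals[group_of[pair[0]]]
        normalized[pair] = weight / total if total else 0.0
    return normalized


# Hypothetical association scores and group assignments.
edge_weights = {
    ("item_a", "item_b"): 3.0,
    ("item_a", "item_c"): 1.0,  # At the threshold, so set to zero.
    ("item_b", "item_c"): 2.0,
    ("item_d", "item_e"): 4.0,
}
group_of = {"item_a": "g1", "item_b": "g1", "item_c": "g1",
            "item_d": "g2", "item_e": "g2"}
transfer = normalize_demand_transfer(edge_weights, group_of)
```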

[0243] In the example shown, the demand transfer engine 2100 may determine item to item relationships (step 2118). Among other things, this may include one or more of the following operations: determining substitutability groups in a similarity graph; determining attributes of items in substitutability graphs; assigning orphan items, such as items for which one or more association scores were unable to be determined, to substitution groups; and assigning new items to a group and imputing demand transfer values linking the new item to existing items in the substitutability graph.
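
One simple way to assign a new or orphan item to a substitution group, offered only as a sketch, is to choose the group whose key attributes overlap most with the attributes of the item. The attribute sets, the group names, and the overlap-count rule are assumptions for illustration; other matching rules are likewise possible.

```python
def assign_to_group(item_attributes, group_key_attributes):
    """Return the group sharing the most attributes with the item, or None."""
    best_group, best_overlap = None, 0
    for group, key_attributes in group_key_attributes.items():
        overlap = len(set(item_attributes) & set(key_attributes))
        if overlap > best_overlap:
            best_group, best_overlap = group, overlap
    return best_group


# Hypothetical key attributes determined for two substitution groups.
group_key_attributes = {
    "sparkling_water": {"beverage", "carbonated", "unsweetened"},
    "cola": {"beverage", "carbonated", "sweetened"},
}
new_item = {"beverage", "carbonated", "sweetened", "12oz"}
assigned_group = assign_to_group(new_item, group_key_attributes)
```

An item that shares no attribute with any group is left unassigned in this sketch, which leaves room for a separate fallback rule.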

[0244] The APIs 2106 may define communication interfaces between the demand transfer engine 2100 and other components, such as other components of the forecasting system 102. In the example shown, the APIs 2106 include a query API 2120, which enables a calling application to determine demand transfer values for a particular item, or for a particular pair of items. The APIs 2106 may include a demand transfer serve layer 2122, which may enable a calling application to determine demand transfer values for a collection of items. For example, the demand transfer serve layer 2122 may provide demand transfer values for items of an item list or for a pre-defined category of items.

[0245] FIG. 22 illustrates an example substitutability graph 2200 generated by the demand transfer engine 2100. The graph 2200 includes nodes representing items and edge weights between nodes that represent a degree of substitutability between pairs of items. In some embodiments, substitutability groups may be determined in the graph, and for the substitutability groups, key attributes may be defined. In some embodiments, the edge weights may be determined by analyzing historical sales data to determine probabilities that items were purchased together or substituted for one another. In some embodiments, the forecasting API enables a user to compare demand forecasts of different items and may enable a user to compare demand forecasts of different items in view of potential changes to data associated with the items.

[0246] In some instances, it may be challenging for a forecasting system to model certain types of hypothetical scenarios. For example, in the case of weather events or other seasonal effects, these may affect demand forecasts for certain items, but these scenarios may also affect other supply chain systems, and it can be challenging, in some instances, for a forecasting system to recognize and determine the effects of such scenarios on other supply chain systems.

[0247] In example aspects, an improved human-in-the-loop AI system for demand forecasting is disclosed. For example, the AI system may be configured to receive a user input associated with one or more of promotions, inventory levels, customized similar item groupings, changed future average base prices, long-term trends, assortment updates, business decisions, scenario planning, and other data that may affect demand. In some embodiments, the user input may correspond to a future time. Additionally, users may adjust future hypothetical scenarios by intervening with particular events, such as weather events, competitor store openings or closings, sentiment analysis on trending items, and the like. As an example, a user may input one or more of a selection of a class or modification of future store counts, base prices, or adjustments in pricing, and the AI system may respond with a demand forecast that may be impacted by the user input.

[0248] In some embodiments, the AI system may assist humans when making inventory-related decisions. The AI system may generate recommendations to the user based on information of which the user may not be aware, or based on secondary or tertiary effects of the user's proposed action that the user may not appreciate. For example, when a user submits an inventory override, the user may first provide the proposed override to the AI system, which can generate a recommendation, such as to alter the override or to take another action. The AI system may use the forecasting models 222 as part of generating recommendations.

[0249] The AI system may be communicatively coupled with many different systems. For example, the AI system may be configured to communicate with different databases and platforms that may communicate using different formats. Therefore, the AI system may have access to data that a human user may not have access to, or may not have the skills to access. Therefore, the AI system may detect issues or opportunities that would not be apparent to a user. Moreover, the AI system, unlike a human, has the memory and processing capability to assess information from the different systems when evaluating a decision in a manner that is distinct from analogous human capability. Additionally, the AI system may operate in a loop with a human user in which the human inputs improve outputs of the AI system, and the outputs of the AI system are used to make inventory decisions. The results of the inventory decisions are then used to improve the AI system.

[0250] As example applications, the AI system could be used in connection with one or more of the following applications: scenario planning; assortment planning and cannibalization assessment; promotion planning; generating inventory recommendations based on external events, such as weather, store closures, or sporting events; integrating with a supply chain simulator; generating confidence levels for supply and demand; sentiment analysis; parameter tuning; guest behavior analysis; vendor negotiation; and others.

[0251] One application of the AI system is as part of assortment planning. The AI system may serve as a middle layer between a user, an assortment planning tool, and other systems, such as systems of the forecasting system 102. For example, the AI system may generate scenarios for different item assortments. Depending on items in the assortment, different demand forecasts are generated for different items. A user may also adjust prices as part of generating scenarios. The AI system may make simultaneous calls, for example, to the forecasting models 222, the demand transfer engine 2100, or to other systems. Additionally, another example application of the AI system is as part of new item assessments. Forecasting new items may include using the forecasting models 222 to forecast demand for similar items, and the AI system may assist with selecting these similar items.

[0252] FIG. 23 illustrates an example architecture of the AI system 252. The AI system 252 may be used to perform some of the operations described herein. For example, the AI system 252 may enable one or more of the features described in connection with the use cases of the AI system 252 further described below. In the example shown, the AI system 252 includes a user interface 2302, an AI API 2304, a rules engine 2305, AI models 2306, and forecasting models 222.

[0253] The user interface 2302 may receive the user input 2301. In some embodiments, the user interface 2302 may be part of the forecasting system 102, such as the GUI 248. In some embodiments, the user interface 2302 may be part of an application that is communicatively coupled with the forecasting system 102. In some embodiments, the user interface 2302 may be a plurality of different user interfaces that are configured to interact with the AI API 2304.

[0254] The AI API 2304 may be an application programming interface for interacting with the AI system 252. In some embodiments, the AI API 2304 is configured to receive an input corresponding to one or more of the use cases described below. In some embodiments, the AI API 2304 may be configured to receive data corresponding to a user input associated with an action, hypothetical scenario, event, or other occurrence that may affect a retail enterprise. Moreover, the AI API 2304 is configured to receive parameters or other data associated with such a user input.

[0255] The rules engine 2305 may be a middle layer that receives data from the AI API 2304 and that determines one or more models to be used as part of responding to the user input. For example, the rules engine 2305 may select one or more models of the AI models 2306 or the forecasting models 222, and the rules engine 2305 may provide data associated with the user input to the one or more selected models. In some embodiments, the rules engine 2305 may select a model based at least in part on a configuration of the user interface 2302 or a selection in the user interface 2302. In some embodiments, the rules engine 2305, or another component, may facilitate repeated data exchanges with a plurality of AI models 2306 or forecasting models 222 as part of generating the output 2307 to respond to the user input.
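
The model selection performed by the rules engine 2305 may be sketched, in a simplified form, as a lookup from a scenario type to the models that should receive the user input. The scenario type names, the registry contents, the default model, and the stub model callables are hypothetical and serve only to show the routing pattern.

```python
# Hypothetical mapping from a scenario type to the models to be queried.
MODEL_REGISTRY = {
    "promotion": ["promotion_forecasting_model", "demand_forecasting_model"],
    "weather_event": ["external_effects_model", "demand_forecasting_model"],
    "assortment_change": ["demand_transfer_engine", "demand_forecasting_model"],
}
DEFAULT_MODELS = ["demand_forecasting_model"]


def select_models(user_input, registry=MODEL_REGISTRY):
    """Choose model names based on the scenario type carried by the user input."""
    return list(registry.get(user_input.get("scenario_type"), DEFAULT_MODELS))


def respond(user_input, models):
    """Pass the user input to each selected model and collect the outputs."""
    return {name: models[name](user_input) for name in select_models(user_input)}


# Stub callables standing in for trained models.
stub_models = {
    "promotion_forecasting_model": lambda data: "promotion forecast",
    "external_effects_model": lambda data: "external effects estimate",
    "demand_transfer_engine": lambda data: "demand transfer values",
    "demand_forecasting_model": lambda data: "demand forecast",
}
output = respond({"scenario_type": "weather_event", "event": "hurricane"}, stub_models)
```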

[0256] The AI models 2306 may include one or more models that can be used as part of generating a response to the user input 2301. In some embodiments, the AI models 2306 include different models that are trained to process different types of data or to process data belonging to different domains.

[0257] In an example, the AI models 2306 include a supply chain forecasting model. The supply chain forecasting model may be used to generate predicted states of a supply chain for a multi-location organization, such as a large retail organization. For example, the supply chain forecasting model may generate a simulated forecast of a supply chain state based on replenishment assumptions into a supply chain. An example model is described in U.S. Patent Pub. No. 2021/0334829, entitled User Interface for Visualizing Output from Supply Chain Replenishment Simulation, the disclosure of which is hereby incorporated by reference in its entirety.

[0258] In an example, the AI models 2306 include a promotion forecasting model. The promotion forecasting model may optimize targeted promotional programs for items based on past promotions and performance. Examples of such a promotional optimization model are described in U.S. Pat. No. 11,403,668, entitled Multitask Transfer Learning for Optimization of Targeted Promotional Programs, the disclosure of which is hereby incorporated by reference in its entirety.

[0259] In an example, the AI models 2306 include external effects models. The external effects models may include one or more models indicating effects on demand or on supply chain operations induced by external effects. For example, various external effects may be introduced by such a model, such as goods shortages, weather condition effects, seasonality models, and the like. Such models may be separate from the above forecasting models or integrated therein, for example as described in U.S. Pat. No. 11,182,808, mentioned above.

[0260] In some embodiments, there may be various use cases for the AI system 252. Some example use cases are set forth below.

[0261] In example aspects, the AI system 252 may interact with a user to generate one or more types of forecasts or recommendations for scenario planning purposes. Accordingly, in some example implementations, a user input may be introduced as part of a scenario planning process, which applies different sets of business rules and desired outcomes to model particular planning processes. This may allow for planning of promotions in future time periods, and allow users to change the timing or extent of promotions in those time periods, while enabling the AI system 252 to accommodate the user adjustments to timing or promotion detail to achieve the desired outcomes. Additionally, user inputs may be received into a modeling process that may adjust baseline assumptions made by the model. Example assumptions include a future average base price of an item, or a long-term trend at an item or class level for sales of such items. Additionally, a user may provide inputs that change underlying assumptions such as a store count for an item (e.g., the number or selection of stores that are designated to carry a given item).

[0262] In example aspects, user inputs may be introduced to define an impact of anticipated assortment planning changes on forecasts for individual items. For example, in a default scenario, forecasted sales of an item assume that the item will be carried forward in future planning periods. User inputs may indicate that another item is planned to be sold during a future time, which may cannibalize sales of the item. In some embodiments, the AI system 252 may detect the cannibalization effect, predict a degree of the cannibalization, and provide a recommendation. As another example, a plan may be in place to reduce footprint for a given item within physical retail locations, or other changes may be anticipated that could change item demand but which are not well accounted for by existing forecasting models. An example modeling process that may receive user inputs to address anticipated changes in assortment plans and that may be utilized by the AI system 252 is described in U.S. patent application Ser. No. 18/100,257, entitled Incremental Value Assessment Tool and User Interface, the disclosure of which is hereby incorporated by reference in its entirety.

[0263] In example aspects, a modeling system for planning promotional activities by a retail enterprise may be improved by receiving user inputs before a modeling process occurs. The user inputs may be used to modify model output based on one-time business circumstances that are in place at the time of the anticipated promotion. For example, a user may provide input regarding potential cross-promotion impact or cross-elasticity impact on sales of an item, for example if a competitor retail organization is currently promoting or will promote that same item for sale. Such competitive promotional activity may affect item demand in a way not otherwise readily captured by model activities. In some embodiments, the AI system 252 may ingest data relating to such planned promotional activities and determine what impact, if any, such activities may have on other processes.

[0264] In a still further example, item demand forecasts or sales forecasts may be adjusted based on user inputs indicating changes in conditions at one or more physical retail locations, or within an overall retail supply chain network. For example, weather events (such as hurricanes, floods, or wildfires), major sporting events, or other unanticipated events (e.g., limited time store closures and the like) may be ingested by the AI system 252, such that modeling may automatically reflect changes in item demand in response to such events. For example, in the case of user inputs identifying a potential hurricane event, the AI system 252 may determine that demand for water or nonperishable goods may increase in the days prior to the event at stores within a predetermined distance from a forecasted event location.

[0265] Additionally, in response to user input regarding such an event, changes to overall supply chain response may be determined by the AI system 252, which may indicate a time of reallocation of items to areas in need (e.g., where demand may be greatly increased). In other examples, events such as store closures of competitor stores may change overall demand for items at a given store. For example, a competitor store going out of business that is located in geographic proximity to one or more retail locations of a retail enterprise may result in at least short-term increases in demand for items carried by both the retail location and the closed competitor store. Similarly, competitor store openings may result in changes in demand. Such events may be ingested by the AI system 252, which may detect the likely downstream effects and provide one or more of a recommendation or demand forecast to a user that corresponds to the downstream effects. Such downstream effects may include decreases in demand for similarly situated stores, or increases in demand in the case of complementary stores (e.g., warehouse or club type stores, which may increase overall foot traffic at the retail location).

[0266] In example aspects, in addition to improved demand forecasting, user input prior to execution of models may provide improvements in the context of supply chain simulation. Such user inputs may enable what-if scenario simulations based on then-current supply chain states and supply chain constraints (e.g., vendor constraints, shipping constraints, back-room space at retail locations, and the like) to determine a real-life sales forecast based on both demand and realistic supply or stocking levels.

[0267] In the context of forecast demand, the forecasting system 102 may utilize user input to better indicate, on a customized item by item basis, confidence levels for individualized item forecasts. This is particularly useful in the case of new items, seasonal items, or items that otherwise have low sales history. In some cases, confidence in a forecast may be higher based on a high level of similarity between the item for which a forecast is generated (e.g., the new item) and another item for which more history is available, and therefore forecast confidence is greater. In other cases, confidence in a forecast may be lower based on the uniqueness of the item, the user perceived variability in potential outcomes in terms of sales demand, and the like. By allowing a user to individually provide input regarding confidence in a demand forecast, downstream systems may be able to automatically adjust for that variable confidence level by adjusting stocking levels, and the like.

[0268] In example aspects, user inputs to the AI system 252 may be allowed to define adjustments to demand based on sentiment analysis regarding individualized, trending items. For example, a user may be able to identify a particular item that is offered for sale which may be associated with a short-term trend as reflected online, for example in social media. A trend identification tool may be available to individual users, who may in turn identify items associated with a trend. The items associated with the trend may be adjusted in accordance with a response expected for that trend, for example adjustments to short-term demand. These adjustments may result in changes to stocking level forecasts, item level purchases of individual items, supply chain adjustments to accommodate additional capacity, and the like.

[0269] In a further example, the AI system 252 may be used to generate text recommendations to a user based on the user's actions. For example, if the user is creating a scenario, the user may input hypothetical data pertaining to the scenario. The forecasting system 102 may determine a demand forecast for the scenario and may also determine side effects or secondary effects, such as on other items or other aspects of a retail enterprise, related to the user's scenarios. In some embodiments, the AI system 252 generates a text recommendation that notifies the user of the demand forecast for the item and also of the other effects that the user's modification may have. To do so, in some embodiments, the AI system 252 uses a pre-trained large language model that is provided additional context when generating a response, where the additional context may include the demand forecast for the item and the side effects or secondary effects of the user's proposed change.
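By way of illustration only, the following sketch shows one way the additional context described above might be assembled into a prompt before it is passed to a language model. The function name `build_recommendation_prompt`, its parameters, and the prompt wording are hypothetical and are not taken from this disclosure; no particular language model interface is assumed, and the call to the model itself is omitted.

```python
from typing import Sequence


def build_recommendation_prompt(
    item: str,
    forecast_units: float,
    side_effects: Sequence[str],
) -> str:
    """Assemble scenario context into a single prompt string.

    All names and the prompt layout are illustrative assumptions.
    """
    lines = [
        "You are assisting a retail demand planner.",
        f"Scenario item: {item}",
        f"Forecast demand (units): {forecast_units}",
        "Secondary effects of the proposed change:",
    ]
    if side_effects:
        # One bullet per secondary effect supplied by the forecasting system.
        lines.extend(f"- {effect}" for effect in side_effects)
    else:
        lines.append("- none identified")
    lines.append("Write a short recommendation for the user.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        build_recommendation_prompt(
            "Betty's Bread",
            420,
            ["Demand for Arnold's Bread may increase"],
        )
    )
```

Keeping the context assembly separate from the model call is one possible design choice; it allows the same context to be logged, reviewed, or supplied to a different model.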

[0270] In some examples, user inputs may be received to define a segmentation of a customer population. In such a case, an optimal assortment may be modeled for each defined segment of the customer population, rather than across the entire collection of enterprise customers.

[0271] In further examples, user inputs may be utilized to either model vendor support for promotional programs, or to expose certain models to vendors for purposes of receiving user input from those vendors. Regarding modeling of vendor support, a user may provide information regarding the extent of funds a vendor may provide to be featured on promotional literature created and distributed by the retail enterprise. Such inputs may either be user entered or modeled based on historical data. Regarding exposure of models to vendors, in some instances, forecasting models may be exposed to a vendor via a portal or API. Each vendor may submit requests to the portal or API, and receive promotion forecasts, thereby allowing each vendor to determine potentially valuable promotions the vendor may be able to offer for its items via the retail enterprise.

[0272] In still further examples, beyond use of user inputs as affecting downstream AI models, it is recognized that a combination of upstream AI models and user inputs may be utilized. In one example, a downstream model used to forecast guest demand may be used, but an upstream ordering process may combine the downstream demand for goods with additional factors, such as spoilage for perishable items. For example, a separate algorithm to predict spoilage rates based on past history of spoilage on a per item basis (e.g., rates of spoilage of individual types of produce items) may be combined with downstream demands to determine an upstream ordering requirement so that an individual retail location may order an appropriate amount of items to fulfill customer demand.
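By way of illustration only, the combination of a demand forecast with a predicted spoilage rate might be sketched as follows. The function name `order_quantity`, its parameters, and the assumption that spoilage is a fixed fraction of the units ordered are hypothetical simplifications and are not taken from this disclosure.

```python
import math


def order_quantity(
    forecast_demand: float,
    spoilage_rate: float,
    on_hand: float = 0.0,
) -> int:
    """Units to order so that unspoiled units cover the forecast demand.

    Assumes (for illustration) that a fixed fraction `spoilage_rate` of
    ordered units spoils before sale and that on-hand units do not spoil.
    """
    if not 0.0 <= spoilage_rate < 1.0:
        raise ValueError("spoilage_rate must be in [0, 1)")
    # Demand not already covered by inventory at the location.
    uncovered = max(forecast_demand - on_hand, 0.0)
    # Inflate the order so that the surviving fraction meets the demand.
    return math.ceil(uncovered / (1.0 - spoilage_rate))


if __name__ == "__main__":
    # 75 units of demand with a 25% spoilage rate.
    print(order_quantity(75, 0.25))
```

Under these assumptions, a forecast of 75 units with a 25% spoilage rate yields an order of 100 units, since 75 of the 100 ordered units are expected to remain saleable.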

[0273] In further examples, an upstream assortment item recommendation process may be used to determine items to be carried within an overall online or in-store assortment. In example implementations, such a model may generate recommendations on items or item types to be carried within a given assortment. Such a model may generate recommendations based on existing enterprise data, and may determine trends in demand and correlations between demand and item attributes such that additional items may be identified as recommended to be carried within an item assortment. In other examples, received data, such as queries for recommendations or comments from enterprise customers may be included within training data, such that recommendations for new items to include in an item assortment may reflect current item assortment, trends in demand, and explicit customer feedback. An example of a system for generating recommendations for new items is described in U.S. Provisional Patent Application No. 63/507,779, and entitled System for Recommending Items and Item Designs Based on AI-Generated Images, the disclosure of which is hereby incorporated by reference in its entirety.

[0274] In still further examples, additional models may be used to modify demand forecasts based on display factors. In particular, a location of an item within a retail organization, or its prominence on a retail website, may affect the extent to which the item is purchased. In some examples, a model of display effects as to physical display changes (end cap, check lane displays, and the like) may be used to adjust individual item demand at a particular retail location based on the selected location of the item.

[0275] Still further, in some types or classes of items (e.g., apparel) there is a fixed inventory and that inventory is required to be allocated across a plurality of retail locations. In such instances, rather than modeling only demand for the item, the desired modeling would involve generation of recommendations regarding the specific retail locations at which the item should be stocked, a timeframe that the item can be expected to be carried at the retail location, as well as expected revenue and/or margin from sales of the item at that location. Such modeling may allow a retail enterprise to make better assortment decisions, because the limited shelf space at a particular location combined with the shelf space required to carry and display individualized items may be considered in determining the opportunity cost of the shelf space. This allows for better assortment decision-making.

[0276] FIG. 24 illustrates an example user interface 2400 that may be utilized to provide some level of user input into demand forecasting models. In some embodiments, the user interface 2400 is communicatively coupled with the AI system 252. The AI system 252 may, for example, process data or a request input via the user interface 2400 or generate data that is output via the user interface 2400. The user interface 2400 depicts a set of tabs 2401, including a tab that allows a user to view relationships among items or among item stocking factors (supply chain effects, seasonality, and the like), a tab allowing user adjustment of price response for individual items or for classes of items, and a tab (as displayed) showing promotion response analysis.

[0277] In the example of FIG. 24, the promotion response analysis tab is selected. The user interface 2400 displays options 2402 for selecting one or more of a location, item, and time. These options may be selected at different levels of granularity. As examples, for location, a particular store, group of stores, digital fulfillment channel, or supply chain-wide level may be selected. For item, a particular item or group of items may be selected. For time, a day, week, month, quarter, or year may be selected. In some embodiments, the AI system 252 will generate a response that is specific to the location-item-time selection input via the options 2402. Additionally, the user interface 2400 may include promotion options 2404, which may be used by a user to select settings for a hypothetical promotion. For example, the user may input a hypothetical elasticity of the selected item or items, and the AI system 252 or other systems may use the input elasticity as part of generating a response. Other examples of promotion options may include types of promotions and their corresponding features. For example, a promotion may be indicated to be a display promotion, in which the selected item is displayed on an end cap, or on a certain area of a website for a certain period of time. Other promotion types and corresponding features are likewise possible, as described above.

[0278] Upon selection, in this example, of a location-item-time via the options 2402 and a selection of promotion characteristics, a series of analyses may be performed and displayed. The user interface 2400 illustrates examples of such displays. For example, the user interface 2400 includes a region 2406 showing a promotion segmentation of target rating points across average unit volume for a plurality of items, a region 2408 showing sales lift for an item based on discounts across example promotions, and a region 2410 showing demand lift for a group of items based on a discount across example promotions (after all cross-item cannibalization of demand has been accounted for). In some embodiments, the demand transfer engine 2100 may be used as part of determining intra-group cannibalization illustrated by the region 2410. The key 2412 indicates example promotions. In some embodiments, one or more displays in the user interface 2400 may be updated in real time as the AI system 252 generates a response.

[0279] FIG. 25 illustrates an example block diagram of a virtual or physical computing system 2500. One or more aspects of the computing system 2500 can be used to implement the system and processes described herein.

[0280] In the embodiment shown, the computing system 2500 includes one or more processors 2502, a system memory 2508, and a system bus 2522 that couples the system memory 2508 to the one or more processors 2502. The system memory 2508 includes RAM (Random Access Memory) 2510 and ROM (Read-Only Memory) 2512. A basic input/output system that contains the basic routines that help to transfer information between elements within the computing system 2500, such as during startup, is stored in the ROM 2512. The computing system 2500 further includes a mass storage device 2514. The mass storage device 2514 is able to store software instructions and data. The one or more processors 2502 can be one or more central processing units or other processors.

[0281] The mass storage device 2514 is connected to the one or more processors 2502 through a mass storage controller (not shown) connected to the system bus 2522. The mass storage device 2514 and its associated computer-readable data storage media provide non-volatile, non-transitory storage for the computing system 2500. Although the description of computer-readable data storage media contained herein refers to a mass storage device, such as a hard disk or solid-state disk, it should be appreciated by those skilled in the art that computer-readable data storage media can be any available non-transitory, physical device or article of manufacture from which the computing system 2500 can read data and/or instructions.

[0282] Computer-readable data storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable software instructions, data structures, program modules or other data. Example types of computer-readable data storage media include, but are not limited to, RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROMs, DVD (Digital Versatile Discs), other optical storage media, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computing system 2500.

[0283] According to various embodiments of the invention, the computing system 2500 may operate in a networked environment using logical connections to remote network devices through the network 2501. The network 2501 is a computer network, such as an enterprise intranet and/or the Internet. The network 2501 can include a Local Area Network (LAN), a Wide Area Network (WAN), the Internet, wireless transmission mediums, wired transmission mediums, other networks, and combinations thereof. The computing system 2500 may connect to the network 2501 through a network interface unit 2504 connected to the system bus 2522. It should be appreciated that the network interface unit 2504 may also be utilized to connect to other types of networks and remote computing systems. The computing system 2500 also includes an input/output controller 2506 for receiving and processing input from a number of other devices, including a touch user interface display screen, or another type of input device. Similarly, the input/output controller 2506 may provide output to a touch user interface display screen or other type of output device.

[0284] As mentioned briefly above, the mass storage device 2514 and the RAM 2510 of the computing system 2500 can store software instructions and data. The software instructions include an operating system 2518 suitable for controlling the operation of the computing system 2500. The mass storage device 2514 and/or the RAM 2510 also store software instructions, that when executed by the one or more processors 2502, cause one or more of the systems, devices, or components described herein to provide functionality described herein. For example, the mass storage device 2514 and/or the RAM 2510 can store software instructions that, when executed by the one or more processors 2502, cause the computing system 2500 to receive and execute managing network access control and build system processes.

[0285] FIG. 26 illustrates an example user interface 2600. In some embodiments, the user interface 2600 is an example of a graphical user interface 248. In some embodiments, the AI system 252 may process data received or output via the user interface 2600. In the example shown, the user interface 2600 is displayed by the computing device 2601, which may be, for example, a demand forecast consumer or a device of the administrator 120.

[0286] In some embodiments, the user interface 2600 provides an interactive demand forecasting system that, among other things, displays demand forecasts for items and enables a user to override a demand forecast or generate hypothetical demand forecasts. In the example shown, the user interface includes item forecast data 2602, which includes a plurality of example items and associated attributes of the items. For example, the item forecast data 2602 may display data from the item data 108. Additionally, the item forecast data 2602 may display forecasts generated by the forecasting models 222. For example, for the item Arnold's Bread, data from a demand forecast from March 22 to April 5 is displayed. For example, the item forecast data 2602 displays a forecasted units per store per week (UPSPW) for each of the items. The user interface 2600 may include other data pertaining to demand forecasts, such as an indication of whether a demand forecast was generated for a particular store or group of stores, or a fulfillment channel for the forecasted demand. The user interface 2600 may also display historical sales data and other features used by the forecasting models 222 to generate demand forecasts. In the example shown, the user interface 2600 includes an add new item button and a generate forecast button. For example, the user may provide an input (examples of which are illustrated by the inputs 2604-2608) and may select the generate forecast button, which may cause a demand forecast to be generated based at least in part on the input.

[0287] In the example shown, a user may edit data of the user interface 2600 to generate updated demand forecasts. For example, the input 2604 depicts a user editing a Location Count for an example item. For example, the item Cindy's Bread may have been forecasted to be sold at 387 locations, but this feature may be altered via the user interface 2600 to generate an updated demand forecast. For example, the user may input more, fewer, or different locations at which to sell the item, and in response, the forecasting system 102 may generate an updated forecast, which may, in the example shown, result in a different UPSPW for Cindy's Bread. In some embodiments, the AI system 252 may receive data corresponding to the input 2604 and orchestrate operations of communicatively coupled systems to generate a response, which may include a recommendation based on the user's input. For example, in response to receiving data of the input 2604 that reduces a number of locations at which Cindy's Bread is sold, the AI system 252 may determine that such a reduction may reduce the supply of other items at the location, thereby necessitating large purchase orders for such items, or that such a reduction may affect the effectiveness of a promotion that requires a customer to purchase rolls. In some embodiments, the AI system 252 causes a display on the user interface 2600 informing the user of such downstream effects that may occur if the user were to alter the location count for Cindy's Bread. In a similar manner, the AI system 252 may generate recommendations in response to received data associated with the inputs 2606 and 2608.

[0288] In the example shown, the input 2606 illustrates a hypothetical modification of price for Betty's Bread. For example, the user may increase or decrease the price for the associated forecasting period, and a forecasting model associated with Betty's Bread may generate an updated demand forecast. Additionally, the AI system 252 may process and generate recommendations in response to the hypothetical modified price. For example, the AI system 252 may determine that another item, such as Arnold's Bread, is affected by the input 2606 adjusting the price of Betty's Bread. For example, if the price for Betty's Bread is increased, then the demand forecast for Arnold's Bread may likewise increase.
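By way of illustration only, the effect of a hypothetical price change on the edited item and on a substitute item might be sketched using a constant-elasticity relationship. The constant-elasticity form, the function name `adjust_demand`, and the numeric elasticities below are assumptions made for the example; this disclosure does not specify how the forecasting models respond to price.

```python
def adjust_demand(
    base_demand: float,
    old_price: float,
    new_price: float,
    elasticity: float,
) -> float:
    """Scale demand by (new_price / old_price) ** elasticity.

    A negative elasticity models an item's response to its own price.
    A positive cross-price elasticity models a substitute item whose
    demand rises when the other item's price rises.
    """
    if old_price <= 0 or new_price <= 0:
        raise ValueError("prices must be positive")
    return base_demand * (new_price / old_price) ** elasticity


if __name__ == "__main__":
    # Hypothetical price increase for Betty's Bread from 2.00 to 4.00.
    bettys = adjust_demand(100.0, 2.0, 4.0, elasticity=-1.0)
    # Hypothetical cross-price effect on the substitute Arnold's Bread.
    arnolds = adjust_demand(80.0, 2.0, 4.0, elasticity=1.0)
    print(bettys, arnolds)
```

In this sketch, the same price ratio is applied twice with different elasticities: once to the edited item and once to the item identified as affected.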

[0289] In the example shown, the input 2608 illustrates an override of a demand forecast. For example, a user may increase or decrease the demand forecast for Arnold's Bread. In some embodiments, an inventory management system communicatively coupled with the user interface 2600 may receive the edit to the demand forecast and may alter purchase orders accordingly. For example, if the demand forecast is increased, then the inventory management system may generate purchase orders so supply for Arnold's Bread is sufficient to meet the increased demand. In some embodiments, overriding the demand forecast may affect subsequent demand forecasts for the item. For example, if the demand forecast is increased, then subsequent demand forecasts may likewise be increased. Additionally, demand forecasts for other items may be impacted. For example, if the demand forecast is increased, then demand forecasts for complementary items may likewise be increased while demand forecasts for substitute items may be decreased. In some embodiments, when a user overrides the demand forecast, the override may not be committed to other components of the forecasting system or to an inventory management system until the user receives a response generated by the override via the user interface, and until the user subsequently commits the override, such as by selecting a button on the user interface.
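By way of illustration only, holding an override until the user commits it might be sketched as a two-step stage-and-commit pattern. The class `ForecastStore` and its methods are hypothetical names introduced for this example and do not correspond to components described in this disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ForecastStore:
    """Holds committed forecasts and overrides that are staged but not committed."""

    committed: Dict[str, float] = field(default_factory=dict)
    pending: Dict[str, float] = field(default_factory=dict)

    def stage_override(self, item: str, value: float) -> Dict[str, Optional[float]]:
        """Record a proposed override and return a preview for the user."""
        self.pending[item] = value
        return {"current": self.committed.get(item), "proposed": value}

    def commit(self, item: str) -> None:
        """Apply a staged override; downstream systems read only `committed`."""
        self.committed[item] = self.pending.pop(item)

    def discard(self, item: str) -> None:
        """Drop a staged override without changing the committed forecast."""
        self.pending.pop(item, None)


if __name__ == "__main__":
    store = ForecastStore(committed={"Arnold's Bread": 120.0})
    preview = store.stage_override("Arnold's Bread", 150.0)
    print(preview)            # shown to the user before committing
    store.commit("Arnold's Bread")
    print(store.committed)
```

In this sketch, an inventory management system would consult only the committed values, so a staged override has no downstream effect until `commit` is called.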

[0290] FIG. 27 illustrates an example of the user interface 2600. In the example shown, the user selects the add new item button to add a new item to the item forecast data 2602. For example, the new item may be a new item of an item catalog or an existing item of a catalog to be added to the item forecast data 2602. The user may provide an input 2702 to add the new item. In the example shown, the user adds Earl's Bread to the item forecast data 2602.

[0291] In some embodiments, as part of adding a new item, a user may input new item data 2704, which includes a plurality of input fields via which the user may input data corresponding to the new item. The new item data 2704 includes input fields for similar items, location count, price, and overrides, such as overrides of default feature values used by a forecasting model to forecast demand for the new item. In some embodiments, a forecasting model may use the new item data 2704 to generate demand forecasts for the new item.

[0292] FIG. 28 illustrates an example user interface 2800. The user interface 2800 may be an example of the graphical user interface 248. In some embodiments, the user interface 2800 is an alternative to the user interface 2600, and vice-versa. For example, operations and features described in connection with the user interface 2800 may be implemented in the user interface 2600, and vice-versa.

[0293] The user interface 2800 includes forecasting data 2802 for an example item, which in the example shown is Jeans Medium Wash. The forecasting data 2802 may display data from the databases 108-112 pertaining to the example item. Additionally, the forecasting data 2802 may include input fields via which the user may input data to generate demand forecasts for the example item. For example, the forecasting data 2802 includes input fields to input a forecast start, a forecast end, a store count, and a price. In the example shown, the example item is a new item, or an item with limited sales history. However, the user interface 2800 and features associated therewith are not limited to new items.

[0294] The user interface 2800 further includes a similar item cluster 2804. The similar item cluster 2804 includes a plurality of similar items to the example item. In some embodiments, data of the similar items may be used to forecast demand for the example item, such as by performing similar item forecasting operations described herein, such as in connection with FIGS. 18-19. There may be different techniques for determining the plurality of similar items. In some embodiments a user may manually select similar items. In some embodiments, the plurality of similar items may automatically be determined upon selection of the example item in the user interface 2800. For example, the AI system 252 may identify the plurality of items, such as by using operations described above in connection with FIGS. 18-19. In some embodiments, a plurality of candidate similar items may be determined by an AI system, and the user may select the plurality of items from among the plurality of candidate similar items.

[0295] The user interface 2800 further includes a demand forecast 2806. The demand forecast is illustrated as a time-series demand forecast for the example item. As shown, the demand forecast includes both forecasted sales for a plurality of time periods and past sales data for the plurality of sales. The demand forecast 2806 includes a demand forecast that is generated based at least in part on data associated with the plurality of similar items displayed in the similar item cluster 2804. For example, a weighted aggregation of demand forecasts for the plurality of items may be used to generate the demand forecast 2806, such as by supplementing a forecast from an item-specific model for the example item from the forecasting models 222, or by replacing an item-specific forecasting model if it has insufficient training data. In some embodiments, the demand forecast 2806 is dynamically updated in response to a user providing an input into the forecasting data 2802 or an input modifying a similar item in the similar item cluster 2804. For example, the demand forecast 2806 may include an initial demand forecast, such as a demand forecast generated using a first set of similar items or first values for features. When the user updates, for example, the plurality of similar items or feature values, the initial demand forecast may be replaced with updated values.
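By way of illustration only, a weighted aggregation of similar-item forecasts, optionally blended with a forecast from an item-specific model, might be sketched as follows. The function names, the normalization of weights, and the blending factor `alpha` are assumptions made for the example; this disclosure does not specify how the weights are chosen.

```python
from typing import Dict, List


def aggregate_similar_forecasts(
    forecasts: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted average, per time period, of similar-item forecasts."""
    total = sum(weights[item] for item in forecasts)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    horizon = len(next(iter(forecasts.values())))
    return [
        sum(weights[item] * forecasts[item][t] for item in forecasts) / total
        for t in range(horizon)
    ]


def blend(
    item_forecast: List[float],
    similar_forecast: List[float],
    alpha: float,
) -> List[float]:
    """Supplement an item-specific forecast; alpha = 0 replaces it entirely."""
    return [
        alpha * own + (1.0 - alpha) * sim
        for own, sim in zip(item_forecast, similar_forecast)
    ]


if __name__ == "__main__":
    similar = aggregate_similar_forecasts(
        {"Jeans Dark Wash": [10.0, 20.0], "Jeans Light Wash": [30.0, 40.0]},
        {"Jeans Dark Wash": 1.0, "Jeans Light Wash": 3.0},
    )
    print(similar)
    print(blend([10.0, 10.0], similar, alpha=0.5))
```

Setting `alpha` to zero corresponds to relying entirely on the similar items, which may be appropriate when the item-specific model has insufficient training data. The item names and weights in the usage example are hypothetical.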

[0296] FIG. 29 illustrates a further example of the user interface 2800. According to the example of FIG. 29, the user may edit the plurality of similar items. The user interface 2800 includes a plurality of candidate similar items 2902. In some embodiments, the AI system 252, or another application, may identify and rank similar items to the example item, in this case Jeans Medium Wash. The most similar items, such as the X most similar items to the example item, may be initially included in the similar item cluster 2804, and remaining similar items may be included in the plurality of candidate similar items 2902. Advantageously, this provides a baseline of similar items that may be refined by the user, thereby combining deep learning techniques to identify candidate similar items from across a vast catalog of items with a user's judgment with respect to a final determination of similar items. For example, in the example shown, the user may remove the item Khaki from the plurality of items and replace it with the item Jeans Dark Wash. After doing so, the forecasting system 102 may regenerate the demand forecast 2806, which may be displayed by the user interface 2800. Additionally, the forecasting system 102 may regenerate the plurality of candidate similar items based at least in part on the user's modification of the plurality of similar items.
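By way of illustration only, ranking catalog items by similarity and splitting them into an initial cluster of the X most similar items and a remaining candidate list might be sketched as follows. The use of embedding vectors compared by cosine similarity, the function names, and the example vectors are assumptions made for the example; this disclosure does not specify the similarity measure.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def split_similar_items(
    target: Sequence[float],
    catalog: Dict[str, Sequence[float]],
    x: int,
) -> Tuple[List[str], List[str]]:
    """Return (initial cluster of the x most similar items, remaining candidates)."""
    ranked = sorted(
        catalog,
        key=lambda name: cosine(target, catalog[name]),
        reverse=True,
    )
    return ranked[:x], ranked[x:]


if __name__ == "__main__":
    # Hypothetical two-dimensional embeddings for illustration.
    target = [1.0, 0.0]  # e.g., Jeans Medium Wash
    catalog = {
        "Khaki": [0.5, 0.5],
        "Jeans Dark Wash": [0.9, 0.1],
        "Socks": [0.0, 1.0],
    }
    cluster, candidates = split_similar_items(target, catalog, x=1)
    print(cluster, candidates)
```

A user edit, such as swapping one item out of the cluster for a candidate, would then operate on the two returned lists before the demand forecast is regenerated.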

[0297] While particular uses of the technology have been illustrated and discussed above, the disclosed technology can be used with a variety of data structures and processes in accordance with many examples of the technology. The above discussion is not meant to suggest that the disclosed technology is only suitable for implementation with the data structures shown and described above.

[0298] This disclosure described some aspects of the present technology with reference to the accompanying drawings, in which only some of the possible aspects were shown. Other aspects can, however, be embodied in many different forms and should not be construed as limited to the aspects set forth herein. Rather, these aspects were provided so that this disclosure was thorough and complete and fully conveyed the scope of the possible aspects to those skilled in the art.

[0299] As should be appreciated, the various aspects (e.g., operations, memory arrangements, etc.) described with respect to the figures herein are not intended to limit the technology to the particular aspects described. Accordingly, additional configurations can be used to practice the technology herein and/or some aspects described can be excluded without departing from the methods and systems disclosed herein.

[0300] Similarly, where operations of a process are disclosed, those operations are described for purposes of illustrating the present technology and are not intended to limit the disclosure to a particular sequence of operations. For example, the operations can be performed in differing order, two or more operations can be performed concurrently, additional operations can be performed, and disclosed operations can be excluded without departing from the present disclosure. Further, each operation can be accomplished via one or more sub-operations. The disclosed processes can be repeated.

[0301] Although specific aspects were described herein, the scope of the technology is not limited to those specific aspects. One skilled in the art will recognize other aspects or improvements that are within the scope of the present technology. Therefore, the specific structure, acts, or media are disclosed only as illustrative aspects. The scope of the technology is defined by the following claims and any equivalents therein.