COMMUNICATIONS SERVER APPARATUS AND METHOD FOR SIMULATING SUPPLY AND DEMAND CONDITIONS RELATED TO A TRANSPORT SERVICE
20230214767 · 2023-07-06
Inventors
- Kelly KUO (Singapore, SG)
- Xin XU (Singapore, SG)
- Ashwin MADELIL (Singapore, SG)
- Chao XIE (Singapore, SG)
Cpc classification
G06Q30/0202
PHYSICS
G06Q10/087
PHYSICS
International classification
Abstract
A communications server apparatus for simulating supply and demand conditions related to a transport service and deriving associated spatio-temporal prediction data, the communications server apparatus comprising a processor and a memory, and being configured, under the control of the processor, to execute instructions stored in the memory to: obtain supply and demand data, said supply data comprising service provider location and availability data and said demand data comprising user bookings data; generate, using said supply and demand data, aggregated supply and demand data comprising a plurality of data records associated with a plurality of respective predetermined time periods, each record being representative of an available supply pool of one or more service provider types in one of a plurality of regions, and demand therefor, during the respective predetermined time period; generate, using said supply and demand data, probability data for each of said plurality of regions and supply pools in relation to respective predetermined time periods, said probability data comprising probability values representative of a likelihood of demand associated with respective time slot/supply pool/region combinations; perform a simulation of supply and demand conditions in said plurality of regions by mapping said aggregated supply and demand data to said probability data in a trained forecasting model and generating prediction data for each of said plurality of regions and supply pools, said prediction data being representative of a probability that a service provider will receive a user booking in a specified region within a predetermined period of time; and output said prediction data for display on a service provider communications device.
Claims
1. A communications server apparatus for simulating supply and demand conditions related to a service or asset provision and deriving associated spatio-temporal prediction data, the communications server apparatus comprising a processor and a memory, and being configured, under the control of the processor, to execute instructions stored in the memory to: obtain supply and demand data, said supply data comprising service or asset provider location and availability data and said demand data comprising user request data; generate, using said supply and demand data, aggregated supply and demand data comprising a plurality of data records associated with a plurality of respective predetermined time periods, each record being representative of an available supply pool of one or more service provider/asset types in one of a plurality of regions, and demand therefor, during the respective predetermined time period; generate, using said supply and demand data, probability data for each of said plurality of regions and supply pools in relation to respective predetermined time slots, said probability data comprising probability values indicating a likelihood of demand associated with each respective time slot in each region for each supply pool; perform a simulation of supply and demand conditions in said plurality of regions by mapping said aggregated supply and demand data to said probability data in a trained forecasting model and generating prediction data comprising a predicted probability for each of said plurality of regions and supply pools, wherein said predicted probability indicates a probability that a service or asset provider will receive a user request in a specified region within a predetermined period of time; and output a set of said predicted probabilities for said plurality of regions and supply pools for display on a service or asset provider communications device.
2. A communications server apparatus according to claim 1, further configured to only output a set of said predicted probabilities for each supply pool for display on communications devices of service or asset providers recorded as belonging to a service provider/asset type of the respective supply pool.
3. A communications server apparatus according to claim 1, for simulating supply and demand conditions related to a transport service, wherein said demand data comprises user bookings data.
4. The communications server apparatus of claim 1, further comprising a database configured to store live supply and demand data as historical supply and demand data.
5. The communications server apparatus of claim 4, further configured to generate supply pooling logic for mapping a service or asset provider type to a supply pool by deriving one or more supply pools based on historical supply and demand data obtained from said database, and grouping sets of one or more service or asset provider types into supply pools representing greatest demand in a region during each of a plurality of predetermined time periods.
6. The communications server apparatus of claim 1, further configured to generate said aggregated supply and demand data by generating sequential data records, each representative of a predetermined time period and comprising data representative of demand during each of a plurality of time slots in relation to specified supply pools.
7. The communications server apparatus of claim 1, further configured to generate said probability data for each time slot in each region for each supply pool by calculating a probability value based on supply and demand in the respective region during the respective predetermined time period, adjusted to take account of low supply and/or demand conditions.
8. The communications server apparatus of claim 7, further configured to calculate said probability value for each time slot in each region for each supply pool using an adjusted empirical probability calculation for that time slot, region and supply pool wherein a demand count from another time period for the same region and supply pool is added to the numerator and denominator thereof.
9. The communications server apparatus of claim 7, further configured to calculate said probability value for each time slot in each region for each supply pool using an adjusted empirical probability calculation wherein 1 is added to the denominator thereof.
10. The communications server apparatus of claim 4, further configured to generate a training data set for use in generating said trained forecasting model, said training data set comprising aggregated supply and demand data and probability data derived from historical supply and demand data, stored in said database, for said plurality of regions.
11. The communications server apparatus of claim 10, further configured to periodically update said trained forecasting model with new aggregated supply and demand data and probability data derived from recent historical supply and demand data, stored in said database, for said plurality of regions.
12. The communications server apparatus of claim 1, further comprising an API service, and configured to output said prediction data for a specified supply pool, via said API service, to a communications device of a service or asset provider belonging to a service provider/asset type of that supply pool.
13. The communications server apparatus of claim 12, wherein said API service is a heatmap service configured to output said prediction data for said specified supply pool to said communications device of a service or asset provider for display thereon as a heatmap representative of the probability of a service provider receiving a booking in one or more of the plurality of regions within a said predetermined time period.
14. A communications server apparatus according to claim 10, further configured to monitor the performance of said forecasting model and to perform a retraining process in respect thereof in the event that said performance drops below a predetermined threshold.
15. A communications server apparatus according to claim 14, wherein said retraining process comprises determining the aggregated supply and demand data that caused a drop in said performance of the forecasting model, and re-sampling aggregated supply and demand data and probability data including the aggregated supply and demand data that caused a drop in said performance, to generate a new training data set for use in retraining said forecasting model.
16. A method, performed in a communications server apparatus, for simulating supply and demand conditions related to a service or asset provision and deriving associated spatio-temporal prediction data, the method comprising, under control of a processor of the communications server apparatus: obtaining supply and demand data, said supply data comprising service or asset provider location and availability data and said demand data comprising user request data; generating, using said supply and demand data, aggregated supply and demand data comprising a plurality of data records associated with a plurality of respective predetermined time periods, each record being representative of an available supply pool of one or more service or asset provider types in one of a plurality of regions, and demand therefor, during the respective predetermined time period; generating, using said supply and demand data, probability data for each of said plurality of regions and supply pools in relation to respective predetermined time slots, said probability data comprising probability values indicating a likelihood of demand associated with each respective time slot in each region for each supply pool; performing a simulation of supply and demand conditions in said plurality of regions by mapping said aggregated supply and demand data to said probability data in a trained forecasting model and generating prediction data comprising a predicted probability for each of said plurality of regions and supply pools, wherein said predicted probability indicates a probability that a service or asset provider belonging to a service provider/asset type of a specified supply pool will receive a user request in a specified region within a predetermined period of time; and outputting a set of said predicted probabilities for said plurality of regions and supply pools for display on a service or asset provider communications device.
17. A method according to claim 16, wherein outputting said set of predicted probabilities for said plurality of regions and supply pools comprises only outputting a set of predicted probabilities for each supply pool for display on communications devices of service or asset providers recorded as belonging to a service provider/asset type of the respective supply pool.
18. (canceled)
19. A non-transitory storage medium storing instructions which, when executed by a processor, cause the processor to perform the method of claim 16.
20. A communications system for simulating supply and demand conditions related to a service or asset provision and deriving associated spatio-temporal prediction data, at least one user communications device and communications network equipment operable for the communications server apparatus and the at least one user communications device to establish communication with each other therethrough, and at least one service or asset provider communications device and communications network equipment operable for the communications server apparatus and the at least one service provider communications device to establish communication with each other therethrough, the service provider communications device comprising a display for displaying data representative of a plurality of geographical regions, the communications server apparatus comprising a processor and a memory, and being configured, under the control of the processor, to execute instructions stored in the memory to: obtain supply and demand data, said supply data comprising service or asset provider location and availability data received from said at least one service or asset provider communications device, and said demand data comprising user request data received from said at least one user communications device; generate, using said supply and demand data, aggregated supply and demand data comprising a plurality of data records associated with a plurality of respective predetermined time periods, each record being representative of an available supply pool of one or more service or asset provider types in one of a plurality of regions, and demand therefor, during the respective predetermined time period; generate, using said supply and demand data, probability data for each of said plurality of regions and supply pools in relation to respective predetermined time periods; perform a simulation of supply and demand conditions in said plurality of regions by mapping said aggregated supply and demand data to said probability data in a trained forecasting model and generating prediction data comprising a predicted probability for each of said plurality of regions and supply pools, wherein said predicted probability indicates a probability that a service or asset provider will receive a user request in a specified region within a predetermined period of time; and output a set of predicted probabilities for said plurality of regions and supply pools for display on a service or asset provider communications device.
Description
DETAILED DESCRIPTION
[0025] The techniques described herein are described primarily with reference to use in private hire transport and/or taxi and ride-hailing, but it will be appreciated that these techniques have a broader reach and cover other types of transportation services, including the transportation of documents and goods.
[0026] Referring first to
[0027] Communications server apparatus 102 may be a single server as illustrated schematically in
[0028] User communications device 104 may comprise a number of individual components including, but not limited to, one or more microprocessors 128, a memory 130 (e.g. a volatile memory such as a RAM) for the loading of executable instructions 132, the executable instructions defining the functionality the user communications device 104 carries out under control of the processor 128. User communications device 104 also comprises an input/output module 134 allowing the user communications device 104 to communicate over the communications network 108. User interface 136 is provided for user control. If the user communications device 104 is, say, a smart phone or tablet device, the user interface 136 will have a touch panel display as is prevalent in many smart phones and other handheld devices. Alternatively, if the user communications device 104 is, say, a desktop or laptop computer, the user interface 136 may have, for example, computing peripheral devices such as display monitors, computer keyboards and the like.
[0029] Service provider communications device 106 may be, for example, a smart phone or tablet device with the same or a similar hardware architecture to that of user communications device 104. Service provider communications device 106 may comprise a number of individual components including, but not limited to, one or more microprocessors 138, a memory 140 (e.g. a volatile memory such as a RAM) for the loading of executable instructions 142, the executable instructions defining the functionality the service provider communications device 106 carries out under control of the processor 138. Service provider communications device 106 also comprises an input/output module (which may be or include a transmitter module/receiver module) 144 allowing the service provider communications device 106 to communicate over the communications network 108. User interface 146 is provided for user control. If the service provider communications device 106 is, say, a smart phone or tablet device, the user interface 146 will have a touch panel display as is prevalent in many smart phones and other handheld devices. Alternatively, if the service provider communications device 106 is, say, a desktop or laptop computer, the user interface 146 may have, for example, computing peripheral devices such as display monitors, computer keyboards and the like.
[0030] In one embodiment, the service provider communications device 106 is configured to push data representative of the service provider (e.g. service provider identity, location and so on) regularly to the communications server apparatus 102 over communications network 108. In another, the communications server apparatus 102 polls the service provider communications device 106 for information. In either case, the data from the service provider communications device 106 (also referred to herein as ‘driver location and availability data’ or ‘supply’ data) are communicated to the communications server apparatus 102 and stored in relevant locations in the database 126 as historical data. For the avoidance of doubt, such supply data may include any one or more of: numbers of available service providers in a particular area or region, times of day associated with the service provider availability, respective service provider types, and idle times associated with available service providers and types. The historical data received from the service provider communications device 106, and stored in the database 126, includes, amongst other things, data indicative of service providers' location and availability at a given time. This historical data, i.e. supply data, is stored against the respective driver identity data, which may include, amongst other things, type of vehicle (e.g. 6-seater, taxi, private hire).
[0031] In one embodiment, the user communications device 104 is configured to push data representative of the user (e.g. user identity, location, transport requirements, and so on) regularly to the communications server apparatus 102 over communications network 108. In another, the communications server apparatus 102 polls the user communications device 104 for information. In either case, the data from the user communications device 104 (also referred to herein as ‘bookings data’) are communicated to the communications server apparatus 102 and stored in relevant locations in the database 126 as historical data. For the avoidance of doubt, such bookings data may include any one or more of: numbers of bookings in a particular area or region, service provider types associated with those bookings, numbers of passengers and/or type of cargo, times of day at which the bookings are made/required to be serviced, etc. The historical data received from the user communications device 104, and stored in the database 126, includes, amongst other things, location and vehicle type associated with each vehicle booking made. As described in more detail below, historical supply and demand data in the database 126 may be used, within a forecasting model, for deriving simulation data representative of a real-time future supply and demand pattern so as to generate data representative of a probability that a driver will secure a passenger booking at a specified location within a specified time.
[0032] Referring to
[0033] It should also be appreciated that one or more or all of the supply pooling logic processor 140, the broadcast and demand processor 142, the labels processor 144, the model training processor 146 and the forecasting processor 148 may be implemented in the communications server 102, or one or more of them may be implemented in a remote processing facility (not shown) communicably coupled (for example, wirelessly, over a communications network) to the communications server 102.
[0034] Referring to
[0035] In another process step 203, broadcast and demand count feature data is generated. Broadcast and demand count feature data may comprise at least tracking data indicative of available drivers (at any specified time), their respective locations and the time it takes for them to secure a job that has been received via the bookings data from the user communications device 104. In yet another process step 204, labels data is generated. Each label of a set of labels data may comprise an estimate of probability (for each available driver) of securing a job in some predetermined period of time (e.g. X minutes) for each dataset comprising the supply pool to which the respective driver is assigned, specified location or region and real timeslot. The broadcast count data and labels data from steps 203 and 204 respectively are fed to a model training process 206, where the identifiers from the broadcast count data and the labels data are matched up and the matched records are used as training sets in the model training process to generate prediction data indicative of the probabilities of receiving a fare or job in a specified location within the next (e.g.) X minutes, based on driver location and availability data and bookings data collected in the last Δt minutes for that specified location. The model training process may utilise any one or more of a number of forecasting models, such as gradient boosted decision trees, multi-layer perceptrons, convolutional neural networks with long short term memory layers, ARIMA models, etc.
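The matching of broadcast and demand count features to labels by identifier can be sketched as follows. This is a minimal illustration only: the identifier layout (supply pool, geohash, time slot), the function name `build_training_set` and all sample values are assumptions made for the example and are not taken from the described embodiment.

```python
from typing import Dict, List, Tuple

# Hypothetical identifier: (supply pool, geohash, time slot).
Key = Tuple[str, str, str]


def build_training_set(
    features: Dict[Key, List[float]],
    labels: Dict[Key, float],
) -> Tuple[List[List[float]], List[float]]:
    """Pair each feature vector with the label sharing its identifier.

    Records present in only one of the two inputs are skipped, since they
    cannot form a complete training example.
    """
    rows: List[List[float]] = []
    targets: List[float] = []
    for key in sorted(features.keys() & labels.keys()):
        rows.append(features[key])
        targets.append(labels[key])
    return rows, targets


# Made-up feature vectors (e.g. counts per time-to-broadcast bucket).
features = {
    ("B,C", "w21z7h", "08:00"): [4.0, 3.0, 2.0, 1.0, 5.0],
    ("A,C", "w21z7h", "08:00"): [1.0, 0.0, 2.0, 0.0, 7.0],
    ("B,C", "w21z6u", "08:00"): [0.0, 1.0, 0.0, 0.0, 2.0],  # no label yet
}
# Made-up labels (estimated probability of a job within X minutes).
labels = {
    ("B,C", "w21z7h", "08:00"): 0.62,
    ("A,C", "w21z7h", "08:00"): 0.25,
}

X, y = build_training_set(features, labels)
```

The resulting `X` and `y` could then be passed to whichever forecasting model is chosen, for example a gradient boosted decision tree regressor.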
[0036] It will be readily appreciated from, and elucidated by, the following description that the process for generating the forecasting model can be used, initially, with historical data, to generate the initial forecasting model, which can then be periodically (or even continuously in near real-time) updated using a driver location and availability (supply) data stream obtained from the service provider communications device 106 and a bookings (demand) data stream obtained from one or more user communications devices 104 whilst the system is being used to book and allocate fares.
[0037] The prediction data obtained from the forecasting model, whilst the system is in use, can be used to generate heatmap data that is transmitted to the service provider communications device 106 for display in the form of a heatmap, wherein a map of a specified region (comprising multiple locations) can be displayed, including spatio-temporal indicators defining locations or clusters of locations where a driver can expect to secure a job within a predetermined time. However, it will be readily appreciated that the prediction data may, in other embodiments, be displayed in an alternative manner.
[0038] The supply pooling logic determining (or updating) processor 140 derives logic that groups available drivers into disjoint sets based on vehicle types they have flagged against their identity data. The purpose of the supply pooling logic is to aggregate counts for each set such that, ultimately, the same ‘heatmap’ can be displayed to drivers within the same set. The supply pooling logic processor receives driver location and availability data and bookings data from a driver location and bookings datastore (referenced hereinafter) in the database 126 of the communications server apparatus 102. This data is obtained from the driver location and availability data and the bookings data received from the service provider communications device 106 and the user communications device 104 whilst the communications system 100 is in use; and, essentially, comprises regularly-updated driver location/availability and bookings data such that the supply pooling logic can be regularly updated to reflect current conditions.
[0040] Referring additionally to
[0041] Referring back to
[0042] Thus, for example, with combinations [B, C, D], [B, D, F] and [A, C, E] identified from the data records 312 illustrated in
[0043] Finally, each possible taxi-type combination identified in step 320 is assigned to a ‘supply pool’ to which it is most similar, thereby defining supply pooling logic 322 in the form of a mapping of combinations of taxi-types to respective ‘supply pools’. Thus, in this specific example, [B, C] and [A, C] have been designated as supply pools. The possible taxi-type combinations (excluding the non-‘major’ taxi types) are [B, C], [B] and [A, C]. Thus taxi-type combination [B, C] is assigned to supply pool [B, C], taxi-type combination [B] is assigned to supply pool [B, C] and taxi-type combination [A, C] is assigned to supply pool [A, C].
[0044] The data records 307 (also referred to herein as an ‘aggregated supply and demand stream’) may be stored, as historical data, in the database 126 of the communications server apparatus 102, but this is not essential and, as will be appreciated by a person skilled in the art, the data records 307 may be stored elsewhere, either in the communications server apparatus 102 or in a remote storage facility, such as the Cloud. The data records, including the supply pooling logic data records, may be updated periodically, or continuously in near real-time, using driver availability data and bookings data streams obtained from the service provider communications device 106 and the user communications device 104 respectively.
[0045] The supply pooling logic 322, generated in the manner described above, can then be used to assign a ‘supply pool’ to each available driver in a specified location at a specified time, based on their currently-activated taxi-type combination.
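The mapping of taxi-type combinations to supply pools, and its use to assign a pool to a driver, can be sketched as follows using the example combinations given above. The description does not state how ‘most similar’ is measured; Jaccard similarity is used here purely as an assumed stand-in, and the function names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List

TaxiTypes = FrozenSet[str]


def jaccard(a: TaxiTypes, b: TaxiTypes) -> float:
    """Jaccard similarity: an assumed measure of 'most similar'."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_pooling_logic(
    combinations: Iterable[TaxiTypes],
    supply_pools: List[TaxiTypes],
) -> Dict[TaxiTypes, TaxiTypes]:
    """Map each taxi-type combination to its most similar supply pool."""
    return {
        combo: max(supply_pools, key=lambda pool: jaccard(combo, pool))
        for combo in combinations
    }


# Supply pools and combinations from the example: [B, C] and [A, C] are
# pools; [B, C], [B] and [A, C] are the possible combinations.
pools = [frozenset("BC"), frozenset("AC")]
combos = [frozenset("BC"), frozenset("B"), frozenset("AC")]
pooling_logic = build_pooling_logic(combos, pools)

# Assign a pool to a driver from the currently activated taxi types.
driver_pool = pooling_logic[frozenset({"B"})]
```

Under this assumed similarity measure, combination [B] is mapped to supply pool [B, C], in agreement with the example in the description.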
[0046] As referenced above, the broadcast and demand count features processor 142 is used to keep track, and provide data indicative, of available drivers, their locations and the time it takes for them to receive a job that has been broadcast by the communications server apparatus 102. Algorithms, such as a geohash system, are known for defining specific geographical locations anywhere in the world. Such systems tend to treat an area as a series of equally sized, rectangular and adjacent cells. In an embodiment of the communications system 100, the broadcast and demand count features processor 142 may use a known geohash system defining rectangular cells of around 1.2 km×609.4 m, although this is not necessarily intended to be in any way limiting. The ‘counts’, representing historical and/or real-time data indicative of available drivers in each location and the time it takes for them to receive a broadcast job (whilst at that location), may be aggregated at geohash level. However, in view of the size and configuration of each cell, simply keeping track of raw counts without some spatial smoothing (to take account, for example, of real-time movement of the driver through the area) may cause a technical difficulty due to data sparsity.
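By way of illustration, the assignment of a driver location to a geohash cell can be sketched with the standard geohash encoding (alternating longitude and latitude bisection, five bits per base-32 character). The use of six characters is an assumption: it is the precision commonly quoted as giving cells of roughly the size mentioned above, but the description does not name a precision.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


# Two nearby points fall in the same six-character cell.
cell_a = geohash_encode(57.64911, 10.40744)
cell_b = geohash_encode(57.64900, 10.40700)
```

Counts for drivers whose locations encode to the same string can then be aggregated under that string as the cell identifier.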
[0047] Therefore, the broadcast and demand count features processor 142 performs spatial smoothing of the count data, for example, as follows. Referring to
[0048] Thus, and referring to
[0049] The broadcast and demand count features processor 142 records in respect of sequential (e.g.) 15-minute periods, for each supply pool, geohash and 15-minute slot, data indicating a count representative of the number of drivers whose ‘time-to-broadcast’ was between 0-2 mins, 2-5 mins, 5-10 mins and 10-15 mins, and also a count indicative of the number of drivers that did not receive any broadcast within that 15-minute period.
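The per-slot counting described above can be sketched as follows. The half-open bucket boundaries (for example, exactly 2 minutes falling into the 2-5 bucket), the use of `None` for a driver who received no broadcast, and the function names are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

# Time-to-broadcast buckets in minutes, treated as [low, high).
BUCKETS = [(0, 2, "0-2"), (2, 5, "2-5"), (5, 10, "5-10"), (10, 15, "10-15")]
NO_BROADCAST = "none"


def bucket_for(minutes: Optional[float]) -> str:
    """Return the bucket name for one driver's time-to-broadcast."""
    if minutes is None:
        return NO_BROADCAST
    for low, high, name in BUCKETS:
        if low <= minutes < high:
            return name
    return NO_BROADCAST  # nothing received within the 15-minute period


def count_buckets(times: Iterable[Optional[float]]) -> Dict[str, int]:
    """Count drivers per bucket for one supply pool/geohash/15-minute slot."""
    counts = Counter(bucket_for(t) for t in times)
    names = [name for _, _, name in BUCKETS] + [NO_BROADCAST]
    return {name: counts.get(name, 0) for name in names}


# Made-up times-to-broadcast for seven drivers in one slot.
slot_counts = count_buckets([0.5, 1.9, 2.0, 7.0, 14.9, None, 20.0])
```

One such dictionary would be produced for each supply pool, geohash and 15-minute slot.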
[0050] Thus, referring to
[0051] The labels processor 144 is configured to generate labels that are constructed as estimates of the actual probability of receiving a job in X minutes for each supply pool/geohash/2-minute slot. The labels processor is configured to utilise data stored in a driver location and bookings datastore, for example hosted by the database 126 of the communications server apparatus 102. The driver location and bookings datastore receives and stores driver location and availability data and bookings data received from the service provider communications device 106 and the user communications device 104 whilst the communications system 100 is in use. The labels are updated periodically to take into account new data that has been received and stored since the last update.
[0052] In order to calculate the probability of a driver receiving a job (broadcast) in X minutes for each supply pool (to which the driver belongs)/geohash/2-minute time slot, a method of generating labels is performed, wherein an item of label data is generated for each supply pool/geohash/2-minute time slot, and the result is a databank representing data records, wherein each data record carries a set of probabilities for a specified (e.g. 2-minute) timeslot, each probability value being associated with a supply pool/geohash combination.
[0053] The process starts by calculating an initial estimate of empirical probability. Thus, data is gathered, each label data instance being indicative of a number of drivers belonging to a specified supply pool/geohash/2-minute time slot combination. Referring to
[0054] Accordingly, and referring to
[0055] Further adjustment is needed to differentiate between high denominator and low denominator situations, that is, between situations where the total number of drivers is large and those where it is small. In order to effect this adjustment in an exemplary technique, ‘1’ is added to the denominator. For example, 1 driver receiving a job out of 1 available driver will result in a probability of 1/(1+1)=0.5; whereas 100 drivers receiving jobs out of 100 available drivers will result in a probability of 100/(100+1)=0.990. Referring to
[0056] The above process is repeated for each 2-minute slot and in respect of each supply pool/geohash combination to produce a databank of probabilities, as illustrated in
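The adjustment in which ‘1’ is added to the denominator, repeated over every slot/supply pool/geohash combination, can be sketched as follows. Only that adjustment is shown; any other adjustment applied to the empirical probability is not reproduced here. The key layout, function names and sample counts are assumptions made for the example.

```python
from typing import Dict, Tuple

# Hypothetical identifier: (2-minute slot, supply pool, geohash).
Key = Tuple[str, str, str]


def adjusted_probability(received: int, available: int) -> float:
    """Empirical probability of receiving a job, with 1 added to the
    denominator so that small driver counts yield lower values."""
    if received < 0 or received > available:
        raise ValueError("received must lie between 0 and available")
    return received / (available + 1)


def build_databank(counts: Dict[Key, Tuple[int, int]]) -> Dict[Key, float]:
    """Apply the adjusted calculation to every combination.

    `counts` maps each identifier to (drivers who received a job,
    drivers available).
    """
    return {
        key: adjusted_probability(received, available)
        for key, (received, available) in counts.items()
    }


# Made-up counts for two combinations in the same 2-minute slot.
databank = build_databank({
    ("08:00", "B,C", "w21z7h"): (1, 1),
    ("08:00", "A,C", "w21z7h"): (100, 100),
})
```

With these inputs the two entries evaluate to 0.5 and approximately 0.990 respectively, matching the worked figures in the description.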
[0057] The output of the model training processor is fed to the forecasting module which is configured to perform calculations or simulations to generate, based on the current driver location and availability data stream and the bookings data stream, predicted probabilities of an available driver receiving a broadcast within X minutes. The predicted probability data is output in association with a supply pool/geohash/timeslot identifier, in a manner similar to that described with reference to
[0058] Referring to
[0059] Referring additionally to
[0060] In order to receive data representative of the predictions data 709, a service provider communications device 106 transmits a request, over the communications network 108, to the heatmap service 702, and the heatmap service 702 returns a response in the form of heatmap data representative of selected predictions data 709 associated only with the supply pool(s) containing the taxi type(s) that are ‘switched on’ for the driver using the service provider communications device 106.
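The selection of predictions data for a requesting driver can be sketched as follows. The record layout, the function name `heatmap_response` and the sample probabilities are assumptions made for the example; the actual request and response formats of the heatmap service are not specified here.

```python
from typing import Dict, FrozenSet, List, Tuple

Pool = FrozenSet[str]
# Hypothetical prediction record: (supply pool, geohash, probability).
Prediction = Tuple[Pool, str, float]


def heatmap_response(
    predictions: List[Prediction],
    switched_on: FrozenSet[str],
) -> Dict[Pool, Dict[str, float]]:
    """Keep only predictions for supply pools that contain at least one
    taxi type the driver currently has switched on."""
    response: Dict[Pool, Dict[str, float]] = {}
    for pool, geohash, probability in predictions:
        if pool & switched_on:
            response.setdefault(pool, {})[geohash] = probability
    return response


# Made-up predictions for two supply pools over two cells.
predictions = [
    (frozenset("BC"), "w21z7h", 0.62),
    (frozenset("BC"), "w21z6u", 0.40),
    (frozenset("AC"), "w21z7h", 0.25),
]

# A driver with only taxi type B switched on sees pool [B, C] only.
only_b = heatmap_response(predictions, frozenset({"B"}))
```

A driver with taxi type C switched on would, under this sketch, receive data for both pools, since both contain C.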
[0061] Thus, the exemplary communications system 100 provides an end-to-end system wherein driver location and availability data is collected from drivers' mobile devices, and bookings data (passenger requests) is collected from users' mobile devices. The raw location and availability data is aggregated and combined with bookings data, to generate an aggregated demand and supply stream (as described with reference to
[0062] The model will experience a loss of performance over time as supply and demand relationships in individual regions alter. In some cases, this can happen very suddenly. In other cases, the degradation takes place over a longer period.
[0063] Thus, in some implementations of the techniques described above, a model retraining pipeline may be implemented that monitors model performance and retrains the model if or when performance is deemed to have dropped below a predetermined threshold. Such a model retraining pipeline facilitates the optimisation of resources when retraining the model, since training iterations can be optimised and timed for maximum impact. In some implementations, when a drop in model performance is detected, a correlation between the drop in model performance and the supply and/or demand conditions that caused the drop may be determined.
[0064] In a model retraining process, the data on which good model performance is achieved and data on which poor model performance is detected can both be utilised in retraining the model. In order to identify the data that has caused the drop in performance, control charts, such as those used in Statistical Process Control, or ‘prime’ charts may, for example, be used, as will be known to a person skilled in the art. The input data, including the data that caused the drop in performance, is re-sampled to retrain the model. As a result, supply and demand distributions within the forecasting model can be smoothed and shaped each time the model is retrained in order to avoid, or at least mitigate, issues caused by extreme anomalies in demand/supply patterns.
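The threshold check and a simple control-chart test can be sketched as follows. This is an illustration under stated assumptions: the performance metric, the three-sigma limits, the use of the sample standard deviation of a baseline window (a simplification of conventional control-chart practice) and the function names are all hypothetical and not taken from the described pipeline.

```python
from statistics import mean, stdev
from typing import List, Sequence


def needs_retraining(score: float, threshold: float) -> bool:
    """True when monitored performance has dropped below the threshold."""
    return score < threshold


def out_of_control(
    baseline: Sequence[float],
    recent: Sequence[float],
    sigmas: float = 3.0,
) -> List[int]:
    """Return indices of recent observations outside the control limits
    (baseline mean plus or minus `sigmas` sample standard deviations)."""
    centre = mean(baseline)
    spread = stdev(baseline)
    lower = centre - sigmas * spread
    upper = centre + sigmas * spread
    return [i for i, value in enumerate(recent) if value < lower or value > upper]


# Made-up performance scores per evaluation window.
baseline_scores = [0.80, 0.82, 0.81, 0.79, 0.80, 0.81]
recent_scores = [0.80, 0.60, 0.81]

flagged = out_of_control(baseline_scores, recent_scores)
retrain = any(needs_retraining(s, threshold=0.75) for s in recent_scores)
```

The flagged windows indicate which input data to include when re-sampling the training set for the retraining run.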
[0065] It will be appreciated that the invention has been described by way of example only. Various modifications may be made to the techniques described herein without departing from the spirit and scope of the appended claims. The disclosed techniques comprise techniques which may be provided in a stand-alone manner, or in combination with one another. Therefore, features described with respect to one technique may also be presented in combination with another technique.