SYSTEM AND METHOD FOR RIDESHARE MATCHING BASED ON LOCALITY SENSITIVE HASHING
20220260376 · 2022-08-18
Inventors
Cpc classification
G06N5/01
PHYSICS
G01C21/3453
PHYSICS
G06F18/21326
PHYSICS
G01C21/3446
PHYSICS
International classification
Abstract
A system for rideshare matching using locality sensitive hashing is disclosed, including at least one rider device and at least one driver device in operable connection with a network. A rideshare application is in operable communication with the network and configured for matching a driver to a rider within a match pool via an artificial intelligence engine operating a locality sensitive hashing module.
Claims
1. A system for rideshare matching using locality sensitive hashing, the system comprising: at least one rider device and at least one driver device in operable connection with a network; a rideshare application in operable communication with the network, the rideshare application configured for matching a driver to one or more riders within a match pool via an artificial intelligence engine operating a locality sensitive hashing module.
2. The system of claim 1, wherein a space and time discretized route is constructed for each ride and each driver in the match pool.
3. The system of claim 2, wherein the space-time discretized routes of rides and drivers are weighted by a linear route cost function. A route cost function is linear if the total cost of a route is the sum of costs incurred along its segments. Examples of linear route cost functions include travel duration, travel distance, tolls and taxes incurred along the route etc., or any linear combination of them.
4. The system of claim 3, wherein one or more vector representations are constructed from the space-time discretized route of each ride and each driver.
5. The system of claim 4, wherein vector representations of ride routes and driver routes are transformed using one or more asymmetric locality sensitive hashing scheme transformations.
6. The system of claim 5, wherein vector representations of ride routes and driver routes or their transformations are hashed using a locality sensitive hashing scheme or a combination of several such schemes, and thereafter stored in a locality sensitive hashing data structure.
7. The system of claim 6, wherein vector representations of ride routes and driver routes or their transformations are hashed using a locality sensitive hashing scheme or a combination of several such schemes, and thereafter used to query a locality sensitive data structure.
8. The system of claim 7, wherein the rides and drivers found by querying a locality sensitive hashing data structure for a given ride are used as potential matches for that ride in solving the rideshare matching problem.
9. The system of claim 8, wherein a plurality of rideshare trips are constructed via the retrieved potential matches.
10. A method for rideshare matching using locality sensitive hashing, the method comprising the steps of: constructing a space-time discretized route for each of a plurality of rides and drivers; constructing a preprocessing vector representation for each of the plurality of ride routes and driver routes; constructing a locality sensitive hashing data structure with the preprocessing vector representations; constructing a query vector representation for each of the plurality of ride routes and driver routes; finding, for each ride using its query vector representation, potential matching rides and potential matching drivers from the locality sensitive hashing data structure; and constructing at least one rideshare trip using retrieved potential matches.
11. The method of claim 10, wherein at least one rideshare trip is constructed using a greedy method.
12. The method of claim 10, wherein at least one rideshare trip is constructed using a dynamic programming method.
13. The method of claim 10, wherein at least one rideshare trip is constructed using a constraint optimization method.
14. The method of claim 10, wherein at least one rideshare trip is constructed while optimizing for predicted future ride arrivals using stochastic optimization methods such as stochastic programming, approximate dynamic programming, neural methods etc.
15. The method of claim 10, wherein both current rides and pre-scheduled future rides are included in the match pool while constructing rideshare trips.
16. The method of claim 10, wherein the locality sensitive hashing data structure is persisted across successive iterations of the method. In such an implementation, rides are added to the data structure when they arrive and are deleted from the data structure once they get matched to form a trip. Driver routes are updated in the data structure when these routes get updated.
17. The method of claim 10, wherein dimensionality reduction techniques are employed to reduce the dimensionality of the vector representations of ride routes and driver routes before hashing them using locality sensitive hashing schemes.
18. The method of claim 10, wherein the method is repeated with multiple space-time discretizations with varying granularities. In such an implementation, the potential matches obtained from each repetition of the method can be combined together.
19. The method of claim 10, wherein several alternate routes for each ride and each driver are found, which are then space-time discretized. In such an implementation, the potential matches found using different routes can be combined together.
20. The method of claim 10, further comprising one or more modules implementing heuristics such as Haversine Overlap between ride routes and driver routes, spatial proximity of rides and drivers etc. to filter or augment the potential matches retrieved from the locality sensitive hashing data structure.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] A complete understanding of the present embodiments and the advantages and features thereof will be more readily understood by reference to the following detailed description when considered in conjunction with the accompanying drawings wherein:
[0017]
[0018]
[0019]
[0020]
[0021]
[0022]
DETAILED DESCRIPTION
[0023] The specific details of the single embodiment or variety of embodiments described herein are to the described system and methods of use. Any specific details of the embodiments are used for demonstration purposes only, and no unnecessary limitations or inferences are to be understood thereon.
[0024] Before describing in detail exemplary embodiments, it is noted that the embodiments reside primarily in combinations of components and procedures related to the system. Accordingly, the system components have been represented, where appropriate, by conventional symbols in the drawings, showing only those specific details that are pertinent to understanding the embodiments of the present disclosure so as not to obscure the disclosure with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein.
[0025] In general, the embodiments provided herein relate to a system and method of rideshare matching based on the Artificial Intelligence technique of locality sensitive hashing (LSH). LSH is a technique for efficient similarity search employed in a wide variety of engineering and scientific domains. However, its use had not been introduced for rideshare matching in the prior art. The method and system for efficient rideshare matching in real-time from a match pool of tens of thousands of ride requests (henceforth simply called rides) and driver routes is provided to efficiently and effectively match rides with suitable drivers. The method depends on a novel efficient randomized spatio-temporal search algorithm.
[0026] The embodiments include a novel efficient randomized spatio-temporal search method for finding potential rideshare matches. As stated above, once the potential matches are found efficiently, rideshare trips can be constructed using a variety of approaches such as dynamic programming, greedy methods, solving constraint satisfaction programs etc. and the rideshare matching problem can be solved. The method works by transforming the potential match finding problem into a similarity search problem and employing the technique of locality sensitive hashing for similarity search.
[0027] The method includes the steps of transforming rides and driver routes into vector representations in a high-dimensional space capturing their essential spatial properties (physical route in the real road network) as well as temporal properties (arrival time, service time constraints etc.). Next, the method includes the step of deriving a similarity measure for this vector space that captures the notion of match utility, rides and driver routes with high utility from being matched together have similar representations (according to the defined similarity measure); while those that are not feasible to be matched together or have low utility from being matched together have dissimilar representations. The method further includes the step of employing locality sensitive hashing technique for similarity searches in this space. This amounts to finding, for each ride, top rides and driver routes with highest match utility with it.
[0028] The system utilizes the artificial intelligence technique of locality sensitive hashing. A vector representation for rides and driver routes is provided which preserves their essential spatio-temporal features important for rideshare matching (physical route in the real road network, time of arrival, maximum allowed delay etc.). The system provides a similarity measure on spatio-temporal vector representations for rides and driver routes that capture utility of matching them together.
[0029] The method can very efficiently incorporate the information of the real road network in making ride matching decisions. Further, the method can work with a hybrid match pool of current as well as future pre-scheduled rides. Any cost function can be handled while defining the utility of matching as long as it is linear in the route (I.e., the total cost along a route is the sum of costs incurred along the segments of the route). Examples include distance traveled, time of travel, tolls and fees along the route, etc. or any linear combination of these. The method is dynamic and can adapt to changing traffic conditions (weather, accidents, events, closures, constructions etc.) as costs of route segments change (as long as this information is available in real-time which is the case with most modern routing systems). Prior heuristic methods are static and do not adapt.
[0030]
[0031] Processors 110 suitable for the execution of a computer program include both general and special purpose microprocessors and any one or more processors of any digital computing device. The processor 110 will receive instructions and data from a read-only memory or a random-access memory or both. The essential elements of a computing device are a processor for performing actions in accordance with instructions and one or more memory devices for storing instructions and data. Generally, a computing device will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks; however, a computing device need not have such devices. Moreover, a computing device can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device (e.g., a universal serial bus (USB) flash drive).
[0032] A network interface may be configured to allow data to be exchanged between the computer system 100 and other devices attached to a network 130, such as other computer systems, or between nodes of the computer system 100. In various embodiments, the network interface may support communication via wired or wireless general data networks, such as any suitable type of Ethernet network, for example, via telecommunications/telephony networks such as analog voice networks or digital fiber communications networks, via storage area networks such as Fiber Channel SANs, or via any other suitable type of network and/or protocol.
[0033] The memory 120 may include application instructions 150, configured to implement certain embodiments described herein, and a database 160, comprising various data accessible by the application instructions 150. In one embodiment, the application instructions 150 may include software elements corresponding to one or more of the various embodiments described herein. For example, application instructions 150 may be implemented in various embodiments using any desired programming language, scripting language, or combination of programming languages and/or scripting languages (e.g., C, C++, C#, JAVA®, JAVASCRIPT®, PERL®, PYTHON, GOLANG etc.).
[0034] The steps and actions of the computer system 100 described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium may be coupled to the processor 110 such that the processor 110 can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integrated into the processor 110. Further, in some embodiments, the processor 110 and the storage medium may reside in an Application Specific Integrated Circuit (ASIC). In the alternative, the processor and the storage medium may reside as discrete components in a computing device. Additionally, in some embodiments, the events or actions of a method or algorithm may reside as one or any combination or set of codes and instructions on a machine-readable medium or computer-readable medium, which may be incorporated into a computer program product.
[0035] Also, any connection may be associated with a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. “Disk” and “disc,” as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc where disks usually reproduce data magnetically, while discs usually reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
[0036] In some embodiments, the system is world-wide-web (www) based, and the network server is a web server delivering HTML, XML, etc., web pages to the computing devices. In other embodiments, a client-server architecture may be implemented, in which a network server executes enterprise and custom software, exchanging data with custom client applications running on the computing device.
[0037]
[0038] In some embodiments, to encode the spatial properties of rides, the system and method utilize the entire physical route of the ride. However, to capture spatial proximity, the system and method discretizes space as shown in
[0039]
[0040] For example, suppose ride r traverses S, A, B, C, D, T in time intervals I0, I1, I2, I3, I4, and I5 respectively. Similarly, ride r′ traverses S′, A, B, C, T′ in time intervals I0, I1, I2, I3, and I4 respectively. The rides r and r′ are represented with their space-time discretized routes of <(S, I0), (A, I1), (B, I2), (C, I3), (D, I4), (T, I5)> and <(S′, I0), (A, I1), (B, I2), (C, I3), (T′, I4)>, respectively. The segment <(A, I1), (B, I2)> and <(B, I2), (C, I3)> are the common space-time discretized segments.
[0041] Finally, the system turns the space-time discretized routes into vector representations. The vector space is a high-dimensional space with each possible space-time discretized route segment as a dimension. Each ride and driver route has two vector representations including a pre-preprocessing vector representation and a query vector representation. The pre-processing vector representation includes a vector with a “0” in every dimension except for dimensions that correspond to a segment in its space-time discretized route, in which case it is the cost of that segment. The query vector representation is similar to the pre-processing vector representation but there is simply a ‘1’ in dimensions that correspond to a segment in its space-time discretized route.
[0042] In general, there can be other methods of vectorization that encodes the spatio-temporal properties of rides and driver routes that are significant for ride matching. The pre-processing vector representation and the query vector representation mentioned here are examples of such vectorization of the spatio-temporal properties of rides and driver routes.
[0043] In the next step, the system derives the similarity. The match utility between two rides or a ride and a driver route is proportional to the total cost of the common segments, which would translate to savings if they are served together. As shown in
[0044] In some embodiments, the preprocessing vector representation is used while constructing the asymmetric locality sensitive hashing dataset and the query vector representation is used while querying the asymmetric locality sensitive hashing dataset to find overlapping matches for a ride. With the above two vector representations the sum of the costs of the common edges in the intersection of the spatio-temporal routes of rides r and r′ is the inner product between the preprocessing vector representation of r and the query vector representation of r′. Thus, approximate overlapping match search can be solved via approximate maximum inner product search (MIPS).
[0045] In general, the similarity measure will depend upon the specific vectorization method used. For example, inner product is a suitable similarity measure with the vectorization scheme presented here that constructs a preprocessing vector representation and a query vector representation for each ride and driver route.
[0046] Next, the system performs the similarity search. With representations defined mathematically in a vector space, the system can use the technique of locality sensitive hashing to find similar rides and driver routes as matches.
[0047] In some embodiments with similarity defined as inner product, for the purpose of similarity search, the system combines an asymmetric LSH construction that transforms Maximum Inner Product search problem to Maximum Cosine Similarity Search with a LSH scheme for Maximum Cosine Similarity Search. For example, the cross-polytope LSH construction for Maximum Cosine Similarity Search can be used for this purpose.
[0048]
[0049] In some embodiments, to improve the success probability of finding high utility potential matches for a ride, the system may consider multiple alternate routes for rides and driver routes. In this alternative, the system constructs vector representations for all the alternate routes to represent the corresponding ride or driver route. Combining ride matches obtained from alternate routes significantly improves success probability.
[0050] In some embodiments, the system may discretize space and time in different granularities which will give rise to multiple representations for each ride and driver route. The system then uses a hybrid method repeating the original method with different discretizations and combining the results.
[0051] In some embodiments, the system can persist the constructed LSH data structure and incrementally add the newly arrived rides while deleting the rides already assigned or picked up. Driver routes should be replaced in the data structure whenever they get updated and also periodically to prevent them from becoming too stale. This incremental approach can improve runtime efficiency in real-time operation.
[0052] In some embodiments, the system can include current as well as pre-scheduled future rides in the same match pool. The temporal aspect of the method and system presented here can effectively find matches between such rides. For example, a current ride or driver route can be matched to a future ride pre-scheduled to arrive 30-minutes from now if the current ride or driver route is expected to reach the pickup of the future ride in that time and have good overlap with its route thereafter.
[0053] In some embodiments, the system may use stochastic optimization methods to make matches predicting for future ride arrivals. Some of the possibilities include stochastic programming, approximate dynamic programming and neural methods.
[0054]
[0055] Many different embodiments have been disclosed herein, in connection with the above description and the drawings. It will be understood that it would be unduly repetitious and obfuscating to describe and illustrate every combination and subcombination of these embodiments. Accordingly, all embodiments can be combined in any way or combination, and the present specification, including the drawings, shall be construed to constitute a complete written description of all combinations and subcombinations of the embodiments described herein, and of the manner and process of making and using them, and shall support claims to any such combination or subcombination.
[0056] An equivalent substitution of two or more elements can be made for any one of the elements in the claims below or that a single element can be substituted for two or more elements in a claim. Although elements can be described above as acting in certain combinations and even initially claimed as such, it is to be expressly understood that one or more elements from a claimed combination can in some cases be excised from the combination and that the claimed combination can be directed to a subcombination or variation of a subcombination.
[0057] It will be appreciated by persons skilled in the art that the present embodiment is not limited to what has been particularly shown and described hereinabove. A variety of modifications and variations are possible in light of the above teachings without departing from the following claims.