Methods and Systems for Caching Data Communications Over Computer Networks
20230336640 · 2023-10-19
Assignee
Inventors
- Alan Arolovitch (Brookline, MA, US)
- Shmuel Bachar (Herzliyya, IL)
- Dror Moshe Gavish (Shoham, IL)
- Shahar Guy Grin (Ramat HaSharon, IL)
- Shay Shemer (Hod Hasharon, IL)
Cpc classification
H04L67/568
ELECTRICITY
H04L67/108
ELECTRICITY
H04N21/632
ELECTRICITY
H04L67/51
ELECTRICITY
International classification
H04L67/568
ELECTRICITY
G06F16/957
PHYSICS
H04L67/51
ELECTRICITY
H04N21/63
ELECTRICITY
Abstract
A computer-implemented method of caching multi-session data communications in a computer network includes the steps of: (a) receiving, intercepting, or monitoring one or more data sessions between a client executing a multi-session application for retrieving a desired content object and one or more metadata services, said client communicating with the one or more metadata services to discover metadata for the content object; (b) analyzing queries and responses exchanged between the client and the one or more metadata services to discover metadata for the content object; (c) receiving or intercepting subsequent data sessions between the client and content sources; (d) identifying a data protocol used by the client and identifying data queries within the data sessions; (e) identifying the content object or portions thereof requested by the client in the data queries; and (f) determining if the content object or portions thereof are stored in cache and, if so, sending the content object or portions thereof stored in cache to the client, and, if not, sending the data queries to the content sources, storing data responses from the content sources, and sending the data responses to the client.
Claims
1-21. (canceled)
22. An apparatus comprising: a processor; and a memory, coupled to the processor, that stores code for caching multi-session data communications in a computer network, wherein when the code is executed by the processor, the apparatus is configured to perform: (a) receiving, intercepting, or monitoring one or more data sessions between a client executing a multi-session application for retrieving a desired content object and one or more metadata services, said client communicating with the one or more metadata services to discover metadata for the content object; (b) analyzing queries and responses exchanged between the client and the one or more metadata services to discover metadata for the content object; (c) receiving or intercepting subsequent data sessions between the client and content sources; (d) identifying a data protocol used by the client and identifying data queries within the data sessions; (e) identifying the content object or portions thereof requested by the client in the data queries; and (f) determining if the content object or portions thereof are stored in cache and, if so, sending the content object or portions thereof stored in cache to the client, and, if not, sending the data queries to the content sources, storing data responses from the content sources, and sending the data responses to the client.
23. A non-transitory, computer readable medium having code stored therein for caching multi-session data communications in a computer network, wherein when the code is executed by a processor, the processor performs operations comprising: (a) receiving, intercepting, or monitoring one or more data sessions between a client computer system in the computer network executing a multi-session application for retrieving a desired content object and one or more metadata services, said client communicating with the one or more metadata services to discover metadata for the content object; (b) analyzing queries and responses exchanged between the client and the one or more metadata services to discover metadata for the content object; (c) receiving or intercepting subsequent data sessions between the client and content sources; (d) identifying a data protocol used by the client computer system and identifying data queries within the data sessions; (e) identifying the content object or portions thereof requested by the client computer system in the data queries; and (f) determining if the content object or portions thereof are stored in cache and, if so, sending the content object or portions thereof stored in cache to the client computer system, and, if not, sending the data queries to the content sources, storing data responses from the content sources, and sending the data responses to the client.
24. A computer-implemented method of caching multi-session data communications in a computer network, comprising the steps of: executing instructions, stored in a non-transitory memory, by a processor in the computer network to perform steps comprising: (b) receiving, intercepting, or monitoring one or more data sessions between a client computer system in the computer network executing a multi-session application for retrieving a desired content object and one or more electronic metadata services, said client computer system communicating with the one or more electronic metadata services to discover metadata for the content object; (c) analyzing queries and responses exchanged between the client computer system and the one or more electronic metadata services to discover metadata for the content object; (d) receiving or intercepting subsequent data sessions between the client computer system and content sources; (e) identifying a data protocol used by the client computer system and identifying data queries within the data sessions; (f) identifying the content object or portions thereof requested by the client computer system in the data queries; and (g) determining if the content object or portions thereof are stored in cache and, if so, sending the content object or portions thereof stored in cache to the client computer system, and, if not, sending the data queries to the content sources, storing data responses from the content sources, and sending the data responses to the client.
Description
BRIEF SUMMARY OF THE DISCLOSURE
[0019] In accordance with one or more embodiments, a computer-implemented method of caching multi-session data communications in a computer network is provided, including the steps of: (a) receiving, intercepting, or monitoring one or more data sessions between a client executing a multi-session application for retrieving a desired content object and one or more metadata services, said client communicating with the one or more metadata services to discover metadata for the content object; (b) analyzing queries and responses exchanged between the client and the one or more metadata services to discover metadata for the content object; (c) receiving or intercepting subsequent data sessions between the client and content sources; (d) identifying a data protocol used by the client and identifying data queries within the data sessions; (e) identifying the content object or portions thereof requested by the client in the data queries; and (f) determining if the content object or portions thereof are stored in cache and, if so, sending the content object or portions thereof stored in cache to the client, and, if not, sending the data queries to the content sources, storing data responses from the content sources, and sending the data responses to the client.
[0020] In accordance with one or more embodiments, a computer-implemented caching service is provided for caching multi-session data communications in a computer network. The caching service is configured to: (a) receive, intercept, or monitor one or more data sessions between a client executing a multi-session application for retrieving a desired content object and one or more metadata services, said client communicating with the one or more metadata services to discover metadata for the content object; (b) analyze queries and responses exchanged between the client and the one or more metadata services to discover metadata for the content object; (c) receive or intercept subsequent data sessions between the client and content sources; (d) identify a data protocol used by the client and identify data queries within the data sessions; (e) identify the content object or portions thereof requested by the client in the data queries; and (f) determine if the content object or portions thereof are stored in cache and, if so, send the content object or portions thereof stored in cache to the client, and, if not, send the data queries to the content sources, store data responses from the content sources, and send the data responses to the client.
[0021]
[0022]
DETAILED DESCRIPTION
[0023] In accordance with various embodiments, a service is provided for caching data for applications that utilize multiple sessions for retrieval of the same content object (e.g., a file or stream).
[0024] The multi-session applications supported by the caching service can include: [0025] (a) applications that utilize one or more sessions to discover information about a content object (hereinafter “content object meta-data”), that identifies the content sources that the application contacts to retrieve the content object, data protocols used to do so, and data queries used to retrieve the object. [0026] (b) applications that utilize multiple sessions to retrieve the content object, passing information necessary for object identification only in some of the sessions.
(a) Multi-Session Applications Utilizing Content Object Meta-Data for Content Object Retrieval
[0027]
[0028] The content object meta-data includes at least one variable, selected from the following: [0029] (i) addresses of content source(s); [0030] (ii) protocols supported by an individual content source; [0031] (iii) encryption keys, per object or per individual content source; and [0032] (iv) content object structure.
[0033] The content source address can be identified through an IP address, e.g., using IPv4 address 1.1.1.1 or IPv6 address fe80::200:f8ff:fe21:67cf, or using a domain name, e.g., cache12.bos.us.cdn.net, that can be resolved to an IP address using the Domain Name System (DNS).
[0034] The content source address can either rely on an implicit port number, for applications using well-known protocol ports (e.g., port tcp/80 used by the HTTP protocol), or name the port explicitly.
[0035] The content source address can be identified in conjunction with the protocols it supports, including, but not limited to, by using uniform resource locators (URLs), as defined in RFC 1738, each of which specifies the protocol, content source address, port, and remote path to the object.
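By way of non-limiting illustration, the following Python sketch shows one way a content source address expressed as a URL could be split into protocol, address, port, and remote path, falling back to a well-known port when none is named. The function name `parse_content_source` and the `DEFAULT_PORTS` table are hypothetical and are not part of the disclosure.

```python
from urllib.parse import urlsplit

# Illustrative table of well-known ports (an assumption, not exhaustive).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def parse_content_source(url):
    """Split a URL into (protocol, host, port, path).

    If the URL names no port explicitly, the well-known port for the
    protocol is used; unknown protocols yield a port of None.
    """
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port, parts.path


if __name__ == "__main__":
    # Implicit port: resolved from the protocol name.
    print(parse_content_source("http://cache12.bos.us.cdn.net/video/z1.mp4"))
    # Explicit port on an IPv4 address.
    print(parse_content_source("http://1.1.1.1:8080/z1"))
```

The host returned for a domain name such as cache12.bos.us.cdn.net would still need to be resolved through DNS before a session is established.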
[0036] The content object structure information includes information allowing client A.sub.1 to form data queries for parts of the object and to verify correctness of data responses received in response to such queries.
[0037] The content object structure information includes information pertaining to the parts that make up the object, e.g., “pieces” used by the Bittorrent protocol, “parts” used by the eDonkey P2P protocol, or “playback levels” used in adaptive bitrate streaming protocols, such as Microsoft Silverlight Smooth Streaming, Adobe HTTP Dynamic Streaming, and Apple HTTP Live Streaming, among others.
[0038] The information about content object parts includes at least one of the following: an enumeration of the parts of the content object, the length of each part, a data checksum of each part, and the availability of parts at a specific content source, where the content source is identified using content source addresses as defined in [0033]-[0035] above.
[0039] The meta-data including all or some of the above information can be stored in a separate file with a pre-defined structure, e.g., a torrent file for Bittorrent or a manifest file used by Microsoft Silverlight Smooth Streaming.
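By way of non-limiting illustration, the following Python sketch models content object structure information as a fixed part length, a total length, and one checksum per part, and uses it to verify a received part. It assumes SHA-1 checksums of fixed-size parts, in the manner of Bittorrent pieces; the names `ObjectStructure` and `describe` are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class ObjectStructure:
    piece_length: int          # nominal length of each part, in bytes
    total_length: int          # length of the whole content object
    piece_hashes: List[bytes]  # one SHA-1 digest per part (assumed checksum)

    def piece_count(self):
        return len(self.piece_hashes)

    def piece_size(self, index):
        # The final part may be shorter than the nominal part length.
        start = index * self.piece_length
        return min(self.piece_length, self.total_length - start)

    def verify(self, index, data):
        # A part is accepted only if both its length and checksum match.
        return (len(data) == self.piece_size(index)
                and hashlib.sha1(data).digest() == self.piece_hashes[index])


def describe(content, piece_length):
    """Build structure information for an object held fully in memory."""
    pieces = [content[i:i + piece_length]
              for i in range(0, len(content), piece_length)]
    hashes = [hashlib.sha1(p).digest() for p in pieces]
    return ObjectStructure(piece_length, len(content), hashes)
```

A caching service holding such a structure could check each part received from a content source before storing it or relaying it to a client.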
[0040] The meta-data services M offering content object meta-data may include dedicated network servers designed to support delivery of a specific application or one or more content objects (e.g., Bittorrent trackers, ED2K servers, etc.), generic search engines (Google, Microsoft Bing, or others), a network of computer nodes that collectively stores the meta-data (e.g. distributed hash table networks used by P2P applications), or other clients that participate in distributed content source discovery networks (e.g., distributed hash table networks), or other clients that are downloading and/or serving the content object Z.sub.1 and maintain meta-data related to it.
[0041] Client A.sub.1 may use multiple meta-data services M to discover content object meta-data, where one service M.sub.1 can provide part of the content object meta-data and optionally point to another service M.sub.2 to provide another part.
[0042] Thus, for example, client A.sub.1 may retrieve a torrent file from a Bittorrent search engine that includes the content object structure information as well as the URL of a Bittorrent tracker that provides the addresses of currently active content sources.
[0043] Client A.sub.1 may continue to send data queries to meta-data services M during download of content object Z.sub.1 or portions of it, for purposes of identification of new content sources and/or content object structure information (for example, in case of object Z.sub.1 being a live stream, of which new parts become continuously available).
[0044] In accordance with one or more embodiments, the caching service C receives and stores data queries and/or responses exchanged between client A.sub.1 and one or more meta-data services M.
[0045] In accordance with one or more embodiments, the caching service C intercepts the sessions between A.sub.1 and M, either by being in the data path between A.sub.1 and M, or through use of one or more dedicated redirection devices (e.g., a load balancer, a router, a DPI device, etc.) that sit in the data path and redirect specific data sessions to the caching service C, and relays the data queries and responses between A.sub.1 and M.
[0046] In accordance with one or more embodiments, the caching service C modifies at least one of the meta-data responses provided by the meta-data service M, e.g., to indicate the caching service C as a content source or as a meta-data service for the content object Z.sub.1.
[0047] In accordance with one or more embodiments, the caching service C receives a copy of communications between the client A.sub.1 and the meta-data services M, using an optical tap, mirror port or other device replicating network traffic.
[0048] In accordance with one or more embodiments, the caching service C receives the data queries related to content object Z.sub.1 from client A.sub.1 by virtue of offering at least one of the meta-data services M.
[0049] In accordance with one or more embodiments, the caching service C subsequently queries the meta-data services M itself for meta-data related to content object Z.sub.1, and receives and stores the responses.
[0050] In accordance with one or more embodiments, the caching service C continuously analyzes the queries and responses exchanged between at least one client A.sub.1 and the meta-data services M, as well as the responses received by the caching service C directly from the meta-data services M, as described above.
[0051] As a result, the caching service C maintains content object meta-data M.sub.z for at least one content object Z.sub.1 that client A.sub.1 is retrieving.
[0052] In accordance with one or more embodiments, the caching service C stores each meta-data response as part of meta-data M.sub.z together with the most recent time the response was received by C.
[0053] The caching service C periodically discards any stored responses whose time since receipt exceeds a time-out.
[0054] In accordance with one or more embodiments, the caching service monitors meta-data requests and responses and discards any stored responses that contradict meta-data responses received later.
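By way of non-limiting illustration, the following Python sketch shows one possible store for meta-data responses that records the time each response was received, discards responses older than a time-out, and replaces a stored response when a later response to the same query arrives. Treating a later response to the same (service, query) pair as superseding the earlier one is a simplifying assumption about what "contradict" means; the class name `MetaDataStore` and its methods are hypothetical.

```python
import time


class MetaDataStore:
    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock      # injectable so expiry can be tested
        self._entries = {}      # (service, query) -> (response, received_at)

    def record(self, service, query, response):
        # A later response for the same key replaces the earlier one,
        # and the receipt time is refreshed.
        self._entries[(service, query)] = (response, self.clock())

    def lookup(self, service, query):
        entry = self._entries.get((service, query))
        return entry[0] if entry is not None else None

    def purge_expired(self):
        """Discard responses received longer ago than the time-out.

        Returns the number of responses discarded.
        """
        now = self.clock()
        stale = [key for key, (_, received_at) in self._entries.items()
                 if now - received_at > self.timeout]
        for key in stale:
            del self._entries[key]
        return len(stale)
```

In such a sketch, `purge_expired` would be invoked periodically, for example from a timer, rather than on every lookup.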
[0055] Following retrieval of meta-data pertaining to the content object Z.sub.1, the client A.sub.1 and at least one of the content sources B.sub.1, discovered by the client A.sub.1 using the meta-data services M, start establishing data sessions with each other for the purpose of retrieval of content object Z.sub.1, or part of it, by A.sub.1.
[0056] In accordance with one or more embodiments, the caching service C intercepts the data sessions S.sub.1 established between the client A.sub.1 and the content sources B.sub.1.
[0057] In accordance with one or more embodiments, the caching service C intercepts the data sessions either by being in the data path between A.sub.1 and B.sub.1, or through use of one or more dedicated redirection devices (e.g., a load balancer, a router, a DPI device, etc.) that sit in the data path and redirect specific data sessions to the caching service C.
[0058] In accordance with one or more embodiments, the caching service C intercepts only those sessions established between A.sub.1 and content sources B.sub.1′ that match the meta-data M.sub.z stored for the object Z.sub.1 by the caching service C.
[0059] In accordance with one or more embodiments, the client A.sub.1 establishes at least one session S.sub.2 with the caching service C, which is identified by the client A.sub.1 as one of the content sources for the content object Z.sub.1.
[0060] In accordance with one or more embodiments, the caching service C utilizes at least one of the following protocols to interpret data queries and data responses in the session S.sub.1 between the client A.sub.1 and content source B.sub.1: [0061] (i) data protocols associated with the client A.sub.1, as part of meta-data M.sub.z, as described above; [0062] (ii) data protocols associated with the session S.sub.1, as part of meta-data M.sub.z, as described above; and [0063] (iii) data protocols identified by the caching service C when analyzing the data queries and responses received in the session S.sub.1, using a signature-based or other generic protocol identification technique.
[0064] In accordance with one or more embodiments, the caching service C utilizes a similar approach for session S.sub.2.
[0065] In accordance with one or more embodiments, when failing to identify the data protocol of session S.sub.1 or S.sub.2 using the method described in [0060]-[0063], the caching service C may apply at least one of the encryption keys K, stored by C as part of the meta-data M.sub.z, to establish an encrypted session with either client A.sub.1, or content source B.sub.1, or both.
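By way of non-limiting illustration, the following Python sketch shows one possible ordering of the three options above: protocols already associated with the client or with the session are consulted first, and a signature match on the first bytes of the session is used as a fallback. The two signatures shown (the leading bytes of the Bittorrent handshake and common HTTP request methods) are examples only; the function names and the `by_client` and `by_session` mappings are hypothetical stand-ins for information held in meta-data M.sub.z.

```python
# Example signatures only; a deployed service would carry many more.
BITTORRENT_HANDSHAKE = b"\x13BitTorrent protocol"
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ")


def identify_by_signature(first_bytes):
    """Option (iii): match the first bytes of a session against signatures."""
    if first_bytes.startswith(BITTORRENT_HANDSHAKE):
        return "bittorrent"
    if first_bytes.startswith(HTTP_METHODS):
        return "http"
    return None


def identify_protocol(client, source, first_bytes, by_client, by_session):
    """Try meta-data associations first, then fall back to signatures.

    by_client maps a client address to a protocol name (option (i));
    by_session maps a (client, source) pair to a protocol name (option (ii)).
    Returns None when no option identifies the protocol.
    """
    if (client, source) in by_session:
        return by_session[(client, source)]
    if client in by_client:
        return by_client[client]
    return identify_by_signature(first_bytes)
```

A result of `None` corresponds to the case in which the data protocol could not be identified, for example because the session is encrypted.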
[0066] The encryption keys K may be associated with the content object Z (e.g., in Bittorrent the hash identifier of object Z is used for encryption of sessions between Bittorrent peers), or specific content sources.
[0067] In accordance with one or more embodiments, following establishment of a data session with client A.sub.1 and identification of the protocol used in this session, the caching service C receives data query Q.sub.1 for object Z.sub.1 or a portion of it from the client A.sub.1.
[0068] In accordance with one or more embodiments, the caching service C identifies a response matching the query, using the meta-data M.sub.z associated with the content object Z.sub.1 as described above.
[0069] For example, if the client A.sub.1 requests a chunk of the 500 Kbps playback level of content object Z.sub.1, available over the Microsoft Silverlight Smooth Streaming protocol, that starts at offset 0 and does not identify the end offset, the caching service C may use the meta-data M.sub.z describing the object Z.sub.1 to identify the end offset.
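By way of non-limiting illustration, the following Python sketch shows how an end offset could be derived when the meta-data lists the start offsets of the chunks of a playback level together with the total length of that level. This layout of the meta-data, the offset units, and the function name `resolve_end_offset` are assumptions made for the example.

```python
from bisect import bisect_right


def resolve_end_offset(chunk_starts, total_length, start):
    """Return the exclusive end offset of the chunk that begins at `start`.

    chunk_starts is a sorted list of chunk start offsets for one playback
    level; total_length is the length of that level. A ValueError is raised
    if `start` is not a chunk boundary.
    """
    i = bisect_right(chunk_starts, start)
    if i == 0 or chunk_starts[i - 1] != start:
        raise ValueError("start offset is not a chunk boundary")
    # The chunk ends where the next one starts, or at the end of the level.
    return chunk_starts[i] if i < len(chunk_starts) else total_length
```

With the end offset known, the query can be matched against stored responses by the pair of start and end offsets, even though the client supplied only the start.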
[0070] In accordance with one or more embodiments, if the matching response R.sub.1 to the query Q.sub.1 is stored by the caching service C, C delivers the response to the end client A.sub.1.
[0071] In accordance with one or more embodiments, the caching service C may use the stored meta-data M.sub.z associated with the content object Z to verify the validity of the data response R.sub.1, before sending it to the client A.sub.1.
[0072] In accordance with one or more embodiments, when a matching response to the query Q.sub.1 is not available at the caching service C and the query Q.sub.1 has been sent as part of session S.sub.1 between the client A.sub.1 and the content source B.sub.1, the caching service C forwards the query to retrieve such response from the content source B.sub.1, receives and optionally stores the response, and relays the response to the client A.sub.1.
[0073] In accordance with one or more embodiments, when a matching response to the query Q.sub.1 is not found at the caching service C, the caching service C sends a data query Q.sub.1′, whose response allows C to answer the data query Q.sub.1, to at least one of the content sources B identified by C as carrying the content object Z, based on the meta-data M.sub.z stored by C.
[0074] Subsequently, the caching service C receives the responses R.sub.1′ to these queries, stores them, optionally verifies their validity against the meta-data M.sub.z, and delivers the response to the query Q.sub.1 to the client A.sub.1.
[0075] In accordance with one or more embodiments, when a matching response to the query Q.sub.1 is not found at the caching service C, C may redirect the client A.sub.1 to one of the content sources B for the content object Z, as stored by the caching service in the meta-data M.sub.z.
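By way of non-limiting illustration, the following Python sketch shows the basic hit and miss handling described above: a stored response is delivered directly, and otherwise the query is forwarded to a content source, the response is stored, and the response is relayed to the client. The class `CachingService`, the `fetch_from_source` callable, and keying the cache by (object identifier, part) are hypothetical simplifications; verification against meta-data and redirection of the client are omitted.

```python
class CachingService:
    def __init__(self, fetch_from_source):
        # fetch_from_source(object_id, part) stands in for forwarding the
        # data query to a content source and reading its response.
        self._fetch = fetch_from_source
        self._store = {}  # (object_id, part) -> response bytes

    def handle_query(self, object_id, part):
        """Return (response, outcome), where outcome is 'hit' or 'miss'."""
        key = (object_id, part)
        if key in self._store:
            # Matching response is stored: deliver it from cache.
            return self._store[key], "hit"
        # Not stored: retrieve from the content source, store, and relay.
        response = self._fetch(object_id, part)
        self._store[key] = response
        return response, "miss"
```

A second query for the same part of the same object is then served without contacting any content source.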
(b) Multi-Session Applications Allowing Identification of Content Object Only in Some Sessions
[0076] Client A.sub.2 establishes multiple sessions S.sub.2 to one or more destinations B.sub.2 to retrieve content object Z.sub.2, in parallel or in series. The client A.sub.2 sends data queries for portions of the content object Z.sub.2 in each such session.
[0077] Depending on the naming convention for the content object Z and/or its parts, used by client A.sub.2 and destination(s) B.sub.2, the caching service C, intercepting or receiving sessions S.sub.2, may not be able to identify the content object and/or portions of it requested by client A.sub.2 in each session, or identify data responses matching those queries.
[0078] The client A.sub.2 and content source(s) B.sub.2 may identify object Z.sub.2 using dynamic URLs (so-called “hashed URLs”) that are assigned uniquely for each download of the content object Z.sub.2. In this case, caching service C cannot rely on the data in the data query alone to identify a matching response, but rather analyzes data responses to identify the requested object and match it to the previously stored data responses.
[0079] According to one or more embodiments, when receiving such data queries and/or responses in one or more sessions S.sub.2 that allow identification of the content object Z, C stores the content object Z.sub.2 identification together with the IP address of client A.sub.2, the IP address of content source B.sub.2, and the dynamic content identification (e.g., URL) used by client A.sub.2, in a list L.sub.2.
[0080] According to one or more embodiments, when caching service C receives a data query and/or data response that does not allow it to identify the content object Z referenced in the query and/or response, caching service C establishes whether the IP address of client A.sub.2, dynamic content identification URL, and IP address of content source B.sub.2 are stored in list L.sub.2.
[0081] According to one or more embodiments, in case of applications that utilize multiple content sources, the caching service C may disregard the IP address of content source B.sub.2.
[0082] According to one or more embodiments, caching service C removes entries from list L.sub.2 based on a timeout since the last activity seen from client A.sub.2 related to content object Z.sub.2.
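By way of non-limiting illustration, the following Python sketch shows one possible form of the list L.sub.2: entries are keyed by client IP address, dynamic content identification, and content source IP address, the source address can optionally be disregarded, and entries lapse after a period of inactivity. The class name `SessionCorrelationList`, its methods, and the choice to expire entries lazily on lookup are hypothetical.

```python
import time


class SessionCorrelationList:
    def __init__(self, timeout_seconds, clock=time.monotonic, ignore_source=False):
        self.timeout = timeout_seconds
        self.clock = clock                  # injectable for testing
        self.ignore_source = ignore_source  # for multi-source applications
        self._entries = {}                  # key -> (object_id, last_seen)

    def _key(self, client_ip, dynamic_id, source_ip):
        return (client_ip, dynamic_id, None if self.ignore_source else source_ip)

    def remember(self, client_ip, dynamic_id, source_ip, object_id):
        """Record an identification made in a session that allowed it."""
        key = self._key(client_ip, dynamic_id, source_ip)
        self._entries[key] = (object_id, self.clock())

    def resolve(self, client_ip, dynamic_id, source_ip):
        """Return the object for a session that does not identify it, or None."""
        key = self._key(client_ip, dynamic_id, source_ip)
        entry = self._entries.get(key)
        if entry is None:
            return None
        object_id, last_seen = entry
        now = self.clock()
        if now - last_seen > self.timeout:
            del self._entries[key]  # inactive for too long: drop the entry
            return None
        self._entries[key] = (object_id, now)  # activity refreshes the entry
        return object_id
```

Constructing the list with `ignore_source=True` corresponds to the embodiment in which the IP address of content source B.sub.2 is disregarded.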
[0083] The processes of the caching service described above may be implemented in software, hardware, firmware, or any combination thereof. The processes are preferably implemented in one or more computer programs executing on a programmable device including a processor, a storage medium readable by the processor (including, e.g., volatile and non-volatile memory and/or storage elements), and input and output devices. Each computer program can be a set of instructions (program code) in a code module resident in the random access memory of the device. Until required by the device, the set of instructions may be stored in another computer memory (e.g., in a hard disk drive, or in a removable memory such as an optical disk, external hard drive, memory card, or flash drive) or stored on another computer system and downloaded via the Internet or other network.
[0084] Having thus described several illustrative embodiments, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to form a part of this disclosure, and are intended to be within the spirit and scope of this disclosure. While some examples presented herein involve specific combinations of functions or structural elements, it should be understood that those functions and elements may be combined in other ways according to the present disclosure to accomplish the same or different objectives. In particular, acts, elements, and features discussed in connection with one embodiment are not intended to be excluded from similar or other roles in other embodiments.
[0085] Additionally, elements and components described herein may be further divided into additional components or joined together to form fewer components for performing the same functions. For example, the caching service may comprise one or more physical machines, or virtual machines running on one or more physical machines. In addition, the caching service may comprise a cluster of computers or numerous distributed computers that are connected by the Internet or another network.
[0086] Accordingly, the foregoing description and attached drawings are by way of example only, and are not intended to be limiting.