Intelligent switching of client packets among a group of servers
09954785 · 2018-04-24
Assignee
Inventors
- Leonard L. Lu (Holmdel, NJ)
- Deh-Phone K. Hsing (Basking Ridge, NJ)
- Bo-Chao Cheng (Summit, NJ)
- Tsong-Ho Wu (Englishtown, NJ)
CPC classification
H04L67/02
ELECTRICITY
H04L67/10015
ELECTRICITY
H04L47/25
ELECTRICITY
H04L67/1008
ELECTRICITY
H04L47/32
ELECTRICITY
International classification
Abstract
The content-aware application switch and methods thereof intelligently switch client packets to one server among a group of servers in a server farm. The switch uses Layer 7 or application content parsed from a packet to help select the server and to schedule the transmitting of the packet to the server. This enables refined load balancing and Quality-of-Service control tailored to the application being switched. In an exemplary embodiment of the invention, a method includes maintaining a server load metric for each server in a group of servers; parsing application content from a packet; selecting a destination server from the group of servers, wherein selecting the destination server is dependent on the server load metric for each server; assigning a priority to the packet, the priority being dependent on the application content; and dropping the packet if the priority comprises at least one of a predetermined type.
Claims
1. A method comprising: parsing application content from a packet; selecting a destination server from a group of servers, wherein selecting the destination server is dependent on a load metric for each server; assigning a priority to the packet, the priority dependent on the application content; transmitting the packet to the destination server according to a transmitting schedule; and dropping the packet if the priority comprises a predetermined type.
2. The method of claim 1, wherein the transmitting schedule is dependent on the priority.
3. The method of claim 1, further comprising referencing an access control list to enforce security, the access control list being dependent at least in part upon the application content.
4. The method of claim 3, wherein referencing an access control list further comprises dropping the packet if the application content comprises a first predetermined type.
5. The method of claim 4, wherein referencing an access control list further comprises redirecting the packet to a predetermined location if the application content comprises a second predetermined type.
6. The method of claim 1, wherein the priority comprises at least one of a first priority and a second priority, the second priority being lower than the first priority, wherein the destination server has a workload above a first predetermined level, and the transmitting schedule is constructed such that, if the packet comprises the first priority then the packet is transmitted to the destination server without delay and if the packet comprises the second priority then the packet is held back from being transmitted to the destination server.
7. The method of claim 1, further comprising at least one of: determining at least one eligible server from among the group of servers, wherein determining at least one eligible server is dependent on the application content; and selecting the destination server from among the at least one eligible servers.
8. The method of claim 1, further comprising determining an estimated application load for the destination server, the estimated application load being dependent at least in part upon the application content; wherein selecting a destination server is dependent at least in part upon the estimated application load.
9. The method of claim 1, wherein the transmitting schedule is dependent at least in part upon the server load metric for each server.
10. The method of claim 1, wherein the application content comprises a Hypertext Transfer Protocol header.
11. A non-transitory computer readable medium, the computer readable medium including instructions, which when executed perform steps comprising: parsing application content from a packet; selecting a destination server from a group of servers, wherein selecting the destination server is dependent on a load metric for each server; assigning a priority to the packet, the priority dependent on the application content; transmitting the packet to the destination server according to a transmitting schedule; and dropping the packet if the priority comprises a predetermined type.
12. The non-transitory computer readable medium of claim 11, wherein the transmitting schedule is dependent on the priority.
13. The non-transitory computer readable medium of claim 11, wherein the steps further comprise referencing an access control list to enforce security, the access control list being dependent at least in part upon the application content.
14. The non-transitory computer readable medium of claim 13, wherein referencing an access control list further comprises dropping the packet if the application content comprises a first predetermined type.
15. The non-transitory computer readable medium of claim 14, wherein referencing an access control list further comprises redirecting the packet to a predetermined location if the application content comprises a second predetermined type.
16. The non-transitory computer readable medium of claim 11, wherein the priority comprises at least one of a first priority and a second priority, the second priority being lower than the first priority, wherein the destination server has a workload above a first predetermined level, and the transmitting schedule is constructed such that, if the packet comprises the first priority then the packet is transmitted to the destination server without delay and if the packet comprises the second priority then the packet is held back from being transmitted to the destination server.
17. A method comprising: parsing application content from a packet; determining at least one eligible server from a group of servers, wherein determining at least one eligible server is dependent at least in part upon the application content; selecting a destination server from the at least one eligible server; determining an estimated application load for the destination server, the estimated application load dependent at least in part upon the application content; assigning a priority to the packet, the priority being dependent at least in part upon the application content; transmitting the packet to the destination server according to a transmitting schedule; and dropping the packet if the priority comprises a predetermined type.
18. The method of claim 17, wherein the step for selecting the destination server is dependent at least in part upon the server load metric for each server in the group of servers and the estimated application load.
19. The method of claim 17, wherein the transmitting schedule is dependent on the priority.
20. The method of claim 17, wherein the priority further comprises at least one of a first priority and a second priority, the second priority being lower than the first priority, wherein the destination server has a workload above a first predetermined level, and the transmitting schedule is constructed such that, if the packet comprises the first priority then the packet is transmitted to the destination server without delay and if the packet comprises the second priority then the packet is held back from being transmitted to the destination server.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
(24) Content-Aware Application Switch
(25) According to one aspect of the present invention, a content-aware application switch enables switching a client packet to a server among a group of servers, where the selected server and the priority of the packet are dependent on the content carried in the packet.
(27) A browser accesses a web application by the URL pointing to the website hosting the web application. In the simple case when a web server is accessible directly from the Internet, the URL for the website will be the IP address of the server. In the case of the data center, where the web servers are behind the application switch 20 and not directly accessible, the URL for the website is a virtual IP address that actually points to the application switch. Each server in the server farm 30 has its own IP address, which is a local IP address accessible only within the data center. The virtual IP address enables the client packet to be delivered to the switch. The switch selects a server and delivers the packet to the selected server by addressing the server's local IP address. This switching process involves modifying the destination IP address in each packet header from that of the switch to that of the selected server. Likewise, the hardware MAC address in the packet header is modified.
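The switching step described above amounts to rewriting the Layer 2 and Layer 3 destination fields of each packet. A minimal sketch in Python, assuming a simplified header representation (the field and function names are illustrative, not the patent's implementation):

```python
from dataclasses import dataclass

@dataclass
class PacketHeader:
    dst_ip: str   # Layer 3 destination (initially the switch's virtual IP)
    dst_mac: str  # Layer 2 hardware destination

def switch_to_server(header: PacketHeader, server_ip: str, server_mac: str) -> PacketHeader:
    """Rewrite the virtual-IP destination to the selected server's
    local IP address and MAC address, as the application switch does."""
    header.dst_ip = server_ip
    header.dst_mac = server_mac
    return header
```

The server's local IP address is only reachable within the data center, so this rewrite is what makes delivery to the selected server possible.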
(28) The application switch 20 basically receives packets from clients on the Internet, examines the content of the packets, and based on the content, prioritizes the packets and selects appropriate servers to send the packets to. The application switch 20 comprises an IP bus 100 that enables IP communications with the Internet 50 and the server farms 30. A buffer controller 120 temporarily stores packets passing through the application switch.
(29) A packet classifier 140 snoops packets received from the IP bus 100, examining the content of each packet and classifying the identified content pattern as one of a set of predefined content pattern classes.
(30) A transmit controller 160 uses the assigned content class to look up a set of tables 180 to determine transmit instructions for each packet. Ultimately, the transmit instructions determine which server the packet is to be directed to, and in what order. The transmit controller includes a load balancer 162 and a QoS controller 164. The load balancer 162 selects a server that can best serve the request associated with the packet based on the content class, server farm configuration and the current loads of the servers. The QoS controller 164 prioritizes each packet based on the content class and the predefined policy for each class.
(31) Once the transmit instructions have been determined for each packet, the transmit controller cooperates with the buffer controller 120 to release the temporarily stored packets to the selected server according to their determined priority.
(32) Thus, it can be seen that packets are routed from the client 60 to the application switch 20 using Layer 3 information. From the switch to a selected server, Layer 4-7 information is used to select a server and set priority, and Layers 2-4 information is used to deliver the packet to the selected server.
(34) The content-aware application switch 20 shown in
(35) The application-related or Layer 7 message carried in a packet includes the HTTP header and other HTTP payload such as data or other personalized information. The information can be used to make switching decisions based on QoS considerations. For example, at an online merchandising website, the customer who is actually trying to buy an item can be distinguished from another who is merely browsing a catalog by the web page they are currently requesting. This information can be determined from the URL contained in the HTTP header, since it actually points to the current webpage. Similarly, the packets from a preferred customer may be identified by the cookie contained in the HTTP header. Based on a set of predefined policies, the packets can be prioritized, with the higher priority ones getting better quality of service (QoS).
(37) Buffering:
(38) Step 200: Store an input packet in a buffer. The input packet is typically associated with a client request.
(39) Content Classification:
(40) Step 210: Parse out the Layer 7 content from the packet. Lower layer information is also parsed out to obtain various addresses, but Layer 7 content provides information about the application associated with the input packet.
(41) Step 212: Assign a content class to the packet based on its parsed content by reference to a predefined set of content class definitions.
(42) Load-Balancing:
(43) Step 220: Select a destination server for the packet from among a group of servers, where the server selection is a predefined function of the individual server properties, including server loads, among the group of servers.
(44) Alternatively,
(45) Step 220: In another embodiment, select a destination server for the packet from among a group of servers, where the server selection is a predefined function of the individual server properties among the group of servers and the content class of the packet. The dependency on content class allows more refined load balancing. For example, the content class assigned to the packet allows identification of the group of eligible servers for serving the application, and also of the estimated load the application places on the selected server.
(46) QoS Control:
(47) Step 230: Queue the packet according to a priority that is given by a predefined function of content class.
(48) Switching:
(49) Step 240: Release the packet from the buffer to the destination server according to a schedule that depends on the assigned priority and the properties and workload of the destination server.
(50) In the preferred embodiment, all packets belonging to the same TCP session are assigned similar transmission characteristics and are therefore treated as a group. Classification by the packet classifier 140 need only be performed on a sample packet of the group. The transmit controller 160 then assigns the same classification to the rest of the packets in the same TCP session. In this way, it has been estimated that the packet classifier need only process five percent of all packet traffic. A TCP session can be identified by the unique combination of Source IP Address (Layer 3 header) and Source Port Number (Layer 4 header).
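The per-session reuse of a single classification can be sketched as follows; the cache keyed by (Source IP Address, Source Port Number) mirrors the TCP-session identification above, while the function names and dictionary layout are assumptions:

```python
# Cache the content class per TCP session so the full Layer 2-7 parse
# runs only on a sample packet of each session.
session_classes: dict[tuple[str, int], str] = {}

def classify_packet(src_ip: str, src_port: int, full_classify) -> str:
    """Return the content class for a packet, running full_classify()
    only for the first packet seen on its TCP session."""
    key = (src_ip, src_port)           # unique TCP-session identifier
    if key not in session_classes:     # sample packet of the session
        session_classes[key] = full_classify()
    return session_classes[key]        # subsequent packets reuse the class
```

Under this scheme, later packets of the same session incur only a dictionary lookup, consistent with the estimate that the classifier need process only a small fraction of the traffic.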
(52) Buffer Controller
(53) Packets entering the application switch 20 are temporarily stored in the buffer controller 120. The buffer controller comprises a receiver 320, a packet buffer 322 and a transmitter 324. On the ingress side, client packets arriving from the Internet are picked up by the receiver 320 from the IP bus. The receiver stores the packets in the packet buffer 322 and also creates a packet tag for each stored packet. The packet tag is a token for the stored packet and is used by the transmit controller to assign transmit instructions to the stored packet.
(55) Tables
(56) Returning to
(57) The tables include a TCP session table 182, an ACL (access control list) table 184, one or more server tables 186, a class policy table 188 and others. These tables are accessible by the various components of the application switch and will be described more fully in connection with these components later. In the preferred embodiment, the tables are stored in non-volatile memory and loaded into random access memory (RAM) during operation.
(58) The TCP session table 182 keeps track of TCP sessions. When a packet is received into the application switch, it is checked against the TCP session table to see if it belongs to an existing TCP session. If the packet is part of an existing TCP session, it will be assigned transmission instructions similar to other packets of the same TCP session. If the packet does not match any existing TCP session, a new TCP session will be registered in the TCP session table 182.
(59) The ACL table 184 is a listing of access control instructions versus content class. Essentially, it allows a user or an administrator to control access based on parsed Layer 2-7 information. In one embodiment, it is incorporated into the class policy table.
(60) The class policy table 188 allows a user or an administrator to set policies or business rules to different content classes. In the preferred embodiment, a priority is assigned to each content class.
(61) Packet Classifier
(62) As a receive packet is taken up by the receiver 320, a copy of it is snooped by the packet classifier 140. The packet classifier comprises a content parser 340 that parses out content from the various header fields and data portion (Layers 2 to 7) of a packet. A content class classifier 342 recognizes the parsed content as one of a set of predefined patterns and classifies each pattern by reference to a content class dictionary 344.
(64) Returning to
(65) Basically, Layer 2-4 information parsed by the packet classifier is used by the receiver 320 to make a preliminary determination of what to do with each packet. For example, the classification on Layer 2-4 headers is useful for screening out uninteresting web traffic. This includes:
(66) a. Checking destination MAC address for Layer 2 switching;
(67) b. Setting an Access Control List (ACL) by looking at TCP/IP headers, and returning a flag for rejected or allowed packet traffic;
(68) c. Identifying uninteresting web traffic by looking at TCP/IP headers, such as user-specified VIP and/or web service port numbers, and returning a flag to indicate uninteresting web traffic and a forwarding output MAC port number;
(69) d. Identifying management traffic by looking at TCP/IP headers, such as the appearance of the actual IP address of the application switch and/or network management port numbers, and returning a flag to indicate management traffic and a forwarding output MAC port number.
(70) In a preferred embodiment, Layer 7 information is also examined in making a preliminary determination of what to do with each packet. In particular, the ACL is also controlled by application layer information, whereby packets carrying a certain class of content are accepted, rejected or redirected to a predetermined location.
(71) If the receiver 320 determines from the Layers 3-4 information that the packet is not related to web traffic, the transmit controller 160 will not need the application layer (Layer 7) information from the packet classifier 140. It will be notified by the receiver to process the transmit instructions of the packet accordingly. Otherwise, the transmit controller will take the application layer information into account.
(72) The classification on application layer (Layer 7) content makes it possible to assign, in combination with the policy table 188, more refined transmit instructions for a packet. In general, the inbound packets are classified based on at least three categories of information. The first one is related to the nature of the application. Different applications will be treated differently based on their business values. Different application can be identified by the associated URL path information. The second is related to client's history. Based on the historical behavior of a client, the client can be assigned a priority. A specified cookie field can be used to accumulate client history information, and be examined to classify the inbound packets. The third category is related to client's browsing status. The business value associated with clients in different browsing stages will be different. A client in a buying mode who has put items into the shopping cart and/or provided his/her credit card information has higher business value than the clients in random surfing mode. The different browsing stages can be determined by examining the URL paths (i.e., the web pages being pointed to) and/or from specified cookie fields established to identify clients in different browsing stages.
(73) The application layer classification therefore includes checking URL and Cookie values, and returning URL and Cookie pattern indices based on these values. From the URL, the HTTP request method (GET, HEAD, POST, etc.) is also examined. Policy can be set to disable certain methods, like PUT or DELETE. Examples of other possible URL patterns that may be checked include:
- GET /subdir/filename.html (or HEAD, POST, PUT, etc.)
- GET /subdir/*.gif
- GET / (all .cgi, .bin, and .exe files)
- GET /*.asp?userid=1234 (or all userid between 100 and 500)
- Host: www.companyname.com
- Referer: http://www.companyname0.com/subdir/filename.html
(74) Examples of possible Cookie patterns that may be checked include:
- Cookie: ***; userid=1234; (or all userid between 100 and 500)
- Cookie: ***; shoppingcartexists=yes; ***; shipping=fedex
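Such URL and Cookie pattern checks could be sketched with regular expressions as below; the specific patterns and content-class names are illustrative assumptions, not entries from the patent's content class dictionary:

```python
import re

# Illustrative (pattern, content-class) rules, checked in order.
RULES = [
    (re.compile(r"^GET /checkout"), "buying"),          # buying-mode URL path
    (re.compile(r"shoppingcartexists=yes"), "buying"),  # cookie-based signal
    (re.compile(r"^GET /catalog"), "browsing"),         # catalog-surfing path
]

def classify(http_text: str) -> str:
    """Map an HTTP request line or Cookie header to a content class."""
    for pattern, content_class in RULES:
        if pattern.search(http_text):
            return content_class
    return "default"  # no rule matched
```

A class policy table would then map each returned class name to a priority.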
(75) In a preferred embodiment, the packet classifier 140 is implemented with the aid of a PAX.port 1100 Classification Processor manufactured by Solidum Systems Corp., Scotts Valley, Calif., U.S.A. Classification is performed only on packets from the Internet side, i.e., ingress traffic; the classifier inspects all packets at a speed of 500 Mb/s (312K packets per second (pps), assuming an average packet size of 200 bytes) and parses all Layer 2 to 7 fields.
(76) Transmit Controller
(77) The transmit controller's job is to use information parsed from a packet to assign transmit instructions for the packet in order to stage the packet for transmission. In the preferred embodiment, the transmit controller performs an initial application-layer security screening by checking against the ACL table 184. Depending on the determined content class of the packet, the ACL table may prescribe that the packet is to be dropped, redirected to a predetermined location, or subjected to other actions. On the other hand, if the ACL table grants access for the packet to be switched to a destination server in the server farm, the transmit controller will invoke the load balancer 162 and the QoS controller 164 to provide transmit instructions for the packet.
(78) In the preferred embodiment, the transmit instructions include specifying which destination server the packet is to be sent to and with what priority. The destination server is determined by the load-balancer 162 component of the transmit controller and the priority is determined by the QoS controller 164 component. These determinations are made by reference to both the content class for the packet and tables containing server and priority information.
(82) To determine a destination server for a packet, the load balancer 162 (
(83) At least four types of load-balancing schemes are applicable. The first three types are simpler, as they do not consider the number of existing connections on each server. For example, the first type is Round Robin, which chooses a server among a group in turn. The second is Weighted Round Robin, which is similar to Round Robin, but each server is weighted by its DefaultServerWeight (
(84) The fourth type of load-balancing scheme is Weighted Least Connection, which is the preferred scheme. It involves more computation but provides a more refined balance. Basically, it selects a server with the minimum number of weighted connections (i.e., CurrentLoad). A weighted connection takes into consideration that different classes of requests present different loads on a server, as represented by the ClassWeight value given in the policy table of
(86) Step 370: Read the list of servers in ServerGroup. ServerGroup is the group of servers predetermined to be eligible for serving packets of a given content class (see.
(87) Step 372: Delete all servers with CurrentConnections+1>=MaxConnections. All servers in the group that are already connected to the maximum need not be considered.
(88) Step 374: For the remaining servers in the ServerGroup, select the server with the smallest CurrentLoad.
(89) Step 376: End.
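Steps 370 to 376 can be sketched as follows, assuming each server's table entry is represented as a dictionary with the field names used above:

```python
def select_server(server_group):
    """Weighted Least Connection selection: drop servers already at
    their connection limit (Step 372), then pick the server with the
    smallest CurrentLoad among the remainder (Step 374)."""
    eligible = [s for s in server_group
                if s["CurrentConnections"] + 1 < s["MaxConnections"]]
    if not eligible:
        return None  # every server in the group is saturated
    return min(eligible, key=lambda s: s["CurrentLoad"])
```

Returning None when all servers are saturated is an assumption of this sketch; the switch could equally queue or drop the packet in that case.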
(90) Conventional load-balancing mechanisms take into account only a server utilization factor. If all servers are busy, no request can get in, since such mechanisms give no preferential treatment to traffic with higher business value. The present application switch 20 is capable of prioritizing all inbound packets (i.e., packets on ingress traffic) according to predefined business values.
(91) In the preferred embodiment, the QoS controller 164 (
(93) The transmit scheduler 360 effectively generates a transmit queue by picking off the packet tags from the various queues according to a predefined schedule and sends the prioritized transmit packet tags to the transmitter 324 of the Buffer controller 120 (see
(94) The transmit scheduler schedules removal of the packet tags from each queue with the aid of two flags. The EmptyFlag indicates whether the queue is empty (=1) or nonempty (=0). When a queue is non-empty, it is ready for packet removal, subject to the condition of the ActiveFlag. The ActiveFlag is used to implement server headroom and indicates whether the queue is active (=1) or not active (=0) for packet tag removal. When a queue is inactive, it is in a sleep state, and can be used to hold back lower priority packets. In general, there will be a set of these two flags for each queue. The ActiveFlags such as HActiveFlag 414, MActiveFlag 424 and LActiveFlag 434 are updated dynamically at predetermined intervals based on the load of the associated server. Thus, for the High priority queue 410, the corresponding flags are HEmptyFlag 412 and HActiveFlag 414. For the Medium priority queue 420, the corresponding flags are MEmptyFlag 422 and MActiveFlag 424. For the Low priority queue 430, the corresponding flags are LEmptyFlag 432 and LActiveFlag 434.
(96) In addition to the sleeping times for Medium and Low priority queues, there are also maximum queue size thresholds for both Medium and Low priority queues. When the Medium (Low) priority queue reaches the maximum queue size, the oldest packet tags in the Medium (Low) priority queue will be discarded. Since High priority packet tags will be served first and there is likely no large queue size built up for these packet tags, there will be no maximum queue size restriction for them.
(98) High Priority Queue
(99) Step 450: While there is a packet tag in the High priority queue and the queue is active, transfer the packet tag to the transmitter. Otherwise go to Step 460.
(100) Medium Priority Queue
(101) Step 460: Do Step 462 while the High priority queue is empty, otherwise go to Step 450.
(102) Step 462: While there is a packet tag in the Medium priority queue and the queue is active, transfer the packet tag to the transmitter. Otherwise go to Step 470.
(103) Low Priority Queue
(104) Step 470: Do Step 472 while the High priority queue is empty, otherwise go to Step 450.
(105) Step 472: Do Step 474 while the Medium priority queue is empty, otherwise go to Step 462.
(106) Step 474: While there is a packet tag in the Low priority queue and the queue is active, transfer the packet tag to the transmitter, otherwise go to Step 480.
(107) Step 480: End.
(108) For N server ports, when more than one port has a non-empty High priority queue, the transmit scheduler transfers the packet tags from the plurality of non-empty High priority queues in an equitable manner, such as using a Round Robin schedule. The Medium and Low priority queues are treated similarly.
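For a single server port, the strict-priority scan of Steps 450 to 480, gated by the EmptyFlag and ActiveFlag conditions, can be sketched as follows (the queue representation is an assumption of this sketch):

```python
from collections import deque

def next_packet_tag(queues, active):
    """Return the next packet tag to hand to the transmitter: High
    before Medium before Low, skipping any queue that is empty
    (EmptyFlag=1) or asleep (ActiveFlag=0)."""
    for level in ("high", "medium", "low"):
        q = queues[level]
        if q and active[level]:
            return q.popleft()
    return None  # nothing eligible to transmit this round
```

Because lower-priority queues are consulted only when every higher-priority queue is empty, deactivating a queue via its ActiveFlag holds back its packets exactly as described for server headroom.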
(109) Referring again to
(110) Traffic from a server back to the client is usually deterministic and the application switch merely performs the function of changing the IP and MAC addresses of a packet from that of the switch to that of the client. This is implemented by TCP splicing or Network Address Translation (NAT).
(111) The application switch is preferably implemented by a collection of tightly coupled application-specific integrated circuits (ASICs). In the preferred embodiment, a network processor, embodied by multiple programmable microengines and a core processor, is used to implement and manage the various components. An example of such a network processor is Intel IXP 1200 Network Processor manufactured by Intel Corporation, Santa Clara, Calif., U.S.A.
(112) Slow-Start New Server Selection
(113) According to another aspect of the invention, a slow-start server selection method is advantageously employed to prevent a server newly put online from being swamped under existing load-balancing schemes.
(114) The four established load-balancing algorithms identified earlier do not address the problem that arises when a new server is brought online among a group of servers participating in load balancing. The newly added server, by virtue of an initially low workload, can be flooded with new requests, which will quickly degrade the service quality perceived by users. This is because these algorithms take into account only a server utilization factor (e.g., selecting the server with the least workload), resulting in the selection tipping heavily toward the newly added server.
(115) As described earlier, the preferred load-balancing scheme is Weighted Least Connection, which selects a server with the minimum CurrentLoad, where CurrentLoad = number of weighted connections = Σ DefaultServerWeight × ClassLoad(i), summed over i = 1 to CurrentConnections. (See
(116) In the slow-start load-balancing method, the server load metric for a newly added server has a configurable server-weight factor. In the calculation for CurrentLoad, the DefaultServerWeight is replaced by a DynamicServerWeight (see
(119) In a preferred implementation, there are three processes. The first is the initialization process of setting an initial value for the ServerWeight of the new server. The second is switching a packet subject to load balancing with the group of servers, including the new server. The third is to adjust the ServerWeight of the new server to converge to DefaultServerWeight over a predefined period during load balancing.
(121) Step 510: Initialize k=SlowStartCount.
(122) Step 512: Set ServerWeight = DynamicServerWeight = 2^k × DefaultServerWeight, then go to Step 540.
(123) Step 540: At predefined intervals while load balancing is ongoing (see Steps 530 and 532), test whether k > 0; if so, go to Step 542, else go to Step 550.
(124) Step 542: Set k = k − 1, halving the server-weight factor, then go to Step 512.
(125) Step 550: End.
(126) This is the point where k is zero, and the new server has a ServerWeight equal to DefaultServerWeight. Its CurrentLoad normalized by its DefaultServerWeight is similar to that of the other servers in the group.
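The weight sequence produced by Steps 510 to 550 can be sketched as a generator: starting at 2^k times DefaultServerWeight inflates the new server's CurrentLoad so it initially attracts fewer requests, and the weight converges to DefaultServerWeight as k reaches zero (the function name and types are assumptions of this sketch):

```python
def slow_start_weights(default_weight, slow_start_count):
    """Yield the DynamicServerWeight sequence for a newly added server:
    2**k * DefaultServerWeight, halving at each predefined interval as
    k is decremented, ending at DefaultServerWeight when k == 0."""
    k = slow_start_count            # Step 510: k = SlowStartCount
    while k > 0:
        yield (2 ** k) * default_weight  # Step 512
        k -= 1                           # Step 542
    yield default_weight            # Step 550: converged
```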
(128) Step 530: Compute CurrentLoad of the server using ServerWeight.
(129) Step 532: Select a server based on CurrentLoads among the Server Group.
Content-Aware Switching Without Delayed Binding
(130) According to another aspect of the invention, a method is provided to perform content-aware switching without incurring delay and excessive processing while initially waiting for content to become available in order to make switching decisions.
(131) The servers in a data center/call center can be the performance bottleneck for web applications in many cases. Existing load-balancing algorithms mostly use Layer 3 and/or Layer 4 information to select a server.
(132) As described earlier, different web applications may have different server load implications. This information is derivable by identifying from the application layer (Layer 7) the class of application and associating it with a ClassWeight (see
(133) However, application layer information typically arrives after the initial TCP handshaking process. The first few packets used for handshaking purposes carry no application layer information. Thus, if load balancing is also dependent on Layer 7 information, the switch must either ignore that information or wait until the handshaking is completed to obtain it before selecting a server (delayed binding). The former approach treats all applications on the same server group equally and does not take into account the difference in load demand by different applications. The latter uses TCP splicing and is process intensive.
(135) The present invention prescribes using application layer (Layer 7) information to perform load balancing as soon as the first handshaking packet from a new TCP session arrives. This is accomplished by using the application layer information from a previous TCP session as a best estimate for the new session. This scheme works if there is only one server group in the server farm, as is typical, and therefore Layer 7 information is not necessary to select a server group. Thus, load balancing is performed on the basis of workloads of servers based on data from a previous TCP session. Since a server can be selected on the fly, the handshaking packets can be sent directly to the server without performing the tandem process of TCP splicing.
(137) As for QoS control, the few handshaking packets at the beginning of a new TCP session are assigned a default High priority so that the handshaking process can be completed without delay and the Layer 7 information made available as soon as possible. When the Layer 7 information becomes available, it will be used to prioritize the current packets in the manner described earlier.
(139) Step 600: If the packet is a first handshaking packet, go to Step 610, else go to Step 620.
(140) Step 610: Retrieve existing server load metrics for the group of servers under load balancing. These existing server load metrics have been updated based on application layer information of a previous TCP session.
(141) Step 612: Select a server from the group of servers based on the existing server load metrics.
(142) Step 614: Set the packet to a default priority. Go to Step 630.
(143) Step 620: If the packet is a handshaking packet, go to Step 622, else go to Step 630.
(144) Step 622: If application layer information for current TCP session has already been obtained, go to Step 624, else go to Step 630.
(145) Step 624: Use the application layer information from the packet to set priority for the packet.
(146) Step 626: Use the application layer information from the packet to update the server load metrics. The server load metric will be used in the next TCP session to select a server. Go to Step 630.
(147) Step 630: Direct the packet to the selected server according to a predefined schedule dependent on the assigned priority.
(148) Step 640: End.
(149) Thus it is possible, by the present invention, to implement application-aware load balancing and QoS control without having to use delayed binding. The server load metrics used in the load-balancing algorithm are updated based on application layer information after a server is selected. This way, the server selection process does not need to wait (hence requiring delayed binding) for application layer information to arrive in order to select a server for the current request. However, after the application layer information is available, the server load metric is updated based on the application and hence reflected in the next server selection process.
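The flow of Steps 600 to 640 may be sketched as follows, with dictionary-based packets, sessions and metrics as illustrative assumptions of this sketch:

```python
def handle_packet(pkt, metrics, select_server, session):
    """Dispatch one packet per Steps 600-640 of the scheme above."""
    if pkt.get("first_handshake"):
        # Steps 610-614: select a server using load metrics carried over
        # from previous TCP sessions; handshaking gets default High priority.
        session["server"] = select_server(metrics)
        pkt["priority"] = "high"
    elif "layer7" in pkt:
        # Steps 622-626: once application layer content is available, use it
        # to set this packet's priority and refresh the server load metric
        # consulted by the next TCP session's server selection.
        pkt["priority"] = pkt["layer7"]["priority"]
        metrics[session["server"]] = pkt["layer7"]["load"]
    else:
        pkt["priority"] = "high"  # remaining handshaking packets
    # Step 630: the packet is then released per a priority-dependent schedule.
    return pkt
```

Because the server is chosen from existing metrics on the very first handshaking packet, no TCP splicing or delayed binding is needed.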
(150) While the embodiments of this invention that have been described are the preferred implementations, those skilled in the art will understand that variations thereof may also be possible. Therefore, the invention is entitled to protection within the full scope of the appended claims.