LOSSLESS DATA TRAFFIC DEADLOCK MANAGEMENT SYSTEM
20210126865 · 2021-04-29
Inventors
Cpc classification
H04L43/10
ELECTRICITY
H04L47/2475
ELECTRICITY
International classification
Abstract
A lossless data traffic deadlock management system includes a first networking device coupled to a second networking device and a third networking device. The first networking device provides a lossless data traffic flow on a first data traffic path via the second networking device and to a destination. The first networking device then receives a congestion communication from the second networking device that is indicative of a deadlock associated with the second networking device. In response, the first networking device identifies the first data traffic path as a congested route, and the identification of the first data traffic path as the congested route causes the first networking device to provide the lossless data traffic flow on a second data traffic path via the third networking device to the destination.
Claims
1. A lossless data traffic deadlock management system, comprising: a second networking device; a third networking device; and a first networking device that is coupled to the second networking device and the third networking device, wherein the first networking device is configured to: provide, on a first data traffic path via the second networking device and to a destination, a lossless data traffic flow; receive, from the second networking device, a congestion communication that is indicative of a deadlock associated with the second networking device; and identify, in response to receiving the congestion communication, the first data traffic path as a congested route, wherein the identification of the first data traffic path as the congested route causes the first networking device to provide the lossless data traffic flow on a second data traffic path via the third networking device to the destination.
2. The system of claim 1, wherein the first networking device is configured to: receive, on a port that is connected to the second networking device, the congestion communication; identify, via at least one application layer, one or more routes that are reachable via the second networking device that is connected to the port; determine that the destination is reachable utilizing the first data traffic path; and identify the first data traffic path as the congested route by associating a congestion flag with the first data traffic path.
3. The system of claim 2, wherein the first networking device is configured to: remove, via at least one non-application layer, a reachability of the destination via the second networking device, wherein the removal of the reachability of the destination via the second networking device causes the first networking device to provide the lossless data traffic flow on the second data traffic path via the third networking device to the destination.
4. The system of claim 1, wherein the first networking device is configured to: generate, in response to identifying the first data traffic path as the congested route, a congestion alarm for one or more routes that are reachable via the second networking device that is connected to the port.
5. The system of claim 1, wherein the first networking device is configured to: determine that the congestion situation in the second networking device no longer exists; and remove, in response to determining that the congestion situation in the second networking device no longer exists, the identification of the first data traffic path as the congested route, wherein the removal of the identification of the first data traffic path as the congested route causes the first networking device to provide the lossless data traffic flow on the first data traffic path via the second networking device to the destination.
6. The system of claim 5, wherein the first networking device is configured to: receive, on a port that is connected to the second networking device, a decongested communication that is indicative that the deadlock associated with the second networking device no longer exists; identify, via at least one application layer, the second networking device that is connected to the port; determine that the destination is reachable utilizing the first data traffic path via the second networking device; and remove the identification of the first data traffic path via the second networking device as the congested route by resetting a congestion flag associated with the first data traffic path.
7. An Information Handling System (IHS), comprising: a processing system; and a memory system that is coupled to the processing system and that includes instructions that, when executed by the processing system, cause the processing system to provide a deadlock management engine that is configured to: provide, on a first data traffic path via a second networking device and to a destination, a lossless data traffic flow; receive, from the second networking device, a congestion communication that is indicative of a deadlock associated with the second networking device; and identify, in response to receiving the congestion communication, the first data traffic path as a congested route, wherein the identification of the first data traffic path as the congested route causes the first networking device to provide the lossless data traffic flow on a second data traffic path via a third networking device to the destination.
8. The IHS of claim 7, wherein the deadlock management engine is configured to: receive, on a port that is connected to the second networking device, the congestion communication; identify, via at least one application layer, one or more routes that are reachable via the second networking device that is connected to the port; determine that the destination is reachable utilizing the first data traffic path; and identify the first data traffic path as the congested route by associating a congestion flag with the first data traffic path.
9. The IHS of claim 8, wherein the deadlock management engine is configured to: remove, via at least one non-application layer, a reachability of the destination via the second networking device, wherein the removal of the reachability of the destination via the second networking device causes the first networking device to provide the lossless data traffic flow on the second data traffic path via the third networking device to the destination.
10. The IHS of claim 7, wherein the deadlock management engine is configured to: generate, in response to identifying the first data traffic path as the congested route, a congestion alarm for one or more routes that are reachable via the second networking device that is connected to the port.
11. The IHS of claim 7, wherein the deadlock management engine is configured to: determine that the congestion situation in the second networking device no longer exists; and remove, in response to determining that the congestion situation in the second networking device no longer exists, the identification of the first data traffic path as the congested route, wherein the removal of the identification of the first data traffic path as the congested route causes the first networking device to provide the lossless data traffic flow on the first data traffic path via the second networking device to the destination.
12. The IHS of claim 11, wherein the deadlock management engine is configured to: receive, on a port that is connected to the second networking device, a decongested communication that is indicative that the deadlock associated with the second networking device no longer exists; identify, via at least one application layer, the second networking device that is connected to the port; determine that the destination is reachable utilizing the first data traffic path via the second networking device; and remove the identification of the first data traffic path via the second networking device as the congested route by resetting a congestion flag associated with the first data traffic path.
13. The IHS of claim 7, wherein the congestion communication is a Priority Flow Control (PFC) communication.
14. A method for managing deadlocks for lossless data traffic, comprising: providing, by a first networking device on a first data traffic path via a second networking device and to a destination, a lossless data traffic flow; receiving, by the first networking device from the second networking device, a congestion communication that is indicative of a deadlock associated with the second networking device; and identifying, by the first networking device in response to receiving the congestion communication, the first data traffic path as a congested route, wherein the identification of the first data traffic path as the congested route causes the first networking device to provide the lossless data traffic flow on a second data traffic path via a third networking device to the destination.
15. The method of claim 14, further comprising: receiving, by the first networking device on a port that is connected to the second networking device, the congestion communication; identifying, by the first networking device via at least one application layer, one or more routes that are reachable via the second networking device that is connected to the port; determining, by the first networking device, that the destination is reachable utilizing the first data traffic path; and identifying, by the first networking device, the first data traffic path as the congested route by associating a congestion flag with the first data traffic path.
16. The method of claim 15, further comprising: removing, by the first networking device via at least one non-application layer, a reachability of the destination via the second networking device, wherein the removal of the reachability of the destination via the second networking device causes the first networking device to provide the lossless data traffic flow on the second data traffic path via the third networking device to the destination.
17. The method of claim 14, further comprising: generating, by the first networking device, in response to identifying the first data traffic path as the congested route, a congestion alarm for one or more routes that are reachable via the second networking device that is connected to the port.
18. The method of claim 14, further comprising: determining, by the first networking device, that the congestion situation in the second networking device no longer exists; and removing, by the first networking device in response to determining that the congestion situation in the second networking device no longer exists, the identification of the first data traffic path as the congested route, wherein the removal of the identification of the first data traffic path as the congested route causes the first networking device to provide the lossless data traffic flow on the first data traffic path via the second networking device to the destination.
19. The method of claim 18, further comprising: receiving, by the first networking device on a port that is connected to the second networking device, a decongested communication that is indicative that the deadlock associated with the second networking device no longer exists; identifying, by the first networking device via at least one application layer, the second networking device that is connected to the port; determining, by the first networking device, that the destination is reachable utilizing the first data traffic path via the second networking device; and removing, by the first networking device, the identification of the first data traffic path via the second networking device as the congested route by resetting a congestion flag associated with the first data traffic path.
20. The method of claim 14, wherein the congestion communication is a Priority Flow Control (PFC) communication.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION
[0020] For purposes of this disclosure, an information handling system may include any instrumentality or aggregate of instrumentalities operable to compute, calculate, determine, classify, process, transmit, receive, retrieve, originate, switch, store, display, communicate, manifest, detect, record, reproduce, handle, or utilize any form of information, intelligence, or data for business, scientific, control, or other purposes. For example, an information handling system may be a personal computer (e.g., desktop or laptop), tablet computer, mobile device (e.g., personal digital assistant (PDA) or smart phone), server (e.g., blade server or rack server), a network storage device, or any other suitable device and may vary in size, shape, performance, functionality, and price. The information handling system may include random access memory (RAM), one or more processing resources such as a central processing unit (CPU) or hardware or software control logic, ROM, and/or other types of nonvolatile memory. Additional components of the information handling system may include one or more disk drives, one or more network ports for communicating with external devices as well as various input and output (I/O) devices, such as a keyboard, a mouse, touchscreen and/or a video display. The information handling system may also include one or more buses operable to transmit communications between the various hardware components.
[0021] In one embodiment, IHS 100,
[0022] Referring now to
[0023] Referring now to
[0024] The chassis 302 may also house a storage system (not illustrated, but which may include the storage 108 discussed above with reference to
[0025] Referring now to
[0026] The method 400 begins at block 402 where a first networking device provides a lossless data traffic flow on a first data traffic path via a second networking device. In an embodiment, at block 402, the deadlock management engine 304 in the networking device 204/300 may operate to provide a lossless data traffic flow 500 along a first data traffic path. For example, as illustrated in
[0027] As such, in an example of the routing of the data traffic flow 500 utilizing ECMP routing strategies, the deadlock management engine 304 in the networking device 204/300 may operate to perform a hashing operation on the lossless data traffic flow 500 to determine that the lossless data traffic flow 500 should be forwarded out of a port on the networking device 204 that is coupled to the networking device 206, and as illustrated, that hashing operation results in the data traffic flow 500 being forwarded along the first data traffic path illustrated in
[0028] The method 400 then proceeds to block 404 where the first networking device receives a congestion communication from the second networking device that is indicative of a deadlock associated with the second networking device. In an embodiment, at or prior to block 404, a congestion condition may arise in the deadlock management system 200. For example, with reference to the networking devices 206, 212, and 214 in the deadlock management system 200 illustrated in
[0029] As will be appreciated by one of skill in the art, each of the lossless data traffic flows 600, 602, and 604 may utilize one or more buffers in the networking devices 206, 212, and 214 that are dedicated to storing lossless data traffic packets/frames for forwarding by their respective networking device. As illustrated in
[0030] However, while a particular congestion situation/deadlock has been described as being provided by a PFC deadlock/PFC storm situation, one of skill in the art in possession of the present disclosure will appreciate that other congestion situations may benefit from the teachings of the present disclosure and thus will fall within its scope as well. For example, Layer 2 (L2) loops in a portion of a network that includes the deadlock management system 200, incorrect Quality of Service (QoS) configuration in some networking devices in a network that includes the deadlock management system 200, a networking device with a full buffer/queue due to some other lossless data traffic situation, faulty Network Interface Controller(s) in networking devices in a network that includes the deadlock management system 200, and/or other situations known in the art may result in the congestion situation detected by the networking device 204 at block 404 while remaining within the scope of the present disclosure. As such, at block 404, the deadlock management engine 304 in the networking device 204 may receive, via its communication system 308, congestion communications such as, for example, the PFC pause frames 612 transmitted by the networking device 206 as discussed above.
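As an illustration only, the following Python sketch shows how a PFC pause frame received at block 404 might be recognized and decoded. The frame layout used here (MAC Control EtherType 0x8808, PFC opcode 0x0101, a class-enable vector followed by eight per-priority pause timers) is taken from the IEEE 802.1Qbb Priority-based Flow Control format rather than from the present disclosure, which does not specify how the networking device 204 parses such frames. The names parse_pfc_frame and build_pfc_frame are hypothetical helpers introduced for this example.

```python
import struct
from typing import Dict, Optional

# Constants from the IEEE 802.1Qbb PFC frame format (not from the disclosure).
PFC_DEST_MAC = bytes.fromhex("0180c2000001")  # MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PFC_OPCODE = 0x0101
PFC_MIN_LEN = 34  # 14-byte Ethernet header + opcode + vector + 8 timers


def parse_pfc_frame(frame: bytes) -> Optional[Dict[int, int]]:
    """Return {priority: pause_quanta} for a PFC frame, or None if not PFC."""
    if len(frame) < PFC_MIN_LEN:
        return None
    # EtherType at offset 12, opcode at 14, class-enable vector at 16.
    ethertype, opcode, vector = struct.unpack_from("!HHH", frame, 12)
    if ethertype != MAC_CONTROL_ETHERTYPE or opcode != PFC_OPCODE:
        return None
    timers = struct.unpack_from("!8H", frame, 18)
    # Only priorities whose enable bit is set carry a meaningful timer.
    return {p: timers[p] for p in range(8) if vector & (1 << p)}


def build_pfc_frame(src_mac: bytes, pauses: Dict[int, int]) -> bytes:
    """Hypothetical helper that builds an unpadded PFC frame for testing."""
    vector = 0
    timers = [0] * 8
    for priority, quanta in pauses.items():
        vector |= 1 << priority
        timers[priority] = quanta
    payload = struct.pack("!HHH8H", MAC_CONTROL_ETHERTYPE, PFC_OPCODE,
                          vector, *timers)
    return PFC_DEST_MAC + src_mac + payload
```

A timer value of zero for an enabled priority conventionally signals that transmission on that priority may resume, so a caller would typically treat only non-zero timers as pause requests.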
[0031] The method 400 then proceeds to block 406 where the first networking device identifies the first data traffic path as a congested route. In an embodiment, at block 406, the deadlock management engine 304 in the networking device 204/300 may identify the first data traffic path illustrated in
[0032] The method 400 then proceeds to block 408 where the first networking device provides the lossless data traffic flow on a second data traffic path via a third networking device. In an embodiment, at block 408 and in response to determining that the first data traffic path illustrated in
[0033] In a specific example of blocks 404, 406, and 408 discussed above, the deadlock management engine 304 in the networking device 204/300 may receive the congestion communication (e.g., the PFC pause frames) on a port that is connected to the networking device 206. In response to receiving the congestion communication, one or more application layers in the networking device 204 may scan through the routes that are reachable via the port that received the congestion communication and identify the networking device 206, may determine that the destination of the lossless data traffic 500 is reachable via the networking device 206, and may identify the first data traffic path as the congested route by associating a congestion flag with the first data traffic path. As will be appreciated by one of skill in the art in possession of the present disclosure, the association of the congestion flag with the first data traffic path will indicate to the non-application layers in the networking device 204 to remove the reachability of the destination of the lossless data traffic 500 via the networking device 206, and cause that lossless data traffic to take an alternate route if one exists. In some embodiments, the deadlock management engine 304 in the networking device 204/300 may generate a congestion alarm for any next-hop neighbor device that is reachable via the port upon which the congestion communication was received, and/or may update the routes that have alternate paths in its deadlock management database 306 (i.e., other routes without alternate paths will not be affected).
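The flagging and rerouting behavior described in this example may be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the deadlock management engine 304: the NextHop and RouteTable classes, the port and neighbor names, and the use of a CRC32 hash over the flow key for ECMP selection are all hypothetical choices made for the example. The sketch models the congestion flag as a per-next-hop boolean, flags only routes that have an alternate path, and excludes flagged next hops when selecting the egress port.

```python
import zlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class NextHop:
    port: str                # local egress port (hypothetical name)
    neighbor: str            # next-hop networking device (hypothetical name)
    congested: bool = False  # the "congestion flag" for this path


@dataclass
class RouteTable:
    # destination prefix -> equal-cost next hops
    routes: Dict[str, List[NextHop]] = field(default_factory=dict)

    def flag_port(self, port: str, congested: bool) -> List[str]:
        """Set or reset the congestion flag on routes reachable via `port`.

        Only destinations that also have a next hop on another port are
        touched; routes without an alternate path are left unaffected.
        Returns the destinations that were updated.
        """
        affected = []
        for dest, hops in self.routes.items():
            via_port = [h for h in hops if h.port == port]
            has_alternate = any(h.port != port for h in hops)
            if via_port and has_alternate:
                for hop in via_port:
                    hop.congested = congested
                affected.append(dest)
        return affected

    def select(self, dest: str, flow: Tuple) -> Optional[NextHop]:
        """ECMP-style choice: hash the flow key over unflagged next hops."""
        usable = [h for h in self.routes.get(dest, []) if not h.congested]
        if not usable:
            return None
        index = zlib.crc32(repr(flow).encode()) % len(usable)
        return usable[index]
```

For instance, after flag_port("port1", True) is called for the port that received the pause frames, select() returns only next hops on other ports for any destination that has one, which corresponds to the removal of reachability via the congested neighbor. Calling flag_port("port1", False) restores the original set of equal-cost next hops.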
[0034] As such, with reference to the example illustrated in
[0035] The method 400 then proceeds to decision block 410 where it is determined whether the deadlock associated with the second networking device no longer exists. In an embodiment, at decision block 410, the deadlock management engine 304 in the networking device 204/300 may then monitor to determine whether the number of PFC pause frames received by the networking device 204 from the networking device 206 continues to exceed the deadlock detection rate that is indicative of the deadlock situation such as the PFC deadlock/PFC storm situation discussed above, and/or whether the networking device 204 has been unable to forward lossless data traffic frames to the networking device 206 for some period of time. As such, at decision block 410, the deadlock management engine 304 in the networking device 204 may determine whether the congestion communications received from the networking device 206 continue to be received at the rate that is indicative of a deadlock in the networking device 206, and/or whether the networking device 204 has been unable to forward lossless data traffic frames to the networking device 206 for some period of time. If, at decision block 410, it is determined that the deadlock associated with the second networking device continues to exist, the method 400 returns to block 408. As such, the method 400 may operate to provide the lossless data traffic flow 500 on the second data traffic path illustrated in
[0036] If, at decision block 410, it is determined that the deadlock associated with the second networking device no longer exists, the method 400 proceeds to block 412 where the first networking device removes the identification of the first data traffic path as a congested route. In an embodiment, if at block 410 the deadlock management engine 304 in the networking device 204/300 determines that the congestion communications received from the networking device 206 are below the rate that is indicative of a deadlock in the networking device 206, and/or that the networking device 204 has been able to forward lossless data traffic frames to the networking device 206 within some period of time, at block 412 the deadlock management engine 304 in the networking device 204/300 may remove the identification of the first data traffic path illustrated in
[0037] In a specific example of blocks 410, 412, and 414 discussed above, the deadlock management engine 304 in the networking device 204/300 may determine that the congestion communications (e.g., the PFC pause frames) are no longer being received at a relatively high rate (or at all) on the port that is connected to the networking device 206, and/or that the networking device 204 has been able to forward lossless data traffic frames to the networking device 206 after some period of time of not being able to do so. In response, one or more application layers in the networking device 204 may scan through the routes that are reachable via that port and identify the networking device 206, determine that the destination of the lossless data traffic 500 is reachable via the networking device 206, and reset the congestion flag that was previously associated with the first data traffic path to identify it as a congested route. As will be appreciated by one of skill in the art in possession of the present disclosure, the resetting of the congestion flag associated with the first data traffic path will indicate to the non-application layers in the networking device 204 to add the reachability of the destination of the lossless data traffic 500 via the networking device 206, and allow that lossless data traffic to take the first data traffic path.
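The rate-based determination at decision block 410 and the flag reset at block 412 may be sketched as follows. This is a hedged Python illustration only: the disclosure does not give a value for the deadlock detection rate or specify how it is measured, so the sliding-window counter, the threshold and window_s parameters, and the PauseRateMonitor and update_congestion_flag names are assumptions made for the example.

```python
from collections import deque
from typing import Deque, Dict


class PauseRateMonitor:
    """Counts pause-frame arrivals on one port within a sliding time window."""

    def __init__(self, threshold: int, window_s: float) -> None:
        self.threshold = threshold  # assumed deadlock detection rate (frames per window)
        self.window_s = window_s    # assumed observation window in seconds
        self._arrivals: Deque[float] = deque()

    def record(self, now: float) -> None:
        """Record the arrival time of one pause frame."""
        self._arrivals.append(now)

    def deadlocked(self, now: float) -> bool:
        """True while arrivals within the window exceed the threshold."""
        while self._arrivals and now - self._arrivals[0] > self.window_s:
            self._arrivals.popleft()
        return len(self._arrivals) > self.threshold


def update_congestion_flag(flags: Dict[str, bool], path: str,
                           monitor: PauseRateMonitor, now: float) -> bool:
    """Set the flag while the deadlock persists and reset it once it clears."""
    flags[path] = monitor.deadlocked(now)
    return flags[path]
```

In this sketch the flag clears as soon as the windowed count drops to the threshold or below. A practical implementation might add a hold-down period before restoring the first data traffic path to avoid oscillating between paths, although the disclosure does not describe one.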
[0038] In some embodiments, the deadlock management engine 304 in the networking device 204/300 may clear any congestion alarms for any next-hop neighbor device that is reachable via the port that is no longer associated with the congestion, and/or may update the routes in its deadlock management database 306. As such, with reference to the example illustrated in
[0039] Thus, systems and methods have been described that provide for the transmission of lossless data traffic with improved resolution of deadlocks without the issues associated with conventional deadlock management techniques discussed above. As discussed above, in some embodiments, such deadlock resolution may be accomplished by a first networking device that provides a lossless data traffic flow on a first data traffic path via a second networking device and to a destination, receives a congestion communication from the second networking device that is indicative of a deadlock associated with the second networking device and, in response, identifies the first data traffic path as a congested route to cause the first networking device to provide the lossless data traffic flow on a second data traffic path via a third networking device to the destination. This allows the first networking device to increase the performance of lossless data traffic transmission between a source device and the destination device by rerouting traffic around deadlocks, while allowing deadlocks to resolve more quickly than in conventional systems due to a decreased amount of lossless data traffic sent to the congested networking device, and reducing data traffic losses that may occur due to deadlocks when an alternate path exists. As will be appreciated by one of skill in the art in possession of the present disclosure, the systems and methods of the present disclosure provide a localized solution that modifies the routes in a particular networking device, while all other networking devices/nodes in the network are unaffected.
[0040] Although illustrative embodiments have been shown and described, a wide range of modification, change and substitution is contemplated in the foregoing disclosure and in some instances, some features of the embodiments may be employed without a corresponding use of other features. Accordingly, it is appropriate that the appended claims be construed broadly and in a manner consistent with the scope of the embodiments disclosed herein.