Guaranteed bandwidth for segment routed (SR) paths

Abstract

At least one bandwidth-guaranteed segment routing (SR) path through a network is determined by: (a) receiving, as input, a bandwidth demand value; (b) obtaining network information; (c) determining a constrained shortest multipath (CSG.sub.i); (d) determining a set of SR segment-list(s) (S.sub.i=[sl.sub.1.sup.i, sl.sub.2.sup.i . . . sl.sub.n.sup.i]) that are needed to steer traffic over CSG.sub.i; and (e) tuning the loadshares in L.sub.i, using S.sub.i and the per segment-list loadshare (L.sub.i=[l.sub.1.sup.i, l.sub.2.sup.i . . . l.sub.n.sup.i]), the per segment equal cost multipath (“ECMP”), and the per link residual capacity, such that the bandwidth capacity that can be carried over CSG.sub.i is maximized.

Claims

1. A computer-implemented method for determining at least one bandwidth-guaranteed segment routing (SR) path through a network from an ingress device to an egress device, the computer-implemented method comprising: a) receiving, as input, a bandwidth demand value; b) obtaining network information; c) determining a constrained shortest multipath (CSG.sub.i) from the ingress device to the egress device; d) determining a set of SR segment-list(s) (S.sub.i=[sl.sub.1.sup.i, sl.sub.2.sup.i . . . sl.sub.n.sup.i]) that are needed to steer traffic over CSG.sub.i; and e) tuning each of a plurality of loadshares in a set of segment link loadshares L.sub.i that the ingress device uses to steer portions of the bandwidth demand to the egress device, using all of 1) S.sub.i and the per segment-list loadshare (L.sub.i=[l.sub.1.sup.i, l.sub.2.sup.i, . . . l.sub.n.sup.i]), 2) the per segment equal cost multipath (“ECMP”), and 3) the per link residual capacity, such that the bandwidth capacity over CSG.sub.i is maximized or such that the bandwidth capacity meets a threshold value.

2. The computer-implemented method of claim 1 wherein the CSG.sub.i is formed of paths of equal cost of minimum accumulative path metric.

3. The computer-implemented method of claim 1 wherein the CSG.sub.i is formed of paths of equal cost of minimum accumulative path metric after excluding link(s) due to any topological constraints.

4. The computer-implemented method of claim 1 wherein the CSG.sub.i is formed of paths of equal cost of minimum accumulative path metric after pruning out zero residual bandwidth links.

5. The computer-implemented method of claim 1 wherein the CSG.sub.i is formed of paths of equal cost of minimum accumulative path metric.

6. The computer-implemented method of claim 1 wherein the act of obtaining network information is performed by accessing information in a traffic engineering database (TED), the computer-implemented method further comprising: f) updating the TED or a workspace including information from the TED, to deduct bandwidth capacity used on CSG.sub.i.

7. The computer-implemented method of claim 6 further comprising: g) determining whether or not the (remaining) bandwidth demand is satisfied by CSG.sub.i; and h) responsive to a determination that the capacity of CSG.sub.i is smaller than the (remaining) demand, repeating the acts (a)-(e).

8. The computer-implemented method of claim 1 wherein the act of tuning the loadshares in L.sub.i, using S.sub.i and the per segment-list loadshare (L.sub.i=[l.sub.1.sup.i, l.sub.2.sup.i, . . . l.sub.n.sup.i]), the per segment equal cost multipath (“ECMP”), and the per link residual capacity, such that the bandwidth capacity that is carried over CSG.sub.i is maximized, uses a sequential least squares programming procedure.

9. A router serving as an ingress of a SR path and comprising: a) at least one routing processor; and b) a non-transitory computer readable medium storing processor executable instructions which, when executed by the at least one routing processor, cause the at least one routing processor to determine at least one bandwidth-guaranteed segment routing (SR) path through a network from the router serving as the ingress of the SR path to an egress router, by performing a method comprising: a) receiving, as input, a bandwidth demand value; b) obtaining network information; c) determining a constrained shortest multipath (CSG.sub.i) from the router serving as the ingress of the SR path to the egress router; d) determining a set of SR segment-list(s) (S.sub.i=[sl.sub.1.sup.i, sl.sub.2.sup.i . . . sl.sub.n.sup.i]) that are needed to steer traffic over CSG.sub.i; and e) tuning each of a plurality of loadshares in a set of segment link loadshares L.sub.i that the router serving at the ingress of the SR path uses to steer portions of the bandwidth demand to the egress router, using all of 1) S.sub.i and the per segment-list loadshare (L.sub.i=[l.sub.1.sup.i, l.sub.2.sup.i, . . . l.sub.n.sup.i]), 2) the per segment equal cost multipath (“ECMP”), and 3) the per link residual capacity, such that the bandwidth capacity over CSG.sub.i is maximized or such that the bandwidth capacity meets a threshold value.

10. The router of claim 9 wherein the CSG.sub.i is formed of paths of equal cost of minimum accumulative path metric.

11. The router of claim 9 wherein the CSG.sub.i is formed of paths of equal cost of minimum accumulative path metric after excluding link(s) due to any topological constraints.

12. The router of claim 9 wherein the CSG.sub.i is formed of paths of equal cost of minimum accumulative path metric after pruning out zero residual bandwidth links.

13. The router of claim 9 wherein the CSG.sub.i is formed of paths of equal cost of minimum accumulative path metric.

14. The router of claim 9 wherein the act of obtaining network information is performed by accessing information in a traffic engineering database (TED), the method further comprising: f) updating the TED or a workspace including information from the TED, to deduct bandwidth capacity used on CSG.sub.i.

15. The router of claim 14 wherein the method further comprises: g) determining whether or not the (remaining) bandwidth demand is satisfied by CSG.sub.i; and h) responsive to a determination that the capacity of CSG.sub.i is smaller than the (remaining) demand, repeating the acts (a)-(e).

16. The router of claim 9 wherein the act of tuning the loadshares in L.sub.i, using S.sub.i and the per segment-list loadshare (L.sub.i=[l.sub.1.sup.i, l.sub.2.sup.i, . . . l.sub.n.sup.i]), the per segment equal cost multipath (“ECMP”), and the per link residual capacity, such that the bandwidth capacity over CSG.sub.i is maximized, uses a sequential least squares programming procedure.

17. A server in communication with a router serving as an ingress of a SR path, the server comprising: a) at least one path computation element (PCE); and b) a non-transitory computer readable medium storing processor executable instructions which, when executed by the at least one PCE, cause the at least one PCE to determine at least one bandwidth-guaranteed segment routing (SR) path through a network from the router serving as the ingress of the SR path to an egress router, by performing a method comprising: a) receiving, as input, a bandwidth demand value; b) obtaining network information; c) determining a constrained shortest multipath (CSG.sub.i) from the router serving as the ingress of the SR path to the egress router; d) determining a set of SR segment-list(s) (S.sub.i=[sl.sub.1.sup.i, sl.sub.2.sup.i . . . sl.sub.n.sup.i]) that are needed to steer traffic over CSG.sub.i; and e) tuning each of a plurality of loadshares in a set of segment link loadshares L.sub.i that the router serving at the ingress of the SR path uses to steer portions of the bandwidth demand to the egress router, using all of 1) S.sub.i and the per segment-list loadshare (L.sub.i=[l.sub.1.sup.i, l.sub.2.sup.i, . . . l.sub.n.sup.i]), 2) the per segment equal cost multipath (“ECMP”), and 3) the per link residual capacity, such that the bandwidth capacity over CSG.sub.i is maximized or such that the bandwidth capacity meets a threshold value.

Description

§ 3. BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 is an example network used to illustrate an SR domain.

(2) FIG. 2 is an example network used to illustrate SR paths through an SR domain.

(3) FIG. 3 is an example network used to illustrate adjacency segments in an SR domain.

(4) FIGS. 4A and 4B are an example network used to illustrate prefix segments in an SR domain.

(5) FIG. 5 is an example network used to illustrate the use of MPLS labels derived from adjacency segments.

(6) FIG. 6 is an example network used to illustrate the use of MPLS labels derived from prefix segments.

(7) FIG. 7 is a flow diagram of an example method for determining SR bandwidth constrained path(s) in a manner consistent with the present description.

(8) FIG. 8 illustrates a first example of operations of the example method of FIG. 7.

(9) FIGS. 9A-9C illustrate a second example of operations of the example method of FIG. 7.

(10) FIG. 10 illustrates two data forwarding systems, which may be used as nodes in an SR domain, coupled via communications links.

(11) FIG. 11 is a block diagram of a router which may be used as a node in an SR domain.

(12) FIG. 12 is an example architecture in which ASICS may be distributed in a packet forwarding component to divide the responsibility of packet forwarding.

(13) FIGS. 13A and 13B illustrate an example of operations of the example architecture of FIG. 12.

(14) FIG. 14 is a flow diagram of an example method for providing packet forwarding in an example router.

(15) FIG. 15 is a block diagram of an exemplary machine 1500 that may perform one or more of the processes described, and/or store information used and/or generated by such processes.

(16) FIGS. 16 and 17 illustrate pseudo code and a flow diagram, respectively, of an example method for determining a set of SR segment-list(s) that are needed to steer traffic over the i.sup.th constrained shortest multipath.

(17) FIGS. 18-20 illustrate alternative architectures for implementing an example method consistent with the present description.

§ 4. DETAILED DESCRIPTION

(18) The present disclosure may involve novel methods, apparatus, message formats, and/or data structures for determining bandwidth guaranteed SR paths. The following description is presented to enable one skilled in the art to make and use the described embodiments, and is provided in the context of particular applications and their requirements. Thus, the following description of example embodiments provides illustration and description, but is not intended to be exhaustive or to limit the present disclosure to the precise form disclosed. Various modifications to the disclosed embodiments will be apparent to those skilled in the art, and the general principles set forth below may be applied to other embodiments and applications. For example, although a series of acts may be described with reference to a flow diagram, the order of acts may differ in other implementations when the performance of one act is not dependent on the completion of another act. Further, non-dependent acts may be performed in parallel. No element, act or instruction used in the description should be construed as critical or essential to the present description unless explicitly described as such. Also, as used herein, the article “a” is intended to include one or more items. Where only one item is intended, the term “one” or similar language is used. Thus, the present disclosure is not intended to be limited to the embodiments shown and the inventors regard their invention as any patentable subject matter described.

§ 4.1 Definitions and Terminology

(19) SG(R,D): the shortest multi-path directed acyclic graph from root R to destination D. This is analogous to the IGP computed shortest multi-path graph from R to D with no constraints on topology and when optimizing for the IGP metric.

(20) CSG(R,D): the constrained shortest multi-path directed acyclic graph from R to destination D. (The classical CSPF algorithm is extended by example methods consistent with the present description to become multi-path aware and support constraints on the topology and optimization of an arbitrary path metric such as TE, latency, hops, etc.)

(21) sl: SR segment-list that is composed of an ordered set of segments that resemble the path(s) that dataflow will follow. The segments of a segment-list are copied in a Segment Routing Header (SRH) that is imposed on top of data packets that are steered over the SR Path.
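As a rough illustration of the CSG(R,D) definition above, the following minimal sketch prunes excluded and zero residual bandwidth links and then runs a Dijkstra search that keeps every equal-cost predecessor, so the result is a multipath directed acyclic graph rather than a single path. The function name, the `links` dictionary layout, and the sample topology are assumptions made for illustration only, and strictly positive link metrics are assumed; this is not presented as the procedure of the described embodiments.

```python
import heapq
from collections import defaultdict


def constrained_shortest_multipath(links, root, dest, excluded=frozenset()):
    """Return (cost, edges) of the equal-cost shortest-path DAG from root to dest.

    links: dict mapping (u, v) -> (metric, residual_bw). Illustrative layout.
    excluded: links removed by topological constraints (e.g. link affinities).
    """
    # Prune links that are excluded or have no residual bandwidth.
    adj = defaultdict(list)
    for (u, v), (metric, residual) in links.items():
        if (u, v) not in excluded and residual > 0:
            adj[u].append((v, metric))

    # Dijkstra that remembers every equal-cost predecessor, not just one.
    dist = {root: 0}
    preds = defaultdict(set)
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, metric in adj[u]:
            nd = d + metric
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                preds[v] = {u}
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                preds[v].add(u)

    if dest not in dist:
        return None, set()

    # Walk back from dest so only edges on a shortest path to dest are kept.
    dag, stack, seen = set(), [dest], {dest}
    while stack:
        v = stack.pop()
        for u in preds[v]:
            dag.add((u, v))
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return dist[dest], dag


# Hypothetical topology: two equal-cost paths A-B-D and A-C-D, a costlier
# direct link A-D, and a cheaper path via E that is pruned (zero residual).
links = {
    ("A", "B"): (10, 5), ("B", "D"): (10, 5),
    ("A", "C"): (10, 5), ("C", "D"): (10, 5),
    ("A", "D"): (30, 5),
    ("A", "E"): (5, 0), ("E", "D"): (5, 5),
}
cost, dag = constrained_shortest_multipath(links, "A", "D")
```

In this sample, the path through E would be cheapest but is removed by the residual bandwidth pruning, so the returned graph contains the two equal-cost paths through B and C.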

§ 4.2 Example Methods

(22) Example methods consistent with the present description determine bandwidth guaranteed SR paths. Such method(s) may be referred to as “SR Bandwidth Constrained Path Algorithm” (“SR-BCPA”). Goals of such example methods include, for example: determining a placement for the incoming traffic demand on link(s) where enough required resources are available to carry the share of traffic, so it minimizes the chances of congestion; determining an SR Path that utilizes ECMP capable SR segments whenever possible and accounting for load balancing of traffic on the available ECMP path(s); and optimizing for the chosen path metric (e.g. delay, TE metric, hops, etc.) when selecting the set of feasible path(s).

(23) Referring to FIG. 7, an example method 700 consistent with the present description may be used to determine a bandwidth-guaranteed SR path. The example method 700 receives, as input, a bandwidth demand value. (Block 710) The example method 700 also initializes an index (e.g., i=0). (Block 720) The example method also obtains network information from a traffic engineering database (TED) (or from a workspace). (Block 730) The example method 700 then determines a constrained shortest multipath (CSG). (Block 740) The CSG may be formed of paths of equal cost of minimum accumulative path metric (e.g., after excluding link(s) due to any topological constraints (e.g. link affinities) and pruning out zero residual bandwidth links). The example method 700 then determines a set of SR segment-list(s) (S.sub.i=[sl.sub.1.sup.i, sl.sub.2.sup.i . . . sl.sub.n.sup.i]) that are needed to steer traffic over CSG.sub.i (e.g., using any known SR path to segment-list segment compression algorithm). (Block 750) Next, using S.sub.i and the per segment-list loadshare (L.sub.i=[l.sub.1.sup.i, l.sub.2.sup.i . . . l.sub.n.sup.i]), the per segment ECMP, and the per link residual capacity, the loadshares in L.sub.i are tuned such that the bandwidth capacity that can be carried over CSG.sub.i is maximized (or at least increased to exceed a threshold). (Block 760) The example method 700 may then update the TED (or a workspace) to deduct resources (e.g., bandwidth capacity) used on CSG.sub.i. (Block 770) This may be done using L.sub.i and the per link traffic ratio on link(s) of CSG.sub.i. Next, it is determined whether or not the (remaining) bandwidth demand can be satisfied by CSG.sub.i. (Decision 780) If not (Decision 780, NO) (i.e., when the capacity of CSG.sub.i is smaller than the (remaining) demand), the index is incremented (e.g., i=i+1) (Block 790), and the method returns to block 740. If, on the other hand, the (remaining) bandwidth demand can be satisfied by CSG.sub.i (Decision 780, YES), the method 700 is left. (Node 799)
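The iteration over blocks 740-790 can be sketched as a simple loop. In the minimal sketch below, `place_demand` and its `csg_capacities` argument are hypothetical names: the list stands in for the maximized capacity that blocks 740-760 would produce for each successive CSG, and the handling of a demand that cannot be fully placed is an assumption rather than something taken from the description.

```python
def place_demand(demand, csg_capacities):
    """Steer a bandwidth demand over successive CSGs until it is satisfied.

    csg_capacities: iterable of maximized capacities X_i, one per CSG_i
    (a stand-in for blocks 740-760). Returns (steered, remaining), where
    steered[i] is the bandwidth placed on CSG_i.
    """
    remaining = demand
    steered = []
    for capacity in csg_capacities:
        carried = min(capacity, remaining)
        steered.append(carried)  # block 770: amount to deduct from the TED/workspace
        remaining -= carried
        if remaining <= 0:       # decision 780, YES: demand is satisfied
            break
        # decision 780, NO: continue with the next CSG (block 790)
    # Assumption: if the CSGs run out first, the unplaced remainder is returned.
    return steered, remaining


# A demand of 12 units over two CSGs that can each carry 10 units.
steered, remaining = place_demand(12, [10, 10])
```

Here the first CSG carries 10 units and the second carries the remaining 2 units.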

(24) Referring back to block 730, example methods consistent with the present description may use topology information composed from the TED and the per link residual capacities (or available bandwidth). For SR path computation purposes, it is assumed the per link residual capacities are managed by a resource manager that keeps track of the per SR Path resource allocation on each traversed link and that gets reflected on the TED used for new path computations.

(25) The following properties can be derived about the determined CSG.sub.i:

(26) ${CSG}_{i} \Rightarrow \left\{ \begin{matrix} c_{i} = \text{path cost} \\ X_{i} = \text{total capacity} \\ S_{i} = [{sl}_{1}^{i}, {sl}_{2}^{i}, \ldots, {sl}_{n}^{i}] \\ L_{i} = [l_{1}^{i}, l_{2}^{i}, \ldots, l_{n}^{i}] \end{matrix} \right.$
where: c.sub.i: is the cost of the i.sup.th CSG, X.sub.i: is the bandwidth capacity of the i.sup.th CSG, which is to be maximized (or increased to at least a determined threshold), S.sub.i: is the set of segment-lists needed to steer traffic over the path(s) described by the i.sup.th CSG, L.sub.i: is the per segment-list loadshare that the ingress uses to steer a portion of the incoming demand on to S.sub.i. These loadshares are tuned by the optimization problem to maximize the capacity of the i.sup.th CSG.

(27) The weight distribution of the total incoming traffic on to each CSG can be represented as:

(28) $W = [w_{1}, w_{2}, \ldots, w_{K}], \text{ and } w_{i} = \frac{X_{i}}{\sum_{k = 0}^{K} X_{k}}$
where w.sub.i: is the load-share of traffic carried by the i.sup.th CSG.

(29) The effective load carried by each segment-list sl can be computed as:
w.sub.i×L.sub.i∀sl(s) in S.sub.i.

§ 4.2.1 EXAMPLES OF OPERATIONS OF EXAMPLE METHODS

§ 4.2.1.1 First Example

(30) Referring to FIG. 8, consider the following incoming traffic demand D=12 U from “H” destined to “T”. In the first iteration, H runs the example method 700 to find CSG.sub.1 and succeeds in steering 10 U over CSG.sub.1:

(31) ${CSG}_{1} \Rightarrow \left\{ \begin{matrix} \text{Cost } c_{1} = 30 \\ \text{Capacity } X_{1} = 10\ U \\ S_{1} = [{sl}_{1}^{1}, {sl}_{2}^{1}, \ldots, {sl}_{n}^{1}] \\ L_{1} = [l_{1}^{1}, l_{2}^{1}, \ldots, l_{n}^{1}] \end{matrix} \right.$

(32) Since, however, the bandwidth demand is not yet satisfied (Recall, e.g., 780, NO) (10 U<12 U), the method 700 performs a second iteration. In the second iteration, H runs the example method 700 to find CSG.sub.2 and consequently steers the remaining 2 U over CSG.sub.2:

(33) ${CSG}_{2} \Rightarrow \left\{ \begin{matrix} \text{Cost } c_{2} = 40 \\ \text{Capacity } X_{2} = 10\ U \\ S_{2} = [{sl}_{1}^{2}, {sl}_{2}^{2}, \ldots, {sl}_{n}^{2}] \\ L_{2} = [l_{1}^{2}, l_{2}^{2}, \ldots, l_{n}^{2}] \end{matrix} \right.$
H updates the weight distribution as:

(34) W×L, where W=[w1, w2] and w1= 10/12 and w2= 2/12 and L=[L.sub.1, L.sub.2] and S=[S.sub.1, S.sub.2] describe the set of segment-list(s) found for each iteration.
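The weight update in this first example can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the weights are computed from the bandwidth actually steered over each CSG (10 U and 2 U, which yields the 10/12 and 2/12 stated above), and the per segment-list loadshares in `L` are made-up values because the example does not list them.

```python
from fractions import Fraction


def weight_distribution(steered):
    """w_i = steered_i / sum(steered), for integer bandwidth units."""
    total = sum(steered)
    return [Fraction(x, total) for x in steered]


def effective_loads(weights, loadshares):
    """Effective load of each segment-list: w_i * l for every l in L_i."""
    return [[w * l for l in L_i] for w, L_i in zip(weights, loadshares)]


# Bandwidth steered over CSG_1 and CSG_2 in the first example.
W = weight_distribution([10, 2])

# Hypothetical loadshares: CSG_1 uses two segment-lists split evenly,
# CSG_2 uses a single segment-list.
L = [[Fraction(1, 2), Fraction(1, 2)], [Fraction(1)]]
loads = effective_loads(W, L)
```

With these assumed loadshares, each segment-list of CSG.sub.1 carries 5/12 of the incoming traffic and the single segment-list of CSG.sub.2 carries 2/12, so the effective loads sum to 1.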

§ 4.2.1.2 Second Example

(35) FIG. 9A depicts an example network topology. The residual bandwidth (e.g., available from TED or a workspace) of each link is marked. For example, the link connecting node (R1) with node (R2), and the link connecting node (R1) to node (R3), have residual bandwidths of 3 units and 4 units, respectively.

(36) The example method 700 is performed to determine the maximum capacity X that node (R1) can send to node (R8) over the most optimal path(s). The steps involved include:

(37) TABLE-US-00001
STEP  DESCRIPTION
1     Compute CSG.sub.1 (Recall 740 of FIG. 7.)
2     Determine S.sub.1, the set of segment-list(s) to steer traffic over CSG.sub.1 (Recall 750 of FIG. 7.):
      sl.sub.1.sup.1 = {node-SID(6), node-SID(8)}
      sl.sub.2.sup.1 = {node-SID(2), node-SID(5), node-SID(8)}
      and, S.sub.1 = {sl.sub.1.sup.1, sl.sub.2.sup.1}, L.sub.1 = {l.sub.1, l.sub.2}
3     Formulate the set of constraint equalities and inequalities
4     Solve the optimization problem to maximize the capacity of CSG.sub.1 and find: L, and W (Recall 760 of FIG. 7.)

(38) The set of constraint equations can be derived as:

(39) $\begin{matrix} e_{12} \rightarrow \frac{l_{2} x}{3} + l_{1} x \leq 3 & (1) \\ e_{13} \rightarrow \frac{l_{2} x}{3} \leq 4 & (2) \\ e_{14} \rightarrow \frac{l_{2} x}{3} \leq 5 & (3) \\ e_{25} \rightarrow l_{1} x \leq 3 & (4) \\ e_{26} \rightarrow \frac{l_{2} x}{3} \leq 6 & (5) \\ e_{36} \rightarrow \frac{l_{2} x}{3} \leq 3 & (6) \\ e_{46} \rightarrow \frac{l_{2} x}{3} \leq 6 & (7) \\ e_{58} \rightarrow l_{1} x \leq 6 & (8) \\ e_{68} \rightarrow l_{2} x \leq 5 & (9) \end{matrix}$

(40) Where e.sub.xy is the unidirectional edge (or link) connecting node(x) to node(y). Inequalities (1)-(3) are derived from the three (3) links exiting node R1. The denominator 3 indicates ECMP over the three links. Inequalities (5)-(7) are derived from the three (3) links entering node R6. Again, the denominator 3 indicates ECMP over the three links.
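The denominator of 3 in inequalities (1)-(3) and (5)-(7) follows from splitting traffic equally at every node that has several equal-cost next hops. The following minimal sketch computes that per link fraction for one ECMP segment by pushing a unit of traffic through the segment's directed acyclic graph in topological order. The function name and the edge list (node R1 reaching node R6 via nodes R2, R3 and R4) are assumptions chosen to mirror the three-way split described above, not data read from FIG. 9A.

```python
from collections import defaultdict
from fractions import Fraction


def ecmp_link_fractions(dag_edges, source):
    """Fraction of a segment's traffic on each link, with equal split per node."""
    succ = defaultdict(list)
    indeg = defaultdict(int)
    for u, v in dag_edges:
        succ[u].append(v)
        indeg[v] += 1

    inflow = defaultdict(Fraction)  # Fraction() is 0
    inflow[source] = Fraction(1)
    fractions = {}
    ready = [source]  # nodes whose inflow is fully known
    while ready:
        u = ready.pop()
        for v in succ[u]:
            share = inflow[u] / len(succ[u])  # equal split over next hops
            fractions[(u, v)] = share
            inflow[v] += share
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return fractions


# Assumed ECMP segment: R1 to R6 over three equal-cost two-hop paths.
edges = [("R1", "R2"), ("R1", "R3"), ("R1", "R4"),
         ("R2", "R6"), ("R3", "R6"), ("R4", "R6")]
fractions = ecmp_link_fractions(edges, "R1")
```

Each of the six links carries one third of the segment's traffic, which is the factor that multiplies the segment-list load in the corresponding link inequality.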

(41) The inequalities or equations above can be simplified further to:
l.sub.1+l.sub.2=1 (10)
l.sub.1x≤3 (11)
l.sub.2x≤5 (12)
l.sub.2x+3l.sub.1x≤9 (13)

(42) Equation (10) is derived from the fact that the sum of the loads is always 1. Inequality (11) corresponds to inequality (4), and inequality (12) corresponds to inequality (9). Inequality (13) is derived from inequality (1). Expressions (10)-(13) can be programmatically solved (e.g., using non-linear programming such as Sequential Least SQuares Programming (“SLSQP”)). The example uses Python ($ python compute_cap_weigths.py). In the example of FIGS. 9A-9C, the following values are generated:
Capacity (X)=6.333333327495734
l.sub.2=0.7894736842874963
l.sub.1=0.21052631571250358
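The generated values can be cross-checked without a numerical solver. The sketch below is not the SLSQP procedure named above: it substitutes a = l.sub.1x and b = l.sub.2x, which makes the per link inequalities (1)-(9) linear, and then enumerates the corner points of the feasible region with exact fractions. The names `LINKS` and `max_capacity` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

THIRD = Fraction(1, 3)

# Inequalities (1)-(9) with a = l1*x and b = l2*x:
# coeff_a * a + coeff_b * b <= capacity.
LINKS = {
    "e12": (Fraction(1), THIRD, 3),
    "e13": (Fraction(0), THIRD, 4),
    "e14": (Fraction(0), THIRD, 5),
    "e25": (Fraction(1), Fraction(0), 3),
    "e26": (Fraction(0), THIRD, 6),
    "e36": (Fraction(0), THIRD, 3),
    "e46": (Fraction(0), THIRD, 6),
    "e58": (Fraction(1), Fraction(0), 6),
    "e68": (Fraction(0), Fraction(1), 5),
}


def max_capacity(links):
    """Maximize x = a + b over the link constraints; return (x, l1, l2)."""
    # Add a >= 0 and b >= 0, written as -a <= 0 and -b <= 0.
    rows = list(links.values()) + [
        (Fraction(-1), Fraction(0), 0),
        (Fraction(0), Fraction(-1), 0),
    ]
    best = (Fraction(0), Fraction(0))
    for (a1, b1, c1), (a2, b2, c2) in combinations(rows, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries never intersect
        # Intersection of the two boundary lines (Cramer's rule).
        a = (c1 * b2 - c2 * b1) / det
        b = (a1 * c2 - a2 * c1) / det
        feasible = all(p * a + q * b <= c for p, q, c in rows)
        if feasible and a + b > sum(best):
            best = (a, b)
    a, b = best
    x = a + b
    if x == 0:
        return x, Fraction(0), Fraction(0)
    return x, a / x, b / x


x, l1, l2 = max_capacity(LINKS)
```

The exact optimum is x = 19/3 with l.sub.1 = 4/19 and l.sub.2 = 15/19, which agrees with the floating-point values listed above to within solver tolerance. The binding constraints are inequalities (1) and (9).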

§ 4.3 EXAMPLE ARCHITECTURES AND APPARATUS

(43) The nodes may be forwarding devices such as routers for example. FIG. 10 illustrates two data forwarding systems 1010 and 1020 coupled via communications links 1030. The links may be physical links or “wireless” links. The data forwarding systems 1010,1020 may be routers for example. If the data forwarding systems 1010,1020 are example routers, each may include a control component (e.g., a routing engine) 1014,1024 and a forwarding component 1012,1022. Each data forwarding system 1010,1020 includes one or more interfaces 1016,1026 that terminate one or more communications links 1030.

(44) As just discussed above, and referring to FIG. 11, some example routers 1100 include a control component (e.g., routing engine) 1110 and a packet forwarding component (e.g., a packet forwarding engine) 1190.

(45) The control component 1110 may include an operating system (OS) kernel 1120, routing protocol process(es) 1130, label-based forwarding protocol process(es) 1140, interface process(es) 1150, user interface (e.g., command line interface) process(es) 1160, and chassis process(es) 1170, and may store routing table(s) 1139, label forwarding information 1145, and forwarding (e.g., route-based and/or label-based) table(s) 1180. As shown, the routing protocol process(es) 1130 may support routing protocols such as the routing information protocol (“RIP”) 1131, the intermediate system-to-intermediate system protocol (“IS-IS”) 1132, the open shortest path first protocol (“OSPF”) 1133, the enhanced interior gateway routing protocol (“EIGRP”) 1134 and the border gateway protocol (“BGP”) 1135, and the label-based forwarding protocol process(es) 1140 may support protocols such as BGP 1135, the label distribution protocol (“LDP”) 1136 and the resource reservation protocol (“RSVP”) 1137. One or more components (not shown) may permit a user 1165 to interact with the user interface process(es) 1160. Similarly, one or more components (not shown) may permit an outside device to interact with one or more of the routing protocol process(es) 1130, the label-based forwarding protocol process(es) 1140, the interface process(es) 1150, and the chassis process(es) 1170, via SNMP 1185, and such processes may send information to an outside device via SNMP 1185. Example embodiments consistent with the present description may be implemented in one or more routing protocol processes 1130.

(46) The packet forwarding component 1190 may include a microkernel 1192, interface process(es) 1193, distributed ASICs 1194, chassis process(es) 1195 and forwarding (e.g., route-based and/or label-based) table(s) 1196.

(47) In the example router 1100 of FIG. 11, the control component 1110 handles tasks such as performing routing protocols, performing label-based forwarding protocols, control packet processing, etc., which frees the packet forwarding component 1190 to forward received packets quickly. That is, received control packets (e.g., routing protocol packets and/or label-based forwarding protocol packets) are not fully processed on the packet forwarding component 1190 itself, but are passed to the control component 1110, thereby reducing the amount of work that the packet forwarding component 1190 has to do and freeing it to process packets to be forwarded efficiently. Thus, the control component 1110 is primarily responsible for running routing protocols and/or label-based forwarding protocols, maintaining the routing tables and/or label forwarding information, sending forwarding table updates to the packet forwarding component 1190, and performing system management. The example control component 1110 may handle routing protocol packets, provide a management interface, provide configuration management, perform accounting, and provide alarms. The processes 1130, 1140, 1150, 1160 and 1170 may be modular, and may interact with the OS kernel 1120. That is, nearly all of the processes communicate directly with the OS kernel 1120. Using modular software that cleanly separates processes from each other isolates problems of a given process so that such problems do not impact other processes that may be running. Additionally, using modular software facilitates easier scaling.

(48) Still referring to FIG. 11, the example OS kernel 1120 may incorporate an application programming interface (“API”) system for external program calls and scripting capabilities. The control component 1110 may be based on an Intel PCI platform running the OS from flash memory, with an alternate copy stored on the router's hard disk. The OS kernel 1120 is layered on the Intel PCI platform and establishes communication between the Intel PCI platform and processes of the control component 1110. The OS kernel 1120 also ensures that the forwarding tables 1196 in use by the packet forwarding component 1190 are in sync with those 1180 in the control component 1110. Thus, in addition to providing the underlying infrastructure to control component 1110 software processes, the OS kernel 1120 also provides a link between the control component 1110 and the packet forwarding component 1190.

(49) Referring to the routing protocol process(es) 1130 of FIG. 11, this process(es) 1130 provides routing and routing control functions within the platform. In this example, the RIP 1131, IS-IS 1132, OSPF 1133 and EIGRP 1134 (and BGP 1135) protocols are provided. Naturally, other routing protocols may be provided in addition, or alternatively. Similarly, the label-based forwarding protocol process(es) 1140 provides label forwarding and label control functions. In this example, the LDP 1136 and RSVP 1137 (and BGP 1135) protocols are provided. Naturally, other label-based forwarding protocols (e.g., MPLS, SR, etc.) may be provided in addition, or alternatively. In the example router 1100, the routing table(s) 1139 is produced by the routing protocol process(es) 1130, while the label forwarding information 1145 is produced by the label-based forwarding protocol process(es) 1140.

(50) Still referring to FIG. 11, the interface process(es) 1150 performs configuration of the physical interfaces (Recall, e.g., 1016 and 1026 of FIG. 10.) and encapsulation.

(51) The example control component 1110 may provide several ways to manage the router. For example, it 1110 may provide a user interface process(es) 1160 which allows a system operator 1165 to interact with the system through configuration, modifications, and monitoring. The SNMP 1185 allows SNMP-capable systems to communicate with the router platform. This also allows the platform to provide necessary SNMP information to external agents. For example, the SNMP 1185 may permit management of the system from a network management station running software, such as Hewlett-Packard's Network Node Manager (“HP-NNM”), through a framework, such as Hewlett-Packard's OpenView. Accounting of packets (generally referred to as traffic statistics) may be performed by the control component 1110, thereby avoiding slowing traffic forwarding by the packet forwarding component 1190.

(52) Although not shown, the example router 1100 may provide for out-of-band management, RS-232 DB9 ports for serial console and remote management access, and tertiary storage using a removable PC card. Further, although not shown, a craft interface positioned on the front of the chassis provides an external view into the internal workings of the router. It can be used as a troubleshooting tool, a monitoring tool, or both. The craft interface may include LED indicators, alarm indicators, control component ports, and/or a display screen. Finally, the craft interface may provide interaction with a command line interface (“CLI”) 1160 via a console port, an auxiliary port, and/or a management Ethernet port.

(53) The packet forwarding component 1190 is responsible for properly outputting received packets as quickly as possible. If there is no entry in the forwarding table for a given destination or a given label and the packet forwarding component 1190 cannot perform forwarding by itself, it 1190 may send the packets bound for that unknown destination off to the control component 1110 for processing. The example packet forwarding component 1190 is designed to perform Layer 2 and Layer 3 switching, route lookups, and rapid packet forwarding.
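The exception handling just described can be pictured with a deliberately simplified sketch. All names below are hypothetical, the forwarding table is reduced to an exact-match dictionary (an actual forwarding component would perform route or label lookups), and the interface and address strings are made up for illustration.

```python
def forward(packet, forwarding_table, punt_to_control):
    """Return the next hop for a packet, or punt it when no entry exists."""
    next_hop = forwarding_table.get(packet["dest"])
    if next_hop is None:
        # No forwarding entry: hand the packet to the control component.
        punt_to_control(packet)
        return None
    return next_hop


# Hypothetical table with a single entry, and a list standing in for the
# control component's receive queue.
table = {"10.0.0.1": "ge-0/0/1"}
punted = []
known = forward({"dest": "10.0.0.1"}, table, punted.append)
unknown = forward({"dest": "10.9.9.9"}, table, punted.append)
```

The first packet is forwarded on the fast path, while the second, which has no matching entry, ends up in the list that stands in for the control component.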

(54) As shown in FIG. 11, the example packet forwarding component 1190 has an embedded microkernel 1192, interface process(es) 1193, distributed ASICs 1194, and chassis process(es) 1195, and stores a forwarding (e.g., route-based and/or label-based) table(s) 1196. The microkernel 1192 interacts with the interface process(es) 1193 and the chassis process(es) 1195 to monitor and control these functions. The interface process(es) 1193 has direct communication with the OS kernel 1120 of the control component 1110. This communication includes forwarding exception packets and control packets to the control component 1110, receiving packets to be forwarded, receiving forwarding table updates, providing information about the health of the packet forwarding component 1190 to the control component 1110, and permitting configuration of the interfaces from the user interface (e.g., CLI) process(es) 1160 of the control component 1110. The stored forwarding table(s) 1196 is static until a new one is received from the control component 1110. The interface process(es) 1193 uses the forwarding table(s) 1196 to look up next-hop information. The interface process(es) 1193 also has direct communication with the distributed ASICs 1194. Finally, the chassis process(es) 1195 may communicate directly with the microkernel 1192 and with the distributed ASICs 1194.

(55) Referring back to distributed ASICs 1194 of FIG. 11, FIG. 12 is an example of how the ASICS may be distributed in the packet forwarding component 1190 to divide the responsibility of packet forwarding. As shown in FIG. 12, the ASICs of the packet forwarding component 1190 may be distributed on physical interface cards (“PICs”) 1210, flexible PIC concentrators (“FPCs”) 1220, a midplane or backplane 1230, and a system control board(s) 1240 (for switching and/or forwarding). Switching fabric is also shown as a system switch board (“SSB”), or a switching and forwarding module (“SFM”) 1250. Each of the PICs 1210 includes one or more PIC I/O managers 1215. Each of the FPCs 1220 includes one or more I/O managers 1222, each with an associated memory 1224. The midplane/backplane 1230 includes buffer managers 1235a, 1235b. Finally, the system control board 1240 includes an internet processor 1242 and an instance of the forwarding table 1244 (Recall, e.g., 1196 of FIG. 11).

(56) Still referring to FIG. 12, the PICs 1210 contain the interface ports. Each PIC 1210 may be plugged into an FPC 1220. Each individual PIC 1210 may contain an ASIC that handles media-specific functions, such as framing or encapsulation. Some example PICs 1210 provide SDH/SONET, ATM, Gigabit Ethernet, Fast Ethernet, and/or DS3/E3 interface ports.

(57) An FPC 1220 can contain one or more PICs 1210, and may carry the signals from the PICs 1210 to the midplane/backplane 1230 as shown in FIG. 12.

(58) The midplane/backplane 1230 holds the line cards. The line cards may connect into the midplane/backplane 1230 when inserted into the example router's chassis from the front. The control component (e.g., routing engine) 1110 may plug into the rear of the midplane/backplane 1230 from the rear of the chassis. The midplane/backplane 1230 may carry electrical (or optical) signals and power to each line card and to the control component 1110.

(59) The system control board 1240 may perform forwarding lookup. It 1240 may also communicate errors to the routing engine. Further, it 1240 may also monitor the condition of the router based on information it receives from sensors. If an abnormal condition is detected, the system control board 1240 may immediately notify the control component 1110.

(60) Referring to FIGS. 12, 13A and 13B, in some exemplary routers, each of the PICs 1210,1110′ contains at least one I/O manager ASIC 1215 responsible for media-specific tasks, such as encapsulation. The packets pass through these I/O ASICs on their way into and out of the router. The I/O manager ASIC 1215 on the PIC 1210,1110′ is responsible for managing the connection to the I/O manager ASIC 1222 on the FPC 1220,1120′, managing link-layer framing and creating the bit stream, performing cyclical redundancy checks (CRCs), and detecting link-layer errors and generating alarms, when appropriate. The FPC 1220 includes another I/O manager ASIC 1222. This ASIC 1222 takes the packets from the PICs 1210 and breaks them into (e.g., 74-byte) memory blocks. This FPC I/O manager ASIC 1222 sends the blocks to a first distributed buffer manager (DBM) 1235a′, decoding encapsulation and protocol-specific information, counting packets and bytes for each logical circuit, verifying packet integrity, and applying class of service (CoS) rules to packets. At this point, the packet is first written to memory. More specifically, the example DBM ASIC 1235a′ manages and writes packets to the shared memory 1224 across all FPCs 1220. In parallel, the first DBM ASIC 1235a′ also extracts information on the destination of the packet and passes this forwarding-related information to the Internet processor 1242/1142′. The Internet processor 1242/1142′ performs the route lookup using the forwarding table 1244 and sends the information over to a second DBM ASIC 1235b′. The Internet processor ASIC 1242/1142′ also collects exception packets (i.e., those without a forwarding table entry) and sends them to the control component 1110. The second DBM ASIC 1235b′ then takes this information and the 74-byte blocks and forwards them to the I/O manager ASIC 1222 of the egress FPC 1220/1120′ (or multiple egress FPCs, in the case of multicast) for reassembly. 
(Thus, the DBM ASICs 1235a′ and 1235b′ are responsible for managing the packet memory 1224 distributed across all FPCs 1220/1120′, extracting forwarding-related information from packets, and instructing the FPC where to forward packets.)

(61) The I/O manager ASIC 1222 on the egress FPC 1220/1120′ may perform some value-added services. In addition to incrementing time to live (“TTL”) values and re-encapsulating the packet for handling by the PIC 1210, it can also apply class-of-service (CoS) rules. To do this, it may queue a pointer to the packet in one of the available queues, each having a share of link bandwidth, before applying the rules to the packet. Queuing can be based on various rules. Thus, the I/O manager ASIC 1222 on the egress FPC 1220/1120′ may be responsible for receiving the blocks from the second DBM ASIC 1235b′, incrementing TTL values, queuing a pointer to the packet, if necessary, before applying CoS rules, re-encapsulating the blocks, and sending the encapsulated packets to the PIC I/O manager ASIC 1215.

(62) FIG. 14 is a flow diagram of an example method 1400 for providing packet forwarding in the example router. The main acts of the method 1400 are triggered when a packet is received on an ingress (incoming) port or interface. (Event 1410) The types of checksum and frame checks that are required by the type of medium it serves are performed and the packet is output as a serial bit stream. (Block 1420) The packet is then decapsulated and parsed into (e.g., 64-byte) blocks. (Block 1430) The packets are written to buffer memory and the forwarding information is passed on to the Internet processor. (Block 1440) The passed forwarding information is then used to look up a route in the forwarding table. (Block 1450) Note that the forwarding table can typically handle unicast packets that do not have options (e.g., accounting) set, and multicast packets for which it already has a cached entry. Thus, if it is determined that these conditions are met (YES branch of Decision 1460), the packet forwarding component finds the next hop and egress interface, and the packet is forwarded (or queued for forwarding) to the next hop via the egress interface (Block 1470) before the method 1400 is left (Node 1490). Otherwise, if these conditions are not met (NO branch of Decision 1460), the forwarding information is sent to the control component 1110 for advanced forwarding resolution (Block 1480) before the method 1400 is left (Node 1490).
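
The decision at 1460 can be illustrated with a minimal Python sketch. This is not the router's implementation; the `Packet` class, the `forward` function, the tuple-valued table entries, and the returned status strings are all hypothetical names chosen for illustration. The sketch assumes that a packet without a forwarding table entry is treated as an exception packet and handed to the control component.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    # Hypothetical, simplified view of the forwarding information (Block 1440).
    dest: str
    multicast: bool = False
    has_options: bool = False


def forward(packet, forwarding_table):
    """Mimic Decision 1460: fast path if the table can handle the packet."""
    entry = forwarding_table.get(packet.dest)  # route lookup (Block 1450)
    if packet.multicast:
        # Multicast is handled only when a cached entry already exists.
        fast_path = entry is not None
    else:
        # Unicast is handled only when no options (e.g., accounting) are set.
        fast_path = entry is not None and not packet.has_options
    if fast_path:
        next_hop, egress_interface = entry
        return ("forwarded", next_hop, egress_interface)  # Block 1470
    # Otherwise defer to the control component (Block 1480).
    return ("sent_to_control", None, None)


table = {"10.0.0.1": ("192.0.2.1", "ge-0/0/0")}
print(forward(Packet("10.0.0.1"), table))
print(forward(Packet("10.0.0.1", has_options=True), table))
```

In this sketch, both branches return to the caller, which corresponds to the method 1400 being left at Node 1490.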

(63) Referring back to block 1470, the packet may be queued. Actually, as stated earlier with reference to FIG. 12, a pointer to the packet may be queued. The packet itself may remain in the shared memory. Thus, all queuing decisions and CoS rules may be applied in the absence of the actual packet. When the pointer for the packet reaches the front of the line, the I/O manager ASIC 1222 may send a request for the packet to the second DBM ASIC 1235b. The DBM ASIC 1235b reads the blocks from shared memory and sends them to the I/O manager ASIC 1222 on the FPC 1220, which then serializes the bits and sends them to the media-specific ASIC of the egress interface. The I/O manager ASIC 1215 on the egress PIC 1210 may apply the physical-layer framing, perform the CRC, and send the bit stream out over the link.
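
The idea of queuing a pointer while the packet stays in shared memory can be sketched as follows. This is an illustrative Python model only, not the ASIC behavior: the dictionary standing in for shared memory, the integer packet identifier used as the "pointer", and the function names are assumptions, and the 64-byte block size is taken from the example value given for block 1430.

```python
from collections import deque

BLOCK_SIZE = 64  # example block size; the description gives this only as "e.g."


def write_packet(shared_memory, packet_id, payload):
    """Split the payload into fixed-size blocks and store them in shared memory."""
    blocks = [payload[i:i + BLOCK_SIZE] for i in range(0, len(payload), BLOCK_SIZE)]
    shared_memory[packet_id] = blocks
    return packet_id  # only this small handle is queued, not the packet


def dequeue_and_send(queue, shared_memory):
    """When a pointer reaches the front, read its blocks and reassemble them."""
    pointer = queue.popleft()
    blocks = shared_memory.pop(pointer)
    return b"".join(blocks)


shared_memory = {}
queue = deque()
payload = bytes(range(150))
queue.append(write_packet(shared_memory, 1, payload))
assert dequeue_and_send(queue, shared_memory) == payload
```

Because the queue holds only handles, queuing decisions in this model never touch the packet bytes, which mirrors the statement that CoS rules may be applied in the absence of the actual packet.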

(64) Referring back to block 1480 of FIG. 14, as well as FIG. 12, regarding the transfer of control and exception packets, the system control board 1240 handles nearly all exception packets. For example, the system control board 1240 may pass exception packets to the control component 1110.

(65) Although example embodiments consistent with the present invention may be implemented on the example routers of FIG. 10 or 11, embodiments consistent with the present invention may be implemented on communications network nodes (e.g., routers, switches, etc.) having different architectures, or even a remote server (e.g., a path computation element (“PCE”)). More generally, embodiments consistent with the present invention may be implemented on an example system 1500 as illustrated in FIG. 15.

(66) FIG. 15 is a block diagram of an exemplary machine 1500 that may perform one or more of the processes described, and/or store information used and/or generated by such processes. The exemplary machine 1500 includes one or more processors 1510, one or more input/output interface units 1530, one or more storage devices 1520, and one or more system buses and/or networks 1540 for facilitating the communication of information among the coupled elements. One or more input devices 1532 and one or more output devices 1534 may be coupled with the one or more input/output interfaces 1530. The one or more processors 1510 may execute machine-executable instructions (e.g., C or C++ running on the Linux operating system widely available from a number of vendors such as Red Hat, Inc. of Durham, N.C.) to effect one or more aspects of the present invention. At least a portion of the machine executable instructions may be stored (temporarily or more permanently) on the one or more storage devices 1520 and/or may be received from an external source via one or more input interface units 1530. The machine executable instructions may be stored as various software modules, each module performing one or more operations. Functional software modules are examples of components of the invention.

(67) In some embodiments consistent with the present invention, the processors 1510 may be one or more microprocessors and/or ASICs. The bus 1540 may include a system bus. The storage devices 1520 may include system memory, such as read only memory (ROM) and/or random access memory (RAM). The storage devices 1520 may also include a hard disk drive for reading from and writing to a hard disk, a magnetic disk drive for reading from or writing to a (e.g., removable) magnetic disk, an optical disk drive for reading from or writing to a removable (magneto-) optical disk such as a compact disk or other (magneto-) optical media, or solid-state non-volatile storage.

(68) Some example embodiments consistent with the present invention may also be provided as a machine-readable medium for storing the machine-executable instructions. The machine-readable medium may be non-transitory and may include, but is not limited to, flash memory, optical disks, CD-ROMs, DVD ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards or any other type of machine-readable media suitable for storing electronic instructions. For example, example embodiments consistent with the present invention may be downloaded as a computer program which may be transferred from a remote computer (e.g., a server) to a requesting computer (e.g., a client) by way of a communication link (e.g., a modem or network connection) and stored on a non-transitory storage medium. The machine-readable medium may also be referred to as a processor-readable medium.

(69) Example embodiments consistent with the present invention (or components or modules thereof) might be implemented in hardware, such as one or more field programmable gate arrays (“FPGA”s), one or more integrated circuits such as ASICs, one or more network processors, etc. Alternatively, or in addition, embodiments consistent with the present invention (or components or modules thereof) might be implemented as stored program instructions executed by a processor. Such hardware and/or software might be provided in an addressed data (e.g., packet, cell, etc.) forwarding device (e.g., a switch, a router, etc.), a laptop computer, desktop computer, a tablet computer, a mobile phone, or any device that has computing and networking capabilities.

§ 4.4 Refinements, Alternatives and Extensions

(70) Referring back to block 750 of FIG. 7, the set of SR segment list(s) needed to steer traffic over the i.sup.th CSG may be determined using the technique illustrated in FIGS. 16 and 17. Block reference numbers used in the flow diagram of FIG. 17 are annotated onto the pseudo code of FIG. 16.

(71) Referring back to block 760 of FIG. 7, tuning the loadshares to maximize the bandwidth capacity that may be carried over the i.sup.th CSG is a non-linear programming problem that may be solved using Sequential Least SQuares Programming (“SLSQP”).
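
The quantity being maximized can be made concrete with a small Python sketch. It is a toy under stated assumptions, not the described method: the link names, residual capacities, and ECMP fractions are invented, and a coarse grid search over two segment lists stands in for the SLSQP solver purely to keep the example dependency-free. Here `link_fractions[i][link]` is assumed to be the fraction of segment-list i's traffic that ECMP places on a link.

```python
def max_demand(loadshares, link_fractions, residual):
    """Largest total demand that keeps every link within its residual capacity."""
    best = float("inf")
    for link, capacity in residual.items():
        # Fraction of the total demand that lands on this link.
        share = sum(l * f.get(link, 0.0) for l, f in zip(loadshares, link_fractions))
        if share > 0:
            best = min(best, capacity / share)
    return best


def tune_two_way(link_fractions, residual, steps=100):
    """Grid-search stand-in for SLSQP: try loadshares (l, 1 - l)."""
    best_l, best_x = 0.0, 0.0
    for i in range(steps + 1):
        l = i / steps
        x = max_demand([l, 1.0 - l], link_fractions, residual)
        if x > best_x:
            best_l, best_x = l, x
    return best_l, best_x


# Hypothetical data: segment list 1 uses link A; segment list 2 is split
# evenly by ECMP over links B and C.
link_fractions = [{"A": 1.0}, {"B": 0.5, "C": 0.5}]
residual = {"A": 10.0, "B": 4.0, "C": 6.0}

print(max_demand([0.5, 0.5], link_fractions, residual))  # equal split
print(tune_two_way(link_fractions, residual))            # tuned split
```

With these invented numbers, an equal split is limited by link B, while shifting more of the demand onto the first segment list raises the total that fits. A real solver would search the loadshares continuously and over any number of segment lists.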

(72) FIGS. 18-20 illustrate different architectures that may be used to implement an example method consistent with the present description. More specifically, FIG. 18 illustrates a centralized architecture in which a centralized controller includes a path computation element (PCE), a resource manager (RM) and a BGP route reflector (RR). The PCE can communicate with the RM and the BGP RR using, for example, Google's open source remote procedure call (gRPC) and/or the PCE protocol (PCEP). The PCE includes a segment routing (SR) bandwidth (BW) path computation engine which uses and/or generates information in a traffic engineering database (TED) and a label switched path (LSP) database. The RM includes a link CAC (Call Admission Control, for performing admission control on the link(s) that constitute the SR path) database. More specifically, the RM includes a database of the SR link(s) in the network, in which SR path reservation(s) and admission are performed and maintained for the traversed link(s). The BGP RR may store BGP link state information. As shown, R1 in domain 1 can communicate with the PCE using, for example, PCEP. Local state information, such as link capacity, link utilization, and per SID traffic rates can be communicated from each of the domains to the RM. This may be done using, for example, BGP-LS, Telemetry, and/or SNMP. Further, information can be exchanged between each of the domains and the BGP RR using, for example, BGP-LS.
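
The role of the link CAC database can be sketched in Python as a per-link reservation table with all-or-nothing admission. This is a hedged illustration only: the `LinkCAC` class, its method names, and the rule that a path is admitted only if every traversed link has enough residual capacity are assumptions made for the example, not details taken from the figures.

```python
class LinkCAC:
    """Toy per-link admission control: track reservations against capacity."""

    def __init__(self, capacity):
        self.capacity = dict(capacity)
        self.reserved = {link: 0.0 for link in capacity}

    def residual(self, link):
        return self.capacity[link] - self.reserved[link]

    def admit(self, links, bandwidth):
        """Reserve bandwidth on every traversed link, or on none of them."""
        if any(self.residual(link) < bandwidth for link in links):
            return False  # rejected: no state is changed
        for link in links:
            self.reserved[link] += bandwidth
        return True

    def release(self, links, bandwidth):
        """Return a previously admitted reservation to the pool."""
        for link in links:
            self.reserved[link] -= bandwidth


cac = LinkCAC({"R1-R2": 10.0, "R2-R3": 5.0})
print(cac.admit(["R1-R2", "R2-R3"], 4.0))  # True: both links have room
print(cac.admit(["R1-R2", "R2-R3"], 4.0))  # False: R2-R3 has only 1.0 left
print(cac.residual("R1-R2"))               # 6.0: the rejected request reserved nothing
```

Checking all links before reserving on any of them keeps a rejected request from leaving partial reservations behind.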

(73) FIG. 19 illustrates an architecture using distributed computation and distributed CAC. In this example, the centralized node includes a central instance of the RM and the BGP RR. As shown, domain 1 includes the PCE and local RM with a local instance of the CAC, while domain 2 includes a local RM with a local instance of the CAC. The SR BW path computation module may communicate with the BGP RR using, for example, BGP-LS. The local instances of the RM may communicate with the centralized RM using, for example, gRPC, PCEP, and/or BGP. Finally, local state information, such as link capacity, link utilization, and per SID traffic rates can be communicated from each of the domains to the BGP-RR. This may be done using, for example, BGP-LS.

(74) Finally, FIG. 20 illustrates an architecture using distributed computation and a centralized CAC. In this example, the centralized node includes the RM and the BGP RR. As shown, domain 1 includes the PCE. The SR BW path computation module may communicate with the BGP RR using, for example, BGP-LS. The SR BW path computation module may communicate with the RM to request certain allocations, and receive a response to its request(s). Local state information, such as link capacity, link utilization, and per SID traffic rates can be communicated from each of the domains to the BGP-RR. This may be done using, for example, BGP-LS.

§ 4.5 CONCLUSIONS

(75) Example embodiments consistent with the present description allow the setup of SR path(s) with bandwidth guarantees in an SR network.

(76) Example embodiments consistent with the present description are applicable to SRv6 and SR-MPLS dataplane technologies.

(77) Example embodiments consistent with the present description enable auto-bandwidth to work for SR Path(s).

(78) Example embodiments consistent with the present description can work on a central computation server, where per path reservations are managed centrally.

(79) Example embodiments consistent with the present description are compatible with RSVP-TE LSP and bandwidth reservations in the same network.
