Apparatus and method for synchronized networks
10135721 · 2018-11-20
Inventors
Cpc classification
H04L45/00
ELECTRICITY
Y02D30/50
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
H04L69/321
ELECTRICITY
H04L69/32
ELECTRICITY
International classification
H04L7/00
ELECTRICITY
Abstract
An apparatus and method for network routing is provided. Synchronized networks are disclosed which enable fast connection set up and release in a tiered hierarchy of circuit switched nodes. Nodes in the network can aggregate and disaggregate data according to a transform algorithm allowing for dynamic frame and frame segment sizing. Connections within the network can be organized by paired connections performing aggregation and disaggregation according to control vectors.
Claims
1. A method of routing network data, the method comprising: receiving network data including data designating a destination address; identifying a paired connection with an exit node corresponding to the destination address, changing the identified paired connection from a virtual state to a real state; assigning a bandwidth to the real state paired connection; implicitly addressing the network data to generate an aggregated data stream of cellets containing the received network data, the aggregated data stream comprising at least one space/time frame including at least one cellet from the aggregated data stream of cellets, wherein a position of the at least one cellet in the at least one space/time frame is determined based on the identified paired connection, and wherein a size of the at least one cellet or a number of cellets included in the at least one space/time frame is determined based on the assigned bandwidth; and transmitting the cellets over the real state paired connection.
2. The method of claim 1, wherein the virtual state paired connection stores connection parameters in a connection bandwidth register while assigning zero bandwidth to the connection.
3. The method of claim 1, wherein the assigned bandwidth of the real state connection pair is dynamically adjusted by changing the number of cellets in the at least one space/time frame of the aggregated data stream.
4. The method of claim 1, further comprising: transmitting control vectors passed to an exit node including data indicative of disaggregating the implicitly addressed cellets, such that the disaggregation of the aggregated data stream is performed at the exit node.
5. The method of claim 1, further comprising: determining a period of the at least one space/time frame and/or the size of the at least one cellet based on a data rate of a trunk or link and a minimum required bandwidth for any packet flow on the trunk or link.
6. The method of claim 5, further comprising: allocating a number of cellets per space/time frame to the packet flow based on a ratio of a required bandwidth of a first packet flow to the minimum required bandwidth for any packet flow on the trunk or link.
7. The method of claim 6, wherein the cellets allocated to the first packet flow are nearly uniformly spaced in a space/time domain.
8. The method of claim 7, wherein the variations in the required bandwidth of the first packet flow include zero bandwidth, and wherein allocation corresponding to zero bandwidth is zero cellet, and the zero-bandwidth connection is maintained in a virtual state.
9. The method of claim 1, further comprising: determining a route for the connection from a plurality of routes based, at least in part, on route latencies.
10. The method of claim 1, wherein the control vector exists in a control plane physically and logically unreachable from a user port.
11. A system for performing network data routing, comprising: an entry node associated with a packet source of a network, configured to: receive network data including data designating a destination address; identify a paired connection with an exit node corresponding to the destination address, change the identified paired connection from a virtual state to a real state; assign a bandwidth to the real state paired connection; implicitly address the network data to generate an aggregated data stream of cellets containing the received network data, the aggregated data stream comprising at least one space/time frame including at least one cellet from the aggregated data stream of cellets, wherein a position of the at least one cellet in the at least one space/time frame is determined based on the identified paired connection, and wherein a size of the at least one cellet or a number of cellets included in the at least one space/time frame is determined based on the assigned bandwidth; and transmit the cellets over the real state paired connection.
12. The system of claim 11, wherein the virtual state paired connection stores connection parameters in a connection bandwidth register while assigning zero bandwidth to the connection.
13. The system of claim 11, wherein the assigned bandwidth of the real state connection pair is dynamically adjusted by changing the number of cellets in the at least one space/time frame of the aggregated data stream.
14. The system of claim 11, wherein the entry node transmits control vectors passed to an exit node including data indicative of disaggregating the implicitly addressed cellets, such that the disaggregation of the aggregated data stream is performed at the exit node.
15. The system of claim 11, wherein the entry node determines a period of the at least one space/time frame and/or the size of the at least one cellet based on a data rate of a trunk or link and a minimum required bandwidth for any packet flow on the trunk or link.
16. The system of claim 15, wherein the entry node allocates a number of cellets per space/time frame to the packet flow based on a ratio of a required bandwidth of a first packet flow to the minimum required bandwidth for any packet flow on the trunk or link.
17. The system of claim 16, wherein the cellets allocated to the first packet flow are nearly uniformly spaced in a space/time domain.
18. The system of claim 17, wherein the variations in the required bandwidth of the first packet flow include zero bandwidth, and wherein allocation corresponding to zero bandwidth is zero cellet, and the zero-bandwidth connection is maintained in a virtual state.
19. The system of claim 11, wherein the entry node determines a route for the connection from a plurality of routes based, at least in part, on route latencies.
20. The system of claim 11, wherein the control vector exists in a control plane physically and logically unreachable from a user port.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF THE EMBODIMENTS
(31) Details, with reference to the figures, disclose several illustrative preferred embodiments for implementing the system and method of this application. A person of ordinary skill in the art to which the device and method described herein pertain will understand or appreciate the features of certain embodiments. For conciseness and readability, such a person of ordinary skill in the art is referred to herein as a skilled artisan.
(32) A skilled artisan, in light of this disclosure, will appreciate that certain components described herein can advantageously be implemented using computer software, hardware, firmware, or any combination of software, hardware, and firmware. Though network nodes will typically implement control elements in hardware or firmware, any control logic that can be implemented using hardware is implementable using various combinations of hardware, software, or firmware not described herein. For example, firmware or software on a general-purpose computer can completely implement such control.
(33) A skilled artisan, in light of this disclosure, could divide or combine the modules described herein. For example, in light of this disclosure, a skilled artisan will appreciate that a single component can provide the functionality of a number of components in a network. Conversely, any one component is divisible into multiple components.
(34) The foregoing and other variations to the embodiments described herein are achievable by a skilled artisan without departing from the invention. With the understanding therefore, that the described embodiments are illustrative and that the invention is not limited to the described embodiments, certain embodiments are described below with reference to the drawings.
(35) Top Level Overview of a Synchronized Adaptive INfrastructure (SAIN)
(36) Synchronized Adaptive INfrastructure (SAIN) is a digital networking technology that enables setting up and taking down circuit connections in less than a millisecond. This technology is the subject of the present application. The technology enables a bright line separation of user data from routed data transport as shown in
(37) The forwarding delay at SAIN transit nodes can be a few nanoseconds with no greater jitter or delay variation. The SAIN approach can also greatly reduce system power consumption due to the technology using data fragments as short as one bit with time division switches tied to semi-static memory maps. No data headers exist for routing within the network and there are no jitter-producing packet buffers. All packet buffers exist only at ingress and egress nodes that connect to user terminal equipment.
(38) Although the technology disclosed herein is usable in a number of network architectures and structures, this application focuses on a two-tier structure shown in the bottom two tiers of
(39) An E-Node is the interface level node of a SAIN network to user devices. A group of E-Nodes can be connected to a Transit Node (T-Node) which performs hierarchical routing to a destination T-Node. An E-Node contains Parameterized User Interfaces (PUIs) to the user world and conversion to the SAIN transport world disclosed in this application. Each E-Node switch can set up and manage connections within a path to every other E-Node in a network. It accomplishes this by sending data to and receiving aggregated data from a parent T-Node that can act both as a source T-Node and a destination T-Node. Each source T-Node forwards connection aggregations from each of its child Source E-Nodes segregated into superpaths destined to each T-Node for forwarding on to the children of the T-Node.
(40) A parent source T-Node reaggregates the traffic of its child source E-Nodes from a one-source-E-Node-to-all-destination-E-Nodes appearance into a one-destination-E-Node-from-all-source-E-Nodes appearance. It forwards its reaggregations to each T-Node as appropriate in the network. As shown in more detail below, the process contains the following steps:
1. A source E-Node aggregates incoming traffic into paths destined to all other (destination) E-Nodes.
2. The source E-Node aggregates its paths into superpaths destined to destination T-Nodes, the superpaths being aggregations of the paths for each destination T-Node's child E-Nodes.
3. The source E-Node aggregates the superpaths into a higher-level superpath that is capable of forwarding all traffic generated by the source E-Node to its parent source T-Node.
4. The source T-Node rearranges the one-to-many E-Node source structure into a many-to-one destination structure.
5. The source T-Node sends the destination structure to each destination T-Node.
6. The destination T-Node disaggregates the structure and sends all source E-Node traffic to each destination E-Node.
7. The destination E-Node disaggregates the traffic from source E-Nodes into paths for delivery of traffic to users. This process is the inverse of the aggregation methods used at each source E-Node.
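Step 4 above, the rearrangement from a one-to-many source structure into a many-to-one destination structure, can be pictured as a regrouping of traffic by destination. The following Python fragment is only an illustrative sketch: the function name `regroup_by_destination`, the node labels, and the use of dictionaries to stand in for aggregations are assumptions made for this example and do not appear in the application.

```python
def regroup_by_destination(per_source):
    """Regroup {source: {destination: payload}} into
    {destination: {source: payload}}.

    The dictionaries stand in for aggregations (hypothetical model);
    a real node would rearrange cellet positions, not dictionary keys.
    """
    per_destination = {}
    for source, outgoing in per_source.items():
        for destination, payload in outgoing.items():
            per_destination.setdefault(destination, {})[source] = payload
    return per_destination


# Two child source E-Nodes, each sending to two destination E-Nodes.
per_source = {
    "E1": {"E3": "a", "E4": "b"},
    "E2": {"E3": "c", "E4": "d"},
}
per_destination = regroup_by_destination(per_source)
print(per_destination)
```

Each destination entry now collects traffic from every source, which mirrors the many-to-one structure that the source T-Node forwards to each destination T-Node.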
(41) Each destination T-Node forwards each source aggregation to the child E-Node destination. The result is a network that uses pre-established routes to remove the need for dynamic hop-by-hop routing. A SAIN network need not replace existing networks, but can overcome limitations that now exist by interconnecting with existing networks. In addition, it provides superior service at low cost and low power requirements in greenfield and upgrade applications, particularly those involved with optical fiber transmission.
(42) The Role of Prior Art
(43)
(44) Overall SAIN Structure
(45)
(46) A part of a termination at each end of an Underlay Network 100 includes an inline Parameterized User Interface 210 that connects to a Protocol Translator 214. A Protocol Translator 214 translates any user protocol, including specifically Ethernet, into a serial bit stream where an incoming packet header can be replaced by a Connection Identifier, which is a compressed header that can be as small as one or two bytes.
(47) Inside the Underlay Network 100, there are data units called cellets. A cellet can be a data fragment whose length is any number of bits, depending on its environment. It is a data element of fixed size for a given data link, but can vary from link to link. A cellet can be as small as one bit or as large as needed to forward high bandwidth aggregations.
(48) Packet-based addressing is termed explicit addressing. Explicit addressing appears in packet-based encapsulation of data. A packet header includes a source and destination address, the formal nature of which depends on a specific protocol, such as Ethernet or the Internet Protocol (IP).
(49) Cellets can exist within periodic time division multiplexed frames. Such frames can vary in duration from nanoseconds to seconds depending on the type of data forwarded within a network. A frame is a collection of cellets, the number of which defines the frame size. For a given connection, the position of its cellets determines its identification.
(50) This method of identifying one connection from another is termed implicit addressing. Implicit addressing enables addressing of a data element, such as a cellet, by its position within a time or space division frame. For example, the third cellet in a time division frame might belong to a connection from point A to point B while the fourth cellet could belong to a connection from point C to point D.
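The idea of implicit addressing can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the multiplexer of this application: the slot map contents, the connection labels, and the function name `demultiplex` are hypothetical. The point is only that no cellet carries a header; the position in the frame selects the connection.

```python
from collections import defaultdict

# Hypothetical slot map: frame position (zero-based) -> connection label.
# Position 2 is the third cellet, position 3 the fourth.
SLOT_MAP = {0: "E-to-F", 1: "E-to-F", 2: "A-to-B", 3: "C-to-D"}


def demultiplex(frame, slot_map):
    """Group cellets by connection using only their position in the frame."""
    streams = defaultdict(list)
    for position, cellet in enumerate(frame):
        connection = slot_map.get(position)
        if connection is not None:  # unassigned positions belong to no connection
            streams[connection].append(cellet)
    return dict(streams)


frame = [1, 0, 1, 1]  # four one-bit cellets, no headers
streams = demultiplex(frame, SLOT_MAP)
print(streams)
```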
(51) Implicit addressing can reduce the amount of bandwidth required for explicit addressing methods using packet headers and is especially suited to variable speed time division switching applications. This method is very robust compared to using packet headers to identify a connection to which the packet data belongs. Implicit addressing is the basic addressing method used within the world's legacy telephone network.
(52) A major point of differentiation between the telephone network and a SAIN network is the ability of the SAIN network to use cellets of different lengths in different parts of the network. The SAIN multiplexing algorithm can provide transport methods for any type of data protocol that exists outside the SAIN Underlay Network 100.
(53) A second major point of differentiation is that explicit addressing depends upon getting a packet sent to its proper destination without error. With implicit addressing, the likelihood of an error in setting up a connection is extremely small. More importantly, once it is set up, there are no further addressing messages required for the duration of a data epoch, which can vary from sub-microseconds to years. Any data errors that may occur are independent of setting up a connection itself. This is the reason that the telephone network is so reliable with high quality service. The same reliability can occur in a SAIN network. Once a connection is set up, reading of packet headers occurs only at the network edge in order to send a packet to an assigned FIFO buffer connected to a location in a SAIN switch for the duration of a packet flow. If a customer desires private line service, the connection, once set up, can exist for any amount of time as long as the customer pays his bill. There need be no special engineering.
(54) The major shortcoming of the telephony approach results from the methods used to set up a connection; the setup time was ridiculously long for data applications. In a SAIN network, the connection time including setup time can be one millisecond or less without wasting bandwidth. In many circumstances, a connection can switch from a virtual state or a sleep state to a real state in microseconds or less using a one-way message that can often be one byte or less in length encapsulated in Control Vectors. A virtual state requires no bandwidth, and a sleep state uses a very small amount of keep-alive bandwidth. Virtual and sleep states are further described below. A Control Vector can be an implicitly addressed message made up of cellets, each of which is a message applied to some aspect of a communication process.
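The three connection states named above can be modeled as a small record whose bandwidth is expressed as a number of cellets per frame. The Python sketch below is purely illustrative: the class and method names, and the choice of one cellet per frame as the keep-alive allocation, are assumptions made for this example rather than details taken from the application.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    VIRTUAL = "virtual"  # parameters stored, zero bandwidth
    SLEEP = "sleep"      # small keep-alive bandwidth
    REAL = "real"        # carrying traffic


@dataclass
class PairedConnection:
    exit_node: str
    state: State = State.VIRTUAL
    cellets_per_frame: int = 0

    def activate(self, cellets_per_frame: int) -> None:
        """Move to the real state with the given bandwidth allocation."""
        if cellets_per_frame <= 0:
            raise ValueError("a real connection needs at least one cellet per frame")
        self.cellets_per_frame = cellets_per_frame
        self.state = State.REAL

    def sleep(self, keep_alive_cellets: int = 1) -> None:
        """Keep a minimal allocation (assumed here to be one cellet)."""
        self.cellets_per_frame = keep_alive_cellets
        self.state = State.SLEEP

    def release(self) -> None:
        """Return to the virtual state: parameters kept, bandwidth zero."""
        self.cellets_per_frame = 0
        self.state = State.VIRTUAL


conn = PairedConnection(exit_node="E-7")  # starts virtual with zero bandwidth
conn.activate(cellets_per_frame=3)
```

In this model a state change only rewrites one small field, which is consistent with the short one-way messages described above, although the actual message format is not modeled here.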
(55) Two parameters determine the cellet and frame size required in a SAIN network. These are: 1. trunk or link data rate (bits per second); and 2. the minimum amount of bandwidth required to transport a user connection.
(56) For purposes of this application, a trunk is a physical object such as an optical fiber, wired, or wireless connection that carries data traffic across a network. A link is a logical object that is a connection embedded within a trunk. Both a link and a trunk generally carry a plurality of connections such as implicitly or explicitly defined time/space division objects.
(57) For example, a link with a data rate of one gigabit per second (Gbps) can support a connection whose data rate is less than one Gbps with a one-bit cellet. The number of cellets in a frame depends on the minimum amount of bandwidth required. The minimum amount of bandwidth required, called the Quantum Data Rate (QDR), equals the cellet size divided by the period of a frame. The frame period is the number of bits per frame (the number of cellets per frame multiplied by the cellet size) divided by the link data rate; with one-bit cellets, this is the number of cellets per frame divided by the link data rate. For example, if a frame period is one microsecond and the cellet size is one bit, the QDR is 1,000,000 bits per second. For link data rates greater than 1 Gbps, a cellet can be larger. For example, an 8-bit cellet can encapsulate an aggregation of one-bit cellets where the data rate of the aggregation is 8 Gbps or less. For a one-microsecond frame, the QDR would be 8,000,000 bits per second.
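The arithmetic in the preceding paragraph can be checked with a short Python sketch. The function names are hypothetical, the frame size of 1,000 cellets is chosen only so that the frame period comes out to one microsecond, and rounding a flow up to a whole number of cellets is an assumption of this example. Exact fractions are used to avoid floating-point rounding.

```python
import math
from fractions import Fraction


def frame_period_s(cellets_per_frame, cellet_bits, link_rate_bps):
    """Frame period in seconds: bits per frame divided by the link data rate."""
    return Fraction(cellets_per_frame * cellet_bits, link_rate_bps)


def quantum_data_rate_bps(cellet_bits, period_s):
    """QDR: the bandwidth represented by one cellet per frame."""
    return Fraction(cellet_bits) / period_s


def cellets_for_flow(required_bps, qdr_bps):
    """Cellets per frame for a flow, rounded up to a whole cellet (assumption)."""
    return math.ceil(Fraction(required_bps) / qdr_bps)


# 1 Gbps link, one-bit cellets, 1,000 cellets per frame -> 1 microsecond frame.
period_1 = frame_period_s(1000, 1, 10**9)
qdr_1 = quantum_data_rate_bps(1, period_1)

# 8 Gbps link, 8-bit cellets, 1,000 cellets per frame -> 1 microsecond frame.
period_8 = frame_period_s(1000, 8, 8 * 10**9)
qdr_8 = quantum_data_rate_bps(8, period_8)

print(period_1, qdr_1, qdr_8, cellets_for_flow(2_500_000, qdr_1))
```

A flow needing 2.5 Mbps on the first link would be given three cellets per frame under the round-up assumption, since each cellet represents 1 Mbps.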
(58) The embodiments of this application divide traffic made up of either aggregations of user connection data or aggregations of such aggregations into frames of cellets. They use methods of setting up a connection and its bandwidth by: 1) defining the number of cellets per frame for each connection or aggregations thereof, and 2) providing a clocking mechanism that places each data cellet into an assigned physical time and/or space location within the frame. The term connection is a generic term used for an aggregation of connections as well as for a connection at the user level.
(59) Methods disclosed in this application use Synchronized Adaptive Infrastructure technology; they are SAIN methods. SAIN methods disclosed herein are useful for implementing digital communication networks for any purpose. One goal is to build networks that interconnect with and use components of existing networks. A second goal is to lay a foundation for a new generation of networking to meet current and future challenges. The embodiments focus on methods that can benefit future networking in general while significantly enhancing current networks.
(60) To make the methods and apparatus easy to explain and understand, the examples used in the drawings and discussion are for Metropolitan Networks in general and Metropolitan Ethernet Networks in particular. Doing so does not limit the use of the technology in other contexts in any way. This application usually describes apparatus in hardware terms. As is known to those skilled in the art, components described in hardware can also be implemented in software, and software versions can produce the same results.
(61) The following are some of the basic aspects of the technology and words used to define its methods either in the aggregate or individually:
1. Two-point connection controls can eliminate the need for hop-by-hop connection routing. This establishment of connections and dynamic control of their bandwidths can take place only at source and destination points inside the bright line SAIN Underlay Network 100. This network control is separated from user data protocols.
2. The network's control plane is physically and logically unreachable from a user port, thereby enhancing network security.
3. Connections can be set up on a simplex basis; a duplex connection consists of two simplex connections.
4. SAIN networking can exist in a hierarchical network topology of two or more tiers thereby enabling massive distribution of network control.
5. Synchronizing network nodes to a common clock can eliminate most of the complexity and stochastic nature of asynchronously clocked alternatives.
6. The basis for switching can use a physical circuit-based multiplexing mechanism described in referenced patents of the inventor. This overcomes limitations placed on circuit switching of the past and allows much greater scalability to both low and high bandwidths with low deterministic latency.
7. Using semi-static routing with a large choice of route alternatives in place of dynamic hop-by-hop routing further simplifies networking. The approach results in deterministic operational parameters, including dynamic connection bandwidth.
8. Because of the synchronized nodes, latency can become the only metric required for deterministic Quality of Service (QoS) in a SAIN network. Inside a SAIN network, packet buffer congestion need not exist so that packet-loss-rate as a QoS parameter is not meaningful.
9. This fundamental structure along with deployment parameters makes jitter and delay variation small enough to be negligible.
10. Except in catastrophic circumstances, the control mechanism of the network guarantees delivery of all data accepted into the network.
11. Traffic shaping and user policing are requirements in stochastic networks.
Overall, stochastic networking of the 1970s was a suitable choice for the message- and file-transfer-based traffic market of the time. Today's voice, video, and multimedia markets are predominantly traffic flows, i.e., they are circuit-based. Morphing a stochastic network into a circuit-based network with protocol overlays has been no small task in today's network. Placing a circuit-based underlayer beneath what already exists is much simpler. It is much less expensive in capital and operational costs using much less source electrical power.
(62) The current Internet requires a relatively small number of ever-larger one size fits all edge routers. The SAIN network structure morphs into a huge number of massively distributed mini-edge-routers. Each mini-router focuses on local users' languages, social and commercial needs, and inclusive interconnectivity within a Metro Network and the outside world. All user data in native user protocols exists only at the ingress and egress edges of a SAIN Underlay Network 100. Internally, the network exists between the Host, Terminal, Server, or Network 101 ingress and egress connections using the OSI Layer 2 and above protocols and the physical transport Layer 1 of the OSI Model. In other words, it exists in its own SAIN Underlayer 1.5. This definition does not preclude using protocols that emulate Physical Layer 1 on which the SAIN Underlayer can exist.
(63) A Host, Terminal, Server, or Network 101, using any manner of digital access protocols, connects to the SAIN network through a User Interface Connection 290 much like legacy networks. A major goal of SAIN networking is to provide users with a network that supports their current needs without requiring modification of current user applications. An additional goal is to enable service providers to overcome current network deficiencies of scalability, performance, and cost while using predominantly existing network deployments.
(64) The top-level principle of a SAIN Underlay Network 100 emphasizes one of its main benefits compared to existing networks. The SAIN network converts user data into bit streams that conform to a simple forwarding protocol used throughout an underlayer 1.5.
(65) The main purpose of the forwarding protocol is to transfer user data bits transparently from source to destination end-points in a robust and deterministic manner. The methods use synchronized clocks among switching nodes of the network in a manner that eliminates most of the complexity and service quality problems caused by the stochastic nature of current networks. The clocking mechanism can focus on synchronizing node clocks with one another. This can include synchronizing all nodes to Coordinated Universal Time (UTC) based on existing network synchronizing techniques.
(66) SAIN nodes with synchronized clocks use deterministic methods to overcome bursty data at network entry ports before accepting data for delivery. Packet buffers placed before data entry into the SAIN Underlay Network 100 assure delivery without packet loss. Packet buffers are not relied on inside the SAIN Underlayer 1.5. Placing buffers within routers in legacy networks is a major cause of Quality of Service complexity and poor performance. SAIN methods reduce the burstiness of data presented to legacy core and access networks.
(67) A Parameterized User Interface 210 is a flexible data interface that can be 1) generic for commodity data types and 2) application-specific for special data types. The PUI 210 can be replaceable and upgradable to meet changing user or network provider needs.
(68) The SAIN network can use elements of current networks. For example, a Parameterized User Interface 210 extracts information from user input data in sufficient detail to determine the intended egress destination(s) within the network. It can also determine the service class to which the traffic belongs. Unlike conventionally routed networks, the network prioritizes traffic by applying more bandwidth or less bandwidth for each traffic type. It can adjust bandwidth to meet tight latency specifications both for bursty traffic and for traffic flows with time-varying data rates. Without substantial overprovisioning, the SAIN approach prevents network congestion and dropped packets that force retransmission of data where bandwidth is already scarce.
(69) This first level of aggregation eliminates a substantial amount of router complexity and processing power required of packet-based approaches. An important interface will focus on Metropolitan Area Network (MAN) Ethernet standards such as those defined by the IEEE, ITU, ANSI, and organizations like the Metro Ethernet Forum (MEF). This focus does not suggest limiting the universe of types of networks for which embodiments herein are applied.
(70) The interface to the SAIN network includes the word Parameterized for an important reason. Since the SAIN network uses an internal data transfer protocol that is universally applicable to all network access protocols, there can be many variations of Parameterized User Interface (PUI) 210 to accommodate the outside world. Each connected Ingress PUI 211/Egress PUI 212 pair supports mutually compatible protocols for specific user applications. Beyond that, there are no limiting technical restrictions. The parameterized nature of the interfaces allows new user access protocols to be added to Ingress PUIs 211 and Egress PUIs 212 by software downloads from users, their organizations (in Virtual Private Networks, for example), or from network service providers as upgrades.
(71) One additional advantage of the Parameterized User Interface 210 approach is the distributed nature of dealing with a wide variety of traffic types. A large number of highly distributed small processors replace the complex all things to all people large edge node routers. Distributing processing power within a large network of relatively simple elements can be an effective way to generate enormous processing power at relatively low cost.
(72) The Ingress PUI 211 uses Protocol Translator 214 functions to encapsulate user data protocols at a source end-point into the SAIN network transfer protocol. At a destination end-point, the Protocol Translator 214 changes the SAIN internal network protocol back into a user-friendly form. An E-Node contains a plurality of user connections through the Ingress PUIs 211 and Egress PUIs 212 defined above.
(73) The disclosures of this application are in the context of the hierarchical structure shown in
The Basics of the SAIN Transform Algorithm
(74)
(75) For purposes of this application, a frame of data is a periodic, ordered, time/space collection of cellets where each cellet consists of a defined number of data bits. Within a given frame, cellets have the same number of bits. Each cellet is bound to a specific connection (or aggregation thereof as a new connection). In other words, each cellet is a fragment from a short or long serial stream of data. To transmit a plurality of data streams within a single frame, cellets from the plurality of connections are intermixed within the frame. The SAIN transform algorithm places cellets from a given connection nearly uniformly spaced throughout a Time/Space Division. Each cellet represents a quantum of bandwidth equal to the number of bits in a cellet divided by the period of the frame. In other words, a cellet represents a Quantum Data Rate (QDR) equal to the number of bits in a cellet multiplied by a periodic Frame Rate.
(76) For purposes of brevity, the use of time/space becomes time. In other words, phrases like Time/Space Division Multiplexing become Time Division Multiplexing. Unless specifically pointed out to the contrary within this application, the word time related in some way to multiplexing implies both time and space as the basis therefor.
(77) The SAIN Transform Algorithm includes defining a frame of cellets in two domains, a Connection Domain 150 and a Time Domain 160 shown in the figures. The Connection Domain 150 shown in
(78) The second domain is the Time Domain 160 shown in
(79)
(80) For example, starting with the Connection Domain 150, the first cellet on the left is 0 (i.e., 0000 in binary notation). Clearly, the matching cellet in the Time Domain 160 is also 0000 (i.e., 0 in decimal notation). The next cellet to the right in the Connection Domain 150 is 0001, with matching cellet 1000 (i.e., 8 in decimal notation) in the Time Domain 160. Cellets 0010 and 0011 follow in the Connection Domain 150 with matching cellets 0100 and 1100 (i.e., 4 and 12) in the Time Domain 160. The inverse is also true; each cellet in the physical Time Domain 160 points to a cellet in the Connection Domain 150. Physical cellets exist only in the Time Domain 160. The cellets in the Connection Domain 150 point to the physical location of cellets in the Time Domain 160.
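The four pairs in this example (0000 to 0000, 0001 to 1000, 0010 to 0100, 0011 to 1100) are consistent with reversing the bit order of a 4-bit index. The Python sketch below assumes that this bit-reversal rule holds for the whole 16-cellet frame; that is an inference from the example and not a statement of the full SAIN transform algorithm. The function names are hypothetical.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def time_domain_positions(frame_bits):
    """Time Domain position for each Connection Domain index (assumed rule)."""
    return [bit_reverse(i, frame_bits) for i in range(1 << frame_bits)]


positions = time_domain_positions(4)
print(positions)

# A connection holding the first four Connection Domain cellets lands on
# evenly spaced Time Domain positions.
first_four = sorted(positions[:4])
print(first_four)
```

Under this assumed rule the mapping is its own inverse, which matches the statement that each domain points into the other, and contiguous Connection Domain cellets spread out evenly across the frame.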
(81) Taken together, the two Domains define a Time Domain memory map for a multiplexer system. Each position in the Time Domain 160 frame denotes a physical time (or space) cellet corresponding to a cellet in the Connection Domain 150 that denotes the connection to which each cellet belongs.
(82) A benefit of building frames using the algorithm is that the cellet positions are nearly uniformly spaced throughout a Time Division Frame thereby reducing switch latency for any given connection.
(83) Power-of-Two Frame and Segment Lengths
(84) A Time Domain 160 is divisible into segments by dividing the frame length by an integer. Where F is the frame length (i.e., the number of cellets) and n is an integer, each segment contains exactly F/n cellets if n is an integer divisor of F; otherwise, the frame is a combination of segments of INT(F/n) cellets and segments of INT(F/n)+1 cellets. Dividing a Time Division Frame into segments exploits the distributed cellet positioning throughout Time Domain 160 frames within SAIN switches.
(85) Obtaining Time Division Frames with Equally Spaced Cellets
(86) Time Division switches work on the following basis: a frame or a small subframe segment of data from one or more sources is stored before being reordered (or manipulated in other ways) for transmission on an outgoing link. Segmenting a frame is a method of reducing switch latency. There are many ways to divide a frame into segments, but dividing it by a power of two is an important one.
(87) The equally spaced property is not limited to PoT divisors. Any integer is usable as long as it is a submultiple divisor of the frame length. For example, a frame with 20 cellets divides into five segments with four cellets each. A single cellet placed in each segment defines a connection with bandwidth equal to five times the quantum data rate of the frame.
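The segment arithmetic described above can be sketched in Python as follows. The function name is hypothetical, and the order in which longer and shorter segments appear in the returned list is an arbitrary choice for illustration; in a SAIN frame the transform algorithm, not this sketch, determines where each segment falls.

```python
def segment_lengths(frame_length, n):
    """Split a frame of `frame_length` cellets into `n` segments.

    Every segment holds INT(F/n) cellets, except that F mod n segments
    hold one extra cellet when n does not divide F evenly.
    """
    base, extra = divmod(frame_length, n)
    return [base + 1] * extra + [base] * (n - extra)


even_split = segment_lengths(20, 5)    # n divides F
uneven_split = segment_lengths(6, 4)   # n does not divide F

print(even_split)
print(uneven_split)
```

With 20 cellets and five segments, one cellet per segment gives the connection five cellets per frame, i.e., five times the quantum data rate.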
(88) Non-Equally Spaced Power-of-Two Length Connections
(89)
(90) Dividing a Non-Power-of-Two Length Frame by a Power-of-Two
(91) Dividing a frame by a power of two produces a power-of-two segmented frame, i.e., a frame of PoT segments. PoT segmentation does not depend on the total frame length being a power of two; it is useful for frames of any length.
(92) In the
(93)
(94) Dealing with Non-Power-of-Two Connections
(95)
(96)
(97) Overview of Routing E-Node to E-Node Paths in a SAIN Network
(98) An embodiment of a SAIN network can use a plurality of E-Nodes in one network tier connected to a plurality of T-Nodes in a next higher network tier. Each E-Node can act as both a source and a destination node to every other E-Node in the network. Each source-E-Node to destination-E-Node connection is a path. Aggregations of paths within an E-Node embed each path from the E-Node to every other E-Node. The aggregations connect to a parent T-Node for processing and forwarding to each T-Node in the network. Each destination E-Node disaggregates the aggregations of paths whose sources are every other E-Node in the two-tier network.
(99) For purposes of explanation and embodiments, this application assumes that each E-Node connects to a single T-Node in the next higher tier. Expanding to multiple connections can take place in two ways. One is to enable an E-Node to attach to more than one parent T-Node. The other is to divide a parent T-Node into a plurality of sub-T-Nodes dispersed for survivability and for security reasons.
(100) In
(101) Network embodiments could include nodes other than E-Nodes 200 and T-Nodes 300 and could include directly connected E-Nodes 200.
(102) One embodiment of T-Node 300 interconnections is a mesh network as shown in
(103)
(104) In a SAIN network, a path is a simplex connection from one E-Node 200 to another E-Node 200. A duplex user connection comprises two paths, one in each direction of travel.
(105) The role of T-Nodes 300 in a SAIN network is to provide superpaths that are aggregations of E-Node 200-to-E-Node 200 paths. Their interconnections are also set up on a simplex basis. These superpaths can be controlled using duplex Control Vectors that contain messages in cellet form embedded within implicitly addressed frames. Other control methods for superpaths are possible.
(106) The role of a path is to aggregate user connections at a source E-Node 200 and to deliver the aggregation, not individual connections, to a destination E-Node 200. Each of an interconnected pair of E-Nodes 200 can act as both a source and a destination node as described below.
(107) A source E-Node 200 is the control node for each path. In other words, the source E-Node 200 uses a pre-determined route for a path that has enough bandwidth to support arriving user traffic end-to-end from the source to a destination E-Node 200. As user traffic intensity varies, an E-Node 200 allocates more or less bandwidth to the path and, concomitantly, can adjust available bandwidth to support multiple classes of traffic when network bandwidth becomes scarce. Embodiments below detail the apparatus and methods involved in accomplishing these tasks.
(108) The major requirement of networking is to be able to interconnect all nodes accepting user data to all nodes able to deliver user data. Making use of a three-tier hierarchy shown in
(109) In a Metro Network, an E-Node 200 aggregates all incoming user data into a plurality of paths. Each path is an aggregation of all data entering the E-Node 200 for delivery at another E-Node 200. The E-Node 200 aggregates its paths into a number of superpaths equal to the number of T-Nodes 300 in the network. It then aggregates these superpaths into a higher-level superpath that contains all user data deliverable to all other E-Nodes 200. The source E-Node 200 forwards this superpath to its parent T-Node 300. The parent T-Node 300 then routes each of the intermediate superpaths to the appropriate destination T-Node 300. In the model network the result is that 25 source E-Nodes 200 connect to 25 destination E-Nodes 200 attached to each of 20 T-Nodes 300 with one exception. [A source E-Node 200 need not connect to itself for data. It may set up a small amount of bandwidth in a test loop-back arrangement to verify the integrity of its two-way connection to its parent T-Node 300.]
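The counts implied by the model network can be worked through with a short Python sketch. The variable names are illustrative only, and the grouping of paths into one superpath per destination T-Node is an assumption drawn from the description above; the loop-back test connection is ignored.

```python
T_NODES = 20               # T-Nodes in the model network
E_NODES_PER_T_NODE = 25    # E-Nodes attached to each T-Node

total_e_nodes = T_NODES * E_NODES_PER_T_NODE

# Each source E-Node has one simplex path to every other E-Node.
paths_per_source = total_e_nodes - 1

# Assumed grouping: one superpath per destination T-Node. The superpath
# toward the source's own parent T-Node omits the source itself.
paths_by_superpath = [E_NODES_PER_T_NODE - 1] + [E_NODES_PER_T_NODE] * (T_NODES - 1)

total_simplex_paths = total_e_nodes * paths_per_source

print(total_e_nodes, paths_per_source, len(paths_by_superpath), total_simplex_paths)
```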
(110) In a configuration of multiply connected T-Nodes 300, a plurality of possible routes exist each of which can delineate a superpath aggregation of paths. A table of such routes can contain important parameters that enable the network to select, dynamically, routes that optimize network performance. For example, each pre-determined route is loop-free with known end-to-end latency. The table can also include the bandwidth available for each route, updated periodically by the system.
(111) Reference Numeral Methodology
(112) In what follows, generic forwarding elements (i.e., those that send or receive data) are assigned drawing reference numerals that either: 1) end in a 0, or 2) are a single- or a two-digit numeral. A subtype of each element keeps its first one or two digits and adds a 1 if the subject matter involves source-end functionality. A subtype adds a 2 for destination-end functionality. For example, an E-Node 200 denotes a generic E-Node that sends and receives data. A Source E-Node 201 denotes the sending end functionality of the E-Node 200 and a Destination E-Node 202 denotes its receiving end functionality. The reason for this is to differentiate between sending and receiving functions of the network thereby simplifying the following disclosures.
(113) The following disclosure first describes embodiments of individual subsystems of a network, followed by disclosure of embodiments of the system as a whole.
(114) Embodiment of a SAIN Switch Stack Selector
(115)
(116) One embodiment of SAIN switches includes a Switch Stack Selector 120 shown in
(117) The Switch Stack Selector 120 implements the SAIN transform described herein above starting with The Basics of the SAIN Transform Algorithm at paragraph 0. The referenced patents describe in detail the methods applicable to the apparatus shown in
$$N = 2^{n}, \quad \text{where} \quad n = 1 + \operatorname{INT}\left(\log_{2}(F - 1)\right) \qquad (1)$$
(118) The Frame Clock Generator 121 emits F Frame Clock 130 pulses during a frame, including a Frame Reset 123 pulse that sets the System Clock 124 and Frame Clock 130 to zero. Each Frame Clock 130 pulse causes the counter to increment by 1. The Cellet Counter 133 counting environment includes N virtual frame states. The environment can include empty cellets in a frame as described at paragraph 0ff.
(119) Shown below the Cellet Counter 133 in
(120) The number of cellets assigned to a connection m equals the number stored in the Connection Bandwidth Register 142 at connection m+1 minus the number stored in the Connection Bandwidth Register 142 at connection m. If the two numbers are equal, connection m has no cellets in the Connection Domain 150. In other words, it represents a virtual connection. [A virtual connection is a connection with zero allocated bandwidth. The virtual connection is a physical connection placeholder that can become data bearing.] This is a unique and important property within a SAIN network. A connection can exist in a virtual state even when a call, path, or superpath has no bandwidth assigned. This is an important benefit of using implicit addressing within a SAIN network.
(121) Note that the least significant bit of the Cellet Counter 133 appears on the left in
(122) When the number in the Cellet Counter 133 is both 1. greater than or equal to the number in Connection Bandwidth Register 142 at connection m, and 2. less than the number in the Connection Bandwidth Register 142 at connection m+1,
the system places its attached Selector Line 138 in
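The register arithmetic and the two-part comparison described above can be sketched in Python. This is a minimal model under stated assumptions: the Connection Bandwidth Register 142 is represented as a list of non-decreasing starting addresses with the frame size as a final sentinel entry, and the function names are hypothetical. It does not model the hardware comparators or the Selector Line 138 itself.

```python
from bisect import bisect_right


def cellets_per_connection(cbr):
    """Cellets for connection m = cbr[m + 1] - cbr[m]; zero means virtual."""
    return [cbr[m + 1] - cbr[m] for m in range(len(cbr) - 1)]


def select_connection(cbr, counter):
    """Return m such that cbr[m] <= counter < cbr[m + 1], or None if no match."""
    m = bisect_right(cbr, counter) - 1
    if m < 0 or m >= len(cbr) - 1:
        return None
    return m


# Hypothetical register contents for a 10-cellet frame with four connections.
cbr = [0, 3, 3, 7, 10]

widths = cellets_per_connection(cbr)
selected = [select_connection(cbr, c) for c in range(10)]

print(widths)     # connection 1 has zero cellets, i.e., it is virtual
print(selected)   # connection 1 is never selected
```

Because connection 1 has equal start and end values, no counter value satisfies both conditions for it, so it keeps its place in the register without consuming bandwidth.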
(123) The CC/CBR Empty Connection 134 determines if a 1 least significant bit of the Cellet Counter 133 forecasts a virtual cellet as its next state. The real cellets exist at frame positions numbered (0, 1, 2 . . . F-1). There is the same number of Connection Domain positions. The difference is that the virtual frame cellets in the Connection Domain do not exist at positions (F, F+1 . . . N-1). Cellet Counter 133 values refer to non-existent virtual connections only if the corresponding Connection Domain 150 values are greater than N/2-1, where N is the virtual frame length. If this were not so, the PoT value of the virtual frame length would be lower. As shown in
(124) In addition, an embodiment of a switch based on power-of-two-length segments (i.e., PoT segments) can use a property of the SAIN transform algorithm to designate all PoT segment boundaries. A frame of any length F less than a power of two can contain a maximum number of PoT segments equal to the largest power of two less than F. In other words, the maximum number of PoT segments possible is N/2, the virtual frame length of the frame divided by two. When applying the transform algorithm to a Connection Domain of a frame, the PoT segment boundaries start at the N/2 cellets in the Time Domain that correspond to the first N/2 cellets in the Connection Domain. If the frame length itself is a power of two, the real frame length and the virtual frame length are the same, i.e., N=F. In this case, N replaces N/2 above.
(125) The set of one cellet per PoT segment defines the maximum base data rate, which is the maximum PoT data rate supported by the frame. Any submultiple of the maximum base data rate is usable to advantage in a SAIN network. Any integer multiple of the base data rate (including the maximum base data rate itself) is also possible. For example, if the frame length is 6 cellets, the virtual frame length, N, is 8. N/2 is 4 and there are two zero-length cellet positions in the frame. In the Time Domain, one zero-length position occurs in each of two PoT segments. The maximum base data rate is two cellets per frame period and the maximum data rate of a connection is 6 cellets per frame period, i.e., 3 cellets per PoT segment.
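The 6-cellet example can be reproduced with a short Python sketch. It computes the virtual frame length N from equation (1) and then lays out the Time Domain frame, marking positions whose Connection Domain address is F or greater as non-existent. The bit-reversal mapping and the function names are assumptions made for illustration.

```python
def virtual_frame_length(f):
    """N = 2**n with n = 1 + INT(log2(F - 1)); intended for F >= 2."""
    return 1 << (f - 1).bit_length()


def time_domain_map(f):
    """Connection Domain address at each Time Domain position, or None.

    Assumes the transform reverses the bit order of the position.
    """
    n = virtual_frame_length(f)
    bits = n.bit_length() - 1
    layout = []
    for t in range(n):
        c = int(format(t, f"0{bits}b")[::-1], 2)
        layout.append(c if c < f else None)   # None marks a zero-length position
    return layout


layout = time_domain_map(6)
first_half, second_half = layout[:4], layout[4:]

print(virtual_frame_length(6))
print(first_half, second_half)
```

Under these assumptions each half of the virtual frame contains three real cellets and one zero-length position, matching the figures quoted in the paragraph above.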
(126) CC/CBR Empty Connection 134 contains the Connection Domain 150 number corresponding to the first non-existent cellet in the Connection Domain 150 frame. The CC/CBR Empty Connection 134 enables determining whether incrementing the current Cellet Counter 133 value by 1 will result in a Connection Domain 150 non-existent cellet position. This occurs by inverting the least significant bit in the Cellet Counter 133 connected to the most significant bit of the CC/CBR Empty Connection 134.
(127) This is equivalent to incrementing the Cellet Counter 133 by 1 when its current value ends in a 0, a value that represents a Connection Domain address in the first half of Connection Domain 150 virtual frame. The value stored in the frame is F if the frame is less than the virtual frame length. The CC/CBR Empty Connection 134 is empty if F=N, i.e., if the actual frame length is a power-of-two in length.
(128) Embodiment of a Frame Clock Generator
(129) A SAIN network places a high-speed system clock at each network node. The plurality of node clocks can synchronize directly or indirectly with a common clock source. Clocks in the E-Nodes 200 can synchronize to their parent T-Nodes 300 and each T-Node 300 can connect directly or indirectly to a common clock source using standard clocking technology such as IEEE Standard 1588 or other methods including U.S. Pat. No. 2,986,723.
(130)
(131) The Clock Generator 121 uses three input signals. One is a Frame Reset 123 signal generated by the system to denote the start of a frame. Another is the high-speed System Clock 124 signal. A third signal is a Frame Size Increment (FSI) 122 that enables deriving a Frame Clock 130 signal from the high speed System Clock 124 such that:
$$\mathrm{FSI} = \frac{f_{s}\, p_{F}}{F} \qquad (2)$$
where $f_{s}$ = the high-speed System Clock Rate in megahertz and $p_{F}$ = the Frame Period in microseconds to produce both Frame Clock 130 and Quadrature Clock 131 pulses.
(132) A network controller with a microprocessor stores the entities shown in
(133) Operation (2) 602 begins when a Frame Reset 123 occurs in the network. This signal keeps a set of clocks, and thereby synchronizes frame start times within a SAIN network node. The operation sets the System Clock Counter 129 and the Flipflop within the Comparator 127 to 0. The Flipflop distinguishes whether the value in the Adder Register 126 denotes a Frame Clock 130 or a Quadrature Clock 131 pulse. In the operations as described, a Flipflop value of 0 denotes a Quadrature Clock value that causes rounding of comparison values in the sequel as explained next.
(134) Operation (3) 603 begins incrementing the System Clock Counter 129 by one from the System Clock 124. Operation (4) 604 detects an overflow state of the System Clock Counter 129. The purpose of detecting the overflow is to ensure that the system has remained in synchronization with Frame Reset 123.
(135) Operation (5) 605 determines if the System Clock Counter 129 is greater than the value in the Adder Register 126. If it is not, it reverts to Operation (3) 603. If it is true, the system goes on to Operation (6) 606 where two things occur. [Note: Since a value of 0.5 exists first in the Adder Register 126, the first System Clock 129 pulse counted is larger than the 0.5 stored at frame reset time.] The first is to change the state of the Flipflop attached to the Comparator 127 from 0 to 1, or 1 to 0. At the beginning of a frame, the Flipflop is set to a 0 state. This results in its status changing from 0 to 1. Operation 606 also causes the Adder Register 126 to be incremented by the FSI/2 value stored in the Increment Register 125. The next time System Clock Counter 129 is greater than the Adder Register 126, the Flipflop state is set to 0.
(136) Operation (7) 607 determines the Flipflop's state and sends a pulse on either Frame Clock 130 (Operation (8) 608) or Quadrature Clock 131 (Operation (9) 609). In Operation (10) 610, the Increment Register Trigger 125a then increments the Adder Register 126 by FSI/2 and triggers Operation (3) 603.
(137) For a numeric illustrative example of the embodiment, set $F = 3{,}856$ cellets, $p_{F} = 0.125$ msec, and $f_{s} = 10^{6}$ kHz (1 GHz). The fractional part of the FSI needs to have only enough binary places to assure that the frame count equals F cellets exactly. This number can be calculated by the following formulas:
$$\mathrm{FSI} = \frac{\operatorname{INT}\left(\dfrac{\mathrm{TotN}}{F} \cdot 2^{\mathrm{Exp}}\right)}{2^{\mathrm{Exp}}}$$
where
$$\mathrm{TotN} = p_{F}\, f_{s}$$
is the total number of high-speed clock pulses in a frame, and
$$\mathrm{Exp} = \operatorname{INT}\left(\log_{2}(F - 1)\right) + 2 \qquad (3)$$
Using these formulas,
$$\mathrm{FSI} = \frac{\operatorname{INT}\left(\dfrac{10^{6} \times 0.125}{3{,}856} \cdot 2^{13}\right)}{2^{13}} = 32.4169921875.$$
Note that using the FSI/2 as the Frame Size Increment 122 shown in
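The numeric example for formulas (3) can be checked with a short Python sketch. It uses exact rational arithmetic so that the truncation to Exp binary places is visible; the function name is hypothetical, and TotN is passed in directly as the number of System Clock pulses per frame (0.125 msec at 10^6 kHz gives 125,000).

```python
from fractions import Fraction
from math import floor


def frame_size_increment(f, tot_n):
    """Return (FSI, Exp) for a frame of `f` cellets and `tot_n` clock pulses.

    Exp = INT(log2(F - 1)) + 2, computed with bit_length to avoid
    floating-point rounding; FSI is truncated to Exp binary places.
    """
    exp = ((f - 1).bit_length() - 1) + 2
    scale = 2 ** exp
    fsi = Fraction((tot_n * scale) // f, scale)
    return fsi, exp


F = 3856
TOT_N = 125_000   # p_F * f_s = 0.125 msec * 10**6 kHz

fsi, exp = frame_size_increment(F, TOT_N)

print(exp, float(fsi))
# Number of whole FSI steps that fit in one frame of TOT_N clock pulses.
print(floor(TOT_N / fsi))
```

With these values, exactly F whole increments of FSI fit within one frame period, which is the condition the text describes as the frame count equaling F cellets.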
(138) An alternative embodiment replaces the two-state Flipflop with a one-shot Flipflop (not shown) where the backside of the output pulse from the Flipflop produces the Quadrature Clock 131. In this case, Operation 601 is not performed (i.e., the FSI is not divided by 2) as the Increment Register 125 value and the Exp value is not increased by one.
(139) Embodiment of a Connection Comparator/Connection Bandwidth Register (CC/CBR) Stage
(140)
(141) Upon a switching operation of a Switch Node Controller 560, the two CBR Stacks exchange roles. If CBR Stack A 553a is operational, the Switch Node Controller 560 activates the A labeled elements as shown by the dark line in the figure.
(142) If the switch is a Generic Aggregation Switch 501, the Sources or Sink Gates 550 pass traffic cellets from sources at the source end of a connection to fill cellet positions in an outbound multiplex stream. If the switch is a Generic Disaggregation Switch 502, the gates pass data cellets from cellet positions in an inbound multiplex stream to a destination sink.
(143) An embodiment of a Comparator/Connection Bandwidth Register stage is the focus of
(144) As shown in
(145) As shown in
Embodiments for Changing Bandwidth of Paths and Superpaths
(146) Changing the bandwidth allocated to a frame in a network occurs in conjunction with changing the bandwidths of individual connections within a frame. Changing frame bandwidth in a SAIN Switch Stack Selector 120 is a two-step process that changes the number of cellets within the frame. The first step involves compaction of the plurality of connections within the frame into a contiguous range of Connection Domain 150 cellets starting with address 0. The second step involves adding or taking away cellets from a Connection Domain 150/Time Domain 160 frame. The order in which the steps occur depends on whether the number of cellets per frame is increasing or decreasing. When increasing the Frame Size, increasing its size precedes increasing bandwidths of connections within the frame. When decreasing the Frame Size, reducing connection bandwidths within the frame to a level that will fit within a smaller size frame precedes reducing the Frame Size.
(147)
(148) If the Aggregation Switch Node Controller 560 discovers that the amount of bandwidth required within a given frame must be increased, the first step is to compact the current connections to a contiguous Connection Domain 150 range. Using methods of the embodiments of this application automatically causes such compacting of connections. The next step is to add a contiguous range to the frame to support the additional bandwidth required. The Aggregation Switch Node Controller 560 does this by adding a CC/CBR Spare Connection 135 to the frame.
(149) The CC/CBR Spare Connection 135 has no link to a data source or sink. It does not require calculating the number of cellets required in the spare bandwidth since the CC/CBR Empty Connection 134 value set in the Switch Stack Selector 120 automatically sets the number of cellets. Although the number of empty channel cellets does not require calculation or storage, the sum of all cellet ranges in the frame, including spare bandwidth, must equal the Frame Size F.
(150) A key part of implementing a SAIN network is connection bandwidth management. An important goal of SAIN networking is assured delivery of traffic accepted by the network. In other words, the goal is to change the legacy packet-network paradigm from "accept all traffic as it arrives, and discard that which cannot be delivered" to "accept traffic only if delivery is assured, and discard traffic only under disaster or certain programmed circumstances."
(151) In legacy networks, the goal is achievable only with substantial overprovisioning. As a result, adding new network capacity necessary to keep up with demand is very expensive. In addition, discarding packets just adds to traffic intensity by requiring retransmission of the forwarding failures of the network.
(152) Fortunately, the SAIN structure enables implementation of a simple subsystem of reporting availability of bandwidth by Quality of Service Class throughout the network before accepting data into the network. The worst that can happen is for the network to inform a user all connections are busy for less important traffic classes. Data awaiting forwarding is storable for later transmission without requiring readmission by the user. The result achieves superior performance without the large amount of overprovisioning. In addition, the SAIN structure enables dynamic re-routing of traffic before it enters the forwarding part of the network thus optimizing the use of installed bandwidth.
(153) Embodiments of Methods that Increase Path and Superpath Bandwidth
(154) Allocating bandwidth within a SAIN network is a very dynamic process. It is most dynamic at the path aggregation level since this level is closest to the burstiness of user traffic. Higher-aggregation-level traffic changes as traffic loads shift, but these shifts are less dynamic. An individual traffic burst at the path level represents only a small proportion of total traffic at one of the higher aggregation levels.
(155) When a new high-bandwidth streaming connection shows up at a User Source Data Port 291 of a Source E-Node 201, expansion of available bandwidth must occur quickly. The flowchart in
(156) An Outline of SAIN Aggregation/Disaggregation Node Pair Switch Types
(157) In a SAIN network, all switches exist in pairs of one aggregation switch and one (or more) disaggregation switch(es). The result is that all control of connections and their assigned bandwidths require communication only between each node pair.
(158) There are four types of aggregation switches and matching disaggregation switches in the SAIN network disclosed in this application. A part of all switches is a Switch Stack Selector 120. The switching subsystem of a SAIN network contains a plurality of entity types independent of their network application. These are 1) an aggregation switch, 2) a disaggregation switch, and 3) duplex Control Vectors between the two switches as an operational pair. Control Vectors are private message-bearing two-way conversations between an aggregation switch and its paired disaggregation switch.
(159) The generic and three subtypes of aggregation switches are: 1. a Generic Aggregation Switch 501; 2. a Path Aggregation Switch 511 (i.e., a Level 1 Aggregation Switch); 3. a Level 2 (L2) Aggregation Switch 521; and 4. a Level 3 (L3) Aggregation Switch 531.
(160) The generic and three subtypes of disaggregation switches are: 1. a Generic Disaggregation Switch 502; 2. a Path Disaggregation Switch 512 (i.e., a Level 1 Disaggregation Switch); 3. a Level 2 (L2) Disaggregation Switch 522; and 4. a Level 3 (L3) Disaggregation Switch 532.
(161) In addition to these designations, for brevity, each aggregation/disaggregation node pair is named as follows: 1. a Generic Aggregation Switch 501/Generic Disaggregation Switch 502 pair becomes a Generic A/D Switch Pair 503; 2. a Path Aggregation Switch 511/Path Disaggregation Switch 512 pair becomes a Path A/D Pair 513; 3. a Level 2 Aggregation Switch 521/Level 2 Disaggregation Switch 522 pair becomes an L2A/D Pair 523; and 4. a Level 3 Aggregation Switch 531/Level 3 Disaggregation Switch 532 pair becomes an L3A/D Pair 533.
(162) In addition to the switch types listed above, one additional structure exists to accomplish a key SAIN network objective. This is the Crossconnect Switch 540, which can be used to interconnect switches of the same level. In some embodiments, the Crossconnect Switch 540 is used to interconnect Level 2 switches at a source T-node by aggregating traffic from child E-nodes according to destination E-nodes. In other embodiments, a Crossconnect Switch 540 could be used at a destination T-node rather than the source T-node.
(163) Embodiment of a Generic Aggregation/Disaggregation Switch Pair
(164) Unlike the telephone network, a modern communication network must cope with rapidly changing traffic intensity throughout the network. The Public Switched Telephone Network (PSTN) handles just one type of traffic efficiently: voice. A voice call, once established, remains connected for a substantial period, usually of the order of minutes. Modern networks do not work that way. Voice traffic is a critical part of today's traffic in terms of Quality of Service, but it is only a small part in terms of traffic intensity. Total traffic intensity varies over a wide range in relatively short periods. In addition, silence detection, where data is passed only when someone is talking, is a part of today's packet-based voice networks. Unfortunately, the packet overhead required is nearly large enough to make silence detection less useful than it can be in a circuit-based network. In a SAIN network, the concept of silence detection can be implemented by a virtual connection. The virtual connection can maintain a connection to a destination node and activate transmission with a few bits using a Control Vector for control instead of full packet headers.
(165) Voice traffic has become a very small part of overall traffic in communication networks. Even so, there are corollaries in transmitting action-oriented video where it is important to change available bandwidth to meet ever-changing demand.
(166) Aggregating bursty traffic using packet buffers internal to a network can smooth traffic gyrations, but only to a degree. Placing packet buffers only at network edges and using 1) virtual connections and 2) dynamic bandwidth management that changes bandwidth assigned to traffic aggregations quickly can provide significant network improvements. Doing this without resorting to a large amount of overprovisioning is one of the major advantages of the SAIN paradigm. This section of the application shows the basic methods and apparatus for doing so.
(167)
(168) As shown in
(169) There are a number of methods to provide stable clocks in each SAIN switch node. The goal is to assure their mutual synchronization as a self-contained network. The larger approach is to synchronize the nodes to a common global clocking source such as Coordinated Universal Time (UTC) using existing methods. IEEE Standard 1588 has demonstrated ability to achieve synchronization to within a few nanoseconds.
(170) In addition to synchronizing node clocks, it is necessary to have knowledge of where a frame starts for all links leaving and entering a node. A simple method can use a synchronized clock in each T Node as a reference clock for all E Nodes attached thereto. For a source E Node, each frame generated can arrive at its parent T Node slightly ahead of the start time of an outgoing T Node frame.
(171) For aggregated data originating from a plurality of T Nodes, there are two general methods to provide synchronization to attached E Nodes. One is to buffer incoming cellet traffic so that frames from all distant T Nodes are time-aligned to overcome differences in link propagation delay. This method has the shortcoming of adding delay to nearby T Nodes.
(172) A second method makes use of the timing method outlined above where all E-Node frames are time-aligned with their parent T-Node frames. A simple method to achieve this result without injecting detrimental latency into the network is to measure the E-Node to T-Node round-trip delay and to assume that the round-trip delay is twice the one-way delay. Each Source E-Node 201 sends its frames far enough in advance to assure that the parent Source T-Node 301 receives them in a time-aligned fashion.
(173) Using the model network as an example, there are 19 distant T-Nodes sending data to each other T-Node. Each T-Node aggregates data received from all of its E-Node sources into Level 2 frames. At a destination Level 2 to Path Level interface, there are 20 frame start times, one from each T-Node. Aligning the frames for each Source L2 Aggregation 721 at Source T-Nodes 301 does not assure that all Destination L2 Disaggregation 722 are time-aligned. There is no assurance that the distance from one T-Node to another is the same.
(174) The start times of a frame have importance only within E-Node pairs. As disclosed later in the application, there is no need for keeping the frame start time intact along a route of transit nodes. The QDR and cellet size needed to handle potential Source E-Node 201 traffic determines the frame period required for a Path A/D Pair 513. This requirement does not exist in the transit links between a Source E-Node 201 and a Destination E-Node 202. Frame periods measured in microseconds often cover the need at the Path A/D Pair 513 level. This requirement does not exist inside a SAIN network beyond the first path level of aggregation. Dividing transit frames into very small segments can result in nanosecond or smaller periods, resulting in very small delays with no jitter or meaningful delay variation.
(175) The Frame Size Increment 122 is a system parameter that can change frequently. It is a key part of the methods of this application to achieve the adaptive objectives of SAIN. The frequency with which the parameter changes is inversely proportional to the aggregation level of a link. In other words, the Frame Size Increment 122 changes most frequently at the path aggregation level and least frequently at Level 3. The frame size at a given level, measured in total amount of data, must increase if the switch involved requires more bandwidth to handle its traffic load. A mixture of high clock rates and relatively large cellets supports the increase.
(176) Another aspect of SAIN networks is the requirement that the content of Connection Bandwidth Register 142 (see
(177) Bandwidths assigned to connections within each route are a set of positions within Connection Bandwidth Register 142 of a Generic A/D Switch Pair 503. Each position denotes bandwidth of a connection by storing the number of cellets per Connection Domain frame. These positions can remain in place for extended periods for flow-based traffic with nearly constant average bandwidth. Such traffic includes, but is not limited to, voice, streaming media, certain classes of video, and embedded clips within web sites.
(178) Within a Path A/D Pair 513 between two E-Nodes, the Connection Bandwidth Register 142 positions can be in one of three states. They are a real state (i.e., operational state), a sleep state, or a virtual state. A real state carries customer traffic along with necessary management and control plane traffic. A sleep state is a state that can turn into a real state quickly. It would include, for example, sending enough control traffic to and from the sleep-state switch terminations to assure rapid real state restoral. A virtual state of a route has positions within Connection Bandwidth Register 142, but with zero assigned bandwidth. In a sleep state where temporarily no substantial traffic exists, there can be enough control bandwidth to pass information assuring data connection viability.
(179) For proper operation in a real state, each switch pair must have sufficient assigned bandwidth to embed connections presented to it. The aggregation of these connections becomes a connection to a switch pair at a higher aggregation level. When a new connection is set up within a Generic A/D Switch Pair 503 at any level, steps apply as shown in
(180)
(181)
(182) Step (3) 633 traffic is less restrictive. It may still require an average bandwidth over a given period, but there is neither interactivity nor sub-second UTC certainty. In other words, it has elastic properties. For example, it is recorded material that is to be played in real time, but can endure a slightly delayed start time to fill a cache to the point that service interruption will not occur because of intermittent bandwidth starvation. Most broadcast traffic and multimedia traffic is in this category.
(183) The following Table 1 contains estimates of possible latency requirements for each class of service. These types of Quality of Service objectives and more can be included in Service Level Agreements (SLAs) between service providers and their customers. The table is only an example to show that mere priority among service types is not a very good way to denote service classes. Priority alone cannot represent what a user can specify and observe.
(184) TABLE 1
Service | MEF Service Class | Satisfactory Latency Excluding Propagation | Comments
Constant Bit Rate (CBR) | A | 1.0 sec-50 msec | When committed bandwidth is not in use, it is re-assignable as long as re-establishment of its committed value occurs within the time allotted.
Web Site Search | B | 250 msec Round Trip (RT) | Service appears after a mouse click on a link or depressing an Enter key after URL submission.
Audio/Video Broadcast or Streaming | B | 500 msec RT | Clicking or using a TV remote on an On-Demand or Broadcast Connection.
On-Demand Traffic | B | 1 msec to 2 hours | When available from source
E-Mail | C | 1.0-60 sec | Clicking on Send/Receive
Messaging | C | 2 sec RT | Pressing an Enter Key
VOIP | A | 200 msec RT | Stop Talking/Start listening
Control Vectors | A | 1.0 sec-5 msec | Latency determines bandwidth efficiency.
Control Packets | A | 1-10 msec | Latency determines bandwidth efficiency.
Diagrams Showing SAIN Node Physical Connectivity
(185) Referring to
(186) The Source T-Node 301 to Destination T-Node 302 link may include a number of Transit Nodes as shown in
(187)
(188) Diagrams Showing SAIN Node Logical Connectivity
(189)
(190) The first logical links can aggregate a plurality of user connections into a Source Path Aggregation 711 as shown in
(191) As shown in the example in
(192) The Source L3 Aggregation 731 shown in
(193) Each Source T-Node 301 forwards each Source L2 Aggregation 721 received from its child Source E-Nodes 201. Each Source T-Node 301 modifies the Source L2 Aggregations 721 received from its child Source E-Nodes 201 to become Destination L2 Disaggregations 722. The modifications change the contents of the aggregations from one Source E-Node to many Destination E-Nodes into one Destination E-Node from many Source E-Nodes. This modification can be performed by crossconnect switches as disclosed below. Each Source T-Node 301 sends modified Destination L2 Aggregations 721 to be treated as Destination L2 Disaggregations 722 by each of the Destination T-Nodes 302. These are the multiplexed aggregations sent over the Source-Destination TT-Link 341/Destination-Source TT-Link 342 pairs shown in
(194) Embodiment of Path Aggregation and Disaggregation (Level 1) Switch Pairs
(195) Using the model network with a total of 500 E-Nodes 200 and 20 T-Nodes 300 as an example,
(196) In
Embodiment to Set Up and Maintain User Connections
(197) At the time of its formation, a SAIN network contains a plurality of E-Nodes 200. Each E-Node 200 can be capable of performing Source E-Node 201 and Destination E-Node 202 functionality. Each Source E-Node 201 within an E-Node 200 can connect to every Destination E-Node 202 in the network, except the Destination E-Node 202 within the same E-Node 200. Likewise, each Destination E-Node 202 within an E-Node 200 is connectable from every Source E-Node 201 in the network, except the Source E-Node 201 within the same E-Node 200. The T-Node 301 configurations occur in accordance with the methods disclosed below.
(198) When operational, as shown in
(199) An Ingress Parameterized User Interface (PUI) 210 can produce the following inputs to an Ingress E-Node Controller 221:
1. Destination E-Node address(es),
2. Traffic type, such as
   a. Unicast, or
   b. Multicast, or
   c. Broadcast, and/or
   d. Ethernet (MAC Address), or
   e. Other defined address type
3. Port Number(s) of an E-Node or Ethernet Bridge
4. Latency Class or Assigned Class of Service
(200) The Ingress PUI 211a first searches its Address Cache 216 and Connection ID (CID) Cache 218 to determine whether the incoming packet matches one seen in the recent past, comparing the items listed above. If it does, the PUI sends the packet to the Source Assigned FIFO Buffer 243 selected previously, and sends an alert signal announcing that fact to the Ingress E-Node Controller 221. The Ingress E-Node Controller 221 then determines whether the bandwidth assigned meets the class objectives of item 4 in the above list. If the amount of bandwidth available allows the system to meet the item 4 objective, the Ingress E-Node Controller 221 takes no action. If there is more bandwidth assigned than necessary, i.e., if it is more than enough to empty the buffer, the Ingress E-Node Controller 221 may reduce the bandwidth for the connection. Reducing the bandwidth can be performed by reducing the connection's number of cellet slots stored in the location in Switch Stack Selector 120.
(201) For a new connection that does not exist in the Address Cache 216, the network uses conventional Ethernet, Domain Name System (DNS), and/or router methods to find a Destination E-Node 202 connection address. A table of MAC addresses and associated E-Node and port addresses within the system enables the methods used. Other methods are possible or may evolve, including large databases matching Internet URLs or other standards to E-Node addresses. The Ingress PUI 211a sends its connection information with Destination E-Node 202 address(es) to the Ingress E-Node Controller 221, which designates a Source Assigned FIFO Buffer 243 from the Source FIFO Buffer Pool 241 for the incoming connection. It concomitantly assigns the Source Assigned FIFO Buffer 243 to a location in the CBR Stack 553 within a selected Path Aggregation Switch 511. The Source Assigned FIFO Buffer 243 is the data source for CBR Stack 553 in the Switch 511. The FIFO Bus 240 transfers the cellets from the Source Assigned FIFO Buffer 243 to the Switch 511, which aggregates the cellets into a Source Path Aggregation 711 multiplexed data stream.
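The cache-hit and new-connection paths described in the two preceding paragraphs can be sketched as follows. This is only an illustration under stated assumptions, not the disclosed implementation: the names `Connection`, `IngressController`, and `handle_packet`, the use of a single dictionary standing in for the Address Cache and CID Cache, the integer buffer identifiers, and the caller-supplied `needed_slots` figure are all hypothetical. The case in which assigned bandwidth is insufficient is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    buffer_id: int     # hypothetical index of the Source Assigned FIFO Buffer
    cellet_slots: int  # cellet slots currently assigned to the connection


@dataclass
class IngressController:
    # Hypothetical pool of free FIFO buffer indices.
    free_buffers: list = field(default_factory=lambda: list(range(4)))
    # One dict stands in for the Address Cache and CID Cache (assumption).
    cache: dict = field(default_factory=dict)

    def handle_packet(self, key, needed_slots):
        """key: (destination, traffic type, port, class), per the input list."""
        conn = self.cache.get(key)
        if conn is None:
            # New connection: designate a buffer from the pool.
            conn = Connection(self.free_buffers.pop(0), needed_slots)
            self.cache[key] = conn
            return "new", conn
        if conn.cellet_slots > needed_slots:
            # More bandwidth than necessary: reduce the cellet slot count.
            conn.cellet_slots = needed_slots
            return "reduced", conn
        # Assigned bandwidth meets the class objective: take no action.
        return "unchanged", conn


ctl = IngressController()
key = ("E0704", "unicast", 80, "A")
first = ctl.handle_packet(key, 8)
second = ctl.handle_packet(key, 8)
third = ctl.handle_packet(key, 4)
```

In this sketch the first packet creates the connection, the second leaves it untouched, and the third shrinks it to four cellet slots while keeping the same buffer.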
Aggregating User Path Connections
(202) The Path Aggregation Switches 511 within an E-Node 200 aggregate user data connections into Source Path Aggregations 711. Each path originates in a Source E-Node 201 Path Aggregation Switch 511 and terminates in a Destination E-Node 202 Path Disaggregation Switch 512.
(203)
(204) A Parameterized User Interface (PUI) 210 performs two functions: 1) acting as an Ingress PUI 211 and 2) acting as an Egress PUI 212. Associated with both functions are an Address Cache 216 and a Connection Identifier Cache 218. The Address Cache 216 stores the current address information about both source and destination PUI 210 connections. When new address information appears within an Ingress PUI 211, it is stored within the Source E-Node 201 housing the Ingress PUI 211. The information is then available beyond the PUI 210 involved in the connections. Sharing the information with Destination E-Nodes 202 is often appropriate. It is also appropriate to store the information within a database available to the entire Metro Network and beyond.
(205) The Connection Identifier Cache 218 can store packet header information that appears in successive packets without modification. Associated with the information is a Connection IDentifier (CID), a small number of bits that represents the information. When a packet enters an Ingress PUI 211 that requires a new CID, the Ingress E-Node Controller 221 or other processor in the system can provide a CID. The Ingress E-Node Controller 221 sends the new CID with relevant information to one or more E-Node Controllers as necessary. The operation is similar to that of the IETF RObust Header Compression (ROHC) RFCs, which are available as standards for its detailed design.
(206) CIDs can also become part of a network-wide database where appropriate. Such caches reduce sending header information with data packets. Control Vectors contain implicitly addressed message segments which can replace traditional control packets as used in other networks. This approach provides deterministic control and message latency, and saves bandwidth between source and destination E-Nodes 200 and T-Nodes 300.
(207)
(208) For purposes of this section, an addressing notation is introduced as described below for E-Nodes 200 and T-Nodes 300. This addressing information is used for reference. For this addressing notation, assume that each T-Node 301 has a 6-bit address assigned (T00, T01 . . . T63). Further, assume that each E-Node 200 has a 6-bit number assigned within its parent T-Node 301 domain. Assuming that each T-Node 301 can support up to 64 E-Nodes 200, a unique 12-bit E-Node 200 address is 64 × (T-Node address) + (E-Node address within the T-Node domain). A network that starts small can scale to contain 64 T-Nodes 300, each of which can scale to 64 E-Nodes 200, for 4,096 E-Nodes 200. With this approach, the parent T-Node address for an E-Node is
$$\text{T-Node address} = \operatorname{INT}\!\left(\frac{\text{E-Node address}}{64}\right) \qquad (4)$$
The 6-bit E-Node address within a T-Node domain is
$$\text{E-Node address within T-Node domain} = (\text{E-Node address}) \bmod 64 \qquad (5)$$
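The addressing arithmetic of equations (4) and (5) can be checked with a short sketch. The function names below are hypothetical and chosen only for illustration; the constant 64 follows from the 6-bit fields assumed in the text.

```python
DOMAIN_SIZE = 64  # 6-bit E-Node number within each T-Node domain


def e_node_address(t_node, local_e_node):
    """Compose the 12-bit E-Node address from its two 6-bit parts."""
    if not (0 <= t_node < DOMAIN_SIZE and 0 <= local_e_node < DOMAIN_SIZE):
        raise ValueError("both fields must fit in 6 bits")
    return DOMAIN_SIZE * t_node + local_e_node


def parent_t_node(address):
    """Equation (4): INT(E-Node address / 64)."""
    return address // DOMAIN_SIZE


def local_e_node(address):
    """Equation (5): (E-Node address) MOD 64."""
    return address % DOMAIN_SIZE


addr = e_node_address(11, 24)                      # 64 * 11 + 24 = 728
parts = (parent_t_node(addr), local_e_node(addr))  # (11, 24)
```

Composing and then decomposing an address returns the original pair, and the largest address, 64 × 63 + 63 = 4,095, matches the 4,096-node limit stated above.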
(209)
(210)
(211) The connecting Destination T-Node 302 labeled T11 has attached Destination E-Nodes 202 labeled E0704 through E0728. The TT-Link that connects the Source T-Node 301 T06 to the Destination T-Node 302 T11 is called a Source-to-Destination TT-Link 341 or a Destination-from-Source TT-Link 342, depending on the location of a Crossconnect Switch 540 shown in
(212) The three aggregation levels shown in L3D1 in the figure;
b. Aggregate Destination L2 Disaggregations 722 to a Destination E-Node 202, labeled L3A3 L3D3 in the figure;
c. Aggregate all Source L2 Aggregations 721 from one Source T-Node 301 to Destination T-Node 302, labeled L3A2 L3D2 in the figure.
(213) An L3A2 L3D2 connection in 3c above can be either a direct link between two T-Nodes 300 or, more likely (in a network with more than a small number of nodes), a connection that passes through tandem transit nodes. In this case, there is a plurality of L3D2 L3A2 transfers. In other words, each transit node contains an ingress L3 Disaggregation Switch 532 connected to an egress L3 Aggregation Switch 531.
(214) In
(215) When bandwidth is available for a new connection, it is the responsibility of the Aggregation Switch Node Controller 560 in a Generic Aggregation Switch 501 to apply the bandwidth in a deterministic fashion. For a new user connection, the Aggregation Switch Node Controller 560 uses information gathered from the Parameterized User Interface (PUI) 210 concerning the connection type for path level aggregation. In addition, at the path level, there can be a plurality of service classes in a network.
Path Frame Synchronization
(216) Frame synchronization that affects user connections takes place in just two places in a SAIN network: at a Path Aggregation Switch 511 and at a Path Disaggregation Switch 512. Several methods are available. One is to send a frame preamble similar to the type used in packet-based systems such as Ethernet. This approach requires considerable overhead processing.
(217) Another embodiment requires that the child Source E-Nodes 201 of a Source T-Node 301 synchronize their clocks to the Source T-Node 301. The goal is to assure that the frame start time of each Path Aggregation Switch 511 in the Source E-Nodes 201 arrives to be assigned cellet spaces in a master frame between the Source T-Node 301 and its Destination T-Node 302 partner. In this manner, the responsibility for frame synchronization can belong to an L3A/D Pair 533 among other methods.
(218) Embodiment of Level 2 Aggregation and Disaggregation Switch Pairs
(219) For the model network, each Source E-Node 201 contains 20 Level 2 Aggregation Switches 521, one for each Destination T-Node 302. Each Level 2 Aggregation Switch 521 aggregates all paths that originate at the Source E-Node 201 and terminate on one of the Destination T-Nodes 302. In the model network, 19 of the Level 2 Aggregation Switches 521 each aggregate 25 paths to Destination E-Nodes 202; the 20th Level 2 Aggregation Switch 521 aggregates the paths that backhaul to the 24 Destination E-Nodes 202 whose parent is also the parent of the Source E-Node 201.
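The path counts for the model network can be verified with a few lines of arithmetic. The sketch below assumes, as the figures of 25 and 24 imply, that the 500 E-Nodes are spread evenly over the 20 T-Nodes; the function name is hypothetical.

```python
def l2_path_counts(e_nodes=500, t_nodes=20):
    """Paths aggregated by each Level 2 Aggregation Switch of one Source E-Node.

    Assumes E-Nodes are distributed evenly among T-Nodes.
    """
    per_t_node = e_nodes // t_nodes  # 25 E-Nodes per T-Node
    # One switch per remote T-Node, plus one for the source's own parent,
    # which excludes the Source E-Node itself.
    return [per_t_node] * (t_nodes - 1) + [per_t_node - 1]


counts = l2_path_counts()
total_paths = sum(counts)  # 19 * 25 + 24
```

The total of 499 is one path from the Source E-Node to every other E-Node in the network.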
(220)
(221)
(222) Each of the plurality of Source Paths 711 can become a Destination Path Disaggregation 712 at a Destination E-Node 202, as shown in
(223) Embodiment of Level 3 Aggregation and Disaggregation Switch Pairs
(224) Each Source E-Node 201 aggregates its Level 2 Source Superpath 721 traffic into a Level 3 Source Superpath 731. A Source Superpath 731 contains all paths originating from the Source E-Node 201 to all Destination E-Nodes 202 in the network as shown in
(225) In the model network embodiment, the Level 3 Source Superpath 731 contains all 499 Source Paths 711 from a Source E-Node 201, which sends it to its parent Source T-Node 301 in an ET-Trunk 231 as shown in
(226) Embodiment of a Crossconnect Switch in a Source T-Node
(227) Within each Source T-Node 301, there can be one Crossconnect Switch 540 for each T-Node 300 in a network. Each Crossconnect Switch 540 is dedicated to forwarding traffic to one of the Destination T-Nodes 302 in the network with the traffic converted, in the model network, from one-to-many to many-to-one. That is, traffic received from each Source E-Node 201 by the Source T-Node 301 is directed to many Destination E-Nodes 202 for a Destination T-Node 302, and the Crossconnect Switch 540 converts the traffic such that many Source E-Nodes 201 are aggregated together according to a single Destination E-Node 202.
(228) A Source T-Node 301 contains, among other objects, a plurality of Level 3 Disaggregation Switches 532 shown in
(229) Each paired Destination L3 Disaggregation 732 disaggregates the 25 Level 2 Destination Superpaths 722 of the model network. Each Level 2 Destination Superpath 722 contains path traffic destined to Destination E-Nodes 202 connected to one of the Destination T-Nodes 302. The parent Source T-Node 301 connects each Level 2 Destination Superpath 722 to the Crossconnect Switch 540 in the parent Source T-Node 301 that forwards traffic to the proper Destination T-Node 302.
(230) The Path Aggregation Switches 511 in a Source E-Node 201 encapsulate all of the node's paths to every other Destination E-Node 202 in the network. However, the Generic Disaggregation Switch 502 in each of the Destination E-Nodes 202 receives paths from every other Source E-Node 201 in the network. The Crossconnect Switch 540 can be used to reorganize the path traffic to accomplish this goal.
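The one-to-many to many-to-one conversion performed by a Crossconnect Switch can be pictured as regrouping traffic by destination instead of by source. The following sketch is a data-structure analogy only: the nested dictionaries, the node labels, and the function name `crossconnect` are assumptions for illustration and say nothing about how cellets are actually moved in hardware.

```python
from collections import defaultdict


def crossconnect(by_source):
    """Regroup {source: {destination: payload}} into {destination: {source: payload}}."""
    by_destination = defaultdict(dict)
    for source, per_destination in by_source.items():
        for destination, payload in per_destination.items():
            by_destination[destination][source] = payload
    return dict(by_destination)


# Two Source E-Nodes, each sending to the same two Destination E-Nodes
# (labels are illustrative).
incoming = {
    "E0384": {"E0704": "a", "E0705": "b"},
    "E0385": {"E0704": "c", "E0705": "d"},
}
outgoing = crossconnect(incoming)
```

After regrouping, each destination entry holds the contributions of every source, which is the form a Destination E-Node's disaggregation switch expects.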
(231)
(232)
(233) The outputs from the Level 2 Disaggregation Switches 522 connect to the input side of a set of Level 2 Aggregation Switches 521 inside the Crossconnect Switch 540. The multiplexed output of the Level 2 Aggregation Switches 521 connects to a Level 3 Aggregation Switch 531 as shown in
(234)
(235) Embodiments for Connecting Tandem Nodes
(236)
(237) An important aspect of
(238) In a tandem node, a Generic Disaggregation Switch 502 connection is set to receive a data aggregation from an incoming connection. In other words, a tandem node provides a Generic Disaggregation Switch 502 that pairs with an upstream Generic Aggregation Switch 501. Such switch pairing can be a T-Node-to-T-Node L3 Aggregation Switch 531 connection from an upstream node. The Tandem Node contains one or more L3 Aggregation Switches 531 that forward the data aggregation from the L3 Disaggregation Switch 532 to one or more downlink nodes.
(239) Forwarding to more than one downlink node occurs in multicasting connectivity. The mechanism for controlling such processes can use Control Vectors managed by other T-Nodes where each such T-Node has Control Vector connectivity to the T-Nodes involved therein.
(240) Embodiments for Building Low Latency High Capacity Networks
(241) Classical Time Division Switching networks pass one frame of information on to subsequent nodes where the frame pattern persists in the same manner from one network link to another. This is the case within the Public Switched Telephone Network, for example. In a SAIN network, there is no need to replicate frame patterns within each tandem link. This opens the way to minimizing switch latency at transit nodes and to other advantages such as dividing high-speed data among multiple optical trunks.
(242)
(243) The first parameter is the System Clock Rate and its inverse, the Clock Period. As shown in
(244) At the Path Aggregation Switch 511 at the bottom of each example are three parameters: the Frame Rate for the switch, the Cellet Size, and the number of Frame Segments. The other parameters shown derive from these three plus either a System Clock Rate or an aggregation-level clock rate.
(245)
(246) The 50 subframes per frame each have 250 Clock Pulses and the subframe period is 160 nsec. In addition, if each subframe is filled with a single cellet, the total bandwidth is equal to 50 × 125,000 = 6,250,000 bps, i.e., 6.25 Mbps. This now becomes the minimum aggregate bandwidth for a Path A/D Pair 513. It does nothing to the QDR, where a connection can be any integer multiple of 125 Kbps. It only affects the bandwidth increments for additional capacity, i.e., any integer multiple of 6.25 Mbps. The maximum aggregate data rate for this set of parameters is 1.5625 Gbps, the System Clock Rate for a one-bit cellet.
(247) At an L2 Aggregation Switch 521, its frame period is now the subframe period from the Path Aggregation Level. In the example, it is 160 nsec. At Level 2, the L2 Frame Rate is the same as the Min Aggregate Bandwidth at the Path Level. The result at Level 2 is:
The L2 Cellet Size chosen is 16, resulting in an L2 QDR = 16 × 6.25 = 100 Mbps.
The L2 Min Aggregate Bandwidth = Frame Segments × QDR = 50 × 100 Mbps = 5 Gbps.
The Max Aggregate Bandwidth = L2 Cellet Size × L2 Clock Rate = 16 × 1.5625 = 25 Gbps.
The L2 Subframe Period = L2 Frame Period / L2 Frame Segments = 160/50 = 3.2 nsec.
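The Path-level and Level 2 figures above follow from a handful of multiplications, which the sketch below reproduces. The function name `derive` and the dictionary keys are hypothetical. The Path frame rate of 125,000 frames per second is inferred from the stated 125 Kbps QDR with a one-bit cellet, and `Fraction` is used so that 3.2 nsec is represented exactly.

```python
from fractions import Fraction


def derive(frame_rate_hz, cellet_bits, frame_segments, clock_rate_hz):
    """Derive the dependent parameters of one aggregation level."""
    qdr = cellet_bits * frame_rate_hz             # one cellet per frame
    subframe_rate = frame_rate_hz * frame_segments
    return {
        "qdr_bps": qdr,
        "min_aggregate_bps": frame_segments * qdr,  # one cellet per subframe
        "max_aggregate_bps": cellet_bits * clock_rate_hz,
        "subframe_rate_hz": subframe_rate,
        "subframe_period_ns": Fraction(10**9, subframe_rate),
    }


SYSTEM_CLOCK_HZ = 1_562_500_000  # 1.5625 GHz System Clock Rate

path = derive(125_000, cellet_bits=1, frame_segments=50,
              clock_rate_hz=SYSTEM_CLOCK_HZ)
# The L2 frame period is the Path subframe period, so the L2 frame rate
# is the Path subframe rate.
level2 = derive(path["subframe_rate_hz"], cellet_bits=16, frame_segments=50,
                clock_rate_hz=SYSTEM_CLOCK_HZ)
```

With these inputs, `path` yields a 6.25 Mbps minimum aggregate bandwidth and a 160 nsec subframe period, and `level2` yields a 100 Mbps QDR, 5 Gbps minimum, 25 Gbps maximum, and a 3.2 nsec subframe period.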
(248) The Frame Period at Level L3 is 3.2 nsec, imported from Level 2. The chosen cellet size is 64 bits and the number of L3 Frame Segments is 1. With this value, the L3 Subframe Period is still 3.2 nsec, the same as the L2 Subframe Period, and the Max Aggregate rate is 100 Gbps. The
(249) These results for the two figures appear interesting and can be useful, but they do not take full advantage of the power-of-two properties of the SAIN algorithm, which can result in uniformly spaced cellets within frames.
(250)
(251) The only dependent parameters that change with the L3 Cellet Size change are the L3 Min Aggregate Bandwidth and the L3 Max Aggregate Bandwidth. As shown in
(252) A skilled artisan can implement the SAIN switches disclosed herein by building switch elements that change either 1) by automation as traffic loads change or 2) by an operator making changes from a management control station. For example, the size of cellets sent from a source or to a sink can be set to one of a table of alternatives.
(253) The methods and apparatus disclosed have an important side effect in being able to implement switches that can scale to extremely high data rates, since the system can use power-of-two-related cellet sizes to advantage. For example, the switch can send each bit of a 32-bit cellet over one of 32 optical fiber wavelengths of 25 Gbps each, using the state of the art synchronization methods referenced herein. The total data rate of the combination is 800 Gbps. Increasing the cellet size to 64 bits and 64 fiber wavelengths, the result is a 1.6 terabit per second (Tbps) trunk. In this embodiment, there is no need to split and recombine packets. As always in a SAIN network, packets exist only at ingress and egress ports. With coherent optical trunks emerging that will deliver hundreds of gigabits per second using a single wavelength, the amount of data within a single fiber increases significantly.
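Sending each bit of a cellet over its own wavelength can be illustrated with a small sketch. The bit-to-lane ordering (bit i on lane i) and all function names are assumptions made here for illustration; the text does not specify them.

```python
def stripe(cellet, width=32):
    """Split a cellet into one bit per wavelength (lane i carries bit i)."""
    return [(cellet >> i) & 1 for i in range(width)]


def unstripe(lane_bits):
    """Recombine the per-lane bits into the original cellet."""
    return sum(bit << i for i, bit in enumerate(lane_bits))


def trunk_rate_gbps(width, lane_rate_gbps=25):
    """Aggregate rate of `width` parallel wavelengths."""
    return width * lane_rate_gbps


lanes = stripe(0xDEADBEEF)
restored = unstripe(lanes)
rates = (trunk_rate_gbps(32), trunk_rate_gbps(64))  # 800 and 1600 Gbps
```

Because every cellet occupies the same bit position on every lane, the receiving side needs only lane alignment to recover it, not packet reassembly.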
(254) Even though much of the disclosures herein have assumed that at the Path and Level 2 aggregation levels the cellet size can be one bit, other cellet sizes are possible. With ever-higher clock rates emerging for semiconductors, this will be an important way to take advantage of the SAIN multiplexing algorithm for serial data sources and sinks. However, many sources and sinks begin as multi-bit words, particularly those that are 8, 16, 32, and 64 bits wide. Serializing these words is a common method of operation in many contexts, and a SAIN network can be one of them. The upside of this approach is easy to understand in terms of simplicity within a stochastic network. However, there is a downside to this process.
(255) Even though the epochs are small, there is still a disassembly and reassembly time associated with this process. In these cases, the SAIN methods of this application can result in an ultimate minimization of end-to-end latency within networks. In applications such as semiconductors where distances are very small and optical transport is the ultimate in reducing power required, the methods can find application. In longer distance applications, applying the SAIN methods can result in measuring end-to-end latency in picoseconds. This can result in the ability to triangulate physical locations to compete effectively without relying on GPS and other satellite-based methods that have built-in reliability and survivability issues. The only issues affecting accuracy can be due to temperature and earth spatial variations with time, but this has been a well-researched area dating back many decades.
(256) Upon network (or a subnetwork) instantiation, node switches can be set up with initial Minimum Aggregate Bandwidth settings similar to those shown in
(257) Embodiments for Connecting Paired Switches
(258) There are two different circumstances in connecting one SAIN switch to a downlink switch. In one instance, one switch is a Generic Aggregation Switch 501 that connects to a paired Generic Disaggregation Switch 502. In another circumstance, a switch may be either a Generic Aggregation Switch 501 or a Generic Disaggregation Switch 502 where its downlink switch is a transit (tandem) switch described in the last section.
(259) The Basics of the SAIN Transform Algorithm at paragraph 0 detail memory maps in
(260) The memory maps can match each other in paired switches. In addition, the system can synchronize the start time of the Path Disaggregation Switch 512 frame to begin shortly after receiving the Path Aggregation Switch 511 start time. In other words, the synchronization process compensates for all network propagation delays between the source send time and destination E-Node's arrival time. The only restriction is that the cellet boundaries of each received cellet from a Path Aggregation Switch 511 occur so that receiving data in the arriving cellet occurs in time to place it in the concomitant outgoing cellet position.
(261) As described below, the information described in The Basics of the SAIN Transform Algorithm can enable a skilled artisan to assure timing of this embodiment occurs. That is, if a connection requires a number of cellets that is not a multiple of a PoT, there is no problem as long as the connection starts at the same Connection Domain cellet position in two relevant switch stacks.
1. Switching can make use of the PoT boundaries by treating the cellets within the boundaries as a unit of switching. Each PoT segment treats the PoT boundaries as subframe boundaries in a switch downstream from a Generic Aggregation Switch 501 and returning to the original boundary at a Generic Disaggregation Switch 502.
2. Such switching adds no latency beyond a single cellet buffer as long as the number of cellets per PoT segment p is the same, in other words, as long as the data rate being switched is an integer multiple of the base data rate, defined to be one cellet per PoT segment.
3. If the number of cellets per PoT segment is not the same, a two-PoT-segment FIFO buffer requires p+1 cellets to ensure that each outgoing cellet slot has an incoming cellet.
4. The result is that the end-to-end latency remains constant as long as the base data rate remains unchanged even when the integer multiple changes.
(262)
(263)
(264) Embodiments Using Power of Two Length Subframes to Minimize Latency
(265) This section pertains to all paired switches in a SAIN network, but is especially important in paired tandem node switches. Using Power of Two subframes in the proper manner can minimize latency beyond merely relying on the small length of the subframes.
(266)
(267)
(268) A major advantage of equally spaced cellets is that a transit node switch does not need to wait an entire PoT segment to be certain that all cellets that were supposed to arrive did so.
(269) Equal Spacing of Cellets
(270) Equal spacing of cellets for a connection depends on two conditions. These are:
(271) 1. The number of cellets per PoT segment is a power of two; and
2. The number of PoT segments per frame is a divisor of the start position of the connection in the virtual frame length of the Connection Domain.
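The two conditions can be written as a simple predicate. This is a literal transcription of the conditions as worded above, not an implementation of the SAIN transform itself; the function and parameter names are hypothetical.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (n must be a positive integer)."""
    return n > 0 and (n & (n - 1)) == 0


def equally_spaced(cellets_per_segment, segments_per_frame, start_position):
    """Check both equal-spacing conditions for one connection."""
    condition_1 = is_power_of_two(cellets_per_segment)
    # "Is a divisor of the start position" means the division leaves no remainder.
    condition_2 = start_position % segments_per_frame == 0
    return condition_1 and condition_2


ok = equally_spaced(cellets_per_segment=4, segments_per_frame=50, start_position=100)
bad_count = equally_spaced(cellets_per_segment=3, segments_per_frame=50, start_position=100)
bad_start = equally_spaced(cellets_per_segment=4, segments_per_frame=50, start_position=30)
```

Only the first call satisfies both conditions; the second fails on the cellet count and the third on the start position.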
(272) To illustrate further, assume that a PoT segment is subdivided into smaller PoT segments for assignment of low speed traffic.
(273) The results are:
1. Jitter can be at most one link data rate position (plus one link clock period);
2. Any fixed latency, though small, is deterministic and predictable; network controllers can account for it in end-to-end latency measurements; and
3. Position of an E-Node port can be measurable in centimeters and decimeters where high-speed links exist; the accuracy of the measurement can exceed that of GPS.
(274)
(275)
(276)
(277) Clearly, not all connections are going to be a power of two in length in the real world. This is not a big issue insofar as Path A/D Pairs 513 are concerned. Requiring FIFO buffers to match an incoming packet at a Source E-Node 201 with an outgoing connection is a one-time occurrence. Transport links with data rates exceeding a gigabit per second have very large maximum base data rates that are large powers of two. Their subdivision to lower powers of two can assure the low-latency transit node operation disclosed herein.