PCI Express to PCI Express based low latency interconnect scheme for clustering systems

20220374388 · 2022-11-24

    Abstract

    PCI Express (PCIE) is a bus or I/O interconnect standard for use inside a computer or embedded system, enabling faster data transfers to and from peripheral devices. The standard has achieved a degree of stability such that other applications can be implemented using PCIE as a basis. A PCIE based interconnect scheme is proposed for switching and interconnection between multiple PCIE enabled systems, each having its own PCIE root complex controlling a PCIE bus. The scheme allows the scalability of the PCIE architecture to be used for data transport, using PCIE protocol, between the interconnected PCIE buses of the systems forming a cluster, through a switch enabled for PCIE interconnect that forms the hub of a star connected network. The systems forming the cluster can be any computing, control, storage or embedded system. The scalability of the interconnect allows the bandwidth between the systems in the cluster to grow as it becomes necessary, without changing to a different connection architecture.

    Claims

    1. A network switch for interconnecting a plurality of PCI-Express computing systems, in a cluster, wherein each of the plurality of PCI-Express based computing systems comprise at least a root complex controlling a PCI-Express bus of the respective PCI-Express computing system, the PCI-Express bus having at least a PCI-Express outbound port that connects to an inbound port on the network switch for data transfer back and forth between said PCI-Express computing system and the network switch using PCI-Express protocol; the network switch comprising: a plurality of inbound ports wherein a first inbound port on the network switch is connected to a PCI-Express outbound port on the PCI-Express bus of a first of the plurality of interconnected PCI-Express computing systems, wherein the PCI-Express outbound port of the first of the interconnected PCI-Express computing systems is connected to a root complex of the first of the plurality of interconnected PCI-Express computing systems via the PCI-Express bus of the first of the plurality of interconnected PCI-Express computing systems; a second of the inbound port of the plurality of inbound ports of the network switch is connected to a PCI-Express outbound port of a second of the plurality of interconnected PCI-Express computing systems; wherein the PCI-Express outbound port of the second of the interconnected PCI-Express computing systems is connected to a PCI-Express root complex of the second of the plurality of interconnected PCI-Express computing systems via the PCI-Express bus of the second of the plurality of interconnected PCI-Express computing systems; wherein data is transferred to and from the first of the interconnected PCI-Express computing systems and the first inbound port of the network switch using PCI-Express protocol; wherein data is transferred to and from the second of said interconnected PCI-Express computing systems and the second inbound port of the network switch using PCI-Express protocol; and wherein data is transferred between said first inbound port on the network switch and the second inbound port on the network switch, such that data transfer and communication is performed between the first and second of the interconnected PCI-Express computing systems in the PCI-Express cluster using PCI-Express protocol; wherein each of the plurality of PCI-Express computing systems that connects to the network switch connects by way of the PCI-Express outbound port at a PCI-Express based peripheral module forming an end point of the respective PCI-Express Bus configured as an I/O module enabled for system inter-connection; wherein the PCI-Express endpoints of each of the plurality of PCI-Express computing systems that connects to the network switch are part of the PCI-Express Bus that is the I/O interconnect of the respective PCI-Express computing systems; wherein the network switch is configured for data transfer back and forth between the PCI-Express computing systems, connected to the network switch using PCI-Express protocol; and wherein the data transfer within the PCI-Express bus of each of the connected PCI-Express computing system is under the control of the respective root complexes using PCI-Express protocols.

    2. The network switch of claim 1 wherein the network switch also includes a PCI-Express inbound port for connecting to a PCI-Express inbound port on a second network switch using PCI-Express protocol, for transferring data to and from the first or second of said interconnected PCI-Express computing systems and a third PCI-Express computing system connected to the second network switch using PCI-Express protocol.

    3. The network switch of claim 1 wherein the network switch comprises one or more semiconductor switch devices.

    4. The plurality of computing systems of claim 1, are systems selected from a group comprising computing systems, control systems, storage systems and embedded systems.

    5. A system comprising: a plurality of PCI-Express computing systems inter-connected in a cluster through a network switch and using PCI-Express protocol for transferring data back and forth among the plurality of PCI-Express computing systems; wherein each of the plurality of PCI-Express computing systems interconnected in the cluster is enabled with a PCI Express Bus that is an input/output (I/O) interconnect for data transfer to and from connected PCI-Express peripheral modules forming end points of the PCI Express Bus under the control of a respective root complex of each of the plurality of PCI-Express computing systems interconnected in the cluster; wherein each of the plurality of PCI-Express computing systems interconnected in the cluster comprise at least a PCI-Express peripheral module for system interconnection on the PCI-Express bus configured as a PCI-Express outbound port enabled for data transfer connecting to the root complex of the respective PCI-Express computing system; wherein said network switch comprises: a) a plurality of inbound ports; wherein a first inbound port on said network switch is connected to a first PCI-Express outbound port enabled for data transfer of a first PCI-Express computing system of the plurality of PCI-Express computing systems; and b) at least a second inbound port on said network switch is connected to a second PCI-Express outbound port enabled for data transfer of a second PCI-Express computing system of the plurality of PCI-Express computing systems; wherein data is transferred within each interconnected PCI-Express system over the PCI-Express bus under the control of the respective root complexes; and wherein data is transferred to and from said first PCI-Express computing system via the first PCI-Express outbound port and said first inbound port on said network switch using PCI-Express protocol; wherein data is transferred to and from said second PCI-Express computing system via the second PCI-Express outbound port and said second inbound port on said network switch using PCI-Express protocol; and data is further transferred between said first inbound port on said network switch and said second inbound port on said network switch, such that data transfer and communication is performed between the first and second PCI-Express computing systems in the PCI-Express cluster using PCI-Express protocol.

    6. The system of claim 5, wherein said network switch also includes a third port enabled for data transfer for connecting to a fourth port enabled for data transfer on a second network switch using PCI-Express protocol, for transferring data to and from the first or second PCI-Express computing systems and a third PCI-Express computing system connected to the second network switch using PCI-Express protocol.

    7. The system of claim 5 wherein the network switch comprises one or more semiconductor switch devices.

    8. The plurality of inter-connected computing systems of claim 5, are systems selected from a group comprising computing systems, control systems, storage systems and embedded systems.

    Description

    DESCRIPTION OF FIGURES

    [0016] FIG. 1 is a typical interconnected (multi-system) cluster, shown with eight systems connected in a star architecture using direct connected data links from PCIE standard based peripheral to PCIE standard based peripheral.

    [0017] FIG. 2 is a cluster using multiple interconnect modules or switches to interconnect smaller clusters.

    [0018] Explanation of Numbering and Lettering in FIG. 1

    [0019] (1) to (8): Number of systems interconnected in FIG. 1.
    (9): Switch sub-system.
    (10): Software configuration and control input for the switch.
    (1a) to (8a): PCI Express based peripheral modules (PCIE modules) attached to the systems.
    (1b) to (8b): PCI Express based peripheral modules (PCIE modules) at the switch.
    (1L) to (8L): PCIE based peripheral module to PCIE based peripheral module connections having n-links (n data links).

    [0020] Explanation of Numbering and Lettering in FIG. 2

    [0021] (12-1) and (12-2): Clusters.
    (9-1) and (9-2): Interconnect modules or switch sub-systems.
    (10-1) and (10-2): Software configuration inputs.
    (11-1) and (11-2): Switch to switch interconnect modules in the cluster.
    (11L): Switch to switch interconnection.

    Description of Invention

    [0022] PCI Express is a bus or I/O interconnect standard for use inside a computer or embedded system, enabling faster data transfers to and from peripheral devices. The standard is still evolving but has achieved a degree of stability such that other applications can be implemented using PCIE as a basis. A PCIE based interconnect scheme is proposed to enable switching and interconnection between multiple PCIE enabled systems, each having its own PCIE root complex, such that the scalability of the PCIE architecture can be applied to enable data transport between connected systems to form a cluster of systems. These connected systems can be any computing, control, storage or embedded system. The scalability of the interconnect allows the bandwidth between the systems in the cluster to grow as it becomes necessary, without changing to a different connection architecture.

    [0023] FIG. 1 is a typical cluster interconnect. The multi-system cluster shown consists of eight units or systems {(1) to (8)} that are to be interconnected. Each system is a PCI Express (PCIE) based system with a PCIE root complex for control of data transfer to and from connected peripheral devices via PCIE peripheral modules, as is standard for PCIE based systems. Each system to be interconnected has at least one PCIE based peripheral module {(1a) to (8a)} as an IO module at the interconnect port enabled for system interconnection, with n-links built into or attached to the system. (9) is an interconnect module or switch sub-system, which has a number of PCIE based connection modules equal to or greater than the number of systems to be interconnected, this number being eight {(1b) to (8b)} in the case of FIG. 1, that can be interconnected for data transfer through the switch. A software based control input (10) is provided to configure and/or control the operation of the switch and enable connections between the switch ports for transfer of data. Link connections {(1L) to (8L)} attach the PCIE based peripheral modules 1a to 8a, enabled for interconnection on the respective systems 1 to 8, to the PCIE based peripheral modules 1b to 8b on the switch with n links. The value of n can vary depending on the connection bandwidth required by the system.
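
    By way of non-limiting illustration only, the star arrangement of FIG. 1 may be modeled as a simple data structure. The following Python sketch is not part of the disclosed apparatus; the names Link and build_star, and the example value n = 4, are hypothetical and chosen solely for illustration. Only the reference labels (1a) to (8a), (1b) to (8b) and (1L) to (8L) are taken from FIG. 1.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Link:
    """One system-to-switch connection of FIG. 1 (hypothetical model)."""
    system: int          # system number, (1) to (8) in FIG. 1
    system_module: str   # PCIE module at the system, e.g. "1a"
    switch_module: str   # PCIE module at the switch, e.g. "1b"
    name: str            # connection label, e.g. "1L"
    n_links: int         # number of data links n on this connection


def build_star(num_systems: int, n_links: int) -> List[Link]:
    """Attach every system to the single switch (9), forming a star."""
    return [
        Link(system=i, system_module=f"{i}a", switch_module=f"{i}b",
             name=f"{i}L", n_links=n_links)
        for i in range(1, num_systems + 1)
    ]


# Eight systems as in FIG. 1; n = 4 is an arbitrary illustrative value.
cluster = build_star(num_systems=8, n_links=4)
```

    In this model every system has exactly one connection, and all connections terminate at the switch, which is the property that makes the switch the hub of the star.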

    [0024] When data has to be transferred between, say, system 1 and system 5, in the simple case the control input is used to establish an internal link between the PCIE based peripheral modules 1b and 5b at the respective ports of the switch. A handshake is established between the outbound communication enabled PCIE based peripheral module (PCIE module) 1a and the inbound PCIE module 1b at the switch port, and between the outbound PCIE module 5a and the inbound communication enabled PCIE module 5b at the switch port. This provides a through connection between the PCIE modules 1a to 5b through the switch, allowing data transfer. Data can then be transferred at speed between the modules and hence between the systems. In more complex cases, data can also be transferred to and queued in storage implemented in the switch at the ports, and then transferred out to the right systems at speed when links are free.
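
    By way of non-limiting illustration only, the two cases described above (direct pass-through when the ports are free, and queuing in storage at the port when they are not) may be sketched as follows. The class SwitchSketch and its methods are hypothetical; the sketch models only port availability and queuing, and does not model the PCIE handshake or protocol itself.

```python
from collections import deque


class SwitchSketch:
    """Hypothetical model of switch (9), keyed by switch-side PCIE module."""

    def __init__(self, ports):
        self.busy = {p: False for p in ports}        # True while a port is in use
        self.pending = {p: deque() for p in ports}   # storage at each port
        self.delivered = []                          # log of (src, dst, payload)

    def send(self, src, dst, payload):
        """Pass data through if both ports are free, otherwise queue it."""
        if self.busy[src] or self.busy[dst]:
            self.pending[src].append((dst, payload))  # complex case: queue
            return "queued"
        self.delivered.append((src, dst, payload))    # simple case: through link
        return "delivered"

    def hold(self, port):
        """Mark a port as occupied by some other transfer."""
        self.busy[port] = True

    def release(self, port):
        """Free a port, then forward any queued data that can now move."""
        self.busy[port] = False
        for src, queue in self.pending.items():
            while queue and not self.busy[src] and not self.busy[queue[0][0]]:
                dst, payload = queue.popleft()
                self.delivered.append((src, dst, payload))


switch = SwitchSketch(["1b", "5b"])
first = switch.send("1b", "5b", "A")    # both ports free
switch.hold("5b")                       # port 5b becomes occupied
second = switch.send("1b", "5b", "B")   # held in storage at port 1b
switch.release("5b")                    # link free, queued data goes out
```

    A first-in, first-out queue per port is one possible choice for the storage; the description above does not prescribe a queuing discipline.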

    [0025] Multiple systems can be interconnected at one time to form a multi-system that allows data and information transfer and sharing through the switch. It is also possible to connect smaller clusters together to take advantage of the growth in system volume by using an available connection scheme that interconnects the switches that form a node of the cluster.

    [0026] If the need for higher bandwidth and low latency data transfers between systems increases, the connections can grow by increasing the number of links connecting the PCIE modules between the systems in the cluster and the switch, without completely changing the architecture of the interconnect. This scalability is of great importance in retaining flexibility for growth and scaling of the cluster.
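
    By way of non-limiting illustration only, the growth of bandwidth with the number of links n may be estimated as follows. The sketch assumes that each of the n links corresponds to one PCI Express lane, which is an interpretation made for this illustration. The per-lane signaling rates and line encodings are those published for PCI Express generations 1 to 3; the function name link_bandwidth_mb_s is hypothetical, and the result is a raw one-direction figure that ignores packet and protocol overhead.

```python
# Per-lane signaling rate in GT/s and line-code efficiency.
GENERATIONS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
}


def link_bandwidth_mb_s(generation: int, n: int) -> float:
    """Raw one-direction bandwidth in MB/s for a connection of n lanes."""
    rate_gt_s, efficiency = GENERATIONS[generation]
    bytes_per_second_per_lane = rate_gt_s * 1e9 * efficiency / 8
    return bytes_per_second_per_lane * n / 1e6


# Doubling n doubles the raw bandwidth without any change in architecture.
for n in (1, 2, 4, 8):
    print(n, round(link_bandwidth_mb_s(1, n)))
```

    Under these assumptions the raw bandwidth is linear in n, so a connection can be widened step by step as the requirement grows.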

    [0027] It should be understood that the system may consist of peripheral devices, storage devices, processors and any other communication devices. The interconnect is agnostic to the type of device as long as it has a PCIE module at the port to enable the connection to the switch. This feature reduces the cost of expanding the system, since growth of the multi-system requires changing only the switch interconnect density.

    [0028] PCIE is currently being standardized, which will enable existing PCIE modules from different vendors to be used, reducing the overall cost of the system. In addition, using a standardized module in the system as well as in the switch will reduce the cost of software development and, in the long run, allow available software to be used to configure and run the systems.

    [0029] As the expansion of the cluster in terms of the number of systems connected, bandwidth usage and control will all be cost effective, it is expected that the overall system cost can be reduced and the overall performance improved by the use of standardized PCIE modules with standardized software control.

    [0030] Typical connect operation may be explained with reference to two of the systems, for example system (1) and system (5). System (1) has a PCIE module (1a) at the interconnect port, which is connected by the connection link or data link (1L) to a PCIE module (1b) at the IO port of the switch (9). System (5) is similarly connected to the switch through the PCIE module (5a) at its interconnect port to the PCIE module (5b) at the switch (9) IO port by link (5L). Each PCIE module operates for transfer of data to and from it by standard PCI Express protocols, provided by the configuration software loaded into the PCIE modules and the switch. The switch operates by the software control and configuration loaded in through the software configuration input.

    [0031] FIG. 2 is that of a multi-switch cluster. As the need to interconnect a larger number of systems increases, it will be optimum to interconnect multiple switches of the clusters to form a new larger cluster. Such a connection is shown in FIG. 2. The connection shown is for two smaller clusters (12-1 and 12-2) interconnected using PCIE modules that can be connected together using any low latency switch to switch connection (11-1 and 11-2), connected using interconnect links (11L) to provide sufficient bandwidth for the connection. The switch to switch connection transmits and receives data and information using any suitable protocol, and the switches provide the interconnection internally through the software configuration loaded into them.
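
    By way of non-limiting illustration only, the path taken by data in the multi-switch cluster of FIG. 2 may be sketched as follows. The function route and the system names sysA, sysB and sysC are hypothetical; only the switch labels 9-1 and 9-2 are taken from FIG. 2. The sketch covers only two cases: systems on the same switch, and systems on two switches joined directly by a switch to switch interconnection such as (11L).

```python
def route(src, dst, switch_of, inter_switch):
    """Return the list of hops from system src to system dst.

    switch_of maps each system to the switch it attaches to.
    inter_switch is the set of switch pairs joined by a direct
    switch to switch interconnection.
    """
    a, b = switch_of[src], switch_of[dst]
    if a == b:
        return [src, a, dst]                 # same cluster: one switch
    if (a, b) in inter_switch or (b, a) in inter_switch:
        return [src, a, b, dst]              # cross the switch to switch link
    raise ValueError(f"no switch to switch link between {a} and {b}")


# Hypothetical placement of three systems on the two switches of FIG. 2.
switch_of = {"sysA": "9-1", "sysB": "9-1", "sysC": "9-2"}
inter_switch = {("9-1", "9-2")}

local_path = route("sysA", "sysB", switch_of, inter_switch)
remote_path = route("sysA", "sysC", switch_of, inter_switch)
```

    A transfer within one cluster passes through a single switch, while a transfer between clusters adds exactly one switch to switch hop in this two-switch arrangement.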

    [0032] The following are some of the advantages of the disclosed interconnect scheme:
    1. Provides a low latency interconnect for the cluster.
    2. Uses PCI Express based protocols for data and information transfer within the cluster.
    3. Eases growth in bandwidth as the system requirements increase, by increasing the number of links within the cluster.
    4. Use of standardized PCIE components in the cluster reduces initial cost.
    5. Lowers the cost of growth due to standardization of hardware and software.
    6. Provides a path of expansion from a small cluster to larger clusters as the need grows.
    7. Provides a future proofed system architecture.
    8. Any speed increase in the link connection due to technology advances is directly applicable to the interconnection scheme.

    [0033] In fact the disclosed interconnect scheme provides advantages for low latency multi-system cluster growth that are not available from any other source.

    [0034] While the invention has been described in terms of several embodiments, those of ordinary skill in the art will recognize that the invention is not limited to the embodiments described, but can be practiced with modification and alteration within the spirit and scope of the appended claims. Multiple existing methods, and methods developed using newly developed technology, may be used to establish the handshake between systems and to improve data transfer and latency. The description is thus to be regarded as illustrative instead of limiting, and capable of using any new technology developments in the field of communication and data transfer. There are numerous other variations to different aspects of the invention described above, which in the interest of conciseness have not been provided in detail. Accordingly, other embodiments are limited only within the scope of the claims.