Patent classifications
H04L49/106
FPGA-efficient directional two-dimensional router
A configurable directional 2D router for Networks on Chips (NOCs) is disclosed. The router, which may be bufferless, is designed for implementation in programmable logic in FPGAs, and achieves theoretical lower bounds on FPGA resource consumption for various applications. The router employs an FPGA router switch design that consumes only one 6-LUT or 8-input ALM logic cell per router per bit of router link width. A NOC comprising a plurality of routers may be configured as a directional 2D torus or in diverse other ways, with configurable network sizes and topologies, data widths, routing functions, performance-energy tradeoffs, and other options. The router and NOC enable feasible FPGA implementation of large integrated systems on chips, interconnecting hundreds of client cores over high-bandwidth links, including compute and accelerator cores, industry-standard IP cores, DRAM/HBM/HMC channels, PCI Express channels, and 10G/25G/40G/100G/400G networks.
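As a rough illustration of what "directional 2D torus" implies for message paths, the sketch below assumes that each router forwards only in the +X and +Y directions with wraparound, and that messages travel along X first and then along Y. The abstract does not state the routing function, so the dimension order, the function name `route`, and the coordinate convention are all assumptions. Contention and deflection in a bufferless router are not modeled.

```python
def route(src, dst, width, height):
    """Return the routers visited from src to dst (both inclusive).

    Assumption: links are unidirectional (+X, +Y) and wrap around,
    and a message finishes its X travel before starting its Y travel.
    """
    x, y = src
    path = [(x, y)]
    # Travel along the first dimension until the column matches.
    while x != dst[0]:
        x = (x + 1) % width
        path.append((x, y))
    # Then travel along the second dimension until the row matches.
    while y != dst[1]:
        y = (y + 1) % height
        path.append((x, y))
    return path


if __name__ == "__main__":
    for hop in route((3, 0), (1, 2), width=4, height=4):
        print(hop)
```

Under these assumptions the hop count is `(dx - sx) % width + (dy - sy) % height`, because a message can never travel "backwards" and has to use the wraparound link instead.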
AUGMENTING DATA PLANE FUNCTIONALITY WITH FIELD PROGRAMMABLE INTEGRATED CIRCUITS
Some embodiments use one or more FPGAs and external memories associated with the FPGAs to implement large, hash-addressable tables for a data plane (DP) circuit. These embodiments configure at least one message processing stage of the DP circuit to store (1) a first plurality of records for matching with a set of data messages received by the DP circuit, and (2) a redirection record redirecting data messages that do not match the first plurality of records to a DP egress port associated with the memory circuit. These embodiments configure an external memory circuit to store a larger, second set of records for matching with redirected data messages received through the DP egress port associated with the memory circuit. This external memory circuit is a hash-addressable memory in some embodiments. To determine whether a redirected data message matches a record in the second set of records, the method of some embodiments configures an FPGA associated with the hash-addressable external memory to use a collision-free hash process to generate a collision-free hash address value from a set of attributes of the data message. This hash address value specifies an address in the external memory for the record in the second set of records to compare with the redirected data message.
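The two-tier lookup described above can be sketched in software as follows. This is a minimal model under stated assumptions, not the patented circuit: the small DP table is a Python dict, the external memory is a list, and the collision-free hash is obtained by searching for a salt under which no two stored keys share an address. The use of BLAKE2b, the salt search, and all function names are illustrative choices that do not come from the abstract.

```python
import hashlib


def hash_address(key: bytes, seed: int, size: int) -> int:
    """Map a key to an external-memory address (hypothetical hash choice)."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(key, digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little") % size


def build_external_table(records: dict, size: int, max_tries: int = 10_000):
    """Search for a seed that places every record at a distinct address."""
    for seed in range(max_tries):
        table = [None] * size
        for key, value in records.items():
            addr = hash_address(key, seed, size)
            if table[addr] is not None:
                break  # collision: try the next seed
            table[addr] = (key, value)
        else:
            return seed, table
    raise ValueError("no collision-free seed found; enlarge the table")


def lookup(key: bytes, dp_table: dict, seed: int, ext_table: list):
    """Match in the DP table first; on a miss, redirect to external memory."""
    if key in dp_table:
        return dp_table[key]
    # Redirection record: the miss is sent to the FPGA, which computes one
    # address and compares the single record stored there.
    entry = ext_table[hash_address(key, seed, len(ext_table))]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None


if __name__ == "__main__":
    dp = {b"flow-a": "port1"}
    ext = {b"flow-b": "port2", b"flow-c": "port3", b"flow-d": "port4"}
    seed, table = build_external_table(ext, size=16)
    print(lookup(b"flow-a", dp, seed, table), lookup(b"flow-c", dp, seed, table))
```

Because the hash is collision-free for the stored keys, a redirected message needs exactly one external-memory read and one comparison, which is the property the abstract relies on.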
Recording in an external memory data messages processed by a data plane circuit
Some embodiments provide novel circuits for recording data messages received by a data plane circuit of a forwarding element in an external memory outside of the data plane circuit. The external memory in some embodiments is outside of the forwarding element. In some embodiments, the data plane circuit encapsulates the received data messages that should be recorded with encapsulation headers, inserts into these headers addresses that identify locations for storing these data messages in a memory external to the data plane circuit, and forwards these encapsulated data messages so that these messages can be stored in the external memory by another circuit. Instead of encapsulating received data messages for storage, the data plane circuit in some embodiments encapsulates copies of the received data messages for storage. Accordingly, in these embodiments, the data plane circuit makes copies of the data messages that it needs to record.
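The encapsulate-then-record flow can be illustrated with a small sketch. The header layout used here (a 64-bit storage address followed by a 16-bit payload length, in network byte order) is a hypothetical format chosen for the example, and `encapsulate` and `record` are made-up names. The abstract only says that the header carries an address identifying where the message should be stored.

```python
import struct

# Hypothetical encapsulation header: 64-bit address, 16-bit payload length.
HEADER = struct.Struct("!QH")


def encapsulate(message: bytes, address: int) -> bytes:
    """Data plane side: prepend a header naming the storage location."""
    return HEADER.pack(address, len(message)) + message


def record(packet: bytes, memory: bytearray) -> None:
    """External side: parse the header and store the payload at the address."""
    address, length = HEADER.unpack_from(packet)
    if address + length > len(memory):
        raise ValueError("record would fall outside the external memory")
    payload = packet[HEADER.size:HEADER.size + length]
    memory[address:address + length] = payload


if __name__ == "__main__":
    memory = bytearray(64)
    original = b"hello"
    # The data plane may encapsulate a copy and keep forwarding the original.
    record(encapsulate(bytes(original), address=8), memory)
    print(memory[8:13])
```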
Augmenting data plane functionality with field programmable integrated circuits
Some embodiments provide novel circuits for augmenting the functionality of a data plane circuit of a forwarding element with one or more field programmable circuits and external memory circuits. The external memories in some embodiments serve as deep buffers that receive through one or more FPGAs a set of data messages from the data plane (DP) circuit to store temporarily. In some of these embodiments, one or more of the FPGAs implement schedulers that specify when data messages should be retrieved from the external memories and provided back to the data plane circuit for forwarding through the network. For instance, in some embodiments, a particular FPGA can perform a scheduling operation for a first set of data messages stored in its associated external memory, and can direct another FPGA to perform the scheduling operation for a second set of data messages stored in the particular FPGA's associated external memory. Specifically, in these embodiments, the particular FPGA determines when the first set of data messages stored in its associated external memory should be forwarded back to the data plane circuit for forwarding through the network, while directing another FPGA to determine when the second set of data messages stored in the particular FPGA's external memory should be forwarded back to the data plane circuit.
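A deep buffer with a scheduler can be modeled, very loosely, as a priority queue keyed by release time. The sketch below is an assumption-laden illustration: the abstract does not say how the scheduler decides when a message is due, so the explicit `release_time` argument, the `DeepBuffer` class, and the tie-breaking by arrival order are all hypothetical. Delegating scheduling to a second FPGA is not modeled.

```python
import heapq


class DeepBuffer:
    """Toy model of an external-memory buffer plus its scheduler."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # preserves arrival order among equal release times

    def store(self, message: bytes, release_time: int) -> None:
        """Hold a message until the scheduler's chosen release time."""
        heapq.heappush(self._heap, (release_time, self._seq, message))
        self._seq += 1

    def release_due(self, now: int) -> list:
        """Return, in order, every message whose release time has arrived."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


if __name__ == "__main__":
    buf = DeepBuffer()
    buf.store(b"a", release_time=5)
    buf.store(b"b", release_time=2)
    print(buf.release_due(now=3))
```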
Augmenting data plane functionality with field programmable integrated circuits
Some embodiments use one or more FPGAs and external memories associated with the FPGAs to implement large, hash-addressable tables for a data plane (DP) circuit. These embodiments configure at least one message processing stage of the DP circuit to store (1) a first plurality of records for matching with a set of data messages received by the DP circuit, and (2) a redirection record redirecting data messages that do not match the first plurality of records to a DP egress port associated with the memory circuit. These embodiments configure an external memory circuit to store a larger, second set of records for matching with redirected data messages received through the DP egress port associated with the memory circuit. This external memory circuit is a hash-addressable memory in some embodiments. To determine whether a redirected data message matches a record in the second set of records, the method of some embodiments configures an FPGA associated with the hash-addressable external memory to use a collision-free hash process to generate a collision-free hash address value from a set of attributes of the data message. This hash address value specifies an address in the external memory for the record in the second set of records to compare with the redirected data message.
FPGA-EFFICIENT DIRECTIONAL TWO-DIMENSIONAL ROUTER
A configurable directional 2D router for Networks on Chips (NOCs) is disclosed. The router, which may be bufferless, is designed for implementation in programmable logic in FPGAs, and achieves theoretical lower bounds on FPGA resource consumption for various applications. The router employs an FPGA router switch design that consumes only one 6-LUT or 8-input ALM logic cell per router per bit of router link width. A NOC comprising a plurality of routers may be configured as a directional 2D torus or in diverse other ways, with configurable network sizes and topologies, data widths, routing functions, performance-energy tradeoffs, and other options. The router and NOC enable feasible FPGA implementation of large integrated systems on chips, interconnecting hundreds of client cores over high-bandwidth links, including compute and accelerator cores, industry-standard IP cores, DRAM/HBM/HMC channels, PCI Express channels, and 10G/25G/40G/100G/400G networks.
MULTICAST MESSAGE DELIVERY USING A DIRECTIONAL TWO-DIMENSIONAL ROUTER AND NETWORK
A system and method for multicast delivery of messages using a configurable directional 2D router for Networks on Chips (NOCs) are disclosed. The router is well suited for implementation in programmable logic in FPGAs and achieves theoretical lower bounds on FPGA resource consumption. A NOC comprising a plurality of routers may be configured as a directional 2D torus or in diverse other ways, with configurable network sizes and topologies, data widths, routing functions, performance-energy tradeoffs, and other options. The NOC may transmit a unicast message from one source client core to one destination client core, or a multicast message from one source client core to a plurality of destination client cores, or an arbitrary mix of unicast and multicast messages, simultaneously. A multicast message destination may include all client cores of routers with a particular first or second dimension coordinate, or all client cores, or some arbitrary subsets of client cores.
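The destination sets named above (one dimension coordinate fixed, or all client cores) can be enumerated with a short sketch. It assumes one client core per router, addressed by an `(x, y)` coordinate, and uses `None` as a hypothetical wildcard for a dimension. How the routers actually encode and deliver multicast messages is not described in the abstract and is not modeled here.

```python
def multicast_destinations(width, height, x=None, y=None):
    """Return the set of (x, y) client coordinates a message addresses.

    Assumption: None acts as a wildcard for that dimension, so
    x=None, y=None addresses every client, while x=2 addresses one column.
    """
    return {
        (cx, cy)
        for cx in range(width)
        for cy in range(height)
        if (x is None or cx == x) and (y is None or cy == y)
    }


if __name__ == "__main__":
    print(sorted(multicast_destinations(4, 3, x=1)))  # one column
    print(sorted(multicast_destinations(4, 3, y=0)))  # one row
```

An arbitrary subset of clients, the remaining case in the abstract, would need an explicit list or bitmask rather than a coordinate wildcard.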
CONNECTING DIVERSE CLIENT CORES USING A DIRECTIONAL TWO-DIMENSIONAL ROUTER AND NETWORK
A configurable directional 2D router for Networks on Chips (NOCs) is disclosed. The router is well suited for implementation in programmable logic in FPGAs, and achieves theoretical lower bounds on FPGA resource consumption for various applications. The router may employ an FPGA router switch design that consumes only one 6-LUT or 8-input ALM logic cell per router per bit of router link width. A NOC comprising a plurality of routers may be configured as a directional 2D torus or in diverse other ways, with configurable network sizes and topologies, data widths, routing functions, performance-energy tradeoffs, and other options. System on chip designs may employ a plurality of NOCs with different configuration parameters to customize the system to the application or workload characteristics. A great diversity of NOC client cores, for communication amongst various external interfaces and devices, and on-chip interfaces and resources, may be coupled to a router in order to efficiently communicate with other NOC client cores. The router and NOC enable feasible FPGA implementation of large integrated systems on chips, interconnecting hundreds of client cores over high-bandwidth links, including compute and accelerator cores, industry-standard IP cores, DRAM/HBM/HMC channels, PCI Express channels, and 10G/25G/40G/100G/400G networks.
NETWORK ARCHITECTURE WITH HARMONIC CONNECTIONS
A computer network organized in a logical grid having rows and columns can include network nodes coupled according to harmonics. Each network node can be coupled to network nodes of the same row using a set of horizontal strands according to a set of horizontal harmonics. Each of the horizontal harmonics specifies a node distance along the row between adjacent connection points on the corresponding horizontal strand. Each network node can also be coupled to network nodes of the same column using a set of vertical strands according to a set of vertical harmonics. Each of the vertical harmonics specifies a node distance along the column between adjacent connection points on the corresponding vertical strand.
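To make the harmonic idea concrete, the sketch below lists the nodes that a given node would reach directly, assuming that a strand with harmonic `h` links each node to the nodes `h` positions away in both directions along its row or column, and that strands do not wrap around the grid edge. Both of these are assumptions, as are the function and parameter names. The abstract only defines a harmonic as the node distance between adjacent connection points on a strand.

```python
def neighbors(row, col, rows, cols, h_harmonics, v_harmonics):
    """Return the nodes directly connected to (row, col).

    Assumptions: each harmonic links a node to the nodes exactly that many
    positions away along the row or column, with no wraparound at the edges.
    """
    result = set()
    for h in h_harmonics:
        for c in (col - h, col + h):
            if 0 <= c < cols:
                result.add((row, c))
    for v in v_harmonics:
        for r in (row - v, row + v):
            if 0 <= r < rows:
                result.add((r, col))
    return result


if __name__ == "__main__":
    print(sorted(neighbors(0, 0, rows=4, cols=8,
                           h_harmonics=[1, 2, 4], v_harmonics=[1, 2])))
```

With horizontal harmonics 1, 2, and 4, a corner node reaches columns 1, 2, and 4 in a single hop, which is how larger harmonics shorten paths across a wide row.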