PROCESSING SYSTEM, RELATED INTEGRATED CIRCUIT, DEVICE AND METHOD
20240283736 · 2024-08-22
CPC classification
H04L67/12
ELECTRICITY
H04L45/566
ELECTRICITY
Abstract
A hardware network accelerator comprises a plurality of Ethernet communication interfaces, a plurality of memories, and a further memory. Each memory stores records comprising destination IP data identifying a destination IP address range. The further memory stores further records, wherein each record comprises next-hop data indicating a next-hop IP address, next-hop enable data, and network port data indicating an Ethernet communication interface. Each Ethernet communication interface is configured to obtain an IP packet, access in parallel the memories in order to read the records, select a record having a destination IP address range containing the destination IP address of the IP packet, read the further record associated with the selected record from the further memory, and select the indicated Ethernet communication interface. The selected Ethernet communication interface is configured to transmit an Ethernet frame comprising the IP packet based on the next-hop enable data and next-hop data.
Claims
1. A processing system comprising: a hardware network accelerator, comprising: a plurality of Ethernet communication interfaces; a plurality of memories, wherein each of the plurality of memories is configured to store a plurality of records, each record comprising: enable data indicating whether the respective record contains valid data; and destination internet protocol (IP) data identifying a destination IP address range; a further memory, configured to store a plurality of further records, wherein each record is associated univocally with a respective further record, and wherein each further record comprises: next-hop data indicating a next-hop IP address; next-hop enable data indicating whether a respective destination IP address range may be reached directly or via the respective next-hop IP address; and network port data indicating one of the plurality of Ethernet communication interfaces; wherein each Ethernet communication interface is configured to: obtain an IP packet comprising a destination IP address; access in parallel the plurality of memories in order to sequentially read at least in part the records stored to each of the plurality of memories; compare the destination IP address with the destination IP data of the read records containing valid data in order to select a record having the destination IP address range containing the destination IP address; read the further record associated with the selected record from the further memory; and select one of the Ethernet communication interfaces based on the network port data of the read further record; wherein the selected Ethernet communication interface is configured to: determine a target IP address by selecting: a) the destination IP address when the next-hop enable data of the read further record indicate that the respective destination IP address range may be reached directly, or b) the next-hop IP address of the read further record when the next-hop enable data of the read further record 
indicate that the respective destination IP address range may be reached via the respective next-hop IP address; determine a target media access control (MAC) address for the target IP address; generate an Ethernet frame comprising an Ethernet header and as payload the IP packet, wherein the Ethernet header comprises as destination MAC address the target MAC address, and a source MAC address configured for the selected Ethernet communication interface; and transmit the Ethernet frame to an Ethernet network connected to the selected Ethernet communication interface; a communication system connecting a microprocessor to the hardware network accelerator; and the microprocessor, configured to program the records in the plurality of memories and the further records in the further memory.
2. The processing system according to claim 1, wherein each record comprises: a first data section comprising a first destination IP address included in the respective destination IP address range, and a first control data section comprising the enable data and a field for storing a subnet or network mask for the respective destination IP address range; and wherein each further record comprises: a second data section comprising the next-hop IP address, and a second control data section comprising the next-hop enable data and the network port data.
3. The processing system according to claim 2, wherein the hardware network accelerator comprises an address translation and dispatcher circuit configured to manage an address map, wherein each memory location in the plurality of memories and each memory location in the further memory is associated with a respective memory address in the address map, wherein the address translation and dispatcher circuit is configured to: receive a memory write request comprising a memory address in the address map and respective data to be stored; select one of the plurality of memories and the further memory based on the received memory address; select a memory location of the selected memory based on the received memory address; and store the data to be stored to the selected memory location of the selected memory.
4. The processing system according to claim 3, wherein the memory address comprises a first field, a second field and a third field; wherein the first field indicates whether the received data should be stored to the plurality of memories or the further memory; wherein, when the first field indicates that the received data should be stored to the plurality of memories, the second field indicates a record in the plurality of memories, and the third field indicates whether the received data should be stored to the first data section or the first control data section of the record indicated by the second field; and wherein, when the first field indicates that the received data should be stored to the further memory, the second field indicates a further record in the further memory, and the third field indicates whether the received data should be stored to the second data section or the second control data section of the further record indicated by the second field.
5. The processing system according to claim 4, wherein at least one of the first data section, the first control data section, the second data section and the second control data section is stored to a plurality of memory slots in the respective memory, and wherein the memory address comprises a fourth field indicating a respective memory slot of the respective plurality of memory slots.
6. The processing system according to claim 5, wherein each memory location of the plurality of memories and the further memory comprises a plurality of bytes, and wherein the memory address comprises a fifth field indicating a sub-set of the bytes of the memory slot indicated by the first field, the second field, the third field, and optionally the fourth field.
7. The processing system according to claim 3, wherein the communication system comprises a slave communication interface associated with the hardware network accelerator, wherein the communication system has a physical address range having a physical address sub-range associated with the slave communication interface, wherein the slave communication interface is configured to: receive a write request from the microprocessor, the write request comprising an address in the physical address sub-range and respective transmitted data; extract a memory address in the address map of the address translation and dispatcher circuit from the address in the physical address sub-range; and provide a memory write request to the address translation and dispatcher circuit, the memory write request comprising the extracted memory address and the transmitted data.
8. The processing system according to claim 1, wherein each Ethernet communication interface comprises a search engine comprising a plurality of search circuits, wherein each search circuit is configured to: access a respective memory of the plurality of memories in order to sequentially read at least in part the records stored to the respective memory of the plurality of memories; and compare the destination IP address with the destination IP data of the respective read records containing valid data in order to determine whether the record has a destination IP address range containing the destination IP address.
9. The processing system according to claim 8, wherein each search circuit is configured to sequentially read the records stored to the respective memory of the plurality of memories until the enable data indicate that the respective record does not contain valid data or at least one of the search circuits determines that a record has a destination IP address range containing the destination IP address.
10. The processing system according to claim 1, wherein the hardware network accelerator comprises a memory controller configured to decide which of the Ethernet communication interfaces may access the plurality of memories and the further memory.
11. The processing system according to claim 1, wherein the processing system is disposed on an integrated circuit.
12. A vehicle electronic system comprising: a plurality of processing systems, each processing system comprising: a hardware network accelerator, comprising: a plurality of Ethernet communication interfaces; a plurality of memories, wherein each of the plurality of memories is configured to store a plurality of records, each record comprising: enable data indicating whether the respective record contains valid data; and destination internet protocol (IP) data identifying a destination IP address range; a further memory, configured to store a plurality of further records, wherein each record is associated univocally with a respective further record, and wherein each further record comprises: next-hop data indicating a next-hop IP address; next-hop enable data indicating whether a respective destination IP address range may be reached directly or via the respective next-hop IP address; and network port data indicating one of the plurality of Ethernet communication interfaces; wherein each Ethernet communication interface is configured to: obtain an IP packet comprising a destination IP address; access in parallel the plurality of memories in order to sequentially read at least in part the records stored to each of the plurality of memories; compare the destination IP address with the destination IP data of the read records containing valid data in order to select a record having the destination IP address range containing the destination IP address; read the further record associated with the selected record from the further memory; and select one of the Ethernet communication interfaces based on the network port data of the read further record; wherein the selected Ethernet communication interface is configured to: determine a target IP address by selecting: a) the destination IP address when the next-hop enable data of the read further record indicate that the respective destination IP address range may be reached directly, or b) the next-hop IP address of the read 
further record when the next-hop enable data of the read further record indicate that the respective destination IP address range may be reached via the respective next-hop IP address; determine a target media access control (MAC) address for the target IP address; generate an Ethernet frame comprising an Ethernet header and as payload the IP packet, wherein the Ethernet header comprises as destination MAC address the target MAC address, and a source MAC address configured for the selected Ethernet communication interface; and transmit the Ethernet frame to an Ethernet network connected to the selected Ethernet communication interface; a communication system connecting a microprocessor to the hardware network accelerator; and the microprocessor, configured to program the records in the plurality of memories and the further records in the further memory; and an Ethernet communication system connecting the processing systems.
13. A method of operating a processing system, the processing system comprising a microprocessor, a communication system connecting the microprocessor to a hardware network accelerator, and the hardware network accelerator comprising a plurality of Ethernet communication interfaces, a plurality of memories, and a further memory, the method comprising: programming, by the microprocessor, a plurality of records in the plurality of memories, each record comprising enable data indicating whether the respective record contains valid data, and destination internet protocol (IP) data identifying a destination IP address range; programming, by the microprocessor, further records in the further memory, each of the plurality of records associated univocally with a respective further record, and each further record comprising next-hop data indicating a next-hop IP address, next-hop enable data indicating whether a respective destination IP address range may be reached directly or via the respective next-hop IP address, and network port data indicating one of the plurality of Ethernet communication interfaces; obtaining, by one of the Ethernet communication interfaces, an IP packet comprising a destination IP address; using, by the one of the Ethernet communication interfaces, the plurality of records in the plurality of memories and the respective further records in the further memory to select one of the Ethernet communication interfaces; generating, by the selected Ethernet communication interface, an Ethernet frame; and transmitting, by the selected Ethernet communication interface, the Ethernet frame to an Ethernet network connected to the selected Ethernet communication interface.
14. The method according to claim 13, wherein the one of the Ethernet communication interfaces is further configured to: access in parallel the plurality of memories in order to sequentially read at least in part the records stored to each of the plurality of memories; compare the destination IP address with the destination IP data of the read records containing valid data in order to select a record having the destination IP address range containing the destination IP address; read the further record associated with the selected record from the further memory; and select one of the Ethernet communication interfaces based on the network port data of the read further record.
15. The method according to claim 13, wherein the selected Ethernet communication interface is configured to: determine a target IP address by selecting the destination IP address in response to the next-hop enable data of the read further record indicating that the respective destination IP address range is directly reachable; determine a target media access control (MAC) address for the target IP address; and generate the Ethernet frame comprising an Ethernet header and as payload the IP packet, the Ethernet header comprising as destination MAC address the target MAC address, and a source MAC address configured for the selected Ethernet communication interface.
16. The method according to claim 13, wherein the selected Ethernet communication interface is configured to: determine a target IP address by selecting the next-hop IP address of the read further record in response to the next-hop enable data of the read further record indicating that the respective destination IP address range is reachable via the respective next-hop IP address; determine a target media access control (MAC) address for the target IP address; and generate the Ethernet frame comprising an Ethernet header and as payload the IP packet, wherein the Ethernet header comprises as destination MAC address the target MAC address, and a source MAC address configured for the selected Ethernet communication interface.
17. The method according to claim 13, wherein the hardware network accelerator comprises an address translation and dispatcher circuit configured to manage an address map, and each memory location in the plurality of memories and each memory location in the further memory is associated with a respective memory address in the address map, the method further comprising: receiving, by the address translation and dispatcher circuit, a memory write request comprising a memory address in the address map and respective data to be stored; selecting, by the address translation and dispatcher circuit, one of the plurality of memories and the further memory based on the received memory address; selecting, by the address translation and dispatcher circuit, a memory location of the selected memory based on the received memory address; and storing, by the address translation and dispatcher circuit, the data to be stored to the selected memory location of the selected memory.
18. The method according to claim 17, wherein the communication system comprises a slave communication interface associated with the hardware network accelerator, and the communication system has a physical address range having a physical address sub-range associated with the slave communication interface, the method further comprising: receiving, by the slave communication interface, a write request from the microprocessor, the write request comprising an address in the physical address sub-range and respective transmitted data; extracting, by the slave communication interface, a memory address in the address map of the address translation and dispatcher circuit from the address in the physical address sub-range; and providing, by the slave communication interface, the memory write request to the address translation and dispatcher circuit, the memory write request comprising the extracted memory address and the transmitted data.
19. The method according to claim 13, wherein the one of the Ethernet communication interfaces comprises a search engine comprising a plurality of search circuits, the method further comprising: accessing, by each search circuit, a respective memory of the plurality of memories in order to sequentially read at least in part the records stored to the respective memory of the plurality of memories; and comparing, by each search circuit, the destination IP address with the destination IP data of the respective read records containing valid data in order to determine whether the record has a destination IP address range containing the destination IP address.
20. The method according to claim 19, wherein the method further comprises: sequentially reading, by each search circuit, the records stored to the respective memory of the plurality of memories until the enable data indicate that the respective record does not contain valid data or the search circuit determines that a record has a destination IP address range containing the destination IP address.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0065] Embodiments of the present disclosure will now be described with reference to the annexed drawings, which are provided purely by way of non-limiting example and in which:
DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
[0080] In the following description, numerous specific details are given to provide a thorough understanding of embodiments. The embodiments can be practiced without one or several specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the embodiments.
[0081] Reference throughout this specification to one embodiment or an embodiment means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, the appearances of the phrases in one embodiment or in an embodiment in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
[0082] The headings provided herein are for convenience only and do not interpret the scope or meaning of the embodiments.
[0083] In the following
[0084] As mentioned before, various embodiments of the present disclosure provide solutions for integrated network accelerators, in particular in the context of the organization and management of IP routing tables.
[0086] Specifically, such a network accelerator 4 may be used as a resource 106 in the processing system 10 shown in
[0087] Specifically, in the embodiment considered, the network accelerator 4 comprises at least one communication interface 40, often also referred to as a network port. For example, in
[0088] In various embodiments, the communication interface 40 obtains an IP packet comprising an IP header IP_H, which in turn includes a destination IP address (see
[0089] For example, in various embodiments, the communication interface 40 is configured to receive data from the communication system 114 (not shown in
[0090] Conversely, in order to receive the IP packet from the network 20 to which the communication interface 40 is connected, the communication interface 40 may be configured to receive an Ethernet frame and extract the IP packet from the payload E_D of the Ethernet frame.
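By way of a purely illustrative software sketch (the claimed implementation is hardware), extracting the IP packet from a received Ethernet frame amounts to checking the EtherType and stripping the Ethernet header E_H to obtain the payload E_D; the 14-byte header length and IPv4-only handling are assumptions for this example:

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETH_HEADER_LEN = 14  # destination MAC (6) + source MAC (6) + EtherType (2)

def extract_ip_packet(frame):
    """Return the IP packet carried in the payload E_D of an Ethernet
    frame, or None if the frame does not carry IPv4 traffic."""
    if len(frame) < ETH_HEADER_LEN:
        return None
    # EtherType sits at byte offset 12, big-endian 16-bit
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return None
    return frame[ETH_HEADER_LEN:]
```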
[0091] In various embodiments, an IP packet obtained by a communication interface 40 may be destined to (and should thus be routed to): [0092] the communication system 114, e.g., to a processing core 102 or another resource 106 within the processing system 10 (either directly or via a DMA transfer), e.g., by extracting the respective payload IP_D or D; [0093] to the network 20 to which the communication interface 40 is connected; or [0094] in case of a plurality of communication interfaces 40, to another communication interface 40 of the network accelerator 4.
[0095] Accordingly, in various embodiments, the network accelerator 4 may also be configured to implement an IP router. In this case, the generation of the IP packet within the processing system 10 and the forwarding of IP packets to the communication system 114 may be purely optional.
[0096] In various embodiments, the network accelerator 4, e.g., each communication interface, may also implement firewall functions, such as a destination IP address and/or (e.g., TCP and/or UDP) destination port filtering, and/or a source IP address and/or (e.g., TCP and/or UDP) source port filtering.
[0097] As described in the foregoing, in order to forward IP packets, the processing system 10 uses an IP routing table. For example, the IP routing table could be managed in software, wherein an IP packet received by a communication interface 40 from a respective network is provided to a processing core 102, which then determines via software instructions whether the IP packet should be processed internally or forwarded to another communication interface 40.
[0098] Conversely, in various embodiments, the network accelerator 4 is configured to manage the routing of the IP packet directly in hardware. Specifically, in various embodiments, the network accelerator 4 comprises a routing table search engine 400 configured to analyze a routing table stored to a memory 44. In general, the search engine 400 may be common for the complete network accelerator 4 or (as shown in
[0099] Accordingly, a processing core 102 may configure the IP routing table by writing the content of the memory 44. For example, for this purpose, the memory controller 42 may have associated a communication interface 46 connected to the communication system 114 of the processing system 10. For example, the communication interface 46 may be a slave interface of the communication system 114, such as a peripheral bridge. In general, the communication interface 46 may also be included in the memory controller 42.
[0100] Accordingly, once having obtained (i.e., received or generated) an IP packet, a communication interface 40 may access the memory 44 and read the IP routing table in order to decide how to forward the IP packet (to a network 20, another communication interface 40 or to a further circuit within the processing system 10, such as a processing core 102).
[0101] In various embodiments, in order to improve the speed of the read operation of the IP routing table, the memory 44 may indeed be implemented with a plurality of NB memories 44.sub.1 to 44.sub.NB, such as a plurality of NB RAM banks. Accordingly, in this case, each search engine 400 may indeed comprise NB search circuits 400.sub.1 to 400.sub.NB, wherein each search circuit is configured to access a respective memory 44.sub.1 to 44.sub.NB.
[0102] Accordingly, once having processed the routing table, the search engine 400 may decide whether the IP packet should be forwarded to the respective network 20, another communication interface 40 or to a circuit 102/106 within the processing system 10. For example, in order to provide the data to a processing core 102, a resource 106 or another communication interface 40, the communication interface 40 may comprise an integrated DMA interface or may use a general-purpose DMA controller 110 configured to store the IP packet or the respective payload IP_D via a DMA transfer to the memory 104b or one or more dedicated RAM memories within the network accelerator 4. Conversely, in order to forward the IP packet to the network 20, the communication interface 40 may generate an Ethernet frame comprising the IP packet as payload E_D and an Ethernet header E_H comprising its own MAC address as source address and the MAC address of the next-hop as target address. In various embodiments, the communication interface 40 may also be configured to manage a plurality of virtual Ethernet interfaces, e.g., by configuring a plurality of source MAC addresses.
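The frame generation described above can be sketched in software as follows; this is a minimal illustration (not the hardware implementation), assuming IPv4 and omitting the frame check sequence, with the next-hop MAC address as destination and the interface's own MAC address as source:

```python
def build_ethernet_frame(dst_mac, src_mac, ip_packet):
    """Assemble an Ethernet frame: header E_H (destination MAC = next-hop
    MAC, source MAC = the interface's own MAC, EtherType 0x0800 for IPv4)
    followed by the IP packet as payload E_D."""
    assert len(dst_mac) == 6 and len(src_mac) == 6
    return dst_mac + src_mac + b"\x08\x00" + ip_packet
```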
[0104] Specifically, in the embodiment considered, the memory 44 comprises a given number N of routing table entry slots RTE.sub.1 to RTE.sub.N, wherein each slot RTE may store a respective routing table entry. For example, in case the memory 44 comprises NB memory banks 44.sub.1 to 44.sub.NB, each memory bank may store N/NB slots. For example, the first memory bank 44.sub.1 may be arranged to store the slots RTE.sub.1 to RTE.sub.N/NB.
[0105] Specifically, in the embodiment considered, each routing table entry RTE comprises the following fields: [0106] a destination IP address field IP_DA adapted to specify a destination IP address; [0107] a next-hop IP address field IP_NH; and [0108] control data CNTRL.
[0109] For example, in various embodiments, the control data CNTRL comprise an enable field EN indicating that the respective routing table entry RTE contains valid data. Moreover, in various embodiments, the control data CNTRL comprise a field SUBNET for indicating a subnet or netmask for the destination IP address, which thus permits specifying (together with the destination IP address field IP_DA) a destination IP address range. In various embodiments, the control data CNTRL also comprise a field NH_EN, such as a next-hop address enable bit, indicating whether: [0110] a destination IP address in the destination IP address range may be reached directly, which implies that the next-hop address should correspond to the destination IP address of the IP packet, or [0111] a destination IP address in the destination IP address range may not be reached directly, which implies that the next-hop address should correspond to the address indicated in the next-hop IP address field IP_NH.
[0112] Accordingly, the control data CNTRL may indicate whether to use the IP address of the IP packet or the next-hop IP address field IP_NH as target for the Ethernet communication, which is then mapped by the communication interface 40 to a respective MAC address to be inserted in the Ethernet frame. Moreover, in case of a plurality of (physical or virtual) communication interfaces, a field DST_PORT of the control data CNTRL may specify a (physical or virtual) communication interface 40 to be used to transmit the IP packet.
[0113] In general, the number of physical memory slots/data words NW occupied by each routing table entry RTE may depend on the length of the IP address (i.e., whether IPv4 or IPv6 addresses are supported) and the number of bits of each memory slot.
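As a purely illustrative software model of the entry layout described above (the actual entries are hardware memory words; the IPv4 field widths and the encoding of SUBNET as a prefix length are assumptions of this sketch), a routing table entry RTE and the associated range check could look like:

```python
from dataclasses import dataclass

@dataclass
class RoutingTableEntry:
    """Illustrative model of one routing table entry RTE (IPv4 assumed)."""
    ip_da: int    # destination IP address field IP_DA (32-bit)
    ip_nh: int    # next-hop IP address field IP_NH (32-bit)
    # control data CNTRL:
    en: bool      # enable field EN: entry contains valid data
    subnet: int   # SUBNET field, modeled here as a prefix length (0..32)
    nh_en: bool   # NH_EN: False = reachable directly, True = use IP_NH
    dst_port: int # DST_PORT: index of the communication interface 40

def matches(entry, dst_ip):
    """Check whether dst_ip falls in the destination IP address range
    given by IP_DA and the SUBNET field."""
    if not entry.en:
        return False
    mask = (0xFFFFFFFF << (32 - entry.subnet)) & 0xFFFFFFFF if entry.subnet else 0
    return (dst_ip & mask) == (entry.ip_da & mask)
```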
[0114] Accordingly, during a search process, a search engine 400 inside each communication interface 40 and having a parallelism equal to the number NB of memory banks, may access in parallel all NB memory banks 44.sub.1 to 44.sub.NB, sequentially read the stored entries RTE inside each memory bank and compare the specified destination IP address range (as indicated by the destination IP address IP_DA and the control data CNTRL) with the destination IP address of the IP packet. In case of a match, the search process may be stopped and the control data CNTRL may be used to determine the target of the communication, e.g., for determining a communication interface 40 to be used to transmit the IP packet, and the IP address of the next target (i.e., the final destination or a gateway node indicated in the next-hop IP address field IP_NH).
[0115] Accordingly, if required, the communication interface 40 may forward the IP packet to the indicated communication interface 40. The indicated (virtual or physical) communication interface may then determine the destination MAC address associated with the IP address of the next target, generate the respective Ethernet frame (by adding the destination MAC address and its own MAC address as source MAC address) and transmit the Ethernet frame to the connected network 20. For example, as well known in the art, the communication interface 40 may manage for this purpose a table of devices connected to the respective network 20, wherein this table comprises the MAC addresses of the devices and the respective IP addresses. For example, such a table may be obtained via the Address Resolution Protocol (ARP) and is usually called ARP cache.
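The search and next-hop selection described above can be sketched in software as follows; this is an illustrative model only (the hardware scans the NB banks in parallel, whereas Python iterates them sequentially), and the tuple encoding of an entry, with a prefix length standing in for the SUBNET field, is an assumption of this example:

```python
# An entry is modeled as the tuple
# (en, ip_da, prefix_len, nh_en, ip_nh, dst_port); each bank is a list.

def route_lookup(banks, dst_ip):
    """Scan the NB memory banks for an entry whose destination IP address
    range contains dst_ip. Returns (target_ip, dst_port) or None on miss.
    A bank scan stops at the first invalid (EN == 0) entry."""
    for bank in banks:
        for en, ip_da, prefix_len, nh_en, ip_nh, dst_port in bank:
            if not en:
                break  # remaining entries in this bank are unused
            mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF if prefix_len else 0
            if (dst_ip & mask) == (ip_da & mask):
                # NH_EN selects between the packet's own destination IP
                # (directly reachable) and the configured next-hop IP_NH.
                target_ip = ip_nh if nh_en else dst_ip
                return target_ip, dst_port
    return None
```

The returned target IP address would then be resolved to a destination MAC address via the ARP cache, as described above.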
[0116] The inventors have observed that the previously described organization of the memory 44 has several disadvantages. In fact, in order to implement the search, it is sufficient that a search engine 400 is able to just determine the destination IP address range (as specified by the data IP_DA and CNTRL) of the routing table entries RTE having stored valid data (as specified by the data CNTRL). Accordingly, in order to implement a parallel search function, parallel access to the complete routing table entries RTE (in particular the data IP_NH) is not required, but a parallel access to the destination IP address IP_DA and part of the control data CNTRL is sufficient.
[0117] Accordingly,
[0118] Specifically, in the embodiment considered, the data of each routing data entry slots RTE.sub.1 to RTE.sub.N adapted to be stored to the memory 44 are now organized into two parts: [0119] a first part RTEA having stored the data used to determine whether the routing data apply to the destination IP address of the IP packet; and [0120] a second part RTEB having stored data specifying the routing behavior of the IP packet.
[0121] Accordingly, in line with the foregoing, the data of a given routing data entry slot RTEA may comprise: [0122] an enable field EN, such as an enable bit, indicating whether the respective routing table entry RTEA contains valid data; and [0123] data specifying a destination address range, e.g., comprising an IP address IP_DA and a field SUBNET specifying a subnet mask or alternatively a netmask.
[0124] For example, as shown in
[0125] Conversely, the data of a given routing data entry slot RTEB may comprise: [0126] the next-hop IP address field IP_NH; and [0127] further control data IP_NH_CNTRL.
[0128] For example, as shown in
[0129] In the embodiment considered, the first part of routing data RTEA may thus be stored to respective routing data slots RTEA.sub.1 to RTEA.sub.N in a memory 48 and the second part of routing data RTEB may be stored to respective routing data slots RTEB.sub.1 to RTEB.sub.N in a memory 50.
[0130] Specifically, as shown in
[0131] In various embodiments, instead of filling sequentially the slots of data RTEA.sub.1 to RTEA.sub.N in the various memories 48.sub.1 to 48.sub.NB, the data are stored in an interleaved manner to the memories 48.sub.1 to 48.sub.NB, i.e., the first data RTEA.sub.1 associated with a first routing data entry RTE.sub.1 are stored to the first slot of the first memory 48.sub.1, the second data RTEA.sub.2 associated with a second routing data entry RTE.sub.2 are stored to the first slot of the second memory 48.sub.2, . . . , and the last data RTEA.sub.N associated with a last routing data entry RTE.sub.N are stored to the last slot of the last memory 48.sub.NB.
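The interleaved placement described above amounts to a simple modulo/division mapping from a global entry index to a (bank, slot) pair; a sketch, with NB as an illustrative free parameter:

```python
NB = 4  # illustrative number of memory banks 48_1..48_NB

def interleaved_location(i, nb=NB):
    # Entry index i maps to bank (i mod nb) and, within that bank, to
    # slot (i div nb): entry 0 -> bank 0/slot 0, entry 1 -> bank 1/slot 0,
    # ..., entry nb -> bank 0/slot 1, and so on.
    return i % nb, i // nb
```

With this placement, the NB search circuits reading the same slot number in their respective banks see NB consecutive routing table entries at once.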
[0132] Conversely, since a parallel access to the routing data RTEB may not be required, the data RTEB may be stored to a single memory 50. Accordingly, even though the memory 50 could indeed be implemented with a plurality of physical RAM memory banks, these memory banks are connected to the same memory interface of the memory controller 42.
[0133] Accordingly, in the embodiment considered, each search circuit 400.sub.1 to 400.sub.NB of a given communication interface 40 may access (via the memory controller 42) a respective memory 48.sub.1 to 48.sub.NB, and sequentially read the respective entries RTEA with an enable field EN indicating that the respective routing table entry RTEA contains valid data. In parallel the search engine 400 may determine whether the destination IP address of the IP packet is included in the destination address range indicated by the record RTEA.
[0134] Accordingly, once having found and selected a given record RTEA, the respective search circuit 400.sub.1 to 400.sub.NB or another circuit of the search engine 400 also knows the index of the selected record RTEA, and may use this index in order to read the respective data RTEB from the memory 50. In general, the data RTEB may be stored in any suitable manner to the memory 50, which still permits a univocal mapping of a given selected record RTEA to the respective record RTEB. For example: [0135] the records RTEA may be stored sequentially to the memories 48 and the records RTEB may be stored sequentially to the memory 50; [0136] the records RTEA may be stored in an interleaved manner to the memories 48 and the records RTEB may be stored sequentially to the memory 50; or [0137] the records RTEA may be stored in an interleaved manner to the memories 48 and the records RTEB may be stored in an interleaved manner to the memory 50.
[0138] Accordingly, in the embodiment considered, the memory 48 stores up to N records RTEA indicating the supported destination IP addresses IP_DA, the respective associated subnet mask SUBNET, and the entry enable field EN of the routing table entry RTE. This memory 48 may be implemented with a selectable level NB of parallelism. Preferably, an interleaved scheme is used to store the various records RTEA. Accordingly, the search engines 400 of the various communication interfaces 40.sub.1 to 40.sub.NP may have a parallelism equal to the number NB of memory banks 48.sub.1 to 48.sub.NB, and may sequentially read the content of the memory 48, which is shared among all the communication interfaces 40. As soon as a match is found, the search process may be stopped and the index of the matching entry may be used to access the respective record RTEB in the memory 50 of the same routing table entry RTE, wherein the record RTEB has stored the respective next-hop address IP_NH and the respective additional control data CNTRL. Accordingly, also the memory 50 is shared among all communication interfaces 40, and may be implemented using a single memory bank.
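A minimal software model of this first-match lookup is sketched below, assuming entries are interleaved across the banks as described and that matching is a prefix comparison against IP_DA; the entry tuples and function names are illustrative, not the hardware implementation.

```python
def prefix_match(addr, ip_da, prefix_len, width=32):
    # An address matches when its prefix_len MSBs equal those of IP_DA.
    if prefix_len == 0:
        return True
    mask = ((1 << prefix_len) - 1) << (width - prefix_len)
    return (addr & mask) == (ip_da & mask)

def parallel_first_match(dest_ip, banks, nb):
    # banks: nb lists of (en, ip_da, prefix_len) entries, stored interleaved.
    # Each outer iteration models one parallel read: the nb search circuits
    # each compare one entry, and the search stops at the first match.
    depth = max(len(b) for b in banks)
    for slot in range(depth):
        for bank_idx in range(nb):
            if slot < len(banks[bank_idx]):
                en, ip_da, plen = banks[bank_idx][slot]
                if en and prefix_match(dest_ip, ip_da, plen):
                    # recover the global entry index used to read memory 50
                    return slot * nb + bank_idx
    return None
```

The returned index identifies the routing table entry RTE, and hence the record RTEB to read from the memory 50.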
[0139] Generally, a routing table could also include a plurality of routes which could apply to a given destination IP address. Accordingly, in various embodiments, the search circuits 400.sub.1 to 400.sub.NB could return all matching records RTEA, and the search engine 400 could select a best matching route, e.g., by using a metric or cost stored with the control data CNTRL of the record RTEB. Conversely, as indicated in the foregoing, in various embodiments, the first matching route is selected. In this case, the records RTE (and accordingly the respective records RTEA and RTEB) should be ordered accordingly. For example, in typical routing methods, the longest matching mask is selected, i.e., the smallest matching destination address range IP_DA. Accordingly, in this case, the routing table entries RTE should be already stored to the memories 48 and 50 in the requested order. For example, such a re-ordering of the routing table may be managed by a processing core 102 of the processing system 10, which may write the memories 48 and 50.
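The re-ordering step mentioned above, as performed in software before the routing table is written to the memories 48 and 50, can be sketched as a sort by descending prefix length, so that a first-match search implements longest-prefix matching (route data below are purely illustrative):

```python
# Hypothetical routing table: (destination network, prefix length).
routes = [
    ("0.0.0.0",   0),   # default route, matches everything
    ("10.0.0.0",  8),
    ("10.1.2.0", 24),
]

# Most specific (longest) prefixes first, so the first match is the best one.
ordered = sorted(routes, key=lambda r: r[1], reverse=True)
```

After this ordering, 10.1.2.5 hits the /24 entry before the /8 or the default route.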
[0140] Accordingly, in various embodiments, the content of the memories 48 and 50 is programmable via a processing core 102 of the processing system 10. In general, any suitable address mapping may be used to map a given sub-range of the physical address range of the communication system 114 to the physical address range of the memories 48.sub.1 to 48.sub.NB and 50. In general, it is not required that this mapping indeed reflects the order of the storage locations within the memories 48.sub.1 to 48.sub.NB and 50.
[0141] For example,
[0142] For example, in the embodiment shown in
[0145] Similarly, in the embodiment shown in
[0148] Accordingly, in the embodiment considered, the memory 48 may store data IP_DA0 to IP_DA(N-1) for the N destination IP addresses, and data IP_DA0_CNTRL to IP_DA(N-1)_CNTRL for the respective control data, i.e., the memory 48 has at least 8·N bytes of space, which may be divided into NB memory banks 48.sub.1 to 48.sub.NB. Similarly, the memory 50 may store data IP_NH0 to IP_NH(N-1) for the N next-hop addresses, and data IP_NH0_CNTRL to IP_NH(N-1)_CNTRL for the respective control data, i.e., also the memory 50 has at least 8·N bytes of space and may be implemented with a single memory bank. Accordingly, when using IPv4 addresses with a 32-bit memory, a total of four memory slots are occupied by the data of each routing table entry RTE (two memory slots in the memory 48 for the data RTEA and two memory slots in the memory 50 for the data RTEB).
[0149] For example, the memory controller 42 may perform an address translation operation, wherein the data IP_DA0 start at a given start address and the following addresses are mapped to the data IP_DA1 to IP_DA(N-1). Following addresses may then be mapped sequentially to the data IP_DA0_CNTRL to IP_DA(N-1)_CNTRL, the data IP_NH0 to IP_NH(N-1) and the data IP_NH0_CNTRL to IP_NH(N-1)_CNTRL. Accordingly, in the embodiment considered: [0150] the group of data IP_DA0 to IP_DA(N-1) have associated a first address subrange; [0151] the group of data IP_DA0_CNTRL to IP_DA(N-1)_CNTRL have associated a second address subrange; [0152] the group of data IP_NH0 to IP_NH(N-1) have associated a third address subrange; and [0153] the group of data IP_NH0_CNTRL to IP_NH(N-1)_CNTRL have associated a fourth address subrange.
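Assuming the IPv4 layout above (one 32-bit slot per datum, the four groups laid out back to back, illustrative values for N and the start address), the four address sub-ranges can be sketched as:

```python
N = 16       # number of routing table entries (illustrative)
WORD = 4     # 32-bit memory, byte-addressed
BASE = 0x0   # start address of the map (assumption)

# Group base addresses: IP_DA, then IP_DA_CNTRL, then IP_NH, then IP_NH_CNTRL.
IP_DA_BASE       = BASE
IP_DA_CNTRL_BASE = IP_DA_BASE + N * WORD
IP_NH_BASE       = IP_DA_CNTRL_BASE + N * WORD
IP_NH_CNTRL_BASE = IP_NH_BASE + N * WORD

def addr_ip_da(i):       return IP_DA_BASE + i * WORD
def addr_ip_da_cntrl(i): return IP_DA_CNTRL_BASE + i * WORD
def addr_ip_nh(i):       return IP_NH_BASE + i * WORD
def addr_ip_nh_cntrl(i): return IP_NH_CNTRL_BASE + i * WORD
```

As noted in the text, the groups may also be separated by gaps or arranged in a different order; the back-to-back layout here is just one possibility.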
[0154] In general, a given group of data may follow immediately the previous group of data or the address map may have a gap between the respective address subranges. In various embodiments, the groups of data may also have a different order.
[0155] Conversely, in the embodiment shown in
[0158] Similarly, in the embodiment shown in
[0161] Accordingly, in the embodiment considered, the memory 48 may store data IP_DA_W0 to IP_DA_W3 for each of the N destination IP addresses IP_DA0 to IP_DA(N-1), and data IP_DA0_CNTRL to IP_DA(N-1)_CNTRL for the respective control data, i.e., the memory 48 has at least 20·N bytes of space, wherein the respective records RTE may be stored into NB memory banks 48.sub.1 to 48.sub.NB. Similarly, the memory 50 may store data IP_NH_W0 to IP_NH_W3 for each of the N next-hop addresses IP_NH0 to IP_NH(N-1), and data IP_NH0_CNTRL to IP_NH(N-1)_CNTRL for the respective control data, i.e., also the memory 50 has at least 20·N bytes of space and may be implemented with a single memory bank. Accordingly, when supporting IPv6 addresses with a 32-bit memory, a total of ten memory slots are occupied by the data of each routing table entry RTE (five memory slots in the memory 48 for the data RTEA and five memory slots in the memory 50 for the data RTEB).
[0162] For example, the memory controller 42 may perform an address translation operation, wherein the data IP_DA0_W0 of the first destination IP address IP_DA0 start at a given start address, the following addresses are mapped to the other data words IP_DA0_W1 to IP_DA0_W3 of the first destination IP address IP_DA0, and similarly to the data words IP_DA_W0 to IP_DA_W3 of the following destination IP addresses IP_DA1 to IP_DA(N-1). The following addresses are then mapped sequentially to the data IP_DA0_CNTRL to IP_DA(N-1)_CNTRL, the data words IP_NH_W0 to IP_NH_W3 of the next-hop addresses IP_NH0 to IP_NH(N-1) (starting, e.g., with the data words IP_NH0_W0 to IP_NH0_W3), and the data IP_NH0_CNTRL to IP_NH(N-1)_CNTRL. Thus, also in this case, the address range is organized in four groups for the data IP_DA, IP_DA_CNTRL, IP_NH and IP_NH_CNTRL.
[0163] In both cases, the control data IP_DA_CTRL may comprise, in addition to the entry enable/valid bit EN, a field SUBNET for specifying the subnet mask of the address. The length of this field may be application dependent. For example, the inventors have observed that for typical applications the field SUBNET may have: [0164] in case only IPv4 addresses are supported, 6 bits for indicating an IPv4 subnet; [0165] in case also IPv6 addresses are supported, 8 bits for indicating an IPv6 prefix.
[0166] For example, in case of IPv4 addresses, the respective value stored to the field SUBNET may indicate an integer value corresponding to the number of Most Significant Bits (MSB) of the address IP_DA which are held valid, and the remaining Least Significant Bits (LSB) of the address IP_DA may be masked for determining the respective destination IP address range of the routing table entry. For example, the binary value 011000 of the field SUBNET may indicate the subnet /24, which corresponds to a netmask of 255.255.255.0.
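The prefix-length interpretation of the SUBNET field translates into a netmask as sketched below (a software illustration of the masking rule, not the hardware comparator):

```python
def subnet_field_to_netmask(subnet_field, width=32):
    # The SUBNET field is read as a prefix length: that many MSBs of IP_DA
    # are held valid, and the remaining LSBs are masked out.
    if subnet_field == 0:
        return 0
    return ((1 << subnet_field) - 1) << (width - subnet_field)

# Binary 011000 == 24, i.e. the /24 subnet, i.e. netmask 255.255.255.0.
mask = subnet_field_to_netmask(0b011000)
dotted = ".".join(str((mask >> s) & 0xFF) for s in (24, 16, 8, 0))
```

A destination address then falls in the entry's range when `addr & mask == ip_da & mask`.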
[0167] Conversely, the control data IP_NH_CTRL may comprise, in addition to the next-hop enable field NH_EN, a field DST_PORT for specifying a communication interface/network port 40 to be used to transmit the IP packet. The length of this field depends on the number of supported (physical or virtual) communication interfaces 40. For example, in various embodiments, the field DST_PORT may have 6 bits, which permits supporting up to 64 (physical or virtual) communication interfaces 40.
[0168] Accordingly, in the embodiments considered, the memory controller 42 may be configured to implement an address translation and provide an address range, which is addressable by one or more of the processing cores 102 of the processing system 10. Those of skill in the art will appreciate that the address map of the memory controller 42 may also be organized differently. For example, the data IP_DA, IP_DA_CTRL, IP_NH and IP_NH_CTRL of a given routing table entry RTE may have consecutive addresses.
[0169] Accordingly, the processing core 102 may use the address map provided by the memory controller 42 in order to access via software instructions the memory slots of the memories 48 (in particular the respective memory banks 48.sub.1 to 48.sub.NB) and 50. As mentioned before, indeed the various memory slots described with respect to
[0170] Accordingly, a processing core 102 and/or another processing system 10 may be configured to implement a routing algorithm, such as Open Shortest Path First (OSPF) or Border Gateway Protocol (BGP), thereby defining the content of the routing table entries RTE and the respective order. Next, the processing core 102 and/or the other processing system 10 may determine the respective records RTEA and RTEB and store the respective data to the memories 48 (memories 48.sub.1 to 48.sub.NB) and 50 by using the address map provided by the memory controller 42.
[0171] Specifically, as will be described in greater detail in the following, in various embodiments, the processing core 102 (or another master circuit connected to the communication system 114) may read or write the content of the memories 48 and 50 by sending a read or write request to the communication system 114, wherein the request comprises an address of a sub-range managed by the communication interface 46. Thus, the address map of the physical address range of the communication system 114 may also be different from the address map provided by the address translation circuit. For example, in various embodiments, the address map of the translation circuit starts at 0, while the slave interface 46 may have associated an address range starting at a given start address/offset and having a dimension corresponding to the dimension of the address map provided by the circuit. Accordingly, in this case, the slave interface 46 may receive a request and generate the address ADDR provided to the address translation circuit by removing the start address/offset from the address received with the request.
[0172] In addition to or as alternative to a slave interface, the interface 46 may also comprise an integrated DMA controller, whereby the processing core 102 may program the routing table by storing the respective data to the memory 104b and the integrated DMA controller may automatically transfer the routing table from the memory 104b to the address translation circuit 424.
[0173]
[0174] Specifically, as described in the foregoing, in various embodiments, each communication interface 40 comprises a respective search engine 400, which comprises a plurality of NB search circuits 400.sub.1 to 400.sub.NB. Accordingly, each communication interface 40/search engine 400 generates NB sets of control signals for performing in parallel read operations from the NB memory banks 48.sub.1 to 48.sub.NB. Specifically, the first sets of control signals generated by the first search circuits 400.sub.1 of the communication interfaces 40.sub.1 to 40.sub.NP are provided to a first arbiter 420.sub.1, which selects a search circuit 400.sub.1 permitted to access the first memory bank 48.sub.1. Similarly, the second sets of control signals generated by the second search circuits 400.sub.2 of the communication interfaces 40.sub.1 to 40.sub.NP are provided to a second arbiter 420.sub.2, which selects a search circuit 400.sub.2 permitted to access the second memory bank 48.sub.2. The same connections are also applied for the other control signals, e.g., the last sets of control signals generated by the last search circuits 400.sub.NB of the communication interfaces 40.sub.1 to 40.sub.NP are provided to a last arbiter 420.sub.NB, which selects a search circuit 400.sub.NB permitted to access the last memory bank 48.sub.NB.
[0175] Similarly, in various embodiments, each communication interface 40/search engine 400 generates a further set of control signals for performing a read operation from the memory bank 50. In general, these further control signals may be generated directly by each search circuit 400.sub.1 to 400.sub.NB of a given search engine 400 or a shared circuit of the search engine 400. Accordingly, the further sets of control signals generated by the communication interfaces 40.sub.1 to 40.sub.NP are provided to a further arbiter 422, which selects a communication interface 40.sub.1 to 40.sub.NP permitted to access the memory bank 50. For example, the arbiters 420.sub.1 to 420.sub.NP and 422 may implement a round-robin arbitration.
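A round-robin arbiter of the kind mentioned above can be sketched in a few lines; this is a behavioral model only (the request vector and grant-index interface are assumptions, not the hardware signal set):

```python
def round_robin_arbiter(requests, last_grant):
    # Grant one of the asserted requests, starting the search just after
    # the previously granted index, so every requester is served in turn.
    n = len(requests)
    for offset in range(1, n + 1):
        idx = (last_grant + offset) % n
        if requests[idx]:
            return idx
    return None  # no request asserted this cycle
```

Each arbiter 420.sub.1 to 420.sub.NB (and 422) would keep its own `last_grant` state, so that no communication interface can monopolize a memory bank.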
[0176] As described in the foregoing, in various embodiments, the memory controller 42 manages also the access of the communication interface 46 (connected to the communication system 114) to the memory banks 48.sub.1 to 48.sub.NB and 50. In this respect, as described with respect to
[0177] In various embodiments, instead of providing the signals of the dispatcher circuit 424 to the arbiters 420.sub.1 to 420.sub.NB and 422, the memories 48 and 50 are implemented with dual port memories, and the arbiters 420.sub.1 to 420.sub.NB and 422 (and thus the search circuits 400) may access a first memory port, and the dispatcher circuit 424 may access a second memory port of the respective memory.
[0178] Specifically, while the communication interfaces 40.sub.1 to 40.sub.NP are preferably configured to perform only read requests (and not write requests) to the memory banks 48.sub.1 to 48.sub.NB and 50, the communication interface 46 is configured to perform (via the address translation circuit 424) at least write requests (and preferably also read requests) to the memory banks 48.sub.1 to 48.sub.NB and 50. In this respect, in various embodiments, the communication interface 46 is a slave interface connected to the communication system 114 of the processing system 10. Accordingly, such a slave interface 46 may receive read or write requests from any master interface connected to the communication system 114, such as a processing core 102 or a DMA controller 110. In general, the slave interface 46, the various master interfaces and/or the communication system 114 may also implement an access protection in order to limit read or write access to the addresses in the address map provided by the circuit 424 (and accordingly to the memory banks 48.sub.1 to 48.sub.NB and 50).
[0179] Accordingly, in the embodiment considered, the routing table memory controller 42 is used to arbitrate the access requests to the shared routing table stored to the memory banks 48.sub.1 to 48.sub.NB and 50, which can concurrently arrive from the communication interfaces 40.sub.1 to 40.sub.NP and from another circuit (external to the network accelerator 4) via the communication/programming interface 46 (e.g., an AXI slave). The search circuits 400.sub.1 to 400.sub.NB inside each communication interface/network port 40.sub.1 to 40.sub.NP preferably can generate only read access requests during the lookup process, while the programming interface 46 may generate both read and write requests to the memory banks 48.sub.1 to 48.sub.NB and 50.
[0180] Specifically, in the embodiment considered, each memory bank 48.sub.1 to 48.sub.NB and 50 has associated a respective arbitration circuit 420.sub.1 to 420.sub.NB and 422, which is configured to arbitrate the incoming access requests, e.g., by using a round-robin scheduling algorithm.
[0181] Concerning the interface 46, an address translation and dispatcher circuit 424 is configured to manage an address map (see
[0182] In various embodiments, the address translation operation (circuit 424) is only implemented for the access via the interface 46, while the search circuits 400.sub.1 to 400.sub.NB access the memories by using directly the addresses of the respective memory slots. In fact, each search circuit 400.sub.1 to 400.sub.NB should only access a respective memory bank 48.sub.1 to 48.sub.NB.
[0183] In the following, possible embodiments of the operations of the slave interface 46 and of the address translation and dispatcher circuit 424 will be described in greater detail.
[0184] Specifically,
[0185] Specifically, in the embodiment considered, a given number n of LSB bits of the address ADR correspond to the address ADDR to be provided to the dispatcher circuit 424. Accordingly, the remaining MSB bits may be used to identify the interface 46 within the communication system 114.
[0186] In the embodiment considered, a first bit MEMT of the address ADDR, preferably the MSB bit of the address ADDR, indicates the memory type, i.e., whether to access the memory 48 (e.g., MEMT=0) or the memory 50 (e.g., MEMT=1). Moreover, a second bit DC indicates whether to access the data section (e.g., DC=0), i.e., the slots IP_DA or IP_NH, or the control section (e.g., DC=1), i.e., the slots IP_DA_CNTRL or IP_NH_CNTRL, of the respective memory. Finally, a field SLOT# indicates the respective slot number. Accordingly, when N slots have to be supported, the field SLOT# has at least n=log.sub.2(N) bits, e.g., bits SLOT#[n-1:0].
[0187] Specifically, as shown in
[0188] Conversely, as shown in
[0189] In various embodiments, the address field ADDR may also comprise a byte index B#. For example, this byte index B# may be useful in order to permit a programming of single bytes in the memories 48 and 50. For example, when the memories 48 and 50 have 32 bits, the field B# may have two bits and indicate one of the 4 bytes of a respective memory slot indicated by the other address data ADDR. In general, the byte index B# may be used when the communication system 114 has a data width smaller than the data width of the memories 48 and 50, or in order to ensure that only single bytes may be programmed.
[0190] Thus, in various embodiments, the (word) index W# may be used to select a given word of the record selected via the field SLOT#, and the byte index B# may be used to select a given byte in the selected memory slot as indicated by the data SLOT# and W#.
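Decoding such an address into its fields is plain bit-slicing. The sketch below assumes, purely for illustration, the layout (MSB to LSB) MEMT | DC | SLOT# | W# | B# with 8 slot bits, 2 word bits and 2 byte bits; the text only fixes MEMT as the MSB, so the exact positions are an assumption.

```python
N_SLOTS = 256   # -> 8 bits for SLOT# (n = log2(N))
W_BITS  = 2     # word index within a multi-word record (e.g., IPv6)
B_BITS  = 2     # byte index within a 32-bit memory slot

def decode_addr(addr):
    # Slice the fields out of ADDR, LSB first: B#, W#, SLOT#, DC, MEMT.
    b    = addr & ((1 << B_BITS) - 1)
    w    = (addr >> B_BITS) & ((1 << W_BITS) - 1)
    slot = (addr >> (B_BITS + W_BITS)) & (N_SLOTS - 1)
    dc   = (addr >> (B_BITS + W_BITS + 8)) & 1
    memt = (addr >> (B_BITS + W_BITS + 8 + 1)) & 1
    return memt, dc, slot, w, b
```

For example, an address with MEMT=1, SLOT#=5, W#=2, B#=3 decodes back into those field values.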
[0191]
[0192] Specifically, once the address translation circuit 424 receives at a step 4000 a (read or write) request comprising a given address ADDR in the address map of the dispatcher circuit 424, the address translation circuit 424 determines whether the received address ADDR is associated with the memory 48 or the memory 50.
[0193] For example, in the embodiments shown in
[0196] Optionally, the address translation circuit 424 may compare the received address ADDR with a lower threshold corresponding to the address associated with the first slot in the memory 48 (e.g., corresponding to the address of the memory slot IP_DA0_W0) and/or with an upper threshold corresponding to the address associated with the last slot in the memory 50 (e.g., corresponding to the address of the memory slot IP_NH(N-1)_CNTRL).
[0197] Conversely, in the embodiments shown in
[0198] Accordingly, in case the address ADDR is associated with the memory 48 (as schematically shown via an output 0 of the step 4002), the dispatcher circuit 424 proceeds to a verification step 4004.
[0199] Specifically, the step 4004 is purely optional and is used in case different address translation operations are required for the destination IP addresses IP_DA and the control data IP_DA_CNTRL. In fact, in the embodiment considered, the address translation circuit 424 determines whether the received address ADDR is associated with a destination IP address IP_DA or control data IP_DA_CNTRL. For example, in the embodiments shown in
[0202] Conversely, in the embodiments shown in
[0203] Accordingly, in case the address ADDR is associated with data IP_DA (as schematically shown via an output 0 of the step 4004), the address translation circuit 424 proceeds to a step 4008, where the address translation circuit 424 applies the address translation operation in order to access the data IP_DA in the memory 48.
[0205] For example, as described in the foregoing, the field SLOT# may indicate a given record slot, and optionally the index W# may indicate a respective data word of the record. Specifically, by storing the records RTEA in an interleaved manner to the memories 48.sub.1 to 48.sub.NB, a given number nb=log.sub.2(NB) of LSB bits of the field SLOT# may be used to select one of the memory banks 48.sub.1 to 48.sub.NB, and the remaining bits of the field SLOT# may be used to select a respective record slot in the selected memory bank 48.sub.1 to 48.sub.NB. Specifically, as described with respect to
[0206] Conversely, in case the address ADDR is associated with control data IP_DA_CTRL (as schematically shown via an output 1 of the step 4004), the address translation circuit 424 proceeds to a step 4010, where the address translation circuit 424 applies the address translation operation in order to access the control data IP_DA_CTRL in the memory 48.
[0207] For example, as described in the foregoing, the field SLOT# may indicate a given record. Specifically, by storing the slots in an interleaved manner to the memories 48.sub.1 to 48.sub.NB, a given number nb=log.sub.2(NB) of LSB bits of the field SLOT# may again be used to select one of the memory banks 48.sub.1 to 48.sub.NB, and the remaining bits of the field SLOT# may be used to select a respective record slot in the selected memory bank 48.sub.1 to 48.sub.NB. Specifically, as described in the foregoing, preferably the control data IP_DA_CTRL use only a single memory slot. Accordingly, both for IPv4 and IPv6, each memory slot may already correspond to the respective record slot of control data IP_DA_CNTRL, i.e., the field W# may be omitted for the control data IP_DA_CTRL.
[0208] Conversely, in case the address ADDR is associated with the memory 50 (as schematically shown via an output 1 of the step 4002), the address translation circuit 424 proceeds to a verification step 4006.
[0209] Specifically, also the step 4006 is purely optional and is used in case different address translation operations are required for the next-hop addresses IP_NH and the control data IP_NH_CNTRL. In fact, in the embodiment considered, the address translation circuit 424 determines whether the received address ADDR is associated with a next-hop address IP_NH or control data IP_NH_CNTRL. For example, in the embodiments shown in
[0212] Again, in the embodiments shown in
[0213] Accordingly, in case the address ADDR is associated with data IP_NH (as schematically shown via an output 0 of the step 4006), the address translation circuit 424 proceeds to a step 4012, where the address translation circuit 424 applies the address translation operation in order to access the data IP_NH in the memory 50.
[0214] For example, as described in the foregoing, the field SLOT# may indicate a given record, and optionally the field W# may indicate a given word in the record. Specifically, by storing the slots sequentially to the memory 50, the field SLOT# may be used to select a respective record slot in the memory 50. Specifically, in case of IPv4 addresses and 32-bit memories, each memory slot again corresponds to the respective record slot. Conversely, as described with respect to
[0215] Conversely, in case the address ADDR is associated with control data IP_NH_CTRL (as schematically shown via an output 1 of the step 4006), the address translation circuit 424 proceeds to a step 4014, where the address translation circuit 424 applies the address translation operation in order to access the control data IP_NH_CTRL in the memory 50.
[0216] For example, as described in the foregoing, the field SLOT# may indicate a given record. Specifically, by storing the slots sequentially to the memory 50, the field SLOT# may be used to select a respective record slot in the memory 50. Specifically, as described in the foregoing, preferably the control data IP_NH_CTRL use only a single memory slot. Accordingly, both for IPv4 and IPv6, each memory slot may already correspond to the respective record slot of control data IP_NH_CNTRL.
[0217] As described in the foregoing, in various embodiments, the dispatcher circuit may also support the byte index B#. For example, this byte index may be used to select a given byte of a selected memory slot, as indicated by the field SLOT# and optionally W#.
[0218] In various embodiments, the slots of the memories 48 and 50 may also have less bits than the data width of the communication system 114. In this case, the dispatcher 424 may be configured to automatically generate a plurality of read or write requests. For example, such multiple read or write requests may access automatically consecutive memory locations in the same memory. Moreover, in case of the memory 48, such multiple read or write requests may also access in parallel respective memory banks 48.sub.1 to 48.sub.NB.
[0219] In various embodiments, the control data IP_DA_CNTRL and/or IP_NH_CNTRL may also comprise further data. For example, in various embodiments, the control data IP_DA_CNTRL and/or IP_NH_CNTRL comprise additional Error Correction Code (ECC) data. For example, the ECC data included in the data IP_DA_CNTRL may be calculated according to a given ECC scheme (at least) based on the data IP_DA and SUBNET. Similarly, the ECC data included in the data IP_NH_CNTRL may be calculated according to a given ECC scheme (at least) based on the data IP_NH, NH_EN and DST_PORT. For example, in various embodiments, a Single-Error-Correction and Double-Error-Detection (SECDED) code is used.
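The text does not fix a particular SECDED construction; as an illustration only, a textbook extended Hamming (8,4) code over a 4-bit datum shows the correct-single/detect-double behavior (the bit layout and function names are assumptions, and real implementations protect wider words):

```python
def hamming84_encode(nibble):
    # Extended Hamming (8,4): positions 1..7 form a Hamming(7,4) codeword
    # (data at 3, 5, 6, 7), position 0 is an overall parity bit.
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]          # covers positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]          # covers positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]          # covers positions 4, 5, 6, 7
    bits = [0, p1, p2, d[0], p4, d[1], d[2], d[3]]
    bits[0] = sum(bits) % 2          # overall parity -> even total parity
    return bits

def hamming84_check(bits):
    # Syndrome = XOR of the positions (1..7) of all set bits; it is zero
    # for a valid codeword and equals the error position for a single flip.
    s = 0
    for pos in range(1, 8):
        if bits[pos]:
            s ^= pos
    overall = sum(bits) % 2
    if s == 0 and overall == 0:
        return "ok"
    if overall == 1:
        return "corrected"           # single error at position s (p0 if s == 0)
    return "double-error"            # s != 0 with even parity: uncorrectable
```

A single bit flip is repaired in place; a double flip is flagged as uncorrectable, which is the event the fault collection circuit would be notified of.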
[0220] For example, such an arrangement is particularly useful, e.g., in case of ASIL-B compliant processing systems, because the memory controller 42 and/or the search engines 400 may verify the correctness of the data stored to the memories 48 and 50 without any intervention of a processing core 102. For example, in case a fault is detected (such as an uncorrectable error, e.g., a double-bit error), the memory controller 42 and/or the respective search engine 400 may signal the error to a fault collection and error management circuit of the processing system 10, which may signal the error to the one or more processing cores 102 and/or to a pad or pin of the integrated circuit of the processing system 10.
[0221] Accordingly, in the solutions described in the foregoing, the organization of the routing table data permits a fast lookup of a destination IP address within an IP routing table, without relying on slow software-based lookups or complex and expensive Ternary Content Addressable Memories (TCAM). For this purpose, the proposed solutions may perform a parallel search by accessing in parallel a plurality of memory banks 48.sub.1 to 48.sub.NB having stored the principal data of the search operation. In fact, in various embodiments, the routing table data are split and stored to two different memories 48 and 50, whose content is optimized for the routing operations executed in HW. This separation permits the memory 48 to contain (only) the data useful for detecting a match of the destination IP address in the specified destination IP ranges, while the other data specifying the respective route (e.g., next-hop, egress interface index, etc.) may be stored in a single memory bank 50, whose content is retrieved only after a match is found.
[0222] Accordingly, the disclosed solutions permit a fast lookup process that does not significantly delay the packet transmission, thereby achieving high data rates and low latencies.
[0223] Of course, without prejudice to the principle of the invention, the details of construction and the embodiments may vary widely with respect to what has been described and illustrated herein purely by way of example, without thereby departing from the scope of the present invention, as defined by the ensuing claims.