ALGORITHMIC TCAM BASED TERNARY LOOKUP
20230127391 · 2023-04-27
Assignee
Inventors
- Patrick Bosshart (Plano, TX)
- Michael G. Ferrara (Palo Alto, CA)
- Jay E.S. Peterson (San Francisco, CA, US)
Cpc classification
G11C11/40615
PHYSICS
G11C7/1072
PHYSICS
G06F16/90339
PHYSICS
International classification
G11C11/406
PHYSICS
G11C7/10
PHYSICS
Abstract
An algorithmic TCAM based ternary lookup method is provided. The method stores entries for ternary lookup into several sub-tables. All entries in each sub-table have a sub-table key that includes the same common portion of the entry. No two sub-tables are associated with the same sub-table key. The method stores the keys in a sub-table keys table in TCAM. Each key has a different priority. The method stores the entries for each sub-table in random access memory. Each entry in a sub-table has a different priority. The method receives a search request to perform a ternary lookup for an input data item. A ternary lookup into the ternary sub-table key table stored in TCAM is performed to retrieve a sub-table index. The method performs a ternary lookup across the entries of the sub-table associated with the retrieved index to identify the highest priority matched entry for the input data item.
Claims
1. Integrated circuit for use in performing packet forwarding-related operations related to incoming packet data, the integrated circuit being configurable to use machine-readable tertiary content addressable memory (TCAM) in association with the packet forwarding-related operations, the integrated circuit comprising: other machine-readable memory for use in the packet forwarding-related operations, the machine-readable TCAM and the other machine-readable memory to store table data for use in the packet forwarding-related operations; and at least one processing unit to execute compiler-produced program instructions, the compiler-produced program instructions, when executed by the at least one processing unit resulting in the integrated circuit being configured to perform the packet forwarding-related operations, the packet forwarding-related operations being configurable to comprise: performing parallel table lookup operations to determine multiple entries in the table data that match, at least in part, at least one portion of the incoming packet data; selecting, based upon relative priorities associated with the multiple entries, one of the multiple entries that corresponds to a highest one of the relative priorities; and based upon the one of the multiple entries, determining at least one action to be executed in relation to the incoming packet data; wherein: the table data is configurable to comprise multiple tables; at least one of the multiple tables is configurable to indicate, at least in part, the relative priorities in association with table indices that are associated with the at least one portion of the incoming packet data; the table data is configurable to indicate packet processing rules that comprise the at least one action; the packet processing rules are configurable to indicate one or more of: at least one packet drop; at least one next packet hop determination; and/or at least one forwarding port determination; and the table indices are configurable to correspond, at least in part, to tree nodes.
2. The integrated circuit of claim 1, wherein: the parallel table lookup operations involve two or more of the multiple tables.
3. The integrated circuit of claim 2, wherein: an application specific integrated circuit comprises the integrated circuit.
4. The integrated circuit of claim 3, wherein: the other machine-readable memory comprises programmable memory; and the at least one processing unit comprises one or more processors.
5. The integrated circuit of claim 4, wherein: at least one other of the multiple tables is configurable to indicate respective mask data associated with respective entries; and the respective mask data is to be applied to certain portions of the incoming packet data.
6. The integrated circuit of claim 5, wherein: another integrated circuit comprises the machine-readable TCAM.
7. Machine-readable instructions to be executed by an integrated circuit, the integrated circuit being for use in performing packet forwarding-related operations related to incoming packet data, the integrated circuit being configurable to use machine-readable tertiary content addressable memory (TCAM) in association with the packet forwarding-related operations, the integrated circuit comprising other machine-readable memory and at least one processing unit, the other machine-readable memory being for use in the packet forwarding-related operations, the machine-readable instructions, when executed by the integrated circuit, resulting in the integrated circuit being configured to perform certain operations comprising: storing, in the machine-readable TCAM and the other machine-readable memory, table data for use in the packet forwarding-related operations; and executing, by the at least one processing unit, compiler-produced program instructions, the compiler-produced program instructions, when executed by the at least one processing unit resulting in the integrated circuit being configured to perform the packet forwarding-related operations, the packet forwarding-related operations being configurable to comprise: performing parallel table lookup operations to determine multiple entries in the table data that match, at least in part, at least one portion of the incoming packet data; selecting, based upon relative priorities associated with the multiple entries, one of the multiple entries that corresponds to a highest one of the relative priorities; and based upon the one of the multiple entries, determining at least one action to be executed in relation to the incoming packet data; wherein: the table data is configurable to comprise multiple tables; at least one of the multiple tables is configurable to indicate, at least in part, the relative priorities in association with table indices that are associated with the at least one portion of the incoming packet data; the table data is configurable to indicate packet processing rules that comprise the at least one action; the packet processing rules are configurable to indicate one or more of: at least one packet drop; at least one next packet hop determination; and/or at least one forwarding port determination; and the table indices are configurable to correspond, at least in part, to tree nodes.
8. The machine-readable instructions of claim 7, wherein: the parallel table lookup operations involve two or more of the multiple tables.
9. The machine-readable instructions of claim 8, wherein: an application specific integrated circuit comprises the integrated circuit.
10. The machine-readable instructions of claim 9, wherein: the other machine-readable memory comprises programmable memory; and the at least one processing unit comprises one or more processors.
11. The machine-readable instructions of claim 10, wherein: at least one other of the multiple tables is configurable to indicate respective mask data associated with respective entries; and the respective mask data is to be applied to certain portions of the incoming packet data.
12. The machine-readable instructions of claim 11, wherein: another integrated circuit comprises the machine-readable TCAM.
13. A method implemented using an integrated circuit, the integrated circuit being for use in performing packet forwarding-related operations related to incoming packet data, the integrated circuit being configurable to use machine-readable tertiary content addressable memory (TCAM) in association with the packet forwarding-related operations, the integrated circuit comprising other machine-readable memory and at least one processing unit, the other machine-readable memory being for use in the packet forwarding-related operations, the method comprising: storing, in the machine-readable TCAM and the other machine-readable memory, table data for use in the packet forwarding-related operations; and executing, by the at least one processing unit, compiler-produced program instructions, the compiler-produced program instructions, when executed by the at least one processing unit resulting in the integrated circuit being configured to perform the packet forwarding-related operations, the packet forwarding-related operations being configurable to comprise: performing parallel table lookup operations to determine multiple entries in the table data that match, at least in part, at least one portion of the incoming packet data; selecting, based upon relative priorities associated with the multiple entries, one of the multiple entries that corresponds to a highest one of the relative priorities; and based upon the one of the multiple entries, determining at least one action to be executed in relation to the incoming packet data; wherein: the table data is configurable to comprise multiple tables; at least one of the multiple tables is configurable to indicate, at least in part, the relative priorities in association with table indices that are associated with the at least one portion of the incoming packet data; the table data is configurable to indicate packet processing rules that comprise the at least one action; the packet processing rules are configurable to indicate one or more of: at least one packet drop; at least one next packet hop determination; and/or at least one forwarding port determination; and the table indices are configurable to correspond, at least in part, to tree nodes.
14. The method of claim 13, wherein: the parallel table lookup operations involve two or more of the multiple tables.
15. The method of claim 14, wherein: an application specific integrated circuit comprises the integrated circuit.
16. The method of claim 15, wherein: the other machine-readable memory comprises programmable memory; and the at least one processing unit comprises one or more processors.
17. The method of claim 16, wherein: at least one other of the multiple tables is configurable to indicate respective mask data associated with respective entries; and the respective mask data is to be applied to certain portions of the incoming packet data.
18. The method of claim 17, wherein: another integrated circuit comprises the machine-readable TCAM.
19. A network switch for performing packet forwarding-related operations related to a network and incoming packet data to be received via the network, the network switch being configurable to use machine-readable tertiary content addressable memory (TCAM) in association with the packet forwarding-related operations, the network switch comprising: ports to be coupled to the network; and an integrated circuit coupled to the ports, the integrated circuit comprising: other machine-readable memory for use in the packet forwarding-related operations, the machine-readable TCAM and the other machine-readable memory to store table data for use in the packet forwarding-related operations; and at least one processing unit to execute compiler-produced program instructions, the compiler-produced program instructions, when executed by the at least one processing unit resulting in the integrated circuit being configured to perform the packet forwarding-related operations, the packet forwarding-related operations being configurable to comprise: performing parallel table lookup operations to determine multiple entries in the table data that match, at least in part, at least one portion of the incoming packet data; selecting, based upon relative priorities associated with the multiple entries, one of the multiple entries that corresponds to a highest one of the relative priorities; and based upon the one of the multiple entries, determining at least one action to be executed in relation to the incoming packet data; wherein: the table data is configurable to comprise multiple tables; at least one of the multiple tables is configurable to indicate, at least in part, the relative priorities in association with table indices that are associated with the at least one portion of the incoming packet data; the table data is configurable to indicate packet processing rules that comprise the at least one action; the packet processing rules are configurable to indicate one or more of: at least one packet drop; at least one next packet hop determination; and/or at least one forwarding port determination; and the table indices are configurable to correspond, at least in part, to tree nodes.
20. The network switch of claim 19, wherein: the parallel table lookup operations involve two or more of the multiple tables.
21. The network switch of claim 20, wherein: an application specific integrated circuit comprises the integrated circuit.
22. The network switch of claim 21, wherein: the other machine-readable memory comprises programmable memory; and the at least one processing unit comprises one or more processors.
23. The network switch of claim 22, wherein: at least one other of the multiple tables is configurable to indicate respective mask data associated with respective entries; and the respective mask data is to be applied to certain portions of the incoming packet data.
24. The network switch of claim 23, wherein: another integrated circuit comprises the machine-readable TCAM.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The novel features of the invention are set forth in the appended claims. However, for purposes of explanation, several embodiments of the invention are set forth in the following figures.
[0012]
[0013]
[0014]
[0015]
[0016]
[0017]
[0018]
[0019]
[0020]
[0021]
[0022]
DETAILED DESCRIPTION
[0023] In the following detailed description of the invention, numerous details, examples, and embodiments of the invention are set forth and described. However, it will be clear and apparent to one skilled in the art that the invention is not limited to the embodiments set forth and that the invention may be practiced without some of the specific details and examples discussed.
[0024]
[0025] The process stores (at 110) the entries for ternary lookup, the entries' associated masks, and the entry specific search results in the prioritized search tables in random access memory. The mask is used to implement ternary values where each bit can have a value of 0, 1, or don't care. The entry specific search result associated with each entry is the result that is returned by a ternary lookup process when the entry is the highest priority entry that matches the input data in a search request. The process then stores (at 115) the sub-table keys and the associated masks in a “ternary sub-table keys” table in TCAM. The mask for each entry is used to implement ternary values where each bit can have a value of 0, 1, or don't care. The process then ends.
[0026]
[0027] Unlike the ternary sub-table keys table, table 215 that includes ternary lookup sub-tables 221-223 is stored in random access memory 220 such as SRAM. Each sub-table 221-223 includes one or more entries 231-239. Each entry includes data, the associated mask, and entry specific data. Each sub-table entry is also associated with a priority. In some embodiments, the sub-tables are stored in random access memory based on their priorities (e.g., the first entry has the highest, followed by the next highest priority entry, and ending to the least priority entry). Whenever a search in a sub-table results in more than one match, the entry with the highest priority is returned as the search result. For instance, in Sub-table 221, entry 231 has the highest priority followed by entry 232, while entry 233 has the lowest priority.
[0028] As shown in the expanded view 271 for entry 231, the entry includes data 291, mask 292, and entry specific search result 293. Similarly, the expanded view 273 for entry 233 shows that the entry includes data 294, mask 295, and entry specific search result 296.
[0029]
[0030] As shown, the process receives (at 305) a search request to match an entry in a ternary lookup table. The process performs (at 310) a ternary lookup into the ternary sub-table keys table stored in TCAM to retrieve a sub-table index to identify a sub-table in RAM. The TCAM hardware searches the entries in parallel and returns the sub-table index associated with the highest priority match. The sub-table index is used to identify a sub-table in the ternary lookup sub-tables stored in random access memory.
[0031] In the example of
[0032] The process then performs (at 315) a ternary lookup into the identified sub-table that is stored in RAM to find the highest priority match. The process then applies (at 320) the search result associated with the matched entry. In some embodiments, a set of processing units or other specialized hardware of a computing system that implements the ternary lookup performs the search of the identified sub-table. The process then ends. Several examples for utilizing processes 100 and 300 for performing ternary lookups for different applications are described below.
[0033]
[0034] The example of
[0035] In
[0036] A forwarding table maps the network prefixes to a router's ports and is used to forward packets received at the router to the port where the next destination network device (or the packet's next hop) resides. When there are multiple overlapping routes for a destination, the router chooses the most specific route, i.e., the route with the longest prefix. For instance if 190.12.5.0/26 and 190.12.0.0/16 both match, the router chooses the /26 address. Ternary lookup handles this by using don't care bits to ignore the bits associated with the host address portion of the destination IP address. The most specific routes in an LPM table are given higher priorities. The forwarding tables can also include a default next hop to fall back if no other table entries results in a match.
[0037]
[0038] In
[0039] The nodes 510-570 (which are marked with a black dot) are the nodes associated with the entries of LPM table 400 in
[0040]
[0041] Table 615 stores the entries for the 4 sub-tables 641-644 associated with each sub-table key. Table 615 is stored in random access memory such as SRAM or DRAM. As shown in the expanded view 670 for the second entry of sub-table 641, the entry includes data 651, a mask 652, and an entry-specific search result 653, which is returned as the search result if the entry is the highest priority entry that matches a searched input. In this example, the entry data is 00101100, the associated mask is 001011**, and the entry-specific search result is port 2. Applying the mask to the entry data results in the masked value 001011**. For convenience, the sub-table 641-644 entries are shown as two tuples (a masked value and the search result).
[0042] Each of the entries in a sub-table 641-644 is associated with a priority. The priorities are conceptually shown in
[0043]
[0044] The process then performs (at 710) a parallel ternary lookup in the sub-table keys table by the TCAM hardware to return the sub-table index corresponding to the match with the highest priority. The process determines (at 715) whether a match was found by the TCAM ternary lookup. When a match is not found, the process applies (at 720) a default result of the sub-table keys table. For example, the process identifies a default router port as the next hop for an incoming packet. The process then ends. Otherwise, the process uses (at 725) the sub-table index returned by the TCAM hardware to identify a sub-table stored in random access memory to continue the search.
[0045]
[0046] Therefore, if the highest priority key that matches the search criteria is the n.sup.th entry of the table 605 in
[0047] In the example of
[0048]
[0049] In the example of
[0050] In the example of
[0051] The input key is used by TCAM hardware to search in table 605, which is stored in TCAM. In this example, the packet 910 is received at one of the port switches in switch port label 3 933. As a result, sub-table index 901 is identified as the sub-table index. Sub-table index 901 has a value of 3. This value is used to index into ternary lookup table sub-tables 615 to identify sub-table 943 as the sub-table to search to find an entry for the input key 910.
[0052] Referring back to
[0053] Otherwise, the process uses (at 745) the sub-table entry found by the ternary search of the identified sub-table (which is stored in random access memory) as the match for the input key.
[0054] Referring back to
[0055] In the example of
[0056]
[0057] Entries 1041 and 1042 indicate that Rule 1 is only concerned with layer 4 port 80 (the mask 1042 makes the first 4 values associated with the source IP address don't care). Entries 1051 and 1052 indicate that Rule 2 is only concerned with a 24 bit source IP address. The lower 8 bits of the IP address and the layer 4 port value are masked by 0's in the mask 1052.
[0058] Entries 1061 and 1062 indicate that Rule 3 is concerned with the 16 bit sub-network portion of the IP address 194,20/16 and the layer 4 port 80 (the mask 1062 makes the lowest 16 bits associated with the source IP address don't care). The search results associated with Rules 1-3 are drop 1043, permit 1053, and permit 1063.
[0059] The input key 910 in
[0060] Many of the above-described features and applications are implemented as software processes that are specified as a set of instructions recorded on a computer readable storage medium (also referred to as computer readable medium). When these instructions are executed by one or more processing unit(s) (e.g., one or more processors, cores of processors, or other processing units), they cause the processing unit(s) to perform the actions indicated in the instructions. Examples of computer readable media include, but are not limited to, CD-ROMs, flash drives, RAM chips, hard drives, EPROMs, etc. The computer readable media does not include carrier waves and electronic signals passing wirelessly or over wired connections.
[0061] In this specification, the term “software” is meant to include firmware residing in read-only memory or applications stored in magnetic storage, which can be read into memory for processing by a processor. Also, in some embodiments, multiple software inventions can be implemented as sub-parts of a larger program while remaining distinct software inventions. In some embodiments, multiple software inventions can also be implemented as separate programs. Finally, any combination of separate programs that together implement a software invention described here is within the scope of the invention. In some embodiments, the software programs, when installed to operate on one or more electronic systems, define one or more specific machine implementations that execute and perform the operations of the software programs.
[0062]
[0063] The bus 1105 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 1100. For instance, the bus 1105 communicatively connects the processing unit(s) 1110 with the read-only memory 1130, the system memory 1120, and the permanent storage device 1135.
[0064] From these various memory units, the processing unit(s) 1110 retrieve instructions to execute and data to process in order to execute the processes of the invention. The processing unit(s) may be a single processor or a multi-core processor in different embodiments.
[0065] The read-only-memory 1130 stores static data and instructions that are needed by the processing unit(s) 1110 and other modules of the electronic system. The permanent storage device 1135, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the electronic system 1100 is off Some embodiments of the invention use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 1135.
[0066] Other embodiments use a removable storage device (such as a floppy disk, flash drive, etc.) as the permanent storage device. Like the permanent storage device 1135, the system memory 1120 is a read-and-write memory device. However, unlike storage device 1135, the system memory is a volatile read-and-write memory, such as random access memory. The system memory stores some of the instructions and data that the processor needs at runtime. In some embodiments, the invention's processes are stored in the system memory 1120, the permanent storage device 1135, and/or the read-only memory 1130. From these various memory units, the processing unit(s) 1110 retrieve instructions to execute and data to process in order to execute the processes of some embodiments.
[0067] The bus 1105 also connects to the input and output devices 1140 and 1145. The input devices enable the user to communicate information and select commands to the electronic system. The input devices 1140 include alphanumeric keyboards and pointing devices (also called “cursor control devices”). The output devices 1145 display images generated by the electronic system. The output devices include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD). Some embodiments include devices such as a touchscreen that function as both input and output devices.
[0068] Finally, as shown in
[0069] Some embodiments include electronic components, such as microprocessors, storage and memory that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media). Some examples of such computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.), flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.), magnetic and/or solid state hard drives, read-only and recordable Blu-Ray® discs, ultra density optical discs, any other optical or magnetic media, and floppy disks. The computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor using an interpreter.
[0070] While the above discussion primarily refers to microprocessor or multi-core processors that execute software, some embodiments are performed by one or more integrated circuits, such as application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs). In some embodiments, such integrated circuits execute instructions that are stored on the circuit itself.
[0071] As used in this specification, the terms “computer”, “server”, “processor”, and “memory” all refer to electronic or other technological devices. These terms exclude people or groups of people. For the purposes of the specification, the terms display or displaying means displaying on an electronic device. As used in this specification, the terms “computer readable medium,” “computer readable media,” and “machine readable medium” are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral or transitory signals.
[0072] While the invention has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the invention can be embodied in other specific forms without departing from the spirit of the invention. In addition, a number of the figures (including
[0073] In view of the foregoing, one of ordinary skill in the art would understand that the invention is not to be limited by the foregoing illustrative details, but rather is to be defined by the appended claims.