Multistage round robin arbitration
10838892 ยท 2020-11-17
Assignee
Inventors
Cpc classification
G06F9/4881
PHYSICS
International classification
G06F3/00
PHYSICS
Abstract
A device includes a first and a second stage round robin arbitrations. The first stage receives request signals and selects a subset the request signals. Each request signal is associated with whether a component is requesting access to a common resource. The second stage receives the selected subset and grants access to the common resource to each request signal of the selected subset that is requesting access, in a round robin fashion. The second stage outputs an enable signal to the first stage when the selected subset is processed. The first stage selects another subset and transmits the selected another subset to the second stage for round robin processing thereof. The process is repeated until all subsets with at least one request signal to access the common resource is processed and granted access in a round robin fashion.
Claims
1. A device comprising: a plurality of logical gates configured to receive a plurality of request signals, wherein each logical gate of the plurality of logical gates is configured to receive a set of request signals of the plurality of request signals, wherein each request signal is associated with whether a component is requesting access to a common resource, wherein each logical gate of the plurality of logical gates is further configured to output a signal indicating whether a request signal of the corresponding set of request signals is asserted to access the common resource, and wherein each set of request signals of the plurality of request signals is formed based on priority associated therewith; a first round robin arbiter configured to receive output signals from the plurality of logical gates, wherein the first round robin arbiter is configured to output an index signal identifying a logical gate of the plurality of logical gates where the request signal is asserted; a multiplexer configured to receive the plurality of request signals, wherein the multiplexer is further configured to receive the index signal and wherein the multiplexer selects a set of request signals of the plurality of request signals corresponding to the request signal that is asserted; and a second round robin arbiter configured to receive the selected set of request signals of the plurality of request signals from the multiplexer, and wherein the second round robin arbiter is further configured to grant access to the common resource to each request signal of the selected set of request signals that is requesting access to the common resource in a round robin fashion, and wherein the second round robin arbiter is further configured to output a first round robin enable signal to the first round robin arbiter to select another logic gate of the plurality of logical gates where a request signal is asserted; a delay element coupled to the second round robin arbiter, wherein the delay element is configured to receive an enable signal and to delay the enable signal by an amount of time needed for the first round robin arbiter to generate the index signal, wherein the delayed enable signal initiates the second round robin arbitration signal to process received signals.
2. The device of claim 1, wherein the plurality of logical gates comprises a logical OR gate.
3. The device of claim 1, further comprising another multiplexer coupled to an output of the delay element, wherein the another multiplexer is configured to receive the delayed enable signal and further configured to receive the enable signal, and wherein the another multiplexer selects delayed enable signal or the enable signal based on the first round robin enable signal generated by the second round robin arbiter, and wherein the another multiplexer is configured to output the selected signal to the second round robin arbiter to initiate the second round robin arbiter to process received signals.
4. The device of claim 1, wherein the delay element is a flop delay.
5. The device of claim 1, wherein the first round robin arbiter is configured to receive the first round robin enable signal output and is further configured to select the another logic gate of the plurality of logical gates in a round robin fashion where a request signal is asserted to access the common resource, and wherein the first round robin arbiter is configured to output an updated index signal based on the another logic gate, wherein the multiplexer is configured to receive the updated index signal and wherein the multiplexer selects another set of request signals of the plurality of request signals corresponding to the request signal that is asserted, wherein the second round robin arbiter is further configured to receive the selected another set of request signals of the plurality of request signals from the multiplexer and wherein the second round robin arbiter is further configured to grant access to the common resource to each request signal of the selected another set of request signals in a round robin fashion.
6. The device of claim 5, wherein the second round robin arbiter is further configured to output another first round robin enable signal to the first round robin arbiter to select yet another logic gate of the plurality of logical gates where a request signal is asserted to access the common resource.
7. The device of claim 1, wherein the common resource is selected from a group consisting of a bus, storage memory, and network cards.
8. A device comprising: a first stage round robin arbitration configured to receive a plurality of request signals, wherein each request signal is associated with whether a component is requesting access to a common resource, wherein the plurality of request signals is formed into a plurality of subset request signals based on priority associated therewith, and wherein the first stage round robin arbitration is configured to select a subset of request signals from the plurality of subset request signals; and a second stage round robin arbitration configured to receive the selected subset and further configured to grant access to the common resource to each request signal of the selected subset that is requesting access to the common resource in a round robin fashion, wherein the second stage round robin arbitration is further configured to output an enable signal to the first stage round robin arbitration, wherein the first stage round robin arbitration is configured to select another subset of the plurality of request signals, and wherein the first stage round robin arbitration is further configured to transmit the selected another subset of the plurality of request signals to the second stage round robin arbitration for round robin processing thereof; a delay element coupled to the second stage round robin arbitration, wherein the delay element is configured to receive an enable signal and to delay the enable signal by an amount of time needed for the first stage round robin arbitration to generate the selected subset, wherein the delayed enable signal initiates the second stage round robin arbitration to process received selected subset.
9. The device of claim 8, wherein at least one request signal within the selected subset is asserted to access the common resource.
10. The device of claim 8, wherein the first stage round robin arbitration selects the another subset of the plurality of request signals responsive to receiving enable signal from the second round robin stage, and wherein the another subset of the plurality of request signals comprises at least one request signal within the selected another subset that is asserted to access the common resource, and wherein the second stage round robin arbitration is configured to receive the selected another subset and grant access to the common resource to each request signal within the selected another subset that is requesting access to the common resource in a round robin fashion.
11. The device of claim 8, wherein the delay element is a flop delay.
12. The device of claim 8, wherein the first stage round robin arbitration comprises: a plurality of logical gates configured to receive the plurality of request signals and output a plurality of logical output signals, wherein each logical gate of the plurality of logical gates receives a subset of the plurality of request signals; a round robin arbiter configured to receive the plurality of logical output signals and select one subset of the plurality of request signals wherein the selected one subset includes at least one request signal to access the common resource in a round robin fashion; and a multiplexer configured to receive the selected one subset and transmit the selected one subset to the second stage round robin arbitration.
13. The device of claim 12, wherein the plurality of logical gates includes a logical OR gate.
14. The device of claim 8, wherein the common resource is selected from a group consisting of a bus, storage memory, and network cards.
15. A method comprising: receiving a plurality of request signals, wherein each request signal is associated with whether a component is requesting access to a common resource; grouping the plurality of request signals into a plurality of subsets based on priority associated with request signals; generating an index signal by a first round robin arbiter to select a subset of the plurality of subsets that includes at least one request signal asserted to access the common resource; granting access to the common resource to each request signal of the selected subset that is requesting access to the common resource in a round robin fashion; subsequent to the granting, selecting by a second round robin arbiter another subset of the plurality of subsets that includes at least one request signal asserted to access the common resource; granting access to the common resource to each request signal of the selected another subset that is requesting access to the common resource in a round robin fashion; and repeating the selecting and granting access until all requests signals within subsets of the plurality of subsets with at least one request signal asserted to access the common resource is granted access in a round robin fashion; and receiving an enable signal, by a delay element coupled to the second round robin arbiter, and delaying the enable signal by an amount of time needed for the first round robin arbiter to generate the index signal, wherein the delayed enable signal initiates the second round robin arbiter to process received signals.
16. The method of claim 15, wherein the common resource is selected from a group consisting of a bus, storage memory, and network cards.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) So that the manner in which the above recited features can be understood in detail, a more particular description, briefly summarized above, may be had by reference to example implementations, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical example implementations and are therefore not to be considered limiting of its scope.
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9) To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. It is contemplated that elements of one example may be beneficially incorporated in other examples.
DETAILED DESCRIPTION
(10) Examples described herein relate to a multistage round robin arbitration architecture to grant access to a common resource, e.g., system buses, storage memories, network cards, processors, etc. Round robin arbitration architecture is a key component in defining the latency and performance of the system. The proposed multistage round robin arbitration architecture may be used regardless of the number of queues and the queue sizes and the number of request signals while achieving reduced latency and increased operating clock frequency. The proposed multistage round robin arbitration architecture achieves both high speed and has lower latency in comparison to the conventional architecture. The proposed architecture supports pipelining to improve performance, thereby resulting in an increased operating clock frequency. Moreover, the proposed architecture results in lower latency when arbitrating among a large number of request signals and queues. It is appreciated that while round robin architecture is described the proposed architecture may be extended to other types of arbitration schemes, e.g., priority arbitration, using a similar architecture. As such, description of the multistage arbitration architecture in a round robin fashion is for illustrative purposes only and should not be construed as limiting the scope of the embodiments.
(11) Various features are described hereinafter with reference to the figures. It should be noted that the figures may or may not be drawn to scale and that the elements of similar structures or functions are represented by like reference numerals throughout the figures. It should be noted that the figures are only intended to facilitate the description of the features. They are not intended as an exhaustive description of the claimed invention or as a limitation on the scope of the claimed invention. For example, various methods according to some examples can include more or fewer operations, and the sequence of operations in various methods according to examples may be different than described herein. In addition, an illustrated example need not have all the aspects or advantages shown. An aspect or an advantage described in conjunction with a particular example is not necessarily limited to that example and can be practiced in any other examples even if not so illustrated or if not so explicitly described.
(12) Some general concepts will first be described to clarify terms and nomenclature used throughout this description.
(13) Referring now to
(14) In some embodiments, each logical component outputs a signal indicative of whether at least one request signal within the received word is asserted indicating that a component associated with that request signal is requesting access to the common resource. For example, in this illustrative embodiment, the logical OR gate 192 receives the word W.sub.1 102 that is a grouping of four request signals. Each request signal of the word W.sub.1 102 is associated with its corresponding component in the system and if the associated request signal is asserted high it indicates that the corresponding component is requesting access to the common resource. Accordingly, if any one of the four request signals in W.sub.1 102 is asserted high, the logical OR gate 192 outputs a signal that is high indicating that at least one component corresponding to a request signal within W.sub.1 102 is requesting access to the common resource. In contrast, if none of the request signals within the W.sub.1 102 is asserted high, the logical OR gate 192 outputs a signal that is asserted low indicating that no component corresponding to the request signals within W.sub.1 102 is requesting access to the common resource. It is appreciated that other logical gates similarly output a corresponding for their respective received word. It is appreciated that only 4 logical OR gates with four inputs are shown. However, it is appreciated that any number of OR gates with any number of inputs may be used. Even further, other embodiments may use a logical gate other than an OR gate. As such, any discussion with respect to the number of logical gates, the OR gate, and the number of inputs is for illustrative purposes and not intended to limit the scope of the embodiments.
(15) In some embodiments, the logical gates 192-198 output their respective signals to a round robin arbiter 110. In this illustrative embodiment, the round robin arbiter is a 4:2 where it receives four inputs and outputs a signal with 2 bits. The round robin arbiter 110 on the first pass may select the first output signal from the logical OR gates that is asserted high. For example, presuming for the purposes of this illustration that logical OR gate 192 outputs a signal asserted high indicating that at least one request signal within the W.sub.1 102 is requesting access to the common resource, but that logical OR gates 194 and 196 output a signal asserted low indicating that no component associated with those request signals is requesting access to the common resource, and finally the logical OR gate 198 outputs a signal asserted high indicating that at least one request signal within the W.sub.4 108 is requesting access to the common resource. The round robin arbiter 110 may select the logical OR gate 192 because it is the first one in the grouping of the logical OR gates with an output signal asserted high. However, it is appreciated that in some embodiments, a different ordering selection may be done, e.g., the round robin arbiter 110 may select the logical OR gate 198 first.
(16) In some embodiments, the round robin arbiter 110 may output an index signal 112 that identifies the logical OR gate 192 and the request signals in W.sub.1 102. Since the round robin arbiter 110 is a 4 input arbiter, it may use two bits to identify the logical OR gate. In this illustrative example, 00 may be used to identify logical OR gate 192 and its corresponding W.sub.1 102, 01 may be used to identify logical OR gate 194 and its corresponding W.sub.2 104, 10 may be used to identify logical OR gate 196 and its corresponding W.sub.3 106, and 11 may be used to identify logical OR gate 198 and its corresponding W.sub.4 108. In this illustrative example, the index signal 112 outputs a 00 value indicating the logical OR gate 192 and its corresponding W.sub.1 102.
(17) In some embodiments, a multiplexer 120 may receive the index signal 112 to select between the different words, W.sub.1 102, W.sub.2 104, W.sub.3 106, and W.sub.4 108. In this illustrative example, since the index signal 112 has a 00 value, the word W.sub.1 102 is selected. The multiplexer 120 outputs a signal 122 which is the selected word, e.g., W.sub.1 102.
(18) In some embodiments, a second round robin arbiter 130 may be used and receives the output signal 122. In this example, the MUX output signal 122 is W.sub.1 102. The round robin arbiter 130 processes the requests within the W.sub.1 102. For illustrative purpose it is presumed that 3 of the 4 request signals within W.sub.1 102 are asserted high indicating that their respective components are requesting access to the common resource, e.g., the first, third, and fourth request signals are asserted high for illustrative purposes. Since the words are groups of 4 request signals, then the round robin arbiter 130 may use 2 bits to identify the request signal and its corresponding component. For example, 00 may be used to identify the first request signal within W.sub.1 102 and its corresponding component, 01 may be used to identify the second request signal within W.sub.1 102 and its corresponding component, 10 may be used to identify the third request signal within W.sub.1 102 and 11 may be used to identify the first request signal within W.sub.1 102 and its corresponding component.
(19) In some embodiments, the round robin arbiter 130 outputs an index signal 134 with values 00 corresponding to the first request signal of W.sub.1 102 being asserted high. As such, the component corresponding to the first request signal is granted access to the common resource by issuing grant signal 132. Once the component that was just granted access is done with the common resource, the round robin arbiter 130 processes the remaining request signals within the word W.sub.1 102. However, in this example since the second request signal as identified by 01 is not asserted high it indicates that the component corresponding to that request signal is not requesting access to the common resource and therefore is skipped. The round robin arbiter 130 then processes the remaining request signals within the word W.sub.1 102. In this example, since the third request signal as identified by 10 is asserted high it indicates that the component corresponding to that request signal is requesting access to the common resource. As such, the round robin arbiter 130 outputs the index signal 134 with value 10 and it outputs the grant signal 132 to grant the component corresponding to the 10 request signal within the word W.sub.1 102 access to the common resource. Once the component that was just granted access is done with the common resource, the round robin arbiter 130 processes the remaining request signals within the word W.sub.1 102. In this example, since the fourth request signal as identified by 11 is asserted high it indicates that the component corresponding to that request signal is requesting access to the common resource. As such, the round robin arbiter 130 outputs the index signal 134 with value 11 and it outputs the grant signal 132 to grant the component corresponding to the 11 request signal within the word W.sub.1 102 access to the common resource. Once the component that was just granted access is done with the common resource, the round robin arbiter 130 processes the remaining request signals within the word W.sub.1 102. However, since no other request signal remains in the word W.sub.1 102, the round robin arbiter 130 notifies the round robin arbiter 110 to process the next word.
(20) As presented above and for illustrative purposes it was presumed that words W.sub.2 104 and W.sub.3 106 have no request signals that is asserted high indicative of an access grant request to the common resource, the round robin arbiter 110 selects the next word with at least one request signal being asserted high, e.g., W.sub.4 108.
(21) In some embodiments, the round robin arbiter 110 may output an index signal 112 that identifies the logical OR gate 198 and the request signals in W.sub.4 108. In this illustrative example, the index signal 112 outputs a 11 value indicating the logical OR gate 198 and its corresponding W.sub.4 108.
(22) In some embodiments, a multiplexer 120 may receive the index signal 112 to select between the different words, W.sub.1 102, W.sub.2 104, W.sub.3 106, and W.sub.4 108. In this illustrative example, since the index signal 112 has a 11 value, the word W.sub.4 108 is selected. The multiplexer 120 outputs a signal 122 which is the selected word, e.g., W.sub.4 108.
(23) In some embodiments, the round robin arbiter 130 receives the output signal 122. In this example, the MUX output signal 122 is W.sub.4 108. The round robin arbiter 130 processes the requests within the W.sub.4 108. For illustrative purpose it is presumed that 2 of the 4 request signals within W.sub.4 108 are asserted high indicating that their respective components are requesting access to the common resource, e.g., the first, and third request signals are asserted high for illustrative purposes. Since the words are groups of 4 request signals, then the round robin arbiter 130 may use 2 bits to identify the request signal and its corresponding component. For example, 00 may be used to identify the first request signal within W.sub.4 108 and its corresponding component, 01 may be used to identify the second request signal within W.sub.4 108 and its corresponding component, 10 may be used to identify the third request signal within W.sub.4 108 and 11 may be used to identify the first request signal within W.sub.4 108 and its corresponding component.
(24) In some embodiments, the round robin arbiter 130 outputs an index signal 134 with values 00 corresponding to the first request signal of W.sub.4 108 being asserted high. As such, the component corresponding to the first request signal is granted access to the common resource by issuing grant signal 132. Once the component that was just granted access is done with the common resource, the round robin arbiter 130 processes the remaining request signals within the word W.sub.4 108. However, in this example since the second request signal as identified by 01 is not asserted high it indicates that the component corresponding to that request signal is not requesting access to the common resource and therefore is skipped. The round robin arbiter 130 then processes the remaining request signals within the word W.sub.4 108. In this example, since the third request signal as identified by 10 is asserted high it indicates that the component corresponding to that request signal is requesting access to the common resource. As such, the round robin arbiter 130 outputs the index signal 134 with value 10 and it outputs the grant signal 132 to grant the component corresponding to the 10 request signal within the word W.sub.4 108 access to the common resource. Once the component that was just granted access is done with the common resource, the round robin arbiter 130 processes the remaining request signals within the word W.sub.4 108. However, in this example since the fourth request signal as identified by 11 is not asserted high it indicates that the component corresponding to that request signal is not requesting access to the common resource and therefore is skipped. The round robin arbiter 130 processes the remaining request signals within the word W.sub.4 108. However, since no other request signal remains in the word W.sub.4 108, the round robin arbiter 130 notifies the round robin arbiter 110 to process the next word.
(25) It is appreciated that describing the embodiments with two stage round robin arbiters, four logical OR gates, and four words along with one multiplexer is for illustrative purposes only and should not be construed as limiting the scope of the embodiments.
(26) Referring now to
(27) It is appreciated that in some embodiments, a global enable signal 142 may also be used. The global enable signal 142 when asserted indicates that arbitration is requested. The global enable signal 142 may be sent to both the round robin arbiter 110 and round robin arbiter 130. However, since a multistage round robin arbitration architecture is used, the enable signal 142 may be logically ANDed with the enable signal 136, such that the round robin arbiter 110 only starts processing the next work or outputs after the subsequent round robin arbiters, e.g., round robin arbiter 130, is done and ready to accept the next set of requests. Moreover, it is appreciated that a flop delay 140 may be used to delay the global enable signal 142 that is input to the round robin arbiter 130. It is appreciated that the delay may be the amount of time needed for the round robin arbiter 110 to generate the index signal 112. As such, the round robin arbiter 130 is ready to process the request signals as soon as the preceding round robin arbiter, e.g., round robin arbiter 110, generates its output.
(28) Referring now to
(29) Referring now to
(30) Referring now to
(31) Referring now to
(32) It is appreciated that each round robin stage receives an enabling signal from subsequent round robin stage. For example, round robin stage 320 receives the enable signal 374 from round robin stage 350 that is subsequent to the round robin stage 320. Moreover, the round robin stage 350 receives enable signal 372 from a subsequent round robin stage (not shown). In some embodiments, a global enabling signal is also transmitted to each round robin stage that may be ANDed. In other words, the round robin stage operates only when both enabling signals are asserted high.
(33) Referring now to
(34) Referring now to
(35)
(36) In the example of
(37)
(38) In some FPGAs, each programmable tile can include at least one programmable interconnect element (INT) 950 having connections to input and output terminals 952 of a programmable logic element within the same tile, as shown by examples included in
(39) In an example implementation, a CLB 930 can include a configurable logic element (CLE) 960 that can be programmed to implement user logic plus a single programmable interconnect element (INT) 950. A BRAM 932 can include a BRAM logic element (BRL) 962 in addition to one or more programmable interconnect elements. Typically, the number of interconnect elements included in a tile depends on the height of the tile. In the pictured example, a BRAM tile has the same height as five CLBs, but other numbers (e.g., four) can also be used. A signal processing block 934 can include a DSP logic element (DSPL) 964 in addition to an appropriate number of programmable interconnect elements. An 10B 936 can include, for example, two instances of an input/output logic element (IOL) 966 in addition to one instance of the programmable interconnect element 950. As will be clear to those of skill in the art, the actual I/O pads connected, for example, to the input/output logic element 966 typically are not confined to the area of the input/output logic element 966.
(40) In the pictured example, a horizontal area near the center of the die is used for configuration, clock, and other control logic. Vertical columns 968 extending from this horizontal area or column are used to distribute the clocks and configuration signals across the breadth of the FPGA.
(41) Some FPGAs utilizing the architecture illustrated in
(42) Note that
(43) While the foregoing is directed to specific examples, other and further examples may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.