Low power low-density parity-check decoding
09935654 · 2018-04-03
Assignee
Inventors
- Behnam Sedighi (South Bend, IN, US)
- Nagaraj Prasanth Anthapadmanabhan (Bridgewater, NJ, US)
- Dusan Suvakovic (Pleasanton, CA, US)
Cpc classification
H03M13/6577
ELECTRICITY
H03M13/1122
ELECTRICITY
H03M13/112
ELECTRICITY
H03M13/6569
ELECTRICITY
H03M13/6502
ELECTRICITY
International classification
H04L1/00
ELECTRICITY
Abstract
In general, a minimum determination capability, adapted for determining one or more minimum values from a set of values, is provided. The minimum determination capability may enable, for a set of values, determination of a first minimum value representing a smallest value of the set of values and a second minimum value representing an approximation of a next-smallest value of the set of values. The minimum determination capability may enable, for a set of values where each of the values is represented as a respective set of bits at a respective set of bit positions, determination of a minimum value of the set of values based on a set of bitwise comparisons performed for the respective bit positions of the values.
Claims
1. An apparatus, comprising: a low-density parity-check (LDPC) decoder comprising a set of hardware modules configured to: receive a set of values; evaluate a first portion of the values to determine a magnitude of a minimum value of the first portion of the values; evaluate a second portion of the values to determine a magnitude of a minimum value of the second portion of the values; and determine, based on a comparison of the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values, a first minimum value representing a magnitude of a smallest value of the set of values and a second minimum value representing an approximation of a magnitude of a next-smallest value of the set of values.
2. The apparatus of claim 1, wherein the first portion of the values comprises a first half of the values and the second portion of the values comprises a second half of the values.
3. The apparatus of claim 1, wherein the first portion of the values comprises more than half of the values and the second portion of the values comprises less than half of the values.
4. The apparatus of claim 1, wherein the set of hardware modules comprises: a first hardware module configured to evaluate the first portion of the values to determine the magnitude of the minimum value of the first portion of the values; and a second hardware module configured to evaluate the second portion of the values to determine the magnitude of the minimum value of the second portion of the values.
5. The apparatus of claim 4, wherein the set of hardware modules comprises a comparator module configured to: receive the magnitude of the minimum value of the first portion of the values from the first hardware module; receive the magnitude of the minimum value of the second portion of the values from the second hardware module; compare the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values; and output, based on comparison of the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values, a select signal configured for use in indicating whether the magnitude of the minimum value of the first portion of the values or the magnitude of the minimum value of the second portion of the values is the magnitude of the first minimum value representing the smallest value of the set of values.
6. The apparatus of claim 5, wherein the set of hardware modules comprises a value multiplexer configured to: receive the magnitude of the minimum value of the first portion of the values from the first hardware module; receive the magnitude of the minimum value of the second portion of the values from the second hardware module; receive the select signal output by the comparator module; and output the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values, based on the select signal, in a manner for indicating the magnitude of the first minimum value representing the smallest value of the set of values and in a manner for indicating the magnitude of the second minimum value representing the approximation of the next-smallest value of the set of values.
7. The apparatus of claim 5, wherein: the first hardware module is configured to evaluate the first portion of the values to determine a first index associated with the minimum value of the first portion of the values; and the second hardware module is configured to evaluate the second portion of the values to determine a second index associated with the minimum value of the second portion of the values.
8. The apparatus of claim 7, wherein the set of hardware modules comprises an index multiplexer configured to: receive, from the first hardware module, the first index associated with the minimum value of the first portion of the values; receive, from the second hardware module, the second index associated with the minimum value of the second portion of the values; receive the select signal output by the comparator module; and output the first index when the magnitude of the minimum value of the first portion of the values is identified as the first minimum value representing the smallest value of the set of values or output the second index when the magnitude of the minimum value of the second portion of the values is identified as the first minimum value representing the smallest value of the set of values.
9. The apparatus of claim 1, wherein the set of values comprises M values, wherein the set of hardware modules comprises: a hierarchical tree of value comparators disposed in X stages, the hierarchical tree of value comparators comprising: X-1 stages of 2-input minimum value comparators disposed at the first X-1 stages of the hierarchical tree of value comparators; and a 2-input minimum-maximum comparator module disposed at an X-th stage of the hierarchical tree of value comparators; wherein 2.sup.X≥M.
10. The apparatus of claim 9, wherein the 2-input minimum value comparators are configured to receive two input values and output a minimum of the two input values and an index associated with the minimum of the two input values.
11. The apparatus of claim 9, wherein the 2-input minimum-maximum comparator module is configured to: receive, from a first one of the 2-input minimum value comparators disposed at the (X-1)-th stage of the hierarchical tree of value comparators, a magnitude of a first intermediate minimum input value; receive, from a second one of the 2-input minimum value comparators disposed at the (X-1)-th stage of the hierarchical tree of value comparators, a magnitude of a second intermediate minimum input value; and output, based on comparison of the magnitude of the first intermediate minimum input value and the magnitude of the second intermediate minimum input value, the first minimum value representing the smallest value of the set of values, the second minimum value representing the approximation of the next-smallest value of the set of values, and an index associated with the first minimum value representing the smallest value of the set of values.
12. The apparatus of claim 1, wherein the set of values comprises M values, wherein the set of hardware modules comprises: a hierarchical tree of value comparators disposed in X stages, the hierarchical tree of value comparators comprising: X-3 stages of 2-input minimum value comparators disposed at the first X-3 stages of the hierarchical tree of value comparators; one stage of two-input minimum-maximum comparator modules disposed at the (X-2)-th stage of the hierarchical tree of value comparators; and two stages of 4-input min1-min2 comparator modules disposed at the (X-1)-th and X-th stages of the hierarchical tree of value comparators; wherein 2.sup.X≥M.
13. The apparatus of claim 12, wherein the 2-input minimum value comparators are configured to receive two input values and output a minimum of the two input values and an index associated with the minimum of the two input values.
14. The apparatus of claim 12, wherein the 2-input minimum-maximum comparator modules are configured to: receive, from a first one of the 2-input minimum value comparators disposed at the (X-3)-th stage of the hierarchical tree of value comparators, a first minimum input value; receive, from a second one of the 2-input minimum value comparators disposed at the (X-3)-th stage of the hierarchical tree of value comparators, a second minimum input value; and output, based on comparison of the first minimum input value and the second minimum input value, a minimum output value representing a smaller of the first minimum input value and the second minimum input value and a maximum output value representing a larger of the first minimum input value and the second minimum input value.
15. The apparatus of claim 12, wherein the 4-input min1-min2 comparator modules are configured to: receive a set of input values comprising a first minimum value, a first maximum value, a second minimum value, and a second maximum value; and output, based on one or more comparisons of input values from the set of input values, a first minimum output value representing a smallest of the input values and a second minimum output value representing a next-smallest of the input values.
16. The apparatus of claim 1, wherein the set of hardware modules is configured to receive the set of values from a set of variable node units (VNUs), wherein the set of hardware modules is configured to compute a set of responses for the set of VNUs based on the first minimum value and the second minimum value.
17. The apparatus of claim 16, wherein the set of hardware modules is configured to: propagate the set of responses for the set of VNUs toward the set of VNUs.
18. The apparatus of claim 1, wherein the set of hardware modules is disposed within a check node unit (CNU) of the LDPC decoder.
19. The apparatus of claim 1, wherein the apparatus is a receiving unit configured to communicate via a communication network.
20. A method, comprising: receiving, at a low-density parity-check (LDPC) decoder comprising a set of hardware modules, a set of values; evaluating, at the LDPC decoder, a first portion of the values to determine a magnitude of a minimum value of the first portion of the values; evaluating, at the LDPC decoder, a second portion of the values to determine a magnitude of a minimum value of the second portion of the values; and determining, at the LDPC decoder based on a comparison of the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values, a first minimum value representing a magnitude of a smallest value of the set of values and a second minimum value representing an approximation of a magnitude of a next-smallest value of the set of values.
21. An apparatus, comprising: a low-density parity-check (LDPC) decoder comprising a set of variable node units (VNUs) and a check node unit (CNU), the CNU comprising a set of hardware modules configured to: receive, by the CNU from the set of VNUs, a set of values; evaluate a first portion of the values to determine a magnitude of a minimum value of the first portion of the values; evaluate a second portion of the values to determine a magnitude of a minimum value of the second portion of the values; compute a set of responses for the set of VNUs based on a comparison of the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values; and send, by the CNU toward the set of VNUs, the set of responses for the set of VNUs.
22. The apparatus of claim 21, wherein the first portion of the values comprises a first half of the values and the second portion of the values comprises a second half of the values.
23. The apparatus of claim 21, wherein the first portion of the values comprises more than half of the values and the second portion of the values comprises less than half of the values.
24. The apparatus of claim 21, wherein, to compute the set of responses for the set of VNUs based on the comparison of the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values, the set of hardware modules is configured to: determine a first minimum value (Min1) for use in computing the set of responses for the set of VNUs, wherein the first minimum value (Min1) for use in computing the set of responses for the set of VNUs is a lesser of the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values; determine a second minimum value (Min2) for use in computing the set of responses for the set of VNUs, wherein the second minimum value (Min2) for use in computing the set of responses for the set of VNUs is a greater of the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values; determine a location (Ind1) of the first minimum value (Min1) within the set of values; and compute the set of responses for the set of VNUs based on the first minimum value (Min1), the second minimum value (Min2), and the location (Ind1) of the first minimum value (Min1).
25. A method for use by a low-density parity-check (LDPC) decoder comprising a set of variable node units (VNUs) and a check node unit (CNU), the CNU comprising a set of hardware modules, the method comprising: receiving, by the CNU of the LDPC decoder from the set of VNUs of the LDPC, a set of values; evaluating, by the CNU of the LDPC decoder, a first portion of the values to determine a magnitude of a minimum value of the first portion of the values; evaluating, by the CNU of the LDPC decoder, a second portion of the values to determine a magnitude of a minimum value of the second portion of the values; computing, by the CNU of the LDPC decoder, a set of responses for the set of VNUs based on a comparison of the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values; and sending, by the CNU of the LDPC decoder toward the set of VNUs, the set of responses for the set of VNUs.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The teachings herein can be readily understood by considering the detailed description in conjunction with the accompanying drawings.
(14) To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements common to the figures.
DETAILED DESCRIPTION OF EMBODIMENTS
(15) In general, a minimum determination capability, adapted for determining one or more minimum values from a set of values, is provided. In at least some embodiments, the minimum determination capability enables, for a set of values, determination of (1) at least one of a magnitude or an identification of a first minimum value representing a smallest value of the set of values and (2) at least one of a magnitude or an identification of a second minimum value representing an approximation of a next-smallest value of the set of values. While such embodiments may be used within various contexts, such embodiments may be well-suited for use by check node units (CNUs) of a low-density parity-check (LDPC) decoder, where the CNUs are configured to determine a magnitude (typically denoted as Min.sub.1) and an identification (typically denoted as Ind.sub.1) of a first minimum value representing a smallest value of the set of values and to approximate a magnitude (typically denoted as Min.sub.2) of a second minimum value representing a next-smallest value of the set of values, in order to provide an LDPC decoder having a bit error rate (BER) performance comparable to that of conventional LDPC decoders while also reducing the complexity, power consumption, and chip area of the LDPC decoder. In at least some embodiments, the minimum determination capability enables, for a set of values where each of the values is represented as a respective set of bits at a respective set of bit positions, determination of at least one characteristic of a minimum value of the set of values (e.g., at least one of a magnitude of the minimum value of the set of values or an identification of one of the values of the set of values having the magnitude of the minimum value of the set of values) based on a set of bitwise comparisons performed for the respective bit positions of the values.
These and various other embodiments of the minimum determination capability may be better understood by way of reference to an exemplary system including an LDPC decoder, as depicted in
(16)
(17) As depicted in
(18) As further depicted in
(19) The operation of LDPC decoder 122 in decoding LDPC codewords received at LDPC decoder 122 may be better understood by first considering a commonly used LDPC decoding algorithm known as the normalized min-sum algorithm (MSA). In MSA, if a CNU is connected to M VNUs and, thus, receives M input messages (denoted as α.sub.i, i=1, . . . , M), the CNU then computes M output messages (denoted as β.sub.i, i=1, . . . , M); i.e., one for each of the connected VNUs. In MSA, the output message β.sub.i to VNU i involves the computation of the minimum of incoming messages from all the remaining VNUs j=1, . . . , M where j≠i. More specifically, for i=1 to M,
(20) $$\beta_i = S_{norm} \cdot \left( \prod_{j \neq i} \operatorname{sign}(\alpha_j) \right) \cdot \min_{j \neq i} |\alpha_j|$$
where α.sub.i and β.sub.i are the i.sup.th input and output messages of the CNU, the sign function sign(.) returns the arithmetic sign (i.e., outputs +1 or −1 depending on the sign), the absolute-value function |.| returns the arithmetic magnitude, the minimum function min(.) returns the minimum value of the arguments, and S.sub.norm is a normalization factor. The input and output messages have a word-length of w, including the sign bit (i.e., the magnitude portion of the messages is w−1 bits). It will be appreciated that, typically, the complexity of the CNU is primarily in evaluating the minimum function. In order to evaluate this minimum for each of the output messages, the computation essentially reduces to computing the first and second minimums among all input messages, as explained next. If the first and second minimums, amongst all |α.sub.i| values, are indicated by Min.sub.1=|α.sub.Ind1| and Min.sub.2, where Ind.sub.1 is the index of the first minimum, then the output of the min(.) function in the equation above is Min.sub.2 for i=Ind.sub.1, and Min.sub.1 otherwise. If more than one input has a magnitude equal to Min.sub.1 (i.e., multiple inputs have the same smallest value), then Min.sub.2 is equal to Min.sub.1 and Ind.sub.1 essentially plays no role.
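By way of illustration only, the reduction of the normalized MSA check-node computation to a first minimum, a second minimum, and the index of the first minimum might be sketched in software as follows. The function name cnu_min_sum, the variable names alpha and beta, the default normalization factor of 0.75, and the treatment of a zero-valued input as positive are assumptions made for this sketch and are not taken from the description above.

```python
from typing import List


def cnu_min_sum(alpha: List[int], s_norm: float = 0.75) -> List[float]:
    """Normalized min-sum update for one check node with M >= 2 signed inputs.

    s_norm = 0.75 is an assumed, illustrative normalization factor.
    """
    mags = [abs(a) for a in alpha]

    # First minimum (Min1) and its index (Ind1).
    ind1 = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[ind1]
    # Second minimum (Min2): smallest magnitude among all inputs except Ind1.
    min2 = min(m for j, m in enumerate(mags) if j != ind1)

    # Product of all input signs; zero is treated as positive (assumption).
    total_sign = 1
    for a in alpha:
        total_sign *= -1 if a < 0 else 1

    beta = []
    for i, a in enumerate(alpha):
        sign_i = -1 if a < 0 else 1
        # Multiplying by sign_i removes input i from the sign product.
        sign_excl_i = total_sign * sign_i
        # Output i excludes input i, so only i == Ind1 sees Min2.
        mag = min2 if i == ind1 else min1
        beta.append(s_norm * sign_excl_i * mag)
    return beta
```

For example, for the inputs [3, -1, 4, -2] the sketch yields Min.sub.1=1 at index 1 and Min.sub.2=2, so that only the output for index 1 uses Min.sub.2.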
(21) In at least some embodiments, the LDPC decoder 122 is configured to use a justified approximation of the minimum computation of MSA which still achieves good BER performance compared to use of MSA. More specifically, each CNU 125 of the LDPC decoder 122 may be configured to use a justified approximation of the minimum computation of MSA. The use of the justified approximation of the minimum computation of MSA by a CNU 125 obviates the need for the CNU 125 to be configured to compute the second minimum at each stage of the CNU 125. Thus, use of the justified approximation of the minimum computation of MSA provides BER performance comparable to that of conventional LDPC decoders while reducing the complexity, power consumption, and chip area of such conventional LDPC decoders.
(22) In at least some embodiments, a CNU 125.sub.i of LDPC decoder 122 may be configured to determine the magnitude of the first minimum value (Min.sub.1) and to approximate the magnitude of the second minimum value (Min.sub.2). The CNU 125.sub.i is configured to receive M input messages from the M VNUs 124 to which the CNU 125.sub.i is connected, where the M messages convey M |α.sub.i| values. The CNU 125.sub.i is configured to evaluate or process a first portion of the M messages to determine a magnitude of a minimum value from among the |α.sub.i| values of the messages of the first portion of the M messages, evaluate or process a second portion of the M messages to determine a magnitude of a minimum value from among the |α.sub.i| values of the messages of the second portion of the M messages, and compare the magnitude of the minimum value from among the |α.sub.i| values of the messages of the first portion of the M messages and the magnitude of the minimum value from among the |α.sub.i| values of the messages of the second portion of the M messages to determine the magnitude of the first minimum value (Min.sub.1) and an approximation of the magnitude of the second minimum value (Min.sub.2). In at least some embodiments, the first and second portions of the M messages may include equal numbers of messages (namely, M/2 messages per portion). In at least some embodiments, the first portion of M messages may include the first M/2 messages (including the |α.sub.1| through |α.sub.M/2| values) and the second portion of M messages may include the second M/2 messages (including the |α.sub.(M/2)+1| through |α.sub.M| values), although it will be appreciated that the M messages (and, thus, the |α.sub.i| values of the M messages) may be evaluated or processed in various other combinations in order to determine the magnitude of the first minimum value (Min.sub.1) and to approximate the magnitude of the second minimum value (Min.sub.2).
In at least some embodiments, the first and second portions of the M messages may include arbitrarily selected portions of the M messages. The magnitude of the first minimum value (Min.sub.1) is the lesser of the minimum |α.sub.i| value from among the |α.sub.i| values of the messages of the first portion of the M messages and the minimum |α.sub.i| value from among the |α.sub.i| values of the messages of the second portion of the M messages. The magnitude of the second minimum value (Min.sub.2) is the greater of the minimum |α.sub.i| value from among the |α.sub.i| values of the messages of the first portion of the M messages and the minimum |α.sub.i| value from among the |α.sub.i| values of the messages of the second portion of the M messages. The CNU 125.sub.i also may be configured to determine the index of the first minimum value (Min.sub.1), which provides an indication of the location of the first minimum value (Min.sub.1) within the |α.sub.i| values of the M messages (i.e., identification of which of the |α.sub.i| values of the M messages has the magnitude given by the first minimum value (Min.sub.1)). The CNU 125.sub.i also may be configured to determine the index of the second minimum value (Min.sub.2), which provides an indication of the location of the second minimum value (Min.sub.2) within the |α.sub.i| values of the M messages (i.e., identification of which of the |α.sub.i| values of the M messages has the magnitude given by the second minimum value (Min.sub.2)).
(23) Accordingly, it will be appreciated that the first minimum value (Min.sub.1) that is output will be the magnitude of the smallest value of all of the |α.sub.i| values of the M messages received by CNU 125.sub.i, and that the second minimum value (Min.sub.2) that is output may or may not be the magnitude of the true next smallest value of all of the |α.sub.i| values of the M messages received by CNU 125.sub.i (and, thus, is described herein as providing an approximation of the magnitude of the second minimum value (Min.sub.2)). For example, if the smallest |α.sub.i| value of the M messages is in the first portion of messages evaluated or processed and the next smallest |α.sub.i| value of the M messages is in the second portion of messages evaluated or processed, then the first minimum value (Min.sub.1) will be the magnitude of the smallest value of all of the |α.sub.i| values of the M messages received by CNU 125.sub.i and the second minimum value (Min.sub.2) will be the magnitude of the true next smallest value of all of the |α.sub.i| values of the M messages received by CNU 125.sub.i. By contrast, for example, if the smallest |α.sub.i| value of the M messages and the next smallest |α.sub.i| value of the M messages are in the same portion of messages evaluated or processed, then the first minimum value (Min.sub.1) will be the magnitude of the smallest value of all of the |α.sub.i| values of the M messages received by CNU 125.sub.i but the second minimum value (Min.sub.2) will not be the magnitude of the true next smallest value of all of the |α.sub.i| values of the M messages received by CNU 125.sub.i (and, thus, the second minimum value (Min.sub.2) overestimates the magnitude of the true next smallest value of all of the |α.sub.i| values of the M messages).
In other words, the first minimum value (Min.sub.1) will always be computed correctly, and the second minimum value (Min.sub.2) may or may not be computed correctly (and, thus, again, is considered to be an approximation of the second minimum value (Min.sub.2)). However, since M−1 outputs of a CNU depend only on the first minimum value (Min.sub.1), only one output of the CNU might suffer from error. Thus, given only a potential for a minimal increase in error resulting from approximating the second minimum value (Min.sub.2), it is possible to achieve BER performance comparable to that of conventional LDPC decoders while reducing chip area, complexity, and power consumption of conventional LDPC decoders.
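The effect of the approximation on the CNU outputs may be illustrated by a minimal software sketch that compares an exact determination of the two minimums against the half-versus-half approximation described above. The function names exact_min1_min2, approx_min1_min2, and response_magnitudes are hypothetical, and resolving ties toward the first portion is an assumption of the sketch rather than a feature stated in the description.

```python
from typing import Callable, List, Tuple

Finder = Callable[[List[int]], Tuple[int, int, int]]


def exact_min1_min2(mags: List[int]) -> Tuple[int, int, int]:
    """Reference: true (Min1, Min2, Ind1) over all M magnitudes."""
    ind1 = min(range(len(mags)), key=mags.__getitem__)
    min2 = min(m for j, m in enumerate(mags) if j != ind1)
    return mags[ind1], min2, ind1


def approx_min1_min2(mags: List[int]) -> Tuple[int, int, int]:
    """Approximation: compare only the minima of the two halves."""
    half = len(mags) // 2
    ind_a = min(range(half), key=mags.__getitem__)
    ind_b = min(range(half, len(mags)), key=mags.__getitem__)
    if mags[ind_a] <= mags[ind_b]:  # ties go to the first half (assumption)
        return mags[ind_a], mags[ind_b], ind_a
    return mags[ind_b], mags[ind_a], ind_b


def response_magnitudes(mags: List[int], finder: Finder) -> List[int]:
    """Magnitude sent back to each VNU: Min2 for Ind1, Min1 for all others."""
    min1, min2, ind1 = finder(mags)
    return [min2 if i == ind1 else min1 for i in range(len(mags))]
```

For the magnitudes [2, 3, 7, 5], the two smallest values lie in the same half, so the sketch returns response magnitudes [3, 2, 2, 2] for the exact determination and [5, 2, 2, 2] for the approximation; only the output at the index of the first minimum differs, and it differs by overestimation. For [5, 2, 7, 3], where the two smallest values lie in different halves, both determinations give [2, 3, 2, 2].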
(24) It will be appreciated that, although primarily depicted and described with respect to operation of the CNU 125.sub.i of LDPC decoder 122 in computing the magnitudes of the minimum values, the CNU 125.sub.i of LDPC decoder 122 may be configured to perform various other functions (e.g., sign calculation and the like, as will be understood by one skilled in the art) which have been omitted herein for purposes of clarity.
(25) It will be appreciated that, following calculation of first minimum value (Min.sub.1) and the second minimum value (Min.sub.2) as discussed above, decoding may proceed in the normal manner, a description of which has been omitted herein for purposes of clarity.
(26) An exemplary embodiment of a minimum determination module 126.sub.i of a CNU 125.sub.i of LDPC decoder 122 is depicted and described with respect to
(27)
(28) The processing modules 210 each are configured to receive a set of M/2 |α.sub.i| values (from among the set of M |α.sub.i| values received in the M messages from the M VNUs to which the CNU is connected) and to determine a magnitude of a minimum |α.sub.i| value from among the set of M/2 |α.sub.i| values and an associated index of that minimum |α.sub.i| value, respectively. More specifically, processing modules 210 are configured such that (1) first processing module 210.sub.A is configured to receive a first half of the M |α.sub.i| values (e.g., including the |α.sub.1| through |α.sub.M/2| values) and to determine a magnitude of a minimum |α.sub.i| value from among the set of M/2 |α.sub.i| values (denoted as min_A) and an associated index (denoted as Ind_A) which indicates which of the M/2 |α.sub.i| values has the magnitude of the minimum |α.sub.i| value from among the set of M/2 |α.sub.i| values and (2) second processing module 210.sub.B is configured to receive a second half of the M |α.sub.i| values (e.g., including the |α.sub.(M/2)+1| through |α.sub.M| values) and to determine a magnitude of a minimum |α.sub.i| value from among the set of M/2 |α.sub.i| values (denoted as min_B) and an associated index (denoted as Ind_B) which indicates which of the M/2 |α.sub.i| values has the magnitude of the minimum |α.sub.i| value from among the set of M/2 |α.sub.i| values. The minimum value outputs of the processing modules 210 (min_A, min_B) are provided as inputs to comparator 220 and as inputs to value multiplexer 230. The index outputs of the processing modules 210 (Ind_A, Ind_B) are provided as inputs to index multiplexer 240.
(29) The comparator 220 is configured to compare the minimum value outputs of the processing modules 210 (min_A, min_B) to determine which of the minimum values is smaller, and to generate a select signal for the value multiplexer 230 on the basis of which of the minimum values (min_A, min_B) is smaller. The value multiplexer 230 is configured to receive the minimum value outputs of the processing modules 210 (min_A, min_B) and, under control of the select signal from comparator 220, to output the minimum value outputs of the processing modules 210 in a manner for indicating which of the minimum value outputs of the processing modules 210 is output as the first minimum value (Min.sub.1) of the set of M |α.sub.i| values and which of the minimum value outputs of the processing modules 210 is output as the second minimum value (Min.sub.2) of the set of M |α.sub.i| values. The value multiplexer 230 is configured to pass the select signal from the comparator 220 through to index multiplexer 240 for controlling outputting of the first index (Ind.sub.1) of the first minimum value (Min.sub.1) in accordance with outputting of the minimum value outputs of the value multiplexer 230. The index multiplexer 240 is configured to receive the indexes of the processing modules 210 (Ind_A, Ind_B), and to output the indexes of the processing modules 210 in a manner for associating the first index (Ind.sub.1) of the first minimum value (Min.sub.1) with the first minimum value (Min.sub.1) output from value multiplexer 230.
(30) In this manner, the smallest of the minimum value outputs of the processing modules 210 may be output as the first minimum value (Min.sub.1) of the set of M |α.sub.i| values and the next smallest of the minimum value outputs of the processing modules 210 may be output as the second minimum value (Min.sub.2) of the set of M |α.sub.i| values and, further, the first index (Ind.sub.1) of the first minimum value (Min.sub.1) may be associated with the first minimum value (Min.sub.1) to indicate the location of the first minimum value (Min.sub.1) within the set of M |α.sub.i| values (and, optionally, the second index (Ind.sub.2) of the second minimum value (Min.sub.2) may be associated with the second minimum value (Min.sub.2) to indicate the location of the second minimum value (Min.sub.2) within the set of M |α.sub.i| values).
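The signal flow among the processing modules, the comparator, the value multiplexer, and the index multiplexer may be modeled, purely as a behavioral sketch, along the following lines. The function names, the encoding of the select signal as 0 or 1, the resolution of ties toward min_A, and the conversion of each local index into a position within the full set of M values by adding an offset are all assumptions of this sketch; the description above does not specify them.

```python
from typing import List, Tuple


def processing_module(values: List[int], offset: int) -> Tuple[int, int]:
    """Model of a processing module 210: minimum magnitude and its position.

    `offset` converts the local index into a position in the full input set.
    """
    local = min(range(len(values)), key=values.__getitem__)
    return values[local], offset + local


def comparator(min_a: int, min_b: int) -> int:
    """Model of comparator 220: select is 0 if min_A <= min_B, otherwise 1."""
    return 0 if min_a <= min_b else 1


def value_mux(min_a: int, min_b: int, select: int) -> Tuple[int, int]:
    """Model of value multiplexer 230: returns (Min1, Min2)."""
    return (min_a, min_b) if select == 0 else (min_b, min_a)


def index_mux(ind_a: int, ind_b: int, select: int) -> Tuple[int, int]:
    """Model of index multiplexer 240: returns (Ind1, optional Ind2)."""
    return (ind_a, ind_b) if select == 0 else (ind_b, ind_a)


def minimum_determination(mags: List[int]) -> Tuple[int, int, int, int]:
    """Behavioral model of the whole module for an even number of inputs."""
    half = len(mags) // 2
    min_a, ind_a = processing_module(mags[:half], 0)
    min_b, ind_b = processing_module(mags[half:], half)
    select = comparator(min_a, min_b)        # one comparison drives both muxes
    min1, min2 = value_mux(min_a, min_b, select)
    ind1, ind2 = index_mux(ind_a, ind_b, select)
    return min1, min2, ind1, ind2
```

In this model a single comparison result steers both multiplexers, which mirrors the sharing of the select signal between value multiplexer 230 and index multiplexer 240. Note that the Ind.sub.2 returned here locates the approximated second minimum, which is not necessarily the true next-smallest input.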
(31) It will be appreciated that, although primarily depicted and described with respect to embodiments in which processing modules 210 are configured such that (1) first processing module 210.sub.A is configured to receive and process a first half of the M |α.sub.i| values (e.g., including the specific |α.sub.1| through |α.sub.M/2| values) and (2) second processing module 210.sub.B is configured to receive and process a second half of the M |α.sub.i| values (e.g., including the specific |α.sub.(M/2)+1| through |α.sub.M| values), the processing modules 210 may be configured such that the processing modules 210 receive respective halves of the M |α.sub.i| values but the specific |α.sub.i| values that are provided to the processing modules 210 are arranged differently (e.g., first processing module 210.sub.A is configured to receive and process the |α.sub.1| through |α.sub.M/4| values and the |α.sub.(3M/4)+1| through |α.sub.M| values and second processing module 210.sub.B is configured to receive and process the |α.sub.(M/4)+1| through |α.sub.3M/4| values), the processing modules 210 may be configured such that the processing modules 210 receive different sized portions of the |α.sub.i| values (e.g., first processing module 210.sub.A receives and processes greater than M/2 |α.sub.i| values and second processing module 210.sub.B receives and processes less than M/2 |α.sub.i| values), or the like, as well as various combinations thereof. Accordingly, it will be appreciated that, although primarily depicted and described with respect to a specific arrangement of functions using specific numbers, types, and arrangements of modules, functions of minimum determination module 200 of
(32) Referring again to
(33)
(34) As depicted in
(35) The minimum determination module 300 includes X-1 stages of 2-input minimum modules 310. The 2-input minimum modules 310 each include two value inputs for receiving two values to be compared and one value output for outputting the minimum value of the two compared values. In the case of the first stage of the tree structure, the two value inputs of the 2-input minimum module 310 receive two |.sub.i| values received by minimum determination module 300 from the VNUs to which the CNU is connected. In the case of any additional stages of the tree structure other than the first stage of the tree structure (e.g., a k-th stage), the two value inputs of the 2-input minimum module 310 receive two |.sub.i| values output from two 2-input minimum modules 310 at the previous stage (e.g., a (k-1)th stage) of the tree structure. Additionally, in the case of any additional stages of the tree structure other than the first stage of the tree structure (e.g., a k-th stage), the 2-input minimum module 310 also includes (1) two index inputs for receiving indexes associated with the two values received via the two value inputs and (2) one index output for outputting the one of the two received indexes associated with the minimum value output from the value output. As discussed further below, the index may be propagated in various ways (e.g., the index is log.sub.2(M) bits long, in which case one of the two received indexes is simply output at each stage; the index grows by 1 bit at each stage, in which case one of the two received indexes is output and an extra bit is appended depending on which input was the minimum; or the like). An exemplary 2-input minimum module 310 is depicted in
(36) The 2-input minimum module 310 is configured for use at a stage other than the first stage of the tree structure. The 2-input minimum module 310 includes a minimum determination element 311, a value multiplexer 312, and an index multiplexer 313. The minimum determination element 311 receives the two values (denoted as A and B) from the previous stage of the tree structure, the value multiplexer 312 also receives the two values (again, denoted as A and B) from the previous stage of the tree structure, and the index multiplexer 313 receives the two indexes (denoted as Ind.sub.A and Ind.sub.B, which are associated with values A and B, respectively) from the previous stage of the tree structure. The minimum determination element 311 compares the two values to determine which of the two values is smaller, and outputs a signal indicative as to which of the two values is smaller. The indication as to which of the two values is smaller is used as a control signal for both the value multiplexer 312 and the index multiplexer 313. If a determination is made that value A is less than value B, the signal indicative as to which of the two values is smaller that is output from minimum determination element 311 causes value multiplexer 312 to select the input corresponding to value A and, similarly, causes index multiplexer 313 to select the input corresponding to Ind.sub.A. Alternatively, if a determination is made that value B is less than value A, the signal indicative as to which of the two values is smaller that is output from minimum determination element 311 causes value multiplexer 312 to select the input corresponding to value B and, similarly, causes index multiplexer 313 to select the input corresponding to Ind.sub.B.
In this manner, the minimum values and associated indexes for the minimum values may be propagated toward the 2-input min-max module 320 for a final determination of the first minimum value (Min.sub.1) which is the smallest of the M |.sub.i| values received by the minimum determination module 300 and the second minimum value (Min.sub.2) which is an approximation of the next smallest of the M|.sub.i| values received by the minimum determination module 300. It will be appreciated that a 2-input minimum module 310 for use at the first stage of the tree structure may omit the index multiplexer 313 and, rather, may simply output an index associated with the minimum value for use at the next stage of the tree structure. It will be appreciated that, although depicted and described with respect to a specific embodiment of a 2-input minimum module 310 (illustratively, exemplary 2-input minimum module 310), the 2-input minimum module 310 may be implemented in various other ways in order to provide functions of the 2-input minimum module 310 as presented herein.
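The behavior of a 2-input minimum module with a comparator, a value multiplexer, and an index multiplexer can be sketched as follows. The function name and the handling of equal inputs (value A is kept) are assumptions; the text above only addresses the cases in which one value is strictly smaller.

```python
def two_input_minimum(a, b, ind_a, ind_b):
    """Software model of one 2-input minimum module.

    Returns (value, index) for the smaller of the two inputs.
    """
    # Minimum determination element: produce the control signal.
    select_b = b < a  # assumption: A is kept when the two values are equal

    # Value multiplexer and index multiplexer share the same control signal.
    value = b if select_b else a
    index = ind_b if select_b else ind_a
    return value, index


print(two_input_minimum(6, 4, 2, 5))
```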
(37) The minimum determination module 300 includes a 2-input min-max module 320 in the X.sup.th stage of the tree structure. The 2-input min-max module 320 includes two value inputs for receiving two values from the (X-1).sup.th stage of the tree structure and two value outputs for outputting the two values based on comparison of the two values. The 2-input min-max module 320 is configured to compare the two values received via the two value inputs, and to output the two values from the two value outputs in a manner for indicating (1) the first minimum value (Min.sub.1), which is the smaller of the two values received by the 2-input min-max module 320 and provides the magnitude of the smallest value of the M|.sub.i| values received by the minimum determination module 300 and (2) the second minimum value (Min.sub.2), which is the larger of the two values received by the 2-input min-max module 320 and provides an approximation of the magnitude of the next-smallest value of the M|.sub.i| values received by the minimum determination module 300. Additionally, the 2-input min-max module 320 also includes (1) two index inputs for receiving indexes associated with the two values received via the two value inputs and (2) one index output for outputting the one of the two received indexes associated with the first minimum value (Min.sub.1) determined by 2-input min-max module 320 (which, as discussed herein, is indicative of a location, within the M|.sub.i| values received by the minimum determination module 300, of the |.sub.i| value providing the magnitude of the smallest value of the M|.sub.i| values received by the minimum determination module 300). 
As discussed further below, the index may be propagated in various ways (e.g., the index is log.sub.2(M) bits long, in which case one of the two received indexes is simply output at each stage; the index grows by 1 bit at each stage, in which case one of the two received indexes is output and an extra bit is appended depending on which input was the minimum; or the like). An exemplary 2-input min-max module 320 is depicted in
(38) The 2-input min-max module 320 includes a minimum determination element 321, a minimum value multiplexer 322.sub.min and a maximum value multiplexer 322.sub.max, and an index multiplexer 323. The minimum determination element 321 receives the two values (denoted as A and B) from the (X-1).sup.th stage of the tree structure, the minimum value multiplexer 322.sub.min and the maximum value multiplexer 322.sub.max each also receive the two values (again, denoted as A and B) from the (X-1).sup.th stage of the tree structure, and the index multiplexer 323 receives the two indexes (denoted as Ind.sub.A and Ind.sub.B, which are associated with values A and B, respectively) from the (X-1).sup.th stage of the tree structure. The minimum determination element 321 compares the two values to determine which of the two values is smaller, and outputs a signal (e.g., typically 1 or 0, although any suitable signal may be used) indicative as to which of the two values is smaller. The indication as to which of the two values is smaller is used as a control signal for both the minimum value multiplexer 322.sub.min and the maximum value multiplexer 322.sub.max, as well as for the index multiplexer 323. If a determination is made that value A is less than value B, the signal indicative as to which of the two values is smaller that is output from minimum determination element 321 causes minimum value multiplexer 322.sub.min to select the input corresponding to value A and causes the maximum value multiplexer 322.sub.max to select the input corresponding to value B and, further, causes index multiplexer 323 to select the input corresponding to Ind.sub.A. 
Alternatively, if a determination is made that value B is less than value A, the signal indicative as to which of the two values is smaller that is output from minimum determination element 321 causes minimum value multiplexer 322.sub.min to select the input corresponding to value B and causes the maximum value multiplexer 322.sub.max to select the input corresponding to value A and, further, causes index multiplexer 323 to select the input corresponding to Ind.sub.B. In this manner, the 2-input min-max module 320 is able to output the first minimum value (Min.sub.1) which is the magnitude of the smallest of the M|.sub.i| values received by the minimum determination module 300 and the second minimum value (Min.sub.2) which is an approximation of the magnitude of the next smallest of the M|.sub.i| values received by the minimum determination module 300. It will be appreciated that, although depicted and described with respect to a specific embodiment of a 2-input min-max module 320 (illustratively, exemplary 2-input min-max module 320), the 2-input min-max module 320 may be implemented in various other ways in order to provide functions of the 2-input min-max module 320 as presented herein.
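A tree of 2-input minimum stages followed by a final 2-input min-max stage can be sketched in software as shown below. This sketch assumes that the number of inputs is a power of two, that the index grows by one bit per stage, and that each newly determined bit is placed in the most significant position so that the final index reads as a 0-based position. The function name and tie-breaking are also assumptions.

```python
def minimum_tree(magnitudes):
    """Return (min1, approx_min2, index_of_min1) for a power-of-two input set."""
    # Each node is (value, index bits determined so far as a string).
    nodes = [(v, "") for v in magnitudes]

    # Stages of 2-input minimum modules: keep only the smaller value of each pair.
    while len(nodes) > 2:
        next_stage = []
        for k in range(0, len(nodes), 2):
            (a, ind_a), (b, ind_b) = nodes[k], nodes[k + 1]
            if b < a:
                next_stage.append((b, "1" + ind_b))
            else:
                next_stage.append((a, "0" + ind_a))
        nodes = next_stage

    # Final stage: the 2-input min-max module outputs both values.
    (a, ind_a), (b, ind_b) = nodes
    if b < a:
        return b, a, int("1" + ind_b, 2)
    return a, b, int("0" + ind_a, 2)


print(minimum_tree([5, 3, 9, 7, 4, 8, 6, 2]))
```

The second output is the minimum of the half that does not contain the overall minimum, which is why it is only an approximation of the next-smallest magnitude.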
(39) As discussed above, minimum determination module 300, in addition to supporting propagation of |.sub.i| values, also supports propagation of associated index values. In at least some embodiments (as presented in
(40) It will be appreciated that, although primarily depicted and described with respect to embodiments of the minimum determination module 200 in which only a single 2-input min-max module 320 is used in the tree structure and the remaining 2-input elements of the tree structure are 2-input minimum modules 310, in at least some embodiments the minimum determination module may be configured to use 2-input min-max modules at one or more earlier stages of the tree structure, in which case the stage(s) of the tree structure preceding the 2-input min-max modules may include 2-input minimum modules 310 (i.e., such that fewer 2-input minimum modules 310 would be used) and the stage(s) of the tree structure following the 2-input min-max modules may include 4-input min1-min2 modules (discussed further below). An exemplary embodiment of the minimum determination module 200 in which multiple 2-input min-max modules are used in the tree structure is depicted and described with respect to
(41)
(42) The minimum determination module 400 includes X-3 stages of 2-input minimum modules 410 (in the first (X-3) stages). The 2-input minimum modules 410 of
(43) The minimum determination module 400 includes one stage of 2-input min-max modules 420 (in the (X-2).sup.th stage). The 2-input min-max modules 420 of
(44) The minimum determination module 400 includes two stages of 4-input min1-min2 modules 430 (in the (X-1).sup.th and X.sup.th stages). The 4-input min1-min2 modules 430 each include two sets of value inputs for receiving four values from the previous stage of the tree structure and two value outputs for outputting the two values based on comparison of the four input values. The 4-input min1-min2 modules 430 each are configured to compare two pairs of values received via the two sets of value inputs, and to output the two values from the two value outputs in a manner for indicating (1) the first minimum value (Min.sub.1), which is the smallest of two of the values in a first pair of values received by the 4-input min1-min2 module 430 and provides the magnitude of the smallest value of the M|.sub.i| values received by the minimum determination module 400 and (2) the second minimum value (Min.sub.2), which is smallest of the remaining three values received by the 4-input min1-min2 modules 430 and provides an approximation of the magnitude of the next-smallest value of the M|.sub.i| values received by the minimum determination module 400. Additionally, each 4-input min1-min2 module also includes (1) two index inputs for receiving two indexes associated with values received via the two sets of value inputs, respectively and (2) one index output for outputting the one of the two received indexes associated with the first minimum value (Min.sub.1) determined by 4-input min1-min2 modules 430 (which, as discussed herein, is indicative of a location, within the M|.sub.i| values received by the minimum determination module 400, of the |.sub.i| value providing the magnitude of the smallest value of the M|.sub.i| values received by the minimum determination module 400). 
As discussed further below, the index may be propagated in various ways (e.g., the index is log.sub.2(M) bits long, in which case one of the two received indexes is simply output at each stage; the index grows by 1 bit at each stage, in which case one of the two received indexes is output and an extra bit is appended depending on which input was the minimum; or the like). An exemplary 4-input min1-min2 module 430 is depicted in
(45) The 4-input min1-min2 module 430 includes a first minimum determination element 431.sub.1, a second minimum determination element 431.sub.2, four multiplexers 432.sub.1-432.sub.4 (collectively, multiplexers 432), and an index multiplexer 433. The first minimum determination element 431.sub.1 receives the two values (denoted as Min1.sub.A and Min1.sub.B) from the previous stage of the tree structure, compares the two values to determine which of the two values is smaller, and outputs a signal (denoted as Ind[q], e.g., typically 1 or 0 although any suitable signal may be used) indicative as to which of the two values is smaller. The first multiplexer 432.sub.1 receives two values (denoted as Min1.sub.A and Min1.sub.B) from the previous stage of the tree structure, selects the smaller of the two values, and outputs the smaller value as Min.sub.1. The second multiplexer 432.sub.2 receives two values (denoted as Min1.sub.A and Min1.sub.B) from the previous stage of the tree structure, selects the larger of the two values, and outputs the larger value as an input to the second minimum determination element 431.sub.2 and as an input to the fourth multiplexer 432.sub.4. The third multiplexer 432.sub.3 receives two values (denoted as Min2.sub.A and Min2.sub.B) from the previous stage of the tree structure and provides an appropriate second input for second minimum determination element 431.sub.2. The second minimum determination element 431.sub.2, if Min1.sub.A<Min1.sub.B, compares Min1.sub.B with Min2.sub.A (which is output by third multiplexer 432.sub.3) to determine Min.sub.2. The second minimum determination element 431.sub.2, if Min1.sub.B<Min1.sub.A, compares Min1.sub.A with Min2.sub.B (which is output by third multiplexer 432.sub.3) to determine Min.sub.2.
The fourth multiplexer 432.sub.4 receives the same two input values as the second minimum determination element 431.sub.2 and selects the smaller of the two values based on the control signal received from second minimum determination element 431.sub.2 and outputs the smaller value as Min.sub.2. It is noted that in each stage the Ind[q] bit is appended to the preceding Ind[q-1] . . . Ind[1] to form the entire index of the true minimum value from the set of |.sub.i| values input to minimum determination module 400, which provides an identification of which of the |.sub.i| values input to minimum determination module 400 has a magnitude that corresponds to the true minimum value from the set of |.sub.i| values input to minimum determination module 400. It will be appreciated that, although depicted and described with respect to a specific embodiment of a 4-input min1-min2 module 430 (illustratively, exemplary 4-input min1-min2 module 430), the 4-input min1-min2 module 430 may be implemented in various other ways in order to provide functions of the 4-input min1-min2 module 430 as presented herein.
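The selection logic of the 4-input min1-min2 module can be modeled in a few lines. The following is a sketch under stated assumptions: the function name is hypothetical, the polarity of the index bit (0 for side A, 1 for side B) is assumed, and equal Min1 inputs are resolved in favor of side A, a case the description above does not address.

```python
def four_input_min1_min2(min1_a, min2_a, min1_b, min2_b):
    """Software model of one 4-input min1-min2 module.

    Returns (min1, min2, ind_q), where ind_q is the index bit for this stage.
    """
    # First minimum determination element: which Min1 input is smaller?
    ind_q = 0 if min1_a <= min1_b else 1  # assumption: ties favor side A

    min1 = min1_b if ind_q else min1_a      # first multiplexer: smaller Min1
    larger = min1_a if ind_q else min1_b    # second multiplexer: larger Min1
    partner = min2_b if ind_q else min2_a   # third multiplexer: Min2 of the winning side

    # Second minimum determination element plus fourth multiplexer.
    min2 = partner if partner < larger else larger
    return min1, min2, ind_q


print(four_input_min1_min2(2, 5, 3, 4))
```

The second output is chosen between the losing Min1 and the Min2 that accompanied the winning Min1; the Min2 of the losing side is never smaller than the losing Min1, so it does not need to be examined.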
(46) It will be appreciated that, although primarily depicted and described with respect to specific embodiments of the minimum determination module 200 (illustratively, minimum determination module 300 of
(47) Referring again to
(48)
(49) The processing module 500 processes the group of 8 |.sub.i| values from the 8 messages received from 8 VNUs 124 in order to determine the minimum |.sub.i| value from among the 8 |.sub.i| values of the group (which gives the magnitude of the minimum |.sub.i| value from among the 8 |.sub.i| values of the group, but does not indicate which of the 8 |.sub.i| values of the group corresponds to the minimum |.sub.i| value from among the 8 |.sub.i| values of the group) and the index of the minimum |.sub.i| value from among the 8 |.sub.i| values of the group (which identifies which of the 8 |.sub.i| values of the group corresponds to the minimum |.sub.i| value from among the 8 |.sub.i| values of the group (e.g., a location of the minimum |.sub.i| value from among the 8 |.sub.i| values of the group within the 8 |.sub.i| values of the group), but does not indicate the magnitude of the minimum |.sub.i| value from among the 8 |.sub.i| values of the group). The 8 MSBs of bit set 501.sub.3 are provided to zero detector module 510.sub.3, which performs a logical AND operation on the 8 MSBs to produce a corresponding found signal f(bit 3). The found signal f(bit 3) is (a) output as the bit for the 3.sup.rd bit position (MSB) of the minimum |.sub.i| value from among the 8 |.sub.i| values being processed by processing module 500 and (b) fed back as an input to the mask generation module 520.sub.3 that is associated with the 8 MSBs of bit set 501.sub.3. The found signal f(bit 3) is set equal to 0 based on detection of the presence of at least one zero bit among the 8 bits of bit set 501.sub.3, and is set to 1 otherwise. The mask generation module 520.sub.3 receives the 8 MSBs of bit set 501.sub.3 and the found signal f(bit 3) output from zero detector module 510.sub.3, and uses the 8 MSBs of bit set 501.sub.3 and the found signal f(bit 3) to produce a disable signal (denoted as ds(bit 2), which is an 8-bit signal defined as ds(bit 2)=ds.sub.1(bit 2), ds.sub.2(bit 2), . . . , ds.sub.8(bit 2), where the subscripts correspond to the 8 |.sub.i| values) for use by mask application module 530.sub.2 associated with the bit set 501.sub.2 including the next MSBs of the 8 |.sub.i| values. The mask generation module 520.sub.3 produces the disable signal ds(bit 2) by setting, based on a determination that the found signal f(bit 3) is active (e.g., found signal f(bit 3)=0), a corresponding bit of the disable signal ds(bit 2) equal to 1 for each bit of bit set 501.sub.3 that is equal to 1. For example, if the 8 MSBs of bit set 501.sub.3 are 10011011 and f(bit 3) is 0 (indicative that the 8 MSBs of bit set 501.sub.3 included at least one 0), then ds(bit 2) will be 10011011. The 8 bits of bit set 501.sub.2, rather than being provided directly to zero detector module 510.sub.2, are provided to the mask application module 530.sub.2 associated with the bit set 501.sub.2. The mask application module 530.sub.2 associated with the bit set 501.sub.2 receives the 8 bits of bit set 501.sub.2 and the disable signal ds(bit 2) generated by mask generation module 520.sub.3, and masks the 8 bits of bit set 501.sub.2 with the 8 bits of the disable signal ds(bit 2) to produce a masked bit set 502.sub.2 (including 8 masked bits, denoted as .sub.1(bit 2)-.sub.8(bit 2)) which is provided to the zero detector module 510.sub.2 instead of the 8 bits of bit set 501.sub.2. The disable signal ds(bit 2) forces to 1 those bits of bit set 501.sub.2 (namely, bits |.sub.k|(bit 2)) for which the corresponding bits of bit set 501.sub.3 (namely, bits |.sub.k|(bit 3)) are equal to 1 if, for at least one value of m (where m≠k), |.sub.m|(bit 3) equals 0. The 8 masked bits of masked bit set 502.sub.2 are provided to zero detector module 510.sub.2 associated with the bit set 501.sub.2, which performs a logical AND operation on the 8 masked bits to produce a corresponding found signal f(bit 2).
The found signal f(bit 2) is (a) output as the bit for the 2.sup.nd bit position (second MSB) of the minimum |.sub.i| value from among the 8 |.sub.i| values being processed by processing module 500 and (b) fed back as an input to the mask generation module 520.sub.2 that is associated with the 8 bits of bit set 501.sub.2. The found signal f(bit 2) is set equal to 0 based on detection of the presence of at least one zero bit among the 8 bits of masked bit set 502.sub.2, and is set to 1 otherwise. The mask generation module 520.sub.2 receives the 8 bits of bit set 501.sub.2 and the found signal f(bit 2) output from zero detector module 510.sub.2, and uses the 8 bits of bit set 501.sub.2 and the found signal f(bit 2) to produce a disable signal (denoted as ds(bit 1) which is an 8-bit signal defined as ds(bit 1)=ds.sub.1(bit 1), ds.sub.2(bit 1), . . . , ds.sub.8(bit1) where the subscripts correspond to the 8 |.sub.i| values) for use by mask application module 530.sub.1 associated with the bit set 501.sub.1 including the next MSBs of the 8 |.sub.i| values. The processing then continues as discussed above in order to produce a corresponding found signal f(bit 1) which is output as the bit for the 1.sup.st bit position (third MSB) of the minimum |.sub.i| value from among the 8 |.sub.i| values being processed by processing module 500 and to produce a corresponding found signal f(bit 0) which is output as the bit for the 0.sup.th bit position (LSB) of the minimum |.sub.i| value from among the 8 |.sub.i| values being processed by the processing module 500. In this manner, the concatenation of the found signals f(bit 3)-f(bit 0) provides the minimum |.sub.i| value from among the 8 |.sub.i| values of the group. Additionally, the index determination module 540, which is associated with the bit position of the LSB, is configured to determine the index of the minimum |.sub.i| value from among the 8 |.sub.i| values of the group. 
The index determination module 540 receives the 8 bits of bit set 501.sub.0 and the 8 bits of the masked bit set 502.sub.0, and uses a truth table to determine the index of the minimum |.sub.i| value from among the 8 |.sub.i| values of the group. Accordingly, as depicted in
(50)
(51)
(52) The mask generation module 710 receives the 8 bits of the bit set for bit position p (denoted as |.sub.1|(bit p)-|.sub.8|(bit p)) and the found signal f(bit p) output from the zero detector of bit position p, and produces a disable signal ds(bit p-1) for use by mask application module 720 associated with bit position p-1. The mask generation module 710 includes 8 AND gates 711.sub.1-711.sub.8 (collectively, AND gates 711) and an inverter 712. The AND gates 711 each include two inputs and one output. The inverter 712 includes a single input and a single output. The 8 bits of the bit set (namely, |.sub.1|(bit p)-|.sub.8|(bit p)) are input into first inputs of the 8 AND gates 711.sub.1-711.sub.8, respectively. The inverter 712 receives found signal f(bit p) at its input and outputs an inverted found signal (the logical complement of f(bit p)). The inverted found signal is input into each of the second inputs of the 8 AND gates 711.sub.1-711.sub.8. If the found signal f(bit p) for bit position p is a 0 (indicative that at least one of the bits at bit position p was a 0) then the inverted found signal is a 1 such that, for each of the |.sub.i|(bit p) values of bit position p that were 1, the associated AND gate 711.sub.i will ensure that the corresponding disable signal ds.sub.i(bit p-1) for the next bit position p-1 is also a 1 since those |.sub.i| values cannot be the minimum value of the set of |.sub.i| values and, thus, should not be evaluated as part of the zero detection performed at the next bit position p-1. If the found signal f(bit p) is a 1 (indicative that all of the bits at bit position p were 1) then the inverted found signal is a 0 such that, even though the |.sub.i|(bit p) values of bit position p were 1, the associated AND gates 711 will ensure that the corresponding disable signals ds(bit p-1) for the next bit position p-1 are 0.
The outputs of the 8 AND gates 711.sub.1-711.sub.8 form the disable signal ds(bit p-1) for use by mask application module 720 associated with bit position p-1.
(53) The mask application module 720 receives the 8 bits of the bit set for bit position p-1 (denoted as |.sub.1|(bit p-1)-|.sub.8|(bit p-1)) and the disable signal ds(bit p-1) from mask generation module 710, and produces the 8 bits of the masked bit set for bit position p-1 (denoted as masked bits .sub.1(bit p-1)-.sub.8(bit p-1)). The mask application module 720 includes 8 OR gates 721.sub.1-721.sub.8 (collectively, OR gates 721), each of which includes two inputs and one output. The 8 bits of the bit set (namely, |.sub.1|(bit p-1)-|.sub.8|(bit p-1)) are input into first inputs of the 8 OR gates 721.sub.1-721.sub.8, respectively. The 8 bits of the disable signal ds(bit p-1) are input into second inputs of the 8 OR gates 721.sub.1-721.sub.8, respectively. If the disable signal ds.sub.i(bit p-1) for the bit position p-1 is a 1 (indicative that the bit |.sub.i|(bit p) of the previous bit position was 1 even though at least one other bit |.sub.j|(bit p) of the previous bit position was 0), then the associated OR gate 721.sub.i ensures that the corresponding masked bit .sub.i(bit p-1) is a 1 (regardless of whether the associated bit |.sub.i|(bit p-1) of bit position p-1 is 1 or 0) and, thus, that the associated |.sub.i| value cannot be the minimum value of the set of |.sub.i| values (i.e., even if the current bit |.sub.i|(bit p-1) of bit position p-1 is a 0, it was previously determined that the |.sub.i| value cannot be the minimum value of the set of |.sub.i| values since at least one other |.sub.i| value from the set of |.sub.i| values has a 0 at a more significant bit position while the |.sub.i| value has a 1 at that more significant bit position).
In other words, even if the current bit |.sub.i|(bit p-1) of bit position p-1 of the given |.sub.i| value is a 0, this 0 value is blocked, or masked, from being considered by the zero detector module for bit position p-1 since, as noted above, it was previously determined that the given |.sub.i| value cannot be the minimum value of the set of |.sub.i| values since at least one other |.sub.i| value from the set of |.sub.i| values has a 0 at a more significant bit position while the given |.sub.i| value has a 1 at that more significant bit position. The outputs of the 8 OR gates 721.sub.1-721.sub.8 form the masked bit set (namely, .sub.1(bit p-1)-.sub.8(bit p-1)) which is provided to the zero detector module for bit position p-1.
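The gate-level behavior of the zero detector, the mask generation module, and the mask application module can be mirrored in software for a single step from one bit position to the next. The function names are hypothetical, bits are modeled as Python integers 0 and 1, and the bits chosen for the less significant position are made up for illustration; the more significant bits are the 10011011 pattern used in the example given earlier.

```python
def zero_detector(bits):
    """Logical AND of all bits: 0 if at least one bit is 0, otherwise 1."""
    found = 1
    for b in bits:
        found &= b
    return found


def mask_generation(bits_p, found_p):
    """One AND gate per value, each fed with the inverted found signal."""
    inverted_found = 1 - found_p  # inverter
    return [b & inverted_found for b in bits_p]


def mask_application(bits_next, disable):
    """One OR gate per value: a disabled value always presents a 1."""
    return [b | d for b, d in zip(bits_next, disable)]


bits_p = [1, 0, 0, 1, 1, 0, 1, 1]      # bits at the more significant position
found_p = zero_detector(bits_p)        # 0, because at least one bit is 0
disable = mask_generation(bits_p, found_p)

bits_next = [0, 1, 0, 0, 1, 1, 0, 1]   # hypothetical bits at the next position
masked = mask_application(bits_next, disable)
print(found_p, disable, masked)
```

In this run the first and fourth values have a 0 at the less significant position, but both are masked to 1 because they already lost at the more significant position.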
(54)
(55)
(56) It will be appreciated that, although primarily depicted and described herein with respect to embodiments in which numbers are represented using a sign-magnitude representation and the magnitude portions of the numbers are compared for determining the first minimum value and an approximation of the second minimum value, in at least some embodiments various modules depicted and described herein may be used (or adapted for use) for direct comparisons of the numbers (i.e., direct comparisons of the sign-magnitude representations of the numbers, including both the sign and magnitude portions of the numbers). For example, if, by the used sign-magnitude convention, the sign bit value 0 represents a negative number, then the sign bit can be considered to be the MSB and processed according to the description of minimum determination module 400. Alternatively, for example, if the sign bit value 1 represents a negative number, then the sign bits of all input numbers should be inverted and the inverted sign bits can be processed by the minimum determination module 400 as their MSBs. It will be appreciated that other mechanisms for handling sign-magnitude representations of numbers may be supported for use in determining the first minimum value and an approximation of the second minimum value.
(57) It will be appreciated that, although primarily depicted and described herein with respect to embodiments in which the numbers that are processed for determining the first minimum value and an approximation of the second minimum value include sign and magnitude portions, in at least some embodiments various modules depicted and described herein may be used (or adapted for use) for determining the first minimum value and an approximation of the second minimum value for numbers that do not include a sign portion or for determining the first minimum value and an approximation of the second minimum value for numbers independent or irrespective of whether the numbers include a sign portion or merely represent magnitudes.
(58) It will be appreciated that, although primarily depicted and described herein as determining an approximation of the second minimum value (Min.sub.2) given a set of values, the determination of the second minimum value (Min.sub.2) also may be said to be a determination of "at least an approximation" of the second minimum value (Min.sub.2), where "at least" covers the fact that, in at least some cases, the second minimum value (Min.sub.2) will be the true second-smallest value of all of the values in the set of values.
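The relationship between the approximate and the true second minimum can be checked with a small sketch. It assumes the arrangement in which the set is split into two halves and only the minimum of each half reaches the final comparison; the function name is hypothetical.

```python
def approx_and_true_second_min(magnitudes):
    """Return (approximate Min2, true second-smallest magnitude)."""
    half = len(magnitudes) // 2
    min_first = min(magnitudes[:half])
    min_second = min(magnitudes[half:])
    approx = max(min_first, min_second)   # the larger of the two half minima
    true_second = sorted(magnitudes)[1]   # reference value for comparison
    return approx, true_second


# Two smallest values in different halves: the approximation is exact.
print(approx_and_true_second_min([5, 3, 9, 7, 4, 8, 6, 2]))
# Two smallest values in the same half: the approximation is too large.
print(approx_and_true_second_min([1, 2, 9, 7, 4, 8, 6, 5]))
```

Under this assumption the approximation is never smaller than the true second-smallest magnitude, and it is exact whenever the two smallest magnitudes fall in different halves.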
(59) It will be appreciated that, although primarily depicted and described herein with respect to embodiments applied within the context of an LDPC decoder (e.g., in which evaluation of a set of input values to determine a smallest value of the set of input values and an approximation of a next-smallest value of the set of input values is performed for identifying the first minimum value (Min.sub.1) and second minimum value (Min.sub.2) for use by a CNU in computing a set of responses to a set of VNUs from which the input values were received), various embodiments depicted and described herein may be used within various other contexts (e.g., other devices, environments, technologies, or the like) for evaluating a set of input values to determine a smallest value of the set of input values and an approximation of a next-smallest value of the set of input values. Accordingly, a more general embodiment of a method for evaluating a set of input values to determine a smallest value of the set of input values and an approximation of a next-smallest value of the set of input values is depicted and described in
(60)
(61)
(62) At step 1101, method 1100 begins.
(63) At step 1110, a set of values is received. The set of values may be received from any suitable source of values.
(64) At step 1120, a minimum value of the set of values is determined. The minimum value of the set of values may be determined based on bitwise comparisons of bits of the values on a per bit position basis. The minimum value of the set of values may be determined based on bitwise comparisons of bits of the values on a per bit position basis beginning with the most significant bit position of the values (and, thus, the most significant bits of the values) and proceeding toward the least significant bit position (and, thus, the least significant bits of the values). The bitwise comparisons on a bit position basis may be performed as depicted and described with respect to
(65) The minimum value of the set of values may be determined based on bitwise comparisons by using the bitwise comparisons to determine at least one of a magnitude of the minimum value of the set of values or an indication of which of the values of the set of values has a magnitude of the minimum value of the set of values.
(66) In at least some embodiments, only the magnitude of the minimum value of the set of values may be determined without determining which of the values in the set of values has that minimum magnitude (e.g., for a set of input values including 2, 6, 7, 1, 4, determination of only the magnitude may only provide an indication that the minimum value has a magnitude of 1 without an indication that the fourth value in the set of values is the value which has that minimum magnitude). In at least some embodiments, a determination of which of the values in the set of values has that minimum magnitude (e.g., in the above example, determining that the fourth value in the set of values is the value which has the determined minimum magnitude) also may be performed (e.g., based on bitwise comparisons, by searching the set of values to identify which of the values has that determined minimum magnitude, or the like).
(67) In at least some embodiments, only an indication of which of the values of the set of values has a magnitude of the minimum value of the set of values may be determined without determining the magnitude of the minimum value of the set of values that is associated with that indicated value of the set of values (e.g., for a set of input values including 2, 6, 7, 1, 4, determination of only an indication of which of the values of the set of values has a magnitude of the minimum value of the set of values may provide an indication that the fourth value in the set of values has the minimum magnitude without an indication that the magnitude of the fourth value is 1). In at least some embodiments, a determination of the magnitude of the identified value of the set of values (e.g., in the above example, determining that the magnitude of the fourth value in the set of values is 1) also may be performed (e.g., based on bitwise comparisons, by reading or accessing the identified value of the set of values to determine the magnitude of the identified value of the set of values, or the like).
(68) In at least some embodiments, both the magnitude of the minimum value of the set of values and an indication of which of the values in the set of values has that minimum magnitude may be determined (e.g., for a set of input values including 2, 6, 7, 1, 4, determination that the magnitude of the minimum value in the set of values is 1 and an indication that the fourth value in the set of values is the value which has that minimum magnitude). The use of bitwise comparisons of bits of a set of values on a per bit position basis to determine a minimum value of the set of values (both the magnitude of the minimum value of the set of values and an indication of which of the values in the set of values has that minimum magnitude) may be further understood by way of reference to the following example. In this example, assume that there are three values (value v1=100, value v2=011, value v3=010) that need to be evaluated in order to determine the minimum value (which will be value v3=010). A first bitwise comparison is performed at the MSB position for the MSBs of the three values (namely, 1 from value v1, 0 from value v2, and 0 from value v3) to determine whether any of the bits of the three values are 0. Here, since two of the values (value v2 and value v3) have a 0 in the MSB position, it is known that the minimum value of the three values begins with a 0 and, further, that one of the values (namely, value v1) cannot be the minimum value. Accordingly, an output is provided which may be used to indicate that the minimum value of the three values begins with a 0. In the exemplary embodiment of
(69) At step 1199, method 1100 ends.
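The MSB-first bitwise comparison of step 1120 can be sketched in software as follows. This is a minimal illustration only, not the hardware implementation described herein; the function name `bitwise_min` and the list of candidate flags are hypothetical names introduced for the sketch. At each bit position, if any remaining candidate has a 0, the output bit is 0 and the candidates having a 1 are eliminated; otherwise the output bit is 1 and all remaining candidates are kept.

```python
def bitwise_min(values, width):
    """Return (minimum magnitude, candidate flags) using per-bit comparisons.

    values: non-negative integers that each fit in `width` bits.
    The flags mark which inputs hold the minimum (ties leave several set).
    """
    candidates = [True] * len(values)
    minimum = 0
    # Walk from the most significant bit position to the least significant.
    for pos in range(width - 1, -1, -1):
        bits = [(v >> pos) & 1 for v in values]
        # Does any value still in contention have a 0 at this position?
        any_zero = any(c and b == 0 for c, b in zip(candidates, bits))
        if any_zero:
            # The minimum has a 0 here; drop candidates that have a 1.
            candidates = [c and b == 0 for c, b in zip(candidates, bits)]
            out_bit = 0
        else:
            # Every remaining candidate has a 1 here, so the minimum does too.
            out_bit = 1
        minimum = (minimum << 1) | out_bit
    return minimum, candidates


# Three-value example from the text: v1=100, v2=011, v3=010.
print(bitwise_min([0b100, 0b011, 0b010], 3))  # (2, [False, False, True])
```

The returned pair corresponds to the two outputs discussed above: the magnitude of the minimum value and an indication of which input holds it.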
(70) It will be appreciated that, although primarily depicted and described herein with respect to embodiments in which the number of input values (M) in the set of input values being evaluated is a power of 2 (e.g., for determining first and second minimum values, for determining a single minimum value, or the like), various embodiments depicted and described herein may be configured for evaluating a set of input values where the number of input values (M) in the set of input values is not a power of 2. In at least some such embodiments, the module or modules used for evaluating the set of input values may be configured based on a next-higher power of 2 (e.g., for M=12 the module or modules used for evaluating the set of 12 input values may be based on evaluation of a set of 16 input values, for M=60 the module or modules used for evaluating the set of 60 input values may be based on evaluation of a set of 64 input values, or so forth). In at least some such embodiments, configuration of the module or modules used for evaluating the set of input values based on a next-higher power of 2 may use open input connections, dummy variables, or the like.
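Handling of a set size that is not a power of 2 can be illustrated by the following sketch. Two points are assumptions made for illustration and are not stated in this form in the text: the dummy inputs are set to the largest representable magnitude so that they never become a minimum, and the approximate second minimum is taken as the larger of the two half-set minima. The function names are hypothetical.

```python
def next_power_of_two(m):
    """Smallest power of 2 that is greater than or equal to m (m >= 1)."""
    return 1 << (m - 1).bit_length()


def approx_two_minima(values, width):
    """Return (Min1, approximate Min2) for at least two `width`-bit magnitudes."""
    size = next_power_of_two(len(values))
    dummy = (1 << width) - 1  # assumed dummy value: maximum magnitude
    padded = list(values) + [dummy] * (size - len(values))
    half = size // 2
    min_first_half = min(padded[:half])
    min_second_half = min(padded[half:])
    # Min1 is exact; Min2 is exact only when the two smallest inputs
    # fall into different halves, otherwise it is an overestimate.
    return (min(min_first_half, min_second_half),
            max(min_first_half, min_second_half))


print(next_power_of_two(12), next_power_of_two(60))  # 16 64
print(approx_two_minima([2, 6, 7, 1, 4], 3))         # (1, 4)
```

In the second call the true next-smallest value is 2, but it lies in the same half as the minimum, so the sketch reports 4 as the approximate second minimum.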
(71) As discussed herein, various embodiments of the LDPC decoding capability presented herein provide an approximation of conventional LDPC decoders that has bit error rate (BER) performance comparable to that of conventional LDPC decoders while reducing chip area and power consumption (as discussed further below with respect to Table 1) and complexity (as discussed further below with respect to Table 2).
(72) As discussed herein, various embodiments of the LDPC decoding capability presented herein provide an approximation of conventional LDPC decoders that has BER performance comparable to that of conventional LDPC decoders while reducing chip area and power consumption. A logic synthesis tool may be employed to quantify the chip area and power consumption benefits for at least some embodiments of the LDPC decoding capability. For example, assuming an 8-input CNU with word length w=4 bits, where each design is synthesized in 90 nm CMOS for minimum area at V.sub.DD=1.2 V, results for a conventional LDPC decoder, an LDPC decoder designed based on the paper entitled "A Bit-Serial Approximate Min-Sum LDPC Decoder and FPGA Implementation" by Darabiha et al., and an LDPC decoder based on various embodiments presented herein are presented in Table 1.
(73) TABLE 1
                                      Conventional   Darabiha et al.   Various Embodiments Presented Herein
Leaf Cells Count                      234            226               162
Area (µm.sup.2)                       2172           2168              1542
Propagation Delay (ns)                2.90           2.14              2.11
Dynamic Power Dissipation (µW/MHz)    0.61           0.57              0.43
Leakage Power Dissipation (µW)        6.2            6.5               4.6
(74) As discussed herein, various embodiments of the LDPC decoding capability presented herein provide an approximation of conventional LDPC decoders that has BER performance comparable to that of conventional LDPC decoders while reducing complexity. The complexity of a conventional LDPC decoder and an LDPC decoder based on various embodiments presented herein is presented in Table 2. The number of 1-bit 2-to-1 multiplexers (MUX2s) and the number of (w−1)-bit comparators (COMPs) are reduced by a factor of about 2 and a factor of about 1.5, respectively. The improvement in the propagation delay depends on log.sub.2M. The number of operations that are not related to finding the minimums (e.g., XOR operations) is not expected to be affected by embodiments presented herein. It is noted that, knowing the area, power dissipation, and delay of the cells (e.g., MUXs, comparators, and other elements), it is possible to estimate the benefits of various embodiments presented herein using Table 2 for any given LDPC code in a particular CMOS technology. In Table 2, t.sub.MUX corresponds to the delay of a MUX2, t.sub.COMP corresponds to the delay of a comparator (COMP), and t.sub.ADD corresponds to the delay of an adder.
(75) TABLE 2
                           Conventional                                                      Various Embodiments Presented Herein
# MUX2s                    4M(w−1)                                                           2M(w−1)
# (w−1)-bit COMPs          1.5M−2                                                            M−1
# (w−1)-bit adders         2                                                                 2
# log.sub.2M-bit COMPs     M                                                                 M
# XORs                     2M−1                                                              2M−1
Delay                      (1+2log.sub.2M)t.sub.MUX + (2+log.sub.2M)t.sub.COMP + t.sub.ADD   (1+log.sub.2M)t.sub.MUX + (1+log.sub.2M)t.sub.COMP + t.sub.ADD
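The Table 2 expressions can be evaluated for a particular CNU size with a short script. The sketch below takes the table entries as 4M(w−1) and 2M(w−1) MUX2s and 1.5M−2 and M−1 comparators, and reports the delay as the number of MUX2 and comparator delays on the critical path (the single adder delay is common to both designs). The function name `cnu_complexity` and the dictionary keys are hypothetical, and M is assumed to be a power of 2.

```python
def cnu_complexity(m, w):
    """Evaluate the Table 2 counts for an m-input CNU with word length w."""
    levels = m.bit_length() - 1  # log2(m) for a power-of-2 m
    if 1 << levels != m:
        raise ValueError("m is assumed to be a power of 2")
    conventional = {
        "mux2": 4 * m * (w - 1),
        "comparators": (3 * m) // 2 - 2,   # 1.5M - 2
        "mux_delays": 1 + 2 * levels,
        "comp_delays": 2 + levels,
    }
    proposed = {
        "mux2": 2 * m * (w - 1),
        "comparators": m - 1,
        "mux_delays": 1 + levels,
        "comp_delays": 1 + levels,
    }
    return conventional, proposed


conv, prop = cnu_complexity(32, 5)
print(conv)  # {'mux2': 512, 'comparators': 46, 'mux_delays': 11, 'comp_delays': 7}
print(prop)  # {'mux2': 256, 'comparators': 31, 'mux_delays': 6, 'comp_delays': 6}
```

For M=32 and w=5 the MUX2 count halves and the comparator count falls by a factor of about 1.5, in line with the factors stated for Table 2.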
Various advantages of embodiments of the LDPC decoding capability presented herein may be further understood by way of simulations related to a conventional LDPC decoder and an LDPC decoder based on various embodiments presented herein. Simulations were performed using, as test benches, an LDPC(2048,1723) code defined in the 10GBASE-T Ethernet standard and an LDPC(576,288) code defined in the WiMAX standard. The system-level characterization of the decoder was performed in MATLAB. Encoded data was sent through an additive white Gaussian noise (AWGN) channel using non-return-to-zero (NRZ) signaling. For MSA decoding, S.sub.norm is 0.75, and w is either 4 or 5 bits. To evaluate the performance of the conventional CNU circuits and CNU circuits designed based on various embodiments presented herein, combinational logic including sign and normalization calculations was simulated. The CNU circuits of the simulation were implemented in Verilog and then synthesized in a 90-nm CMOS technology. SPICE simulation using the same technology was used to find the relationship between supply voltage, power dissipation, and propagation delay. With respect to BER performance between a conventional LDPC decoder and an LDPC decoder based on various embodiments presented herein, the simulation indicated that (1) for LDPC(2048,1723), an SNR penalty of 0.1 dB and 0.2 dB was observed for w equal to 4 and 5 bits, respectively, and (2) for LDPC(576,288), the SNR penalty remained below 0.1 dB.
With respect to post-FEC BER versus SNR for word lengths of 4 and 5 in the LDPC(2048,1723) code, the simulation compared a conventional LDPC decoder, an LDPC decoder designed based on the paper entitled "A Bit-Serial Approximate Min-Sum LDPC Decoder and FPGA Implementation" by Darabiha et al., and an LDPC decoder based on various embodiments presented herein. The simulation indicated that (1) an LDPC decoder based on various embodiments presented herein may have a negligible increase in the required SNR for a given BER over the conventional LDPC decoder and the LDPC decoder based on Darabiha et al. and (2) for a BER lower than 10.sup.−4, the average number of iterations to finish decoding (assuming that early termination is utilized) is about 7% higher than for the conventional LDPC decoder and about 3% higher than for the LDPC decoder based on Darabiha et al. Although the power dissipation of an LDPC decoder increases with the number of iterations, the power saving in the CNU due to various embodiments presented herein is much larger than 7% and, thus, in total, use of various embodiments presented herein results in lower power dissipation.
(76) As discussed herein, various embodiments of the LDPC decoding capability presented herein provide various advantages over various conventional LDPC decoder designs (as discussed further below with respect to Table 3). Table 3 corresponds to implementations of a CNU of an LDPC(2048,1723) code in a fully-parallel decoder, with M=32 inputs and word length w=5 bits. Comparing designs 1 and 3 of Table 3, which are both optimized for chip area, a CNU according to various embodiments presented herein occupies 37% less area than a conventional CNU, and also has lower power dissipation and lower propagation delay than a conventional CNU. In order to compare the two circuits with the same propagation delay, and hence throughput, the design (i.e., design 1) of the conventional CNU was re-synthesized for a higher speed (i.e., design 2). Comparing designs 2 and 3 of Table 3, a CNU according to various embodiments presented herein occupies 44% less area than a conventional CNU. Design 4, which was optimized for the highest throughput, has an area and power dissipation close to that of design 2, but it provides a throughput two times higher than that of design 2. If throughput is not the main concern, but area and power dissipation are the most critical, voltage scaling (VS) can be considered. The supply voltage (V.sub.DD) of design 3 was lowered to a point where a propagation delay equal to that of design 1 was obtained (i.e., design 5), and a comparison of the results (i.e., design 5) with design 1 shows a threefold reduction in power dissipation.
(77) TABLE 3
                                      Conventional        Various Embodiments Presented Herein
Design Number                         1        2          3        4        5
Optimized For:                        Area     Speed      Area     Speed    Area
Leaf Cells Count                      1353     2054       823      1961     823
Area (µm.sup.2)                       12470    14157      7907     14888    7907
Supply Voltage (V)                    1.2      1.2        1.2      1.2      0.9
Propagation Delay (ns)                5.11     3.8        3.77     1.8      5.11
Dynamic Power Dissipation (µW/MHz)    3.98     3.85       2.47     3.64     1.33
Leakage Power Dissipation (µW)        36       41         24       57       9.6
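The comparisons drawn from Table 3 can be reproduced directly from the tabulated values. The following sketch is only an arithmetic check; the dictionary layout and the helper name `reduction` are hypothetical, and only the area, propagation delay, and dynamic power rows are used.

```python
# Selected Table 3 rows, keyed by design number.
designs = {
    1: {"area": 12470, "delay_ns": 5.11, "dyn_power": 3.98},
    2: {"area": 14157, "delay_ns": 3.8, "dyn_power": 3.85},
    3: {"area": 7907, "delay_ns": 3.77, "dyn_power": 2.47},
    4: {"area": 14888, "delay_ns": 1.8, "dyn_power": 3.64},
    5: {"area": 7907, "delay_ns": 5.11, "dyn_power": 1.33},
}


def reduction(before, after):
    """Fractional reduction when going from `before` to `after`."""
    return 1 - after / before


# Area saving of design 3 over the area-optimized conventional design 1.
area_3_vs_1 = reduction(designs[1]["area"], designs[3]["area"])
# Area saving of design 3 over the speed-matched conventional design 2.
area_3_vs_2 = reduction(designs[2]["area"], designs[3]["area"])
# Throughput gain of design 4 over design 2 (ratio of propagation delays).
speedup_4_vs_2 = designs[2]["delay_ns"] / designs[4]["delay_ns"]
# Dynamic power ratio of design 1 to the voltage-scaled design 5.
power_1_vs_5 = designs[1]["dyn_power"] / designs[5]["dyn_power"]

print(f"{area_3_vs_1:.0%} {area_3_vs_2:.0%} "
      f"{speedup_4_vs_2:.2f}x {power_1_vs_5:.2f}x")
```

The computed values agree with the 37% and 44% area savings, the roughly twofold throughput gain, and the roughly threefold power reduction stated above.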
(78) As discussed herein, various embodiments of the LDPC decoding capability presented herein provide reduced power dissipation as compared to that of conventional LDPC decoders. It is noted that the average total power dissipation of the entire LDPC(N,K) decoder (not just the CNU) with early termination can be expressed as:
(79)
where I.sub.avg and I.sub.max are the average and maximum number of iterations, respectively. P.sub.VNU and P.sub.CNU are the power dissipation of a VNU and a CNU at a clock frequency of f.sub.CK, respectively. C.sub.INT is the total capacitance of the interconnect wires between CNUs and VNUs, and α is the signal activity factor. f.sub.CK is the clock frequency at which the decoder provides the desired throughput after I.sub.max iterations. C.sub.INT is proportional to the total length of the interconnect wires and, thus, approximately proportional to the square root of the total area. In a fully-parallel implementation, the total area is proportional to N·A.sub.VNU+(N−K)·A.sub.CNU, where A.sub.VNU and A.sub.CNU are the chip area of a VNU and a CNU, respectively. As a result, the average power dissipation can be written as
(80)
where the interconnect-related parameter is a function of technology, chip area utilization factor, and average signal activity factor. The larger this parameter, the higher the impact of interconnects on the average power dissipation. In order to evaluate the impact of various embodiments of the LDPC decoding capability on the power dissipation of an LDPC(2048,1723) decoder, a VNU was synthesized (having a dynamic power dissipation of 3.05 µW/MHz, a leakage power of 14 µW, and an area of 4760 µm.sup.2) and the total power dissipation of the decoder was evaluated using the above expression, assuming design 1 (conventional) and design 5 (various embodiments presented herein) for the CNUs. The use of various embodiments presented herein resulted in lower power dissipation for the LDPC(2048,1723) decoder.
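The effect of the smaller CNU on the interconnect term can be illustrated with the area relation stated above, under which the total area is proportional to N·A.sub.VNU+(N−K)·A.sub.CNU and C.sub.INT is approximately proportional to the square root of the total area. The sketch below is an arithmetic illustration only: it uses the VNU area given above and the CNU areas of designs 1 and 5 from Table 3, takes the number of CNUs as N−K as in that relation, and does not attempt to evaluate the full power expression. The function name `total_area` is hypothetical.

```python
from math import sqrt

N, K = 2048, 1723          # LDPC(2048,1723)
A_VNU = 4760               # synthesized VNU area
A_CNU_CONVENTIONAL = 12470 # design 1 of Table 3
A_CNU_PROPOSED = 7907      # design 5 of Table 3


def total_area(a_vnu, a_cnu, n=N, k=K):
    """Quantity to which the fully-parallel decoder area is proportional."""
    return n * a_vnu + (n - k) * a_cnu


area_conventional = total_area(A_VNU, A_CNU_CONVENTIONAL)
area_proposed = total_area(A_VNU, A_CNU_PROPOSED)

# Relative total area and, via the square-root relation, relative C_INT.
area_ratio = area_proposed / area_conventional
c_int_ratio = sqrt(area_ratio)

print(f"area ratio {area_ratio:.3f}, C_INT ratio {c_int_ratio:.3f}")
```

Under these assumptions the total area falls by roughly 11% and the interconnect capacitance by roughly 5 to 6%, so the interconnect contribution shrinks in addition to the direct CNU power saving.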
(81)
(82) The computer 1200 includes a processor 1202 (e.g., a central processing unit (CPU) and/or other suitable processor(s)) and a memory 1204 (e.g., random access memory (RAM), read only memory (ROM), and the like).
(83) The computer 1200 also may include a cooperating module/process 1205. The cooperating process 1205 can be loaded into memory 1204 and executed by the processor 1202 to implement functions as discussed herein and, thus, cooperating process 1205 (including associated data structures) can be stored on a computer readable storage medium, e.g., RAM memory, magnetic or optical drive or diskette, and the like.
(84) The computer 1200 also may include one or more input/output devices 1206 (e.g., a user input device (such as a keyboard, a keypad, a mouse, and the like), a user output device (such as a display, a speaker, and the like), an input port, an output port, a receiver, a transmitter, one or more storage devices (e.g., a tape drive, a floppy drive, a hard disk drive, a compact disk drive, and the like), or the like, as well as various combinations thereof).
(85) It will be appreciated that computer 1200 depicted in
(86) It will be appreciated that the functions depicted and described herein may be implemented in software (e.g., via implementation of software on one or more processors, for executing on a general purpose computer (e.g., via execution by one or more processors) so as to implement a special purpose computer, and the like) and/or may be implemented in hardware (e.g., using a general purpose computer, one or more application specific integrated circuits (ASIC), and/or any other hardware equivalents).
(87) It will be appreciated that some of the steps discussed herein as software methods may be implemented within hardware, for example, as circuitry that cooperates with the processor to perform various method steps. Portions of the functions/elements described herein may be implemented as a computer program product wherein computer instructions, when processed by a computer, adapt the operation of the computer such that the methods and/or techniques described herein are invoked or otherwise provided. Instructions for invoking methods described herein may be stored in fixed or removable media (e.g., non-transitory computer-readable storage media), transmitted via a data stream in a broadcast or other signal bearing medium, and/or stored within a memory within a computing device operating according to the instructions.
(88) It will be appreciated that the term "or" as used herein refers to a non-exclusive "or," unless otherwise indicated (e.g., use of "or else" or "or in the alternative").
(89) It will be appreciated that, although various embodiments which incorporate the teachings presented herein have been shown and described in detail herein, those skilled in the art can readily devise many other varied embodiments that still incorporate these teachings.