Threshold-based min-sum algorithm to lower the error floors of quantized low-density parity-check decoders
11962324 · 2024-04-16
Assignee
Inventors
- Homayoon HATAMI (San Diego, CA, US)
- David G. Mitchell (Las Cruces, NM, US)
- Daniel Costello (Clarendon Hills, IL, US)
- Thomas Fuja (South Bend, IN, US)
CPC classification
- H03M13/6577 (ELECTRICITY)
- H03M13/6583
- H03M13/114
- H03M13/1125
- H03M13/1137
- H03M13/1128
- H03M13/1122
- H03M13/112
International classification
Abstract
A modified version of the min-sum algorithm (MSA) which can lower the error floors of quantized LDPC decoders. A threshold attenuated min-sum algorithm (TAMSA) and/or threshold offset min-sum algorithm (TOMSA) selectively attenuates or offsets a check node log-likelihood ratio (LLR) if the check node receives any variable node LLR with magnitude below a predetermined threshold, while allowing a check node LLR to reach the maximum quantizer level if all the variable node LLRs received by the check node have magnitude greater than the threshold. Embodiments of the present invention can provide desirable results even without knowledge of the location, type, or multiplicity of problematic graphical objects and can be implemented with only a minor modification to existing decoder hardware.
Claims
1. A method for lowering an error floor of a low-density parity-check (LDPC) decoder chip, thereby improving bit-error-rate and/or frame-error-rate performance of the LDPC decoder chip, the method comprising: for each message passed from a check node to a variable node, computing a check node log-likelihood ratio within a check node processing unit of the LDPC decoder chip by iteratively passing quantized messages between processing units on a decoder chip, wherein a plurality of check nodes are connected to a respective variable node and wherein a plurality of variable nodes are connected to a respective check node, and wherein the connections are specified by a parity-check matrix of the LDPC code, wherein computing further comprises: comparing a minimum value of a set of variable node log-likelihood ratio magnitudes that are input into the check node processing unit with a threshold, wherein each of the input variable node log-likelihood ratio magnitudes is one of a plurality connected to a respective check node; based on the results of the comparison, determining at the check node processing unit whether to apply a reduction to a check node log-likelihood ratio magnitude or not to apply a reduction to the check node log-likelihood ratio magnitude; applying a reduction to the check node log-likelihood ratio magnitude in instances when the determining step determines that a reduction should be applied; and not applying a reduction to the check node log-likelihood ratio magnitude in instances when the determining step determines that a reduction should not be applied.
2. The method of claim 1 wherein applying a reduction to the check node log-likelihood ratio magnitude at the check node processing unit comprises applying attenuation.
3. The method of claim 2 wherein applying attenuation comprises multiplying the check node log-likelihood ratio magnitude by a value that is greater than 0 and less than 1.
4. The method of claim 3 wherein applying attenuation comprises multiplying the check node log-likelihood ratio magnitude by a value that is less than one and greater than or equal to 0.5.
5. The method of claim 2 wherein applying attenuation comprises applying attenuation to the check node log-likelihood ratio magnitude before the check node log-likelihood ratio magnitude is passed from a check node to a variable node.
6. The method of claim 1 wherein applying a reduction to the check node log-likelihood ratio magnitude at the check node processing unit comprises applying an offset.
7. The method of claim 6 wherein applying an offset comprises applying a subtraction function.
8. The method of claim 7 wherein applying a subtraction function comprises subtraction of a predetermined number that is greater than zero.
9. The method of claim 7 wherein applying a subtraction function comprises subtracting a value greater than zero before passing the check node log-likelihood ratio to a connected variable node.
10. The method of claim 1 wherein comparing a minimum value of a set of connected variable node log-likelihood ratio magnitudes that are input into the check node processing unit with a threshold comprises determining whether the minimum value is less than the threshold.
11. The method of claim 1 wherein comparing a minimum value of a set of connected variable node log-likelihood ratio magnitudes that are input into the check node processing unit with a threshold comprises determining whether the minimum value is less than or equal to the threshold.
12. The method of claim 1 wherein comparing a minimum value of a set of connected variable node log-likelihood ratio magnitudes that are input into the check node processing unit with a threshold comprises comparing the minimum value with a predetermined value.
13. The method of claim 12 wherein the predetermined value comprises a value having a magnitude greater than 0.
14. The method of claim 1 wherein comparing a minimum value of a set of connected variable node log-likelihood ratio magnitudes that are input into the check node processing unit with a threshold further comprises comparing a second minimum value of the set of connected variable node log-likelihood ratio magnitudes that are input into the check node processing unit with a threshold.
15. The method of claim 1 wherein applying a reduction to the check node log-likelihood ratio magnitude comprises applying a multiplication function of greater than zero and less than one and applying a subtraction function to the check node log-likelihood ratio magnitude.
16. The method of claim 1 wherein applying a reduction to the check node log-likelihood ratio magnitude further comprises varying an amount of the reduction for each iteration.
17. The method of claim 1 wherein applying a reduction to the check node log-likelihood ratio magnitude further comprises varying an amount of the reduction based on a location of the check node log-likelihood ratio in a graph.
Description
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
(1) The accompanying drawings, which are incorporated into and form a part of the specification, illustrate one or more embodiments of the present invention and, together with the description, serve to explain the principles of the invention. The drawings are only for the purpose of illustrating one or more embodiments of the invention and are not to be construed as limiting the invention. In the drawings:
DETAILED DESCRIPTION OF THE INVENTION
(14) Embodiments of the present invention relate to decoding low-density parity-check (LDPC) codes. The attenuated min-sum algorithm (AMSA) and the offset min-sum algorithm (OMSA) can outperform the conventional min-sum algorithm (MSA) at low signal-to-noise ratios (SNRs), i.e., in the waterfall region of the bit error rate curve. For quantized decoders, however, the MSA actually outperforms the AMSA and OMSA in the error floor (high SNR) region, and all three algorithms suffer from a relatively high error floor. Embodiments of the present invention can include a modified MSA that can outperform the MSA, AMSA, and OMSA across all SNRs, with implementation complexity only slightly higher than that of the AMSA or OMSA. Simulations using several classes of LDPC codes (including spatially coupled LDPC codes) show that embodiments of the present invention outperform the MSA, AMSA, and OMSA across all SNRs.
(15) Embodiments of the present invention relate to a novel modification to the check node update of quantized MSA that is straightforward to implement and reduces the error floor when compared to other known methods.
(16) As background, let V={v.sub.1, v.sub.2, . . . , v.sub.n} and C={c.sub.1, c.sub.2, . . . , c.sub.m} represent the sets of variable nodes and check nodes, respectively, of a bipartite Tanner graph representation of an LDPC code with parity-check matrix H. Assume that a binary codeword u=(u.sub.1, u.sub.2, . . . , u.sub.n) is binary phase shift keyed (BPSK) modulated such that each zero is mapped to +1 and each one is mapped to −1. The modulated signal is transmitted over an AWGN channel with mean 0 and standard deviation σ. The received signal is {tilde over (r)}=1−2u+n, where n is the channel noise. The quantized version of {tilde over (r)} is denoted as r=(r.sub.1, r.sub.2, . . . , r.sub.n).
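As an illustrative sketch of the channel model above (the function name and the use of NumPy are choices made here for illustration, not part of the disclosure), the BPSK mapping and AWGN noise can be written as:

```python
import numpy as np

def bpsk_awgn(u, sigma, rng=None):
    """Map a binary codeword u to BPSK (0 -> +1, 1 -> -1) and add
    white Gaussian noise with mean 0 and standard deviation sigma,
    producing the received signal r~ = 1 - 2u + n."""
    rng = rng or np.random.default_rng(0)
    x = 1 - 2 * np.asarray(u)                 # BPSK modulation
    n = rng.normal(0.0, sigma, size=x.shape)  # channel noise
    return x + n

r_tilde = bpsk_awgn([0, 1, 1, 0], sigma=0.5)
```

With sigma set to 0 the output reduces to the noiseless BPSK symbols ±1, which is a convenient sanity check.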
(17) The Min-Sum Algorithm and its Modifications. The MSA is an iterative message passing (MP) algorithm that is simpler to implement than the sum-product algorithm (SPA). Unlike the SPA, the MSA does not require channel noise information to calculate the channel log-likelihood ratios (LLRs). The SPA is optimum for codes without cycles, but for finite length codes and finite precision LLRs, the SPA is not necessarily optimum, particularly with respect to error floor performance. Let L.sub.ij represent the LLR passed from variable node v.sub.i to check node c.sub.j in a given iteration and let L.sub.ji represent the LLR passed from c.sub.j to v.sub.i. The check nodes that are neighbors to v.sub.i are denoted N(v.sub.i), and the variable nodes that are neighbors to c.sub.j are denoted N(c.sub.j). To initialize decoding, each variable node v.sub.i passes r.sub.i to the check nodes in N(v.sub.i), i.e.,

L.sub.ij=r.sub.i, (Equation 1)

where the L.sub.ij's computed throughout the decoding process are referred to as the variable node LLRs. The check node operation to calculate the LLRs sent from check node c.sub.j to variable node v.sub.i is given by
(18) L.sub.ji=(Π.sub.i′∈N(c.sub.j)\i sign(L.sub.i′j))·min.sub.i′∈N(c.sub.j)\i|L.sub.i′j|, (Equation 2)
where the L.sub.ji's computed throughout the decoding process are referred to as the check node LLRs. After each iteration, the hard decision estimate û is checked to see if it is a valid codeword, where û.sub.i=0 if and only if

(19) r.sub.i+Σ.sub.j∈N(v.sub.i)L.sub.ji≥0. (Equation 3)

If û is a valid codeword, or if the iteration number has reached I.sub.max, decoding stops. Otherwise, the variable node LLRs are calculated as

L.sub.ij=r.sub.i+Σ.sub.j′∈N(v.sub.i)\j L.sub.j′i (Equation 4)
and decoding continues using equation 2. Two modified versions of the MSA, called attenuated (or normalized) MSA (AMSA) and offset MSA (OMSA), were introduced to reduce the waterfall performance loss of the MSA compared to the SPA. The modified check node computations are given by
(20) L.sub.ji=α·(Π.sub.i′∈N(c.sub.j)\i sign(L.sub.i′j))·min.sub.i′∈N(c.sub.j)\i|L.sub.i′j| (Equation 5)

L.sub.ji=(Π.sub.i′∈N(c.sub.j)\i sign(L.sub.i′j))·max(min.sub.i′∈N(c.sub.j)\i|L.sub.i′j|−β, 0), (Equation 6)

respectively, where 0<α<1 and β>0 are constants. In both algorithms, the check node LLR magnitudes are modified to be smaller than those of the MSA. This reduces the negative effect of overestimating the LLR magnitudes in the MSA, whose larger check node LLR magnitudes compared to the SPA can cause additional errors in decoding at low SNRs.
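The MSA, AMSA, and OMSA check node updates can be sketched in software as follows (a minimal reference implementation for illustration; the function name and list-based message representation are choices made here, and alpha=1, beta=0 recovers the plain MSA):

```python
def check_node_llrs(var_llrs, alpha=1.0, beta=0.0):
    """Check node update for the MSA (alpha=1, beta=0), AMSA (alpha<1),
    or OMSA (beta>0).  var_llrs holds the LLRs L_ij received from the
    variable nodes in N(c_j); entry i of the result is the extrinsic
    LLR L_ji returned to variable node v_i (input i is excluded)."""
    out = []
    for i in range(len(var_llrs)):
        others = var_llrs[:i] + var_llrs[i + 1:]
        sign = 1
        for l in others:                      # product of signs
            sign = -sign if l < 0 else sign
        mag = min(abs(l) for l in others)     # minimum magnitude
        mag = max(alpha * mag - beta, 0.0)    # attenuate and/or offset
        out.append(sign * mag)
    return out
```

For example, `check_node_llrs([2.0, -0.5, 1.5])` returns `[-0.5, 1.5, -0.5]`: each output takes the smallest magnitude among the other inputs, with the product of the other inputs' signs.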
(21) Implementation of the MSA, AMSA, and OMSA. To implement the check node update of equation 2, in the check node processing unit corresponding to c.sub.j, the sign and magnitude of L.sub.ji to be sent to each v.sub.i are calculated separately as follows. First, for all i∈N(c.sub.j), the signs of the L.sub.ij are multiplied to form Π.sub.i∈N(c.sub.j) sign(L.sub.ij). Then, for each i∈N(c.sub.j), sign(L.sub.ij) is multiplied by Π.sub.i′∈N(c.sub.j) sign(L.sub.i′j) to form Π.sub.i′∈N(c.sub.j)\i sign(L.sub.i′j). Second, the process of calculating |L.sub.ji| involves finding two minimum values, the first and second minimum of all the |L.sub.ij| at check node c.sub.j, denoted min.sub.1,j and min.sub.2,j, respectively. For each L.sub.ji, if the variable node v.sub.i corresponds to min.sub.1,j, then |L.sub.ji|=min.sub.2,j; otherwise, |L.sub.ji|=min.sub.1,j. The implementation of equation 5 or 6 is the same, with an extra step of attenuating or offsetting the minimum values.
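The two-minimum trick described above can be sketched as follows (an illustrative software analogue of the hardware sub-units; the function name and returned tuple layout are choices made here):

```python
def two_min_update(var_llrs):
    """Magnitude part of the min-sum check node update via the
    two-minimum trick: find min1_j, min2_j and the index of the input
    that produced min1_j; the output magnitude on that edge is min2_j
    and min1_j on every other edge."""
    mags = [abs(l) for l in var_llrs]
    idx1 = min(range(len(mags)), key=lambda i: mags[i])   # argmin
    min1 = mags[idx1]
    min2 = min(m for i, m in enumerate(mags) if i != idx1)
    out_mags = [min2 if i == idx1 else min1 for i in range(len(mags))]
    return out_mags, idx1, min1
```

Only two values and one index need to be stored per check node, rather than one minimum per outgoing edge, which is what makes this formulation attractive in hardware.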
(22) The process of finding min.sub.1,j and min.sub.2,j is complex to implement. Therefore, several methods have been suggested to reduce the complexity of the process or to avoid calculating min.sub.2,j and instead estimate it based on min.sub.1,j. The result is that min.sub.1,j plays an important role in the check node processing unit, and since embodiments of the present invention also rely on min.sub.1,j, extending the algorithm to such complexity-reduction techniques is possible.
(23) Quantized Decoders. In a uniform quantized decoder, the operations in equations 1-6 have finite precision, i.e., the values are quantized to a set of numbers ranging from −l.sub.max to l.sub.max, with step size Δ, where the resulting quantization levels for a q-bit quantizer are

(24) {0, ±Δ, ±2Δ, . . . , ±l.sub.max}, with l.sub.max=(2.sup.q-1−1)·Δ.

The attenuation and offset parameters α and β in equations 5 and 6 that have the best iterative decoding thresholds can be found by computer simulation or by using a technique called quantized density evolution.
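A minimal sketch of such a uniform symmetric quantizer (the function name is illustrative; default parameters match the 5-bit quantizer with Δ=0.15 and l.sub.max=2.25 used in the examples later in this description):

```python
def uniform_quantize(x, delta=0.15, q_bits=5):
    """Uniform symmetric quantizer with levels {0, +/-delta, ...,
    +/-l_max}, where l_max = (2**(q_bits-1) - 1) * delta.  Values are
    rounded to the nearest level and saturated at +/-l_max."""
    n_levels = 2 ** (q_bits - 1) - 1          # 15 levels each side for q=5
    level = round(x / delta)                  # nearest level index
    level = max(-n_levels, min(n_levels, level))  # saturate
    return level * delta
```

For q_bits=5 and delta=0.15 this gives l_max = 15 × 0.15 = 2.25, consistent with the quantizer parameters quoted for the simulation examples.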
(25) Trapping Sets and Error Floors. Let A denote a subset of V of cardinality a. Let A.sub.even and A.sub.odd represent the subsets of check nodes connected to variable nodes in A with even and odd degrees, respectively, where |A.sub.odd|=b. Here, A is referred to as an (a, b) trapping set, and the subgraph induced by A together with its neighboring check nodes is denoted G(A). A is defined to be an (a, b) absorbing set if each variable node in A is connected to fewer check nodes in A.sub.odd than in A.sub.even. These sets, along with similar objects such as elementary trapping sets and leafless elementary trapping sets, are known to cause most of the decoding errors at high SNRs in MP decoders.
(26) Threshold Attenuated/Offset MSA. Motivation and Rationale. Although it is known that applying attenuation or offset when computing the check node LLRs typically improves performance in the low SNR (waterfall) region of the BER curve for quantized decoders, the AMSA and OMSA do not necessarily achieve a good error floor, because high SNR performance is tied to problematic graphical objects. This behavior can be observed, for example, assuming BPSK modulation on the AWGN channel.
(28) At high SNRs, for a received vector r of channel LLRs, decoding is successful with high probability. In the case of unsuccessful decoding, it is known that a small number of problematic objects are likely to be the cause, i.e., objects containing variable nodes with unreliable (small magnitude) LLR values. In this regime, however, the channel LLRs for the variable nodes outside a problematic object will be mostly reliable and have large magnitudes. In other words, the outside LLRs are typically initially large (with the correct sign) and will continue to grow quickly to even larger values (often l.sub.max). However, even though some or all of the incorrect sign LLRs inside a problematic object are initially small, they can also be observed to grow quickly to larger values without correcting the errors in sign. This happens because the problematic object contains at least one short cycle, which prevents correction of the sign errors.
(29) To improve the probability of correcting errors occurring in a problematic object G(A) at high SNR, we have found that it is helpful if the LLR magnitudes sent from a check node c.sub.j?A.sub.even to variable nodes v.sub.i?A grow more slowly (i.e. are attenuated) when c.sub.j receives at least one unreliable (small magnitude) LLR from a variable node in A. This ensures that any incorrect LLRs received from the channel in A are not reinforced. On the other hand, if a check node c.sub.j (inside or outside G(A)) receives all large magnitude LLRs, these can be helpful for decoding and hence should not be attenuated. These two factors form the essence of the new threshold-based modification of AMSA/OMSA, presented below, that can lead to correct decoding of a received vector r that would not otherwise occur.
(30) A Threshold Attenuated/Offset MSA. An embodiment of the present invention preferably makes use of a relationship observed at high SNRs between the variable node LLR magnitudes |L.sub.ij| received by check node c.sub.j and the likelihood of the check node c.sub.j being inside a problematic object G(A). This relationship allows the problem of locating errors affected by G(A) to be reduced to merely considering the variable node LLR magnitudes |L.sub.ij| received at check node c.sub.j, i.e., relying on the |L.sub.ij|'s to tell if c.sub.j is likely to be inside G(A) and has the potential to cause decoding failures. At high SNRs, the check node LLRs outside G(A) typically grow faster than the LLRs inside G(A). Therefore, if a check node c.sub.j receives at least one small LLR, i.e., min.sub.i∈N(c.sub.j)|L.sub.ij|=min.sub.1,j≤τ, where τ is a predetermined threshold, it is likely that c.sub.j is inside G(A). Consequently, to improve the error floor performance, the check node computation in equation 7 is preferably used to replace equation 2, where 0<α<1 is an attenuation parameter designed to reduce the check node LLR magnitudes sent from a check node c.sub.j inside G(A) to the variable nodes in A. This modified check node update algorithm is referred to as the threshold attenuated MSA (TAMSA). As will be further shown, with a proper choice of the parameters α and τ, the TAMSA is capable of correctly decoding some of the errors that occur in the AMSA or MSA due to problematic objects.
(31) In equation 7, α is used to make the check node LLR magnitudes smaller when min.sub.i∈N(c.sub.j)|L.sub.ij|≤τ. As an alternative (or in combination), an offset parameter β can be used to serve the same purpose, as illustrated in equation 8, where β>0 is an offset parameter that reduces the check node LLR magnitudes. This modified check node update algorithm is denoted as the threshold offset MSA (TOMSA). Both the TAMSA and TOMSA selectively, or locally, reduce the magnitudes of the check node LLRs that are likely to belong to a problematic object without requiring knowledge of its location or structure. The TAMSA and TOMSA add a simple threshold test compared to the AMSA and OMSA, while the attenuation (offset) parameter only needs to be applied to a few check nodes at high SNRs.
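The threshold-based check node rule can be sketched in software as follows (an illustrative rendering of the TAMSA/TOMSA behavior described above, not the hardware implementation; the function name and parameter defaults are choices made here):

```python
def tamsa_check_node(var_llrs, tau, alpha=1.0, beta=0.0):
    """Threshold attenuated/offset min-sum check node update: the
    reduction (attenuation alpha and/or offset beta) is applied only
    when the smallest input magnitude min1_j is at most tau; otherwise
    the plain min-sum magnitudes are kept, so the output can still
    reach the maximum quantizer level."""
    min1 = min(abs(l) for l in var_llrs)
    reduce_output = min1 <= tau               # the one extra comparison
    out = []
    for i in range(len(var_llrs)):
        others = var_llrs[:i] + var_llrs[i + 1:]
        sign = 1
        for l in others:
            sign = -sign if l < 0 else sign
        mag = min(abs(l) for l in others)
        if reduce_output:
            mag = max(alpha * mag - beta, 0.0)
        out.append(sign * mag)
    return out
```

With tau set to the maximum quantizer level the reduction is always applied and the rule reduces to the AMSA/OMSA, while tau=0 recovers the plain MSA; the threshold interpolates between the two.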
(32) In one embodiment, the threshold τ can optionally be varied by iteration number; for example, the value of τ used in equations 7 and 8 can be a function τ(I) of the iteration number 0≤I≤I.sub.max. The threshold τ can also optionally be varied by graph location, for example as a function τ(j) of the check node index j. Although embodiments of the present invention can provide desirable results without such variations, such variations can provide further performance improvements.
(33) Implementation of Threshold Attenuated/Offset MSA. For the MSA, for some number K of inputs to a check node processing unit, the implementation of sub-units to calculate min.sub.1,j and min.sub.2,j and the index needed to identify which input created min.sub.1,j requires a significant number of multiplexers, comparators, and inverters, which is a function of K. A check node processing unit preferably includes some additional sub-units to generate the proper output and apply the attenuation (offset) parameter for the AMSA and/or OMSA. Implementation of the TAMSA and/or TOMSA adds just two simple steps to the implementation of the AMSA and/or OMSA. First, for a check node processing unit corresponding to c.sub.j, after calculating min.sub.1,j and min.sub.2,j, the value of min.sub.1,j is preferably compared to τ. Second, a decision is made based on the outcome of the comparison to use the attenuated (offset) or non-attenuated (non-offset) output. Consequently, implementation of the TAMSA and/or TOMSA requires just one extra comparator and K extra multiplexers to decide if attenuation (offset) should be applied. If not, the additional multiplication for attenuation (or subtraction for offset) is not necessary. Hence, the extra requirements do not significantly increase the overall area or delay of a check node processing unit.
(34) To illustrate the robustness of an embodiment of the present invention, consider the (8000,4000) MacKay code, the progressive edge growth (PEG) (1008,504) LDPC code, and the quasi-cyclic (155,64) Tanner code decoded with various algorithms, including the TAMSA and TOMSA according to an embodiment of the present invention, with different parameters, each using a 5-bit uniform quantizer with Δ=0.15 and l.sub.max=2.25.
(35) Performance Estimation Based on Problematic Objects. The impact of a problematic object on the performance of an LDPC code decoded with the MSA, AMSA, and TAMSA can be estimated. To do so, a lower bound on the FER of any LDPC code containing a given problematic object (sub-graph) is derived, assuming a particular message passing decoder and decoder quantization. A crucial aspect of the lower bound is that it is code-independent, in the sense that it can be derived based only on a problematic object and then applied to any code containing that object. Given the dominant problematic object, decoder quantization, and decoding algorithm, a performance estimate of the code containing the dominant object can be derived. The number, type, and location of problematic objects in the Tanner graph do not need to be known to implement the algorithm. However, if the dominant problematic object is known, the performance estimate can facilitate determination of the optimum algorithm parameters. The lower bounds are tight for a variety of codes, problematic objects, and decoding algorithms.
(36) By analyzing the AWGN channel performance simulations of the (8000,4000) code with a 5-bit quantizer, the (5,3) absorbing set was identified as the dominant problematic object.
(37) Simulated Performance of LDPC Codes with TAMSA and TOMSA Decoders.
(40) Table 1 illustrates average number of iterations recorded for the quasi-cyclic (155,64) Tanner code with the MSA, AMSA, and TAMSA decoding algorithms.
(41) TABLE 1

E.sub.b/N.sub.0   MSA     AMSA    TAMSA
1 dB              68.95   59.28   59.24
2 dB              30.4    23.13   22.9
3 dB              7.82    6.28    6.2
4 dB              3.06    2.95    2.87
5 dB              1.97    1.98    1.98
6 dB              1.44    1.46    1.46
7 dB              1.09    1.1     1.1
8 dB              0.85    0.86    0.86
(42) Layered MP decoding of LDPC-BCs converges faster than standard MP decoding and is commonly employed in the implementation of quasi-cyclic codes.
(43) Parameter Set Selection for TAMSA and TOMSA Decoders.
(44) In the error floor region, instead of running time-consuming code simulations, a different method can be applied to problematic objects to find the parameter sets (α, τ) that lead to the best error floor performance, as illustrated by the contour plots in the drawings.
(46) As previously discussed, the AMSA and/or OMSA can be viewed as a particular case of the TAMSA and/or TOMSA (e.g., setting τ=l.sub.max so that the reduction is always applied) and, as such, the performance of the TAMSA and/or TOMSA with optimal parameter selection is at least as good as that of the AMSA and/or OMSA. Moreover, significant performance improvements can be seen for a variety of code structures and lengths.
(47) Application of the TAMSA to Spatially Coupled LDPC Codes. Spatially Coupled LDPC Codes (SC-LDPCC) are known to combine the best features of both regular and irregular LDPC-BCs, i.e., they achieve excellent performance both in the waterfall and the error floor regions of the BER (FER) curve. The TAMSA is preferably used to decode SC-LDPCCs to further verify the effectiveness of embodiments of the present invention and to illustrate the benefit of combining the advantages of spatial coupling and the TAMSA.
(48) SC-LDPCC Parity-Check Matrix. Given an underlying LDPC-BC with a μ×ν parity-check matrix H.sub.BC and rate

(49) R.sub.BC=(ν−μ)/ν,

a terminated SC-LDPCC with parity-check matrix H.sub.SC.sup.L and syndrome former memory m can be formed by partitioning H.sub.BC into m+1 component matrices H.sub.i, i=0, 1, . . . , m, each of size μ×ν, such that

(50) H.sub.0+H.sub.1+ . . . +H.sub.m=H.sub.BC,

and arranging them as

(51)
H.sub.SC.sup.L=
[ H.sub.0                          ]
[ H.sub.1   H.sub.0                ]
[   ⋮       H.sub.1   ⋱            ]
[ H.sub.m     ⋮       ⋱   H.sub.0  ]
[           H.sub.m   ⋱   H.sub.1  ]
[                     ⋱     ⋮      ]
[                         H.sub.m  ]

where the coupling length L>m+1 denotes the number of block columns in H.sub.SC.sup.L and the rate of the terminated SC-LDPCC represented by H.sub.SC.sup.L is given by

(52) R.sub.L=1−((L+m)μ)/(Lν),

such that

(53) lim.sub.L→∞ R.sub.L=R.sub.BC.
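As a concrete, simplified sketch of the arrangement above, the following function (its name, and the use of dense NumPy arrays, are illustrative choices; practical decoders use sparse structures) assembles a terminated SC-LDPCC parity-check matrix from the component matrices H.sub.0, . . . , H.sub.m:

```python
import numpy as np

def sc_ldpc_parity_check(components, L):
    """Build the terminated SC-LDPCC parity-check matrix H_SC^L from
    component matrices H_0, ..., H_m (each mu x nu): block column t
    (t = 0, ..., L-1) places H_i in block row t + i, giving an
    (L + m) * mu by L * nu banded matrix."""
    m = len(components) - 1
    mu, nu = components[0].shape
    H = np.zeros(((L + m) * mu, L * nu), dtype=components[0].dtype)
    for t in range(L):                        # block column index
        for i, Hi in enumerate(components):
            H[(t + i) * mu:(t + i + 1) * mu, t * nu:(t + 1) * nu] = Hi
    return H
```

Each block column contains every H.sub.i exactly once, so summing a block column over its block rows recovers H.sub.BC, consistent with the partition condition in equation (50).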
(54) Sliding Window Decoding of SC-LDPCCs. A sliding window (SW) decoder can be used to address the large latency and complexity requirements of decoding SC-LDPCCs with a standard flooding schedule decoder.
(55) Cut-and-Paste Construction of SC-LDPCC. For the case m=1, the cut-and-paste method of constructing an SC-LDPCC uses a cutting vector w=[w.sub.0, w.sub.1, . . . , w.sub.η-1] of non-decreasing, non-negative integers (0<w.sub.0≤w.sub.1≤ . . . ≤w.sub.η-1<v) to form two component matrices H.sub.0 and H.sub.1 from a μ×ν LDPC-BC parity-check matrix H.sub.BC. The cutting vector partitions H.sub.BC, composed of an η×v array of β×β blocks (so that μ=ηβ and ν=vβ), into two parts, one below and one above the cut, which can be represented by H.sub.0 and H.sub.1, respectively, such that

(56) H.sub.0+H.sub.1=H.sub.BC,

where the underlying LDPC-BC has rate

(57) R.sub.BC=(v−η)/v.

For quasi-cyclic LDPC-BCs, such as array codes and Tanner codes, the parameter β is set equal to the size of the circulant permutation matrices in order to maintain the code structure.
(58) Simulation Results. The simulation results for the SC-LDPCC versions of the (8000,4000) LDPC-BC and the quasi-cyclic (155,64) Tanner code, decoded with the TAMSA and an SW decoder with W=6, where 50 iterations were performed at each window position, are presented in the drawings.
(61) The preceding examples can be repeated with similar success by substituting the generically or specifically described components and/or operating conditions of embodiments of the present invention for those used in the preceding examples.
(62) Optionally, embodiments of the present invention can include a general or specific purpose computer or distributed system programmed with computer software implementing the steps described above, which computer software may be in any appropriate computer language, including but not limited to C++, FORTRAN, BASIC, Java, Python, assembly language, microcode, distributed programming languages, etc. The apparatus may also include a plurality of such computers/distributed systems (e.g., connected over the Internet and/or one or more intranets) in a variety of hardware implementations. For example, data processing can be performed by an appropriately programmed microprocessor, computing cloud, Application Specific Integrated Circuit (ASIC), Field Programmable Gate Array (FPGA), or the like, in conjunction with appropriate memory, network, and bus elements. One or more processors and/or microcontrollers can operate via the instructions of the computer code and the software is preferably stored on one or more tangible non-transitory memory-storage devices.
(63) Note that in the specification and claims, about or approximately means within twenty percent (20%) of the numerical amount cited. All computer software disclosed herein can be embodied on any non-transitory computer-readable medium (including combinations of mediums), including without limitation CD-ROMs, DVD-ROMs, hard drives (local or network storage devices), USB keys, other removable drives, ROMs, and firmware.
(64) Embodiments of the present invention can include every combination of features that are disclosed herein independently from each other. Although the invention has been described in detail with particular reference to the disclosed embodiments, other embodiments can achieve the same results. Variations and modifications of the present invention will be obvious to those skilled in the art and it is intended to cover in the appended claims all such modifications and equivalents. The entire disclosures of all references, applications, patents, and publications cited above are hereby incorporated by reference. Unless specifically stated as being essential above, none of the various components or the interrelationship thereof are essential to the operation of the invention. Rather, desirable results can be achieved by substituting various components and/or reconfiguring their relationships with one another.