Threshold-based min-sum algorithm to lower the error floors of quantized low-density parity-check decoders
11962324 · 2024-04-16
Assignee
Inventors
- Homayoon HATAMI (San Diego, CA, US)
- David G. Mitchell (Las Cruces, NM, US)
- Daniel Costello (Clarendon Hills, IL, US)
- Thomas Fuja (South Bend, IN, US)
CPC classification
- H03M13/6577 (ELECTRICITY)
- H03M13/6583
- H03M13/114
- H03M13/1125
- H03M13/1137
- H03M13/1128
- H03M13/1122
- H03M13/112
International classification
Abstract
A modified version of the min-sum algorithm (MSA) which can lower the error floors of quantized LDPC decoders. A threshold attenuated min-sum algorithm (TAMSA) and/or threshold offset min-sum algorithm (TOMSA) selectively attenuates or offsets a check node log-likelihood ratio (LLR) if the check node receives any variable node LLR with magnitude below a predetermined threshold, while allowing a check node LLR to reach the maximum quantizer level if all the variable node LLRs received by the check node have magnitude greater than the threshold. Embodiments of the present invention can provide desirable results even without knowledge of the location, type, or multiplicity of problematic graphical objects and can be implemented with only a minor modification to existing decoder hardware.
Claims
1. A method for lowering an error floor of a low-density parity-check (LDPC) decoder chip, thereby improving bit-error-rate and/or frame-error-rate performance of the LDPC decoder chip, the method comprising: for each message passed from a check node to a variable node, computing a check node log-likelihood ratio within a check node processing unit of the LDPC decoder chip by iteratively passing quantized messages between processing units on a decoder chip, wherein a plurality of check nodes are connected to a respective variable node and wherein a plurality of variable nodes are connected to a respective check node, and wherein the connections are specified by a parity-check matrix of the LDPC code, wherein computing further comprises: comparing a minimum value of a set of variable node log-likelihood ratio magnitudes that are input into the check node processing unit with a threshold, wherein each of the input variable node log-likelihood ratio magnitudes is one of a plurality connected to a respective check node; based on the results of the comparison, determining at the check node processing unit whether to apply a reduction to a check node log-likelihood ratio magnitude or not to apply a reduction to the check node log-likelihood ratio magnitude; applying a reduction to the check node log-likelihood ratio magnitude in instances when the determining step determines that a reduction should be applied; and not applying a reduction to the check node log-likelihood ratio magnitude in instances when the determining step determines that a reduction should not be applied.
2. The method of claim 1 wherein applying a reduction to the check node log-likelihood ratio magnitude at the check node processing unit comprises applying attenuation.
3. The method of claim 2 wherein applying attenuation comprises multiplying the check node log-likelihood ratio magnitude by a value that is greater than 0 and less than 1.
4. The method of claim 3 wherein applying attenuation comprises multiplying the check node log-likelihood ratio magnitude by a value that is less than one and greater than or equal to 0.5.
5. The method of claim 2 wherein applying attenuation comprises applying attenuation to the check node log-likelihood ratio magnitude before the check node log-likelihood ratio magnitude is passed from a check node to a variable node.
6. The method of claim 1 wherein applying a reduction to the check node log-likelihood ratio magnitude at the check node processing unit comprises applying an offset.
7. The method of claim 6 wherein applying an offset comprises applying a subtraction function.
8. The method of claim 7 wherein applying a subtraction function comprises subtraction of a predetermined number that is greater than zero.
9. The method of claim 7 wherein applying a subtraction function comprises subtracting a value greater than zero before passing the check node log-likelihood ratio to a connected variable node.
10. The method of claim 1 wherein comparing a minimum value of a set of connected variable node log-likelihood ratio magnitudes that are input into the check node processing unit with a threshold comprises determining whether the minimum value is less than the threshold.
11. The method of claim 1 wherein comparing a minimum value of a set of connected variable node log-likelihood ratio magnitudes that are input into the check node processing unit with a threshold comprises determining whether the minimum value is less than or equal to the threshold.
12. The method of claim 1 wherein comparing a minimum value of a set of connected variable node log-likelihood ratio magnitudes that are input into the check node processing unit with a threshold comprises comparing the minimum value with a predetermined value.
13. The method of claim 12 wherein the predetermined value comprises a value having a magnitude greater than 0.
14. The method of claim 1 wherein comparing a minimum value of a set of connected variable node log-likelihood ratio magnitudes that are input into the check node processing unit with a threshold further comprises comparing a second minimum value of the set of connected variable node log-likelihood ratio magnitudes that are input into the check node processing unit with a threshold.
15. The method of claim 1 wherein applying a reduction to the check node log-likelihood ratio magnitude comprises applying a multiplication function of greater than zero and less than one and applying a subtraction function to the check node log-likelihood ratio magnitude.
16. The method of claim 1 wherein applying a reduction to the check node log-likelihood ratio magnitude further comprises varying an amount of the reduction for each iteration.
17. The method of claim 1 wherein applying a reduction to the check node log-likelihood ratio magnitude further comprises varying an amount of the reduction based on a location of the check node log-likelihood ratio in a graph.
Description
BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
(1) The accompanying drawings, which are incorporated into and form a part of the specification, illustrate one or more embodiments of the present invention and, together with the description, serve to explain the principles of the invention. The drawings are only for the purpose of illustrating one or more embodiments of the invention and are not to be construed as limiting the invention. In the drawings:
DETAILED DESCRIPTION OF THE INVENTION
(14) Embodiments of the present invention relate to decoding low-density parity-check (LDPC) codes. The attenuated min-sum algorithm (AMSA) and the offset min-sum algorithm (OMSA) can outperform the conventional min-sum algorithm (MSA) at low signal-to-noise ratios (SNRs), i.e., in the waterfall region of the bit error rate curve. For quantized decoders, however, the MSA actually outperforms the AMSA and OMSA in the error floor (high SNR) region, and all three algorithms suffer from a relatively high error floor. Embodiments of the present invention can include a modified MSA that can outperform the MSA, AMSA, and OMSA across all SNRs, with implementation complexity only slightly higher than that of the AMSA or OMSA. Simulations using several classes of LDPC codes (including spatially coupled LDPC codes) show that embodiments of the present invention outperform the MSA, AMSA, and OMSA across all SNRs.
(15) Embodiments of the present invention relate to a novel modification to the check node update of quantized MSA that is straightforward to implement and reduces the error floor when compared to other known methods.
(16) As background, let V={v.sub.1, v.sub.2, . . . , v.sub.n} and C={c.sub.1, c.sub.2, . . . , c.sub.m} represent the sets of variable nodes and check nodes, respectively, of a bipartite Tanner graph representation of an LDPC code with parity-check matrix H. Assume that a binary codeword u=(u.sub.1, u.sub.2, . . . , u.sub.n) is binary phase shift keyed (BPSK) modulated such that each zero is mapped to +1 and each one is mapped to −1. The modulated signal is transmitted over an AWGN channel with mean 0 and standard deviation σ. The received signal is {tilde over (r)}=1−2u+n, where n is the channel noise. The quantized version of {tilde over (r)} is denoted as r=(r.sub.1, r.sub.2, . . . , r.sub.n).
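As an illustrative sketch of the channel model above (the function name and the use of NumPy are choices made here for illustration, not part of the disclosure), the BPSK mapping and AWGN noise can be written as:

```python
import numpy as np

def bpsk_awgn(u, sigma, rng=None):
    """Map a binary codeword u to BPSK (0 -> +1, 1 -> -1) and add
    white Gaussian noise with mean 0 and standard deviation sigma,
    producing the received signal r~ = 1 - 2u + n."""
    rng = rng or np.random.default_rng(0)
    x = 1 - 2 * np.asarray(u)                 # BPSK modulation
    n = rng.normal(0.0, sigma, size=x.shape)  # channel noise
    return x + n

r_tilde = bpsk_awgn([0, 1, 1, 0], sigma=0.5)
```

With sigma set to 0 the output reduces to the noiseless BPSK symbols ±1, which is a convenient sanity check.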
(17) The Min-Sum Algorithm and its Modifications. The MSA is an iterative message passing (MP) algorithm that is simpler to implement than the sum-product algorithm (SPA). Unlike the SPA, the MSA does not require channel noise information to calculate the channel log-likelihood ratios (LLRs). The SPA is optimum for codes without cycles, but for finite length codes and finite precision LLRs, the SPA is not necessarily optimum, particularly with respect to error floor performance. Let L.sub.ij represent the LLR passed from variable node v.sub.i to check node c.sub.j in a given iteration and let L.sub.ji represent the LLR passed from c.sub.j to v.sub.i. The check nodes that are neighbors to v.sub.i are denoted N(v.sub.i), and the variable nodes that are neighbors to c.sub.j are denoted N(c.sub.j). To initialize decoding, each variable node v.sub.i passes r.sub.i to the check nodes in N(v.sub.i), i.e.,

L.sub.ij=r.sub.i, (Equation 1)

where the L.sub.ij's computed throughout the decoding process are referred to as the variable node LLRs. The check node operation to calculate the LLRs sent from check node c.sub.j to variable node v.sub.i is given by
(18) L.sub.ji=(Π.sub.i′∈N(c.sub.j)\i sign(L.sub.i′j))·min.sub.i′∈N(c.sub.j)\i|L.sub.i′j|, (Equation 2)
where the L.sub.ji's computed throughout the decoding process are referred to as the check node LLRs. After each iteration, the hard decision estimate û is checked to see if it is a valid codeword, where û.sub.i=0 if and only if

(19) r.sub.i+Σ.sub.j∈N(v.sub.i)L.sub.ji≥0. (Equation 3)

If û is a valid codeword, or if the iteration number has reached I.sub.max, decoding stops. Otherwise, the variable node LLRs are calculated as

L.sub.ij=r.sub.i+Σ.sub.j′∈N(v.sub.i)\j L.sub.j′i (Equation 4)
and decoding continues using equation 2. Two modified versions of the MSA, called attenuated (or normalized) MSA (AMSA) and offset MSA (OMSA), were introduced to reduce the waterfall performance loss of the MSA compared to the SPA. The modified check node computations are given by
(20) L.sub.ji=α·(Π.sub.i′∈N(c.sub.j)\i sign(L.sub.i′j))·min.sub.i′∈N(c.sub.j)\i|L.sub.i′j| (Equation 5)

L.sub.ji=(Π.sub.i′∈N(c.sub.j)\i sign(L.sub.i′j))·max(min.sub.i′∈N(c.sub.j)\i|L.sub.i′j|−β, 0), (Equation 6)

respectively, where 0<α<1 and β>0 are constants. In both algorithms, the check node LLR magnitudes are modified to be smaller than those of the MSA. This reduces the negative effect of overestimating the LLR magnitudes in the MSA, whose larger check node LLR magnitudes compared to the SPA can cause additional errors in decoding at low SNRs.
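The MSA, AMSA, and OMSA check node updates can be sketched in software as follows (a minimal reference implementation for illustration; the function name and list-based message representation are choices made here, and alpha=1, beta=0 recovers the plain MSA):

```python
def check_node_llrs(var_llrs, alpha=1.0, beta=0.0):
    """Check node update for the MSA (alpha=1, beta=0), AMSA (alpha<1),
    or OMSA (beta>0).  var_llrs holds the LLRs L_ij received from the
    variable nodes in N(c_j); entry i of the result is the extrinsic
    LLR L_ji returned to variable node v_i (input i is excluded)."""
    out = []
    for i in range(len(var_llrs)):
        others = var_llrs[:i] + var_llrs[i + 1:]
        sign = 1
        for l in others:                      # product of signs
            sign = -sign if l < 0 else sign
        mag = min(abs(l) for l in others)     # minimum magnitude
        mag = max(alpha * mag - beta, 0.0)    # attenuate and/or offset
        out.append(sign * mag)
    return out
```

For example, `check_node_llrs([2.0, -0.5, 1.5])` returns `[-0.5, 1.5, -0.5]`: each output takes the smallest magnitude among the other inputs, with the product of the other inputs' signs.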
(21) Implementation of the MSA, AMSA, and OMSA. To implement the check node update of equation 2, in the check node processing unit corresponding to c.sub.j, the sign and magnitude of L.sub.ji to be sent to each v.sub.i are calculated separately as follows. First, for all i∈N(c.sub.j), the signs of the L.sub.ij are multiplied to form Π.sub.i∈N(c.sub.j) sign(L.sub.ij). Then, for each i∈N(c.sub.j), sign(L.sub.ij) is multiplied by Π.sub.i′∈N(c.sub.j) sign(L.sub.i′j) to form Π.sub.i′∈N(c.sub.j)\i sign(L.sub.i′j). Second, the process of calculating |L.sub.ji| involves finding two minimum values, the first and second minimum of all the |L.sub.ij| at check node c.sub.j, denoted min.sub.1,j and min.sub.2,j, respectively. For each L.sub.ji, if the variable node v.sub.i corresponds to min.sub.1,j, then |L.sub.ji|=min.sub.2,j; otherwise, |L.sub.ji|=min.sub.1,j. The implementation of equation 5 or 6 is the same, with an extra step of attenuating or offsetting the minimum values.
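The two-minimum trick described above can be sketched as follows (an illustrative software analogue of the hardware sub-units; the function name and returned tuple layout are choices made here):

```python
def two_min_update(var_llrs):
    """Magnitude part of the min-sum check node update via the
    two-minimum trick: find min1_j, min2_j and the index of the input
    that produced min1_j; the output magnitude on that edge is min2_j
    and min1_j on every other edge."""
    mags = [abs(l) for l in var_llrs]
    idx1 = min(range(len(mags)), key=lambda i: mags[i])   # argmin
    min1 = mags[idx1]
    min2 = min(m for i, m in enumerate(mags) if i != idx1)
    out_mags = [min2 if i == idx1 else min1 for i in range(len(mags))]
    return out_mags, idx1, min1
```

Only two values and one index need to be stored per check node, rather than one minimum per outgoing edge, which is what makes this formulation attractive in hardware.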
(22) The process of finding min.sub.1,j and min.sub.2,j is complex to implement. Therefore, several methods have been suggested to reduce the complexity of the process or to avoid calculating min.sub.2,j and instead estimate it based on min.sub.1,j. The result is that min.sub.1,j plays an important role in the check node processing unit, and since embodiments of the present invention also rely on min.sub.1,j, extending the algorithm to such complexity-reduction techniques is possible.
(23) Quantized Decoders. In a uniform quantized decoder, the operations in equations 1-6 have finite precision, i.e., the values are quantized to a set of numbers ranging from −l.sub.max to l.sub.max, with step size Δ, where the resulting quantization levels for a q-bit quantizer are

(24) {0, ±Δ, ±2Δ, . . . , ±l.sub.max}, with l.sub.max=(2.sup.q-1−1)·Δ.

The attenuation and offset parameters α and β in equations 5 and 6 that have the best iterative decoding thresholds can be found by computer simulation or by using a technique called quantized density evolution.
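A minimal sketch of such a uniform symmetric quantizer (the function name is illustrative; default parameters match the 5-bit quantizer with Δ=0.15 and l.sub.max=2.25 used in the examples later in this description):

```python
def uniform_quantize(x, delta=0.15, q_bits=5):
    """Uniform symmetric quantizer with levels {0, +/-delta, ...,
    +/-l_max}, where l_max = (2**(q_bits-1) - 1) * delta.  Values are
    rounded to the nearest level and saturated at +/-l_max."""
    n_levels = 2 ** (q_bits - 1) - 1          # 15 levels each side for q=5
    level = round(x / delta)                  # nearest level index
    level = max(-n_levels, min(n_levels, level))  # saturate
    return level * delta
```

For q_bits=5 and delta=0.15 this gives l_max = 15 × 0.15 = 2.25, consistent with the quantizer parameters quoted for the simulation examples.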
(25) Trapping Sets and Error Floors. Let A denote a subset of V of cardinality a. Let A.sub.even and A.sub.odd represent the subsets of check nodes connected to variable nodes in A with even and odd degrees, respectively, where |A.sub.odd|=b. Here, A is referred to as an (a, b) trapping set, and the subgraph induced by A together with its neighboring check nodes is denoted G(A). A is defined to be an (a, b) absorbing set if each variable node in A is connected to fewer check nodes in A.sub.odd than in A.sub.even. These sets, along with similar objects such as elementary trapping sets and leafless elementary trapping sets, are known to cause most of the decoding errors at high SNRs in MP decoders.
(26) Threshold Attenuated/Offset MSA. Motivation and Rationale. Although it is known that applying attenuation or offset when computing the check node LLRs typically improves performance in the low SNR (waterfall) region of the BER curve for quantized decoders, the AMSA and OMSA do not necessarily achieve a good error floor, because high SNR performance is tied to problematic graphical objects. This behavior can be observed, for example, assuming BPSK modulation on the AWGN channel.
(28) At high SNRs, for a received vector r of channel LLRs, decoding is successful with high probability. In the case of unsuccessful decoding, it is known that a small number of problematic objects are likely to be the cause, i.e., objects containing variable nodes with unreliable (small magnitude) LLR values. In this regime, however, the channel LLRs for the variable nodes outside a problematic object will be mostly reliable and have large magnitudes. In other words, the outside LLRs are typically initially large (with the correct sign) and will continue to grow quickly to even larger values (often l.sub.max). However, even though some or all of the incorrect sign LLRs inside a problematic object are initially small, they can also be observed to grow quickly to larger values without correcting the errors in sign. This happens because the problematic object contains at least one short cycle, which prevents correction of the sign errors.
(29) To improve the probability of correcting errors occurring in a problematic object G(A) at high SNR, we have found that it is helpful if the LLR magnitudes sent from a check node c.sub.j?A.sub.even to variable nodes v.sub.i?A grow more slowly (i.e. are attenuated) when c.sub.j receives at least one unreliable (small magnitude) LLR from a variable node in A. This ensures that any incorrect LLRs received from the channel in A are not reinforced. On the other hand, if a check node c.sub.j (inside or outside G(A)) receives all large magnitude LLRs, these can be helpful for decoding and hence should not be attenuated. These two factors form the essence of the new threshold-based modification of AMSA/OMSA, presented below, that can lead to correct decoding of a received vector r that would not otherwise occur.
(30) A Threshold Attenuated/Offset MSA. An embodiment of the present invention preferably makes use of a relationship observed at high SNRs between the variable node LLR magnitudes |L.sub.ij| received by check node c.sub.j and the likelihood of the check node c.sub.j being inside a problematic object G(A). This relationship allows the problem of locating errors affected by G(A) to be reduced to merely considering the variable node LLR magnitudes |L.sub.ij| received at check node c.sub.j, i.e., relying on the |L.sub.ij|'s to tell if c.sub.j is likely to be inside G(A) and has the potential to cause decoding failures. At high SNRs, the check node LLRs outside G(A) typically grow faster than the LLRs inside G(A). Therefore, if a check node c.sub.j receives at least one small LLR, i.e., min.sub.i∈N(c.sub.j)|L.sub.ij|=min.sub.1,j≤τ, where τ is a predetermined threshold, it is likely that c.sub.j is inside G(A). Consequently, to improve the error floor performance, the check node computation in equation 7 is preferably used to replace equation 2, where 0<α<1 is an attenuation parameter designed to reduce the check node LLR magnitudes sent from a check node c.sub.j inside G(A) to the variable nodes in A. This modified check node update algorithm is referred to as the threshold attenuated MSA (TAMSA). As will be further shown, with a proper choice of the parameters α and τ, the TAMSA is capable of correctly decoding some of the errors that occur in the AMSA or MSA due to problematic objects.
(31) In equation 7, α is used to make the check node LLR magnitudes smaller when min.sub.i∈N(c.sub.j)|L.sub.ij|≤τ. As an alternative (or in combination), an offset parameter β can be used to serve the same purpose, as illustrated in equation 8, where β>0 is an offset parameter that reduces the check node LLR magnitudes. This modified check node update algorithm is denoted as the threshold offset MSA (TOMSA). Both the TAMSA and TOMSA selectively, or locally, reduce the magnitudes of the check node LLRs that are likely to belong to a problematic object without requiring knowledge of its location or structure. The TAMSA and TOMSA add a simple threshold test compared to the AMSA and OMSA, while the attenuation (offset) parameter only needs to be applied to a few check nodes at high SNRs.
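The threshold-based check node rule can be sketched in software as follows (an illustrative rendering of the TAMSA/TOMSA behavior described above, not the hardware implementation; the function name and parameter defaults are choices made here):

```python
def tamsa_check_node(var_llrs, tau, alpha=1.0, beta=0.0):
    """Threshold attenuated/offset min-sum check node update: the
    reduction (attenuation alpha and/or offset beta) is applied only
    when the smallest input magnitude min1_j is at most tau; otherwise
    the plain min-sum magnitudes are kept, so the output can still
    reach the maximum quantizer level."""
    min1 = min(abs(l) for l in var_llrs)
    reduce_output = min1 <= tau               # the one extra comparison
    out = []
    for i in range(len(var_llrs)):
        others = var_llrs[:i] + var_llrs[i + 1:]
        sign = 1
        for l in others:
            sign = -sign if l < 0 else sign
        mag = min(abs(l) for l in others)
        if reduce_output:
            mag = max(alpha * mag - beta, 0.0)
        out.append(sign * mag)
    return out
```

With tau set to the maximum quantizer level the reduction is always applied and the rule reduces to the AMSA/OMSA, while tau=0 recovers the plain MSA; the threshold interpolates between the two.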
(32) In one embodiment, the threshold τ can optionally be varied by iteration number; for example, the value of τ used in equations 7 and 8 can be a function τ(I) of the iteration number 0≤I≤I.sub.max. The threshold τ can also optionally be varied by graph location, for example as a function τ(j) of the check node index j. Although embodiments of the present invention can provide desirable results without such variations, such variations can provide further performance improvements.
(33) Implementation of Threshold Attenuated/Offset MSA. For the MSA, for some number K of inputs to a check node processing unit, the implementation of sub-units to calculate min.sub.1,j and min.sub.2,j and the index needed to identify which input created min.sub.1,j requires a significant number of multiplexers, comparators, and inverters, which is a function of K. A check node processing unit preferably includes some additional sub-units to generate the proper output and apply the attenuation (offset) parameter for the AMSA and/or OMSA. Implementation of the TAMSA and/or TOMSA adds just two simple steps to the implementation of the AMSA and/or OMSA. First, for a check node processing unit corresponding to c.sub.j, after calculating min.sub.1,j and min.sub.2,j, the value of min.sub.1,j is preferably compared to τ. Second, a decision is made based on the outcome of the comparison to use the attenuated (offset) or non-attenuated (non-offset) output. Consequently, implementation of the TAMSA and/or TOMSA requires just one extra comparator and K extra multiplexers to decide if attenuation (offset) should be applied. If not, the additional multiplication for attenuation (or subtraction for offset) is not necessary. Hence, the extra requirements do not significantly increase the overall area or delay of a check node processing unit.
(34) To illustrate the robustness of an embodiment of the present invention, consider the (8000,4000) MacKay code, the progressive edge growth (PEG) (1008,504) LDPC code, and the quasi-cyclic (155,64) Tanner code decoded with various algorithms, including the TAMSA and TOMSA according to an embodiment of the present invention, with different parameters, each using a 5-bit uniform quantizer with Δ=0.15 and l.sub.max=2.25.
(35) Performance Estimation Based on Problematic Objects. The impact of a problematic object on the performance of an LDPC code decoded with the MSA, AMSA, and TAMSA can be estimated. To do so, a lower bound on the FER of any LDPC code containing a given problematic object (sub-graph) is derived, assuming a particular message passing decoder and decoder quantization. A crucial aspect of the lower bound is that it is code-independent, in the sense that it can be derived based only on a problematic object and then applied to any code containing that object. Given the dominant problematic object, decoder quantization, and decoding algorithm, a performance estimate of the code containing the dominant object can be derived. The number, type, and location of problematic objects in the Tanner graph do not need to be known to implement the algorithm. However, if the dominant problematic object is known, the performance estimate can facilitate determination of the optimum algorithm parameters. The lower bounds are tight for a variety of codes, problematic objects, and decoding algorithms.
(36) By analyzing the AWGN channel performance simulations of the (8000,4000) code with a 5-bit quantizer, the (5,3) absorbing set was identified as the dominant problematic object.
(37) Simulated Performance of LDPC Codes with TAMSA and TOMSA Decoders.
(40) Table 1 illustrates average number of iterations recorded for the quasi-cyclic (155,64) Tanner code with the MSA, AMSA, and TAMSA decoding algorithms.
(41) TABLE 1

E.sub.b/N.sub.0   MSA     AMSA    TAMSA
1 dB              68.95   59.28   59.24
2 dB              30.4    23.13   22.9
3 dB              7.82    6.28    6.2
4 dB              3.06    2.95    2.87
5 dB              1.97    1.98    1.98
6 dB              1.44    1.46    1.46
7 dB              1.09    1.1     1.1
8 dB              0.85    0.86    0.86
(42) Layered MP decoding of LDPC-BCs converges faster than standard MP decoding and is commonly employed in the implementation of quasi-cyclic codes.
(43) Parameter Set Selection for TAMSA and TOMSA Decoders.
(44) In the error floor region, instead of running time-consuming code simulations, a different method can be applied to problematic objects to find the parameter sets (α, τ) that lead to the best error floor performance, as illustrated by the contour plots in the drawings.
(46) As previously discussed, the AMSA and/or OMSA can be viewed as a particular case of the TAMSA and/or TOMSA (e.g., setting τ=l.sub.max so that the reduction is always applied) and, as such, the performance of the TAMSA and/or TOMSA with optimal parameter selection is at least as good as that of the AMSA and/or OMSA. Moreover, significant performance improvements can be seen for a variety of code structures and lengths.
(47) Application of the TAMSA to Spatially Coupled LDPC Codes. Spatially Coupled LDPC Codes (SC-LDPCC) are known to combine the best features of both regular and irregular LDPC-BCs, i.e., they achieve excellent performance both in the waterfall and the error floor regions of the BER (FER) curve. The TAMSA is preferably used to decode SC-LDPCCs to further verify the effectiveness of embodiments of the present invention and to illustrate the benefit of combining the advantages of spatial coupling and the TAMSA.
(48) SC-LDPCC Parity-Check Matrix. Given an underlying LDPC-BC with a μ×ν parity-check matrix H.sub.BC and rate

(49) R.sub.BC=(ν−μ)/ν,

a terminated SC-LDPCC with parity-check matrix H.sub.SC.sup.L and syndrome former memory m can be formed by partitioning H.sub.BC into m+1 component matrices H.sub.i, i=0, 1, . . . , m, each of size μ×ν, such that

(50) H.sub.0+H.sub.1+ . . . +H.sub.m=H.sub.BC,

and arranging them as

(51)
H.sub.SC.sup.L=
[ H.sub.0                          ]
[ H.sub.1   H.sub.0                ]
[   ⋮       H.sub.1   ⋱            ]
[ H.sub.m     ⋮       ⋱   H.sub.0  ]
[           H.sub.m   ⋱   H.sub.1  ]
[                     ⋱     ⋮      ]
[                         H.sub.m  ]

where the coupling length L>m+1 denotes the number of block columns in H.sub.SC.sup.L and the rate of the terminated SC-LDPCC represented by H.sub.SC.sup.L is given by

(52) R.sub.L=1−((L+m)μ)/(Lν),

such that

(53) lim.sub.L→∞ R.sub.L=R.sub.BC.
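As a concrete, simplified sketch of the arrangement above, the following function (its name, and the use of dense NumPy arrays, are illustrative choices; practical decoders use sparse structures) assembles a terminated SC-LDPCC parity-check matrix from the component matrices H.sub.0, . . . , H.sub.m:

```python
import numpy as np

def sc_ldpc_parity_check(components, L):
    """Build the terminated SC-LDPCC parity-check matrix H_SC^L from
    component matrices H_0, ..., H_m (each mu x nu): block column t
    (t = 0, ..., L-1) places H_i in block row t + i, giving an
    (L + m) * mu by L * nu banded matrix."""
    m = len(components) - 1
    mu, nu = components[0].shape
    H = np.zeros(((L + m) * mu, L * nu), dtype=components[0].dtype)
    for t in range(L):                        # block column index
        for i, Hi in enumerate(components):
            H[(t + i) * mu:(t + i + 1) * mu, t * nu:(t + 1) * nu] = Hi
    return H
```

Each block column contains every H.sub.i exactly once, so summing a block column over its block rows recovers H.sub.BC, consistent with the partition condition in equation (50).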
(54) Sliding Window Decoding of SC-LDPCCs. A sliding window (SW) decoder can be used to address the large latency and complexity requirements of decoding SC-LDPCCs with a standard flooding schedule decoder.
(55) Cut-and-Paste Construction of SC-LDPCC. For the case m=1, the cut-and-paste method of constructing an SC-LDPCC uses a cutting vector w=[w.sub.0, w.sub.1, . . . , w.sub.η-1] of non-decreasing, non-negative integers (0<w.sub.0≤w.sub.1≤ . . . ≤w.sub.η-1<v) to form two component matrices H.sub.0 and H.sub.1 from a μ×ν LDPC-BC parity-check matrix H.sub.BC. The cutting vector partitions H.sub.BC, composed of an η×v array of β×β blocks (so that μ=ηβ and ν=vβ), into two parts, one below and one above the cut, which can be represented by H.sub.0 and H.sub.1, respectively, such that

(56) H.sub.0+H.sub.1=H.sub.BC,

where the underlying LDPC-BC has rate

(57) R.sub.BC=(v−η)/v.

For quasi-cyclic LDPC-BCs, such as array codes and Tanner codes, the parameter β is set equal to the size of the circulant permutation matrices in order to maintain the code structure.
(58) Simulation Results. The simulation results for the SC-LDPCC versions of the (8000,4000) LDPC-BC and the quasi-cyclic (155,64) Tanner code, decoded with the TAMSA and an SW decoder with W=6, where 50 iterations were performed at each window position, are presented in the drawings.
(61) The preceding examples can be repeated with similar success by substituting the generically or specifically described components and/or operating conditions of embodiments of the present invention for those used in the preceding examples.
(62) Optionally, embodiments of the present invention can include a general or specific purpose computer or distributed system programmed with computer software implementing the steps described above, which computer software may be in any appropriate computer language, including but not limited to C++, FORTRAN, BASIC, Java, Python, assembly language, microcode, distributed programming languages, etc. The apparatus may also include a plurality of such computers/distributed systems (e.g., connected over the Internet and/or one or more intranets) in a variety of hardware implementations. For example, data processing can be performed by an appropriately programmed microprocessor, computing cloud, Application Specific Integrated Circuit (ASIC), Field Programmable Gate Array (FPGA), or the like, in conjunction with appropriate memory, network, and bus elements. One or more processors and/or microcontrollers can operate via the instructions of the computer code and the software is preferably stored on one or more tangible non-transitory memory-storage devices.
(63) Note that in the specification and claims, about or approximately means within twenty percent (20%) of the numerical amount cited. All computer software disclosed herein can be embodied on any non-transitory computer-readable medium (including combinations of mediums), including without limitation CD-ROMs, DVD-ROMs, hard drives (local or network storage devices), USB keys, other removable drives, ROMs, and firmware.
(64) Embodiments of the present invention can include every combination of features that are disclosed herein independently from each other. Although the invention has been described in detail with particular reference to the disclosed embodiments, other embodiments can achieve the same results. Variations and modifications of the present invention will be obvious to those skilled in the art and it is intended to cover in the appended claims all such modifications and equivalents. The entire disclosures of all references, applications, patents, and publications cited above are hereby incorporated by reference. Unless specifically stated as being essential above, none of the various components or the interrelationship thereof are essential to the operation of the invention. Rather, desirable results can be achieved by substituting various components and/or reconfiguring their relationships with one another.