Decoding method and decoding system for a parity check code
10680648 · 2020-06-09
Assignee
Inventors
- Stefano Calabró (Senningerberg, LU)
- Peter Kainzmaier (Senningerberg, LU)
- Heinrich Von Kirchbauer (Senningerberg, LU)
CPC classification
H03M13/6566
H03M13/114
H03M13/1114
H03M13/116
H03M13/112
(all in Section H: ELECTRICITY)
International classification
Abstract
A decoding system for an iterative decoding of a parity check code comprises a first loop circuit adapted to store log-likelihood ratio values corresponding to a plurality of received data symbols in a memory unit; a second loop circuit adapted to compute a difference between a check-to-variable log-likelihood message at a second iteration step, and a check-to-variable log-likelihood message at a first iteration step, wherein the first iteration step precedes the second iteration step; and an adder unit adapted to update a log-likelihood ratio value stored in the first loop circuit by adding the difference computed in the second loop circuit; wherein the first loop circuit and the second loop circuit are synchronized such that the adder unit forwards the updated log-likelihood ratio value synchronously both to the first loop circuit and to the second loop circuit.
Claims
1. A decoding system for iterative decoding of a parity check code, comprising: a first loop circuit adapted to store log-likelihood ratio values corresponding to a plurality of received data symbols in a memory unit; a second loop circuit adapted to compute a difference between a check-to-variable log-likelihood message at a second iteration step, and a check-to-variable log-likelihood message at a first iteration step, wherein the first iteration step precedes the second iteration step; and an adder unit adapted to update a log-likelihood ratio value stored in the first loop circuit by adding the difference computed in the second loop circuit; wherein the first loop circuit and the second loop circuit are synchronized such that the adder unit forwards the updated log-likelihood ratio value synchronously both to the first loop circuit and to the second loop circuit.
2. The decoding system according to claim 1, wherein the first loop circuit is adapted to store the updated log-likelihood ratio value in the memory unit.
3. The decoding system according to claim 1, wherein the decoding system is adapted to access the memory unit with only a single read operation per iteration, and/or only a single write operation per iteration.
4. The decoding system according to claim 1, wherein the second loop circuit comprises a permutator element adapted to route the updated log-likelihood ratio value to corresponding check nodes.
5. The decoding system according to claim 4, wherein the second loop circuit comprises an inverse permutator element, wherein the inverse permutator element is adapted to reverse a permutation introduced by the permutator element to the computed difference.
6. The decoding system according to claim 1, wherein the second loop circuit is adapted to compute a check-to-variable log-likelihood message from at least one variable-to-check log-likelihood message.
7. The decoding system according to claim 1, wherein the second loop circuit is adapted to compute a check-to-variable log-likelihood message on a basis of only a subset of variable-to-check log-likelihood messages.
8. The decoding system according to claim 1, wherein the second loop circuit is adapted to compute the check-to-variable log-likelihood message on a basis of an approximation of an exclusive or operation of a plurality of variable-to-check log-likelihood messages.
9. A method for iterative decoding of a parity check code, comprising: storing log-likelihood ratio values corresponding to a plurality of received data symbols in a memory unit of a first loop circuit; computing a difference between a check-to-variable log-likelihood message at a second iteration step, and a check-to-variable log-likelihood message at a first iteration step by means of a second loop circuit, wherein the first iteration step precedes the second iteration step; updating a log-likelihood ratio value stored in the first loop circuit by adding the difference computed in the second loop circuit; and synchronizing the first loop circuit and the second loop circuit such that the updated log-likelihood ratio value is provided synchronously both to the first loop circuit and to the second loop circuit.
10. The method according to claim 9, further comprising storing the updated log-likelihood ratio value in the memory unit.
11. The method according to claim 9, wherein the memory unit is accessed with only a single read operation per iteration, and/or only a single write operation per iteration.
12. The method according to claim 9, further comprising routing the updated log-likelihood ratio value to corresponding check nodes by means of a permutation operation in the second loop circuit.
13. The method according to claim 9, further comprising computing the check-to-variable log-likelihood message from at least one variable-to-check log-likelihood message in the second loop circuit.
14. The method according to claim 9, further comprising computing the check-to-variable log-likelihood message on a basis of only a subset of variable-to-check log-likelihood messages.
15. The method according to claim 9, further comprising computing the check-to-variable log-likelihood message on a basis of an approximation of an exclusive or operation of a plurality of variable-to-check log-likelihood messages.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The features of the present invention and the numerous advantages associated therewith will be best apparent from a detailed description of example embodiments with reference to the accompanying drawings, in which:
DESCRIPTION OF EMBODIMENTS
(10) We describe a decoder architecture that overcomes the loop-latency problem and achieves high-performance LDPC decoding at very high data rates. The LDPC decoder architecture allows an efficient VLSI implementation and achieves high throughput and good performance virtually independently of the delay associated with the update of the a posteriori variable LLRs.
(11) Our solution makes use of the increments of the a posteriori LLRs. However, in contrast to other delta architectures, it requires only a single read access and a single write access to the storage of the variable LLRs, and, thus, can be implemented with two-port RAMs with one read and one write port only (instead of dual port RAMs with two read ports and one write port).
(12) The decoding method and decoding system may be employed for the transmission of digital information over optical networks. However, this is merely an example, and in general the decoding techniques according to the present disclosure may be employed for any kind of data storage, data compression, or data transmission over any possible transmission medium.
(14) The data transmission system 10 comprises an encoding system 12, a transmission channel 14, and a decoding system 16. The encoding system 12 receives input symbols 18, such as a string of digital data bits, and encodes them by means of an encoding method that generates a string of code symbols 20. These code symbols 20 are subjected to the transmission channel 14, which models transmission of the code symbols 20 over a transmission path, such as an optical fibre channel, from a sender station to a receiver station. The transmitted code symbols 22 received at the receiver station may be subjected to the decoding system 16, which converts them into decoded code symbols 24. Ideally, the decoded code symbols 24 are identical to the input symbols 18, or at least a very close approximation.
(15) In general, the input information may be represented by any kind of symbols, such as any b-ary symbols. However, in the following examples we focus on binary symbols, for the ease of presentation.
(16) Decoding of binary LDPC codes can be implemented in log-likelihood ratio (LLR) arithmetic, as explained above with reference to Eq. (1). The decoding system 16 is provided with channel LLRs, which express the probability that each bit is 0 or 1. Decoding may be implemented over a factor graph, which may correspond to an adjacency matrix of a parity check matrix, and may proceed in a plurality of decoding iterations.
(17) A corresponding decoding structure is schematically illustrated in
(18) The decoding system 16 comprises a first loop circuit 26 and a second loop circuit 28, which may be implemented in hardware. The decoding system 16 further comprises an adder unit 30 that connects the first loop circuit 26 and the second loop circuit 28. Moreover, the decoding system comprises an input node 32, which is connected to the first loop circuit 26 and may initialize the decoding system 16 via a multiplexer unit 34. An output node 36 of the decoding system 16 is connected to the adder unit 30.
(19) As can be further taken from
(20) An output node 30c of the adder unit is connected to the memory input node via a first loop line 40 and the multiplexer unit 34.
(21) The second loop circuit comprises a processing unit 42 with an input node 42a that is connected to the output node 30c of the adder unit 30 via a second loop line 44, and an output node 42b connected to a second input node 30b of the adder unit 30. The processing unit 42 is adapted to compute the difference Δ_i = c2v_i − c2v_{i-1} according to Eq. (5) between a check-to-variable log-likelihood message c2v_i at a second (later) iteration step i, and a check-to-variable log-likelihood message c2v_{i-1} at a first (earlier) iteration step i−1. The resulting difference is provided at the output node 42b to the second input node 30b of the adder unit 30, which adds it to the log-likelihood ratio value v[r] received from the memory unit 38 at the first input node 30a, so as to update the log-likelihood ratio value v[r].
(22) The updated log-likelihood ratio value is provided at the output node 30c of the adder unit 30, and is forwarded to both the first loop circuit 26 and the second loop 28 synchronously.
(23) The decoder system 16 hence implements a double-loop architecture. While the second loop 28 (the processing loop) processes the check nodes and computes the delta values, the first loop 26 (the updating loop) uses the delta values to update the a posteriori LLRs. Differently from delta architectures of the prior art, the two loops 26, 28 are synchronized so that the updated a posteriori LLRs can be forwarded at the same time to the variable storage and the check node processor.
(24) The synchronization of the two loops can be achieved by rescheduling the generation of the delta values in the processing unit 42 to match the next read access from the a posteriori LLR storage at the memory unit 38.
(25) The task separation between the two loops makes it possible to process a new block row before all the updates stemming from the previous block row have been applied. In a standard architecture that implements the layered decoding algorithm of
(26) The decoding schedule assures that the two loops operate synchronously so that, despite the double-loop architecture, only a single read and a single write access to the memory element 38 are sufficient.
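The delta-update principle carried by the two loops can be illustrated with a minimal software sketch. This is an illustrative model only, not the described hardware; the function and argument names are hypothetical, and the check-node function is left abstract:

```python
def delta_update(v, c2v_prev, compute_c2v):
    """One layered-decoding pass over a block row (software model).

    v           -- a posteriori LLRs (first-loop storage, one read per pass)
    c2v_prev    -- check-to-variable messages from the previous iteration
    compute_c2v -- check-node function returning new c2v messages
    """
    # Subtract the old check-to-variable messages to form v2c messages
    v2c = [a - b for a, b in zip(v, c2v_prev)]
    # Check-node processing (second loop)
    c2v_new = compute_c2v(v2c)
    # The increment (delta) computed in the second loop
    delta = [n - p for n, p in zip(c2v_new, c2v_prev)]
    # Single write back to the LLR storage (first loop)
    v_updated = [a + d for a, d in zip(v, delta)]
    return v_updated, c2v_new
```

Because only the increment is added to the stored LLRs, the storage needs one read and one write per pass, mirroring the single-access property described above.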
(27) For the check-to-variable messages stored in the processing unit 42, we may likewise use a memory unit with one read and one write port. In contrast to the prior art, our solution needs for the check-to-variable messages neither dual port RAMs with one write and two read ports nor delay lines.
(28) Although the proposed solution can be implemented in conjunction with any LLR-XOR approximation compliant with Eq. (15), we develop a novel GRMC algorithm. Our approximation achieves performance similar to that of the 3-min algorithm (i.e., the λ-min algorithm with λ=3), but requires both less storage and fewer computations.
(29) We assume that the LDPC code is layered and that the constituent sub-matrices of its PCM have size p×p. The decoder is implemented in LLR arithmetic and processes a block column per clock cycle and one block row after the other. As illustrated in the example of
(30) The memory unit 38 of the first loop circuit 26 stores the a posteriori LLRs v of all the variables. Both the read port 38a and the write port 38b convey p messages in parallel. The LLRs are stored in their natural unpermuted order.
(31) The multiplexer unit 34 routes towards the memory unit 38 either the updated a posteriori LLRs or, in the initialization phase, the channel LLRs according to line 1 in
(32) The architecture and operation of the second loop circuit 28 and the delta computation will now be described in more detail with reference to
(33) The processing unit 42 of the second loop circuit 28 comprises a permutator element 46, a first adder element 48, a decoder element 50, a second memory unit 52, three check-to-variable selector elements 54, 56, 58, a second adder element 60, and an inverse permutator element 62.
(34) The permutator element 46 is connected to the input node 42a of the processing unit 42 and routes the variable LLRs from the second loop line 44 to the corresponding check nodes. The connections between the individual variables within the j-th block column and the individual check nodes of the i-th block row are determined by the permutation sub-matrix Π_i,j of the PCM. Therefore, the permutator element 46 implements a time-varying permutation, which is configured according to the sub-matrix at the intersection of the currently processed block row and block column.
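For layered LDPC codes the permutation sub-matrices are commonly realized as cyclic shifts (a quasi-cyclic construction; the text does not fix this choice, so the sketch below is an assumption). A software model of the permutator and its inverse is then a rotation by the shift amount of the current sub-matrix:

```python
def permute(llrs, shift):
    """Route p variable LLRs to check nodes via a cyclic left shift.

    `shift` models the offset of the quasi-cyclic sub-matrix currently
    processed (an illustrative assumption, not the patent's notation).
    """
    s = shift % len(llrs)
    return llrs[s:] + llrs[:s]

def inverse_permute(msgs, shift):
    """Reverse the permutation so the increments return to natural order."""
    s = shift % len(msgs)
    return msgs[len(msgs) - s:] + msgs[:len(msgs) - s]
```

Applying `inverse_permute` after `permute` with the same shift restores the original ordering, which is exactly the role of the inverse permutator element described below.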
(35) A first input node 48a of the adder element 48 is connected to an output of the permutator element 46, and a second input node 48b of the adder element 48 is connected to an output of the first check-to-variable selector element 54. In the adder element 48 the p a posteriori LLRs of the current block column are diminished by the check-to-variable messages computed at the same block row at the previous iteration, according to line 5 in
(36) The p resulting variable-to-check messages are provided at the output node 48c of the first adder element 48, and are passed to the SPC decoder element 50, which is located downstream of the adder element 48. The decoder element 50 returns the intermediate results necessary to compute the new check-to-variable messages from the variable-to-check messages, such as according to Eq. (15). However, the decoder element may not necessarily compute the messages themselves. For instance, the decoder element 50 may merely provide the sign information of the new messages, the set M of possible magnitudes and the set L of required indices, which we denote in the following by sign, magnitude and index (SMI) data.
(37) The second memory unit 52 is located downstream of the decoder element 50, and may be adapted to receive and store, for each block row, the SMI data computed by the decoder element 50. At the beginning of the decoding process, the magnitudes may be set to zero, which is equivalent to the initialization at line 2 of
(38) The actual check-to-variable messages are computed in the check-to-variable selector elements 54, 56, and 58, which are connected downstream of the second memory unit 52. The SMI data are transferred over n.sub.p clock cycles from the second memory unit 52 to the individual C2V selector elements 54, 56, 58. Each C2V selector element 54, 56, 58 may contain a register bank that keeps a local copy of the SMI data for the currently processed block row.
(39) The first selector element 54 computes the messages c2v_{i-1}[r], which are then fed back to the second input node 48b of the first adder element 48 to execute line 5 of
(40) With the assumed degree of parallelism of one block column, the number of cycles required to process a block row corresponds to the number of its permutation sub-matrices. If w_min is the minimum row weight, the SMI data can be partitioned in at most

n_p = ⌊w_min/3⌋

packets to ensure that all three accesses to the C2V storage can be completed through a single read port. We observe that higher values of n_p are preferable because they correspond to a more compact C2V storage with a reduced port width.
(42) The computation of the incremental LLRs in the second adder element 60 may be suitably delayed to achieve synchronization between the first loop 26 and the second loop 28. Namely, input and output of the second loop 28 may be mutually decoupled and scheduled in such a way that the LLRs and corresponding increments impinge at the same time at the adder unit 30.
(43) To maximize the number of matches at the adder unit 30, we may reorder the sequence of block-rows and block-columns separately for the first loop 26 and the second loop 28. Depending on the specific PCM, it is typically possible to obtain a match for the near totality of the code variables.
(44) The few unmatched LLR increments may be applied by running the first loop 26 only, while the second loop 28 is temporarily suspended. In the same way, the unmatched block columns can be processed by running the second loop 28 only, while the first loop 26 is temporarily suspended.
(45) The inverse permutator element 62 is coupled to the output node 60c of the second adder element 60, and is adapted to reverse the permutation implemented by the permutator element 46 and bring the incremental LLRs into their natural unpermuted order. In detail, the inverse permutator element 62 may apply the inverse permutation Π_i,j^−1 to the increments of the j-th block column originating from the i-th block row.
(46) We now turn to the derivation of a simple and accurate GRMC algorithm for the computation of the check-to-variable messages from the variable-to-check messages. In analogy to Eq. (10), we define the indices of the four variable-to-check messages with the smallest magnitude as

l_k = arg min_{l ∉ {l_0, …, l_{k−1}}} |v2c_i(l)|  (for k = 0, 1, …, 3)  (17)

and, for economy of notation, we introduce

μ_k = |v2c_i(l_k)|  (for k = 0, 1, …, 3).  (18)
(47) In contrast to the 3-min algorithm, which computes the three exact LLR-XORs of any two of the three smallest magnitudes and the exact LLR-XOR of all three of them, we compute selected approximated LLR-XORs of three of the four smallest magnitudes.
(48) For m<n we approximate the LLR-XOR of .sub.m and .sub.n as
.sub.m.sub.nmax(.sub.m.sub.nl(.sub.n.sub.m),0),(19)
where we exploit the fact that for non-negative arguments .sub.nl is a positive decreasing function, and we neglect the second term within the brackets in Eq. (3). The maximum operation in Eq. (19) ensures that the sign of the result cannot become negative due to the approximation.
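The quality of the approximation in Eq. (19) can be checked numerically against the exact two-input LLR-XOR, taking f_nl(x) = ln(1 + e^(−x)) as the correction function (the standard box-plus correction; a sketch under that assumption):

```python
import math

def f_nl(x):
    # f_nl(x) = ln(1 + e^(-x)): positive and decreasing for x >= 0
    return math.log1p(math.exp(-x))

def boxplus_exact(a, b):
    # Exact LLR-XOR of two non-negative magnitudes:
    # min(a, b) + f_nl(a + b) - f_nl(|a - b|)
    return min(a, b) + f_nl(a + b) - f_nl(abs(a - b))

def boxplus_approx(mu_m, mu_n):
    # Eq. (19): drop the f_nl(a + b) term and clamp at zero,
    # assuming mu_m <= mu_n (i.e., m < n in the sorted order)
    return max(mu_m - f_nl(mu_n - mu_m), 0.0)
```

The neglected term f_nl(a + b) vanishes quickly for large magnitudes, so the approximation is tight except near zero, where the clamp keeps the result non-negative.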
(49) Using Eq. (19) we obtain for m < n < q

μ_m ⊞ μ_n ⊞ μ_q ≈ max(μ_m − f_nl(μ_n − μ_m) − f_nl(μ_q − μ_m), 0),  (20)

where we neglect second order corrections to the arguments of the f_nl terms.
(50) For (m, n, q) equal to (0,1,2), (0,2,3) and (1,2,3) we obtain, respectively,

m_{0,1,2} = max(μ_0 − α_{0,1,2}·[f_nl(μ_1 − μ_0) + f_nl(μ_2 − μ_0)], 0),  (21)

m_{0,2,3} = max(μ_0 − α_{0,2,3}·[f_nl(μ_2 − μ_0) + f_nl(μ_3 − μ_0)], 0),  (22)

and

m_{1,2,3} = max(μ_1 − α_{1,2,3}·[f_nl(μ_2 − μ_1) + f_nl(μ_3 − μ_1)], 0).  (23)

(51) Here we introduced the positive constant corrective factors α_{0,1,2}, α_{0,2,3} and α_{1,2,3}, which can be experimentally determined by optimizing the performance of the decoder. In practice, for the computation of Eqs. (21), (22) and (23) the function f_nl can be approximated by a piecewise polynomial (in particular, linear) function.
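A piecewise-linear replacement for f_nl can look as follows. The breakpoints and slopes below are illustrative assumptions chosen to match ln(1 + e^(−x)) at x = 0, 2 and 4; a hardware implementation would tune them for its target precision:

```python
import math

def f_nl_exact(x):
    # Reference: f_nl(x) = ln(1 + e^(-x))
    return math.log1p(math.exp(-x))

def f_nl_pwl(x):
    # Two-segment piecewise-linear fit on [0, 4], zero beyond.
    # Coefficients interpolate f_nl at x = 0 (ln 2 = 0.6931),
    # x = 2 (0.1269) and x = 4 (0.0181); they are an assumption,
    # not values given in the text.
    if x < 2.0:
        return 0.6931 - 0.2831 * x
    if x < 4.0:
        return 0.1269 - 0.0544 * (x - 2.0)
    return 0.0
```

Such a table-free approximation costs one comparison, one multiply and one add per evaluation, which matters because f_nl is evaluated five times per check node in the GRMC algorithm.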
(52) Finally, the magnitudes of the check-to-variable messages can be approximated as

|c2v_i(l)| ≈ m_{1,2,3} if l = l_0, m_{0,2,3} if l = l_1, and m_{0,1,2} otherwise.  (24)
(54) For the proposed GRMC algorithm the sets L and M are defined as

L = {l_0, l_1}  (25)

and

M = {m_{0,1,2}, m_{0,2,3}, m_{1,2,3}},  (26)

respectively and, therefore, the SMI data for each block row consist of the sign information of the individual check-to-variable messages, three message magnitudes and two indices. For the joint computation of the three messages the function f_nl must be evaluated 5 times. By comparison, for the 3-min algorithm of the state of the art, the SMI data consist of four message magnitudes and three indices, and the function f_nl must be evaluated 8 times.
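Putting Eqs. (17) through (26) together, a software sketch of the GRMC check-node computation follows. The assignment of m-values to variables (the two minimum-magnitude variables receive the three-way XOR that excludes themselves) and the sign rule (product of all other signs) are the customary min-style conventions, stated here as assumptions; the corrective factors default to 1 for simplicity:

```python
import math

def f_nl(x):
    return math.log1p(math.exp(-x))

def grmc_check_node(v2c, alphas=(1.0, 1.0, 1.0)):
    """GRMC check-node sketch for one check node (needs >= 4 inputs).

    v2c    -- variable-to-check LLRs of the check node
    alphas -- (a012, a023, a123): corrective factors, tuned in practice
    Returns the list of approximated check-to-variable messages.
    """
    # Eqs. (17)/(18): indices and magnitudes of the four smallest |v2c|
    order = sorted(range(len(v2c)), key=lambda l: abs(v2c[l]))
    l0, l1 = order[0], order[1]
    mu = [abs(v2c[order[k]]) for k in range(4)]
    a012, a023, a123 = alphas
    # Eqs. (21)-(23): five f_nl evaluations in total
    m012 = max(mu[0] - a012 * (f_nl(mu[1] - mu[0]) + f_nl(mu[2] - mu[0])), 0.0)
    m023 = max(mu[0] - a023 * (f_nl(mu[2] - mu[0]) + f_nl(mu[3] - mu[0])), 0.0)
    m123 = max(mu[1] - a123 * (f_nl(mu[2] - mu[1]) + f_nl(mu[3] - mu[1])), 0.0)
    # Sign product over all inputs; dividing out one (+/-1) sign below
    # is the same as multiplying by it
    total_sign = 1
    for x in v2c:
        total_sign *= 1 if x >= 0 else -1
    c2v = []
    for l, x in enumerate(v2c):
        mag = m123 if l == l0 else (m023 if l == l1 else m012)
        sign_others = total_sign * (1 if x >= 0 else -1)
        c2v.append(sign_others * mag)
    return c2v
```

The state kept per check node is exactly the SMI data of Eqs. (25)/(26): the signs, the three magnitudes m_{0,1,2}, m_{0,2,3}, m_{1,2,3}, and the two indices l_0 and l_1.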
(55) The amount of SMI data and the number of computations are directly related to the port width of the C2V storage in the second memory unit 52 and the complexity of the SPC decoder element 50, respectively. Therefore this algorithm can be conveniently employed within the decoder architecture of
(56) The proposed GRMC algorithm is summarized in pseudo-code in
(58) In a first step S10, log-likelihood ratio values corresponding to a plurality of received data symbols are stored in a memory unit of a first loop circuit.
(59) In a second step S12, a difference is computed between a check-to-variable log-likelihood message at a second iteration step, and a check-to-variable log-likelihood message at a first iteration step by means of a second loop circuit, wherein the first iteration step precedes the second iteration step.
(60) In a third step S14, a log-likelihood ratio value stored in the first loop circuit is updated by adding the difference computed in the second loop circuit.
(61) In a fourth step S16, the updated log-likelihood ratio value is provided synchronously both to the first loop circuit and to the second loop circuit.
(62) The description of the embodiments and the drawings merely serve to illustrate the method and system according to the invention, but should not be understood to imply any limitation. The scope of the invention is to be determined from the appended claims.