Performance monitoring for a transmission system with forward error correction

11190306 · 2021-11-30

Abstract

A system for determining a channel margin of a data transmission channel (DTC) using error correction under real-world channel conditions is described. The system includes a monitoring unit, an operating state determining unit and a data processing unit. The monitoring unit monitors data transmission along the DTC and estimates a statistical distribution of errors (H) in the transmission of data. The operating state determining unit determines a current value of an operating state parameter for the DTC. The data processing unit determines a reference channel margin associated with said current value of the operating state parameter for a reference channel and the error correction scheme employed, provides a statistical distribution of errors (HR) associated with said reference channel for said current value of said operating state parameter, computes a deviation of H and HR, and computes a reduction of the reference channel margin based on said deviation.

Claims

1. A method of determining a channel margin of a data transmission channel using an error correction under real-world channel conditions, wherein the channel margin corresponds to a distance of a current operating point from an operating point of a data transmission failure, wherein the channel margin is measured in terms of at least one operating state parameter, the method comprising: monitoring a data transmission along the data transmission channel and estimating a statistical distribution H of errors in the data transmission along the data transmission channel, determining a current value of the at least one operating state parameter for the data transmission channel, determining a reference channel margin associated with the current value of the at least one operating state parameter for a reference channel and an employed error correction scheme, providing a statistical distribution H.sub.R of errors associated with the reference channel for the current value of the at least one operating state parameter, computing a deviation of the statistical distributions H and H.sub.R, and computing a reduction of the reference channel margin associated with the at least one operating state parameter based on the deviation, wherein the channel margin of the data transmission channel corresponds to the reference channel margin reduced by the reduction.

2. The method of claim 1, wherein the error correction used in the data transmission channel comprises a forward error correction (FEC).

3. The method of claim 1, wherein the reduction of the reference channel margin comprises one of: a reduction factor, a subtrahend.

4. The method of claim 1, wherein the operating state parameter comprises at least one of: a signal to noise ratio (SNR), a pre-FEC bit error rate (BER), a measure of mutual information between transmitted bits and a received signal.

5. The method of claim 1, wherein the reduction of the reference channel margin comprises a reduction in a net coding gain due to a deviation of the real-world channel conditions from reference channel conditions.

6. The method of claim 1, wherein the reference channel comprises a memoryless channel affected only by additive white Gaussian noise (AWGN).

7. The method of claim 1, wherein the errors in the statistical distribution H.sub.R of errors are bit errors in bits transmitted along the data transmission channel prior to error correction.

8. The method of claim 1, wherein the statistical distribution H.sub.R of errors comprises a distribution of a number of errors per code word of a given length L, wherein at least one of: 10,000≤L≤500,000, the length L corresponds to the length of data blocks processed by the error correction code.

9. The method of claim 8, wherein the distribution of the number of errors per code word is provided as a histogram.

10. The method of claim 1, wherein the deviation represents an estimate of an information loss on the data transmission channel under real-world channel conditions as compared to the reference channel.

11. The method of claim 10, wherein the method further comprises a step of calculating, from the information loss, a corresponding penalty in terms of the operating state parameter using the inverse C.sup.−1 of a channel capacity C(x), wherein x corresponds to the operating state parameter and the channel capacity C(x) is a function of the operating state parameter x.

12. The method of claim 11, wherein calculating the penalty in terms of the operating state parameter comprises comparing C.sup.−1 (MI.sub.T) and C.sup.−1 (MI.sub.T+Δ.sub.MI), wherein MI.sub.T comprises mutual information between an input and an output of the reference channel at a threshold at which the reference channel fails, and Δ.sub.MI corresponds to the estimate of the information loss on the data transmission channel under real-world channel conditions as compared to the reference channel.

13. The method of claim 1, wherein the deviation of the statistical distributions H and H.sub.R is computed using a mathematical operation D(H∥H.sub.R) which obeys the following criteria: D(H∥H.sub.R)=0 at least in a first circumstance in which the statistical distributions H and H.sub.R are identical; and D(H∥H.sub.R)≥0 at least in each of at least one other circumstance in which the statistical distributions H and H.sub.R are not identical.

14. The method of claim 1, wherein the deviation is represented by at least one of: a Kullback-Leibler divergence, a quantity derived from a Kullback-Leibler divergence.

15. The method of claim 1, wherein the method of determining the channel margin of the data transmission channel is carried out repeatedly during operation of the data transmission channel, and wherein adaptive coding and modulation techniques are employed to adapt to the determined channel margin.

16. The method of claim 1, wherein the data transmission channel comprises an optical transport channel.

17. A method of detecting a decrease in net coding gain associated with an error correction code in a data transmission channel under real-world channel conditions, as compared to the net coding gain for a reference channel, wherein the net coding gain is, at least in part, dependent on at least one operating state parameter, the method comprising: monitoring a data transmission along the data transmission channel and generating a statistical distribution H of errors in the data transmission along the transmission channel, determining a current value of at least one operating state parameter for the data transmission channel, providing a statistical distribution H.sub.R of errors associated with the reference channel for the current value of the at least one operating state parameter, computing a deviation of the statistical distributions H and H.sub.R, wherein the deviation is indicative of a reduction in net coding gain as compared to the reference channel at the same operating state parameter, providing at least one of the following if the deviation of the statistical distributions H and H.sub.R increases above at least one predetermined threshold: an alarm, a notification.

18. A system for determining a channel margin of a data transmission channel using an error correction under real-world channel conditions, wherein the channel margin corresponds to a distance of a current operating point from an operating point of a data transmission failure, wherein the channel margin is measured in terms of at least one operating state parameter, the system comprising: a monitoring unit for monitoring a data transmission along the data transmission channel and estimating a statistical distribution H of errors in the data transmission along the transmission channel, an operating state determining unit for determining a current value of the at least one operating state parameter for the data transmission channel, and at least one data processing unit, comprising at least one processor, and configured for: determining a reference channel margin associated with the current value of the at least one operating state parameter for a reference channel and an employed error correction scheme, providing a statistical distribution H.sub.R of errors associated with the reference channel for the current value of the at least one operating state parameter, computing a deviation of the statistical distributions H and H.sub.R, and computing a reduction of the reference channel margin associated with the at least one operating state parameter based on the deviation, wherein the channel margin of the data transmission channel corresponds to the reference channel margin reduced by the reduction.

19. The system of claim 18, wherein the error correction used in the data transmission channel comprises a forward error correction (FEC).

20. The system of claim 18, wherein the reduction comprises one of: a reduction factor, a subtrahend.

21. The system of claim 18, wherein the operating state parameter comprises at least one of: a signal to noise ratio (SNR), a pre-FEC bit error rate (BER), a measure of mutual information between transmitted bits and a received signal.

22. The system of claim 18, wherein the reduction of the reference channel margin comprises a reduction in a net coding gain due to a deviation of the real-world channel conditions from reference channel conditions.

23. The system of claim 18, wherein the reference channel comprises a memoryless channel affected only by additive white Gaussian noise (AWGN).

24. The system of claim 18, wherein the errors in the statistical distribution H.sub.R of errors are bit errors in bits transmitted along the data transmission channel prior to error correction.

25. The system of claim 18, wherein the statistical distribution H.sub.R of errors comprises a distribution of a number of errors per code word of a given length L, wherein at least one of: 10,000≤L≤500,000, the length L corresponds to the length of data blocks processed by the error correction code.

26. The system of claim 25, wherein the distribution of number of errors per code word is provided as a histogram.

27. The system of claim 18, wherein the deviation represents an estimate of an information loss on the data transmission channel under real-world channel conditions as compared to the reference channel.

28. The system of claim 27, wherein the at least one data processing unit is further configured for calculating, from the information loss, a corresponding penalty in terms of the operating state parameter using the inverse C.sup.−1 of a channel capacity C(x), wherein x corresponds to the operating state parameter and the channel capacity C(x) is a function of the operating state parameter x.

29. The system of claim 28, wherein calculating the penalty in terms of the operating state parameter comprises comparing C.sup.−1 (MI.sub.T) and C.sup.−1 (MI.sub.T+Δ.sub.MI), wherein MI.sub.T comprises mutual information between an input and an output of the reference channel at a threshold at which the reference channel fails, and Δ.sub.MI corresponds to the estimate of the information loss on the data transmission channel under real-world channel conditions as compared to the reference channel.

30. The system of claim 18, wherein the at least one data processing unit is configured for computing the deviation of the statistical distributions H and H.sub.R using a mathematical operation D(H∥H.sub.R) which obeys the following criteria: D(H∥H.sub.R)=0 at least in a first circumstance in which the statistical distributions H and H.sub.R are identical; and D(H∥H.sub.R)≥0 at least in each of at least one other circumstance in which the statistical distributions H and H.sub.R are not identical.

31. The system of claim 18, wherein the deviation is represented by at least one of: a Kullback-Leibler divergence, a quantity derived from a Kullback-Leibler divergence.

32. The system of claim 18, wherein the system is configured for determining the channel margin of the data transmission channel repeatedly during operation of the data transmission channel, and for employing adaptive coding and modulation techniques to adapt to the determined channel margin.

33. The system of claim 18, wherein the data transmission channel comprises an optical transport channel.

34. The method of claim 1, wherein the deviation of the statistical distributions H and H.sub.R is computed using a mathematical operation D(H∥H.sub.R) which obeys the following criteria: D(H∥H.sub.R)=0 if and only if the statistical distributions H and H.sub.R are identical; and D(H∥H.sub.R)≥0 for all other statistical distributions H, H.sub.R.

35. The system of claim 18, wherein the at least one data processing unit is configured for computing the deviation of the statistical distributions H and H.sub.R using a mathematical operation D(H∥H.sub.R) which obeys the following criteria: D(H∥H.sub.R)=0 if and only if the statistical distributions H and H.sub.R are identical; and D(H∥H.sub.R)≥0 for all other statistical distributions H, H.sub.R.

36. The method of claim 13, wherein the at least one other circumstance, in which D(H∥H.sub.R)≥0, is a plurality of other circumstances in which the statistical distributions H and H.sub.R are not identical.

37. The method of claim 30, wherein the at least one other circumstance, in which D(H∥H.sub.R)≥0, is a plurality of other circumstances in which the statistical distributions H and H.sub.R are not identical.

Description

BRIEF DESCRIPTION OF THE FIGURES

(1) FIG. 1 is a flow diagram illustrating a method according to an embodiment of the invention.

(2) FIG. 2 is a diagram summarizing the computational flow of a method according to an embodiment of the invention,

(3) FIG. 3 is a schematic diagram illustrating a system according to an embodiment of the invention, and

(4) FIG. 4 shows a system similar to that of FIG. 3 in combination with an FEC decoder that is capable of soft decoding and an optical spectrum analyzer (OSA) for determining a SNR serving as an operating state parameter.

DESCRIPTION OF A PREFERRED EMBODIMENT

(5) FIG. 1 shows a flow diagram 10 illustrating a method of determining a channel margin of a data transmission channel according to an embodiment of the invention. Herein, the channel margin represents a distance of a current operating point from an operating point of data transmission failure, and the channel margin is measured in terms of at least one operating state parameter, for example the signal-to-noise ratio (SNR).

(6) In step 12, the data transmission along the transmission channel is monitored, and a statistical distribution H of errors in the transmission of data along said transmission channel is estimated. In step 14, a current value of said at least one operating state parameter for the data transmission channel, such as the SNR, is determined.

(7) In step 16, a reference channel margin associated with said current value of said at least one operating state parameter for a reference channel and the error correction scheme employed is determined. Moreover, in step 18, a statistical distribution H.sub.R of errors associated with said reference channel for said current value of said at least one operating state parameter is provided. Note that the order in which steps 12 to 18 are carried out need not be the same as the order in which they are recited in the flow diagram 10. Instead, the order could be changed, and some or all of the steps could be carried out at least partially simultaneously.

(8) In step 20, a deviation of said statistical distributions H and H.sub.R is computed, and in step 22, a reduction of the reference channel margin associated with the at least one operating state parameter is computed based on said deviation. Herein, the channel margin of said transmission channel corresponds to the reference channel margin reduced by said reduction.
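The bookkeeping of steps 16 to 22 can be sketched as follows. This is only an illustration under the assumption that the operating state parameter is an SNR expressed in dB, so that the reduction acts as a subtrahend; the function and parameter names are hypothetical and do not appear in the description.

```python
def reference_margin_db(snr_db, snr_threshold_ref_db):
    """Distance of the current SNR from the failure point of the reference channel, in dB."""
    return snr_db - snr_threshold_ref_db


def channel_margin_db(snr_db, snr_threshold_ref_db, reduction_db):
    """Reference channel margin reduced by the reduction (subtrahend form, assumed in dB)."""
    if reduction_db < 0:
        raise ValueError("the reduction is expected to be non-negative")
    return reference_margin_db(snr_db, snr_threshold_ref_db) - reduction_db
```

In linear units the same subtraction would appear as a division by a reduction factor, which corresponds to the two alternatives named in claim 3.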

(9) In a preferred embodiment of the invention, the channel margin of a data transmission channel under real-world channel conditions is calculated from a reference channel margin associated with a reference channel that is a memoryless channel affected only by AWGN.

(10) For this reference channel, the BER threshold and the net coding gain (NCG) of a FEC code are assumed to be known. Herein, the “coding gain” is the difference between the signal-to-noise ratios that the uncoded system and the system coded with the FEC code require to reach the same BER. Note that the assumption of the reference channel as a memoryless channel affected only by AWGN is in line, for example, with the NCG definition in the ITU-T recommendation G.975.1 (February 2004), which assumes a binary-input AWGN (BI-AWGN) channel with input alphabet {−1; 1} and a real continuous output alphabet.

(11) On a real channel, in the presence of real-world channel conditions, including memory or additional impairments other than AWGN, a deterioration of the FEC threshold can occur which is often referred to as NCG reduction.

(12) In this embodiment, a simple estimate of the NCG reduction based on the empirical distribution of the pre-FEC bit errors is employed. For this purpose, we consider a histogram of the number of pre-FEC bit errors per code word. In particular, we define H(k) to be the empirical frequency of code words containing exactly k pre-FEC bit errors.
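As an illustration of the definition of H(k), the following minimal sketch builds the histogram from a list of per-code-word error counts. The function name and the input format (one integer per observed code word) are assumptions made for this example.

```python
from collections import Counter


def error_histogram(errors_per_codeword, L):
    """Return H as a list of length L + 1, where H[k] is the empirical
    frequency of code words containing exactly k pre-FEC bit errors."""
    if not errors_per_codeword:
        raise ValueError("at least one code word is required")
    n_words = len(errors_per_codeword)
    H = [0.0] * (L + 1)
    for k, count in Counter(errors_per_codeword).items():
        if not 0 <= k <= L:
            raise ValueError("error count outside 0..L")
        H[k] = count / n_words
    return H


# Eight hypothetical code words of length L = 8
H = error_histogram([0, 1, 1, 2, 0, 1, 3, 1], L=8)
```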

(13) It is to be understood that this histogram does not reflect the complete nature of the noise under the real-world channel conditions, and this histogram cannot, for example, be expected to be a good indicator of the Gaussianity of the noise. However, this is actually not critical for the performance of the embodiment, since state-of-the-art soft-decision FEC schemes are very tolerant to moderate deviations of the noise from the normal distribution. For example, it is well known that density evolution under the Gaussian approximation yields excellent results even if the log-likelihood ratios (LLRs) at the various decoding stages are not normally distributed. This topic has been discussed e.g. in S.-Y. Chung, T. J. Richardson and R. L. Urbanke, “Analysis of sum-product decoding of low-density parity-check codes using a Gaussian approximation”, IEEE Transactions on Information Theory, volume 47, number 2, pages 657-670, February 2001.

(14) On the other hand, the error histogram is particularly suitable for revealing the temporal correlation of the noise process, which is in most cases found to be the primary cause of NCG reduction.

(15) In case of independent bit error events, as will be found on the reference channel under the influence of AWGN, the number of pre-FEC bit errors per code word follows the binomial distribution

(16) $$B_L(k) = \binom{L}{k}\, p^{k} (1-p)^{L-k}, \qquad k = 0, 1, \ldots, L, \tag{1}$$
where B.sub.L(k) is the probability that the considered code word contains exactly k pre-FEC bit errors, p is the pre-FEC bit error probability, and L is the number of bits per code word. Note that B.sub.L(k) is an example of the statistical distribution H.sub.R of errors associated with said reference channel as referred to in the summary of the invention. As also mentioned in the summary of the invention, in the preferred embodiment, the divergence of the actual channel from the reference BI-AWGN channel is measured by means of the Kullback-Leibler (KL) divergence, which was introduced by S. Kullback and R. A. Leibler in their article “On information and sufficiency”, Annals of Mathematical Statistics, pages 79-86 (1951).
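The binomial reference distribution of equation (1) can be evaluated without forming large factorials by working with logarithms. The sketch below is one possible way to do this using the log-gamma function; it is not taken from the description, and the function name is hypothetical.

```python
import math


def binomial_pmf(L, p):
    """Return [B_L(0), ..., B_L(L)] for code word length L and bit error probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    log_p = math.log(p)
    log_q = math.log1p(-p)
    pmf = []
    for k in range(L + 1):
        # ln of the binomial coefficient (L choose k), computed via lgamma
        log_coeff = math.lgamma(L + 1) - math.lgamma(k + 1) - math.lgamma(L - k + 1)
        pmf.append(math.exp(log_coeff + k * log_p + (L - k) * log_q))
    return pmf
```

Entries far in the tails may underflow to zero in double precision, which matters if the observed histogram has mass there.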

(17) In particular, since in this approach the focus of consideration is on the error correlation, i.e. the occurrence of error bursts, in this embodiment the KL divergence from the binomial distribution B.sub.L(k) to the observed histogram H(k) of the bit errors per code word is employed, which is defined as

(18) $$D_{KL}(H \parallel B_L) = -\sum_{k=0}^{L} H(k) \log_2\!\left(\frac{B_L(k)}{H(k)}\right), \tag{2}$$
with the convention that

(19) $$H(k) = 0 \;\Rightarrow\; H(k) \log_2\!\left(\frac{B_L(k)}{H(k)}\right) = 0. \tag{3}$$
Note that the KL divergence further satisfies the conditions
D.sub.KL(H∥B.sub.L)≥0  (4)
and
D.sub.KL(H∥B.sub.L)=0⇔H=B.sub.L.  (5)
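A direct evaluation of equation (2), including the convention of equation (3), might look as follows. The handling of bins where the reference distribution is zero but the histogram is not (returning infinity) is an assumption of this sketch; the description does not address that case.

```python
import math


def kl_divergence_bits(H, B):
    """KL divergence D_KL(H || B) in bits for two distributions given as equal-length lists."""
    if len(H) != len(B):
        raise ValueError("H and B must have the same length")
    d = 0.0
    for h, b in zip(H, B):
        if h == 0.0:
            continue              # convention (3): empty bins contribute nothing
        if b == 0.0:
            return math.inf       # assumption: H has mass where B has none
        d += h * (math.log2(h) - math.log2(b))
    return d
```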

(20) Although the ultimate goal of this embodiment is to provide a quantitative estimation of the NCG reduction, it is worth mentioning that in various embodiments, the KL divergence can also be used to define a NCG reduction alarm during system operation. To this end it suffices to compare D.sub.KL(H∥B.sub.L) with a (set of) predefined threshold(s), as has been described with reference to the second aspect of the invention above.
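The threshold comparison can be sketched in a few lines. The threshold values below are purely hypothetical placeholders, and the mapping of the result to a notification or an alarm is an assumption of this example.

```python
def ncg_alarm_level(divergence_bits, thresholds=(0.01, 0.05)):
    """Return how many of the predefined thresholds the divergence exceeds.

    With the two hypothetical default thresholds, 0 could mean normal operation,
    1 a notification and 2 an alarm.
    """
    return sum(divergence_bits > t for t in thresholds)
```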

(21) Based on the KL divergence, an estimator of the NCG reduction of the real-world channel as compared to the reference channel can be determined, in a manner that will be described next. For deriving such an estimator, the terminology of information theory is used, and the estimator is determined with reference to the amount of additional information required for achieving error-free transmission as compared with the reference channel. This approach leads to an estimator that has a number of very attractive properties: it is agnostic with respect to the FEC code, it holds for a general class of communication channels, it relies only on the knowledge of the error histogram, and it is comparatively simple to compute while at the same time being reasonably accurate.

(22) According to the usual understanding, the task of a FEC decoder is to recover the transmitted word from the received word using the knowledge of code and channel. Looked at from a different perspective, however, it can be observed that this is equivalent to estimating the noise realization from the received word using this same knowledge. Herein, a “noise realization” is understood as the actual noise on the transmitted signal for every bit or symbol, or in other words, the difference between the noisy signal that is actually transmitted and the signal that would be expected in the absence of noise, which is the signal actually recovered by error correction.

(23) It is assumed that the FEC threshold for the reference channel is defined as the maximum pre-FEC BER corresponding to virtually error-free transmission on the BI-AWGN reference channel. However, the FEC threshold can be expressed equivalently as the minimum mutual information (MI) between input and output of the BI-AWGN channel, which is required to achieve virtually error-free transmission. The MI threshold MI.sub.T hence represents the amount of information needed by the FEC decoder to estimate the noise realization (and, thus, the transmit data).

(24) Let us denote by

(25) $$\mathrm{pdf}_{AWGN}(n) = N\!\left(n;\, 0,\, \frac{N_0}{2}\right) \tag{6}$$
the probability density function (pdf) of the noise on the BI-AWGN channel, where N(n; m, σ.sup.2) is the normal distribution with expectation m and variance σ.sup.2.

(26) If we consider the L-fold channel used for the transmission of a whole code-word of length L, the corresponding L-dimensional pdf is

(27) $$\mathrm{pdf}_{AWGN}^{(L)}(\mathbf{n}) = N\!\left(\mathbf{n};\, \mathbf{0},\, \frac{N_0}{2}\, I_L\right) \tag{7}$$
where I.sub.L is the identity matrix of size L, and n and 0 are vectors of size L, wherein n represents the aforementioned “noise realization”.

(28) Let us further denote by pdf.sub.Ch.sup.(L)(n) the actual pdf of the L-fold channel, i.e. under real-world channel conditions. The KL divergence D.sub.KL(pdf.sub.Ch.sup.(L)(n)∥pdf.sub.AWGN.sup.(L)(n)) measures how many additional information bits per word are needed to describe a noise realization that is drawn from the actual density function pdf.sub.Ch.sup.(L)(n) instead of from the reference density function pdf.sub.AWGN.sup.(L)(n).

(29) Since the task of the decoder is to estimate the noise realization, one may interpret this KL divergence as the additional information that must be provided to the decoder for virtually error-free transmission on top of the amount of information MI.sub.T required on the reference BI-AWGN channel. The normalized additional number of bits per use of the scalar channel will be

(30) $$\Delta_{MI} = \frac{1}{L}\, D_{KL}\!\left(\mathrm{pdf}_{Ch}^{(L)}(\mathbf{n}) \,\parallel\, \mathrm{pdf}_{AWGN}^{(L)}(\mathbf{n})\right). \tag{8}$$

(31) As mentioned above, in this approach the focus is only on the error correlation as the main source of deviation of the channel margin of the true channel as compared to the reference channel margin. Accordingly, it is possible to approximate the KL divergence of the L-dimensional pdfs by the KL divergence to the one-dimensional error histogram from the binomial distribution:
Δ.sub.MI≈D.sub.KL(H∥B.sub.L).  (9)

(32) It is therefore seen that D.sub.KL(H∥B.sub.L) forms an estimator of the information loss on a real channel.

(33) This information loss can then be translated into a penalty in terms of the SNR, i.e. a NCG reduction. To this end, one can compute the SNR increase required to enhance the MI by exactly the amount of information that is lost due to the divergence from the binomial error distribution.

(34) Let us define the monotonic function

(35) $$C_{BIAWGN}\!\left(\frac{E_S}{N_0}\right), \tag{10}$$
which expresses the channel capacity of the BI-AWGN channel as a function of the SNR

(36) $$\frac{E_S}{N_0},$$
where E.sub.S is the energy per symbol and N.sub.0 is the power spectral density (PSD) of the complex baseband AWGN. As the skilled person will appreciate, the “channel capacity” is the tight upper bound on the rate at which information can be reliably transmitted on a communication channel. Following the terms of the so-called noisy channel coding theorem, the “channel capacity” of a given channel is the highest information rate (in units of information per unit time) that can be achieved with arbitrarily small error probability. In information theory, it is derived that the channel capacity corresponds to the maximum of the mutual information between the input and output of the channel. Note that for a discrete channel, such as for digital transmission, one would in practice typically measure the MI in units of information per channel use rather than per unit time.
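The description does not give a closed form for the capacity function, and none exists for the BI-AWGN channel. One common way to evaluate it numerically is sketched below: for equiprobable inputs and noise variance N.sub.0/2 per real sample, the capacity equals one minus the expectation of log.sub.2(1+e.sup.−Λ), where Λ is the log-likelihood ratio, which is Gaussian with mean 4·E.sub.S/N.sub.0 and variance 8·E.sub.S/N.sub.0. The integration grid and the function names are assumptions of this sketch.

```python
import math


def _softplus(x):
    """Numerically stable ln(1 + exp(x))."""
    if x > 0.0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))


def capacity_biawgn(snr, n_steps=400, z_max=10.0):
    """Capacity of the BI-AWGN channel in bits per channel use.

    snr is E_S/N_0 in linear units, with noise variance N_0/2 per real sample.
    """
    if snr <= 0.0:
        return 0.0
    mu = 4.0 * snr                  # mean of the LLR given that +1 was sent
    sd = math.sqrt(2.0 * mu)        # standard deviation of that LLR
    dz = 2.0 * z_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):    # trapezoidal rule over z in [-z_max, z_max]
        z = -z_max + i * dz
        weight = 0.5 if i in (0, n_steps) else 1.0
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * pdf * _softplus(-(mu + sd * z))
    return 1.0 - total * dz / math.log(2.0)
```

The function is monotonically increasing in the SNR, which is the property the inversion in the following paragraphs relies on.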

(37) The SNR threshold on the BI-AWGN reference channel can be computed from the inverse of the channel capacity function, i.e. as

(38) $$\left.\frac{E_S}{N_0}\right|_{T} = C_{BIAWGN}^{-1}(MI_T), \tag{11}$$
whereas the SNR threshold on the actual channel is

(39) $$\left.\frac{E_S}{N_0}\right|_{ch} = C_{BIAWGN}^{-1}\!\left(MI_T + D_{KL}(H \parallel B_L)\right), \tag{12}$$
where we account for the information loss due to real-world conditions.
Finally, the NCG reduction Δ.sub.NCG is approximated by the ratio between

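(40) $$\left.\frac{E_S}{N_0}\right|_{ch} \quad\text{and}\quad \left.\frac{E_S}{N_0}\right|_{T}:$$

(41) $$\Delta_{NCG} = \frac{C_{BIAWGN}^{-1}\!\left(MI_T + D_{KL}(H \parallel B_L)\right)}{C_{BIAWGN}^{-1}(MI_T)}. \tag{13}$$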

(42) Herein, Δ.sub.NCG represents the “NCG reduction estimator” referred to above. FIG. 2 summarizes the complete computational flow of the described embodiment for calculating the NCG reduction estimator.
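Because the capacity function is monotonic, its inverse in equations (11) to (13) can be obtained numerically. The sketch below uses bisection on a logarithmic SNR axis and takes the capacity function as an argument; the search interval, the function names, and the conversion of the ratio to dB are assumptions of this example rather than part of the described embodiment.

```python
import math


def inverse_by_bisection(capacity, mi, snr_lo=1e-6, snr_hi=1e3, iters=80):
    """Solve capacity(snr) = mi for snr, assuming capacity is increasing in snr."""
    if not capacity(snr_lo) <= mi <= capacity(snr_hi):
        raise ValueError("mi is outside the range covered by the search interval")
    for _ in range(iters):
        mid = math.sqrt(snr_lo * snr_hi)   # midpoint on a logarithmic SNR axis
        if capacity(mid) < mi:
            snr_lo = mid
        else:
            snr_hi = mid
    return math.sqrt(snr_lo * snr_hi)


def ncg_reduction(capacity, mi_t, d_kl):
    """Equation (13): ratio of the SNR thresholds on the actual and the reference channel."""
    snr_ch = inverse_by_bisection(capacity, mi_t + d_kl)
    snr_t = inverse_by_bisection(capacity, mi_t)
    return snr_ch / snr_t


def ncg_reduction_db(capacity, mi_t, d_kl):
    """The same ratio expressed in dB (an assumed convenience, not part of equation (13))."""
    return 10.0 * math.log10(ncg_reduction(capacity, mi_t, d_kl))
```

Any increasing capacity function can be passed as `capacity`, for example a numerical evaluation of C.sub.BIAWGN.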

(43) In practical implementations of this embodiment, the code word length L can be between 10,000 and 500,000, and in particular between 50,000 and 250,000, which are realistic code word lengths in optical communication systems. For low-latency systems, shorter code words are used. The invention can also be used for convolutional codes, which have infinite code words. In this case, one may set L to some arbitrary frame length. Numeric calculation of the binomial distribution for large values of the code word length L is difficult due to the occurrence of factorials of large numbers. In practice, it is therefore convenient to make use of the normal approximation
$$B_L(k) \approx N\!\left(k;\, L p,\, L p (1-p)\right), \qquad k = 0, 1, \ldots, L, \tag{14}$$
where the pre-FEC bit error probability can be estimated by

(44) $$p \approx \sum_{k=0}^{L} \frac{k}{L}\, H(k). \tag{15}$$
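Equations (14) and (15) translate almost literally into code. The following minimal sketch assumes that the histogram is given as a list H of length L+1; the function names are hypothetical.

```python
import math


def estimate_p(H):
    """Equation (15): estimate the pre-FEC bit error probability from the histogram H."""
    L = len(H) - 1
    return sum(k * h for k, h in enumerate(H)) / L


def binomial_normal_approx(L, p):
    """Equation (14): normal approximation of B_L(k) for k = 0..L."""
    mean = L * p
    var = L * p * (1.0 - p)
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    return [norm * math.exp(-((k - mean) ** 2) / (2.0 * var)) for k in range(L + 1)]
```

The approximation is usually considered adequate when L·p and L·(1−p) are both large, which is typically the case for the code word lengths mentioned above at pre-FEC error rates of practical interest.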

(45) Moreover, the performance of a FEC code is often known in terms of the pre-FEC BER threshold rather than the MI threshold. However, the skilled person will appreciate that the MI threshold MI.sub.T can be obtained from the BER threshold BER.sub.T of the FEC code. For example, it will be appreciated that the pre-FEC BER on the BI-AWGN channel depends on the SNR as

(46) $$BER = \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{\frac{E_S}{N_0}}\right), \tag{16}$$
which, together with equation (10), leads to
$$MI_T = C_{BIAWGN}\!\left(\left[\mathrm{erfc}^{-1}(2 \cdot BER_T)\right]^{2}\right). \tag{17}$$
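The conversion from a BER threshold to an MI threshold can be sketched as follows. The sketch assumes the standard relation for binary antipodal signalling with noise variance N.sub.0/2, namely BER = ½·erfc(√(E.sub.S/N.sub.0)), and obtains the inverse complementary error function from the standard normal quantile, since the Python standard library has no direct inverse. The capacity function is passed in as an argument; all names are hypothetical.

```python
import math
from statistics import NormalDist


def erfc_inv(y):
    """Inverse complementary error function for 0 < y < 2.

    Uses erfc(x) = 2 * Phi(-sqrt(2) * x), where Phi is the standard normal CDF.
    """
    if not 0.0 < y < 2.0:
        raise ValueError("y must lie strictly between 0 and 2")
    return -NormalDist().inv_cdf(y / 2.0) / math.sqrt(2.0)


def snr_threshold_from_ber(ber_t):
    """E_S/N_0 (linear) at which the BI-AWGN channel has pre-FEC BER equal to ber_t."""
    return erfc_inv(2.0 * ber_t) ** 2


def mi_threshold(ber_t, capacity):
    """MI threshold obtained by evaluating the capacity at the SNR threshold."""
    return capacity(snr_threshold_from_ber(ber_t))
```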

(47) Finally, in some embodiments, the histogram H(k) may be generalized to represent the empirical frequency of code words containing a number of pre-FEC bit errors between k·W+1 and (k+1)·W. The derivation above, which is valid for W=1, can be immediately extended to the case of a generic bin width W. The use of a bin size W>1 can potentially lead to a lower computational burden.

(48) FIG. 3 schematically shows a system 24 for determining a channel margin of a data transmission channel. The system comprises a computation unit, such as a microprocessor 26, and a memory 28. The microprocessor 26 and the memory 28 are connected by data connection lines 30, 32 for bidirectional transfer of data. Via a data connection 34, the system 24 is configured to obtain a statistical distribution H of errors in the transmission that can be provided by the FEC decoder (not shown) at the receiving unit of the channel. Moreover, the system 24 may receive, via the same data connection 34, information regarding the current value of the operating state parameter for the data transmission channel, such as the current SNR. The microprocessor 26 is programmed to determine, based on the current value of the operating state parameter and knowledge about the reference channel and error correction scheme employed, a reference channel margin associated with the reference channel. Information about the reference channel and error correction scheme employed therein may be retrieved from the memory 28. The microprocessor 26 is further programmed to provide a statistical distribution H.sub.R of errors associated with said reference channel, which in various embodiments will simply amount to the binomial distribution B.sub.L(k) given in equation (1). The microprocessor 26 is further configured to compute a deviation of the statistical distributions H and H.sub.R, such as the KL divergence D.sub.KL(H∥H.sub.R) defined in equation (2), and to calculate a NCG reduction Δ.sub.NCG therefrom in the manner described above. The system 24 is further configured to output the NCG reduction on a data connection 36, e.g. to the network management system, which may initiate appropriate link adaptation operations, for example by employing adaptive coding and modulation techniques at the transponder level.

(49) FIG. 4 shows a system 24 corresponding to that of FIG. 3 in combination with a FEC decoder 38 that is capable of soft decoding, and with an optical spectrum analyzer (OSA) 40. The FEC decoder 38 comprises a decoder 42, a hard decision device 44 and a comparison device 46. The FEC decoder 38 receives log-likelihood ratios (LLRs) carrying so-called “a priori probabilities” of each bit being zero or one. The FEC decoder 38 provides, in addition to the decoded bits, information about the applied corrections, for example the position and/or number of the corrections. In case of successful decoding, the applied corrections correspond to the channel errors. A counter 48 is provided for counting the number of bit errors occurring within a code word or frame of a given length. As shown in FIG. 4, the counter may receive a frame start signal triggering the counter 48 to start counting. Downstream of the counter 48, a histogram generator 50 is provided, which generates a histogram representing the aforementioned statistical distribution of errors. The FEC decoder 38, the counter 48 and the histogram generator 50 can e.g. be embodied in dedicated hardware, or in software using a digital signal processor under appropriate programming. In preferred embodiments, the FEC decoder 38, the counter 48 and the histogram generator 50 are implemented in ASIC logic. Note that the FEC decoder 38, in combination with the counter 48 and histogram generator 50, represents one embodiment of the aforementioned “monitoring unit” for monitoring the data transmission along a transmission channel and estimating a statistical distribution H of errors in the transmission of data along said transmission channel.

(50) The OSA 40 allows for measuring the optical signal-to-noise ratio, which is an example of the “operating state parameter” used in the present invention for determining a channel margin. Accordingly, the OSA 40 represents an exemplary embodiment of the “operating state determining unit”.

(51) In the specific embodiment described above, the “deviation” of the two statistical distributions H and H.sub.R is calculated as the Kullback-Leibler divergence. This is a particularly useful way of computing the “deviation” of the two statistical distributions for the purpose of the invention, as it allows for directly assessing a valid estimate for the information loss on the real-world channel, from which the reduction in the channel margin can be calculated in a closed form as demonstrated above. Nevertheless, as was emphasized in the summary of the invention, the invention is not limited to this choice of “deviation”. Instead, other types of deviations likewise allow for revealing differences between the real-world channel and the reference channel that call for a correction of the reference channel margin. In this case, while the elegant derivation based on information theory presented above does not necessarily apply in full, it is nevertheless possible to establish empirical relationships between the observed deviation between the probability distributions and the corresponding reduction in channel margin. Such empirical relationships can e.g. be established by fitting empirical data and/or by generating suitable models revealing the relationship between the deviation in the probability distributions and the reduction in channel margin. This is possible because the true system margin can e.g. be determined by stretching link adaptation up to the point of system breakdown. While this is not an attractive option during operation, as it interrupts the transmission of data, it is tolerable for establishing the relationship between the deviation in error distributions and the corresponding reduction in channel margin for future use. Moreover, as was proposed in the summary of the invention, such a relationship can likewise be established by machine learning.
So in summary, even in cases where the information theory approach explained above is not followed exactly due to the use of other types of “deviations”, it is still possible to compute a reduction of the reference channel margin associated with the at least one operating state parameter based on said deviation.
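
One conceivable way of establishing such an empirical relationship by fitting is sketched below. The sketch assumes, purely for illustration, that the reduction in channel margin depends linearly on the deviation and that calibration pairs of deviation and measured reduction have been collected beforehand. The linear model, the function names and the calibration values are hypothetical placeholders and are not measured data. The final step follows the definition that the channel margin corresponds to the reference channel margin reduced by the reduction.

```python
def fit_line(deviations, reductions):
    """Ordinary least-squares fit of reduction = slope * deviation + intercept."""
    n = len(deviations)
    if n < 2 or n != len(reductions):
        raise ValueError("need at least two matching calibration pairs")
    mean_x = sum(deviations) / n
    mean_y = sum(reductions) / n
    sxx = sum((x - mean_x) ** 2 for x in deviations)
    if sxx == 0.0:
        raise ValueError("calibration deviations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(deviations, reductions))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_reduction(deviation, slope, intercept):
    """Estimated margin reduction for an observed deviation, clipped at zero
    so that the correction never increases the reference channel margin."""
    return max(0.0, slope * deviation + intercept)


def channel_margin(reference_margin, reduction):
    """Channel margin = reference channel margin reduced by the reduction."""
    return reference_margin - reduction


# Hypothetical calibration pairs (deviation, margin reduction in dB).
calib_deviation = [0.0, 0.1, 0.2, 0.4]
calib_reduction = [0.0, 0.5, 1.0, 2.0]
slope, intercept = fit_line(calib_deviation, calib_reduction)

# Applying the fitted relationship to a newly observed deviation.
reduction = predict_reduction(0.3, slope, intercept)
margin = channel_margin(3.0, reduction)  # reference margin of 3.0 dB assumed
```

A different model, for example a polynomial or a trained regressor, could be substituted for the linear fit without changing the surrounding steps.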

(52) In this regard, it is again emphasized that the method of the invention relies on calculating a correction to the reference channel margin only, not the channel margin as such. This implies that the numerical value of the reduction to the reference channel margin need not necessarily be entirely precise in order to provide a useful improvement over existing methods, as the resulting channel margin of the real-world channel is still better than without such correction. Accordingly, the usefulness of the method is not compromised by approximations and simplifications made, including the use of other ways of calculating the deviation between the error distributions than the Kullback-Leibler divergence.

(53) As is further apparent from the above description, the method of the invention can be carried out simultaneously with the data transmission on the transmission channel, without interrupting or even slowing down the data transmission. Moreover, it is easily appreciated that no extra effort is required in obtaining the error distributions in the transmission since the errors in the transmitted data are corrected under the error correction scheme anyhow, and only have to be kept track of in a histogram or the like. Accordingly, the method of the invention can be very easily implemented in existing communication systems with only very moderate modifications thereof.