Method and apparatuses for algorithm on QAM coherent optical detection

Abstract

Blind polarization demultiplexing algorithms based on complex independent component analysis (ICA) by negentropy maximization for quadrature amplitude modulation (QAM) coherent optical systems are disclosed. The polarization demultiplexing is achieved by maximizing the signal's non-Gaussianity measured by the information theoretic quantity of negentropy. An adaptive gradient optimization algorithm and a Quasi-Newton algorithm with accelerated convergence are employed to maximize the negentropy. Certain approximate nonlinear functions can be substitutes for the negentropy which is strictly derived from the probability density function (PDF) of the received noisy QAM signal with phase noise, and this reduces the computational complexity. The numerical simulation and experimental results of polarization division multiplexing (PDM)-quadrature phase shift keying (QPSK) and PDM-16QAM reveal that the ICA demultiplexing algorithms are feasible and effective in coherent systems and the simplified ones can also achieve equivalent performance.

Claims

1. A method for polarization demultiplexing in an optical system for maximizing a signal's non-Gaussianity using independent component analysis, wherein the non-Gaussianity is measured by negentropy based on information quantity of differential entropy, the method comprising: receiving an optical beam containing the signal; generating a local laser beam; coherently detecting the signal using the local laser beam; employing an optimal cost function in a processor; employing a quasi-Newton update algorithm in the processor until the algorithm converges; and thereafter employing a gradient optimization update algorithm to track signal variation, to minimize the optimal cost function in the processor, wherein the negentropy is characterized by the probability density function (PDF) of the signal, and the minimizing the cost function includes substituting the PDF with a nonlinear function.

2. The method of claim 1, wherein the optimal cost function is $J(w) = -E\{\log[p_s(w^{H}x)]\}$ or $J(w) = E\{|G(w^{H}x)|^{2}\}$.

3. The method of claim 2, wherein a probability density function matching the cost function is $p_G(y) = e^{-|G(y)|^{2}}$.

4. The method of claim 1, wherein the gradient optimization update algorithm is $w \leftarrow w + \mu \frac{\partial J(w)}{\partial w^{*}}$, $w \leftarrow \frac{w}{\sqrt{w^{H} w}}$.

5. The method of claim 1, wherein the gradient optimization update algorithm gradually adjusts vector parameters using newly updated data.

6. The method of claim 1, wherein the quasi-Newton update algorithm is $w \leftarrow -\frac{1}{2} E\{x g^{*}(y)\} + E\{g_{a}(y)\} w + E\{x x^{T}\} E\{g_{b}(y)\} w^{*}$.

7. The method of claim 1, wherein the Quasi-Newton update algorithm is employed by the processor in a batch processing mode.

8. The method of claim 1, further comprising running a one-unit algorithm a plurality of times for different weights respectively; and decorrelating the results of one-unit algorithm, wherein a plurality of independent components are obtained.

9. The method of claim 1, further comprising employing unique frame headers to identify separated signals.

10. The method of claim 1, further comprising preprocessing data for whitening.

11. The method of claim 10, wherein the preprocessing includes using eigenvalue decomposition.

12. The method of claim 10, wherein the preprocessing is achieved adaptively by a gradient algorithm.

13. An optical system for maximizing a signal's non-Gaussianity using independent component analysis, wherein the non-Gaussianity is measured by negentropy based on information quantity of differential entropy, the optical system comprising a transmitter and a receiver, wherein the transmitter includes a laser and transmits a signal over a fiber optic medium using the laser, wherein the receiver carries out steps comprising: receiving an optical beam containing the signal; generating a local laser beam; coherently detecting the signal using the local laser beam; employing an optimal cost function; employing on the signal a quasi-Newton update algorithm until the algorithm converges; employing after the quasi-Newton update algorithm a gradient optimization update algorithm to minimize the optimal cost function; and running a one-unit algorithm a plurality of times for different weights respectively and decorrelating the results, wherein a plurality of independent components are obtained.

14. The optical system of claim 13, wherein the receiver carries out further steps comprising employing unique frame headers to identify separated signals.

15. The optical system of claim 13, wherein the receiver carries out further steps comprising preprocessing data for whitening.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) FIG. 1 is an overall diagram of a coherent PDM system.

(2) FIG. 2(a) is a PDF of QPSK signal with phase noise and OSNR of 24 dB, FIG. 2(b) is a PDF of 16QAM signal with phase noise and OSNR of 28 dB, FIG. 2(c) is a cost function for QPSK, and FIG. 2(d) is a cost function for 16QAM.

(4) FIG. 4 shows the learning curve of the gradient algorithm.

(5) FIGS. 5(a)-(d) show the convergence of the quasi-Newton algorithm.

(6) FIG. 6 shows the MSE of average convergent value versus polarization rotation frequency.

(7) FIG. 7 is a schematic diagram of the simulation system.

(8) FIG. 8 shows the BER of the mixed and unmixed PDM-QPSK or PDM-16QAM signal versus DGD value.

(9) FIGS. 9(a)-(f) show constellations of (a) QPSK before demultiplexing; (b) QPSK after demultiplexing by ICA; (c) QPSK after follow-up DSP; (d) 16QAM before demultiplexing; (e) 16QAM after demultiplexing; and (f) 16QAM after follow-up DSP.

(10) FIG. 10 shows the PDM-QPSK experimental system.

(11) FIG. 11 is an experimental performance comparison of CMA and ICA with a PMD emulator.

DETAILED DESCRIPTION OF THE INVENTION

Problem Description and PDM Channel Model

(12) As shown in FIG. 1, in a PDM system, independent and identically distributed complex signal a=[a.sub.X,a.sub.Y].sup.T in QAM format is transmitted over both polarization tributaries of fiber. At the receiver, these signals are polarization-split and frequency down-converted by a polarization-diverse coherent receiver in order to extract the in-phase and quadrature components in two orthogonal polarizations of the received signal.

(13) As noted in [21], a is affected by the time-variant phase noise and frequency offset of the lasers at the optical transceivers, and also by the amplified spontaneous emission (ASE) noise generated by Erbium doped fiber amplifiers (EDFAs) in the fiber link, which can be viewed as white Gaussian noise. So the independent components which the ICA algorithms pursue can be expressed as $s_K = (a_K + n_K)e^{j\theta_K}$ (K=X or Y), where $\theta_K$ is the interference due to the signal phase and $n_K$ is complex Gaussian noise. An additional carrier recovery process is needed to eliminate the ill effects of $\theta_K$ after applying the polarization demultiplexing method.

(14) We assume that chromatic dispersion (CD) has been completely compensated in this paper, since stationary CD can be compensated by a fixed digital equalizer using frequency-domain or time-domain truncation method [15].

(15) Polarization mode dispersion (PMD) is an essential factor to consider in designing optical PDM communication systems. In theory, PMD is mathematically modeled as a concatenation of birefringent fiber segments with arbitrary rotations around their principal axes and a stochastic differential group delay (DGD) of Maxwellian distribution. The Jones transformation matrix can be written as:

(16) $\begin{matrix} H(\omega) = \prod_{i=1}^{n} P_{i}(\omega) S_{i} & (1) \end{matrix}$

(17) Here, $P_{i}(\omega)$ is the ith section's delay matrix, and $S_{i}$ is the ith scattering matrix:

(18) $\begin{matrix} P_{i}(\omega) = \begin{bmatrix} e^{j\omega\tau_{i}/2} & 0 \\ 0 & e^{-j\omega\tau_{i}/2} \end{bmatrix} & (2) \\ S_{i} = \begin{bmatrix} \cos\alpha_{i} & \sin\alpha_{i}\, e^{j\phi_{i}} \\ -\sin\alpha_{i}\, e^{-j\phi_{i}} & \cos\alpha_{i} \end{bmatrix} & (3) \end{matrix}$
where $\omega$ is the angular frequency, $\tau_{i}$ is the DGD value of the ith section, and $\alpha_{i}$ and $\phi_{i}$, uniformly distributed in $[0, 2\pi)$, respectively denote the frequency-independent random rotation and phase shift of the principal axes [29], so that the corresponding PMD vectors cover the whole Poincaré sphere. In all, $H(\omega)$ in (1) is a frequency-dependent unitary matrix:

(19) $\begin{matrix} H(\omega) = \begin{bmatrix} H_{1}(\omega) & H_{2}(\omega) \\ -H_{2}^{*}(\omega) & H_{1}^{*}(\omega) \end{bmatrix}, \quad \omega \in [0, 2\pi) & (4) \end{matrix}$
and $|H_{1}(\omega)|^{2} + |H_{2}(\omega)|^{2} = 1$. Due to the frequency-dependent nature of PMD, the PDM transmission channel should be accurately described as a 2×2 multiple-input multiple-output (MIMO) finite impulse response (FIR) structure. Thus the received signal $x = [x_X, x_Y]^{T}$ is a mixed and distorted version of $s = [s_X, s_Y]^{T}$, which also means that the polarization demultiplexing algorithms should, in principle, be capable of dealing with convolutional mixing. However, the 1st-order PMD value of the fiber link is relatively low in many practical scenarios, and adaptive equalizers are usually employed for PMD compensation [15] in the subsequent DSP processing. Therefore, when CD has been well compensated beforehand, the mixing channel can be simplified as an instantaneous matrix over a short period of time. As was done in [21-23], ignoring the frequency-selective nature of $H(\omega)$, we express the mixing matrix as

(20) $\begin{matrix} H(\alpha, \phi) = \begin{bmatrix} \cos\alpha & \sin\alpha\, e^{j\phi} \\ -\sin\alpha\, e^{-j\phi} & \cos\alpha \end{bmatrix} & (5) \end{matrix}$
where $\alpha$ and $\phi$ are the parameters to be estimated by the ICA algorithm discussed below.

(21) Based on FIG. 1, the mixing matrix links the independent components and the mixed signal, namely x=Hs. The goal of the ICA demultiplexing algorithm is to seek a matrix W which is the best estimate of $H^{-1}$, so that the independent components can be obtained from y=Wx. Because the polarization state of the fiber is time-varying and usually unstable, the separation method has to be adaptive. The applicability of the ICA demultiplexing algorithm depends on its performance, robustness and computational complexity.
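The relation x=Hs and the ideal demixer can be illustrated numerically. The following is a minimal sketch, not the disclosed implementation: the function name `mixing_matrix`, the symbol names `alpha` and `phi` for the two parameters of the mixing matrix in (5), and the noiseless QPSK test data are assumptions made for illustration only.

```python
import numpy as np

def mixing_matrix(alpha, phi):
    """Instantaneous Jones mixing matrix in the form of Eq. (5)."""
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c, s * np.exp(1j * phi)],
                     [-s * np.exp(-1j * phi), c]])

rng = np.random.default_rng(0)
# Hypothetical test data: 1000 noiseless unit-power QPSK symbols per polarization.
qpsk = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, size=(2, 1000))))

H = mixing_matrix(np.deg2rad(40), np.deg2rad(60))
x = H @ qpsk                 # mixed signal, x = H s
W = H.conj().T               # ideal demixer: H is unitary, so its inverse is H^H
y = W @ x                    # recovered components, y = W x

unitary_error = np.abs(H.conj().T @ H - np.eye(2)).max()
recovery_error = np.abs(y - qpsk).max()
```

In practice H is unknown, which is why W has to be estimated blindly from x; the sketch only confirms that a unitary matrix of this form is inverted by its conjugate transpose.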

(22) Complex ICA Algorithm Detection

(23) A. The Principle and Cost Function

(24) According to the central limit theorem, a classical result in probability theory, the statistics of a mixed signal tend to be more Gaussian than those of its independent components under certain conditions. As noted by A. Hyvärinen [30], "non-Gaussian is independent." Thus, ICA seeks to maximize the non-Gaussianity of $y = w^{H}x$, where w is one of the column vectors in W. Non-Gaussianity can be measured by negentropy, which is based on the information theoretic quantity of differential entropy [30]. The negentropy of a complex value y can be defined as [27]

(25) $\begin{matrix} \begin{aligned} J_{neg} &= H(y_{Gauss}) - H(y) \\ &= const - H(y) \end{aligned} & (6) \end{matrix}$
where $y_{Gauss}$ is a complex Gaussian variable of the same variance as y, and H is the differential entropy defined as

(26) $\begin{matrix} \begin{aligned} H(y) &= -E\{\log[p(y)]\} \\ &= -\int p(y) \log p(y)\, dy \\ &= -\iint p(y^{R}, y^{I}) \log p(y^{R}, y^{I})\, dy^{R}\, dy^{I} \end{aligned} & (7) \end{matrix}$
and $p(y) = p(y^{R}, y^{I})$ is the joint PDF of the complex variable y. It can be proved in information theory that a Gaussian variable has the largest entropy among all random variables of equal variance, so negentropy is always non-negative and is zero if and only if y is Gaussian. Since $H(y_{Gauss})$ is constant, maximizing the non-Gaussianity of y is equivalent to minimizing the bivariate differential entropy in (7). Therefore, when using negentropy as a criterion of non-Gaussianity, the optimal cost function is
$J(w) = -E\{\log[p_{s}(w^{H}x)]\}$ (8)
where $p_{s}$ is the PDF of $s_{K}$ (K=X or Y). In practice, the expectation operator E in (8) is usually replaced by an arithmetic average or an instantaneous value.

(27) As mentioned in Section II, $s = (a+n)e^{j\theta}$ with $n \sim N(0, 2\sigma^{2})$, so the PDF of the unmixed independent component is

(28) $\begin{matrix} \begin{aligned} p_{s}(s) &= \frac{1}{M} \sum_{i=1}^{M} p_{s|a_{i}}(s|a_{i}) \\ &= \frac{1}{M} \sum_{i=1}^{M} \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{|s|^{2} + |a_{i}|^{2}}{2\sigma^{2}}\right) I_{0}\left(\left|\frac{a_{i}^{*} s}{\sigma^{2}}\right|\right) \end{aligned} & (9) \end{matrix}$
where $a_{i}$ is the ith complex point in the constellation diagram, M is the number of constellation points, and $I_{v}$ is the vth-order modified Bessel function of the first kind. The detailed derivation of $p_{s}(s)$ is in [21] and the derivation of $\sigma$ from the optical signal-to-noise ratio (OSNR) is presented in Appendix A. The PDF and cost function images of QPSK and 16QAM are shown in FIG. 2. Their OSNRs are 24 dB and 28 dB, respectively. Because the phase noise is uniformly distributed in $[0, 2\pi)$, the PDF and the cost functions rely only on the moduli of the constellation points. In other words, they are circularly symmetric, as can be seen from (9) and FIG. 2.
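The PDF in (9) and the per-sample cost in (8) can be evaluated numerically. The sketch below is illustrative only: the function names `pdf_s` and `cost_per_sample`, the symbol `sigma` for the per-quadrature noise standard deviation, and the value 0.2 chosen for it are assumptions. It uses NumPy's `np.i0` for the zeroth-order modified Bessel function and checks that the circularly symmetric PDF integrates to approximately one.

```python
import numpy as np

def pdf_s(s, constellation, sigma):
    """Eq. (9): PDF of s = (a + n) * exp(j*theta) with uniformly distributed theta."""
    s = np.asarray(s, dtype=complex)[..., None]
    a = np.asarray(constellation, dtype=complex)
    expo = np.exp(-(np.abs(s) ** 2 + np.abs(a) ** 2) / (2 * sigma ** 2))
    bessel = np.i0(np.abs(np.conj(a) * s) / sigma ** 2)
    return np.mean(expo * bessel, axis=-1) / (2 * np.pi * sigma ** 2)

def cost_per_sample(y, constellation, sigma):
    """Instantaneous value of the cost in Eq. (8), i.e. -log p_s(y)."""
    return -np.log(pdf_s(y, constellation, sigma))

qpsk = np.exp(1j * np.pi / 4 * np.array([1, 3, 5, 7]))  # unit-modulus QPSK points
sigma = 0.2                                             # illustrative noise level

# The PDF depends only on |s|, so integrate p(r) * 2*pi*r over the radius
# with the trapezoidal rule.
r = np.linspace(0.0, 4.0, 4001)
integrand = pdf_s(r, qpsk, sigma) * 2 * np.pi * r
total_probability = np.sum((integrand[1:] + integrand[:-1]) / 2 * np.diff(r))

cost_on_ring = cost_per_sample(1.0, qpsk, sigma)    # on the constellation modulus
cost_off_ring = cost_per_sample(0.3, qpsk, sigma)   # well inside the ring
```

For very small `sigma` the unscaled Bessel function can overflow, so a production implementation would likely use an exponentially scaled variant; that refinement is outside this sketch.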

(29) B. The Gradient Optimization Algorithm

(30) To adaptively calculate the optimal vector w which minimizes the cost function J(w), a gradient optimization method can be employed. The update rule is

(31) $\begin{matrix} w \leftarrow w + \mu \frac{\partial J(w)}{\partial w^{*}}, \quad w \leftarrow \frac{w}{\sqrt{w^{H} w}} & (10) \end{matrix}$
where $\mu$ is a negative learning rate, and the 2nd step in the update rule guarantees the normality of the output.

(32) $\frac{\partial J(w)}{\partial w^{*}}$
is the complex gradient, which approaches zero near convergence and is given by

(33) $\begin{matrix} \frac{\partial J(w)}{\partial w^{*}} = \frac{1}{2} E\{x g^{*}(w^{H} x)\} & (11) \end{matrix}$

(34) The explanation and derivation of the complex gradient are in Appendix B. The gradient algorithm can gradually adjust the vector parameters online using newly updated data. Like other gradient algorithms, it is suited to varying and nonstationary environments, such as a fiber PDM channel with PMD interference, but the convergence rate and stability depend on the initial w and the learning rate $\mu$.

(35) C. Acceleration Convergence by Quasi-Newton Algorithm

(36) The gradient algorithm has the drawback that it cannot converge rapidly and accurately. Adali proposed a Quasi-Newton algorithm to accelerate its convergence based on the Lagrangian cost function [27]
$L(w, \lambda) = J(w) + \lambda(w^{H}w - 1)$ (12)

(37) The second term of (12) is the constraint condition on w, $\lambda$ is the Lagrange multiplier, and J(w) is defined in (8). The Newton algorithm is a 2nd-order updating rule which converges faster than the gradient algorithm. Using the complex gradient and Hessian in [31], the Newton update can be defined as

(38) $\begin{matrix} \begin{aligned} \Delta\tilde{w} &= -\left(\left.\frac{\partial^{2} L}{\partial\tilde{w}^{*}\,\partial\tilde{w}^{T}}\right|_{w = w_{n}}\right)^{-1} \left.\frac{\partial L}{\partial\tilde{w}^{*}}\right|_{w = w_{n}} \\ &= -\tilde{H}_{L}^{-1}\,\tilde{\nabla}_{L}^{*} \\ &= -\left(\tilde{H}_{J} + \lambda\tilde{I}\right)^{-1}\left(\tilde{\nabla}_{J}^{*} + \lambda\tilde{w}_{n}\right) \end{aligned} & (13) \end{matrix}$
where $\tilde{w} = [w_{1}, w_{1}^{*}, w_{2}, w_{2}^{*}]^{T}$. With the detailed derivation in Appendix C, we obtain the updating rule of the Quasi-Newton algorithm:
$w \leftarrow -\frac{1}{2}E\{x g^{*}(y)\} + E\{g_{a}(y)\}w + E\{x x^{T}\}E\{g_{b}(y)\}w^{*}$ (14)

(39) Though the updating rule of Quasi-Newton algorithm is much more complicated than the gradient algorithm, it is immune to the choice of learning rate and converges faster. A practical and feasible choice is that the processor employs the Quasi-Newton algorithm in batch processing mode at the beginning of computing. Once convergence has been achieved, it may turn into a gradient optimization mode to track the variation.

(40) D. Simplification and Approximation

(41) The cost function and update rules of the gradient and Quasi-Newton algorithms, as mentioned above, are rigorously derived from the PDF of the independent component. However, the complexity of the update rules in (10) and (14), whose complete general formulae are in Appendices B and C, makes them nearly infeasible in real-time receivers. It is necessary to simplify the expressions.

(42) Some approximate nonlinear functions can be substitutes for negentropy [30], because they implicitly introduce some higher-order statistics which can be viewed as measures of non-Gaussianity. A. Hyvärinen has also proved that any sufficiently smooth even function can be used as a cost function for ICA by either maximizing or minimizing its value [32]. It is proposed in [33] that the cost function for complex ICA can be defined as
J(w)=E{|G(w.sup.Hx)|.sup.2}(15)
where G is a nonlinear function which has to be chosen to match the PDF of independent component. As observed from (8), the PDF matching the cost function in (15) is
$p_{G}(y) = e^{-|G(y)|^{2}}$ (16)

(43) Some nonlinear functions, such as $G_{1}(y) = \operatorname{asinh}(y)$, $G_{2}(y) = y^{2}$, and $G_{3}(y) = y^{3}$, along with their associated PDFs, are shown in FIG. 3. They are circularly symmetric and similar to the actual PDFs and cost functions in FIG. 2, but the cost function $|y|^{6}$ is excessively sensitive to outliers due to its rapid rate of growth.

(44) The gradient of the cost function in (15) is

(45) $\begin{matrix} \frac{\partial J(w)}{\partial w^{*}} = \frac{1}{2} E\{x G^{*}(y) g(y)\} & (17) \end{matrix}$
and the update rule of the Quasi-Newton algorithm [33] is

(46) $\begin{matrix} w \leftarrow -E\{G^{*}(y) g(y) x\} + E\{g(y) g^{*}(y)\} w + E\{x x^{T}\} E\{G^{*}(y) g'(y)\} w^{*} & (18) \end{matrix}$
where g is the derivative of G and g' is the derivative of g. The low-complexity approximate update rules are suitable for hardware implementation.
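The simplified rules can be sketched for the nonlinearity $G(y) = y^{2}$, for which $g(y) = 2y$ and $g'(y) = 2$. The following is an illustrative sketch under stated assumptions, not the disclosed implementation: the function names `quasi_newton_step` and `gradient_track`, the learning rate of -0.005, the noise level, the random seed, and the simulated PDM-QPSK data are all hypothetical. A batch quasi-Newton update in the form of (18) is iterated on the first 2000 symbols, after which a sample-by-sample gradient update in the form of (10) and (17) continues from the converged weight.

```python
import numpy as np

def quasi_newton_step(w, x):
    """One batch update in the form of Eq. (18) with G(y) = y**2, then renormalize."""
    y = w.conj() @ x                          # y = w^H x for every sample
    G, g, dg = y ** 2, 2 * y, 2.0             # G, its derivative g, and g'
    n = x.shape[1]
    term1 = (x * (G.conj() * g)).mean(axis=1)                  # E{G*(y) g(y) x}
    term2 = np.mean(np.abs(g) ** 2) * w                        # E{g(y) g*(y)} w
    term3 = (x @ x.T / n) @ w.conj() * np.mean(G.conj() * dg)  # E{xx^T} E{G*(y) g'(y)} w*
    w_new = -term1 + term2 + term3
    return w_new / np.linalg.norm(w_new)

def gradient_track(w, x, mu=-0.005):
    """Sample-by-sample update in the form of Eq. (10) with the gradient of Eq. (17)."""
    for xk in x.T:
        y = np.vdot(w, xk)                            # w^H x_k
        grad = 0.5 * xk * np.conj(y ** 2) * (2 * y)   # (1/2) x G*(y) g(y), instantaneous
        w = w + mu * grad                             # mu is negative: descend the cost
        w = w / np.linalg.norm(w)
    return w

# Hypothetical test signal: PDM-QPSK with additive noise and uniform phase noise.
rng = np.random.default_rng(1)
N = 4000
a = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, size=(2, N))))
noise = 0.05 * (rng.standard_normal((2, N)) + 1j * rng.standard_normal((2, N)))
s = (a + noise) * np.exp(1j * rng.uniform(0, 2 * np.pi, size=(2, N)))

alpha, phi = np.deg2rad(40), np.deg2rad(60)
H = np.array([[np.cos(alpha), np.sin(alpha) * np.exp(1j * phi)],
              [-np.sin(alpha) * np.exp(-1j * phi), np.cos(alpha)]])
x = H @ s
x = x / np.sqrt(np.mean(np.abs(x) ** 2))   # H is unitary, so a scalar gain whitens x

w = np.array([1.0, 0.0], dtype=complex)
for _ in range(20):                        # batch mode on the first 2000 symbols
    w = quasi_newton_step(w, x[:, :2000])
w = gradient_track(w, x[:, 2000:])         # then track with the gradient rule

# |w^H h_k| for the best-matching column h_k of H; a value near 1 means that one
# polarization tributary has been isolated up to a phase factor.
alignment = np.abs(w.conj() @ H).max()
```

The residual phase factor is expected, since ICA cannot resolve it; the follow-up carrier recovery stage handles it.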

(47) E. Calculation of Independent Components on Both Polarizations

(48) The previously discussed algorithms are classified as one-unit algorithms which estimate just one of the independent components. To obtain several independent components, a conventional practice is to run one-unit algorithms several times for different weights respectively and then decorrelate them. As for the issue of polarization demultiplexing, this process can be reduced and simplified because the optimized weights w.sub.1 and w.sub.2 must be orthogonal due to the Jones matrix of fiber transmission in (5), namely, w.sub.1.sup.Hw.sub.2=0. Therefore, once we have estimated one of them, say w.sub.1, then w.sub.2 can also be calculated without applying the one-unit algorithm again. To calculate a complex vector that is orthogonal to w.sub.1, a simple way is to use the Gram-Schmidt orthogonalization algorithm:

(49) $\begin{matrix} w_{2} \leftarrow w_{0} - w_{1} w_{1}^{H} w_{0}, \quad w_{2} \leftarrow \frac{w_{2}}{\sqrt{w_{2}^{H} w_{2}}} & (19) \end{matrix}$
where $w_{0}$ is a vector that is orthogonal to the initial $w_{1}$. The orthogonalization algorithm removes from $w_{0}$ the component in the same direction as the unit-norm $w_{1}$ and leaves the vector $w_{2}$ that is orthogonal to it. Additionally, it is worthwhile to note that ICA cannot distinguish between the separated demultiplexed signals, so some special frame header information is needed to identify them.
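The orthogonalization step can be sketched as a single Gram-Schmidt projection: the component of a starting vector along the estimated weight is subtracted and the remainder is normalized. The function name `orthogonal_weight` and the example vectors are assumptions for illustration.

```python
import numpy as np

def orthogonal_weight(w1, w0):
    """Return a unit vector orthogonal to w1, obtained from w0 by Gram-Schmidt."""
    w1 = w1 / np.linalg.norm(w1)
    w2 = w0 - w1 * np.vdot(w1, w0)   # subtract the projection of w0 onto w1
    return w2 / np.linalg.norm(w2)

# Hypothetical estimated weight and starting vector.
w1 = np.array([np.cos(0.7), -np.sin(0.7) * np.exp(-0.5j)])
w0 = np.array([0.0, 1.0], dtype=complex)
w2 = orthogonal_weight(w1, w0)

inner_product = np.vdot(w1, w2)      # w1^H w2, expected to vanish
```

The starting vector must not be parallel to `w1`, otherwise the remainder is zero and the normalization fails.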

(50) F. Preprocessing

(51) Before applying an ICA algorithm, it is usually profitable to preprocess the data. The preprocessing mainly consists of centering and whitening x, to make x a zero-mean and uncorrelated variable, namely, $E\{x\} = 0$ and $E\{x x^{H}\} = I$. One popular method for whitening is to use the eigenvalue decomposition (EVD) of the covariance matrix $E\{x x^{H}\}$, which leads to very high computational complexity. Alternatively, the whitening can be achieved adaptively by a gradient algorithm, in which the update rule is
$W \leftarrow W + \eta[I - W x x^{H} W^{H}]$ (20)
where W is the whitening matrix, Wx is the whitened signal, and $\eta$ is the convergence rate. A rough interpretation of the rule is that $[I - W x x^{H} W^{H}]$ becomes zero on average when whitening is achieved at convergence.
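Both whitening approaches can be sketched as follows. This is an illustrative sketch only: the function names `whiten_evd` and `whiten_adaptive`, the step size 0.1, the iteration count, and the test mixing matrix `A` are assumptions. In particular, the adaptive variant here replaces the instantaneous product $x x^{H}$ in (20) with the batch covariance, which makes the iteration deterministic and easier to check; a sample-by-sample version would use each new x instead.

```python
import numpy as np

def whiten_evd(x):
    """Center x (shape 2 x N) and whiten it by EVD of the sample covariance."""
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.conj().T / x.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)             # cov is Hermitian
    W = eigvec @ np.diag(eigval ** -0.5) @ eigvec.conj().T
    return W @ x, W

def whiten_adaptive(x, eta=0.1, iterations=300):
    """Iterate the rule of Eq. (20) with x x^H replaced by the batch covariance."""
    dim = x.shape[0]
    cov = x @ x.conj().T / x.shape[1]
    W = np.eye(dim, dtype=complex)
    for _ in range(iterations):
        W = W + eta * (np.eye(dim) - W @ cov @ W.conj().T)
    return W

# Hypothetical correlated test data.
rng = np.random.default_rng(2)
N = 5000
s = (rng.standard_normal((2, N)) + 1j * rng.standard_normal((2, N))) / np.sqrt(2)
A = np.array([[1.0, 0.5j], [0.3, 1.2]])
x = A @ s

z, W_evd = whiten_evd(x)
cov_after_evd = z @ z.conj().T / N

W_ad = whiten_adaptive(x)
cov_x = x @ x.conj().T / N
cov_after_adaptive = W_ad @ cov_x @ W_ad.conj().T
```

The step size has to be small relative to the largest eigenvalue of the covariance for the iteration to converge; the value used here is suitable only for data of roughly unit power.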

Numerical Simulation

(52) A. Convergence of the Algorithms

(53) In order to investigate the convergence of the proposed algorithms, we first assume that the PDM channel is static, and that the parameters in (5) are $\alpha = 40°$ and $\phi = 60°$. The OSNRs are set to 18 dB for PDM-QPSK and 24 dB for PDM-16QAM. The convergence depends on the choice of mixing matrix, but the case shown is typical. A uniformly distributed phase noise has also been attached to the symbols. The learning curves of the gradient algorithms and Quasi-Newton algorithms are shown in FIGS. 4 and 5, respectively, and are arithmetic means over 1000 simulation runs. The nonlinear function used here and in the following part of the article is $G_{2}(y) = y^{2}$.

(54) The curves in FIG. 4 represent the change of the cost functions in (8) derived from the PDFs of the QPSK and 16QAM signals, which reach steady state after 200 to 300 iterations. There is a tradeoff between accuracy and convergence speed, so the learning rate $\mu$ has to be chosen carefully. The curves of the same kind of signal converge in a similar fashion. They almost converge to the minima of their cost functions.

(55) The Quasi-Newton algorithms run in batch processing mode and employ 2000 symbols of both polarizations. The convergence is much faster than that of the above gradient algorithms, as can be seen from the horizontal axis in FIG. 5, but the computational complexity is even higher. So we could use the Quasi-Newton algorithms to get close to convergence, and then switch to the gradient algorithm to track it.

(56) B. Dynamic Tracking of the Adaptive Algorithms

(57) In order to gain further insight into the dynamic tracking behavior of the adaptive gradient algorithms, we multiply the channel Jones matrix used in Section IV.A by an endless polarization rotation matrix which is formatted as

(58) $\begin{matrix} A = \begin{pmatrix} \cos\Omega t & \sin\Omega t \\ -\sin\Omega t & \cos\Omega t \end{pmatrix} & (21) \end{matrix}$
where $\Omega$ is the polarization rotation angular frequency. In FIG. 6, MSE refers to the mean square error of the average convergent values relative to the minima of their corresponding cost functions. The average convergent value is also calculated as an arithmetic mean over 1000 simulation runs, and the symbol rates of the QPSK and 16QAM signals are assumed to be 28 Gsym/s. As seen from FIG. 6, the four methods can track a polarization rotation frequency of less than 10 Mrad/s without significant deterioration of convergence stability.

(59) C. Performance in PMD Emulator

(60) The algorithms are tested with a PMD emulator to evaluate the demultiplexing ability in a fiber link. In the simulation in FIG. 7, the optical carrier from a laser of 100 kHz line-width is split into two orthogonal polarizations by a polarization beam splitter (PBS). The two I/Q modulators are driven by 28 GSym/s two-level or four-level signals to generate a PDM-QPSK or PDM-16QAM signal, the bit rates of which are therefore 107 Gbit/s and 428 Gbit/s, respectively. After that, the two polarization tributaries are coupled again by a polarization beam coupler (PBC). An instantaneous mixing Jones matrix as in Section IV.A with a concatenated PMD emulator is employed as the transmission channel. As mentioned in Section II, CD is neglected, and the PMD emulator is controlled by the mean differential group delays (DGDs) of the fiber segments to simulate the effects of the 1st and higher order PMD of real fibers. The OSNRs are the same as before. At the receiver, a polarization-diverse 90° hybrid, along with a local laser of the same 100 kHz line-width, is used for coherent detection.

(61) For every DGD value, BERs (bit error ratios) are measured over 32000 symbols after ICA and follow-up DSP, which includes blind adaptive equalization and phase recovery, are applied. The ICA algorithms are implemented in this way: the first 2000 symbols of each polarization are employed by the Quasi-Newton algorithms in a batch processing mode to bring the demultiplexing matrix W close to convergence, and then the processing switches to the gradient algorithm to track the polarization change adaptively for the rest of the symbols. The results are shown in FIG. 8 together with the BERs of the unmixed signal. No matter which kind of ICA algorithm is applied, the one accurately derived from the signal PDF or the approximate one using nonlinear functions, the results are similar. The coincidence of the curves demonstrates the successful polarization demultiplexing of the ICA algorithms and also shows that the simplified ICA algorithms are as effective as the accurate ones even though the computation is greatly reduced.

(62) FIG. 9 shows the constellations for the signal before/after applying ICA methods and after finishing all the DSP processing for the case of zero-DGD. The ICA algorithms transfer the signals to their intended moduli.

(63) Experimental Results

(64) The proposed ICA algorithms are also tested in the experimental system shown in FIG. 10. In the experiment, a 40 Gbit/s PDM-QPSK signal is generated by one IQ modulator, a PBS, a PBC and an optical delay line. A 1st-order PMD emulator and a polarization scrambler are inserted before 100 km of standard single mode fiber (SSMF) to ensure that the resulting PMD has all orders. The eye diagram of the generated optical QPSK signal is in FIG. 10(a), and the OSNR of the received optical signal is set to 20 dB by adjusting the EDFA and the optical attenuator.

(65) The DSP algorithms for the experimental data are identical to those used for the simulation system except that fiber CD compensation is employed. The BERs shown in FIG. 11 are calculated over 40000 bits. ICA and CMA have similar performance, and this result is consistent with that of [22]. The result reveals that if singularity does not occur, ICA and CMA are comparable in polarization demultiplexing for PDM-QPSK. Even so, ICA has advantages over CMA, because singularity cannot occur with ICA [21] and ICA is adaptable to other modulation formats.

CONCLUSION

(66) We have derived the polarization demultiplexing algorithms and their simplifications based on ICA by negentropy maximization. It is found that they are effective for coherent detected PDM-QAM signals. It is further shown by experiment that the performance of ICA and CMA for demultiplexing PDM-QPSK is comparable, but ICA has its own advantages of immunity to singularity and modulation format independence. The ICA algorithms can also be potentially applied to eliminate the crosstalk and interferences between sub-channels in the newest spatial-division multiplexing systems which employ multi-core or few-mode fibers.

APPENDIX

(67) A. Derivation of $\sigma$ from OSNR

(68) The power of noise n can be expressed as

(69) $\begin{matrix} \begin{aligned} E(|n|^{2}) &= \iint |n|^{2}\, p_{n}(n)\, dn_{x}\, dn_{y} \\ &= \iint \frac{|n_{x}|^{2} + |n_{y}|^{2}}{2\pi\sigma^{2}} \exp\left(-\frac{|n_{x}|^{2} + |n_{y}|^{2}}{2\sigma^{2}}\right) dn_{x}\, dn_{y} \\ &= \int \frac{|n_{x}|^{2}}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{|n_{x}|^{2}}{2\sigma^{2}}\right) dn_{x} + \int \frac{|n_{y}|^{2}}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{|n_{y}|^{2}}{2\sigma^{2}}\right) dn_{y} \\ &= 2\sigma^{2} \end{aligned} & (22) \end{matrix}$

(70) Assuming the power of the received signal is normalized, the received OSNR is

(71) $\begin{matrix} \begin{aligned} OSNR &= \frac{E(|a|^{2})}{E(|n|^{2})} \\ &= \frac{1 - E(|n|^{2})}{E(|n|^{2})} \\ &= \frac{1 - 2\sigma^{2}}{2\sigma^{2}} \end{aligned} & (23) \end{matrix}$
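Solving (23) for the noise parameter gives $2\sigma^{2} = 1/(1 + OSNR)$ with the OSNR expressed as a linear ratio. A minimal sketch of this conversion follows; the function names `sigma_from_osnr_db` and `osnr_db_from_sigma` are assumptions, and the OSNR is treated simply as the signal-to-noise power ratio of (23), without any reference-bandwidth correction.

```python
import numpy as np

def sigma_from_osnr_db(osnr_db):
    """Invert Eq. (23): 2*sigma^2 = 1 / (1 + OSNR), with OSNR as a linear ratio."""
    osnr = 10.0 ** (osnr_db / 10.0)
    return np.sqrt(1.0 / (2.0 * (1.0 + osnr)))

def osnr_db_from_sigma(sigma):
    """Eq. (23) evaluated directly and converted to dB."""
    return 10.0 * np.log10((1.0 - 2.0 * sigma ** 2) / (2.0 * sigma ** 2))

sigma_qpsk = sigma_from_osnr_db(24.0)    # OSNR quoted for the QPSK PDF in FIG. 2
sigma_16qam = sigma_from_osnr_db(28.0)   # OSNR quoted for the 16QAM PDF in FIG. 2
round_trip_db = osnr_db_from_sigma(sigma_qpsk)
```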

(72) B. Derivation of the Complex Gradient of the Cost Function

(73) The complex gradient is derived in detail in [31, 34]. Here we use the results directly. The complex gradient is defined as

(74) $\begin{matrix} \frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial z^{R}} - j\frac{\partial f}{\partial z^{I}}\right); \quad \frac{\partial f}{\partial z^{*}} = \frac{1}{2}\left(\frac{\partial f}{\partial z^{R}} + j\frac{\partial f}{\partial z^{I}}\right) & (24) \end{matrix}$
where $z = z^{R} + jz^{I}$ is a complex variable, $z^{*}$ is the conjugate of z, and f is analytic with respect to z and $z^{*}$ independently. So using the chain rule, we have

(75) $\begin{matrix} \begin{aligned} \frac{\partial J(w)}{\partial w_{i}^{*}} &= \frac{1}{2} E\left\{\frac{\partial\{-\log[p_{s}(w^{H}x)]\}}{\partial w_{i}^{R}} + j\frac{\partial\{-\log[p_{s}(w^{H}x)]\}}{\partial w_{i}^{I}}\right\} \\ &= \frac{1}{2} E\left\{g^{R}(y)\frac{\partial (w^{H}x)^{R}}{\partial w_{i}^{R}} + g^{I}(y)\frac{\partial (w^{H}x)^{I}}{\partial w_{i}^{R}} + j\left[g^{R}(y)\frac{\partial (w^{H}x)^{R}}{\partial w_{i}^{I}} + g^{I}(y)\frac{\partial (w^{H}x)^{I}}{\partial w_{i}^{I}}\right]\right\} \\ &= \frac{1}{2} E\left\{g^{R}(y)x_{i}^{R} + g^{I}(y)x_{i}^{I} + j\left[g^{R}(y)x_{i}^{I} - g^{I}(y)x_{i}^{R}\right]\right\} \\ &= \frac{1}{2} E\{x_{i} g^{*}(y)\}, \quad (x_{i} = x_{i}^{R} + jx_{i}^{I},\ g(y) = g^{R}(y) + jg^{I}(y)) \end{aligned} & (25) \end{matrix}$

(76) Thus

(77) $\begin{matrix} \frac{\partial J(w)}{\partial w^{*}} = \frac{1}{2} E\{x g^{*}(w^{H}x)\} & (26) \end{matrix}$
where

(78) $\begin{matrix} \begin{aligned} g^{R}(y) &= -\frac{\partial \log p_{s}(y)}{\partial y^{R}} \\ &= \frac{-y^{R}}{M\, 2\pi\sigma^{4}\, p_{s}(y)} \sum_{i=1}^{M} \exp\left(-\frac{|y|^{2} + |a_{i}|^{2}}{2\sigma^{2}}\right) \left[I_{1}\left(\left|\frac{a_{i}^{*}y}{\sigma^{2}}\right|\right)\left|\frac{a_{i}^{*}}{y}\right| - I_{0}\left(\left|\frac{a_{i}^{*}y}{\sigma^{2}}\right|\right)\right] \end{aligned} & (27) \end{matrix}$

(79) Similarly,

(80) $\begin{matrix} g^{I}(y) = \frac{-y^{I}}{M\, 2\pi\sigma^{4}\, p_{s}(y)} \sum_{i=1}^{M} \exp\left(-\frac{|y|^{2} + |a_{i}|^{2}}{2\sigma^{2}}\right) \left[I_{1}\left(\left|\frac{a_{i}^{*}y}{\sigma^{2}}\right|\right)\left|\frac{a_{i}^{*}}{y}\right| - I_{0}\left(\left|\frac{a_{i}^{*}y}{\sigma^{2}}\right|\right)\right] & (28) \end{matrix}$

(81) C. Derivation of Updating Rule of Quasi-Newton Algorithm

(82) Starting from (13) and substituting $\Delta\tilde{w} = \tilde{w}_{n+1} - \tilde{w}_{n}$, we have
$(\tilde{H}_{J} + \lambda\tilde{I})\tilde{w}_{n+1} = -\tilde{\nabla}_{J}^{*} + \tilde{H}_{J}\tilde{w}_{n}$ (29)
where $\tilde{\nabla}_{J}^{*}$ is the complex gradient [31,34], whose odd elements are defined in (11). The complex Hessian $\tilde{H}_{J}$ is given by [27],

(83) $\begin{matrix} \begin{aligned} \tilde{H}_{J} &= \frac{\partial^{2} J(w)}{\partial\tilde{w}^{*}\,\partial\tilde{w}^{T}} \\ &= \begin{bmatrix} \frac{\partial^{2} J(w)}{\partial w_{1}^{*}\partial w_{1}} & \frac{\partial^{2} J(w)}{\partial w_{1}^{*}\partial w_{1}^{*}} & \frac{\partial^{2} J(w)}{\partial w_{1}^{*}\partial w_{2}} & \frac{\partial^{2} J(w)}{\partial w_{1}^{*}\partial w_{2}^{*}} \\ \frac{\partial^{2} J(w)}{\partial w_{1}\partial w_{1}} & \frac{\partial^{2} J(w)}{\partial w_{1}\partial w_{1}^{*}} & \frac{\partial^{2} J(w)}{\partial w_{1}\partial w_{2}} & \frac{\partial^{2} J(w)}{\partial w_{1}\partial w_{2}^{*}} \\ \frac{\partial^{2} J(w)}{\partial w_{2}^{*}\partial w_{1}} & \frac{\partial^{2} J(w)}{\partial w_{2}^{*}\partial w_{1}^{*}} & \frac{\partial^{2} J(w)}{\partial w_{2}^{*}\partial w_{2}} & \frac{\partial^{2} J(w)}{\partial w_{2}^{*}\partial w_{2}^{*}} \\ \frac{\partial^{2} J(w)}{\partial w_{2}\partial w_{1}} & \frac{\partial^{2} J(w)}{\partial w_{2}\partial w_{1}^{*}} & \frac{\partial^{2} J(w)}{\partial w_{2}\partial w_{2}} & \frac{\partial^{2} J(w)}{\partial w_{2}\partial w_{2}^{*}} \end{bmatrix} \\ &= E\left\{g_{a}(y) \begin{bmatrix} x_{1}x_{1}^{*} & 0 & x_{1}x_{2}^{*} & 0 \\ 0 & x_{1}^{*}x_{1} & 0 & x_{1}^{*}x_{2} \\ x_{2}x_{1}^{*} & 0 & x_{2}x_{2}^{*} & 0 \\ 0 & x_{2}^{*}x_{1} & 0 & x_{2}^{*}x_{2} \end{bmatrix}\right\} \\ &\quad + E\left\{\begin{bmatrix} 0 & x_{1}^{2}g_{b}(y) & 0 & x_{1}x_{2}g_{b}(y) \\ x_{1}^{*2}g_{b}^{*}(y) & 0 & x_{1}^{*}x_{2}^{*}g_{b}^{*}(y) & 0 \\ 0 & x_{2}x_{1}g_{b}(y) & 0 & x_{2}^{2}g_{b}(y) \\ x_{2}^{*}x_{1}^{*}g_{b}^{*}(y) & 0 & x_{2}^{*2}g_{b}^{*}(y) & 0 \end{bmatrix}\right\} \end{aligned} & (30) \end{matrix}$
where

(84) $\begin{matrix} \begin{aligned} g_{a}(y) &= 4\left[\frac{\partial g^{R}(y)}{\partial y^{R}} + \frac{\partial g^{I}(y)}{\partial y^{I}} + j\left(\frac{\partial g^{R}(y)}{\partial y^{I}} - \frac{\partial g^{I}(y)}{\partial y^{R}}\right)\right] \\ &= \frac{8g(y)}{y} + \frac{2h(y)|y|^{2}}{\pi\sigma^{4}} \\ g_{b}(y) &= 4\left[\frac{\partial g^{R}(y)}{\partial y^{R}} - \frac{\partial g^{I}(y)}{\partial y^{I}} + j\left(\frac{\partial g^{R}(y)}{\partial y^{I}} + \frac{\partial g^{I}(y)}{\partial y^{R}}\right)\right] \\ &= \frac{2h(y)\left[(y^{R})^{2} - (y^{I})^{2} + j2y^{R}y^{I}\right]}{\pi\sigma^{4}} \end{aligned} & (31) \\ h(y) = \frac{1}{Mp_{s}(y)} \sum_{i=1}^{M} \exp\left(-\frac{|y|^{2} + |a_{i}|^{2}}{2\sigma^{2}}\right) \left[\left(\left|\frac{a_{i}}{y}\right|^{2} + \frac{1}{\sigma^{2}} - \frac{g(y)}{y}\right) I_{0}\left(\left|\frac{a_{i}y}{\sigma^{2}}\right|\right) - \left|\frac{2a_{i}}{y}\right|\left(\frac{1}{|y|^{2}} + \frac{1}{\sigma^{2}} - \frac{g(y)}{2y}\right) I_{1}\left(\left|\frac{a_{i}y}{\sigma^{2}}\right|\right)\right] & (32) \end{matrix}$

(85) Assuming x has been whitened, and after removing the even rows of the matrix custom character .sub.J and .sub.J*, it results in

(86) $\begin{matrix} (H_{J} + I) w_{n + 1} = - \frac{1}{2} E {{xg}^{*} (y)} + E {{xx}^{H} g_{a} (y)} w_{n} + E {{xx}^{T} g_{b} (y)} w_{n}^{*} - \frac{1}{2} E {{xg}^{*} (y)} + E {g_{a} (y)} w_{n} + E {{xx}^{T}} E {g_{b} (y)} w_{n}^{*} & (33) \end{matrix}$

(87) At the point of convergence, $(H_{J} + \lambda I)$ becomes real, as proved in [27]. Therefore, the fixed-point update is

(88) $\begin{matrix} w \leftarrow - \frac{1}{2} E\{ x g^{*}(y) \} + E\{ g_{a}(y) \} w + E\{ x x^{T} \} E\{ g_{b}(y) \} w^{*} & (34) \end{matrix}$
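
The update in (34) can be sketched numerically as follows. This is a minimal illustration, not the disclosed implementation: it assumes $y = w^{H} x$, replaces expectations with sample means, and renormalizes w after each step (a common convention in fixed-point ICA that this passage does not state). The function names `whiten` and `fixed_point_step` are hypothetical, and the three nonlinearities at the bottom are arbitrary placeholders used only to exercise the code; they are not the PDF-derived g, g_a and g_b of (31) and (32).

```python
import numpy as np


def whiten(x):
    """Whiten x (shape (2, N)) so the sample covariance E{z z^H} is the identity."""
    x = x - x.mean(axis=1, keepdims=True)
    cov = (x @ x.conj().T) / x.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)          # Hermitian eigendecomposition
    # D^{-1/2} E^H applied to x
    return (eigvec / np.sqrt(eigval)).conj().T @ x


def fixed_point_step(w, x, g, g_a, g_b):
    """One sample-mean evaluation of the update (34) for whitened x of shape (2, N)."""
    y = w.conj() @ x                               # assumed output y = w^H x, shape (N,)
    grad_term = -0.5 * np.mean(x * np.conj(g(y)), axis=1)   # -1/2 E{x g*(y)}
    pseudo_cov = (x @ x.T) / x.shape[1]            # E{x x^T}, no conjugation
    w_new = (grad_term
             + np.mean(g_a(y)) * w
             + np.mean(g_b(y)) * (pseudo_cov @ np.conj(w)))
    # Assumption: renormalize to keep ||w|| = 1 between iterations.
    return w_new / np.linalg.norm(w_new)


# Placeholder nonlinearities (hypothetical, for illustration only).
def g_placeholder(y):
    return y * np.abs(y) ** 2


def g_a_placeholder(y):
    return 2.0 * np.abs(y) ** 2


def g_b_placeholder(y):
    return y ** 2
```

In use, the received two-polarization samples would be whitened once and `fixed_point_step` would then be iterated until w stops changing; the stopping rule and the choice of nonlinearities are left to the rest of the description.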

REFERENCES

(89) [1] X. Zhou, et al., High Spectral Efficiency 400 Gb/s Transmission Using PDM Time-Domain Hybrid 32-64 QAM and Training-Assisted Carrier Recovery, IEEE/OSA Journal of Lightwave Technology, vol. 31, pp. 999-1005, 2013.
[2] J. Yu, et al., Transmission of 200 G PDM-CSRZ-QPSK and PDM-16 QAM With a SE of 4 b/s/Hz, IEEE/OSA Journal of Lightwave Technology, vol. 31, pp. 515-522, 2013.
[3] P. J. Winzer, High-Spectral-Efficiency Optical Modulation Formats, IEEE/OSA Journal of Lightwave Technology, vol. 30, pp. 3824-3835, 2012.
[4] M. Yiran, Y. Qi, T. Yan, C. Simin, and W. Shieh, 1-Tb/s Single-Channel Coherent Optical OFDM Transmission With Orthogonal-Band Multiplexing and Subwavelength Bandwidth Access, IEEE/OSA Journal of Lightwave Technology, vol. 28, pp. 308-315, 2010.
[5] Y. Ma, Q. Yang, Y. Tang, S. Chen, and W. Shieh, 1-Tb/s single-channel coherent optical OFDM transmission over 600-km SSMF fiber with subwavelength bandwidth access, OSA Optics Express, vol. 17, pp. 9421-9427, 2009.
[6] X. Yuan, J. Zhang, Q. Jing, Y. Zhang, and M. Zhang, A novel scheme for automatic polarization division demultiplexing, in Proc. Asia Communications and Photonics Conference and Exhibition (ACP), 2011, pp. 83092P.
[7] A.-L. Yi, L.-S. Yan, J. Ye, W. Pan, B. Luo, and X. S. Yao, A Novel Scheme for All-Optical Automatic Polarization Division Demultiplexing, in Proc. Asia Communications and Photonics Conference and Exhibition (ACP), 2009, pp. TuAA1.
[8] X. S. Yao, L. S. Yan, B. Zhang, A. E. Willner, and J. Jiang, All-optic scheme for automatic polarization division demultiplexing, OSA Optics Express, vol. 15, pp. 7407-7414, 2007.
[9] Z. Yu, X. Yi, Q. Yang, M. Luo, J. Zhang, L. Chen, et al., Polarization demultiplexing in Stokes space for coherent optical PDM-OFDM, OSA Optics Express, vol. 21, pp. 3885-3890, 2013.
[10] C. Gong, X. Wang, N. Cvijetic, and G. Yue, A Novel Blind Polarization Demultiplexing Algorithm Based on Correlation Analysis, IEEE/OSA Journal of Lightwave Technology, vol. 29, pp. 1258-1264, 2011.
[11] B. Szafraniec, B. Nebendahl, and T. Marshall, Polarization demultiplexing in Stokes space, OSA Optics Express, vol. 18, pp. 17928-17939, 2010.
[12] L. Ling, T. Zhenning, Y. Weizhen, S. Oda, T. Hoshida, and J. C. Rasmussen, Initial tap setup of constant modulus algorithm for polarization de-multiplexing in optical coherent receivers, in Proc. Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC), 2009, pp. 1-3.
[13] A. Vgenis, C. S. Petrou, C. B. Papadias, I. Roudas, and L. Raptis, Nonsingular Constant Modulus Equalizer for PDM-QPSK Coherent Optical Receivers, IEEE Photonics Technology Letters, vol. 22, pp. 45-47, 2010.
[14] K. Kikuchi, Polarization-demultiplexing algorithm in the digital coherent receiver, in Digest of the IEEE/LEOS Summer Topical Meetings, 2008, pp. 101-102.
[15] X. Zhou and J. Yu, Digital signal processing for coherent optical communication, in Proc. 18th Annual Wireless and Optical Communications Conference (WOCC), 2009, pp. 1-5.
[16] O. Zia-Chahabi, R. Le Bidan, C. Laot, and M. Morvan, A self-reconfiguring constant modulus algorithm for proper polarization demultiplexing in coherent optical receivers, in Proc. Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC), 2012, pp. 1-3.
[17] I. Roudas, A. Vgenis, C. S. Petrou, D. Toumpakaris, J. Hurley, M. Sauer, et al., Optimal Polarization Demultiplexing for Coherent Optical Communications Systems, IEEE/OSA Journal of Lightwave Technology, vol. 28, pp. 1121-1134, 2010.
[18] S. J. Savory, Digital Coherent Optical Receivers: Algorithms and Subsystems, IEEE Journal of Selected Topics in Quantum Electronics, vol. 16, pp. 1-16, 2010.
[19] Y. Han and G. Li, Coherent optical communication using polarization multiple-input-multiple-output, OSA Optics Express, vol. 13, pp. 7527-7534, 2005.
[20] C. C. Do, C. Zhu, A. V. Tran, S. Chen, T. Anderson, D. Hewitt, et al., Chromatic dispersion estimation in 40 Gb/s coherent polarization-multiplexed single carrier system using complementary Golay sequences, in Proc. Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC), 2012, pp. 1-3.
[21] P. Johannisson, H. Wymeersch, M. Sjodin, A. S. Tan, et al., Convergence Comparison of the CMA and ICA for Blind Polarization Demultiplexing, IEEE/OSA Journal of Optical Communications and Networking, vol. 3, pp. 493-501, 2011.
[22] H. Zhang, Z. Tao, L. Liu, S. Oda, T. Hoshida, and J. C. Rasmussen, Polarization demultiplexing based on independent component analysis in optical coherent receivers, in Proc. 34th European Conference on Optical Communication (ECOC), 2008, pp. 1-2.
[23] X. Xiaobo, F. Yaman, Z. Xiang, and L. Guifang, Polarization Demultiplexing by Independent Component Analysis, IEEE Photonics Technology Letters, vol. 22, pp. 805-807, 2010.
[24] T. Oktem, A. T. Erdogan, and A. Demir, Adaptive Receiver Structures for Fiber Communication Systems Employing Polarization-Division Multiplexing, IEEE/OSA Journal of Lightwave Technology, vol. 27, pp. 5394-5404, 2009.
[25] T. M. Oktem, A. T. Erdogan, and A. Demir, Adaptive Receiver Structures for Fiber Communication Systems Employing Polarization Division Multiplexing: High Symbol Rate Case, IEEE/OSA Journal of Lightwave Technology, vol. 28, pp. 1536-1546, 2010.
[26] I. B. Djordjevic, Spatial-Domain-Based Hybrid Multidimensional Coded-Modulation Schemes Enabling Multi-Tb/s Optical Transport, IEEE/OSA Journal of Lightwave Technology, vol. 30, pp. 2315-2328, 2012.
[27] M. Novey and T. Adali, Complex ICA by Negentropy Maximization, IEEE Transactions on Neural Networks, vol. 19, pp. 596-609, 2008.
[28] M. Novey and T. Adali, Complex Fixed-Point ICA Algorithm for Separation of QAM Sources using Gaussian Mixture Model, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2007, pp. II-445-II-448.
[29] P. K. A. Wai, C. R. Menyuk, and H. H. Chen, Stability of solitons in randomly varying birefringent fibers, Optics Letters, vol. 16, pp. 1231-1233, 1991.
[30] A. Hyvarinen and E. Oja, Independent component analysis: algorithms and applications, Neural Networks, vol. 13, pp. 411-430, 2000.
[31] A. van den Bos, Complex gradient and Hessian, IEE Proceedings - Vision, Image and Signal Processing, vol. 141, pp. 380-383, 1994.
[32] E. Bingham and A. Hyvarinen, A fast fixed-point algorithm for independent component analysis of complex valued signals, International Journal of Neural Systems, vol. 10, pp. 1-8, 2000.
[33] T. Adali, L. Hualiang, M. Novey, and J. F. Cardoso, Complex ICA Using Nonlinear Functions, IEEE Transactions on Signal Processing, vol. 56, pp. 4536-4544, 2008.
[34] D. H. Brandwood, A complex gradient operator and its application in adaptive array theory, IEE Proceedings H - Microwaves, Optics and Antennas, vol. 130, pp. 11-16, 1983.
