Method and apparatus for correcting optical waveform distortion and optical signal receiving apparatus

Abstract

This method includes optimizing, by a gradient descent method, a first parameter used in back propagation processing and associated with XPM and a second parameter used in the back propagation processing and associated with SPM and XPM, wherein the back propagation processing is processing to estimate a waveform at a time of transmission by alternately calculating linear terms and nonlinear terms in a nonlinear Schrödinger equation after receiving an optical signal whose waveform shape changed in a transmission line and digitizing a waveform of the received optical signal, and correct, for each channel of plural channels in the transmission line at a time of wavelength-division multiplexing transmission, waveform distortion caused by SPM that occurs in the channel and waveform distortion caused by XPM that occurs in relation with channels other than the channel; and executing the back propagation processing by using the optimized first and second parameters.

Claims

1. A method for correcting optical waveform distortion, comprising: optimizing, by a gradient descent method, a first parameter that is used in back propagation processing and is associated with cross-phase modulation and a second parameter that is used in the back propagation processing and is associated with self-phase modulation and the cross-phase modulation, wherein the back propagation processing is processing to estimate a waveform at a time of transmission by alternately calculating linear terms and nonlinear terms in a nonlinear Schrödinger equation after receiving an optical signal whose waveform shape changed in a transmission line and digitizing a waveform of the received optical signal, and correct, for each channel of plural channels in the transmission line at a time of wavelength-division multiplexing transmission, waveform distortion caused by the self-phase modulation that occurs in the channel and waveform distortion caused by the cross-phase modulation that occurs in relation with channels other than the channel; and executing the aforementioned back propagation processing by using the optimized first and second parameters.

2. The method according to claim 1, wherein the waveform distortion caused by the cross-phase modulation is corrected under approximation that an initial waveform is maintained for an intensity, independent of a propagation distance, and a delay proportional to the propagation distance occurs along a time axis.

3. The method according to claim 1, wherein the second parameter comprises the group velocity dispersion D.sub.2 and nonlinear coefficient g, and the first parameter comprises a walk-off parameter d.sub.n.

4. An apparatus, comprising: a memory; and a processor coupled to the memory and configured to optimize, by a gradient descent method, a first parameter that is used in back propagation processing and is associated with cross-phase modulation and a second parameter that is used in the back propagation processing and is associated with self-phase modulation and the cross-phase modulation, wherein the back propagation processing is processing to estimate a waveform at a time of transmission by alternately calculating linear terms and nonlinear terms in a nonlinear Schrödinger equation after receiving an optical signal whose waveform shape changed in a transmission line and digitizing a waveform of the received optical signal, and correct, for each channel of plural channels in the transmission line at a time of wavelength-division multiplexing transmission, waveform distortion caused by the self-phase modulation that occurs in the channel and waveform distortion caused by the cross-phase modulation that occurs in relation with channels other than the channel.

5. An optical receiver, comprising: a memory; and a processor coupled to the memory and configured to execute back propagation processing by using a first parameter that is used in back propagation processing and is associated with cross-phase modulation and a second parameter that is used in the back propagation processing and is associated with self-phase modulation and the cross-phase modulation, wherein the back propagation processing is processing to estimate a waveform at a time of transmission by alternately calculating linear terms and nonlinear terms in a nonlinear Schrödinger equation after receiving an optical signal whose waveform shape changed in a transmission line and digitizing a waveform of the received optical signal, and correct, for each channel of plural channels in the transmission line at a time of wavelength-division multiplexing transmission, waveform distortion caused by the self-phase modulation that occurs in the channel and waveform distortion caused by the cross-phase modulation that occurs in relation with channels other than the channel, and wherein the first and second parameters are optimized by a gradient descent method.

Description

BRIEF DESCRIPTION OF DRAWINGS

(1) FIG. 1 is a diagram schematically illustrating the spectrum of a WDM signal;

(2) FIG. 2 is a diagram for describing a relationship between spans and steps;

(3) FIG. 3 is a schematic diagram illustrating linear steps, nonlinear steps, and a route of each channel signal component;

(4) FIG. 4A is a diagram illustrating exact routes through which a gradient of D.sub.2.sup.(0) spreads from a linear step L.sub.1, 0.sup.(0) to later stages;

(5) FIG. 4B is a diagram illustrating approximate routes through which the gradient of D.sub.2.sup.(0) spreads from the linear step L.sub.1, 0.sup.(0) to the later stages;

(6) FIG. 5A is a diagram illustrating a flow of processing of calculating a gradient of a waveform by each transmission line parameter in parameter optimization processing;

(7) FIG. 5B is a diagram illustrating a processing flow of updating each transmission line parameter in the parameter optimization processing;

(8) FIG. 5C is a schematic diagram of an optical transmission system according to the present embodiment;

(9) FIG. 5D is a diagram illustrating a processing flow according to the present embodiment;

(10) FIG. 6 is a diagram illustrating a group velocity dispersion value and a nonlinear constant in each span of a realistic transmission line;

(11) FIG. 7A is a diagram illustrating a second-order group velocity dispersion parameter D.sub.2.sup.(j) with respect to the number of times of repeated learning in a learning process in a case where span launch power was +2 dBm/ch in a six-span ideal transmission line;

(12) FIG. 7B is a diagram illustrating a nonlinear coefficient parameter g.sup.(j) with respect to the number of times of repeated learning in the learning process in the case where the span launch power was +2 dBm/ch in the six-span ideal transmission line;

(13) FIG. 7C is a diagram illustrating d.sub.4.sup.(j) with respect to the number of times of repeated learning in the learning process in the case where the span launch power was +2 dBm/ch in the six-span ideal transmission line;

(14) FIG. 7D is a diagram illustrating d.sub.1.sup.(j) with respect to the number of times of repeated learning in the learning process in the case where the span launch power was +2 dBm/ch in the six-span ideal transmission line;

(15) FIG. 7E is a diagram illustrating a mean squared error with respect to the number of times of repeated learning in the learning process in the case where the span launch power was +2 dBm/ch in the six-span ideal transmission line;

(16) FIG. 8A is a diagram illustrating a result of a Q factor with respect to signal launch power in a case where nonlinear waveform distortion was corrected taking only SPM into account in ideal transmission lines of four spans and six spans;

(17) FIG. 8B is a diagram illustrating a result of a Q factor with respect to signal launch power in a case where nonlinear waveform distortion was corrected taking only SPM into account in ideal transmission lines of eight spans and 10 spans;

(18) FIG. 9A is a diagram illustrating a result of a Q factor with respect to signal launch power in a case where nonlinear waveform distortion was corrected also taking XPM into account in ideal transmission lines of four spans and six spans;

(19) FIG. 9B is a diagram illustrating a result of a Q factor with respect to signal launch power in a case where nonlinear waveform distortion was corrected also taking XPM into account in ideal transmission lines of eight spans and 10 spans;

(20) FIG. 10 is a diagram illustrating a result of a Q factor with respect to the number of transmission spans in cases where nonlinear waveform distortion correction was applied by various methods and was not applied in ideal transmission lines;

(21) FIG. 11A is a diagram illustrating the second-order group velocity dispersion parameter D.sub.2.sup.(j) with respect to the number of times of repeated learning in the learning process in the case where span launch power was +2 dBm/ch in a six-span realistic transmission line;

(22) FIG. 11B is a diagram illustrating g.sup.(j) with respect to the number of times of repeated learning in the learning process in the case where the span launch power was +2 dBm/ch in the six-span realistic transmission line;

(23) FIG. 11C is a diagram illustrating d.sub.4.sup.(j) with respect to the number of times of repeated learning in the learning process in the case where the span launch power was +2 dBm/ch in the six-span realistic transmission line;

(24) FIG. 11D is a diagram illustrating d.sub.1.sup.(j) with respect to the number of times of repeated learning in the learning process in the case where the span launch power was +2 dBm/ch in the six-span realistic transmission line;

(25) FIG. 11E is a diagram illustrating a mean squared error with respect to the number of times of repeated learning in the learning process in the case where the span launch power was +2 dBm/ch in the six-span realistic transmission line;

(26) FIG. 12A is a diagram illustrating a result of a Q factor with respect to signal launch power in a case where nonlinear waveform distortion was corrected taking only SPM into account in realistic transmission lines of four spans and six spans;

(27) FIG. 12B is a diagram illustrating a result of a Q factor with respect to signal launch power in a case where nonlinear waveform distortion was corrected taking only SPM into account in realistic transmission lines of eight spans and 10 spans;

(28) FIG. 13A is a diagram illustrating a result of a Q factor with respect to signal launch power in a case where nonlinear waveform distortion was corrected also taking XPM into account in realistic transmission lines of four spans and six spans;

(29) FIG. 13B is a diagram illustrating a result of a Q factor with respect to signal launch power in a case where nonlinear waveform distortion was corrected also taking XPM into account in realistic transmission lines of eight spans and 10 spans;

(30) FIG. 14 is a diagram illustrating a result of a Q factor with respect to the number of transmission spans in cases where nonlinear waveform distortion correction was applied by various methods and was not applied in realistic transmission lines;

(31) FIG. 15 is a schematic diagram illustrating a transmission system used for an experiment;

(32) FIG. 16A is a diagram illustrating a change of the second-order group velocity dispersion parameter D.sub.2.sup.(j) with respect to the number of times of parameter update;

(33) FIG. 16B is a diagram illustrating a change of the nonlinear coefficient parameter g.sup.(j) with respect to the number of times of parameter update;

(34) FIG. 16C is a diagram illustrating a change of a cross-polarization phase modulation parameter δ.sup.(j) with respect to the number of times of parameter update;

(35) FIG. 16D is a diagram illustrating a change of a walk-off parameter d.sub.5.sup.(j) with respect to the number of times of parameter update;

(36) FIG. 16E is a diagram illustrating a change of a walk-off parameter d.sub.1.sup.(j) with respect to the number of times of parameter update;

(37) FIG. 16F is a diagram illustrating a change of a walk-off parameter d.sub.n.sup.(j) with respect to the number of times of parameter update;

(38) FIG. 16G is a diagram illustrating a change of a Mean Squared Error (MSE) and a moving average of the MSE with respect to the number of times of parameter update;

(39) FIG. 17 is a schematic diagram illustrating an optical fiber connection situation of a six-span transmission line, and a back propagation calculation order;

(40) FIG. 18A is a diagram illustrating signal quality with respect to launch power in a case where nonlinear waveform distortion correction was performed taking only SPM into account in a six-span transmission line and an eight-span transmission line;

(41) FIG. 18B is a diagram illustrating signal quality with respect to launch power in a case where nonlinear waveform distortion correction was performed taking only SPM into account in a 10-span transmission line and a 12-span transmission line;

(42) FIG. 18C is a diagram illustrating signal quality with respect to launch power in a case where nonlinear waveform distortion correction was performed taking only SPM into account in a 14-span transmission line and a 16-span transmission line;

(43) FIG. 19A is a diagram illustrating signal quality with respect to launch power in a case where nonlinear waveform distortion correction was performed taking SPM and XPM into account in a six-span transmission line and an eight-span transmission line;

(44) FIG. 19B is a diagram illustrating signal quality with respect to launch power in a case where nonlinear waveform distortion correction was performed taking SPM and XPM into account in a 10-span transmission line and a 12-span transmission line;

(45) FIG. 19C is a diagram illustrating signal quality with respect to launch power in a case where nonlinear waveform distortion correction was performed taking SPM and XPM into account in a 14-span transmission line and a 16-span transmission line;

(46) FIG. 20 is a diagram illustrating a change of signal quality with respect to a transmission span; and

(47) FIG. 21 is a diagram illustrating the number of product arithmetic operations with respect to a data length N in cases of five channels, 11 channels, and 21 channels.

MODES TO IMPLEMENT PRESENT INVENTION

Basic Idea in Embodiment of Present Invention

(48) A technique of correcting nonlinear waveform distortion according to the present embodiment, targeting a WDM signal whose frequency interval is Δω, will be described. FIG. 1 is a schematic view of the spectrum of the WDM signal. Here, the signal to be corrected is channel number 0, the center frequency of this signal takes the reference value ω.sub.0=0, and the center frequency of the signal of another channel number n is ω.sub.n=nΔω (n=±1, ±2, . . . ). In this case, the envelope amplitude in expression (1) is expressed as follows as a sum of the envelope amplitudes of the respective channels.
$[Expression 3]$ $\begin{matrix} A_{p} (t) = \sum_{n} A_{p, n} (t) e^{i \omega_{n} t} & (3) \end{matrix}$

(49) In this regard, A.sub.p, n(t) represents the baseband envelope amplitude (that is, with a center frequency of 0) of the polarization component p of the channel n.

(50) Moreover, only the linear terms of the nonlinear Schrödinger equation expressed in expression (1) are written out as follows.

(51) $[Expression 4]$ $\begin{matrix} i \frac{\partial A_{p}}{\partial z} + \frac{\beta_{2}}{2} \frac{\partial^{2} A_{p}}{\partial t^{2}} - i \frac{\beta_{3}}{6} \frac{\partial^{3} A_{p}}{\partial t^{3}} = - i \frac{\alpha}{2} A_{p} & (4) \end{matrix}$

(52) The solution of expression (4) after propagation over a distance h, taking as an input the waveform A.sub.p, n(z=0, t) at the coordinate z=0 in the longitudinal direction of the optical fiber, can be separated for each channel in the frequency domain and is as follows.

(53) $[Expression 5]$ $\begin{matrix} {\tilde{A}}_{p, n} (h, \omega) = {\tilde{A}}_{p, n} (0, \omega) \exp [i (D_{2} \omega^{2} + D_{3} \omega^{3} - T_{n} \omega) - \frac{\alpha}{2} h] & (5) \end{matrix}$

(54) Here, D.sub.2(h)=−β.sub.2h/2 and D.sub.3(h)=−β.sub.3h/6 are the respective cumulative values of the second-order and third-order group velocity dispersions. Furthermore, T.sub.n(h, ω.sub.n)=(β.sub.2ω.sub.n+β.sub.3ω.sub.n.sup.2/2)h represents the group delay (walk-off) produced in the signal of the channel number n. The waveform of expression (5) is represented in the time domain, using an operator ℱ that represents the Fourier transform, as follows.

(55) $[Expression 6]$ $\begin{matrix} A_{p, n} (h, t) = \exp (- \frac{\alpha}{2} h) \mathcal{F}^{- 1} \exp [i (D_{2} \omega^{2} + D_{3} \omega^{3} - T_{n} \omega)] \mathcal{F} A_{p, n} (0, t) & (6) \end{matrix}$
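As a rough illustration of expression (6), the following Python sketch applies the linear-step phase factor to one channel using a naive discrete Fourier transform. The function names, the sampling grid, and the discrete transform convention (forward transform with exp(−iωt)) are assumptions made for this sketch and are not part of the embodiment; under that convention the term −T.sub.nω acts as a delay of T.sub.n.

```python
import cmath
import math

def dft(x):
    """Naive forward DFT: X[k] = sum_t x[t] * exp(-2*pi*i*k*t/N)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse DFT matching dft()."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def angular_frequencies(n, dt):
    """Angular frequency of each DFT bin (upper half holds negative frequencies)."""
    return [2 * math.pi * (k if k < n // 2 else k - n) / (n * dt) for k in range(n)]

def linear_step(a, dt, d2, d3, t_n, alpha, h):
    """Expression (6): exp(-alpha*h/2) F^-1 exp[i(D2 w^2 + D3 w^3 - T_n w)] F a."""
    w = angular_frequencies(len(a), dt)
    shaped = [s * cmath.exp(1j * (d2 * wk**2 + d3 * wk**3 - t_n * wk))
              for s, wk in zip(dft(a), w)]
    loss = math.exp(-alpha * h / 2)
    return [loss * v for v in idft(shaped)]

# Without dispersion and loss, T_n equal to two sample periods
# moves the pulse from sample 3 to sample 5.
pulse = [0.0] * 8
pulse[3] = 1.0
delayed = linear_step(pulse, dt=1.0, d2=0.0, d3=0.0, t_n=2.0, alpha=0.0, h=1.0)
```

The naive transform costs O(N²) operations and is used here only to keep the sketch free of external dependencies; a fast Fourier transform would normally take its place.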

(56) On the other hand, only the nonlinear terms in the nonlinear Schrödinger equation (1) are written out as follows.

(57) $[Expression 7]$ $\begin{matrix} i \frac{\partial A_{p}}{\partial z} - \frac{8}{9} \gamma_{0} ({| A_{p} |}^{2} + {| A_{3 - p} |}^{2}) A_{p} = - i \frac{\alpha}{2} A_{p} & (7) \end{matrix}$

(58) Here, the term of propagation loss on the right side is originally a linear term, yet is included to take into account the change of nonlinearity that occurs as the signal intensity attenuates due to propagation loss. In a case where an envelope amplitude B.sub.p is defined by A.sub.p=B.sub.p(z)exp(−αz/2), B.sub.p represents an amplitude from which the attenuation due to propagation loss is separated, and expression (7) is expressed for B.sub.p as follows.

(59) $[Expression 8]$ $\begin{matrix} i \frac{\partial B_{p}}{\partial z} - \frac{8}{9} \gamma (z) ({| B_{p} |}^{2} + {| B_{3 - p} |}^{2}) B_{p} = 0 & (8) \end{matrix}$

(60) Here, γ(z)=γ.sub.0exp(−αz) holds.

(61) Expression (8) expresses that, when the signal amplitude attenuates due to the propagation loss, the nonlinear effect in the nonlinear Schrödinger equation can be described in such a way that the original nonlinear coefficient γ.sub.0 attenuates in the longitudinal direction. When expression (8) is decomposed for each channel, the amplitude of the channel number 0 is expressed as follows.

(62) $[Expression 9]$ $\begin{matrix} \frac{\partial B_{p, 0}}{\partial z} = - i \frac{8}{9} \gamma (z) [{| B_{p, 0} |}^{2} + {| B_{3 - p, 0} |}^{2} + \sum_{n \neq 0} \{ 2 {| B_{p, n} |}^{2} + {| B_{3 - p, n} |}^{2} \}] B_{p, 0} & (9) \end{matrix}$

(63) On the right side of expression (9), the terms including |B.sub.p, 0|.sup.2 (p=1, 2), which is the time waveform of the signal intensity of the channel number 0, cause SPM, and the terms including |B.sub.p, n|.sup.2 (p=1, 2), which is the time waveform of the signal intensity of a channel number n≠0, cause XPM.

(64) As proposed by Non-Patent Literature 4, by introducing some assumptions related to the change of the waveform during propagation in expression (9), it is possible to obtain an approximate solution. The first assumption is that the time waveform intensity of the signal of the channel number 0 is invariable with respect to the distance z, that is, |B.sub.p, 0(z, t)|.sup.2=|B.sub.p, 0(0, t)|.sup.2; this assumption is commonly used in calculations by the split-step Fourier method.

(65) The second assumption is that the time waveform of the signal intensity of a channel number n≠0 keeps its shape with respect to the distance z but experiences a group delay due to walk-off, that is, |B.sub.p, n(z, t)|.sup.2=|B.sub.p, n(0, t−d.sub.nz)|.sup.2. This equation expresses that, while the shape of the intensity does not depend on the propagation distance and is kept as the initial waveform, a delay d.sub.nz proportional to the propagation distance occurs in the time domain. Here, d.sub.n expressed below is a parameter that represents walk-off, and corresponds to a reciprocal of the group velocity.

(66) $[Expression 10]$ $\begin{matrix} d_{n} = \beta_{2} \omega_{n} + \frac{\beta_{3}}{2} \omega_{n}^{2} & (10) \end{matrix}$
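The walk-off parameter of expression (10) can be evaluated directly for each channel. The sketch below is only an illustration: the function name and the numerical values (a 50 GHz channel spacing and typical single-mode-fiber dispersion coefficients) are assumptions chosen for the example and do not come from the embodiment.

```python
import math

def walk_off(n, beta2, beta3, delta_omega):
    """Expression (10): d_n = beta2*w_n + (beta3/2)*w_n**2 with w_n = n*delta_omega."""
    w_n = n * delta_omega
    return beta2 * w_n + 0.5 * beta3 * w_n**2

# Hypothetical values: 50 GHz spacing, beta2 in ps^2/km, beta3 in ps^3/km.
delta_omega = 2 * math.pi * 0.05     # rad/ps
beta2, beta3 = -21.7, 0.13
for n in (-2, -1, 1, 2):
    print(n, walk_off(n, beta2, beta3, delta_omega), "ps/km")
```

When the third-order term is neglected, d.sub.n is antisymmetric in n, so channels on opposite sides of channel 0 walk off in opposite directions.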

(67) As a result of introducing these assumptions, it is possible to integrate expression (9) with respect to z in the frequency domain, and obtain the following solution for the waveform after propagation over the distance h.

(68) $[Expression 11]$ $\begin{matrix} B_{p, 0} (h, t) = B_{p, 0} (0, t) \exp [i \{ \phi_{SPM} (h, t) + \phi_{XPM} (h, t) \}] & (11) \end{matrix}$ $[Expression 12]$ $\begin{matrix} \phi_{SPM} (h, t) = g H_{0} (h) [P_{p, 0} (t) + P_{3 - p, 0} (t)] & (12) \end{matrix}$ $[Expression 13]$ $\begin{matrix} \phi_{XPM} (h, t) = g \mathcal{F}^{- 1} [\sum_{n \neq 0} H_{n} (h, \omega) \{ 2 {\tilde{P}}_{p, n} (\omega) + {\tilde{P}}_{3 - p, n} (\omega) \}] & (13) \end{matrix}$

(69) In this regard, when the integration over the length h is performed with respect to the distance z to derive expressions (11) to (13), the integration interval is [−h/2, h/2]. Moreover, the symbols used in these expressions are defined as follows.

(70) $\begin{matrix} [Expression 14] \\ \begin{matrix} g = - \frac{8}{9} \gamma_{0}, \\ P_{p, 0} (t) = {| B_{p, 0} (t) |}^{2}, & {\tilde{P}}_{p, n} (\omega) = \mathcal{F} [{| B_{p, n} (t) |}^{2}], \\ H_{0} (h) = \frac{2}{\alpha} \sinh (\frac{\alpha h}{2}), & H_{n} (h, \omega) = \frac{2}{k_{n}} \sinh (\frac{k_{n} h}{2}), \\ k_{n} (\omega) = \alpha_{n} + i d_{n} \omega \end{matrix} & (14) \end{matrix}$
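The two functions H.sub.0 and H.sub.n of expression (14) are simple to evaluate with complex arithmetic. The following sketch is an illustration only; the function names and the numerical values are assumptions, and the handling of the limit k→0 is added here for numerical safety.

```python
import cmath
import math

def h0(alpha, h):
    """H_0(h) = (2/alpha) * sinh(alpha*h/2); tends to h as alpha -> 0."""
    if alpha == 0:
        return h
    return 2.0 / alpha * math.sinh(alpha * h / 2)

def hn(alpha_n, d_n, omega, h):
    """H_n(h, w) = (2/k_n) * sinh(k_n*h/2) with k_n = alpha_n + i*d_n*w."""
    k = complex(alpha_n, d_n * omega)
    if k == 0:
        return complex(h)
    return 2.0 / k * cmath.sinh(k * h / 2)

# H_0 equals the integral of exp(-alpha*z) over [-h/2, h/2] (midpoint rule check).
alpha, h = 0.046, 40.0
steps = 20000
dz = h / steps
numeric = sum(math.exp(-alpha * (-h / 2 + (i + 0.5) * dz)) for i in range(steps)) * dz
```

With α.sub.n=0 the filter reduces to h·sin(x)/x with x=d.sub.nωh/2, so a larger walk-off or a longer step attenuates the high-frequency part of the XPM contribution more strongly.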

(71) The linear step of the split-step Fourier method is described by expression (6), and the nonlinear step that takes the effects of both SPM and XPM into account is described by expression (11) for the signal waveform of the channel number 0. Hereinafter, a method for back-propagating, based on these expressions, an optical signal waveform received through a certain optical fiber transmission line and thereby estimating the transmitted waveform will be described.

(72) To describe the procedure of the back propagation calculation, FIG. 2 schematically illustrates the splitting of steps in a case where the number of steps per span is two in a transmission line whose number of spans is two, and the calculation order of the linear steps and the nonlinear steps. Note that FIG. 2 can be easily generalized to a different number of spans or a different number of steps. In FIG. 2, the direction from the left to the right is the forward propagation direction, and spans 1 and 2 are defined from the start point to the end point. Conversely, the direction from the right to the left is the back propagation direction, and steps 1 and 2 are defined in the back propagation direction in each span. The steps in each span are not of equal length; their lengths are set such that the integral of the nonlinear coefficient γ(z)=γ.sub.0exp(−αz), which takes into account the attenuation of signal power due to fiber loss, is equal for each step. In a case where, for example, the number of steps is two, the start point of the span is z=0, the end point is z=z.sub.s, and the split point of the steps is z=z.sub.1, z.sub.1 is defined such that the following expression holds.

(73) $[Expression 15]$ $\begin{matrix} \int_{0}^{z_{1}} \gamma (z) d z = \int_{z_{1}}^{z_{s}} \gamma (z) d z & (15) \end{matrix}$
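Because γ(z) is a plain exponential, the condition of expression (15) has a closed-form solution, and the same reasoning extends to any number of steps per span. The sketch below illustrates this under stated assumptions: the function name, the 80 km span length, and the 0.2 dB/km loss are hypothetical values chosen for the example, and z is measured from the start point of the span.

```python
import math

def step_boundaries(alpha, span_length, n_steps):
    """Boundaries z_0..z_N giving every step the same integral of exp(-alpha*z).

    The constant gamma_0 cancels on both sides of expression (15).
    """
    total = 1.0 - math.exp(-alpha * span_length)   # alpha times the full-span integral
    return [-math.log(1.0 - k / n_steps * total) / alpha for k in range(n_steps + 1)]

alpha = 0.2 * math.log(10) / 10     # 0.2 dB/km converted to 1/km (assumed value)
z = step_boundaries(alpha, 80.0, 2)
print(z)    # z[1] is the split point z_1 of expression (15)
```

For these values the split point lies well before the middle of the span, because most of the nonlinear phase accumulates where the signal power is still high.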

(74) Note that the present embodiment is applicable as is even in a case where the steps in a span are of equal length. Moreover, how the steps are split within a span affects the waveform distortion correction result of the methods described in Non-Patent Literatures 2 and 4, which do not perform learning. In contrast, for Non-Patent Literatures 5 to 8 and the present embodiment, which perform learning, the split affects only the initial values set for the learning, and makes no difference once the learning has converged and the parameters have been optimized. Moreover, in a case where the number of steps per span is one, the entire span is calculated as one step.

(75) Next, when calculation of each step is performed, calculation is performed based on the symmetrical type split-step Fourier method (see Non-Patent Literature 3). That is, one step whose distance is h is equally split into a first half and a second half, and calculation that uses expression (6) as the linear step is first performed on a distance h/2 of the first half of the step.

(76) Next, by using the waveform obtained as the output of the linear step as an input, calculation according to expression (11) of the nonlinear step is performed on the distance h of the entire step. Lastly, by using the waveform obtained as the output of the nonlinear step as an input, calculation according to expression (6) of the linear step is performed on the distance h/2 of the second half of the step. This operation is repeated for each step; however, the linear step of the second half of one step is calculated together with the linear step of the first half of the next step.

(77) In the example in FIG. 2, step #1 in span #2 is split into two halves, calculation for the linear step L.sup.(0) of the first half is performed on the input waveform, and calculation for the nonlinear step N.sup.(1) is performed; then calculation for L.sup.(1), obtained by coupling the linear step of the second half of step #1 and the linear step of the first half of step #2 of span #2, is performed, and calculation is performed likewise thereafter. In a transmission line whose number of spans is N.sub.span and whose number of steps per span is N.sub.step, with m=N.sub.span×N.sub.step, there are (m+1) linear steps and m nonlinear steps in total. In the example in FIG. 2, N.sub.span=2 and N.sub.step=2 hold, and therefore m=4 holds; there are m+1=5 linear steps from L.sup.(0) to L.sup.(4), and m=4 nonlinear steps from N.sup.(1) to N.sup.(4).
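The calculation order described above can be enumerated mechanically. The following sketch is only an illustration; the function name and the string labels are assumptions introduced here.

```python
def back_propagation_schedule(n_span, n_step):
    """Order of linear (L) and nonlinear (N) steps for m = n_span * n_step."""
    m = n_span * n_step
    order = ["L(0)"]
    for j in range(1, m + 1):
        order.append(f"N({j})")   # nonlinear step over the whole of step j
        order.append(f"L({j})")   # merged half-steps (a single half-step when j == m)
    return order

order = back_propagation_schedule(2, 2)
print(" -> ".join(order))
# L(0) -> N(1) -> L(1) -> N(2) -> L(2) -> N(3) -> L(3) -> N(4) -> L(4)
```

For two spans with two steps each, the list contains five linear steps and four nonlinear steps, matching the counts m+1 and m.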

(78) Although the signal waveform of each channel evolves independently in the linear step of expression (6), in the nonlinear step indicated in expressions (11) to (13) the signal waveforms of the other channels are also used, so that XPM is taken into account, when the evolution of the signal waveform of one channel is calculated. FIG. 3 illustrates a back propagation calculation procedure that indicates this process. As described above, the linear steps and the nonlinear steps are alternately calculated. In the back propagation calculation, the input waveform of the channel number n and the polarization component p is denoted by x.sub.p, n, and the output waveform is denoted by y.sub.p, n. In this regard, x.sub.p, n represents the waveform received by a receiver after the transmission, and y.sub.p, n corresponds to an estimation result of the transmitted waveform.

(79) In a case where the input and output waveforms of the j-th linear step L.sup.(j) are denoted by z.sub.p, n.sup.(j) and y.sub.p, n.sup.(j), the input and output waveforms of the nonlinear step N.sup.(j) are y.sub.p, n.sup.(j−1) and z.sub.p, n.sup.(j). Moreover, as illustrated in FIG. 3, z.sub.p, n.sup.(0)=x.sub.p, n and y.sub.p, n.sup.(m)=y.sub.p, n hold. In a case where the output waveform y.sub.p, n.sup.(j) of L.sup.(j) is input to N.sup.(j+1), a route indicated by a solid line arrow is a flow of waveform data to calculate the phase shift due to SPM in a channel, which is indicated by expression (12), and a route indicated by a dotted line arrow is a flow of waveform data to calculate the phase shift due to XPM between channels, which is indicated by expression (13).

(80) Next, a method according to the present embodiment for optimizing the parameters used for the back propagation calculation by the gradient descent method, and thereby maximizing the performance of the nonlinear waveform distortion correction, will be described. A specific learning method will be described below. An error function J(θ) is defined as follows.
$[Expression 16]$ $\begin{matrix} J (\theta) = \sum_{t} {| e_{t} (\theta) |}^{2} & (16) \end{matrix}$

(81) Here, θ represents a vector including the parameters used for the back propagation calculation, and e.sub.t(θ)=y.sub.t(θ)−d.sub.t represents the error at a time t. y.sub.t(θ) represents the value of the signal waveform at the time t after the back propagation calculation and is regarded as a function of the parameter θ, and d.sub.t represents the value of a desired signal at the time t. Here, the waveform of the transmitted signal is used as the desired signal.

(82) Here, the form of the error function in expression (16) is referred to as a Mean Squared Error (MSE). A dataset is composed of plural sets of the waveform d.sub.t of the transmitted signal and the waveform y.sub.t(θ) obtained after the back propagation calculation with the parameter θ is performed, and the parameter θ is optimized so as to minimize the error function J(θ) by the stochastic gradient descent method based on repeated calculation. As described in Non-Patent Literature 1, an update formula for the parameter θ based on the stochastic gradient descent method can be obtained as follows.

(83) $\begin{matrix} [Expression 17] \\ \theta_{i + 1} = \theta_{i} - \frac{\mu}{2} \nabla J & (17) \end{matrix}$

(84) In this regard, θ.sub.i represents the parameter vector obtained as the i-th update result, μ represents a small positive number that determines the learning speed, and ∇J represents the gradient of the error function with respect to the parameter θ and is calculated as represented in the following expression.

(85) $\begin{matrix} [Expression 18] \\ \nabla J = \frac{\partial J}{\partial \theta} = \sum_{t} \frac{\partial}{\partial \theta} {| e_{t} (\theta) |}^{2} = 2 \sum_{t} \mathrm{Re} [\frac{\partial e_{t}}{\partial \theta} e_{t}^{*}] & (18) \end{matrix}$

(86) Here, the desired signal, that is, the transmitted signal, does not depend on the parameter θ, and the following expression holds.

(87) $\begin{matrix} [Expression 19] \\ \frac{\partial e_{t}}{\partial \theta} = \frac{\partial y_{t}}{\partial \theta} & (19) \end{matrix}$

(88) Therefore, the update formula for the parameter θ can be eventually obtained as follows.

(89) $\begin{matrix} [Expression 20] \\ \theta_{i + 1} = \theta_{i} - \mu \sum_{t} \mathrm{Re} [\frac{\partial y_{t}}{\partial \theta} (y_{t}^{*} - d_{t}^{*})] & (20) \end{matrix}$
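To make the update formula of expression (20) concrete, the sketch below applies it to a deliberately simple toy model with a single real parameter, y.sub.t=x.sub.t·exp(iθ), whose gradient is i·y.sub.t. This model, the function name, the step size, and the sample values are assumptions for illustration only; it is not the back propagation calculation itself, and it uses the whole dataset at every update rather than stochastic mini-batches.

```python
import cmath

def update(theta, x, d, mu):
    """One update of expression (20) for the toy model y_t = x_t * exp(i*theta)."""
    y = [xt * cmath.exp(1j * theta) for xt in x]
    dy = [1j * yt for yt in y]                       # dy_t/dtheta for this model
    grad = sum((g * (yt - dt).conjugate()).real
               for g, yt, dt in zip(dy, y, d))       # sum_t Re[dy_t (y_t* - d_t*)]
    return theta - mu * grad

x = [1 + 0j, 0.5j, -1 + 1j, 0.3 - 0.8j]              # "received" samples
theta_true = 0.4
d = [xt * cmath.exp(1j * theta_true) for xt in x]    # desired (transmitted) samples

theta = 0.0
for _ in range(200):
    theta = update(theta, x, d, mu=0.1)
print(theta)    # approaches theta_true
```

In the embodiment the same update is applied to every element of the parameter vector, with the gradient of the output waveform obtained by the chain rule through the linear and nonlinear steps.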

(90) When all parameters that are elements of θ and are used in the back propagation calculation are updated using expression (20), the gradient ∂y.sub.t/∂θ of the output signal y.sub.t with respect to each parameter is required. A formula for calculating this gradient is derived according to the differential chain rule.

(91) According to expression (6), the relationship between the input and output waveforms z.sub.p, n.sup.(j) and y.sub.p, n.sup.(j) in the linear step L.sub.p, n.sup.(j) is as follows.

(92) $\begin{matrix} [Expression 21] \\ y_{p, n}^{(j)} = e^{- \frac{\alpha}{2} h} \mathcal{F}^{- 1} e^{i (D_{2}^{(j)} \omega^{2} + D_{3}^{(j)} \omega^{3} - T_{n}^{(j)} \omega)} \mathcal{F} z_{p, n}^{(j)} & (21) \end{matrix}$

(93) By directly differentiating expression (21), the following expression can be obtained.

(94) $\begin{matrix} [Expression 22] \\ \frac{\partial y_{p, n}^{(j)}}{\partial D_{2}^{(j)}} = e^{- \frac{\alpha}{2} h} \mathcal{F}^{- 1} (i \omega^{2}) e^{i (D_{2}^{(j)} \omega^{2} + D_{3}^{(j)} \omega^{3} - T_{n}^{(j)} \omega)} \mathcal{F} z_{p, n}^{(j)} & (22) \end{matrix}$ $\begin{matrix} [Expression 23] \\ \frac{\partial y_{p, n}^{(j)}}{\partial D_{3}^{(j)}} = e^{- \frac{\alpha}{2} h} \mathcal{F}^{- 1} (i \omega^{3}) e^{i (D_{2}^{(j)} \omega^{2} + D_{3}^{(j)} \omega^{3} - T_{n}^{(j)} \omega)} \mathcal{F} z_{p, n}^{(j)} & (23) \end{matrix}$
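The gradient of expression (22) can be checked numerically against a finite difference of the linear step of expression (21). The sketch below does so on a short test waveform; the helper names, the naive discrete Fourier transform, the unit sampling interval, and all numerical values are assumptions for illustration.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def phase_factor(wk, d2, d3, t_n):
    return cmath.exp(1j * (d2 * wk**2 + d3 * wk**3 - t_n * wk))

def linear_step(z, w, d2, d3, t_n, loss):
    """Expression (21); loss stands for exp(-alpha*h/2)."""
    return [loss * v for v in
            idft([Zk * phase_factor(wk, d2, d3, t_n) for Zk, wk in zip(dft(z), w)])]

def grad_d2(z, w, d2, d3, t_n, loss):
    """Expression (22): the extra factor i*w^2 is applied in the frequency domain."""
    return [loss * v for v in
            idft([1j * wk**2 * Zk * phase_factor(wk, d2, d3, t_n)
                  for Zk, wk in zip(dft(z), w)])]

n = 8
w = [2 * math.pi * (k if k < n // 2 else k - n) / n for k in range(n)]
z = [complex(math.cos(0.7 * t), math.sin(1.3 * t)) for t in range(n)]
args = dict(d3=0.01, t_n=0.5, loss=0.9)

analytic = grad_d2(z, w, 0.2, **args)
eps = 1e-6
plus = linear_step(z, w, 0.2 + eps, **args)
minus = linear_step(z, w, 0.2 - eps, **args)
numeric = [(p - q) / (2 * eps) for p, q in zip(plus, minus)]
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
print(max_err)    # small compared with the gradient itself
```

The gradient with respect to D.sub.3.sup.(j) in expression (23) follows the same pattern with the factor iω.sup.3.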

(95) Consequently, it is possible to calculate the gradients of the output waveform y.sub.p, n.sup.(j) with respect to D.sub.2.sup.(j) and D.sub.3.sup.(j) using the input waveform z.sub.p, n.sup.(j). The gradients obtained here are sent to the next step according to the differential chain rule, and are finally converted into gradients of the output waveform y.sub.p, n=y.sub.p, n.sup.(m). The gradient of y.sub.p, n.sup.(j) with respect to an arbitrary parameter θ included in a step before L.sub.p, n.sup.(j) can be obtained as follows.

(96) $\begin{matrix} [Expression 24] \\ \frac{\partial y_{p, n}^{(j)}}{\partial \theta} = e^{- \frac{\alpha}{2} h} \mathcal{F}^{- 1} e^{i (D_{2}^{(j)} \omega^{2} + D_{3}^{(j)} \omega^{3} - T_{n}^{(j)} \omega)} \mathcal{F} \frac{\partial z_{p, n}^{(j)}}{\partial \theta} & (24) \end{matrix}$

(97) Consequently, it is possible to calculate the gradient of the output waveform y.sub.p, n.sup.(j) for the parameter using a gradient z.sub.p, n.sup.(j)/ output from an immediately prior nonlinear step. In this regard, a walk-off value T.sub.n.sup.(j) is expressed as follows using walk-off parameters d.sub.n.sup.(j) and d.sub.n.sup.(j+1) included in the nonlinear steps N.sub.p, n.sup.(j) and N.sub.p, n.sup.(j+1) before and after the linear step L.sub.p, n.sup.(j).

(98) $\begin{matrix} \text{[Expression 25]} \\ T_{n}^{(0)} = \frac{h_{1}}{2} d_{n}^{(1)}, \quad T_{n}^{(j)} = \frac{h_{j} d_{n}^{(j)} + h_{j+1} d_{n}^{(j+1)}}{2}, \quad T_{n}^{(m)} = \frac{h_{m}}{2} d_{n}^{(m)} & (25) \end{matrix}$

(99) Here, $h_{j}$ represents the interval width of the nonlinear step $N_{p,n}^{(j)}$.
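The linear step and its gradients can be sketched numerically as below. This is only an illustration under stated assumptions: NumPy's FFT sign convention is used for the Fourier transform, the waveform is a plain complex array on a uniform time grid, and the function names (`linear_step`, `linear_step_grads`, `walk_off`) are hypothetical. The same operator that develops the waveform in expression (21) is reused to propagate an incoming gradient as in expression (24).

```python
import numpy as np

def linear_step(z, omega, D2, D3, T, alpha=0.0, h=1.0):
    """Expression (21); also expression (24) when z is a gradient dz/dtheta."""
    phase = D2 * omega**2 + D3 * omega**3 - T * omega
    gain = np.exp(-0.5 * alpha * h)
    return gain * np.fft.ifft(np.exp(1j * phase) * np.fft.fft(z))

def linear_step_grads(z, omega, D2, D3, T, alpha=0.0, h=1.0):
    """Expressions (22) and (23): gradients of the output for D2 and D3."""
    phase = D2 * omega**2 + D3 * omega**3 - T * omega
    spec = np.exp(-0.5 * alpha * h) * np.exp(1j * phase) * np.fft.fft(z)
    return np.fft.ifft(1j * omega**2 * spec), np.fft.ifft(1j * omega**3 * spec)

def walk_off(h, d):
    """Expression (25): T^(0), ..., T^(m) from widths h_1..h_m and d^(1)..d^(m)."""
    hd = np.asarray(h, dtype=float) * np.asarray(d, dtype=float)
    return np.concatenate(([hd[0] / 2], (hd[:-1] + hd[1:]) / 2, [hd[-1] / 2]))
```

A finite-difference comparison of `linear_step` at slightly perturbed values of `D2` or `D3` is a convenient way to check the analytic gradients on a small random waveform.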

(100) Next, according to expression (11), the input waveform $y_{p,n}^{(j-1)}$ and the output waveform $z_{p,n}^{(j)}$ of the nonlinear step $N_{p,n}^{(j)}$ are related as follows for the signal waveform of channel 0.

(101) $\begin{matrix} \text{[Expression 26]} \\ z_{p,0}^{(j)} = y_{p,0}^{(j-1)} \exp [i \phi_{p,0}^{(j)}(t)] & (26) \end{matrix}$

$\begin{matrix} \text{[Expression 27]} \\ \phi_{p,0}^{(j)}(t) = g^{(j)} \mathcal{F}^{-1} \left[ H_{0}^{(j)} \left\{ \tilde{P}_{p,0}^{(j-1)}(\omega) + \varepsilon^{(j)} \tilde{P}_{3-p,0}^{(j-1)}(\omega) \right\} + \sum_{n \neq 0} H_{n}^{(j)}(\omega) \left\{ 2 \tilde{P}_{p,n}^{(j-1)}(\omega) + \varepsilon^{(j)} \tilde{P}_{3-p,n}^{(j-1)}(\omega) \right\} \right] & (27) \end{matrix}$

$\begin{matrix} \text{[Expression 28]} \\ \begin{matrix} H_{0}^{(j)} = \frac{2}{\alpha_{0}^{(j)}} \sinh \left( \frac{h}{2} \alpha_{0}^{(j)} \right), \\ H_{n}^{(j)}(\omega) = \frac{2}{k_{n}^{(j)}} \sinh \left( \frac{h}{2} k_{n}^{(j)} \right), \\ k_{n}^{(j)} = \alpha_{n}^{(j)} + i d_{n}^{(j)} \omega \end{matrix} & (28) \end{matrix}$

(102) In this regard, $P_{p,n}^{(j-1)} = |y_{p,n}^{(j-1)}|^{2}$ represents the intensity of the input waveform $y_{p,n}^{(j-1)}$, and the following relationship also holds.
$\begin{matrix} \text{[Expression 29]} \\ \tilde{P}_{p,n}^{(j-1)}(\omega) = \mathcal{F} [P_{p,n}^{(j-1)}(t)] & (29) \end{matrix}$
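As a hedged illustration of the nonlinear step for channel 0, the sketch below evaluates the intensity spectra, filters them with the SPM and XPM responses, and applies the resulting phase rotation. The array layout, the function names, the use of NumPy's FFT convention, and the symbol `eps` for the cross-polarization coefficient are assumptions made for this example; the loss coefficients are taken to be nonzero so that the filter is well defined at zero frequency.

```python
import numpy as np

def H(omega, alpha, d, h):
    """Expression (28): (2/k) sinh(h k / 2) with k = alpha + i d omega (alpha != 0)."""
    k = alpha + 1j * d * omega
    return 2.0 / k * np.sinh(0.5 * h * k)

def nonlinear_step(y, omega, g, eps, alpha, d, h):
    """Expressions (26) to (29) for channel 0.

    y has shape (2, n_ch, n_t): polarization, channel (index 0 is the channel
    being corrected) and time. alpha and d hold one value per channel, d[0] = 0.
    """
    P_f = np.fft.fft(np.abs(y) ** 2, axis=-1)        # intensity spectra, expression (29)
    z = np.empty((2, y.shape[-1]), dtype=complex)
    for p in range(2):
        q = 1 - p                                     # orthogonal polarization ("3 - p")
        acc = H(omega, alpha[0], 0.0, h) * (P_f[p, 0] + eps * P_f[q, 0])   # SPM term
        for n in range(1, y.shape[1]):                # XPM terms from the other channels
            acc = acc + H(omega, alpha[n], d[n], h) * (2 * P_f[p, n] + eps * P_f[q, n])
        # The phase is real; a small imaginary residue from the Nyquist bin is discarded.
        phi = g * np.fft.ifft(acc).real               # expression (27)
        z[p] = y[p, 0] * np.exp(1j * phi)             # expression (26)
    return z
```

For constant-power waveforms the filters act only at zero frequency, so the phase reduces to a single number that can be checked by hand.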

(103) Then, the gradients of the output waveform $z_{p,n}^{(j)}$ with respect to the parameters $g^{(j)}$, $\varepsilon^{(j)}$, $\alpha_{0}^{(j)}$, $\alpha_{n}^{(j)}$, and $d_{n}^{(j)}$ used in the nonlinear step $N_{p,n}^{(j)}$ can be obtained as follows by directly differentiating expression (26).

(104) $\begin{matrix} \text{[Expression 30]} \\ \frac{\partial z_{p,0}^{(j)}}{\partial g^{(j)}} = i \mathcal{F}^{-1} \left[ H_{0}^{(j)} \left\{ \tilde{P}_{p,0}^{(j-1)}(\omega) + \varepsilon^{(j)} \tilde{P}_{3-p,0}^{(j-1)}(\omega) \right\} + \sum_{n \neq 0} H_{n}^{(j)}(\omega) \left\{ 2 \tilde{P}_{p,n}^{(j-1)}(\omega) + \varepsilon^{(j)} \tilde{P}_{3-p,n}^{(j-1)}(\omega) \right\} \right] z_{p,0}^{(j)} & (30) \end{matrix}$

$\begin{matrix} \text{[Expression 31]} \\ \frac{\partial z_{p,0}^{(j)}}{\partial \varepsilon^{(j)}} = i g^{(j)} \mathcal{F}^{-1} \left[ H_{0}^{(j)} \tilde{P}_{3-p,0}^{(j-1)}(\omega) + \sum_{n \neq 0} H_{n}^{(j)}(\omega) \tilde{P}_{3-p,n}^{(j-1)}(\omega) \right] z_{p,0}^{(j)} & (31) \end{matrix}$

$\begin{matrix} \text{[Expression 32]} \\ \frac{\partial z_{p,0}^{(j)}}{\partial \alpha_{0}^{(j)}} = i g^{(j)} \frac{\partial H_{0}^{(j)}}{\partial \alpha_{0}^{(j)}} \left[ P_{p,0}^{(j-1)}(t) + \varepsilon^{(j)} P_{3-p,0}^{(j-1)}(t) \right] z_{p,0}^{(j)} & (32) \end{matrix}$

$\begin{matrix} \text{[Expression 33]} \\ \frac{\partial z_{p,0}^{(j)}}{\partial \alpha_{n}^{(j)}} = i g^{(j)} \mathcal{F}^{-1} \left[ \frac{\partial H_{n}^{(j)}(\omega)}{\partial \alpha_{n}^{(j)}} \left\{ 2 \tilde{P}_{p,n}^{(j-1)}(\omega) + \varepsilon^{(j)} \tilde{P}_{3-p,n}^{(j-1)}(\omega) \right\} \right] z_{p,0}^{(j)} & (33) \end{matrix}$

$\begin{matrix} \text{[Expression 34]} \\ \frac{\partial z_{p,0}^{(j)}}{\partial d_{n}^{(j)}} = i g^{(j)} \mathcal{F}^{-1} \left[ \frac{\partial H_{n}^{(j)}(\omega)}{\partial d_{n}^{(j)}} \left\{ 2 \tilde{P}_{p,n}^{(j-1)}(\omega) + \varepsilon^{(j)} \tilde{P}_{3-p,n}^{(j-1)}(\omega) \right\} \right] z_{p,0}^{(j)} & (34) \end{matrix}$

(105) Consequently, the gradients can be calculated using the output waveform $z_{p,n}^{(j)}$, the intensity $P_{p,n}^{(j-1)}$ of the input waveform, and its spectrum $\tilde{P}_{p,n}^{(j-1)}$.
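The gradients used in the nonlinear step can be checked numerically. The following sketch, under the same assumptions as before (NumPy FFT convention, hypothetical names, `eps` standing for the cross-polarization coefficient, nonzero loss coefficients), returns the channel-0 output together with its gradients for the nonlinear parameter, the cross-polarization coefficient, and one walk-off parameter. The loss-coefficient gradients follow the same pattern with the derivative of the filter taken with respect to the loss coefficient instead.

```python
import numpy as np

def H(omega, alpha, d, h):
    """Filter (2/k) sinh(h k / 2) with k = alpha + i d omega."""
    k = alpha + 1j * d * omega
    return 2.0 / k * np.sinh(0.5 * h * k)

def dH_dd(omega, alpha, d, h):
    """dH/dd by the chain rule through k = alpha + i d omega."""
    k = alpha + 1j * d * omega
    dH_dk = h / k * np.cosh(0.5 * h * k) - 2.0 / k**2 * np.sinh(0.5 * h * k)
    return 1j * omega * dH_dk

def filtered(P_f, filt, w_same, w_cross):
    """F^-1[ sum_n filt[n] (w_same[n] P~[p, n] + w_cross[n] P~[3-p, n]) ] for both p."""
    mix = w_same[None, :, None] * P_f + w_cross[None, :, None] * P_f[::-1]
    return np.fft.ifft((filt[None] * mix).sum(axis=1), axis=-1).real

def step_with_grads(y, omega, g, eps, alpha, d, h, n):
    """Channel-0 output z and its gradients for g, eps and d[n] (n >= 1).

    y has shape (2, n_ch, n_t); channel index 0 is the channel being corrected.
    """
    n_ch = y.shape[1]
    P_f = np.fft.fft(np.abs(y) ** 2, axis=-1)
    filt = np.array([H(omega, alpha[c], d[c], h) for c in range(n_ch)])
    same = np.array([1.0] + [2.0] * (n_ch - 1))      # SPM weight 1, XPM weight 2
    cross = np.full(n_ch, float(eps))
    bracket = filtered(P_f, filt, same, cross)
    z = y[:, 0] * np.exp(1j * g * bracket)                                      # (26), (27)
    dz_dg = 1j * bracket * z                                                    # (30)
    dz_deps = 1j * g * filtered(P_f, filt, np.zeros(n_ch), np.ones(n_ch)) * z   # (31)
    only_n = np.zeros_like(filt)
    only_n[n] = dH_dd(omega, alpha[n], d[n], h)
    dz_dd = 1j * g * filtered(P_f, only_n, same, cross) * z                     # (34)
    return z, dz_dg, dz_deps, dz_dd
```

Each returned gradient can be compared against a central finite difference of the output obtained by perturbing the corresponding parameter.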

(106) Moreover, for the gradient of $z_{p,n}^{(j)}$ with respect to an arbitrary parameter $\theta$ included in a step before $N_{p,n}^{(j)}$, the following equations can be obtained by differentiating expression (26) with respect to $\theta$.

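(107) $\begin{matrix} \text{[Expression 35]} \\ \frac{\partial z_{p,0}^{(j)}}{\partial \theta} = \left[ i \frac{\partial \phi_{p,0}^{(j)}}{\partial \theta} y_{p,0}^{(j-1)} + \frac{\partial y_{p,0}^{(j-1)}}{\partial \theta} \right] \exp [i \phi_{p,0}^{(j)}(t)] & (35) \end{matrix}$

$\begin{matrix} \text{[Expression 36]} \\ \frac{\partial \phi_{p,0}^{(j)}}{\partial \theta} = g^{(j)} \mathcal{F}^{-1} \left[ H_{0}^{(j)} \left\{ \frac{\partial \tilde{P}_{p,0}^{(j-1)}(\omega)}{\partial \theta} + \varepsilon^{(j)} \frac{\partial \tilde{P}_{3-p,0}^{(j-1)}}{\partial \theta} \right\} + \sum_{n \neq 0} H_{n}^{(j)}(\omega) \left\{ 2 \frac{\partial \tilde{P}_{p,n}^{(j-1)}}{\partial \theta} + \varepsilon^{(j)} \frac{\partial \tilde{P}_{3-p,n}^{(j-1)}}{\partial \theta} \right\} \right] & (36) \end{matrix}$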

(108) Note that, although expressions (26) to (36) describe the calculation formulae in the nonlinear steps for the signal of the channel number 0, it is possible to describe a calculation formula likewise for a signal of a general channel number.

(109) To sum up, expressions (22) and (23) are used in the linear step, and expressions (30) to (34) are used in the nonlinear step, to calculate the gradients of the output waveform of that step with respect to the parameters used in that step, and the calculation result is passed to the next step. Moreover, expression (24) is used in the linear step, and expressions (35) and (36) are used in the nonlinear step, to update a gradient passed from the previous step into a gradient of the output of that step, and the updated gradient is passed to the next step. By continuing such calculation from the input side to the output side, it is possible to calculate the gradients of the final output waveform $y_{p,n} = y_{p,n}^{(m)}$ with respect to all parameters used for the back propagation calculation, and to update the parameters according to expression (20).

(110) Note that, in FIG. 3, the dispersion parameters $D_{2}^{(j)}$ and $D_{3}^{(j)}$ are shared between the linear steps $L_{1,n}^{(j)}$ and $L_{2,n}^{(j)}$ for the two orthogonal polarization components of channel number $n$, and the average of the gradients of both polarization components is used when a gradient is calculated in the learning. Likewise, the parameters $g^{(j)}$, $\varepsilon^{(j)}$, $\alpha_{0}^{(j)}$, $\alpha_{n}^{(j)}$, and $d_{n}^{(j)}$ are shared between the nonlinear steps $N_{1,n}^{(j)}$ and $N_{2,n}^{(j)}$.

(111) Furthermore, in the example described below, for signals other than that of channel number $n = 0$, calculation for the nonlinear step is not performed, and only calculation for the linear step is performed. Although neither XPM nor SPM is corrected for those channels when the nonlinear step is skipped, it has been confirmed that the nonlinear waveform distortion produced in the waveforms of the channels other than channel number $n = 0$ can be ignored when the influence of XPM exerted on channel number $n = 0$ by those channels is calculated.

(112) According to the above-described method, it is possible to optimize the transmission line parameters for correcting the nonlinear waveform distortion including XPM by the gradient descent method, and maximize the correction effect.

(113) A method for reducing a calculation amount by selecting parameters to be optimized, and further applying approximation thereto will be described below.

(114) Expression (23) gives a method for calculating a gradient for the third-order group velocity dispersion of a transmission line. However, the influence of the third-order group velocity dispersion on the waveform of a single channel is small enough to be ignored for a signal whose symbol rate is several tens of Gbaud or less. Therefore, an initial value can be set and then kept fixed without learning, or the third-order group velocity dispersion effect itself can be ignored. Here, it is assumed that the third-order group velocity dispersion is kept fixed without learning after the initial value is set. In this regard, the third-order group velocity dispersion causes the walk-off between channels to vary at second order with respect to the frequency difference between the channels, and this effect is taken into account when setting the walk-off $T_{n}^{(j)}$ and the initial value of the walk-off parameter $d_{n}^{(j)}$.

(115) Although expressions (32) and (33) express methods for calculating a gradient related to a loss coefficient of each channel, the loss coefficient is a parameter that can be easily measured, and therefore it is assumed that the initial value is kept fixed without calculating the gradient.

(116) Next, the way the gradient propagates is restricted. FIGS. 4A and 4B illustrate the routes through the linear steps and the nonlinear steps in a case of $m = 2$ for the input waveforms $x_{p,n}$ ($p = 1, 2$; $n = 0, 1$) of dual-polarization signals of two channels whose channel numbers are $n = 0$ and 1. Consider the route through which a gradient for the second-order group velocity dispersion $D_{2}^{(0)}$ used in the linear step $L_{1,0}^{(0)}$ reaches the steps of later stages.

(117) The output waveform $y_{1,0}^{(0)}$ of the linear step $L_{1,0}^{(0)}$ is sent to the nonlinear steps $N_{p,n}^{(1)}$ ($p = 1, 2$; $n = 0, 1$). An output waveform $z_{p,n}^{(1)}$ is calculated in each nonlinear step according to expressions (26) to (28), and the gradient $\partial z_{p,n}^{(1)}/\partial\theta$ for $\theta = D_{2}^{(0)}$ is calculated according to expressions (35) and (36) and sent to the subsequent step. The gradient for $D_{2}^{(0)}$ thus propagates to the nonlinear step $N_{1,0}^{(2)}$ via the linear steps $L_{p,n}^{(1)}$ ($p = 1, 2$; $n = 0, 1$).

(118) The solid lines in FIG. 4A indicate the propagation routes of the gradient for $D_{2}^{(0)}$. However, when the value of $D_{2}^{(0)}$, which is shared by $L_{p,0}^{(0)}$ ($p = 1, 2$), changes, the resulting change in $y_{p,0}^{(0)}$ first alters the waveform of channel number $n = 1$ through XPM, and the influence that this alteration in turn exerts through XPM on the signal waveform $z_{p,0}^{(2)}$ of channel number $n = 0$ in the nonlinear step $N_{p,0}^{(2)}$ is very small and can be ignored. Taking this into account, the propagation routes of the gradient of a given parameter can be confined to a single channel, as indicated by the solid lines in FIG. 4B. In this case, the terms including $H_{n}^{(j)}(\omega)$ in expression (36) are ignored, and the following expression is obtained as a result.

(119) $\begin{matrix} \text{[Expression 37]} \\ \begin{matrix} \frac{\partial \phi_{p,0}^{(j)}}{\partial \theta} = g^{(j)} H_{0}^{(j)} \left\{ \frac{\partial P_{p,0}^{(j-1)}(t)}{\partial \theta} + \varepsilon^{(j)} \frac{\partial P_{3-p,0}^{(j-1)}(t)}{\partial \theta} \right\} \\ = 2 g^{(j)} H_{0}^{(j)} \mathrm{Re} \left[ \frac{\partial y_{p,0}^{(j-1)}}{\partial \theta} {y_{p,0}^{(j-1)}}^{*} + \varepsilon^{(j)} \frac{\partial y_{3-p,0}^{(j-1)}}{\partial \theta} {y_{3-p,0}^{(j-1)}}^{*} \right] \end{matrix} & (37) \end{matrix}$
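The intra-channel update of a gradient passed in from an earlier step can be sketched as follows. This is a simplified illustration with hypothetical names: `y` and `dy` hold the two polarization components of channel 0 and their gradient for some earlier parameter, `H0` is the constant SPM filter value, and `eps` stands for the cross-polarization coefficient. So that the result can be verified by finite differences, the phase in this sketch contains only the intra-channel term; in the method described above the waveform itself is still developed with the full expressions (26) to (28), and only the gradient route is restricted.

```python
import numpy as np

def nonlinear_tangent(y, dy, g, eps, H0):
    """Propagate (y, dy/dtheta) through an intra-channel nonlinear step.

    y, dy: complex arrays of shape (2, n_t) for the two polarizations of channel 0.
    """
    power = np.abs(y) ** 2
    phi = g * H0 * (power + eps * power[::-1])        # intra-channel phase only
    dP = 2.0 * np.real(dy * np.conj(y))               # dP/dtheta = 2 Re[(dy/dtheta) y*]
    dphi = g * H0 * (dP + eps * dP[::-1])             # expression (37)
    rot = np.exp(1j * phi)
    z = y * rot
    dz = (1j * dphi * y + dy) * rot                   # expression (35)
    return z, dz
```

Because no Fourier transform is needed for the gradient in this approximation, the cost per step is a handful of element-wise operations on the time-domain samples.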

(120) To sum up the above-described method with the approximation, the linear step uses expression (21) to develop the waveform, expression (22) to calculate the gradient related to the second-order group velocity dispersion used in that step, and expression (24) to update the gradient related to an arbitrary parameter passed from a previous step. The nonlinear step uses expressions (26) to (28) to develop the waveform, expressions (30), (31), and (34) to calculate the gradients related to the parameters $g^{(j)}$, $\varepsilon^{(j)}$, and $d_{n}^{(j)}$ used in that step, and expressions (35) and (37) to update the gradient related to an arbitrary parameter passed from a previous step.

(121) FIG. 5A illustrates the flow of processing for calculating the gradients of a waveform with respect to the respective transmission line parameters in the parameter optimization processing. In the linear step $L^{(j)}$, the waveform is developed from $z_{p,n}^{(j)}$ to $y_{p,n}^{(j)}$ according to expression (21), and the gradients $\partial y_{p,n}^{(j)}/\partial D_{2}^{(j)}$ and $\partial y_{p,n}^{(j)}/\partial D_{3}^{(j)}$ of the waveform with respect to the transmission line parameters $D_{2}^{(j)}$ and $D_{3}^{(j)}$ used in $L^{(j)}$ are calculated according to expressions (22) and (23). Moreover, the gradients of the waveform with respect to all of the transmission line parameters $\theta$ included in the steps before $L^{(j)}$ are calculated according to expression (24), whereby $\partial z_{p,n}^{(j)}/\partial\theta$ is updated to $\partial y_{p,n}^{(j)}/\partial\theta$. The gradients $\partial y_{p,n}^{(j)}/\partial D_{2}^{(j)}$ and $\partial y_{p,n}^{(j)}/\partial D_{3}^{(j)}$ are then included in $\partial y_{p,n}^{(j)}/\partial\theta$, and all of them are sent collectively to $N^{(j+1)}$, which is the next step.

(122) Next, in the nonlinear step $N^{(j+1)}$, the waveform is developed from $y_{p,n}^{(j)}$ to $z_{p,n}^{(j+1)}$ according to expressions (26) to (28), and the gradients $\partial z_{p,n}^{(j+1)}/\partial g^{(j+1)}$, $\partial z_{p,n}^{(j+1)}/\partial \varepsilon^{(j+1)}$, $\partial z_{p,n}^{(j+1)}/\partial \alpha_{0}^{(j+1)}$, $\partial z_{p,n}^{(j+1)}/\partial \alpha_{n}^{(j+1)}$, and $\partial z_{p,n}^{(j+1)}/\partial d_{n}^{(j+1)}$ of the waveform with respect to the transmission line parameters $g^{(j+1)}$, $\varepsilon^{(j+1)}$, $\alpha_{0}^{(j+1)}$, $\alpha_{n}^{(j+1)}$, and $d_{n}^{(j+1)}$ used in $N^{(j+1)}$ are calculated according to expressions (30) to (34). Moreover, the gradients of the waveform with respect to all of the transmission line parameters $\theta$ included in the steps before $N^{(j+1)}$ are calculated according to expressions (35) and (37), whereby $\partial y_{p,n}^{(j)}/\partial\theta$ is updated to $\partial z_{p,n}^{(j+1)}/\partial\theta$. Note that expression (36) may be used instead of expression (37). The gradients with respect to $g^{(j+1)}$, $\varepsilon^{(j+1)}$, $\alpha_{0}^{(j+1)}$, $\alpha_{n}^{(j+1)}$, and $d_{n}^{(j+1)}$ are then included in $\partial z_{p,n}^{(j+1)}/\partial\theta$, and all of them are sent collectively to $L^{(j+1)}$, which is the next step.

(123) By repeating such calculation step by step, $\partial y_{t}/\partial\theta$, which is the gradient of the final output $y_{t}$ with respect to each parameter $\theta$ in the transmission line, is calculated for all of the parameters.

(124) Furthermore, FIG. 5B illustrates the flow of processing for updating each transmission line parameter in the parameter optimization processing. That is, the output waveform $y_{t}$, the gradient $\partial y_{t}/\partial\theta_{i}$ for each transmission line parameter $\theta_{i}$, and the desired signal $d_{t}$ are used to update the transmission line parameter $\theta_{i}$ of the $i$-th step using expression (20). As illustrated in FIG. 5B, the transmission line parameters $D_{2}$ and $D_{3}$ are updated in the linear step, and $g$, $\varepsilon$, $\alpha_{0}$, $\alpha_{n}$, and $d_{n}$ are updated as the transmission line parameters in the nonlinear step.

(125) Note that, in a case where the loss coefficients $\alpha_{0}$ and $\alpha_{n}$ of the optical fiber are known and the value of the coefficient $\varepsilon$ that corresponds to cross-polarization cross phase modulation is known to be one, learning of these transmission line parameters may be omitted. Moreover, in a case where the third-order dispersion effect within a channel can be ignored, the learning of $D_{3}$ may be omitted.

System Configuration According to Embodiment

(126) FIG. 5C illustrates a schematic view of an optical transmission system according to the present embodiment. A WDM signal generated by a transmitter propagates through a transmission line including optical fibers and optical amplifiers, and reaches a receiver. The receiver splits the WDM signal into its channels using a demultiplexing device such as an arrayed-waveguide grating. The demultiplexed optical signal waveform of each channel is converted into an electrical signal waveform by a coherent receiver, and is then input to a Digital Signal Processor (DSP) for signal processing. In the DSP, the electrical signal waveform is converted into numerical data by Analog-to-Digital (AD) conversion, and the various kinds of demodulation processing that finally convert the numerical data into a received bit sequence and output that sequence are performed. The demodulation processing includes temporal sampling timing control, sampling rate conversion, clock synchronization, filtering, polarization rotation, carrier recovery, linear waveform distortion correction such as adaptive equalization, the nonlinear waveform distortion correction according to the present embodiment, symbol decision, error correction processing, and the like. Note that, although the function of executing the above-described parameter optimization processing can be implemented in the DSP, it is also possible to have an external computer perform the parameter optimization processing instead of the DSP, and to download the optimized parameters for use in the nonlinear waveform distortion correction according to the present embodiment. The AD conversion function may be separated from the DSP and installed between the DSP and the coherent receiver.
In a conventional optical transmission system, a DSP is provided independently for each channel and the DSPs basically do not operate in conjunction with each other. In the present embodiment, in contrast, either a single processor collectively handles the waveform data of plural channels as illustrated in FIG. 5C, or plural independent processors exchange data with each other, in order to calculate the above-described phase shift caused by XPM.

(127) FIG. 5D illustrates a processing flow executed by the DSP in the present embodiment. First, a signal with a known waveform shape is transmitted, and the parameter optimization processing, which optimizes the transmission line parameters based on the received signal waveform demodulated by the DSP in the receiver, is executed (process S1). This processing includes the arithmetic operations according to the flows illustrated in FIGS. 5A and 5B, and is executed by a parameter optimizer configured in the DSP or by the external computer. Next, a waveform distortion corrector configured in the DSP executes the back propagation processing using the optimized transmission line parameters (process S3). As described with reference to FIGS. 2 and 3, this back propagation processing holds the optimized transmission line parameters of each linear step and each nonlinear step for each of the plural channels in wavelength-division multiplexing transmission, and performs the calculations of these steps in sequence. In this way, the waveform distortion caused by self-phase modulation occurring within the channel to be processed and the waveform distortion caused by cross-phase modulation occurring between that channel and the other channels are corrected, and the waveform at the time of transmission is estimated. This back propagation processing is executed by the DSP. By performing this processing, it is possible to improve accuracy while suppressing the calculation amount.

EXAMPLE

(128) The effect of the nonlinear waveform distortion correction for a WDM optical signal by the method described in the present embodiment will be described based on a specific example obtained by numerical optical transmission simulation. The optical signal under study is obtained by wavelength-division multiplexing nine channels of a Dual-Polarization (DP) 64-Quadrature Amplitude Modulation (QAM) signal with a symbol rate of 32 Gbaud at a frequency spacing of 50 GHz; random noise is added to this signal so that the SN ratio is 25 dB. The spectrum of the transmitted signal is assumed to be shaped by a root Nyquist filter with a roll-off factor of 0.05. Channel numbers $-4$ to $+4$ are allocated to the nine WDM channels in order from the lowest frequency, and the signal quality of the center channel, channel number 0, is examined to test the operation of the nonlinear waveform distortion correction.

(129) One span includes a Standard Single-Mode Fiber (SSMF) with a length of 80 km and an optical amplifier that compensates for the propagation loss of the optical fiber. Transmission lines of four to 10 spans are assumed, and the transmission of an optical signal through these transmission lines is calculated. In an ideal transmission line, the SSMFs of all spans are assumed to have the same parameters: the second-order and third-order group velocity dispersion values are set to 16.641 ps/nm/km and 0.06 ps/nm$^{2}$/km, respectively, the nonlinear coefficient is set to 1.3 W$^{-1}$km$^{-1}$, and the propagation loss coefficient is set to 0.192 dB/km. In a realistic transmission line, on the other hand, the various parameters differ from span to span or from place to place, the power of the optical signal also deviates from the ideal state, and consequently the effect of the group velocity dispersion and the magnitude of the nonlinear effect vary. To take this situation into account, a transmission line in which the second-order group velocity dispersion value and the nonlinear coefficient fluctuate from step to step as illustrated in FIG. 6 is considered separately from the ideal transmission line, and is referred to as a realistic transmission line hereinafter. Note that the gray solid lines in FIG. 6 indicate the average values over all 10 spans of the second-order group velocity dispersion value (a) and the nonlinear coefficient (b). In this regard, when shorter transmission lines of four to eight spans are considered, the average values over the spans actually present are used.
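The dispersion values above also determine how quickly neighbouring channels walk off from the center channel, which is what the initial values of the walk-off parameters represent. The sketch below estimates the group-delay difference per kilometre from the quoted dispersion and dispersion-slope values with a second-order expansion in wavelength. The carrier wavelength of 1550 nm is an assumption (it is not stated in the text), as are the sign convention, the physical units (the method itself works with normalized parameters), and the function name.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def walk_off_ps_per_km(channel, spacing_hz=50e9, D=16.641, S=0.06,
                       wavelength_nm=1550.0):
    """Group-delay difference per km between `channel` and channel 0.

    D is in ps/nm/km and S in ps/nm^2/km; the carrier wavelength is assumed.
    """
    f0 = C / (wavelength_nm * 1e-9)
    # Wavelength offset (nm) of the channel whose frequency offset is channel * spacing.
    dlam_nm = C / (f0 + channel * spacing_hz) * 1e9 - wavelength_nm
    # First-order term from D, second-order term from the dispersion slope S.
    return D * dlam_nm + 0.5 * S * dlam_nm ** 2

for n in (-4, -1, 1, 4):
    print(n, round(walk_off_ps_per_km(n), 3))
```

Under these assumptions the adjacent channels walk off by roughly 6.7 ps per kilometre, and the slope term makes the magnitudes for positive and negative channel numbers slightly unequal.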

(130) The numerical calculation for the optical transmission simulation handles waveform data sampled at 32 times the symbol rate on the time axis. The number of spans is set to one of 4, 6, 8, and 10, and the split-step Fourier method with 800 steps per span is used, either in the ideal transmission line having the same parameters in every span or in the transmission line having the parameters illustrated in FIG. 6, to calculate how an optical signal propagates through the transmission line according to the nonlinear Schrödinger equation expressed by expression (1) and thereby obtain output waveform data. The value of the coefficient corresponding to the cross-polarization cross phase modulation in expression (1) is set to 1, and values obtained by normalizing the parameters of the optical fiber included in the transmission line are used for the other parameters. Moreover, the noise figure of the optical amplifier is set to 6 dB, and random Gaussian noise whose power is determined by the gain and the noise figure is added to the signal at each amplification as spontaneous emission noise. The optical signal power at the start of propagation in each span is referred to as span launch power, and the transmission simulation is performed for several values of the span launch power. When the span launch power is small, no significant nonlinear waveform distortion occurs, but noise significantly degrades the SN ratio and lowers the signal quality. Increasing the span launch power improves the SN ratio but makes the nonlinear waveform distortion significant, so the signal quality peaks at a certain span launch power and deteriorates at higher span launch power.
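The simulation settings translate into a few concrete quantities that are easy to compute. The short sketch below works them out; the helper names are hypothetical, the amplifier gain is assumed to offset the span loss exactly, and the 0 dBm value passed to `dbm_to_watt` is only an illustrative input.

```python
def db_to_linear(x_db):
    """Convert a power ratio in dB to a linear ratio."""
    return 10.0 ** (x_db / 10.0)

def dbm_to_watt(p_dbm):
    """Convert a power in dBm to watts."""
    return 1e-3 * db_to_linear(p_dbm)

symbol_rate = 32e9                              # 32 Gbaud
sample_rate = 32 * symbol_rate                  # 32 samples per symbol
step_length_m = 80e3 / 800                      # 800 split-step segments per 80 km span
span_loss_db = 0.192 * 80                       # loss coefficient times span length
amplifier_gain = db_to_linear(span_loss_db)     # assumed to offset the span loss exactly

print(sample_rate, step_length_m, span_loss_db, amplifier_gain, dbm_to_watt(0.0))
```

With these numbers each simulation step covers 100 m of fiber, and each span attenuates the signal by about 15.4 dB before amplification.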

(131) For each condition of the number of transmission spans and the launch power under study, transmission signal data of 16384 symbols is generated based on a random bit pattern, the transmission simulation is performed, and the output waveform is stored. Pairs of input and output waveforms of the transmission line are used as datasets; 200 datasets in total are used, and the parameters of the back propagation calculation are learned by the gradient descent method using the above expressions so that the nonlinear waveform distortion correction is optimized. In the back propagation calculation with learning of the parameters according to the present example, the number of steps per span is set to one. In contrast, for the back propagation calculation without learning, both one step per span and two steps per span are considered, and the correction effect is compared between the cases with and without learning.

(132) While the gradient descent method is used to learn each parameter, a method called AdaBelief proposed in Non-Patent Literature 9 is used as the specific implementation for updating each parameter. Note that the learning is also possible using methods other than AdaBelief. The update formula of AdaBelief is as follows, with initial values $\mu_{0} = 0$ and $v_{0} = 0$.

(133) $\begin{matrix} \text{[Expression 38]} \\ \begin{matrix} \mu_{i} = b_{1} \mu_{i-1} + (1 - b_{1}) J_{i} \\ v_{i} = b_{2} v_{i-1} + (1 - b_{2}) (J_{i} - \mu_{i})^{2} \\ \hat{\mu}_{i} = \frac{\mu_{i}}{1 - b_{1}} \\ \hat{v}_{i} = \frac{v_{i}}{1 - b_{2}} \\ \theta_{i} = \theta_{i-1} - \frac{\eta \hat{\mu}_{i}}{\sqrt{\hat{v}_{i}} + e} \end{matrix} & (38) \end{matrix}$

(134) In this regard, $J_{i}$ represents the gradient of the error function for the parameter $\theta$ at the $i$-th update, and $b_{1} = 0.9$, $b_{2} = 0.999$, and $e = 10^{-8}$ are constants. $\eta$ in expression (38) represents a learning coefficient, and a suitable value is set for each parameter. In the present example, $\eta = 1.0$ is used for the parameter $D_{2}$ corresponding to the second-order dispersion value, $\eta = 2.0 \times 10^{-6}$ for the parameter $g$ corresponding to the nonlinear coefficient, $\eta = 2.0 \times 10^{-5}$ for the parameter $\varepsilon$ corresponding to the cross-polarization cross phase modulation coefficient, and $\eta = 2.0 \times 10^{-4}$ for the parameter $d_{n}$ corresponding to the walk-off, and the update according to expression (38) is repeated 30000 times to perform the learning. Because the number of datasets is 200, the order of the datasets is shuffled at random each time 200 updates have been performed, and the learning is repeated. Note that this operation does not cause overtraining. Moreover, when similar learning is performed under different conditions such as the signal modulation format and the number of channels, the learning may be continued beyond 30000 updates until the parameters converge, if necessary. All of the 16384 symbols included in a dataset are used as $y_{t}$ and $d_{t}$ in expression (20). Another waveform that has a different bit pattern and consists of 262144 symbols is used for testing after the learning, to evaluate the signal quality after the nonlinear waveform distortion correction.
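The update rule and the dataset schedule can be sketched as below for a single scalar parameter. The code follows expression (38) as written, including its fixed correction factors; the published AdaBelief algorithm instead uses iteration-dependent factors of the form $1 - b^{i}$. The function names and the state tuple are hypothetical, and the gradient `J` is assumed to be supplied by the gradient calculation described earlier.

```python
import math
import random

def adabelief_step(theta, J, state, eta, b1=0.9, b2=0.999, e=1e-8):
    """One update of expression (38); state = (mu, v), starting from (0.0, 0.0)."""
    mu, v = state
    mu = b1 * mu + (1 - b1) * J                  # running mean of the gradient
    v = b2 * v + (1 - b2) * (J - mu) ** 2        # running variance around that mean
    mu_hat = mu / (1 - b1)
    v_hat = v / (1 - b2)
    theta = theta - eta * mu_hat / (math.sqrt(v_hat) + e)
    return theta, (mu, v)

def dataset_order(n_datasets, n_updates, seed=0):
    """Dataset index for each update, reshuffled after every pass over the datasets."""
    rng = random.Random(seed)
    order = []
    while len(order) < n_updates:
        one_pass = list(range(n_datasets))
        rng.shuffle(one_pass)
        order.extend(one_pass)
    return order[:n_updates]

# Example: 30000 updates over 200 datasets correspond to 150 shuffled passes.
schedule = dataset_order(200, 30000)
theta, state = adabelief_step(1.0, 2.0, (0.0, 0.0), eta=0.1)
```

Because the step is normalized by the gradient statistics, the learning coefficient sets the scale of each parameter change directly, which is consistent with the very different coefficients chosen for the individual parameters.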

(135) First, the result of the nonlinear waveform distortion correction in the ideal transmission line, whose parameters have the same values in all spans, will be described. FIGS. 7A to 7E illustrate how various numerical values changed while the parameters were repeatedly updated 30000 times in the back propagation that also took XPM into account, for a dataset with a span launch power of +2 dBm/ch in the ideal transmission line of six spans. FIG. 7A illustrates how the dispersion parameter $D_{2}^{(j)}$ in the linear step $L^{(j)}$ ($0 \le j \le 6$) changes with the number of parameter updates. In the linear steps of $j = 0$ and 6, the distances are half those of the other steps, and therefore the initial values of $D_{2}^{(j)}$ are also half. As the learning advances, the parameters of $j = 1, 2, 3, 4$, and 5 converge to the same value, and the parameter of $j = 0$ converges to a close value. In contrast, only the linear step of $j = 6$, that is, the linear step closest to the transmitter, converges to a value different from the others. These results do not match the fact that the parameters are equal in all spans of the ideal transmission line under study, yet they were obtained as the outcome of optimizing the performance by the learning, and this is considered to be one of the factors that improve the performance compared to the case without learning, as described later.

(136) FIG. 7B illustrates how the nonlinear parameter $g^{(j)}$ of the nonlinear step $N^{(j)}$ ($1 \le j \le 6$) changes. Only the nonlinear step of $j = 6$, that is, the nonlinear step closest to the transmitter, converges to a slightly different value, while the values of the other steps converge to substantially the same value. FIGS. 7C and 7D illustrate how the walk-off parameters $d_{4}^{(j)}$ and $d_{1}^{(j)}$ change, respectively. The values in all steps substantially overlap, and practically no change of the parameters due to the learning is observed. In the ideal transmission line, the correct parameters common to every span are given as initial values at the start of the learning, and therefore it is thought that the walk-off did not need to be adjusted by the learning. Although not illustrated, the other walk-off parameters behave similarly to $d_{4}^{(j)}$ and $d_{1}^{(j)}$.

(137) FIG. 7E plots the MSE values calculated from the received signal waveform and the desired signal, i.e., the transmitted signal waveform. The MSE decreases rapidly immediately after the start of the learning, substantially converges when the number of repetitions reaches approximately 1000, and remains stable thereafter. When this method is used in an actual transmission system, the update may be terminated once the MSE oscillates near its lowest value, even if the individual parameters have not converged, and steady-state waveform correction using the learned parameters may then be started.

(138) FIGS. 8A and 8B illustrate calculation results of the Q factor versus span launch power for four, six, eight, and 10 transmission spans, with and without the nonlinear waveform distortion correction by the back propagation calculation. As back propagation conditions, FIGS. 8A and 8B consider a case where the back propagation calculation was performed without learning the parameters at 1 step/span and at 2 steps/span, and a case where the parameters were learned according to the present embodiment at 1 step/span before the back propagation calculation was performed; in each case the results are those obtained by correcting only the distortion caused by SPM, without correcting the nonlinear waveform distortion caused by XPM. The Q factor is converted from the Bit Error Rate (BER) according to the following expression.
$\begin{matrix} \text{[Expression 39]} \\ Q^{2} [\mathrm{dB}] = 20 \log_{10} \left[ \sqrt{2} \, \mathrm{erfc}^{-1} (2 \, \mathrm{BER}) \right] & (39) \end{matrix}$
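The conversion from BER to Q factor can be written with the Python standard library alone, because $\sqrt{2}\,\mathrm{erfc}^{-1}(2\,\mathrm{BER})$ equals the standard normal quantile at $1 - \mathrm{BER}$. The function name is hypothetical, and the sketch assumes a BER strictly between 0 and 0.5 so that the logarithm is defined.

```python
import math
from statistics import NormalDist

def q_factor_db(ber):
    """Expression (39): Q^2 [dB] = 20 log10( sqrt(2) * erfcinv(2 * BER) )."""
    if not 0.0 < ber < 0.5:
        raise ValueError("BER must lie strictly between 0 and 0.5")
    # sqrt(2) * erfcinv(2 * BER) is the standard normal quantile at 1 - BER,
    # which equals minus the quantile at BER.
    return 20.0 * math.log10(-NormalDist().inv_cdf(ber))

def ber_from_q_db(q_db):
    """Inverse relation, useful for checking a threshold given in dB."""
    q = 10.0 ** (q_db / 20.0)
    return 0.5 * math.erfc(q / math.sqrt(2.0))
```

Converting a Q value to BER with `ber_from_q_db` and back with `q_factor_db` returns the original value, which is a quick consistency check on the expression.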

(139) The results in FIGS. 8A and 8B show overall that degradation of the SN ratio lowers the Q factor in the region of low launch power and that, conversely, the nonlinear waveform distortion lowers the Q factor in the region of high launch power. Looking at the results of the back propagation calculation without learning, the correction effect is small at 1 step/span, while a reasonable correction effect is obtained at 2 steps/span. Moreover, the result of the back propagation calculation with learning of the parameters by the method according to the present embodiment shows that, even though the number of steps per span is one, the signal quality slightly exceeds the result of 2 steps/span without learning. This suggests the possibility that the parameters truly necessary for the waveform correction can be acquired by the learning, and that calculation at 2 steps/span is essentially unnecessary. Although the result is not shown, when the parameters are actually learned by the method according to the present embodiment at 2 steps/span, the same nonlinear waveform distortion correction performance as at 1 step/span is obtained under the condition that the MSE is reduced to its minimum.

(140) As for nonlinear waveform distortion correction by the physical phenomenon-specialized neural network, Non-Patent Literature 8 reports that, for a transmission line of 12 spans, the result of 1 step/span is not substantially different from the result of 2 steps/span, and that a reasonable correction effect could be obtained, albeit with degraded performance, even with a configuration of 0.5 step/span, i.e., with six steps. In view of the above, when the method according to the present embodiment is used, it is considered possible to obtain a sufficient correction effect with fewer steps than 2 steps/span, and to obtain a reasonable correction effect even from a configuration with fewer than 1 step/span.

(141) FIGS. 9A and 9B illustrate results similar to those in FIGS. 8A and 8B for a case where the back propagation calculation that takes XPM into account in addition to SPM was performed. By correcting the distortion caused by XPM, signal quality improves, and, above all, the signal quality obtained by correcting the waveform by the back propagation calculation of 1 step/span after the learning of the parameters according to the present embodiment exceeds the result of the back propagation calculation of 2 steps/span without the learning, and is the best among these results. Moreover, comparison with the method according to Non-Patent Literature 4 shows that the method according to the present embodiment makes it possible to obtain a sufficient effect in 1 step/span, and can be performed with a realistic calculation amount.

(142) FIG. 10 illustrates, based on the results of the nonlinear waveform distortion correction illustrated in FIGS. 8A, 8B, 9A, and 9B, the Q factors with respect to the number of spans under the optimal launch power in cases where the nonlinear waveform distortion correction is performed under several conditions and in a case where correction is not performed. When the threshold of the Q factor at which data can be received without an error is set to 7.5 dB, the maximum transmission distance in the case where correction is not performed is six spans. On the other hand, transmission over 10 spans or more is possible by using the method according to the present embodiment, which learns the parameters taking both SPM and XPM into account, and it is found that the transmission distance can be significantly extended.

(143) Next, FIGS. 11A to 14 illustrate results for the realistic transmission line whose parameters are illustrated in FIG. 6. FIGS. 11A to 11E illustrate fluctuations of various numerical values in a case where the parameters were repeatedly updated 30000 times during the back propagation that also took XPM into account, for a dataset obtained with span launch power of +2 dBm/ch in the realistic transmission line of six spans. In this regard, assuming a situation in which the true parameters of the realistic transmission line illustrated in FIG. 6 are unknown, average values over the six spans are provided as initial values of the transmission line parameters to start the learning. FIG. 11A illustrates how the second-order group velocity dispersion parameter D.sub.2.sup.(j) changes, FIG. 11B illustrates how the nonlinear parameter g.sup.(j) changes, and these parameters converge to different values for each span. Although this result is consistent with the fact that the parameters of the original transmission line differ for each span, the resulting values are values that maximize the nonlinear waveform distortion correction, and do not necessarily match the transmission line parameters illustrated in FIG. 6. FIGS. 11C and 11D show that the respective walk-off parameters d.sub.4.sup.(j) and d.sub.1.sup.(j) converge to different values in association with the different dispersion values of each span. Although not illustrated, the other walk-off parameters also converge similarly to d.sub.4.sup.(j) and d.sub.1.sup.(j). Similar to FIG. 7E, FIG. 11E illustrates the MSE value with respect to the number of times of learning, and the result for the realistic transmission line in FIG. 11E shows that convergence occurs when the number of repetitions is approximately 10000, in comparison with the case of the ideal transmission line illustrated in FIG. 7E. Similar to the result illustrated in FIG. 7E, also in this case, when the MSE oscillates near the lowest value, the update may be terminated even if the various parameters have not converged, and steady-state waveform correction after the learning may be started.
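The termination rule mentioned above, ending the updates once the MSE only moves around its lowest value, can be sketched as a simple plateau test. This is an illustration only and not the procedure used to produce FIG. 11E: the function name `mse_has_plateaued`, the window length, and the tolerance are hypothetical choices, and the MSE curve is synthetic.

```python
def mse_has_plateaued(mse_history, window=1000, rel_tol=1e-3):
    """Return True when the best MSE in the latest window improves on the
    best MSE seen before that window by less than rel_tol (relative).

    window and rel_tol are illustrative assumptions, not values taken
    from the embodiment.
    """
    if len(mse_history) < 2 * window:
        return False  # not enough history to compare two windows
    best_before = min(mse_history[:-window])
    best_recent = min(mse_history[-window:])
    return best_recent > best_before * (1.0 - rel_tol)


# Synthetic MSE curve: exponential decay onto a floor with a small ripple.
history = [0.1 + 0.9 * (0.5 ** (k / 50)) + 1e-6 * ((-1) ** k) for k in range(4000)]

# First update count at which the plateau test would allow termination.
stop_at = next(k for k in range(1, len(history) + 1)
               if mse_has_plateaued(history[:k]))
print("updates before termination:", stop_at)
```

A rule of this kind looks only at the MSE, so it can stop the learning even while individual transmission line parameters are still drifting, which is the situation the paragraph above allows.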

(144) FIGS. 12A and 12B illustrate, similar to FIGS. 8A and 8B, calculation results of the Q factor with respect to each span launch power in cases where the numbers of transmission spans are four, six, eight, and 10, both where the nonlinear waveform distortion correction was performed by the back propagation calculation and where it was not performed. In this regard, as back propagation calculation conditions, FIGS. 12A and 12B assume a case where the back propagation calculation was performed without learning the parameters in 1 step/span and 2 steps/span, and a case where the learning of the parameters according to the present embodiment was performed in 1 step/span and then the back propagation calculation was performed, and each illustrates results obtained by correcting only the distortion caused by SPM without correcting the nonlinear waveform distortion caused by XPM. Moreover, as for the back propagation without the learning, FIGS. 12A and 12B illustrate results of both a case where the true values illustrated in FIG. 6 were given as the parameters used for calculation and a case where average values of the parameters were given. Similar to the results in FIGS. 8A and 8B obtained for the ideal transmission line, it is found that, also for the realistic transmission line, the signal quality obtained by correcting the waveform by the back propagation calculation with the learning of the parameters according to the present embodiment is the best among these results. As for the back propagation calculation without the learning, it is found that there is no substantial difference between the case where the true values of the transmission line parameters were given and the case where the average values were given, and that the correction performance does not significantly depend on the transmission line parameters in a situation in which only SPM is corrected.

(145) FIGS. 13A and 13B illustrate results similar to those in FIGS. 12A and 12B in a case where the nonlinear waveform distortion correction was performed taking not only SPM but also XPM into account. By correcting the distortion caused by XPM, signal quality improves also in the realistic transmission line, and, above all, the signal quality obtained by correcting the waveform by the back propagation calculation of 1 step/span with the learning of the parameters according to the present embodiment exceeds the result of the back propagation calculation of 2 steps/span for which the true transmission line parameters were given without performing the learning, and is the best among these results. What should be noted in the results in FIGS. 13A and 13B is that the result of the back propagation calculation without the learning in the case where the average values of the true values were given as the transmission line parameters shows remarkable deterioration compared to the result in the case where the true values were given. This deterioration occurs because, when XPM is corrected, the phase shift caused by XPM cannot be correctly calculated unless correct walk-off values indicating the temporal positional relationship between different channels are given, and the correction performance deteriorates accordingly. Therefore, in a situation in which the transmission line parameters fluctuate for each span and the correct values are unknown, it is very significant that learning the transmission line parameters according to the method described in the present embodiment makes it possible to set optimal values for the walk-off and thereby to correct XPM effectively.

(146) Similar to FIG. 10, FIG. 14 illustrates a result of the Q factors with respect to the number of spans under the optimal launch power in a case where the nonlinear waveform distortion correction is performed under some conditions and in a case where no correction is performed in view of the results of the nonlinear waveform distortion correction illustrated in FIGS. 12A, 12B, 13A, and 13B. A similar result to that in the case of the ideal transmission line is obtained also for the realistic transmission line, and it is possible to significantly extend a transmission distance by performing the back propagation calculation after performing the learning of the parameters according to the present embodiment.

(147) The effect of the embodiment was tested by a loop transmission experiment in addition to the above-described simulation results. FIG. 15 illustrates a loop transmission experiment system, and details of an experiment procedure that uses this system and an experiment result will be described below.

(148) As a configuration of the Transmitter (Tx), continuous light of 11 channels with different wavelengths is output from a wavelength tunable light source, is combined by a 16×1 polarization maintaining coupler, is then input to a Lithium Niobate (LN) dual-polarization IQ modulator, is modulated to an optical signal waveform by electric signals applied to the modulator, and is output. The center frequency is 193.1 THz (the wavelength is 1552.524 nm), the frequencies of the continuous light of the 11 channels are set at an interval of 50 GHz from 192.85 THz to 193.35 THz, and channel numbers n=−5, −4, . . . , 4, and 5 are assigned in order from the lowest frequency. The electric signals of four channels to be applied to the modulator are generated by an arbitrary waveform generator whose sampling rate is 64 GSample/s, and the four channels correspond to an X polarization I channel component, an X polarization Q channel component, a Y polarization I channel component, and a Y polarization Q channel component of a dual-polarization IQ modulation signal. These electric signals of the four channels are respectively amplified by driver amplifiers, and applied to the modulator. The optical signal output from the modulator is a dual-polarization QAM signal whose symbol rate is 32 Gbaud, and that has a root Nyquist waveform whose roll-off factor is 0.1. As modulation formats, a uniform distribution 16-QAM signal whose number of bits per single polarization single symbol is four bits, and a Probabilistically Shaped (PS) 64-QAM signal whose number of bits per single polarization single symbol is five bits are used. For each modulation format of 16 QAM and PS-64 QAM, four patterns of the signal waveform that is modulated based on a random bit pattern and that includes 65536 symbols per single polarization are generated. When one of the four patterns is selected, the Tx repeatedly transmits the waveform of this pattern. Note that, although the WDM signals of the 11 channels obtained by modulation have the same waveform across all channels, long distance transmission through a transmission line in which group velocity dispersion occurs causes the walk-off (group delay between channels), and therefore random XPM between the waveforms occurs after the long distance transmission.
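The channel plan described above can be checked with a few lines of arithmetic. The sketch below assumes that the channel numbers run from n=−5 (lowest frequency) to n=5 (highest frequency) and that the quoted wavelength is the vacuum wavelength; the names `channels` and `wavelength_nm` are introduced here for illustration only.

```python
C_LIGHT = 299_792_458.0   # speed of light in vacuum, m/s

CENTER_THZ = 193.1        # center frequency, from the text
SPACING_GHZ = 50.0        # channel spacing, from the text

# Channel numbers n = -5 ... 5 (assumed), assigned from the lowest frequency upward.
channels = {n: CENTER_THZ + n * SPACING_GHZ / 1000.0 for n in range(-5, 6)}


def wavelength_nm(freq_thz):
    """Vacuum wavelength in nm for an optical frequency given in THz."""
    return C_LIGHT / (freq_thz * 1e12) * 1e9


for n, f in channels.items():
    print(f"n={n:+d}  f={f:.2f} THz  lambda={wavelength_nm(f):.3f} nm")
```

The grid spans 192.85 THz to 193.35 THz, and the center channel n=0 falls at 1552.524 nm, in agreement with the values stated for the Tx.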

(149) Optical power of the WDM signals of the 11 channels generated by the Tx is amplified by the optical amplifier, and is then adjusted by a Variable Optical Attenuator (VOA). After that, optical noise outside the signal band is removed by a Band-Pass Filter (BPF), and the WDM signals are input to an Acousto-Optic Modulator (AOM) that serves as a switch for the loop transmission operation. The signal is input to and output from the loop transmission line through a 3 dB coupler. The loop transmission line includes, in order from the input side, an optical amplifier, a BPF, a VOA, an SSMF whose length is 84.1 km, an optical amplifier, a BPF, a VOA, an SSMF whose length is 80.5 km, an optical amplifier, an isolator (rightward arrow), a polarization scrambler (Pol. Scrambler), and an AOM. That is, one loop includes the SSMFs of two spans. In the experiment, the transmission distance is set to one of six, eight, 10, 12, 14, and 16 spans, and a signal is transmitted over each distance to test the effect of the nonlinear waveform distortion correction. Note that the SSMF whose length is 84.1 km and the SSMF whose length is 80.5 km have slightly different dispersion characteristics. A measurement result shows that the group velocity dispersion value and the dispersion slope value at a frequency of 193.1 THz are 17.14 ps/nm/km and 0.062 ps/nm.sup.2/km, respectively, for the former SSMF, and 16.55 ps/nm/km and 0.058 ps/nm.sup.2/km, respectively, for the latter SSMF. As the average characteristics of one loop of the loop transmission line, the group velocity dispersion value and the dispersion slope value were estimated as 16.85 ps/nm/km and 0.060 ps/nm.sup.2/km, respectively.

(150) As for the signal output from the loop transmission line, only one channel of the 11 channels is extracted by the BPF whose passband is 50 GHz, and is amplified by the optical amplifier, and then is input to a Receiver (Rx). The Rx is a digital coherent receiver that includes a 4-channel real time oscilloscope of 80 GSample/s whose electric band is 33 GHz, a Local Oscillator (LO), and an optical front end, demodulates, by offline digital signal processing, a real time waveform acquired by the oscilloscope, and then performs signal processing for the nonlinear waveform distortion correction offline likewise. In this regard, in the experiment, only signal quality of the center channel whose channel number is n=0 is focused upon, the learning for the nonlinear waveform distortion correction according to the embodiment is performed to maximize this signal quality, and the signal quality after the correction is evaluated.

(151) To perform the back propagation calculation for the nonlinear waveform distortion correction offline after the WDM signal is received, waveforms of all of the 11 channels are received for each channel. Normal demodulation processing that performs up to evaluation on signal quality without performing the back propagation calculation includes dispersion compensation, application of the same root Nyquist filter as that applied at a time of transmission as a matched filter, polarization rotation and demultiplexing of dual-polarization components, resampling to 2 samples/symbol, retiming, carrier frequency estimation and compensation, carrier-phase recovery and 3-tap feed forward-type linear adaptive equalization processing, and symbol decision and acquisition of a bit pattern. Here, the 3-tap equalization processing is butterfly-type 2×2 MIMO processing that can handle a dual-polarization signal, and is effective to compensate for the polarization crosstalk that occurs due to birefringence in a transmission line, XPM, and the like. On the other hand, the signal waveform demodulated in this way greatly changes from the waveform at the time of reception as a result of application of the root Nyquist filter, and the back propagation calculation cannot be applied as is. Hence, the root Nyquist filter is not applied to the waveform after the dispersion compensation is applied during the normal demodulation processing, the processing performed in the subsequent demodulation process is performed likewise, and the waveform immediately after reception is reproduced as closely as possible.

(152) Incidentally, originally, the optical signals of all of the channels should be simultaneously received by using plural transceivers, the back propagation calculation for the nonlinear waveform distortion correction according to the present embodiment should be performed by using waveforms of all of the channels without demodulating them, and the demodulation should be performed finally. However, in the experiment conducted herein, since the one Rx is used to receive and demodulate each channel in order, the measured waveforms of all of the channels are not synchronized between the channels. Especially, in a WDM signal after transmission through an optical fiber of a long distance, the group delay (walk-off) between the channels occurs in addition to the linear waveform distortion by the effect of the group velocity dispersion, and when the back propagation calculation is performed, it is necessary to start calculation while keeping an accurate group delay amount at the time of reception. However, if the back propagation calculation is performed as is without demodulating the waveform received in an asynchronous manner, the aforementioned condition is not satisfied. Hence, a procedure is adopted that the signals of all of the channels are independently received once, demodulation including dispersion compensation is performed, known pilot symbols are detected from resulting waveforms, timings of all of the channels are synchronized, the compensated group velocity dispersion values are allocated again to give the walk-off to each channel, a WDM signal waveform that would be obtained at a time of collective reception is reproduced, and the back propagation calculation is started.

(153) In the experiment, under each condition of a different number of transmission spans and launch power, a signal of each modulation format is transmitted, received, and demodulated, and the transmitted waveform that is common to all channels and the collectively received waveform of all channels described above are combined to generate a dataset. Four datasets are generated for each modulation format in association with the four different waveform patterns. Note that, to correctly process the walk-off that occurs during the back propagation calculation, measurement is performed such that many more symbols than the 65536 symbols included in one period of the transmitted waveform are captured at a time of reception. The frequency difference between the center channel whose channel number is n=0 and an edge channel whose channel number is n=5 is 250 GHz, and the maximum number of transmission spans is 16. Therefore, the maximum value of the walk-off is estimated as approximately 43500 ps. This walk-off value corresponds to approximately 1400 symbols of a 32-Gbaud signal, and therefore, by performing measurement under a condition that includes more than these 1400 symbols of margin, it is possible to correctly calculate the influence of the walk-off when the back propagation is performed using a dataset having a finite time width. Hence, a waveform obtained by adding 5000 symbols to each of both edges in the time domain of the waveform whose one period is 65536 symbols is used to form a dataset.
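The order of magnitude of the walk-off estimate above can be reproduced with a first-order calculation. The sketch below is an assumption-laden approximation: it uses the loop-average dispersion of 16.85 ps/nm/km and the measured span lengths, ignores the dispersion slope, and converts the 250 GHz offset to wavelength with Δλ = λ²Δf/c. The exact inputs behind the 43500 ps figure are not stated in the text, so a small difference is expected.

```python
C_LIGHT = 299_792_458.0        # speed of light in vacuum, m/s

center_freq_hz = 193.1e12      # center channel, from the text
delta_f_hz = 250e9             # center-to-edge frequency difference, from the text
dispersion_ps_nm_km = 16.85    # loop-average dispersion, from the text
loop_length_km = 84.1 + 80.5   # two spans per loop, from the text
num_spans = 16                 # maximum number of spans, from the text
symbol_rate_baud = 32e9

# Frequency offset converted to a wavelength offset (first-order approximation).
wavelength_m = C_LIGHT / center_freq_hz
delta_lambda_nm = wavelength_m ** 2 * delta_f_hz / C_LIGHT * 1e9

total_length_km = loop_length_km * num_spans / 2
walkoff_ps = dispersion_ps_nm_km * delta_lambda_nm * total_length_km
walkoff_symbols = walkoff_ps * 1e-12 * symbol_rate_baud

print(f"wavelength offset : {delta_lambda_nm:.3f} nm")
print(f"walk-off          : {walkoff_ps:.0f} ps")
print(f"walk-off          : {walkoff_symbols:.0f} symbols at 32 Gbaud")
```

Under these assumptions the result lands in the low 40000 ps range, i.e. on the order of 1400 symbols, which is well inside the 5000-symbol margin added to each edge of the waveform.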

(154) Prior to the experiment, the following new fact was found by an advance study based on simulation. That is, by learning the transmission line parameters only for a waveform obtained at high launch power equal to or more than a certain value (e.g., launch power per channel of +3 dBm) using a signal of a modulation format having certain complexity (e.g., DP-16-QAM), it is possible to apply the resulting parameters to the nonlinear waveform distortion correction for a signal of an arbitrary modulation format whose launch power is equal to or less than the value used for the learning. Based on this fact, in the experiment, the DP-16-QAM signal whose launch power per channel is +3 dBm is transmitted, received, and demodulated to form a dataset, and the transmission line parameters used for the back propagation calculation are learned. Next, a DP-PS-64-QAM signal whose launch power per channel is −5 dBm to +2 dBm is transmitted, received, and demodulated to form a dataset, the nonlinear waveform distortion correction is performed using the previously obtained transmission line parameters, and the improvement amount of signal quality is evaluated.

(155) At a time of learning the transmission line parameters, in one learning step, one of the four datasets of the DP-16-QAM signal is selected at random to perform the back propagation calculation, then 1024 consecutive symbols among the 65536 symbols are selected at random, the amplitude waveform of these symbols is used as the output signal waveform y.sub.t in expression (20), the corresponding transmitted signal waveform is used as the desired signal d.sub.t, and a gradient of each parameter is calculated from the error signal to update the parameters. By performing such learning, it is possible to randomize the learning process for a limited number of datasets, and advance the learning without causing overfitting. Note that the above-described AdaBelief is used as the algorithm of the gradient descent method used for the learning.
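The random selection performed in each learning step can be sketched as follows. The sketch covers only the sampling, not the back propagation or the AdaBelief update; the function name `draw_training_window` is hypothetical, and it is an assumption that the 1024-symbol window does not wrap around the end of the 65536-symbol period.

```python
import random

SYMBOLS_PER_PATTERN = 65536   # symbols per polarization in one pattern, from the text
BATCH_SYMBOLS = 1024          # consecutive symbols used per learning step, from the text
NUM_DATASETS = 4              # DP-16-QAM waveform patterns, from the text


def draw_training_window(rng):
    """Pick one dataset and one window of consecutive symbol indices at random.

    Returns (dataset_index, start, stop) with the window being [start, stop).
    No wrap-around is applied (an assumption of this sketch).
    """
    dataset_index = rng.randrange(NUM_DATASETS)
    start = rng.randrange(SYMBOLS_PER_PATTERN - BATCH_SYMBOLS + 1)
    return dataset_index, start, start + BATCH_SYMBOLS


rng = random.Random(0)  # fixed seed only so that the sketch is reproducible
for step in range(3):
    dataset_index, start, stop = draw_training_window(rng)
    print(f"step {step}: dataset {dataset_index}, symbols [{start}, {stop})")
```

Because both the dataset and the window position change from step to step, the same four waveforms yield a large number of distinct training windows, which is the randomization the paragraph above relies on to avoid overfitting.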

(156) FIGS. 16A to 16G illustrate how various parameters are updated in a process of learning the transmission line parameters for the nonlinear waveform distortion correction that takes XPM compensation according to the embodiment into account under a back propagation condition of 1 step/span in a case where the number of transmission spans is six. Moreover, FIG. 17 illustrates as a schematic view how the optical fibers constituting the six-span transmission line used for the experiment are connected, the fiber length of each span, and the calculation order of the linear step L.sup.(j) (0≤j≤6) and the nonlinear step N.sup.(j) (1≤j≤6) in the back propagation calculation. According to the results of FIGS. 16A to 16G, it is found that the transmission line parameters and signal quality converge after the parameters are updated a certain number of times. In FIGS. 16D and 16E, the walk-off parameters d.sub.5.sup.(j) and d.sub.1.sup.(j) converge to substantially the same value less than the initial value in a case of j=1, 3, and 5, and converge to substantially the same value greater than the initial value in a case of j=2, 4, and 6. This result reflects the fact that the fiber whose length is 84.1 km and the fiber whose length is 80.5 km illustrated in FIG. 15 have slightly different dispersion characteristics. That is, the walk-off parameter d.sub.n.sup.(j) included in the nonlinear step N.sup.(j) (1≤j≤6) is a parameter corresponding to the group delay amount per unit distance of the signal whose channel number is n with respect to the signal of the center channel (channel number n=0), and takes a value proportional to the dispersion slope value of the fiber; the result means that the parameters of the fiber that had a small dispersion slope value and whose length was 80.5 km in a case of j=1, 3, and 5, and the parameters of the fiber that had a large dispersion slope value and whose length was 84.1 km in a case of j=2, 4, and 6, could be correctly learned. 
This suggests that, even in a real environment used in the experiment, the learning method proposed by the embodiment makes it possible to correctly learn the effective parameters for the nonlinear waveform distortion correction.

(157) After the learning was finished, the transmission line parameters for performing the nonlinear waveform distortion correction were obtained for each number of transmission spans. The effect of the nonlinear waveform distortion correction for the PS-64-QAM signals of the 11 channels whose launch power per channel ranged from −4 dBm to +2 dBm was tested based on these parameters. FIGS. 18A to 18C and 19A to 19C plot signal quality with respect to launch power per channel for each number of transmission spans, similar to FIGS. 8A, 8B, 9A, and 9B. Similar to the above-described simulation result, it is found that the signal quality is best in a case where the method of the nonlinear waveform distortion correction according to the embodiment is performed, that is, where the back propagation calculation of 1 step/span is performed in which the nonlinear phase shift caused by the XPM that occurs between the channels is taken into account in addition to the SPM that occurs in the channel and the parameters are optimized by the learning. Similar to FIG. 10, FIG. 20 plots the signal quality with respect to the number of transmission spans for the case without the nonlinear waveform distortion correction and for each type of correction, and an experiment result similar to the simulation result was obtained. As described above, the experiment result also shows that the effect according to the embodiment is obtained.

(158) Hereinafter, the calculation amount required in a case where the nonlinear waveform distortion correction according to the embodiment is performed and the calculation amount required in a case where a conventional technique is used are compared to indicate that, under a condition of a certain number of channels or less, the calculation amount required for the method according to the embodiment is substantially the same as the calculation amount required for the conventional method. Note that the nonlinear waveform distortion correction according to the embodiment means that the waveform correction is performed by the back propagation calculation of 1 step/span, which takes both of the distortions caused by SPM and XPM into account, after optimized values of the transmission line parameters are learned by using the gradient descent method; the following, however, focuses on the calculation amount in a case where the back propagation calculation is performed with the optimized parameters fixed after the learning is finished. The calculation amount in this case is the same as that in a case of the back propagation calculation that takes XPM into account in 1 step/span and does not perform the learning. Moreover, the conventional technique is supposed to take only the correction of the distortion caused by SPM into account, not to correct the distortion caused by XPM, and to perform the back propagation calculation of 2 steps/span. As a precondition for deriving the calculation amount, it is assumed that numerical values that can be fixed irrespective of the input waveform are calculated in advance and stored in a Look-Up Table (LUT), and are read and used every time a different waveform is input, and the number of arithmetic operations necessary for calculating those values is not taken into account.

(159) The number of spans is put as S, the data length is put as N, the number of channels to be taken into account for calculation for XPM correction is put as C, and the number of times of calculation per channel and per polarization is calculated. The calculation load of the DSP is mainly product arithmetic operations, and therefore the number of times of product arithmetic operations is calculated. Assuming that the number of times of product arithmetic operations of real numbers required for a product of complex numbers,
ab=Re[a]Re[b]−Im[a]Im[b]+i(Re[a]Im[b]+Re[b]Im[a]),
is four, a total value of the numbers of times of product arithmetic operations of the real numbers is calculated. Furthermore, the number of times of product operations of complex numbers to perform FFT on a complex number signal whose size is N=2.sup.n is generally N(log.sub.2N−2)/2, and therefore the number of times of product operations of the real numbers is 2N(log.sub.2N−2), which is four times N(log.sub.2N−2)/2.

(160) First, the calculation amount required for expression (6), which is the linear step, is estimated. Assuming that Ã denotes A with a tilde above it, the number of times of product operations of real numbers to calculate Ã.sub.p, n(0, ω)=F[A.sub.p, n(0, t)] is 2N(log.sub.2N−2). The exponential factor in expression (6) by which Ã.sub.p, n(0, ω) is multiplied, which is determined by the step size h, the dispersion parameters D.sub.2 and D.sub.3, and T.sub.n, is irrelevant to the input waveform and therefore is stored in the LUT, and the number of times of product arithmetic operations of real numbers to multiply Ã.sub.p, n(0, ω) by this value is 4N. Lastly, taking the arithmetic operations required for inverse FFT into account, the total number of times of product operations of real numbers for the entire linear step is 4N+2×2N(log.sub.2N−2)=4N(log.sub.2N−1).

(161) Next, the calculation amount required for calculation according to expressions (11) to (13), which are the nonlinear steps, is estimated. When the nonlinear phase shift amount φ, which is a real number, is obtained, the number of times of product operations of the real numbers required for calculation of e.sup.iφ is 6N in total, because, in the following fourth-order Taylor expansion, 2N is required for calculation of φ.sup.2/2, 2N is required for calculation of (φ.sup.2).sup.2/24 by reusing the result of φ.sup.2, and 2N is required for calculation of φ·φ.sup.2/6.

(162) $\begin{matrix} \text{[Mathematical 40]} \\ e^{i\phi} = 1 - \frac{1}{2}\phi^{2} + \frac{1}{24} {(\phi^{2})}^{2} + i (\phi - \frac{1}{6}\phi \cdot \phi^{2}) & (40) \end{matrix}$
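The multiplication count for the fourth-order Taylor expansion can be made concrete with a short numerical check. In the sketch below the nonlinear phase shift is written `phi`, the function name `exp_i_taylor4` is hypothetical, and the constants 1/2, 1/24, and 1/6 are treated as stored values so that each line marked with a number costs exactly one real multiplication per sample.

```python
import cmath


def exp_i_taylor4(phi):
    """Fourth-order Taylor approximation of exp(i*phi) for a real phi,
    evaluated with six real multiplications."""
    phi2 = phi * phi                 # 1
    half_phi2 = 0.5 * phi2           # 2  -> phi^2 / 2
    phi4 = phi2 * phi2               # 3  reuses phi^2
    phi4_24 = phi4 * (1.0 / 24.0)    # 4  -> (phi^2)^2 / 24
    phi3 = phi * phi2                # 5
    phi3_6 = phi3 * (1.0 / 6.0)      # 6  -> phi * phi^2 / 6
    return complex(1.0 - half_phi2 + phi4_24, phi - phi3_6)


for phi in (0.05, 0.2, 0.5):
    err = abs(exp_i_taylor4(phi) - cmath.exp(1j * phi))
    print(f"phi={phi}: |error|={err:.2e}")
```

The leading neglected term is of fifth order in `phi`, so the approximation is accurate only while the per-step phase shift stays small; the comparison against `cmath.exp` shows how quickly the error grows with `phi`.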

(163) Moreover, the number of times of product operations of real numbers to calculate B.sub.p, 0(0, t)e.sup.iφ, which is a product of complex numbers, is 4N. Next, when the phase shift φ.sub.SPM(t) caused by SPM is calculated according to expression (12), P.sub.p, 0(t)=Re[B.sub.p, 0(t)].sup.2+Im[B.sub.p, 0(t)].sup.2 holds, therefore the number of times of product operations of real numbers is 2N, and, taking multiplication by the coefficient gH.sub.0 into account, the number of times of product operations of real numbers is 3N in total. Furthermore, because N product operations are required when the intensity P.sub.3−p, 0(t) of the orthogonal polarization component to be additionally supplied is multiplied by the coefficient gH.sub.9, the number of times of product operations of the real numbers to calculate φ.sub.SPM(t) is 4N in the end.

(164) Next, the calculation amount required to calculate the phase shift φ.sub.XPM(t) caused by XPM according to expression (13) is estimated. The intensity waveform P.sub.p, 0(t) has already been obtained at the time when φ.sub.SPM(t) is calculated, and the number of times of product operations of the real numbers to calculate P.sub.p, 0(ω) by applying FFT to this intensity waveform P.sub.p, 0(t) is 2N(log.sub.2N−2). P.sub.p, 0(ω) does not appear in expression (13), yet needs to be supplied to perform the nonlinear waveform distortion correction of the other channels, and therefore the calculation amount of P.sub.p, 0(ω) is taken into account. On the other hand, assuming that P.sub.p, n(ω) in a case of n≠0, which appears in expression (13), is separately calculated and supplied, the calculation amount necessary therefor is not taken into account. When P.sub.p, n(ω) and P.sub.3−p, n(ω) are multiplied by 2g and g, which are real coefficients, product arithmetic operations of real numbers need to be performed 2N times for each multiplication, that is, 4N product operations of real numbers are required in total. Assuming that H.sub.n(h, ω) in expression (14) is stored in the LUT, the number of times of product operations of real numbers required to multiply the result of 2gP.sub.p, n(ω)+gP.sub.3−p, n(ω) by H.sub.n(h, ω) is 4N, and therefore the number of times of product operations of the real numbers required to obtain H.sub.n(h, ω)(2gP.sub.p, n(ω)+gP.sub.3−p, n(ω)) is 8N in total. This arithmetic operation is required for the C−1 channels of n≠0, and therefore the number of times of product operations of the real numbers is 8N(C−1). Finally, taking into account the calculation amount for applying inverse FFT, the calculation amount required to calculate φ.sub.XPM(t) is 2N(log.sub.2N−2)+8N(C−1)+2N(log.sub.2N−2)=4N(log.sub.2N+2C−4). In view of the above, the calculation amount for one nonlinear step is 6N+4N+4N=14N in a case where φ.sub.XPM(t) is not taken into account, and is 14N+4N(log.sub.2N+2C−4) in a case where φ.sub.XPM(t) is taken into account to compensate for XPM.
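The itemized tallies for one linear step and one nonlinear step can be checked against their closed forms. The sketch below simply adds up the individual contributions named in the preceding paragraphs (FFT, LUT multiplication, Taylor evaluation, phase application, SPM term, XPM term) and asserts that the sums equal 4N(log2 N − 1), 14N, and 14N + 4N(log2 N + 2C − 4). The function names and the values N = 2^12 and C = 11 are illustrative choices.

```python
import math


def fft_real_mults(n):
    """Real multiplications for one size-n FFT, using the count adopted in the text."""
    return 2 * n * (math.log2(n) - 2)


def linear_step_mults(n):
    # forward FFT + multiplication by the LUT value (4N) + inverse FFT
    return fft_real_mults(n) + 4 * n + fft_real_mults(n)


def nonlinear_step_mults(n, channels, with_xpm):
    exp_taylor = 6 * n     # fourth-order Taylor evaluation of exp(i*phi)
    apply_phase = 4 * n    # complex product of the waveform with exp(i*phi)
    spm_phase = 4 * n      # SPM phase shift
    total = exp_taylor + apply_phase + spm_phase
    if with_xpm:
        # FFT of the own-channel intensity, 8N per other channel, inverse FFT
        total += fft_real_mults(n) + 8 * n * (channels - 1) + fft_real_mults(n)
    return total


N, C = 2 ** 12, 11
log2n = math.log2(N)
assert linear_step_mults(N) == 4 * N * (log2n - 1)
assert nonlinear_step_mults(N, C, with_xpm=False) == 14 * N
assert nonlinear_step_mults(N, C, with_xpm=True) == 14 * N + 4 * N * (log2n + 2 * C - 4)
print("itemized tallies match the closed forms")
```

The only term that grows with the number of channels is the 8N(C−1) contribution of the XPM calculation, so the extra cost of XPM compensation is linear in C.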

(165) In the back propagation calculation where the number of steps per span is M in a transmission line whose number of spans is S, the number of linear steps is MS+1 in total, and the number of nonlinear steps is MS in total. The above-described result shows that the number of times of product operations of real numbers in a case where XPM compensation is not performed is as follows.
4N(log.sub.2N−1)(MS+1)+14NMS
The number of times of product operations of the real numbers in a case where XPM compensation is performed is as follows.
4N(log.sub.2N−1)(MS+1)+{14N+4N(log.sub.2N+2C−4)}MS

(166) FIG. 21 plots the number of times of product operations of real numbers with respect to the data length N in a case where the number of spans is S=10 and the number of channels is C=5, 11, and 21, for a case where XPM compensation is performed with M=1 step/span as in the present embodiment and for a case where XPM compensation is not performed with M=2 steps/span as in the conventional scheme. In a case where the number of channels is five, the present embodiment can be carried out with substantially the same calculation amount as that of the conventional scheme. In the case where the number of channels is 11, the calculation amount according to the present embodiment is 1.6 times that of the conventional scheme, yet stays at substantially the same order, and, as indicated above by the simulation result in the case of the nine channels and the experiment result in the case of the 11 channels, a great nonlinear waveform distortion correction effect can be obtained. In a case where the number of channels is 21, the calculation amount required for the present embodiment rises to approximately 2.4 times that of the conventional scheme, yet still stays at the same order.
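The comparison can be reproduced directly from the two totals given for the back propagation calculation. The sketch below evaluates them for S=10 with M=2 and no XPM compensation (conventional scheme) and with M=1 and XPM compensation (present embodiment). The function name `total_real_mults` is hypothetical, and the data length N = 2^10 is an illustrative choice, since the data length behind the quoted factors is not stated in the text; the ratio depends on N.

```python
import math


def total_real_mults(n, spans, steps_per_span, channels=1, with_xpm=False):
    """Total real multiplications per channel and per polarization,
    using the closed-form step counts derived in the text."""
    log2n = math.log2(n)
    linear = 4 * n * (log2n - 1)
    nonlinear = 14 * n
    if with_xpm:
        nonlinear += 4 * n * (log2n + 2 * channels - 4)
    steps = steps_per_span * spans
    return linear * (steps + 1) + nonlinear * steps


N, S = 2 ** 10, 10  # N is an illustrative choice; S = 10 follows the text
conventional = total_real_mults(N, S, steps_per_span=2)
for C in (5, 11, 21):
    proposed = total_real_mults(N, S, steps_per_span=1, channels=C, with_xpm=True)
    print(f"C={C:2d}: ratio = {proposed / conventional:.2f}")
```

With N = 2^10 these formulas give ratios of about 1.14, 1.60, and 2.37 for C = 5, 11, and 21, which is in line with the factors of 1.6 and approximately 2.4 quoted above. For longer data lengths the FFT terms weigh more heavily and the ratios become smaller.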

(167) Although the embodiment of the present invention has been described above, the present invention is not limited to this. For example, although the example where the stochastic gradient descent method is used has been described, various variations of the gradient descent method are applicable. Furthermore, as described above, it may be possible to obtain a sufficient effect even when the influences of both SPM and XPM are taken into account for only some of the steps instead of for all steps.

(168) Note that the DSP includes an arithmetic operation unit and a memory. Furthermore, not only the DSP but also another processor may execute the above-described processing. Furthermore, a program for causing the processor to execute the above-described processing is recorded in a non-volatile memory, and the processing is carried out when the commands included in the program are read out and executed by the processor. Furthermore, a dedicated circuit, or a combination of the dedicated circuit and the DSP or the like, may execute the above-described processing.

(169) The aforementioned embodiments are summarized as follows.

(170) A method for correcting optical waveform distortion, which relates to a first aspect in the present embodiments, is an optical waveform distortion correction method for correcting optical waveform distortion by estimating a waveform at a time of transmission by alternately calculating linear terms and nonlinear terms in a nonlinear Schrödinger equation after receiving an optical signal whose waveform shape has been changed by a nonlinear optical effect and a group velocity dispersion effect of an optical fiber that is a transmission line and digitizing a waveform of the received optical signal, characterized in that calculation is performed taking into account not only waveform distortion caused by self-phase modulation that occurs in a channel but also waveform distortion caused by cross-phase modulation that occurs between channels at a time of wavelength-division multiplexing transmission, parameters used for the calculation are optimized by a gradient descent method, and the number of steps per one span of the transmission line is less than 2. This makes it possible to improve the accuracy while suppressing the calculation load.

(171) The aforementioned number of steps per one span of the transmission line may be equal to or less than 1. Even when the number of steps is reduced in this way, it is possible to improve the accuracy.

(172) Furthermore, the aforementioned parameters may include second-order group velocity dispersion, a nonlinear coefficient and walk-off.

(173) A method for correcting optical waveform distortion, which relates to a second aspect in the present embodiment, includes (A) a step of optimizing, by a gradient descent method, a first parameter that is used in back propagation processing and is associated with cross-phase modulation and a second parameter that is used in the back propagation processing and is associated with self-phase modulation and the cross-phase modulation, wherein the back propagation processing is processing to estimate a waveform at a time of transmission by alternately calculating linear terms and nonlinear terms in a nonlinear Schrödinger equation after receiving an optical signal whose waveform shape changed in a transmission line and digitizing a waveform of the received optical signal, and correct, for each channel of plural channels in the transmission line at a time of wavelength-division multiplexing transmission, waveform distortion caused by the self-phase modulation (SPM) that occurs in the channel and waveform distortion caused by the cross-phase modulation (XPM) that occurs in relation with channels other than the channel; and (B) a step of executing the aforementioned back propagation processing by using the optimized first and second parameters.

(174) As described above, by optimizing not only the second parameter (e.g., D2, D3, g, , and .sub.0) but also the first parameter (e.g., .sub.n and d.sub.n) and executing the back propagation processing to correct the waveform distortion caused by the SPM and XPM by using the optimized first and second parameters, even when the calculation load is suppressed by decreasing the number of steps per one span of the transmission line, it becomes possible to obtain sufficient calculation accuracy.
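The idea of tuning a back propagation parameter by a gradient descent method can be illustrated with a toy example. The following is a minimal sketch, not the embodiment itself. It assumes a single dispersion-only linear step with one scalar parameter D, a known reference (training) waveform, plain gradient descent rather than the stochastic variant, and a finite-difference gradient. All function names, the Gaussian test pulse, and the learning rate are hypothetical choices made for illustration.

```python
import numpy as np


def back_propagate(rx, D, omega):
    """One dispersion-only back propagation step with trial parameter D."""
    return np.fft.ifft(np.fft.fft(rx) * np.exp(-1j * D * omega ** 2))


def loss(D, rx, ref, omega):
    """Normalized squared error between the estimate and the reference."""
    err = back_propagate(rx, D, omega) - ref
    return np.sum(np.abs(err) ** 2) / np.sum(np.abs(ref) ** 2)


def optimize_dispersion(rx, ref, omega, D0=0.0, lr=100.0, steps=100, eps=1e-4):
    """Plain gradient descent on D using a central finite difference."""
    D = D0
    for _ in range(steps):
        grad = (loss(D + eps, rx, ref, omega)
                - loss(D - eps, rx, ref, omega)) / (2 * eps)
        D -= lr * grad
    return D


# Assumed toy signal: Gaussian pulse, 256 samples, unit sample spacing.
n = 256
t = np.arange(n) - n / 2
omega = 2 * np.pi * np.fft.fftfreq(n, d=1.0)
tx = np.exp(-t ** 2 / (2 * 4.0 ** 2)).astype(complex)

# Forward "transmission" with an unknown dispersion value D_true.
D_true = 5.0
rx = np.fft.ifft(np.fft.fft(tx) * np.exp(1j * D_true * omega ** 2))

D_hat = optimize_dispersion(rx, tx, omega)
print("estimated D:", D_hat, "true D:", D_true)
```

The learning rate here is hand-picked for this particular pulse. With several parameters (for example the nonlinear coefficient and the walk-off of each channel) the same loop would update a parameter vector, and the loss would typically be evaluated on received symbols rather than on a noiseless reference.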

(175) Incidentally, the aforementioned waveform distortion caused by the cross-phase modulation may be corrected under the approximation that the intensity maintains its initial waveform independent of the propagation distance, and that a delay proportional to the propagation distance occurs along the time axis. It is possible to further suppress the calculation load of the correction (also called compensation) by such approximation.
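One way to exploit this approximation is to replace the integration of each interfering channel's intensity over the step by a single frequency-domain filter. The following is a minimal sketch under stated assumptions, not the embodiment's exact formula: fiber loss is ignored, the XPM weight is taken as 2 times the nonlinear coefficient, the intensity of channel n at distance z is taken as its initial intensity delayed by d_n z, and the function name `xpm_phase` and all numerical values are hypothetical.

```python
import numpy as np


def xpm_phase(other_fields, walkoffs, gamma, h, dt):
    """Approximate XPM phase accumulated over one step of length h.

    other_fields: complex sample arrays of the interfering channels
    walkoffs:     walk-off d_n of each channel (delay per unit distance)
    """
    n = len(other_fields[0])
    f = np.fft.fftfreq(n, d=dt)
    acc = np.zeros(n, dtype=complex)
    for u, d in zip(other_fields, walkoffs):
        # H(f) = integral over z in [0, h] of exp(-i 2 pi f d z):
        # the delayed-intensity integral expressed as one filter.
        H = h * np.exp(-1j * np.pi * f * d * h) * np.sinc(f * d * h)
        acc += np.fft.fft(np.abs(u) ** 2) * H
    return 2.0 * gamma * np.fft.ifft(acc).real


# Assumed toy input: one interfering channel carrying a Gaussian pulse.
t = np.arange(256)
pulse = np.exp(-((t - 100.0) ** 2) / (2 * 20.0 ** 2)).astype(complex)
h, d, gamma = 8.0, 1.0, 1.0          # total walk-off d*h = 8 samples
phi = xpm_phase([pulse], [d], gamma, h, dt=1.0)

# Brute-force check: trapezoidal sum of intensity copies delayed by 0..8 samples.
P = np.abs(pulse) ** 2
weights = np.ones(9)
weights[0] = weights[-1] = 0.5
ref = 2.0 * gamma * sum(w * np.roll(P, k) for k, w in enumerate(weights))
print("max deviation from brute force:", np.max(np.abs(phi - ref)))
```

The filter form needs one forward transform per interfering channel and one inverse transform per step, which is why the approximation keeps the added cost of XPM compensation modest. When the walk-off is zero, the filter reduces to multiplication by h and the result is the usual instantaneous XPM phase.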

(176) Furthermore, the aforementioned second parameter may include group velocity dispersion $D_2$ and a nonlinear coefficient $g$, and the aforementioned first parameter may include a walk-off parameter $d_n$. Limiting the parameters to be optimized in this way makes it possible to further suppress the calculation load.

CPC classification: H04B10/6163, H04B10/2507, H04J14/0201, H04B10/2543, H04J14/02 (all in section H, ELECTRICITY)

International classification: H04B10/2507, H04J14/02 (both in section H, ELECTRICITY)
