PAM-4 RECEIVER WITH JITTER COMPENSATION CLOCK AND DATA RECOVERY

20220385444 · 2022-12-01

    Abstract

    A PAM-4 receiver with jitter compensation clock and data recovery is provided. The receiver includes a first-order delay-locked loop (DLL) which employs a bang-bang phase detector (BBPD) and a voltage-controlled delay line (VCDL) circuit supporting 40 MHz jitter tracking bandwidth and static phase skew elimination. A second-order wideband phase-locked loop (WBPLL) using the ¼-rate reference clock provides multi-phase clock generation with low input-to-output latency. To suppress the consequent jitter transfer, a jitter compensation circuit (JCC) acquires the jitter transfer amplitude and frequency information by detecting the DLL loop filter voltage (VLF(s)) signal, and generates an inverted loop filter voltage signal, denoted as VLF.sub.INV(s). The VLF.sub.INV(s) modulates a group of complementary VCDLs (C-VCDLs) to attenuate the jitter transfer on both recovered clock and data. With the provided receiver, a jitter compensation ratio up to 60% can be supported from DC to 4 MHz, with a −3-dB corner frequency of 40 MHz.

    Claims

    1. A four-level pulse amplitude modulation (PAM-4) receiver with jitter compensation clock and data recovery, comprising: a continuous-time linear equalizer configured to equalize an input data signal; a wide-band phase-locked loop (WBPLL) configured to lock to a quarter-rate delay-locked clock signal to generate a plurality of sampling clock signals with evenly separated phases, the plurality of sampling clock signals including N data-sampling clock signals with phases separated by 360°/N and N/2 edge-sampling clock signals with phases separated by 360°/(N/2) and interleaving with the N data-sampling clock signals, where N is a positive even integer; a data decoder configured to decode the equalized data signal with the N data-sampling clock signals to recover a most significant bit (MSB) signal and a least significant bit (LSB) signal; an edge detector configured to detect edge information of the equalized data signal with the N/2 edge-sampling clock signals to generate an edge information signal; a retimer circuit configured to synchronize the recovered MSB signal, the recovered LSB signal and the edge information signal; a delay-locked loop (DLL) configured to: detect a phase skew of the input signal with reference to the sampling clock signals, produce a delay-line control voltage signal based on the detected phase skew, and generate a delay-locked clock signal based on the delay-line control voltage signal; and a jitter compensation circuit (JCC) configured to: compensate jitter transfer from the input data signal with a complementary delay-line control voltage signal to generate a jitter-compensated recovered clock signal, a jitter-compensated recovered LSB signal and a jitter-compensated recovered MSB signal.

    2. The PAM-4 receiver according to claim 1, wherein the continuous-time linear equalizer has a R-C source degeneration and inductor shunt peaking architecture including a differential shunting peaking inductor L, a pair of drain resistors R.sub.D, a degeneration capacitor Cs and a degeneration resistor Rs.

    3. The PAM-4 receiver according to claim 1, wherein the WBPLL comprises: a voltage-controlled oscillator (VCO) configured to generate the plurality of sampling clock signals based on an oscillator control voltage signal; a phase frequency detector configured to detect a phase difference of the generated sampling clock signals with reference to the delay-locked clock signal, and produce a phase difference signal; and a charge pump and a loop filter configured to convert the phase difference signal to the oscillator control voltage signal.

    4. The PAM-4 receiver according to claim 3, wherein the VCO is a ring oscillator including one or more delay cells and a current source for frequency control.

    5. The PAM-4 receiver according to claim 1, wherein the data decoder comprises: N sample-and-hold circuits configured to sample the equalized data signal with the N data-sampling clock signals to obtain N data samples respectively; a 3-level slicer circuit configured to demodulate each of the N data samples into a thermometer coded bit stream by comparing each of the N data samples with three decision threshold voltage levels; a coding converter configured to convert the thermometer coded bit stream to a binary coded bit stream including a MSB bit stream constituting the recovered MSB signal and a LSB bit stream constituting the recovered LSB signal.

    6. The PAM-4 receiver according to claim 5, wherein the data decoder further comprises a digital-to-analog converter (DAC) configured to generate the three decision threshold voltage levels.

    7. The PAM-4 receiver according to claim 5, wherein the data decoder further comprises a calibration circuit configured to calibrate voltage offsets at the input data signal.

    8. The PAM-4 receiver according to claim 1, wherein the edge detector comprises: N/2 sample-and-hold circuits configured to sample edges on the equalized data signal with the N/2 edge-sampling clock signals to obtain N/2 edge information samples; and a comparator configured to generate the edge information signals by comparing each of the N/2 edge information samples with a decision threshold voltage level.

    9. The PAM-4 receiver according to claim 1, wherein the DLL comprises: a bang-bang phase detector (BBPD) configured to detect the phase skew of the input signal with reference to the sampling clock signals to generate a phase skew signal; a charge pump and a loop filter configured to convert the phase skew signal to the delay-line control voltage signal; a voltage-controlled delay line (VCDL) circuit configured to generate the delay-locked clock signal based on the delay-line control voltage signal and an input clock signal.

    10. The PAM-4 receiver according to claim 9, wherein the DLL further comprises a buffer circuit and duty cycle correction (DCC) circuit configured to correct duty cycles of the input clock signal and convert the input clock signal from a single-ended clock signal to a differential clock signal.

    11. The PAM-4 receiver according to claim 9, wherein the VCDL circuit comprises: one or more voltage-controlled delay cells, each consists of a pair of NMOSs as input devices and a pair of PMOSs as output devices to produce a delayed output signal having a delay time proportional to the delay-line control voltage signal with reference to the input clock signal.

    12. The PAM-4 receiver according to claim 1, wherein the JCC comprises: a complementary signal generator (CSG) configured to convert the delay-line control voltage signal to an inverted delay-line control voltage signal; and a plurality of complementary VCDL (C-VCDL) circuits including: a first C-VCDL circuit configured to compensate, based on the complementary delay-line control voltage signal, an input jitter transferred to the recovered clock signals to generate the jitter-compensated recovered clock signal; a second C-VCDL circuit configured to compensate, based on the inverted delay-line control voltage signal, an input jitter transferred to the recovered LSB data signal to generate the jitter-compensated recovered LSB signal; and a third C-VCDL circuit configured to compensate, based on the inverted delay-line control voltage signal, an input jitter to the recovered MSB data signal to generate a jitter-compensated recovered MSB signal.

    13. The PAM-4 receiver according to claim 12, wherein the CSG comprises: a clock control unit configured to divide a control clock signal and use the divided control clock signal for synchronization; a voltage follower configured to produce a buffered delay-line control voltage signal; a successive approximation register (SAR) analog-to-digital converter (ADC) synchronized with the divided control clock signal and configured to quantize the buffered delay-line control voltage signal to obtain a DC level and produce an analog delay-line control voltage for tracking the DC level; and an inverting follower configured to receive the delay-line control voltage signal and the analog delay-line control voltage to produce the inverted delay-line control voltage signal.

    14. The PAM-4 receiver according to claim 13, wherein the voltage follower comprises a first amplifier having a negative unit gain feedback loop connected between an output of the first amplifier and an inverting input of the first amplifier so as to generate a unit gain.

    15. The PAM-4 receiver according to claim 13, wherein the inverting follower comprises: a second amplifier having a negative feedback loop including a feedback resistor R.sub.fb coupled between an output of the second amplifier and an inverting input of the second amplifier; and an input resistor R.sub.in coupled to the inverting input of the second amplifier; wherein the feedback resistor R.sub.fb and the input resistor R.sub.in are set to have a same resistance so as to generate an inverting unit gain.

    16. The PAM-4 receiver according to claim 13, wherein the SAR-ADC comprises: a comparator configured to receive the buffered delay-line control voltage signal at a first input terminal; a SAR logic circuit coupled to an output of the comparator and configured to provide a digital output; and a digital-to-analog converter (DAC) configured to receive the digital output from the logic circuitry, convert the digital output into the analog delay-line control voltage, and feedback the analog delay-line control voltage into a second input terminal of the comparator.

    17. The PAM-4 receiver according to claim 12, wherein each of the first, second and third C-VCDL circuits comprises one or more complementary voltage-controlled delay cells; each complementary voltage-controlled delay cell consists of a pair of NMOSs as input devices and a pair of PMOSs as output devices to generate a delayed output signal having a delay time proportional to the inverted delay-line control voltage signal with reference to the input clock signal.

    Description

    SUMMARY OF THE INVENTION

    [0011] To solve the above-mentioned challenges, the present invention provides a source synchronous 60-Gb/s ¼-rate PAM-4 receiver with a jitter compensation CDR (JCCDR) in 40-nm CMOS technology, achieving a wide jitter tolerance bandwidth (40 MHz) and an ultralow jitter transfer (<−8 dB).

    [0012] According to one aspect of the present invention, the provided PAM-4 receiver includes a first-order delay-locked loop (DLL) which employs a bang-bang phase detector (BBPD) and a voltage-controlled delay line (VCDL) circuit supporting 40 MHz jitter tracking bandwidth and static phase skew elimination. A second-order wideband phase-locked loop (WBPLL) using the ¼-rate reference clock provides multi-phase clock generation and ensures a sufficiently low input-to-output latency. To suppress the consequent jitter transfer, a jitter compensation circuit (JCC) acquires the jitter transfer amplitude and frequency information by detecting the DLL loop filter voltage (VLF(s)) signal, and generates an inverted loop filter voltage signal, denoted as VLF.sub.INV(s). The VLF.sub.INV(s) modulates a group of complementary VCDLs (C-VCDLs) to attenuate the jitter transfer on both recovered clock and data.

    [0013] With the provided PAM-4 receiver, a jitter compensation ratio up to 60% can be supported from DC to 4 MHz, with a −3-dB corner frequency of 40 MHz. Therefore, the present invention provides a solution to the three challenges in source synchronous I/O, including clock phase deskew, wideband jitter tolerance and jitter transfer attenuation.

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0014] Aspects of the present disclosure may be readily understood from the following detailed description with reference to the accompanying figures. The illustrations may not necessarily be drawn to scale. That is, the dimensions of the various features may be arbitrarily increased or reduced for clarity of discussion. There may be distinctions between the artistic renditions in the present disclosure and the actual apparatus due to manufacturing processes and tolerances. Common reference numerals may be used throughout the drawings and the detailed description to indicate the same or similar components.

    [0015] FIG. 1A shows a conventional source synchronous I/O architecture;

    [0016] FIG. 1B shows a conventional clock and data recovery (CDR) architecture supporting jitter tracking;

    [0017] FIG. 2 shows another conventional CDR architecture;

    [0018] FIG. 3 shows a circuit block diagram of a PAM-4 receiver with jitter compensation clock and data recovery (JCCDR) according to some embodiments of the present invention;

    [0019] FIG. 4 shows an exemplary circuit diagram for a continuous-time linear equalizer (CTLE) according to some embodiments of the present invention;

    [0020] FIG. 5A shows an exemplary circuit diagram for a wide-band phase-locked loop (WBPLL) according to some embodiments of the present invention; and FIG. 5B shows a more detailed exemplary circuit diagram for the wide-band phase-locked loop (WBPLL);

    [0021] FIG. 6 shows an exemplary circuit diagram for a data decoder according to some embodiments of the present invention;

    [0022] FIG. 7 shows an exemplary circuit diagram for an edge detector according to some embodiments of the present invention;

    [0023] FIG. 8 shows a timing diagram for decoding the data signals at clock PH-0, 90 and the edge signal at clock PH-45 according to some embodiments of the present invention;

    [0024] FIG. 9 shows an exemplary circuit diagram for a retimer according to some embodiments of the present invention;

    [0025] FIG. 10 shows an exemplary circuit diagram for a delay-locked loop (DLL) according to some embodiments of the present invention;

    [0026] FIG. 11 shows an exemplary bang-bang phase detector (BBPD) logic circuit used for receiving clock signals at phases PH-0, 45 and 90 and a corresponding transition diagram of its early/late indication signal;

    [0027] FIG. 12 shows an exemplary circuit diagram for a voltage-controlled delay line (VCDL) circuit according to some embodiments of the present invention;

    [0028] FIG. 13 shows an exemplary circuit diagram for a voltage-controlled delay cell according to some embodiments of the present invention;

    [0029] FIG. 14 shows an exemplary circuit diagram for a jitter compensation circuit (JCC) according to some embodiments of the present invention;

    [0030] FIG. 15 shows an exemplary circuit diagram for a complementary signal generator (CSG) according to some embodiments of the present invention;

    [0031] FIG. 16 shows an exemplary circuit diagram for a core amplifier (AMP) according to some embodiments of the present invention;

    [0032] FIG. 17A shows an exemplary circuit diagram for a successive approximation register (SAR) analog-to-digital converter (ADC) according to some embodiments of the present invention;

    [0033] FIG. 17B shows an exemplary circuit diagram for a comparator (CMP) according to some embodiments of the present invention;

    [0034] FIG. 17C shows an exemplary circuit diagram for a regeneration (RG) circuit according to some embodiments of the present invention;

    [0035] FIG. 17D shows an exemplary circuit diagram for a SAR logic unit according to some embodiments of the present invention;

    [0036] FIG. 17E shows an exemplary circuit diagram for a R-2R digital-to-analog converter (DAC) according to some embodiments of the present invention;

    [0037] FIG. 18 shows a timing diagram of the operation process of the SAR-ADC;

    [0038] FIG. 19 shows an exemplary circuit diagram for a complementary VCDL (C-VCDL) circuit according to some embodiments of the present invention;

    [0039] FIG. 20 shows an exemplary circuit diagram for a complementary voltage-controlled delay cell according to some embodiments of the present invention;

    [0040] FIG. 21 shows an exemplary JCCDR architecture supporting ¼-rate PAM-4 operation according to some embodiments of the present invention; and

    [0041] FIG. 22 shows an exemplary circuit diagram for a VCDL and a group of three C-VCDLs with two dummies for improving the layout matching according to some embodiments of the present invention.

    DETAILED DESCRIPTION

    [0042] In the following description, preferred examples of the present disclosure will be set forth as embodiments which are to be regarded as illustrative rather than restrictive. Specific details may be omitted so as not to obscure the present disclosure; however, the disclosure is written to enable one skilled in the art to practice the teachings herein without undue experimentation.

    [0043] FIG. 3 is a circuit block diagram of a PAM-4 receiver 100 with jitter compensation clock and data recovery (JCCDR) according to some embodiments of the present invention. As shown, the PAM-4 receiver 100 may comprise a two-stage continuous-time linear equalizer (CTLE) 110, a wide-band phase-locked loop (WBPLL) 120, a data decoder 130, an edge detector 140, a retimer 150, a delay-locked loop (DLL) 160, and a jitter compensation circuit (JCC) 170.

    [0044] The CTLE 110 is implemented as a front end of the receiver 100 to compensate for moderate channel loss and is configured to equalize a PAM-4 input data signal (DATA.sub.IN). Referring to FIG. 4, in some embodiments, the CTLE 110 may have an R-C source degeneration and inductor shunt peaking architecture including a differential shunting peaking inductor L, a pair of drain resistors R.sub.D1,2, a degeneration capacitor (or source capacitor) Cs and a degeneration resistor (or source resistor) Rs. In some embodiments, the degeneration capacitor Cs and the drain resistors R.sub.D1,2 are adjustable to achieve 2.5 to 11 dB of peaking and a 4-dB gain tuning range.

    [0045] Referring back to FIG. 3, the WBPLL 120 may be configured to receive and lock to a quarter-rate delay-locked clock signal CLK.sub.DLL to generate a plurality of sampling clock signals CLK.sub.REC with evenly separated phases. The plurality of sampling clock signals CLK.sub.REC may include N data-sampling clock signals CLK.sub.REC_DATA with phases separated by 360°/N, where N is a positive even integer. The plurality of sampling clock signals CLK.sub.REC may further include N/2 edge-sampling clock signals CLK.sub.REC_EDGE with phases separated by 360°/(N/2) interleaving with the N data-sampling clock signals. For example, the plurality of sampling clock signals may include 4 data-sampling clock signals with phases separated by 90° (e.g., clock signals at phases 0°, 90°, 180° and 270°, denoted as CLK.sub.REC,PH-0, CLK.sub.REC,PH-90, CLK.sub.REC,PH-180 and CLK.sub.REC,PH-270, respectively) and 2 edge-sampling clock signals with phases separated by 180° (e.g., clock signals at phases 45° and 225°, denoted as CLK.sub.REC,PH-45 and CLK.sub.REC,PH-225, respectively).
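
    By way of a non-limiting illustration, the phase arrangement described above may be sketched as follows. The function name sampling_phases is hypothetical, and placing the first edge phase midway between the first two data phases is an assumption that reproduces the 45° and 225° example for N = 4; it is not stated as a general rule in this disclosure.

```python
def sampling_phases(n):
    """Return (data_phases, edge_phases) in degrees for N data-sampling clocks."""
    if n <= 0 or n % 2:
        raise ValueError("N must be a positive even integer")
    data_step = 360.0 / n          # data phases separated by 360/N
    edge_step = 360.0 / (n // 2)   # edge phases separated by 360/(N/2)
    # Assumption: the first edge phase sits halfway between data phases 0 and 1.
    edge_offset = data_step / 2
    data = [k * data_step for k in range(n)]
    edge = [edge_offset + k * edge_step for k in range(n // 2)]
    return data, edge


data_phases, edge_phases = sampling_phases(4)
print(data_phases)  # data-sampling phases for N = 4
print(edge_phases)  # edge-sampling phases for N = 4
```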

    [0046] Referring to FIG. 5A, in some embodiments, the WBPLL 120 may comprise a voltage-controlled oscillator (VCO) configured to generate the plurality of sampling clock signals CLK.sub.REC based on an oscillator control voltage signal; a phase frequency detector (PFD) configured to detect a phase difference of the generated sampling clock signals (CLK.sub.REC) with reference to the delay-locked clock signal (CLK.sub.DLL), and produce a phase difference signal; a charge pump (CP) and a loop filter (LF) configured to convert the phase difference signal to the oscillator control voltage signal.

    [0047] In some embodiments, the phase difference detection may be realized with a XOR phase detector which produces a high voltage level signal when the states of CLK.sub.REC and CLK.sub.DLL are different from each other, and a low voltage level signal (typically equal to 0 V) when the states of CLK.sub.REC and CLK.sub.DLL are the same with each other.
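
    As a simplified behavioral sketch of the XOR phase detection described above, the following computes the fraction of a clock period during which the XOR output is high for two ideal 50%-duty-cycle square waves. The function name, the ideal waveforms and the normalization of the output to the range 0 to 1 are assumptions made for illustration only.

```python
def xor_pd_average(phase_deg, steps=3600):
    """Average XOR output (0..1) for two ideal square waves offset by phase_deg."""
    high = 0
    for i in range(steps):
        angle = 360.0 * i / steps
        clk_rec = (angle % 360.0) < 180.0                # state of CLK_REC
        clk_dll = ((angle - phase_deg) % 360.0) < 180.0  # state of CLK_DLL
        if clk_rec != clk_dll:                           # XOR is high when states differ
            high += 1
    return high / steps


for phase in (0, 45, 90, 180):
    print(phase, xor_pd_average(phase))
```

    Under these assumptions the averaged output grows linearly with the phase difference between 0° and 180°, which is the quantity the charge pump and loop filter subsequently convert into the oscillator control voltage.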

    [0048] Referring to FIG. 5B, in some embodiments, the VCO may be a multi-stage ring oscillator including one or more (e.g., four) delay cells with an adjustable external current source for frequency control. Compared with other multi-phase clock generation techniques such as IJO and DLL, the WBPLL causes minor phase mismatch due to the intrinsic symmetry of the ring oscillator. The WBPLL uses the synchronous ¼-rate reference clock and produces 8-PH output clocks with the same frequency as the input. Thanks to the high-frequency reference clock, the WBPLL can support a wide loop bandwidth without stability issues, which benefits faster locking, higher ring oscillator phase noise suppression, and lower VCO power consumption.

    [0049] In some implementations, the output clock frequency of the WBPLL may have a tuning range from 3.75 to 7.5 GHz to support 30-to-60-Gb/s PAM-4 operation. The WBPLL bandwidth is set at 400 MHz to ensure its phase and frequency updates can settle much faster than the quarter-rate delay-locked clock signal CLK.sub.DLL, whose bandwidth is designed to be 40 MHz for good jitter tolerance. The 400-MHz PLL bandwidth also supports wideband correlated jitter tracking and pattern-dependent uncorrelated jitter filtering.
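
    The relationship between the stated clock tuning range and the supported data rates can be checked with the short calculation below, which is offered only as an illustrative sketch. It relies on the facts that a PAM-4 symbol carries two bits and that a quarter-rate clock runs at one quarter of the symbol rate; the function name is hypothetical.

```python
BITS_PER_PAM4_SYMBOL = 2  # four amplitude levels encode two bits per symbol


def quarter_rate_clock_ghz(data_rate_gbps, rate_divisor=4):
    """Clock frequency in GHz for a 1/rate_divisor-rate PAM-4 receiver."""
    symbol_rate_gbaud = data_rate_gbps / BITS_PER_PAM4_SYMBOL
    return symbol_rate_gbaud / rate_divisor


for rate in (30, 60):
    print(rate, "Gb/s ->", quarter_rate_clock_ghz(rate), "GHz")
```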

    [0050] Referring back to FIG. 3, the data decoder 130 may be configured to decode the equalized data signal DATA.sub.EQU with the N data-sampling clock signals CLK.sub.REC_DATA (e.g., CLK.sub.REC,PH-0/90/180/270) to recover a most significant bit (MSB) signal (MSB.sub.REC) and a least significant bit (LSB) signal (LSB.sub.REC).

    [0051] Referring to FIG. 6, in some embodiments, the data decoder 130 may include: a calibration circuit configured to calibrate voltage offsets at the PAM-4 input data signal; N sample-and-hold (S/H) circuits (not shown) configured to sample the equalized data signal DATA.sub.EQU with the N data-sampling clock signals CLK.sub.REC_DATA to obtain N data samples respectively; a digital-to-analog converter (DAC) configured to generate three decision threshold voltage levels; a 3-level slicer circuit, which may include three StrongARM comparators (CMP), configured to demodulate each of the N data samples into a thermometer coded bit stream by comparing each of the N data samples with the three decision threshold voltage levels; and a coding converter configured to convert the thermometer coded bit stream to a binary coded bit stream including a MSB bit stream constituting the recovered MSB signal (MSB.sub.REC) and a LSB bit stream constituting the recovered LSB signal (LSB.sub.REC).

    [0052] For example, the input PAM-4 signal may be sampled and deserialized by four S/H circuits with the PH-0/90/180/270 CLK.sub.REC signals. Next, the sampled signals are decoded using the three StrongARM CMP with individual reference voltages generated from a 6-bit current-mode DAC for slicing the top, middle, and bottom data eyes. The offsets at the input MOSFET devices of the StrongARM CMPs are calibrated upon startup using a 6-bit DAC as the calibration circuit. The decoded 4×3-bit thermometer codes (Tcode) are then converted into 4×2-bit binary codes (Bcode) as MSB.sub.REC and LSB.sub.REC.
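
    A behavioral sketch of the slicing and Tcode-to-Bcode conversion is given below for illustration. The function names, the normalized threshold values and the plain binary mapping of the four levels (00, 01, 10, 11 from lowest to highest) are assumptions; the disclosure does not specify the bit mapping or the threshold voltages.

```python
VALID_TCODES = {(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)}


def slice_pam4(sample, thresholds):
    """3-level slicer: compare one sample with (low, mid, high) thresholds."""
    low, mid, high = thresholds
    return (int(sample > low), int(sample > mid), int(sample > high))


def tcode_to_bcode(tcode):
    """Convert a 3-bit thermometer code to (MSB, LSB), assuming binary mapping."""
    if tcode not in VALID_TCODES:
        raise ValueError("not a valid thermometer code: %r" % (tcode,))
    t_low, t_mid, t_high = tcode
    msb = t_mid                            # middle eye decides the MSB
    lsb = t_high | (t_low & (1 - t_mid))   # top eye, or bottom eye when MSB is 0
    return msb, lsb


# Hypothetical normalized PAM-4 levels and thresholds.
thresholds = (-0.5, 0.0, 0.5)
for level in (-0.75, -0.25, 0.25, 0.75):
    print(level, tcode_to_bcode(slice_pam4(level, thresholds)))
```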

    [0053] Referring back to FIG. 3, the edge detector 140 may be configured to detect edge information of the equalized data signal DATA.sub.EQU with the N/2 edge-sampling clock signals CLK.sub.REC_EDGE to generate an edge information signal EDGE.

    [0054] Referring to FIG. 7, in some embodiments, the edge detector 140 may comprise N/2 S/H circuits configured to sample edges on the equalized data signal DATA.sub.EQU with the N/2 edge-sampling clock signals CLK.sub.REC_EDGE to obtain N/2 edge information samples; and a comparator (CMP) configured to generate the edge information signal by comparing each of the N/2 edge information samples with a decision threshold voltage level. For example, the PAM-4 signal edge information may be detected by two additional S/Hs and CMPs clocked by PH-45/225 CLK.sub.REC signals.

    [0055] FIG. 8 shows a timing diagram for decoding the data signals at clock PH-0, 90 and the edge signal at clock PH-45. The same timing sequence applies to data and edge decoding on clock PH-180, 270 and 225. At the first rising edges of PH-0, 90 and 45, the data and edge signals are sampled and held on sampling capacitors. On the following falling edges, the data and edge signals are decoded using CMPs with three reference levels. The decoded data signals are in the Tcode format and are converted into the Bcode format at the second clock rising edges. The decoded data and edge information are synchronized by clock PH-225, and then sent to the DLL 160 for phase detection.

    [0056] Referring back to FIG. 3, the retimer 150 may be configured to synchronize the recovered data signal (i.e., the recovered MSB and LSB signals, MSB.sub.REC and LSB.sub.REC) and the recovered edge information signal EDGE.

    [0057] Referring to FIG. 9, in some embodiments, the retimer 150 may comprise one or more D-type flip-flop (DFF) retiming circuits. The DFFs are configured to be synchronized by a single clock (e.g., CLK PH-225) to generate synchronized MSB.sub.RECSYN, LSB.sub.RECSYN and EDGE.sub.SYN, respectively.

    [0058] Referring back to FIG. 3, the DLL 160 may be configured to: detect a phase skew of the input PAM-4 signal with reference to the sampling clock signals CLK.sub.REC, produce a delay-line control voltage signal VLF(s) based on the detected phase skew, and generate a delay-locked clock signal CLK.sub.DLL based on the delay-line control voltage signal VLF(s).

    [0059] The delay-line control voltage signal VLF(s) consists of a DC component VLF.sub.DC for fixing the locked timing point and an AC component VLF.sub.AC for tracking high-frequency jitter. Typically, the VLF.sub.DC varies from 0.15 V to 0.85 V, while VLF.sub.AC exhibits an amplitude of tens of mV and a bandwidth within 40 MHz.

    [0060] Referring to FIG. 10, in some embodiments, the DLL 160 may comprise a bang-bang phase detector (BBPD) configured to detect the phase skew of the input PAM-4 signal with reference to the sampling clock signals CLK.sub.REC to generate the phase skew signal; a charge pump (CP) and a capacitor-resistor-capacitor (C-R-C) loop filter (LF) configured to convert the phase skew signal to the delay-line control voltage signal VLF; and a voltage-controlled delay line (VCDL) circuit configured to generate the delay-locked clock signal CLK.sub.DLL based on the delay-line control voltage signal VLF(s) and the input clock signal CLK.sub.IN.

    [0061] In some implementations, the CP may have an output current of 50 to 100 µA. Since the on-off switching of the CP current can cause a relatively large supply variation, the C-R-C loop filter decouples the variation on the CP power supply from the VCDL power supply. The VLF(s) regulates the VCDL to generate the CLK.sub.DLL, which tracks the jitter from the input PAM-4 signal.

    [0062] The DLL 160 may further comprise a buffer (Buf) circuit and a duty cycle correction (DCC) circuit configured to correct duty cycles of the input clock signal CLK.sub.IN and convert the input clock signal from a single-ended clock signal to a differential clock signal.

    [0063] FIG. 11 shows an exemplary BBPD logic circuit used for receiving clock signals at phases PH-0, 45 and 90 and a corresponding transition diagram of its early/late indication signal. The same circuit is also employed for PH-180, 225 and 270. As shown, the BBPD only produces the Late and Early signals when the MSB and LSB data on the rising edges of two consecutive clock cycles are both different from each other. For example, the BBPD may produce a 1-bit clock early/late indication signal as the phase skew signal by comparing the states of two consecutive MSB.sub.REC, LSB.sub.REC signals and one EDGE signal in between. When the state of MSB.sub.REC/LSB.sub.REC-EDGE-MSB.sub.REC/LSB.sub.REC is 1/1-0-0/0 or 0/0-1-1/1, clock early information is produced. When the state of MSB.sub.REC/LSB.sub.REC-EDGE-MSB.sub.REC/LSB.sub.REC is 1/1-1-0/0 or 0/0-0-1/1, clock late information is produced. Other states are filtered out and not adopted for phase detection.
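
    The early/late decision rule stated above may be modeled, purely for illustration, as a lookup over the four listed state sequences. The function name bbpd and the use of None for filtered-out states are assumptions of this sketch and do not describe the logic gates of FIG. 11.

```python
# Keys are (previous (MSB, LSB), EDGE, next (MSB, LSB)) as listed in the text.
BBPD_TABLE = {
    ((1, 1), 0, (0, 0)): "early",
    ((0, 0), 1, (1, 1)): "early",
    ((1, 1), 1, (0, 0)): "late",
    ((0, 0), 0, (1, 1)): "late",
}


def bbpd(prev_symbol, edge, next_symbol):
    """Return 'early', 'late', or None when the transition is filtered out."""
    return BBPD_TABLE.get((prev_symbol, edge, next_symbol))


print(bbpd((1, 1), 0, (0, 0)))  # one of the listed clock-early sequences
print(bbpd((0, 0), 0, (1, 1)))  # one of the listed clock-late sequences
print(bbpd((1, 0), 1, (0, 1)))  # not a listed sequence, so it is ignored
```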

    [0064] Referring to FIG. 12, in some embodiments, the VCDL circuit may include one or more voltage-controlled delay cells to produce a delay time proportional to the detected phase skew to generate the delay-locked clock signal CLK.sub.DLL; and a duty cycle correction (DCC) block comprising a cross coupled PMOS pair for correcting duty cycle.

    [0065] Referring to FIG. 13, in some embodiments, each of the voltage-controlled delay cells may consist of a pair of NMOSs as input devices and a pair of PMOSs as output devices to produce a delayed output signal, that is the delay-locked clock signal CLK.sub.DLL, which has a delay time proportional to the delay-line control voltage signal VLF(s) with reference to the input clock signal CLK.sub.IN.

    [0066] Referring back to FIG. 3, the jitter compensation circuit JCC 170 may be configured to compensate jitter transfer from the PAM-4 input data signal with an inverted delay-line control voltage signal VLF.sub.INV(s) to generate a jitter-compensated recovered clock signal CLK.sub.RECJC, a jitter-compensated recovered LSB signal LSB.sub.RECJC and a jitter-compensated recovered MSB signal MSB.sub.RECJC.

    [0067] Referring to FIG. 14, in some embodiments, the JCC 170 may comprise a lock detector; a complementary signal generator (CSG) configured to convert the delay-line control voltage signal to the inverted delay-line control voltage signal VLF.sub.INV; and a plurality of complementary VCDL (C-VCDL) circuits. The inverted delay-line control voltage signal VLF.sub.INV(s) produced by the CSG may have the same DC level and AC amplitude as the delay-line control voltage signal VLF(s) but opposite AC phase. The DC level of VLF.sub.INV(s), denoted as VLF.sub.INVDC, is fixed close to VLF.sub.DC with a negligible error caused by the insufficient gain of the core AMP and the ADC nonlinearity.

    [0068] Referring to FIG. 15, in some embodiments, the CSG may include a clock control unit configured to divide a control clock signal CLK.sub.REC,CTRL (e.g., a clock signal at PH-135) by a suitable factor (e.g., 128); a voltage follower configured to buffer the delay-line control voltage signal VLF(s) to produce a buffered delay-line control voltage signal VLF.sub.Buf(s); a successive approximation register (SAR) analog-to-digital converter (ADC) synchronized with the divided control clock signal and configured to quantize the buffered delay-line control voltage signal VLF.sub.Buf(s) to obtain a DC level and produce an analog delay-line control voltage VLF.sub.DAC for tracking the DC level; and an inverting follower configured to receive the analog delay-line control voltage VLF.sub.DAC and the delay-line control voltage signal VLF(s) to produce the inverted delay-line control voltage signal VLF.sub.INV(s).

    [0069] In some embodiments, the voltage follower may include a first core amplifier (AMP) with rail-to-rail input and output connected as unit gain feedback. That is, the first core amplifier may have a negative unit gain feedback loop connected between an output of the AMP and an inverting input of the AMP so as to generate a unit gain.

    [0070] In some embodiments, the inverting follower may comprise a second core amplifier (AMP) having a negative feedback loop formed with a feedback resistor R.sub.fb coupled between an output of the second amplifier and an inverting input of the second amplifier; and an input resistor R.sub.in coupled to the inverting input of the second amplifier. The feedback resistor R.sub.fb and the input resistor R.sub.in are set to have a same resistance (typically equal to 10 kΩ) so as to generate an inverting unit gain (i.e., an inverting gain close to 1).
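
    The following sketch illustrates how an inverting follower with equal resistors can produce a signal with the same DC level and opposite AC phase. It assumes an ideal amplifier and assumes that VLF(s) is applied through R.sub.in while VLF.sub.DAC drives the non-inverting input; that connection and the numeric values are assumptions made for illustration and are not confirmed by this description.

```python
def inverting_follower(v_lf, v_dac, r_fb=10e3, r_in=10e3):
    """Ideal op-amp output with v_lf on R_in and v_dac on the non-inverting input."""
    gain = r_fb / r_in
    # Standard ideal-amplifier result: Vout = Vref * (1 + G) - Vin * G.
    return v_dac * (1 + gain) - v_lf * gain


v_dc = 0.5    # hypothetical DC level reproduced on the DAC, in volts
v_ac = 0.02   # hypothetical instantaneous AC excursion, in volts
v_inv = inverting_follower(v_dc + v_ac, v_dc)
print(v_inv)  # close to v_dc - v_ac under the stated assumptions
```

    With R.sub.fb equal to R.sub.in, the expression reduces to 2·VLF.sub.DAC − VLF, so that an excursion above the DC level at the input appears as an equal excursion below the DC level at the output.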

    [0071] In some embodiments, the SAR-ADC may include a comparator (CMP) with regeneration (RG) configured to receive the buffered delay-line control voltage signal VLF.sub.Buf(s) at a first input terminal; a SAR logic circuit coupled to an output of the comparator and configured to provide a digital output; and a digital-to-analog converter (DAC) (e.g., an R-2R DAC) configured to receive the digital output from the SAR logic circuit, convert the digital output into the analog delay-line control voltage VLF.sub.DAC(s), and feed the analog delay-line control voltage VLF.sub.DAC(s) back into a second input terminal of the comparator. As such, upon receiving the enabling signal VENABLE from the lock detector, the SAR-ADC can start operation to detect, reproduce, and maintain the DC level of VLF.sub.Buf on the R-2R DAC as VLF.sub.DAC, which can be designed to track VLF.sub.DC with an error typically less than 7 mV.

    [0072] FIG. 16 shows an exemplary architecture of a 2-stage amplifier for implementing each of the first core AMP and the second core AMP. As shown, the 2-stage amplifier may have both PMOS and NMOS input devices to support rail-to-rail input and output ranges, which completely cover the VLF.sub.DC range, e.g., from 0.25 V to 0.85 V.

    [0073] FIG. 17A shows an exemplary block diagram of an 8-bit SAR-ADC, consisting of a StrongARM comparator (CMP) as shown in FIG. 17B with a regeneration (RG) circuit as shown in FIG. 17C, an 8-bit SAR logic and an 8-bit R-2R ladder-based DAC.

    [0074] The 8-bit SAR logic circuit consists of eight identical SAR logic units. As shown in FIG. 17D, each SAR logic unit contains two sequence-control D-Flip-Flops (SDFF) and one coding DFF (CDFF) to produce the switch control bit for the corresponding R-2R unit.

    [0075] As shown in FIG. 17E, the R-2R DAC directly uses VDD and VSS as reference levels to cover the whole VLF.sub.DC range.

    [0076] FIG. 18 shows a timing diagram of the operation process of the 8-bit SAR ADC. The principle of the SAR ADC is to pre-set each DAC control bit to 1 as a predicted value successively, and then update the control bit after comparing the predicted value with the input. The operation of each SAR logic unit takes two clock cycles. In the first clock cycle, the CDFF sets (S) the R-2R unit control bit to 1 for prediction. The StrongARM CMP starts the comparison process at the clock rising edge, and is reset at the following clock falling edge. The return-to-zero (RZ) code produced by the CMP is converted into non-return-to-zero (NRZ) format by the RG circuit. In the second clock cycle, the CDFF output is updated (U) with the CMP comparison result.
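
    The set-then-update principle described above can be sketched behaviorally as follows. This is a minimal illustration rather than the disclosed circuit: the function name `sar_convert`, the 1.0 V supply, the ideal R-2R transfer function, and the comparator polarity (keep the bit when the input is not below the prediction) are all assumptions made for the example.

```python
def sar_convert(v_in, vdd=1.0, vss=0.0, bits=8):
    """Behavioral successive-approximation search.

    Returns (code, v_dac), where v_dac is the ideal DAC level for the final code.
    vdd, vss and the comparator polarity are hypothetical choices for this sketch.
    """
    full_scale = 1 << bits
    code = 0
    for bit in reversed(range(bits)):          # MSB first, one SAR logic unit per bit
        trial = code | (1 << bit)              # S phase: preset this bit to 1 as a prediction
        v_pred = vss + (vdd - vss) * trial / full_scale  # ideal R-2R DAC output
        if v_in >= v_pred:                     # U phase: keep the bit only if the
            code = trial                       # input is not below the prediction
    v_dac = vss + (vdd - vss) * code / full_scale
    return code, v_dac


code, v_dac = sar_convert(0.6)
print(code, v_dac)
```

    With the hypothetical 1.0 V reference span used here, one least significant bit corresponds to 1/256 V, so the final DAC level sits within about 3.9 mV below the input for inputs inside the reference range.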

    [0077] Referring back to FIG. 14, the plurality of C-VCDL circuits may include: a first C-VCDL circuit configured to compensate, based on the inverted delay-line control voltage signal VLF.sub.INV(s), an input jitter transferred to the recovered clock signals (CLK.sub.REC) to generate the jitter-compensated recovered clock signal (CLK.sub.RECJC); a second C-VCDL circuit configured to compensate, based on the inverted delay-line control voltage signal VLF.sub.INV(s), an input jitter transferred to the recovered LSB data signal (LSB.sub.REC) to generate the jitter-compensated recovered LSB signal (LSB.sub.RECJC); and a third C-VCDL circuit configured to compensate, based on the inverted delay-line control voltage signal VLF.sub.INV(s), an input jitter transferred to the recovered MSB data signal (MSB.sub.REC) to generate the jitter-compensated recovered MSB signal (MSB.sub.RECJC).

    [0078] Referring to FIG. 19, in some embodiments, each C-VCDL circuit may include one or more complementary voltage-controlled delay cells. Referring to FIG. 20, each complementary voltage-controlled delay cell consists of a pair of NMOSs as input devices and a pair of PMOSs as output devices to generate a delayed output signal having a delay time proportional to the inverted delay-line control voltage signal VLF.sub.INV(s) with reference to the input clock signal CLK.sub.IN.

    [0079] In other words, the first C-VCDL circuit may include one or more complementary voltage-controlled delay cells for generating the jitter-compensated recovered clock signal CLK.sub.RECJC which has a delay time proportional to the inverted delay-line control voltage signal VLF.sub.INV(s) with reference to the input clock signal CLK.sub.IN.

    [0080] The second C-VCDL circuit may include one or more complementary voltage-controlled delay cells for generating the jitter-compensated recovered LSB signal LSB.sub.RECJC which has a delay time proportional to the inverted delay-line control voltage signal VLF.sub.INV(s) with reference to the input clock signal CLK.sub.IN.

    [0081] The third C-VCDL circuit may include one or more complementary voltage-controlled delay cells for generating the jitter-compensated recovered MSB signal MSB.sub.RECJC which has a delay time proportional to the inverted delay-line control voltage signal VLF.sub.INV(s) with reference to the input clock signal CLK.sub.IN.

    [0082] FIG. 21 shows an exemplary JCCDR architecture supporting ¼-rate PAM-4 operation according to some embodiments of the present invention. As shown in FIG. 21, a first-order DLL tracks the jitter on the input PAM-4 signal using a PAM-4 BBPD, a charge pump (CP), a loop filter (not shown), and a VCDL. The VCDL is controlled by VLF(s) to generate the ¼-rate CLK.sub.DLL, which carries jitter almost identical to that of the input PAM-4 signal. A second-order 400-MHz WBPLL uses CLK.sub.DLL as its reference to produce the 8-phase (PH-0/45 . . . /270/315) clocks, denoted as CLK.sub.REC, for the PAM-4 signal decoding. The 400-MHz WBPLL bandwidth ensures fast frequency and phase updates, so that the WBPLL does not affect the DLL dynamics. The recovered 8-phase CLK.sub.REC clocks synchronize the PAM-4 decoder to generate the recovered most significant bit (MSB.sub.REC) and least significant bit (LSB.sub.REC). A jitter compensation circuit (JCC) consisting of a complementary signal generator (CSG) and VCDL replicas is used to attenuate the jitter transfer onto CLK.sub.RECJC, MSB.sub.RECJC and LSB.sub.RECJC. The CSG yields an inverted loop filter voltage VLF.sub.INV(s) for controlling the VCDL replicas, which thereby form the C-VCDLs. VLF.sub.INV(s) is designed to have the same amplitude as VLF(s) but inverted phase. CLK.sub.REC, MSB.sub.REC, and LSB.sub.REC are fed to the C-VCDLs controlled by VLF.sub.INV(s) to negate the jitter transfer and deliver the jitter-compensated outputs CLK.sub.RECJC, MSB.sub.RECJC, and LSB.sub.RECJC, which theoretically carry no jitter transferred from the input PAM-4 signal. Therefore, the undesirable jitter transfer can be attenuated.
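
    The cancellation idea can be illustrated with a small numeric sketch. It assumes an idealized delay line whose delay is exactly proportional to its control voltage and an ideal CSG that mirrors VLF about its DC level; the gain K_VCDL, the DC level VLF_DC, and the 50 mV sinusoidal disturbance are hypothetical values chosen only for illustration.

```python
import math

K_VCDL = 100e-12   # hypothetical delay gain in seconds per volt
VLF_DC = 0.55      # hypothetical DC level in volts (inside the 0.25 V to 0.85 V range)


def vcdl_delay(v_ctrl):
    """Idealized delay line: delay proportional to the control voltage."""
    return K_VCDL * v_ctrl


def vlf_inv(vlf):
    """Ideal complementary signal: same DC level, same AC amplitude, opposite AC phase."""
    return 2.0 * VLF_DC - vlf


# One period of a sinusoidal loop-filter disturbance caused by tracked input jitter.
samples = [VLF_DC + 0.05 * math.sin(2 * math.pi * n / 16) for n in range(16)]

# Delay seen by the recovered clock without compensation (VCDL only).
uncompensated = [vcdl_delay(v) for v in samples]

# Delay after passing through a C-VCDL driven by the inverted voltage.
compensated = [vcdl_delay(v) + vcdl_delay(vlf_inv(v)) for v in samples]

spread_uncompensated = max(uncompensated) - min(uncompensated)
spread_compensated = max(compensated) - min(compensated)
print(spread_uncompensated, spread_compensated)
```

    In this idealized model the compensated delay stays at the constant value 2 × K_VCDL × VLF_DC, so the delay variation caused by the tracked jitter vanishes, while the uncompensated path shows the full peak-to-peak variation.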

    [0083] The principle of jitter compensation can also be illustrated using loop dynamic analysis. The closed-loop transfer function (CLTF) from the input data jitter to the DLL loop filter voltage can be derived as:

    [00001] $$H_{VLF}(s) = \frac{VLF(s)}{\phi_{in}(s)} = \frac{R_T K_{BBPD} K_{CP} \frac{1}{sC}}{1 + R_T K_{BBPD} K_{CP} \frac{1}{sC}} \tag{1}$$

    [0084] where Φ.sub.in(s) stands for the input jitter, R.sub.T stands for the transition ratio (typically equal to 0.5), and K.sub.BBPD and K.sub.CP represent the gains of the BBPD and the charge pump (CP), respectively. The effect of the WBPLL is not included because its loop bandwidth is ten times that of the DLL.

    [0085] The CLTF from Φ.sub.in(s) to the recovered clock phase Φ.sub.CLKREC(s) can be represented by:

    [00002] $$H_{\phi_{CLKREC}}(s) = \frac{\phi_{CLKREC}(s)}{\phi_{in}(s)} = \frac{R_T K_{BBPD} K_{CP} K_{VCDL} \frac{1}{sC}}{1 + R_T K_{BBPD} K_{CP} K_{VCDL} \frac{1}{sC}} \tag{2}$$

    [0086] where K.sub.VCDL represents the gain of the VCDL circuit.

    [0087] Eq. (2) illustrates the jitter transfer behavior of the DLL. The 3-dB bandwidth of Eq. (2) determines the jitter tolerance bandwidth, defined as:

    [00003] $$\text{Jitter Tolerance Bandwidth} = \frac{R_T K_{BBPD} K_{CP} K_{VCDL}}{C} \tag{3}$$
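
    As a brief check of why this quantity is the 3-dB corner, Eq. (2) can be rewritten as a first-order low-pass response. The symbol ω.sub.0 is introduced here only as shorthand for the right-hand side of Eq. (3):

$$\omega_0 = \frac{R_T K_{BBPD} K_{CP} K_{VCDL}}{C}, \qquad H_{\phi_{CLKREC}}(s) = \frac{\omega_0}{s + \omega_0}$$

$$\left|H_{\phi_{CLKREC}}(j\omega)\right| = \frac{1}{\sqrt{1 + (\omega/\omega_0)^2}}, \qquad \left|H_{\phi_{CLKREC}}(j\omega_0)\right| = \frac{1}{\sqrt{2}} \approx -3\ \text{dB}$$

    Read this way, Eq. (3) is an angular frequency; the corresponding corner in hertz would be ω.sub.0/(2π).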

    [0088] The CLTF from Φ.sub.in(s) to the phase of the jitter-compensated clock Φ.sub.CLKRECJC(s) can be determined by:

    [00004] $$H_{\phi_{CLKRECJC}}(s) = \frac{\phi_{CLKREC}(s) + \phi_{in}(s) H_{VLF}(s) K_{CSG}(s) K_{VCDL}}{\phi_{in}(s)} = \frac{R_T K_{BBPD} K_{CP} \frac{1}{sC}}{1 + R_T K_{BBPD} K_{CP} K_{VCDL} \frac{1}{sC}} \times \left(K_{VCDL} + K_{CSG}(s) K_{PV} K_{VCDL}\right) \tag{4}$$

    [0089] where K.sub.CSG represents the gain of the CSG circuit and K.sub.PV represents the gain induced by process variation.
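
    One way to read Eq. (4) is to factor K.sub.VCDL out of the bracket and compare the result with Eq. (2). The following restatement is offered as an interpretive aid:

$$H_{\phi_{CLKRECJC}}(s) = \left(1 + K_{CSG}(s) K_{PV}\right) H_{\phi_{CLKREC}}(s)$$

    so the residual jitter transfer is the uncompensated transfer of Eq. (2) scaled by the factor 1 + K.sub.CSG(s)K.sub.PV, which goes to zero when the product K.sub.CSG(s)K.sub.PV equals −1.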

    [0090] Ideally, K.sub.CSG is equal to −1, generating a VLF.sub.INV(s) signal with exactly the same amplitude as VLF(s) and inverted phase, such that complete jitter transfer compensation can be achieved. However, two non-ideal factors cause K.sub.CSG to deviate from −1: the AC gain errors in the voltage follower and the inverting follower, and the DC offset between VLF.sub.DC and VLF.sub.INVDC. Therefore, K.sub.CSG can be represented as:


    K.sub.CSG=K.sub.ACgainK.sub.DCoffset  (6)

    [0091] where K.sub.ACgain is the AC gain of the voltage follower and the inverting follower, and K.sub.DCoffset is the DC offset gain of the voltage follower and the SAR ADC in the CSG circuit.

    [0092] The DC offset gain may be calculated using the equation:

    [00005] $$K_{DCoffset} = \frac{K_{VCDL}(VLF_{DC} + DCoffset)}{K_{VCDL}(VLF_{DC})} \tag{7}$$

    [0093] where K.sub.VCDL(VLF.sub.DC) represents the K.sub.VCDL value at VLF.sub.DC.

    [0094] As described previously, the function of the CSG is to produce VLF.sub.INV(s) with the same amplitude as VLF(s) and inverted phase. The mismatch factor between K.sub.VCDL and K.sub.CVCDL due to local process variation is included in K.sub.PV, which is close to 1 with a fully symmetric layout.
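
    To make the effect of these non-idealities concrete, the sketch below evaluates Eqs. (6) and (7) and the resulting residual factor 1 + K.sub.CSG K.sub.PV obtained by factoring Eq. (4). Everything numeric here is hypothetical: the linear K.sub.VCDL curve, its slope, the sample gain error and offset, and the convention that K.sub.ACgain carries the negative sign are assumptions for illustration only.

```python
def k_vcdl(v, k0=100e-12, slope=0.2, v0=0.55):
    """Hypothetical VCDL gain curve in s/V: k0 at v0, varying linearly with bias voltage."""
    return k0 * (1.0 + slope * (v - v0))


def k_dc_offset(vlf_dc, dc_offset):
    """Eq. (7): VCDL gain at the offset bias point divided by the gain at VLF_DC."""
    return k_vcdl(vlf_dc + dc_offset) / k_vcdl(vlf_dc)


def residual_factor(k_ac_gain, vlf_dc, dc_offset, k_pv=1.0):
    """Return 1 + K_CSG * K_PV, with K_CSG taken from Eq. (6).

    k_ac_gain is assumed negative (ideal value -1); a result of 0 means
    the transferred jitter is completely compensated.
    """
    k_csg = k_ac_gain * k_dc_offset(vlf_dc, dc_offset)   # Eq. (6)
    return 1.0 + k_csg * k_pv


ideal = residual_factor(-1.0, 0.55, 0.0)
non_ideal = residual_factor(-0.98, 0.55, 0.007)   # 2% gain error, 7 mV offset (hypothetical)
print(ideal, non_ideal)
```

    Under these assumed numbers the ideal case leaves no residual, while the non-ideal case leaves a residual of a little under 2% of the uncompensated jitter transfer.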

    [0095] In a real CMOS implementation, the offset and gain error in the CSG, and the mismatch between the VCDL and the C-VCDL due to process variation, can degrade the jitter transfer compensation performance. In order to ensure better matching, the VCDL and C-VCDL circuits are placed close to each other and protected by dummies at both ends in the circuit layout, as shown in FIG. 22. The VCDL is composed of a number (e.g., four as depicted) of single voltage-controlled delay cells and a duty cycle correction (DCC) block. Each delay cell consists of a pair of NMOSs as input devices and a pair of PMOSs controlled by VLF(s) or VLF.sub.INV(s) to determine the delay time. A cross-coupled PMOS pair is used to correct the duty cycle. The delay cell with a current source transfers less supply variation into output jitter.

    [0096] The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, thereby enabling others skilled in the art to understand the invention for various embodiments and with various modifications that are suited to the particular use contemplated. While the methods disclosed herein have been described with reference to particular operations performed in a particular order, it will be understood that these operations may be combined, sub-divided, or re-ordered to form an equivalent method without departing from the teachings of the present disclosure. Accordingly, unless specifically indicated herein, the order and grouping of the operations are not limitations. While the apparatuses disclosed herein have been described with reference to particular structures, shapes, materials, composition of matter and relationships . . . etc., these descriptions and illustrations are not limiting. Modifications may be made to adapt a particular situation to the objective, spirit and scope of the present disclosure. All such modifications are intended to be within the scope of the claims appended hereto.