System of Free Running Oscillators for Digital System Clocking Immune to Process, Voltage and Temperature (PVT) Variations

20230087096 · 2023-03-23

    Abstract

    A system of free running oscillators, synchronized to the slowest oscillator in the system and following PVT variation, generates a system clock. The method is particularly applicable to clocking relatively small clock domains within a multi-core chip containing thousands of cores, where a clock domain encompasses one or more cores and additional logic blocks. The resulting system clock is divided by 2^k using latches or flip-flops to achieve a symmetric 50-50 duty cycle. Further, such a PVT-insensitive system clock can be used as a reference for a PLL- or DLL-generated clock for the domain.

    Claims

    1. A system of free running oscillators whose frequency is determined by PVT, running in synchrony with each other providing a composite system clock.

    2. The system of claim 1 synchronized by a clock grid.

    3. The system of claim 1 including interleaved free running oscillators whose frequency is determined by PVT, running in synchrony with each other providing a composite system clock.

    4. The system of free running oscillators of claim 1 producing a system clock running at the lowest frequency of FROs in the system.

    5. The system of claim 2 providing a reference signal for a PLL clocking a particular clock domain.

    6. The system of claim 2 providing a reference signal for a DLL clocking a particular clock domain.

    7. The system of claim 2, the frequency of which is divided by a factor of 2^k to produce a 50-50 clock duty cycle.

    8. The system of claim 2 communicating asynchronously with other clock domains.

    9. An apparatus comprising: an integrated circuit substrate providing different circuit speeds depending on location on the substrate; an integrated circuit comprising a plurality of circuits formed in said integrated circuit substrate and spanning a defined area of said integrated circuit substrate; a plurality of ring oscillators formed in said defined area of said integrated circuit at different locations; said ring oscillators connected to each other; a clock distribution system connected to said plurality of circuits of said integrated circuit; and said plurality of ring oscillators connected to said clock distribution system.

    10. The apparatus according to claim 9, wherein the plurality of ring oscillators are interconnected to provide a clock speed at the interconnection to said clock distribution system which is an average of the speed of said plurality of ring oscillators.

    11. The apparatus according to claim 9, wherein the plurality of ring oscillators are interconnected to provide a clock speed at the interconnection to said clock distribution system which is the lowest speed of said plurality of ring oscillators.

    12. The apparatus according to claim 9, wherein the plurality of ring oscillators are spaced and arranged within said defined area of said integrated circuit substrate in a pattern such that each ring oscillator substantially spans the defined area.

    13. The apparatus according to claim 12, wherein the pattern is interleaved.

    14. The apparatus according to claim 12, wherein the pattern is a spiral.

    15. The apparatus according to claim 11, wherein each of the plurality of ring oscillators comprises an odd number of inverters, wherein the first and last inverters of each of the plurality of ring oscillators are NAND gates.

    16. The apparatus according to claim 15, wherein the first NAND gate of each of the ring oscillators has at least one input connected to the output of another of said plurality of ring oscillators.

    17. The apparatus according to claim 15, wherein the last NAND gate of each of the plurality of ring oscillators has at least one input connected to the second to last inverter of another ring oscillator of said plurality of ring oscillators.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0020] The present invention is described with reference to the following drawings, wherein like reference numbers denote substantially similar elements:

    [0021] FIG. 1 shows a silicon chip divided into a multiplicity of clock domains;

    [0022] FIG. 2 shows a free running oscillator (FRO);

    [0023] FIG. 3A shows an example of multiple FROs;

    [0024] FIG. 3B shows simulation results of an experiment conducted using Cadence CAD simulation tool on the FROs shown in FIG. 3A disconnected from one another;

    [0025] FIG. 3C also shows simulation results of an experiment conducted using Cadence CAD simulation tool on the structure (connected FROs) shown in FIG. 3A;

    [0026] FIG. 4A shows another topology of a system of FROs;

    [0027] FIG. 4B shows the clock signals originating from three of the FROs shown in the structure of FIG. 4A;

    [0028] FIG. 4C shows clocking signals from the main system clock (emphasized) and two signals from the two FROs in the middle of the domain;

    [0029] FIG. 5A shows a possible topology for placing FROs within the domain;

    [0030] FIG. 5B shows another possible topology for placing FROs within the domain;

    [0031] FIG. 6 shows two FROs that are linked with each other;

    [0032] FIG. 7 shows an arrangement of four oscillators in a domain;

    [0033] FIG. 8 shows a system that generates a system clock with near perfect 50-50 clock duty cycle; and

    [0034] FIG. 9 shows a system of FROs that provides a reference point for phase locked loop (PLL) or delay locked loop (DLL) clocking in a domain.

    DETAILED DESCRIPTION

    [0035] The present invention overcomes problems associated with the prior art. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the invention. Those skilled in the art will recognize, however, that the invention may be practiced apart from these specific details. In other instances, details of well-known clocking practices and components have been omitted, so as not to unnecessarily obscure the present invention.

    [0036] The following references are incorporated herein by reference:

    [0037] 1. V. G. Oklobdzija, et al., Digital System Clocking: High Performance and Low-Power Aspects, Wiley-IEEE, (2005); and

    [0038] 2. V. G. Oklobdzija, “Clocking and Clocked Storage Elements in a Multi-Gigahertz Environment,” IBM Journal of Research and Development, (2003), vol. 47, no. 5/6, pp. 567-584.

    [0039] FIG. 1 shows the silicon chip divided into a multiplicity of clock domains that could be cores or regions containing more than a single core. An example of such a system is taken from the open literature (Uehara).

    [0040] FIG. 1 also shows an example of a clock domain, in this case consisting of one core, local memory, a DMAC and the associated router. The example is taken from the open literature (Uehara).

    [0041] FIG. 2 shows a free running oscillator (FRO), implemented in this case as a ring oscillator. The operation of the ring oscillator is controlled by the Enable signal (EN). When EN=0, the ring oscillator does not oscillate and is held in a predetermined known state. When EN=1, the ring oscillator oscillates at a frequency determined by the delay of the elements in its path (inverters in the case shown), which is dependent on PVT.
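
    By way of a non-limiting illustration, the dependence of the FRO frequency on element delay can be sketched with the standard first-order ring oscillator relation f = 1 / (2 * N * t_d), where N is the number of inverting stages and t_d is the delay of one stage. This relation, the function name and the numeric values below are illustrative assumptions and are not taken from FIG. 2.

```python
def ring_oscillator_frequency(n_stages: int, stage_delay_s: float) -> float:
    """Ideal ring oscillator frequency: f = 1 / (2 * N * t_d)."""
    if n_stages < 3 or n_stages % 2 == 0:
        raise ValueError("need an odd number (>= 3) of inverting stages")
    if stage_delay_s <= 0:
        raise ValueError("stage delay must be positive")
    # One full period is two traversals of the ring (one per output phase).
    return 1.0 / (2.0 * n_stages * stage_delay_s)


# Hypothetical values: 11 stages with a 10 ps nominal stage delay.
nominal_hz = ring_oscillator_frequency(11, 10e-12)
# A slow PVT corner that lengthens every stage delay by 20% lowers the frequency.
slow_hz = ring_oscillator_frequency(11, 12e-12)
```

    Because the same PVT conditions slow both the FRO stages and the surrounding logic, the oscillator frequency tends to track the speed of the circuits it clocks.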

    [0042] FIG. 3A shows an example of free running oscillators, elements of which are dispersed across the clock domain (in this particular case, around the boundary of the domain). The FROs are interconnected via a grid, which forces the FROs to synchronize to the average frequency and PVT conditions. The grid is further interconnected into a well-known “clock mesh” for distributing the system clock within the domain.

    [0043] FIGS. 3B and 3C show simulation results of an experiment conducted using the Cadence CAD simulation tool on the structure shown in FIG. 3A. FIG. 3B shows the signal waveform of four free running oscillators (from FIG. 3A) when they are not connected via grid (i.e. the grid is disconnected). The FROs are running at their own frequencies (they are intentionally made to be different to simulate process variations).

    [0044] FIG. 3C shows their signals when they are connected via the grid. FIG. 3C shows how the signals are perfectly synchronized to each other and can be used as the system clock, which has a frequency that varies with PVT.

    [0045] FIG. 4A shows another topology of the system of FROs, dispersed throughout the clock domain so as to capture all areas of the domain and their associated PVT variations. The FROs are interconnected at various points, forcing them to synchronize. Though FIG. 4A shows six FROs synchronized together, the number of FROs utilized in such a configuration is not limited to six and can include many more. FIG. 4B shows the clock signals originating from three of the FROs in the structure of FIG. 4A. The signals are perfectly synchronized and together generate the system clock. FIG. 4C shows the main system clock (emphasized) and two signals from the two FROs in the middle of the domain. The experiment shown in FIGS. 4A-4C demonstrates that the system described is operational.

    [0046] FIG. 5A shows another possible topology for placing FROs within the domain. The invention described in FIGS. 1-5B shows a system of free running oscillators that is synchronized by the application of a grid (i.e. by tying all the FRO outputs together). The clock signal in this case runs at a frequency that represents an average of the FRO frequencies in the system, and this frequency follows PVT variations in the domain. When such an arrangement is used in a design, the timing of the critical path still has to allow for a small margin to account for process parameter variation across various points of the domain, although this margin is considerably smaller than the margin used across the entire chip (die). The invention nevertheless alleviates all other variations, aging effects included.
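
    The residual margin mentioned above can be illustrated with a short numerical sketch. It assumes, purely for illustration, that the grid settles at the arithmetic mean of the local FRO frequencies and that the slowest location sets the required margin; the function name and the frequency values are hypothetical.

```python
def average_clock_margin(local_freqs_hz):
    """Return (mean frequency, fractional margin needed at the slowest location).

    Assumption: the grid-tied clock runs at the arithmetic mean of the local
    FRO frequencies. The disclosure says only "average".
    """
    if not local_freqs_hz or min(local_freqs_hz) <= 0:
        raise ValueError("need at least one positive frequency")
    mean_hz = sum(local_freqs_hz) / len(local_freqs_hz)
    slowest_hz = min(local_freqs_hz)
    # The clock is faster than the slowest region by this fraction, so the
    # critical path must be designed with at least this much slack.
    margin = mean_hz / slowest_hz - 1.0
    return mean_hz, margin


# Hypothetical local frequencies within one domain, in Hz.
mean_hz, margin = average_clock_margin([1.00e9, 0.98e9, 1.03e9, 0.95e9])
```

    In this sketch the margin covers only the spread within the domain, which is consistent with the statement that it is smaller than a margin covering the entire die.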

    [0047] The systems described here force the resulting system clock to run at the lowest frequency of all the FROs within the clock domain. This operation is illustrated by the example of two FROs synchronized to run at the lower frequency of the two, as shown in FIG. 6. FIG. 6 illustrates two FROs that are linked with each other in such a way that the frequency of the slower of the two FROs dominates. The number of FROs that can be synchronized to the lowest frequency is not limited to two; as many may be used as is practicable and sufficient for the purpose. FIG. 7 illustrates an arrangement of four such oscillators in a domain, producing a system clock running at the lowest frequency attainable due to PVT in the domain.
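
    The effect of such linking can be sketched behaviorally. The model below is an abstraction, not the gate-level circuit of FIG. 6: it assumes that every FRO begins each cycle together and that the next cycle begins only when the slowest FRO has completed its own cycle. The function name and the period values are hypothetical.

```python
def composite_rising_edges(natural_periods_s, n_cycles):
    """Rising-edge times of the composite clock under a wait-for-the-slowest model."""
    if not natural_periods_s or min(natural_periods_s) <= 0:
        raise ValueError("need at least one positive period")
    edges = [0.0]
    for _ in range(n_cycles):
        # All FROs start the cycle at the same instant; each would finish
        # after its own natural period.
        finish_times = [edges[-1] + p for p in natural_periods_s]
        # The linked FROs cannot proceed until the slowest one has finished.
        edges.append(max(finish_times))
    return edges


# Three FROs with different natural periods (arbitrary time units).
edges = composite_rising_edges([1.0, 1.25, 0.9], n_cycles=4)
```

    Under this assumption the composite period equals the longest natural period, so the system clock runs at the frequency of the slowest FRO.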

    [0048] In all the instances described, the FROs are controlled by an Enable signal (EN). When EN=0, the FROs are prevented from oscillating. When EN=1, the FROs are enabled to oscillate. Further, as EN is a common signal to all of them, it provides a determined starting point for all of the FROs, so that any “races” to synchronize with each other are avoided. Additional Enable signals can be used to turn off particular regions of the chip (clock gating).
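
    The role of the Enable signal can be illustrated with a small unit-delay simulation. The sketch assumes a three-stage ring in which the first stage is a NAND gate driven by EN and by the last stage output, followed by two inverters; the stage count, the synchronous unit-delay update and the function names are illustrative assumptions.

```python
def step(state, enable):
    """Advance a NAND-gated ring by one unit gate delay (synchronous update)."""
    # First stage: NAND(EN, output of the last stage).
    first = 0 if (enable and state[-1]) else 1
    # Remaining stages: plain inverters, each driven by the previous stage.
    rest = [1 - state[i - 1] for i in range(1, len(state))]
    return [first] + rest


def run(state, enable, n_steps):
    """Return the final state and the trace of the last stage output."""
    trace = []
    for _ in range(n_steps):
        state = step(state, enable)
        trace.append(state[-1])
    return state, trace


# EN=0: the ring settles to the same known state from any starting state.
parked, _ = run([0, 0, 0], enable=0, n_steps=3)
# EN=1: the ring oscillates, starting from that known state.
_, clock = run(parked, enable=1, n_steps=12)
```

    Because every FRO sharing EN leaves the same parked state at the same instant, all of them begin oscillating from a common, determined starting point.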

    [0049] The resulting signal of the system of FROs shown in FIG. 6 (running at the frequency dictated by the slowest FRO in the system) is not a “symmetric” clock signal, i.e. a clock signal with a 50-50 “duty cycle”. When this feature is desired or necessary, the system of FROs is set to run at a frequency that is twice or four times the desired system clock frequency. The desired frequency is then obtained by dividing the clock signal by a factor of 2 or 4 (or, in general, by a factor of 2^k). This generates a system clock with a near perfect 50-50 duty cycle, as illustrated in FIG. 8.
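
    The reason division by 2^k restores a symmetric clock can be sketched as follows. The model assumes a chain of ideal toggle flip-flops, each changing state on every rising edge of its input and having no propagation delay; the function names and edge times are hypothetical.

```python
def divide_by_power_of_two(rising_edges_s, k):
    """Rising-edge times after k cascaded toggle flip-flops (divide by 2**k)."""
    edges = list(rising_edges_s)
    for _ in range(k):
        # A toggle flip-flop changes state on every input rising edge, so its
        # output rises on every second input rising edge.
        edges = edges[::2]
    return edges


def duty_cycle_after_divider(rising_edges_s, k):
    """Duty cycle of the first full output period of a divide-by-2**k chain."""
    if k < 1:
        raise ValueError("k must be at least 1")
    last_stage_input = divide_by_power_of_two(rising_edges_s, k - 1)
    if len(last_stage_input) < 3:
        raise ValueError("not enough input edges for one output period")
    # The last flip-flop output rises, falls and rises again on three
    # consecutive rising edges of its input.
    rise, fall, next_rise = last_stage_input[:3]
    return (fall - rise) / (next_rise - rise)


# Input clock with period 1.0; its falling edges (and therefore its own duty
# cycle) never enter the calculation.
input_edges = [i * 1.0 for i in range(9)]
duty = duty_cycle_after_divider(input_edges, k=2)
```

    Only the rising edges of the FRO output drive the divider, so any asymmetry between the high and low phases of the FRO signal does not propagate to the divided clock as long as the input period is constant.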

    [0050] It is further possible to use the described system of FROs to provide a reference for a PLL or DLL in the domain. The system can thus follow a standard design flow, using a PLL or DLL, while the reference clock provides a signal that follows PVT. The PLL or DLL then provides a system clock signal that follows the reference signal, scaled by a factor introduced by the PLL/DLL. This arrangement is illustrated in FIG. 9. Communication between domains (e.g. cores) is performed asynchronously, since each domain is clocked synchronously by its own system clock, independent of the others.

    [0051] The description of particular embodiments of the present invention is now complete. Many of the described features may be substituted, altered or omitted without departing from the scope of the invention. Various deviations from the particular embodiments shown will be apparent to those skilled in the art, particularly in view of the foregoing disclosure.