Droop detection and mitigation
11402413 · 2022-08-02
Assignee
Inventors
Cpc classification
G06F1/08
PHYSICS
G06F1/28
PHYSICS
H03K19/00346
ELECTRICITY
G06F1/3206
PHYSICS
International classification
G01R19/165
PHYSICS
Abstract
In an embodiment, a method includes filtering, with a low-pass filter, a voltage signal (V.sub.dd) of a chip to create a filtered signal (V.sub.ref). The method further includes dividing V.sub.ref by a given factor. The method further includes determining whether a voltage droop occurred in V.sub.dd by comparing V.sub.dd to the divided V.sub.ref. The method further includes outputting a droop detection signal if V.sub.dd is less than the divided V.sub.ref. In an embodiment, dividing V.sub.ref by the given factor includes selecting, with a multiplexer, one of a plurality of divided V.sub.ref signals outputted by a voltage divider. The selecting is based on a selection signal.
Claims
1. A method comprising: filtering, with a low-pass filter, a first voltage signal (V.sub.dd-1) of a root clock of a chip and a second voltage signal (V.sub.dd-2) of a local clock of the chip to produce a first filtered signal (V.sub.ref-1) of the root clock and a second filtered signal (V.sub.ref-2) of the local clock; dividing V.sub.ref-1 and V.sub.ref-2 by a given factor; comparing V.sub.dd-1 to the divided V.sub.ref-1 and V.sub.dd-2 to the divided V.sub.ref-2; and outputting a droop detection signal if V.sub.dd-1 is less than the divided V.sub.ref-1 or if V.sub.dd-2 is less than the divided V.sub.ref-2.
2. The method of claim 1, wherein dividing V.sub.ref-1 and V.sub.ref-2 by the given factor includes selecting, with a multiplexer, one of a plurality of divided V.sub.ref-1 and V.sub.ref-2 signals outputted by a voltage divider, the selecting based on a selection signal.
3. The method of claim 1, wherein outputting the droop detection signal includes setting at least one SR Latch, wherein the SR Latch stores the droop detection signal.
4. The method of claim 3, wherein the at least one SR Latch includes a local SR Latch and a global SR Latch, wherein the local SR Latch is cleared by a finite state machine local to the local SR latch and the global SR Latch is cleared by a finite state machine global to the chip.
5. The method of claim 1, further comprising: upon receiving the droop detection signal, decreasing a frequency of the chip from a full frequency to a lower frequency.
6. The method of claim 5, further comprising: increasing the frequency of the chip to at least one intermediate frequency, the intermediate frequency being between the full frequency and the lower frequency.
7. The method of claim 1, wherein comparing V.sub.dd-1 to the divided V.sub.ref-1 and V.sub.dd-2 to the divided V.sub.ref-2 uses a sense amplifier.
8. The method of claim 1, further comprising: storing, in a database, the droop detection signal.
9. A droop detection circuit comprising: a low-pass filter configured to filter a first voltage signal (V.sub.dd-1) of a root clock of a chip and a second voltage signal (V.sub.dd-2) of a local clock of the chip to produce a first filtered signal (V.sub.ref-1) of the root clock and a second filtered signal (V.sub.ref-2) of the local clock; a voltage divider configured to divide V.sub.ref-1 and V.sub.ref-2 by a given factor; a sense amplifier configured to compare V.sub.dd-1 to the divided V.sub.ref-1 and V.sub.dd-2 to the divided V.sub.ref-2 and configured to output a droop detection signal if V.sub.dd-1 is less than the divided V.sub.ref-1 or if V.sub.dd-2 is less than the divided V.sub.ref-2.
10. The droop detection circuit of claim 9, further comprising: a multiplexer configured to select one of a plurality of divided V.sub.ref-1 and V.sub.ref-2 signals outputted by the voltage divider, the selecting being based on a selection signal.
11. The droop detection circuit of claim 9, further comprising at least one SR Latch, wherein the SR Latch stores the droop detection signal.
12. The droop detection circuit of claim 11, wherein the at least one SR Latch includes a local SR Latch and a global SR Latch, wherein the local SR Latch is cleared by a finite state machine local to the local SR latch and the global SR Latch is cleared by a finite state machine global to the chip.
13. The droop detection circuit of claim 9, further comprising: a clock division module configured to, upon receiving the droop detection signal, decrease a frequency of the chip from a full frequency to a lower frequency.
14. The system of claim 13, wherein the clock division module is further configured to: increase the frequency of the chip to at least one intermediate frequency, the intermediate frequency being between the full frequency and the lower frequency.
15. The system of claim 9, further comprising: an interface to a database that is configured to store, in a database, the droop detection signal.
16. A processor comprising: a root clock; at least one local clock; a first droop detection circuit located coupled to the root clock configured to detect a global voltage droop in a source voltage; a second droop detection circuit located coupled to the local clock configured to detect a local voltage droop in the source voltage; a droop mitigation circuit configured to, in response to the first or second droop detection circuit detecting the voltage droop at the root clock or local clock, respectively, reduce a frequency of the root clock of the processor.
17. The processor of claim 16, further comprising: at least one local droop detection circuit being located on a die of the processor.
18. The processor of claim 16, wherein the first and second droop detection circuits each further include: a low-pass filter configured to filter a voltage signal (V.sub.dd) of a chip to create a filtered signal (V.sub.ref); a voltage divider configured to divide V.sub.ref by a given factor; a sense amplifier configured to compare V.sub.dd to the divided V.sub.ref and configured to output a droop detection signal if V.sub.dd is less than the divided V.sub.ref.
19. The processor of claim 18, wherein the droop detection circuit further includes: a multiplexer configured to select one of a plurality of divided V.sub.ref signals outputted by the voltage divider, the selecting based on a selection signal.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The foregoing will be apparent from the following more particular description of example embodiments, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments.
(2)
(3)
(4)
(5)
(6)
DETAILED DESCRIPTION
(7) A description of example embodiments follows.
(8) In modern high-performance and low-power chips, a sudden change in current consumption can result in a large voltage droop (L·di/dt), causing the chip to malfunction.
(9) Currently, voltage droops are mitigated by running chips at a lower frequency than the chip is capable of. For example, consider a chip that runs at 1.2 Gigahertz (GHz). In addition, the lowest tolerable voltage for this chip is 0.8 Volts (V), 0.2 V lower than the normal 1 V. The chip may be sold to run at under 1.2 GHz so that voltage droops do not reach 0.2 V or more. In other words, designers may engineer chips to sacrifice performance in order to leave a margin for error when droops occur.
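As a worked illustration of the inductive droop relation, the expression below uses hypothetical values (a supply-path inductance of 10 pH and a current step of 10 A in 1 ns) that do not come from this disclosure:

```latex
V_{\text{droop}} = L\,\frac{di}{dt}
\qquad\text{e.g.}\qquad
V_{\text{droop}} \approx 10\,\text{pH} \times \frac{10\,\text{A}}{1\,\text{ns}} = 0.1\,\text{V}
```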
(10) However, if the droop can be detected and managed in real-time, the chip can run at a higher frequency, or in a mode to save battery life or energy. A droop can be detected when voltage, which is usually a fixed level, begins decreasing.
(11) The droop detection circuit can be placed at multiple places on a chip to manage droop mitigation. As soon as a droop happens anywhere on the chip, a droop mitigation module/circuit runs the chip at a lower frequency. For a chip running at 1 GHz, for example, when a droop is detected, the root clock of that chip is running at the normal 1 GHz frequency. The frequency of the clock can be lowered immediately after droop detection. For example, the clock can be slowed by factors of 1/2, 2/3, or 5/6 (e.g., 3/6, 4/6, 5/6).
(12) The droop detection circuit is placed close to the root so that droop mitigation can occur quickly. If the on-die power grid has low impedance, the entire die can detect the droop; however, the droop detector closest to the root can mitigate the droop first. Every edge of every clock is sampled (both rising and falling edges). The output is buffered, but sampling every edge of every clock allows for a faster response to any voltage droop detection.
(13) In an embodiment of the present disclosure, a circuit for detecting voltage droops includes an amplifier that is connected to a reference voltage (V.sub.ref). V.sub.ref is a configured voltage that is lower than the standard source voltage. If the voltage of the source (V.sub.dd) is less than V.sub.ref, the circuit generates a droop detection signal. Due to sensitivity issues, a low-pass filter including a resistor and a capacitor cleans the V.sub.dd signal of noise. A resistance ladder generates a plurality of candidate V.sub.ref signals, and a multiplexer selects one of them based on a configured selection signal.
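The detection principle (filter V.sub.dd, divide the filtered value, compare against the unfiltered V.sub.dd) can be sketched behaviorally in software. The following is a minimal sketch, not the disclosed circuit: the function name, the first-order discrete filter standing in for the RC filter, and the values of `alpha` and `divide_by` are all assumptions chosen for illustration.

```python
from typing import List


def detect_droop(vdd: List[float], alpha: float = 0.05,
                 divide_by: float = 1.1) -> List[bool]:
    """Return one droop flag per sample of a supply-voltage waveform.

    alpha and divide_by are hypothetical values, not taken from the source.
    """
    v_ref = vdd[0]  # start the filter at the first sample
    flags = []
    for v in vdd:
        # First-order low-pass filter (discrete stand-in for the RC filter).
        v_ref += alpha * (v - v_ref)
        # Divide the filtered reference by the given factor.
        threshold = v_ref / divide_by
        # Droop is flagged when the raw supply falls below the divided reference.
        flags.append(v < threshold)
    return flags


# A steady 1.0 V supply with a brief dip to 0.8 V.
waveform = [1.0] * 50 + [0.8] * 3 + [1.0] * 50
droop_flags = detect_droop(waveform)
```

Because the reference tracks the slow average of the supply, a short dip is flagged while the steady level is not.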
(14)
(15) A sense amplifier 110 is sampled at both clock edges using a high-speed clock (RCLK) such that the droop mitigating circuit responds as soon as a droop event is detected. The sense amplifier 110 compares the selected V.sub.ref to the noisy source signal V.sub.DD_CORE. If the V.sub.DD_CORE signal is lower than the V.sub.ref signal, the sense amplifier 110 outputs a signal that a droop is detected.
(16) SR latches 112 and 114 store the droop-detector output (e.g., output of the sense amplifier 110). Once the SR latches 112 and 114 receive a signal of a droop detection, the respective latches 112 and 114 retain the signal until they receive a clear signal. The SR Latch 112 outputs a local droop detection signal (local_ddet) and is cleared by a local clear signal (local_clr). The SR Latch 114 outputs a global droop detection signal (global_ddet) and is cleared by a global clear signal (global_clr). The respective clear signals (local_clr and global_clr) represent completion of a droop mitigation mechanism. The droop mitigation mechanism is explained in more detail in relation to
(17)
(18)
(19) The global detection signal is a droop detector output that is cleared/re-armed by the global FSM. The global detection signal can be disabled by setting dsel to 0, and enabled by setting dsel to 1.
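The latch behavior described above (set on detection, hold until cleared, global path gated by dsel) can be modeled as follows. This is a behavioral sketch only; the class names are hypothetical, and two details are assumptions: that a clear takes priority over a simultaneous set, and that dsel gates the set input of the global latch.

```python
from typing import Tuple


class SRLatch:
    """Set/reset latch; clear wins if both inputs are high (assumption)."""

    def __init__(self) -> None:
        self.q = False

    def update(self, set_: bool, clear: bool) -> bool:
        if clear:
            self.q = False
        elif set_:
            self.q = True
        return self.q  # otherwise the stored value is retained


class DroopLatches:
    """Local and global latches fed by one droop-detector output."""

    def __init__(self, dsel: int = 1) -> None:
        self.dsel = dsel
        self.local = SRLatch()
        self.global_ = SRLatch()

    def update(self, droop: bool, local_clr: bool = False,
               global_clr: bool = False) -> Tuple[bool, bool]:
        local_ddet = self.local.update(droop, local_clr)
        # Assumption: dsel == 0 blocks the set input of the global latch.
        global_ddet = self.global_.update(droop and self.dsel == 1, global_clr)
        return local_ddet, global_ddet
```

With dsel set to 0, the local output still records the event, which is consistent with monitoring through local_ddet while the global path is disabled.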
(20) The droop_en signal is a CSR-controlled master enable signal input to the FSM. For example, droop mitigation can be disabled by setting droop_en to 0, and enabled by setting droop_en to 1. Even when disabled, droop detection can be monitored by DROs by using the local_ddet outputs.
(21) The circuit can further include various counters that are CSR-controlled decrementing counters. The counters can include a 6-bit counter, Cnt1, having a maximum of 64 reference clock cycles in a half frequency (HF) state (f/2). The counters can also include two 4-bit counters, Cnt2 and Cnt3, each with a maximum of 16 reference clock cycles in the 2f/3 state and the 5f/6 state, respectively.
(22) By default, the values for the circuit shown in
(23) a) DRO FSM local_clr counter=312
(24) b) Cnt1=4
(25) c) Cnt2, Cnt3=4
(26) d) Droop_en=0
(27) e) Vref_sel*[3:0]=14
(28) f) Dsel*=0
(29) g) Ddet_csr=0
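The default values listed above can be captured in a small configuration record with range checks. This is an illustrative sketch: the class and field names are hypothetical, the bit widths for the counters and vref_sel follow the widths stated in the description (10-bit local clear counter, 6-bit Cnt1, 4-bit Cnt2 and Cnt3, 4-bit vref_sel), and treating droop_en, dsel, and ddet_csr as single-bit fields is an assumption.

```python
from dataclasses import dataclass

# Field widths in bits. The single-bit widths are assumptions.
BIT_WIDTHS = {
    "local_clr_count": 10,
    "cnt1": 6,
    "cnt2": 4,
    "cnt3": 4,
    "droop_en": 1,
    "vref_sel": 4,
    "dsel": 1,
    "ddet_csr": 1,
}


@dataclass
class DroopConfig:
    local_clr_count: int = 312  # DRO FSM local_clr counter
    cnt1: int = 4
    cnt2: int = 4
    cnt3: int = 4
    droop_en: int = 0
    vref_sel: int = 14
    dsel: int = 0
    ddet_csr: int = 0

    def validate(self) -> None:
        """Raise ValueError if any field does not fit in its bit width."""
        for name, width in BIT_WIDTHS.items():
            value = getattr(self, name)
            if not 0 <= value < (1 << width):
                raise ValueError(f"{name}={value} does not fit in {width} bits")


defaults = DroopConfig()
defaults.validate()  # the listed defaults all fit their fields
```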
(30)
(31) The droop detector circuits 306a-e (DD0-4) are placed in various parts of the chip where the occurrence of a droop event is more likely. One droop detector, DD4 306e, is placed near the clock root (PLL), which allows quick response by the droop mitigation circuit 310 for a common-mode droop event, which is experienced across the whole chip. The local droop detector circuits 306a-d (DD0-3), in contrast, are placed at other locations on the chip that may experience local voltage droops.
(32) Each droop detector circuit 306a-e is coupled with a control status register providing configuration values. For example, each droop detector 306a-e receives a voltage selection signal (e.g., vref_sel), a local clear signal (local_clr) and a global detection enablement signal (dsel). DRO2 is an example of a control status register 330 shown coupled to DD2 306c in
(33) The local_ddet signal (lddet) is an output from each respective droop detector 306a-e that is cleared/re-armed by the local clear signal (local_clr) which is provided by the respective DRO (e.g., DRO2 330). The local clear (local_clr) signal is generated by a local finite state machine (FSM) (not shown) in each respective DRO module, which loads a 10-bit down-counter with a control status register (CSR) (e.g., DRO2 330 or other DRO) controlled value. The count-down is triggered by a rising edge of the local_ddet signal. Once the counter reaches zero, it asserts local_clr for a cycle, which clears the local_ddet of the respective droop detector, and re-loads the counter. The local_ddet signal also increments a CSR controllable and observable 16-bit counter (e.g., event monitor) in the DRO (e.g., DRO2 330 or other DRO).
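The local clear mechanism (down-counter started by a rising edge of local_ddet, a one-cycle local_clr at zero, counter reload, and a 16-bit event counter) can be sketched as below. The class name and the exact cycle on which local_clr is asserted are assumptions; only the overall sequence comes from the description.

```python
class LocalClearFsm:
    """Behavioral model of the per-DRO local clear logic (illustrative)."""

    def __init__(self, reload_value: int = 312) -> None:
        self.reload_value = reload_value & 0x3FF  # 10-bit down-counter
        self.counter = self.reload_value
        self.counting = False
        self.prev_ddet = False
        self.events = 0  # 16-bit event monitor

    def step(self, local_ddet: bool) -> bool:
        """Advance one cycle; return True when local_clr is asserted."""
        rising = local_ddet and not self.prev_ddet
        self.prev_ddet = local_ddet
        if rising:
            self.events = (self.events + 1) & 0xFFFF
            self.counting = True  # rising edge triggers the count-down
        local_clr = False
        if self.counting:
            if self.counter == 0:
                local_clr = True                  # asserted for one cycle
                self.counter = self.reload_value  # re-load the counter
                self.counting = False
            else:
                self.counter -= 1
        return local_clr


fsm = LocalClearFsm(reload_value=3)
clr_trace = [fsm.step(True) for _ in range(4)] + [fsm.step(False)]
```

In a full model the asserted local_clr would clear the detector's local latch, so local_ddet would fall and the next droop would produce a fresh rising edge.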
(34) Further, a database 320 can be coupled to one or more droop detection circuits, such as 306c (or 306a-b, 306d-e, connection not shown). The droop detection circuits 306a-e can output their detection signals (e.g., gddet or lddet) to the database 320 for monitoring and analysis. In embodiments, a setting (not shown) can enable a monitor-only mode, where the droop mitigation circuit 310 is disabled but the droop detection circuit(s) 306a-e output droop detection signals (e.g., gddet or lddet) to the database 320 for later tracking and analysis.
(35) As described above, when a droop event is detected by one of the droop detectors (e.g., DD0-4 306a-e), a clock division module 304 of the mitigation circuit 310 reduces the clock frequency f (e.g., to f/2) to prevent further droop and damage to the chip. After recovering from the droop event in a programmable number of cycles, the clock division module 304 can increase the clock frequency in incremental steps (e.g., from f/2, 2f/3, 5f/6, to f).
(36) The clock division module 304 is configured to, upon receiving a droop detection signal (gddet, lddet), decrease a frequency of the chip from a full frequency to a lower frequency. The clock division module 304 can be further configured to increase the frequency of the chip to at least one intermediate frequency. The intermediate frequency is between the full frequency and the lower frequency.
(37) The clock division module 304 receives the root clock signal (root_clk) and outputs three different clocks: f/2 (root_clk_hf), 2f/3 (root_clk_23), and 5f/6 (root_clk_56). The clock division module 304 generates the f/2 (root_clk_hf) signal based on the root clock and a reset signal delayed two cycles using a 50% clock divider 332.
(38) The clock division module 304 further generates the 2f/3 (root_clk_23) signal based on a reset signal delayed two cycles and three SR flip flops 334a-c in series. The output of the three flip flops 334a-c is input to an AND gate 338 with the root clock as the other input. A person having ordinary skill in the art can recognize that the SR flip flops are set to the values as illustrated in
(39) The clock division module 304 further generates the 5f/6 (root_clk_56) signal based on a reset signal delayed two cycles and six SR flip flops 336a-f in series. The output of the six flip flops 336a-f is input to an AND gate 340 with the root clock as the other input. A person having ordinary skill in the art can recognize that the SR flip flops 336a-f are set to the values as illustrated in
(40) The clock division module 304 outputs its respective clock signals to a multiplexer 342. The multiplexer 342 is a 4×1 multiplexer. The multiplexer 342 receives the root clock signal (root_clk) as well as the three divided clocks from the clock division module 304 (e.g., root_clk_hf, root_clk_23, and root_clk_56). A second multiplexer, 344, selects an encoding, clksel[1:0], that selects the clock to output from the first multiplexer 342. The second multiplexer 344 selects based on a scan_mode input.
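One way to picture the 2f/3 and 5f/6 clocks is as the root clock with one pulse in every three, or one in every six, suppressed by the AND gate. The sketch below assumes the flip flops form a rotating ring and assumes initial values of (1, 1, 0) and (1, 1, 1, 1, 1, 0); the actual values are given only in the drawings, so both are hypothetical.

```python
from fractions import Fraction
from typing import List, Sequence


def gated_pulses(initial_bits: Sequence[int], n_cycles: int) -> List[int]:
    """Return 1 per root-clock pulse passed by the AND gate, 0 per pulse blocked."""
    ring = list(initial_bits)
    passed = []
    for _ in range(n_cycles):
        passed.append(ring[-1])        # last stage gates the root clock
        ring = [ring[-1]] + ring[:-1]  # rotate the ring by one stage
    return passed


def effective_rate(initial_bits: Sequence[int], n_cycles: int) -> Fraction:
    """Fraction of root-clock pulses that reach the output."""
    return Fraction(sum(gated_pulses(initial_bits, n_cycles)), n_cycles)


PATTERN_23 = (1, 1, 0)           # assumed flip-flop values for root_clk_23
PATTERN_56 = (1, 1, 1, 1, 1, 0)  # assumed flip-flop values for root_clk_56

rate_23 = effective_rate(PATTERN_23, 12)
rate_56 = effective_rate(PATTERN_56, 12)
```

Over any whole number of ring periods the pass rate equals the fraction of ones in the ring, which gives 2/3 and 5/6 of the root frequency for the two assumed patterns.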
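The 4×1 selection can be illustrated with a lookup keyed by clksel[1:0], using the encodings that the state descriptions give for each frequency (2′b10 full, 2′b00 half, 2′b01 two-thirds, 2′b11 five-sixths). The helper names are hypothetical. The single-bit-change property checked at the end is an observation about these encodings, not a statement made in the text.

```python
# clksel[1:0] encodings as given in the state descriptions.
CLOCK_BY_CLKSEL = {
    0b10: "root_clk",     # full frequency
    0b00: "root_clk_hf",  # f/2
    0b01: "root_clk_23",  # 2f/3
    0b11: "root_clk_56",  # 5f/6
}


def select_clock(clksel: int) -> str:
    """Return the name of the clock that the 4x1 multiplexer would pass."""
    return CLOCK_BY_CLKSEL[clksel & 0b11]


def bits_changed(a: int, b: int) -> int:
    """Number of clksel bits that differ between two encodings."""
    return bin(a ^ b).count("1")


# Order in which the encodings are visited during a droop and its recovery.
RECOVERY_ORDER = [0b10, 0b00, 0b01, 0b11, 0b10]
single_bit_steps = all(
    bits_changed(a, b) == 1
    for a, b in zip(RECOVERY_ORDER, RECOVERY_ORDER[1:])
)
```

With these encodings each step of the sequence flips exactly one select bit, as in a Gray code.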
(41) A plurality of circuit logic 346 (e.g., logic gates, latches, flip flops, etc.) is configured to receive droop detection signals from each respective droop detector 306a-e, as well as from the finite state machine 302. The finite state machine is described in further detail below in relation to
(42)
(43) From state FF1 404, if a droop (D) is detected, the chip moves to state HF1 406. In state HF1 406, the chip runs at half frequency (f/2). The clock select (clksel) is set to 2′b00. Cnt1 is decreased by 1 each clock cycle. The FSM stays in HF1 406 for 8 cycles.
(44) After 8 cycles, the FSM enters state HF2 408. In state HF2 408, dd_rst is set back to 1. Cnt1 continues to decrease until it reaches zero. When it reaches zero, B1 is satisfied and the FSM leaves HF2 408. If a fuse setting is enabled to jump directly to full frequency, the FSM transitions to state FF2 414. Otherwise, the FSM transitions to state F23 410. A person having ordinary skill in the art can recognize that the FSM remains in state HF1 406 for 8 cycles to make sure that the clock select signal has safely transitioned from 2′b10 to 2′b00 before the dd_rst signal is reasserted in state HF2 408.
(45) In state F23 410, the processor runs at 2/3 frequency, with the clock select (clksel) set to 2′b01. In F23, Cnt2 is decreased each cycle. Upon Cnt2 reaching zero (B2 being satisfied), the FSM enters state F56 412.
(46) In state F56 412, the processor runs at 5/6 frequency, with the clock select (clksel) set to 2′b11. The FSM decreases Cnt3 each cycle. Upon Cnt3 reaching zero and B3 being satisfied, the FSM moves to state FF2 414.
(47) In state FF2 414, the processor runs at full frequency. The FSM loads Cnt1-2, and sets the clksel to 2′b10. It remains in this state for 8 cycles, and moves to state FF0 402 afterwards.
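The state sequence described above can be summarized in a small behavioral model. This is a sketch under stated assumptions, not the disclosed FSM: the class name is hypothetical, one call to `step` stands for one counter cycle, counters saturate at zero, FF0 is assumed to advance to FF1 on the next cycle, and the 2′b10 encoding for FF0 and FF1 is inferred from the transition described for HF1.

```python
from enum import Enum


class State(Enum):
    FF0 = "FF0"
    FF1 = "FF1"
    HF1 = "HF1"
    HF2 = "HF2"
    F23 = "F23"
    F56 = "F56"
    FF2 = "FF2"


CLKSEL = {State.FF0: 0b10, State.FF1: 0b10, State.HF1: 0b00, State.HF2: 0b00,
          State.F23: 0b01, State.F56: 0b11, State.FF2: 0b10}


class DroopFsm:
    def __init__(self, cnt1=4, cnt2=4, cnt3=4, jump_to_full=False, hold_cycles=8):
        self.init = (cnt1, cnt2, cnt3)
        self.jump_to_full = jump_to_full  # fuse setting: skip F23 and F56
        self.hold_cycles = hold_cycles    # fixed 8-cycle dwell in HF1 and FF2
        self.cnt1, self.cnt2, self.cnt3 = self.init
        self.hold = 0
        self.state = State.FF0

    def _enter(self, state: State) -> None:
        self.state = state
        if state in (State.HF1, State.FF2):
            self.hold = self.hold_cycles

    def step(self, droop: bool) -> State:
        s = self.state
        if s is State.FF0:
            self._enter(State.FF1)  # assumption: FF0 arms the FSM
        elif s is State.FF1:
            if droop:
                self._enter(State.HF1)
        elif s is State.HF1:
            self.cnt1 = max(self.cnt1 - 1, 0)
            self.hold -= 1
            if self.hold == 0:
                self._enter(State.HF2)
        elif s is State.HF2:
            if self.cnt1 == 0:  # condition B1
                self._enter(State.FF2 if self.jump_to_full else State.F23)
            else:
                self.cnt1 -= 1
        elif s is State.F23:
            if self.cnt2 == 0:  # condition B2
                self._enter(State.F56)
            else:
                self.cnt2 -= 1
        elif s is State.F56:
            if self.cnt3 == 0:  # condition B3
                self._enter(State.FF2)
            else:
                self.cnt3 -= 1
        elif s is State.FF2:
            self.hold -= 1
            if self.hold == 0:
                self.cnt1, self.cnt2, self.cnt3 = self.init  # reload counters
                self._enter(State.FF0)
        return self.state

    @property
    def clksel(self) -> int:
        return CLKSEL[self.state]
```

Driving the model with a single droop while it sits in FF1 walks it through the half, two-thirds, and five-sixths frequency states and back to full frequency, or directly from HF2 to FF2 when `jump_to_full` is set.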
(48)
(49) While example embodiments have been particularly shown and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the embodiments encompassed by the appended claims.