Clock gating circuit
10671562 · 2020-06-02
Assignee
Inventors
Cpc classification
G06F13/364
PHYSICS
Y02D10/00
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
G06F13/28
PHYSICS
Y02B70/10
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
International classification
G06F13/28
PHYSICS
G06F13/364
PHYSICS
Abstract
A system-on-chip bus system includes a bus configured to connect function blocks of a system-on-chip to each other, and a clock gating unit connected to an interface unit of the bus and configured to basically gate a clock used in the operation of a bus bridge device mounted on the bus according to a state of a transaction detection signal.
Claims
1. A clock gating circuit, comprising: a first counter configured to output a first signal; a second counter configured to output a second signal; a flip-flop configured to latch a third signal that is generated based on the first signal and the second signal, and configured to output a fourth signal; and a clock gating cell configured to latch a clock gating enable signal that is generated based on the third signal and the fourth signal, and configured to output a gate clock.
2. The clock gating circuit of claim 1, wherein the clock gating enable signal is generated based further on a valid read address, a valid write address and a valid write data.
3. The clock gating circuit of claim 1, wherein the clock gating cell is configured to output the gate clock in response to the clock gating enable signal.
4. The clock gating circuit of claim 1, wherein the clock gating cell includes a latch configured to latch the clock gating enable signal.
5. The clock gating circuit of claim 4, wherein the latch is configured to latch and output the clock gating enable signal according to a clock.
6. The clock gating circuit of claim 5, wherein the clock gating circuit gates the clock based on a state of a transaction detection signal.
7. The clock gating circuit of claim 1, wherein: the first counter is configured to count transactions during a write operation, and the second counter is configured to count transactions during a read operation.
8. The clock gating circuit of claim 1, wherein each of the first counter and the second counter receives a clock.
9. The clock gating circuit of claim 8, wherein the flip-flop receives the clock.
10. A clock gating circuit, comprising: a dynamic clock gate including a first counter configured to output a first signal, a second counter configured to output a second signal, and a flip-flop configured to latch a third signal that is generated based on the first signal and the second signal; and a clock gating cell including a latch, and configured to output a gate clock, wherein the first counter is configured to count transactions during a write operation, the second counter is configured to count transactions during a read operation, and the latch is configured to latch a clock gating enable signal that is generated based on a valid read address, a valid write address, a valid write data, the third signal, and an output signal of the flip-flop.
11. The clock gating circuit of claim 10, wherein each of the first counter, the second counter, the flip-flop, and the latch receives a clock.
12. The clock gating circuit of claim 11, wherein the clock gating circuit gates the clock based on a state of a transaction detection signal.
13. The clock gating circuit of claim 10, wherein the clock gating cell provides or blocks the gate clock in response to the clock gating enable signal.
14. The clock gating circuit of claim 10, wherein the first counter counts transactions based on the valid write address, a write address ready signal, a valid write response signal, and a write response ready signal.
15. The clock gating circuit of claim 10, wherein the second counter counts transactions based on the valid read address signal, a read address ready signal, a valid read data, a read signal, and a read last signal.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Features will become apparent to those of skill in the art by describing in detail exemplary embodiments with reference to the attached drawings in which:
DETAILED DESCRIPTION
(22) Example embodiments will now be described more fully hereinafter with reference to the accompanying drawings; however, they may be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
(23) In the drawing figures, the dimensions of regions may be exaggerated for clarity of illustration. Like reference numerals refer to like elements throughout.
(24) In the specification, it will also be understood that when an element or line is referred to as being "on" a target element block, it can be directly on the target element block, or another intervening element may also be present.
(25) The terms used in the specification are for the purpose of describing particular embodiments only and are not intended to be limiting of the invention. As used in the specification, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in the specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
(26) Each embodiment described and exemplified herein may include a complementary embodiment thereof. Note that descriptions of interface architectures such as an AMBA (Advanced Microcontroller Bus Architecture) bus interface architecture and their detailed basic operations may be omitted to avoid obscuring the descriptions of embodiments.
(29) In the example shown in
(30) A microprocessor 10, a DMA 11, a DSP 12, and a USB 13 may function as master IPs on a first bus BUS1 that may become an AXI bus. In addition, a RAM 20, an SDRAM, and a bridge 22 may function as slave IPs. A UART 40, which may function as a slave IP, may be connected to a second bus BUS2 that may become an APB bus.
(31) AHB (Advanced High-Performance Bus), APB (Advanced Peripheral Bus), and AXI (Advanced eXtensible Interface) have been proposed as bus types of the AMBA. Of the above, the AXI is an interface protocol having advanced functions such as a multiple outstanding address function, a multiple outstanding transaction function, and a data interleaving function.
(32) The multiple outstanding address function is a function for allowing the utilization of idle transmission time occurring between addresses by transmitting the address of each data only once through the address lines at the same time as the data is transmitted. The multiple outstanding transaction function is a parallel transaction processing function for allowing a plurality of transactions to be transmitted to a slave IP. Accordingly, one of the transactions may be selected by the slave IP to be processed first. Read and write operations may also be executed at the same time via the AXI.
(33) The data interleaving function allows data to be interleaved with each other at the slave when several masters transmit the data to one slave, thus allowing the more efficient utilization of a bandwidth as well as providing an advantage in respect of latency.
(34) Although an AHB is not shown in
(36) An AMBA architecture shown in
(37) In the example shown in
(38) In
(39) An AHB master 30, an internal memory 31, a DMA controller 32, an AHB slave 34, an extended memory controller 35, a memory controller 36, and a second bridge 22 are connected to the AHB B2.
(40) An APB master 50, an APB slave 41, a UART 40, a WDT 42, and an interrupt controller 43 are connected to the APB B3.
(41) The APB B3 is a peripheral bus which operates at a lower speed than the other buses. Therefore, the second bridge 22 is coupled between the APB B3 and the AHB B2 to accommodate the difference in performance and speed. Similarly, the first bridge 23 is coupled between the AHB B2 and the AXI bus B1.
(42) Although not shown in this figure, various bus bridge devices such as a quality of service enhancement (QE) unit, a memory management unit (MMU), an up/down-sizer, an async bridge, a master/slave interface, and a crossbar switch may be mounted on the AXI bus B1 and coupled between the AXI master 10 and the AXI slave 20.
(43) For the operation of such a bus bridge device, a clock is provided through a clock tree buffer. Because the clock is provided, power is still consumed at the clock tree buffer even when a logic unit or a slave IP of the bus bridge device is temporarily in an idle state. As a result, a finer-grained clock gating technique for the bus bridge device may reduce the power consumption of a system-on-chip. For example, when the power consumed in the clock tree buffer of the bus bridge device is tens of percent or more of the gross switching power, a clock gating scheme may be useful in efficiently cutting off the power of the clock tree buffer.
(45) The above channels are coupled between the master device 10 and the slave device 20 through interfaces. A master interface (MI) 100 exists in the master device 10, and a slave interface (SI) 200 exists in the slave device 20. When the slave device 20 acts as a master device, the slave interface 200 also becomes a master interface. That is, a master device may turn into a slave device and a slave device turn into a master device according to the operating environment.
(46) The master interface 100 may include an arbiter 101, a router 102, and a decoder 103. The slave interface 200 may include an arbiter 201, a router 202, and a decoder 203. In
(47) A maximum of sixteen master devices 10 and slave devices 20 may be connected to one AMBA bus.
(49) In the example shown in
(51) Referring to
(52) The master device 10 that has obtained the right to use the bus transmits a signal HADDR to the decoder 103. The signal HADDR indicates the address of the desired slave device 20. The decoder 103 transmits a signal HSELx to the corresponding slave device 20. The signal HSELx conveys the meaning "slave device, you are selected by me." Thus, the corresponding slave device 20 becomes enabled.
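The address decoding step described above can be sketched in a few lines of Python. This is only an illustrative model: the function name `decode_hsel` and the address map (base addresses and region sizes) are hypothetical and do not come from the specification, which does not define a memory map.

```python
from typing import List, Tuple


def decode_hsel(haddr: int, address_map: List[Tuple[int, int]]) -> List[int]:
    """Return one HSELx bit per slave; a bit is 1 when HADDR falls in that slave's region.

    address_map is a list of (base, size) pairs, one per slave (hypothetical values).
    """
    return [int(base <= haddr < base + size) for base, size in address_map]


# Hypothetical two-slave map: slave 0 at 0x0000_0000, slave 1 at 0x4000_0000, 4 KiB each.
EXAMPLE_MAP = [(0x0000_0000, 0x1000), (0x4000_0000, 0x1000)]

print(decode_hsel(0x4000_0010, EXAMPLE_MAP))  # [0, 1]
```

At most one select bit is high for any address as long as the regions in the map do not overlap; an address outside every region selects no slave.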
(53) The master device 10 transmits a signal HWRITE of high level to write data. If the master device 10 transmits a signal HWRITE of low level, it is recognized that the selected slave device 20 is required to read the data. In response to the signal HWRITE, the slave device 20 transmits a signal HREADY to the master device 10. The signal HREADY conveys the meaning "master device, I am ready to write/read the data; please perform the operation." Thus, the master device 10, upon confirming the signal HREADY, transmits a signal HWDATA to the slave device 20 during a write operation and receives a signal HRDATA from the slave device 20 during a read operation. In the read operation, a burst mode operation may be performed to provide data once and successively read the data. For example, the burst mode may employ an incremental manner in which the address continues to be incremented by the transfer size HSIZE (for 32 bits = 4 bytes, the address is incremented by four, and the least significant bits of the start address are 00).
(54) As the power consumed in a clock tree buffer of a bus bridge device increases, a clock gating scheme may be used to mitigate or avoid increased power consumption. An example of the clock gating scheme is illustrated in
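As a minimal sketch of the incremental burst addressing mentioned above, the following Python function lists the address of each beat. The function name and the alignment check are assumptions made for illustration; only the 4-byte increment and the aligned start address come from the text.

```python
from typing import List


def incr_burst_addresses(start_addr: int, beats: int, size_bytes: int = 4) -> List[int]:
    """Return the address of each beat of an incrementing burst.

    size_bytes is the transfer size (4 bytes for a 32-bit transfer).
    """
    # A 4-byte-aligned start address has its two least significant bits equal to 00.
    if start_addr % size_bytes != 0:
        raise ValueError("start address must be aligned to the transfer size")
    return [start_addr + i * size_bytes for i in range(beats)]


print([hex(a) for a in incr_burst_addresses(0x1000, 4)])
# ['0x1000', '0x1004', '0x1008', '0x100c']
```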
(55) Referring to the example shown in
(56) It may be helpful in power saving of the entire system if the power consumed in the clock tree buffer 400 is not wasted when the bus bridge device logic unit 500 is in a non-operating or idle state. With this aim, the clock gating unit 300 receives a clock CLK to provide a gating clock GCLK to the clock tree buffer 400.
(57) The gating clock GCLK is a clock generated as a result of dynamic clock gating; while the clock is gated, it is not a clock-type signal but a signal maintained at a low level.
(58) In a recent bus, a pipeline structure is widely used to enhance bus throughput, and an async design for globally asynchronous locally synchronous (GALS) operation is common. Accordingly, many flip-flops may be used as the gate count increases. For this reason, the ratio of the power consumed in a clock tree buffer may increase and, in certain cases, may reach more than 40 percent. It is therefore expected that effective clock gating would be useful in reducing overall system power consumption.
(60) Referring to
(61) In the case of an AXI bus B1, the clock gating unit 300 is connected to a master interface MI of the master device 10 of the AXI bus B1 and basically gates a clock used for the operation of a bus bridge device 150 mounted at the AXI bus B1 according to a state of a transaction detection signal. A clock HCLK applied to the clock gating unit 300 is gated with the transaction detection signal and output as a gating clock GCLK. The clock HCLK is not applied to a clock tree buffer when an internal logic unit of the bus bridge device 150 does not operate or is in a standby state. Thus, no power is consumed in the clock tree buffer.
(62) In
(63) An example of the clock gating unit 300 is shown in
(65) The dynamic clock gate 310 is connected to an output terminal of the master interface MI. After obtaining an outstanding count value using signals of the bus and the clock HCLK, the dynamic clock gate 310 compares the outstanding count value with a reference value to output a clock gating enable signal EN0. In case of an AXI bus, the transaction detection signal may be generated by checking the outstanding count value of the master interface MI.
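The enable generation can be illustrated with a small cycle-level Python model. It is a sketch under stated assumptions: the reference value is taken to be zero, the combining gates are modeled as ORs, and the flip-flop is assumed to hold the "busy" indication for one extra clock cycle. The class name `DynamicClockGate` and its method are hypothetical; the set of inputs (the two counter outputs, the flip-flop output, and the valid write address, valid write data, and valid read address signals) follows the claims.

```python
class DynamicClockGate:
    """Cycle-level model of the enable logic: two count comparisons, a flip-flop, and OR gates."""

    def __init__(self) -> None:
        self.ff = 0  # flip-flop output (assumed to register the busy signal each clock)

    def step(self, write_count: int, read_count: int,
             awvalid: int = 0, wvalid: int = 0, arvalid: int = 0) -> int:
        """Evaluate one clock cycle and return the clock gating enable signal EN0."""
        # Busy when either outstanding count differs from the reference value (assumed 0).
        busy = int(write_count != 0 or read_count != 0)
        # EN0 is high if a transaction is outstanding, was outstanding in the previous
        # cycle, or is about to start (a valid signal is already asserted).
        en0 = int(bool(busy or self.ff or awvalid or wvalid or arvalid))
        self.ff = busy  # registered on the clock edge
        return en0


gate = DynamicClockGate()
trace = [
    gate.step(0, 0),             # idle
    gate.step(0, 0, arvalid=1),  # read address presented
    gate.step(0, 1),             # one read outstanding
    gate.step(0, 0),             # count back to 0, flip-flop still high
    gate.step(0, 0),             # idle again
]
print(trace)  # [0, 1, 1, 1, 0]
```

Including the valid signals in the enable lets the clock start before the counter has incremented, and the flip-flop keeps the clock running for one cycle after the count returns to zero.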
(66) The clock gating cell 320 provides the clock HCLK to the bus bridge device 150 or blocks the clock HCLK in response to the clock gating enable signal EN0.
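A latch-based clock gating cell of this kind can be modeled as follows. This is a minimal sketch, assuming the common arrangement in which the latch is transparent while the clock is low and the gated clock is the AND of the clock and the latched enable; the specification states only that the cell includes a latch, so the transparency polarity and the class name `ClockGatingCell` are assumptions.

```python
class ClockGatingCell:
    """Latch followed by an AND gate; evaluated once per clock half-cycle."""

    def __init__(self) -> None:
        self.latched_en = 0

    def step(self, clk: int, en: int) -> int:
        # Assumption: the latch is transparent while clk is low and holds while clk is high,
        # so a change of the enable during the high phase cannot glitch the gated clock.
        if clk == 0:
            self.latched_en = en
        return clk & self.latched_en


cell = ClockGatingCell()
samples = [(0, 1), (1, 1), (0, 0), (1, 0), (0, 1), (1, 0)]  # (clk, en) per half-cycle
print([cell.step(clk, en) for clk, en in samples])  # [0, 1, 0, 0, 0, 1]
```

In the last sample the enable drops while the clock is high, yet the gated clock stays high for the remainder of that phase because the latch holds the previously captured value.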
(67) A request or data is generally transmitted through a bus while there is a transaction, but most functions of a bus system are stopped when there is no transaction. Accordingly, if root clock gating of the clock supply is performed by inserting a circuit configured to determine whether there is a transaction into the clock gating unit 300, the power consumed in the clock tree buffer in the bus bridge device 150 may be cut off or minimized.
(68) In the case of the AXI bus, an outstanding count is checked to determine whether there is a transaction.
(70) Referring to
(71) Four AND gates AN1-AN4 may be connected to front ends of the first and second counters C1 and C2. The first AND gate AN1 receives the valid write address AWVALID and a write address ready signal AWREADY to generate an AND response, and applies the AND response to an increase input terminal INC of the first counter C1. The second AND gate AN2 receives a valid write response signal BVALID and a write response ready signal BREADY to generate an AND response, and applies the AND response to a decrease input terminal DEC of the first counter C1. The third AND gate AN3 receives a valid read address signal ARVALID and a read address ready signal ARREADY to generate an AND response, and applies the AND response to an increase input terminal INC of the second counter C2. The fourth AND gate AN4 receives valid read data RVALID, a read ready signal RREADY, and a read last signal RLAST to generate an AND response, and applies the AND response to a decrease input terminal DEC of the second counter C2.
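The AND-gated increment and decrement inputs described above can be modeled with a short Python sketch. The class and function names are hypothetical, and the model updates the count once per clock cycle; it is an illustration of the counting scheme, not the circuit itself.

```python
class OutstandingCounter:
    """Up/down counter with INC and DEC inputs, updated once per clock cycle."""

    def __init__(self) -> None:
        self.count = 0

    def clock(self, inc: int, dec: int) -> int:
        self.count += inc - dec
        return self.count


def write_counter_inputs(awvalid: int, awready: int, bvalid: int, bready: int):
    """AN1 drives INC of C1, AN2 drives DEC of C1."""
    return awvalid & awready, bvalid & bready


def read_counter_inputs(arvalid: int, arready: int, rvalid: int, rready: int, rlast: int):
    """AN3 drives INC of C2, AN4 drives DEC of C2."""
    return arvalid & arready, rvalid & rready & rlast


c2 = OutstandingCounter()
# (ARVALID, ARREADY, RVALID, RREADY, RLAST) per cycle: two read requests are accepted,
# then data beats arrive; only beats flagged as last complete a transaction.
cycles = [(1, 1, 0, 0, 0), (1, 1, 0, 0, 0), (0, 0, 1, 1, 0), (0, 0, 1, 1, 1), (0, 0, 1, 1, 1)]
counts = [c2.clock(*read_counter_inputs(*sig)) for sig in cycles]
print(counts)  # [1, 2, 2, 1, 0]
```

A nonzero count indicates that at least one transaction is still outstanding, which is the condition used to keep the clock enabled.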
(72) In
(74) In
(75) The third AND gate AN3 performs an AND gating for input signals of waveforms ARVALID and ARREADY, and applies a result of the AND gating to the increase input terminal INC of the second counter C2.
(76) The fourth AND gate AN4 performs an AND gating for input signals of waveforms RVALID, RREADY, and RLAST, and applies a result of the AND gating to the decrease input terminal DEC of the second counter C2.
(77) Since there is no transaction when the output count value of the second counter C2 is 0 and there is a transaction when the output count value of the second counter C2 is not 0, the output of the second counter C2 exhibits a waveform COUNT for a period T2 when there is a transaction. For this reason, an output of the first OR gate OR1 is logic high. Accordingly, the clock gating enable signal EN0 appearing at an output terminal of a second OR gate OR2 may exhibit a waveform EN0 having a high level for periods T1 and T2. For the period T1, the high level is generated by a high period of the waveform RVALID. For the period T2, the high level is generated by the output count value of the second counter C2. As a result, the clock HCLK is applied to a clock buffer tree for a combined period of the periods T1 and T2 and is not applied to the clock buffer tree for the other periods. For this, the gating clock GCLK, like the waveform GCLK, is output to the AND gate AN5 shown in
(78) The operation timing in
(81) In
(82) Due to characteristics of an asynchronous bridge, a frequency of a gate clock GCLK1 output from the first clock gating cell 321 and a frequency of a gating clock GCLK2 output from the second clock gating cell 322 may be different from each other.
(84) In
(85) Due to characteristics of a sync up-down circuit, a frequency of a gate clock GCLK1 output from the first clock gating cell 321 and a frequency of a gating clock GCLK2 output from the second clock gating cell 322 may be different from each other.
(86) The scheme described in
(88) In
(89) Thus, root clock gating according to an embodiment may significantly reduce or minimize the power consumed in a clock buffer tree.
(91) Referring to
(93) In case of parallel and cascade connection, the clock gating unit 300 described in
(94) In
(95) Due to a clock gating function of the clock gating unit 300-1, an operation of the bus bridge device 150-1 (including a QE 153, an MMU 154, an UPSIZER 155, and a slave interface 162-1) is stopped on the AXI bus B1. Thus, the range of the clock gating function of the clock gating unit 300-1 extends from the QE 153 to the slave interface 162-1.
(96) The clock gating function of the clock gating unit 303 covers the bus bridge device 166 including an async bridge 168 and a slave interface 167-1.
(98) In the example shown in
(99) In another embodiment, a control selection signal PSELx obtained from the D flip-flop F14 is used as a transaction detection signal to perform clock gating at the APB B3.
(100) Various signals shown in
(102) Waveforms PSEL shown in
(104) In the example shown in
(105) In the case of the APB, the clock PCLK is provided to a clock buffer tree or a clock buffer of the bus bridge device 48 while the control selection signal PSEL is high, and the clock PCLK is not provided thereto while the control selection signal PSEL is low. Accordingly, since a clock is not applied in an idle operation of the bus bridge device 48, power saving may be achieved.
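The effect of gating PCLK with PSEL can be illustrated with a simplified Python sketch that counts how many rising clock edges actually reach the clock buffer tree. The sample waveforms and function names are made up for illustration, and the gating is modeled as a plain AND of the sampled signals rather than a latch-based cell.

```python
from typing import List


def gate_with_psel(pclk: List[int], psel: List[int]) -> List[int]:
    """Pass PCLK only while PSEL is high (simplified AND model)."""
    return [c & s for c, s in zip(pclk, psel)]


def rising_edges(samples: List[int]) -> int:
    """Count 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev == 0 and cur == 1)


pclk = [0, 1] * 4                 # four clock cycles, two samples per cycle
psel = [0, 0, 1, 1, 1, 1, 0, 0]   # slave selected only during the middle two cycles
gclk = gate_with_psel(pclk, psel)

print(gclk)                                     # [0, 0, 0, 1, 0, 1, 0, 0]
print(rising_edges(pclk), rising_edges(gclk))   # 4 2
```

In this made-up trace, only the two clock cycles during which PSEL is high toggle the gated clock; the buffer tree sees no edges in the idle cycles.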
(106) In
(107) The APB bus bridge device 48 is connected to the APB slave device 40 through an APB master interface 49.
(108) Since all requests start from a slave interface, power saving may occur on all bus bridge devices receiving the APB clock PCLK when a clock gating unit is mounted on a slave interface of a 1:n APB bus.
(110) Referring to
(111) When the clock gate unit 300 such as shown in
(113) Referring to
(114) The processor device 1130 may include a clock gating unit according to an embodiment. The processor device 1130 controls the input device 1100, the output device 1120, and the memory device 1140 through corresponding interfaces, respectively. By using a clock gating unit according to an embodiment in the processor device 1130, power saving may be achieved in an idle state. Thus, the power performance of the electronic system employing the processor device 1130 may be enhanced.
(116) Referring to
(117) In case of a portable terminal such as a smart phone or the like, compactness and power consumption of the portable terminal have a significant influence on competitiveness of products. Accordingly, there is a desire to minimize power consumption in an idle state.
(118) In
(119) As described above, a clock may be gated according to a state of a transaction detection signal. Thus, the power consumed in a bus system may be minimized or reduced to enhance the power control performance of a system-on-chip (SoC).
(120) By way of summation and review, an SoC may be implemented by integrating conventional multi-function blocks, e.g., intellectual properties (IPs) on a single chip. With the high integration of chips and increase in the amount of information between IPs, an SoC using a bus-based structure may encounter extensibility limitations. As an approach for overcoming the extensibility limitations, a network-on-chip (NoC) technology has been considered, which applies general network technologies within a chip to connect the IPs. As SoCs increase in integration density and size, and their operating speed is improved, low power consumption is an important factor to consider. This is because high power consumption may cause a temperature of a chip to rise, which may result in not only malfunction of the chip but also breakage of a package.
(121) As described above, clock gating may be used as a power-saving technique for a bus system in an SoC. Embodiments may provide a clock gating method which may include obtaining a transaction detection signal using signals of a master interface; and basically gating a clock used in the operation of a bus bridge device mounted on a system bus according to a state of the transaction detection signal.
(122) Example embodiments have been disclosed herein, and although specific terms are employed, they are used and are to be interpreted in a generic and descriptive sense only and not for purpose of limitation. In some instances, as would be apparent to one of ordinary skill in the art as of the filing of the present application, features, characteristics, and/or elements described in connection with a particular embodiment may be used singly or in combination with features, characteristics, and/or elements described in connection with other embodiments unless otherwise specifically indicated. Accordingly, it will be understood by those of skill in the art that various changes in form and details may be made without departing from the spirit and scope of the present invention as set forth in the following claims.