Thursday, May 1, 2008

Clock Gating

The clock distribution network is a significant contributor to the power consumed by a chip, so reducing the switching capacitance of the clock has a large impact on total power. Clock gating is one such approach: it partitions the clock network and allows only those partitions that are needed in a given clock cycle to toggle. It is implemented by turning off the clock to blocks that are not required.

This approach can often save a significant amount of clock power, but it requires trade-offs between timing and leakage power during implementation. Moreover, the power savings depend on the level of granularity at which clock gating is applied. There are three levels of granularity:

Module-level clock gating or global clock gating: In this approach the clock is shut off for an entire block or module in the design, typically from a central clock-generator module. This method functionally shuts down the block and removes a significant amount of dynamic power, since it shuts down the block's entire clock tree.


Register-level clock gating or local clock gating: In this approach the clock to a single register or a set of registers is gated. In the original RTL implementation, a synchronous load-enabled register is typically implemented with a clocked D flip-flop and a recirculating multiplexer, with the D flip-flop being clocked every cycle.


So the key to clock gating for these registers is to use the same enable signal to gate the clock. The register then receives no clock in cycles when no new data is loaded, which saves clocking power and also eliminates the multiplexer and its power consumption. However, gating a single-bit register yields little power saving, so it is better to apply this type of clock gating to a large group of registers that share an enable. A single clock-gating circuit then saves the flip-flop clocking power and the multiplexer power of all the registers in the group.
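The difference between the two register styles can be illustrated with a small behavioral sketch in Python. This is not RTL and not a power model: the function name simulate, the 8-bit bank width, and the enable and data traces are all made up for illustration, and the sketch only counts how many flip-flop clock pins receive an edge in each style.

```python
def simulate(enable_trace, data_trace, width=8, gated=False):
    """Return (final_value, flop_clock_events) for a load-enable register bank.

    Hypothetical behavioral model: one entry per clock cycle in each trace.
    """
    value = 0
    flop_clock_events = 0
    for enable, data in zip(enable_trace, data_trace):
        if gated:
            # One gating circuit drives the whole bank, so the flops see
            # a clock edge only in cycles where enable is high.
            if enable:
                flop_clock_events += width
                value = data
        else:
            # Ungated style: every flop is clocked every cycle and the
            # multiplexer recirculates the old value when enable is low.
            flop_clock_events += width
            value = data if enable else value
    return value, flop_clock_events


# Illustrative traces (assumed): new data is loaded in 3 of 8 cycles.
enable_trace = [1, 0, 0, 1, 0, 0, 0, 1]
data_trace = [3, 9, 9, 5, 1, 2, 4, 7]

mux_value, mux_events = simulate(enable_trace, data_trace, gated=False)
gated_value, gated_events = simulate(enable_trace, data_trace, gated=True)

print("final values match:", mux_value == gated_value)
print("clock events, mux style:  ", mux_events)
print("clock events, gated style:", gated_events)
```

Both styles end with the same register contents, but in the gated style the flip-flops are clocked only in the cycles where the enable is high. The saving scales with the width of the bank, which is why sharing one gating circuit across many registers pays off.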

Cell-level clock gating: In this approach a clock-gating circuit is designed into each cell. For example, a memory is designed to receive the clock only during cycles with an "active" access. This appears to be an easy way to save power, but it has an area overhead and limits the power savings, because a large number of registers and memories need to be predesigned with clock gating, and the clock-gating logic cannot be shared between registers.

Comparing the power savings per clock gate between these approaches, global clock gating saves more power than local clock gating, but local clock gating offers more opportunities, such as automated insertion, which can result in a large number of clock-gated cells in the design.

The clock gating circuit can be implemented in two ways:

1. By using an AND gate on the enable and the clock, and using its output as the clock for the flip-flop. The problem with this approach is that glitches on the enable signal while the clock is high are propagated to the clock pin of the register. Although this can be avoided by applying appropriate setup and hold time constraints on the enable signal, any spurious change of the signal at run time can cause wrong values to be latched.

2. By using a level-sensitive, active-low latch on the enable path. This removes the glitch problem described above. The output of the latch is frozen at the rising edge of the clock, which ensures that the enable signal at the AND gate is stable while the clock is high.
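The two implementations can be compared with a small waveform sketch in Python. It is a simplified model under stated assumptions: signals are sampled four times per clock phase, gates have no delay, and the function names and waveforms are invented for illustration. The enable is high for the first clock cycle and low afterwards, with a single glitch while the clock is high in the second cycle.

```python
def rising_edges(wave):
    """Count 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for prev, cur in zip(wave, wave[1:]) if prev == 0 and cur == 1)


def and_gate_clock(clk, en):
    """Plain AND gating: the enable reaches the gate directly."""
    return [c & e for c, e in zip(clk, en)]


def latch_gate_clock(clk, en):
    """AND gating with an active-low latch on the enable path."""
    gated = []
    held = 0
    for c, e in zip(clk, en):
        if c == 0:
            held = e  # latch is transparent while the clock is low
        # while the clock is high the latch holds its last value
        gated.append(c & held)
    return gated


# Assumed waveforms: three clock cycles, four samples per clock phase.
clk = [0, 0, 0, 0, 1, 1, 1, 1,
       0, 0, 0, 0, 1, 1, 1, 1,
       0, 0, 0, 0, 1, 1, 1, 1]
# Enable is high in cycle 1 only; the lone 1 in cycle 2 is a glitch
# that occurs while the clock is high.
en = [1, 1, 1, 1, 1, 1, 1, 1,
      0, 0, 0, 0, 0, 1, 0, 0,
      0, 0, 0, 0, 0, 0, 0, 0]

and_edges = rising_edges(and_gate_clock(clk, en))
latch_edges = rising_edges(latch_gate_clock(clk, en))

print("rising edges with plain AND gate:", and_edges)
print("rising edges with latch + AND:   ", latch_edges)
```

Only one clock pulse is intended here. With the plain AND gate the glitch produces a second, spurious rising edge at the register's clock pin, while the latch-based version passes only the intended pulse.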



The clock gating technique removes a significant amount of dynamic power, at the cost of added insertion delay in the clock tree (clock skew). Leakage power still remains in the system, however. Going forward, we can discuss techniques for reducing leakage power.

1 comment:

Rohit said...

Clock gating adds some skew to the clock tree.
There are standard clock gating cells, ICGs (integrated clock gating modules), present in the library, which are picked up by Power Compiler.