Power has become a primary consideration during hardware design. Dynamic power can contribute up to 50% of the total power dissipation. Clock-gating is the most common RTL optimization for reducing dynamic power.
Most clock-gating is done at the Register Transfer Level (RTL). RTL clock-gating algorithms can be grouped into three categories: system-level, sequential and combinational. System-level clock-gating stops the clock for an entire block, effectively disabling all of its functionality. In contrast, combinational and sequential clock-gating selectively suspend clocking while the block continues to produce output.
Combinational clock-gating is a straightforward substitution in the RTL code. It reduces power by disabling the clock on registers when their output is not changing. Opportunities to insert combinational clock-gating can be found by looking for conditional assignments in the code: clock-gating logic is substituted wherever code like "if (cond) out <= in" is present (see Figure 1). Combinational clock-gating is now a feature of RTL compilers. Power-aware synthesis tools identify these RTL coding patterns and make the appropriate substitution.
Because switching activity is eliminated only when data is not changing, the actual power savings are limited. In typical designs, combinational clock-gating can reduce dynamic power by about 5 to 10%.
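The substitution can be illustrated with a small behavioral model. The following is only a sketch in Python, not RTL: the function name simulate and the sample sequences are invented for illustration. It compares a register that is clocked every cycle and recirculates its old value against a register that receives a clock edge only when cond is true.

```python
def simulate(cond_seq, in_seq):
    """Compare an always-clocked register with a clock-gated register.

    Hypothetical model of the pattern "if (cond) out <= in".
    Returns (final_output, ungated_clock_edges, gated_clock_edges).
    """
    out_mux = 0            # always clocked; a mux feeds back the old value
    out_gated = 0          # clocked only when cond is true
    ungated_edges = 0
    gated_edges = 0
    for cond, data in zip(cond_seq, in_seq):
        # Ungated style: the flop sees a clock edge every cycle.
        ungated_edges += 1
        out_mux = data if cond else out_mux
        # Gated style: the clock edge is suppressed when cond is false.
        if cond:
            gated_edges += 1
            out_gated = data
        # Both styles must hold the same value after every cycle.
        assert out_mux == out_gated
    return out_mux, ungated_edges, gated_edges


cond_seq = [1, 0, 0, 1, 0, 0, 0, 1]
in_seq = [5, 6, 7, 8, 9, 10, 11, 12]
print(simulate(cond_seq, in_seq))  # (12, 8, 3)
```

In this made-up trace the gated register receives 3 clock edges instead of 8 while producing the same output. The model counts clock edges only; it does not estimate power.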
Sequential clock-gating alters the RTL micro-architecture without affecting design functionality. Power is optimized by identifying unused computations, data-dependent functions and don't-care cycles in the original code. One example of a sequential optimization is turning off subsequent pipeline stages based on a propagated valid condition (see Figure 2). Because of the additional logic, this transformation makes sense only if the datapath is multiple bits wide.
Sequential clock-gating can save significant power, typically reducing switching activity by 15 to 25% on a given block.
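The pipeline example can be sketched as a behavioral model. This is an illustrative Python sketch under stated assumptions, not the structure in Figure 2: the name run_pipeline, the three-stage depth and the "+ 1" placeholder computation per stage are all hypothetical. A valid bit travels alongside the data, and each stage's data register is clocked only when the preceding stage holds valid data.

```python
def run_pipeline(inputs, stages=3):
    """Model a pipeline whose data registers are gated by a propagated valid bit.

    inputs: list of (valid, data) pairs, one per cycle.
    Returns (outputs, gated_clocks, ungated_clocks), where the clock counts
    are per data register per cycle.
    """
    data = [0] * stages
    valid = [0] * stages
    gated_clocks = 0
    ungated_clocks = 0
    outputs = []
    # Append idle cycles so the last valid item drains out of the pipeline.
    stream = list(inputs) + [(0, 0)] * stages
    for v_in, d_in in stream:
        # Update last stage first so each stage reads its predecessor's
        # value from before this clock edge.
        for s in range(stages - 1, -1, -1):
            prev_valid = valid[s - 1] if s > 0 else v_in
            prev_data = data[s - 1] if s > 0 else d_in
            ungated_clocks += 1          # an ungated design clocks every stage
            if prev_valid:               # gated design clocks only valid data
                gated_clocks += 1
                data[s] = prev_data + 1  # placeholder stage computation
            valid[s] = prev_valid        # the 1-bit valid flop is always clocked
        if valid[-1]:
            outputs.append(data[-1])
    return outputs, gated_clocks, ungated_clocks


inputs = [(1, 10), (0, 0), (0, 0), (1, 20)]
print(run_pipeline(inputs))  # ([13, 23], 6, 21)
```

The valid flops themselves are never gated, which is the extra logic the text refers to: that overhead is paid per stage, so it is only worthwhile when the gated data register is several bits wide.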
System-level clock-gating is designed into the original hardware architecture and coded as part of the RTL functionality. For example, sleep modes in a cell phone may strategically disable the display, keyboard or radio depending on the phone's current operational mode. System-level clock-gating shuts off entire RTL blocks. Because large sections of logic are not switching for many cycles, it has the most potential to save power.
There are two types of clock gating styles available. They are:
1) Latch-based clock gating
2) Latch-free clock gating.
Latch-free clock gating
The latch-free clock gating style uses a simple AND or OR gate (depending on the edge on which the flip-flops are triggered). If the enable signal goes inactive in the middle of a clock pulse, or toggles multiple times during it, the gated clock output can terminate prematurely or generate multiple clock pulses. This restriction makes the latch-free clock gating style inappropriate for our single-clock flip-flop based design.
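The failure modes can be reproduced with a sampled-waveform model. This is a minimal sketch, assuming a rising-edge design gated by an AND gate, zero gate delay, and eight samples per clock cycle. The helper names and the enable waveforms are invented for illustration.

```python
def and_gate(clk, en):
    """Latch-free gating for a rising-edge clock: gated_clk = clk AND en."""
    return [c & e for c, e in zip(clk, en)]


def pulse_widths(wave):
    """Return the length, in samples, of every high pulse in the waveform."""
    widths, run = [], 0
    for bit in wave:
        if bit:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


CYCLE = [0, 0, 0, 0, 1, 1, 1, 1]   # low half, then high half
clk = CYCLE * 2                    # two clock cycles

en_stable = [1] * 8 + [0] * 8                   # changes only at a cycle boundary
en_drops = [1] * 6 + [0] * 10                   # goes inactive mid-pulse
en_toggles = [1, 1, 1, 1, 1, 0, 1, 0] + [0] * 8  # toggles during the pulse

print(pulse_widths(and_gate(clk, en_stable)))   # [4]     one full pulse
print(pulse_widths(and_gate(clk, en_drops)))    # [2]     truncated pulse
print(pulse_widths(and_gate(clk, en_toggles)))  # [1, 1]  two short pulses
```

A full clock pulse is four samples wide in this model, so any narrower or extra pulse corresponds to the premature termination or multiple pulses described above.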
Latch based clock gating
The latch-based clock gating style adds a level-sensitive latch to the design to hold the enable signal from the active edge of the clock until the inactive edge of the clock. Since the latch captures the state of the enable signal and holds it until the complete clock pulse has been generated, the enable signal need only be stable around the rising edge of the clock, just as in the traditional ungated design style.
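A behavioral sketch of this style, assuming a rising-edge design in which the latch is transparent while the clock is low and the gate is an AND. It uses a zero-delay sampled model with eight samples per cycle. The function names and waveforms are hypothetical.

```python
def latch_based_gate(clk, en):
    """Gate a rising-edge clock with a latch followed by an AND gate.

    The latch is transparent while clk is low and holds its value while
    clk is high, so enable changes during the pulse cannot reach the gate.
    """
    q = 0
    gated = []
    for c, e in zip(clk, en):
        if c == 0:
            q = e              # transparent phase: follow the enable
        gated.append(c & q)    # hold phase: q is frozen while clk is high
    return gated


def pulse_widths(wave):
    """Return the length, in samples, of every high pulse in the waveform."""
    widths, run = [], 0
    for bit in wave:
        if bit:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


CYCLE = [0, 0, 0, 0, 1, 1, 1, 1]   # low half, then high half
clk = CYCLE * 2

en_drops = [1] * 6 + [0] * 10                    # goes inactive mid-pulse
en_toggles = [1, 1, 1, 1, 1, 0, 1, 0] + [0] * 8  # toggles during the pulse
en_late = [0] * 5 + [1] * 11                     # goes active mid-pulse

print(pulse_widths(latch_based_gate(clk, en_drops)))    # [4]
print(pulse_widths(latch_based_gate(clk, en_toggles)))  # [4]
print(pulse_widths(latch_based_gate(clk, en_late)))     # [4]
```

Every gated pulse is a full four samples wide. In the en_late case the first cycle is skipped entirely and the second is passed whole, rather than a partial pulse appearing in the first cycle.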
In the technique shown in Figure, a register generates the enable signal to ensure that the signal is free of glitches and spikes. The register that generates the enable signal is triggered on the inactive edge of the clock to be gated (use the falling edge when gating a clock that is active on the rising edge, as shown in Figure). Using this technique, only one input of the gate that turns the clock on and off changes at a time. This prevents any glitches or spikes on the output. Use an AND gate to gate a clock that is active on the rising edge. For a clock that is active on the falling edge, use an OR gate to gate the clock and register the enable command with a positive edge-triggered register.
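Both variants of this technique can be sketched in one behavioral model. The following is an illustrative zero-delay sketch and not taken from the figure: the function names are hypothetical, and the OR-gate branch assumes the registered enable is inverted before the gate so that a disabled clock idles high.

```python
def gate_clock(clk, en, active_edge="rising"):
    """Gate clk using an enable registered on the clock's inactive edge."""
    en_reg = 0
    prev = clk[0]
    gated = []
    for c, e in zip(clk, en):
        if active_edge == "rising":
            if prev == 1 and c == 0:        # falling edge is the inactive edge
                en_reg = e
            gated.append(c & en_reg)        # AND gate: disabled clock idles low
        else:
            if prev == 0 and c == 1:        # rising edge is the inactive edge
                en_reg = e
            gated.append(c | (1 - en_reg))  # OR gate: disabled clock idles high
        prev = c
    return gated


def run_lengths(wave, level):
    """Return the length of every run of samples equal to level."""
    runs, run = [], 0
    for bit in wave:
        if bit == level:
            run += 1
        elif run:
            runs.append(run)
            run = 0
    if run:
        runs.append(run)
    return runs


clk = [0, 0, 0, 0, 1, 1, 1, 1] * 3
# A deliberately noisy enable; only its value at the sampling edges matters.
en = [0, 1, 0, 1, 1, 0, 1, 0,
      1, 0, 1, 1, 0, 1, 0, 1,
      0, 1, 0, 1, 1, 0, 1, 0]

print(run_lengths(gate_clock(clk, en, "rising"), 1))   # [4]  one full high pulse
print(run_lengths(gate_clock(clk, en, "falling"), 0))  # [4]  one full low pulse
```

In both cases the registered enable changes only while the clock input already forces the gate output to its idle level, so the noise on the raw enable never appears on the gated clock.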