Sunday, January 10, 2010

clock gating

Power has become a primary consideration during hardware design. Dynamic power can contribute up to 50% of the total power dissipation. Clock-gating is the most common RTL optimization for reducing dynamic power.
Most clock-gating is done at the Register Transfer Level (RTL). RTL clock-gating algorithms can be grouped into three categories: system-level, sequential and combinational. System-level clock-gating stops the clock for an entire block, effectively disabling all of its functionality. In contrast, combinational and sequential clock-gating selectively suspend clocking while the block continues to produce output.
Combinational clock-gating is a straightforward substitution in the RTL code. It reduces power by disabling the clock on registers when the output is not changing. Opportunities to insert combinational clock-gating can be found by looking for conditional assignments in the code. Clock-gating logic is substituted when code like "if (cond) out <= in" is present (see Figure 1). Combinational clock-gating is now a standard feature of RTL compilers. Power-aware synthesis tools identify these RTL coding patterns and make the appropriate substitution.

Because switching activity is eliminated only when data is not changing, the actual power savings are limited. In typical designs, combinational clock-gating can reduce dynamic power by about 5 to 10%.
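As a rough illustration of this substitution, the Python sketch below models the "if (cond) out <= in" pattern both ways: as an always-clocked register with a recirculating mux, and as a register whose clock is gated by cond. The function name simulate_register and the input traces are made up for this example. It is a behavioural model only, not the output of any synthesis tool.

```python
def simulate_register(cond_trace, data_trace, gated):
    """Model 'if (cond) out <= in' over a sequence of clock cycles.

    Returns (outputs, clocked_cycles): the register output after each cycle
    and the number of cycles in which the register's clock pin toggled.
    """
    out = 0
    outputs = []
    clocked_cycles = 0
    for cond, data in zip(cond_trace, data_trace):
        if gated:
            # Clock-gated form: the register sees a clock edge only when cond is 1.
            if cond:
                clocked_cycles += 1
                out = data
        else:
            # Ungated form: the register is clocked every cycle and a feedback
            # mux recirculates the old value when cond is 0.
            clocked_cycles += 1
            out = data if cond else out
        outputs.append(out)
    return outputs, clocked_cycles


# Hypothetical stimulus: the condition is true in 3 of 8 cycles.
cond_trace = [1, 0, 0, 1, 0, 0, 0, 1]
data_trace = [5, 6, 7, 8, 9, 10, 11, 12]

ungated_out, ungated_clocks = simulate_register(cond_trace, data_trace, gated=False)
gated_out, gated_clocks = simulate_register(cond_trace, data_trace, gated=True)

print("same outputs:", ungated_out == gated_out)
print("clocked cycles (ungated, gated):", ungated_clocks, gated_clocks)
```

Both versions produce the same output sequence, but in this trace the gated register is clocked in only 3 of the 8 cycles.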

                 




Sequential clock-gating alters the RTL micro-architecture without affecting design functionality. Power is optimized by identifying unused computations, data-dependent functions and don't-care cycles in the original code. One example of a sequential optimization is turning off subsequent pipeline stages based on a propagated valid condition (see Figure 2). Because of the additional logic needed to propagate the valid condition, this transformation makes sense only if the datapath is multiple bits wide.













Sequential clock-gating can save significant power, typically reducing switching activity by 15 to 25% on a given block.
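The pipeline example can be sketched in a few lines of Python. The model below is an assumption-laden toy: a three-stage pipeline where each stage has a data register and a one-bit valid register, and a stage's data register is clocked only when the valid bit arriving with the data is set. The name run_pipeline and the stimulus are invented for illustration.

```python
def run_pipeline(inputs, stages=3, gated=True):
    """Shift (valid, data) pairs through a pipeline.

    Returns (valid_outputs, data_clock_events): the data values that leave the
    last stage with valid set, and how many times a data register was clocked.
    """
    valid = [0] * stages
    data = [0] * stages
    data_clock_events = 0
    outputs = []
    for in_valid, in_data in inputs:
        # What feeds each stage this cycle: stage 0 from the input,
        # stage k from stage k-1.
        src_valid = [in_valid] + valid[:-1]
        src_data = [in_data] + data[:-1]
        new_data = list(data)
        for k in range(stages):
            # With gating, a data register is clocked only for valid data.
            if not gated or src_valid[k]:
                new_data[k] = src_data[k]
                data_clock_events += 1
        valid = src_valid  # the 1-bit valid registers are always clocked
        data = new_data
        if valid[-1]:
            outputs.append(data[-1])
    return outputs, data_clock_events


# Hypothetical stimulus: two valid words in eight cycles.
inputs = [(1, 10), (0, 0), (0, 0), (1, 20), (0, 0), (0, 0), (0, 0), (0, 0)]

print(run_pipeline(inputs, gated=False))
print(run_pipeline(inputs, gated=True))
```

Both runs deliver the same valid outputs, but the data registers are clocked 6 times with gating versus 24 times without. The valid bits are still clocked every cycle, which is the extra logic that only pays off when each data register is many bits wide.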

System-level clock-gating is designed into the original hardware architecture and coded as part of the RTL functionality. For example, sleep modes in a cell phone may strategically disable the display, keyboard or radio depending on the phone's current operational mode. System-level clock-gating shuts off entire RTL blocks. Because large sections of logic are not switching for many cycles, it has the most potential to save power.

There are two types of clock gating styles available. They are:
1) Latch-based clock gating
2) Latch-free clock gating.

Latch-free clock gating
The latch-free clock gating style uses a simple AND or OR gate (depending on the edge on which the flip-flops are triggered). Here, if the enable signal goes inactive in the middle of a clock pulse, or if it toggles multiple times, the gated clock output can either terminate prematurely or generate multiple clock pulses. This restriction makes the latch-free clock gating style inappropriate for our single-clock flip-flop based design.
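The failure mode is easy to reproduce with a small waveform model. The sketch below samples the clock at eight time steps per period (four low, four high) and ANDs it with an enable that misbehaves during the high phase. The waveforms and the helper pulse_widths are invented for this illustration.

```python
def pulse_widths(signal):
    """Return the width (in time steps) of every high pulse in a 0/1 waveform."""
    widths = []
    run = 0
    for level in signal:
        if level:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


# Three clock periods, each 4 steps low followed by 4 steps high.
clk = ([0] * 4 + [1] * 4) * 3

# Hypothetical enable: steady in period 1, drops halfway through the high
# phase of period 2, and toggles during the high phase of period 3.
en = [1] * 8 + [1] * 6 + [0] * 2 + [0] * 4 + [1, 0, 1, 1]

# Latch-free gating for a rising-edge design: a plain AND gate.
gated_clk = [c & e for c, e in zip(clk, en)]

print("clean clock pulses:", pulse_widths(clk))
print("gated clock pulses:", pulse_widths(gated_clk))
```

The clean clock has three pulses of width 4. The AND-gated clock has pulses of width 4, 2, 1 and 2: one pulse that terminates prematurely and one clock period that produces two pulses.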






Latch-based clock gating
The latch-based clock gating style adds a level-sensitive latch to the design to hold the enable signal from the active edge of the clock until the inactive edge of the clock. Since the latch captures the state of the enable signal and holds it until the complete clock pulse has been generated, the enable signal need only be stable around the rising edge of the clock, just as in the traditional ungated design style.
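A minimal behavioural sketch of this style, assuming a rising-edge design: the latch is transparent while the clock is low and holds its value while the clock is high, and its output is ANDed with the clock. A plain AND gate is included for comparison. The function names and waveforms are assumptions made for this example.

```python
def pulse_widths(signal):
    """Return the width (in time steps) of every high pulse in a 0/1 waveform."""
    widths = []
    run = 0
    for level in signal:
        if level:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


def latch_based_gate(clk, en):
    """Gate clk with en through a latch that is transparent while clk is low."""
    latched = 0
    gated = []
    for c, e in zip(clk, en):
        if c == 0:
            latched = e  # transparent phase: follow the enable
        # While clk is high the latch holds, so the AND input cannot change.
        gated.append(c & latched)
    return gated


# Three clock periods, each 4 steps low followed by 4 steps high.
clk = ([0] * 4 + [1] * 4) * 3

# Hypothetical enable: drops mid-pulse in period 2, is low at the rising edge
# of period 3 and then toggles during that high phase.
en = [1] * 14 + [0] * 2 + [0] * 5 + [1, 0, 1]

latch_free = [c & e for c, e in zip(clk, en)]
latch_based = latch_based_gate(clk, en)

print("latch-free pulses: ", pulse_widths(latch_free))
print("latch-based pulses:", pulse_widths(latch_based))
```

With the plain AND gate the pulses come out with widths 4, 2, 1 and 1. With the latch, the first two periods produce full-width pulses and the third is suppressed cleanly, because only the enable value present just before the rising edge matters.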







In the technique shown in the figure, a register generates the enable signal to ensure that the signal is free of glitches and spikes. The register that generates the enable signal is triggered on the inactive edge of the clock to be gated (use the falling edge when gating a clock that is active on the rising edge, as shown in the figure). Using this technique, only one input of the gate that turns the clock on and off changes at a time. This prevents any glitches or spikes on the output. Use an AND gate to gate a clock that is active on the rising edge. For a clock that is active on the falling edge, use an OR gate to gate the clock and register the enable signal with a positive edge-triggered register.
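The registered-enable technique can be sketched the same way. The model below makes two assumptions that are not in the text: the register output changes one time step after its triggering edge (a stand-in for clock-to-Q delay), and in the OR-gate case the enable is inverted before the gate so that a disabled clock rests high. The name gate_clock and the waveforms are hypothetical.

```python
def pulse_widths(signal):
    """Return the width (in time steps) of every high pulse in a 0/1 waveform."""
    widths = []
    run = 0
    for level in signal:
        if level:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


def gate_clock(clk, en, active_edge="rising"):
    """Gate clk with an enable registered on the clock's inactive edge.

    Returns (gated_clk, en_reg_trace).
    """
    en_reg = 0
    pending = None
    prev_clk = clk[0]
    gated, en_trace = [], []
    for c, e in zip(clk, en):
        if pending is not None:
            en_reg = pending  # register output updates one step after the edge
            pending = None
        falling = prev_clk == 1 and c == 0
        rising = prev_clk == 0 and c == 1
        if (active_edge == "rising" and falling) or (active_edge == "falling" and rising):
            pending = e  # sample the enable on the inactive edge
        if active_edge == "rising":
            gated.append(c & en_reg)        # AND gate: disabled clock rests low
        else:
            gated.append(c | (1 - en_reg))  # OR gate: disabled clock rests high
        en_trace.append(en_reg)
        prev_clk = c
    return gated, en_trace


# Four clock periods, each 4 steps low followed by 4 steps high.
clk = ([0] * 4 + [1] * 4) * 4
# Hypothetical enable request that changes at awkward times.
en = [1] * 14 + [0] * 6 + [1, 0, 1, 1] + [1] * 8

gated_rise, en_rise = gate_clock(clk, en, "rising")
gated_fall, en_fall = gate_clock(clk, en, "falling")

# Check that the two gate inputs never change in the same time step.
both_change = [
    clk[t] != clk[t - 1] and en_rise[t] != en_rise[t - 1]
    for t in range(1, len(clk))
]

print("rising-edge design, high pulses:", pulse_widths(gated_rise))
print("falling-edge design, low pulses:", pulse_widths([1 - g for g in gated_fall]))
print("inputs ever change together:", any(both_change))
```

In this trace every pulse that reaches the flip-flops is a full half-period wide, and the registered enable only ever changes while the clock input of the gate is in its inactive state.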






low power techniques


The various low power techniques fall into two categories:
(a) Structural Techniques
    Voltage Islands
    Multi-threshold devices
    Multi-oxide devices
    Minimize capacitance by custom design
    Power efficient circuits
    Parallelism in micro-architecture
(b) Traditional Techniques
    Clock gating
    Power gating
    Variable frequency
    Variable voltage supply
    Variable device threshold
Dynamic Power Reduction
    Clock Gating
    Power efficient circuits
    Variable frequency
    Variable voltage supply
Leakage Power Reduction
    Minimize usage of Low Vt Cells
    Power Gating
    Back Biasing
Reducing Dynamic Power
    Reduce Oxide Thickness
    Use FinFETs
These techniques will be discussed in detail in future posts.



power consumption in cmos



Power Dissipation Components in Digital CMOS Circuits


Power consumption in CMOS circuits can be divided into three main components: short-circuit power, switching power, and leakage power. Short-circuit power arises when a conducting path between supply and ground is formed. To limit it, the pull-up and pull-down devices should be sized properly to achieve approximately equal rise and fall times. This component of power consumption can be significant in precharge-and-evaluate circuits, e.g. dynamic circuits.


Switching power is the power consumed in charging and discharging internal capacitances in the circuit. Leakage power is the power dissipated while the device is turned off. Leakage power has started to form a significant portion of the total power consumption as a result of the low-threshold devices normally used in advanced DSM (deep sub-micron) technologies.







The figure shows the increase in static (leakage) power for different technology generations. It is apparent that static power is increasing dramatically with technology scaling. The ratio of leakage to total power is expected to rise from about 10% in 90nm designs to more than 50% in 45nm designs.
Switching power
Switching power is the largest contributor to the total power dissipation in conventional CMOS technologies. It is a result of switching the junction, diffusion, and interconnect capacitances. Consider the CMOS inverter circuit in the figure. The parasitic capacitances are lumped into the output capacitor C. Consider the behavior of the circuit over one full clock cycle with the input going from VDD to zero and back to VDD. As the input switches from high to low, the NMOS pull-down transistor is turned OFF while the PMOS pull-up transistor is ON, and capacitor C is charged. This charging process draws an energy equal to C·VDD^2 from the power supply. Half of this energy is dissipated immediately in the PMOS transistor, while the other half is stored on C. When the input switches from zero back to VDD, the NMOS pull-down turns ON and the capacitance C discharges through it, dissipating the stored half. If the rise time of the input signal is slow, both PMOS and NMOS are simultaneously ON for part of the transition, causing a short-circuit current to flow. Slow rise/fall times should be avoided through proper transistor sizing.
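The energy figures quoted above follow from integrating the supply current while C charges from 0 to VDD. The derivation below assumes an ideal step at the input and a full rail-to-rail swing. The activity factor \alpha and clock frequency f in the last line are the usual textbook symbols, introduced here and not taken from the text above.

```latex
\begin{align*}
E_{\text{supply}} &= \int_0^{\infty} i(t)\, V_{DD}\, dt
                   = V_{DD} \int_0^{V_{DD}} C\, dv
                   = C V_{DD}^2 \\
E_{C} &= \int_0^{V_{DD}} C\, v\, dv = \tfrac{1}{2} C V_{DD}^2 \\
E_{\text{PMOS}} &= E_{\text{supply}} - E_{C} = \tfrac{1}{2} C V_{DD}^2 \\
P_{\text{switching}} &= \alpha\, C\, V_{DD}^2\, f
\end{align*}
```

The first line uses i(t) = C dv/dt for the capacitor current. The half stored on C is later dissipated in the NMOS during discharge, so one full charge/discharge cycle costs C·VDD^2 in total.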
Leakage power
Leakage power is discussed in an earlier post, "leakage current in cmos".