An automatic clock gating utility.
- Overview
- File structure
- Dependencies
- How to use
- How it works
- Power reduction analysis
- Authors
- Copyright and Licensing
Electrical Power reduction in digital systems is significant for several reasons, including portability, reliability, and cost reduction. Because of this, power dissipation has become a critical parameter in low-power VLSI circuit designs. There are two sources for power dissipation in CMOS circuits: static power and dynamic power. Dynamic power is associated with the circuit switching activities due to the charging and discharging of internal node capacitances. On the other hand, static power is due to leakage current, or current that flows through the transistor when there is no activity. Dynamic power is the dominating component in mature fabrication processes such as sky130. It is still dominating in cutting-edge fabrication technologies. However, static power contribution to the total power is higher than mature technologies.
Several techniques can be utilized to reduce dynamic power through reducing the circuit switching activity. Clock gating is the most widely used technique here. It can be done manually or automatically. Automatic clock gating can be peformed for load-enabled registers.
Typically, RTL synthesizer maps load-enabled registers to flip-flops and multiplexors (or to load-enabled flip-flops if the standard cell library has them). In both cases, the dynamic power is very high as the flip-flops are connected to the clock, which is the fastest signal in the design. Instead of circulating the register output back to its input when the load condition is false (typically using multiplexors), the register clock can be enabled only when the load condition is true. This reduces the switching activities, which leads to lower dynamic power and less area (due to the elimination of the multiplexors). The following figure illustrates the automatic clock gating for a single flip flop.
Lighter is a Yosys plugin and technology mapping files that can perform automatic clock gating for registers to reduce the dynamic power. Currently, Lighter supports the following open-source standard cell libraries:
- sky130_fd_sc_hd
- sky130_fd_sc_hs
- sky130_fd_sc_hvl
- sky130_fd_sc_ms
- gf180mcu_fd_sc_mcu7t5v0
- gf180mcu_fd_sc_mcu9t5v0
Through extensive experimentation, Lighter has demonstrated significant power and area savings. To explore the detailed results, kindly refer to the following link: here
- designs / conatains verilog designs for benchmarking
- docs/ contains documentation
- platform/ contains standard cell libraries
- report_power/ contains power reporting python code (report_power.py)
- stats/ contains full benchmarking results
- src/ contains clock gating Yosys plugin code
- validation/ contains automatic validation python code for clock-gated designs
You can find the installation steps in dependencies.md for:
First make sure to follow the dependancies section to install all requirements.
Generate the Yosys plugin using the following command:
yosys-config --build cg_plugin.so clock_gating_plugin.cc
Add the clock gating technology maping file to your project directory. For example, if you are using the sky130_fd_sc_hd standard cell library add the following file. You can find other supported libraries here.
Add the flipflop clock gating command to your synthesis script:
reg_clock_gating sky130_fd_sc_hd_ff_map.v
For example:
read_verilog design
read_liberty -lib -ignore_miss_dir -setattr blackbox sky130_fd_sc_hd.lib
hierarchy -check
reg_clock_gating -map sky130_fd_sc_hd_ff_map.v
synth -top design
dfflibmap -liberty sky130_fd_sc_hd.lib
abc -D 1250 -liberty sky130_fd_sc_hd.lib
splitnets
opt_clean -purge
opt;;
write_verilog -noattr -noexpr -nohex -nodec -defparam design.gl.v
Run your Yosys synthesis script as follows:
yosys -m cg_plugin.so your_script.ys
Or TCL synthesis script as follows:
yosys -m cg_plugin.so your_script.tcl
Follow the same steps above for generating the Yosys plugin and adding the library files.
You can use the selection option to specify the flipflops you want to map in the clock gating step by doing the following:
-
You need to add an attribute (pragma) to the module intended (inside the module declaration), for example:
module test (...); (* clock_gate *) ... ... endmodule
-
Then add the attribute to the clock gating command as a selection like:
reg_clock_gating -map sky130_hd_ff_map.v a:clock_gate
For example:
read_verilog design
read_liberty -lib -ignore_miss_dir -setattr blackbox sky130_fd_sc_hd.lib
hierarchy -check
reg_clock_gating -map sky130_hd_ff_map.v a:clock_gate
synth -top design
dfflibmap -liberty sky130_fd_sc_hd.lib
abc -D 1250 -liberty sky130_fd_sc_hd.lib
splitnets
opt_clean -purge
opt;;
write_verilog -noattr -noexpr -nohex -nodec -defparam design.gl.v
A detailed guide can be found here
Design | # Cells | # Added Clock Gates | Power reduction % | # Cells reduction % |
---|---|---|---|---|
AHB_SRAM | 245 | 47 | 17.29% | 25.71% |
blabla | 10589 | 1098 | 28.80% | 3.72% |
blake2s | 14207 | 1872 | 33.70% | 11.05% |
blake2s_core | 12971 | 1353 | 36.05% | 12.53% |
blake2s_m_select | 4518 | 512 | 43.00% | 22.29% |
chacha | 12857 | 1936 | 31.49% | 4.28% |
genericfir | 143575 | 11624 | 32.62% | 5.51% |
i2c_master | 758 | 106 | 13.29% | 13.19% |
jpeg_encoder | 62472 | 4637 | 30.23% | 11.78% |
ldpcenc | 20134 | 1273 | 18.04% | 6.14% |
NfiVe32_RF | 3362 | 1024 | 30.46% | 30.58% |
picorv32a | 14271 | 1244 | 25.42% | 12.48% |
PPU | 10248 | 2845 | 19.74% | 34.44% |
prv32_cpu | 2241 | 207 | 28.57% | 23.47% |
rf_64x64 | 13475 | 4096 | 32.01% | 32.07% |
sha512 | 20187 | 3669 | 30.53% | 12.85% |
spi_master | 175 | 43 | 9.16% | 17.14% |
y_huff | 11004 | 2345 | 19.55% | 21.33% |
y_quantizer | 8281 | 2816 | 30.96% | 30.85% |
zigzag | 3807 | 769 | 31.95% | 58.21% |
Average Power Reduction % | Average cell Reduction % |
---|---|
27.14% | 19.48% |
Max Power Reduction % | Min Power Reduction % |
---|---|
43.00% | 9.16% |
Max cell Reduction % | Min cell Reduction % |
---|---|
58.21% | 3.72% |
To access the complete benchmarking data, methodology, and all the necessary details, we have prepared a dedicated file called benchmarks.md.
Copyright 2022 AUC Open Source Hardware Lab
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at:
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.