Introduction
Having found a lot of sometimes conflicting or incomplete information on the internet and in some training classes about how to create timing constraints in SDC format correctly, I'd like to ask the EE community for help with some general clock-generating structures I have encountered.
I know that there are differences in how one would implement a given functionality on an ASIC or an FPGA (I have worked with both), but I think there should be a general, correct way to constrain the timing of a given structure, independent of the underlying technology. Please let me know if I'm wrong about that.
There are also some differences between the implementation and timing analysis tools of different vendors (despite Synopsys offering SDC parser source code), but I hope that these are mainly syntax issues which can be looked up in the documentation.
Question
This is about the following clock multiplexer structure, which is part of the clkgen module which is again part of a larger design:
While the ext_clk input is assumed to be generated externally to the design (entering through an input pin), the clk0 and clk4 signals are also generated and used by the clkgen module (see my related ripple clock question for details) and have associated clock constraints named baseclk and div4clk, respectively.
The question is how to specify the constraints such that the timing analyser:
- Treats cpu_clk as a multiplexed clock which can be any one of the source clocks (fast_clk, slow_clk or ext_clk), taking the delays through the different AND and OR gates into account
- While at the same time not cutting the paths between the source clocks, which are used elsewhere in the design.
While the simplest case of an on-chip clock multiplexer seems to require just the set_clock_groups SDC statement:
set_clock_groups -logically_exclusive -group {baseclk} -group {div4clk} -group {ext_clk}
...in the given structure, this is complicated by the fact that clk0 (via the fast_clk output) and clk4 (via slow_clk) are still used in the design, even if cpu_clk is configured to be ext_clk when only use_ext is asserted.
As described here, the set_clock_groups command as above would cause the following:
This command is equivalent to calling set_false_path from each clock in every group to each clock in every other group and vice versa
...which would be incorrect, since the other clocks are still used elsewhere.
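To make the quoted equivalence concrete, here is a sketch of what the three-group set_clock_groups command above would expand to, using the clock names baseclk, div4clk and ext_clk from this question. It only illustrates the quoted statement and is not output from any particular tool:

```tcl
# Pairwise false paths implied by:
#   set_clock_groups -logically_exclusive -group {baseclk} -group {div4clk} -group {ext_clk}
set_false_path -from [get_clocks baseclk] -to [get_clocks div4clk]
set_false_path -from [get_clocks div4clk] -to [get_clocks baseclk]
set_false_path -from [get_clocks baseclk] -to [get_clocks ext_clk]
set_false_path -from [get_clocks ext_clk] -to [get_clocks baseclk]
set_false_path -from [get_clocks div4clk] -to [get_clocks ext_clk]
set_false_path -from [get_clocks ext_clk] -to [get_clocks div4clk]
```

The first two lines are the troublesome ones here: they would also cut any paths between the baseclk and div4clk domains outside the multiplexer, which still need to be timed.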
Additional Information
The use_clk0, use_clk4 and use_ext inputs are generated in such a way that only one of them is high at any given time. While this could be used to stop all clocks if all use_* inputs are low, the focus of this question is on the clock multiplexing property of this structure.
The X2 instance (a simple buffer) in the schematic is just a placeholder to highlight the issue that automatic place-and-route tools are usually free to place buffers anywhere (such as between the and_cpu_1/z and or_cpu1/in2 pins). Ideally, the timing constraints should be unaffected by that.
Define divide-by-1 generated clocks on the outputs of the and_* gates and declare them to be physically exclusive. Cadence RTL Compiler handles the situation correctly by generating three timing paths for registers clocked by cpu_clk (one path per clock). Registers driven directly by clk0, clk4 and ext_clk keep their own timing arcs.
# One generated clock per AND gate output (the Y pin name depends on the cell library).
# If clk0 and clk4 are not ports in the constrained scope, use get_pins for -source instead.
create_generated_clock -source [get_ports clk0] \
    -divide_by 1 -name and_clk0 [get_pins and_cpu_1/Y]
create_generated_clock -source [get_ports clk4] \
    -divide_by 1 -name and_clk4 [get_pins and_cpu_2/Y]
create_generated_clock -source [get_ports ext_clk] \
    -divide_by 1 -name and_clk_ext [get_pins and_cpu_ext1/Y]
# Only one of the gated clocks can be active on cpu_clk at any given time.
set_clock_groups \
    -physically_exclusive \
    -group [get_clocks and_clk0] \
    -group [get_clocks and_clk4] \
    -group [get_clocks and_clk_ext]