Wednesday, 18 May 2016

fpga - Advantage of clock enable over clock division


I have an FPGA design which uses different clocks. There is a 100 MHz reference clock provided by an oscillator. The reference clock is used in a DCM (Xilinx FPGA) to generate 3 related clocks, 100 MHz, 50 MHz and 10 MHz (no phase offset).
In situations like this the advice of tool vendors and colleagues is to use only one clock, the 100 MHz one, and to replace the other clocks with clock enables which are only active every 2nd / 10th clock cycle.
What's the advantage of doing this? Yes, there will only be one clock and timing analysis will become easier, but that's a tool issue and should not dictate fundamental design decisions.

Especially for a design with a high device utilization I can see advantages for divided clocks, since e.g. the 10 MHz clock domain will have a much easier time achieving timing closure and will "leave more room" for the placement and routing of the logic in the 100 MHz clock domain. With clock enables I'd carelessly give away this advantage.



Answer



There are several advantages to this methodology that I can think of:




  1. Clock Network - Firstly you only have one clock rather than three. This means that there is less competition for global and local clock routing resources. There are usually only a small number of low-skew clock trees, so minimising usage requirements can help routing.




  2. ALM Restrictions - Depending on the structure of your ALMs/Logic Blocks, you may only be able to clock each block from a single clock - i.e. you can't have two registers on two different clock domains in the same block. If this is the case on your FPGA, then using only a single clock-enabled clock domain could allow logic to be more tightly packed in a design that has a high % of blocks used. Tighter packing can then lead to a higher chance of meeting timing as everything is closer together.





  3. Clock Domain Transfer - Another important consideration is going between clocks. If you need to transfer data between the clock domains, then there are two options.


    If you have no known relationship between the clocks, as would be the case if you divide the clock itself (you don't know which edge of the fast clock corresponds to an edge of the slow clock), then your transfer must be treated as asynchronous, which brings the additional headache of FIFOs for data and synchroniser chains for control signals.


    On the other hand, if you use a clock enable to slow things down, you know exactly when the slow and fast domains line up, simply by monitoring the clock enable signal. For example, if you want to transfer data from your /10 domain to your /1 domain, you don't need FIFOs to do it: you just treat the clock enable signal as a data-valid signal. Because both domains run on the same clock, no clock domain transfer is needed.


    It is possible to do domain transfers between PLL clocks if you can keep track of the clock edges. For example, going from a /1 clock to a /2 clock and vice versa is easy because you can potentially compare the fast and slow clocks directly to synchronise. However, this depends on the structure of the FPGA: some won't easily allow clocks to be fed into look-up tables as data inputs.




  4. Valuable Resources - In smaller FPGAs, PLLs are few and far between. For example, Spartan-6 LX9 FPGAs, if I recall correctly, only have two PLL sites! Ideally you want to save these for things like external interfaces (LVDS, memory, etc.) and not use them for general clock division. Why use a valuable PLL for nothing but integer clock division when it is possible to do it in general logic?


    Furthermore, using clock enables, your division is simply a counter which can be implemented anywhere on the FPGA. PLLs on the other hand are located in specific areas. Say you need a local clock on one side of the chip, but the only free PLL is on the other side. You have to use not only a remote PLL, but also valuable global clock routing to get the clock all the way to the other side of the chip where it is used locally. If instead you build a counter for clock enables, it can be placed right next to the logic, thus reducing usage of valuable resources.
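
To make points 3 and 4 concrete, here is a minimal behavioural sketch in Python (not HDL), assuming a 100 MHz fast clock and a divide-by-10 enable. The names `clock_enable` and `simulate` are made up for illustration; in a real design this would be a small counter feeding the clock-enable inputs of the registers.

```python
def clock_enable(divide_by, cycles):
    """Yield one flag per fast-clock cycle; True once every `divide_by` cycles."""
    count = 0
    for _ in range(cycles):
        yield count == 0
        count = (count + 1) % divide_by


def simulate(cycles=40, divide_by=10):
    """Model a 'slow' register and a fast-side capture register on ONE clock.

    Returns a list of (fast_cycle, value) pairs for every cycle in which the
    fast-side register captured data from the slow-side register.
    """
    slow_reg = 0   # logic that would otherwise sit in the divided clock domain
    fast_reg = 0   # logic in the full-rate domain that receives slow_reg
    captured = []
    for cycle, enable in enumerate(clock_enable(divide_by, cycles)):
        # Both registers sample on the same clock edge, so compute the next
        # values from the current ones before updating anything.
        next_slow = slow_reg + 1 if enable else slow_reg
        # The enable doubles as a "valid" flag: no FIFO or synchroniser here.
        next_fast = slow_reg if enable else fast_reg
        if enable:
            captured.append((cycle, slow_reg))
        slow_reg, fast_reg = next_slow, next_fast
    return captured


if __name__ == "__main__":
    print(list(clock_enable(2, 4)))   # enable pattern for the "50 MHz" logic
    print(simulate())                 # transfers from the "10 MHz" logic
```

The slow register only changes on enabled cycles, and the fast side takes its value on exactly those cycles, which is why the enable can serve as the valid signal.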





Those are just the reasons off the top of my head. I will try to think of some more, and I'll also add in some "why is PLL favourable to CE" arguments.




In answer to your final point about timing: in practice, using a slow clock or a clock-enabled fast clock doesn't affect routing too much. Really, your design should have all registers use the rising edge of a clock (or all use the falling edge) and not mix and match (except maybe in DDR I/O buffers). As a result, both clocks still have their sensitive edges at the same time. The fast clock has n times as many edges, but on most of them the clock enable is inactive, so those edges are ignored.


This means you can tell your timing analysis tool that either the clock is multi-cycle - i.e. only one of n cycles is valid - or you tell it that it is a slower clock with a narrow duty cycle. Both of these ways let the tool know that it can allow multiple fast clock periods of setup and hold time in its calculations (because the register values aren't going to change when the clock is not enabled).
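
As a rough illustration of what that relaxation buys, the following Python sketch computes the setup budget for a path between registers that share the same enable. It is a simplification that assumes the enable is active once every n fast-clock cycles and ignores clock skew, jitter and the matching hold adjustment; `setup_budget_ns` is a made-up helper name, not a tool command.

```python
def setup_budget_ns(fast_clock_mhz, enable_every_n):
    """Time available between launch and capture on an enabled-register path.

    Assumes source and destination registers only update when the same clock
    enable is active, i.e. once every `enable_every_n` fast-clock cycles.
    """
    fast_period_ns = 1000.0 / fast_clock_mhz
    return fast_period_ns * enable_every_n


if __name__ == "__main__":
    for n in (1, 2, 10):
        print(n, setup_budget_ns(100, n))
```

For the 100 MHz clock in the question, n = 10 gives the same 100 ns budget a real 10 MHz clock would have, provided the tool is told about the multi-cycle relationship.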

