Wednesday 23 November 2016

fpga - When is it neater to use VECTOR representations vs INTEGERs?


In the comment thread on an answer to the question "Wrong outputs in VHDL entity", it was stated:



"With integers you don't have control or access to the internal logic representation in the FPGA, while SLV lets you do tricks like utilizing the carry chain efficiently"




So, in what circumstances have you found it neater to code using a vector-of-bits representation than using integers in order to access the internal representation? And what advantages did you measure (in terms of chip area, clock frequency, delay, or otherwise)?



Answer



I've written the code suggested by two other posters in both vector and integer form, taking care to have both versions operate in as similar a way as possible.


I compared the results in simulation and then synthesised using Synplify Pro targeting Xilinx Spartan 6. The code samples below are pasted from working code, so you should be able to use them with your favourite synthesiser and see if it behaves the same.





Firstly, the downcounter, as suggested by David Kessner:


library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity downcounter is
  generic (top : integer);
  port (clk, reset, enable : in  std_logic;
        tick               : out std_logic);
end entity downcounter;

Vector architecture:


architecture vec of downcounter is
begin

  count: process (clk) is
    variable c : unsigned(32 downto 0);  -- don't inadvertently not allocate enough bits here... eg if "integer" becomes 64 bits wide
  begin  -- process count
    if rising_edge(clk) then
      tick <= '0';
      if reset = '1' then
        c := to_unsigned(top-1, c'length);
      elsif enable = '1' then
        if c(c'high) = '1' then
          tick <= '1';
          c := to_unsigned(top-1, c'length);
        else
          c := c - 1;
        end if;
      end if;
    end if;
  end process count;
end architecture vec;

Integer architecture



architecture int of downcounter is
begin
  count: process (clk) is
    variable c : integer;
  begin  -- process count
    if rising_edge(clk) then
      tick <= '0';
      if reset = '1' then
        c := top-1;
      elsif enable = '1' then
        if c < 0 then
          tick <= '1';
          c := top-1;
        else
          c := c - 1;
        end if;
      end if;
    end if;
  end process count;
end architecture int;


Results


Code-wise, the integer one seems preferable to me as it avoids the to_unsigned() calls. Otherwise, there is not much to choose between them.


Running it through Synplify Pro with top := 16#7fff_fffe# produces 66 LUTs for the vector version and 64 LUTs for the integer version. Both versions make heavy use of the carry chain, and both report clock speeds in excess of 280 MHz. The synthesiser is quite capable of making good use of the carry chain: I verified visually with the RTL viewer that similar logic is produced for both. Obviously an up-counter with a comparator will be bigger, but again that would be the same for both integers and vectors.





Suggested by ajs410:


library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity clkdiv is
  port (clk, reset                  : in     std_logic;
        clk_2, clk_4, clk_8, clk_16 : buffer std_logic);
end entity clkdiv;

Vector architecture


architecture vec of clkdiv is
begin  -- architecture vec

  process (clk) is
    variable count : unsigned(4 downto 0);
  begin  -- process
    if rising_edge(clk) then
      if reset = '1' then
        count := (others => '0');
      else
        count := count + 1;
      end if;
    end if;

    clk_2  <= count(0);
    clk_4  <= count(1);
    clk_8  <= count(2);
    clk_16 <= count(3);
  end process;

end architecture vec;

Integer architecture


You have to jump through some hoops to avoid just using to_unsigned and then picking bits off, which would clearly produce the same effect as above:



architecture int of clkdiv is
begin
  process (clk) is
    variable count : integer := 0;
  begin  -- process
    if rising_edge(clk) then
      if reset = '1' then
        count  := 0;
        clk_2  <= '0';
        clk_4  <= '0';
        clk_8  <= '0';
        clk_16 <= '0';
      else
        if count < 15 then
          count := count + 1;
        else
          count := 0;
        end if;
        clk_2 <= not clk_2;
        for c4 in 0 to 7 loop
          if count = 2*c4+1 then
            clk_4 <= not clk_4;
          end if;
        end loop;
        for c8 in 0 to 3 loop
          if count = 4*c8+1 then
            clk_8 <= not clk_8;
          end if;
        end loop;
        for c16 in 0 to 1 loop
          if count = 8*c16+1 then
            clk_16 <= not clk_16;
          end if;
        end loop;
      end if;
    end if;
  end process;
end architecture int;
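
For comparison, the shortcut mentioned before this architecture (converting the integer with to_unsigned and then picking bits off) might look roughly like the sketch below. The architecture name int_slice and the variable bits are made up for illustration, and the sketch relies on the clkdiv entity declared earlier. It has not been run through synthesis here, so no LUT figures are claimed for it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical alternative architecture for the clkdiv entity shown earlier.
architecture int_slice of clkdiv is
begin
  process (clk) is
    variable count : integer range 0 to 15 := 0;
    variable bits  : unsigned(3 downto 0);
  begin
    if rising_edge(clk) then
      if reset = '1' then
        count := 0;
      elsif count < 15 then
        count := count + 1;
      else
        count := 0;  -- explicit wrap: a ranged integer will not wrap by itself
      end if;
      -- Convert once, then pick bits off just as the vector version does.
      bits   := to_unsigned(count, bits'length);
      clk_2  <= bits(0);
      clk_4  <= bits(1);
      clk_8  <= bits(2);
      clk_16 <= bits(3);
    end if;
  end process;
end architecture int_slice;
```

Note that the outputs are assigned inside the clocked branch in this sketch, whereas the vector architecture assigns them after the clocked if statement.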

Results



Code-wise, in this case, the vector version is clearly better!


In terms of synthesis results, for this small example, the integer version (as ajs410 predicted) does produce 3 extra LUTs as part of the comparators. I was too optimistic about the synthesiser, although it is working with an awfully obfuscated piece of code!





Vectors are a clear win when you want arithmetic to wrap around (a counter can even be written as a single line):


vec <= vec + 1 when rising_edge(clk);

vs


if int < int'high then
  int := int + 1;
else
  int := 0;
end if;

although at least it's clear from that code that the author intended a wrap around.




Something I've not used in real code, but pondered:


The "naturally wrapping" behaviour can also be used for "computing through overflows". When you know that the output of a chain of additions, subtractions and multiplications is bounded, you don't have to store the high bits of the intermediate calculations: in 2's complement, any intermediate overflow comes out "in the wash" by the time you get to the output. I'm told that this paper contains a proof of this, but it looked a bit dense for me to make a quick assessment: "Theory of Computer Addition and Overflows", H. L. Garner.


Using integers in this situation would cause simulation errors when they wrapped, even though we know they'll unwrap in the end.
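
A small self-checking testbench can illustrate the idea. This is only a sketch under assumed values: the entity name wrap_demo, the 8-bit width and the operands 100, 100 and 120 are chosen purely for illustration. The true result, 100 + 100 - 120 = 80, fits in a signed 8-bit value even though the intermediate sum of 200 does not.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical testbench: entity name, width and operands are illustrative only.
entity wrap_demo is
end entity wrap_demo;

architecture sim of wrap_demo is
begin
  check : process is
    variable acc : signed(7 downto 0);
  begin
    -- 100 + 100 = 200 does not fit in 8 signed bits, so it wraps to 200 - 256 = -56.
    acc := to_signed(100, 8) + to_signed(100, 8);
    assert acc = to_signed(-56, 8)
      report "intermediate value did not wrap as expected" severity failure;

    -- -56 - 120 = -176 wraps again to -176 + 256 = 80, the true bounded result.
    acc := acc - to_signed(120, 8);
    assert acc = to_signed(80, 8)
      report "final value is wrong" severity failure;

    report "final result = " & integer'image(to_integer(acc));
    wait;
  end process check;
end architecture sim;
```

Doing the same steps with a variable declared as integer range -128 to 127 would stop the simulation with a range error at the first addition, which is the problem described above.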





And as Philippe pointed out, when you need a number bigger than 2**31 you have no choice but to use vectors.
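
As a rough illustration of that last point, here is a minimal sketch of a counter wider than an integer can be relied on to hold. The entity name wide_counter and the 40-bit width are assumptions made for the example; the integer range that is portable across tools is the 32-bit signed range, from -(2**31 - 1) to 2**31 - 1.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 40-bit free-running counter; name and width are illustrative.
entity wide_counter is
  port (clk, reset : in  std_logic;
        count      : out unsigned(39 downto 0));
end entity wide_counter;

architecture rtl of wide_counter is
  signal c : unsigned(39 downto 0) := (others => '0');
begin
  process (clk) is
  begin
    if rising_edge(clk) then
      if reset = '1' then
        c <= (others => '0');
      else
        c <= c + 1;  -- wraps naturally at 2**40
      end if;
    end if;
  end process;

  count <= c;
end architecture rtl;
```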

