Benchmark Test Report on Veritak 3.00A and MXE6.0
Date: Jun. 30, 2006
Tak. Sugawara
1. Purpose
To report a performance comparison between the new version of
Veritak and MXE6.0.
2. Test Condition
Item | Description | Remarks |
---|---|---|
Machine | Athlon64 3000+ (Single) / 3800+ (Dual), 2GB memory, Asus A8V | |
OS | Windows 2000 | |
Test Bench | Mainly from Opencores / Icarus Test Suite / others | |
Simulator | Veritak 1.71 / 1.82A / 3.00A (Released .); MXE6.0d (not Starter; full Xilinx Edition of ModelSim) | |
Measured Time | From simulation start to simulation end. Compile time is not included. | Veritak: Optimized Debug: Normal / Level/2 / NBA / Fast Switch. All benchmarks were run without "waveform save" except item No. 15. |
3. Test Result
Athlon64 3000+(Single) | Athlon 3800+(Dual ) | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
No | Name | Description | Source | # of lines | Veritak 1.71 | Veritak 1.82A | MXE6.0d | Veritak 2.14B | Veritak 3.00A | 3.00A vs MXE | MXE6.0d | Veritak Project Files |
CPU Cores | ||||||||||||
1 | WB_Z80 | Z80 | opencores | 5K | 1min.17sec | 1min14sec | 1min.47sec | 1min8sec | 39sec | 1min38sec | Download | |
2 | TV80 | Z80 | opencores | 22K | 1min.23sec | 1min8sec | 51sec. | 1min7sec | 16sec | 49sec. | Download | |
3 **1 | FZ80 | Z80 | PC8001 FPGA (in Japanese only) | 2K | 51sec | 49sec | 1min03 | 46sec | 22sec | 57sec | ||
4 | YACC | MIPS-I subset | opencores | 5.7K | 7min.29sec. | 8min.17sec. | . | |||||
5 | M68K | M68K | Verilator's site(opencores) | 12K | 11min.15sec | 6min50sec | 17min.18sec. | 6min23sec | 1min55sec | 15min34sec. | Download | |
6 | H8300H | renesas subset | sugawara-systems | 10K | 23sec. | 20sec | 27sec. | 20sec | 11sec | 25sec | ||
7 | Openrisc | very large design | opencores | 144K | 25sec. | 14sec | 29sec | 13sec | 8sec | 25sec. | Download | |
Peripheral Cores | ||||||||||||
8 | Ethernet | large design | opencores | 45K | 1h5min. | 39min34sec | 1h13min | 35min19sec | 16min28sec | 1h12min52sec. | Download | |
9 | USB11 | | opencores | 11K | 2min.41sec. | 1min53sec | 2min.44sec | 1min41sec | 43sec | 2min31sec. | Download | |
10 | PCI | very large design | opencores | 89K | 2h1min.57sec. | 51min20sec | 1h42min. | 42min18sec | 24min48sec | 1h40min10sec. | Download | |
11 | ATA | | opencores | 4K | 13min.30sec. | 8min5sec | 20min.37sec. | 7min17sec | 3min7sec | 20min24sec. | Download | |
12 | CONMUX | | opencores | 11K | 9min.47sec. | 3min36 | 4min.59sec. | 3min10sec | 1min56sec | 4min36sec. | Download | |
13 | AC97 | | opencores | 11K | 47min.37sec | 28min18sec | 47min.58sec. | 25min44sec | 9min17sec | 43min15sec. | Download | |
14 | XilinxCorelib RAM | 256KB R/W w/DCM | sugawara-systems | - | 5min.56sec. | 1min30sec | 3min.44sec | 1min19sec | 14sec | . | 3min28sec | |
Others | ||||||||||||
20 | ASIC | ASIC(50kgates) | sugawara-systems | 20K | 13sec. | 12sec | 18sec. | 10sec | 5sec | 17sec. | ||
15 *2 | Simple Counter | (30 million patterns w/ saved waveform) | Veritak user's contribution | 1K | 2min.57sec | 2min54sec | 5min.16sec | |||||
16 | AES | 128bit galois operation | sugawara-systems(for CQ publisher's contest) | 10K | 1min.6sec. | 32sec | 50sec. | 30sec | 14sec | |||
17 | Many Instances1000 | small module but many instances | Icarus Test Suite(modified) | 0.3K | 6min.3sec. | 13min59sec. | . | |||||
18 **3 | Many Instances10000 | small module but many instances | Icarus Test Suite(modified) | 0.3K | 16sec | 2min.1sec(xilinx instance restriction) | xilinx instance restriction | |||||
19 | Large Multiplier | 100bit multiplier | Icarus Test Suite(modified) | 0.2K | 17sec. | 9sec | 31sec. | 9sec | 2sec | 25sec. | Download | |
21 | PCI IP | Net List IP | Athlon 1.2GHz | 4min10sec | 34sec | 4min9sec | 12sec | 10sec | 1min29sec | |||
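
The times in the table can be turned into speed ratios with a small script. The following is only a sketch: the helper names `to_seconds` and `speed_ratio` are hypothetical, and it assumes that the fifth and sixth time values in each full row are the Veritak 3.00A time and the second MXE6.0d time, measured on the same machine.

```python
import re


def to_seconds(text):
    """Convert a table time such as '1h12min52sec.' or '1min03' to seconds."""
    pattern = r"(?:(\d+)h)?(?:(\d+)min\.?)?(?:(\d+)(?:sec)?)?\.?"
    m = re.fullmatch(pattern, text.strip())
    if m is None or not any(m.groups()):
        raise ValueError(f"unrecognized time: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return hours * 3600 + minutes * 60 + seconds


def speed_ratio(veritak, mxe):
    """Return how many times faster Veritak ran, taking MXE as 1."""
    return to_seconds(mxe) / to_seconds(veritak)


# (name, Veritak 3.00A, MXE6.0d) copied from items No. 1, 5 and 10.
# Which column each value belongs to is an assumption (see above).
rows = [
    ("WB_Z80", "39sec", "1min38sec"),
    ("M68K", "1min55sec", "15min34sec."),
    ("PCI", "24min48sec", "1h40min10sec."),
]

for name, veritak, mxe in rows:
    print(f"{name}: {speed_ratio(veritak, mxe):.2f}x")
```

Cells that hold no time (for example "." or the instance-restriction note) raise `ValueError`, so they have to be skipped by the caller.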
4. Consideration
4.0 Performance difference between Single CPU and Dual CPU
Since the simulator runs as a single thread, no performance gain is expected
from using a dual CPU.
This is true not only of Veritak but also of MXE. You will notice
about a 10% performance gain between the single CPU and the dual CPU on the same
motherboard with the same configuration. However, this is not an effect of "dual power".
Note that the dual CPU (3800+) runs at a 2.0GHz clock, while the single CPU (3000+) runs at 1.8GHz.
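
As a rough check of this explanation, the clock ratio can be compared with a measured ratio. This is a sketch under two assumptions: that run time of a single-threaded simulation scales roughly with clock frequency, and that the two MXE6.0d columns of the result table are the single-CPU and dual-CPU machines in that order. The variable names are illustrative only.

```python
# Clock frequencies stated in this section.
single_clock_ghz = 1.8  # Athlon64 3000+ (single)
dual_clock_ghz = 2.0    # Athlon 3800+ (dual)

clock_ratio = dual_clock_ghz / single_clock_ghz

# MXE6.0d times for item No. 1 (WB_Z80), in seconds.
# Assumption: first MXE column = single CPU, second MXE column = dual CPU.
mxe_single_sec = 1 * 60 + 47
mxe_dual_sec = 1 * 60 + 38

measured_ratio = mxe_single_sec / mxe_dual_sec

print(f"clock ratio:    {clock_ratio:.3f}")
print(f"measured ratio: {measured_ratio:.3f}")
```

Both ratios come out close to 1.1, which fits the clock difference alone and needs no contribution from the second core.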
4.1 Comparison of All Save Performance
Item No. 3 was re-tested on another machine (Athlon 1.2GHz).
The figure below shows relative speed, with MXE = 1, without and with "all
save of waveform".
Since Veritak's design concept is to save all signals by default using run-time compression, the internal compressed file is small (37MB)
and the overhead is low, while the VCD data is over 300MB.
Extraction of any signal in this project is almost instantaneous. Veritak is therefore
well suited to the debug stage of each designer's RTL design, which is the most
time-consuming stage because of the many repeated runs.
4.2 Comparison of Long Vector Performance
Veritak is faster in this test. This is reasonable since all waveform
data is held in memory, not on disk. Even with a vector of 30 million patterns,
view response is still fast. However, this is also a limitation of Veritak:
the size of the waveform view is restricted by the size of the PC's virtual
memory. (Around 1GB seems to have been used in this test.) ModelSim's waveforms
are saved to disk, so ModelSim has an advantage in long-vector tests. However,
this should be resolved in the 64-bit version of Veritak.
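
The memory limit can be put into rough numbers. The sketch below assumes that memory use grows linearly with the number of patterns and takes the "around 1GB" figure at face value, so the result is an order-of-magnitude estimate only. `max_patterns` and the 2 GiB budget are hypothetical and serve only as an illustration.

```python
GIB = 1024 ** 3

patterns_in_test = 30_000_000  # item No. 15 in the result table
memory_used_bytes = 1 * GIB    # "around 1GB", a rough figure

bytes_per_pattern = memory_used_bytes / patterns_in_test


def max_patterns(budget_bytes):
    """Estimate how many patterns fit, assuming linear memory growth."""
    return int(budget_bytes // bytes_per_pattern)


print(f"about {bytes_per_pattern:.0f} bytes per pattern")
print(f"about {max_patterns(2 * GIB):,} patterns in a hypothetical 2 GiB budget")
```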
4.3 Many Instances
The Xilinx edition restricts the number of instances. That is why No. 18
is so slow.
5. Conclusion