2.0 Embedded Computational Units (ECUs)
Traditional programmable logic architectures do not implement arithmetic functions efficiently:
these functions consume many logic cells while delivering only moderate performance. By embedding
a dynamically reconfigurable computational unit, the QuickMIPS chip handles a range of arithmetic
functions efficiently, providing a robust DSP platform. This approach offers greater performance
than traditional programmable logic implementations. The ECU block is well suited to complex DSP,
filtering, and algorithmic functions. The QuickMIPS architecture enables functionality beyond that
achievable with DSP processors or programmable logic devices.
The embedded block is implemented at the transistor level; its block diagram is shown in Figure 2.
Figure 2: Embedded Computational Unit (ECU) Block Diagram
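As a rough behavioral illustration of the multiply, add, and register path named in Figure 2, the following Python sketch models one multiply-accumulate step and uses it for a small filter-style dot product. The operand widths (8-bit X and Y, 16-bit A, 17-bit R), the unsigned arithmetic, and the names `ecu_mac` and `fir_output` are assumptions made for illustration only; they are not taken from the ECU specification.

```python
# Behavioral sketch of one multiply-accumulate (MAC) step.
# Assumed, not from the datasheet: unsigned operands, 8-bit X and Y inputs,
# a 16-bit A input, and a 17-bit result.

X_BITS = 8
A_BITS = 16
R_BITS = 17


def ecu_mac(x: int, y: int, a: int) -> int:
    """Return a + x * y after checking the assumed operand widths."""
    if not (0 <= x < 1 << X_BITS and 0 <= y < 1 << X_BITS):
        raise ValueError("x and y must fit in 8 unsigned bits")
    if not 0 <= a < 1 << A_BITS:
        raise ValueError("a must fit in 16 unsigned bits")
    result = a + x * y
    # The largest case, 65535 + 255 * 255 = 130560, is below 2**17,
    # so a 17-bit result cannot overflow under these assumptions.
    assert result < 1 << R_BITS
    return result


def fir_output(samples, coeffs):
    """Dot product of samples and coefficients, one MAC step per tap.

    The running sum is wrapped to 16 bits before each step so that it can
    be fed back through the assumed 16-bit A input.
    """
    acc = 0
    for s, c in zip(samples, coeffs):
        acc = ecu_mac(s, c, acc & 0xFFFF)
    return acc


if __name__ == "__main__":
    print(ecu_mac(255, 255, 65535))          # 130560, the largest result
    print(fir_output([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```

Under these assumed widths, a 17-bit result is exactly wide enough to hold the sum of a 16-bit value and an 8 x 8 product, which is why the sketch asserts rather than wraps the result.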
Implementing the equivalent of an ECU block in HDL on a programmable logic architecture requires
205 logic cells and incurs a 10 ns delay in a -4 speed grade. The QuickMIPS chip contains a minimum
of 10 and a maximum of 18 ECU blocks. The ECU blocks are placed next to the RAM circuitry for
efficient memory/instruction fetch and addressing in DSP algorithm implementations. Eighteen 8-bit
multiply-accumulate (MAC) functions can be performed per cycle, for a total of approximately
2.6 billion MACs/s when clocked at 144 MHz. Additional multiply-accumulate functions can also be
implemented in the programmable logic.
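The throughput figure above follows from simple arithmetic, sketched here in Python. The function name `peak_macs_per_second` is hypothetical, and the sketch assumes every ECU block completes one MAC on every clock cycle. Applying the same formula to the 10-block minimum is an extrapolation, not a figure stated in the text.

```python
def peak_macs_per_second(num_ecus: int, clock_hz: int) -> int:
    """Peak MAC rate, assuming one MAC per ECU block per clock cycle."""
    return num_ecus * clock_hz


CLOCK_HZ = 144_000_000  # 144 MHz, as quoted in the text

# 18 blocks: 2_592_000_000 MACs/s, which rounds to the quoted 2.6 billion.
max_rate = peak_macs_per_second(18, CLOCK_HZ)

# 10 blocks (extrapolated): 1_440_000_000 MACs/s.
min_rate = peak_macs_per_second(10, CLOCK_HZ)

if __name__ == "__main__":
    print(f"18 ECUs: {max_rate / 1e9:.3f} billion MACs/s")
    print(f"10 ECUs: {min_rate / 1e9:.3f} billion MACs/s")
```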
Table 3: ECU Comparisons
Function      Description  Slowest Speed Grade  Fastest Speed Grade
Adder         16 bit       8 ns                 2.5 ns
Adder         32 bit       10 ns                5.6 ns
Adder         64 bit       12 ns                6.7 ns
Multiplier    8 x 8        10 ns                4.3 ns
Multiplier    16 x 16      12 ns                6.7 ns
System Clock  -            200 MHz              400 MHz
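To put the Table 3 delays on a common footing, the following Python sketch computes the ratio of the slowest-grade delay to the fastest-grade delay for each adder and multiplier row. The delay values are copied from the table; the names `DELAYS_NS` and `speedup` are hypothetical, and the System Clock row is left out because it is a frequency rather than a delay.

```python
# Delays in nanoseconds from Table 3: (slowest speed grade, fastest speed grade).
DELAYS_NS = {
    "Adder 16 bit": (8.0, 2.5),
    "Adder 32 bit": (10.0, 5.6),
    "Adder 64 bit": (12.0, 6.7),
    "Multiplier 8 x 8": (10.0, 4.3),
    "Multiplier 16 x 16": (12.0, 6.7),
}


def speedup(slow_ns: float, fast_ns: float) -> float:
    """Ratio of the slower delay to the faster delay (greater than 1 means faster)."""
    return slow_ns / fast_ns


if __name__ == "__main__":
    for name, (slow, fast) in DELAYS_NS.items():
        print(f"{name}: {speedup(slow, fast):.2f}x")
```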
[Figure 2 labels. Blocks: Logic Cell, Memory, Sequencer, Multiply, Add, Register. Buses: Abus, Xbus, Ybus, I bus, Sign, Rbus. Width annotations: 16, 8, 8, 3, 1, 2, 17.]