
SC140 DSP Core Reference Manual
6.3 Timing
This section describes the time needed to execute SC140 instructions as measured in clock cycles. In the
discussion below, it is assumed that memory accesses are zero wait-state and contention free, unless
explicitly stated otherwise. This timing is for the current SC140 implementation, and may change with
future implementations.
Parallel execution takes place when two or more instructions (grouped into an execution set) execute
simultaneously. Instructions belonging to an execution set always start execution concurrently. An
execution set starts execution only after all the instructions belonging to previous execution sets have
completed. Therefore, an execution set’s execution time is determined by the instruction in the set that has
the longest execution time.
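The rule above can be sketched as a small calculation. This is a minimal illustration in Python, not part of any SC140 tool; the function names `execution_set_cycles` and `sequence_cycles`, and the representation of an execution set as a list of per-instruction cycle counts, are assumptions made for this example.

```python
def execution_set_cycles(instruction_cycles):
    """Return the cycle count of one execution set.

    instruction_cycles: per-instruction cycle counts of the instructions
    grouped into the set (hypothetical representation, for illustration).
    """
    if not instruction_cycles:
        raise ValueError("an execution set contains at least one instruction")
    # All instructions in the set start together, so the slowest
    # instruction determines how long the whole set takes.
    return max(instruction_cycles)


def sequence_cycles(execution_sets):
    """Return the total cycle count of consecutive execution sets.

    Assumes zero wait-state, contention-free memory and no other stalls,
    so each set starts as soon as the previous set has completed.
    """
    return sum(execution_set_cycles(s) for s in execution_sets)
```

For instance, a set that groups a one-cycle instruction with a two-cycle instruction is modeled here as taking two cycles.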
6.3.1 Simple Instruction Timing
The timing of simple instructions is as follows:
• All DALU instructions take one clock cycle to execute.
• All AGU arithmetic instructions take one clock cycle to execute.
• All move-to-internal-memory instructions take one clock cycle to execute, unless the addressing
  mode needs to perform a pre-calculation, in which case the move executes in two cycles. For
  example, the move instructions below take two cycles:
      MOVE.L d0,(Rn + N0)
      MOVE.L d0,(Rn + $5)
      MOVE.L d0,(Rn + Rm)
      MOVE.L d0,(SP + $100)
• All bit mask instructions execute in two cycles on registers and on memory with simple addressing
  modes. However, if a pre-calculation is required, such as an SP offset, a third cycle is added.
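The simple-instruction timing rules in this section can be summarized as a lookup with one adjustment. The sketch below is illustrative only: the function name `simple_instruction_cycles`, the category strings, and the `needs_precalculation` flag are assumptions introduced for the example, and the caller is assumed to know whether the addressing mode requires a pre-calculation.

```python
# Base cycle counts per instruction category, taken from the rules in
# this section. The category names themselves are hypothetical labels.
BASE_CYCLES = {
    "dalu": 1,            # all DALU instructions
    "agu_arithmetic": 1,  # all AGU arithmetic instructions
    "move": 1,            # moves to internal memory
    "bit_mask": 2,        # bit mask instructions, simple addressing
}


def simple_instruction_cycles(kind, needs_precalculation=False):
    """Return the cycle count of a simple instruction.

    needs_precalculation: True when the addressing mode needs a
    pre-calculation (for example an offset from Rn or SP). It only
    affects moves and bit mask instructions in this model.
    """
    if kind not in BASE_CYCLES:
        raise ValueError(f"unknown instruction category: {kind}")
    cycles = BASE_CYCLES[kind]
    if needs_precalculation and kind in ("move", "bit_mask"):
        cycles += 1  # one extra cycle for the address pre-calculation
    return cycles
```

Under these assumptions, a move such as `MOVE.L d0,(Rn + $5)` corresponds to `simple_instruction_cycles("move", needs_precalculation=True)`.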
6.3.2 Change-of-Flow Instruction Timing
The basic change-of-flow JMP instruction takes three cycles to execute. However, the number of cycles is
different for the following change-of-flow instructions:
• PC-relative instructions such as BRA require an additional cycle to calculate the destination.
• Delayed instructions such as JMPD effectively require the same cycle count as the non-delayed
  version (in this example JMP) minus the execution cycle count of the set in the delay slot. This is
  because the pipeline fill-up time is used to execute a useful execution set. The actual time taken
  to jump to the new address is the same for the delayed and non-delayed versions. However, the
  effective cycle count is lower for the delayed version, since the instructions in the delay slot
  would cost additional cycles if the non-delayed version were used.
  The delay slot lasts for the full execution time of the set in the delay slot, which may be more than
  one cycle. The minimum execution time of a delayed instruction is one cycle. For example:
      JMPD dest             ; takes 1 cycle (3-2=1), because the next instruction
      MOVE.W d0,(sp + xxx)  ; takes 2 cycles
  Stalls that originate in delay slot instructions, and are caused by an external access or a memory
  contention, stall the whole core and should NOT be deducted from the cycle count.
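The change-of-flow arithmetic in this section can be sketched as follows. This is a hedged illustration, not a definitive timing model: the function name `change_of_flow_cycles` and its parameters are assumptions, and applying the delay-slot deduction to PC-relative delayed instructions is inferred from the general statement about delayed instructions rather than stated explicitly.

```python
JMP_BASE_CYCLES = 3  # basic change-of-flow JMP instruction


def change_of_flow_cycles(pc_relative=False, delayed=False,
                          delay_slot_cycles=0):
    """Return the effective cycle count of a change-of-flow instruction.

    pc_relative: True for PC-relative instructions such as BRA, which
        need one additional cycle to calculate the destination.
    delayed: True for delayed versions such as JMPD.
    delay_slot_cycles: execution time of the set in the delay slot.
        Stalls caused by external accesses or memory contention must
        not be included here, because they stall the whole core.
    """
    cycles = JMP_BASE_CYCLES + (1 if pc_relative else 0)
    if delayed:
        # The delay-slot set runs during pipeline fill-up, so its cycles
        # are deducted, but a delayed instruction never costs less than
        # one cycle.
        cycles = max(1, cycles - delay_slot_cycles)
    return cycles
```

With a two-cycle set in the delay slot, `change_of_flow_cycles(delayed=True, delay_slot_cycles=2)` reproduces the 3-2=1 result of the JMPD example.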