TSPC603R
2125A–HIREL–04/02
Instruction Timing
The 603r is a pipelined superscalar processor. A pipelined processor is one in which the
processing of an instruction is divided into discrete stages. Because each instruction
occupies only one stage at a time, it does not tie up the entire resources of an
execution unit. For example, after an instruction completes the decode stage, it passes
on to the next stage, while the subsequent instruction advances into the decode stage.
This improves the throughput of the instruction flow. For instance, a floating-point
instruction may take three cycles to complete, but if there are no stalls in the
floating-point pipeline, a series of floating-point instructions can achieve a
throughput of one instruction per cycle.
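The throughput claim above can be checked with a little arithmetic. The following is a minimal sketch, assuming an idealized stall-free pipeline in which every stage takes exactly one cycle; the function names are hypothetical and do not model the 603r's actual timing rules.

```python
def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles to finish a stall-free instruction stream in an ideal pipeline.

    The first instruction needs num_stages cycles to drain; every later
    instruction completes one cycle after its predecessor.
    """
    if num_instructions <= 0:
        return 0
    return num_stages + (num_instructions - 1)


def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles if each instruction had to finish before the next one started."""
    return max(num_instructions, 0) * num_stages


if __name__ == "__main__":
    # Assumed example: 10 floating-point instructions, 3-cycle latency.
    n, stages = 10, 3
    print("pipelined:  ", pipelined_cycles(n, stages), "cycles")
    print("unpipelined:", unpipelined_cycles(n, stages), "cycles")
```

Under these assumptions, each additional instruction costs one cycle once the pipeline is full, so a long stream approaches one instruction per cycle even though each instruction still takes three cycles from start to finish.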
The instruction pipeline in the 603r has four major pipeline stages, described as follows:
The fetch pipeline stage primarily involves retrieving instructions from the memory
system and determining the location of the next instruction fetch. Additionally, the
BPU decodes branches during the fetch stage and folds out branch instructions
before the dispatch stage if possible.
The dispatch pipeline stage is responsible for decoding the instructions supplied by
the instruction fetch stage, and determining which of the instructions are eligible to
be dispatched in the current cycle. In addition, the source operands of the
instructions are read from the appropriate register file and dispatched with the
instruction to the execute pipeline stage. At the end of the dispatch pipeline stage,
the dispatched instructions and their operands are latched by the appropriate
execution unit.
During the execute pipeline stage, each execution unit that has an executable
instruction executes the selected instruction (perhaps over multiple cycles), writes
the instruction's result into the appropriate rename register, and notifies the
completion stage that the instruction has finished execution. In the case of an
internal exception, the execution unit reports the exception to the
completion/writeback pipeline stage and discontinues instruction execution until the
exception is handled. The exception is not signaled until that instruction is the next
to be completed. Execution of most floating-point instructions is pipelined within the
FPU, allowing up to three instructions to be executing in the FPU concurrently. The
pipeline stages for the floating-point unit are multiply, add, and round-convert.
Execution of most load/store instructions is also pipelined. The load/store unit has
two pipeline stages. The first stage is for effective address calculation and MMU
translation, and the second stage is for accessing the data in the cache.
The complete/writeback pipeline stage maintains the correct architectural machine
state and transfers the contents of the rename registers to the GPRs and FPRs as
instructions are retired. If the completion logic detects an instruction causing an
exception, all following instructions are cancelled, their execution results in rename
registers are discarded, and instructions are fetched from the correct instruction
stream.
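The interaction between the execute and complete/writeback stages described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Entry` record, the `retire` function, the register names, and the returned status strings are all hypothetical, and the model ignores how many instructions the real completion logic can retire per cycle.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One in-flight instruction, in program order (hypothetical model)."""
    dest: str                # architectural register to update, e.g. "r3"
    value: int               # result currently held in a rename register
    finished: bool = False   # execution unit has reported completion
    exception: bool = False  # execution unit reported an internal exception


def retire(queue: list[Entry], regs: dict[str, int]) -> str:
    """Retire finished instructions strictly in program order.

    Results move from the rename registers (Entry.value) to the
    architectural registers only when the instruction reaches the head
    of the queue. An exception is acted on only at the head; everything
    behind it is cancelled and its rename results are discarded.
    """
    while queue:
        head = queue[0]
        if not head.finished:
            return "waiting"      # younger finished results must wait
        if head.exception:
            # Assumption of this toy model: the excepting instruction is
            # dropped along with all following instructions.
            queue.clear()
            return "exception"    # fetch would restart from the handler
        regs[head.dest] = head.value
        queue.pop(0)
    return "empty"


if __name__ == "__main__":
    regs = {"r1": 0, "r2": 0, "r3": 0}
    queue = [
        Entry("r1", 10, finished=True),
        Entry("r2", 20, finished=True, exception=True),
        Entry("r3", 30, finished=True),
    ]
    print(retire(queue, regs), regs)
```

In this sketch the third instruction has already finished executing, yet its result never reaches `r3`, because the exception on the second instruction is handled first. That is the sense in which the architectural machine state stays correct.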