Chapter 6. Instruction Pipeline and Timing
Operand Execution Pipeline (OEP)
OAG, OC1, or OC2 stage. If no ExComputeEngine write from these stages is pending, the
execution of the opcode may be relocated from the EX stage to the OAG stage.
The last four OEP stages provide a strict scheme for prioritizing pending register updates.
A clear understanding of this scheme is crucial because there are often multiple updates for
a single destination register in the pipeline state at any time. Prioritization is as follows:
1. ExComputeEngine update, OAG stage (highest priority)
2. OagComputeEngine update, OAG stage
3. ExComputeEngine update, OC1 stage
4. OagComputeEngine update, OC1 stage
5. ExComputeEngine update, OC2 stage
6. OagComputeEngine update, OC2 stage
7. ExComputeEngine update, EX stage
8. OagComputeEngine update, EX stage (lowest priority)
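The ordering above can be read as "earliest stage first, and within a stage the ExComputeEngine ahead of the OagComputeEngine." The following Python sketch illustrates that reading only; it is not the hardware logic. The names `PendingUpdate`, `priority`, and `select_update` are hypothetical and introduced purely for illustration.

```python
from typing import Iterable, NamedTuple, Optional

# Stage and engine orderings taken from the priority list above.
STAGES = ("OAG", "OC1", "OC2", "EX")
ENGINES = ("Ex", "Oag")  # ExComputeEngine, OagComputeEngine


class PendingUpdate(NamedTuple):
    """Hypothetical record of one pending register write in the OEP."""
    engine: str  # "Ex" or "Oag"
    stage: str   # "OAG", "OC1", "OC2", or "EX"
    reg: str     # destination register name, e.g. "d0"
    value: int


def priority(update: PendingUpdate) -> int:
    """Return the rank from the list above: 1 is highest, 8 is lowest."""
    return STAGES.index(update.stage) * len(ENGINES) + ENGINES.index(update.engine) + 1


def select_update(pending: Iterable[PendingUpdate], reg: str) -> Optional[PendingUpdate]:
    """Pick the highest-priority pending update for one destination register."""
    candidates = [u for u in pending if u.reg == reg]
    return min(candidates, key=priority, default=None)


pending = [
    PendingUpdate("Ex", "EX", "d0", 1),
    PendingUpdate("Oag", "OC1", "d0", 2),
    PendingUpdate("Ex", "OC2", "d0", 3),
]
winner = select_update(pending, "d0")
```

In this example `winner` is the OagComputeEngine update in the OC1 stage (rank 4), which outranks the ExComputeEngine updates in OC2 (rank 5) and EX (rank 7).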
The OEP implements a 2 x 4-stage scoreboard for tracking pending register updates from
the two compute engines. DS stage control logic uses this scoreboard to determine if
dynamic execution relocation can be performed and to generate pipeline stalls on
register-busy conditions, described in Section 6.3.3, “Sequence-Related OEP Stalls.”
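As a rough illustration of how such a scoreboard could drive the relocation decision, the Python sketch below models one slot per compute engine per stage. It is a sketch under stated assumptions, not the actual DS stage logic: the `Scoreboard` class and its methods are hypothetical, and the check is assumed to apply to the source registers of the instruction being considered. Register-busy stall generation is not modeled here.

```python
from typing import Dict, Iterable, Optional, Tuple

STAGES = ("OAG", "OC1", "OC2", "EX")
ENGINES = ("Ex", "Oag")  # ExComputeEngine, OagComputeEngine

# Stages named in the relocation rule: a pending ExComputeEngine write
# from any of these blocks relocation to the OAG stage.
RELOCATION_CHECK_STAGES = ("OAG", "OC1", "OC2")


class Scoreboard:
    """Hypothetical 2 x 4 scoreboard: one slot per (engine, stage)."""

    def __init__(self) -> None:
        # Each slot holds the destination register pending there, or None.
        self.slots: Dict[Tuple[str, str], Optional[str]] = {
            (engine, stage): None for engine in ENGINES for stage in STAGES
        }

    def mark(self, engine: str, stage: str, reg: Optional[str]) -> None:
        """Record (or clear, with reg=None) a pending write in one slot."""
        self.slots[(engine, stage)] = reg

    def can_relocate(self, source_regs: Iterable[str]) -> bool:
        """Assumed rule: relocation is allowed when no ExComputeEngine write
        to any source register is pending in the OAG, OC1, or OC2 stage."""
        sources = set(source_regs)
        return not any(
            self.slots[("Ex", stage)] in sources
            for stage in RELOCATION_CHECK_STAGES
        )


sb = Scoreboard()
sb.mark("Ex", "OC1", "d1")   # ExComputeEngine write to d1 pending in OC1
sb.mark("Oag", "OAG", "a0")  # OagComputeEngine write to a0 pending in OAG
blocked = not sb.can_relocate(["d1"])
allowed = sb.can_relocate(["a0", "d2"])
```

Here an instruction reading `d1` cannot be relocated, while one reading `a0` and `d2` can, because only ExComputeEngine writes in the three checked stages are considered.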
V4 processor core performance measurements produce the following:
• 12% of the instructions always execute in the OagComputeEngine.
• 30% of the instructions can be executed in either compute engine, and 15% of the
  total instructions are relocated to the OagComputeEngine.
• 18% of the instructions include auto-addressing mode updates {(An)+, -(An)}
  performed by the OagComputeEngine.
Together, these three classes (12% + 15% + 18%) mean that the OagComputeEngine is used
on 45% of the dynamic instructions.
Section 6.4, “Instruction Execution Locations,” gives a complete specification of the OEP
compute engine execution location for every ColdFire instruction.
6.3.2 Instruction Folding and the Limited Superscalar OEP
The V4 branch cache supports zero-cycle execution of correctly predicted taken conditional
branch instructions. The V4 OEP also supports instruction folding on other heavily used
constructs. In particular, MOVE is an ideal candidate for instruction folding for two
reasons. First, because ColdFire instructions use a two-operand format, simple assignment
operations using MOVE opcodes are used extensively. Second, because a MOVE involves only an
assignment and no other computation, it can be combined with other opcodes for
dual-issue opportunities using both OEP compute engines. This instruction folding
provides limited superscalar dispatch.
Freescale Semiconductor, Inc.