MOTOROLA
Chapter 2. PowerPC Processor Core
table search operations through the hashed page table on TLB misses. Supervisor software
can invalidate TLB entries selectively.
After an effective address is generated, the higher-order bits of the effective address are
translated by the appropriate MMU into physical address bits. Simultaneously, the lower-order
address bits (which are untranslated and are therefore considered both logical and physical)
are directed to the on-chip caches, where they form the index into the four-way set-associative
tag array. After translating the address, the MMU passes the higher-order bits
of the physical address to the cache, and the cache lookup completes. For caching-inhibited
accesses or accesses that miss in the cache, the untranslated lower-order address bits are
concatenated with the translated higher-order address bits; the resulting 32-bit physical
address is then used by the system interface, which accesses external memory.
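The address split described above can be sketched in a few lines. This is an illustrative model only: it assumes 4-Kbyte pages (so the low-order 12 bits are untranslated) and a hypothetical 16-Kbyte, four-way set-associative cache with 32-byte lines (128 sets). The cache geometry and all function names are assumptions made for the example, not values taken from this section.

```python
PAGE_OFFSET_BITS = 12   # assumed 4-Kbyte pages: low 12 bits are untranslated
LINE_BYTES = 32         # assumed cache line size
NUM_SETS = 128          # assumed: 16 Kbytes / (4 ways * 32 bytes per line)


def split_effective_address(ea):
    """Return (page_number, page_offset) for a 32-bit effective address."""
    ea &= 0xFFFFFFFF
    return ea >> PAGE_OFFSET_BITS, ea & ((1 << PAGE_OFFSET_BITS) - 1)


def cache_set_index(ea):
    """Set index formed only from untranslated low-order bits (bits 5..11 here)."""
    return (ea // LINE_BYTES) % NUM_SETS


def physical_address(ea, physical_page_number):
    """Concatenate translated high-order bits with the untranslated offset."""
    _, offset = split_effective_address(ea)
    return ((physical_page_number << PAGE_OFFSET_BITS) | offset) & 0xFFFFFFFF
```

Because the set index in this model lies entirely within the page offset, the cache can begin indexing its tag array while the MMU is still translating the higher-order bits. For example, `physical_address(0x12345678, 0x00ABC)` yields `0x00ABC678`.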
For instruction accesses, the MMU performs an address lookup in both the 64-entry ITLB
and the IBAT array. If an effective address hits in both the ITLB and the IBAT
array, the IBAT array translation takes priority. Data accesses cause a lookup in the DTLB
and DBAT array for the physical address translation. In most cases, the physical address
translation resides in one of the TLBs and the physical address bits are readily available to
the on-chip cache.
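The lookup priority described above can be modeled as follows. This is a simplified sketch under stated assumptions: the BAT array is represented as a hypothetical list of (effective base, block length, physical base) tuples, and the TLB as a plain dictionary keyed by effective page number, which ignores the segment (VSID) lookup and set-associative organization of the real hardware. In the core the two lookups happen in parallel; the sequential code only models which result is used.

```python
def bat_lookup(bat_array, ea):
    """Return the physical address if ea falls inside a BAT block, else None."""
    for base_ea, length, base_pa in bat_array:
        if base_ea <= ea < base_ea + length:
            return base_pa + (ea - base_ea)
    return None


def tlb_lookup(tlb, ea):
    """Return the physical address on a TLB hit (4-Kbyte pages assumed), else None."""
    ppn = tlb.get(ea >> 12)
    if ppn is None:
        return None
    return (ppn << 12) | (ea & 0xFFF)


def translate(bat_array, tlb, ea):
    """A BAT hit takes priority over a TLB hit; None models a TLB miss."""
    pa = bat_lookup(bat_array, ea)
    if pa is not None:
        return pa
    return tlb_lookup(tlb, ea)
```

With a 128-Kbyte block `(0x00000000, 0x20000, 0x10000000)` in the BAT list and a TLB entry for the same page, `translate` returns the BAT result; an address found in neither structure returns `None`, which corresponds to the TLB miss case.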
When the physical address translation misses in the TLBs, the processor core provides
hardware assistance for software to search the translation tables in memory. When a
required TLB entry is not found in the appropriate TLB, the processor vectors to one of the
three TLB miss exception handlers so that the software can perform a table search operation
and load the TLB. When this occurs, the processor automatically saves information about
the access and the executing context. Refer to the MPC603e User's Manual for more
detailed information about these features and the suggested software routines for searching
the page tables.
2.7 Instruction Timing
The processor core is a pipelined superscalar processor. A pipelined processor is one in
which the processing of an instruction is broken into discrete stages. Because the
processing of an instruction is broken into a series of stages, an instruction does not require
the entire resources of an execution unit at one time. For example, after an instruction
completes the decode stage, it can pass on to the next stage, while the subsequent
instruction can advance into the decode stage. This improves the throughput of the
instruction flow. The instruction pipeline in the processor core has four major stages,
described as follows:
• The fetch pipeline stage primarily involves retrieving instructions from the memory
system and determining the location of the next instruction fetch. Additionally, the
BPU decodes branches during the fetch stage and folds out branch instructions
before the dispatch stage if possible.