
MOTOROLA
3-6
CENTRAL PROCESSING UNIT
Rev. 15 June 98
MPC509
USER’S MANUAL
The ALU-BFU unit includes the implementation of all integer logic, add and sub-
tract, and bit field instructions.
The IU also includes the integer exception register (XER) and the general-purpose
register file.
IMUL-IDIV and ALU-BFU are implemented as separate execution units. The ALU-BFU
unit can execute one instruction per clock cycle. IMUL-IDIV instructions require multi-
ple clock cycles to execute. IMUL-IDIV is pipelined for multiply instructions, so that
consecutive multiply instructions can be issued on consecutive clock cycles. Divide
instructions are not pipelined; an integer divide instruction preceded or followed by an
integer divide or multiply instruction results in a stall in the processor pipeline. Note
that since IMUL-IDIV and ALU-BFU are implemented as separate execution units, an
integer divide instruction preceded or followed by an ALU-BFU instruction does not
cause a delay in the pipeline.
3.4.3 Load/Store Unit (LSU)
The load-store unit handles all data transfer between the general-purpose and float-
ing-point register files and the internal load/store bus (L-bus). The load/store unit is
implemented as an independent execution unit so that stalls in the memory pipeline
do not cause the master instruction pipeline to stall (unless there is a data depen-
dency). The unit is fully pipelined so that memory instructions of any size may be
issued on back-to-back cycles.
There is a 32-bit wide data path between the load/store unit and the general-purpose
register file and a 64-bit-wide data path between the load/store unit and the floating-
point register file. Single-word accesses to the internal on-chip data RAM require one
clock, resulting in two clocks latency. Double-word accesses require two clocks,
resulting in three clocks latency. Since the L-bus is 32 bits wide, double-word transfers
require two bus accesses. The load/store unit performs zero-fill for byte and half-word
transfers and sign extension for half-word transfers.
Addresses are formed by adding the source one register operand specified by the
instruction (or zero) to either a source two register operand or to a 16-bit, immediate
value embedded in the instruction.
3.4.4 Floating-Point Unit (FPU)
The FPU contains a double-precision multiply array, the floating-point status and con-
trol register (FPSCR), and the FPRs. The multiply-add array allows the MPC509 to
efficiently implement floating-point operations such as multiply, multiply-add, and
divide.
The MPC509 depends on a software envelope to fully implement the IEEE floating-
point specification. Overflows, underflows, NaNs, and denormalized numbers cause
floating-point assist exceptions that invoke a software routine to deliver (with hardware
assistance) the correct IEEE result.
To accelerate time-critical operations and make them more deterministic, the MPC509
provides a mode of operation that avoids invoking the software envelope and attempts