PC7457/47 [Preliminary]
Performs alignment, normalization, and precision conversion for
floating-point data
Executes cache control and TLB instructions
Performs alignment, zero padding, and sign extension for integer data
Supports hits under misses (multiple outstanding misses)
Supports both big- and little-endian modes, including misaligned little-endian
The three issue queues (FIQ, VIQ, and GIQ) can accept as many as one, two, and
three instructions, respectively, in a cycle. Instruction dispatch requires the following:
Instructions can be dispatched only from the three lowest IQ entries – IQ0,
IQ1, and IQ2
A maximum of three instructions can be dispatched to the issue queues per
clock cycle
Space must be available in the CQ for an instruction to dispatch (this
includes instructions that are assigned a space in the CQ but not in an issue
queue)
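The dispatch requirements above can be illustrated with a small behavioral sketch. It is not a description of the actual hardware: the function name `dispatch_cycle`, the tuple layout, and the choice to stop at the first instruction that cannot dispatch (strict in-order dispatch) are assumptions made for illustration. Free space in the issue queues and rename-buffer availability are not modeled.

```python
from collections import deque

# Per-cycle acceptance limits of the issue queues, taken from the text above.
ISSUE_QUEUE_ACCEPT = {"FIQ": 1, "VIQ": 2, "GIQ": 3}
# At most three instructions dispatch per cycle, from IQ0, IQ1, and IQ2 only.
MAX_DISPATCH_PER_CYCLE = 3


def dispatch_cycle(iq, cq_free):
    """Return the names of the instructions dispatched in one cycle.

    iq      -- deque of (name, target) tuples, oldest first (IQ0 at the left).
               target is "FIQ", "VIQ", "GIQ", or None for an instruction
               that needs a CQ entry but no issue queue entry.
    cq_free -- number of free completion queue (CQ) entries.

    Assumption (hypothetical): dispatch is strictly in order, so the
    first instruction that cannot dispatch blocks the ones behind it.
    """
    accepted = {queue: 0 for queue in ISSUE_QUEUE_ACCEPT}
    dispatched = []
    while iq and len(dispatched) < MAX_DISPATCH_PER_CYCLE:
        name, target = iq[0]
        if cq_free == 0:
            break  # no CQ space: nothing further can dispatch
        if target is not None:
            if accepted[target] >= ISSUE_QUEUE_ACCEPT[target]:
                break  # target issue queue already took its per-cycle limit
            accepted[target] += 1
        cq_free -= 1
        dispatched.append(name)
        iq.popleft()
    return dispatched
```

For example, two floating-point instructions at IQ0 and IQ1 cannot both dispatch in the same cycle under this model, because the FIQ accepts only one instruction per cycle.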
Rename buffers
16 GPR rename buffers
16 FPR rename buffers
16 VR rename buffers
Dispatch unit
Decode/dispatch stage fully decodes each instruction
Completion unit
The completion unit retires an instruction from the 16-entry completion
queue (CQ) when all instructions ahead of it have been completed, the
instruction has finished execution, and no exceptions are pending
Guarantees sequential programming model (precise exception model)
Monitors all dispatched instructions and retires them in order
Tracks unresolved branches and flushes instructions after a mispredicted
branch
Retires as many as three instructions per clock cycle
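The in-order retirement rule of the completion unit can be sketched as follows. This is a simplified model under stated assumptions: the function name `retire_cycle` and the dictionary keys `finished` and `exception` are hypothetical, and exception handling and branch-misprediction flushes are not modeled. The retirement loop simply stops at the first entry that cannot retire.

```python
from collections import deque

CQ_ENTRIES = 16            # completion queue depth stated above
MAX_RETIRE_PER_CYCLE = 3   # retirement bandwidth stated above


def retire_cycle(cq):
    """Retire instructions from the head of the completion queue.

    cq -- deque of dicts, oldest first, each with the boolean keys
          "finished" (execution has finished) and "exception"
          (an exception is pending). Both key names are hypothetical.

    Returns the list of entries retired this cycle, in program order.
    """
    retired = []
    while cq and len(retired) < MAX_RETIRE_PER_CYCLE:
        head = cq[0]
        # An entry retires only when everything ahead of it has already
        # retired, it has finished executing, and no exception is pending.
        if not head["finished"] or head["exception"]:
            break
        retired.append(cq.popleft())
    return retired
```

Because only the head of the queue is ever examined, a finished instruction behind an unfinished one waits, which is what preserves the sequential (precise exception) programming model.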
Separate on-chip L1 instruction and data caches (Harvard architecture)
32 Kbyte, eight-way set-associative instruction and data caches
Pseudo least-recently-used (PLRU) replacement algorithm
32-byte (eight-word) L1 cache block
Physically indexed/physical tags
Cache write-back or write-through operation programmable on a per-page or
per-block basis
Instruction cache can provide four instructions per clock cycle; data cache
can provide four words per clock cycle
Caches can be disabled in software
Caches can be locked in software
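The L1 cache parameters listed above (32 Kbytes, eight ways, 32-byte blocks) determine the number of sets and how a physical address divides into tag, set index, and block offset. The following sketch works through that arithmetic. The function name `split_address` and the example address are illustrative only.

```python
CACHE_BYTES = 32 * 1024   # 32 Kbyte cache
WAYS = 8                  # eight-way set-associative
BLOCK_BYTES = 32          # 32-byte (eight-word) cache block

# Sets = capacity / (ways * block size) = 32768 / (8 * 32) = 128
SETS = CACHE_BYTES // (WAYS * BLOCK_BYTES)
OFFSET_BITS = BLOCK_BYTES.bit_length() - 1   # log2(32)  = 5
INDEX_BITS = SETS.bit_length() - 1           # log2(128) = 7


def split_address(paddr):
    """Split a physical address into (tag, set index, block offset)."""
    offset = paddr & (BLOCK_BYTES - 1)
    index = (paddr >> OFFSET_BITS) & (SETS - 1)
    tag = paddr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


tag, index, offset = split_address(0x12345678)
print(hex(tag), index, offset)   # prints: 0x12345 51 24
```

With these parameters the low 5 address bits select a byte within the block, the next 7 bits select one of 128 sets, and the remaining upper bits form the physical tag.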