Revision 1.1
11
www.national.com
G
Architecture Overview
(Continued)
1.1.1
The integer unit consists of:
Integer Unit
Instruction Buffer
Instruction Fetch
Instruction Decoder and Execution
The pipelined integer unit fetches, decodes, and executes
x86 instructions through the use of a five-stage integer
pipeline.
The instruction fetch pipeline stage generates, from the on-
chip cache, a continuous high-speed instruction stream for
use by the module. Up to 128 bits of code are read during a
single clock cycle.
Branch prediction logic within the prefetch unit generates a
predicted target address for unconditional or conditional
branch instructions. When a branch instruction is detected,
the instruction fetch stage starts loading instructions at the
predicted address within a single clock cycle. Up to 48
bytes of code are queued prior to the instruction decode
stage.
The instruction decode stage evaluates the code stream
provided by the instruction fetch stage and determines the
number of bytes in each instruction and the instruction
type. Instructions are processed and decoded at a maxi-
mum rate of one instruction per clock.
The
address calculation function is pipelined and contains
two stages, AC1 and AC2. If the instruction refers to a
memory operand, AC1 calculates a linear memory address
for the instruction.
The AC2 stage performs any required memory manage-
ment functions, cache accesses, and register file
accesses. If a floating point instruction is detected by AC2,
the instruction is sent to the floating point unit for process-
ing.
The execution stage, under control of microcode, executes
instructions using the operands provided by the address
calculation stage.
Write-back
,
the last stage of the integer unit, updates the
register file within the integer unit or writes to the load/store
unit within the memory management unit.
1.1.2
The floating point unit (FPU) interfaces to the integer unit
and the cache unit through a 64-bit bus. The FPU is x87-
instruction-set compatible and adheres to the IEEE-754
standard. Because almost all applications that contain FPU
instructions also contain integer instructions, the GX1 mod-
ule’s FPU achieves high performance by completing inte-
ger and FPU operations in parallel.
Floating Point Unit
FPU instructions are dispatched to the pipeline within the
integer unit. The address calculation stage of the pipeline
checks for memory management exceptions and accesses
memory operands for use by the FPU. Once the instruc-
tions and operands have been provided to the FPU, the
FPU completes instruction execution independently of the
integer unit.
1.1.3
The 16 KB write-back unified (data/instruction) cache is
configured as four-way set associative. The cache stores
up to 16 KB of code and data in 1024 cache lines.
Write-Back Cache Unit
The GX1 module provides the ability to allocate a portion of
the L1 cache as a scratchpad, which is used to accelerate
the Virtual Systems Architecture technology algorithms.
1.1.4
The memory management unit (MMU) translates the linear
address supplied by the integer unit into a physical address
to be used by the cache unit and the internal bus interface
unit. Memory management procedures are x86-compati-
ble, adhering to standard paging mechanisms.
Memory Management Unit
The MMU also contains a load/store unit that is responsible
for scheduling cache and external memory accesses. The
load/store unit incorporates two performance-enhancing
features:
Load-store reordering
that gives memory reads,
required by the integer unit, priority over writes to
external memory.
Memory-read bypassing
that eliminates unnecessary
memory reads by using valid data from the execution
unit.
1.1.5
The internal bus interface unit provides a bridge from the
GX1 module to the integrated system functions and the
Fast-PCI bus interface.
Internal Bus Interface Unit
When an external memory access is required, the physical
address is calculated by the memory management unit and
then passed to the internal bus interface unit, which trans-
lates the cycle to an X-Bus cycle (the X-Bus is a proprietary
internal bus which provides a common interface for all of
the integrated functions). The X-Bus memory cycle is arbi-
trated between other pending X-Bus memory requests to
the SDRAM controller before completing.
In addition, the internal bus interface unit provides configu-
ration control for up to 20 different regions within system
memory with separate controls for read access, write
access, cacheability, and PCI access.