Revision 3.1
www.national.com
Architecture Overview (Continued)
1.2 FLOATING POINT UNIT
The FPU (Floating Point Unit) interfaces to the integer unit
and the cache unit through a 64-bit bus. The FPU is
x87-instruction-set compatible and adheres to the IEEE-754
standard. Because almost all applications that contain
FPU instructions also contain integer instructions, the
GXm processor’s FPU achieves high performance by
completing integer and FPU operations in parallel.
FPU instructions are dispatched to the pipeline within the
integer unit. The address calculation stage of the pipeline
checks for memory management exceptions and
accesses memory operands for use by the FPU. Once the
instructions and operands have been provided to the FPU,
the FPU completes instruction execution independently of
the integer unit.
1.3 WRITE-BACK CACHE UNIT
The 16 KB write-back unified cache is a data/instruction
cache and is configured as four-way set associative. The
cache stores up to 16 KB of code and data in 1024 cache
lines.
The GXm processor provides the ability to allocate a portion
of the L1 cache as a scratchpad, which is used to
accelerate the Virtual System Architecture algorithms as
well as some graphics operations.
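The cache figures above imply a line size and a set count that the text does not state outright. The following sketch works them out. The three constants come from the text; the way an address is split into tag, set index, and byte offset is the conventional set-associative layout and is an assumption here, as is the helper name `split_address`.

```python
# Figures taken from the text.
CACHE_BYTES = 16 * 1024   # 16 KB unified cache
NUM_LINES = 1024          # cache lines
WAYS = 4                  # four-way set associative

# Derived geometry.
LINE_BYTES = CACHE_BYTES // NUM_LINES   # bytes per cache line
NUM_SETS = NUM_LINES // WAYS            # sets, each holding WAYS lines

# Assumption: low address bits select the byte within a line,
# the next bits select the set, and the rest form the tag.
OFFSET_BITS = LINE_BYTES.bit_length() - 1
INDEX_BITS = NUM_SETS.bit_length() - 1


def split_address(addr):
    """Split an address into (tag, set index, byte offset)."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset
```

With these figures each line holds 16 bytes and the cache has 256 sets. The text does not say whether the cache is indexed by linear or physical address, so the sketch leaves that open.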
1.4 MEMORY MANAGEMENT UNIT
The memory management unit (MMU) translates the linear
address supplied by the integer unit into a physical
address to be used by the cache unit and the internal bus
interface unit. Memory management procedures are
x86-compatible, adhering to standard paging mechanisms.
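As an illustration of the linear-to-physical translation described above, the sketch below walks a two-level page structure. It assumes the classic 32-bit x86 layout with 4 KB pages (10-bit directory index, 10-bit table index, 12-bit offset); the text only says "standard paging mechanisms". The nested-dictionary page directory and the function names are hypothetical stand-ins for the in-memory tables.

```python
PAGE_SHIFT = 12          # assumed 4 KB pages
INDEX_MASK = 0x3FF       # 10-bit directory and table indices
OFFSET_MASK = 0xFFF      # 12-bit byte offset within the page


def split_linear(linear):
    """Split a 32-bit linear address into (directory, table, offset)."""
    directory = (linear >> 22) & INDEX_MASK
    table = (linear >> PAGE_SHIFT) & INDEX_MASK
    offset = linear & OFFSET_MASK
    return directory, table, offset


def translate(linear, page_directory):
    """Translate a linear address to a physical address.

    page_directory is a hypothetical model: it maps a directory index
    to a page table, and each page table maps a table index to a
    page-aligned physical frame base.
    """
    directory, table, offset = split_linear(linear)
    page_table = page_directory.get(directory)
    if page_table is None or table not in page_table:
        # Stands in for the page-fault exception raised by the MMU.
        raise LookupError("page not present")
    return page_table[table] | offset
```

For example, with `page_directory = {1: {2: 0x00ABC000}}`, the linear address `0x00402345` selects directory entry 1, table entry 2, and offset `0x345`.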
The MMU also contains a load/store unit that is responsible
for scheduling cache and external memory accesses.
The load/store unit incorporates two performance-enhancing
features:
Load-store reordering that gives priority to memory reads required by the integer unit over writes to external memory.
Memory-read bypassing that eliminates unnecessary memory reads by using valid data from the execution unit.
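The two load/store features above can be pictured with a toy software model. This is a sketch under stated assumptions, not a description of the hardware: "valid data from the execution unit" is modeled as a queue of writes that have not yet reached memory, and the class and method names are hypothetical.

```python
from collections import deque


class LoadStoreModel:
    """Toy model of read-over-write priority and read bypassing."""

    def __init__(self):
        self.pending_writes = deque()  # (address, value), oldest first
        self.memory = {}               # stand-in for external memory
        self.memory_reads = 0          # reads that actually reached memory

    def write(self, address, value):
        """Queue a write rather than stalling on external memory."""
        self.pending_writes.append((address, value))

    def read(self, address):
        """Service a read ahead of any queued writes."""
        # Bypass: the newest queued write to this address already holds
        # valid data, so no memory read is needed.
        for queued_address, value in reversed(self.pending_writes):
            if queued_address == address:
                return value
        self.memory_reads += 1
        return self.memory.get(address, 0)

    def drain_one(self):
        """Retire the oldest queued write; return False if none remain."""
        if not self.pending_writes:
            return False
        address, value = self.pending_writes.popleft()
        self.memory[address] = value
        return True
```

In this model a read never waits for the write queue to drain, and the bypass check is what keeps that reordering correct when a read targets an address with a write still queued.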
1.4.1 Internal Bus Interface Unit
The internal bus interface unit provides a bridge from the
GXm processor to the integrated system functions
and the PCI bus interface.
When external memory access is required, the physical
address is calculated by the memory management unit
and then passed to the internal bus interface unit, which
translates the cycle to an X-Bus cycle (the X-Bus is a
National Semiconductor proprietary internal bus that
provides a common interface for all of the system modules).
The X-Bus memory cycle is then arbitrated against
other pending X-Bus memory requests to the SDRAM
controller before completing.
In addition, the internal bus interface unit provides
configuration control for up to 20 different regions within system
memory with separate controls for read access, write
access, cacheability, and PCI access.
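One way to picture the per-region controls described above is as a small table of region records. The sketch below is illustrative only: the limit of 20 regions and the four controls come from the text, while the field names, the address-range representation, and the first-match lookup policy are assumptions and do not reflect the actual register layout.

```python
from dataclasses import dataclass

MAX_REGIONS = 20  # from the text


@dataclass(frozen=True)
class Region:
    """Hypothetical record for one configurable memory region."""
    start: int         # first address in the region
    end: int           # one past the last address
    readable: bool
    writable: bool
    cacheable: bool
    pci_access: bool


class RegionMap:
    """Holds up to MAX_REGIONS regions and looks up an address."""

    def __init__(self):
        self.regions = []

    def add(self, region):
        if len(self.regions) >= MAX_REGIONS:
            raise ValueError("no more than 20 regions can be configured")
        self.regions.append(region)

    def lookup(self, address):
        """Return the first region containing address, or None."""
        for region in self.regions:
            if region.start <= address < region.end:
                return region
        return None
```

A caller could, for instance, add a read-only, non-cacheable region and check `lookup(address).writable` before issuing a write in a simulation.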
1.5 INTEGRATED FUNCTIONS
The GXm processor integrates the following functions
traditionally implemented using external devices:
High-performance 2D graphics accelerator
Separate CRT and TFT data paths from the display controller
SDRAM memory controller
PCI bridge
The processor has also been enhanced to support
National Semiconductor’s proprietary Virtual System
Architecture (VSA) implementation.
The GXm processor implements a Unified Memory Architecture
(UMA). By using National Semiconductor’s Display
Compression Technology (DCT), the performance
degradation inherent in traditional UMA systems is eliminated.
1.5.1 Graphics Accelerator
The graphics accelerator is a full-featured GUI (Graphical
User Interface) accelerator. The graphics pipeline implements
a bitBLT engine for frame buffer bitBLTs and rectangular
fills. Additional instructions in the integer unit may
be processed as the bitBLT engine assists the CPU in the
bitBLT operations that take place between system memory
and the frame buffer. This combination of hardware
and software is used by the display driver to provide very
fast transfers in both directions between system memory
and the frame buffer. The bitBLT engine also draws randomly
oriented vectors and scanlines for polygon fill. All
of the pipeline operations described in the following list
can be applied to any bitBLT operation.
Pattern Memory. Render with 8x8 dither, 8x8 monochrome, or 8x1 color pattern.
Color Expansion. Expand monochrome bitmaps to full-depth 8- or 16-bit colors.
Transparency. Suppresses drawing of background pixels for transparent text.
Raster Operations. Boolean operation combines source, destination, and pattern bitmaps.
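The color expansion, transparency, and raster operation stages listed above can be sketched in software as follows. The text does not specify how raster operations are encoded; this sketch assumes the widely used 8-bit ternary raster-operation convention, in which the pattern, source, and destination bits form a 3-bit index into the code. The MSB-first bit order of the monochrome data and all function names are also assumptions.

```python
def expand_mono(bits, width, fg, bg, transparent=False):
    """Expand a row of monochrome bits (MSB first) to full-depth colors.

    With transparent=True, background pixels become None, marking
    them as suppressed so the destination is left unchanged.
    """
    row = []
    for i in range(width):
        bit = (bits >> (width - 1 - i)) & 1
        if bit:
            row.append(fg)
        else:
            row.append(None if transparent else bg)
    return row


def rop3(code, pattern, source, dest, depth=8):
    """Combine pattern, source, and destination under an 8-bit ROP code."""
    result = 0
    for i in range(depth):
        p = (pattern >> i) & 1
        s = (source >> i) & 1
        d = (dest >> i) & 1
        # Assumed convention: (P, S, D) selects one bit of the code.
        if (code >> ((p << 2) | (s << 1) | d)) & 1:
            result |= 1 << i
    return result


def blt_row(dest_row, src_row, pattern, code, depth=8):
    """Apply the ROP to one row, skipping suppressed (None) pixels."""
    return [
        d if s is None else rop3(code, pattern, s, d, depth)
        for d, s in zip(dest_row, src_row)
    ]
```

Under the assumed encoding, code `0xCC` copies the source, `0x66` XORs source with destination, and `0xF0` copies the pattern. Drawing transparent text then amounts to expanding the glyph row with `transparent=True` and passing it to `blt_row`.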