
Chapter 1 Architectural Overview
Allowing two accesses to be in progress simultaneously can be effectively used by
the separate instruction and data memory systems of a Harvard architecture.
1.3.1 The Am29005
The Am29005 is pin compatible with other 3–bus members of the family (see
Table 1-1). It is an inexpensive version of the Am29000 processor. The Translation
Look–Aside Buffer (TLB) and the Branch Target Cache (BTC) have been omitted. It
is available at a lower clock speed, and only in the less expensive plastic packaging. It
is a good choice for systems which are price sensitive and do not require Memory
Management Unit support or the performance advantages of the BTC. An Am29005
design can always be easily upgraded with an Am29000 replacement later. In fact, the
superior debugging environment offered by the Am29000 or the Am29050 may make
one of these processors a good choice during software debugging. The faster
processor can then be replaced by an Am29005 when production commences.
1.4 THE Am29050 3–BUS FLOATING–POINT MICROPROCESSOR
The Am29050 processor is pin compatible with other 3–bus members of the
family (see Table 1-1) [AMD 1991a]. Many of the features of the Am29050 were
already described in the section on its close relative, the Am29000. The Am29050
offers a number of additional performance and system support features when
compared with the Am29000. The most notable is the direct on–chip execution of
double–precision (64–bit) and single–precision (32–bit) floating–point arithmetic.
The Am29000 has to rely on software emulation or the Am29027 floating–point
coprocessor to perform floating–point operations. The introduction of the Am29050
eliminated the need to design the Am29027 coprocessor into floating–point
intensive systems.
The processor contains a Branch Target Cache (BTC) memory system like the
Am29000, but this time it is twice as big, with 32 entries in each of the two sets rather
than the Am29000’s 16 entries per set. BTC entries are not restricted to four
instructions per entry; there is an option (bit CO in the CFG register) to arrange the
BTC as 64 entries per set, with each entry containing two instructions rather than four.
The smaller entry size is more useful with lower latency memory systems. For example, if
a memory system has a 2–cycle first–access start–up latency, it is more efficient to
have a larger number of 2–instruction entries. After all, for this example system, the
third and fourth instructions in a four–per–entry arrangement could just as efficiently
be fetched from the external memory.
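
The trade–off between the two BTC arrangements can be made concrete with a small C sketch. It is only an illustrative model built from the figures quoted above: the structure, constants and function names are hypothetical and do not come from any AMD header or tool. It also rests on a simplifying assumption: external memory delivers its first instruction a fixed number of cycles after the branch, so only that many instructions per BTC entry actually hide latency.

```c
#include <assert.h>

/* Illustrative model only: all names here are hypothetical. */
struct btc_geometry {
    int sets;                   /* number of sets (two, per the text)   */
    int entries_per_set;        /* branch targets held in each set      */
    int instructions_per_entry; /* instructions stored for each target  */
};

/* Geometries taken from the entry counts quoted in the text. */
static const struct btc_geometry BTC_AM29000      = { 2, 16, 4 };
static const struct btc_geometry BTC_AM29050_FOUR = { 2, 32, 4 };
static const struct btc_geometry BTC_AM29050_TWO  = { 2, 64, 2 };

/* Number of distinct branch targets the cache can hold. */
static int btc_branch_targets(struct btc_geometry g)
{
    return g.sets * g.entries_per_set;
}

/* Total instruction capacity of the cache. */
static int btc_total_instructions(struct btc_geometry g)
{
    return btc_branch_targets(g) * g.instructions_per_entry;
}

/*
 * Assumption (not from the source): memory supplies its first instruction
 * startup_latency cycles after the branch, so at most that many cached
 * instructions per entry are doing useful work.
 */
static int btc_useful_per_entry(struct btc_geometry g, int startup_latency)
{
    assert(startup_latency >= 1);
    return g.instructions_per_entry < startup_latency
               ? g.instructions_per_entry
               : startup_latency;
}

/* Cached instructions that hide latency, summed over every entry. */
static int btc_useful_instructions(struct btc_geometry g, int startup_latency)
{
    return btc_branch_targets(g) * btc_useful_per_entry(g, startup_latency);
}
```

Under this model both Am29050 arrangements hold 256 instructions. With a 2–cycle start–up latency, btc_useful_instructions() counts only half of the four–per–entry cache as useful, while the two–per–entry arrangement covers twice as many branch targets with no wasted storage.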
The Am29050 also incorporates an Instruction Forwarding path, which further
helps to reduce the effects of instruction memory access latency. When a new
instruction fetch sequence commences, and the target of the sequence is not found in