Implementation of Write Allocate in the K86 Processors
21326F/0—February 1999
not executed. The data associated with the pending write cycle
is merged with the recently-allocated data-cache line and
stored in the processor’s L1 data cache. If the data-cache line
was fetched from memory (because of an L2 cache miss), the
data is stored without modification in the L2 cache. The final
MESI state of the cache lines depends on the state of the
WB/WT# and PWT signals during the burst read cycle and the
subsequent L1 data cache write hit. If the L1 data-cache line is
stored in the modified state, then the same cache line is stored
in the L2 cache in the exclusive state. If the L1 data-cache line is
stored in the shared state, then the same cache line is stored in
the L2 cache in the shared state.
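The relationship between the final L1 and L2 states described above can be sketched as a small lookup. This is only an illustrative model of the two cases the text names. The type `mesi_t` and the function `l2_state_after_write_allocate` are hypothetical names, not part of any AMD interface.

```c
#include <assert.h>

/* MESI states; the enum and its names are illustrative only. */
typedef enum {
    MESI_INVALID,
    MESI_SHARED,
    MESI_EXCLUSIVE,
    MESI_MODIFIED
} mesi_t;

/*
 * Returns the state in which the L2 cache holds a write-allocated line,
 * given the final state of the same line in the L1 data cache.
 * Only the two outcomes described in the text are modeled.
 */
static mesi_t l2_state_after_write_allocate(mesi_t l1_state)
{
    /* The text covers only the modified and shared L1 outcomes. */
    assert(l1_state == MESI_MODIFIED || l1_state == MESI_SHARED);

    /* L1 modified -> L2 exclusive; L1 shared -> L2 shared. */
    return (l1_state == MESI_MODIFIED) ? MESI_EXCLUSIVE : MESI_SHARED;
}
```

The L2 copy is never marked modified in this model because, as stated above, the fetched data is stored in the L2 cache without modification. Only the L1 copy receives the merged write data.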
All AMD-K6 Models
During the write allocation, a 32-byte burst read cycle is
executed in place of a non-burst write cycle (in the case of the
Model 9, this assumes the data-cache line was not present in the
L2 cache). While the burst read cycle generally takes longer to
execute than the non-burst write cycle, performance gains are
realized on subsequent write cycle hits to the write-allocated
cache line. Because memory accesses tend to cluster near one another
(the principle of locality), additional write hits to the
write-allocated cache line are likely.
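The trade-off above can be made concrete with a rough cost model. This is a sketch under stated assumptions, not a timing specification. The function names and all cycle counts passed to them are hypothetical, and the model ignores the later writeback of a modified line.

```c
/*
 * Cost of n writes to the same cache line when write allocate is off:
 * every write misses and goes to the bus as a non-burst write cycle.
 * write_cycles is an assumed, caller-supplied figure.
 */
static unsigned long cycles_without_wa(unsigned long n,
                                       unsigned long write_cycles)
{
    return n * write_cycles;
}

/*
 * Cost of n writes to the same cache line when write allocate is on:
 * the first write is replaced by one burst read (the write data is
 * merged into the allocated line), and the remaining n - 1 writes hit
 * the cache. burst_cycles and hit_cycles are assumed figures.
 */
static unsigned long cycles_with_wa(unsigned long n,
                                    unsigned long burst_cycles,
                                    unsigned long hit_cycles)
{
    if (n == 0) {
        return 0;
    }
    return burst_cycles + (n - 1) * hit_cycles;
}
```

With assumed costs of 12 cycles for the burst read, 6 for a non-burst write, and 1 for a write hit, a single write is slower with write allocate (12 versus 6 cycles), but three writes to the same line are faster (14 versus 18 cycles).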
For the Model 9, write allocates that hit the L2 cache increase
performance by avoiding accesses to the system bus.
Programming Details
The steps required for programming write allocate on K86
processors are as follows:
1. Verify write allocate support by using the CPUID
instruction to check for the correct model and stepping of
the processor.
2. Configure the Model-Specific Registers (MSRs).
3. Enable write allocate.
Note: The BIOS should enable the write allocate mechanisms
only after performing any memory sizing or typing
algorithms.
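Step 1 depends on the family, model, and stepping fields of the processor signature that CPUID standard function 1 returns in EAX. The following is a minimal sketch of decoding those fields. The type `cpu_signature_t` and the function `decode_signature` are hypothetical names. Which model and stepping combinations support write allocate is not encoded here and must be taken from the processor documentation.

```c
#include <stdint.h>

/* Decoded processor signature fields; the type name is illustrative. */
typedef struct {
    unsigned family;
    unsigned model;
    unsigned stepping;
} cpu_signature_t;

/*
 * Splits the EAX value returned by CPUID standard function 1 into its
 * base fields: stepping in bits 3:0, model in bits 7:4, and family in
 * bits 11:8. The caller is assumed to have executed CPUID already.
 */
static cpu_signature_t decode_signature(uint32_t eax)
{
    cpu_signature_t sig;

    sig.stepping = eax & 0xFu;
    sig.model    = (eax >> 4) & 0xFu;
    sig.family   = (eax >> 8) & 0xFu;
    return sig;
}
```

For example, a signature value of 0x591 decodes to family 5, model 9, stepping 1. Taking the raw EAX value as a parameter keeps the decoding separate from the CPUID instruction itself, so the same routine can be exercised outside the BIOS environment.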