Rev. A
|
Page 4 of 72
|
September 2011
Digital applications interface that includes four precision
clock generators (PCG), an input data port (IDP/PDAP)
for serial and parallel interconnect, an S/PDIF
receiver/transmitter, four asynchronous sample rate con-
verters, eight serial ports, a shift register, and a flexible
signal routing unit (DAI SRU).
Digital peripheral interface that includes two timers, a 2-
wire interface, one UART, two serial peripheral interfaces
(SPI), 2 precision clock generators (PCG), three pulse
width modulation (PWM) units, and a flexible signal rout-
ing unit (DPI SRU).
As shown in the SHARC core block diagram
on Page 5, the pro-
cessors use two computational units to deliver a significant
performance increase over the previous SHARC processors on a
range of DSP algorithms. With its SIMD computational hard-
ware, the processors can perform 1.8 GFLOPS running at
300 MHz.
FAMILY CORE ARCHITECTURE
The processors are code compatible at the assembly level with
the ADSP-2146x, ADSP-2137x, ADSP-2136x, ADSP-2126x,
ADSP-21160, and ADSP-21161, and with the first generation
ADSP-2106x SHARC processors. The ADSP-2147x shares
architectural features with the ADSP-2126x, ADSP-2136x,
ADSP-2137x, ADSP-2146x, and ADSP-2116x SIMD SHARC
processors, as shown in
Figure 2 and detailed in the following
sections.
SIMD Computational Engine
The processors contain two computational processing elements
that operate as a single-instruction, multiple-data (SIMD)
engine. The processing elements are referred to as PEX and PEY
and each contains an ALU, multiplier, shifter, and register file.
PEX is always active, and PEY may be enabled by setting the
PEYEN mode bit in the MODE1 register. SIMD mode allows
the processor to execute the same instruction in both processing
elements, but each processing element operates on different
data. This architecture is efficient at executing math intensive
DSP algorithms.
SIMD mode also affects the way data is transferred between
memory and the processing elements because twice the data
bandwidth is required to sustain computational operation in the
processing elements. Therefore, entering SIMD mode also dou-
bles the bandwidth between memory and the processing
elements. When using the DAGs to transfer data in SIMD
mode, two data values are transferred with each memory or reg-
ister file access.
SIMD mode is supported from external SDRAM but is not sup-
ported in the AMI.
Independent, Parallel Computation Units
Within each processing element is a set of computational units.
The computational units consist of an arithmetic/logic unit
(ALU), multiplier, and shifter. These units perform all opera-
tions in a single cycle. The three units within each processing
element are arranged in parallel, maximizing computational
throughput. Single multifunction instructions execute parallel
ALU and multiplier operations. In SIMD mode, the parallel
ALU and multiplier operations occur in both processing ele-
ments. These computation units support IEEE 32-bit single-
precision floating-point, 40-bit extended precision floating-
point, and 32-bit fixed-point data formats.
Timer
The processor contains a core timer that can generate periodic
software interrupts. The core timer can be configured to use
FLAG3 as a timer expired signal.
Data Register File
Each processing element contains a general-purpose data regis-
ter file. The register files transfer data between the computation
units and the data buses, and store intermediate results. These
10-port, 32-register (16 primary, 16 secondary) register files,
combined with the processor’s enhanced Harvard architecture,
allow unconstrained data flow between computation units and
internal memory. The registers in PEX are referred to as
R0–R15 and in PEY as S0–S15.
Context Switch
Many of the processor’s registers have secondary registers that
can be activated during interrupt servicing for a fast context
switch. The data registers in the register file, the DAG registers,
and the multiplier result registers all have secondary registers.
The primary registers are active at reset, while the secondary
registers are activated by control bits in a mode control register.
Universal Registers
Universal registers can be used for general-purpose tasks. The
USTAT (4) registers allow easy bit manipulations (Set, Clear,
Toggle, Test, XOR) for all peripheral control and status
registers.
The data bus exchange register (PX) permits data to be passed
between the 64-bit PM data bus and the 64-bit DM data bus, or
between the 40-bit register file and the PM/DM data bus. These
registers contain hardware to handle the data width difference.
Single-Cycle Fetch of Instruction and Four Operands
The processors feature an enhanced Harvard architecture in
which the data memory (DM) bus transfers data and the pro-
gram memory (PM) bus transfers both instructions and data
(see
Figure 2). With its separate program and data memory
buses and on-chip instruction cache, the processor can simulta-
neously fetch four operands (two over each data bus) and one
instruction (from the cache), all in a single cycle.
Instruction Cache
The processor includes an on-chip instruction cache that
enables three-bus operation for fetching an instruction and four
data values. The cache is selective—only the instructions whose
fetches conflict with PM bus data accesses are cached. This
cache allows full speed execution of core looped operations such
as digital filter multiply-accumulates, and FFT butterfly
processing.