49
SAM4CP [DATASHEET]
43051E–ATPL–08/14
12.
ARM Cortex-M4 Processor
12.1
Description
The Cortex-M4 processor is a high performance 32-bit processor designed for the microcontroller market. It offers
significant benefits to developers, including outstanding processing performance combined with fast interrupt handling,
enhanced system debug with extensive breakpoint and trace capabilities, efficient processor core, system and memo-
ries, ultra-low power consumption with integrated sleep modes, and platform security robustness
,
with integrated
memory protection unit (MPU).
The Cortex-M4 processor is built on a high-performance processor core, with a 3-stage pipeline Harvard architecture,
making it ideal for demanding embedded applications. The processor delivers exceptional power efficiency through an
efficient instruction set and extensively optimized design, providing high-end processing hardware including IEEE754-
compliant single-precision floating-point computation, a range of single-cycle and SIMD multiplication and multiply-with-
accumulate capabilities, saturating arithmetic and dedicated hardware division.
To facilitate the design of cost-sensitive devices, the Cortex-M4 processor implements tightly-coupled system compo-
nents that reduce processor area while significantly improving interrupt handling and system debug capabilities. The
Cortex-M4 processor implements a version of the Thumb instruction set based on Thumb-2 technology, ensuring high
code density and reduced program memory requirements. The Cortex-M4 instruction set provides the exceptional perfor-
mance expected of a modern 32-bit architecture, with the high code density of 8-bit and 16-bit microcontrollers.
The Cortex-M4 processor closely integrates a configurable NVIC, to deliver industry-leading interrupt performance. The
NVIC includes a non-maskable interrupt (NMI), and provides up to 256 interrupt priority levels. The tight integration of the
processor core and NVIC provides fast execution of interrupt service routines (ISRs), dramatically reducing the interrupt
latency. This is achieved through the hardware stacking of registers, and the ability to suspend load-multiple and store-
multiple operations. Interrupt handlers do not require wrapping in assembler code, removing any code overhead from the
ISRs. A tail-chain optimization also significantly reduces the overhead when switching from one ISR to another.
To optimize low-power designs, the NVIC integrates with the sleep modes, that include a deep sleep function that
enables the entire device to be rapidly powered down while still retaining program state.
12.1.1 System Level Interface
The Cortex-M4 processor provides multiple interfaces using AMBA
technology to provide high speed, low latency
memory accesses. It supports unaligned data accesses and implements atomic bit manipulation that enables faster
peripheral controls, system spinlocks and thread-safe Boolean data handling.
The Cortex-M4 processor has a Memory Protection Unit (MPU) that provides fine grain memory control, enabling appli-
cations to utilize multiple privilege levels, separating and protecting code, data and stack on a task-by-task basis. Such
requirements are becoming critical in many embedded applications such as automotive.
12.1.2 Integrated Configurable Debug
The Cortex-M4 processor implements a complete hardware debug solution. This provides high system visibility of the
processor and memory through either a traditional JTAG port or a 2-pin Serial Wire Debug (SWD) port that is ideal for
microcontrollers and other small package devices.
For system trace the processor integrates an Instrumentation Trace Macrocell (ITM) alongside data watchpoints and a
profiling unit. To enable simple and cost-effective profiling of the system events these generate, a Serial Wire Viewer
(SWV) can export a stream of software-generated messages, data trace, and profiling information through a single pin.
The Flash Patch and Breakpoint Unit (FPB) provides up to 8 hardware breakpoint comparators that debuggers can use.
The comparators in the FPB also provide remap functions of up to 8 words in the program code in the CODE memory
region. This enables applications stored on a non-erasable, ROM-based microcontroller to be patched if a small pro-
grammable memory, for example flash, is available in the device. During initialization, the application in ROM detects,
from the programmable memory, whether a patch is required. If a patch is required, the application programs the FPB to
remap a number of addresses. When those addresses are accessed, the accesses are redirected to a remap table
specified in the FPB configuration, which means the program in the non-modifiable ROM can be patched.