NAME: HISTORY OF THE ARM PROCESSOR Advanced RISC

NAME: IBRAHIM ABIOLA TIAMIYU
INSTRUCTOR: DR MARIOS
SPECIAL TOPICS PROJECT

A STUDY OF THE ARM MICROCONTROLLER

Chapter 1

THE BASIC INTRODUCTION TO THE ARM PROCESSOR

An ARM processor is a member of a family of CPUs based on the RISC (reduced instruction set computer) architecture developed by Advanced RISC Machines (ARM). ARM designs 32-bit and 64-bit RISC multi-core processors. RISC processors are designed to implement a smaller number of instruction types so that they can operate at a higher speed, executing more millions of instructions per second (MIPS). By stripping out unneeded instructions and optimizing pathways, RISC processors provide outstanding performance at a fraction of the power demand of CISC (complex instruction set computing) devices.

ARM processors are widely used in consumer electronics such as smartphones, tablets and wearables. Because of their reduced instruction set, they require fewer transistors than their CISC counterparts, which enables a smaller die size for the integrated circuit (IC). Their smaller size, reduced complexity and lower power consumption make them suitable for increasingly miniaturized devices.

Combining an ARM microprocessor with RAM, ROM and other peripherals on a single chip gives an ARM microcontroller, for example the LPC2148.

It is also important to understand the history of the ARM processor and how it came to change our lives.

BRIEF HISTORY OF THE ARM PROCESSOR

The Advanced RISC Machine (ARM) was one of the first reduced instruction set computer (RISC) processors in commercial use around the world, and is presently developed by ARM Holdings.
The history of the ARM processor goes back to 1983 in England, when Acorn Computers Ltd officially launched the Acorn RISC Machine project. Acorn was inspired to design its own processor by Berkeley RISC, one of the high-impact projects under the VLSI project of ARPA (the Advanced Research Projects Agency, now known as DARPA). Berkeley RISC dealt with RISC-based microprocessor design and was led by David Patterson, who coined the term 'RISC'. Despite the name, a processor does not qualify as RISC simply by having fewer than 100 instructions; what matters is a highly optimized instruction set.

In the beginning, ARM stood for Acorn RISC Machine. With VLSI Technology, Inc. as its silicon partner, Acorn produced the ARM1, the first ARM silicon, on April 26, 1985. The ARM1 was used as a second processor for the BBC Micro, both to develop the simulation software needed to finish work on the support chips (VIDC, IOC and MEMC) and to speed up the CAD software used in the development of the ARM2.

While developing an entirely new computing platform for the Newton, its personal digital assistant, Apple found that only the Acorn RISC Machine came close to its requirements. Because the ARM had no integrated memory management unit, Apple collaborated with Acorn to develop it further. As a result of this collaboration, a separate company, ARM Ltd, was established in 1990, with Acorn Group and Apple Computer, Inc. each holding a 43 per cent share and VLSI Technology, Inc. as an investor. Acorn's advanced research and development section moved into the new company. From then on, ARM became the acronym for Advanced RISC Machine.

FEATURES OF THE ARM PROCESSOR

- Load/store architecture.
- An orthogonal instruction set.
- Mostly single-cycle execution.
- Enhanced power-saving design.
- 64-bit and 32-bit execution states for scalable high performance.
- Hardware virtualization support.
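The first feature in the list, the load/store architecture, means that arithmetic instructions work only on registers, while memory is touched only by dedicated load and store instructions. The following is a minimal Python sketch of that idea, not a model of real ARM encodings: the tuple-based instruction format, the `run` function and the memory addresses are all assumptions made for illustration.

```python
# Toy model of a load/store machine (illustrative only, not real ARM encoding).
def run(program, memory):
    """Execute a list of (op, *operands) tuples and return the register file."""
    regs = [0] * 16  # sixteen registers, R0-R15
    for op, *args in program:
        if op == "LDR":      # LDR Rd, [addr]: copy memory into a register
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "STR":    # STR Rs, [addr]: copy a register into memory
            rs, addr = args
            memory[addr] = regs[rs]
        elif op == "ADD":    # ADD Rd, Rn, Rm: operates on registers only
            rd, rn, rm = args
            regs[rd] = (regs[rn] + regs[rm]) & 0xFFFFFFFF  # wrap to 32 bits
        else:
            raise ValueError(f"unknown op {op!r}")
    return regs


# Hypothetical memory contents: two operands and a slot for the result.
memory = {0x100: 7, 0x104: 35, 0x108: 0}
program = [
    ("LDR", 0, 0x100),   # R0 <- mem[0x100]
    ("LDR", 1, 0x104),   # R1 <- mem[0x104]
    ("ADD", 2, 0, 1),    # R2 <- R0 + R1
    ("STR", 2, 0x108),   # mem[0x108] <- R2
]
run(program, memory)
print(memory[0x108])  # 42
```

Adding two values held in memory takes four instructions here, because the ADD cannot read memory directly. That restriction is what keeps each instruction simple enough for mostly single-cycle execution.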
The ARM architecture is a simple hardware design that allows unneeded features to be left off the chip, which gives designers flexibility. It also allows a small die size, which considerably reduces cost. Low cost, a simple pipeline and the freedom to place the design point wherever suits low power consumption add to its benefits for embedded applications.

ARM also provides an instruction set called 'Thumb', which compresses 32-bit instructions to 16 bits, enabling programs to be coded much more densely than with standard RISC instruction sets. Processors that execute Thumb also run 32-bit ARM instructions, so 16-bit and 32-bit code can be mixed in the same program while maintaining powerful computing capabilities. ARM cores are also simple compared to other processors because they can be built with few transistors, which lowers their production cost.

Today, most embedded applications, such as set-top boxes, smartphones, digital televisions and digital cameras, use an ARM processor because of its cost-effectiveness and low power consumption. The ARM architecture is also supported by major operating systems such as Symbian OS, Palm OS, Windows and Android.

The instruction sets of the ARM processor are classified as the ARM instruction set, the Thumb instruction set and Jazelle mode. ARM mode is the standard 32-bit instruction set. The Thumb instruction set is a 16-bit compressed form that provides higher code density than the ARM instruction set. Jazelle DBX (Direct Bytecode eXecution) allows some ARM processors to execute Java bytecode.

VERSIONS AND FEATURES OF ARM MICROCONTROLLER

One of the most advanced forms of these microcontrollers is the Cortex family, introduced with the ARMv7 architecture.
The Cortex family is further divided into:

- Cortex-A series
- Cortex-R series
- Cortex-M series

CORTEX-M3 MICROCONTROLLER FEATURES

The Cortex-M3 is a 32-bit processor offering many advantages over other microcontrollers:

- It uses a Harvard architecture, with separate instruction and data buses for communication with ROM and RAM.
- It is a reduced instruction set computing (RISC) controller with a high-performance 32-bit CPU and a 3-stage pipeline that fetches, decodes and then executes each instruction.
- It uses Thumb-2 technology, so it can handle 16-bit as well as 32-bit instructions. This gives high code density, which reduces the memory required for the program, along with high performance.
- The core is closely integrated with the NVIC (Nested Vectored Interrupt Controller), which provides low-latency, low-jitter interrupt response.
- It has low-power modes, supports sleep modes and provides multiple power domains.
- It can be programmed without assembly language.

THUMB

- Thumb is a subset of the ARM instruction set encoded in 16-bit wide instructions.
- Thumb code requires 70% of the space of ARM code.
- It uses 40% more instructions than equivalent ARM code.
- A CPU has Thumb support if it has a T in its name, or if it is architecture v6 or later.
- With 32-bit memory, ARM code is 40% faster than Thumb code.
- With 16-bit memory, Thumb code is 45% faster than ARM code.
- Thumb code uses 30% less external memory power than ARM code.
- Thumb is not a complete architecture: you can't have a Thumb-only CPU.
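The two size figures in the Thumb list are consistent with each other: if Thumb needs 40% more instructions but each one is half the width, the code occupies 1.4 × 2 / 4 = 70% of the ARM size. The short sketch below works this through; the function names and the 1000-instruction routine are hypothetical, chosen only for illustration.

```python
ARM_BYTES = 4    # one ARM instruction is 32 bits wide
THUMB_BYTES = 2  # one Thumb instruction is 16 bits wide


def thumb_size_ratio(extra_instructions=0.40):
    """Size of Thumb code relative to equivalent ARM code."""
    thumb_count = 1.0 + extra_instructions  # Thumb instructions per ARM instruction
    return thumb_count * THUMB_BYTES / ARM_BYTES


def code_sizes(arm_instructions, extra_instructions=0.40):
    """Return (ARM bytes, Thumb bytes) for a routine of the given length."""
    arm = arm_instructions * ARM_BYTES
    thumb = round(arm * thumb_size_ratio(extra_instructions))
    return arm, thumb


print(f"{thumb_size_ratio():.2f}")  # 0.70
print(code_sizes(1000))             # (4000, 2800)
```

The 40% figure is an average quoted in the list above; real routines will vary, so the result is best read as a rough estimate.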
Some of the limitations of Thumb mode include:

- Conditional execution only exists for branch instructions.
- Data processing operations use a two-address format, as opposed to ARM's three-address format.
- Its instruction encodings are less regular than ARM's.
- Thumb uses the same register set as ARM, but only R0-R7 are generally accessible.

THUMB-2

- Thumb-2 is the newest version of Thumb.
- It is present in the Cortex CPU series (or any v7).
- It is now a complete architecture: you can have a Thumb-2-only CPU (v7M).
- Its mixed 16/32-bit instruction stream provides the economy of space of Thumb combined with most of the speed of pure ARM code.

What does Thumb-2 assembly language look like? It is much like ARM assembler. ARM Ltd. has unified the ARM and Thumb assembler formats into UAL, the Unified Assembler Language. New ARM assemblers can take UAL source and output either ARM or Thumb-2 code.

JAZELLE

Besides the ARM and Thumb modes, a further technology allows Java bytecode to be executed in hardware. This technology is known as Jazelle. It is most prominently used in mobile phones to increase the execution speed of Java ME games. Most Java bytecodes run in hardware, while the Java Virtual Machine (JVM) performs the more complicated operations in software. The first processor to use Jazelle was the ARM926EJ-S, and the ARMv5TEJ architecture specifies Jazelle's functionality. The JVM software depends on the details of the hardware interface, so the JVM and the hardware can evolve together without affecting other software.

THE ARM ARCHITECTURES

ARM architecture evolution

The ARM architecture has evolved through many stages; smartphones use the ARMv5 architecture and later releases. The hardware floating-point unit (FPU) is the major change brought in by ARMv7, providing more speed than software-based floating point.
DSP instructions were also added to the set to improve the ARM architecture for digital signal processing and multimedia applications. In ARMv7, Thumb-2 instructions were also added to improve code density. ARMv8 makes a considerable change by moving to a 64-bit architecture. With these features, the ARM architecture is widely adopted by many organizations.

Figure: ARM evolution chart

ARCHITECTURE DESCRIPTION

The Arm architecture is the bedrock of all that Arm does, and the foundation on which all its CPU products are built. By architecture we mean the contract between the hardware and the software, which assigns rights and responsibilities to each party. It defines how compatible hardware behaves for correctly written software and is the essence of Arm's portability guarantee. The architecture defines the basic instruction set and the exception and memory models relied upon by the operating system and hypervisor. In effect, the architecture defines what the CPU must do, but says very little about how it does it. The micro-architecture and implementation of the CPU sit on top of the architecture and determine how it meets the architectural contract. They define the processor's power, performance and area through choices such as pipeline length and levels of cache.

An ARM processor has 31 general-purpose 32-bit registers. Sixteen of them, R0-R15, are visible at any one time and can be modified by the user; the others are banked registers that speed up exception handling. Some registers have special roles: R13 acts as the stack pointer (SP), R14 as the link register (LR) and R15 as the program counter (PC).

ARM processor modes of operation

There are seven modes of operation, categorized as user mode, privileged mode and exception mode. User mode is the normal program execution mode, in which protected system resources are not accessible.
If an exception occurs, the processor switches to an exception mode. In exception mode, all system resources are available.

IMPORTANT REGISTERS IN ARM

There are two important status registers in ARM: the Current Program Status Register (CPSR) and the Saved Program Status Register (SPSR). The CPSR is similar to the PSW register in 8051 microcontrollers; it holds important flag bits such as the carry and zero flags. The SPSR is used in exception modes: whenever an exception occurs, the contents of the CPSR are copied into the SPSR.

Figure: Format of the CPSR and the SPSR (source: ARM reference manual)

ARM has sixteen registers visible at any one time. They are named R0 to R15, and all are 32 bits wide. Some registers may also be referred to by aliases (SP, LR and PC). All of the registers are general purpose, save for:

- R13 / SP, which holds the stack pointer.
- R14 / LR, the link register, which holds the caller's return address.
- R15 / PC, which holds the program counter.

In addition to the main registers there is also a status register.

The ARM architecture has evolved through many stages; smartphones use the ARMv5 architecture and later releases. The hardware floating-point unit (FPU) is the major change brought in by ARMv7, providing more speed than software-based floating point. DSP instructions were also included in the set to upgrade the ARM architecture for digital signal processing (DSP) and multimedia applications. ARMv7 also added Thumb-2, which extends the 16-bit Thumb instruction set with 32-bit instructions, producing a variable-length instruction set that combines the code density of Thumb with the performance of the ARM instruction set. A new 'unified assembly language' (UAL) allows either Thumb-2 or ARM instructions to be generated from the same source code, whichever is required.
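To make the CPSR description in the registers section above more concrete, the sketch below decodes a 32-bit CPSR value into its condition flags, control bits and mode. The bit positions and mode encodings are not given in this report; they are assumed here from the classic 32-bit ARM layout (N, Z, C, V in bits 31-28; I, F, T in bits 7-5; mode in bits 4-0). The function name `decode_cpsr` and the sample value are hypothetical.

```python
# Mode field encodings (bits 4-0), assumed from the classic 32-bit ARM layout.
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_cpsr(value):
    """Split a 32-bit CPSR value into flags, control bits and mode name."""
    return {
        "N": bool((value >> 31) & 1),  # negative flag
        "Z": bool((value >> 30) & 1),  # zero flag
        "C": bool((value >> 29) & 1),  # carry flag
        "V": bool((value >> 28) & 1),  # overflow flag
        "I": bool((value >> 7) & 1),   # IRQ interrupts disabled
        "F": bool((value >> 6) & 1),   # FIQ interrupts disabled
        "T": bool((value >> 5) & 1),   # Thumb state
        "mode": MODES.get(value & 0x1F, "unknown"),
    }


state = decode_cpsr(0x600000D3)  # sample value chosen for illustration
print(state["mode"], state["Z"], state["C"])  # Supervisor True True
```

The seven entries in `MODES` correspond to the seven modes of operation mentioned earlier. Because the SPSR holds a copy of the CPSR taken at exception entry, the same decoding would apply to an SPSR value.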
ARMv8 makes a considerable change by introducing a 64-bit architecture and cryptography instructions supporting AES and SHA-1/SHA-256, while still allowing 32-bit applications to be executed.

EARLY AND CURRENT ARCHITECTURE OF ARM

Early Architectures

v1
- Developed at Acorn, Cambridge, UK, between October 1983 and April 1985.
- Fewer than 25,000 transistors.
- No multiply or coprocessor instructions.
- 26-bit addressing.

v2
- 30,000 transistors.
- 32-bit multiplier instructions (MUL).

v2a
- First ARM with an on-chip cache (ARM3).

v3
- 32-bit addressing.
- Undefined Instruction and Abort modes (allows virtual memory).

v3M
- Signed and unsigned long multiply and multiply-accumulate instructions: SMULL, SMLAL, UMULL, UMLAL.

v4
- The oldest supported architecture today.
- Load/store instructions for signed and unsigned halfwords and bytes: LDRH, LDRSH, LDRSB.
- System mode: a privileged mode using the user registers.
- 26-bit addressing no longer supported.

v4T
- Thumb mode.

v5T
- Superset of ARMv4T.
- New instructions: BLX, CLZ and BKPT.

v5TE
- New signal processing instructions.
- New multiply instructions for DSP: SMULxy, SMLAxy, SMULWy, SMLAWy, SMLALxy.
- Saturated math support: Q flag, QADD, QSUB, QDADD, QDSUB.
- New PLD memory pre-load hint instruction.

v5TEJ
- Java acceleration.

v6
- Mixed endian data handling: SETEND, REV, REV16, REVSH.
- 60+ new SIMD instructions: SMUSD, SMUADX, USAD8, USADA8.
- Unaligned data handling.
- New multiprocessing instructions: LDREX, STREX.

v6T2
- Thumb-2.

v7A, v7R
- Dynamic compiler support: execution environment (Thumb-2EE).
- VFP v3 (Vector Floating Point).
- NEON advanced SIMD.
- Thumb-2 mandated.

v7M
- Minimalist variant for embedded uses.
- Thumb-2 only.
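Several of the instructions named in the list above are easier to picture with a small model. The Python sketch below approximates what three of them compute: QADD (v5TE saturating add), CLZ (v5T count leading zeros) and REV (v6 byte reversal). These are behavioural approximations written for illustration, not hardware definitions. In particular, the real Q flag is sticky and stays set until software clears it, whereas this `qadd` only reports whether the current addition saturated.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def qadd(a, b):
    """Saturating signed 32-bit add; returns (result, saturated)."""
    total = a + b
    if total > INT32_MAX:
        return INT32_MAX, True   # clamp instead of wrapping around
    if total < INT32_MIN:
        return INT32_MIN, True
    return total, False


def clz(x):
    """Count the leading zero bits of a 32-bit unsigned value."""
    return 32 - (x & 0xFFFFFFFF).bit_length()


def rev(x):
    """Reverse the byte order of a 32-bit word (endianness swap)."""
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, "big"), "little")


print(qadd(INT32_MAX, 1))    # (2147483647, True)
print(clz(1))                # 31
print(hex(rev(0x12345678)))  # 0x78563412
```

Saturation matters in signal processing because a clipped sample is usually far less harmful than one that wraps from a large positive value to a large negative one.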