Title: Embedded Systems
1ECM583 Special Topics in Computer Systems
Lecture 3. ARM Instructions
Prof. Taeweon Suh Computer Science
Education Korea University
2ARM (www.arm.com)
3ARM
- Source 2008 Embedded SW Insight Conference
4ARM Partners
- Source 2008 Embedded SW Insight Conference
5ARM (as of 2008)
- Source 2008 Embedded SW Insight Conference
6ARM Processor Portfolio
- Source 2008 Embedded SW Insight Conference
7Abstraction
- Abstraction helps us deal with complexity
- Hide lower-level detail
- Instruction set architecture (ISA)
- An abstract interface between the hardware and
the lowest-level software
8A Typical Memory Hierarchy in Computer
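[Figure: memory hierarchy. The CPU core (Reg File, ITLB, DTLB) and the L1I (instr cache), L1D (data cache) and L2 caches are followed by Main Memory (DRAM) and Secondary Storage (Disk); the figure marks the on-chip components]
Level            Reg File   L1 caches   L2    Main Memory   Secondary Storage
Speed (cycles)   ½s         1s          10s   100s          10,000s
Size (bytes)     100s       10Ks        Ms    Gs            Ts
Cost             highest                                    lowest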
9Typical and Essential Instructions
- Each CPU provides many instructions
- It would be confusing and complicated to study all the instructions a CPU provides
- But there are essential instructions that all CPUs commonly provide
- Instruction categories
- Arithmetic and Logical (Integer)
- Memory Access Instructions
- Load and Store
- Branch
Registers in ARM
R0, R1, R2, ..., R15, CPSR, SPSR
10Levels of Program Code (ARM)
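- High-level language program (in C)
    void swap(int v[], int k)
    {
        int temp;
        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;
    }
- Assembly language program
    swap    MOV R2, R5, LSL #2    ; R2 = k * 4 (assumes R5 holds k)
            ADD R2, R4, R2        ; R2 = address of v[k] (assumes R4 holds v)
            LDR R12, [R2]         ; R12 = v[k]
            LDR R10, [R2, #4]     ; R10 = v[k+1]
            STR R10, [R2]         ; v[k] = R10
            STR R12, [R2, #4]     ; v[k+1] = R12
            B exit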
11CISC vs RISC
- CISC (Complex Instruction Set Computer)
- One assembly instruction does many (complex) jobs
- Variable-length instructions
- Example: x86 (Intel, AMD)
- RISC (Reduced Instruction Set Computer)
- Each assembly instruction does a small (unit) job
- Fixed-length instructions
- Load/Store architecture
- Example: MIPS, ARM
12ARM Architecture
- ARM is a RISC (Reduced Instruction Set Computer) architecture
- The x86 instruction set is CISC (Complex Instruction Set Computer), even though x86 processors are pipelined internally
- Suitable for embedded systems
- Very small implementation (low price)
- Low power consumption (longer battery life)
13ARM Registers
- ARM has 31 general purpose registers and 6 status
registers (32-bit each)
14ARM Registers
- Unbanked registers: R0-R7
- Each of them refers to the same 32-bit physical register in all processor modes
- They are completely general-purpose registers, with no special uses implied by the architecture
- Banked registers: R8-R14
- R8-R12 have no dedicated special purposes
- FIQ mode has dedicated registers for fast interrupt processing
- R13 and R14 are dedicated for special purposes for each mode
15R13, R14, and R15
- Some registers in ARM are used for special purposes
- R15: PC (Program Counter)
- x86 uses the term IP (Instruction Pointer): the EIP register
- R14: LR (Link Register)
- R13: SP (Stack Pointer)
16CPSR
- The Current Program Status Register (CPSR) is accessible in all modes
- It contains all condition flags, the interrupt disable bits, and the current processor mode
17CPSR bits
18CPSR bits
19CPSR bits
- ARM: 32-bit mode
- Thumb: 16-bit mode
- Jazelle: special mode for Java acceleration
20Interrupt
- An interrupt is an asynchronous signal from hardware indicating the need for attention, or a synchronous event in software indicating the need for a change in execution
- A hardware interrupt causes the processor (CPU) to save its state of execution via a context switch and begin execution of an interrupt handler
- A software interrupt is usually implemented as an instruction in the instruction set, which causes a context switch to an interrupt handler, similar to a hardware interrupt
- Interrupts are a commonly used technique in computer systems for communication between the CPU and peripheral devices
- Operating systems also make extensive use of interrupts (timer interrupt) for task (process, thread) scheduling
21Hardware Interrupt in ARM
- IRQ
- Normal Interrupt Request by asserting IRQ pin
- Program jumps to 0x0000_0018
- FIQ
- Fast Interrupt Request by asserting FIQ pin
- Has a higher priority than IRQ
- Program jumps to 0x0000_001C
22Software Interrupt in ARM
- There is an instruction in ARM for software interrupts
- SWI instruction
- Software interrupts are commonly used by the OS for system calls
- Example: open(), close(), etc.
23Exception Vectors in ARM
24Exception Priority in ARM
25ARM Instruction Overview
- ARM is a RISC machine, so the instruction length is fixed
- In ARM mode, the instructions are 32 bits wide
- In Thumb mode, the instructions are 16 bits wide
- Most ARM instructions can be conditionally executed
- This means that they have their normal effect only if the N (Negative), Z (Zero), C (Carry) and V (Overflow) flags in the CPSR satisfy a condition specified in the instruction
- If the flags do not satisfy this condition, the instruction acts as a NOP (No Operation)
- In other words, the instruction has no effect and execution advances to the next instruction
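The conditional-execution rule above can be sketched as a small Python model. This is only an illustration, not processor code: the names `CONDITIONS`, `condition_passed` and `add_if` are hypothetical, and the table lists the standard ARM condition mnemonics with the flag tests they stand for.

```python
# Hypothetical model of ARM conditional execution (illustration only).
# Each entry maps a condition mnemonic to a test on the N, Z, C, V flags.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,             # equal
    "NE": lambda n, z, c, v: z == 0,             # not equal
    "CS": lambda n, z, c, v: c == 1,             # carry set (unsigned >=)
    "CC": lambda n, z, c, v: c == 0,             # carry clear (unsigned <)
    "MI": lambda n, z, c, v: n == 1,             # negative
    "PL": lambda n, z, c, v: n == 0,             # positive or zero
    "VS": lambda n, z, c, v: v == 1,             # overflow
    "VC": lambda n, z, c, v: v == 0,             # no overflow
    "HI": lambda n, z, c, v: c == 1 and z == 0,  # unsigned >
    "LS": lambda n, z, c, v: c == 0 or z == 1,   # unsigned <=
    "GE": lambda n, z, c, v: n == v,             # signed >=
    "LT": lambda n, z, c, v: n != v,             # signed <
    "GT": lambda n, z, c, v: z == 0 and n == v,  # signed >
    "LE": lambda n, z, c, v: z == 1 or n != v,   # signed <=
    "AL": lambda n, z, c, v: True,               # always
}


def condition_passed(cond, n, z, c, v):
    """Return True if the flags satisfy the condition mnemonic."""
    return CONDITIONS[cond](n, z, c, v)


def add_if(cond, flags, regs, rd, rn, rm):
    """Model ADD<cond> rd, rn, rm: acts as a NOP when the condition fails."""
    if condition_passed(cond, *flags):
        regs[rd] = (regs[rn] + regs[rm]) & 0xFFFFFFFF
    return regs
```

With flags nZCv, `add_if("EQ", ...)` performs the addition while `add_if("NE", ...)` leaves the registers untouched.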
26ARM Instruction Format
27Condition Field
28Data Processing Instructions
- Move instructions
- Arithmetic instructions
- Logical instructions
- Comparison instructions
- Multiply instructions
29Execution Unit in ARM
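[Figure: execution unit. Rn goes to the ALU with no pre-processing; Rm goes through the barrel shifter (pre-processing) to produce N; the ALU takes Rn and N and writes the result to Rd]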
30Move Instructions
Syntax: <instruction>{<cond>}{S} Rd, N
MOV   Move a 32-bit value into a register                 Rd = N
MVN   Move the NOT of the 32-bit value into a register    Rd = ~N
31Move Instructions MOV
- MOV loads a value into the destination register (Rd) from another register, a shifted register, or an immediate value
- Useful for setting initial values and transferring data between registers
- It updates the carry flag (C), negative flag (N), and zero flag (Z) if the S bit is set
- C is set from the result of the barrel shifter
MOV R0, R0            ; move R0 to R0, thus no effect
MOV R0, R0, LSL #3    ; R0 = R0 * 8
MOV PC, R14           ; (R14: link register) used to return to caller
MOVS PC, R14          ; PC <- R14 (lr), CPSR <- SPSR
                      ; used to return from interrupt or exception
SBZ: should be zeros
32MOV Example
Before: cpsr = nzcv, r0 = 0x0000_0000, r1 = 0x8000_0004
MOVS r0, r1, LSL #1
After:  cpsr = nzCv, r0 = 0x0000_0008, r1 = 0x8000_0004
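To see where the C flag in this example comes from, the shift-and-move can be modelled in Python. This is a minimal sketch under the assumption of an immediate shift amount between 0 and 31; `lsl_with_carry` and `movs_lsl` are hypothetical helper names, not ARM or library functions.

```python
MASK32 = 0xFFFFFFFF


def lsl_with_carry(value, shift, carry_in=0):
    """Logical shift left by an immediate 0..31; returns (result, carry_out).

    The carry out is the last bit shifted out of bit 31. A shift of 0
    leaves the incoming carry unchanged.
    """
    if shift == 0:
        return value & MASK32, carry_in
    carry_out = (value >> (32 - shift)) & 1
    return (value << shift) & MASK32, carry_out


def movs_lsl(rm, shift, carry_in=0):
    """Model MOVS Rd, Rm, LSL #shift: returns (Rd, flags)."""
    result, carry = lsl_with_carry(rm, shift, carry_in)
    flags = {"N": result >> 31, "Z": int(result == 0), "C": carry}
    return result, flags
```

For r1 = 0x8000_0004 and a shift of 1, bit 31 is shifted out into C, which gives r0 = 0x0000_0008 with C set and N, Z clear.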
33Rm with Barrel Shifter
[Figure: instruction encoding; the shift applied to Rm in MOVS r0, r1, LSL #1 is encoded in the instruction]
Shift operation (for Rm)                 Syntax
Immediate                                #immediate
Register                                 Rm
Logical shift left by immediate          Rm, LSL #shift_imm
Logical shift left by register           Rm, LSL Rs
Logical shift right by immediate         Rm, LSR #shift_imm
Logical shift right by register          Rm, LSR Rs
Arithmetic shift right by immediate      Rm, ASR #shift_imm
Arithmetic shift right by register       Rm, ASR Rs
Rotate right by immediate                Rm, ROR #shift_imm
Rotate right by register                 Rm, ROR Rs
Rotate right with extend                 Rm, RRX
LSL: Logical Shift Left
LSR: Logical Shift Right
ASR: Arithmetic Shift Right
ROR: Rotate Right
RRX: Rotate Right with Extend
34Arithmetic Instructions
Syntax: <instruction>{<cond>}{S} Rd, Rn, N
ADC   add two 32-bit values with carry                    Rd = Rn + N + carry
ADD   add two 32-bit values                               Rd = Rn + N
RSB   reverse subtract of two 32-bit values               Rd = N - Rn
RSC   reverse subtract of two 32-bit values with carry    Rd = N - Rn - !C
SBC   subtract two 32-bit values with carry               Rd = Rn - N - !C
SUB   subtract two 32-bit values                          Rd = Rn - N
35Arithmetic Instructions ADD
- ADD adds two operands, placing the result in Rd
- Use the S suffix to update the condition flags
- The addition may be performed on signed or unsigned numbers
ADD R0, R1, R2            ; R0 = R1 + R2
ADD R0, R1, #256          ; R0 = R1 + 256
ADDS R0, R2, R3, LSL #1   ; R0 = R2 + (R3 << 1) and update flags
36Arithmetic Instructions ADC
- ADC adds two operands with a carry bit, placing the result in Rd
- It uses a carry bit, so it can add numbers larger than 32 bits
- Use the S suffix to update the condition flags
<64-bit addition>
64-bit 1st operand: R4 and R5
64-bit 2nd operand: R8 and R9
64-bit result: R0 and R1
ADDS R0, R4, R8    ; R0 = R4 + R8 and set carry accordingly
ADCS R1, R5, R9    ; R1 = R5 + R9 + (Carry flag)
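The ADDS/ADCS pair can be checked with a short Python model. This is a sketch, assuming (as the instruction order suggests) that R4, R8 and R0 hold the low words and R5, R9 and R1 the high words; `adds`, `adcs` and `add64` are hypothetical names.

```python
MASK32 = 0xFFFFFFFF


def adds(a, b):
    """Model ADDS: 32-bit add, returns (result, carry_out)."""
    total = a + b
    return total & MASK32, total >> 32


def adcs(a, b, carry_in):
    """Model ADCS: 32-bit add with carry in, returns (result, carry_out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32


def add64(r4, r5, r8, r9):
    """64-bit addition from two 32-bit halves; returns (r0, r1) = (low, high)."""
    r0, carry = adds(r4, r8)      # ADDS R0, R4, R8
    r1, _ = adcs(r5, r9, carry)   # ADCS R1, R5, R9
    return r0, r1
```

For example, adding 1 to 0x0000_0000_FFFF_FFFF carries out of the low word, so the result is low = 0 and high = 1.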
37Arithmetic Instructions SUB
- SUB subtracts operand 2 from operand 1, placing the result in Rd
- Use the S suffix to update the condition flags
- The subtraction may be performed on signed or unsigned numbers
SUB R0, R1, R2            ; R0 = R1 - R2
SUB R0, R1, #256          ; R0 = R1 - 256
SUBS R0, R2, R3, LSL #1   ; R0 = R2 - (R3 << 1) and update flags
38Arithmetic Instructions SBC
- SBC subtracts operand 2 from operand 1 with the carry flag, placing the result in Rd
- It uses a carry bit, so it can subtract numbers larger than 32 bits
- Use the S suffix to update the condition flags
- <64-bit subtraction>
- 64-bit 1st operand: R4 and R5
- 64-bit 2nd operand: R8 and R9
- 64-bit result: R0 and R1
- SUBS R0, R4, R8    ; R0 = R4 - R8
- SBC R1, R5, R9     ; R1 = R5 - R9 - !(carry flag)
39Examples
Example 1
Before: r0 = 0x0000_0000, r1 = 0x0000_0002, r2 = 0x0000_0001
SUB r0, r1, r2
After:  r0 = 0x0000_0001, r1 = 0x0000_0002, r2 = 0x0000_0001
Example 2
After:  r0 = 0xFFFF_FF89, r1 = 0x0000_0077
Example 3
After:  r0 = 0x0000_000F, r1 = 0x0000_0005
40Examples
Before: cpsr = nzcv, r1 = 0x0000_0001
SUBS r1, r1, #1
After:  cpsr = nZCv, r1 = 0x0000_0000
- Why is the C flag set (C = 1)?
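One way to approach the question is to model the subtraction the way the adder performs it: Rn plus the bitwise NOT of the operand plus 1. The sketch below assumes this model, in which C is the carry out of that addition, so C = 1 means no borrow occurred. `subs` is a hypothetical helper name.

```python
MASK32 = 0xFFFFFFFF


def subs(rn, op2):
    """Model SUBS Rd, Rn, op2 as Rn + NOT(op2) + 1; returns (Rd, flags)."""
    total = rn + ((~op2) & MASK32) + 1
    result = total & MASK32
    flags = {
        "N": result >> 31,
        "Z": int(result == 0),
        "C": total >> 32,  # carry out of the adder: 1 means no borrow
        # Signed overflow: operands differ in sign and the result's sign
        # differs from Rn's sign.
        "V": (((rn ^ op2) & (rn ^ result)) >> 31) & 1,
    }
    return result, flags
```

For 1 - 1 the adder computes 1 + 0xFFFF_FFFE + 1 = 0x1_0000_0000: the 32-bit result is zero (Z = 1) and there is a carry out of bit 31 (C = 1). For 0 - 1 there is no carry out, so C = 0 signals a borrow.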
41Logical Instructions
Syntax: <instruction>{<cond>}{S} Rd, Rn, N
AND   logical bitwise AND of two 32-bit values     Rd = Rn & N
ORR   logical bitwise OR of two 32-bit values      Rd = Rn | N
EOR   logical exclusive OR of two 32-bit values    Rd = Rn ^ N
BIC   logical bit clear (AND NOT)                  Rd = Rn & ~N
42Logical Instructions AND
- AND performs a logical AND between the two operands, placing the result in Rd
- It is useful for masking bits
AND R0, R0, #3    ; Keep bits zero and one of R0 and discard the rest
43Logical Instructions EOR
- EOR performs a logical Exclusive OR between the two operands, placing the result in the destination register
- It is useful for inverting certain bits
EOR R0, R0, #3    ; Invert bits zero and one of R0
44Examples
Example 1 (ORR)
Before: r0 = 0x0000_0000, r1 = 0x0204_0608, r2 = 0x1030_5070
ORR r0, r1, r2
After:  r0 = 0x1234_5678
Example 2 (BIC)
Before: r1 = 0b1111, r2 = 0b0101
BIC r0, r1, r2
After:  r0 = 0b1010
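The two results can be reproduced with ordinary bitwise operators. The functions below are hypothetical one-line models of the four logical instructions on 32-bit values, written only to check the arithmetic.

```python
MASK32 = 0xFFFFFFFF


def and_(rn, n):
    """AND: Rd = Rn & N (keep only the bits set in N)."""
    return rn & n & MASK32


def orr(rn, n):
    """ORR: Rd = Rn | N."""
    return (rn | n) & MASK32


def eor(rn, n):
    """EOR: Rd = Rn ^ N (invert the bits set in N)."""
    return (rn ^ n) & MASK32


def bic(rn, n):
    """BIC: Rd = Rn & ~N (clear the bits set in N)."""
    return rn & ~n & MASK32
```

`orr(0x02040608, 0x10305070)` gives 0x12345678 and `bic(0b1111, 0b0101)` gives 0b1010, matching the After values.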
45Comparison Instructions
- The comparison instructions update the cpsr flags according to the result, but do not affect other registers
- After the bits have been set, the information can be used to change program flow by using conditional execution
Syntax: <instruction>{<cond>} Rn, N
CMN   compare negated                              Flags set as a result of Rn + N
CMP   compare                                      Flags set as a result of Rn - N
TEQ   test for equality of two 32-bit values       Flags set as a result of Rn ^ N
TST   test bits of a 32-bit value                  Flags set as a result of Rn & N
46Comparison Instructions CMP
- CMP compares two values by subtracting the second operand from the first operand
- Note that there is no destination register
- It only updates the cpsr flags based on the execution result
CMP R0, R1
47Comparison Instructions CMN
- CMN compares one value with the 2's complement of a second value
- It performs a comparison by adding the 2nd operand to the first operand
- It is equivalent to subtracting the negative of the 2nd operand from the 1st operand
- Note that there is no destination register
- It only updates the cpsr flags based on the execution result
CMN R0, R1
48Comparison Instructions TST
- TST tests bits of two 32-bit values by logically ANDing the two operands
- Note that there is no destination register
- It only updates the cpsr flags based on the execution result
- TEQ sets flags by logically exclusive ORing the two operands
49Examples
Before: cpsr = nzcv, r0 = 4, r9 = 4
CMP r0, r9
After:  cpsr = nZCv, r0 = 4, r9 = 4
50Branch Instructions
- A branch instruction changes the flow of execution or is used to call a routine
- This type of instruction allows programs to have subroutines, if-then-else structures, and loops
Syntax: B{<cond>} label
        BL{<cond>} label
B    branch              pc = label
BL   branch with link    pc = label, lr = address of the next instruction after the BL
51B, BL
- B (branch) and BL (branch with link) are used for conditional or unconditional branches
- BL is used for subroutine (procedure, function) calls
- To return from a subroutine, use
- MOV PC, R14 (R14: link register), used to return to caller
- Branch target address
- Sign-extend the 24-bit signed immediate (2's complement) to 30 bits
- Left-shift the result by 2 bits
- Add it to the current PC (actually, PC+8)
- Thus, the branch target can be up to ±32MB away from the current instruction
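The three target-address steps above can be sketched in Python. `branch_target` is a hypothetical helper, and the addresses used with it are made-up values for illustration.

```python
def branch_target(instr_addr, imm24):
    """Compute the target of a B/BL located at instr_addr.

    imm24 is the raw 24-bit immediate field from the instruction.
    """
    # Step 1: sign-extend the 24-bit two's complement immediate.
    if imm24 & 0x800000:
        imm24 -= 1 << 24
    # Step 2: shift left by 2 bits (instructions are word aligned).
    offset = imm24 << 2
    # Step 3: add to the PC, which reads as the branch address + 8.
    return (instr_addr + 8 + offset) & 0xFFFFFFFF
```

An immediate of 0xFFFFFE (that is, -2) gives an offset of -8, so the branch targets itself. The largest positive immediate, 0x7FFFFF, gives an offset of 0x1FF_FFFC, just under 32MB.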
52Examples
        B forward
        ADD r1, r2, #4
        ADD r0, r6, #2
        ADD r3, r7, #4
forward
        SUB r1, r2, #4
53Memory Access Instructions
- Load-Store (memory access) instructions transfer data between memory and CPU registers
- Single-register transfer
- Multiple-register transfer
- Swap instruction
54Single-Register Transfer
LDR     Load a word into a register                     Rd <- mem32[address]
STR     Store a word from a register to memory          Rd -> mem32[address]
LDRB    Load a byte into a register                     Rd <- mem8[address]
STRB    Store a byte from a register to memory          Rd -> mem8[address]
LDRH    Load a half-word into a register                Rd <- mem16[address]
STRH    Store a half-word from a register to memory     Rd -> mem16[address]
LDRSB   Load a signed byte into a register              Rd <- SignExtend(mem8[address])
LDRSH   Load a signed half-word into a register         Rd <- SignExtend(mem16[address])
55LDR (Load Register)
- LDR loads a word from a memory location into a register
- The memory location is specified in a very flexible manner with an addressing mode
// Assume R1 = 0x0000_2000
LDR R0, [R1]         // R0 <- [R1]
LDR R0, [R1, #16]    // R0 <- [R1+16] (address 0x0000_2010)
56STR (Store Register)
- STR stores a word from a register to a memory location
- The memory location is specified in a very flexible manner with an addressing mode
// Assume R1 = 0x0000_2000
STR R0, [R1]         // [R1] <- R0
STR R0, [R1, #16]    // [R1+16] <- R0
57Load-Store Addressing Mode
Indexing method           Data                  Base address register updated?   Example
Preindex with writeback   mem[base + offset]    Yes (base + offset)              LDR r0, [r1, #4]!
Preindex                  mem[base + offset]    No                               LDR r0, [r1, #4]
Postindex                 mem[base]             Yes (base + offset)              LDR r0, [r1], #4
! indicates that the instruction writes the calculated address back to the base address register
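The difference between the three indexing methods shows up in which address is read and whether the base register changes. The sketch below models memory as a Python dict from address to word. The function name `ldr`, the mode strings and the sample memory contents are all hypothetical.

```python
def ldr(mem, regs, rd, rn, offset, mode):
    """Model LDR with one of three indexing methods (illustration only).

    mode is "preindex", "preindex_wb" (writeback) or "postindex".
    """
    base = regs[rn]
    if mode == "preindex":        # LDR rd, [rn, #offset]
        regs[rd] = mem[base + offset]
    elif mode == "preindex_wb":   # LDR rd, [rn, #offset]!
        regs[rd] = mem[base + offset]
        regs[rn] = base + offset
    elif mode == "postindex":     # LDR rd, [rn], #offset
        regs[rd] = mem[base]
        regs[rn] = base + offset
    else:
        raise ValueError("unknown indexing mode: " + mode)
    return regs


# Made-up memory contents for illustration.
mem = {0x9000: 0x01010101, 0x9004: 0x02020202}
```

Starting from r1 = 0x9000 each time, preindex reads 0x9004 and leaves r1 alone, preindex with writeback reads 0x9004 and sets r1 = 0x9004, and postindex reads 0x9000 and then sets r1 = 0x9004.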
58Multiple Register Transfer LDM, STM
Syntax: <LDM|STM>{<cond>}<addressing mode> Rn{!}, <registers>
LDM   Load multiple registers
STM   Store multiple registers
Addressing mode   Description        Start address    End address     Rn!
IA                Increment After    Rn               Rn + 4*N - 4    Rn + 4*N
IB                Increment Before   Rn + 4           Rn + 4*N        Rn + 4*N
DA                Decrement After    Rn - 4*N + 4     Rn              Rn - 4*N
DB                Decrement Before   Rn - 4*N         Rn - 4          Rn - 4*N
(N is the number of registers in the list)
59Multiple Register Transfer LDM, STM
- LDM (Load Multiple) loads general-purpose
registers from sequential memory locations
- STM (Store Multiple) stores general-purpose
registers to sequential memory locations
60LDM, STM - Multiple Data Transfer
- In a multiple data transfer, the register list is given in curly brackets {}
- It doesn't matter in which order you specify the registers
- They are stored from lowest to highest
- A useful shorthand is "-"
- It specifies the beginning and end of a range of registers
STMFD R13!, {R0, R1}     // R13 is updated
LDMFD R13!, {R1, R0}     // R13 is updated
STMFD R13!, {R0-R12}     // R13 is updated appropriately
LDMFD R13!, {R0-R12}     // R13 is updated appropriately
61Examples
LDMIA r0!, {r1-r3}
Before:
mem32[0x80018] = 0x3
mem32[0x80014] = 0x2
mem32[0x80010] = 0x1
r0 = 0x0008_0010, r1 = 0x0000_0000, r2 = 0x0000_0000, r3 = 0x0000_0000
After:
mem32[0x80018] = 0x3
mem32[0x80014] = 0x2
mem32[0x80010] = 0x1
r0 = 0x0008_001C, r1 = 0x0000_0001, r2 = 0x0000_0002, r3 = 0x0000_0003
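The same transfer can be traced with a small Python model of the Increment After mode. `ldmia` is a hypothetical helper. Registers are identified by number and memory is a dict from address to word.

```python
def ldmia(mem, regs, rn, reg_list, writeback=True):
    """Model LDMIA Rn{!}, {reg_list}: load registers from ascending addresses.

    The lowest-numbered register is loaded from the lowest address,
    whatever order the list was written in.
    """
    addr = regs[rn]
    for r in sorted(reg_list):
        regs[r] = mem[addr]
        addr += 4
    if writeback:
        regs[rn] = addr  # Rn + 4*N
    return regs


mem = {0x80010: 0x1, 0x80014: 0x2, 0x80018: 0x3}
regs = ldmia(mem, {0: 0x80010, 1: 0, 2: 0, 3: 0}, rn=0, reg_list=[1, 2, 3])
```

After the call, `regs` holds r0 = 0x8001C and r1, r2, r3 = 1, 2, 3, matching the After state. With N = 3 registers the base advances by 4 x 3 = 12 bytes.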
62Stack Operation
- Multiple data transfer instructions (LDM and STM)
are used to load and store multiple words of data
from/to main memory
Stack Other Description
STMFA STMIB Pre-incremental store
STMEA STMIA Post-incremental store
STMFD STMDB Pre-decremental store
STMED STMDA Post-decremental store
LDMED LDMIB Pre-incremental load
LDMFD LDMIA Post-incremental load
LDMEA LDMDB Pre-decremental load
LDMFA LDMDA Post-decremental load
- IA Increment After
- IB Increment Before
- DA Decrement After
- DB Decrement Before
- FA Full Ascending (in stack)
- FD Full Descending (in stack)
- EA Empty Ascending (in stack)
- ED Empty Descending (in stack)
63SWAP Instruction
Syntax: SWP{B}{<cond>} Rd, Rm, [Rn]
SWP    Swap a word between memory and a register    tmp = mem32[Rn]; mem32[Rn] = Rm; Rd = tmp
SWPB   Swap a byte between memory and a register    tmp = mem8[Rn]; mem8[Rn] = Rm; Rd = tmp
64SWAP Instruction
- SWP swaps the contents of memory with the contents of a register
- It is a special case of a load-store instruction
- It performs the swap atomically, meaning that it does not release the bus until it is done with the read and the write
- It is useful for implementing semaphores and mutual exclusion (mutex) in an OS
SWP r0, r1, [r2]
After: mem32[0x9000] = 0x1111_2222, r0 = 0x1234_5678, r1 = 0x1111_2222, r2 = 0x0000_9000
65Semaphore Example
spin
        LDR r1, =semaphore    // r1 has the address of the semaphore
        MOV r2, #1
        SWP r3, r2, [r1]
        CMP r3, #1
        BEQ spin
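The spin loop can be mirrored in Python to show the logic: keep swapping a 1 into the semaphore until the value that comes back is 0. This is only a sketch. The `threading.Lock` stands in for the bus being held during the swap, and `swp`, `acquire`, `release` and the address 0x9000 are hypothetical.

```python
import threading

_bus_lock = threading.Lock()  # stands in for the bus held during SWP


def swp(mem, addr, new_value):
    """Model SWP: atomically write new_value and return the old contents."""
    with _bus_lock:
        old = mem[addr]
        mem[addr] = new_value
    return old


def acquire(mem, addr):
    """Spin until the semaphore is obtained (old value was 0)."""
    while swp(mem, addr, 1) == 1:  # SWP r3, r2, [r1]; CMP r3, #1; BEQ spin
        pass


def release(mem, addr):
    """Free the semaphore by storing 0."""
    mem[addr] = 0


mem = {0x9000: 0}    # semaphore initially free
acquire(mem, 0x9000)
```

Because the read and the write happen as one step, two contenders can never both read 0 and both believe they own the semaphore.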
66Miscellaneous but Important Instructions
- Software interrupt instruction
- Program status register instructions
67SWI (Software Interrupt)
- The SWI instruction incurs a software interrupt
- It is used by operating systems for system calls
- The 24-bit immediate value is ignored by the ARM processor, but can be used by the SWI exception handler in an operating system to determine which operating system service is being requested
Syntax: SWI{<cond>} SWI_number
SWI   Software interrupt   lr_svc (r14) = address of instruction following SWI
                           pc = 0x8
                           cpsr mode = SVC
                           cpsr I bit = 1 (it masks interrupts)
- To return from the software interrupt, use
- MOVS PC, R14    ; PC <- R14 (lr), CPSR <- SPSR
68Example
0x0000_8000    SWI 0x123456
After: cpsr = nzcVqIft_SVC, spsr = nzcVqift_USER, pc = 0x0000_0008, lr = 0x0000_8004, r0 = 0x12
69Program status register instructions
Syntax: MRS{<cond>} Rd, <cpsr|spsr>
        MSR{<cond>} <cpsr|spsr>_<fields>, Rm
        MSR{<cond>} <cpsr|spsr>_<fields>, #immediate
MRS   Copy program status register to a general-purpose register    Rd = psr
MSR   Copy a general-purpose register to a program status register  psr[field] = Rm
MSR   Copy an immediate value to a program status register          psr[field] = immediate
fields can be any combination of control (c), extension (x), status (s), and flags (f)
Control [7:0] (I, F, T, Mode)
eXtension [15:8]
Status [23:16]
Flags [31:24] (N, Z, C, V)
70MSR MRS
- MSR: Move the value of a general-purpose register or an immediate constant to the CPSR or the SPSR of the current mode
- MRS: Move the value of the CPSR or the SPSR of the current mode into a general-purpose register
MSR CPSR_all, R0      ; Copy R0 into CPSR
MSR SPSR_all, R0      ; Copy R0 into SPSR
MRS R0, CPSR_all      ; Copy CPSR into R0
MRS R0, SPSR_all      ; Copy SPSR into R0
- To change the operating mode, use the following code
// Change to the supervisor mode
MRS R0, CPSR          ; Read CPSR
BIC R0, R0, #0x1F     ; Remove current mode with bit clear instruction
ORR R0, R0, #0x13     ; Substitute to the Supervisor mode
MSR CPSR_c, R0        ; Write the result back to CPSR
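The read-modify-write on the mode bits can be checked with plain integer arithmetic. The sketch below mirrors the BIC and ORR steps. `switch_to_supervisor` and the constant names are hypothetical, and the sample CPSR values are made up.

```python
MODE_MASK = 0x1F  # CPSR[4:0] holds the processor mode
MODE_SVC = 0x13   # supervisor mode encoding


def switch_to_supervisor(cpsr):
    """Return the CPSR value with the mode field replaced by SVC."""
    cpsr = cpsr & ~MODE_MASK & 0xFFFFFFFF  # BIC R0, R0, #0x1F
    return cpsr | MODE_SVC                 # ORR R0, R0, #0x13
```

Only the low five bits change, so a value such as 0x6000_0010 becomes 0x6000_0013 and the flag bits at the top are preserved.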
71(Assembly) Language
- There is no golden way to learn a language
- You have to use it and practice to get used to it
73Overflow/Underflow
- Overflow/Underflow
- The answer to an addition or subtraction exceeds the magnitude that can be represented with the allocated number of bits
- Overflow/underflow is a problem in computers because the number of bits to hold a number is fixed
- For this reason, computers detect and flag the occurrence of an overflow/underflow
- Detection of an overflow/underflow after the addition of two binary numbers depends on whether the numbers are considered to be signed or unsigned
74Overflow/Underflow in Unsigned Numbers
- When two unsigned numbers are added, an overflow is detected from the end carry out of the most significant position
- If the end carry is 1, there is an overflow
- When two unsigned numbers are subtracted, an underflow is detected when the end carry is 0
75Subtraction of Unsigned Numbers
- An unsigned number is either positive or zero
- There is no sign bit
- So, n bits can represent numbers from 0 to 2^n - 1
- For example, 4 bits can represent 0 to 15 (= 2^4 - 1)
- To declare an unsigned number in C language,
- unsigned int a;
- x86 allocates 32 bits for unsigned int
- Subtraction of unsigned integers (M, N)
- M - N in binary can be done as follows
- M + (2^n - N) = M - N + 2^n
- If M >= N, the sum produces an end carry, which is 2^n
- The subtraction result is zero or a positive number
- If M < N, the sum does not produce an end carry, since it is equal to 2^n - (N - M)
- Unsigned underflow: the subtraction result is negative, and an unsigned number can't represent negative numbers
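The rule M + (2^n - N) can be tried out directly. The sketch below uses 4-bit numbers by default. `unsigned_sub` is a hypothetical helper that returns the n-bit result, the end carry, and whether an underflow occurred.

```python
def unsigned_sub(m, n, bits=4):
    """Subtract n from m by adding the 2's complement of n.

    Returns (result, end_carry, underflow) for bits-wide unsigned numbers.
    """
    modulus = 1 << bits           # 2^n
    total = m + (modulus - n)     # M + (2^n - N) = M - N + 2^n
    end_carry = total >> bits     # 1 when M >= N, 0 when M < N
    result = total % modulus      # keep only the low n bits
    return result, end_carry, end_carry == 0
```

For 7 - 5 the sum is 7 + 11 = 18 = 16 + 2, so there is an end carry and the result is 2. For 5 - 7 the sum is 5 + 9 = 14 = 16 - 2, so there is no end carry and the underflow is flagged.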
76Overflow/Underflow in Signed Numbers
- With signed numbers, an overflow/underflow can't occur for an addition if one number is positive and the other is negative
- Adding a positive number to a negative number produces a result whose magnitude is equal to or smaller than the larger of the original numbers
- An overflow may occur in addition if both numbers are positive
- When x and y both have sign bits of 0 (positive numbers)
- If the sum has a sign bit of 1, there is an overflow
- An underflow may occur in addition if both numbers are negative
- When x and y both have sign bits of 1 (negative numbers)
- If the sum has a sign bit of 0, there is an underflow
77Overflow/Underflow in Signed Numbers
8-bit signed number addition
  10000001  (-127)
+ 11111010  (  -6)
--------------------
            (-133)
8-bit signed number addition
  01001000  (72)
+ 00111001  (57)
--------------------
            (129)
What is the largest positive number that can be represented by 8 bits?
What is the smallest negative number that can be represented by 8 bits?
Slide from H.H.Lee, Georgia Tech
78Overflow/Underflow in Signed Numbers
- So, we can detect overflow/underflow with the following logic
- Suppose that we add two k-bit numbers
- $x_{k-1}x_{k-2} \ldots x_0 + y_{k-1}y_{k-2} \ldots y_0 = s_{k-1}s_{k-2} \ldots s_0$
- $\text{Overflow} = x_{k-1}\, y_{k-1}\, \overline{s_{k-1}} + \overline{x_{k-1}}\, \overline{y_{k-1}}\, s_{k-1}$
- There is an easier formula
- Let the carries out of the full adders adding the two numbers be $c_{k-1}c_{k-2} \ldots c_0$
- $\text{Overflow} = c_{k-1} \oplus c_{k-2}$
- If a 0 ($c_{k-2}$) is carried in, the only way that a 1 ($c_{k-1}$) can be carried out is if $x_{k-1} = 1$ and $y_{k-1} = 1$
- Adding two negative numbers results in a non-negative number
- If a 1 ($c_{k-2}$) is carried in, the only way that a 0 ($c_{k-1}$) can be carried out is if $x_{k-1} = 0$ and $y_{k-1} = 0$
- Adding two positive numbers results in a negative number
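Both detection rules can be compared on concrete numbers. The sketch below adds two k-bit values and evaluates the sign-bit formula and the carry formula side by side. `add_signed` is a hypothetical helper, and the carry into the top bit is computed by adding the lower k-1 bits separately.

```python
def add_signed(x, y, k=8):
    """Add two k-bit two's complement bit patterns.

    Returns (sum_bits, overflow_from_signs, overflow_from_carries).
    """
    mask = (1 << k) - 1
    x &= mask
    y &= mask
    low_mask = mask >> 1
    # c_{k-2}: carry out of bit k-2, i.e. the carry into the sign bit.
    carry_in = ((x & low_mask) + (y & low_mask)) >> (k - 1)
    total = x + y
    carry_out = total >> k   # c_{k-1}: carry out of the sign bit
    s = total & mask
    xs, ys, ss = x >> (k - 1), y >> (k - 1), s >> (k - 1)
    # Sign rule: both operands negative with a non-negative sum, or both
    # operands non-negative with a negative sum.
    by_signs = (xs & ys & (ss ^ 1)) | ((xs ^ 1) & (ys ^ 1) & ss)
    by_carries = carry_out ^ carry_in
    return s, by_signs, by_carries
```

For (-127) + (-6) the 8-bit sum is 01111011 with a carry out but no carry into the sign bit, and for 72 + 57 the sum is 10000001 with a carry into the sign bit but no carry out. Both rules report an overflow in each case.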
79Overflow/Underflow Detection in Signed Numbers
Slide from H.H.Lee, Georgia Tech
80Recap
- Unsigned numbers
- Overflow could occur when 2 unsigned numbers are added
- An end carry of 1 indicates an overflow
- Underflow could occur when 2 unsigned numbers are subtracted
- An end carry of 0 indicates an underflow
- minuend < subtrahend
- Signed numbers
- Overflow could occur when 2 signed positive numbers are added
- Underflow could occur when 2 signed negative numbers are added
- The overflow flag indicates both overflow and underflow
81Recap
- Binary numbers in the 2's complement system are added and subtracted by the same basic addition and subtraction rules as used for unsigned numbers
- Therefore, computers need only one common hardware circuit to handle both types (signed, unsigned numbers) of arithmetic
- The programmer must interpret the results of addition or subtraction differently, depending on whether the numbers are assumed to be signed or unsigned
82ARM Flags
- In general, a computer has several flags (registers) to indicate the state of operations such as addition and subtraction
- N: Negative
- Z: Zero
- C: Carry
- V: Overflow
- We have only one adder inside a computer
- The CPU compares signed or unsigned numbers by subtraction using the adder
- The CPU sets the flags depending on the result of the operation
- Do these flags provide enough information to judge whether one number is bigger or smaller than the other?
83ARM Flags (Cont)