Title: Embedded Hardware Foundation
1Embedded Hardware Foundation
2Content
- CPU
- Bus
- Memory
- I/O
- Design, develop and debug
31. CPU
- I/O programming
- Busy/wait
- Interrupt-driven
- Supervisor mode, exceptions, traps
- Co-processor
- Memory System
- Cache
- Memory management
- Performance and power consumption
4I/O devices
- Usually includes some non-digital component.
- Typical digital interface to CPU:
[Figure: the CPU connects to the device's status register and data register, which sit in front of the device mechanism.]
5Application 8251 UART
- Universal asynchronous receiver transmitter (UART) provides serial communication.
- 8251 functions are integrated into standard PC interface chip.
- Allows many communication parameters to be programmed.
68251 CPU interface
[Figure: the CPU connects to the 8251's status register (8 bit) and xmit/rcv data register (8 bit); the 8251 drives the serial port.]
7Programming I/O
- Two types of instructions can support I/O
- special-purpose I/O instructions
- memory-mapped load/store instructions.
- Intel x86 provides in, out instructions. Most other CPUs use memory-mapped I/O.
- I/O instructions do not preclude memory-mapped I/O.
8ARM memory-mapped I/O
- Define location for device:
- DEV1 EQU 0x1000
- Read/write code:
- LDR r1,=DEV1 ; set up device address
- LDR r0,[r1] ; read DEV1
- MOV r0,#8 ; set up value to write
- STR r0,[r1] ; write value to device
9peek and poke (Using C)
- int peek(volatile char *location) {
-   return *location; /* read the device register */
- }
- void poke(volatile char *location, char newval) {
-   (*location) = newval; /* write the device register */
- }
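The peek and poke routines above can be tried without hardware. The following is a minimal sketch, assuming a hypothetical `sim_regs` array that stands in for a device's register space; `SIM_STATUS` and `SIM_DATA` are illustrative names, and on a real target they would be fixed bus addresses.

```c
#include <stdint.h>

/* Hypothetical stand-in for a device's register space; on real hardware
   these would be fixed bus addresses rather than an array. */
static volatile uint8_t sim_regs[2];

#define SIM_STATUS (&sim_regs[0])   /* assumed status register */
#define SIM_DATA   (&sim_regs[1])   /* assumed data register */

/* Read one byte from a memory-mapped location. */
static int peek(volatile uint8_t *location) {
    return *location;
}

/* Write one byte to a memory-mapped location. */
static void poke(volatile uint8_t *location, uint8_t newval) {
    *location = newval;
}
```

The `volatile` qualifier keeps the compiler from caching or removing accesses that must actually reach the device.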
10Busy/wait output
- Simplest way to program device.
- Use instructions to test when device is ready.
- char *mystring = "hello, world.";
- char *current_char;
- current_char = mystring;
- while (*current_char != '\0') {
-   while (peek(OUT_STATUS) != 0); /* wait until device is ready */
-   poke(OUT_CHAR, *current_char); /* send the character */
-   current_char++;
- }
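To see how much CPU time busy/wait spends polling, the loop can be run against a simulated device. This is a sketch under stated assumptions: `read_out_status` and `write_out_char` are hypothetical stand-ins for the status and data registers, and the device is assumed to stay busy for three polls after each character.

```c
#include <stddef.h>

/* Hypothetical simulated output device: status is nonzero while busy. */
static int    sim_busy_polls = 0;   /* polls left until the device is ready */
static char   sim_output[64];       /* characters the device has accepted */
static size_t sim_count = 0;

/* Read the status register; each poll lets the simulated device progress. */
static int read_out_status(void) {
    if (sim_busy_polls > 0) {
        sim_busy_polls--;
        return 1;                   /* still busy */
    }
    return 0;                       /* ready */
}

/* Write the data register; the device then stays busy for 3 polls. */
static void write_out_char(char c) {
    if (sim_count < sizeof sim_output - 1) {
        sim_output[sim_count++] = c;
        sim_output[sim_count] = '\0';
    }
    sim_busy_polls = 3;
}

/* Busy/wait output: spin until the device is ready, then send one character.
   Returns the number of polls that found the device busy. */
static unsigned long busy_wait_puts(const char *s) {
    unsigned long wasted_polls = 0;
    for (; *s != '\0'; s++) {
        while (read_out_status() != 0)
            wasted_polls++;         /* CPU does no useful work here */
        write_out_char(*s);
    }
    return wasted_polls;
}
```

The returned count grows with every character sent, which is the inefficiency that interrupt-driven I/O avoids.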
11Simultaneous busy/wait input and output
- while (TRUE) {
-   /* read */
-   while (peek(IN_STATUS) != 0);
-   achar = (char)peek(IN_DATA);
-   /* write */
-   while (peek(OUT_STATUS) != 0);
-   poke(OUT_DATA, achar);
- }
12Interrupt I/O
- Busy/wait is very inefficient.
- CPU can't do other work while testing device.
- Hard to do simultaneous I/O.
- Interrupts allow a device to change the flow of control in the CPU.
- Causes subroutine call to handle device.
13Interrupt interface
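[Figure: the device's status register and data register connect to the CPU over the data/address lines; the device mechanism raises intr request, and the CPU, shown with its PC and IR, answers with intr ack.]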
14Interrupt behavior
- Based on subroutine call mechanism.
- Interrupt forces next instruction to be a subroutine call to a predetermined location.
- Return address is saved to resume executing foreground program.
15Interrupt physical interface
- CPU and device are connected by CPU bus.
- CPU and device handshake:
- device asserts interrupt request;
- CPU asserts interrupt acknowledge when it can handle the interrupt.
16Example interrupt-driven input and output
- void input_handler();
- void output_handler();
- int main(void)
- {
-   while (TRUE) {
-     if (gotchar) {
-       while (peek(OUT_STATUS) != 0); /* wait until device is ready */
-       poke(OUT_DATA, achar);
-       gotchar = FALSE;
-     }
-   }
- }
17Example character I/O handlers
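- void input_handler()
- {
-   achar = peek(IN_DATA); /* get the character */
-   gotchar = TRUE; /* signal the foreground program */
-   poke(IN_STATUS, 0); /* reset the device status */
- }
- void output_handler()
- {
- }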
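The hand-off between the input handler and the foreground loop can be sketched on a host machine by calling the handler directly where the hardware would raise an interrupt. The `sim_` variables and `sim_receive` are hypothetical stand-ins for the device, and `foreground_step` is one pass of the main loop rather than the endless loop itself.

```c
#include <stdbool.h>

/* Shared between handler and foreground; volatile because the handler
   changes them asynchronously on a real system. */
static volatile bool gotchar = false;
static volatile char achar;

/* Hypothetical simulated device registers. */
static volatile char sim_in_data;
static volatile int  sim_in_status;
static volatile char sim_out_data;

/* Input handler: latch the character, flag it, and reset the device status. */
static void input_handler(void) {
    achar = sim_in_data;
    gotchar = true;
    sim_in_status = 0;
}

/* Stand-in for the hardware: a character arrives and raises an interrupt. */
static void sim_receive(char c) {
    sim_in_data = c;
    sim_in_status = 1;
    input_handler();        /* on real hardware the CPU vectors here */
}

/* One pass of the foreground loop: echo a character if one is waiting. */
static bool foreground_step(void) {
    if (!gotchar)
        return false;
    sim_out_data = achar;
    gotchar = false;
    return true;
}
```

Because only one character is held in `achar`, a second arrival before the foreground runs would overwrite the first; the buffered version addresses that.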
18Example interrupt I/O with buffers
19Buffer-based input handler
- void input_handler()
- {
-   char achar;
-   if (full_buffer())
-     error = 1;
-   else {
-     achar = peek(IN_DATA);
-     add_char(achar);
-   }
-   if (nchars == 1) { /* buffer was empty: start a new output */
-     poke(OUT_DATA, remove_char());
-     poke(OUT_STATUS, 1);
-   }
- }
20Buffer-based output handler
- void output_handler()
- {
-   if (!empty_buffer()) {
-     poke(OUT_DATA, remove_char()); /* send character */
-     poke(OUT_STATUS, 1); /* turn device on */
-   }
- }
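The handlers above rely on `add_char`, `remove_char`, `full_buffer`, `empty_buffer`, and `nchars`, which the slides do not define. One possible implementation is a circular buffer; the sketch below assumes a capacity of 8 characters and ignores the interrupt masking a real system would need around these updates.

```c
#include <stdbool.h>

#define BUF_SIZE 8                 /* assumed capacity */

static char buf[BUF_SIZE];
static int  head = 0;              /* index of the oldest character */
static int  nchars = 0;            /* number of characters queued */

static bool full_buffer(void)  { return nchars == BUF_SIZE; }
static bool empty_buffer(void) { return nchars == 0; }

/* Append at the tail; the caller checks full_buffer() first. */
static void add_char(char c) {
    buf[(head + nchars) % BUF_SIZE] = c;
    nchars++;
}

/* Remove from the head; the caller checks empty_buffer() first. */
static char remove_char(void) {
    char c = buf[head];
    head = (head + 1) % BUF_SIZE;
    nchars--;
    return c;
}
```

Storing a head index and a count, instead of head and tail indices, makes the full and empty cases distinguishable without wasting a slot.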
21Priorities and vectors
- Two mechanisms allow us to make interrupts more specific:
- Priorities determine what interrupt gets CPU first.
- Vectors determine what code is called for each type of interrupt.
- Mechanisms are orthogonal: most CPUs provide both.
22Prioritized interrupts
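[Figure: device 1, device 2, ..., device n each drive one of the CPU's interrupt request lines L1, L2, ..., Ln and share the interrupt acknowledge line from the CPU.]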
23Interrupt prioritization
- Masking: an interrupt with priority lower than the current priority is not recognized until the pending interrupt is complete.
- Non-maskable interrupt (NMI): highest priority, never masked.
- Often used for power-down.
24Interrupt vectors
- Allow different devices to be handled by different code.
- Interrupt vector table:
[Figure: the interrupt vector table head points to a table whose entries are handler 0, handler 1, handler 2, and handler 3.]
25Interrupt vector acquisition
[Figure: the device sends an interrupt request to the CPU, the CPU returns an interrupt ack., and the device then supplies the vector.]
26Interrupt vector acquisition
[Figure: timing of vector acquisition between CPU and device: receive request, receive ack, receive vector.]
27Interrupt sequence
- CPU checks pending interrupt requests and acknowledges the one of highest priority.
- Device receives acknowledgement and sends vector.
- CPU locates the handler using vector as index of interrupt table and calls the handler.
- Software processes request.
- CPU restores state to foreground program.
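The sequence above combines priorities and vectors, and can be sketched in software as a table of function pointers indexed by vector. This is an illustration only: the four handlers, the `pending` bit mask, and the rule that a lower index means higher priority are all assumptions, and a real CPU performs these steps in hardware.

```c
#define NUM_VECTORS 4

typedef void (*handler_t)(void);

static int last_handled = -1;              /* records which handler ran */
static void handler0(void) { last_handled = 0; }
static void handler1(void) { last_handled = 1; }
static void handler2(void) { last_handled = 2; }
static void handler3(void) { last_handled = 3; }

/* Interrupt vector table: the vector is an index into this array. */
static const handler_t vector_table[NUM_VECTORS] = {
    handler0, handler1, handler2, handler3
};

/* Pending requests as a bit mask; bit i set means device i is requesting.
   Assumption: lower index means higher priority. */
static unsigned pending = 0;

/* Acknowledge the highest-priority pending request and call its handler.
   Returns the vector used, or -1 if nothing was pending. */
static int dispatch_interrupt(void) {
    for (int v = 0; v < NUM_VECTORS; v++) {
        if (pending & (1u << v)) {
            pending &= ~(1u << v);         /* acknowledge the request */
            vector_table[v]();             /* vector selects the handler */
            return v;
        }
    }
    return -1;
}
```

With devices 1 and 3 both requesting, the first call services device 1 and the second services device 3, which shows priority choosing the order and the vector choosing the code.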
28Sources of interrupt overhead
- Handler execution time.
- Interrupt mechanism overhead.
- Register save/restore.
- Pipeline-related penalties.
- Cache-related penalties.
29ARM interrupts
- ARM7 supports two types of interrupts
- Fast interrupt requests (FIQs).
- Interrupt requests (IRQs).
- Interrupt table starts at location 0.
30ARM interrupt procedure
- CPU actions
- Save PC. Copy CPSR to SPSR.
- Force bits in CPSR to record interrupt.
- Force PC to vector.
- Handler responsibilities
- Restore proper PC.
- Restore CPSR from SPSR.
- Clear interrupt disable flags.
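The CPU actions and handler responsibilities listed above can be modelled with a small struct. This is a simplified sketch for an IRQ only: the `cpu_model` type and function names are hypothetical, pipeline offsets on the saved PC are ignored, and the constants are the ARM7 user and IRQ mode encodings, the IRQ disable bit, and the IRQ vector address.

```c
#include <stdint.h>

#define MODE_MASK  0x1Fu
#define MODE_USR   0x10u
#define MODE_IRQ   0x12u
#define I_BIT      (1u << 7)    /* IRQ disable flag in the CPSR */
#define IRQ_VECTOR 0x18u        /* IRQ entry in the table at location 0 */

typedef struct {
    uint32_t pc;
    uint32_t cpsr;
    uint32_t lr_irq;    /* banked link register: saved return address */
    uint32_t spsr_irq;  /* banked saved program status register */
} cpu_model;

/* CPU actions on IRQ entry (simplified: pipeline offsets are ignored). */
static void irq_enter(cpu_model *cpu) {
    cpu->lr_irq = cpu->pc;                          /* save PC */
    cpu->spsr_irq = cpu->cpsr;                      /* copy CPSR to SPSR */
    cpu->cpsr = (cpu->cpsr & ~MODE_MASK) | MODE_IRQ | I_BIT;
    cpu->pc = IRQ_VECTOR;                           /* force PC to vector */
}

/* Handler exit: restore the PC and the CPSR, which re-enables IRQs. */
static void irq_return(cpu_model *cpu) {
    cpu->pc = cpu->lr_irq;
    cpu->cpsr = cpu->spsr_irq;
}
```

Restoring the CPSR from the SPSR is what clears the interrupt disable flag, since the saved copy was taken before the flag was set.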
31Exception and Trap
- Exception:
- internally detected error.
- Exceptions are synchronous with instructions but unpredictable.
- Build exception mechanism on top of interrupt mechanism.
- Exceptions are usually prioritized and vectorized.
- Trap (software interrupt):
- an exception generated by an instruction.
- Call supervisor mode.
32Supervisor mode
- May want to provide protective barriers between programs.
- Avoid memory corruption.
- Need supervisor mode to manage the various programs.
33ARM CPU modes
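Processor mode: description
- User (usr): normal program execution mode
- FIQ (fiq): used for high-speed data transfer and channel handling
- IRQ (irq): used for general-purpose interrupt handling
- Supervisor (svc): a protected mode for the operating system
- Abort (abt): used for virtual memory and memory protection
- Undefined (und): used to support software emulation of hardware co-processors
- System (sys): runs privileged operating system tasks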
34ARM CPU modes (contd)
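- SWI (software interrupt) instruction
- Format:
- SWI{<cond>} <immed_24>
- The SWI instruction generates a software interrupt: the CPU switches from user mode to supervisor mode, the CPSR is saved to the supervisor-mode SPSR, and execution transfers to the vector at 0x08. <immed_24> identifies the requested service.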
35Co-processor
- Co-processor: added function unit that is called by instruction.
- Floating-point units are often structured as co-processors.
- ARM allows up to 16 designer-selected co-processors.
- Floating-point co-processor uses units 1 and 2.
36Memory System
- Cache
- Memory Management Unit
37Cache
- Small amount of fast memory
- Sits between normal main memory and CPU
- May be located on CPU chip or module
38Cache operation - overview
- CPU requests contents of memory location
- Check cache for this data
- If present, get from cache (fast)
- If not present, read required block from main memory to cache
- Then deliver from cache to CPU
- Cache includes tags to identify which block of main memory is in each cache slot
39Cache operation
- Many main memory locations are mapped onto one cache entry.
- May have caches for:
- instructions;
- data;
- data + instructions (unified).
- Memory access time is no longer deterministic.
40Cache organizations
- Direct-mapped: each memory location maps onto exactly one cache entry.
- Fully-associative: any memory location can be stored anywhere in the cache (almost never implemented).
- N-way set-associative: each memory location maps onto one set and can go into any of that set's n entries.
42Example
- Cache of 64kByte
- Cache block of 4 bytes
- i.e. cache is 16k (2^14) lines of 4 bytes
- 16 MBytes main memory
- 24 bit address (2^24 = 16M)
- 2^22 blocks, so 2^8 blocks will be mapped into one cache line on the average
43Direct-mapped cache
- Each block of main memory maps to only one cache line
- i.e. if a block is in cache, it must be in one specific place
- Address is in two parts
- Least Significant w bits identify unique word
- Most Significant s bits specify one memory block
- The MSBs are split into a cache line field r and a tag of s-r (most significant)
44Direct Mapping Address Structure
Tag s-r: 8 bits | Line or Slot r: 14 bits | Word w: 2 bits
- 24 bit address
- 2 bit word identifier (4 byte block)
- 22 bit block identifier
- 8 bit tag (22-14)
- 14 bit slot or line
- No two blocks in the same line have the same Tag field
- Check contents of cache by finding line and checking Tag
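The 8/14/2 split of the 24-bit address can be written as a few shifts and masks. The sketch below uses the field widths from this example; the `cache_addr` type and `split_address` name are illustrative.

```c
#include <stdint.h>

#define WORD_BITS 2u    /* 4-byte block */
#define LINE_BITS 14u   /* 16k lines */
#define TAG_BITS  8u    /* 24 - 14 - 2 */

typedef struct { uint32_t tag, line, word; } cache_addr;

/* Split a 24-bit address into tag, line, and word fields. */
static cache_addr split_address(uint32_t addr) {
    cache_addr a;
    a.word = addr & ((1u << WORD_BITS) - 1u);
    a.line = (addr >> WORD_BITS) & ((1u << LINE_BITS) - 1u);
    a.tag  = (addr >> (WORD_BITS + LINE_BITS)) & ((1u << TAG_BITS) - 1u);
    return a;
}
```

For example, address 0x16339C yields tag 0x16, line 0x0CE7, and word 0; any other address with the same middle 14 bits competes for that same line.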
45Direct-mapped cache
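[Figure: the address is split into tag, index, and offset. The index selects a cache block holding a valid bit, a tag, and data; comparing the stored tag with the address tag produces the hit signal, and the offset selects the value from the data.]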
46- Fully-associative cache
- Set-associative cache
47Write operations
- Write-through: immediately copy write to main memory.
- Write-back: write to main memory only when location is removed from cache.
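The difference between the two policies shows up in how many main-memory writes they generate. The following sketch counts them for a tiny direct-mapped cache; the 4-line size, one word per line, word addressing, and the `cache_t` layout are all simplifying assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINES 4u                       /* tiny direct-mapped cache, 1 word per line */

typedef struct { bool valid, dirty; uint32_t tag, data; } line_t;

typedef struct {
    line_t   line[LINES];
    bool     write_back;               /* false selects write-through */
    unsigned mem_writes;               /* main-memory writes performed */
} cache_t;

/* Write one word; addresses are word addresses for simplicity. */
static void cache_write(cache_t *c, uint32_t addr, uint32_t value) {
    line_t *l = &c->line[addr % LINES];
    uint32_t tag = addr / LINES;
    if (l->valid && l->tag != tag && l->dirty)
        c->mem_writes++;               /* write-back: flush the evicted line */
    l->valid = true;
    l->tag = tag;
    l->data = value;
    if (c->write_back) {
        l->dirty = true;               /* defer the memory write */
    } else {
        l->dirty = false;
        c->mem_writes++;               /* write-through: update memory now */
    }
}
```

Three writes to the same address cost three memory writes under write-through and none under write-back until a conflicting address evicts the line.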
48Memory management units
- Memory management unit (MMU) translates addresses.
[Figure: the CPU issues a logical address to the memory management unit, which sends the physical address to main memory.]
49Memory management tasks
- Allows programs to move in physical memory during execution.
- Allows virtual memory:
- memory images kept in secondary storage;
- images returned to main memory on demand during execution.
- Page fault: request for location not resident in memory.
50Address translation
- Requires some sort of register/table to allow arbitrary mappings of logical to physical addresses.
- Two basic schemes:
- segmented
- paged.
- Segmentation and paging can be combined (x86).
51Segments and pages
[Figure: memory containing segment 1 and segment 2, and page 1 and page 2.]
52Segment address translation
[Figure: the segment base address is added to the logical address to form the physical address; a range check against the segment lower bound and segment upper bound raises a range error.]
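Segment translation amounts to an addition plus a bounds check. The sketch below is one way to express it, assuming the segment is described by a base and a size rather than explicit lower and upper bounds; `segment_t` and `segment_translate` are illustrative names.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t base;    /* segment base address in physical memory */
    uint32_t size;    /* segment length in bytes */
} segment_t;

/* Translate an offset within a segment; returns false on a range error. */
static bool segment_translate(const segment_t *seg, uint32_t offset,
                              uint32_t *physical) {
    if (offset >= seg->size)
        return false;                 /* outside the segment bounds */
    *physical = seg->base + offset;
    return true;
}
```

Checking the offset against the size is equivalent to checking the resulting physical address against the segment's lower and upper bounds.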
53Page address translation
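[Figure: the logical address is split into page and offset; the page number selects the page i base, which is concatenated with the offset to form the physical address.]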
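Page translation replaces the page number with a page base and keeps the offset unchanged. This sketch assumes 4-kbyte pages and a hypothetical four-entry flat page table; out-of-range page numbers wrap around here, where a real MMU would signal a fault.

```c
#include <stdint.h>

#define PAGE_BITS 12u                       /* assumed 4-kbyte pages */
#define NUM_PAGES 4u                        /* assumed tiny address space */

/* Flat page table: entry i holds the physical base of logical page i. */
static const uint32_t page_table[NUM_PAGES] = {
    0x00030000u, 0x00012000u, 0x00008000u, 0x00021000u
};

/* Replace the page number with the page base and keep the offset. */
static uint32_t page_translate(uint32_t logical) {
    uint32_t page   = (logical >> PAGE_BITS) % NUM_PAGES;
    uint32_t offset = logical & ((1u << PAGE_BITS) - 1u);
    return page_table[page] | offset;       /* concatenate base and offset */
}
```

Because every page base is aligned to the page size, a bitwise OR is enough to concatenate the two fields, and no bounds check on the offset is needed.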
54Page table organizations
[Figure: page table organizations: flat, a single table of page descriptors; tree, page descriptors reached through levels of tables.]
55Caching address translations
- Large translation tables require main memory access.
- TLB: cache for address translation.
- Typically small.
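A TLB can be sketched as a small cache keyed by page number that falls back to the page table only on a miss. Everything here is an assumption made for illustration: the two-entry direct-mapped organization, the 4-kbyte page size, and `walk_page_table`, which fakes a table lookup by adding a constant.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS   12u
#define TLB_ENTRIES 2u

typedef struct { bool valid; uint32_t page, frame; } tlb_entry;

static tlb_entry tlb[TLB_ENTRIES];
static unsigned  table_walks = 0;        /* slow lookups in main memory */

/* Hypothetical page-table walk: here simply frame = page + 0x100. */
static uint32_t walk_page_table(uint32_t page) {
    table_walks++;
    return page + 0x100u;
}

/* Translate using the TLB, walking the table only on a miss. */
static uint32_t tlb_translate(uint32_t logical) {
    uint32_t page = logical >> PAGE_BITS;
    tlb_entry *e = &tlb[page % TLB_ENTRIES];
    if (!e->valid || e->page != page) {  /* TLB miss: refill the entry */
        e->valid = true;
        e->page = page;
        e->frame = walk_page_table(page);
    }
    return (e->frame << PAGE_BITS) | (logical & ((1u << PAGE_BITS) - 1u));
}
```

Repeated accesses to the same page cost one table walk in total, while two pages that share a TLB entry evict each other and walk the table every time.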
56ARM memory management
- Memory region types:
- section: 1 Mbyte block;
- large page: 64 kbytes;
- small page: 4 kbytes.
- An address is marked as section-mapped or page-mapped.
- Two-level translation scheme.
57CPU performance and power consumption
58Example Intel XScale core
592. Bus
- ??CPU????????????
- ????????
- ????????
- ???????
- ?????
- ?????????,?CPU,DMA???
60?????????
61DMA
- DMA: Direct Memory Access
- ??????CPU????????DMA???DMA?????,??CPU?????????????
,DMA????????????????????
62?DMA????????
63?
- ?????????
- ????
- ?????????????
- ????????
- ???????????I/O????????
64ARM bus: AMBA
- AMBA: Advanced Microcontroller Bus Architecture
- The AMBA 2.0 specification defines three buses:
- AHB (Advanced High-performance Bus)
- ASB (Advanced System Bus)
- APB (Advanced Peripheral Bus)
65?????AMBA???
- ???????AMBA????????AHB?ASB??,???APB???
- ASB???????????AHB????,???????????????????
663. Memory
- RAM
- SRAM
- DRAM
- ROM
- PROM,EPROM,EEPROM
- Flash ROM
- Flash????????????(boot ROM?hard disk)
674. I/O
- Watchdog timer
- A/D and D/A converters
- LCD
- LED
- Touch screen
- Keyboard
- USB
68Watchdog timer
- ???????????????????????????????????????????
- ?????????,????????????????????????????,???????????
?????????????,???????????,????????,????,?????CPU??
?
69Watchdog timer
- Watchdog timer is periodically reset by system timer.
- If watchdog is not reset, it generates an interrupt to reset the host.
[Figure: the host CPU sends reset to the watchdog timer; the watchdog timer sends an interrupt to the host CPU.]
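The watchdog behavior can be sketched as a countdown that healthy software keeps restarting. The timeout of 3 ticks and the names `watchdog_kick`, `watchdog_tick`, and `host_reset` are assumptions; on hardware the counter is a peripheral and the expiry drives an interrupt or reset line.

```c
#include <stdbool.h>

#define WATCHDOG_TIMEOUT 3      /* assumed timeout, in timer ticks */

static int  watchdog_count = WATCHDOG_TIMEOUT;
static bool host_reset = false; /* set when the watchdog fires */

/* Called periodically by healthy software to restart the countdown. */
static void watchdog_kick(void) {
    watchdog_count = WATCHDOG_TIMEOUT;
}

/* Called on every timer tick; fires the reset when the count runs out. */
static void watchdog_tick(void) {
    if (watchdog_count > 0 && --watchdog_count == 0)
        host_reset = true;      /* on hardware: interrupt or reset line */
}
```

As long as `watchdog_kick` runs at least once every three ticks, `host_reset` stays false; a hung program stops kicking and the third tick triggers the reset.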