CH3 CPUs - PowerPoint PPT Presentation

Provided by: abc88

1
CH3 CPUs
2
summary
  • Input and output mechanisms.
  • Supervisor mode, exceptions, and traps.
  • Memory management and address translation.
  • Caches.
  • How architecture affects program performance.
  • How architecture affects program power
    consumption.

3
3.1 Introduction
4
outline
  • aspects of CPUs that do not directly relate to
    their instruction sets
  • interrupts and memory management
  • performance and power consumption

5
outline
  • 3.2 study input and output mechanisms such as
    interrupts
  • 3.3 several mechanisms designed to handle
    internal events
  • 3.4 co-processors that provide optional support
    for parts of the instruction set
  • 3.5 memory systems, memory management and caches

6
outline
  • 3.6 looks at performance
  • 3.7 considers power consumption
  • 3.8 data compressor example

7
3.2 Programming Input and Output
8
  • basics of I/O programming
  • basic characteristics of I/O devices

9
3.2.1 Input and Output Devices
10
Structure of a typical I/O device
  • Input and output devices usually have some analog
    or nonelectronic component
  • relationship between I/O device and CPU
  • Registers interface between CPU and device's
    internals
  • CPU talks to the device by reading and writing
    the registers

11
Structure of a typical I/O device
12
Structure of a typical I/O device
  • Data registers hold data values, such as the
    data read or written by a disk.
  • Status registers provide information about the
    device's operation

13
Ex1. 8251 UART
  • 8251 UART (Universal Asynchronous
    Receiver/Transmitter): the original device used
    for serial communications
  • Data are transmitted as streams of characters
  • Every character starts with a start bit (a 0) and
    ends with a stop bit (a 1)

14
Ex1. 8251 UART
  • Baud rate: data bits are sent as high and low
    voltages at a uniform rate
  • CPU must set the UART's mode registers
  • - baud rate
  • - data bits: 5 to 8 bits
  • - parity bit: even, odd, or none
  • - stop bits: 1, 1.5, or 2 bits

15
Ex1. 8251 UART
  • An 8-bit register buffers characters between the
    UART and the CPU bus.
  • Transmitter Ready output: indicates that the
    transmitter is ready to accept a data character
  • Transmitter Empty: goes high when the UART
    has no characters to send.
  • Receiver Ready: goes high when the UART has a
    character ready to be read by the CPU.

16
3.2.2 Input and Output Primitives
17
programming support for input and output
  • I/O instructions
  • - special instructions (Intel x86) for input and
    output
  • memory-mapped I/O
  • provides addresses for the registers in each I/O
    device
  • read and write instructions communicate with the
    devices

18
Ex1. Memory-Mapped I/O on ARM
  • use the EQU pseudo-op to define a symbolic name
    for the memory location of our I/O
    device
  • DEV1 EQU 0x1000

19
Ex1. Memory-Mapped I/O on ARM
  • read and write the device register
  • LDR r1,=DEV1 ; set up device address
  • LDR r0,[r1] ; read DEV1
  • MOV r0,#8 ; set up value to write
  • STR r0,[r1] ; write 8 to device

20
Ex2. Memory-Mapped I/O on SHARC
  • A memory-mapped I/O device must be assigned
    within the external memory space, which starts at
    0x400000.
  • use a DM access to read and write the off-chip
    device register
  • I0 = 0x400000;
  • M0 = 0;
  • R1 = DM(I0,M0);

21
write I/O devices in C
  • The traditional names for functions that read
    and write arbitrary memory locations are peek
    and poke
  • The peek function can be written in C as
  • int peek(char *location) {
  • return *location;
  • }
  • #define DEV1 0x1000
  • dev_status = peek(DEV1);

22
write I/O devices in C
  • The poke function can be implemented as
  • void poke(char *location, char newval) {
  • (*location) = newval;
  • }
  • To write 8 to the status register:
  • poke(DEV1,8);
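
A minimal, hedged sketch of peek and poke that can be compiled and run on an ordinary host. Two things here are assumptions that do not come from the slides: the `volatile` qualifier, which keeps the compiler from optimizing away register accesses, and `fake_reg`, a hypothetical variable standing in for the device register at 0x1000.

```c
#include <stdint.h>

/* Read one byte from an arbitrary location. `volatile` (an addition, not in
   the slides) tells the compiler the value may change outside the program. */
int peek(volatile uint8_t *location) {
    return *location;
}

/* Write one byte to an arbitrary location. */
void poke(volatile uint8_t *location, uint8_t newval) {
    *location = newval;
}

/* Hypothetical stand-in for a device register so the sketch runs on a host;
   on a real target this would be something like (volatile uint8_t *)0x1000. */
static volatile uint8_t fake_reg = 0;
#define DEV1 (&fake_reg)
```

Calling `poke(DEV1, 8)` followed by `peek(DEV1)` reads back 8 in this simulation. On real hardware a status register may not read back the value written.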

23
3.2.3 Busy-Wait I/O
24
busy-wait I/O
  • Devices are slower than the CPU and require many
    cycles to complete an operation.
  • CPU must wait for one operation to complete
    before starting the next one
  • Polling: asking an I/O device whether it is
    finished by reading its status register

25
Ex3-3 Busy-Wait I/O Programming
  • write a sequence of characters to an output
    device
  • The device has two registers: one for the
    character to be written and a status register
  • status register's value is 1 when the device is
    busy writing and 0 when the write transaction has
    completed

26
Ex3-3 Busy-Wait I/O Programming
  • register addresses
  • #define OUT_CHAR 0x1000 /* output device
    character register */
  • #define OUT_STATUS 0x1001 /* output device status
    register */

27
Ex3-3 Busy-Wait I/O Programming
  • sequence of characters is stored in a standard C
    string, which is terminated by a null (0)
    character
  • char *mystring = "Hello, world."; /* string to
    write */
  • char *current_char; /* pointer to current
    position in string */

28
Ex3-3 Busy-Wait I/O Programming
  • current_char = mystring; /* point to head of string */
  • while (*current_char != '\0') { /* until null character */
  • poke(OUT_CHAR,*current_char); /* send character to device */
  • while (peek(OUT_STATUS) != 0); /* keep checking status */
  • current_char++; /* update character pointer */
  • }
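
The busy-wait loop above can be tried on a host with a simulated device. This is a sketch under stated assumptions, not the original example's code. The names `sim_output`, `sim_count`, `sim_busy_polls` and `write_string` are hypothetical. The simulated `peek` and `poke` take an integer register address rather than a pointer, and the device is assumed to stay busy for three status polls after each write.

```c
#include <stddef.h>

#define OUT_CHAR   0x1000  /* output device character register */
#define OUT_STATUS 0x1001  /* output device status register */

/* Simulated device state (hypothetical, for running on a host). */
static char sim_output[64];     /* characters the "device" has written */
static size_t sim_count = 0;
static int sim_busy_polls = 0;  /* polls remaining before the write completes */

/* Simulated peek: the status register reads 1 while busy, 0 when done. */
int peek(int location) {
    if (location == OUT_STATUS && sim_busy_polls > 0) {
        sim_busy_polls--;
        return 1;
    }
    return 0;
}

/* Simulated poke: writing OUT_CHAR starts a write that stays busy for 3 polls. */
void poke(int location, char newval) {
    if (location == OUT_CHAR && sim_count < sizeof sim_output - 1) {
        sim_output[sim_count++] = newval;
        sim_output[sim_count] = '\0';
        sim_busy_polls = 3;
    }
}

/* Busy-wait write of a null-terminated string, following the slide's loop. */
void write_string(const char *current_char) {
    while (*current_char != '\0') {       /* until null character */
        poke(OUT_CHAR, *current_char);    /* send character to device */
        while (peek(OUT_STATUS) != 0)     /* keep checking status */
            ;
        current_char++;                   /* update character pointer */
    }
}
```

After `write_string("Hello, world.")`, `sim_output` holds the same string. Every character costs several wasted polls, which is the inefficiency that motivates interrupts.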

29
Ex3-4 Copy Characters from Input to Output Using
Busy-Wait I/O
  • repeatedly read a character from the input device
    and write it to the output device
  • define addresses for the device registers
  • #define IN_DATA 0x1000
  • #define IN_STATUS 0x1001
  • #define OUT_DATA 0x1100
  • #define OUT_STATUS 0x1101

30
Ex3-4 Copy Characters from Input to Output Using
Busy-Wait I/O
  • The input device
  • sets its status register to 1 when a new
    character has been read
  • The program must set the status register back to
    0 after it has read the character
  • When writing
  • the program sets the output status register to 1
    to start writing and waits for it to return to 0

31
  • while (TRUE) { /* perform operation forever */
  • /* read a character into achar */
  • while (peek(IN_STATUS) == 0); /* wait until ready */
  • achar = (char)peek(IN_DATA); /* read the character */
  • /* write achar */
  • poke(OUT_DATA,achar);
  • poke(OUT_STATUS,1); /* turn on device */
  • while (peek(OUT_STATUS) != 0); /* wait until done */
  • }

32
3.2.4 Interrupts
  • Busy-wait I/O is inefficient: the CPU does
    nothing but test the device status
  • CPU could work in parallel with the I/O
    transaction
  • - computation
  • - control of other I/O devices.

33
interrupt
  • interrupt mechanism allows devices to signal CPU
    and to force execution of a particular piece of
    code
  • On an interrupt, the program counter is changed
    to point to an interrupt handler routine (also
    called a device driver), which services the
    device, for example by writing the next data or
    reading data
  • CPU can return to the program that was
    interrupted

34
interrupt
35
interrupt
  • interface between the CPU and I/O device includes
    the following signals
  • I/O device asserts the interrupt request signal
    when it wants service
  • CPU asserts the interrupt acknowledge signal when
    it is ready to handle the I/O device's request

36
interrupt
  • The interrupt handler operates much like a
    subroutine, except that it is not called by the
    executing program
  • The program that runs when no interrupt is being
    handled is often called the foreground program
  • when the interrupt handler finishes, it returns
    to the foreground program

37
ex3-5 Copy Characters from Input to Output with
Basic Interrupts
  • repeatedly read a character from an input device
    and write it to an output device
  • use a global variable achar for the input handler
    to pass the character to the foreground program
  • use a global Boolean variable, gotchar, to signal
    when a new character has been received

38
  • void input_handler() { /* get a character and put
    in global */
  • achar = peek(IN_DATA); /* get character */
  • gotchar = TRUE; /* signal to main program */
  • poke(IN_STATUS,0); /* reset status to initiate
    next transfer */
  • }
  • void output_handler() { /* react to character
    being sent */
  • /* don't have to do anything */
  • }

39
ex3-5 Copy Characters from Input to Output with
Basic Interrupts
  • main() {
  • while (TRUE) { /* read then write forever */
  • if (gotchar) { /* write a character */
  • poke(OUT_DATA,achar); /* put character in
    device */
  • poke(OUT_STATUS,1); /* set status to initiate
    write */
  • gotchar = FALSE; /* reset flag */
  • }
  • }
  • }
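
The handshake between the input handler and the foreground program can be exercised on a host by calling the handler directly. The sketch below rests on assumptions. The register variables (`in_data`, `in_status`, `out_data`, `out_status`) are hypothetical stand-ins for the memory-mapped registers. `foreground_step` and `simulate_input_interrupt` are illustrative helpers. The `volatile` qualifiers are an addition that the slides do not show.

```c
#include <stdbool.h>

/* Shared between handler and foreground; volatile because the handler
   changes them asynchronously (an addition, not shown in the slides). */
static volatile char achar;
static volatile bool gotchar = false;

/* Simulated device registers (hypothetical stand-ins for IN_DATA etc.). */
static volatile char in_data, in_status, out_data, out_status;

/* Input handler: on real hardware the interrupt mechanism calls this. */
void input_handler(void) {
    achar = in_data;    /* get character */
    gotchar = true;     /* signal to foreground program */
    in_status = 0;      /* reset status to initiate next transfer */
}

/* One pass of the foreground loop body; returns true if a character
   was handed to the output device. */
bool foreground_step(void) {
    if (gotchar) {
        out_data = achar;   /* put character in device */
        out_status = 1;     /* set status to initiate write */
        gotchar = false;    /* reset flag */
        return true;
    }
    return false;
}

/* Helper for the simulation: pretend the input device received c
   and raised an interrupt. */
void simulate_input_interrupt(char c) {
    in_data = c;
    in_status = 1;
    input_handler();
}
```

If two input interrupts arrive before the foreground loop runs, the second character overwrites the first in `achar`. That limitation is why the next example adds a buffer.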

40
Ex3-6 Copy Characters from Input to Output with
Interrupt and Buffer
  • performs reads and writes independently.
  • The read and write routines communicate through
    the following global variables.
  • A string io_buf holds a queue of characters that
    have been read but not yet written.
  • Integers buf_start and buf_end point to the
    first and last characters read.
  • An integer error is set to 0 whenever io_buf
    overflows

41
Ex3-6 Copy Characters from Input to Output with
Interrupt and Buffer
  • The input and output devices are allowed to run
    at different rates
  • queue io_buf acts as a wraparound buffer
  • add characters to the tail
  • take characters from the head

42
Ex3-6 Copy Characters from Input to Output with
Interrupt and Buffer
  • When head and tail are equal, the queue is empty

43
Ex3-6 Copy Characters from Input to Output with
Interrupt and Buffer
  • When the buffer is full, we leave one character
    in the buffer unused
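
The wraparound queue described above can be sketched as follows. This illustrates the head/tail technique under assumptions and is not the original example's code. `BUF_SIZE`, the function names and the boolean `overflow` flag are hypothetical (the slides use an integer named error).

```c
#include <stdbool.h>

#define BUF_SIZE 8

static char io_buf[BUF_SIZE];  /* queue of characters read but not yet written */
static int buf_start = 0;      /* head: index of the next character to remove */
static int buf_end = 0;        /* tail: index where the next character goes */
static bool overflow = false;  /* hypothetical flag; the slides use an integer error */

/* Empty when head and tail are equal. */
bool empty_buffer(void) { return buf_start == buf_end; }

/* Full when advancing the tail would make it equal the head,
   so one slot always stays unused. */
bool full_buffer(void) { return (buf_end + 1) % BUF_SIZE == buf_start; }

/* Add a character at the tail; record an overflow if there is no room. */
void add_char(char c) {
    if (full_buffer()) { overflow = true; return; }
    io_buf[buf_end] = c;
    buf_end = (buf_end + 1) % BUF_SIZE;
}

/* Remove a character from the head; the caller must check
   empty_buffer() first. */
char remove_char(void) {
    char c = io_buf[buf_start];
    buf_start = (buf_start + 1) % BUF_SIZE;
    return c;
}
```

Leaving one slot unused is what lets the code tell a full queue from an empty one. Without it, both states would have head equal to tail.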

44
(No Transcript)
45
Debug interrupt
  • Because an interrupt can occur at any time, the
    same bug can manifest itself in different ways
    when the interrupt handler interrupts different
    segments of the foreground program

46
Ex3-7 Debugging Interrupt Code
  • y = Ax + b
  • for (i = 0; i < M; i++) {
  • y[i] = b[i];
  • for (j = 0; j < N; j++)
  • y[i] = y[i] + A[i][j]*x[j];
  • }

47
Ex3-7 Debugging Interrupt Code
  • Assume read_handler has a bug that causes it to
    change the value of j
  • Any CPU register that is written by the interrupt
    handler must be saved before it is modified and
    restored before the handler exits

48
implement
  • The CPU implements interrupts by checking the
    interrupt request line at the beginning of
    execution of every instruction
  • If an interrupt request is asserted, the CPU does
    not fetch the instruction pointed to by the PC;
    it sets the PC to the beginning of the interrupt
    handler instead
  • The starting address of the interrupt handler is
    usually given as a pointer

49
interrupts and subroutines
  • interrupt handler must return to the foreground
    program without disturbing the foreground
    program's operation
  • Most CPUs use the same basic mechanism for
    remembering the foreground program's PC as is
    used for subroutines
  • interrupt mechanism puts the return address on a
    stack

50
Priorities and Vectors
  • interrupts can be generalized to handle multiple
    devices and to provide more flexible definitions
  • - interrupt priorities allow the CPU to recognize
    some interrupts as more important than others
  • - interrupt vectors allow the interrupting
    device to specify its handler

51
Prioritized interrupts
  • Prioritized interrupts
  • - allow multiple devices to be connected
  • - allow the CPU to ignore less important
    interrupt requests
  • the lower-numbered interrupt lines are given
    higher priority

52
Prioritized device interrupts
  • most CPUs provide the priority number in binary
    form
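
A hedged sketch of how a binary priority number could be derived from the request lines, assuming eight lines where bit n of a mask represents line n and lower-numbered lines have higher priority. The function name and the 8-line width are illustrative assumptions.

```c
#include <stdint.h>

/* Return the number of the highest-priority pending interrupt line, or -1
   if none is pending. Bit n of `pending` is 1 when line n is requesting.
   Lower-numbered lines have higher priority, as in the slides. */
int highest_priority_pending(uint8_t pending) {
    for (int line = 0; line < 8; line++) {
        if (pending & (1u << line))
            return line;
    }
    return -1;
}
```

With lines 1 and 2 both requesting (mask 0x06), the function returns 1, the more important of the two.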

53
change the priority
  • How do we change the priority of a device?
  • Simply by connecting it to a different interrupt
    request line
  • This requires hardware modification
  • Programmable switches or a similar mechanism
    can make the change easy

54
Nested interrupt
  • Masking: the CPU stores the priority level of
    the interrupt being handled in an internal
    register
  • When a subsequent interrupt occurs,
  • - its priority is checked against the priority
    register
  • - the new request is acknowledged only if it has
    higher priority
  • When the interrupt handler exits, the priority
    register must be reset.
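
The masking rule can be modeled in software as a sketch. Real CPUs do this in hardware, so the names `priority_reg`, `try_acknowledge`, `handler_exit` and the value `NO_INTERRUPT` are all hypothetical. A lower number is assumed to mean higher priority. Resetting the register on exit is modeled as restoring the value it held before the handler was entered.

```c
#include <stdbool.h>

#define NO_INTERRUPT 255   /* hypothetical value: nothing is being handled */

static int priority_reg = NO_INTERRUPT;  /* priority of the interrupt in service */

/* Decide whether a new request is acknowledged. Lower number = higher
   priority. On success the previous register value is returned through
   *saved so the handler's exit can restore it. */
bool try_acknowledge(int new_priority, int *saved) {
    if (new_priority < priority_reg) {
        *saved = priority_reg;
        priority_reg = new_priority;
        return true;
    }
    return false;   /* equal or lower priority stays pending */
}

/* Called when the handler exits: reset the priority register. */
void handler_exit(int saved) {
    priority_reg = saved;
}
```

While a priority-3 handler runs, a priority-5 request is held off and a priority-1 request nests inside it. Once both handlers exit, the register returns to its idle value.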

55
power-down interrupts
  • The highest-priority interrupt is normally called
    the nonmaskable interrupt or NMI.
  • The NMI cannot be turned off
  • reserved for interrupts caused by power failures
  • A circuit can detect a dangerously low power
    supply and raise the NMI
  • The NMI handler can save critical state in
    nonvolatile memory and turn off I/O devices

56
  • Most CPUs provide a relatively small number of
    interrupt priority levels
  • more priority levels can be added with external
    logic
  • Polling can be combined with prioritized
    interrupts to handle the devices efficiently

57
Using polling to share an interrupt over several
devices
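
A minimal sketch of a shared handler, assuming three devices wired to one interrupt line and one status register per device. The array `dev_status`, the counter array `serviced` and the function names are hypothetical. The status registers are simulated with ordinary variables so the code runs on a host.

```c
#include <stdint.h>

#define NUM_DEVICES 3

/* Simulated status registers (hypothetical): nonzero means the device
   is requesting service. */
static volatile uint8_t dev_status[NUM_DEVICES];
static int serviced[NUM_DEVICES];   /* count of services, for demonstration */

static void service_device(int dev) {
    serviced[dev]++;
    dev_status[dev] = 0;            /* clear the request */
}

/* Shared handler: all devices raise the same interrupt line, so the handler
   polls each status register to find out which ones need service.
   Returns the number of devices serviced. */
int shared_interrupt_handler(void) {
    int handled = 0;
    for (int dev = 0; dev < NUM_DEVICES; dev++) {
        if (dev_status[dev] != 0) {
            service_device(dev);
            handled++;
        }
    }
    return handled;
}
```

The polling order inside the handler acts as a software priority among the devices that share the line.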
58
Ex3-8 I/O with Prioritized Interrupts
  • Device A has priority 1 (the highest)
  • Device B has priority 2
  • Device C has priority 3

59
Interrupt vectors
  • define the interrupt handler that should service
    a request from a device
  • hardware structure to support interrupt vectors

60
Interrupt vectors
  • additional interrupt vector lines run from the
    devices to the CPU
  • After request is acknowledged, device sends its
    interrupt vector to CPU.
  • CPU uses vector number as an index in a table
    stored in memory
  • The table entry gives the address of the handler
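
The table lookup can be sketched in C as an array of function pointers indexed by vector number. This models what the CPU does in hardware. The table size, handler names and `dispatch` function are illustrative assumptions.

```c
#include <stddef.h>

typedef void (*handler_t)(void);

#define NUM_VECTORS 4

static int last_handled = -1;   /* records which handler ran, for demonstration */

static void handler_a(void) { last_handled = 0; }
static void handler_b(void) { last_handled = 1; }

/* Interrupt vector table stored in memory: entry n holds the address of the
   handler for vector number n. Unused entries are NULL. */
static handler_t vector_table[NUM_VECTORS] = { handler_a, handler_b, NULL, NULL };

/* What the CPU does with the vector number a device sends: use it as an
   index into the table and call the handler found there.
   Returns 0 on success, -1 for an invalid or unassigned vector. */
int dispatch(unsigned vector) {
    if (vector >= NUM_VECTORS || vector_table[vector] == NULL)
        return -1;
    vector_table[vector]();
    return 0;
}
```

Assigning `vector_table[2] = handler_b` gives vector 2 a handler without touching the device, which is the flexibility the table provides.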

61
Activity on the bus during a vectored interrupt
62
Interrupt vectors
  • First, the device, not the CPU, stores its vector
    number, so a device can be given a new handler
    without modifying the system software.
  • Second, there is no fixed relationship between
    vector numbers and interrupt handlers
63
implement
  • Most modern CPUs implement both prioritized and
    vectored interrupts.
  • Priorities determine which device is serviced
    first
  • vectors determine what routine is used to service
    the interrupt

64
Interrupt Overhead
  • complete interrupt handling process
  • Once a device requests an interrupt, some steps
    are performed by the CPU, some by the device, and
    others by software.
  • The basic procedure is described below.
  • 1. CPU: checks for pending interrupts at the
    beginning of an instruction and answers the
    highest-priority interrupt

65
Interrupt Overhead
  • 2. Device: receives the acknowledgment and sends
    the CPU its interrupt vector.
  • 3. CPU: looks up the device handler address in
    the interrupt vector table and saves the current
    PC and possibly other internal CPU state, such
    as general-purpose registers.

66
Interrupt Overhead
  • 4. Software: the device driver may save
    additional CPU state, performs the required
    operations, restores any saved state, and
    executes the interrupt return instruction.
  • 5. CPU: the interrupt return instruction restores
    the PC and other automatically saved state,
    returning execution to the interrupted code.

67
performance penalty
  • An interrupt causes a change in the program
    counter, so it incurs a branch penalty. If the
    interrupt automatically stores CPU registers,
    that requires extra cycles
  • interrupt requires extra cycles to acknowledge
    the interrupt and obtain the vector from the
    device.

68
performance penalty
  • interrupt handler will save and restore CPU
    registers that were not automatically saved by
    the interrupt.
  • interrupt return instruction incurs a branch
    penalty as well as the time required to restore
    the automatically saved state.

69
performance penalty
  • time required for the hardware to respond to the
    interrupt, obtain the vector, cannot be changed
    by the programmer.
  • Careful programming can keep the number of
    registers used by an interrupt handler small,
    which saves time in saving and restoring state
  • This usually requires coding the interrupt
    handler in assembly language rather than a
    high-level language
70
Interrupts in ARM
  • Two types of interrupts: fast interrupt requests
    (FIQs) and interrupt requests (IRQs).
  • FIQ takes priority over an IRQ.
  • interrupt table is kept in the bottom memory
    addresses, starting at location 0.
  • The entries in the table contain subroutine calls
    to the appropriate handler.

71
Interrupts in ARM
  • responding to an interrupt
  • saves the appropriate value of the PC to be used
    to return,
  • copies the CPSR into an SPSR (saved program
    status register),
  • forces bits in the CPSR to note the interrupt,
    and
  • forces the PC to the appropriate interrupt vector.

72
Interrupts in ARM
  • leaving the interrupt handler
  • restore the proper PC value,
  • restore the CPSR from the SPSR, and
  • clear interrupt disable flags.

73
Interrupts in ARM
  • worst-case latency to respond to an interrupt
  • 2 cycles to synchronize the external request,
  • up to 20 cycles to complete the current
    instruction,
  • 3 cycles for data abort, and
  • 2 cycles to enter the interrupt handling state.
  • This adds up to 27 clock cycles in the worst
    case (2 + 20 + 3 + 2); the best case is 4 clock
    cycles

74
Interrupts in SHARC
  • supports three prioritized, vectored, maskable
    interrupts,
  • each of which calls an interrupt handler
    subroutine

75
When processing an interrupt
  • outputs interrupt vector address
  • pushes current PC onto the PC stack
  • may push the ASTAT and MODE1 registers onto the
    status stack
  • sets appropriate bit in the interrupt latch
    register
  • changes interrupt mask pointer to show the
    current interrupt nesting state.

76
return from an interrupt
  • pops the return address off the PC stack and
    loads it into the PC,
  • pops the status stack if appropriate, and
  • clears the appropriate bits in the interrupt
    latch and mask registers.

77
Interrupts in SHARC
  • The interrupt vector table may be kept either in
    internal or external memory.
  • vector table provides interrupt vectors for a
    number of actions, including
  • reset, the three external interrupts,
  • internal DMA channels, timers,
  • floating-point errors,
  • user software interrupts.

78
Interrupts in SHARC
  • For most instructions, the latency for an
    external interrupt is four cycles.
  • Some instructions require multiple cycles to
    finish and will delay interrupt handling
  • waiting for external memory may also delay
    handling.