IO Subsystems - PowerPoint PPT Presentation

Title:

IO Subsystems

Provided by: Ban101
Transcript and Presenter's Notes


1
I/O Subsystems
2
(No Transcript)
3
I/O Subsystems
  • Data registers hold values that are treated as
    data by the device, such as the data read from
    or written to a disk
  • Status registers provide information about the
    device's operation, such as whether the current
    transaction has completed

4
CPU I/O Structures
  • Isolated I/O: separate I/O instructions in the
    instruction set
  • Memory-mapped I/O: ordinary memory reference
    instructions are used for I/O
  • NIOS uses memory-mapped I/O

5
Memory mapped I/O in C
  • Peek and Poke functions
  • traditional functions to read and write arbitrary
    memory locations

6
Basic Structures
  • int peek(char *location)
  • { return *location; } /* return contents of location */
  • void poke(char *location, char newval)
  • { (*location) = newval; } /* output newval to location */
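
A minimal sketch of the two functions as compilable C. The `volatile` qualifier, the `uint8_t` type, and the ordinary array `fake_regs` (a hypothetical stand-in for device registers) are assumptions added here so the code can run on a host machine with no memory-mapped hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Read the byte stored at an arbitrary location.
   volatile keeps the compiler from caching or removing the access. */
int peek(volatile uint8_t *location)
{
    return *location;
}

/* Write newval to an arbitrary location. */
void poke(volatile uint8_t *location, uint8_t newval)
{
    *location = newval;
}

/* Hypothetical stand-in for a small block of device registers. */
static uint8_t fake_regs[4];

/* Write a value through poke and read it back through peek. */
int roundtrip(unsigned index, uint8_t value)
{
    assert(index < sizeof fake_regs);
    poke(&fake_regs[index], value);
    return peek(&fake_regs[index]);
}
```

On a real target the pointer passed to `peek` and `poke` would be a device register address rather than an element of `fake_regs`.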

7
BusyWait I/O - Input
  • #define IN_STATUS 0x1001
  • #define IN_DATA 0x1000
  • while (peek(IN_STATUS) == 0); /* wait loop;
    status == 1 when char ready */
  • achar = (char) peek(IN_DATA); /* input
    character */

8
BusyWait I/O - Output
  • #define OUT_STATUS 0x1101
  • #define OUT_DATA 0x1100
  • while (peek(OUT_STATUS) != 0);
  • poke(OUT_STATUS, 1); /* set device busy */
  • poke(OUT_DATA, achar); /* output character */

9
String Output Example
  • Sequence of characters stored in a standard C
    string - terminated by a null (0) character

10
Definitions
11
String Output
12
  • The outer while loop sends the characters one at
    a time
  • The inner while loop checks the device status
  • it implements the busy-wait function by
    repeatedly checking the device status until the
    status changes to 0
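
The two loops described above can be sketched as follows. This is an illustration, not the code from the slides: `peek` and `poke` take integer addresses, and they are backed by a small simulated device (`sim_output`, `sim_busy_polls`, and `sim_printed` are hypothetical names) so the example runs without hardware.

```c
#include <assert.h>
#include <string.h>

#define OUT_STATUS 0x1101
#define OUT_DATA   0x1100

/* Simulated output device (assumption: stands in for real hardware). */
static char sim_output[64];   /* characters the device has accepted */
static int  sim_count;        /* number of characters accepted so far */
static int  sim_busy_polls;   /* status polls left before the device is ready */

int peek(int location)
{
    if (location == OUT_STATUS && sim_busy_polls > 0) {
        sim_busy_polls--;
        return 1;                          /* still busy */
    }
    return 0;                              /* ready */
}

void poke(int location, char newval)
{
    if (location == OUT_STATUS && newval != 0)
        sim_busy_polls = 3;                /* device stays busy for three polls */
    else if (location == OUT_DATA && sim_count < (int)sizeof sim_output - 1)
        sim_output[sim_count++] = newval;
}

/* Send a null-terminated string, one character at a time. */
void put_string(const char *s)
{
    while (*s != '\0') {                   /* outer loop: one pass per character */
        while (peek(OUT_STATUS) != 0)      /* inner loop: busy-wait until status is 0 */
            ;
        poke(OUT_STATUS, 1);               /* set device busy */
        poke(OUT_DATA, *s);                /* output character */
        s++;
    }
}

/* Return everything the simulated device has received as a C string. */
const char *sim_printed(void)
{
    sim_output[sim_count] = '\0';
    return sim_output;
}
```

Calling `put_string("Hello")` leaves the five characters in the simulated device's log, in order.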

13
Copying Characters from Input to Output Using
Busy-Wait I/O
  • Repeatedly read a character from the input
    device and write it to the output device
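
A sketch of such a copy loop, under the same conventions as the busy-wait input and output slides. The devices are simulated (`sim_input`, `sim_output`, and `sim_out_busy` are hypothetical), and the loop copies a fixed number of characters instead of running forever so that it terminates.

```c
#include <assert.h>
#include <string.h>

#define IN_DATA    0x1000
#define IN_STATUS  0x1001
#define OUT_DATA   0x1100
#define OUT_STATUS 0x1101

/* Simulated devices (assumption: replace with real register access). */
static const char *sim_input = "busy";   /* characters the input device delivers */
static int  sim_in_pos;
static char sim_output[32];
static int  sim_out_count;
static int  sim_out_busy;                 /* 1 while the output device is busy */

int peek(int location)
{
    switch (location) {
    case IN_STATUS:  return sim_input[sim_in_pos] != '\0';   /* 1 when a char is ready */
    case IN_DATA:    return sim_input[sim_in_pos++];         /* reading consumes the char */
    case OUT_STATUS: { int s = sim_out_busy; sim_out_busy = 0; return s; } /* busy for one poll */
    default:         return 0;
    }
}

void poke(int location, char newval)
{
    if (location == OUT_STATUS)
        sim_out_busy = (newval != 0);
    else if (location == OUT_DATA && sim_out_count < (int)sizeof sim_output - 1)
        sim_output[sim_out_count++] = newval;
}

/* Copy n characters from the input device to the output device.
   The caller must not ask for more characters than the input will deliver. */
void copy_chars(int n)
{
    while (n-- > 0) {
        while (peek(IN_STATUS) == 0)       /* wait until a character is ready */
            ;
        char achar = (char)peek(IN_DATA);  /* input character */
        while (peek(OUT_STATUS) != 0)      /* wait until the output device is idle */
            ;
        poke(OUT_STATUS, 1);               /* set device busy */
        poke(OUT_DATA, achar);             /* output character */
    }
}
```

Every `while (...) ;` line is time the CPU spends doing nothing but polling, which is the cost the following slides address.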

14
Copying Characters from Input to Output Using
Busy-Wait I/O
15
Copying Characters from Input to Output Using
Busy-Wait I/O
16
Copying Characters from Input to Output Using
Busy-Wait I/O
  • Advantage: simplicity of implementation, in
    both hardware and software
  • Disadvantage: the CPU can do nothing else while
    in the wait loop
  • OK for small systems

17
Solution - Interrupts
  • Eliminates dead time for I/O transfers
  • Allows CPU to respond to asynchronous external
    events

18
Basic Interrupt Structure
  • Device issues interrupt request
  • CPU saves state and transfers control to the
    interrupt handler
  • CPU issues interrupt acknowledge

19
(No Transcript)
20
Copying Characters from Input to Output with
Basic Interrupts
  • Write C functions as interrupt handlers
  • Define global variables
  • achar: used by the input handler to pass the
    character to the foreground program
  • gotchar: Boolean variable that signals a new
    character has been received

21
Input Handler
22
Main
23
Output Handler
24
Circular Queue
25
Copying Characters from Input to Output with
Basic Interrupts
  • Use of interrupts has made the main program
    somewhat simpler
  • But program design still does not let the
    foreground program do useful work
  • Hence, need a more sophisticated program design
    to let the foreground program work completely
    independently of input and output

26
Copying Characters from Input to Output with
Interrupts and Buffers
  • Need program to perform reads and writes
    independently
  • Solution - a buffer to hold inputs until they
    are written

27
Copying Characters from Input to Output with
Interrupts and Buffers
  • Read and write routines to communicate through
    the following global variables
  • Character string io_buf to hold a queue of
    characters that have been read but not yet
    written
  • Integer error will be set to 0 if/whenever
    io_buf overflows

28
For io_buf - a Circular Queue
  • The queue io_buf acts as a wraparound buffer
  • characters added to the tail when an input is
    received
  • characters taken from the head when we are ready
    for output

29
Circular Queue
  • Situation at the start of the program execution
  • tail points to the first available location
  • head points to the next character to be output
  • if head and tail are equal the queue is empty

30
Circular Queue
  • When the first character is entered, the tail is
    incremented after the character is added to the
    queue

31
Circular Queue
32
Circular Queue
  • When the buffer is full, leave one location in
    the buffer empty
  • if another character were added and the tail
    pointer updated (wrapping it around to the head
    of the buffer), we could not distinguish a full
    buffer from an empty one
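
A sketch of a circular queue that follows these rules: head equal to tail means empty, and one slot is always left unused so that a full queue looks different from an empty one. The buffer size and the function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define BUF_SIZE 8

static char io_buf[BUF_SIZE];
static int  head;   /* index of the next character to be output */
static int  tail;   /* index of the first available location */

bool queue_empty(void)
{
    return head == tail;
}

/* Full when advancing tail would make it equal to head,
   so one slot is always left unused. */
bool queue_full(void)
{
    return (tail + 1) % BUF_SIZE == head;
}

/* Add a character at the tail; returns false if the queue is full. */
bool add_char(char c)
{
    if (queue_full())
        return false;
    io_buf[tail] = c;
    tail = (tail + 1) % BUF_SIZE;   /* wrap around at the end of the buffer */
    return true;
}

/* Remove a character from the head; returns false if the queue is empty. */
bool remove_char(char *c)
{
    if (queue_empty())
        return false;
    *c = io_buf[head];
    head = (head + 1) % BUF_SIZE;   /* wrap around at the end of the buffer */
    return true;
}
```

With `BUF_SIZE` set to 8 the queue holds at most 7 characters; the eighth slot is the one deliberately left empty.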

33
Circular Queue
34
Circular Queue
  • What happens when the output goes past the end
    of io_buf?

35
Circular Queue
36
Service Routines
37
Copying Characters from Input to Output with
Interrupts and Buffers
  • Two interrupt handler routines defined in C
  • input_handler for the input device
  • output_handler for the output device

38
Copying Characters from Input to Output with
Interrupts and Buffers
  • The complication is in starting the output
    device
  • If io_buf has characters waiting, the output
    driver can start a new output transaction by
    itself
  • If there are no characters waiting, an outside
    agent must start a new output action whenever
    the new character arrives

39
Copying Characters from Input to Output with
Interrupts and Buffers
  • Solution -- have the input handler check to see
    whether there is only one character in the
    buffer and start a new transaction
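
A sketch of the two handlers working through a circular queue. It is an illustration under assumptions, not the code from the slides: the devices are simulated, `overflowed` is a hypothetical overflow flag, and the extra `sim_out_busy` check is added here so that the input handler never overwrites a character the output device is still sending.

```c
#include <assert.h>
#include <stdbool.h>

#define BUF_SIZE 8

static char io_buf[BUF_SIZE];
static int  head, tail;          /* circular queue indices */
static bool overflowed;          /* hypothetical flag: set when io_buf overflows */

static char sim_in_data;         /* simulated input data register */
static char sim_out_log[32];     /* characters handed to the simulated output device */
static int  sim_out_count;
static bool sim_out_busy;        /* true while the output device is sending */

static int nchars(void)
{
    return (tail - head + BUF_SIZE) % BUF_SIZE;
}

/* Take the character at the head and hand it to the output device. */
static void start_output(void)
{
    if (sim_out_count < (int)sizeof sim_out_log)
        sim_out_log[sim_out_count++] = io_buf[head];
    head = (head + 1) % BUF_SIZE;
    sim_out_busy = true;
}

/* Runs when the input device has a new character. */
void input_handler(void)
{
    if (nchars() == BUF_SIZE - 1) {
        overflowed = true;               /* queue full: drop the character */
        return;
    }
    io_buf[tail] = sim_in_data;
    tail = (tail + 1) % BUF_SIZE;
    if (nchars() == 1 && !sim_out_busy)  /* only one character and output idle */
        start_output();
}

/* Runs when the output device has finished sending a character. */
void output_handler(void)
{
    sim_out_busy = false;
    if (nchars() > 0)
        start_output();                  /* keep going while characters wait */
}

/* Simulate the input device delivering character c. */
void sim_receive(char c)
{
    sim_in_data = c;
    input_handler();
}
```

In this sketch the first character received starts the output device directly; later characters wait in `io_buf` until `output_handler` picks them up.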

40
Input and Output Handlers
41
UML Sequence Diagram
42
  • Foreground program does not need to do anything;
    everything is taken care of by the interrupt
    handlers
  • Simulation shows that the foreground program is
    not executing continuously, but continues to run
    in a regular state independent of the number of
    characters waiting in the queue

43
Debugging Interrupt Code
  • An Example

44
y = Ax + b
45
y = Ax + b
  • Assume
  • the foreground code is performing the matrix
    multiplication operation
  • the interrupt handlers perform I/O while the
    matrix computation is performed
  • but with one small problem
  • read_handler has a bug that causes it to change
    the value of j

46
  • Any CPU register that is written by the
    interrupt handler must be saved before it is
    modified and restored before the handler exits
  • Any type of bug such as forgetting to save the
    register or to properly restore it can cause
    that register to mysteriously change value in
    the foreground program

47
  • What happens to the foreground program when j
    changes value during an interrupt depends on
    when the interrupt handler executes
  • Because the value of j is reset at each iteration
    of the outer loop, the bug will affect only one
    entry of the result y
  • But clearly the entry that changes will depend
    on when the interrupt occurs

48
  • Furthermore, the change observed in y depends on
    not only what new value is assigned to j (which
    may depend on the data handled by the interrupt
    code), but also when in the inner loop the
    interrupt occurs
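
The effect can be reproduced with a small simulation. In this sketch the "interrupt" is modelled as a one-time overwrite of j at a chosen point in the loop nest; the function name, its parameters, and the matrix size are all assumptions made for illustration.

```c
#include <assert.h>

#define N 3

/* Compute y = A x + b. If corrupt_i >= 0, simulate a buggy interrupt handler
   that overwrites j with new_j just before term (corrupt_i, corrupt_j) is added. */
void matvec(int A[N][N], const int x[N], const int b[N], int y[N],
            int corrupt_i, int corrupt_j, int new_j)
{
    for (int i = 0; i < N; i++) {
        y[i] = b[i];
        for (int j = 0; j < N; j++) {        /* j is reset for every row */
            if (i == corrupt_i && j == corrupt_j) {
                j = new_j;                   /* the "interrupt" clobbers j */
                corrupt_i = -1;              /* the interrupt happens only once */
                if (j < 0 || j >= N)
                    break;                   /* keep the simulation in bounds */
            }
            y[i] += A[i][j] * x[j];
        }
    }
}
```

Passing -1 for `corrupt_i` gives the correct result. Corrupting j in row 1 changes only `y[1]`, and how it changes depends on both `corrupt_j` and `new_j`.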

49
Prioritized and Vectored Interrupts
  • Early CPUs (and many still in use) had only an
    interrupt request and interrupt acknowledge
  • Multiple I/O devices or other external events
    required additional external hardware and
    instructions in the handler to handle multiple
    interrupts

50
Prioritized and Vectored Interrupts
  • Priority: implemented via internal or external
    hardware
  • Vector: starting address of the interrupt handler

51
Vectored Interrupts
52
UML Sequence Diagram Description
53
NIOS
  • 64 prioritized interrupts
  • 6-bit interrupt priority number must be supplied
    by the device
  • One interrupt request line
  • Service routine accessed via device number

54
Interrupt Overhead
  • Once a device requests an interrupt
  • some steps are performed by the CPU hardware
  • some by the device
  • others by software

55
Basic Procedure
  • CPU
  • checks for pending interrupts at the beginning
    of an instruction cycle
  • answers the highest-priority interrupt that has a
    higher priority than that given in the interrupt
    priority register
  • Device
  • receives the acknowledgement and sends the CPU
    its interrupt vector

56
Basic Procedure - continued
  • CPU
  • looks up the device handler address in the
    interrupt vector table using the vector as an
    index
  • a subroutine-like mechanism is used to save the
    current value of the PC and possibly other
    internal CPU state, such as general-purpose
    registers

57
Basic Procedure - continued
  • Software
  • device driver may save additional CPU state
  • performs the required operations on the device
  • restores any saved state and executes the
    interrupt return instruction
  • CPU
  • interrupt return instruction restores the PC and
    other automatically saved states to return
    execution to the code that was interrupted
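
The procedure can be sketched as a software model. This is not any particular CPU's mechanism: the table size, the handler names, and the choice to use the vector number as the priority level are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_VECTORS 8

typedef void (*handler_t)(void);

static handler_t vector_table[NUM_VECTORS];  /* handler address for each vector */
static int pending[NUM_VECTORS];             /* 1 if that device has requested service */
static int priority_reg = -1;                /* only requests above this level are answered */
static int last_serviced = -1;

static void handler_a(void) { last_serviced = 2; }
static void handler_b(void) { last_serviced = 5; }

void install_handlers(void)
{
    vector_table[2] = handler_a;
    vector_table[5] = handler_b;
}

/* Called at the start of an instruction cycle.
   Returns the vector that was serviced, or -1 if none qualified. */
int check_interrupts(void)
{
    for (int v = NUM_VECTORS - 1; v > priority_reg; v--) {  /* highest priority first */
        if (pending[v] && vector_table[v] != NULL) {
            pending[v] = 0;              /* acknowledge the request */
            int saved = priority_reg;    /* save state (here only the priority level) */
            priority_reg = v;
            vector_table[v]();           /* use the vector as an index and call the handler */
            priority_reg = saved;        /* interrupt return restores the saved state */
            return v;
        }
    }
    return -1;
}
```

With vectors 2 and 5 both pending, the first call services 5 and the next services 2. Raising `priority_reg` to 3 masks vector 2.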

58
Interrupt Performance Penalty
  • Interrupt itself has overhead similar to a
    subroutine call
  • because an interrupt causes a change in the
    program counter, it incurs a branch penalty
  • if the interrupt automatically stores CPU
    registers, that action requires extra cycles,
    even if the state is not modified by the
    interrupt handler

59
Interrupt Performance Penalty
  • In addition to the branch delay penalty
  • interrupt requires extra cycles to acknowledge
    the interrupt and obtain the vector from the
    device
  • Interrupt handler will, in general, save and
    restore CPU registers that were not
    automatically saved by the interrupt
  • Interrupt return instruction incurs a branch
    penalty as well as the time required to restore
    the automatically saved state

60
Processor Characteristics
61
Supervisor Mode
  • Complex systems are often implemented as several
    programs that communicate with each other
  • Even with an operating system, it may be
    desirable to provide hardware checks to ensure
    that the programs do not interfere with each
    other

62
Supervisor Mode
  • Often useful to have a supervisor mode provided
    by the CPU
  • Normal programs run in user mode
  • Supervisor mode has privileges that user mode
    does not
  • e.g. - control of the memory management unit is
    typically reserved for supervisor mode to avoid
    the obvious problems
  • NIOS has no supervisor mode

63
Exceptions
  • An exception is an internally detected error
  • Simple example is division by zero
  • One possibility: check every divisor in software
    before division to be sure it is not zero
  • The CPU can check the divisor's value more
    efficiently during execution and raise an
    exception
  • The exception mechanism provides a way for the
    program to react to such unexpected events
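
The software-only alternative looks roughly like this. The function name and the boolean return convention are assumptions; the second condition in the test is an extra guard, not mentioned in the slides, for the one signed quotient that overflows.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Divide num by den with an explicit software check.
   Returns false and leaves *result untouched when the division is not safe. */
bool checked_div(int num, int den, int *result)
{
    if (den == 0)
        return false;                    /* the case a hardware exception would catch */
    if (num == INT_MIN && den == -1)
        return false;                    /* the quotient would overflow an int */
    *result = num / den;
    return true;
}
```

Every call pays for the comparison even though the divisor is almost never zero, which is the inefficiency an exception mechanism avoids.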

64
Exceptions
  • Exceptions are generally implemented as a
    variation of an interrupt
  • however, exceptions are generated internally
  • Exceptions in general require both
    prioritization and vectoring
  • A single operation may generate more than one
    exception; for example, an illegal operand and
    an illegal memory access

65
Exceptions
  • Priority of exceptions is usually fixed by the
    CPU architecture
  • Vectoring provides a way for the user to specify
    the handler for the exception condition
  • The vector number for an exception is usually
    predefined by the architecture

66
NIOS
  • Provides two exceptions
  • Register file window underflow
  • Register file window overflow

67
Traps
  • A trap, also known as a software interrupt, is
    an instruction that explicitly generates an
    exception condition
  • most common use of a trap is to enter supervisor
    mode
  • NIOS provides a trap instruction

68
Co-processors
  • CPU architects often want to provide flexibility
    in what features are implemented in the CPU, or
    find that some features cannot fit on the CPU
    chip
  • Co-processors attached to the CPU can provide
    such flexibility at the instruction set level
  • e.g. the Intel 8086 CPU with its 8087
    floating-point co-processor

69
Co-processors
  • To support co-processors
  • certain opcodes must be reserved in the
    instruction set for co-processor operations
  • Co-processor must be tightly coupled to the CPU
  • when the CPU receives a co-processor instruction,
    the CPU must activate the co-processor and pass
    it the relevant instruction
  • co-processor instructions can load and store
    co-processor registers or can perform internal
    operations

70
Co-processors
  • CPU may receive co-processor instructions even
    when there is no co-processor attached
  • illegal instruction traps are used to handle
    these situations
  • the function is executed in software on the main
    CPU

71
Memory Systems
  • Caches
  • Memory Management Units

72
Cache
  • A fast and small memory that holds copies of
    some of the content of main memory

73
Cache
74
Cache
  • Cache hit: the requested location is in the cache
  • Cache miss: the requested location is not in the
    cache

75
Cache Performance
  • $h$ = hit rate
  • $T_{cache}$ = cache access time
  • $T_{main}$ = main memory access time
  • $T_{av} = h\,T_{cache} + (1 - h)\,T_{main}$
  • A cache improves average system performance
  • but it can cause problems in embedded systems
    involving hard real-time control
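
A small worked example of the average access time formula, T_av = h T_cache + (1 - h) T_main. The function name is hypothetical, and the numbers in the usage note are made-up illustrative values, not measurements.

```c
#include <assert.h>

/* Average memory access time: t_av = h * t_cache + (1 - h) * t_main.
   h is the hit rate (0.0 to 1.0); both times must use the same unit. */
double average_access_time(double h, double t_cache, double t_main)
{
    assert(h >= 0.0 && h <= 1.0);
    return h * t_cache + (1.0 - h) * t_main;
}
```

For example, with an assumed 90% hit rate, a 5 ns cache, and 100 ns main memory, the average is about 14.5 ns. The worst case is still 100 ns, which is why caches complicate hard real-time guarantees.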

76
Cache Types
  • Direct Mapped
  • Set Associative caches
  • Set associative yields better average
    performance (with additional complexity), but
    the penalty for a miss is more severe, which
    hurts predictability

77
Memory Management Unit
  • Translates addresses between the CPU and
    physical memory
  • Supports virtual addresses and virtual memory
  • Virtual memory requires a disk or other secondary
    storage device
  • To date, MMUs have not been common in embedded
    systems

78
Memory Management Units
79
CPU Performance
  • Pipelining
  • ARM and SHARC: three stages
  • Fetch: the instruction is fetched from memory
  • Decode: the instruction's opcode and operands
    are decoded to determine what function to
    perform
  • Execute: the decoded instruction is executed

80
CPU Performance
  • NIOS - four stages
  • Instruction Fetch
  • Instruction Decode/Operand Fetch
  • Execute
  • Write-back

81
CPU Performance
  • Latency vs. Throughput

82
CPU Performance
83
Data Stall
84
Branches - Control Stall or Branch Penalty
85
NIOS
  • Delayed branch
  • some number of instructions directly after the
    branch are always executed, whether or not the
    branch is taken
  • maintains a full pipeline; some instructions may
    have to be no-ops

86
Superscalar Processors
87
Data Dependency
88
Superscalar Processors
  • Improves throughput but complicates performance
    estimation
  • instructions scheduled at execution time
  • Simulator needed to calculate performance

89
Power Consumption
  • Energy vs power
  • Power ---> heat generation
  • Energy ---> battery life
  • Power generally used for both unless needed for
    clarification

90
Power Consumption
  • Voltage drops: the power consumption of a CMOS
    circuit is proportional to the square of the
    power supply voltage (V^2)
  • Toggling: a CMOS circuit uses most of its power
    when it is changing its output value

91
Power Consumption
  • Leakage: even when a CMOS circuit is not active,
    some charge leaks out of the circuit's nodes
    through the substrate

92
Power Management
  • Static power management: invoked by the user
  • Dynamic power management: automatic control by
    the CPU

93
Power Saving Strategies in CMOS
  • Reduce power supply level
  • e.g. 5.0 V to 3.3 V: (5.0/3.3)^2 ≈ 2.30 factor
    reduction, or about 56% reduction
  • Lower clock frequency
  • lowers power but not energy consumption
  • Disable certain internal function units (turn
    off clock)
  • Totally disconnect power from some internal units

94
Power Down Mode
  • Provides the opportunity to greatly reduce power
    consumption because it will typically be entered
    for a substantial period of time
  • going into and especially out of a power-down
    mode is not free; it costs both time and energy
  • pipelined processors require complex control
    that must be properly initialized to avoid
    corrupting data in the pipeline

95
Power State Machine
96
Power-Saving Modes of the StrongARM SA-1100
97
Power-Saving Modes of the StrongARM SA-1100
  • Run mode is normal operation and has the highest
    power consumption
  • Idle mode saves power by stopping the CPU clock
  • system unit modules (real-time clock, operating
    system timer, interrupt control, general-purpose
    I/O, and power manager) all remain operational
  • idle mode is entered by executing a
    three-instruction sequence
  • CPU returns to run mode upon receiving an
    interrupt from one of the internal system units
    or from a peripheral, or by resetting the CPU

98
Power-Saving Modes of the StrongARM SA-1100
  • Sleep mode shuts off most of the chip's activity
  • entering sleep mode causes the system to shut
    down on-chip activity, reset the CPU, and negate
    the PWR_EN pin to tell the external electronics
    that the chip's power supply should be driven to
    0 volts
  • a separate I/O power supply remains on and
    supplies power to the power manager so that the
    CPU can be awakened from sleep mode; the
    low-speed clock keeps the power manager running
    at low speeds sufficient to manage sleep mode
  • sleep mode is entered by forcing the sleep bit in
    the power manager control register; it can also
    be entered by a power supply fault
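
The three modes can be summarized as a small state machine. This sketch encodes only the transitions stated in the bullets above, plus one hypothetical `EV_WAKEUP` event for leaving sleep mode, since the slides do not spell out the wake-up trigger. All type and event names are assumptions.

```c
#include <assert.h>

typedef enum { RUN, IDLE, SLEEP } power_state_t;

typedef enum {
    EV_IDLE_SEQUENCE,   /* software executes the idle-entry instruction sequence */
    EV_INTERRUPT,       /* interrupt from a system unit or peripheral */
    EV_RESET,           /* CPU reset */
    EV_SLEEP_BIT,       /* sleep bit forced in the power manager control register */
    EV_POWER_FAULT,     /* power supply fault */
    EV_WAKEUP           /* hypothetical wake-up event seen by the power manager */
} power_event_t;

/* Return the state that follows state s when event e occurs.
   Events that are not listed for a state leave it unchanged. */
power_state_t next_state(power_state_t s, power_event_t e)
{
    switch (s) {
    case RUN:
        if (e == EV_IDLE_SEQUENCE)
            return IDLE;
        if (e == EV_SLEEP_BIT || e == EV_POWER_FAULT)
            return SLEEP;
        return RUN;
    case IDLE:
        if (e == EV_INTERRUPT || e == EV_RESET)
            return RUN;
        return IDLE;
    case SLEEP:
        if (e == EV_WAKEUP)
            return RUN;
        return SLEEP;
    }
    return s;
}
```

An ordinary interrupt does not wake the model from `SLEEP`, because in that mode only the power manager is still powered.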

99
Power-Saving Modes of the StrongARM SA-1100