Title: Storage
1Storage
2Typical I/O Devices
3Why I/O?
- Amdahl's Law
- Speeding up only the CPU
- I/O becomes the bottleneck
- Time_JOB = Time_CPU + Time_I/O - Time_OVERLAP
- Example
- 100 = 90 + 10 (I/O time)
- CPU improves by 50% per year while I/O stays the same
- After 5 years?
- CPU time: 90 / 1.5^5 ≈ 12
- I/O share of total time: 10 / (12 + 10) x 100 ≈ 45%
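The arithmetic in this example can be checked with a short Python sketch. The function name `io_share_after` is made up for illustration, and the overlap term is assumed to be zero, as in the slide's numbers. Note that the slide rounds the CPU time to 12 before dividing; without that rounding the I/O share comes out slightly higher, at roughly 46%.

```python
def io_share_after(cpu_time, io_time, cpu_speedup_per_year, years):
    """Return (new CPU time, new total time, I/O share) when only the CPU speeds up.

    Assumes Time_OVERLAP = 0, so total time = CPU time + I/O time.
    """
    new_cpu = cpu_time / (cpu_speedup_per_year ** years)
    total = new_cpu + io_time
    return new_cpu, total, io_time / total


# Slide example: 90 units of CPU time, 10 units of I/O time,
# CPU 50% faster each year (factor 1.5), observed after 5 years.
new_cpu, total, io_share = io_share_after(90.0, 10.0, 1.5, 5)
print(f"CPU time {new_cpu:.2f}, total {total:.2f}, I/O share {io_share:.1%}")
```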
4I/O Performance Measure
- Throughput
- Bandwidth (Big Image File)
- I/Os per second (ATM terminals)
- Latency
- Response time
5I/O Characteristics
- Supercomputer I/O benchmark
- Data throughput important
- Number of bytes per second for large files
- Transaction processing
- Both response time and throughput requirements
- I/O rate important
- Number of disk accesses per second
- Random accesses
- Example: bank transaction
- 15 minutes to check your balance?
- Maximum 10 transactions at the same time?
- Timesharing file systems
- Small files important
- Sequential access
- Many creates and deletes
6Device Characteristics
- Behavior
- Input (read once)
- Output (write once)
- Storage (read many times, usually rewritten)
- Partner
- Human
- Machine
- Data rate
- Peak transfer rate
7I/O Device Diversity
8Magnetic Disks
9Disk Operations
- Seek time: time to move the head to the right track
- Rotational delay: time for the right sector to rotate under the head
- Transfer time: depends on the transfer rate
- Overhead: controller time
10Disk Read Time
Q: How long does it take to read/write a 512-byte sector?
- 10,000 rotations per minute (RPM)
- Average seek time: 6 ms
- Transfer rate: 50 MB/sec
- Controller overhead: 0.2 ms
- Disk is idle (no waiting time)
Average disk access time
= average seek time + average rotational delay + transfer time + controller overhead
= 6 ms + 3.0 ms (0.5 rotation / 10,000 RPM) + 0.01 ms (0.5 KB / 50 MB/sec) + 0.2 ms
= 9.2 ms
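The same calculation can be written as a small Python sketch, which makes it easy to try other drive parameters. The function name `disk_access_time_ms` is made up for illustration, and 1 MB is assumed to mean 10^6 bytes.

```python
def disk_access_time_ms(rpm, seek_ms, transfer_mb_per_s, sector_bytes, overhead_ms):
    """Average access time for one sector on an idle disk, in milliseconds."""
    rotation_ms = 60_000.0 / rpm                 # time for one full rotation
    rotational_delay_ms = 0.5 * rotation_ms      # on average, half a rotation
    # Assumption: 1 MB = 1,000,000 bytes.
    transfer_ms = sector_bytes / (transfer_mb_per_s * 1_000_000) * 1000.0
    return seek_ms + rotational_delay_ms + transfer_ms + overhead_ms


# Slide parameters: 10,000 RPM, 6 ms seek, 50 MB/sec, 512-byte sector, 0.2 ms overhead.
t = disk_access_time_ms(10_000, 6.0, 50.0, 512, 0.2)
print(f"average access time: {t:.2f} ms")
```

Seek time and rotational delay dominate the total; the transfer of a single sector contributes only about 0.01 ms.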
11Defining Reliability and Availability
- A system toggles between
- Service accomplishment: service matches specifications
- Service interruption: service deviates from specs
- The toggle is caused by failures and restorations
- Reliability measures continuous service accomplishment and is usually expressed as mean time to failure (MTTF)
- Availability measures the fraction of time that service matches specifications, expressed as MTTF / (MTTF + MTTR), where MTTR is the mean time to repair
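As a quick illustration of the availability formula, here is a minimal Python sketch. The MTTF and MTTR values in the usage lines are hypothetical and do not come from the slides.

```python
def availability(mttf_hours, mttr_hours):
    """Fraction of time the service matches its specification."""
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical numbers: a device that fails on average every 1,000,000 hours
# and takes 24 hours to repair or replace.
a = availability(1_000_000, 24)
print(f"availability: {a:.6f}")
```

Shortening the repair time raises availability even when the MTTF is unchanged.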
12RAID
- Reliability and availability are important metrics for disks
- RAID
- Redundant array of inexpensive (independent) disks
- Redundancy can deal with one or more failures
- Each sector of a disk records check information that allows it to determine if the disk has an error or not (in other words, redundancy already exists within a disk)
- When the disk read flags an error, we turn elsewhere for correct data
13RAID levels 0-6
[Figure: data organization on multiple disks]
- RAID 0: multiple disks for higher data rate, no redundancy
- RAID 1: mirrored disks
- RAID 2: error-correcting code
- RAID 3: bit- or byte-level striping with parity/checksum disk
- RAID 4: parity/checksum applied to blocks, not bits or bytes
- RAID 5: parity/checksum distributed across several disks
- RAID 6: parity and 2nd check distributed across several disks
14RAID 0 and RAID 1
- RAID 0 has no additional redundancy (a misnomer); it uses an array of disks and stripes (interleaves) data across the array to improve parallelism and throughput
- RAID 1 mirrors or shadows every disk; every write happens to two disks
- Reads to the mirror may happen only when the primary disk fails, or you may try to read both together and accept the quicker response
- Expensive solution: high reliability at twice the cost
15RAID 3
- Data is bit-interleaved across several disks and a separate disk maintains parity information for a set of bits
- For example, with 8 data disks, bit 0 is in disk-0, bit 1 is in disk-1, ..., bit 7 is in disk-7; disk-8 maintains parity for all 8 bits
- For any read, 8 disks must be accessed (as we usually read more than a byte at a time) and for any write, 9 disks must be accessed as parity has to be re-calculated
- High throughput for a single request, low cost for redundancy (overhead 12.5%: 9 disks for 8 disks of data), low task-level parallelism
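The 8-data-disk example can be sketched in Python to show how the parity bit on disk-8 allows a lost bit to be rebuilt. Parity is computed as the XOR of the data bits. The function names and the sample bit pattern are made up for illustration.

```python
from functools import reduce


def parity(bits):
    """XOR of all bits: 0 if the number of 1s is even, 1 otherwise."""
    return reduce(lambda a, b: a ^ b, bits, 0)


def reconstruct(data_bits, parity_bit, failed_disk):
    """Rebuild the bit of one failed data disk from the survivors plus parity."""
    survivors = [b for i, b in enumerate(data_bits) if i != failed_disk]
    return parity(survivors + [parity_bit])


# One bit per data disk (disk-0 .. disk-7); the parity bit lives on disk-8.
data = [1, 0, 1, 1, 0, 0, 1, 0]
p = parity(data)

# Pretend disk-3 has failed and recover its bit from the other 8 disks.
recovered = reconstruct(data, p, failed_disk=3)
print(recovered == data[3])
```

This also shows why a reconstruction has to touch every surviving disk: the missing bit is the XOR of all the others.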
16RAID 4 and RAID 5
- Data is block interleaved
- Allows getting all the data from a single disk on a read
- In case of a disk error, read all 9 disks
- Block interleaving reduces throughput for a single request
- Only a single disk drive is servicing the request
- But improves task-level parallelism as other disk drives are free to service other requests
- On a write, access the data and the parity disk
- Parity information can be updated simply by checking if the new data differs from the old data
17Optimization for Small Writes (RAID4)
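The parity update described for RAID 4 writes can be sketched as follows: wherever the new data differs from the old data, the matching parity bit flips, which is an XOR of old parity, old data, and new data. This is the standard read-modify-write formulation and is assumed to be what the slide's figure shows. The block values and function names are made up for illustration, with each block modeled as a small integer.

```python
def small_write_parity(old_data, new_data, old_parity):
    """New parity after overwriting one data block.

    (old_data ^ new_data) marks the bits that changed; XOR-ing that
    into the old parity flips exactly those parity bits.
    """
    return old_parity ^ old_data ^ new_data


def full_parity(blocks):
    """Parity recomputed from every data block (the expensive way)."""
    result = 0
    for block in blocks:
        result ^= block
    return result


blocks = [0b1010, 0b0110, 0b1111, 0b0001]
old_parity = full_parity(blocks)

# Overwrite block 1: only the old data block and the old parity are read.
new_value = 0b0011
new_parity = small_write_parity(blocks[1], new_value, old_parity)
blocks[1] = new_value

print(new_parity == full_parity(blocks))
```

The small write reads two disks and writes two disks, regardless of how many data disks are in the array.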
18RAID 5
- If we have a single disk for parity, multiple writes cannot happen in parallel (as all writes must update parity info)
- RAID 5 distributes the parity block to allow simultaneous writes
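A small Python sketch can show why distributing parity helps. The rotation rule below is one simple, hypothetical placement; real arrays use various layouts, and the slides do not specify one. A write is modeled as touching one data disk and the parity disk of its stripe.

```python
def raid4_parity_disk(stripe, num_disks):
    """RAID 4: parity always lives on the last disk."""
    return num_disks - 1


def raid5_parity_disk(stripe, num_disks):
    """RAID 5 (hypothetical rotation): parity moves one disk per stripe."""
    return (num_disks - 1 - stripe) % num_disks


def disks_touched(stripe, data_disk, num_disks, parity_fn):
    """A small write touches the data disk and that stripe's parity disk."""
    p = parity_fn(stripe, num_disks)
    if data_disk == p:
        raise ValueError("data_disk holds parity for this stripe")
    return {data_disk, p}


def can_run_in_parallel(write_a, write_b, num_disks, parity_fn):
    """Two writes are independent if they share no disk."""
    a = disks_touched(*write_a, num_disks, parity_fn)
    b = disks_touched(*write_b, num_disks, parity_fn)
    return a.isdisjoint(b)


# Two writes given as (stripe, data_disk) on a 5-disk array.
w1, w2 = (0, 0), (1, 1)
print(can_run_in_parallel(w1, w2, 5, raid4_parity_disk))  # both need disk 4
print(can_run_in_parallel(w1, w2, 5, raid5_parity_disk))  # disks {0,4} vs {1,3}
```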
19RAID Review
- RAID 1-5 can tolerate a single fault: mirroring (RAID 1) has a 100% overhead, while parity (RAID 3, 4, 5) has modest overhead
- Can tolerate multiple faults by having multiple check functions; each additional check can cost an additional disk (RAID 6)
- RAID 6 and RAID 2 (memory-style ECC) are not commercially employed
20Flash Memory
- Pros
- Small, light-weight, robust
- Low power consumption
- Fast read access times, comparable to those of DRAM
- Cons
- Much slower write access times
- No in-place update: an erase operation is needed first
- Erase operations can only be performed in a much larger unit than the write operation
- Limited lifetime
- Bad blocks
21Flash Memory
22NOR vs. NAND Flash
- NOR Flash Memories
- Random, direct access interface
- Fast read performance
- Slow erase and write
- Boot image, BIOS, Cell phone, etc.
- NAND Flash Memories
- I/O mapped access
- Smaller die size
- Better performance for erase and write
- Solid state file storage, MP3, digital camera,
etc.
23Characteristics of Flash Memory
24Flash-Based Storage
25Flash Memory Operations
- Operations
- Read
- Write (program): change state from 1 to 0
- Erase: change state from 0 to 1
- Unit
- Page (sector): 2 KB
- Management or program unit
- Block: 128 KB
- Erase unit
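The program/erase rules above can be modeled with a minimal Python sketch. The class `FlashBlock` is hypothetical and not a real device API. It uses the slide's sizes (2 KB pages, 128 KB blocks, so 64 pages per block) and models programming as a bitwise AND, since a program operation can only change bits from 1 to 0.

```python
PAGE_SIZE = 2048          # 2 KB page (program unit)
PAGES_PER_BLOCK = 64      # 128 KB block (erase unit) / 2 KB page


class FlashBlock:
    """Hypothetical model of one flash block."""

    def __init__(self):
        # An erased cell reads as 1, so every byte starts as 0xFF.
        self.pages = [bytearray(b"\xff" * PAGE_SIZE) for _ in range(PAGES_PER_BLOCK)]
        self.erase_count = 0

    def program(self, page_no, data):
        """Program one page: bits can only go from 1 to 0."""
        if len(data) != PAGE_SIZE:
            raise ValueError("must program a whole page")
        page = self.pages[page_no]
        for i, byte in enumerate(data):
            page[i] &= byte

    def erase(self):
        """Erase the whole block: every bit goes back to 1."""
        for page in self.pages:
            page[:] = b"\xff" * PAGE_SIZE
        self.erase_count += 1

    def read(self, page_no):
        return bytes(self.pages[page_no])


block = FlashBlock()
block.program(0, b"\x0f" * PAGE_SIZE)
block.program(0, b"\xf0" * PAGE_SIZE)   # attempted in-place update
corrupted = block.read(0)               # 0x0F & 0xF0 = 0x00, not 0xF0

block.erase()                           # the whole 128 KB block is reset
block.program(0, b"\xf0" * PAGE_SIZE)
updated = block.read(0)
```

The second program call illustrates why there is no in-place update: the new value only lands correctly after the entire block has been erased.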
26Bus
- Shared communication link
- Single set of wires used to connect multiple subsystems
- A fundamental tool for composing large, complex systems
- Systematic means of abstraction
27Bus Design
- The bus is a shared resource
- Any device can send data on the bus (after first arbitrating for it)
- And all other devices can read this data off the bus
- The address/control signals on the bus
- Specify the intended receiver of the message
- The length of the bus
- Determines its speed (hence, a hierarchy makes sense)
- Buses can be synchronous or asynchronous
- SYNCH: a clock determines when each operation must happen
- ASYNCH: a handshaking protocol is used to coordinate operations
28Advantages of Buses
- Versatility
- New devices can be added easily
- Peripherals can be moved between computer systems that use the same bus standard
- Low cost
- A single set of wires is shared in multiple ways
- Manage complexity by partitioning the design
29Disadvantage of Buses
- A bus creates a communication bottleneck
- Bandwidth of the bus can limit the maximum I/O throughput
- Maximum bus speed is largely limited by
- Length of the bus
- Number of devices on the bus
- Need to support a range of devices with
- Widely varying latencies
- Widely varying data transfer rates
30The General Organization of a Bus
- Control lines
- Signal requests and acknowledgments
- Indicate what type of information is on the data lines
- Data lines
- Carry information between the source and the destination
- Data and addresses
- Complex commands
31What defines a bus?
32Types of Buses
- Processor-Memory Bus (design specific)
- Short and high speed
- Only needs to match the memory system
- Maximize memory-to-processor bandwidth
- Connects directly to the processor
- Optimized for cache block transfers
- I/O Bus (industry standard)
- Usually lengthy and slower
- Needs to match a wide range of I/O devices
- Connects to the processor-memory bus or backplane bus
- Example: SCSI
- Backplane Bus (standard or proprietary)
- Backplane: an interconnection structure within the chassis
- Allows processors, memory, and I/O devices to coexist
- Cost advantage: one bus for all components
33One-Bus System
- A single bus (the backplane bus) is used for
- Processor to memory communication
- Communication between I/O devices and memory
- Advantages: simple and low cost
- Disadvantages: slow, and the bus can become a major bottleneck
- Example: IBM PC-AT
34Two-Bus System
- I/O buses tap into the processor-memory bus via bus adaptors
- Processor-memory bus: mainly for processor-memory traffic
- I/O buses: provide expansion slots for I/O devices
- Example: Apple Macintosh II
- NuBus: processor, memory, and a few selected I/O devices
- SCSI bus: the rest of the I/O devices
35Three-Bus System
- A few backplane buses tap into the processor-memory bus
- Processor-memory bus is used for processor-memory traffic
- I/O buses are connected to the backplane bus
- Advantage: loading on the processor bus is greatly reduced
- Example: IBM RS/6000
36PC Bus
37Synchronous and Asynchronous Bus
- Synchronous Bus
- Includes a clock in the control lines
- A fixed protocol for communication that is relative to the clock
- Advantage: involves very little logic and can run very fast
- Disadvantages
- Every device on the bus must run at the same clock rate
- To avoid clock skew, synchronous buses cannot be long if they are fast
- Asynchronous Bus
- Not clocked
- Requires a handshaking protocol
- Can accommodate a wide range of devices
- Can be lengthened without worrying about clock skew
38Bus Communication Protocols
Synchronous bus with fixed-latency devices.
Handshaking on an asynchronous bus for an input
operation (e.g., reading from memory).
39Synchronous OR Asynchronous?
- Processor-memory bus
- I/O bus
- Determined by
- Speed
- Distance
- Number of devices
40Bus Bandwidth
- Data bus width
- Separate (vs. multiplexed) address and data lines
- Block transfers
- Split-transaction bus
41Interfacing I/O Devices
- Processor
- How is a user I/O request transformed into a device command and communicated to the device?
- Memory
- How is data transferred to/from memory?
- OS
- What is the role of the OS?
42Giving Commands to I/O Devices
- Two methods are used to address the device
- Special I/O instructions
- Memory-mapped I/O
- Special I/O instructions specify
- Both the device number and the command word
- Device number: the processor communicates this via a set of wires normally included as part of the I/O bus
- Command word: this is usually sent on the bus data lines
- Memory-mapped I/O
- Portions of the address space are assigned to I/O devices
- Reads and writes to those addresses are interpreted as commands to the I/O devices
- User programs are prevented from issuing I/O operations directly
- The I/O address space is protected by the address translation
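Memory-mapped I/O can be illustrated with a toy Python model in which loads and stores to a reserved address range are routed to a device instead of RAM. Everything here is hypothetical: the classes `ConsoleDevice` and `AddressSpace`, the base address `0x8000`, and the register layout. Protection through address translation is not modeled.

```python
class ConsoleDevice:
    """Hypothetical device with a single command register at offset 0."""

    def __init__(self):
        self.commands = []

    def write_register(self, offset, value):
        if offset != 0:
            raise ValueError("this sketch only models register 0")
        self.commands.append(value)

    def read_register(self, offset):
        # Acts as a status register: number of commands received so far.
        return len(self.commands)


class AddressSpace:
    def __init__(self, ram_size):
        self.ram = bytearray(ram_size)
        self.mappings = []  # list of (base, size, device)

    def map_device(self, base, size, device):
        self.mappings.append((base, size, device))

    def _find(self, addr):
        for base, size, device in self.mappings:
            if base <= addr < base + size:
                return device, addr - base
        return None, addr

    def store(self, addr, value):
        device, offset = self._find(addr)
        if device is not None:
            device.write_register(offset, value)  # interpreted as a command
        else:
            self.ram[addr] = value                # ordinary memory write

    def load(self, addr):
        device, offset = self._find(addr)
        if device is not None:
            return device.read_register(offset)
        return self.ram[addr]


space = AddressSpace(ram_size=0x1000)
console = ConsoleDevice()
space.map_device(base=0x8000, size=4, device=console)

space.store(0x0010, 7)      # lands in RAM
space.store(0x8000, 0x41)   # reaches the device as a command
```

The same store operation is used in both cases; only the address decides whether memory or a device responds.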
43I/O Device Notifying the OS
- The OS needs to know when
- The I/O device has completed an operation
- The I/O operation has encountered an error
- This can be accomplished in two different ways
- Polling
- The I/O device puts information in a status register
- The OS periodically checks the status register
- Advantage: the processor is totally in control and does all the work
- Disadvantage: polling overhead can consume a lot of CPU time
- I/O Interrupt
- Whenever an I/O device needs attention from the processor, it interrupts the processor from what it is currently doing
- Advantage: user program progress is only halted during the actual transfer
- Disadvantage: extra HW is needed to cause and detect the interrupt
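To get a feel for how much CPU time polling can consume, the following Python sketch estimates the fraction of processor cycles spent checking a status register. All numbers (clock rate, cycles per poll, polling rates) are hypothetical and chosen only for illustration.

```python
def polling_overhead(polls_per_second, cycles_per_poll, clock_hz):
    """Fraction of CPU cycles spent polling one device."""
    return polls_per_second * cycles_per_poll / clock_hz


CLOCK_HZ = 500_000_000    # hypothetical 500 MHz processor
CYCLES_PER_POLL = 400     # hypothetical cost of one poll, including OS overhead

# A slow device polled 30 times per second versus a fast device
# that must be polled 250,000 times per second to avoid losing data.
slow = polling_overhead(30, CYCLES_PER_POLL, CLOCK_HZ)
fast = polling_overhead(250_000, CYCLES_PER_POLL, CLOCK_HZ)
print(f"slow device: {slow:.4%} of CPU, fast device: {fast:.0%} of CPU")
```

Under these assumptions polling is negligible for the slow device but takes a fifth of the processor for the fast one, which is the case where interrupts become attractive.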
44I/O Interrupt
- An I/O interrupt is just like an exception, except
- An I/O interrupt is asynchronous
- Further information needs to be conveyed
- An I/O interrupt is asynchronous with respect to instruction execution
- An I/O interrupt is not associated with any instruction
- An I/O interrupt does not prevent any instruction from completing
- You can pick your own convenient point to take an interrupt
- An I/O interrupt is more complicated than an exception
- Needs to convey the identity of the device generating the interrupt
- Interrupt requests can have different urgencies
- Interrupt requests need to be prioritized
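The last two points, identifying the device and prioritizing requests, can be sketched with a small priority queue. The class `InterruptController`, the device names, and the priority values are hypothetical; real hardware implements this in an interrupt controller rather than in software.

```python
import heapq


class InterruptController:
    """Hypothetical model: pending requests ordered by urgency."""

    def __init__(self):
        self._pending = []
        self._seq = 0  # tie-breaker: equal priorities are served first-come

    def request(self, device_id, priority):
        """A device raises an interrupt; a higher priority is more urgent."""
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._pending, (-priority, self._seq, device_id))
        self._seq += 1

    def next_interrupt(self):
        """Return the identity of the device to service next, or None."""
        if not self._pending:
            return None
        _, _, device_id = heapq.heappop(self._pending)
        return device_id


ic = InterruptController()
ic.request("keyboard", priority=1)
ic.request("disk", priority=5)
ic.request("timer", priority=5)

order = [ic.next_interrupt() for _ in range(3)]
print(order)
```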
45DMA
- Delegating I/O responsibility from the CPU
- Direct Memory Access (DMA)
- External to the CPU
- Acts as a master on the bus
- Transfers blocks of data to or from memory without CPU intervention
- CPU sends the following to the DMA controller (DMAC)
- Starting address
- Direction
- Length count
- Then issues start
[Figure: CPU, memory, DMAC, I/O controller (IOC), and device]
- DMAC provides
- Handshake signals for the peripheral controller
- Memory addresses and handshake signals for memory
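The CPU-to-DMAC interaction above (starting address, direction, length count, then start) can be sketched as a toy Python model. The class `DMAController`, its method names, and the direction strings are hypothetical. Bus arbitration and handshake signals are not modeled; the block copy stands in for the whole transfer.

```python
class DMAController:
    """Hypothetical DMAC that copies a block without CPU involvement."""

    DIRECTIONS = ("device_to_memory", "memory_to_device")

    def __init__(self, memory):
        self.memory = memory
        self.start_address = 0
        self.direction = None
        self.length = 0

    def program(self, start_address, direction, length):
        """What the CPU sends: starting address, direction, length count."""
        if direction not in self.DIRECTIONS:
            raise ValueError("unknown direction")
        if start_address < 0 or start_address + length > len(self.memory):
            raise ValueError("transfer does not fit in memory")
        self.start_address = start_address
        self.direction = direction
        self.length = length

    def start(self, device_buffer):
        """Run the transfer; returns the number of bytes moved."""
        if self.direction is None:
            raise RuntimeError("DMAC has not been programmed")
        if len(device_buffer) < self.length:
            raise ValueError("device buffer is too small")
        lo, hi = self.start_address, self.start_address + self.length
        if self.direction == "device_to_memory":
            self.memory[lo:hi] = device_buffer[:self.length]
        else:
            device_buffer[:self.length] = self.memory[lo:hi]
        return self.length


memory = bytearray(64)
disk_buffer = bytearray(b"sector-data")

dmac = DMAController(memory)
dmac.program(start_address=16, direction="device_to_memory", length=len(disk_buffer))
moved = dmac.start(disk_buffer)   # the CPU is free to do other work meanwhile
```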
46Coherence Problem
- The value of a memory location seen by DMA and the CPU may differ
- May cause a stale data problem
- Solutions
- Route the I/O activity through the cache
- OS selectively invalidates the cache for an I/O read, or forces write-backs to occur for an I/O write
- HW mechanism for selectively flushing (or invalidating) cache entries
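The stale data problem and the invalidate/write-back remedies can be demonstrated with a toy write-back cache in Python. The class `WriteBackCache` is hypothetical, caches single addresses rather than blocks, and DMA is modeled as a direct write to or read from the `memory` list that bypasses the cache.

```python
class WriteBackCache:
    """Hypothetical one-word-per-line write-back cache in front of memory."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]   # miss: fetch from memory
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value                   # memory is NOT updated yet
        self.dirty.add(addr)

    def invalidate(self, addr):
        self.lines.pop(addr, None)
        self.dirty.discard(addr)

    def write_back(self, addr):
        if addr in self.dirty:
            self.memory[addr] = self.lines[addr]
            self.dirty.discard(addr)


memory = [0] * 16
cache = WriteBackCache(memory)

# I/O read (device -> memory): DMA updates memory behind the cache's back.
cache.read(4)                 # CPU caches the old value 0
memory[4] = 99                # DMA deposits new data in memory
stale = cache.read(4)         # CPU still sees 0
cache.invalidate(4)           # OS invalidates the affected entry
fresh = cache.read(4)         # now 99

# I/O write (memory -> device): DMA would read memory, which is out of date.
cache.write(8, 42)            # newest value lives only in the cache
before_flush = memory[8]      # DMA would send 0
cache.write_back(8)           # OS forces the write-back first
after_flush = memory[8]       # DMA now sends 42
```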
47Summary
- Average disk access time
- Seek time + rotational delay + transfer time + controller overhead
- RAID
- Multiple components decrease MTTF
- Increase MTTF by having redundant information (i.e., parity)
- Flash memory
- NOR vs. NAND
- Fast read but slow write (need to erase block before write)
- Bus
- Systematic means of composing a large system
- Synchronous vs. asynchronous
- Interfacing I/O devices
- Polling vs. interrupt vs. DMA
- Coherence handling