Computer Architecture

1
Computer Architecture

Chapter 6
Storage Systems
Prof. Jerry Breecher
CSCI 240
Fall 2003

2
Chapter Overview

6.1 Introduction
6.2 Types of Storage Devices
6.3 Busses - Connecting IO Devices to
CPU/Memory. Interrupts etc. How is data
transferred.
6.5 Reliability, Availability and RAID

3
Introduction
The Big Picture: Where Are We Now?
6.1 Introduction
6.2 Types of Storage Devices
6.3 Busses - Connecting IO Devices to
CPU/Memory. Interrupts etc. How is data
transferred.
6.5 Reliability, Availability and RAID
We will look at how devices (especially disks)
are put together. We'll look at how to connect
IO devices to the CPU. And then we'll look at
RAID, the brainchild of Patterson and his buddies.
4
The Processor Picture
5
The Processor Picture
Processor/Memory Bus
PCI Bus
I/O Busses
6
The Processor Picture
Memory
I/O
7
Types of Storage Devices
In this section we will take a quick look at
how disks work. This is only one example of IO,
but we will save networks, tapes, etc. for
another course.

6.1 Introduction
6.2 Types of Storage Devices
6.3 Busses - Connecting IO Devices to
CPU/Memory.
Interrupts etc. How is data transferred.
6.5 Reliability, Availability and RAID

8
Disk Device Terminology
Types of Storage Devices

Purpose:
Long-term, nonvolatile storage
Large, inexpensive, slow level in the storage
hierarchy
Bus Interface:
IDE
SCSI (Small Computer System Interface)
Fibre Channel
Transfer rate:
About 120 Mbyte/second through the interface
bus.
About 5 Mbyte/second off of heads.
Data is moved in blocks
Capacity:
Approaching 100 Gigabytes
Quadruples every 3 years (aerodynamics)
Can be grouped together to get terabytes of data.

9
Disk Device Terminology
Types of Storage Devices
Example: Seagate Cheetah ST3146807FC (server)
147 Gigabytes, 10,000 RPM, 4.7 ms avg seek time
Fibre Channel interface, price 499.00
4 disks, 8 heads
290,000,000 total sectors
50,000 cylinders
Average of 6,000 sectors/cylinder, or 800 sectors/track
(but different amounts on each track)
MTBF: 1,200,000 hours
http://www.seagate.com/cda/products/discsales/marketing/detail/0,1121,355,00.html
10
Disk Device Terminology
Types of Storage Devices
These have 4X more capacity than in 2001!
Example: Seagate Barracuda ST320822A (desktop)
200 Gigabytes, 7,200 RPM, 8.5 ms avg seek time
ATA interface, price 299.00
2 disks, 4 heads
390,000,000 total sectors
24,000 cylinders
Average of 16,000 sectors/cylinder, or 4,000 sectors/track
(but different amounts on each track)
MTBF: ???????????? hours
http://www.seagate.com/support/disc/manuals/fc/100195490b.pdf
11
Performance of Magnetic Disks
Types of Storage Devices
[Figure: disk anatomy labeling track, sector, cylinder, platter, and head, plus the controller electronics (read cache, write cache, control, data)]
15,000 RPM = 250 RPS => 4 ms per revolution
Average rotational latency = 2 ms
500 sectors per track => 0.008 ms per sector
512 bytes per sector => 64 MB/s off the platter
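The rotation arithmetic on this slide can be recomputed directly from its three inputs (15,000 RPM, 500 sectors per track, 512 bytes per sector). The sketch below is only an illustration; the function name `disk_timing` is an assumption, and it ignores seek time, controller overhead, and the fact that real tracks hold different numbers of sectors.

```python
def disk_timing(rpm, sectors_per_track, bytes_per_sector):
    """Return (ms per revolution, average rotational latency in ms,
    ms per sector, media transfer rate in bytes/second)."""
    revs_per_second = rpm / 60
    ms_per_rev = 1000 / revs_per_second
    # On average the wanted sector is half a revolution away.
    avg_rotational_latency_ms = ms_per_rev / 2
    ms_per_sector = ms_per_rev / sectors_per_track
    bytes_per_second = bytes_per_sector / (ms_per_sector / 1000)
    return ms_per_rev, avg_rotational_latency_ms, ms_per_sector, bytes_per_second


ms_rev, latency, ms_sector, rate = disk_timing(15_000, 500, 512)
print(f"{ms_rev:.1f} ms/rev, {latency:.1f} ms avg latency, "
      f"{ms_sector:.3f} ms/sector, {rate / 1e6:.0f} MB/s")
```

Changing `rpm` to 7,200 or 10,000 shows how the slower desktop and server drives from the earlier slides would compare under the same simplifications.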
12
Busses
In this section we will look at various bus
mechanisms. In very simple terms, a bus is the
connection between various chips/components in
the computer. The bus is responsible for sending
data/control between these various components.

6.1 Introduction
6.2 Types of Storage Devices
6.3 Busses - Connecting IO Devices to
CPU/Memory
6.4 I/O Performance Measures
6.5 Reliability, Availability and RAID

13
Interconnect Trends
Busses

Interconnect: glue that interfaces computer
system components
High-speed hardware interfaces plus logical
protocols
Networks, channels, backplanes

             Network          Channel          Backplane
Connects     Machines         Devices          Chips
Distance     >1000 m          10 - 100 m       0.1 m
Bandwidth    10 - 1000 Mb/s   40 - 1000 Mb/s   320 - 2000 Mb/s
Latency      high (1 ms)      medium           low (nanosecs.)
Reliability  low              medium           high
             Extensive CRC    Byte Parity      Byte Parity

Network end: message-based, narrow pathways, distributed
arbitration
Backplane end: memory-mapped, wide pathways, centralized
arbitration
14
A Computer System with One Bus: Backplane Bus
Busses
[Figure: processor, memory, and I/O devices all attached to a single backplane bus]

A single bus (the backplane bus) is used for:
Processor to memory communication
Communication between I/O devices and memory
Advantages: simple and low cost
Disadvantages: slow, and the bus can become a
major bottleneck
Example: IBM PC-AT

15
A Two-Bus System
Busses

I/O buses tap into the processor-memory bus via
bus adaptors:
Processor-memory bus: mainly for processor-memory
traffic
I/O buses: provide expansion slots for I/O
devices
Example: Apple Macintosh II
NuBus: processor, memory, and a few selected I/O
devices
SCSI bus: the rest of the I/O devices

16
A Three-Bus System
Busses

A small number of backplane buses tap into the
processor-memory bus:
Processor-memory bus is only used for
processor-memory traffic
I/O buses are connected to the backplane bus
Advantage: loading on the processor bus is
greatly reduced

17
North/South Bridge Architectures: Separate Busses
Busses
[Figure labels: processor, processor-memory bus, memory, director, backside cache, bus adaptors, I/O buses, backplane bus]

Separate sets of pins for different functions:
Memory bus
Caches
Graphics bus (for fast frame buffer)
I/O busses are connected to the backplane bus
Advantages:
Busses can run at different speeds
Much less overall loading!

18
What defines a bus?
Busses
Transaction protocol
Timing and signaling specification
Bunch of wires
Electrical specification
Physical / mechanical characteristics: the
connectors
19
Synchronous and Asynchronous Bus
Busses

Synchronous Bus:
Includes a clock in the control lines
A fixed protocol for communication that is
relative to the clock
Advantage: involves very little logic and can run
very fast
Disadvantages:
Every device on the bus must run at the same
clock rate
To avoid clock skew, busses cannot be long if
they are fast
Asynchronous Bus:
It is not clocked
It can accommodate a wide range of devices
It can be lengthened without worrying about clock
skew
It requires a handshaking protocol

20
Busses So Far
Busses
Master
Slave

Control Lines
Address Lines
Data Lines

Bus Master: has ability to control the bus,
initiates transaction
Bus Slave: module activated by the transaction
Bus Communication Protocol: specification of
sequence of events and timing requirements in
transferring information.
Asynchronous Bus Transfers: control lines (req,
ack) serve to orchestrate sequencing.
Synchronous Bus Transfers: sequence relative to
common clock.

21
Arbitration: Obtaining Access to the Bus
Busses
[Figure: bus master and bus slave; the master initiates requests on the control lines, and data can go either way]

One of the most important issues in bus design:
How is the bus reserved by a device that wishes
to use it?
Chaos is avoided by a master-slave arrangement:
Only the bus master can control access to the
bus
It initiates and controls all bus requests
A slave responds to read and write requests
The simplest system:
Processor is the only bus master
All bus requests must be controlled by the
processor
Major drawback: the processor is involved in
every transaction

22
The Daisy Chain Bus Arbitration Scheme
Busses
[Figure: the bus arbiter passes the grant line through Device 1 (highest priority), Device 2, ... Device N (lowest priority); the devices share request (wired-OR) and release lines back to the arbiter]

Order is:
Request
Grant
Release

Advantage: simple
Disadvantages:
Cannot assure fairness. A low-priority
device may be locked out indefinitely
The use of the daisy chain grant signal also
limits the bus speed
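A minimal sketch of the daisy-chain idea, assuming the grant simply travels down the chain until the first requesting device absorbs it. The function names are illustrative, and timing and the release step are not modeled.

```python
def daisy_chain_grant(requests):
    """requests[i] is True if device i wants the bus.
    Device 0 sits closest to the arbiter, so it has the highest priority.
    Returns the index of the device that keeps the grant, or None."""
    if not any(requests):            # wired-OR request line not asserted
        return None
    for device, wants_bus in enumerate(requests):
        if wants_bus:                # this device absorbs the grant
            return device
        # otherwise the device passes the grant to its neighbor
    return None


def count_grants(rounds, requests):
    """Repeat the arbitration with the same request pattern and count
    how often each device wins."""
    wins = [0] * len(requests)
    for _ in range(rounds):
        winner = daisy_chain_grant(requests)
        if winner is not None:
            wins[winner] += 1
    return wins


# Device 0 requests every round, so device 2 never gets the bus.
print(count_grants(10, [True, False, True]))
```

The starvation of the last device in this run is the fairness problem the slide lists as a disadvantage.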

23
Simple Synchronous Protocol
Busses
[Figure: timing diagram with Clock, Bus Request, Bus Grant, R/W and Address (Cmd+Addr), and Data (Data1, Data2) waveforms]

Even memory busses are more complex than this:
memory (slave) may take time to respond
it may need to control data rate

24
Asynchronous Handshake (4-phase)
Busses
Write Transaction
[Figure: timing diagram with Address, Data, Read, Request, and Acknowledge waveforms; the master asserts the address and the data, then the next address, with events marked t0 through t5]

t0: Master has obtained control and asserts
address, direction (not read), data. Waits a
specified amount of time for slaves to decode
target
t1: Master asserts request line
t2: Slave asserts ack, indicating data received
t3: Master releases req
t4: Slave releases ack
This is Fig. 6.11
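The write handshake can be walked through in software as a sequence of signal changes. This is only a sketch: the bus is modeled as a dictionary of signal values, the slave's storage as a plain dict, and the function name `four_phase_write` is an assumption. Real hardware does these steps with wires and delays, not function calls.

```python
def four_phase_write(address, data, memory):
    """Step through one asynchronous write (t0..t4) and return the
    event trace plus the final state of the bus signals."""
    bus = {"address": None, "data": None, "read": None, "req": 0, "ack": 0}
    trace = []

    # t0: master drives address, direction (not read), and data
    bus.update(address=address, data=data, read=False)
    trace.append("t0: master asserts address, direction, data")

    # t1: master asserts the request line
    bus["req"] = 1
    trace.append("t1: req asserted")

    # t2: slave sees req, latches the data, and asserts ack
    if bus["req"] and not bus["read"]:
        memory[bus["address"]] = bus["data"]
        bus["ack"] = 1
    trace.append("t2: ack asserted, data received")

    # t3: master sees ack and releases req
    if bus["ack"]:
        bus["req"] = 0
    trace.append("t3: req released")

    # t4: slave sees req released and releases ack
    if not bus["req"]:
        bus["ack"] = 0
    trace.append("t4: ack released")
    return trace, bus


slave_memory = {}
events, final_bus = four_phase_write(0x10, 0xAB, slave_memory)
print("\n".join(events))
```

Each step only proceeds after observing the other side's signal, which is why no shared clock is needed.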
25
Read Transaction
Busses
[Figure: timing diagram with Address, Data, Read, Req, and Ack waveforms; the master asserts the address, the slave drives the data, then the master asserts the next address, with events marked t0 through t5]

t0: Master has obtained control and asserts
address and direction (read).
Waits a specified amount of time for slaves to
decode target
t1: Master asserts request line
t2: Slave asserts ack, indicating ready to
transmit data
t3: Master releases req, data received
t4: Slave releases ack

26
EXAMPLE: PCI Read/Write Transactions
Busses

All signals sampled on rising edge
Centralized parallel arbitration,
overlapped with previous transaction
All transfers are (unlimited) bursts
Address phase starts by asserting FRAME
Next cycle initiator asserts cmd and address
Data transfers happen when:
IRDY is asserted by master when ready to transfer
data
TRDY is asserted by target when ready to transfer
data
Transfer occurs when both are asserted on a rising edge
FRAME is de-asserted when master intends to
complete only one more data transfer
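The IRDY/TRDY rule can be illustrated with a tiny sketch. It treats each signal as a logical "asserted" flag per rising clock edge (ignoring the electrical polarity of the real signals), and the function name `pci_data_phases` is an assumption.

```python
def pci_data_phases(irdy, trdy):
    """irdy[i] and trdy[i] say whether the master and the target are
    ready at rising clock edge i. A data word moves only on edges
    where both are asserted; any other edge is a wait state."""
    return [clock for clock, (master_ready, target_ready)
            in enumerate(zip(irdy, trdy))
            if master_ready and target_ready]


# Target is slow at edge 0, master inserts a wait state at edge 2.
print(pci_data_phases([1, 1, 0, 1], [0, 1, 1, 1]))
```

Either side can stretch a burst this way without aborting it.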

27
EXAMPLE: PCI Read Transaction
Busses
Turn-around cycle on any signal driven by more
than one agent
28
How The CPU Talks To The IO
Interfacing I/O To The Processor

The interface consists of setting up the device
with what operation is to be performed:
Read or Write
Size of transfer
Location on device
Location in memory
Then triggering the device to start the operation
When the operation is complete, the device will
interrupt.

Two ways to address the device:
I/O instructions (in, out) distinct from memory
access instructions.
LDD R0,D,P <-- Load R0 with the contents found at device D, port P.
Device registers are mapped to look like regular
memory.
LD R0,Mem1 <-- Load R0 with the contents found at device D, port P.
This works because an
initialization has correlated the device
characteristics with location Mem1.
29
How The CPU Talks To The IO
Interfacing I/O To The Processor
[Figure: memory map with ROM, RAM, and an I/O region; virtual memory pointing at IO space]
CPU instruction to the IOC: OP, Device, Address
Device: target device
Address: where commands are
IOC command in memory: OP, Addr, Cnt, Other
OP: what to do
Addr: where to put data
Cnt: how much
Other: special requests
(1) CPU issues instruction to IOC
(2) IOC looks in memory for commands
(3) Device to/from memory transfers are controlled by
the IOC directly.
(4) IOC interrupts CPU when done
30
Memory Mapped I/O
Interfacing I/O To The Processor
Some physical addresses are set aside. There is
no REAL memory at these addresses. Instead when
the processor sees these addresses, it knows to
aim the instruction at the IO processor.
ROM
RAM
I/O
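A small sketch of the address-decoding idea behind memory-mapped I/O. Everything here is hypothetical: the `StatusDevice` class, its two registers, and the I/O base address are made up for illustration, and real decoding happens in hardware rather than in an `if` statement.

```python
class StatusDevice:
    """Hypothetical device with two registers at offsets 0 and 4."""
    def __init__(self):
        self.registers = {0: 0x1, 4: 0x0}

    def read_register(self, offset):
        return self.registers[offset]

    def write_register(self, offset, value):
        self.registers[offset] = value


class MemoryMappedBus:
    """Routes loads and stores either to RAM or to the device,
    depending only on the physical address."""
    def __init__(self, ram_size, io_base, device):
        self.ram = [0] * ram_size
        self.io_base = io_base       # addresses at or above this are I/O
        self.device = device

    def load(self, address):
        if address >= self.io_base:  # no real memory here: go to the device
            return self.device.read_register(address - self.io_base)
        return self.ram[address]

    def store(self, address, value):
        if address >= self.io_base:
            self.device.write_register(address - self.io_base, value)
        else:
            self.ram[address] = value


bus = MemoryMappedBus(ram_size=1024, io_base=0x8000, device=StatusDevice())
bus.store(16, 7)            # ordinary memory write
bus.store(0x8004, 0x55)     # same store operation, but it lands in a device register
print(bus.load(16), hex(bus.load(0x8004)))
```

The program uses the same load and store operations in both cases; only the address decides where the access goes.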
31
Transfer Method 1: Programmed I/O (Polling)
Interfacing I/O To The Processor
[Figure: flowchart involving the CPU, memory, the IOC, and the device]
Is the data ready?
no: keep checking (busy wait loop)
yes: read data, then store data
Done?
no: go back and wait for the next data
yes: finished
The busy wait loop is not an efficient way to use the
CPU unless the device is very fast!
But checks for I/O completion can be dispersed
among computationally intensive code.
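The polling flowchart translates almost line for line into a loop. In this sketch the `FakeDevice` class is invented for illustration: it pretends to have data ready on every third status check, so the number of wasted checks is visible.

```python
class FakeDevice:
    """Hypothetical device that is ready on every `ready_every`-th poll."""
    def __init__(self, words, ready_every=3):
        self._words = list(words)
        self._ready_every = ready_every
        self._polls = 0

    def is_ready(self):
        self._polls += 1
        return self._polls % self._ready_every == 0

    def read_data(self):
        return self._words.pop(0)

    def done(self):
        return not self._words


def programmed_io_read(device):
    """Copy all words from the device into memory by polling.
    Returns the stored words and the number of wasted status checks."""
    memory = []
    wasted_polls = 0
    while not device.done():             # done?
        while not device.is_ready():     # is the data ready? (busy wait)
            wasted_polls += 1
        memory.append(device.read_data())  # read data, store data
    return memory, wasted_polls


stored, wasted = programmed_io_read(FakeDevice([10, 20, 30]))
print(stored, "wasted polls:", wasted)
```

Every wasted poll is CPU time that could have gone to the user program, which is the motivation for interrupts.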
32
Device Interrupts
Interfacing I/O To The Processor

An I/O interrupt is just like an exception
except:
An I/O interrupt is asynchronous
Further information needs to be conveyed
An I/O interrupt is asynchronous with respect to
instruction execution:
I/O interrupt is not associated with any
instruction
I/O interrupt does not prevent any instruction
from completion
You can pick your own convenient point to take an
interrupt
I/O interrupt is more complicated than an exception:
Needs to convey the identity of the device
generating the interrupt
Interrupt requests can have different urgencies,
so interrupt requests need to be prioritized

33
Device Interrupts
Interfacing I/O To The Processor

Advantage:
User program progress is only halted during
actual transfer
Disadvantage: special hardware is needed to
Cause an interrupt (I/O device)
Detect an interrupt (processor)
Save the proper states to resume after the
interrupt (processor)

34
Transfer Method 2: Interrupt Driven Data Transfer
Interfacing I/O To The Processor
[Figure: the CPU runs a user program (add, sub, and, or, nop) alongside memory, the IOC, and the device. (1) I/O interrupt, (2) save PC, (3) interrupt service addr, (4) the interrupt service routine (read, store, ..., rti) moves the data to memory]
User program progress only halted during actual
transfer. Interrupt handler code does the
transfer.
1000 transfers at 1000 bytes each:
1000 interrupts @ 2 µsec per interrupt
1000 interrupt services @ 98 µsec each
= 0.1 CPU seconds
Device xfer rate = 10 MBytes/sec
=> 0.1 x 10^-6 sec/byte = 0.1 µsec/byte
=> 1000 bytes = 100 µsec
1000 transfers x 100 µsec = 100 ms = 0.1 CPU seconds
Still far from device transfer rate! 1/2 in
interrupt overhead
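The overhead arithmetic on this slide can be checked with a few lines. The function names are illustrative; the inputs are the slide's own figures (1000 transfers of 1000 bytes, 2 µsec per interrupt, 98 µsec per service routine, a 10 MBytes/sec device).

```python
def interrupt_driven_cpu_seconds(transfers, interrupt_us, service_us):
    """CPU time spent taking and servicing one interrupt per transfer."""
    return transfers * (interrupt_us + service_us) / 1_000_000


def device_transfer_seconds(transfers, bytes_per_transfer, bytes_per_second):
    """Time the device itself needs to move the data."""
    return transfers * bytes_per_transfer / bytes_per_second


cpu = interrupt_driven_cpu_seconds(1000, interrupt_us=2, service_us=98)
device = device_transfer_seconds(1000, 1000, 10_000_000)
print(f"CPU overhead: {cpu:.3f} s, device transfer time: {device:.3f} s")
```

With these numbers the CPU overhead is as large as the raw transfer time, which is the "1/2 in interrupt overhead" observation.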
35
Delegating I/O Responsibility from the CPU: DMA
Interfacing I/O To The Processor
CPU sends a starting address, direction, and
length count to IOC. Then issues "start".

Direct Memory Access (DMA):
External to the CPU
Acts as a master on the bus
Transfers blocks of data to or from memory
without CPU intervention

[Figure: CPU, memory, IOC, and device on the bus]
IOC provides handshake signals for the
Peripheral Controller, and memory addresses and
handshake signals for Memory.
36
Transfer Method 3: Direct Memory Access
Interfacing I/O To The Processor
Time to do 1000 xfers at 1000 bytes each:
1 DMA set-up sequence @ 50 µsec
1 interrupt @ 2 µsec
1 interrupt service sequence @ 48 µsec
= 0.0001 second of CPU time
CPU sends a starting address, direction, and
length count to DMAC. Then issues "start".
[Figure: address space from 0 to n containing ROM, RAM, memory-mapped I/O (peripherals), and IO buffers; CPU, memory, IOC, and device on the bus]
IOC provides handshake signals for the
Peripheral Controller, and memory addresses and
handshake signals for Memory.
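A short calculation makes the DMA saving concrete. The function names are illustrative; the DMA figures come from this slide, and the interrupt-driven figures (2 µsec per interrupt, 98 µsec per service, one interrupt per transfer) come from the Transfer Method 2 slide.

```python
def dma_cpu_seconds(setup_us, interrupt_us, service_us):
    """CPU time for a whole DMA job: one set-up, one completion
    interrupt, and one interrupt service sequence."""
    return (setup_us + interrupt_us + service_us) / 1_000_000


def interrupt_cpu_seconds(transfers, interrupt_us, service_us):
    """CPU time when every transfer raises its own interrupt."""
    return transfers * (interrupt_us + service_us) / 1_000_000


dma = dma_cpu_seconds(setup_us=50, interrupt_us=2, service_us=48)
per_transfer = interrupt_cpu_seconds(1000, interrupt_us=2, service_us=98)
print(f"DMA: {dma:.4f} s, interrupt driven: {per_transfer:.1f} s, "
      f"ratio: {per_transfer / dma:.0f}x")
```

The CPU cost no longer grows with the number of transfers, because the IOC moves the data on its own.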
37
RAID

Redundant Array of Independent Disks
In this section we will:
Motivate a need to have greater reliability and
availability for disk data.
Look at ways to get this greater reliability.

6.1 Introduction
6.2 Types of Storage Devices
6.3 Busses - Connecting IO Devices to
CPU/Memory. Interrupts etc. How is data
transferred.
6.5 Reliability, Availability and RAID
38
Array Reliability
RAID

Reliability of N disks = Reliability of 1 disk / N
1,200,000 hours / 100 disks = 12,000 hours
1 year = 365 x 24 hours, or about 8,700 hours
Disk system MTTF: drops from 140 years to
about 1.5 years!
Arrays (without redundancy) too unreliable to
be useful!

Hot spares support reconstruction in parallel
with access: very high media availability can be
achieved
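The array reliability estimate is easy to reproduce. The sketch below uses the slide's simple model (failures are independent, so the MTTF of N disks is the single-disk MTTF divided by N); the function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24


def array_mttf_hours(disk_mttf_hours, n_disks):
    """MTTF of an array with no redundancy: any single failure loses data."""
    return disk_mttf_hours / n_disks


def hours_to_years(hours):
    return hours / HOURS_PER_YEAR


single = 1_200_000
array = array_mttf_hours(single, 100)
print(f"one disk: {hours_to_years(single):.0f} years, "
      f"100 disks: {array:.0f} hours = {hours_to_years(array):.2f} years")
```

Trying other values of `n_disks` shows how quickly a large, non-redundant array becomes less reliable than any one of its disks.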
39
Redundant Arrays of Disks
RAID
Files are "striped" across multiple spindles
Redundancy yields high data availability
Disks will fail
Contents reconstructed from data redundantly
stored in the array
Capacity penalty to store it
Bandwidth penalty to update
Mirroring/Shadowing (high capacity cost)
Parity techniques
40
Redundant Arrays of Disks RAID 1: Disk
Mirroring/Shadowing
RAID
[Figure: mirrored disk pairs, each pair forming a recovery group]
Each disk is fully duplicated onto its "shadow"
Very high availability can be achieved
Bandwidth sacrifice on write:
Logical write = two physical writes
Reads may be optimized
Most expensive solution: 100% capacity overhead
Targeted for high I/O rate, high availability
environments
Probability of failure (assuming 24 hours MTTR):
24 / (1.2 x 10^6 x 1.2 x 10^6) = 1.7 x 10^-11 per hour,
an MTTF of about 6 x 10^10 hours, or roughly 6,800,000 years
41
Redundant Arrays of Disks RAID 3: Parity Disk
RAID
10010011 11001101 10010011 . . .
P
logical record
1 0 0 1 0 0 1 1
1 1 0 0 1 1 0 1
1 0 0 1 0 0 1 1
0 0 1 1 0 0 0 0
Striped physical records
Parity computed across recovery group to
protect against hard disk failures
33% capacity cost for parity in this configuration
Wider arrays reduce capacity costs, decrease
expected availability, increase reconstruction time
Arms logically synchronized, spindles rotationally synchronized:
logically a single high capacity, high
transfer rate disk
Targeted for high bandwidth applications:
Scientific, Image Processing
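Parity across a recovery group is commonly computed as a bytewise XOR, which is what lets a lost disk be rebuilt from the survivors. The sketch below assumes XOR parity and uses made-up block contents; the function names are illustrative.

```python
from functools import reduce


def parity(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity_block):
    """Rebuild the one missing block: XOR of the survivors and the parity."""
    return parity(list(surviving_blocks) + [parity_block])


# Three data disks plus one parity disk (33% capacity cost for parity).
data = [b"\x12\x34", b"\x0f\xf0", b"\xaa\x55"]   # hypothetical contents
p = parity(data)

# Pretend disk 1 failed and rebuild it from disks 0 and 2 plus parity.
rebuilt = reconstruct([data[0], data[2]], p)
print(rebuilt == data[1])
```

Reconstruction has to read every surviving disk in the group, which is why wider arrays lengthen the reconstruction time.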
42
Redundant Arrays of Disks RAID 5: High I/O Rate
Parity
RAID
A logical write becomes four physical I/Os
Independent writes possible because of interleaved parity
Reed-Solomon Codes ("Q") for protection during reconstruction
Targeted for mixed applications

Disk columns (logical disk addresses increase across each
row and then down; each row is a stripe, each entry a stripe unit):
D0    D1    D2    D3    P
D4    D5    D6    P     D7
D8    D9    P     D10   D11
D12   P     D13   D14   D15
P     D16   D17   D18   D19
D20   D21   D22   D23   P
. . .
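Why a logical write costs four physical I/Os can be seen in a sketch of the usual read-modify-write update, assuming XOR parity: read the old data, read the old parity, write the new data, write the new parity. The stripe is modeled as a plain list of blocks, and the function names are illustrative.

```python
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def raid5_small_write(stripe, data_index, parity_index, new_data):
    """Update one stripe unit and its parity in place.
    Returns the number of physical I/Os performed."""
    ios = 0
    old_data = stripe[data_index]        # I/O 1: read old data
    ios += 1
    old_parity = stripe[parity_index]    # I/O 2: read old parity
    ios += 1
    # New parity = old parity with the old data removed and the new data added.
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    stripe[data_index] = new_data        # I/O 3: write new data
    ios += 1
    stripe[parity_index] = new_parity    # I/O 4: write new parity
    ios += 1
    return ios


# Four data units and their parity (0x01 ^ 0x02 ^ 0x04 ^ 0x08 = 0x0f).
stripe = [b"\x01", b"\x02", b"\x04", b"\x08", b"\x0f"]
print(raid5_small_write(stripe, data_index=1, parity_index=4, new_data=b"\xf0"))
```

Only two disks are touched, so writes to stripes whose parity lives on other disks can proceed at the same time.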
43
Problems of Disk Arrays: Small Writes
RAID
44
Subsystem Organization
RAID
[Figure: host -> host adapter -> array controller (with cache) -> several single board disk controllers]
Host adapter: manages interface to host, DMA
Array controller: control, buffering, parity logic
Single board disk controller: physical device control
Striping software off-loaded from host to array
controller: no application modifications, no
reduction of host performance
45
System Availability: Orthogonal RAIDs
RAID
Data Recovery Group: unit of data redundancy
Redundant Support Components: fans, power
supplies, controller, cables
End to End Data Integrity: internal parity
protected data paths
46
System-Level Availability
RAID
[Figure: fully dual redundant system with two hosts, two I/O controllers, and two cache array controllers, with duplicated paths to the disks of each recovery group]
Goal: No Single Points of Failure
With duplicated paths, higher performance can
be obtained when there are no failures
47
Summary

6.1 Introduction
6.2 Types of Storage Devices
6.3 Busses - Connecting IO Devices to
CPU/Memory.
Interrupts etc. How is data transferred.
6.5 Reliability, Availability and RAID

48
Course Summary

During this course, we've started to learn about
the details of computer architecture. Items
included:
Instruction Sets - especially a glimpse at the
Intel instruction set.
Pipelines - the gyrations necessary to speed up
the processor.
Memory - the various elements in the hierarchy
designed to speed up the effective access to
data.
IO - a brief look at disks, busses, and how they
are put together.