Title: Chapter 12: Mass-Storage Structure
1. Chapter 12: Mass-Storage Structure
- Disk attachment
- SSD (Solid state drive) vs. HDD (Hard disk drive)
- Disk I/O scheduling
- Disk management
- Formatting/partitioning
2. Resources
- Kernel Korner - I/O Schedulers
  - http://www.linuxjournal.com/article/6931
  - Describes soft deadline disk head scheduling and anticipatory scheduling in the Linux 2.6 kernel
- See also the CFQ scheduler (Completely Fair Queuing scheduler)
  - http://en.wikipedia.org/wiki/CFQ
3. Disk Attachment
- Host-attached storage
  - Storage accessed on host through local I/O port
  - Uses hardware bus and host controller (e.g., IDE, ATA, SATA, FireWire, USB, SCSI, FC: Fibre Channel)
- Network-attached storage
  - Special-purpose storage system attached remotely over a data network
  - Clients access servers via RPC (e.g., NFS for Unix systems or CIFS for Windows systems)
- Storage-area network
  - Private network (using storage protocols) connecting file servers and storage units
4. SSD (Solid State Drive)
- SSD provides random access to blocks of device
  - Uniform access time to different blocks
- HDD does not provide uniform-time random access
  - Access times to data blocks differ depending on where the head is at present and on the location of the target block
5. Erase and SSD
- Erase: Writing to the SSD must be preceded by a block erase (sets all bits to 1)
- Lifetime: Measured in the number of erase cycles
- Wear leveling: Firmware or operating system drivers must balance the number of erase cycles done on each block so that the device does not fail prematurely
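The wear-leveling idea above can be sketched as a toy allocator that always reuses the free block with the fewest erase cycles. This is only an illustration: the `WearLeveler` class and its method names are hypothetical, and real firmware does considerably more (for example, relocating long-lived data so that rarely rewritten blocks also share the wear).

```python
class WearLeveler:
    """Toy allocator that always hands out the least-erased free block."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks      # erase cycles seen by each block
        self.free_blocks = set(range(num_blocks))

    def allocate(self):
        """Pick the free block with the fewest erase cycles (ties: lowest index)."""
        block = min(self.free_blocks, key=lambda b: (self.erase_counts[b], b))
        self.free_blocks.remove(block)
        return block

    def release(self, block):
        """Erase a block (one more cycle of wear) and return it to the free pool."""
        self.erase_counts[block] += 1
        self.free_blocks.add(block)


leveler = WearLeveler(num_blocks=3)
for _ in range(9):                     # nine write/erase rounds
    leveler.release(leveler.allocate())
print(leveler.erase_counts)            # wear is spread evenly: [3, 3, 3]
```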
6. SSD Performance
- Single chip has relatively high latencies (SLC NAND)
  - 25 µs to fetch (read) a 4K page from the array to the I/O buffer
  - 250 µs to commit (write) a 4K page from the I/O buffer to the array
  - 2 ms to erase a 256 KiB block
- With parallel chip operation, 250 MB/s effective read/write speeds
http://en.wikipedia.org/wiki/Solid-state_drive
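As a rough check on these numbers, the per-chip throughput implied by the page latencies can be worked out directly. The sketch below assumes throughput scales linearly with the number of chips and ignores bus transfer, erase, and controller overhead, so the chip count is only a back-of-the-envelope estimate; the constant and function names are chosen here for illustration.

```python
import math

PAGE_BYTES = 4 * 1024      # 4K page
READ_SECONDS = 25e-6       # 25 microseconds per page read
WRITE_SECONDS = 250e-6     # 250 microseconds per page write
TARGET_MB_PER_S = 250      # effective speed quoted for parallel operation


def mb_per_second(num_bytes, seconds):
    """Throughput in MB/s (1 MB = 10**6 bytes)."""
    return num_bytes / seconds / 1e6


single_chip_read = mb_per_second(PAGE_BYTES, READ_SECONDS)     # about 163.8 MB/s
single_chip_write = mb_per_second(PAGE_BYTES, WRITE_SECONDS)   # about 16.4 MB/s

# Chips that would have to work in parallel to reach the target write speed,
# under the simplifying assumption that throughput scales linearly.
chips_for_write = math.ceil(TARGET_MB_PER_S / single_chip_write)

print(f"read:  {single_chip_read:.1f} MB/s per chip")
print(f"write: {single_chip_write:.1f} MB/s per chip")
print(f"chips needed for {TARGET_MB_PER_S} MB/s writes: {chips_for_write}")
```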
7. hdparm command
- Measuring disk performance yourself on Linux
  - http://www.cyberciti.biz/tips/how-fast-is-linux-sata-hard-disk.html
- Old disk system: 57.9 MB/sec
8. Figure 12.1: Moving-head disk mechanism (HDD)
9. http://www.cprince.com/courses/cs5631spring09/news/item1.pdf (Presentation by Yizhao Zhuang)
10. HDD (Hard Disk Drive) Terms
- Access (positioning) time
  - Time until data transfer can start
  - Components: seek time, rotational latency
  - Seek time: time for disk arm to move heads to the right cylinder
  - Rotational latency: time for disk to rotate to the desired sector
- Transfer rate
  - Sustained bandwidth (p. 539): average data transfer rate during a large transfer; that is, the number of bytes divided by transfer time
    - Data rate without positioning time
  - Effective bandwidth (pp. 507, 539): average transfer rate including positioning time
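The difference between sustained and effective bandwidth can be illustrated with a short calculation. The drive parameters below (6 ms seek, 4 ms rotational latency, 100 MB/s sustained) are made-up round numbers, not measurements of any real disk, and `effective_bandwidth` is a name chosen for this sketch.

```python
def effective_bandwidth(num_bytes, seek_s, rotational_s, sustained_bytes_per_s):
    """Average rate for one request, counting positioning time plus transfer time."""
    positioning_s = seek_s + rotational_s
    transfer_s = num_bytes / sustained_bytes_per_s
    return num_bytes / (positioning_s + transfer_s)


# Hypothetical drive: 6 ms seek, 4 ms rotational latency, 100 MB/s sustained.
for size in (4_000, 1_000_000, 100_000_000):
    rate = effective_bandwidth(size, 0.006, 0.004, 100e6)
    print(f"{size:>11} bytes: {rate / 1e6:6.2f} MB/s effective")
```

Small requests are dominated by positioning time, while very large transfers approach the sustained rate.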
11. HDD Scheduling
- Modern disks are addressed as a large array of blocks
  - A disk address is a block number
  - Block number is converted to an old-style disk address (i.e., cylinder, head, sector) by the disk device firmware
  - Generally, increasing block number implies physically adjacent sectors (see also http://www.linuxjournal.com/article/6931), and movement to inner cylinders of disk
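For intuition, the classic fixed-geometry conversion from a block number to (cylinder, head, sector) can be sketched as follows. This assumes every track holds the same number of sectors, which is a simplification: real firmware typically deals with zoned recording and remapped sectors, so the actual mapping is internal to the drive. The geometry values (16 heads, 63 sectors per track) are example numbers only.

```python
def block_to_chs(block, heads, sectors_per_track):
    """Map a block number to (cylinder, head, sector) for a fixed geometry.

    Sectors are numbered from 1 by convention; cylinders and heads from 0.
    """
    cylinder, remainder = divmod(block, heads * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


# Consecutive block numbers stay on the same track until it is full,
# then move to the next head, and finally to the next cylinder.
for block in (0, 62, 63, 1007, 1008):
    print(block, block_to_chs(block, heads=16, sectors_per_track=63))
```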
12. Disk Requests
- Typically a disk device has a queue with several pending requests
- One issue: How can we schedule requests to (a) improve system performance and throughput, and (b) reduce request latency?
- Some request ordering possibilities
  - Prioritize disk requests by type
    - E.g., page faults get priority over a process's file I/O
    - Process priority could play a role
  - Reduce access time by changing order of requests
    - Just consider seek time
13. Some Algorithms To Reduce Seek Time
14. Scheduling Examples
- Assume
  - List (queue) of track requests
    - 98, 183, 37, 122, 14, 124, 65, 67
    - Leftmost is first request
  - R/W head starts at track 53
  - Tracks range from 0-199
  - Only a single platter
- Question
  - Given a specific scheduling algorithm, what is the total head movement required for a list of track requests?
15. FCFS
- Assume
  - List (queue) of track requests
    - 98, 183, 37, 122, 14, 124, 65, 67
    - Leftmost is first request
  - R/W head starts at track 53
  - Tracks range from 0-199
  - Only a single platter
- 1) Show the head movement over the disk surface
- 2) What is the total head movement?
16. FCFS
98-53=45, 183-98=85, 183-37=146, 122-37=85, 122-14=108, 124-14=110, 124-65=59, 67-65=2
Total head movement: 640 tracks
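The FCFS total can be reproduced with a few lines of Python. This is a minimal sketch of the calculation for the example queue; the function name `fcfs_head_movement` is chosen here for illustration.

```python
def fcfs_head_movement(start, requests):
    """Total tracks traversed when requests are served strictly in arrival order."""
    position = start
    total = 0
    for track in requests:
        total += abs(track - position)   # distance of this seek
        position = track
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs_head_movement(53, queue))     # 640
```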
17. FCFS Performance Issues
- Is it possible for disk positions far from a current area of activity to be starved indefinitely?
  - No. With FCFS, incoming (new) requests will eventually get processed.
  - We are not reordering requests here.
- The algorithm is intrinsically fair, but it generally does not provide the fastest service.
18. SSTF: Shortest Seek Time First
- Assume
  - List (queue) of track requests
    - 98, 183, 37, 122, 14, 124, 65, 67
    - Leftmost is first request
  - R/W head starts at track 53
  - Tracks range from 0-199
  - Only a single platter
- 1) Show the head movement over the disk surface
- 2) What is the total head movement?
19. SSTF example
Total head movement: 236 tracks
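A small simulation shows how SSTF arrives at 236 tracks for this queue. The sketch assumes all requests are already queued when scheduling starts and breaks distance ties by arrival order; both are simplifications, and the function names are chosen here for illustration.

```python
def sstf_order(start, requests):
    """Service order under SSTF: always pick the pending track nearest the head."""
    pending = list(requests)
    position = start
    order = []
    while pending:
        nearest = min(pending, key=lambda track: abs(track - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


def head_movement(start, order):
    """Total tracks traversed when visiting `order` starting from `start`."""
    total = 0
    position = start
    for track in order:
        total += abs(track - position)
        position = track
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]
order = sstf_order(53, queue)
print(order)                        # [65, 67, 37, 14, 98, 122, 124, 183]
print(head_movement(53, order))     # 236
```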
21. SSTF Performance Issues
- Is it possible for disk positions far from a current area of activity to be starved indefinitely?
  - Yes, if new requests keep arriving for the current area
  - Prefer a method that we know will not starve requests
- Also, while SSTF is better than FCFS (236 tracks vs. 640), even less head movement can be obtained
  - E.g., service order 53, 37, 14, 65, 67, 98, 122, 124, 183 has total head movement of 208 tracks
22. SCAN
- Algorithm: Move head from end to end (has a current direction), servicing requests as you come to them
- Assume
  - List (queue) of track requests
    - 98, 183, 37, 122, 14, 124, 65, 67
    - Leftmost is first request
  - R/W head starts at track 53
  - Tracks range from 0-199
  - Only a single platter
  - Start direction: down (to lower-numbered tracks)
- 1) Show the head movement over the disk surface
- 2) What is the total head movement?
  - From start position to servicing last request in list
23. SCAN
Total head movement: 53 + 183 = 236 tracks
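The SCAN total can be checked with the sketch below. It follows the convention of the example: the head travels all the way to the edge of the disk in its starting direction, reverses, and movement is counted up to the last request served. The function name `scan` and its parameters are chosen here for illustration.

```python
def scan(start, requests, low=0, high=199, direction="down"):
    """Return (service order, total head movement) for SCAN.

    The head sweeps to the edge of the disk in `direction`, reverses,
    and stops at the last pending request.
    """
    if direction == "down":
        first = sorted((t for t in requests if t <= start), reverse=True)
        second = sorted(t for t in requests if t > start)
        edge = low
    else:
        first = sorted(t for t in requests if t >= start)
        second = sorted((t for t in requests if t < start), reverse=True)
        edge = high

    if second:
        # Travel to the edge, then back to the farthest request on the other side.
        total = abs(start - edge) + abs(second[-1] - edge)
    elif first:
        # Nothing on the other side: stop at the last request in this direction.
        total = abs(start - first[-1])
    else:
        total = 0
    return first + second, total


queue = [98, 183, 37, 122, 14, 124, 65, 67]
order, total = scan(53, queue, direction="down")
print(order)    # [37, 14, 65, 67, 98, 122, 124, 183]
print(total)    # 236
```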
24. SCAN Performance Issues
- Is it possible for disk positions far from a current area of activity to be starved indefinitely?
25. C-SCAN
- Algorithm: Only service requests in one direction (circular SCAN)
- Assume
  - List (queue) of track requests
    - 98, 183, 37, 122, 14, 124, 65, 67
    - Leftmost is first request
  - R/W head starts at track 53
  - Tracks range from 0-199
  - Only a single platter
  - Start direction: up (to higher-numbered tracks)
  - Service requests (only) in up direction
- 1) Show the head movement over the disk surface
- 2) What is the total head movement?
26. C-SCAN
Total head movement: (199-53) + 37 = 183 tracks (+ wrap)
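The C-SCAN figure can be reproduced with the sketch below. As in the example, the return seek from the last track back to track 0 (the "wrap") is reported separately rather than added to the total. The function name `c_scan` and the three-value return are choices made for this illustration.

```python
def c_scan(start, requests, low=0, high=199):
    """Return (service order, head movement while servicing, wrap distance).

    Requests are served only while moving up. At the top edge the head
    returns to the bottom edge without servicing anything (the wrap).
    """
    above = sorted(t for t in requests if t >= start)
    below = sorted(t for t in requests if t < start)

    if below:
        # Sweep up to the edge, wrap, then sweep up to the last remaining request.
        total = (high - start) + (below[-1] - low)
        wrap = high - low
    elif above:
        total = above[-1] - start
        wrap = 0
    else:
        total = 0
        wrap = 0
    return above + below, total, wrap


queue = [98, 183, 37, 122, 14, 124, 65, 67]
order, total, wrap = c_scan(53, queue)
print(order)    # [65, 67, 98, 122, 124, 183, 14, 37]
print(total)    # 183
print(wrap)     # 199
```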
27. C-LOOK
- Algorithm: Only service requests in one direction and turn at last request in direction (circular LOOK)
- Assume
  - List (queue) of track requests
    - 98, 183, 37, 122, 14, 124, 65, 67
    - Leftmost is first request
  - R/W head starts at track 53
  - Tracks range from 0-199
  - Only a single platter
  - Start direction: up (to higher-numbered tracks)
  - Service requests in up direction
- 1) Show the head movement over the disk surface
- 2) What is the total head movement?
28. C-LOOK
Total head movement: (183-53) + (37-14) = 153 tracks (+ wrap)
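A matching sketch for C-LOOK is shown below. It differs from C-SCAN only in that the head turns at the last request in each direction instead of travelling to the edge of the disk. As in the example, the wrap seek is reported separately; the function name `c_look` is chosen here for illustration.

```python
def c_look(start, requests):
    """Return (service order, head movement while servicing, wrap distance).

    Requests are served only while moving up. After the highest request,
    the head jumps back to the lowest pending request (the wrap).
    """
    above = sorted(t for t in requests if t >= start)
    below = sorted(t for t in requests if t < start)

    total = 0
    wrap = 0
    position = start
    if above:
        total += above[-1] - position      # sweep up to the highest request
        position = above[-1]
    if below:
        wrap = position - below[0]         # jump back to the lowest request
        total += below[-1] - below[0]      # sweep up through the remaining requests
    return above + below, total, wrap


queue = [98, 183, 37, 122, 14, 124, 65, 67]
order, total, wrap = c_look(53, queue)
print(order)    # [65, 67, 98, 122, 124, 183, 14, 37]
print(total)    # 153
print(wrap)     # 169
```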
29. Algorithm Selection Issues
- Starvation
  - FCFS, SCAN, and LOOK algorithms will not starve requests
  - SSTF can starve requests
- Type of request activity
  - Some requests will be to similar tracks of disk
    - E.g., requests by a single process for a file allocated using contiguous file allocation
    - FCFS and SSTF should do well with these clustered requests
  - Some requests will be on widely different tracks of disk
    - E.g., indexed or linked file allocation; requests by different processes
    - May be best to have an algorithm that distributes requests uniformly across surface of disk (e.g., SCAN, LOOK)
30. Ch 12.5 Disk Management
- Disk formatting
- Swap space
- Boot blocks and booting
31. Ch 12.5.1 Disk Formatting
- Three steps
  - Low-level formatting, partitioning, logical formatting
- Low-level formatting
  - Fills disk sectors with special data structure that is used by disk I/O controller
    - Header, data area (usually 512 bytes), trailer
    - Header and trailer information used by disk controller hardware, e.g., sector number and ECC (error-correcting code)
  - Creates map of bad blocks
  - Reserves spare sectors for bad block repair
- Partitioning
  - Grouping disk into one or more groups of cylinders to be treated as separate logical disks
  - Each partition can have its own file system type
32. Logical Formatting (Per Partition)
- Creation of organization for different types of file systems, swap space, database applications, etc.
  - E.g., directory structures, free space lists
- Boot partitions
  - Boot block code to load kernel of O/S
- It is also possible to leave the partition without logical formatting
  - With no file system disk data structures
  - Partition accessed as an array of blocks
    - Raw disk, and raw I/O
  - Raw I/O bypasses file system services such as buffer cache, file locking, prefetching, space allocation, file names, and directories
33. Swap Space
- Virtual memory
  - Will use regular file system for at least reading code and data of program
- Swap space may be on separate device or multiple devices
- Partition formatted for swap space can give higher virtual memory performance. Why?
  - With a separate partition, swap space manager allocates/deallocates blocks
    - Doesn't use functions of normal file system
    - Maintains map(s) of swap space usage
- May handle text (code) and data differently
  - E.g., Solaris 1: don't write text to swap disk; if page replacement needs the page, next time just re-read it from the normal file system; dirty data pages are still written to swap disk as needed
34. Boot Process
- Typically a three-phase process
  - ROM bootstrap
  - Boot block loader
  - Running operating system kernel
- ROM bootstrap program
  - Initial program code that starts loading operating system
  - Typically a relatively simple loader program stored in ROM reads the disk boot block(s) into RAM, and starts running this boot block code
- Boot block code
  - Actually does loading of operating system kernel from boot partition
35. Handling Bad Blocks
- Can handle at free blocks level in O/S
  - Free list of blocks can be used to indicate these blocks cannot be used
- Device controller
  - Can maintain list of bad blocks
    - Initialized during low-level formatting at factory
  - Low-level formatting also sets aside spare sectors, not visible to O/S
  - Sector sparing or forwarding can be used to manage new bad blocks
    - Sparing or forwarding: Controller uses a spare sector when it gets a request for a sector that has a bad block
    - When block is remapped, controller uses a spare sector from the same cylinder if possible
  - Sector slipping: Moving a sequence of blocks, re-mapping them (p. 519)
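Sector sparing can be pictured as a small remapping table kept by the controller. The sketch below is a toy model only: the `SparingController` class, its method names, and the sector numbers are hypothetical, and it does not model choosing a spare from the same cylinder or sector slipping.

```python
class SparingController:
    """Toy model of sector sparing: bad sectors are forwarded to spares."""

    def __init__(self, spare_sectors):
        self.spares = list(spare_sectors)   # spares set aside at low-level format
        self.remap = {}                     # bad sector -> spare sector

    def mark_bad(self, sector):
        """Forward a newly discovered bad sector to the next unused spare."""
        if sector in self.remap:
            return
        if not self.spares:
            raise RuntimeError("no spare sectors left")
        self.remap[sector] = self.spares.pop(0)

    def resolve(self, sector):
        """Physical sector actually used for a request (invisible to the O/S)."""
        return self.remap.get(sector, sector)


controller = SparingController(spare_sectors=[1000, 1001])
controller.mark_bad(87)
print(controller.resolve(87))   # 1000: request is forwarded to a spare
print(controller.resolve(88))   # 88: good sectors are used directly
```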
36. RAID: Redundant Array of Independent/Inexpensive Disks
- Multiple disk techniques for
  - Improving throughput and/or response time
    - Multiple disks, parallelize disk operations
  - Improving data reliability
    - Increase mean time to failure (MTTF)
    - If one disk fails, can replace with another and go on quickly, with easy recovery
- Ideas
  - Data striping: Methods of improving throughput
    - Putting part of data on disk A, part on disk B, part on ...
    - Bit striping: e.g., 8 disks, one bit of each byte on each disk
    - Block striping: n disks, block i goes on disk (i mod n) + 1
  - Methods of improving reliability
    - Mirroring (a.k.a. shadowing): duplicating all data
    - Parity, ECC (error-correcting codes)
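Two of these ideas, block striping and parity, can be sketched briefly. The striping function uses the (i mod n) + 1 rule with disks numbered from 1. The parity part uses byte-wise XOR, which is one common way to compute parity and is an assumption here rather than something the slide specifies. All function names are chosen for illustration.

```python
def disk_for_block(i, n):
    """Block striping: block i goes on disk (i mod n) + 1, disks numbered from 1."""
    return (i % n) + 1


def xor_parity(blocks):
    """Byte-wise XOR of equally sized blocks."""
    result = bytes(len(blocks[0]))              # all-zero block of the same size
    for block in blocks:
        result = bytes(x ^ y for x, y in zip(result, block))
    return result


def rebuild_missing(surviving_blocks, parity_block):
    """Recover the single missing data block from the survivors plus parity."""
    return xor_parity(list(surviving_blocks) + [parity_block])


print([disk_for_block(i, 4) for i in range(6)])     # [1, 2, 3, 4, 1, 2]

data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]      # one block per data disk
parity = xor_parity(data)                           # stored on a separate disk
recovered = rebuild_missing([data[0], data[2]], parity)
print(recovered == data[1])                         # True
```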