Disk fundamentals - PowerPoint PPT Presentation

Slides: 54
Provided by: higg2

1
Disk fundamentals
  • Old edition chapter 14

2
The virtual layering
  • Virtual layering of disk storage system
  • Disk controller firmware: controller chips or a
    card that map physical disk geometry for
    different drive brands and models
  • BIOS: low-level functions to read/write sectors
    or format tracks
  • The OS: API services to open/close files, set
    properties, read/write files

3
Virtual levels of disk access
4
Common to all systems
  • Physical partitioning of data
  • Access to data at the file level
  • Map filenames to physical locations

5
Hardware level
  • Platters
  • sides
  • Tracks
  • Cylinders
  • sectors

6
OS level
  • OS level view of the disk is in terms of
    partitions, directories and files

7
Assembly access to disk
  • Readily available using the BIOS under MS-DOS and
    Windows 95/98/Me.
  • Store and retrieve data in a special format (like
    Hamming or Huffman codes)
  • Recover lost data
  • Perform diagnostics
  • Under NT or XP you must use the Win32 API for
    disk manipulation, or write device drivers with
    high privilege

8
Tracks, cylinders, sectors
  • Disk is made up of multiple platters
  • Attached to a spindle which rotates at constant
    speed
  • Above the surface of each platter is a r/w head
    that records magnetic pulses
  • The heads move in or out as a group
  • See text sketch p 465

9
Tracks, cylinders, sectors
  • Surface of disk is formatted into (invisible)
    concentric bands called tracks where data is
    stored magnetically.
  • A disk will have thousands of tracks.
  • Moving r/w head from one track to another is
    called seeking.
  • (not mentioned) latency is the time it takes a
    particular sector to rotate around under the head
  • Seek time for a disk is one sort of performance
    measure
  • RPM is another performance measure, usually 7200
  • The outermost track is track 0 and numbers
    increase as you move toward the center.
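
The latency bullet above can be turned into a number. As a rough sketch, assuming the usual convention that average rotational latency is the time for half a revolution (the function name is illustrative only):

```python
def average_rotational_latency_ms(rpm):
    """Average wait for a sector to rotate under the head: half a revolution."""
    seconds_per_revolution = 60.0 / rpm
    return seconds_per_revolution / 2 * 1000.0

# Compare a few common spindle speeds.
for rpm in (5400, 7200, 10000):
    print(rpm, "RPM ->", round(average_rotational_latency_ms(rpm), 2), "ms")
```

At 7200 RPM one revolution takes about 8.33 ms, so the average wait is a little over 4 ms. Seek time comes on top of that.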

10
Tracks, cylinders, sectors
  • All tracks readable from a given r/w head
    position together form a cylinder.
  • A file would typically be stored on disk using
    adjacent cylinders. This reduces seek time.
  • A sector is a 512-byte portion of a track
  • Physical sectors are magnetically marked at the
    factory using low-level formatting. Their size
    does not change regardless of the OS used. A hard
    disk may have 63 or more sectors per track.

11
Photo of hard disk with reflective platter visible
12
A platter from a 5.25" hard disk, with 20
concentric tracks drawn over the surface. Each
track is divided into 16 imaginary sectors
13
Figure 1
                                               
14
Sectors and tracks
  • A sector is the basic unit of data storage on a
    hard disk. The term "sector" comes from the
    mathematical term for the pie-shaped angular
    section of a circle, bounded on two sides by
    radii and on the third by the perimeter of the
    circle (see Figure 1). In its simplest form, a
    hard disk consists of groups of predefined
    sectors that each form a circle. Each such circle
    of sectors is a single track. A group of
    concentric circles (tracks) defines a single
    surface of a disk's platter. Early hard disks had
    just a single one-sided platter, while today's
    hard disks comprise several platters with tracks
    on both sides, all of which together make up the
    hard disk's capacity. Early hard disks had the
    same number of sectors in every track, and in
    fact the number of sectors per track was fairly
    standard between models. Today's advances in
    drive technology have allowed the number of
    sectors per track, or SPT, to vary significantly,
    but more about that later.

15
More about disks
  • When a hard disk is prepared with its default
    values, each sector will be able to store 512
    bytes of data. A few operating system disk setup
    utilities permit this 512-byte sector size to be
    modified; however, 512 is the standard, and is
    found on virtually all hard drives by default.
    Each sector, however, actually holds much more
    than 512 bytes of information. Additional bytes
    are needed for control structures and for
    information necessary to manage the drive, locate
    data and perform other functions. Exact sector
    structure depends on the drive manufacturer and
    model; however, the contents of a sector usually
    include the following elements:
  • ID Information: Within each sector a small space
    is left to identify the sector's number and
    location, which is used to locate the sector on
    the disk and to provide status information about
    the sector itself. For example, a single bit is
    used to indicate if the sector has been marked
    defective and remapped.
  • Synchronization Fields: These are used internally
    by the drive controller to guide the read
    process.
  • Data: The actual data in the sector.
  • ECC: Error correcting code used to ensure data
    integrity.
  • Gaps: Often referred to as spacers, used to
    separate sector areas and provide time for the
    controller to process what has been read before
    processing additional data.
  • Servo Information: In addition to the sectors,
    each of which contains the items above, space on
    each track is allocated for servo information on
    drives that use embedded servo. Most, if not
    all, modern drives now employ servo technology.

16
Aside: Zoned Bit Recording
  • We would be remiss in our discussion of drive
    sectors, tracks and performance without
    mentioning major improvements such as Zoned Bit
    Recording. One of the methods used to increase
    capacity and data access speeds on hard disks is
    improving the utilization of the larger, outer
    tracks of the disk. Early hard disks were
    extremely primitive, and their controllers
    weren't capable of handling complicated
    arrangements that changed from track to track.
    As a result, every track had the same number of
    sectors, with the standard set at 17 sectors per
    track.
  • As you can see from our sketch above, Figure 1,
    tracks are concentric circles, with the ones on
    the outside of the platter much larger in
    circumference than the ones closer to the center.
    Since there is a constraint on how tightly the
    inner circles can be packed with bits, developers
    packed them as tightly as possible given the
    state of technology at the time. By reducing bit
    density on the outer circles, developers were
    able to assign them the same number of sectors.
    Essentially this meant that the inner sectors
    were packed so tightly there was no room for
    error, while the outer sectors were
    underutilized: in theory they could hold many
    more sectors given the same linear bit density
    limits as were imposed on the inner sectors.

17
Zoned Bit Recording
  • Drive developers, in an effort to create larger
    drive sizes as well as improve utilization and
    performance, developed a technology referred to
    as zoned bit recording (ZBR), also called
    multiple zone recording or just zone recording.
    With this technology, tracks are grouped into
    zones based on their distance from the center of
    the disk, and each zone is assigned a number of
    sectors per track. As you move from the innermost
    part of the disk to the outer edge, you move
    through different zones, each containing more
    sectors per track than the one before. This makes
    more efficient use of the larger tracks on the
    outside of the disk. In essence, with ZBR, the
    size (or length) of a sector remains reasonably
    constant over the entire surface of the disk.
    This is in stark contrast to very early hard
    disks that did not employ ZBR, as their tracks
    were limited to only 9 sectors regardless of
    track size.
  • An interesting added benefit of zoned bit
    recording is that the raw data transfer rate of
    the disk, also referred to as the media transfer
    rate (a bit of a misnomer), is considerably
    higher when reading the outside cylinders than
    when reading the inside ones. Although the
    angular velocity of the platters is constant
    regardless of which track is being read, the
    outer cylinders contain more data. Bear in mind
    that at constant angular velocity the outer
    tracks (the periphery of the platter) move past
    the head much faster than the tracks at the core
    of the platter.
  • Take note that constant angular velocity is not
    used by all drive technologies; older CD-ROM
    drives are one exception.
  • Data is written to the outer tracks of a drive
    first, so the drive is filled with data from the
    outside in. The fastest data transfer occurs when
    the drive is first used and data is stored in the
    outer tracks. Many people benchmark their systems
    and hard drives when new, make some tweaks and
    changes to their system, and then return to their
    benchmarks weeks or months later, only to be
    unpleasantly surprised that the disk appears to
    be getting slower. Actually, the disk has
    probably not changed at all, but the second
    benchmark may have been run on tracks closer to
    the center of the disk. While most people who
    take benchmarking seriously defragment their
    drives before running the tests, fragmentation of
    the file system can also affect performance
    benchmarks.
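
The transfer-rate effect of zoned bit recording is simple arithmetic: at a fixed spindle speed, a track with more sectors delivers more bytes per revolution. A minimal sketch, where the zone table is hypothetical (real drives have many zones with model-specific counts) and 512-byte sectors at 7200 RPM are assumed:

```python
SECTOR_BYTES = 512

def media_transfer_rate_mb_s(sectors_per_track, rpm):
    """Bytes passing under the head per second, in MB/s (10**6 bytes)."""
    revolutions_per_second = rpm / 60.0
    return sectors_per_track * SECTOR_BYTES * revolutions_per_second / 1e6

# Hypothetical zone table: (zone name, sectors per track).
zones = [("outer", 1000), ("middle", 750), ("inner", 500)]
for name, spt in zones:
    print(name, round(media_transfer_rate_mb_s(spt, 7200), 2), "MB/s")
```

With these made-up numbers the outer zone reads twice as fast as the inner zone, which is the kind of gap that makes a later benchmark on inner tracks look slower.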

18
fragmentation
  • Disk storage becomes fragmented over time just
    like main memory.
  • A fragmented file is not located in contiguous
    disk sectors. This slows access time.

19
translation
  • Translation is the process of converting physical
    geometry into logical structure
  • The drive itself or a card has a controller to
    perform this operation.
  • The OS works with logical (not physical) sector
    numbers.

20
Logical Block Addressing aside
  • Prior to the advent of Logical Block Addressing,
    all hard drives were accessed via CHS (Cylinder,
    Head, Sector) or Extended CHS, which means that
    the drive was accessed by specifying a cylinder,
    head and sector address. This was referred to as
    accessing the drive through its "geometry".
    Extended CHS was a transitional change in the way
    a drive was accessed in order to work around the
    504 MiB barrier; however, the addressing was
    still done in terms of cylinder, head and sector
    numbers and then translated one or more times
    before actually accessing the drive itself.
  • By contrast, logical block addressing (LBA)
    involves a completely new method of addressing
    sectors: new in that it is new to the EIDE/IDE
    interface, since LBA was first developed around
    SCSI hard drives. With LBA, instead of referring
    to a drive's cylinder, head and sector number
    geometry in order to access or "address" it, each
    sector is assigned a unique "sector number". In
    essence, LBA is a means by which a drive is
    accessed by linearly addressing sectors,
    beginning at sector 1 of head 0, cylinder 0 as
    LBA 0, and proceeding in sequence to the last
    physical sector on the drive, which, for
    instance, on a standard 540 MB drive with
    1,065,456 sectors would be LBA 1,065,455. While
    this was new in the ATA-2 (AT Attachment)
    specification, it has always been the one and
    only addressing mode in SCSI. ATA-2 has
    subsequently been replaced, and the latest AT
    specification is ATA-7. Note also that LBA does
    not allow you to address more sectors than CHS
    style addressing would.
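
The linear numbering described above (sector 1 of head 0, cylinder 0 is LBA 0) corresponds to the conventional CHS-to-LBA formula. A small sketch, with function names chosen here for illustration and the geometry (heads per cylinder, sectors per track) passed in by the caller:

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """CHS sectors are numbered from 1; cylinders, heads and LBAs from 0."""
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)

def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Inverse mapping: split the linear number back into (C, H, S)."""
    cylinder, rest = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(rest, sectors_per_track)
    return cylinder, head, sector_index + 1

# Assumed geometry for the example: 16 heads, 63 sectors per track.
print(chs_to_lba(0, 0, 1, 16, 63))
print(lba_to_chs(1008, 16, 63))
```

The first call gives LBA 0 and the second gives cylinder 1, head 0, sector 1, since one cylinder holds 16 x 63 = 1008 sectors in this geometry.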

21
Logical Block Addressing
  • In order for you to employ LBA support, it must
    be supported by both the BIOS and the operating
    system. In addition, since it is a new method of
    communicating with the hard drive, the drive
    itself must support LBA as well. All newer hard
    drives do in fact support LBA. Often we review
    other sites to ensure that we provide you with
    accurate information, and with respect to LBA, we
    came upon a unique, but inaccurate, statement.
    One purported authority on computer systems
    stated that when drives supporting LBA are
    auto-detected by a BIOS that supports LBA, it
    will be set up to use that mode. This is
    inaccurate and misleading, as there's nothing in
    the BIOS code that will set up your drive to use
    LBA mode. If you have ever used Fdisk, you may
    recall that during the drive setup process, you
    are asked whether you want to enable LBA. Hence,
    it is a function of the operating system, so
    don't expect your BIOS to somehow mysteriously
    set up your drive.
  • While it is true that a drive enabled for LBA is
    not subject to the 504 MiB drive size barrier,
    there still remains considerable confusion about
    Logical Block Addressing and what it does. Many
    knowledgeable technicians and users believe that
    it is LBA addressing that avoids the 504 MiB
    barrier; however, this is not quite accurate.
    Logical Block Addressing isn't getting around the
    barrier, because it is just another manner in
    which to address the same geometry. If you were
    still limited to 1,024 cylinders, 16 heads and 63
    sectors, you would still have logical sectors
    beginning with number 0 and progressing
    sequentially through to 1,032,191, with the 504
    MiB barrier still in place. What does avoid this
    barrier is that LBA mode automatically enables
    geometry translation. This translation is
    required because the operating system calling the
    BIOS Int 13h routines knows nothing about LBA.
    Therefore it is the translation part of LBA that
    really gets around the barrier.
  • When LBA is enabled, the BIOS will enable
    geometry translation. This translation may be
    done in the same way that it is done in Extended
    CHS or large mode via a drive's geometry, or it
    may be done using a different algorithm called
    LBA-assist translation. It is this translated
    geometry that is presented to the operating
    system for use in Int 13h calls. Basically, the
    difference between LBA and ECHS is that when
    using ECHS the BIOS translates the parameters
    used by these calls from the translated geometry
    to the drive's logical geometry. With LBA, it
    translates from the translated geometry directly
    into a logical block (sector) number.
  • LBA is currently the dominant form of hard disk
    addressing. When the 8.4 GB limit of the Int13h
    interface was reached in 1998-1999, it became
    impossible to express the geometry of large hard
    disks using cylinder, head and sector numbers,
    regardless of whether translated or not, while
    remaining below the Int13h limits of 1,024
    cylinders, 256 heads and 63 sectors. This is one
    of the reasons that today's hard drives no longer
    indicate their classical geometry.
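
Both barriers mentioned on this slide follow directly from multiplying out the geometry limits. A quick check, assuming 512-byte sectors (the helper name is illustrative):

```python
SECTOR_BYTES = 512

def chs_capacity_bytes(cylinders, heads, sectors_per_track):
    """Largest capacity addressable with the given CHS limits."""
    return cylinders * heads * sectors_per_track * SECTOR_BYTES

ide_limit = chs_capacity_bytes(1024, 16, 63)     # 1,024 cyl, 16 heads, 63 sectors
int13_limit = chs_capacity_bytes(1024, 256, 63)  # Int 13h: 1,024 cyl, 256 heads, 63 sectors

print(ide_limit / 2**20, "MiB")
print(round(int13_limit / 10**9, 2), "GB")
```

The first figure is exactly 504 MiB. The second is about 8.46 x 10^9 bytes, which is the limit usually quoted as 8.4 GB.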

22
Disk partitioning
  • A single hard drive may be partitioned into
    logical units named partitions or volumes, each
    represented by a letter: A, B, C, ...
  • A partition may be primary or extended and a
    drive may contain both types.
  • A primary partition is bootable.
  • An extended partition may be further divided into
    an unlimited number of logical partitions. Each
    is mapped to a drive letter and cannot be
    bootable, but each may be formatted with a
    different file system.

23
Multiboot systems
  • It is common to create multiple primary
    partitions each booting a different OS.
  • Mathlab is dual boot
  • In industry, you might have primary partitions
    for development and production.
  • Logical partitions hold data. Different OSes can
    access the same file systems. Both Linux and DOS
    can read FAT32 disks.

24
FDISK.exe under MS-DOS
  • Create and remove partitions
  • Does not preserve data
  • Later versions (Win2000 and later) have a disk
    manager utility

25
File systems
  • Every OS has some disk management system.
  • At the lowest level it manages partitions, at the
    next higher level, files and directories.
  • It must keep track of location, size and
    attributes for each file.

26
FAT File-Allocation-Table (see also later slide)
  • Maps logical sectors to clusters (a basic storage
    unit)
  • Maps files and directories to sequences of
    clusters.
  • A cluster is the smallest unit of space used by a
    file, consisting of one or more adjacent disk
    sectors.

27
Wikipedia FAT
  • File Allocation Table (FAT) is a file system
    developed by Microsoft for MS-DOS and was the
    primary file system for consumer versions of
    Microsoft Windows up to and including Windows Me.
    FAT as it applies to flexible/floppy and optical
    disk cartridges (FAT12 and FAT16 without long
    file name support) has been standardized as
    ECMA-107 and ISO/IEC 9293. The file system is
    partially patented.
  • The FAT file system is relatively uncomplicated,
    and is supported by virtually all existing
    operating systems for personal computers. This
    ubiquity makes it an ideal format for floppy
    disks and solid-state memory cards, and a
    convenient way of sharing data between disparate
    operating systems installed on the same computer
    (a dual boot environment).
  • The most common implementations have a serious
    drawback in that when files are deleted and new
    files written to the media, directory fragments
    tend to become scattered over the entire media,
    making reading and writing a slow process.
    Defragmentation is one solution to this, but is
    often a lengthy process in itself and has to be
    performed regularly to keep the FAT file system
    clean.

28
Wikipedia NTFS
  • NTFS (New Technology File System) is the standard
    file system of Windows NT, including its later
    versions Windows 2000, Windows XP, Windows Server
    2003, Windows Server 2008, and Windows Vista.
  • NTFS replaced Microsoft's previous FAT file
    system, used in MS-DOS and early versions of
    Windows. NTFS has several improvements over FAT
    and HPFS (High Performance File System) such as
    improved support for metadata and the use of
    advanced data structures to improve performance,
    reliability, and disk space utilization plus
    additional extensions such as security access
    control lists and file system journaling. The
    exact specification is a trade secret, although
    (since NTFS v3.00) it can be licensed
    commercially from Microsoft through their
    Intellectual Property Licensing program.

29
XP disk management tool
30
Cluster sizes for a 1.25-2 GB volume
  FAT type               FAT16      FAT32
  Cluster size           32 KiB     4 KiB
  Number of FAT entries  65,526     524,208
  Size of FAT            128 KiB    2 MiB
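
The figures in this comparison can be approximated by dividing the volume size by the cluster size, then multiplying by the width of one table entry. A rough sketch that assumes a full 2 GiB volume and ignores reserved entries and other on-disk overhead, so it lands slightly above the entry counts on the slide:

```python
GIB, KIB = 2**30, 2**10

def fat_stats(volume_bytes, cluster_bytes, entry_bits):
    """Return (cluster count, FAT size in bytes) for one copy of the FAT."""
    clusters = volume_bytes // cluster_bytes
    fat_bytes = clusters * entry_bits // 8
    return clusters, fat_bytes

print(fat_stats(2 * GIB, 32 * KIB, 16))  # FAT16 with 32 KiB clusters
print(fat_stats(2 * GIB, 4 * KIB, 32))   # FAT32 with 4 KiB clusters
```

The smaller FAT32 cluster costs a larger table (2 MiB instead of 128 KiB) but wastes far less space per file.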

31
Clusters used by FAT
  • A chain of clusters is referenced by a FAT that
    keeps track of all clusters used by a file.
    Pictures show cluster chain and wasted space
    examples.

[Figure: cluster chain and wasted space; labels: sectors 1-8, cluster 1, cluster 2, 4096 used, 4096 used, 1000 bytes used]

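
The wasted space in the last cluster of a chain is easy to compute. As an illustration, assuming 4096-byte clusters and a hypothetical file of 4096 + 4096 + 1000 = 9192 bytes (one reading of the labels in the picture):

```python
def cluster_usage(file_bytes, cluster_bytes=4096):
    """Return (clusters allocated, bytes wasted in the last cluster)."""
    clusters = -(-file_bytes // cluster_bytes)  # ceiling division
    allocated = clusters * cluster_bytes
    return clusters, allocated - file_bytes

print(cluster_usage(9192))  # two full clusters plus 1000 bytes in a third
```

Under these assumptions the file occupies three clusters and leaves 3096 bytes of the last one unused.
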
32
FAT 12
  • Still supported by Windows and Linux
  • Cluster size is 512 bytes, perfect for small
    files
  • Each table entry is 12 bits
  • A volume holds fewer than 4087 clusters

33
FAT 16
  • The only system for drives formatted under MS-DOS
  • Supported by all versions of Windows and Linux
  • Drawbacks:
  • Storage is inefficient on volumes over 1 GB due
    to large cluster size
  • Each table entry is 16 bits, limiting the total
    number of clusters that can be accessed
  • Volume holds between 4087 and 65,526 clusters
  • Boot sector has no backup, so a read error can be
    catastrophic
  • No built-in security or individual user
    permissions

34
FAT 32
  • Introduced with the OSR2 (OEM Service Release 2)
    release of Windows 95 and later refined
  • A single file can be up to 4 GB (minus 2 bytes)
  • Each table entry is 32 bits
  • A volume holds from 65,526 up to 268,435,456
    clusters
  • Volume can hold up to 32 GB
  • Smaller clusters than FAT16 on volumes of 1 GB to
    8 GB, resulting in less waste
  • Boot record has a backup of critical information
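
The cluster-count ranges on the FAT12, FAT16 and FAT32 slides amount to a simple classification rule. A sketch using the thresholds exactly as they appear in these slides (other references quote slightly different boundary values, and the slides leave 65,526 itself in both ranges; it is assigned to FAT16 here as an assumption):

```python
def fat_type(cluster_count):
    """Classify a volume by cluster count, per the ranges on the slides."""
    if cluster_count < 4087:
        return "FAT12"
    if cluster_count <= 65526:
        return "FAT16"
    return "FAT32"

for count in (2847, 65526, 524208):
    print(count, fat_type(count))
```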

35
NTFS
  • Supported under NT, 2000, XP
  • Handles large volumes possibly spread over
    multiple drives
  • For disks > 2 GB, default cluster is 4 KB
  • Supports Unicode filenames up to 255 chars long
  • Permissions
  • Built-in encryption
  • Change journal can track file revisions
  • Disk quotas for individuals or groups of users
  • Robust recovery for data error and automatically
    repairs errors
  • Supports multiple disk mirroring (a mirror is a
    copy)

36
ECC and Hamming
  • Hamming is a fairly expensive single-error
    correction scheme developed by Hamming at Bell
    Labs.
  • Bits at power-of-2 positions store parity for the
    other bits, which they correct. So bit 1 is
    parity for all the odd bits. Bit 2 is parity for
    bits 3, 6, 7, 10, 11, 14, 15. Bit 4 is the
    correction bit for bits 5, 6, 7, 12, 13, 14, 15.
    Bit 8 corrects 9, 10, ..., 15, and so on.

37
Hamming performance
  • To send an 8-bit piece of data (an ASCII code,
    for example), we use correcting bits 1, 2, 4,
    and 8 (4 bits); with the 8 data bits, we package
    12 bits. Notice this is a 33% overhead.
  • To send 16 bits of data we would use correction
    bits 1, 2, 4, 8 and 16 for a 21-bit package, where
    overhead has dropped to less than 25%.
  • We can send up to 247 bits of data using parity
    bits 1, 2, 4, 8, 16, 32, 64, and 128 (8
    correction bits), so the overhead has dropped to
    8/255, about 3%.
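
The package sizes above follow from the rule that r correction bits can cover a package of at most 2^r - 1 bits in total. A short sketch that reproduces the three cases (the function name is illustrative):

```python
def hamming_parity_bits(data_bits):
    """Smallest r such that 2**r >= data_bits + r + 1."""
    r = 0
    while 2**r < data_bits + r + 1:
        r += 1
    return r

for m in (8, 16, 247):
    r = hamming_parity_bits(m)
    total = m + r
    print(m, "data bits:", r, "correction bits,", total, "total,",
          round(100 * r / total, 1), "% overhead")
```

This gives 4, 5 and 8 correction bits for 8, 16 and 247 data bits, with overheads of roughly 33%, 24% and 3%.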

38
Hamming: a 12-bit example
  • Compute the correcting bits to send 8 bits of
    data, like 'A' or '9'.
  • Assume even parity bits.
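
One way to work this exercise is sketched below. It assumes the 8 data bits are placed most-significant-bit first into positions 3, 5, 6, 7, 9, 10, 11, 12 (the slide does not fix an ordering), with even parity in positions 1, 2, 4 and 8. The function names are made up for this example.

```python
def hamming12_encode(data_bits):
    """Encode 8 data bits (list of 0/1) into a 12-bit even-parity codeword."""
    code = [0] * 13  # index 0 unused so indices match 1-based bit positions
    data_positions = [p for p in range(1, 13) if p & (p - 1) != 0]
    for pos, bit in zip(data_positions, data_bits):
        code[pos] = bit
    for p in (1, 2, 4, 8):
        parity = 0
        for pos in range(1, 13):
            if pos & p and pos != p:
                parity ^= code[pos]  # XOR of every bit this parity bit covers
        code[p] = parity
    return code[1:]

def hamming12_syndrome(codeword):
    """0 if all parity checks pass, else the 1-based position of a single flipped bit."""
    syndrome = 0
    for pos, bit in enumerate(codeword, start=1):
        if bit:
            syndrome ^= pos
    return syndrome

codeword = hamming12_encode([0, 1, 0, 0, 0, 0, 0, 1])  # ASCII 'A' = 41h
print(codeword, hamming12_syndrome(codeword))
```

Flipping any single bit of the codeword makes the syndrome equal to that bit's position, which is how the receiver knows which bit to correct.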

39
ECC example
  • In bit-interleaved parity, disk 4 might hold
    parity bits for data on the other three disks.
  • Bits are read simultaneously off the 4 disks. If
    data is lost on one of the 3 data disks it can be
    recovered from the parity disk.
  • For example, if the data bits read (with X
    marking lost data) are 1X1 with parity bit 1, we
    see that the lost data (X) must be a 1.
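
The recovery step is a single XOR. A minimal sketch, assuming even parity (the parity bit is the XOR of the three data bits); the function names are illustrative:

```python
from functools import reduce
from operator import xor

def parity_bit(data_bits):
    """Even-parity bit stored on the parity disk."""
    return reduce(xor, data_bits, 0)

def recover_lost_bit(surviving_bits, parity):
    """XOR of the surviving data bits and the parity bit gives the lost bit."""
    return reduce(xor, surviving_bits, parity)

# The slide's example: data 1 X 1 with parity bit 1.
print(recover_lost_bit([1, 1], 1))
```

This prints 1, matching the example: 1 XOR 1 XOR 1 = 1.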

40
MS DOS boot record
  • See text pg 471
  • Root directory is the main directory for a disk
    volume. A directory entry for a file contains
    filename, size, attribute and starting cluster
    number.

41
Directory trees
  • FAT and NTFS have root directories containing
    the primary list of files on the disk.
  • Subdirectories may be contained in the directory

42
Directory trees
[Figure: directory tree under "Root directory"; node labels: cpp, java, asm, etc, bin, jar, jdk, source, lib, bin]
43
MS DOS directory structure
  • MS-DOS entries are 32 bytes long with fields
    shown in table 14-5

44
MS DOS directory entry
  Hex offset  Field          Format
  00          Filename       ASCII
  08          Extension      ASCII
  0B          Attribute      8-bit binary
  0C          Reserved
  16          Time           16-bit binary
  18          Date           16-bit binary
  1A          Start cluster  16-bit binary
  1C          Size           32-bit binary
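
The offsets in this table map directly onto a fixed 32-byte record. A sketch of unpacking one entry, assuming little-endian fields (as on x86) and space-padded ASCII names; the sample bytes are invented for illustration and do not come from a real disk:

```python
import struct

# Field layout from the table: 8-byte name, 3-byte extension, attribute,
# 10 reserved bytes, time, date, start cluster, size.
ENTRY_FORMAT = "<8s3sB10sHHHI"

def parse_directory_entry(raw):
    """Unpack one 32-byte MS-DOS directory entry into a dict."""
    name, ext, attr, _reserved, time, date, cluster, size = struct.unpack(
        ENTRY_FORMAT, raw
    )
    return {
        "name": name.decode("ascii").rstrip(),
        "extension": ext.decode("ascii").rstrip(),
        "attribute": attr,
        "time": time,
        "date": date,
        "start_cluster": cluster,
        "size": size,
    }

# Hypothetical entry built only for this example.
sample = struct.pack(ENTRY_FORMAT, b"README  ", b"TXT", 0x20,
                     bytes(10), 28079, 14959, 5, 1000)
entry = parse_directory_entry(sample)
print(entry)
```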
45
Filename status byte
  Status byte  Description
  00h          Entry never used
  01h          With attribute = 0Fh and status byte = 01h, this is the first entry of a long filename
  05h
  E5h          Entry is for a filename where the file has been erased
  2Eh          Entry (.) is for a directory name
  4nh          First long-name entry; with attribute = 0Fh this marks the end. n = number of entries for the filename
46
Attribute field is bit-mapped
  Bit 7  reserved
  Bit 6  reserved
  Bit 5  archive
  Bit 4  subdir
  Bit 3  volume label
  Bit 2  system
  Bit 1  hidden
  Bit 0  read-only
An entry of 0Fh indicates that the current dir
entry is for an extended filename
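
Decoding the attribute byte is a matter of testing individual bits. A sketch assuming the standard FAT bit assignment (bit 0 read-only, bit 1 hidden, bit 2 system, bit 3 volume label, bit 4 subdirectory, bit 5 archive), with 0Fh treated as the extended-filename marker noted above:

```python
ATTRIBUTE_BITS = {
    0x01: "read-only",
    0x02: "hidden",
    0x04: "system",
    0x08: "volume label",
    0x10: "subdirectory",
    0x20: "archive",
}
LONG_NAME = 0x0F  # marks an extended (long) filename entry

def decode_attributes(attr):
    """Return the names of the attribute bits set in one attribute byte."""
    if attr == LONG_NAME:
        return ["long filename entry"]
    return [name for mask, name in ATTRIBUTE_BITS.items() if attr & mask]

print(decode_attributes(0x21))
print(decode_attributes(0x0F))
```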
47
Date stamp
  Year   bits 15-9  0..119, added to 1980
  Month  bits 8-5   1..12
  Day    bits 4-0   1..31

48
Time stamp
  Hours    bits 15-11  0..23
  Minutes  bits 10-5   0..59
  Seconds  bits 4-0    0..59 (stored in 2-second increments)

49
MS-DOS 32-bit date/time: same as 16-bit, but date
is the high word of a doubleword
  • Year bits 31-25
  • Month 24-21
  • Day 20-16
  • Hour 15-11
  • Min 10-5
  • Sec 4-0
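
The bit ranges above can be unpacked with shifts and masks. A sketch that assumes the usual FAT convention that the 5-bit seconds field counts 2-second units; the sample values are invented for illustration:

```python
def decode_dos_datetime(date_word, time_word):
    """Decode 16-bit MS-DOS date and time words into calendar fields."""
    year = 1980 + (date_word >> 9)
    month = (date_word >> 5) & 0x0F
    day = date_word & 0x1F
    hour = time_word >> 11
    minute = (time_word >> 5) & 0x3F
    second = (time_word & 0x1F) * 2  # stored in 2-second units
    return year, month, day, hour, minute, second

def decode_dos_datetime32(stamp):
    """32-bit form: date is the high word, time is the low word."""
    return decode_dos_datetime(stamp >> 16, stamp & 0xFFFF)

# Hypothetical stamp: 15 March 2009, 13:45:30.
date_word = (29 << 9) | (3 << 5) | 15
time_word = (13 << 11) | (45 << 5) | 15
print(decode_dos_datetime(date_word, time_word))
print(decode_dos_datetime32((date_word << 16) | time_word))
```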

50
Cluster chain example: just links are shown
  Cluster:   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
  Link:      2   3   4   8   .   .   .   9  10 eoc   .   .   .   .   .   .
File starting cluster = 1, file size = 7
51
Cluster chain example 2: just links are shown
  Cluster:   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
  Link:      .   .   .   .   6   7  11   .   .   .  12 eoc   .   .   .   .
File starts in cluster 5, size = 5
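
Following a chain means starting at the file's first cluster and looking up each link until the end-of-chain marker. A sketch using dictionaries as stand-ins for the FAT, with the links assigned to clusters in the way the two examples above appear to intend (that assignment is an interpretation of the slides):

```python
EOC = "eoc"  # end-of-chain marker, as written on the slides

def follow_chain(fat, start_cluster):
    """Return the list of clusters in a file, in order."""
    chain = []
    cluster = start_cluster
    while cluster != EOC:
        chain.append(cluster)
        cluster = fat[cluster]  # the FAT entry holds the next cluster number
    return chain

fat_example_1 = {1: 2, 2: 3, 3: 4, 4: 8, 8: 9, 9: 10, 10: EOC}
fat_example_2 = {5: 6, 6: 7, 7: 11, 11: 12, 12: EOC}

print(follow_chain(fat_example_1, 1))
print(follow_chain(fat_example_2, 5))
```

The chains come out with 7 and 5 clusters, matching the file sizes stated in the examples. Both files are fragmented: each chain skips over clusters that belong to other files or are free.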
52
FAT
  • When a file is created the OS looks for an
    available cluster entry in the FAT. Gaps occur
    if insufficient contiguous entries are available,
    typically as files are deleted and new ones
    added.
  • As files are modified and resaved, their chains
    become fragmented.
  • As the r/w heads jump between cylinders to locate
    all of a file's clusters, performance degrades.

53
3 programs
  • Previous (5th) edition text contained 3 programs
    to read sectors, check free disk space, and look
    at clusters.
  • But the first two do not run under XP.