Recap of Feb 25: Physical Storage Media - PowerPoint PPT Presentation

Provided by: david227
1
Recap of Feb 25: Physical Storage Media
  • Issues are speed, cost, reliability
  • Media types
  • Primary storage (volatile): Cache, Main Memory
  • Secondary or On-line storage (non-volatile):
    Flash Memory, Mag Disk
  • Tertiary or Off-line storage (non-volatile):
    Optical Storage, Tape Storage
  • Mag disk issues
  • definitions: sector, track, cylinder
  • disk controllers, multiple disks
  • disk performance measures (seek time, rotational
    latency, data transfer rate, MTTF)
  • Now we start with Optimization of Disk-Block
    Access

2
Optimization of Disk-Block Access: Motivation
  • Requests for disk I/O are generated both by the
    file system and by the virtual memory manager
  • Each request specifies the address on the disk to
    be referenced in the form of a block number
  • a block is a contiguous sequence of sectors from
    a single track on one platter
  • block sizes range from 512 bytes to several KB
    (4-16 KB is typical)
  • smaller blocks mean more transfers from disk;
    larger blocks mean more wasted space due to
    partially filled blocks
  • a block is the standard unit of data transfer
    between disk and main memory
  • Since disk access speed is much slower than main
    memory access, methods for optimizing disk-block
    access are important
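
The block-size trade-off above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `block_cost` and the file sizes are made up, and it assumes every file starts on its own block, so the last block of each file is only partially filled.

```python
import math


def block_cost(file_sizes, block_size):
    """Return (blocks_transferred, wasted_bytes) for reading every file once.

    Assumes each file occupies whole blocks of its own (hypothetical layout).
    """
    blocks = sum(math.ceil(size / block_size) for size in file_sizes)
    wasted = blocks * block_size - sum(file_sizes)
    return blocks, wasted


# Made-up file sizes in bytes, compared across three block sizes.
files = [100, 5000, 20000]
for block_size in (512, 4096, 16384):
    blocks, wasted = block_cost(files, block_size)
    print(f"block size {block_size}: {blocks} transfers, {wasted} bytes wasted")
```

With these made-up sizes, 512-byte blocks need 51 transfers and waste 1012 bytes, while 16 KB blocks need only 4 transfers but waste 40436 bytes.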

3
Optimization of Disk-Block Access: Methods
  • Disk-arm scheduling: requests for several blocks
    may be sped up by issuing them in the order
    they will pass under the head.
  • If the blocks are on different cylinders, it is
    advantageous to ask for them in an order that
    minimizes disk-arm movement
  • Elevator algorithm: move the disk arm in one
    direction until all requests in that direction
    are satisfied, then reverse and repeat
  • Sequential access is 1-2 orders of magnitude
    faster than random access (random access is
    about 2 orders of magnitude slower)
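
A minimal sketch of the elevator idea, assuming requests are given simply as cylinder numbers and that all of them are known up front. The names `elevator_order` and `arm_travel` are hypothetical helpers for this illustration, not part of any real disk controller interface.

```python
def elevator_order(requests, head, direction=1):
    """Order cylinder requests for one sweep in `direction`, then the reverse sweep.

    requests:  cylinder numbers waiting to be served
    head:      cylinder where the arm currently sits
    direction: +1 to sweep toward higher cylinders first, -1 toward lower
    """
    here = [c for c in requests if c == head]            # no arm movement needed
    up = sorted(c for c in requests if c > head)          # served while moving up
    down = sorted((c for c in requests if c < head), reverse=True)  # while moving down
    return here + (up + down if direction > 0 else down + up)


def arm_travel(order, head):
    """Total number of cylinders the arm crosses when serving `order` from `head`."""
    total = 0
    for cylinder in order:
        total += abs(cylinder - head)
        head = cylinder
    return total


pending = [98, 183, 37, 122, 14, 124, 65, 67]
print(arm_travel(pending, 53))                      # arrival order
print(arm_travel(elevator_order(pending, 53), 53))  # elevator order
```

For this request list, serving in arrival order moves the arm across 640 cylinders, while the elevator order needs 299.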

4
Optimization of Disk-Block Access: Methods
  • Non-volatile write buffers
  • store written data in a RAM buffer rather than on
    disk
  • write the buffer whenever it becomes full or when
    no other disk requests are pending
  • buffer must be non-volatile to protect from power
    failure
  • called non-volatile random-access memory
    (NV-RAM)
  • typically implemented with battery-backed-up RAM
  • dramatic speedup on writes with a
    reasonable-sized buffer: write latency essentially
    disappears
  • why can't we do the same for reads? (hints: ESP,
    clustering)
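
The write-buffer behaviour described on this slide might be modelled roughly as follows. This is a toy sketch: the class `WriteBuffer` and its methods are invented for illustration, a plain dict stands in for the disk, and non-volatility is assumed rather than simulated. Flushing in block-number order is a choice made in this sketch, not something the slide states.

```python
class WriteBuffer:
    """Toy model of a non-volatile write buffer sitting in front of a disk."""

    def __init__(self, disk, capacity):
        self.disk = disk            # dict: block number -> data (stand-in for the disk)
        self.capacity = capacity    # number of blocks the buffer can hold
        self.pending = {}           # block number -> latest data not yet on disk

    def write(self, block_no, data):
        # The writer returns as soon as the data is in the buffer.
        self.pending[block_no] = data
        if len(self.pending) >= self.capacity:
            self.flush()            # buffer is full: write it out

    def read(self, block_no):
        # Buffered writes must stay visible to later reads.
        if block_no in self.pending:
            return self.pending[block_no]
        return self.disk.get(block_no)

    def on_idle(self):
        # To be called when no other disk requests are pending.
        self.flush()

    def flush(self):
        # Sketch-level choice: write in block-number order.
        for block_no in sorted(self.pending):
            self.disk[block_no] = self.pending[block_no]
        self.pending.clear()
```

Repeated writes to the same block overwrite each other in the buffer, so only the latest version reaches the disk.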

5
Optimization of Disk-Block Access: Methods
  • File organization (clustering): reduce access
    time by organizing blocks on disk in a way that
    corresponds closely to the way we expect them to
    be accessed
  • sequential files should be kept organized
    sequentially
  • hierarchical files should be organized with
    mothers next to daughters
  • for joining tables (relations) put the joining
    tuples next to each other
  • over time, fragmentation can become an issue
  • restoring the disk structure (copying and
    rewriting the blocks in order) controls
    fragmentation

6
Optimization of Disk-Block Access: Methods
  • Log-based file system
  • does not update in-place, rather writes updates
    to a log disk
  • essentially, a disk functioning as a non-volatile
    RAM write buffer
  • all access to the log disk is sequential,
    eliminating seek time
  • eventually updates must be propagated to the
    original blocks
  • as with NV-RAM write buffers, this can occur at a
    time when no disk requests are pending
  • the updates can be ordered to minimize arm
    movement
  • this can generate a high degree of fragmentation
    on files that require constant updates
  • fragmentation increases seek time for sequential
    reading of files

7
Storage Access (11.5)
  • Basic concepts (some already familiar)
  • block-based: a block is a contiguous sequence of
    sectors from a single track; blocks are units of
    both storage allocation and data transfer
  • a file is a sequence of records stored in
    fixed-size blocks (pages) on the disk
  • each block (page) has a unique address called
    its BID
  • optimization is done by reducing I/O, seek time,
    etc.
  • database systems seek to minimize the number of
    block transfers between the disk and memory. We
    can reduce the number of disk accesses by keeping
    as many blocks as possible in main memory.
  • Buffer - portion of main memory used to store
    copies of disk blocks
  • buffer manager - subsystem responsible for
    allocating buffer space in main memory and
    handling block transfer between buffer and disk

8
Buffer Management
  • The buffer pool is the part of main memory
    allocated for temporarily storing disk blocks read
    from disk and made available to the CPU
  • The buffer manager is the subsystem responsible
    for the allocation and the management of the
    buffer space (transparent to users)
  • On a process (user) request for a block (page),
    the buffer manager:
  • checks to see if the page is already in the
    buffer pool
  • if so, passes the address to the process
  • if not, it loads the page from disk and then
    passes the address to the process
  • loading a page might require clearing (writing
    out) a page to make space
  • Very similar to the way virtual memory managers
    work, although the buffer manager can do a lot
    better (why?)
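
The request handling listed on this slide can be sketched as below. Everything here is an assumption made for illustration: the class name `BufferManager`, the dict standing in for the disk, the counters, and the use of LRU to pick the page to clear (the slide does not say which page is written out).

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer manager: a fixed-size pool of pages in front of a dict 'disk'."""

    def __init__(self, disk, capacity):
        self.disk = disk              # dict: page id -> data (stand-in for the disk)
        self.capacity = capacity      # number of pages the pool can hold
        self.pool = OrderedDict()     # page id -> [data, dirty], least recently used first
        self.disk_reads = 0
        self.disk_writes = 0

    def get_page(self, page_id):
        """Return the in-memory frame for page_id, loading it from disk if needed."""
        if page_id in self.pool:
            self.pool.move_to_end(page_id)     # already buffered: note the access
        else:
            if len(self.pool) >= self.capacity:
                self._evict()                  # make space first
            self.pool[page_id] = [self.disk[page_id], False]
            self.disk_reads += 1
        return self.pool[page_id]              # the "address" handed to the process

    def mark_dirty(self, page_id):
        """Record that the buffered copy of page_id was modified."""
        self.pool[page_id][1] = True

    def _evict(self):
        # Assumed policy for this sketch: drop the least recently used page.
        victim, (data, dirty) = self.pool.popitem(last=False)
        if dirty:                              # only modified pages need writing out
            self.disk[victim] = data
            self.disk_writes += 1
```

A caller modifies a page by changing `frame[0]` on the returned frame and then calling `mark_dirty`, so the change is written back when the page is cleared from the pool.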

9
Buffer Replacement Strategies
  • Most operating systems use an LRU replacement
    scheme. In database environments, MRU is better
    for some common operations (e.g., join)
  • LRU strategy: replace the least recently used
    block
  • MRU strategy: replace the most recently used
    block
  • Sometimes it is useful to fasten or pin blocks to
    keep them available during an operation and not
    let the replacement strategy touch them
  • a pinned block is thus a block that is not allowed
    to be written back to disk
  • There are situations where it is necessary to
    write back a block to disk even though the buffer
    space it occupies is not yet needed. This write
    is called the forced output of a block; it is
    useful in recovery situations
  • Toss-immediate strategy: free the space occupied
    by a block as soon as the final tuple of that
    block has been processed

10
Buffer Replacement Strategies
  • Most recently used (MRU) strategy: the system must
    pin the block currently being processed. After
    the final tuple of that block has been processed
    the block is unpinned and becomes the most
    recently used block. This is essentially
    toss-immediate with pinning, and works very
    well with joins.
  • The buffer manager can often use other
    information (design or statistical) to predict
    the probability that a request will reference a
    particular page
  • e.g., the data dictionary is frequently accessed
    -- keep the data dictionary blocks in main memory
    buffer
  • if several pages are available for overwrite,
    choose the one with the lowest number of
    recent access requests to replace
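
To see why MRU can beat LRU for a join, consider an inner relation that is scanned repeatedly and is slightly larger than the buffer. The sketch below counts block loads under each policy. It is a simplification: the function `count_misses` is hypothetical, pinning is not modelled, and the access pattern (4 blocks scanned 3 times with room for 3) is made up.

```python
from collections import OrderedDict


def count_misses(accesses, capacity, policy):
    """Count block loads for an access sequence under 'LRU' or 'MRU' replacement."""
    pool = OrderedDict()          # keys ordered from least to most recently used
    misses = 0
    for block in accesses:
        if block in pool:
            pool.move_to_end(block)           # hit: becomes most recently used
            continue
        misses += 1
        if len(pool) >= capacity:
            # last=True pops the most recently used block, last=False the least.
            pool.popitem(last=(policy == "MRU"))
        pool[block] = True
    return misses


# The inner relation's blocks 0..3 are scanned three times; the pool holds 3 blocks.
scans = [0, 1, 2, 3] * 3
print("LRU loads:", count_misses(scans, 3, "LRU"))
print("MRU loads:", count_misses(scans, 3, "MRU"))
```

On this pattern LRU loads a block on every one of the 12 accesses, because it always evicts the block that the scan will need soonest, while MRU needs only 6 loads.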

11
Buffer Management (cont)
  • An existing OS affects DBMS operations through:
  • read ahead, write behind
  • wrong replacement strategies
  • Unix is not good for a DBMS to run on top of
  • Most commercial systems implement their own I/O
    on a raw disk partition
  • Variations of buffer allocation
  • common buffer pool for all relations
  • separate buffer pool for each relation
  • as above but with relations borrowing space from
    each other
  • prioritized buffers for very frequently accessed
    blocks, e.g. data dictionary

12
Buffer Management (cont)
  • For each buffer the manager keeps the following:
  • which disk and which block it holds
  • whether the block is dirty (has been modified) or
    not (why?)
  • information for the replacement strategy
  • last time block was accessed
  • whether it is pinned
  • possible statistical information (access
    frequency etc.)
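
One possible way to hold the per-buffer bookkeeping listed above is a small record per frame. The field names and the `choose_victim` helper are assumptions for this sketch; the victim rule follows the earlier suggestion of replacing the page with the fewest recent accesses, with the oldest access time as a tie-breaker added here.

```python
from dataclasses import dataclass


@dataclass
class BufferFrame:
    disk_id: int            # which disk the block came from
    block_no: int           # which block on that disk
    dirty: bool = False     # modified since it was read, so it must be written back
    pinned: bool = False    # pinned frames may not be replaced
    last_access: int = 0    # logical time of the last access
    access_count: int = 0   # simple access-frequency statistic


def choose_victim(frames):
    """Pick the unpinned frame with the fewest accesses, or None if all are pinned."""
    candidates = [f for f in frames if not f.pinned]
    if not candidates:
        return None
    return min(candidates, key=lambda f: (f.access_count, f.last_access))


frames = [
    BufferFrame(0, 10, pinned=True, access_count=1),
    BufferFrame(0, 11, access_count=5, last_access=9),
    BufferFrame(0, 12, dirty=True, access_count=2, last_access=7),
]
victim = choose_victim(frames)
print(victim.block_no, "needs write-back" if victim.dirty else "can be dropped")
```

The dirty flag is what tells the manager whether the chosen frame must be written to disk before its space is reused.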

13
Buffer Management and Disk-block Access Optimization (end)
  • Disk-block access methods must take care of some
    information within each block, as well as
    information about each block
  • allocate records (tuples) within blocks
  • support record addressing by address and by
    value
  • support auxiliary (secondary indexing) file
    structures for more efficient processing
  • These concerns lead into our next topic:
    file organization.