Title: Chapter 11: Storage and File Structure
1. Chapter 11: Storage and File Structure
- Overview of Physical Storage Media
- Redundant Arrays of Independent Disks (RAID)
- Buffer Management
- File Organization
2. Properties of Physical Storage Media
- Types of storage media differ in terms of
- Speed of data access
- Cost per unit of data
- Reliability
- Data loss on power failure or system (software) crash
- Physical failure of the storage device
- → What is the value of the data (e.g., Maxine's Salon)?
3. Classification of Storage Media
- Storage can be classified as
- Volatile
- Content is lost when power is switched off
- Includes primary storage (cache, main-memory)
- Non-volatile
- Contents persist even when power is switched off
- Includes secondary and tertiary storage, as well
as battery-backed up RAM
4. Storage Hierarchy, Cont.
- Storage can also be classified as
- Primary Storage
- Fastest media
- Volatile
- Includes cache and main memory
- Secondary Storage
- Moderately fast access time
- Non-volatile
- Includes flash memory and magnetic disks
- Also called on-line storage
- Tertiary Storage
- Slow access time
- Non-volatile
- Includes magnetic tape and optical storage
- Also called off-line storage
5. Primary Storage
- Cache
- Volatile
- Fastest and most costly form of storage
- Managed by the computer system hardware
- Main memory (also known as RAM)
- Volatile
- Fast access (10s to 100s of nanoseconds; 1 nanosecond = 10⁻⁹ seconds)
- Generally too small (or too expensive) to store the entire database
- Capacities of up to a few gigabytes
- Capacities have gone up, while costs have gone down
- → The above comments notwithstanding, main-memory databases do exist
6. Physical Storage Media, Cont.
- Flash memory
- Non-volatile
- Widely used in embedded devices such as digital cameras
- Data can be written at a location only once, but the location can be erased and written to again
- Can support only a limited number of write/erase cycles
- Erasing of memory has to be done to an entire bank of memory
- Reads are roughly as fast as main memory
- Writes are slow (a few microseconds), erase is slower
- Cost per unit of storage roughly similar to main memory
- Also known as Electrically Erasable Programmable Read-Only Memory (EEPROM)
7. Physical Storage Media, Cont.
- Magnetic disk
- Non-volatile
- Survives power failures and system crashes
- Disk failure can destroy data, but this is very rare
- Primary medium for database storage
- Typically stores the entire database
- Capacities range up to roughly 100 GB currently
- Much larger capacity than main or flash memory
- Growing constantly and rapidly with technology improvements
- Direct-access, i.e., data on disk can be accessed in any order, unlike magnetic tape; also called random-access
8. Physical Storage Media, Cont.
- Optical storage
- Non-volatile; data is read optically from a spinning disk using a laser
- CD-ROM (640 MB) and DVD (4.7 to 17 GB) are the most popular forms
- Write-once, read-many (WORM) optical disks used for archival storage (CD-R and DVD-R)
- Multiple-write versions available (CD-RW, DVD-RW, and DVD-RAM)
- Reads and writes are slower than with magnetic disk
9. Physical Storage Media, Cont.
- Tape storage
- Non-volatile
- Sequential-access; much slower than a disk
- Very high capacity (40 to 300 GB tapes available)
- Storage costs are much cheaper than for a disk, but high-quality drives can be very expensive
- Used primarily for backup (to recover from disk failure), archiving, and transfer of large quantities of data
10. Physical Storage Media, Cont.
- Juke-boxes
- For storing massive amounts of data
- Hundreds of terabytes (1 terabyte = 10¹² bytes), petabytes (1 petabyte = 10¹⁵ bytes) or exabytes (1 exabyte = 10¹⁸ bytes)
- Large number of removable disks or tapes
- A mechanism for automatic loading/unloading of disks or tapes
11. Storage Hierarchy
12. Magnetic Hard Disk Mechanism
NOTE: The diagram is schematic and simplifies the structure of actual disk drives.
13. Magnetic Disks
- Disk assembly consists of
- A single spindle that spins continually (typically at 7,500 or 10,000 RPM)
- Multiple disk platters (typically 2 to 4)
- Surface of each platter is divided into circular tracks
- Over 16,000 tracks per platter on typical hard disks
- Each track is divided into sectors
- A sector is the smallest unit of data that can be read or written
- Sector size is typically 512 bytes
- Typical sectors per track: 200 (on inner tracks) to 400 (on outer tracks)
- Head-disk assemblies
- One head per platter, mounted on a common arm, each very close to its platter
- Reads or writes magnetically encoded information
- Cylinder i consists of the ith track of all the platters
14. Magnetic Disks, Cont.
- To read/write a sector
- disk arm swings to position the head on the right track
- sector rotates under the read/write head
- data is read/written as the sector passes under the head
- → Disks are the primary performance bottleneck in a database system, in part because of the need for physical movement
15. Performance Measures of Disks
- Access time - the time it takes from when a read or write request is issued to when data transfer begins
- Access time consists of two parts
- Seek time - the time it takes to reposition the arm over the correct track
- 4 to 10 milliseconds on typical disks
- Rotational latency - the time it takes for the sector to be accessed to appear under the head
- 4 to 11 milliseconds on typical disks (5,400 to 15,000 r.p.m.)
16. Performance Measures of Disks, Cont.
- Data transfer rate - the rate at which data can be retrieved from or stored to the disk
- 25 to 100 MB per second is typical
- Multiple disks may share an interface controller, so the rate at which the interface controller can handle data is also important
- ATA-5: 66 MB/s, SCSI-3: 40 MB/s, Fibre Channel: 256 MB/s
17. Performance Measures (Cont.)
- Mean time to failure (MTTF) - the average time the disk is expected to run continuously without any failure
- Typically 3 to 5 years
- Probability of failure of new disks is quite low
- MTTF of 30,000 to 1,200,000 hours for a new disk
- An MTTF of 1,200,000 hours for a new disk means that given 1,000 relatively new disks, on average one will fail every 1,200 hours
- MTTF decreases as the disk ages
18. Magnetic Disks (Cont.)
- The term controller is used (primarily) in two ways, both in the book and in other literature; the book does not distinguish between the two very well.
- Disk controller (use 1)
- Packaged within the disk
- Accepts high-level commands to read or write a sector
- Initiates actions such as moving the disk arm to the right track and actually reading or writing the data
- Computes and attaches checksums to each sector for reliability
- Performs re-mapping of bad sectors
19. Disk Subsystem
- Interface controller (use 2)
- Multiple disks are typically connected to the system bus through an interface controller (a.k.a. a host adapter)
- Many functions (checksums, bad-sector re-mapping) are often carried out by individual disk controllers, which reduces the load on the interface controller
20. Disk Subsystem
- The distribution of work between the disk controller and the interface controller depends on the interface standard.
- Disk interface standard families
- ATA (AT Attachment)/IDE
- SCSI (Small Computer System Interface)
- Fibre Channel, etc.
- Several variants of each standard (different speeds and capabilities)
- One computer can have many interface controllers, of the same or different types
- → Like disks, interface controllers can also be a performance bottleneck.
21. Techniques for Optimization of Disk-Block Access
- Several techniques are employed to minimize the negative effects of disk and controller bottlenecks
- File organization
- Organize data on disk based on how it will be accessed
- Store related information (e.g., data in the same file) on the same or nearby cylinders
- A disk may get fragmented over time
- As data is inserted into and deleted from the disk, free slots on the disk tend to become scattered
- Data items (e.g., files) are then broken up and scattered over the disk
- Sequential access to fragmented data results in increased disk arm movement
- Most DBMSs have utilities to de-fragment the file system
22. Techniques for Optimization of Disk-Block Access
- Block size adjustment
- A block is a contiguous sequence of sectors from a single track
- A DBMS transfers data between disk and main memory in blocks
- Sizes range from 512 bytes to several kilobytes
- Smaller blocks: more transfers from disk
- Larger blocks may waste space due to partially filled blocks
- Typical block sizes range from 4 to 16 kilobytes
- Block size can be adjusted to accommodate the workload
- Disk-arm-scheduling algorithms
- Order pending accesses so that disk arm movement is minimized
- One example is the elevator algorithm (see the sketch below)
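A minimal sketch (not from the slides) of the elevator (SCAN) idea: pending track requests are serviced in the direction the arm is already moving, then the arm reverses. The track numbers and request queue are made-up illustrative values.

def elevator_order(pending_tracks, head, moving_up=True):
    """Return the order in which pending track requests are serviced:
    sweep in the current direction, then reverse and sweep back."""
    up = sorted(t for t in pending_tracks if t >= head)
    down = sorted((t for t in pending_tracks if t < head), reverse=True)
    return up + down if moving_up else down + up

# Example: head at track 50, arm currently moving toward higher tracks.
print(elevator_order([98, 27, 64, 12, 55], head=50))
# -> [55, 64, 98, 27, 12]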
23. Techniques for Optimization of Disk-Block Access, Cont.
- Log disk
- A disk devoted to the transaction log
- Writes to a log disk are very fast since no seeks are required
- Often provided with battery back-up
- Non-volatile write buffers/RAM
- Battery-backed-up RAM or flash memory
- Typically associated with a disk or storage device
- Blocks to be written are first written to the non-volatile RAM buffer
- The disk controller then writes the data to disk whenever the disk has no other requests, or when the request has been pending for some time
- Allows processes to continue without waiting on write operations
- Data is safe even in the event of power failure
- Allows writes to be reordered to minimize disk arm movement
24. RAID
- Redundant Arrays of Independent Disks (RAID)
- Disk organization techniques that exploit a collection of disks
- Speed, reliability, or both are increased
- The collection appears as a single disk
- Originally, the "I" stood for inexpensive disks
- Using multiple disks increases the risk of failure
- A system with 100 disks, each with an MTTF of 100,000 hours (approx. 11 years), will have a system MTTF of 1,000 hours (approximately 41 days)
- Techniques that use redundancy to avoid data loss are therefore critical
25. Improvement of Reliability via Redundancy
- RAID improves reliability by using two techniques
- Parity information
- Mirroring
- Parity information (basic idea)
- Makes use of the exclusive-or (XOR) operator, e.g.:
- 1 ⊕ 1 ⊕ 0 ⊕ 1 = 1
- 0 ⊕ 1 ⊕ 1 ⊕ 0 = 0
- Enables the detection of single-bit errors (a sketch of XOR parity follows below)
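A minimal sketch (not from the slides) of how XOR parity is used in RAID: the parity block is the bytewise XOR of the data blocks, and any one lost block can be rebuilt by XORing the parity with the surviving blocks. The block contents are made up.

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

data = [b"disk0 data", b"disk1 data", b"disk2 data"]   # made-up contents
parity = xor_blocks(data)

# Simulate losing disk 1 and recovering its contents from the other disks.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]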
26. Improvement of Reliability via Redundancy
- Mirroring
- Duplicate every disk
- Every write is carried out on both disks
- If one disk in a pair fails, the data is still available
- Reads can take place from either disk and, in particular, from both disks at the same time
- Data loss would occur only if a disk fails, and its mirror disk also fails before the system is repaired
- The probability of both events is very small.
- Mean time to data loss depends on the mean time to failure and the mean time to repair; for a mirrored pair it is roughly MTTF² / (2 × MTTR).
- If MTTF is 100,000 hours and mean time to repair is 10 hours, then mean time to data loss is 100,000² / (2 × 10) = 500 × 10⁶ hours (about 57,000 years) for a mirrored pair of disks.
27. Improvement in Performance via Parallelism
- RAID improves performance mainly through the use of parallelism.
- Two main goals of parallelism in a disk system
- Load-balance multiple small accesses to increase throughput (I/O rate).
- Parallelize large accesses to reduce response time (transfer rate).
- Parallelism is achieved through striping.
- Striping can be done at varying levels
- bit-level
- block-level (variable block size)
28. Improvement in Performance via Parallelism
- Bit-level striping
- Split the bits of each byte across multiple disks
- In an array of eight disks, write bit i of each byte to disk i.
- In theory, each access can read data at eight times the rate of a single disk.
- But seek/access time is worse than for a single disk.
- Not commonly used, in particular as described.
- Block-level striping
- With n disks, block i of a file goes to disk (i mod n) + 1 (see the sketch below)
- Requests for different blocks can run in parallel if the blocks reside on different disks (improves throughput).
- A request for a long sequence of blocks can use all disks in parallel (improves response time).
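A minimal sketch of the block-level striping rule on this slide: with n disks, block i of a file is mapped to disk (i mod n) + 1. Disk numbering starting at 1 and block numbering at 0 are assumptions for illustration.

def disk_for_block(i, n):
    """Disk (1..n) holding logical block i under block-level striping."""
    return (i % n) + 1

n_disks = 4
for block in range(8):
    print(f"block {block} -> disk {disk_for_block(block, n_disks)}")
# Blocks 0..7 map to disks 1, 2, 3, 4, 1, 2, 3, 4 - so a long sequential
# read touches all four disks in parallel.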
29. RAID Levels
- Different RAID organizations, or RAID levels, have differing cost, performance, and reliability characteristics.
- Levels vary by source and vendor
- Standard levels: 0, 1, 2, 3, 4, 5, 6
- Nested levels: 01, 10, 50, 51
- Several other, non-standard, vendor-specific levels exist as well.
- Our book: 0, 1, 2, 3, 4, 5
- Performance and reliability are not linear in the level number.
- It is helpful to compare each level to every other level, and to the single-disk option.
30. RAID Levels
- RAID Level 0
- Block striping.
- Used in high-performance applications where data loss is not critical.
- RAID Level 1
- Mirrored disks with block striping.
- Offers the best write performance.
- Popular for applications such as storing log files.
- Sometimes called RAID 0+1, 1+0, or 10.
31. RAID Levels, Cont.
- RAID Level 2
- Memory-style error-correcting codes (ECC) with bit-level striping.
- Parity information is used to detect, locate, and correct errors.
- Outdated: disks contain embedded functionality to detect and locate sector errors, so only a single parity disk is needed (for correction).
32. RAID Levels, Cont.
- RAID Level 3
- Bit-level striping, with parity.
- To recover data on a damaged disk, compute the XOR of bits from the other disks (including the parity disk).
- Faster data transfer than with a single disk, but fewer I/Os per second since every disk has to participate in every I/O.
- When writing data, the corresponding parity bits must also be computed and written to the parity disk.
- Subsumes Level 2 (provides all its benefits, at lower cost).
33. RAID Levels (Cont.)
- RAID Level 4
- Block-level striping; keeps a parity block on a separate disk for corresponding blocks from the N other disks.
- To recover data on a damaged disk, compute the XOR of bits from the other disks (including the parity disk).
- When writing a data block, the corresponding block of parity bits must also be recomputed and written to the parity disk
- Can be done by using the old parity block, the old value of the current block, and the new value of the current block (2 block reads + 2 block writes), as sketched below
- Or by re-computing the parity value using the new values of all blocks corresponding to the parity block
- The latter is more efficient when writing large amounts of data sequentially
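A minimal sketch of the small-write parity update just described (applies to Levels 4 and 5): new parity = old parity XOR old data XOR new data, i.e. 2 block reads (old data, old parity) and 2 block writes (new data, new parity). The 4-byte blocks are made up.

def update_parity(old_parity, old_block, new_block):
    """Recompute the parity block after overwriting one data block."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_block, new_block))

d0, d1 = b"\x0f\x0f\x0f\x0f", b"\xf0\xf0\xf0\xf0"
parity = bytes(a ^ b for a, b in zip(d0, d1))          # parity over d0, d1
new_d1 = b"\xaa\xaa\xaa\xaa"
parity = update_parity(parity, d1, new_d1)             # small-write update
assert parity == bytes(a ^ b for a, b in zip(d0, new_d1))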
34. RAID Levels (Cont.)
- RAID Level 4 (Cont.)
- The parity disk becomes a bottleneck for independent block writes.
- Provides higher I/O rates for independent block reads than Level 3.
- Provides higher transfer rates for large, multi-block reads compared to a single disk.
35. RAID Levels (Cont.)
- RAID Level 5
- Block-level striping with distributed parity; partitions data and parity among all N + 1 disks, rather than storing data on N disks and parity on 1 disk.
- For example, with 5 disks, the parity block for the ith set of N blocks is stored on disk (i mod 5) + 1, with the data blocks stored on the other 4 disks.
36. RAID Levels (Cont.)
- RAID Level 5 (Cont.)
- Provides the same benefits as Level 4, but minimizes the parity-disk bottleneck.
- Higher I/O rates than Level 4.
- RAID Level 6
- P+Q redundancy scheme; similar to Level 5, but uses additional redundant information to guard against multiple disk failures.
- Better reliability than Level 5 at a higher cost; not used as widely.
37. Choice of RAID Level
- Factors in choosing a RAID level
- Monetary cost of disks, controllers, or RAID storage devices
- Reliability requirements
- Performance requirements
- Throughput vs. response time
- Short vs. long I/O
- Reads vs. writes
- During failure and rebuilding of a failed disk
38. Choice of RAID Level, Cont.
- RAID 0 is used only when (data) reliability is not important.
- Levels 2 and 4 are never used since they are subsumed by 3 and 5.
- Level 3 is not used (typically) since bit-level striping forces single-block reads to access all disks, wasting disk arm movement.
- Sometimes vendors advertise a version of Level 3 using byte-level striping.
- Level 6 is rarely used since Levels 1 and 5 offer adequate safety for almost all applications.
- So the competition is between 1 and 5 only.
39. Choice of RAID Level (Cont.)
- Level 1 provides much better write performance than Level 5.
- Level 5 requires at least 2 block reads and 2 block writes to write a single block, whereas Level 1 only requires 2 block writes.
- Level 1 is preferred for high-update environments.
- Level 1 has a higher storage cost than Level 5, however.
- I/O requirements have increased greatly, e.g., for Web servers.
- When enough disks have been bought to satisfy the required I/O rate, they often have spare storage capacity
- In that case there is often no extra monetary cost for Level 1!
- Level 5 is preferred for applications with a low update rate and large amounts of data.
- Level 1 is preferred for all other applications.
40. Other RAID Issues
- Hardware vs. software RAID.
- Local storage buffers.
- Hot-swappable disks.
- Spare components: disks, fans, power supplies, controllers.
- Battery backup.
41. Storage Access
- A DBMS will typically have several files allocated to it for storage, which are (usually) formatted and managed by the DBMS.
- In the simplest case, each file maps to either a disk or a disk partition.
- Each file is partitioned into blocks (a.k.a. pages).
- A block consists of one or more contiguous sectors.
- A block is the smallest unit of DBMS storage allocation.
- A block is the smallest unit of DBMS data transfer.
- Each block is partitioned into records.
- Each record is partitioned into fields.
42. Buffer Management
- The DBMS transfers blocks of data between RAM and disk, in a manner similar to an operating system.
- The DBMS seeks to minimize the number of block transfers between the disk and memory.
- The portion of main memory available to store copies of disk blocks is called the buffer.
- The DBMS subsystem responsible for allocating buffer space in main memory is called the buffer manager.
43. Buffer Manager Algorithm
- Programs call the buffer manager when they need a (disk) block.
- If a requested block is already in the buffer, then the requesting program is given the address of the block in main memory.
- If the block is not in the buffer, then the buffer manager
- Allocates space in the buffer for the block, throwing out some other block if required.
- Writes the thrown-out block to disk if it has been modified since the last time it was retrieved or written.
- Reads the block from disk into the buffer once space is available, and passes the address of the block in main memory to the requester.
- A sketch of this algorithm appears below.
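A minimal sketch (assumed names, not the book's code) of the buffer manager algorithm above: return an in-memory copy of a requested disk block, evicting and writing back a dirty block when the buffer is full.

class BufferManager:
    def __init__(self, capacity, disk):
        self.capacity = capacity          # max number of blocks in the buffer
        self.disk = disk                  # dict: block_id -> bytes (toy "disk")
        self.buffer = {}                  # block_id -> {"data": ..., "dirty": bool}

    def get_block(self, block_id):
        if block_id in self.buffer:                      # already buffered
            return self.buffer[block_id]["data"]
        if len(self.buffer) >= self.capacity:            # need to evict
            victim = self._choose_victim()               # replacement policy
            if self.buffer[victim]["dirty"]:             # write back if modified
                self.disk[victim] = self.buffer[victim]["data"]
            del self.buffer[victim]
        self.buffer[block_id] = {"data": self.disk[block_id], "dirty": False}
        return self.buffer[block_id]["data"]

    def _choose_victim(self):
        # Placeholder policy: evict an arbitrary block; a real buffer manager
        # would use LRU, MRU, toss-immediate, etc. (see the next slides).
        return next(iter(self.buffer))

disk = {1: b"block one", 2: b"block two", 3: b"block three"}   # toy disk
bm = BufferManager(capacity=2, disk=disk)
print(bm.get_block(1), bm.get_block(2), bm.get_block(3))       # 3rd read evicts a block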
44. Buffer-Replacement Policies
- How is a block selected for replacement?
- Most operating systems use Least Recently Used (LRU).
- The past pattern of block references is used to predict future references.
- In a DBMS, LRU can be a bad strategy for certain access patterns.
- Queries have well-defined access patterns (such as sequential scans), and so a DBMS can predict future references (an OS cannot).
- Mixed strategies are common.
- Statistical information can be exploited.
- Examples of statistical information:
- The data dictionary is frequently accessed, so keep those blocks in the buffer.
- Index blocks are used frequently, so keep them in the buffer.
- Store and use histograms on the distribution of column values.
45. Buffer-Replacement Policies, Cont.
- Some terminology...
- A memory block that has been loaded into the buffer from disk, and is not allowed to be replaced, is said to be pinned.
- At any given time a buffer block is in one of the following states:
- Unused (free): does not contain the copy of a block from disk.
- Used, but not pinned: contains the copy of a block from disk, which is available for replacement.
- Used and pinned: contains the copy of a block from disk, which is not available for replacement.
46. Buffer-Replacement Policies, Cont.
- Page replacement strategies
- Toss-immediate: frees the space in the buffer occupied by a block, typically done when the final tuple of the block has been used; writes the block out if necessary.
- Most recently used (MRU): the moment a disk block in the buffer becomes unpinned, it becomes the most recently used block; the most recently used unpinned block is the one replaced.
- Least recently used (LRU): of all unpinned disk blocks in the buffer, replace the one referenced furthest back in time.
47. Buffer-Replacement Policies, Cont.
- Consider the following query and query plan, chosen by the query optimizer, for implementing a join query between the borrower and customer relations.
- select customer-name, loan-number, street, city
- from borrower b, customer c
- where b.customer-name = c.customer-name
- for each tuple b of borrower do // scan borrower sequentially
- for each tuple c of customer do // scan customer sequentially
- if b.customer-name = c.customer-name then begin
- x.customer-name := b.customer-name
- x.loan-number := b.loan-number
- x.street := c.street
- x.city := c.city
- include x in the result of the query
- end if
- end for
- end for
48. Buffer-Replacement Policies, Cont.
- Suppose that borrower consists of blocks b1, b2, and b3, that customer consists of blocks c1, c2, c3, c4, and c5, and that the buffer can fit five (5) blocks total.
- LRU example
- Read in a block of borrower, and keep it in the buffer until the last tuple in that block has been processed (pinned).
- Once all of the tuples from that block have been processed, free up the buffer space used by that block immediately (toss-immediate).
- If a customer block is to be brought into the buffer, and another block must be moved out in order to make room, then move out the least recently used block (borrower or customer).
49. Buffer-Replacement Policies, Cont.
- Applying the LRU replacement algorithm results in the following buffer contents for the first two tuples of b1 (each row is one buffer slot, listed from the point at which it is first filled):
- b1 b1 b1 b1 b1 b1 b1 b1 b1 b1 b1
- c1 c1 c1 c1 c5 c5 c5 c5 c4 c4
- c2 c2 c2 c2 c1 c1 c1 c1 c5 .......
- c3 c3 c3 c3 c2 c2 c2 c2
- c4 c4 c4 c4 c3 c3 c3
- Notice that each time a customer block is read into the buffer, the block replaced is the next block required!
- This results in a total of six (6) page replacements.
50. Buffer-Replacement Policies, Cont.
- MRU example
- Read in a block of borrower, and keep it in the buffer until the last tuple in that block has been processed (pinned).
- Once all of the tuples from that block have been processed, free up the buffer space used by that block immediately (toss-immediate).
- If a customer block is to be brought into the buffer, and another block must be moved out in order to make room, then move out the most recently used block (borrower or customer).
51. Buffer-Replacement Policies, Cont.
- Applying the MRU replacement algorithm results in the following buffer contents for the first two tuples of b1 (each row is one buffer slot, listed from the point at which it is first filled):
- b1 b1 b1 b1 b1 b1 b1
- c1 c1 c1 c1 c1 c1
- c2 c2 c2 c2 c2 .......
- c3 c3 c3 c4
- c4 c5 c5
- This results in a total of two (2) page replacements (a small simulation sketch follows below).
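A minimal sketch (not from the slides) that simulates the scenario above: a 5-block buffer, b1 pinned, and customer blocks c1..c5 scanned once per tuple of b1. It counts page replacements (evictions) under LRU and MRU for the first two tuples of b1.

def count_replacements(policy, passes=2, capacity=5):
    pinned = {"b1"}
    buffer = {"b1": 0}            # block -> time of last use
    time, replacements = 0, 0
    refs = [f"c{i}" for i in range(1, 6)] * passes   # c1..c5 per tuple of b1
    for block in refs:
        time += 1
        if block not in buffer:                       # miss
            if len(buffer) >= capacity:               # need to evict
                candidates = {b: t for b, t in buffer.items() if b not in pinned}
                pick = max if policy == "MRU" else min
                victim = pick(candidates, key=candidates.get)
                del buffer[victim]
                replacements += 1
        buffer[block] = time                          # load / touch block
    return replacements

print(count_replacements("LRU"))   # -> 6, as on the LRU slide
print(count_replacements("MRU"))   # -> 2, as on the MRU slide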
52. File Organization
- Several standard low-level storage DBMS issues
- Avoiding large numbers of block reads
- Algorithmically, the bar has been raised
- Records crossing block boundaries
- Dangling pointers
- Disk fragmentation
- Allocating free space
- Other issues exist.
- Partial solutions to the above will be presented
in the following.
53. File Organization
- Recall
- A database is stored as a collection of files.
- A file consists of a sequence of blocks.
- A block consists of a sequence of records.
- A record consists of a sequence of fields.
- Initial assumptions
- Record size is fixed.
- Each file has records of one particular type only.
- Different files are used for different relations.
- Tuples are stored in an arbitrary order, i.e., unsorted.
- This case is the easiest to deal with; variable-length records will be considered later.
54. Fixed-Length Records
- Simple approach
- Store record i starting from byte n × (i − 1), where n is the size of each record.
- Record access is simple.
- Technically, records could cross block boundaries, but this can be fixed.
- Insertion is easy
- Add each new record to the end
- Deletion of record i, options:
- Move records i + 1, ..., n to i, ..., n − 1 (shift)
- Move record n to i
- What problems arise with each?
- → Don't move records, but link all free records on a free list (see the sketch after the next slide).
55. Free Lists
- Store the address, i.e., a pointer, of the first deleted record in the file header.
- Store a pointer in that first deleted record to the second deleted record, etc.
- Improvement: reuse the space of a normal attribute in the deleted record to store the pointers.
- Records in the file could also be connected with pointers for efficient scanning.
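A minimal sketch (assumed structure, not the book's code) of a fixed-length record file with a free list: deleted slots are chained together from a header pointer, and insertions reuse the first free slot instead of moving records.

class FixedLengthFile:
    FREE = object()                       # marker for a deleted slot

    def __init__(self):
        self.records = []                 # slot i holds record i (fixed size)
        self.free_head = None             # header: index of first free slot
        self.free_next = {}               # slot -> next free slot (reused attribute space)

    def delete(self, i):
        self.records[i] = self.FREE       # don't move records
        self.free_next[i] = self.free_head
        self.free_head = i                # new head of the free list

    def insert(self, record):
        if self.free_head is None:        # no free slot: append at the end
            self.records.append(record)
            return len(self.records) - 1
        slot = self.free_head             # reuse the first free slot
        self.free_head = self.free_next.pop(slot)
        self.records[slot] = record
        return slot

f = FixedLengthFile()
for name in ["Adams", "Brown", "Curry"]:
    f.insert(name)
f.delete(1)
print(f.insert("Davis"))   # -> 1, the freed slot is reused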
56. Variable-Length Records
- Variable-length records arise in databases in several ways
- Storage of multiple record types in a file
- Record types that allow variable lengths for one or more fields
- Record types that allow repeating fields
- Options for storing variable-length records
- Byte-string
- Reserved space
- Pointers
- Slotted page structure
57. Variable-Length Records, Cont.
- Byte-string representation
- Attach an end-of-record character to the end of each record.
- Could also combine this with pointers for free and non-free records.
- Deletion and insertion result in fragmentation.
- Difficulty with individual record growth.
- Reserved space: can use fixed-length records of a known maximum length; unused space in shorter records is filled with a null or end-of-record symbol.
- Once again, could combine this with pointers for free and non-free records.
58. Variable-Length Records, Cont.
- Pointer method
- A variable-length record is represented by a list of fixed-length records, chained together via pointers.
- Can be used even if the maximum record length is not known.
- Once again, could combine this with pointers for free and non-free records.
59. Variable-Length Records, Cont.
- Disadvantages of the pointer method
- Records are more likely to cross block boundaries
- Space is wasted in all records except the first in a chain
- A partial solution is to allow two kinds of blocks in a file
- Anchor blocks: contain the first record of each chain
- Overflow blocks: contain records other than those that are first in a chain
- Records are still more likely to cross block boundaries.
60. Variable-Length Records: the Slotted Page Structure
- The slotted page (i.e., block) header contains
- Number of record entries
- Pointer to the end of free space in the block
- Location and size of each record
- Records can be moved within a page to eliminate empty space
- The entry in the header must then be updated.
- Pointers from outside the block do not point directly to a record, but rather to the corresponding entry in the header (see the sketch below).
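A minimal sketch (assumed layout, not the book's code) of a slotted page: the header holds the entry count, the end of free space, and an (offset, size) slot per record; record data grows from the end of the block toward the header, and external pointers refer to slot numbers rather than byte offsets, so records can be moved within the page by updating only the header.

class SlottedPage:
    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.free_end = block_size        # end of free space (records grow downward)
        self.slots = []                   # header entries: (offset, size) per record

    def insert(self, record: bytes) -> int:
        """Store a record and return its slot number (the external pointer)."""
        header_size = 8 + 8 * (len(self.slots) + 1)     # rough header estimate
        if self.free_end - len(record) < header_size:
            raise ValueError("page full")
        self.free_end -= len(record)
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1

    def locate(self, slot: int):
        """Return (offset, size) of the record from the header entry."""
        return self.slots[slot]

page = SlottedPage()
s = page.insert(b"Perryridge|A-102|400")
print(page.locate(s))     # e.g. (4075, 21)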
61. Organization of Records in Files
- How are records assigned to blocks within a specific file?
- Heap: a record is placed anywhere in the file where there is space.
- Sequential: records are stored in sequential order, based on the value of the search key of each record.
- Hashing: a hash value is computed from some attribute(s) of each record, which specifies the block where the record is placed (see the sketch below).
- Clustering: records of different tables are stored in the same file.
- The motivation is to store related records on the same block to minimize I/O.
- The preceding slides covered heap organization.
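A minimal sketch (assumed scheme, not the book's code) of hashing file organization: a hash of the search-key attribute determines the block in which a record is stored, so an equality lookup goes straight to one block. The number of blocks and key values are made up.

import hashlib

NUM_BLOCKS = 16   # assumed number of blocks in the file

def block_for(search_key: str) -> int:
    """Block number where records with this search-key value are placed."""
    digest = hashlib.sha256(search_key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BLOCKS

print(block_for("Perryridge"))   # every record with this key value
print(block_for("Brighton"))     # hashes to the same block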
62. Sequential File Organization
- Suitable for applications that require sequential processing of the entire file.
- The records in the file are ordered by a search key.
63. Sequential File Organization (Cont.)
- Deletion: must maintain pointer chains.
- Insertion: locate the position where the record is to be inserted
- if there is free space, insert there
- if there is no free space, insert the record in an overflow block
- In either case, the pointer chain must be updated
- Need to reorganize the file from time to time to restore sequential order.
64. Clustering File Organization
- A simple file structure stores each relation in a separate file
- A clustering file organization stores several relations in one file
- Example: clustered organization of customer and depositor
- Good for queries involving the join of depositor and customer, and for queries involving one single customer and his accounts
- Bad for queries involving only customer
- Results in variable-size records
65. Data Dictionary Storage
- The data dictionary (a.k.a. system catalog) stores metadata
- Information about relations
- names of relations
- names and types of attributes
- names and definitions of views
- integrity constraints
- Physical file organization information
- How relation is stored (sequential, hash, etc.)
- Physical location of relation
- operating system file name, or
- disk addresses of blocks containing the records
- Statistical and descriptive data
- number of tuples in each relation
- User and accounting information, including passwords
- Information about indices (Chapter 12)
66. Data Dictionary Storage (Cont.)
- Catalog structure can use either
- specialized data structures designed for efficient access
- a set of relations, with existing system features used to ensure efficient access
- The latter alternative is usually preferred
- A possible catalog representation:
- Relation-metadata (relation-name, number-of-attributes, storage-organization, location)
- Attribute-metadata (attribute-name, relation-name, domain-type, position, length)
- User-metadata (user-name, encrypted-password, group)
- Index-metadata (index-name, relation-name, index-type, index-attributes)
- View-metadata (view-name, definition)
67. Large Objects
- Large objects
- text documents
- images
- computer aided designs
- audio and video data
- Large objects may need to be stored in a contiguous sequence of bytes when brought into memory.
- If an object is bigger than a block, contiguous blocks in the buffer must be allocated to store it.
- It might be preferable to disallow direct access to the data, and only allow access through a file-system-like API, to remove the need for contiguous storage.
- Most vendors provide supporting types
- binary large objects (blobs)
- character large objects (clobs)
- text, image, etc.
68. Modifying Large Objects
- If the application requires insertion or deletion of bytes at specified regions of an object:
- B-tree file organization (described later, in Chapter 12) can be modified to represent large objects
- Each leaf block of the tree stores between half a block and one block worth of data from the object
- Special-purpose application programs outside the database are frequently used to manipulate large objects
- Text data is treated as a byte string manipulated by editors and formatters.
- Graphical data and audio/video data are typically created and displayed by separate applications.
69. End of Chapter