I/O-Algorithms - PowerPoint PPT Presentation


1
I/O-Algorithms
Lars Arge January 31, 2005
2
Random Access Machine Model
  • Standard theoretical model of computation
  • Infinite memory
  • Uniform access cost

3
Hierarchical Memory
  • Modern machines have complicated memory hierarchy
  • Levels get larger and slower further away from
    CPU
  • Levels have different associativity and
    replacement strategies
  • Large access time amortized using block transfer
    between levels
  • Bottleneck is often the transfers between the largest
    memory levels in use

4
I/O-Bottleneck
  • I/O is often the bottleneck when handling massive
    datasets
  • Disk access is 10^6 times slower than main memory
    access
  • Large transfer block size (typically 8-16 Kbytes)
  • Important to obtain locality of reference
  • Need to store and access data to take advantage
    of blocks

5
Massive Data
  • Massive datasets are being collected everywhere
  • Storage management software is a billion-dollar industry
  • Examples
  • Phone: AT&T 20TB phone call database, wireless
    tracking
  • Consumer: WalMart 70TB database, buying patterns
  • Web: Web crawl of 200M pages and 2000M links,
    Akamai stores 7 billion clicks per day
  • Geography: NASA satellites generate 1.2TB per day

6
Example: Grid Terrain Data
  • Appalachian Mountains (800km x 800km)
  • 500MB at 100m resolution
  • 5.5GB at 30m resolution
  • NASA SRTM mission acquired 30m data for 80% of the
    Earth's land mass
  • 50GB at 10m resolution (some of US available from
    USGS)
  • 5TB at 1m resolution
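
The sizes listed above can be roughly reproduced with simple arithmetic. The sketch below assumes 8 bytes per grid cell; that figure is not stated on the slide and is only inferred because it makes the numbers line up approximately. The function name `grid_bytes` is hypothetical.

```python
# Rough storage estimate for a square grid terrain.
# ASSUMPTION: 8 bytes per grid cell (inferred, not stated on the slide).
BYTES_PER_CELL = 8


def grid_bytes(side_km: float, resolution_m: float,
               bytes_per_cell: int = BYTES_PER_CELL) -> int:
    """Approximate bytes needed for a side_km x side_km grid."""
    cells_per_side = round(side_km * 1000 / resolution_m)
    return cells_per_side * cells_per_side * bytes_per_cell


if __name__ == "__main__":
    # Print the estimate for each resolution mentioned on the slide.
    for res in (100, 30, 10, 1):
        print(f"{res:>4} m resolution: {grid_bytes(800, res) / 1e9:,.1f} GB")
```

Under this assumption the 100m grid has 8000 x 8000 cells, and each tenfold improvement in resolution multiplies the size by 100, which is why a 1m grid lands in the terabyte range.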

7
I/O-Model
  • Parameters
  • N = number of elements in problem instance
  • B = number of elements that fit in a disk block
  • M = number of elements that fit in main memory
  • K = output size in searching problems
  • We often assume that M > B^2
  • I/O: movement of a block between memory and disk

[Figure: processor P, main memory M, and disk D, with block I/O between memory and disk]
8
List Ranking
(Figure label: B = M/B = 2)
  • Trivial internal memory algorithm takes O(N) time
    and causes O(N) page faults in external memory
  • O(N/B) is the number of I/Os we need to read N
    elements
  • Difference between N and N/B is extremely
    important in practice
  • Can we develop an O(N/B) algorithm?
  • Answer is NO
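
The gap between N and N/B can be illustrated with a small simulation. This is a minimal sketch, not the model from the slides: it assumes an LRU cache of M/B blocks (the I/O model itself lets the algorithm choose what to keep in memory), and the values of N, B, and M are arbitrary illustrative choices. It counts block transfers for a sequential scan and for following a list whose nodes are scattered randomly on disk.

```python
import random
from collections import OrderedDict


def count_ios(access_order, B, M):
    """Count block transfers for a sequence of element indices.

    ASSUMPTION: memory holds M // B blocks with LRU replacement.
    """
    capacity = M // B
    cache = OrderedDict()  # block id -> True, ordered by recency of use
    ios = 0
    for idx in access_order:
        block = idx // B
        if block in cache:
            cache.move_to_end(block)       # hit: mark as most recently used
        else:
            ios += 1                       # miss: one block I/O
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used block
    return ios


# Illustrative parameters (hypothetical, not from the slides).
N, B, M = 100_000, 100, 1_000

sequential = list(range(N))   # array-like layout: successor is adjacent
scattered = sequential[:]     # list nodes placed at random disk positions
random.Random(0).shuffle(scattered)

scan_ios = count_ios(sequential, B, M)
list_ios = count_ios(scattered, B, M)

if __name__ == "__main__":
    print("sequential scan I/Os:", scan_ios)
    print("scattered list I/Os: ", list_ios)
```

The scan touches each block exactly once, so it uses N/B I/Os, while the scattered traversal misses the cache on nearly every step and approaches N I/Os.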

9
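Fundamental Bounds [AV88]
  •             Internal    External
  • Scanning    N           N/B
  • Sorting     N log N     (N/B) log_{M/B} (N/B)
  • Permuting   N           min{N, (N/B) log_{M/B} (N/B)}
  • List rank   N           (N/B) log_{M/B} (N/B)
  • Searching   log_2 N     log_B N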
  • Note
  • Permuting not linear
  • Permuting and sorting bounds are equal in all
    practical cases
  • B factor VERY important
  • Cannot sort optimally with search tree
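
To get a feel for these bounds, the standard external-memory expressions can be evaluated for concrete parameters: N/B for scanning, (N/B) log_{M/B} (N/B) for sorting, min{N, sort} for permuting, and log_B N for searching. The sketch below ignores constant factors, and the function names and the sample values of N, M, and B are hypothetical choices for illustration.

```python
import math


def scan_bound(N, B):
    """Scanning: N/B block transfers."""
    return N / B


def sort_bound(N, M, B):
    """Sorting: (N/B) * log_{M/B}(N/B), with the log clamped to at least 1."""
    n, m = N / B, M / B
    return n * max(1.0, math.log2(n) / math.log2(m))


def permute_bound(N, M, B):
    """Permuting: min{N, sort(N)}."""
    return min(N, sort_bound(N, M, B))


def search_bound(N, B):
    """Searching: log_B N."""
    return math.log2(N) / math.log2(B)


if __name__ == "__main__":
    # Illustrative parameters (assumed): 2^30 elements, 2^20 in memory,
    # 2^10 per block.
    N, M, B = 2**30, 2**20, 2**10
    print("scan   :", scan_bound(N, B))
    print("sort   :", sort_bound(N, M, B))
    print("permute:", permute_bound(N, M, B))
    print("search :", search_bound(N, B))
    print("N      :", N)
```

For these parameters the logarithm in the sorting bound is only 2, so sorting costs about twice a scan and is far below N, which matches the note that the B factor matters much more than the log factor.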

10
Sorting
  • Merge sort
  • Create N/M memory-sized sorted runs
  • Merge runs together M/B at a time
  • ⇒ O(log_{M/B} (N/M)) phases using
    O(N/B) I/Os each
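
The two steps of external merge sort can be sketched as follows. This is a simplified in-memory simulation, not a disk-based implementation: Python lists stand in for runs on disk, `heapq.merge` stands in for the multiway merge, and the function name `external_merge_sort` is hypothetical. It returns the number of merge phases so the phase count can be checked against the fan-in M/B.

```python
import heapq
import random


def external_merge_sort(data, M, B):
    """Simulate external merge sort; return (sorted_list, merge_phases).

    ASSUMPTION: lists stand in for runs stored on disk.
    """
    fan_in = max(2, M // B)  # number of runs merged at a time
    # Run formation: sort chunks of M elements (one memory load each).
    runs = [sorted(data[i:i + M]) for i in range(0, len(data), M)]
    phases = 0
    while len(runs) > 1:
        # One merge phase: every element is read and written once,
        # which corresponds to O(N/B) block transfers on a real disk.
        runs = [list(heapq.merge(*runs[i:i + fan_in]))
                for i in range(0, len(runs), fan_in)]
        phases += 1
    return (runs[0] if runs else []), phases


if __name__ == "__main__":
    rng = random.Random(1)
    values = [rng.randrange(10_000) for _ in range(1_000)]
    result, phases = external_merge_sort(values, M=10, B=2)
    print("sorted correctly:", result == sorted(values))
    print("merge phases:", phases)
```

With N = 1000, M = 10, and B = 2, run formation yields 100 runs and the fan-in is 5, so the runs shrink 100, 20, 4, 1 over three merge phases.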

11
Distribution Sort
12
Finding partition elements