Title: Jim Gray Talk at University of Tokyo
1 Jim Gray: Talk at University of Tokyo
- Personal views on the PITAC report: invest in long-term research
- Preview of Turing lecture: 10 long-term research problems
- Bush: Summarize info in cyberspace
- Turing: Intelligent computers
- 7 9s: build systems that are always up, and prove it.
- 5-Minute rule
- For disks
- For tapes
- Sorting Progress
- PennySort
- Terabyte Sort (!)
- Slides will be at http://research.Microsoft.com/Gray/talks
2 Presidential Advisory Committee on High Performance Computing and Communications, Information Technologies, and the Next Generation Internet
Information Technology
http://www.ccic.gov/ac/interim/ or
http://research.microsoft.com/Gray/papers/PITAC_Interim_Report_8_98.doc
3 Charter for the Committee: provide an independent assessment of
- High-Performance Computing and Communications (HPCC)
- Progress
- Balance among research components
- Next Generation Internet initiative
- Progress
- Balance
- IT Research and development
- Maintain United States leadership in
- IT and
- Applications
4 Committee Members
- Co-Chairs
- Bill Joy, Sun Microsystems; Ken Kennedy, Rice University
- Members
- Eric Benhamou, 3Com; Vinton Cerf, MCI
- Ching-chih Chen, Simmons; David Cooper, LLNL
- Steve Dorfman, Hughes; David Dorman, PointCast
- Bob Ewald, SGI; David Farber, U. of Pennsylvania
- Sherri Fuller, U. of Washington; Hector Garcia-Molina, Stanford
- Susan Graham, UC Berkeley; Jim Gray, Microsoft
- Danny Hillis, Disney, Inc; John Miller, Montana State Univ.
- David Nagel, AT&T; Raj Reddy, Carnegie Mellon
- Ted Shortliffe, Stanford; Larry Smarr, U. of Illinois at UC
- Joe Thompson, Miss. State U.; Les Vadasz, Intel
- Andy Viterbi, Qualcomm; Steve Wallach, Centerpoint
- Irving Wladawsky-Berger, IBM
5 My Summary of the Report
- 1/3 of the US economic growth since 1992 was in
the IT sector. IT is key to our health, wealth,
and safety.
- Created $400 B of wealth in the last 3 years (!!)
- Federal IT research funding of twenty years ago created the boom.
- Federal IT research funding for the last decade has been flat (in constant dollars).
- Research funding is increasingly near-term applied development.
- The committee recommends: increase long-term research funding in
- Software design and implementation technologies
- Technologies to scale the Next Generation
Internet to 6 billion users.
- Tools, algorithms, and systems for
high-performance computing.
- Spend a billion dollars over the next 5 years on
Lewis and Clark style "expeditions" into
cyberspace.
6 Myths
- Now that IT is a big business, industry will do long-term research.
- FACT:
- Industry spends LITTLE on long-term research.
- It is not in their best interest.
- IT research = buy computers for scientists.
- FACT:
- Computer science research is different from the application of computers to some discipline.
7 Research Priorities
- Findings
- Total federal information technology R&D investment is inadequate
- Federal IT R&D is excessively focused on near-term problems
- Recommendations
- Create a strategic initiative in long-term IT R&D
- Increase the investment for research in software, scalable information infrastructure, high-end computing, and socio-economic and workforce impacts
8 Software Research
- Findings
- Demand for software far exceeds the nation's ability to produce it
- The nation depends on fragile software
- Technologies to build reliable and secure
software are inadequate
- The nation is under-investing in fundamental
software research
- Recommendations
- Fund more fundamental research in software
development methods and component technologies
- Sponsor a national library of software
components
- Make software research a substantive component of
every major IT research initiative
- Support research in human-computer interfaces and
interaction
- Make fundamental software research an absolute
priority
9 Scalable Information Infrastructure
- Findings
- The Internet has grown well beyond the intent of
its original designers
- Our nation's dependence on the information infrastructure is increasing daily
- We cannot safely extend what we currently know to
more complex systems
- Learning how to build large-scale, highly
reliable and secure systems requires research
- Recommendations
- Increase funding in research and development of core software and communications technologies aimed directly at the challenge of scaling the information infrastructure
- Expand the Next Generation Internet test beds to include additional industry partnerships in order to foster the rapid commercialization and deployment of enabling technologies
10 High-End Computing
- Findings: HEC is
- essential for science and engineering research
- an element of the United States national security
- ripe for new applications
- suppliers suffer from unusual market pressures
- Research & Development Recommendations
- Fund innovative technologies and architectures
- Fund HEC software (parallel programming)
- Aim for a real application petaops by 2010 through both hardware and software strategies
- Fund HEC systems for science and engineering research
11 Social, Economic, Workforce Recommendations
- Expand research on the social and economic
impacts of information technology diffusion and
adoption
- Expand initiatives to increase IT literacy,
access and research capabilities
- Address the shortage of high-technology workers
- Programs to re-train stale IT workers
- Encourage participation by women and minorities
- Short-term increase in immigration of skilled IT
workers
12 Conclusions
- IT is an essential foundation for commerce,
education, health care, environmental
stewardship, and national security
- Dramatically transform the way we communicate,
learn, deal with information and conduct
research
- Transform the nature of work, nature of commerce,
product design cycle, practice of health care,
and the government itself
- The total Federal IT R&D investment is inadequate
- The Federal IT R&D is excessively focused on near-term problems
- U.S. government must
- Create a strategic initiative in long-term IT R&D
- Establish an effective structure for managing and
coordinating IT
14 Vannevar Bush: Memex
- Memex: proposed putting all information online (1945)
- It will happen
- Result: InfoGlut. Too much information in the shoebox
- Challenge
- Organize the information.
- Give answers as good as an expert in the field.
- Anticipate questions and so inform subscriber
- Protect personal privacy
- A hacker cannot get access to your personal
information without your consent.
15 Turing's Test (1951): Intelligent Machines
- Computers helped with the 4-color problem end game
- Computers (and people) won world chess championship
- Computers will likely be our 5th brain
- Augment our intelligence
- See for us, hear for us, read for us, ...
- Prosthetic eyes, ears, voices, arms, legs, ...
- Probably computers will be intelligent like plants and animals.
- Perhaps computers can be intelligent like people
- Pass the Turing Test (easy/impossible?) (70%, 5 minutes, B can lie)
- Translating telephone (as good as a human translator)
- Read a textbook and pass the written exam.
- Pass a graduate programming class
- Pass a graduate literature class
- Radical: Download someone.
16 Dependable Systems
- Build a system used by millions of people each day.
- Then
- Prove that it does what it is supposed to do (code matches spec).
- Prove that it delivers 99.99999% (7 9s) availability (1 hr per millennium)
- Prove that it cannot be hacked for less than $1B (Y2K $)
- Then build the system automatically from the specification.
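The "7 9s" availability target can be checked with a little arithmetic. The sketch below is only an illustration: the function name `downtime_seconds` and the use of a 365.25-day year are assumptions, not part of the talk.

```python
# Allowed downtime for an availability target expressed as a count of nines.
# Assumption: a year is 365.25 days; the helper name is hypothetical.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def downtime_seconds(nines: int, years: float) -> float:
    """Return the downtime budget (seconds) over `years` at `nines` nines."""
    unavailability = 10.0 ** (-nines)  # 7 nines -> 1e-7
    return unavailability * years * SECONDS_PER_YEAR


if __name__ == "__main__":
    minutes = downtime_seconds(7, 1000) / 60
    print(f"7 nines over a millennium: about {minutes:.0f} minutes down")
```

Under these assumptions the budget comes out a little under an hour per millennium, in line with the "1 hr per millennium" figure on the slide.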
18 Storage Hierarchy (9 levels)
- Cache 1, 2
- Main (1, 2, 3 if NUMA)
- Disk (1 (cached), 2)
- Tape (1 (mounted), 2)
19 Meta-Message: Technology Ratios Are Important
- If everything gets faster & cheaper at the same rate THEN nothing really changes.
- Things getting MUCH BETTER:
- communication speed & cost 1,000x
- processor speed & cost 100x
- storage size & cost 100x
- Things staying about the same:
- speed of light (more or less constant)
- people (10x more expensive)
- storage speed (only 10x better)
20 Today's Storage Hierarchy: Speed & Capacity vs Cost Tradeoffs
[Figure: two log-scale plots against access time (seconds, 10^-9 to 10^3). "Size vs Speed" plots typical system size (bytes) and "Price vs Speed" plots $/MB, each for cache, main, secondary (disc), online tape, nearline tape, and offline tape.]
21 Storage Ratios Changed
- 10x better access time
- 10x more bandwidth
- 4,000x lower media price
- DRAM/DISK: 100:1 to 10:1 to 50:1
22 Thesis: Performance = Storage Accesses, not Instructions Executed
- In the old days we counted instructions and IOs
- Now we count memory references
- Processors wait most of the time
[Figure: "Where the time goes": clock ticks used by AlphaSort components (Sort, Disc Wait, OS, Memory Wait).]
23 The Pico Processor
- 1 M SPECmarks
- 10^6 clocks per fault to bulk RAM
- Event horizon on chip
- VM reincarnated
- Multi-program cache
- Terror Bytes!
24 Storage Latency: How Far Away is the Data?
Relative latency (clock ticks) and the equivalent distance:
- Registers: 1 (my head, 1 min)
- On-chip cache: 2 (this room)
- On-board cache: 10 (this campus, 10 min)
- Memory: 100 (Sacramento, 1.5 hr)
- Disk: 10^6 (Pluto, 2 years)
- Tape/optical robot: 10^9 (Andromeda, 2,000 years)
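The distance analogy on this slide follows from treating one unit of latency as one minute of travel. The sketch below redoes that scaling. The relative latencies are the ones on the slide; the helper name `human_scale` and the 365.25-day year are assumptions made for illustration.

```python
# Scale relative storage latencies to human time, taking 1 tick = 1 minute.
# Latency values come from the slide; the helper name is hypothetical.
LATENCY_TICKS = {
    "registers": 1,
    "on-chip cache": 2,
    "on-board cache": 10,
    "memory": 100,
    "disk": 10**6,
    "tape/optical robot": 10**9,
}

MINUTES_PER_DAY = 24 * 60
MINUTES_PER_YEAR = 365.25 * MINUTES_PER_DAY  # assumption: Julian year


def human_scale(ticks: int) -> str:
    """Render `ticks` minutes in the largest convenient unit."""
    minutes = float(ticks)
    if minutes < 60:
        return f"{minutes:.0f} min"
    if minutes < MINUTES_PER_DAY:
        return f"{minutes / 60:.1f} hr"
    if minutes < MINUTES_PER_YEAR:
        return f"{minutes / MINUTES_PER_DAY:.0f} days"
    return f"{minutes / MINUTES_PER_YEAR:,.0f} years"


if __name__ == "__main__":
    for level, ticks in LATENCY_TICKS.items():
        print(f"{level:>20}: {human_scale(ticks)}")
```

With this scaling a disk access works out to roughly 2 years and a tape robot access to roughly 1,900 years, which the slide rounds to 2,000.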
25 The 5 Minute Rule Derived
- M = cost of a RAM page (RAMprice is RAM $/MB):
$$M = \frac{\text{RAMprice}}{\text{PageSize} \times \text{Lifetime}}$$
- A = cost of a disk access:
$$A = \frac{\text{DiskPrice}}{\text{AccPerSec} \times \text{Lifetime}}$$
- RI = Reference Interval (time between accesses to a page)
Breakeven: M = A / RI, so
$$RI = \frac{A}{M} = \frac{\text{DiskPrice} \times \text{PageSize}}{\text{RAMprice} \times \text{AccPerSec}}$$
26 The Five Minute Rule: Observations
- Break even has two terms:
- (1) Technology term: PageSize / DiskAccPerSec (8KB / 80 = 100:1)
- (2) Economic term: DiskPrice / RAM_MB_Price ($400 / $4 = 100:1)
- Economic term trends down
- Technology term trends up to compensate.
- Still at 5 minutes for random, 1 minute for sequential
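The break-even interval can be evaluated numerically as the product of a technology term and an economic term. The sketch below is a minimal illustration under one loud assumption: the technology term is taken as pages per MB of RAM divided by disk accesses per second, which is one common way to state the rule and may not match the units used on the slide. The function name and parameter names are hypothetical. The inputs (8 KB pages, 80 accesses per second, a $400 disk, $4 per MB of RAM) are the figures quoted above.

```python
# Break-even reference interval as technology term times economic term.
# Assumption: technology term = pages per MB of RAM / disk accesses per second.
def break_even_seconds(
    disk_price: float,        # dollars per disk drive
    ram_price_per_mb: float,  # dollars per MB of RAM
    page_kb: float,           # page size in KB
    accesses_per_sec: float,  # random accesses per second per disk
) -> float:
    """Seconds between references at which caching a page in RAM pays off."""
    pages_per_mb = 1024.0 / page_kb
    technology_term = pages_per_mb / accesses_per_sec
    economic_term = disk_price / ram_price_per_mb
    return technology_term * economic_term


if __name__ == "__main__":
    interval = break_even_seconds(400.0, 4.0, 8.0, 80.0)
    print(f"break-even interval: {interval:.0f} s ({interval / 60:.1f} min)")
```

Under these assumptions the interval is 160 seconds, a few minutes. Halving the RAM price doubles the interval, while doubling the disk's access rate halves it, which is the sense in which the two terms trade off.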
27 Shows Best Page Index Page Size 16KB
28 Standard Storage Metrics
- Capacity
- RAM: MB and $/MB; today at 10 MB and $100/MB
- Disk: GB and $/GB; today at 10 GB and $200/GB
- Tape: TB and $/TB; today at .1 TB and $25k/TB (nearline)
- Access time (latency)
- RAM: 100 ns
- Disk: 10 ms
- Tape: 30 second pick, 30 second position
- Transfer rate
- RAM: 1 GB/s
- Disk: 5 MB/s (arrays can go to 1 GB/s)
- Tape: 5 MB/s (striping is problematic)
29 New Storage Metrics: Kaps, Maps, SCAN?
- Kaps: How many KB objects served per second
- The file server, transaction processing metric
- This is the OLD metric.
- Maps: How many MB objects served per second
- The multimedia metric
- SCAN: How long to scan all the data
- The data mining and utility metric
- And
- Kaps/$, Maps/$, TBscan/$
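The three metrics can be estimated from a device's access time, transfer rate, and capacity. The sketch below assumes a simple model in which serving one object costs one access plus the transfer time, with no queuing or caching. The model and the function names are assumptions for illustration. The disk figures (10 ms, 5 MB/s, 10 GB) are the ones quoted in the standard metrics.

```python
# Estimate Kaps, Maps, and SCAN time for one device.
# Assumption: serving an object costs one access plus its transfer time.
KB, MB, GB = 1e3, 1e6, 1e9  # decimal units, an assumption


def objects_per_second(object_bytes: float, access_s: float,
                       bytes_per_s: float) -> float:
    """Objects of the given size served per second by one device."""
    return 1.0 / (access_s + object_bytes / bytes_per_s)


def scan_seconds(capacity_bytes: float, bytes_per_s: float) -> float:
    """Time to read the whole device sequentially."""
    return capacity_bytes / bytes_per_s


if __name__ == "__main__":
    access_s, rate, capacity = 0.010, 5 * MB, 10 * GB
    print(f"Kaps: {objects_per_second(KB, access_s, rate):.0f}")
    print(f"Maps: {objects_per_second(MB, access_s, rate):.1f}")
    print(f"SCAN: {scan_seconds(capacity, rate):.0f} s")
```

In this model Kaps is dominated by access time and Maps by transfer time, which is why the two metrics reward different device properties.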
30 For the Record (good 1998 devices packaged in system: http://www.tpc.org/results/individual_results/Dell/dell.6100.9801.es.pdf)
X 14
32 How To Get Lots of Maps, SCANs
- parallelism: use many little devices in parallel
- Beware of the media myth
- Beware of the access time myth
At 10 MB/s: 1.2 days to scan 1 TB
1,000 x parallel: 100 seconds SCAN.
Parallelism: divide a big problem into many smaller ones to be solved in parallel.
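The scan figures on this slide are consistent with scanning 1 TB. A minimal sketch of that arithmetic follows, assuming the data is spread evenly over the devices and that each device sustains its full transfer rate. The function name is hypothetical, and the 1 TB total is inferred from the quoted times.

```python
# Scan time for data striped evenly over identical devices.
# Assumptions: perfect striping, no coordination overhead, decimal units.
def parallel_scan_seconds(total_bytes: float, device_bytes_per_s: float,
                          devices: int = 1) -> float:
    """Seconds to scan `total_bytes` using `devices` devices in parallel."""
    return total_bytes / (device_bytes_per_s * devices)


if __name__ == "__main__":
    one = parallel_scan_seconds(1e12, 10e6)
    many = parallel_scan_seconds(1e12, 10e6, devices=1000)
    print(f"1 device:      {one / 86400:.1f} days")
    print(f"1,000 devices: {many:.0f} seconds")
```

Real arrays fall short of this ideal when data is skewed across devices or a shared bus saturates, so the result is best read as a lower bound on scan time.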
33 The Disk Farm On a Card
- The 1 TB disc card
- An array of discs
- Can be used as
- 100 discs
- 1 striped disc
- 10 Fault Tolerant discs
- ....etc
- LOTS of accesses/second
- bandwidth
14"
Life is cheap, it's the accessories that cost ya.
Processors are cheap, it's the peripherals that cost ya (a $10k disc card).
34 Tape Farms for Tertiary Storage, Not Mainframe Silos
- One $10K robot: 14 tapes, 500 GB, 5 MB/s, $20/GB, 30 Maps
- 100 robots: $1M, 50 TB, $50/GB, 3K Maps
Scan in 27 hours. Many independent tape robots (like a disc farm).
35 Tape & Optical: Beware of the Media Myth
- Optical is cheap: $200/platter, 2 GB/platter => $100/GB (2x cheaper than disc)
- Tape is cheap: $30/tape, 20 GB/tape => $1.5/GB (100x cheaper than disc)
36 Tape & Optical Reality: Media is 10% of System Cost
- Tape needs a robot ($10 k ... $3 m): 10 ... 1000 tapes (at 20 GB each) => $20/GB ... $200/GB (1x to 10x cheaper than disc)
- Optical needs a robot ($100 k): 100 platters = 200 GB (TODAY) => $400/GB (more expensive than mag disc)
- Robots have poor access times
- Not good for Library of Congress (25 TB)
- Data motel: data checks in but it never checks out!
37 The Access Time Myth
- The Myth: seek or pick time dominates
- The reality:
- (1) Queuing dominates
- (2) Transfer dominates BLOBs
- (3) Disk seeks often short
- Implication: many cheap servers better than one fast expensive server
- shorter queues
- parallel transfer
- lower cost/access and cost/byte
- This is now obvious for disk arrays
- This will be obvious for tape arrays
39 Penny Sort Ground Rules
http://research.microsoft.com/barc/SortBenchmark
- How much can you sort for a penny?
- Hardware and software cost
- Depreciated over 3 years
- $1M system gets about 1 second,
- $1K system gets about 1,000 seconds.
- Time (seconds) = 946,080 / SystemPrice ($)
- Input and output are disk resident
- Input is
- 100-byte records (random data)
- key is first 10 bytes.
- Must create output file and fill with sorted
version of input file.
- Daytona (product) and Indy (special) categories
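The time budget in these ground rules follows from spreading the system price over three years of seconds and asking how many seconds one cent buys. A minimal sketch, assuming 365-day years (which gives the 946,080 constant) and a hypothetical function name:

```python
# Seconds of machine time that one penny buys under 3-year depreciation.
# Assumption: 365-day years, so 3 years = 94,608,000 seconds.
SECONDS_IN_3_YEARS = 3 * 365 * 24 * 3600
PENNY = 0.01  # dollars


def penny_seconds(system_price_dollars: float) -> float:
    """Time budget (seconds) for a system of the given price."""
    dollars_per_second = system_price_dollars / SECONDS_IN_3_YEARS
    return PENNY / dollars_per_second


if __name__ == "__main__":
    for price in (1_000_000, 1_000):
        print(f"${price:>9,} system: {penny_seconds(price):,.2f} s")
```

A $1M system gets just under 1 second and a $1K system just under 1,000 seconds, matching the rounded figures in the rules.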
40 PennySort
- Hardware
- 266 MHz Intel PPro
- 64 MB SDRAM (10 ns)
- Dual Fujitsu DMA 3.2 GB EIDE disks
- Software
- NT workstation 4.3
- NT 5 sort
- Performance
- sort 15 M 100-byte records (1.5 GB)
- Disk to disk
- elapsed time: 820 sec
- cpu time: 404 sec
41 How Good is NT5 Sort?
- CPU and IO not overlapped.
- System should be able to sort 2x more
- RAM has spare capacity
- Disk is space saturated (1.5 GB in, 1.5 GB out on a 3 GB drive). Need an extra 3 GB drive or a 6 GB drive.
[Figure: chart with labels Disk, CPU, Fixed ram.]
42 Sort Speed Doubles Every Year
43 Recent Results
- NOW Sort: 9 GB on a cluster of 100 UltraSparcs in 1 minute
- MilleniumSort: 16x Dell NT cluster, 100 MB in 1.8 sec (Datamation)
- Tandem/Sandia Sort: 68 CPU ServerNet, 1 TB in 47 minutes
- Rumor of IBM Sort: 7000 CPU Blue Pacific, 1 TB in 1024 seconds (17 minutes), 10 Mrps (1 GBps)
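The rates quoted for these results can be recomputed from data size and elapsed time. The sketch below assumes 100-byte records (as in the sort benchmark rules) and decimal units (1 TB = 10^12 bytes). The function and table names are hypothetical.

```python
# Records per second and bytes per second for the quoted sort results.
# Assumptions: 100-byte records, decimal GB/TB.
RECORD_BYTES = 100


def sort_rates(total_bytes: float, seconds: float) -> tuple[float, float]:
    """Return (records per second, bytes per second) for one sort run."""
    bytes_per_s = total_bytes / seconds
    return bytes_per_s / RECORD_BYTES, bytes_per_s


# (bytes sorted, elapsed seconds) as quoted in the list above
RESULTS = {
    "NOW Sort": (9e9, 60),
    "MilleniumSort": (100e6, 1.8),
    "Tandem/Sandia Sort": (1e12, 47 * 60),
    "IBM Sort (rumor)": (1e12, 1024),
}

if __name__ == "__main__":
    for name, (size, secs) in RESULTS.items():
        rps, bps = sort_rates(size, secs)
        print(f"{name:>20}: {rps / 1e6:5.2f} Mrps, {bps / 1e6:7.1f} MB/s")
```

For the rumored IBM run this gives about 9.8 million records per second and just under 1 GB/s, consistent with the rounded "10 Mrps (1 GBps)".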