Data Management in a Highly Connected World

About This Presentation

Title:

Data Management in a Highly Connected World

Description:

Complete, scripted 5- deck to generate discussion of roadmap and futures.

Slides: 55

Provided by: sqlm

Transcript and Presenter's Notes

1
Data Management in a Highly Connected World

James Hamilton
JamesRH_at_microsoft.com
Microsoft SQL Server

March 3, 2000
2
Agenda

Client Tier
Number of devices
Device interconnect fabric
Standard programming infrastructure
Client tier database issues
Resource requirements
Implementation language
Administrative cost implications
Development cost implications
Middle Tier
Server Tier
Summary

3
How Many Clients?

1998 US WWW users (IDC)
US: 51M; worldwide: 131M
2001 estimates
Worldwide: 319M users
515M connected devices
½ billion connected clients
A conservative estimate, based on conventional device counts

4
Other Device Types

TVs, VCRs, stoves, thermostats, microwaves, CD
players, computers, garage door openers, lights,
sprinklers, appliances, driveway de-icers,
security systems, refrigerators, health
monitoring, etc.
Sony evangelizing IEEE 1394 interconnect
http://www.sel.sony.com/semi/iee1394wp.html
Microsoft consortium evangelizing Universal Plug and Play
www.upnp.org
WAP: Wireless Application Protocol
http://www.wap.net/

5
Device Interconnect Infrastructure

Power line control
X10: http://www.x10.org
Sunbeam Thalia: http://www.thaliaproducts.com/

6
Why Connect These Devices?

TV guide and auto VCR programming
CD label info and song list download
Sharing data and resources
Set clocks (flashing 12:00)
Fire and burglar alarms
Persist thermometer settings
Feedback and data-sharing based systems
Temperature control and power blind interaction
Occupancy-directed heating and lighting

7
Device Communication Implications

The need is there
Infrastructure is going in
Wireless
Power line communications
Unused twisted pair (phone) bandwidth
Connectable devices and infrastructure arriving and being deployed
On the order of billions of client devices

8
Device Interconnect Example
9
Device Interconnect Example
10
Device Interconnect Example
11
Device Interconnect Example
[Diagram labels: Den; Windows NT Server; Ethernet hub; 56 kbps line; Ethernet backbone; Deck; Filtration plant; 130-gallon F/W aquarium (shown twice); Living room; 660-gallon marine aquarium; X10 backbone; Home sprinklers; Bedroom]
12
Improvements For Example

Cooperation of lighting, A/C, and power blind systems
Alarms and remote notification for failures in
Circulation pump
Heating and cooling
Salinity and other water chemistry changes
Filtration system
Feedback-directed systems

13
Palmtop Resource Trends

Palmtops I've purchased through the years
All about the same cost and physical size

[Chart: palmtop RAM (MB, log scale from 0.1 to 100) by year, 1990 to 2002, plotted against a Moore's Law line]
Sharp IQ7000 (0.125M)
Sharp IQ8300 (0.25M)
HP 95LX (0.5M)
HP 100LX (1M)
HP 200LX (2M)
Everex A20 (4M)
Casio E105 (32M)
14
O/S Memory Requirements

Windows memory requirements over time

[Chart: desktop RAM (MB, log scale, axis marks 0.1, 1, 10, 100, and 128M) by year, 1985 to 1999, plotted against a Moore's Law line]
Windows 1.0 (256K)
Windows 2.0 (512K)
Windows 3.0 (2M)
WFW 3.1 (3M)
Windows95 (4M)
Windows98 (16M)
Windows 2000 (64M)
15
Smartcard Resource Trends
[Chart: smartcard memory size (bits) by year, 1990 to 2004; axis marks at 3 K, 10 K, 1 M, and 300 M; a "You are here" marker on the curve]
Source: Denis Roberson, PIN/Card-Tech/NCR
16
Devices Smaller Than PDAs

Qualcomm PDQ
2 MB total memory
Same memory curve as PDAs, just 2 to 5 years behind

Nokia 9000il
8 MB total memory

17
Digital Cameras
Make       Model              Memory
Agfa       CL30               60 to 360MB
Canon      PowerShot S20      6 to 176MB
Epson      PhotoPC 850Z       10 to 120MB
Kodak      DC-280             32 to 245MB
Olympus    D-340R             18 to 120MB
Panasonic  Palmcam PV-SD4090  450 to 1,500MB
Sanyo      VPC-SX500          19 to 120MB
18
Resource Trend Implications

Device resources at constant cost are growing at super-Moore rates
Same as desktop system growth, but 2 to 3 years behind
Same is true of each class of devices
Telephones trail PDAs but again grow at the same rate
Memory growth is not the problem
However, devices will always be smaller than desktops
Devices are more specialized, so resource consumption is lower and they can still run a standard vertical app slice

19
Standard Infrastructure at Client

Clearly specialized user interface S/W needed
But we have the memory resources to support
Standard communications stack (TCP/IP)
Standard O/S software
Standard data management S/W with query
Transparent replication
Symmetric multi-tiered infrastructure S/W
Leverage best development environments
No need to rewrite millions of redundant lines of code
More heavily used and tested, so fewer bugs
Better productivity in programming to a richer platform
A full DBMS at the client is both practical and useful

20
Client-Side Database Issues

Honey I shrunk the database (SIGMOD99)
DB Footprint
Implementation Language
Both issues either largely irrelevant or soon to be
Resource availability trends support standard infrastructure S/W
Dominant costs: admin, operations, user training, and programming
Vertical slice of standard apps rather than full custom infrastructure

21
DB Implementation Language

Special DB implementation language (Java) argument centers on auto-installation of S/W infrastructure
Auto-install is absolutely vital, but independent of implementation language
Auto-install is not enough: the client should be a cache of recently used S/W and data
Full DBMS at client
Client-side cache of recently accessed data
Optimizer-selected access path choice, driven by accuracy and currency requirements balanced against connectivity state and communications costs
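
The access-path choice described in the last bullet can be sketched roughly as follows. This is only an illustration of the trade-off, not the SQL Server optimizer: the names `CachedCopy` and `choose_access_path` and the relative cost figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedCopy:
    age_seconds: float  # time since the client-side copy was last refreshed


def choose_access_path(cache: Optional[CachedCopy],
                       max_staleness_seconds: float,
                       connected: bool,
                       local_cost: float = 1.0,
                       remote_cost: float = 50.0) -> str:
    """Return "cache", "server", or "unavailable" for one query.

    max_staleness_seconds is the query's currency requirement.
    local_cost and remote_cost are assumed relative costs; the remote
    figure stands in for communications cost.
    """
    cache_ok = cache is not None and cache.age_seconds <= max_staleness_seconds
    if cache_ok and (not connected or local_cost <= remote_cost):
        return "cache"
    if connected:
        return "server"
    return "unavailable"  # disconnected, and the cache is missing or too stale


print(choose_access_path(CachedCopy(age_seconds=10), 60, connected=True))
print(choose_access_path(CachedCopy(age_seconds=600), 60, connected=False))
```

A query that tolerates stale data can be answered while disconnected; one that demands current data has to wait for connectivity.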

22
Admin Costs Still Dominate

'60s large-system mentality still prevails
Optimizing precious machine resources is false economy
Admin and education costs more important
TCO education from the PC world repeated
Each app requires admin and user training; much cheaper to roll out 1 infrastructure across multiple form factors
Sony PlayStation has 3Mb RAM and Flash
Nokia 9000IL phone has 8Mb RAM
Trending towards 64M palmtop in 2001
Vertical app slice resource requirement can be met

23
Dev Costs Over Memory Costs

Specialty RTOSes have weak dev environments
Quality and quantity of apps driven by
Dev environment quality
Availability of trained programmers
Requirement for custom client development and configuration greatly reduces deployment speed
Same apps have wide range of device form factors
Symmetric client/server execution environment
DB components and data treated uniformly
Both replicated to client as needed

24
Client Side Summary

On the order of billions of connected client devices
Most are non-conventional computing devices
All devices include standard DB components
Physical and logical device interconnect standards will emerge
DB implementation language irrelevant
Device DB resource consumption much less important than ease of
Installation
Administration
Programming
Symmetric client/server execution environment

25
Agenda

Client Tier
Middle Tier
High Availability via redundant data and metadata
Fault Isolation domains
XML
Mid-tier Caching
Server Tier
Summary

26
High Availability is Tough
Availability   Annual Lost Data Access   Number of Nines
90%            about 5 weeks             1
99%            < 4 days                  2
99.9%          < 9 hours                 3
99.99%         1 hour                    4
99.999%        5 min                     5
99.9999%       30 sec                    6
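
Annual lost access follows from the availability percentage by simple arithmetic. A minimal sketch, assuming a 365-day year (the function name is illustrative):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def annual_downtime_seconds(nines: int) -> float:
    """Seconds of lost access per year at an availability of `nines` nines."""
    unavailability = 10 ** (-nines)  # 1 nine -> 0.1, 2 nines -> 0.01, ...
    return SECONDS_PER_YEAR * unavailability


for nines in range(1, 7):
    seconds = annual_downtime_seconds(nines)
    print(f"{nines} nines: {seconds / 3600:10.3f} hours/year "
          f"({seconds / 60:12.2f} minutes)")
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why the last rows leave no room for manual intervention.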
27
Server Availability Heisenbugs

Industry good at finding functional errors
Multi-user application interactions are hard
Sequences of statistically unlikely events
Heisenbugs (http://research.microsoft.com/gray/talks)
Testing for these is exponentially expensive
Server stack is nearing 100 MLOC
Long testing and beta cycles delay software release
System size and complexity growth inevitable
Re-try operation (Microsoft Exchange)
Re-run operation against redundant data copy (Tandem)
Fail-fast design approach is robust, but only acceptable with redundant access to redundant copies of data
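
The "re-run against a redundant copy" idea can be illustrated with a small sketch. It assumes each replica is reachable by name and that a failing node raises rather than attempting repair; `NodeFailure`, `run_with_failover`, and `read_balance` are hypothetical names, not part of any product mentioned on the slide.

```python
class NodeFailure(Exception):
    """Raised when a node fails fast instead of attempting repair."""


def run_with_failover(operation, replicas):
    """Run operation(replica) against each redundant copy until one succeeds."""
    failures = []
    for replica in replicas:
        try:
            return operation(replica)
        except NodeFailure as exc:
            failures.append((replica, exc))  # record it and try the next copy
    raise NodeFailure(f"all {len(failures)} replicas failed")


def read_balance(replica):
    # Stand-in for a real data access; node-a simulates a failed check.
    if replica == "node-a":
        raise NodeFailure("consistency check failed")
    return 100


print(run_with_failover(read_balance, ["node-a", "node-b"]))
```

The caller never sees the first node's failure, which is the sense in which fail-fast is only acceptable when redundant copies exist.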

28
The Inktomi Lesson

Inktomi web search engine (Brewer, SIGMOD98)
Quickly evolving software
Memory leaks, race conditions, etc. considered normal
Don't attempt to test and beta until quality is high
System availability of paramount importance
Individual node availability unimportant
Shared-nothing cluster
Exploit ability to fail individual nodes
Automatic reboots avoid memory leaks
Automatic restart of failed nodes
Fail fast and restart when redundant checks fail
Replace failed hardware weekly (mostly disks)
Dark machine room
No panic midnight calls to admins
Mask failures rather than futilely attempting to avoid them

29
Apply to High Value TP Data?

Inktomi model
Scales to 100s of nodes
S/W evolves quickly
Low testing costs and no-beta requirement
Exploits ability to lose individual node without impacting system availability
Ability to temporarily lose some data w/o significantly impacting query quality
Can't lose data availability in most TP systems
Redundant data allows node loss w/o loss of data availability
Inktomi model with redundant data and metadata is a potential solution

30
Redundant Data and Metadata

TP: point access to data is a nearly solved problem
TP systems scale with user number, people on planet, or business size
All trending at sub-Moore rates
Data analysis systems growing far faster than Moore's Law
Greg's law: 2x every 9 to 12 months (Patterson, SIGMOD98)
Seriously super-Moore, implying that no single system can scale sufficiently: clusters are the only solution
Storage trending to free, with access speed the limiting factor
Detailed data distribution statistics need to be maintained
Improve access speed and availability using redundant data (indexes, materialized views, etc.)
Async update for stats, indexes, mat views
Data path choice based upon need for currency and accuracy
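
To see how quickly the gap opens, the doubling rates can be compared directly. This sketch assumes the doubling period in Greg's law is in months and takes Moore's Law as a doubling every 18 months, a common reading that the slide itself does not state.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Total growth after `years` for a quantity that doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


YEARS = 5
moore = growth_factor(YEARS, 18)      # assumed Moore's Law doubling period
data_slow = growth_factor(YEARS, 12)  # data doubling every 12 months
data_fast = growth_factor(YEARS, 9)   # data doubling every 9 months

print(f"Over {YEARS} years: single system {moore:.1f}x, "
      f"data {data_slow:.0f}x to {data_fast:.0f}x")
print(f"Shortfall per node: {data_slow / moore:.1f}x to {data_fast / moore:.1f}x")
```

Under these assumptions a single system falls behind the data by a factor of roughly three to ten within five years, which is the argument for clusters.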

31
Affordable Availability

Web-enabled direct access model driving high availability requirements
Recent high-profile failures at eTrade and Charles Schwab
Web model enabling competition in information access
Drives much faster server-side software innovation, which negatively impacts quality
Dark machine room approach requires auto-admin and data redundancy
Inktomi model (Eric Brewer, SIGMOD98)
42% of system failures are admin error (Gray)
Paging admin at 2am to fix problem is dangerous

32
Connection Model/Architecture
Client
Server Node

Redundant data and metadata
Shared nothing
Single system image
Symmetric server nodes
Any client connects to any server
All nodes SAN-connected

Server Cloud
33
Compilation and Execution Model

Server thread: lex analyze, parse, normalize, optimize, code generate
Server cloud: query execute
Query execution on many subthreads synchronized by root thread
[Diagram labels: Client, Server Cloud]
34
Node Loss/Rejoin
Client

Execution in progress

Rejoin
Node local recovery
Rejoin cluster
Recover global data at rejoining node
Rejoin cluster

Server Cloud
35
Redundant Data Update Model
Client

Updates are standard parallel query plans
Optimizer manages redundant access paths
Query plan responsible for access plan management
No significant new technology
Similar to materialized view and index updates today

Server Cloud
36
Fault Isolation Domains

Trade single-node perf for redundant data checks
Complex error recovery is more likely to be wrong than original forward processing code
Many redundant checks are compiled out of retail versions when shipped
Fail fast rather than attempting to repair
Bring down node for mem-based data structure faults
Don't patch inconsistent data: other copies keep system available
If anything goes wrong, fire the node and continue
Attempt node restart
Auto-reinstall O/S and DB, and recreate DB partition
Mark node dead for later replacement
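
The escalation in the last three bullets amounts to a short ladder of recovery attempts. A minimal sketch, assuming the restart and reinstall steps are supplied as callables that report success; all names here are hypothetical.

```python
def handle_node_fault(node, restart, reinstall):
    """Escalate through the recovery steps and return the node's final state.

    restart(node) and reinstall(node) are caller-supplied callables that
    return True on success. The rest of the cluster keeps serving from
    redundant copies while this runs.
    """
    if restart(node):
        return "restarted"
    if reinstall(node):   # auto-reinstall O/S and DB, recreate DB partition
        return "reinstalled"
    return "dead"         # left in place for later replacement


# Example: restart fails, reinstall succeeds.
print(handle_node_fault("node-17", restart=lambda n: False, reinstall=lambda n: True))
```

Nothing in the ladder tries to repair in-memory state or patch data in place, in keeping with the fail-fast bullets above.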

37
Data Structure Matters

Most internet content is unstructured text
Restricted to simple Boolean search techniques
Docs have structure, just not explicit
Yahoo hand-categorizes content
Indexing is limited and human involvement doesn't scale well
XML is a good mix of simplicity, flexibility, and potential richness
Structure description language of the internet
DBMSs need to support it as a first-class datatype
Too few librarians in the world
So all information must be self-describing

38
Relational to XML

SELECT FOR XML
FOR XML RAW (return an XML rowset)
FOR XML AUTO (exploit RI, name matching, etc.)
FOR XML EXPLICIT (maximal control)
Annotated Schema
Mapping between XML and relational schema
expressed in XML
Templates
Encapsulated parameterized query
XSL/T support
XPATH support
Direct URL access (SQL owned virtual root)
SELECT FOR XML
Annotated schema
Templates

39
XML to Relational

XML bulk load
Templates and Annotated Schema
SQL server hosted XML tree
Directly insert document into SQL Server hosted
XML tree
Select from server hosted XML tree rowset
insert into SQL tables
XML Data type support
Hierarchical full text search

40
XML Example

http://SRV1/nwind?sql=SELECT+DISTINCT+ContactTitle+FROM+Customers+WHERE+ContactTitle+LIKE+'Sa%25'+ORDER+BY+ContactTitle+FOR+XML+AUTO
Result set
<Customers ContactTitle="Sales Agent"/>
<Customers ContactTitle="Sales Associate"/>
<Customers ContactTitle="Sales Manager"/>
<Customers ContactTitle="Sales Representative"/>
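
A query passed in a URL has to be URL-encoded: spaces become "+", and the LIKE wildcard "%" becomes "%25". The sketch below builds such a URL with Python's standard library and parses a result of this shape. The host `SRV1` and virtual root `nwind` come from the slide; the wrapper element name `root` is an assumption, added only because the result has no single root element.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

query = ("SELECT DISTINCT ContactTitle FROM Customers "
         "WHERE ContactTitle LIKE 'Sa%' ORDER BY ContactTitle FOR XML AUTO")

# safe="'" leaves the single quotes unescaped so the URL stays readable.
url = "http://SRV1/nwind?" + urlencode({"sql": query}, safe="'")
print(url)

# FOR XML AUTO returns one element per row, so wrap the rows before parsing.
result = """<Customers ContactTitle="Sales Agent"/>
<Customers ContactTitle="Sales Associate"/>
<Customers ContactTitle="Sales Manager"/>
<Customers ContactTitle="Sales Representative"/>"""
root = ET.fromstring(f"<root>{result}</root>")
titles = [row.get("ContactTitle") for row in root.findall("Customers")]
print(titles)
```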

41
Mid-Tier Cache Requirements

Non-proprietary multi-lingual programming
Symmetric mid-tier server programming model
Non-connected, stateless programming model
High scale, thread pool based
Efficient main memory DB support
Full query over local cache
Query over just cached data, or
Query over full corpus (server interaction required)
Ability to handle network partitions and server failure
Support for life-time attributed data
Transactional (possibly multi-server)
Near real time
Every N time units
Read only
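
One way to read "life-time attributed data" is that each cached item carries a maximum acceptable age. The sketch below covers the "every N time units" and "read only" cases under that reading; it does not model transactional consistency, and `LifetimeCache` and its parameters are hypothetical names.

```python
import time


class LifetimeCache:
    """Mid-tier cache in which each lookup states how stale the data may be."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch    # callable(key) -> value; stands in for the server tier
        self._clock = clock
        self._entries = {}     # key -> (value, time loaded)

    def get(self, key, max_age):
        """Return the cached value if it is fresh enough, else refetch.

        max_age is in seconds; None means read-only data that never expires.
        """
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None:
            value, loaded_at = entry
            if max_age is None or now - loaded_at <= max_age:
                return value
        value = self._fetch(key)  # missing or stale: go back to the server
        self._entries[key] = (value, now)
        return value


cache = LifetimeCache(fetch=lambda key: f"row for {key}")
print(cache.get("customer:42", max_age=30))    # refreshed at most every 30 s
print(cache.get("customer:42", max_age=None))  # treated as read only
```

Passing the clock in as a parameter keeps the expiry logic testable without sleeping.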

42
Agenda

Client Tier
Middle Tier
Server Tier
Affordable computing by the slice
Everything online
Disks are actually getting slower
Processing moves to storage
Approximate answers quickly
Semi-structured storage support
Administrative issues
Summary

43
Server-Side Changes

Server databases more functionally rich than
often required
Trend reversal
Less at the server-tier with richer mid-tier
Focus at back-end shifts to
Reliability, Availability, and Scalability
Reducing administrative costs
Server side trends
Scalability over single-node performance
Everything online
Affordable availability in high scale systems

44
Compaq/Microsoft TPC-C Benchmark
Top 5 TPC-C benchmarks as of Feb 17, 2000

System                 O/S           Database          tpmC     Total system cost  Price/tpmC
ProLiant 8500 Cluster  Windows 2000  SQL Server 2000   227,079  $4,341,603         $19.12
ProLiant 8500 Cluster  Windows 2000  SQL Server 2000   152,207  $2,880,431         $18.93
Escala EPC2400         AIX 4.3.3     Oracle v8.1.6     135,815  $7,462,215         $54.94
IBM RS/6000 S80        AIX 4.3.3     Oracle v8.1.6     135,815  $7,156,910         $52.70
Enterprise 6500        Solaris 2.6   Oracle 8i v8.1.6  135,461  $13,153,324        $97.10

NOTE: All TPC-C results reported as of February 17, 2000
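
Price/performance in TPC-C is total system cost divided by throughput in tpmC. The sketch below recomputes it from the slide's figures. The pairing of each tpmC value with a system is an inference from that arithmetic, and treating the cost figures as dollars is also an assumption, since the flattened chart states neither explicitly.

```python
# (system, tpmC, total system cost, reported price per tpmC)
results = [
    ("ProLiant 8500 Cluster", 227_079, 4_341_603, 19.12),
    ("ProLiant 8500 Cluster", 152_207, 2_880_431, 18.93),
    ("Escala EPC2400", 135_815, 7_462_215, 54.94),
    ("IBM RS/6000 S80", 135_815, 7_156_910, 52.70),
    ("Enterprise 6500", 135_461, 13_153_324, 97.10),
]

for system, tpmc, cost, reported in results:
    computed = cost / tpmc
    print(f"{system:22s} {tpmc:>8,} tpmC  computed {computed:6.2f}  reported {reported:6.2f}")
```

The computed values agree with the reported ones to within about a cent, with the two cluster results costing roughly a fifth as much per tpmC as the most expensive single-system result.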
45
Computing by the Slice
Source: TPC report executive summary
46
Just Save Everything

Able to store all info produced on earth (Lesk)
Paper sources: less than 160 TB
Cinema: less than 166 TB
Images: 520,000 TB
Broadcasting: 80,000 TB
Sound: 60 TB
Telephony: 4,000,000 TB
These data yield 5,000 petabytes
Others estimate upwards of 12,000 petabytes
Worldwide 1998 storage production: 13,000 petabytes
No need to manage deletion of old data
Most data never accessed by a human
Access: aggregations and analysis, not point fetch
More storage than data allows for greater redundancy
Indexes, materialized views, statistics, and other metadata
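
The per-category estimates can be added up to check the headline figure. A small sketch, assuming 1 petabyte = 1,000 terabytes and taking the two "less than" entries at their upper bounds:

```python
estimates_tb = {
    "Paper sources": 160,      # upper bound
    "Cinema": 166,             # upper bound
    "Images": 520_000,
    "Broadcasting": 80_000,
    "Sound": 60,
    "Telephony": 4_000_000,
}

total_tb = sum(estimates_tb.values())
total_pb = total_tb / 1000  # assumption: 1 PB = 1,000 TB

print(f"Total: {total_tb:,} TB, about {total_pb:,.0f} PB")
print(f"Telephony share: {estimates_tb['Telephony'] / total_tb:.0%}")
```

The listed categories sum to about 4,600 PB, consistent with the slide's rounded 5,000 PB and well under the 13,000 PB of storage produced in 1998.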

47
Disks Are Becoming Black Holes

Seagate Cheetah 73
Fast: 10k RPM, 5.6 ms access, 16 MB cache
But very large: 73.4 GB

Result? Black hole: 2.4 accesses/sec/GB
Large data caches required
Employ redundant access paths
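
The access-density figure follows from the drive's specifications. A quick check, assuming one random access per 5.6 ms with no queuing or caching effects:

```python
access_time_ms = 5.6
capacity_gb = 73.4

accesses_per_sec = 1000 / access_time_ms          # random accesses per second
access_density = accesses_per_sec / capacity_gb   # accesses/sec per GB stored

print(f"{accesses_per_sec:.1f} accesses/sec over {capacity_gb} GB "
      f"= {access_density:.2f} accesses/sec/GB")
```

Capacity grows much faster than access rate, so each stored gigabyte gets a shrinking share of the drive's accesses.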

48
Processing Moves Towards Storage

Trends
I/O bus bandwidth is bottleneck
Switched serial nets support very high bandwidth
Processor/memory interface is bottleneck
Growing CPU/DRAM perf gap leading to most CPU cycles in stalls
Combine CPU, serial network, memory, and disk in single package
E.g. David Patterson's ISTORE project

49
Processing Moves Towards Storage

Each disk forms part of multi-thousand node cluster
Redundant data masks failure
RAID-like approach
Each cyberbrick is commodity H/W and S/W
O/S, database, and other server software
Each slice plugged in and personality set
(e.g. database or SAP app server)
No other configuration required
On failure of S/W or H/W, redundant nodes pick up workload
Replace failed components at leisure
Predictive failure models

50
Approximate Answers Quickly

DB systems focus on absolute correctness
As size grows, correct answer increasingly expensive
Text search systems depend upon quick approximate answer
Approximate answer with statistical confidence bound
Steadily improve result until user satisfied
Ripple Joins for Online Aggregation (Hellerstein, SIGMOD99)
Allows rapid exploration of large search spaces
Conventional full accuracy only when needed
Run query on incomplete mid-tier cache?
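
The idea of an answer that improves until the user is satisfied can be shown with a much simpler technique than ripple joins: a running average over rows scanned in random order, reported with a confidence half-width after each batch. This is a sketch of plain online aggregation under a normal approximation, not the algorithm from the cited paper, and every name in it is illustrative.

```python
import math
import random


def online_average(values, batch_size=1000, target_half_width=0.5, z=1.96, seed=0):
    """Yield (rows_seen, estimate, half_width) after each batch of a random scan.

    Stops early once the approximate 95% confidence half-width (z = 1.96)
    drops to target_half_width, or when all rows have been read.
    """
    order = list(values)
    random.Random(seed).shuffle(order)  # random order: every prefix is a random sample
    n, mean, m2 = 0, 0.0, 0.0           # Welford running mean and squared deviations
    for x in order:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n % batch_size == 0 or n == len(order):
            variance = m2 / (n - 1) if n > 1 else 0.0
            half_width = z * math.sqrt(variance / n)
            yield n, mean, half_width
            if half_width <= target_half_width:
                return


rng = random.Random(1)
table = [rng.gauss(100.0, 15.0) for _ in range(200_000)]
for rows, estimate, half_width in online_average(table):
    print(f"after {rows:6d} rows: {estimate:.2f} +/- {half_width:.2f}")
```

The scan typically stops after a small fraction of the rows, trading exactness for a bounded error the user chose.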

51
Semi-Structured Storage Support

Example applications
Directory systems (e.g. Microsoft Active
Directory)
Document management systems
Storage characteristics
Flexible sparse schema support
Fine grained security
Recursive query
Notification based extensibility common
XML support important
Particularly difficult to support when native SQL
access is also allowed
Important area for RDBMS expansion

52
Examples: Performance W/O Admin

Multiple cached plans for different parameter marker sub-domains
Async statistics gathering
Async optimization
Feedback-directed techniques
Adapting number of histogram buckets
Re-optimizing when cardinality errors are discovered during execution
Re-optimize with additional data distribution info gained during previous execution
Optimizer-created indexing structures
Add indexes when needed (Exchange and AS/400)

53
Summary