Yakham Ndiaye, Witold Litwin,

Slides: 18

Provided by: Witold6

1
AMOS-SDDS: A Scalable Distributed Data Manager
for Windows Multicomputers

Yakham Ndiaye, Witold Litwin,
Yakham.Ndiaye, Witold.Litwin_at_dauphine.fr
CERIA Université Paris IX Dauphine
Tore Risch
Tore.Risch_at_dis.uu.se
Uppsala Universitet Dept. of Information Science

2
AMOS-SDDS: A Scalable Distributed Data Manager
for Windows Multicomputers

A Scalable Distributed Data Structure (SDDS)
- A file that scales transparently for the
application in the distributed RAM of a
multicomputer.
AMOS-II DBMS
- AMOS-II is an object-relational DBMS with
external data sources capability.
Coupling SDDS and AMOS-II
- For a scalable RAM file supporting database
queries.

3
Multicomputers

A collection of loosely coupled computers
Shared-nothing architecture
Message passing through a high-speed network
(100 Mb/s)
Network multicomputers
- Use general-purpose networks and PCs
Switched multicomputers
- Use a bus or a switch

4
SDDS

New data structures designed specifically for
multicomputers
Data are structured
- Records with keys
- Parallel scans and function shipping
Data are on servers
- Waiting for access
Overflowing servers split into new servers
- Appended to the file without informing the
clients
Queries come from multiple autonomous clients
- Access initiators
- Not using any centralized directory for access
computations
For more, see http://ceria.dauphine.fr
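
The behavior listed on this slide can be illustrated with a small in-memory sketch. It is not the AMOS-SDDS implementation: the class names (`Server`, `ScalableFile`, `Client`), the bucket capacity, the split-at-median rule, and the way a client refreshes its image after a forwarded request are all assumptions made for illustration. What it shows is range-partitioned buckets that split on overflow, new servers appended without notifying clients, and clients that address servers from their own possibly outdated image rather than a central directory.

```python
class Server:
    """One bucket holding the records whose keys fall in [low, high)."""
    def __init__(self, low, high):
        self.low, self.high = low, high
        self.records = {}  # key -> record

class ScalableFile:
    """Hypothetical file whose buckets split when they overflow."""
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.servers = [Server(float("-inf"), float("inf"))]

    def owner(self, key):
        # Stands in for server-to-server forwarding of a misdirected request.
        for s in self.servers:
            if s.low <= key < s.high:
                return s

    def insert(self, key, record):
        s = self.owner(key)
        s.records[key] = record
        if len(s.records) > self.capacity:
            self._split(s)

    def _split(self, s):
        # Assumed rule: the upper half of the keys moves to a new server.
        keys = sorted(s.records)
        mid = keys[len(keys) // 2]
        new = Server(mid, s.high)
        for k in keys[len(keys) // 2:]:
            new.records[k] = s.records.pop(k)
        s.high = mid
        self.servers.append(new)  # appended; no client is informed

class Client:
    """Access initiator with its own, possibly outdated, image of the file."""
    def __init__(self, file):
        self.file = file
        self.image = [file.servers[0]]

    def lookup(self, key):
        # Guess from the local image: the known server with the largest low <= key.
        guess = max((s for s in self.image if s.low <= key), key=lambda s: s.low)
        if guess.low <= key < guess.high:
            return guess.records.get(key), 0
        target = self.file.owner(key)  # one forwarding hop
        if target not in self.image:
            self.image.append(target)  # assumed image adjustment
        return target.records.get(key), 1
```

With a capacity of 4, inserting keys 1 to 10 grows the file to four servers. A fresh client's first lookup of key 9 costs one forwarding hop, and the repeated lookup costs none because the client's image has been adjusted.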

5
AMOS-II DBMS

AMOS-II: Active Mediating Object System
A RAM database system.
Declarative query language: AMOSQL.
External data sources capability.
External programs interface with AMOS-II using
- Call-level interface (call-in)
- Foreign functions (call-out)
For more, see the AMOS-II page:
http://www.dis.uu.se/udbl/

6
Coupling SDDS and AMOS-II

Client/server system.
Scalable RAM database.
Scalable distributed range partitioning
Increased storage and processing capabilities.
Unlimited distributed RAM storage
Parallel and distributed queries
SDDS
- Distributed RAM storage manager.
- Communication platform.
- Efficiently supports key range queries.
AMOS-II
- Database query processor.
- Imports and stores tuples locally in AMOS-II.

7
Coupling SDDS and AMOS-II

(Figure: AMOS-SDDS overall architecture)

8
The Hardware

Six Pentium III 700 MHz machines with 256 MB of RAM,
running Windows 2000 on a 100 Mbit/s Ethernet
network.
One site is used as the client and the other five as
servers.
The file scales from 1 to 15 servers.
We run several AMOS-SDDS servers on the same machine
(up to 3 per machine).

9
Benchmark queries

Benchmark data
- Table Person (SS, Name, City).
- Size: 20,000 to 300,000 tuples of 25 bytes.
- 50 cities.
- Random distribution.
Benchmark query: couples of persons in the
same city
- Query 1: the file resides at a single AMOS-II.
- Query 2: the file resides at AMOS-SDDS.
Count and join: count couples in the same city
- To determine the result transfer time to the
client
Join evaluation
- Multicast and nested loop or local index lookup.
Measures
- Speed-up and scale-up
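
The two join evaluations named on this slide can be compared on a single node with a short sketch. It is not the benchmark code: the function names, the generated names and city labels, and the reading of "couple" as an unordered pair of distinct persons (smaller SS first) are assumptions made for illustration. It counts couples in the same city over a Person (SS, Name, City) table, once with a nested loop and once with a local index on City.

```python
import random
from collections import defaultdict

def make_persons(n, n_cities=50, seed=0):
    """Generate n Person tuples (SS, Name, City) with randomly assigned cities."""
    rng = random.Random(seed)
    return [(ss, "name%d" % ss, "city%d" % rng.randrange(n_cities))
            for ss in range(n)]

def count_couples_nested_loop(persons):
    """Compare every tuple with every other tuple: O(n^2) comparisons."""
    count = 0
    for a in persons:
        for b in persons:
            if a[0] < b[0] and a[2] == b[2]:
                count += 1
    return count

def count_couples_index_lookup(persons):
    """Build an index on City first, then probe it once per tuple."""
    index = defaultdict(list)
    for p in persons:
        index[p[2]].append(p)
    count = 0
    for a in persons:
        for b in index[a[2]]:  # only persons in the same city are visited
            if a[0] < b[0]:
                count += 1
    return count
```

Both functions return the same count. The index version pays for building the index up front but only compares each tuple with the persons of its own city, which is the trade-off the later slides measure.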

10
Server Query Processing

E-strategy
Data stay external to AMOS
within the SDDS bucket
Custom foreign functions perform the query
I-strategy
Data are dynamically imported into AMOS-II
Possibly with the local index creation
Deleted after the processing
Good for joins
AMOS performs the query

11
Speed-up

File of 20,000 records, on AMOS-II and
distributed over 1 to 5 AMOS-SDDS servers with
the I-strategy.
(Figure: elapsed time of Query 1)
(Figure: elapsed time of Query 2 with the I-strategy)
(Figure: elapsed time per tuple of Query 2 with the I-strategy)

12
Scaling the file size

File of 100,000 records, on AMOS-II or on
AMOS-SDDS, processed with the I-strategy over 5
servers.
(Figure: Query 2 on AMOS-SDDS)
(Figure: elapsed time of Query 1 on AMOS-II)
(Figure: performance of AMOS-II and of AMOS-SDDS for a
scaling file)

13
Discussion

File of 20,000 records: for the nested loop, the
improvement ratio is 5.5 times (82%); for the
index join, the improvement is about 1.4
times (29%).
File of 100,000 records: for the nested loop, the
improvement ratio is 6.5 times (85%); for the
index join, the improvement is about 1.7
times (41%).
Better scale-up for AMOS-SDDS when scaling the
file size by a factor of 5
- For AMOS-II, the nested loop elapsed time per
tuple increases by a factor of 5, from 13.15 to
65.57 ms (a factor of 4.8 for AMOS-SDDS). For the
index join, by a factor of 4.8, from 2.25 to
11.81 ms (4.3 for AMOS-SDDS).
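
The parenthesized figures that follow each improvement ratio on this slide match the relative reduction in elapsed time implied by that ratio, 1 - 1/ratio, expressed as a percentage. The sketch below checks this reading; the function name is hypothetical and the interpretation is an assumption, since the slide does not state it.

```python
def reduction_percent(ratio):
    """Percent reduction in elapsed time for a given improvement ratio."""
    return round(100 * (1 - 1 / ratio))

# Ratios quoted on the slide, paired with their parenthesized figures.
quoted = {5.5: 82, 1.4: 29, 6.5: 85, 1.7: 41}
matches = {ratio: reduction_percent(ratio) == figure
           for ratio, figure in quoted.items()}
```

All four quoted pairs agree with this formula.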

14
Scaling the number of servers

Q1: AMOS-SDDS join. Q2: AMOS-SDDS join with
count.
(Figure: time per tuple, extrapolated for AMOS-SDDS)
(Figure: expected time per tuple of join queries to
AMOS-SDDS)

15
Discussion

The file scales to 300,000 tuples,
spreading from 1 to 15 AMOS-SDDS servers.
- Transparently for the application
Results are extrapolated to 1 server per machine.
- Basically, the CPU component of the elapsed
time is divided by 3
The extrapolated time per tuple for AMOS-SDDS on a
300,000-tuple file on 15 servers is 12.72 ms.
This is 2.9 times better than the 36.44 ms of
AMOS-II alone.
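
The "times better" figure is the ratio of the two per-tuple times quoted on this slide. A minimal check, with a hypothetical helper name:

```python
def times_better(baseline_ms, improved_ms):
    """How many times smaller the improved time is than the baseline."""
    return baseline_ms / improved_ms

# AMOS-II alone (36.44 ms) vs. extrapolated AMOS-SDDS on 15 servers (12.72 ms).
ratio = round(times_better(36.44, 12.72), 1)
```

Rounded to one decimal, the ratio is 2.9, as stated.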

16
Conclusion

We have coupled an SDDS manager and a RAM DBMS.
AMOS-SDDS provides a scalable high-performance
data repository supporting database queries.
- An important goal studied by various researchers
in the past.
We have explored theoretically and experimentally
various complex design issues and implementation
choices.
- In particular, performance improves for larger
files with respect to AMOS-II alone
- 2 times better than with AMOS-II alone: 18.55 vs
36.44 ms
The I-strategy is more efficient for joins than
the E-strategy.
On-the-fly local index creation outperforms
nested loop evaluation
- Despite the index creation and drop cost
- Not always, however, e.g., for counting (not
reported here)

17
Future Work

Other types of DBMS queries
Client's scalable distributed query decomposer
AMOS as the unique local server storage manager.
SD-AMOS prototype
- SDDS provides the scalable distributed
partitioning schema.
- The server DBMS performs the splits.
- The client manages scalable query decomposition
and execution.
- The whole system generalizes PDBMS
technology, which offers static partitioning only.
