Yakham Ndiaye, Witold Litwin, - PowerPoint PPT Presentation

1
AMOS-SDDS: A Scalable Distributed Data Manager
for Windows Multicomputers
  • Yakham Ndiaye, Witold Litwin,
  • Yakham.Ndiaye, Witold.Litwin_at_dauphine.fr
  • CERIA Université Paris IX Dauphine
  • Tore Risch
  • Tore.Risch_at_dis.uu.se
  • Uppsala Universitet Dept. of Information Science

2
AMOS-SDDS: A Scalable Distributed Data Manager
for Windows Multicomputers
  • A Scalable Distributed Data Structure (SDDS)
  • - A file that scales transparently for the
    application across the distributed RAM of a
    multicomputer.
  • AMOS-II DBMS
  • - AMOS-II is an object-relational DBMS that can
    access external data sources.
  • Coupling SDDS and AMOS-II
  • - Yields a scalable RAM file that supports
    database queries.

3
Multicomputers
  • A collection of loosely coupled computers
  • Shared-nothing architecture
  • Message passing through a high-speed network
    (100 Mb/s)
  • Network multicomputers
  • - Use general-purpose networks and PCs
  • Switched multicomputers
  • - Use a bus or a switch

4
SDDS
  • New data structures designed specifically for
    multicomputers
  • Data are structured
  • - Records with keys
  • - Parallel scans and function shipping
  • Data reside on servers
  • - Waiting for access
  • Overflowing servers split into new servers
  • - New servers are appended to the file without
    informing the clients
  • Queries come from multiple autonomous clients
  • - Access initiators
  • - No centralized directory is used for access
    computations
  • For more, see http://ceria.dauphine.fr
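
The split-without-notification behaviour listed above can be illustrated with a small sketch. This is not the AMOS-SDDS implementation: the class names, the bucket capacity, and the way a misdirected request reaches the right server (a simple search standing in for forwarding or multicast) are all assumptions made for illustration. Each client keeps only a private image of the key ranges and corrects it when a request had to be forwarded.

```python
import bisect


class Server:
    """One bucket of the file, held in one server's RAM (hypothetical model)."""

    def __init__(self, low, high):
        self.low, self.high = low, high  # this server stores keys in [low, high)
        self.records = {}

    def owns(self, key):
        return self.low <= key < self.high


class SDDSFile:
    """Server side: a bucket splits on overflow; clients are never notified."""

    def __init__(self, capacity=4):
        self.capacity = capacity  # assumed bucket capacity, in records
        self.servers = [Server(float("-inf"), float("inf"))]

    def owner(self, key):
        # Stand-in for server-side forwarding or multicast.
        return next(s for s in self.servers if s.owns(key))

    def insert(self, server, key, record):
        server.records[key] = record
        if len(server.records) > self.capacity:
            keys = sorted(server.records)
            middle = keys[len(keys) // 2]
            new = Server(middle, server.high)
            # Move the upper half of the overflowing bucket to the new server.
            new.records = {k: server.records.pop(k) for k in keys if k >= middle}
            server.high = middle
            self.servers.append(new)  # the new server is appended to the file


class Client:
    """Keeps a private, possibly outdated image of the partitioning."""

    def __init__(self, sdds_file):
        self.file = sdds_file
        self.lows = [float("-inf")]           # known lower bounds, sorted
        self.known = [sdds_file.servers[0]]   # server for each known bound
        self.forwards = 0                     # requests that had to be forwarded

    def _route(self, key):
        server = self.known[bisect.bisect_right(self.lows, key) - 1]
        if not server.owns(key):              # the image was outdated
            server = self.file.owner(key)
            self.forwards += 1
            i = bisect.bisect_left(self.lows, server.low)
            self.lows.insert(i, server.low)   # adjust the image
            self.known.insert(i, server)
        return server

    def insert(self, key, record):
        self.file.insert(self._route(key), key, record)

    def search(self, key):
        return self._route(key).records.get(key)
```

A client created after the file has grown starts with an image of a single server; its first request for a key held elsewhere is forwarded once, and later requests for the same range go straight to the right server.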

5
The AMOS-II DBMS
  • AMOS-II: Active Mediating Object System
  • A RAM database system.
  • Declarative query language: AMOSQL.
  • External data sources capability.
  • External programs interface with AMOS-II using
  • - Call-level interface (call-in)
  • - Foreign functions (call-out)
  • For more, see the AMOS-II page:
    http://www.dis.uu.se/udbl/

6
Coupling SDDS and AMOS-II
  • Client/Server System.
  • Scalable RAM Database.
  • Scalable distributed range partitioning
  • Increased storage and processing capabilities.
  • Unlimited Distributed RAM Storage
  • Parallel / Distributed queries
  • SDDS: distributed RAM storage manager.
  • - Communication platform.
  • - Efficiently supports key range queries.
  • AMOS-II: database query processor.
  • - Imports and stores tuples locally in
    AMOS-II.


7
Coupling SDDS and AMOS-II
[Figure: AMOS-SDDS overall architecture]

8
The Hardware
  • Six 700 MHz Pentium III machines with 256 MB of
    RAM, running Windows 2000 on a 100 Mbit/s
    Ethernet network.
  • One site is used as the client and the other
    five as servers.
  • The file scaled from 1 to 15 servers.
  • We run several AMOS-SDDS servers on the same
    machine (up to 3 per machine).

9
Benchmark queries
  • Benchmark data
  • - Table Person (SS, Name, City).
  • - Size: 20,000 to 300,000 tuples of 25 bytes.
  • - 50 cities.
  • - Random distribution.
  • Benchmark query: couples of persons in the
    same city
  • - Query 1: the file resides at a single AMOS-II.
  • - Query 2: the file resides at AMOS-SDDS.
  • Count join: count the couples in the same city
  • - To determine the result transfer time to the
    client
  • Join evaluation
  • - Multicast and nested loop or local index
    lookup.
  • Measures
  • - Speed-up and scale-up
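
As a rough illustration of the two join evaluations named above, the sketch below counts the couples of persons in the same city, once with a nested loop and once with an index lookup on City. It is plain Python rather than AMOSQL, and several details are assumptions: a "couple" is taken to be an unordered pair of distinct persons, and the data generator, names, and seed are hypothetical.

```python
import random
from collections import defaultdict


def make_persons(n, n_cities=50, seed=0):
    """Generate n Person (SS, Name, City) tuples with randomly assigned cities."""
    rng = random.Random(seed)
    return [(ss, f"name{ss}", f"city{rng.randrange(n_cities)}") for ss in range(n)]


def count_couples_nested_loop(persons):
    """Compare every pair of tuples: a quadratic number of comparisons."""
    count = 0
    for i, (_, _, city_a) in enumerate(persons):
        for _, _, city_b in persons[i + 1:]:
            if city_a == city_b:
                count += 1
    return count


def count_couples_index(persons):
    """Build an index on City, then look up only the tuples of the same city."""
    by_city = defaultdict(list)
    for ss, _, city in persons:
        by_city[city].append(ss)
    return sum(
        1
        for ss, _, city in persons
        for other in by_city[city]
        if other > ss  # count each unordered pair once
    )
```

Both functions return the same count; the index version pays for building the index up front and then inspects only the tuples of the matching city for each person.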

10
Server Query Processing
  • E-strategy
  • - Data stay external to AMOS, within the SDDS
    bucket
  • - Custom foreign functions perform the query
  • I-strategy
  • - Data are dynamically imported into AMOS-II
  • - Possibly with local index creation
  • - Deleted after the processing
  • - Good for joins
  • - AMOS performs the query
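
The I-strategy life cycle (import, optional index creation, query, delete) might look roughly like the sketch below. Python's built-in sqlite3 module is used here purely as a stand-in for AMOS-II, and SQL as a stand-in for AMOSQL; the function name, table name, and index name are hypothetical. The sketch covers only the processing local to one server, not the multicast of tuples between servers.

```python
import sqlite3


def i_strategy_count_join(bucket, use_index=True):
    """Import a bucket into an in-memory DBMS, run the count join, drop the data."""
    db = sqlite3.connect(":memory:")  # stand-in for the local RAM DBMS
    try:
        # 1. Dynamically import the bucket's records.
        db.execute("CREATE TABLE person (ss INTEGER, name TEXT, city TEXT)")
        db.executemany("INSERT INTO person VALUES (?, ?, ?)", bucket)
        # 2. Optionally create a local index on the join attribute.
        if use_index:
            db.execute("CREATE INDEX person_city ON person (city)")
        # 3. Let the DBMS evaluate the query.
        (count,) = db.execute(
            "SELECT COUNT(*) FROM person a JOIN person b "
            "ON a.city = b.city AND a.ss < b.ss"
        ).fetchone()
        # 4. Delete the imported data after the processing.
        db.execute("DROP TABLE person")
        return count
    finally:
        db.close()
```

Under the E-strategy, by contrast, the records would stay in the SDDS bucket and a custom function would scan them directly, with no import or drop step.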

11
Speed-up
File of 20,000 records, on AMOS-II and
distributed over 1 to 5 AMOS-SDDS servers with
the I-strategy.
[Figures: elapsed time of Query 1; elapsed time of Query 2 with the I-strategy; elapsed time per tuple of Query 2 with the I-strategy]

12
Scaling the file size
File of 100,000 records, on AMOS-II or on
AMOS-SDDS, processed with the I-strategy over 5
servers.
[Figures: Query 2 on AMOS-SDDS; elapsed time of Query 1 on AMOS-II. Caption: performance of AMOS-II and of AMOS-SDDS for a scaling file]

13
Discussion
  • File of 20,000 records: for the nested loop, the
    improvement ratio is 5.5 times (82%); for the
    index join, the improvement is about 1.4
    times (29%).
  • File of 100,000 records: for the nested loop,
    the improvement ratio is 6.5 times (85%); for
    the index join, the improvement is about 1.7
    times (41%).
  • Better scale-up for AMOS-SDDS when scaling the
    file size by a factor of 5
  • - For AMOS-II, the nested loop elapsed time per
    tuple increases by a factor of 5, from 13.15 to
    65.57 ms (a factor of 4.8 for AMOS-SDDS). For
    the index join, it increases by a factor of
    4.8, from 2.25 to 11.81 ms (4.3 for AMOS-SDDS).
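
The figures in parentheses after each improvement ratio appear to be the corresponding reductions in elapsed time, in percent. That reading is an assumption, but the arithmetic is consistent with it: a ratio r means the new time is 1/r of the old one, which is a reduction of 1 - 1/r. The helper name below is hypothetical.

```python
def reduction_percent(improvement_ratio):
    """Convert an 'r times faster' ratio into a percentage reduction in time."""
    return round(100 * (1 - 1 / improvement_ratio))


# Ratios quoted in the discussion for the 20,000 and 100,000 record files.
for ratio in (5.5, 1.4, 6.5, 1.7):
    print(f"{ratio} times faster -> {reduction_percent(ratio)}% less elapsed time")
```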

14
Scaling the number of servers
Q1: AMOS-SDDS join. Q2: AMOS-SDDS join with
count.
[Figures: time per tuple (extrapolated for AMOS-SDDS); expected time per tuple of join queries to AMOS-SDDS]

15
Discussion  
  • The file scales to 300,000 tuples.
  • Spreading from 1 to 15 AMOS-SDDS Servers.
  • - Transparently for the application
  • Results are extrapolated to 1 server per machine.
  • - Basically, the CPU component of the elapsed
    time is divided by 3
  • The extrapolated time per tuple for AMOS-SDDS on
    a 300,000-tuple file on 15 servers is 12.72 ms.
    That is 2.9 times better than the 36.44 ms of
    AMOS-II alone.
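
The extrapolation rule and the resulting ratio can be written out as a short sketch. The slides give only the rule (the CPU component is divided by 3) and the final per-tuple times; how a measured time splits into a CPU part and a remaining part is not given, so both function parameters below are hypothetical inputs rather than reported measurements.

```python
def extrapolate_per_tuple(cpu_ms, other_ms, servers_per_machine=3):
    """Estimate the per-tuple time with one server per machine.

    Assumes the servers sharing a machine also share its CPU, so only the
    CPU component shrinks; the remaining component is left unchanged.
    """
    return cpu_ms / servers_per_machine + other_ms


def speedup(baseline_ms, improved_ms):
    """Ratio by which the improved time beats the baseline."""
    return baseline_ms / improved_ms


# Per-tuple times quoted above: AMOS-II alone vs. extrapolated AMOS-SDDS.
print(round(speedup(36.44, 12.72), 1))
```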

16
Conclusion
  • We have coupled an SDDS manager and a RAM DBMS
  • AMOS-SDDS provides a scalable high-performance
    data repository supporting database queries
  • - An important goal studied by various
    researchers in the past.
  • We have explored theoretically and experimentally
    various complex design issues and implementation
    choices
  • In particular, performance improves for larger
    files with respect to AMOS-II alone
  • - 2 times better than with AMOS-II alone: 18.55
    vs. 36.44 ms
  • The I-strategy is more efficient for joins than
    the E-strategy
  • On-the-fly local index creation outperforms the
    nested loop evaluation
  • - Despite the cost of creating and dropping the
    index
  • - Not always, however, e.g., for counting (not
    reported here)

17
Future Work
  • Other types of DBMS queries
  • Client's scalable distributed query decomposer
  • AMOS as the sole local server storage manager.
  • SD-AMOS prototype
  • - SDDS provides the scalable distributed
    partitioning schema.
  • - Server DBMS performs the splits.
  • - Client manages scalable query decomposition
    and execution.
  • The whole system generalizes the PDBMS
    technology.
  • - PDBMS technology offers static partitioning
    only.