Applying Grid Technologies to Distributed Data Mining

1
Enabling Access to Federated Grid Databases: An OGSA-DAI ODBC Driver
Michael J. Jackson (1), Ashley D. Lloyd (2), Terence M. Sloan (1)
(1) EPCC; (2) Curtin Business School, Edinburgh University Management School
2
Overview
  • Why develop an OGSA-DAI ODBC driver?
  • ODBC
  • OGSA-DAI
  • Design and Development
  • What does an OGSA-DAI ODBC driver give us?
  • Issues and Concerns

3
Why?
  • Facilitate use of standard data analysis tools in
    a Grid environment
  • Remove need for Grid awareness
  • Allow use of existing data analysis skills in a
    Grid environment
  • Improve rate of adoption of Grid technologies
  • Data analysis tools
      • SPSS, SAS
  • How can standard data analysis tools access
    Grid-enabled databases?
      • An ODBC driver for OGSA-DAI

4
Open DataBase Connectivity (ODBC)
[Diagram. Labels: ODBC API, data source name, ODBC data source, database API. Annotation: reside on same host]
5
ODBC Advantages
  • Application developers
      • Applications can be database-independent
      • No need to compile against database-specific
        libraries
      • Call-level interface: execute SQL generated at
        run-time
      • Change a database => only change driver and
        configuration
  • Database manufacturers
      • An ODBC-compliant driver allows the database to
        be a back end for any ODBC-compliant application
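
As a rough illustration of the database-independence point above, the sketch below keeps the application code free of database-specific calls: it looks up a data source name in a configuration table, gets a connection from whichever driver is registered there, and executes SQL built at run-time. This is not the ODBC API itself. Python's built-in sqlite3 module stands in for a driver, and the names DATA_SOURCES, connect and run_query are hypothetical.

```python
import sqlite3

# Hypothetical configuration: data source name -> (driver function, connection argument).
# Changing the database would mean editing this table only, not the application code.
DATA_SOURCES = {
    "sales": (sqlite3.connect, ":memory:"),
}


def connect(dsn):
    """Open a connection for a data source name using its configured driver."""
    driver, target = DATA_SOURCES[dsn]
    return driver(target)


def run_query(conn, table, columns):
    """Build a SELECT statement at run-time and return all rows.

    Table and column names are assumed to be trusted in this sketch.
    """
    sql = "SELECT {} FROM {}".format(", ".join(columns), table)
    cur = conn.cursor()
    cur.execute(sql)
    return cur.fetchall()


if __name__ == "__main__":
    conn = connect("sales")
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
    conn.execute("INSERT INTO orders VALUES (1, 9.5)")
    print(run_query(conn, "orders", ["id", "amount"]))
    conn.close()
```

Only the DATA_SOURCES entry names a concrete back end; run_query never refers to it.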

6
OGSA-DAI
  • Open Grid Services Architecture Data Access and
    Integration
  • Extensible framework for data access and
    integration
  • Expose heterogeneous data resources to a Grid
    through web services
  • Data operations
      • Access, update, management and integration:
        relational, XML, files
      • Compression and transformation
      • Delivery to URLs, FTP, GridFTP, mail, other
        services
  • Base for developing higher-level services
      • Data federation and distributed query processing
      • Data mining
      • Data visualisation

7
Accessing Data Resources via OGSA-DAI
[Diagram. Labels: OGSA-DAI Perform document, JDBC API, OGSA-DAI Response document]
8
An ODBC Driver for OGSA-DAI
[Diagram. Labels: ODBC API, data source name, OGSA-DAI Perform document, OGSA-DAI Response document, JDBC API]
9
A Simple Scenario
  • Data analysis
  • ODBC view
      • Connect to OGSA-DAI ODBC data source
      • Submit a SELECT * FROM table query
      • Get back the results
      • Disconnect from the data source
  • OGSA-DAI view
      • Connect to an OGSA-DAI data service
      • Construct a Perform document holding the query
      • Send it to the service
      • Receive a Response document from the service
      • Parse it to get the results
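
The OGSA-DAI view of this scenario can be sketched as a request/response round trip. The code below is a minimal illustration only: the element names (perform, sqlQuery, response, row, col) are hypothetical and do not follow the real OGSA-DAI document schemas, and fake_service is an in-process stand-in backed by sqlite3 rather than a web service call.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_perform(sql):
    """Wrap an SQL query in a request document (hypothetical element names)."""
    root = ET.Element("perform")
    ET.SubElement(root, "sqlQuery").text = sql
    return ET.tostring(root, encoding="unicode")


def fake_service(perform_xml, conn):
    """Stand-in for a data service: run the query, return a response document."""
    sql = ET.fromstring(perform_xml).findtext("sqlQuery")
    root = ET.Element("response")
    for row in conn.execute(sql).fetchall():
        row_el = ET.SubElement(root, "row")
        for value in row:
            ET.SubElement(row_el, "col").text = str(value)
    return ET.tostring(root, encoding="unicode")


def parse_response(response_xml):
    """Extract the rows from a response document as lists of strings."""
    root = ET.fromstring(response_xml)
    return [[col.text for col in row.findall("col")]
            for row in root.findall("row")]


def query(conn, sql):
    """The steps a driver would hide: build, send, receive, parse."""
    return parse_response(fake_service(build_perform(sql), conn))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    conn.execute("INSERT INTO t VALUES (1, 'a')")
    print(query(conn, "SELECT * FROM t"))
```

From the application's side only query is visible, which mirrors the ODBC view of connect, submit, and get results.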

10
Development Options
  • Implement an OGSA-DAI ODBC driver
      • From scratch
  • Use an open source ODBC driver
      • Extract a data resource-independent skeleton
      • Customise it to OGSA-DAI
  • Use an ODBC SDK
      • OpenAccess
      • Simba
      • Syware

11
Using an SDK
  • Proof of concept
      • Prototype within a tight time-scale
  • OpenAccess SDK
      • 30-day evaluation licence
      • Provides an ODBC driver
      • Developer codes an Interface Provider (IP)
      • Supports Java development => exploit OGSA-DAI's
        client toolkit

12
An ODBC Driver for OGSA-DAI using OpenAccess
[Diagram. Labels: OpenAccess API, OGSA-DAI CTk API, OGSA-DAI Perform document, OGSA-DAI Response document, data resource configuration (e.g. service URL)]
13
Testing
  • OpenAccess ODBC SQL query tool
      • Submit SQL statements to an ODBC data source
      • Present the results
  • EPCC
      • OGSA-DAI ODBC data source on a PC
      • ODBC driver + OGSA-DAI service URL
  • Curtin Business School
      • OGSA-DAI server and services
      • Database

14
What does this give us?
  • Transparency
      • Database location
          • Changes are restricted to the OGSA-DAI server
          • Client applications are unaffected
      • Database product
  • Global access of data
      • Publish service URL
  • Security
      • Database user names and passwords reside on
        OGSA-DAI server
      • Clients can be required to provide credentials to
        connect to OGSA-DAI services

15
Data Federation
[Diagram. Labels: ODBC API, data source name, OGSA-DAI documents, virtual database]
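
One simple way to picture a virtual database is a wrapper that sends the same query to several member databases and concatenates the rows. The sketch below assumes exactly that and nothing more: it is not how OGSA-DAI performs federation or distributed query processing, the VirtualDatabase and make_member names are hypothetical, and in-memory sqlite3 databases with made-up rows stand in for the federated sources.

```python
import sqlite3


class VirtualDatabase:
    """Presents several member databases as one queryable source (illustrative)."""

    def __init__(self, connections):
        self.connections = list(connections)

    def execute(self, sql, params=()):
        """Run the same query on every member and concatenate the rows."""
        rows = []
        for conn in self.connections:
            rows.extend(conn.execute(sql, params).fetchall())
        return rows


def make_member(rows):
    """Create an in-memory member database with a small customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, city TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    return conn


if __name__ == "__main__":
    # Sample rows are invented for illustration.
    vdb = VirtualDatabase([
        make_member([(1, "Edinburgh")]),
        make_member([(2, "Perth")]),
    ])
    print(vdb.execute("SELECT id, city FROM customers"))
```

A client holding a VirtualDatabase issues one query and does not need to know how many member databases sit behind it.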
16
Issues and Concerns
  • OGSA-DAI WSI / WSRF compliance
      • Prototype developed using OGSA-DAI OGSI
      • Data source includes OGSA-DAI factory service URL
      • OGSA-DAI WSI or WSRF data service URL +
        resource ID
  • Driver development
      • Complete the OpenAccess IP
      • Write a pure OGSA-DAI ODBC driver from scratch
  • ODBC conformance
      • Cursors, sessions, transactions, timeouts,
        meta-data
      • Analysis of SAS or SPSS ODBC usage
  • Efficiency

17
Questions