Using OGSA-DAI in a commercial environment

1
Using OGSA-DAI in a commercial environment
Terry Sloan, EPCC
Telephone: +44 131 650 5155
Email: tsloan_at_epcc.ed.ac.uk
2
Overview
  • FirstDIG
  • INWA
  • Outstanding issues raised by these projects

3
First Data Investigation on the Grid
FirstDIG http://www.epcc.ed.ac.uk/firstdig/
4
Motivation
  • Few UK e-Science projects involve service
    companies such as First plc
  • First plc
      • Operates worldwide in a variety of transport
        sectors
      • Over 10,000 vehicles in the UK, 23% of the market
      • UK's largest operator
  • The challenge for First
      • Meeting the needs of the travelling public whilst
        making money
      • Data integration and mining may assist, but there
        is a huge range of fragmented data sources

5
Data Sources in the Bus Industry
  • Many different kinds of data involved with
    running a bus company
      • Mileage, revenue, customer contact, schedule,
        fuel consumption, vehicle maintenance, routes
  • Many means to collect data
      • Manually entered data at depot
      • Data collected on buses from ticket machines
      • Data collected on buses from GPS systems
      • GPS system notes when a bus passes through a
        predefined footprint and records the time at
        which this happens

6
Answering Business Questions
  • Want to combine data from more than one source
      • Complaints versus Lateness
      • Revenue versus Lost Miles
      • Complaints versus Lost Miles
  • Want data aggregated in some way
      • By Service
      • By Day
  • Want to consider subsets of the data
      • e.g. weekdays only
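
The three wishes on this slide (combine sources, aggregate, take a subset) can be sketched in a few lines of plain Python. This is only an illustration: the service numbers, dates, and values are made up, and the record layout `(service, day, value)` is an assumption, not the schema of any First database.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records from two separate sources: (service, day, value).
complaints = [
    ("52", date(2004, 3, 1), 3),    # Monday
    ("52", date(2004, 3, 6), 5),    # Saturday
    ("X78", date(2004, 3, 2), 1),   # Tuesday
]
lost_miles = [
    ("52", date(2004, 3, 1), 12.5),
    ("52", date(2004, 3, 6), 4.0),
    ("X78", date(2004, 3, 2), 7.0),
    ("X78", date(2004, 3, 3), 2.0),  # Wednesday
]

def total_by_service(records, weekdays_only=True):
    """Sum values per service, optionally keeping Monday to Friday only."""
    totals = defaultdict(float)
    for service, day, value in records:
        if weekdays_only and day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            continue
        totals[service] += value
    return dict(totals)

def combine(complaint_totals, lost_mile_totals):
    """Pair the two aggregates per service; a missing value counts as 0."""
    services = sorted(set(complaint_totals) | set(lost_mile_totals))
    return {
        s: (complaint_totals.get(s, 0.0), lost_mile_totals.get(s, 0.0))
        for s in services
    }

summary = combine(total_by_service(complaints), total_by_service(lost_miles))
for service, (n_complaints, miles) in summary.items():
    print(service, n_complaints, miles)
```

Aggregating "By Day" instead would only change the grouping key from `service` to `day`.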

7
Disparate Databases
  • Data is typically stored in disparate databases
      • Various reasons for this: incremental
        construction of systems
  • Not a problem for day-to-day running and querying,
    but introduces challenges for data analysis
      • Systems introduced at different times
      • Different database engines
      • Different front-ends
      • Different operating systems
      • Different physical locations
      • Different ways of representing data
  • These issues are NOT unique to buses

8
OGSA-DAI
  • OGSA-DAI
      • Open Grid Services Architecture Data Access and
        Integration
      • Potentially provides a solution
      • Need business users to make the transition from
        science to commerce
  • Grid middleware
      • Assists with the access and integration of data
        from separate data sources via the Grid
      • Represents databases as Grid Services
      • Enables access from other machines in a secure
        manner

9
FirstDIG Achievements
  • Deployment at First South Yorkshire
  • Combined two databases to answer real business
    questions
      • The Customer Contact System (Microsoft Access):
        information on customer complaints, e.g. time,
        service, nature
      • The Mileage database (dBASE IV): information on
        bus mileage, e.g. lost miles
  • Produced a generic Grid Data Service Browser
      • SQL access, including joins across the databases
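
As a rough picture of what a join across two databases looks like, the sketch below uses two separate in-memory SQLite databases as stand-ins for the Customer Contact System and the Mileage database. It is not OGSA-DAI code and does not use Microsoft Access or dBASE IV; the table names, column names, and rows are all hypothetical.

```python
import sqlite3

# Stand-in for the customer contact database.
conn = sqlite3.connect(":memory:")
# A second, separate database attached under the (assumed) name "mileage".
conn.execute("ATTACH DATABASE ':memory:' AS mileage")

conn.execute("CREATE TABLE main.complaints (service TEXT, day TEXT, nature TEXT)")
conn.execute("CREATE TABLE mileage.lost (service TEXT, day TEXT, lost_miles REAL)")

conn.executemany(
    "INSERT INTO main.complaints VALUES (?, ?, ?)",
    [
        ("52", "2004-03-01", "late"),
        ("52", "2004-03-01", "did not arrive"),
        ("X78", "2004-03-02", "late"),
    ],
)
conn.executemany(
    "INSERT INTO mileage.lost VALUES (?, ?, ?)",
    [
        ("52", "2004-03-01", 12.5),
        ("X78", "2004-03-02", 7.0),
        ("X78", "2004-03-03", 2.0),
    ],
)

# Complaints versus lost miles, by service and day, joined across both databases.
rows = conn.execute(
    """
    SELECT c.service, c.day, COUNT(*) AS complaints, l.lost_miles
    FROM main.complaints AS c
    JOIN mileage.lost AS l
      ON l.service = c.service AND l.day = c.day
    GROUP BY c.service, c.day, l.lost_miles
    ORDER BY c.service, c.day
    """
).fetchall()

for row in rows:
    print(row)
```

In the FirstDIG deployment the two sources sat behind different database engines, which is the harder problem this single-engine sketch leaves out.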

10
First Grid Data Service Browser
11
Informing Business and Regional Policy:
Grid-enabled fusion of global data and local knowledge
INWA http://www.epcc.ed.ac.uk/inwa/

12
INWA
  • An e-Social Science demonstrator
  • Demonstrates how grid technologies can improve
    business
  • Combining private and public data sources
  • Finance and Telecommunications
  • Uses many grid technologies
      • TOG from Sun DCG provides access to a remote HPC
        resource
      • OGSA-DAI provides access control and discovery of
        distributed heterogeneous data resources
      • FirstDIG grid data service browser provides SQL
        access to OGSA-DAI enabled resources
      • Globus Toolkit 2 and 3

13
INWA Grid Infrastructure
[Diagram: INWA grid infrastructure. Labels: User at Curtin, User at Edinburgh, FirstDIG (twice), Grid Engine, Bank, Telco, TOG, Globus Grid, Curtin, Bank data, Telco data]
14
References
  • EPCC
      • http://www.epcc.ed.ac.uk/
  • FirstDIG
      • http://www.epcc.ed.ac.uk/firstdig/
  • OGSA-DAI
      • http://www.ogsadai.org.uk
  • INWA
      • http://www.epcc.ed.ac.uk/inwa
  • Sun Data Compute Grids
      • http://www.epcc.ed.ac.uk/sungrid/
  • Transfer-queue Over Globus (TOG)
      • http://gridengine.sunsource.net/project/gridengine/tog.html

15
Outstanding issues raised by FirstDIG and INWA
16
Outstanding Issues: Usability
  • OGSA-DAI is middleware; the client toolkit helps
  • Incorporation of the demo First browser is
    helpful-ish
  • But really want
      • Interfaces to real data analysis/DBMS packages,
        e.g. SPSS
      • Otherwise users could end up building
        applications that replicate these, e.g. the First
        Grid Data Service Browser
      • Want to be able to point Access, Excel, etc. at a
        grid data source and examine it

17
Outstanding Issues: Data
  • CSV (comma-separated value) data sources
      • Are common, but current JDBC-ODBC drivers do not
        have sufficient functionality (NOT an OGSA-DAI
        issue per se)
  • No support for BIT type field
      • And others, e.g. BOOLEAN, BINARY, etc.
  • Certain characters (e.g. &, >) are not handled by
    the OGSA-DAI XML parser
      • Company names often have & in them
  • Dates from certain sources not handled properly
      • First Grid Data Service has to handle this
        internally
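
One generic way a client can guard against XML-special characters in field values is to escape them before they are embedded in an XML document. The sketch below uses only the Python standard library and says nothing about how the OGSA-DAI parser itself works; the company name is made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

company = "Smith & Sons"  # made-up value containing an XML-special character

# Embedding the raw value produces XML that is not well-formed.
raw_xml = f"<row><company>{company}</company></row>"
try:
    ET.fromstring(raw_xml)
    raw_parses = True
except ET.ParseError:
    raw_parses = False

# Escaping first (& becomes &amp;, < becomes &lt;, > becomes &gt;) fixes this.
safe_xml = f"<row><company>{escape(company)}</company></row>"
parsed = ET.fromstring(safe_xml)
round_tripped = parsed.find("company").text

print(raw_parses)
print(safe_xml)
print(round_tripped)
```

The parser undoes the escaping on the way back in, so the application still sees the original company name.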

18
Outstanding Issues: Miscellaneous
  • Security
      • Rolemap file is not encrypted
      • If one GDS accesses another GDS, the user's
        security credentials are not passed on, so it
        does not work
  • Installation and Testing
      • Install and Set-up
      • Well-explained but still a fair amount of user
        effort involved
      • Lack of an example OGSA-DAI site to point at to
        test that your OGSA-DAI installation works

19
Outstanding Issues: Miscellaneous
  • Installation and Testing
      • Lack of an example OGSA-DAI site to point at to
        test that your OGSA-DAI installation works
  • Large result sets
      • Can increase JVM size but this is not scalable
      • This occurred on most datasets
  • Integration
      • DQP is a start (Linux, OQL)
  • Why use OGSA-DAI?
      • Easysoft etc.
      • http://www.easysoft.com/products/2001/main.phtml
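
On the large result sets point above: the usual alternative to growing the JVM is for the client to consume results in batches rather than holding the whole set in memory. The sketch below shows that pattern with Python's standard database API against an in-memory SQLite table; the table, the batch size, and the helper `iter_rows` are hypothetical, and the slides do not say whether OGSA-DAI offered an equivalent at the time.

```python
import sqlite3

def iter_rows(cursor, batch_size=1000):
    """Yield rows one at a time while fetching them from the cursor in batches."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield from batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE journeys (id INTEGER, miles REAL)")
conn.executemany(
    "INSERT INTO journeys VALUES (?, ?)",
    ((i, 1.5) for i in range(10_000)),
)

# Only one batch of rows is held by the client at any time.
cursor = conn.execute("SELECT id, miles FROM journeys")
row_count = 0
total_miles = 0.0
for _id, miles in iter_rows(cursor, batch_size=500):
    row_count += 1
    total_miles += miles

print(row_count, total_miles)
```

Memory use then depends on the batch size rather than on the size of the result set.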

20
Why use OGSA-DAI?
An RDBMS engine that appears to client apps as a
fully conformant ODBC 3.5 data source. Can be
used to provide real-time, heterogeneous access
to multiple target data sources.