ChinaVO Data Access Service - PowerPoint PPT Presentation


1
Chinese Virtual Observatory
China-VO Data Access Service Based on OGSA
Jian Sang, National Astronomical Observatory of China
2
Outline
  • VO, Grid and OGSA
  • Build the catalog data service
  • Build the image mosaic service
  • Technical difficulties faced

3
The Increase Of Astronomical Data
4
Challenges
  • The volume of data approaches the petabyte (PB) scale.
  • The data is distributed and stored in
    heterogeneous DBMSs in heterogeneous
    host environments.

5
The VO's Goal
  • The VO's initial goal is to federate existing
    astronomical data archives and provide standard
    services for manipulating these data.

HOW TO REACH THIS GOAL?
The Grid technology can solve the problem!
6
What Is the Grid
  • Grid technology has its origins in
    metacomputing, but
  • in practice, the Grid is about resource sharing
    and coordinated problem solving in dynamic,
    multi-institutional virtual organizations.
  • The focus is on how to enable, maintain and
    control the sharing of resources to achieve a
    common goal.

7
What the Grid Offers
  • Resource management protocols and services that
    support secure remote access to shared computing
    and data resources, and the co-allocation of
    multiple resources.
  • Security solutions that support management of
    credentials and policies.
  • Information query protocols and services that
    provide configuration and status information
    about resources, organizations and services.
  • Data management services that locate and
    transport datasets between storage systems and
    applications.

8
What is OGSA
  • The Open Grid Services Architecture (OGSA)
    represents an evolution towards a Grid system
    architecture based on Web services concepts and
    technologies. 
  • The OGSA integrates key Grid technologies
    (including the Globus Toolkit) with Web services
    mechanisms to create a distributed system
    framework based around the Open Grid Services
    Infrastructure (OGSI).

In Grids, everything is a service
9
The Open Grid Services Architecture
  • Service orientation to virtualize resources
  • From Web services (everything is a service)
    - Standard interface definition mechanisms:
      multiple protocol bindings, multiple
      implementations, local/remote transparency
  • Building on the Globus Toolkit
    - Grid service semantics for service
      interactions
    - Management of transient instances
    - Factory, Registry, Discovery, other services
    - Reliable and secure transport
  • Multiple host environments: J2EE, .NET, C, ...

10
The Structure of Grid Service
11
Grid service interfaces
12
Construct The Astronomical Data Grid
  • The astronomical data service is the most
    fundamental and important component of the
    Virtual Observatory.
  • In terms of data sharing, the VO can be thought
    of as an astronomical Data Grid.

VO = Astronomical Data Grid
13
Outline
  • VO, Grid and OGSA
  • Build the catalog data access service
  • Build the image mosaic service
  • Technical difficulties faced

14
Classification of Astronomical Data Services
  • Astronomical Catalog Service
  • Image Mosaic Service
  • Spectrum Data Service
  • Simulation Data Service

15
Our Existing Astronomical Datasets
16
Build Catalog Data Service
  • How can we federate the catalog data into the VO,
    that is, how can we build a data service using
    the existing databases and programs?

17
Define Catalog Service Interface
Standards we used
  • Input query language: SQL (now), ADQL (planned)
  • Output data format: VOTable 1.0
  • Catalog resource metadata registry protocol:
    VOResource 0.9

The input is an ADQL query statement and the output is a
result in VOTable format, which keeps the service interface/API simple.
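The "query in, VOTable out" interface can be illustrated with a minimal sketch of the output side. It wraps tabular results in the basic VOTable element hierarchy (VOTABLE, RESOURCE, TABLE, FIELD, DATA, TABLEDATA, TR, TD). The function name `rows_to_votable` and the sample columns are assumptions made for illustration; this is not the wrapper used by the China-VO service, and it omits namespaces, UCDs and units.

```python
import xml.etree.ElementTree as ET


def rows_to_votable(columns, rows):
    """Wrap query results in a minimal VOTable-style XML document.

    columns: list of (name, datatype) pairs (hypothetical sample schema).
    rows: list of tuples, one value per column.
    """
    votable = ET.Element("VOTABLE", version="1.0")
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE")
    # One FIELD element describes each column.
    for name, datatype in columns:
        ET.SubElement(table, "FIELD", name=name, datatype=datatype)
    tabledata = ET.SubElement(ET.SubElement(table, "DATA"), "TABLEDATA")
    # One TR per row, one TD per cell.
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return ET.tostring(votable, encoding="unicode")


if __name__ == "__main__":
    print(rows_to_votable([("ra", "double"), ("dec", "double")],
                          [(10.68, 41.27)]))
```

Because every service returns the same container format, a client needs only one parser regardless of which catalog it queries.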
18
How to Use Existing Databases and Programs to
Create a Catalog Data Service
  • How can we create a catalog data service that
    understands ADQL and generates results in
    VOTable format?
  • We adopt two approaches:
  • Restructure the existing catalog DBMS
  • Encapsulate a search program, such as pmm
  • The CDS offers search programs for large
    catalogs such as USNO-A2.0.
19
Catalog data service based on DB
[Diagram: ADQL -> GT3 interface -> ADQL/SQL translator -> SQL -> JDBC -> DBMS (catalog/metadata) -> ResultSet -> VOTable wrapper -> VOTable]
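The database-backed path (translate the query, run it, collect the result set) can be sketched as follows. Everything here is an assumption for illustration: `sqlite3` stands in for the JDBC-connected DBMS, the table and column names are invented, and the "translator" handles only one ADQL-style construct (`SELECT TOP n`, rewritten as `LIMIT n`). A real ADQL-to-SQL translator is far more involved.

```python
import re
import sqlite3


def adql_to_sql(adql):
    """Toy translator: rewrite 'SELECT TOP n ...' as '... LIMIT n'."""
    match = re.match(r"\s*SELECT\s+TOP\s+(\d+)\s+(.*)", adql,
                     re.IGNORECASE | re.DOTALL)
    if not match:
        return adql  # nothing to translate in this toy dialect
    limit, rest = match.groups()
    return f"SELECT {rest.rstrip().rstrip(';')} LIMIT {limit}"


def run_query(conn, adql):
    """Translate the query, execute it, and return (column names, rows)."""
    cursor = conn.execute(adql_to_sql(adql))
    names = [description[0] for description in cursor.description]
    return names, cursor.fetchall()


def demo():
    # In-memory database with a hypothetical three-row catalog.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE catalog (id INTEGER, ra_deg REAL, dec_deg REAL)")
    conn.executemany("INSERT INTO catalog VALUES (?, ?, ?)",
                     [(1, 10.0, 20.0), (2, 11.0, 21.0), (3, 12.0, 22.0)])
    return run_query(conn, "SELECT TOP 2 id, ra_deg FROM catalog ORDER BY id")


if __name__ == "__main__":
    print(demo())
```

The column names and rows returned by `run_query` are exactly what a VOTable wrapper would need as input.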
20
Advantages and Disadvantages
  • Can make full use of the SQL language and
    implement complex queries.
  • DBMSs offer the most powerful functions for data
    management and maintenance.
  • Much work is needed to restructure the databases.
  • For large catalogs, such as USNO-B1.0 and 2MASS
    PSC, query efficiency is low.

21
(No Transcript)
22
Data service based on search program
[Diagram: ADQL -> GT3 interface -> ADQL translator -> parameters -> JNI/ -> search program (data files) -> output stream -> VOTable wrapper -> VOTable]
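The program-backed path can be sketched in the same spirit: pass search parameters to an external program and parse the stream it writes. This is only an analogy to the slide's design. It uses a subprocess rather than JNI, and the "search program" is a one-line stand-in invented for this example (it echoes a single fake source), not pmm or any real CDS tool.

```python
import subprocess
import sys

# Stand-in for a positional search program (hypothetical): it reads
# ra, dec and radius from the command line and prints "name ra dec" lines.
FAKE_SEARCH = (
    "import sys; ra, dec, radius = map(float, sys.argv[1:4]); "
    "print('src1', ra, dec)"
)


def cone_search(ra, dec, radius, command=None):
    """Run an external search program and parse its output into rows."""
    command = command or [sys.executable, "-c", FAKE_SEARCH]
    result = subprocess.run(
        command + [str(ra), str(dec), str(radius)],
        capture_output=True, text=True, check=True,
    )
    rows = []
    for line in result.stdout.splitlines():
        name, ra_text, dec_text = line.split()
        rows.append((name, float(ra_text), float(dec_text)))
    return rows


if __name__ == "__main__":
    print(cone_search(10.5, 20.25, 0.1))
```

The wrapper can expose only what the program accepts on its command line, which is the limitation the next slide points out.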
23
Advantages and Disadvantages
  • Positional search is faster than with a DBMS.
  • Offers only the search functions that the
    programs provide. Many programs offer only
    positional search, with no statistical functions.

24
Catalog Access Services We Provide
25
How to call a Catalog data service
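Participants: Grid Client, Resource Registry (<registry>), Data Service Factory, Data Service Instance, Database
1. <Find Factory> (Grid Client to Resource Registry)
2. <Factory GSH> (Grid Service Handle of the factory, returned to the Grid Client)
3. <create data service> (Grid Client to Data Service Factory)
4. <Data service GSH> (returned to the Grid Client after the factory creates the Data Service Instance)
5. <data request (ADQL)> (Grid Client to Data Service Instance, which queries the Database)
6. <result (VOTable)> (returned to the Grid Client)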
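The six-step call sequence can be mimicked with plain in-memory objects. This is a sketch of the pattern only: the class names, the `gsh://` handle format and the dictionary returned by `query` are all hypothetical, and no Globus Toolkit API is used or implied.

```python
import itertools


class DataServiceInstance:
    """Hypothetical stand-in for a transient data service instance."""

    def __init__(self, handle, catalog):
        self.handle = handle
        self.catalog = catalog

    def query(self, adql):
        # A real instance would run the query and return VOTable;
        # this stub only reports what it was asked.
        return {"catalog": self.catalog, "query": adql, "format": "VOTable"}


class DataServiceFactory:
    """Creates service instances and hands back a handle for each."""

    def __init__(self, catalog):
        self.catalog = catalog
        self._ids = itertools.count(1)
        self._instances = {}

    def create_service(self):
        handle = f"gsh://{self.catalog}/instance/{next(self._ids)}"
        self._instances[handle] = DataServiceInstance(handle, self.catalog)
        return handle

    def resolve(self, handle):
        return self._instances[handle]


class Registry:
    """Maps catalog names to their factories."""

    def __init__(self):
        self._factories = {}

    def register(self, catalog, factory):
        self._factories[catalog] = factory

    def find_factory(self, catalog):
        return self._factories[catalog]


def call_catalog_service(registry, catalog, adql):
    factory = registry.find_factory(catalog)  # steps 1-2: find the factory
    handle = factory.create_service()         # steps 3-4: create an instance
    instance = factory.resolve(handle)
    return handle, instance.query(adql)       # steps 5-6: query, get result


if __name__ == "__main__":
    registry = Registry()
    registry.register("2MASS", DataServiceFactory("2MASS"))
    print(call_catalog_service(registry, "2MASS", "SELECT TOP 10 * FROM psc"))
```

The client never names a host or database; it knows only the registry, which is what lets services move or multiply without changing client code.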
26
Using Data Services to Build Web Services for End Users
End users do not need to know where the data services are.
[Diagram: Web Client -> (http) -> Web server hosting the Data Mining, Data Processing and Data Visualization Services -> Grid Client -> Resources Register and Services Register -> data sources (MySQL, Oracle 9i, files)]
27
Using Data Services to Create Other Services
  • Our next task is to build a multi-wavelength
    cross-identification service (MWCI) based on the
    catalog data service.
  • What is multi-wavelength cross-identification?
  • By cross-identifying datasets on positional
    coincidence, we can study objects through their
    properties at different wavelengths.

28
The steps of multi-wavelength cross-identification
  • Cross-identify datasets from different
    wavelengths within an error radius.
  • Divide the cross-identification results into
    three cases: one-to-one, one-to-two and
    one-to-many.
  • Select the one-to-one matches for data mining.
  • The other two cases need statistical analysis to
    determine which source is the true counterpart.
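The first two steps above can be sketched as a brute-force positional match followed by a classification of each match by the number of candidates. The haversine separation, the function names and the sample coordinates are choices made for this illustration, not the MWCI implementation; a production service would use a spatial index rather than comparing every pair of sources.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))


def cross_identify(primary, secondary, radius_deg):
    """Map each primary id to the secondary ids within radius_deg."""
    matches = {}
    for pid, pra, pdec in primary:
        matches[pid] = [
            sid for sid, sra, sdec in secondary
            if angular_separation_deg(pra, pdec, sra, sdec) <= radius_deg
        ]
    return matches


def classify(matches):
    """Group matched primary ids by candidate count; unmatched ids are dropped."""
    groups = {"one-to-one": [], "one-to-two": [], "one-to-many": []}
    for pid, candidates in matches.items():
        if len(candidates) == 1:
            groups["one-to-one"].append(pid)
        elif len(candidates) == 2:
            groups["one-to-two"].append(pid)
        elif len(candidates) > 2:
            groups["one-to-many"].append(pid)
    return groups


if __name__ == "__main__":
    primary = [("a", 10.0, 0.0), ("b", 20.0, 0.0), ("c", 30.0, 0.0)]
    secondary = [("s1", 10.0001, 0.0), ("s2", 20.0001, 0.0),
                 ("s3", 19.9999, 0.0), ("s4", 30.0001, 0.0),
                 ("s5", 29.9999, 0.0), ("s6", 30.0, 0.0001)]
    print(classify(cross_identify(primary, secondary, 0.001)))
```

Only the `one-to-one` group can go straight to data mining; the other two groups are the ones that need the statistical analysis mentioned in the last step.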

29
Requirements
  • Locate the datasets that users want to use
    (dataset discovery).
  • Cross-match the datasets held in heterogeneous
    DBMSs at different locations effectively and
    efficiently.
  • Find storage resources to store the results.

30
[Diagram: MWCI workflow, numbered steps 1 to 7, linking the User Application, Registry, MWCI Factory and MWCI instance (MWCI Service Provider), Data Services for 2MASS and NVSS, and the storage Factory and storage instance (Storage Service Provider)]
31
Outline
  • VO, Grid and OGSA
  • Build the catalog data access service
  • Build the image mosaic service
  • Technical difficulties faced

32
Build The Image Mosaic Service
  • We used DSS-I sky images to build our first
    image mosaic service.

33
Definition of the Service Interface
  • Input parameters:
    1. RA, 2. Dec, 3. image height, 4. image width
  • Transport protocol: GridFTP
  • Output data format: FITS

34
Realization of DSS-I image mosaic service
[Diagram: parameters -> GT3 interface -> JNI/ -> GetImage (DSS-I image files) -> FITS file -> GridFTP]
35
Outline
  • VO, Grid and OGSA
  • Build the catalog data access service
  • Build the image mosaic service
  • Technical difficulties faced

36
Technical Difficulties
  • Service/resource registry and discovery
  • ADQL-to-SQL translator
  • Protocol shortcomings

37
Protocol Shortcomings
  • Shortcomings of the VOTable 1.0 protocol
    1. How should the result of a join query be
       encapsulated?
    2. There is no standard for encapsulating
       spectrum data.
    3. The definition of the FIELD element is
       neither strict nor complete.
  • Shortcomings of UCD
    1. It cannot express a concrete meaning, for
       example ERROR: error of what?
    2. It is incomplete, for example HTMID has no UCD.
  • Lack of a standard for units

38
Thank You

Q & A
www. .org
39
Catalogs Provided in Our Catalog Service
40
The Steps of Calling a Data Service
41
Transparencies for Astro Data Access
  • Heterogeneity Transparency
  • Name Transparency
  • Distribution Transparency

42
What is Grid Service?
43
What Is The Data Grid
  • Data Grid: a dynamic logical namespace that
    enables coordinated sharing of heterogeneous
    distributed storage resources and digital
    entities based on local and global policies
    across administrative domains in a virtual
    enterprise.
  • Data Grid:
    - Logical name space for location-independent
      identifiers
    - Abstractions for storage repositories,
      information repositories, and access APIs
    - Latency management

44
Using a Data Grid in Abstract
Data Grid
  • User asks for data from the data grid

45
(No Transcript)