IVOA's Data Integration Approach - PowerPoint PPT Presentation

Slides: 31
Provided by: asa2
Transcript and Presenter's Notes


1
IVOA's Data Integration Approach
  • OpenSkyQuery

2
Definitions
  • A Virtual Observatory (VO) is a collection of
    interoperating data archives and software tools
    which utilize the internet to form a scientific
    research environment in which astronomical
    research programs can be conducted. The VO
    consists of a collection of data centres each
    with unique collections of astronomical data,
    software systems and processing capabilities.
  • Various VO projects are funded through national
    and international programs, and all projects work
    together under the International Virtual
    Observatory Alliance (IVOA) to share expertise
    and develop common standards and infrastructures
    for data exchange and interoperability.
  • The goal of the IVOA is the development of
    architectural decisions and standards in the
    astronomy domain.
  • NVO (National Virtual Observatory) is a VO
    project compatible with the IVOA.

3
IVOA
  • Data is communicated between services in two
    basic formats: FITS and XML.
  • The IVOA architecture uses services at different
    levels: HTTP GET/POST services, SOAP services and
    Grid services.
  • The IVOA has an executive committee and interest
    groups such as the GGF Astro-RG.

4
IVOA Architecture Diagram
5
IVOA Standards
  • Metadata Registries for VO
  • Resource Metadata for the Virtual Observatory
  • IVOA Metadata Registry Interface
  • VOTable Format Definition
  • Unified Content Descriptors (UCD)
  • DAL Architecture
  • Simple Image Access Protocol
  • Simple Spectral Access Specification
  • IVOA Query Language
  • IVOA SkyNode Interface
  • Astronomical Data Query Language (ADQL)
  • VO Query Language
  • Data Modeling
  • A unified domain model for astronomy, for use in
    the Virtual Observatory
  • Data model for quantity
  • IVOA Observation data model
  • Simple Spectral Data Model
  • Simulation Data Model

Our focus area regarding IVOA's approach to data
access and integration
6
Data Access and Integration Issues
  • DAL Architecture
  • Simple Image Access Protocol
  • Simple Spectral Access Specification
  • VOQL and SkyNode Interfaces
  • IVOA Query Language
  • IVOA SkyNode Interface
  • Astronomical Data Query Language (ADQL)
  • VO Query Language

7
Data Access Layer (DAL)
  • Defines and formulates standards for uniform
    access to VO data that may have heterogeneous
    representations by different data providers.
  • A family of data access services provides access
    to VO resources:
  • 1. Simple Image Access (SIA)
      • uniform access to image archives
      • atlas and pointed image archives
      • image cutouts, image mosaics
      • the image is returned as a FITS file or a
        graphics file
  • 2. Simple Spectral Access (SSA, currently being
    specified)
      • access to 1D spectra and SEDs
      • spectra are returned as ASCII, VOTable or FITS
  • 3. VO Query Language and IVOA SkyNode Interfaces
      • VONode
      • OpenSkyNode
      • OpenSkyQuery Portal and Protocol

Data Integration
8
1. Simple Image Access Protocol
  • A protocol for retrieving image data from a
    variety of astronomical image repositories
    through a uniform interface.
  • Both SOAP/WSDL and HTTP GET based Web Service
    implementations are defined.
  • The SIA data model deals with the familiar
    astronomical image, which generally means a 2D
    sky projection with a data array that is
    logically a regular grid of pixels, encoded as a
    FITS image, GIF, JPEG, etc.
  • SIA includes standardized dataset metadata
    such as provenance, image geometry, scale,
    format, position, time of observation, spectral
    bandpass and access information.
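
As a rough illustration of the HTTP GET form of the protocol, the sketch below assembles an SIA-style query URL from a position, a size and a format. The POS, SIZE and FORMAT parameter names come from the SIA specification; the base URL and the function name are hypothetical and stand in for a real image service.

```python
from urllib.parse import urlencode


def build_sia_url(base_url, ra_deg, dec_deg, size_deg, image_format="image/fits"):
    """Return an SIA-style HTTP GET URL (illustrative sketch only)."""
    params = {
        "POS": f"{ra_deg},{dec_deg}",  # region centre as RA,Dec in decimal degrees
        "SIZE": f"{size_deg}",         # region size in decimal degrees
        "FORMAT": image_format,        # requested image format
    }
    # Keep ',' and '/' readable instead of percent-encoding them.
    return base_url + "?" + urlencode(params, safe=",/")


# Hypothetical endpoint, used only for illustration.
url = build_sia_url("http://example.org/sia", 180.0, -30.0, 0.2)
print(url)
```

A service would typically answer such a request with a table of matching images and their metadata, from which the client then picks the images to download.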

9
2. Simple Spectral Access Specification
  • A simple query uses POS, SIZE and FORMAT, like
    SIA, possibly refined by spectral or time
    bandpass, etc. In the simplest case, the data
    returned could be wavelength and flux as text
    (for a spectrum).
  • The goal of the Simple Spectral Access (SSA)
    specification is to define a uniform interface to
    spectral data including spectral energy
    distributions (SEDs), 1D spectra, and time series
    data. In contrast to 2D images, spectra are
    stored in a wide variety of formats and there is
    no widely used standard in astronomy for
    representing spectral data.
  • The data model for SEDs defines a set of spectra
    or time series, some of which may have only one
    or few data points (photometry) and each of which
    may have different contextual metadata like
    aperture, position, etc.
  • Spectra are returned as ASCII, VOTable or FITS.
  • Both SOAP/WSDL and HTTP GET based Web Service
    implementations are defined.
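
For the simplest case mentioned above, where a spectrum comes back as wavelength and flux in plain text, a client might read the response along these lines. This is a minimal sketch: the whitespace-separated two-column layout, the '#' comment convention and the sample values are assumptions made for illustration, not part of the SSA specification.

```python
def parse_text_spectrum(text):
    """Parse 'wavelength flux' lines into a list of (wavelength, flux) floats."""
    points = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (assumed to start with '#').
        if not line or line.startswith("#"):
            continue
        wavelength, flux = line.split()[:2]
        points.append((float(wavelength), float(flux)))
    return points


# Made-up sample response, for illustration only.
sample = """# wavelength flux
4000.0 1.25
4001.0 1.31
4002.0 1.28
"""
spectrum = parse_text_spectrum(sample)
print(len(spectrum), spectrum[0])
```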

10
3. VO Query Language and IVOA SkyNode
  • Data (in Databases)
  • Integration issues
  • ADQL, VOQL, SkyNode, OpenSkyServer

11
Astronomical Data Query Language (ADQL)
  • ADQL is based on a subset of SQL plus a region
    construct, with a circle (Cone Search) as the
    minimum supported region.
  • ADQL has two forms:
      • ADQL/x: an XML document conforming to the XSD
      • ADQL/s: a string form based on SQL92 and
        conforming to the ADQL grammar. Some
        non-standard extensions are added.
  • Extensions to SQL92 include:
      • A region specification. The region would look
        something like Region(CIRCLE J2000 19.5 36.7
        0.02)
      • JDBC mathematical functions are allowed in ADQL
      • XMATCH, which implies a crossmatch between two
        or more astronomical catalogues
      • To support XQuery as well as SQL, it will be
        possible to express selections and selection
        criteria as a simple XPath
      • Syntax to return only the first N records from
        a query
12
Sample ADQL/x
ADQL/s might be as follows:

    Select a.* from Tab a where Region('Circle Cartesian 1.2 2.4 3.6 0.2')

This is represented in XML as:

    <?xml version="1.0" encoding="utf-16"?>
    <Select xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Selection>
        <Items>
          <SelectionItem xsi:type="ExprSelectionItem">
            <Expr xsi:type="ColumnExpr">
              <Column xsi:type="AllColumnReference">
                <TableName>a</TableName>
              </Column>
            </Expr>
          </SelectionItem>
        </Items>
      </Selection>
      <TableClause>
        <FromClause>
          <TableReference>
            <Table>
              <Name>Tab</Name>
              <AliasName>a</AliasName>
            </Table>
          </TableReference>
        </FromClause>
        <WhereClause>
          <Condition xsi:type="RegionSearch">
            <Region xmlns:q1="urn:nvo-region" xsi:type="q1:circleType">
              <q1:Center>
                <Pos3Vector xmlns="urn:nvo-coords">
                  <CoordValue>
                    <Value>
                      <double>1.2</double>
                      <double>2.4</double>
                      <double>2.4</double>
                    </Value>
                  </CoordValue>
                </Pos3Vector>
              </q1:Center>
              <q1:Radius>0.2</q1:Radius>
            </Region>
          </Condition>
        </WhereClause>
      </TableClause>
    </Select>
13
VO Query Language
  • The Virtual Observatory Query Language (VOQL) is
    an ambitious language at a higher level than
    ADQL. A VOQL portal would take VOQL programs.
  • Layers of VOQL:
      • VOQL1 (Web Services): ADQL and VOTable to
        exchange information between machines
      • VOQL2 (Federation): an SQL-like query language
        and federation system, i.e. a combination of
        SkyQuery, JVOQL and VO standards
      • VOQL3 (SkyXQuery): a future XML-based query
        language.
  • The highest level of VOQL is a semantics-based
    language that allows astronomers to build queries
    in the language of astronomy rather than the
    language of databases.

14
IVOA SkyNode Interface
  • The SkyNode Interface describes the minimum
    required interface to participate in the IVOA as
    a queryable VONode as well as requirements to be
    a Full OpenSkyNode, part of the OpenSkyQuery
    Portal.
  • The OpenSkyQuery protocol drives a data service
    that allows querying of a relational database or
    a federation of databases. The request is written
    in a specific XML abstraction of SQL that is part
    of ADQL (Astronomical Data Query Language).
  • The Portal will formulate a plan and create
    multiple queries, typically one per archive. The
    results are then collected, joined and served to
    the users.
  • There are two types of SkyNodes:
      • Basic SkyNode
      • Full SkyNode

15
SkyNodes (Basic / Full)
  • Basic: simple ADQL/x queries
  • Full: ADQL/x and ADQL/s, performance query,
    ExecPlan, XMatch and footprint.

16
OpenSkyQuery from NVO
  • As an example of data integration according to
    IVOA

17
OpenSkyQuery
  • A Virtual Observatory prototype application that
    marries Web Services technology with emerging VO
    standards to enable dynamic cross-matching
    queries between different VO-enabled archives

18
OpenSkyQuery consists of
  • Open SkyNodes
      • Basic building blocks of the federated query
        system. They offer core services, including
        some specially sophisticated search functions.
        They are identical Web Services; only the
        content of the databases differs.
  • Open SkyPortal
      • The starting point for queries. Queries are
        divided up, organized into a sorted plan and
        sent off to the first node. The only thing a
        portal really has to do is split up a query
        and ship it.
  • NVO Registry
      • All nodes must be registered in this registry.

19
SkyNodes
  • SkyNodes are services supporting ADQL.
  • Database query interfaces based on Web Services
  • They take ADQL and return data.
  • The next generation of the DAL Cone Search
    protocol, providing federated access to
    distributed astronomical databases.
  • The formalism for distributed astronomical DBMS
    queries through large scale processing depends on
    the VOStore formalism.
  • SkyNode and SkyServer are used interchangeably.
    In case of OpenSkyServer, SkyServer is used.

20
OpenSkyPortal
  • Enables the OpenSkyQuery (OSQ)
  • Ability to build queries using a graphical
    interface (OSS). OSQ includes a query builder
    that allows creating complex ADQL queries. OSQ is
    also integrated with VOPlot to plot query
    results.
  • Planning Execution - ExecPlan Document
      • The portal makes a plan by asking each node for
        its estimate of data for the given query
        (Perform Queries).
      • The nodes are ordered based on this
        information: the one with the least data is
        the first to execute.
      • The ExecPlan is next sent to the first node in
        the plan (the one which will execute last),
        which passes it recursively to the other
        nodes.
      • The data is passed back from each node and
        XMatched. Finally the result is passed to the
        portal.

21
Execution Overview
  • After the portal has constructed the ExecPlan
    document, it sends it to the first node.
  • The first node does not execute its section yet;
    instead it passes the plan on to the next node,
    and this continues until the plan reaches the
    last node.
  • The last node runs its section and returns its
    results to the previous node, and this continues
    until the results have reached the portal again.
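
The execution order described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the node names, the estimates and the data are made up, and a plain set intersection on shared object identifiers stands in for the real XMatch, which is a positional cross-match between catalogues.

```python
def make_plan(estimates):
    """Order node names so that the node with the least data is last in the plan.

    The last node in the plan is the first one to execute.
    """
    return sorted(estimates, key=estimates.get, reverse=True)


def execute_plan(plan, node_data, log):
    """Pass the plan down the chain; a node runs only after the next node returns."""
    head, rest = plan[0], plan[1:]
    if not rest:
        log.append(head)                    # last node: runs its own section first
        return set(node_data[head])
    incoming = execute_plan(rest, node_data, log)
    log.append(head)                        # runs once downstream results arrive
    return incoming & set(node_data[head])  # stand-in for XMatch


# Hypothetical per-node estimates and object identifiers.
estimates = {"A": 5, "B": 2, "C": 3}
node_data = {"A": [1, 2, 3, 4, 5], "B": [2, 5], "C": [2, 3, 5]}

plan = make_plan(estimates)
log = []
result = execute_plan(plan, node_data, log)
print(plan, log, sorted(result))
```

Running the smallest node first keeps the intermediate results that travel back up the chain as small as possible, which is the point of ordering the plan by estimated size.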

22
OpenSkyQuery Architecture
[Diagram: OpenSkyQuery architecture. Labels: Open SkyQuery Portal; VOQL Portal (uses only Registry, Lev3 and Lev4 SkyNodes; a high-level language allowing seemingly uniform access to services); timeline markers "End 2003" and "later"; MetaData, SkyQuery, LEV3, SkyQueryWebApp, VOQLQuery, PerformQuery, Tables, Columns, XMatch, ExecPlan; Clients (may use services at any level).]
23
A part of the original ExecPlan document that the
portal passes to the nodes
ExecPlan document: lists of nodes and queries
24
ExecPlan Document
  • Two sections in the Plan:
  • 1. Format
      • The specified transport for this particular
        plan. It is almost always VOTABLE, but may
        occasionally be DataSet. VOTABLE is the only
        format that nodes are required to support.
  • 2. PlanElements
      • An array of PlanElement objects.
      • Sorted from lowest index to highest. The node
        at PlanElements[0] would be the first to
        receive the plan.
  • PlanElement
      • Statement
      • Hosts (list of mirrors for that node -
        serviceURL in registry)
      • Target (shortName (from the registry) of the
        intended node)
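
The structure listed above can be pictured as a pair of small record types. The sketch below is a hypothetical in-memory model, not the actual ExecPlan schema: the field names mirror the slide (Format, PlanElements, Statement, Hosts, Target), while the Python naming, the helper method and all sample values are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanElement:
    statement: str    # the query this node should run
    hosts: List[str]  # mirrors for the node (serviceURL in the registry)
    target: str       # shortName of the intended node


@dataclass
class ExecPlan:
    format: str = "VOTABLE"  # transport format; VOTABLE is the required one
    plan_elements: List[PlanElement] = field(default_factory=list)

    def first_receiver(self) -> PlanElement:
        """The node at index 0 is the first to receive the plan."""
        return self.plan_elements[0]


# Made-up node names, URLs and statements, for illustration only.
example_plan = ExecPlan(plan_elements=[
    PlanElement("<Select>...</Select>", ["http://example.org/nodeA"], "NODEA"),
    PlanElement("<Select>...</Select>", ["http://example.org/nodeB"], "NODEB"),
])
print(example_plan.format, example_plan.first_receiver().target)
```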

25
Summary
  • IVOA is not directly addressing the data
    integration issues.
  • In order to solve data integration problems, it
    does not propose any innovative architecture;
    instead, it plans to use:
      • SRB or NGAS for the implementation of VOSpace
      • GridFTP or a simple application of OGSA-DAI for
        the implementation of VOStore.

26
APPENDIX
27
Standards for accessing and querying data and
metadata
  • Create standard metadata attributes to describe
    astronomy quantities.
      • UCD: Unified Content Descriptor
  • Use standard data formats for storage and
    transmission
      • FITS image format
      • VOTable
  • Create standard services for accessing data
    formats using standard metadata
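
To show how a standard format and standard metadata work together, the sketch below reads a tiny VOTable-style document and pulls out each column's name and UCD along with the rows. The element names (VOTABLE, RESOURCE, TABLE, FIELD, DATA, TABLEDATA, TR, TD) follow the VOTable format; the sample table is made up, and the XML namespace that real VOTable documents usually declare is omitted to keep the example short.

```python
import xml.etree.ElementTree as ET

# Made-up, namespace-free sample document, for illustration only.
VOTABLE_TEXT = """<VOTABLE version="1.1">
  <RESOURCE>
    <TABLE name="results">
      <FIELD name="ra" datatype="double" unit="deg" ucd="pos.eq.ra;meta.main"/>
      <FIELD name="dec" datatype="double" unit="deg" ucd="pos.eq.dec;meta.main"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>19.5</TD><TD>36.7</TD></TR>
          <TR><TD>19.6</TD><TD>36.8</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def read_votable(xml_text):
    """Return ([(column name, UCD), ...], [[row values as floats], ...])."""
    root = ET.fromstring(xml_text)
    table = root.find("./RESOURCE/TABLE")
    fields = [(f.get("name"), f.get("ucd")) for f in table.findall("FIELD")]
    rows = [[float(td.text) for td in tr.findall("TD")]
            for tr in table.findall("./DATA/TABLEDATA/TR")]
    return fields, rows


fields, rows = read_votable(VOTABLE_TEXT)
print(fields)
print(rows)
```

Because the UCD travels with each column, a client can find the right ascension column by its descriptor even when different archives give the column different names.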

28
Terms Conflict: SkyNode, SkyServer, VOStore,
VOSpace
  • VOStore and VOSpace are defined in Grid and Web
    Services specs. These are more generic terms than
    the SkyNode and SkyServer.
  • IVOA is still not sure about the underlying
    implementation of VOStore and VOSpace
  • VOStore seeks to develop a common API for
    managing and using remote read/write storage.
  • VOSpace manages metadata and data collections and
    sits between user and VOStore.
  • Minimal info management system to organize shared
    collections.
  • VOSpace can function in near term as a SRB or
    NGAS interoperability layer.
  • SkyNodes are services supporting ADQL.
  • Database query interfaces based on Web Services
  • The next generation of the DAL Cone Search
    protocol, providing federated access to
    distributed astronomical databases.
  • The formalism for distributed astronomical DBMS
    queries through large scale processing depends on
    the VOStore formalism.
  • SkyNode and SkyServer are used interchangeably.
    In case of OpenSkyServer, SkyServer is used.

29
Terms
  • SkyNodes are VONodes serving Data kept in
    Databases.
  • SkyNodes have ADQL based SOAP interfaces
    returning VOTable based results.
  • OpenSkyQuery Portal is a portal allowing access
    to multiple SkyNodes and enabling integration of
    data.
  • IVOA has specifications for VOStore (SkyNode) and
    VOSpace, but no implementation specifications.
  • VOSpace manages metadata and data collections and
    sits between the user (portal) and VOStore.

30
ADQL/s
  • Sample ADQL/s querying two distributed Databases
  • FROM statement specifies which databases to use
    and defines alias for the databases
  • The XMATCH and Region clauses are OpenSkyQuery
    extensions to SQL
  • The XML version of the query below (ADQL/s) is
    called ADQL/x and is put into the Statement tag
    of the ExecPlan document. (see Slide 21)
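
A minimal sketch of how a two-database query string of this kind might be assembled is given below. It is not the query from the original slide. The archive names, table names, the objId column, the archive:table notation and the exact XMATCH and Region syntax are all assumptions modelled loosely on SkyQuery-style examples, and should be checked against the ADQL grammar before use.

```python
def build_xmatch_query(tables, region, tolerance):
    """Assemble an ADQL/s-like cross-match query string (illustrative only).

    tables: list of (archive, table, alias) tuples.
    """
    select_list = ", ".join(f"{alias}.objId" for _, _, alias in tables)
    from_list = ", ".join(f"{archive}:{table} {alias}"
                          for archive, table, alias in tables)
    aliases = ", ".join(alias for _, _, alias in tables)
    return (f"SELECT {select_list} "
            f"FROM {from_list} "                         # databases and their aliases
            f"WHERE XMATCH({aliases}) < {tolerance} "    # cross-match extension
            f"AND Region('{region}')")                   # region extension


# Hypothetical archives and values.
query = build_xmatch_query(
    [("SDSS", "PhotoPrimary", "o"), ("TWOMASS", "PhotoPrimary", "t")],
    "CIRCLE J2000 181.3 -0.76 6.5",
    3.5,
)
print(query)
```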