VO Query Language


Provided by: edsh8

Transcript and Presenter's Notes

Title: VO Query Language


1
VO Query Language
  • GSFC XML Group
  • Ed Shaya
  • Brian Thomas
  • Kirk Borne

2
VOQL Requirements
  • Provide a means for users to submit general
    requests for astronomical information from a
    distributed set of repositories.
  • Allow for the science use cases.
  • Easy to learn and use
  • Hide from the user obvious but tedious steps
  • May require several levels of language, with only
    the top level being easy.
  • Allow for web form entry.
  • Independent of internal arrangement of data at
    repositories.
  • Plug-n-play metadata and ontology.
  • Span a distributed set of heterogeneous services.
  • Each VO query can transform to multiple queries
    in local dialects.
  • Workflow of interactions between registries,
    services, and user.
  • Integration of multiple responses

3
More VOQL Requirements
  • Easy to parse and transform into other forms
  • Extensible
  • Sites can extend query language through local
    namespaces
  • The VO namespace can add language elements in the
    future.

4
XML Query Language
  • Compatible XML and human-readable versions
  • XQuery is a superset of XPath
  • Based on Quilt, XQL, and XML-QL
  • Quilt is based on Object Query Language (OQL)
  • OQL is based on Structured Query Language (SQL)
  • if/then/else, case switch, basic functions,
    define new functions
  • FLWR (for, let, where, return)
  • for $i in (1 to 3)
  • let $j := (1 to $i)
  • Results in
  • i = 1, j = 1
  • i = 2, j = (1,2)
  • i = 3, j = (1,2,3)
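
The for/let bindings on this slide can be mimicked in Python as a rough sketch: `for` iterates over a sequence, while `let` binds a whole sequence to a single variable without iterating. The function name `flwr_bindings` is an illustrative assumption, not part of XQuery.

```python
# Sketch of the slide's FLWR bindings: one result tuple per value of i,
# with j bound to the entire sequence 1..i (let does not iterate).
def flwr_bindings(upper=3):
    results = []
    for i in range(1, upper + 1):      # for $i in (1 to 3)
        j = list(range(1, i + 1))      # let $j := (1 to $i)
        results.append((i, j))
    return results


for i, j in flwr_bindings():
    print(f"i = {i}, j = {j}")
```

The key point is that `let` produces one binding per outer iteration, so the number of result tuples is set by the `for` clause alone.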

5
XQuery Continued
  • for $s in document('bright_stars.xml')//id_main
  • let $b := document('photometry.xml')//starname[. = $s]/band
  • where count($b) gt 1
  • return
  • <colors>
  • <starName>$s</starName>
  • for $j in (2 to count($b)) return
  • <color name=$b[$j]/@name - $b[$j - 1]/@name>
  • $b[$j]/value - $b[$j - 1]/value
  • </color>
  • </colors>
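
As a hedged illustration of what the query above computes, the sketch below derives a color index for each pair of adjacent bands of one star. The input layout (an ordered list of band name and magnitude pairs) and the function name `adjacent_colors` are assumptions made for this example; the slides do not specify the structure of photometry.xml.

```python
def adjacent_colors(bands):
    """bands: ordered list of (band_name, magnitude) pairs for one star.

    Returns (label, difference) pairs for adjacent bands, mirroring
    b[j]/value - b[j-1]/value in the query above.
    """
    if len(bands) < 2:                 # where count($b) gt 1
        return []
    colors = []
    for j in range(1, len(bands)):     # for $j in (2 to count($b))
        name, value = bands[j]
        prev_name, prev_value = bands[j - 1]
        colors.append((f"{name} - {prev_name}", value - prev_value))
    return colors


# Made-up magnitudes, for illustration only.
print(adjacent_colors([("B", 1.5), ("V", 1.0)]))
```

A star with a single band yields no colors, which matches the `where` clause filtering out such stars.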

6
(No Transcript)
7
OLAP/XMLA
  • On-Line Analytical Processing (OLAP)
  • Reduces bandwidth/time of data out
  • Statistical Package add on to Databases
  • Analysis of DataCubes
  • Hierarchy of Axis Values
  • Years, Months, Days, Hours, minutes
  • Degrees, minutes, seconds
  • Interior, core, mantle, atmosphere, mesosphere,
    exosphere

8
JVO Query Language (Naoki Yasuda)
  • Retrieves catalog data and images from multiple
    data servers via a single user interface
  • Extension of SQL
  • Catalog.UCD
  • Box(Point(c1.ra,c1.dec), width1,height1)
  • XMATCH(c1,c2,!c3,) < 3 arcsec
  • Select Catalog Keyword1 Keyword2
  • Select by MAX|MIN(PROPERTY) ALL NAME
  • Area inside|outside area0
  • Area1 overlap|union area2 shape
  • SHAPE: box, circle, oval, triangle, point
  • DIFF(x.obs_date, y.obs_date) > 30 days
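
The positional cross-match condition on this slide (XMATCH within 3 arcsec) can be sketched as a brute-force pairing on angular separation. This is a minimal illustration, not the JVO implementation: the function names, the dictionary layout of catalog rows, and the use of degrees are assumptions, and the exclusion form (`!c3`) is not covered.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine form); all inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def xmatch(cat1, cat2, radius_arcsec=3.0):
    """Return (id1, id2) pairs closer than radius_arcsec (O(N*M) scan)."""
    return [
        (a["id"], b["id"])
        for a in cat1
        for b in cat2
        if angular_separation_arcsec(a["ra"], a["dec"], b["ra"], b["dec"]) < radius_arcsec
    ]


# Hypothetical rows: "x" lies 1 arcsec from "a", "y" lies 36 arcsec away.
cat1 = [{"id": "a", "ra": 10.0, "dec": 20.0}]
cat2 = [{"id": "x", "ra": 10.0, "dec": 20.0 + 1.0 / 3600.0},
        {"id": "y", "ra": 10.0, "dec": 20.01}]
print(xmatch(cat1, cat2))
```

A production service would use a spatial index rather than the double loop; the loop is kept here only to make the matching criterion explicit.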

9
Data mining
  • Beyond finding data: intense data filtering,
    conditioning, knowledge synthesis.
  • Grid Services?
  • Principal Component Analysis
  • Iterative solutions
  • Genetic algorithms
  • Maximum-likelihood functions
  • Neural nets
  • Decision trees
  • Cluster analysis
  • Regression analysis

10
Data Objects
  • Dataset
    • Tables
      • Fields
        • Units
        • Class (UCD)
        • Range
        • Values
    • Images
      • Axes
      • Coordinate Maps
      • Data Values
    • Spectra
      • Wavelength
      • Intensity

11
ADQL
  • Obtain data sets
    • By bibliographic query: author, date published,
      title, journal, volume
    • By description: keywords, abstract, mission name
  • Obtain tables
    • By title, table, field names
    • By XPath, e.g.
      /LocalGroup/galaxyM31/region7/v-band
  • Obtain table data by UCDs or field names
    • Min/max of range, regular expression
  • Obtain N-cube data
    • Subset by axis values
    • Subset by ra, dec, radius or, more generally,
      Func(axes1..)
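
The "obtain table data by UCDs or field names" step, with min/max and regular-expression constraints, might look like the following sketch. The table layout, the function names, and the UCD strings used here are illustrative assumptions rather than anything defined by ADQL.

```python
import re


def find_field(table, ucd=None, name=None):
    """Return the column name whose UCD or field name matches."""
    for f in table["fields"]:
        if (ucd is not None and f.get("ucd") == ucd) or \
           (name is not None and f["name"] == name):
            return f["name"]
    raise KeyError(f"no field with ucd={ucd!r} or name={name!r}")


def select_rows(table, column, lo=None, hi=None, pattern=None):
    """Keep rows whose value lies in [lo, hi] and matches the regex, if given."""
    selected = []
    for row in table["rows"]:
        value = row[column]
        if lo is not None and value < lo:
            continue
        if hi is not None and value > hi:
            continue
        if pattern is not None and not re.search(pattern, str(value)):
            continue
        selected.append(row)
    return selected


# Hypothetical table; names, UCD labels, and values are made up.
table = {
    "fields": [{"name": "star", "ucd": "ID_MAIN"},
               {"name": "vmag", "ucd": "PHOT_JHN_V"}],
    "rows": [{"star": "HD 1", "vmag": 12.5},
             {"star": "HD 2", "vmag": 14.1},
             {"star": "BD 3", "vmag": 14.9}],
}
vcol = find_field(table, ucd="PHOT_JHN_V")
in_range = select_rows(table, vcol, lo=13, hi=15)
hd_only = select_rows(table, find_field(table, name="star"), pattern=r"^HD")
```

Resolving the column through its UCD first is what lets the same request run against tables that name the magnitude column differently.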

12
Astronomy Data Query Language (ADQL)
13
ADQL/Query Schema
14
Knowledge Based Query
  • Class -> Instance -> Objects
  • Property (V-band) -> Instance -> value (-1.4)
  • Measurement: property values are Data
  • Modifier (aperture) -> Instance -> value (3 arcsec)
  • Modifier (inequality) -> Instance -> value (before,
    not)
  • Aggregate property: member, region, component
  • Values are bags of objects
  • SubclassOf property: subclass has a restricted
    property value range or a restricted list of
    properties.
  • Property Space: N properties form a space.
  • A bit of math is needed to relate values.
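
One possible way to picture the class, property, and modifier relationships on this slide is a small data model. The class and method names below are assumptions for illustration only; the second measurement (5 arcsec aperture) is a made-up value added to show why modifiers matter.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Modifier:
    name: str      # e.g. "aperture"
    value: Any     # e.g. "3 arcsec"


@dataclass
class Property:
    name: str      # e.g. "V-band"
    value: Any     # the measured datum
    modifiers: List[Modifier] = field(default_factory=list)


@dataclass
class Instance:
    cls: str
    properties: List[Property] = field(default_factory=list)

    def values(self, name, **required_modifiers):
        """Values of property `name` whose modifiers match all keyword filters."""
        out = []
        for p in self.properties:
            mods = {m.name: m.value for m in p.modifiers}
            if p.name == name and all(
                mods.get(k) == v for k, v in required_modifiers.items()
            ):
                out.append(p.value)
        return out


star = Instance("star", [
    Property("V-band", -1.4, [Modifier("aperture", "3 arcsec")]),
    Property("V-band", -1.3, [Modifier("aperture", "5 arcsec")]),  # invented value
])
print(star.values("V-band", aperture="3 arcsec"))
```

Keeping modifiers separate from the property value is what allows two V-band measurements of the same object to coexist without ambiguity.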

15
Problem Statement Language Root
16
PSL Constraint
17
PSL AstroObject
18
Dataset Schema
<dataset subject="astronomy">
  <title>AC 2000.2 The Astrographic Catalogue on the Hipparcos System</title>
  <altname type="ADC">1275</altname>
  <altname type="CDS">I/275</altname>
  <altname type="brief">The AC 2000.2 Catalogue</altname>
  <references type="source">
    <reference>
      <title>AC 2000.2 The Astrographic Catalogue on the Hipparcos System</title>
      <author><initial>S</initial><initial>E</initial><lastName>Urban</lastName></author>
      <author><initial>T</initial><initial>E</initial><lastName>Corbin</lastName></author>
      <author><initial>G</initial><initial>L</initial><lastName>Wycoff</lastName></author>
      <author><initial>E</initial><lastName>Hoeg</lastName></author>
      <author><initial>C</initial><lastName>Fabricius</lastName></author>
      <author><initial>V</initial><initial>V</initial><lastName>Makarov</lastName></author>
      <journal>
        <name>Astron. J.</name>
        <volume>115</volume>
        <pageno>1212</pageno>
        <date><year>1998</year></date>
        <bibcode>1998AJ....115.1212U</bibcode>
      </journal>
    </reference>
  </references>
19
Dataset Continued
  <keywords xml:base="http://adc.gsfc.nasa.gov/keywordLists/adc/"
            parentListURL="adc_keywordList.html">
    <keyword xlink:href="kw_p.html#Positional_data">Positional data</keyword>
    <keyword xlink:href="kw_a.html#Astrographic_zones">Astrographic zones</keyword>
    <keyword xlink:href="kw_s.html#Surveys">Surveys</keyword>
  </keywords>
  <descriptions>
    <description>
      <para>The AC 2000.2 is a revised version of the 1997 release of the
      AC 2000 (Cat. &lt;I/247&gt;). It was decided that the availability of an
      improved reference catalogue and the inclusion of photometry from the
      Tycho-2 catalogue would be sufficient to warrant a complete
      re-reduction of the data and a new distribution of the catalogue.
      The AC 2000.2 catalog contains positions of 4,621,751 stars at the
      average epoch of plate exposures for each star (average 1907).</para>
    </description>
20
Case Study 0 Setting up the Query
  • Return RA, Dec, Vmag for stars with 13 < Vmag < 15
    and 101253.5 < RA < 131343 and 183800 < DE <
    184000.
  • PSL
  • <object class="star">
  • <property name="Vmag">
  • <range min="13" max="15"/>
  • <value>?vmag</value>
  • </property>
  • <property name="RA">
  • <range min="101253.5" max="131343"/>
  • <value>?ra</value>
  • </property>
  • <property name="DE">
  • <range min="183800" max="184000"/>
  • <value>?de</value>
  • </property>
  • </object>
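
To make the PSL fragment above concrete, here is a minimal sketch that parses it with Python's standard `xml.etree.ElementTree` and applies the ranges to a candidate row. The helper names `parse_psl` and `matches` are assumptions, attribute values are quoted so the fragment is well-formed XML, and the coordinates are kept in the packed numeric form shown on the slide.

```python
import xml.etree.ElementTree as ET

PSL = """
<object class="star">
  <property name="Vmag"><range min="13" max="15"/><value>?vmag</value></property>
  <property name="RA"><range min="101253.5" max="131343"/><value>?ra</value></property>
  <property name="DE"><range min="183800" max="184000"/><value>?de</value></property>
</object>
"""


def parse_psl(text):
    """Return (object class, {property: (min, max, output variable)})."""
    root = ET.fromstring(text)
    constraints = {}
    for prop in root.findall("property"):
        rng = prop.find("range")
        constraints[prop.get("name")] = (
            float(rng.get("min")),
            float(rng.get("max")),
            prop.findtext("value"),
        )
    return root.get("class"), constraints


def matches(row, constraints):
    """True if every constrained property lies strictly inside its range."""
    return all(lo < row[name] < hi for name, (lo, hi, _var) in constraints.items())


obj_class, constraints = parse_psl(PSL)
print(obj_class, constraints["Vmag"])
```

The `?vmag`, `?ra`, and `?de` placeholders are carried along as output variable names so that a later step can decide which columns to return.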

21
Case Study 0 Mapping Query to Metadata
  • Search for tables with metadata that satisfy
  • Object/class=star search -> keyword,
    description
  • Property@name=Vmag search -> field/UCD, name
  • Property@name=RA search -> field/UCD, name
  • Property@name=DE search -> field/UCD, name
  • Property/range search -> field/min and field/max
    or coverage attributes
  • For all such tables, return
  • ?vmag, ?ra, ?de
  • Also, return group/field@name=error for group
    with Vmag info.
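
The mapping steps above can be sketched as a search over table metadata: match the object class against keywords and description, match each property against field names or UCDs, and keep only tables whose field ranges overlap the requested range. Everything named here (the metadata layout, `ucd_hints`, the UCD strings, the table titles) is a hypothetical stand-in, not a registry format defined in the presentation.

```python
def field_for(table, prop_name, ucd_hints):
    """Find a field matching the property by name or by an accepted UCD."""
    for f in table["fields"]:
        if f["name"].lower() == prop_name.lower() or \
           f.get("ucd") in ucd_hints.get(prop_name, ()):
            return f
    return None


def ranges_overlap(lo, hi, fmin, fmax):
    return fmin <= hi and fmax >= lo


def matching_tables(tables, obj_class, constraints, ucd_hints):
    """Return (title, {property: column}) for every table that can serve the query."""
    hits = []
    for t in tables:
        text = " ".join(t.get("keywords", [])) + " " + t.get("description", "")
        if obj_class.lower() not in text.lower():
            continue
        mapping = {}
        for name, (lo, hi) in constraints.items():
            f = field_for(t, name, ucd_hints)
            if f is None or not ranges_overlap(lo, hi, f["min"], f["max"]):
                break
            mapping[name] = f["name"]
        else:  # no break: every property was mapped
            hits.append((t["title"], mapping))
    return hits


# Hypothetical metadata for two tables.
tables = [
    {"title": "T1", "keywords": ["Positional data", "Stars"], "description": "",
     "fields": [{"name": "Vmag", "ucd": "PHOT_JHN_V", "min": 8, "max": 14},
                {"name": "RAJ2000", "ucd": "POS_EQ_RA_MAIN", "min": 0, "max": 240000}]},
    {"title": "T2", "keywords": ["Galaxies"], "description": "galaxy redshifts",
     "fields": [{"name": "z", "ucd": "REDSHIFT_HC", "min": 0, "max": 1}]},
]
query = {"Vmag": (13, 15), "RA": (101253.5, 131343)}
hints = {"Vmag": {"PHOT_JHN_V"}, "RA": {"POS_EQ_RA_MAIN"}}
hits = matching_tables(tables, "star", query, hints)
```

The returned mapping from property names to local column names is the piece a translator would need to restate the query for each data source.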

22
Problem Statement Language (PSL)
PSL pull-down: AndConstraints, AndProperties
Property Name pull-down: Name, Class, etc.
MathML pull-down: +, -, /, *, sum, avg, <, >, etc.
  • Begin Request
  • Constraint: find astronomical objects with the
    following properties
      AND these properties
        1. Name: assign to var1
        2. Class is "cluster of galaxies galaxy cluster"
        3. Measurement quantities satisfy
           a. X-ray brightness > 3.3E7 Jy, assign to var2
              1. Time interval of measurement: 1998Y-1999Y
  • Using the above variables, satisfy the math
    formulae
        1. (var2 var3) < (var1 logvar4)
      OR these constraints
        several constraints for which one must be true,
        etc.
  • Return a table with the following sequence of
    fields: var1, var2
  • End Request

23
Brian Thomas Infrastructure
24
Tony Linde's Infrastructure
  • VO activity
  • User
  • Problem Assistant: service to help user state
    the problem
  • Ontology: terms and relationships derived from
    existing data
  • Workflow: to retrieve data, merge it, analyze
    it, reduce it
  • Registry: lists all services and their high
    level metadata
  • Job Control: decides which jobs and when
  • Data Centre: receiver of query for all internal
    data sources
  • Data Source Service: uses translator to restate
    query
  • Translator: from data query language to
    implemented service
  • Languages
  • Problem Statement Language (PSL)
  • Workflow Language (WFL)
  • Astronomical dataset Query Language (ADQL)
  • Ontology Query Language (OQL)
  • Registry Query Language (RQL)
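
The Translator role listed above (restating a data query for an implemented service) could be sketched as a function that turns range constraints into a SQL string for one local database. The table name `ac2000`, the column names, and the function `to_sql` are all hypothetical; real services would also need parameter binding rather than string interpolation.

```python
def to_sql(table, columns, constraints, column_map):
    """Restate range constraints as SQL for one local schema.

    column_map maps VO-level property names to local column names.
    Values are interpolated directly, so this is for trusted input only.
    """
    select = ", ".join(column_map[c] for c in columns)
    where = " AND ".join(
        f"{column_map[name]} > {lo} AND {column_map[name]} < {hi}"
        for name, (lo, hi) in constraints.items()
    )
    return f"SELECT {select} FROM {table} WHERE {where}"


sql = to_sql(
    "ac2000",
    ["RA", "DE", "Vmag"],
    {"Vmag": (13, 15)},
    {"RA": "ra_deg", "DE": "dec_deg", "Vmag": "v_mag"},
)
print(sql)
```

Swapping in a different `column_map` per data source is the simplest form of the "multiple queries in local dialects" idea; differing SQL dialects or non-SQL services would need their own translator functions.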

25
Conclusion
  • Metadata should clearly distinguish between
    values that are property values and those that
    are modifiers of properties.
  • Then, a mapping from a natural(ish) scientific
    knowledge-based language (PSL) to a request
    language for data-center common items (ADQL) is
    possible.
  • A federated system with a VO-wide vocabulary plus
    specialized (local) namespaces is best for
    getting started right away while permitting
    evolution.