Science Environment for Ecological Knowledge - PowerPoint PPT Presentation

1
Science Environment for Ecological Knowledge
Bertram Ludäscher
San Diego Supercomputer Center
University of California, San Diego
http://seek.ecoinformatics.org
2
Architecture Overview
  • Analysis and Modeling System (AMS)
  • Design and execution of ecological models and
    analysis
  • End user focus
  • application-/upperware
  • Semantic Mediation System
  • Data Integration of hard-to-relate sources and
    processes
  • Semantic Types and Ontologies
  • upper middleware
  • EcoGrid
  • Access to ecology data and tools
  • middle-/underware

(cf. GEON Cyberinfrastructure)
  • Plus Working Groups
  • Knowledge Representation (SEEK-KR)
  • Classification and Nomenclature (TAXON)
  • Biodiversity and Ecological Analysis and
    Modeling (BEAM)

3
SEEK EcoGrid
  • Goal: standardize interfaces (using web and grid
    services)
  • We have standardized data via EML
  • Integrate diverse data networks from ecology,
    biodiversity, and environmental sciences
  • Grid-standardized interfaces
  • Uniform interface to
  • Metacat, SRB, DiGIR, Xanthoria, etc.
  • Anyone can implement these interfaces
  • Hides complexity of underlying systems
  • Metadata-mediated data access
  • Supports multiple metadata standards
  • EML, Darwin Core as foci
  • Computational services
  • Pre-defined analytical services
  • On-the-fly analytical services
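
A minimal sketch of what "uniform interface" could mean in practice, assuming a two-method contract (query and get). The class names, method names, and the in-memory backend are hypothetical stand-ins, not the actual EcoGrid API; a real adapter would wrap Metacat, SRB, DiGIR, or Xanthoria.

```python
from abc import ABC, abstractmethod


class EcoGridNode(ABC):
    """Hypothetical uniform interface that every node type implements."""

    @abstractmethod
    def query(self, keyword):
        """Return identifiers of data sets whose metadata match the keyword."""

    @abstractmethod
    def get(self, identifier):
        """Return the raw bytes of one data set."""


class InMemoryNode(EcoGridNode):
    """Stand-in for a real backend; hides how records are actually stored."""

    def __init__(self, name, records):
        self.name = name
        self._records = records  # identifier -> (metadata text, data bytes)

    def query(self, keyword):
        kw = keyword.lower()
        return sorted(i for i, (meta, _) in self._records.items() if kw in meta.lower())

    def get(self, identifier):
        return self._records[identifier][1]


def federated_query(nodes, keyword):
    """Ask every node the same question through the same interface."""
    return {node.name: node.query(keyword) for node in nodes}


nodes = [
    InMemoryNode("metacat-demo", {"m.1": ("Biomass survey, EML", b"a,b\n1,2\n")}),
    InMemoryNode("digir-demo", {"d.7": ("Specimen records, Darwin Core", b"id\n7\n")}),
]
hits = federated_query(nodes, "biomass")
```

The client code never branches on the node type, which is the point of the "hides complexity of underlying systems" bullet.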

4
Grid versus Web Services
  • Grid Services are Web Services
  • Add authentication, lifecycle management,
    notification, etc.
  • Globus Toolkit 3 (GT3) implements Open Grid Services
    Architecture (OGSA)
  • Implications for use
  • Write a normal web service extending GridService
    base class
  • When deployed within GT3, you get these extra
    functions for free
  • Supports distributed computation via proxy
    authentication
  • Problems
  • Complex system to understand
  • GT3 can be difficult to deploy
  • Proposals to incorporate grid services within the
    Web services community (Web Services Resource
    Framework, WSRF)

5
EcoGrid client interactions
  • Modes of interaction
  • Client-server
  • Fully distributed
  • Peer-to-peer
  • EcoGrid Registry
  • Node discovery
  • Service discovery
  • Aggregation services
  • Centralized access
  • Reliability
  • Data preservation
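
As a rough illustration of node and service discovery, the registry can be thought of as a map from service types to endpoints. The class, its methods, and the example endpoints below are hypothetical; the slides do not specify the registry's interface.

```python
from collections import defaultdict


class Registry:
    """Hypothetical registry: maps service types to node endpoints."""

    def __init__(self):
        self._by_service = defaultdict(set)

    def register(self, endpoint, services):
        for service in services:
            self._by_service[service].add(endpoint)

    def nodes(self):
        """Node discovery: every endpoint known to the registry."""
        return sorted(set().union(*self._by_service.values()))

    def find(self, service):
        """Service discovery: endpoints that offer the given service type."""
        return sorted(self._by_service.get(service, ()))


registry = Registry()
registry.register("node-a.example.org", ["query", "get"])
registry.register("node-b.example.org", ["query"])
```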

6
Building the EcoGrid
  • LTER Network (24)
  • Natural History Collections (>> 100)
  • Organization of Biological Field Stations (180)
  • UC Natural Reserve System (36)
  • Partnership for Interdisciplinary Studies of Coastal Oceans (4)
  • Multi-agency Rocky Intertidal Network (60)
Metacat node
SRB node
VegBank node
DiGIR node
Xanthoria node
Legacy system
7
Kepler Scientific Workflows
Query EcoGrid to find data
Archive output to EcoGrid
EML provides semi-automated data binding
Scientific workflows represent knowledge about the process
Kepler captures this knowledge
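
One way to picture metadata-driven data binding: attribute descriptions in the metadata decide how raw text columns are named and typed. The fragment below is a simplified, EML-like stand-in (real EML documents are richer), and the function names are hypothetical; this is not Kepler's actual binding code.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Simplified, EML-like metadata fragment (an assumption for illustration).
METADATA = """
<dataTable>
  <attributeList>
    <attribute><attributeName>site</attributeName><storageType>string</storageType></attribute>
    <attribute><attributeName>count</attributeName><storageType>integer</storageType></attribute>
    <attribute><attributeName>area_m2</attributeName><storageType>float</storageType></attribute>
  </attributeList>
</dataTable>
"""
DATA = "A,12,4.0\nB,3,1.5\n"

CASTS = {"string": str, "integer": int, "float": float}


def read_schema(metadata_xml):
    """Return [(column name, cast function), ...] in document order."""
    root = ET.fromstring(metadata_xml)
    return [
        (attr.findtext("attributeName"), CASTS[attr.findtext("storageType")])
        for attr in root.iter("attribute")
    ]


def bind(metadata_xml, data_text):
    """Turn untyped CSV rows into typed records using the metadata."""
    schema = read_schema(metadata_xml)
    rows = []
    for raw in csv.reader(io.StringIO(data_text)):
        rows.append({name: cast(value) for (name, cast), value in zip(schema, raw)})
    return rows


rows = bind(METADATA, DATA)
```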
8
GARP Invasive Species Model
Scientific workflows represent knowledge about the process
AMS captures this knowledge
Slide from D. Pennington
9
Kepler Team, Projects, Sponsors
  • Ilkay Altintas SDM
  • Chad Berkley SEEK
  • Shawn Bowers SEEK
  • Jeffrey Grethe BIRN
  • Christopher H. Brooks Ptolemy II
  • Zhengang Cheng SDM
  • Efrat Jaeger GEON
  • Matt Jones SEEK
  • Edward A. Lee Ptolemy II
  • Kai Lin GEON
  • Bertram Ludäscher BIRN, GEON, SDM, SEEK
  • Steve Mock NMI
  • Steve Neuendorffer Ptolemy II
  • Jing Tao SEEK
  • Mladen Vouk SDM
  • Yang Zhao Ptolemy II

Ptolemy II
10
Kepler Understands EML Data (Chad Berkley, SEEK)
11
Kepler Ecological Modeling (Chad Berkley, SEEK)
12
Database Access (Efrat Jaeger, GEON)
Note: EML descriptions of relational sources
would allow automated data ingestion
13
Mineral Classification with Kepler (Efrat
Jaeger, GEON)
14
inside the Classifier
15
Standard BrowserUI Client-Side SVG
16
SWF Reengineering (Ilkay, SDM; Ashraf, Efrat,
Kai, GEON)
17
DataMapper Sub-Workflow
18
Result launched via BrowserUI actor (coupling
with ESRI's ArcIMS)
19
Distributed Workflows in KEPLER
  • Web and Grid Service plug-ins
  • WSDL (now) and Grid services (stay tuned)
  • ProxyInit, GlobusGridJob, GridFTP,
    DataAccessWizard
  • SSH, SCP, SDSC SRB, OGS?-??? coming
  • WS Harvester
  • Import query-defined WS operations as Kepler
    actors
  • XSLT and XQuery Data Transformers
  • to link not designed-to-fit web services
  • WS-deployment interface (planned)

20
Web Service Actor (Ilkay Altintas, SDM)
  • Given a WSDL and the name of an operation of a
    web service, dynamically customizes itself to
    implement and execute that method.
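
A minimal sketch of the self-customization idea, assuming a WSDL 1.1 document: the actor reads the named operation and derives its input and output ports from the message parts. The WSDL text, the `WebServiceActor` class, and its attributes are hypothetical illustrations, not Kepler's implementation, and the sketch stops short of actually invoking the service.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Tiny hand-written WSDL 1.1 fragment used only for this example.
WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="Demo">
  <message name="LookupRequest"><part name="taxon" type="xsd:string"/></message>
  <message name="LookupResponse"><part name="count" type="xsd:int"/></message>
  <portType name="DemoPort">
    <operation name="lookup">
      <input message="tns:LookupRequest"/>
      <output message="tns:LookupResponse"/>
    </operation>
  </portType>
</definitions>
"""


def local(qname):
    """Drop the namespace prefix from a qualified name such as tns:Foo."""
    return qname.split(":")[-1]


def ports_for(wsdl_text, operation):
    """Return (input part names, output part names) for one operation."""
    root = ET.fromstring(wsdl_text)
    messages = {
        m.get("name"): [p.get("name") for p in m.findall("wsdl:part", WSDL_NS)]
        for m in root.findall("wsdl:message", WSDL_NS)
    }
    for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS):
        if op.get("name") == operation:
            inp = op.find("wsdl:input", WSDL_NS)
            out = op.find("wsdl:output", WSDL_NS)
            return messages[local(inp.get("message"))], messages[local(out.get("message"))]
    raise KeyError(operation)


class WebServiceActor:
    """Generic actor that specializes itself from a WSDL and an operation name."""

    def __init__(self, wsdl_text, operation):
        self.operation = operation
        self.input_ports, self.output_ports = ports_for(wsdl_text, operation)


actor = WebServiceActor(WSDL, "lookup")
```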

21
Set Parameters and Commit
22
Specialized WS Actor (after instantiation)
23
Web Service Harvester (Ilkay Altintas, SDM)
  • Imports the web services in a repository into
    the actor library.
  • Has the capability to search for web services
    based on a keyword.

24
Kepler Grid Services Access (Steve Mock, NMI)
25
An (oversimplified) Model of the Grid
  • Hosts: h1, h2, h3, ...
  • Data@Hosts: d1@hi, d2@hj, ...
  • Functions@Hosts: f1@hi, f2@hj, ...
  • Given data/workflow
  • as a functional plan: Y = f(X); Z = g(Y)
  • as a logic plan: f(X,Y) ∧ g(Y,Z)
  • Find Host Assignment: di → hi, fj → hj
    for all di, fj
  • s.t. d3@h3 = f@h2(d1@h1), ... is a valid
    plan
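
The host-assignment problem can be sketched as a brute-force search over the hosts that offer each function. The host names, the `offers` table, and the cost measure (number of values shipped between hosts) are assumptions made for illustration; the slide does not define a cost model.

```python
from itertools import product

data_at = {"X": "h1"}                              # where the input data lives
offers = {"f": {"h1", "h2"}, "g": {"h2", "h3"}}    # hosts able to run each function
plan = [("Y", "f", "X"), ("Z", "g", "Y")]          # Y = f(X); Z = g(Y)


def assignments(plan, offers):
    """Enumerate every way of placing each function on a host that offers it."""
    funcs = [func for _, func, _ in plan]
    for hosts in product(*(sorted(offers[func]) for func in funcs)):
        yield dict(zip(funcs, hosts))


def transfers(plan, data_at, assignment):
    """Count how many values must be shipped to a different host."""
    location = dict(data_at)
    moves = 0
    for out, func, arg in plan:
        host = assignment[func]
        if location[arg] != host:
            moves += 1
        location[out] = host  # the result is produced where the function ran
    return moves


best = min(assignments(plan, offers), key=lambda a: transfers(plan, data_at, a))
```

Exhaustive enumeration is only workable for toy inputs; it is used here because it makes the search space explicit.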

26
Shipping & Handling Algebra (SHA)
Logical view
  • plan: Y@C = F@A of X@B

Physical view: SHA plans
  • (1) X@B to A, Y@A = F@A(X@A), Y@A to C
  • (2) F@A => B, Y@B = F@B(X@B), Y@B to C
  • (3) X@B to C, F@A => C, Y@C = F@C(X@C)
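
To see why the choice among the three physical plans matters, here is a sketch under an assumed cost model in which each shipment costs the size of what is shipped: plan 1 ships X and Y, plan 2 ships F and Y, plan 3 ships X and F. The cost model and the sizes are illustrative assumptions, not part of the original slides.

```python
def plan_costs(size_x, size_f, size_y):
    """Shipping cost of each physical plan under a size-proportional model."""
    return {
        1: size_x + size_y,  # ship X to A, compute at A, ship Y to C
        2: size_f + size_y,  # ship F to B, compute at B, ship Y to C
        3: size_x + size_f,  # ship X and F to C, compute at C
    }


# Large input, small function code, small result.
costs = plan_costs(size_x=5000, size_f=2, size_y=10)
cheapest = min(costs, key=costs.get)
```

With a large input and a small result, moving the function to the data (plan 2) is cheapest under this model.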
27
Grid-Enabling PTII Handles
  • (1) A → GA: get_handle
  • (2) GA → A: return X
  • (3) A → B: send X
  • (4) B → GB: request X
  • (5) GB → GA: request X
  • (6) GA → GB: send X
  • (7) GB → B: send done(X)
  • Example
  • X = GA.17
  • X = <some_huge_file>
  • Candidate Formalisms
  • GridFTP
  • SSH, SCP
  • SDSC SRB
  • OGS?-??? WSRF?

Logical token transfer (3) requires get_handle (1, 2) and then
exec_handle (4, 5, 6, 7) for completion.
(Figure: actors A and B in Kepler space, GA and GB in Grid space, with arrows numbered 1 to 7.)
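
The handle protocol can be sketched as follows: the workflow passes only a small handle between actors, while the bulky data moves between grid-side stores. `HandleServer` and its methods are hypothetical names; the counter starts at 17 only to reproduce the GA.17 example.

```python
import itertools


class HandleServer:
    """Hypothetical grid-side store (GA or GB) that keeps the bulky data."""

    def __init__(self, name):
        self.name = name
        self._store = {}
        self._ids = itertools.count(17)

    def get_handle(self, payload):
        """Steps 1-2: store the payload and return a small handle for it."""
        handle = f"{self.name}.{next(self._ids)}"
        self._store[handle] = payload
        return handle

    def serve(self, handle):
        """Return the payload behind a handle held by this server."""
        return self._store[handle]

    def fetch(self, handle, peers):
        """Steps 4-7: pull the payload from the owning server, then acknowledge."""
        owner = peers[handle.split(".")[0]]
        self._store[handle] = owner.serve(handle)  # bulk transfer, grid to grid
        return ("done", handle)


ga, gb = HandleServer("GA"), HandleServer("GB")
peers = {"GA": ga, "GB": gb}

x = ga.get_handle(b"<some_huge_file>")  # actor A obtains the handle
token = x                               # step 3: A sends only the handle to B
ack = gb.fetch(token, peers)            # actor B asks GB to complete the transfer
```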
28
Homogeneous Data Integration
  • Integration of homogeneous or mostly homogeneous
    data via EML metadata is relatively
    straightforward

29
Heterogeneous Data integration
  • Requires advanced metadata and processing
  • Attributes must be semantically typed
  • Collection protocols must be known
  • Units and measurement scale must be known
  • Measurement relationships must be known
  • e.g., that ArealDensity = Count/Area
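
A small worked example of why units and measurement relationships must be known: two data sets report counts over areas in different units, and only the derived density (count divided by area, in a common unit) is comparable. The unit table and sample numbers are illustrative assumptions.

```python
AREA_TO_M2 = {"m2": 1.0, "ha": 10_000.0, "km2": 1_000_000.0}


def areal_density(count, area, area_unit):
    """Return individuals per square metre: density = count / area."""
    return count / (area * AREA_TO_M2[area_unit])


plot_a = areal_density(count=50, area=100.0, area_unit="m2")
plot_b = areal_density(count=20_000, area=4.0, area_unit="ha")
```

The raw counts differ by a factor of 400, yet both plots have the same density once the areas are expressed in the same unit.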

30
Semantic Mediation
  • Label data with semantic types
  • Label inputs and outputs of analytical components
    with semantic types
  • Use reasoning engines to generate transformation
    steps
  • Beware analytical constraints
  • Use reasoning engine to discover relevant
    components
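
A minimal sketch of discovery by reasoning, using plain subclass lookup in place of a real reasoning engine. The toy ontology, the component table, and the function names are hypothetical; an OWL reasoner would handle far richer class expressions.

```python
SUBCLASS_OF = {  # hypothetical toy ontology: child -> parent
    "AbundanceCount": "Observation",
    "MortalityRate": "DerivedObservation",
    "DerivedObservation": "Observation",
}

COMPONENTS = {  # component name -> (input semantic type, output semantic type)
    "normalize": ("AbundanceCount", "MortalityRate"),
    "plot": ("Observation", "Figure"),
    "regress": ("MortalityRate", "Model"),
}


def is_a(concept, ancestor):
    """True if concept equals ancestor or is one of its descendants."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


def accepts(data_type):
    """Discover components whose input type subsumes the data's type."""
    return sorted(n for n, (inp, _) in COMPONENTS.items() if is_a(data_type, inp))


def can_connect(producer, consumer):
    """Check whether the producer's output may feed the consumer's input."""
    return is_a(COMPONENTS[producer][1], COMPONENTS[consumer][0])
```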

Data
Ontology
Workflow Components
31
Ecological ontologies
  • What was measured (e.g., biomass)
  • Type of measurement (e.g., Energy)
  • Context of measurement (e.g., Psychotria
    limonensis)
  • How it was measured (e.g., dry weight)
  • SEEK intends to enable community-created
    ecological ontologies using OWL
  • Represents a controlled vocabulary for ecological
    metadata

32
Extensions: Semantic Types
  • Take concepts and relationships from an ontology
    to semantically type the data-in/out ports
  • Application: e.g., design support
  • smart/semi-automatic wiring, generation of
    massaging actors

(Figure: actor m1 (normalize) with ports p3 and p4.)
Takes: Abundance Count Measurements for Life Stages
Returns: Mortality Rate Derived Measurements for Life Stages
35
Semantic Types
  • The semantic type signature
  • Type expressions over the (OWL) ontology

(Figure: actor m1 (normalize) with ports p3 and p4.)
SemType m1:
  Observation itemMeasured.AbundanceCount hasContext.appliesTo.LifeStageProperty
  → DerivedObservation itemMeasured.MortalityRate hasContext.appliesTo.LifeStageProperty
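
One possible encoding of the m1 signature as data, assuming each side is a base concept plus a set of (property path, concept) constraints. The `SemType` class and the `satisfies` check are hypothetical, and the check is purely syntactic: it does no ontology reasoning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemType:
    concept: str
    constraints: frozenset  # set of (property path, concept) pairs


M1_INPUT = SemType("Observation", frozenset({
    ("itemMeasured", "AbundanceCount"),
    ("hasContext.appliesTo", "LifeStageProperty"),
}))
M1_OUTPUT = SemType("DerivedObservation", frozenset({
    ("itemMeasured", "MortalityRate"),
    ("hasContext.appliesTo", "LifeStageProperty"),
}))


def satisfies(annotation, required):
    """Same base concept, and every required constraint is present."""
    return (annotation.concept == required.concept
            and required.constraints <= annotation.constraints)


# A data set annotated with one extra constraint still fits m1's input.
dataset = SemType("Observation", M1_INPUT.constraints | {("hasUnit", "Count")})
```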
36
Extended Type System (here: OWL Semantic Types)
SemType m1:
  Observation itemMeasured.AbundanceCount hasContext.appliesTo.LifeStageProperty
  → DerivedObservation itemMeasured.MortalityRate hasContext.appliesTo.LifeStageProperty
Substructure association: XML raw-data =(X)Query=> object model =link=> OWL ontology
37
Semantic Types for Scientific Workflows
38
Deriving Data Transformations from Semantic
Service Registration
Bowers-Ludaescher, DILS'04
39
Structural and Semantic Mappings
Bowers-Ludaescher, DILS'04
40
SEEK Impact
  • Fundamental improvements for researchers
  • Global access to ecologically relevant data
  • Rapidly locate and utilize distributed
    computation
  • Capture, reproduce, extend analysis process

41
Acknowledgements
This material is based upon work supported by:
The National Science Foundation under Grant Numbers 9980154, 9904777, 0131178, 9905838, 0129792, and 0225676.
PBI Collaborators: NCEAS, University of New Mexico (Long Term Ecological Research Network Office), San Diego Supercomputer Center, University of Kansas (Center for Biodiversity Research)
Kepler contributors: SEEK, Ptolemy II, SDM/SciDAC, GEON