Title: WDO-It! Tutorial
1WDO-It! Tutorial
- Leonardo Salayandia, Paulo Pinheiro da Silva, Ann Q. Gates
- Cyber-ShARE Center of Excellence
- University of Texas at El Paso
- http://trust.utep.edu/wdo
2Cyber-ShARE Center of Excellence
- Established at UTEP in 2007 with NSF funding
- Focused on
  - Cross-disciplinary research in science, engineering, and technology at UTEP
  - Training workshops on using cyberinfrastructure
  - Education and outreach
  - Resource acquisition and sharing (documentation of processes)
Sharing resources via cyberinfrastructure to advance research and education.
3WDO-It! Team
- Leonardo Salayandia
- Paulo Pinheiro da Silva
- Ann Q. Gates
- Acknowledgements to Aida Gandara and Nick Del Rio
4Overview
- Background
- Building vocabularies for a process
- Modeling a process
- Conclusions
5Background
6Solution: CI-Miner
- CI-Miner is a framework for documenting (annotating) scientific processes. CI-Miner includes
  - A collection of notations
  - A collection of tools
  - A methodology for using the framework notations and tools to annotate scientific processes
Paulo Pinheiro da Silva, Leonardo Salayandia, Aida Gandara, Ann Q. Gates. CI-Miner: Semantically Enhancing Scientific Processes. To appear in Earth Science Informatics, Springer. http://www.cs.utep.edu/paulo/papers/PinheirodaSilva_ESI_2009.pdf
7CI-Miner Use Case
[Figure: CI-Miner use case diagram. Labels: Scientific Process CM, Ontology, Abstract Workflow, Provenance, Scientific Result; Scientist 1, Scientist 2, Scientist 3; actions "Create based on CM with CI-Miner", "collaborate", "Analyze"; question "How was this model created?"]
- Scientists 1 and 2 may have different understandings of process CM
- Scientist 3 may not understand how the scientific result of process CM was derived
8WDO-It!
- A tool from the CI-Miner toolset to document scientific processes
- Built on top of key technologies for Cyberinfrastructure (CI)
  - Ontologies
  - Workflows
9Two Key Technologies for CI (1/2)
- Ontologies: Encoded discipline knowledge
  - Used to build common vocabularies for terms in a field or project
  - Facilitate information integration/exchange
  - Support (semantic) search
- Challenges with ontologies
  - Lack of sufficient guidelines for scientists' direct involvement, e.g., Where/How do I start?
  - Great ontology editors (for computer scientists)
10Two Key Technologies for CI (2/2)
- Workflow: Recipe to do a task
  - Example: a workflow to create a map of gravity data
  - Facilitates definition of formal venues for connecting CI resources, e.g., executable workflows (Kepler)
- Challenges with workflows
  - Difficult for scientists to develop
  - Great workflow tools (for computer scientists)
11WDO-It!
- An editor to help scientists to
  - Build vocabularies about their processes through ontologies (Workflow-Driven Ontologies)
  - Create models of their processes as workflows (Semantic Abstract Workflows)
12Workflow-Driven Ontologies (WDOs)
- Building vocabulary from two basic concepts
  - Data
    - Field observations, graphs, maps, and others
    - E.g., Gravity Dataset, Contour Map
  - Methods
    - Algorithms, functions, and techniques used to transform data
    - E.g., Contouring, Gridding, Calculate the mean
- Assumption: scientists can easily identify and name their datasets and tools
13Semantic Abstract Workflows (SAWs)
- Built from the vocabulary encoded in a WDO
- Modeling process based on Data Flow
- Avoid technical complexities
- May omit parameters
- May omit steps that are not in direct support of
scientific activities, e.g., reformatting a file
14Building vocabulary for a process
15An example of a scientific process
- Description
  - Geo-referenced datasets are usually built from sparse field measurements, where the location of each point is given by Longitude/Latitude coordinates
  - To create a map model of a geo-referenced dataset, e.g., a contour map, two steps are usually needed
    - Pre-process the dataset to create a grid of uniformly spaced data points
    - Create the map model from the uniformly distributed dataset
16An example of a scientific process
1. Sparse geo-referenced dataset
Longitude      Latitude     OBS
-074.4244296   40.0049488   4176.40
-074.9746118   40.0051130   4189.60
-074.4245976   40.0051168   4176.40
-074.7647730   40.0059447   4199.07
-074.7647730   40.0059447   4199.10
-074.3714268   40.0099501   4173.71
-074.3714268   40.0101141   4173.69
-074.2129201   40.0109512   4159.90
-074.3237562   40.0139483   4172.05
-074.3237562   40.0139483   4172.10
2. Uniformly distributed dataset (Grid)
ncols 5
nrows 5
cellsize 2
4604 4599 4598 4602 4606
4619 4618 4611 4586 4566
4596 4599 4598 4593 4585
4551 4562 4575 4572 4535
4532 4512 4482 4449 4459
3. Map model
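The pre-processing step (sparse points to a uniform grid) can be illustrated with a minimal sketch. It uses simple cell averaging, which is only one possible gridding technique and is not necessarily the one used to produce the grid on this slide. The function name, grid origin, and cell size are assumptions chosen for illustration; the points are the first rows of the sparse dataset.

```python
from collections import defaultdict

# (longitude, latitude, observation) rows from the sparse dataset on this slide.
points = [
    (-74.4244296, 40.0049488, 4176.40),
    (-74.9746118, 40.0051130, 4189.60),
    (-74.4245976, 40.0051168, 4176.40),
    (-74.7647730, 40.0059447, 4199.07),
    (-74.7647730, 40.0059447, 4199.10),
]


def grid_by_cell_average(points, lon0, lat0, cellsize, ncols, nrows):
    """Average the observations that fall into each cell of a regular grid.

    (lon0, lat0) is the lower-left corner of the grid (hypothetical values
    are used below). Cells that receive no observation are left as None.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lon, lat, obs in points:
        col = int((lon - lon0) // cellsize)
        row = int((lat - lat0) // cellsize)
        if 0 <= col < ncols and 0 <= row < nrows:
            sums[(row, col)] += obs
            counts[(row, col)] += 1
    return [
        [sums[(r, c)] / counts[(r, c)] if counts[(r, c)] else None
         for c in range(ncols)]
        for r in range(nrows)
    ]


# Assumed grid: one row of five 0.2-degree cells starting at (-75.0, 40.0).
grid = grid_by_cell_average(points, lon0=-75.0, lat0=40.0,
                            cellsize=0.2, ncols=5, nrows=1)
```

Real gridding tools typically interpolate so that empty cells also receive values; the sketch leaves them empty to keep the idea visible.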
17Create WDO for the process
- Launch WDO-It! (instructions on web site)
- First we start building the vocabulary as a Workflow-Driven Ontology (WDO)
  - OWL document
  - Captures the vocabulary of the process in two main categories
    - Data
    - Method
18Create WDO for the process
1. Click
2. Enter a namespace (URI-like format)
Recommendation: Use a namespace that matches the URL where you will publish the WDO
Example: http://trust.utep.edu/2009/ContourMapWDO
You can change namespaces later with a text
editor that supports the Replace All operation.
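For readers who prefer a script to a text editor, the same Replace All operation can be sketched in a few lines. This is only an illustration: the new namespace and the sample OWL fragment (including the `#Gridding` local name) are hypothetical, not taken from a real WDO file.

```python
def replace_namespace(owl_text: str, old_ns: str, new_ns: str) -> str:
    """Return the OWL text with every occurrence of old_ns replaced by new_ns."""
    return owl_text.replace(old_ns, new_ns)


old_ns = "http://trust.utep.edu/2009/ContourMapWDO"
new_ns = "http://example.org/2010/ContourMapWDO"  # hypothetical new namespace

# Hypothetical one-line fragment standing in for the contents of a WDO file.
sample = f'<owl:Class rdf:about="{old_ns}#Gridding"/>'
updated = replace_namespace(sample, old_ns, new_ns)
```

In practice the text would be read from and written back to the saved OWL file, after making a backup copy.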
19Create WDO for the process
Loaded OWL Documents Tree
WDO namespace on the root
Imported Ontologies subtree reflects the <owl:imports> statement of the OWL document
Note 1: All WDOs import the wdo.owl ontology
Note 2: The WDO in the root of the tree is the one being edited; imported ontologies are not modified.
Note 3: The wdo.owl ontology imports pml-provenance.owl. More on that later.
20Create WDO for the process
Adding Data and Method concepts to the WDO
1. Click
2. Choose type
3. Add label (<rdfs:label>) and optionally a comment (<rdfs:comment>)
Note: URIs are automatically generated by WDO-It! with reference to the namespace assigned to the WDO document. This facilitates renaming!
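To make the result of these steps more concrete, here is a rough sketch of what a concept with a label, an optional comment, and a namespace-derived URI could look like in RDF/XML. It does not reproduce WDO-It!'s actual output: the URI scheme (`namespace#LocalName`), the modeling of concepts as subclasses of Data/Method, the `WDO` placeholder namespace, and the sample comment text are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
WDO = "http://example.org/wdo.owl#"  # placeholder, NOT the real wdo.owl namespace

for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def concept_uri(namespace: str, label: str) -> str:
    """Derive a concept URI from the document namespace and a label (assumed scheme)."""
    words = [w for w in re.split(r"[^A-Za-z0-9]+", label) if w]
    local_name = "".join(w[:1].upper() + w[1:] for w in words)
    return f"{namespace}#{local_name}"


def add_concept(root, namespace, label, parent_uri, comment=None):
    """Append an owl:Class with rdfs:label and optional rdfs:comment under parent_uri."""
    uri = concept_uri(namespace, label)
    cls = ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": uri})
    ET.SubElement(cls, f"{{{RDFS}}}subClassOf", {f"{{{RDF}}}resource": parent_uri})
    ET.SubElement(cls, f"{{{RDFS}}}label").text = label
    if comment:
        ET.SubElement(cls, f"{{{RDFS}}}comment").text = comment
    return uri


ns = "http://trust.utep.edu/2009/ContourMapWDO"
root = ET.Element(f"{{{RDF}}}RDF")
dataset_uri = add_concept(root, ns, "Gravity Dataset", WDO + "Data")
gridding_uri = add_concept(root, ns, "Gridding", WDO + "Method",
                           comment="Illustrative comment text")
owl_xml = ET.tostring(root, encoding="unicode")
```

Because each URI is derived from the single namespace string, changing that string is enough to move every concept to a new namespace, which is the renaming benefit noted on this slide.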
21Create WDO for the process
Start with the more general concepts of your
process and start building your WDO hierarchies.
To add a child concept: Select a concept from the Data or Method tree, then click the Add Concept Icon
You can remove concepts that do not have children
by selecting them and clicking the Remove Concept
Icon
You can rename and add comments to concepts by
selecting them and clicking the Edit Concept Icon
Note: The Concept Hierarchy shows ALL concepts defined in the selected OWL document
22Create WDO for the process
- Exercise: Replicate the following Data and Method hierarchies
23Create WDO for the process
- We may want to reuse vocabulary terms to
- Create community consensus
- Adopt existing standards
- Support data/system integration
- Reduce duplication of efforts
- We can reuse vocabularies by harvesting
vocabulary terms from other existing ontologies
24Create WDO for the process
Suppose that we want to reuse terms from the Virtual Solar Terrestrial Observatory (VSTO) Ontology (http://dataportal.ucar.edu/schemas/vsto.owl).
- Click the File menu, select Open OWL URI
- Enter the URI for the VSTO ontology (above) and
click OK
The Loaded OWL Documents panel will show the VSTO
ontology in the Imported Ontologies subtree
The Concept Hierarchy tree will show the concepts
of the VSTO ontology
- Identify the DataProduct concept in the Concept Hierarchy
- Click and hold DataProduct
- Drag and drop in an empty space in the Data hierarchy panel
We have harvested the DataProduct VSTO concept
(and its children) as Data concepts in our WDO
ontology!
25Modeling a process
26Create SAW for the process
- Now that we have an initial WDO, we can start building a SAW to model the process
- SAWs
  - are OWL documents
  - do not include class (or concept) definitions
  - define instances of classes
  - import a Source WDO
    - e.g., <owl:imports rdf:resource="ContourMapWDO"/>
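The properties of a SAW listed above (an OWL document that imports its source WDO and declares only instances) can be sketched as follows. This is an assumed, simplified shape and not the exact document WDO-It! writes: the `build_saw` helper, the instance names `gridding1` and `contouring1`, and the `namespace#LocalName` URI scheme are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

for prefix, uri in (("rdf", RDF), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def build_saw(saw_ns, wdo_ns, instances):
    """Build an RDF/XML tree for a SAW that imports wdo_ns and declares instances.

    instances maps an instance local name to the local name of its WDO class.
    No class definitions are emitted; classes are only referenced via rdf:type.
    """
    root = ET.Element(f"{{{RDF}}}RDF")
    onto = ET.SubElement(root, f"{{{OWL}}}Ontology", {f"{{{RDF}}}about": saw_ns})
    ET.SubElement(onto, f"{{{OWL}}}imports", {f"{{{RDF}}}resource": wdo_ns})
    for name, wdo_class in instances.items():
        node = ET.SubElement(root, f"{{{RDF}}}Description",
                             {f"{{{RDF}}}about": f"{saw_ns}#{name}"})
        ET.SubElement(node, f"{{{RDF}}}type",
                      {f"{{{RDF}}}resource": f"{wdo_ns}#{wdo_class}"})
    return root


saw = build_saw(
    "http://trust.utep.edu/2009/CreateGravityContourMapSAW",
    "http://trust.utep.edu/2009/ContourMapWDO",
    {"gridding1": "Gridding", "contouring1": "Contouring"},  # hypothetical instances
)
saw_xml = ET.tostring(saw, encoding="unicode")
```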
27Creating a Semantic-Abstract Workflow (SAW)
- We will create an abstract workflow about a scientific system to
  - Model our understanding of a process
  - Identify the parts of interest for our needs
28Create SAW for the process
1. Click
2. Enter a namespace (URI-like format)
Recommendation: Use a namespace that matches the URL where you will publish the SAW
Example: http://trust.utep.edu/2009/CreateGravityContourMapSAW
29Create SAW for the process
Loaded OWL Documents Tree
WDO namespace on the root (Source WDO)
Workflows subtree reflects the <owl:imports> statement of the OWL document w.r.t. the source WDO
SAWs do not contain concept definitions; hence, the Concept Hierarchy is empty when a SAW is selected
The Workflow area is enabled when a SAW is selected
30Create SAW for the process
Adding instances to the SAW
1. Click and hold on a WDO concept
2. Drag and drop on the Workflow area
Data concepts are rendered as directed edges with
beginning and ending Sources
Note: Sources are instances of the pmlp:Source class
Methods are rendered as rectangles
31Create SAW for the process
Removing instances from the SAW
1. Select a Data or a Method instance
2. Press the Delete key
Note: Source instances cannot be individually deleted, since Data instances depend on them
Edit instances
1. Select a Data, Method, or Source instance
2. Click the Edit Instance Icon
32Create SAW for the process
Editing SAW instances
pmlp:Source instances
wdo:Data instances
wdo:Method instances
Method instances can be assigned a pmlp:InferenceEngine instance (URI), which will be used during the creation of data annotators to encode provenance with PML
Data instances can be assigned a pmlp:Format instance (URI), which will be used during the creation of data annotators to encode provenance with PML
Source instances can be specialized into other subtypes, as per the pmlp ontology, e.g., pmlp:Person
33Create SAW for the process
Assembling the process graph: No control-flow, just data-flow!
Connecting edges and nodes
- Click on a Source instance and hold
- Drag and drop on top of a Method instance to
connect
(Dropping Method instances into Source instances
works too)
Note: Data instances need to be attached to a Source instance or to a Method instance on each side of the edge
Disconnecting edges and nodes
- Select and hold an endpoint of an edge
- Drag and drop on another part of the workflow
area to disconnect
Note: Sources cannot be disconnected from Data edges
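The connection rules on this slide can be summarized in a small data-flow model: Data instances are directed edges, and each end of an edge must be attached to a Source or a Method. The sketch below is a simplified model for illustration only, not WDO-It!'s internal representation; the `Workflow` class and the names "Field Survey", "Map Archive", and "Gridded Dataset" are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """A data-flow-only process graph: no control-flow constructs are modeled."""
    sources: set = field(default_factory=set)
    methods: set = field(default_factory=set)
    # Each Data edge is (label, tail node, head node); data flows tail -> head.
    data_edges: list = field(default_factory=list)

    def add_data(self, label, tail, head):
        """Add a Data edge; both ends must be a known Source or Method."""
        nodes = self.sources | self.methods
        for end in (tail, head):
            if end not in nodes:
                raise ValueError(f"{end!r} is not a Source or Method in this workflow")
        self.data_edges.append((label, tail, head))

    def consumers(self, label):
        """Return the nodes that receive the Data instance with this label."""
        return [head for (name, _, head) in self.data_edges if name == label]


wf = Workflow(sources={"Field Survey", "Map Archive"},
              methods={"Gridding", "Contouring"})
wf.add_data("Gravity Dataset", "Field Survey", "Gridding")
wf.add_data("Gridded Dataset", "Gridding", "Contouring")
wf.add_data("Contour Map", "Contouring", "Map Archive")
```

Attaching an edge to a node that is neither a Source nor a Method raises an error, mirroring the rule that a Data instance cannot be left dangling.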
34Create SAW for the process
- Sources can be merged, as long as they are
attached to the same direction of their
corresponding edges
35Create SAW for the process
Exercise: Build the following SAW
Let's describe Contouring in more detail
36Create SAW for the process
Creating a subworkflow
- Right Click on the Contouring method instance
- Select Edit DetailedBy property
- Select New Workflow
- Enter a namespace for the new workflow and start creating the new abstract workflow
- E.g., http://trust.utep.edu/2009/ContouringSAW
- Notice the updated Workflow subtree in the Loaded
OWL Documents section
37Create SAW for the process
Exercise: Build the following subworkflow for Contouring
38Conclusions
39Conclusions
- WDO-It! is part of the CI-Miner toolset
- WDO-It! can be used by scientists to build vocabularies of processes
- WDO-It! can be used by scientists to create abstractions of their processes that reflect their specific understanding
- WDO-It! uses OWL to encode the vocabularies and abstractions (Semantic Web technology)
40Conclusions
- Some benefits of creating vocabularies and process abstractions are that they
  - Explicitly define a scientist's understanding of a process
  - Serve as requirements elicitation artifacts for building systems that carry out the modeled processes
  - Can identify relevant components in the data lineage of scientific products created by a process
41Thank you!