1
myGrid/Taverna Provenance
  • Daniele Turi
  • University of Manchester
  • OMII f2f Meeting, London, 19-20/4/06

3
Components
  • Identifiers
      • LSIDs
  • Data
      • JDBC data store
  • Metadata
      • RDF Provenance Plugin
  • Browsing
      • Provenance Browser Plugin
  • Security
      • Under development

4
LSID
5
LSID: Life Science Identifier
  • URN specification in progress
  • 5-part identifier (with optional version id)
  • urn:lsid:www.mygrid.org.uk:lsdocument:X1234
  • urn:lsid:ncbi.nlm.nih.gov.lsid.biopathways.org:genbank_gi:7717376
  • protocol for retrieving data and metadata about
    an object
  • commitment by the provider to always return the
    same data for an ID
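
The identifier structure above can be sketched in a few lines of Python. This is only an illustration, assuming the colon-separated form urn:lsid:authority:namespace:object with an optional trailing revision. The names LSID and parse_lsid are hypothetical and not part of any myGrid or LSID library.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """The variable parts of an LSID (the fixed 'urn' and 'lsid' parts are dropped)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated parts: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID URN: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"authority, namespace and object must be non-empty: {text!r}")
    # parts[2:] holds three items, or four when a revision is present.
    return LSID(*parts[2:])
```

Calling parse_lsid on a five-part identifier leaves revision as None. A sixth part, if present, is returned as the revision.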

6
LSID (ctd)
  • Issue
      • LSID Authorities
  • Resolution
      • LSID Resolvers
  • Examples
      • myGrid
      • Long Term Ecological Research Network
      • BioPathways Consortium

7
LSID (ctd 2)
  • abstraction
  • lightweight
  • independent from actual storage implementation
      • database
      • file system
      • application
  • both for private and public data sources

8
Data
9
Data Storage (current)
  • Taverna can persist inputs, outputs and
    intermediate results in an SQL database via JDBC
  • Optional and can be done by configuring a Baclava
    Data Store
  • Allows the LSIDs of data items to be resolved
    against the actual data
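
A minimal sketch of resolving an LSID against stored data. It uses Python's built-in sqlite3 module as a stand-in for the JDBC-backed SQL database. The table name data_item, its columns and the helper functions are assumptions for illustration, not the Baclava Data Store schema.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store keyed by LSID (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data_item (lsid TEXT PRIMARY KEY, content BLOB NOT NULL)"
    )
    return conn


def store(conn: sqlite3.Connection, lsid: str, content: bytes) -> None:
    """Persist one data item; reusing an LSID raises sqlite3.IntegrityError."""
    conn.execute(
        "INSERT INTO data_item (lsid, content) VALUES (?, ?)", (lsid, content)
    )


def resolve(conn: sqlite3.Connection, lsid: str):
    """Return the bytes stored for an LSID, or None if the LSID is unknown."""
    row = conn.execute(
        "SELECT content FROM data_item WHERE lsid = ?", (lsid,)
    ).fetchone()
    return None if row is None else row[0]
```

The primary key on lsid is a design choice in this sketch: rejecting a second write for the same identifier is one simple way to honour the commitment that an ID always returns the same data.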

10
Data Storage (future)
  • Domain-specific databases
      • use outside myGrid
  • Develop
      • Taverna processor for JDBC/OGSA-DAI
      • associated interface (cf. BioMart)
  • Users will be able to study the contents of an
    existing database and
      • write queries that extract data from the
        database, where the query may be parameterised
        with values passed in from the workflow
      • write requests that insert data from the workflow
        into a named table in the database.
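
The two planned operations can be sketched as follows, again with sqlite3 standing in for JDBC/OGSA-DAI. The function names run_query and insert_row are hypothetical. The planned Taverna processor had not been built at the time of the talk, so this is not its interface.

```python
import re
import sqlite3

# SQL identifiers cannot be bound as parameters, so they are validated instead.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def run_query(conn: sqlite3.Connection, sql: str, params: dict) -> list:
    """Run a query whose named placeholders are filled from workflow values."""
    return conn.execute(sql, params).fetchall()


def insert_row(conn: sqlite3.Connection, table: str, row: dict) -> None:
    """Insert one row of workflow values into a named table."""
    for name in (table, *row):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe SQL identifier: {name!r}")
    columns = ", ".join(row)
    placeholders = ", ".join("?" for _ in row)
    conn.execute(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
        tuple(row.values()),
    )
```

Values always travel as bound parameters. Only the table and column names are interpolated into the SQL text, and only after the identifier check.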

11
Metadata
12
Metadata Generation
  • Taverna Provenance Plugin
  • Listen to Taverna Events
  • WorkflowEventListener
  • Faithfully record them as ontological instance
    data
  • RDF graphs (one for each Taverna run)
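
A rough sketch of the listener idea: each event is turned into triples and added to a graph kept per run. WorkflowEventListener is a Java interface in Taverna, so the Python class and method names below are hypothetical stand-ins. The direction of the runs, launchedBy and executed properties is also an assumption.

```python
from collections import defaultdict


class ProvenanceRecorder:
    """Collects one set of (subject, predicate, object) triples per workflow run."""

    def __init__(self):
        self.graphs = defaultdict(set)  # run id -> set of triples

    def on_workflow_started(self, run_id, workflow_id, user_id):
        # Record which workflow this run executes and who launched it.
        self.graphs[run_id].add((run_id, "runs", workflow_id))
        self.graphs[run_id].add((run_id, "launchedBy", user_id))

    def on_process_completed(self, run_id, process_run_id):
        # Record that the run executed this process run.
        self.graphs[run_id].add((run_id, "executed", process_run_id))
```

Because each run has its own set of triples, the events of one run never leak into the graph of another.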

13
Metadata
  • Representation
  • Ontology (Schema)
  • Storage
  • Query
  • Browsing

14
Representation
  • RDF
  • triples
  • subject, predicate, object
  • URIs (hence easy data integration)
  • semantic web language
  • XML serialization
  • flexible, powerful
  • sets of triples give rise to graphs
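
To illustrate why shared URIs make integration easy, the sketch below models triples as plain Python tuples and merges two graphs with a set union. The urn:example identifiers are placeholders, and objects is a hypothetical helper rather than a call from an RDF library.

```python
def objects(graph, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


# Two independently produced graphs that mention the same person URI.
run_graph = {("urn:example:run8", "launchedBy", "urn:example:person4")}
org_graph = {("urn:example:person4", "belongsTo", "urn:example:orgHY7")}

# Integration is a set union: the shared URI links the two graphs.
merged = run_graph | org_graph


def organizations_behind(graph, run):
    """Follow launchedBy, then belongsTo, starting from a run."""
    found = set()
    for person in objects(graph, run, "launchedBy"):
        found |= objects(graph, person, "belongsTo")
    return found
```

Neither input graph alone can answer which organization is behind the run. The merged graph can, with no schema mapping step.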

15
Workflow Run
(Figure: RDF graph of one workflow run. Nodes are LSIDs, abbreviated on the slide as workflow6, orgHY7, wfInstance8, person4, processRun84 and processRun51. Edge labels: runs, belongsTo, launchedBy, executed, executed.)
16
Schema
  • Ontology
  • RDF schema
  • Taxonomic inferences
  • also available as OWL
  • opens it up to complex reasoning
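
Taxonomic inference amounts to following subclass links upwards. The sketch below shows that idea on a plain dictionary. The class hierarchy used in the usage note is hypothetical and is not taken from the myGrid provenance ontology.

```python
def superclasses(cls, subclass_of):
    """Return all direct and indirect superclasses of cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def infer_types(asserted, subclass_of):
    """Extend each instance's asserted types with every inherited type."""
    inferred = {}
    for instance, types in asserted.items():
        closure = set(types)
        for asserted_type in types:
            closure |= superclasses(asserted_type, subclass_of)
        inferred[instance] = closure
    return inferred
```

For example, if a hypothetical FailedProcessRun class is declared a subclass of ProcessRun, any instance typed as FailedProcessRun is also returned by a query for ProcessRun instances.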

18
Typed Workflow Run
(Figure: the workflow run graph from the earlier Workflow Run slide, with each node typed by a class of the Provenance Ontology. Classes: Experimenter, Organization, ProcessRun, WorkflowRun, Workflow. Properties: launchedBy, executed, belongsTo, runs.)
19
Storage
  • Named RDF graphs
  • retrieve whole graphs (e.g. workflows)
  • implementation in
  • NG4J (Jena + MySQL)
  • scalability issues
  • Sesame2 native store
  • scalable
  • Java 5
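
The named-graph idea can be sketched without any RDF library: each triple is filed under a graph name, so a whole graph can be fetched in one step. NamedGraphStore is a hypothetical in-memory class for illustration only. It does not reflect the NG4J or Sesame APIs.

```python
from collections import defaultdict


class NamedGraphStore:
    """Keeps triples grouped by the name of the graph they belong to."""

    def __init__(self):
        self._graphs = defaultdict(set)

    def add(self, graph_name, subject, predicate, obj):
        self._graphs[graph_name].add((subject, predicate, obj))

    def graph(self, graph_name):
        """Return a copy of one whole named graph (empty if the name is unknown)."""
        return set(self._graphs.get(graph_name, ()))

    def graphs_containing(self, subject=None, predicate=None, obj=None):
        """Return the sorted names of graphs with a triple matching the pattern.

        None acts as a wildcard for that position.
        """
        pattern = (subject, predicate, obj)
        return sorted(
            name
            for name, triples in self._graphs.items()
            if any(
                all(want is None or want == got for want, got in zip(pattern, triple))
                for triple in triples
            )
        )
```

Using one graph per workflow run, as the provenance plugin does, makes "retrieve everything about this run" a single lookup by graph name.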

20
Query
  • RDF query languages
  • TriQL, SeRQL, SPARQL
  • query languages for named RDF graphs
  • Ontology inspection/reasoning
  • Canned Queries
  • workflows with failed processes
  • input/output of past process runs
  • workflows with data changed by user
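
A canned query is a fixed question that users pick by name instead of writing it themselves. The sketch below runs the first one listed above over a dictionary of named graphs. The status predicate and the failed value are assumed vocabulary, and a real deployment would phrase this in one of the RDF query languages named above.

```python
def runs_with_failed_processes(named_graphs):
    """Return the sorted names of run graphs holding at least one failed process."""
    failed = []
    for name, triples in named_graphs.items():
        if any(p == "status" and o == "failed" for _, p, o in triples):
            failed.append(name)
    return sorted(failed)


# Registry that lets a user interface offer canned queries by name.
CANNED_QUERIES = {
    "workflows with failed processes": runs_with_failed_processes,
}
```

Looking up CANNED_QUERIES["workflows with failed processes"] and calling the result on the stored graphs returns the matching run names.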

22
Browsing
23
Provenance Browsing
  • Provenance Browser Plugin
  • reusing Taverna GUI components
  • Matthew Gamble

25
Analysis
26
Provenance Analysis
  • Comparison
  • Aggregation
  • etc
  • see work by Jun Zhao

27
Security
29
  • User sends LSID reference and credentials to the
    Access Point
  • Access Point returns data and metadata or denies
    access as follows:
      • credentials are passed to a User Directory
      • User Directory passes the corresponding user to
        the Authorization Authority
      • Authorization Authority returns the user
        attributes in the form of a (possibly signed)
        SAML assertion
      • this assertion, together with the LSID and its
        corresponding metadata, is passed to the Policy
        Enforcement Point (PEP)
      • PEP uses these three inputs to form an XACML
        request that is passed to a Policy Decision Point
        (PDP) that is preloaded with an XACML Policy Set
      • PDP evaluates the request against its policy set
        and returns an XACML response to PEP
      • PEP decodes the response and either allows
        data/metadata to be returned to the user or
        denies access.
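
The flow above can be sketched with plain dictionaries and one function. Everything here is a simplification under stated assumptions: the dictionaries stand in for the User Directory, Authorization Authority and stores, a dict stands in for the SAML assertion, another dict stands in for the XACML request, and the PDP is any callable returning "Permit" or "Deny". The name access_point is hypothetical.

```python
def access_point(lsid, credentials, user_directory, authz_authority,
                 metadata_store, data_store, pdp):
    """Return (data, metadata) if the PDP permits access, otherwise None."""
    # User Directory: credentials -> user.
    user = user_directory.get(credentials)
    if user is None:
        return None

    # Authorization Authority: user -> attributes (stand-in for a SAML assertion).
    assertion = authz_authority.get(user, {})

    # PEP: combine assertion, LSID and metadata into one request
    # (stand-in for an XACML request).
    metadata = metadata_store.get(lsid, {})
    request = {
        "subject": assertion,
        "resource": {"lsid": lsid, **metadata},
        "action": "read",
    }

    # PDP: evaluate the request; the PEP then enforces the decision.
    if pdp(request) != "Permit":
        return None
    return data_store.get(lsid), metadata
```

Keeping the decision in a separate callable mirrors the PEP/PDP split: the enforcement code stays the same when the policy changes.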

30
myGrid XACML Policy
  • Scenario
      • supervisors can access all workflows in the
        organization
      • students can access only their own workflows
      • blacklisted users cannot access anything
  • See policySet.xml on myGrid wiki
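
The scenario's three rules can be written as a small decision function. This is not the content of policySet.xml, which is not reproduced here. The attribute names role, organization, id, owner and blacklisted are assumptions, and so is the choice to check the blacklist first so that a deny overrides any permit.

```python
def decide(subject, resource):
    """Return "Permit" or "Deny" for a subject asking to read a workflow."""
    # Blacklisted users cannot access anything, whatever their role.
    if subject.get("blacklisted"):
        return "Deny"
    # Supervisors can access all workflows in their organization.
    if (subject.get("role") == "supervisor"
            and subject.get("organization") is not None
            and subject.get("organization") == resource.get("organization")):
        return "Permit"
    # Students can access only their own workflows.
    if (subject.get("role") == "student"
            and subject.get("id") is not None
            and subject.get("id") == resource.get("owner")):
        return "Permit"
    # Anything not explicitly permitted is denied.
    return "Deny"
```

The final fall-through to "Deny" means an unknown role or a missing attribute never grants access by accident.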