1 "A Python Library for Provenance Recording and Querying" and "Requirements for a Provenance Visualization Panel"
- Presentation at IPAW08, Henning Bergmeyer
2 Overview
- Brief Overview: Provenance System
- A Python Library for Provenance Recording and Querying
  - Usage Examples: Initializing, Recording, Querying, Extending
  - Architecture
- Requirements for a Provenance Visualization Panel
  - User Groups and Intentions
  - Graphical Representation and Exploration
  - Requirements
3 Main Model Concepts of the Provenance System
Grid Provenance, PReServ 1.0 (University of Southampton)
- Interactions between actors
- Relationships (1 subject, 1..n objects, 1 relation type)
  - are dependencies between interactions (e.g. cause-and-effect)
  - describe internal, otherwise hidden functionality of actors
- Actor States
  - are assertions about internal states of actors
- Interaction Records
  - complete documentation of an interaction through assertions of all influencing incidents and dependencies
- Tracers
  - unique markers that serve to identify individual workflow executions
  - distributed along message paths
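The model concepts above can be sketched as plain Python data classes. This is only an illustration: the class names, field names and the relation type "causedBy" are assumptions made for this sketch and are neither the PReServ schema nor the API of the library presented later.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Interaction:
    """One message exchanged between two actors (hypothetical fields)."""
    source: str
    sink: str
    tracer: str  # marker identifying the workflow execution


@dataclass(frozen=True)
class ActorState:
    """Assertion about the internal state of an actor."""
    actor: str
    content: str


@dataclass(frozen=True)
class Relationship:
    """Dependency with 1 subject, 1..n objects and 1 relation type."""
    subject: Interaction
    relation_type: str
    objects: Tuple[Interaction, ...]

    def __post_init__(self):
        # Enforce the 1..n cardinality of objects.
        if not self.objects:
            raise ValueError("a relationship needs at least one object")


# Usage: the second interaction depends on the first one.
first = Interaction("http://source", "http://sink", tracer="run-1")
second = Interaction("http://sink", "http://store", tracer="run-1")
rel = Relationship(subject=second, relation_type="causedBy", objects=(first,))
```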
4 Main System Concepts of the P-System
- Distribution
- Several connected P-Stores
- differentiation of asserter views
5 "A Python Library for Provenance Recording and Querying" (Roland Gude, Carsten Bochner)
6 A Python Library for Provenance Recording and Querying
- Open Source: http://sourceforge.net/projects/provenance-csl/
- Purpose
  - easy Provenance recording and querying for Python applications or applications with an interface to Python
  - independent of Java on the client side
- Examples for
  - Initialization
  - Recording
  - Querying
  - Extending own types
7 Code Examples: Initialization
- from provenance.api import *
  - looks like bad coding style at first
  - but automatic lazy-loading of required modules prevents severe performance losses
- cl = client.Client("http://localhost:8080", asserter="me")
- That's it!
- A trace file can be specified to log communication with the P-Store
8 Code Example: Recording
- subj = utils.createSubjectId(1, dataAccessor, "parametername")
- objlist = utils.createObjectId(
-     utils.createInteractionKey("http://sink", "http://source"),
-     pAssID, 'anything', 'dataAccessor', 'parameter', 'isSender')
- keys, response = self.cl.record(
-     utils.createActorState(a_content_0, doc_style),
-     utils.createRelationship(subj, rel_type, objlist),
-     utils.createInteraction(m_content_0, doc_style),
-     utils.createInteraction(xml_content_0),
-     "isSender", sink, source)
- res = interfaces.IRecordAck(response)
9 Code Example: Querying
- queryString = "for n in pspstruct return n"
- response = self.cl.query(queryString)
- result = interfaces.IQueryAck(response)
- Afterwards, result contains an XML structure with all pstructs available in that store.
10 Architecture
- SOAP interface translated from WSDL by ZSI
- pyProtocols
  - Python lacks the OO concept of "interfaces"
  - pyProtocols allows protocol definitions and automatic adaptation
  - used to make the SOAP interface transparent to the user
- Lazy-loading
  - PEAK framework
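The lazy-loading idea can be illustrated with the Python standard library alone. The sketch below follows the `importlib.util.LazyLoader` recipe and is a stand-in for illustration, not the PEAK mechanism the library uses; the helper name `lazy_import` is an assumption of this sketch.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code only runs on first attribute access."""
    # Reuse a module that has already been imported.
    if name in sys.modules:
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("no module named %r" % name)
    # Wrap the real loader so that execution is deferred.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # registers the deferred load, does not run the module yet
    return module


# Usage: the json package is only really loaded when dumps is looked up.
json_mod = lazy_import("json")
text = json_mod.dumps({"a": 1})
```

With such a mechanism, a star import of an API package does not have to pay for modules the application never touches.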
11 Code Example: Extending Types
- class IAddress(IZSITypeCode):
-     """ interface for string typecodes """
-     def getAsString(self):
-         """ returns a String with the Value of the Stringlike. """
- IString = protocols.protocolForType(basestring,)
- class AddressAdapter(object):
-     protocols.advise(instancesProvide=[IAddress],
-                      asAdapterForProtocols=[IString])
-     def __init__(self, string):
-         self._delegate = serverAPI.Address(string.__str__())
-     def getAsString(self):
-         return self._delegate.__str__()
-     def toTypeCode(self):
-         return self._delegate
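The adaptation idea behind this example can be shown without pyProtocols. The following is a minimal sketch of a hand-written adapter registry; `register_adapter` and `adapt` are hypothetical helpers defined here for illustration, and the adapter wraps a plain string instead of the library's `serverAPI.Address`.

```python
class IAddress:
    """Hypothetical protocol: objects that can return themselves as a string."""

    def getAsString(self):
        raise NotImplementedError


# (source type, protocol) -> adapter factory
_ADAPTERS = {}


def register_adapter(source_type, protocol, factory):
    """Declare that factory turns source_type instances into protocol providers."""
    _ADAPTERS[(source_type, protocol)] = factory


def adapt(obj, protocol):
    """Return obj itself if it provides protocol, else a registered adapter for it."""
    if isinstance(obj, protocol):
        return obj
    # Walk the class hierarchy so that subclasses of a registered type also adapt.
    for klass in type(obj).__mro__:
        factory = _ADAPTERS.get((klass, protocol))
        if factory is not None:
            return factory(obj)
    raise TypeError("cannot adapt %r to %s" % (obj, protocol.__name__))


class AddressAdapter(IAddress):
    """Wraps any string-like value so it can be used where an IAddress is expected."""

    def __init__(self, string):
        self._delegate = str(string)

    def getAsString(self):
        return self._delegate


register_adapter(str, IAddress, AddressAdapter)

# Usage: a plain string is adapted transparently.
address = adapt("http://sink", IAddress)
```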
12 Requirements for a Provenance Visualization Panel
- (Markus Kunde, Henning Bergmeyer)
13 Motivation
- Determine requirements for a Provenance visualization panel
- Requirement to document Provenance in our projects (e.g. AeroGrid)
- No specification for concrete use of the documented provenance, yet
- => Tool at least for general browsing of low-level documentation is needed
- Raw provenance data in XML is hard to browse
- Verification of records
- Experimental browsing to determine better query and interpretation methods
- Panel provided by project Grid Provenance not suitable
14 Approach
- Identify User Groups
  - User interests (What do they want to explore?)
  - User intentions (Why do they want to explore that?)
- Analyse the Provenance data structure
  - Elements
  - Properties
  - Connections
  - Scale
- Determine visualization and analysis methods
  - What information is to be shown
  - Where to show it
  - When, for how long, static or animated
  - Clear and consistent semantics for visual elements
- Determine exploration strategy
15 Identifying User Groups
- Interest / Scope
  - What documentation is asked for?
  - What documentation is a user allowed to see?
  - Abstraction: high-level border, range of access
- Intention
  - Why is that documentation asked for?
  - Abstraction: low-level border, type and level of detail of required documentation
16 Identified User Groups
- General User
  - Scientist, Engineer, Portal User
  - Interest: own work, own results, origin of used data
  - Intentions: reliability and authenticity of results, reproducibility
- Designer
  - Software Engineer, Workflow Developer
  - Project related, all origins, monitored system, partner-made components
  - workflow behavior, service interaction, product evolution
- Manager
  - Workflow Provider, Provenance Analyst, User Support
  - all assigned user and system Provenance
  - correctness of services, interpretation support, quality of the P-system
- Administrator
  - Developer / Admin of Provenance System
  - all P-data available in connected P-stores
  - building the P-system and maintaining its function
17 User Analysis Intentions
- Process
  - Evaluation of the approach of a workflow
  - Actors, Interactions, Sequence of Process steps
- Results
  - Quality of intermediate and end results of processes
  - Dependencies of inputs and outcome
- Relationship
  - Analysis of the evolution of data
  - Relationships of interactions or actors
- Time Line
  - Finding performance bottlenecks, improving workflows
  - Evolution of results, actor behavior
- Participation
  - Trust in the result
  - Participating actors
- Comparison
  - Validate correctness of processes and results, by comparing documented executions with reference structures, like processes, views on interactions, results
- Interpretation
  - Custom visualization requirements, deriving knowledge from Provenance data
  - Custom, probably all aspects
=> Exploration required
18 Exploration
- Difficulty in a large-scale graphic exploration system
- Where to start?
- Begin with an overview
- Select processes, interaction channels or actors
- Fade out the rest and choose specific detail visualizations.
- Read application-specific content
19 Elements
20 Actor / Asserter Views
21 Focus on Interaction
- Process Map (inspired by tube map)
- Processes
- Participating Actors
- Bottlenecks
- Interaction Stretch
- Individual Interactions
- Relationships and order
22 Combined Flow-Chart
- Typical Data Flow Graph
- Shows directions of message flows
- No notion of time => Requires previous selection of recorded process.
System / Process Context
23 Process Aerial
- Find individual executions of selected processes
- Find anomalies
- Show only interesting actor states and relationships
- Scrolling up and down along time axis
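Finding individual executions amounts to grouping recorded interactions by their tracer and ordering each group along the time axis. A minimal sketch, assuming each record is available as a (tracer, timestamp, description) tuple; this tuple layout is an assumption for illustration, not the P-Store format.

```python
from collections import defaultdict


def executions_by_tracer(records):
    """Group (tracer, timestamp, description) tuples into time-ordered executions."""
    groups = defaultdict(list)
    for tracer, timestamp, description in records:
        groups[tracer].append((timestamp, description))
    # Sort every execution by timestamp so it can be laid out along a time axis.
    return {tracer: sorted(items) for tracer, items in groups.items()}


# Usage with three recorded interactions from two workflow executions.
records = [
    ("run-2", 5, "store result"),
    ("run-1", 2, "transform data"),
    ("run-1", 1, "load data"),
]
executions = executions_by_tracer(records)
```

An execution whose group is unusually short or long compared to the others would be a natural candidate when looking for anomalies.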
24 Visualisation Methods
25 Graphical and Exploration Requirements
- distinct, consistent representations of documentation elements to allow intuitive interpretation
- extensible support of different layout methods
- adjustment of alignment helps to interpret
- switching of scope and detail
- proxy displays for large data sets
  - e.g. navigation maps
- mixing and migrating of layouts (animated)
26 Architectural Requirements
- support of VO management
- store access
- actor/asserter views
- caching and merging of query results
- extensible architecture
- layout methods
- element representations
- exploration methods
- "content" support
- GUI abstraction
- Web Portals
- Desktop Applications
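The requirement "caching and merging of query results" could look roughly like the sketch below. It assumes a caller-supplied `fetch(store, query)` function that returns (record_id, payload) pairs; that function, the record ids and the class name `QueryCache` are hypothetical and only serve the illustration.

```python
class QueryCache:
    """Caches query results per store and merges results from several stores."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: (store, query) -> list of (record_id, payload)
        self._cache = {}

    def query(self, store, query):
        """Return the result for one store, contacting it only on a cache miss."""
        key = (store, query)
        if key not in self._cache:
            self._cache[key] = self._fetch(store, query)
        return self._cache[key]

    def query_all(self, stores, query):
        """Merge results from all stores; records seen in several stores are kept once."""
        merged = {}
        for store in stores:
            for record_id, payload in self.query(store, query):
                merged.setdefault(record_id, payload)
        return merged


# Usage with an in-memory stand-in for two connected P-Stores.
_FAKE_STORES = {
    "store-a": [("r1", "x"), ("r2", "y")],
    "store-b": [("r2", "y"), ("r3", "z")],
}
calls = []


def fake_fetch(store, query):
    calls.append(store)
    return _FAKE_STORES[store]


cache = QueryCache(fake_fetch)
merged = cache.query_all(["store-a", "store-b"], "all records")
```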