Title: An introduction to Taverna workflows
1. Installing the Workbench
Exercise 1: Installing the Workbench
- Download Taverna from http://taverna.sourceforge.net
- Windows or Linux
  - If you are using either a modern version of Windows (Win2k or WinXP, with XP preferred) or any form of Linux, Solaris etc., you should download the workbench zip file. For Windows users, Taverna can be unzipped and used; for Linux you will also need to install GraphViz (http://www.graphviz.org/, the appropriate rpm for your platform)
- Mac OSX
  - If you are using Mac OSX you should download the .dmg workbench file. Double-click to open the disk image and copy both components (Taverna and GraphViz) onto your hard disk to run the application
- YOU WILL ALSO NEED a modern Java Runtime Environment (JRE) or Java Software Development Kit (SDK) from http://java.sun.com (Java 5 or above)
- The Taverna workbench has a standard menu of 6 tabs
- File, with the following items
  - Open a new workspace
  - Load a workflow from a file
  - Load a workflow from the web
  - Close existing workflow
  - Save workflow
  - Import workflow from a file
  - Import workflow from the web
  - Run your workflow
  - Close the workbench
Standard menu
- Tools: for plug-ins and updates
- Workflow: list of all created workflows
- Advanced: to create new perspectives
- Design: workflow design space
- Result: view workflow results
- The Taverna Design view is composed of 3 main windows
- 1. Available Services
  - Lists services available by default in Taverna
    - Local Java services
    - Simple web services
    - Soaplab services: legacy command-line applications
    - BioMart database services
    - BioMoby services
  - Allows the user to add new services or workflows from the web or from file systems
Workbench Layout
- 2. AME: Advanced Model Explorer
  - The Advanced Model Explorer (AME) is the primary editing component within Taverna. Through it you can load, save and edit any property of a workflow. It enables
    - building
    - loading
    - editing
    - saving workflows
- 3. Workflow Diagram Window
  - Visual representation of the workflow
  - Shows inputs / outputs, services and control flows
  - Enables saving of workflow diagrams for publishing and sharing
Installing Plugins
- Go to the Tools menu at the top of the workbench and select the Plugin manager
- Select find new plugins
- Tick the boxes for Feta, Execute remotely and LogBook and install these plugins
- Three more options (Execute remotely, Discover and LogBook) will now have appeared at the top of your screen
- Feta is now available through the Discover tab
- The Discover tab can be used to search for web services by name, task, input and output parameters
- NB: Later exercises in this tutorial will use these plugins
2. Adding new services
Exercise 2: Adding New Services
- New services can be gathered from anywhere on the web
- Go to the following page: http://developerdays.com/cgi-bin/tempconverter.exe/wsdl/ITempConverter and copy the web page address
- These services were not designed for use in Taverna, but Taverna can use them if you supply the address of the WSDL file
Exercise 2: Adding New Services
- Go to the Available Services panel and right-click on Available Processors. For each type of service, you are given the option to add a new service, or set of services.
- Select Add new wsdl scavenger. A window will pop up asking for a web address
- Enter the web service address you have just copied.
- Scroll down to the bottom of the Available Services panel and look at the Temperature Conversion web service that is now included.
3. Finding and Invoking a Service
Exercise 3: Invoking a single service
- Expand the entry for the tempconverter (Temperature Conversion) web service
- Right-click on the CtoF operation and select Invoke. This operation converts a temperature from Celsius to Fahrenheit.
- In the pop-up Run workflow window, add a temperature value in Celsius by selecting temp and right-clicking. Select new input value and enter a value in the box on the right
- Click Run workflow and the service is invoked
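To sanity-check the value the service returns, the conversion can be worked out by hand. The following is a minimal sketch, assuming CtoF applies the standard Celsius-to-Fahrenheit formula; the function name `c_to_f` is illustrative only and is not part of the web service.

```python
def c_to_f(temp_c: float) -> float:
    """Standard Celsius-to-Fahrenheit conversion: F = C * 9/5 + 32."""
    return temp_c * 9.0 / 5.0 + 32.0


if __name__ == "__main__":
    # Compare this with the value shown in the Taverna results panel.
    print(c_to_f(25.0))
```

If the value shown in Taverna differs from this calculation, check that the input was entered as a plain number.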
Exercise 3: View Results
- Click on text/plain in the left panel
- The temperature in Fahrenheit is displayed on the right
- Click on Process Report
  - Look at the processes. This shows the experiment provenance: where and when processes were run
- Click on Status
  - Look at the options. As workflows run, you can monitor their progress here.
Exercise 3 - Conclusion
- Invoking and running a single service is the basis of any workflow. The tracking of processes and the generation of results are the same however complicated a workflow becomes
- In the next few exercises, we will look at some example workflows and build some of our own from scratch
4. Finding and Using Workflows
Exercise 4: Finding and using workflows
- Switch to the design view by clicking on Design
- Select Open Workflow from the File menu at the top of the workbench. You will see a selection of .xml files in an examples directory. These are workflow definition files
- Select ConvertedEMBOSSTutorial.xml and a pre-defined workflow will be loaded
- View the workflow diagram: you will see services in different colours
Exercise 4: Workflow Documentation
- Find out what the workflow does by reading the workflow metadata
- In the AME, click on the name of the workflow (in this case, A workflow version of the EMBOSS tutorial) and then select the workflow metadata tab at the top of the AME. You will see a text description of the workflow, its author and its unique LSID. When publishing workflows for others, this annotation is useful information and allows the acknowledgement of intellectual property
Exercise 4: Workflow Features
- Run the workflow by selecting run workflow from the File menu
- Watch the progress of the workflow in the enactor invocation window. As services complete, the enactor reports the events. If a service fails, the enactor reports this as well
Loading workflows from the Web
- Go to the webpage www.cs.man.ac.uk/ytanoh
- Select ConditionalBranchChoice and copy the web address
- Go back to the Taverna workbench and select Open workflow location from the File menu.
- Paste the address in the pop-up window and click OK
- Run the workflow using true or false as the input value.
Loading workflows from the Web
- You will see at least one of the services fail. What happens when it fails depends on whether the service is set as a critical one. If it is, the workflow will abort; if it isn't, the workflow will continue
- You can set a service to critical by ticking the critical box in the AME.
- Set the failing service to critical and run the workflow again
- The entire workflow fails this time.
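The difference between critical and non-critical failures can be sketched in a few lines. This is an illustration only, not Taverna's enactor: the function `run_services`, the tuple layout and the `"FAILED"` marker are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical layout for this sketch: (service name, callable, is_critical)
Service = Tuple[str, Callable[[], str], bool]


def run_services(services: List[Service]) -> Dict[str, str]:
    """Run services in order; a failing critical service aborts the whole run."""
    results: Dict[str, str] = {}
    for name, call, is_critical in services:
        try:
            results[name] = call()
        except Exception as err:
            if is_critical:
                raise RuntimeError(f"workflow aborted: {name} failed") from err
            # Non-critical failure: record it and carry on with the next service.
            results[name] = "FAILED"
    return results


def working() -> str:
    return "ok"


def failing() -> str:
    raise IOError("service unavailable")
```

With every service marked non-critical, the run finishes and the failed service is simply reported. Marking the failing service as critical makes the same run stop with an error.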
Loading workflows from the Web
- Go back to the Design view
- Look at the workflow diagram
- You will see black arrows and white circles: black arrows show the flow of the data and white circles are control links.
- A control link specifies that even though there is no data flowing between two services, the second should not start until the first has finished
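One way to picture control links is as extra ordering constraints alongside the data links. The sketch below is an assumption-laden illustration (it is not how Taverna schedules services): it treats both kinds of link as "upstream must finish before downstream" and uses Python's standard `graphlib` module (Python 3.9 or later) to compute one valid run order. The service names are made up.

```python
from graphlib import TopologicalSorter
from typing import Iterable, List, Tuple

Link = Tuple[str, str]  # (upstream service, downstream service)


def execution_order(data_links: Iterable[Link], control_links: Iterable[Link]) -> List[str]:
    """Return one order in which services could run, honouring both link types."""
    sorter = TopologicalSorter()
    for upstream, downstream in list(data_links) + list(control_links):
        # The downstream service may only start once the upstream one has finished.
        sorter.add(downstream, upstream)
    return list(sorter.static_order())


if __name__ == "__main__":
    # No data flows from "fetch" to "cleanup", but the control link still orders them.
    print(execution_order(data_links=[("input", "fetch")],
                          control_links=[("fetch", "cleanup")]))
```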
5 and 6. Building a simple workflow
5.1 Building a simple workflow from scratch
- Open a new workspace by selecting New workflow from the File menu.
- Then find the CtoF service in the Available Services panel (you can use the search form on top of Available Processors).
- Right-click on CtoF and import it into the workbench by selecting Add to Model
- In the AME window CtoF shows
  - 1 input (green arrow pointing up)
  - 2 output (purple arrow pointing down)
Exercise 5.2: Adding Input
- Define a new workflow input by right-clicking on Workflow Input and selecting create new Input
- Supply a suitable name, e.g. temperatureInCelsius
- Connect this new input to the CtoF service by right-clicking on temperatureInCelsius and selecting CtoF -> temp
- You always build workflows with the flow of data
Exercise 5.3: Adding output
- Define a new workflow output by right-clicking on workflow output and selecting create new output
- Supply a suitable name, e.g. temperatureInFahrenheit
- Connect this new output to the CtoF service output (return): right-click the output return on the CtoF service and select workflow output -> temperatureInFahrenheit
- Congratulations! You have built a simple workflow from scratch.
- Run the workflow. You will again need to supply a temperature value in Celsius, e.g. 25
- Save your workflow
Exercise 6: Stringing Services Together
- In the following section you will learn to connect more than one service together.
- You are going to convert a temperature value from Celsius to Fahrenheit then back to Celsius again using only one workflow.
- Open a new workflow workspace
- Search for the CtoF web service in the Available Services panel and add it to the AME window.
- Search for the FtoC web service in the Available Services panel and add it to the AME window.
- Create an input called TempC and connect it to the temp input on the CtoF service
Exercise 6: Stringing Services Together
- The temperature input for the FtoC service will be the output from the CtoF service. Connect the output return on the CtoF web service to the input temp on the FtoC web service.
- Create an output called temp_in_C and connect it to the output return on the FtoC service.
- Remember: you always build workflows with the flow of data
- Run the workflow
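The wiring in this exercise amounts to feeding the output of one service into the input of the next. A minimal sketch, assuming both services apply the standard conversion formulas; the function names are illustrative and stand in for the remote CtoF and FtoC operations.

```python
def c_to_f(temp_c: float) -> float:
    """Stand-in for the CtoF operation."""
    return temp_c * 9.0 / 5.0 + 32.0


def f_to_c(temp_f: float) -> float:
    """Stand-in for the FtoC operation."""
    return (temp_f - 32.0) * 5.0 / 9.0


def round_trip(temp_c: float) -> float:
    """TempC -> CtoF.temp, CtoF.return -> FtoC.temp, FtoC.return -> temp_in_C."""
    return f_to_c(c_to_f(temp_c))


if __name__ == "__main__":
    print(round_trip(25.0))
```

The workflow output should match the workflow input, up to rounding, which makes this a convenient check that the two services are connected in the right direction.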
Exercise 6: String Constants
- Go back to the design view
- Select and right-click the workflow input TempC
- Select Remove from model to delete it.
- Select string constant from Available Services
- Right-click and select add to model with name
Exercise 6: String Constants
- Insert TemperatureC in the pop-up window
- Right-click on TemperatureC and select Edit string value
- Enter a temperature value in Celsius.
- Connect the output value on TemperatureC to the input temp on the CtoF service.
- Run the workflow
  - The workflow will run with the default value
Saving Results
- Taverna provides several options for saving data.
  - Individual data items can be saved by right-clicking on them
  - All data can be saved to disk
  - Textual/tabular data can be saved to Excel
- Save all the data from your workflow
Try it
- Build a workflow following the model below. The web service (purple and green colour) names and input values are given in the diagram. Hint: use the Discover tab to find the services.
- Annotate your workflow (name, author, date)
Advanced Exercises
- The previous exercises have covered the basics of
myGrid workflows. The following demos and
exercises cover more advanced features, such as
rendering output, dealing with service failure
and iterating over datasets. You may not reach
the end of these exercises, but they will provide
some examples to take home.
Advanced Features
- Output format
- Iteration
- Substituting Services
- Fault tolerance
- Nested workflow
- Shim
- XMLSplitters
Exercise 8: Defining Output Formats
- Taverna is able to display results using a specific type of renderer if the workflow output is configured correctly.
- Load the workflow convertedEMBOSSTutorial from the examples directory
- Run the workflow
Exercise 8: Defining Output Format
- Look at the results. For tmapPlot and outputPlot, you will see the results are displayed graphically. This is achieved by specifying a particular MIME type in the output.
- Go back to the AME and look at the metadata for tmapPlot and outputPlot (e.g. select tmapPlot and click on Metadata for tmapPlot).
- Select MIME Types. As you can see, each has the image/png MIME type associated with it. If you wish to render results in anything other than plain text, you MUST specify the MIME type in the workflow output
Exercise 8: Taverna MIME Types
- The following MIME types are currently used by Taverna
  - text/plain: Plain Text
  - text/xml: XML Text
  - text/html: HTML Text
  - text/rtf: Rich Text Format
  - text/x-graphviz: Graphviz Dot File
  - image/png: PNG Image
  - image/jpeg: JPEG Image
  - image/gif: GIF Image
  - application/zip: Zip File
  - chemical/x-swissprot: SWISSPROT Flat File
  - chemical/x-embl-dl-nucleotide: EMBL Flat File
  - chemical/x-ppd: PPD File
  - chemical/seq-aa-genpept: Genpept Protein
  - chemical/seq-na-genbank: Genbank Nucleotide
  - chemical/x-pdb: Protein Data Bank Flat File
  - chemical/x-mdl-molfile
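The idea of choosing a renderer from the declared MIME type can be sketched as a simple lookup with a plain-text fallback. This is a hypothetical illustration, not Taverna's renderer registry: the table `RENDERERS`, the renderer descriptions and the function `pick_renderer` are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical mapping from a few of the MIME types above to a renderer description.
RENDERERS: Dict[str, str] = {
    "text/plain": "plain text viewer",
    "text/html": "HTML viewer",
    "image/png": "image viewer",
    "image/jpeg": "image viewer",
    "image/gif": "image viewer",
}


def pick_renderer(declared_mime_types: List[str]) -> str:
    """Use the first declared type that has a renderer; otherwise show plain text."""
    for mime_type in declared_mime_types:
        if mime_type in RENDERERS:
            return RENDERERS[mime_type]
    return RENDERERS["text/plain"]


if __name__ == "__main__":
    print(pick_renderer(["image/png"]))
    print(pick_renderer([]))  # nothing declared: falls back to plain text
```

The fallback branch mirrors the point made in the exercise: an output with no MIME type declared is shown as plain text.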
Exercise 8: Taverna MIME types (2)
- The chemical/ MIME types are rendered using SeqVista to view formatted sequence data
- Load FetchPDBFlatFile from the examples/library directory
- Run the workflow using 1atp as an example input
- The chemical/x-pdb type can be used to view rotating 3D protein images
Nested workflows
- Nested workflows encourage the reuse of workflows within a more complex scenario and provide a level of abstraction
- Select the temperature conversion workflow of exercise 6 (workflow1)
- Click on Add Nested Workflow under Advanced Model Explorer.
- Select Open File and choose the workflow you saved in exercise 5.3 (workflow2)
- Connect both workflows together so that workflow2 becomes a subworkflow of workflow1
- Run the workflow
  - Hint: you may need to create a new workflow output.
Iteration
- Taverna has an implicit iteration framework. If you connect a set of data objects (for example, a set of fasta sequences) to a process that expects a single data item at a time, the process will iterate over each sequence
- Load the BiomartandEMBOSSAnalysis.xml workflow from the examples directory and run it.
- Watch the progress report. You will see several services with Invoking with Iteration
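Implicit iteration can be pictured as follows: when a service that expects one item receives a list, it is called once per item and the results are collected into a list. The sketch below is a simplification under that assumption; the helper `invoke` is hypothetical and `len` merely stands in for a real service.

```python
from typing import Any, Callable


def invoke(service: Callable[[Any], Any], value: Any) -> Any:
    """Call the service once per item if given a list, otherwise call it once."""
    if isinstance(value, list):
        return [invoke(service, item) for item in value]
    return service(value)


if __name__ == "__main__":
    sequences = ["ACGT", "AC", "ACGTTT"]  # stand-ins for fasta sequences
    print(invoke(len, sequences))  # one result per sequence
    print(invoke(len, "ACGT"))     # a single item: called once
```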
Iteration
- The user can also specify more complex iteration strategies using the service metadata tag
- Load the IterationStrategyExample.xml from the examples directory
- Read the workflow metadata to find out what the workflow does
- Select the ColourAnimals service and read the metadata for that service. Under the description is the iteration strategy
- Click on dot product. This allows you to switch to cross product
Iteration
- Run the workflow twice: once with dot product and once with cross product.
- Save the first results so you can compare them. What is the difference? What does it mean to specify dot or cross product?
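Once you have compared the two runs, the following sketch may help confirm your answer. It assumes the usual meaning of the two terms: a dot product pairs items position by position, while a cross product forms every combination. The input lists are invented for the example and are not taken from the workflow.

```python
from itertools import product
from typing import List, Tuple


def dot_product(first: List[str], second: List[str]) -> List[Tuple[str, str]]:
    """Pair the 1st item with the 1st, the 2nd with the 2nd, and so on."""
    return list(zip(first, second))


def cross_product(first: List[str], second: List[str]) -> List[Tuple[str, str]]:
    """Pair every item of the first list with every item of the second."""
    return list(product(first, second))


if __name__ == "__main__":
    colours = ["red", "green"]
    animals = ["cat", "dog"]
    print(dot_product(colours, animals))
    print(cross_product(colours, animals))
```

For two lists of n and m items, the dot product yields min(n, m) pairs in this sketch, while the cross product yields n times m.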
Substituting Services and Fault Tolerance
- Taverna does not own many of the services it provides. This means that it cannot control their reliability. Instead, Taverna provides strategies for dealing with services being unavailable
- Reload the convertedEMBOSSTutorial.xml from the examples directory.
- Look at the metadata for the emma service. It is an implementation of clustalw
- Find the DDBJ clustalw service analyseSimple. HINT: use the Feta discovery tool
Substituting Services
- When you have added this service to your workflow, right-click on it and select add as alternate
- In the resulting menu select emma
- The DDBJ version of the clustalw service is now added as an alternative to emma in the AME. It will be called alternate1
- Select alternate1 and look at the inputs and outputs. These need to be mapped to the correct inputs and outputs in emma
Substituting Services
- Right-click on the query input in alternate1 and map it to sequence_direct_data. In both services, these inputs expect a set of fasta sequences.
- Right-click on the result output and map it to outseq in emma in the same way.
- Now you have a workflow which will run using emma when it is available but will substitute DDBJ clustalw for it if emma fails!
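The behaviour of an alternate can be sketched as "try the primary service, and fall back to the next one on failure". This is an illustration of the idea only; the function `invoke_with_alternates` and the two stand-in services are hypothetical and do not call emma or the DDBJ service.

```python
from typing import Callable, List, Optional


def invoke_with_alternates(primary: Callable[[str], str],
                           alternates: List[Callable[[str], str]],
                           data: str) -> str:
    """Return the result of the first service that succeeds."""
    last_error: Optional[Exception] = None
    for service in [primary, *alternates]:
        try:
            return service(data)
        except Exception as err:
            last_error = err  # remember the failure and try the next service
    raise RuntimeError("primary service and all alternates failed") from last_error


def unavailable_service(data: str) -> str:
    raise IOError("service unavailable")


def backup_service(data: str) -> str:
    return "aligned:" + data


if __name__ == "__main__":
    print(invoke_with_alternates(unavailable_service, [backup_service], "seqs"))
```

The input and output mapping done in the exercise corresponds to making sure both callables accept the same kind of data and return the same kind of result.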
Fault Tolerance
- Taverna also allows the user to specify the number of times a service is retried before it is considered to have failed. Sometimes network traffic is heavy, so a working service needs to be retried
- Select tmap from the same workflow. To the right of the service name is a series of 0s and 1s. By simply typing over the numbers, the user can specify the number of retries and the time between the retries
- Change it to 3 retries for tmap and set the status to critical using the final tickbox. Now that it is critical, the whole workflow will be aborted if tmap fails after 3 retries. Failures in non-critical services will not abort the workflow run.
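The retry settings can be read as "call the service again, up to a fixed number of times, waiting between attempts". A minimal sketch under that reading; the function `invoke_with_retries` and its parameter names are assumptions for illustration and do not correspond to Taverna configuration fields.

```python
import time
from typing import Callable


def invoke_with_retries(service: Callable[[str], str], data: str,
                        max_retries: int = 3, delay_seconds: float = 0.0) -> str:
    """Call the service, retrying up to max_retries times after a failure."""
    retries_used = 0
    while True:
        try:
            return service(data)
        except Exception:
            if retries_used >= max_retries:
                raise  # retries exhausted: the service is considered to have failed
            retries_used += 1
            time.sleep(delay_seconds)  # wait before the next attempt
```

With `max_retries=3` a service is called at most four times: the first attempt plus three retries. If it still fails and is marked critical, the workflow run is aborted.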
Shim Services
- A shim is a service that doesn't do anything scientific, but helps two scientific services fit together
- There are many myGrid shim services. These are currently being described in a shim library, but for now, a small collection is documented here
- http://www.cs.man.ac.uk/hulld/shims.html
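As an illustration of what a shim does, consider one service that produces a FASTA record and another that expects a bare sequence string. The sketch below is a hypothetical shim written for this example; it is not one of the myGrid shim services.

```python
def fasta_to_raw_sequence(fasta_text: str) -> str:
    """Drop FASTA header lines (starting with '>') and join the sequence lines."""
    lines = [line.strip() for line in fasta_text.splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">"))


if __name__ == "__main__":
    record = ">seq1 example\nACGT\nTTGA\n"
    print(fasta_to_raw_sequence(record))
```

The shim performs no analysis of its own; it only reshapes the data so the downstream service can accept it.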
Beanshell script
- Beanshell scripts allow users to write small, bespoke Java scripts to allow incompatible services to work together
- Create a new beanshell processor by right-clicking Beanshell scripting host in the service panel and selecting Add to model (you may change the name of the processor)
- Right-click the beanshell processor you created and select Configure beanshell
- Create 2 input ports named myName and mySurname
- Create 1 output port named myFullname
- Note that these ports are automatically added to the AME window
Beanshell script
- Select the script tab and paste the following script
  - myFullname = myName + "\t" + mySurname;
- Create 2 workflow inputs and 1 workflow output and connect them to the configured beanshell processor.
- Run the workflow
Adding XMLSplitters
- Some web services do not explicitly expose their inputs and/or outputs. XMLSplitters are used to present these inputs and/or outputs to the user.
- Open a new workflow workspace
- Add the following wsdl service: http://www.esynaps.com/WebServices/DailyDiblert.asmx?WSDL
- Add the service DailyDilbertImagePath in the AME window.
- It has 2 outputs but no input.
Adding XMLSplitters
- Select the output parameters on the DailyDilbertImagePath service
- Right-click and select add XML splitter
- A new service parametersXML is added with its input connection already made.
- Search for the Get image from URL web service and add it to the AME window.
- Connect the output DailyDilbertImagePathResult on the parametersXML service to the input url on the Get image from URL service.
Adding XMLSplitters
- The second input, base, on the Get image from URL service is optional. Leave it unconnected.
- Create a new workflow output DailyDilbert and connect it to the output image on the Get image from URL service.
- Run the workflow
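What an XML splitter does can be pictured as unpacking a wrapped XML value into its named parts. The sketch below uses Python's standard `xml.etree.ElementTree`. The sample document is an invented, simplified response; real service responses usually carry XML namespaces and the URL shown is a placeholder.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def split_xml(xml_text: str) -> Dict[str, str]:
    """Expose each direct child element of the root as a named output."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "") for child in root}


if __name__ == "__main__":
    # Hypothetical, simplified shape of the 'parameters' output.
    sample = ("<parameters>"
              "<DailyDilbertImagePathResult>http://example.org/strip.gif"
              "</DailyDilbertImagePathResult>"
              "</parameters>")
    outputs = split_xml(sample)
    print(outputs["DailyDilbertImagePathResult"])
```

The extracted value plays the role of the DailyDilbertImagePathResult output that is connected to the url input in the exercise.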
Remote execution
- The Taverna Remote Execution plugin allows workflows to be run on a Remote Execution Server.
- To install the Remote Execution plugin, use the Plugin Manager in the Tools menu.
- Configuration information and instructions for using remote execution are available here
  - http://www.mygrid.org.uk/usermanual1.7/remote_execution.html
  - http://www.mygrid.org.uk/usermanual1.7/remote_execution_server.html
Useful links
- Taverna user manual: http://www.mygrid.org.uk/usermanual1.7/
- Taverna mailing lists: http://taverna.sourceforge.net/index.php?doclists.html