Title: Taverna Workflows
Taverna Workflows: A beginner's guide
Mark Wilkinson, Edward Kawas (modified for Taverna 2.1 from an earlier version by Paul Fisher)
Taverna
This tutorial is designed to introduce you to the Taverna 2.1 workflow workbench.
1. Installing the Workbench
Prerequisites - 1
- Java
  - To run Taverna 2.1 on your computer you need Java 1.6 or later installed. If you do not have Java installed, you can download it from http://java.sun.com/javase/downloads/index.jsp
  - You will be offered a choice of downloads. Download the JDK with Java EE packaged up too. This will give you the opportunity to develop web services, and to use the ones deployed by Java developers, at a later date. The Java Runtime Environment (JRE) being downloaded should be 1.6 or later for Taverna to work.
  - If you have Java installed but it is an earlier version, you will need to update it to 1.6 or later, otherwise Taverna will NOT work.
  - The minimal installation you will need is the standard JDK package.
  - Download the desired JDK by following the link on the website, and choose a location on your computer to save it to.
  - Open the saved file and follow the installation instructions to install Java on your computer.
  - Restart your computer to complete the installation.
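The version requirement above can be checked from a script. The following is a minimal sketch, not part of the original tutorial: it assumes that `java` is on the PATH and that `java -version` reports a line containing `version "x.y..."`. The helper names `parse_java_version` and `meets_taverna_requirement` are made up for this illustration.

```python
import re
import subprocess


def parse_java_version(version_output):
    """Return (major, minor) parsed from `java -version` text, or None if absent."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return major, minor


def meets_taverna_requirement(version_output):
    """True if the reported version is Java 1.6 or later (the tutorial's minimum)."""
    version = parse_java_version(version_output)
    return version is not None and version >= (1, 6)


if __name__ == "__main__":
    try:
        # `java -version` writes its report to stderr, not stdout.
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
        print("Java 1.6 or later:", meets_taverna_requirement(result.stderr))
    except FileNotFoundError:
        print("Java was not found on the PATH")
```

If the script reports that Java was not found, or that the version is too old, install or update Java as described above before starting Taverna.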
Prerequisites - 2
- A zip tool (not needed for the MS Windows install)
  - You will also need a tool to unzip the downloaded workbench. Various tools are available on the internet, including WinZip, 7-Zip, and a few others. Personally I prefer 7-Zip, which is free and easy to use, available at http://www.7-zip.org/download.html
  - Choose the appropriate file to download for your operating system, i.e. Windows, Linux, or Mac.
  - Choose a location to save the file in and save it.
  - Locate your saved file and follow the installation instructions to install it on your computer.
  - Restart your computer to complete the installation.
Prerequisites - 3
- Linux users - Graphviz
  - Those who are installing Taverna on Linux will also have to install Graphviz on the system. It is available at http://www.graphviz.org/
  - At the time of writing I have no installation instructions for this package, so please refer to the user documentation provided on the web site.
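Once Graphviz is installed, a quick way to confirm it is visible to other programs is to look for its `dot` layout program on the PATH. This is only a sketch under an assumption: the tutorial does not say which Graphviz program Taverna calls, so checking for `dot` is a guess at a sensible test, and `is_on_path` is a helper name invented here.

```python
import shutil


def is_on_path(program):
    """Return True if an executable called `program` can be found on the PATH."""
    return shutil.which(program) is not None


# `dot` is the layout program shipped with Graphviz.
if is_on_path("dot"):
    print("Graphviz 'dot' found")
else:
    print("Graphviz 'dot' not found; install Graphviz first")
```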
Downloading Taverna
- Updates and news about Taverna can be found on the myGrid homepage at http://www.mygrid.org.uk
- The latest versions of Taverna can be downloaded from SourceForge at http://taverna.sourceforge.net
- Once on the Download page, identify the Taverna distribution you need (we are using 2.1).
- Follow the link to download the workbench. The web page should redirect you to the SourceForge page.
- Choose a location to save the file and click OK.
Unzipping the workbench (not needed for a Windows installation)
- Choose to Unzip/Extract the files, but not into the current directory.
- You will need to choose a directory in which to unzip the files. I recommend somewhere in the root drive of your computer so you can easily access it, e.g. C:\myGrid\ .
- You can change the name of the folder at this stage, e.g. to Taverna.
- If you are using Taverna on Linux, please be sure that you have the relevant access permissions to install and run Taverna in the desired directory.
If you need a zip package, download and install 7-Zip (find it using Google).
Opening Taverna
- Locate your Taverna installation in the Windows Start menu, or open the Taverna folder.
- Start Taverna by double-clicking on the Taverna icon in the Start menu, or by running runme.sh (Linux and Mac users).
- If you have successfully installed Java, you should see a dialog box or command window open, shortly followed by the Taverna application.
- The first time you start Taverna after installing it, it needs to update all of its components. You do not need to do anything for this, as it happens while the workbench is opening. You should see a graphic in the centre of your screen with a download progress bar. Each component is shown loading in this progress bar in turn. Once this has completed (about 5 minutes, depending on connection speed), the Taverna workbench will open.
- The Taverna workbench consists of 3 main panels for constructing workflows:
  - The Available services pane (top left side)
  - The Advanced Model Explorer pane (bottom left side)
  - The Diagram pane (right side)
The 3 Panes of Taverna
- The Available services pane displays the web services available to the user. The list contains the default services loaded when the workbench starts. Once you become more experienced with the workbench, you will be able to add your own services, including adding default services so that they load automatically when Taverna opens. The list contains WSDL web services, local BioJava widgets, Soaplab services, and BioMoby objects. Each of these can be added to the workflow model (the workflow being constructed) so that a task can be achieved.
- The Advanced Model Explorer (AME) pane contains the services used in the current workflow, including the inputs, outputs, and data links between each service. Once populated with services, each service can be expanded using the button next to it. This provides a list of the inputs that the service accepts and the outputs that it produces. It is these inputs and outputs that allow you to connect services together.
- The Diagram pane shows a graphical representation of the workflow being used or constructed. The diagram can be adapted to view different aspects of the current workflow: to show all the ports for all the services, to show only those ports that have been connected or bound, or to change the layout of the workflow from portrait to landscape.
3 Panes of Taverna
[Screenshot of the workbench with the three panes labelled: Available services, Diagram pane, Advanced Model Explorer]
Advanced Model Explorer
- AME (bottom left panel)
- The AME is the primary editing component within Taverna. Through it you can load, save, and edit any property of a workflow.
- It enables you to:
  - build a workflow
  - add nested workflows
  - edit workflows by connecting services
  - add metadata to a workflow
Diagram Pane
- Shows inputs/outputs, services, and control flows.
- It allows you to change the view of a workflow, save the visual representation, and explode or implode nested workflows.
Available services
- Lists the services available by default in Taverna (top left): 3500 services
  - Local Java services
  - Simple web services
  - Soaplab services (legacy command-line applications)
  - R Processor
  - BioMart database services
  - BioMoby services
  - Beanshell processor
- Allows the user to add new services or workflows from the web or from file systems
2. Adding new services
Exercise 2: Adding New Services
- New services can be gathered from anywhere on the web. The default list contains just a few that we already know about, and importing others is very straightforward.
- Go to the DDBJ list of available web services at http://xml.nig.ac.jp/index.html
- These services were not designed for use in Taverna, but Taverna can use them if you supply the address of the WSDL file.
- Click on the DDBJ Blast service (http://xml.nig.ac.jp/wsdl/Blast.wsdl) and copy the web page address.
This is the Blast Service from DDBJ
This is what Taverna needs
Clicking on the Blast service brings you to the page describing the methods (functions) that the Blast service can execute.
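To make the idea of "methods described by a WSDL file" concrete, here is a minimal sketch that lists the operation names declared in a WSDL 1.1 document. It parses a small inline sample rather than the real DDBJ file, so it runs without a network connection. The sample is an assumption written for this illustration: `searchSimple` is the operation named in this tutorial, while `searchParam` is included only as a second illustrative entry. The real file will contain much more.

```python
import xml.etree.ElementTree as ET

# Namespace used by WSDL 1.1 documents.
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# Hand-written, heavily simplified stand-in for a WSDL file (not the real DDBJ file).
SAMPLE_WSDL = """<?xml version="1.0"?>
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="Blast">
  <portType name="Blast">
    <operation name="searchSimple"/>
    <operation name="searchParam"/>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Return the operation names declared in the portType elements of a WSDL 1.1 document."""
    root = ET.fromstring(wsdl_text)
    names = []
    for port_type in root.iter("{%s}portType" % WSDL_NS):
        for operation in port_type.findall("{%s}operation" % WSDL_NS):
            names.append(operation.get("name"))
    return names


print(list_operations(SAMPLE_WSDL))
```

Taverna performs a similar discovery step when you give it a WSDL address, which is why the address of the WSDL file is all it needs.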
Exercise 2: Adding New Services
- Go to the Service panel in Taverna and click on the "Import new services" button. You are then asked what kind of service you are adding. The DDBJ services are WSDL, so select "WSDL service" from the menu.
- A window will pop up asking for a web address.
- Enter the Blast web service address you just copied.
- Scroll down to the bottom of the Services list and look at the new DDBJ service that is now included; click on the icon next to the service.
Exercise 2: Adding a WSDL service
Exercise 2: Adding New Services
- Scroll down to the bottom of the Services list and look at the new DDBJ service that is now included; click on the icon next to the service.
There it is!
3. Finding services and adding them to workflows
Exercise 3 - Finding services and adding them to workflows
- We want to do a BLAST search using the DDBJ service that we just added to Taverna.
- First, we need to get a sequence to blast, and this sequence should be in FASTA format.
- Go to the Services panel.
- Type "fasta" into the search box at the top of the panel. You will see several services; the top one sounds promising!
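For readers who have not met FASTA before: a FASTA record is a header line beginning with ">" followed by one or more lines of sequence. The sketch below builds and roughly checks such a record. The identifier `example_id` and the repeated `ACGT` sequence are placeholders, not real data, and the function names are invented for this illustration.

```python
def format_fasta(identifier, sequence, width=60):
    """Build a FASTA record: a '>' header line, then the sequence wrapped to `width` columns."""
    lines = [">" + identifier]
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    return "\n".join(lines)


def looks_like_fasta(text):
    """Rough check: the first non-empty line starts with '>' and a sequence line follows."""
    lines = [line for line in text.splitlines() if line.strip()]
    return len(lines) >= 2 and lines[0].startswith(">")


# Placeholder identifier and sequence, for illustration only.
record = format_fasta("example_id", "ACGT" * 20)
print(record)
print(looks_like_fasta(record))
```

The sequence retrieval service chosen in this exercise returns text in this shape, which is what the Blast service expects as its query.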
Exercise 3 - Finding services and adding them to workflows
Right-click and add to workflow
Exercise 3 - Finding services and adding them to workflows
To see the inputs and outputs for a service, click the little button here
Exercise 3 - Finding services and adding them to workflows
- Now we need to provide it with an ID number for the sequence we are interested in.
- Right-click on the "id" box in the service diagram and choose "set constant value" (e.g. 163483).
Exercise 3 - Finding services and adding them to workflows
- Now we want to add the Blast service from DDBJ.
- Search for "blast", or simply scroll down to the DDBJ service "searchSimple" (execute blast).
Exercise 3 - Finding services and adding them to workflows
- Right-click and add it to the workflow.
Exercise 3 - Finding services and adding them to workflows
- Now we need to connect the output from the sequence retrieval to the input port of the blast.
- Right-click on "outputText" and choose "connect as input to", then "searchSimple", then "query".
- How do we know the correct input port was called "query"? In this case we can get the documentation from the DDBJ website by clicking on the searchSimple link.
- Unfortunately, not all services are so nicely documented!
- (See the BioMoby project for a solution to this problem: http://biomoby.org)
Documentation for the DDBJ service
Exercise 3 - Finding services and adding them to workflows
- The other two ports are used to name the database you want to blast against and the type of blast you want to do.
- Let's choose to execute a blastn against the Arabidopsis EST dataset.
- Right-click on "database" and choose "set constant value". Make the value est_atha.
- Right-click on "program", choose "set constant value", and make the value blastn.
Exercise 3 - Finding services and adding them to workflows
- Finally we need to do something with the output. Let's add an output bin to store it.
- Right-click on "Result" from the blast service and choose "connect as output to new workflow output port".
DONE!!
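The finished workflow is a small data-flow graph: a constant feeds the "id" port, the retrieved sequence feeds the "query" port, and two more constants feed "database" and "program". The sketch below mirrors that wiring in plain Python. Both functions are stubs written for this illustration. They do not call the real services, their return values are placeholders, and their names only echo the port and service names used in this tutorial.

```python
def get_nucleotide_fasta(id_value):
    """Stub for the sequence retrieval service: returns placeholder FASTA text."""
    return ">%s\nNNNNNNNN" % id_value


def search_simple(program, database, query):
    """Stub for the DDBJ searchSimple operation: returns a placeholder report string."""
    header = query.splitlines()[0]
    return "%s vs %s: %s" % (program, database, header)


def run_workflow(id_value="163483"):
    """Wire the stubs together the same way the links are drawn in the diagram."""
    # The constant value on the "id" port feeds the retrieval service.
    output_text = get_nucleotide_fasta(id_value)
    # "outputText" is connected to "query"; the other two ports are constants.
    result = search_simple(program="blastn", database="est_atha", query=output_text)
    # "Result" is connected to a new workflow output port.
    return {"Result": result}


print(run_workflow())
```

Reading the workflow this way can help when deciding which output should be connected to which input port.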
Running your workflow
- In the main menu bar there is a "play" button to run your workflow. Click it!
Running your workflow
- Services become grey as they finish executing. Once all services are grey, your workflow has completed.
Running your workflow
- To see the results, click on one of the result links on the left. If the workflow generated more than one result, there will be multiple links in that list.
Click here
Result here!
Let's make it more flexible!
- We don't want to have to reset the constant input value every time, so let's make it more flexible. Right-click on "id_value" and delete it.
- Now right-click on the Get_nucleotide_fasta "id" port and choose "connect with output from new workflow input port".
Let's make it more flexible!
- Let's say we had a lot of sequences we wanted to blast: make your input port into a list!
Your new, flexible workflow
Running the workflow
- This time, when you press play, you will be asked for input.
Running the workflow
- Click on "new value" for each GenBank GI number in your list.
Click here to begin
Done!
Three blast reports
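Supplying a list to the input port makes the workflow run once per item, which is why three inputs give three reports. A minimal sketch of that behaviour follows. `blast_one` is a stand-in for the whole single-sequence workflow and makes no real service call. Only 163483 comes from this tutorial; the other two identifiers are placeholders.

```python
def blast_one(gi_number):
    """Stand-in for the single-sequence workflow; returns a placeholder report."""
    return "blast report for " + gi_number


def run_on_list(gi_numbers):
    """Apply the single-sequence workflow to every item, keeping the input order."""
    return [blast_one(gi_number) for gi_number in gi_numbers]


# The second and third identifiers are placeholders, not real GI numbers.
reports = run_on_list(["163483", "placeholder_id_2", "placeholder_id_3"])
print(len(reports))
```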