Bioinformatics Workflows - PowerPoint PPT Presentation

1
Bioinformatics Workflows
  • Chris Wroe
  • (based on material from the myGrid team;
    May Tassabehji / Hannah Tipney,
    Medical Genetics, St Mary's)

2
Bioinformatics pipelines on the web
[Diagram: RepeatMasker → BLASTn → Twinscan]
  • Copying and pasting from one web-based
    application to the next; annotation by hand
  • Advantages: quick, easy access to distributed
    resources
  • Disadvantages: time-consuming, error-prone, tacit
    procedure, so difficult to share both protocol and
    results

3
Automating pipelines
  • Using Perl/Matlab scripts to implement a
    pipeline
  • Advantages: automation, quick to write,
    significant community resources (e.g. BioPerl)
  • Disadvantages: hard to explain, hard to relocate,
    hard to tinker with.

4
Workflows
[Diagram: Sequence in → RepeatMasker Web Service → BLASTn Web Service → Twinscan Web Service → Predicted genes out]
  • Simple scripting language aims to specify how
    steps of a pipeline link together
  • High level picture of the pipeline separated from
    any low level fiddling
  • Application logic and low level fiddling
    encapsulated in remote web services
  • Advantages: automation, quick to write, easier
    to explain, share, relocate, and record
    provenance of results in a standard way
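
The separation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Scufl or any myGrid API: the three step functions are hypothetical local stand-ins for the remote web services, their bodies are placeholders rather than real algorithms, and `run_workflow` is an invented name. The point is that the workflow itself is just an ordered list of step names, while the application logic lives elsewhere and provenance is recorded uniformly.

```python
# Hypothetical stand-ins for remote web services. Real services would be
# called over the network; these bodies are placeholders, not real algorithms.
def repeat_masker(sequence: str) -> str:
    # Placeholder: replace lowercase (soft-masked) bases with "N".
    return "".join("N" if base.islower() else base for base in sequence)


def blastn(sequence: str) -> str:
    # Placeholder: wrap the input to show that this step ran.
    return f"blastn({sequence})"


def twinscan(blast_report: str) -> str:
    # Placeholder: wrap the input to show that this step ran.
    return f"genes[{blast_report}]"


# Low-level logic is registered by name, separately from the workflow.
SERVICES = {
    "RepeatMasker": repeat_masker,
    "BLASTn": blastn,
    "Twinscan": twinscan,
}

# The high-level picture: only step names and their order.
WORKFLOW = ["RepeatMasker", "BLASTn", "Twinscan"]


def run_workflow(workflow, services, data):
    """Run each step in order, feeding its output to the next step,
    and record what went in and out of every step (provenance)."""
    provenance = []
    for step in workflow:
        result = services[step](data)
        provenance.append({"step": step, "input": data, "output": result})
        data = result
    return data, provenance


result, provenance = run_workflow(WORKFLOW, SERVICES, "ACGTacgtACGT")
for record in provenance:
    print(record["step"], "->", record["output"])
```

Swapping a step or relocating the pipeline then means editing `WORKFLOW` or `SERVICES`, not the engine.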

5
Workflow components in myGrid
  • Scufl: Simple Conceptual Unified Flow Language
  • Developed by myGrid members at EBI.
  • Designed to be as simple as possible, with just enough
    features to support bioinformatics workflows
  • Taverna: a tool for writing and running workflows
    and examining results
    (http://taverna.sourceforge.net)
  • FreeFluo: workflow engine to run workflows
    (http://freefluo.sourceforge.net)

6
(No Transcript)
7
Workflow use
  • Newcastle University (Anil Wipat, Peter Li)
  • Affymetrix Microarray Analysis Workflow
  • Gene annotation workflow
  • Manchester University: May Tassabehji, PhD
    student Hannah Tipney, Medical Genetics, St Mary's
    (Wellcome Trust funded)
  • Gene alerting service workflow (GAS)
  • Gene and protein annotation workflow
  • And others

8
Workflow experience
  • Easy to get started with Taverna (1-2 hour
    tutorial)
  • Sharing does happen
  • Cuts down the time taken to perform one pipeline
    from 2 weeks to 2 hours

9
Workflow experience: outstanding issues
  • Early days: web services are rare, and significant time
    is taken to wrap applications as web services
    (licensing, installation, maintenance)
  • Soaplab and Gowlab try to help
    (http://industry.ebi.ac.uk/soaplab)
  • Fiddly bits don't go away: many shim services
    needed to ensure the output of one step fits the
    expected input of another
  • Automation produces many results in a short
    amount of time, raising issues of result management
    and display
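
A shim of the kind mentioned above can be very small. The sketch below is a hypothetical example, not taken from myGrid: it assumes one step emits FASTA-formatted text while the next accepts only a raw sequence string, and the function names `fasta_to_raw`, `upstream_step`, and `downstream_step` are invented for illustration.

```python
def fasta_to_raw(fasta_text: str) -> str:
    """Shim: drop FASTA header lines (starting with '>') and blank lines,
    then join the remaining sequence lines into one string."""
    lines = [line.strip() for line in fasta_text.splitlines()]
    return "".join(line for line in lines if line and not line.startswith(">"))


def upstream_step() -> str:
    # Hypothetical step whose output is FASTA-formatted text.
    return ">seq1 example\nACGT\nTTGA\n"


def downstream_step(raw_sequence: str) -> int:
    # Hypothetical step that accepts only a raw sequence (letters only)
    # and, as a placeholder for real work, returns its length.
    if not raw_sequence.isalpha():
        raise ValueError("expected a raw sequence with no header or whitespace")
    return len(raw_sequence)


# Without the shim, downstream_step(upstream_step()) raises ValueError.
length = downstream_step(fasta_to_raw(upstream_step()))
print(length)
```

Each such adapter is trivial on its own; the issue raised on the slide is that a realistic workflow needs many of them.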

10
Other workflow systems
  • Commercial: bioinformatics, drug discovery
  • Incogen VIBE
  • TurboWorx, Pipeline Pilot
  • eScience
  • DiscoveryNet (bioinformatics, proprietary)
  • Kepler (US, ecology)
  • Triana (UK, physics: astronomy, signal processing)

11
Workflow standards
  • Can't have enough of them! All currently come
    from the e-Business rather than the science community
  • BPEL: Business Process Execution Language
  • WS Orchestration
  • XML Process Definition Language (XPDL)
  • Business Process Markup Language (BPML)