Future Challenges in Bioinformatics

About This Presentation

Slides: 38

Provided by: rutherfor9

Transcript and Presenter's Notes

Title: Future Challenges in Bioinformatics

1
Future Challenges in Bioinformatics
2
Introduction

Introduction: How RRX got involved
Life sciences context: How bioinformatics came to be important
The past half century: How bioinformatics has evolved

3
Introduction

Categories of Bioinformatics Tools
Why We Need Supercomputers
Software Development Issues
Future Challenges
Tools for Biotech Projects
Summary

4
How RRX got involved

Submitted a Canada Foundation for Innovation
(CFI) proposal for Advanced Bioinformatics
Collaborative Computing (ABioCC)

5
How RRX got involved

Developed an SVG-based visualization front end
Paper will be presented at SVG Open 2003 in
Vancouver on July 17th

6
How bioinformatics came to be important

After the structure of DNA was reverse engineered
with X-ray diffraction in 1953, focus shifted to
nucleic acid sequence analysis
DNA/RNA/protein sequence data accumulated, with
computer programs used for storage and analysis

7
How bioinformatics came to be important

Bioinformatics algorithms in development for the
last half century came into widespread use by
researchers
The ability to compare sequences created a
homology context for unknown sequences of
interest, leading to advances
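
As a rough illustration of how comparing sequences gives an unknown sequence a homology context, the sketch below scores a query against a few known sequences with a basic Needleman-Wunsch global alignment and reports the closest one. The scoring values (match +1, mismatch -1, gap -2), the function names, and the toy sequences are assumptions made for this example; production tools such as BLAST and FASTA rely on heuristics and substitution matrices instead.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score (score only, no traceback)."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def best_match(query: str, known: dict[str, str]) -> str:
    """Return the name of the known sequence that scores highest against query."""
    return max(known, key=lambda name: global_alignment_score(query, known[name]))


# Hypothetical toy data: names and sequences are illustrative only.
known_sequences = {"seqA": "GATTACA", "seqB": "CCCCCCC"}
closest = best_match("GATTTACA", known_sequences)
```

Here `closest` names the known sequence most similar to the query, which is the starting point for inferring function by homology.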

8
How bioinformatics came to be important

Improved sequencing technology enabled the
complete deciphering of the human genome >>> 1999
About 3.18 billion base pairs
Celera used 300 PE Biosystems ABI Prism 3700 DNA
Analysers

9
How bioinformatics has evolved

Central dogma of molecular biology
DNA sequences are transcribed into mRNA
sequences, mRNA sequences are translated into
protein sequences, which fold in 3D, creating
structures whose functions are statistically
selected for survival >>> affecting the prevalence
of the underlying DNA sequences in a population
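
The first two steps of this flow can be sketched in a few lines of Python. This is a simplified illustration, not a full model: it assumes the input is the coding strand (so transcription is just T to U), uses the standard genetic code, reads from the first base, and stops at the first stop codon. The function names are made up for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna: str) -> str:
    """Turn a coding-strand DNA sequence into mRNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
protein = translate(mrna)
```

Folding and selection, the later steps on this slide, are far harder to compute and are not attempted here.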

10
How bioinformatics has evolved

This created a supporting information flow
Organization and control of genes in the DNA
sequence
Identification of transcriptional units in the
DNA sequence
Prediction of protein structure from sequence
Analysis of molecular function

11
How bioinformatics has evolved

Another covariant information flow was created
based on the scientific method
Create hypothesis with respect to biological activity
Design experiments to test the hypothesis
Evaluate resulting data for compatibility with
the hypothesis
Extend/modify hypothesis in response

12
How bioinformatics has evolved

IT used to handle the explosion of data from
high-throughput techniques, too complex for manual
analysis
X-ray diffraction

13
How bioinformatics has evolved

Automated DNA sequencing
Amersham Biosciences
Applied Biosystems
Beckman Coulter
LI-COR
SpectruMedix Corp.
Visible Genetics Corp.

14
How bioinformatics has evolved

Microarray expression analysis

15
How bioinformatics has evolved

Rapid emergence of 3D macromolecular structure
databases
New subdiscipline: structural bioinformatics
Atomic and subcellular spatial scales
Representation/physics
Storage/retrieval/source data correlation/interpretation
Analysis/simulation
Display/visualization

16
How bioinformatics has evolved
17
Categories of Bioinformatics Tools

Databases >>> search/compare
Sequence Analysis - Clusters
Genomics
Phylogenetics
Structure Prediction
Molecular Modelling
Microarrays
Packages, Misc Apps, Graphics, Scripts

18
Categories of Bioinformatics Tools

Database >>> search/compare

aceperl
BLAST
Blastall
Blastpgp
BLAT
Blimps
Entrez
FASTA
fastacmd
formatdb
getz

HMMER
IMPALA
InterProScan
PHI-BLAST
ProSearch
PSI-BLAST
PSI-BLASTN
Sequin
Swat
tace
xace

19
Sequence Analysis

Artemis
Bl2seq
BLAST
Clustal W, X
consed/autofinish
Cross_match
Dotter
EMBOSS
FASTA
Glimmer
HMMER
InterProScan
MEME
View

Paracel Transcript Assem
Phrap
Phred
Primers
ProSearch
Readseq2
Rnabob
RRTree
SAPS
seals
Seqsblast
STADEN
Swat
T-Coffee

20
Genomics

Calc_primers
Cross_match
FPC
GENSCAN
Glimmer
Image
Mzef
Phrap

Phred
STADEN
Swat
tace
tace_celegans
tRNAscan-SE
xace
xace_celegans

21
Phylogenetics

Clustal W
Clustal X
MOLPHY
MrBayes
PHYLIP

RRTree
T-Coffee
TREE-PUZZLE
TreeViewX

22
Structure Prediction

EMBOSS
MEME
Modeller
Mzef
PHI-BLAST

23
Molecular Modelling

Modeller
homology modeling from an alignment of a sequence
to be modeled with known related structures
Rasmol
a molecular graphics program intended for 3D
visualisation of proteins and nucleic acids
Raster3D (publishing images)
X3DNA
analyzing and rebuilding 3D structures

24
Microarrays

Dapple
a program for quantitating spots on a two-colour
DNA microarray image.
OligoArray
a program that computes gene specific
oligonucleotides that are free of secondary
structure for genome-scale oligonucleotide
microarray construction.

25
Packages, Useful Scripts/Source Code, Graphics,
PERL

BioPerl
BioJava
boxshade
mvscf
seg
Split_fasta

POV-Ray
Raster3D
MOLPHY

26
Why We Need Supercomputers

Some commercial packages run on supercomputers
Accelrys modeling and simulation
Materials Studio
Cerius2 (SGI Unix only)
Homology modeling to catalyst design
Insight II (SGI Unix only)
3D graphical environment for physics based
molecular modeling
Catalyst (high end Unix servers)
database management valuable in drug discovery
research
QUANTA (high end Unix servers)
crystallographic 2D/3D protein structure solution
Discovery Studio

27
Why We Need Supercomputers

Supercomputer advantages
Multiple processors
Large shared memory
Handle very large files
Large/fast RAID arrays
Terabyte tape backup systems
Power backup systems
High performance networks

28
Why We Need Supercomputers

Common bioinformatics requirements
Computationally intensive tasks
Large memory models
Intensive/complex database searches
Large experimental database sets
Large derived database sets
Large persistent intermediate data structures
Teamwork data sharing and visualization
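
To give a feel for the "large memory models" requirement, the arithmetic below estimates the raw size of a genome of about 3.18 billion base pairs (the figure quoted earlier in this presentation) under two encodings. The encodings themselves (8 bits per base as plain text, 2 bits per base when packed) and the function name are assumptions for illustration; indexes, annotations, and intermediate structures would add to these numbers.

```python
def genome_storage_bytes(base_pairs: int, bits_per_base: int) -> int:
    """Bytes needed to hold a sequence, rounded up to a whole byte."""
    return (base_pairs * bits_per_base + 7) // 8


HUMAN_GENOME_BP = 3_180_000_000  # figure quoted in this presentation

as_text = genome_storage_bytes(HUMAN_GENOME_BP, 8)    # one character per base
as_packed = genome_storage_bytes(HUMAN_GENOME_BP, 2)  # A, C, G, T in 2 bits

print(f"plain text: {as_text / 1e9:.2f} GB")
print(f"2-bit packed: {as_packed / 1e9:.3f} GB")
```

Even before any analysis structures are built, one character per base for a single genome already takes roughly 3 GB, close to the 4 GiB limit of a 32-bit address space.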

29
Why We Need Supercomputers

Network requirements
Driving gigE/10gigE NICs
Moving large files/data sets rapidly
Visualization streams/Access GRID
Coordinating Cluster/GRID computing
Dynamic provisioning of light paths

30
Why We Need Supercomputers
31
Why We Need Supercomputers

32
Software Development Issues

Collaboration contexts/barriers
Teamwork collaboration spaces
Standards development: DTDs
Integration issues
experimental data to homology to 3D model
platform issues
network issues: 9k MTU (jumbo frames)
Licensing issues: public vs. private
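
The jumbo-frame point can be made concrete with some arithmetic: a larger MTU means fewer frames, and so less per-frame overhead, for the same data set. The sketch below assumes a 9000-byte jumbo MTU, a standard 1500-byte Ethernet MTU, and 40 bytes of IPv4 plus TCP headers per frame with no options. These values and the function name are illustrative assumptions, not measurements from the project.

```python
IP_TCP_HEADER_BYTES = 40  # assumed: 20-byte IPv4 header + 20-byte TCP header


def frames_needed(payload_bytes: int, mtu: int) -> int:
    """Number of frames required to carry payload_bytes at a given MTU."""
    per_frame = mtu - IP_TCP_HEADER_BYTES  # usable data bytes per frame
    return -(-payload_bytes // per_frame)  # ceiling division


data_set = 1_000_000_000  # a hypothetical 1 GB file
standard = frames_needed(data_set, 1500)
jumbo = frames_needed(data_set, 9000)

print(f"1500-byte MTU: {standard} frames")
print(f"9000-byte MTU: {jumbo} frames")
```

Under these assumptions the jumbo-frame transfer needs roughly one sixth as many frames, which reduces per-packet processing on the sending and receiving hosts.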

33
Future Challenges

Creating developer infrastructure for building up
structural models from component parts
components from macromolecule libraries ported to
object models
Understanding the design principles of systems of
macromolecules and harnessing them to create new
functions
specialized molecular machines

34
Future Challenges

Learning to design drugs efficiently and
cost-effectively based on knowledge of the target
target generation automation
validation automation
Development of enhanced simulation models that
give insight into context based function from
knowledge of structure
possible use of artificial intelligence to limit
scope of search

35
How Tools might be used for Industry Biotech
Projects
36
Summary