Title: Bioinformatics part 2
1Bioinformatics part 2
- Jon Manning
- Bioinformatics, QMRI
2Contents
- My background and skills
- Example projects
- Data transformation (Ann)
- Perl hackery / Ensembl
- Structure / mutation studies
- Data integration
- New and exciting: Next Generation Sequencing
3Me!
- Background expertise
- Biochemistry
- Bioinformatics
- Sequence alignment analysis (PhD)
- Structural biology
- Perl hackery / database mining
- Web interactivity
- Microarray (recent)
4Projects
5Data Transformation
- How do I convert this machine output to be
readable by this software?
6Starting format
Annotation
Data
7Result
8Database Mining
- How can I download sequences for this big list of IDs?
- How can I look for this motif over the whole human genome? Can I allow for mismatches?
9Example: Zinc finger nucleases (ZFNs)
TCCAGTAGCGAT N4-6 GAAGCTCAGTTC
http://www.sigmaaldrich.com/life-science/functional-genomics-and-rnai/zinc-finger-nuclease-technology/learning-center/what-is-zfn.html
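The ZFN target above (two fixed half-sites separated by a spacer of 4 to 6 arbitrary bases) can be searched for with a plain sliding window. The following is a minimal sketch, not the method used in the project: the function names, the `max_mismatches` parameter, and the choice to count mismatches across both half-sites together are all assumptions made for illustration.

```python
LEFT_SITE = "TCCAGTAGCGAT"
RIGHT_SITE = "GAAGCTCAGTTC"


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_paired_sites(seq, left, right, spacer_range=(4, 6), max_mismatches=0):
    """Return (start, spacer_length, mismatches) for every left-N{spacer}-right hit.

    Assumption: max_mismatches is the total allowed over both half-sites.
    Only the given strand is scanned.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - len(left) + 1):
        mm_left = hamming(seq[start:start + len(left)], left)
        if mm_left > max_mismatches:
            continue  # left half-site already too different
        for gap in range(spacer_range[0], spacer_range[1] + 1):
            r_start = start + len(left) + gap
            r_end = r_start + len(right)
            if r_end > len(seq):
                break  # right half-site would run off the end
            total = mm_left + hamming(seq[r_start:r_end], right)
            if total <= max_mismatches:
                hits.append((start, gap, total))
    return hits


# Toy sequence with a 5-base spacer (made up for this example).
toy = "AA" + LEFT_SITE + "ACGTA" + RIGHT_SITE + "TT"
exact_hits = find_paired_sites(toy, LEFT_SITE, RIGHT_SITE)
```

To cover both strands, the same function could be run on the reverse complement of the sequence. For a whole genome, a per-chromosome loop over this function would work in principle but would be slow; an indexed aligner or pattern-matching tool is the more realistic choice at that scale.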
10Programmatic Tools
- Ensembl API
- Whole genome sequences
- Genes, transcripts, exons
- Variation (SNPs etc)
- Homology information
- Functional annotation
- BioPerl: automatic alignments, fetch sequences by ID etc.
- Entrez utils: access NCBI resources
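As an illustration of the "big list of IDs" problem, the sketch below builds batched request URLs for the NCBI E-utilities `efetch` endpoint. It is written in Python rather than Perl, it only constructs the URLs and does not contact NCBI, and the helper names, the batch size, and the example accessions are assumptions for illustration.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def batches(ids, size):
    """Yield successive slices of at most `size` IDs."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


def efetch_fasta_urls(ids, db="nucleotide", batch_size=200):
    """Build one efetch URL per batch, requesting FASTA as plain text.

    batch_size=200 is an arbitrary illustrative value, not an NCBI rule.
    """
    urls = []
    for chunk in batches(list(ids), batch_size):
        params = {
            "db": db,
            "id": ",".join(chunk),
            "rettype": "fasta",
            "retmode": "text",
        }
        urls.append(EFETCH_URL + "?" + urlencode(params))
    return urls


# Example accessions chosen only for illustration.
urls = efetch_fasta_urls(["NM_000546", "NM_007294", "NM_000059"], batch_size=2)
```

Each URL could then be passed to `urllib.request.urlopen` and the response written to a FASTA file. Any real script should respect the usage limits NCBI publishes for E-utilities.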
11Structure/ mutation / function
- What impact will making this change have on
function?
12Simple: ACE
- Reduced penetrance of malignant hypertension in Lewis vs Fischer rats.
- Localised to a QTL containing ACE
- Single amino acid change: leucine (Lewis) to phenylalanine (Fischer)
- Could this change have caused a detrimental change in activity?
13No.
14Bit more complex: PABP
- PABPC5 isn't transcriptionally active compared to PABPC1.
- Why doesn't a domain swap to replace the eIF4G binding site work?
- Action
- Model PABPC5 RRM1-2 based on available structure for PABPC1
- Examine changes at known important residues
- Examine inter-domain interactions
18Data Integration
- How can I compare results over these disparate data sets?
19Example: expression
- Different experimental techniques, with results keyed by different IDs
- Microarrays
- SAGE
- Sequencing
- Different values
- Counts from SAGE
- Intensities from microarrays
- Need to display and compare results
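One way to make counts and intensities comparable is sketched below: map each platform's IDs to a shared gene identifier, average the values per gene, and convert each data set to within-data-set ranks. This is an illustrative assumption, not necessarily the approach used in the project; the function names, the probe and tag IDs, the gene names, and the numbers are all made up.

```python
from collections import defaultdict


def to_gene_level(measurements, id_to_gene):
    """Average all measurements whose platform ID maps to the same gene."""
    grouped = defaultdict(list)
    for platform_id, value in measurements.items():
        gene = id_to_gene.get(platform_id)
        if gene is not None:  # drop IDs with no known mapping
            grouped[gene].append(value)
    return {gene: sum(vals) / len(vals) for gene, vals in grouped.items()}


def percentile_ranks(gene_values):
    """Rank genes from lowest to highest value, scaled to (0, 1].

    Ties are not handled specially in this sketch.
    """
    ordered = sorted(gene_values, key=gene_values.get)
    n = len(ordered)
    return {gene: (i + 1) / n for i, gene in enumerate(ordered)}


def integrate(datasets):
    """datasets: {name: (measurements, id_to_gene)} -> {gene: {name: rank}}.

    Only genes present in every data set are reported.
    """
    ranked = {
        name: percentile_ranks(to_gene_level(values, mapping))
        for name, (values, mapping) in datasets.items()
    }
    shared = set.intersection(*(set(r) for r in ranked.values()))
    return {gene: {name: ranked[name][gene] for name in ranked}
            for gene in sorted(shared)}


# Hypothetical toy data: microarray intensities and SAGE tag counts.
array = {"probe_1": 1200.0, "probe_2": 800.0, "probe_3": 150.0, "probe_4": 90.0}
array_map = {"probe_1": "GENE_A", "probe_2": "GENE_A",
             "probe_3": "GENE_B", "probe_4": "GENE_C"}
sage = {"TAG_X": 40, "TAG_Y": 3, "TAG_Z": 11}
sage_map = {"TAG_X": "GENE_A", "TAG_Y": "GENE_B", "TAG_Z": "GENE_D"}

combined = integrate({"microarray": (array, array_map),
                      "sage": (sage, sage_map)})
```

Ranks discard the magnitude of each measurement, which is the price paid for putting counts and intensities on one scale. The resulting per-gene table is the kind of structure a side-by-side display could be built from.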
22Next Generation Sequencing
23What is next-gen sequencing?
J. Shendure and H. Ji. Next-generation DNA sequencing. Nat Biotechnol, 26(10):1135-1145, Oct 2008.
24Applications
J. Shendure and H. Ji. Next-generation DNA sequencing. Nat Biotechnol, 26(10):1135-1145, Oct 2008.
25Bioinformatics Challenges
- New statistics toolbox required
- Platform-specific error models (e.g. homopolymers in 454)
- Tag frequency comparisons (differential expression)
- Alignment / assembly
- Short read lengths necessitate new methods
- Storage / access
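To make the homopolymer point concrete: 454 reads infer the length of a run of identical bases from signal intensity, so longer runs are more prone to insertion and deletion errors. A first step toward accounting for this might be simply flagging such runs in each read. The sketch below assumes a hypothetical `min_length` threshold and a made-up read; it is not a full error model.

```python
from itertools import groupby


def homopolymer_runs(read, min_length=3):
    """Return (start, base, length) for each run of identical bases
    that is at least min_length long."""
    runs = []
    position = 0
    for base, group in groupby(read.upper()):
        length = sum(1 for _ in group)
        if length >= min_length:
            runs.append((position, base, length))
        position += length
    return runs


# Made-up read for illustration.
flagged = homopolymer_runs("ACGTTTTTGAAAC", min_length=3)
```

Base calls inside the flagged regions could then be down-weighted or checked against other reads covering the same position.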
26Reporting
27Hypotheses
- Bioinformatics is often hypothesis generation
- This set of genes may be important for this phenotypic difference
- These mutations probably have this effect on structure, which might affect function this way
- Reduce experimental effort