Title: Close-to-the-experiment Data analysis
1Close-to-the-experiment Data analysis
Dr. Werner Van Belle werner.van.belle_at_gmail.com,
werner_at_onlinux.be
2Bridging The Gap
- Core of most biological research is experimental
Interpretation
Data Reduction
Experiments
3Bridging The Gap
- Core of most biological research is experimental
Interpretation
Data Analysis
Experiments
4Contents
- Part 1. 2DE Gel Correlation Analysis
- Part 2. Maldi Tof Artefacts / Denoising
- Part 3. Accuracy Analysis of a Micro-Array Experiment
- Part 4. Protein Interaction Map Integration
5Part 1. 2DE Gel Analysis
Werner Van Belle werner.van.belle_at_gmail.com,
werner_at_onlinux.be
In cooperation with Bjørn Tore Gjertsen, Nina Ånensen, Ingvild Haaland, Gry Sjøholt, Kjell-Arild Høgda
62D Gels
Patient 2, Age 46
Patient 1, Age 57
Courtesy Gry Sjøholt, Nina Ånensen, Bjørn Tore Gjertsen
7Initial Problem
- The question we were asked
- Is there a relation between various parameters of AML/ALL cancer patients and their P53 biosignatures / isoforms?
- Gels
- 97 gel images of different patients
- Biological Parameters
- FAB Classification (AML/ALL), AML Class, Flt3 (WT/ITD)
- Resistance AML, Resistance ALL, Survival AML, Survival ALL
- BCL2, Stat5 GMCSF, Stat3 IL3, Stat1 Ifng, CD4, C34
8Standard Solution
- Detect Spots, Measure Spot Volumes, Compare
- Non Trivial Solution
- Spot identity unknown, often no calibration spots
- Manual interpretation is dangerous: shifts of spots are difficult to interpret
- Some PTMs influence spot positioning, complicating the matter
Complicated method, tedious work, lousy results
9Manual Comparison
102D Gel Analysis
- Step 1: Image registration (rotate, scale, translate)
11Different Operations
- Scaling (Zoom)
- Rotation
- Translation
Calibration spots, Antibody spots, Manual annotation
Image registration techniques, Geocoding, Landmark tracking, Standard spot detection
12Pairwise Image Alignment
13Noise hinders pairwise alignment
14 1. Artefacts in 2D gels
Camera Noise
Gels have been altered to respect NDA
152. Artefacts in 2D Gels
Camera Warping
163. Artefacts in 2D gels
Inconsistencies over different machines
KODAK Image Station
Typhoon Image Station
Courtesy Gry Sjøholt, Nina Ånensen, Bjørn Tore Gjertsen
174. Artefacts in 2D gels
Underexpressed tails
Gels have been altered to respect NDA
185. Artefacts in 2D gels
Non linearity
Gels have been altered to respect NDA
196. Artefacts in 2D gels
Washing
Gels have been altered to respect NDA
207. Artefacts in 2D gels
Drying
Gels have been altered to respect NDA
218. Artefacts in 2D gels
Dots
Gels have been altered to respect NDA
229. Artefacts in 2D gels
Poor XY resolution
Poor grey-value resolution
Gels have been altered to respect NDA
2310. Artefacts in 2D gels
Clipping
Gels have been altered to respect NDA
2411. Artefacts in 2D gels
Unclean Lenses
25Denoising 2DE gels
Before
After
26Denoising I
Input Image
27Denoising II
Inverted Image
28Denoising III
- Calculate Background Variations
29Denoising IV
- Remove background variation
Original
Divide original by background variation image
Denoised
30Denoising V
Clip everything > 1
31Denoised result
Median Filtering
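The denoising steps on these slides (estimate the background variation, divide the original by it, clip everything above 1, median filter) can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say how the background variation is computed, so a wide-window median is used as a stand-in, the inversion step is left out because its role in the division is not described, and the names `denoise_gel`, `background_radius` and `median_radius` are hypothetical. Pixel values are assumed to lie in (0, 1] with a bright background and dark spots.

```python
from statistics import median

def neighbourhood(img, r, c, radius):
    """Pixel values in the square window around (r, c), cut off at the image border."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))]

def median_filter(img, radius):
    """Replace every pixel by the median of its neighbourhood."""
    return [[median(neighbourhood(img, r, c, radius)) for c in range(len(img[0]))]
            for r in range(len(img))]

def denoise_gel(img, background_radius=4, median_radius=1):
    # Assumption: the background variation is approximated by a wide-window median.
    background = median_filter(img, background_radius)
    # Divide the original by the background image and clip everything > 1.
    ratio = [[min(1.0, p / b) if b > 0 else 1.0
              for p, b in zip(img_row, bg_row)]
             for img_row, bg_row in zip(img, background)]
    # A small median filter removes isolated dots.
    return median_filter(ratio, median_radius)

# Toy gel: uniform background of 0.8 with a dark 3 x 3 spot in the centre.
gel = [[0.8] * 9 for _ in range(9)]
for r in range(3, 6):
    for c in range(3, 6):
        gel[r][c] = 0.2
clean = denoise_gel(gel)
```

Because every pixel is divided by its local background, the same gel scanned at half the brightness gives the same result, which is the property that makes later pairwise comparison feasible.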
32Original
33Denoising enables pairwise alignment
34Outline
- Correlation analysis
- Requires multiple aligned gels
- Multiple gel alignment based on pairwise alignment
- Pairwise alignment difficult due to many artefacts
- Developed denoising algorithm
- Pairwise alignment possible
- How to align multiple gels ?
35Cumulative Superposition
- Idea
- take first gel, superimpose second gel
- take third gel, superimpose on projection of previous gels
- repeat process for all gels
This does not work: we merely find a suitable superposition to reflect the first images.
36Cumulative Superposition
Final Overlay Image
Initial 2DE Gel Image
37Cumulative Superposition
Final Overlay Image
Initial 2DE Gel Image
38Multi Gel Alignment
- 1- align all image pairs -> X.X alignments
- 2- find an optimal (x,y) position that minimizes the overall alignment error
100 images at 1024 x 1024
65011712 operations per cross correlation
5000 cross correlations
325058560000 operations in total (325·10^9 FLOP)
theoretical: 2.7 hours, practical: 3 days
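The second step, finding one (x, y) position per gel that minimizes the overall alignment error, can be read as a least-squares problem over the pairwise offsets. The sketch below is one possible formulation, not necessarily the one used for these slides: it assumes each pairwise alignment yields a translation only, fixes image 0 as the reference, and solves by repeated averaging (Gauss-Seidel sweeps). The names `solve_positions`, `offsets` and `sweeps` are hypothetical.

```python
def solve_positions(n_images, offsets, sweeps=200):
    """Least-squares positions from pairwise shifts.

    offsets[(i, j)] = (dx, dy) is the measured shift of image j relative to image i.
    Minimizes the sum over all pairs of |pos[j] - pos[i] - offset|^2 with pos[0] = (0, 0).
    """
    pos = [(0.0, 0.0)] * n_images
    for _ in range(sweeps):
        for k in range(1, n_images):              # image 0 stays fixed as reference
            votes = []
            for (i, j), (dx, dy) in offsets.items():
                if j == k:                        # pair says: k sits at pos[i] + offset
                    votes.append((pos[i][0] + dx, pos[i][1] + dy))
                elif i == k:                      # pair says: k sits at pos[j] - offset
                    votes.append((pos[j][0] - dx, pos[j][1] - dy))
            if votes:                             # move k to the average of all votes
                pos[k] = (sum(v[0] for v in votes) / len(votes),
                          sum(v[1] for v in votes) / len(votes))
    return pos

# Three gels with mutually consistent pairwise shifts.
pairwise = {(0, 1): (3.0, -1.0), (0, 2): (5.0, 4.0), (1, 2): (2.0, 5.0)}
positions = solve_positions(3, pairwise)
```

With noisy, mutually inconsistent offsets the same routine settles on a compromise position for every gel, which is what separates this approach from superimposing gels one after another.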
392D Gel Overlays
- Superposition of all images
Mother image
402D Gel Overlays
Reflects Known Protein Isoforms
41Step 1 Alignment
422D Gel Analysis
- Step 2 Intensity Normalization
43Background Differences
44Background Differences
45Step 2a Background Intensity
46Contrast
47Contrast
48Step 2b Intensity Normalization
492DE Gel Analysis
50Step 3 Correlation
51Step 3 Correlation
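The slides do not spell out how the correlation step is computed. One plausible reading, given aligned and intensity-normalized gels, is a Pearson correlation per pixel position between the intensities across patients and a biological parameter such as age. The sketch below assumes exactly that; `correlation_image`, `gels` and `parameter` are hypothetical names, and the actual method may differ.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def correlation_image(gels, parameter):
    """gels: list of equally sized 2D images (one per patient), already aligned.
    parameter: one value per patient. Returns one correlation value per pixel."""
    rows, cols = len(gels[0]), len(gels[0][0])
    return [[pearson([gel[r][c] for gel in gels], parameter) for c in range(cols)]
            for r in range(rows)]

# Toy data: three 1 x 2 gels; the first pixel grows with age, the second is constant.
gels = [[[1.0, 5.0]], [[2.0, 5.0]], [[3.0, 5.0]]]
ages = [40, 50, 60]
corr = correlation_image(gels, ages)
```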
52Initial Problem
- Is there a relation between various parameters of
AML/ALL cancer patients and their P53 isoforms ?
53P53 Biosignatures vs Age
542D Gel Analysis
55Step 4 Masking
56Step 4a Significance
57Significance Mask
58Step 4b Variance
59Variance Mask
60Step 4c Overall Mask
61Overall Mask
62P53 Biosignature vs Age
63Step 5 3D Visualization
64Step 5 3D Visualization
65Resource Usage
- 132 Parameters, 13 correlation sets, 128 images
- Creating the fine-tuned overlay alignment: 72h
- Computing all the correlations: 85.55h, which produced 5.8 Gb of raw data.
- Rendering of the movies: 5 hours per movie, with 1416 images: 7080h
66Part 2. Maldi-TOF Artefacts
Werner Van Belle werner.van.belle_at_gmail.com,
werner_at_onlinux.be
In cooperation with Olav Mjaavatten, Kari Espolin Fladmark, Stijn Ove Døskeland
67MALDI
Laser
Protein
Digestion (Trypsin)
Sublimation
Crystallisation
Matrix absorbing wavelength x
Charged particles
Time of flight
Protein fingerprinting
Detection
Mass spectrum
Peaklist
Sequencing
Diagonalisation
68MALDI
69Artefacts I
Mass spectrum output
Frequency Analysis
70Artefacts II
Mass spectrum output
Frequency Analysis
71Artefacts III
1 Shot
10 Shots
Mass spectrum output
Frequency Analysis
72Artefacts III
100 Shots
1000 Shots
Mass spectrum output
Frequency Analysis
73Denoising method
- Multi-rate spectral analysis: excellent tool for event detection
74Denoising method
Initialisation
Haar wavelet Decomposition
75Denoising method
Normalisation, Event Enhancement
Wavelet Composition, Saving
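The pipeline named on these two slides (Haar wavelet decomposition, a processing step on the coefficients, wavelet composition) can be illustrated with a small sketch. The slides do not describe what the normalisation and event enhancement steps do, so the sketch swaps in plain hard thresholding of the detail coefficients, a standard wavelet denoising technique, as a stand-in. The function names and the `threshold` parameter are hypothetical, and the signal length is assumed to be a power of two.

```python
def haar_decompose(signal):
    """Full Haar decomposition. Returns (overall mean, detail levels, coarsest first)."""
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / 2 for a, b in pairs])   # local differences
        approx = [(a + b) / 2 for a, b in pairs]          # local averages
    return approx[0], details[::-1]

def haar_compose(mean, details):
    """Inverse of haar_decompose."""
    approx = [mean]
    for level in details:                                 # coarsest to finest
        approx = [v for a, d in zip(approx, level) for v in (a + d, a - d)]
    return approx

def wavelet_denoise(signal, threshold):
    """Stand-in for the enhancement step: drop detail coefficients below threshold."""
    mean, details = haar_decompose(signal)
    kept = [[d if abs(d) >= threshold else 0.0 for d in level] for level in details]
    return haar_compose(mean, kept)

# A peak at positions 4-5 on top of a fast ripple.
spectrum = [1.0, 0.0, 1.0, 0.0, 9.0, 8.0, 1.0, 0.0]
smoothed = wavelet_denoise(spectrum, threshold=1.0)
# smoothed == [0.5, 0.5, 0.5, 0.5, 8.5, 8.5, 0.5, 0.5]
```

The ripple lives entirely in the finest detail level and is removed, while the peak survives because it produces large coefficients at coarser levels.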
76Wavelet Enhancement Global
Original
Denoised
Frequency Analysis
Mass Spectrum
77Wavelet Enhancement Local
Original
Denoised
Frequency Analysis
Mass Spectrum
78Part 3. Micro-Array Accuracy Analysis
Werner Van Belle werner.van.belle _at_ gmail.com,
werner _at_ onlinux.be In cooperation with Nancy
Gerits, Ugo Moens, Halvor Grønaas, Lotte Olsen,
Ruth Paulssen
79Intensity Distribution
80Intensity Distribution
Gating ? Critical Mass ?
81Cy5 ? Cy3
82Measurement Accuracy
Green
Relative Error
PD of error at distance z
PD of error at distance y
Red
PD of error at distance x
83Intensity Dependent Error Distribution
84Intensity Dependent Error Distribution
In 95% of the cases the measurement error will fall within [-4936, 4936]
85Confidence Interval for 1 Spot
In 95% of the cases, the actual value will range within [-4936, 4936] of the measurement
86Multiple Spots
- Multiple measurements lead to better estimates /
smaller confidence intervals
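How much smaller the interval gets can be illustrated under a common simplifying assumption that is not stated on the slides: if the errors of the individual spots are independent and equally distributed, the half-width of the confidence interval for their mean shrinks with the square root of the number of spots. The function name below is hypothetical; 4936 is the single-spot half-width quoted on the previous slide.

```python
from math import sqrt

def mean_confidence_halfwidth(single_spot_halfwidth, n_spots):
    """Half-width of the confidence interval for the mean of n independent spots.

    Assumes independent, identically distributed measurement errors.
    """
    if n_spots < 1:
        raise ValueError("need at least one spot")
    return single_spot_halfwidth / sqrt(n_spots)

one_spot = mean_confidence_halfwidth(4936, 1)    # 4936.0
four_spots = mean_confidence_halfwidth(4936, 4)  # 2468.0
```

Under this assumption, quadrupling the number of replicate spots halves the uncertainty. Since the slides show an intensity-dependent error distribution, the real gain will depend on the intensity of the spots involved.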
87Reported Regulations
88Omitted spots too close to error
89Part 4. Protein Interaction Map Integration
Werner Van Belle werner.van.belle_at_gmail.com,
werner_at_onlinux.be In cooperation with Nancy
Gerits, Ugo Moens
90Gene Expression
91Influenced by/Influences
- MK5 -> Multiple changes in gene expression
- 27000 gene expressions measured
- Those that change will very likely influence
other proteins
Which proteins are likely influenced by our
measured up/down regulations ?
92The 'Involved' Game
- Protein change will influence nearby proteins,
which in turn ...
1.0
0.8
0.6
0.6
93The 'Involved' Game
- Multiple protein changes will all influence their neighbors as well.
1.0
1.6
1.4
1.0
94The 'Involved' Game
- This network is iterated a number of times to expand the sphere of influence of all the altered gene expressions.
- Affected proteins will have higher numbers
- Protein interaction is a key mechanism for signal transduction
- Protein Interaction Network as published by Jean-François Rual et al., "Towards a proteome-scale map of the human protein-protein interaction network", Nature 2005, vol. 437, p. 1173-1178
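The numbers shown on the two preceding slides (1.0, 0.8, 0.6, 0.6 for one altered protein; 1.0, 1.6, 1.4, 1.0 for two) are consistent with each altered protein losing 0.2 of its influence per interaction hop, with contributions from several altered proteins adding up. That reading is an assumption, as are the graph layouts used below and the choice to leave the measured proteins at their own value. The sketch uses hop distance in place of explicit iteration; `involvement`, `decay` and `max_hops` are hypothetical names.

```python
from collections import deque

def involvement(neighbours, sources, decay=0.2, max_hops=4):
    """neighbours: dict node -> list of interacting nodes (undirected map).
    sources: dict node -> initial involvement of a measured, altered protein.
    Returns dict node -> accumulated involvement score."""
    score = {node: 0.0 for node in neighbours}
    for source, strength in sources.items():
        hops_to = {source: 0}
        queue = deque([source])
        while queue:                                  # breadth-first search
            node = queue.popleft()
            if hops_to[node] == max_hops:             # limit the sphere of influence
                continue
            for nxt in neighbours[node]:
                if nxt not in hops_to:
                    hops_to[nxt] = hops_to[node] + 1
                    queue.append(nxt)
        for node, hops in hops_to.items():
            # Assumption: measured proteins keep their own value only.
            if node == source or node not in sources:
                score[node] += max(0.0, strength - decay * hops)
    return score

# One altered protein S with neighbour X, which interacts with Y and Z.
single = involvement({"S": ["X"], "X": ["S", "Y", "Z"], "Y": ["X"], "Z": ["X"]},
                     {"S": 1.0})

# Two altered proteins A and B; M touches both, N touches A and M.
double = involvement({"A": ["M", "N"], "B": ["M"], "M": ["A", "B", "N"], "N": ["A", "M"]},
                     {"A": 1.0, "B": 1.0})
```

Ranking the unmeasured proteins by this score gives a list of the kind shown on the "Involved Proteins by Rank" slides.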
95Involved Proteins by Rank
96Involved Proteins by Rank
97Involved Proteins by Rank
98Involved Proteins Network
99Involved Proteins Network
- Red: Highest involvement, Blue: Lowest involvement
- Based on our lowest estimates for up/down regulation
- Based on the high confidence set of protein interactions
- Measured gene expressions are not listed
100Involved Proteins Network
101Involved Protein Network
102Involved Protein Network
103Involved Protein Network
104Credits
- 2DE Gel Imaging and Patient sampling
- Nina Ånensen, Bjørn Tore Gjertsen, Ingvild Haaland, Øystein Bruserud, Gry Sjøholt
- Maldi TOF Mass Spectra
- Olav Mjaavatten, Kari Espolin Fladmark, Stijn Ove Døskeland
- Micro-arrays
- Nancy Gerits, Ugo Moens, Halvor Grønaas, Lotte Olsen, Ruth Paulssen