Data Analysis for Mouse MALDI Spectrum Data

Provided by: kerch
Transcript and Presenter's Notes

1
Data Analysis for Mouse MALDI Spectrum Data
  • Xuelian Wei
  • Department of Statistics

2
Outline
  • Data sets analyzed in this presentation
  • C18 Q4 Low (73 spectra + 3 control spectra; 41
    samples + control).
  • C18 Q9 Low (84 spectra + 10 control spectra; 43
    samples + control).
  • C18 S4 Low (78 spectra + 8 control spectra; 40
    samples + control).
  • WCX Q4 Low (76 spectra + 10 control spectra; 42
    samples + control).
  • WCX Q9 High (85 spectra + 9 control spectra; 43
    samples + control).
  • Protein modification detection
  • Two lipid modifications of proteins, 138.10 and
    120.09 Da.
  • Oxidation modification, 16 Da.
  • Correlation analysis with phenotype data
  • 23 phenotypes (ApoA1 and ApoA2 were excluded
    because of too many missing values).

3
Pre-processing
  • Adjusted Intensity table for all spectra and all
    peaks.
  • Each row represents a peak.
  • Each column represents a spectrum.
  • Adjusted intensity table for all samples and all
    peak clusters (used for later analysis).
  • Each row represents a peak cluster.
  • Each column represents a sample.
  • Each cell is the sum of the average adjusted
    intensities in that peak cluster.
  • The minimum MZ in a cluster represents the MZ of
    that cluster.
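
The collapse from the peak table to the peak-cluster table can be sketched as follows. This is only an illustration under assumptions: the input layout (`peak_mz`, `peak_cluster`, `spectrum_sample`, `intensity`) and the function name are hypothetical, and a peak missing from a spectrum is assumed to be stored as 0.

```python
from collections import defaultdict

def cluster_table(peak_mz, peak_cluster, spectrum_sample, intensity):
    """Collapse a peak-by-spectrum table into a cluster-by-sample table.

    peak_mz[i], peak_cluster[i]: MZ and cluster id of peak row i (hypothetical layout).
    spectrum_sample[j]: sample that spectrum column j belongs to.
    intensity[i][j]: adjusted intensity of peak i in spectrum j.
    """
    # Group spectrum column indices by the sample they belong to.
    cols_by_sample = defaultdict(list)
    for col, sample in enumerate(spectrum_sample):
        cols_by_sample[sample].append(col)

    table = defaultdict(lambda: defaultdict(float))
    cluster_mz = {}
    for row, (mz, cluster) in enumerate(zip(peak_mz, peak_cluster)):
        # The smallest MZ in a cluster stands for the whole cluster.
        cluster_mz[cluster] = min(mz, cluster_mz.get(cluster, mz))
        for sample, cols in cols_by_sample.items():
            # Average this peak over the sample's spectra, then add it to the cell.
            mean = sum(intensity[row][c] for c in cols) / len(cols)
            table[cluster][sample] += mean
    return cluster_mz, {c: dict(cells) for c, cells in table.items()}
```

For example, two peaks in cluster 0 with intensities (2, 4) and (6, 0) in the two spectra of sample A contribute 3 + 3 = 6 to that cell.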

4
Pre-processing
  • Adjusted Intensity table for all spectra and all
    peaks.
  • Adjusted intensity table for all samples and all
    peak clusters (used for later analysis).

5
1. C18 Q4 Low

6
2. C18 Q9 Low

7
3. C18 S4 Low

8
4. WCX Q4 Low

9
5. WCX Q9 High

10
Pre-Processing
  • Refer to the supplementary files for more detailed
    zoomed-in views of peak cluster detection,
    such as _plot_mapped_peak_cluster_5_6_7_8.pdf.

11
Possible outlier spectra

12
Protein modification detection: Method I
  • Idea: based on pre-defined peak clusters.
  • Algorithm
  • the difference between the MZs of the first peak
    in each pair of peak clusters is within
    Da.
  • Drawbacks
  • The pre-defined peak clusters may not be perfect.
  • Cluster sizes may differ considerably.
  • Conclusion
  • It only suggests possible protein modifications.
  • Subjective judgment is needed.
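
A minimal sketch of the Method I matching step, assuming each cluster is represented by the MZ of its first peak. The tolerance is not recoverable from the transcript, so `tol` is a hypothetical parameter, as is the function name.

```python
def match_cluster_pairs(cluster_mz, mod_mass, tol):
    """Return (low, high) cluster-MZ pairs whose gap is within tol of mod_mass.

    cluster_mz: first-peak MZ of every peak cluster.
    mod_mass: modification mass in Da, e.g. 16, 120.09 or 138.10.
    tol: matching tolerance in Da (assumed; the slide's value is missing).
    """
    mzs = sorted(cluster_mz)
    pairs = []
    for i, low in enumerate(mzs):
        for high in mzs[i + 1:]:
            gap = high - low
            if gap > mod_mass + tol:
                break  # input is sorted, so later gaps only grow
            if abs(gap - mod_mass) <= tol:
                pairs.append((low, high))
    return pairs
```

Each returned pair is only a candidate: as the slide notes, imperfect clusters mean the figures still need to be inspected.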

13
Protein modification detection: Method I

14
Protein modification detection: Method II
  • Idea: based on all peaks.
  • Algorithm
  • First find all possible modification peak pairs.
  • If 75% of the peaks in a peak cluster have matched
    modification peaks, mark it as a potential peak
    cluster with modification.
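
The two steps of Method II might look like the sketch below. It assumes that a modification pair is searched for within the same spectrum and that the threshold is a fraction of the cluster's peaks; the tuple layout, `tol`, and the function name are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict

def flag_modified_clusters(peaks, mod_mass, tol, min_fraction=0.75):
    """Flag clusters in which enough peaks have a partner at mz + mod_mass.

    peaks: list of (spectrum_id, mz, cluster_id) tuples (hypothetical layout).
    tol: matching tolerance in Da (assumed, not given in the slides).
    """
    # Sorted MZ list per spectrum, so partners can be found by binary search.
    mz_by_spectrum = defaultdict(list)
    for spectrum, mz, _ in peaks:
        mz_by_spectrum[spectrum].append(mz)
    for mzs in mz_by_spectrum.values():
        mzs.sort()

    total = defaultdict(int)
    matched = defaultdict(int)
    for spectrum, mz, cluster in peaks:
        total[cluster] += 1
        mzs = mz_by_spectrum[spectrum]
        # First peak at or above the lower edge of the target window.
        i = bisect_left(mzs, mz + mod_mass - tol)
        if i < len(mzs) and mzs[i] <= mz + mod_mass + tol:
            matched[cluster] += 1
    return sorted(c for c in total if matched[c] / total[c] >= min_fraction)
```

A cluster whose peaks appear in four spectra, three of which also contain a peak 16 Da higher, reaches exactly 75% and is flagged.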

15
Protein modification detection: Method II

16
Protein modification detection: Method II

17
Protein modification detection: Method II

18
Protein modification detection
  • Be careful! The number of detected modifications
    in the table above is not meaningful by itself;
    make your own judgment based on the figures.
  • Refer to the supplementary files for more detailed
    figures, such as Protein_mod_138.10_2.pdf.

19
Protein modification detection
  • For mod from 16 to 150, count the number of
    matched peak cluster pairs.

20
Protein modification detection
  • For mod from 16 to 150, count the number of
    matched peak pairs.

21
Correlation Analysis with phenotype data
  • 25 phenotypes
  • ApoA1 and ApoA2 were excluded from this study
    because of too many missing values.

22
Correlation Analysis with phenotype data
  • Correlation
  • First, a normal-score transformation is applied
    to all vectors to reduce the effect of outliers.
  • For a given phenotype, say Insulin_ug_l, the
    correlations between Insulin_ug_l and all peak
    cluster profiles are computed.
  • Question: how should the cutoff point be chosen?
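
One way to sketch this step is shown below. The slides do not say which normal-score formula was used, so the rank / (n + 1) convention, the use of Pearson correlation, and all function names here are assumptions. Missing values and constant vectors are not handled.

```python
from math import sqrt
from statistics import NormalDist

def normal_scores(values):
    """Replace each value by the standard-normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # tied values share their average rank
        i = j + 1
    # rank / (n + 1) keeps every probability strictly inside (0, 1).
    return [NormalDist().inv_cdf(r / (n + 1)) for r in ranks]

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def phenotype_correlations(phenotype, cluster_profiles):
    """Correlate one phenotype with every peak cluster profile after transformation."""
    p = normal_scores(phenotype)
    return [pearson(p, normal_scores(profile)) for profile in cluster_profiles]
```

Because only ranks enter the transformation, an extreme value such as 100 in (1, 2, 3, 4, 100) has no more influence than a 5 would.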

23
Correlation Analysis with phenotype data
  • Permutation & FDR
  • The order of the samples is permuted, which
    breaks the relationship between phenotype and
    peak cluster; we then look at how the correlations
    are distributed under such permutations.
  • Permute 1000 times.
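
The permutation step could be sketched as follows. The correlation function is passed in as `corr`, a hypothetical parameter standing for whatever correlation is used on the real data; the fixed seed is only there to make the sketch reproducible.

```python
import random

def permutation_correlations(phenotype, cluster_profiles, corr, n_perm=1000, seed=0):
    """Return, for each permutation, one correlation per peak cluster.

    corr: function (x, y) -> correlation (assumed to be supplied by the caller).
    """
    rng = random.Random(seed)
    shuffled = list(phenotype)  # copy, so the caller's data is left untouched
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the sample-wise link to the peak clusters
        null.append([corr(shuffled, profile) for profile in cluster_profiles])
    return null
```

The result has `n_perm` rows and one column per peak cluster, which is the shape needed to count exceedances at a given cutoff.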

24
Correlation Analysis with phenotype data
  • FDR
  • For a given cutoff point, say 0.5:
  • D = number of discoveries, i.e. peak clusters
    with abs(correlation) higher than the cutoff
    point in our data.
  • FD = number of false discoveries, i.e. peak
    clusters with abs(correlation) higher than the
    cutoff point in the 1000 permuted data sets,
    divided by 1000.
  • FDR = FD / D × 100%.
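
A short sketch of this estimate, assuming the usual convention that the FDR is the average number of false discoveries per permutation divided by the number of discoveries in the real data. The function name and input layout are hypothetical.

```python
def permutation_fdr(observed, null, cutoff):
    """Estimate D, FD and FDR (in percent) at one cutoff.

    observed: correlations of all peak clusters in the real data.
    null: one list of correlations per permutation (hypothetical layout).
    """
    d = sum(abs(r) > cutoff for r in observed)
    # Average number of exceedances per permuted data set.
    fd = sum(abs(r) > cutoff for perm in null for r in perm) / len(null)
    if d == 0:
        return d, fd, None  # FDR is undefined when nothing is discovered
    return d, fd, 100.0 * fd / d
```

Evaluating the function over a grid of cutoffs gives a curve from which a cutoff with an acceptable FDR can be read off.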

25
Correlation Analysis with phenotype data
  • Permuted p_value
  • Permuted p_value = (number of peak clusters in
    the 1000 permutations whose correlation exceeds
    the observed correlation) / (number of peak
    clusters × 1000).
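
As a sketch, the pooled p-value can be computed as below. Comparing absolute values is an assumption made here for consistency with the abs(correlation) cutoff used for the FDR; the function name and input layout are hypothetical.

```python
def permuted_p_value(observed_r, null):
    """Pooled permutation p-value for one observed correlation.

    null: one list of correlations per permutation, pooled over all peak clusters.
    """
    exceed = sum(abs(r) > abs(observed_r) for perm in null for r in perm)
    total = sum(len(perm) for perm in null)  # peak clusters x permutations
    return exceed / total
```

Pooling over all peak clusters gives a finer p-value resolution than 1/1000, at the cost of assuming the clusters share a similar null distribution.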

26
Correlation Analysis with phenotype data

27
Correlation Analysis with phenotype data
  • Refer to the supplementary files for more FDR
    figures, such as _PlotFDR.ps.
  • Refer to the supplementary files for more scatter
    plots, such as _Scatter_plot_insulin_ug_l.ps.

28
With large bin size

29
With large bin size

30
Result for bin.size = 60

37
  • Thank you!