D91623405 D92525008 - PowerPoint PPT Presentation

Provided by: Luk88
Transcript and Presenter's Notes

1
D91623405, D92525008
  • Simple statistics tools for gene expression arrays:
    http://microarray.cpmc.columbia.edu/pavlidis/pub/stats/web/
  • NIA Array Analysis Tool:
    http://lgsun.grc.nia.nih.gov/ANOVA/index.html
  • Perl modules (Statistics::KruskalWallis, Statistics::TTest)
  • Mann-Whitney U test:
    http://eatworms.swmed.edu/leon/core_2002/stats/formulas.htm#utest
  • The Rank Products method of Breitling et al.:
    rankproducts_FDR.pl
  • Multiclass discovery in array data:
    http://www.thep.lu.se/markus/software/classdiscoverer/index.html

2
Simple statistics tools for gene expression arrays-1
  • Analysis of variance and t-tests
  • ttest: Do a two-sided t-test with or without the
    Welch correction. This program includes some
    nonparametric tests as options (the Mann-Whitney
    "U" test and the rank-transformed t-test).
  • anova-oneway: Do a one-way analysis of variance
    with a balanced design.
  • anova-twoway-norep: Do a two-way analysis of
    variance when there are no replicates.
  • anova-twoway-withrep: Do a two-way analysis of
    variance when there are replicates with a
    balanced design.

3
Command lines to run the analyses on the test files
  • ttest -r testdata.txt ttest-layout.txt >
    testdata-ttest.out
  • anova-oneway -r testdata.txt anova-oneway-layout.txt >
    testdata-anova-oneway.out
  • anova-twoway-withrep -r testdata.txt
    anova-twoway-withrep-layout.txt >
    testdata-anova-twoway-withrep.out

4
testdata.txt: 12 samples x 30 genes
5
Layout (ttest, anova-oneway, anova-twoway-withrep)
6
ttest.out: ttest -r testdata.txt ttest-layout.txt > testdata-ttest.out
7
ttest
  • Perform various two-sided statistical analyses of
    data that are divided into two groups, including
    the Student's t-test.
  • -r: the format line needs to be removed
  • -w: use Welch's approximate t
  • -m: do the Mann-Whitney U (a.k.a. Wilcoxon) test
  • -rank: use a rank transformation of the data
  • -l: log-transform the data
  • ttest -r -rank affydatafile.txt
    affydatafile-layout | /usr/local/bin/sort -gk 3
    >! test.rank.out
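
To make the difference between the default Student's t-test and the -w (Welch) option concrete, here is a minimal Python sketch of the two statistics. It is not the code of the ttest program; the function names and the sample values are made up for illustration.

```python
import math
from statistics import mean, variance

def student_t(a, b):
    """Student's t statistic (pooled variance) and its degrees of freedom."""
    na, nb = len(a), len(b)
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

def welch_t(a, b):
    """Welch's approximate t and the Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb  # per-group squared standard errors
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical expression values for one gene in two groups.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
print(student_t(group_a, group_b))
print(welch_t(group_a, group_b))
```

With equal group sizes the two t values coincide and only the degrees of freedom differ; with unequal sizes and unequal variances the statistics themselves diverge.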

8
anova-oneway.out: anova-oneway -r testdata.txt
anova-oneway-layout.txt > testdata-anova-oneway.out
9
anova-twoway-withrep.out: anova-twoway-withrep
-r testdata.txt anova-twoway-withrep-layout.txt >
testdata-anova-twoway-withrep.out
10
NIA Array Analysis Tool-anova_oneway
11
NIA Array Analysis Tool-anova_oneway
  • The major advantage of ANOVA over a simple t-test
    is that variances are averaged over all factor
    levels, so the statistics become more stable.
  • In ANOVA we calculate the F-statistic, which is
    then used to estimate the P-value and determine
    whether the variation between means is significant.
  • Testing multiple hypotheses with ANOVA (as in the
    case of microarray data) may require some
    modifications to ANOVA, such as variance averaging
    and false discovery rate (FDR) control.
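
The F-statistic described above can be sketched in a few lines of Python. This is an illustration of the textbook one-way ANOVA formula, not the NIA tool's implementation; the function name and the example values are assumptions. The within-group mean square is the variance "averaged over all factor levels" that the slide refers to.

```python
from statistics import mean

def anova_oneway_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)                       # number of factor levels
    n = sum(len(g) for g in groups)       # total number of observations
    grand = mean([x for g in groups for x in g])
    means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)       # pooled (averaged) variance
    return ms_between / ms_within, k - 1, n - k

# Hypothetical log-intensities for one gene in three tissues.
tissues = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
print(anova_oneway_f(tissues))
```

Turning F into a P-value requires the F distribution with (df_between, df_within) degrees of freedom, which is left out here to keep the sketch free of third-party dependencies.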

12
Comparing 2 tissues with 1-dye arrays: Input
example (2 tissues, 1-dye arrays)
13
Comparing multiple tissues with 2-dye arrays:
Input example (3 tissues, 2-dye arrays)
14
Comparing 2 tissues with 2-dye arrays: Input
example (2 tissues, 2-dye arrays, dye swap)
15
Comparing multiple tissues with 2-dye arrays:
Example (3 tissues, 2-dye arrays, universal
reference (UR))
16
Visualization of a data set with no replications:
Example of an input file (1-dye arrays)
Because your data has no replications, no
statistical analysis will be done. Only
visualization will be available, by pair-wise
comparison of tissues, PCA, and hierarchical
clustering. Note that hierarchical clustering is
available only if you have no more than 20
tissues for comparison.
17
Kruskal-Wallis test-1
  • Perl module used to test whether differences
    exist between 3 or more independent groups of
    unequal sizes.
  • Statistics::KruskalWallis
  • Also includes the post-hoc Newman-Keuls test, to
    test whether the differences between pairs of
    the tested groups are significant.

18
Kruskal-Wallis test-2: input and output
  • use Statistics::KruskalWallis; use strict;
  • my @group_1 = (6.4,6.8,7.2,8.3,8.4,9.1,9.4,9.7);
  • my @group_2 = (2.5,3.7,4.9,5.4,5.9,8.1,8.2);
  • my @group_3 = (1.3,4.1,4.9,5.2,5.5,8.2);
  • my $kw = new Statistics::KruskalWallis;
  • $kw->load_data('group 1',@group_1);
  • $kw->load_data('group 2',@group_2);
  • $kw->load_data('group 3',@group_3);
  • my ($H,$p_value) = $kw->perform_kruskal_wallis_test;
  • print "Kruskal Wallis statistic is $H\n";
  • print "p value for test is $p_value\n";
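
As a cross-check on what the module reports, the H statistic can be computed by hand. The Python sketch below applies the textbook formula with mid-ranks for ties and no tie correction; whether Statistics::KruskalWallis applies a tie correction is not stated in the slides, so small differences from its output are possible. The function names are illustrative.

```python
import math

def midranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H (no tie correction) for a list of groups."""
    pooled = [x for g in groups for x in g]
    ranks = midranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

# The three groups from the slide.
group_1 = [6.4, 6.8, 7.2, 8.3, 8.4, 9.1, 9.4, 9.7]
group_2 = [2.5, 3.7, 4.9, 5.4, 5.9, 8.1, 8.2]
group_3 = [1.3, 4.1, 4.9, 5.2, 5.5, 8.2]
H = kruskal_wallis_h([group_1, group_2, group_3])
# Chi-square approximation; exp(-H/2) is the upper tail only for df = 2,
# i.e. for exactly three groups.
p_value = math.exp(-H / 2)
print(H, p_value)
```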

19
Kruskal-Wallis test-3: Newman-Keuls test
  • my ($q,$p) = $kw->post_hoc('Newman-Keuls','group
    1','group 2');
  • print "Newman-Keuls statistic for groups 1,2 is
    $q, p value $p\n";
  • ($q,$p) = $kw->post_hoc('Newman-Keuls','group
    2','group 3');
  • print "Newman-Keuls statistic for groups 2,3 is
    $q, p value $p\n";

20
Kruskal-Wallis test-4: kruskalwallis.perl
  • Jussi Karlgren, SICS, 2003. jussi@sics.se
  • Usage: kruskalwallis.perl -c <Categorial_Column>
    -v <Value_Column>
  • require "getopts.pl";
  • print "Kruskal-Wallis test statistic: $H - refer
    to khi2 tables with $df degrees of freedom.\n";

21
t-test: Perl module
  • my $ttest = new Statistics::TTest;
  • $ttest->set_significance(90);
  • $ttest->load_data(\@r1,\@r2);
  • $ttest->output_t_test();
  • $ttest->set_significance(99);
  • $ttest->print_t_test();

22
Mann-Whitney U Test-1: http://eatworms.swmed.edu/
leon/core_2002/stats/formulas.htm#utest
23
Mann-Whitney U Test-2 (Wilcoxon Rank Sum Test)
  • http://eatworms.swmed.edu/leon/core_2002/stats/formulas.htm#utest
  • Utest.c
  • Utable.pl
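
For readers who want to see what the U statistic measures, here is a small Python sketch that computes it by direct pair counting. It is not a port of Utest.c or Utable.pl, whose contents are not shown in the slides; the function name and the example values are made up.

```python
def mann_whitney_u(a, b):
    """Return (U, U_a, U_b), where U is the smaller of the two counts.

    U_a counts the pairs (x, y) with x from a and y from b in which
    x > y; a tied pair contributes one half.
    """
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(a) * len(b) - u_a  # the two counts always sum to len(a) * len(b)
    return min(u_a, u_b), u_a, u_b

# Hypothetical values for one gene in two classes.
print(mann_whitney_u([1.2, 3.4, 2.2], [4.1, 5.0, 3.9]))
```

The smaller of the two counts is conventionally compared against a table of critical values for the two sample sizes; a U of 0 means the groups do not overlap at all.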

24
Multiclass discovery in array data-1: http://www.thep.lu.se/
markus/software/classdiscoverer/index.html
25
Multiclass discovery in array data-2
  • An unsupervised classification method for
    discovery of classes in array data.
  • For two classes the Wilcoxon test is used to
    find discriminatory genes. For more than two
    classes the Kruskal-Wallis test is used.
  • Requires the Perl modules
    Algorithm::Numerical::Shuffle, POSIX,
    Statistics::Distributions, Storable, and
    Tie::RefHash.

26
  • For discovery of two classes, P values from
    random permutation tests are stored in the file
    'pvalues.data' in binary format using the CPAN
    module Storable. If 'pvalues.data' is not
    compatible with your system you have to generate
    one using the included 'generate_pvalues.pl'
    program.
  • The file 'pvalues.data' contains results for
    data sets with at most 100 experiments in total.
    If you are analysing a data set with more than
    100 experiments and do not want to perform the
    permutation tests every time, you have to modify
    the subroutine 'new' in 'WilcoxonTest.pm'.
  • Y. Liu and M. Ringner, Multiclass discovery in
    array data, BMC Bioinformatics 5, 70 (2004)
  • class_discoverer.pl - multiclass discovery in
    array data
  • Genes Exp_1 Exp_2 Exp_3 Exp_4
  • Gene_1 0 0 0 0
  • Gene_2 1 1 1
  • Gene_3 -1 -1 -1 -1
  • Gene_4 0 1 2 3