D91623405 D92525008

About This Presentation

Slides: 27

Provided by: Luk88

Transcript and Presenter's Notes

1
D91623405 D92525008

Simple statistics tools for gene expression arrays
http://microarray.cpmc.columbia.edu/pavlidis/pub/stats/web/
NIA Array Analysis Tool
http://lgsun.grc.nia.nih.gov/ANOVA/index.html
Perl modules (Statistics::KruskalWallis, Statistics::TTest)
Mann-Whitney U Test
http://eatworms.swmed.edu/leon/core_2002/stats/formulas.htm#utest
The Rank Products method of Breitling et al. (rankproducts_FDR.pl)
Multiclass discovery in array data
http://www.thep.lu.se/markus/software/classdiscoverer/index.html

2
Simple statistics tools for gene expression arrays-1

Analysis of variance and t-tests
ttest
Do a two-sided t-test with or without Welch correction. This program includes some nonparametric tests as options (the Mann-Whitney "U" test and the rank-transformed t-test).
anova-oneway
Do a one-way analysis of variance with a balanced design.
anova-twoway-norep
Do a two-way analysis of variance when there are no replicates.
anova-twoway-withrep
Do a two-way analysis of variance when there are replicates with a balanced design.

3
Command lines to run the analyses on the test files

ttest -r testdata.txt ttest-layout.txt > testdata-ttest.out
anova-oneway -r testdata.txt anova-oneway-layout.txt > testdata-anova-oneway.out
anova-twoway-withrep -r testdata.txt anova-twoway-withrep-layout.txt > testdata-anova-twoway-withrep.out

4
testdata.txt: 12 samples x 30 genes
5
Layout (ttest, anova-oneway, anova-twoway-withrep)
6
ttest.out: ttest -r testdata.txt ttest-layout.txt > testdata-ttest.out
7
ttest

Perform various two-sided statistical analyses of data that is divided into two groups, including the Student's t-test.
-r     format line needs to be removed
-w     use Welch approximate t
-m     do Mann-Whitney U (a.k.a. Wilcoxon rank-sum) test
-rank  use rank transformation of the data
-l     log transform the data
ttest -r -rank affydatafile.txt affydatafile-layout | /usr/local/bin/sort -gk 3 >! test.rank.out
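
The Welch approximate t mentioned for the -w option can be sketched in a few lines. This is a minimal Python illustration of the textbook formula, not the ttest program's own code; the function name welch_t is hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


if __name__ == "__main__":
    t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
    print(f"t = {t:.4f}, df = {df:.2f}")
```

The two-sided p-value would then come from a t distribution with df degrees of freedom, which is left out here to keep the sketch free of external libraries.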

8
anova-oneway.out: anova-oneway -r testdata.txt anova-oneway-layout.txt > testdata-anova-oneway.out
9
anova-twoway-withrep.out: anova-twoway-withrep -r testdata.txt anova-twoway-withrep-layout.txt > testdata-anova-twoway-withrep.out
10
NIA Array Analysis Tool-anova_oneway
11
NIA Array Analysis Tool-anova_oneway

The major advantage of ANOVA over a simple t-test is that variances are averaged over all factor levels, so the statistics become more stable. In ANOVA we calculate the F-statistic, which is then used to estimate the P-value and determine whether the variation between means is significant. Testing multiple hypotheses with ANOVA (as in the case of microarray data) may require some modifications to ANOVA, such as variance averaging and false discovery rate (FDR) control.
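
As an illustration of the F-statistic described above, here is a minimal Python sketch of a one-way ANOVA for a single gene. It is the textbook calculation, not the NIA tool's implementation, and the function name anova_oneway_f is hypothetical. The within-group mean square is the variance averaged over all factor levels.

```python
from statistics import mean


def anova_oneway_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)  # pooled (averaged) variance
    return ms_between / ms_within, k - 1, n - k


if __name__ == "__main__":
    f, df1, df2 = anova_oneway_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
    print(f"F = {f:.3f} with ({df1}, {df2}) degrees of freedom")
```

The P-value is the upper tail of the F distribution with (df_between, df_within) degrees of freedom.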

12
Comparing 2 tissues with 1-dye arrays: Input Example (2 tissues, 1-dye arrays)
13
Comparing multiple tissues with 2-dye arrays: Input Example (3 tissues, 2-dye arrays)
14
Comparing 2 tissues with 2-dye arrays: Input Example (2 tissues, 2-dye arrays, dye swap)
15
Comparing multiple tissues with 2-dye arrays: Example (3 tissues, 2-dye arrays, universal reference (UR))
16
Visualization of a data set with no replications: Example of an input file (1-dye arrays)
Because your data has no replications, no statistical analysis will be done. Only visualization will be available, by pair-wise comparison of tissues, PCA, and hierarchical clustering. Note that hierarchical clustering is available only if you have no more than 20 tissues for comparison.
17
Kruskal-Wallis test-1

Statistics::KruskalWallis
Perl module used to test whether differences exist between 3 or more independent groups of unequal sizes.
Also includes the post-hoc Newman-Keuls test, to test whether the differences between pairs of the tested groups are significant.

18
Kruskal-Wallis test-2: input and output

use Statistics::KruskalWallis;
use strict;
my @group_1 = (6.4,6.8,7.2,8.3,8.4,9.1,9.4,9.7);
my @group_2 = (2.5,3.7,4.9,5.4,5.9,8.1,8.2);
my @group_3 = (1.3,4.1,4.9,5.2,5.5,8.2);
my $kw = new Statistics::KruskalWallis;
$kw->load_data('group 1',@group_1);
$kw->load_data('group 2',@group_2);
$kw->load_data('group 3',@group_3);
my ($H,$p_value) = $kw->perform_kruskal_wallis_test;
print "Kruskal Wallis statistic is $H\n";
print "p value for test is $p_value\n";
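
To show what the H statistic measures, here is a minimal Python sketch of the standard Kruskal-Wallis calculation. It uses mid-ranks for tied values but applies no tie correction, so it may differ slightly from the Perl module's output when ties are present. The function names are hypothetical.

```python
def midranks(values):
    """Map each distinct value to its average (mid) rank, 1-based."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, df) with H uncorrected for ties and df = groups - 1."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    rank = midranks(pooled)
    # Sum over groups of (rank sum squared) / (group size).
    total = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    return h, len(groups) - 1


if __name__ == "__main__":
    # The three groups from the slide above.
    group_1 = [6.4, 6.8, 7.2, 8.3, 8.4, 9.1, 9.4, 9.7]
    group_2 = [2.5, 3.7, 4.9, 5.4, 5.9, 8.1, 8.2]
    group_3 = [1.3, 4.1, 4.9, 5.2, 5.5, 8.2]
    h, df = kruskal_wallis_h([group_1, group_2, group_3])
    print(f"H = {h:.4f}, compare with chi-square on {df} degrees of freedom")
```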

19
Kruskal-Wallis test-3: Newman-Keuls test

my ($q,$p) = $kw->post_hoc('Newman-Keuls','group 1','group 2');
print "Newman-Keuls statistic for groups 1,2 is $q, p value $p\n";
($q,$p) = $kw->post_hoc('Newman-Keuls','group 2','group 3');
print "Newman-Keuls statistic for groups 2,3 is $q, p value $p\n";

20
Kruskal-Wallis test-4: kruskalwallis.perl

# Jussi Karlgren, SICS, 2003. jussi@sics.se
# Usage: kruskalwallis.perl -c <Categorial_Column> -v <Value_Column>
require "getopts.pl";
print "Kruskal-Wallis test statistic $H - refer to khi2 tables with $df degrees of freedom.\n";

21
TTest Perl module

my $ttest = new Statistics::TTest;
$ttest->set_significance(90);
$ttest->load_data(\@r1,\@r2);
$ttest->output_t_test();
$ttest->set_significance(99);
$ttest->print_t_test();

22
Mann-Whitney U Test 1: http://eatworms.swmed.edu/leon/core_2002/stats/formulas.htm#utest
23
Mann-Whitney U Test 2 (Wilcoxon Rank Sum Test)

http://eatworms.swmed.edu/leon/core_2002/stats/formulas.htm#utest
Utest.c
Utable.pl
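
For reference, the U statistic itself is simple to compute. The Python sketch below counts pairs directly and adds the usual normal approximation without tie or continuity correction. It is an illustration only and is not derived from Utest.c or Utable.pl; the function names are hypothetical.

```python
import math


def mann_whitney_u(a, b):
    """Return (U_a, U_b): U_a counts pairs with a > b, ties count one half."""
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


def mann_whitney_z(a, b):
    """Normal approximation for the smaller U (no tie or continuity correction)."""
    n1, n2 = len(a), len(b)
    u = min(mann_whitney_u(a, b))
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mu) / sigma


if __name__ == "__main__":
    print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
    print(round(mann_whitney_z([1, 2, 3], [4, 5, 6]), 4))
```

For small groups the normal approximation is rough, and critical values are normally read from a table of U instead.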

24
Multiclass discovery in array data-1: http://www.thep.lu.se/markus/software/classdiscoverer/index.html
25
Multiclass discovery in array data-2

An unsupervised classification method for
discovery of classes in array data.
For two classes the Wilcoxon test is used to
find discriminatory genes. For more than two
classes the Kruskal-Wallis test is used.
Requires the Perl modules Algorithm::Numerical::Shuffle, POSIX, Statistics::Distributions, Storable, and Tie::RefHash.

For discovery of two classes, P values from
random permutation tests are stored in the file
'pvalues.data' in binary format using the CPAN
module Storable. If 'pvalues.data' is not
compatible with your system you have to generate
one using the included 'generate_pvalues.pl'
program.
The file 'pvalues.data' contains results for data sets with at most 100 experiments in total. If you are analysing a data set with more than 100 experiments and do not want to perform the permutation tests every time, you have to modify the subroutine 'new' in 'WilcoxonTest.pm'.
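
A random permutation test of the kind mentioned above can be sketched as follows. This Python example is an assumption-laden illustration, not the class_discoverer.pl code: it uses the rank sum of the first class as the statistic, a fixed random seed, and the function names midrank and permutation_p_value are hypothetical.

```python
import random
from bisect import bisect_left, bisect_right


def midrank(x, pooled_sorted):
    """Average 1-based rank of x within a sorted list that contains x."""
    return (bisect_left(pooled_sorted, x) + bisect_right(pooled_sorted, x) + 1) / 2


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the rank sum of class a."""
    pooled = list(a) + list(b)
    ordered = sorted(pooled)
    ranks = [midrank(x, ordered) for x in pooled]
    n_a = len(a)
    expected = n_a * (len(pooled) + 1) / 2  # mean rank sum under the null
    observed = abs(sum(ranks[:n_a]) - expected)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(ranks)  # random reassignment of class labels
        if abs(sum(ranks[:n_a]) - expected) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    print(permutation_p_value([1, 2, 3, 4, 5], [6, 7, 8, 9, 10]))
```

Because each p-value needs thousands of shuffles, precomputing and storing the results, as the package does with 'pvalues.data', avoids repeating the work for every analysis.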
Y. Liu and M. Ringner, Multiclass discovery in
array data, BMC Bioinformatics 5, 70 (2004)
class_discoverer.pl - multiclass discovery in
array data
Genes Exp_1 Exp_2 Exp_3 Exp_4
Gene_1 0 0 0 0
Gene_2 1 1 1
Gene_3 -1 -1 -1 -1
Gene_4 0 1 2 3
