Title: Computer algebra and rank statistics
1Computer algebra and rank statistics
- Alessandro Di Bucchianico
- HCM Workshop Coimbra
- November 5, 1997
2How to run this presentation?
- the presentation runs itself most of the time
- click the mouse if you want to continue
- type S to stop or restart the presentation
- underlined items are hyperlinks to files on the
World Wide Web (usually PostScript files of
technical reports)
- Enjoy my presentation!
3Outline
- General remarks on nonparametric methods
- What is computer algebra?
- Case study: the Mann-Whitney statistic
- Critical values of rank test statistics
- Moments of the Mann-Whitney statistic
- Conclusions
4General remarks on nonparametric methods
- Practical problems
- tables (limited, errors, not exact, ...)
- limited availability in statistical software
- procedures in statistical software often only
based on asymptotics
5General remarks on nonparametric methods
- Mathematical problems
- in general no closed expression for distribution
function
- direct enumeration only feasible for small sample
sizes
- recurrences are time-consuming
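The enumeration bullet can be made concrete with a minimal Python sketch (not part of the original slides). It assumes the null hypothesis without ties, so that every placement of the m X-observations among the m+n pooled ranks is equally likely, and it tallies the Mann-Whitney count (the number of pairs with a Y-observation below an X-observation) over all placements. The function name `enumerate_null_counts` is hypothetical.

```python
from itertools import combinations
from math import comb


def enumerate_null_counts(m, n):
    """For each k, count the rank placements with exactly k pairs (Y_j < X_i).

    Brute force: visits all C(m+n, m) placements of the X-sample
    among the pooled ranks 1..m+n.
    """
    counts = [0] * (m * n + 1)
    for x_ranks in combinations(range(1, m + n + 1), m):
        # The i-th smallest X (1-based) at pooled rank r has r - i Y's below it.
        k = sum(r - i for i, r in enumerate(x_ranks, start=1))
        counts[k] += 1
    return counts


counts = enumerate_null_counts(5, 5)
print(sum(counts), comb(10, 5))   # both equal the number of placements
print(comb(40, 20))               # placements that m = n = 20 would require
```

The last line shows why brute force stops being practical quickly: the number of placements grows as a binomial coefficient in the sample sizes.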
6What is computer algebra?
7Case study: Mann-Whitney statistic
- independent samples $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$
- continuous distribution functions F, G resp.
- (hence, no ties with probability one)
- order the pooled sample from small to large
8Mann-Whitney (continued)
- Wilcoxon: $W_{m,n} = \sum_{i=1}^{m} \mathrm{rank}(X_i)$
- Mann-Whitney: $M_{m,n} = \#\{(i,j) : Y_j < X_i\}$
- $W_{m,n} = M_{m,n} + \tfrac{1}{2} m (m+1)$
- What is the distribution of $M_{m,n}$ under $H_0: F = G$?
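A small numerical check of the relation between the two statistics, as a Python sketch with made-up data (the sample values and function names are illustrative assumptions, and no ties are assumed):

```python
def wilcoxon_rank_sum(xs, ys):
    """Sum of the ranks of the X-sample in the pooled sample (no ties assumed)."""
    pooled = sorted(xs + ys)
    rank = {value: position for position, value in enumerate(pooled, start=1)}
    return sum(rank[x] for x in xs)


def mann_whitney_count(xs, ys):
    """Number of pairs (i, j) with Y_j < X_i."""
    return sum(1 for x in xs for y in ys if y < x)


xs = [2.1, 3.7, 5.0]          # made-up X-sample, m = 3
ys = [1.4, 2.9, 4.2, 6.3]     # made-up Y-sample, n = 4
m = len(xs)
w = wilcoxon_rank_sum(xs, ys)
mw = mann_whitney_count(xs, ys)
print(w, mw, w == mw + m * (m + 1) // 2)
```

For these samples the X-observations occupy pooled ranks 2, 4 and 6, so the rank sum is 12, the pair count is 6, and the two differ by m(m+1)/2 = 6.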
9Under $H_0$, we have
11Computational speed (Pentium 133 MHz)
- Exact: $P(M_{5,5} \le 4) = 1/21 \approx 0.0476$
- computing time 0.05 sec (generating function degree 25)
- Asymptotic approximation: $P(M_{5,5} \le 4) \approx 0.0384$
- Exact: $P(M_{20,20} \le 138) = 0.0482$ (rounded)
- computing time 8.5 sec (generating function degree 400)
- Asymptotic approximation: $P(M_{20,20} \le 138) \approx 0.0475$
- Asymptotics and exact calculations are both useful!
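The figures on this slide can be checked with a short Python sketch (not part of the original talk, which used Mathematica). It assumes the standard closed form for the null generating function of $M_{m,n}$, namely $\binom{m+n}{m}^{-1} \prod_{i=1}^{m} (1 - x^{n+i})/(1 - x^{i})$, a polynomial of degree $mn$, and expands it with exact integer arithmetic. The function names are hypothetical, and the normal approximation is taken without continuity correction, which is an assumption about how the asymptotic values were obtained.

```python
from fractions import Fraction
from math import comb, erf, sqrt


def mann_whitney_counts(m, n):
    """Coefficients of prod_{i=1..m} (1 - x^(n+i)) / (1 - x^i), degree m*n."""
    degree = m * n
    coeffs = [0] * (degree + 1)
    coeffs[0] = 1
    for i in range(1, m + 1):
        # Multiply by (1 - x^(n+i)); go from high to low so old values are read.
        for k in range(degree, n + i - 1, -1):
            coeffs[k] -= coeffs[k - n - i]
        # Divide by (1 - x^i), i.e. multiply by 1 + x^i + x^(2i) + ...,
        # truncated at x^degree.
        for k in range(i, degree + 1):
            coeffs[k] += coeffs[k - i]
    return coeffs


def exact_cdf(m, n, k):
    """P(M_{m,n} <= k) under H0 as an exact fraction."""
    counts = mann_whitney_counts(m, n)
    return Fraction(sum(counts[: k + 1]), comb(m + n, m))


def normal_cdf_approx(m, n, k):
    """Normal approximation to P(M_{m,n} <= k), no continuity correction (assumed)."""
    mean = m * n / 2
    sd = sqrt(m * n * (m + n + 1) / 12)
    z = (k - mean) / sd
    return 0.5 * (1 + erf(z / sqrt(2)))


print(exact_cdf(5, 5, 4))                 # 1/21
print(float(exact_cdf(20, 20, 138)))
print(normal_cdf_approx(5, 5, 4), normal_cdf_approx(20, 20, 138))
```

The first printed value reproduces the exact 1/21 quoted above. The approximations land near, though not exactly on, the asymptotic figures on the slide, which may have been rounded or read from a table.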
12Other examples of nonparametric test statistics
with closed form for generating function include
- Wilcoxon signed rank statistic
- Kendall rank correlation statistic
- Kolmogorov one-sample statistic
- Smirnov two-sample statistic
- Jonckheere-Terpstra statistic
- Consult the combinatorial literature!
- What to do if there is no generating function?
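As one instance from the list above, the Wilcoxon signed rank statistic (sum of the ranks attached to positive signs) has, under the null hypothesis of symmetry with no ties, the generating function $2^{-n} \prod_{i=1}^{n} (1 + x^{i})$. The Python sketch below expands that product; it is an illustration under these assumptions, and the function names are hypothetical.

```python
from fractions import Fraction


def signed_rank_counts(n):
    """Coefficients of prod_{i=1..n} (1 + x^i); entry k counts sign patterns with rank sum k."""
    degree = n * (n + 1) // 2
    coeffs = [0] * (degree + 1)
    coeffs[0] = 1
    for i in range(1, n + 1):
        # Multiply by (1 + x^i); descending k keeps the update in place.
        for k in range(degree, i - 1, -1):
            coeffs[k] += coeffs[k - i]
    return coeffs


def signed_rank_cdf(n, t):
    """P(T+ <= t) under H0 as an exact fraction."""
    counts = signed_rank_counts(n)
    return Fraction(sum(counts[: t + 1]), 2 ** n)


print(signed_rank_counts(3))      # [1, 1, 1, 2, 1, 1, 1]
print(signed_rank_cdf(10, 8))     # 25/1024
```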
13Linear rank statistics
- $Z_\ell = 1$ if the $\ell$th order statistic in the pooled
sample is an X-observation, and 0 otherwise
- Streitberg and Röhmel 1986 (cf. Euler 1748)
- Branch-and-bound algorithm (Van de Wiel)
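When no closed form is available, a two-sample linear rank statistic $\sum_{\ell} a(\ell) Z_\ell$ with integer scores $a(\ell)$ can still be handled through the product $\prod_{\ell=1}^{m+n} (1 + x^{a(\ell)} y)$, whose coefficient of $y^m$ is the generating function of the null counts. This product form is the one usually associated with Streitberg and Röhmel; that the slide showed exactly this formula is an assumption, as is the score notation $a(\ell)$. The Python sketch below expands the product by dynamic programming and does not implement the branch-and-bound algorithm.

```python
from math import comb


def linear_rank_counts(scores, m):
    """Coefficient of y^m in prod_l (1 + x^a(l) * y), indexed by the power of x.

    scores: non-negative integer scores a(1), ..., a(N)
    m: size of the X-sample
    """
    total = sum(scores)
    # table[j][s] = number of ways to pick j positions whose scores sum to s
    table = [[0] * (total + 1) for _ in range(m + 1)]
    table[0][0] = 1
    for a in scores:
        # Descending j so that each position is used at most once.
        for j in range(m, 0, -1):
            for s in range(a, total + 1):
                table[j][s] += table[j - 1][s - a]
    return table[m]


# Wilcoxon scores a(l) = l with m = n = 5 give the rank-sum distribution.
counts = linear_rank_counts(list(range(1, 11)), 5)
print(sum(counts), comb(10, 5))
print(counts[15:20])   # smallest attainable rank sums 15..19
```

With Wilcoxon scores the smallest rank sum is 15, and the counts for 15 to 19 add up to 12 out of 252, which is again 1/21.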
14Moments of Mann-Whitney statistic
- Mann and Whitney (1947) calculated 4th central
moment
- Fix and Hodges (1955) calculated 6th central
moment
- Computations are based on recurrences
- Can we improve?
- 21st century solution: computer algebra and generating functions
15Computing moments of $M_{m,n}$
- recompute $E(M_{m,n})$ (following René Swarttouw)
17Hence, it remains to calculate for $1 \le k \le m$
After some simplifications
18L'Hôpital's rule yields that the limit equals
It is tedious to perform these computations by
hand.
Alternative: compute moments using Mathematica.
19Mathematica procedures for moments of $M_{m,n}$
20Eighth central moment of $M_{m,n}$
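The Mathematica procedures are not reproduced in this transcript. As a stand-in, the Python sketch below computes central moments for given numeric m and n directly from the exact null distribution, rather than symbolically by differentiating the generating function as the talk describes. It assumes the standard product form $\prod_{i=1}^{m} (1 - x^{n+i})/(1 - x^{i})$ for the null counts; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def mann_whitney_pmf(m, n):
    """Exact null pmf of M_{m,n} from prod_{i=1..m} (1 - x^(n+i)) / (1 - x^i)."""
    degree = m * n
    coeffs = [0] * (degree + 1)
    coeffs[0] = 1
    for i in range(1, m + 1):
        for k in range(degree, n + i - 1, -1):   # multiply by (1 - x^(n+i))
            coeffs[k] -= coeffs[k - n - i]
        for k in range(i, degree + 1):           # divide by (1 - x^i)
            coeffs[k] += coeffs[k - i]
    total = comb(m + n, m)
    return [Fraction(c, total) for c in coeffs]


def central_moment(m, n, r):
    """r-th central moment of M_{m,n} under H0, as an exact fraction."""
    pmf = mann_whitney_pmf(m, n)
    mean = sum(k * p for k, p in enumerate(pmf))
    return sum((k - mean) ** r * p for k, p in enumerate(pmf))


print(central_moment(5, 5, 2))   # variance, equals m*n*(m+n+1)/12
print(central_moment(5, 5, 8))   # 8th central moment for m = n = 5
```

Unlike a computer algebra derivation, this gives numbers for fixed sample sizes only, not a formula in m and n. It can still serve as a check on such formulas.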
21Conclusions
- generating functions are also useful in
nonparametric statistics
- computer algebra is a natural tool for
mathematicians
- asymptotics and exact calculations complement
each other
22Topics under investigation
- tests for censored data
- power calculations
- nonparametric ANOVA (Kruskal-Wallis, block
designs, multiple comparisons)
- Spearman's $\rho$ (rank correlation)
- multimedia / World Wide Web implementation
- Click on underlined items to obtain PostScript
file of technical report
23References
- A. Di Bucchianico, Combinatorics, computer
algebra and the Wilcoxon-Mann-Whitney test, to
appear in J. Stat. Plann. Inf.
- B. Streitberg and J. Röhmel, Exact distributions
for permutation and rank tests: An introduction
to some recently published algorithms, Stat.
Software Newsletter 12 (1986), 10-18
24References (continued)
- M.A. van de Wiel, Exact distributions of
nonparametric statistics using computer algebra,
Master's Thesis, TUE, 1996
- M.A. van de Wiel and A. Di Bucchianico, The exact
distribution of Spearman's rho, technical report
- M.A. van de Wiel, A. Di Bucchianico and P. van
der Laan, Exact distributions of nonparametric
test statistics using computer algebra, technical
report
25References (continued)
- M.A. van de Wiel, Edgeworth expansions with exact
cumulants for two-sample linear rank statistics,
technical report
- M.A. van de Wiel, Exact distributions of
two-sample rank statistics and block rank
statistics using computer algebra, technical
report
26The End