Nonparametric Approaches - PowerPoint PPT Presentation

Slides: 18
Provided by: mjc9
Transcript and Presenter's Notes

1
Non-parametric Approaches
  • The Bootstrap

2
Non-parametric?
  • Non-parametric or distribution-free tests have
    more lax and/or different assumptions
  • Properties
  • No assumption about the underlying distribution
    being normal
  • More sensitive to medians than means (which is
    good if you're interested in the median)
  • Some may not be very affected by outliers
  • Rank tests

3
Parametric vs. Non-parametric
  • Parametric tests will typically be used when
    assumptions are met as they will usually have
    more power
  • Though the non-parametric tests might be able to
    match that power under certain conditions
  • Non-parametric tests are often used with small
    samples or violations of assumptions

4
Some commonly used nonparametric analyses
  • Chi-Square
  • Chi-Square analysis involves categorical
    variables/frequency data only
  • Example
  • Party: Republican, Democrat
  • Vote: Yes, No
  • In this case, we cannot meet the assumptions
    common to our typical tests, but the goal would
    still be to understand the relationship between
    the variables involved
  • Chi-square analysis examines such relationships
    regarding frequencies of cells, and we can still
    get measures of the strength of the association
  • See effect size handout
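
A minimal sketch in R of the party-by-vote example above. The cell counts are invented for illustration (the slides give no data), and phi is used here as just one possible measure of the strength of association; `chisq.test()` is base R.

```r
# Hypothetical counts: the slides give no data for this example
votes <- matrix(c(40, 10,
                  15, 35),
                nrow = 2, byrow = TRUE,
                dimnames = list(Party = c("Republican", "Democrat"),
                                Vote  = c("Yes", "No")))

# Pearson chi-square test of independence (no continuity correction)
res <- chisq.test(votes, correct = FALSE)
res$statistic   # chi-square value
res$p.value

# Phi coefficient for a 2 x 2 table: sqrt(chi-square / N)
phi <- sqrt(unname(res$statistic) / sum(votes))
phi
```

For tables larger than 2 x 2, Cramér's V is the usual analogue of phi.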

5
Common Rank tests
  • Wilcoxon tests for independent samples (rank-sum,
    equivalent to the Mann-Whitney U) and dependent
    samples (signed-rank)
  • Kruskal-Wallis (independent groups) and Friedman
    (dependent groups) for more than 2 groups
  • Basic procedure
  • Rank the DV and get sums of the ranks for the
    groups
  • Construct a test statistic based on the ranked
    data
  • Advantage
  • Normality not necessary
  • Insensitive to outliers
  • Disadvantage
  • Ranked data are not in the original units and so
    may be less interpretable
  • May lack power, particularly when parametric
    assumptions hold
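
The basic procedure above can be sketched in R as follows. The data are simulated purely for illustration, and the group names (`g1`, `g2`, `g3`) are hypothetical; `wilcox.test()`, `kruskal.test()` and `friedman.test()` are base R functions.

```r
set.seed(1)                       # simulated, skewed data for illustration only
g1 <- rexp(15, rate = 1)
g2 <- rexp(15, rate = 0.5)
g3 <- rexp(15, rate = 0.25)

# By hand: rank the pooled DV, then sum the ranks for one group
ranks <- rank(c(g1, g2))
r1 <- sum(ranks[1:15])            # rank sum for group 1
u1 <- r1 - 15 * (15 + 1) / 2      # test statistic built from the rank sum

# Two independent groups: Wilcoxon rank-sum / Mann-Whitney U
two <- wilcox.test(g1, g2)
two$statistic                     # matches u1

# Two dependent samples: Wilcoxon signed-rank (treating g1, g2 as paired)
wilcox.test(g1, g2, paired = TRUE)

# More than two groups
kruskal.test(list(g1, g2, g3))    # independent groups
friedman.test(cbind(g1, g2, g3))  # dependent groups (rows treated as blocks)
```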

6
Transformation of data
  • So if I don't like my data I just change it?
  • Think about what you're studying
  • Is depression a function of Likert scale
    questions?
  • Is reaction time inherently related to learning?
  • Tukey reexpressions
  • Our original numbers are already useful fictions,
    and if we think of them as such, transforming
    them into something else may not seem so
    far-fetched

7
Some common transformations
  • Logarithmic: positively skewed data
  • Square root: count data
  • Reciprocal (1/x): when there are very extreme
    outliers
  • Arcsine: proportion data
  • Other measures of location: heavy-tailed data
  • e.g. trimmed mean
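
A short R sketch of these transformations, using simulated data only. The variable names and distributions are assumptions for illustration, the arcsine transformation is written in its common form (arcsine of the square root of the proportion), and `skew()` is a small hypothetical helper, not a base R function.

```r
set.seed(2)                                       # simulated data, illustration only
rt     <- rlnorm(200, meanlog = 6, sdlog = 0.6)   # positively skewed values
counts <- rpois(200, lambda = 4)                  # count data
p      <- rbeta(200, 2, 8)                        # proportions in (0, 1)

log_rt   <- log(rt)          # logarithmic
sqrt_cnt <- sqrt(counts)     # square root
recip_rt <- 1 / rt           # reciprocal
asin_p   <- asin(sqrt(p))    # arcsine (of the square root)

# Hypothetical helper: simple moment-based skewness
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3
skew(rt)                     # clearly positive
skew(log_rt)                 # much closer to zero

# Another measure of location instead of a transformation: 20% trimmed mean
mean(rt, trim = 0.2)
mean(rt)
```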

8
When to transform?
  • Not something to think about doing straight away
    at any little sign of trouble
  • Even if your groups are skewed, parametric tests
    may hold when the skew is similar across groups
  • Shop around
  • Try different transformations to see if one works
    better for your problem regarding the
    distribution of values (but not just to get a
    significant p-value)

9
Note
  • Transformations will not necessarily solve
    problems with outliers
  • Also, if inferences are based on e.g. the mean of
    the transformed data, we cannot simply transform
    the values back to the original and act as though
    the inferences still hold (e.g. for µ)
  • In the end, we'd rather keep our data in original
    units and those transformations should be a last
    resort
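
A small illustration in R of why back-transforming does not recover inferences about the original mean. The values are made up for the example: the back-transformed mean of the logs is the geometric mean, which is not the arithmetic mean of the original data.

```r
x <- c(1, 2, 4, 8, 100)       # illustrative values only

mean(x)                       # arithmetic mean of the original data: 23
back <- exp(mean(log(x)))     # back-transformed mean of the logged data
back                          # the geometric mean, well below 23
```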

10
More recent developments
  • The Bootstrap
  • The basic idea involves sampling with replacement
    from the sample data to produce random samples of
    size n
  • Each of these samples provides an estimate of the
    parameter of interest
  • Repeating the sampling a large number of times
    provides information on the variability of the
    estimate i.e. its standard error
  • Necessary for any inferential test
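
A minimal sketch of the idea in R, assuming simulated data as a stand-in (this is not the TV data from the slides). The standard deviation of the bootstrap estimates serves as the standard error; for the mean it can be checked against the usual formula, and the same recipe works for statistics such as the median that have no simple formula.

```r
set.seed(3)
x <- rexp(50, rate = 1 / 4)   # simulated stand-in sample, n = 50
B <- 2000                     # number of bootstrap samples

# Resample with replacement (size n) and recompute the estimate each time
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))

se_boot    <- sd(boot_means)           # bootstrap standard error of the mean
se_formula <- sd(x) / sqrt(length(x))  # textbook standard error of the mean
c(se_boot, se_formula)                 # should be close to each other

# Same recipe for the median
boot_medians <- replicate(B, median(sample(x, replace = TRUE)))
sd(boot_medians)
```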

11
TV Example
How many hours of TV watched yesterday?

12
Bootstrap
  • 1000 samples
  • Distribution of the means of each sample
  • Mean = 3.951

13
How?
  • Two Examples

  • Basic R (TVdata is the vector of hours from the TV example)

    boot <- numeric(1000)                  # storage for 1000 bootstrap means
    for (i in seq_along(boot)) boot[i] <- mean(sample(TVdata, replace = TRUE))
    mean(boot)
    hist(boot, col = "blue", border = "red")

  • Using the bootstrap package

    library(bootstrap)
    bootmean <- bootstrap(TVdata, theta = mean, nboot = 1000)
    mean(bootmean$thetastar)
    hist(bootmean$thetastar)

14
Bootstrap
  • Hypothetical situation
  • If we cannot assume normality, how would we go
    about getting a confidence interval for a
    particular statistic?
  • How would you get a confidence interval for
    robust measures and other statistics?
  • Solution
  • Resample (with replacement) from our own data
    based on its distribution
  • Treat our sample distribution as a population
    distribution and take random samples from it
  • So what we have done is, instead of assuming some
    sampling distribution of a particular shape and
    size, we've created it ourselves and derived our
    interval estimate from it
  • From this we can create confidence intervals and
    perform other inferential procedures
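
One way to turn this into an interval is the percentile method, sketched below in R for a robust statistic (a 20% trimmed mean). The data are simulated and the choice of the percentile method is an assumption; the slides do not say which interval construction was intended here.

```r
set.seed(4)
x <- rexp(40, rate = 1 / 4)   # simulated stand-in sample
B <- 2000

# Treat the sample as the population: resample it with replacement and
# recompute the 20% trimmed mean each time
theta_star <- replicate(B, mean(sample(x, replace = TRUE), trim = 0.2))

# 95% percentile interval: the middle 95% of the bootstrap estimates
ci <- quantile(theta_star, c(0.025, 0.975))
ci
mean(x, trim = 0.2)           # the observed estimate, for comparison
```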

15
Hypothesis Testing
  • Comparing independent groups
  • Step 1: compute the bootstrap mean and bootstrap
    sd as before, but for each group
  • Each time you do so, calculate T
  • This creates your own t distribution

16
Hypothesis Testing
  • Use the quantile points of your bootstrap T
    distribution that correspond to your confidence
    level when computing the confidence interval on
    the difference between means, rather than the
    critical t value (tcv) from typical distributions
  • Note however that the critical T will not be the
    same for the upper and lower bounds
  • Unless your bootstrap distribution was perfectly
    symmetrical
  • Not likely to happen
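
A hedged R sketch of a bootstrap-t interval for the difference between two independent group means. The data are simulated, `se_diff()` is a hypothetical helper, and centering each bootstrap T on the observed difference is the common formulation of the method, assumed here because the slides do not spell it out.

```r
set.seed(5)
g1 <- rexp(25, rate = 1 / 3)  # simulated group 1
g2 <- rexp(30, rate = 1 / 5)  # simulated group 2

# Hypothetical helper: standard error of a difference between two means
se_diff <- function(a, b) sqrt(var(a) / length(a) + var(b) / length(b))

d_obs  <- mean(g1) - mean(g2)
se_obs <- se_diff(g1, g2)

# Resample within each group and compute T each time,
# centered on the observed difference
B <- 2000
t_star <- replicate(B, {
  b1 <- sample(g1, replace = TRUE)
  b2 <- sample(g2, replace = TRUE)
  ((mean(b1) - mean(b2)) - d_obs) / se_diff(b1, b2)
})

# Critical values come from the bootstrap T distribution, not a t table;
# the lower and upper quantiles generally differ in magnitude
q <- quantile(t_star, c(0.025, 0.975))
ci <- c(lower = d_obs - unname(q[2]) * se_obs,
        upper = d_obs - unname(q[1]) * se_obs)
q
ci
```

Note that the upper quantile of T sets the lower bound of the interval and the lower quantile sets the upper bound.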

17
So why use?
  • Accuracy and control of type I error rate
  • Most of the problems associated with both
    accuracy and maintenance of type I error rate are
    reduced using bootstrap methods compared to
    Student's t
  • Wilcox goes further to suggest that there may be
    in fact very few situations, if any, in which the
    traditional approach offers any advantage over
    the bootstrap approach
  • The problem of outliers and the basic statistical
    properties of means and variances remain,
    however