1
Intro to statistics for HCI
  • Carlos Jensen
  • Oct 22, 2004

2
Hypothesis Testing
  • Why do hypothesis testing?
  • What exactly do we try to do here?
  • What kinds of hypotheses do we usually test?
  • X is better/larger/faster than Y
  • X improved more than Y

3
Vocabulary
  • Hypothesis
  • Statistical significance
  • Significance level
  • Normal distribution
  • Standard deviation
  • Standard Error
  • Degrees of freedom
  • Sample
  • Power
  • Mean
  • Student t-test
  • ANOVA
  • Type I and Type II error

4
Hypothesis Testing
  • Specify the null hypothesis (H0) and the alternative
    hypothesis (H1). For example H0: µ1 - µ2 = 0 and
    H1: µ1 ≠ µ2.
  • Remember, define these so that H1 is true iff H0 is
    false
  • Select a significance level. Typically P = 0.05 or
    P = 0.10
  • Sample the population and calculate statistics.
  • Calculate the probability (p-value) of obtaining a
    statistic as extreme as, or more extreme than, the
    one observed.
  • Compare the p-value with the chosen significance
    level. If P < significance level, reject the null
    hypothesis (see the sketch below)
  • A P above the significance level does not mean H0 is
    true! It just means we can't tell.
  • When the null hypothesis is rejected, the result is
    "statistically significant." When we can't reject the
    null hypothesis, the result is not statistically
    significant
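As a concrete walk-through of these steps, here is a minimal sketch with made-up data (not part of the original slides), assuming two independent groups and scipy:

```python
# Minimal sketch of the procedure above: H0: µ1 - µ2 = 0,
# two-sided alternative, alpha = 0.05. Data are hypothetical.
from scipy.stats import ttest_ind

group_x = [4.1, 3.8, 4.5, 4.0, 4.3, 3.9, 4.4, 4.2]
group_y = [3.6, 3.9, 3.5, 3.8, 3.7, 3.4, 3.9, 3.6]

alpha = 0.05
result = ttest_ind(group_x, group_y)    # sample statistics + p-value
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")

if result.pvalue < alpha:               # compare p to the significance level
    print("Reject H0: statistically significant")
else:
    print("Cannot reject H0 (this does NOT mean H0 is true)")
```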

6
2.1 Determining significance
  • By significance we mean the threshold at which we
    say an occurrence was due to some real effect rather
    than random chance
  • Different thresholds for different fields, depending
    on what is at stake (see next slide)
  • CS and many other fields: P = 0.05 is significant;
    P = 0.1 is sometimes referred to as marginal
    significance (CS only)
  • Medicine, psychology, etc.: only P values below 0.01
    or 0.005 are generally accepted for any
    treatment-related experiment
  • In a normal distribution, about 68% of the scores
    are within one standard deviation of the mean and
    about 95% of the scores are within two standard
    deviations of the mean (see the sketch below)
  • This is why CS and others arrived at P = 0.05: it was
    easy to work with without computers, back when you
    had to look things up in tables
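The 68%/95% figures can be checked directly; a small sketch (mine, not from the slides) using scipy's normal distribution:

```python
# Coverage of the normal distribution within 1 and 2 standard
# deviations of the mean, plus the matching two-sided tail mass.
from scipy.stats import norm

for k in (1, 2):
    coverage = norm.cdf(k) - norm.cdf(-k)
    print(f"within {k} SD: {coverage:.3f}, outside: {1 - coverage:.3f}")
# within 1 SD: 0.683, outside: 0.317
# within 2 SD: 0.954, outside: 0.046  (close to the P = 0.05 threshold)
```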

7
2.2 Statistical Significance
  • Statistical significance means that
  • two populations differ to a significant extent along
    some variable, not necessarily all.
  • Statistical significance does NOT mean noteworthy
  • It may be the case that CS grad students are better
    at logic than the average grad student (unlikely),
    but that this difference is so small as to have no
    practical significance in real life
  • The difference is typically in either the rate of
    occurrence or the value of some result. Can you have
    one without the other?
  • Group A can be twice as likely to do well on tests
    as Group B (statistically significant), yet the
    difference in scores may not be large enough to be
    significant

8
2.3 Significance: Type I and II errors
  • What does significance mean?
  • Type I error: false positive (rejecting H0 when it is
    actually true)
  • Type II error: false negative (failing to reject H0
    when it is actually false)
  • The Type I error rate is set by the significance
    threshold
  • The Type II error rate grows as the Type I rate
    shrinks (see the simulation sketch below)
  • Set significance to balance the risks of Type I and
    Type II error
  • When might low Type I and high Type II be
    preferable?
  • When might high Type I and low Type II be
    preferable?
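A small simulation can make the trade-off concrete. This is a sketch with assumed numbers (two normal populations, 20 subjects per group, a true difference of 0.5 SD), not an analysis from the slides:

```python
# Hypothetical simulation: how alpha trades off against the
# Type II error rate for a fixed sample size and effect size.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n, effect, trials = 20, 0.5, 2000

for alpha in (0.10, 0.05, 0.01):
    type1 = type2 = 0
    for _ in range(trials):
        # Under H0: both groups drawn from the same distribution
        a0, b0 = rng.normal(0, 1, n), rng.normal(0, 1, n)
        if ttest_ind(a0, b0).pvalue < alpha:
            type1 += 1
        # Under H1: group b is shifted by `effect`
        a1, b1 = rng.normal(0, 1, n), rng.normal(effect, 1, n)
        if ttest_ind(a1, b1).pvalue >= alpha:
            type2 += 1
    print(f"alpha={alpha:.2f}  Type I rate={type1 / trials:.3f}  "
          f"Type II rate={type2 / trials:.3f}")
```

The Type I rate tracks the chosen alpha, while the Type II rate climbs as alpha is made stricter.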

10
3.1 Sample sizes
  • Two factors determine how big a sample you need
  • Power
  • The likelihood we'll have enough data to reliably
    reject H0, given a real difference
  • Power = 1 - probability of a Type II error
    (see the sample-size sketch below)
  • Central Limit Theorem
  • Central Limit Theorem: a large sample from a
    non-normal population will tend to approach a normal
    distribution
  • The more the population departs from normality, the
    larger the sample size needed
  • Most statistical tests assume a normally distributed
    sample
  • Advice: 10-20 subjects per condition, and the more
    the better
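If statsmodels is available, the power requirement can be turned directly into a sample size; the effect size, alpha, and power below are assumed values for illustration, not numbers from the slides:

```python
# Hypothetical sample-size calculation for a two-group t-test.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5,  # Cohen's d (assumed)
                                    alpha=0.05,       # Type I error rate
                                    power=0.8)        # 1 - Type II error rate
print(f"subjects needed per group: {n_per_group:.0f}")  # roughly 64
```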

11
3.2 Within-subject or Between-subject Design
  • Repeated measures vs. a single sample (or a low
    number of samples)
  • Are we testing whether two groups are different
    (between subjects), or whether a treatment had an
    effect (within subjects)?
  • Between subjects we typically compare population
    averages
  • Within subjects we typically look at the average
    change in subjects (analysis of variance)

12
3.2 Within-subject or Between-subject Design (2)
  • Within-subject design
  • Cheap: fewer subjects, more data per subject
  • Removes individual differences
  • Introduces learning and carryover effects
  • Can't use the same stats as for between-subjects
    designs, because the observations are no longer
    independent (see the paired-test sketch below)
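To illustrate that last point, a sketch with made-up before/after times for the same six subjects: a within-subject design calls for a paired test (scipy's ttest_rel) rather than the independent-samples test:

```python
# Hypothetical before/after task-completion times for the same
# six subjects: a within-subject (repeated-measures) design.
from scipy.stats import ttest_ind, ttest_rel

before = [12.1, 15.3, 11.8, 14.0, 13.5, 16.2]
after  = [11.0, 14.1, 11.2, 12.8, 12.9, 15.0]

# Wrong for this design: treats the columns as independent groups
print(ttest_ind(before, after).pvalue)

# Appropriate for this design: pairs each subject with themselves
print(ttest_rel(before, after).pvalue)
```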

14
4.0 Stats Basic Concepts
  • The variance is a measure of how spread out a
    distribution is. It is computed as the average
    squared deviation of each number from its mean.
  • Standard deviation = sqrt(variance)
  • N = degrees of freedom = observations - conditions
  • Standard error = standard deviation / sqrt(N)
    (see the sketch below)
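A short sketch (made-up scores) showing how these quantities come out of numpy and scipy:

```python
# Computing variance, standard deviation, and standard error
# for a hypothetical sample of scores.
import numpy as np
from scipy import stats

scores = np.array([3.2, 4.1, 3.8, 5.0, 4.4, 3.9, 4.7])

variance = scores.var(ddof=1)        # sample variance (divides by N - 1)
std_dev  = np.sqrt(variance)         # standard deviation
n        = len(scores)
std_err  = std_dev / np.sqrt(n)      # standard error of the mean

print(variance, std_dev, std_err)
print(stats.sem(scores))             # scipy's SE matches the manual one
```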

15
4.1 Student t-test
  • The most basic test used (very handy)
  • For a single group:
  • t = (statistic - hypothesized value) / standard error
  • For a test of proportions:
  • z = (p1 - p2) / sqrt( p(1 - p)(1/n1 + 1/n2) )
  • where p1 - p2 is the difference between the sample
    proportions, p is a weighted average of p1 and p2,
    and the n's are the sample sizes
    (see the sketch below)
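A sketch of both tests with made-up numbers; the proportions test follows the formula above directly:

```python
# Hypothetical data for the two tests named on this slide.
import numpy as np
from scipy.stats import ttest_1samp, norm

# Single group: does mean task time differ from a hypothesized 10 s?
times = [9.1, 10.4, 8.7, 9.9, 9.5, 10.1, 8.9, 9.6]
print(ttest_1samp(times, popmean=10.0))

# Test of proportions: 42/60 successes vs. 30/60 successes
c1, n1, c2, n2 = 42, 60, 30, 60
p1, p2 = c1 / n1, c2 / n2
p = (c1 + c2) / (n1 + n2)                       # weighted average of p1, p2
z = (p1 - p2) / np.sqrt(p * (1 - p) * (1/n1 + 1/n2))
p_value = 2 * norm.sf(abs(z))                   # two-sided p-value
print(z, p_value)
```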

16
4.2 What does a t or a z mean?
  • Look it up in a table!
  • Or use a program to calculate the probability value
    (see the sketch below)
  • (http://members.aol.com/johnp71/pdfs.html)
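In code, the table lookup amounts to one call to the distribution's survival function; the t statistic and degrees of freedom below are assumed for illustration:

```python
# Hypothetical: convert a t statistic of 2.3 with 18 degrees of
# freedom into a two-sided p-value (what the printed tables give).
from scipy.stats import norm, t

p_t = 2 * t.sf(2.3, df=18)     # two-sided p for the t statistic
p_z = 2 * norm.sf(1.96)        # two-sided p for z = 1.96 (about 0.05)
print(p_t, p_z)
```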

17
4.3 Analysis of variance (ANOVA)
  • A t-test checks for a difference between two means;
    ANOVA handles comparisons across more than two
  • Fishing is bad: don't test every single possible
    combination of factors in a large experiment
  • Remember Type I error: something is bound to slip by
    sooner or later
  • Break out your stats program
  • It will calculate your p-values for you, but you
    still need to figure out what it all means
    (see the sketch below)
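A minimal one-way ANOVA sketch with three made-up conditions, using scipy:

```python
# One-way ANOVA on three hypothetical interface conditions.
from scipy.stats import f_oneway

cond_a = [12.0, 11.5, 13.2, 12.8, 11.9]
cond_b = [10.1, 10.8, 9.9, 10.5, 10.2]
cond_c = [12.5, 13.0, 12.2, 12.9, 13.4]

f_stat, p_value = f_oneway(cond_a, cond_b, cond_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p only says *some* group differs; post hoc tests are
# needed to say which, and that is where the Type I warning
# above applies.
```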

18
4.4 ANOVA example
  • Let's say we want to look at a subject's performance
    across a number of days and tests
  • Your stats package might produce a result table
    (a sketch of such an analysis follows below)
  • What does that mean?
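The original slide showed output from a stats package; as a rough modern equivalent (made-up data, assuming statsmodels and pandas are available), a repeated-measures ANOVA across days might be run like this:

```python
# Hypothetical repeated-measures ANOVA: each subject measured on
# several days; does performance change across days?
import pandas as pd
from statsmodels.stats.anova import AnovaRM

data = pd.DataFrame({
    "subject": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "day":     [1, 2, 3] * 4,
    "score":   [55, 61, 66, 48, 53, 59, 60, 64, 70, 51, 58, 62],
})

result = AnovaRM(data, depvar="score", subject="subject",
                 within=["day"]).fit()
print(result)   # F test for the within-subject factor "day"
```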

19
Final tips
  • Be impartial, be fair, be ethical
  • It is easy to lie with statistics
  • Scientific honor code
  • Find a good reference book and use it, a lot
  • Chances are you'll be doing more stats work than you
    thought
  • If you are submitting to CHI, include at least one
    t-test

20
Resources
  • Design and Analysis: A Researcher's Handbook by
    Geoffrey Keppel
  • HyperStat Online
    (http://davidmlane.com/hyperstat/)
  • The Little Handbook of Statistical Practice
    (http://www.tufts.edu/gdallal/LHSP.HTM)