ARCH 2126/6126

Provided by: anu9

1
ARCH 2126/6126
  • Your papers for review please?
  • Session 8: Comparing sample means

2
Imagine, then, two batches of numbers...
  • ... representing, say, lengths of scrapers
    excavated from two sites
  • Are they the same or different?
  • Most unlikely to be exactly the same
  • Are they similar enough that they could be two
    samples from one population?
  • Or are they different enough to make that a very
    unlikely explanation?

3
We have reviewed some theoretical considerations
  • Re-sampling and the central limit theorem
  • Statistical significance
  • p-values: low p = high significance
  • The problem of many tests
  • Type I and type II errors
  • Sir R.A. Fisher: the magic of 5%
  • Classical statistics versus exploratory data
    analysis

4
So how do we reach a p-value, and hence a
conclusion?
  • A formal statistical test
  • E.g. a t test (Student's t)
  • Special case of Analysis of Variance
  • In general the appropriate test for cases like
    the 2 batches of numbers, scraper lengths or
    human statures
  • H0: the 2 batches have equal means
  • H1: they have different means

5
t refers to a distribution, like the normal
distribution
  • Or rather, a set of distributions
  • A different distribution for each possible sample
    size or rather each degree of freedom (d.f.)
  • Degrees of freedom = n - 1
  • The larger the sample, the more t resembles the
    normal (z) distribution
  • For n ≥ 30, they are virtually the same
  • For n = ∞, they are the same
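
The convergence of t towards z can be checked numerically. The sketch below is an illustration only, not part of the course materials: it uses the standard density formula for Student's t, and the helper names (t_pdf, normal_pdf, max_gap) and the range -4 to 4 are assumptions chosen for the example. It prints the largest gap between the two curves for several degrees of freedom.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # lgamma keeps the gamma-function ratio stable for large df
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the standard normal (z) distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def max_gap(df, lo=-4.0, hi=4.0, steps=800):
    """Largest absolute difference between the t and z densities on [lo, hi]."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(abs(t_pdf(x, df) - normal_pdf(x)) for x in xs)

# The gap shrinks as the degrees of freedom grow
for df in (2, 5, 10, 29, 1000):
    print(df, round(max_gap(df), 4))
```

The printed gaps should fall steadily as d.f. rises, which is the sense in which the two distributions become "virtually the same" for larger samples.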

7
The formula looks intimidating
  • It's given in both Drennan (p.156) and Madrigal
    (p.98)
  • But to calculate t, all you need to know is, for
    each sample: the sample size, the mean, and the
    variance (or SD)
  • We have already practised getting these from raw
    data

8
Here is the procedure in words for the (unpaired)
t test
  • To test the hypothesis that two samples could
    have been drawn from the same population, or two
    populations with the same mean
  • Calculate the difference between the means
  • Calculate the joint standard error of the means
  • Divide the first by the second

9
Now let's do it with an example: Drennan's (p.150)
  • Calculate pooled standard deviation
  • ((n1 - 1) x s1^2) + ((n2 - 1) x s2^2)
  • Divide this by (n1 + n2 - 2)
  • Take the square root to get Sp
  • Multiply by square root of (1/n1 + 1/n2)
  • This is pooled standard error SEp
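
The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Drennan's worked example: the function names pooled_sd and pooled_se and the summary figures (n1 = 7, s1 = 2.0, n2 = 8, s2 = 3.0) are made up for the purpose.

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation Sp from two sample sizes and two SDs."""
    # Weight each variance by its degrees of freedom, then add
    weighted = (n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2
    # Divide by the combined degrees of freedom and take the square root
    return math.sqrt(weighted / (n1 + n2 - 2))

def pooled_se(n1, s1, n2, s2):
    """Pooled standard error SEp of the difference between two means."""
    return pooled_sd(n1, s1, n2, s2) * math.sqrt(1 / n1 + 1 / n2)

# Hypothetical summary statistics, for illustration only
n1, s1 = 7, 2.0
n2, s2 = 8, 3.0
print("Sp  =", pooled_sd(n1, s1, n2, s2))
print("SEp =", pooled_se(n1, s1, n2, s2))
```

Keeping full precision until the final print follows the advice to round as late as possible.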

11
To complete calculation of t
  • We scale the difference between the means by SEp
  • I.e. t = (X-bar1 - X-bar2)/SEp
  • When we have this number, we have t
  • What does it mean?
  • Look up in table (p.125)
  • Gives probability of finding such a difference by
    re-sampling one population
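
Starting from raw measurements, the whole calculation of t might look like the sketch below. The two batches of scraper lengths are invented for illustration, and the function name unpaired_t is an assumption; only Python's standard statistics module is used.

```python
import math
import statistics

def unpaired_t(a, b):
    """Return (t, d.f.) for the unpaired, pooled-variance t test."""
    n1, n2 = len(a), len(b)
    # Difference between the two sample means
    mean_diff = statistics.mean(a) - statistics.mean(b)
    # Pooled variance: each sample variance weighted by its d.f.
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    # Pooled standard error of the difference between the means
    se_p = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    return mean_diff / se_p, n1 + n2 - 2

# Hypothetical scraper lengths (cm) from two sites
site_a = [4.1, 5.0, 4.6, 5.3, 4.8]
site_b = [5.9, 6.4, 5.5, 6.1, 6.6, 5.8]
t, df = unpaired_t(site_a, site_b)
print("t =", t, "d.f. =", df)
```

The sign of t only records which mean is larger; swapping the two batches flips the sign and leaves the size unchanged.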

13
More specifically
  • The outcome is a number, a value of the test
    statistic
  • E.g. t = -3.66 (a different example)
  • We also need the degrees of freedom, a number
    usually closely related to the sample size; for
    t, the two sample sizes summed, minus 2, e.g.
    7 + 8 - 2 = 13
  • We need to know whether to apply a one- or
    two-tailed test (normally two)

14
You can also use
  • A statpack e.g. SPSS
  • Or website e.g.
    http://home.clara.net/sisa/index.htm
  • Or even just Excel (calculate the long way, then
    check against TTEST which returns p-value)
  • Round as late as possible
  • Checking is good

15
Interpreting the outcome
  • From value of t plus d.f. we can get a p-value
    from a look-up table or an electronic source
    e.g. 0.01 > p > 0.001
  • We should normally report: the value of the test
    statistic, the degrees of freedom, and the p-value
    or significance level
  • Hence in this case t = 3.66, d.f. = 13,
    0.01 > p > 0.001
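
As one possible "electronic source", the two-tailed p-value can be approximated by integrating the t density numerically. This is a rough sketch under stated assumptions: the function names and the choice of Simpson's rule with 20000 steps are illustrative, and a statistics package would normally do this for you.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=20000):
    """Two-tailed p: area under the t density beyond -|t| and +|t|."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # area between -|t| and +|t|
    return 1 - central

p = two_tailed_p(3.66, 13)
print("p =", p)
print("0.01 > p > 0.001:", 0.001 < p < 0.01)
```

Because the test is two-tailed, t = 3.66 and t = -3.66 give the same p-value.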

16
Assumptions of the t test
  • Random sampling
  • Independence of observations and samples
  • Data are normally distributed
  • Variances are homogeneous (or not)
  • If these conditions are significantly violated,
    need to discuss this with your statistical adviser

17
This is the unpaired t test
  • Other forms of the t test exist
  • Comparison of a single observation with a sample
    mean
  • And paired samples t test (simpler) for use when
    repeated measures are being taken on the same set
    of cases
  • t is a special case of analysis of variance
  • These are all univariate statistics
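
To show why the paired samples version is simpler, here is a minimal sketch: it reduces the two sets of measurements to one set of differences and tests whether their mean is zero. The data (two hypothetical observers measuring the same six artefacts) and the function name paired_t are assumptions made for illustration.

```python
import math
import statistics

def paired_t(first, second):
    """Return (t, d.f.) for a paired samples t test on matched cases."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same number of cases")
    # One difference per case; the test works on these alone
    diffs = [y - x for x, y in zip(first, second)]
    n = len(diffs)
    # Standard error of the mean difference
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1

# Hypothetical repeated measures: two observers, same six artefacts (cm)
observer_1 = [4.1, 5.0, 4.6, 5.3, 4.8, 5.5]
observer_2 = [4.3, 5.1, 4.9, 5.2, 5.0, 5.8]
t, df = paired_t(observer_1, observer_2)
print("t =", t, "d.f. =", df)
```

Here d.f. is the number of pairs minus 1, not the two sample sizes summed minus 2.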