Sampling, sample size estimation, and randomisation - PowerPoint PPT Presentation

Slides: 34
Provided by: george539
Transcript and Presenter's Notes

1
Sampling, sample size estimation, and
randomisation
  • PS302

2
Overview
  • Sampling
  • representative sampling (e.g. for surveys)
  • homogeneous sampling (e.g. for experiments)
  • Sample size estimation
  • Based on power
  • Gathering the information you need
  • Power calculations (G*Power software)
  • - ANOVA
  • - regression
  • Rules of thumb for multivariate tests
  • Presentation of power analysis in your report
  • Practical randomising
  • Random selection (e.g. for surveys)
  • Random allocation (e.g. for experiments)

3
Getting a representative sample
  • Survey of UK Households
  • want a sample from each SES group
  • each age group
  • each sex
  • Proportions should match the population

4
Matching the population
  • Percent of population = percent of sample
  • Assume a sample size of 1200
  • Population: Women 60%, Men 40%
  • ↓
  • Sample: Women 720, Men 480

5
Problem for you to try
  • Population figures
  • Men 65+ years: 1 million
  • Women 65+ years: 1.5 million
  • Men 25-65 years: 8 million
  • Women 25-65 years: 8.5 million
  • Men <25 years: 5 million
  • Women <25 years: 5.2 million

Total population size: 29.2 million
Percent W<25 = (5.2 / 29.2) × 100 = 17.8%
Given a sample size of 200, how many women <25
years should be included?
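The arithmetic for this problem can be sketched in Python. The dictionary keys and the function name `quota` are illustrative only, and rounding to whole people is an assumption: rounded quotas will not always add up exactly to the sample size.

```python
# Population figures from the slide, in millions.
population = {
    "men 65+": 1.0,
    "women 65+": 1.5,
    "men 25-65": 8.0,
    "women 25-65": 8.5,
    "men <25": 5.0,
    "women <25": 5.2,
}

def quota(stratum, sample_size):
    """People to sample from a stratum so that its share of the sample
    matches its share of the population (rounded to whole people)."""
    total = sum(population.values())
    return round(sample_size * population[stratum] / total)

total = sum(population.values())
share = 100 * population["women <25"] / total
print(f"Total population: {total:.1f} million")
print(f"Women <25: {share:.1f}% of the population")
print(f"Quota for a sample of 200: {quota('women <25', 200)}")
```

17.8% of 200 is 35.6, so about 36 of the 200 respondents should be women under 25.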
6
Quota sampling
  • Recruiters are given a quota for each stratum
  • Problem: biased selection by recruiter/interviewer
  • Advantage: random selection is very difficult to
    achieve, so quota sampling is a good compromise

7
Homogeneous sampling
  • Restrict sampling to a narrow group
  • Sample only Warwick students
  • Sample only one sex
  • Sample only one age group
  • Advantages
  • reduces error variance by reducing individual
    differences

8
Homogeneous sampling ctd
  • Disadvantage: may reduce generalisability
  • generalisability will need to be considered and
    assessed separately
  • Suitability
  • experimental work
  • studies where individual differences are not
    directly relevant and power is the more important
    concern

9
Power
  • Probability that any particular (random) sample
    will produce a statistically significant effect
  • E.g. power = 0.9
  • → 90% chance of detecting an effect if there
    really is an effect

Researchers usually aim for power of 80-90%

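The definition of power as "the proportion of random samples that reach significance" can be checked by simulation. The sketch below is an illustration under simplifying assumptions that are not from the slides: two groups, normally distributed scores with a known SD of 1, and a two-sided z-test rather than ANOVA. The function name `simulated_power` is made up for this example.

```python
import random
from statistics import NormalDist

def simulated_power(effect_size, n_per_group, alpha=0.05, n_sims=2000, seed=1):
    """Share of simulated samples in which a two-sided z-test detects a
    true group difference of `effect_size` standard deviations."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    se = (2 / n_per_group) ** 0.5                 # SE of the mean difference (SD = 1)
    hits = 0
    for _ in range(n_sims):
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        treated = [rng.gauss(effect_size, 1) for _ in range(n_per_group)]
        diff = sum(treated) / n_per_group - sum(control) / n_per_group
        if abs(diff / se) > z_crit:
            hits += 1
    return hits / n_sims

# A medium effect (0.5 SD) with 64 participants per group.
print(simulated_power(effect_size=0.5, n_per_group=64))
```

Under these assumptions the estimate should land close to 0.8, i.e. roughly four samples in five detect the effect.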
10
Power and sample size
  • All else being equal, to get more power you need
    more participants

Where "all else" means:
  • reliability of measures
  • other sources of error variance
  • p-value
  • the true size of the effect

11
These concepts are inter-related
  • Desired power ↑, N ↑
  • Acceptable p-value ↓, N ↑
  • Effect size to detect ↓, N ↑
  • Reliability of measures ↓, N ↑
  • Other error variance ↑, N ↑
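How these quantities trade off against N can be sketched with a standard normal-approximation formula for comparing two groups, n per group = 2 × ((z_alpha + z_power) / d)². This formula and the function name `n_per_group` are assumptions for illustration, not something taken from the slides; G*Power's exact calculations will give slightly different numbers.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate participants per group for a two-group comparison,
    using n = 2 * ((z_alpha + z_power) / d) ** 2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance criterion
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

print(n_per_group(0.5))              # baseline: d = 0.5, alpha = .05, power = .8
print(n_per_group(0.5, power=0.9))   # more power      -> larger N
print(n_per_group(0.5, alpha=0.01))  # stricter alpha  -> larger N
print(n_per_group(0.2))              # smaller effect  -> larger N
```

Reliability and other error variance do not appear as separate arguments here: in this sketch they act through `effect_size`, because noisier measures shrink the standardised effect.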

12
If you know these:
  • effect size
  • variance of measures
you can often work out what the sample size
should be.
So where can you find them?
  • Previous research studies

13
Calculating using G*Power
  • First step: assemble the figures needed
  • For between-subjects ANOVA
  • Effect size (Cohen's f, or partial eta squared)
  • Significance level: .05, usually
  • Power: .8, usually
  • Numerator degrees of freedom (df)
  • Number of cells in design (groups)

14
1. Effect size
  • from previous studies
  • Easy: they reported effect sizes
  • There was not a significant main effect of Sex
    on response time, F(1, 42) = 2.03, p = .16, η² =
    0.046
  • Harder: they reported only the F and df, so you
    have to make a calculation
  • partial η² = (df_effect × F) / ((df_effect × F) + df_error)
  • = (1 × 2.03) / ((1 × 2.03) + 42)
  • = 0.046
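The "harder" calculation can be written as a small Python helper. The conversion to Cohen's f, f = sqrt(η² / (1 − η²)), is the standard relationship between the two effect size measures named earlier; the function names are illustrative.

```python
from math import sqrt

def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from a reported F ratio and its df."""
    return (df_effect * f_value) / (df_effect * f_value + df_error)

def cohens_f(eta_sq):
    """Convert (partial) eta squared to Cohen's f."""
    return sqrt(eta_sq / (1 - eta_sq))

# Reported result: F(1, 42) = 2.03
eta_sq = partial_eta_squared(2.03, df_effect=1, df_error=42)
print(round(eta_sq, 3))            # 0.046
print(round(cohens_f(eta_sq), 2))  # 0.22
```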

15
Measures of effect size for ANOVA
Roughly, the correlation between an effect and
the outcome (DV)
  • eta squared
  • The proportion of variance in the outcome
    variable (DV) that is explained by the IV
  • SS_effect / SS_corrected total
  • partial eta squared (SPSS prints this out)
  • The proportion of the effect + error variance
    explained by the effect
  • SS_effect / (SS_effect + SS_error)
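The difference between the two measures is easiest to see with numbers. The sums of squares below are made up purely for illustration (a balanced two-factor design is assumed, so the corrected total is the sum of the parts); they do not come from any study.

```python
def eta_squared(ss_effect, ss_total):
    """Share of all outcome variance explained by the effect."""
    return ss_effect / ss_total

def partial_eta_squared(ss_effect, ss_error):
    """Share of effect + error variance explained by the effect."""
    return ss_effect / (ss_effect + ss_error)

# Hypothetical sums of squares for factors A, B, their interaction, and error.
ss_a, ss_b, ss_ab, ss_error = 20.0, 30.0, 10.0, 140.0
ss_total = ss_a + ss_b + ss_ab + ss_error    # corrected total = 200

print(eta_squared(ss_a, ss_total))           # 0.1
print(partial_eta_squared(ss_a, ss_error))   # 0.125
```

Partial eta squared ignores the variance explained by the other effects, so in a multi-factor design it is at least as large as eta squared.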

16
4. Numerator df
  • There was a non-significant main effect of
    Gender on response time, F(1, 42) = 2.03, p =
    .16, η² = 0.046
  • The numerator df is the first value in the
    brackets after F (here, 1)

17
5. Number of cells (groups)
  • Two-way ANOVA
  • 2 x 3 ANOVA → 6 cells
  • 4 x 2 ANOVA → 8 cells
  • Etc.
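For designs with more factors, the cell count and the numerator df for each effect follow the same pattern. A minimal sketch, assuming a fully crossed between-subjects design (the function names are illustrative):

```python
from math import prod

def n_cells(levels):
    """Number of cells (groups): the product of the levels of each factor."""
    return prod(levels)

def main_effect_dfs(levels):
    """Numerator df for each main effect: levels minus one."""
    return [k - 1 for k in levels]

def interaction_df(levels):
    """Numerator df for the highest-order interaction: product of (levels - 1)."""
    return prod(k - 1 for k in levels)

print(n_cells((2, 3)), main_effect_dfs((2, 3)), interaction_df((2, 3)))  # 6 [1, 2] 2
print(n_cells((4, 2)), main_effect_dfs((4, 2)), interaction_df((4, 2)))  # 8 [3, 1] 3
```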

18
Calculating using G*Power
  • First step: assemble the figures needed
  • For this 2 x 3 between-subjects ANOVA
  • Effect size (η² = 0.046)
  • Significance level: .05, as usual
  • Power: .8, as usual
  • Numerator degrees of freedom (df = 1 and 2 for the
    respective main effects, or 2 for the
    interaction)
  • Number of cells in design (groups = 6)
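The same figures can be fed into a rough Python equivalent of the calculation. This is a sketch under stated assumptions, not a reproduction of G*Power: it assumes SciPy is installed, uses the noncentral F distribution with noncentrality f² × N and error df of N minus the number of cells, and searches for the smallest total N. The function names are illustrative. If the result differs from G*Power's, trust G*Power and check which assumption differs.

```python
from scipy.stats import f as f_dist, ncf

def anova_power(total_n, cohens_f, df_numerator, n_cells, alpha=0.05):
    """Power of one effect in a between-subjects ANOVA with total_n participants."""
    df_error = total_n - n_cells
    noncentrality = cohens_f ** 2 * total_n
    f_crit = f_dist.ppf(1 - alpha, df_numerator, df_error)
    return 1 - ncf.cdf(f_crit, df_numerator, df_error, noncentrality)

def required_n(cohens_f, df_numerator, n_cells, alpha=0.05, power=0.8):
    """Smallest total N whose power reaches the target."""
    n = n_cells + 2
    while anova_power(n, cohens_f, df_numerator, n_cells, alpha) < power:
        n += 1
    return n

# Cohen's f from partial eta squared = 0.046
f_effect = (0.046 / (1 - 0.046)) ** 0.5

# 2 x 3 design: 6 cells; numerator df is 1 or 2 depending on the effect.
for df in (1, 2):
    print("numerator df", df, "-> total N", required_n(f_effect, df, n_cells=6))
```

For the same effect size, the 2-df effects need a larger total N than the 1-df main effect, so the sample size should be planned around the effect you most care about.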

19
Tip: power in ANOVA
  • Each effect in the ANOVA has its own power
  • E.g. 2 x 3 ANOVA
  • Main effect A
  • Main effect B
  • Interaction effect A x B

Tip: power is typically lower for interactions than
for main effects

20
Sample size: ethical issues
  • Too small a sample
  • -- can't detect significant effects
  • → wastes all participants' time
  • Too large a sample
  • -- wastes resources
  • -- wastes the extra participants' time

21
Sample size: practical issues
  • Resources
  • Time
  • Cost of running each participant
  • Availability
  • Clinical populations are often small
  • Access can take time and require permission

22
Choosing an appropriate sample size for
established laboratory paradigms
  • Shortcut
  • Base sample size on sample size used in previous
    research
  • This is often perfectly appropriate
  • (but make sure the previous research is of high
    quality!)

23
Rules of thumb for multivariate tests
  • multiple regression
  • cases (N) / predictors (p)
  • N at least 50 + 8p for testing R²
  • N at least 104 + p for testing an individual predictor
  • Need more cases if the outcome is skewed, the
    anticipated effect size is small, or the measures
    are less reliable
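The two regression rules of thumb are simple enough to wrap in helper functions. The function names are illustrative; when both the overall R² and individual predictors will be tested, taking the larger of the two figures is a cautious reading of the rules rather than something stated on the slide.

```python
def min_n_for_r_squared(n_predictors):
    """Rule of thumb for testing the overall R squared: N >= 50 + 8p."""
    return 50 + 8 * n_predictors

def min_n_for_predictor(n_predictors):
    """Rule of thumb for testing an individual predictor: N >= 104 + p."""
    return 104 + n_predictors

p = 6
print(min_n_for_r_squared(p))   # 98
print(min_n_for_predictor(p))   # 110
print(max(min_n_for_r_squared(p), min_n_for_predictor(p)))  # 110
```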

24
Rules of thumb for multivariate tests
  • PCA (exploratory FA)
  • N = 50: no good
  • N = 100: poor
  • N = 300: good, but ideally need more

25
Random allocation
  • For example
  • 3 between-subjects conditions (e.g. control,
    happy, sad)
  • Who does which condition?
  • First come? Interviewer choice?
  • Must avoid confounds, but you can't check for
    every possible one. The solution is random
    allocation.

26
Random allocation needs truly random numbers
  • Different ways to do that
  • SPSS
  • random.org
  • Research randomiser
  • a scripting language like Python

27
Python
  • to randomly assign 9 participants to 3
    conditions
  • from random import shuffle
  • numbers = [1, 1, 1, 2, 2, 2, 3, 3, 3]
  • shuffle(numbers)
  • numbers
  • [3, 2, 1, 2, 3, 1, 3, 1, 2]   (example output; the order differs on every run)
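The same idea generalises to any number of participants and named conditions. The sketch below is one possible way to do it: the function name `allocate` and the `seed` argument are made up for this example, and the condition labels are borrowed from the random allocation slide. Fixing the seed makes the schedule reproducible, which is useful if you need to document how allocation was done.

```python
import random

def allocate(n_participants, conditions, seed=None):
    """Balanced random allocation: equal numbers per condition, in random order."""
    if n_participants % len(conditions) != 0:
        raise ValueError("participants must divide evenly among conditions")
    per_condition = n_participants // len(conditions)
    schedule = list(conditions) * per_condition
    random.Random(seed).shuffle(schedule)
    return schedule

# Participant 1 is the first person recruited, participant 2 the second, and so on.
schedule = allocate(9, ["control", "happy", "sad"], seed=302)
for participant, condition in enumerate(schedule, start=1):
    print(participant, condition)
```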

28
Research randomiser: http://www.randomizer.org/form.htm
  • 3 conditions, 48 planned participants
  • randomly allocate each participant (identified
    by order of recruitment) to one of 3 conditions
  • How many sets of numbers to generate? 1
  • Numbers per set? 48
  • Number range? From 1 To 3
  • Do you wish each number in a set to remain
    unique? No
  • Don't sort!

29
Result
  • Set 1: 3, 3, 1, 3, 3, 1, 2, 2, 2, 1, 2, 2, 1, 1,
    2, 1, 3, 3, 1, 3, 2, 1, 3, 3, 1, 2, 1, 3, 1, 2,
    2, 2, 3, 3, 1, 3, 3, 1, 1, 3, 1, 3, 3, 2, 3, 3,
    1, 2
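It is worth tallying a result like this before using it. The snippet below simply counts how many participants landed in each condition, using the 48 values from the slide.

```python
from collections import Counter

# The 48 allocations from the slide, in order of recruitment.
result = [
    3, 3, 1, 3, 3, 1, 2, 2, 2, 1, 2, 2, 1, 1,
    2, 1, 3, 3, 1, 3, 2, 1, 3, 3, 1, 2, 1, 3, 1, 2,
    2, 2, 3, 3, 1, 3, 3, 1, 1, 3, 1, 3, 3, 2, 3, 3,
    1, 2,
]

counts = Counter(result)
for condition in sorted(counts):
    print("condition", condition, "n =", counts[condition])
```

The tally comes out at 16, 13 and 19. Because each participant is allocated independently of the others, the group sizes are not guaranteed to be equal; if equal groups matter, shuffle a balanced list instead.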

30
Research randomiser: http://www.randomizer.org/form.htm
  • 3 sentence types, 48 sentences
  • 16 in each group, create a random sequence, but
    limit runs of the same type
  • How many sets of numbers to generate? 16
  • numbers per set? 3
  • Number range? From 1 To 3
  • Do you wish each number in a set to remain
    unique? Yes

31
Research randomiser: http://www.randomizer.org/form.htm
  • 3 types, 48 sentences, 16 of each type
  • limit runs of a given type, while still
    randomising order of presentation
  • 16 sets of 3 unique numbers per set; range from 1
    to 3; unsorted
  • Set 1: 2, 3, 1
  • Set 2: 3, 1, 2
  • Set 3: ...
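The "sets of unique numbers" trick is a form of blocked randomisation, and it can be reproduced in Python. In this sketch (function names and the `seed` argument are illustrative), each block is a random ordering of the three types, so the same type can repeat only across a block boundary and no run can be longer than 2.

```python
import random

def blocked_sequence(types, n_blocks, seed=None):
    """Concatenate n_blocks random orderings of `types`."""
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = list(types)
        rng.shuffle(block)
        sequence.extend(block)
    return sequence

def longest_run(sequence):
    """Length of the longest run of identical consecutive items."""
    longest = current = 0
    previous = object()  # sentinel that equals nothing in the sequence
    for item in sequence:
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
        previous = item
    return longest

# 16 blocks of 3 sentence types = 48 sentences, 16 of each type.
order = blocked_sequence([1, 2, 3], n_blocks=16, seed=48)
print(order)
print("longest run:", longest_run(order))
```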

32
Web links
  • http://www.randomizer.org/
  • http://www.random.org/
