Exercise 19: Sample Size

Slides: 30
Provided by: scs87
Learn more at: https://sites.pitt.edu

Transcript and Presenter's Notes



1
Exercise 19: Sample Size
2
Part One
  • Explore how sample size affects the distribution
    of sample proportions
  • This was achieved by first taking 20 random samples
    with n = 10 and then taking 20 random samples with
    n = 40. Each random sample was then summarized by
    its sample statistic (p-hat).
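
The sampling procedure described above can be sketched in Python. This is only an illustration, not the software used in the exercise: the population list is a hypothetical stand-in built from the on/off counts in the Live tally on the next slide, and the function name `sample_proportions` and the fixed seed are assumptions made for this sketch.

```python
import random

def sample_proportions(population, n, num_samples=20, seed=0):
    """Draw num_samples simple random samples of size n (without
    replacement) and return the proportion of 1s in each sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [sum(rng.sample(population, n)) / n for _ in range(num_samples)]

# Hypothetical stand-in for the class data: 222 "on" (1) and 223 "off" (0).
population = [1] * 222 + [0] * 223

phat_n10 = sample_proportions(population, n=10)
phat_n40 = sample_proportions(population, n=40)
print("n = 10:", phat_n10)
print("n = 40:", phat_n40)
```

Each run with a different seed gives a different set of 20 p-hat values, which is the sample-to-sample variation the exercise sets out to explore.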

3
Tally for Discrete Variable: Live
  • Live   Count  Percent
  • off      223    50.11
  • on       222    49.89
  • N =      445
  • * =        1
  • This verifies that the proportions of students
    living on campus and off campus are each
    approximately 50%. This would be the population
    proportion (p).

4
Mean, Shape, and Standard Deviation
  • What would you expect if 20 random samples of 10
    were taken?
  • What would you expect if 20 random samples of 40
    were taken?
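
One way to form an expectation for these two questions is the standard result that a sample proportion has mean p and standard deviation sqrt(p(1-p)/n) when the draws are approximately independent. The formula is not stated on the slides, so treat the sketch below as an assumption-based aid; `phat_mean_sd` is a name made up for this example, and p = 0.5 is the rounded population proportion from the Live tally.

```python
import math

def phat_mean_sd(p, n):
    """Theoretical mean and standard deviation of a sample proportion,
    assuming (approximately) independent draws."""
    return p, math.sqrt(p * (1 - p) / n)

p = 0.5  # rounded population proportion from the Live tally
for n in (10, 40):
    mean, sd = phat_mean_sd(p, n)
    print(f"n = {n}: mean = {mean:.4f}, sd = {sd:.4f}")
# n = 10: mean = 0.5000, sd = 0.1581
# n = 40: mean = 0.5000, sd = 0.0791
```

Quadrupling the sample size halves the expected spread, while the expected center stays at p.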

5
Results from 20 samples where n = 10, resulting in
phatlive
  • 0.6000 0.5000 0.5000 0.4000 0.5000 0.5556 0.7000 0.4000 0.6000 0.8000
  • 0.3000 0.4000 0.5000 0.4000 0.5000 0.4000 0.5000 0.3000 0.5000 0.6000

6
Descriptive Statistics: phatlive10
  • Variable    N  N*    Mean  SE Mean   StDev
  • phatlive   20   0  0.4978   0.0278  0.1242
  • Minimum      Q1  Median      Q3  Maximum
  •  0.3000  0.4000  0.5000  0.5889   0.8000
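
The summary above can be reproduced from the 20 listed p-hat values. The sketch below uses Python's standard `statistics` module rather than the original statistics package, so it is a cross-check and not the authors' procedure. The default "exclusive" method of `statistics.quantiles` happens to match the quartiles reported here.

```python
import math
import statistics

# The 20 sample proportions listed for n = 10.
phatlive10 = [
    0.6000, 0.5000, 0.5000, 0.4000, 0.5000, 0.5556, 0.7000, 0.4000, 0.6000, 0.8000,
    0.3000, 0.4000, 0.5000, 0.4000, 0.5000, 0.4000, 0.5000, 0.3000, 0.5000, 0.6000,
]

mean = statistics.mean(phatlive10)
stdev = statistics.stdev(phatlive10)            # sample standard deviation (n - 1)
se_mean = stdev / math.sqrt(len(phatlive10))    # standard error of the mean
q1, median, q3 = statistics.quantiles(phatlive10, n=4)

print(round(mean, 4), round(se_mean, 4), round(stdev, 4))
# 0.4978 0.0278 0.1242
print(min(phatlive10), round(q1, 4), round(median, 4), round(q3, 4), max(phatlive10))
# 0.3 0.4 0.5 0.5889 0.8
```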

7
Let's Look at a Stem Plot
  • Stem-and-leaf of phatlive10 (N = 20)
  • Leaf Unit = 0.010
  • 3 00
  • 3
  • 4 00000
  • 4
  • 5 0000000
  • 5 5
  • 6 000
  • 6
  • 7 0
  • 7
  • 8 0

8
Sample Proportions
  • What are the center, spread and shape of the
    distribution of these sample proportions?
  • Center: mean of p-hat = 0.4978
  • Spread: st. dev. = 0.1242
  • Shape: np and n(1-p) both equal 10(0.5) = 5, which
    is less than 10, so the guidelines for normality
    are not met. However, as shown in the stem plot,
    the results appear relatively normal because the
    population proportions are almost perfectly
    balanced at .5 and .5.
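
The normality guideline used in the Shape bullet can be checked with a few lines of Python. This is a minimal sketch; the function name `normal_guideline_met` and the rounded value p = 0.5 are assumptions made for illustration.

```python
def normal_guideline_met(n, p, threshold=10):
    """Return True when both np and n(1 - p) are at least the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

p = 0.5  # rounded population proportion for the Live variable
for n in (10, 40):
    print(f"n = {n}: np = {n * p}, n(1-p) = {n * (1 - p)}, "
          f"guideline met: {normal_guideline_met(n, p)}")
# n = 10: np = 5.0, n(1-p) = 5.0, guideline met: False
# n = 40: np = 20.0, n(1-p) = 20.0, guideline met: True
```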

9
What if the sample size increases?
  • Results from 20 samples where n = 40, resulting in
    phatlive
  • 0.5750 0.4750 0.4500 0.4250 0.4750 0.3250 0.4250 0.4000 0.4250 0.3500
  • 0.5500 0.5000 0.5385 0.4359 0.4500 0.5000 0.4750 0.4250 0.4500 0.4750

10
Descriptive Statistics: phatlive40
  • Variable     N  N*    Mean  SE Mean   StDev
  • phatlive40  20   0  0.4562   0.0137  0.0611
  • Minimum      Q1  Median      Q3  Maximum
  •  0.3250  0.4250  0.4500  0.4938   0.5750

11
Stem-plot for phatlive40
  • N = 20, Leaf Unit = 0.010
  • 3 2
  • 3 5
  • 3
  • 3
  • 4 0
  • 4 22223
  • 4 555
  • 4 7777
  • 4
  • 5 00
  • 5 3
  • 5 5
  • 5 7

12
Sample Proportions for phatlive40
  • What are the center, spread and shape of the
    distribution of these sample proportions?
  • Center: mean = .4562
  • Spread: st. dev. = .0611
  • Shape: np and n(1-p) both equal 40(0.5) = 20, which
    is greater than 10, so the guideline for
    normality is satisfied.

13
Let's compare them simultaneously
  • Descriptive Statistics: phatlive40, phatlive10
  • Variable     N  N*    Mean  SE Mean   StDev  Minimum      Q1  Median
  • phatlive40  20   0  0.4562   0.0137  0.0611   0.3250  0.4250  0.4500
  • phatlive10  20   0  0.4978   0.0278  0.1242   0.3000  0.4000  0.5000
  • Variable        Q3  Maximum
  • phatlive40  0.4938   0.5750
  • phatlive10  0.5889   0.8000
  • How do their centers, spreads and shapes
    compare?

14
Box-plots
15
What does this mean?
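  • Both means should be close to the population
    proportion (about 0.50), and with n = 40 they
    typically land closer. In these 20 samples,
    however, the n = 10 mean (0.4978) happened to be
    closer than the n = 40 mean (0.4562).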
  • The spread is smaller for n = 40 (0.0611 vs.
    0.1242).
  • The shape is more normal for n = 40.

16
As outlined in Chapter 6
  • A random variable X for the count of sampled
    individuals in the category of interest is
    binomial with parameters n and p if:
  • There is a fixed sample size n.
  • Each selection is independent of the others.
  • Each individual sampled takes just two possible
    values.
  • The probability of each individual falling in the
    category of interest is always p.
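
When these four conditions hold, the probability of each possible count X, and so of each possible p-hat = X/n, follows the binomial formula. The sketch below tabulates it for n = 10 and p = 0.5; the function name `binomial_pmf` is an assumption for this example, and the slides do not show this calculation.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k individuals in the category of interest."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5
for k in range(n + 1):
    print(f"p-hat = {k / n:.1f}: probability = {binomial_pmf(k, n, p):.4f}")
```

For n = 10 and p = 0.5 the most likely value is p-hat = 0.5, with probability 252/1024, or about 0.2461.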

17
However
  • The second condition isn't really met when
    sampling without replacement. But as long as the
    population is at least 10n, approximate
    independence can still be assumed.
  • Since the population (445) is greater than 400,
    both sample sizes of 10 and 40 follow this rule.
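
The arithmetic behind this rule is short enough to spell out. The following sketch uses the population size of 445 from the Live tally; the function name `approx_independent` is made up for illustration.

```python
def approx_independent(population_size, n):
    """Rule of thumb: the population should be at least 10 times the sample size."""
    return population_size >= 10 * n

population_size = 445  # N from the Live tally
for n in (10, 40):
    print(f"n = {n}: 10n = {10 * n}, rule satisfied: "
          f"{approx_independent(population_size, n)}")
# n = 10: 10n = 100, rule satisfied: True
# n = 40: 10n = 400, rule satisfied: True
```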

18
Part 2
  • Explores how population shape affects the
    distribution of sample proportion.
  • First, 20 random samples of 10 were taken and
    then 20 random samples of 40 were taken. The
    results were compared.

19
Handedness
  • Tally for Discrete Variable: Handed
  • Handed  Count  Percent
  • ambid      13     2.91
  • left       40     8.97
  • right     393    88.12
  • N =       446
  • The proportion of ambidextrous students is very
    skewed, since only approximately 3% of the
    population is ambidextrous vs. 97% who are not.

20
For Handedness n = 10
  • Variable        N  N*    Mean  SE Mean
  • phathandedn10  20   0  0.0300   0.0164
  • StDev   Min.    Q1  Median    Q3    Max.
  • 0.0733  0.00  0.00    0.00  0.00  0.3000

21
Stem-plot n = 10
  • Stem-and-leaf of phathandedn10
  • N = 20, Leaf Unit = 0.010
  • 0 0000000000000000
  • 1 000
  • 2
  • 3 0
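
Because the stem plot shows every observation, the 20 values can be rebuilt from it and the reported mean (0.0300) and StDev (0.0733) cross-checked. This is a sketch using Python's standard library, with the stems and leaves typed in by hand from the plot above.

```python
import statistics

# (stem, leaves) rows read from the stem plot; leaf unit = 0.010.
rows = [(0, "0000000000000000"), (1, "000"), (2, ""), (3, "0")]
leaf_unit = 0.010

# Each value is (10 * stem + leaf) leaf units.
values = [(10 * stem + int(leaf)) * leaf_unit
          for stem, leaves in rows
          for leaf in leaves]

print(len(values), round(statistics.mean(values), 4),
      round(statistics.stdev(values), 4))
# 20 0.03 0.0733
```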

22
What does this data show?
  • The center or mean is 0.0300
  • The spread is 0.0733
  • The shape is not normal because the guidelines of
    np and n(1-p) both being at least 10 are not met
    (np = 10 × 0.03 = 0.3).

23
Handedness n = 40
  • Descriptive Statistics: phathandedn40
  • Variable        N  N*     Mean  SE Mean    StDev
  • phathandedn40  20   0  0.04000  0.00612  0.02739
  • Minimum       Q1   Median       Q3  Maximum
  • 0.00000  0.02500  0.03750  0.05000  0.10000

24
Stem-plot n = 40
  • Stem-and-leaf of phathandedn40 (N = 20)
  • Leaf Unit = 0.0010
  • 0 000
  • 1
  • 2 5555555
  • 3
  • 4
  • 5 000000
  • 6
  • 7 555
  • 8
  • 9
  • 10 0

25
What does this mean?
  • The center or mean is 0.0400
  • The spread is 0.02739
  • The shape is closer to normal than for n = 10, but
    the guidelines of np and n(1-p) both being at
    least 10 are still not met (np = 40 × 0.03 = 1.2).
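
The guideline can be checked directly against the handedness tally (13 ambidextrous out of 446). The sketch below also estimates how large n would need to be before np reaches 10; the function name `min_n_for_guideline` is an assumption made for illustration, and this calculation does not appear on the slides.

```python
import math

def min_n_for_guideline(p, threshold=10):
    """Smallest n for which both np and n(1 - p) are at least the threshold."""
    return math.ceil(threshold / min(p, 1 - p))

p_ambid = 13 / 446  # population proportion from the Handed tally

for n in (10, 40):
    print(f"n = {n}: np = {n * p_ambid:.2f}")
# n = 10: np = 0.29
# n = 40: np = 1.17

print("smallest n with np >= 10:", min_n_for_guideline(p_ambid))
# smallest n with np >= 10: 344
```

With a population proportion this far from 0.5, a sample of 40 is still well short of what the guideline asks for.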

26
Let's compare them
  • Variable        N  N*    Mean  SE Mean    StDev
  • phathandedn40  20   0  0.0400  0.00612  0.02739
  • phathandedn10  20   0  0.0300  0.0164   0.0733
  • Variable       Minimum       Q1   Median       Q3  Maximum
  • phathandedn40  0.00000  0.02500  0.03750  0.05000  0.10000
  • phathandedn10  0.0000   0.0000   0.0000   0.0000   0.3000

27
Let's compare them
28
What does it mean?
  • By increasing the sample size, the box plot
    became less skewed.
  • There was less spread and there were fewer
    outliers.
  • The center remained at approximately .03 to .04.
  • The shape became more normal.

29
Overall
  • Live seemed to be more normal than handedness.
    This was because the population was not skewed
    for the live variable as it was for handedness.
  • In both situations, n = 40 caused the
    distributions to be more normal.