Some definitions

Slides: 118

Provided by: lave9

1
Some definitions

In Statistics

2
A sample

Is a subset of the population

3
In statistics

One draws conclusions about the population based
on data collected from a sample

4
Reasons

Cost

It is less costly to collect data from a sample
than from the entire population
Accuracy
5
Accuracy
Data from a sample sometimes leads to more
accurate conclusions than data from the entire
population
Costs saved from using a sample can be directed
to obtaining more accurate observations on each
case in the sample
6
Types of Samples

Different types of samples are determined by how
the sample is selected.

7
Convenience Samples

In a convenience sample the subjects that are
most convenient to the researcher are selected as
objects in the sample.
This is not a very good procedure for inferential
Statistical Analysis but is useful for
exploratory preliminary work.

8
Quota samples

In quota samples subjects are chosen conveniently
until quotas are met for different subgroups of
the population.
This also is useful for exploratory preliminary
work.

9
Random Samples

Random samples of a given size are selected in
such a way that all possible samples of that size
have the same probability of being selected.
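
A minimal sketch of drawing a simple random sample, assuming a hypothetical population of 23 ID numbers and a sample size of 5 (both chosen only for illustration). Python's `random.sample` draws without replacement, so every subset of the requested size is equally likely to be chosen.

```python
import random

# Hypothetical population: ID numbers 1..23 (an assumption for illustration).
population = list(range(1, 24))

rng = random.Random(244)              # fixed seed so the draw is repeatable
sample = rng.sample(population, k=5)  # every 5-element subset is equally likely

print(sorted(sample))
```

A convenience sample, by contrast, would amount to something like `population[:5]`, where the selection depends on how the list happens to be ordered.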

Convenience Samples and Quota samples are useful
for preliminary studies. It is however difficult
to assess the accuracy of estimates based on these
types of sampling scheme.
Sometimes however one has to be satisfied with a
convenience sample and assume that it is
equivalent to a random sampling procedure.

11
(No Transcript)
12
Some other definitions
13
A population statistic (parameter)

Any quantity computed from the values of
variables for the entire population.

14
A sample statistic

Any quantity computed from the values of
variables for the cases in the sample.

Since only cases from the sample are observed
only sample statistics are computed
These are used to make inferences about
population statistics
It is important to be able to assess the accuracy
of these inferences

16
To download lectures

Go to the stats 244 web site
Through PAWS or
by going to the website of the department of
Mathematics and Statistics > people > faculty
> W.H. Laverty > Stats 244 > Lectures.
Then
select the lecture
Right click and choose Save as

17
To print lectures

Open the lecture using MS Powerpoint
Select the menu item File > Print

The following dialogue box appears

In the Print what box, select handouts

Set Slides per page to 6 or 3.

21
6 slides per page will result in the least amount
of paper being printed
22
3 slides per page leaves room for notes.
23
Organizing and describing Data
24
Techniques for continuous variables
25
The Grouped frequency table
The Histogram
26
To Construct

A Grouped frequency table
A Histogram

Find the maximum and minimum of the observations.
Choose non-overlapping intervals of equal width
(The Class Intervals) that cover the range
between the maximum and the minimum.
The endpoints of the intervals are called the
class boundaries.
Count the number of observations in each interval
(The cell frequency - f).
Calculate the relative frequency:
relative frequency = f/N
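
The steps above can be sketched in a few lines of Python. This is only an illustration: it uses the Verbal IQ scores from Data Set 3, and the class boundaries (70, 80, ..., 120) and the convention that each interval includes its upper endpoint are assumptions made for the example.

```python
# Verbal IQ scores for the 23 students in Data Set 3.
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]

# Assumed class boundaries (width 10); each interval is (lower, upper].
boundaries = [70, 80, 90, 100, 110, 120]
n = len(verbal_iq)

table = []
for lower, upper in zip(boundaries[:-1], boundaries[1:]):
    f = sum(1 for x in verbal_iq if lower < x <= upper)  # cell frequency f
    table.append((lower, upper, f, f / n))               # relative frequency f/N

for lower, upper, f, rel in table:
    print(f"({lower}, {upper}]  f = {f:2d}  f/N = {rel:.3f}")
```

The cell frequencies add up to N and the relative frequencies add up to 1, which is a quick check that the intervals cover every observation exactly once.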

28
Data Set 3 The following table gives data on
Verbal IQ, Math IQ, Initial Reading Achievement
Score, and Final Reading Achievement Score for 23
students who have recently completed a reading
improvement program

                              Initial      Final
         Verbal   Math        Reading      Reading
Student  IQ       IQ          Achievement  Achievement
 1        86       94         1.1          1.7
 2       104      103         1.5          1.7
 3        86       92         1.5          1.9
 4       105      100         2.0          2.0
 5       118      115         1.9          3.5
 6        96      102         1.4          2.4
 7        90       87         1.5          1.8
 8        95      100         1.4          2.0
 9       105       96         1.7          1.7
10        84       80         1.6          1.7
11        94       87         1.6          1.7
12       119      116         1.7          3.1
13        82       91         1.2          1.8
14        80       93         1.0          1.7
15       109      124         1.8          2.5
16       111      119         1.4          3.0
17        89       94         1.6          1.8
18        99      117         1.6          2.6
19        94       93         1.4          1.4
20        99      110         1.4          2.0
21        95       97         1.5          1.3
22       102      104         1.7          3.1
23       102       93         1.6          1.9
29
In this example the upper endpoint is included in
the interval. The lower endpoint is not.
30
Histogram Verbal IQ
31
Histogram Math IQ
32
Example

In this example we are comparing (for two drugs A
and B) the time to metabolize the drug.
120 cases were given drug A.
120 cases were given drug B.
Data on time to metabolize each drug is given on
the next two slides

33
Drug A
34
Drug B
35
Grouped frequency tables
36
Histogram drug A (time to metabolize)
37
Histogram drug B (time to metabolize)
38
Some comments about histograms

The width of the class intervals should be chosen
so that the number of intervals with a frequency
less than 5 is small.
This means that the width of the class intervals
can decrease as the sample size increases

If the width of the class intervals is too small,
the frequency in each interval will be either 0
or 1.
The histogram will look like this

If the width of the class intervals is too large,
one class interval will contain all of the
observations.
The histogram will look like this

Ideally one wants the histogram to appear as seen
below.
This will be achieved by making the width of the
class intervals as small as possible and only
allowing a few intervals to have a frequency less
than 5.
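
The effect of the interval width can be explored with a short sketch. It uses the Verbal IQ scores from Data Set 3; the helper `cell_frequencies`, the trial widths 2, 10 and 50, and the choice of where the first interval starts are all assumptions made for illustration.

```python
import math

# Verbal IQ scores for the 23 students in Data Set 3.
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]

def cell_frequencies(data, width):
    """Count observations in equal-width intervals of the form (lower, upper]."""
    start = math.floor(min(data) / width) * width
    if start == min(data):
        start -= width  # make sure the minimum falls inside an interval
    counts = {}
    lower = start
    while lower < max(data):
        upper = lower + width
        counts[(lower, upper)] = sum(1 for x in data if lower < x <= upper)
        lower = upper
    return counts

for width in (2, 10, 50):
    freqs = cell_frequencies(verbal_iq, width)
    low = sum(1 for f in freqs.values() if f < 5)
    print(f"width {width}: {len(freqs)} intervals, {low} with frequency < 5")
```

Trying several widths and watching the number of sparsely filled intervals is one practical way to apply the guideline; with only 23 observations a fairly wide interval is needed.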

As the sample size increases the histogram will
approach a smooth curve.
This smooth curve is the histogram of the population

43
N = 25
44
N = 100
45
N = 500
46
N = 2000
47
N = ∞
48
Comment: The proportion of area under a histogram
between two points estimates the proportion of
cases in the sample (and the population) between
those two values.
49
Example: The following histogram displays the
birth weight (in kg) of n = 100 births
50
Find the proportion of births that have a
birthweight less than 0.34 kg.
51
Proportion = (1 + 1 + 3 + 10 + 11 + 19 + 17)/100 = 0.62
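
The calculation can be checked with a short sketch, assuming the seven class frequencies below the cutoff are 1, 1, 3, 10, 11, 19 and 17 (the values that add to 62 out of n = 100).

```python
# Assumed class frequencies for the intervals below the cutoff,
# as read off the histogram of n = 100 births.
frequencies_below = [1, 1, 3, 10, 11, 19, 17]
n = 100

proportion = sum(frequencies_below) / n
print(proportion)  # 0.62
```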
52
The Characteristics of a Histogram

Central Location (average)
Spread (Variability, Dispersion)
Shape

53
Central Location
54
Spread, Dispersion, Variability
55
Shape: Bell Shaped (Normal)
56
Shape: Positively skewed
57
Shape: Negatively skewed
58
Shape: Platykurtic
59
Shape: Leptokurtic
60
Shape: Bimodal
61
The Stem-Leaf Plot

An alternative to the histogram

Each number in a data set can be broken into two
parts
A stem
A Leaf

Example
Verbal IQ = 84
Stem = tens digit = 8
Leaf = units digit = 4
64

Example
Verbal IQ = 104
Stem = tens digits = 10
Leaf = units digit = 4
65
To Construct a Stem-Leaf diagram

Make a vertical list of all stems
Then behind each stem make a horizontal list of
each leaf
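
The two steps above can be sketched as follows, using the Verbal IQ scores from Data Set 3. The helper name `stem_leaf` and the `stem | leaves` print layout are assumptions for illustration; the leaves are also sorted within each stem, and a stem with no observations is simply not printed.

```python
from collections import defaultdict

# Verbal IQ scores for the 23 students in Data Set 3.
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]

def stem_leaf(data):
    """Map each stem (all digits but the last) to its sorted unit-digit leaves."""
    stems = defaultdict(list)
    for x in data:
        stems[x // 10].append(x % 10)
    return {stem: sorted(leaves) for stem, leaves in sorted(stems.items())}

diagram = stem_leaf(verbal_iq)
for stem, leaves in diagram.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```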

66
Example

The data on N = 23 students
Variables
Verbal IQ
Math IQ
Initial Reading Achievement Score
Final Reading Achievement Score

68

We now construct
a stem-Leaf diagram
of Verbal IQ

A vertical list of the stems
8
9
10
11
12

We now list the leaves behind each stem,
taking the observations in order (stem|leaf)
70
8|6 10|4 8|6 10|5 11|8 9|6 9|0 9|5 10|5 8|4 9|4 11|9
8|2 8|0 10|9 11|1 8|9 9|9 9|4 9|9 9|5 10|2 10|2


71
 8 | 6 6 4 2 0 9
 9 | 6 0 5 4 9 4 9 5
10 | 4 5 5 9 2 2
11 | 8 9 1
12 |

73
The leaves may be arranged in order

 8 | 0 2 4 6 6 9
 9 | 0 4 4 5 5 6 9 9
10 | 2 2 4 5 5 9
11 | 1 8 9
12 |

74
The stem-leaf diagram is equivalent to a histogram

 8 | 0 2 4 6 6 9
 9 | 0 4 4 5 5 6 9 9
10 | 2 2 4 5 5 9
11 | 1 8 9
12 |

76
Rotating the stem-leaf diagram we have
(Figure: the rotated diagram, with the horizontal axis marked 80, 90, 100, 110, 120)
77
The two-part stem-leaf diagram

Sometimes you want to break the stems into two
parts:
one part for leaves 0, 1, 2, 3, 4
one part for leaves 5, 6, 7, 8, 9

78
Stem-leaf diagram for Initial Reading Achievement

1. | 01234444455556666677789
2. | 0
This diagram as it stands does not
give an accurate picture of the
distribution

We try breaking the stems into
two parts
1. (leaves 0-4) | 012344444
1. (leaves 5-9) | 55556666677789
2. (leaves 0-4) | 0
2. (leaves 5-9) |

80
The five-part stem-leaf diagram

If the two-part stem-leaf diagram is not adequate
you can break the stems into five parts
  a first part for leaves 0, 1
t for leaves 2, 3
f for leaves 4, 5
s for leaves 6, 7
  a last part for leaves 8, 9

We try breaking the stems into
five parts
1.  | 01
1.t | 23
1.f | 444445555
1.s | 66666777
1.  | 89
2.  | 0

Stem leaf Diagrams
Verbal IQ, Math IQ, Initial RA, Final RA

83
Some Conclusions

Math IQ, Verbal IQ seem to have approximately the
same distribution:
bell shaped, centered about 100
Final RA seems to be larger than initial RA and
more spread out:
improvement in RA,
amount of improvement quite variable

84
Numerical Measures

Measures of Central Tendency (Location)
Measures of Non Central Location
Measure of Variability (Dispersion, Spread)
Measures of Shape

85
Measures of Central Tendency (Location)

Mean
Median
Mode

86
Measures of Non-central Location

Quartiles, Mid-Hinges
Percentiles

87
Measure of Variability (Dispersion, Spread)

Variance, standard deviation
Range
Inter-Quartile Range

88
Measures of Shape

Skewness
Kurtosis

89
Measures of Central Location (Mean)

Summation Notation
Let $x_1, x_2, x_3, \ldots, x_n$ denote a set of n numbers.
Then the symbol
$\sum_{i=1}^{n} x_i$
denotes the sum of these n numbers
$x_1 + x_2 + x_3 + \cdots + x_n$

Example
Let $x_1, x_2, x_3, x_4, x_5$ denote the set of 5
numbers in the following table.

i      1   2   3   4   5
x_i   10  15  21   7  13

Then the symbol
$\sum_{i=1}^{5} x_i$
denotes the sum of these 5 numbers
$x_1 + x_2 + x_3 + x_4 + x_5$
$= 10 + 15 + 21 + 7 + 13$
$= 66$

Meaning of parts of summation notation

In $\sum_{i=m}^{n} (\text{expression in } i)$:
n is the final value for i
the expression in i is each term of the sum
i is the quantity changing in each term of the sum
m is the starting value for i
93

Example
Again let $x_1, x_2, x_3, x_4, x_5$ denote the set
of 5 numbers 10, 15, 21, 7, 13.

Then the symbol
$\sum_{i=2}^{4} x_i^3$
denotes the sum of these 3 numbers
$x_2^3 + x_3^3 + x_4^3 = 15^3 + 21^3 + 7^3$
$= 3375 + 9261 + 343$
$= 12979$
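
Both sums can be reproduced with a short sketch. The dictionary `x`, keyed so that `x[i]` matches $x_i$, is just a convenience chosen for this illustration; the starting and final values of i map onto the arguments of `range`.

```python
# The five numbers from the example, stored so that x[i] matches x_i.
x = {1: 10, 2: 15, 3: 21, 4: 7, 5: 13}

# Sum of x_i for i = 1..5 (starting value 1, final value 5).
total = sum(x[i] for i in range(1, 6))

# Sum of x_i cubed for i = 2..4 (starting value 2, final value 4).
cubes = sum(x[i] ** 3 for i in range(2, 5))

print(total, cubes)  # 66 12979
```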

95
Mean

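Let $x_1, x_2, x_3, \ldots, x_n$ denote a set of n numbers.
Then the mean of the n numbers is defined as
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}$

Example
Again let $x_1, x_2, x_3, x_4, x_5$ denote the set
of 5 numbers 10, 15, 21, 7, 13.

Then the mean of the 5 numbers is
$\bar{x} = \frac{10 + 15 + 21 + 7 + 13}{5} = \frac{66}{5} = 13.2$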
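
As a quick check, the mean of the five example numbers 10, 15, 21, 7, 13 can be computed directly as the sum divided by the count (a minimal sketch; the variable names are arbitrary).

```python
x = [10, 15, 21, 7, 13]

mean = sum(x) / len(x)  # sum of the observations divided by n
print(mean)  # 13.2
```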

98
Interpretation of the Mean

Let $x_1, x_2, x_3, \ldots, x_n$ denote a set of n numbers.
Then the mean, $\bar{x}$, is the centre of gravity of
those n numbers.
That is, if we drew a horizontal line and placed a
weight of one at each value of $x_i$, then the
balancing point of that system of mass is at the
point $\bar{x}$.
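
The balancing property can be illustrated numerically with the example numbers 10, 15, 21, 7, 13: the distances of the weights to the left of the mean add up to the same total as the distances to the right. The sketch below uses `fractions.Fraction` (a choice made here to avoid rounding error, not something the slides prescribe).

```python
from fractions import Fraction

x = [10, 15, 21, 7, 13]
mean = Fraction(sum(x), len(x))  # exact value 66/5

# Signed distance of each unit weight from the mean.
deviations = [xi - mean for xi in x]
print(sum(deviations))  # 0

# Total pull on the left of the mean versus the right.
left = sum(mean - xi for xi in x if xi < mean)
right = sum(xi - mean for xi in x if xi > mean)
print(left == right)  # True
```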

99
(Figure: unit weights placed at x1, x2, x3, x4, ..., xn on a horizontal line)
100
In the Example
(Figure: unit weights at 7, 10, 13, 15 and 21 on an axis marked 0, 10, 20)
101
The mean, $\bar{x}$, is also approximately the center
of gravity of a histogram
102
The Median

Let x1, x2, x3, xn denote a set of n numbers.
Then the median of the n numbers is defined as
the number that splits the numbers into two equal
parts.
To evaluate the median we arrange the numbers in
increasing order.

103

If the number of observations is odd there will
be one observation in the middle.
This number is the median.
If the number of observations is even there will
be two middle observations.
The median is the average of these two
observations
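
The odd and even cases can be written as one small function. This is a sketch of the rule as stated above, with a hypothetical helper name `median`; the two calls use the five-number and six-number data sets from the examples.

```python
def median(values):
    """Middle value (odd n) or average of the two middle values (even n)."""
    ordered = sorted(values)  # arrange in increasing order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([10, 15, 21, 7, 13]))      # 13
print(median([23, 41, 12, 19, 64, 8]))  # 21.0
```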

104

Example
Again let $x_1, x_2, x_3, x_4, x_5$ denote the set
of 5 numbers 10, 15, 21, 7, 13.

105

The numbers arranged in order are
7 10 13 15 21

Unique middle observation (13): the median
106

Example 2
Let $x_1, x_2, x_3, x_4, x_5, x_6$ denote the 6
numbers
23 41 12 19 64 8
Arranged in increasing order these observations
would be
8 12 19 23 41 64

Two middle observations: 19 and 23
107

Median
= average of two middle observations
= (19 + 23)/2 = 21

108
Example

The data on N = 23 students
Variables
Verbal IQ
Math IQ
Initial Reading Achievement Score
Final Reading Achievement Score

110

Computing the Median
Stem leaf Diagrams

Median = middle observation = 12th observation
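
For the Verbal IQ scores in Data Set 3 (N = 23), the middle observation can be located by position after sorting. A minimal sketch; the variable names are chosen only for illustration.

```python
# Verbal IQ scores for the 23 students in Data Set 3.
verbal_iq = [86, 104, 86, 105, 118, 96, 90, 95, 105, 84, 94, 119,
             82, 80, 109, 111, 89, 99, 94, 99, 95, 102, 102]

ordered = sorted(verbal_iq)
middle_position = (len(ordered) + 1) // 2  # (23 + 1) / 2 = 12
median_iq = ordered[middle_position - 1]   # Python lists are 0-indexed

print(middle_position, median_iq)  # 12 96
```

The same value can be read from an ordered stem-leaf diagram by counting leaves from the top until the 12th is reached.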
111
Summary
112
Some Comments

The mean is the centre of gravity of a set of
observations. The balancing point.
The median splits the observations equally in two
parts of approximately 50%

113

The median splits the area under a histogram in
two parts of 50%
The mean is the balancing point of a histogram

(Figure: histogram with the median splitting the area into 50% and 50%)
114

For symmetric distributions the mean and the
median will be approximately the same value

(Figure: symmetric distribution with the median splitting the area into 50% and 50%)
115

For Positively skewed distributions the mean
exceeds the median
For Negatively skewed distributions the median
exceeds the mean
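
As an illustration, the Final Reading Achievement scores in Data Set 3 have a few large values (3.0 and above) stretching the right-hand tail, so they appear to be positively skewed. The sketch below compares their mean and median using Python's `statistics` module.

```python
import statistics

# Final Reading Achievement scores for the 23 students in Data Set 3.
final_ra = [1.7, 1.7, 1.9, 2.0, 3.5, 2.4, 1.8, 2.0, 1.7, 1.7, 1.7, 3.1,
            1.8, 1.7, 2.5, 3.0, 1.8, 2.6, 1.4, 2.0, 1.3, 3.1, 1.9]

mean = statistics.mean(final_ra)
median = statistics.median(final_ra)

print(round(mean, 2), median)  # 2.1 1.9
```

The mean (about 2.1) lies above the median (1.9), in line with the statement for positively skewed distributions.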

(Figure: skewed distribution with the median splitting the area into 50% and 50%)
116

An outlier is a wild observation in the data
Outliers occur because of
errors (typographical and computational)
extreme cases in the population

117

The mean is altered to a significant degree by
the presence of outliers
Outliers have little effect on the value of the
median
This is a reason for using the median in place of
the mean as a measure of central location
Alternatively the mean is the best measure of
central location when the data is Normally
distributed (Bell-shaped)
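
A small sketch of the effect of an outlier, using the five example numbers 7, 10, 13, 15, 21. The outlier is hypothetical: it assumes the value 21 was mistyped as 210.

```python
import statistics

clean = [7, 10, 13, 15, 21]
with_outlier = [7, 10, 13, 15, 210]  # hypothetical typing error: 21 entered as 210

mean_clean = sum(clean) / len(clean)
mean_outlier = sum(with_outlier) / len(with_outlier)

print(mean_clean, statistics.median(clean))           # 13.2 13
print(mean_outlier, statistics.median(with_outlier))  # 51.0 13
```

One wild value moves the mean from 13.2 to 51.0, while the median stays at 13.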
