Review of Basic Statistics - PowerPoint PPT Presentation

Slides: 46
Provided by: ChrisB228

Transcript and Presenter's Notes

1
Review of Basic Statistics
2
Descriptive Statistics Review
  • Measures of Location
    • The Mean
    • The Median
    • The Mode
  • Measures of Dispersion
    • The Variance
    • The Standard Deviation

3
Mean
The mean (or average) is the basic measure of
location or central tendency of the data.
  • The sample mean x̄ is a sample statistic.
  • The population mean µ is a population parameter.

4
Sample Mean
$$\bar{x} = \frac{\sum x_i}{n}$$
where the numerator is the sum of the values of the n
observations, or
$$\sum x_i = x_1 + x_2 + \dots + x_n$$
The Greek letter Σ is the summation sign.
5
Example: College Class Size
We have the following sample of data for 5
college classes: 46, 54, 42, 46, 32
We use the notation x1, x2, x3, x4, and x5 to
represent the number of students in each of the 5
classes:
x1 = 46, x2 = 54, x3 = 42, x4 = 46, x5 = 32
Thus we have
$$\bar{x} = \frac{\sum x_i}{n} = \frac{46 + 54 + 42 + 46 + 32}{5} = \frac{220}{5} = 44$$
The average class size is 44 students.
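As a quick check of the class size arithmetic, the sample mean can be computed in a few lines of Python. This is only a minimal sketch; the function name `sample_mean` is our own choice and is not part of the presentation.

```python
def sample_mean(values):
    """Return the arithmetic mean: the sum of the values divided by n."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(values) / n


# Class sizes for the 5 college classes in the example.
class_sizes = [46, 54, 42, 46, 32]

x_bar = sample_mean(class_sizes)
print(x_bar)  # 44.0
```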
6
Population Mean (µ)
$$\mu = \frac{\sum x_i}{N}$$
The number of observations in the population is
denoted by the upper case N.
The sample mean x̄ is a point estimator of the
population mean µ.
7
Median
The median is the value in the middle when the
data are arranged in ascending order (from
smallest value to largest value).
  1. For an odd number of observations the median is
    the middle value.
  2. For an even number of observations the median is
    the average of the two middle values.

8
The College Class Size example
First, arrange the data in ascending order
32 42 46 46 54
Notice that n = 5, an odd number. Thus the median
is given by the middle value, the third of the five.
The median class size is 46.
9
Median Starting Salary for a Sample of 12
Business School Graduates
A college placement office has obtained the
following data for 12 recent graduates:
Graduate   Starting Salary   Graduate   Starting Salary
1          2850              7          2890
2          2950              8          3130
3          3050              9          2940
4          2880              10         3325
5          2755              11         2920
6          2710              12         2880
10
First we arrange the data in ascending order:
2710 2755 2850 2880 2880 2890 2920 2940
2950 3050 3130 3325
Notice that n = 12, an even number. Thus we take
an average of the middle 2 observations, the sixth
and seventh values: 2890 and 2920.
Thus
$$\text{Median} = \frac{2890 + 2920}{2} = 2905$$
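The two-case rule for the median (middle value for odd n, average of the two middle values for even n) can be sketched in Python as follows. The function name `median` is our own; the data are the class size and starting salary samples from the slides.

```python
def median(values):
    """Median by the two-case rule: the middle value for odd n,
    the average of the two middle values for even n."""
    ordered = sorted(values)  # arrange the data in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd n: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the middle pair


class_sizes = [46, 54, 42, 46, 32]
salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

print(median(class_sizes))  # 46
print(median(salaries))     # 2905.0
```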
11
Mode
The mode is the value that occurs with the
greatest frequency.
Soft Drink Example
Soft Drink     Frequency
Coke Classic   19
Diet Coke      8
Dr. Pepper     5
Pepsi Cola     13
Sprite         5
Total          50
The mode is Coke Classic. A mean or median is
meaningless for qualitative data.
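For qualitative data like the soft drink purchases, finding the mode amounts to picking the category with the largest count. A minimal sketch using Python's `collections.Counter`, with the frequencies copied from the table:

```python
from collections import Counter

# Frequencies from the soft drink example.
frequencies = Counter({
    "Coke Classic": 19,
    "Diet Coke": 8,
    "Dr. Pepper": 5,
    "Pepsi Cola": 13,
    "Sprite": 5,
})

total = sum(frequencies.values())
# most_common(1) returns a list with the single (category, count) pair
# that has the highest count.
mode, count = frequencies.most_common(1)[0]

print(total)        # 50
print(mode, count)  # Coke Classic 19
```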
12
Using Excel to Compute the Mean, Median, and Mode
  • Enter the data into cells A1:B13 for the starting
    salary example.
  • To compute the mean, activate an empty cell,
    enter the following in the formula bar:
    =AVERAGE(B2:B13) and click the green checkmark.
  • To compute the median, activate an empty cell,
    enter the following in the formula bar:
    =MEDIAN(B2:B13) and click the green checkmark.
  • To compute the mode, activate an empty cell,
    enter the following in the formula bar:
    =MODE(B2:B13) and click the green checkmark.

13
The Starting Salary Example
Mean = 2940
Median = 2905
Mode = 2880
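The same three summary measures can also be checked outside Excel. The sketch below uses Python's standard `statistics` module on the 12 starting salaries; the variable names are our own. The results should agree with the values reported on the slide (2940, 2905, and 2880).

```python
import statistics

# Starting salaries for the 12 graduates, in the order given in the table.
salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

mean_salary = statistics.mean(salaries)      # sum of the values divided by n
median_salary = statistics.median(salaries)  # average of the two middle ordered values
mode_salary = statistics.mode(salaries)      # most frequent value

print(mean_salary, median_salary, mode_salary)
```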
14
Variance
  • The variance is a measure of variability that
    uses all the data.
  • The variance is based on the difference between
    each observation (xi) and the mean (x̄ for
    the sample and µ for the population).

15
The variance is the average of the squared
differences between the observations and the mean
value.
For the population:
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$
For the sample:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
16
Standard Deviation
  • The Standard Deviation of a data set is the
    square root of the variance.
  • The standard deviation is measured in the same
    units as the data, making it easy to interpret.

17
Computing a standard deviation
For the population:
$$\sigma = \sqrt{\sigma^2}$$
For the sample:
$$s = \sqrt{s^2}$$
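To make the variance and standard deviation concrete, here is a minimal Python sketch applied to the college class size data from earlier in the deck. The function name `variance` and its `sample` flag are our own. It assumes the usual convention of dividing by n - 1 for a sample and by N for a population; treating the five classes as a full population is done here purely for illustration.

```python
import math


def variance(values, sample=True):
    """Sum of squared deviations from the mean, divided by n - 1
    (sample) or by N (population)."""
    n = len(values)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough observations")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1 if sample else n)


class_sizes = [46, 54, 42, 46, 32]  # mean is 44

s2 = variance(class_sizes)                    # 256 / 4
sigma2 = variance(class_sizes, sample=False)  # 256 / 5
s = math.sqrt(s2)                             # standard deviation, same units as the data

print(s2, sigma2, s)  # 64.0 51.2 8.0
```

The sample standard deviation of 8 is in students, the same unit as the data, which is what makes it easier to interpret than the variance of 64 (students squared).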
18
Measures of Association Between Two Variables
  • Covariance
  • Correlation coefficient

19
Covariance
  • Covariance is a measure of linear association
    between variables.
  • Positive values indicate a positive correlation
    between variables.
  • Negative values indicate a negative correlation
    between variables.

20
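To compute a covariance for variables x and y:
For populations:
$$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$$
For samples:
$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$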
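A sample covariance can be sketched in Python as below. The data are hypothetical, made up only to illustrate the calculation (the slides do not list the underlying observations), and the function name `sample_covariance` is our own. The sketch assumes the usual n - 1 divisor for samples.

```python
def sample_covariance(xs, ys):
    """Sum of the cross-products of deviations from the means,
    divided by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of observations")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross_products / (n - 1)


# Hypothetical data: y tends to fall as x rises.
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 1]

s_xy = sample_covariance(x, y)
print(s_xy)  # -5.5
```

The negative sign reflects the negative linear association in this made-up sample.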
21
[Figure: scatter diagram of the sample, n = 299, divided into quadrants I, II, III, and IV]
22
If the majority of the sample points are located
in quadrants II and IV, you have a negative
correlation between the variables, as we do in
this case.
Thus the covariance will have a negative sign.
23
The (Pearson) Correlation Coefficient
A covariance will tell you if 2 variables are
positively or negatively correlated, but it will
not tell you the degree of correlation. Moreover,
the covariance is sensitive to the unit of
measurement. The correlation coefficient does not
suffer from these defects.
24
The (Pearson) Correlation Coefficient
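For populations:
$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$
For samples:
$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$
Note that
$$-1 \le r_{xy} \le 1$$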
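The claim that the correlation coefficient, unlike the covariance, does not depend on the unit of measurement can be checked numerically. The sketch below uses hypothetical data and our own helper name `sample_correlation`; it divides the sample covariance by the product of the two sample standard deviations, then repeats the calculation after rescaling x by a factor of 100.

```python
import math


def sample_correlation(xs, ys):
    """Pearson correlation: sample covariance divided by the product
    of the sample standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, n >= 2")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return s_xy / (s_x * s_y)


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 1]

r = sample_correlation(x, y)

# Change the unit of x (for example, meters to centimeters).
x_rescaled = [100 * value for value in x]
r_rescaled = sample_correlation(x_rescaled, y)

print(round(r, 4))                  # -0.9839
print(math.isclose(r, r_rescaled))  # True
```

Rescaling x multiplies the covariance by 100 but leaves the correlation coefficient unchanged, and the result stays between -1 and 1.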
26
I have 7 hours per week for exercise
27
Normal Probability Distribution
The normal distribution is by far the most
important distribution for continuous random
variables. It is widely used for making
statistical inferences in both the natural and
social sciences.
28
Normal Probability Distribution
  • It has been used in a wide variety of
    applications:
    • Heights of people
    • Scientific measurements
29
Normal Probability Distribution
  • It has been used in a wide variety of
    applications:
    • Test scores
    • Amounts of rainfall
30
The Normal Distribution
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}$$
where µ is the mean, σ is the standard
deviation, π ≈ 3.14159, and e ≈ 2.71828.
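The normal density, which depends only on the mean, the standard deviation, and the constants π and e, can be evaluated directly. A minimal Python sketch follows; the function name `normal_pdf` and its default arguments are our own.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# The curve peaks at the mean and is symmetric around it.
print(round(normal_pdf(0.0), 4))            # 0.3989
print(normal_pdf(-1.0) == normal_pdf(1.0))  # True

# A larger standard deviation gives a lower, flatter curve at the mean.
print(normal_pdf(0.0, sigma=15) > normal_pdf(0.0, sigma=25))  # True
```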
31
Normal Probability Distribution
  • Characteristics

The distribution is symmetric, and is
bell-shaped.
32
Normal Probability Distribution
  • Characteristics

The entire family of normal probability
distributions is defined by its mean µ and its
standard deviation σ.
[Figure: normal curve over x, with the mean µ and the standard deviation σ marked]
33
Normal Probability Distribution
  • Characteristics

The highest point on the normal curve is at the
mean, which is also the median and mode.
34
Normal Probability Distribution
  • Characteristics

The mean can be any numerical value: negative,
zero, or positive.
[Figure: normal curves over x centered at -10, 0, and 20]
35
Normal Probability Distribution
  • Characteristics

The standard deviation determines the width of
the curve: larger values result in wider, flatter
curves.
[Figure: normal curves over x with σ = 15 and σ = 25]
36
Normal Probability Distribution
  • Characteristics

Probabilities for the normal random variable
are given by areas under the curve. The total
area under the curve is 1 (.5 to the left of the
mean and .5 to the right).
37
The Standard Normal Distribution
The Standard Normal Distribution is a normal
distribution with the special properties that its
mean is zero and its standard deviation is one.
38
Standard Normal Probability Distribution
The letter z is used to designate the standard
normal random variable.
[Figure: standard normal curve over z, centered at 0 with σ = 1]
39
Cumulative Probability
Probability that z ≤ 1 is the area under the
curve to the left of 1.
[Figure: standard normal curve over z with the area to the left of 1 shaded]
40
What is P(z ≤ 1)?
To find out, use the Cumulative Probabilities
Table for the Standard Normal Distribution:
z     .00     .01     .02
⋮     ⋮       ⋮       ⋮
.9    .8159   .8186   .8212
1.0   .8413   .8438   .8461
1.1   .8643   .8665   .8686
1.2   .8849   .8869   .8888
⋮     ⋮       ⋮       ⋮
Reading the row for 1.0 and the column for .00 gives P(z ≤ 1) = .8413.
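The entries in a cumulative probability table can be reproduced in software. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8) to rebuild the four rows shown; the row and column layout is our own reconstruction of how such tables are read, with the row giving z to one decimal and the column giving the second decimal.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

rows = [0.9, 1.0, 1.1, 1.2]   # z to one decimal place
cols = [0.00, 0.01, 0.02]     # second decimal place of z

# table[(row, col)] holds P(z <= row + col), rounded to 4 decimals.
table = {
    (row, col): round(std_normal.cdf(row + col), 4)
    for row in rows
    for col in cols
}

for row in rows:
    print(row, [table[(row, col)] for col in cols])

p_z_le_1 = table[(1.0, 0.00)]
print(p_z_le_1)  # 0.8413
```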
42
Area under the curve
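  • 68.27 percent of the total area under the curve
    is within ±1 standard deviation of the mean.
  • 95.45 percent of the area under the curve is
    within ±2 standard deviations of the mean.

[Figure: standard normal curve over z, with the area between -1 and 1 (68.27 percent) and the area between -2 and 2 (95.45 percent) marked]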
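The areas within one and two standard deviations of the mean can be computed as differences of cumulative probabilities. A minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the helper name `area_within` is our own. To two decimals the results are 68.27 and 95.45 percent.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


within_1 = round(100 * area_within(1), 2)
within_2 = round(100 * area_within(2), 2)

print(within_1)  # 68.27
print(within_2)  # 95.45
```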
43
Exercise 1
  1. What is P(z ≤ 2.46)?
  2. What is P(z ≥ 2.46)?

Answer
  1. P(z ≤ 2.46) = .9931
  2. P(z ≥ 2.46) = 1 - .9931 = .0069
44
Exercise 2
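  1. What is P(z ≤ -1.29)?
  2. What is P(z ≥ -1.29)?

Answer
  1. P(z ≤ -1.29) = 1 - .9015 = .0985
  2. P(z ≥ -1.29) = .9015

Note that, because of the symmetry, the area to
the left of -1.29 is the same as the area to the
right of 1.29:
$$P(z \le -1.29) = P(z \ge 1.29) = 1 - P(z \le 1.29) = 1 - .9015 = .0985$$
[Figure: standard normal curve over z with -1.29 and 1.29 marked; the red-shaded area is equal to the green-shaded area]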
45
Exercise 3
What is P(.00 ≤ z ≤ 1.00)?
P(.00 ≤ z ≤ 1.00) = P(z ≤ 1.00) - P(z ≤ .00) = .8413 - .5000 = .3413
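The three exercises can be verified with the standard normal cumulative distribution function. The sketch below assumes that each exercise asks for a left-tail area, the matching right-tail area, or the area between two z values, which is consistent with the stated answers. It uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are our own.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Exercise 1: areas to the left and to the right of z = 2.46.
ex1_left = round(std_normal.cdf(2.46), 4)
ex1_right = round(1 - std_normal.cdf(2.46), 4)

# Exercise 2: areas to the left and to the right of z = -1.29.
ex2_left = round(std_normal.cdf(-1.29), 4)
ex2_right = round(1 - std_normal.cdf(-1.29), 4)

# Symmetry: the area left of -1.29 equals the area right of +1.29.
symmetric = abs(std_normal.cdf(-1.29) - (1 - std_normal.cdf(1.29))) < 1e-12

# Exercise 3: area between z = 0 and z = 1.
ex3 = round(std_normal.cdf(1.00) - std_normal.cdf(0.00), 4)

print(ex1_left, ex1_right)  # 0.9931 0.0069
print(ex2_left, ex2_right)  # 0.0985 0.9015
print(symmetric)            # True
print(ex3)                  # 0.3413
```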