Title: Measures of Central Tendency
1Measures of Central Tendency
to be or not to be Normal
2TOPICS
- Normal Distributions
- Skewness and Kurtosis
- Normal Curves and Probability
- Z-scores
- Confidence Intervals
- Hypothesis Testing
- The t-distribution
3Is this normal?
4Normal Distributions
- Are your curves normal?
- Why do we care about normal curves?
- What do normal curves tell us?
- Answer:
- The curves tell us something about the distribution of the population
- The curves allow us to make statistical inferences regarding the probability of some outcomes within some margin of error
5The normal distribution
- A distribution is easily depicted in a graph where the height of the line is determined by the frequency of cases for the values beneath it.
- Most cases cluster near the middle of a distribution if it is close to normal
6The Normal Curve
- Bell-shaped distribution or curve
- Perfectly symmetrical about the mean.
- Mean = median = mode
- Tails are asymptotic: they get closer and closer to the horizontal axis but never reach it.
7Skewness and Sample Distributions
Not all curves are normal, even if still
bell-shaped
8Skewness
9Kurtosis (It's not a disease)
- Beyond skewness, kurtosis tells us how peaked or flat our distribution is, and how heavy its tails are, even if it is symmetric.
- The kurtosis value for a normal distribution will equal 3. Anything above this is leptokurtic (a sharper peak with heavier tails) and anything below is platykurtic (a flatter peak with lighter tails).
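A minimal sketch of how skewness and kurtosis can be computed from a sample, assuming the plain moment formulas (some software reports "excess" kurtosis, which is this value minus 3). The function names and the simulated data are illustrative assumptions, not part of the slides.

```python
import random
import statistics


def skewness(data):
    """Third standardized moment: 0 for a perfectly symmetric sample."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(((x - mean) / sd) ** 3 for x in data) / len(data)


def kurtosis(data):
    """Fourth standardized moment: about 3 for normally distributed data."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(((x - mean) / sd) ** 4 for x in data) / len(data)


# Hypothetical data: 50,000 draws from a standard normal distribution.
rng = random.Random(0)
normal_sample = [rng.gauss(0, 1) for _ in range(50_000)]

print(round(skewness(normal_sample), 2))
print(round(kurtosis(normal_sample), 2))
```

For the simulated normal sample the skewness should land near 0 and the kurtosis near 3; a flat sample such as `[1, 2, 3, 4, 5]` gives a kurtosis below 3.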
10Back to normal distributions
- The power of normal distributions, or those close to normal, is that we can predict where cases will fall within a distribution probabilistically.
- For example, what are the odds, given the population parameter of human height, that someone will grow to more than eight feet?
- Answer: likely less than a .025 probability
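The eight-foot question can be sketched as an upper-tail probability under a normal curve. The mean and standard deviation below are assumed values chosen only for illustration (the slides do not give the population parameters), and `prob_taller_than` is a hypothetical helper.

```python
import math

# Illustrative assumptions, not figures from the slides:
MEAN_HEIGHT_IN = 67.0   # assumed population mean height, in inches
SD_HEIGHT_IN = 4.0      # assumed population standard deviation, in inches


def prob_taller_than(height_in, mean=MEAN_HEIGHT_IN, sd=SD_HEIGHT_IN):
    """Upper-tail probability P(X > height_in) for a normal distribution."""
    z = (height_in - mean) / sd
    # erfc keeps precision in the far tail, where 1 - cdf would round to 0.
    return 0.5 * math.erfc(z / math.sqrt(2))


p_eight_feet = prob_taller_than(8 * 12)  # eight feet = 96 inches
print(p_eight_feet < 0.025)
```

Under these assumed parameters the probability is far below .025; with different parameters the exact value changes, but the method stays the same.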
11Sample Distribution
- What does Andre the Giant do to the sample distribution?
- What is the probability of finding someone like Andre in the population?
- Are you ready for more inferential statistics?
- Answer: Oh boy, yes!!
12Normal Curves and probability
- We have answered the question of what Andre and the Sumo wrestler would do to the distribution
- But what about the probability of finding someone the same height as Andre in the population?
- What is the probability of finding someone the same height as Dr. Peña or Dr. Boehmer?
13More on normal curves and probability
Andre would be here
Dr. Boehmer would be here
14Z-Scores (no sleeping!!)
- We can standardize distance from the mean across different samples with z-scores.
- The basic unit of the z-score is the standard deviation.
15We can use the z-score to score each observation as a distance from the mean.
How far is a given observation from the mean when its z-score = 2? Answer: 2 standard deviations.
Approximately what percentage of cases is a given case higher than if its z-score = 2? Answer: about 97.7%.
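The two answers above can be checked with a short sketch. The height figures are assumed for illustration only, and `z_score` is a hypothetical helper; `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist


def z_score(value, mean, sd):
    """Distance of value from the mean, in standard deviation units."""
    return (value - mean) / sd


# Hypothetical example: a height of 75 inches in a population with
# mean 67 and standard deviation 4 (both assumed for illustration).
z = z_score(75, mean=67, sd=4)

# Share of a normal population that falls below a case with this z-score.
share_below = NormalDist().cdf(z)

print(z)
print(round(share_below, 4))
```

Here the z-score is 2, and roughly 97.7% of a normal population falls below it.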
16Random Sampling Error
- Ever hear a poll report a margin of error? What is that?
- Random sampling error = standard deviation / square root of the sample size
- Or: $\sigma / \sqrt{n}$
- As the variance of the population increases, so does the chance that a sample does not reflect the population parameters
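A minimal sketch of the random sampling error calculation, assuming the standard deviation is known. The poll numbers are made up for illustration and `random_sampling_error` is a hypothetical name.

```python
import math


def random_sampling_error(sd, n):
    """Standard deviation divided by the square root of the sample size."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)


# Hypothetical poll: standard deviation of 0.5, 1,000 respondents.
print(round(random_sampling_error(0.5, 1000), 4))
# Same sample size, larger spread in the population: larger error.
print(round(random_sampling_error(1.0, 1000), 4))
```

Doubling the standard deviation doubles the error, while quadrupling the sample size only halves it.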
17Standard Error
- We often use random sampling error to refer both to the chance of erring when sampling and to the error of a specific sample statistic, the mean. For the latter, we typically use the term Standard Error.
- A sample statistic's standard error measures how far the mean of a sample typically falls from the mean of the population from which it is drawn.
18Standard Error
- Example: What if most humans were 200 pounds and only 1 million globally were 250 pounds?
- The random sampling error would be low, since collecting a sample consisting heavily of those heavier humans would be unlikely. There would not be much error in general from sampling because of the low variance.
19Standard Error
- Example continued. Now, when we take a sample, each sample has a mean. If a population has low variance, so should the samples. We should see this reflected in low standard error in the mean of the sample, the sample statistic.
- Of course, higher variance in the population also causes higher error in samples taken from it.
20Some more notation
Distribution              Mean        Standard Dev.
Sample of observed data   $\bar{Y}$   $s$
Population                $\mu$       $\sigma$
Repeated sampling         $\mu$       $\sigma / \sqrt{n}$ (random sampling error)
The error in a sample's mean is the standard error.
21Central Limit Theorem
- Remember that if we took an infinite number of samples from a population, the means of these samples would be normally distributed.
- Hence, the larger the sample, the more likely the sample mean will capture the population mean.
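A small simulation can illustrate the idea, using a finite number of samples in place of the infinite number the theorem imagines. The skewed population, the seed, and the sample sizes are all assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical, strongly right-skewed population (exponential, mean 1).
population = [rng.expovariate(1.0) for _ in range(100_000)]
population_mean = statistics.fmean(population)


def sample_means(sample_size, n_samples=2_000):
    """Means of repeated random samples drawn from the population."""
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


small = sample_means(5)
large = sample_means(100)

# Sample means center on the population mean, and their spread shrinks
# as the sample size grows.
print(round(population_mean, 3))
print(round(statistics.fmean(large), 3), round(statistics.stdev(large), 3))
print(round(statistics.stdev(small), 3))
```

Even though the population itself is far from normal, the means of the larger samples cluster tightly around the population mean.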
22Confidence Intervals
- We can use the information we have about a standard deviation from the mean to calculate the range of values a sample mean would take if it were to fall close to the mean of the population.
- This range is based on the probability that the sample mean falls close to the population mean with a probability of .95, or 5% error.
23How Confident Are You?
- Are you 100% sure?
- Social scientists use 95% as a threshold to test whether or not the results are a product of chance.
- That is, we take a 1 in 20 chance of being wrong
- What do you MEAN?
- We build a 95% confidence interval so that we can be 95% confident the mean falls within that range
24Confidence Interval (CI)
CI = $\bar{Y} \pm Z \cdot \sigma_{\bar{Y}}$
where $\bar{Y}$ = mean, $Z$ = Z score related with a 95% CI, and $\sigma_{\bar{Y}}$ = standard error
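A minimal sketch of building a 95% CI around a sample mean, assuming the standard error is estimated as the sample standard deviation divided by the square root of the sample size. The heights are made-up values and `confidence_interval_95` is a hypothetical helper.

```python
import math
import statistics

Z_95 = 1.96  # z-score associated with a 95% confidence interval


def confidence_interval_95(sample):
    """Return (lower, upper) bounds: mean +/- 1.96 * standard error."""
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - Z_95 * standard_error, mean + Z_95 * standard_error


# Hypothetical heights (in inches) for a small class.
heights = [62, 64, 65, 66, 67, 67, 68, 69, 70, 72]
lower, upper = confidence_interval_95(heights)
print(round(lower, 2), round(upper, 2))
```

For a sample this small, a critical value from the t-distribution would usually replace 1.96; the z-based version is kept here to match the formula on the slide.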
25Building a CI
26CI
Why do we use 1.96?
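One way to see where 1.96 comes from, sketched with `NormalDist` from Python's standard `statistics` module: it is the z-score that leaves 2.5% of a standard normal curve in each tail.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# A 95% interval leaves 5% outside it, split evenly: 2.5% in each tail.
# The cutoff is the z-score with 97.5% of the distribution below it.
z_95 = standard_normal.inv_cdf(0.975)
print(round(z_95, 2))

# Share of cases within 1.96 standard deviations of the mean.
coverage = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)
print(round(coverage, 3))
```

The cutoff rounds to 1.96, and about 95% of cases fall within 1.96 standard deviations of the mean.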
27Calculating a 95% CI
- Let's look at the class population distribution of height
- Is it a normal or skewed distribution?
- Let's build a 95% CI around the mean height of the class
28Why do we care about CI?
- We use CIs for hypothesis testing
- For instance, we want to know if there is an income difference between El Paso and Boston
- We want to know whether or not taking a class at Kaplan makes a difference in our GRE scores
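A rough sketch of how a CI can support this kind of comparison: build a 95% CI around the difference between two sample means and check whether it excludes zero. The income figures are made up, the city labels are placeholders, and `mean_difference_ci` is a hypothetical helper that assumes independent samples and the normal approximation.

```python
import math
import statistics

Z_95 = 1.96


def mean_difference_ci(sample_a, sample_b):
    """95% CI for mean(sample_a) - mean(sample_b), normal approximation."""
    diff = statistics.fmean(sample_a) - statistics.fmean(sample_b)
    # Standard error of a difference between two independent sample means.
    se = math.sqrt(
        statistics.variance(sample_a) / len(sample_a)
        + statistics.variance(sample_b) / len(sample_b)
    )
    return diff - Z_95 * se, diff + Z_95 * se


# Made-up incomes (in thousands of dollars), for illustration only.
city_a = [52, 61, 58, 70, 66, 59, 73, 64, 68, 62]
city_b = [41, 47, 39, 52, 45, 50, 43, 48, 44, 46]

lower, upper = mean_difference_ci(city_a, city_b)
# If the interval excludes 0, the difference is unlikely to be due to
# chance at the 95% level.
print(lower > 0 or upper < 0)
```

With these made-up numbers the interval sits entirely above zero, so the difference would be treated as unlikely to be a product of chance.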
29Mean Difference testing
[Figure: income levels, with mean incomes marked for Boston, Las Cruces, El Paso, and the USA]