Statistical Inference and Hypothesis Testing - PowerPoint PPT Presentation

Slides: 22
Provided by: JasonD92

Transcript and Presenter's Notes

1
Statistical Inference and Hypothesis Testing
  • Political Science 102
  • Introduction to Political Inquiry
  • Lecture 18

2
Hypothesis Testing Procedures
3
Factors Influencing Statistical Significance
4
Errors in Hypothesis Testing
  • In selecting a threshold for judging statistical
    significance, we should balance the costs of type
    I and type II errors
  • Standard scientific practice is to set the threshold
    for rejecting H0 at 95% confidence (i.e., the .05 level)
  • Assumes type II errors are preferable to type I
    errors (a conservative standard)
  • Other thresholds may be more appropriate for
    different problems or applications
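
The trade-off between the two error types can be sketched numerically. This is a minimal illustration, not part of the lecture: it assumes a two-sided z-test with a known standard error, and the function name `type_ii_error` and the effect size of 2 standard errors are hypothetical choices made only for the example.

```python
from statistics import NormalDist

def type_ii_error(alpha, effect_in_se):
    """P(fail to reject H0) for a two-sided z-test when the true mean
    lies effect_in_se standard errors away from the H0 value (assumed setup)."""
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)
    # Under the alternative, the z statistic is distributed N(effect_in_se, 1).
    return std.cdf(z_crit - effect_in_se) - std.cdf(-z_crit - effect_in_se)

# A stricter threshold (smaller alpha) lowers type I risk but raises type II risk.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, effect_in_se=2.0)
    print(f"alpha = {alpha:.2f}  type II error rate = {beta:.3f}")
```

Lowering the type I threshold from .10 to .01 raises the type II rate at a fixed effect size, which is why other thresholds may suit other applications.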

5
Is it a Toss Up?
6
Hypothesis Tests and Sampling Distributions
  • Hypothesis tests depend on estimated confidence
    intervals for the parameter
  • Confidence interval depends on sampling
    distribution of the statistic
  • Distributions of variables may differ
    substantially and idiosyncratically
  • Seems to make problem intractable
  • But we want the sampling distribution of the mean
    (or some other estimator statistic)
  • NOT sampling distribution of parent variable

7
Calculating Confidence Intervals
  • Central Limit Theorem tells us:
  • The sum of a set of independent random variables
    approaches a normal distribution as n approaches ∞
  • Normal distribution is the bell curve
  • The mean (and other sample statistics) are sums
    of random variables
  • We can rely on the normal distribution to test
    hypotheses about the value of population
    parameters based on information from sample
    statistics

8
An Example of the Central Limit Theorem
  • Expected value of a fair coin toss = 0.5
  • Heads = 1, Tails = 0
  • Distribution of coin toss result is NOT normal
  • Result is 0 or 1
  • Bernoulli distribution
  • But the mean value of many coin tosses is
    approximately normally distributed!
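
A small simulation can make the coin-toss example concrete. This is a sketch under assumed settings (100 tosses per sample, 2000 samples, a fixed seed), none of which come from the slides. If the sample means are approximately normal, roughly 95% of them should fall within 1.96 standard errors of 0.5.

```python
import random

def sample_means(n_tosses, n_samples, seed=0):
    """Mean of n_tosses fair coin flips (heads = 1, tails = 0),
    repeated n_samples times."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
        means.append(heads / n_tosses)
    return means

n_tosses = 100                                   # assumed sample size
means = sample_means(n_tosses, n_samples=2000)   # assumed number of samples
grand_mean = sum(means) / len(means)
std_err = (0.5 * 0.5 / n_tosses) ** 0.5          # sqrt(p(1-p)/n) for a fair coin
within = sum(abs(m - 0.5) <= 1.96 * std_err for m in means) / len(means)

print(f"mean of sample means: {grand_mean:.3f}")
print(f"share within 1.96 standard errors of 0.5: {within:.3f}")
```

Each individual toss is still 0 or 1; only the distribution of the sample mean takes on the bell shape.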

9
Statistical Inference
10
Hypothesis Tests
  • Our theories give us hypotheses about population
    parameters
  • µX > µY
  • The value of X has a positive impact on the value
    of Y
  • We can estimate sample statistics
  • The mean of X and Y in our sample
  • We need a way to assess the validity of
    statements about population parameters
  • We can rely on our sample estimators and their
    variance to use probability theory to test such
    statements.

11
Z-Scores & Hypothesis Tests
  • We know that X-bar ~ N( µX , σX )
  • Subtracting µX from both sides, we can see that
  • X-bar - µX ~ N( 0 , σX )
  • Then if we divide by the standard deviation we can
    see that
  • ( X-bar - µX ) / σX ~ N( 0 , 1 )
  • This variable is a "z-score" based on the
    standard normal distribution.
  • 95% of cases are within 1.96 standard deviations
    of the mean.
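
The 1.96 figure can be checked directly against the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`. The helper `z_score` is a hypothetical name, and it assumes the standard deviation passed in is that of the sampling distribution of the mean.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Share of the standard normal distribution within 1.96 standard deviations.
share_within = std_normal.cdf(1.96) - std_normal.cdf(-1.96)

# Going the other way: the cutoff that leaves 2.5% in each tail.
z_for_95 = std_normal.inv_cdf(0.975)

def z_score(sample_mean, mu, sd_of_mean):
    """Standardize: subtract the mean, then divide by the standard deviation."""
    return (sample_mean - mu) / sd_of_mean

print(f"share within +/-1.96: {share_within:.4f}")
print(f"two-sided 95% cutoff: {z_for_95:.4f}")
```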

12
Z-Scores & Hypothesis Tests
  • We can use this to test the hypothesis that µX is
    equal to a particular value given our sample mean
  • Same logic will be used later to test for a
    relationship between X and Y
  • If | X-bar - hypothesized value | / σX > 1.96 then
  • we can reject, at the 95% confidence level, the
    hypothesis that µX = hypothesized value
    (i.e., conclude that µX ≠ hypothesized value)
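
The decision rule above can be written as a short function. This is an illustrative sketch: `z_test` is a hypothetical helper, and the sample mean, hypothesized value, and standard error in the usage lines are made-up numbers, not data from the lecture.

```python
from statistics import NormalDist

def z_test(sample_mean, hypothesized, std_err, z_crit=1.96):
    """Two-sided z-test of H0: population mean == hypothesized.
    std_err is assumed to be the known standard deviation of the sample mean."""
    z = (sample_mean - hypothesized) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, abs(z) > z_crit

# Made-up inputs for illustration only.
for x_bar in (52.0, 54.0):
    z, p, reject = z_test(x_bar, hypothesized=50.0, std_err=1.5)
    print(f"X-bar = {x_bar}: z = {z:.3f}, p = {p:.4f}, reject H0: {reject}")
```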

13
Hypothesis Testing
  • How do you test hypotheses with statistics?
  • Comparing the means of two groups
  • Consider an experiment
  • Research hypothesis: X-bar1 ≠ X-bar2
  • Null hypothesis: X-bar1 = X-bar2

14
Hypothesis Testing
  • Hypothesis: College students read fewer political
    news stories per week than other voting-age
    citizens.
  • X-bar (college mean) = 5
  • µ (population mean) = 10
  • σ (college std. dev.) = 2
  • n = 25

z = ( X-bar - µ ) / ( σ / √n )
15
Hypothesis Testing
  • Hypothesis: College students read fewer political
    news stories per week than other voting-age
    citizens.
  • X-bar (college mean) = 5
  • µ (population mean) = 10
  • σ (college std. dev.) = 2
  • n = 25

z = ( 5 - 10 ) / ( 2 / √25 ) = -12.5
16
Hypothesis Testing
  • Hypothesis: College students read fewer political
    news stories per week than other voting-age
    citizens.
  • 95% confidence
  • z critical = 1.96

z = -12.5
|z| = 12.5 > 1.96, so we reject the null hypothesis.
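
The arithmetic in this example can be reproduced in a few lines. The numbers are the ones on the slides (college mean 5, population mean 10, standard deviation 2, n = 25). The variable names are illustrative.

```python
import math

x_bar = 5.0    # college mean (from the slide)
mu = 10.0      # population mean (from the slide)
sigma = 2.0    # standard deviation (from the slide)
n = 25         # sample size (from the slide)
z_critical = 1.96

std_err = sigma / math.sqrt(n)   # 2 / 5 = 0.4
z = (x_bar - mu) / std_err       # (5 - 10) / 0.4

print(f"standard error = {std_err:.2f}")
print(f"z = {z:.1f}")
print("reject null hypothesis:", abs(z) > z_critical)
```

Because the hypothesis is directional ("fewer"), a one-sided critical value could also be argued for. With a z this far from zero the conclusion is the same either way.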
17
Z-Scores and t-scores
  • Problem with z-scores for testing hypotheses is
    that we generally do not know the true variance
    of X
  • Obvious solution is to substitute the SAMPLE
    variance of X
  • Problem: The sample mean of X divided by its
    estimated standard error (based on the sample
    variance) is the ratio of two random variables,
    and this will not be normally distributed
  • Fortunately, an employee of the Guinness Brewery
    (William Sealy Gosset) figured out this
    distribution in 1908
  • The statistic is called Student's t, and the
    t-distribution looks similar to a normal
    distribution

18
The t-statistic
  • More generally: X-bar / σX-hat ~ t(n-k)
  • k is the number of parameters estimated
  • Note the addition of a degrees of freedom
    constraint
  • Relates to how much independent information we
    have
  • Lose one piece of independent information each
    time we estimate a parameter (like the variance
    of X)
  • Thus the more data points we have relative to the
    number of parameters we are trying to estimate,
    the more the t distribution looks like the z
    distribution.
  • When N > 100 the difference is negligible
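
The convergence of t toward z can be seen by simulation. This sketch is not from the lecture: it draws samples from a standard normal population (so the null hypothesis is true), computes the t-statistic with the sample standard deviation, and counts how often |t| exceeds the z cutoff of 1.96. The sample sizes, number of simulations, and seed are arbitrary choices.

```python
import random

def exceed_rate(n, n_sims=4000, cutoff=1.96, seed=1):
    """Share of simulated samples of size n whose t-statistic
    (true mean 0, variance estimated from the sample) has |t| > cutoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        m = sum(xs) / n
        var = sum((x - m) ** 2 for x in xs) / (n - 1)  # one df lost to the mean
        t = m / (var / n) ** 0.5
        hits += abs(t) > cutoff
    return hits / n_sims

small = exceed_rate(5)     # 4 degrees of freedom
large = exceed_rate(100)   # 99 degrees of freedom

print(f"n = 5:   share with |t| > 1.96 = {small:.3f}")
print(f"n = 100: share with |t| > 1.96 = {large:.3f}")
```

With few observations the t-statistic lands beyond 1.96 noticeably more often than 5% of the time. With about 100 observations the rate is close to the 5% that the z distribution implies.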

19
A Real World Example: Feminists vs.
Environmentalists

. sum v5072 v5059

Variable    Obs       Mean   Std. Dev.   Min   Max
--------------------------------------------------
v5072      1043    66.0326   20.19167      0   100
v5059      1028   56.33171   21.68065      0   100
20
A Real World Example: Feminists vs.
Environmentalists

One-sample t test
----------------------------------------------------------------------------
Variable    Obs       Mean   Std. Err.   Std. Dev.   [95% Conf. Interval]
----------------------------------------------------------------------------
v5059      1028   56.33171   .6762009    21.68065    55.00482   57.65861
----------------------------------------------------------------------------
mean = mean(v5059)                                   t = 9.3637
Ho: mean = 50                       degrees of freedom = 1027

Ha: mean < 50           Ha: mean != 50            Ha: mean > 50
Pr(T < t) = 1.0000      Pr(|T| > |t|) = 0.0000    Pr(T > t) = 0.0000
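
The t-statistic in this output can be recomputed from the summary statistics shown. The sketch below uses only the reported n, mean, and standard deviation for v5059 and the null value of 50. Small differences in the last digits are expected because the inputs are rounded.

```python
import math

n = 1028                  # observations (from the output)
sample_mean = 56.33171    # mean of v5059 (from the output)
sample_sd = 21.68065      # std. dev. of v5059 (from the output)
h0_mean = 50.0            # Ho: mean = 50

std_err = sample_sd / math.sqrt(n)          # estimated standard error of the mean
t_stat = (sample_mean - h0_mean) / std_err  # distance from Ho in standard errors
df = n - 1                                  # one parameter (the mean) estimated

print(f"std. err. = {std_err:.7f}")
print(f"t = {t_stat:.4f} on {df} degrees of freedom")
```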
21
A Real World Example: Feminists vs.
Environmentalists

Paired t test
----------------------------------------------------------------------------
Variable    Obs        Mean   Std. Err.   Std. Dev.   [95% Conf. Interval]
----------------------------------------------------------------------------
v5059      1021    56.34574   .6790026    21.69623    55.01334   57.67814
v5072      1021     66.0431   .6326196    20.21415    64.80171   67.28448
----------------------------------------------------------------------------
diff       1021   -9.697356    .663641    21.20538   -10.99961  -8.395098
----------------------------------------------------------------------------
mean(diff) = mean(v5059 - v5072)                     t = -14.6124
Ho: mean(diff) = 0                  degrees of freedom = 1020

Ha: mean(diff) < 0      Ha: mean(diff) != 0       Ha: mean(diff) > 0
Pr(T < t) = 0.0000      Pr(|T| > |t|) = 0.0000    Pr(T > t) = 1.0000
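
A paired t test is a one-sample t test on the within-respondent differences, which the sketch below illustrates. The first call uses the `diff` row reported in the output. The second uses a tiny made-up data set purely to show the mechanics. The function names are hypothetical.

```python
import math

def t_from_summary(mean_diff, sd_diff, n):
    """t-statistic for Ho: mean(diff) = 0, from summary statistics."""
    std_err = sd_diff / math.sqrt(n)
    return mean_diff / std_err, std_err, n - 1

def paired_t(xs, ys):
    """Paired t test computed from raw paired observations."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    return t_from_summary(mean_diff, sd_diff, n)

# Values taken from the "diff" row of the output.
t_stat, std_err, df = t_from_summary(-9.697356, 21.20538, 1021)
print(f"t = {t_stat:.4f}, std. err. = {std_err:.6f}, df = {df}")

# Made-up ratings for four respondents, for illustration only.
toy_t, toy_se, toy_df = paired_t([50, 60, 40, 70], [60, 65, 55, 80])
print(f"toy example: t = {toy_t:.3f} on {toy_df} degrees of freedom")
```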