Research Methods in Economics - PowerPoint PPT Presentation

Provided by: zikm7

1
Research Methods in Economics

ECO 4451
Sampling and Statistical Testing

2
Sampling Terminology

Population
The complete set of items of interest
Population element
An individual member of population
Census
A complete enumeration of all elements in
population
Sample
A subset of the population selected for
investigation

3
Terminology (Cont)

Frame
Population frame: list of all elements in the
population
Sample frame: list of elements from which the
sample will be drawn

4
Why Sample (not census)?

Cost
A well-designed probability sample is
sufficiently accurate for most purposes
Attempting a complete census can sometimes
decrease accuracy
Destruction of sample units

5
Why Sample (cont)?

But sampling introduces error in that it is
virtually impossible for a sample to perfectly
represent the population from which it was drawn.
Two categories of errors
Non-sampling error
Sampling error

6
Representative?

How well does the sample represent the
population?
[Diagram: a sample is drawn from the population; sample statistics are used to estimate population parameters]
7
What's a population?

Technically the population is the complete set of
elements of interest.
For example, in a study of corporate profits, the
population is the set of profits of all
corporations
We think of univariate, bivariate or multivariate
populations.
If we are interested in whether profits are
related to CEO compensation, we have a bivariate
population where the elements are pairs of
profits and compensation.

8
What makes a good sample?

It must be representative of the population.
Basically this means it must contain the same
variations that exist in the population.
Estimators based on sample must be valid.
Validity depends on
Accuracy
Precision

9
Accuracy is

The degree to which bias is absent from the
estimator.
To have Accuracy
Overestimates and Underestimates must balance out
in repeated sampling.

10
Precision

Is low sampling error.
Repeated samples would yield similar estimates.
Is measured by the standard error of estimate, a
type of standard deviation measurement we will
discuss later.

11
Errors from Investigating a Sample (rather than a
census)

Nonsampling (systematic) error
Results from some imperfection in research design
or mistakes in execution of design.
Sampling frame error
Non-response bias
Response or recording error

12
Systematic (Nonsampling) Errors

Sampling frame error
Some population elements not represented in
sampling frame
Non-response error
When results are affected because some elements
selected into sample do not respond or are not
measured
Response or recording error
Errors in making or recording responses or
measurements

13
Errors from Sample rather than census (Cont)

Sampling (random) error
Difference between sample statistic and
population parameter that results from chance
variation in elements selected for inclusion in
sample.
Two determinants of sampling error
Heterogeneity of the population (a more
heterogeneous population gives larger sampling
error, a more homogeneous one gives smaller)
Sample size (larger sample reduces sampling error)
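The two determinants can be illustrated with a small simulation. This is only a sketch: the two populations are made up (same mean, different spread), and the helper name spread_of_sample_means is an assumption introduced here, not something from the slides.

```python
import random
import statistics

def spread_of_sample_means(population, n, draws=2000, seed=0):
    """Standard deviation of the sample mean across repeated random samples."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]
    return statistics.stdev(means)

# Two made-up populations with the same mean but different variability.
gen = random.Random(1)
homogeneous = [gen.gauss(100, 5) for _ in range(10_000)]
heterogeneous = [gen.gauss(100, 25) for _ in range(10_000)]

for label, pop in [("homogeneous", homogeneous), ("heterogeneous", heterogeneous)]:
    for n in (25, 100):
        # Smaller values mean less sampling error.
        print(label, n, round(spread_of_sample_means(pop, n), 2))
```

The printed spread should be larger for the heterogeneous population and should shrink as n grows from 25 to 100.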

14
Errors

Target Population to Sampling Frame: sampling frame error
Sampling Frame to Planned Sample: sampling error
Planned Sample to Actual Sample: nonresponse error
15
Stages in the Selection of a Sample
Define the target population
Select a sampling frame
Determine if a probability or nonprobability
sampling method will be chosen
Plan procedure for selecting sampling units
Determine sample size
Select actual sampling units
Conduct fieldwork
16
Sampling Units

A single element or group of elements subject to
selection in sample.
When sampling occurs in one stage, the elements
selected in the sample are the sampling units.
Example: simple random sample of college
students.
In multi-stage sampling we distinguish
Primary Sampling Units (PSU): first or top level
Secondary Sampling Units (SSU): second level
Tertiary Sampling Units (TSU): third level

17
Sampling Units (Cont)

Multi-stage sampling
Primary, secondary, tertiary sampling units
Example: first select a region (PSU), then
colleges within the region (SSU), then students
at the colleges (TSU).

18
Two Major Categories of Sampling

Probability sampling
Known, nonzero probability for selecting any
element from sampling frame
This probability may be same or different for
different elements.
Sampling error can be estimated
Nonprobability sampling
Probability of selecting any particular element
of population is unknown
Sampling error is unknown

19
Nonprobability Sampling

Convenience
Judgment
Quota
Snowball

20
Probability Sampling

Simple random sample
Systematic sample
Stratified sample
Cluster sample
Multistage cluster sample
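A minimal sketch of the first three designs, assuming the sampling frame is a simple Python list. The function names, the frame of 100 numbered elements, and the "public"/"private" strata are hypothetical and chosen only for illustration.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every element has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start, then every k-th element of the frame."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Simple random sample of the same fraction within each stratum."""
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

rng = random.Random(0)
frame = list(range(1, 101))                                  # hypothetical frame
strata = {"public": frame[:60], "private": frame[60:]}      # hypothetical strata

srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
strat = stratified_sample(strata, 0.10, rng)
print(sorted(srs), sys_sample, sorted(strat), sep="\n")
```

With a 10 percent fraction, the stratified sample takes 6 elements from the first stratum and 4 from the second, so each stratum is represented in proportion to its size.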

21
What is the Appropriate Sample Design?

Degree of accuracy and precision
Resources available, including time.
Advance knowledge of the population
National versus local
Need for statistical analysis

22
Statistical Analysis of Samples

Descriptive statistics
Describe characteristics of sample
Using sample statistics, like measures of central
tendency and dispersion, to describe a sample of
observations.
Inferential statistics
Make an inference about an unknown population
from a sample
Estimation and hypothesis testing.

23
Descriptive Statistics

Measures of central tendency
Mean, median, mode
Measures of dispersion
Variance (or standard deviation), range
Measures of frequency
Counts, proportions
Often presented in a table.
Possibly separate by different groups or
sub-samples, particularly if your paper involves
a comparison between groups.
Usually some brief discussion of the descriptive
statistics is appropriate.
Give the reader some idea about the type of units
in the sample.
Give the reader a feel for the scale of the data.
Give information about the amount of variation.
Inspection of descriptive statistics often
reveals the source of problems you may be having
with statistical procedures.

24
Frequency Distribution of Deposits
Amount               Frequency (number of people making deposits in each range)
less than 3,000        499
3,000 - 4,999          530
5,000 - 9,999          562
10,000 - 14,999        718
15,000 or more         811
Total                3,120
25
Percentage Distribution of Amounts of Deposits
Amount               Percent
less than 3,000       16
3,000 - 4,999         17
5,000 - 9,999         18
10,000 - 14,999       23
15,000 or more        26
Total                100
26
Probability Distribution of Amounts of Deposits
Amount               Probability
less than 3,000       .16
3,000 - 4,999         .17
5,000 - 9,999         .18
10,000 - 14,999       .23
15,000 or more        .26
Total                1.00
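The percentage and probability distributions follow directly from the frequency counts on the deposits slide. A short sketch, using those counts and variable names chosen here for illustration:

```python
# Frequency counts taken from the deposits frequency distribution.
counts = {
    "less than 3,000": 499,
    "3,000 - 4,999": 530,
    "5,000 - 9,999": 562,
    "10,000 - 14,999": 718,
    "15,000 or more": 811,
}

total = sum(counts.values())
probabilities = {amount: c / total for amount, c in counts.items()}
percents = {amount: round(100 * p) for amount, p in probabilities.items()}

for amount in counts:
    print(f"{amount:<18}{counts[amount]:>6}{percents[amount]:>5}%{probabilities[amount]:>7.2f}")
print(f"{'Total':<18}{total:>6}")
```

Each probability is the relative frequency: the count in a range divided by the total number of depositors.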
27
Measures of Central Tendency

Mean - arithmetic average
μ for the population, X̄ for the sample
Median - midpoint of the distribution
Mode - the value that occurs most often

28
Population Mean
Average value in population.
29
Sample Mean
Where n denotes the total number of elements in
sample.
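In standard notation the two means can be written as below. N, the number of elements in the population, and X_i, the value for element i, are symbols assumed here; the slides name only n.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} X_i
\qquad\qquad
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i
```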
30
Daily Sales Calls by Salespersons
Salesperson    Number of sales calls
Mike            4
Patty           3
Billie          2
Bob             5
John            3
Frank           3
Chuck           1
Samantha        5
Total          26
Sample mean = 3.25, median = 3, mode = 3. Range = 4, interquartile
range = 1, variance = 1.93, std. dev. = 1.39
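The summary figures for the sales calls can be checked with Python's standard statistics module. This is a sketch using the eight values from the slide; the variance here uses the n - 1 divisor, which matches the 1.93 reported.

```python
import statistics

# Daily sales calls from the slide.
calls = {"Mike": 4, "Patty": 3, "Billie": 2, "Bob": 5,
         "John": 3, "Frank": 3, "Chuck": 1, "Samantha": 5}
values = list(calls.values())

mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)
value_range = max(values) - min(values)
variance = statistics.variance(values)   # sample variance, divisor n - 1
std_dev = statistics.stdev(values)       # square root of the sample variance

print(mean, median, mode, value_range, round(variance, 2), round(std_dev, 2))
```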
31
Measures of Dispersion or Spread

Range
Mean absolute deviation
Variance
Standard deviation

32
Sales for Products A and B, Both Average 200
Product A    Product B
196          150
198          160
199          176
199          181
200          192
200          200
200          201
201          202
201          213
201          224
202          240
202          261
But sales of product B have greater variability.
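A quick comparison of the two products shows how similar means can hide very different spreads. The sketch below uses the twelve sales figures from the slide for each product; the helper name summarize is an assumption made here.

```python
import statistics

product_a = [196, 198, 199, 199, 200, 200, 200, 201, 201, 201, 202, 202]
product_b = [150, 160, 176, 181, 192, 200, 201, 202, 213, 224, 240, 261]

def summarize(sales):
    """Return (mean, range, sample standard deviation)."""
    return statistics.mean(sales), max(sales) - min(sales), statistics.stdev(sales)

for name, sales in [("A", product_a), ("B", product_b)]:
    mean, value_range, std_dev = summarize(sales)
    print(f"Product {name}: mean={mean:.1f} range={value_range} std dev={std_dev:.1f}")
```

Both means come out at roughly 200, while the range is 6 for product A and 111 for product B.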
33
Low Dispersion Vs High Dispersion

[Figure: histogram with low dispersion; frequency (1 to 5) against value of variable (150 to 210)]
34
Low Dispersion Vs High Dispersion

[Figure: histogram with high dispersion; frequency (1 to 5) against value of variable (150 to 210)]
35
Deviation Scores

The differences between each observation value
and the mean

36
Average Deviation
37
Mean Squared Deviation
38
Variance = Mean Squared Deviation
39
Sample Variance
40
Variance

The variance is given in squared units
The standard deviation is the square root of
variance, and so is in original units.
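The measures named on the deviation and variance slides can be written in standard textbook notation as follows. This is a sketch of the usual definitions, with X_i, n, N, S and σ as assumed symbols; it is not a reproduction of the original slide formulas.

```latex
\begin{align*}
\text{Deviation score:} \quad & d_i = X_i - \bar{X} \\
\text{Average deviation:} \quad & \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X}) = 0 \\
\text{Mean squared deviation:} \quad & \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 \\
\text{Sample variance:} \quad & S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2,
  \qquad S = \sqrt{S^2} \\
\text{Population variance:} \quad & \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2,
  \qquad \sigma = \sqrt{\sigma^2}
\end{align*}
```

The average deviation is always zero because positive and negative deviations cancel, which is why the deviations are squared. For the sales calls data, the squared deviations sum to 13.5, so S^2 = 13.5 / 7 = 1.93 and S = 1.39.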

41
Population Standard Deviation
42
Sample Standard Deviation
43
Sample Standard Deviation
44
Inferential Statistics

Now instead of using statistics to describe a
sample, we use sample statistics to make
inferences about a population parameter.
For example, we use the sample mean to estimate
the value of the population mean.
Then we may want to test some hypothesis about
the population mean.

45
Distributions

Population distribution - frequency distribution
of elements in population
Sample distribution - frequency distribution of
elements in sample
Sampling distribution - theoretical distribution
of a sample statistic in repeated sampling.
Key concept in inferential statistics.
Example: the sampling distribution of the sample
mean is approximately normal when n is large.

46
Population Distribution

[Figure: population distribution of x, with mean μ and standard deviation σ]
47
Sample Distribution
[Figure: sample distribution of X, with sample mean X̄ and sample standard deviation S]
48
Sampling Distribution
49
The Normal Distribution

Describes the probability distribution expected
of many random occurrences.
Bell shaped curve
Almost all of its values are within plus or minus
3 standard deviations
I.Q. is an example

50
Normal Distribution
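[Figure: normal curve; on each side of the mean, 34.13% of the area lies within 1 standard deviation, 13.59% between 1 and 2 standard deviations, and 2.14% between 2 and 3 standard deviations]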
51
Normal Curve IQ Example
[Figure: normal curve of IQ scores centered at 100, with axis ticks at 70, 85, 100, 115, 145]

52
Standardized Normal Distribution

Symmetrical about its mean
Mean identifies highest point
Infinite number of cases - a continuous
distribution
Total area under the curve equals 1.0
Mean of zero, standard deviation of 1

53
Standard Normal Curve

The curve is bell-shaped and symmetrical
About 68% of the elements will fall within 1
standard deviation of the mean
About 95% of the elements will fall within
approximately 2 (i.e., 1.96) standard deviations
of the mean
Almost all (>99%) of the elements will fall
within 3 standard deviations of the mean
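These shares can be checked against the standard normal cumulative distribution. A minimal sketch using statistics.NormalDist from the Python standard library; the helper names within and band are assumptions introduced here.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)   # standardized normal: mean 0, std dev 1

def within(k):
    """Probability of falling within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

def band(lo, hi):
    """Probability of falling between lo and hi standard deviations."""
    return z.cdf(hi) - z.cdf(lo)

for k in (1, 1.96, 3):
    print(f"within {k} sd: {100 * within(k):.2f}%")

# Areas of the one-standard-deviation bands on one side of the mean.
for lo, hi in ((0, 1), (1, 2), (2, 3)):
    print(f"{lo} to {hi} sd: {100 * band(lo, hi):.2f}%")
```

The band areas reproduce the 34.13, 13.59 and 2.14 percent figures shown on the normal distribution slide.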

54
A Standardized Normal Curve
[Figure: standardized normal curve, z-axis ticks at -2, -1, 0, 1, 2]
55
The Standardized Normal is the Distribution of Z
[Figure: standardized normal curve plotted against z]

56
Population Standardized Scores
57
Standardized Values

Used to compare an individual value to the
population mean in units of the standard deviation
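A standardized value is computed as z = (x - μ) / σ. The sketch below applies this to IQ scores, assuming a population mean of 100 and a standard deviation of 15; the 15 is inferred from the 85, 100, 115 ticks on the IQ slide and is not stated outright in the transcript.

```python
def z_score(x, mu, sigma):
    """Distance of x from the population mean, in standard deviation units."""
    return (x - mu) / sigma

IQ_MEAN, IQ_SD = 100, 15   # assumed values for the IQ example

for score in (70, 85, 100, 115, 130):
    print(score, z_score(score, IQ_MEAN, IQ_SD))
```

A score of 130 is 2 standard deviations above the mean, and a score of 85 is 1 standard deviation below it.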

58
Linear Transformation of Any Normal Variable Into
a Standardized Normal Variable
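[Figure: a normal distribution of X with mean μ and standard deviation σ is mapped onto the standardized normal curve, z-axis ticks at -2, -1, 0, 1, 2]
Sometimes the distribution is stretched
Sometimes the distribution is shrunk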
59
Central Limit Theorem

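The CLT says that if the sample size n is
large,
On average across repeated samples, the mean of
sample means equals the population mean.
The variance of the sample means across different
samples equals the population variance divided by
n.
The distribution of sample means across different
samples is approximately normal, whatever the
shape of the population distribution.
(The first two results hold for any n. The normal
approximation is what requires a large n.)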
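The three statements can be checked by simulation. This is a sketch under stated assumptions: the population is made up (20,000 draws from a skewed exponential distribution), samples are drawn with replacement, and n = 50 is taken as "large".

```python
import random
import statistics

rng = random.Random(42)

# Made-up, strongly skewed population (clearly not normal).
population = [rng.expovariate(1.0) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma2 = statistics.pvariance(population)

n, draws = 50, 3000
sample_means = [statistics.fmean(rng.choices(population, k=n)) for _ in range(draws)]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.variance(sample_means)
se = (sigma2 / n) ** 0.5   # theoretical standard error of the mean

# If the sample means are roughly normal, about 95% lie within 1.96 se of mu.
share_within_196 = sum(abs(m - mu) <= 1.96 * se for m in sample_means) / draws

print("population mean:", round(mu, 3), " mean of sample means:", round(mean_of_means, 3))
print("sigma^2 / n:", round(sigma2 / n, 4), " variance of sample means:", round(var_of_means, 4))
print("share within 1.96 standard errors:", round(share_within_196, 3))
```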

60
Population Parameters and Sample Statistics
61
Review of Simple Statistical Tests

Many research questions can be addressed with
very simple statistical tests.
Often a good research design leads to a simple
test, while a bad design requires complex
statistical procedures for analysis of the data.
Since many classic research questions imply a
comparison between groups, the two-sample (or
multiple-sample) tests are especially useful.

62
Overview

Tests concerning population means.
One sample test.
Two independent samples test.
K > 2 independent samples.
Matched samples.
Tests concerning population proportions.
One sample.
More general tests.

63
Examples of Difference between Means Tests

Consider "Does the Death Penalty Deter Murder?"
by Tammra Hunt.
Compare murder rates (per 100,000) with and
without death penalty.
Cross-section of states: is the murder rate higher
on average in states without executions?
Time series of states: did the murder rate fall
in states implementing the death penalty when
allowed by the Supreme Court?
Panel data allows an approach based on
differences-in-differences.

64
Between-State Differences 2003

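Consider two populations, A (with death penalty)
and B (without death penalty).
Null hypothesis H0: μA = μB (the mean murder
rates are equal)
Alternative hypothesis H1: μA < μB (the mean
murder rate is lower with the death penalty)
Note the alternative is one-sided, because the
research hypothesis is that the death penalty
deters murder.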
65
Testing the null hypothesis

The test can be conducted manually by computing
the t-statistic (note that one sample size is
less than 30).
Or it can be conducted automatically using
statistical software, or Excel.
In Excel, select Tools,
Data Analysis,
t-Test: Two-Sample Assuming Equal Variances
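The manual computation can be sketched in a few lines. The murder rates below are hypothetical numbers invented for illustration, not the 2003 data from the slides, and the function name pooled_t is an assumption. The statistic is the equal-variance (pooled) two-sample t.

```python
import math
import statistics

def pooled_t(a, b):
    """Two-sample t-statistic assuming equal variances; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / df
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, df

# HYPOTHETICAL murder rates per 100,000, for illustration only.
with_penalty = [4.1, 5.6, 6.3, 5.0, 7.2, 4.8]
without_penalty = [5.9, 6.4, 7.1, 5.2, 6.8]

t_stat, df = pooled_t(with_penalty, without_penalty)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

For the one-sided alternative, the null hypothesis is rejected when t falls below the negative critical value from a t table for the given degrees of freedom.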

66
Between-State Differences 2003
67
Within-State Differences

What if we consider within-state differences in
murder rates?
Idea is that if death penalty deters murder, the
murder rate should fall after the penalty is
implemented.
Take the states that adopted the death penalty
after the Court allowed it.
Get the mean murder rate for each of these states
over the 4 years before and the 4 years after.
Test whether the mean is lower after than it is
before.
This is a matched pairs test.
Each state's before period is matched to its
after period.

68
Testing the within-state difference

The test can be conducted manually.
Compute the After minus Before difference for
each state with the death penalty.
Get the sample mean and variance of these
differences.
Test the null hypothesis that the difference is
zero against the alternative that it is negative.
Or it can be conducted automatically using
statistical software, or Excel.
In Excel, select Tools,
Data Analysis,
t-Test: Paired Two Sample for Means
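The manual steps above can be sketched as follows. The before and after rates are hypothetical numbers invented for illustration, and the function name paired_t is an assumption.

```python
import math
import statistics

def paired_t(before, after):
    """Matched-pairs t-statistic on After minus Before; returns (t, df)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)   # standard error of the mean difference
    return mean_diff / se, n - 1

# HYPOTHETICAL mean murder rates per state, for illustration only.
before = [6.2, 7.5, 5.1, 8.0, 6.6]
after = [5.8, 7.7, 4.6, 7.1, 6.0]

t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

A negative t that falls below the negative critical value for n - 1 degrees of freedom supports the alternative that the rate fell after the penalty was implemented.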

69
Within-State Differences
