Research Methods in Economics

Slides: 70
Provided by: zikm7


1
Research Methods in Economics
  • ECO 4451
  • Sampling and Statistical Testing

2
Sampling Terminology
  • Population
  • The complete set of items of interest
  • Population element
  • An individual member of population
  • Census
  • A complete enumeration of all elements in
    population
  • Sample
  • A subset of the population selected for
    investigation

3
Terminology (Cont)
  • Frame
  • Population frame: list of all elements in the
    population
  • Sample frame: list of elements from which the
    sample will be drawn

4
Why Sample (not census)?
  • Cost
  • A well-designed probability sample is
    sufficiently accurate for most purposes
  • Attempting a complete census can sometimes
    decrease accuracy
  • Destruction of sample units

5
Why Sample (cont)?
  • But sampling introduces error in that it is
    virtually impossible for a sample to perfectly
    represent the population from which it was drawn.
  • Two categories of errors
  • Non-sampling error
  • Sampling error

6
Representative?
  • How well does the sample represent the
    population?
  • Population: described by parameters
  • Sample: described by statistics
  • Estimation: sample statistics are used to
    estimate population parameters

7
What's a population?
  • Technically the population is the complete set of
    elements of interest.
  • For example, in a study of corporate profits, the
    population is the set of profits of all
    corporations
  • We think of univariate, bivariate or multivariate
    populations.
  • If we are interested in whether profits are
    related to CEO compensation, we have a bivariate
    population where the elements are the pairs
    (profits, compensation).

8
What makes a good sample?
  • It must be representative of the population.
  • Basically this means it must contain the same
    variations that exist in the population.
  • Estimators based on sample must be valid.
  • Validity depends on
  • Accuracy
  • Precision

9
Accuracy is
  • The degree to which bias is absent from the
    estimator.
  • To have Accuracy
  • Overestimates and Underestimates must balance out
    in repeated sampling.

10
Precision
  • Is low sampling error.
  • Repeated samples would yield similar estimates.
  • Is measured by the standard error of estimate, a
    type of standard deviation measurement we will
    discuss later.

11
Errors from Investigating a Sample (rather than a
census)
  • Nonsampling (systematic) error
  • Results from some imperfection in research design
    or mistakes in execution of design.
  • Sampling frame error
  • Non-response bias
  • Response or recording error

12
Systematic (Nonsampling) Errors
  • Sampling frame error
  • Some population elements not represented in
    sampling frame
  • Non-response error
  • When results are affected because some elements
    selected into sample do not respond or are not
    measured
  • Response or recording error
  • Errors in making or recording responses or
    measurements

13
Errors from Sample rather than census (Cont)
  • Sampling (random) error
  • Difference between sample statistic and
    population parameter that results from chance
    variation in elements selected for inclusion in
    sample.
  • Two determinants of sampling error
  • Heterogeneity (larger sampling error) vs.
    homogeneity (smaller sampling error) of
    population
  • Sample size (larger sample reduces sampling error)
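
The two determinants above can be illustrated with the standard error of the sample mean, sigma / sqrt(n). This is a minimal sketch that assumes simple random sampling from a large population; the sigma values are hypothetical, chosen only to contrast a homogeneous population with a heterogeneous one.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean under simple random sampling."""
    return sigma / math.sqrt(n)


# Hypothetical population standard deviations (not from the slides):
# 5.0 stands for a homogeneous population, 20.0 for a heterogeneous one.
for sigma in (5.0, 20.0):
    for n in (25, 100, 400):
        print(f"sigma={sigma:5.1f}  n={n:4d}  SE={standard_error(sigma, n):.2f}")
```

Quadrupling the sample size halves the standard error, and a more variable population has a larger standard error at every sample size.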

14
Errors
  • Target Population to Sampling Frame: sampling
    frame error
  • Sampling Frame to Planned Sample: sampling error
  • Planned Sample to Actual Sample: nonresponse
    error

15
Stages in the Selection of a Sample
  • Define the target population
  • Select a sampling frame
  • Determine if a probability or nonprobability
    sampling method will be chosen
  • Plan procedure for selecting sampling units
  • Determine sample size
  • Select actual sampling units
  • Conduct fieldwork

16
Sampling Units
  • A single element or group of elements subject to
    selection in sample.
  • When sampling occurs in one stage, the elements
    selected in the sample are the sampling units.
  • Example: simple random sample of college
    students.
  • In multi-stage sampling we distinguish
  • Primary Sampling Units (PSU): first or top level
  • Secondary Sampling Units (SSU): second level
  • Tertiary Sampling Units (TSU): third level

17
Sampling Units (Cont)
  • Multi-stage sampling
  • Primary, secondary, tertiary sampling units
  • Example: first select a region (PSU), then
    colleges within the region (SSU), then students
    at the colleges (TSU).

18
Two Major Categories of Sampling
  • Probability sampling
  • Known, nonzero probability for selecting any
    element from sampling frame
  • This probability may be same or different for
    different elements.
  • Sampling error can be estimated
  • Nonprobability sampling
  • Probability of selecting any particular element
    of population is unknown
  • Sampling error is unknown

19
Nonprobability Sampling
  • Convenience
  • Judgment
  • Quota
  • Snowball

20
Probability Sampling
  • Simple random sample
  • Systematic sample
  • Stratified sample
  • Cluster sample
  • Multistage cluster sample
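
As a rough illustration of the first three designs, the sketch below draws a simple random, a systematic, and a stratified sample from one frame. The frame of 100 element IDs, the two strata, and all function names are hypothetical; real designs need more care (for example, frame sizes that are not a multiple of the sample size).

```python
import random


def simple_random_sample(frame, n, rng):
    """Every element of the frame has the same chance of selection."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Pick a random start, then take every k-th element of the frame."""
    k = len(frame) // n            # sampling interval
    start = rng.randrange(k)       # random start within the first interval
    return [frame[start + i * k] for i in range(n)]


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


rng = random.Random(0)                              # fixed seed for repeatability
frame = list(range(1, 101))                         # hypothetical frame of 100 IDs
strata = {"low": frame[:50], "high": frame[50:]}    # hypothetical strata

srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
strat = stratified_sample(strata, 5, rng)
print(srs, sys_sample, strat, sep="\n")
```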

21
What is the Appropriate Sample Design?
  • Degree of accuracy and precision
  • Resources available, including time.
  • Advance knowledge of the population
  • National versus local
  • Need for statistical analysis

22
Statistical Analysis of Samples
  • Descriptive statistics
  • Describe characteristics of sample
  • Using sample statistics, like measures of central
    tendency and dispersion, to describe a sample of
    observations.
  • Inferential statistics
  • Make an inference about an unknown population
    from a sample
  • Estimation and hypothesis testing.

23
Descriptive Statistics
  • Measures of central tendency
  • Mean, median, mode
  • Measures of dispersion
  • Variance (or standard deviation), range
  • Measures of frequency
  • Counts, proportions
  • Often presented in a table.
  • Possibly separate by different groups or
    sub-samples, particularly if your paper involves
    a comparison between groups.
  • Usually some brief discussion of the descriptive
    statistics is appropriate.
  • Give the reader some idea about the type of units
    in the sample.
  • Give the reader a feel for the scale of the data.
  • Give information about the amount of variation.
  • Inspection of descriptive statistics often
    reveals the source of problems you may be having
    with statistical procedures.

24
Frequency Distribution of Deposits
Amount               Frequency (number of people
                     making deposits in each range)
less than 3,000        499
3,000 - 4,999          530
5,000 - 9,999          562
10,000 - 14,999        718
15,000 or more         811
Total                3,120
25
Percentage Distribution of Amounts of Deposits
Amount               Percent
less than 3,000         16
3,000 - 4,999           17
5,000 - 9,999           18
10,000 - 14,999         23
15,000 or more          26
Total                  100
26
Probability Distribution of Amounts of Deposits
Amount               Probability
less than 3,000         .16
3,000 - 4,999           .17
5,000 - 9,999           .18
10,000 - 14,999         .23
15,000 or more          .26
Total                  1.00
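
The percentage and probability distributions follow directly from the frequency counts in the deposits example. A minimal sketch, using the counts as given on the frequency slide (the variable names are illustrative):

```python
# Frequency counts from the deposits example.
counts = {
    "less than 3,000": 499,
    "3,000 - 4,999": 530,
    "5,000 - 9,999": 562,
    "10,000 - 14,999": 718,
    "15,000 or more": 811,
}

total = sum(counts.values())

# Percentage distribution: share of each range, rounded to whole percents.
percents = {amount: round(100 * n / total) for amount, n in counts.items()}

# Probability distribution: the same shares expressed as proportions.
probabilities = {amount: round(n / total, 2) for amount, n in counts.items()}

for amount in counts:
    print(f"{amount:<16} {counts[amount]:>5} {percents[amount]:>4}% {probabilities[amount]:.2f}")
```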
27
Measures of Central Tendency
  • Mean - arithmetic average
  • µ for the population, X̄ for the sample
  • Median - midpoint of the distribution
  • Mode - the value that occurs most often

28
Population Mean
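Average value in population: $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$,
where N denotes the total number of elements in
the population.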
29
Sample Mean
$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$,
where n denotes the total number of elements in
the sample.
30
Daily Sales Calls by Salespersons
Salesperson      Number of sales calls
Mike               4
Patty              3
Billie             2
Bob                5
John               3
Frank              3
Chuck              1
Samantha           5
Total             26
Sample mean = 3.25, median = 3, mode = 3. Range = 4,
interquartile range = 1, variance = 1.93,
std. dev. = 1.39
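
The summary figures for the sales-call data can be checked with Python's standard `statistics` module. This sketch uses the sample (n - 1) forms of the variance and standard deviation; the interquartile range is left out because its value depends on the quartile convention used.

```python
import statistics

# Daily sales calls by salesperson, from the slide.
calls = {"Mike": 4, "Patty": 3, "Billie": 2, "Bob": 5,
         "John": 3, "Frank": 3, "Chuck": 1, "Samantha": 5}
values = list(calls.values())

mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)
value_range = max(values) - min(values)
variance = statistics.variance(values)   # sample variance, divides by n - 1
std_dev = statistics.stdev(values)       # square root of the sample variance

print(f"mean={mean}, median={median}, mode={mode}, range={value_range}")
print(f"variance={variance:.2f}, std dev={std_dev:.2f}")
```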
31
Measures of Dispersion or Spread
  • Range
  • Mean absolute deviation
  • Variance
  • Standard deviation

32
Sales for Products A and B, Both Average 200
Product A    Product B
   196          150
   198          160
   199          176
   199          181
   200          192
   200          200
   200          201
   201          202
   201          213
   201          224
   202          240
   202          261
But sales of product B have greater variability.
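
A quick way to see the difference in variability is to compare the range and standard deviation of the two series. The sketch assumes the slide's numbers alternate as Product A, Product B pairs, which is how the flattened table reads.

```python
import statistics

# Sales figures, read as alternating (A, B) pairs from the slide.
product_a = [196, 198, 199, 199, 200, 200, 200, 201, 201, 201, 202, 202]
product_b = [150, 160, 176, 181, 192, 200, 201, 202, 213, 224, 240, 261]

for name, sales in (("A", product_a), ("B", product_b)):
    mean = statistics.mean(sales)
    spread = max(sales) - min(sales)       # range
    sd = statistics.stdev(sales)           # sample standard deviation
    print(f"Product {name}: mean={mean:.1f}, range={spread}, std dev={sd:.2f}")

range_a = max(product_a) - min(product_a)
range_b = max(product_b) - min(product_b)
sd_a = statistics.stdev(product_a)
sd_b = statistics.stdev(product_b)
```

Both means are about 200, but every dispersion measure is far larger for Product B.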
33
Low Dispersion Vs High Dispersion

[Figure: histogram with low dispersion; x-axis: Value of Variable (150 to 210), y-axis: Frequency (1 to 5)]
34
Low Dispersion Vs High Dispersion

[Figure: histogram with high dispersion; x-axis: Value of Variable (150 to 210), y-axis: Frequency (1 to 5)]
35
Deviation Scores
  • The differences between each observation value
    and the mean

36
Average Deviation
37
Mean Squared Deviation
38
Variance = Mean Squared Deviation
39
Sample Variance
40
Variance
  • The variance is given in squared units
  • The standard deviation is the square root of
    variance, and so is in original units.
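
For reference, the dispersion measures named on the preceding slides can be written out as below. The slide formulas did not survive in this transcript, so these are the standard textbook definitions, and the notation ($X_i$, $\bar{X}$, $\mu$, $n$, $N$) is an assumption chosen to match the mean slides.

```latex
\begin{align*}
d_i &= X_i - \bar{X} && \text{(deviation score)} \\
\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X}) &= 0 && \text{(average deviation is always zero)} \\
\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 & && \text{(mean squared deviation)} \\
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2 && \text{(population variance)} \\
S^2 &= \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 && \text{(sample variance)} \\
\sigma = \sqrt{\sigma^2}, \quad S &= \sqrt{S^2} && \text{(standard deviations)}
\end{align*}
```

Because positive and negative deviations cancel, the average deviation is useless as a measure of spread, which is why the deviations are squared before averaging.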

41
Population Standard Deviation
42
Sample Standard Deviation
43
Sample Standard Deviation
44
Inferential Statistics
  • Now instead of using statistics to describe a
    sample, we use sample statistics to make
    inferences about a population parameter.
  • For example, we use the sample mean to estimate
    the value of the population mean.
  • Then we may want to test some hypothesis about
    the population mean.

45
Distributions
  • Population distribution - frequency distribution
    of elements in population
  • Sample distribution - frequency distribution of
    elements in sample
  • Sampling distribution - theoretical distribution
    of a sample statistic in repeated sampling.
  • Key concept in inferential statistics.
  • Example: the sampling distribution of the sample
    mean is approximately normal when n is large.

46
Population Distribution

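[Figure: population distribution of x, with the mean µ and one standard deviation (σ) on either side marked]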
47
Sample Distribution
[Figure: sample distribution of X, with the sample mean X̄ and sample standard deviation S marked]
48
Sampling Distribution
49
The Normal Distribution
  • Describes the probability distribution expected
    of many random occurrences.
  • Bell shaped curve
  • Almost all of its values are within plus or minus
    3 standard deviations of the mean
  • I.Q. is an example

50
Normal Distribution
[Figure: normal curve with areas of 2.14%, 13.59%, 34.13%, 34.13%, 13.59% and 2.14% between successive standard deviations]
51
Normal Curve IQ Example
[Figure: normal curve of IQ scores; axis values shown: 70, 85, 100, 115, 145]

52
Standardized Normal Distribution
  • Symmetrical about its mean
  • Mean identifies highest point
  • Infinite number of cases - a continuous
    distribution
  • Total area under the curve equals 1.0
  • Mean of zero, standard deviation of 1

53
Standard Normal Curve
  • The curve is bell-shaped and symmetrical
  • About 68% of the elements will fall within 1
    standard deviation of the mean
  • About 95% of the elements will fall within
    approximately 2 (i.e., 1.96) standard deviations
    of the mean
  • Almost all (>99%) of the elements will fall
    within 3 standard deviations of the mean
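
These percentages can be reproduced from the standard normal cumulative distribution function. A minimal sketch using `statistics.NormalDist` from the Python standard library (the helper name `prob_within` is illustrative):

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_within(k: float) -> float:
    """Probability that a normal value lies within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)


for k in (1, 1.96, 3):
    print(f"within {k} standard deviations: {prob_within(k):.4f}")
```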

54
A Standardized Normal Curve
[Figure: standardized normal curve; z axis marked at -2, -1, 0, 1, 2]
55
The Standardized Normal is the Distribution of Z
[Figure: standardized normal curve plotted over the z axis]

56
Population Standardized Scores
57
Standardized Values
  • Used to compare an individual value to the
    population mean in units of the standard deviation
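
A standardized value is computed as z = (x - µ) / σ. The sketch below applies this to IQ scores, assuming a mean of 100 and a standard deviation of 15; those two parameters are an assumption read off the axis values of the IQ figure, not stated in the slides.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Distance of x from the population mean, in standard deviation units."""
    return (x - mu) / sigma


IQ_MEAN, IQ_SD = 100, 15  # assumed parameters for the IQ example

for iq in (70, 85, 100, 115, 145):
    print(f"IQ {iq}: z = {z_score(iq, IQ_MEAN, IQ_SD):+.1f}")
```

An IQ of 115 is one standard deviation above the mean (z = +1), while 70 is two standard deviations below it (z = -2).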

58
Linear Transformation of Any Normal Variable Into
a Standardized Normal Variable
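[Figure: a normal curve for X (mean µ, standard deviation σ) mapped onto the standardized normal curve; z axis marked at -2, -1, 0, 1, 2]
Sometimes the distribution is stretched
Sometimes the distribution is shrunk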
59
Central Limit Theorem
  • The CLT says that if the sample size n is
    large:
  • On average across repeated samples, the mean of
    sample means equals the population mean.
  • The variance of the sample means across different
    samples equals the population variance divided by
    n.
  • The distribution of sample means across different
    samples is approximately normal.
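
A small simulation can make the three statements concrete. This is a sketch under stated assumptions: the population is uniform on [0, 1) (mean 0.5, variance 1/12), which is deliberately not normal, and the sample size, number of samples, and seed are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 30                    # size of each sample
num_samples = 5000        # number of repeated samples

pop_mean = 0.5            # mean of a uniform [0, 1) population
pop_var = 1 / 12          # variance of a uniform [0, 1) population

# Draw repeated samples and record each sample mean.
sample_means = [
    statistics.fmean([rng.random() for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.variance(sample_means)

# If the sample means are roughly normal, about 95% of them should lie
# within 1.96 standard errors of the population mean.
se = math.sqrt(pop_var / n)
share_within = sum(abs(m - pop_mean) <= 1.96 * se for m in sample_means) / num_samples

print(f"mean of sample means: {mean_of_means:.4f} (population mean {pop_mean})")
print(f"variance of sample means: {var_of_means:.5f} (population variance / n = {pop_var / n:.5f})")
print(f"share within 1.96 standard errors: {share_within:.3f}")
```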

60
Population Parameters and Sample Statistics
61
Review of Simple Statistical Tests
  • Many research questions can be addressed with
    very simple statistical tests.
  • Often a good research design leads to a simple
    test, while a bad design requires complex
    statistical procedures for analysis of the data.
  • Since many classic research questions imply a
    comparison between groups, the two-sample (or
    multiple-sample) tests are especially useful.

62
Overview
  • Tests concerning population means.
  • One sample test.
  • Two independent samples test.
  • K > 2 independent samples.
  • Matched samples.
  • Tests concerning population proportions.
  • One sample.
  • More general tests.

63
Examples of Difference between Means Tests
  • Consider "Does the Death Penalty Deter Murder?"
    by Tammra Hunt.
  • Compare murder rates (per 100,000) with and
    without death penalty.
  • Cross-section of states: is the murder rate
    higher on average in states without executions?
  • Time series of states: did the murder rate fall
    in states implementing the death penalty when
    allowed by Supreme Court?
  • Panel data allows an approach based on
    differences-in-differences.

64
Between-State Differences 2003
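  • Consider two populations, A (with death penalty)
    and B (without death penalty).
  • Null hypothesis: the mean murder rates are equal,
    $H_0: \mu_A = \mu_B$.
  • Alternative hypothesis: the mean murder rate is
    lower in A, $H_1: \mu_A < \mu_B$.
  • Note: the alternative is one-sided, because the
    research hypothesis is that the death penalty
    deters murder.
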
65
Testing the null hypothesis
  • The test can be conducted by computing the
    t-statistic manually (note: one sample size is
    less than 30).
  • Or it can be conducted automatically using
    statistical software, or Excel.
  • In Excel, select Tools,
  • Data Analysis,
  • t-Test: Two-Sample Assuming Equal Variances
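
The manual calculation can be sketched as a pooled-variance (equal variances) two-sample t-statistic. The numbers below are hypothetical placeholders, not the 2003 state data, and the function name is illustrative; the resulting statistic would be compared with a one-sided critical value from the t distribution with n_A + n_B - 2 degrees of freedom.

```python
import math
import statistics


def pooled_t(a, b):
    """Two-sample t-statistic assuming equal population variances.

    Returns (t, degrees_of_freedom) for the difference mean(a) - mean(b).
    """
    n_a, n_b = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, n_a + n_b - 2


# Hypothetical placeholder values, for illustration only.
group_a = [4.0, 5.0, 6.0]   # stands in for population A
group_b = [5.0, 6.0, 7.0]   # stands in for population B

t_stat, df = pooled_t(group_a, group_b)
print(f"t = {t_stat:.4f}, df = {df}")
```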

66
Between-State Differences 2003
67
Within-State Differences
  • What if we consider within-state differences in
    murder rates?
  • Idea is that if death penalty deters murder, the
    murder rate should fall after the penalty is
    implemented.
  • Take the states that adopted the death penalty
    after the Court allowed it.
  • Get the mean murder rate for each of these states
    over the 4 years before and the 4 years after.
  • Test whether the mean is lower after than it is
    before.
  • This is a matched pairs test.
  • Each state's "before" period is matched to its
    "after" period.

68
Testing the within-state difference
  • The test can be conducted manually.
  • Compute the After - Before difference for each
    state with the death penalty.
  • Get the sample mean and variance of these
    differences.
  • Test the null hypothesis that the difference is
    zero against the alternative that it is negative.
  • Or it can be conducted automatically using
    statistical software, or Excel.
  • In Excel, select Tools,
  • Data Analysis,
  • t-Test: Paired Two Sample for Means
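
The manual steps for the matched pairs test can be sketched as follows. The before and after values are hypothetical placeholders, not actual state murder rates, and the function name is illustrative; the statistic would be compared with a one-sided critical value from the t distribution with n - 1 degrees of freedom.

```python
import math
import statistics


def paired_t(before, after):
    """Matched-pairs t-statistic for the mean of the After - Before differences.

    Returns (t, degrees_of_freedom).
    """
    diffs = [a - b for a, b in zip(after, before)]   # After - Before, per state
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)      # standard error of the mean difference
    return mean_diff / se, n - 1


# Hypothetical placeholder values, one pair per state, for illustration only.
before = [5.0, 6.0, 7.0, 8.0]
after = [4.0, 6.0, 5.0, 7.0]

t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.4f}, df = {df}")
```

A negative t-statistic points in the direction of the alternative (rates lower after than before); whether it is significant depends on the critical value for the chosen level.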

69
Within-State Differences