Title: POLS 7012
1POLS 7012
- Introduction to Political Methodology
- Jan. 10, 2007
2The Scientific Approach to Politics
- It's not rocket science, but...
- The scientific method works well for studying rocks and medicine, but what about the behavior of individuals?
- Notwithstanding the challenges of creating a good research design, we can come up with testable hypotheses about how the world works
- How do we test them?
3How to test?
- Qualitative
- Easy to do (badly)
- Hard to do (well)
- Quantitative
- Hard to learn
- Once the skills are acquired, many of the facets of a good research design are built into the quantitative approach.
- We will work this semester to lay the foundations for this kind of work
4Where to begin: Quantification
- Quantitative reasoning requires measurement
- To measure, we must understand (or define) something about what we are measuring
- What can we measure? What can't we measure?
- The better we define the concepts we are measuring, the better we can measure them
5Concept: Truck
- What do we need to know to answer this question
- What proportion of people in the United States are truck owners?
- What is a truck?
6Truck
- What do we need to know to answer this question
- What proportion of people in the United States are truck owners?
- What is a truck owner?
- Lease Trucks?
- What about people with car loans?
- What about corporate owners / company cars?
- Do we count kids as being non-owners when it's impossible for them to own?
7Clear Definitions, Clear Measures
- Our questions and descriptions of results must be clear about what we are measuring!
- Bad
- What is the proportion of truck owners in the U.S.?
- Better
- What is the proportion of American adults (18 and up) who own a registered, drivable pick-up truck (as defined by the manufacturer's description)?
8What about politics?
- Definitions of all things regulated matter to us
- If truck is hard to measure, what about
- John Kerry is a Liberal
- Bush is a Compassionate Conservative
- Mexico is a Democratic country
- Americans have liberty
- Taiwan is a country
9Conceptual vs. Operational
- Concept: How we describe, or how we think about, a concept. Plato's World of Forms
- Conceptual Definition: States the properties of a concept and the subject(s) to which it applies
- Operational Definition: Instrument of Measurement
- Variable: The Actual Measurement
10What is a Concept?
- Concepts do not physically exist
- Plato's World of Forms
- Exist in shared understanding of Language (Heidegger)
- Somehow, we still know what they are (kind of)
11Conceptual Definition
- We may know what it is, but we often can't say exactly what it is
- Conceptual Definitions try to define a concept
- The subject to which the concept applies
- Variation within a characteristic
- How the characteristic is to be measured (what the characteristic is)
12What is the concept of Democracy?
- Democracies
- Competitive Elections
- Open entry into candidacy
- Unfettered participation of citizens in elections
13Conceptual Definition
- We can conceptually define our characteristic as one or more of these things.
- Let's say
- The concept of democracy is defined as the extent to which countries exhibit the characteristic of competitive elections
- This definition includes everything we need
- The subject to which the concept applies (the unit of analysis)
- Variation within a characteristic
- How the characteristic is to be measured (what the characteristic is)
14Side note: Unit of Analysis
- Suppose we wanted to measure the concept of Party Identification to see if people change their party ID
- We sample 5 people, and then ask them again two years later
- At the first measurement, 3 out of 5 people are Democrats. At the second measurement, again, 3 out of the 5 are Democrats. Can we conclude that people don't change their party ID?
- Ecological Fallacy and individual vs. aggregated
15Operational Definitions
- Conceptual Definitions specify a measurable characteristic of the concept
- The concept of democracy is defined as the extent to which countries exhibit the characteristic of competitive elections.
- Operational Definition specifies how we measure that characteristic. What do you think?
- Percentage of people who, when called on the phone, think their country has competitive elections.
- Percentage of experts who think a country has competitive elections
- Mail Survey
- Percentage of the vote in the Executive Office race
- Percentage of candidates who run unopposed
16Variable
- The actual measurement, classified as
- Quantitative: Ratio, Interval
- Categorical: Ordinal, Nominal
17Now You Try!
- Suppose you wanted to measure smoking.
- How often do you smoke?
- Never
- 2-3 per day
- 1 pack per day
- > 1 pack per day
- What is the level of measurement?
- What about this one?
- How many cigarettes do you smoke each day?
18Distributions
- The distribution of a variable tells us what values it takes and how often it takes those values
- Distributions allow us to summarize the main features of data, often graphically
19Graphs of Categorical Variables
- Let's begin with these data

Education          Count (mil.)   Percent
Less than H.S.          4.6         11.8
High School only       11.6         30.6
Some College            7.4         19.5
Associate Deg.          3.3          8.8
BA/BS                   8.6         22.7
Advanced Degree         2.5          6.6
23Pie Charts vs. Bar Charts
- Pie Chart is most useful for emphasizing parts of a whole. But
- Requires that you include all possible categories
- More errors in reading (humans don't gauge areas very well)
- Bar Graphs are easier to read and more flexible
- In the end, these are less important because understanding the original table wasn't that hard!
24Graphs of Quantitative Variables: Histograms
- IQ Scores for 60 randomly chosen fifth-grade
students
25Summarizing Distributions
- Central Tendency
- What is in the Middle?
- What is most common?
- What would we use to predict (best guess)?
- Dispersion
- How Spread out is the distribution?
- What Shape is it?
26Appropriate Measures of Central Tendency
- Nominal variables: Mode
- Ordinal variables: Median or Mode
- Interval-level variables: Mean
- if the distribution is symmetric; otherwise consider the median
27Mode
Male Female
28Median
- Middle-most Value
- 50% of observations are above the Median, 50% are below it
- The difference in magnitude between the observations does not matter
- Therefore, it is not sensitive to outliers
29Median
- Find the Median
- 4 5 6 6 7 8 9 10 12
- 7
- Find the Median
- 5 6 6 7 8 9 10 12
- 7.5
- Find the Median
- 5 6 6 7 8 9 10 100,000
- 7.5
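The three exercises can be checked with a short Python sketch. It uses the standard library's `statistics.median`; the comparison with the mean is added here only to illustrate the point about outliers and is not part of the slide.

```python
from statistics import mean, median

# The three data sets from the exercises above.
odd_count = [4, 5, 6, 6, 7, 8, 9, 10, 12]        # 9 values: the middle one is the median
even_count = [5, 6, 6, 7, 8, 9, 10, 12]          # 8 values: average the two middle ones
with_outlier = [5, 6, 6, 7, 8, 9, 10, 100_000]   # same as above, but with an extreme value

for data in (odd_count, even_count, with_outlier):
    print(f"median = {median(data)}, mean = {mean(data)}")
```

The median is the same for the last two data sets, while the mean jumps from under 8 to over 12,000 once the value 100,000 is included.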
30Aside on Sigma (Σ) Notation
- In statistics, we deal with many individuals being measured on the same characteristic
- We need a shortcut to help us
- Sigma (Σ) says that we add up every observed measurement.
31Example
- Let x = 4, 5, 7, 9
- $\sum x = x_1 + x_2 + x_3 + x_4$
- $= 4 + 5 + 7 + 9$
- $= 25$
- $\sum (x + 1) = (x_1 + 1) + (x_2 + 1) + (x_3 + 1) + (x_4 + 1)$
- $= (4 + 1) + (5 + 1) + (7 + 1) + (9 + 1)$
- $= 5 + 6 + 8 + 10$
- $= 29$
32Example
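- Let x = 4, 5, 7, 9 and y = 2
- $\sum xy = x_1 y + x_2 y + x_3 y + x_4 y$
- $= 4 \cdot 2 + 5 \cdot 2 + 7 \cdot 2 + 9 \cdot 2$
- $= 8 + 10 + 14 + 18$
- $= 50$
- $\sum (x + y) = (x_1 + y) + (x_2 + y) + (x_3 + y) + (x_4 + y)$
- $= (4 + 2) + (5 + 2) + (7 + 2) + (9 + 2)$
- $= 6 + 7 + 9 + 11$
- $= 33$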
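Sigma notation maps directly onto a loop that adds things up. The sketch below redoes both examples in Python using the built-in `sum`; the variable names are chosen here for illustration.

```python
x = [4, 5, 7, 9]
y = 2  # a constant, the same for every observation

sum_x = sum(x)                             # sigma of x
sum_x_plus_1 = sum(xi + 1 for xi in x)     # sigma of (x + 1)
sum_xy = sum(xi * y for xi in x)           # sigma of (x times y)
sum_x_plus_y = sum(xi + y for xi in x)     # sigma of (x + y)

print(sum_x, sum_x_plus_1, sum_xy, sum_x_plus_y)

# Two shortcuts that follow from the definitions:
# a constant multiplier can be pulled out of the sum,
# and adding a constant to each of N values adds N times that constant.
print(sum_xy == y * sum_x)
print(sum_x_plus_y == sum_x + len(x) * y)
```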
33Mean
- Most Common Measure of Central Tendency
- Best for making predictions
- Also known as average
- Symbolized as
- $\bar{X}$ for the mean of a sample
- $\mu$ for the mean of a population
34Finding the Mean
- $\bar{X} = \frac{\sum X}{N}$
- If X = 3, 5, 10, 4, 3
- $\bar{X} = (3 + 5 + 10 + 4 + 3) / 5$
- $= 25 / 5$
- $= 5$
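The same calculation as a small Python sketch. The function name `sample_mean` is chosen here for illustration; the standard library's `statistics.mean` does the same job.

```python
def sample_mean(values):
    """Add up every observation and divide by the number of observations."""
    return sum(values) / len(values)


X = [3, 5, 10, 4, 3]
print(sum(X), len(X), sample_mean(X))
```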
35Measures of Dispersion
- Percentiles: The pth percentile of a distribution is the value such that p percent of the observations fall at or below it.
- Quartiles are the most commonly used percentiles.
- For the IQ variable, we have
- 89.4 99.9 105.7
- Note that the 1st and 3rd quartiles are the medians of the data below and above the median, respectively
36Five-number Summary and the boxplot
- Can summarize spread and central tendency with 5 numbers: lowest value, highest value, median, and the first and third quartiles (25th and 75th percentiles)
- These can be combined graphically into a box-and-whisker plot (or boxplot)
38Outliers
- An outlier is a value of a variable that is far away from the other values of the variable
- One rule of thumb for determining outliers
- Compute the inter-quartile range (IQR) as the difference between the third quartile and the first quartile.
- An observation is an outlier if it falls more than 1.5 × IQR above the third quartile or below the first quartile.
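A minimal Python sketch of the five-number summary and the 1.5 × IQR rule. It assumes the quartile convention described earlier in these slides (quartiles are the medians of the lower and upper halves of the sorted data); statistical software often uses slightly different conventions and may give slightly different quartiles. The data are the median exercise values with the extreme observation, and the function names are chosen here for illustration.

```python
from statistics import median


def five_number_summary(values):
    """Return (lowest, first quartile, median, third quartile, highest)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]             # observations below the median
    upper = data[half + (n % 2):]   # observations above the median (skip the middle one if n is odd)
    return data[0], median(lower), median(data), median(upper), data[-1]


def find_outliers(values):
    """Flag values more than 1.5 * IQR above the third quartile or below the first."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]


data = [5, 6, 6, 7, 8, 9, 10, 100_000]
print(five_number_summary(data))
print(find_outliers(data))
```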
40How do we describe this?
- Measures of variability (Dispersion)
- Mean Deviation
- Variance
- Standard Deviation
41Mean Deviation
- We could just calculate the average distance between each observation and the mean.
- Problem: the signed distances always sum to zero!
42Mean Deviation
- We must take the absolute value of the distance (absolute deviation), otherwise they would just cancel out to 0!
- Formula: $\text{Mean Deviation} = \frac{\sum |X - \bar{X}|}{N}$
43Mean Deviation An Example
Data: X = 6, 10, 5, 4, 9, 8
- Compute $\bar{X}$ (the average): $\bar{X} = 42 / 6 = 7$
- Compute $X - \bar{X}$ and take the absolute value to get the absolute deviations: 1, 3, 2, 3, 2, 1
- Sum the absolute deviations: Total = 12
- Divide the sum of the absolute deviations by N: 12 / 6 = 2
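The four steps can be written out as a short Python sketch. The function name `mean_deviation` is chosen here for illustration. The sketch also prints the sum of the signed deviations to show why the absolute value is needed.

```python
def mean_deviation(values):
    """Average absolute distance between each observation and the mean."""
    n = len(values)
    mean = sum(values) / n                                 # step 1: the average
    absolute_deviations = [abs(v - mean) for v in values]  # step 2: |X - mean|
    total = sum(absolute_deviations)                       # step 3: sum them
    return total / n                                       # step 4: divide by N


X = [6, 10, 5, 4, 9, 8]
mean_x = sum(X) / len(X)

print(sum(v - mean_x for v in X))  # signed deviations cancel out
print(mean_deviation(X))
```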
44What Does it Mean?
- On Average, each observation is two units away
from the mean.
45Is it Really that Easy?
- No.
- Absolute values are difficult to manipulate algebraically
- Absolute values cause enormous problems for calculus (the absolute value function is not differentiable at zero)
- We need something else (but what?)
46Variance and Standard Deviation
- Instead of taking the absolute value, we square the deviations from the mean. This yields a non-negative value.
- This will result in measures we call the Variance and the Standard Deviation
- Sample: $s$ = Standard Deviation, $s^2$ = Variance
- Population: $\sigma$ = Standard Deviation, $\sigma^2$ = Variance
47Calculating the Variance and/or Standard Deviation
- Formulae
- Variance: $s^2 = \frac{\sum (X - \bar{X})^2}{N}$
- Std. Dev.: $s = \sqrt{s^2}$
- An example follows...
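48Gains/Losses by President's Party in Midterm Elections
- Mean: $\bar{X} = \frac{\sum X}{N} = \frac{-313}{12} = -26.08$
- Variance: $s^2 = \frac{3716.9}{12} = 309.7$
- Standard deviation: $s = \sqrt{309.7} = 17.6$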
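A hedged Python sketch of the variance and standard deviation. It divides the sum of squared deviations by N, as the midterm example above does; many textbooks and `statistics.variance` divide by N - 1 for a sample instead. The midterm data themselves are not reproduced in this transcript, so the functions are demonstrated on the small data set from the mean deviation example, and the slide's totals (3716.9 and 12) are only plugged in at the end.

```python
import math


def variance(values):
    """Average squared deviation from the mean (divisor N, as in the slide's example)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def std_dev(values):
    """Square root of the variance, back in the original units."""
    return math.sqrt(variance(values))


X = [6, 10, 5, 4, 9, 8]  # data from the mean deviation example
print(variance(X), std_dev(X))

# Reproducing the slide's arithmetic from its reported totals.
slide_variance = 3716.9 / 12
slide_sd = math.sqrt(slide_variance)
print(round(slide_variance, 1), round(slide_sd, 1))
```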
49What Does it Mean?
- Interpretation is not quite as straightforward
(but it is much more useful). It requires the
use of the normal distribution
50Density Curve
- A density curve is like a smooth curve drawn over
a histogram. It describes what the histogram
would look like if it involved an infinite number
of cases with infinitely small bins.
51Density Curves
- Are always on or above the horizontal axis
- Always have an area of exactly 1 underneath the curve.
- The familiar bell-shaped normal distribution is a density curve that represents many distributions of real data, particularly those involving chance outcomes.
52The Normal Distribution
- If a variable is normally distributed, the standard deviation has a special meaning
- Whatever the mean and std dev
- 68% of all the cases fall within 1 standard deviation of the mean,
- 95% of the cases within 2 standard deviations of the mean
- 99.7% of the cases within 3 sd of the mean.
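These three percentages are rounded values. A minimal Python sketch, using the standard identity that the share of a normal distribution within k standard deviations of the mean equals erf(k / √2), shows the unrounded figures. The function name is chosen here for illustration.

```python
import math


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sd: {100 * share_within(k):.2f}%")
```

The result does not depend on the mean or the standard deviation, which is why the rule applies to any normally distributed variable.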
53The Normal Distribution and the 68-95-99.7 rule
54Practice
- SAT scores are normally distributed with a mean of 500 and a std. dev. of 100.
- What percentage of students score between 400 and 600?
- 68%
55Practice
- SAT scores are normally distributed with a mean of 500 and a std. dev. of 100.
- What percentage of students score between 400 and 500?
- 34%
56Practice
- SAT scores are normally distributed with a mean of 500 and a std. dev. of 100.
- What percentage of students score less than 300?
- 2.5%
57It works with different μ and σ
- IQ scores are normally distributed with a mean of 100 and a std. dev. of 15.
- What percentage of people have an IQ between 85 and 115?
- 68%
58Practice
- IQ scores are normally distributed with a mean of 100 and a std. dev. of 15.
- What percentage of people have an IQ between 85 and 100?
- 34%
59Practice
- IQ scores are normally distributed with a mean of 100 and a std. dev. of 15.
- What percentage of people have an IQ less than 70?
- 2.5%
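The practice answers come from the 68-95-99.7 rule of thumb. As a check, the sketch below computes the same areas with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The helper name `percent_between` is chosen here for illustration.

```python
from statistics import NormalDist

sat = NormalDist(mu=500, sigma=100)
iq = NormalDist(mu=100, sigma=15)


def percent_between(dist, low, high):
    """Percentage of cases falling between low and high."""
    return 100 * (dist.cdf(high) - dist.cdf(low))


print(percent_between(sat, 400, 600))  # rule of thumb: 68%
print(percent_between(sat, 400, 500))  # rule of thumb: 34%
print(100 * sat.cdf(300))              # rule of thumb: 2.5%
print(percent_between(iq, 85, 115))    # rule of thumb: 68%
print(percent_between(iq, 85, 100))    # rule of thumb: 34%
print(100 * iq.cdf(70))                # rule of thumb: 2.5%
```

The computed tail areas come out a little under 2.5% because "95% within 2 standard deviations" is itself a rounded figure.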
60Problem
- By manipulating probabilities, we can only handle situations where we are 1, 2, or 3 std. devs. away from the mean.
- What happens when we want to know the probability of scoring between 100 and 105?
- We need to convert the IQ score (or SAT score or whatever) into units of the standard deviation.
- Example: Distance between 100 and 105 is .333 Standard Deviations
61Think of the Std. Dev. As a Unit
- How many inches are in a foot?
- 12
- How many cups are in a pint?
- 2
- How many IQ points are there in a standard deviation for IQ?
- 15
- How many SAT points are there in a standard deviation for SAT scores?
- 100
62How do you convert inches to feet?
- Distance in feet = Distance in inches / 12
- Distance in IQ std. devs = Distance in IQ points / 15
- Distance in IQ std. devs = (IQ score - mean IQ) / 15
63Consider this problem
- Part-time employee salaries in a company are normally distributed with mean 20,000 and Standard Dev. 1,000
- How many Std. Devs. is 18,500 away from the mean?
- Intuitively, we see that 1,500 is 1.5 Std. Devs. from the mean
- Using the formula, we get
- $\frac{18500 - 20000}{1000} = -1.5$ (negative specifies direction)
64Consider this problem
- How many Std. Devs. is 19,371 away from the mean?
- Intuitively, we can't do this
- Using the formula, we get
- $\frac{19371 - 20000}{1000} = -0.629$
- That is, .629 Std. devs. below the mean
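The conversion is the same every time: subtract the mean, then divide by the standard deviation. A minimal Python sketch follows; the function name `std_devs_from_mean` is chosen here for illustration.

```python
def std_devs_from_mean(x, mean, sd):
    """Distance of x from the mean, in standard deviation units (sign gives direction)."""
    return (x - mean) / sd


# Part-time salaries: mean 20,000, standard deviation 1,000
print(std_devs_from_mean(18_500, 20_000, 1_000))
print(std_devs_from_mean(19_371, 20_000, 1_000))

# IQ: mean 100, standard deviation 15
print(round(std_devs_from_mean(105, 100, 15), 3))
```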
65Z Scores
- We call these standard deviation values Z-scores
- A Z-score is defined as the number of standard units any score or value is from the mean.
- A Z-score states how many standard deviations the observation X falls away from the mean and in which direction (plus or minus).
67What Good does this do???
- Someone figured out that 68% are within 1 s.d. and about 95% are within 2 s.d.
- Someone did this to show that 74.16% are within 1.13 s.d. in the normal distribution
- 1.14 s.d.: 74.58%
- 1.15 s.d.: 74.98%
- 1.16 s.d.: 75.4%
- It goes on and on and on.
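Today the "someone" can be a few lines of code. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) to compute the percentage of cases within any number of standard deviations of the mean. The results should agree with the figures above up to small rounding differences, since printed tables carry only four decimal places.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def percent_within(z):
    """Percentage of a normal distribution within z standard deviations of the mean."""
    return 100 * (2 * standard_normal.cdf(z) - 1)


for z in (1.00, 1.13, 1.14, 1.15, 1.16, 2.00):
    print(f"{z:.2f} s.d.: {percent_within(z):.2f}%")
```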
68These results appear in a Z-table
- You calculate a Z score, and the Z-table will tell you
- The probability of getting a score between your Z-score and the mean (column B)
- The probability of getting a score greater than your Z-score, that is, from your Z-score out to the end of the normal distribution (column C)
- This Table can be downloaded from my web site
69It Looks like this
- Suppose you find a Z-score of .12
- Column B says that 4.78% of cases lie between the mean and your Z-score
70It Looks like this
- Suppose you find a Z-score of .12
- Column C says that 45.22% of cases lie beyond your Z-score
71IQ is normally distributed with a mean of 100 and sd of 15. How do you interpret a score of 109? Use the Z score
- $Z = \frac{109 - 100}{15} = 0.60$
- What does this Z-score of .60 mean? It does not mean that 60 percent of cases fall below this score. Rather, this Z-score is .60 standard units above the mean. We need the Z-table to interpret this!
72Using the Z table
- Look at Column C for .60
- Only 27.43% of people have an IQ higher than this.
- If your IQ is 109 (.6 s.d. above the mean), you are smarter than almost 75% of people in the world!
- 72.57% of people have an IQ less than this.
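The two Z-table columns can be sketched in Python with `statistics.NormalDist` (Python 3.8 or later). The function names `column_b` and `column_c` are chosen here to mirror the table headings; they are not part of any library.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def column_b(z):
    """Proportion of cases between the mean and the Z-score."""
    return standard_normal.cdf(abs(z)) - 0.5


def column_c(z):
    """Proportion of cases beyond the Z-score (out to the end of the tail)."""
    return 1 - standard_normal.cdf(abs(z))


# The Z-score of .12 from the table example
print(round(100 * column_b(0.12), 2), round(100 * column_c(0.12), 2))

# The IQ example: a score of 109 with mean 100 and sd 15
z = (109 - 100) / 15
print(z)
print(round(100 * column_c(z), 2))        # percent with a higher IQ
print(round(100 * (1 - column_c(z)), 2))  # percent with a lower IQ
```

For any Z-score, columns B and C add up to 0.5, because together they cover exactly one half of the distribution.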