Title: PS400 Quantitative Methods
2PS400 Quantitative Methods
- Dr. Robert D. Duval
- Course Introduction
- Presentation Notes and Slides
- Version of January 9, 2001
3Overview of Course
- Syllabus
- Texts
- Grading
- Assignments
- Software
4The First Two Weeks
- Review and Setting: The Logic of Research
- Logic
- Microcomputers
- Statistics
5Overview of Statistics
- Descriptive Statistics
- Frequency Distributions
- Probability
- Statistical Inference
- Statistical tests
- Contingency Tables
- Regression Analysis
6The Logic of Research
A quick review of the research process
- Theory
- Hypothesis
- Observation
- Analysis
7Sample Theories
- IR - Balance of Power
  - Wars erupt when there are shifts in the balance of power
- Domestic Policy
  - The crime rate is affected by the economy
9-14 [Diagram, built up step by step: Theory, Hypothesis, Observation, Analysis, linked by Deduction, Operationalization, Induction, and Confirmation/rejection]
16Logic
A short primer on Deduction and Inference
We will look at Symbolic Logic in order to
examine how we employ deduction in cognition.
17Logic
What is Logic?
- Logic
  - The study by which arguments are classified into good ones and bad ones.
- Composed of statements
  - "Roses are red."
  - "Republicans are conservatives."
18Logic
Compound Statements
- Conjunctions (Conjunction Junction)
- Two simple statements may be connected with a conjunction
- and
  - "Roses are red and violets are blue."
  - "Republicans are conservative and Democrats are liberal."
- or
  - "Republicans are conservative or Republicans are moderate."
19Operators
- There are three main operators
  - And (∧)
  - Or (∨)
  - Not (¬)
- These may be used to symbolize complex statements
- The other symbol of value is
  - Equivalence (≡)
  - This is not quite the same as "equal to."
20Truth Tables
- Statements have truth value
- For example, take the statement P ∧ Q
- This statement is true only if P and Q are both true.

  P   Q   P ∧ Q
  T   T     T
  T   F     F
  F   T     F
  F   F     F
21Truth Tables (cont)
- Hence "Republicans are conservative and Democrats are liberal." is true only if both parts are true.
- On the other hand, take the statement P ∨ Q
- This statement is true only if either P or Q is true, but not both. (Called the exclusive or)

  P   Q   P ∨ Q (exclusive)
  T   T     F
  T   F     T
  F   T     T
  F   F     F
22The Inclusive or
- Note that "or" can be interpreted differently.
- Both parts of the compound statement may be true in the inclusive or. This statement is true if either P or Q, or both, are true.

  P   Q   P ∨ Q (inclusive)
  T   T     T
  T   F     T
  F   T     T
  F   F     F
24Tautologies
- Note that p ∨ ¬p must be true
- "Roses are red or roses are not red." must be true.
- A statement which must be true is called a tautology.
- A set of statements which, if taken together, must be true is also called a tautology (or tautologous).
- Note that this is not a criticism.
25The Conditional
- The Conditional
  - if a (antecedent)
  - then b (consequent)
- It is also called the hypothetical, or implication.
- This translates to
  - A implies B
  - If A then B
  - A causes B
26The Implication
- We symbolize the implication by → (as in p → q)
- We use the conditional or implication a great deal.
- It is the core statement of the scientific law, and hence the hypothesis.
27Equivalency of the Implication
- Note that the implication is actually equivalent to a compound statement of the simpler operators.
- p → q is equivalent to ¬p ∨ q
- Please note that the implication has a broader interpretation than common English would suggest: it is false only when p is true and q is false.
28Rules of Inference
- In order to use these logical components, we have constructed rules of inference.
- These rules are essentially how we think.
29Disjunctive Syllogism
30Hypothetical Syllogism
31Modus Ponens
32Modus Tollens
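The four rules named on slides 29 to 32 appear here by title only, so the sketch below assumes their standard textbook forms (stated in the comments) and checks each one by brute-force truth table: an argument form is valid when no assignment of truth values makes every premise true and the conclusion false. The helper names `implies` and `is_valid` are illustrative, not part of the course materials.

```python
from itertools import product

def implies(a, b):
    """Material implication: false only when a is true and b is false."""
    return (not a) or b

def is_valid(premises, conclusion, n_vars):
    """Return True if no truth assignment makes all premises true
    while the conclusion is false."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# Disjunctive syllogism (assumed form): p or q; not p; therefore q.
disjunctive_syllogism = is_valid(
    [lambda p, q: p or q, lambda p, q: not p], lambda p, q: q, 2)

# Hypothetical syllogism (assumed form): p -> q; q -> r; therefore p -> r.
hypothetical_syllogism = is_valid(
    [lambda p, q, r: implies(p, q), lambda p, q, r: implies(q, r)],
    lambda p, q, r: implies(p, r), 3)

# Modus ponens (assumed form): p -> q; p; therefore q.
modus_ponens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: p], lambda p, q: q, 2)

# Modus tollens (assumed form): p -> q; not q; therefore not p.
modus_tollens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: not q],
    lambda p, q: not p, 2)

# For contrast, affirming the consequent (p -> q; q; therefore p)
# is a well-known fallacy and should fail the check.
affirming_consequent = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: q], lambda p, q: p, 2)

print(disjunctive_syllogism, hypothetical_syllogism,
      modus_ponens, modus_tollens, affirming_consequent)
```

The first four checks come out valid and the fallacy does not, which is the sense in which a rule of inference can be trusted to carry truth from premises to conclusion.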
33Logical Systems
- Logic gives us power in our reasoning when we build complex sets of interrelated statements.
- When we can apply the rules of inference to these statements to derive new propositions, we have a more powerful theory.
34Tautologous systems
- Systems in which all propositions are, by definition, true are tautologous.
- Balance of Power
  - Why do wars occur? Because there is a change in the balance of power.
  - How do you know that power is out of balance? A war will occur.
- Note that this is what we typically call circular reasoning.
- The problem isn't the circularity, it is the lack of utility.
35Paradoxes
- The Liar's Paradox
  - Epimenides the Cretan says that all Cretans are liars.
- The ??? Paradox (a variant)
  - The next statement is true.
  - The previous statement is false.
36Microcomputer Architecture
37Using the Computer
38Computers - Basic Architecture
39Basic components
40The CPU
41Some simple binary arithmetic
42Binary numbering
43Binary addition
44Miscellaneous
45Digital Systems
- So, in the end, we can see that computers simply move and add 0s and 1s.
- And out of this, we can build incredibly rich and complex experiences
- Such as
- Or
47Statistics
A Philosophical Overview
- Methods as Theory
- Methods as Language
48Principal organizing concepts
- The Nature of the Problem
- Measurement
- Standards for comparison
49 Mathematical notation
Important mathematical notation the student needs to know.
- Summation: $\sum_{i=1}^{n} X_i$
  - For instance, the sum of all $X_i$ from $i = 1$ to $n$ means: beginning with the first number in your data set, add together all n numbers.
  - The $\Sigma$ is a symbolic representation of the process of adding up a specified series or collection of numbers.
50Mathematical notation (cont.)
- Square Roots and Exponents
- e - the base of natural logarithms
- Exponential and Logarithmic Equations
51The Base of Natural Logarithms
Where does e come from?
- e is the base of natural logarithms
- It is derived from $e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^{n} \approx 2.71828$
52Demystifying e (sort of)
- So how does this translate to real life?
- Compound interest: $FV = PV \left(1 + \frac{i}{k}\right)^{kn}$
- Where
  - PV = present value (amount deposited)
  - FV = future value (amount accrued)
  - i = interest rate (e.g., .06 for 6% interest)
  - k = number of periods per year
  - n = number of years
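A short sketch of how compound interest leads to e, assuming the standard compound-interest formula FV = PV(1 + i/k)^(kn), which matches the variables listed on this slide. The deposit of 1 unit at a 100% rate for one year is a hypothetical chosen only because it makes the future value approach e as the number of compounding periods k grows.

```python
import math

def future_value(pv, i, k, n):
    """Future value under the assumed formula FV = PV * (1 + i/k)**(k*n).

    pv: present value, i: annual interest rate (0.06 for 6%),
    k: compounding periods per year, n: number of years.
    """
    return pv * (1 + i / k) ** (k * n)

# Hypothetical case: deposit 1 unit at 100% interest for 1 year and
# compound yearly, monthly, daily, then a million times per year.
for k in (1, 12, 365, 1_000_000):
    print(f"k = {k:>9,}  FV = {future_value(1.0, 1.0, k, 1):.6f}")

print(f"e           = {math.e:.6f}")
```

Compounding more often helps, but with diminishing returns: the future value climbs toward e and never passes it.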
53Levels of Measurement
- Nominal
  - Dichotomous
- Ordinal
- Interval
  - Ratio
54Levels of Measurement
- Nominal
- Dichotomous
- Ordinal
- Interval
- Ratio
- For instance Levels of Measurement
55Nominal Measurement
56Statistics
Induction about the Observable World
- A statistic is a number that provides information about some variable of interest.
- Descriptive Statistics
  - Numbers that describe some aspect of the world
- Inferential Statistics
  - We use inferential statistics to take information from a sample and make some inference about a population.
57Descriptive Statistics
- There are two main ways we describe collections of data.
  - Measures of Central Tendency
  - Measures of Dispersion
58Statistical Tools for Describing the World - Distributions
- Intuitive Definition
  - A bunch of numbers that measure a characteristic for a group of cases.
  - May be represented by a set of numbers, a graph or picture, or even a mathematical equation.
59Measures of Central Tendency
- Measures which provide some indication of the
typical value or the 'middle' of the distribution
60Measures of Central Tendency: The Arithmetic Mean (or Average)
- The sum of all of the numbers in a set, divided by the number of values in the set
- Most appropriate for symmetric distributions
- Influenced by extreme values
61Measures of Central Tendency: The Median
- The middle number in the data set.
- (Sort the data first.)
- The median is the middle value if there is an odd number of cases.
- The median is the average of the two middle values if there is an even number of cases.
- Best measure for skewed distributions
- Not very tractable mathematically
62Measures of Central Tendency: The Mode
- The most frequently occurring value.
- Used primarily for nominal data.
- The peak value of a frequency distribution is
also referred to as the mode.
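The three measures of central tendency can be compared on a small example. This is a minimal sketch using Python's standard `statistics` module; the income figures are hypothetical and include one extreme value to show why the median is preferred for skewed data.

```python
import statistics

# Hypothetical incomes (in thousands), with one extreme value.
incomes = [30, 32, 35, 35, 38, 41, 250]

mean_income = statistics.mean(incomes)      # pulled upward by the 250
median_income = statistics.median(incomes)  # middle value of the sorted data
mode_income = statistics.mode(incomes)      # most frequently occurring value

# With an even number of cases, the median averages the two middle values.
even_median = statistics.median([1, 2, 3, 4])

print(f"mean = {mean_income:.2f}, median = {median_income}, mode = {mode_income}")
print(f"median of [1, 2, 3, 4] = {even_median}")
```

The single extreme value drags the mean well above the median, while the median and mode stay with the bulk of the cases.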
63Measures of Central Tendency: Common terms for this concept.
- We use the idea of measures of central tendency a great deal in everyday language.
- Average, accordance, bread-and-butter, commonplace, commensurate, congruent, consistent, conventional, customary, day-to-day, everyday, frequent, garden variety, general, habitual, humdrum, invariably, likeness, mean, median, medium, mediocrity, middle, middling, nondescript, normal, ordinary, popular, prevailing, regular, the same, standard, stereotypical, stock, typical, unexceptional, uniform, usual
- (From The Elementary Forms of Statistical Reason by R. P. Cuzzort and James S. Vrettos)
64Measures of Dispersion
- The Range
- Range = highest value - lowest value
- Uses only two pieces of information
- Strongly influenced by the particular
65Measures of Dispersion
Percentiles
66Measures of Dispersion
The Deviation about the Mean
The deviation about the mean indicates how far a value is from the center.
$X_i - \bar{X}$
67Measures of Dispersion
The average deviation.
Can we find the average of the deviations? The deviations always sum to 0.0!
$AD = \frac{\sum_{i=1}^{n} (X_i - \bar{X})}{n}$
68Measures of Dispersion
The average absolute deviation.
Can we find the average of the absolute value of the deviations? Yes, but difficult to use.
$AD = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n}$
69Measures of Dispersion: The Standard Deviation
- Square the deviations to remove minus signs
- Take the square root to return to the original scale
$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n}}$
70Measures of Dispersion: The Variance
The mean of the squared deviations
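The measures of dispersion from the preceding slides can be traced step by step on a small data set. This is a sketch only: the data are hypothetical, and every average divides by n, as in the formulas presented so far.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
n = len(data)
mean = sum(data) / n

# Deviations about the mean: these always sum to zero.
deviations = [x - mean for x in data]
sum_of_deviations = sum(deviations)

# Average absolute deviation: drop the signs, then average.
avg_abs_deviation = sum(abs(d) for d in deviations) / n

# Variance: the mean of the squared deviations.
variance = sum(d ** 2 for d in deviations) / n

# Standard deviation: square root, back on the original scale.
std_dev = math.sqrt(variance)

print(f"mean = {mean}, sum of deviations = {sum_of_deviations}")
print(f"average absolute deviation = {avg_abs_deviation}")
print(f"variance = {variance}, standard deviation = {std_dev}")
```

Squaring gets rid of the minus signs just as the absolute value does, but it is far easier to work with algebraically, which is why the variance and standard deviation are the measures used in later formulas.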
71Calculating the Standard Deviation
- The best way to calculate the standard deviation is to use a computer.
- If one is not available, try the table method.
  - StDevdemo.xls (Excel)
  - StDevdemo.wb3 (Quattro Pro)
72Population measures
- OK, I lied. The formula for the standard deviation is not quite as I described.
- It turns out that the standard deviation is biased in small samples.
- The estimate is a little too small in small samples.
- Thus we designate whether we are using population or sample data.
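The sample formula itself is not included in this transcript, so the sketch below assumes the usual correction: divide the sum of squared deviations by n - 1 rather than n when the data are a sample. Python's `statistics` module provides both versions, `pstdev` (population, divides by n) and `stdev` (sample, divides by n - 1). The data are hypothetical.

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
n = len(sample)

population_sd = statistics.pstdev(sample)  # divides by n
sample_sd = statistics.stdev(sample)       # divides by n - 1 (assumed correction)

# The two differ by a factor of sqrt(n / (n - 1)), which shrinks
# toward 1 as the sample grows.
correction = math.sqrt(n / (n - 1))

print(f"population formula: {population_sd:.4f}")
print(f"sample formula:     {sample_sd:.4f}")
print(f"ratio: {sample_sd / population_sd:.4f} (sqrt(n/(n-1)) = {correction:.4f})")
```

With only 8 cases the sample formula gives a noticeably larger value; with hundreds of cases the two are nearly identical, which is why the distinction matters mainly in small samples.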
73Population vs. Sample Means
74Population vs. Sample Standard Deviations
75Frequency Distribution
A frequency distribution is a graph or chart that
shows the number of observations of a given
value, or class interval.
76The Frequency Histogram
- To create a frequency histogram
- Determine the class interval width.
- Determine the number of intervals desired.
- Tally number of observations in each range.
- Create bar chart from class totals.
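The histogram steps above can be sketched in a few lines. The function name `frequency_table`, the interval width, and the data are all hypothetical; the sketch assumes no value falls below the chosen starting point and prints a rough text bar chart instead of drawing a graph.

```python
from collections import Counter

def frequency_table(values, width, start):
    """Tally values into class intervals [low, high) of equal width.

    Assumes every value is >= start. Returns (low, high, count) tuples.
    """
    counts = Counter(int((v - start) // width) for v in values)
    return [(start + k * width, start + (k + 1) * width, counts[k])
            for k in range(max(counts) + 1)]

# Hypothetical rates; class intervals of width 10 starting at 10.
rates = [12, 15, 21, 22, 25, 31, 33, 38, 39, 47]
table = frequency_table(rates, width=10, start=10)

# One row of '#' marks per class interval.
for low, high, count in table:
    print(f"{low:>3}-{high:<3} {'#' * count}")
```

Changing `width` shows how the choice of class interval affects the apparent shape of the distribution.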
77Example Frequency Distribution
- Develop a frequency histogram for the following crime rate data for the 50 states.
- Use the data provided in class
78Frequency Polygon
- Same as a frequency histogram except the midpoints of the class intervals are used
- Points are connected with a line graph
- A large number of classes will make the distribution a smooth curve if there is a large sample size.
79Frequency Distributions: Shape
- Modality
  - The number of peaks in the curve
- Skewness
  - An asymmetry in a distribution where values are shifted to one extreme or the other.
- Kurtosis
  - The degree of peakedness in the curve
80Frequency Distributions: Modality
- Unimodal
- Bimodal
- Multimodal
81Frequency Distributions: Skewness
- The Third Moment about the Mean
- Right Skew (Positive Skew)
- Left Skew (Negative Skew)
82Frequency Distributions: Measuring Skewness
- Measuring skewness
- Normal distribution has skewness = 0.0
- (Normal ranges between ±3.0)
83Frequency Distributions: Kurtosis
- The Fourth Moment about the Mean
- Platykurtic
- Leptokurtic
- Mesokurtic
84Frequency Distributions: Measuring Kurtosis
- Measure of kurtosis
- Normal distributions have kurtosis = 3.0
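The measurement formulas for skewness and kurtosis are not included in this transcript. The sketch below assumes the common moment-based definitions: skewness is the third moment about the mean divided by the cube of the standard deviation, and kurtosis is the fourth moment divided by the fourth power of the standard deviation, all computed with n in the denominator. The function name `moment_shape` and the data sets are hypothetical.

```python
def moment_shape(data):
    """Return (skewness, kurtosis) from moments about the mean.

    Assumed definitions: skewness = m3 / m2**1.5, kurtosis = m4 / m2**2,
    where m_k is the average of (x - mean)**k over all n cases.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

symmetric = [1, 2, 3, 4, 5]        # hypothetical, perfectly symmetric
right_skewed = [1, 1, 1, 2, 10]    # hypothetical, long right tail

sym_skew, sym_kurt = moment_shape(symmetric)
right_skew, right_kurt = moment_shape(right_skewed)

print(f"symmetric:    skewness = {sym_skew:.3f}, kurtosis = {sym_kurt:.3f}")
print(f"right-skewed: skewness = {right_skew:.3f}, kurtosis = {right_kurt:.3f}")
```

Under these definitions a symmetric data set has skewness 0, a long right tail gives a positive value, and kurtosis is read against the normal distribution's benchmark of 3.0.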
85Frequency Distributions - Types
- The Normal
- The Uniform
- The Log-normal
- The Exponential
- Statistical Distributions
- t
- Chi-Square
- F
86Frequency Distributions - Types (cont.)
- Hyper-geometric
- Poisson
- Binomial
- Gamma
- Weibull
- Beta
87Graphs - Types
- Descriptive Graphs
- Bar Chart
- Pie Graph
- Line Graph
- Distributions
- Histogram
- Box Plot
- Stem and Leaf
88Graphs
Histogram
89Graphs
Pie Graph
90Graphs
Line Graph
91Graphs
Histogram
92Graphs
Box Plot
93Graphs
Stem and Leaf
94Probability Density Functions
- A probability density function is a frequency distribution whose area is set equal to 1.0.
- Most distributions are PDFs.
- They let us assess the likelihood or probability of cases taking on particular values.
95The Normal Distribution
- The normal distribution is one of the most
- Popular
- Ubiquitous
- Useful
- distributions that we have.
- It gives great predictive ability when we can
apply it to data.
96The Normal Distribution: the Formula
- The normal curve is described by the following formula.
$f(X) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(X - \mu)^2}{2\sigma^2}}$
97The Normal Distribution (cont.)
- This formula will give us the following
distribution
98Standard Normal Variables
- A Standard Normal Variable is one that has been transformed by the following formula: $Z_i = \frac{X_i - \bar{X}}{s}$
- All Z-scores, as they are called, will have a mean = 0.0 and s = 1.0
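A quick check of the claim that Z-scores have a mean of 0.0 and a standard deviation of 1.0. The transformation formula is missing from this transcript, so the sketch assumes the usual one: subtract the mean and divide by the standard deviation. The scores are hypothetical, and the standard deviation uses the population formula (dividing by n).

```python
import statistics

def z_scores(values):
    """Standardize values: (x - mean) / standard deviation (assumed formula)."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population formula, divides by n
    return [(x - mean) / sd for x in values]

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical raw scores
z = z_scores(scores)

print([round(value, 2) for value in z])
print(f"mean of z = {statistics.mean(z):.4f}")
print(f"sd of z   = {statistics.pstdev(z):.4f}")
```

Each Z-score says how many standard deviations a case lies above or below the mean, which puts variables measured in different units on a common scale.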
99Standard Normal Distribution
- The Standard Normal distribution is thus one that has μ = 0.0 and σ = 1.0
- We say this symbolically as
  - Z ~ N(0, 1)
  - (or Z is normally distributed with a mean of zero and a standard deviation of one)
100The Normal PDF
- Because the normal curve is a PDF, we can use it
to make probability assessments about values in
the distribution
101Using the Normal PDF
- We know the following facts
  - Area under the curve = 1.0
  - It's symmetric, so the probability of $X_i$ being greater than 0.0 is .5
- Symbolically,
  - $P(X_i > 0.0) = .5$
102Using the Normal PDF
- We can use this information in the following fashion
- $P(0.0 \le X_i \le 1.0) = .3413$
- Thus about 68% of the $X_i$ will fall within ±1σ
- Thus about 95% of the $X_i$ will fall within ±2σ
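These areas under the standard normal curve can be computed rather than looked up in a table. A minimal sketch using `statistics.NormalDist` from the Python standard library (available since Python 3.8); the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist(mu=0.0, sigma=1.0)  # the standard normal distribution

# cdf(x) is the area under the curve to the left of x.
p_above_zero = 1 - z.cdf(0.0)             # symmetry: half the area
p_zero_to_one = z.cdf(1.0) - z.cdf(0.0)   # area between 0 and +1 sd
within_1_sd = z.cdf(1.0) - z.cdf(-1.0)    # area within 1 sd of the mean
within_2_sd = z.cdf(2.0) - z.cdf(-2.0)    # area within 2 sd of the mean

print(f"P(Z > 0)       = {p_above_zero:.4f}")
print(f"P(0 <= Z <= 1) = {p_zero_to_one:.4f}")
print(f"within 1 sd    = {within_1_sd:.4f}")
print(f"within 2 sd    = {within_2_sd:.4f}")
```

Doubling the 0-to-1 area gives the familiar 68% figure, and the area within two standard deviations is just over 95%.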
103The Central Limit Theorem
- The Normal Distribution pops up in one very important context
- The Central Limit Theorem
- This is a fundamental concept in inferring the characteristics of a population based upon a sample.
104Sampling Distributions
- The probability distribution of a statistic is called its sampling distribution.
- If we collect a sample and calculate the mean, that is one data point in the sampling distribution of the sample mean.
- If we do this many times, we have a sampling distribution, which we can then describe.
105The Central Limit Theorem
- The CLT tells us
- As the sample size n gets larger, the sampling distribution of the sample mean can be approximated by a normal distribution with a mean of μ and a standard deviation equal to $\sigma / \sqrt{n}$
- where μ and σ are the population characteristics.
106The Implications of the Central Limit Theorem
- We can use the CLT to make probability statements about the sample mean because we know its distributional characteristics.
- Even if the original variable X is not normally distributed, the sampling distribution of the sample mean is (approximately, for large n)!
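A small simulation can make the Central Limit Theorem concrete. This sketch draws repeated samples from an exponential population, which is strongly right-skewed and has mean 1 and standard deviation 1, and then describes the resulting sample means. The choice of population, the sample size of 50, the 4,000 repetitions, and the random seed are all arbitrary assumptions for illustration.

```python
import math
import random
import statistics

random.seed(400)  # arbitrary seed so the run is repeatable

def sample_means(n, draws):
    """Draw `draws` samples of size n from an exponential population
    (mean 1, sd 1) and return the mean of each sample."""
    return [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(draws)]

n = 50
means = sample_means(n, draws=4000)

observed_mean = statistics.fmean(means)
observed_sd = statistics.pstdev(means)
predicted_sd = 1.0 / math.sqrt(n)  # sigma / sqrt(n) with sigma = 1

print(f"mean of sample means: {observed_mean:.3f} (population mean = 1.0)")
print(f"sd of sample means:   {observed_sd:.3f} (predicted = {predicted_sd:.3f})")
```

The sample means cluster around the population mean with a spread close to the predicted standard deviation, even though the underlying population is far from normal.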
107Statistical inference
- We can use information about the way variables are distributed to make assessments of probability about them.
- Many of these questions are phrased as "Is A greater (or less) than B?"
- This may also be phrased
  - "Does A belong to the same population as B?"
108Assessing probabilities
- Take income
- Would you expect doctors to have a higher income than the population at large?
- Would dog-catchers be lower?
- Would you expect males to have a higher income than females?
- Is WV income lower than the national average? What about Oklahoma?
109Statistical Decision-making
- Many of these questions are best answered with a statement of statistical confidence: a probability assessment.
- This statistical confidence places a decision within an objective framework.
- If we define the criteria for making decisions according to some reasonable standards, then we can remove (or certainly reduce) the subjectivity of the researcher.
110Statistical D-M (cont.)
- If you collected the following information, would you conclude that males had a higher income than females?
  - Mean(Males) = 55.5K, Mean(Females) = 54.9K
  - Mean(Males) = 55.5K, Mean(Females) = 50.9K
  - Mean(Males) = 55.5K, Mean(Females) = 34.9K
- Where would you draw the line?
- Does sample size matter?
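One way to explore the last question is to attach a probability to each comparison. The sketch below is not the course's method: it assumes a simple normal (z) approximation, equal group sizes, and a common standard deviation of 20K, a figure invented for illustration because the slide gives no information about spread. It asks how likely a gap at least this large would be if there were no true difference between the groups.

```python
import math
from statistics import NormalDist

ASSUMED_SD = 20.0  # hypothetical spread in thousands; not given on the slide

def one_sided_p(mean_a, mean_b, sd, n):
    """Approximate chance of seeing a gap of at least mean_a - mean_b
    between two groups of size n if there were no true difference
    (z approximation with a common standard deviation sd)."""
    standard_error = math.sqrt(sd ** 2 / n + sd ** 2 / n)
    z = (mean_a - mean_b) / standard_error
    return 1 - NormalDist().cdf(z)

for female_mean in (54.9, 50.9, 34.9):
    for n in (25, 400):
        p = one_sided_p(55.5, female_mean, ASSUMED_SD, n)
        print(f"55.5K vs {female_mean}K, n = {n:>3} per group: p = {p:.4f}")
```

Under these assumptions the smallest gap is unconvincing at either sample size and the largest gap is convincing even with 25 cases per group, while the middle gap moves from unconvincing to convincing as the sample grows. Sample size matters most for exactly the cases where it is hard to draw the line by eye.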
112Statistical Decision-making problem setup