Psychology 412 Applied Data Analysis - PowerPoint PPT Presentation

Slides: 43
Provided by: adamk2
Transcript and Presenter's Notes

1
Psychology 412: Applied Data Analysis
  • Instructor: Adam Kramer
  • Week 1

2
Science
  • A method for testing theories or answering
    questions
  • Seek evidence which supports
  • Seek evidence which contradicts
  • The key: seek evidence.
  • The result: bad theories fall away, good theories
    persist

3
Sciences
  • Physics, chemistry, psychology, sociology,
    political science all do this
  • Q: How much does an atom weigh?
  • A: Put one on a scale and see!
  • Q: What happens if you mix baking soda and
    vinegar?
  • A: Try it and see!
  • Q: Does chocolate make people happy?
  • A: What do you mean by "people"?

4
Psychology as Science
  • Atoms, chemicals, etc., are nearly identical on
    the relevant dimensions
  • They weigh the same, react the same
  • People are nearly identical on some dimensions
  • They all die in extreme temperatures, react
    similarly to some drugs
  • But psychology asks different questions.

5
People are different.
  • But also similar: psychology looks for maxims or
    similarities despite the differences among people
  • Or, what are people like on average?

6
A sea of variability!
  • What can so much variation tell us?
  • What is the trend despite the variation, and how
    sure are we that it's real?
  • This is the task of statistics in social science
  • Detecting trends despite variation
  • Explaining variation in some way

7
An example: Satisfaction
  • "I am satisfied with my life."
  • 7 - Strongly agree
  • 6 - Agree
  • 5 - Slightly agree
  • 4 - Neither agree nor disagree
  • 3 - Slightly disagree
  • 2 - Disagree
  • 1 - Strongly disagree

8
So what does it mean?
  • So, now we've got a variable: what's in it? What
    does it tell us?

9
The mean
  • The average is the sum of the items divided by
    the number of the items.
  • µ or m are symbols for mean, x represents a
    variable, i is an index to an observation in the
    variable, n is the symbol for the number of data
    points, Σ means sum.
  • Mean and average are used interchangeably.
  • Each observation contributes equally.
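
A minimal sketch of this definition in Python, using the nine class scores that appear later on slide 17. The variable names are illustrative and not from the lecture.

```python
from statistics import mean

# Nine satisfaction scores (the class scores listed on slide 17).
scores = [6, 6, 6, 6, 6, 6, 7, 5, 7]

# The mean: sum of the items divided by the number of items.
n = len(scores)
mu = sum(scores) / n

print(round(mu, 2))  # 6.11

# The standard library gives the same answer.
library_mu = mean(scores)
```

Every score enters the sum once, which is the sense in which each observation contributes equally.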

10
Median, Mode
  • The median is the value in the middle of your
    variable (or between the two middle values).
  • 50% of observations above, 50% below
  • The mode is the value most frequently observed
  • Each observation does not contribute equally
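
As a rough illustration of the last point, the sketch below compares the class scores from slide 17 with a hypothetical variant in which the lowest score drops from 5 to 1. The variant is made up for the example; only the first list comes from the slides.

```python
from statistics import mean, median, mode

scores = [6, 6, 6, 6, 6, 6, 7, 5, 7]
# Hypothetical variant: the lowest score drops from 5 to 1.
shifted = [6, 6, 6, 6, 6, 6, 7, 1, 7]

print(median(scores), mode(scores))    # 6 6
print(median(shifted), mode(shifted))  # 6 6
print(round(mean(scores), 2), round(mean(shifted), 2))  # 6.11 5.67
```

The median and mode ignore how extreme the lowest score is, while the mean moves with it.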

11
Mean, median, mode: which when?
  • All have different definitions and meanings, but
    science is about understanding and description.
  • Use the summary statistic that is most
    interesting or central to your point!

12
Example SWL
  • Today's SWL: "If I could live my life over, I
    would change almost nothing"
  • 1 - Strongly disagree, disagree, slightly
    disagree, neither agree nor disagree, slightly
    agree, agree, strongly agree - 7
  • How close are mean, median, mode?

13
None is enough
Lecture 1 ended here
  • We need some notion of variability, too.
  • Variability around some point
  • Use the mean!
  • Difference from average makes sense when the
    mean is meaningful
  • Each point contributed to the mean, so has an
    effect on its deviation

14
Mean deviations
  • Summed deviations
  • Always equals zero
  • So square them
  • Then the signs go away
  • Adding absolute values is messy
  • Average squared deviation
  • Divide by n-1, not n: we've already used the mean,
    so n overcounts the independent deviations.
  • This is the variance.
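
The steps on this slide can be sketched in a few lines of Python. The scores are the class data from slide 17, and the variable names are illustrative.

```python
scores = [6, 6, 6, 6, 6, 6, 7, 5, 7]
n = len(scores)
mu = sum(scores) / n

# Deviations from the mean.
deviations = [x - mu for x in scores]

# Summed deviations cancel out (zero, apart from floating-point rounding).
summed = sum(deviations)

# Squaring removes the signs; dividing by n - 1 gives the variance.
squared = [d ** 2 for d in deviations]
variance = sum(squared) / (n - 1)

print(round(variance, 3))  # 0.361
```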

15
Mean deviations
  • Standard deviation
  • Un-square (take the square root of) the average
    squared deviation
  • An average of deviations from the mean
  • Quantifies how much variability there is in the
    data set
  • In the same UNIT (satisfactions) as the original
    variable
  • The unsquared average squared deviation
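
A short sketch of the "un-squaring" step, again with the class scores from slide 17. It relies on Python's `statistics.variance` and `statistics.stdev`, which both use n-1 in the denominator.

```python
from math import sqrt
from statistics import stdev, variance

scores = [6, 6, 6, 6, 6, 6, 7, 5, 7]

var = variance(scores)  # average squared deviation (n - 1 denominator)
sd = sqrt(var)          # back in the original units

print(round(sd, 2))  # 0.6
```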

16
The Z Score
  • Each score xi has a Z score
  • Z is how far from the mean x is, in standard
    deviation units
  • In other words, it is the standardized score
  • Z is scale-free (it does not depend on the original
    units) because the variability has been controlled
    for

17
Z scores
  • Our scores: µ = 6.11, sd = 0.6
  • Scores and Z scores by person:
      name       score     Z
      adam       6.0000   -0.1849
      arron      6.0000   -0.1849
      brent      6.0000   -0.1849
      jason      6.0000   -0.1849
      killian    6.0000   -0.1849
      whitney    6.0000   -0.1849
      dylan      7.0000    1.479
      hal        5.0000   -1.849
      lauren     7.0000    1.479
  • I'm 0.18 sd below the mean
  • Means less satisfied, but not by much
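
The Z scores on this slide can be reproduced with a small Python sketch. The names and scores come from the slide; the code itself is only one way to do the calculation.

```python
from statistics import mean, stdev

scores = {
    "adam": 6, "arron": 6, "brent": 6, "jason": 6, "killian": 6,
    "whitney": 6, "dylan": 7, "hal": 5, "lauren": 7,
}

values = list(scores.values())
mu = mean(values)
sd = stdev(values)  # n - 1 denominator

# Z: distance from the mean in standard deviation units.
z = {name: (x - mu) / sd for name, x in scores.items()}

for name, value in z.items():
    print(f"{name:8s} {value: .4f}")
```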

18
The Normal Distribution
  • What's an SD good for?
  • For normal distributions, 68% of observations fall
    within 1 sd of the mean, 95% within 2
  • In other words, a Z score tells us how ODD a
    score is
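
The 68% and 95% figures can be checked against the standard normal curve. This sketch uses `NormalDist` from Python's standard library; the helper name `share_within` is made up for the example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, sd 1


def share_within(k):
    """Share of a normal distribution lying within k sd of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


print(round(share_within(1), 3))  # 0.683
print(round(share_within(2), 3))  # 0.954
```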

19
How big is big?
  • If Z is big, it means someone's score is many
    sds from the mean
  • 2 is big for normal distributions
  • That person is "odd" or "doesn't belong" if
    they're further than we would expect by chance.
  • GIVEN the variability, is this person odd?
  • Z CONTROLS for variability, but still

20
The Shape of Distributions
  • One SD means different things for different
    distributions
  • So it's hard to say a PERSON is weird

21
The Central Limit Theorem
  • Regardless of the distribution of items, the
    distribution of MEANS is approximately normal (if
    each mean is based on enough items)
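
A small simulation can make this concrete. The sketch below is an illustration under assumed settings, not part of the lecture: the items come from a skewed exponential distribution, and the sample size, number of samples, and seed are arbitrary choices.

```python
import random
from statistics import mean, stdev

random.seed(412)  # arbitrary seed so the run is repeatable

SAMPLE_SIZE = 50   # items per mean (assumed)
N_SAMPLES = 2000   # number of means (assumed)

# Items come from a strongly skewed distribution with mean 1 and sd 1.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(N_SAMPLES)
]

center = mean(sample_means)
spread = stdev(sample_means)

# Share of means within one sd of the center of the means.
within_one = sum(abs(m - center) <= spread for m in sample_means) / N_SAMPLES

print(center, spread, within_one)
```

If the means behave roughly normally, `center` should sit near 1, `spread` near 1/sqrt(50), and `within_one` near 68%, even though the individual items are far from normal.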

22
The Z test
  • So GIVEN how many observations went into the mean
    (here, 10000), we can actually quantify HOW ODD a
    mean is
  • But only because we can place it on the normal
    distribution

23
The Null Hypothesis
  • Assuming that your sample is pulled from a
    certain population, how likely is the mean of the
    sample?
  • How far is your sample's mean from the null
    hypothesis?

24
Back to science
  • Testing a hypothesis means
  • Finding evidence to support a theory
  • Finding evidence to disprove a theory
  • In psychology, we hypothesize relationships among
    variables
  • But we have a lot of variability
  • Are observed relationships accidental?

25
The Null Hypothesis
  • Is the relationship strong enough to overcome the
    variability?
  • Or, how bizarre or unlikely would our observed
    results be if there wasn't really a relationship
    there?
  • Formally:
  • Assume there is no relationship (H0)
  • Quantify how unlikely the data are
  • The task for the term!

26
Our data
  • We can ask three questions using this technique
  • Is our measure DIFFERENT from some point (like
    0)?
  • Are two measures DIFFERENT from each other?
  • Are two measures RELATED to each other?

27
Our data
  • We can ask three questions using this technique
  • Is our measure DIFFERENT from some point (like
    0)? Z-test or T-test
  • Are two measures DIFFERENT from each other?
    T-test
  • Are two measures RELATED to each other?
    Correlation

28
Satisfied?
  • Compared to UO students
  • UO students show µ = 5.05, sd = 1.48
  • So, are we, on average, higher?
  • Average means mean!
  • Is our mean greater than 5.05?
  • Well, duh, it's 6.11.
  • But is that MEANINGFUL?

29
Null hypothesis
  • 6.11 is meaningfully higher than 5.05 if it is
    really unlikely that we'd have randomly gotten a
    number that high GIVEN that we are just UO
    students.
  • So what's a lot of deviation?
  • We have a measure of the deviation, but how do we
    know if it matters?

30
The Central Limit Theorem
  • If the population mean and sd are known, then OUR
    mean is part of a NORMAL distribution of means of
    samples of size 9, centered on the population mean.

31
The Z-Test
  • If we know the mean and standard deviation of the
    population, we can quantify how far above it we
    are.
  • We are 0.717 standard deviations above the mean

32
The Z-Test
  • But SD isn't good enough, because we're using our
    mean, which is produced by 9 observations.
  • The standard error is the sd over sqrt(n)
  • We are 2.15 standard errors above the mean
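
The two distances on these slides can be reproduced as follows. This is a sketch using the numbers given in the lecture (nine class scores summing to 55, population µ = 5.05 and sd = 1.48); the variable names are illustrative.

```python
from math import sqrt

sample_mean = 55 / 9          # our nine scores sum to 55 (about 6.11)
n = 9
pop_mean, pop_sd = 5.05, 1.48

# Distance from the population mean in standard deviations.
distance_in_sd = (sample_mean - pop_mean) / pop_sd

# A mean of n observations varies less than a single score:
# the standard error is the sd over sqrt(n).
standard_error = pop_sd / sqrt(n)
z = (sample_mean - pop_mean) / standard_error

print(round(distance_in_sd, 3), round(z, 2))  # 0.717 2.15
```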

33
The normal distribution
  • 2.15 standard errors above the mean
  • [Figure: normal curve; 98.4% of the area lies below
    Z = 2.15, 1.6% above]

34
Are we normal?
  • So there's a 1.6% chance that a null effect (us
    just being random UO students) would produce a
    result as extreme as 6.11
  • Z = 2.15, p = .016
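
The chance of a result at least this far above the mean is the upper-tail area of the standard normal curve beyond Z. A minimal sketch with Python's standard library, assuming a one-sided test:

```python
from statistics import NormalDist

z = 2.15

# Upper-tail area: the share of null-hypothesis means at or above our Z.
p_upper = 1 - NormalDist().cdf(z)

print(round(p_upper, 3))  # 0.016
```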

35
Are we normal?
  • Our standardized average, of 1.39, is higher than
    91.8% of averages which a true null hypothesis
    would produce.
  • Is that suitably unlikely?
  • So we'd say p = .082
  • But where did that mean and sd come from?

36
Not really parameters.
  • The mean and sd for the population were actually
    from a study with n = 273.
  • Z-tests only work with parameters.
  • T-tests allow for uncertainty on both sides!

37
Independent-samples t
  • complicated.
  • Take the difference in means, over the standard
    error of the difference
  • which is estimated from the standard errors of
    the components, weighted by their sample sizes.
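
The slide does not show the formula, so the sketch below is an assumption rather than the lecture's own calculation. It shows two common ways to build the standard error of the difference: combining the two standard errors directly, and pooling the two variances. The inputs are the class scores from slide 17 and the UO study summary (µ = 5.05, sd = 1.48, n = 273).

```python
from math import sqrt
from statistics import mean, variance

ours = [6, 6, 6, 6, 6, 6, 7, 5, 7]
m1, v1, n1 = mean(ours), variance(ours), len(ours)
m2, v2, n2 = 5.05, 1.48 ** 2, 273  # summary statistics from the UO study

# Option A (assumed): combine the two standard errors directly.
se_unpooled = sqrt(v1 / n1 + v2 / n2)
t_unpooled = (m1 - m2) / se_unpooled

# Option B (assumed): pool the variances, weighting each by its n - 1.
pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
se_pooled = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_pooled = (m1 - m2) / se_pooled

print(round(t_unpooled, 2), round(t_pooled, 2))
```

Either way, t is the difference in means over the standard error of the difference; the two options differ only in how that standard error is estimated.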

38
IST
  • t is larger than z!
  • But not all t's are created the same.

39
Degrees of Freedom
  • When we compute the mean, we use all the numbers
  • The set of numbers has some value pulled out of
    it.
  • So, every time a mean is computed and used, we
    lose a degree of freedom.
  • In this example, we compare two means, so we lose
    two df
  • Df = 9 + 273 - 2 = 280

40
t distributions
41
Our t distribution
  • Now only a 0.9% chance of a null hypothesis
    producing results as extreme as ours!

42
Other kinds of t
  • The t-distribution can be used for other things
    as well
  • One-sample t: compare a sample's mean to any set
    number (are we satisfied, µ > 4?)
  • Dependent-samples t: compare paired data (next
    week).
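
A sketch of the one-sample t statistic for the "are we satisfied?" question, using the class scores from slide 17 and the scale midpoint 4 as the set number. Turning t into a p-value needs the t distribution with n-1 degrees of freedom (a t table or a statistics package), which this standard-library sketch leaves out.

```python
from math import sqrt
from statistics import mean, stdev

scores = [6, 6, 6, 6, 6, 6, 7, 5, 7]
null_value = 4  # the set number to compare against

n = len(scores)
standard_error = stdev(scores) / sqrt(n)  # sd is estimated from the sample
t = (mean(scores) - null_value) / standard_error
df = n - 1  # one mean computed and used, so one degree of freedom lost

print(round(t, 2), df)  # 10.54 8
```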