Univariate Statistics - PowerPoint PPT Presentation

Description:

Descriptive Statistics: Describe Variables (where data are any collection of ... Inferential Statistics: Make inferences about the population based on ...

Provided by: rfor6

Transcript and Presenter's Notes


1
Univariate Statistics
  • Analysis of a single variable
  • Two general varieties
  • Descriptive Statistics: Describe variables
    (where data are any collection of observations,
    sample or population)
  • Inferential Statistics: Make inferences about the
    population based on characteristics of sample data

2
List of Variable Values
3
Frequency Distribution
  • A summary of the observations for a variable
  • Includes a list of the values of the variable and
    the frequency of observations for each value
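
A frequency distribution of this kind can be sketched in a few lines of Python. The grades below are hypothetical (the slide's own data did not survive in this transcript), and `frequency_table` is an illustrative helper name. Each row holds the value, its frequency, the relative frequency (Freq. / Total), and the percentage.

```python
from collections import Counter

def frequency_table(values):
    """Return (value, freq, relative freq, percent) rows, sorted by value."""
    counts = Counter(values)
    total = len(values)
    rows = []
    for value in sorted(counts):
        freq = counts[value]
        rows.append((value, freq, freq / total, 100 * freq / total))
    return rows

# Hypothetical midterm grades, for illustration only
grades = [70, 85, 85, 90, 70, 85, 100, 90]
table = frequency_table(grades)

for value, freq, rel, pct in table:
    print(f"{value:>5}  {freq:>4}  {rel:6.3f}  {pct:6.1f}%")
```

The same helper works for a nominal variable if the values are category labels, since `Counter` only needs hashable items.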

4
Example: Interval/Ratio
  • Freq. distribution of midterm grades

5
Example: Interval/Ratio
6
Example: Interval/Ratio
Freq. / Total
7
Example: Interval/Ratio
(Freq. / Total) × 100
8
Example - Nominal
  • Freq. distribution of active hate group
    organizations in 1999

9
Example - Nominal
10
Summarizing Data in Graphs
  • Pie charts and bar charts: appropriate for nominal
    and ordinal variables (small number of
    categories)

11
Example: Bar Chart
12
Summarizing Data in Graphs
  • Histograms: appropriate for all interval/ratio
    variables with a large number of possible values.
    Data are collapsed into intervals, and axis
    labels represent interval boundaries or interval
    midpoints
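
Collapsing data into intervals can be sketched as follows. The rates are hypothetical values, the interval width of 2.0 is an arbitrary choice, and `bin_counts` is an illustrative helper name. Each observation is assigned to a half-open interval labelled by its lower boundary.

```python
from collections import Counter

def bin_counts(values, width, start=0.0):
    """Count observations in half-open intervals [lower, lower + width)."""
    counts = Counter()
    for v in values:
        index = int((v - start) // width)   # which interval v falls into
        lower = start + index * width       # lower boundary of that interval
        counts[lower] += 1
    return dict(sorted(counts.items()))

# Hypothetical unemployment rates (percent), for illustration only
rates = [2.1, 3.4, 3.9, 4.4, 4.6, 5.5, 5.8, 7.2, 8.6, 13.0]
hist = bin_counts(rates, width=2.0)

# Crude text histogram: one asterisk per observation
for lower, count in hist.items():
    print(f"[{lower:4.1f}, {lower + 2.0:4.1f})  {'*' * count}")
```

Empty intervals are simply absent from the result; a plotting library would normally draw them as zero-height bars.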

13
Histogram of County Unemployment Rates in Fla
14
Measures of Central Tendency
  • Mean: $\bar{Y} = \sum_{i=1}^{N} Y_i / N$
  • Appropriate for interval/ratio variables ONLY

15
Measures of Central Tendency
  • Median: Defined as the value of the variable in
    the middle of the distribution.
  • Odd # of obs: 2 2 5 9 11, median = 5
  • Even # of obs: 2 2 5 9 11 15,
    median = (5 + 9)/2 = 7
  • Appropriate for ordinal, interval and ratio
    variables

16
Measures of Central Tendency
  • Mode: Defined as the value that occurs most
    often.
  • 2 2 5 9 11 15
  • Mode = 2
  • Appropriate for all levels of measurement
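
The three measures of central tendency can be checked against the slides' own small data sets with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative.

```python
import statistics

odd_obs = [2, 2, 5, 9, 11]        # odd number of observations
even_obs = [2, 2, 5, 9, 11, 15]   # even number of observations

mean_even = statistics.mean(even_obs)      # sum of values divided by N
median_odd = statistics.median(odd_obs)    # middle value
median_even = statistics.median(even_obs)  # average of the two middle values
mode_even = statistics.mode(even_obs)      # most frequent value

print(mean_even, median_odd, median_even, mode_even)
```

Note that `statistics.median` sorts the data itself, so the input does not have to be ordered in advance.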

17
Measures of Dispersion
  • 1. Range = $Y_{\max} - Y_{\min}$
  • Weakness?
  • 2. Percentiles: For variable Y, the pth
    percentile represents the value of Y below which
    p% of the observations fall.
  • 50th percentile = median
  • IQR = $Y_{75pct} - Y_{25pct}$
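
A short sketch of the range and the interquartile range, reusing the six observations from the median example. Percentile conventions differ between software packages; this uses the default ("exclusive") method of Python's `statistics.quantiles`, which is an assumption and may not match the quartiles another package reports for such a small sample.

```python
import statistics

y = [2, 2, 5, 9, 11, 15]

value_range = max(y) - min(y)              # Ymax - Ymin

# Quartiles: 25th, 50th and 75th percentiles (default 'exclusive' method)
q1, q2, q3 = statistics.quantiles(y, n=4)
iqr = q3 - q1                              # Y75pct - Y25pct

print(value_range, q1, q2, q3, iqr)
```

The 50th percentile returned here coincides with the median of the same data.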

18
Measures of Dispersion (contd)
  • More complex measures: Based on mean
    deviations $(Y_i - \bar{Y})$

  • Average Mean Deviation(?) = $\sum (Y_i - \bar{Y}) / N$

  • Mean Absolute Deviation = $\sum |Y_i - \bar{Y}| / N$
  • could use as measure of variation

  • Mean Squared Deviation = $\sum (Y_i - \bar{Y})^2 / N$
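
The question mark on the average mean deviation can be explored numerically: deviations from the mean always sum to zero, so their average carries no information about spread. The sketch below uses the six observations from the earlier slides. Variable names are illustrative.

```python
y = [2, 2, 5, 9, 11, 15]
n = len(y)
y_bar = sum(y) / n

deviations = [yi - y_bar for yi in y]

# Positive and negative deviations cancel, so this is zero (up to rounding)
mean_deviation = sum(deviations) / n

# Taking absolute values or squaring removes the cancellation
mean_abs_deviation = sum(abs(d) for d in deviations) / n
mean_sq_deviation = sum(d ** 2 for d in deviations) / n

print(mean_deviation, mean_abs_deviation, mean_sq_deviation)
```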

19
Variance (sample)
  • $s^2_Y = \sum (Y_i - \bar{Y})^2 / (N-1)$
  • Standard Deviation
  • $s_Y = \sqrt{s^2_Y}$
  • Numerator: Sum of Squares
  • Denominator: degrees of freedom
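
A minimal sketch of the sample variance and standard deviation, computed by hand from the sum of squares and N - 1 degrees of freedom, then compared with the standard library's `statistics.variance` and `statistics.stdev` (both of which also divide by N - 1). The data are the six observations used on the earlier slides.

```python
import math
import statistics

y = [2, 2, 5, 9, 11, 15]
n = len(y)
y_bar = sum(y) / n

sum_of_squares = sum((yi - y_bar) ** 2 for yi in y)   # numerator
degrees_of_freedom = n - 1                            # denominator

sample_variance = sum_of_squares / degrees_of_freedom
sample_sd = math.sqrt(sample_variance)

# The library functions should agree with the hand calculation
print(sample_variance, statistics.variance(y))
print(sample_sd, statistics.stdev(y))
```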

20
The Normal Distribution
  • Symmetric
  • Bell-shaped
  • Mean = Median = Mode

21
The Normal Distribution
22
Deviations from the normal distribution
  • Bimodal distributions
  • Skewed distributions
  • Left skew vs. right skew
  • Mean is pulled in direction of skew
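
The pull of the mean toward the skew can be seen in a tiny example. Both data sets are hypothetical: the second replaces the largest value of the first with an extreme one, which creates a right skew.

```python
import statistics

symmetric = [1, 2, 3, 4, 5]       # hypothetical, symmetric around 3
right_skewed = [1, 2, 3, 4, 50]   # hypothetical, long right tail

sym_mean = statistics.mean(symmetric)
sym_median = statistics.median(symmetric)
skew_mean = statistics.mean(right_skewed)
skew_median = statistics.median(right_skewed)

# The median stays at 3; the mean moves toward the long tail
print(sym_mean, sym_median)
print(skew_mean, skew_median)
```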

23
Histogram of County Unemployment Rates in Fla
24
Descriptive Statistics for County Unemployment
Rates in Fla
  . sum unemp, detail

                              unemp
  -------------------------------------------------------------
        Percentiles      Smallest
   1%            2            1.7
   5%          2.4            1.7
  10%          2.7            1.7       Obs                3149
  25%          3.4            1.7       Sum of Wgt.        3149

  50%          4.4                      Mean           4.809908
                          Largest       Std. Dev.      2.129031
  75%          5.5           19.5
  90%          7.2           19.5       Variance       4.532774
  95%          8.6           19.6       Skewness        2.30285
  99%           13           19.7       Kurtosis       12.11621

25
Sampling Distribution (sample means)
  • Population
  • Draw Random Sample of Size N
  • Calculate sample mean
  • Repeat until all possible random samples are
    exhausted
  • The resulting collection of sample means is the
    sampling distribution of sample means

26
Sampling Distribution of Sample Means
  • A frequency distribution of all possible sample
    means for a given sample size (N)
  • The mean of the sampling distribution will be
    equal to the population mean.
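
For a very small population, the sampling distribution can be built exhaustively. The sketch below assumes a hypothetical five-value population and samples of size 2 drawn without replacement (order ignored), which is one of several possible sampling schemes. It confirms that the mean of all sample means equals the population mean.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

# Hypothetical population, for illustration only
population = [2, 4, 6, 8, 10]
sample_size = 2

# Every possible sample of the given size (without replacement)
sample_means = [mean(s) for s in combinations(population, sample_size)]

# Frequency distribution of the sample means = sampling distribution
sampling_distribution = Counter(sample_means)

print(sorted(sampling_distribution.items()))
print(mean(sample_means), mean(population))
```

With realistic population sizes the number of possible samples is far too large to enumerate, which is why the result is normally used as a theoretical property rather than computed.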

27
Sampling Distribution of Sample Means
  • When N is reasonably large (> 30), the sampling
    distribution will be approximately normally
    distributed
  • The standard error of the sampling distribution
    can be reliably estimated as $s_Y / \sqrt{N}$
    (where $s_Y$ = sample standard deviation for Y
    and N = sample size).
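
A small simulation can illustrate the standard error formula. Everything here is an assumption made for the sketch: the population is exponential with mean 1 and standard deviation 1 (deliberately skewed), N = 50, and 5000 repeated samples stand in for "all possible samples". The spread of the simulated sample means should be close to the standard deviation divided by the square root of N.

```python
import math
import random
import statistics

random.seed(0)           # fixed seed so the run is repeatable
N = 50                   # sample size (assumed)
REPS = 5000              # number of simulated samples (assumed)

sample_means = []
for _ in range(REPS):
    sample = [random.expovariate(1.0) for _ in range(N)]
    sample_means.append(sum(sample) / N)

# Spread of the simulated sampling distribution
empirical_se = statistics.stdev(sample_means)

# Value implied by the formula, using the known population sd of 1
theoretical_se = 1.0 / math.sqrt(N)

# Estimate from a single sample, as one would do in practice
one_sample = [random.expovariate(1.0) for _ in range(N)]
estimated_se = statistics.stdev(one_sample) / math.sqrt(N)

print(empirical_se, theoretical_se, estimated_se)
```

The single-sample estimate varies from run to run because it relies on one sample's standard deviation, while the first two figures should agree closely.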


28
Standard Error
  • How the sample means vary from sample to sample
    (i.e. within the sampling distribution) is
    expressed statistically by the value of the
    standard deviation (i.e. standard error) of the
    sampling distribution.
  • (Standard deviation: roughly the average distance
    of each observation from the mean)

29
Using the Standard Error to Calculate a 95%
Confidence Interval
  • Calculate the mean of Y
  • Calculate the standard deviation of Y
  • Calculate the standard error of Y
  • Calculate a 95% confidence interval for the
    population mean of Y
  • 95% CI = $\bar{Y} \pm 1.96 \times$ (standard error)
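
The four steps can be sketched as one function working from raw data. The ratings are hypothetical and `ci95` is an illustrative helper name. The multiplier 1.96 relies on the normal approximation, so the sketch assumes a reasonably large sample.

```python
import math
import statistics

def ci95(values):
    """Return (lower, upper) bounds of a 95% CI for the population mean."""
    n = len(values)
    y_bar = statistics.mean(values)       # step 1: mean of Y
    s = statistics.stdev(values)          # step 2: sample standard deviation
    se = s / math.sqrt(n)                 # step 3: standard error
    margin = 1.96 * se                    # step 4: half-width of the interval
    return y_bar - margin, y_bar + margin

# Hypothetical ratings on a 0-100 scale (40 observations)
ratings = [40, 50, 60, 70, 80] * 8
low, high = ci95(ratings)
print(low, high)
```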

30
Example
  • Hillary Clinton Feeling Thermometer (NES 2004)

31
Example
  • Hillary Clinton Feeling Thermometer (NES 2004)
  • Mean = 64.137, s.d. = 88.408, N = 1212

32
Example
  • Hillary Clinton Feeling Thermometer (NES 2004)
  • Mean = 64.137, s.d. = 88.408, N = 1212
  • Standard Error = $88.408 / \sqrt{1212} = 2.539$

33
Example
  • Hillary Clinton Feeling Thermometer (NES 2004)
  • Mean = 64.137, s.d. = 88.408, N = 1212
  • Standard Error = $88.408 / \sqrt{1212} = 2.539$
  • 95% CI = $64.137 \pm 1.96 \times 2.539$
  • ≈ (59.16, 69.11)
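
The arithmetic in this example can be rechecked from the reported summary figures alone. The sketch below takes the mean, standard deviation, and N from the slide as given. Differences in the third decimal place can arise depending on whether the standard error is rounded before it is multiplied by 1.96.

```python
import math

# Summary figures as reported on the slide
mean_y = 64.137
sd_y = 88.408
n = 1212

se = sd_y / math.sqrt(n)          # standard error of the mean
margin = 1.96 * se                # half-width of the 95% interval
low, high = mean_y - margin, mean_y + margin

print(round(se, 3), round(low, 2), round(high, 2))
```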