Descriptive statistics

Provided by: johnrott

1
Descriptive statistics
  • V506 Class 2
  • September 3, 2009

2
Overview
  • Describing variables
  • Measures of central tendency
  • Measures of variation
  • Using SPSS for descriptive statistics

3
Describing variables
  • Have a variable with observations on a (possibly
    large) number of cases
  • Idea is to produce a number of summary measures
    that characterize those data
  • Focus here is on
      • Central tendency
      • Variation

4
Measures of central tendency
  • Mean
  • Median
  • Mode

5
Mean
  • Sum of the values divided by the number of cases

6
Summation notation
  • The $y_i$ ($y_1, y_2, \ldots, y_n$) are the n values of the
    variable Y
  • The sum of the values is then denoted as
    $\sum_{i=1}^{n} y_i = y_1 + y_2 + \cdots + y_n$

7
Calculating the mean for high temperatures
  • Add values
  • Number of cases
  • Calculate mean
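
The three steps above can be sketched in a few lines of Python. The temperature values did not survive in this transcript, so `sample_highs` below is a hypothetical stand-in, not the data from the slide.

```python
# Hypothetical daily high temperatures (illustrative only; the values
# used on the original slide are not available in this transcript).
sample_highs = [30, 34, 38, 42, 46]


def mean(values):
    """Return the sum of the values divided by the number of cases."""
    if not values:
        raise ValueError("mean requires at least one value")
    total = sum(values)   # step 1: add values
    n = len(values)       # step 2: number of cases
    return total / n      # step 3: calculate mean


print(mean(sample_highs))  # 190 / 5 = 38.0
```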

8
Median
  • The median represents the middle of the ordered
    sample data
  • When the sample size is odd, the median is the
    middle value
  • When the sample size is even, the median is the
    midpoint/mean of the two middle values
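
A minimal Python sketch of the odd and even cases described above. The function name and the sample values are illustrative assumptions.

```python
def median(values):
    """Return the middle of the ordered data."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)          # the median is defined on ordered data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd n: the single middle value
    # even n: midpoint of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([41, 30, 38]))       # odd sample size  -> 38
print(median([41, 30, 38, 45]))   # even sample size -> 39.5
```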

9
Calculating the median for high temperatures

10
Mode
  • The mode is the value that occurs most frequently
  • It is the least useful (and least used) of the
    three measures of central tendency
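
One reason the mode is less useful is that it need not be unique. The sketch below, with a hypothetical helper named `modes`, returns every value tied for the highest count. Because it only counts occurrences, it also works on non-numeric data.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most frequently, in first-seen order."""
    if not values:
        raise ValueError("modes requires at least one value")
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([30, 34, 34, 38]))            # one mode: [34]
print(modes(["red", "blue", "red", "blue"]))  # a tie: ['red', 'blue']
```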

11
Calculating the mode for high temperatures
mode = 32

12
Measures of central tendency and levels of
measurement
  • Mean assumes numerical values and requires
    interval data
  • Median requires ordering of values and can be
    used with both interval and ordinal data
  • Mode only involves determination of most common
    value and can be used with interval, ordinal, and
    nominal data

13
Comparison of mean and median
  • Mean
      • Uses all of the data
      • Has desirable statistical properties
      • Affected by extreme high or low values (outliers)
      • May not best characterize skewed distributions
  • Median
      • Not affected by outliers
      • May better characterize skewed distributions
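
The effect of an outlier on the two measures can be seen with a small, made-up example: replacing the largest value with an extreme one moves the mean but leaves the median where it was. Both lists are hypothetical.

```python
from statistics import mean, median

typical = [30, 34, 38, 42, 46]        # hypothetical values
with_outlier = [30, 34, 38, 42, 96]   # same data, largest value made extreme

# The mean uses every value, so the outlier pulls it upward.
print(mean(typical), mean(with_outlier))      # 38 48

# The median depends only on the middle of the ordered data.
print(median(typical), median(with_outlier))  # 38 38
```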

14
The mean and median and the distribution of the
data
  • For symmetric distributions, the mean and the
    median are the same
  • For skewed distributions, the mean lies in the
    direction of the skew (the longer tail) relative
    to the median

15
Distribution shapes
  • [Figure: three panels showing positively skewed, symmetric,
    and negatively skewed distributions]

16
Comparison of mean and median

17
Measures of variation
  • Range
  • Variance and standard deviation
  • Interquartile range

18
Range
  • Range is the difference between the minimum and
    maximum values
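
As a short sketch, the range is a single subtraction. The function name `value_range` and the sample list are assumptions made for illustration.

```python
def value_range(values):
    """Return the difference between the maximum and minimum values."""
    if not values:
        raise ValueError("value_range requires at least one value")
    return max(values) - min(values)


print(value_range([30, 34, 38, 42, 46]))  # 46 - 30 = 16
```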

19
Calculating the range for high temperatures
range = 60 - 32 = 28

20
Variance and standard deviation
  • The variance $s^2$ is the sum of the squared
    deviations from the mean divided by the number of
    cases minus 1
  • The standard deviation s is the square root of
    the variance
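
The two definitions above translate directly into code. This is a minimal sketch with hypothetical sample values. Python's standard `statistics.variance` and `statistics.stdev` use the same n - 1 divisor and can serve as a cross-check.

```python
import math
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance requires at least two values")
    m = sum(values) / n
    return sum((y - m) ** 2 for y in values) / (n - 1)


def sample_std(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))


data = [30, 34, 38, 42, 46]   # hypothetical values, mean 38
print(sample_variance(data))  # (64 + 16 + 0 + 16 + 64) / 4 = 40.0
print(sample_std(data))       # square root of 40
print(statistics.variance(data), statistics.stdev(data))  # cross-check
```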

21
Why squared? Why n-1?
  • Why square differences between data values and
    mean?
      • Gives positive values
      • Gives more weight to larger differences
      • Has desirable statistical properties
  • Why n - 1 for sample variance?
      • Dividing by n underestimates population variance
      • Dividing by n - 1 gives unbiased estimate of
        population variance
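
The n versus n - 1 claim can be checked exactly on a toy case. The sketch below assumes a hypothetical three-value population and enumerates every possible sample of size 2 drawn with replacement (independent draws), then averages each estimator over all samples. Exact fractions are used so no rounding hides the result.

```python
from fractions import Fraction
from itertools import product


def variance(values, divisor_offset):
    """Sum of squared deviations divided by (n - divisor_offset)."""
    n = len(values)
    m = Fraction(sum(values), n)
    return sum((y - m) ** 2 for y in values) / (n - divisor_offset)


population = [2, 4, 6]               # hypothetical tiny population
pop_var = variance(population, 0)    # population variance (divide by N)

# Every ordered sample of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

avg_div_n = sum(variance(s, 0) for s in samples) / len(samples)
avg_div_n_minus_1 = sum(variance(s, 1) for s in samples) / len(samples)

print(pop_var)             # 8/3
print(avg_div_n)           # 4/3, too small on average
print(avg_div_n_minus_1)   # 8/3, matches the population variance
```

Averaged over all samples, dividing by n gives only (n - 1)/n of the population variance, which is the underestimate the slide refers to.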

22
Variance versus standard deviation
  • Standard deviation is in the same units as the
    variable and is more readily interpreted
  • Standard deviation is a measure of absolute
    variation, expressed in the variable's units
  • Variance has properties making it useful for
    certain statistical analyses

23
Calculating the variance and standard deviation
for high temperatures

24
Interpretation of standard deviation
  • If the distribution of the data is approximately
    bell shaped, then
      • About 68% of the data fall within one standard
        deviation of the mean
      • About 95% of the data fall within two standard
        deviations of the mean
      • Nearly all of the data fall within three standard
        deviations of the mean
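
These percentages come from the normal distribution, for which the share within k standard deviations of the mean equals erf(k / sqrt(2)). The sketch below evaluates that expression for an exactly normal distribution. Real data that are only approximately bell shaped will match only roughly.

```python
import math


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```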

25
Coefficient of variation
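  • Coefficient of variation (also sometimes called the
    coefficient of dispersion)
  • Standard deviation divided by the mean:
    $CV = s / \bar{y}$, where $\bar{y}$ is the mean
  • Measure of relative variation
  • Use to compare variation of distributions with
    different units relative to their means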
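
A small sketch of why a relative measure helps, taking the coefficient of variation as the standard deviation divided by the mean. The same hypothetical heights are recorded in metres and in centimetres: the standard deviations differ by a factor of 100, but the coefficients of variation agree. The measure assumes a nonzero (in practice positive) mean.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    m = statistics.mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / m


heights_m = [1.5, 1.6, 1.7, 1.8, 1.9]    # hypothetical heights in metres
heights_cm = [150, 160, 170, 180, 190]   # the same heights in centimetres

print(statistics.stdev(heights_m), statistics.stdev(heights_cm))
print(coefficient_of_variation(heights_m), coefficient_of_variation(heights_cm))
```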

26
Interquartile range
  • Difference between upper (third) and lower
    (first) quartiles
  • Quartiles divide data into four equal groups
      • Lower (first) quartile is 25th percentile
      • Middle (second) quartile is 50th percentile and
        is the median
      • Upper (third) quartile is 75th percentile
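
A hedged sketch using Python's `statistics.quantiles` (available from Python 3.8). Several conventions exist for interpolating quartiles, so the `method="inclusive"` choice here is an assumption, and other software may report slightly different quartiles for the same data. The values are hypothetical.

```python
import statistics


def quartiles_and_iqr(values):
    """Return (Q1, Q2, Q3, IQR) using the 'inclusive' interpolation rule."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q2, q3, q3 - q1


data = [30, 32, 34, 36, 38, 40, 42, 44, 46]   # hypothetical values
q1, q2, q3, iqr = quartiles_and_iqr(data)

print(q1, q2, q3)   # 34.0 38.0 42.0
print(iqr)          # 8.0
print(q2 == statistics.median(data))   # the second quartile is the median
```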

27
Calculating the interquartile range for high
temperatures
interquartile range = 52 - 35 = 17

28
Interquartile range and outliers
  • Value can be considered to be an outlier if it
    falls more than 1.5 times the interquartile range
    above the upper quartile or more than 1.5 times
    the interquartile range below the lower quartile
  • Example for high temperatures
      • Interquartile range is 17
      • 1.5 times interquartile range is 25.5
      • Outliers would be values
          • Above 52 + 25.5 = 77.5 (none)
          • Below 35 - 25.5 = 9.5 (none)
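
The rule can be sketched as two small functions. The quartiles 35 and 52 are the ones from the high-temperature example. The function names and the list passed to `find_outliers` are hypothetical, chosen only to show values on both sides of the cutoffs.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) cutoffs: k * IQR beyond each quartile."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3, k=1.5):
    """Return the values lying outside the fences."""
    low, high = outlier_fences(q1, q3, k)
    return [v for v in values if v < low or v > high]


# Quartiles from the high-temperature example: Q1 = 35, Q3 = 52.
print(outlier_fences(35, 52))                   # (9.5, 77.5)

# Hypothetical values, two of them beyond the fences.
print(find_outliers([5, 32, 60, 80], 35, 52))   # [5, 80]
```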

29
Comparison of range, standard deviation, and
interquartile range
  • Sensitivity to extreme values
      • Range extremely sensitive
      • Standard deviation very sensitive
      • Interquartile range not sensitive
  • Standard deviation
      • Has desirable statistical properties
      • Suggests numbers of cases in different intervals
        for bell-shaped distributions

30
Typical work session with SPSS
  • Create working folder on the C: drive using Windows
    Explorer
  • Download SPSS .sav file to working folder
  • Open .sav file in SPSS
  • Select procedure from Analyze menu
  • Select variable(s), specify options, run
  • Print output, save output, or copy output to
    other documents

31
Entering new data
  • Done in Data Editor window
  • Click on Variable View tab to specify variable
    name, type, and other information
  • Click on Data View tab to enter values in
    spreadsheet-like window
  • Variables are columns
  • Cases are rows

32
Saving data
  • Use File, Save command while in Data Editor
    window to save data as SPSS data file
  • Saves file with .sav extension
  • Use File, Save As to save with new filename, as
    with any Windows program

33
Using existing SPSS datasets
  • Use File, Open, Data command to open existing
    SPSS data file
  • Opens new Data Editor window with data from new
    data file

34
Descriptive statistics using Descriptives
  • Use Analyze, Descriptive Statistics, Descriptives
    to run procedure
  • Select variables for which descriptive statistics
    are to be computed
  • Use Options to select statistics to be computed
  • Quick way to get basic descriptive statistics
  • Compact display of results for multiple variables
  • Limited number of statistics available

35
Descriptive statistics using Frequencies
  • Use Analyze, Descriptive Statistics, Frequencies
    to run procedure
  • Select variables
  • For variables taking on large numbers of
    different values, uncheck Display frequency
    tables box
  • Must select Statistics to specify which statistics
    to compute; the default is none
  • Wide range of statistics, including percentiles

36
Descriptive statistics using Explore
  • Use Analyze, Descriptive Statistics, Explore to
    run procedure
  • Select variables
  • Click Statistics button if only statistics
    desired and not graphs
  • Gives good assortment of statistics quickly,
    without need to specify choices

37
Saving SPSS output
  • Use File, Save while in SPSS Output Viewer window
    to save output file
  • File is saved with .spo extension
  • Use File, Open, Output to load existing saved
    output file

38
Modifying SPSS output
  • Can delete output by selecting it in the outline in
    the left panel and pressing Delete; this keeps things
    organized after running something incorrectly
  • Use Insert, New Text command to open box for
    adding text to output after currently selected
    location
  • Allows adding notes, comments, answers to output

39
Printing SPSS output
  • Use File, Print while in SPSS Output Viewer
    window to print output
  • Selecting output in outline in left panel allows
    only selected output to be printed
  • Can use normal Windows Shift-Click and
    Control-Click in selecting output

40
Copying SPSS output to other documents
  • Select item to be copied, either in outline in
    left panel or in output itself
  • Generally works best to copy one object at a time
  • Choose the Edit, Copy command
  • Can right-click on object and select command from
    pop-up menu
  • Paste the output into the target document
  • In some cases, using Paste Special to paste
    output in another format works better, e.g., as
    Windows Metafile into PowerPoint