SPSS is the 2nd most popular package. - PowerPoint PPT Presentation

Slides: 90
Provided by: nipis
Transcript and Presenter's Notes

1
SPSS is the 2nd most popular package. It is much
easier to use than SAS and Stata.
2
Install additional software for statistical odds
and ends
  • Instat by GraphPad (graphpad.com)
  • for summary data analysis - $100
  • True Epistat by Epistat Services
    (true-epistat.com) - $395
  • for random number tables, etc.
  • CIA (Confidence Interval Analysis) from bmj.com
  • for confidence intervals - 35.95 with the book
    Statistics with Confidence (D. Altman)

3
Install a sample size program.
  • If you can afford to spend $400, buy nQuery
    Advisor from Statistical Solutions (www.statsol.com).
  • If you can afford to spend $0, download PS from
    the Vanderbilt web site:
  • http://www.mc.vanderbilt.edu/prevmed/ps/index.htm
  • Both packages are on the CRC's statistical
    workstation in room A-3101. VUMC investigators
    are welcome to use this workstation.

4
II. You Will Need a Plan
5
Use the scientific method to keep your project
focused.
  • State the problem
  • Formulate the null hypothesis
  • Design the study
  • Collect the data
  • Interpret the data
  • Draw conclusions

6
State the Problem
  • Among patients hospitalized for a hip fracture
    who develop pneumonia during their stay in the
    hospital, the mortality rate is 2.3 times higher
    at non-trauma centers compared with trauma
    centers
  • (48.7% vs. 21.1%, P = 0.043).
  • It is not clear if, or how, those who will
    develop pneumonia could be identified on
    admission.

7
Formulate the Null Hypothesis
  • Among patients hospitalized for treatment of a
    hip fracture, there are no factors known upon
    admission that are statistically different
    between those who develop pneumonia during their
    stay and those who do not.

8
Why bother with a null hypothesis?
  • For the same reason that we assume that a person
    is innocent until proven guilty.
  • The burden of responsibility is on the prosecutor
    to present enough evidence to convince the
    members of a jury that the charges are true
    and to change their minds.
  • Outcome after treatment with Drug A will not be
    significantly different from placebo.

9
Design the Study
  • Data on 933 patients with a hip fracture from a
    New York trauma registry will be analyzed.
  • The 58 patients with pneumonia will be compared
    with the 875 without pneumonia.

10
The Most Common Type of Flaw
11
Example of Recall Bias
  • A control group is asked:
  • "Two weeks ago today, did you eat X for
    breakfast?"
  • Two weeks after their MI, patients are asked:
  • "Did you eat X for breakfast on the day of your
    heart attack?"
  • You can prove any food causes an MI using this
    method (X = bacon, X = Flintstone vitamins, etc.).

12
John Bailar's Quote
  • Study design and bias are much more important
    than complex statistical methods.
  • Devote more time to improving the study design,
    and minimizing and measuring bias.
  • Become an expert at study design issues and
    biases in your area of research.

13
What is the statistical power of the study?
  • Power
  • Beta
  • Alpha
  • Sample size
  • Ratio of treated to control group
  • Measure of outcome
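The factors listed above combine in the usual normal-approximation formula for comparing two proportions. The following is a minimal sketch, assuming equal group sizes (a 1:1 ratio) and a two-sided alpha. The function name `sample_size_two_proportions` is hypothetical, and dedicated programs such as nQuery Advisor or PS may apply corrections that give somewhat different numbers.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate number of patients needed in EACH of two equal groups.

    Simple normal-approximation formula:
    n = (z_alpha/2 + z_beta)^2 * (p1*q1 + p2*q2) / (p1 - p2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Mortality rates from the hip fracture example: 48.7% vs. 21.1%
print(sample_size_two_proportions(0.487, 0.211))
```

Raising the power or shrinking the expected difference both increase the required sample size, which is why these inputs need to be fixed before data collection.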

14
Sample Size Table
  • See Table 9-1 in the handout
  • Sample Size Requirements for Each of Two Groups.

16
Collect the Data
  • See the handouts for
  • I TEC Trauma Systems Study

17
III. You Will Need Data Management Skills
18
Enter your data with statistical analysis in mind.
  • For small projects enter data into Microsoft
    Excel or directly into SPSS.
  • For large projects, create a database with
    Microsoft Access.
  • Keep variable names in the first row, with <=8
    characters and no internal spaces.
  • Enter as little text as possible and use codes
    for categories, such as 1 = male, 2 = female.
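The naming rules above are easy to check automatically before importing a spreadsheet. This is a small sketch only: the function name `check_variable_names` and the example header row are hypothetical, not taken from the handout.

```python
def check_variable_names(names, max_length=8):
    """Return (name, problem) pairs for names that break the rules."""
    problems = []
    for name in names:
        if len(name) > max_length:
            problems.append((name, "longer than %d characters" % max_length))
        if any(ch.isspace() for ch in name):
            problems.append((name, "contains a space"))
    return problems


# Codes for categories are entered as numbers rather than text
GENDER_CODES = {1: "male", 2: "female"}

# Hypothetical first row of a spreadsheet
header = ["id", "age", "gender", "systolic bp", "comorbidities"]
for name, problem in check_variable_names(header):
    print(name, "->", problem)
```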

19
Spreadsheet from Hell
20
Spreadsheet from Heaven
21
IV. You Will Need to Learn Descriptive Statistics
22
Descriptive vs. Inferential
  • Descriptive statistics summarize your group.
  • Average age 78.5, 89.3% white.
  • Inferential statistics use the theory of
    probability to make inferences about larger
    populations from your sample.
  • White patients were significantly older than
    black and Hispanic patients, P < 0.001.

23
Import your data into a statistical program for
screening and analysis.
24
Screen your data thoroughly for errors and
inconsistencies before doing ANY analyses.
  • Check the lowest and highest value for each
    variable.
  • For example, age 1-777.
  • Look at histograms to detect typos.
  • Cross-check variables to detect impossible
    combinations.
  • For example, pregnant males, survivors discharged
    to the morgue, patients in the ICU for 25 days
    with no complications.
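The range checks and cross-checks above can be sketched in a few lines. Everything here is illustrative: the function `screen_records`, the field names, the age limits, and the three records are assumptions made for the example, not data from the study.

```python
def screen_records(records, limits):
    """Return human-readable warnings for suspicious records."""
    warnings = []
    for rec in records:
        # Range check: lowest and highest plausible value for each variable
        for field, (low, high) in limits.items():
            value = rec.get(field)
            if value is not None and not (low <= value <= high):
                warnings.append("id %s: %s=%s is outside %s-%s"
                                % (rec["id"], field, value, low, high))
        # Cross-check: impossible combination (gender code 1 = male)
        if rec.get("gender") == 1 and rec.get("pregnant") == 1:
            warnings.append("id %s: pregnant male" % rec["id"])
    return warnings


# Hypothetical records; 777 is a typo for a real age
records = [
    {"id": 1, "age": 78, "gender": 2, "pregnant": 0},
    {"id": 2, "age": 777, "gender": 1, "pregnant": 0},
    {"id": 3, "age": 81, "gender": 1, "pregnant": 1},
]
limits = {"age": (0, 110)}

for warning in screen_records(records, limits):
    print(warning)
```

A flagged value is only a prompt to check the source record; an unusual but plausible value may turn out to be correct.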

25
Analyze, Descriptive Statistics, Frequencies,
select the variable
26
Analyze, Descriptive Statistics, Crosstabs
27
Correct the data in the original database or
spreadsheet and import a revised version into the
statistical package.
  • The age of 777 should be checked and changed to
    the correct age.
  • Suspicious values, such as an age of 106, should
    be checked. In this case it is correct.

28
Interpret the Data
29
Run descriptive statistics to summarize your data.
30
V. You Will Need to Learn Inferential Statistics
31
P Value
  • A P value is an estimate of the probability that
    results such as yours could have occurred by
    chance alone if there truly were no difference or
    association.
  • P < 0.05 = 5% chance, 1 in 20.
  • P < 0.01 = 1% chance, 1 in 100.
  • Alpha is the threshold. If P is < this
    threshold, you consider it statistically
    significant.

32
Basic formula for inferential tests
  • Based on the total number of observations and the
    size of the test statistic, one can determine the
    P value.

33
How many noise units?
  • Test statistic and sample size (degrees of
    freedom) convert to a probability, or P value.

34
Use inferential statistics to test for differences
and associations.
  • There are hundreds of statistical tests.
  • A clinical researcher does not need to know them
    all.
  • Learn how to perform the most common tests on
    SPSS.
  • Learn how to use the statistical flowchart to
    determine which test to use.

35
VI. You Will Need to Understand the Statistical
Terminology Required to Select the Proper
Inferential Test
36
Univariate vs. Multivariate
  • Univariate analysis usually refers to one
    predictor variable and one outcome variable
  • Is gender a predictor of pneumonia?
  • Multivariate analysis usually refers to more than
    one predictor variable or more than one outcome
    variable being evaluated simultaneously.
  • After adjusting for age, is gender a predictor of
    pneumonia?

37
Difference vs. Association
  • Some tests are designed to assess whether there
    are statistically significant differences between
    groups.
  • Is there a statistically significant difference
    between the age of patients with and without
    pneumonia?
  • Some tests are designed to assess whether there
    are statistically significant associations
    between variables.
  • Is the age of the patient associated with the
    number of days in the hospital?

38
Unmatched vs. Matched
  • Some statistical tests are designed to assess
    groups that are unmatched or independent.
  • Is the admission systolic blood pressure
    different between men and women?
  • Some statistical tests are designed to assess
    groups that are matched or data that are paired.
  • Is the systolic blood pressure different between
    admission and discharge?

39
Level of Measurement
  • Categorical vs. continuous variables
  • If you take the average of a continuous variable,
    it has meaning.
  • Average age, blood pressure, days in the
    hospital.
  • If you take the average of a categorical
    variable, it has no meaning.
  • Average gender, race, smoker.

40
Level of Measurement
  • Nominal - categorical
  • gender, race, hypertensive
  • Ordinal - categories that can be ranked
  • none, light, moderate, heavy smoker
  • Interval - continuous
  • blood pressure, age, days in the hospital

41
Horse race example
  • Nominal
  • Did this horse come in first place?
  • 0 = no, 1 = yes
  • Ordinal
  • In what position did this horse finish?
  • 1 = first, 2 = second, 3 = third, etc.
  • Interval (scale)
  • How long did it take for this horse to finish?
  • 60 seconds, etc.

43
Normal vs. Skewed Distributions
  • Parametric statistical tests can be used to assess
    variables that have a normal or symmetrical
    bell-shaped distribution curve for a histogram.
  • Nonparametric statistical tests can be used to
    assess variables that are skewed or nonnormal.
  • Look at a histogram to decide.

44
Examples of Normal and Skewed
45
VII. You Will Need to Know Which Statistical
Test to Use
46
Commonly used statistical methods
  • 1. Chi-square
  • 2. Logistic regression
  • 3. Student's t-test
  • 4. Fisher's exact test
  • 5. Cox proportional-hazards
  • 6. Kaplan-Meier method
  • 7. Wilcoxon rank-sum test
  • 8. Log-rank test
  • 9. Linear regression analysis
  • 10. Mantel-Haenszel method

47
Commonly used statistical methods
  • 11. One-way analysis of variance (ANOVA)
  • 12. Mann-Whitney U test
  • 13. Kruskal-Wallis test
  • 14. Repeated-measures analysis of variance
  • 15. Paired t-test
  • 16. Chi-square test for trend
  • 17. Wilcoxon signed-rank test
  • 18. Analysis of variance (two-way)
  • 19. Spearman rank-order correlation
  • 20. Analysis of covariance (ANCOVA)

48
Chi-square
  • The most commonly used statistical test.
  • Used to test if two or more percentages are
    different.
  • For example, suppose that in a study of 933
    patients with a hip fracture, 10% of the men
    (22/219) develop pneumonia compared
    with 5% of the women (36/714).
  • What is the probability that this could happen by
    chance alone?
  • Univariate, difference, unmatched, nominal, >=2
    groups, n > 20.
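The men versus women comparison above can be worked by hand as a 2 by 2 table. This is a sketch using the uncorrected Pearson chi-square with 1 degree of freedom; the function name `chi_square_2x2` is hypothetical, and a statistical package may additionally report a continuity-corrected value that is slightly smaller.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns (chi_square, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Men: 22 with pneumonia, 197 without. Women: 36 with, 678 without.
chi2, p = chi_square_2x2(22, 197, 36, 678)
print(round(chi2, 2), round(p, 4))
```

For these counts the statistic is a little over 7 and the P value is below 0.01, so a difference this large would be unlikely by chance alone.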

49
Chi-square example
50
Fisher's Exact Test
  • This test can be used for 2 by 2 tables when the
    number of cases is too small to satisfy the
    assumptions of the chi-square.
  • Total number of cases is <20 or
  • The expected number of cases in any cell is <1
    or
  • More than 25% of the cells have expected
    frequencies <5.
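For a small 2 by 2 table the exact P value can be computed directly from the hypergeometric distribution. The sketch below assumes the common two-sided definition (sum the probabilities of all tables that are no more likely than the observed one); the function name `fisher_exact_two_sided` and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2 by 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Add up every table that is as unlikely as, or less likely than,
    # the observed one (small tolerance guards against rounding error)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Hypothetical small study: 3 of 4 in one group vs. 1 of 4 in the other
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
```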

51

52
How to calculate the expected number in a cell
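The slide's worked figure is not in the transcript, so the following is a sketch of the standard rule: the expected count for a cell is its row total times its column total, divided by the grand total. The function name `expected_counts` is hypothetical; the counts are the men and women from the chi-square example.

```python
def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: men, women. Columns: pneumonia, no pneumonia.
observed = [[22, 197], [36, 678]]
expected = expected_counts(observed)

# Count the cells with an expected frequency below 5
small_cells = sum(1 for row in expected for e in row if e < 5)

for row in expected:
    print([round(e, 1) for e in row])
print("cells with expected count < 5:", small_cells)
```

Here no expected count falls below 5, so the chi-square assumptions are met and Fisher's exact test is not required.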
53
Chi-square for a trend test
  • Used to assess a nominal variable and an ordinal
    variable.
  • Does the pneumonia rate increase with the total
    number of comorbidities?
  • Univariate, association, nominal.
  • Analyze, Descriptive Statistics, Crosstabs.

55
Mantel-Haenszel Method
  • Used to assess a factor across a number of 2 by 2
    tables.
  • Is the mortality rate associated with pneumonia
    different between trauma centers and nontrauma
    centers?
  • Analyze, Descriptive Statistics, Crosstabs.

57
Student's t-test
  • Used to compare the average (mean) in one group
    with the average in another group.
  • Is the average age of patients significantly
    different between those who developed pneumonia
    and those who did not?
  • Univariate, Difference, Unmatched, Interval,
    Normal, 2 groups.
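The comparison of two group means can be sketched as follows, assuming the classic pooled-variance (equal variances) form of the test. The function name `student_t` and the ages are hypothetical. Turning t into a P value needs the t distribution, which the Python standard library does not provide; a statistical package reports it directly.

```python
import math
from statistics import mean, variance


def student_t(group1, group2):
    """Return (t, degrees_of_freedom) for two independent groups."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance: weighted average of the two sample variances
    pooled = ((n1 - 1) * variance(group1)
              + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled * (1 / n1 + 1 / n2))
    t = (mean(group1) - mean(group2)) / standard_error
    return t, n1 + n2 - 2


# Hypothetical ages of patients with and without pneumonia
with_pneumonia = [82, 85, 79, 88, 90]
without_pneumonia = [75, 78, 80, 72, 77]
t, df = student_t(with_pneumonia, without_pneumonia)
print(round(t, 2), df)
```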

59
Mann-Whitney U test
  • Same as the Wilcoxon rank-sum test
  • Used in place of the Student's t-test when the
    data are skewed.
  • A nonparametric test that uses the rank of the
    value rather than the actual value.
  • Univariate, Difference, Unmatched, Interval,
    Nonnormal, 2 groups.
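To show what "uses the rank of the value rather than the actual value" means, here is a minimal sketch that ranks the pooled data and computes the U statistic. The function names and the length-of-stay values are hypothetical, and the step from U to a P value (a table or a normal approximation) is left to a statistical package.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average of positions i+1..j+1
        i = j + 1
    return ranks


def mann_whitney_u(group1, group2):
    """Return (U1, U2) for two independent groups."""
    ranks = average_ranks(list(group1) + list(group2))
    n1, n2 = len(group1), len(group2)
    rank_sum1 = sum(ranks[:n1])
    u1 = rank_sum1 - n1 * (n1 + 1) / 2
    return u1, n1 * n2 - u1


# Hypothetical, skewed lengths of stay (days) in two groups
print(mann_whitney_u([3, 4, 4, 6, 30], [5, 7, 9, 12, 45]))
```

Because only the ranks are used, the extreme values of 30 and 45 days carry no more weight than any other "largest" value would.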

60
Paired t-test
  • Used to compare the average for measurements made
    twice within the same person - before vs. after.
  • Used to compare a treatment group and a matched
    control group.
  • For example, Did the systolic blood pressure
    change significantly from the scene of the injury
    to admission?
  • Univariate, Difference, Matched, Interval,
    Normal, 2 groups.
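The paired version works on the within-person differences rather than on the two sets of values. A minimal sketch, with a hypothetical function name `paired_t` and made-up blood pressures:

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, degrees_of_freedom) for paired measurements."""
    if len(first) != len(second):
        raise ValueError("paired data need the same number of measurements")
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    # t = mean difference divided by its standard error
    t = mean(differences) / (stdev(differences) / math.sqrt(n))
    return t, n - 1


# Hypothetical systolic blood pressure at the scene and on admission
scene = [120, 130, 125, 140]
admission = [115, 128, 118, 135]
t_paired, df_paired = paired_t(scene, admission)
print(round(t_paired, 2), df_paired)
```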

61
Wilcoxon signed-rank test
  • Used to compare two skewed continuous variables
    that are paired or matched.
  • Nonparametric equivalent of the paired t-test.
  • For example, Was the Glasgow Coma Scale score
    different between the scene and admission?
  • Univariate, Difference, Matched, Interval,
    Nonnormal, 2 groups.

62
ANOVA
  • One-way: used to compare 3 or more means from
    independent groups.
  • Is the age different between White, Black, and
    Hispanic patients?
  • Two-way: used to compare 2 or more means by 2 or
    more factors.
  • Is the age different between males and females,
    with and without pneumonia?

64
Kruskal-Wallis One-Way ANOVA
  • Used to compare continuous variables that are not
    normally distributed between more than 2 groups.
  • Nonparametric equivalent to the one-way ANOVA.
  • Is the length of stay different by ethnicity?
  • Analyze, nonparametric tests, K independent
    samples.

65
Repeated-Measures ANOVA
  • Used to assess the change in 2 or more continuous
    measurements made on the same person. Can also
    compare groups and adjust for covariates.
  • Do changes in the vital signs within the first 24
    hours of a hip fracture predict which patients
    will develop pneumonia?
  • Analyze, General Linear Model, Repeated Measures.

66
Pearson Correlation
  • Used to assess the linear association between two
    continuous variables.
  • r = 1.0: perfect correlation
  • r = 0.0: no correlation
  • r = -1.0: perfect inverse correlation
  • Univariate, Association, Interval
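The three reference values of r can be reproduced with a few lines of arithmetic. This is a sketch of the standard formula; the function name `pearson_r` and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


age = [60, 70, 80, 90]
print(pearson_r(age, [6, 7, 8, 9]))    # days rise in step with age
print(pearson_r(age, [9, 8, 7, 6]))    # days fall in step with age
print(pearson_r(age, [7, 9, 9, 7]))    # no linear trend
```

The last call returns 0 even though the values follow a clear curved pattern, which is a reminder that r measures only the linear association.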

68
Spearman rank-order correlation
  • Used to assess the relationship between two
    ordinal variables or two skewed continuous
    variables.
  • Nonparametric equivalent of the Pearson
    correlation.
  • Univariate, Association, Ordinal (or skewed).

70
Summary of Inferential Tests
71
Unpaired vs. Paired
  • Unpaired tests
  • Student's t-test
  • Chi-square
  • One-way ANOVA
  • Mann-Whitney U test
  • Kruskal-Wallis H test
  • Paired tests (counterparts, in the same order)
  • Paired t-test
  • McNemar's test
  • Repeated-measures ANOVA
  • Wilcoxon signed-rank
  • Friedman ANOVA

72
Parametric vs. Nonparametric
  • Parametric tests
  • Student's t-test
  • One-way ANOVA
  • Paired t-test
  • Pearson correlation
  • Correlated F ratio (repeated-measures ANOVA)
  • Nonparametric tests (counterparts, in the same order)
  • Mann-Whitney U test
  • Kruskal-Wallis test
  • Wilcoxon signed-rank
  • Spearman's r
  • Friedman ANOVA

73
A Good Rule to Follow
  • Always check your results with a nonparametric
    test.
  • If you test your null hypothesis with a Student's
    t-test, also check it with a Mann-Whitney U test.
  • It will only take an extra 25 seconds.

74
VIII. You Will Need to Understand Regression
Techniques
75
Linear Regression
  • Used to assess how one or more predictor
    variables can be used to predict a continuous
    outcome variable.
  • Do age, number of comorbidities, or admission
    vital signs predict the length of stay in the
    hospital after a hip fracture?
  • Multivariate, Association, Interval/Ordinal
    dependent variable.

77
Logistic Regression
  • Used to assess the predictive value of one or
    more variables on an outcome that is a yes/no
    question.
  • Do age, gender, and comorbidities predict which
    hip fracture patients will develop pneumonia?
  • Multivariate, Difference, Nominal dependent
    variable, not time-dependent, 2 groups.

79
Draw Conclusions
  • We reject the null hypothesis.
  • Patients who are at high risk of developing
    pneumonia during their hospitalization for a hip
    fracture can be identified by
  • total number of pre-existing conditions
  • cirrhosis
  • COPD
  • male gender

80
How this information could be used to predict
pneumonia on admission
  • Z = -4.899 + (number of comorbidities x 0.469)
    + (cirrhosis x 2.275) + (COPD x 0.714) + (age x
    0.021) + (gender [female = 1, male = 0] x 0.715)
  • e = 2.718
  • Example: an 80-year-old male with cirrhosis and
    one other comorbidity (but not COPD).
  • Z = -4.899 + (2 x 0.469) + (1 x 2.275) + (0 x
    0.714) + (80 x 0.021) + (0 x 0.715) = -0.006
  • The odds are e^Z = 0.994, so the probability of
    developing pneumonia is e^Z / (1 + e^Z) = 0.499,
    about a 50% chance.
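The calculation can be checked step by step. This sketch uses the coefficients from the slide and assumes the operators lost from the transcript are "=" and "+"; the sign of the gender term in particular cannot be confirmed from the example, because the patient is male (coded 0) and the term drops out. The names `COEFFICIENTS` and `pneumonia_probability` are hypothetical. Note that e^Z, about 0.994 here, is the odds; the logistic probability is e^Z / (1 + e^Z), which is about 0.50.

```python
import math

INTERCEPT = -4.899
COEFFICIENTS = {
    "comorbidities": 0.469,  # total number of comorbidities
    "cirrhosis": 2.275,      # 1 = yes, 0 = no
    "copd": 0.714,           # 1 = yes, 0 = no
    "age": 0.021,            # years
    "female": 0.715,         # female = 1, male = 0 (sign assumed, see above)
}


def pneumonia_probability(patient):
    """Return (z, odds, probability) from the logistic model."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * patient[name]
                        for name in COEFFICIENTS)
    odds = math.exp(z)
    return z, odds, odds / (1 + odds)


# 80-year-old male with cirrhosis and one other comorbidity, no COPD
patient = {"comorbidities": 2, "cirrhosis": 1, "copd": 0, "age": 80, "female": 0}
z, odds, probability = pneumonia_probability(patient)
print(round(z, 3), round(odds, 3), round(probability, 3))
```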

81
Survival Analysis
  • Kaplan-Meier method
  • Used to plot cumulative survival
  • Log-rank test
  • Used to compare survival curves
  • Cox proportional-hazards
  • Used to adjust for covariates in survival analysis

82
Odds and Ends You Will Need
83
95% Confidence Intervals
  • A 95% confidence interval is an estimate that you
    make from your sample as to where the true
    population value lies.
  • If your study were to be repeated 100 times, you
    would expect the 95% CIs to contain the true value
    for the population in 95 of these 100 studies.
  • The value might be a mean, percentage, or RR.
  • Confidence intervals should be included in
    publications for the major findings of the study.
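As an illustration, the pneumonia rate in the hip fracture study (58 of 933 patients) can be given an approximate 95% confidence interval. This sketch assumes the simple normal-approximation (Wald) method; the function name `proportion_ci` is hypothetical, and programs such as CIA may use other methods that give slightly different limits.

```python
import math
from statistics import NormalDist


def proportion_ci(events, n, level=0.95):
    """Return (proportion, lower, upper) using the normal approximation."""
    p = events / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


rate, lower, upper = proportion_ci(58, 933)
print("%.1f%% (95%% CI %.1f%% to %.1f%%)" % (100 * rate, 100 * lower, 100 * upper))
```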

84
Prevalence vs. Incidence
  • Prevalence
  • How many of you now have the flu?
  • Incidence
  • How many of you have had the flu in the past year?

85
Random
  • Random is not the same as haphazard, unplanned,
    or incidental.
  • Allocating patients to the treatment group on
    even days and to the control group on odd days is
    systematic, not random.
  • Random refers to the idea that each element in a
    set has an equal probability of occurrence.
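One simple way to get a truly random allocation is to shuffle the list of patients and split it in half, so that every patient has the same chance of landing in either group. This is a sketch only: the function name `randomize` and the patient numbers are hypothetical, and real trials often use blocked or stratified schemes instead.

```python
import random


def randomize(patient_ids, seed=None):
    """Randomly split patients into two groups of (nearly) equal size."""
    rng = random.Random(seed)  # a fixed seed makes the list reproducible
    ids = list(patient_ids)
    rng.shuffle(ids)           # every ordering is equally likely
    half = len(ids) // 2
    return {"treatment": sorted(ids[:half]), "control": sorted(ids[half:])}


groups = randomize(range(1, 21), seed=1)
print(groups["treatment"])
print(groups["control"])
```

Recording the seed lets the allocation list be regenerated and audited later, which an even-day or odd-day rule cannot offer without also being predictable.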

86
Improving a RCT
  • See the handout, Table 3-2, pages 18-19.
  • Checklist to Be Used by Authors When Preparing
    or by Readers When Analyzing a Report of a
    Randomized Controlled Trial.

87
IX. You Will Need to Continue Learning About
Statistics
88
Recommended books on statistics
  • Kuzma: Statistics in the Health Sciences
  • Norusis: Data Analysis with SPSS
  • Altman: Statistics with Confidence
  • Friedman: Fundamentals of Clinical Trials
  • Pagano: Principles of Biostatistics
  • Encyclopedia of Biostatistics
  • SPSS manuals

89
A response to the comment "You're comparing
apples and oranges"
  • No, this is comparing apples and oranges!