Shuyu Chu

Description:

Title: VT PowerPoint Template Author: David Stanley Last modified by: Sissi Created Date: 10/23/2006 4:36:06 PM Document presentation format: (4:3) – PowerPoint PPT presentation

1
LISA Short Course Series: R Statistical Analysis
Laboratory for Interdisciplinary Statistical Analysis
2
Laboratory for Interdisciplinary Statistical Analysis
LISA helps VT researchers benefit from the use of statistics.
  • Collaboration: Visit our website to request
    personalized statistical advice and assistance
    with experimental design, data analysis,
    interpreting results, grant proposals, and software
    (R, SAS, JMP, SPSS...). LISA statistical
    collaborators aim to explain concepts in ways
    useful for your research.
  • Great advice right now: Meet with LISA before
    collecting your data.
  • Short Courses: Designed to help graduate students
    apply statistics in their research.
  • Walk-In Consulting: M-F 1-3 PM, GLC Video
    Conference Room; M 3-5 PM, 312 Sandy; T 11-1 PM,
    Port; W 11-1 PM, Old Security Building. For
    questions requiring <30 mins.
All services are FREE for VT researchers. We
assist with research, not class projects or
homework.
3
Outline
  • 1. Review of plots
  • 2. T-test
  • 2.1 One sample t-test
  • 2.2 Two sample t-test
  • 2.3 Sample size calculation
  • 2.4 Paired t-test
  • 2.5 Checking assumptions and nonparametric
    test
  • 3. ANOVA
  • 3.1 One-way ANOVA
  • 3.2 Two-way ANOVA
  • 4. Logistic Regression

4
Review of plots
  • Using visual tools is a critical first step when
    analyzing data and it can often be sufficient in
    its own right!
  • By observing visual summaries of the data, we
    can:
  • Determine the general pattern of data
  • Identify outliers
  • Check whether the data follow some theoretical
    distribution
  • Make quick comparisons between groups of data

5
Review of plots
  • plot(x, y) (or, equivalently, plot(y ~ x)): scatter
    plot of variables x and y
  • pairs(cbind(x, y, z)): scatter plot matrix of
    variables x, y and z
  • hist(y): histogram
  • boxplot(y): boxplot
  • lm(y ~ x): fit a straight line
    between variables x and y
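
The plotting calls above can be tried on any numeric vectors. The following is a minimal sketch using simulated data; the variables x, y and z and their distributions are assumptions made only for illustration and are not the course data set.

```r
# Simulated data for illustration only (not the course data set).
set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)
y <- 3 + 2 * x + rnorm(50)
z <- rnorm(50)

# Draw to a null device so the sketch also runs non-interactively;
# drop pdf(NULL) and dev.off() in an interactive session to see the plots.
pdf(NULL)
plot(x, y)               # scatter plot; plot(y ~ x) draws the same points
pairs(cbind(x, y, z))    # scatter plot matrix
hist(y)                  # histogram
boxplot(y)               # boxplot
fit <- lm(y ~ x)         # least-squares straight line
plot(x, y)
abline(fit)              # overlay the fitted line on the scatter plot
dev.off()

coef(fit)                # intercept and slope of the fitted line
```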

6
Review of plots
  • Low Birth Weight Data Description (lowbwt.csv)
  • (189 observations, 11 variables)
  • ID: Identification Code
  • LOW: Low Birth Weight (0 = Birth Weight >=
    2500 g, 1 = Birth Weight < 2500 g)
  • AGE: mother's age in years
  • LWT: mother's weight in lbs
  • RACE: mother's race (1 = white, 2 = black, 3 =
    other)
  • SMOKE: smoking status during pregnancy
  • PTL: no. of previous premature labors
  • HT: history of hypertension
  • UI: presence of uterine irritability
  • FTV: no. of physician visits during first
    trimester
  • BWT: Birth Weight in Grams

7
T-Test
  • 2.1 One sample t-test
  • Research Question:
  • Is the mean of a population different from the
    value stated in the null hypothesis (a nominal or
    hypothesized value)?
  • Example:
  • Testing whether the average birth weight of
    babies is different from 2500 g.
  • Hypotheses:
  • Null hypothesis: the average birth weight
    is 2500 g
  • Alternative hypothesis: the average birth
    weight is not equal to (or
    greater/less than) 2500 g
  • In R: t.test(x, y = NULL, alternative =
    c("two.sided", "less", "greater"), mu = 0, paired =
    FALSE, var.equal = FALSE, conf.level = 0.95)
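
As a minimal sketch of the one-sample test against 2500 g: the BWT vector below is simulated (its mean and standard deviation are assumptions), standing in for the BWT column of lowbwt.csv, which would normally be loaded with read.csv().

```r
# Simulated birth weights in grams; a hypothetical stand-in for lowbwt.csv's BWT.
set.seed(2)
BWT <- rnorm(189, mean = 2950, sd = 730)

# Two-sided test of H0: mean birth weight = 2500 g
res <- t.test(BWT, mu = 2500)

# One-sided alternative: mean birth weight is greater than 2500 g
res_greater <- t.test(BWT, mu = 2500, alternative = "greater")

res$statistic    # t statistic
res$parameter    # degrees of freedom (n - 1)
res$p.value      # two-sided p-value
res$conf.int     # 95% confidence interval for the mean
```

Setting alternative to "less" or "greater" gives the one-sided versions of the same test.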

8
T-Test
  • 2.2 Two sample t-test
  • Research Question: Are the means of two
    populations different?
  • Example:
  • Is the birth weight of babies whose mothers
    smoke different from that of babies whose
    mothers do not smoke?
  • Hypotheses:
  • Null hypothesis: the average birth weight of
    babies whose mothers smoke equals the average
    birth weight of babies whose mothers do not smoke
  • Alternative hypothesis: the average birth
    weight for smoking mothers is not equal to (or
    greater/less than) that for non-smoking mothers
  • In R: t.test(BWT ~ SMOKE)
  • t.test(BWT ~ SMOKE, var.equal = TRUE)
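
The two calls differ only in the variance assumption. A minimal sketch follows, using a simulated data frame whose group sizes, means and standard deviation are assumptions chosen for illustration; only the column names BWT and SMOKE come from the data description.

```r
# Simulated stand-in for lowbwt.csv (values are hypothetical).
set.seed(3)
lowbwt <- data.frame(SMOKE = rep(c(0, 1), times = c(100, 89)))
lowbwt$BWT <- rnorm(189,
                    mean = ifelse(lowbwt$SMOKE == 1, 2770, 3050),
                    sd = 700)

# Default: Welch test, which does not assume equal variances
welch <- t.test(BWT ~ SMOKE, data = lowbwt)

# Classical pooled-variance test, assuming equal variances in both groups
pooled <- t.test(BWT ~ SMOKE, data = lowbwt, var.equal = TRUE)

welch$estimate     # group means for SMOKE = 0 and SMOKE = 1
welch$parameter    # Welch degrees of freedom (generally non-integer)
pooled$parameter   # pooled degrees of freedom: n1 + n2 - 2
```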

9
T-Test
  • 2.3 Sample size calculation
  • Research Question:
  • How many observations are needed for a given
    power, or what is the power of the test given a
    sample size?
  • Power: the probability of rejecting the null
    hypothesis when it is false
  • In R: power.t.test(n = NULL, delta = NULL, sd =
    1, sig.level = 0.05, power = NULL, type =
    c("two.sample", "one.sample", "paired"),
    alternative = c("two.sided", "one.sided"), strict =
    FALSE)
  • Calculate a sample size given a power:
    power.t.test(delta = 2, sd = 2, power = .8)
  • Calculate a power given a sample size:
    power.t.test(n = 20, delta = 2, sd = 2)
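
The two calculations above can be run as written; power.t.test() solves for whichever of n and power is left unspecified. A short sketch, where the object names n_needed and achieved are introduced only for illustration:

```r
# Solve for the sample size per group (two-sample, two-sided by default)
n_needed <- power.t.test(delta = 2, sd = 2, power = 0.8)

# Solve for the power given 20 observations per group
achieved <- power.t.test(n = 20, delta = 2, sd = 2)

ceiling(n_needed$n)   # round up to a whole number of observations per group
achieved$power        # probability of detecting a true difference of 2
```

Note that n is the number of observations in each group, not the total.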

10
T-Test
  • 2.4 Paired t-test
  • Research Question:
  • Given the paired structure of the data, are the
    means of the two sets of observations
    significantly different?
  • Example: In a warehouse, the employees have asked
    management to play music to relieve the boredom
    of the job. The manager wants to know whether
    efficiency is affected by the change. The table
    below gives efficiency ratings of 15 employees
    recorded before and after the music system was
    installed.
  • (Link to the dataset:
  • http://www-ist.massey.ac.nz/dstirlin/CAST/CAST/HtestPaired/testPaired_c1.html)
  • In R: t.test(efficiency_after, efficiency_before,
    paired = TRUE)
  • or, t.test(diff), where diff =
    efficiency_after - efficiency_before
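
The two forms of the paired test give the same result, which the following sketch checks. The ratings are simulated for 15 hypothetical employees and are not the values from the linked table; the vector of differences is named d here to avoid masking R's built-in diff() function.

```r
# Simulated efficiency ratings (hypothetical values, not the linked dataset).
set.seed(4)
efficiency_before <- round(rnorm(15, mean = 30, sd = 5))
efficiency_after  <- efficiency_before + round(rnorm(15, mean = 2, sd = 3))

# Paired t-test on the two sets of ratings
paired_res <- t.test(efficiency_after, efficiency_before, paired = TRUE)

# Equivalent one-sample t-test on the within-employee differences
d <- efficiency_after - efficiency_before
diff_res <- t.test(d)

# Both approaches return the same test statistic and p-value
all.equal(unname(paired_res$statistic), unname(diff_res$statistic))
all.equal(paired_res$p.value, diff_res$p.value)
```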

11
T-Test
  • 2.5 Checking assumptions and nonparametric test
  • The t-test assumes that the data follow a normal
    distribution. This assumption can be checked by
    visualization and by a statistical test.
  • Visualization:
  • Histogram: a normal distribution is
    symmetric and bell-shaped, with rapidly dying tails.
  • QQ-plot: plots the theoretical quantiles of the
    normal distribution against the quantiles of the
    data; points close to a straight line indicate
    that the assumption holds.
  • Statistical Test: Shapiro-Wilk Normality Test
  • In R: shapiro.test(data)
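
The checks above can be sketched as follows. Both samples are simulated (one normal, one deliberately skewed) and their parameters are assumptions for illustration; qqnorm() and qqline() are the base R functions for a normal QQ-plot.

```r
# Simulated samples for illustration only.
set.seed(5)
normal_sample <- rnorm(100, mean = 3000, sd = 700)
skewed_sample <- rexp(100, rate = 1 / 3000)   # right-skewed, clearly non-normal

# Null device so the sketch runs non-interactively; omit in an interactive session.
pdf(NULL)
hist(normal_sample)       # look for a symmetric, bell-shaped histogram
qqnorm(normal_sample)     # sample quantiles against normal quantiles
qqline(normal_sample)     # reference line; points should fall close to it
dev.off()

# Shapiro-Wilk test: a small p-value is evidence against normality
sw_normal <- shapiro.test(normal_sample)
sw_skewed <- shapiro.test(skewed_sample)
sw_normal$p.value
sw_skewed$p.value
```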

12
T-Test
  • 2.5 Checking assumptions and nonparametric test
  • When the normal assumption does not hold, we use
    the alternative nonparametric test.
  • Wilcoxon Signed Rank Test
  • Null hypothesis: mean difference between the
    pairs is zero
  • Alternative hypothesis: mean difference is not
    zero
  • In R: wilcox.test(x, y = NULL, alternative =
    c("two.sided", "less", "greater"), mu = 0, paired =
    FALSE, exact = NULL, correct = TRUE, conf.int =
    FALSE, conf.level = 0.95, ...)
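
A minimal sketch of the signed rank test on paired data. The before and after vectors are simulated, hypothetical measurements; continuous values are used so that there are no ties and the exact p-value can be computed.

```r
# Simulated paired measurements (hypothetical values).
set.seed(6)
before <- rnorm(15, mean = 30, sd = 5)
after  <- before + rnorm(15, mean = 2, sd = 3)

# Wilcoxon signed rank test on the pairs
w_paired <- wilcox.test(after, before, paired = TRUE)

# Equivalent one-sample form applied to the differences
w_diff <- wilcox.test(after - before, mu = 0)

w_paired$statistic   # V: sum of the ranks of the positive differences
w_paired$p.value
```

Leaving paired = FALSE with two samples instead runs the rank sum (Mann-Whitney) test, which is the nonparametric counterpart of the two sample t-test.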

14
ANOVA- Analysis of Variance
  • T-test: Compare the mean of a population to a
    nominal value,
  • or test whether the means of two populations
    are equal
  • What if you want to compare the means of more
    than two populations?
  • We use ANOVA!
  • One-Way ANOVA: Compare the means of populations
    where the variation is attributed to the
    different levels of one factor.
  • Two-Way ANOVA: Compare the means of populations
    where the variation is attributed to the
    different levels of two factors.

15
ANOVA- Analysis of Variance
  • 3.1 One-way ANOVA
  • Example: Compare the BWT (birth weight in grams)
    for 3 races
  • bwt data: BWT: birth weight in grams
  • RACE: mother's race (1 = White,
    2 = Black, 3 = Other)
  • SMOKE: mother's smoking status
    during pregnancy (1 = Yes, 0 = No)
  • Hypotheses:
  • Null hypothesis: the three groups have equal
    average birth weight
  • Alternative hypothesis: at least two groups do
    not have equal average birth weight
  • In R: a.1 = aov(BWT ~ factor(RACE)) and summary(a.1)
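
A minimal sketch of the one-way ANOVA call. The data frame is simulated: the group sizes, group means and standard deviation are assumptions for illustration, and only the column names BWT and RACE follow the data description.

```r
# Simulated stand-in for lowbwt.csv (values are hypothetical).
set.seed(7)
lowbwt <- data.frame(RACE = rep(1:3, times = c(90, 30, 69)))
lowbwt$BWT <- rnorm(189, mean = c(3100, 2720, 2800)[lowbwt$RACE], sd = 700)

# factor() makes R treat the codes 1, 2, 3 as group labels, not as numbers
a.1 <- aov(BWT ~ factor(RACE), data = lowbwt)
summary(a.1)

# The ANOVA table as a data frame: Df, Sum Sq, Mean Sq, F value, Pr(>F)
tab <- summary(a.1)[[1]]
tab$Df
```

Without factor(), RACE would enter as a single numeric slope with 1 degree of freedom rather than as three groups with 2.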

16
ANOVA- Analysis of Variance
  • 3.2 Two-way ANOVA
  • Example: Compare the bwt for 3 races and 2
    smoking statuses
  • Three effects to be considered: RACE, SMOKE and
    their interaction
  • In R: a.2 = aov(BWT ~ factor(SMOKE) * factor(RACE))
    and summary(a.2)
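
A minimal sketch of the two-way model with interaction. The data are simulated on a balanced design of 30 hypothetical babies per SMOKE by RACE cell; the cell size and the effect of smoking are assumptions for illustration.

```r
# Simulated balanced design (hypothetical values): 2 x 3 cells, 30 per cell.
set.seed(8)
lowbwt <- expand.grid(rep = 1:30, SMOKE = 0:1, RACE = 1:3)
lowbwt$BWT <- rnorm(nrow(lowbwt), mean = 3000 - 250 * lowbwt$SMOKE, sd = 700)

# A * B expands to A + B + A:B, i.e. both main effects plus the interaction
a.2 <- aov(BWT ~ factor(SMOKE) * factor(RACE), data = lowbwt)
summary(a.2)

# Degrees of freedom: SMOKE, RACE, SMOKE:RACE, residuals
tab2 <- summary(a.2)[[1]]
tab2$Df
```

Replacing * with + fits the main effects only, without the interaction term.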

17
LOGISTIC Regression
18
LOGISTIC Regression

19
LOGISTIC Regression
  • Example: Low birth weight data
  • We are interested in understanding the variables
    that predict the likelihood of a mother giving
    birth to a baby with low birth weight (defined as
    a baby weighing less than 2500 grams).
  • The response variable: low = 0, 1 (indicator of
    birth weight less than 2.5 kg)
  • The predictor variables:
  • age: mother's age in years
  • lwt: mother's weight in lbs
  • race: mother's race (1 = white, 2 = black, 3 =
    other)
  • smoke: smoking status during pregnancy
  • ptl: no. of previous premature labors
  • ht: history of hypertension
  • ui: presence of uterine irritability
  • ftv: no. of physician visits during first
    trimester
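
A model of this kind can be fitted with glm() and family = binomial. The sketch below is an illustration under stated assumptions: the data are simulated, only three of the predictors (age, lwt, smoke) are included, and the coefficients used to generate the response are hypothetical rather than estimates from the real data.

```r
# Simulated stand-in for the low birth weight data (values are hypothetical).
set.seed(9)
n <- 189
lowbwt <- data.frame(
  age   = round(runif(n, 14, 45)),
  lwt   = round(rnorm(n, mean = 130, sd = 30)),
  smoke = rbinom(n, 1, 0.4)
)

# Hypothetical generating model, used only to produce a 0/1 response
eta <- 0.5 - 0.015 * lowbwt$lwt + 0.7 * lowbwt$smoke
lowbwt$low <- rbinom(n, 1, plogis(eta))

# Logistic regression: models the log-odds of low = 1
fit <- glm(low ~ age + lwt + smoke, family = binomial, data = lowbwt)
summary(fit)

exp(coef(fit))                          # coefficients as odds ratios
head(predict(fit, type = "response"))   # fitted probabilities of low = 1
```

A categorical predictor such as race would be entered as factor(race), as in the ANOVA examples.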

20
LOGISTIC Regression

21
Thank you!
  • Please don't forget to fill in the sign-in sheet
    and to complete the survey that will be sent to
    you by email.