Title: Linear Regression
1 Linear Regression
- Hein Stigum
- Presentation, data and programs at
- http://folk.uio.no/heins/courses
2 Concepts
3 Outcome and regression types
- Numerical data, discrete (e.g. number of partners): Poisson regression
- Numerical data, continuous (e.g. weight): Linear regression
- Categorical data, nominal (e.g. disease/no disease): Logistic regression
- Categorical data, ordinal (e.g. small/medium/large): Ordinal regression
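As a rough guide, each outcome type above corresponds to a standard Stata estimation command. The lines below are only a sketch: apart from weight and gest, the variable names (partners, disease, sizecat, age) are hypothetical placeholders and are not part of the course data.

```stata
* Sketch only: partners, disease, sizecat and age are hypothetical variable names
* Discrete count outcome: Poisson regression
poisson partners age
* Continuous outcome: linear regression
regress weight gest
* Binary (nominal) outcome: logistic regression, reported as odds ratios
logistic disease age
* Ordered categories: ordinal logistic regression
ologit sizecat age
```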
4 Regression idea
5 Measures and Assumptions
- Adjusted effects
- b1 is the increase in weight per day of gestational age
- b1 is adjusted for the other covariate (the one with coefficient b2)
- Assumptions
- Independent errors
- Linear effects
- Constant error variance
- Robustness
- Influence
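The model behind these statements can be written out explicitly. This is a sketch assuming two covariates, with x1 the gestational age in days and x2 a second covariate; the symbols x1, x2, e and sigma are notation introduced here, not taken from the slides.

```latex
\begin{aligned}
\text{weight}_i &= b_0 + b_1 x_{1i} + b_2 x_{2i} + e_i, \\
E(e_i) &= 0, \qquad \operatorname{Var}(e_i) = \sigma^2 \ \text{(constant)}, \qquad e_i \ \text{independent across } i .
\end{aligned}
```

Holding x2 fixed, one extra day of gestational age changes the expected weight by b1, which is the sense in which b1 is an adjusted effect.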
6 Workflow
- DAG
- Plots: distribution and scatter
- Bivariate analysis
- Regression
- Model estimation
- Test of assumptions
- Independent errors (discuss)
- Linear effects (plot)
- Constant error variance (plot)
- Robustness
- Influence
7 Analysis
- Continuous outcome: linear regression, birth weight
8 DAGs
Associations: bivariate (unadjusted)
Causal effects: multivariable (adjusted)
Draw your assumptions before your conclusions
9 Plot outcome by exposure
Effects on linear regression:
- OK
- Be clear on the research question
- Overall birth weight: linear regression
- Low birth weight: logistic regression
- Linear and logistic can give opposite results
- May lead to non-constant error variance
- May have highly influential outliers
10 Plot outcome by exposure, cont.
Linear effects?
Yes
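The plots on these two slides could be produced along the following lines. This is a sketch that assumes the course data are in memory with the variables weight and gest; the lowess smoother is one possible choice for judging linearity, not necessarily the one used in the slides.

```stata
* Assumes the course data are in memory with variables weight and gest
* Distribution of the outcome, with a normal curve overlaid
histogram weight, normal
* Distribution of the exposure
histogram gest
* Outcome by exposure, with a smoother to judge whether the effect looks linear
twoway (scatter weight gest) (lowess weight gest)
```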
11 Bivariate analysis
Outcome: birth weight
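One way to obtain the unadjusted associations is to regress the outcome on one covariate at a time. The sketch below assumes the course data are in memory and that the parity dummies Parity1 and Parity2_7 (defined under Categorical covariates) have already been created.

```stata
* Assumes weight, gest, sex, Parity1 and Parity2_7 are in memory
* Crude effect of gestational age
regress weight gest
* Crude effect of sex
regress weight sex
* Crude effect of parity: both dummies are entered together
regress weight Parity1 Parity2_7
```

These crude estimates can then be compared with the adjusted estimates from the multivariable model.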
12 Regression
- Continuous outcome: linear regression, birth weight
13 Categorical covariates
- 2 categories
- OK, but know the coding
- 3 categories
- Use dummies
- Dummies are 0/1 variables used to create contrasts
- Want 3 categories for parity: 0, 1 and 2-7 children
- Choose 0 as reference
- Make dummies for the two other categories
generate Parity1 = (parity==1) if parity<.
generate Parity2_7 = (parity>=2) if parity<.
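As an alternative to hand-made dummies, Stata's factor-variable notation can build the contrasts at estimation time. The sketch below assumes parity (0 to 7 children) is in memory; parity3 is a hypothetical name for the new three-category variable.

```stata
* Assumes parity (0-7 children) is in memory; parity3 is a hypothetical new variable
recode parity (0=0) (1=1) (2/7=2), generate(parity3)
* Check the coding, including that missing parity stays missing
tabulate parity parity3, missing
* ib0. makes category 0 the reference and creates the two contrasts
regress weight gest sex ib0.parity3
```

The coefficients for categories 1 and 2 of parity3 should match those of Parity1 and Parity2_7 in the dummy-variable model.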
14 Model estimation
Syntax: regress weight gest sex Parity1 Parity2_7
15 Create meaningful constant
- Expected birth weight at
- gest=0, sex=0, parity=0 (the constant as estimated)
- gest=280, sex=1, parity=0 (a meaningful reference)
Alternative: center variables
gen gest280 = gest-280
gest280 has a meaningful zero at 280 days
gen sex0 = sex-1
sex0 has a meaningful zero at boys
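The two routes to a meaningful constant can be compared directly. This sketch assumes the course data are in memory with sex coded 1 for boys and with the parity dummies already created; if gest280 and sex0 exist already, skip the two generate lines.

```stata
* Assumes weight, gest, sex (1 = boy), Parity1 and Parity2_7 are in memory
regress weight gest sex Parity1 Parity2_7
* Expected weight for a boy at 280 days with parity 0, from the uncentered model
lincom _cons + 280*gest + 1*sex

* Centered model: the constant itself is now that expected weight
generate gest280 = gest - 280
generate sex0 = sex - 1
regress weight gest280 sex0 Parity1 Parity2_7
```

The slopes are unchanged by centering; only the constant and its standard error change.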
16 Model results
17 Test of assumptions
- Discuss
- Independent residuals?
- Plot residuals versus predicted y
- Linear effects?
- Constant variance?
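The residual plots can be drawn with Stata's postestimation commands after regress. A minimal sketch, assuming the course data with the parity dummies are in memory:

```stata
* Assumes the course data with the parity dummies are in memory
regress weight gest sex Parity1 Parity2_7
* Residuals versus predicted y: curvature suggests non-linearity, a fan shape non-constant variance
rvfplot, yline(0)
* Residuals versus gestational age: checks linearity for that covariate
rvpplot gest, yline(0)
```

Independence of the residuals is not visible in these plots; it has to be judged from how the data were collected.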
18 Violations of assumptions
- Dependent residuals
- Use linear mixed models
- Non-linear effects
- Add a squared term
- Or use piecewise linear
- Non-constant variance
- Use robust variance estimation
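Each remedy above has a direct counterpart in Stata. The sketch below assumes the course data with the parity dummies are in memory; the cluster identifier mother and the knot at 259 days are hypothetical choices made only for illustration.

```stata
* Assumes the course data with the parity dummies are in memory
* mother (cluster id) and the knot at 259 days are hypothetical

* Dependent residuals: linear mixed model with a random intercept per mother
mixed weight gest sex Parity1 Parity2_7 || mother:

* Non-linear effect: add a squared term for gestational age
regress weight c.gest##c.gest sex Parity1 Parity2_7

* Non-linear effect: piecewise linear with one knot
mkspline gest_lo 259 gest_hi = gest
regress weight gest_lo gest_hi sex Parity1 Parity2_7

* Non-constant variance: robust variance estimation
regress weight gest sex Parity1 Parity2_7, vce(robust)
```

With the default mkspline coding, the coefficients of gest_lo and gest_hi are the slopes below and above the knot.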
19 Influence
20 Measures of influence
Remove obs 1, see change
Remove obs 2, see change
- Measure change in
- Predicted outcome
- Deviance
- Coefficients (beta)
- Delta beta
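Stata computes leave-one-out influence measures after regress without refitting the model by hand. The sketch below assumes the course data with the parity dummies are in memory; db_gest, cook and obsnr are hypothetical variable names. Note that Stata's DFBETA is scaled by the standard error of the coefficient, so it is a standardized change and not the raw change in beta.

```stata
* Assumes the course data with the parity dummies are in memory
* db_gest, cook and obsnr are hypothetical variable names
regress weight gest sex Parity1 Parity2_7
* Scaled change in the gest coefficient when each observation is removed
predict db_gest, dfbeta(gest)
* Cook's distance: influence of each observation on the fitted values
predict cook, cooksd
* Row number, used to label points in the plot
generate obsnr = _n
scatter db_gest obsnr, mlabel(obsnr)
```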
21 Delta beta for gestational age
If observation number 539 is removed, beta will change from 6 to 16
22 Removing outlier
Full data
Outlier removed
One outlier affected two estimates
Final model
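The comparison of the full and the reduced model can be set up as below. This is a sketch under the assumption that the observation number 539 mentioned for the delta beta refers to the row position in the data; if it is an identifier variable, the if condition has to be adapted. The names full and trimmed are hypothetical.

```stata
* Assumes observation number 539 is the row position in the data
regress weight gest sex Parity1 Parity2_7
estimates store full
* Refit without the outlier
regress weight gest sex Parity1 Parity2_7 if _n != 539
estimates store trimmed
* Coefficients and standard errors side by side
estimates table full trimmed, b(%9.1f) se(%9.1f)
```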
23 Summing up
- DAGs
- Guide analysis
- Plots
- Unequal variance, non-linearity, outliers
- Bivariate analysis
- Linear regression
- Fit model
- Check assumptions
- Check robustness
- Make meaningful constant