Studentforelesning 1 1999 - PowerPoint PPT Presentation

About This Presentation
Title:

Studentforelesning 1 1999

Description:

Mortality in Tanzania and Norway. Research and numbers. Numbers often appear in medical research. ... Neurophysiological study on rats ... – PowerPoint PPT presentation

Slides: 40
Provided by: kontorfork

Transcript and Presenter's Notes

1
Introduction and descriptive statistics
30th August 2006, Tron Anders Moger
2
New England Journal of Medicine, Editorial, Jan.
6, 2000, p. 42-49
  • The eleven most important developments in
    medicine in the past millennium
  • Elucidation of human anatomy and physiology
  • Discovery of cells and their substructures
  • Elucidation of the chemistry of life
  • Application of statistics to medicine
  • Development of anesthesia
  • Discovery of the relation of microbes to disease
  • Elucidation of inheritance and genetics
  • Knowledge of the immune system
  • Development of body imaging
  • Discovery of antimicrobial agents
  • Development of molecular pharmacotherapy

3
Introduction
  • A lot of knowledge appears through numbers and
    quantitative data.
  • Problems in interpreting statistical results are
    often underestimated.
  • It is important to learn numerical literacy: the
    ability to understand numbers and quantitative
    relationships.

4
Number of births in former East Germany
5
Mortality in Tanzania and Norway
6
Research and numbers
  • Numbers often appear in medical research.
  • The numbers are often uncertain; they have
    variability
  • They must be organized in order to interpret them
  • We wish to generalize the results to the general
    population

7
Statistical data
  • Arise from:
  • Numerical measurements with an instrument on a
    continuous scale (continuous data). Examples:
  • Fever: 39.6 (unproblematic)
  • IQ: 116 (problematic)
  • Categorization (categorical data). Examples:
  • Man / woman (unproblematic)
  • Depressed / not depressed (problematic)

8
Variability in the data
  • Reliability: Precision of data? How much will
    they differ if the measurements are repeated?
  • Validity: Do we capture what we are really
    interested in? Is the measurement relevant?

9
Reliability of lung function measurements: 6
repeated measurements on 12 students.
10
Reliability of questionnaire/interview
  • Alcohol use (men 31-50 years)
  • Mean number of times alcohol users say that they
    have felt intoxicated:
  • 1993 (questionnaire): 14.1 times per year
  • 1994 (interview): 7.3 times per year
  • In 1994 they used the word "drunk".

11
Reliability of clinical study
  • Sackett et al., Clinical Epidemiology (Little,
    Brown and Company, 1985). Pictures of the eye of
    100 patients are studied by two clinicians to see
    if there is evidence of retinopathy

                            Second clinician
                            No     Yes
    First clinician   No    46     10
                      Yes   12     32

  • Observed agreement: (46 + 32)/100 = 78%
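
The observed agreement on this slide can be checked with a few lines of Python. This is only a minimal sketch: the counts are the four cells from the slide, while the variable names are illustrative assumptions.

```python
# 2x2 agreement table from the slide: rows = first clinician,
# columns = second clinician, categories in the order (No, Yes).
table = [
    [46, 10],  # first clinician: No
    [12, 32],  # first clinician: Yes
]

total = sum(sum(row) for row in table)
# The clinicians agree in the diagonal cells (No/No and Yes/Yes).
agreeing = sum(table[i][i] for i in range(len(table)))
observed_agreement = agreeing / total

print(f"Observed agreement: {agreeing}/{total} = {observed_agreement:.0%}")
```

Observed agreement counts only the diagonal cells; it does not adjust for agreement that could arise by chance.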

12
Sources of variation in data
  • Laboratory variation
  • Observer variation
  • Instrument variation
  • Measurement variation
  • Biological variation between individuals
  • Day to day variation within the same
    individual/hospital

13
Generalization
  • Sample: The units, experiments, individuals etc.
    that are in the study, e.g.
  • 15 patients with migraine
  • Neurophysiological study on rats
  • Population: The collection of units etc. one
    wishes the results to apply to
  • All patients with migraine
  • All repetitions of the neurophysiological
    experiment

14
Pairs of terms
  • Sample                               Population
  • Histogram                            Probability distribution
  • Mean                                 Expectation
  • Proportion                           Risk
  • Measurements of cholesterol level    Cholesterol level in the population
  • Weather                              Climate

15
Types of data
  • Continuous data. Data measured on a continuous
    scale, e.g. height, weight, age. Can be truly
    continuous (with decimals) or discrete (integers)
  • Categorical data. Data in categories, e.g.
    gender, education level, grouped age, hospital
    department. Can be nominal or ordinal.

16
Data in SPSS (and other statistical software)
  • IMPORTANT: One line in the data file always
    corresponds to one observation!
  • Common to have an id variable for each
    observation
  • If a measurement is missing, leave the cell empty
  • To create a new variable in SPSS, choose
    Data -> Insert Variable in the Data View window, or
    write the variable name in Name in the
    Variable View window

17
Data coding
  • For continuous data, enter the value of the
    variable
  • For categorical data, define a suitable coding,
    e.g. 0=male and 1=female, or 0=grammar school,
    1=high school and 2=college/university degree
  • In Variable View, the coding can be defined in
    Values
  • In Label you can write further information about
    the variable
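
Outside SPSS, the same idea (one row per observation, numeric codes with labels, missing values left empty) might be sketched in Python as below. The rows are made up for illustration, and the helper `label` and the dictionary names are hypothetical, not part of SPSS.

```python
# Codings mirroring the examples on the slide.
GENDER_LABELS = {0: "male", 1: "female"}
EDUCATION_LABELS = {
    0: "grammar school",
    1: "high school",
    2: "college/university degree",
}

# One row per observation, each with an id variable.
# None plays the role of an empty cell (missing measurement).
rows = [
    {"id": 1, "gender": 0, "education": 2},
    {"id": 2, "gender": 1, "education": None},
    {"id": 3, "gender": 1, "education": 0},
]

def label(code, labels):
    """Translate a numeric code to its label; keep missing values missing."""
    return None if code is None else labels[code]

decoded = [
    {
        "id": r["id"],
        "gender": label(r["gender"], GENDER_LABELS),
        "education": label(r["education"], EDUCATION_LABELS),
    }
    for r in rows
]

for r in decoded:
    print(r)
```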

18
Descriptive statistics
  • Tables
  • Graphs, plots
  • Measures of central tendency
  • Measures of variability

19
Types of graphs
  • Histogram
  • Box-plot
  • Scatter plot
  • Line plot
  • Bar plot

20
The age of 100 medical students
21
How can you get an overview of these data in
SPSS? Explore!
  • Choose Analyze - Descriptive Statistics -
    Explore. Select the relevant variables by
    clicking them, and transferring them to Dependent
    List. Choose Plots, remove the check on Stem
    and leaf and check Histogram instead. Click
    Continue and OK.

22
Histogram: The distribution of age among the
students (n = 100)
23
Box-plot: The distribution of age among the
students
24
Measures of central tendency
  • Mean
  • The students: 22.2 years
  • Median
  • The middle observation when the observations
    are arranged in increasing order
  • The students: 22.0 years
  • The mean is influenced by extreme observations.
    The median is robust
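
A small Python sketch can illustrate why the median is called robust. The ages below are made up for illustration and are not the students' data; only the standard `statistics` module is used.

```python
from statistics import mean, median

# Made-up ages for illustration; these are NOT the students' data.
ages = [20, 21, 21, 22, 23]
ages_with_outlier = ages + [45]  # add one extreme observation

print("without outlier:", mean(ages), median(ages))
print("with outlier:   ", round(mean(ages_with_outlier), 1), median(ages_with_outlier))
```

In this sketch the single extreme value pulls the mean up by several years, while the median barely moves.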

25
Measures of variability
  • Standard deviation
  • The students: 3.06 years
  • Coefficient of variation: s / x̄ × 100%
    (standard deviation divided by the mean)
  • The students: 13.8%
  • Quartiles: Arrange the data in increasing order.
    The 25% quartile is at the observation where 25%
    of the observations have lower values, and 75% of
    the observations have higher values. (In SPSS:
    check Percentiles in the Statistics menu in
    Explore)
  • The students: 25% quartile 20.0 years, 75%
    quartile 23.0 years
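
As a rough check, the coefficient of variation is the standard deviation divided by the mean, times 100, so the students' figure follows from the two summary numbers on the slides (3.06 and 22.2). The second half of the sketch uses made-up ages, not the students' data. Note that `statistics.quantiles` uses one particular interpolation rule, so its quartiles may differ slightly from those SPSS reports.

```python
from statistics import mean, stdev, quantiles

# Coefficient of variation from the summary figures on the slides.
s, x_bar = 3.06, 22.2
cv_percent = s / x_bar * 100
print(f"CV = {cv_percent:.1f}%")

# Made-up ages for illustration; these are NOT the students' data.
ages = [19, 20, 20, 21, 22, 22, 23, 24, 26, 31]
sd = stdev(ages)                   # sample standard deviation (divides by n - 1)
q1, q2, q3 = quantiles(ages, n=4)  # 25%, 50% and 75% cut points
print(f"mean = {mean(ages):.1f}, sd = {sd:.2f}")
print(f"quartiles: {q1}, {q2}, {q3}")
```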

26
How to get separate plots for each category of a
categorical variable, e.g. gender
  • Click Analyze - Descriptive Statistics -
    Explore. Move the continuous variable to
    Dependent List.
  • Move gender to Factor List
  • That's it!

27
Separate boxplots for each gender
28
Relationship between two continuous variables
Scatter plot!
  • Choose Graphs - Scatter - Define. Choose a
    variable for the Y-axis and one for the X-axis
  • Separate markers for separate groups are achieved
    by transferring the categorical variable to Set
    Markers by
  • Can also include regression lines by choosing
    Fit line at total, or a line for each category
    by choosing Fit line at subgroups.

29
  • Scatter plot, weight versus height for the
    students

30
  • Scatter plot, weight versus height, with
    regression lines
  • Will talk much more about regression later

31
Correlation coefficient
  • A numerical measure of the relationship between
    two continuous variables x and y
  • Ranges between -1 and 1
  • Values close to 0: No linear relationship
  • Values close to 1 or -1: Almost linear
    relationship
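
The behaviour of the correlation coefficient can be sketched in Python as follows. The function `pearson_r` is a hand-written illustration of the usual Pearson formula, and all numbers are made up; they are not the students' heights and weights.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up heights (cm) and weights (kg); NOT the students' data.
height = [160, 165, 170, 175, 180, 185]
weight = [55, 62, 63, 70, 74, 82]

r_hw = pearson_r(height, weight)                       # strong positive relationship
r_up = pearson_r(height, [2 * h + 1 for h in height])  # exactly linear, increasing
r_down = pearson_r(height, [-h for h in height])       # exactly linear, decreasing

# A perfect but non-linear (quadratic) relationship can still give r = 0.
x = [-2, -1, 0, 1, 2]
r_curve = pearson_r(x, [v * v for v in x])

print(round(r_hw, 3), round(r_up, 3), round(r_down, 3), round(r_curve, 3))
```

The last case shows why a value close to 0 is best read as "no linear relationship" rather than "no relationship at all".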

32
Descriptive statistics for categorical variables
  • Not very useful to calculate the mean for e.g.
    educational level
  • Would like to find the percentages within each
    category in the study
  • Analyze -> Descriptive Statistics -> Frequencies
  • Move the variable to Variable(s)

33
Frequency table
Last column shows the cumulative distribution,
which always ends at 100%
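
A frequency table with percentages and a cumulative column can be sketched in Python as below. The education codes are made up for illustration, and the column layout is an assumption rather than a copy of the SPSS output.

```python
from collections import Counter

# Made-up education codes (0, 1, 2); NOT data from the study.
education = [0, 1, 1, 2, 2, 2, 1, 0, 2, 1]

counts = Counter(education)
n = len(education)

table = []  # rows of (value, frequency, percent, cumulative percent)
cumulative = 0.0
for value in sorted(counts):
    percent = 100 * counts[value] / n
    cumulative += percent
    table.append((value, counts[value], percent, cumulative))

print(f"{'Value':>5} {'Frequency':>9} {'Percent':>7} {'Cumulative':>10}")
for value, freq, percent, cum in table:
    print(f"{value:>5} {freq:>9} {percent:>7.1f} {cum:>10.1f}")
```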
34
Simple bar plot
35
Relationships between categorical variables
  • Choose Analyze -> Descriptive Statistics
    -> Crosstabs
  • Move one variable to Rows, and another to Columns
  • Click Cells, and check relevant percentages
    (Rows, Columns or Total)
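
What a crosstable with row percentages computes can be sketched in Python as below. The observations are made up and are not the data behind the slides; the category labels and variable names are illustrative assumptions.

```python
from collections import Counter

# Made-up (race, smoker) pairs; NOT the data behind the slides.
observations = [
    ("white", "yes"), ("white", "no"), ("white", "no"), ("white", "yes"),
    ("black", "no"), ("black", "no"), ("black", "yes"),
    ("other", "no"), ("other", "yes"), ("other", "no"),
]

cells = Counter(observations)  # count of each (row, column) combination
row_levels = sorted({race for race, _ in observations})
col_levels = sorted({smoker for _, smoker in observations})

# Row percentages: each cell as a share of its row total.
row_percent = {}
for r in row_levels:
    row_total = sum(cells[(r, c)] for c in col_levels)
    for c in col_levels:
        row_percent[(r, c)] = 100 * cells[(r, c)] / row_total

for r in row_levels:
    line = "  ".join(
        f"{c}: {cells[(r, c)]} ({row_percent[(r, c)]:.0f}%)" for c in col_levels
    )
    print(f"{r:<6} {line}")
```

Column or total percentages follow the same pattern, dividing by the column total or the overall total instead of the row total.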

36
Crosstable: Relationship between race and smoking
37
Bar plot: Relationship between race and smoking
38
Line plot for ordinal categorical variables
(time-series plot)
39
Conclusion
  • Tons of different options on how to present
    results
  • You will (hopefully) learn to understand which
    option is most relevant for each problem during
    this course