PCB 3043L General Ecology - PowerPoint PPT Presentation

Description:

Organizing an ecological study. What is the aim of the study? ... tree height of pine trees along transect from forest trail to interior forest at ... – PowerPoint PPT presentation

Provided by: josett9

1
PCB 3043L - General Ecology
  • Data Analysis

2
OUTLINE
  • Organizing an ecological study
  • Basic sampling terminology
  • Statistical analysis of data
  • Why use statistics?
  • Describing data
  • Measures of central tendency
  • Measures of spread
  • Normal distributions
  • Using Excel
  • Producing tables
  • Producing graphs
  • Analyzing data
  • Statistical tests
  • T-Tests
  • ANOVA
  • Regression

3
Organizing an ecological study
  • What is the aim of the study?
  • What is the main question being asked?
  • What are your hypotheses?
  • Collect data
  • Summarize data in tables
  • Present data graphically
  • Statistically test your hypotheses
  • Analyze the statistical results
  • Present a conclusion to the proposed question

4
Basic sampling terminology
  • Variables
  • Populations
  • Samples
  • Parameters
  • Statistics

5
What is a variable?
  • Variable: any defined characteristic that varies
    from one biological entity to another.
  • Examples: plant height, bird weight, human eye
    color, no. of tree species
  • If an individual is selected randomly from a
    population, it may display a particular height,
    weight, etc.
  • If several individuals are selected, their
    characteristics may be very similar or very
    different.

6
What is a population?
  • Population: the entire collection of measurements
    of a variable of interest.
  • Example: if we are interested in the heights of
    pine trees in Everglades National Park (plant
    height is our variable), then our population would
    consist of all the pine trees in Everglades
    National Park.

7
What is a sample?
  • Sample: a smaller group or subset of the
    population which is measured and used to
    estimate the distribution of the variable within
    the true population
  • Example: the heights of 100 pine trees in
    Everglades National Park may be used to estimate
    the heights of trees within the entire population
    (which actually consists of thousands of trees)

8
What is a parameter?
  • Parameter: any calculated measure used to
    describe or characterize a population
  • Example: the average height of pine trees in
    Everglades National Park

9
What is a statistic?
  • Statistic: an estimate of any population
    parameter
  • Example: the average height of a sample of 100
    pine trees in Everglades National Park

10
Why use statistics?
  • It is not always possible to obtain measures and
    calculate parameters of variables for the entire
    population of interest
  • Statistics allow us to estimate these values for
    the entire population based on multiple, random
    samples of the variable of interest
  • The larger the sample, the closer the estimated
    measure tends to be to the true population
    measure
  • Statistics also allow us to efficiently compare
    populations to determine differences among them
  • Statistics allow us to determine relationships
    between variables
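The link between sample size and accuracy can be illustrated with a short simulation. This is only a sketch: the population below is made up (5000 hypothetical tree heights drawn from a normal distribution), and on any single run a larger sample is likely, but not guaranteed, to land closer to the population mean.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: heights (m) of 5000 pine trees (made-up values).
population = [rng.gauss(15.0, 3.0) for _ in range(5000)]
parameter = statistics.mean(population)  # population mean (a parameter)

for n in (10, 100, 1000):
    sample = rng.sample(population, n)   # random sample without replacement
    statistic = statistics.mean(sample)  # sample mean (a statistic)
    print(f"n = {n:4d}: sample mean = {statistic:.2f}, "
          f"error = {abs(statistic - parameter):.2f}")
```

The sample mean is the statistic; the mean of all 5000 values is the parameter it estimates.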

11
Statistical analysis of data
Heights of pine trees at 2 sites in Everglades
National Park
  • Measures of central tendency
  • Measures of dispersion and variability

12
Measures of central tendency
  • Where is the center of the distribution?
  • mean (x̄ or µ): arithmetic mean
  • median: the value in the middle of the ordered
    data set
  • mode: the most commonly occurring value
  • Example data set: 1, 2, 2, 2, 3, 5, 6, 7, 8, 9,
    10
  • Mean: (1 + 2 + 2 + 2 + 3 + 5 + 6 + 7 + 8 + 9 +
    10)/11 = 55/11 = 5
  • Median: 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10 = 5
  • With an even number of values (1, 2, 2, 2, 3, 5,
    6, 7, 8, 9, 10, 11), the median is the average
    of the two middle values: (5 + 6)/2 = 5.5
  • Mode: 1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10 = 2
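The same three measures can be checked outside Excel. The sketch below uses Python's standard statistics module on the example data set from this slide; the variable names are only illustrative.

```python
import statistics

# Example data set from the slide.
data = [1, 2, 2, 2, 3, 5, 6, 7, 8, 9, 10]

mean = statistics.mean(data)      # sum of the values divided by how many there are
median = statistics.median(data)  # middle value of the ordered data
mode = statistics.mode(data)      # most commonly occurring value

# With an even number of values, the median is the average of the two middle values.
even_median = statistics.median(data + [11])

print(mean, median, mode, even_median)
```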

13
Measures of dispersion and variability
  • How widely is the data distributed?
  • range: largest value minus smallest value
  • variance (σ² or s²)
  • standard deviation (σ or s)

14
Measures of dispersion and variability
  • Example data set: 0, 1, 3, 3, 5, 5, 5, 7, 7, 9, 10
  • Variance: 9.8; Standard deviation: 3.13; Range: 10
  • Example data set: 0, 10, 30, 30, 50, 50, 50, 70,
    70, 90, 100
  • Variance: 980; Standard deviation: 31.30; Range: 100

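These measures of spread can be recomputed with a short script. The sketch below assumes the sample formulas (dividing by n - 1), which reproduce the variances of 9.8 and 980 quoted for the two example data sets; the standard deviation is the square root of the variance. The function name is only illustrative.

```python
import statistics

data = [0, 1, 3, 3, 5, 5, 5, 7, 7, 9, 10]

def describe_spread(values):
    """Return (range, sample variance, sample standard deviation)."""
    value_range = max(values) - min(values)
    variance = statistics.variance(values)  # sum of squared deviations / (n - 1)
    std_dev = statistics.stdev(values)      # square root of the sample variance
    return value_range, variance, std_dev

print(describe_spread(data))
# Second example data set: every value multiplied by 10.
print(describe_spread([10 * v for v in data]))
```

Multiplying every value by 10 multiplies the range and the standard deviation by 10 and the variance by 100.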
15
Normal distribution of data
  • A data set in which most values are around the
    mean, with fewer observations towards the
    extremes of the range of values
  • The distribution is symmetrical about the mean

16
Proportions of a Normal Distribution
  • A normal population of 1000 body weights
  • µ = 70 kg, σ = 10 kg
  • 500 weights are > 70 kg
  • 500 weights are < 70 kg

17
Proportions of a Normal Distribution
  • How many bears have a weight > 80 kg?
  • µ = 70 kg, σ = 10 kg, X = 80 kg
  • We use an equation to tell us how many standard
    deviations from the mean the X value is located:
    Z = (X − µ) / σ = (80 − 70) / 10 = 1
  • We then use a special table to tell us what
    proportion of a normal distribution lies beyond
    this Z value
  • This proportion is equal to the probability of
    drawing at random a measurement (X) greater than
    80 kg

18
Z table
  • Look for Z value on table (1.0)
  • Find associated P value (0.1587)
  • P value states there is a 15.87% (0.1587 × 100)
    chance that a bear selected from the population
    of 1000 bears measured will have a weight greater
    than 80 kg
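The table lookup can be reproduced numerically. The sketch below uses NormalDist from Python's standard statistics module in place of a printed Z table, with the values from the bear example (µ = 70 kg, σ = 10 kg, X = 80 kg); the variable names are only illustrative.

```python
from statistics import NormalDist

mu, sigma, x = 70.0, 10.0, 80.0

# Number of standard deviations between X and the mean.
z = (x - mu) / sigma

# Proportion of a standard normal distribution lying beyond Z (upper tail).
upper_tail = 1.0 - NormalDist().cdf(z)

# Expected number of bears heavier than X in a population of 1000.
expected_count = upper_tail * 1000

print(f"Z = {z:.1f}, P(X > {x:g} kg) = {upper_tail:.4f}")
print(f"Expected bears above {x:g} kg: about {expected_count:.0f} of 1000")
```

So roughly 159 of the 1000 bears would be expected to weigh more than 80 kg.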

19
Probability distribution tables
  • There are multiple probability tables for
    different types of statistical tests.
  • e.g. Z-Table, t-Table, χ²-Table
  • Each allows you to associate a critical value
    with a P value
  • This P value is used to determine the
    significance of statistical results

20
Using Excel
  • Program used to organize data
  • Produce tables
  • Perform calculations
  • Make graphs
  • Perform statistical tests

21
Organizing data in tables
  • Allows you to arrange data in a format that is
    best for analysis
  • The following are the steps you would use

25
Performing calculations
  • Allows you to perform several calculations
  • Sum, Average, Variance, Standard deviation
  • Basic subtraction, addition, multiplication
  • More complex formulas

41
Making graphs
  • Bar Charts.
  • Scatter Plots.

72
Analyzing Data in Excel
  • Statistical tests can be done to determine:
  • Whether or not there is a significant difference
    between two data sets (Student's t-test)
  • Whether or not there is a significant difference
    between more than two data sets (ANOVA)
  • Whether or not there is a significant
    relationship between two variables (Regression
    analysis)

73
Analyzing Data in Excel
  • The following steps must be followed:
  • Choose an appropriate statistical test
  • State H0 and HA
  • Run test to produce Test Statistic
  • Examine P-value
  • Decide whether to reject or fail to reject H0
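The last two steps amount to comparing the P value with a chosen significance level. A minimal sketch of that rule follows, assuming a significance level of 0.05; the function name decide is hypothetical, and the three P values are the ones reported in the worked t-test, ANOVA and regression examples of this presentation.

```python
def decide(p_value, alpha=0.05):
    """Compare a P value with the significance level alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

# P values from the worked examples in this presentation.
for p in (0.09903, 0.002197, 1.65e-08):
    print(p, "->", decide(p))
```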

74
Analyzing Data in Excel
  • Normally, you would have to calculate the
    critical value and look up the P value on a table
  • All tests done in Excel provide the P value for
    you
  • This P value is used to determine the
    significance of statistical results
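  • This P value must be compared to an α value
  • The α value is usually 0.05 or less (e.g. 0.01)
  • With α = 0.05, we reject the null hypothesis only
    when results at least this extreme would occur
    less than 5% of the time if the null hypothesis
    were true
  • The lower the α value, the stronger the evidence
    needed to reject the null hypothesis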
  • First thing you must do is select which
    statistical test you want to perform
  • This is how it is done..

79
t-Tests
  • Used to compare the means of two populations and
    answer the question:
  • Is there a significant difference between the
    two populations?
  • Example: Is there a significant difference
    between the average height of pine trees from 2
    sites in Everglades National Park?
  • You cannot use this test to compare two different
    types of data (e.g. water depth data and soil
    depth data).
  • It can only compare two sets of data based on the
    same data type (e.g. water depth data from two
    different sites)
  • The two data sets that are being compared must be
    presented in the same units. (e.g. you can
    compare two sets of data if both are recorded in
    days. You cannot compare data recorded in units
    of days with data recorded in units of months)

80
1. Choose an appropriate statistical test
2. State H0 and HA
3. Run test to produce Test Statistic
4. Examine P-value
5. Decide whether to reject or fail to reject H0
  • Your Null Hypothesis is always:
  • There is no significant difference between the
    two compared populations (µ1 = µ2)
  • Your Alternative Hypothesis is always:
  • There is a difference between the two compared
    populations (µ1 ≠ µ2)

87
t-Tests
  • When you run the test, look for the p-value
  • If p > 0.05 then fail to reject your Null
    Hypothesis and state that there is no
    significant difference between the two compared
    populations
  • If p < 0.05 then reject your Null Hypothesis and
    state that there is a significant difference
    between the two compared populations

88
t-Tests
  • When you run the test, look for the p-value
  • Our results show P = 0.09903
  • Therefore P > 0.05 (this means that, if the null
    hypothesis were true, a difference at least this
    large would occur more than 5% of the time)
  • So we must fail to reject the Null Hypothesis and
    state that there is no significant difference
    between the two compared populations
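For readers who want to see what happens behind the Excel output, here is a minimal sketch of a two-sample t-test in plain Python. It makes several assumptions not stated in the slides: it uses the equal-variance (pooled) form of the test, it obtains the two-tailed P value by numerically integrating the t distribution rather than reading a table, and the tree heights are made-up numbers, not the data behind P = 0.09903.

```python
import math
import statistics

def t_pdf(t, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value: 1 minus the area between -|t| and |t| (Simpson's rule)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # area between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_area)

def pooled_t_test(sample_a, sample_b):
    """Return (t, df, two-tailed P) for a two-sample t-test assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err
    return t, df, two_tailed_p(t, df)

# Hypothetical pine tree heights (m) at two sites.
site_1 = [12.1, 13.4, 11.8, 14.0, 12.9, 13.1]
site_2 = [12.8, 14.1, 13.5, 14.6, 13.0, 13.9]

t_stat, df, p_value = pooled_t_test(site_1, site_2)
print(f"t = {t_stat:.3f}, df = {df}, P = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

This sketch is meant for small samples like those in a lab exercise; a statistics package is the better choice for real analyses.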

89
ANOVA
  • Used to compare the means of more than two
    populations and answer the question:
  • Is there a significant difference between the
    populations?
  • Example: Is there a significant difference
    between the average height of pine trees from 4
    sites in Everglades National Park?
  • For comparing a particular feature of two or more
    populations, use a Single Factor ANOVA
  • For comparing a particular feature of two or more
    populations, subdivided into two groups, use a
    Two Factor ANOVA

90
  • Your Null Hypothesis is always:
  • There is no significant difference between the
    compared populations (µ1 = µ2 = µ3 = µ4 ...)
  • Your Alternative Hypothesis is always:
  • There is a difference between at least two of
    the compared populations (µ1, µ2, µ3, µ4 ... are
    not all equal)

95
ANOVA
  • When you run the test, look for the p-value
  • If p > 0.05 then fail to reject your Null
    Hypothesis and state that there is no
    significant difference between the compared
    populations
  • If p < 0.05 then reject your Null Hypothesis and
    state that there is a significant difference
    between at least two of the compared populations

96
ANOVA
  • When you run the test, look for the p-value
  • Our results show P = 0.002197
  • Therefore P < 0.05 (this means that, if the null
    hypothesis were true, differences at least this
    large would occur less than 5% of the time)
  • So we must reject the Null Hypothesis and state
    that there is a significant difference between
    at least two of the compared populations

97
ANOVA
  • Remember:
  • The ANOVA result will only tell you that:
  • None of the data sets are significantly different
    from each other
  • OR
  • At least two of the data sets among the data sets
    being compared are significantly different
  • If there is a significant difference between at
    least two data sets, it will not tell you which
    two.
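The test statistic behind a single-factor ANOVA is the F ratio: the variation between group means divided by the variation within groups. The sketch below computes it in plain Python. The heights are made-up values for four hypothetical sites, and instead of a P value it compares F with 3.24, the critical value from a standard F table for α = 0.05 with 3 and 16 degrees of freedom.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical pine tree heights (m) at four sites, five trees per site.
sites = [
    [10.1, 11.3, 9.8, 10.7, 11.0],
    [12.4, 13.1, 12.0, 12.8, 13.5],
    [10.5, 10.9, 11.4, 10.2, 11.1],
    [14.0, 13.2, 14.5, 13.8, 14.1],
]

f_stat, df_b, df_w = one_way_anova_f(sites)
F_CRITICAL = 3.24  # table value for alpha = 0.05 with (3, 16) degrees of freedom

print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
print("reject H0" if f_stat > F_CRITICAL else "fail to reject H0")
```

As the slide notes, a large F only says that at least two sites differ; it does not say which ones.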

98
Two-way ANOVA
  • Used to compare the means of more than two
    populations that are subdivided into two or more
    groups and answer the question:
  • Is there a significant difference between the
    populations?
  • Example: Is there a significant difference
    between the average height of pine trees from 4
    sites in Everglades National Park, during the wet
    and dry season?

105
Two-way ANOVA
  • When you run the test, look for the interaction
    p-value
  • If p > 0.05 then fail to reject your Null
    Hypothesis and state that there is no
    significant difference between the compared
    populations
  • If p < 0.05 then reject your Null Hypothesis and
    state that there is a significant difference
    between at least two of the compared populations

106
Two-way ANOVA
  • Our results show P = 0.2888
  • Therefore P > 0.05 (this means that, if the null
    hypothesis were true, differences at least this
    large would occur more than 5% of the time)
  • So we must fail to reject the Null Hypothesis and
    state that there is no significant difference
    between the compared populations

107
Regression analysis
  • Used to determine whether or not there is a
    linear relationship between two variables and
    answer the question:
  • Is there a significant linear relationship
    between two variables?
  • Example: Is there a significant relationship
    between the average height of pine trees and soil
    depth in Everglades National Park?
  • It basically creates an equation (or line) that
    best predicts Y values based on X values.
  • You cannot use this test to compare populations.
    It only compares variables.
  • You are looking at two different variables (e.g.
    water depth (cm) and plant abundance (no. of
    individuals)), so the data sets do not have to be
    presented in the same units

108
  • Your Null Hypothesis is always:
  • There is no significant linear relationship
    between the two variables
  • Your Alternative Hypothesis is always:
  • There is a significant linear relationship
    between the two variables

109
  • R squared: how well y can be predicted by x,
    i.e. how strong the linear relationship is
    between the two variables.
  • The closer R squared is to 0, the less well the
    line fits the data.
  • The closer R squared is to 1, the better the
    line fits the data.
  • Example: R squared value of 0.04
  • The regression line does not fit the data well
  • Many of the points lie far from the line, so
    there is no clear linear relationship between
    the two variables
  • x cannot reliably be used to predict y
  • Example: R squared value of 0.94
  • The regression line fits the data well
  • The points all lie fairly close to the line, so
    there is a clear linear relationship between
    the two variables
  • x can be used to predict y
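To make R squared concrete, the sketch below fits a least-squares line and computes R squared as 1 minus the ratio of the residual variation to the total variation in y. The function name and the distance and height values are hypothetical, chosen only to illustrate the calculation.

```python
def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x, plus R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Total variation in y, and the part left unexplained by the line.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_resid / ss_total
    return slope, intercept, r_squared

# Hypothetical data: distance from a trail (m) and average tree height (m).
distance_m = [0, 5, 10, 15, 20, 25]
height_m = [8.2, 9.1, 10.4, 10.9, 12.3, 12.8]

slope, intercept, r_squared = fit_line(distance_m, height_m)
print(f"height = {intercept:.2f} + {slope:.3f} * distance, R squared = {r_squared:.3f}")
```

When every point lies exactly on the line, the residual variation is zero and R squared equals 1.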

117
Regression analysis
  • When you run the test, look for the Significance
    F or Sample p-value
  • If p > 0.05 then fail to reject your Null
    Hypothesis and state that there is no
    significant linear relationship between the two
    variables
  • If p < 0.05 then reject your Null Hypothesis and
    state that there is a significant linear
    relationship between the two variables

118
Regression analysis
  • When you run the test, look for the p-value
  • Our results show Significance F or Sample p-value
    = 1.65E-08 = 0.0000000165
  • Therefore P < 0.05 (this means that, if the null
    hypothesis were true, a relationship at least
    this strong would occur less than 5% of the time)
  • So we must reject the Null Hypothesis and state
    that there is a significant linear relationship
    between the two variables
  • Next look at the R squared value
  • Our results show R squared = 0.975
  • Therefore the line fits the data well
  • x can be used to predict y

119
Ecological study
  • What is the aim of the study?
  • What is the main question being asked?
  • What are your hypotheses?
  • Collect data
  • Summarize data in tables
  • Present data graphically
  • Statistically test your hypotheses
  • Analyze the statistical results
  • Present a conclusion to the proposed question

120
  • Aim: To determine whether or not there are
    changes in heights of Pine trees with distance
    from the edge of a forest trail in Everglades
    National Park.
  • Hypotheses:
  • H0: There is no significant relationship between
    distance from the edge of the trail and Pine tree
    height
  • HA: There is a significant relationship between
    distance from the edge of the trail and Pine tree
    height
  • Results

Average tree height of pine trees along transect
from forest trail to interior forest at ENP
  • P = 1.65E-08. Since P < 0.05, reject H0
  • Therefore, there is a significant relationship
    between distance from the edge of the trail and
    Pine tree height
  • R Square = 0.97, so there is a strong positive
    linear relationship between distance from the
    trail and plant height

121
Assignment Worksheet 1
  • Three questions
  • T-test
  • Single factor ANOVA and Two-way ANOVA
  • Regression analysis