Statistics: Data Analysis and Presentation - PowerPoint PPT Presentation

About This Presentation
Title:

Statistics: Data Analysis and Presentation

Description:

Statistics: Data Analysis and Presentation Fr Clinic II – PowerPoint PPT presentation

Slides: 23
Provided by: Jess3193
Learn more at: https://users.rowan.edu

Transcript and Presenter's Notes


1
Statistics: Data Analysis and Presentation
  • Fr Clinic II

2
Overview
  • Tables and Graphs
  • Populations and Samples
  • Mean, Median, and Standard Deviation
  • Standard Error and 95% Confidence Interval (CI)
  • Error Bars
  • Comparing Means of Two Data Sets
  • Linear Regression (LR)

3
Warning
  • Statistics is a huge field; I've simplified
    considerably here. For example:
  • Mean, Median, and Standard Deviation
      • There are alternative formulas
  • Standard Error and the 95% Confidence Interval
      • There are other ways to calculate CIs (e.g., z
        statistic instead of t; difference between two
        means, rather than a single mean)
  • Error Bars
      • Don't go beyond the interpretations I give here!
  • Comparing Means of Two Data Sets
      • We cover only the t test for two means when the
        variances are unknown but equal; there are other
        tests
  • Linear Regression
      • We only look at simple LR and only calculate the
        intercept, slope, and R². There is much more to
        LR!

4
Tables
Table 1: Average Turbidity and Color of Water
Treated by Portable Water Filters
  • Consistent Format, Title, Units, Big Fonts
  • Differentiate Headings, Number Columns
5
Figures
Figure 1: Turbidity of Pond Water, Treated and
Untreated
  • Consistent Format, Title, Units
  • Good Axis Titles, Big Fonts
6
Populations and Samples
  • Population
      • All of the possible outcomes of an experiment or
        observation
      • US population
      • Particular type of steel beam
  • Sample
      • A finite number of outcomes measured or
        observations made
      • 1000 US citizens
      • 5 beams
  • We use samples to estimate population properties
      • Mean, Variability (e.g., standard deviation),
        Distribution
      • Heights of 1000 US citizens used to estimate the
        mean height of the US population

7
Mean and Median
  • Turbidity of Treated Water (NTU)

Data: 1 3 3 6 8 10
Mean: Sum of values divided by number of samples
  (1 + 3 + 3 + 6 + 8 + 10)/6 = 5.2 NTU
Median: The middle number
  Rank:   1 2 3 4 5 6
  Number: 1 3 3 6 8 10
For an even number of sample points, average the middle
two: (3 + 6)/2 = 4.5 NTU
Excel: Mean = AVERAGE, Median = MEDIAN
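
As a rough cross-check of the arithmetic on this slide, the same mean and median can be computed outside Excel. The sketch below assumes Python and its standard `statistics` module, neither of which is part of the original presentation; the variable names are illustrative only.

```python
import statistics

# Turbidity of treated water (NTU): the six values shown on the slide
turbidity_ntu = [1, 3, 3, 6, 8, 10]

# Mean: sum of values divided by number of samples
mean_ntu = statistics.mean(turbidity_ntu)

# Median: middle value; for an even count, the two middle values are averaged
median_ntu = statistics.median(turbidity_ntu)

print(f"Mean:   {mean_ntu:.1f} NTU")
print(f"Median: {median_ntu:.1f} NTU")
```

These two calls play the same role as Excel's AVERAGE and MEDIAN.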
8
Variance
  • Measure of variability
  • Sum of the squares of the deviations about the mean,
    divided by the degrees of freedom (n - 1):
    $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

n = number of data points
Excel: variance = VAR
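
To make the definition concrete, here is a minimal sketch of the sample variance, assuming n - 1 degrees of freedom (the convention Excel's VAR uses). It is written in Python, which is not part of the original presentation, and reuses the turbidity values from the Mean and Median slide. The helper name `sample_variance` is hypothetical.

```python
import statistics

# Turbidity values (NTU) from the Mean and Median slide
turbidity_ntu = [1, 3, 3, 6, 8, 10]

def sample_variance(values):
    """Sum of squared deviations about the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

variance_manual = sample_variance(turbidity_ntu)
variance_stdlib = statistics.variance(turbidity_ntu)  # also divides by n - 1

print(f"Manual variance:  {variance_manual:.3f} NTU^2")
print(f"Library variance: {variance_stdlib:.3f} NTU^2")
```

The two results should agree, since `statistics.variance` also uses the n - 1 denominator.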
9
Standard Deviation, s
  • Square root of the variance
  • For phenomena following a Normal Distribution
    (bell curve), 95% of population values lie within
    1.96 standard deviations of the mean
      • Area under the curve is the probability of getting
        a value within the specified range

Excel: standard deviation = STDEV
[Figure: normal distribution curve; x-axis: Standard Deviations from Mean]
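
The 1.96 figure can be checked numerically. The sketch below assumes Python 3.8 or later, whose standard library includes `statistics.NormalDist`; this is an illustration added here, not something the slides use. It also computes the sample standard deviation of the turbidity values from the Mean and Median slide.

```python
import math
import statistics
from statistics import NormalDist

# Sample standard deviation: square root of the sample variance
turbidity_ntu = [1, 3, 3, 6, 8, 10]
s = statistics.stdev(turbidity_ntu)
assert math.isclose(s, math.sqrt(statistics.variance(turbidity_ntu)))

# Area under the standard normal curve between -1.96 and +1.96
standard_normal = NormalDist(mu=0.0, sigma=1.0)
fraction_within = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

# Going the other way: the z value that leaves 2.5% in each tail
z_95 = standard_normal.inv_cdf(0.975)

print(f"s = {s:.2f} NTU")
print(f"Fraction within 1.96 standard deviations: {fraction_within:.4f}")
print(f"z for a central 95% range: {z_95:.3f}")
```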
10
Standard Error of Mean
  • Standard deviation of the mean
      • Of a sample of size n
      • Taken from a population with standard deviation s
      • $SE = s / \sqrt{n}$
  • Estimate of mean depends on sample selected
  • As n increases, variance of mean estimate goes down,
    i.e., estimate of population mean improves
  • As n increases, mean estimate distribution approaches
    normal, regardless of population distribution
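
A small sketch of the standard error calculation, assuming the usual formula SE = s / sqrt(n) and written in Python (not part of the original slides). The second half uses a hypothetical fixed s of 0.5 NTU purely to show how the standard error shrinks as n grows.

```python
import math
import statistics

def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Turbidity values (NTU) from the Mean and Median slide
turbidity_ntu = [1, 3, 3, 6, 8, 10]
se = standard_error(turbidity_ntu)
print(f"Standard error of the mean: {se:.2f} NTU")

# Hypothetical illustration: same spread, larger samples, smaller SE
s_hypothetical = 0.5  # NTU, assumed for illustration only
se_by_n = {n: s_hypothetical / math.sqrt(n) for n in (5, 20, 80)}
for n, value in se_by_n.items():
    print(f"n = {n:3d}: SE = {value:.3f} NTU")
```

Quadrupling the sample size halves the standard error, which is why the estimate of the population mean improves only slowly with more data.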

11
95% Confidence Interval (CI) for Mean
  • Interval within which we are 95% confident the
    true mean lies:
    $\bar{x} \pm t_{95,n-1} \, s / \sqrt{n}$
  • $t_{95,n-1}$ is the t-statistic for a 95% CI with
    sample size n
      • If n ≥ 30, let $t_{95,n-1}$ = 1.96 (Normal
        Distribution)
      • Otherwise, use Excel formula TINV(0.05,n-1)
  • n = number of data points
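
The interval can be sketched in a few lines, assuming the form mean ± t × s / sqrt(n). Python's standard library has no t-distribution table, so the critical value is passed in by hand; 2.571 is the two-tailed 95% value for 5 degrees of freedom, which is what TINV(0.05,5) should return. The function name and the choice of Python are assumptions for illustration, and the data are the turbidity values from the Mean and Median slide.

```python
import math
import statistics

def confidence_interval_95(values, t_critical):
    """Return (low, high) for mean +/- t * s / sqrt(n).

    t_critical must be supplied by the caller, e.g. from Excel's
    TINV(0.05, n-1), or 1.96 when n >= 30.
    """
    n = len(values)
    mean = statistics.mean(values)
    half_width = t_critical * statistics.stdev(values) / math.sqrt(n)
    return mean - half_width, mean + half_width

turbidity_ntu = [1, 3, 3, 6, 8, 10]
T_95_DF5 = 2.571  # two-tailed 95% t value for n - 1 = 5 degrees of freedom

low, high = confidence_interval_95(turbidity_ntu, T_95_DF5)
print(f"95% CI for the mean: {low:.2f} to {high:.2f} NTU")
```

With only six points the interval is wide, which reflects both the small n and the large spread in this data set.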

12
Error Bars
  • Show data variability on plot of mean values
  • Types of error bars include
      • Standard Deviation, Standard Error, 95% CI
      • Maximum and minimum value

13
Using Error Bars to compare data
  • Standard Deviation
      • Demonstrates data variability, but no comparison
        possible
  • Standard Error
      • If bars overlap, any difference in means is not
        statistically significant
      • If bars do not overlap, indicates nothing!
  • 95% Confidence Interval
      • If bars overlap, indicates nothing!
      • If bars do not overlap, difference is
        statistically significant
  • We'll use 95% CI
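
The 95% CI rule can be applied without a plot by checking whether two intervals overlap. The sketch below is in Python (an assumption, since the slides use Excel) and borrows the two-filter summary values from the t test example later in this presentation. The helper names are hypothetical, and 2.093 is the two-tailed 95% t value for 19 degrees of freedom, as TINV(0.05,19) should give.

```python
import math

def ci95_from_summary(mean, s, n, t_critical):
    """95% CI as (low, high) from summary statistics."""
    half_width = t_critical * s / math.sqrt(n)
    return mean - half_width, mean + half_width

def intervals_overlap(a, b):
    """True if closed intervals a = (low, high) and b = (low, high) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]

T_95_DF19 = 2.093  # two-tailed 95% t value for n - 1 = 19

# Summary values from the two-filter example (NTU)
filter1_ci = ci95_from_summary(mean=2.0, s=0.5, n=20, t_critical=T_95_DF19)
filter2_ci = ci95_from_summary(mean=3.0, s=0.6, n=20, t_critical=T_95_DF19)

if intervals_overlap(filter1_ci, filter2_ci):
    print("95% CI bars overlap: this indicates nothing either way")
else:
    print("95% CI bars do not overlap: difference is statistically significant")
```

Note that the overlapping case deliberately reports no conclusion, in line with the interpretation given on this slide.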

14
Example 1
Create Bar Chart of Name vs Mean. Right click on
data. Select Format Data Series.
15
Example 2
16
What can we do?
  • Plot mean water quality data for various filters
    with error bars
  • Plot mean water quality over time with error bars

17
Comparing Filter Performance
  • Use the t test to determine if the means of two
    populations are different
      • Based on two data sets
      • E.g., turbidity produced by two different filters

18
Comparing Two Data Sets using the t test
  • Example: You pump 20 gallons of water through
    each of filters 1 and 2. After every gallon, you
    measure the turbidity.
      • Filter 1: Mean = 2 NTU, s = 0.5 NTU, n = 20
      • Filter 2: Mean = 3 NTU, s = 0.6 NTU, n = 20
  • You ask the question: Do the filters make water
    with a different mean turbidity?

19
Do the Filters make different water?
  • Use TTEST (Excel)
      • Returns the fractional probability of being wrong
        if you answer yes
      • We want the probability to be small: 0.01 to 0.10
        (1 to 10%). Use 0.01
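
For readers without Excel, the two-filter example can be worked from its summary statistics alone. The sketch below is one possible approach, not the presenter's method: it assumes the equal-variance (pooled) two-sample t test named on the Warning slide, and it obtains the two-tailed probability by numerically integrating the t density with Simpson's rule, because Python's standard library has no t-distribution function. All function names are hypothetical.

```python
import math

def pooled_t_statistic(mean1, s1, n1, mean2, s2, n2):
    """t statistic and degrees of freedom, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df

def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coeff = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: 1 - 2 * (area from 0 to |t|), via Simpson's rule.

    steps must be even.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)

# Summary statistics from the slide (NTU)
t_stat, df = pooled_t_statistic(2.0, 0.5, 20, 3.0, 0.6, 20)
p_value = two_tailed_p(t_stat, df)

print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.2g}")
print("Different mean turbidity?", "yes" if p_value < 0.01 else "cannot say")
```

The p-value here comes out far below the 0.01 threshold, so for these summary values the answer to the question on the slide would be yes.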

20
t test Questions
  • Do two filters make different water?
      • Take multiple measurements of a particular water
        quality parameter for 2 filters
  • Do two filters treat different amounts of water
    between cleanings?
      • Measure amount of water filtered between
        cleanings for two filters
  • Does the amount of water a filter treats between
    cleanings differ after a certain amount of water
    is treated?
      • For a single filter, measure the amount of water
        treated between cleanings before and after a
        certain total amount of water is treated

21
Linear Regression
  • Fit the best straight line to a data set

Right-click on a data point and use the trendline
option. Use the options tab to get the equation and R².
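
What the trendline option computes can be sketched with the ordinary least-squares formulas for a straight line. The code below is Python (an assumption; the slides rely on Excel), the function name is hypothetical, and the data points are made up for illustration since the transcript does not preserve the plotted data.

```python
def fit_line(x_values, y_values):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x_values)
    x_mean = sum(x_values) / n
    y_mean = sum(y_values) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(x_values, y_values))
    s_xx = sum((x - x_mean) ** 2 for x in x_values)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)
print(f"y = {intercept:.2f} + {slope:.2f} x")
```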
22
R² - Coefficient of Multiple Determination
$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$
$\hat{y}_i$ = Predicted y values, from regression equation
$y_i$ = Observed y values
R² = fraction of variance explained by regression
(variance = standard deviation squared)
R² = 1 if data lies along a straight line
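
A minimal sketch of R² as the fraction of variance explained, assuming the common form 1 - (residual sum of squares) / (total sum of squares about the mean). It is written in Python, the function name is hypothetical, and both data sets and the line y = 0.05 + 1.99 x are made up for illustration.

```python
def r_squared(observed, predicted):
    """1 - (residual sum of squares) / (total sum of squares about the mean)."""
    y_mean = sum(observed) / len(observed)
    ss_residual = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    ss_total = sum((y - y_mean) ** 2 for y in observed)
    return 1 - ss_residual / ss_total

# Hypothetical case 1: observations lie exactly on y = 1 + 2x
x_line = [0, 1, 2, 3]
observed_line = [1, 3, 5, 7]
predicted_line = [1 + 2 * x for x in x_line]
r2_line = r_squared(observed_line, predicted_line)

# Hypothetical case 2: noisy observations around y = 0.05 + 1.99x
x_noisy = [1, 2, 3, 4, 5]
observed_noisy = [2.1, 3.9, 6.2, 7.8, 10.1]
predicted_noisy = [0.05 + 1.99 * x for x in x_noisy]
r2_noisy = r_squared(observed_noisy, predicted_noisy)

print(f"R^2, points on a line: {r2_line:.3f}")
print(f"R^2, noisy points:     {r2_noisy:.4f}")
```

The first case returns exactly 1, matching the statement that R² is 1 when the data lie along a straight line; scatter about the line pulls the value below 1.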