Title: Stat 155, Section 2, Last Time
1Stat 155, Section 2, Last Time
- Reviewed Excel Computation of:
- Time Plots (i.e. Time Series)
- Histograms
- Modelling Distributions: Densities (Areas)
- Normal Density Curve (very useful model)
- Fitting Normal Densities
- (using mean and s.d.)
2Reading In Textbook
- Approximate Reading for Today's Material:
- Pages 71-83, 102-112
- Approximate Reading for Next Class:
- Pages 123-127, 132-145
3Two Views of Normal Fitting
- Fit Model to Data:
- Choose $\mu$ and $\sigma$
- Fit Data to Model:
- First Standardize Data
- Then use Normal $N(0,1)$
- Note: same thing, just different rescalings
- (choose scale depending on need)
4Normal Distribution Notation
- The normal distribution,
- with mean $\mu$ and standard deviation $\sigma$,
- is abbreviated as $N(\mu, \sigma)$
5Interpretation of Z-scores
- Recall Z-score Idea:
- Transform data $X_1, \dots, X_n$
- By subtracting mean and dividing by s.d.
- To get $Z_i = (X_i - \bar{X}) / s$ (mean 0, s.d. 1)
- Interpret as $X_i = \bar{X} + Z_i s$
- I.e. $X_i$ is $Z_i$ sds above the mean
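As a minimal sketch of the standardization described on this slide, the snippet below subtracts the sample mean and divides by the sample s.d. The function name `z_scores` and the data values are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev

def z_scores(data):
    """Standardize data: subtract the sample mean, divide by the sample s.d."""
    m = mean(data)
    s = stdev(data)  # sample s.d. (divides by n - 1)
    return [(x - m) / s for x in data]

scores = [62.0, 71.0, 75.0, 80.0, 92.0]  # hypothetical data, not from the course
z = z_scores(scores)

# Each z-score says how many s.d.s that value lies above (+) or below (-) the mean.
for x, zi in zip(scores, z):
    print(f"{x:5.1f} -> z = {zi:+.2f}")
```

After the transformation the z-scores have mean 0 and s.d. 1, whatever scale the original data were on.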
6Interpretation of Z-scores
- Same idea for Normal Curves:
- Z-scores are on $N(0,1)$ scale,
- so use areas to interpret them
- Important Areas:
- Within 1 sd of mean (about 68% of the area)
- the majority
7Interpretation of Z-scores
- Within 2 sd of mean (about 95% of the area)
- really most
- Within 3 sd of mean (about 99.7% of the area)
- almost all
8Interpretation of Z-scores
- Interactive Version (used for above pics)
- From Publisher's Website:
- http://bcs.whfreeman.com/ips5e/
- Statistical Applets
- Normal Curve
9Interpretation of Z-scores
- Summary:
- These relations are called the
- 68 - 95 - 99.7 Rule
- HW: 1.86 (a: 234-298, b: 234, 298),
- 1.87
10Computation of Normal Areas
- Classical Approach: Tables
- See inside covers of text
- Summarizes area computations
- Because can't use calculus (the normal density has no closed-form antiderivative)
- Constructed by computers
- (a job description in the early 1900s!)
11Computation of Normal Areas
- EXCEL Computation
- works in terms of lower areas
- E.g. for
- Area < 1.3
- is 0.7257
12Computation of Normal Areas
- Interactive Version (used for above pic)
- From Same Publisher's Website:
- http://bcs.whfreeman.com/ips5e/
- Statistical Applets
- Normal Curve
13Computation of Normal Areas
- EXCEL Computation:
- (of above e.g.)
- Use NORMDIST
- Enter parameters (x, mean, standard_dev, cumulative)
- x is cutoff point
- Return is Area below x (when cumulative is TRUE)
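For readers without Excel, a rough Python analogue of the lower-area computation is sketched below using the standard library's `statistics.NormalDist`. The helper name `area_below` is hypothetical. The parameters of the example curve did not survive in the slide text, so mean 1 and s.d. 0.5 are an assumption; they are used here only because they reproduce the quoted area of 0.7257 below 1.3.

```python
from statistics import NormalDist

def area_below(x, mean, sd):
    """Area under the Normal(mean, sd) curve to the left of the cutoff x."""
    return NormalDist(mu=mean, sigma=sd).cdf(x)

# ASSUMED parameters (mean 1, s.d. 0.5): not stated in the surviving slide text.
print(round(area_below(1.3, mean=1.0, sd=0.5), 4))
```

Like NORMDIST, this works in terms of lower areas: the result is always the area to the left of the cutoff.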
14Computation of Normal Areas
- Computation of areas over intervals
- (use subtraction)
- Area between a and b = (Area below b) - (Area below a)
15Computation of Normal Areas
- Computation of areas over intervals
- (use subtraction for EXCEL too)
- E.g. Use Excel to check 68 - 95 - 99.7 Rule
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg9.xls
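The same check can be sketched outside Excel. The snippet below is an illustration, not the class spreadsheet: it computes the area within 1, 2 and 3 s.d.s of the mean on the standard normal scale by subtracting lower areas. The helper name `area_between` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, s.d. 1

def area_between(a, b, dist=Z):
    """Area over the interval (a, b): lower area at b minus lower area at a."""
    return dist.cdf(b) - dist.cdf(a)

for k in (1, 2, 3):
    print(f"within {k} sd of mean: {100 * area_between(-k, k):.1f}%")
```

The three percentages should come out close to 68, 95 and 99.7, which is where the rule's name comes from.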
16Normal Area HW
- HW (use Excel):
- 1.94
- 1.97 (Hint: the % above 130 = 100% - % below 130)
- 1.99 (see discussion above)
- 1.113
- Caution: Don't just twiddle EXCEL until answer
appears. Understand it!!!
17And Now for Something Completely Different
- A mind-blowing video clip:
- 8-year-old Skateboarding Twins
- http://www.youtube.com/watch?v=8X2_zsnPkq8&mode=related&search=
- Do they ever miss?
- You can explore further
- Thanks to Devin Coley for the link
18Inverse of Area Function
- Inverse of Frequencies: Quantiles
- Idea: Given area, find cutoff x
- I.e. for Area = 80%
- This x is the quantile
19Inverse of Area Function
- EXCEL Computation of Quantiles:
- Use NORMINV
- Continue Class Example:
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg9.xls
- Probability is Area
- Enter mean and SD parameters
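To illustrate the "given area, find cutoff" idea without Excel, here is a small sketch using `statistics.NormalDist.inv_cdf`, which plays a role similar to NORMINV. The helper name `quantile` and the choice of the standard normal curve are assumptions made for the example.

```python
from statistics import NormalDist

def quantile(p, mean=0.0, sd=1.0):
    """Cutoff x such that the area below x under Normal(mean, sd) equals p."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(p)

x80 = quantile(0.80)              # cutoff with 80% of the area below it
check = NormalDist().cdf(x80)     # going back: area below that cutoff

print(round(x80, 3), round(check, 3))
```

Feeding the cutoff back into the area function returns the original 80%, which is what "inverse" means here.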
20Inverse Area Example
- When a machine works normally, it fills bottles
with mean 25 oz, and SD 0.2 oz.
- The machine is out of control when it
overfills. Choose an alarm level, which will
give only 1% false alarms.
- Want cutoff, x, so that Area above = 1%
- Note: Area below = 100% - Area above = 99%
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg9.xls
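A minimal sketch of this alarm-level calculation, using the mean and SD from the slide, is given below. It is one way to do the computation in Python rather than the spreadsheet's own steps; the variable names are hypothetical.

```python
from statistics import NormalDist

fill = NormalDist(mu=25.0, sigma=0.2)   # in-control fill amounts, in oz

false_alarm_rate = 0.01                  # want Area above the cutoff = 1%
area_below = 1.0 - false_alarm_rate      # so Area below the cutoff = 99%
alarm_level = fill.inv_cdf(area_below)   # the 99% quantile of the fill amounts

print(f"alarm level: {alarm_level:.3f} oz")
# Sanity check: the area above the alarm level should be about 1%.
print(f"area above:  {1.0 - fill.cdf(alarm_level):.4f}")
```

The key step is the conversion from an upper area to a lower area, since the quantile function (like NORMINV) works with areas below the cutoff.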
21Inverse Area HW
- 1.95, 1.101, 1.107, 1.109
- 1.116 a (-0.674, 0.674)
- 1.117
- 1.118 (4.3)
22Normal Diagnostic
- When is the Normal Model good?
- Useful Graphical Device:
- Q-Q plot = Normal Quantile Plot
- Idea: look at a plot which is approximately linear
for data from the Normal Model
23Normal Quantile Plot
- Approach, for data $X_1, \dots, X_n$:
- Sort data
- Compute Theoretical Proportions
- Compute Theoretical Z-scores
- Plot Sorted Data (Y-axis) vs. Theoretical Z-scores (X-axis)
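The steps above can be sketched in a few lines of Python. The slide does not say which formula is used for the theoretical proportions, so the convention (i - 0.5) / n below is an assumption (other conventions such as i / (n + 1) are also common). The function name and the data values are hypothetical.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (theoretical z, sorted data) pairs for a normal quantile plot."""
    ys = sorted(data)                                    # step 1: sort data
    n = len(ys)
    # step 2: theoretical proportions; (i - 0.5) / n is an ASSUMED convention
    props = [(i - 0.5) / n for i in range(1, n + 1)]
    # step 3: theoretical z-scores = standard normal quantiles of the proportions
    zs = [NormalDist().inv_cdf(p) for p in props]
    return list(zip(zs, ys))                             # step 4: x = z, y = data

temps = [13.1, 15.6, 14.2, 16.8, 15.0, 14.7, 15.9]  # hypothetical values
for z, y in normal_quantile_points(temps):
    print(f"{z:+.3f}  {y}")
```

Plotting the second column against the first gives the normal quantile plot; points that fall close to a straight line suggest the normal model is reasonable.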
24Normal Quantile Plot
- Several Examples:
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg12.xls
- Show how to compute in Excel
- Steps as above
25Normal Quantile Plot
- Main Lessons:
- Melbourne Winter Temperature Data:
- Gaussian is good, so looks linear
- So OK to use normal model for these data
- Adding trendline helps in assessing linearity
26Normal Quantile Plot
- Main Lessons:
- Intro Stat Course Exam Scores Data:
- Skewed distributions ⇒ nonlinearity
- Outliers show up clearly
- Normal model unreliable here
- Combined plot highlights:
- Mean = Y-intercept
- Standard Deviation = Slope
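To see why the mean shows up as the Y-intercept and the standard deviation as the slope, the sketch below fits a least-squares line to an idealized normal quantile plot. The "data" are exact quantiles of a normal curve with an assumed mean of 70 and s.d. of 8 (made-up values), and the plotting positions (i - 0.5) / n are likewise an assumption. Real data would only follow the line approximately.

```python
from statistics import NormalDist, mean

def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x; returns (intercept, slope)."""
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

n = 50
# Theoretical z-scores (X-axis) at ASSUMED plotting positions (i - 0.5) / n.
zs = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
# Idealized "sorted data" (Y-axis): exact quantiles with ASSUMED mean 70, s.d. 8.
ys = [70.0 + 8.0 * z for z in zs]

intercept, slope = fit_line(zs, ys)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

Since a normal variable can be written as mean + s.d. times a standard normal z, the line through the plot has the mean as its height at z = 0 and the s.d. as its slope.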
27Normal Quantile Plot
- Main Lessons:
- Simulated Bimodal Data:
- Curve is flat near modes
- Roughly linear near peaks
- Corresponds to two normal subpopulations
- Goes up fast at valley
28Normal Quantile Plot
- Homework
- 1.122
- 1.123
- 1.125
29And now for something completely different
- Recall: Distribution of majors of students in this course
30And now for something completely different
- How about a biology joke?
- A seventh grade Biology teacher arranged a
demonstration for his class. He took two
earthworms and in front of the class he did the
following: He dropped the first worm into a
beaker of water where it dropped to the bottom
and wriggled about. He dropped the second worm
into a beaker of ethyl alcohol and it
immediately shriveled up and died. He asked the
class if anyone knew what this demonstration was
intended to show them.
31And now for something completely different
- He asked the class if anyone knew what this
demonstration was intended to show them.
- A boy in the second row immediately shot his arm
up and, when called on, said "You're showing us
that if you drink alcohol, you won't have worms."
32Variable Relationships
- Chapter 2 in Text
- Idea: Look beyond single quantities, to how
quantities relate to each other.
- E.g. How do HW scores relate
- to Exam scores?
- Section 2.1: Useful graphical device:
- Scatterplot
33Plotting Bivariate Data
- Toy Example
- (1,2)
- (3,1)
- (-1,0)
- (2,-1)
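As a rough, text-only illustration of what plotting these pairs means, the sketch below places each (x, y) point of the toy example on a character grid, with x running left to right and y increasing upward. It is a stand-in for a real chart, and the function name `text_scatter` is hypothetical.

```python
points = [(1, 2), (3, 1), (-1, 0), (2, -1)]  # the toy example's (x, y) pairs

def text_scatter(pts):
    """Draw integer (x, y) points on a character grid, y increasing upward."""
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    rows = []
    for y in range(max(ys), min(ys) - 1, -1):          # top row = largest y
        row = "".join("*" if (x, y) in pts else "."
                      for x in range(min(xs), max(xs) + 1))
        rows.append(f"{y:>3} {row}")                    # label each row with its y
    return "\n".join(rows)

print(text_scatter(points))
```

Each `*` marks one data pair; the leftmost column corresponds to x = -1 and the rightmost to x = 3.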
34Plotting Bivariate Data
- Sometimes:
- Can see more insightful patterns by connecting points
35Plotting Bivariate Data
- Sometimes:
- Useful to switch off points, and only look at lines/curves
36Plotting Bivariate Data
- Common Name: Scatterplot
- A look under the hood:
- EXCEL Chart Wizard (colored bar icon)
- Chart Type: XY (scatter)
- Subtype controls points only, or lines
- Later steps similar to above
- (can massage the pic!)
37Scatterplot E.g.
- Data from related Intro. Stat. Class
- (actual scores)
- How does HW score predict Final Exam?
- X = HW, Y = Final Exam
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls
- In top half of HW scores:
- Better HW ⇒ Better Final
- For lower HW:
- Final is much more random
38Scatterplots
- Common Terminology:
- When thinking about "X causes Y",
- Call X the Explanatory Var. or Indep. Var.
- Call Y the Response Var. or Dep. Var.
- (think of Y as function of X)
- (although not always sensible)
39Scatterplots
- Note: Sometimes think about causation,
- Other times: Explore Relationship
- HW: 2.1
40Class Scores Scatterplots
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls
- How does HW predict Midterm 1?
- X = HW, Y = MT1
- Still better HW ⇒ better Exam
- But for each HW, wider range of MT1 scores
- I.e. HW doesn't predict MT1 as well as Final
- Outliers in scatterplot may not be outliers in
either individual variable
- e.g. HW = 72, MT1 = 94
- (bad HW, but good MT1?, fluke???)
41Class Scores Scatterplots
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls
- How does MT1 predict MT2?
- X = MT1, Y = MT2
- Idea: less causation, more exploration
- Still higher MT1 associated with higher MT2
- For each MT1, wider range of MT2
- i.e. not good predictor
- Interesting Outliers:
- MT1 = 100, MT2 = 56 (oops!)
- MT1 = 23, MT2 = 74 (woke up!)
42Important Aspects of Relations
- I. Form of Relationship
- II. Direction of Relationship
- III. Strength of Relationship
43I. Form of Relationship
- Linear: Data approximately follow a line
- Previous Class Scores Example:
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls
- Final vs. High values of HW is best
- Nonlinear: Data follow a different pattern
- Nice Example: Bralower's Fossil Data
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg11.xls
44Bralower's Fossil Data
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg11.xls
- From T. Bralower, formerly of Geological Sci.
- Studies Global Climate, millions of years ago
- Ratios of Isotopes of Strontium
- Reflects Ice Ages, via Sea Level
- (50 meter difference!)
- As function of time
- Clearly nonlinear relationship
45II. Direction of Relationship
- Positive Association:
- X bigger ⇒ Y bigger
- Negative Association:
- X bigger ⇒ Y smaller
- E.g. X = alcohol consumption, Y = Driving Ability
- Clear negative association
46III. Strength of Relationship
- Idea: How close are points to lying on a line?
- Revisit Class Scores Example:
- http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stor155Eg10.xls
- Final Exam is closely related to HW
- Midterm 1 less closely related to HW
- Midterm 2 even less closely related to Midterm 1