Title: Scatter-plot, Best-Fit Line, and Correlation Coefficient
1 Scatter-plot, Best-Fit Line, and Correlation Coefficient
2 Definitions
- Scatter Diagram (Scatter Plot): a graph that shows the relationship between two quantitative variables.
- Explanatory Variable: the predictor variable, plotted on the horizontal axis (x-axis).
- Response Variable: a value explained by the explanatory variable, plotted on the vertical axis (y-axis).
3 Why might we want to see a Scatter Plot?
- Statisticians and quality control technicians gather data to determine correlations (relationships) between two events (variables).
- Scatter plots will often show at a glance whether a relationship exists between two sets of data.
- It will be easy to predict a value based on a graph if there is a relationship present.
4 Types of Correlations
- Strong Positive Correlation: the values go up from left to right and are linear.
- Weak Positive Correlation: the values go up from left to right and appear to be linear.
- Strong Negative Correlation: the values go down from left to right and are linear.
- Weak Negative Correlation: the values go down from left to right and appear to be linear.
- No Correlation: no evidence of a line at all.
5 Examples of each Plot
[Figure: one example scatter plot for each type of correlation]
6 How to create a Scatter Plot
- We will be relying on our TI-83 Graphing Calculator for this unit!
- First, turn Diagnostics ON: press 2nd, CATALOG and select DiagnosticOn.
- Enter the data in the calculator lists: press STAT, choose 1:Edit, and type the values into L1 and L2.
- Press 2nd, Y= (STAT PLOT) and turn the plot ON; the 1st type is the scatter plot.
- Choose ZOOM, 9:ZoomStat.
7 Let's try one
SANDWICH                      Total Fat (g)   Total Calories
Grilled Chicken               5               300
Hamburger                     9               260
Cheeseburger                  13              320
Quarter Pounder               21              420
Quarter Pounder with Cheese   30              530
Big Mac                       31              560
Arch Sandwich Special         31              550
Arch Special with Bacon       34              590
Crispy Chicken                25              500
Fish Fillet                   28              560
Grilled Chicken with Cheese   20              440
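
Entering the two columns into L1 and L2 and plotting them is, in effect, pairing two lists point by point. As a rough illustration away from the calculator, the minimal Python sketch below draws a text-only scatter plot of the sandwich data. The helper name text_scatter and the grid size are assumptions made for this example; they are not part of the TI-83 workflow.

```python
# Sandwich data from the table: the two lists play the role of L1 and L2.
fat = [5, 9, 13, 21, 30, 31, 31, 34, 25, 28, 20]                     # explanatory (x)
calories = [300, 260, 320, 420, 530, 560, 550, 590, 500, 560, 440]  # response (y)


def text_scatter(xs, ys, width=40, height=12):
    """Return a rough text scatter plot (hypothetical helper).

    x runs left to right and y runs bottom to top. Assumes xs and ys
    each contain at least two distinct values.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # flip rows so larger y is drawn higher
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width


print(text_scatter(fat, calories))
```

The points should rise from the lower left to the upper right, which is the pattern to look for on the calculator screen as well.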
8 The Correlation Coefficient
- The Correlation Coefficient (r) is a measure of the strength of the linear relationship.
- The values are always between -1 and 1.
- If r = ±1, it is a perfect relationship.
- The closer r is to ±1, the stronger the evidence of a relationship.
9 The Correlation Coefficient
- If r is close to zero, there is little or no evidence of a relationship.
- If the correlation coefficient is over 0.90 in absolute value, it is considered very strong.
- Thus all Correlation Coefficients satisfy -1 ≤ r ≤ 1.
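
The slides rely on the calculator to report r. For readers who want to see where the number comes from, here is a minimal Python sketch that assumes r is the usual Pearson correlation coefficient (the sum of products of deviations divided by the square root of the product of the sums of squared deviations). The function name correlation is chosen for this example, and the data are the sandwich fat and calorie values from the earlier slide.

```python
import math

# Sandwich data from slide 7.
fat = [5, 9, 13, 21, 30, 31, 31, 34, 25, 28, 20]
calories = [300, 260, 320, 420, 530, 560, 550, 590, 500, 560, 440]


def correlation(xs, ys):
    """Pearson correlation coefficient r of two equal-length lists.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = correlation(fat, calories)
print(f"r = {r:.3f}")
```

Points that fall exactly on a rising line give r = 1 and points exactly on a falling line give r = -1, which matches the "perfect relationship" case above.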
10 Salary with a Bachelor's Degree and Age
Age   Salary (in thousands)
22    31
25    35
28    29.5
28    36
31    48
35    52
39    78
45    55.5
49    64
55    85
11 Find the Equation and Correlation Coefficient
- Place data into L1 and L2.
- Hit STAT.
- Over to CALC.
- 4:LinReg(ax+b)
- Is there a High or Low, Positive or Negative correlation?
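
LinReg(ax+b) fits a least-squares line y = ax + b and, with diagnostics on, also reports r. The Python sketch below mirrors that calculation for the age and salary data so the calculator's answer can be cross-checked. The function name lin_reg is hypothetical, and the formulas are the standard least-squares ones rather than anything stated on the slides.

```python
import math

# Age and salary (in thousands) from slide 10.
age = [22, 25, 28, 28, 31, 35, 39, 45, 49, 55]
salary = [31, 35, 29.5, 36, 48, 52, 78, 55.5, 64, 85]


def lin_reg(xs, ys):
    """Least-squares fit y = a*x + b; returns (a, b, r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    a = sxy / sxx                    # slope
    b = mean_y - a * mean_x          # intercept: the line passes through the means
    r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
    return a, b, r


a, b, r = lin_reg(age, salary)
print(f"y = {a:.3f}x + {b:.3f}, r = {r:.3f}")
```

The sign of a (and of r) answers the positive-or-negative question, and the size of r answers the high-or-low question.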
12 Movie Cost vs. Gross (millions)
TITLE                                   COST   U.S. GROSS
1. Titanic (1997)                       200    600.8
2. Waterworld (1995)                    175    88.25
3. Armageddon (1998)                    140    201.6
4. Lethal Weapon 4 (1998)               140    129.7
5. Godzilla (1998)                      125    136
6. Dante's Peak (1997)                  116    67.1
7. Star Wars I Phantom Menace (1999)    110    431
8. Batman and Robin (1997)              110    107
9. Speed 2 (1997)                       110    48
10. Tomorrow Never Dies (1997)          110    125.3
13 Finding the Line of Best Fit
- STAT → CALC → 4:LinReg(ax+b)
- Include the parameters L1, L2, Y1 directly after it.
- (Y1 comes from VARS → Y-VARS, 1:Function, Y1.)
- Hit ENTER; the equation of the Best Fit comes up. Simply hit GRAPH to see it with the scatter.
14 Using the Best-Fit Line to Predict
- Once your line of Best Fit is drawn on the calculator, it can be used to predict other values.
- On the TI-83/84:
- 2nd, CALC
- 1:value
- X= (enter the x value)
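
Storing the regression in Y1 and then using the value command amounts to turning the fitted line into a function and evaluating it at a chosen x. A minimal Python sketch of that idea, using the movie cost and gross data, is shown here. The name fit_line and the $150 million example cost are assumptions made purely for illustration.

```python
# Movie cost and U.S. gross (both in millions) from slide 12.
cost = [200, 175, 140, 140, 125, 116, 110, 110, 110, 110]
gross = [600.8, 88.25, 201.6, 129.7, 136, 67.1, 431, 107, 48, 125.3]


def fit_line(xs, ys):
    """Return the least-squares line as a function y1(x) = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - a * mean_x

    def y1(x):
        # Plays the role of Y1 on the calculator.
        return a * x + b

    return y1


y1 = fit_line(cost, gross)
predicted = y1(150)  # hypothetical film costing $150 million
print(f"Predicted U.S. gross: {predicted:.1f} million")
```

As on the calculator, a prediction is only as trustworthy as the fit, so it is worth checking r before relying on the value.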
15 Hypothesis Testing
- Is there evidence that there is a relationship between the variables?
- To test this we will do a TWO-TAILED t-test.
- Use Table 5 for the level of significance, with d.f. = n - 2 degrees of freedom.
- Compare the answer from the following formula to the table value to determine if you will REJECT a particular correlation.
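
Assuming the formula meant here is the standard test statistic for a correlation coefficient, t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom, the test can be sketched in Python as follows. The example uses the age and salary data. The function names are hypothetical, Table 5 is assumed to be a t-distribution table, and the critical value 2.306 is the usual two-tailed value for a 0.05 significance level with 8 degrees of freedom.

```python
import math

# Age and salary (in thousands) from slide 10.
age = [22, 25, 28, 28, 31, 35, 39, 45, 49, 55]
salary = [31, 35, 29.5, 36, 48, 52, 78, 55.5, 64, 85]


def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """Assumed test statistic for 'no linear correlation', with n - 2 d.f."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


n = len(age)
r = correlation(age, salary)
t = t_statistic(r, n)

# Two-tailed critical value for alpha = 0.05 and d.f. = n - 2 = 8.
# For other significance levels or sample sizes, look the value up in the table.
critical = 2.306
reject = abs(t) > critical
print(f"t = {t:.2f}, d.f. = {n - 2}, reject no-correlation hypothesis: {reject}")
```

When |t| exceeds the table value, the sample gives evidence of a linear relationship at the chosen significance level.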
16 TI-83/84 HELP
- TI Regression Models
- Rules for a Model
- Diagnostics On
- Correlation Coefficient
- Correlation Not Causation
- Residuals and Least Squares
- Graphing Residuals
- Linear Regression
- Linear Regression w/ Bio Data
- Exponential Regression
- Logarithmic Regression
- Power Regression