Title: Transforming relationships
1Transforming relationships
- A scatterplot might show a clear relationship
between two quantitative variables, but
influential points or nonlinearity can prevent us
from using correlation and regression tools.
- Transforming the data, that is, changing the scale in
which one or both of the variables are expressed,
can make the relationship linear
in some cases.
2Transforming Relationships
How is the weight of an animal's brain related to the weight of its
body?
- What can we say about the labeled points?
- Humans and dolphins are smart.
- Hippos are dumb.
- Elephants are outliers in the x-direction
(weight).
3Transforming Relationships
How is the weight of an animal's brain related to the weight of its
body?
The correlation between brain weight and body
weight is r = 0.86, so r^2 = 0.74: about 74% of the variation in
brain weight is explained by the linear relationship with body weight.
What are the influential observations here?
If we remove the influential observations, what
will happen?
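The percentage quoted with the correlation is the square of r. A minimal sketch of that arithmetic follows; the function name `fraction_explained` is a hypothetical helper, not something from the slides.

```python
def fraction_explained(r):
    """Fraction of the variation in y explained by the least-squares line (r squared)."""
    return r ** 2

r_all_species = 0.86  # correlation quoted on the slide
print(f"{fraction_explained(r_all_species):.0%}")  # prints 74%
```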
4Transforming Relationships
How is the weight of an animal's brain related to the weight of its
body?
- Note that removing the x-outlier changed the
scale significantly.
- Does this look like a linear relationship?
- r is now less than 0.50, so r^2 is less than 0.25: less than 25% of the
variation is explained.
- What should we be looking for here if not a linear
form?
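The effect of one influential point on r can be sketched with a small made-up data set. The numbers below are hypothetical, chosen only to imitate the pattern on the slide (a weakly related cluster plus one extreme x-value); they are not the animal data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: six clustered points and one outlier in the x-direction.
body = [1, 2, 3, 4, 5, 6, 50]
brain = [3, 1, 4, 2, 5, 3, 30]

r_all = pearson_r(body, brain)
r_without = pearson_r(body[:-1], brain[:-1])  # drop the last (outlying) point

print(f"with outlier:    r = {r_all:.2f}")
print(f"without outlier: r = {r_without:.2f}")
```

In this sketch the single far-away point carries most of the correlation, so r falls sharply once it is removed.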
5Transforming Relationships
How is the weight of an animal's brain related to the weight of its
body?
- Biologists know that when comparing animals of
different weights, it is often helpful to take
the log of both variables before analysis.
- The relationship between the logged variables is now close to linear, with r = 0.96.
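Why logging both variables can help: if y = c * x^p, then log y = log c + p log x, which is a straight line in the logged variables. The sketch below uses idealized, hypothetical values that follow an exact power law (the constant 10 and the exponent 0.75 are assumptions for illustration), so the logged correlation comes out at essentially 1 rather than the 0.96 seen in the real data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical body weights spanning several orders of magnitude.
body = [0.02, 0.5, 3.0, 30.0, 60.0, 500.0, 2500.0]
# Idealized power law (assumed for illustration): brain = 10 * body^0.75.
brain = [10.0 * w ** 0.75 for w in body]

r_raw = pearson_r(body, brain)
r_log = pearson_r([math.log(w) for w in body],
                  [math.log(b) for b in brain])

print(f"raw scale: r = {r_raw:.4f}")
print(f"log-log:   r = {r_log:.4f}")
```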
6What is transforming a function?
- Transforming, or re-expressing, the data by taking
the square root or logarithm of one or more
variables can help us see the relationship
better.
- We may want to transform the explanatory variable, the response variable
(x or y), or both, so we will call
the variable being transformed t.
7First Steps in Transforming
- Linear changes (such as changing the slope or
y-intercept) were discussed in previous
chapters.
- A linear change cannot straighten a curved
relationship between two variables.
- To do this, we need functions that
are not linear.
- The logarithm function applied in the previous
example is one such function.
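A quick check of the claim that a linear change cannot straighten a curve: the correlation is unchanged by a + bt with b > 0, while a suitable nonlinear function can make the pattern exactly linear. The data and the constants 5.0 and 2.2 below are hypothetical, chosen for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [float(v) for v in range(1, 11)]
y = [v ** 2 for v in x]                     # hypothetical curved relationship

x_rescaled = [5.0 + 2.2 * v for v in x]     # linear change a + bt with b > 0
y_root = [math.sqrt(v) for v in y]          # nonlinear change

r_raw = pearson_r(x, y)
r_linear = pearson_r(x_rescaled, y)         # same as r_raw
r_root = pearson_r(x, y_root)               # curve straightened

print(f"raw: {r_raw:.4f}  after linear change: {r_linear:.4f}  after sqrt: {r_root:.4f}")
```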
8Monotonic Functions
- A monotonic function f(t) moves in one direction
as its argument t increases.
- A monotonic increasing function preserves the
order of the data.
- This means that if a > b before the
transformation, then f(a) > f(b).
- A monotonic decreasing function reverses the
order of the data.
- This means that if a > b before the
transformation, then f(a) < f(b).
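The order-preserving and order-reversing behavior can be checked directly. This is a small sketch with hypothetical positive values; `rank_order` is an illustrative helper, and log t and 1/t stand in for an increasing and a decreasing function.

```python
import math

def rank_order(values):
    """Return the indices of `values` sorted from smallest to largest value."""
    return sorted(range(len(values)), key=lambda i: values[i])

data = [4.0, 0.5, 9.0, 1.0, 2.5]            # hypothetical positive observations

logged = [math.log(t) for t in data]        # monotonic increasing for t > 0
reciprocal = [1.0 / t for t in data]        # monotonic decreasing for t > 0

print(rank_order(data))        # [1, 3, 4, 0, 2]
print(rank_order(logged))      # [1, 3, 4, 0, 2]  (order preserved)
print(rank_order(reciprocal))  # [2, 0, 4, 3, 1]  (order reversed)
```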
9Visualizing the Graphs of some Positive Functions
a + bt, slope b > 0
t^2 (for t > 0)
log t
10Visualizing the Graphs of some Negative Functions
a + bt, slope b < 0
11What is a power function?
- There is a hierarchy, or ladder, of power functions that can help us
know what transformation to choose.
- Power functions t^p for positive powers p are
monotonic increasing for values t > 0. They
preserve the order of the observations. This is
also true of logarithms.
- Power functions t^p for negative powers p are
monotonic decreasing for values t > 0. They
reverse the order of the observations.
12Power Functions
- For t^p, powers p > 1 give graphs that bend upward.
- Powers p < 1 give graphs that bend downward.
13Concavity of Power Functions
- Power transformations for p > 1 are concave upward.
The transformation pushes out the right tail of
a distribution and pulls in the left tail.
- Power transformations for p < 1 are concave
downward (for negative powers, after a sign change such as -1/t
that keeps the function increasing). The transformation pushes out the left
tail of the distribution and pulls in the right
tail.
- These effects get stronger as p gets farther from
1.
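One way to see the tail effects is to transform evenly spaced values and look at the gaps between neighbors. A concave-up function widens the gaps toward the right; a concave-down function widens them toward the left. The values and helper names below are illustrative assumptions.

```python
import math

def gaps(values):
    """Differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]

def is_increasing(seq):
    return all(a < b for a, b in zip(seq, seq[1:]))

def is_decreasing(seq):
    return all(a > b for a, b in zip(seq, seq[1:]))

t = [1, 2, 3, 4, 5]                          # evenly spaced, all gaps equal to 1

squared = [v ** 2 for v in t]                # p = 2, concave up
rooted = [math.sqrt(v) for v in t]           # p = 1/2, concave down
logged = [math.log(v) for v in t]            # logarithm, concave down
neg_recip = [-1.0 / v for v in t]            # p = -1 with a sign change

print(gaps(squared))                         # [3, 5, 7, 9]: right tail pushed out
print([round(g, 3) for g in gaps(rooted)])   # shrinking gaps: right tail pulled in
print([round(g, 3) for g in gaps(logged)])
print([round(g, 3) for g in gaps(neg_recip)])
```

The gaps shrink faster for the log and for -1/t than for the square root, which matches the statement that the effect strengthens as p moves away from 1.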
14Concavity of Power Functions
15Concavity of Power Functions
- Example 4.2
- Does life expectancy change as the
wealth of a nation changes? Figure a is a plot
of life expectancy against GDP.
20Concavity of Power Functions
- We can see that the relationship is not linear, so we
try transforming the data to make it fit a
linear pattern. Because GDP is right-skewed and
very spread out, we will try transforming only GDP.
- Figure b transforms GDP by taking the square
root, but it doesn't help much.
- Figure c transforms GDP by taking the log;
it is a better fit but still bends to the
right.
- Figure d takes the reciprocal of the square root;
it produces the most linear graph with the
highest r value. To avoid reversing the order of the values,
use the negative reciprocal square root, -1/sqrt(GDP).
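The try-and-see procedure in this example can be sketched as a loop over candidate transformations of x, comparing the correlation each one gives. The data below are hypothetical and generated so that -1/sqrt(x) is exactly linear (the constants 85 and 900 are assumptions); they are not the GDP and life expectancy figures, so only the mechanics carry over.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical x values (right-skewed) and a response built from -1/sqrt(x).
x = [500.0, 1000.0, 2000.0, 5000.0, 10000.0, 20000.0, 40000.0]
y = [85.0 - 900.0 / math.sqrt(v) for v in x]

# Candidate transformations, listed from higher to lower on the ladder.
ladder = {
    "x": lambda v: v,
    "sqrt(x)": math.sqrt,
    "log(x)": math.log,
    "-1/sqrt(x)": lambda v: -1.0 / math.sqrt(v),
}

results = {name: pearson_r([f(v) for v in x], y) for name, f in ladder.items()}
for name, r in results.items():
    print(f"{name:>11}: r = {r:.4f}")

best = max(results, key=results.get)
print("most linear:", best)

# Without the minus sign the order is reversed and r changes sign.
r_unsigned = pearson_r([1.0 / math.sqrt(v) for v in x], y)
```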
21Concavity of Power Functions
- As we moved down the ladder of power transformations,
the data got straighter.
- The try-and-see approach isn't very helpful, and
it doesn't tell us anything about the
relationship between GDP and life expectancy. We
will learn a better way to find a mathematical model.