Title: Numerical prediction is similar to classification
1 - (Numerical) prediction is similar to classification
  - construct a model
  - use the model to predict a continuous or ordered value for a given input
- Prediction is different from classification
  - Classification refers to predicting a categorical class label
  - Prediction models continuous-valued functions
- Major method for prediction: regression
  - model the relationship between one or more independent (predictor) variables and a dependent (response) variable
- Regression analysis
  - Linear and multiple regression
  - Non-linear regression
  - Other regression methods: generalized linear model, Poisson regression, log-linear models, regression trees
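As a minimal sketch of the simplest case named above (one predictor variable, one response variable), the following fits a straight line by least squares and uses it to predict a continuous value. The function names and the data are made up for illustration and do not come from the slides.

```python
def fit_simple_linear(xs, ys):
    """Least-squares fit of y = w0 + w1 * x for one predictor variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    w0 = mean_y - w1 * mean_x
    return w0, w1


def predict(w0, w1, x):
    """Predict a continuous response for a new input x."""
    return w0 + w1 * x


# Illustrative data lying exactly on y = 1 + 2x (an assumption, not real data).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
w0, w1 = fit_simple_linear(xs, ys)
```

With this data the fit recovers the intercept 1 and slope 2, so `predict(w0, w1, 5.0)` gives approximately 11.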
16 Nonlinear Regression
- Some nonlinear models can be modeled by a polynomial function
- A polynomial regression model can be transformed into a linear regression model. For example,
  - $y = w_0 + w_1 x + w_2 x^2 + w_3 x^3$
  - is convertible to linear form with the new variables $x_2 = x^2$, $x_3 = x^3$:
  - $y = w_0 + w_1 x + w_2 x_2 + w_3 x_3$
- Other functions, such as the power function, can also be transformed to a linear model
- Some models are intractably nonlinear (e.g., a sum of exponential terms)
  - it is possible to obtain least-squares estimates through extensive calculation on more complex formulae
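The polynomial-to-linear transformation can be sketched as follows: each input x is expanded into the new variables (1, x, x^2, x^3), and ordinary linear least squares is then run on those variables. The solver below uses the normal equations with Gaussian elimination, which is one possible choice rather than anything the slides prescribe; all function names and the data are illustrative assumptions.

```python
def solve(A, b):
    """Solve the square system A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def fit_least_squares(rows, ys):
    """Linear least squares via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(XtX, Xty)


def cubic_features(x):
    """New variables for the linear model: 1, x, x2 = x^2, x3 = x^3."""
    return [1.0, x, x ** 2, x ** 3]


# Illustrative data generated from y = 1 + 2x - x^2 + 0.5x^3 (an assumption).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1 + 2 * x - x ** 2 + 0.5 * x ** 3 for x in xs]
w = fit_least_squares([cubic_features(x) for x in xs], ys)
```

Because the data are noise-free, `w` comes out close to `[1, 2, -1, 0.5]`. The model is nonlinear in x but linear in the weights, which is what makes the linear machinery applicable.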
17 Other Regression-Based Models
- Generalized linear model
  - Foundation on which linear regression can be applied to modeling categorical response variables
  - Variance of y is a function of the mean value of y, not a constant
  - Logistic regression: models the probability of some event occurring, with the log-odds a linear function of a set of predictor variables
  - Poisson regression: models data that exhibit a Poisson distribution
- Log-linear models (for categorical data)
  - Approximate discrete multidimensional probability distributions
  - Also useful for data compression and smoothing
- Regression trees and model trees
  - Trees to predict continuous values rather than class labels
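To illustrate how a tree can predict continuous values rather than class labels, here is a minimal sketch of a one-split regression tree (a "stump"). It picks the threshold that minimizes the summed squared error and predicts the mean response on each side. A full regression tree would apply this split recursively; the names and data below are hypothetical, and the sketch assumes at least two distinct x values.

```python
def sse(values):
    """Sum of squared errors around the mean of values."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def fit_stump(xs, ys):
    """Find the single threshold on x that minimizes total squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        err = sse(left) + sse(right)
        if best is None or err < best[0]:
            best = (err, threshold, sum(left) / len(left), sum(right) / len(right))
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    """Each leaf predicts the mean response of its training instances."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


# Illustrative data with two clearly separated groups (an assumption).
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]
stump = fit_stump(xs, ys)
```

On this data the chosen threshold is 6.5, with leaf predictions 6.0 and 21.0.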
18 Classification
- Any regression technique can be used for classification
  - Training: perform a regression for each class, setting the output to 1 for training instances that belong to the class, and 0 for those that don't
  - Prediction: predict the class corresponding to the model with the largest output value (membership value)
- For linear regression this is known as multi-response linear regression
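The training and prediction steps of multi-response linear regression can be sketched as below: one least-squares model is fitted per class on 0/1 targets, and the class whose model gives the largest output wins. The solver (normal equations with Gaussian elimination) is an implementation choice made for this sketch, and the function names and toy data are hypothetical.

```python
def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def fit_least_squares(rows, ys):
    """Linear least squares via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(XtX, Xty)


def multi_response_fit(X, labels):
    """Training: one regression per class, target 1 for members and 0 otherwise."""
    rows = [[1.0] + list(x) for x in X]  # leading 1.0 is the intercept term
    models = {}
    for c in sorted(set(labels)):
        targets = [1.0 if lab == c else 0.0 for lab in labels]
        models[c] = fit_least_squares(rows, targets)
    return models


def multi_response_predict(models, x):
    """Prediction: the class whose model has the largest output (membership value)."""
    row = [1.0] + list(x)
    scores = {c: sum(wi * ri for wi, ri in zip(w, row)) for c, w in models.items()}
    return max(scores, key=scores.get)


# Toy data: three well-separated groups in two dimensions (an assumption).
X = [(0, 0), (1, 0), (0, 1), (4, 0), (5, 0), (4, 1), (0, 4), (1, 4), (0, 5)]
labels = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
models = multi_response_fit(X, labels)
```

The membership values are not probabilities: they can fall below 0 or above 1, which is one of the reasons logistic regression is often preferred for classification.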
19 Discussion of linear models
- Not appropriate if data exhibits non-linear dependencies
- But can serve as building blocks for more complex schemes
- Example: multi-response linear regression defines a hyperplane for any two given classes
22 (Odds can also be found by counting the number of people in each group and dividing one number by the other. Clearly, the probability is not the same as the odds.) In our example, the odds of being male would be .90/.10, or 9 to one. The odds of being female would be .10/.90, or 1/9, or .11. This asymmetry is unappealing, because the odds of being male should be the opposite of the odds of being female. We can take care of this asymmetry through the natural logarithm, ln. The natural log of 9 is 2.197 (ln(.9/.1) = 2.197). The natural log of 1/9 is -2.197 (ln(.1/.9) = -2.197), so the log odds of being male is exactly opposite to the log odds of being female.
[Figure: plot of the natural log function]
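The arithmetic in this example can be checked with a few lines of Python. This is only a sketch; the function names are made up, and the probability of .90 for being male is the figure used in the example.

```python
import math


def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)


def log_odds(p):
    """Natural log of the odds (the logit of p)."""
    return math.log(odds(p))


p_male = 0.90
p_female = 1.0 - p_male

odds_male = odds(p_male)        # about 9
odds_female = odds(p_female)    # about 1/9, i.e. roughly .11
logit_male = log_odds(p_male)   # ln 9, about 2.197
logit_female = log_odds(p_female)
```

The two odds values are reciprocals of each other, while the two log-odds values differ only in sign, which is the symmetry the logarithm restores.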
23 In logistic regression, the dependent variable is a logit, which is the natural log of the odds, that is, $\operatorname{logit}(P) = \ln\left(\frac{P}{1-P}\right)$, where $P$ is the probability of the event.
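Because the logit is the natural log of the odds, it can be inverted to recover the probability. A short worked step, writing $z$ (a symbol assumed here, not taken from the slides) for the value of the logit:

```latex
\begin{aligned}
\ln\frac{P}{1-P} &= z \\
\frac{P}{1-P} &= e^{z} \\
P &= \frac{e^{z}}{1+e^{z}} = \frac{1}{1+e^{-z}}
\end{aligned}
```

For instance, a logit of $z = \ln 9$ corresponds to odds of 9 and hence $P = 9/10 = 0.9$.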
25 Logistic regression
- Problem: some assumptions are violated when linear regression is applied to classification problems
- Logistic regression: alternative to linear regression
  - Designed for classification problems
  - Tries to estimate class probabilities directly
  - Does this using the maximum likelihood method
  - Uses this linear model: $\ln\frac{P}{1-P} = w_0 + w_1 x_1 + \dots + w_k x_k$, where $P$ is the class probability
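A minimal sketch of maximum likelihood estimation for logistic regression follows. It assumes two classes coded 0 and 1 and uses plain gradient ascent on the log-likelihood, which is one simple way to maximize it and not necessarily the method used in practice by any given package. The function names, step size, iteration count, and data are all illustrative assumptions.

```python
import math


def sigmoid(z):
    """Map a log-odds value z to a probability, in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, learning_rate=0.1, n_iter=5000):
    """Gradient ascent on the log-likelihood of a linear model for the log-odds."""
    rows = [[1.0] + list(x) for x in X]  # leading 1.0 is the intercept term
    w = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(n_iter):
        grad = [0.0] * len(w)
        for row, target in zip(rows, y):
            p = sigmoid(sum(wi * ri for wi, ri in zip(w, row)))
            # Gradient of the log-likelihood: (target - p) times each input.
            for j, rj in enumerate(row):
                grad[j] += (target - p) * rj
        w = [wi + learning_rate * g / n for wi, g in zip(w, grad)]
    return w


def predict_proba(w, x):
    """Estimated probability that x belongs to class 1."""
    row = [1.0] + list(x)
    return sigmoid(sum(wi * ri for wi, ri in zip(w, row)))


# Illustrative, deliberately non-separable data (an assumption).
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 1, 0, 1, 1]
w = fit_logistic(X, y)
```

Unlike the outputs of a linear regression on 0/1 targets, the values returned by `predict_proba` always lie strictly between 0 and 1.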