Title: Supervised Machine Learning: Classification Techniques
1Supervised Machine LearningClassification
Techniques
- Chaleece Sandberg
- Chris Bradley
- Kyle Walsh
2Supervised Machine Learning
- SML machine performs function (e.g.,
classification) after training on a data set
where inputs and desired outputs are provided - Following training, SML algorithm is able to
generalize to new, unseen data - Application Data Mining
- Often, large amounts of data must be handled
efficiently - Look for relevant information, patterns in data
3Decision Trees
- Logic-based algorithm
- Sort instances (data) according to feature
valuesa hierarchy of tests - Nodes features
- Root node feature that best divides data
- Algorithms exist for determining the best root
node - Branches values the node can assume
4Decision Trees, an example
INPUT data (symptom)
OUTPUT category (condition)
5Decision Trees Assessment
- Advantages
- Classification of data based on limiting
- features is intuitive
- Handles discrete/categorical features best
- Limitations
- Danger of overfitting the data
- Not the best choice for accuracy
6Bayesian Networks
- Graphical algorithm that encodes the joint
probability distribution of a data set - Captures probabilistic relationships between
variables - Based on probability that instances (data) belong
in each category
7Bayesian Networks, an example
Wikipedia, 2008
8Bayesian Networks Assessment
- Advantages
- Takes into account prior information regarding
relationships among features - Probabilities can be updated based on outcomes
- Fast!with respect to learning classification
- Can handle incomplete sets of data
- Avoids overfitting of data
- Limitations
- Not suitable for data sets with many features
- Not the best choice for accuracy
9Neural Networks
- Used for
- Classification
- Noise reduction
- Prediction
- Great because
- Able to learn
- Able to generalize
- Kiran
- Plauts (1996) semantic neural network that could
be lesioned and retrained useful for predicting
treatment outcomes - Mikkulainen
- Evolving neural network that could adapt to the
gaming environment useful learning application
10Neural Networks Biological Basis
11Feed-forward Neural Network
Perceptron
Hidden layer
12Neural Networks Training
- Presenting the network with sample data and
modifying the weights to better approximate the
desired function. - Supervised Learning
- Supply network with inputs and desired outputs
- Initially, the weights are randomly set
- Weights modified to reduce difference between
actual and desired outputs - Backpropagation
13Backpropagation
14Support Vector Machines
15Perceptron Revisited
- Linear Classifier y(x) sign(w.x b)
16Which one is the best?
17Notion of Margin
- Distance from a data point to the hyperplane
- Data points closest to the boundary are called
support vectors - Margin d is the distance between two classes.
18Maximizing Margin
- Maximizing margin is a quadratic optimization
problem. - Quadratic optimization problems are a well-known
class of mathematical programming problems, and
many (rather intricate) algorithms exist for
solving them.
19Kernel Trick
- What if the dataset is non-linearly separable?
- We use a kernel to map the data to a
higher-dimensional space
20Non-linear SVMs Feature spaces
- General idea The original space can always be
mapped to some higher-dimensional feature space
where the training set becomes separable
21Examples of Kernel Trick
- For the example in the previous figure
- The non-linear mapping
-
- A more commonly used radial basis function (RBF)
kernel
22(No Transcript)
23Advantages and Applications of SVM
- Advantages of SVM
- Unlike neural networks, the class boundaries
dont change as the weights change. - Generalizability is high because margin is
maximized. - No local minima and robustness to outliers.
- Applications of SVM
- Used in almost every conceivable situation where
automatic classification of data is needed. - (example from class) Raymond Mooney and his
KRISPER natural language parser.
24The Future of Supervised Learning (1)
- Generation of synthetic data
- A major problem with supervised learning is the
necessity of having large amounts of training
data to obtain a good result. - Why not create synthetic training data from real,
labeled data? - Example use a 3D model to generate multiple 2D
images of some object (such as a face) under
different conditions (such as lighting). - Labeling only needs to be done for the 3D model,
not for every 2D model.
25The Future of Supervised Learning (2)
- Future applications
- Personal software assistants learning from past
usage the evolving interests of their users in
order to highlight relevant news (e.g., filtering
scientific journals for articles of interest) - Houses learning from experience to optimize
energy costs based on the particular usage
patterns of their occupants - Analysis of medical records to assess which
treatments are more effective for new diseases - Enable robots to better interact with humans
26References
- http//homepage.psy.utexas.edu/homepage/class/Psy3
94U/Hayhoe/cognitive20science202008/talksreadin
gs/ - http//www.ai-junkie.com/ann/evolved/nnt1.html
- http//galaxy.agh.edu.pl/vlsi/AI/backp_t_en/backp
rop.html - http//cbcl.mit.edu/cbcl/people/heisele/huang-blan
z-heisele.pdf - http//www.grappa.univ-lille3.fr/gilleron/introML
.pdf