Title: CS 478 Machine Learning
1CS 478 - Machine Learning
2A Practical Definition
- Learning is "any change in a system that allows it
to perform better the second time on repetition of
the same task or on another task drawn from the same
population" (Herbert Simon)
3- What would be the potential benefits of computers
capable of learning?
4Benefits of Machine Learning (I)
- Learning overcomes the knowledge acquisition
bottleneck.
- Alternative to hard-coded heuristics (e.g., ES
and Minimax)
- Fully re-programmable vs read-only knowledge
bases
5Benefits of Machine Learning (II)
- Some algorithms are difficult to develop.
- Only partial specs and some examples may be
available.
- Some things may not be computable in the
traditional sense.
6Benefits of Machine Learning (III)
- Learning is more robust than programming.
- No redesign or recoding is ever necessary.
- Only new information needs be given and the
system adapts to it (much as humans do).
7Benefits of Machine Learning (IV)
- Learning is more adaptive than programming.
- Only specific instances need be given.
- Behavior arises as a result of reaction to these
instances.
8Benefits of Machine Learning (V)
- Machines are less vulnerable.
- No feelings or emotions (i.e., subjectivity).
- Not subject to most human hazards (e.g., disease,
radioactivity).
9Benefits of Machine Learning (VI)
- Machines are fast.
- Can be expected to perform learned tasks more
efficiently than their human counterparts.
10Benefits of Machine Learning (VII)
- Constructing artificial learning systems offers
insight into the learning process.
- Cognitive psychology.
- Improved teaching methods.
11Benefits of Machine Learning (VIII)
- It is Exciting!
- Intellectually challenging.
- Lots of neat applications.
12 13Survey and Online Game (I)
14Survey and Online Game (II)
Simple or Complex
0-13136      Poor         21
13136-19453  Fair         91
19453-25769  Good         90
25769-32086  Excellent    39
32086        Outstanding  15
15Nosocomial Infection Detection (I)
- Nosocomial infections are estimated to affect
6-12% of hospitalised patients
- These infections have significant effects on
mortality, mean length of hospital stay and
antibiotics usage, and result in many 100000
annual cost to the NHS in the UK
- Modern hospitals may have more than 500 beds and
laboratories may receive in excess of 100000
specimens per annum
- Clues to incidents are easily lost in the vast
amount of data
- No single member of the laboratory team sees all
reports
- It is less likely that a single staff member will
handle several specimens from an outbreak
16Nosocomial Infection Detection (II)
- There have been sporadic clusters of colonisation
with a few cases of infection from 1995 to 1999.
The strains involved were mostly identified to
the species Klebsiella aerogenes and showed
resistance to multiple antibiotics. The data
downloaded as input for development of the
cross-infection detection program included one of
these clusters. This was not actually called as
an outbreak, because small numbers of patients
were involved, and the organisms were identified
as multi-resistant Klebsiella oxytoca, rather
than Klebsiella aerogenes. However, in
retrospect, these organisms had closely similar
antibiograms and biochemical patterns, and
probably represented a cluster of nosocomial
colonisation/infection. This cluster was
strikingly obvious in the teaching set output
from the detection program.
17- How does learning take place?
18Learning Modes
- By being told (i.e., programming).
- By analogy (i.e., seeking similarities within or
across domains).
- By induction (i.e., directly from instances).
- In this class, we focus on inductive learning.
19Inductive Learning
- Induction is a process that involves
intellectual leaps from the particular to the
general.
- Simple illustrations
- Card Game
- Play Tennis
20Card Game
[Card images not reproduced; the cards are labeled Y, Y, Y, N, N, N, Y, Y, N]
IF BG green OR has only 2 ellipses THEN Y ELSE N
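The card-game rule can be written as a small predicate. This is only a sketch: the `Card` class and its `background` and `ellipses` attributes are hypothetical names introduced here, and "has only 2 ellipses" is assumed to mean exactly two.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    background: str  # hypothetical attribute, e.g. "green" or "white"
    ellipses: int    # hypothetical attribute: number of ellipses on the card


def in_class(card: Card) -> str:
    """Apply the slide's rule: green background OR exactly 2 ellipses gives 'Y'."""
    if card.background == "green" or card.ellipses == 2:
        return "Y"
    return "N"
```

For instance, `in_class(Card("white", 3))` falls through both conditions and is labeled "N".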
21Play Tennis
What is the general concept?
22Rote Learning
- Until you discover the rule/concept(s), the very
BEST you can ever expect to do is
- Remember what you observed
- Guess on everything else
- No better than MEMORISATION
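A rote learner of this kind might look like the sketch below. The class name `RoteLearner` and its methods are hypothetical, and instances are assumed to be hashable (e.g. tuples of attribute values).

```python
import random


class RoteLearner:
    """Remembers observed instances; guesses at random on anything else."""

    def __init__(self, labels=("Y", "N"), seed=0):
        self.memory = {}                 # instance -> label seen during observation
        self.labels = labels
        self.rng = random.Random(seed)   # seeded so that guesses are repeatable

    def observe(self, instance, label):
        self.memory[instance] = label

    def predict(self, instance):
        if instance in self.memory:
            return self.memory[instance]      # remembered exactly
        return self.rng.choice(self.labels)   # no generalization: pure guess
```

Nothing in `predict` uses the structure of an unseen instance, which is why this learner cannot beat memorisation.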
23Induction
- What you do when you accurately predict
- Whether the next card is in the defined class
- Whether today is good for playing tennis
- i.e., when you generalize from your observations
- Claim: all (or most) of the laws of nature were
discovered by inductive reasoning
24 25System Architecture
[Diagram components: Input; Ideal System, producing the Desired Output;
Performer, producing the Actual Output; Knowledge Base; Critic, receiving
the (Desired) and (Actual) outputs; Learner]
26System Components
- KB: current expertise
- Performer: algorithm that uses the KB to guide
its problem-solving activity
- Critic: feedback module (e.g., compares actual
with desired, measures goodness)
- Learner: mechanism that uses information from the
Critic to update the KB
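One way to picture how the four components interact is the toy loop below. It is an illustrative sketch, not the course's implementation: the KB is assumed to be a single numeric threshold, the ideal system is a hypothetical target (output 1 for inputs of 5 or more), and the update rule and learning rate are arbitrary choices.

```python
def ideal_system(x):
    """Hypothetical target behaviour: output 1 for inputs of 5 or more."""
    return int(x >= 5)


def performer(kb, x):
    """Uses the KB (here, one threshold) to produce the actual output."""
    return int(x >= kb["threshold"])


def critic(desired, actual):
    """Compares desired with actual output; 0 means they agree."""
    return desired - actual


def learner(kb, error, rate=0.5):
    """Updates the KB from the Critic's feedback (no change when error is 0)."""
    kb["threshold"] -= rate * error


kb = {"threshold": 0.0}          # initial, uninformed expertise
for _ in range(20):              # repeated passes over the same inputs
    for x in range(10):
        actual = performer(kb, x)
        error = critic(ideal_system(x), actual)
        learner(kb, error)
```

After these passes the performer agrees with the ideal system on every input in `range(10)`; only the KB changed, not the performer's code.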
27General Taxonomy
- Supervised learning
- Critic Desired Output
- E.g., classification, program synthesis
- Unsupervised learning
- Critic only
- E.g., clustering, genetic algorithms
28KB and Learner
- Symbolic, rule-like KB and Learner
- Traditional ML
- Sub-symbolic, connectionist KB and Learner
- Neural Networks
- Symbolic/Connectionist KB and/or Learner
- Hybrid approaches
29ML Approaches (I)
- Supervised Learning
- Symbolic Methods
- TDIDT: ID3, C4.5, OC1, etc.
- Sequential covering: CN2, PROGOL, etc.
- IBL: k-NN, IBk, etc.
- Probabilistic: Naïve Bayes, etc.
- Connectionist Methods
- Perceptron
- Backpropagation NN
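As an illustration of one supervised method from this list, here is a minimal k-NN sketch. It assumes numeric feature tuples, Euclidean distance, and majority voting; the function name `knn_predict` and the toy data in the usage note are made up for the example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training examples.

    `train` is a list of (features, label) pairs with numeric feature tuples.
    """
    neighbours = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

With training points clustered near (0, 0) labeled "a" and near (5, 5) labeled "b", a query such as (0.2, 0.1) would be labeled "a".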
30ML Approaches (II)
- Unsupervised Learning
- Symbolic Methods
- K-Means
- EM
- COBWEB
- Etc.
- Connectionist Methods
- Competitive Learning
- Kohonen SOM
- Etc.
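For the unsupervised case, K-Means can be sketched in a few lines. This version makes simplifying assumptions that are not from the slides: centroids are initialized to the first k points, the loop runs for a fixed number of iterations, and an empty cluster keeps its previous centroid.

```python
import math


def kmeans(points, k, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # leave the centroid unchanged if its cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters
```

No desired output is supplied anywhere; the grouping comes only from distances between the points themselves.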
31ML Approaches (III)
- Association Learning
- Apriori
- GRI
- Etc.
- Genetic Algorithm
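The pruning idea behind Apriori-style association learning can be shown on item pairs: a pair can only be frequent if both of its items are frequent. The sketch below covers just that first level, not the full algorithm; the function name `frequent_pairs` and the absolute `min_support` count are assumptions for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs occurring in at least `min_support` transactions."""
    # Pass 1: count single items and keep only the frequent ones.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pass 2: count pairs built only from frequent items (the pruning step).
    pair_counts = Counter()
    for t in transactions:
        kept = sorted(set(t) & frequent_items)
        pair_counts.update(combinations(kept, 2))
    return {pair: c for pair, c in pair_counts.items() if c >= min_support}
```

Items that fail the first pass never contribute candidate pairs, which keeps the second pass small.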
32- Other issues we will (partially) address
33General Issues
- Performance evaluation
- Model selection
- Overfitting
- Data representation
- Learnability
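Performance evaluation and overfitting can be previewed with a held-out test set. In the sketch below, everything is a made-up toy: the data (label 1 when x is 10 or more), the even/odd split, and the two models, a lookup-table memoriser that answers 0 on unseen inputs and a simple threshold rule.

```python
def accuracy(predict, examples):
    """Fraction of (x, y) examples for which predict(x) equals y."""
    correct = sum(1 for x, y in examples if predict(x) == y)
    return correct / len(examples)


# Toy data: label is 1 when x >= 10. Even x for training, odd x held out.
data = [(x, int(x >= 10)) for x in range(20)]
train = [ex for ex in data if ex[0] % 2 == 0]
test = [ex for ex in data if ex[0] % 2 == 1]

lookup = dict(train)


def memoriser(x):
    """Perfect recall of training data; answers 0 for anything unseen."""
    return lookup.get(x, 0)


def threshold_rule(x):
    """A general rule that does not depend on having seen x before."""
    return int(x >= 10)
```

Both models are perfect on `train`, but on `test` the memoriser is right only half the time while the threshold rule stays perfect, so training accuracy alone says little about how well a model generalizes.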