Title: Knowledge Representation and Machine Learning
1. Knowledge Representation and Machine Learning
2. Overview
- Recap some Knowledge Rep.
- History
- First order logic
- Machine Learning
- ANN
- Bayesian Networks
- Reinforcement Learning
- Summary
3. Knowledge Representation?
- Ambiguous term
- The study of how to put knowledge into a form that a computer can reason with (Russell and Norvig)
- Originally coupled with linguistics
- Led to philosophical analysis of language
4. Knowledge Representation?
- Cool Robots
- Futuristic Robots
5. Early Work
- SAINT (1963)
- Closed form Calculus Problems
- STUDENT (1967)
- If the number of customers Tom gets is twice the square of 20 percent of the number of advertisements he runs, and the number of advertisements he runs is 45, what is the number of customers Tom gets?
- Blocks World (1972)
- SHRDLU
- Find a block which is taller than the one you
are holding and put it in the box
6. Early Work - Theme
- Limit domain
- Microworlds
- Allows precise rules
- Generality
- Problem Size
- 1) Making rules is hard
- 2) State space is unbounded
7. Generality
- First-order Logic
- Is able to capture simple Boolean relations and facts
- ∀x ∀y Brother(x,y) ⇒ Sibling(x,y)
- ∃x ∀y Loves(x,y)
- Can capture lots of commonsense knowledge
- Not a cure-all
8. First-Order Logic - Problems
- Faithfully captures facts, objects, and relations
- Problems
- Does not capture temporal relations
- Does not handle probabilistic facts
- Does not handle facts w/ degrees of truth
- Has been extended to
- Temporal logic
- Probability theory
- Fuzzy logic
9. First-Order Logic - Bigger Problem
- Still lots of human effort
- Knowledge Engineering
- Time consuming
- Difficult to debug
- Size still a problem
- Automated acquisition of knowledge is important
10. Machine Learning
- Sidesteps all of the previous problems
- Represents knowledge in a way that is immediately useful for decision making
- 3 specific examples:
- Artificial Neural Networks (ANN)
- Bayesian Networks
- Reinforcement Learning
11. Artificial Neural Networks (ANNs)
- 1st work in AI (McCulloch & Pitts, 1943)
- Attempt to mimic brain neurons
- Several binary inputs, One binary output
12. Artificial Neural Networks (ANNs)
- Can be chained together to
- Represent logical connectives (and, or, not)
- Compute any computable function
- Hebb (1949) introduced simple rule to modify
connection strength (Hebbian Learning)
13. Single-Layer Feed-Forward ANNs (Perceptrons)
[Figure: input layer connected directly to a single output unit]
- Can easily represent otherwise complex (linearly separable) functions (sketched below)
- And, Or
- Majority Function
- Can learn based on gradient descent
- Cannot tell if 2 inputs are different (XOR)! (Minsky, 1969)
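As a concrete illustration (not from the original slides), the weights and thresholds below are hand-picked values that make a single threshold unit compute And, Or, and the 3-input Majority function:

    # A single threshold unit: fire when the weighted sum crosses the threshold.
    def perceptron(weights, bias, inputs):
        total = bias + sum(w * x for w, x in zip(weights, inputs))
        return 1 if total > 0 else 0

    AND = lambda x: perceptron([1, 1], -1.5, x)      # needs both inputs on
    OR  = lambda x: perceptron([1, 1], -0.5, x)      # any input suffices
    MAJ = lambda x: perceptron([1, 1, 1], -1.5, x)   # at least 2 of 3 inputs on

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, AND([a, b]), OR([a, b]), MAJ([a, b, 1]))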
14. Learning in Perceptrons
- Replace threshold function with sigmoid g(x)
- Define error metric (sum of squared differences)
- Calculate gradient with respect to each weight:
- gradient ∝ Err × g′(in) × x_j
- W_j ← W_j + α × Err × g′(in) × x_j (sketched below)
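A minimal sketch of this update rule (not from the original slides), assuming a sigmoid unit trained on the sum-of-squared-errors metric; the seed, learning rate, and epoch count are arbitrary illustrative choices:

    import math, random

    def g(x):          # sigmoid activation replacing the hard threshold
        return 1.0 / (1.0 + math.exp(-x))

    def train_perceptron(examples, n_inputs, alpha=0.5, epochs=1000):
        """Gradient descent on the sum of squared errors, per the rule above."""
        w = [random.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]  # w[0] = bias
        for _ in range(epochs):
            for x, y in examples:
                xs = [1.0] + list(x)                    # prepend the bias input
                in_ = sum(wj * xj for wj, xj in zip(w, xs))
                err = y - g(in_)                        # Err = y - g(in)
                g_prime = g(in_) * (1.0 - g(in_))       # g'(in) for the sigmoid
                for j in range(len(w)):                 # W_j <- W_j + a*Err*g'(in)*x_j
                    w[j] += alpha * err * g_prime * xs[j]
        return w

    random.seed(0)
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]   # the Or function
    w = train_perceptron(data, 2)
    print([round(g(w[0] + w[1]*a + w[2]*b)) for (a, b), _ in data])  # [0, 1, 1, 1]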
15. Multi-Layer Feed-Forward ANNs
[Figure: input layer, hidden layer, and output unit]
- Breaks free of the limitations of perceptrons
- Simple gradient descent no longer works for learning
16. Learning in Multi-Layer ANNs (1/2)
- Backpropagation
- Treat top level just like single-layer ANN
- Diffuse error down network based on input
strength from each hidden node
17. Learning in Multi-Layer ANNs (2/2)
- Δ_i = Err_i × g′(in_i)
- W_j,i ← W_j,i + α × a_j × Δ_i
- Δ_j = g′(in_j) × Σ_i W_j,i × Δ_i
- W_k,j ← W_k,j + α × a_k × Δ_j (sketched below)
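A toy sketch of these updates (not the slides' code) on a 2-input, 3-hidden-unit, 1-output network learning XOR, the "are two inputs different?" function a single perceptron cannot represent; the network shape, seed, learning rate, and epoch count are illustrative assumptions:

    import math, random

    def g(x): return 1.0 / (1.0 + math.exp(-x))

    def forward(W_hid, W_out, x1, x2):
        xs = [1.0, x1, x2]                                  # bias plus the 2 inputs
        a = [g(sum(w * v for w, v in zip(row, xs))) for row in W_hid]
        out = g(sum(w * v for w, v in zip(W_out, [1.0] + a)))
        return xs, a, out

    random.seed(1)
    W_hid = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
    W_out = [random.uniform(-1, 1) for _ in range(4)]
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]   # XOR
    alpha = 0.5

    for _ in range(20000):
        for (x1, x2), y in data:
            xs, a, out = forward(W_hid, W_out, x1, x2)
            d_out = (y - out) * out * (1 - out)          # Delta_i = Err_i * g'(in_i)
            d_hid = [a[j] * (1 - a[j]) * W_out[j + 1] * d_out  # error diffused down
                     for j in range(3)]                        # via the weights W_j,i
            hs = [1.0] + a
            for j in range(4):
                W_out[j] += alpha * hs[j] * d_out        # W_j,i += alpha*a_j*Delta_i
            for j in range(3):
                for k in range(3):
                    W_hid[j][k] += alpha * xs[k] * d_hid[j]  # W_k,j += alpha*a_k*Delta_j

    print([round(forward(W_hid, W_out, x1, x2)[2]) for (x1, x2), _ in data])
    # usually [0, 1, 1, 0]; convergence depends on the random initialization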
18. ANNs - Summary
- Single-layer ANNs (perceptrons) can capture linearly separable functions
- Multi-layer ANNs can capture much more complex functions and can be effectively trained using back-propagation
- Not a silver bullet
- How to avoid over-fitting?
- What shape should the network be?
- Network values are meaningless to humans
19. ANNs in Robots (Simple)
- Can easily be set up as a robot brain
- Input Sensors
- Output Motor Control
- Simple Robot learns to avoid bumps
20. ANNs in Robots (Complex)
- Autonomous Land Vehicle In a Neural Network (ALVINN)
- CMU project that learned to drive from humans
- 32x30 retina
- 5 hidden units
- 30 output nodes
- Capable of driving itself after 2-3 minutes of training
21. Bayesian Networks
- Combines advantages of basic logic and ANNs
- Allows for efficient representation of, and rigorous reasoning with, uncertain knowledge (Russell and Norvig)
- Allows for learning from experience
22. Bayes' Rule
- P(b|a) = P(a|b)P(b)/P(a) = α⟨P(a|b)P(b), P(a|¬b)P(¬b)⟩
- Meningitis example (from Russell and Norvig)
- s = stiff neck, m = meningitis
- P(s|m) = 0.5
- P(m) = 1/50000
- P(s) = 1/20
- P(m|s) = P(s|m)P(m)/P(s) (checked in the sketch below)
- = 0.5 × (1/50000) / (1/20)
- = 0.0002
- Diagnostic knowledge more fragile than causal
knowledge
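The numbers above can be checked directly; this is just the arithmetic of Bayes' rule from the slide, not a general implementation:

    # P(m|s) = P(s|m) * P(m) / P(s), with the slide's values
    p_s_given_m = 0.5          # stiff neck given meningitis
    p_m = 1 / 50000            # prior on meningitis
    p_s = 1 / 20               # prior on stiff neck
    print(p_s_given_m * p_m / p_s)   # 0.0002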
23. Bayesian Networks
[Figure: two-node network M → S, with CPTs P(M) = 1/50000; P(S|M=T) = 0.5, P(S|M=F) = 1/20]
- Allows us to chain together more complex relations
- Creating the network is not necessarily easy
- Create a fully connected network
- Cluster groups w/ high correlation together
- Find probabilities using rejection sampling (see the sketch below)
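A minimal sketch of rejection sampling on the two-node M → S network above, estimating P(m|s); the sample count is an arbitrary choice, and the rare prior on M makes the estimate noisy, a well-known weakness of rejection sampling:

    import random

    def sample_once():
        m = random.random() < 1 / 50000               # sample M from its prior
        s = random.random() < (0.5 if m else 1 / 20)  # then sample S given M
        return m, s

    def estimate_p_m_given_s(n=2_000_000):
        kept = hits = 0
        for _ in range(n):
            m, s = sample_once()
            if s:          # reject samples inconsistent with the evidence S = true
                kept += 1
                hits += m
        return hits / kept

    print(estimate_p_m_given_s())   # approaches P(m|s) = 0.0002 as n grows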
24. Bayesian Networks (Temporal Models)
- More complex Bayesian networks are possible
- Time can be taken into account
- Imagine predicting if it will rain tomorrow,
based only on if your co-worker brings in an
umbrella
25. Bayesian Networks (Temporal Models)
- 4 possible inference tasks based on this knowledge
- Filtering: computing belief as to the current state
- Prediction: computing belief about a future state
- Smoothing: improving knowledge of past states using hindsight (forward-backward algorithm)
- Most likely explanation: finding the single most likely explanation for a set of observations (Viterbi)
26. Bayesian Networks (Temporal Models)
- Assume you see the umbrella 2 days in a row (U1 = 1, U2 = 1)
- P(R0) = ⟨0.5, 0.5⟩ (0.5 for R0 = T, 0.5 for R0 = F)
- P(R1) = P(R1|R0)P(R0) + P(R1|¬R0)P(¬R0)
- = 0.7×0.5 + 0.3×0.5 = ⟨0.5, 0.5⟩
- P(R1|U1) = α P(U1|R1)P(R1)
- = α⟨0.9×0.5, 0.2×0.5⟩ = α⟨0.45, 0.1⟩
- = ⟨0.818, 0.182⟩
27. Bayesian Networks (Temporal Models)
- Assume you see the umbrella 2 days in a row (U1 = 1, U2 = 1)
- P(R2|U1) = P(R2|R1)P(R1|U1) + P(R2|¬R1)P(¬R1|U1)
- = 0.7×0.818 + 0.3×0.182 = 0.627 → ⟨0.627, 0.373⟩
- P(R2|U2,U1) = α P(U2|R2)P(R2|U1)
- = α⟨0.9×0.627, 0.2×0.373⟩ = α⟨0.565, 0.075⟩ = ⟨0.883, 0.117⟩
- On the 2nd day of seeing the umbrella we are more confident that it is raining (reproduced in the sketch below)
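These two filtering steps can be reproduced in a few lines; the transition and sensor probabilities are the ones from the umbrella example above:

    # Transition: P(R_t | R_t-1) = 0.7, P(R_t | not R_t-1) = 0.3
    # Sensor:     P(U | R) = 0.9,       P(U | not R) = 0.2
    def filter_step(belief, umbrella_seen):
        r, nr = belief
        pr = 0.7 * r + 0.3 * nr           # predict: push belief through transitions
        pnr = 0.3 * r + 0.7 * nr
        like_r, like_nr = (0.9, 0.2) if umbrella_seen else (0.1, 0.8)
        w_r, w_nr = like_r * pr, like_nr * pnr   # update: weight by the sensor model
        z = w_r + w_nr                           # then normalize
        return (w_r / z, w_nr / z)

    belief = (0.5, 0.5)                   # P(R0)
    for u in (True, True):                # umbrella seen on days 1 and 2
        belief = filter_step(belief, u)
        print(tuple(round(b, 3) for b in belief))
    # (0.818, 0.182) then (0.883, 0.117)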
28. Bayesian Networks - Summary
- Bayesian networks are able to capture some important aspects of human knowledge representation and use
- Uncertainty
- Adaptation
- Still difficulties in network design
- Overall a powerful tool
- Meaningful values in network
- Probabilistic logical reasoning
29. Bayesian Networks in Robotics
- Speech Recognition
- Inference
- Sensors
- Computer Vision
- SLAM
- Estimating Human Poses
30. Reinforcement Learning
- How much can we take the human out of loop?
- How do humans/animals do it?
- Genes
- Pain
- Pleasure
- Simply define rewards/punishments and let the agent figure out all the rest
31. Reinforcement Learning - Example
[Figure: 4x3 grid world with a start state, a +1 goal, and a -1 pitfall]
- R(s) = reward of state s
- R(Goal) = 1
- R(pitfall) = -1
- R(anything else) = ?
- Attempts to move forward may move left or right
- Many (4^9 = 262,144) possible policies
- Different policies are optimal depending on the value of R(anything else)
32. Reinforcement Learning - Policy
[Figure: the optimal policy for the 4x3 world, shown as arrows]
- Above is the optimal policy for R(s) = -0.04
- Given a policy, how can an agent evaluate U(s), the utility of a state? (Passive Reinforcement Learning)
- Adaptive Dynamic Programming (ADP)
- Temporal Difference Learning (TD)
- With only an environment, how can an agent develop a policy? (Active Reinforcement Learning)
- Q-learning
33. Reinforcement Learning - Utility
[Figure: learned utilities U(s) for each state of the grid world]
- U(s) = R(s) + Σ_s′ P(s′|s) U(s′)
- ADP: updating all U(s) based on each new observation
- TD: update U(s) only for the last state change
- Ideally U(s) = R(s) + U(s′), but s′ is probabilistic
- U(s) ← U(s) + α(R(s) + U(s′) - U(s)) (sketched below)
- α decays from 1 to 0 as a function of the number of times a state is visited
- U(s) is guaranteed to converge to the correct value
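A minimal sketch of the TD update on a toy three-state chain; the chain, its rewards, and the transition noise are illustrative assumptions, not the slides' grid world:

    import random

    R = {0: -0.04, 1: -0.04, 2: 1.0}     # R(s); state 2 is terminal
    U = {0: 0.0, 1: 0.0, 2: 1.0}         # a terminal state's utility is its reward
    visits = {0: 0, 1: 0}

    random.seed(0)
    for _ in range(5000):
        s = 0
        while s != 2:
            s_next = s + 1 if random.random() < 0.9 else s   # noisy forward move
            visits[s] += 1
            alpha = 1.0 / visits[s]      # decays from 1 toward 0 with each visit
            # U(s) <- U(s) + alpha * (R(s) + U(s') - U(s))
            U[s] += alpha * (R[s] + U[s_next] - U[s])
            s = s_next

    print({s: round(u, 2) for s, u in U.items()})  # roughly {0: 0.91, 1: 0.96, 2: 1.0}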
34. Reinforcement Learning - Policy
- Ideally agents can create their own policies
- Exploration: agents must be rewarded for exploring as well as for taking the best known path
- Adaptive Dynamic Programming (ADP)
- Can be achieved by changing U(s) to U+(s)
- U+(s) = Max_Reward if s has been visited fewer than N times, else U(s)
- Agent must also update the transition model
- Temporal Difference Learning (TD)
- No changes to the utility calculation!
- Can explore based on balancing utility and novelty (like ADP)
- Can choose random directions with a decreasing rate over time (see the Q-learning sketch below)
- Both converge on the optimal value
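A hedged sketch of tabular Q-learning with an exploration rate that decays over time, as described above; the 1-D world, learning rate, and discount factor are illustrative assumptions:

    import random

    N_STATES, GOAL = 6, 5
    ACTIONS = (-1, +1)                    # move left / move right
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    alpha, gamma = 0.1, 0.9

    random.seed(0)
    for episode in range(2000):
        eps = 1.0 / (1 + episode)         # random moves at a decreasing rate
        s = 0
        while s != GOAL:
            if random.random() < eps:     # explore: pick a random action
                a = random.choice(ACTIONS)
            else:                         # exploit: pick the best known action
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            s_next = min(max(s + a, 0), N_STATES - 1)
            r = 1.0 if s_next == GOAL else -0.04
            # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            best_next = max(Q[(s_next, act)] for act in ACTIONS)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s_next

    policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)}
    print(policy)                         # every state should point right (+1)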
35. Reinforcement Learning in Robotics
- Robot Control
- Discretize workspace
- Policy Search
- PEGASUS system (Ng, Stanford)
- Learned how to control robots, e.g. an autonomous helicopter
- Better than human pilots w/ remote control
36. Summary
- 3 different general learning approaches
- Artificial Neural Networks
- Good for learning correlations between inputs and outputs
- Little human work
- Bayesian Networks
- Good for handling uncertainty and noise
- Human work optional
- Reinforcement Learning
- Good for evaluating and generating policies/behaviors
- Can handle complex tasks
- Little human work
37. References
- 1. Russell, S. and Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Prentice Hall Series in Artificial Intelligence. Englewood Cliffs, New Jersey. (http://aima.cs.berkeley.edu/)
- 2. Mitchell, Thomas (1997). Machine Learning. McGraw Hill. (http://www.cs.cmu.edu/~tom/mlbook.html)
- 3. Sutton, Richard S. and Barto, Andrew G. (1998). Reinforcement Learning. Cambridge, MA: MIT Press. (http://www.cs.ualberta.ca/~sutton/book/the-book.html)
- 4. Hecht-Nielsen, R. (1989). "Theory of the backpropagation neural network." Neural Networks 1: 593-605. (http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=3401&arnumber=118638)
- 5. Batavia, P., Pomerleau, D., and Thorpe, C. (1996). Tech. report CMU-RI-TR-96-31, Robotics Institute, Carnegie Mellon University. (http://www.ri.cmu.edu/projects/project_160.html)
- 6. Jung, D.J., Kwon, K.S., and Kim, H.J. (Korea). "Bayesian Network based Human Pose Estimation." (http://www.actapress.com/PaperInfo.aspx?PaperID=23199)
- 7. Lewis, Frank L. (1996). "Neural Network Control of Robot Manipulators." IEEE Expert: Intelligent Systems and Their Applications, vol. 11, no. 3, pp. 64-75. (http://doi.ieeecomputersociety.org/10.1109/64.506755)