1
Knowledge Representation and Machine Learning
  • Stephen J. Guy

2
Overview
  • Recap some Knowledge Rep.
  • History
  • First order logic
  • Machine Learning
  • ANN
  • Bayesian Networks
  • Reinforcement Learning
  • Summary

3
Knowledge Representation?
  • Ambiguous term
  • The study of how to put knowledge into a form
    that a computer can reason with (Russell and
    Norvig)
  • Originally coupled w/ linguistics
  • Led to philosophical analysis of language

4
Knowledge Representation?
  • Cool Robots
  • Futuristic Robots

5
Early Work
  • SAINT (1963)
  • Closed form Calculus Problems
  • STUDENT (1967)
  • If the number of customers Tom gets is twice the
    square of 20 percent of the number of advertisements
    he runs, and the number of advertisements he runs is
    45, what is the number of customers Tom gets?
  • Blockworlds (1972)
  • SHRDLU
  • Find a block which is taller than the one you
    are holding and put it in the box

6
Early Work - Theme
  • Limit domain
  • Microworlds
  • Allows precise rules
  • Generality
  • Problem Size
  • 1) Making rules is hard
  • 2) State space is unbounded

7
Generality
  • First-order Logic
  • Is able to capture simple Boolean relations and
    facts
  • ∀x ∀y Brother(x,y) ⇒ Sibling(x,y)
  • ∀x ∃y Loves(x,y)
  • Can capture lots of commonsense knowledge
  • Not a cure-all

8
First order Logic - Problems
  • Faithfully captures facts, objects, and relations
  • Problems
  • Does not capture temporal relations
  • Does not handle probabilistic facts
  • Does not handle facts w/ degrees of truth
  • Has been extended to
  • Temporal logic
  • Probability theory
  • Fuzzy logic

9
First order Logic - Bigger Problem
  • Still lots of human effort
  • Knowledge Engineering
  • Time consuming
  • Difficult to debug
  • Size still a problem
  • Automated acquisition of knowledge is important

10
Machine Learning
  • Sidesteps all of the previous problems
  • Represent Knowledge in a way that is immediately
    useful for decision making
  • 3 specific examples
  • Artificial Neural Networks (ANN)
  • Bayesian Networks
  • Reinforcement Learning

11
Artificial Neural Networks (ANN)
  • 1st work in AI (McCulloch &amp; Pitts, 1943)
  • Attempt to mimic brain neurons
  • Several binary inputs, one binary output

12
Artificial Neural Networks (ANN)
  • Can be chained together to
  • Represent logical connectives (and, or, not)
  • Compute any computable function
  • Hebb (1949) introduced simple rule to modify
    connection strength (Hebbian Learning)
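The slide names Hebb's rule without spelling it out; a minimal sketch of its usual form (strengthen a connection in proportion to the correlated activity at its two ends), where the learning rate eta is an illustrative assumption, not a value from the slides:

```python
def hebbian_update(weights, inputs, output, eta=0.01):
    """Hebbian learning: w_i <- w_i + eta * x_i * y (eta is illustrative)."""
    return [w + eta * x * output for w, x in zip(weights, inputs)]

# A unit whose two inputs fired together with its output has both connections strengthened.
print(hebbian_update([0.0, 0.0], [1, 1], 1))   # -> [0.01, 0.01]
```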

13
Single Layer feed-forward ANNs (Perceptrons)
Input Layer
Output Unit
  • Can easily represent otherwise complex (linearly
    separable) functions
  • And, Or
  • Majority Function
  • Can Learn based on gradient descent
  • Cannot tell if 2 inputs are different (XOR)!!
    (Minsky, 1969)

14
Learning in Perceptrons
  • Replace threshold function w/ sigmoid g(x)
  • Define error metric (sum of squared differences)
  • Calculate gradient with respect to each weight
  • Gradient term: Err × g′(in) × x_j
  • W_j ← W_j + α × Err × g′(in) × x_j
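A minimal runnable sketch of this update rule, assuming a sigmoid g and a fixed learning rate alpha (the OR dataset and all constants are illustrative, not from the slides):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_perceptron(X, y, alpha=0.5, epochs=1000):
    """Single-layer sigmoid unit trained by gradient descent on squared error."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_j, target in zip(X, y):
            out = sigmoid(np.dot(w, x_j))     # g(in)
            err = target - out                # Err
            grad = err * out * (1.0 - out)    # Err * g'(in), using g' = g(1 - g)
            w += alpha * grad * x_j           # W_j <- W_j + alpha * Err * g'(in) * x_j
    return w

# Learn OR, a linearly separable function; the last column of X is a bias input.
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([0, 1, 1, 1], dtype=float)
w = train_perceptron(X, y)
print(np.round(sigmoid(X @ w)))               # -> [0. 1. 1. 1.]
```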

15
Multi Layer feed-forward ANNs
Input Layer
Hidden Layer
Output Unit
  • Breaks free of the problems of perceptrons
  • Simple gradient descent no longer works for
    learning

16
Learning in Multilayer ANNs (1/2)
  • Backpropagation
  • Treat top level just like single-layer ANN
  • Diffuse error down network based on input
    strength from each hidden node

17
Learning in Multilayer ANNs (2/2)
  • Δ_i = Err_i × g′(in_i)
  • W_j,i ← W_j,i + α × a_j × Δ_i
  • Δ_j = g′(in_j) × Σ_i W_j,i × Δ_i
  • W_k,j ← W_k,j + α × a_k × Δ_j
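A minimal sketch of these updates for a single hidden layer, assuming sigmoid activations throughout (shapes, names, and the XOR example are illustrative assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(x, target, W_kj, W_ji, alpha=0.1):
    """One backpropagation update for an input -> hidden -> output network.

    W_kj: hidden-layer weights (hidden_units x inputs)
    W_ji: output-layer weights (output_units x hidden_units)
    """
    # Forward pass
    a_j = sigmoid(W_kj @ x)                    # hidden activations a_j = g(in_j)
    a_i = sigmoid(W_ji @ a_j)                  # output activations a_i = g(in_i)

    # Output layer, treated like a single-layer ANN: Delta_i = Err_i * g'(in_i)
    delta_i = (target - a_i) * a_i * (1.0 - a_i)

    # Diffuse error down to the hidden layer, weighted by connection strength:
    # Delta_j = g'(in_j) * sum_i W_j,i * Delta_i
    delta_j = a_j * (1.0 - a_j) * (W_ji.T @ delta_i)

    # Weight updates
    W_ji += alpha * np.outer(delta_i, a_j)     # W_j,i <- W_j,i + alpha * a_j * Delta_i
    W_kj += alpha * np.outer(delta_j, x)       # W_k,j <- W_k,j + alpha * a_k * Delta_j
    return W_kj, W_ji

# One update toward learning XOR, which a single perceptron cannot represent.
rng = np.random.default_rng(0)
W_kj, W_ji = rng.normal(size=(2, 2)), rng.normal(size=(1, 2))
backprop_step(np.array([1.0, 0.0]), np.array([1.0]), W_kj, W_ji)
```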

18
ANN - Summary
  • Single-layer ANNs (perceptrons) can capture
    linearly separable functions
  • Multi-layer ANNs can capture much more complex
    functions and can be effectively trained using
    back-propagation
  • Not a silver bullet
  • How to avoid over-fitting?
  • What shape should the network be?
  • Network values are meaningless to humans

19
ANN In Robots (Simple)
  • Can easily be set up as a robot brain
  • Input: sensors
  • Output: motor control
  • Simple Robot learns to avoid bumps

20
ANN In Robots (Complex)
  • Autonomous Land Vehicle In a Neural Network
    (ALVINN)
  • CMU project learned to drive from humans
  • 32x30 retina
  • 5 hidden units
  • 30 output nodes
  • Capable of driving itself after 2-3 minutes of
    training

21
Bayesian Networks
  • Combines advantages of basic logic and ANNs
  • Allows for efficient representation of, and
    rigorous reasoning with, uncertain knowledge
    (R&amp;N)
  • Allows for learning from experience

22
Bayes Rule
  • P(b|a) = P(a|b)P(b)/P(a) = α⟨P(a|b)P(b),
    P(a|¬b)P(¬b)⟩
  • Meningitis example (from R&amp;N)
  • s = stiff neck, m = has meningitis
  • P(s|m) = 0.5
  • P(m) = 1/50000
  • P(s) = 1/20
  • P(m|s) = P(s|m)P(m)/P(s)
    = 0.5 × (1/50000) / (1/20)
    = 0.0002
  • Diagnostic knowledge is more fragile than causal
    knowledge
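The same calculation as a short sketch (values are the ones on the slide):

```python
# Bayes' rule for the meningitis example: P(m|s) = P(s|m) * P(m) / P(s)
p_s_given_m = 0.5        # P(stiff neck | meningitis)
p_m = 1 / 50000          # prior P(meningitis)
p_s = 1 / 20             # prior P(stiff neck)

p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s)       # -> 0.0002
```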

23
Bayesian Networks
P(M) = 1/50000
P(S | M = T) = 0.5,  P(S | M = F) = 1/20
  • Allows us to chain together more complex
    relations
  • Creating network is not necessarily easy
  • Create a fully connected network
  • Cluster groups w/ high correlation together
  • Find probabilities using rejection sampling
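A minimal sketch of rejection sampling on the two-node meningitis network above, estimating P(M | S = true); the sample count is an illustrative assumption:

```python
import random

def estimate_p_m_given_s(n_samples=2_000_000):
    """Estimate P(meningitis | stiff neck) by rejection sampling."""
    accepted = m_count = 0
    for _ in range(n_samples):
        m = random.random() < 1 / 50000      # sample M from its prior
        p_s = 0.5 if m else 1 / 20           # CPT row: P(S | M)
        s = random.random() < p_s            # sample S given M
        if s:                                # reject samples inconsistent with the evidence S = true
            accepted += 1
            m_count += m
    return m_count / accepted if accepted else float("nan")

print(estimate_p_m_given_s())                # noisy estimate near 0.0002
```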

24
Bayesian Networks (Temporal Models)
  • More complex Bayesian networks are possible
  • Time can be taken into account
  • Imagine predicting whether it will rain tomorrow,
    based only on whether your co-worker brings in an
    umbrella

25
Bayesian Networks (Temporal Models)
  • 4 Possible Inference tasks based on this
    knowledge
  • Filtering: Computing belief about the current state
  • Prediction: Computing belief about a future state
  • Smoothing: Improving knowledge of past states
    using hindsight (forward-backward algorithm)
  • Most likely explanation: Finding the single most
    likely explanation for a set of observations
    (Viterbi)

26
Bayesian Networks (Temporal Models)
  • Assume you see the umbrella 2 days in a row
    (U1 = 1, U2 = 1)
  • P(R0) = ⟨0.5, 0.5⟩ (0.5 for R0 = T, 0.5 for R0 = F)
  • P(R1) = P(R1|R0)P(R0) + P(R1|¬R0)P(¬R0)
    = 0.7 × 0.5 + 0.3 × 0.5 = ⟨0.5, 0.5⟩
  • P(R1|U1) = α P(U1|R1)P(R1)
    = α⟨0.9 × 0.5, 0.2 × 0.5⟩ = α⟨0.45, 0.1⟩
    = ⟨0.818, 0.182⟩

27
Bayesian Networks (Temporal Models)
  • Assume you see the umbrella 2 days in a row
    (U1 = 1, U2 = 1)
  • P(R2|U1) = P(R2|R1)P(R1|U1) + P(R2|¬R1)P(¬R1|U1)
    = 0.7 × 0.818 + 0.3 × 0.182 = 0.627
    → ⟨0.627, 0.373⟩
  • P(R2|U2,U1) = α P(U2|R2)P(R2|U1)
    = α⟨0.9 × 0.627, 0.2 × 0.373⟩
    = α⟨0.565, 0.075⟩ = ⟨0.883, 0.117⟩
  • On the 2nd day of seeing the umbrella we were
    more confident that it was raining
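A minimal sketch of this forward (filtering) computation, using the transition and sensor probabilities from the slides (variable names are illustrative):

```python
import numpy as np

prior = np.array([0.5, 0.5])          # P(R0) over [rain, no rain]
T = np.array([[0.7, 0.3],             # P(R_t | R_{t-1} = rain)
              [0.3, 0.7]])            # P(R_t | R_{t-1} = no rain)
p_umbrella = np.array([0.9, 0.2])     # P(U | rain), P(U | no rain)

def forward_step(belief, umbrella_seen):
    """Predict with the transition model, weight by the evidence, then normalize (the alpha step)."""
    predicted = T.T @ belief
    weighted = (p_umbrella if umbrella_seen else 1 - p_umbrella) * predicted
    return weighted / weighted.sum()

belief = prior
for day, u in enumerate([True, True], start=1):          # umbrella seen on days 1 and 2
    belief = forward_step(belief, u)
    print(f"P(R{day} | U1..U{day}) =", np.round(belief, 3))
# -> [0.818 0.182] on day 1, then [0.883 0.117] on day 2
```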

28
Bayesian Networks - Summary
  • Bayesian Networks are able to capture some
    important aspects of human Knowledge
    Representation and use
  • Uncertainty
  • Adaptation
  • Still difficulties in network design
  • Overall a powerful tool
  • Meaningful values in network
  • Probabilistic logical reasoning

29
Bayesian Networks in Robotics
  • Speech Recognition
  • Inference
  • Sensors
  • Computer Vision
  • SLAM
  • Estimating human poses

30
Reinforcement Learning
  • How much can we take the human out of the loop?
  • How do humans/animals do it?
  • Genes
  • Pain
  • Pleasure
  • Simply define rewards/punishments and let the agent
    figure out all the rest

31
Reinforcement Learning - Example
[Grid-world figure with start state, goal, and pitfall]
  • R(s) = reward of state s
  • R(Goal) = +1
  • R(pitfall) = -1
  • R(anything else) = ?
  • Attempts to move forward may move left or right
  • Many (262,000) possible policies
  • Different policies are optimal depending on the
    value of R(anything else)

32
Reinforcement Learning - Policy
[Grid-world figure: arrows showing the optimal policy]
  • Above is the optimal policy for R(s) = -0.04
  • Given a policy how can an agent evaluate U(s),
    the utility of a state? (Passive Reinforcement
    Learning)
  • Adaptive Dynamic Programming (ADP)
  • Temporal Difference Learning (TD)
  • With only an environment how can an agent develop
    a policy? (Active Reinforcement Learning)
  • Q-learning

33
Reinforcement Learning - Utility
[4x3 grid-world figure]
  • U(s) = R(s) + Σ_s′ U(s′) P(s′ | s)
  • ADP: Update all U(s) based on each new
    observation
  • TD: Update U(s) only for the last state change
  • Ideally U(s) = R(s) + U(s′), but s′ is
    probabilistic
  • U(s) ← U(s) + α(R(s) + U(s′) − U(s))
  • α decays from 1 to 0 as a function of the number of
    times the state is visited
  • U(s) is guaranteed to converge to the correct value
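A minimal sketch of the TD update, with alpha decaying as 1/(visit count); the small a -> b -> goal chain and its rewards are illustrative assumptions, not the grid world in the figure:

```python
def td_update(U, visits, s, s_next, R, gamma=1.0):
    """One temporal-difference update after observing the transition s -> s_next.

    U: utility estimates, visits: per-state visit counts (drives the decaying alpha),
    R: state rewards. Pass s_next=None when s is terminal.
    """
    visits[s] = visits.get(s, 0) + 1
    alpha = 1.0 / visits[s]                              # alpha decays from 1 toward 0
    target = R[s] + gamma * (U.get(s_next, 0.0) if s_next else 0.0)
    old = U.get(s, 0.0)
    U[s] = old + alpha * (target - old)                  # U(s) <- U(s) + alpha*(R(s) + U(s') - U(s))

# Repeated episodes along a -> b -> goal, with the goal terminal and worth +1.
U, visits = {}, {}
R = {"a": -0.04, "b": -0.04, "goal": 1.0}
for _ in range(1000):
    td_update(U, visits, "a", "b", R)
    td_update(U, visits, "b", "goal", R)
    td_update(U, visits, "goal", None, R)
print(U)   # U(goal) = 1.0; U(b) and U(a) drift toward 0.96 and 0.92
```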

34
Reinforcement Learning Policy
  • Ideally Agents can create their own policies
  • Exploration: Agents must be rewarded for
    exploring as well as for taking the best known path
  • Adaptive Dynamic Programming (ADP)
  • Can be achieved by changing U(s) to U+(s)
    (sketched below)
  • U+(s) = Max_Reward if n &lt; N, else U(s)
  • Agent must also update its transition model
  • Temporal Difference Learning (TD)
  • No changes to utility calculation!
  • Can explore based on balancing utility and
    novelty (like ADP)
  • Can choose random directions with a decreasing
    rate over time
  • Both converge on the optimal values
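A minimal sketch of the optimistic exploration function behind the U+(s) idea; the cutoff N and Max_Reward values are illustrative assumptions:

```python
def exploratory_utility(u, n, N=5, max_reward=1.0):
    """Treat rarely tried states/actions as if they paid the maximum possible reward.

    u: current utility estimate, n: number of times this state/action has been tried.
    Returns max_reward while n < N, and the learned estimate u afterwards.
    """
    return max_reward if n < N else u

# An ADP agent ranks actions by exploratory_utility(expected_utility, tries), so
# under-explored actions stay attractive until they have been tried N times.
```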

35
Reinforcement Learning in Robotics
  • Robot Control
  • Discretize workspace
  • Policy Search
  • Pegasus System (Ng, Stanford)
  • Learned how to control robots
  • Better than human pilots w/ Remote Control

36
Summary
  • 3 different general learning approaches
  • Artificial Neural Networks
  • Good for learning correlation between inputs and
    outputs
  • Little human work
  • Bayesian Networks
  • Good for handling uncertainty and noise
  • Human work optional
  • Reinforcement Learning
  • Good for evaluating and generating
    policies/behaviors
  • Can handle complex tasks
  • Little human work

37
References
  • 1. Russell, S., and Norvig, P. (1995). Artificial
    Intelligence: A Modern Approach. Prentice Hall Series
    in Artificial Intelligence, Englewood Cliffs, New
    Jersey. (http://aima.cs.berkeley.edu/)
  • 2. Mitchell, Thomas. Machine Learning. McGraw Hill,
    1997. (http://www.cs.cmu.edu/~tom/mlbook.html)
  • 3. Sutton, Richard S., and Andrew G. Barto.
    Reinforcement Learning: An Introduction. Cambridge,
    MA: MIT Press, 1998.
    (http://www.cs.ualberta.ca/~sutton/book/the-book.html)
  • 4. Hecht-Nielsen, R. "Theory of the backpropagation
    neural network." Neural Networks 1 (1989): 593-605.
    (http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=3401&arnumber=118638)
  • 5. Batavia, P., Pomerleau, D., and Thorpe, C. Tech.
    report CMU-RI-TR-96-31, Robotics Institute, Carnegie
    Mellon University, October 1996.
    (http://www.ri.cmu.edu/projects/project_160.html)
  • 6. Jung, D.J., Kwon, K.S., and Kim, H.J. (Korea).
    "Bayesian Network based Human Pose Estimation."
    (http://www.actapress.com/PaperInfo.aspx?PaperID=23199)
  • 7. Lewis, Frank L. "Neural Network Control of Robot
    Manipulators." IEEE Expert: Intelligent Systems and
    Their Applications, vol. 11, no. 3, pp. 64-75, June
    1996.
    (http://doi.ieeecomputersociety.org/10.1109/64.506755)