CMSC 671 Fall 2003 - PowerPoint PPT Presentation


About This Presentation

Title:

CMSC 671 Fall 2003

Slides: 32

Provided by: COGI

Learn more at: https://redirect.cs.umbc.edu

Transcript and Presenter's Notes

Title: CMSC 671 Fall 2003

1
CMSC 671 Fall 2003

Class 16 Wednesday, October 22

2
Today's topics

Approaches to knowledge representation
Deductive/logical methods
Forward-chaining production rule systems
Semantic networks
Frame-based systems
Description logics
Abductive/uncertain methods
What's abduction?
Why do we need uncertainty?
Bayesian reasoning
Other methods: default reasoning, rule-based
methods, Dempster-Shafer theory, fuzzy reasoning

3
Knowledge Representation and Reasoning

Chapters 10.1-10.3, 10.6, 10.9; also includes
some material from 13.1-13.2 and 14.7

Some material adopted from notes by Andreas
Geyer-Schulz and Chuck Dyer
4
Introduction

Real knowledge representation and reasoning
systems come in several major varieties.
These differ in their intended use, expressivity,
features, ...
Some major families are:
Logic programming languages
Theorem provers
Rule-based or production systems
Semantic networks
Frame-based representation languages
Databases (deductive, relational,
object-oriented, etc.)
Constraint reasoning systems
Description logics
Bayesian networks
Evidential reasoning

5
Forward-chaining production systems

The notion of a production system was invented
in 1943 by Post
Used as the basis for many rule-based expert
systems
Used as a model of human cognition in psychology
A production is a rule of the form

C1, C2, ..., Cn => A1 A2 ... Am

Left-hand side (LHS): conditions which must hold before the rule can be
applied
Right-hand side (RHS): actions to be performed or conclusions to be
drawn when the rule is applied
6
Production systems Basic components

Rules -- Unordered set of user-defined if-then
rules.
Form: if P1 ∧ ... ∧ Pm then A1, ..., An
The Pi are facts that determine the conditions
when a rule is applicable.
Actions can add or delete facts from the working
memory.
Working Memory -- A set of facts consisting of
positive literals defining what's known to be
true about the world
Usually flat tuples like (location umbc baltimore)
Inference Engine -- Procedure for inferring
changes (additions and deletions) to working
memory
Typically uses forward chaining to make inferences

7
Typical CLIPS Rule

(defrule determine-gas-level ""
  (working-state engine does-not-start)
  (rotation-state engine rotates)
  (not (repair ?))
  =>
  ; yes-or-no-p is a user-defined helper function, not a CLIPS built-in
  (if (not (yes-or-no-p "Gas in tank?"))
    then (assert (repair "Add gas."))))

(defrule normal-engine-state-conclusions ""
  (declare (salience 10))
  (working-state engine normal)
  =>
  (assert (repair "No repair needed."))
  (assert (spark-state engine normal))
  (assert (charge-state battery charged))
  (assert (rotation-state engine rotates)))

(defrule print-repair ""
  (declare (salience 10))
  (repair ?item)
  =>
  (printout t crlf crlf)
  (printout t "Suggested Repair:")
  (printout t crlf crlf)
  (format t " %s%n%n%n" ?item))
8
Typical CLIPS facts

Facts in most production systems are basically
flat tuples
A simple extension supported by many is to allow
simple templates using slot-filler pairs.
(deftemplate engine
(slot horsepower)
(slot displacement)
(slot manufacturer)
(slot year))
Matching slots in a template is order-insensitive, as in
(engine (year 1998) (horsepower ?x))
(engine (horsepower 250) (displacement 500) (year 1998))

(initial-fact)
(working-state engine unsatisfactory)
(charge-state battery charged)
(rotation-state engine rotates)
(repair "Clean the fuel line.")
(engine (horsepower 250)
(displacement 409)
(manufacturer ford))

9
Basic Procedure

While changes are made to Working Memory do
Match: Construct the Conflict Set -- the set of
all possible (R, F) pairs such that R is one of
the rules and F is a subset of facts in WM that
unify with the antecedent (left-hand side) of R.
Conflict Resolution: Select one pair from the
Conflict Set for execution.
Act: Execute the actions associated with the
consequent (right-hand side) of R, after making
the substitutions used during unification of the
antecedent part with F.
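
The match / conflict-resolution / act cycle can be sketched in Python as follows. This is only an illustration under simplifying assumptions, not how CLIPS or any real shell is implemented: rules are ground (no variables), so matching reduces to a subset test, and conflict resolution just takes the first eligible rule. The Rule type, the rule names and the facts are all made up for the example.

```python
from typing import FrozenSet, List, NamedTuple, Set, Tuple

Fact = Tuple[str, ...]

class Rule(NamedTuple):
    name: str
    conditions: FrozenSet[Fact]   # LHS: facts that must all be in WM
    additions: FrozenSet[Fact]    # RHS: facts to assert
    deletions: FrozenSet[Fact]    # RHS: facts to retract

def run(rules: List[Rule], wm: Set[Fact]) -> List[str]:
    """Repeat match / resolve / act until nothing can fire.

    Mutates wm in place and returns the names of the rules in firing order.
    """
    fired: List[str] = []
    used = set()  # simplified refraction: each (rule, facts) pair fires once
    while True:
        # Match: for ground rules the matched facts are the conditions themselves.
        conflict_set = [r for r in rules
                        if r.conditions <= wm and (r.name, r.conditions) not in used]
        if not conflict_set:
            return fired
        # Conflict resolution: assumed policy, take the first candidate.
        rule = conflict_set[0]
        # Act: apply the deletions and additions of the selected rule.
        wm -= rule.deletions
        wm |= rule.additions
        used.add((rule.name, rule.conditions))
        fired.append(rule.name)

rules = [
    Rule("check-gas",
         frozenset({("engine", "does-not-start"), ("engine", "rotates")}),
         frozenset({("repair", "add-gas")}), frozenset()),
    Rule("report", frozenset({("repair", "add-gas")}),
         frozenset({("reported",)}), frozenset()),
]
wm = {("engine", "does-not-start"), ("engine", "rotates")}
order = run(rules, wm)
print(order)
```

With these two hypothetical rules, check-gas fires first and its conclusion then enables report; the loop stops once every matching rule has already fired on its facts.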

10
Rete Algorithm

The Rete Algorithm (Latin for net) is the most
widely used, efficient algorithm for the
implementation of production systems.
Developed by Charles Forgy at Carnegie Mellon
University in 1979.
Charles L. Forgy, "Rete: A Fast Algorithm for the
Many Pattern/Many Object Pattern Match Problem",
Artificial Intelligence, 19, pp. 17-37, 1982.
Rete's efficiency is claimed to be asymptotically
independent of the number of rules.
The basis for a whole generation of fast expert
system shells: OPS5, ART, CLIPS and Jess.

11
Match Phase

RULES
(defrule R1 "rule one"
  (a ?x) (b ?x) (c ?y)
  =>
  (assert (d ?x)))
(defrule R2 "rule two"
  (a ?x) (b ?y) (d ?x)
  =>
  (assert (e ?x)))
(defrule R3 "rule three"
  ?fact <- (a ?x) (b ?x) (e ?x)
  =>
  (retract ?fact))  ; CLIPS uses retract to delete a fact

WORKING MEMORY
(a 1) (a 2) (b 2) (b 3) (b 4) (c 5)
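
To make the match phase concrete, here is a small Python sketch that computes the conflict set for rules R1 to R3 over this working memory. It is a naive nested-loop matcher, not the Rete algorithm; the tuple encoding of facts and the function names are assumptions made for the example.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Fact = Tuple[object, ...]
Bindings = Dict[str, object]

def match_pattern(pattern: Fact, fact: Fact, b: Bindings) -> Optional[Bindings]:
    """Extend bindings b so that pattern equals fact, or return None."""
    if len(pattern) != len(fact):
        return None
    b = dict(b)  # copy so the caller's bindings are untouched
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            if b.setdefault(p, f) != f:   # variable already bound to another value
                return None
        elif p != f:
            return None
    return b

def match_lhs(patterns: List[Fact], wm: List[Fact],
              b: Optional[Bindings] = None) -> Iterator[Bindings]:
    """Yield every consistent set of bindings that satisfies all patterns."""
    b = {} if b is None else b
    if not patterns:
        yield b
        return
    for fact in wm:
        b2 = match_pattern(patterns[0], fact, b)
        if b2 is not None:
            yield from match_lhs(patterns[1:], wm, b2)

# Left-hand sides of R1, R2 and R3, with variables written as "?x" strings.
RULES = {
    "R1": [("a", "?x"), ("b", "?x"), ("c", "?y")],
    "R2": [("a", "?x"), ("b", "?y"), ("d", "?x")],
    "R3": [("a", "?x"), ("b", "?x"), ("e", "?x")],
}
WM = [("a", 1), ("a", 2), ("b", 2), ("b", 3), ("b", 4), ("c", 5)]

def conflict_set(rules, wm):
    """All (rule name, bindings) pairs whose left-hand side matches wm."""
    return [(name, b) for name, lhs in rules.items() for b in match_lhs(lhs, wm)]

print(conflict_set(RULES, WM))
```

On the initial working memory only R1 matches, with ?x bound to 2 and ?y bound to 5. Once R1 has asserted (d 2), R2 becomes applicable with ?x = 2 and three possible bindings for ?y.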
12
Conflict Resolution Strategy Components

Refraction
A rule can only be used once with the same set of
facts in WM. Whenever WM is modified, all rules
can again be used. This strategy prevents a
single rule and list of facts from being used
repeatedly, resulting in an infinite loop of
reasoning.
Recency
Use rules that match the facts that were added
most recently to WM, providing a kind of
focus-of-attention strategy.
Specificity
Use the most specific rule: if both R1 and R2
match, and R1's LHS logically implies R2's LHS,
use R1, since its conditions are the more specific.
Explicit priorities
E.g., numeric salience attribute for rules
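
One way to combine these components is to filter the conflict set by refraction and then rank what remains. The Python sketch below does that; the order of the criteria (salience, then recency, then specificity) and the use of the condition count as a stand-in for specificity are assumptions for illustration, since real shells differ on both points.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

@dataclass(frozen=True)
class Instantiation:
    rule: str
    salience: int                 # explicit priority
    n_conditions: int             # crude stand-in for specificity
    fact_ids: Tuple[int, ...]     # time stamps of matched facts (larger = newer)

def select(conflict_set: List[Instantiation],
           already_fired: Set[Tuple[str, Tuple[int, ...]]]) -> Optional[Instantiation]:
    """Pick one instantiation to fire, or None if nothing is eligible."""
    # Refraction: drop (rule, facts) pairs that have already fired.
    eligible = [i for i in conflict_set
                if (i.rule, i.fact_ids) not in already_fired]
    if not eligible:
        return None
    # Assumed ordering: salience first, then recency, then specificity.
    # Each instantiation is assumed to match at least one fact.
    return max(eligible,
               key=lambda i: (i.salience, max(i.fact_ids), i.n_conditions))

cs = [
    Instantiation("general", 0, 1, (3,)),
    Instantiation("specific", 0, 2, (1, 3)),
    Instantiation("old", 0, 3, (1, 2)),
    Instantiation("urgent", 10, 1, (1,)),
]
first = select(cs, set())
second = select(cs, {("urgent", (1,))})
print(first.rule, second.rule)
```

Here urgent wins on salience; once it has fired, specific beats general on condition count, and both beat old on recency.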

13
An Application R1 / XCON

An expert system developed by DEC for
configuration.
Problem: develop a single acceptable
configuration of hardware components for a
complete computer system based on partial
customer specifications.
Rules are used to determine
whether an order is complete and, if not, which
items must be added,
the spatial relations (connectivity) of
components
Bottom-up, data-driven, forward-chaining
deductive system for synthesizing a
solution.
No backtracking needed --- constraints in rules
are sufficient to directly construct a solution
Rules always determine locally whether taking a
particular action is globally consistent with
acceptable overall performance on the task
Had about 10,000 rules

14
Example XCON rule

Assign_Power_Supply_1
IF
Most current active context is assign-power-supply
An SBI module is in cabinet
Position of module in cabinet is known
Space is available in cabinet for a power supply
for that position
No power supply is currently available
Voltage and frequency of components are known
THEN
Find a power supply of that voltage and
frequency and add it to order

15
R1/XCON

WM contains current configuration of components,
e.g., space filled and unfilled in cabinets and
backplane slots.
Uses the specificity conflict resolution strategy
plus context markers in WM to control the
firing of rules.
Rules are grouped into separate contexts, each
defining the particular sub-task that its rules
are used for. The first antecedent in each
rule indicates the context where the rule is to
be used.
For example, the rule above is to be used in the
context of assigning a power supply.

16
A context changing rule

To change contexts, there are rules such as the
following:
Check_Voltage_And_Frequency_1
IF
Most current active context is checking-voltage-and-frequency
There is a component requiring one voltage or
frequency
There is another component requiring a different
voltage or frequency
THEN
Enter context of fixing-voltage-or-frequency-mismatches

17
Semantic Networks

A semantic network is a simple representation
scheme that uses a graph of labeled nodes and
labeled, directed arcs to encode knowledge.
Usually used to represent static, taxonomic,
concept dictionaries
Semantic networks are typically used with a
special set of accessing procedures that perform
reasoning
e.g., inheritance of values and relationships
Semantic networks were very popular in the 60s
and 70s but are less frequently used today.
Often much less expressive than other KR
formalisms
The graphical depiction associated with
semantic networks is a significant reason for
their popularity.

18
Nodes and Arcs

Arcs define binary relationships that hold
between objects denoted by the nodes.

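(Figure: a semantic network with nodes john, Sue, Max, 5 and 34, connected by arcs labeled mother, father, wife, husband and age.)
The same facts as binary relations:
mother(john,sue) age(john,5) wife(sue,max) age(max,34) ...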
19
Semantic Networks

The ISA (is-a) or AKO (a-kind-of) relation is
often used to link instances to classes, and
classes to superclasses
Some links (e.g. hasPart) are inherited along ISA
paths.
The semantics of a semantic net can be relatively
informal or very formal
often defined at the implementation level

20
Reification

Non-binary relationships can be represented by
turning the relationship into an object
This is an example of what logicians call
reification
reify (v.): to consider an abstract concept to be real
We might want to represent the generic give event
as a relation involving three things: a giver, a
recipient and an object, e.g., give(john,mary,book32)

(Figure: a give node with arcs giver, recipient and object pointing to john, mary and book32, respectively.)
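
As a rough illustration of reification, the Python sketch below turns give(john,mary,book32) into binary arcs hanging off a node that stands for the event. The triple encoding, the event identifier give-1 and the instance arc linking it to give are assumptions made for the example, not part of the slide.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]   # (source node, arc label, target node)

def reify(event_id: str, relation: str, roles: Dict[str, str]) -> List[Triple]:
    """Turn relation(role1=v1, ...) into binary arcs on a new event node."""
    triples = [(event_id, "instance", relation)]
    triples += [(event_id, role, filler) for role, filler in roles.items()]
    return triples

def fillers(net: List[Triple], node: str, arc: str) -> Set[str]:
    """All targets reachable from node over arcs with the given label."""
    return {t for s, a, t in net if s == node and a == arc}

net = reify("give-1", "give",
            {"giver": "john", "recipient": "mary", "object": "book32"})
print(fillers(net, "give-1", "recipient"))
```

Because the event is now an ordinary node, further binary facts about it (a time or a place, say) can be added without changing the arity of any relation.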
21
Individuals and Classes

Many semantic networks distinguish
nodes representing individuals from those
representing classes
the subclass relation from the instance-of
relation

(Figure: a network with nodes Animal, Bird, Robin, Wing, Rusty and Red, connected by arcs labeled subclass, instance and hasPart; the figure also carries the label Genus.)
22
Link types
23
Inference by Inheritance

One of the main kinds of reasoning done in a
semantic net is the inheritance of values along
the subclass and instance links.
Semantic networks differ in how they handle the
case of inheriting multiple different values:
All possible values are inherited, or
Only the lowest value or values are inherited
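
The "lowest value" policy can be sketched in Python as a walk up the instance and subclass links that stops at the first node defining the property. The network below is hypothetical (it borrows some node names from the earlier figure and adds Penguin and Opus), and each node is assumed to have at most one parent.

```python
from typing import Dict, Optional

# Hypothetical network: child -> parent over instance/subclass links.
PARENT: Dict[str, str] = {
    "Rusty": "Robin", "Robin": "Bird", "Bird": "Animal",
    "Opus": "Penguin", "Penguin": "Bird",
}
# Properties stored locally at each node (values invented for the example).
PROPS: Dict[str, Dict[str, object]] = {
    "Animal": {"alive": True},
    "Bird": {"hasPart": "Wing", "flies": True},
    "Penguin": {"flies": False},
}

def lookup(node: str, prop: str) -> Optional[object]:
    """Value of prop at the nearest node on the path to the root, else None."""
    current: Optional[str] = node
    while current is not None:
        if prop in PROPS.get(current, {}):
            return PROPS[current][prop]
        current = PARENT.get(current)
    return None

print(lookup("Rusty", "hasPart"), lookup("Opus", "flies"))
```

The value stored on Penguin shadows the one on Bird, so Opus does not fly while Rusty still does.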

24
Conflicting inherited values
25
Multiple inheritance

A node can have any number of superclasses that
contain it, enabling a node to inherit properties
from multiple parent nodes and their ancestors
in the network.
These rules are often used to determine
inheritance in such tangled networks where
multiple inheritance is allowed:
If X < A < B and both A and B have property P, then X
inherits A's property.
If X < A and X < B but neither A < B nor B < A, and A and
B have property P with different and inconsistent
values, then X does not inherit property P at
all.

26
Nixon Diamond

This was the classic example circa 1980.

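(Figure: the Nixon diamond. Republican and Quaker are both subclasses of Person; pacifist is FALSE for Republican and TRUE for Quaker; a single individual at the bottom of the diamond is an instance of both Republican and Quaker.)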
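
The two multiple-inheritance rules from the previous slide can be tried out on the Nixon diamond with a short Python sketch. The dictionary encoding is an assumption, as is the name Nixon for the bottom node (it is the usual name in the classic example), and the mortal property on Person is invented to show a non-conflicting case.

```python
from typing import Dict, List, Optional, Set

PARENTS: Dict[str, List[str]] = {
    "Nixon": ["Republican", "Quaker"],
    "Republican": ["Person"],
    "Quaker": ["Person"],
    "Person": [],
}
PROPS: Dict[str, Dict[str, object]] = {
    "Republican": {"pacifist": False},
    "Quaker": {"pacifist": True},
    "Person": {"mortal": True},   # invented, for the non-conflicting case
}

def ancestors(node: str) -> Set[str]:
    """All strict ancestors of node."""
    seen: Set[str] = set()
    stack = list(PARENTS.get(node, []))
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(PARENTS.get(n, []))
    return seen

def inherit(node: str, prop: str) -> Optional[object]:
    """Value of prop for node; None if it is undefined or in conflict."""
    if prop in PROPS.get(node, {}):
        return PROPS[node][prop]
    definers = [a for a in ancestors(node) if prop in PROPS.get(a, {})]
    # Rule 1: keep only the lowest definers (drop ancestors of other definers).
    lowest = [a for a in definers
              if not any(a in ancestors(b) for b in definers)]
    # Rule 2: incomparable definers with different values block inheritance.
    values = {PROPS[a][prop] for a in lowest}
    return values.pop() if len(values) == 1 else None

print(inherit("Nixon", "pacifist"), inherit("Nixon", "mortal"))
```

Note that this sketch returns None both for a conflict and for a property that is simply undefined; a fuller version would distinguish the two cases.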
27
From Semantic Nets to Frames

Semantic networks morphed into Frame
Representation Languages in the 70s and 80s.
A frame is a lot like the notion of an object in
OOP, but has more meta-data.
A frame has a set of slots.
A slot represents a relation to another frame (or
value).
A slot has one or more facets.
A facet represents some aspect of the relation.

28
Facets

A slot in a frame holds more than a value.
Other facets might include
current fillers (e.g., values)
default fillers
minimum and maximum number of fillers
type restriction on fillers (usually expressed as
another frame object)
attached procedures (if-needed, if-added,
if-removed)
salience measure
attached constraints or axioms
In some systems, the slots themselves are
instances of frames.
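
A frame with a few of these facets might look like the following Python sketch. It covers only current fillers, a default filler, a maximum number of fillers and an if-needed procedure; the class and field names are assumptions, and the engine example and the reference year 2003 are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

@dataclass
class Slot:
    fillers: List[object] = field(default_factory=list)      # current fillers
    default: Optional[object] = None                          # default filler
    max_fillers: Optional[int] = None                         # cardinality facet
    if_needed: Optional[Callable[["Frame"], object]] = None   # attached procedure

@dataclass
class Frame:
    name: str
    slots: Dict[str, Slot] = field(default_factory=dict)

    def add(self, slot: str, value: object) -> None:
        """Add a filler, enforcing the maximum-fillers facet if present."""
        s = self.slots.setdefault(slot, Slot())
        if s.max_fillers is not None and len(s.fillers) >= s.max_fillers:
            raise ValueError(
                f"{self.name}.{slot} allows at most {s.max_fillers} filler(s)")
        s.fillers.append(value)

    def get(self, slot: str) -> object:
        """First filler, else the if-needed result, else the default."""
        s = self.slots[slot]
        if s.fillers:
            return s.fillers[0]
        if s.if_needed is not None:
            return s.if_needed(self)      # compute the value on demand
        return s.default

engine = Frame("engine")
engine.slots["year"] = Slot(max_fillers=1)
engine.slots["age"] = Slot(if_needed=lambda f: 2003 - f.get("year"))
engine.slots["manufacturer"] = Slot(default="unknown")
engine.add("year", 1998)
print(engine.get("age"), engine.get("manufacturer"))
```

The age slot has no stored filler; its if-needed procedure derives the value from the year slot whenever it is requested.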

30
Description Logics

Description logics provide a family of frame-like
KR systems with a formal semantics.
E.g., KL-ONE, LOOM, Classic, ...
An additional kind of inference done by these
systems is automatic classification:
finding the right place in a hierarchy of
objects for a new description
Current systems take care to keep the languages
simple, so that all inference can be done in
polynomial time (in the number of objects),
ensuring tractability of inference

31
Abduction

Abduction is a reasoning process that tries to
form plausible explanations for abnormal
observations
Abduction is distinctly different from deduction
and induction
Abduction is inherently uncertain
Uncertainty is an important issue in abductive
reasoning
Some major formalisms for representing and
reasoning about uncertainty:
MYCIN's certainty factors (an early
representative)
Probability theory (esp. Bayesian belief
networks)
Dempster-Shafer theory
Fuzzy logic
Truth maintenance systems
Nonmonotonic reasoning

32
Abduction

Definition (Encyclopedia Britannica): reasoning
that derives an explanatory hypothesis from a
given set of facts
The inference result is a hypothesis that, if
true, could explain the occurrence of the given
facts
Examples
Dendral, an expert system to construct the 3D
structure of chemical compounds
Fact: mass spectrometer data of the compound and
its chemical formula
KB: chemistry, esp. strength of different types
of bonds
Reasoning: form a hypothetical 3D structure that
satisfies the chemical formula, and that would
most likely produce the given mass spectrum

33
Abduction examples (cont.)

Medical diagnosis
Facts: symptoms, lab test results, and other
observed findings (called manifestations)
KB: causal associations between diseases and
manifestations
Reasoning: one or more diseases whose presence
would causally explain the occurrence of the
given manifestations
Many other reasoning processes (e.g., word sense
disambiguation in natural language processing, image
understanding, criminal investigation) can also
be seen as abductive reasoning

34
Comparing abduction, deduction, and induction

Deduction   major premise:      All balls in the box are black    A => B
            minor premise:      These balls are from the box      A
            conclusion:         These balls are black             ---------
                                                                  B

Abduction   rule:               All balls in the box are black    A => B
            observation:        These balls are black             B
            explanation:        These balls are from the box      -------------
                                                                  Possibly A

Induction   case:               These balls are from the box      Whenever A then B
            observation:        These balls are black             -------------
            hypothesized rule:  All balls in the box are black    Possibly A => B

Deduction reasons from causes to effects
Abduction reasons from effects to causes
Induction reasons from specific cases to general rules
35
Characteristics of abductive reasoning

Conclusions are hypotheses, not theorems (may
be false even if rules and facts are true)
E.g., misdiagnosis in medicine
There may be multiple plausible hypotheses
Given rules A => B and C => B, and fact B, both A
and C are plausible hypotheses
Abduction is inherently uncertain
Hypotheses can be ranked by their plausibility
(if it can be determined)

36
Characteristics of abductive reasoning (cont.)

Reasoning is often a hypothesize-and-test cycle
Hypothesize: Postulate possible hypotheses, any
of which would explain the given facts (or at
least most of the important facts)
Test: Test the plausibility of all or some of
these hypotheses
One way to test a hypothesis H is to ask whether
something that is currently unknown, but can be
predicted from H, is actually true
If we also know A => D and C => E, then ask if D
and E are true
If D is true and E is false, then hypothesis A
becomes more plausible (support for A is
increased; support for C is decreased)
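
The hypothesize-and-test cycle for the rules A => B, C => B, A => D and C => E can be sketched in Python as below. The scoring scheme (plus one for each confirmed prediction, minus one for each refuted one) is an assumption chosen for simplicity; the slides do not prescribe how support is measured.

```python
from typing import Dict, List, Tuple

# Rules written as (cause, effect): A => B, C => B, A => D, C => E.
RULES: List[Tuple[str, str]] = [("A", "B"), ("C", "B"), ("A", "D"), ("C", "E")]

def hypothesize(observation: str) -> List[str]:
    """Abduce every cause that would explain the observation."""
    return [cause for cause, effect in RULES if effect == observation]

def score_hypotheses(hypotheses: List[str],
                     known: Dict[str, bool]) -> Dict[str, int]:
    """Assumed scoring: +1 per confirmed prediction, -1 per refuted one."""
    scores = {h: 0 for h in hypotheses}
    for cause, effect in RULES:
        if cause in scores and effect in known:
            scores[cause] += 1 if known[effect] else -1
    return scores

candidates = hypothesize("B")
support = score_hypotheses(candidates, {"D": True, "E": False})
print(candidates, support)
```

Observing B yields the candidates A and C; learning that D is true and E is false then raises the score of A and lowers that of C.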

37
Characteristics of abductive reasoning (cont.)

Reasoning is non-monotonic
That is, the plausibility of hypotheses can
increase/decrease as new facts are collected
In contrast, deductive inference is monotonic: it
never changes a sentence's truth value, once known
In abductive (and inductive) reasoning, some
hypotheses may be discarded, and new ones formed,
when new observations are made

38
Sources of uncertainty

Uncertain inputs
Missing data
Noisy data
Uncertain knowledge
Multiple causes lead to multiple effects
Incomplete enumeration of conditions or effects
Incomplete knowledge of causality in the domain
Probabilistic/stochastic effects
Uncertain outputs
Abduction and induction are inherently uncertain
Default reasoning, even in deductive fashion, is
uncertain
Incomplete deductive inference may be uncertain
Probabilistic reasoning only gives probabilistic
results (summarizes uncertainty from various
sources)

39
Decision making with uncertainty

Rational behavior
For each possible action, identify the possible
outcomes
Compute the probability of each outcome
Compute the utility of each outcome
Compute the probability-weighted (expected)
utility over possible outcomes for each action
Select the action with the highest expected
utility (principle of Maximum Expected Utility)
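
The steps above amount to a few lines of Python. The umbrella scenario, its probabilities and its utilities are invented for illustration.

```python
from typing import Dict, List, Tuple

# action -> list of (probability, utility) pairs, one per possible outcome
Outcomes = Dict[str, List[Tuple[float, float]]]

def expected_utility(outcomes: List[Tuple[float, float]]) -> float:
    """Probability-weighted utility over the outcomes of one action."""
    return sum(p * u for p, u in outcomes)

def best_action(actions: Outcomes) -> str:
    """Action with the highest expected utility (Maximum Expected Utility)."""
    return max(actions, key=lambda a: expected_utility(actions[a]))

actions: Outcomes = {
    "take umbrella": [(0.3, 8.0), (0.7, 6.0)],    # (rain, no rain)
    "leave umbrella": [(0.3, 0.0), (0.7, 10.0)],
}
for name, outcomes in actions.items():
    print(name, round(expected_utility(outcomes), 2))
print(best_action(actions))
```

With these numbers, leaving the umbrella has an expected utility of about 7.0 against about 6.6 for taking it, so it is the action selected.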

40
Bayesian reasoning

Probability theory
Bayesian inference
Use probability theory and information about
independence
Reason diagnostically (from evidence (effects) to
conclusions (causes)) or causally (from causes to
effects)
Bayesian networks
Compact representation of probability
distribution over a set of propositional random
variables
Take advantage of independence relationships
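
Diagnostic reasoning from an effect back to a cause is an application of Bayes' rule. The Python sketch below handles the simplest case of one binary hypothesis and one piece of evidence; the function name and the numbers in the example are assumptions for illustration.

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """P(H | E) by Bayes' rule for a binary hypothesis H and evidence E."""
    # Total probability of the evidence over H and not-H.
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    return p_e_given_h * prior / p_e

# Hypothetical disease: 1% prior, test fires 90% of the time when the
# disease is present and 5% of the time when it is absent.
p = posterior(prior=0.01, p_e_given_h=0.9, p_e_given_not_h=0.05)
print(round(p, 4))
```

Even with a fairly accurate test, the low prior keeps the posterior at roughly 15%, which is why the causal direction (from causes to effects) cannot simply be read backwards.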

41
Other uncertainty representations

Default reasoning
Nonmonotonic logic: Allow the retraction of
default beliefs if they prove to be false
Rule-based methods
Certainty factors (MYCIN): propagate simple
models of belief through causal or diagnostic
rules
Evidential reasoning
Dempster-Shafer theory: Bel(P) is a measure of
the evidence for P; Bel(¬P) is a measure of the
evidence against P; together they define a belief
interval (lower and upper bounds on confidence)
Fuzzy reasoning
Fuzzy sets: How well does an object satisfy a
vague property?
Fuzzy logic: How true is a logical statement?
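
For the Dempster-Shafer belief interval, a minimal Python sketch follows. It assumes beliefs are derived from a mass function over subsets of a frame of discernment, with no mass on the empty set, and takes the upper bound to be 1 - Bel(¬P); the flu/cold frame and its masses are invented for the example.

```python
from typing import Dict, FrozenSet, Tuple

Mass = Dict[FrozenSet[str], float]   # mass assigned to subsets of the frame

def belief(m: Mass, a: FrozenSet[str]) -> float:
    """Bel(A): total mass committed to subsets of A."""
    return sum(v for s, v in m.items() if s <= a)

def belief_interval(m: Mass, a: FrozenSet[str],
                    frame: FrozenSet[str]) -> Tuple[float, float]:
    """(Bel(A), 1 - Bel(not A)): lower and upper bounds on confidence in A."""
    return belief(m, a), 1.0 - belief(m, frame - a)

frame = frozenset({"flu", "cold"})
m: Mass = {
    frozenset({"flu"}): 0.5,
    frozenset({"cold"}): 0.2,
    frame: 0.3,                      # mass left uncommitted
}
low, high = belief_interval(m, frozenset({"flu"}), frame)
print(round(low, 2), round(high, 2))
```

The uncommitted mass of 0.3 is what separates the lower bound (0.5) from the upper bound (0.8); with no uncommitted mass the interval collapses to a single probability.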

42
Uncertainty tradeoffs

Bayesian networks: Nice theoretical properties
combined with efficient reasoning make BNs very
popular; limited expressiveness and knowledge
engineering challenges may limit uses
Nonmonotonic logic: Represent commonsense
reasoning, but can be computationally very
expensive
Certainty factors: Not semantically well founded
Dempster-Shafer theory: Has nice formal
properties, but can be computationally expensive,
and intervals tend to grow towards [0,1] (not a
very useful conclusion)
Fuzzy reasoning: Semantics are unclear (fuzzy!),
but has proved very useful for commercial
applications