Title: Similarity in CBR Contd
1. Similarity in CBR (Contd)
- Sources
  - Chapter 4
  - www.iiia.csic.es/People/enric/AICom.html
  - www.ai-cbr.org
2. Simple-Matching-Coefficient (SMC)
n - (A + D) = B + C
- Another distance-similarity compatible function is
  f(x) = 1 - x/max (where max is the maximum value for x)
- We can define the SMC similarity, simH:
  simH(X,Y) = 1 - ((n - (A + D))/n) = (A + D)/n = 1 - ((B + C)/n)
Solution (I): Show that f(x) is order inverting:
if x < y then f(x) > f(y)
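The SMC similarity on this slide can be sketched in a few lines of Python. The slide does not restate what A, B, C, and D stand for, so the sketch assumes the usual reading for binary vectors: A counts positions where both vectors are 1, D counts positions where both are 0, and B and C count the two kinds of mismatch. The function names `smc_counts`, `f`, and `sim_h` are illustrative only.

```python
def smc_counts(x, y):
    """Count the four cells for two binary vectors (assumed meaning of A, B, C, D)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    pairs = list(zip(x, y))
    a = sum(1 for xi, yi in pairs if xi == 1 and yi == 1)  # both 1
    b = sum(1 for xi, yi in pairs if xi == 1 and yi == 0)  # mismatch
    c = sum(1 for xi, yi in pairs if xi == 0 and yi == 1)  # mismatch
    d = sum(1 for xi, yi in pairs if xi == 0 and yi == 0)  # both 0
    return a, b, c, d


def f(x, max_value):
    """Distance-to-similarity conversion f(x) = 1 - x/max."""
    return 1 - x / max_value


def sim_h(x, y):
    """SMC similarity simH(X,Y) = (A + D)/n, with n = A + B + C + D."""
    a, b, c, d = smc_counts(x, y)
    n = a + b + c + d
    return (a + d) / n


x = [1, 0, 1, 1, 0]
y = [1, 1, 0, 1, 0]
a, b, c, d = smc_counts(x, y)
# The mismatch count B + C is the Hamming distance; f turns it into simH.
print(sim_h(x, y), f(b + c, len(x)))
```

Applying `f` to the mismatch count B + C with max = n gives the same value as `sim_h`, which is the chain of equalities on the slide. A smaller distance always gives a larger `f`, which is the order-inverting property the exercise asks about.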
3. Simple-Matching-Coefficient (SMC) (II)
- If we write simH(X,Y) = 1 - ((B + C)/n) = factor(A, B, C, D), then factor is:
- Monotonic
  - If A ≤ A' then factor(A,B,C,D) ≤ factor(A',B,C,D)
  - If B ≤ B' then factor(A,B,C,D) ≥ factor(A,B',C,D)
  - If C ≤ C' then factor(A,B,C,D) ≥ factor(A,B,C',D)
  - If D ≤ D' then factor(A,B,C,D) ≤ factor(A,B,C,D')
- Symmetric
  - simH(X,Y) = simH(Y,X)
Solution (II): Show that simH(X,Y) is monotonic
4. Variations of SMC (III)
- simH(X,Y) = (A + D)/n = (A + D)/(A + B + C + D)
- We introduce a weight, α, with 0 < α < 1:
  simα(X,Y) = (α(A + D)) / (α(A + D) + (1 - α)(B + C))
- For which α is simα(X,Y) = simH(X,Y)?
  α = 0.5
- simα(X,Y) preserves the monotonic and symmetric conditions
Solution (III): Show that simα(X,Y) is monotonic
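A small numeric check can make the role of the weight concrete. The sketch below works directly on the counts A, B, C, D. The function names `sim_alpha` and `sim_h_counts` are hypothetical, and the sketch assumes at least one attribute (n > 0).

```python
def sim_h_counts(a, b, c, d):
    """Unweighted SMC similarity (A + D)/(A + B + C + D)."""
    return (a + d) / (a + b + c + d)


def sim_alpha(a, b, c, d, alpha):
    """Weighted variant: alpha*(A+D) / (alpha*(A+D) + (1-alpha)*(B+C))."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    matches = alpha * (a + d)
    mismatches = (1 - alpha) * (b + c)
    return matches / (matches + mismatches)


counts = (2, 1, 1, 1)  # example values for A, B, C, D
for alpha in (0.2, 0.5, 0.8):
    print(alpha, sim_alpha(*counts, alpha), sim_h_counts(*counts))
```

At alpha = 0.5 the factor 0.5 cancels from numerator and denominator, so the weighted value coincides with simH. Larger alpha gives matches more influence, and smaller alpha gives mismatches more influence.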
5. Homework (Part IV): Attributes May Have Multiple Values
- X = (X1, ..., Xn) where Xi ∈ Ti
- Y = (Y1, ..., Yn) where Yi ∈ Ti
- Each Ti is finite
- Define a formula for the Hamming distance in this context
6. Tversky Contrast Model
- Defines a non-monotonic distance
- Comparison of a situation S with a prototype P (i.e., a case)
- S and P are sets of features
- The following sets:
  - A = S ∩ P
  - B = P - S
  - C = S - P
7. Tversky Contrast Model (2)
- Tversky-distance:
  T(P,S) = αf(A) - βf(B) - γf(C)
- Where f: Sets → [0, ∞), and α, β, and γ are constants
- f, α, β, and γ are fixed and defined by the user
- Example
  - If f(A) = number of elements in A
  - α = β = γ = 1
  - T counts the number of elements in common minus the differences
- The Tversky-distance is not symmetric
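The contrast model maps naturally onto Python sets. The following is a minimal sketch, assuming features are hashable values and using `len` as the default f. The function name `tversky` and the example feature sets are made up for illustration.

```python
def tversky(p, s, alpha=1.0, beta=1.0, gamma=1.0, f=len):
    """T(P,S) = alpha*f(A) - beta*f(B) - gamma*f(C) for feature sets P and S."""
    common = s & p   # A: features shared by S and P
    only_p = p - s   # B: features of the prototype missing in the situation
    only_s = s - p   # C: features of the situation missing in the prototype
    return alpha * f(common) - beta * f(only_p) - gamma * f(only_s)


prototype = {"wheels", "engine", "doors", "roof"}
situation = {"wheels", "engine", "sail"}

# With alpha = beta = gamma = 1 and f = len: common features minus differences.
print(tversky(prototype, situation))
# With beta != gamma, swapping the arguments changes the value.
print(tversky(prototype, situation, beta=2.0))
print(tversky(situation, prototype, beta=2.0))
```

In this sketch the value depends on argument order only when beta and gamma differ. With equal weights and f = len, the two difference terms simply trade places and the result is the same in both directions.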
8. Local versus Global Similarity Metrics
- In many situations we have similarity metrics between attributes of the same type (called local similarity metrics). Example:
  For a complex engine, we may have a similarity metric for the temperature of the engine
- In such situations a reasonable approach to define a global similarity simΦ(x,y) is to aggregate the local similarity metrics simi(xi,yi). This is a widely used practice
- What requirements should we place on simΦ(x,y) in terms of its use of simi(xi,yi)?
  simΦ(x,y) should increase monotonically with each simi(xi,yi).
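One way to picture the aggregation of local metrics is a weighted average, sketched below. Everything specific here is an assumption made for illustration: the two attributes (engine temperature and fuel type), the 100-degree scale in `sim_temperature`, the weights, and all function names.

```python
def sim_temperature(t1, t2, max_diff=100.0):
    """Hypothetical local metric: 1 for equal temperatures, 0 at max_diff or more."""
    return max(0.0, 1.0 - abs(t1 - t2) / max_diff)


def sim_equal(v1, v2):
    """Hypothetical local metric for symbolic values: exact match or nothing."""
    return 1.0 if v1 == v2 else 0.0


def weighted_average(local_sims, weights):
    """Aggregate local similarities; non-negative weights keep it monotonic."""
    return sum(w * s for w, s in zip(weights, local_sims)) / sum(weights)


def global_similarity(x, y, local_metrics, weights):
    """Apply each local metric to its attribute pair, then aggregate."""
    local_sims = [sim(xi, yi) for sim, xi, yi in zip(local_metrics, x, y)]
    return weighted_average(local_sims, weights)


metrics = [sim_temperature, sim_equal]
weights = [2.0, 1.0]  # temperature counts twice as much as fuel type

case_a = (90.0, "diesel")
case_b = (70.0, "diesel")
case_c = (70.0, "petrol")
print(global_similarity(case_a, case_b, metrics, weights))
print(global_similarity(case_a, case_c, metrics, weights))
```

Because the weights are non-negative, raising any single local similarity can never lower the aggregated value, which is the monotonicity requirement stated on the slide.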
9. Local versus Global Similarity Metrics (Formal Definitions)
- A local similarity metric on an attribute Ti is a similarity metric simi: Ti × Ti → [0,1]
- A function Φ: [0,1]^n → [0,1] is an aggregation function if
  - Φ(0,0,...,0) = 0
  - Φ is monotonic non-decreasing on every argument
- Given a collection of n similarity metrics sim1, ..., simn, for attributes taking values from Ti, a global similarity metric is a similarity metric sim: V × V → [0,1], V in T1 × ... × Tn, such that there is an aggregation function Φ with
  - sim(X,Y) = simΦ(X,Y) = Φ(sim1(X1,Y1), ..., simn(Xn,Yn))
Homework: provide an example of an aggregation function and a non-aggregation function and prove it. Show a global sim. metric
10. Solution
- Suppose that cases use an object oriented representation
- Suppose that cases use a taxonomical representation, describe how you would measure similarity and give a concrete example illustrating the process you described to measure similarity
- Suppose that cases use a compositional representation, describe how you would measure similarity and give a concrete example illustrating the process you described to measure similarity
- Suggestion: look at the book!
11. Frontiers of Knowledge
- Dealing with numerical and non-numerical values
- Aggregation of local similarity metrics into a global similarity metric helps
  - but sometimes we don't have local similarity metrics
12. Homework (II)
- From Chapter 5, what is the difference between completion and adaptation functions? What is their role in adaptation? Provide an example
- Show that Graph coloring is NP-complete
  - Assume that Constraint-SAT is NP-complete
  - Definition. A constraint is a formula of the form
    - (x = y)
    - (x ≠ y)
  - Where x and y are variables that can take values from a set (e.g., yellow, white, black, red, ...)
  - Definition. Constraint-SAT: given a conjunction of constraints, is there an instantiation of the variables that makes the conjunction true?
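To make the Constraint-SAT definition concrete, here is a brute-force sketch that tries every instantiation of the variables. It only illustrates what a satisfying instantiation is and is not part of the requested proof. The encoding of a constraint as a tuple `(x, op, y)` with `op` being `"=="` or `"!="`, and the function name `constraint_sat`, are assumptions made for this example.

```python
from itertools import product


def constraint_sat(constraints, domain):
    """Return a satisfying assignment as a dict, or None if none exists.

    constraints: list of (x, op, y) tuples, op is "==" or "!=" (assumed encoding).
    domain: the set of values every variable may take.
    """
    for _, op, _ in constraints:
        if op not in ("==", "!="):
            raise ValueError(f"unknown operator: {op}")
    variables = sorted({v for x, _, y in constraints for v in (x, y)})
    # Exhaustive search: exponential in the number of variables.
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all((assignment[x] == assignment[y]) == (op == "==")
               for x, op, y in constraints):
            return assignment
    return None


colors = ["red", "white"]
print(constraint_sat([("a", "!=", "b"), ("b", "!=", "c"), ("a", "==", "c")], colors))
print(constraint_sat([("a", "!=", "b"), ("a", "==", "b")], colors))
```

The first conjunction has a satisfying instantiation, and the second is contradictory, so the sketch returns `None` for it.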