1
Ch 2. Concept Learning and General-to-Specific Ordering
  • Lee, Joo-Young

2
Contents
  • Concept Learning and Terminology
  • General-to-Specific Ordering
  • Version Space
  • Find-S
  • Candidate-Elimination
  • Inductive Bias

3
Concept Learning
  • Inferring a boolean-valued function from training
    examples of its input and output
  • Classification
  • Acquiring the definition of a general category
    given a sample of positive and negative training
    examples of the category.
  • Inductive learning
  • Not deductive learning

4
Terminology
  • Instance: a set of attribute values
  • Instance space X: the set of all distinct
    instances
  • Hypothesis space H: the set of all distinct
    hypotheses h
  • h: X → {0, 1}
  • Set of training examples D ⊆ {⟨x, y⟩ | x ∈ X, y ∈ {0, 1}}
  • Target concept c
  • c: X → {0, 1}
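
This terminology maps onto a few lines of code. A minimal sketch, assuming instances are tuples of attribute values and hypotheses are tuples of constraints ('?' for any value, None standing in for ∅); the names Instance, Hypothesis, and h_of are illustrative, not from the slides:

from typing import Optional, Tuple

Instance = Tuple[str, ...]
Hypothesis = Tuple[Optional[str], ...]  # each entry: '?', a single value, or None (∅)

def h_of(h: Hypothesis, x: Instance) -> int:
    """h: X -> {0, 1}; returns 1 iff every constraint in h is satisfied by x."""
    return int(all(c == '?' or c == v for c, v in zip(h, x)))

x = ('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same')
print(h_of(('Sunny', 'Warm', '?', '?', '?', '?'), x))  # 1
print(h_of((None,) * 6, x))                            # 0: ∅ satisfies nothing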

5
Example
  • Hypothesis representation
  • A conjunction of constraints over the instance
    attributes
  • Each constraint is ?, a single value, or ∅
  • ⟨Sunny, Warm, Normal, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩
  • ⟨?, ?, ?, ?, ?, ?⟩ (most general), ⟨∅, ∅, ∅, ∅, ∅, ∅⟩ (most specific)
  • ⟨∅, ?, ?, ?, ?, ?⟩

6
Example (Cont'd)
  • |X| = 3·2·2·2·2·2 = 96 distinct instances
  • |H|
  • 5·4·4·4·4·4 = 5120 syntactically distinct
    hypotheses
  • 1 + (4·3·3·3·3·3) = 973 semantically distinct
    hypotheses
  • D
  • Positive examples: x1, x2, x4
  • Negative example: x3
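
These three counts can be checked mechanically. A short sketch, assuming the EnjoySport attribute sizes above (Sky has 3 values, the other five attributes have 2 each):

from math import prod

sizes = [3, 2, 2, 2, 2, 2]
distinct_instances = prod(sizes)                # 3*2*2*2*2*2 = 96
syntactic = prod(s + 2 for s in sizes)          # each attribute also allows '?' and ∅: 5*4^5 = 5120
semantic = 1 + prod(s + 1 for s in sizes)       # every hypothesis containing ∅ classifies all
                                                # instances negative, so they count once: 1 + 4*3^5 = 973
print(distinct_instances, syntactic, semantic)  # 96 5120 973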

7
Assumption
  • Learning: find h ∈ H such that h(x) = c(x) for
    all x ∈ X
  • Inductive learning assumption
  • If we find an h with h(x) = c(x) for all given
    examples, this h will also be correct for unseen data.

8
Concept Learning as Search
  • Concept learning can be viewed as a task of
    searching for the best hypothesis in H
  • The best hypothesis is the one that best fits the
    examples
  • e.g.) a search among the 1 + (4·3·3·3·3·3) = 973
    semantically distinct hypotheses

9
General-to-Specific Ordering
  • Let hj, hk ∈ H. hj ≥g hk iff
    (∀x ∈ X) [(hk(x) = 1) → (hj(x) = 1)]
  • See figure 2.1 (p. 25)
  • hj >g hk iff (hj ≥g hk) ∧ ¬(hk ≥g hj)
  • The ≥g relation is a partial order (see p. 24)

The ≥g relation is important because it provides
a useful structure over the hypothesis space H
for any concept learning problem.
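
For conjunctive hypotheses the ≥g test reduces to a per-attribute comparison. A minimal sketch under the same '?'/None encoding as the earlier sketch; the function names are illustrative:

def more_general_or_equal(hj, hk):
    # hj ≥g hk: every instance satisfying hk also satisfies hj.
    # '?' covers everything; None (∅) covers nothing, so anything covers it.
    return all(ck is None or cj == '?' or cj == ck for cj, ck in zip(hj, hk))

def strictly_more_general(hj, hk):
    return more_general_or_equal(hj, hk) and not more_general_or_equal(hk, hj)

print(more_general_or_equal(('Sunny', '?'), ('Sunny', 'Warm')))  # True
print(more_general_or_equal(('Sunny', '?'), ('?', 'Warm')))      # False: an incomparable pair,
print(more_general_or_equal(('?', 'Warm'), ('Sunny', '?')))      # so ≥g is only a partial order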
10
Find-S
h ← the most specific hypothesis in H
for each positive training example x
    for each attribute constraint ai in h
        if the constraint ai is not satisfied by x
            replace ai in h by the next more general
            constraint that is satisfied by x
output h
  • Example in figure 2.2 (p. 27)
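
A direct Python rendering of the pseudocode above, again with '?' as the wildcard and None as ∅; find_s and the loop structure are one possible reading, not the book's code:

def find_s(examples, n_attrs):
    h = [None] * n_attrs                    # the most specific hypothesis in H
    for x, positive in examples:
        if not positive:                    # Find-S ignores negative examples
            continue
        for i in range(n_attrs):
            if h[i] is None:                # ∅ -> the observed value
                h[i] = x[i]
            elif h[i] != '?' and h[i] != x[i]:
                h[i] = '?'                  # conflicting values -> generalize to '?'
    return tuple(h)

On the EnjoySport data (x1, x2, x4 positive; x3 negative) this yields ('Sunny', 'Warm', '?', 'Strong', '?', '?'), as in figure 2.2.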

11
Remarks on Find-S
  • Find-S is guaranteed to output the most specific
    hypothesis within H that is consistent with the
    positive training examples.
  • If the target concept is in H and there are no
    errors in the training examples, the algorithm
    finds a hypothesis that is also consistent with
    the negative training examples
  • Why the most specific hypothesis?
  • What if there are several or no maximally
    specific hypotheses?
  • What if there is an error in the training
    examples?
  • Has the learner converged to the correct target
    concept?

12
Version Space
  • Consistent: a hypothesis h is consistent with a
    set of training examples D if and only if h(x) =
    c(x) for each example ⟨x, c(x)⟩ in D.
  • Consistent(h, D) ≡ (∀⟨x, c(x)⟩ ∈ D) h(x) = c(x)
  • Version space: the version space VS_H,D with
    respect to hypothesis space H and training
    examples D is the subset of hypotheses from H
    consistent with the training examples in D
  • VS_H,D = {h ∈ H | Consistent(h, D)}
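
Under the same encoding the Consistent predicate is a one-liner; match and consistent are illustrative names, and D is assumed to be a list of (instance, boolean label) pairs:

def match(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

def consistent(h, D):
    """True iff h(x) = c(x) for every <x, c(x)> in D."""
    return all(match(h, x) == label for x, label in D)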

13
List-Then-Eliminate Algorithm
  • VS ← H
  • for each training example ⟨x, c(x)⟩
  •   for each h in VS
  •     if h(x) ≠ c(x)
  •       VS ← VS − {h}
  • output VS
  • The List-Then-Eliminate algorithm outputs the
    set of all hypotheses consistent with D
  • Weakness: requires exhaustively enumerating all
    hypotheses in H
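
A sketch of List-Then-Eliminate over an explicitly enumerated H, which is workable only for tiny hypothesis spaces (exactly the weakness noted above); the two toy attributes here are assumptions made for the enumeration:

from itertools import product

def match(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

def list_then_eliminate(H, D):
    VS = list(H)
    for x, label in D:
        VS = [h for h in VS if match(h, x) == label]  # drop inconsistent h
    return VS

values = [('Sunny', 'Cloudy', 'Rainy'), ('Warm', 'Cold')]
H = list(product(*[v + ('?', None) for v in values]))  # 5 * 4 = 20 hypotheses
D = [(('Sunny', 'Warm'), True), (('Rainy', 'Cold'), False)]
print(list_then_eliminate(H, D))  # [('Sunny', 'Warm'), ('Sunny', '?'), ('?', 'Warm')]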

14
Specific and General Boundary
  • The general boundary G, with respect to
    hypothesis space H and training data D, is the
    set of maximally general members of H consistent
    with D.
  • G ≡ {g ∈ H | Consistent(g, D) ∧ (¬∃g′ ∈ H)
    [(g′ >g g) ∧ Consistent(g′, D)]}
  • The specific boundary S, with respect to
    hypothesis space H and training data D, is the
    set of minimally general (i.e. maximally
    specific) members of H consistent with D.
  • S ≡ {s ∈ H | Consistent(s, D) ∧ (¬∃s′ ∈ H)
    [(s >g s′) ∧ Consistent(s′, D)]}
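
Given the version space as an explicit list (e.g. from the List-Then-Eliminate sketch above), both boundaries can be computed by brute force. Quantifying over VS instead of H is equivalent because every consistent hypothesis belongs to the version space; the function names are illustrative:

def more_general_or_equal(hj, hk):
    return all(ck is None or cj == '?' or cj == ck for cj, ck in zip(hj, hk))

def strictly_more_general(hj, hk):
    return more_general_or_equal(hj, hk) and not more_general_or_equal(hk, hj)

def boundaries(VS):
    # G: no member of VS is strictly more general; S: none is strictly more specific.
    G = [h for h in VS if not any(strictly_more_general(h2, h) for h2 in VS)]
    S = [h for h in VS if not any(strictly_more_general(h, h2) for h2 in VS)]
    return S, G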

15
Version Space Representation Theorem
  • See p. 32
  • VS_H,D = {h ∈ H | (∃s ∈ S)(∃g ∈ G) (g ≥g h ≥g s)}
  • Proof

16
Intuition on the VS Representation Theorem
  • Any hypothesis more general than S covers all
    past positive examples.
  • Any hypothesis more specific than G covers no
    past negative examples.

17
Candidate-Elimination Algorithm
  • S ← set of maximally specific hypotheses in H
  • G ← set of maximally general hypotheses in H
  • For each training example d ∈ D, do
  • if d is a positive example
  •   G ← G − {g ∈ G | ¬C(g, d)}   (C = Consistent)   ← Why?
  •   for each s ∈ S with ¬C(s, d)
  •     S ← S − {s}
  •     S ← S ∪ {h ∈ H | (h >min s) ∧ C(h, d) ∧
        (∃g ∈ G)(g > h)}   ← Why?
  •     S ← S − {si ∈ S | (∃sj ∈ S)(si > sj)}
  • if d is a negative example
  •   S ← S − {s ∈ S | ¬C(s, d)}   ← Why?
  •   for each g ∈ G with ¬C(g, d)
  •     G ← G − {g}
  •     G ← G ∪ {h ∈ H | (g >min h) ∧ C(h, d) ∧
        (∃s ∈ S)(h > s)}   ← Why?
  •     G ← G − {gi ∈ G | (∃gj ∈ G)(gj > gi)}
  • Examples: pp. 34-36
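
The set notation above renders directly into code. A minimal sketch of Candidate-Elimination for conjunctive hypotheses over discrete attributes, with '?' as the wildcard and None as ∅; all helper names are assumptions, and the "remove more general si" step is vacuous here because S stays a singleton for this representation:

def match(h, x):
    return all(c == '?' or c == v for c, v in zip(h, x))

def more_general_or_equal(h1, h2):      # h1 ≥g h2, checked attribute by attribute
    return all(c2 is None or c1 == '?' or c1 == c2 for c1, c2 in zip(h1, h2))

def min_generalize(s, x):
    """The minimal generalization of s that covers the positive example x."""
    return tuple(v if c is None else (c if c == v else '?') for c, v in zip(s, x))

def min_specializations(g, x, values):
    """Minimal specializations of g that exclude the negative example x."""
    return [g[:i] + (v,) + g[i + 1:]
            for i, c in enumerate(g) if c == '?'
            for v in values[i] if v != x[i]]

def candidate_elimination(examples, values):
    n = len(values)
    S = {(None,) * n}                   # the most specific hypothesis
    G = {('?',) * n}                    # the most general hypothesis
    for x, positive in examples:
        if positive:
            G = {g for g in G if match(g, x)}          # drop inconsistent g
            S = {s if match(s, x) else min_generalize(s, x) for s in S}
            S = {s for s in S                          # keep s only if some g ≥g s
                 if any(more_general_or_equal(g, s) for g in G)}
        else:
            S = {s for s in S if not match(s, x)}      # drop inconsistent s
            new_G = set()
            for g in G:
                if not match(g, x):
                    new_G.add(g)                       # g already excludes x
                else:
                    new_G.update(h for h in min_specializations(g, x, values)
                                 if any(more_general_or_equal(h, s) for s in S))
            G = {g for g in new_G                      # keep only maximally general g
                 if not any(g2 != g and more_general_or_equal(g2, g) for g2 in new_G)}
    return S, G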
18
Apply to Example
S: ⟨Sunny, Warm, ?, Strong, ?, ?⟩
between S and G: ⟨Sunny, ?, ?, Strong, ?, ?⟩,
  ⟨Sunny, Warm, ?, ?, ?, ?⟩, ⟨?, Warm, ?, Strong, ?, ?⟩
G: ⟨Sunny, ?, ?, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩
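
These boundaries fall out of the candidate_elimination sketch above when run on the EnjoySport data; the attribute value sets below are assumptions consistent with the text:

values = [('Sunny', 'Cloudy', 'Rainy'), ('Warm', 'Cold'), ('Normal', 'High'),
          ('Strong', 'Weak'), ('Warm', 'Cool'), ('Same', 'Change')]
examples = [
    (('Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same'), True),   # x1
    (('Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same'), True),     # x2
    (('Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change'), False),  # x3
    (('Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change'), True),   # x4
]
S, G = candidate_elimination(examples, values)
print(S)  # {('Sunny', 'Warm', '?', 'Strong', '?', '?')}
print(G)  # {('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')}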
19
New Example
  • H = {y > ax | a > 0}, over instances (x, y) with 0 < x, y < 1
  • D: (figure: the unit square with corner (1, 1);
    points 1 and 2 are positive, points 3 and 4 are negative)
  • S, G, VS?
20
Remarks on C-E
  • The C-E algorithm converges to the hypothesis
    that correctly describes the target concept when
  • 1) there are no errors in the training examples
  • 2) there is some hypothesis in H that correctly
    describes the target concept
  • Positive training examples force S to become more
    general.
  • Negative training examples force G to become more
    specific.

21
What Next?
  • Which training example should an active learner
    request next?
  • Ask for an example that satisfies exactly half
    the hypotheses in the current version space.
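
A sketch of that query-selection rule: prefer the candidate instance whose classification splits the current version space closest to half, so either answer discards roughly |VS|/2 hypotheses. The name best_query and the candidate-list interface are illustrative:

def best_query(candidates, VS):
    def match(h, x):
        return all(c == '?' or c == v for c, v in zip(h, x))
    def imbalance(x):
        positives = sum(match(h, x) for h in VS)
        return abs(positives - len(VS) / 2)  # 0 means a perfect half/half split
    return min(candidates, key=imbalance)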

22
Expressiveness of Hypothesis Space
23
Unbiased Learner
  • |X| = 3·2·2·2·2·2 = 96
  • Disjunctions of conjunctions
  • |H| = 2^96
  • e.g.) ⟨Sunny, ?, ?, ?, ?, ?⟩ ∨ ⟨Cloudy, ?, ?, ?, ?, ?⟩
  • S = {(x1 ∨ x2 ∨ x3)}, G = {¬(x4 ∨ x5)}
  • Not learning, i.e., memorizing
  • Voting is futile
  • A learner that makes no a priori assumptions
    regarding the identity of the target concept has
    no rational basis for classifying any unseen
    instances.

24
Inductive Bias
  • L: learning algorithm
  • X: instance space
  • c: an arbitrary concept defined over X
  • Dc = {⟨x, c(x)⟩}: training examples
  • L(xi, Dc): the classification that L assigns to
    xi after learning from the training data Dc
  • The inductive bias of L is any minimal set of
    assertions B such that for any target concept c
    and corresponding training examples Dc
  • (∀xi ∈ X) [(B ∧ Dc ∧ xi) ⊢ L(xi, Dc)]