Automatic Acquisition of Fuzzy Footprints

1
Automatic Acquisition of Fuzzy Footprints
  • Steven Schockaert, Martine De Cock, Etienne E.
    Kerre

2
  • Introduction
  • Constructing fuzzy footprints
  • Experimental results

3
Geographical Question Answering
Give a list of Italian restaurants in the neighborhood of Agia Napa.
La Strada Italian Restaurant, Boskos ristorante, ...

4
Geographic Question Answering
  • Resources
      • Linguistic resources for question analysis,
        answer extraction, ...
      • A traditional search engine to locate relevant
        documents
      • Geographic background knowledge
  • Footprints provided by gazetteers are often
    inadequate
      • We need a more fine-grained representation than a
        bounding box
      • Questions may involve vague regions such as the
        Alps, the Highlands, ...
  • Our solution: construct footprints automatically
      • Use the web to collect relevant information
      • Use a digital gazetteer to map location names to
        co-ordinates
      • Use fuzzy sets to represent footprints

5
Fuzzy Sets
  • A fuzzy set A in a universe U is a mapping from U
    to [0,1] (Zadeh, 1965)
  • u belongs to A ⇔ A(u) = 1
  • u doesn't belong to A ⇔ A(u) = 0
  • u more or less belongs to A ⇔ 0 < A(u) < 1
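
As a minimal sketch of the definition above, a fuzzy set can be written as a function that maps each element of the universe to a degree in [0,1]. The example below models a fuzzy set such as "Old" over ages; the thresholds 50 and 70 and the linear transition are illustrative assumptions, not values taken from the presentation.

```python
def old(age, not_old_until=50.0, old_from=70.0):
    """Membership degree of `age` in a hypothetical fuzzy set "Old".

    The two thresholds are assumptions chosen for illustration only.
    """
    if age <= not_old_until:
        return 0.0  # does not belong to the set
    if age >= old_from:
        return 1.0  # fully belongs to the set
    # more or less belongs to the set: linear transition between thresholds
    return (age - not_old_until) / (old_from - not_old_until)


for age in (40, 60, 80):
    print(age, old(age))
```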

[Figure: fuzzy set "Old"]
6
Fuzzy Footprints
  • We represent footprints as fuzzy sets in the
    universe of co-ordinates

[Figure: fuzzy footprint of the South of France]
7
  • Introduction
  • Constructing fuzzy footprints
  • Experimental results

8
Obtaining relevant locations
the Ardeche region
  • Located in the north of the Ardeche region, <city>
  • (<city>,) and other cities in the Ardeche region
  • <city> is situated in the heart of the Ardeche region
St-Félicien, Lamastre, St-Agrève, ...
ADL gazetteer
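
The pattern-based harvesting on this slide could be sketched as follows. This is only an illustration under stated assumptions: the regular expression for a place name, the function names, and the sample snippets are hypothetical, and a real system would look every extracted name up in a gazetteer before using it.

```python
import re

# Hypothetical approximation of a place name: capitalised words
# joined by hyphens or spaces (e.g. "St-Félicien").
CITY = r"([A-Z][\w']*(?:[- ][A-Z][\w']*)*)"


def build_patterns(region):
    """Instantiate the slide's three example patterns for a region name."""
    r = re.escape(region)
    return [
        re.compile(rf"[Ll]ocated in the north of {r}, {CITY}"),
        re.compile(rf"{CITY} and other cities in {r}"),
        re.compile(rf"{CITY} is situated in the heart of {r}"),
    ]


def extract_candidates(snippets, region):
    """Return the set of candidate city names matched in the snippets."""
    found = set()
    for pattern in build_patterns(region):
        for snippet in snippets:
            found.update(pattern.findall(snippet))
    return found


# Made-up snippets, standing in for search engine results.
snippets = [
    "Lamastre is situated in the heart of the Ardeche region.",
    "Located in the north of the Ardeche region, St-Félicien is a small town.",
    "St-Agrève and other cities in the Ardeche region are worth a visit.",
]
print(sorted(extract_candidates(snippets, "the Ardeche region")))
```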
9
Obtaining relevant locations
  • Disambiguation of location names based on
      • the country the region is located in
      • the distance to the other locations
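
One way the two criteria could be combined is sketched below. The slide does not say how they are applied, so the order (country first, then distance), the use of the coordinate-wise median as a stand-in for "the other locations", and the planar distance are all assumptions; the function name and data layout are hypothetical.

```python
from math import hypot
from statistics import median


def disambiguate(candidates, country):
    """Pick one gazetteer entry per location name.

    candidates maps a name to a list of (country, x, y) entries;
    the result maps each resolvable name to one (x, y) pair.
    """
    # Criterion 1: keep only entries in the country the region lies in.
    kept = {}
    for name, entries in candidates.items():
        coords = [(x, y) for c, x, y in entries if c == country]
        if coords:
            kept[name] = coords
    if not kept:
        return {}

    # Criterion 2: prefer the entry closest to the other locations,
    # summarised here by the coordinate-wise median of all kept entries.
    everything = [p for coords in kept.values() for p in coords]
    mx = median(x for x, _ in everything)
    my = median(y for _, y in everything)
    return {
        name: min(coords, key=lambda p: hypot(p[0] - mx, p[1] - my))
        for name, coords in kept.items()
    }


# Made-up entries for illustration; not real gazetteer data.
candidates = {
    "A": [("FR", 4.0, 45.0)],
    "B": [("FR", 4.5, 45.0), ("FR", 0.0, 48.0), ("US", -90.0, 30.0)],
    "C": [("FR", 4.2, 44.8)],
    "D": [("US", 1.0, 1.0)],
}
print(disambiguate(candidates, "FR"))
```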

10
Constructing a footprint
  • Existing approaches
      • Use the convex hull of the locations
          → web data is too noisy
          → not suitable for vague regions
      • Use the density of the locations (Purves et al.,
        2005)
          → reflects popularity rather than the extent of
            a region
  • Our solution: search for additional constraints
    to filter out noise

11
Constructing a footprint
x is in the north of the Ardeche region
12
Constructing a footprint
inconsistent
x is in the north of the Ardeche region
???
consistent
13
Modelling constraints
x is located in the north of the Ardeche
Inconsistent
Gradual transition
Consistent
14
Modelling constraints
x is located in the north of the Ardeche
Inconsistent
Gradual transition
Based on the average difference in y co-ordinates
Consistent
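
The figure labels suggest that, given a statement "x is in the north of the region", points well to the north of x are inconsistent with it, points to the south of x are consistent, and there is a gradual transition in between whose width depends on the average difference in y co-ordinates. The sketch below is one possible reading of that idea, not the presenters' formula: the linear shape and the use of the mean absolute deviation as the "average difference" are assumptions.

```python
def consistent_with_north(p_y, x_y, ys):
    """Degree to which a point with y co-ordinate p_y is consistent with
    "x is in the north of the region", where x has y co-ordinate x_y.

    ys holds the y co-ordinates of the candidate locations; their mean
    absolute deviation sets the width of the gradual transition.
    """
    mean_y = sum(ys) / len(ys)
    width = sum(abs(y - mean_y) for y in ys) / len(ys)
    if p_y <= x_y:
        return 1.0  # at or south of x: consistent
    if width == 0:
        return 0.0  # no spread in the data: crisp boundary at x
    # north of x: degree decreases linearly, reaching 0 at x_y + width
    return max(0.0, 1.0 - (p_y - x_y) / width)


ys = [0.0, 2.0, 4.0]
for p_y in (3.0, 4.5, 6.0):
    print(p_y, consistent_with_north(p_y, x_y=4.0, ys=ys))
```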
15
Modelling constraints
  • In a similar way:
  • x is located in the south of the Ardeche
  • x is located in the west of the Ardeche
  • x is located in the east of the Ardeche
  • x is located in the north-west of the Ardeche
    ⇔ x is located in the north of the Ardeche
    ∧ x is located in the west of the Ardeche
  • x is located in the heart of the Ardeche
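
If the north-west constraint is read as the conjunction of the north and west constraints, its degree of satisfaction can be computed from the two component degrees. The slide does not name the fuzzy conjunction that is used; the minimum, shown here, is a common choice and is an assumption of this sketch.

```python
def fuzzy_and(*degrees):
    """Fuzzy conjunction using the minimum (an assumed choice of t-norm)."""
    return min(degrees)


def north_west_degree(north_degree, west_degree):
    """Degree for "north-west", given the degrees for "north" and "west"."""
    return fuzzy_and(north_degree, west_degree)


# A point fully consistent with "north" but only partly with "west".
print(north_west_degree(1.0, 0.4))
```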

16
Modelling constraints
the Ardeche is located in the south of France
Inconsistent
Gradual transition
Consistent
17
Modelling constraints
the Ardeche is located in the south of France
Inconsistent
Gradual transition
Based on the minimal bounding box for France (ADL
gazetteer)
Consistent
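
A constraint such as "the Ardeche is located in the south of France" can be sketched relative to the bounding box of the country: points in the southern part of the box are consistent, points in the northern part are inconsistent, with a gradual transition between them. The transition limits (40% and 60% of the box height) and the box used in the example are placeholders, not values from the ADL gazetteer or from the presentation.

```python
def south_degree(lat, box_south, box_north, full=0.4, zero=0.6):
    """Degree to which latitude `lat` lies in the south of a bounding box.

    `full` and `zero` are assumed relative heights: below `full` the
    degree is 1, above `zero` it is 0, with a linear transition between.
    """
    rel = (lat - box_south) / (box_north - box_south)
    if rel <= full:
        return 1.0
    if rel >= zero:
        return 0.0
    return (zero - rel) / (zero - full)


# Placeholder bounding box spanning latitudes 40 to 50.
for lat in (43.0, 45.0, 48.0):
    print(lat, south_degree(lat, box_south=40.0, box_north=50.0))
```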
18
Modelling constraints
  • In a similar way
  • R is located in the north of France
  • R is located in the east of France
  • R is located in the west of France
  • R is located in the north-west of France
    ⇔ R is located in the north of France
    ∧ R is located in the west of France
  • R is located in the heart of France

19
Modelling constraints
Heuristic: points that are too far from
the median are likely to be noise
Inconsistent
Gradual transition
Consistent
20
Modelling constraints
Heuristic: points that are too far from
the median are likely to be noise
Inconsistent
Gradual transition
Based on the average distance to the median
Consistent
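
The median heuristic can be sketched as a membership degree that depends on each point's distance to the median, scaled by the average distance to the median. The multipliers `inner` and `outer` below are hypothetical parameters, and the coordinate-wise median with planar distance is an assumed simplification.

```python
from math import hypot
from statistics import median


def median_degrees(points, inner=2.0, outer=4.0):
    """Degree to which each point is consistent with the median heuristic.

    Points within inner * d of the median get degree 1, points beyond
    outer * d get degree 0, where d is the average distance to the median.
    """
    mx = median(x for x, _ in points)
    my = median(y for _, y in points)
    dists = [hypot(x - mx, y - my) for x, y in points]
    d = sum(dists) / len(dists)

    def degree(r):
        if d == 0 or r <= inner * d:
            return 1.0
        if r >= outer * d:
            return 0.0
        return (outer * d - r) / ((outer - inner) * d)

    return [degree(r) for r in dists]


# Four clustered points and one far-away point standing in for noise.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (100, 100)]
print(median_degrees(points))
```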
21
Example
Constraints satisfied to degree 0
Constraints satisfied to degree 0.4
Constraints satisfied to degree 0.6
Constraints satisfied to degree 1
22
Example
Constraints satisfied to degree 1
23
Example
Constraints satisfied to degree 0.6
24
Example
Constraints satisfied to degree 0.4
25
Some remarks
  • If the set of constraints is inconsistent (i.e.
    no point satisfies all constraints), we remove a
    minimal set of constraints such that
      • As many constraints as possible are preserved
      • The area of the fuzzy footprint is as large as
        possible
  • Imposing constraints is used to improve
    precision, not recall
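
The repair step for inconsistent constraint sets could be sketched as an exhaustive search: try the largest subsets of constraints first and, among consistent subsets of the same size, keep the one whose footprint is largest. Several choices here are assumptions: a subset counts as consistent when some candidate point satisfies all of its constraints to a degree above 0, constraints are combined with the minimum, and the area is approximated by summing degrees over the candidate points. The search is exponential in the number of constraints, so it only suits small sets.

```python
from itertools import combinations


def largest_consistent_subset(constraints, points):
    """Return a maximal consistent subset of constraints.

    constraints is a list of functions mapping a point to a degree in [0,1].
    """
    def footprint(subset):
        # degree of each point under all constraints in the subset
        return [min((c(p) for c in subset), default=1.0) for p in points]

    for size in range(len(constraints), -1, -1):
        best = None
        for subset in combinations(constraints, size):
            degrees = footprint(subset)
            if max(degrees, default=0.0) > 0.0:  # some point satisfies all
                area = sum(degrees)
                if best is None or area > best[0]:
                    best = (area, list(subset))
        if best is not None:
            return best[1]
    return []


# One-dimensional toy example with crisp constraints.
def below_3(p):
    return 1.0 if p < 3 else 0.0

def above_7(p):
    return 1.0 if p > 7 else 0.0

def below_5(p):
    return 1.0 if p < 5 else 0.0

kept = largest_consistent_subset([below_3, above_7, below_5], list(range(11)))
print([c.__name__ for c in kept])
```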

26
Bordering regions
Footprint can be constructed using the ADL
gazetteer
27
  • Introduction
  • Constructing fuzzy footprints
  • Experimental results

28
Evaluation metric
  • Precision: degree to which the fuzzy footprint F
    is included in the correct footprint G
  • Recall: degree to which the correct footprint G
    is included in the fuzzy footprint F
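
The slide does not give the formula for the degree of inclusion. A common way to measure how far one fuzzy set is included in another is the ratio of the size of their intersection to the size of the first set; the sketch below assumes that definition, with the minimum as intersection and footprints discretised into grid cells. The cell names and degrees are made up.

```python
def precision_recall(F, G):
    """Precision and recall of fuzzy footprint F against footprint G.

    F and G map grid cells to membership degrees in [0,1]; cells that
    are missing from a dict have degree 0.
    """
    cells = set(F) | set(G)
    overlap = sum(min(F.get(c, 0.0), G.get(c, 0.0)) for c in cells)
    size_f = sum(F.values())
    size_g = sum(G.values())
    precision = overlap / size_f if size_f else 0.0
    recall = overlap / size_g if size_g else 0.0
    return precision, recall


F = {"a": 1.0, "b": 0.5, "c": 0.5}   # constructed fuzzy footprint
G = {"a": 1.0, "b": 1.0, "d": 1.0}   # correct (crisp) footprint
print(precision_recall(F, G))
```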

29
Test data
  • 81 political subregions of France, Italy, Canada,
    Australia and China
  • Divided into three groups
      • Regions for which we found more than 30 candidate
        cities
      • Regions for which we found fewer than 10 candidate
        cities
      • Regions for which we found between 10 and 30
        candidate cities
  • Gold standard: convex hull of the locations that
    are known to lie in the region according to the
    ADL gazetteer
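
For reference, a convex hull of a set of locations can be computed with Andrew's monotone chain algorithm, sketched below. The presentation does not say which algorithm or tool was used for the gold standard, and the sample points are made up.

```python
def convex_hull(points):
    """Convex hull of 2-D points, in counter-clockwise order.

    Uses Andrew's monotone chain algorithm; collinear points on the
    boundary are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # the last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]


locations = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
print(convex_hull(locations))
```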

30
Precision
  • Without bordering regions
  • With bordering regions

31
Recall
  • Without bordering regions
  • With bordering regions

32
Conclusions
  • New approach to approximate the footprint of an
    unknown region
  • Also suitable for vague regions
  • Search for constraints on the web to improve
    precision
  • Search for bordering regions on the web to
    improve recall
  • Experimental results confirm this hypothesis

Thank you for your attention!