Geographic reference analysis for geographic document querying

1
Geographic reference analysis for geographic
document querying
  • F. Bilhaut, T. Charnois, P. Enjalbert, Y. Mathet
  • {bilhaut, charnois, enjalbert, mathet}@info.unicaen.fr
  • GREYC, CNRS UMR 6072
  • University of Caen

2
The "GéoSem" project
  • Passage extraction from geographical documents
  • From a query to a ranked set of passages
  • Queries are concerned with:
      - time
      - phenomenon
      - space

3
Excerpt from "Hérin" corpus
  • From 1965 to 1985, the number of high-school
    students increased by 70%, but at different
    rhythms and intensities depending on the academies
    and departments. Lower in the South-West and the
    Massif Central, moderate in Brittany and Paris, the
    rise was considerable in the Mid-West and Alsace.
    There was also an increase in the duration of
    schooling, which was more pronounced in departments
    where, in the mid-1960s, continuing studies after
    primary school was far from being systematic.

4
Excerpt from "Hérin" corpus
  • (Same excerpt as on slide 3.)

Time
5
Excerpt from "Hérin" corpus
  • (Same excerpt as on slide 3.)

Time
Phenomenon
6
Excerpt from "Hérin" corpus
  • (Same excerpt as on slide 3.)

Time
Phenomenon
Space
7
Queries
  • Which passages address educational difficulties
    in the west of France in the 1950s?
  • Which passages address variations in the number
    of pupils in rural areas?
  • Which passages address the Calvados district?

8
Queries
  • Which passages address educational difficulties
    in the west of France in the 1950s?
  • Which passages address variations in the number
    of pupils in the Paris area?
  • Which passages address the Calvados district?

9
Some Significant Spatial Expressions
  • Paris
  • in the north of France
  • south of the Loire
  • Some seaboard towns
  • The quarter of / Fifteen / All / The districts
    in the north of France
  • Some seaboard towns of Normandy
  • The most rural districts situated south of the
    Loire

10
The type "zone": a georeferenced area anchored in
a named place
  • Paris
  • in the north of France
  • Normandy
  • From Normandy to Alsace
  • south of the Loire

11
The LocGeo type
  • The canonical form:
      quantification + type + zone
  • Quant
  • Type: qualification, administrative
  • Zone: position, named geo. entity
  • Examples (Quant / Type / Zone):
      - The quarter of, Fifteen, All / districts / in the north of France
      - Some / seaboard towns / of Normandy
      - The / most rural districts / situated south of the Loire
      - Some / seaboard towns / (no zone)

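The canonical form on this slide can be illustrated with a minimal Python sketch. It is not the authors' DCG: it splits an expression into quant, type and zone by plain string matching. The names `QUANTIFIERS`, `ZONE_MARKERS` and `split_locgeo`, and the tiny word lists, are hypothetical and only cover the examples shown here.

```python
# Hypothetical word lists; the real system relies on a lexical base and a gazetteer.
QUANTIFIERS = ("the quarter of", "fifteen", "all", "some", "the")
ZONE_MARKERS = ("in", "of", "situated", "from")


def split_locgeo(expression):
    """Split a spatial expression into its quant, type and zone parts."""
    words = expression.lower().split()
    text = " ".join(words)
    quant = None
    # Try longer quantifiers first so "the quarter of" wins over "the".
    for candidate in sorted(QUANTIFIERS, key=len, reverse=True):
        if text == candidate or text.startswith(candidate + " "):
            quant = candidate
            words = text[len(candidate):].split()
            break
    # The zone is assumed to start at the first zone-introducing word.
    zone_at = next((i for i, w in enumerate(words) if w in ZONE_MARKERS), None)
    if zone_at is None:
        return {"quant": quant, "type": " ".join(words), "zone": None}
    return {
        "quant": quant,
        "type": " ".join(words[:zone_at]),
        "zone": " ".join(words[zone_at:]),
    }


if __name__ == "__main__":
    print(split_locgeo("Fifteen districts in the north of France"))
    print(split_locgeo("Some seaboard towns"))
```

A grammar-based parser, as used in the project, would also build the semantic structure during parsing; this sketch only shows the three-way decomposition.
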
12
The LocGeo type
  • quant + type + zone (same decomposition and
    examples as on slide 11)

13
Semantic Representation
Paris
zone:
    egn:
        ty_zone: town
        nom: Paris
    loc: internal
    coord:
        Lat: 45.633333
        Long: 5.733333

14
Semantic Representation
Some seaboard towns in the north of Normandy
locgeo:
    quant:
        type: relative
    type:
        ty_zone: town
        geo: seaboard
    zone:
        egn:
            ty_zone: region
            nom: Normandy
        loc: internal
        position: north

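The feature structure on this slide can be held in nested dictionaries. The following Python sketch is one possible encoding, not the project's internal format; the helper names `make_zone`, `make_locgeo` and `get_path` are hypothetical, while the feature names (quant, type, zone, egn, ty_zone, nom, loc, position, geo) come from the slide.

```python
def make_zone(ty_zone, nom, loc="internal", position=None):
    """Build the 'zone' part: a named geographic entity plus its localisation."""
    zone = {"egn": {"ty_zone": ty_zone, "nom": nom}, "loc": loc}
    if position is not None:
        zone["position"] = position
    return zone


def make_locgeo(quant_type, ty_zone, geo, zone):
    """Build a full locgeo structure from its quant, type and zone parts."""
    return {
        "locgeo": {
            "quant": {"type": quant_type},
            "type": {"ty_zone": ty_zone, "geo": geo},
            "zone": zone,
        }
    }


def get_path(structure, path):
    """Follow a dotted feature path such as 'locgeo.zone.egn.nom'."""
    node = structure
    for key in path.split("."):
        node = node[key]
    return node


# "Some seaboard towns in the north of Normandy"
EXAMPLE = make_locgeo(
    quant_type="relative",
    ty_zone="town",
    geo="seaboard",
    zone=make_zone("region", "Normandy", position="north"),
)

if __name__ == "__main__":
    print(get_path(EXAMPLE, "locgeo.zone.egn.nom"))
```
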
15
Implementation and (first) Results
  • Tokenisation and morphological analysis
  • A DCG that performs syntactic and semantic
    analysis together:
      - the grammar contains 160 rules
      - an internal lexical base of 200 entries
      - a gazetteer of 100,000 named places (France)
  • 900 expressions recognised and analysed from a
    geographical corpus (200 pages of text)
  • Good results, but a precise quantitative
    evaluation remains to be done

16
Semantic matching: Why?
  • A query: "Which passages address Paris?"
  • Corpora:
      - Text A: "the south of a Bordeaux-Genève line"
      - Text A: "the northern half of France" (3)
      - Text B: "In Paris and Toulouse" (1)
      - Text B: "In Ile de France region" (2)

17
Semantic matching: How?
  • Spatial compatibility
      - Is the zone denoted by the passage spatially
        compatible with the one of the query? (Is there
        at least an intersection?)
  • Relevance degree
      - If this zone is compatible, how relevant is
        it w.r.t. the query?
      - probability
      - granularity
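The intersection test behind spatial compatibility can be sketched with axis-aligned bounding boxes. This is a simplifying assumption: the slides do not say how zones are represented geometrically, and real zones are not rectangles. The `Box` type, the `intersects` function and all coordinates below are hypothetical, rough values chosen for illustration only.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in decimal degrees."""
    south: float
    west: float
    north: float
    east: float


def intersects(a, b):
    """Return True when the two boxes share at least one point."""
    return (
        a.south <= b.north
        and b.south <= a.north
        and a.west <= b.east
        and b.west <= a.east
    )


# Rough, illustrative boxes (not taken from the slides).
PARIS = Box(48.8, 2.2, 48.9, 2.5)
NORTHERN_HALF = Box(46.5, -5.2, 51.1, 8.3)
SOUTHERN_STRIP = Box(42.3, -1.8, 44.5, 7.7)  # stand-in for a southern zone

if __name__ == "__main__":
    print(intersects(PARIS, NORTHERN_HALF))
    print(intersects(PARIS, SOUTHERN_STRIP))
```
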

18
Compatibility computation
  • Q1) Which passages address Paris?
  • P1) the capital city: YES (gazetteer)
  • P2) big cities in France: YES (gazetteer +
    computation)
  • P3) the northern half of France: YES (GIS
    computation)
  • P4) south of a Bordeaux-Genève line: NO (GIS
    computation)

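The gazetteer-based cases (P1 and P2) can be sketched as lookups over place attributes. The toy gazetteer below is hypothetical: its entries, the rounded population figures and the `BIG_CITY_THRESHOLD` cut-off are illustrative assumptions, not data from the project's gazetteer.

```python
# Toy gazetteer: rough, illustrative values, not taken from the slides.
GAZETTEER = {
    "Paris": {"country": "France", "capital": True, "population": 2_100_000},
    "Toulouse": {"country": "France", "capital": False, "population": 500_000},
    "Caen": {"country": "France", "capital": False, "population": 100_000},
}
BIG_CITY_THRESHOLD = 200_000  # hypothetical cut-off for "big cities"


def capital_city(country):
    """P1: resolve 'the capital city' by a direct gazetteer lookup."""
    return {
        name
        for name, entry in GAZETTEER.items()
        if entry["country"] == country and entry["capital"]
    }


def big_cities(country):
    """P2: resolve 'big cities' by a lookup plus a computation on population."""
    return {
        name
        for name, entry in GAZETTEER.items()
        if entry["country"] == country
        and entry["population"] >= BIG_CITY_THRESHOLD
    }


def is_compatible(query_place, passage_places):
    """A passage is compatible when its set of places contains the query place."""
    return query_place in passage_places


if __name__ == "__main__":
    print(is_compatible("Paris", capital_city("France")))
    print(is_compatible("Paris", big_cities("France")))
```
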
19
"the northern half of France"
20
"the south of a Bordeaux-Genève line"
21
Relevance degree (1): Quantification
  • Query: "Calvados" (French district)
  • P1 "The quarter of districts in the north of France":
    r = 25%, rank 3
  • P2 "All districts in the north of France":
    r = 100%, rank 1
  • P3 "Some districts in the north of France":
    r = i/n = 5/52 = 9.6%, rank 4
  • P4 "Fifteen districts in the north of France":
    r = i/n = 15/52 = 29%, rank 2

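The arithmetic on this slide can be reproduced with a short Python sketch. It reads the slide's figures as r = i/n with n = 52 districts, and "some" as about 5 districts; the function names and the way quantifiers are encoded are hypothetical.

```python
def quantifier_share(quantifier, n, some_count=5):
    """Return the share of the n entities selected by the quantifier."""
    fixed = {"all": 1.0, "the quarter of": 0.25}
    if quantifier in fixed:
        return fixed[quantifier]
    if quantifier == "some":
        # "some" is read as a small count, 5 on the slide.
        return some_count / n
    if isinstance(quantifier, int):
        # A cardinal such as "fifteen" gives i / n.
        return quantifier / n
    raise ValueError(f"unknown quantifier: {quantifier!r}")


def rank_passages(passages, n):
    """Rank passages from most to least relevant; rank 1 is the best."""
    scored = sorted(
        passages.items(),
        key=lambda item: quantifier_share(item[1], n),
        reverse=True,
    )
    return {name: position for position, (name, _) in enumerate(scored, start=1)}


PASSAGES = {"P1": "the quarter of", "P2": "all", "P3": "some", "P4": 15}
RANKS = rank_passages(PASSAGES, n=52)

if __name__ == "__main__":
    for name in sorted(PASSAGES):
        share = quantifier_share(PASSAGES[name], 52)
        print(name, f"{share:.1%}", "rank", RANKS[name])
```

The resulting order (P2, P4, P1, P3) matches the ranks given on the slide.
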
22
Relevance degree (2): Granularity
  • Levels: country, region, district, city, "zone"
  • Examples:
      - "the northern half of France" (zone)
      - "Basse Normandie" (region)
      - "Calvados" (district)
      - "Caen" (city)

23
locgeo(locgeo(det:Det..type:Type..Zone)) -->
    prep, det(Det), type(Type), zone(Zone).
det(Sem) --> [X], {lexique(X,XR,det,Sem)}.
type(X) --> typeQualif(X).
type(ty_zone:N) --> nomtype(N).
typeQualif(ty_zone:N..Q) -->
    option, nomtype(N), prep, qualif(Q).
nomtype(Sem) --> [X], {lexique(X,XR,nom,Sem)}.
zone(X) --> egn(X).
egn(egn(ty_zone:T..nom:Y..coord:C)) -->
    ls_lexiconExtDCG(np,
        type_sem:egn..type_zone:T..nom:Y..coord:C).
egn(egn(ty_zone:T..nom:Y)) -->
    [X], {lexique(X,XR,np,
        type_sem:egn..type_zone:T..nom:Y)}.
24
lexique(quelque,quelque,det,
    type_sem:relatif..type:relatif_qualifie..nb:'qualitatiffaible').
lexique(tout,tout,le,det,type_sem:exhaustif).
lexique(région,région,nom,
    type_sem:zone(administrative)..nom_zone:région).
lexique(ville,ville,nom,
    type_sem:zone(administrative)..nom_zone:ville).
lexique('Bretagne','Bretagne',np,
    type_sem:egn..type_zone:région..nom:'Bretagne').