Integration of complementary archaeological sources - PowerPoint PPT Presentation

About This Presentation
Title:

Integration of complementary archaeological sources

Description:

ICS-FORTH, Heraklion, Crete, Greece. Kurt Schaller. Magistrat der Stadt Wien, Geschäftsgruppe Kultur und Wissenschaft, Stadtarchäologie, Wien, Austria. 2 ...

Slides: 19
Provided by: stav

Transcript and Presenter's Notes

Title: Integration of complementary archaeological sources


1
Integration of complementary archaeological
sources
  • Martin Doerr
  • Maria Theodoridou
  • ICS-FORTH, Heraklion, Crete, Greece
  • Kurt Schaller
  • Magistrat der Stadt Wien, Geschäftsgruppe Kultur
    und Wissenschaft, Stadtarchäologie, Wien,
    Austria

2
Outline
  • Problem statement, working context
  • Objective
  • Approach
  • Technical description
  • Results
  • Conclusion, future work

3
Project VBI ERAT LVPA: The Internet Tracks of the
Roman She-Wolf
  • Traditional corpora
  • very high quality, difficult to maintain,
    difficult to search, uncorrelated to
    complementary resources
  • New Database Projects
  • varying quality, overlapping contents,
    continuously updated, easy to search,
    uncorrelated between each other.
  • Altogether
  • A conglomerate of highly interrelated
    archaeological sources
  • of overwhelming detail and volume
  • Ubi-erat-lupa: A European Culture 2000 Project
  • An aggregation of complementary scientific
    databases and corpora describing finds with
    inscriptions and iconography of the Roman era
  • to create a body of unique archaeological
    knowledge in digital form.

4
VBI ERAT LVPA: Objective
  • creation of a global index about a set of
    semi-autonomous sources for global access to the
    unified knowledge
  • integration of complementary information under a
    common ontology/schema and identification of
    common elements in different sources
  • development of an integration algorithm that
    converges to the best state of knowledge under
    continuous update
  • creation of a research tool for formulating
    queries of archaeological content to detect
    contextual relationships that cannot be derived
    from interpreting the sources in isolation

5
Approach
  • Develop a semantic network based on the CIDOC
    CRM model to integrate the complementary
    archaeological sources
  • Data, relevant to global querying over all
    contents, are extracted, transformed and stored
    in an RDF repository, that is incrementally
    updated over time.
  • Integration in two phases
  • source schema is intellectually interpreted in
    terms of the CIDOC model
  • non-canonical data are reported to the respective
    source
  • mistakes in the sources are removed, improving the
    quality of the source
  • actual data automatically transformed and stored
    into an RDF repository
  • an a posteriori data cleaning process removes as
    many duplicates as can be (semi-) automatically
    detected
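
The automatic second phase above might be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the source field names (`title`, `object_type`, `museum`), the `FIELD_TO_PROPERTY` table and both helper functions are hypothetical, the record values are invented placeholders, and a plain Python set of tuples stands in for the RDF repository. Only the property labels (written here with underscores) and the `P.O lupa.4501` identifier style come from the slides.

```python
# Hypothetical mapping from source field names to CIDOC CRM property labels.
# The field names are assumptions; the labels follow the mapping slides.
FIELD_TO_PROPERTY = {
    "title": "P102F.has_title",
    "object_type": "P2F.has_type",
    "museum": "P55F.has_current_location",
}


def transform(source, record):
    """Turn one source record into (subject, property, value) triples.

    Returns the triples plus the names of fields that could not be mapped,
    so they can be reported back to the source instead of being dropped.
    """
    subject = f"P.O {source}.{record['id']}"  # local id, unique within its source
    triples, unmapped = [], []
    for field, value in record.items():
        if field == "id":
            continue
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None or value in (None, ""):
            unmapped.append(field)
        else:
            triples.append((subject, prop, value))
    return triples, unmapped


def update_repository(repository, triples):
    """Incremental update: add new triples, skip ones already stored."""
    before = len(repository)
    repository.update(triples)
    return len(repository) - before


repository = set()  # stand-in for the RDF repository
# Invented placeholder record; the values are not real Lupa data.
record = {"id": 4501, "title": "example title", "remark": "??"}
triples, unmapped = transform("lupa", record)
added = update_repository(repository, triples)
added_again = update_repository(repository, triples)  # re-running adds nothing
```

Returning the unmapped fields separately mirrors the feedback loop on the slide, where non-canonical data go back to the source rather than into the global index.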

6
The CIDOC CRM: Top-level Entities relevant for
Integration
  • Entities shown in the diagram: E55 Types, E39 Actor,
    E41 Appellations, E31 Document, E5 Event
  • Link labels: refer to / identify, affect or / refer to
7
The CIDOC CRM: VBI-ERAT-LVPA Repository Indexing
8
Complementary archaeological sources
  • Stone data bases
  • Lupa - 7000 archaeological records, City of
    Vienna, Austria
  • Arachne - 40.000 archaeological records, Antike
    Plastik, Cologne
  • Name data bases
  • ONOMASTICON PROVINCIARVM EVROPAE LATINARVM (OPEL)
  • Information about the amount and distribution of
    Roman names in the European provinces of the
    empire, City of Vienna, Austria
  • Epigraphic corpora
  • CIL: Corpus Inscriptionum Latinarum
  • AE: L'Année Epigraphique
  • Inscriptions Clauss/Slaby, University of
    Frankfurt
  • Thesauri / Dictionaries
  • TGN: Getty Thesaurus of Geographic Names
  • Alexandria DL Gazetteer: 5.000.000 current place
    names (web service)
  • Barrington Atlas of the Greek and Roman World:
    the Map-by-Map Directory provides information about
    every place or feature in the Atlas

9
Mapping stone data bases to CIDOC-CRM
  • Property labels shown in the diagram:
    P102F.has title, P1F.is identified by, P2F.has type,
    P106B.forms part of, P70B.is documented in,
    P12B.was present at, P7F.took place at,
    P55F.has current location, P89F.falls within
  • Node label shown: POLUPA.5
10
Mapping stone data bases to CIDOC-CRM
  • Property labels shown in the diagram:
    P65F.shows visual item, P150F.shows characters,
    P151F.has transcription, P152F.has clear text,
    P1F.is identified by, P70B.is documented in,
    P106B.forms part of
  • Node label shown: POLUPA.5
11
Mapping epigraphic corpora to CIDOC-CRM
  • Property labels shown in the diagram:
    P150F.shows characters, P151F.has transcription,
    P1F.is identified by, P70B.is documented in,
    P106B.forms part of
12
Mapping OPEL to CIDOC-CRM
  • Property labels shown in the diagram:
    P67B.is referred to by, P70B.is documented in,
    P65B.is shown by, P139F.has alternative form,
    P2F.has type, P106B.forms part of,
    P12B.was present at, P7F.took place at
13
Integration Into One Resource
POLUPA.5
Stone data bases
Name data bases
Epigraphic corpora
Thesauri/Dictionaries
14
Identity Problem
  • Two approaches
  • a) avoid taking two different items for the same
    => use local id, where uniqueness is guaranteed
  • b) try to find global names with a high chance
    to match.
  • Lupa solution is a)
  • We give a serial number to any new object we
    insert
  • We use the serial number of the source database.
  • Example: P.O arachne.45305
  • or P.O lupa.4501
  • We maintain local ids in the global index as valid
    names and remove detected duplicates
    continuously.
  • Cost-benefit optimization of over- and
    under-identification!
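
One way to keep every local id valid while duplicates are removed over time is a union-find (disjoint-set) structure. The sketch below is an assumption about how approach a) could be realised, not the Lupa implementation: the `GlobalIndex` class and its method names are hypothetical, and merging the two example ids is done purely for illustration, since the slide does not say that they denote the same object.

```python
class GlobalIndex:
    """Keeps every local id as a valid name and merges detected duplicates."""

    def __init__(self):
        self._parent = {}  # name -> parent name; a root points to itself

    def add(self, source, serial):
        """Register a local id, using the serial number of the source database."""
        name = f"P.O {source}.{serial}"
        self._parent.setdefault(name, name)
        return name

    def resolve(self, name):
        """Return the representative name of the group `name` belongs to."""
        root = name
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited name directly at the root.
        while self._parent[name] != root:
            self._parent[name], name = root, self._parent[name]
        return root

    def merge(self, a, b):
        """Record that `a` and `b` were detected to be the same item."""
        root_a, root_b = self.resolve(a), self.resolve(b)
        if root_a != root_b:
            self._parent[root_b] = root_a
        return root_a


index = GlobalIndex()
a = index.add("arachne", 45305)
b = index.add("lupa", 4501)
separate_before = index.resolve(a) != index.resolve(b)
index.merge(a, b)  # illustration only: pretend a duplicate was detected
same_after = index.resolve(a) == index.resolve(b)
```

Starting with every local id in its own group errs on the side of under-identification; each detected duplicate then costs a single `merge` call, and no name ever has to be deleted.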

15
Reactive Data Cleaning: Initial Data
  • Link labels shown in the diagram: has title (2),
    has type, is identified by (2), shows visual item (2)
  • Node label shown: POARACHNE.80581
16
Reactive Data Cleaning: Result
  • Link labels shown in the diagram: has title (2),
    has type, is identified by (2), shows visual item (1)
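
The step from the initial data to the result can be read as collapsing nodes that turn out to denote the same thing, so that repeated links disappear. The sketch below illustrates that reading only; the diagrams themselves are not recoverable from the transcript. The node names `item:1`, `item:2` and `name:A`, the duplicate test (same value for "is identified by") and both functions are assumptions made for the example.

```python
def find_duplicates(triples, prop="is identified by"):
    """Return (keep, drop) pairs of subjects sharing the same value for `prop`."""
    by_value = {}
    for subject, predicate, value in triples:
        if predicate == prop:
            by_value.setdefault(value, []).append(subject)
    pairs = []
    for subjects in by_value.values():
        subjects = sorted(set(subjects))
        pairs.extend((subjects[0], other) for other in subjects[1:])
    return sorted(pairs)


def merge_nodes(triples, keep, drop):
    """Rewrite every occurrence of `drop` to `keep`; the set removes repeats."""
    def swap(node):
        return keep if node == drop else node
    return {(swap(s), p, swap(o)) for s, p, o in triples}


# Hypothetical initial data: one object shows two visual items that
# carry the same identifier.
initial = {
    ("POARACHNE.80581", "shows visual item", "item:1"),
    ("POARACHNE.80581", "shows visual item", "item:2"),
    ("item:1", "is identified by", "name:A"),
    ("item:2", "is identified by", "name:A"),
}

pairs = find_duplicates(initial)
result = initial
for keep, drop in pairs:
    result = merge_nodes(result, keep, drop)
```

In practice the duplicate test would be stricter and partly manual, matching the "(semi-) automatically detected" wording of the approach; matching on a single shared identifier is only the simplest case.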
17
VBI ERAT LVPA: Results
  • A method and architecture for integration of
    diverse archaeological corpora on the Roman stone
    monuments under the CIDOC CRM model.
  • We developed an efficient way for place name
    recognition
  • We are developing a research tool suitable for
    formulating queries and drawing conclusions on
    archaeological data
  • detection of contextual relationships that cannot
    be derived from interpreting the sources in
    isolation
  • a method of identifying epigraphic references and
    finds
  • test bed for the CIDOC CRM model - proved its
    adequacy
  • First large scale integration project of multiple
    complementary resources as a global index to the
    original sources

18
Future work
  • integrate more data sources
  • support a mechanism to visualize a source
  • support an automatic mapping process so that
    archaeologists will be able to maintain the
    system by themselves.