Title: Workshop Goals DELAMAN and DAM-LR
1. Workshop Goals: DELAMAN and DAM-LR
Peter Wittenburg, MPI for Psycholinguistics
Access Management, Nijmegen, November 2004
2. When did we start?
- it is just 5 years since we started in our discipline to speak about
  - large digital online collections
  - standardizing the formats
    - XML was new and users were very skeptical
    - MPEG was, and still is, not well understood
  - open metadata to arrive at browsable and searchable domains
  - using metadata to create well-organized archives
  - interoperability
- LREC Athens 2000
  - first workshop on these issues
  - start of the ISLE project (linguistic concepts, lexicon, metadata, ...)
  - start of the IMDI work
- in 2000 also the first LDC workshop, with OLAC as focus
- a little later DOBES was granted and E-Meld started
- this is a very short time when you want to convince a community
3. What did we achieve?
- we have large online digital archives / collections / digital libraries
  - MPI: 40,000 session bundles / 10 TB
  - DOBES: 1,500 session bundles / 1,500 h
  - AILLA
  - PARADISEC
  - Lund corpora
  - also in the HLT domain
    - LDC
    - ELRA
    - BAS
  - also traditional archives (Phonogramm Archiv, NAA, ...)
  - etc.
- some of us became archivists by practice
- the idea of web visibility and online accessibility is spreading
- despite archiving attempts, according to D. Schüller 80% of the digitized material is endangered
4. What did we achieve?
- much evangelization and agreement about standards
  - DOBES workshops and documents
  - LDC workshops and documents
  - E-Meld workshops and excellent website
  - ISLE workshops, with IMDI as result
  - PARADISEC workshop, with DELAMAN as result
  - HRELP workshops
  - LREC workshops and contributions
  - ACL workshops and contributions
  - IASA/IAML conference
  - etc.
- everyone agrees on XML, Unicode and linear PCM
- everyone understands the relevance of schemas to make linguistic structure and encoding explicit
- with respect to JPEG and MPEG we are shooting at a moving target, but we don't yet have real alternatives
5. What did we achieve?
- created awareness of the need for metadata for visibility
- created operational metadata infrastructures within 4 years
  - structured IMDI for discovery and management
  - OLAC for overall discovery
  - gateways between the two domains
- however, the situation is still not satisfying
  - more than 50 institutions are using IMDI (as far as we know)
  - ?? institutions are providing OLAC records
  - still only a small fraction of the language resources is visible
- metadata (MD) creation is hard
  - it is work for others, although this is increasingly often wrong
  - it means cleaning up your own holdings and figuring out what is available
  - it means writing correct scripts and learning new software
  - it means being disciplined
- we have done our development job; we have to continue dissemination
- despite limitations we hope that people stick to what is out there
6. What did we achieve?
- interoperability, however, is still a dream
  - we have metadata gateways in our discipline (OLAC-IMDI)
  - increasingly often tools are producing correct XML, Unicode, ...
  - we have filters for character encodings and formats, although we miss well-designed and comprehensive services
  - we have started ontological work to tackle the linguistic aspects
    - GOLD ontology from E-Meld
    - ISO TC37/SC4 Data Category Registry
    - TDS (Dutch Typology Project) meta-language
    - EAGLES/ISLE/TEI specifications
- we are at the beginning
  - we cannot yet speak of fully operational infrastructures
  - but there are islands like FIELD, LEXUS, ONTO-ELAN, ...
7. Changing role of Language Archives
[Diagram labels: different groups of people contribute; The Archive; specialists maintain, unify, check quality, etc.; different groups of people use the content]
- at the MPI it is understood that the archive is the capital to build on
- in the DOBES programme the point is to make results explicit and accessible
- this only works if we don't have an inert, dusty archive
  - that is not an attractive perspective; hear more about this from D. Schüller
8. Vision for a single archive
[Diagram labels: The Archive; Web-based Archive Exploration; Annotation Exploration; Domain of Registered Primary and Secondary Resources; User; Domain of Descriptive Metadata; Primary Resources (Texts, Images, Sound, Movies); (Web-based) Archive Enrichment; Media Annotation]
9. Everything OK, so let's go home
- what about the following scenario?
[Diagram labels: Metadata (twice); data exchange for data survival reasons; archive A; archive B]
10. Everything OK, so let's go home
- what about the following scenario?
[Diagram labels: DOBES Archive; Raw Data; DOBES Trumai; Metadata; my personal Trumai archive; AILLA Archive; Raw Data; AILLA Trumai; not just copies but result of own creative process; Metadata]
11. DELAMAN
- Digital Endangered Languages and Music Archive Network
- loose network of archives sharing a set of visions, such as
  - want to exchange data automatically (list driven)
  - want to allow people to create integrated virtual working spaces
  - want to have an integrated access management domain
- first talks in Nijmegen and at HRELP workshops in 2003
- foundation at the PARADISEC meeting in Sydney in 2003
- no deep discussions yet about wishes in detail and implementation
  - therefore this workshop in Nijmegen
- it's about future usage scenarios with distributed archives
12. DELAMAN / DAM-LR Map
[Map labels: MPI, EMELD, ELAR, Lund, INL, ANLC, AILLA, AMPM, LACITO, AIATSIS, PARADISEC]
- DELAMAN is an international network
- DAM-LR
  - Distributed Access Management for Language Resources
  - 3-year EU project starting on 1 January 2005; yes, we have money to start
  - centered around the DELAMAN intentions
13. Workshop
- want to get a deeper understanding of what we want
  - need good requirements specifications
- want to get a deeper understanding of what others are doing
  - our ideas are not new; we share them with others
  - Digital Library initiatives (FEDORA, ...)
  - GRID initiative(s) (SRB, GTK, ...)
  - compute/function/data GRID
- therefore we invited
  - linguists knowing about potential and real user wishes
  - archivists knowing about maintaining large repositories
  - technologists knowing about current and future developments
  - some of us looked into the legal and ethical aspects
- at the end we should be ready to start
14. Programme, Day 1 (29.11.)

Setting the Framework
  9.00   W. Klein                        Welcome
  9.10   P. Wittenburg                   DELAMAN and Workshop Goals
  9.40   D. Schüller                     Audiovisual Archiving: Visions, Challenges, Strategies
  10.15  Discussion
  10.30  Coffee Break

Researcher Requirements (Kamp)
  11.00  T. Aristar/H. Dry               Linguist Wishes
  11.30  P. Austin/D. Nathan             Linguist Wishes
  12.00  G. Holton/H. Johnson            Legal and Ethical Aspects
  12.30  Lunch Break

Archivist Requirements (Strömquist)
  13.30  H. Johnson                      AILLA Setup and Implications
  14.00  L. Barwick                      PARADISEC Setup and Implications
  14.30  Wittenburg/Skiba/Trilsbeek      DOBES Setup and Implications
  15.00  Coffee Break

Summary and Discussion (Strömquist)
  15.30  Uneson/Broeder/Strömquist       Summary of Requirements
  16.00  Questions and Discussion
  17.00  W. Krull                        DOBES Program and the VW Foundation
  17.15  Soddemann/Neumair/Verharen/Wbg  Technology - Broad View
  17.30
  18.00  End
  20.00  Joint Dinner at Kwok Paw
15. Programme, Day 2 (30.11.)

Technology Components (Nathan)
  9.00   T. Soddemann (got the Billing Award)  Web Services
  9.40   D. Barry                              GRID Components
  10.10  B. Kerver                             Authentication and Authorization Systems
  11.00  Coffee Break
  11.30  L. Lannom                             Handle System
  12.00  R. Moore                              Storage Resource Broker
  12.45  Discussion
  13.00  Lunch Break

Mapping Requirements and Technology (Aristar/Broeder)
  14.00  Aristar/Dry/Johnson/Barwick/          Understanding Technology Linguists/Archivists
  14.15  Broeder/Nathan/Jacobson/Neumair/...   Choice and Integration Aspects
  14.30  Discussion
  15.00  Coffee Break

  15.30  Grand Summary and Open Discussion
  16.00  Wittenburg                            Summary
  16.30  Discussion
  17.00  End

Times are not too strict; it's a workshop.
16. Let's go
- The MPI team wishes us two interesting and highly interactive days in Nijmegen
  - Daan, Andreas: Technology
  - Paul, Roman: Archive
  - Peter: ??