1
From DOBES to CLARIN and beyond
Axel Horstmann, VolkswagenFoundation
Peter Wittenburg, MPI for Psycholinguistics
Erhard Hinrichs, University of Tübingen
2
FACTS AND FIGURES
- Non-profit foundation established under private law, based in Hanover
- Not affiliated with the car manufacturer of the same name
- Founded by the Governments of the Federal Republic of Germany and the State of Lower Saxony in 1961
- Objective: to support science and technology as well as the humanities and the social sciences in research and university teaching
- Assets: about 2.45 billion euros
- Funding p.a.: about 110 million euros
- One of the most potent private research funding foundations in Europe
3
FOCUS ON HUMANITIES AND SOCIAL SCIENCES
- Current funding initiatives (see KURZINFORMATION / BASIC INFORMATION)
- About 45 to 50 percent of the funds given to HSC
- Initiatives focussing on infrastructural support of HSC
  - Kulturwissenschaftliche Dokumentation (closed)
  - Archive als Fundus der Forschung (closed)
  - DOBES Dokumentation bedrohter Sprachen
- Projects including infrastructural support of HSC
  - Strategy building on digitization of endangered books
  - Digitization of the so-called Aschebücher of the HAAB Weimar (in preparation)
4
"E-HUMANITIES": POSSIBILITIES AND PERSPECTIVES
- Strong interest in innovative approaches
- Funds available for projects involving activities towards "E-Humanities" (e.g. digitization of data, collections, archival material) within current funding initiatives
- Funding possibilities for meetings, workshops, conferences etc. focussing on "E-Humanities" (within the funding initiative Symposia and Summer Schools)
- New perspectives on "E-Humanities" (possibly) opened up within a new funding initiative aiming at Research in Museums (currently in planning), including to a certain extent digitization activities
- And not to forget the flagship "DOBES" ...
5
Concrete steps or Babylonian Tower
- we don't know exactly what eHumanities means
- we feel that mechanisms in research processes are changing rapidly, with technological innovation as the motor
- but we can't say "we are now going to design eHumanities"
- we probably can say "let's plan further concrete projects and actions and see"
- many excellent projects around; let me just refer to the good sides of DOBES as one of these steps (Documentation of Endangered Languages, funded by VolkswagenFoundation)
6
What is DOBES?
44 DOBES teams working fully distributed and self-organized, incl. linguists, anthropologists, musicologists, ethno-biologists, etc.
In addition, VWF installed a central archive.
Start in 2000.
7
What changed in DOBES?
- handing over all data after a limited time to an archive was completely new and is an explicit step, although the results will not be ready
- there is a push to make data accessible to others from the beginning; also new for many and not without conflicts
- asking researchers to categorize and organize material according to agreed metadata was also new and still requires evangelization
- including multimedia in the documentation and dealing with audio/video as basis was kind of new and requires techno-knowledge
8
Which infrastructure by DOBES?
- a stable, reliable and open repository/archiving system handling 30 TB
- data storage not encapsulated and in open formats
- introduction of persistent identifiers to secure the investments made in relating fragments
- a network of 12 centers worldwide included in data distribution
- of these, 6 copies in centers with hardware migration strategy
- a number of web-based applications offering various ways to access the data
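The role of persistent identifiers named on this slide can be illustrated with a minimal sketch. It assumes a hypothetical in-memory registry (`PidRegistry`) with invented identifiers and URLs; the slides do not say which PID system DOBES actually uses. The point is only that references between fragments use the stable identifier, so they survive a storage migration.

```python
from dataclasses import dataclass, field


@dataclass
class PidRegistry:
    """Hypothetical in-memory registry: stable identifier -> current locations."""
    _locations: dict = field(default_factory=dict)

    def register(self, pid: str, url: str) -> None:
        # One PID may point to several copies, e.g. held by different centers.
        self._locations.setdefault(pid, []).append(url)

    def migrate(self, pid: str, old_url: str, new_url: str) -> None:
        # A hardware or storage migration changes the URL, never the PID.
        urls = self._locations[pid]
        urls[urls.index(old_url)] = new_url

    def resolve(self, pid: str) -> list:
        # Return a copy so callers cannot modify the registry by accident.
        return list(self._locations[pid])


# Invented example data (not real DOBES identifiers or hosts).
registry = PidRegistry()
registry.register("pid:example/0001", "https://archive-a.example.org/session1.wav")
registry.register("pid:example/0001", "https://archive-b.example.org/copy/session1.wav")
registry.migrate(
    "pid:example/0001",
    "https://archive-a.example.org/session1.wav",
    "https://archive-a.example.org/v2/session1.wav",
)
print(registry.resolve("pid:example/0001"))
```

An annotation that cites `pid:example/0001` stays valid after the migration, because only the registry entry changes.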
9
CLARIN/D-SPIN Challenges
- "eResearch is about global collaboration in key areas of science and the next generation of infrastructure that will enable it" (J. Taylor)
- goal is an open research infrastructure to overcome the huge fragmentation of language resources and tools and to offer them to research communities, in particular to the humanities
- help tackling the LARGE challenges (multilingual societies)
- but also helping the individual researcher
- example: align a transcription and an audio signal; how many researchers know how to do this?
- see CLARIN/D-SPIN as a huge virtual marketplace of resources and tools that can be combined due to integration and interoperability solutions
- not to forget Henry Thompson (one of the XML fathers): don't have an agreed descriptive system in our domain
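The transcription/audio example on this slide can be made concrete with a small sketch of what an alignment looks like once it exists. It does not perform the alignment itself; the segment times and texts are invented, and `segment_at` is a hypothetical helper that looks up which transcribed segment belongs to a given point in the audio.

```python
from bisect import bisect_right

# Hypothetical time-aligned transcription: (start_s, end_s, text),
# sorted by start time and non-overlapping.
segments = [
    (0.0, 1.8, "first utterance"),
    (2.1, 4.0, "second utterance"),
    (4.0, 6.5, "third utterance"),
]
starts = [start for start, _end, _text in segments]


def segment_at(time_s):
    """Return the text aligned with an audio time, or None during silence."""
    # Find the last segment that starts at or before time_s.
    i = bisect_right(starts, time_s) - 1
    if i >= 0:
        start, end, text = segments[i]
        if start <= time_s < end:
            return text
    return None


print(segment_at(2.5))  # inside the second segment
print(segment_at(1.9))  # gap between the first and second segment
```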
10
CLARIN/D-SPIN Research Infrastructure
- basis of a big supermarket are classification and convincing organization principles
- based on 10 years of experience we know that only a flexible component model will be accepted
- seem to go towards a Federation of LRT producers that can make contracts with Identity Federations
- just one signature necessary to get all researchers with their home identity integrated
- have already set up a first small test federation (EC-DAM-LR)
- researchers' dream: virtual collection building and creating workflows flexibly; not trivial due to import/export aspects
- LREC showed that we already know a lot about the problem
11
CLARIN/D-SPIN Network of Service Centers
- need a network of strong and persistent centers of a "new" type
- researchers will only adapt if they can rely on new mechanisms
- need to simplify the IPR/license situation
12
towards eHumanities
- CLARIN has more than 100 members from 32 countries
- in Germany 9 well-known centers, and some more will join
- it is an enormous challenge to make a real step ahead in CLARIN
- can we all together extend to an eHumanities infrastructure, or are we already close to collapse?
13
a few questions I
- will there be a separate infrastructure for each humanities discipline? - NO
- there will be several shared services such as a PID registration and resolution service
- however
- building a joint infrastructure has to do with community building, trust, common language etc
- too big communities would not work
- so let's move on in TextGrid, DARIAH, CLARIN etc
- but let's have a close and fair contact to find synergies
- competition will become heavy and our competitors are the Googles of the world!
14
a few questions II
- will there be a single market place for the humanities? - NO
- acceptance of a market place is dependent on classification and organization principles, as already said
- these are different in all disciplines
- so have to start from the disciplines in our solutions - already difficult enough
- leave it to Semantic Web guys to enable cross-walk
15
a few questions III
- who will be the main players?
- of course the big libraries, archives and museums
- but what about the universities and big organizations such as MPG? - important
- we see new requirement profiles emerging
- kind of job sharing can be predicted
- of course close collaboration with innovative libraries such as SUB etc is required
highly specialized groups: highly specialized MPI departments
content centers: a number of domain MPIs
curation centers: MPDL, few domain MPIs
computer centers: RZG, GWDG
16
a few questions IV
- key bricks for interoperability?
- we need open registries of all sorts and smart registry frameworks
  - schema registries
  - concept registries (ISOcat - a creation of ISO TC37/SC4)
  - relation registries
  - etc
- however
- a very complex landscape seems to emerge
- how to make it usable by laymen?
- how to convince researchers to work with them?
- no one knows yet - we need to try out - what else?
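How a concept registry could support interoperability between differently named metadata fields can be sketched roughly as follows. Everything here is an assumption made for illustration: the schema names, field names and concept identifiers are invented and are not taken from ISOcat or any CLARIN registry.

```python
# Hypothetical concept registry: (schema, local field name) -> shared concept ID.
CONCEPT_REGISTRY = {
    ("schema_a", "lang"): "concept:language-name",
    ("schema_b", "Language"): "concept:language-name",
    ("schema_a", "speaker"): "concept:participant",
    ("schema_b", "Actor"): "concept:participant",
}


def crosswalk(record, source, target):
    """Translate a metadata record from one schema to another via shared concepts."""
    # Invert the registry for the target schema: concept ID -> target field name.
    target_fields = {
        concept: name
        for (schema, name), concept in CONCEPT_REGISTRY.items()
        if schema == target
    }
    result = {}
    for name, value in record.items():
        concept = CONCEPT_REGISTRY.get((source, name))
        # Fields without a registered concept cannot be mapped and are dropped.
        if concept in target_fields:
            result[target_fields[concept]] = value
    return result


record_a = {"lang": "example language", "speaker": "participant 1", "notes": "unmapped"}
print(crosswalk(record_a, "schema_a", "schema_b"))
```

The unmapped `notes` field shows the cost of missing registry entries: whatever has no agreed concept is lost in the crosswalk.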
17
Summary
- we need initiatives again and again to stepwise advance the borders
- it is now also time to transform existing knowledge into persistent infrastructures
- will need a lot of sensitivity and patience - RI building takes time
- emerging landscapes will have an underlying complexity
- need to offer discipline vocabulary
- need to hide complexity to a certain extent
- need to offer persistency
- Project solutions are not per se useful as infrastructure solutions!
18
End
in Germany we already have a good mixture with TextGrid, DOBES, eAqua, DARIAH and CLARIN/D-SPIN
have to get together frequently
Thanks for the attention.