Title: IS1305
1. IS1305 European Network of e-Lexicography (ENeL)
2. Objectives of WG 2
- (according to the application)
- set up guidelines and standards for turning paper dictionaries into a digital format
- development of common standards in the field of e-lexicography for retro-digitised paper dictionaries already online or planning to go online (objective 3 of the action)
3. Tasks of WG 2: Task 1
- establish an overview of existing retro-digitised dictionaries and an overview of dictionaries which should be retro-digitised (necessity to be digitised → ranking? → no, not necessary!)
- → necessary to give this overview: scheme of categories describing the dictionaries (to develop in close exchange with WG 1, WG 2, WG 3)
- result: database to browse (→ to coordinate with WG 1)
- → question: different categories as search parameters?
- time frame: year 1
4. Tasks of WG 2: Task 2
- develop a standard workflow for digitisation of dictionaries planning to go online, including parameters necessary for estimating costs
  - digitisation (full text, images, OCR)
  - encoding of retro-digitised dictionaries
  - development of GUI
  - standards of presentation and design
  - long-term preservation
- result: guidelines (have to be written in such a way that policy makers understand them)
- time frame: years 1-4
5. Tasks of WG 2: Task 3
- define standards for the encoding of information and the description of relevant information categories for paper dictionaries
- → main objective: guarantee interoperability, platform independence
- → task: collect standards used within the action (TEI, LMF, ISO → give this question to MC)
- → questions:
  - what markup languages to use?
  - do we need a minimal set of standards for both retro-digitised and new, born-digital dictionaries?
6. Tasks of WG 2: Task 3
- 3.1 (part of task 3): establish an overview of software for the conversion of physical layout information to logical information
- → question: how to mark up the dictionaries (i.e. automatically, semi-automatically; are there mark-up tools to be re-used)?
- result: best practices for the encoding of information, linked with dictionary database
- time frame: years 1 and 2
7. Tasks of WG 2: Task 4
- a) investigate relevant information categories to be added to the dictionary in order to make the dictionary content more readily accessible and interoperable
- b) develop concepts for linking retro-digitised dictionaries
- → questions:
  - which information do we need to interlink dictionaries (extra information?)? → describe the strategies
8. Tasks of WG 2: Task 4
- → questions:
  - integration of additional information to create new information (e.g. WordNet, wiki dictionary, FrameNet)?
- → question to address to the WGs: do you put additional information in your dictionary?
- result: best practices
- time frame: year 3
9. Tasks of WG 2: Task 5
- investigate the possible use of dictionary content for computational linguistic applications
- → task is already done, no further need → clear task list!
- time frame: year 4
10. Tasks of WG 2: Task 6
- identify future funding sources and develop collaborative funding applications considering the dictionary candidates to retro-digitise and the working plan for digitisation
- → information to have on a European level
- → develop awareness in governments of Europe!
- → questions:
  - national and international funds to go for financial support?
  - develop guidelines / best practices for writing funding applications?
- → responsibility of steering group!
- time frame: years 1-4
11. Tasks of WG 2: Task 6
- → task 6: responsibility of steering group!
- time frame: years 1-4
12. Tasks of WG 2
- in Leiden we tried to divide tasks, to find responsible(s) for the tasks, to form subgroups
- → not yet finished, especially for tasks 4 and 5
- (task 5 already done, no need to find responsible(s))
13. Participants
- 27 participants from 14 countries: Austria (1), Denmark (2), Finland (3), France (1), Germany (5), Hungary (1), Netherlands (2), Poland (2), Portugal (2), Romania (1), Serbia (2), Slovakia (1), Switzerland (3), United Kingdom (1)
- see file WG 2 Leiden 16-01-2014 minutes Annex1 participants.pdf
14. Dictionaries in WG 2
- see list in file WG 2 Leiden 16-01-2014 minutes Annex2 dictionaries.pdf
- not yet complete
- for now 25 dictionaries of different types
- most of them monolingual
- 10 (?) languages
- most of them diachronic / historical dictionaries, standard language dictionaries, some dialect dictionaries
15. Plans / ideas / work in progress
- bibliography of retro-digitised dictionaries available online (student using Citavi for organizing the bibliography)
- → structure of the bibliography: language dictionaries, specific dictionaries (e.g. A dictionary of food and nutrition)
- → structure of entries: author, year of publication, title, place of publication, publisher, URL
- (Adelung, Johann Christoph (1808): Grammatisch-kritisches Wörterbuch der hochdeutschen Mundart. Mit beständiger Vergleichung der übrigen Mundarten, besonders aber der oberdeutschen. Wien: Richter. Online: http://ds.ub.uni-bielefeld.de/viewer/image/1323497/1/LOG_0003/.)
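
The entry structure named above (author, year of publication, title, place of publication, publisher, URL) could be held in a small record type. The following is only a minimal sketch: the class name `BibEntry`, its field names, and the `format` method are assumptions made for illustration and do not describe how the Citavi bibliography is actually stored. The sample values are taken from the Adelung entry on this slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BibEntry:
    """One bibliography record; class and field names are illustrative."""
    author: str
    year: int
    title: str
    place: str
    publisher: str
    url: str

    def format(self) -> str:
        # Assemble the fields in the order used by the sample entry.
        return (f"{self.author} ({self.year}): {self.title} "
                f"{self.place}: {self.publisher}. Online: {self.url}")


# Values copied from the Adelung example in the slide.
adelung = BibEntry(
    author="Adelung, Johann Christoph",
    year=1808,
    title=("Grammatisch-kritisches Wörterbuch der hochdeutschen Mundart. "
           "Mit beständiger Vergleichung der übrigen Mundarten, "
           "besonders aber der oberdeutschen."),
    place="Wien",
    publisher="Richter",
    url="http://ds.ub.uni-bielefeld.de/viewer/image/1323497/1/LOG_0003/",
)

print(adelung.format())
```

Keeping the fields separate, rather than storing one formatted string, would make it easier to reuse the bibliography as a basis for a dictionary database later on.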
16. Plans / ideas / work in progress
- → work in progress (for now 22 pages in Word file)
- → questions:
  - re-use in the Action?
  - which information should be given in this bibliography of retro-digitised dictionaries (close connection to the scheme of categories describing the dictionaries?)
  - bibliography as basis for the database of retro-digitised dictionaries and part of the dictionary portal?
17. Plans / ideas / work in progress
- collection of dictionary typologies, trying to find a scheme of categories describing the dictionaries in the Action
- problem: so far only German typologies have been considered
- Storrer: classification of internet dictionaries
  - retro-digitised dictionaries
  - born-digital dictionaries
  - dictionaries with user participation
  - user-generated dictionaries
  - finished dictionaries
  - dictionaries under construction
18. Plans / ideas / work in progress
- Schlaefer
  - language(s) covered: monolingual, multilingual
  - vocabulary/lexicon described
  - user group addressed
  - methodological basis
  - lexicographical basis
- Hausmann
  - synchronic vs diachronic dictionary
  - historical vs contemporary dictionary
  - standard language vs dialect dictionary
19. Cooperation with other WGs
- cooperation with WG 1 concerning
  - the encoding of dictionaries
  - the linking of information between dictionaries
  - user interfaces
  - the overview of dictionaries
- cooperation with WG 3 in finding common approaches to linking contents of retro-digitised and innovative dictionaries
- cooperation with WG 1, WG 3, WG 4
  - in identifying funding sources and developing funding applications
20. Decisions which have to be made / questions
- Scientific aim
  - develop a scheme of categories describing the dictionaries (short standardized profile) → cooperation with WG 1, WG 3 and WG 4
  - → question: which information should be given about the dictionaries?
  - information about the dictionary itself (short and clear description!)
    - dictionary type
    - language covered (source language, description language, target language)
21. Decisions which have to be made / questions
- (→ 1. information about the dictionary itself)
  - year of publication (print and online)
  - number of entries
  - references, literature concerning the dictionary
- information about the technical process
  - encoding
  - XML schema and documentation
  - year of publication
22. Decisions which have to be made / questions
- → questions:
  - which kinds of dictionaries to include / exclude?
  - propose parameters / properties for all dictionaries which can function as search parameters in the dictionary portal (search for dictionaries)?
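
To make the discussion more concrete, the profile categories listed above (dictionary type, languages covered, years of publication, number of entries, references, encoding, XML schema) and their possible use as search parameters can be sketched in code. Everything here is an assumption for illustration: the class `DictionaryProfile`, its field names, the `search` helper, and the three placeholder records are invented and do not represent any decision of the WG or any real dictionary.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DictionaryProfile:
    """Short standardized profile of one dictionary (hypothetical fields)."""
    # Information about the dictionary itself
    name: str
    dictionary_type: str                    # e.g. "historical", "dialect"
    source_language: str
    description_language: str
    target_language: Optional[str] = None   # None for monolingual dictionaries
    year_print: Optional[int] = None
    year_online: Optional[int] = None
    number_of_entries: Optional[int] = None
    references: List[str] = field(default_factory=list)
    # Information about the technical process
    encoding: Optional[str] = None          # e.g. "TEI"
    xml_schema: Optional[str] = None


def search(profiles, **criteria):
    """Return the profiles whose fields equal every given criterion."""
    return [p for p in profiles
            if all(getattr(p, key) == value for key, value in criteria.items())]


# Invented placeholder records, not real dictionaries.
profiles = [
    DictionaryProfile("Example Dictionary A", "historical", "German", "German",
                      encoding="TEI"),
    DictionaryProfile("Example Dictionary B", "dialect", "German", "German"),
    DictionaryProfile("Example Dictionary C", "historical", "Dutch", "Dutch",
                      encoding="TEI"),
]

hits = search(profiles, dictionary_type="historical", source_language="Dutch")
print([p.name for p in hits])
```

In a sketch like this, any profile field can serve as a search parameter, so the question of which parameters to offer in the portal becomes a question of which fields the profile should contain.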
23. Decisions which have to be made / questions
- Organisation
  - mailing list for each WG? → establish at INL?
  - (Google Groups for each WG, all of them including members of steering group)
  - how to exchange information / results of WGs within WGs and amongst all participants → can we use the intranet as envisaged in the proposal? or Google Groups and Google Docs? (suitable instruments?)
  - do we need slots for inter-WG meetings at all WG meetings?
  - specialist workshops preceding the WG meeting?
24. Decisions which have to be made / questions
- Organisation
  - STSMs
    - central and open call? or call focused on certain topics fostering certain tasks in the action?
    - information concerning reimbursement to participants?
  - Training Schools
    - how to organize? where? when? how long?
    - number of participants? experts?
    - budget?
25. To ask from participants of WG 2
- short biographies concerning their background (like Anne did in WG 1, see minutes)?
  - collect them for ENeL website? secured or open part of website?
- continue to divide tasks / build subgroups (especially for task 4)
- invite them to think about topics relevant for any concern of WG 2 not yet fixed in working plan
26. To ask from participants of WG 2
- invite them to think about experts to be involved in the discussions of WG 2 (specialist workshops)
- invite them to think about topic(s) to deal with at Bolzano → fixed in Leiden: presentation of first results of task 3 (development of standards for the encoding of information and the description of relevant information categories for print dictionaries) at meeting in Bolzano
- invite them once again to think about a 5-day meeting in the Lorentz Center in Leiden in 2016
27. To ask from participants of WG 2
- invite them to think about the Training School in 2015: Standard tools and methods for retro-digitising dictionaries
  - → date: year 2, semester 2
  - → Rute will check location with Vlado and Vera
- give a description of their dictionary/ies according to our dictionary scheme (deadline depending on the decision about what this dictionary profile looks like)
28. Tasks for Bolzano
- define a list of dictionaries to begin with (e.g. bilingual synonym dictionaries)
- define a list of dictionaries to be retro-digitised
- define a list of metadata
- (ask all WGs for a list of dictionaries and a list of mark-up)
- proposal with dictionary typology including definitions of technical terms used (end of June)
- define