Title: Applying Embodied Construction Grammar:
1 Applying Embodied Construction Grammar: a description of some Afrikaans morphological constructions
- Gerhard B van Huyssteen
- Potchefstroom University for CHE
- South Africa
- Acknowledgement: Sulené Pilon
ICLC 2003
2 Overview
- HLT and CL in South Africa
- Project: Automatic Morphological Analysis of Afrikaans
- Requirements of a Formalism
- Two Afrikaans Constructions
- Plural Construction
- Nominalising Construction
- Concluding remarks
3 HLT in South Africa
- CL and NLP
- well-established research fields in USA, Europe, and other parts of the world
- unexplored territory in South Africa
- no comprehensive HLT projects for many years
- Since 2000
- awareness of importance of HLT
- governmental level: advisory committee of DACST (2002)
- academic level: new projects and programmes
4 CL at the PUCHE
- Since 2001
- prioritised CL as strategically important
- establish research focus area Language and Technology
- establish first complete graduate study programme in CL in South Africa
- set up dedicated HLT laboratory
- acquire text and speech corpora for
- Afrikaans
- South African English
- Setswana
- Two related Afrikaans projects
- Spelling Checker project (funded by University)
- Automatic Morphological Analysis of Afrikaans
project (funded by NRF)
5 AMAA project
- Aim: to develop efficient, reusable modules for the automatic morphological analysis of Afrikaans
- tokeniser, hyphenator
- word segmenter, POS tagger
- compound analyser, stemmer
- Project team includes 4 linguists, 1 computational linguist (from University of Tilburg, Netherlands), 2 computer scientists
- Problem: communication between
- different disciplines
- different languages
6 In Search of a Formalism
- A formalism is a set of features used to precisely and rigorously interpret linguistic analysis (i.e. rules, principles, conditions, etc.) in logical or mathematical terms, in order to develop a calculus (cf. Crystal, 1997: 156)
- Looking for
- a formal rule system (i.e. formal grammar or formalism)
- for declarative purposes
- not for more procedural purposes (like parsing and generation)
- to represent Afrikaans morphological structure
- not particularly interested in syntax, semantics, pragmatics
7 Requirements: Formalisms
- Accessibility
- Transparent
- Supported by literature
- Efficiency
- Linguistically efficient
- Must be able to capture all linguistic phenomena accurately
- Computationally efficient
- To be implemented in a computer environment
- Flexibility
- Describe language structure with ease
- Represent the underlying linguistic theory
- Reusability
- apply in different environments and applications
8 Some specific requirements
- Must represent regexps
- developing a rule-based stemmer, using Perl
- Must rank the rules
- exceptions (i.e. low-level instantiations) are ranked higher than rules (i.e. schemas)
- longer rules are ranked higher than shorter rules
- DIM construction: -tjie is removed before -jie (paaltjie, hondjie)
- Must be compatible with CG
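The ranking requirement can be illustrated with a minimal Python sketch. It is not the project's Perl stemmer: the rule table, the function name `strip_diminutive`, and the use of pattern length as the rank are assumptions made for this example. Because longer rules outrank shorter ones, -tjie is tested before -jie.

```python
import re

# Hypothetical rule table: (suffix pattern, replacement).
# Only the two diminutive suffixes named on the slide are covered.
RULES = [
    (r"jie$", ""),
    (r"tjie$", ""),
]

# Longer rules outrank shorter ones: sort by pattern length, longest first.
RANKED = sorted(RULES, key=lambda rule: len(rule[0]), reverse=True)


def strip_diminutive(word):
    """Remove the highest-ranked matching diminutive suffix, if any."""
    for pattern, replacement in RANKED:
        if re.search(pattern, word):
            return re.sub(pattern, replacement, word)
    return word  # no rule applies


print(strip_diminutive("paaltjie"))  # paal
print(strip_diminutive("hondjie"))   # hond
```

If the shorter -jie rule were tried first, paaltjie would be stemmed to "paalt", which is why the ordering matters.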
9 Procedure
- Identify main morphological processes
- Inflection
- Derivation
- Compounding
- Identify constructions
- PLURAL construction
- PAST construction
- NOMINALISING construction
- REDUPLICATION construction
- Draw categorisation networks
- Translate into ECG
- Implement in stemmer
10 Afrikaans Plural Construction
- Inflectional process, realised by means of suffixation
- 2 prototypical constructions
- -e: hond > honde 'dogs'; bal > balle 'balls'
- -s: venster > vensters 'windows'; tafel > tafels 'tables'
- Elaborations of the general schema
- -'e: 3 > 3'e / 3's
- -'s: ma > ma's 'mothers'
- Extensions of the general schema
- -a: datum > data
11 Categorisation Network
[Figure: categorisation network diagram, not reproduced]
12 PLURAL construction I
construction SUFFIXATION
  subclass of AFFIXATION
  constructional
    constituents
      root
      suffix
    constraints
      constituency rootm/rootf ? suffixm/suffixf
  form
    constraints
      rootf meets suffixf
      suffixf.dependency ? dependent
      rootf.dependency ? autonomous dependent
  meaning
    constraints
      profile-det ? suffix
13 PLURAL construction II
construction PLURAL
  subclass of SUFFIXATION
  constructional
    evokes INFLECTION
    constituents
      root NOUN-SG LET NUM ABBR
      suffix PLURAL-SUF
    constraints
      rootm.scope-of-pred ? BOUNDED-REGION
      suffixm.scope-of-pred ? UNBOUNDED-REGION
  form
  meaning
    constraints
      scope-of-pred ? UNBOUNDED-REGION
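The "subclass of" links in PLURAL construction I and II can be read as an inheritance hierarchy. The sketch below is one possible Python encoding and is not part of ECG itself: the class `Construction` and its methods `lineage` and `all_constraints` are assumptions for illustration. The constraint strings are copied from the slides, including the `?` that stands in for an operator lost in extraction.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Construction:
    """A construction with an optional parent (its 'subclass of' link)."""
    name: str
    parent: Optional["Construction"] = None
    constituents: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)

    def lineage(self):
        """Names from this construction up to the most general schema."""
        names = [self.name]
        if self.parent is not None:
            names += self.parent.lineage()
        return names

    def all_constraints(self):
        """Inherited constraints first, then this construction's own."""
        inherited = self.parent.all_constraints() if self.parent else []
        return inherited + self.constraints


AFFIXATION = Construction("AFFIXATION")
SUFFIXATION = Construction(
    "SUFFIXATION",
    parent=AFFIXATION,
    constituents=["root", "suffix"],
    constraints=["rootf meets suffixf"],
)
PLURAL = Construction(
    "PLURAL",
    parent=SUFFIXATION,
    constituents=["root", "suffix"],
    constraints=["scope-of-pred ? UNBOUNDED-REGION"],
)

print(PLURAL.lineage())
print(PLURAL.all_constraints())
```

A more specific construction such as PLURAL-s would be added the same way, with `parent=PLURAL` and its own regexp and ranking constraints.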
14 PLURAL construction III
construction PLURAL-s
  subclass of PLURAL
  constructional
    constituents
      root NOUN-SG-CN
      suffix s
    constraints
      rootf /(C)?V(C)Va-z/
      suffixf /s/
      rootm.profile ? THING
      ranking 16
  form
    constraints
      s/(C)?V(C)Va-z/(C)?V(C)Va-zs/
  meaning
    constraints
      profile ? THING
15 PLURAL construction IV
construction PLURAL-'s
  subclass of PLURAL-s
  constructional
    constituents
      root NOUN-SG-PROPER NOUN-SG-CN LETT NUM ABBR
      suffix 's
    constraints
      rootf /PROPN(V)/ /CN(iouá)/ /(a-zlmnrsxz)/ /(1-9123456)/ /ABBR(V)/
      rootm.profile ? THING SAR
      suffixf /'s/
      ranking 13
  form
    constraints
      s/PROPN(V)/PROPN(V)'s/
      s/CN(V)/CN(V)'s/
      s/(a-zlmnrsxz)/(a-zlmnrsxz)'s/
      s/(1-9123456)/(1-9123456)'s/
      s/ABBR(V)/ABBR(V)'s/
  meaning
    constraints
      profile ? THING
16 PLURAL construction V
construction PLURAL-specified
  subclass of PLURAL
  constructional
    constituents
      root pad sambreel hemp seun bod Aardklop (lspr)eeu man (m)?eeu vrou voël kasteel bal oom
      suffix PLURAL-SUF
    constraints
      ranking 1
  form
    constraints
      s/pad/paaie/
      s/sambreel/sambrele/
      s/hemp/hemde/
      s/seun/seuns/
      s/bod/botte/
      s/Aardklop/(Aardkloppe|Aardklops)/
      s/(lspr)eeu/(lspr)eeus/
      s/man/(manne|mans)/
      s/(m)?eeu/(m)?eeue/
      s/vrou/(vroue|vrouens)/
      s/voël/(voëls|voële)/
      s/kasteel/kastele/
      s/bal/(balle|ballas)/
      s/oom/ooms/
  meaning
    constraints
      profile ? THING
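How the ranking values might interact can be sketched in Python. The ranks 1, 13 and 16 and the exception pairs come from the slides; everything else is an assumption: that a lower number means the rule is tried first, the function name `pluralise`, the simplified patterns (the originals are partly garbled), and the test word foto, which does not appear in the slides. The catch-all rank-16 rule is a stand-in only and would wrongly add -s to -e nouns such as hond.

```python
import re

# (ranking, pattern, replacement). Patterns are simplified stand-ins.
RULES = [
    (16, r"$", "s"),                # PLURAL-s: general schema (catch-all here)
    (13, r"([iouá])$", r"\1's"),    # PLURAL-'s: common nouns ending in i, o, u, á
    (1, r"^pad$", "paaie"),         # PLURAL-specified: listed exceptions
    (1, r"^sambreel$", "sambrele"),
    (1, r"^hemp$", "hemde"),
    (1, r"^bod$", "botte"),
    (1, r"^kasteel$", "kastele"),
]


def pluralise(noun):
    """Apply the first matching rule, trying the lowest ranking number first."""
    for _, pattern, replacement in sorted(RULES, key=lambda rule: rule[0]):
        if re.search(pattern, noun):
            return re.sub(pattern, replacement, noun, count=1)
    return noun


print(pluralise("pad"))      # paaie
print(pluralise("kasteel"))  # kastele
print(pluralise("foto"))     # foto's
print(pluralise("tafel"))    # tafels
```

Without the rank-1 entries, kasteel would fall through to the general schema and come out as "kasteels", which is the behaviour the ranking is meant to prevent.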
17 Categorisation Network
[Figure: categorisation network diagram, not reproduced]
18 NOMINALISING construction I
construction NOMINALISING
  subclass of AFFIXATION
  constructional
    evokes DERIVATION
    constituents
      root VERB|ADJ|ADV
      affix NOM-PREFIX|NOM-SUFFIX|NOM-CIRCUMFIX
    constraints
      rootm.profile ? PROCESS|SAR|CAR
      affixm.profile ? THING
  form
  meaning
    constraints
      profile ? THING
19 NOMINALISING construction II
construction NOMINALISING-ge()Cery
  subclass of NOMINALISING-ge()ery
  constructional
    constituents
      root VERB
      circumfix ge()ery
    constraints
      rootf /VERB(áéíóúC/
      rootm.profile ? PROCESS
      circumfixf /ge()Cery/
      ranking 1
  form
    constraints
      s/VERB(áéíóúC/ge(VERB)(áéíóúCCery/
  meaning
    constraints
20 NOMINALISING construction III
construction NOMINALISING--VCing
  subclass of NOMINALISING-ing
  constructional
    constituents
      root VERB
      suffix ing
    constraints
      rootf /VERB(VVC)/
      rootm.profile ? PROCESS
      suffixf /-VCing/
      ranking 10
  form
    constraints
      s/VERB(VVC)/VERB(VC)ing/
  meaning
    constraints
21 NOMINALISING construction IV
construction NOMINALISING-er
  subclass of NOMINALISING-SUF
  constructional
    constituents
      root VERB
      suffix er
    constraints
      rootf /(VERB)/
      rootm.profile ? PROCESS
      suffixf /er/
      ranking 12
  form
    constraints
      s/(VERB)/(VERB)er/
  meaning
    constraints
      attr ? HUMAN
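The form constraints of NOMINALISING construction II to IV can be approximated with ordinary regular expressions. The Python sketch below is an interpretation, not the project's rules: the helper `apply_rule`, the three function names, and the concrete patterns are assumptions. Stress (the accented vowels in the slide patterns) cannot be read off the spelling, so the ge-...-ery rule simply assumes a consonant plus single vowel plus final consonant. The inputs are illustrative, and the outputs follow the patterns without being checked against a lexicon.

```python
import re


def apply_rule(pattern, replacement, word):
    """Apply one s/pattern/replacement/ form constraint; None if no match."""
    if re.search(pattern, word) is None:
        return None
    return re.sub(pattern, replacement, word, count=1)


def nominalise_ge_ery(verb):
    """ge-...-ery with doubling of the final consonant (cf. ranking 1)."""
    return apply_rule(r"^(.*[^aeiou][aeiou])([^aeiou])$", r"ge\1\2\2ery", verb)


def nominalise_ing(verb):
    """-ing with a doubled vowel written single: VVC becomes VC + ing (cf. ranking 10)."""
    return apply_rule(r"([aeiou])\1([^aeiou])$", r"\1\2ing", verb)


def nominalise_er(verb):
    """-er added to the verb (cf. ranking 12); the slide adds attr HUMAN."""
    return apply_rule(r"^(.+)$", r"\1er", verb)


print(nominalise_ge_ery("klap"))  # geklappery
print(nominalise_ing("betaal"))   # betaling
print(nominalise_er("werk"))      # werker
```

Returning `None` when a pattern does not match keeps each construction's applicability separate from the question of which rule is tried first.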
22 Summary of adaptations
- Our adaptations provided for our needs
- added regexps as form constraints
- added ranking as constructional constraints
- added attributes as meaning constraints
- added more CG concepts/constructs
- profile
- valence factors
- profile determinacy
- conceptual and phonological autonomy and dependency
- constituency
- correspondence?
- These adaptations make ECG more accessible for us
23 Evaluation: ECG as a Declarative Formalism
- Accessible?
- very little ECG material (specifically on morphology) available
- isolated: we do whatever we want to do
- Efficient?
- Linguistically efficient?
- handled our data beautifully
- Computationally efficient?
- not our primary concern
- improved communication with computational linguist and computer scientists
- Flexible?
- represents essence of Cognitive Linguistics beautifully
- easy to add features/adaptations
- Reusable?
- not our primary concern
- Main Advantage
- compatibility with Cognitive Grammar
24 Conclusion
- Your conclusion
- What are we doing wrong?
- What are we missing?
- Are we abusing ECG?