Title: Contrasting Polish and English Derivational Groups
1 Contrasting Polish and English Derivational Groups
- based on
- Jadacka, H., Rzeczownik polski jako baza derywacyjna, WN-PWN 1995
- an independent contrastive study of 540 Polish-English pairs of derivatives
November 28th 2000
2 Outline
- Defining terms
- Derivational group
- Derivational base
- Affixes
- Similarity of and within derivational groups
- Procedure of comparison
- Conclusions
3 Derivational group
- A well-ordered system constructed around an underived entry word, concentrating all the derivatives connected with it by a direct or indirect process of derivation
- a hierarchical structure in which each element functions as a link between other derivatives and the BASE
4 Derivational base
- The item to which an affix is added to derive a new word-form
- the word-forms consisting of the derivational base and an affix are called DERIVATIVES
- e.g. STYLE - STYLIZE - STYLIZER
- e.g. CENTRE - CENTRIC - CENTRICALLY
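The hierarchical structure of a derivational group can be sketched as a small tree, with the base as the root and each derivative attached to the word it is derived from. This is only an illustration: the class name `Node` and its methods are hypothetical and not part of the study.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One word in a derivational group (hypothetical helper class)."""
    word: str
    children: list = field(default_factory=list)

    def derive(self, word):
        """Attach a direct derivative and return it, so chains can be built."""
        child = Node(word)
        self.children.append(child)
        return child

    def chain_to(self, target, path=()):
        """Return the derivation chain from this node down to `target`, or None."""
        path = path + (self.word,)
        if self.word == target:
            return list(path)
        for child in self.children:
            found = child.chain_to(target, path)
            if found:
                return found
        return None


# STYLE is the base; STYLIZE is derived directly, STYLIZER indirectly.
style = Node("STYLE")
style.derive("STYLIZE").derive("STYLIZER")
print(style.chain_to("STYLIZER"))
```

Here STYLIZE is the link between the indirect derivative STYLIZER and the base STYLE.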
5 Affix
- a morpheme that is added to a word and changes the meaning or function of the word
- affixes are bound forms that can be added
- to the beginning of a word (a prefix), e.g. unkind
- to the end of a word (a suffix), e.g. kindness
6 Similarity within derivational groups
- Four kinds of similarity within derivational groups are considered.
- Three types of translational similarity
- translational similarity between morphemes
- translational similarity between derivatives
- translational similarity between derivational groups
- and one type of grapho-etymological similarity
- graphemic and etymological similarity between bases
7 Degrees of translational similarity between morphemes (incl. bases)
- def. translational similarity between L1 and L2 morphemes is the degree to which an L1 morpheme can correctly be rendered as a corresponding L2 morpheme (i.e. a morpheme occupying the same position with respect to the base).
- no similarity, e.g.
- ponad- vs. -less in P. ponad-czasowy, E. time-less
- 1st degree of similarity, e.g.
- bez- vs. -less in P. bez-głośny, E. voice-less
- 2nd degree of similarity, e.g.
- -ik vs. -er in P. głośn-ik, E. loudspeak-er
- -czas- vs. time- in P. ponad-czas-owy, E. time-less
8 Degrees of translational similarity between derivatives
- def. a joint translational similarity between all the corresponding morphemes of the Polish and English derivatives,
- whereby two morphemes are corresponding iff they occupy the same position with respect to the base.
- e.g.
  Pol.    Eng.
  za-     a-
  les-    forest
  -ać     (no corresponding morpheme)
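The notion of corresponding morphemes can be sketched as an alignment by position relative to the base. This is a minimal sketch: the dictionary layout and the function name `align` are assumptions made for illustration, and the slides do not say how the joint similarity is computed from the aligned pairs, so no score is calculated here.

```python
from itertools import zip_longest


def align(polish, english):
    """Pair morphemes that occupy the same position relative to the base.

    Each derivative is a dict with 'prefixes', 'base' and 'suffixes'
    (a hypothetical representation). A morpheme without a counterpart
    is paired with None.
    """
    pairs = []
    # Prefixes are counted outward from the base, so walk them in reverse
    # and insert at the front to keep left-to-right order.
    for p, e in zip_longest(reversed(polish["prefixes"]),
                            reversed(english["prefixes"])):
        pairs.insert(0, (p, e))
    pairs.append((polish["base"], english["base"]))
    # Suffixes are counted outward from the base as well.
    for p, e in zip_longest(polish["suffixes"], english["suffixes"]):
        pairs.append((p, e))
    return pairs


zalesac = {"prefixes": ["za-"], "base": "les-", "suffixes": ["-ać"]}
afforest = {"prefixes": ["a-"], "base": "forest", "suffixes": []}
print(align(zalesac, afforest))
```

The Polish suffix is paired with `None`, which mirrors the empty slot in the example above.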
9 Degrees of translational similarity between derivational groups
- similarity between derivational groups is a function of
- the grapho-etymological similarity of their bases,
- and the translational similarity of all their derivatives.
10 Degrees of graphemic-etymological similarity between derivational bases
- def. Similarity established between two bases with respect to their etymological and graphemic features, on the assumption of their translational equivalence
- no similarity, e.g. dom vs. house
- remote similarity, e.g. brat vs. brother
- close similarity, e.g. styl vs. style
- irrespective of the translational equivalence of their derivatives
11 Scale of translational similarity between derivatives
- The scale used here consists of 12 levels of similarity, numbered from 11 down to 0, where 0 stands for the lowest level of similarity and 11 denotes the highest.
[Figure: scale of similarity, levels 0 to 11]
12 Treatment of compound derivatives
- If a single compound derivative of the form A-B or AB (but not A B) has an equivalent in the other language in the form of 2 separate words C D, then it is included in our classification as long as
- C is a direct translation of A and D is a direct translation of B
- or C is a direct translation of B and D is a direct translation of A.
- This convention has been adopted because
- Jadacka's derivational groups contain only derivatives of the form AB or A-B, but no A B derivatives
- Jadacka's work constituted the main and most reliable source of derivatives and derivational groups considered in the study.
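The inclusion rule for compounds can be written as a short check. This is a sketch under stated assumptions: the function name `is_included` and the toy `translations` dictionary are hypothetical, and "direct translation" is reduced here to a simple dictionary lookup.

```python
def is_included(compound_parts, equivalent_words, translations):
    """Apply the A-B / C D rule to a compound and its two-word equivalent.

    compound_parts   -- (A, B), the two parts of the compound A-B or AB
    equivalent_words -- (C, D), the two separate words in the other language
    translations     -- hypothetical dict: word -> set of direct translations
    """
    a, b = compound_parts
    c, d = equivalent_words

    def translates(source, target):
        return target in translations.get(source, set())

    same_order = translates(a, c) and translates(b, d)
    swapped_order = translates(b, c) and translates(a, d)
    return same_order or swapped_order


# Toy dictionary for illustration only.
translations = {"free": {"wolny"}, "style": {"styl"}}

print(is_included(("free", "style"), ("styl", "wolny"), translations))  # True
print(is_included(("free", "style"), ("wolny", "czas"), translations))  # False
```

The first call succeeds through the swapped-order clause: styl translates style (B) and wolny translates free (A).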
13 Compound derivatives 1
- 11. P. BASE1 BASE2 SUFFIX
- E. BASE1 BASE2 SUFFIX
- e.g. słowo - word
- słowo-twór-stwo → word form-ation
- 10. E. BASE1 (BASE2 SUFFIX)
- P. (BASE2 SUFFIX) BASE1
- e.g. krew - blood
- blood-stain-ed → poplamio-ny krwią
- 9. E. BASE1 BASE2
- P. BASE2 (BASE1 SUFFIX)
- e.g. głos - voice
- voice-mail → poczta głos-owa
14 Compound derivatives 2
- 8. P. BASE1 BASE2
- E. BASE1 BASE2
- e.g. słowo - word
- pół-słowo → half-word
- 7. E. BASE1 BASE2
- P. BASE2 BASE1
- e.g. styl - style
- free-style → styl wolny
15 Single derivatives 1
- 6. P. BASE SUFFIX
- E. BASE SUFFIX
- e.g. las - forest
- leś-nik → forest-er
- P. BASE SUFFIX SUFFIX
- E. BASE SUFFIX SUFFIX
- e.g. styl - style
- styl-ist-yczny → styl-ist-ic
- P. PREFIX BASE SUFFIX
- E. PREFIX BASE SUFFIX
- e.g. las - forest
- wy-les-anie → de-forest-ation
- P. PREFIX BASE SUFFIX SUFFIX
- E. PREFIX BASE SUFFIX SUFFIX
- e.g. centrum - centre
- de-centr-al-izować → de-centr-al-ize
16 Single derivatives 2
- 5. P. PREFIX BASE SUFFIX
- E. BASE SUFFIX SUFFIX
- e.g. dziecko - child
- bez-dziet-ność → child-less-ness
- 4. P. PREFIX BASE SUFFIX
- E. BASE SUFFIX
- e.g. pan - lord
- wielko-pań-ski → lord-ly
- 3. P. PREFIX BASE SUFFIX
- E. PREFIX BASE
- e.g. las - forest
- za-les-ać → a-forest
17 Single derivatives 3
- 2. P. BASE SUFFIX
- E. BASE ____
- e.g. słowo - word
- słow-nik → word-book
- P. BASE SUFFIX
- E. BASE
- e.g. dziecko - child
- dziec-inka → child
- 1. P. BASE SUFFIX SUFFIX
- E. ____ ____ SUFFIX
- e.g. słowo - word
- słow-nik-arz → lexico-graph-er
- P. BASE SUFFIX
- E. ____ SUFFIX
- e.g. znak - sign
- znacz-nik → mark-er
18 Single derivatives 4
- 0. E. BASE BASE
- P. ____
- e.g. time - czas
- time-piece → zegarek
- P. BASE SUFFIX
- E. ____
- e.g. kość - bone
- kos-tka → ankle
- E. PREFIX BASE
- P. ____
- e.g. child - dziecko
- grand-child → wnuk
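The structural patterns listed for levels 11 to 0 can be collected in a single lookup table. The sketch below only transcribes the patterns shown on the slides. The names `PATTERN_LEVEL` and `similarity_level` are hypothetical, the Polish pattern is always given first regardless of the direction of translation, and "_" stands for a slot with no corresponding morpheme.

```python
# (Polish pattern, English pattern) -> level on the 0-11 scale.
PATTERN_LEVEL = {
    ("BASE1 BASE2 SUFFIX", "BASE1 BASE2 SUFFIX"): 11,
    ("(BASE2 SUFFIX) BASE1", "BASE1 (BASE2 SUFFIX)"): 10,
    ("BASE2 (BASE1 SUFFIX)", "BASE1 BASE2"): 9,
    ("BASE1 BASE2", "BASE1 BASE2"): 8,
    ("BASE2 BASE1", "BASE1 BASE2"): 7,
    ("BASE SUFFIX", "BASE SUFFIX"): 6,
    ("BASE SUFFIX SUFFIX", "BASE SUFFIX SUFFIX"): 6,
    ("PREFIX BASE SUFFIX", "PREFIX BASE SUFFIX"): 6,
    ("PREFIX BASE SUFFIX SUFFIX", "PREFIX BASE SUFFIX SUFFIX"): 6,
    ("PREFIX BASE SUFFIX", "BASE SUFFIX SUFFIX"): 5,
    ("PREFIX BASE SUFFIX", "BASE SUFFIX"): 4,
    ("PREFIX BASE SUFFIX", "PREFIX BASE"): 3,
    ("BASE SUFFIX", "BASE _"): 2,
    ("BASE SUFFIX", "BASE"): 2,
    ("BASE SUFFIX SUFFIX", "_ _ SUFFIX"): 1,
    ("BASE SUFFIX", "_ SUFFIX"): 1,
    ("_", "BASE BASE"): 0,
    ("BASE SUFFIX", "_"): 0,
    ("_", "PREFIX BASE"): 0,
}


def similarity_level(polish_pattern, english_pattern):
    """Return the level for a pattern pair, or None if it is not listed."""
    return PATTERN_LEVEL.get((polish_pattern, english_pattern))


# za-les-ać vs. a-forest: PREFIX BASE SUFFIX vs. PREFIX BASE
print(similarity_level("PREFIX BASE SUFFIX", "PREFIX BASE"))  # 3
```

Pattern pairs that are not shown on the slides return `None`; the table makes no claim about how such pairs were scored in the study.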
19 Experiment
- 540 Polish-English pairs of derivatives were judged as to their similarity according to the 12-point scale presented above
- the translational similarity points obtained for each pair of derivatives of each of the Polish and English bases, together with the grapho-etymological similarity between these bases, were analysed statistically
20 Statistical tests applied in the study
- in spite of the non-normality of the data, the following parametric tests were applied
- MANOVA for
- translational similarity between derivatives by
- grapho-etymological similarity between the bases these derivatives were obtained from, and
- direction of translation
- (Polish-English based on Jadacka 95 and Collins Polish-English Electronic Dictionary,
- English-Polish based on Harper-Collins Electronic Dictionary and Collins English-Polish Electronic Dictionary)
- Multiple Range Tests for
- translational similarity of the derivatives, irrespective of whether they were obtained through Polish-English or English-Polish translation
- by grapho-etymological similarity between the Polish and English bases they were derived from
- Multiple Range Tests for
- translational similarity of the derivatives obtained through Polish-English translation
- by grapho-etymological similarity between the Polish and English bases they were derived from
- additionally, some non-parametric tests were applied
- Mann-Whitney W test to compare
- the medians of the similarity points obtained for the derivatives in Polish-English translation
- with the medians of the similarity points obtained for the derivatives in English-Polish translation
21 Some results: MANOVA
- Type III Sums of Squares were used
- All F-ratios were based on the residual mean square error.
Source                          Sum of Squares   Df    Mean Square   F-Ratio   P-Value
A: graph_ethym_sim_betw_bases   590,704          2     295,352       53,53     0,0000
B: direction_of_translation     195,227          1     195,227       35,38     0,0000
RESIDUAL                        2957,27          536   5,5173
TOTAL (CORRECTED)               3903,44          539
- The P-values test the statistical significance of each of the sources. Since both P-values are less than 0,05, the grapho-etymological similarity between bases and the direction of translation each have a statistically significant effect on the translational similarity between the derivatives obtained from these bases at the 95,0% confidence level.
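The entries of the MANOVA table can be cross-checked with the usual relations: mean square = sum of squares / Df, and F-ratio = mean square / residual mean square. The short sketch below recomputes them from the reported sums of squares; the variable names are illustrative, and the decimal commas are written as points.

```python
# Sums of squares and degrees of freedom as reported in the table.
sum_of_squares = {"A": 590.704, "B": 195.227, "RESIDUAL": 2957.27}
degrees_of_freedom = {"A": 2, "B": 1, "RESIDUAL": 536}

# Mean square = sum of squares / degrees of freedom.
mean_square = {
    source: sum_of_squares[source] / degrees_of_freedom[source]
    for source in sum_of_squares
}

# Each F-ratio is based on the residual mean square error.
f_ratio = {
    source: mean_square[source] / mean_square["RESIDUAL"]
    for source in ("A", "B")
}

for source in ("A", "B"):
    print(source, round(mean_square[source], 3), round(f_ratio[source], 2))
print("RESIDUAL", round(mean_square["RESIDUAL"], 4))
```

The recomputed F-ratios round to 53,53 and 35,38 and the residual mean square to 5,5173, in agreement with the table. The degrees of freedom also add up: 2 + 1 + 536 = 539.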
22 Some results: Multiple Range Tests
Contrast   Difference    +/- Limits
0 - 1       0,197742     1,25397
0 - 2      *-2,60124     0,488299
1 - 2      *-2,79898     1,30672
* denotes a statistically significant difference.
- which means that
- the derivational groups of the Polish-English bases that were judged to bear no similarity with respect to their grapho-etymological features, and the derivational groups of the bases that were judged to be remotely similar with respect to their grapho-etymological features (i.e. contrast 0-1), do not differ significantly with respect to the translational similarity of the derivatives that constitute the derivational groups of each of these bases.
- on the other hand, the contrasts 0-2 and 1-2 are significant: groups whose bases are closely similar in etymology and graphemic representation differ significantly from groups whose bases show no or only remote similarity, as far as the translational similarity of their derivatives is concerned.
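The reading of the Multiple Range Tests table can be sketched as a simple comparison. The sketch assumes the usual convention for such tables, namely that a contrast is significant when the absolute difference exceeds its +/- limit; the slides do not state this rule explicitly.

```python
# Contrast -> (difference, +/- limit), as reported in the table
# (decimal commas written as points).
contrasts = {
    "0 - 1": (0.197742, 1.25397),
    "0 - 2": (-2.60124, 0.488299),
    "1 - 2": (-2.79898, 1.30672),
}

# Assumed rule: significant when |difference| > limit.
significant = {
    name: abs(difference) > limit
    for name, (difference, limit) in contrasts.items()
}

for name, flag in significant.items():
    print(name, "significant" if flag else "not significant")
```

Under this assumption only the contrasts involving level 2 (close similarity between bases) come out as significant, which matches the interpretation given on the slide.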
23
[Figure: frequency and cumulative frequency chart, 540 observations]
24 Applications of the study
- The results of the study provide insights into the possibility of automatic translation of UNKNOWN L1 derivatives on the basis of
- the L2 equivalents of the component morphemes of the L1 derivative
- the degree of grapho-etymological similarity between the bases of these derivatives
25 For example
- assume
- we do not know the equivalent of a derivative leśnik
- we can interpret bases even if they are modified by other morphemes (las → leś-)
- we know the equivalents of the component morphemes
- leś- (from las) → forest
- -nik → -er
- we know the grapho-etymological similarity between the bases (0, i.e. no similarity)
- Hence, we guess with a relatively small certainty that
- the English equivalent of leśnik is forester
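The guessing procedure in this example can be sketched as a morpheme-by-morpheme lookup. This is a minimal sketch, not the study's procedure: the function name and the toy dictionary are hypothetical, and the code assumes that the morphemes keep the same order in both languages, which the scale shows is not always the case (e.g. bez-dziet-ność vs. child-less-ness).

```python
# Toy table of known morpheme equivalents (Polish -> English), illustration only.
MORPHEME_EQUIVALENTS = {"leś-": "forest", "-nik": "-er"}


def guess_equivalent(morphemes, equivalents):
    """Guess an L2 derivative by translating each L1 morpheme in place.

    Returns None when any component morpheme has no known equivalent.
    """
    parts = []
    for morpheme in morphemes:
        if morpheme not in equivalents:
            return None
        # Drop the hyphens that mark morpheme boundaries before joining.
        parts.append(equivalents[morpheme].strip("-"))
    return "".join(parts)


print(guess_equivalent(["leś-", "-nik"], MORPHEME_EQUIVALENTS))  # forester
```

How much to trust such a guess would additionally depend on the grapho-etymological similarity of the bases, which this sketch does not model.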
26 Pessimistic scenario for automatic translation of derivatives
[Figure: scale of similarity, levels 0 to 11]
27 Optimistic scenario for automatic translation of derivatives
[Figure: scale of similarity, levels 0 to 11]
28 Very optimistic scenario for automatic translation of derivatives
[Figure: scale of similarity, levels 0 to 11]
29 Conclusions
- COMPOSITIONALITY: The meaning of the derivative is a direct function of the meaning of its morphemes in approx. 38-56% of cases
- Assuming we know the equivalents of all the morphemes of an L1 derivative, we have an approx. 38-56% chance of producing a comprehensible L2 derivative
- The grapho-etymological similarity of L1 and L2 bases influences the translational similarity of their derivational groups