Title: A natural-language approach to modeling
1A natural-language approach to modeling
- Why is some XML so difficult to write?
- Yves MARCOUX
- GRDS EBSI
- Université de Montréal
2Structure of the talk
- The problem
- Proposed direction for solution
- Conclusion
- Question period
3Writing well-formed XML: author's choices
- <sex><male /></sex>
- <is-female>FALSE</is-female>
- <gender gender="&#x2642;" />
- <note>It's a boy!</note>
- &#x2642; ? (U+2642 is the male sign, ♂)
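The point of this slide can be checked mechanically. The sketch below, which uses only Python's standard `xml.etree.ElementTree` parser, confirms that all four encodings are well-formed, so well-formedness alone gives the author no guidance on which to choose. The helper name `is_well_formed` is an illustrative assumption, not something from the talk.

```python
import xml.etree.ElementTree as ET

# The four candidate encodings listed on the slide.
candidates = [
    "<sex><male /></sex>",
    "<is-female>FALSE</is-female>",
    '<gender gender="&#x2642;" />',
    "<note>It's a boy!</note>",
]

def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as a well-formed XML element."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

# Every candidate passes: the parser cannot tell which one the modeler intended.
results = [is_well_formed(c) for c in candidates]

# The character reference resolves to the single character U+2642.
gender_value = ET.fromstring(candidates[2]).get("gender")
```

All four fragments parse, which is exactly the problem: the choice between them is semantic, not syntactic.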
4Writing valid XML is collaborative work
- Modeler has chosen the markup (container)
- Author supplies the contents
- Much like a form
- Collaborative work → communication between parties: modeler and author
- But the modeler is gone
5Problem
- Authoring environments are
- good at conveying the syntactic intentions (or decisions) of the modeler
- not as good at conveying the semantic intentions of the modeler
- Often, all there is is a generic ID or some slightly more developed form
- Ex.: date in a memo
6What is available?
- More or less developed forms of genIDs (and attribute names)
- General documentation of the model
- Per element (attribute) documentation
- OK for tooltips or popups
- Could we do better?
- (Applications / stylesheets are not appropriate)
7Could we aim at
- Having a semantic conversation right in the editing window?
- In the same way that there is actually a syntactic conversation?
- Yes
8Structure of the talk
- The problem
- Proposed direction for solution
- Conclusion
- Question period
9Key idea
- Have the modeler prepare bits of NL (prose)
- That can be intertwined with author-supplied contents to give them meaning
- Allows fill-in-like sentences
- And thus, a semantic conversation in the editing window
- NB: modeler segments can contain hyperlinks
10Example
Facts about some US cities
City         Population  Annual snowfall (inches)
Denver       850,000     23
Rochester    240,000     88
Palm Spring  48,000      0
11Raw XML
<facts-about-US-cities>
  <city>
    <name>Denver</name>
    <population>850,000</population>
    <annual-snowfall-in-inches>23</annual-snowfall-in-inches>
  </city>
  <city>
    <name>Rochester</name>
    <population>240,000</population>
    <annual-snowfall-in-inches>88</annual-snowfall-in-inches>
  </city>
  ...
</facts-about-US-cities>
12Prose equivalent
Here are facts about some US cities. The city of
Denver has a population of 850,000 and an annual
snowfall of 23 inches. The city of Rochester has
a population of 240,000 and an annual snowfall of
88 inches. The city of Palm Spring has a
population of 48,000 and an annual snowfall of 0
inches.
13Modeler prepares peritext segments
Element                    text-before                             text-after
facts-about-US-cities      "Here are facts about some US cities."  empty
city                       " The city "                            "."
name                       "named "                                empty
population                 " has a population of "                 empty
annual-snowfall-in-inches  " and an annual snowfall of "           " inches"
14Possible semantic view
Here are facts about some US cities. The city
named Denver has a population of 850,000 and an
annual snowfall of 23 inches. The city named
Rochester has a population of 240,000 and an
annual snowfall of 88 inches. The city named Palm
Spring has a population of 48,000 and an annual
snowfall of 0 inches.
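The semantic view above can be derived mechanically from the raw XML and the modeler's peritext table. The following is a minimal sketch of that computation, assuming Python's standard `xml.etree.ElementTree` and assuming that whitespace-only text between elements (pretty-printing) can be dropped. The names `PERITEXTS` and `semantic_view` are illustrative, not part of the talk.

```python
import xml.etree.ElementTree as ET

# Peritext table from the slide: element name -> (text-before, text-after).
# "empty" on the slide is represented here as the empty string.
PERITEXTS = {
    "facts-about-US-cities": ("Here are facts about some US cities.", ""),
    "city": (" The city ", "."),
    "name": ("named ", ""),
    "population": (" has a population of ", ""),
    "annual-snowfall-in-inches": (" and an annual snowfall of ", " inches"),
}

def semantic_view(element):
    """Wrap the element's contents in its text-before and text-after, recursively."""
    before, after = PERITEXTS.get(element.tag, ("", ""))
    # Assumption: surrounding whitespace is layout only, so it is stripped.
    parts = [before, (element.text or "").strip()]
    for child in element:
        parts.append(semantic_view(child))
        parts.append((child.tail or "").strip())
    parts.append(after)
    return "".join(parts)

doc = ET.fromstring("""
<facts-about-US-cities>
  <city>
    <name>Denver</name>
    <population>850,000</population>
    <annual-snowfall-in-inches>23</annual-snowfall-in-inches>
  </city>
  <city>
    <name>Rochester</name>
    <population>240,000</population>
    <annual-snowfall-in-inches>88</annual-snowfall-in-inches>
  </city>
</facts-about-US-cities>
""")

view = semantic_view(doc)
print(view)
```

Stripping whitespace is acceptable for data-like content such as this example; a document with genuine mixed content would need a more careful whitespace policy.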
15What it allows during editing (in semantic view)
- Peritexts convey the semantic intentions of the modeler
- A semantic conversation takes place in the editing window (instead of a syntactic one)
- Fill-in sentences
- Make tag abuse embarrassing
- Likely to reduce some kinds of errors
- Other views / fragment viewing / hyperlink
16Discussion
- This is not like defining an application
- Not a stylesheet mechanism
- Peritexts (fixed here) could be allowed to vary with some parameters
- position among siblings
- attribute value
- etc.
- (Attributes should be treated)
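One way to picture peritexts that vary with parameters is to replace each fixed string with a small function of the element's position among its siblings and its attributes. The sketch below is only an illustration under that assumption: the `authors`/`author` elements, the `role` attribute, and the rule functions are all hypothetical and do not come from the talk.

```python
import xml.etree.ElementTree as ET

def author_before(index, count, attrib):
    """Hypothetical text-before that depends on position among siblings."""
    if index == 0:
        return "Written by "
    return " and " if index == count - 1 else ", "

def author_after(index, count, attrib):
    """Hypothetical text-after that depends on an attribute value."""
    return " (editor)" if attrib.get("role") == "editor" else ""

# Element name -> (text-before rule, text-after rule).
RULES = {"author": (author_before, author_after)}

def semantic_view(element, index=0, count=1):
    """Compute the prose for an element, passing position info to the rules."""
    before_fn, after_fn = RULES.get(element.tag, (None, None))
    before = before_fn(index, count, element.attrib) if before_fn else ""
    after = after_fn(index, count, element.attrib) if after_fn else ""
    children = list(element)
    inner = (element.text or "").strip()
    for i, child in enumerate(children):
        # Text following a child (its tail) is ignored in this sketch.
        inner += semantic_view(child, i, len(children))
    return before + inner + after

doc = ET.fromstring(
    '<authors><author>Ann</author>'
    '<author role="editor">Bob</author>'
    '<author>Cy</author></authors>'
)
text = semantic_view(doc)
```

Here the same `author` element yields different connective text depending on whether it comes first, in the middle, or last, which is the kind of variation the bullets above describe.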
17Why does it work?
- Sometimes tricky (see paper), but
- NL has very high affordance
- NL can act as its own metalanguage
- XML contents and NL usually mix pretty well
18Intertextual semantics
- Meaning of a text fragment is given by placing it in a network of other texts
- That network can simply consist in a sentence (or quasi-sentence)
- Or a more elaborate topology: peritexts can contain hyperlinks, determining sense-making / learning paths
- Too much hyperlinking can spoil the idea!
19Interpretation workflow
d --S--> S(d) --H--> actual meaning of d for H
- d is document or fragment, H is a human
- S(d) is the intertextual semantics of d
- S(d) is in NL
- S is machine computable
- Actual meaning of d for H may vary
- with H
- for the same H, from one reading of S(d) to another
20Interpretation workflow
[Diagram: d interpreted by H1, H2 and H3; then d mapped to S(d), which is read by H1, H2 and H3]
21Suggests a modeling process
- Modeler starts with the prose
- Identify peritexts
- Work out more and more abbreviated forms
- Will correspond to different views in the editor
- Tersest level gives markup
- Increase model usability?
22Mixed content question revisited
- Known: one can get rid of mixed content with
- <!ELEMENT text (#PCDATA)>
- Example
- <!ELEMENT ... (e1 | e2 | #PCDATA)*>
- becomes
- <!ELEMENT ... (e1 | e2 | text)*>
- Why does it feel bad?
- <text> tags are not abbreviations of any reasonable peritexts!
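For concreteness, here is a sketch of the instance-level rewriting that this trick implies: every run of character data directly inside a mixed-content element is wrapped in a `text` element. It assumes Python's standard `xml.etree.ElementTree`; the parent element name `x` and the function name `wrap_text_runs` are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

def wrap_text_runs(element):
    """Return a copy of element whose direct text runs are wrapped in <text>."""
    new = ET.Element(element.tag, dict(element.attrib))
    if element.text and element.text.strip():
        ET.SubElement(new, "text").text = element.text
    for child in element:
        tail = child.tail
        clone = copy.deepcopy(child)  # child elements are kept unchanged
        clone.tail = None
        new.append(clone)
        if tail and tail.strip():
            ET.SubElement(new, "text").text = tail
    return new

# Hypothetical mixed-content instance using e1 and e2 from the slide.
mixed = ET.fromstring("<x>one <e1>two</e1> three <e2>four</e2></x>")
wrapped = wrap_text_runs(mixed)
serialized = ET.tostring(wrapped, encoding="unicode")
```

The rewriting is purely mechanical, which fits the slide's observation: the `text` wrapper carries no meaning of its own for which a modeler could write a sensible text-before or text-after.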
23Is NL too much to ask for?
- Relative to some target community
- Can go a long way (previous slide)
- Hyperlinks are allowed in peritexts
- Allows defining sense-making or learning paths
- (Almost) anything formal can be turned into NL
24NL as formalism common denominator
Expression in artificial formalism
STAPLER
Textbook explaining formalism
Equivalent expression in NL
25Editing setup without intertextual semantics
World
Modeler
NL and presupposed knowledge of target community
Doc. / tr. material
Author
XML EDITOR
Valid XML instance or fragment
XML DTD
26Editing setup with intertextual semantics
World
Modeler
NL and presupposed knowledge of target community
Author
XML EDITOR
NL equivalent
Valid XML instance or fragment
text-before and text-after segments
XML DTD
27Structure of the talk
- The problem
- Proposed direction for solution
- Conclusion
- Question period
28What it suggests
- Bring some of the discipline of producing good documents (manuals of style) into model interface design
- E.g., don't abuse hyperlinking
- Literate modeling, literate interfaces
- Literate interface / interaction design
- Benefit: make explicit prerequisite knowledge and sense-making / learning paths
29Other possible uses of intertextual semantics
- Legal documents with multiple renditions
- NLP systems that cannot treat markup
- Including full-text indexing
- <ex>Hamlet</ex>
- Exit Hamlet
- Other data models
- Ex. relational
- Normal forms
- A new look at expressivity
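The `<ex>Hamlet</ex>` example above can be made concrete for full-text indexing. The sketch below compares the tokens an indexer would see if markup were simply stripped with the tokens it would see after peritext expansion. The peritext `"Exit "` for `ex` is inferred from the slide's "Exit Hamlet" line; the function names and the simple word tokenizer are assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Assumed peritext for <ex>, inferred from the slide: text-before "Exit ".
PERITEXTS = {"ex": ("Exit ", "")}

def strip_markup(element):
    """What a markup-unaware system sees: character data only."""
    return "".join(element.itertext())

def expand(element):
    """Character data with text-before / text-after inserted around each element."""
    before, after = PERITEXTS.get(element.tag, ("", ""))
    out = before + (element.text or "")
    for child in element:
        out += expand(child) + (child.tail or "")
    return out + after

def tokens(text):
    """A deliberately naive tokenizer: lowercase word sequences."""
    return re.findall(r"\w+", text.lower())

fragment = ET.fromstring("<ex>Hamlet</ex>")
plain_tokens = tokens(strip_markup(fragment))
expanded_tokens = tokens(expand(fragment))
```

With stripping, the fact that Hamlet exits is lost to the index; with expansion, a plain-text search for "exit hamlet" can match without the indexer knowing anything about the markup.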
30Future work
- Editing
- Work out a few existing / new models
- Properly integrate attributes
- More powerful peritext computation
- Implement ideas in a real editor
- Display peritexts when choosing insertion
- Hyperlinks in displayed peritexts
- Experiment with real authors
31Future work
- More than peritexts?
- More than NL?
- Compare with other semantic frameworks
- Downstream semantics: Wrightson, Renear et al.
- Other models
- Tackle literate modeling / interface design
32Thank you!
- Questions?
- PS: I am currently looking for a location for an upcoming sabbatical