1. DEAP: Diagrammatic Electronic Assessment Project
- Kevin Waugh, Neil Smith, Pete Thomas
- Department of Computing
- The Open University
2. Toward the automated assessment of ERDs
3. The investigators
- Diagram Understanding
- Neil Smith
- Natural Language Processing
- Kevin Waugh
- Assessment, Teaching and Learning
- Pete Thomas
4. Diagrams
5. What is a diagram?
6. What is a diagram?
- Free and structured text aren't diagrams; two examples follow (the shaped "Mouse's Tale" from Alice in Wonderland, then a prose passage):
"It _is_ a long tail, certainly," said Alice,
looking down with wonder at the Mouse's
tail "but why do you call it sad?" And
she kept on puzzling about it while the
Mouse was speaking, so that her idea of the tale
was something like this----"Fury said to
a mouse, That
he met in the
house, Let us
both go to
law _I_ will
prose- cute
_you_.--
Come, I'll
take no de- nial
We must have
the trial
For really this morn-
ing I've nothing
to do.' Said
the mouse to
the cur,
Graham Joyce was sitting in one of the
sunloungers. He leaned forward and gave Tim a
firm handshake. 'Tim, greetings and salutations.'
For a man in his eighties he retained a
remarkably vigorous air, possessing a gaunt face
that genoprotein treatments had never quite
managed to soften and a shock of unruly
snow-white hair. His voice was like a forceful
foghorn.
7. What is a diagram? We are using:
- Segmentable
- Feature-based
- A common sense filter
- This isn't a diagram
8. What is a diagram? These are ... [example diagrams shown on the slide]
9. What is a diagram? ... and these are ... [further example diagrams shown on the slide]
10. Traditional take on diagrams
- Often treated as formal "visual" languages
- So they're expected to be parsable
- grammatical
- correct
- complete
11. But real diagrams aren't formal
- they're not always grammatical
- they're often incomplete
- they're often incorrect
- they are not always parsable
- (especially when drawn by students!)
12. Interesting question: what if we treat diagrams in the same way that we treat text?
13. Text and diagram: a simple correspondence (a sketch follows this list)
- Characters/punctuation - segments
- Words - features
- Phrases - "minimal meaningful units" (MMUs)
- Sentences - MMU aggregations
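To make this correspondence concrete, here is a minimal sketch of how a diagram might be represented under that reading. The class names and the feature vocabulary are our own illustrative assumptions, not the project's actual data model.

```python
# Minimal sketch of the text/diagram correspondence above; the names and the
# feature vocabulary are illustrative assumptions, not the project's model.
from dataclasses import dataclass, field

@dataclass
class Segment:                 # analogous to characters/punctuation
    shape: str                 # e.g. "box", "line", "text"
    label: str = ""

@dataclass
class Feature:                 # analogous to a word
    kind: str                  # e.g. "entity", "relationship", "name"
    value: str

@dataclass
class MMU:                     # "minimal meaningful unit", analogous to a phrase
    features: list = field(default_factory=list)

@dataclass
class Diagram:                 # an aggregation of MMUs, analogous to sentences
    mmus: list = field(default_factory=list)

# An entity box labelled "Member" might segment into a box plus a text label,
# and be recognised as one MMU carrying a single entity feature:
member_box = [Segment("box"), Segment("text", "Member")]
member_mmu = MMU([Feature("entity", "Member")])
diagram = Diagram([member_mmu])
```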
14. Natural language
- A grammar is an approximation to actual language use
- Pragmatic rather than correct/complete
- Do we even need a grammar?
15. Sub-languages
- Specific grammars for specific domains
- Stylistic conventions
- textbooks
- novels
- poetry
- instruction manual
- Interpretation is domain specific
- No "universal" solution
16. Research question: what if we process diagrams the same way we process text?
- Bag of words
- Syntactic
- Semantic
- Statistical analysis
- (a bag-of-features comparison is sketched below)
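As one possible answer for the first item, a bag-of-words treatment of a diagram can simply count its extracted features and measure overlap with the features of a model answer. The sketch below is our own illustration; the feature-string format is hypothetical, and a real marker would also weight features and handle near-matches.

```python
# Illustrative "bag of features" comparison between a student diagram and a
# model answer; the feature strings are hypothetical, not the project's format.
from collections import Counter

def overlap_score(student_features, model_features):
    """Fraction of the model answer's features found in the student's diagram."""
    student_bag = Counter(student_features)
    model_bag = Counter(model_features)
    matched = sum(min(student_bag[f], n) for f, n in model_bag.items())
    return matched / sum(model_bag.values())

model = ["entity:Member", "entity:Book", "entity:Copy",
         "rel:Borrows(Member,Copy,1:m)"]
student = ["entity:Member", "entity:Book", "entity:Copy",
           "rel:Borrows(Member,Copy,m:n)"]     # wrong degree on the relationship
print(overlap_score(student, model))           # 0.75: three of four features match
```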
17. Our take on diagrams
- Diagrams
- are intended to carry meaning, in a given domain
- domain limited
- user is domain aware
- they have an interpretation
- use domain specific diagram notations
- domain expert can assess correct use of notation
- both well-formed and ill-formed (imprecise)
- domain expert can assess "correctness" and "understandability" of a diagram
- domain expert can interpret and correct incomplete and incorrect diagrams (sometimes!)
18. What does this represent?
- A diagram without domain context
- non-expert cannot interpret
19. Diagram interpretation needs context and domain knowledge
- Diagram with domain context
- non-expert cannot interpret (only informed guessing)
20. Domain expert does not need correct, complete diagrams
- Diagram with domain context
- Expert can interpret, criticize and correct.
21. Could supply a notation description
- but that doesn't equate to a regular user's knowledge of the domain
- it helps decide if a diagram is properly drawn, not if it is meaningful.
22. Our larger investigation
- If we attempt to process diagrams in ways comparable to the ways we process formal, natural and sub-language texts:
- can we do useful things with diagrams?
- Things such as automated assessment?
23. Automated assessment
24. Automated assessment
- Coursework and Examinations
- Self-assessment and revision support
- Grade and automated feedback
- Grading alone is not sufficient
- Directed, appropriate, focused feedback is a requirement
- (Multiple choice - not our concern)
25. Successful automated assessment
- Textual assessment (essay and short text; a small scoring sketch follows this list)
- bag-of-words
- bag-of-phrases
- sequences (ordered-bag-of-words/phrases)
- syntactic structure
- abstracting and comparison (semantic-syntactic)
- semantic analysis
- Diagram assessment
- restricted choice and "slot filling"
- multiple choice
- "Free" diagram assessment has not been
successfully achieved
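To show the difference between the first of those textual techniques on a short answer, the sketch below scores a bag-of-words overlap (order ignored) against a bag-of-phrases (bigram) overlap (order sensitive). The scoring function is our own simplification, not the engine used in any of these systems.

```python
# Bag-of-words versus bag-of-phrases (bigram) scoring of a short text answer;
# a deliberately simplified illustration, not an actual marking engine.
from collections import Counter

def bag_of_words(text):
    return Counter(text.lower().split())

def bag_of_phrases(text, n=2):
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def similarity(answer, reference, bag_fn):
    a, r = bag_fn(answer), bag_fn(reference)
    matched = sum(min(a[k], count) for k, count in r.items())
    return matched / max(1, sum(r.values()))

reference = "each member may borrow at most one copy"
answer = "a member may borrow one copy at most"
print(similarity(answer, reference, bag_of_words))    # high: most words are shared
print(similarity(answer, reference, bag_of_phrases))  # lower: the word order differs
```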
26. What if we assess diagrams the same way that we assess text?
- What are the diagram assessment equivalents to:
- bag-of-words
- bag-of-phrases
- sequences
- abstracting and comparison
- syntactic structure
- semantic analysis
- Can we achieve automated assessment of diagrams comparable to that achieved by a human marker?
- Can we provide focused feedback comparable to a human tutor?
27. Our initial experiment with ERDs
28. Feasibility experiment pipelines
- Approach comparable to bag-of-words
- Results (13 answers):
- Human: mean 2.78, std dev 1.05
- Tool: mean 2.73, std dev 1.09
- Pearson correlation coefficient 0.75 (significant at the 0.01 level, two-tailed), N = 13
- (the correlation computation is sketched below)
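For context on how a figure like that is obtained, the sketch below computes a Pearson coefficient from paired human and tool marks. The mark lists are invented for illustration only; they are not the experiment's data.

```python
# Computing a Pearson correlation between human and tool marks; the mark lists
# below are invented for illustration and are not the experiment's data.
from statistics import mean, stdev

def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return covariance / ((len(xs) - 1) * stdev(xs) * stdev(ys))

human_marks = [4, 3, 2, 3, 1, 4, 2, 3, 3, 2, 4, 1, 3]   # hypothetical, N = 13
tool_marks  = [4, 3, 2, 2, 1, 4, 3, 3, 3, 2, 4, 2, 3]
print(round(pearson(human_marks, tool_marks), 2))
```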
29. Why entity relationship diagrams?
- Scope: from right/wrong to interpretable
- Range: from small to large
- Range: from simple to complex
- Correctness: from notation to meaning
- Question, sample solution, marking guide (familiarity)
- Aggregations: m:n decomposition, relationship signatures, ...
30. The question
- Give an E-R diagram that corresponds to the relational model given. [25 marks]
- model BookGroup
- relation Member
  - Number: MemberNumbers
  - Name: PeopleNames
  - Address: Addresses
  - IntroducedBy: MemberNumbers
  - BorrowedBook: ISBNs
  - BorrowedCopy: CopyNumbers
- primary key Number
- alternate key (BorrowedBook, BorrowedCopy)
- allowed null
- relationship Introduces: foreign key IntroducedBy references Member
- not allowed null
- relationship Borrows: foreign key (BorrowedBook, BorrowedCopy) references Copy
- <several relations omitted>
31. Solution and marking scheme
[solution ERD not reproduced]
Marking scheme (a code sketch of the per-relationship breakdown follows):
- 1 mark for all three entities (zero if any more or fewer than three are shown)
- 6 marks for each relationship (6 x 4 = 24 marks), broken down as:
- 1 mark for the naming used in the relational model comments
- 1 mark for the relationship being between the right entity types
- 2 marks for the degree (1:1 or 1:m as per the above figure; zero marks if incorrect)
- 1 mark for each participation condition correctly shown
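A scheme like this maps quite directly onto comparing extracted diagram features against a model answer. The sketch below encodes the per-relationship breakdown over a hypothetical relationship representation; the class and field names are assumptions made for illustration, not the tool's internals.

```python
# Sketch of the per-relationship marking breakdown above, applied to a
# hypothetical representation of relationships extracted from a student ERD.
from dataclasses import dataclass

@dataclass(frozen=True)
class Rel:
    name: str             # e.g. "Borrows"
    ends: tuple           # entity types at each end, in a canonical order
    degree: str           # "1:1" or "1:m"
    participation: tuple  # participation condition at each end, e.g. ("optional", "mandatory")

def mark_relationship(student: Rel, model: Rel) -> int:
    marks = 0
    marks += 1 if student.name == model.name else 0        # 1 mark: naming
    marks += 1 if student.ends == model.ends else 0        # 1 mark: right entity types
    marks += 2 if student.degree == model.degree else 0    # 2 marks: degree, all or nothing
    marks += sum(s == m for s, m in                        # 1 mark per participation condition
                 zip(student.participation, model.participation))
    return marks                                           # at most 6 per relationship

def mark_entities(student_entities, model_entities) -> int:
    # 1 mark only if exactly the expected entities appear, zero otherwise
    return 1 if set(student_entities) == set(model_entities) else 0
```

With the entity mark added, a fully correct answer scores 1 + 4 x 6 = 25, matching the total implied by the question.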
32. On the risks of using a drawing tool
- Slot filling?
- Prompting?
- No segmentation or feature extraction?
- Drawing "correct" diagrams because the tool enforces correctness?
33. First results
- 21 human-marked answers
- Human: mean 21.29, std dev 3.757
- Tool: mean 22.24, std dev 2.508
- Spearman's rho correlation coefficient 0.957 (significant at the 0.01 level, two-tailed), N = 21
- Pearson's correlation coefficient (significant at the 0.01 level, two-tailed), N = 21
34. Simplistic? Yes - but ...
- First step in our assessment of diagrams as text - comparable to bag-of-phrases processing
- The pipeline experiment was bag-of-words
- Essentially uses the same algorithm as the marking of short-answer texts
- Gives us a baseline when considering adding aggregation etc.
- We are also aware of ...
- the need to investigate how we will express complex marking schemes (if we need them)
- the above assessment is not dependent on aggregation or interpretation
35. Where next?
- Take what we have, add feedback, and we have a revision support tool.
- More complex marking schemes, including alternative solutions.
- Include aggregation and abstraction.
- ERD questions with scope for interpretation: scenario-based rather than translation-based.
36. DEAP: Diagrammatic Electronic Assessment Project