Title: Internet Document and Knowledge Management
1Personalized Automatic Quiz Generation for
Learners on the Web
Meng Chang Chen
Institute of Information Science, Academia Sinica
2Why Ubiquitous Networked Language Learning?
- People hang on to (if not glued to) the Internet.
- Why learn only from prepared language learning materials?
- A learner can learn a language while surfing the Internet or studying materials in their profession.
- Intelligent assistance can help learning.
3Why Personalized?
- Each learner has their own interests.
- Each learner learns at their own pace and progress.
- An individual personalized learning plan, especially one based on previous learning history and performance, helps achieve each learning goal.
4What is Auto-Quiz?
- Auto-quiz automatically prepares questionnaires on the learning materials the learner is reading.
- After the learner gives answers, the auto-quiz system grades the answers and gives necessary help.
- It saves the quiz results for future individual learning plans and questionnaire preparation.
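The grade-and-save cycle described above can be sketched as follows. This is only a minimal illustration: the `Question` and `LearnerProfile` classes, their fields, and the `grade` function are hypothetical names introduced here, not part of the actual system.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    prompt: str
    choices: list[str]
    answer: str   # the correct choice
    target: str   # the vocabulary item the question examines


@dataclass
class LearnerProfile:
    # Hypothetical profile layout: words answered correctly, and an error
    # count per word answered incorrectly.
    familiar: set[str] = field(default_factory=set)
    erred: dict[str, int] = field(default_factory=dict)


def grade(quiz, answers, profile):
    """Grade the answers, record the result in the profile, return the score."""
    correct = 0
    for question, given in zip(quiz, answers):
        if given == question.answer:
            correct += 1
            profile.familiar.add(question.target)
        else:
            profile.erred[question.target] = profile.erred.get(question.target, 0) + 1
            profile.familiar.discard(question.target)
    return correct / len(quiz) if quiz else 0.0


quiz = [
    Question("'hot' can be replaced with", ["popular", "cold"], "popular", "hot"),
    Question("the job here is", ["uneasy", "difficult"], "difficult", "easy"),
]
profile = LearnerProfile()
score = grade(quiz, ["popular", "uneasy"], profile)
print(score, profile.familiar, profile.erred)
```

The saved `erred` counts are the kind of history a later quiz could draw on when choosing which vocabulary to examine again.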
5Auto Quiz for Ubiquitous Learning
- A quiz on the browsed text can help users realize their progress in language learning and understand their shortfalls. It also allows the system to track the reader's learning status.
- A quiz can be based on
- The learner's reading comprehension
- The learner's learning history and given language learning agenda (vocabulary, grammar, sentence structure).
- The learner's learning plan (presumably generated automatically)
6User Learning History
- Accessed documents
- Learning behaviors
- Quiz results
USER Learning from Web Pages
- Data abstraction
- Data mining
- Learner performance evaluation
- Learner profile construction
- Learning plan construction
7Learning Personalization Scheme
[Figure: learning personalization scheme]
Components: Material Recommender, Auto Quiz Generator, Web Page Analyzer, Interesting Web Page Collector, Agent, Vocabulary Helper, Essay Writing Environment
Resources: WordNet, Common Senses, Vocabulary Corpus, Web Pages over Internet
Personal Profile: Learner English Achievement; Vocabularies, Familiar and Erred; Learner Erring History; Other Performance Abstractions
Flow labels:
- According to the learner's English capability, select successive learning material
- According to the learner's English capability, generate a quiz
- Analyze this web page
- Update learner profile using the learner's quiz results
- Open a web page / Open a recommended web page
- Interesting web page
- Mark as unfamiliar vocabulary
- Update the learner's English capability
- Vocabulary question? / Information of vocabulary / Answer
8AUTO QUIZ SCHEME
[Figure: auto quiz scheme]
Personal Profile: Learner Erring History, Learner English Achievement, Other Performance Abstractions
Modules: Quiz Generator, Template Matcher, Question Type Selector, WSD, Semantic Network Generator, Semantic Network Enhancer
Resources: WordNet, Common Senses
Data: Quiz Results, Quiz Difficulty Weighting, learner's answers, Enhanced Semantic Network
Output: a quiz
Values shown in the figure: 8.2, 0, 0, 6, 3.2, 2.5, 0, 10, 6.5, 4.7, 0, 7.4, 0, 0, 1.1
9Semantic Network Representation of a Reading Text
- A semantic net can represent the text by identifying the players, actions, attributes, and relationships between them.
- Iterations of processes can be applied to refine a semantic net to grasp the implicit and explicit semantics of the text.
10Semantic Network
- A text can be represented by a semantic network.
11Semantic Network Construction
- Parse text to generate links
- Link grammar parser
- Link objects under text semantics
- Pronoun referent
- WSD resolution
- Generate auxiliary links
- WordNet
- Common sense reasoning
- Prune redundant/insignificant links
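The four construction steps can be illustrated with a very small labelled graph. The sketch below is an assumption-laden toy: the `SemanticNet` class, the relation names (`agent`, `theme`, `is_a`, `tense`), and the hand-written links stand in for what the link grammar parser, pronoun resolution, WordNet, and common sense reasoning would supply.

```python
from collections import defaultdict


class SemanticNet:
    """A tiny labelled graph: source node -> set of (relation, target)."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add(self, source, relation, target):
        self.edges[source].add((relation, target))

    def replace_node(self, old, new):
        """Redirect every link from or to `old` (e.g. a pronoun) to `new`."""
        rebuilt = defaultdict(set)
        for source, links in self.edges.items():
            src = new if source == old else source
            for relation, target in links:
                rebuilt[src].add((relation, new if target == old else target))
        self.edges = rebuilt

    def prune(self, drop_relations):
        """Remove links whose relation is considered insignificant."""
        for source in list(self.edges):
            kept = {(r, t) for r, t in self.edges[source] if r not in drop_relations}
            if kept:
                self.edges[source] = kept
            else:
                del self.edges[source]

    def triples(self):
        return sorted((s, r, t) for s, links in self.edges.items() for r, t in links)


net = SemanticNet()
# 1. Links a parser might produce for "Bill killed John. He is buried."
net.add("kill", "agent", "Bill")
net.add("kill", "theme", "John")
net.add("kill", "tense", "past")
net.add("bury", "theme", "he")
# 2. Link objects under text semantics: resolve the pronoun referent.
net.replace_node("he", "John")
# 3. Auxiliary link (hand-written here; WordNet would be the real source).
net.add("John", "is_a", "person")
# 4. Prune links treated as insignificant for quiz generation.
net.prune({"tense"})
print(net.triples())
```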
12Connector of Word Block
- Link Grammar Parser from CMU
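[Figure: the words "The", "dog", and "run" drawn as blocks with connectors (D-, S-); matching connectors join into a D link between "The" and "dog" and an S link between "dog" and "run".]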
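The idea of word blocks that snap together through matching connectors can be sketched as below. This is a deliberately simplified illustration, not the Link Grammar Parser's algorithm: the three-word `DICTIONARY` and the nearest-match strategy in `find_links` are assumptions made for the example.

```python
# Hypothetical mini-dictionary: connectors each word needs on its left
# and offers on its right.
DICTIONARY = {
    "the": {"left": [], "right": ["D"]},
    "dog": {"left": ["D"], "right": ["S"]},
    "run": {"left": ["S"], "right": []},
}


def find_links(words):
    """Join each left-pointing connector to the nearest open match on its left."""
    links = []
    open_right = []  # (word index, connector) still waiting for a partner
    for j, word in enumerate(words):
        entry = DICTIONARY[word.lower()]
        for needed in entry["left"]:
            for k in range(len(open_right) - 1, -1, -1):
                i, offered = open_right[k]
                if offered == needed:
                    links.append((needed, words[i], word))
                    del open_right[k]
                    break
        open_right.extend((j, connector) for connector in entry["right"])
    return links


print(find_links(["The", "dog", "run"]))
# [('D', 'The', 'dog'), ('S', 'dog', 'run')]
```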
13Linkage Structure of Sentence
- An agent does some action with a feeling.
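[Figure: linkage of "John broke the window angrily". Links: Wd (LEFT-WALL, John), S (John, broke), Os (broke, window), D (the, window), MVa (broke, angrily). Semantic roles around the action "break": Agent = John, Theme = window, Feeling = angrily.]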
14Linkage Structure of Sentence
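- An agent does some action with an instrument.
[Figure: linkage of "John broke the window with the hammer". Links: Wd (LEFT-WALL, John), S (John, broke), Os (broke, window), D (the, window), MVp (broke, with), Js (with, hammer), D (the, hammer). Semantic roles around the action "break" (Past): Agent = John, Theme = window, Instr = hammer.]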
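The two linkage slides suggest a mapping from link labels to semantic roles. A minimal sketch of such a mapping follows; the rules in `roles_from_links` (S gives Agent and Action, O gives Theme, MVa gives Feeling, MVp plus J under "with" gives Instr) are read off these two examples only and are assumptions, not the system's actual rule set.

```python
def roles_from_links(links):
    """links: (label, left_word, right_word) tuples, as in the linkage diagrams."""
    roles = {}
    verb_prepositions = set()
    for label, left, right in links:
        if label.startswith("S"):
            roles["Agent"] = left
            roles["Action"] = right
        elif label.startswith("O"):
            roles["Theme"] = right
        elif label == "MVa":
            roles["Feeling"] = right
        elif label == "MVp":
            verb_prepositions.add(right)
    for label, left, right in links:
        # A "with" phrase attached to the verb is read as an instrument.
        if label.startswith("J") and left in verb_prepositions and left == "with":
            roles["Instr"] = right
    return roles


hammer_links = [
    ("Wd", "LEFT-WALL", "John"),
    ("S", "John", "broke"),
    ("Os", "broke", "window"),
    ("D", "the", "window"),
    ("MVp", "broke", "with"),
    ("Js", "with", "hammer"),
    ("D", "the", "hammer"),
]
print(roles_from_links(hammer_links))
```

A rule table like this breaks down on PP attachment ambiguity ("with the hammer" could modify "window"), which is one of the challenges the deck lists for semantic network construction.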
15Challenges in Semantic Network Construction
- Language inherent problems
- Pronoun referent
- PP attachment
- WSD
- Logic Inference
- Common sense reasoning
- Logical understanding of temporal and spatial (and other common sense) relationships between events or objects.
- Background Knowledge
16Anaphora resolution
- In the "pointing back" relation,
- The entity that cannot identify its role by itself in the document is called an anaphor.
- The entity that represents the role of the anaphor is the antecedent of the anaphor.
- The process of determining the antecedent of an anaphor is called anaphora resolution.
17Types of anaphora
- Three most widespread types
- Pronominal anaphora
- Definite noun phrase anaphora
- One-anaphora
- Subtypes based on the location of the antecedent
- Intrasentential anaphor
- Intersentential anaphor
18Pronominal anaphora
- The most widespread type
- Is realized by anaphoric pronouns
- Example
- Computational linguists from many different countries attended the tutorial.
- They took extensive notes.
19Cyberterror impact, defense under scrutiny, by
Jon Swartz, USA TODAY
A terrorist threat is out there and not just
against physical structures. A coordinated
cyberattack against the USA could topple parts of
the Internet, silence communications and
commerce, and paralyze federal agencies and
businesses, government officials and security
experts warn. Such an attack could disrupt
millions of dollars in financial transactions,
hang up air traffic control systems, deny access
to emergency 911 services, shut down water
supplies and interrupt power supplies to millions
of homes, security experts say. But from whom the
attacks would come is unclear. Intelligence shows
al-Qaeda is more fixated on physical threats than
electronic ones, government officials and
cybersecurity experts say. "Al-Qaeda doesn't see
cyberterrorism as achieving significant military
goals," says James Adams, CEO of Ashland
Institute for Strategic Studies, a research
group. "They see the world in a rather
old-fashioned way, where bombings and shootings
have direct impact and scare people." That
hasn't dissuaded other groups and nations from
eyeing cyberterrorism as a means to damage the
USA, whose infrastructures are increasingly tied
to the Internet. "There are a large number of
threats: hackers, cybercriminals, other
countries," says Amit Yoran, director of the
Department of Homeland Security's National Cyber
Security Division. "It goes beyond al-Qaeda."
More than two dozen countries, including China
and Russia, have developed "asymmetrical warfare"
strategies targeting holes in U.S. computer
systems. Because of U.S. military firepower,
those countries see electronic warfare as their
best way to pierce U.S. defenses, military
experts say. Cyberattacks could come in the
form of distributed denial-of-service attacks, in
which hackers flood and disable Web sites with a
barrage of data, or from computer worms and
viruses, malicious computer programs spread over
the Internet to steal or destroy computer data.
Since the Sept. 11 attacks, government and
security experts have clamored for tougher laws
against hackers, more resources and closer
cooperation among agencies to thwart
attacks. Last year, the Bush administration
issued its strategy for shielding computer
systems from hackers and terrorists. It urged
Internet users to add anti-virus software and
firewalls to their computers, and companies to
routinely review their security plans. But
critics say the report, released by Homeland
Security, lacks government regulation and the
funding to have much impact. "It is a good
description of the problem, but doesn't put the
onus on the people who can fix it, such as the
software developers," says Alan Paller, director
of the Sans Institute, a computer
20Example of Semantic Network
21Some Question types
- Vocabulary meaning
- Cloze test
- Examines the skill of recognizing the meaning of a vocabulary item in a given text
- Player matching
- Cloze test
- Examines the skill of identifying the referent of an anaphor in a given text.
- Fact finding
- Finding a fact from a sentence
- Concatenating two sentences to derive a fact.
22System architecture
[Figure: system architecture]
Modules: pre-processor, generator for semantic net, generator for questions about vocabulary meaning, generator for questions about player matching, player matcher, question ranker
Data and resources: given text, learner profile, sense corpus, WordNet, Web corpus
Output: quiz
23The pre-processor
[Figure: the pre-processor, made up of a PoS tagger, a sentence splitter, and a link grammar parser]
- PoS tagger: marks the PoS tag of each vocabulary
- Link grammar parser: obtains the syntactic relationships between vocabularies
24Generator for question about vocabulary meaning
- The sentence that contains a quizzing vocabulary will form a question.
- A question can also be formed from the example of a quizzing vocabulary in WordNet.
- The sense corpus is the training data of the WSD component.
[Figure: generator for questions about vocabulary meaning]
Modules: vocabulary difficulty measurer, vocabulary filter, WSD, question generator
Data and resources: sentence set, vocabularies of sentences, learner profile, sense corpus, WordNet
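Forming a question from the sentence that contains the quizzing vocabulary can be sketched as a cloze item. The `make_cloze` function and the distractor words below are hypothetical; in the described system the distractors and the choice of target word would come from the vocabulary filter, WSD, and WordNet rather than being typed in by hand.

```python
import random
import re


def make_cloze(sentence, target, distractors, rng=None):
    """Blank out `target` in `sentence` and return a multiple-choice item."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the example is repeatable
    pattern = re.compile(r"\b%s\b" % re.escape(target), re.IGNORECASE)
    if not pattern.search(sentence):
        raise ValueError("target word does not occur in the sentence")
    stem = pattern.sub("_____", sentence, count=1)
    choices = [target] + list(distractors)
    rng.shuffle(choices)
    return {"stem": stem, "choices": choices, "answer": target}


item = make_cloze(
    "Learning English is not an easy job.",
    "easy",
    ["heavy", "late", "tall"],  # made-up distractors for illustration
)
print(item["stem"])
print(item["choices"])
```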
25Player matcher
[Figure: player matcher]
Modules: anaphor detector, semantic dependence detector, anaphora resolution
Data: semantic nets of anaphors, semantic nets of non-anaphors
Resources: WordNet, Web corpus
26Semantic Dependence Detection
- Example of the detection process
- Input text: "Bill killed John. He is buried."
- Semantic net of the anaphor (from the anaphor detector): "he is buried"
- Semantic nets of non-anaphors: "Bill kill", "John is killed"
- Modules: dependence pattern generator, pattern rationality checker (consulting the Web corpus)
- Result: "John is buried"
27Quiz Generator for Adjective Replacement
- For each adjective-noun pair extracted from a given sentence, a corresponding question with four choices is generated.
- Questions for adjectives can be divided into four categories: questions for collocations, questions for antonyms, questions for synonyms, and questions for similar words.
28System Architecture
29Generating Choice Candidates
- By consulting the WordNet database, we collect synonyms, antonyms, and similar words as the source of choice candidates.
- After generating a number of choice candidates, we filter out those that appear incorrect by using the Google search engine or another corpus (e.g. BNC).
30Determining the Correct Choice
- Consult Google or a corpus to obtain frequency counts
- Use both "adj n" and "n is adj" as query strings
- Expand the query by adding the title, first sentence, or important words of the document.
- Ex.: "But with the right methods and the right attitude, almost anyone can achieve some success."
- The adjectives with the 4 highest numbers of Google search results are "good method" (1,060,000), "proper method" (649,000), "suitable method" (572,000), and "correct method" (539,000); and "good attitude" (1,500,000), "proper attitude" (332,000), "wrong attitude" (215,000), and "correct attitude" (151,000).
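The frequency-based ranking can be sketched as below. The `count_hits` callable is a hypothetical stand-in for a search engine or corpus lookup (no real search API is used), and `rank_adjectives` is an illustrative name. The table holds only the "adj n" counts quoted on the slide; counts for the "n is adj" form are not given there, so the stand-in returns 0 for them.

```python
# Hit counts quoted on the slide for the "adj n" query form.
SLIDE_COUNTS = {
    "good method": 1_060_000,
    "proper method": 649_000,
    "suitable method": 572_000,
    "correct method": 539_000,
    "good attitude": 1_500_000,
    "proper attitude": 332_000,
    "wrong attitude": 215_000,
    "correct attitude": 151_000,
}


def count_hits(query):
    """Stand-in for a search engine or corpus query; unknown queries count 0."""
    return SLIDE_COUNTS.get(query, 0)


def rank_adjectives(noun, candidates, count_hits):
    """Order candidates by the summed hits of 'adj noun' and 'noun is adj'."""
    def score(adjective):
        return count_hits(f"{adjective} {noun}") + count_hits(f"{noun} is {adjective}")
    return sorted(candidates, key=score, reverse=True)


print(rank_adjectives("method", ["correct", "suitable", "good", "proper"], count_hits))
# ['good', 'proper', 'suitable', 'correct']
```

Passing `count_hits` as a parameter keeps the ranking independent of where the counts come from, so a corpus such as the BNC could be swapped in for the search engine.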
31Determining Incorrect Choices
- To avoid ambiguity caused by including choices that may not be incorrect, only one word from each sense is selected.
- For instance, the 5th synonym set of the adjective "easy" contains three words: easy, gentle, and soft. If "gentle" is selected as a choice, then "soft" will not be a choice in this question.
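The one-word-per-sense rule can be sketched as follows. The first synset is the one quoted on the slide; the second synset and the `pick_choices` function are illustrative assumptions, and real synsets would be read from WordNet.

```python
def pick_choices(synsets, exclude=()):
    """Pick at most one word per synset, skipping excluded or already chosen words."""
    chosen = []
    for synset in synsets:
        for word in synset:
            if word not in exclude and word not in chosen:
                chosen.append(word)
                break  # one word per sense, so move on to the next synset
    return chosen


synsets_of_easy = [
    ["easy", "gentle", "soft"],   # the synset quoted on the slide
    ["easy", "comfortable"],      # made-up second synset for illustration
]
print(pick_choices(synsets_of_easy, exclude={"easy"}))
# ['gentle', 'comfortable']
```

Because "gentle" is taken from the first synset, "soft" never becomes a choice in the same question.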
32Process Diagram
[Figure: process diagram for the randomly selected adjective "hot"]
Resources: WordNet, Google (consulted twice), document content, adjective pattern
Modules: usage checker, semantics checker, ranker
Data: synonyms from the 1st, 2nd, ..., nth synsets of "hot"; antonyms; applicable synonyms; inapplicable synonyms
Outputs: right options, wrong options
33Example Quiz for Adjective Replacement
- In the sentence "The research is hot.", the adjective "hot" can be replaced with
- popular
- sexy
- baking
- warm
- cold
34Quiz pattern for verb phrase
- In the sentence "He turns off the light.", the verb phrase "turn off" can be replaced with
- switch off
- throw
- flip
- repel
35More Examples of Questionnaire: Collocation
- In this sentence "In high school, I was crazy about English songs.", the adjective "high" can be replaced with
- low
- advanced
- broad
- none of the above
36More Examples of Questionnaire: Antonym
- In this sentence "Learning English is not an easy job.", the job here is
- uneasy
- casual
- effortless
- difficult
37More Examples of Questionnaire: Synonym
- In this sentence "Many students also have their own special methods for learning English.", the adjective "special" can be replaced with
- extra
- specialized
- unscheduled
- particular
38More Examples of Questionnaire: Similar Word
- In this sentence "Learning English is not an easy job.", the meaning of the adjective "easy" is similar to
- available
- comfortable
- light
- simple
39More Examples of Questionnaire: Term Understanding
- "ATMs have already been converted to read the chip-embedded cards."
- The ATM is
- (a) A unit of pressure
- (b) A means of digital communications that is capable of very high speeds
- (c) An unattended machine (outside some banks) that dispenses money when a personal coded card is used
40More Examples of Questionnaire: Comprehension via Anaphora Resolution
- "Give the bananas to the monkeys although they are not ripe, because they are hungry." Who are hungry?
- Monkeys
- Bananas
- A monkey is an animal that is likely to be hungry
- A banana is a fruit that is likely to be ripe
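The common sense step behind this question can be sketched as a lookup. The two small tables and the `resolve` function below are hypothetical stand-ins: a real system would draw category links from WordNet and plausibility facts from a common sense bank.

```python
# Hypothetical common sense bank: the kinds of things a property plausibly fits.
PLAUSIBLE_FOR = {
    "hungry": {"animal"},
    "ripe": {"fruit"},
}
# Hypothetical category links of the kind WordNet hypernyms would supply.
IS_A = {"monkeys": "animal", "bananas": "fruit"}


def resolve(property_word, candidates):
    """Return the candidate antecedents for which the property is plausible."""
    plausible = PLAUSIBLE_FOR.get(property_word, set())
    return [c for c in candidates if IS_A.get(c) in plausible]


candidates = ["bananas", "monkeys"]
print(resolve("hungry", candidates))  # ['monkeys']
print(resolve("ripe", candidates))    # ['bananas']
```

The resolved antecedent becomes the correct choice and the remaining candidate becomes the distractor.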
41Status of Auto-Quiz
- There are already some mechanisms to parse text into a semantic network
- But common sense reasoning is never complete
- But it lacks background knowledge of the text.
- Only shallow comprehension of text is achieved
- But is deep comprehension of a story needed for giving a quiz?
- Able to give various types of questionnaires
- But many more not
- Solitary job by IT people so far
- Needs help from different disciplines, such as language teaching, linguistics, pedagogy,
42Now What?
- What does auto-quiz really mean to language learning and education?
- How does auto-quiz help learners? To what extent? How to interpret auto-quiz results?
- What can learned users contribute?
- E.g. questionnaire patterns, personalization preferences, common sense bank?
- Yet another Web 2.0 practice?
43Q&A