Title: SCAFFOLDING INSTRUCTIONS TO LEARN PROCEDURES FROM USERS
1. SCAFFOLDING INSTRUCTIONS TO LEARN PROCEDURES FROM USERS
- Paul Groth and Yolanda Gil
- Information Sciences Institute
- University of Southern California
2. Learning Procedures Naturally
- Humans learn procedures using a variety of mechanisms
  - Observation, practice, reading textbooks
  - Human tutorial instruction
    - Broad descriptions of actions and explanations of their dependencies
- The computer is told what to do by the instructor.
- Goal: learn procedures from instruction that is natural to provide
3. What is Instruction by Telling?
- General statements
  - Do not refer to a specific state
- Descriptive statements
  - About types, functions, processes
- Examples
  - "To dial a number, lift the receiver and punch the number"
  - "Place a pot with water on the stove. Turn the stove on and wait until the water bubbles"
  - "A good hovering area is behind a tall building that is more than 200 ft away"
  - "A vehicle is parked if it is stopped on the side of the road or if it is stopped in a parking lot for more than 3 minutes"
4. Why is Instruction by Telling Important?
- Scaffidi et al 06: by 2012, 90M end user programmers in the US alone
  - 13M would describe themselves as programmers
  - 55M will use spreadsheets and databases
- Adams 08: we have gone from dozens of markets of millions of users to millions of markets of dozens of users
  - The long tail of programming (Anderson 08)
- Today most successful end user applications focus on data manipulation through spreadsheets and web forms
- We need approaches to allow end users to specify procedures to process data or to control a physical environment
  - With examples
  - By telling: a natural method for humans, needed if procedures are complex and hard to generalize from examples
5. Shortcomings of Human Instruction (Gil 09)
- Organization
- Omissions
- Structure
- Errors
- Student preparation
- Student ability
- Teacher skills
- "Procedures are complex relational structures and the mapping between these structures and a linear sequence of propositions expressed in discourse is not easy to define." (Donin et al 02)
6. TellMe: Learning Procedures by Being Told
- Developed four-stage process
  - Ingestion: create initial procedure stub from given instruction
  - Elaboration: map terms to existing knowledge, infer missing information using heuristics, create hypotheses of procedures
  - Elimination: rule out hypotheses through symbolic execution
  - Selection: select one procedure hypothesis using heuristics that maximize consistency
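The four stages can be read as a generate-and-filter pipeline. The Python sketch below is a minimal illustration under stated assumptions, not TellMe's implementation: the `Hypothesis` class, the candidate `until` conditions, the pass/fail execution check, and the toy consistency score are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Hypothesis:
    """One candidate interpretation of an instruction (hypothetical structure)."""
    steps: tuple[str, ...]
    until: Optional[str]  # None marks a gap the instructor left open


def ingest(instruction: list[str]) -> Hypothesis:
    """Ingestion: build an initial procedure stub and leave the gap annotated."""
    return Hypothesis(steps=tuple(instruction), until=None)


def elaborate(stub: Hypothesis, candidates: list[str]) -> list[Hypothesis]:
    """Elaboration: fill the gap with each candidate, one hypothesis per candidate."""
    return [Hypothesis(stub.steps, c) for c in candidates]


def eliminate(hyps: list[Hypothesis],
              executes_ok: Callable[[Hypothesis], bool]) -> list[Hypothesis]:
    """Elimination: drop hypotheses that fail the (stand-in) execution check."""
    return [h for h in hyps if executes_ok(h)]


def select(hyps: list[Hypothesis],
           score: Callable[[Hypothesis], float]) -> Hypothesis:
    """Selection: keep the best-scoring hypothesis (raises ValueError if none survive)."""
    return max(hyps, key=score)


def learn(instruction, candidates, executes_ok, score) -> Hypothesis:
    """Chain the four stages in order."""
    stub = ingest(instruction)
    return select(eliminate(elaborate(stub, candidates), executes_ok), score)


if __name__ == "__main__":
    best = learn(
        ["Deal", "Layout", "Decrement"],
        candidates=["numOfCards == 0", "deck == 0", "never"],  # invented options
        executes_ok=lambda h: h.until != "never",              # toy check
        score=lambda h: 1 if "numOfCards" in h.until else 0,   # toy score
    )
    print(best.until)  # prints: numOfCards == 0
```

Keeping each stage as a separate function makes it easy to swap in different heuristics per stage without touching the rest of the pipeline.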
7. An Example
- Instruction is ambiguous and incomplete
  - Which player?
  - What is closest? "Closest" could have meant closest sideline or closest player
  - Repeat until when?
<j.0:Procedure rdf:about="SetupKlondikeSolitaire">
  <j.0:hasSteps rdf:parseType="Collection">
    <j.0:Procedure rdf:about="Deal"/>
    <j.0:Procedure rdf:about="Layout"/>
    <j.0:Loop>
      <j.0:until rdf:resource="Unknown"/>
      <j.0:repeat rdf:parseType="Collection">
        <j.0:Procedure rdf:about="Decrement"/>
        <j.0:Procedure rdf:about="Deal"/>
        <j.0:Procedure rdf:about="Layout"/>
      </j.0:repeat>
    </j.0:Loop>
  </j.0:hasSteps>
  <j.0:hasInput rdf:resource="deck"/>
  <j.0:hasResult rdf:resource="solitaireGameSetup"/>
</j.0:Procedure>
2. start SetupKlondikeSolitaire hasId 8887
3. resultIs namesolitaireGameSetup isa typeGameSetup
4. initSetup namedeck isa typeCardDeck
5. doThis nameDeal basedOn deck, 7 expect namehand
6. namehand isa typeHand
7. doThis nameLayout basedOn hand
8. namenumOfCards isa typeInteger
9. namenumOfCards value7
10. repeat doThis nameDecrement basedOn numOfCards expect numOfCards
1. Create initial interpretations based on prior knowledge, annotate gaps
2. Elaborate using heuristics for filling gaps
  - Repeat until the left sideline, the right sideline, or the front line is reached
  - Repeat until teammate or opponent has ball
3. Eliminate through symbolic execution and reasoning (eliminated hypotheses are marked with an X)
4. Select based on heuristics that maximize consistency
8. Example Heuristics
- Ingestion
  - If a variable is assigned a constant in the instruction, then find a consistent basic type for it.
- Elaboration
  - If the input of a component (i.e., subtask) is type compatible with the result of a preceding component, then that result could be connected to the input.
  - If two variables share any typing information, they could be unified.
- Elimination
  - Hypotheses with matching symbolic execution traces can be considered to be the same.
- Selection
  - Pick the simplest hypothesis (with the fewest components).
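To make these heuristics concrete, here is a minimal Python sketch with one small function per heuristic. Everything in it is an assumption made for illustration: the `Component` class, the basic-type table, the use of plain equality as "type compatible", and the representation of a hypothesis as a tuple of component names. The result type given to `Layout` in the usage lines is likewise assumed.

```python
from dataclasses import dataclass

# Hypothetical mapping from Python constant types to basic type names.
BASIC_TYPES = {bool: "Boolean", int: "Integer", float: "Float", str: "String"}


def infer_basic_type(constant) -> str:
    """Ingestion: a variable assigned a constant gets a consistent basic type."""
    return BASIC_TYPES.get(type(constant), "Unknown")


@dataclass(frozen=True)
class Component:
    """A subtask with one input type and one result type (simplified)."""
    name: str
    input_type: str
    result_type: str


def connectable(earlier: Component, later: Component) -> bool:
    """Elaboration: a preceding result may feed a type-compatible input.
    Compatibility is reduced to equality here; a real system would
    presumably also consult a type hierarchy."""
    return earlier.result_type == later.input_type


def unifiable(types_a: set[str], types_b: set[str]) -> bool:
    """Elaboration: variables sharing any typing information could be unified."""
    return bool(types_a & types_b)


def dedupe_by_trace(hypotheses, trace_of):
    """Elimination: hypotheses with matching execution traces count as one.
    Keeps the first hypothesis seen for each distinct trace."""
    seen, kept = set(), []
    for hyp in hypotheses:
        trace = trace_of(hyp)
        if trace not in seen:
            seen.add(trace)
            kept.append(hyp)
    return kept


def simplest(hypotheses):
    """Selection: pick the hypothesis with the fewest components."""
    return min(hypotheses, key=len)


if __name__ == "__main__":
    deal = Component("Deal", "CardDeck", "Hand")
    layout = Component("Layout", "Hand", "GameSetup")  # result type assumed
    print(infer_basic_type(7))        # prints: Integer
    print(connectable(deal, layout))  # prints: True
```

Traces are passed in through `trace_of` so that the deduplication step stays independent of how symbolic execution is actually carried out.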
9. Instructions that TellMe Can Process
1 begin lesson
2 start GetOpen hasId 8888
3 repeat
// Find your closest opponent.
4 doThis nameGetCurrentPosition expect originalPosition
5 nameoriginalPosition isa typePosition
6 doThis nameFindClosestOpponent basedOn originalPosition expect opponentLocation
// Dash away from them
7 nameopponentLocation isa typePosition
8 doThis nameFaceAwayFrom basedOn opponentLocation
9 doThis nameDash expect currentPosition
10 namecurrentPosition isa typePosition
// Stop and move back to your previous position (e.g. cut back).
11 doThis nameMoveTowards
12 basedOn originalPosition expect currentPosition
// If you are not open, do this again.
13 until
14 nameOpen basedOn currentPosition
// Once you're open, find the ball and face it.
15 doThis nameFindTheBall expect ballLocation
16 nameballLocation isa typePosition
17 doThis nameFace basedOn ballLocation
18 end
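One way to read the repeat ... until portion of the GetOpen lesson is as a loop whose body runs at least once before the `Open` test is checked. The Python sketch below follows that reading under heavy assumptions: the field is one-dimensional, positions are integers, and every primitive (`dash`, `move_towards`, `is_open`, and so on) is a hypothetical stub with invented distances. It illustrates how `basedOn` names an input and `expect` names an output binding; it is not how TellMe executes procedures.

```python
# Hypothetical stubs on a one-dimensional field; env holds all bindings.

def get_current_position(env):
    """doThis GetCurrentPosition, expect originalPosition."""
    env["originalPosition"] = env["pos"]


def find_closest_opponent(env):
    """doThis FindClosestOpponent, basedOn originalPosition, expect opponentLocation."""
    origin = env["originalPosition"]
    env["opponentLocation"] = min(env["opponents"], key=lambda o: abs(o - origin))


def face_away_from(env):
    """doThis FaceAwayFrom, basedOn opponentLocation."""
    env["heading"] = -1 if env["opponentLocation"] > env["pos"] else 1


def dash(env):
    """doThis Dash, expect currentPosition (dash length of 3 is invented)."""
    env["pos"] += 3 * env["heading"]
    env["currentPosition"] = env["pos"]


def move_towards(env):
    """doThis MoveTowards, basedOn originalPosition, expect currentPosition."""
    target = env["originalPosition"]
    env["pos"] += 1 if target > env["pos"] else -1 if target < env["pos"] else 0
    env["currentPosition"] = env["pos"]


def is_open(env):
    """until Open, basedOn currentPosition (threshold of 4 is invented)."""
    here = env["currentPosition"]
    return min(abs(o - here) for o in env["opponents"]) >= 4


BODY = [get_current_position, find_closest_opponent, face_away_from, dash, move_towards]


def run_repeat_until(body, until, env, max_iters=100):
    """Run every step of the body, then test the until-condition; repeat while
    it fails. Returns the number of iterations that were needed."""
    for iteration in range(1, max_iters + 1):
        for step in body:
            step(env)
        if until(env):
            return iteration
    raise RuntimeError("until-condition never held")


if __name__ == "__main__":
    env = {"pos": 0, "opponents": [1]}
    print(run_repeat_until(BODY, is_open, env), env["currentPosition"])  # prints: 2 -4
```

The `max_iters` guard is there because an instructed loop may have an until-condition that never becomes true, which is exactly the kind of hypothesis an execution-based check would need to reject.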
11. Example of Procedure Learned by TellMe
12. Applying the Framework
- A Scientific Workflow Construction Command Line (Groth & Gil, IUI-09)
- Real natural language descriptions of procedures
  - Protocols in GenePattern (Reich et al 08)
  - Workflows in MyExperiment (DeRoure et al 09)
- Example used
  - "This workflow performs data cleansing on genes, clusters the results, and then displays a heatmap."
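As a rough illustration of turning such a description into a workflow skeleton, the Python sketch below splits the example sentence into clauses and matches each clause against a small keyword table. The component names (`DataCleansing`, `Clustering`, `HeatMapViewer`) and the table itself are hypothetical, and plain keyword matching is used here only as a stand-in; it is not the method of the cited command-line system.

```python
import re

# Hypothetical catalog: keyword found in a clause -> workflow component name.
CATALOG = {
    "cleansing": "DataCleansing",
    "clusters": "Clustering",
    "heatmap": "HeatMapViewer",
}


def clauses(description: str) -> list[str]:
    """Split a one-sentence description on commas, dropping 'and (then)'."""
    text = description.strip().rstrip(".")
    parts = re.split(r",\s*(?:and then\s+|and\s+)?", text)
    return [p.strip() for p in parts if p.strip()]


def to_workflow(description: str) -> list:
    """Map each clause to a component; None marks a gap left for elaboration."""
    steps = []
    for clause in clauses(description):
        lowered = clause.lower()
        match = next((comp for kw, comp in CATALOG.items() if kw in lowered), None)
        steps.append(match)
    return steps


if __name__ == "__main__":
    sentence = ("This workflow performs data cleansing on genes, "
                "clusters the results, and then displays a heatmap.")
    print(to_workflow(sentence))
    # prints: ['DataCleansing', 'Clustering', 'HeatMapViewer']
```

Clauses with no match are kept as `None` rather than dropped, so the gap stays visible for later stages instead of silently shortening the workflow.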
13. Conclusion
- TellMe provides a framework for addressing learning from instruction given by humans
- The approach can be applied to different domains
- Future work includes
  - More and better heuristics
  - Dealing with more sophisticated language constructs
  - Towards natural language input