Title: Learning Language from its Perceptual Context
1. Learning Language from its Perceptual Context
- Ray Mooney
- Department of Computer Sciences
- University of Texas at Austin
- Joint work with David Chen, Rohit Kate, and Yuk Wah Wong
2. Current State of Natural Language Learning
- Most current state-of-the-art NLP systems are constructed by training on large supervised corpora:
  - Syntactic Parsing: Penn Treebank
  - Word Sense Disambiguation: SenseEval
  - Semantic Role Labeling: PropBank
  - Machine Translation: Hansards corpus
- Constructing such annotated corpora is difficult, expensive, and time consuming.
3. Semantic Parsing
- A semantic parser maps a natural-language sentence to a complete, detailed semantic representation: a logical form or meaning representation (MR).
- For many applications, the desired output is immediately executable by another program.
- Two application domains:
  - GeoQuery: a database query application
  - CLang: the RoboCup coach language
4. GeoQuery: A Database Query Application
- Query application for a U.S. geography database (Zelle & Mooney, 1996)
[Diagram: natural-language query passed through the semantic parser to the database]
5. CLang: RoboCup Coach Language
- In the RoboCup Coach competition, teams compete to coach simulated soccer players.
- The coaching instructions are given in a formal language called CLang.
[Figure: simulated soccer field]
6. Learning Semantic Parsers
- Manually programming robust semantic parsers is difficult due to the complexity of the task.
- Semantic parsers can be learned automatically from sentences paired with their logical form.
[Diagram: NL→MR training examples feed a learner, which produces a semantic parser mapping sentences to meaning representations]
7. Our Semantic-Parser Learners
- CHILL + WOLFIE (Zelle & Mooney, 1996; Thompson & Mooney, 1999, 2003)
  - Separates parser learning and semantic-lexicon learning.
  - Learns a deterministic parser using ILP techniques.
- COCKTAIL (Tang & Mooney, 2001)
  - Improved ILP algorithm for CHILL.
- SILT (Kate, Wong & Mooney, 2005)
  - Learns symbolic transformation rules for mapping directly from NL to MR.
- SCISSOR (Ge & Mooney, 2005)
  - Integrates semantic interpretation into Collins' statistical syntactic parser.
- WASP (Wong & Mooney, 2006, 2007)
  - Uses syntax-based statistical machine translation methods.
- KRISP (Kate & Mooney, 2006)
  - Uses a series of SVM classifiers employing a string kernel to iteratively build semantic representations.
8. WASP: A Machine Translation Approach to Semantic Parsing
- Uses statistical machine translation techniques:
  - Synchronous context-free grammars (SCFG) (Wu, 1997; Melamed, 2004; Chiang, 2005)
  - Word alignments (Brown et al., 1993; Och & Ney, 2003)
- Hence the name: Word Alignment-based Semantic Parsing
9-13. A Unifying Framework for Parsing and Generation
[Diagram, built up across five slides: machine translation maps natural languages to natural languages; semantic parsing maps natural languages to formal languages; tactical generation maps formal languages to natural languages; compiling (Aho & Ullman, 1972) maps formal languages to formal languages; synchronous parsing subsumes all four.]
14. Synchronous Context-Free Grammars (SCFG)
- Developed by Aho & Ullman (1972) as a theory of compilers that combines syntax analysis and code generation in a single phase.
- Generates a pair of strings in a single derivation.
15. Synchronous Context-Free Grammar: Production Rule
Natural language / formal language:
QUERY → What is CITY / answer(CITY)
16. Synchronous Context-Free Grammar: Derivation
[Paired derivation trees rooted at QUERY, yielding simultaneously:]
What is the capital of Ohio
answer(capital(loc_2(stateid('ohio'))))
using, e.g., STATE → Ohio / stateid('ohio')
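To make the paired derivation concrete, here is a minimal Python sketch of a synchronous derivation: each nonterminal is rewritten simultaneously on the NL side and the MR side, so one derivation yields both strings. The tiny GeoQuery-style rule set and the helper function are illustrative, not WASP's actual grammar or code.

    # Minimal SCFG sketch: each nonterminal maps to a pair of right-hand
    # sides sharing nonterminals, so one derivation yields both strings.
    RULES = {
        "QUERY": ("what is CITY", "answer(CITY)"),
        "CITY":  ("the capital of STATE", "capital(loc_2(STATE))"),
        "STATE": ("ohio", "stateid('ohio')"),
    }

    def derive(symbol):
        """Expand `symbol`, rewriting shared nonterminals in both outputs."""
        nl, mr = RULES[symbol]
        for nt in RULES:
            if nt in nl:                      # same nonterminal on both sides
                sub_nl, sub_mr = derive(nt)
                nl = nl.replace(nt, sub_nl, 1)
                mr = mr.replace(nt, sub_mr, 1)
        return nl, mr

    print(derive("QUERY"))
    # ('what is the capital of ohio', "answer(capital(loc_2(stateid('ohio'))))")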
17. Probabilistic Parsing Model
[Derivation d1 for "capital of Ohio": CITY → capital CITY / capital(CITY); CITY → of STATE / loc_2(STATE); STATE → Ohio / stateid('ohio')]
18. Probabilistic Parsing Model
[Derivation d2 for "capital of Ohio": identical except for the river reading: CITY → of RIVER / loc_2(RIVER); RIVER → Ohio / riverid('ohio')]
19. Probabilistic Parsing Model
[Derivations d1 and d2 shown side by side, with a weight λ attached to each rule:]
- d1: CITY → capital CITY / capital(CITY) (λ = 0.5); CITY → of STATE / loc_2(STATE) (λ = 0.3); STATE → Ohio / stateid('ohio') (λ = 0.5); total score 1.3
- d2: CITY → capital CITY / capital(CITY) (λ = 0.5); CITY → of RIVER / loc_2(RIVER) (λ = 0.05); RIVER → Ohio / riverid('ohio') (λ = 0.5); total score 1.05
Pr(d1 | capital of Ohio) = exp(1.3) / Z
Pr(d2 | capital of Ohio) = exp(1.05) / Z
where Z is the normalization constant.
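Under this log-linear model, each derivation's score is the sum of its rule weights, and Z normalizes over competing derivations of the same sentence. A quick check of the numbers above (assuming, for illustration, that d1 and d2 are the only derivations):

    import math

    score_d1 = 0.5 + 0.3 + 0.5    # rules of d1 (STATE reading)
    score_d2 = 0.5 + 0.05 + 0.5   # rules of d2 (RIVER reading)

    # Z sums exp(score) over all derivations; here just these two.
    Z = math.exp(score_d1) + math.exp(score_d2)
    print(math.exp(score_d1) / Z)  # Pr(d1 | "capital of Ohio") ~ 0.56
    print(math.exp(score_d2) / Z)  # Pr(d2 | "capital of Ohio") ~ 0.44

So the more plausible STATE reading wins, as intended.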
20. Overview of WASP
Training: an unambiguous CFG of the MRL, together with the training set of (e, f) pairs, feeds lexical acquisition, producing a lexicon L (an SCFG); parameter estimation then yields an SCFG parameterized by λ.
Testing: semantic parsing maps an input sentence e' to an output MR f'.
21. Tactical Generation
- Can be seen as the inverse of semantic parsing.
Example: semantic parsing maps "The goalie should always stay in our half" to the CLang expression ((true) (do our 1 (pos (half our)))); tactical generation maps the expression back to the sentence.
22. Generation by Inverting WASP
- The same synchronous grammar is used for both generation and semantic parsing.
[Diagram: NL and MRL linked by semantic parsing in one direction and tactical generation in the other]
QUERY → What is CITY / answer(CITY)
23. Learning Language from Perceptual Context
- Children do not learn language from annotated corpora.
- Neither do they learn language from just reading the newspaper, surfing the web, or listening to the radio.
- The natural way to learn language is to perceive language in the context of its use in the physical and social world.
- This requires inferring the meaning of utterances from their perceptual context.
24. Language Grounding
- The meanings of many words are grounded in our perception of the physical world: red, ball, cup, run, hit, fall, etc.
  - Symbol Grounding: Harnad (1990)
- Even many abstract words and meanings are metaphorical abstractions of terms grounded in the physical world: up, down, over, in, etc.
  - Lakoff and Johnson's Metaphors We Live By
  - "It's difficult to put my ideas into words."
  - "Interest in competitions is up."
- Most work in NLP tries to represent meaning without any connection to perception or to the physical world, circularly defining the meanings of words in terms of other words or meaningless symbols with no firm foundation.
25. "Mary is on the phone" [photo of a scene containing several concurrent activities]
26. Ambiguous Supervision for Learning Semantic Parsers
- A computer system simultaneously exposed to perceptual contexts and natural-language utterances should be able to learn the underlying language semantics.
- We consider ambiguous training data: sentences associated with multiple potential MRs.
  - Siskind (1996) uses this type of referentially uncertain training data to learn meanings of words.
- Extracting meaning representations from perceptual data is a difficult unsolved problem.
  - Our system works directly with symbolic MRs.
27-31. "Mary is on the phone" [build-up across five slides: candidate MRs extracted from the perceptual context appear one by one: Ironing(Mommy, Shirt), Working(Sister, Computer), Carrying(Daddy, Bag), ...]
32. Ambiguous Training Example
Candidate MRs:
- Ironing(Mommy, Shirt)
- Carrying(Daddy, Bag)
- Working(Sister, Computer)
- Talking(Mary, Phone)
- Sitting(Mary, Chair)
Sentence: "Mary is on the phone"
33. Next Ambiguous Training Example
Candidate MRs:
- Ironing(Mommy, Shirt)
- Working(Sister, Computer)
- Talking(Mary, Phone)
- ???
- Sitting(Mary, Chair)
Sentence: "Mommy is ironing a shirt"
34. Ambiguous Supervision for Learning Semantic Parsers (contd.)
- Our model of ambiguous supervision corresponds to the type of data that would be gathered from a temporal sequence of perceptual contexts with occasional language commentary.
- We assume each sentence has exactly one meaning in its perceptual context.
  - Recently extended to handle sentences with no meaning in their perceptual context.
- Each meaning is associated with at most one sentence.
35. Sample Ambiguous Corpus
MRs (left) interleaved with sentences (right); each sentence is linked to the MRs occurring in its context, forming a bipartite graph:
  gave(daisy, clock, mouse)
  ate(mouse, orange)
    "Daisy gave the clock to the mouse."
  ate(dog, apple)
    "Mommy saw that Mary gave the hammer to the dog."
  saw(mother, gave(mary, dog, hammer))
  broke(dog, box)
    "The dog broke the box."
  gave(woman, toy, mouse)
  gave(john, bag, mouse)
    "John gave the bag to the mouse."
  threw(dog, ball)
  runs(dog)
    "The dog threw the ball."
  saw(john, walks(man, dog))
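In code, such a corpus is naturally represented as sentences paired with their candidate-MR sets. A sketch of the data structure (the groupings below are illustrative; which candidate is correct is exactly what the learner must infer):

    # Each sentence paired with the MRs in its perceptual context.
    ambiguous_corpus = [
        ("Daisy gave the clock to the mouse.",
         ["gave(daisy, clock, mouse)", "ate(mouse, orange)"]),
        ("The dog broke the box.",
         ["ate(dog, apple)", "broke(dog, box)", "gave(woman, toy, mouse)"]),
        ("John gave the bag to the mouse.",
         ["gave(john, bag, mouse)", "gave(woman, toy, mouse)"]),
    ]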
36. KRISPER: KRISP with EM-like Retraining
- Extension of KRISP that learns from ambiguous supervision.
- Uses an iterative EM-like method to gradually converge on a correct meaning for each sentence.
37. KRISPER's Training Algorithm
1. Assume every possible meaning for a sentence is correct.
[Bipartite graph from slide 35: each sentence linked to all of the candidate MRs in its context]
38. KRISPER's Training Algorithm
1. Assume every possible meaning for a sentence is correct.
[Same graph, with every candidate link active]
39. KRISPER's Training Algorithm
2. The resulting NL-MR pairs are weighted and given to KRISP.
[Same graph; each sentence's links are weighted uniformly over its candidate MRs: 1/2, 1/4, 1/5, or 1/3 depending on how many candidates it has]
40. KRISPER's Training Algorithm
3. Estimate the confidence of each NL-MR pair using the resulting trained parser.
[Same graph, with links to be rescored by the trained parser]
41. KRISPER's Training Algorithm
4. Use maximum-weight matching on the bipartite graph to find the best NL-MR pairs [Munkres, 1957].
[Same graph, now with parser-confidence weights on the links (e.g., 0.92, 0.11, 0.32, 0.88, 0.95, 0.97, ...)]
42. KRISPER's Training Algorithm
4. Use maximum-weight matching on the bipartite graph to find the best NL-MR pairs [Munkres, 1957].
[Same weighted graph, with the maximum-weight matching highlighted]
43. KRISPER's Training Algorithm
5. Give the best pairs to KRISP in the next iteration, and repeat until convergence (a minimal code sketch of the full loop follows).
[Graph reduced to the selected one-to-one NL-MR pairs]
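Putting the five steps together, here is a minimal sketch of the retraining loop. It is illustrative only: train_parser and confidence are hypothetical stand-ins for KRISP's weighted training and parse-confidence scoring, and SciPy's Hungarian-algorithm routine stands in for the Munkres matcher.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def krisper_retrain(sentences, candidate_mrs, n_iters=10):
        # Steps 1-2: treat every candidate MR as correct, weighting each
        # sentence's pairs uniformly (1/k for k candidates).
        pairs = [(i, mr, 1.0 / len(mrs))
                 for i, mrs in enumerate(candidate_mrs) for mr in mrs]
        all_mrs = sorted({mr for mrs in candidate_mrs for mr in mrs})
        col = {mr: j for j, mr in enumerate(all_mrs)}
        for _ in range(n_iters):
            parser = train_parser(sentences, pairs)   # hypothetical: weighted KRISP training
            # Step 3: rescore every candidate link with the trained parser.
            scores = np.zeros((len(sentences), len(all_mrs)))
            for i, mrs in enumerate(candidate_mrs):
                for mr in mrs:
                    scores[i, col[mr]] = confidence(parser, sentences[i], mr)  # hypothetical
            # Step 4: maximum-weight bipartite matching (Munkres, 1957).
            rows, cols = linear_sum_assignment(scores, maximize=True)
            # Step 5: retrain on the selected one-to-one pairs.
            pairs = [(i, all_mrs[j], 1.0)
                     for i, j in zip(rows, cols) if scores[i, j] > 0]
        return parser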
44. Results on Ambig-ChildWorld Corpus [learning-curve graph]
45. New Challenge: Learning to Be a Sportscaster
- Goal: Learn from realistic data of natural language used in a representative context, while avoiding difficult issues in computer perception (i.e., speech and vision).
- Solution: Learn from textually annotated traces of activity in a simulated environment.
- Example: Traces of games in the RoboCup simulator paired with textual sportscaster commentary.
46. Grounded Language Learning in RoboCup
[Diagram: the RoboCup simulator feeding both the learner and a sportscaster exclaiming "Score!!!!"]
47. RoboCup Sportscaster Trace
Meaning representation (event trace):
  pass(purple7, purple6)
  ballstopped
  kick(purple6)
  pass(purple6, purple2)
  ballstopped
  kick(purple2)
  pass(purple2, purple3)
  kick(purple3)
  badPass(purple3, pink9)
  turnover(purple3, pink9)
Natural language commentary:
- purple7 passes the ball out to purple6
- purple6 passes to purple2
- purple2 makes a short pass to purple3
- purple3 loses the ball to pink9
48-49. RoboCup Sportscaster Trace
[Same trace as above, repeated across two slides to highlight the candidate and resolved NL-MR matchings]
50. Sportscasting Data
- Collected human textual commentary for the 4 RoboCup championship games from 2001-2004.
  - Avg. events/game: 2,613
  - Avg. sentences/game: 509
- Each sentence matched to all events within the previous 5 seconds.
  - Avg. MRs/sentence: 2.5 (min 1, max 12)
- Manually annotated with correct matchings of sentences to MRs (for evaluation purposes only).
51. WASPER
- WASP with EM-like retraining to handle ambiguous training data.
- The same augmentation as was added to KRISP to create KRISPER.
52. KRISPER-WASP
- The first iteration of EM-like training produces very noisy training data (~50% errors).
- KRISP is better than WASP at handling noisy training data:
  - The SVM prevents overfitting.
  - The string kernel allows partial matching.
- But KRISP does not support language generation.
- So: first train KRISPER just to determine the best NL→MR matchings.
- Then train WASP on the resulting unambiguously supervised data.
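The two-stage pipeline then amounts to (a sketch; the object and method names are hypothetical):

    # KRISPER resolves the ambiguity; WASP trains on the resolved pairs.
    matching = krisper.train_and_match(ambiguous_corpus)  # best NL-MR pairs
    wasp.train(matching)                                  # standard supervised WASP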
53. WASPER-GEN
- In KRISPER and WASPER, the correct MR for each sentence is chosen by maximizing the confidence of semantic parsing (NL→MR).
- Instead, WASPER-GEN determines the best matching based on generation (MR→NL):
  - Score each potential NL/MR pair using the currently trained WASP⁻¹ generator.
  - Compute the NIST MT score (an alternative to the BLEU score) between the generated sentence and the potential matching sentence.
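A minimal sketch of this scoring step, using NLTK's sentence-level NIST implementation; `generator` is a hypothetical stand-in for the trained WASP⁻¹ (MR→NL) model:

    from nltk.translate.nist_score import sentence_nist

    def generation_score(generator, sentence, mr):
        # `generator` is hypothetical: it returns the sentence WASP's
        # inverse would generate for this MR.
        generated = generator(mr)
        # NIST score between the generated output and the candidate
        # comment; n=2 keeps the score defined for very short sentences.
        return sentence_nist([sentence.split()], generated.split(), n=2)

    # The MR whose generated sentence best matches the comment wins:
    # best_mr = max(candidates, key=lambda mr: generation_score(gen, comment, mr))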
54. Strategic Generation
- Generation requires not only knowing how to say something (tactical generation) but also what to say (strategic generation).
- For automated sportscasting, one must be able to effectively choose which events to describe.
55. Example of Strategic Generation
[Event trace: pass(purple7, purple6), ballstopped, kick(purple6), pass(purple6, purple2), ballstopped, kick(purple2), pass(purple2, purple3), kick(purple3), badPass(purple3, pink9), turnover(purple3, pink9)]
56. Example of Strategic Generation
[Same trace, with the events the sportscaster chose to describe highlighted]
57. Learning for Strategic Generation
- For each event type (e.g., pass, kick), estimate the probability that it is described by the sportscaster.
- This requires an NL/MR matching that indicates which events were described, but that matching is not provided in the ambiguous training data:
  - Use the estimated matching computed by KRISPER, WASPER, or WASPER-GEN.
  - Or use a version of EM to determine the probability of mentioning each event type based on strategic information alone (see the code sketch after the walkthrough below).
58. EM for Strategic Generation
Event trace: pass(purple7, purple6), ballstopped, kick(purple6), pass(purple6, purple2), ballstopped, kick(purple2), pass(purple2, purple3), kick(purple3), badPass(purple3, pink9), turnover(purple3, pink9)
Commentary: "purple7 passes the ball out to purple6" / "purple6 passes to purple2" / "purple2 makes a short pass to purple3" / "purple3 loses the ball to pink9"
[Each comment is linked to the candidate events in its window]
59. EM for Strategic Generation: Estimate Generation Probs
[Same trace and commentary as above]
P(pass) = (1 + 1/4 + 1/4 + 1/4 + 1/5) / 3 = 0.65
60. EM for Strategic Generation: Estimate Generation Probs
[Same trace and commentary as above]
P(pass) = 0.65
P(ballstopped) = (1/4 + 1/4) / 2 = 0.25
61. EM for Strategic Generation: Estimate Generation Probs
[Same trace and commentary as above]
P(pass) = 0.65
P(ballstopped) = 0.25
P(kick) = (1/4 + 1/4 + 1/5 + 1/5) / 3 = 0.3
62. EM for Strategic Generation: Estimate Generation Probs
[Same trace and commentary as above]
P(pass) = 0.65
P(ballstopped) = 0.25
P(kick) = 0.3
P(badPass) = 0.2
P(turnover) = 0.2
63. EM for Strategic Generation: Reassign Link Weights
[Same trace and commentary; each link's weight is reset to the generation probability of its event type]
P(pass) = 0.65, P(ballstopped) = 0.25, P(kick) = 0.3, P(badPass) = 0.2, P(turnover) = 0.2
64. EM for Strategic Generation: Normalize Link Weights
[Same trace and commentary; each comment's link weights are renormalized to sum to 1]
P(pass) = 0.65, P(ballstopped) = 0.25, P(kick) = 0.3, P(badPass) = 0.2, P(turnover) = 0.2
65. EM for Strategic Generation: Recalculate Generation Probs and Repeat Until Convergence
[Same trace and commentary as above]
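A minimal sketch of this EM loop, assuming each comment is linked to the event types in its 5-second window. The candidate windows in the toy data below are illustrative, not the actual annotation; with a uniform start, the first E/M step reproduces the flavor of the slide computations (e.g., P(pass) = 0.65, P(ballstopped) = 0.25).

    from collections import Counter

    def strategic_em(comment_candidates, trace, n_iters=20):
        n_events = Counter(trace)            # occurrences of each event type
        p = {t: 1.0 for t in n_events}       # init: every type gets described
        for _ in range(n_iters):
            # E-step: each comment spreads unit weight over its candidate
            # events in proportion to the current generation probabilities
            # (uniformly on the first iteration, as on the slides).
            mass = Counter()
            for cands in comment_candidates:
                z = sum(p[t] for t in cands)
                for t in cands:
                    mass[t] += p[t] / z
            # M-step: average weight attracted per event of each type.
            p = {t: mass[t] / n_events[t] for t in n_events}
        return p

    trace = ["pass", "ballstopped", "kick", "pass", "ballstopped",
             "kick", "pass", "kick", "badPass", "turnover"]
    comments = [["pass"],                                # illustrative windows
                ["pass", "pass", "ballstopped", "kick"],
                ["pass", "ballstopped", "kick", "kick"],
                ["pass", "kick", "kick", "badPass", "turnover"]]
    print(strategic_em(comments, trace))     # P(pass) high, P(turnover) low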
66. Demo
- Game clip commentated using WASPER-GEN with EM-based strategic generation, since this gave the best results for generation.
- FreeTTS was used to synthesize speech from the textual output.
67. Experimental Evaluation
- Generated learning curves by training on all combinations of 1 to 3 games and testing on all games not used for training.
- Baselines:
  - Random Matching: WASP trained on a random choice of possible MR for each comment.
  - Gold Matching: WASP trained on the correct matching of MR for each comment.
- Metrics:
  - Precision: fraction of the system's annotations that are correct
  - Recall: fraction of gold-standard annotations correctly produced
  - F-measure: harmonic mean of precision and recall, as given below
68. Evaluating Matching Accuracy
- Measure how accurately various methods assign MRs to sentences in the ambiguous training data.
- Use the gold-standard matches to evaluate correctness.
69. Results on Matching [learning-curve graph]
70. Evaluating Semantic Parsing
- Measure how accurately the learned parser maps sentences to their correct meanings in the test games.
- Use the gold-standard matches to determine the correct MR for each sentence that has one.
- A generated MR must exactly match the gold standard to count as correct.
71. Results on Semantic Parsing [learning-curve graph]
72. Evaluating Tactical Generation
- Measure how accurately the NL generator produces English sentences for chosen MRs in the test games.
- Use the gold-standard matches to determine the correct sentence for each MR that has one.
- Use the NIST score to compare the generated sentence to the one in the gold standard.
73. Results on Tactical Generation [learning-curve graph]
74. Evaluating Strategic Generation
- In the test games, measure how accurately the system determines which perceived events to comment on.
- Compare the subset of events chosen by the system to the subset chosen by the human annotator (as given by the gold-standard matching).
75. Results on Strategic Generation [learning-curve graph]
76. Human Evaluation (Quasi Turing Test)
- Asked 4 fluent English speakers to evaluate the overall quality of the sportscasts.
- Randomly picked a 2-minute segment from each of the 4 games.
- Each human judge evaluated 8 commented game clips: each of the 4 segments commented once by a human and once by the machine when tested on that game.
- The 8 clips were presented to each judge in random, counterbalanced order.
- Judges were not told which sportscasts were human- or machine-generated.
77. Human Evaluation Metrics [table of rating scales]
78. Results on Human Evaluation [table]
79. Immediate Future Directions
- Use strategic-generation information to improve resolution of ambiguous training data.
- Produce generation confidences (instead of NIST scores) for scoring NL/MR matches in WASPER-GEN.
- Improve WASP's ability to handle noisy training data.
- Improve simulated perception to extract more detailed and interesting symbolic facts from the simulator.
80. Longer-Term Future Directions
- Apply the approach to learning situated language in a computer video-game environment (Gorniak & Roy, 2005).
  - Teach game AIs how to talk to you!
- Apply the approach to captioned images or video, using computer vision to extract objects, relations, and events from real perceptual data (Fleischman & Roy, 2007).
81. Conclusions
- Current language-learning work uses expensive, unrealistic training data.
- We have developed a language learning system that can learn from language paired with an ambiguous perceptual environment.
- We have evaluated it on the task of learning to sportscast simulated RoboCup games.
- The system learns to sportscast almost as well as humans.