Learning Theory and Natural Languages

1
Learning Theory and Natural Languages

Presented by Yaron Singer

2
Outline

Introduction
Formal Learning Theory and motivation
Gold's Paradigm
Alternative Models of Language Acquisition
Strong Nativism (time permitting)

3
Introduction
4
Introduction to today's presentation

This presentation is a brief introduction to
formal learning theory (to be defined shortly).
The questions which will be discussed:
What are natural languages?
Which languages can humans learn?
Do humans learn a language from zero, or do they
have some inborn mechanism which enables language
acquisition (Strong Nativism, Chomsky)?
Can we impose some constraints and construct a
formal model where artificial natural language
acquisition is possible?

5
Formal Learning Theory and Motivation
6
Motivation

"I wish to construct a precise model for the
intuitive notion 'able to speak a language' in
order to be able to investigate theoretically how
it can be achieved artificially."
E. M. Gold (1967)

7
Comparative Grammar

Comparative grammar is the attempt to
characterize the class of natural languages
through formal specification of their grammars.
Theories of comparative grammar begin with
Chomsky (e.g. 1957, 1965).

8
Formal Learning Theory

What is Formal Learning Theory?
Link between the results of acquisitional studies
and comparative grammar.
For example:
Suppose we prove that, given one rule of grammar
in some language, there is an algorithm which
generates all rules of grammar of that language.
Then we find out that children first use only one
rule of grammar.
We may then hypothesize that children use that
algorithm to generate a grammar.

9
What do we know about natural languages?

One of the most fundamental properties of natural
languages is:

Children can learn them through unsystematic
exposure within a few years.

10
Thus

If we are able to construct a formal model of
how children learn a language, we will be able to
train a computer to learn a language.
Maybe.

11
Gold's Paradigm
12
Definitions

For the purpose of this discussion we will need
to define the following:
Language
Learner
Learning Environment
Criterion of Learning

13
Languages and Grammars

We shall define Languages to be sets of
sentences.
The only constraint is that the set of all
possible sentences in the language is countable.
All logically possible grammars are defined here
as all possible Turing machines.

14
Decidable Languages

A language is said to be decidable iff it has a
grammar and its complement has a grammar.
We focus on non-empty languages.
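
The definition above can be read as a procedure: if one grammar lists the language and another lists its complement, alternating between the two lists must eventually reach any given sentence. The sketch below is only an illustration under stated assumptions: Python generators stand in for grammars (the slides use Turing machines), both generators are assumed to run forever, and the toy language of even-length strings of a's, together with the names `decide`, `even_as` and `odd_as`, is hypothetical.

```python
from itertools import count
from typing import Callable, Iterator


def decide(sentence: str,
           enum_language: Callable[[], Iterator[str]],
           enum_complement: Callable[[], Iterator[str]]) -> bool:
    """Decide membership by alternating between two enumerations.

    Assumes every possible sentence is produced by exactly one of the
    two generators, and that both generators are infinite.
    """
    in_language = enum_language()
    in_complement = enum_complement()
    while True:
        if next(in_language) == sentence:
            return True
        if next(in_complement) == sentence:
            return False


def even_as() -> Iterator[str]:
    """Toy 'grammar': enumerates strings of a's of even length."""
    for n in count(0):
        yield "a" * (2 * n)


def odd_as() -> Iterator[str]:
    """Toy 'grammar' for the complement, within the universe of strings of a's."""
    for n in count(0):
        yield "a" * (2 * n + 1)


print(decide("aaaa", even_as, odd_as))
print(decide("aaa", even_as, odd_as))
```

With only the first generator, a sentence outside the language would make the search run forever, which is why the definition asks for a grammar on both sides.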

15
Environment

To understand how a learner acquires a language,
we must understand their learning environment.
Assumptions about the learning environment:
Sentences are presented one after another with no
ungrammatical intrusions
Negative information is withheld
Each sentence in L eventually appears
Repetitions are allowed
Sentences can arrive in any order
Sentences are presented forever

16
Learning Environments as Text

Gold describes environments as infinite texts.
Note that we are already making an assumption
which we know isn't correct:
We are assuming that all input is grammatical.
We know that hugs, kisses, smiles, crying, tone,
intonation, volume of speech (shouting,
whispering), etc. affect learning.
This simplifies our model (can't / won't hug a
computer).

17
Texts as Environments

An environment is referred to as a text.
A text t is for a language L if every member of L
appears somewhere in t (repetitions are allowed),
and no sentence outside L appears in t.
L(t) denotes the language for which t is a text.
We denote a text by t, and the sequence of the
first n members of t is denoted tn.
The set of all finite sequences of any length
(t1, t2, ..., tn) is denoted SEQ.
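
As a rough illustration of these definitions, the sketch below builds a text for a small finite language and takes its initial segments. It rests on assumptions that are not in the slides: sentences are Python strings, the text is produced by cycling through the language forever, and the helper names `text_for`, `initial_segment` and `consistent_with` are made up for this example.

```python
from itertools import cycle, islice
from typing import Iterable, Iterator, List, Set


def text_for(language: Set[str]) -> Iterator[str]:
    """One possible text for a finite, non-empty language.

    Cycling forever guarantees that every sentence appears, that
    repetitions occur, and that nothing outside the language appears.
    """
    return cycle(sorted(language))


def initial_segment(text: Iterable[str], n: int) -> List[str]:
    """The first n members of a text (written tn in the slides)."""
    return list(islice(text, n))


def consistent_with(segment: List[str], language: Set[str]) -> bool:
    """True if the segment could begin some text for the language."""
    return all(sentence in language for sentence in segment)


L = {"a", "ab"}
t5 = initial_segment(text_for(L), 5)
print(t5)
print(consistent_with(t5, L))
print(consistent_with(t5, {"a"}))
```

Every such segment is a member of SEQ. A learner only ever sees segments like `t5`, never the whole text.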

18
Learning Function

A learning function is defined as any function
from the set of all finite sentence sequences
(denoted SEQ) to the set of possible grammars.
Note: a learning function may be undefined on
some sequences.

19
Children implementing Learning Functions

A child who acquires a language implements a
learning function, since the child maps finite
sequences of sentences into grammars.
We make a few assumptions here:
Linguistic input
One grammatical hypothesis

20
Learning Functions

We wish to define some criterion that would
enable us to decide what is a good learning
function and what is a bad one.

21
Criterion of Learning

In his paper "Language Identification in the
Limit", Gold suggested the following
criterion for learning:
A learning function f is defined on a text t if f
is defined on tn for all n in N.
If f is defined on t and, for some grammar g in G,
f(tn) = g for all but finitely many n in N, then f
is said to converge on t to g.
If f converges on t to a grammar for L(t), then f
is said to identify t.
If f identifies every text for a language L,
then f is said to identify L.
If f identifies every language in a set of
languages, then f is said to identify that set of
languages.
A collection of languages is said to be
identifiable if there is some learning function f
which identifies it.
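
A standard example of this criterion is the collection of all finite languages: a learner that always guesses "exactly the sentences seen so far" converges on every text for a finite language. The sketch below makes some simplifying assumptions: a finite set of strings stands in for a grammar, only a finite prefix of a text is inspected (so it illustrates convergence rather than proving it), and the names `finite_learner`, `hypotheses` and `last_mind_change` are hypothetical.

```python
from typing import FrozenSet, List, Sequence

# Assumption: an explicitly listed finite language stands in for its grammar.
Grammar = FrozenSet[str]


def finite_learner(segment: Sequence[str]) -> Grammar:
    """Guess the language consisting of exactly the sentences seen so far."""
    return frozenset(segment)


def hypotheses(prefix: Sequence[str]) -> List[Grammar]:
    """The learner's guesses f(t1), f(t2), ... along a finite prefix."""
    return [finite_learner(prefix[:n]) for n in range(1, len(prefix) + 1)]


def last_mind_change(guesses: List[Grammar]) -> int:
    """Smallest n (1-based) from which the guess stays fixed in this prefix."""
    last = 1
    for i in range(1, len(guesses)):
        if guesses[i] != guesses[i - 1]:
            last = i + 1
    return last


# Start of a text for the finite language {a, b, c}.
prefix = ["b", "a", "b", "c", "a", "c", "b", "a"]
guesses = hypotheses(prefix)
print(last_mind_change(guesses))
print(guesses[-1] == frozenset({"a", "b", "c"}))
```

Once every sentence of the finite language has appeared, the guess can no longer change, so the learner converges to a correct grammar on any text for that language.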

22
Intuitive Example

To gain some intuition for what is meant by
identifying a language, let's consider the
following example:
A text t is fed to a learner M, one sentence at a
time.
With each new input, M is faced with a new finite
sequence of sentences.
M is defined on t if it offers a hypothesis on
all of these finite sequences of sentences.
If M is undefined somewhere on t, then it is
stuck.
If M does not get stuck and after some finite
time converges on t to a grammar g for the
language of t, M has identified the language.

[Figure: a text being fed to the learner M]
23
Questions
24
Unidentifiable Collection of Languages

In his paper, Gold proved the following
remarkable proposition:
PROPOSITION: Let L be a collection of
languages that includes every finite
language and at least one infinite language.
Then L is not identifiable.
CONCLUSION: This places a serious constraint on
models of language acquisition by children, as
natural languages are infinite.
Gold's learning paradigm offers useful conditions
on comparative grammar only to the extent that
the paradigm accurately portrays normal language
acquisition.
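
The tension behind the proposition can be illustrated with two simple learners; this is a sketch and not Gold's proof. A cautious learner that guesses only what it has seen never settles on a text for an infinite language, while a bold learner that always guesses the infinite language is wrong on a finite one. The toy language {a, aa, aaa, ...}, the label `ALL_AS`, and all function names are assumptions made for this example.

```python
from itertools import count, islice

# Hypothetical label standing in for a grammar of {a, aa, aaa, ...}.
ALL_AS = "ALL_AS"


def cautious_learner(segment):
    """Guess the finite language of exactly the sentences seen so far."""
    return frozenset(segment)


def bold_learner(segment):
    """Always guess the infinite language, whatever has been seen."""
    return ALL_AS


def mind_changes(learner, prefix):
    """Count how often the learner's guess changes along a finite prefix."""
    guesses = [learner(prefix[:n]) for n in range(1, len(prefix) + 1)]
    return sum(1 for prev, cur in zip(guesses, guesses[1:]) if prev != cur)


def text_all_as():
    """A text for the infinite language: a, aa, aaa, ..."""
    for n in count(1):
        yield "a" * n


infinite_prefix = list(islice(text_all_as(), 50))
finite_prefix = ["a"] * 50  # start of the only text for the language {a}

# The cautious learner changes its guess with every new sentence.
print(mind_changes(cautious_learner, infinite_prefix))
# The bold learner is stable, but its guess is not a grammar for {a}.
print(bold_learner(finite_prefix) == frozenset({"a"}))
```

Gold's argument shows that no learner, however clever, can escape both failure modes at once on such a collection.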

25
Alternative Models of Language Acquisition
26
Exploring alternative models

We will now explore alternative models to Gold's
model.
In some ways we will try to make the model
tighter:
Computable learning functions
Learning functions which generate infinite
languages
In other ways we'll try to make the model
looser:
Noisy text: learning with intrusions
We will see the constraints that these
alternative models impose on the languages which
they are able to learn.

27
Alternative Models of Language Acquisition

Do children implement a learning function?
The current hypothesis is that the learning
functions children implement are a proper
subset of the class of all learning functions.

[Figure: set diagram with labels "Learning Functions" and "children"]
28
Computability

We would like to believe that language
acquisition is computable, i.e. for a natural
language L there exists a Turing machine M which,
on any text for L, eventually settles on a Turing
machine G that is a grammar for L.
Computable functions are a small subset of all
learning functions.
If we assume that language acquisition is
computable, what constraints are we imposing?

29
Identifiable Computable Languages

Is every identifiable collection of languages
identifiable by a computable learning function?
PROPOSITION: There are collections L of
languages such that L is identifiable, but no
computable learning function identifies L.

30
Is that good or bad?

Maybe there is a natural language that a child
can learn, but no algorithm exists which enables
a computer to speak that language.
On the other hand, if we assume that language
acquisition is computable, this proposition
lets us ignore noncomputable learning functions,
which narrows down our question of what natural
languages are.
Even so, there are still too many learning
functions. Let's consider subsets of the set of
computable functions.

31
Learning Functions
[Figure: set diagram with labels "Learning Functions", "Learning functions of identifiable languages" and "Computable"]
32
Nontriviality

Natural Languages are infinite.
(No natural language contains a longest
sentence.)
A learning function is considered nontrivial if:
It is computable
It produces a grammar which generates an infinite
language on every finite sequence for which it is
defined.

33
Constraints of Nontriviality

The next proposition shows that nontriviality
imposes limits on the computable learner.
PROPOSITION: There are collections L of infinite
languages such that some computable function
identifies L, but no nontrivial learning
function identifies L.
If children are nontrivial learners, then there
are collections of infinite languages beyond
their reach that might otherwise (if they were
not nontrivial learners) have been available.

34
Learning Functions
[Figure: set diagram with labels "Learning Functions", "Learning functions of identifiable languages", "Children?", "Computable" and "Nontrivial"]
35
Natural Environments

Gold defined the environment as a text.
On the one hand:
We know that a real learning environment contains
ungrammatical intrusions, as well as omissions of
some grammatical sentences.
On the other hand:
An arbitrary text consists of an arbitrary
ordering of sentences. Usually this is not the case.
"Waldo is in the red house."
"All positive even integers greater than two can
be expressed as the sum of two primes."

36
Noisy Text

Suppose D is some arbitrary finite set.
A noisy text for a language L is a text for L ∪
D, that is, a text for L into which the
intrusions in D have been inserted.
It is easy to see that no collection of languages
that includes two finite variants (languages
differing in only finitely many sentences) is
identifiable on noisy text.
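
The remark about finite variants can be checked on a toy case: one and the same text is a noisy text for two different languages, so no learner can tell from it which one to name. The sketch below assumes a text that repeats a finite list of sentences forever, so its content is simply the set of sentences in that list; the function name `intrusions` and the example languages are hypothetical.

```python
from typing import List, Optional, Set


def intrusions(period: List[str], language: Set[str]) -> Optional[Set[str]]:
    """Treat `period` repeated forever as a text.

    Return the set D such that this text is a text for language union D,
    or None if some sentence of the language never appears.
    """
    content = set(period)
    if not language <= content:
        return None
    return content - language


period = ["a", "c", "b"]      # the text a, c, b, a, c, b, ...
small = {"a", "b"}
large = {"a", "b", "c"}       # a finite variant of `small`

print(intrusions(period, small))   # noisy text for `small`, with D = {c}
print(intrusions(period, large))   # also a text for `large`, with D empty
```

Since the text is acceptable input for both languages, a learner that converges on it must be wrong about at least one of them.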

37
Infinite Languages on Noisy Text

But what about infinite languages?
PROPOSITION: There are collections L of
infinite, disjoint, and computable languages such
that no computable function identifies L on noisy
text.
Conclusion: trying to construct a model on noisy
text imposes serious constraints!

38
Strong Nativism
39
Summary

We have presented Gold's paradigm and the
difficulties with the assumptions it makes.
We have explored a few alternative models, and
witnessed their constraints.
We have introduced briefly the concept of Strong
Nativism.
Ultimately, we might hope to find sufficiently
powerful conditions on the human learning
function, on the environments in which it
typically operates, and on the criterion of
success to uniquely define the class of natural
languages.