Learning Theory and Natural Languages - PowerPoint PPT Presentation

Provided by: csta3
1
Learning Theory and Natural Languages
  • Presented by Yaron Singer

2
Outline
  • Introduction
  • Formal Learning Theory and motivation
  • Gold's Paradigm
  • Alternative Models of Language Acquisition
  • Strong Nativism (time permitting)

3
Introduction
4
Introduction to today's presentation
  • This presentation is a brief introduction to
    formal learning theory (to be defined shortly)
  • The questions which will be discussed:
  • What are natural languages?
  • Which languages can humans learn?
  • Do humans learn a language from zero, or do they
    have some inborn mechanism which enables language
    acquisition (Strong Nativism, Chomsky)?
  • Can we impose some constraints and construct a
    formal model in which artificial natural language
    acquisition is possible?

5
Formal Learning Theory and Motivation
6
Motivation
  • "I wish to construct a precise model for the
    intuitive notion 'able to speak a language' in
    order to be able to investigate theoretically how
    it can be achieved artificially."
  • E. M. Gold (1967)

7
Comparative Grammar
  • Comparative Grammar is the attempt to
    characterize the class of natural languages
    through formal specification of their grammars
  • Theories of comparative grammars begin with
    Chomsky (e.g. 1957, 1965).

8
Formal Learning Theory
  • What is Formal Learning Theory?
  • Link between the results of acquisitional studies
    and comparative grammar.
  • For example:
  • Suppose we prove that, given one rule of grammar
    in some language, there is an algorithm which
    generates all rules of grammar of that language.
  • Then we find out that children first use only one
    rule of grammar.
  • We may then hypothesize that children use that
    algorithm to generate a grammar.

9
What do we know about natural languages?
  • One of the most fundamental properties of natural
    languages is:
  • Children can learn them through unsystematic
    exposure within a few years.

10
Thus
  • If we can construct a formal model of how
    children learn a language, we will be able to
    train a computer to learn a language.
  • Maybe.

11
Gold's Paradigm
12
Definitions
  • For the purpose of this discussion we need to
    define the following:
  • Language
  • Learner
  • Learning Environment
  • Criterion of Learning

13
Languages and Grammars
  • We define languages to be sets of sentences.
  • The only constraint is that the set of all
    possible sentences in the language is countable -
    That is,
  • All logically possible grammars are defined here
    as all possible Turing machines.

14
Decidable Languages
  • A language is said to be decidable iff it has a
    grammar and its complement has a grammar.
  • We focus on Non-empty languages.
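
A minimal sketch of why this definition makes membership decidable, assuming a grammar is modeled as a Python generator that enumerates sentences (a stand-in for a Turing machine) and that sentences are coded as nonnegative integers. The names evens, odds, and decide are hypothetical and chosen only for illustration. Because every sentence is in L or in its complement, alternating between the two enumerations must eventually reach it.

```python
from itertools import count


def evens():
    """Enumerate a toy language L: the even numbers 0, 2, 4, ..."""
    for n in count(0):
        yield 2 * n


def odds():
    """Enumerate the complement of L: the odd numbers 1, 3, 5, ..."""
    for n in count(0):
        yield 2 * n + 1


def decide(sentence, enumerate_language, enumerate_complement):
    """Alternate between both enumerations until the sentence shows up.

    Terminates only if the sentence really occurs in one of the two
    enumerations (here: any nonnegative integer).
    """
    for in_l, in_complement in zip(enumerate_language(), enumerate_complement()):
        if in_l == sentence:
            return True
        if in_complement == sentence:
            return False


print(decide(10, evens, odds))  # True
print(decide(7, evens, odds))   # False
```

With only one of the two enumerations available, the search could run forever on a sentence that never appears, which is the gap between having a grammar and being decidable.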

15
Environment
  • To understand how a learner acquires a language,
    we must understand their learning environment.
  • Assumptions about the learning environment:
  • Sentences are presented one after another with no
    ungrammatical intrusions
  • Negative information is withheld
  • Each sentence in L eventually appears
  • Repetitions are allowed
  • Sentences can arrive in any order
  • Sentences are presented forever

16
Learning Environments as Text
  • Gold describes environments as infinite text.
  • Note that we are already making an assumption
    which we know isn't correct:
  • We are assuming that all input is grammatical.
  • We know that hugs, kisses, smiles, crying, tone,
    intonation, volume of speech (shouting,
    whispering), etc. affect learning.
  • Leaving these out simplifies our model (we can't
    / won't hug a computer).

17
Texts as Environments
  • An environment is referred to as a text.
  • A text t is for a language L if every member of L
    appears somewhere in t (repetitions are allowed),
    and no non-members of L appear in t.
  • L(t) denotes the language for which t is a text.
  • We denote a text by t, and the sequence of the
    first n members of t by t_n.
  • The set of all finite sequences of any length
    (t_1, t_2, ..., t_n) is denoted SEQ.
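
A small sketch of these definitions, assuming a finite, nonempty language given as a Python set of strings. The helpers text_for and prefix are hypothetical names, not part of Gold's paper. Cycling through the sentences is just one of many possible texts for the language: every member appears, repetitions occur, and nothing outside the language is ever presented.

```python
from itertools import cycle, islice


def text_for(language):
    """One infinite text for a finite, nonempty language: repeat its sentences forever."""
    return cycle(sorted(language))


def prefix(text, n):
    """The first n members of a text, as a tuple (one element of SEQ)."""
    return tuple(islice(text, n))


L = {"a", "ab", "abb"}
t_5 = prefix(text_for(L), 5)
print(t_5)            # ('a', 'ab', 'abb', 'a', 'ab')
print(set(t_5) == L)  # True: this prefix already shows every member of L
```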

18
Learning Function
  • A learning function is defined as any function
    from the set of all finite sentence sequences
    (denoted SEQ) to the set of possible grammars.
  • Note: a learning function may be partial, that
    is, undefined on some sequences.

19
Children implementing Learning Functions
  • A child who acquires a language implements a
    learning function, since the child maps finite
    sequences of sentences into grammars.
  • We make a few assumptions here:
  • Linguistic input
  • One grammatical hypothesis

20
Learning Functions
  • We wish to define some criterion that would
    enable us to decide what is a good learning
    function and what is a bad one.

21
Criterion of Learning
  • In his paper "Language Identification in the
    Limit", Gold suggested the following criterion
    for learning:
  • A learning function f is defined on a text t if
    f is defined on t_n for all n in N.
  • If f is defined on t and, for some grammar g in
    G, f(t_n) = g for all but finitely many n in N,
    then f is said to converge on t to g.
  • If f converges on t to a grammar for L(t), then f
    is said to identify t.
  • If f identifies every text for a language L,
    then f is said to identify L.
  • If f identifies every language in a set of
    languages, then f is said to identify that set of
    languages.
  • A collection of languages is said to be
    identifiable if there is some learning function f
    which identifies it.
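
The criterion can be illustrated, though not proved, on a finite prefix of a text. The sketch below assumes grammars are modeled as frozensets that list their sentences, which only covers finite languages. The names learner and conjectures are hypothetical. The learner simply conjectures the set of sentences seen so far.

```python
from itertools import cycle, islice


def learner(sequence):
    """Conjecture exactly the set of sentences seen so far."""
    return frozenset(sequence)


def conjectures(text, steps):
    """Return the learner's conjecture after each of the first `steps` inputs."""
    seen = []
    history = []
    for sentence in islice(text, steps):
        seen.append(sentence)
        history.append(learner(tuple(seen)))
    return history


L = frozenset({"a", "ab", "abb"})
history = conjectures(cycle(sorted(L)), 8)

# The first two conjectures are too small; from the third on the guess is L.
print([h == L for h in history])
# [False, False, True, True, True, True, True, True]
```

Since every sentence of a finite language appears after finitely many steps of any text for it, this learner converges on every such text, so it identifies the collection of all finite languages.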

22
Intuitive Example
  • To get some intuition for what is meant by
    identifying a language, let's consider the
    following example:
  • A text t is fed to a learner M, one sentence at a
    time.
  • With each new input, M is faced with a new finite
    sequence of sentences.
  • M is defined on t if it offers a hypothesis on
    all of these finite sequences of sentences.
  • If M is undefined somewhere on t, then it is
    stuck.
  • If M does not get stuck and after some finite
    time converges on t to a grammar g for L(t), M
    has identified the language.

[Figure: a text fed to the learner M]
23
Questions
24
Unidentifiable Collection of Languages
  • In his paper, Gold proved the following
    remarkable proposition:
  • PROPOSITION: Let L be a collection of languages
    that includes every finite language and at least
    one infinite language. Then L is not
    identifiable.
  • CONCLUSION: This places a serious constraint on
    models of language acquisition by children, since
    natural languages are infinite.
  • Gold's learning paradigm offers useful conditions
    on comparative grammar only to the extent that
    the paradigm accurately portrays normal language
    acquisition.
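
The tension behind the proposition can be seen in a small experiment. This is an illustration under stated assumptions, not Gold's proof: the learner below is the hypothetical "conjecture what you have seen" strategy, which succeeds on every finite language, and the infinite language {a, aa, aaa, ...} is an arbitrary choice. The function mind_changes counts how many times the conjecture differs from the previous one, counting the first conjecture as a change.

```python
from itertools import count, cycle, islice


def learner(sequence):
    """Conjecture exactly the set of sentences seen so far."""
    return frozenset(sequence)


def infinite_text():
    """A text for the infinite language {a, aa, aaa, ...}."""
    for n in count(1):
        yield "a" * n


def mind_changes(text, steps):
    """Count conjecture changes over the first `steps` inputs (first conjecture included)."""
    seen, previous, changes = [], None, 0
    for sentence in islice(text, steps):
        seen.append(sentence)
        guess = learner(tuple(seen))
        if guess != previous:
            changes += 1
        previous = guess
    return changes


print(mind_changes(cycle(["a", "b"]), 50))  # 2: settles on a finite language
print(mind_changes(infinite_text(), 50))    # 50: a new conjecture at every step
```

On the infinite language this learner never settles. Gold's argument shows that the problem is not specific to this strategy: any learner that identifies every finite language can be fed a text for the infinite language on which it fails to converge to a correct grammar.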

25
Alternative Models of Language Acquisition
26
Exploring alternative models
  • We will now explore alternatives to Gold's
    model.
  • In some ways we will try to make the model
    tighter:
  • Computable learning functions
  • Learning functions which conjecture only infinite
    languages
  • In other ways we'll try to make the model
    looser:
  • Noisy text: learning with intrusions.
  • We will see the constraints that these
    alternative models impose on the languages which
    they are able to learn.

27
Alternative Models of Language Acquisition
  • Do children implement a learning function?
  • The current hypothesis is that the learning
    functions children implement form a proper subset
    of the class of all learning functions.

[Figure: the set "children" drawn inside the set "Learning Functions"]
28
Computability
  • We would like to believe that language
    acquisition is computable, i.e., for a natural
    language L there exists a Turing machine M which,
    given a text for L, generates a Turing machine G
    that is a grammar for L, so that M identifies L.
  • Computable functions are a small subset of all
    learning functions.
  • If we assume that language acquisition is
    computable, what constraints are we imposing?

29
Identifiable Computable Languages
  • Can every identifiable collection of languages
    be identified by a computable learning function?
  • PROPOSITION: There are collections L of languages
    such that L is identifiable, but no computable
    learning function identifies L.

30
Is that good or bad?
  • Maybe there is a natural language that a child
    can learn, but no algorithm exists which enables
    a computer to speak that language.
  • On the other hand, if we assume that natural
    language acquisition is computable, this
    proposition lets us set aside the collections
    that only noncomputable learners can identify,
    which narrows down our question of what natural
    languages are.
  • Even so, there are still too many learning
    functions. Let's consider subsets of the set of
    computable functions.

31
Learning Functions
[Figure: nested sets labeled "Learning Functions", "Learning functions of identifiable languages", and "Computable"]
32
Nontriviality
  • Natural languages are infinite.
  • (No natural language contains a longest
    sentence.)
  • A learning function is considered nontrivial if:
  • It is computable
  • It produces a grammar which generates an infinite
    language on every finite sequence for which it is
    defined.

33
Constraints of Nontriviality
  • The next proposition shows that nontriviality
    imposes limits on the computable learner.
  • PROPOSITION: There are collections L of infinite
    languages such that some computable learning
    function identifies L, but no nontrivial learning
    function identifies L.
  • If children are nontrivial learners, then there
    are collections of infinite languages beyond
    their reach that might otherwise (if they were
    not nontrivial learners) have been available.

34
Learning Functions
[Figure: nested sets labeled "Learning Functions", "Learning functions of identifiable languages", "Computable", "Nontrivial", and "Children?"]
35
Natural Environments
  • Gold defined the environment as text.
  • On the one hand:
  • We know that a real learning environment contains
    ungrammatical intrusions, as well as omissions of
    some grammatical sentences.
  • On the other hand:
  • An arbitrary text presents sentences in an
    arbitrary order. Usually this is not the case.
  • "Waldo is in the red house."
  • "All positive even integers greater than two can
    be expressed as the sum of two primes."

36
Noisy Text
  • Suppose D is an arbitrary finite set.
  • A noisy text for a language L is a text for
    L ∪ D. That is, a text for L into which a number
    of intrusions from D have been inserted.
  • It is easy to see that no collection of languages
    that includes two languages which are finite
    variants of each other is identifiable on noisy
    text.
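
The "easy to see" step can be sketched with two toy languages. The names noisy_text, L1, D1, L2, and D2 are hypothetical, and cycling through the sentences is just one convenient way to build a text. L1 and L2 differ in a single sentence, so a noisy text for L1 whose intrusion is exactly that sentence is also a text for L2.

```python
from itertools import cycle, islice


def noisy_text(language, intrusions):
    """A text for L ∪ D: cycle through every sentence of L and every intrusion in D."""
    return cycle(sorted(language | intrusions))


L1, D1 = {"a"}, {"b"}        # L1, presented with the intrusion "b"
L2, D2 = {"a", "b"}, set()   # L2, a finite variant of L1, with no intrusions

first_20_of_L1 = list(islice(noisy_text(L1, D1), 20))
first_20_of_L2 = list(islice(noisy_text(L2, D2), 20))
print(first_20_of_L1 == first_20_of_L2)  # True
```

The two texts agree at every position, not only on the first 20, because both cycle through the same sorted list. A learner therefore makes the same conjectures on both and cannot converge to a grammar for L1 on one and a grammar for L2 on the other.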

37
Infinite Languages on Noisy Text
  • But what about infinite languages?
  • PROPOSITION: There are collections L of infinite,
    disjoint, and computable languages such that no
    computable learning function identifies L on
    noisy text.
  • CONCLUSION: Trying to construct a model on noisy
    text imposes serious constraints!

38
Strong Nativism
39
Summary
  • We have presented Gold's paradigm and the
    difficulties with the assumptions it makes.
  • We have explored a few alternative models, and
    witnessed their constraints.
  • We have introduced briefly the concept of Strong
    Nativism.
  • Ultimately, we might hope to find sufficiently
    powerful conditions on the human learning
    function, on the environments in which it
    typically operates, and on the criterion of
    success to uniquely define the class of natural
    languages.