Title: Language modelling (word FST)
Slide 1: Language modelling (word FST)
- Operational model for categorizing mispronunciations (sketch below)
  - Prompted image: circus
  - Step 1: decode the visual image
    (circus)
    (cursus)
    (circus)
  - Step 2: convert graphemes to phonemes
    (/k y r s y s/)
    (/s i r k y s/)
    (/k i r k y s/)
  - Step 3: articulate the phonemes
    (/s i r k y s/)
    (/k y - k y r s y s/) (/ka - i - Er - k y s/)
  - Spoken utterance: correct, a miscue (step 3), or an error (steps 1, 2)
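For concreteness, a minimal Python sketch of the categorization logic implied by the three steps; the flag names are invented, and a real system would derive them from an alignment rather than take them as inputs.

```python
# Minimal sketch: which step deviated first decides the category.
# Deviations in steps 1-2 (decoding, grapheme-to-phoneme) are errors;
# deviations in step 3 (articulation: restarts, spelling) are miscues.

def classify_utterance(decoded_ok: bool, g2p_ok: bool, articulated_ok: bool) -> str:
    if not decoded_ok or not g2p_ok:
        return "error"       # wrong word decoded or wrong phoneme conversion
    if not articulated_ok:
        return "miscue"      # e.g. restart /k y - k y r s y s/ or spelling
    return "correct"

# Target "circus" decoded as "cursus" and then restarted: still an error,
# because the deviation originated in step 1.
print(classify_utterance(decoded_ok=False, g2p_ok=True, articulated_ok=False))
```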
Slide 2: Language modelling (word FST)
- Prevalence of errors of the different types (Chorec data)
- Children with RD tend to guess more often
- Important to model steps 1 and 3; step 2 is less important
Slide 3: Creation of the word FST model: step 1
- Correct pronunciation
- Predictable errors
  - (prediction model needed; see the sketch below)
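A rough sketch of what the step-1 branch set could look like before compilation into a transducer; the helper name, the correct-branch mass of 0.8, and the error scores are assumptions, and a real implementation would emit weighted FST arcs (e.g. with OpenFst) rather than a dict.

```python
# Step 1 of the word FST: one branch for the correct pronunciation plus one
# branch per predicted decoding error, with probabilities as arc weights.

def step1_branches(target: str, predicted_errors: dict[str, float],
                   p_correct: float = 0.8) -> dict[str, float]:
    """Spread the remaining probability mass over the predicted errors,
    proportionally to the prediction model's scores."""
    total = sum(predicted_errors.values()) or 1.0
    branches = {target: p_correct}
    for word, score in predicted_errors.items():
        branches[word] = (1.0 - p_correct) * score / total
    return branches

# Hypothetical prediction-model output for the target "circus":
print(step1_branches("circus", {"cursus": 0.6, "cirkus": 0.4}))
# {'circus': 0.8, 'cursus': 0.12, 'cirkus': 0.08}
```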
Slide 4: Creation of the word FST model: step 3
- Per branch in the previous FST (sketch below):
  - correctly articulated
  - restarts (fixed probabilities for now)
  - spelling (phonemic) (fixed probabilities for now)
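A speculative sketch of the per-branch step-3 expansion. The restart shape (first two phonemes, a break, then the whole word) mimics the /k y - k y r s y s/ example; the spelling variant is a crude stand-in that only inserts breaks, whereas real phonemic spelling would use letter names (/ka - i - Er .../). The probability values are the "fixed for now" placeholders, with made-up numbers.

```python
# Expand one decoding branch into articulation variants with fixed weights.
P_CORRECT, P_RESTART, P_SPELL = 0.90, 0.07, 0.03

def step3_expand(phonemes: list[str]) -> list[tuple[list[str], float]]:
    restart = phonemes[:2] + ["-"] + phonemes            # /k y - k y r s y s/
    spelling = [tok for ph in phonemes for tok in (ph, "-")][:-1]
    return [(phonemes, P_CORRECT), (restart, P_RESTART), (spelling, P_SPELL)]

for variant, p in step3_expand(["k", "y", "r", "s", "y", "s"]):
    print(" ".join(variant), p)
```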
Slide 5: Modelling image decoding errors
- Model 1: memory model (sketch below)
  - adopted in Project LISTEN
  - per target word (TW):
    - create a list of the errors found in the database
    - keep those with P(list entry = error | TW) > TH
  - advantages:
    - very simple strategy
    - can model real-word and non-real-word errors
  - disadvantages:
    - cannot model unseen errors
    - probably low precision
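The memory model reduces to counting: a minimal sketch, with made-up observation counts and an illustrative threshold TH = 0.05.

```python
# Keep an observed form as a predicted error for this target word if its
# relative frequency in the database exceeds the threshold TH.
from collections import Counter

def memory_model(observations: list[str], target: str, th: float = 0.05) -> dict[str, float]:
    counts = Counter(observations)
    total = sum(counts.values())
    return {form: n / total for form, n in counts.items()
            if form != target and n / total > th}

obs = ["circus"] * 80 + ["cursus"] * 12 + ["cirkus"] * 6 + ["sirkus"] * 2
print(memory_model(obs, "circus"))   # {'cursus': 0.12, 'cirkus': 0.06}
```

Unseen forms get probability zero by construction, which is exactly the "cannot model unseen errors" disadvantage above.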
Slide 6: Modelling image decoding errors
- Model 2: extrapolation model (idea from ..)
  - look for existing words that
    - are expected to belong to the vocabulary of the child (mental lexicon)
    - bear a good resemblance to the target word
  - select lexicon entries from that vocabulary
  - feature-based: expose (dis)similarities with the TW
    - features: length differences, alignment agreement, word categories, graphemes in common, ...
  - decision tree → P(entry = decoding error | features)
    - keep those with P > TH (sketch below)
  - advantage: can model errors not seen before
  - disadvantage: can only model real-word errors
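A sketch of the decision-tree step with scikit-learn. The three features, the training rows, and the candidate words are invented stand-ins; the slide's actual feature set is richer (alignment agreement, word categories, ...).

```python
# Estimate P(entry = decoding error | features) with a decision tree and
# keep lexicon entries whose probability exceeds TH.
from sklearn.tree import DecisionTreeClassifier

# Features per candidate: [length difference, graphemes in common, same category]
X_train = [[0, 5, 1], [1, 4, 1], [3, 1, 0], [0, 2, 0], [1, 5, 1], [4, 0, 0]]
y_train = [1, 1, 0, 0, 1, 0]    # 1 = candidate was observed as a decoding error

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

TH = 0.5
candidates = {"cursus": [1, 4, 1], "cactus": [2, 3, 1]}
for word, feats in candidates.items():
    p = tree.predict_proba([feats])[0][1]     # probability of class 1
    if p > TH:
        print(word, round(p, 2))
```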
Slide 7: Modelling image decoding errors
- Model 3: rule-based model (under development; sketch below)
  - look for frequently observed transformations at the subword level:
    - grapheme deletions, insertions, substitutions (e.g. d → b)
    - grapheme inversions (e.g. leed → deel)
    - combinations
  - learn a decision tree per transformation
  - advantages:
    - more generic → a better recall/precision compromise
    - can model real-word and non-real-word errors
  - disadvantage:
    - more complex, time-consuming to train
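A toy sketch of the transformation rules as candidate generators; the rule set is a hand-picked sample, and the "decision tree per transformation" that decides where each rule actually applies is not modelled here.

```python
# Generate candidate misreadings by applying subword transformations.
SUBSTITUTIONS = {"d": "b", "b": "d"}      # e.g. mirror-letter confusion d -> b

def rule_candidates(word: str) -> set[str]:
    out = set()
    for i, ch in enumerate(word):
        out.add(word[:i] + word[i + 1:])                           # deletion
        if ch in SUBSTITUTIONS:
            out.add(word[:i] + SUBSTITUTIONS[ch] + word[i + 1:])   # substitution
    out.add(word[::-1])                                            # inversion
    out.discard(word)
    return out

print(sorted(rule_candidates("leed")))
# ['deel', 'eed', 'led', 'lee', 'leeb']
```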
Slide 8: Modelling results so far
- Measures (over target words with an error; see the sketch below):
  - recall R = number of predicted errors / total number of errors
  - precision P = number of predicted errors / number of predictions
  - F-rate = 2·R·P / (R + P)
  - branch = average number of predictions per word
- Data: test set from the Chorec database
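The measures written out as code; the variable names and example counts are mine, not Chorec results. "Number of predicted errors" is read here as the number of actual errors that the model predicted (hits).

```python
# Recall, precision, F-rate and branching factor from raw counts.
def measures(hits: int, total_errors: int, total_predictions: int, n_words: int):
    r = hits / total_errors               # recall
    p = hits / total_predictions          # precision
    f = 2 * r * p / (r + p)               # F-rate
    branch = total_predictions / n_words  # average number of predictions per word
    return r, p, f, branch

print(measures(hits=40, total_errors=100, total_predictions=80, n_words=50))
# (0.4, 0.5, 0.4444..., 1.6)
```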