Philip JB Jackson - PowerPoint PPT Presentation

1
Is a speech recognition system an intelligent
machine?
  • Philip JB Jackson

Centre for Vision, Speech and Signal Processing, University of Surrey
2
Evolutionary drivers
  • The development of speech communication in Homo
    sapiens is driven by evolutionary forces
      • relaying the location of natural resources
      • escaping predators
      • coordinating hunting
      • passing on learnt skills
      • monitoring the well-being of the community
      • and wooing!

3
Capabilities of an intelligent being
  • Communication: entropy management
  • Behaviour: energy management
  • Planning: time management

4
The human vocal tract
5
The speech chain
6
The speech loop
7
Cognition control
8
Qualities of intelligence
INTRODUCTION
  • sophisticated perceptual abilities
  • complex behaviour
  • understanding of the world
  • resilience to error
  • self-awareness
  • adaptability

9
1. Perception
INTELLIGENCE
  • feature extraction
  • active sensing
  • multi-modal integration

10
2. Behaviour
INTELLIGENCE
  • complex actions
  • inference and reasoning
  • intention
  • autonomy

11
3. World model
INTELLIGENCE
  • objects
      • with attributes
  • actions
      • can act on certain objects
  • community (relations)
  • environment

12
4. Robustness
INTELLIGENCE
  • flexibility
  • strategies that allow for mistakes
      • e.g., trial and error
  • redundancy

13
5. Self model
INTELLIGENCE
  • present position
  • present activity
  • fitness/ability
  • bill of health
  • identity

14
6. Ability to adapt
INTELLIGENCE
  • learning
      • new experience
  • adaptation
      • change in context/environment
  • update
      • new knowledge or structure
  • feedback
      • perception of own performance

15
Attributes of an intelligent entity
INTELLIGENCE
  • Sensory input
  • Memory, processor, actuator
  • Models
  • Robust architecture
  • States
  • Learning, adaptation, update, feedback

16
Automatic speech recognition
RECOGNIZER
17
Automatic speech recognition
RECOGNIZER
  • Audio(/video) input processing
  • As an expert system, relating to
      • transcription, database entries, commands,
        natural language
  • Hidden Markov models
  • Training, and noise and speaker adaptation
      • EM using the Baum-Welch algorithm, PMC/spectral
        subtraction, MLLR/formant normalisation
  • Decoding
      • Viterbi algorithm
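
Of the noise-adaptation techniques named on this slide, spectral subtraction is the simplest to illustrate. The sketch below is only one basic form of it, working on per-frame power spectra held as plain lists. The function names, the over-subtraction factor `alpha` and the spectral `floor` are assumptions made for illustration and are not taken from the presentation.

```python
def estimate_noise(noise_frames):
    """Average the power spectra of frames assumed to contain noise only."""
    n_bins = len(noise_frames[0])
    n_frames = len(noise_frames)
    return [sum(frame[k] for frame in noise_frames) / n_frames
            for k in range(n_bins)]


def spectral_subtract(power, noise, alpha=1.0, floor=0.01):
    """Subtract the noise estimate from one frame's power spectrum.

    alpha scales the noise estimate (hypothetical tuning parameter);
    floor keeps each bin at no less than a small fraction of its
    original power, so that no bin goes negative.
    """
    return [max(p - alpha * n, floor * p) for p, n in zip(power, noise)]


# Two noise-only frames, then one noisy speech frame (made-up numbers).
noise = estimate_noise([[1.0, 2.0], [3.0, 2.0]])
cleaned = spectral_subtract([10.0, 1.0], noise)
```

In practice the noise estimate would come from non-speech segments of the recording, and the cleaned spectrum would feed the feature extraction stage.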

18
Language understanding
RECOGNIZER
  • Symbols
      • description
  • Word network
      • syntax
  • Language model
      • priors
  • transcript
      • output
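
The idea of a language model supplying priors over word sequences can be sketched with a bigram model. This is a minimal illustration that assumes add-one smoothing and a toy corpus; the slide does not say which kind of language model or smoothing is meant, and the function names are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their history words over tokenised sentences."""
    histories, bigrams, vocab = Counter(), Counter(), set()
    for words in sentences:
        tokens = ["<s>"] + list(words) + ["</s>"]
        vocab.update(tokens)
        histories.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return histories, bigrams, len(vocab)


def log_prior(words, model):
    """Log prior of a word sequence under the add-one-smoothed bigram model."""
    histories, bigrams, vocab_size = model
    tokens = ["<s>"] + list(words) + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens[:-1], tokens[1:]):
        # Add-one smoothing: unseen bigrams still get a small probability.
        total += math.log((bigrams[(prev, cur)] + 1)
                          / (histories[prev] + vocab_size))
    return total


# Toy corpus, invented for this example.
model = train_bigram([
    ["recognise", "speech"],
    ["recognise", "speech"],
    ["wreck", "a", "nice", "beach"],
])
seen = log_prior(["recognise", "speech"], model)
unseen = log_prior(["speech", "recognise"], model)
```

A decoder would add this log prior to the acoustic log likelihood of each hypothesis, so that word orders seen in training are preferred over unseen ones.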

19
Understanding of the world
RECOGNIZER
  • models
      • refinement of perception during training, speaker
        adaptation (off-/on-line)
  • symbol grounding
      • correlations of sounds and experience
          • e.g., onomatopoeia, pop, bang, whizz, whisper
      • correlations of words and experience
          • e.g., semantic context, light/dark, noisy
  • developmental
  • social context
      • sympathy, altruism, social behaviour

20
Details of front end
RECOGNIZER
  • Auditory analogy
  • noise robustness
  • speaker models and adaptation
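
The slide does not say which auditory model the front end borrows from. One widely used auditory analogy is the mel scale, which spaces filterbank channels roughly evenly in perceived pitch rather than in Hz. The sketch below uses the common 2595 log10(1 + f/700) formula; its use here is an assumption, and `mel_centres` is a hypothetical helper.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (common 2595*log10 formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_centres(f_low, f_high, n_filters):
    """Centre frequencies in Hz of n_filters channels equally spaced in mels."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


# Ten channels between 0 and 8 kHz: narrow spacing at low frequencies,
# wide spacing at high frequencies.
centres = mel_centres(0.0, 8000.0, 10)
```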

21
Decoding with HMMs
RECOGNIZER
  • What is the most likely interpretation of what I
    just heard?
  • Hypotheses
  • Pruning
  • Traceback
  • Gaussian pdfs are a natural choice of building
    block for emission probabilities
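
The decoding steps listed on this slide (hypotheses, pruning, traceback, Gaussian emissions) can be sketched as a small Viterbi decoder. This is a minimal illustration that assumes scalar observations, one Gaussian per state and a simple beam for pruning. The function names and the `beam` parameter are hypothetical, and a real recogniser would use mixtures over feature vectors.

```python
import math

NEG_INF = float("-inf")


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def viterbi(obs, log_init, log_trans, means, variances, beam=None):
    """Return the most likely state path and its log score."""
    n = len(log_init)
    # Hypotheses: best log score of any path ending in each state.
    score = [log_init[s] + log_gauss(obs[0], means[s], variances[s])
             for s in range(n)]
    backptrs = []
    for x in obs[1:]:
        if beam is not None:
            # Pruning: drop hypotheses too far below the current best.
            best = max(score)
            score = [v if v >= best - beam else NEG_INF for v in score]
        new_score, ptr = [], []
        for s in range(n):
            prev = max(range(n), key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[prev] + log_trans[prev][s]
                             + log_gauss(x, means[s], variances[s]))
            ptr.append(prev)
        score = new_score
        backptrs.append(ptr)
    # Traceback: follow the stored pointers from the best final state.
    state = max(range(n), key=lambda s: score[s])
    best_score = score[state]
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path, best_score


# Two states with means 0 and 5; all numbers are made up for the example.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
path, score = viterbi([0.1, -0.2, 5.1, 4.9], log_init, log_trans,
                      means=[0.0, 5.0], variances=[1.0, 1.0], beam=5.0)
```

Working in the log domain avoids numerical underflow on long utterances, and the beam trades a small risk of search error for less computation.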

22
Training process
RECOGNIZER
  • Initialisation
      • monophone models
      • alignment
  • Refinement
      • triphones
      • clustering
      • tied mixture models
      • multi-modal techniques
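
The step from monophone to triphone models can be illustrated by expanding a phone sequence into context-dependent labels. The sketch assumes the left-minus, right-plus label notation used by toolkits such as HTK. The function name and the phone symbols are hypothetical, and no cross-word context is modelled.

```python
def to_triphones(phones):
    """Expand a monophone sequence into context-dependent triphone labels.

    Each label has the form left-centre+right. The first and last phones
    keep only the context that exists.
    """
    labels = []
    for i, phone in enumerate(phones):
        left = phones[i - 1] + "-" if i > 0 else ""
        right = "+" + phones[i + 1] if i < len(phones) - 1 else ""
        labels.append(left + phone + right)
    return labels


# Hypothetical phone string for the word "speech".
labels = to_triphones(["s", "p", "iy", "ch"])
```

Because many triphones are rare or unseen in the training data, the clustering and tying steps listed above share parameters between similar contexts.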

23
Features of current systems
SUMMARY
  • Speaker ID
  • Speaker normalisation
  • Tracking of speech dynamics
  • Integration of multimodal stimuli
  • Auditory scene analysis
  • Source separation (cocktail party)
  • Parallel model combination

24
For the future
SUMMARY
  • Auditory/vocal behaviour of locust/bird/chimp/?
  • Better treatment of fluent speech
      • coarticulation/dialogue language models/accents
  • Training for a well-rounded artefact
      • defence/business news/your own offspring!
  • The Apple and the Cross