Speaker Identification and Verification

Speaker Identification and Verification. Dan Burnett, Nuance. 58th IETF. Terminology ... Speaker verification -- using utterances from a speaker, determine whether the ... – PowerPoint PPT presentation

Slides: 9
Provided by: burn155

1
Speaker Identification and Verification
  • Dan Burnett, Nuance
  • 58th IETF

2
Terminology
  • Speaker identification -- using utterances from a
    speaker, determine who the caller is out of a set
    of known speakers
  • Speaker verification -- using utterances from a
    speaker, determine whether the caller is who
    he/she claims to be (requires an identity claim)
  • Training -- using utterances from a speaker to
    train a unique voiceprint that can later be used
    to identify/verify that speaker. Applies to both
    SI and SV.
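
The three terms above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: utterances are modeled as small feature vectors, a voiceprint is their average, and matching uses cosine similarity with an arbitrary threshold. Real systems use far richer models; the point is only that identification picks the best match from a known set, while verification scores a single claimed identity.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def train(utterances):
    """Training: average a speaker's utterance vectors into a toy voiceprint."""
    n = len(utterances)
    return [sum(column) / n for column in zip(*utterances)]

def identify(voiceprints, utterance):
    """Identification: pick the best-matching speaker from the known set."""
    return max(voiceprints, key=lambda name: cosine(voiceprints[name], utterance))

def verify(voiceprints, claimed, utterance, threshold=0.9):
    """Verification: accept or reject one claimed identity (threshold is arbitrary)."""
    return cosine(voiceprints[claimed], utterance) >= threshold

# Hypothetical speakers and made-up feature vectors.
voiceprints = {
    "alice": train([[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]]),
    "bob": train([[0.0, 1.0, 0.2], [0.1, 0.9, 0.3]]),
}
sample = [0.95, 0.15, 0.05]

who = identify(voiceprints, sample)            # no identity claim needed
accepted = verify(voiceprints, "bob", sample)  # requires an identity claim
```

Note that `identify` needs only the utterance, whereas `verify` cannot run without the `claimed` argument, which mirrors the "requires an identity claim" distinction on the slide.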

3
draft-burnett-mrcpext-00.txt
  • Created by Nuance and Intervoice
  • Proposes extensions to MRCP v1
    (draft-shanmugham-mrcp-04.txt)
  • Based originally on Nuance functionality,
    modified to be more general
  • Starting point for MRCP v2 functionality
    discussions
  • Also includes extensions for speaker-enrolled
    grammars, hotword recognition, and the
    recognition resource

4
Proposed SI/SV process (simplified, see section 6.7)
VER-START-SESSION
VER-BUFFERING-START
VER-SET-VOICEPRINT
VER-BUFFERING-CONTROL
VER-FROM-BUFFER
VER-END-SESSION
VER-BUFFERING-STOP
Requires active buffering and ver/id sessions.
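
The precondition noted on this slide can be sketched as a small state tracker. This is a toy model, not the draft's state machine: it assumes the "requires active buffering and ver/id sessions" note applies to VER-FROM-BUFFER, and the class name and the status strings (`OK`, `REJECTED`, `UNKNOWN-METHOD`) are hypothetical. Only the method names come from the slide.

```python
class VerifierSession:
    """Toy tracker for the two pieces of state the slide mentions."""

    def __init__(self):
        self.session_active = False    # set by VER-START-SESSION
        self.buffering_active = False  # set by VER-BUFFERING-START
        self.voiceprint = None         # set by VER-SET-VOICEPRINT

    def handle(self, method, arg=None):
        if method == "VER-START-SESSION":
            self.session_active = True
        elif method == "VER-END-SESSION":
            self.session_active = False
        elif method == "VER-BUFFERING-START":
            self.buffering_active = True
        elif method == "VER-BUFFERING-STOP":
            self.buffering_active = False
        elif method == "VER-SET-VOICEPRINT":
            self.voiceprint = arg
        elif method == "VER-BUFFERING-CONTROL":
            pass  # semantics not described on the slide; accepted as a no-op here
        elif method == "VER-FROM-BUFFER":
            # Assumed precondition: both a session and buffering must be active.
            if not (self.session_active and self.buffering_active):
                return "REJECTED"
        else:
            return "UNKNOWN-METHOD"
        return "OK"

session = VerifierSession()
too_early = session.handle("VER-FROM-BUFFER")   # nothing active yet
session.handle("VER-START-SESSION")
session.handle("VER-BUFFERING-START")
session.handle("VER-SET-VOICEPRINT", "caller-1234")  # made-up voiceprint id
in_order = session.handle("VER-FROM-BUFFER")
```

Keeping the session flag and the buffering flag separate reflects the fact that the slide lists distinct start/stop methods for each.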

5
Discussion points
  • Why buffering?
  • Registry for return info
  • Anything else before I convert to MRCPv2?

6
Voice/Text Grammar Enrollment (simplified, see section 5.5)
START-ENROLLMENT-SESSION
  • Extension to existing recognition resource
  • Creates speaker-produced grammar entries
  • E.g., voice-enrolled entries for voice dialing
  • Both speech and text can be used to create
    grammar entries

PAUSE/RESUME-ENROLLMENT-SESSION
ENROLLMENT-ROLLBACK
RECOGNIZE/STOP
ADD/DELETE/MODIFY-PHRASE
END/ABORT-ENROLLMENT-SESSION
These methods already exist in the recognizer resource

7
Hotword (see section 7)
  • New recognition resource
  • Instead of listening for a set time period,
    listens continuously until it matches a grammar
  • Non-matching speech is ignored and does not
    affect the state of the recognizer
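
The hotword behavior described above can be sketched as a loop that runs until something matches. As a simplifying assumption, utterances arrive here as already-transcribed strings and the grammar is a plain set of phrases; the function name and sample phrases are made up for illustration.

```python
def hotword_listen(utterances, grammar):
    """Return the first utterance that matches the grammar.

    Non-matching utterances are skipped without changing any state,
    and there is no time limit: the loop ends only on a match or when
    the input stream itself ends.
    """
    phrases = {phrase.lower() for phrase in grammar}
    for text in utterances:
        if text.strip().lower() in phrases:
            return text
        # Non-matching speech: ignored, nothing is recorded or reset.
    return None

stream = ["um", "what time is it", "Operator", "help"]
match = hotword_listen(stream, {"operator", "help"})
```

A recognizer with a fixed listening window would instead return a no-match result when the window expires; here the non-matching input simply never surfaces.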

8
Other Extensions
  • Record method (sec. 4.4)
      • Allows end-pointed recording of an audio stream
  • Interpret method (sec. 4.5)
      • Behaves like recognition, except that text input
        is given instead of an audio stream. It returns
        a standard recognition result minus any
        audio-specific values.
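
The relationship between recognition and the Interpret method can be sketched as two entry points sharing one result builder. The result fields (`input`, `matched`, `interpretation`, `duration_ms`), the dictionary-based grammar, and the function names are all hypothetical; the draft defines the real result format.

```python
def build_result(text, grammar):
    """Shared matching logic: look the text up in a toy phrase grammar."""
    key = text.strip().lower()
    return {
        "input": text,
        "matched": key in grammar,
        "interpretation": grammar.get(key),
    }

def recognize(audio, grammar):
    """Audio path: same result plus an audio-specific value (assumed field)."""
    result = build_result(audio["transcript"], grammar)
    result["duration_ms"] = audio["duration_ms"]
    return result

def interpret(text, grammar):
    """Text path: the same result shape, with no audio-specific values."""
    return build_result(text, grammar)

grammar = {"call home": {"action": "dial", "target": "home"}}
from_text = interpret("Call home", grammar)
from_audio = recognize({"transcript": "call home", "duration_ms": 800}, grammar)
```

Sharing `build_result` keeps the two paths consistent, so a client can test grammar behavior with text before any audio is involved.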