Title: SAKHR Automatic Speech Recognition ASR
1. SAKHR Automatic Speech Recognition (ASR)
- Wael Zohny
- Senior Programmer
2. Outline
- Sakhr ASR main features
- Sakhr ASR contexts
- Sakhr Microsoft platform for telephony applications development
- Sakhr ASR main functionality
- Sakhr Native products line
- Sakhr Auto-Attendant
- Large Directory Assistance application
- Appendix: COM object building and using on VOS and Envox
3. Sakhr ASR main features
- Speaker-independent continuous speech recognition engine
- Zero training required
- Intelligent recognition context handling architecture
- Dynamic recognition contexts
- N-best recognition result format
- Accuracy levels: 98.5%
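An N-best result format means the recognizer returns several ranked hypotheses per utterance instead of a single string. The following is a minimal sketch of how an application might consume such a list. The `Hypothesis` struct, the `topN` helper, and the confidence scale are hypothetical names chosen for illustration; they are not part of the Sakhr SDK.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for one recognition alternative (not an SDK type).
struct Hypothesis {
    std::string text;   // recognized utterance
    double confidence;  // assumed score, higher is better
};

// Return at most n hypotheses, best first, dropping any whose
// confidence is below minConfidence.
std::vector<Hypothesis> topN(std::vector<Hypothesis> all, std::size_t n,
                             double minConfidence) {
    all.erase(std::remove_if(all.begin(), all.end(),
                             [minConfidence](const Hypothesis& h) {
                                 return h.confidence < minConfidence;
                             }),
              all.end());
    // stable_sort keeps the engine's original order among equal scores.
    std::stable_sort(all.begin(), all.end(),
                     [](const Hypothesis& a, const Hypothesis& b) {
                         return a.confidence > b.confidence;
                     });
    if (all.size() > n) {
        all.erase(all.begin() + static_cast<std::ptrdiff_t>(n), all.end());
    }
    return all;
}
```

A dialog can then confirm the top hypothesis with the caller and fall back to the next one on rejection, which is the usual reason to ask for more than one result.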
4. Sakhr ASR main features
- Keyword Spotting
- Barge-in
- No initialization time for application
- Multiple simultaneous contexts recognition
- Entropy-based speech/silence end-pointing mechanism
- Computer Telephony optimized structure
- (Custom Telephony Cards Recorder, direct support for the telephony A-law format, DTMF access)
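Direct A-law support matters because telephony cards typically deliver 8-bit companded audio rather than linear PCM. As a minimal sketch, assuming the standard ITU-T G.711 A-law encoding, one sample can be expanded to 16-bit linear PCM as shown below. The function names are illustrative, and the slides do not say how the Sakhr engine performs this step internally.

```cpp
#include <cstdint>
#include <vector>

// Decode one 8-bit G.711 A-law sample to 16-bit linear PCM.
std::int16_t alawToLinear(std::uint8_t a) {
    a ^= 0x55;                            // undo the alternate-bit inversion
    int magnitude = (a & 0x0F) << 4;      // 4-bit mantissa
    const int segment = (a & 0x70) >> 4;  // 3-bit segment (exponent)
    if (segment == 0) {
        magnitude += 8;                   // linear region near zero
    } else {
        magnitude = (magnitude + 0x108) << (segment - 1);
    }
    // In A-law, a set sign bit marks a positive sample.
    return static_cast<std::int16_t>((a & 0x80) ? magnitude : -magnitude);
}

// Decode a whole buffer of A-law bytes.
std::vector<std::int16_t> decodeAlaw(const std::vector<std::uint8_t>& in) {
    std::vector<std::int16_t> out;
    out.reserve(in.size());
    for (std::uint8_t byte : in) {
        out.push_back(alawToLinear(byte));
    }
    return out;
}
```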
5. Sakhr ASR contexts
- Flat contexts.
- Grammar built contexts.
- Large Vocabulary Contexts (LVX).
- SAPI Contexts.
6. Sakhr Microsoft platform for telephony applications development
- The Microsoft Speech development platform.
- SALT
- The SAPI interface
- Sakhr SAPI compliant engines
8. Sakhr Microsoft platform for telephony applications development

    <form id="Form1" method="post" runat="server">
      <speech:QA id="QA1" style="Z-INDEX: 100; LEFT: 246px; POSITION: absolute; TOP: 215px"
          runat="server" OnClientComplete="OnRecoComplete">
        <Prompt InlinePrompt="Please, Say the city name and state your request?"></Prompt>
        <Answers>
          <speech:Answer SemanticItem="City" XpathTrigger="/SML/CITY"></speech:Answer>
          <speech:Answer SemanticItem="Verb" XpathTrigger="/SML/VERB"></speech:Answer>
          <speech:Answer SemanticItem="Method" XpathTrigger="/SML/Method"></speech:Answer>
          <speech:Answer SemanticItem="Number" XpathTrigger="/SML/Number"></speech:Answer>
          <speech:Answer SemanticItem="Person" XpathTrigger="/SML/Name"></speech:Answer>
        </Answers>
        <Reco InitialTimeout="30000" Lang="en-US" BabbleTimeout="100000"
            StartElement="Button1" EndSilence="1000" MaxTimeout="100000"
            StartEvent="onclick" StopEvent="onclick" StopElement="Button2" Mode="Single">
          <Grammars>
            <speech:Grammar Src="Dialer.grxml"></speech:Grammar>
          </Grammars>
        </Reco>
      </speech:QA>
      <script src="ClientProc.js"></script>
    </form>
9. Sakhr ASR main functionality: ASR Interfaces
- The ASR SDK is designed to simplify integration by offering several interface methods: standard DLL, COM object, ActiveX, or SAPI interface.
10. Sakhr ASR main functionality: Main ASR DLL APIs
- Channel usage
- HRESULT asrChannelInit(DWORD dwType, DWORD dwDefaultDeviceType, DWORD* pdwChannelID)
- HRESULT asrChannelAddFile(DWORD dwChannelID, const char* strRecFileName)
- HRESULT asrChannelAddBuffer(DWORD dwChannelID, void* pvBuffer, DWORD dwDataBytes)
- Context usage
- HRESULT asrCtxLoad(const char* strCtxFileName, DWORD* pdwCtxID)
- HRESULT asrCtxUnload(DWORD dwCtxID)
11. Sakhr ASR main functionality: Main ASR DLL APIs
- Request usage
- HRESULT ASR_API asrRequestCreate(void** ppvRequestObj, DWORD dwChannelID)
- HRESULT ASR_API asrRequestDestroy(void* pvRequestObj)
- HRESULT ASR_API asrRequestAddContext(void* pvRequestObj, DWORD dwCtxID)
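Because every request created with asrRequestCreate must later be released with asrRequestDestroy, a C++ caller may want to tie the two together. The sketch below shows one way to do that with a small RAII wrapper. The three asr* functions here are hypothetical stubs so that the example compiles without the SDK, their pointer types are inferred from the parameter-name prefixes on the slide, the `ASR_API` macro is omitted, and the `Request` class is not part of the SDK.

```cpp
#include <cstdint>

namespace sketch_request {

// Stand-ins for the Windows types so the sketch builds anywhere.
using DWORD = std::uint32_t;
using HRESULT = std::int32_t;

int g_liveRequests = 0;  // stub bookkeeping only, not an SDK variable

// --- Hypothetical stubs in place of the real DLL exports ---
HRESULT asrRequestCreate(void** ppvRequestObj, DWORD /*dwChannelID*/) {
    *ppvRequestObj = new int(0);
    ++g_liveRequests;
    return 0;
}
HRESULT asrRequestDestroy(void* pvRequestObj) {
    delete static_cast<int*>(pvRequestObj);
    --g_liveRequests;
    return 0;
}
HRESULT asrRequestAddContext(void* pvRequestObj, DWORD /*dwCtxID*/) {
    ++*static_cast<int*>(pvRequestObj);  // stub: count attached contexts
    return 0;
}

// Owns one request handle and destroys it when it goes out of scope.
class Request {
public:
    explicit Request(DWORD channelId) {
        if (asrRequestCreate(&handle_, channelId) < 0) handle_ = nullptr;
    }
    ~Request() {
        if (handle_) asrRequestDestroy(handle_);
    }
    Request(const Request&) = delete;             // a handle has one owner
    Request& operator=(const Request&) = delete;

    bool valid() const { return handle_ != nullptr; }
    bool addContext(DWORD ctxId) {
        return handle_ && asrRequestAddContext(handle_, ctxId) >= 0;
    }
    void* get() const { return handle_; }         // for the async calls

private:
    void* handle_ = nullptr;
};

}  // namespace sketch_request
```

With this shape, an early return or an exception between creation and recognition cannot leak the request object.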
12. Sakhr ASR main functionality: Main ASR DLL APIs
- Recognition
- HRESULT asrRecognizeFileCtx(DWORD dwChannelID, DWORD dwCtxID, const char* strRecFileName, char* strResult, double* pdConfidenceScore, bool bConcatenate)
- HRESULT asrRecognizeBuffCtx(DWORD dwChannelID, DWORD dwCtxID, void* pvRecBuffer, DWORD dwRecBufferLength, char* strResult, double* pdConfidenceScore, bool bConcatenate)
- HRESULT asrRecognizeDeviceCtx(DWORD dwChannelID, DWORD dwDevID, DWORD dwCtxID, char* strResult, double* pdConfidenceScore, bool bConcatenate)
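The calls listed so far suggest a simple synchronous flow: initialize a channel, load a context, recognize a file against it, then unload the context. The sketch below walks through that order. It makes several assumptions that the slides do not confirm: the asr* functions are hypothetical stubs with canned output, the pointer types are inferred from the parameter names, the zero values passed for `dwType` and `dwDefaultDeviceType` are placeholders, the result buffer is assumed to be caller-allocated, and the meaning of `bConcatenate` is unknown.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

namespace sketch_sync {

using DWORD = std::uint32_t;   // stand-in for the Windows type
using HRESULT = std::int32_t;  // stand-in for the Windows type
inline bool failed(HRESULT hr) { return hr < 0; }  // mirrors FAILED()

// --- Hypothetical stubs in place of the real DLL exports ---
HRESULT asrChannelInit(DWORD, DWORD, DWORD* pdwChannelID) {
    *pdwChannelID = 1;
    return 0;
}
HRESULT asrCtxLoad(const char* strCtxFileName, DWORD* pdwCtxID) {
    if (!strCtxFileName || !*strCtxFileName) return -1;  // stub: reject empty
    *pdwCtxID = 7;
    return 0;
}
HRESULT asrCtxUnload(DWORD) { return 0; }
HRESULT asrRecognizeFileCtx(DWORD, DWORD, const char*, char* strResult,
                            double* pdConfidenceScore, bool) {
    std::strcpy(strResult, "cairo");  // canned answer for the sketch
    *pdConfidenceScore = 0.9;
    return 0;
}

struct Recognition {
    bool ok;
    std::string text;
    double confidence;
};

// Recognize one audio file against one context file.
Recognition recognizeFile(const char* ctxFile, const char* audioFile) {
    Recognition out{false, "", 0.0};
    DWORD channel = 0;
    DWORD ctx = 0;
    if (failed(asrChannelInit(0, 0, &channel))) return out;  // placeholders
    if (failed(asrCtxLoad(ctxFile, &ctx))) return out;

    char buffer[512] = {0};  // assumed size; the slides give no limit
    double score = 0.0;
    const HRESULT hr =
        asrRecognizeFileCtx(channel, ctx, audioFile, buffer, &score, false);
    asrCtxUnload(ctx);       // release the context on every path
    if (failed(hr)) return out;

    out.ok = true;
    out.text = buffer;
    out.confidence = score;
    return out;
}

}  // namespace sketch_sync
```

Checking every HRESULT before moving on keeps a failed context load from turning into a recognition call on an invalid ID.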
13. Sakhr ASR main functionality: Main ASR DLL APIs
- Recognition
- HRESULT asrRecognizeFileCtxAsync(void* pvRequestObj, DWORD dwCtxID, const char* strRecFileName, DWORD AsrCallBackFunctionSimpleResult)
- HRESULT asrRecognizeBuffCtxAsync(void* pvRequestObj, DWORD dwCtxID, void* pvRecBuffer, DWORD dwRecBufferLength, DWORD AsrCallBackFunctionSimpleResult)
- HRESULT asrRecognizeDeviceCtxAsync(void* pvRequestObj, DWORD dwDevID, DWORD dwCtxID, DWORD AsrCallBackFunctionSimpleResult)
14. Sakhr ASR main functionality: Main ASR DLL APIs
- Results Handling
- HRESULT asrGetResultsCount(DWORD dwResultCollectionID, DWORD* pdwResultsCount)
- HRESULT asrGetAlternativesCount(DWORD dwResultCollectionID, DWORD dwResultIndex, DWORD* pAltsCount)
- HRESULT asrGetResultAt(DWORD dwResultCollectionID, DWORD nResultIndex, DWORD dwAltIndex, char* strUtterence)
- HRESULT asrFilterResultUtterence(char* strUtterenceIn, char* strUtterenceOut)
- HRESULT asrMapResultUtterence(char* strUtterenceIn, char* strUtterenceOut)
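The three getter functions imply a two-level layout: a result collection holds results, and each result holds alternatives. The sketch below enumerates such a collection into nested vectors. The asr* functions are hypothetical stubs backed by canned data so the example runs without the SDK, their pointer types are inferred from the parameter names, and the 512-byte text buffer is an assumed size.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

namespace sketch_results {

using DWORD = std::uint32_t;   // stand-in for the Windows type
using HRESULT = std::int32_t;  // stand-in for the Windows type
inline bool failed(HRESULT hr) { return hr < 0; }

// Canned data for the stubs: one result with two alternatives.
const std::vector<std::vector<std::string>> kFake = {
    {"call ahmed", "call ahmad"}};

// --- Hypothetical stubs in place of the real DLL exports ---
HRESULT asrGetResultsCount(DWORD, DWORD* pdwResultsCount) {
    *pdwResultsCount = static_cast<DWORD>(kFake.size());
    return 0;
}
HRESULT asrGetAlternativesCount(DWORD, DWORD dwResultIndex,
                                DWORD* pAltsCount) {
    if (dwResultIndex >= kFake.size()) return -1;
    *pAltsCount = static_cast<DWORD>(kFake[dwResultIndex].size());
    return 0;
}
HRESULT asrGetResultAt(DWORD, DWORD nResultIndex, DWORD dwAltIndex,
                       char* strUtterence) {
    if (nResultIndex >= kFake.size()) return -1;
    if (dwAltIndex >= kFake[nResultIndex].size()) return -1;
    std::strcpy(strUtterence, kFake[nResultIndex][dwAltIndex].c_str());
    return 0;
}

// Read every alternative of every result in the collection.
std::vector<std::vector<std::string>> collectResults(DWORD collectionId) {
    std::vector<std::vector<std::string>> all;
    DWORD results = 0;
    if (failed(asrGetResultsCount(collectionId, &results))) return all;
    for (DWORD r = 0; r < results; ++r) {
        DWORD alts = 0;
        if (failed(asrGetAlternativesCount(collectionId, r, &alts))) break;
        std::vector<std::string> row;
        for (DWORD a = 0; a < alts; ++a) {
            char text[512] = {0};  // assumed buffer size
            if (failed(asrGetResultAt(collectionId, r, a, text))) break;
            row.push_back(text);
        }
        all.push_back(row);
    }
    return all;
}

}  // namespace sketch_results
```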
15. Sakhr Native products line
- Sakhr Auto-Attendant (ALLO)
- Large Directory Assistance application
- (more than 100,000 Names)
16. Sakhr Native products line: Sakhr Auto-Attendant (ALLO)
- Incoming call management
- Outgoing call management
17. Sakhr Native products line: Sakhr Auto-Attendant (ALLO)
[Diagram: ASR]
18. Sakhr Native products line: Sakhr Auto-Attendant (ALLO)
- Outgoing call management
- No need for human operators (24/7)
- Business and private calls classification
- Security, Privacy
- Send Voice E-mails
- Monthly detailed bill for each employee (through e-mail or web)
- Administration reports on call costs for each department and each employee
- Easy deployment through simple web interface
20. Sakhr Native products line: Large Directory Assistance Application
- Helps the user to retrieve the subscriber number.
- Contains White and Yellow pages.
- Superior to human-operated systems.
- Speech technology is a MUST.
- ASR (Automatic Speech Recognition)
- TTS (Text-To-Speech)
21. Sakhr Native products line: Large Directory Assistance Application
- Regional Limitations
- No clear last name and, in many cases, multiple last names.
- Complex names.
- Multiple entries with the same names (look-alikes).
- Many alternate combinations for the same entry.
23. Appendix: COM object building and using on VOS and Envox
- COM Objects building
- VB
- VC
- VC
- Using in CT
- VOS
- Envox