Title: Artificial Brain and Office Mate Based on Brain Information Processing Mechanism
1. Artificial Brain and Office Mate based on Brain Information Processing Mechanism
- 2007. 9. 14
- Young-Ik Kim
- Brain Science Research Center, KAIST
2. Contents
- Introduction
  - About the Brain Neuro-Informatics Research Program of BSRC, KAIST
  - Main functionalities of the human brain
- Implementation of the artificial brain system and its mechanisms
  - Auditory part
  - Vision part
  - Agent (Service) part
- Artificial Brain System Demo
- Toward more challenging problems
3. Introduction
- About the Brain Neuro-Informatics Research Program
  - The third-phase project of the Brain Science Research Center (BSRC) at KAIST.
  - Funded by the Korean Ministry of Commerce, Industry, and Energy.
  - Complete research period: 2004. 7 - 2008. 3
- Research focus
  - Understanding the brain's information processing mechanisms
  - Developing brain-like intelligent systems (Artificial Brain)
4. Introduction
- Motivations
  - Computer technology has advanced greatly, but machines are still limited to simple tasks for which humans must specify what to do.
  - We lack specific, concrete algorithms for solving practical problems in the real world.
  - The human brain is the best model for solving practical real-world problems, which led us to neural networks based on human neural information processing.
5. Main Functionalities of the Human Brain
6. Artificial Brain System - Development Env.
- The development team
  - 11 research groups in 6 universities
  - 3 parts: auditory, vision, agent (secretary)
- Each group generates functional modules
  - developed independently
  - integrated using the de-centralized system service (DSS) on the Microsoft .NET Framework.
  - The common language runtime (CLR) of the .NET Framework allows each module to be developed in any supported language, such as C++, C#, Java, etc.
7. Artificial Brain System Overall Configuration
[Figure: overall system configuration. Block labels:]
- Vision Module: Stereo-Camera, Attention Area, Object Recognition, Face Recognition, Expression Recognition
- Auditory Module: Stereo-Microphone, Sound Localization, Speech Separation, Speaker Recognition, Speech Recognition
- TCP/IP
- Service Module (Agent): Dialog Manager, Context Analysis, Knowledge-Base, Response Sentence Generation, Text-to-Speech, Speaker, Robot Control, Robot Head Movement
8. Auditory Part Module Diagram
- Flow diagram for auditory perception
[Figure: block diagram. Labels: Stereo-Microphone (Speech, Noises), Active Noise Canceller, Auditory Filterbank, Voice Activity Detection, Sound Localization, BSS (ICA), Masking, Speaker Recognition, Speech Recognition, Keyword Recognition]
9. Auditory Part Mechanisms
I. Binaural pathways and sound localization
- The superior olivary complex (SOC) receives bilateral ascending input from the anteroventral cochlear nucleus (AVCN) and descending input from the ipsilateral inferior colliculus (IC).
- The medial superior olive (MSO) cells are sensitive to interaural time difference (ITD), and the lateral superior olive (LSO) cells are sensitive to interaural intensity difference (IID).
10.
- The auditory signal is represented by the times at which upward zero-crossings occur and the peak amplitude within each zero-crossing interval (D. Kim et al., 1999).
- Binaural cue extraction
  - detect zero-crossing times
  - measure zero-crossing interval powers
- The ITD and IID
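
The cue-extraction steps on this slide can be sketched in Python as follows. This is a minimal illustration, not the implementation of D. Kim et al.: the function names, the linear interpolation of crossing times, the nearest-crossing pairing between the two ears, the `max_itd` limit, and the sign conventions (ITD = right minus left, IID = left over right in dB) are all assumptions introduced here. Each input is taken to be one band-passed filterbank channel per ear.

```python
import numpy as np

def upward_zero_crossings(x, fs):
    """Times (s) of upward zero-crossings, linearly interpolated between samples."""
    idx = np.where((x[:-1] < 0) & (x[1:] >= 0))[0]
    frac = -x[idx] / (x[idx + 1] - x[idx])
    return (idx + frac) / fs

def interval_powers(x, fs, times):
    """Mean power of x inside each interval between successive crossings."""
    edges = np.round(times * fs).astype(int)
    return np.array([np.mean(x[a:b] ** 2) if b > a else 0.0
                     for a, b in zip(edges[:-1], edges[1:])])

def binaural_cues(left, right, fs, max_itd=1e-3):
    """Return (ITDs in s, IIDs in dB) for left-ear intervals that have a
    right-ear crossing within max_itd (assumed pairing rule)."""
    tl, tr = upward_zero_crossings(left, fs), upward_zero_crossings(right, fs)
    pl, pr = interval_powers(left, fs, tl), interval_powers(right, fs, tr)
    itds, iids = [], []
    eps = 1e-12  # avoids log(0) on silent intervals
    for i in range(len(pl)):
        if len(tr) == 0:
            break
        j = int(np.argmin(np.abs(tr - tl[i])))  # nearest right-ear crossing
        if j < len(pr) and abs(tr[j] - tl[i]) <= max_itd:
            itds.append(tr[j] - tl[i])
            iids.append(10 * np.log10((pl[i] + eps) / (pr[j] + eps)))
    return np.array(itds), np.array(iids)
```

For a 500 Hz tone that reaches the right ear 0.3 ms later and at half the amplitude, this sketch yields ITDs near 0.3 ms and IIDs near 6 dB.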
11.
- SNR estimation (Y. Kim et al., 2007)
- Identification of reliable ITD samples
  - (a) filtered signal
  - (b) measured ITDs
  - (c) SNR estimation
  - (d) selected ITDs with SNR > 15 dB
12.
- Localization of multiple sound sources
  - (a) SNR-weighted ITD histogram
  - (b) local peaks of the histogram
  - (c) normalized by the largest peak
  - (d) selected dominant peaks with threshold value 0.3
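
Steps (a) to (d), together with the SNR > 15 dB selection of reliable ITD samples, can be sketched as below. The 15 dB and 0.3 thresholds come from the slides; everything else is an assumption made for illustration: the function name `localize_sources`, the bin count, the ITD range, and weighting each sample by its SNR in dB (the slides do not say whether dB or linear weights are used).

```python
import numpy as np

def localize_sources(itds, snrs_db, snr_min_db=15.0, n_bins=41,
                     max_itd=1e-3, peak_threshold=0.3):
    """Return ITD estimates (s) of dominant sources from per-sample ITDs and SNRs."""
    itds = np.asarray(itds, dtype=float)
    snrs_db = np.asarray(snrs_db, dtype=float)

    # Keep only reliable ITD samples.
    keep = snrs_db > snr_min_db

    # (a) SNR-weighted ITD histogram.
    hist, edges = np.histogram(itds[keep], bins=n_bins,
                               range=(-max_itd, max_itd),
                               weights=snrs_db[keep])
    centers = 0.5 * (edges[:-1] + edges[1:])
    if hist.max() <= 0:
        return np.array([])

    # (c) Normalize by the largest peak.
    norm = hist / hist.max()

    # (b) Local peaks: higher than the left neighbor, not lower than the right.
    padded = np.concatenate(([0.0], norm, [0.0]))
    is_peak = (padded[1:-1] > padded[:-2]) & (padded[1:-1] >= padded[2:])

    # (d) Keep dominant peaks above the threshold.
    return centers[is_peak & (norm >= peak_threshold)]
```

The returned values are bin centers, so the resolution of the estimate is limited by the assumed bin width.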
13. Auditory Part Mechanisms
II. Masking of interfering sounds
- Cocktail party problem
  - Human speech perception is robust in the presence of diffuse noise and interfering sounds.
  - Machine speech recognition, however, remains problematic in such conditions.
- Auditory masking?
  - When a sound is masked, it is eliminated from perception as if the sound never reached the ear.
  - Sound sources can be segregated by identifying the segments that belong to each source in the time-frequency domain.
14.
- Directional mask estimation (Y. Kim et al., 2006)
  - Assign each zero-crossing interval power to the source with the nearest ITD
  - Build the mask from the target-to-interferers power ratio for each time-frequency segment
- Example mask estimation (a mixture of 3 sounds)
  - Target and interfering speech located at 0, -30, and 30 degrees.
  - (a) Ideal mask (b) Estimated mask
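
The two mask-estimation steps can be illustrated with the sketch below. It assumes each zero-crossing interval has already been reduced to a tuple of (frame index, channel index, ITD, power), and that the source ITDs are known from the localization stage. The function name `directional_mask`, the binary mask form, and the 0 dB default for `ratio_db` are assumptions for illustration; the slides do not specify how the power ratio is thresholded.

```python
import numpy as np

def directional_mask(frames, channels, itds, powers, source_itds,
                     target=0, ratio_db=0.0):
    """Binary time-frequency mask: True where the target dominates the interferers."""
    frames = np.asarray(frames, dtype=int)
    channels = np.asarray(channels, dtype=int)
    itds = np.asarray(itds, dtype=float)
    powers = np.asarray(powers, dtype=float)
    source_itds = np.asarray(source_itds, dtype=float)

    # Step 1: assign each interval power to the source with the nearest ITD.
    nearest = np.argmin(np.abs(itds[:, None] - source_itds[None, :]), axis=1)

    # Accumulate power per (frame, channel, source) segment.
    shape = (int(frames.max()) + 1, int(channels.max()) + 1, len(source_itds))
    energy = np.zeros(shape)
    np.add.at(energy, (frames, channels, nearest), powers)

    # Step 2: target-to-interferers power ratio per time-frequency segment.
    target_e = energy[..., target]
    interf_e = energy.sum(axis=-1) - target_e
    eps = 1e-12  # avoids division by zero in empty segments
    ratio = 10 * np.log10((target_e + eps) / (interf_e + eps))
    return (ratio > ratio_db) & (target_e > 0)
```

Segments with no target energy are left out of the mask, which is a design choice of this sketch and not something stated on the slide.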
15. Vision Part Module Diagram
- Flow diagram of visual perception
16. Vision Part Mechanisms
- Biological visual pathway of bottom-up and top-down processing
17.
- The segmentation problem?
  - Finding different objects in the image.
  - But what is the image of a single object?
  - Is a nose an object? Is a head one?
- Finding salient regions in an image!
  - The human brain draws attention to the salient object in the image.
  - The saliency of an image may be determined by a combination of local and global aspects.
18.
- The architecture of the bottom-up saliency map model (Choi et al., 2006)
  - I: intensity, E: edge, S: symmetry
  - CSDN: center-surround difference and normalization
  - ICA: independent component analysis
  - SM: saliency map, SP: saliency point
  - IOR: inhibition of return
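
A heavily simplified sketch of the CSDN, SM/SP, and IOR stages is given below. It uses only the intensity feature and omits the edge and symmetry features and the ICA stage, so it is not the model of Choi et al. The box-filter approximation of center and surround, the radii, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter with a (2r+1) x (2r+1) window, using an integral image."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    ii = np.pad(padded.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    h, w = img.shape
    total = ii[k:k + h, k:k + w] - ii[:h, k:k + w] - ii[k:k + h, :w] + ii[:h, :w]
    return total / (k * k)

def center_surround(img, center_r=1, surround_r=4):
    """CSDN stand-in: |fine scale - coarse scale|, normalized to [0, 1]."""
    cs = np.abs(box_blur(img, center_r) - box_blur(img, surround_r))
    span = cs.max() - cs.min()
    return (cs - cs.min()) / span if span > 0 else np.zeros_like(cs)

def saliency_points(intensity, n_points=3, ior_radius=5):
    """Pick saliency points one by one, suppressing each with inhibition of return."""
    sm = center_surround(np.asarray(intensity, dtype=float))  # saliency map
    yy, xx = np.mgrid[:sm.shape[0], :sm.shape[1]]
    points = []
    for _ in range(n_points):
        y, x = np.unravel_index(np.argmax(sm), sm.shape)  # most salient point
        points.append((int(y), int(x)))
        # IOR: zero out a disc around the selected point so attention moves on.
        sm[(yy - y) ** 2 + (xx - x) ** 2 <= ior_radius ** 2] = 0.0
    return points
```

On an image with one bright blob and one dimmer blob, the sketch visits the bright blob first and the dimmer one second, which is the scan-path behavior that inhibition of return is meant to produce.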
19.
- Experimental results of bottom-up selective attention
  - The saliency map model generates candidate regions of interest.
20. Service Part - Modules
21. Service Part - Scenarios
- Service domains of the OfficeMate
  - schedule management
  - patent search
  - new knowledge acquisition from the internet
  - object perception in an office
- A demo for schedule management
22. Toward More Challenging Problems
- Keyword spotting model with top-down attention
- Context-dependent information processing
23. Selective Attention with an HMM (C. Lee et al., 2007)
- Train HMMs with the training set
- For a test pattern, calculate the likelihood for all classes
- Choose the Nc best classes as candidates
- For each candidate model,
  1) Set the attention filter to 0.
  2) Update the attention filter
  3) Calculate the new likelihood of the changed input
  4) Repeat 2)-3) until the likelihood converges
  5) Calculate the confidence measure M
- Choose the class with the maximum M
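
The control flow of this procedure can be sketched as below, under strong simplifying assumptions. Diagonal Gaussian class models stand in for the HMMs, the attention filter is an additive offset on the input updated by gradient ascent, and the confidence measure M is a hypothetical choice (adapted log-likelihood minus a penalty on the size of the filter). None of these choices is taken from C. Lee et al.; they only make the loop concrete and runnable.

```python
import numpy as np

def log_likelihood(x, mean, var):
    """Log-likelihood of x under a diagonal Gaussian (stand-in for an HMM score)."""
    return float(-0.5 * np.sum((x - mean) ** 2 / var + np.log(2 * np.pi * var)))

def attend_and_classify(x, models, n_candidates=2, lr=0.1, penalty=0.5,
                        tol=1e-8, max_iter=1000):
    """models maps class name -> (mean, var). Returns (best class, confidence dict)."""
    x = np.asarray(x, dtype=float)

    # Likelihood for all classes, then keep the Nc best as candidates.
    scores = {name: log_likelihood(x, m, v) for name, (m, v) in models.items()}
    candidates = sorted(scores, key=scores.get, reverse=True)[:n_candidates]

    confidence = {}
    for name in candidates:
        mean, var = models[name]
        a = np.zeros_like(x)                      # 1) attention filter set to 0
        prev = log_likelihood(x, mean, var)
        for _ in range(max_iter):
            # 2) update the filter: ascend the penalized log-likelihood
            grad = (mean - (x + a)) / var - 2 * penalty * a
            a = a + lr * grad
            cur = log_likelihood(x + a, mean, var)  # 3) likelihood of changed input
            if abs(cur - prev) < tol:               # 4) stop when it converges
                break
            prev = cur
        # 5) hypothetical confidence measure: high likelihood with little change
        confidence[name] = cur - penalty * float(np.sum(a ** 2))

    return max(confidence, key=confidence.get), confidence
```

The step size `lr` must be small relative to the inverse variances for the update to converge; the defaults here suit unit-variance models only.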
24. Keyword Spotting Model with Attention
[Figure: block diagram. Labels: signal, FB, VAD, Attention Filter, Activation Attention, Confidence Measure, OOV Rejection, Compare Likelihood, Decision Making, Keyword?]
25. Keyword spotting performance with selective attention (SA)
26. Context-dependent information processing
- What is a context?
  - In memory, our experiences are represented in structures that cluster related information together.
  - Little is known about the neural underpinnings of contextual analysis and scene perception.
- Searching for relevant mechanisms
  - K-Line by M. Minsky, The Society of Mind, 1986.
  - Sequence seeking and counter streams by S. Ullman, Cereb. Cortex, 1995.
  - Proactive brain using analogies and associations by M. Bar, TRENDS in Cog. Sci., 2007.
27. Translating analogies to predictive association (M. Bar, 2007)
28. Context-dependent information processing
- Some big questions!
  - What are the computational mechanisms mediating the transformation of a past memory into a future thought?
  - How does the brain handle completely novel situations where no reliable predictions can be generated?
  - ...
- Try keyword-based context generation
  - In our service area, there are 4 static domains.
  - Using the keywords of each domain, we can switch the context among our service domains.
  - But in real situations, the keywords cannot be pre-determined!
  - More dynamic context generation and management are needed for efficient services.
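
Keyword-based context switching over the four static service domains can be sketched as follows. The domain names follow the service scenarios slide; the keyword lists, the function name `update_context`, and the rule of keeping the current context when no keyword matches are hypothetical choices for illustration.

```python
import re

# Hypothetical keyword sets for the four static service domains.
DOMAIN_KEYWORDS = {
    "schedule management": {"schedule", "meeting", "appointment"},
    "patent search": {"patent", "invention"},
    "knowledge acquisition": {"internet", "learn", "news"},
    "object perception": {"object", "look", "see"},
}

def update_context(utterance, current=None):
    """Return the domain whose keywords best match the utterance.

    If no keyword matches, the current context is kept unchanged.
    """
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    scores = {domain: len(words & keywords)
              for domain, keywords in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else current
```

The sketch also shows the limitation named on the slide: an utterance that uses none of the pre-determined keywords cannot change the context.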
29. Conclusions
- The human brain is the best model for solving practical problems in the real world.
- Our artificial brain system, OfficeMate, incorporates many current findings on information processing mechanisms in the human brain.
- New and challenging research areas are waiting for our attention!
- Thank you. Any research ideas or comments are welcome! youngik_at_kaist.ac.kr