Title: Unsupervised recurrent networks
1 Unsupervised recurrent networks
- Barbara Hammer, Institute of Informatics, Clausthal University of Technology
2 Brocken
6 Prototype-based clustering
7 Prototype-based clustering
- data contained in a real-vector space
- prototypes characterized by locations in the data space
- clustering induced by the receptive fields based on the Euclidean metric (written out below)
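The receptive fields themselves are not spelled out in the transcript; a standard way to write them, assuming prototypes w_1, ..., w_N living in the data space, is

\[ R_j = \{\, x \in \mathbb{R}^n \mid \|x - w_j\| \le \|x - w_k\| \ \text{for all } k \,\} \]

i.e. every data point is assigned to its closest prototype.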
8 Vector quantization
- init prototypes
- repeat:
  - present a data point
  - adapt the winner into the direction of the data point (sketched in code below)
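A minimal sketch of this online vector-quantization loop in Python/NumPy; the function name, learning rate, and initialisation from randomly chosen data points are my own choices, not from the slides.

import numpy as np

def online_vq(data, n_prototypes=10, epochs=10, lr=0.1, seed=0):
    """Online vector quantization: only the winner moves towards each presented sample."""
    rng = np.random.default_rng(seed)
    # init prototypes on randomly chosen data points
    prototypes = data[rng.choice(len(data), n_prototypes, replace=False)].copy()
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            # winner = closest prototype in the Euclidean metric
            winner = np.argmin(np.linalg.norm(prototypes - x, axis=1))
            # adapt the winner into the direction of the data point
            prototypes[winner] += lr * (x - prototypes[winner])
    return prototypes

Called for example as online_vq(np.random.randn(1000, 2)), this reproduces the repeat/present/adapt loop of the slide.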
9 Cost function
- vector quantization minimizes the quantization cost function
- the online update is a stochastic gradient descent on this cost (a standard form is given below)
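The cost function appears only as an image in the original slides; a standard form consistent with the update above (my notation: R_j the receptive field of prototype w_j, I(x) the winner for x, η the learning rate) is

\[ E = \tfrac{1}{2} \sum_j \sum_{x \in R_j} \|x - w_j\|^2, \qquad \Delta w_{I(x)} = \eta \,\bigl(x - w_{I(x)}\bigr) \]

so presenting single data points and moving the winner is exactly a stochastic gradient step on E.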
10 Neighborhood cooperation
- Self-Organizing Map: neighborhood given by a regular lattice of neuron indices j = (j1, j2)
- Neural gas: data-optimum topology, neighborhood given by distance ranks in data space (both update rules are sketched below)
[Figure: SOM lattice vs. neural gas neighborhood]
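A sketch of the two kinds of neighborhood cooperation, assuming float NumPy arrays prototypes (N × n), a SOM lattice grid of neuron coordinates (N × 2), and Gaussian neighborhood functions; parameter names and values are illustrative only.

import numpy as np

def som_update(prototypes, grid, x, lr=0.1, sigma=1.0):
    """SOM step: every neuron moves, weighted by its lattice distance to the winner."""
    winner = np.argmin(np.linalg.norm(prototypes - x, axis=1))
    lattice_dist = np.linalg.norm(grid - grid[winner], axis=1)
    h = np.exp(-lattice_dist**2 / (2 * sigma**2))      # neighborhood on the regular lattice
    prototypes += lr * h[:, None] * (x - prototypes)

def ng_update(prototypes, x, lr=0.1, lam=1.0):
    """Neural gas step: neighborhood given by the rank of the distance in data space."""
    dist = np.linalg.norm(prototypes - x, axis=1)
    rank = np.argsort(np.argsort(dist))                # 0 = closest prototype
    h = np.exp(-rank / lam)
    prototypes += lr * h[:, None] * (x - prototypes)

In both cases the winner receives the full update and the other neurons follow it, graded by lattice distance (SOM) or by rank (neural gas).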
11 Clustering recurrent data
13 Old models
14 Old models
- leaky integration (Temporal Kohonen Map, TKM)
- sequence x_1, x_2, x_3, x_4, ..., x_t, ...
- d(x_t, w_i) = ||x_t - w_i||² + α · d(x_{t-1}, w_i)
- training: w_i → x_t
- Recurrent SOM (RSOM)
- d(x_t, w_i) = ||y_t||² where y_t = (x_t - w_i) + α · y_{t-1}
- training: w_i → y_t
(both leaky-integration variants are sketched in code below)
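A small sketch of both leaky-integration variants; the function names are my own, prototypes is an N × n NumPy array, and d_prev / y_prev carry the integrated quantities from the previous time step.

import numpy as np

def tkm_distances(x_t, prototypes, d_prev, alpha):
    """Temporal Kohonen Map: leaky integration of the quantization error."""
    # d(x_t, w_i) = ||x_t - w_i||^2 + alpha * d(x_{t-1}, w_i)
    return np.linalg.norm(prototypes - x_t, axis=1) ** 2 + alpha * d_prev

def rsom_step(x_t, prototypes, y_prev, alpha, lr=0.1):
    """Recurrent SOM: leaky integration of the error vector y_t."""
    y_t = (x_t - prototypes) + alpha * y_prev          # y_t = (x_t - w_i) + alpha * y_{t-1}
    winner = np.argmin(np.linalg.norm(y_t, axis=1))    # distance d = ||y_t||^2
    prototypes[winner] += lr * y_t[winner]             # training: adapt w_i along y_t
    return y_t, winner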
16 Our model
17 Merge neural gas/SOM
- explicit temporal context: the sequence x_t, x_{t-1}, x_{t-2}, ..., x_0 is split into the current entry x_t and its history x_{t-1}, x_{t-2}, ..., x_0
- every neuron stores a pair (w, c); the entry is compared via ||x_t - w||², the context via ||C_t - c||²
- merge context: C_t is given by the (merged) content of the winner of the previous step
- training: w → x_t, c → C_t
18 Merge neural gas/SOM
- (w_j, c_j) in ℝⁿ × ℝⁿ
- explicit context, global recurrence
- w_j represents the entry x_t
- c_j represents the context, which equals the winner content of the last time step
- distance: d(x_t, w_j) = α·||x_t - w_j||² + (1-α)·||C_t - c_j||²
- where C_t = γ·w_{I(t-1)} + (1-γ)·c_{I(t-1)}, with I(t-1) the winner in step t-1 (merge)
- training: w_j → x_t, c_j → C_t (one training step is sketched in code below)
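A compact sketch of one merge-neural-gas training step for the definitions above; neighborhood cooperation is omitted, only the winner is adapted, and the learning rates lr_w, lr_c are illustrative choices of mine.

import numpy as np

def mng_step(x_t, W, C, prev_winner, alpha=0.5, gamma=0.5, lr_w=0.05, lr_c=0.05):
    """One merge-context step: build C_t, rank by the merged distance, adapt the winner."""
    # merge context: combination of weight and context of the previous winner
    C_t = gamma * W[prev_winner] + (1 - gamma) * C[prev_winner]
    # d(x_t, w_j) = alpha*||x_t - w_j||^2 + (1-alpha)*||C_t - c_j||^2
    d = alpha * np.sum((W - x_t) ** 2, axis=1) + (1 - alpha) * np.sum((C - C_t) ** 2, axis=1)
    winner = int(np.argmin(d))
    # Hebbian training: w_j -> x_t, c_j -> C_t
    W[winner] += lr_w * (x_t - W[winner])
    C[winner] += lr_c * (C_t - C[winner])
    return winner, C_t

At the very first time step there is no previous winner; one simple choice is to initialise the context memory C with zeros and pick prev_winner arbitrarily.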
19 Merge neural gas/SOM
[Worked example from the slide figure, with γ = 0.5, i.e. C_t = (w_{I(t-1)} + c_{I(t-1)})/2:
 C_1 = (42 + 50)/2 = 46
 C_2 = (33 + 45)/2 = 39
 C_3 = (33 + 38)/2 = 35.5
 prototype weight/context values shown in the figure: 42 50 33 45 32 42 / 41 40 34 39 33 38 / 40 37 35 36 34 35]
20 Merge neural gas/SOM
- speaker identification: Japanese vowel 'ae', UCI-KDD archive
- 9 speakers, 30 articulations each
- time series of 12-dim. cepstrum coefficients
- MNG, 150 neurons: 2.7% test error; MNG, 1000 neurons: 1.6% test error
- for comparison: rule-based 5.9%, HMM 3.8%
21 Merge neural gas/SOM
- Experiment
  - classification of donor sites for C. elegans
  - 5 settings with 10000 training data, 10000 test data; 50 nucleotides (TCGA) embedded in 3 dim; 38% donor sites (Sonnenburg, Rätsch et al.)
- MNG with posterior labeling
  - 512 neurons, ? = 0.25, ? = 0.075, a = 0.999, ? = 0.4, 0.7
  - 14.06 ± 0.66% training error, 14.26 ± 0.39% test error
  - sparse representation: 512 · 6 dim
22 Merge neural gas/SOM
- Theorem (context representation)
- Assume
  - a map with merge context is given (no neighborhood)
  - a sequence x_0, x_1, x_2, x_3, ... is given
  - enough neurons are available
- Then
  - the optimum weight/context pair for x_t is
    w = x_t,   c = Σ_{i=0..t-1} γ·(1-γ)^{t-i-1}·x_i
  - Hebbian training converges to this setting as a stable fixed point
- Compare to TKM
  - optimum weights are w = Σ_{i=0..t} (1-α)^i·x_{t-i} / Σ_{i=0..t} (1-α)^i
  - but no fixed point for TKM
  - MSOM is the correct implementation of TKM (a numerical check of the merge-context formula is sketched below)
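As an illustration (my own, not part of the slides), the closed-form context of the theorem can be checked numerically against the recursion C_t = γ·w_{I(t-1)} + (1-γ)·c_{I(t-1)}, evaluated at the optimal prototypes w = x_{t-1} and c = optimal context of step t-1:

import numpy as np

def closed_form_context(xs, t, gamma):
    """c = sum_{i=0..t-1} gamma * (1-gamma)^(t-i-1) * x_i  (the theorem's optimum)."""
    return sum(gamma * (1 - gamma) ** (t - i - 1) * xs[i] for i in range(t))

def recursive_context(xs, t, gamma):
    """C_t = gamma * w_{I(t-1)} + (1-gamma) * c_{I(t-1)} with optimal w = x_{t-1}, c = C_{t-1}."""
    C = np.zeros_like(xs[0])
    for s in range(1, t + 1):
        C = gamma * xs[s - 1] + (1 - gamma) * C
    return C

xs = np.random.default_rng(0).normal(size=(20, 3))   # an arbitrary 3-dim sequence
print(np.allclose(closed_form_context(xs, 10, 0.5), recursive_context(xs, 10, 0.5)))  # True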
23 More models
24 More models
- what is the correct temporal context?
- as before, every neuron stores a pair (w, c); the distance combines ||x_t - w||² and ||C_t - c||², and training adapts w → x_t, c → C_t
- choices for the context C_t:
  - RSOM/TKM: the neuron itself
  - MSOM: winner content
  - SOMSD: winner index
  - RecSOM: all activations
25 More models

              TKM             RSOM            MSOM             SOMSD                  RecSOM
  context     Neuron itself   Neuron itself   Winner content   Winner index           Activation of all neurons
  encoding    Input space     Input space     Input space      Lattice space          Activation space
  memory      n·N             n·N             2·n·N            (d+n)·N                (N+n)·N
  lattice     all             all             all              regular / hyperbolic   all
  capacity    < FSA           < FSA           FSA              FSA                    PDA*

  * for normalised winner-takes-all (WTA) context
  (n = data dimension, N = number of neurons, d = dimension of the lattice index)
26 More models
- Experiment
  - Mackey-Glass time series
  - 100 neurons
  - different lattices
  - different contexts
- evaluation by the temporal quantization error (a possible implementation is sketched below):
  average of (mean activity k steps into the past − observed activity k steps into the past)²
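One possible reading of this error measure in code (my own interpretation: for each lag k, compare the signal k steps before each winning event of a neuron with that neuron's mean over those events); signal and winners are assumed to be NumPy arrays of equal length.

import numpy as np

def temporal_quantization_error(signal, winners, max_lag):
    """Temporal quantization error per lag k (one possible reading of the slide).

    signal  : observed sequence, shape (T,) or (T, n)
    winners : index of the winning neuron at each time step, shape (T,)
    """
    signal = np.atleast_2d(np.asarray(signal, dtype=float).T).T   # -> shape (T, n)
    winners = np.asarray(winners)
    T = len(signal)
    errors = []
    for k in range(max_lag + 1):
        ts = np.arange(k, T)                    # steps that have a lag-k past
        err = 0.0
        for j in np.unique(winners[ts]):
            sel = ts[winners[ts] == j]          # steps at which neuron j wins
            past = signal[sel - k]              # observed activity k steps into the past
            err += np.sum((past - past.mean(axis=0)) ** 2)   # deviation from the mean activity
        errors.append(err / len(ts))            # average over all considered steps
    return np.array(errors)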
27 More models
[Figure: temporal quantization error (y-axis) plotted against the number of time steps into the past, from 'now' to 'past' (x-axis), for SOM, NG, RSOM, RecSOM, SOMSD, HSOMSD and MNG]
28 So what?
29 So what?
- inspection / clustering of high-dimensional events within their temporal context could be possible
- strong regularization as for standard SOM / NG
- possible training methods for reservoirs
- some theory
- some examples
- no supervision
- the representation of context is critical and not clear at all
- training is critical and not clear at all