1
Unsupervised recurrent networks
  • Barbara Hammer, Institute of Informatics,
  • Clausthal University of Technology

2
Brocken
6
Prototype-based clustering
7
Prototype-based clustering
  • data contained in a real-vector space
  • prototypes characterized by their locations in the data space
  • clustering induced by the receptive fields with respect to the Euclidean metric

8
Vector quantization
  • init prototypes
  • repeat
  • present a data point
  • adapt the winner towards the data point

9
Cost function
  • vector quantization minimizes the quantization error, i.e. the summed squared distances of the data points to their winner prototypes
  • online training: stochastic gradient descent on this cost
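A minimal sketch of this update loop (not from the slides; numpy only, with illustrative data and learning rate):

```python
import numpy as np

def online_vq(data, n_prototypes=5, eta=0.05, epochs=10, seed=0):
    """Online vector quantization: stochastic gradient descent on the
    quantization error, i.e. the squared distance to the winner prototype."""
    rng = np.random.default_rng(seed)
    # init prototypes at randomly chosen data points
    w = data[rng.choice(len(data), n_prototypes, replace=False)].copy()
    for _ in range(epochs):
        for x in rng.permutation(data):
            # winner = closest prototype (Euclidean distance)
            winner = np.argmin(np.linalg.norm(w - x, axis=1))
            # adapt the winner towards the data point
            w[winner] += eta * (x - w[winner])
    return w

# illustrative data: three Gaussian clusters in the plane
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(c, 0.3, size=(100, 2)) for c in ([0, 0], [3, 0], [0, 3])])
print(online_vq(data, n_prototypes=3))
```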

10
Neighborhood cooperation
Self-Organizing Map: neighborhood defined on a fixed regular lattice, neurons indexed by lattice positions j = (j1, j2)
Neural gas: neighborhood induced by the data itself, i.e. a data-optimal topology
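The two neighborhood schemes can be contrasted in a short sketch; the Gaussian lattice neighborhood for the SOM and the rank-based neighborhood for neural gas are standard choices, and the parameter values are illustrative, not taken from the slides:

```python
import numpy as np

def som_step(w, grid, x, eta=0.1, sigma=1.0):
    """SOM: neighborhood lives on a fixed regular lattice.
    grid[j] holds the lattice coordinates (j1, j2) of neuron j."""
    winner = np.argmin(np.linalg.norm(w - x, axis=1))
    # Gaussian neighborhood in lattice space around the winner
    h = np.exp(-np.linalg.norm(grid - grid[winner], axis=1) ** 2 / (2 * sigma ** 2))
    w += eta * h[:, None] * (x - w)

def ng_step(w, x, eta=0.1, lam=1.0):
    """Neural gas: neighborhood given by distance ranks in data space,
    so the topology adapts to the data."""
    ranks = np.argsort(np.argsort(np.linalg.norm(w - x, axis=1)))
    h = np.exp(-ranks / lam)
    w += eta * h[:, None] * (x - w)

# illustrative usage: a 3x3 SOM lattice and 9 neural gas prototypes in 2D
rng = np.random.default_rng(0)
w_som, w_ng = rng.random((9, 2)), rng.random((9, 2))
grid = np.array([(i, j) for i in range(3) for j in range(3)], dtype=float)
for x in rng.random((500, 2)):
    som_step(w_som, grid, x)
    ng_step(w_ng, x)
```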
11
Clustering recurrent data
13
Old models
14
Old models
  • Temporal Kohonen Map
    leaky integration of distances
    input sequence x1, x2, x3, x4, …, xt, …
    d(xt, wi) = α·‖xt − wi‖² + (1−α)·d(xt−1, wi)
    training: wi → xt
  • Recurrent SOM
    leaky integration of directions
    d(xt, wi) = ‖yt‖² where yt = α·(xt − wi) + (1−α)·yt−1
    training: wi → yt
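A small sketch of both leaky-integration recursions, assuming scalar inputs and the formulas above (prototype values and parameters are illustrative):

```python
import numpy as np

def tkm_winners(x_seq, w, alpha=0.5):
    """Temporal Kohonen Map: d_i(t) = alpha*(x_t - w_i)^2 + (1 - alpha)*d_i(t-1)."""
    d = np.zeros_like(w, dtype=float)
    winners = []
    for x in x_seq:
        d = alpha * (x - w) ** 2 + (1 - alpha) * d
        winners.append(int(np.argmin(d)))      # training would move w[winner] towards x
    return winners

def rsom_winners(x_seq, w, alpha=0.5):
    """Recurrent SOM: y_i(t) = alpha*(x_t - w_i) + (1 - alpha)*y_i(t-1), d_i(t) = y_i(t)^2."""
    y = np.zeros_like(w, dtype=float)
    winners = []
    for x in x_seq:
        y = alpha * (x - w) + (1 - alpha) * y
        winners.append(int(np.argmin(y ** 2)))  # training would move w[winner] towards y[winner]
    return winners

w = np.array([0.0, 0.5, 1.0])          # three scalar prototypes
x_seq = [0.9, 1.0, 0.1, 0.0, 0.05]     # illustrative input sequence
print(tkm_winners(x_seq, w), rsom_winners(x_seq, w))
```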
16
Our model
17
Merge neural gas/SOM
explicit temporal context: each neuron carries a pair (w, c)
  • the weight w is compared to the current entry xt via ‖xt − w‖²
  • the context c is compared to the merge context Ct via ‖Ct − c‖²
  • Ct = merged content of the winner of the previous step, standing for the history xt−1, xt−2, …, x0
  • training: w → xt, c → Ct
18
Merge neural gas/SOM
(wj, cj) in Rn × Rn
  • explicit context, global recurrence
  • wj represents the current entry xt
  • cj represents the context, which equals the merged content of the winner of the last time step
  • distance d(xt, wj) = α·‖xt − wj‖² + (1−α)·‖Ct − cj‖²
    where Ct = γ·wI(t−1) + (1−γ)·cI(t−1), I(t−1) = winner in step t−1 (merge)
  • training: wj → xt, cj → Ct

19
Merge neural gas/SOM
  • Example: input sequence 42 → 33 → 33 → 34, merge with equal weights (γ = 0.5)

    neuron (weight, context) pairs:
    (42, 50)  (33, 45)  (32, 42)
    (41, 40)  (34, 39)  (33, 38)
    (40, 37)  (35, 36)  (34, 35)

    C1 = (42 + 50)/2 = 46
    C2 = (33 + 45)/2 = 39
    C3 = (33 + 38)/2 = 35.5
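A short trace of the merge context through this example; the merge parameter γ = 0.5 follows from the averaging on the slide, while choosing the first winner by weight alone and the distance weighting α = 0.6 are assumptions made only for illustration:

```python
import numpy as np

# (weight, context) pairs of the neurons from the example
neurons = np.array([(42, 50), (33, 45), (32, 42),
                    (41, 40), (34, 39), (33, 38),
                    (40, 37), (35, 36), (34, 35)], dtype=float)
w, c = neurons[:, 0], neurons[:, 1]

x_seq = [42, 33, 33, 34]   # input sequence 42 -> 33 -> 33 -> 34
alpha, gamma = 0.6, 0.5    # alpha: weight/context mixing (assumed), gamma: merge parameter

C = None
for t, x in enumerate(x_seq):
    if C is None:
        # first step: no context yet, winner chosen by weight only (assumption)
        i = int(np.argmin((x - w) ** 2))
    else:
        i = int(np.argmin(alpha * (x - w) ** 2 + (1 - alpha) * (C - c) ** 2))
    # merge context for the next step: mix of the winner's weight and context
    C = gamma * w[i] + (1 - gamma) * c[i]
    print(f"t={t}: winner (w={w[i]:.0f}, c={c[i]:.0f}), next context C={C}")
# prints C = 46.0, 39.0, 35.5 as in the example (and 34.5 after the last input)
```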
20
Merge neural gas/SOM
  • speaker identification, Japanese vowel 'ae',
    UCI KDD archive
  • 9 speakers, 30 articulations each
  • data: time series of 12-dim. cepstrum vectors

MNG, 150 neurons: 2.7% test error
MNG, 1000 neurons: 1.6% test error
for comparison: rule-based 5.9%, HMM 3.8%
21
Merge neural gas/SOM
  • Experiment
  • classification of donor sites for C. elegans
  • 5 settings with 10000 training data, 10000 test
    data; 50 nucleotides (T, C, G, A embedded in 3 dim),
    38% donor sites [Sonnenburg, Rätsch et al.]
  • MNG with posterior labeling
  • 512 neurons; parameters 0.25, 0.075, α 0.999, 0.4, 0.7
  • 14.06 ± 0.66% training error, 14.26 ± 0.39% test error
  • sparse representation: 512 neurons × 6 dimensions
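The slide does not spell out "posterior labeling"; one common reading, assumed here, is to label each neuron after unsupervised training by the majority class of the training samples it wins, and to classify a test sample by the label of its winner:

```python
import numpy as np
from collections import Counter

def posterior_labels(winners_train, y_train, n_neurons):
    """Assign to each neuron the majority class of the training samples it wins."""
    labels = np.zeros(n_neurons, dtype=int)
    for j in range(n_neurons):
        classes = [y for i, y in zip(winners_train, y_train) if i == j]
        labels[j] = Counter(classes).most_common(1)[0][0] if classes else -1
    return labels

def classify(winners_test, labels):
    """Classify each test sample by the label of its winner neuron."""
    return labels[np.asarray(winners_test)]

# illustrative usage with made-up winner indices and binary labels
winners_train, y_train = [0, 0, 1, 2, 2, 2], [1, 1, 0, 0, 0, 1]
labels = posterior_labels(winners_train, y_train, n_neurons=3)
print(labels, classify([1, 2, 0], labels))
```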

22
Merge neural gas/SOM
  • Theorem (context representation)
  • Assume
    • a map with merge context is given (no neighborhood)
    • a sequence x0, x1, x2, x3, … is given
    • enough neurons are available
  • Then
    • the optimum weight/context pair for xt is
      w = xt,  c = Σi=0..t−1 γ·(1−γ)^(t−i−1)·xi
    • Hebbian training converges to this setting as a stable fixed point
  • Compare to TKM
    • the optimum weights are w = Σi=0..t (1−α)^i·xt−i / Σi=0..t (1−α)^i
    • but this is not a fixed point of TKM training
  • MSOM is the correct implementation of TKM
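A quick numerical check of the stated optimum: if the winner for xt−1 carries the optimal pair (w = xt−1, c = optimal context of xt−1), the merge recursion Ct = γ·wI(t−1) + (1−γ)·cI(t−1) reproduces the closed form (γ and the test sequence below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(8)          # an arbitrary scalar sequence x_0 .. x_7
gamma = 0.3

def c_opt(t):
    """Closed form: optimal context for x_t."""
    return sum(gamma * (1 - gamma) ** (t - i - 1) * x[i] for i in range(t))

# merge recursion, plugging in the optimal winner pair (w = x_{t-1}, c = c_opt(t-1))
for t in range(1, len(x)):
    C = gamma * x[t - 1] + (1 - gamma) * c_opt(t - 1)
    assert np.isclose(C, c_opt(t)), (t, C, c_opt(t))
print("merge recursion reproduces the closed-form optimal context")
```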

23
More models
24
More models
what is the correct temporal context?

each neuron carries a pair (w, c); the weight w is compared to the current entry xt via ‖xt − w‖², the context c is compared to a context descriptor Ct via ‖Ct − c‖²; training: w → xt, c → Ct

the models differ in the choice of Ct:
  • TKM / RSOM: the neuron itself
  • MSOM: the content of the winner
  • SOMSD: the index of the winner
  • RecSOM: the activations of all neurons
25
More models
            TKM             RSOM            MSOM             SOMSD                  RecSOM
context     neuron itself   neuron itself   winner content   winner index           activation of all neurons
encoding    input space     input space     input space      lattice space          activation space
memory      n·N             n·N             2·n·N            (d+n)·N                (N+n)·N
lattice     all             all             all              regular / hyperbolic   all
capacity    < FSA           < FSA           FSA              FSA                    PDA*
(* for normalised WTA context; n = input dimension, N = number of neurons, d = lattice dimension)
26
More models
  • Experiment
  • Mackey-Glass time series
  • 100 neurons
  • different lattices
  • different contexts
  • evaluation by the temporal quantization error

for each lag k: average over all time steps of (mean entry k steps into the past, taken over the winner's receptive field, − observed entry k steps into the past)²
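A sketch of this measure under one reading (assumed here): for each lag k, the input k steps before each time step is compared with the mean of that quantity over all time steps mapped to the same winner neuron:

```python
import numpy as np

def temporal_quantization_error(signal, winners, max_lag):
    """For each lag k: average over t of
    (signal[t-k] - mean of signal[.-k] over all times with the same winner)^2."""
    signal, winners = np.asarray(signal, float), np.asarray(winners)
    errors = []
    for k in range(max_lag + 1):
        t_idx = np.arange(k, len(signal))        # times with a valid k-step history
        past = signal[t_idx - k]                 # observed value k steps into the past
        err = 0.0
        for j in np.unique(winners[t_idx]):
            mask = winners[t_idx] == j
            err += np.sum((past[mask] - past[mask].mean()) ** 2)
        errors.append(err / len(t_idx))
    return errors

# illustrative usage with a toy signal and made-up winner indices
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 20, 200))
winners = rng.integers(0, 10, size=200)
print(temporal_quantization_error(signal, winners, max_lag=5))
```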
27
More models
[Plot: temporal quantization error as a function of the number of steps into the past (now → past) for SOM, RSOM, NG, RecSOM, SOMSD, HSOMSD, MNG]
28
So what?
29
So what?
  • inspection / clustering of high-dimensional
    events within their temporal context could be
    possible
  • strong regularization as for standard SOM / NG
  • possible training methods for reservoirs
  • some theory
  • some examples
  • no supervision
  • the representation of context is critical and
    not clear at all
  • training is critical and not clear at all
