Title: Nonlinear approach for clustering magnetoencephalographic curves
1Nonlinear approach for clustering
magnetoencephalographic curves
- Sebino Stramaglia
- Center of Innovative Technologies for Signal
Detection and Processing - Physics Dept. University of Bari, Italy
2Summary
- SEF after somatosensory stimulation.
- Clustering.
- Classification of SEF curves.
- Results.
3Brain and its rhythms.
- Theta band: 3.5 Hz to 7.5 Hz
- Alpha band: 7.5 Hz to 12.5 Hz
- Beta1 band: 12.5 Hz to 22 Hz
- Beta2 band: 22 Hz to 32 Hz
- Gamma band: 32 Hz to 60 Hz
4Magnetoencephalography
- Measures neuromagnetic fields of 50-500 fT
- Superconducting SQUID sensors
- Temporal resolution: 1 ms
- Multisensor recordings
- MEG analysis mostly involves the brain cortex
- SEF: Somatosensory Evoked magnetic Field
5Experiment of somatosensory stimulation
- The median nerve was stimulated 300 times by an
electric pulse; the time between consecutive
stimuli was chosen at random between 1.3 s and
2.4 s. The response was recorded over the
opposite hemisphere.
6The MEG signal
7Extracting the SEF: averaging
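Slide 7 names averaging as the way to extract the SEF from the MEG signal. The sketch below illustrates the idea on synthetic data only: the response shape, noise level and epoch length are assumptions, and the noise is assumed to be uncorrelated with the stimulus. Averaging 300 stimulus-locked epochs keeps the evoked response and shrinks the noise roughly as 1/sqrt(300).

```python
import math
import random


def average_epochs(epochs):
    """Average equally long, stimulus-locked epochs sample by sample."""
    n_epochs = len(epochs)
    n_samples = len(epochs[0])
    return [sum(e[t] for e in epochs) / n_epochs for t in range(n_samples)]


def rms(values):
    """Root-mean-square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Synthetic illustration: every number below is an assumption, not MEG data.
rng = random.Random(0)
n_samples = 100
evoked = [math.exp(-((t - 30) / 5.0) ** 2) for t in range(n_samples)]  # toy response
noise_sd = 2.0
epochs = [[s + rng.gauss(0.0, noise_sd) for s in evoked] for _ in range(300)]

avg = average_epochs(epochs)
single_error = rms([epochs[0][t] - evoked[t] for t in range(n_samples)])
avg_error = rms([avg[t] - evoked[t] for t in range(n_samples)])
print(single_error, avg_error)  # the averaged curve is much closer to the toy response
```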
8SEF spatial map, current dipole
9EXPERIMENT OF SOMATOSENSORY STIMULATION (I)
- Analysis: morphology of SEF curves
- Experimental evidence: high variability of SEF across subjects
- Recognition of alterations of morphology (lesions of the cortex)
- Aim: discovering morphological classes
- Tools: unsupervised learning algorithms (clustering algorithms)
10EXPERIMENT OF SOMATOSENSORY STIMULATION (II)
- Data set: 37 healthy subjects
- Electrical stimulation of the median nerve at the wrist level, pulses 0.2 ms long
- Recordings by a 28-channel magnetoencephalograph
- Definition of a similarity index
11Clustering
- Aim: partitioning N objects into K classes, so that similar objects belong to the same class.
- Problem: similarity is not transitive!
- Is K assigned, or to be inferred from the data?
- At which resolution (scale) are the data seen? Varying the scale yields a hierarchy.
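A toy illustration of why similarity is not transitive. The threshold rule and the three values are assumptions chosen for the example: a is similar to b, b is similar to c, yet a is not similar to c, so "similar" alone does not define classes.

```python
def similar(a, b, threshold=1.0):
    """Hypothetical similarity rule: two values are similar if they differ by at most `threshold`."""
    return abs(a - b) <= threshold


a, b, c = 0.0, 0.8, 1.6
print(similar(a, b), similar(b, c), similar(a, c))  # True True False
```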
12Centroid Methods - K-means
- Start with random positions of the K centroids.
- Iterate until the centroids are stable:
  - Assign points to centroids
  - Move centroids to the center of their assigned points
Iteration 0
16Centroid Methods - K-means
- Result depends on the initial centroid positions
- Fast algorithm: only distances from data points to centroids are computed
- No way to choose K
- Breaks long clusters
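The K-means loop described in the slides (random start, assign, move, repeat until stable) can be sketched in a few lines. This is a minimal illustration, not the code used in the talk: the function names, the seed parameter and the toy points are assumptions, and initial centroids are drawn from the data points themselves.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, seed=0, max_iter=100):
    """Plain K-means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial positions (assumption: taken from the data)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, hence centroids stable
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels


# Two well-separated toy groups (hypothetical data).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

Changing `seed` changes the starting centroids; on less clean data this can change the final partition, which is the dependence on initialization noted above.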
17Agglomerative Hierarchical Clustering
Need to define the distance between the new cluster and the other clusters:
- Single Linkage: distance between the closest pair
- Complete Linkage: distance between the farthest pair
- Average Linkage: average distance between all pairs, or distance between cluster centers
Dendrogram (axis: distance between joined clusters). The dendrogram induces a linear ordering of the data points.
18Agglomerative Hierarchical Clustering
- Results depend on the distance update method
  - Single Linkage: elongated clusters
  - Complete Linkage: sphere-like clusters
- Greedy iterative process
- NOT robust against noise
- No inherent measure to choose the clusters
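A naive sketch of agglomerative clustering with the three linkage rules named above. It is written for clarity, not speed; the function name, the `linkage` argument and the six points on a line are assumptions made for the example. The same data give different two-cluster partitions under single and complete linkage.

```python
def agglomerate(dist, n_clusters, linkage="single"):
    """Merge clusters greedily until `n_clusters` remain.

    dist: symmetric matrix (list of lists) of pairwise distances.
    Returns a list of clusters, each a sorted list of point indices.
    """
    reducers = {
        "single": min,      # distance between the closest pair
        "complete": max,    # distance between the farthest pair
        "average": lambda values: sum(values) / len(values),  # mean over all pairs
    }
    reduce_pairs = reducers[linkage]
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest cluster pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pair_distances = [dist[i][j] for i in clusters[a] for j in clusters[b]]
                d = reduce_pairs(pair_distances)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Six points on a line with slowly growing gaps (hypothetical data).
xs = [0.0, 1.0, 2.1, 3.3, 4.6, 7.0]
dist = [[abs(p - q) for q in xs] for p in xs]

print(agglomerate(dist, 2, "single"))    # [[0, 1, 2, 3, 4], [5]]  chained, elongated cluster
print(agglomerate(dist, 2, "complete"))  # [[0, 1, 2, 3], [4, 5]]  more compact clusters
```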
19Goal: revealing genuine structures hidden in data
- The deceit of randomness in a data set: structures may arise as realizations of randomness.
- The risk is that in noisy, high-dimensional data one always finds what one is looking for!
- Retarded Learning (Van den Broeck, Engel, Opper).
20Statistical Physics for Clustering
- Temperature controls the order and the structures arising from interactions, since it determines the effective scale at which the system is sensed.
- Robustness of the statistical properties w.r.t. details of the interactions.
- Varying the temperature, the transition from one structure to another is abrupt.
21Physics and CLUSTERING
- Algorithms rooted in physical approaches
  - Deterministic Annealing (Rose, Gurewitz), Physical Review Letters (1990)
  - Coupled Chaotic Maps (L. Angelini et al.), Physical Review Letters (2000)
  - Superparamagnetic Clustering (E. Domany), Physical Review E (1998)
- Applications
  - PA and HEP: particle classification
  - Biology: classification of DNA sequences
  - Econophysics: analysis of financial indexes
  - Humanitarian demining
  - Many others
22Super-Paramagnetic Clustering (SPC): M. Blatt, S. Wiseman and E. Domany (1996), Neural Computation
- The idea behind SPC is based on the physical properties of dilute magnets.
- Calculating the correlation between magnet orientations at different temperatures (T).
T low
23Super-Paramagnetic Clustering (SPC)
T high
24Super-Paramagnetic Clustering (SPC)
T intermediate
25Super-Paramagnetic Clustering (SPC)
- The algorithm simulates the magnets' behavior over a range of temperatures and calculates their correlations
- The temperature (T) controls the resolution
- Example: N = 4800 points in D = 2 dimensions
26Super-Paramagnetic Clustering (SPC)
T = 1: all points are in ONE cluster
27Super-Paramagnetic Clustering (SPC)
T = 20: three clusters emerge
28Super-Paramagnetic Clustering (SPC)
T = 30: substructures appear
29Super-Paramagnetic Clustering (SPC)
At the highest T: each point is its own cluster
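The slides give no formulas for the magnet model. For orientation, the quantities below are the ones used in the SPC formulation of Blatt, Wiseman and Domany as commonly stated. Treat the notation as an assumption: q is the number of Potts states, \hat{K} the average number of neighbors, a the average nearest-neighbor distance, and N_max the occupancy of the most populated state.

```latex
% Potts spins s_i \in \{1,\dots,q\} attached to the data points; neighbors interact
\mathcal{H}(S) = -\sum_{\langle i,j \rangle} J_{ij}\,\delta_{s_i,s_j},
\qquad
J_{ij} = \frac{1}{\hat{K}} \exp\!\left(-\frac{d_{ij}^{2}}{2a^{2}}\right)

% Magnetization and its fluctuations (susceptibility) at temperature T
m = \frac{q\,N_{\max}/N - 1}{q - 1},
\qquad
\chi(T) = \frac{N}{T}\left(\langle m^{2}\rangle - \langle m\rangle^{2}\right)

% Spin-spin correlation: points i and j are put in the same cluster when G_{ij} > 1/2
G_{ij} = \langle \delta_{s_i,s_j} \rangle
```

Nearby points are strongly coupled and stay aligned up to higher temperatures, which is why raising T resolves finer and finer structure.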
30Output of SPC
A function χ(T) that peaks when stable clusters break
Size of the largest clusters as a function of T
Dendrogram
Stable clusters live for large ΔT
31Choosing a value for T
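One simple way to act on the stability criterion (clusters that survive over a large temperature range) is to look for the widest plateau in the output of a temperature scan. The sketch below uses the number of clusters as a stand-in for the partition, which is a simplification; the function name and the scan values are hypothetical, not results from the talk.

```python
def most_stable_partition(temperatures, cluster_counts, ignore=(1,)):
    """Find the widest temperature range over which the cluster count is constant.

    Returns (width, t_start, t_end, k), or None if every run is ignored.
    Counts listed in `ignore` (by default the trivial single cluster) are skipped.
    """
    best = None
    start = 0
    for i in range(1, len(temperatures) + 1):
        # A run ends at the end of the scan or when the cluster count changes.
        if i == len(temperatures) or cluster_counts[i] != cluster_counts[start]:
            k = cluster_counts[start]
            width = temperatures[i - 1] - temperatures[start]
            if k not in ignore and (best is None or width > best[0]):
                best = (width, temperatures[start], temperatures[i - 1], k)
            start = i
    return best


# Hypothetical scan: one cluster at low T, then 3, then fragmentation.
temperatures = [1, 5, 10, 15, 20, 25, 30, 35, 40]
cluster_counts = [1, 1, 1, 3, 3, 3, 3, 7, 40]
print(most_stable_partition(temperatures, cluster_counts))  # (15, 15, 30, 3)
```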
32Advantages of SPC
- Scans all resolutions (T)
- Robust against noise and initialization: calculates collective correlations
- Identifies natural (χ) and stable (ΔT) clusters
- No need to pre-specify the number of clusters
- Clusters can be any shape
33Clustering SEF curves: complete linkage
No sound criterion to choose the optimal partition
34Clustering SEF curves: application of SPC
No partition is stable w.r.t. the temperature.
35PRE-PROCESSING: MULTIDIMENSIONAL SCALING
MDS: dissimilarity matrix d_ij (N × N) → N points x_i ∈ ℝ^d
- reduction of complexity
- data visualization (d = 2, 3)
36Multidimensional scaling (1)
- The dissimilarity measure allows us to state that
two SEF curves are more similar than two other
curves; the exact value of the dissimilarity is a
detail.
- We need an approach whose output is invariant
under arbitrary transformations of the dissimilarity
matrix that leave the ranking of its elements
unchanged.
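A small check of the invariance requirement: any strictly increasing transformation of the dissimilarities leaves the ranking of the matrix elements unchanged, so a rank-based method sees the same input. The matrix values and the particular transformation are assumptions chosen for illustration.

```python
import math


def rank_order(dissimilarities):
    """Return the pairs (i, j), i < j, sorted from most similar to least similar."""
    n = len(dissimilarities)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sorted(pairs, key=lambda p: dissimilarities[p[0]][p[1]])


# Hypothetical dissimilarities between three curves.
d = [
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.5],
    [0.9, 0.5, 0.0],
]
# A strictly increasing (monotone) transformation of every element.
d_transformed = [[math.exp(3.0 * v) - 1.0 for v in row] for row in d]

print(rank_order(d))              # [(0, 1), (1, 2), (0, 2)]
print(rank_order(d_transformed))  # same ordering
```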
37Multidimensional scaling (2)
- MDS finds N points, in a Euclidean d-dimensional
space, whose interpoint distances match the
dissimilarity matrix.
- We only require a monotone relationship between
the dissimilarities and the distances in the
configuration.
38MULTIDIMENSIONAL SCALING (3)
39MONOTONE REGRESSION
Monotone regression to find reference distances
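A common way to carry out monotone regression is the pool-adjacent-violators algorithm; whether the talk used exactly this procedure is not stated, so the sketch below is an assumption. The input is the list of configuration distances, ordered by increasing dissimilarity; the output is the closest non-decreasing sequence (the reference distances). Kruskal's stress-1 then measures the mismatch. The numbers are hypothetical.

```python
import math


def monotone_regression(values):
    """Pool Adjacent Violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [sum, count]; block means stay non-decreasing
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last block mean.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def stress(distances, reference):
    """Kruskal's stress-1 between configuration distances and reference distances."""
    num = sum((d - r) ** 2 for d, r in zip(distances, reference))
    den = sum(d * d for d in distances)
    return math.sqrt(num / den)


# Configuration distances listed in order of increasing dissimilarity (toy values).
distances = [1.0, 3.0, 2.0, 4.0, 3.5]
reference = monotone_regression(distances)
print(reference)  # [1.0, 2.5, 2.5, 3.75, 3.75]
print(stress(distances, reference))
```

A stress of zero means the distances are already a monotone function of the dissimilarities; MDS moves the points to reduce this value.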
40APPLICATION OF MDS
41APPLICATION OF MDS: COMPLETE LINKAGE RESULTS
42OUR APPROACH: APPLICATION OF MDS AND SPC
43OUR PARTITION BY MDS AND SPC
44CONCLUSIONS
- Use of MDS and SPC revealed the natural
morphological classes in the data set.
- This approach can be applied in other situations
where only a fuzzy definition of dissimilarity is
at hand.
- From the physiological point of view, it has been
found that anomalous SEF curves do not belong
to any of the morphological classes we found.