Title: Spike Sorting I
1. Spike Sorting I
- Bijan Pesaran
- New York University
2. Acknowledgements
- Ken Harris and Samar Mehta at the Neuroinformatics course, Woods Hole.
3. Aims
- We would like to:
  - Monitor the activity of large numbers of neurons simultaneously
  - Know which neuron fired when
  - Know which neuron is of which type
  - Estimate our errors
4. THE PROBLEM: Multiple Neural Signals
Primate retinal ganglion cells, courtesy of the lab of Dr. E.J. Chichilnisky
5. THE GOAL: Spike Times of Single Neurons
[Figure: region from previous slide; x-axis: Time (sec)]
6. THE GRADUATE STUDENT ALGORITHM
7. A GENERAL FRAMEWORK
8. Extracellular Recording Hardware
- You can buy two types of hardware, allowing:
  - Wide-band continuous recordings
  - Filtered, spike-triggered recordings
9. The Tetrode
- Four microwires twisted into a bundle
- Different neurons will have different amplitudes on the four wires
10. Raw Data
11. High Pass Filtering
- Local field potential is primarily at low frequencies.
- Spikes are at higher frequencies.
- So use a high-pass filter. An 800 Hz cutoff is good.
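The filtering step above can be sketched as follows. This is a minimal illustration, not the hardware's or any package's actual filter: the Butterworth design, the filter order, the zero-phase (forward-backward) application and the 20 kHz sampling rate are all assumptions; only the 800 Hz cutoff comes from the slide.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def highpass(raw, fs, cutoff_hz=800.0, order=4):
    """Zero-phase Butterworth high-pass filter applied along the time axis.

    raw: array of shape (n_samples,) or (n_channels, n_samples).
    fs:  sampling rate in Hz (assumed; the slides do not state one).
    """
    sos = butter(order, cutoff_hz, btype="highpass", fs=fs, output="sos")
    return sosfiltfilt(sos, raw, axis=-1)


# Synthetic trace: a large, slow 10 Hz "LFP" plus a small, fast 3 kHz component.
fs = 20000.0
t = np.arange(0.0, 1.0, 1.0 / fs)
lfp = 100.0 * np.sin(2 * np.pi * 10.0 * t)
fast = 10.0 * np.sin(2 * np.pi * 3000.0 * t)
filtered = highpass(lfp + fast, fs)  # the slow component is removed
```

Filtering forward and backward avoids shifting spike peaks in time, which matters for the alignment step that follows; a causal filter would be needed for online use.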
12. Filtered Data
13. Spike Detection
- Locate spikes at times of maximum extracellular negativity.
- Exact alignment is important: is it on the peak of the largest channel, or of the summed channels?
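A minimal sketch of threshold-based detection with both alignment choices from the slide. The function name, the fixed threshold, and the window lengths `pre` and `post` are hypothetical choices for illustration; real detectors usually derive the threshold from a noise estimate.

```python
import numpy as np


def detect_spikes(data, threshold, pre=8, post=24, align="largest"):
    """Find negative-going spikes in filtered data of shape (n_channels, n_samples).

    Returns peak sample indices and waveforms of shape
    (n_spikes, n_channels, pre + post), aligned on the negative peak.
    """
    if align == "largest":
        trace = data.min(axis=0)   # most negative channel at each sample
    elif align == "summed":
        trace = data.sum(axis=0)   # sum over channels
    else:
        raise ValueError("align must be 'largest' or 'summed'")

    below = trace < -threshold
    # Start (inclusive) and end (exclusive) of each run of sub-threshold samples.
    edges = np.diff(below.astype(int))
    starts = np.flatnonzero(edges == 1) + 1
    ends = np.flatnonzero(edges == -1) + 1
    if below[0]:
        starts = np.r_[0, starts]
    if below[-1]:
        ends = np.r_[ends, below.size]

    peaks, waves = [], []
    for s, e in zip(starts, ends):
        p = s + int(np.argmin(trace[s:e]))  # time of maximum negativity
        if p - pre >= 0 and p + post <= data.shape[1]:
            peaks.append(p)
            waves.append(data[:, p - pre:p + post])
    n_ch = data.shape[0]
    return np.array(peaks, dtype=int), np.array(waves).reshape(-1, n_ch, pre + post)


# Toy tetrode data: two spikes with different amplitude profiles across wires.
data = np.zeros((4, 1000))
for t0, amps in [(100, [-50, -20, -10, -5]), (500, [-5, -10, -40, -20])]:
    amps = np.array(amps, dtype=float)
    data[:, t0] = amps
    data[:, t0 - 1] = data[:, t0 + 1] = 0.5 * amps
peaks, waves = detect_spikes(data, threshold=15.0)
```

Switching `align` between `"largest"` and `"summed"` on real data is one way to see how much the alignment choice moves the detected peak times.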
14. Data Reduction
- We now have a waveform for each spike, for each channel.
- Still too much information!
- Before assigning individual spikes to cells, we must reduce further.
15. Principal Component Analysis
- Create a feature vector for each spike.
- Typically take the first 3 PCs for each channel.
- Do you use canonical principal components, or compute new ones for each file?
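As a rough sketch of the feature extraction described above: the function below computes new principal components per channel from the data it is given (the "new ones for each file" option) and concatenates the first three scores from each channel. The array layout and the function name are assumptions.

```python
import numpy as np


def pca_features(waveforms, n_pcs=3):
    """Project each channel's waveforms onto that channel's first n_pcs PCs.

    waveforms: (n_spikes, n_channels, n_samples).
    Returns a feature matrix of shape (n_spikes, n_channels * n_pcs).
    """
    n_spikes, n_channels, _ = waveforms.shape
    feats = np.empty((n_spikes, n_channels * n_pcs))
    for ch in range(n_channels):
        x = waveforms[:, ch, :]
        x = x - x.mean(axis=0)  # centre before PCA
        # Rows of vt are the principal axes, ordered by decreasing variance.
        _, _, vt = np.linalg.svd(x, full_matrices=False)
        feats[:, ch * n_pcs:(ch + 1) * n_pcs] = x @ vt[:n_pcs].T
    return feats


rng = np.random.default_rng(0)
waveforms = rng.normal(size=(200, 4, 32))  # placeholder for real spike waveforms
features = pca_features(waveforms)         # 4 channels x 3 PCs = 12 features
```

Using canonical components instead would mean storing the `vt` rows from a reference data set and reusing them for every file, which keeps feature axes comparable across recordings.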
16. Feature Space
17. Cluster Cutting
- Which spikes belong to which neuron?
- Assume a single cluster of spikes in feature space corresponds to a single cell.
- Automatic or manual clustering?
18. Cluster Cutting Methods
- Purely manual: time consuming, leads to high error rates.
- Purely automatic: untrustworthy.
- Hybrid: less time consuming, lowest error rates.
19. Semi-automatic Clustering
20. How Do You Know It Works?
- We can split waveforms into clusters, but are we sure they correspond to single cells?
- Simultaneous intra- and extracellular recordings allow us to estimate errors.
- Quality measures allow us to guess errors even without simultaneous intracellular recording.
21. Intra-extra Recording
- Simultaneous recording with a wire tetrode and a glass micropipette.
22. Intra-extra Recording
The extracellular waveform is almost minus the derivative of the intracellular waveform.
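To illustrate the derivative relationship numerically, here is a toy sketch. The Gaussian "intracellular" bump is a stylised stand-in, not a real action potential, and units and scaling are arbitrary.

```python
import numpy as np

dt = 1e-4                                            # seconds per sample (assumed)
t = np.arange(0.0, 0.01, dt)
intra = np.exp(-0.5 * ((t - 0.005) / 0.0005) ** 2)   # stylised intracellular bump
extra = -np.gradient(intra, dt)                      # minus the time derivative

# The extracellular trough falls on the rising phase of the intracellular
# spike, before the intracellular peak, where the derivative is largest.
trough_index = int(np.argmin(extra))
peak_index = int(np.argmax(intra))
```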
23. Bizarre Extracellular Waveshapes
[Figure panels: Model, Experiment]
24. Two Types of Error
- Type I error (false positive)
  - Incorrect inclusion of noise, or of spikes from other cells
- Type II error (false negative)
  - Omission of true spikes from the cluster
- Which is worse? Depends on the application.
25. Manual Clustering Contest
26. Best Ellipsoid Error Rates
Find the ellipsoid that minimizes a weighted sum of Type I and Type II errors. Must evaluate using cross-validation!
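A simplified sketch of the idea, under loud assumptions: here the ellipsoid's centre and shape are taken from the sample mean and covariance of the known true spikes, and only its radius is optimised, whereas a full best-ellipsoid search would optimise the shape as well. The error-rate definitions, the equal weighting, and the even/odd train/test split are also illustrative choices, and `is_target` stands for ground truth such as would come from an intracellular recording.

```python
import numpy as np


def mahal_sq(x, mean, cov):
    """Squared Mahalanobis distance of each row of x from mean."""
    d = x - mean
    return np.einsum("ij,ij->i", d @ np.linalg.inv(cov), d)


def error_rates(inside, is_target):
    """Type I: fraction of included spikes that are wrong.
    Type II: fraction of true spikes that were left out."""
    false_pos = np.sum(inside & ~is_target) / max(np.sum(inside), 1)
    false_neg = np.sum(~inside & is_target) / max(np.sum(is_target), 1)
    return false_pos, false_neg


def fit_ellipsoid(feats, is_target, weight=0.5):
    """Pick the radius that minimises weight*TypeI + (1-weight)*TypeII."""
    mean = feats[is_target].mean(axis=0)
    cov = np.cov(feats[is_target], rowvar=False)
    d2 = mahal_sq(feats, mean, cov)
    best_r2, best_cost = 0.0, np.inf
    for r2 in np.unique(d2):
        fp, fn = error_rates(d2 <= r2, is_target)
        cost = weight * fp + (1.0 - weight) * fn
        if cost < best_cost:
            best_r2, best_cost = r2, cost
    return mean, cov, best_r2


rng = np.random.default_rng(0)
cell = rng.normal(0.0, 1.0, size=(200, 2))    # spikes of the identified cell
other = rng.normal(8.0, 1.0, size=(200, 2))   # everything else
feats = np.vstack([cell, other])
is_target = np.r_[np.ones(200, bool), np.zeros(200, bool)]

# Cross-validation: fit on even-indexed spikes, evaluate on odd-indexed ones.
train = np.arange(len(feats)) % 2 == 0
mean, cov, r2 = fit_ellipsoid(feats[train], is_target[train])
inside_test = mahal_sq(feats[~train], mean, cov) <= r2
fp_test, fn_test = error_rates(inside_test, is_target[~train])
```

Evaluating on held-out spikes matters because an ellipsoid tuned and scored on the same spikes will report optimistically low errors.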
27. Humans vs. B.E.E.R.
28. Waveshape Helps Separation
29. Why were human errors higher?
- To understand this, try to understand why clusters have the shape they do.
- Simplest possibility: the spike waveform is constant, and cluster spread comes from background noise.
- Are clusters multivariate normal?
30. Problem: Overlapping Spikes
31. Problem: Cellular Synchrony
32. Problem: Bursting
33. Problem: Misalignment
- When a spike's peak occurs at different times on different channels, it can align on either.
- This causes the cluster to be split in two.
34. Problem: Dimensionality
Manual clustering only uses 2 dimensions at a time. The BEER measure can use all of them.
35. Semi-Automatic Clustering
- Uses all dimensions at once
- Errors should be lower
- Still requires human input
36. Semi-automatic Performance
37. Software: KlustaKwik
klustakwik.sourceforge.net
- Mixture of Gaussians, unconstrained covariance matrices
- Speed is crucial
- CEM algorithm is faster than EM
- Most probabilities not calculated
- Local maxima result in over- and under-clustering
- Split and merge features to tunnel out of local maxima
- Still requires supercomputer resources.
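To make the CEM (classification EM) idea concrete, here is a minimal sketch of a hard-assignment Gaussian mixture with full covariance matrices. It is not KlustaKwik's code: it omits the speed-ups and the split-and-merge steps listed above, and the function name, the starting-labels interface and the rule for dropping tiny clusters are assumptions. It expects at least two feature dimensions.

```python
import numpy as np


def cem(x, labels, n_clusters, n_iter=100):
    """Refine an initial labelling by alternating Gaussian fits and hard reassignment.

    x: (n_spikes, n_features) feature matrix; labels: initial cluster index per spike.
    """
    n, d = x.shape
    labels = np.asarray(labels).copy()
    for _ in range(n_iter):
        log_p = np.full((n, n_clusters), -np.inf)
        for k in range(n_clusters):
            members = x[labels == k]
            if len(members) <= d:
                continue  # too few spikes for a full covariance: cluster is dropped
            mean = members.mean(axis=0)
            cov = np.cov(members, rowvar=False)
            diff = x - mean
            maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
            _, logdet = np.linalg.slogdet(cov)
            # Log of (mixing weight * Gaussian density), up to a shared constant.
            log_p[:, k] = np.log(len(members) / n) - 0.5 * (maha + logdet)
        new_labels = log_p.argmax(axis=1)  # classification step: hard assignment
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels


rng = np.random.default_rng(0)
x = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
               rng.normal(15.0, 1.0, size=(200, 2))])
truth = np.r_[np.zeros(200, int), np.ones(200, int)]
rough = truth.copy()
rough[:10] = 1        # mislabel 10 spikes of the first cluster
rough[200:210] = 0    # and 10 of the second
refined = cem(x, rough, n_clusters=2)
```

Because every spike is assigned to exactly one cluster, each iteration is cheaper than a soft EM step, but the result depends on the starting labels, which is where the local-maxima problem noted above comes from.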
38. Software: Klusters
klusters.sourceforge.net
Recluster Feature
Ergonomic Design
39. Cluster Quality Measures
- We would like to automatically detect which cells are well isolated.
- The BEER measure needs intracellular data, which we don't have in general.
- We will define two measures that use only extracellular data.
40. Isolation Distance
Size of the ellipsoid within which as many spikes belong to our cluster as not
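One common way to compute this, sketched under assumptions: the ellipsoid is defined by the cluster's own mean and covariance, and the reported value is the squared Mahalanobis distance of the n-th closest spike from outside the cluster, where n is the cluster size. Whether to report the squared or unsquared distance is a convention; the function name is hypothetical.

```python
import numpy as np


def mahal_sq(x, mean, cov):
    """Squared Mahalanobis distance of each row of x from mean."""
    d = x - mean
    return np.einsum("ij,ij->i", d @ np.linalg.inv(cov), d)


def isolation_distance(feats, in_cluster):
    """Squared Mahalanobis radius containing as many outside spikes as cluster spikes."""
    own, rest = feats[in_cluster], feats[~in_cluster]
    n = len(own)
    if len(rest) < n:
        return float("nan")  # undefined when there are too few outside spikes
    d2 = np.sort(mahal_sq(rest, own.mean(axis=0), np.cov(own, rowvar=False)))
    return float(d2[n - 1])


rng = np.random.default_rng(0)
own = rng.normal(0.0, 1.0, size=(200, 2))
in_cluster = np.r_[np.ones(200, bool), np.zeros(300, bool)]
far = np.vstack([own, rng.normal(8.0, 1.0, size=(300, 2))])   # well isolated
near = np.vstack([own, rng.normal(1.0, 1.0, size=(300, 2))])  # heavily overlapping
iso_far = isolation_distance(far, in_cluster)
iso_near = isolation_distance(near, in_cluster)
```

A larger value means other spikes sit farther away in units of the cluster's own spread, so `iso_far` comes out much larger than `iso_near`.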
41. L_ratio
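The slide text gives only the name, so the following is a sketch of the definition as it is commonly stated in the spike-sorting literature, to be checked against the original slides: for every spike outside the cluster, take its squared Mahalanobis distance from the cluster, convert it to a chi-squared tail probability with as many degrees of freedom as there are features, sum these, and divide by the number of spikes in the cluster. The function name is hypothetical.

```python
import numpy as np
from scipy.stats import chi2


def l_ratio(feats, in_cluster):
    """Sum of chi-squared tail probabilities of outside spikes, per cluster spike."""
    own, rest = feats[in_cluster], feats[~in_cluster]
    diff = rest - own.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(own, rowvar=False))
    d2 = np.einsum("ij,ij->i", diff @ inv_cov, diff)
    # chi2.sf is 1 - CDF: close to 1 for outside spikes deep inside the cluster.
    total = np.sum(chi2.sf(d2, df=feats.shape[1]))
    return float(total / len(own))


rng = np.random.default_rng(0)
own = rng.normal(0.0, 1.0, size=(200, 2))
in_cluster = np.r_[np.ones(200, bool), np.zeros(300, bool)]
far = np.vstack([own, rng.normal(8.0, 1.0, size=(300, 2))])   # well isolated
near = np.vstack([own, rng.normal(1.0, 1.0, size=(300, 2))])  # heavily overlapping
l_far = l_ratio(far, in_cluster)
l_near = l_ratio(near, in_cluster)
```

Smaller is better here: a well-isolated cluster has almost no outside spikes near it, so `l_far` is close to zero while `l_near` is not.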
42. False Positives and Negatives
43. Which Measure to Use?
- Isolation distance correlates with false positive error rates
  - Measures distance to other clusters
- L_ratio correlates with false negative error rates
  - Measures number of spikes near cluster boundary
44. Conclusions
- Automatic clustering will save time and reduce errors.
- Errors can be as low as 5%.
- Quality measures give you a feeling of how bad your errors are.
45. Room for Improvement
From easy to hard:
- Make it faster
- Improved spike detection and alignment
- Quality measures that estimate error
- Fully automatic sorting
- Resolve overlapping spikes