From Neurons to Neural Networks

Transcript and Presenter's Notes
1
From Neurons to Neural Networks
  • Jeff Knisley
  • East Tennessee State University
  • Mathematics of Molecular and Cellular Biology
    Seminar
  • Institute for Mathematics and its Applications,
    April 2, 2008

2
Outline of the Talk
  • Brief Description of the Neuron
  • A Hot-Spot Dendritic Model
  • Classical Hodgkin-Huxley (HH) Model
  • A Recent Approach to HH Nonlinearity
  • Artificial Neural Nets (ANNs)
  • 1957–1969: Perceptron Models
  • 1980s–soon: MLPs and Others
  • 1990s: Neuromimetic (Spiking) Neurons

3
Components of a Neuron
4
Pre-Synaptic to Post-Synaptic
If the threshold is exceeded, the neuron fires,
sending a signal along its axon.
5
Signal Propagation along Axon
  • Signal is electrical
  • Membrane depolarization from resting -70 mV
  • Myelin acts as an insulator
  • Propagation is electro-chemical
  • Sodium channels open at breaks in the myelin
  • External sodium ion concentration is much higher
    than internal
  • Potassium ions work against sodium
  • Chloride and other influences are also very important
  • Rapid depolarization occurs at these breaks
  • The signal travels faster than it would by purely
    electrical conduction

6
Signal Propagation along Axon
[Figure: charge distribution along the myelinated axon]
7
Action Potentials
  • Sodium ion channels open and close
  • Which causes
  • Potassium ion channels to open and close

8
Action Potentials
  • Model Spike
  • Actual Spike Train

9
Post-Synaptic Input May Be Subthreshold
Signals decay at the soma if below a certain
threshold
10
Derivation of the Model
  • Some assumptions
  • Assume the neuron separates R³ into 3 regions:
    interior (i), exterior (e), and the boundary
    membrane surface (m)
  • Assume E_l is the electric field and B_l is the
    magnetic flux density, where l ∈ {e, i}
  • Maxwell's Equations
  • Assume magnetic induction is negligible
  • Then E_e = −∇V_e and E_i = −∇V_i for potentials
    V_l, l ∈ {i, e}

11
Current Densities ji and je
  • Let sl conductivity 2-tensor, l i, e
  • Intracellular homogeneous small radius
  • Extracellular Ion Populations!
  • Ohms Law (local)

ji
L
0
?
12
Assume Circular Cross-sections
  • Let V Vi Ve Vrest be membrane potential
    difference, and let Rm, Ri , C be the membrane
    resistance, intracellular resistance, membrane
    capacitance, respectively. Let Isyn be a catch
    all for ion channel activity.

d
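A standard form of the resulting cable equation for a cylinder of diameter d, written with these constants (a sketch of the usual statement; the exact normalization on the original slide is not shown in the transcript):

\[ \frac{d}{4 R_i}\,\frac{\partial^2 V}{\partial x^2} \;=\; C\,\frac{\partial V}{\partial t} \;+\; \frac{V}{R_m} \;+\; I_{syn} \]

Here R_m and C are taken per unit membrane area and R_i is the intracellular resistivity.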
13
Dimensionless Cables
Let X = x/λ and T = t/τ_m, where the membrane time
constant τ_m = R_m C is constant, and I_ion denotes
the ionic current term.
For tapered cylinders, use a coordinate Z instead of X
and a taper constant K.
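In these variables the cable equation takes the usual dimensionless form, and for a tapered cylinder a first-derivative term with the taper constant K appears (a sketch of the standard forms; the exact sign conventions and the placement of I_ion on the original slide are not recoverable from the transcript):

\[ \frac{\partial V}{\partial T} \;=\; \frac{\partial^2 V}{\partial X^2} \;-\; V \;-\; I_{ion}, \qquad \frac{\partial V}{\partial T} \;=\; \frac{\partial^2 V}{\partial Z^2} \;+\; K\,\frac{\partial V}{\partial Z} \;-\; V \;-\; I_{ion} \]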
14
Rall's Theorem for Untapered Daughters
  • If at each branching the parent diameter and the
    daughter cylinder diameters satisfy the 3/2-power
    condition shown below, then the dendritic tree can
    be reduced to a single equivalent cylinder.

[Figure: parent cylinder with daughter branches collapsed to an equivalent cylinder]
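The branching condition referred to above is Rall's 3/2-power rule: at each branch point the parent diameter d_parent and the daughter diameters d_k must satisfy

\[ d_{\mathrm{parent}}^{3/2} \;=\; \sum_{k} d_{k}^{3/2}. \]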
15
Dendritic Models
Soma
Tapered Equivalent Cylinder
Full Arbor Model
16
Tapered Equivalent Cylinder
  • Rall's theorem (modified for taper) allows us to
    collapse the tree to an equivalent cylinder
  • Assume hot spots at x0, x1, …, xm

[Figure: equivalent cylinder from the soma at 0 to length l, with hot spots at x0, x1, …, xm]
17
Ion Channel Hot Spots
  • (Poznanski) I_j is due to the ion channel(s) at the
    jth hot spot
  • The Green's function G(x, x_j, t) is the solution to
    the hot-spot equation with I_j as a point source and
    the other currents set to 0
  • Plus boundary conditions and initial conditions
  • The Green's function is the solution to the
    Equivalent Cylinder model

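The representation implied by these bullets is the standard Green's-function form (a sketch; the exact expression on the slide is not in the transcript): the voltage is the passive solution plus a sum of convolutions of G with the hot-spot currents,

\[ V(x,t) \;=\; V_0(x,t) \;+\; \sum_{j=0}^{m} \int_0^{t} G(x, x_j, t-s)\, I_j\!\big(V(x_j, s)\big)\, ds, \]

which is an integral equation because each I_j depends on V at its own hot spot.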
18
Equivalent Cylinder Model (I_ion = 0)
Soma
19
Properties
  • Spectrum is solely non-negative eigenvalues
  • Eigenvectors are orthogonal under voltage clamp
  • Eigenvectors are not orthogonal in the original
    problem
  • Solutions are multi-exponential decays
  • Linear models are useful for subthreshold activation,
    assuming nonlinearities (I_ion) are not arbitrarily
    close to the soma (and no electric-field (ephaptic)
    effects)

20
Somatic Voltage Recording
[Figure: somatic voltage over 0–10 ms, showing saturation to steady state, an experimental artifact, and ionic channel effects]
21
Hodgkin-Huxley Ionic Currents
  • 1963 Nobel Prize in Medicine
  • Cable Equation plus Ionic Currents (Isyn)
  • From Numerous Voltage Clamp Experiments with
    squid giant axon (0.5-1.0 mm in diameter)
  • Produces Action Potentials
  • Ionic Channels
  • n = potassium activation variable
  • m = sodium activation variable
  • h = sodium inactivation variable

22
Hodgkin-Huxley Equations
where any V with a subscript is constant, any g
with a bar is constant, and each of the α's and
β's has a similar form
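For reference, the standard form of the Hodgkin-Huxley system (the slide's equations appear only as images in the source, so this is the usual textbook statement):

\[ C\,\frac{dV}{dt} \;=\; -\,\bar{g}_{Na}\, m^3 h\,(V - V_{Na}) \;-\; \bar{g}_{K}\, n^4\,(V - V_{K}) \;-\; \bar{g}_{L}\,(V - V_{L}) \;+\; I \]
\[ \frac{dn}{dt} = \alpha_n(V)(1-n) - \beta_n(V)\,n, \quad \frac{dm}{dt} = \alpha_m(V)(1-m) - \beta_m(V)\,m, \quad \frac{dh}{dt} = \alpha_h(V)(1-h) - \beta_h(V)\,h \]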
23
HH combined with Hot Spots
  • The solution to the equivalent cylinder with hot
    spots is the Green's-function representation given
    earlier, where I_j is the restriction of V to the
    jth hot spot.
  • At a hot spot, V satisfies an ODE of Hodgkin-Huxley
    form, in which m, n, and h are functions of V.

24
Brief description of an Approach to HH ion
channel nonlinearities
  • Goal: accessible approximations that still produce
    action potentials.
  • Can be addressed using linear embedding, which is
    closely related to the method of Turning Variables.
  • Maps a finite-degree, polynomially nonlinear
    dynamical system into an infinite linear system.
  • The result is an infinite-dimensional linear system
    which is as unmanageable as the original nonlinear
    equation.
  • Non-normal operators with continua of eigenvalues
  • Difficult to project back to the nonlinear system
    (convergence and stability are thorny)
  • But the approach still has some value (action
    potentials).

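A one-line illustration of linear embedding (not from the slides, added for concreteness): for the scalar polynomial system below, introducing the monomial variables y_n turns a nonlinear ODE into an infinite linear system,

\[ \dot{x} = x^{2}, \qquad y_n := x^{n} \;\Longrightarrow\; \dot{y}_n = n\,x^{n-1}\dot{x} = n\,y_{n+1}, \qquad n = 1, 2, \dots \]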
25
The Hot-Spot Model Qualitatively
Key features: summation of synaptic inputs. If
V(0, t) is large, an action potential travels down
the axon.
26
Artificial Neural Network (ANN)
  • Made of artificial neurons, each of which
  • Sums inputs xi from other neurons
  • Compares sum to threshold
  • Sends signal to other neurons if above threshold
  • Synapses have weights
  • Model relative ion collections
  • Model efficacy (strength) of synapse

27
Artificial Neuron
[Figure: inputs summed and passed through a nonlinear firing function]
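A minimal sketch of such an artificial neuron in Python (illustrative only; the function name and the choice of a sigmoidal firing function are assumptions, not from the talk):

    import math

    def artificial_neuron(inputs, weights, threshold):
        """Weighted sum of inputs passed through a nonlinear firing function."""
        total = sum(w * x for w, x in zip(weights, inputs))
        # Sigmoidal firing function centered at the threshold
        return 1.0 / (1.0 + math.exp(-(total - threshold)))

    # Example: three inputs x_i arriving from other neurons
    print(artificial_neuron([0.5, 1.0, -0.2], [0.8, 0.4, 1.1], threshold=0.3))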
28
First Generation: 1957–1969
  • Best Understood in terms of Classifiers
  • Partition a data space into regions containing
    data points of the same classification.
  • The regions are predictions of the classification
    of new data points.

29
Simple Perceptron Model
  • Given 2 classes: Reference and Sample
  • The firing function (activation function) has only
    two values, 0 or 1.
  • Learning is by incremental updating of weights
    using a linear learning rule (see the sketch below)

[Figure: perceptron with weights w1, w2, …, wn]
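A minimal sketch of the perceptron learning rule described above (illustrative; the learning rate, epoch count, and toy data are assumptions):

    def train_perceptron(data, labels, lr=0.1, epochs=20):
        """Incrementally update weights with the linear perceptron rule."""
        w = [0.0] * len(data[0])
        b = 0.0
        for _ in range(epochs):
            for x, target in zip(data, labels):
                # Firing function with only two values, 0 or 1
                out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
                err = target - out
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
        return w, b

    # Linearly separable toy data (logical AND)
    w, b = train_perceptron([(0, 0), (0, 1), (1, 0), (1, 1)], [0, 0, 0, 1])
    print(w, b)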
30
Perceptron Limitations
  • Cannot Do XOR (1969, Minsky and Papert)
  • Data must be linearly separable
  • 1970s: the ANN "wilderness experience": only a
    handful working, and very un-neuron-like

31
Support Vector Machine: a Perceptron on a Feature
Space
  • Data is projected into a high-dimensional feature
    space and separated with a hyperplane
  • The choice of feature space (kernel) is key.
  • Predictions are based on the location of the
    hyperplane (see the sketch below)

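A minimal illustration with a common library (scikit-learn is an assumption here, not something mentioned in the talk); the kernel choice plays the role of the implicit feature space:

    from sklearn.svm import SVC

    # Toy XOR-like data: not linearly separable in the original space
    X = [[0, 0], [0, 1], [1, 0], [1, 1]]
    y = [0, 1, 1, 0]

    # The RBF kernel defines the high-dimensional feature space implicitly
    clf = SVC(kernel="rbf", gamma=2.0, C=10.0)
    clf.fit(X, y)
    print(clf.predict([[0.9, 0.1]]))  # point near (1, 0), which was labeled 1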
32
Second Generation: 1981–Soon
  • Big Ideas from other Fields
  • J. J. Hopfield compares neural networks to Ising
    spin glass models and uses statistical mechanics to
    show that ANNs minimize a total energy functional.
  • Cognitive psychology provides new insights into
    how neural networks learn.
  • Big Ideas from Math
  • Kolmogorov's Theorem
    AND

33
Firing Functions are Sigmoidal
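A typical example is the logistic sigmoid (the particular formula is an assumption; the slide shows only a graph):

\[ \sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \sigma(x)\to 0 \text{ as } x\to-\infty, \qquad \sigma(x)\to 1 \text{ as } x\to+\infty. \]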
34
3 Layer Neural Network
The output layer may consist of a single neuron.
[Figure: input layer, hidden layer (usually much larger), and output layer]
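A minimal sketch of the forward pass through such a three-layer network (the sizes, weights, and sigmoid are illustrative assumptions):

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def layer(inputs, weights, biases):
        """One fully connected layer with a sigmoidal firing function."""
        return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
                for row, b in zip(weights, biases)]

    # 2 inputs -> 3 hidden neurons -> 1 output neuron
    W_hidden = [[0.5, -0.3], [0.8, 0.2], [-0.6, 0.9]]
    b_hidden = [0.1, -0.2, 0.05]
    W_out, b_out = [[1.2, -0.7, 0.4]], [0.0]

    hidden = layer([0.3, 0.7], W_hidden, b_hidden)
    print(layer(hidden, W_out, b_out))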
35
Multilayer Network
[Figure: multilayer network]
36
Hilbert's Thirteenth Problem
  • Original: Are there continuous functions of 3
    variables that are not representable by a
    superposition of compositions of functions of 2
    variables?
  • Modern: Can a continuous function of n variables
    on a bounded domain of n-space be written as sums
    of compositions of functions of 1 variable?

37
Kolmogorov's Theorem
  • Modified version: any continuous function f of n
    variables can be written as shown below, where only
    h and the w's depend on f
  • (That is, the g's are fixed)

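A statement in the spirit of the slide (a sketch only; the exact constants and index ranges vary among modified versions of Kolmogorov's theorem):

\[ f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} h\!\left( \sum_{p=1}^{n} w_p\, g_q(x_p) \right), \]

where the inner functions g_q are fixed (independent of f) and only h and the weights w_p depend on f.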
38
Cybenko (1989)
  • Let σ be any continuous sigmoidal function,
    and let x = (x_1, …, x_n). If f is absolutely
    integrable over the n-dimensional unit cube, then
    for all ε > 0, there exists a (possibly very large)
    integer N and vectors w_1, …, w_N such that the
    approximation below holds, where a_1, …, a_N and
    θ_1, …, θ_N are fixed parameters.

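The approximation asserted by the theorem has the form (standard statement, with the error measured in the sense appropriate to the integrability assumption):

\[ \left| f(x) \;-\; \sum_{j=1}^{N} a_j\, \sigma\!\left( w_j \cdot x + \theta_j \right) \right| \;<\; \varepsilon. \]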
39
Multilayer Network (MLPs)
[Figure: multilayer perceptron]
40
ANN as a Universal Classifier
  • Designs a function f : Data → Classes
  • Example: f(Red) = 1, f(Blue) = 0
  • The support of f defines the regions
  • Data is used to train (i.e., design) the function f

[Figure: shaded region supp(f)]
41
Example: Predicting Trees that Are or Are Not
RNA-like
[Figure: trees classified as RNA-like and not RNA-like]
  • Construct Graphical Invariants
  • Train ANN using known RNA-trees
  • Predict the others

42
2nd Generation: Phenomenal Success
  • Data mining of micro-array data
  • Stock and commodities trading: ANNs are an
    important part of computerized trading
  • Post office mail sorting

43
The Mars Rovers
  • An ANN decides between "rough" and "smooth"
  • "rough" and "smooth" are ambiguous
  • Learning via many examples

And a neural network can lose up to 10% of its
neurons without significant loss in performance!
44
ANN Limitations
  • Overfitting, e.g., if the training set is unbalanced
  • Mislabeled data can lead to slow (or no)
    convergence or to incorrect results.
  • Hard margins: no fuzzing of the boundary

45
Problems on the Horizon
  • Limitations are becoming very limiting
  • Trained networks are often poor learners (and
    self-learners are hard to train)
  • In real neural networks, more neurons imply
    better networks (not so in ANNs).
  • Temporal data is problematic: ANNs have no
    concept, or a poor concept, of time
  • Hybridized ANNs are becoming the rule
  • SVMs are probably the tool of choice at present
  • SOFMs, Fuzzy ANNs, Connectionism

46
Third Generation: 1997–
  • Back to Bio: Spiking Neural Networks (SNNs)
  • Asynchronous, action-potential-driven ANNs have
    been around for some time.
  • SNNs show promise, but results beyond current
    ANNs have been elusive
  • Simulating the actual HH equations (neuromimetic)
    has to date not been enough
  • Time is both a promise and a curse
  • A possible approach: use current dendritic models
    to modify existing ANNs.

47
ANNs with Multiple Time Scales
  • An SNN that reduces to an ANN preserves
    Kolmogorov's Theorem
  • The solution to the equivalent cylinder with hot
    spots is the Green's-function representation given
    earlier, where I_j is the restriction of V to the
    jth hot spot.
  • Equivalent Artificial Neuron

48
Incorporating Multi-Exponentials
  • G(0, x, t) is often a multi-exponential decay.
  • In terms of time constants τ_k (see below)
  • w_jk are synaptic weights
  • τ_k come from electrotonic and morphometric data
  • Rate of taper, length of dendrites
  • Branching, capacitance, resistance

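A sketch of the intended expansion (the coefficient names are assumptions; the slide's exact expression is not in the transcript): the Green's function at the soma decays as a weighted sum of exponentials,

\[ G(0, x_j, t) \;\approx\; \sum_{k} c_{jk}\, e^{-t/\tau_k}, \]

so input through synapse j contributes terms of the form w_jk e^{-t/τ_k}, with the time constants τ_k determined by the cable properties listed above.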
49
Approximation and Simplification
  • If x_j(u) ≈ 1 or x_j(u) ≈ 0, then the expression
    simplifies
  • A special case (k is a constant)
  • t = 0 yields the standard neural net model
  • The standard neural net is the initial steady state
  • Modify it with a time-dependent transient

50
Artificial Neuron
w_ij, p_ij = synaptic weights
[Figure: artificial neuron with paired weights (w_i1, p_i1), …, (w_in, p_in) and a nonlinear firing function]
51
Steady State and Transient
  • Sensitivity and soft margins
  • t = 0 is a perceptron with weights w_ij
  • t = ∞ is a perceptron with weights w_ij + p_ij
  • For all t in (0, ∞), a traditional ANN with
    weights between w_ij and w_ij + p_ij (see the
    sketch after this list)
  • The transient is a perturbation scheme
  • Many predictions over time (soft margins)
  • Algorithm:
  • Partition the training set into subsets
  • Train at t = 0 for the initial subset
  • Train at t > 0 values for the other subsets

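A minimal sketch of one way to realize this interpolation (the exponential transient and the parameter names are assumptions, not the talk's exact formulation):

    import math

    def effective_weight(w, p, t, tau=1.0):
        """Weight at time t: equals w at t = 0 and approaches w + p as t -> infinity."""
        return w + p * (1.0 - math.exp(-t / tau))

    # At t = 0 the neuron is a perceptron with weight w;
    # for large t it behaves like a perceptron with weight w + p.
    for t in [0.0, 0.5, 2.0, 10.0]:
        print(t, effective_weight(w=0.4, p=0.3, t=t))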
52
Training the Network
  • Define an energy function
  • p vectors are the information to be learned
  • Neural networks minimize energy
  • The information in the network is equivalent to
    the minima of the total squared energy function

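A typical total squared energy of the kind described, with target vectors p_q (an assumed standard form; the slide's exact function is not in the transcript):

\[ E(w) \;=\; \frac{1}{2} \sum_{q} \left\| y(x_q; w) - p_q \right\|^2. \]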
53
Back Propagation
  • Minimize the energy
  • Choose the w_j and a_j so that E is (near) minimal
  • In practice, this is hard
  • Back propagation with a continuous sigmoidal
    function (see the sketch below)
  • Feed forward, calculate E, modify the weights
  • Repeat until E is sufficiently close to 0

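A minimal single-neuron sketch of the feed-forward / calculate-E / modify-weights loop (illustrative only; the talk's network is larger and its exact update rule is not shown in the transcript):

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Toy training set: inputs x and targets p (logical OR)
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
    w, b, lr = [0.0, 0.0], 0.0, 0.5

    for _ in range(2000):
        E = 0.0
        for x, p in data:
            y = sigmoid(w[0] * x[0] + w[1] * x[1] + b)  # feed forward
            E += 0.5 * (y - p) ** 2                      # calculate E
            delta = (y - p) * y * (1 - y)                # gradient via the chain rule
            w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
            b -= lr * delta
    print("final energy:", E)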
54
Back Propagation with Transient
  • Train the network initially (choose w_j and a_j)
  • Each synapse is given a transient weight p_ij
  • An algorithm addressing over-fitting/sensitivity:
  • Weights must be given random initial values
  • The weights p_ij are also given random initial values
  • Separate training of the w_j, a_j, and p_ij
    ameliorates over-fitting during the training
    sequence

55
Observations/Results
  • Spiking does occur
  • But only if network is properly initiated
  • Spikes only resemble Action Potentials
  • This is one approach to SNNs
  • Not likely to be the final word
  • Other real-neuron features may be necessary
    (e.g., tapering axons can limit the frequency of
    action potentials; also branching!)
  • This approach does show promise in handling
    temporal information

56
Any Questions?
  • Thank you!