Title: Neural Networks
1. Neural Networks
2. Learning Objectives
- Understand the principles of neural networks
- Understand the backpropagation algorithm
3. Principles of Neural Networks
- A Neural Network (NN), or Artificial Neural Network (ANN), is based on the analogy of the brain as a network of neurons; a neuron is a brain cell capable of collecting electric signals, processing them, and disseminating them.
- Synonyms: connectionist networks, connectionism, neural computation, parallel distributed processing.
- Among the most effective machine learning methods for interpreting complex real-world sensor data, for example recognizing hand-written characters (LeCun), spoken words (Lang), or faces (Cottrell).
4. Principles of Neural Networks
- Biological background: the human brain contains a network of about 10^11 interconnected neurons, each with a high number of interconnections (about 10^4).
- The fastest neuron response time is about 10^-3 seconds, yet the brain is capable of fast decisions (about 10^-1 s to recognize one's mother); this speed of response can be explained by massively parallel processing.
- ANNs imitate real neurons imperfectly: many characteristics of real neural networks are not, or cannot be, reproduced in ANNs.
5. Principles of Neural Networks
- Mathematical model for a neuron
[Figure: model of unit i - inputs yj, weighted by wj,i, feed the input function xi = sum_j wj,i yj; the activation function f produces the output yi = f(xi); a fixed bias input y0 = -1 enters through the bias weight w0,i.]
6. Principles of Neural Networks
- Bias weight: w0,i is the bias, or threshold, of the unit, and is associated with a fixed input activity (y0 = -1 in the figure above).
- Criteria for the activation function:
- The unit should be active (near 1) when the right inputs arrive and inactive (near 0) when other inputs arrive
- The function should be nonlinear
- Threshold function
- Sigmoid function: 1 / (1 + e^-x) (see the sketch below)
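As a minimal sketch of the two activation functions above (assuming NumPy; the function names are illustrative, not from the slides):

    import numpy as np

    def threshold(x):
        # Hard threshold: active (1) when the input is non-negative, else 0.
        return np.where(x >= 0, 1.0, 0.0)

    def sigmoid(x):
        # Smooth, differentiable alternative: 1 / (1 + e^-x).
        return 1.0 / (1.0 + np.exp(-x))

    print(threshold(np.array([-2.0, 0.5])))  # [0. 1.]
    print(sigmoid(np.array([-2.0, 0.5])))    # approx. [0.12 0.62]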
7. Principles of Neural Networks
[Figure: plots of the threshold function and the sigmoid function.]
8. Principles of Neural Networks
- Types of neural networks
- Feed-forward networks (or acyclic): the output is a function of the current input only
- Recurrent networks (or cyclic): feed their outputs back into their own inputs
- Several layers
- Input layer → input units
- Layer of hidden units
- Output layer → output units
9. Principles of Neural Networks
- x5 = f(w3,5 x3 + w4,5 x4) = f(w3,5 f(w1,3 x1 + w2,3 x2) + w4,5 f(w1,4 x1 + w2,4 x2)); x5 is thus a nonlinear function of x1 and x2 (see the sketch below)
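A hedged sketch of this forward pass, assuming sigmoid activations; the weight values are arbitrary placeholders, not from the slides:

    import numpy as np

    def f(x):
        # Activation function; a sigmoid is assumed here.
        return 1.0 / (1.0 + np.exp(-x))

    def forward(x1, x2, w):
        # Hidden units x3 and x4, then output x5, composing f as above.
        x3 = f(w["w13"] * x1 + w["w23"] * x2)
        x4 = f(w["w14"] * x1 + w["w24"] * x2)
        return f(w["w35"] * x3 + w["w45"] * x4)

    w = {"w13": 0.5, "w23": -0.4, "w14": 0.3,
         "w24": 0.8, "w35": 1.2, "w45": -0.7}
    print(forward(1.0, 0.0, w))  # x5, a nonlinear function of x1 and x2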
10. Principles of Neural Networks
[Figure: two threshold units with inputs x1, x2 and weights w1 = w2 = 1; threshold w0 = 0.5 implements OR, threshold w0 = 1.5 implements AND.]
- ANNs can represent the boolean functions AND, OR, NAND, and NOR; any boolean function can be represented with a network two levels deep (see the sketch below)
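A minimal sketch of the two units from the figure, using the threshold activation and the fixed -1 bias input (weights as given: w1 = w2 = 1, threshold w0 = 0.5 for OR and 1.5 for AND):

    def unit(x1, x2, w0, w1=1.0, w2=1.0):
        # Threshold unit: fires when the weighted input reaches the bias w0.
        return 1 if (w1 * x1 + w2 * x2 - w0) >= 0 else 0

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "OR:", unit(a, b, 0.5), "AND:", unit(a, b, 1.5))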
11. Principles of Neural Networks
- A perceptron is a single-layer feed-forward neural network
- Each output unit is independent of the others
12. Principles of Neural Networks
- A perceptron (single-layer feed-forward neural network) can only represent functions that are linearly separable
13. Neural Networks Principles
- Learn by adjusting weights to reduce the error on the training set
- The squared error for an example with input x and true output y is E = 1/2 Err^2 = 1/2 (y - hw(x))^2, where hw(x) is the network output
- Perform optimization search by gradient descent
- Simple weight update rule: wj ← wj + α · Err · g'(in) · xj, where in is the weighted input to the unit and α the learning rate (see the sketch below)
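A minimal sketch of this gradient-descent rule for a single sigmoid unit; the learning rate, epoch count, and example task are illustrative assumptions:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_perceptron(X, y, alpha=0.5, epochs=1000):
        # Prepend the fixed bias input -1 to every example.
        Xb = np.hstack([-np.ones((X.shape[0], 1)), X])
        w = np.zeros(Xb.shape[1])
        for _ in range(epochs):
            for x, target in zip(Xb, y):
                h = sigmoid(np.dot(w, x))
                err = target - h
                # wj <- wj + alpha * Err * g'(in) * xj
                w += alpha * err * h * (1.0 - h) * x
        return w

    # Example: learn OR, which is linearly separable.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 1, 1, 1], dtype=float)
    print(train_perceptron(X, y))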
14. Principles of Neural Networks
- ANNs can be used for
- Classification
- Regression
- Machine learning terminology
- Data: (d1, t1), ..., (dn, tn) (data/target pairs)
- Training set / validation set
- Supervised learning: the model is fitted to the data/target pairs
- Unsupervised learning: the target is not known
- Classification → supervised learning
- Regression → supervised learning as well (clustering, by contrast, is unsupervised)
15. Universal Approximation Properties
- Neural networks can approximate any reasonable real function to any degree of precision (regression) with a 3-layer network
- Any boolean function can be approximated by a multi-layer feed-forward network, since boolean functions are combinations of threshold gates
- The 3-layer network has x as the input, a hidden layer of sigmoid units, and one layer of linear (identity function) output units, the hidden layer being as large as needed
16. Universal Approximation Properties
- Hypothesis
- f is uniformly continuous on [0, 1]
- f can be approximated with a step function g such that
- g(0) = f(0)
- g(x) = f(k/n) for x in ((k-1)/n, k/n], k = 1..n
- The network needs one input unit, one output unit receiving a connection from each hidden unit, and n+1 hidden threshold units (see the sketch below)
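A hedged sketch of this construction (f and n are placeholders): one threshold unit per breakpoint (k-1)/n, each weighted by the jump f(k/n) - f((k-1)/n), plus the constant f(0):

    import math

    def step(z):
        # Threshold unit: fires for strictly positive input.
        return 1.0 if z > 0 else 0.0

    def make_g(f, n):
        jumps = [(f(k / n) - f((k - 1) / n), (k - 1) / n)
                 for k in range(1, n + 1)]
        def g(x):
            # g(0) = f(0); g(x) = f(k/n) on ((k-1)/n, k/n].
            return f(0) + sum(w * step(x - b) for w, b in jumps)
        return g

    g = make_g(math.sin, 100)
    print(math.sin(0.3), g(0.3))  # the two values agree closely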
17. Backpropagation Algorithm
- Layers are usually fully connected; the number of hidden units is typically chosen by hand
18. Backpropagation Algorithm
- Expressiveness of multilayer perceptrons: all continuous functions with 2 layers, all functions with 3 layers
19. Backpropagation Algorithm
- Output layer: the same update rule as for the single-layer perceptron, wj,i ← wj,i + α · aj · Δi, with Δi = Erri · g'(ini)
- Hidden layer: back-propagate the error from the output layer, Δj = g'(inj) · Σi wj,i Δi
- Update rule for weights in the hidden layer: wk,j ← wk,j + α · ak · Δj
20. Backpropagation Algorithm
- The squared error on a single example is defined as E = 1/2 Σi (yi - ai)^2, where the sum is over the nodes in the output layer.
21. Backpropagation Algorithm
22. Backpropagation Algorithm
- At each epoch (one cycle through the examples), sum the gradient updates for all examples and apply them once (see the sketch below)
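A minimal batch-backpropagation sketch for one hidden layer, summing the gradient updates over all examples each epoch; layer sizes, the learning rate, and the XOR task are illustrative assumptions, not from the slides:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def add_bias(A):
        # Append the fixed -1 bias input from the neuron model.
        return np.hstack([A, -np.ones((A.shape[0], 1))])

    def train(X, Y, n_hidden=4, alpha=0.5, epochs=10000, seed=0):
        rng = np.random.default_rng(seed)
        W1 = rng.normal(scale=0.5, size=(X.shape[1] + 1, n_hidden))
        W2 = rng.normal(scale=0.5, size=(n_hidden + 1, Y.shape[1]))
        for _ in range(epochs):
            Xb = add_bias(X)                       # forward pass, all examples
            H = sigmoid(Xb @ W1)
            Hb = add_bias(H)
            O = sigmoid(Hb @ W2)
            d_out = (Y - O) * O * (1 - O)          # output-layer deltas
            d_hid = (d_out @ W2[:-1].T) * H * (1 - H)  # back-propagated error
            W2 += alpha * Hb.T @ d_out             # summed update, applied once
            W1 += alpha * Xb.T @ d_hid
        return W1, W2

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    Y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR needs a hidden layer
    W1, W2 = train(X, Y)
    print(sigmoid(add_bias(sigmoid(add_bias(X) @ W1)) @ W2).round(2))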
23. Backpropagation Algorithm
24. Backpropagation Algorithm
- Handwritten digit recognition error rates:
- 3-nearest-neighbor: 2.4% error
- 400-300-10 unit MLP: 1.6% error
- LeNet 768-192-30-10 unit MLP: 0.9% error
25. Applications
- Data clustering
- Classification
- Gene Reduction
- Gene Regulatory Networks
26. Clustering
- Tamayo et al. (1999) used SOMs to cluster gene expressions of yeast and humans.
- Data: yeast (Saccharomyces cerevisiae) cell cycle data from Spellman et al. (1998), and hematopoietic differentiation data
- SOMs (Self-Organizing Feature Maps, Kohonen) are well suited to grouping complex multidimensional data into clusters (a minimal sketch of the SOM update follows)
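A minimal SOM sketch under stated assumptions: this is the generic Kohonen update, not the GENECLUSTER implementation, and the learning rate, radius, and decay schedule are illustrative:

    import numpy as np

    def train_som(data, rows=6, cols=5, epochs=100, lr=0.5, radius=2.0, seed=0):
        rng = np.random.default_rng(seed)
        W = rng.normal(size=(rows, cols, data.shape[1]))
        grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                    indexing="ij"), axis=-1)
        for t in range(epochs):
            decay = 1.0 - t / epochs
            for x in data:
                # Best-matching unit: node whose weight vector is closest to x.
                d = np.linalg.norm(W - x, axis=-1)
                bmu = np.unravel_index(np.argmin(d), d.shape)
                # Pull the BMU and its grid neighbours toward x, with a
                # Gaussian falloff that shrinks over time.
                g = np.linalg.norm(grid - np.array(bmu), axis=-1)
                h = np.exp(-(g / (radius * decay + 1e-9)) ** 2)
                W += (lr * decay) * h[..., None] * (x - W)
        return W

    profiles = np.random.default_rng(1).normal(size=(416, 16))  # placeholder
    som = train_som(profiles)  # a 6 x 5 map, as in the paper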
27. Clustering
- Gene expression was measured at 10-minute intervals throughout two cell cycles (160 minutes), giving 16 timesteps
- The data were first filtered to find the genes showing significant variation in expression over the time series
- Gene expression levels were normalized across experiments to focus on the shape of the patterns, not their magnitude (a normalization sketch follows)
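One common way to implement such a normalization is to z-score each gene's profile across the 16 timesteps; this is a sketch of the idea, and the exact procedure in the paper may differ:

    import numpy as np

    def normalize_profiles(expr):
        # expr: (n_genes, n_timesteps); give each profile mean 0, variance 1
        # so clustering reflects the shape of the curve, not its magnitude.
        mean = expr.mean(axis=1, keepdims=True)
        std = expr.std(axis=1, keepdims=True)
        return (expr - mean) / (std + 1e-9)

    expr = np.random.default_rng(0).normal(size=(416, 16))  # placeholder
    print(normalize_profiles(expr).std(axis=1)[:3])  # approx. 1 per gene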
28. Clustering
- Self-Organizing Maps are the basis of GENECLUSTER, developed by the authors to cluster and visualize gene expressions
- A 6 x 5 node SOM was trained on 416 genes from yeast cell cycle expression data previously analyzed by hand, to compare the SOM clusters with those found manually.
- This gives 30 clusters, 4 of which replicate the four cell cycle stages. The clusters identified by the SOM match the clusters built by human experts very well, and correspond to the G1, S, G2, and M phases of the cell cycle.
29. Clustering
[Figure: comparison of SOM-derived and human-derived clusters.]
30. Classification
- Cai and Chou (1998) used ANNs to predict HIV protease cleavage sites in proteins.
- Knowing the HIV protease cleavage sites in proteins is helpful for designing specific and efficient HIV protease inhibitors.
- Subject of study: the HIV-1 protease.
- Training set: 299 oligopeptides. Test set: 63 oligopeptides. Result: a high rate of correct prediction (58/63 = 92.06%).
31. Classification
32. Classification
- HIV data: 114 positive sequences, 248 negative sequences, for a total of 362 sequences; 300 training cycles of the ANNs.
- HCV data: 168 positive sequences, 752 negative sequences, for a total of 920 sequences; 500 training cycles of the ANNs.
- 20% of the positives were held out for testing. 10 different training and test sets were created for HCV and HIV using roulette wheel random selection preserving the 20% criterion. Each training/test pair was run three times with random initialization of the network.
33. Gene Expression Data (GED)
- GED measure the relative expression levels of genes at a single timestep using cDNA or Affymetrix chips
- When individuals are measured only once, a gene classificatory network for the population can be extracted (see the myeloma data)
- When individuals are measured more than once across time, a gene regulatory network needs to be reverse engineered
34. Gene Reduction
- Narayanan et al. (2004) used ANNs to analyze myeloma gene expressions
- Goal: by analyzing the genes involved temporally in the development of a disease, identify patterns of genes to better characterize the disease and design efficient drugs.
- Design drugs to target specific genes at important points in time.
35. Gene Reduction
- Two major problems for current gene expression analysis techniques:
- Dimensionality: the sheer volume of data leads to the need for fast analytical tools
- Sparsity: there are many more genes than samples
- G = S + C: gene expression analysis (G) is concerned with selecting a small subset of relevant genes (the S problem) as well as combining individual genes to identify important causal and classificatory relationships (the C problem).
36. Gene Reduction
- Myeloma data: 7129 gene expression values for 105 samples. A one-layer feed-forward backpropagation ANN with 7129 input nodes and one output node (myeloma / normal).
- Trained until the sum of squared errors (SSE) on the output node is less than 0.001 (3000 epochs, 8 minutes on a Pentium laptop).
- Weight values ranged between -0.08196 and 0.07343, with an average of 0.000746; 1443 links had weight 0 across all runs.
37. Gene Reduction
- The top 220 genes were then selected, and the process of training the network was repeated on this subset (see the sketch below).
- The relevant data was extracted from the full dataset, with the class information of each sample. The top 21 genes for myeloma were finally extracted.
- Interesting causal and classificatory rules were learnt.
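A hedged sketch of this reduction loop: train a single-layer network, rank input genes by the magnitude of their trained weights, keep the top ones, and retrain on the subset. The placeholder data and the simple trainer are illustrative; only the thresholds (220, then 21) come from the slides:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_weights(X, y, alpha=0.1, epochs=200):
        # Single sigmoid output unit trained by gradient descent,
        # with the fixed -1 bias input in column 0.
        Xb = np.hstack([-np.ones((X.shape[0], 1)), X])
        w = np.zeros(Xb.shape[1])
        for _ in range(epochs):
            h = sigmoid(Xb @ w)
            w += alpha * Xb.T @ ((y - h) * h * (1 - h))
        return w

    def top_genes(X, y, keep):
        w = train_weights(X, y)
        return np.argsort(np.abs(w[1:]))[::-1][:keep]  # skip the bias weight

    # Placeholder standing in for the 105 x 7129 myeloma matrix.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(105, 500))
    y = rng.integers(0, 2, size=105).astype(float)

    idx220 = top_genes(X, y, 220)           # first reduction pass
    idx21 = top_genes(X[:, idx220], y, 21)  # retrain on the reduced subset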
38. Gene Reduction
- If U24685 (weight -1.84127) is absent then myeloma. U24685 corresponds to the anti-B cell autoantibody IgM heavy chain variable V-D-J region (VH4); this rule classified correctly 63 of 75 myeloma cases, with no false positives.
- If L00022 (weight -1.79993) is absent then myeloma. L00022 corresponds to the Ig active heavy chain epsilon-1; this rule classified correctly 68 of 75 myeloma cases, but also misclassified three normal cases.
39. Gene Reduction
- If X57809 (weight 1.58233) is present then myeloma. X57809 corresponds to the rearranged immunoglobulin lambda light chain; this rule classified correctly 51 of 75 myeloma cases, with no false positives.
- If M34516 is present then myeloma. M34516 corresponds to the omega light chain protein 14.1 (Ig lambda chain related); this rule classified correctly 61 of 75 myeloma cases, but also misclassified two normal cases.
40. Gene Regulatory Networks
- Gene network construction
- Requires temporal GED
- Develops relationships between gene expression values across timesteps
- These relationships can then form a gene regulatory network
- This network describes the excitation and inhibition which govern gene expression patterns
41. Gene Regulatory Networks
42. Gene Regulatory Networks
- Boolean network model
- Each gene receives one or several inputs from other genes
- A sigmoid function models the gene as a binary element
- The output at time T+1 is computed from the inputs at time T according to boolean logic
- Time is discretized (a minimal sketch follows)
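A minimal Boolean-network sketch: each gene's state at time T+1 is a boolean function of the gene states at time T. The three-gene rules here are invented purely for illustration:

    def step(state):
        a, b, c = state
        return (b and not c,  # a(T+1)
                a or c,       # b(T+1)
                not a)        # c(T+1)

    state = (True, False, True)
    for t in range(6):
        print(t, state)
        state = step(state)  # the trajectory eventually repeats (an attractor)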
43. Gene Regulatory Networks
- Boolean gene network example
- Input is at time T
- Output is at time T+1
44. Gene Regulatory Networks
- Process to construct Liang networks
- Train the ANN on pairs of gene expression values from the training set
- The network is trained to map the pattern at time T to the pattern at time T+1; the error is the difference between expected and observed patterns
- Train on all pairs in the training set and calculate the percentage of correct values (see the sketch below)
- Single-layer networks reduce complexity and improve transparency
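A hedged sketch of this setup: a single-layer sigmoid network trained on consecutive expression patterns, with the pattern at time T as input and the pattern at time T+1 as target (the tiny series and all sizes are placeholders, not data from the slides):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_transition(series, alpha=0.5, epochs=2000, seed=0):
        # series: (timesteps, n_genes) binary expression matrix.
        X, Y = series[:-1], series[1:]        # pairs (pattern at T, at T+1)
        rng = np.random.default_rng(seed)
        W = rng.normal(scale=0.1, size=(series.shape[1], series.shape[1]))
        for _ in range(epochs):
            P = sigmoid(X @ W)
            W += alpha * X.T @ ((Y - P) * P * (1 - P))
        return W

    series = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 1, 1]], float)
    W = train_transition(series)
    pred = (sigmoid(series[:-1] @ W) > 0.5).astype(int)
    print((pred == series[1:]).mean())  # percentage of correct values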
45. Gene Regulatory Networks
- All Boolean network time series terminate in specific, repeating attractor patterns.
- These can be visualized as basin-of-attraction graphs.
- All trajectories are strictly determined, and many states converge on one attractor.
- Stability of gene networks.