Title: CS 4700: Foundations of Artificial Intelligence
1. CS 4700: Foundations of Artificial Intelligence
- Prof. Carla P. Gomes
- gomes_at_cs.cornell.edu
- Module: Neural Networks - Concepts
- (Reading: Chapter 20.5)
2. Basic Concepts
A neural network maps a set of inputs to a set of outputs; the number of inputs/outputs is variable. The network itself is composed of an arbitrary number of nodes, or units, connected by links, with an arbitrary topology. A link from unit i to unit j serves to propagate the activation ai from i to j, and it has a weight Wi,j.
What can a neural network do?
- Compute a known function
- Approximate an unknown function
- Pattern recognition / signal processing
- Learn to do any of the above
3. Different types of nodes
4. An Artificial Neuron (Node or Unit): A Mathematical Abstraction
Artificial neuron, node or unit: processing unit i.
- Input function (ini): weighted sum of its inputs, including the fixed input a0.
- Output: activation function (g) applied to the input function (typically non-linear).
→ a processing element producing an output based on a function of its inputs.
Note: the fixed input a0 = -1 and bias weight W0,i are a convention; some authors instead use, e.g., a0 = 1 and -W0,i.
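For reference, the input function and activation of unit i (shown as an image on the original slide) follow the standard textbook form; a reconstruction in LaTeX, assuming the a0 = -1 convention above:

```latex
in_i \;=\; \sum_{j=0}^{n} W_{j,i}\, a_j
\qquad\qquad
a_i \;=\; g(in_i) \;=\; g\!\left(\sum_{j=0}^{n} W_{j,i}\, a_j\right)
```

Here the j = 0 term contributes W0,i · a0 = -W0,i, which acts as the bias.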
5Activation Functions
- Threshold activation function ? a step function
or threshold function - (outputs 1 when the input is positive 0
otherwise). - (b) Sigmoid (or logistics function) activation
function (key advantage differentiable) - (c) Sign function, 1 if input is positive,
otherwise -1.
These functions have a threshold (either hard or
soft) at zero.
? Changing the bias weight W0,i moves the
threshold location.
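A minimal Python (NumPy) sketch of the three activation functions as defined above; the function names are mine, not from the slides:

```python
import numpy as np

def step(x):
    """(a) Threshold/step: 1 when the input is positive, 0 otherwise."""
    return np.where(x > 0, 1, 0)

def sigmoid(x):
    """(b) Sigmoid (logistic): a soft threshold, differentiable everywhere."""
    return 1.0 / (1.0 + np.exp(-x))

def sign(x):
    """(c) Sign: 1 if the input is positive, otherwise -1."""
    return np.where(x > 0, 1, -1)

# All three have their (hard or soft) threshold at zero:
xs = np.array([-2.0, 0.0, 2.0])
for g in (step, sigmoid, sign):
    print(g.__name__, g(xs))
```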
6. Threshold Activation Function
Input edges, each with a weight (positive or negative; the weights change over time through learning).
θi: the threshold value associated with unit i.
[Figure: step activation, shown with threshold θi = 0 and with θi = t]
7. Implementing Boolean Functions
Units with a threshold activation function can act as logic gates; we can use these units to compute Boolean functions of their inputs.
8. Boolean AND

input x1   input x2   output
    0          0         0
    0          1         0
    1          0         0
    1          1         1

[Unit diagram: inputs x1, x2 with weights w1 = 1, w2 = 1; fixed input -1 with bias weight W0 = 1.5]
9. Boolean OR

input x1   input x2   output
    0          0         0
    0          1         1
    1          0         1
    1          1         1

[Unit diagram: inputs x1, x2 with weights w1 = 1, w2 = 1; fixed input -1 with bias weight W0 = 0.5]
10. Inverter

input x1   output
    0        1
    1        0

[Unit diagram: input x1 with a negative weight and a negative bias weight]

So, units with a threshold activation function can act as logic gates, given the appropriate input and bias weights.
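A small sketch of the three gates as threshold units, using the AND/OR weights from slides 8-9; the inverter weights (w1 = -1, W0 = -0.5) are one standard choice that works, not values given on the slide:

```python
def threshold_unit(weights, bias_weight, inputs):
    """Fires (outputs 1) iff the weighted sum of the inputs, plus the
    fixed input a0 = -1 times the bias weight, is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) - bias_weight
    return 1 if total > 0 else 0

def AND(x1, x2):   # slide 8: w1 = w2 = 1, W0 = 1.5
    return threshold_unit([1, 1], 1.5, [x1, x2])

def OR(x1, x2):    # slide 9: w1 = w2 = 1, W0 = 0.5
    return threshold_unit([1, 1], 0.5, [x1, x2])

def NOT(x1):       # assumed weights: w1 = -1, W0 = -0.5
    return threshold_unit([-1], -0.5, [x1])

for x1 in (0, 1):
    for x2 in (0, 1):
        print(f"AND({x1},{x2})={AND(x1, x2)}  OR({x1},{x2})={OR(x1, x2)}")
print(NOT(0), NOT(1))  # -> 1 0
```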
11. Network Structures
- Acyclic or feed-forward networks
  - Activation flows from the input layer to the output layer
  - single-layer perceptrons
  - multi-layer perceptrons
- Recurrent networks
  - Feed the outputs back into their own inputs
  - The network is a dynamical system (stable states, oscillations, chaotic behavior)
  - Response of the network depends on its initial state
  - Can support short-term memory
  - More difficult to understand
Our focus: feed-forward networks. They implement functions and have no internal state (only the weights).
12Recurrent Networks
- Can capture internal state (activation keeps
going around) - ? more complex agents.
- Brain cannot be a just a feed-forward network!
- Brain has many feed-back connections and cycles
- ? brain is a recurrent network!
Two key examples Hopfield networks Boltzmann
Machines .
13. Hopfield Networks
- A Hopfield neural network is typically used for pattern recognition.
- Hopfield networks have symmetric weights (Wij = Wji).
- Outputs are 0/1 only.
- The weights are trained to obtain an associative memory: e.g., template patterns are stored as multiple stable states; given a new input pattern, the network converges to one of the exemplar patterns.
- It can be proven that an N-unit Hopfield net can learn up to 0.138N patterns reliably.
- Note: no explicit storage; everything is in the weights!
14. Hopfield Networks
- The user trains the network with a set of black-and-white templates.
- Input units: 100 pixels
- Output units: 100 pixels
- For each template, each neuron in the network (corresponding to one pixel) learns to turn itself on or off based on the current output of every other neuron in the network.
- After training, the network can be provided with an arbitrary input pattern, and it (may) converge to an output pattern resembling whichever template most closely matches this input pattern.
http://www.cbu.edu/pong/ai/hopfield/hopfieldapplet.html
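A minimal sketch of the idea in Python, assuming the common ±1 encoding (equivalent to the slide's 0/1 outputs), Hebbian one-shot training, and asynchronous updates; this is an illustration, not the applet's code:

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian learning: W[i,j] = (1/n) * sum over patterns of s_i*s_j.
    Weights come out symmetric (Wij = Wji), with no self-connections."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, state, steps=500, seed=0):
    """Asynchronous updates: each step, one unit recomputes its output
    from the current outputs of every other unit in the network."""
    state = state.copy()
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        i = rng.integers(len(state))
        state[i] = 1 if W[i] @ state > 0 else -1
    return state

# Store two tiny templates, then recall from a corrupted input.
templates = np.array([[1, 1, 1, 1, -1, -1, -1, -1],
                      [1, -1, 1, -1, 1, -1, 1, -1]])
W = train_hopfield(templates)
noisy = np.array([-1, 1, 1, 1, -1, -1, -1, -1])  # template 0, one bit flipped
print(recall(W, noisy))  # converges to the first template
```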
15-16. Hopfield Networks: Demo
Given an input pattern, after around 500 iterations the network converges to one of the stored templates. (Two example runs; the before/after images are omitted here.)
http://www.cbu.edu/pong/ai/hopfield/hopfieldapplet.html
17. Boltzmann Machines
- A generalization of Hopfield networks.
- Hidden neurons: Boltzmann machines have hidden units.
- Neuron update: stochastic activation functions.
Both Hopfield networks and Boltzmann machines can solve optimization problems (similarly to Monte Carlo methods).
We will not cover these networks.
18. Feed-forward Network: Represents a Function of Its Input
Two input units, two hidden units, one output unit. Each unit receives input only from units in the immediately preceding layer. (Bias units omitted for simplicity.)
Given an input vector x = (x1, x2), the activations of the input units are set to the values of the input vector, i.e., (a1, a2) = (x1, x2), and the network computes

a5 = g(W3,5 a3 + W4,5 a4)
   = g(W3,5 g(W1,3 a1 + W2,3 a2) + W4,5 g(W1,4 a1 + W2,4 a2))

(numbering the input units 1-2, the hidden units 3-4, and the output unit 5).
A feed-forward network computes a parameterized family of functions hW(x).
By adjusting the weights we get different functions: that is how learning is done in neural networks!
Note: the input layer in general does not include computing units.
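A sketch of this particular 2-2-1 network in Python with sigmoid units, using the unit numbering from the equation above; the weight values at the bottom are arbitrary placeholders, chosen only to show that different W give different functions:

```python
import numpy as np

def g(x):
    """Sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-x))

def h_W(x, W):
    """hW(x) for the 2-2-1 network: units 1,2 = inputs, 3,4 = hidden,
    5 = output. W maps a link (i, j) to its weight Wi,j (bias units
    omitted, as on the slide)."""
    a1, a2 = x                            # (a1, a2) = (x1, x2)
    a3 = g(W[1, 3] * a1 + W[2, 3] * a2)   # hidden unit 3
    a4 = g(W[1, 4] * a1 + W[2, 4] * a2)   # hidden unit 4
    a5 = g(W[3, 5] * a3 + W[4, 5] * a4)   # output unit 5
    return a5

# Arbitrary example weights: adjusting these changes the function computed.
W = {(1, 3): 0.5, (2, 3): -0.5, (1, 4): 1.0, (2, 4): 1.0,
     (3, 5): 2.0, (4, 5): -1.0}
print(h_W((0.0, 1.0), W))
```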
19Feed-forward Network (contd.)
- A neural network can be used for classification
or regression. - For Boolean classification with continuous
outputs (e.g., with sigmoid - units) ? typically a single output unit (valuegt
0.5 ? one class) - For k-way classification, one could divide the
single output units range - into k portions ? typically, k separate output
units, with the value of each - one representing the relative likelihood of that
class given the current - input
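A small sketch of the two readout schemes just described (illustrative only; "outputs" stands for whatever the network's output units produce):

```python
import numpy as np

def boolean_class(output_value, threshold=0.5):
    """Single sigmoid output: value > 0.5 -> one class, else the other."""
    return 1 if output_value > threshold else 0

def k_way_class(outputs):
    """k output units: pick the class whose unit reports the highest
    relative likelihood."""
    return int(np.argmax(outputs))

print(boolean_class(0.73))           # -> 1
print(k_way_class([0.1, 0.7, 0.2]))  # -> class 1 (of classes 0..2)
```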
20. Large IBM Investment in the Next Generation of Neural Nets
- IBM plans 'brain-like' computers
- Page last updated at 14:52 GMT, Friday, 21 November 2008
- By Jason Palmer, Science and technology reporter, BBC News
- IBM has announced it will lead a US government-funded collaboration to make electronic circuits that mimic brains.
http://news.bbc.co.uk/2/hi/science/nature/7740484.stm