This time - PowerPoint PPT Presentation


Slides: 48

Provided by: paolopir


Transcript and Presenter's Notes

Title: This time

1
This time

Iterative improvement
Hill climbing
Simulated annealing
Genetic Algorithms

2
Iterative improvement

In many optimization problems, the path is
irrelevant;
the goal state itself is the solution.
Then, state space = space of complete
configurations.
Algorithm goal:
- find an optimal configuration (e.g., TSP), or
- find a configuration satisfying constraints
(e.g., n-queens)
In such cases, we can use iterative improvement
algorithms: keep a single current state, and
try to improve it.

3
Iterative improvement example: vacuum world
Simplified world: 2 locations, each may or may not
contain dirt, each may or may not contain the vacuuming
agent. Goal of agent: clean up the dirt. If the path
does not matter, we do not need to keep track of it.
4
Iterative improvement example: n-queens

Goal: Put n chess-game queens on an n x n board,
with no two queens on the same row, column, or
diagonal.
Here, the goal state is initially unknown but is
specified by constraints that it must satisfy.
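
One way to make these constraints usable by an iterative improvement algorithm is to count how many of them a configuration violates. The sketch below does that under an assumed encoding (one queen per column, so only rows and diagonals need checking); the name `count_conflicts` is hypothetical and not from the slides.

```python
from itertools import combinations

def count_conflicts(rows):
    """Count pairs of queens that attack each other.

    rows[c] is the row of the queen in column c (assumed encoding:
    exactly one queen per column, so column clashes cannot occur).
    """
    conflicts = 0
    for (c1, r1), (c2, r2) in combinations(enumerate(rows), 2):
        same_row = r1 == r2
        same_diagonal = abs(r1 - r2) == abs(c1 - c2)
        if same_row or same_diagonal:
            conflicts += 1
    return conflicts

print(count_conflicts([1, 3, 0, 2]))  # a valid 4-queens placement
print(count_conflicts([0, 1, 2, 3]))  # all queens on one diagonal
```

A configuration with zero conflicts satisfies all the constraints, so the goal can be phrased as minimizing this count.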

5
Hill climbing (or gradient ascent/descent)

Iteratively maximize value of current state, by
replacing it by successor state that has highest
value, as long as possible.
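
A minimal sketch of this loop, assuming the caller supplies a `neighbors` function (the successor states) and a `value` function; both names and the toy landscape are illustrative assumptions, not part of the slides.

```python
def hill_climb(start, neighbors, value):
    """Steepest-ascent hill climbing: move to the best successor
    until no successor improves on the current state."""
    current = start
    while True:
        candidates = neighbors(current)
        if not candidates:
            return current
        best = max(candidates, key=value)
        if value(best) <= value(current):
            return current  # no better successor: a (local) maximum
        current = best

# Toy landscape (hypothetical): states are indices into a list of values.
landscape = [1, 3, 5, 4, 2, 6, 9, 7]

def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]

def value(i):
    return landscape[i]

print(hill_climb(0, neighbors, value))  # stops at index 2 (value 5)
print(hill_climb(4, neighbors, value))  # stops at index 6 (value 9)
```

In this toy run the result depends on the starting index: starting at 0 ends on a lower peak than starting at 4.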

6
Question: What is the difference between this
problem and our problem (finding global minima)?
7
Hill climbing

Note: minimizing a value function v(n) is
equivalent to maximizing -v(n),
thus both notions are used interchangeably.
Notion of extremization: find extrema (minima
or maxima) of a value function.

8
Hill climbing

Problem: depending on the initial state, may get
stuck in a local extremum.

9
Minimizing energy

Let's now change the formulation of the problem a
bit, so that we can employ a new formalism:
- let's compare our state space to that of a
physical system that is subject to natural
interactions,
- and let's compare our value function to the
overall potential energy E of the system.
On every update,
we have $\Delta E \le 0$

10
Minimizing energy

Hence the dynamics of the system tend to move E
toward a minimum.
We stress that there may be different such states:
they are local minima. Global minimization is
not guaranteed.

11
Local Minima Problem

Question: How do you avoid this local minimum?

[Figure: energy landscape showing the starting point, the descent direction, a local minimum, the barrier to local search, and the global minimum]
12
Consequences of the Occasional Ascents
Desired effect:
Help escape local optima.
Adverse effect:
Might pass the global optimum after reaching it
(easy to avoid by keeping track of the best-ever
state).
13
Boltzmann machines

The Boltzmann Machine of
Hinton, Sejnowski, and Ackley (1984)
uses simulated annealing to escape local minima.
To motivate their solution, consider how one
might get a ball-bearing traveling along the
curve to "probably end up" in the deepest
minimum. The idea is to shake the box "about h
hard"; then the ball is more likely to go from
D to C than from C to D. So, on average, the
ball should end up in C's valley.

14
Simulated annealing basic idea

From the current state, pick a random successor
state.
If it has a better value than the current state, then
accept the transition, that is, use the successor
state as the current state.
Otherwise, do not give up, but instead flip a
coin and accept the transition with a given
probability (that is lower as the successor is
worse).
So we sometimes accept to un-optimize the value
function a little, with a non-zero probability.
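
The acceptance rule can be sketched as a single function. The exponential form `exp(delta / T)` is the one the later limit-case slides use for maximization; the function name `accept` and the `rng` parameter are assumptions made for this illustration, and T is assumed to be positive.

```python
import math
import random

def accept(current_value, successor_value, T, rng=random):
    """Decide whether to move to the successor (maximization)."""
    delta = successor_value - current_value
    if delta > 0:
        return True  # better state: always accept
    # Worse (or equal) state: accept with a probability that shrinks
    # as the successor gets worse. exp(delta / T) is one common choice.
    return rng.random() < math.exp(delta / T)

rng = random.Random(0)
slightly_worse = sum(accept(0.0, -1.0, 1.0, rng) for _ in range(2000))
much_worse = sum(accept(0.0, -5.0, 1.0, rng) for _ in range(2000))
print(slightly_worse, much_worse)
```

The two counts illustrate that a slightly worse successor is accepted far more often than a much worse one at the same T.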

15
Boltzmann's statistical theory of gases

In the statistical theory of gases, the gas is
described not by a deterministic dynamics, but
rather by the probability that it will be in
different states.
The 19th century physicist Ludwig Boltzmann
developed a theory that included a probability
distribution of temperature (i.e., every small
region of the gas had the same kinetic energy).
Hinton, Sejnowski and Ackley's idea was that this
distribution might also be used to describe
neural interactions, where low temperature T is
replaced by a small noise term T (the neural
analog of random thermal motion of molecules).
While their results primarily concern
optimization using neural networks, the idea is
more general.

16
Boltzmann distribution

At thermal equilibrium at temperature T, the
Boltzmann distribution gives the relative
probability that the system will occupy state A
vs. state B as
$$\frac{P(A)}{P(B)} = \exp\left(-\frac{E(A) - E(B)}{T}\right) = \frac{\exp(-E(A)/T)}{\exp(-E(B)/T)}$$
where E(A) and E(B) are the energies associated
with states A and B.

17
Simulated annealing

Kirkpatrick et al. (1983):
Simulated annealing is a general method for
making escape from local minima likely by
allowing jumps to higher energy states.
The analogy here is with the process of annealing
used by a craftsman in forging a sword from an
alloy.
He heats the metal, then slowly cools it as he
hammers the blade into shape.
If he cools the blade too quickly, the metal will
form patches of different composition.
If the metal is cooled slowly while it is shaped,
the constituent metals will form a uniform alloy.

18
Real annealing: Sword


19
Simulated annealing in practice

set T
optimize for given T
lower T (see Geman & Geman, 1984)
repeat

20
Simulated annealing in practice


MDSA: Molecular Dynamics Simulated Annealing
21
Simulated annealing in practice

Geman & Geman (1984): if T is lowered
sufficiently slowly (with respect to the number
of iterations used to optimize at a given T),
simulated annealing is guaranteed to find the
global minimum.
Caveat: this algorithm has no end (Geman &
Geman's T decrease schedule goes as 1/log of
the number of iterations, so T will never reach
zero), so it may take an infinite amount of time
for it to find the global minimum.
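
To get a feel for how slowly a 1/log schedule cools, here is a small sketch. The exact form `c / log(1 + k)` and the scale constant `c` are assumptions chosen for illustration; the slide only says the schedule goes as 1/log of the number of iterations.

```python
import math

def log_schedule(k, c=1.0):
    """Temperature at iteration k >= 1, decreasing as c / log(1 + k).

    c is a hypothetical scale constant chosen for illustration.
    """
    return c / math.log(1 + k)

for k in (1, 10, 1000, 10**6, 10**9):
    print(k, round(log_schedule(k), 4))
```

Even after a billion iterations the temperature is still well above zero, which is why practical implementations often trade the guarantee for a faster schedule.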

22
Simulated annealing algorithm

Idea: Escape local extrema by allowing bad
moves, but gradually decrease their size and
frequency.

Note: the goal here is to maximize E.
23
Simulated annealing algorithm

Idea: Escape local extrema by allowing bad
moves, but gradually decrease their size and
frequency.

Algorithm when the goal is to minimize E.
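
The algorithm listing itself did not survive in this transcript, so the following is only a sketch of the minimization variant under stated assumptions: bad moves (positive energy change) are accepted with probability `exp(-delta / T)`, the best-ever state is remembered, and the toy landscape, the `neighbor` function, and the geometric cooling schedule are all hypothetical choices for illustration.

```python
import math
import random

def simulated_annealing(start, neighbor, energy, schedule, steps, rng=random):
    """Minimize energy(state); returns the best state seen."""
    current = best = start
    for k in range(1, steps + 1):
        T = schedule(k)
        candidate = neighbor(current, rng)
        delta = energy(candidate) - energy(current)
        # Downhill moves are always taken; uphill ("bad") moves are
        # taken with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / T):
            current = candidate
            if energy(current) < energy(best):
                best = current  # remember the best-ever state
    return best

# Toy problem (hypothetical): a 1-D landscape with a local minimum at
# index 2 and the global minimum at index 6.
landscape = [5, 3, 1, 4, 6, 2, 0, 3]

def neighbor(i, rng):
    j = i + rng.choice((-1, 1))
    return min(max(j, 0), len(landscape) - 1)

def energy(i):
    return landscape[i]

def schedule(k):
    return 3.0 * 0.99 ** k  # geometric cooling, an assumption

rng = random.Random(0)
print(simulated_annealing(2, neighbor, energy, schedule, 2000, rng))
```

Because the best-ever state is tracked, the returned state is never worse than the starting state, even if the walk later wanders uphill.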
24
Note on simulated annealing limit cases

Boltzmann distribution: accept a bad move with
$\Delta E < 0$ (goal is to maximize E) with probability
$P(\Delta E) = \exp(\Delta E / T)$
If T is large: $\Delta E < 0$,
$\Delta E / T < 0$ and very small in magnitude,
$\exp(\Delta E / T)$ close to 1,
accept bad move with high probability.
If T is near 0: $\Delta E < 0$,
$\Delta E / T < 0$ and very large in magnitude,
$\exp(\Delta E / T)$ close to 0,
accept bad move with low probability.
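
The two limit cases can be checked numerically. In this sketch the energy change of -1 and the three temperatures are arbitrary values picked for illustration.

```python
import math

def accept_probability(delta_E, T):
    """P(delta_E) = exp(delta_E / T) for a bad move (delta_E < 0)
    when the goal is to maximize E."""
    return math.exp(delta_E / T)

delta_E = -1.0  # hypothetical bad move
for T in (100.0, 1.0, 0.01):
    print(T, accept_probability(delta_E, T))
```

At T = 100 the probability is very close to 1, and at T = 0.01 it is vanishingly small.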

25
Note on simulated annealing limit cases

Boltzmann distribution: accept a bad move with
$\Delta E < 0$ (goal is to maximize E) with probability
$P(\Delta E) = \exp(\Delta E / T)$
If T is large: $\exp(\Delta E / T)$ is close to 1,
so bad moves are accepted with high probability:
random walk.
If T is near 0: $\exp(\Delta E / T)$ is close to 0,
so bad moves are accepted with low probability:
deterministic down-hill.
26
Summary

Best-first search: general search, where the
minimum-cost nodes (according to some measure)
are expanded first.
Greedy search: best-first with the estimated
cost to reach the goal as a heuristic measure.
- generally faster than uninformed search
- not optimal
- not complete.
A* search: best-first with measure = path cost
so far + estimated path cost to goal.
- combines advantages of uniform-cost and
greedy searches
- complete, optimal and optimally efficient
- space complexity still exponential

27
Genetic Algorithms
28
The Traditional Approach

Ask an expert
Adapt existing designs
Trial and error

29
Nature's Starting Point
Alison Everitt's A User's Guide to Men
30
Optimised Man!
31
Example: Pursuit and Evasion

Using NNs and a genetic algorithm
[Figure panels: 0 learning, 200 tries, 999 tries]

32
Comparisons

Traditional
- best guess
- may lead to local, not global optimum
Nature
- population of guesses
- more likely to find a better solution

33
More Comparisons

Nature
- not very efficient
- at least a 20 year wait between generations
- not all mating combinations possible
Genetic algorithm
- efficient and fast
- optimization complete in a matter of minutes
- mating combinations governed only by fitness

34
The Genetic Algorithm Approach

Define limits of variable parameters
Generate a random population of designs
Assess fitness of designs
Mate selection
Crossover
Mutation
Reassess fitness of new population
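
The steps above can be sketched as one loop over bit-string designs. Everything specific here is an assumption for illustration: the bit-string encoding, keeping the fitter half as parents, single-point crossover, the mutation rate, and the toy fitness function (`sum`, which counts the 1 bits). The sketch assumes a population of at least 4 and at least 2 bits per design.

```python
import random

def genetic_algorithm(fitness, n_bits, pop_size, generations,
                      mutation_rate=0.01, rng=random):
    # Generate a random population of designs (bit strings).
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Assess fitness and rank the designs, fittest first.
        population.sort(key=fitness, reverse=True)
        # Mate selection: the fitter half become the parents.
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Crossover: splice the parents at a random cut point.
            cut = rng.randint(1, n_bits - 1)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        # The new population is reassessed on the next pass.
        population = parents + children
    return max(population, key=fitness)

rng = random.Random(1)
best = genetic_algorithm(sum, n_bits=20, pop_size=30, generations=40, rng=rng)
print(best, sum(best))
```

Because the parents survive unchanged into the next generation, the best fitness in the population never decreases from one generation to the next in this sketch.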

35
A Population
36
Ranking by Fitness
37
Mate Selection: the fittest are copied and replace
the less fit
38
Mate Selection (Roulette): increasing the
likelihood of, but not guaranteeing, reproduction
of the fittest
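
A minimal sketch of roulette-wheel selection, assuming non-negative fitness values that are not all zero; the function name `roulette_select` and the example designs are hypothetical.

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to fitness.

    Assumes all fitnesses are non-negative and not all zero.
    """
    total = sum(fitnesses)
    spin = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if spin < running:
            return individual
    return population[-1]  # guard against floating-point round-off

rng = random.Random(0)
designs = ["A", "B", "C"]
fitnesses = [1.0, 3.0, 6.0]
picks = [roulette_select(designs, fitnesses, rng) for _ in range(10000)]
print({d: picks.count(d) for d in designs})
```

The fittest design is picked most often, but the weaker ones still get chosen occasionally, which is the point of the roulette scheme.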
39
Crossover: exchanging information by swapping some
part of the representation
40
Mutation: random change of binary digits from 0
to 1 and vice versa (to avoid local minima)
41
Best Design
42
The GA Cycle
43
The Process
44
Genetic Algorithms

Advantage:
Good at finding a region of the solution space that
includes the optimal solution.
Disadvantage:
Slow at converging to the optimal solution itself.

45
Genetic Approach

When applied to strings of genes, the approaches
are classified as genetic algorithms (GA)
When applied to pieces of executable programs,
the approaches are classified as genetic
programming (GP)
GP operates at a higher level of abstraction than
GA

46
Typical Chromosome
47
Summary

Time complexity of heuristic algorithms depends on
the quality of the heuristic function. Good heuristics
can sometimes be constructed by examining the
problem definition or by generalizing from
experience with the problem class.
Iterative improvement algorithms keep only a
single state in memory.
They can get stuck in local extrema; simulated
annealing provides a way to escape local extrema,
and is complete and optimal given a slow enough
cooling schedule.