This time - PowerPoint PPT Presentation

Slides: 48
Provided by: paolopir
Transcript and Presenter's Notes

Title: This time


1
This time
  • Iterative improvement
  • Hill climbing
  • Simulated annealing
  • Genetic Algorithms

2
Iterative improvement
  • In many optimization problems, the path is
    irrelevant;
  • the goal state itself is the solution.
  • Then, state space = space of complete
    configurations.
  • Algorithm goal:
  • - find an optimal configuration (e.g., TSP), or
  • - find a configuration satisfying constraints
      (e.g., n-queens)
  • In such cases, we can use iterative improvement
    algorithms: keep a single current state, and
    try to improve it.

3
Iterative improvement example: vacuum world
Simplified world: 2 locations, each may or may not
contain dirt, each may or may not contain the
vacuuming agent. Goal of agent: clean up the dirt.
If the path does not matter, there is no need to
keep track of it.
4
Iterative improvement example: n-queens
  • Goal: Put n chess queens on an n x n board,
    with no two queens on the same row, column, or
    diagonal.
  • Here, the goal state is initially unknown but is
    specified by constraints that it must satisfy.
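
A goal specified only by constraints can be checked by counting violated constraints. The sketch below assumes one particular encoding that the slides do not prescribe: `queens[c]` is the row of the queen in column `c`, so column clashes cannot occur by construction. The names `conflicts` and `is_goal` are illustrative.

```python
from itertools import combinations

def conflicts(queens):
    """Count attacking pairs; queens[c] is the row of the queen in column c."""
    count = 0
    for (c1, r1), (c2, r2) in combinations(enumerate(queens), 2):
        same_row = r1 == r2
        same_diagonal = abs(r1 - r2) == abs(c1 - c2)
        if same_row or same_diagonal:
            count += 1
    return count

def is_goal(queens):
    """A configuration is a goal state when no pair of queens attacks each other."""
    return conflicts(queens) == 0
```

With this encoding, an iterative improvement algorithm could use `conflicts` as the value to minimize.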

5
Hill climbing (or gradient ascent/descent)
  • Iteratively maximize value of current state, by
    replacing it by successor state that has highest
    value, as long as possible.
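
A minimal sketch of this loop, assuming a finite set of successors per state. The toy landscape `heights`, and the helper names `value` and `successors`, are made up for illustration and are not from the slides.

```python
def hill_climb(value, successors, state):
    """Steepest-ascent hill climbing: move to the best successor until none improves."""
    while True:
        best = max(successors(state), key=value)
        if value(best) <= value(state):
            return state  # no successor is strictly better: a (possibly local) maximum
        state = best

# Toy landscape over positions 0..10: a local peak at 2 and the global peak at 8.
heights = [0, 2, 3, 1, 0, 1, 4, 6, 9, 5, 2]

def value(i):
    return heights[i]

def successors(i):
    """Neighbouring positions that stay inside the landscape."""
    return [j for j in (i - 1, i + 1) if 0 <= j < len(heights)]
```

On this landscape, starting at position 0 ends on the peak at 2, while starting at position 5 reaches the peak at 8, so the result depends on the initial state.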

6
Question: What is the difference between this
problem and our problem (finding global minima)?
7
Hill climbing
  • Note: minimizing a value function v(n) is
    equivalent to maximizing -v(n),
  • thus both notions are used interchangeably.
  • Notion of extremization: find extrema (minima
    or maxima) of a value function.

8
Hill climbing
  • Problem: depending on the initial state, may get
    stuck in a local extremum.

9
Minimizing energy
  • Let's now change the formulation of the problem a
    bit, so that we can employ a new formalism:
  • - let's compare our state space to that of a
    physical system that is subject to natural
    interactions,
  • - and let's compare our value function to the
    overall potential energy E of the system.
  • On every update,
  • we have ΔE ≤ 0

10
Minimizing energy
  • Hence the dynamics of the system tend to move E
    toward a minimum.
  • We stress that there may be several such states:
    they are local minima. Global minimization is
    not guaranteed.

11
Local Minima Problem
  • Question: How do you avoid this local minimum?

[Figure: energy landscape. Labels: starting point, descent direction, local minimum, barrier to local search, global minimum]
12
Consequences of the Occasional Ascents
  • Desired effect: helps escape local optima.
  • Adverse effect: might pass the global optimum
    after reaching it (easy to avoid by keeping
    track of the best-ever state).
13
Boltzmann machines
  • The Boltzmann Machine of
    Hinton, Sejnowski, and Ackley (1984)
    uses simulated annealing to escape local minima.
  • To motivate their solution, consider how one
    might get a ball-bearing traveling along the
    curve to "probably end up" in the deepest
    minimum. The idea is to shake the box "about h
    hard"; then the ball is more likely to go from
    D to C than from C to D. So, on average, the
    ball should end up in C's valley.

14
Simulated annealing basic idea
  • From current state, pick a random successor
    state
  • If it has better value than current state, then
    accept the transition, that is, use successor
    state as current state
  • Otherwise, do not give up, but instead flip a
    coin and accept the transition with a given
    probability (which is lower the worse the
    successor is).
  • So we sometimes accept un-optimizing the value
    function a little, with a non-zero probability.
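
One step of this rule might look as follows. This is a sketch that assumes the value is being maximized and that the acceptance probability for a worse successor is exp(delta / T), one common choice. The names `anneal_step` and `random_successor` are hypothetical.

```python
import math
import random

def anneal_step(state, value, random_successor, temperature, rng=random):
    """One step: always accept an improvement, sometimes accept a worse successor."""
    candidate = random_successor(state, rng)
    delta = value(candidate) - value(state)  # > 0 means the candidate is better
    if delta > 0:
        return candidate
    # Worse (or equal) candidate: flip a biased coin. The probability
    # exp(delta / T) shrinks as delta becomes more negative (assumed form).
    if rng.random() < math.exp(delta / temperature):
        return candidate
    return state
```

A much worse successor is then almost never accepted, while a slightly worse one often is.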

15
Boltzmann's statistical theory of gases
  • In the statistical theory of gases, the gas is
    described not by a deterministic dynamics, but
    rather by the probability that it will be in
    different states.
  • The 19th century physicist Ludwig Boltzmann
    developed a theory that included a probability
    distribution of temperature (i.e., every small
    region of the gas had the same kinetic energy).
  • Hinton, Sejnowski and Ackley's idea was that this
    distribution might also be used to describe
    neural interactions, where low temperature T is
    replaced by a small noise term T (the neural
    analog of random thermal motion of molecules).
    While their results primarily concern
    optimization using neural networks, the idea is
    more general.

16
Boltzmann distribution
  • At thermal equilibrium at temperature T, the
    Boltzmann distribution gives the relative
    probability that the system will occupy state A
    vs. state B as
      P(A) / P(B) = exp(-(E(A) - E(B)) / T)
  • where E(A) and E(B) are the energies associated
    with states A and B.
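
As a small numerical illustration, the ratio can be computed directly from the usual form P(A) / P(B) = exp(-(E(A) - E(B)) / T). The sketch assumes the Boltzmann constant is folded into T, and the function name is illustrative.

```python
import math

def relative_probability(energy_a, energy_b, temperature):
    """P(A) / P(B) for a Boltzmann distribution (Boltzmann constant folded into T)."""
    return math.exp(-(energy_a - energy_b) / temperature)
```

The lower-energy state is always the more probable one, but the preference fades as T grows: with E(A) = 0 and E(B) = 1, the ratio is e at T = 1 and only slightly above 1 at T = 100.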

17
Simulated annealing
  • Kirkpatrick et al. (1983)
  • Simulated annealing is a general method for
    making escape from local minima likely by
    allowing jumps to higher energy states.
  • The analogy here is with the process of annealing
    used by a craftsman in forging a sword from an
    alloy.
  • He heats the metal, then slowly cools it as he
    hammers the blade into shape.
  • If he cools the blade too quickly, the metal will
    form patches of different composition.
  • If the metal is cooled slowly while it is shaped,
    the constituent metals will form a uniform alloy.

18
Real annealing: Sword

19
Simulated annealing in practice
  • set T
  • optimize for given T
  • lower T (see Geman & Geman, 1984)
  • repeat

20
Simulated annealing in practice
  • set T
  • optimize for given T
  • lower T
  • repeat

MDSA: Molecular Dynamics Simulated Annealing
21
Simulated annealing in practice
  • set T
  • optimize for given T
  • lower T (see Geman & Geman, 1984)
  • repeat
  • Geman & Geman (1984): if T is lowered
    sufficiently slowly (with respect to the number
    of iterations used to optimize at a given T),
    simulated annealing is guaranteed to find the
    global minimum.
  • Caveat: this algorithm has no end (Geman &
    Geman's schedule decreases T as 1/log of the
    number of iterations, so T never reaches zero),
    so it may take an infinite amount of time
    for it to find the global minimum.
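
To see how slowly a 1/log schedule cools, the sketch below compares it with a geometric schedule, a common practical alternative that does not carry the same guarantee. The constant `c`, the starting temperature `t0`, and the factor `alpha` are illustrative assumptions, not values from the slides.

```python
import math

def logarithmic_temperature(k, c=1.0):
    """Temperature at iteration k >= 1 for a schedule of the form c / log(1 + k)."""
    return c / math.log(1 + k)

def geometric_temperature(k, t0=1.0, alpha=0.95):
    """Temperature at iteration k for geometric cooling t0 * alpha**k (no guarantee)."""
    return t0 * alpha ** k

for k in (1, 10, 1000, 10**6):
    print(k, logarithmic_temperature(k), geometric_temperature(k))
```

Under these assumptions the logarithmic temperature is still well above zero after a million iterations, whereas the geometric one is negligible after a thousand.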

22
Simulated annealing algorithm
  • Idea: escape local extrema by allowing bad
    moves, but gradually decrease their size and
    frequency.

Note: the goal here is to maximize E.
23
Simulated annealing algorithm
  • Idea: escape local extrema by allowing bad
    moves, but gradually decrease their size and
    frequency.

Algorithm when the goal is to minimize E (the test
on ΔE becomes ΔE < 0).
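
A minimal sketch of simulated annealing for minimizing E. It assumes geometric cooling (not the 1/log schedule) and keeps track of the best state seen. The parameters `t0`, `alpha`, `steps`, the toy `energies` landscape, and the helper names are hypothetical.

```python
import math
import random

def simulated_annealing(energy, random_neighbor, state,
                        t0=10.0, alpha=0.99, steps=5000, seed=0):
    """Minimize energy(state); returns the best state seen during the run."""
    rng = random.Random(seed)
    current, current_e = state, energy(state)
    best, best_e = current, current_e
    t = t0
    for _ in range(steps):
        candidate = random_neighbor(current, rng)
        candidate_e = energy(candidate)
        delta = candidate_e - current_e  # < 0 means the candidate is better
        # Accept improvements outright; accept bad moves with probability exp(-delta / T).
        if delta < 0 or rng.random() < math.exp(-delta / t):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
        t = max(t * alpha, 1e-12)  # geometric cooling (assumed), floored to avoid T = 0
    return best

# Toy landscape: local minimum at position 1, global minimum at position 8.
energies = [5, 3, 4, 6, 7, 6, 2, 1, 0, 3, 4]

def energy(i):
    return energies[i]

def random_neighbor(i, rng):
    """Step one position left or right, clipped to the landscape."""
    return min(max(i + rng.choice((-1, 1)), 0), len(energies) - 1)

result = simulated_annealing(energy, random_neighbor, state=1)
```

Because the best-ever state is returned, the result is never worse than the starting state, even if the walk later climbs back out of a good minimum.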
24
Note on simulated annealing: limit cases
  • Boltzmann distribution: accept a bad move with
    ΔE < 0 (goal is to maximize E) with probability
    P(ΔE) = exp(ΔE/T)
  • If T is large: ΔE < 0
  • ΔE/T < 0 and very small in magnitude
  • exp(ΔE/T) close to 1
  • accept bad move with high probability
  • If T is near 0: ΔE < 0
  • ΔE/T < 0 and very large in magnitude
  • exp(ΔE/T) close to 0
  • accept bad move with low probability
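
The two limit cases can be checked numerically. The sketch below evaluates exp(ΔE/T) for a fixed bad move ΔE = -1 at a few temperatures chosen only for illustration.

```python
import math

def accept_probability(delta_e, temperature):
    """Probability of accepting a bad move (delta_e < 0 when maximizing E)."""
    return math.exp(delta_e / temperature)

for t in (100.0, 1.0, 0.01):
    print(t, accept_probability(-1.0, t))
```

At T = 100 the probability is close to 1, and at T = 0.01 it is vanishingly small.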

25
Note on simulated annealing: limit cases
  • If T is large: bad moves are accepted with high
    probability (random walk).
  • If T is near 0: bad moves are accepted with low
    probability (deterministic down-hill).
26
Summary
  • Best-first search: general search, where the
    minimum-cost nodes (according to some measure)
    are expanded first.
  • Greedy search: best-first with the estimated
    cost to reach the goal as a heuristic measure.
  • - generally faster than uninformed search
  • - not optimal
  • - not complete.
  • A* search: best-first with measure = path cost
    so far + estimated path cost to goal.
  • - combines advantages of uniform-cost and
    greedy searches
  • - complete, optimal and optimally efficient
  • - space complexity still exponential

27
Genetic Algorithms
28
The Traditional Approach
  • Ask an expert
  • Adapt existing designs
  • Trial and error

29
Nature's Starting Point
Alison Everitt's A User's Guide to Men
30
Optimised Man!
31
Example: Pursuit and Evasion
  • Using NNs and a genetic algorithm
  • 0 learning
  • 200 tries
  • 999 tries

32
Comparisons
  • Traditional
  • - best guess
  • - may lead to local, not global optimum
  • Nature
  • - population of guesses
  • - more likely to find a better solution

33
More Comparisons
  • Nature
  • - not very efficient
  • - at least a 20 year wait between generations
  • - not all mating combinations possible
  • Genetic algorithm
  • - efficient and fast
  • - optimization complete in a matter of minutes
  • - mating combinations governed only by fitness

34
The Genetic Algorithm Approach
  • Define limits of variable parameters
  • Generate a random population of designs
  • Assess fitness of designs
  • Mate selection
  • Crossover
  • Mutation
  • Reassess fitness of new population
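
The steps above can be sketched as one loop over generations. This is a minimal illustration under several assumptions not stated in the slides: designs are encoded as bit lists, the fitter half of the population is kept as parents, the best design is carried over unchanged (elitism), and crossover is single-point. All parameter names and defaults are hypothetical.

```python
import random

def evolve(fitness, n_bits=20, pop_size=30, generations=60,
           mutation_rate=0.02, seed=0):
    """Evolve bit-list designs and return the fittest one in the final population."""
    rng = random.Random(seed)
    # Generate a random population of designs within the encoding's limits.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Assess fitness and rank the designs.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # mate selection: keep the fitter half
        children = [ranked[0][:]]          # elitism: copy the best design unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)  # single-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children  # fitness is reassessed at the top of the loop
    return max(population, key=fitness)

best = evolve(sum)  # toy fitness: number of 1 bits in the design
```

Because the best design is always copied into the next generation, the best fitness in the population cannot decrease from one generation to the next.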

35
A Population
36
Ranking by Fitness
37
Mate Selection: the fittest are copied and replace
the less fit
38
Mate Selection: Roulette. Increasing the
likelihood of, but not guaranteeing, reproduction
of the fittest
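
Roulette selection can be sketched as spinning a wheel whose slices are proportional to fitness. The sketch assumes non-negative fitness values with a positive sum, and the function name is illustrative.

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    spin = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if spin < running:
            return individual
    return population[-1]  # guard against floating-point round-off
```

The standard library call `random.choices(population, weights=fitnesses)` performs the same kind of weighted draw.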
39
Crossover: exchanging information through some
part of the representation
40
Mutation: random change of binary digits from 0
to 1 and vice versa (to avoid local minima)
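
On a binary string representation, crossover and mutation might be sketched as below. Single-point crossover and a per-bit mutation rate are assumptions chosen for illustration, and the function names are hypothetical.

```python
import random

def crossover(parent_a, parent_b, rng=random):
    """Single-point crossover: swap the tails of two equal-length bit strings."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]

def mutate(bits, rate, rng=random):
    """Flip each binary digit independently with probability `rate`."""
    return "".join(
        ("1" if b == "0" else "0") if rng.random() < rate else b
        for b in bits
    )
```

Crossover only recombines digits already present in the parents, while mutation can introduce digits that neither parent has at a given position.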
41
Best Design
42
The GA Cycle
43
The Process
44
Genetic Algorithms
  • Adv
  • Good at finding a region of the solution space
    that includes the optimal solution, but slow to
    give the optimal solution itself

45
Genetic Approach
  • When applied to strings of genes, the approaches
    are classified as genetic algorithms (GA)
  • When applied to pieces of executable programs,
    the approaches are classified as genetic
    programming (GP)
  • GP operates at a higher level of abstraction than
    GA

46
Typical Chromosome
47
Summary
  • Time complexity of heuristic algorithms depends
    on the quality of the heuristic function. Good
    heuristics can sometimes be constructed by
    examining the problem definition or by
    generalizing from experience with the problem
    class.
  • Iterative improvement algorithms keep only a
    single state in memory.
  • They can get stuck in local extrema; simulated
    annealing provides a way to escape local extrema,
    and is complete and optimal given a slow enough
    cooling schedule.