Artificial Intelligence Informed search and exploration - PowerPoint PPT Presentation

About This Presentation
Title:

Artificial Intelligence Informed search and exploration

Description:

Title: Inteligencia Artificial Author: Luigi Last modified by: Luigi Ceccaroni Document presentation format: On-screen Show (4:3) Other titles: Arial ... – PowerPoint PPT presentation

Number of Views:58
Avg rating:3.0/5.0
Slides: 80
Provided by: Luig91
Learn more at: https://www.cs.upc.edu
Category:

less

Transcript and Presenter's Notes

Title: Artificial Intelligence Informed search and exploration


1
Artificial IntelligenceInformed search and
exploration
  • Fall 2008
  • professor Luigi Ceccaroni

2
Heuristic search
  • Heuristic function, h(n) it estimates the
    lowest cost from the n state to a goal state.
  • At each moment, most promising states are
    selected.
  • Finding a solution (if it exist) is not
    guaranteed.
  • Types of heuristic search
  • BB (Branch Bound), Best-first search
  • A, A
  • IDA
  • Local search
  • hill climbing
  • simulated annealing
  • genetic algorithms

2
3
Branch Bound
  • For each state, the cost is kept of getting from
    the initial state to that state (g).
  • The global, minimum cost is kept, too, and guides
    the search.
  • A branch is abandoned if its cost is greater than
    the current minimum.

3
4
Best-first search
  • Priority is given by the heuristic function
    (estimation of the cost from a given state to a
    solution).
  • At each iteration the node with minimum heuristic
    function is chosen.
  • The optimal solution is not guaranteed.

5
Greedy best-first search
6
Importance of the estimator
Operations - put a free block on the table
- put a free block on another free block
Estimator H1 - add 1 for each block which
is on the right block - subtract 1
otherwise
Initial state H1 4 H2 -28
Estimator H2 - for each block, if the
underlying structure is correct, add 1 for each
block of that structure - otherwise,
subtract 1 for each block of the structure
Final state H1 8 H2 28 ( 7654321)
7
Initial state H1 4 H2 -28
H1 ? H2 ?
H1 ? H2 ?
H1 ? H2 ?
8
Initial state H1 4 H2 -28
H1 6 H2 -21
H1 4 H2 -15
H1 4 H2 -16
9
Heuristic functions
Initial state
Possible heuristic functions h(n) w(n)
misplaced h(n) p(n) sum of distances to
the final position h(n) p(n) 3 s(n) where
s(n) is obtained cycling over non-central
positions and adding 2 if a tile is not followed
by the right one and adding 1 if there is a tile
in the center
2 8 3 1 6 4 7
5
1 2 3 8 4 7
6 5
Final state
10
A and A algorithms
A
A
11
A algorithm
12
A algorithm
  • The priority is given by the estimation function
    f(n)g(n)h(n).
  • At each iteration, the best estimated path is
    chosen (the first element in the queue).
  • A is an instance of best-first search
    algorithms.
  • A is complete when the branching factor is
    finished and each operator has a constant,
    positive cost.

13
A treatment of repeated states
  • If the repeated state is in the structure of open
    nodes
  • If its new cost (g) is lower, the new cost is
    used possibly changing its position in the
    structure of open nodes.
  • If its new cost (g) is equal or greater, the node
    is forgotten.
  • If the repeated state is in the structure of
    closed nodes
  • If its new cost (g) is lower, the node is
    reopened and inserted in the structure of open
    nodes with the new cost. Nothing is done with its
    successors they will be reopened if necessary.
  • If its new cost (g) is equal or greater, the node
    is forgotten.

13
14
Example Romania with step costs in km
15
A search example
16
A search example
17
A search example
18
A search example
19
A search example
20
A search example
21
A admissibility
  • The A algorithm, depending on the heuristic
    function, finds or not an optimal solution.
  • If the heuristic function is admissible, the
    optimization is granted.
  • A heuristic function is admissible if it
    satisfies the following property
  • ?n 0 h(n) h(n)
  • h(n) has to be an optimistic estimator it never
    has to overestimate h(n).
  • Using an admissible heuristic function guarantees
    that a node on the optimal path never seems too
    bad and that it is considered at some point in
    time.

22
Example no admissibility
  • h3
  • /\
  • / \
  • h2 h4
  • h1 h1
  • h1 goal
  • h1
  • goal

23
Optimality of A
  • A expands nodes in order of increasing f value.
  • Gradually adds "f-contours" of nodes.
  • Contour i has all nodes with ffi, where fi lt
    fi1.

24
A1, A2 admissible ? n?final 0 h2(n) lt
h1(n) h(n) ? A1 more informed than A2
25
(No Transcript)
26
More informed algorithms
  • Compromise between
  • Calculation time in h
  • h1(n) could require more calculation time than
    h2(n) !
  • Re-expansions
  • A1 may re-expand more nodes than A2 !
  • Not if A1 is consistent
  • Not if trees (instead of graphs) are considered
  • Loss of admissibility
  • Working with non admissible heuristic functions
    can be interesting to gain speed.

27
1
3
2
1
3
2
4
8
4
8
5
6
7
5
6
7
28
(No Transcript)
29

1
3
A3 no admissible
2
4
8
5
6
7
30
Memory bounded search
  • The A algorithm solves problems in which it is
    necessary to find the best solution.
  • Its cost in space and time, in average and if the
    heuristic function is adequate, is better than
    that of blind-search algorithms.
  • There exist problems in which the size of the
    search space does not allow a solution with A.
  • There exist algorithms which allow to solve
    problems limiting the memory used
  • Iterative deepening A (IDA)
  • Recursive best-first
  • Simplified memory-bounded A (SMA)

31
Iterative deepening A (IDA)
  • Iterative deepening A is similar to the
    iterative deepening (ID) technique.
  • In ID the limit is given by a maximum depth.
  • In IDA the limit is given by a maximum value of
    the f-cost.
  • Important The search is a standard depth-first
    the f-cost is used only to limit the expansion.
  • Starting limit f (initial)

(deepening expansion limit)
02
11
12
21
21
31
31
40
41
goal
50
goal
32
Iterative deepening A (IDA)
  • Iterative deepening A is similar to the
    iterative deepening (ID) technique.
  • In ID the limit is given by a maximum depth.
  • In IDA the limit is given by a maximum value of
    the f-cost.
  • Important The search is a standard depth-first
    the f-cost is used only to limit the expansion.
  • Starting limit f (initial)

(deepening expansion limit)
(1,3,8)
02
11
(2,4,9) (5,10) (11)
(6,12) (7,13) (14) (15)
12
21
21
31
31
40
41
goal
50
goal
33
IDA algorithm
34
IDA algorithm
Algorithm IDA depthf(Initial_state) While
not is_final?(Current) do Open_states.insert(I
nitial_state) Current Open_states.first()
While not is_final?(Current) and not
Open_states.empty?() do Open_states.delete_
first() Closed_states.insert(Current)
Successors generate_successors(Current,
depth) Successors process_repeated(Success
ors, Closed_states, Open_states)
Open_states.insert(Successors) Current
Open_states.first() eWhile depthdepth1
Open_states.initialize() eWhile eAlgorithm
  • The function generate_successors only generate
    those with an f-cost less or equal to the cutoff
    limit of the iteration.
  • The OPEN structure is a stack (depth-first
    search).
  • If repeated nodes are processed there is no
    space saving.
  • Only the current path (tree branch) is saved in
    memory.

35
Other memory-bounded algorithms
  • IDAs re-expansions can represent a high
    temporal cost.
  • There are algorithms which, in general, expand
    less nodes.
  • Their functioning is based on eliminating less
    promising nodes and saving information which
    allows to re-expand them (in necessary).
  • Examples
  • Recursive best-first
  • Memory bound A (MA)

35
36
Recursive best-first
  • It is a recursive implementation of best-first,
    with lineal spatial cost.
  • It forgets a branch when its cost is more than
    the best alternative.
  • The cost of the forgotten branch is stored in the
    parent node as its new cost.
  • The forgotten branch is re-expanded if its cost
    becomes the best one again.

36
37
Recursive best-first example
38
Recursive best-first example
39
Recursive best-first
  • In general, it expands less nodes than IDA.
  • Not being able to control repeated states, its
    cost in time can be high if there are loops.
  • Sometimes, memory restrictions can be relaxed.

39
40
Memory bound A (MA)
  • It imposes a memory limit number of nodes which
    can be stored.
  • A is used for exploration and nodes are stored
    while there is memory space.
  • When there is no more space, the worst nodes are
    eliminated, keeping the best cost of forgotten
    descendants.
  • The forgotten branches are re-expanded if their
    cost becomes the best one again.
  • MA is complete if the solution path fits in
    memory.

40
41
Local search algorithms and optimization problems
  • In local search (LS), from a (generally random)
    initial configuration, via iterative improvements
    (by operators application), a state is reached
    from which no better state can be attained.
  • LS algorithms (or meta-heuristics or local
    optimization methods) are prone to find local
    optima, which are not the best possible solution.
    The global optimum is generally impossible to be
    reached in limited time.
  • In LS, there is a function to evaluate the
    quality of the states, but this is not
    necessarily related to a cost.

42
Local search algorithms
43
Local search algorithms
  • These algorithms do not systematically explore
    all the state space.
  • The heuristic (or evaluation) function is used to
    reduce the search space (not considering states
    which are not worth being explored).
  • Algorithms do not usually keep track of the path
    traveled. The memory cost is minimal.
  • This total lack of memory can be a problem (i.e.,
    cycles).

44
Hill-climbing search
  • Standard hill-climbing search algorithm
  • It is a simple loop which search for and select
    any operation that improves the current state.
  • Steepest-ascent hill climbing or gradient search
  • The best move (not just any one) that improves
    the current state is selected.

45
Steepest-ascent hill-climbing algorithm
46
Steepest-ascent hill-climbing algorithm
Algorithm Hill Climbing Current Initial_state
end false While end do Children
generate_successors(Current) Children
order_and_eliminate_worse_ones(Children,
Current) if empty?(Children) then Current
Select_best(Children) else endtrue
eWhile eAlgorithm
47
Hill climbing
  • Children are considered only if their evaluation
    function is better than the one of the parent
    (reduction of the search space).
  • A stack could be used to save children which are
    better than the parent, to be able to backtrack
    but in general the cost of this is prohibitive.
  • The characteristics of the heuristic function
    determine the success and the rapidity of the
    search.
  • It is possible that the algorithm does not find
    the best solution
  • Local optima no successor has a better
    evaluation
  • Plateaux all successors has the same evaluation

48
Simulated annealing
  • It is a stochastic hill-climbing algorithm
    (stochastic local search, SLS)
  • A successor is selected among all possible
    successors according to a probability
    distribution.
  • The successor can be worse than the current
    state.
  • Random steps are taken in the state space.
  • It is inspired by the physical process of
    controlled cooling (crystallization, metal
    annealing)
  • A metal is heated up to a high temperature and
    then is progressively cooled in a controlled way.
  • If the cooling is adequate, the minimum-energy
    structure (a global minimum) is obtained.

49
Simulated annealing
  • Aim to avoid local optima, which represent a
    problem in hill climbing.

50
Simulated annealing
  • Solution to take, occasionally, steps in a
    different direction from the one in which the
    increase (or decrease) of energy is maximum.

51
Simulated annealing methodology
  • Terminology from the physical problem is often
    used.
  • Temperature (T) is a control parameter.
  • Energy (f or E) is a heuristic function about the
    quality of a state.
  • A function F decides about the selection of a
    successor state and depends on T and the
    difference between the quality of the nodes (?f
    ) the lower the temperature, the lower the
    probability of selecting worse successors.
  • The cooling strategy determines
  • maximum number of iterations in the search
    process
  • temperature-decrease steps
  • number of iterations for each step

52
Simulated annealing - Basic algorithm
53
Simulated annealing - Basic algorithm
  • An initial temperature is defined.
  • While the temperature is not zero do
  • / Random moves in the state space /
  • For a predefined number of iterations do
  • LnewGenerate_successor(Lcurrent)
  • if F(f(Lactual)- f(Lnew),T) gt 0 then
    LactualLnew
  • eFor
  • The temperature is decreased.
  • eWhile

54
Simulated annealing
55
Simulated annealing
  • Main idea Steps taken in random directions do
    not decrease (but actually increase) the ability
    of finding a global optimum.
  • Disadvantage The very structure of the algorithm
    increases the execution time.
  • Advantage The random steps possibly allow to
    avoid small hills.
  • Temperature It determines (through a probability
    function) the amplitude of the steps, long at the
    beginning, and then shorter and shorter.
  • Annealing When the amplitude of the random step
    is sufficiently small not to allow to descend the
    hill under consideration, the result of the
    algorithm is said to be annealed.

56
Simulated annealing
  • It is possible to demonstrate that, if the
    temperature of the algorithm is reduced very
    slowly, a global maximum will be found with a
    probability close to 1
  • Value of the energy function (E) in the global
    maximum m
  • Value of E in the best local maximum l lt m
  • There will be some temperature t high enough to
    allow to descend from the local maximum, but not
    from the global maximum.
  • If the temperature is reduced very slowly, the
    algorithm will work enough time with a
    temperature close to t, until it finds and ascend
    the global maximum and the solution will stay
    there because there wont be available steps long
    enough to descend from it.

57
Simulated annealing
  • Conclusion During the resolution of a search
    problem, a node should occasionally be explored,
    which appears substantially worse than the best
    node in the L list of OPEN nodes.

58
Hill climbing and simulated annealing
applications
  • Minimum spanning tree (MST, SST) A
    minimum-weight tree in a weighted graph which
    contains all of the graph's vertices.
  • Traveling salesman (TSP) Find a path through a
    weighted graph which starts and ends at the same
    vertex, includes every other vertex exactly once,
    and minimizes the total cost of edges.

59
Simulated annealing applications TSP
  • Search space N!
  • Operators define transformations between
    solutions inversion, displacement, interchange
  • Energy function sum of distances between cities,
    ordered according to the solution.
  • An initial temperature is defined. (It may need
    some experimentation.)
  • The number of iteration per temperature value is
    defined.
  • The way to decrease the temperature is defined.

60
Hill climbing and simulated annealing
applications
  • Euclidean Steiner tree A tree of minimum
    Euclidean distance connecting a set of points,
    called terminals, in the plane. This tree may
    include points other than the terminals.
  • Steiner tree A minimum-weight tree connecting a
    designated set of vertices, called terminals, in
    an undirected, weighted graph or points in a
    space. The tree may include non-terminals, which
    are called Steiner vertices or Steiner points.

61
Hill climbing and simulated annealing
applications
  • Bin packing problem Determine how to put the
    most objects in the least number of fixed space
    bins. More formally, find a partition and
    assignment of a set of objects such that a
    constraint is satisfied or an objective function
    is minimized (or maximized). There are many
    variants, such as, 3D, 2D, linear, pack by
    volume, pack by weight, minimize volume, maximize
    value, fixed shape objects.
  • Knapsack problem Given items of different values
    and volumes, find the most valuable set of items
    that fit in a knapsack of fixed volume.

62
Simulated annealing conclusions
  • It is suitable for problems in which the global
    optimum is surrounded by many local optima.
  • It is suitable for problems in which it is
    difficult to find a good heuristic function.
  • Determining the values of the parameters can be a
    problem and requires experimentation.

63
Local beam search
  • Keeping just one node in memory may be an extreme
    reaction to the problem of memory limits.
  • The local beam search keeps track of k nodes
  • It starts with k states randomly generated
  • At each step all successors of the k states are
    generated.
  • It is checked if some state is a goal.
  • If not, the k best successors from the complete
    list are selected and the process is repeated.

63
64
Local beam search
  • There is a difference between k random executions
    in parallel and in sequence!
  • If a state generates various good successors, the
    algorithm quickly abandons unsuccessful searches
    and moves its resources where major progress is
    made.
  • It can suffer from a lack of diversity among the
    k states (concentrated in a small region of the
    state space) and become an expensive version of
    hill climbing.

64
65
Stochastic beam search
  • It is similar to local beam search.
  • Instead of choosing the k best successors, k
    successors are chosen randomly
  • The probability of choosing a successor grows
    with its fitness function.
  • Analogy with the process of natural selection
    successors (descendants) of a state (organism)
    populate the following generation as a function
    of their value (fitness, health, adaptability).

65
66
Genetic algorithms
  • A genetic algorithm (GA) is a variant of
    stochastic beam search, in which two parent
    states are combined.
  • Inspired by the process of natural selection
  • Living beings adapt to the environment thanks to
    the characteristics inherited from their parents.
  • The possibility of survival and reproduction are
    proportional to the goodness of these
    characteristics.
  • The combination of good individuals can produce
    better adapted individuals.

66
67
Genetic algorithms
  • To solve a problem via GAs requires
  • The representation of the states (individuals)
  • Each individual is represented as a string over a
    finite alphabet (usually, a string of 0s and 1s)
  • A function, which measure the fitness of the
    states
  • Operators, which combine states to obtain new
    states
  • Cross-over and mutation operators
  • The size of the initial population
  • GAs start with a set of k states randomly
    generated
  • A strategy to combine individuals

67
68
Genetic algorithms representation of individuals
  • The coding of individuals defines the size of the
    search space and the kind of operators.
  • If the state has to specify the position of n
    queens, each one in a column of n squares,
    nlog2n bits are required (8, in the example).
  • The state could be represented also as n digits
    1,n.

68
69
Genetic algorithms fitness function
  • A fitness function is calculated for each state.
  • It should return higher values for better states.
  • In the n-queens problem, the number of
    non-attacking pairs of queens could be used.
  • In the case of 4 queens, the fitness function has
    a value of 6 for a solution.

70
Genetic algorithms operators
  • The combination of individuals is carried out via
    cross-over operators.
  • The basic operator is crossing over one (random)
    point and combining using that point as reference

71
Genetic algorithms operators
  • Other possible exchange-operators
  • Crossing over 2 points
  • Random exchange of bits
  • Ad hoc operators
  • Mutation operators
  • Analogy with gene combination
  • Sometimes the information of part of the genes
    randomly changes
  • A typical case (in the case of binary strings)
    consists of changing the sign of each bit with
    some (very low) probability.

72
Genetic algorithms combination
  • Each iteration of the search is a new generation
    of individuals
  • The size of the population is in general constant
    (N).
  • To get to the following population, the
    individuals that reproduce have to be chosen
    (intermediate population).

73
Genetic algorithms combination
  • Selection of individuals
  • Individuals are selected according to their
    fitness function value
  • N random tournaments of pairs of individuals,
    choosing the winner of each one
  • There will always be individuals represented more
    than once and other individuals not represented
    in the intermediate population

74
Genetic algorithms algorithm
  • Steps of the basic GA algorithm
  • N individuals from current population are
    selected to form the intermediate population
    (according to some predefined criteria).
  • Individuals are paired and for each pair
  • The crossover operator is applied with
    probability P_c and two new individuals are
    obtained.
  • New individuals are mutated with a probability
    P_m.
  • The resulting individuals form the new population.

75
Genetic algorithms algorithm
  • The process is iterated until the population
    converges or a specific number of iteration has
    passed.
  • The crossover probability influences the
    diversity of the new population.
  • The mutation probability is always very low.

76
Genetic algorithms 8-queens problem
  • An 8-queens state must specify the positions of 8
    queens, each in a column of 8 squares, and so
    requires 8 x log 8 24 bits.
  • Alternatively, the state can be represented as 8
    digits, each in the range from 1 to 8.
  • Each state is rated by the evaluation function or
    the fitness function.
  • A fitness function should return higher values
    for better states, so, for the 8-queens problem
    the number of non-attacking pairs of queens is
    used (28 for a solution).

2
76
77
Genetic algorithms 8-queens problem
  • The initial population in (a)
  • is ranked by the fitness function in (b),
  • resulting in pairs for mating in (c).
  • They produce offspring in (d),
  • which are subject to mutation in (e).

77
78
Genetic algorithms
  • Like stochastic beam search, GAs combine
  • an uphill tendency
  • random exploration
  • exchange of information among parallel search
    threads
  • The primary advantage of GAs comes from the
    crossover operation.
  • Yet it can be shown mathematically that, if the
    genetic code is permuted initially in a random
    order, crossover conveys no advantage.

78
79
Genetic algorithms application
  • In practice, GAs have had a widespread impact on
    optimization problems, such as
  • circuit layout
  • scheduling
  • At present, it is not clear whether the appeal of
    GAs arises
  • from their performance or
  • from their aesthetically pleasing origins in the
    theory of evolution.

79
Write a Comment
User Comments (0)
About PowerShow.com