1
No Free Lunch (NFL) Theorem

Presentation by Kristian Nolde
Many slides are based on a presentation by Y.C. Ho
2
General notes
  • Goal
    • Give an intuitive feeling for the NFL
    • Present some mathematical background
  • To keep in mind
    • NFL is an impossibility theorem, such as
      • Gödel's incompleteness theorem in mathematics (roughly, some
        facts cannot be proved or disproved in any mathematical system)
      • Arrow's theorem in economics (in principle, perfect democracy
        is not realizable)
    • Thus, practical use is limited?!

3
The No Free Lunch Theorem
  • Without specific structural assumptions, no optimization scheme can
    perform better than blind search on average
  • But blind search is very inefficient!
  • Prob(at least one of N samples is in the top-n of a search space of
    size Q) ≈ nN/Q
    e.g. Prob ≈ 0.001 for Q = 10^9, n = 1000, N = 1000 (see the check
    below)
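A minimal sketch of the blind-search estimate above, in Python (not part
of the original slides). The sizes are scaled down from the slide's
Q = 10^9 so that a Monte Carlo check runs in seconds; the particular
values of Q, n, and N here are otherwise arbitrary.

```python
import random

# Scaled-down blind search: N uniform samples from a space of size Q,
# where points 0..n-1 stand in for the unknown top-n set.
Q, n, N = 10_000, 10, 10
trials = 100_000

hits = sum(
    any(random.randrange(Q) < n for _ in range(N))
    for _ in range(trials)
)

print("simulated:        ", hits / trials)            # ~0.01
print("estimate nN/Q:    ", n * N / Q)                # 0.01
print("exact 1-(1-n/Q)^N:", 1 - (1 - n / Q) ** N)     # 0.00996...
```

The linear estimate nN/Q is accurate whenever nN is much smaller than Q,
which is exactly the regime the slide describes.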

4
Assume a finite World
A finite number of input symbols (x's) and a finite number of output
symbols (y's) ⇒ a finite number of possible mappings from input to
output (f's), namely |Y|^|X| of them
5
The Fundamental Matrix F
(Figure: the matrix F, with rows indexed by the inputs x and columns by
the possible mappings f; entry F[x, f] = f(x).)
In each row, each value of Y appears |Y|^(|X|-1) times!
FACT: for binary Y, an equal number of 0s and 1s in each row!
Averaged over all f, the value is independent of x! (see the sketch
below)
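A minimal sketch (not from the slides) that builds F for a tiny finite
world, |X| = 3 and binary Y, and checks both facts: each y-value appears
|Y|^(|X|-1) times per row, so the row average is independent of x.

```python
from itertools import product

X = range(3)          # |X| = 3 input symbols
Y = (0, 1)            # |Y| = 2 output symbols

# Columns of F: all |Y|^|X| = 8 possible mappings f: X -> Y;
# column f is the tuple (f(x0), f(x1), f(x2)).
columns = list(product(Y, repeat=len(X)))
print("number of mappings |Y|^|X| =", len(columns))   # 8 = 2**3

for x in X:
    row = [f[x] for f in columns]                     # row x of F
    # each y in Y appears |Y|^(|X|-1) = 4 times in the row, so the
    # row average is the same for every x
    print(f"x={x}: {row}  mean={sum(row) / len(row)}")
```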
6
Compare Algorithms
  • Think of two algorithms a1 and a2, e.g. a1 always samples from x_1
    to x_{|X|/2}
  • a2 always samples from x_{|X|/2 + 1} to x_{|X|}
  • For a specific f, a1 or a2 may be better. However, if f is not
    known, the average performance of both is equal:
    ∑_f P(d_y | f, N, a1) = ∑_f P(d_y | f, N, a2)
    where d is a sample and d_y is the cost value associated with d
    (checked by brute force below)
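The equality above can be checked by brute force on a domain small
enough to enumerate every f. A minimal sketch, assuming best-cost-so-far
as the performance measure (the slides do not fix one):

```python
from itertools import product

X = range(4)                     # 4 inputs
Y = (0, 1)                       # cost values; lower is better
half = len(X) // 2

def best_cost(f, points):
    # performance after sampling the given points: best (lowest) cost seen
    return min(f[x] for x in points)

a1_points = range(0, half)       # a1 only ever samples x0, x1
a2_points = range(half, len(X))  # a2 only ever samples x2, x3

fs = list(product(Y, repeat=len(X)))   # all |Y|^|X| = 16 mappings
avg1 = sum(best_cost(f, a1_points) for f in fs) / len(fs)
avg2 = sum(best_cost(f, a2_points) for f in fs) / len(fs)
print(avg1, avg2)                # identical: 0.25 0.25
```

For any particular f the two averages can differ wildly; only the
average over all columns of F is guaranteed to be equal.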

7
Comparing Algorithms Continued
  • Case 1: algorithms can be more specific, e.g. a1 assumes a certain
    realization f_k
  • Case 2: or they can be more general, e.g. a2 assumes a more uniform
    distribution over the possible f
  • Then the performance of a1 will be excellent for f_k but
    catastrophic for all other cases (great performance, no robustness)
  • In contrast, a2 performs mediocrely in all cases but doesn't fail
    (poor performance, high robustness)
  • Common sense says
    • Robustness × Efficiency ≈ Constant
    • or Generality × Depth ≈ Constant

8
Implication 1
  • Let x be the optimization variable, f the performance function, and
    y the performance, i.e., y = f(x)
  • Then, averaged over all possible optimization problems, the result
    is choice independent
  • If you don't know the structure of f (which column of F you are
    dealing with), blind choice is as good as any!

9
Implications 2
  • Let X be the strategy (control law, decision rule) space, i.e.,
    decisions as a function of information, f the performance function,
    and y the performance, i.e., y = f(x)
  • The same conclusion holds for stochastic optimal control, adaptive
    control, decision theory, game theory, learning control, etc.
  • A "good" algorithm must be qualified!

10
Implications 2
  • Let X be the space of all possible representations (as in genetic
    algorithms), or the space of all possible algorithms to apply to a
    class of problems
  • Without an understanding of the problem, blind choice is as good as
    any
  • "Understanding" means you know which column of the F matrix you are
    dealing with

11
Implications 3
  • Even if you know which column or group of columns you are dealing
    with ⇒ you can specialize the choice of rows
  • You must accept that you will suffer LOSSES should other columns
    occur due to uncertainties or disturbances

12
The Fundamental Matrix F
Assume a distribution over the columns, then pick the row that results
in minimal expected losses or maximal expected performance. This is
stochastic optimization (sketched below).
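A minimal sketch of this stochastic-optimization step (not from the
slides): the prior over columns below is hypothetical, chosen only so
that one row stands out under the expected cost.

```python
from itertools import product

X = range(3)
Y = (0, 1)                                # costs; lower is better
fs = list(product(Y, repeat=len(X)))      # the 8 columns of F

# Hypothetical prior over columns: fs[3] == (0, 1, 1) is believed
# to be three times as likely as any other mapping.
prior = [0.1] * len(fs)
prior[3] = 0.3
assert abs(sum(prior) - 1) < 1e-9

def expected_cost(x):
    # expectation of F[x, f] under the prior over f
    return sum(p * f[x] for p, f in zip(prior, fs))

best_x = min(X, key=expected_cost)
print({x: round(expected_cost(x), 3) for x in X}, "-> choose x =", best_x)
# {0: 0.4, 1: 0.6, 2: 0.6} -> choose x = 0
```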
13
Implications 5
  • Worse, if you estimate the probabilities incorrectly, your
    stochastically optimized solution may suffer catastrophically bad
    outcomes more frequently than you would like (see the sketch below)
  • Reason: you have already used up more of the good outcomes in your
    optimal choice. What remain are bad ones that were not supposed to
    occur! (cf. the HOT (Highly Optimized Tolerance) design power law,
    Doyle)
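A minimal sketch of this failure mode (not from the slides): both priors
here are hypothetical; the point is only that the row optimized under
the assumed prior underperforms its own prediction once the true column
distribution differs.

```python
from itertools import product

X, Y = range(3), (0, 1)
fs = list(product(Y, repeat=len(X)))      # the 8 columns of F

def expected_cost(x, prior):
    return sum(p * f[x] for p, f in zip(prior, fs))

assumed = [0.1] * len(fs)
assumed[3] = 0.3                          # believed: fs[3] == (0, 1, 1)
true = [0.1] * len(fs)
true[6] = 0.3                             # actual:   fs[6] == (1, 1, 0)

# optimize against the (wrong) assumed prior, then evaluate for real
x_opt = min(X, key=lambda x: expected_cost(x, assumed))
print("predicted bad-outcome rate:", expected_cost(x_opt, assumed))  # 0.4
print("actual bad-outcome rate:   ", expected_cost(x_opt, true))     # 0.6
```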

14
Implications 6
  • Generality for generality sake is not very
    fruitful
  • Working on a specific problem can be rewarding
  • Because
  • the insight can be generalized
  • the problem is practically important
  • the 80-20 effect