Computational Intelligence - PowerPoint PPT Presentation

About This Presentation
Title:

Computational Intelligence

Description:

Computational Intelligence: a Possible Solution for Unsolvable Problems. Annamária R. Várkonyi-Kóczy, Dept. of Measurement and Information Systems, – PowerPoint PPT presentation

Slides: 121
Provided by: KOC72

Transcript and Presenter's Notes

Title: Computational Intelligence


1
Computational Intelligence: a Possible Solution
for Unsolvable Problems
  • Annamária R. Várkonyi-Kóczy
  • Dept. of Measurement and Information Systems,
  • Budapest University of Technology and Economics
  • koczy_at_mit.bme.hu

2
Contents
  • Motivation: why do we need something
    non-classical?
  • What is Computational Intelligence?
  • How does CI work?
  • About some of the methods of CI
  • Fuzzy Logic
  • Neural Networks
  • Genetic Algorithms
  • Anytime Techniques
  • Engineering view: practical issues
  • Conclusions: is CI really a solution for
    unsolvable problems?

3
Motivation: why do we need something
non-classical?
  • Nonlinearity, never-before-seen spatial and temporal
    complexity of systems and tasks
  • Imprecise, uncertain, insufficient, ambiguous,
    contradictory information, lack of knowledge
  • Finite resources → strict time requirements
    (real-time processing)
  • Need for optimization
  • User's comfort
  • New challenges/more complex tasks to be solved →
    more sophisticated solutions needed

4
Never-before-seen spatial and temporal complexity of
systems and tasks
How can we drive in heavy traffic? Many
components, a very complex system. Can classical or
even AI systems solve it? No, as far as we
know. But WE, humans, can. And we would like to
build MACHINES that are able to do the same.
Our car: save fuel, save time, etc.
5
Never-before-seen spatial and temporal complexity of
systems and tasks
  • Help
  • Increased computer facilities
  • Model integrated computing
  • New modeling techniques
  • Approximative computing
  • Hybrid systems

6
Imprecise, uncertain, insufficient, ambiguous,
contradictory information, lack of knowledge
  • How can I get to Shibuya?
  • (Person 1: Turn right at the lamp, then straight
    ahead till the 3rd corner, then right again ...
    NO, better turn to the left. Person 2: Turn
    right at the lamp, then straight ahead till approx.
    the 6th corner ... then I don't know. Person 3:
    It is in this direction → somewhere ...)
  • It is raining
  • The traffic light is out of order
  • I don't know in which building we have the
    special lecture (in Building III or II or ...).
    And at what time? (Does it start at 3 p.m. or
    at 2 p.m.? And on the 3rd or 4th of October?)
  • When do I have to start from home?
  • Who (a person or a computer) can show me an
    algorithm to find an OPTIMUM solution?

7
Imprecise, uncertain, insufficient, ambiguous,
contradictory information, lack of knowledge
  • Help
  • Intelligent and soft computing techniques being
    able to handle the problems
  • New data acquisition and representation
    techniques
  • Adaptivity, robustness, ability to learn

8
Finite resources → strict time requirements
(real-time processing)
  • It is 10:15 a.m. My lecture starts at 3 p.m.
    (hopefully the information is correct)
  • I am still not finished with my homework
  • I have run out of fuel and I don't have
    enough money for a taxi
  • I am very hungry
  • I have promised my Professor to help him
    prepare a demo in the Lab this morning
  • I cannot do everything with maximum
    precision

9
Finite resources → strict time requirements
(real-time processing)
  • Help
  • Low complexity methods
  • Flexible systems
  • Approximative methods
  • Results for qualitative evaluations for
    supporting decisions
  • Anytime techniques

10
Need for optimization
  • Traditionally:
  • optimization = precision
  • New definition:
  • optimization = cost optimization
  • But what is cost!?
  • precision and certainty also carry a cost

11
Need for optimization
  • Let's look at TIME as a resource
  • The most important thing is to go to the Lab and
    help my Professor (he is my Professor and I have
    promised it). I will spend as much time there as
    needed, min. 3 hours
  • I have to submit the homework, but I will work in
    the Lab, i.e. today I will prepare an average
    and not a maximum-level homework (1 hour)
  • I don't have time to eat at home, I will buy a
    bento at the station (5 minutes)
  • The train is more expensive than the bus but
    takes much less time, i.e. I will go by train (40
    minutes)

12
User's comfort
  • I have to ask the way to the university but,
    unfortunately, I don't speak Japanese
  • Next time I also want to find my way
  • Today it took one and a half hours to get here.
    How about tomorrow?
  • It would be good to get more help
  • ....

13
User's comfort
  • Help
  • Modeling methods and representation techniques
    making it possible to
  • handle
  • interpret
  • predict
  • improve
  • optimise the system and
  • give more and more support in the processing

14
User's comfort
Human language. Modularity, simplicity,
hierarchical structures. Aims of the processing.
[Figure: chain of preprocessing and processing blocks. Aims of preprocessing: improving the performance of the algorithms; (new) giving more support to the processing. Image processing / computer vision example: noise smoothing, feature extraction (edge, corner detection); pattern recognition, 3D modeling, medical diagnostics, etc.; automatic 3D modeling, automatic ...]
15
The most important elements of the solution
  • Low complexity, approximative modeling
  • Application of adaptive and robust techniques
  • Definition and application of the proper cost
    function including the hierarchy and measure of
    importance of the elements
  • Trade-off between accuracy (granularity) and
    complexity (computational time and resource need)
  • Giving support for the further processing
  • Traditional and AI methods cannot cope with
    these.
  • But how about the new approaches,
    COMPUTATIONAL INTELLIGENCE?

16
What is Computational Intelligence?
  • Computer Intelligence

Increased computer facilities
Added by the new methods
L.A. Zadeh, Fuzzy Sets, 1965: "In traditional
hard computing, the prime desiderata are
precision, certainty, and rigor. By contrast, the
point of departure of soft computing is the
thesis that precision and certainty carry a cost
and that computation, reasoning, and decision
making should exploit, whenever possible, the
tolerance for imprecision and uncertainty."
17
What is Computational Intelligence?
  • CI can be viewed as a consortium of methodologies
    which play an important role in the conception, design,
    and utilization of information/intelligent
    systems.
  • The principal members of the consortium are
    fuzzy logic (FL), neuro computing (NC),
    evolutionary computing (EC), anytime computing
    (AC), probabilistic computing (PC), chaotic
    computing (CC), and (parts of) machine learning
    (ML).
  • The methodologies are complementary and
    synergistic, rather than competitive.
  • What is common: they exploit the tolerance for
    imprecision, uncertainty, and partial truth to
    achieve tractability, robustness, low solution
    cost and better rapport with reality.

18
Computational Intelligence fulfills all of the
five requirements:
  • Low complexity, approximative modeling
  • Application of adaptive and robust techniques
  • Definition and application of the proper cost
    function including the hierarchy and measure of
    importance of the elements
  • Trade-off between accuracy (granularity) and
    complexity (computational time and resource need)
  • Giving support for the further processing
19
How does CI work? 1. Knowledge
  • Information acquisition (observation)
  • Information processing (numeric, symbolic)
  • Storage and retrieval of the information
  • Search for a structure (algorithm for the
    non-algorithmizable processing)
  • Certain knowledge (can be obtained by formal
    methods) (closed, open world: ABSTRACT WORLDS)
  • Uncertain knowledge (by cognitive methods)
    (ARTIFICIAL and REAL WORLDS)
  • Lack of knowledge
  • Knowledge representation

20
How does CI work? 1. Knowledge
  • In real life nearly everything is optimization
  • Ex. 1. Determination of the velocity:
    calculation of the optimum estimate of the
    velocity from the measured time and the distance
    covered
  • Ex. 2. Determination of the resistance: the
    optimum estimate of the resistance from
    the measured current and
    voltage
  • Ex. 3. Analysis of a measurement result: the
    optimum estimate of the measured quantity,
    knowing the conditions of the measurement
    and the measured data
  • Ex. 4. Daily timetable
  • Ex. 5. Optimum route between two towns
  • In Ex. 1-3 the criterion of the optimization is
    unambiguous and can easily be given
  • Ex. 4-5 are also simple tasks but the criterion is
    not unambiguous
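
Ex. 2 can be made concrete with a minimal sketch. It assumes a quadratic criterion (least squares) and the model U = R·I; the slide names neither, and the measurement values below are invented for illustration only.

```python
# Hypothetical measurements (illustrative values, not from the slides).
currents = [0.10, 0.20, 0.30, 0.40]   # amperes
voltages = [1.02, 1.98, 3.05, 3.96]   # volts


def quadratic_cost(r, i_values, u_values):
    """Assumed criterion: sum of squared voltage errors for the model U = r * I."""
    return sum((u - r * i) ** 2 for i, u in zip(i_values, u_values))


def estimate_resistance(i_values, u_values):
    """Closed-form minimizer of quadratic_cost: R = sum(I*U) / sum(I*I)."""
    numerator = sum(i * u for i, u in zip(i_values, u_values))
    denominator = sum(i * i for i in i_values)
    return numerator / denominator


r_hat = estimate_resistance(currents, voltages)
print(f"estimated resistance: {r_hat:.2f} ohm")
```

Because the criterion is fixed and unambiguous, the optimum has a closed form here; Ex. 4-5 have no such agreed criterion, which is the point of the slide.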

21
  • Optimum route
  • What is optimum? (Subjective, depending on the
    requirements, taste, limits of the person)
  • - We prefer/are able to travel by aeroplane,
    train, car, ...
  • Let's say the car is selected
  • the shortest route (min. petrol need), the
    quickest route (motorway), the most beautiful
    route with sights (whenever it is possible I
    never miss the view of Fuji-san ...), where
    the best restaurants are located, where I can
    visit my friends, ...
  • OK, let's fix the preferences of a certain
    person
  • But is it summer or winter, is it sunny or
    raining, how about road reconstructions, ...
  • By going into the details we get nearer and
    nearer to the solution
  • Knowledge is needed for the determination of a
    good descriptive model of the circumstances and
    goals
  • But do we know what the weather will be like in
    two months?

22
2. Model
  • Known model, e.g. an analytic model (given by
    differential equations): too complex to be
    handled
  • Lack of knowledge: the information about the
    system is uncertain or imperfect
  • We need new, more precise knowledge
  • The knowledge representation (model) should be
    manageable and should tolerate these problems

23
Learning and Modeling
  • New knowledge by learning
  • Unknown, partially unknown, known but too complex
    to be handled, ill-defined systems
  • Model by which we can analyze the system and
    predict its behavior
  • Criteria (quality measure) for the validity of
    the model

24
[Figure: modeling loop. Labels: input u, outputs d and y, criterion c (measure of the quality of the model), parameter tuning]
1. Observation (u, d, y), 2. Knowledge
representation (model, formalism), 3. Decision
(optimization, c(d,y)), 4. Tuning (of the
parameters), 5. Environmental influence (non-observed
input, noise, etc.), 6. Prediction ability
(for future inputs)
25
Iterative procedure

We build a system for collecting information
We improve the system by building in the
knowledge
We collect the information
We improve the observation and collect more
information
26
Problem
Knowledge representation, Model
Represented knowledge
Independent space, coupled to the problem by the
formalism
Non-represented part of the problem
27
3. Optimization
  • Valid where the model is valid
  • Given a system with free parameters
  • Given an objective measure
  • The task is to set the parameters which minimize
    or maximize the objective measure
  • Systematic and random methods
  • Exploitation (of the deterministic knowledge) and
    exploration (of new knowledge)

28
Methods of Computational Intelligence
  • fuzzy logic: low complexity, easy building of
    a priori knowledge into computers, tolerance for
    imprecision, interpretability
  • neuro computing: learning ability
  • evolutionary computing: optimization, optimum
    learning
  • anytime computing: robustness, flexibility,
    adaptivity, coping with the temporal
    circumstances
  • probabilistic reasoning: uncertainty, logic
  • chaotic computing: open mind
  • machine learning: intelligence

29
Fuzzy Logic
  • Lotfi Zadeh, 1965
  • Knowledge representation in natural language
  • computing with words
  • Perceptions
  • Value imprecisiation ?meaning precisiation

30
History of fuzzy theory
  • Fuzzy sets and logic: Zadeh, 1964/1965-
  • Fuzzy algorithm: Zadeh, 1968-(1973)-
  • Fuzzy control by linguistic rules: Mamdani et al.,
    1975-
  • Industrial applications: Japan, 1987- (fuzzy
    boom), Korea. Home electronics, vehicle
    control, process control, pattern recognition and
    image processing, expert systems, military systems
    (USA, 1990-), space research
  • Applications to very complex control problems:
    Japan, 1991-, e.g. helicopter autopilot

31
Areas in which Fuzzy Logic was successfully used
  • Modeling and control
  • Classification and pattern recognition
  • Databases
  • Expert Systems
  • (Fuzzy) hardware
  • Signal and image processing
  • Etc.

32
  • Universe of discourse: Cartesian (direct) product
    of all the possible values of each of the
    descriptors
  • Linguistic variable (linguistic term), Zadeh:
    "By a linguistic variable we mean a variable
    whose values are words or sentences in a natural
    or artificial language. For example, Age is a
    linguistic variable if its values are linguistic
    rather than numerical, i.e., young, not young,
    very young, quite young, old, not very old and
    not very young, etc., rather than 20, 21, 22, 23,
    ..."
  • Fuzzy set: it represents a property of the
    linguistic variable. A degree of inclusion is
    associated with each of the possible values of the
    linguistic variable (characteristic function)
  • Membership value: the degree of belonging to
    the set.

33
An Example
  • A class of students (e.g. M.Sc. students taking
    the Spec. Course Computational Intelligence):
    the universe of discourse X
  • Who has a driver's license?
  • A subset of X: a (crisp) set A
  • $\chi(X)$: CHARACTERISTIC FUNCTION
  • 1 0 1 1 0 1 1
  • Who can drive very well?
  • $\mu(X)$: MEMBERSHIP FUNCTION
  • 0.7 0 1.0 0.8 0 0.4 0.2

FUZZY SET
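
The two rows of numbers above can be put side by side in a short sketch. The student identifiers `s1` ... `s7` are placeholders introduced here; only the 0/1 values and the membership grades come from the slide.

```python
students = ["s1", "s2", "s3", "s4", "s5", "s6", "s7"]   # hypothetical names

# Characteristic function of the crisp set "has a driver's license": values in {0, 1}.
has_license = dict(zip(students, [1, 0, 1, 1, 0, 1, 1]))

# Membership function of the fuzzy set "can drive very well": values in [0, 1].
drives_very_well = dict(zip(students, [0.7, 0.0, 1.0, 0.8, 0.0, 0.4, 0.2]))

# A crisp set is fully described by the elements mapped to 1.
crisp_set = {s for s, value in has_license.items() if value == 1}

# A fuzzy set keeps the grade of every element.
for s in students:
    print(s, has_license[s], drives_very_well[s])
```

In this class nobody without a license has a positive grade for driving very well, so the fuzzy set lies inside the crisp one.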
34
Definitions
  • Crisp set
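  • Convex set: A is not convex, as $a \in A$, $c \in A$,
    but $d = \lambda a + (1-\lambda)c \notin A$, $\lambda \in [0, 1]$. B is convex, as for
    every $x, y \in B$ and $\lambda \in [0, 1]$, $z = \lambda x + (1-\lambda)y \in B$.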
  • Subset

35
Definitions
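  • Relative complement or difference: $A - B = \{x \mid x \in A \text{ and } x \notin B\}$.
    $B = \{1, 3, 4, 5\}$, $A - B = \{2, 6\}$. $C = \{1, 3, 4, 5, 7, 8\}$,
    $A - C = \{2, 6\}$!
  • Complement: $\bar{A} = X - A$, where X is
    the universe. Complementation is
    involutive: $\bar{\bar{A}} = A$. Basic properties: $\bar{\emptyset} = X$, $\bar{X} = \emptyset$
  • Union: $A \cup B = \{x \mid x \in A \text{ or } x \in B\}$
  • For the complement: $A \cup \bar{A} = X$ (law of excluded middle)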
36
Definitions
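  • Intersection: $A \cap B = \{x \mid x \in A \text{ and } x \in B\}$. For
    the complement: $A \cap \bar{A} = \emptyset$ (law of contradiction)
  • More properties. Commutativity: $A \cup B = B \cup A$,
    $A \cap B = B \cap A$. Associativity: $A \cup B \cup C = (A \cup B) \cup C = A \cup (B \cup C)$,
    $A \cap B \cap C = (A \cap B) \cap C = A \cap (B \cap C)$. Idempotence:
    $A \cup A = A$, $A \cap A = A$. Distributivity: $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$,
    $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.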
37
Membership function
Crisp set                       Fuzzy set
Characteristic function         Membership function
$\chi_A : X \to \{0, 1\}$       $\mu_A : X \to [0, 1]$
38
Some basic concepts of fuzzy sets
Elements (ages)   Infant   Adult   Young   Old
5                 0        0       1       0
10                0        0       1       0
20                0        0.8     0.8     0.1
30                0        1       0.5     0.2
40                0        1       0.2     0.4
50                0        1       0.1     0.6
60                0        1       0       0.8
70                0        1       0       1
80                0        1       0       1
39
Some basic concepts of fuzzy sets
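  • Support: $\mathrm{supp}(A) = \{x \mid \mu_A(x) > 0\}$.
    $\mu_{\mathrm{Infant}} \equiv 0$, so $\mathrm{supp}(\mathrm{Infant}) = \emptyset$. If
    $|\mathrm{supp}(A)| < \infty$, A can be defined as $A = \mu_1/x_1 + \mu_2/x_2 + \dots + \mu_n/x_n$.
  • Kernel (Nucleus, Core): $\mathrm{Kernel}(A) = \{x \mid \mu_A(x) = 1\}$.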

40
Definitions
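  • Height: $\mathrm{height}(A) = \max_x \mu_A(x)$
  • $\mathrm{height}(\mathrm{old}) = 1$, $\mathrm{height}(\mathrm{infant}) = 0$
  • If $\mathrm{height}(A) = 1$, A is normal
  • If $\mathrm{height}(A) < 1$, A is subnormal
  • $\mathrm{height}(\emptyset) = 0$
  • (If $\mathrm{height}(A) = 1$ then $\mathrm{supp}(A) \neq \emptyset$)
  • $\alpha$-cut: $A_\alpha = \{x \mid \mu_A(x) \geq \alpha\}$
  • Strong cut: $A_{\alpha+} = \{x \mid \mu_A(x) > \alpha\}$
  • Kernel: $\mathrm{Kernel}(A) = A_1$
  • Support: $\mathrm{supp}(A) = A_{0+}$
  • If A is subnormal, $\mathrm{Kernel}(A) = \emptyset$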
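
These definitions can be checked on the age table from the earlier slide. The sketch below stores each fuzzy set as a dictionary from element to membership grade; that representation and the helper names are choices made here for illustration.

```python
ages = [5, 10, 20, 30, 40, 50, 60, 70, 80]
infant = dict(zip(ages, [0, 0, 0, 0, 0, 0, 0, 0, 0]))
adult = dict(zip(ages, [0, 0, 0.8, 1, 1, 1, 1, 1, 1]))
young = dict(zip(ages, [1, 1, 0.8, 0.5, 0.2, 0.1, 0, 0, 0]))
old = dict(zip(ages, [0, 0, 0.1, 0.2, 0.4, 0.6, 0.8, 1, 1]))


def support(fuzzy_set):
    """Elements with a strictly positive membership grade."""
    return {x for x, grade in fuzzy_set.items() if grade > 0}


def kernel(fuzzy_set):
    """Elements with membership grade exactly 1."""
    return {x for x, grade in fuzzy_set.items() if grade == 1}


def height(fuzzy_set):
    """Largest membership grade in the set."""
    return max(fuzzy_set.values())


def alpha_cut(fuzzy_set, alpha, strong=False):
    """Elements with grade >= alpha (or > alpha for the strong cut)."""
    if strong:
        return {x for x, grade in fuzzy_set.items() if grade > alpha}
    return {x for x, grade in fuzzy_set.items() if grade >= alpha}


print(sorted(support(old)), sorted(kernel(old)), height(old))
print(sorted(alpha_cut(young, 0.5)), sorted(alpha_cut(young, 0.5, strong=True)))
```

Note that the kernel is the 1-cut and the support is the strong 0-cut, so `alpha_cut` alone would be enough to express both.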

41
Definitions
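  • Fuzzy set operations defined by L.A. Zadeh in
    1964/1965
  • Complement: $\mu_{\bar{A}}(x) = 1 - \mu_A(x)$
  • Intersection: $\mu_{A \cap B}(x) = \min(\mu_A(x), \mu_B(x))$
  • Union: $\mu_{A \cup B}(x) = \max(\mu_A(x), \mu_B(x))$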
42
Definitions
This is really a generalization of crisp set ops!
A   B   ¬A   A∩B   A∪B   1-μA   min   max
0   0   1    0     0     1      0     0
0   1   1    0     1     1      0     1
1   0   0    0     1     0      0     1
1   1   0    1     1     0      1     1
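
The table can be verified mechanically. A small sketch (function names are chosen here) checks that 1 - μ, min and max reproduce NOT, AND and OR on the values 0 and 1, and then applies the same operators to intermediate grades.

```python
from itertools import product


def fuzzy_not(a):
    return 1 - a


def fuzzy_and(a, b):
    return min(a, b)


def fuzzy_or(a, b):
    return max(a, b)


# On {0, 1} the fuzzy operators coincide with the crisp ones.
for a, b in product([0, 1], repeat=2):
    assert fuzzy_not(a) == int(not a)
    assert fuzzy_and(a, b) == int(a and b)
    assert fuzzy_or(a, b) == int(a or b)

# The same operators also accept grades strictly between 0 and 1.
grade_and = fuzzy_and(0.7, 0.4)
grade_or = fuzzy_or(0.7, 0.4)
grade_not = round(fuzzy_not(0.7), 10)   # rounded to hide floating-point noise
print(grade_and, grade_or, grade_not)
```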
43
Fuzzy Proposition
  • Fuzzy proposition: X is P. Example: Tina is young,
    where Tina has a crisp age and young is a fuzzy
    predicate.

Fuzzy sets expressing linguistic terms for ages
Truth claims: fuzzy sets over $[0, 1]$
  • Fuzzy logic based approximate reasoning
  • is most important for applications!

44
  • Crisp relation: some interaction or
    association between elements of two or more
    sets.
  • Fuzzy relation: various degrees of
    association can be represented.
[Figure: elements of sets A and B connected by arrows, once as a crisp relation (CR) and once as a fuzzy relation (FR) with association degrees 0.5, 0.8, 1, 0.9, 0.6]
  • Cartesian (direct) product of two
    (or more) sets X, Y: $X \times Y = \{(x, y) \mid x \in X,\ y \in Y\}$;
    $X \times Y \neq Y \times X$ if $X \neq Y$!
  • More generally: $\times_{i=1}^{n} X_i = \{(x_1, x_2, \dots, x_n) \mid x_i \in X_i,\ i \in N_n\}$
45
Fuzzy Logic Control
  • Fuzzification: converts the numerical value to a
    fuzzy one; determines the degree of matching
  • Defuzzification: converts the
    fuzzy term to a classical numerical value
  • The knowledge base contains the fuzzy rules
  • The inference engine describes the methodology to
    compute the output from the input

46
Fuzzification
[Figure: membership function µ over X, a single spike of height 1 at x = 8.4]
The measured (crisp) value is converted to a
fuzzy set containing one element with membership
value 1:
$\mu(x) = 1$ if $x = 8.4$, and 0 otherwise
47
Defuzzification: Center of Gravity Method (COG)
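
The transcript does not preserve the formula from this slide. The center of gravity is usually written as below; the symbols $y^{*}$ (crisp output), $\mu_B$ (aggregated output membership function) and $y_k$ (sample points) are names assumed here.

```latex
y^{*} = \frac{\int_{Y} y \, \mu_B(y) \, dy}{\int_{Y} \mu_B(y) \, dy}
\qquad \text{or, on sampled points } y_k: \qquad
y^{*} \approx \frac{\sum_{k} y_k \, \mu_B(y_k)}{\sum_{k} \mu_B(y_k)}
```

The crisp output is the abscissa of the centroid of the area under the output membership function, so every rule that fired contributes in proportion to its clipped area.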
48
Specificity of fuzzy partitions
Fuzzy Partition A containing three linguistic
terms
Fuzzy Partition A containing seven linguistic
terms
49
Fuzzy inference mechanism (Mamdani)
  • $R_i$: If $x_1 = A_{1,i}$ and $x_2 = A_{2,i}$ and ... and $x_n = A_{n,i}$
    then $y = B_i$

The weighting factor $w_{j,i}$ characterizes how far
the input $x_j$ corresponds to the rule antecedent
fuzzy set $A_{j,i}$ in one dimension.
The weighting factor $w_i$ characterizes how far
the input x fulfils the antecedents of the
rule $R_i$.
50
Conclusion
The conclusion of rule Ri for a given x
observation is yi
51
Fuzzy Inference
  • Mamdani Type

52
Fuzzy systems an example
TEMPERATURE
MOTOR_SPEED
Fuzzy systems operate on fuzzy rules:
IF temperature is COLD THEN motor_speed is LOW
IF temperature is WARM THEN motor_speed is MEDIUM
IF temperature is HOT THEN motor_speed is HIGH
53
Inference mechanism (Mamdani)
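[Figure: Mamdani inference for the input Temperature = 55. RULE 1, RULE 2 and RULE 3 are evaluated on the Motor Speed axis and combined; the resulting output is Motor Speed = 43.6]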
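
The inference chain for the three temperature rules can be sketched as follows. The triangular partitions below are assumptions, since the membership functions of the slide's figure are not preserved, so the result differs from the 43.6 shown there. The sketch uses min for clipping the consequents, max for aggregation and a sampled center of gravity.

```python
def tri(x, left, peak, right):
    """Triangular membership function: 0 at left and right, 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed partitions over 0..100 (not the ones from the original figure).
TEMPERATURE_SETS = {"COLD": (-50, 0, 50), "WARM": (0, 50, 100), "HOT": (50, 100, 150)}
SPEED_SETS = {"LOW": (-50, 0, 50), "MEDIUM": (0, 50, 100), "HIGH": (50, 100, 150)}
RULES = [("COLD", "LOW"), ("WARM", "MEDIUM"), ("HOT", "HIGH")]


def mamdani_motor_speed(temperature):
    # Degree of matching between the crisp input and each rule antecedent.
    weights = {out: tri(temperature, *TEMPERATURE_SETS[inp]) for inp, out in RULES}
    speeds = range(0, 101)
    # Clip every consequent at its rule weight (min), then aggregate (max).
    aggregated = [
        max(min(w, tri(y, *SPEED_SETS[out])) for out, w in weights.items())
        for y in speeds
    ]
    total = sum(aggregated)
    if total == 0:
        return None  # no rule fired
    # Sampled center of gravity of the aggregated output set.
    return sum(y * m for y, m in zip(speeds, aggregated)) / total


speed_at_55 = mamdani_motor_speed(55)
print(round(speed_at_55, 2))
```

With these partitions a temperature of 55 fires WARM strongly and HOT weakly, so the output sits slightly above the middle of the speed range.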

54
Planning of Fuzzy Controllers
  • Determination of fuzzy controllers =
    determination of the antecedents and consequents of
    the rules
  • Antecedents
  • Selection of the input dimensions
  • Determination of the fuzzy partitions for the
    inputs
  • Determination of the parameters for the fuzzy
    variables
  • Consequents
  • Determination of the parameters

55
Fuzzy-controlled Washing Machine (Aptronix
Examples)
  • Objective
  • Design a washing machine controller, which
    gives the correct wash time even though a precise
    model of the input/output relationship is not
    available
  • Inputs
  • Dirtiness, type of dirt
  • Output
  • Wash time

56
Fuzzy-controlled Washing Machine
  • Rules for our washing machine controller are
    derived from common sense data taken from typical
    home use, and experimentation in a controlled
    environment.
  • A typical intuitive rule is as follows:
  • If saturation time is long and transparency
    is bad, then wash time should be long.

57
Air Conditioning Temperature Control
  • Temperature control has several unfavorable
    features: non-linearity, interference, dead time,
    external disturbances, etc.
  • Conventional approaches usually do not result in
    satisfactory temperature control.
  • Rules for this controller may be formulated using
    statements similar to
  • If temperature is low then open heating valve
    greatly

There is a sensor in the room to monitor
temperature for feedback control, and there are
two control elements, cooling valve and heating
valve, to adjust the air supply temperature to
the room.
58
Air Conditioning Temperature Control Modified
Model
  • There are two sensors in the modified system: one
    to monitor temperature and one to monitor
    humidity. There are three control elements,
    cooling valve, heating valve, and humidifying
    valve, to adjust temperature and humidity of the
    air supply.

Rules for this controller can be formulated by
adding rules for humidity control to the basic
model, for example: If temperature is low then open
humidifying valve slightly. This rule acts as a
predictor of humidity (it leads the humidity
value) and is also designed to prevent overshoot
in the output humidity curve.
59
Smart Cars 1 - Rules
  • The number of rules depends on the problem. We
    shall consider only two to keep the example
    simple
  • Rule 1: If the distance between two cars is short
    and the speed of your car is high (higher than the
    other one's), then brake hard.
  • Rule 2: If the distance between two cars is
    moderately long and the speed of your car is
    high (higher than the other one's), then brake
    moderately hard.

60
Smart Cars 2 Membership Functions
  • Determine the membership functions for the
    antecedent and consequent blocks
  • Most frequently 3, 5 or 7 fuzzy sets are used (3
    for crude control, 5 and 7 for finer control
    results)
  • Typical shapes (triangular most frequent)

61
Smart Cars 3 Simplify Rules using Codes
  • Distance between two cars: X1; speed: X2; braking
    strength: Y. Labels: small, medium, large = S, M, L
  • In the case of X2 (speed), small, medium, and
    large mean the amount by which this car's speed is
    higher than that of the car in front.
  • Rule 1
  • If X1 = S and X2 = M, then Y = L
  • Rule 2
  • If X1 = M and X2 = L, then Y = M

PL - Positive Large, PM - Positive Medium, PS -
Positive Small, ZR - Approximately Zero, NS -
Negative Small, NM - Negative Medium, NL - Negative
Large
62
Smart Cars 4 - Inference
  • Determine the degree of matching
  • Adjust the consequent block
  • Total evaluation of the conclusions based on the
    rules
  • To determine the control amount at a certain
    point, a defuzzifier is used (e.g. the center of
    gravity). In this case the center of gravity is
    located at a position somewhat harder than medium
    strength, as indicated by the arrow

63
Advantages of Fuzzy Controllers
  • Control design process is simpler
  • Design complexity reduced, without need for
    complex mathematical analysis
  • Code easier to write, allows detailed simulations
  • More robust, as tests with weight changes
    demonstrate
  • Development period reduced

64
Neural Networks
  • (McCulloch and Pitts, 1943; Hebb, 1949)
  • Rosenblatt, 1958 (Perceptron)
  • Widrow-Hoff, 1960 (Adaline)
  • It mimics the human brain

65
Neural Networks
  • Neural Nets are parallel, distributed information
    processing tools which are
  • Highly connected systems composed of identical or
    similar operational units evaluating local
    processing (processing element, neuron) usually
    in a well-ordered topology
  • Possessing some kind of learning algorithm which
    usually means learning by patterns and also
    determines the mode of the information processing
  • They also possess an information recall algorithm
    making possible the usage of the previously
    learned information

66
Application areas where NNs are successfully used
  • One- and multidimensional signal processing (image
    processing, speech processing, etc.)
  • System identification and control
  • Robotics
  • Medical diagnostics
  • Estimation of economic features

67
Application areas where NNs are successfully used
  • Associative memory = content addressable memory
  • Classification system (e.g. pattern recognition,
    character recognition)
  • Optimization system (the NN, usually a feedback
    network, approximates the cost function) (e.g. radio
    frequency distribution, A/D converter, traveling
    salesman problem)
  • Approximation system (any input-output mapping)
  • Nonlinear dynamic system model (e.g. solution of
    partial differential equation systems, prediction,
    rule learning)

68
Main features
  • Complex, non-linear input-output mapping
  • Adaptivity, learning ability
  • distributed architecture
  • fault tolerant property
  • possibility of parallel analog or digital VLSI
    implementations
  • Analogy with neurobiology

69
The simple neuron
Linear combinator with non-linear activation
70
Typical activation functions
step, linear sections (piecewise linear), hyperbolic
tangent, sigmoid
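
The simple neuron and the four activation types named above can be sketched together. The function names, the saturation limits of the piecewise linear function and the example weights are assumptions; the slide's figures are not preserved.

```python
import math


def step(s):
    return 1.0 if s >= 0 else 0.0


def piecewise_linear(s):
    # Linear between -1 and 1, saturated outside (limits chosen for illustration).
    return max(-1.0, min(1.0, s))


def hyperbolic_tangent(s):
    return math.tanh(s)


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def neuron(inputs, weights, bias, activation):
    """Linear combiner followed by a non-linear activation."""
    s = bias + sum(w * x for w, x in zip(weights, inputs))
    return activation(s)


x = [1.0, 2.0]
w = [0.5, -0.25]           # the weighted sum of this example is exactly 0
outputs = {f.__name__: neuron(x, w, 0.0, f)
           for f in (step, piecewise_linear, hyperbolic_tangent, sigmoid)}
print(outputs)
```

Only the activation changes between the four calls; the tunable parameters are the weights and the bias.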
71
Classical neural nets
  • Static nets (without memory, feedforward
    networks)
  • One layer
  • Multi layer
  • MLP (Multi Layer Perceptron)
  • RBF (Radial Basis Function)
  • CMAC (Cerebellar Model Articulation Controller)
  • Dynamic nets (with memory or feedback recall
    networks)
  • Feedforward (with memory elements)
  • Feedback
  • Local feedback
  • Global feedback

72
Feedforward architectures
One layer architectures Rosenblatt perceptron
73
Feedforward architectures
One layer architectures
Input
Output
Tunable parameters (weighting factors)
74
Feedforward architectures
Multilayer network (static MLP net)
75
Approximation property
  • Universal approximation property for some kinds
    of NNs
  • Kolmogorov: any continuous real-valued function of N
    variables defined over the compact interval $[0, 1]^N$
    can be represented with the help of
    appropriately chosen one-variable functions and the sum
    operation.
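
In symbols, the representation is commonly written as follows, where $\Phi_q$ and $\psi_{pq}$ are continuous one-variable functions (the symbol names are chosen here; the slide gives no formula):

```latex
f(x_1, \dots, x_N) = \sum_{q=0}^{2N} \Phi_q\!\left( \sum_{p=1}^{N} \psi_{pq}(x_p) \right),
\qquad (x_1, \dots, x_N) \in [0, 1]^N
```

The structure resembles a network with one hidden layer of $2N + 1$ units, which is why the theorem is often cited as background for the approximation ability of multilayer networks.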

76
Learning
  • Learning = parameter estimation
  • supervised learning
  • unsupervised learning
  • analytic learning

77
Supervised learning
estimation of the model parameters by x, y, d
[Figure: supervised learning loop. Labels: input x, noise n, desired output d, model output y, criterion C = C(e), parameter tuning]
78
Supervised learning
  • Criteria function
  • Quadratic
  • ...

79
  • Minimization of the criteria
  • Analytic solution (only if it is very simple)
  • Iterative techniques
  • Gradient methods
  • Searching methods
  • Exhaustive
  • Random
  • Genetic search

80
Parameter correction
  • Perceptron
  • Gradient methods
  • LMS (least mean squares algorithm)
  • ...

81
LMS (Iterative solution based on the temporary
error)
  • Temporary error
  • Temporary gradient
  • Weight update
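
The three steps can be sketched for a linear neuron. The update form w ← w + 2µ·e·x (Widrow-Hoff), the training data and the value of µ are assumptions made for this illustration; the slide's formulas are not preserved.

```python
def lms_fit(samples, mu=0.05, epochs=200):
    """Train a linear combiner y = w . x with the LMS rule."""
    n_inputs = len(samples[0][0])
    w = [0.0] * n_inputs
    for _ in range(epochs):
        for x, d in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))
            e = d - y                                  # temporary (instantaneous) error
            gradient = [-2.0 * e * xi for xi in x]     # temporary gradient of e**2
            w = [wi - mu * gi for wi, gi in zip(w, gradient)]   # weight update
    return w


# Noise-free data from d = 0.5 + 2 * x1; the constant 1.0 acts as bias input.
training_set = [([1.0, x1], 0.5 + 2.0 * x1) for x1 in (-1.0, -0.5, 0.0, 0.5, 1.0)]
weights = lms_fit(training_set)
print([round(wi, 4) for wi in weights])
```

Each update uses only the current sample, which is why the error and gradient are called temporary: they are cheap estimates of the true mean-square error and its gradient.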

82
Gradient methods
  • The route of the convergence

83
Gradient methods
  • Single neuron with nonlinear activation
  • Multilayer network backpropagation (BP)

84
Teaching an MLP network The Backpropagation
algorithm
85
Design of MLP networks
  • Size of the network (number of layers, number of
    hidden neurons)
  • The value of the learning factor, µ
  • Initial values of the parameters
  • Validation, learning set, test set
  • Teaching method (sequential, batch)
  • Stopping criteria (error limit, number of
    cycles)

86
Modular networks
  • Hierarchical networks
  • Linear combination of NNs
  • Mixture of experts
  • Hybrid networks

87
Linear combination of networks
88
Mixture of experts (MOE)
Gating network
experts
89
Decomposition of complex tasks
  • Decomposition and learning
  • Decomposition before learning
  • Decomposition during the learning (automatic task
    decomposition)
  • Problem space decomposition
  • Input space decomposition
  • Output space decomposition

90
Example: automatic recognition of numbers (e.g.
postal codes)
  • Binary pictures with 16x16 pixels
  • Preprocessing (idea: the numbers are composed of
    edge segments): 4 edge detections
  • Normalization → four 8x8 pictures (i.e. 256
    input elements)
  • Classification by 45 independent networks, each
    distinguishing only two of the ten figures
    (1 or 2, 1 or 3, ..., 8 or 0, 9 or 0)
  • The corresponding network outputs are connected to
    an AND gate; if its output equals 1, then the
    figure is recognized
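
The count of 45 networks is the number of unordered pairs of ten figures, and the AND gate can be read as "all nine networks that involve a figure vote for it". The sketch below illustrates that reading with idealized pairwise outputs; `ideal_outputs` is a stand-in introduced here, not a trained network.

```python
from itertools import combinations

DIGITS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
PAIRS = list(combinations(DIGITS, 2))      # (1, 2), (1, 3), ..., (8, 0), (9, 0)


def recognize(pair_outputs):
    """pair_outputs maps each pair to the figure chosen by that pairwise network."""
    recognized = []
    for digit in DIGITS:
        votes = [pair_outputs[pair] == digit for pair in PAIRS if digit in pair]
        if all(votes):                     # AND gate over the 9 relevant networks
            recognized.append(digit)
    return recognized


def ideal_outputs(true_digit):
    """Hypothetical perfect networks: pick the true figure whenever it is in the pair."""
    return {pair: (true_digit if true_digit in pair else pair[0]) for pair in PAIRS}


print(len(PAIRS), recognize(ideal_outputs(7)))
```

Networks that do not involve the true figure produce arbitrary answers, but they cannot yield a second recognized figure, because every other figure loses its own comparison against the true one.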

91
Example: automatic recognition of handwritten
figures (e.g. postal codes)
[Figure: the input picture passes through four edge detection masks (horizontal, vertical, diagonal \, diagonal /), followed by normalization]
92
Example Automatic recognition of handwritten
figures (e.g. Postal codes)
93
Genetic Algorithms
  • John Holland, 1975
  • Adaptive method for searching and optimization
    problems
  • Copying the genetic processes of the biological
    organisms
  • Natural selection (Charles Darwin, The Origin of
    Species)
  • Multi-point search

94
Successful application areas
  • Optimization (circuit design, scheduling)
  • Automatic programming
  • Machine learning (classification, prediction,
    weather forecast, learning of NNs)
  • Economical systems
  • Immunology
  • Ecology
  • Modeling of social systems

95
The algorithm
  • Initial population → parent selection → creation
    of new individuals (crossover, mutation) →
    quality measure, reproduction → new generation →
    exit criteria?
  • If no: continue with the algorithm
  • If yes: selection of the result, decoding
  • Like in biology in the real world
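
The loop above can be sketched in a few lines. The choices below (bit-string coding, roulette-wheel selection, one-point crossover, full replacement of the generation, a fixed number of generations as exit criterion, the fitness x², an even population size) are assumptions made to keep the sketch small.

```python
import random


def fitness(bits):
    """Quality measure: decode the bit string and square it."""
    return int(bits, 2) ** 2


def roulette_select(population, rng):
    weights = [fitness(ind) for ind in population]
    if sum(weights) == 0:                  # all-zero fitness: fall back to uniform choice
        return rng.choice(population)
    return rng.choices(population, weights=weights, k=1)[0]


def crossover(parent1, parent2, rng):
    point = rng.randint(1, len(parent1) - 1)
    return parent1[:point] + parent2[point:], parent2[:point] + parent1[point:]


def mutate(bits, rate, rng):
    return "".join(("1" if b == "0" else "0") if rng.random() < rate else b for b in bits)


def run_ga(pop_size=4, n_bits=5, generations=20, mutation_rate=0.001, seed=0):
    rng = random.Random(seed)
    population = ["".join(rng.choice("01") for _ in range(n_bits)) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):           # exit criterion: fixed number of generations
        offspring = []
        while len(offspring) < pop_size:
            child1, child2 = crossover(roulette_select(population, rng),
                                       roulette_select(population, rng), rng)
            offspring += [mutate(child1, mutation_rate, rng),
                          mutate(child2, mutation_rate, rng)]
        population = offspring[:pop_size]  # new generation replaces the old one
        best = max(population + [best], key=fitness)
    return best, fitness(best)


best_string, best_fitness = run_ga()
print(best_string, best_fitness)
```

Decoding the result is the final `int(bits, 2)` step inside `fitness`; other exit criteria, such as a fitness threshold, would replace the fixed loop count.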

96
Problem building
  • Selection of the most important features, coding
  • Fitness function: quality measure (optimum
    criterion)
  • Exit criteria
  • Selection of the size of the population
  • Specification of the genetic operations

97
Simple genetic algorithms
  • Representation: features coded in a binary
    string (chromosome, string)
  • Fitness function: representing the viability
    (optimality) of the individual
  • Selection: selecting the parent individuals from
    the generation (e.g. random but fitness based,
    i.e. better chance with higher fitness value)

98
Simple genetic algorithms
  • Crossover: from two parents, two offspring (one
    point, two point, N-point, uniform)
99
Simple genetic algorithms
  • Mutation (of the bits (genes)) (one or
    independent)
  • Reproduction: who will survive and form the next
    (new) generation
  • Individuals with the best fitness function
  • Exit: after a number of generations or depending
    on the fitness function of the best individual or
    the average of the generation, ...
100
Example for GAs
  • Maximize the function $f(x) = x^2$, where x can take
    values between 0 and 31
  • Let's start with a population containing 4
    elements (generated randomly by tossing a coin).
    Each element (string) consists of 5 bits (to be
    able to code numbers between 0 and 31)

101
Example for GAs
Number    Initial population   x value   f(x)   f(xi)/Σf(x)   Ranking
1         01101                13        169    0.14          1
2         11000                24        576    0.49          2
3         01000                8         64     0.06          0
4         10011                19        361    0.31          1
Sum                                      1170   1.00          4
Average                                  293    0.25          1
Maximum                                  576    0.49          2
102
Example for GAs
The pairs   Sequence of the selection   Position of the crossover   New population   x value   f(x)
01101       2                           4                           01100            12        144
11000       1                           4                           11001            25        625
11000       4                           2                           11011            27        729
10011       3                           2                           10000            16        256
Sum                                                                                            1754
Average                                                                                        439
Maximum                                                                                        729
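
The numbers in the two tables can be re-derived with a short script. The pairs and crossover positions are taken from the table; the helper name `one_point_crossover` is chosen here.

```python
def one_point_crossover(parent1, parent2, point):
    """Swap the tails of two bit strings after the given position."""
    return parent1[:point] + parent2[point:], parent2[:point] + parent1[point:]


initial = ["01101", "11000", "01000", "10011"]
initial_fitness = [int(s, 2) ** 2 for s in initial]
shares = [round(f / sum(initial_fitness), 2) for f in initial_fitness]

# Selected parents (string 2 was drawn twice, string 3 not at all).
pair_a = one_point_crossover("01101", "11000", 4)
pair_b = one_point_crossover("11000", "10011", 2)
new_population = list(pair_a) + list(pair_b)
new_fitness = [int(s, 2) ** 2 for s in new_population]

print(initial_fitness, shares)
print(new_population, new_fitness, sum(new_fitness), max(new_fitness))
```

The roulette-wheel sector of each string is its share f(xi)/Σf(x), so the second string, with about half of the wheel, was the most likely parent.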
103
Conclusions
  • The fitness improved significantly in the new
    generation (both the average and the maximum)
  • Initial population: randomly chosen
  • Selection: 4 times by a roulette wheel where
    better individuals had bigger sectors and thus a
    bigger chance (the 3rd (worst) string has died
    out!)
  • Pairs: the 1-2 and 3-4 selections
  • Position of the crossover: randomly chosen
  • Mutation: bit by bit with probability p = 0.001
  • (the generation contains 20 bits, so on average 0.02
    bits will be mutated; in this example, none)

104
Anytime Techniques: why do we need them?
  • Larger scale signal processing (DSP) systems,
    Artificial Intelligence
  • Limited amount of resources
  • Abrupt changes in
  • Environment
  • Processing system
  • Computational resources (shortage)
  • Data flow (loss)
  • Processing should be continued
  • Low complexity → lower, but possibly sufficient
    accuracy or partial results (for qualitative
    decisions)
  • → Anytime systems

105
Anytime Systems: what do they offer?
  • To handle abrupt changes due to failures
  • To fulfill prescribed response time conditions
    (changeable response time)
  • Continuous operation in case of serious shortage
    of necessary data (temporary overload of certain
    communication channels, sensor failures, etc.)
    or processing time
  • To provide appropriate overall performance for
    the whole system
  • guaranteed response time, known error
  • Flexibility: available input data, available
    time, computational power; balance between time
    and quality (quality: accuracy, resolution, etc.)

106
Anytime systems: how do they work?
  • Conditions: on-line computing, guaranteed
    response time, limited resources (changing in
    time)
  • Anytime processing: coping with the temporarily
    available resources to maintain the overall
    performance
  • correct models, treatable by the limited
    resources during limited time; low and changeable
    complexity; possibility of reallocation of the
    resources; changeable and guaranteed response
    time/computational need; known error
  • Tools: iterative algorithms, other types of
    methods used in a modular architecture

107
  • optimization of the whole system (processing
    chain) based on intelligent decisions (expert
    system, shortage indicators)
  • algorithms and models of simpler complexity
  • temporarily lower accuracy
  • data for qualitative evaluations for supporting
    decisions
  • coping with the temporal conditions
  • supporting early decision making
  • preventing serious alarm situations

108
  • Shortage indicators
  • Intelligent monitor
  • Special compilation methods during runtime
  • Strict time constraints for the monitor
  • The number and the complexity of the executable
    tasks can be very high
  • → add-in optimization

109
Missing input samples
  • Temporary overload of certain communication
    channels, sensor failures, etc. ⇒ the input
    samples fail to arrive in time or are lost
  • ⇓
  • Prediction mechanism (estimations based on
    previous data)
  • Example: resonator-based filters
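
A minimal sketch of such a prediction mechanism is given below. It is not the resonator-based filter named on the slide; it substitutes a much simpler linear extrapolation from the two most recent values, only to illustrate the idea of estimating a missing sample from previous data. The function name and the use of None for a missing sample are assumptions made for the example.

```python
def fill_missing(samples):
    """Replace missing samples (None) with a prediction from previous data.

    The predictor is a simple linear extrapolation (an assumption of this
    sketch, not the resonator-based method mentioned on the slide).
    """
    filled = []
    for s in samples:
        if s is None:
            if len(filled) >= 2:
                s = 2 * filled[-1] - filled[-2]  # continue the last slope
            elif filled:
                s = filled[-1]                   # hold the last value
            else:
                s = 0.0                          # nothing known yet
        filled.append(s)
    return filled


print(fill_missing([1.0, 2.0, None, 4.0, None, None]))
```

Predicted values are fed back into the predictor, so a long run of lost samples continues the last observed trend; a real system would also track the growing uncertainty.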

110
Temporary shortage of computing power
  • Temporary shortage of computing power ⇒ the
    signal processing cannot be performed in time
  • ⇓
  • Trade-off between the approximation accuracy and
    the complexity
  • Complexity reduction techniques, reduction of
    the sampling rate, application of less accurate
    evaluations
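
The trade-off can be illustrated with the simplest of the listed options, a reduced sampling rate. The sketch below estimates the mean power of a signal and, when the operation budget is too small for all samples, uses only every k-th one. The budget model (one operation per sample) and all names are assumptions of the example; a real system would low-pass filter before decimating to avoid aliasing.

```python
import math


def choose_decimation(n_samples, budget_ops):
    """Smallest step k such that ceil(n_samples / k) <= budget_ops."""
    if budget_ops < 1:
        raise ValueError("budget must allow at least one operation")
    return max(1, math.ceil(n_samples / budget_ops))


def mean_power(samples, budget_ops):
    """Mean of x^2 over as many samples as the budget allows.

    Returns (estimate, k), where k is the decimation step actually used.
    """
    if not samples:
        raise ValueError("no samples")
    k = choose_decimation(len(samples), budget_ops)
    used = samples[::k]                 # keep every k-th sample
    return sum(x * x for x in used) / len(used), k


signal = [0, 1, 2, 3, 4, 5, 6, 7]
print(mean_power(signal, budget_ops=8))  # full accuracy
print(mean_power(signal, budget_ops=4))  # half the work, less accurate
```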

111
Temporary shortage of computing power
  • Examples
  • Application of lower order filters or
    transformers (in the case of recursive discrete
    transformers, switching off some of the
    channels; an obvious requirement is to maintain,
    e.g., the orthogonality of the transformations)
  • Singular Value Decomposition applied to fuzzy
    models, B-spline neural networks, wavelet
    functions, Gabor functions, etc.: fuzzy filters,
    human hearing system, generalized NNs

112
Temporary shortage of computing time
  • Temporary shortage of computing time ⇒ the
    signal processing cannot be performed in time
  • Examples
  • block-recursive filters and filter banks
  • overcomplete signal representations

113
Anytime algorithms: iterative methods
  • Evaluate 734/25! (after 1 second: approximately
    30 → after 5 seconds: better, 29.3 → after 8
    seconds: exactly 29.36)

We build a system for collecting information
We improve the system by building in the
knowledge
We collect the information
We improve the observation and collect more
information
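
The 734/25 example can be sketched as an iterative computation that always has a usable result available. The version below is long division, one decimal digit per step, with a step budget standing in for the available time. It is only an illustration under these assumptions: the function names are made up, positive integers are assumed, and the first estimate is the truncated value 29 instead of the rounded 30 quoted on the slide.

```python
from fractions import Fraction


def anytime_divide(numerator, denominator, max_steps):
    """Yield (approximation, is_exact) pairs, one more decimal digit per step.

    The caller may stop consuming at any time and keep the last pair.
    Assumes positive integers.
    """
    integer_part, remainder = divmod(numerator, denominator)
    approx = Fraction(integer_part)
    yield approx, remainder == 0
    scale = Fraction(1)
    for _ in range(max_steps):
        if remainder == 0:
            return                      # already exact, nothing to refine
        scale /= 10
        digit, remainder = divmod(remainder * 10, denominator)
        approx += digit * scale
        yield approx, remainder == 0


def best_within(numerator, denominator, max_steps):
    """Return the last result produced within the step budget."""
    result = None
    for result in anytime_divide(numerator, denominator, max_steps):
        pass
    return result


for budget in (0, 1, 8):
    value, exact = best_within(734, 25, budget)
    print(budget, float(value), exact)
```

A larger budget never makes the result worse, and the flag tells the caller whether the value is exact or only the best available so far.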
114
Anytime algorithms: modular architecture
  • Units: distinct implementations of a task, with
    the same interface but different performance
    characteristics
  • Characteristics: complexity, accuracy, error
    transfer characteristic
  • → selection
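
The selection step can be sketched as follows. Each unit implements the same task (here, estimating the mean of a signal) behind the same interface, and a monitor picks the most accurate unit whose complexity fits the current budget. The Unit class, the cost and error figures, and the selection rule are all hypothetical; the error transfer characteristic is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Unit:
    name: str
    cost: int        # complexity in arbitrary operation units (illustrative)
    error: float     # known error bound (illustrative)
    run: Callable[[Sequence[float]], float]


# Three implementations of the same task with the same interface.
UNITS = [
    Unit("full", cost=8, error=0.0, run=lambda x: sum(x) / len(x)),
    Unit("half", cost=4, error=0.5, run=lambda x: sum(x[::2]) / len(x[::2])),
    Unit("first", cost=1, error=2.0, run=lambda x: x[0]),
]


def select_unit(units: Sequence[Unit], budget: int) -> Optional[Unit]:
    """Most accurate unit that fits the budget, or None if none fits."""
    feasible = [u for u in units if u.cost <= budget]
    if not feasible:
        return None
    return min(feasible, key=lambda u: u.error)


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
for budget in (10, 5, 1):
    unit = select_unit(UNITS, budget)
    print(budget, unit.name, unit.run(signal), "error <=", unit.error)
```

Because every unit carries its own known error, the rest of the processing chain can be told how much to trust the result it receives.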

115
Engineering view: Practical issues
  • Well-defined mathematical foundation, but there
    is a gap between the theory and the
    implementation
  • When does which method work better? (Can the
    theory not give an answer, or have we been too
    lazy to think it over?)
  • How to choose the sizes/parameters/shapes/
    definitions/etc.?
  • What if the axioms are inconsistent/incomplete?
    (the practical possibility can be 0)
  • Handling of the exceptions, e.g. the rule for
    "very young" overrides the rule for "young"
  • Good advice: modeling, a priori knowledge,
    iteration, hybrid systems, smooth
    systems/parameters (as near to the real world as
    possible)

116
Accuracy problems
  • How can we handle accuracy problems if, e.g.,
    we don't have any input information?
  • What if, in time-critical applications, not only
    the stationary responses are to be considered?
  • How can the different modeling/data
    representation methods interpret each other's
    results?
  • New (classical + nonclassical) measures are
    needed

117
Transients
  • Dynamic systems
  • Change in the systems ⇒ transients
  • Depending on the transfer function and on the
    actual implementation of the structure
  • Strongly related to the energy distribution of
    the system
  • Affected by the steps and the reconfiguration
    route

118
Transients
  • Must be reduced and treated:
  • careful choice of the architecture (orthogonal
    structures have better transients)
  • multi-step reconfiguration: selection of the
    number and location of the intermediate steps
  • estimation of the effect of transients

119
Is CI really a solution for unsolvable problems?
  • Yes: the high number of successful applications
    and the new areas where automation became
    possible prove that Computational Intelligence
    can be a solution for otherwise unsolvable
    problems
  • However: with the new methods, new problems have
    arisen, to be solved by you
  • Future engineering is unthinkable without
    Computational Intelligence

120
Conclusions
  • What is Computational Intelligence?
  • What is the secret of its success?
  • How does it work?
  • What kind of approaches/concepts are attached?
  • New problems with open questions