Title: Slide 1
1 Data Mining 2 (formerly Análisis Inteligente de Datos
y Data Mining) Lluís A. Belanche
2- www.lsi.upc.edu/...
- /belanche/docencia/aiddm/aiddm.html
- /avellido/teaching/data_mining.htm
3Contents of the course (hopefully)
- 1. Introduction and methodologies
- 2. Exploratory DM through visualization
- 3. Pattern recognition: introduction
- 4. Pattern recognition: the Gaussian case
- 5. Feature extraction
- 6. Feature selection and weighing
- 7. Error estimation
- 8. Linear methods are nice!
- 9. Probability in Data Mining
- 10. Latency, generativity, manifolds and all that
- 11. Application of GTM: from medicine to ecology
- 12. DM Case studies
Sorry guys! no fuzzy systems
6Error estimation
7Feature extraction, selection and weighing have
many uses
8Linear classifiers are nice! (I)
9Linear classifiers are nice! (II) Φ Transformation
- Φ(x) = (Φ1(x), Φ2(x), ..., Φm(x))
- with x = (x1, x2, ..., xn)
- Useful for ascending (m > n) or descending (m < n)
- with 0 < m, n < ∞ (integers). An example?
10Linear classifiers are nice! (III) Φ Nets
- Φ(x) = (Φ1(x), Φ2(x), ..., Φm(x))
- with x = (x1, x2, ..., xn)
[Diagram: x → Φ(x)]
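A minimal sketch of an ascending transformation (m > n), answering the "an example?" prompt under stated assumptions. The map Φ(x1, x2) = (x1, x2, x1·x2) and the hand-picked weights W and B are illustrative choices, not taken from the slides. XOR cannot be separated by a linear classifier on the original two inputs, but it can be in the three-dimensional Φ-space.

```python
# Hypothetical feature map (n = 2 -> m = 3): append the product term.
def phi(x):
    x1, x2 = x
    return (x1, x2, x1 * x2)

# Linear classifier in Phi-space: threshold on w . phi(x) + b.
# Weights are chosen by hand for illustration (an assumption, not a learned model).
W = (1.0, 1.0, -2.0)
B = -0.5

def classify(x):
    z = phi(x)
    score = sum(w * zi for w, zi in zip(W, z)) + B
    return 1 if score > 0 else 0

# The XOR truth table: not linearly separable in the original 2-D space.
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
predictions = {x: classify(x) for x in XOR}
print(predictions == XOR)
```

The classifier itself stays linear; all the non-linearity lives in Φ, which is the point of the slide titles.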
11Utility
- This is a very powerful setting
- Let us suppose Φ maps s dimensions to r dimensions
- r > s ⇒ increase in dimension: increase in expressive power, easing the task for almost any learning machine
- r < s ⇒ decrease in dimension: visualization, compaction, noise reduction, removal of useless information
- Contradictory!?
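A minimal sketch of the descending case, reading s as the input dimension and r as the output dimension (an interpretation of the slide's notation). It reduces s = 2 to r = 1 by projecting onto the first principal direction, found with power iteration. Principal component analysis is only one possible choice here, and the data points are made up for illustration.

```python
import math

def first_principal_direction(points, iterations=100):
    """Power iteration on the 2x2 covariance matrix of centered 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(p[0] - mx, p[1] - my) for p in points]
    cxx = sum(a * a for a, _ in centered) / n
    cxy = sum(a * b for a, b in centered) / n
    cyy = sum(b * b for _, b in centered) / n
    v = (1.0, 0.0)  # arbitrary starting vector
    for _ in range(iterations):
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = math.hypot(w[0], w[1])
        v = (w[0] / norm, w[1] / norm)
    return (mx, my), v

def project(points, mean, direction):
    """Map each 2-D point to one coordinate along the direction (s = 2 -> r = 1)."""
    return [
        (p[0] - mean[0]) * direction[0] + (p[1] - mean[1]) * direction[1]
        for p in points
    ]

# Made-up data lying close to the line y = 2x, with small perturbations.
data = [(0.0, 0.1), (1.0, 1.9), (2.0, 4.1), (3.0, 5.9), (4.0, 8.1)]
mean, direction = first_principal_direction(data)
reduced = project(data, mean, direction)
print(direction, reduced)
```

Almost all of the variation in the data survives in the single retained coordinate; what is discarded is mostly the small perturbation around the line, which is the noise-reduction use mentioned on the slide.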
12On intelligence
- What is Intelligence?
- What is the function of Intelligence?
- → to ensure survival in nature
- What are the ingredients of intelligence?
- Perceive in a changing world
- Reason under partial truth
- Plan and prioritize under uncertainty
- Coordinate different simultaneous tasks
- Learn under noisy experiences
13- Parking a Car (difficult or easy?)
- Generally, a car can be parked rather easily
because the final position of the car is not
specified exactly. If it were specified to
within, say, a fraction of a millimeter and a few
seconds of arc, it would take hours of
maneuvering and precise measurements of distance
and angular position to solve the problem.
- ⇒ High precision carries a high cost.
14The primordial soup
Fuzzy Logic
Neural Networks
Soft Computing
Chaos and Fractals
Evolutionary Algorithms
Rough Sets
15What could MACHINE LEARNING possibly be?
- In the beginning, there was a set of examples
- To exploit imprecision, uncertainty, robustness,
data dependencies, learning and/or optimization
ability, to achieve a working solution to a
problem which is hard to solve.
- To find an exact (approximate) solution to an
imprecisely (precisely) formulated problem.
16So what is the aim?
- The challenge is to put these capabilities into
use by devising methods of computation which lead
to an acceptable solution at the lowest possible
cost.
- This should be the guiding principle
17Different methods, different roles
- Fuzzy Logic: the algorithms for dealing with imprecision and uncertainty
- Neural Networks: the machinery for learning and function approximation with noise
- Evolutionary Algorithms: the algorithms for adaptive search and optimization
- Rough Sets: handling uncertainty arising from the granularity in the domain of discourse
18Examples of soft computing
- TSP: 10^5 cities
- accuracy within 0.75%: 7 months
- accuracy within 1%: 2 days
- Compare
- "absolute best for sure" with "very good with very high probability"
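A minimal sketch of the same trade-off on a toy scale. The slide does not say which methods produced its figures, so both solvers here are stand-ins: exhaustive search plays the role of "absolute best for sure" and a nearest-neighbour heuristic plays the role of the cheap, good-enough answer. The eight city coordinates are made up so that exhaustive search remains feasible.

```python
import itertools
import math

# Made-up city coordinates; a tiny instance so exhaustive search is feasible.
CITIES = [(0, 0), (2, 6), (5, 1), (6, 5), (8, 2), (1, 3), (7, 7), (3, 4)]

def tour_length(order):
    """Length of the closed tour that visits the cities in the given order."""
    return sum(
        math.dist(CITIES[order[i]], CITIES[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )

def exact_tour():
    """Exhaustive search over every tour that starts at city 0."""
    rest = range(1, len(CITIES))
    return min(((0,) + perm for perm in itertools.permutations(rest)), key=tour_length)

def nearest_neighbour_tour():
    """Cheap heuristic: always move to the closest unvisited city."""
    tour, unvisited = [0], set(range(1, len(CITIES)))
    while unvisited:
        here = CITIES[tour[-1]]
        nxt = min(unvisited, key=lambda c: math.dist(here, CITIES[c]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tuple(tour)

best = tour_length(exact_tour())
quick = tour_length(nearest_neighbour_tour())
print(f"exact: {best:.2f}  heuristic: {quick:.2f}  excess: {100 * (quick / best - 1):.1f}%")
```

Exhaustive search examines (n-1)! tours, so it stops being usable after a handful of cities, while the heuristic needs on the order of n^2 distance evaluations. That gap in cost is what the 7 months versus 2 days comparison illustrates.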
19Are you one of the top guns?
- Consider
- Search space of size s
- Draw N random samples
- What is the probability p that at least one of
them is in the top t?
- Answer: p = 1 - (1 - t/s)^N
- Example: s = 10^12, N = 100,000, t = 1,000
- ⇒ p ≈ 1 in 10,000!
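The formula can be checked numerically. This is a small sketch assuming the N samples are drawn independently and uniformly (with replacement), which is what the expression 1 - (1 - t/s)^N corresponds to. The function name prob_top_hit is hypothetical.

```python
import math

def prob_top_hit(s, n, t):
    """Probability that at least one of n independent uniform samples from a
    search space of size s falls among the top t elements."""
    # Equivalent to 1 - (1 - t/s)**n; log1p/expm1 keep accuracy when t/s is tiny.
    return -math.expm1(n * math.log1p(-t / s))

# The slide's example: s = 10^12, N = 100,000, t = 1,000.
p = prob_top_hit(s=10**12, n=100_000, t=1_000)
print(f"p = {p:.3e}")  # close to 1e-4, i.e. about 1 in 10,000
```

Because t/s is so small here, p is very nearly N·t/s, which is why the answer comes out at roughly one chance in ten thousand.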
20On Algorithms
- Specialized algorithms: best performance for special problems
- Generic algorithms: good performance over a wide range of problems
[Figure: Efficiency plotted against Problems, with curves labelled "Generic Algorithms" and "Specialized Algo." and a point marked P.]
21Words are important !
- What is a theory ?
- What is an algorithm ?
- What is an implementation ?
- What is a model ?
- What does non-linear mean ?
- What does non-parametric mean ?
22- Learning Foreignia (Poggio and Girosi 93)
Can a machine learn to pronounce?
1. Do nothing and wait
2. Learn all the pronunciation rules
3. Memorize pronunciation examples
4. Pick a subset of pronunciation pairs and learn/memorize them
5. Pick subsets of pronunciation examples and develop a model explaining them
23The problem of induction
- Classical problem in Philosophy
- Example: 1, 2, 3, 4, 5, ?
- A more thorough example: JT
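One way to see why the sequence 1, 2, 3, 4, 5, ? has no logically forced continuation: for any proposed sixth term there is a polynomial that reproduces all five observations and also yields that term. The sketch below uses Lagrange interpolation with exact fractions. The choice of polynomials as the hypothesis class and the value 42 are illustrative assumptions.

```python
from fractions import Fraction

def lagrange_value(points, x):
    """Evaluate at x the unique lowest-degree polynomial through the given points."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

# The observed sequence 1, 2, 3, 4, 5 as (position, value) pairs.
observed = [(k, k) for k in range(1, 6)]

# Two hypotheses that agree on every observation but disagree afterwards.
for sixth in (6, 42):
    hypothesis = observed + [(6, sixth)]
    fits = all(lagrange_value(hypothesis, k) == k for k in range(1, 6))
    print(sixth, fits, lagrange_value(hypothesis, 7))
```

Both hypotheses explain the data perfectly, so the data alone cannot choose between them. Preferring the "obvious" continuation 6 reflects a bias toward simpler hypotheses, not anything contained in the observations.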
24What are the conditions for successful learning?
- Training data (sufficiently) representative
- Principle of similarity
- Target function within capacity of the learner
- Non-dull learning algorithm
- Enough computational resources
- A correct (or close to) learning bias
25And the Oscar goes to...
- "The real problem is not whether machines think,
but whether men do."
- B.F. Skinner, Contingencies of Reinforcement