Title: ARTIFICIAL INTELLIGENCE
1ARTIFICIAL INTELLIGENCE AND SCIENTIFIC DISCOVERY
Luigi Abruzzese, Luciano Bonvissuto, Giuseppe
Carluccio, Mario Ceresa, Michele Garbugli, Davide
Lo Pinto, Luca Di Rienzo Residenza Universitaria
Torrescalla Milano -Italy
2INTRODUCTION
Scientific discovery is one of the most
characterizing activity of the human mind
Can computers emulate human mind in scientific
discovery?
Or are computers only a strong help in this
activity?
3Artificial Intelligence
Theoretical foundations
Methodologies
Techniques
4Artificial Intelligence
Software
Hardware
Performances of human mind
5Philosophical background
MODEL
ABDUCTION
INDUCTION DEDUCTION
PHENOMENON
ADDUCTION
LAW
6Historical analysis
The early days of artificial intelligence saw
various attempts to automate creative tasks of
scientific and mathematical inference. Perhaps
the earliest examples (on electronic computers)
of symbolic mathematical or scientific inference
were masters theses at MIT (J.F. Nolan) and at
Temple (H. G. Kahrimanian) in 1953 on analytical
differentiation in the calculus. Starting in the
1960s, Lederberg invented an algorythm for
generating molecular structures efficiently,
which led to the Stanford Dendral project whose
goal was to elucidate molecular structure on the
basis of mass spectograms and other experimental
evidence.
7Space-state search
8BACON DISCOVERS KEPLERS THIRD LAW
- The squares of the periods of planets are
proportional to the cubes of the mean radii of
their orbits P2 / D3.
9BACON
- AI program developed in 1970s.
- BACON is provided with knowledge of certain
mathematical relationships. - It carries out a search through the space of
possible compositions of those relationships
10Heuristics of BACON
DmPn constant
11- In conclusion
- BACON does not know what it has discovered. It
is BACON creators who comprehend the significance
of the discovery. - So who is the real discoverer, human or machine?
12Mathematical definitions and demonstrations
Computer (Artificial Intelligence)
13The Four Color Theorem
14First demonstration
1922 Franklin max 25 regions
Reducibility and discharging
Heesch
Appel e Haken
1476 particular cases
1200 hours processing
15a not rigorous demonstration
The demonstration couldnt be verified by a human
brain
Proving something to the people would mean
persuading a sufficient number of qualified
people. If we accept this kind of definition, in
the future it will be possible that calculators
will help men in the discovery of the new laws
of math
16FROM AUTOMATED DISCOVERY PROGRAMS TO
COMPUTATIONAL SCIENTIFIC DISCOVERY SYSTEMS
- Up to the 80s automated discovery programs
discovered laws already known - The state space approach leads to the explosion
of possibilities - Only the scientists can introduce heuristics
- that can limit the number of possible states
- Who really makes the discovery man or machines?
17RECENT RESEARCH
- The idea of totally automated discoveries is
abandoned - The new trend is towards computer supported
scientific discovery - The new goal is to obtain really new discoveries,
that can be published on specialized literature
18MACHINE LEARNING
- Different kinds of machine learning
- Supervised learning
- Unsupervised learning
- Reinforcement learning
19SUPERVISED LEARNING
NEURAL NETS
- Ensemble of elemental units combined in a
reticular structure - Net elements, called neurons, are organized in
layers and are tightly interconnected - To each link is associated a weight that
represents a kind of inner knowledge - MAIN FEATURES
- Learning
- Prediction
- TRAINING
- Training-set of examples (as input/output)
- Learning Algorithm
- Weights calibration
- GOAL
- Generalization of training results based on
test-set ( prediction )
Formal neuron structure
TRAINING PHASE
20NEURAL NETS
- The endeavour of emulating the real neural nets
leads to conceive various kinds of nets that can
be classified according to some parameters - Use
- Learning algorithms
- Links structure
- The most known models are
- Feedforward nets (with back-propagation
algorithm,most used) - Associative nets
- Stochastic nets
- Self-organizing nets (Kohonen, also unsupervised)
- Genetic nets ( models from Darwin evolution
theory)
21UNSUPERVISED LEARNING
- DATA MINING
- Process of exploration and analysis ,with
automatic and semi-automatic tools, of vast
amount of data, oriented to discover significant
structures and rules and to develop predictive or
explicative models of a specific phenomenon. - We have many techniques of data mining
- Decisional trees , data warehouse , clustering
, associative rules and temporal sequences......
22CLUSTERING
- Cluster Objects/data collection
- Similar compared with
each object in the same cluster - Different from other
clusters objects - CLUSTERING ANALYSIS To group objects together
in cluster - Clustering is defined as unsupervised
classification - It doesnt use any background knowledge on
studied data set -
- TIPICAL APPLICATIONS
- As stand-alone tool to try to understand how data
are distributed - (for ex. in genic expression data
analysis,astronomic data elaboration....) - As preprocessing pass for other algorithms
23REINFORCEMENT LEARNING
- System acts directly on problem making attempts
- A teacher rewards or punishes
- the system through a numerical
- signal of reinforce , depending on
- system instant behaviour
-
Rewards and punishments
24An example GOLEM ( built by Muggleton and Feng,
1992 )
- PROBLEM Prediction of secondary structure of
protein from the _ ________ sequence of amino
acids. - A traditional method used for discovering the
secondary structure is X-ray crystallography ,
but a crystal structure determination may
require one or more man-year. - In general, other techniques also used for this
problem are costly,time-consuming and often
limitated by some proteins parameters (like
size ...). - From this the need of computational systems
support.
25GOLEM problem description
- The two main substructures of proteins are
-
- a helix structure
- ß filamants structure
- GOLEM
- Restricts the field of his analysis to
- a helix proteins
- Attempts to predict, from primary _structure,
if a particular residue (amino acid) belongs or
not to the a helix _type.
ß filaments structure _ a helix
structure
26GOLEM FUNCTIONING (1)
- TRAINING SET 12 proteins ,non homologous,
with well known structures - (LEARNING) of a helix type ,
comprising 1612 residues. -
- BACKGROUND
KNOWLEDGE -
- SMALL SET OF RULES used for predicting which
residues belong to ahelix _
proteins. - TEST SET 4 proteins (structure known) , ahelix
type, comprising 416 _ __
residues - ACCURACY(on test
set) 81 ( 2 ) -
27GOLEM FUNCTIONING (2)
- Information coded in 1 or 2 parts predicates
- Ex a(155C,105) means that a particular
protein (155c) residue (in 105 - ___ position) is a ahelix type .
- Preferential research toward residues that show
particular links characters with the others
(data mining) - Research of rules carried out with an iterative
procedure that involves - a bootstrapping learning process.
- Then the rules generated by GOLEM can be
considered hypothesis about - the ways through which ahelix form in
nature.They define the pattern of relations that
,if present in the sequence of residues,
indicates that a specific residue could be part
of a ahelix .
28GOLEM RESULTS
- One of the rules produced by GOLEM ,concerning
protein structure is - for example the RULE 12
- There is a ahelix residue in the protein A in
position B if - 1 The residue in B -2 is not proline
- 2 The residue in B -1 is neither aromatic
nor proline - 3 The residue in B is big, neither aromatic
and nor lysine - 4 The residue in B 1 is hydrophobic and
not lysine - 5 The residue in B 2 is neither aromatic
nor proline - 6 The residue in B 3 is neither aromatic
nor proline, and or small or polar - and,
- 7 The residue in B 4 is hydrophobic and
not lysine - This rule has an ACCURACY of 95 in training and
of 81 on test set. - This rule was not known before GOLEM discovered
it and it has contributed - to one of the most important actual problem of
natural sciences. - Thats why we can credit to GOLEM the discover
of a natural law.
29CONCLUSION
We have presented some attempts of creating
artificial intelligence programs that make
scientific discoveries From the historical
analysis we have shown that the first idea of A.I
programs that autonomously discover scientific
laws has been abandoned The new trend is that of
computer supported scientific discovery in which
Artificial Intelligence is a useful and
sometimes necessary tool for scientific
research