Title: Computer Architectures and Linguistics
Computer Architectures and Linguistics
- Von Neumann Architecture versus Artificial Neural Networks
Computer Architectures and Linguistics: Topics
- Von Neumann Architecture
- Human Neural Information Processing
- Artificial Neural Networks (ANN)
- Opportunities of ANN
- Reality of ANN
- Summary
- References
1. Von Neumann Architecture
- Chapter 1 explains the von Neumann computer architecture.
- We learn about the history, characteristics, and properties of the von Neumann architecture.
1.1 Person and History
- Von Neumann (1903-1957) was a mathematician (a native of Austria-Hungary) who emigrated to the USA.
- In 1945 he prepared a draft for an automatic programmable device (later called EDVAC, the Electronic Discrete Variable Automatic Computer).
1.2 Characteristics of the von Neumann Architecture
- It is assumed that all computers are of the von Neumann type unless stated otherwise.
- Instructions and data are stored in the same memory.
- The memory is addressed sequentially.
- Only one bus is used for both addresses and data.
- The meaning of data is not stored with it (memory holds only untyped binary data).
1.3 Basic Elements of the von Neumann Machine
Figure: memory, input/output, control unit, and ALU (arithmetic logic unit), connected by the system bus.
1.4 Program Execution
- A program is a list of instructions stored in the order in which they are to be executed.
- The processing unit can access only one word at a time.
- The next instruction to be executed is stored after the current one.
- Logically, the operation of a processor can be decomposed into three phases (see the sketch below):
  - Fetch: read the instruction from memory.
  - Decode: find out what the instruction is and get the data to operate on.
  - Execute: carry out the required instruction.
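A minimal Python sketch of this fetch-decode-execute cycle follows; the tiny instruction set (LOAD/ADD/STORE/HALT), the memory layout, and all values are illustrative assumptions, not taken from the original slides. It only shows that instructions and data share one memory and that the processor touches one word per step.

```python
# Minimal sketch of a von Neumann machine (illustrative instruction set and values).
# Instructions and data live in the SAME memory; the processor fetches one word,
# decodes it, and executes it, one step at a time.

memory = [
    ("LOAD", 6),     # 0: load the word at address 6 into the accumulator
    ("ADD", 7),      # 1: add the word at address 7 to the accumulator
    ("STORE", 8),    # 2: store the accumulator at address 8
    ("HALT", None),  # 3: stop
    None, None,      # 4-5: unused
    2, 3,            # 6-7: data words
    0,               # 8: result is written here
]

pc = 0           # program counter: address of the next instruction
accumulator = 0

while True:
    instruction = memory[pc]       # fetch (one memory access)
    pc += 1                        # the next instruction follows the current one
    opcode, address = instruction  # decode
    if opcode == "LOAD":           # execute (possibly more memory accesses for data)
        accumulator = memory[address]
    elif opcode == "ADD":
        accumulator += memory[address]
    elif opcode == "STORE":
        memory[address] = accumulator
    elif opcode == "HALT":
        break

print(memory[8])  # 5
```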
1.5 Sequential Programming Style
- The definition of variables in von Neumann programming languages is based on symbolically named memory locations.
- Every layer of routines and subroutines is always executed in the same order; it is not possible to jump to a higher layer before the execution of the current one has terminated.
- This strictly sequential program execution imposes a specific programming style.
1.6 Disadvantages of the von Neumann Architecture
- Semantic gap between the von Neumann machine and programming languages: the machine operates only on bit strings and cannot distinguish between data and instructions; memory access is controlled only by the given memory address, not by data typing.
- Von Neumann bottleneck: in each operation step only one object can be transformed, and every execution requires several memory accesses (the von Neumann "linguistic bottleneck").
2. Human Neural Information Processing
- Chapter 2 presents some facts about human neural information processing, insofar as they are important for artificial neural network systems.
2.1 Neurons and Synapses
- The human brain contains more than 100,000,000,000 (10¹¹) neurons.
- Each neuron is connected to up to 100,000 other neurons.
- The neurons are connected by synapses.
2.2 The Two Main Types of Synapses
- There are two main types of synapses:
  - The excitatory ones, which increase the postsynaptic potential and bring the following neuron closer to triggering.
  - The inhibitory ones, which work in the opposite direction.
- The type of a synapse is determined by its chemical receptors.
2.3 Interpretation of Information
- The synaptic efficacy enables synapses to interpret incoming stimulation impulses.
- The synaptic efficacy increases if a connection is used very frequently.
- The synaptic efficacy decreases if the connection is not used, or is used rarely.
- This corresponds to the ability to represent knowledge acquired through former experiences.
2.4 The Model of Associative Learning (Hebb's Learning Rule)
- The synaptic efficacy changes over time in proportion to the input impulses.
- If the input impulses and the synaptic efficacy are sufficient to pass a specific threshold value, the next neuron is stimulated.
- If two neurons are stimulated at the same time, their connection is strengthened.
- In this way, simultaneous stimulation of connected neurons causes an association of these neurons; for example, seeing a rose and smelling its fragrance. (A small sketch follows below.)
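A minimal Python sketch of Hebb's rule in its usual simple formulation (weight change proportional to the product of pre- and postsynaptic activity); the learning rate and the example activations are assumed, illustrative values.

```python
# Hebb's rule sketch: the connection weight w between two units grows
# whenever both units are active at the same time (illustrative values).

eta = 0.1   # learning rate (assumed)
w = 0.0     # synaptic efficacy between the two units

# pairs of (pre-synaptic, post-synaptic) activity over time,
# e.g. "see a rose" and "smell the fragrance" occurring together
events = [(1, 1), (1, 1), (0, 1), (1, 0), (1, 1)]

for pre, post in events:
    w += eta * pre * post   # weight changes only when both units are active

print(round(w, 2))  # 0.3 -- strengthened by the three co-activations
```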
2.5 The Delta Learning Rule
- The delta learning rule is an extension of Hebb's learning rule.
- With the delta learning rule it is possible to adapt the synaptic efficacy between neurons (network elements) depending on the difference between the desired and the actual activation of a pattern.
- For example: the different pronunciations of the same phoneme. (A small sketch follows below.)
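A minimal Python sketch of the delta rule in its common form, where the weight change is proportional to (desired activation minus actual activation) times the input; the learning rate, inputs, and target are assumed, illustrative values.

```python
# Delta rule sketch: adapt the weights using the difference between the
# desired and the actual activation (illustrative values).

eta = 0.5                 # learning rate (assumed)
weights = [0.0, 0.0]

def activation(inputs, weights):
    return sum(i * w for i, w in zip(inputs, weights))

inputs, target = [1.0, 0.5], 1.0   # one training pattern and its desired output

for _ in range(20):
    actual = activation(inputs, weights)
    error = target - actual                                   # desired minus actual
    weights = [w + eta * error * i for i, w in zip(inputs, weights)]

print([round(w, 3) for w in weights])          # approximately [0.8, 0.4]
print(round(activation(inputs, weights), 3))   # approximately 1.0 (the desired value)
```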
2.6 Properties of Computer and Brain
3. Artificial Neural Networks
- Chapter 3 presents the properties of artificial neural networks (ANN), their characteristics, and the sense in which they are related to human language models.
3.1 The Basic Attributes of ANN
- A great number of simple, uniform processing units,
- Parallel processing,
- Transmission of stimulation impulses to the following elements,
- Modification of the efficacy of connections between the elements, depending on the value of incoming stimulation impulses,
- Occurrence of excitatory and inhibitory types of connections,
- Distributed representation of knowledge.
3.2 Abstraction of the Human Model
- The architecture and processing of artificial neural networks (ANN) is an abstraction of biological neural networks.
- Abstraction means an idealized description and modelling of the biological original.
- Compare section 2.1: it does not seem possible to create a real copy!
3.3 Connectionism
- Connectionism is the modelling and simulation of information processing based on artificial neural networks.
- The term is used to describe the practical application of the general principles of artificial neural networks.
3.4 Abstract Processing Unit
- The efficacy corresponds to an interpretation of the output values of the preceding elements.
- If the sum of the modified incoming stimulations passes a specific threshold value, the processing unit is stimulated and enabled to send output to the following unit.
Figure: abstract processing unit with inputs i1, i2, i3, efficacy modification factors f1, f2, f3, and output out.
3.4.1 Abstract Processing Unit: Example
- Threshold value: 0.5
- i1, i2, i3: inputs
- f1, f2, f3: efficacy modification factors
- Output: i1·f1 + i2·f2 + i3·f3
- f1 = -0.3, f2 = 0.4, f3 = 0.2
- If i1 = 0, i2 = 1, i3 = 1: the unit is activated (output 1), since 0.4 + 0.2 = 0.6 > 0.5.
- If i1 = i2 = i3 = 1: the unit remains inactive (output 0), since -0.3 + 0.4 + 0.2 = 0.3 < 0.5. (See the sketch below.)
Figure: the example unit with i1 = 1 or 0, i2 = 1, i3 = 1 and factors f1 = -0.3, f2 = 0.4, f3 = 0.2.
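The same example can be written as a few lines of Python; only the threshold comparison (strictly greater than 0.5) is assumed, the numbers are those given above.

```python
# The abstract processing unit of the example: output 1 if the weighted sum
# of the inputs exceeds the threshold, otherwise output 0.

def unit_output(inputs, factors, threshold=0.5):
    weighted_sum = sum(i * f for i, f in zip(inputs, factors))
    return 1 if weighted_sum > threshold else 0

factors = [-0.3, 0.4, 0.2]              # f1, f2, f3

print(unit_output([0, 1, 1], factors))  # 1: 0.4 + 0.2 = 0.6 > 0.5 -> active
print(unit_output([1, 1, 1], factors))  # 0: -0.3 + 0.4 + 0.2 = 0.3 < 0.5 -> inactive
```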
3.5 Transformation of Conventional Databases
- It is possible to transform conventional databases into artificial neural networks.
3.5.1 Sequential Characterization
3.5.2 Network Presentation
- Inside the pools there are only inhibitory connections, because each node excludes the other nodes.
- The nodes in the central pool are instances.
Figure: network of pools with the age nodes 20s, 30s, 40s, the name nodes John, Ben, Jenny, Tim, the state nodes Arizona, Florida, Texas, and the occupation nodes student, teacher, grouped around a central pool of instances.
3.5.3 Activation Process
- If one node of a pool is activated, all the other nodes of that pool are suppressed.
- By activating the instance related to John, it is possible to activate all the attributes of John through the positive connections of these elements to the activated instance: here 30s, Florida, and student. (A simplified sketch follows below.)
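The following Python sketch is a strongly simplified version of this activation process; the weights, the decay, and the concrete facts about John, Ben, Jenny, and Tim are assumptions chosen to match the figure in section 3.5.2, not values from the original slides. Nodes inside a pool inhibit each other, instance-property links are excitatory, and after a few update steps John's properties (30s, Florida, student) carry the highest activation.

```python
# Strongly simplified activation process (illustrative weights and facts).

pools = {
    "name":       ["John", "Ben", "Jenny", "Tim"],
    "age":        ["20s", "30s", "40s"],
    "state":      ["Arizona", "Florida", "Texas"],
    "occupation": ["student", "teacher"],
}

# excitatory links between each instance and its properties (assumed facts)
facts = {
    "John":  ["30s", "Florida", "student"],
    "Ben":   ["30s", "Florida", "teacher"],
    "Jenny": ["20s", "Florida", "student"],
    "Tim":   ["40s", "Texas", "teacher"],
}

DECAY, EXCITE, INHIBIT = 0.5, 0.2, 0.1     # assumed parameters
nodes = [n for pool in pools.values() for n in pool]
activation = {n: 0.0 for n in nodes}
activation["John"] = 1.0                   # external activation of the node John

for _ in range(20):
    new = {}
    for node in nodes:
        net = 0.0
        for instance, props in facts.items():   # excitation along instance-property links
            if node == instance:
                net += EXCITE * sum(activation[p] for p in props)
            elif node in props:
                net += EXCITE * activation[instance]
        pool = next(members for members in pools.values() if node in members)
        net -= INHIBIT * sum(activation[m] for m in pool if m != node)  # competition in the pool
        new[node] = min(1.0, max(0.0, (1.0 - DECAY) * activation[node] + net))
    new["John"] = 1.0                        # keep the external input clamped
    activation = new

for node in sorted(nodes, key=activation.get, reverse=True):
    print(node, round(activation[node], 2))
```

In this toy run, Florida ends up slightly higher than 30s and student because it is shared by three instances, and the instances sharing properties with John (Ben, Jenny) keep a small positive activation while Tim stays at zero; this mirrors the discussion of the activation values in sections 3.5.6 and 3.5.7.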
3.5.4 Activation of the Node John
- This is a very simplified demonstration, intended only to show the general principles of the activation process.
- The first column of the table shows the designation of the processing element.
- The second column shows the activation values.
3.5.5 Network Presentation and Activation of John
Figure: the network of section 3.5.2 (nodes 20s, 30s, 40s, John, Ben, Jenny, Tim, Arizona, Florida, Texas, student, teacher) with the node John activated.
3.5.6 Discussion of the Activation Values
- The instance of John has a high value.
- The properties of John also have high values.
- The differences in the activation values of John's properties show how typical these properties are for John.
- The differences in the values of the other instances (Ben, Tim, Jenny) show that the instances which share more properties with John (in the figure of section 3.5.5, Jenny and Ben) have the highest activation values.
3.5.7 Interpretation of the Activation Values
- Artificial neural networks have associative capabilities and are able to represent fine-grained knowledge.
- They are also capable of producing generalizations based on associations.
- This offers many advantages for applications of ANN in linguistics.
3.6 Context Feature Structure System
- The context feature structure system is a model for describing concepts that change their semantic features depending on the context.
- Below, the German example "Buch" is shown, modelled as a context feature structure.
3.6.1 The German Example "Buch" as a Context Feature Structure
Figure: the instance "Buch" connected to feature pools for class (V, N), gender (fem, masc, neut), person (1st, 2nd, 3rd), case (nom, acc, dat), and number (sing, plur).
3.6.2 Discussion of the Context Feature Structure System
- The categorical properties form pools: class, case, number, person, gender.
- "Buch" can represent a nominative, an accusative, or a dative, so there are connections to each of these case properties.
- If "Buch" is activated, each of these nodes becomes weakly stimulated. Because of the mutual suppression of the nodes, the activation values are not high. The appropriate case node is activated through other activations caused by the context (for instance an article).
3.7 Symbols and Subsymbols
- We now look at the differences between symbols and subsymbols, and therefore compare the properties of symbols with those of subsymbols.
3.7.1 Symbols
- Represent concepts,
- Are arbitrary: the relation between symbol and concept is arbitrary,
- Are definite: unequivocally defined,
- Are atomic: there are definition gaps between the concepts,
- Are not substitutable: a lost symbol is a lost concept.
3.7.2 Subsymbols
- Are the internal symbols of a model and represent the concepts of a subsymbolic system (hidden layer),
- Are associative (able to categorize),
- Are based on networks of many simple processor units, which process their input locally and send their output to connected processor units,
- Are not defined by the network designer,
- Are self-organized (able to constitute new concepts),
- Are related to other symbols and subsymbols and depend on the changing context.
3.7.3 Neural Net Layers
Figure: input values → input neuron layer → weight matrix → hidden neuron layer → weight matrix → output neuron layer → output values.
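A minimal Python sketch of this layer structure; the weight matrices, the input values, and the choice of a sigmoid squashing function are assumptions for illustration only.

```python
# Feedforward pass through the layer structure shown above:
# input values -> weight matrix -> hidden layer -> weight matrix -> output values.

import math

def layer(values, weight_matrix):
    """One layer: weighted sums of the incoming values, squashed into (0, 1)."""
    sums = [sum(w * v for w, v in zip(row, values)) for row in weight_matrix]
    return [1.0 / (1.0 + math.exp(-s)) for s in sums]

input_values = [0.0, 1.0, 1.0]

weights_input_to_hidden = [[0.5, -0.3, 0.8],   # 2 hidden neurons, 3 inputs each
                           [-0.6, 0.7, 0.1]]
weights_hidden_to_output = [[1.2, -0.4]]       # 1 output neuron, 2 hidden inputs

hidden_values = layer(input_values, weights_input_to_hidden)
output_values = layer(hidden_values, weights_hidden_to_output)

print([round(h, 3) for h in hidden_values], [round(o, 3) for o in output_values])
```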
3.7.4 Subsymbolic Model and Conceptualization
- The subsymbolic model abandons only the symbolic representation, but not the basics of symbolic thinking: concepts and categories.
- An architecture which allows conceptualization is, to a certain extent, itself a representation of the concept of a category.
- Source: http://www.spinfo.uni-koeln.de/mweidner/subsym/subsym.htm (translated from German to English by Inge)
3.8 Conceptualization
- Artificial neural networks are capable of generalizing information patterns and of establishing associative relationships to existing patterns.
- In this chapter we learn how the ability to constitute new patterns corresponds to the ability of conceptualization.
3.8.1 Conceptualization: Definitions
- Category: a set of stimuli which are handled by the system in a similar way and therefore produce the same conditions (or at least similar conditions).
- Concepts: identifiable internal conditions of a system (internal symbols).
- Source: http://www.spinfo.uni-koeln.de/mweidner/subsym/subsym.htm
3.8.2 Conceptualization: Methods
- Bottom-up: by external stimuli (acoustic, linguistic, visual, sensory, ...).
- Top-down: by the impact of internal, already stored conditions.
- In both cases new concepts are compared with the system's concepts before categorization is possible. This is a context-sensitive process (see our example in sections 3.6 and 3.6.1).
3.8.3 Conceptualization: Explanations
- The system has to compare new concepts with previously generated or stored concepts.
- The system has to know the contexts of previously stored concepts.
- The contexts of subordinate categories are similar in very many properties (example: chimpanzee, gorilla).
- The contexts of superordinate categories differ in very many properties (example: ape, cat).
3.8.4 Conceptualization Model of Connectionism
- Reduction of information by filtering out irrelevant details; for example, a single screw is not relevant for the category "car".
- Realization and recognition by several hidden layers which are mutually and associatively connected to several input layers.
- Competition by the Hebb rule, following the principle "winner takes more" (see the sketch below).
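One possible reading of "winner takes more" is sketched below in Python: all units respond to an input pattern, the Hebbian weight update is scaled by each unit's own activation, so the strongest responder is strengthened most and the differences between the units grow; the weights and the input pattern are assumed, illustrative values.

```python
# "Winner takes more" sketch: the Hebbian update is proportional to a unit's own
# activation, so the unit that already responds most strongly gains the most.

eta = 0.2
input_pattern = [1.0, 0.0, 1.0]
weights = [[0.3, 0.1, 0.4],   # unit A
           [0.2, 0.5, 0.1]]   # unit B

for step in range(3):
    activations = [sum(w * x for w, x in zip(row, input_pattern)) for row in weights]
    # Hebbian update scaled by the unit's own activation
    weights = [[w + eta * a * x for w, x in zip(row, input_pattern)]
               for row, a in zip(weights, activations)]
    print(step, [round(a, 2) for a in activations])
# The gap between unit A and unit B widens with every presentation of the pattern.
```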
3.9 Summary: ANN
- ANN are based on associative processing,
- Distributed representation of concepts,
- Knowledge is carried by the sum of all processing units and all their connections,
- Knowledge evolves by information reduction,
- Ability to describe conceptual structures.
4. Opportunities of ANN
- Some problems cannot be solved by conventional computers but can be solved by ANN, for instance:
- Pattern association and pattern classification,
- Regularity detection,
- Speech analysis,
- Processing of inaccurate or incomplete inputs,
- Image processing,
- ...
5. Reality of Using Neural Networks
- Full ANN hardware is not available, because we lack a mathematical description of the dynamic interactions in complex networks; only some simplified prototypes have been realized.
- The description problem is addressed by software-controlled simulations of neural networks running on conventional hardware.
- Another option is to use several conventional processors which are connected and work in parallel.
6. Summary
- This report was only an introduction to the really complex subject of artificial neural networks and their differences from the von Neumann architecture of computers.
- The best model for linguistic information processing is the human brain.
- The most promising artificial model for linguistic information processing is the model of artificial neural networks.
7. Sources
- Detlef Peter Zaun, Künstliche neuronale Netze und Computerlinguistik, Tübingen 1999.
- Teuvo Kohonen, Self-Organization and Associative Memory, Third Edition, Springer-Verlag, Berlin/Heidelberg 1984, 1988 and 1989.
- Günter Schmitt, Mikrocomputertechnik mit den Prozessoren der 68000-Familie, München/Wien 1987.
- Atsushi Akara, The Early Computers, Oxford University Press 2002.
- Michael Friedewald, Der Computer als Werkzeug und Medium, Berlin/Diepholz 1999.
- http://user.cs.tu-berlin.de/icoup/archiv/3.ausgabe/artikel/neumann.html
- http://rfhs8012.fh-regensburg.de/saj39122/jfroehl/diplom/e-11-text.html
- http://www.spinfo.uni-koeln.de/mweidner/subsym/subsym.html
- http://www.coli.uni-sb.de/hansu/what_is_cl.html