Title: Evolutionary Computation on the Connex Architecture
- István Lorentz (1), Mihaela Malita (2), Razvan Andonie (3, presenter)
- (1) Electronics and Computers Department, Transylvania University of Brasov, Romania
- (2) Computer Science Department, Saint Anselm College, Manchester, NH, USA
- (3) Computer Science Department, Central Washington University, Ellensburg, WA, USA
MAICS 2011: The 22nd Midwest Artificial Intelligence and Cognitive Science Conference
Presentation Outline
- The Connex Architecture
- (more in Prof. Gheorghe M. Stefan)
- Evolutionary Algorithms (EA)
- Parallelizing EA on Connex
- Example problems
- Results
- Conclusions
The Connex Chip
- The Connex Array
  - Many-core data-parallel area of 1024 Processing Cells (PC)
  - Area: 50 mm² for the 1024-PC array, including 1 MB of memory and the two controllers
  - Clock speed: 400 MHz
- Also on the chip
  - Multi-core area: 4 MIPS cores
  - Speculative parallel pipe of 8 PE
- Interfaces
  - DDR, PCI
  - Video and audio interfaces for 2 HDTV channels
- Total power: 5 W
- Total area: 82 mm²
- 65 nm implementation
The Connex Array
- Sequencer
  - Issues in each cycle (on a 2-stage pipe) one instruction for the Connex Array and one instruction for itself
- I/O Controller
  - Controls a 6.4 GB/s I/O channel
  - Works in parallel with code running on the Connex Array
- Processing Cell
  - Integer unit
  - Data memory
  - Boolean (predicate) unit
Genetic Algorithms (GA)
- Chromosomes are represented as vectors of integer components in Connex
- Maximum chromosome length: 1024 elements
- Population forms a matrix
- Processing blocks are parallelized
Flowchart: Initialize population randomly -> Crossover -> Mutation -> Evaluation -> Select new generation -> Convergence or limit? (No: repeat; Yes: STOP)
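The flowchart above can be sketched sequentially in Python. This is only an illustration under assumptions not stated on the slide: the gene range `low..high`, the elitist truncation selection, and the fixed generation limit (standing in for "convergence or limit") are hypothetical choices, and nothing here runs on Connex.

```python
import random

def one_point_crossover(x, y, rng):
    # Genes before the cut come from x, the rest from y.
    position = rng.randrange(len(x))
    return x[:position] + y[position:]

def mutate(x, rng, low, high):
    # Replace one randomly chosen gene with a new random integer.
    child = list(x)
    child[rng.randrange(len(child))] = rng.randint(low, high)
    return child

def genetic_algorithm(fitness, length, pop_size=20, generations=50,
                      low=0, high=9, seed=0):
    rng = random.Random(seed)
    # Population as a matrix: one row per integer chromosome.
    population = [[rng.randint(low, high) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # "limit" branch of the flowchart
        offspring = []
        for _ in range(pop_size):
            a, b = rng.sample(population, 2)
            child = one_point_crossover(a, b, rng)
            offspring.append(mutate(child, rng, low, high))
        # Evaluation and selection (assumed scheme): keep the pop_size
        # individuals with the lowest fitness value.
        population = sorted(population + offspring, key=fitness)[:pop_size]
    return min(population, key=fitness)
```

Calling `genetic_algorithm(sum, length=8)` minimizes the sum of the genes; because the assumed selection keeps the best individual, the result is never worse than the best member of the initial population.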
Evolution strategy (ES)
- Similar algorithm to GA
- Population and mutation parameters are encoded in vectors
- Recombination forms a new individual from multiple parents
- Mutation adds a Gaussian-distributed random variable to each vector component
- Deterministic selection of the new generation, based on fitness ranking
Flowchart: Initialize population randomly -> Recombination -> Mutation -> Evaluation -> Select new parent generation -> Convergence or limit? (No: repeat; Yes: STOP)
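A minimal sequential sketch of the ES loop, assuming details the slide does not give: two-parent averaging as the recombination, a log-normal update of the step sizes with factor 0.3, and a (mu + lambda) ranking selection. All parameter names and default values are hypothetical.

```python
import math
import random

def evolution_strategy(fitness, dim, mu=5, lam=20, generations=100, seed=0):
    rng = random.Random(seed)
    # Each individual is (x, sigma): object variables plus the mutation
    # step sizes, both stored as vectors.
    parents = [([rng.uniform(-2.0, 2.0) for _ in range(dim)], [0.5] * dim)
               for _ in range(mu)]
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            # Recombination: component-wise average of two random parents.
            (xa, sa), (xb, sb) = rng.sample(parents, 2)
            x = [(a + b) / 2 for a, b in zip(xa, xb)]
            sigma = [(a + b) / 2 for a, b in zip(sa, sb)]
            # Mutation of the step sizes (assumed log-normal rule).
            sigma = [s * math.exp(0.3 * rng.gauss(0.0, 1.0)) for s in sigma]
            # Mutation: add a Gaussian random variable to every component.
            x = [xi + rng.gauss(0.0, si) for xi, si in zip(x, sigma)]
            offspring.append((x, sigma))
        # Deterministic selection: rank by fitness and keep the best mu.
        ranked = sorted(parents + offspring, key=lambda ind: fitness(ind[0]))
        parents = ranked[:mu]
    return min(parents, key=lambda ind: fitness(ind[0]))[0]
```

For example, `evolution_strategy(lambda v: sum(c * c for c in v), dim=3)` searches for the minimum of the sphere function.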
Parallel Crossover
- Combines genes of two individuals (parents)
- Example: 1-point crossover at a random position, in Vector-C

    vector crossover(vector X, vector Y)
    {
        vector C;
        int position = rand(VECTOR_SIZE);
        where (i < position)
            C = X;
        elsewhere
            C = Y;
        return C;
    }

- Uses Connex's parallel-if construct: where (cond) ... elsewhere ...
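The where/elsewhere selection can be emulated sequentially with a boolean mask. In this Python sketch the helper `where` is a hypothetical stand-in for the Connex construct, and the cut position is passed in explicitly so the result is reproducible.

```python
def where(mask, then_values, else_values):
    # Per-element select: take then_values where the mask is true,
    # else_values elsewhere.
    return [t if m else e for m, t, e in zip(mask, then_values, else_values)]

def one_point_crossover(x, y, position):
    # The index vector i is compared with the cut position in every cell.
    mask = [i < position for i in range(len(x))]
    return where(mask, x, y)

x = [1, 1, 1, 1, 1, 1]
y = [0, 0, 0, 0, 0, 0]
child = one_point_crossover(x, y, 2)  # first two genes from x, rest from y
```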
Parallel Mutation
- A single position is selected at random

    vector mutate(vector X)
    {
        int pos = rand(VECTOR_SIZE);
        float amount = rand11();
        where (i == pos)
            X += amount;
        return X;
    }

- The operation affects only the selected position
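The same masked update, emulated in Python. This sketch assumes that `rand11()` returns a value in [-1, 1] and that the amount is added to the selected gene; neither detail is confirmed by the slide.

```python
import random

def mutate(x, rng):
    pos = rng.randrange(len(x))
    amount = rng.uniform(-1.0, 1.0)  # assumed range of rand11()
    # Only the cell whose index equals pos is active.
    mask = [i == pos for i in range(len(x))]
    return [xi + amount if m else xi for xi, m in zip(x, mask)]

original = [0.0] * 8
mutated = mutate(original, random.Random(1))
```

Every element outside the selected position is returned unchanged.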
Evaluation of fitness function
- The fitness functions that can be evaluated efficiently on Connex are those composed of:
  1. a data-parallel stage (local computation on each PC), followed by
  2. a parallel reduction (sum)
- For example:
  - Sum of squared differences
  - Knapsack problem: sum of weighted items
  - Travelling salesman problem: sum of distances between cities in a route
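The two-stage pattern can be illustrated in Python as an element-wise map followed by a sum. The pairwise `tree_sum` below is one common way to organize a parallel reduction and is an assumption here, not a description of the Connex reduction network; all function names are hypothetical.

```python
def tree_sum(values):
    # Pairwise reduction: each pass halves the number of partial sums.
    values = list(values)
    while len(values) > 1:
        if len(values) % 2:
            values.append(0)  # pad odd lengths with the neutral element
        values = [values[k] + values[k + 1] for k in range(0, len(values), 2)]
    return values[0] if values else 0

def sum_squared_differences(x, target):
    local = [(a - b) ** 2 for a, b in zip(x, target)]  # data-parallel stage
    return tree_sum(local)                             # reduction stage

def knapsack_value(selection, values):
    # selection[k] is 1 if item k is packed, 0 otherwise.
    return tree_sum(s * v for s, v in zip(selection, values))
```

For instance, `sum_squared_differences([1, 2, 3], [1, 0, 0])` evaluates to 13 and `knapsack_value([1, 0, 1], [10, 20, 30])` to 40.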
Example 1. The Rosenbrock function
- Benchmark problem for optimization
- Vector-C implementation

    where (i < N)
        Xsh = rotateLeft(X, 1);     // Xsh[i] = X[i+1]
    where (i < (N-1))
    {
        X2 = X * X;
        Xsh -= X2;                  // X[i+1] - X[i]^2
        Xsh *= Xsh * 100;           // 100 * (X[i+1] - X[i]^2)^2
        X2 = 1 - X;
        X2 = X2 * X2;               // (1 - X[i])^2
        X2 += Xsh;
    }
    return sumv(X2);
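For reference, the standard Rosenbrock function is f(x) = sum over i < N-1 of 100*(x[i+1] - x[i]^2)^2 + (1 - x[i])^2. The Python sketch below follows the same steps as the vector code (rotate, local terms, sum); `rotate_left` is a hypothetical stand-in for `rotateLeft`, and the last cell is given a zero term to mirror the `i < N-1` mask.

```python
def rotate_left(x, k):
    # Element i of the result is element i + k of x (wrapping around).
    return x[k:] + x[:k]

def rosenbrock(x):
    n = len(x)
    xsh = rotate_left(x, 1)      # xsh[i] = x[i + 1]
    terms = [0.0] * n            # cell N-1 stays inactive
    for i in range(n - 1):       # where (i < N-1)
        a = xsh[i] - x[i] * x[i]
        b = 1 - x[i]
        terms[i] = 100 * a * a + b * b
    return sum(terms)            # reduction stage
```

The global minimum is at x = (1, ..., 1), where the function value is 0.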
Example 2. The molecular distance geometry problem (MDGP)
- The problem: given a set of distance measurements between atoms, determine their Cartesian coordinates
- Formulated as a global optimization problem: minimize the mismatch between the computed and the given distances
- Not all distances are known
- Some distances can be given as upper and lower bounds
Representing MDGP on Connex
- Each given distance d(i,j) is mapped to a processing element
- Some PCs share vertices
- Shared vertices also share random generator seeds
- No interprocessor communication (except parallel reduction)
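One possible reading of the shared-seed idea, sketched in Python: each cell keeps its own copy of the two endpoints of its distance, and the random perturbation of a vertex is derived only from the vertex id and a per-generation seed, so all copies of a shared vertex stay identical without any communication. The cell layout, the seed formula and the 2D coordinates are assumptions made for illustration.

```python
import random

def vertex_noise(vertex, generation_seed):
    # Same vertex id and generation seed give the same random stream,
    # whichever cell computes it.
    rng = random.Random(generation_seed * 1_000_003 + vertex)
    return rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)

def perturb_cells(cells, generation_seed):
    new_cells = []
    for cell in cells:  # every cell works on its local data only
        updated = dict(cell)
        for end in ("i", "j"):
            dx, dy = vertex_noise(cell[end], generation_seed)
            x, y = cell["p" + end]
            updated["p" + end] = (x + dx, y + dy)
        new_cells.append(updated)
    return new_cells

edges = [(0, 1), (1, 2), (0, 2)]  # one known distance per cell
coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0)}
cells = [{"i": i, "j": j, "pi": coords[i], "pj": coords[j]} for i, j in edges]
cells = perturb_cells(cells, generation_seed=7)
```

After the update, the copy of vertex 1 held by the first cell equals the copy held by the second cell.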
Running MDGP on Connex
- Evaluate distances

    // Xi, Yi: vertices
    // D: vector of known distances
    void evaluateDist(vector Xi, Yi, D)
    {
        vector Dx, Dy;
        Dx = Xik - Xjk;
        Dy = Yik - Yjk;
        Dx *= Dx; Dy *= Dy;
        Dx += Dy;
        return sumAbsDiff(Dx, D);
    }
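A sequential Python sketch of the distance evaluation, with one list element per processing cell. It assumes 2D coordinates, explicit endpoint vectors for both ends of each distance, and that `d` holds squared distances (no square root is visible in the slide's code); these are assumptions, not details confirmed by the source.

```python
def evaluate_dist(xi, yi, xj, yj, d):
    # Data-parallel stage: each cell k handles one distance between
    # endpoints (xi[k], yi[k]) and (xj[k], yj[k]).
    dx = [a - b for a, b in zip(xi, xj)]
    dy = [a - b for a, b in zip(yi, yj)]
    squared = [u * u + v * v for u, v in zip(dx, dy)]
    # Reduction stage: sum of absolute differences to the known values.
    return sum(abs(s - k) for s, k in zip(squared, d))

# Two cells: a 3-4-5 triangle leg and a unit step along x.
error = evaluate_dist([0, 0], [0, 0], [3, 1], [4, 0], [25, 1])
```

A result of 0 means the coordinates reproduce every given distance exactly.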
Results

    Operation            Tpar    Tseq   Speedup
    AB                      1    1024   N
    xorshift 128           13   13312   N
    sumAbsDiffs             7    4096   0.5 N
    1-Point Crossover       3    2048   0.6 N
    Uniform Crossover      15   14350   0.9 N
    Uniform Mutation       33   21172   0.6 N
    HS Mutation           107   71506   0.6 N
    Rosenbrock             14   14325   N
    evaluateDist           13   10240   0.7 N

Summary of operations: parallel instruction counts (Tpar), sequential instruction counts (Tseq) and speedups, where N = 1024 is the vector size.
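The speedup column can be checked as Tseq / Tpar expressed as a fraction of N. The short Python sketch below recomputes it from the instruction counts in the table; the function name is hypothetical. The recomputed fractions (for example 0.57 for sumAbsDiffs) suggest that the slide's values are truncated to one digit.

```python
N = 1024  # vector size

# (Tpar, Tseq) instruction counts copied from the table.
counts = {
    "xorshift 128": (13, 13312),
    "sumAbsDiffs": (7, 4096),
    "1-Point Crossover": (3, 2048),
    "Uniform Crossover": (15, 14350),
    "Uniform Mutation": (33, 21172),
    "HS Mutation": (107, 71506),
    "Rosenbrock": (14, 14325),
    "evaluateDist": (13, 10240),
}

def speedup_in_n(tpar, tseq, n=N):
    # Speedup Tseq / Tpar, expressed as a multiple of the vector size.
    return tseq / tpar / n

for name, (tpar, tseq) in counts.items():
    print(f"{name:18s} {tseq / tpar:8.1f} = {speedup_in_n(tpar, tseq):.2f} N")
```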
Conclusions
- The Connex chip is well suited to parallelizing evolutionary algorithms through vectorization
- With horizontal data mapping, we can benefit from parallel reduction for a certain class of optimization problems