Reconnect - PowerPoint PPT Presentation

About This Presentation
Title:

Reconnect

Description:

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, ... In development: ALPS/BiCePs/BLIS ... – PowerPoint PPT presentation

Slides: 35
Provided by: willia112

Transcript and Presenter's Notes


1
Reconnect 04: Introduction to PICO
  • Cynthia Phillips, Sandia National Laboratories
  • Joint work with
  • Jonathan Eckstein, Rutgers
  • William E. Hart, Sandia National Laboratories

2
Parallel Computing Systems
  • A set of processors (from 2 up to tens of
    thousands) working together on a problem
  • communicating by messages (even if hidden from
    user)
  • Architectures
  • Grid
  • Network of workstations (LAN)
  • Beowulf cluster
  • Tightly-coupled system

3
Parallelism in Branch and Bound
  • Two sources of parallelism in B&B
  • Within subproblems
  • Across subproblems
  • Warning
  • Can solve problems otherwise unsolvable but
  • A constant-factor increase in processors (even
    10,000) cannot overcome exponential growth.
  • We still have to be clever
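The warning above can be made concrete with a little arithmetic. This is a minimal sketch, assuming an idealized search tree with a fixed branching factor and perfectly efficient processors; the function names are illustrative only.

```python
import math

def parallel_time(depth, processors, branching=2):
    """Idealized time to explore branching**depth nodes on `processors` CPUs."""
    return branching ** depth / processors

def extra_depth(processors, branching=2):
    """Extra tree levels that `processors` CPUs absorb in the same wall-clock time."""
    return math.log(processors, branching)

# 10,000 processors buy only about 13 extra levels of a binary tree.
print(round(extra_depth(10_000), 2))
```

Under these assumptions, multiplying the processor count by any constant adds only a logarithmic number of levels, which is why bounding and search order still matter.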

4
Parallelism Issues for Branch and Bound
  • In the best of cases, all the processors are busy
    all the time doing useful, independent work
  • Overhead (coordination, exchange of data)
  • Load balancing
  • What to do when the tree is small?
  • Tree shape depends on order of node evaluation
  • Can lead to slowdown anomalies
  • Try to emulate a good serial ordering
  • We'd do a lot better with a single processor
    that is 1000x faster

5
Parallel Experimental Algorithmics/Engineering
Issues
  • Inherent nondeterminism
  • Parallel random number generators
  • e.g. for randomized algorithms
  • Debugging

6
Solution Options for Integer Programming
  • Commercial codes (ILOG's CPLEX)
  • Good and getting better
  • Expensive
  • Serial (or modest SMP)
  • Free serial codes (ABACUS, MINTO, BCP)
  • Modest-level parallel codes (Symphony)
  • Grid parallelism (FATCOP)
  • In development: ALPS/BiCePs/BLIS
  • Massive parallelism: PICO (Parallel Integer and
    Combinatorial Optimizer)
  • Note: parallel B&B for simple bounding: PUBB,
    BoB/BOB, PPBB-lib, Mallba, Zram

7
Parallel Integer and Combinatorial Optimizer
(PICO)
  • Distributed memory (MPI), C++
  • Massively parallel (scalable)
  • General parallel Branch and Bound environment
  • Portable, flexible
  • Serial, small LAN, Cplant, ASCI Red, Red Storm
  • Allows exploitation of problem-specific
    knowledge/structure
  • Open Source release
  • Always support a free LP solver

8
PICO Features for Efficient Parallel B&B
  • Efficient processor use during ramp-up
  • Integration of heuristics to generate good
    solutions early
  • Efficient work storage/distribution
  • Load balancing
  • Non-preemptive proportional-share thread
    scheduler
  • Flexible hub/worker interaction
  • Subproblem states with flexible search strategy
  • Correct termination
  • Early output

9
What To Do With 9000 Processors and One
Subproblem?
  • Option 1: Presplitting
  • Make log P branching choices and expand all ways
    (P problems)
  • P processors
  • BAD!
  • Expands many problems that would be fathomed in a
    serial solution.
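Presplitting as described above can be sketched as follows. This is an illustration only, assuming binary branching variables; `presplit` and the variable names are hypothetical and not part of PICO.

```python
from itertools import product

def presplit(num_processors, branch_vars):
    """Fix ceil(log2 P) binary variables every possible way, one subproblem each."""
    k = (num_processors - 1).bit_length()  # ceil(log2(P)) for P >= 1
    if k > len(branch_vars):
        raise ValueError("not enough branching variables")
    chosen = branch_vars[:k]
    return [dict(zip(chosen, values)) for values in product((0, 1), repeat=k)]

subproblems = presplit(8, ["x1", "x2", "x3", "x4"])
print(len(subproblems))  # 8 subproblems, one per processor
```

Every one of these subproblems is expanded unconditionally, including those a serial search would have fathomed after seeing a good incumbent, which is the drawback the slide points out.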

10
PICO MIP Ramp-up
  • Serialize tree growth
  • All processors work in parallel on a single node
  • Parallelize
  • LP bounding
  • Preprocessing
  • Cutting plane generation
  • Incumbent Heuristics
  • Pseudocost (gradient) initialization
  • Work division by processor ID/rank
  • Crossover to parallel with perfect load balance
  • When there are enough subproblems to keep the
    processors busy
  • When single subproblems cannot effectively use
    parallelism
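Work division by processor rank during ramp-up might look like the sketch below. It assumes every processor holds the same list of work items (for example, candidate variables for pseudocost initialization) and takes a strided slice; `my_share` is a hypothetical helper, not a PICO function.

```python
def my_share(items, rank, num_procs):
    """Items handled by processor `rank` out of `num_procs` (round-robin split)."""
    return items[rank::num_procs]

candidates = list(range(10))
for rank in range(4):
    print(rank, my_share(candidates, rank, 4))
```

Because each processor computes its own share from its rank alone, no messages are needed to divide the work, and the shares differ in size by at most one item.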

11
Parallel Incumbent Search
  • Genetic algorithms
  • Decomposition-based methods (general)
  • Pivot, cut, and dive general heuristic
  • Custom Methods

12
Interior-Point Method for Solving the Root Problem
  • Mehrotra's predictor-corrector (primal-dual)
    method
  • Iterative method where the computational core of
    each iteration is the solution of a linear system
    with a matrix of the form A D A^T
  • A is the original LP constraint matrix.
  • D is a diagonal matrix that changes each
    iteration.
  • Direct Cholesky Solvers OK for moderate
    parallelism
  • Iterative methods
  • Preconditioning is a big issue
  • Support theory can help if the matrix has network
    structure
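A minimal sketch of the linear-algebra core mentioned above, assuming the system matrix is the normal-equations matrix A D A^T with D a positive diagonal, which is the usual symmetric positive definite form that Cholesky solvers target. It is dense, serial, pure Python and meant only to show the structure; a real solver would use sparse parallel factorization.

```python
import math

def normal_matrix(A, d):
    """Return M = A * diag(d) * A^T for a dense matrix A given as a list of rows."""
    m = len(A)
    return [[sum(A[i][k] * d[k] * A[j][k] for k in range(len(d)))
             for j in range(m)] for i in range(m)]

def cholesky(M):
    """Lower-triangular L with L * L^T = M (M symmetric positive definite)."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_spd(M, b):
    """Solve M x = b by Cholesky factorization plus two triangular solves."""
    L = cholesky(M)
    n = len(b)
    y = [0.0] * n
    for i in range(n):                      # forward substitution: L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):            # back substitution: L^T x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

A = [[1, 1, 0], [0, 1, 1]]   # toy constraint matrix (assumed data)
d = [1, 2, 3]                # diagonal of D for one iteration (assumed data)
print(solve_spd(normal_matrix(A, d), [1.0, 1.0]))
```

Only `d` changes between iterations, so the sparsity pattern of A D A^T stays fixed while its values must be refactored each time.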

13
Resolving LP on Subproblems
[Figure: original LP feasible region with the LP optimal solution, the integer optimum, and a cutting plane (valid inequality or branch constraint)]
  • Dual simplex is much faster than starting over
  • Need parallel dual simplex!

14
Hubs and Workers
  • Each hub controls some number of workers (can
    work itself)
  • Setting parameters, can go from fully centralized
    to fully distributed
  • Subproblem pools at both the hub and workers
  • Heap (best-first), stack (depth-first), queue
    (breadth-first), custom
  • Hubs only keep tokens
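The pool types listed above map onto familiar containers. The sketch below is an assumption about how such a pool could be organized, not PICO's actual class; `SubproblemPool` and its strategy names are made up for illustration.

```python
import heapq
from collections import deque

class SubproblemPool:
    """Pool of (bound, subproblem) pairs with a selectable search order."""

    def __init__(self, strategy):
        if strategy not in ("best", "depth", "breadth"):
            raise ValueError(strategy)
        self.strategy = strategy
        self.items = deque() if strategy == "breadth" else []
        self.counter = 0  # tie-breaker so subproblems are never compared

    def push(self, bound, sp):
        if self.strategy == "best":
            heapq.heappush(self.items, (bound, self.counter, sp))
            self.counter += 1
        else:
            self.items.append((bound, sp))

    def pop(self):
        if self.strategy == "best":      # heap: lowest bound first
            bound, _, sp = heapq.heappop(self.items)
            return bound, sp
        if self.strategy == "depth":     # stack: newest first
            return self.items.pop()
        return self.items.popleft()      # queue: oldest first
```

For example, after pushing bounds 5, 2 and 9 in that order, a "best" pool returns the subproblem with bound 2, a "depth" pool the one with bound 9, and a "breadth" pool the one with bound 5.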

15
Subproblem Movement
  • Hub → Worker
  • When worker has low load or low-quality local
    pool
  • Worker → Hub
  • Draw back when hub out of work and cluster
    unbalanced
  • Send new subproblem tokens to hub
    (probabilistically) depending on load
  • Probabilistically scatter tokens to a random hub.
    If load in cluster is high relative to others,
    scatter probability increases.
  • Setting parameters, go from pure master-slave to
    local
  • Tradeoffs: communication, processor utilization,
    approximation of serial search order
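Probabilistic scattering could be sketched as below. The slide only says that the scatter probability rises when a cluster's load is high relative to others; the specific formula, the `base` and `cap` parameters, and both function names are assumptions made for illustration.

```python
import random

def scatter_probability(cluster_load, average_load, base=0.1, cap=0.9):
    """Hypothetical rule: scale a base probability by relative cluster load."""
    if average_load <= 0:
        return base
    return min(cap, base * cluster_load / average_load)

def choose_hub(my_hub, num_hubs, probability, rng=random):
    """Return the hub that should receive a new subproblem token."""
    if num_hubs > 1 and rng.random() < probability:
        other = rng.randrange(num_hubs - 1)   # uniform over the other hubs
        return other if other < my_hub else other + 1
    return my_hub                             # keep the token in this cluster

p = scatter_probability(cluster_load=300, average_load=100)
print(choose_hub(my_hub=2, num_hubs=5, probability=p))
```

Setting the probability to 0 keeps all tokens local, and raising it trades extra communication for a more even spread of work.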

16
Subproblem Movement/Data Storage
[Figure: hub and worker subproblem pools. Labels: Hub, Worker, T (token), SP (subproblem), SP Server, SP Receiver]

17
Load Balancing
  • Hub pullback
  • Random scattering
  • Rendezvous
  • Hubs determine load (function of quantity and
    quality)
  • Use binary tree of hubs
  • Determine donors and receivers, match them,
    exchange
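The donor/receiver matching step can be sketched as follows. This is a simplified, centralized illustration: it assumes the hub loads have already been gathered (in PICO this happens over a tree of hubs), and the tolerance threshold and pairing rule are assumptions, not the actual algorithm.

```python
def match_donors_receivers(loads, tolerance=0.25):
    """Pair overloaded hubs with underloaded ones.

    Returns (donor, receiver, amount) triples, where amount would
    equalize the two loads in that pair.
    """
    average = sum(loads.values()) / len(loads)
    donors = sorted((h for h, l in loads.items() if l > average * (1 + tolerance)),
                    key=lambda h: -loads[h])      # heaviest first
    receivers = sorted((h for h, l in loads.items() if l < average * (1 - tolerance)),
                       key=lambda h: loads[h])    # lightest first
    return [(d, r, (loads[d] - loads[r]) / 2) for d, r in zip(donors, receivers)]

print(match_donors_receivers({"h0": 100, "h1": 10, "h2": 50, "h3": 40}))
```

Here "load" is a single number per hub; the slide notes that the real measure combines the quantity and quality of the subproblems held.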

18
Non-Preemptive Scheduler is Sufficient for PICO
  • Processes are cooperating
  • Control is returned voluntarily so data
    structures left in clean state
  • No memory access conflicts, no locks
  • PICO has its own thread scheduler
  • High priority, short threads are round robin and
    done first
  • Hub communications, incumbent broadcasts, sending
    subproblems
  • If these are delayed by long tasks, it could lead to
  • Idle processors
  • Processors working on low-quality work
  • Compute threads are proportional share (stride)
    scheduling
  • Adjust during computation (e.g. between lower and
    upper-bounding)
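Stride scheduling, the proportional-share policy named above, can be sketched in a few lines. This is a simplified textbook version (all passes start at zero), not PICO's scheduler; the thread names and ticket counts are made up.

```python
import heapq

def stride_schedule(tickets, steps, big=10_000):
    """Return the order in which threads run under stride scheduling.

    Each thread advances its 'pass' by big // tickets every time it runs;
    the thread with the smallest pass runs next.
    """
    heap = [(0, name) for name in sorted(tickets)]
    heapq.heapify(heap)
    order = []
    for _ in range(steps):
        current_pass, name = heapq.heappop(heap)
        order.append(name)
        heapq.heappush(heap, (current_pass + big // tickets[name], name))
    return order

order = stride_schedule({"lower": 3, "upper": 1}, steps=8)
print(order.count("lower"), order.count("upper"))  # 6 2
```

Changing a thread's ticket count during the run shifts its share of processor time, which matches the slide's remark about adjusting between lower and upper bounding.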

19
(No Transcript)
20
Subproblem States
  • Boundable
  • Being Bounded
  • Bounded
  • Being Separated
  • Separated
  • Dead
  • Handlers: lazy, eager, hybrid, build your own
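The subproblem states above can be modeled as a small state machine. The sketch assumes a subproblem advances through the states in the order listed and may become dead from any live state; both rules are assumptions for illustration, as are the names `SPState` and `can_move`.

```python
from enum import IntEnum

class SPState(IntEnum):
    BOUNDABLE = 0
    BEING_BOUNDED = 1
    BOUNDED = 2
    BEING_SEPARATED = 3
    SEPARATED = 4
    DEAD = 5

def can_move(old, new):
    """True if a subproblem may go from state `old` to state `new`."""
    if old == SPState.DEAD:
        return False                      # dead subproblems never change
    return new == SPState.DEAD or new == old + 1

print(can_move(SPState.BOUNDABLE, SPState.BEING_BOUNDED))  # True
```

The intermediate "being" states let an expensive bounding or separation step be suspended and resumed, so the scheduler can interleave it with other work.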
21
Early Output
  • Problem: if you have to abort a long run, you want
    to know the variable settings for the incumbent
  • May be good enough to stop
  • Otherwise seed new search with the incumbent
    value
  • PICO will save a new incumbent if
  • It is a strict improvement over the last saved
    value (or is the first)
  • A sufficient time has passed since the last write
  • Requires a new message-triggered thread in
    parallel
  • Hub, incumbent holder, I/O processor
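The two save conditions above can be expressed as a small decision rule. This is a sketch only: `IncumbentWriter` is a hypothetical helper, time is passed in explicitly, and the actual file write is left out.

```python
class IncumbentWriter:
    """Decide when a new incumbent is worth writing to disk."""

    def __init__(self, min_interval, minimize=True):
        self.min_interval = min_interval  # seconds required between writes
        self.minimize = minimize
        self.last_value = None
        self.last_time = None

    def offer(self, value, now):
        """Return True (and record the write) if `value` should be saved."""
        if self.last_value is not None:
            better = (value < self.last_value if self.minimize
                      else value > self.last_value)
            if not better or now - self.last_time < self.min_interval:
                return False
        self.last_value, self.last_time = value, now
        return True

writer = IncumbentWriter(min_interval=60)
print(writer.offer(100, now=0), writer.offer(90, now=10), writer.offer(90, now=70))
```

The first incumbent is always saved; later ones are saved only if they are strictly better than the last saved value and enough time has passed, which keeps I/O from dominating when incumbents improve rapidly.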

22
Serial Class Structure - Inheritance
  • Branching classes - control search
  • Branchsub classes - subproblems (tree nodes)
  • Problem data classes - derived only

[Figure: class inheritance diagram. Labels: PICO Core, PICO MIP Core, PICO BC Core, MIP Application, Knapsack, Nonlinear Branch & Prune, AMPL Interface (optional)]

23
Required Methods for Derived Node Class
  • bGlobal( ) - subproblem pointer to branching
    (search control)
  • setRootComputation( ) - create the root of the
    search tree
  • boundComputation( ) - compute subproblem lower
    bound (for min)
  • splitComputation( ) - determine how to partition
    the subproblem
  • makeChild( ) - create a child subproblem from a
    split parent
  • candidateSolution( ) - determine whether a
    proposed solution is viable candidate for
    optimality
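To show how these methods fit together, here is a Python analogue of a derived node class with a tiny serial driver. PICO itself is C++; the method names mirror the slide, but the signatures, return values, the `CoverNode` toy problem (pick items of minimum total cost whose weights meet a demand), and the `solve` loop are all assumptions for illustration.

```python
import heapq
import math
from abc import ABC, abstractmethod

class BranchSub(ABC):
    """Hypothetical base class listing the required node methods."""
    @abstractmethod
    def bGlobal(self): ...
    @abstractmethod
    def setRootComputation(self): ...
    @abstractmethod
    def boundComputation(self): ...
    @abstractmethod
    def splitComputation(self): ...
    @abstractmethod
    def makeChild(self, which): ...
    @abstractmethod
    def candidateSolution(self): ...

class CoverNode(BranchSub):
    def __init__(self, problem):
        self.problem = problem            # (costs, weights, demand)
        self.fixed = []                   # 0/1 decisions for the first items

    def bGlobal(self):
        return self.problem               # stand-in for the search-control object

    def setRootComputation(self):
        self.fixed = []

    def _weight(self):
        return sum(w for w, x in zip(self.problem[1], self.fixed) if x)

    def boundComputation(self):
        costs, weights, demand = self.problem
        reachable = self._weight() + sum(weights[len(self.fixed):])
        cost = sum(c for c, x in zip(costs, self.fixed) if x)
        return cost if reachable >= demand else math.inf  # inf = infeasible

    def splitComputation(self):
        return 2                          # branch on the next item: skip or take

    def makeChild(self, which):
        child = CoverNode(self.problem)
        child.fixed = self.fixed + [which]
        return child

    def candidateSolution(self):
        return self._weight() >= self.problem[2]  # demand met, rest set to 0

def solve(problem):
    """Best-first serial search; returns (best cost, fixings)."""
    root = CoverNode(problem)
    root.setRootComputation()
    pool, tick, best = [(root.boundComputation(), 0, root)], 1, (math.inf, None)
    while pool:
        bound, _, node = heapq.heappop(pool)
        if bound >= best[0]:
            continue                      # fathomed by the incumbent
        if node.candidateSolution():
            best = (bound, node.fixed)
            continue
        for which in range(node.splitComputation()):
            child = node.makeChild(which)
            child_bound = child.boundComputation()
            if child_bound < best[0]:
                heapq.heappush(pool, (child_bound, tick, child))
                tick += 1
    return best

print(solve(([4, 3, 5], [3, 2, 4], 5)))  # (7, [1, 1])
```

The application supplies only problem-specific bounding, splitting and feasibility logic; the search loop knows nothing about the problem, which is the separation the slide describes.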

24
Optional Customizations
  • Incumbent heuristic
  • Incumbent representation/update
  • Solution output
  • Solution validation
  • Preprocessing
  • Override default parameters
  • In MIP
  • Custom cutting planes
  • Adjust branching priorities
  • Plan to add more complex branching strategies

25
All of PICO's Parallelism Comes (Almost) For Free
[Figure: PICO serial core, PICO parallel core, serial application, parallel application]
  • User must
  • Define serial application (debug in serial)
  • Describe how to pack/unpack data (using a generic
    packing tool)
  • C++ inheritance gives parallel management
  • User may add threads to
  • Share global data
  • Exploit problem-specific parallelism
  • MIP pseudocosts
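The pack/unpack requirement amounts to serializing a subproblem into a flat byte buffer so it can travel in a message. The sketch below uses Python's `struct` module as a stand-in for a generic packing tool; the subproblem layout (a bound plus a list of 0/1 fixings) and the function names are assumptions.

```python
import struct

HEADER = "<dI"  # little-endian: double bound, unsigned count of fixings

def pack_subproblem(bound, fixed):
    """Serialize a bound and a list of 0/1 fixings into bytes."""
    return struct.pack(f"{HEADER}{len(fixed)}B", bound, len(fixed), *fixed)

def unpack_subproblem(buffer):
    """Inverse of pack_subproblem."""
    bound, count = struct.unpack_from(HEADER, buffer, 0)
    fixed = list(struct.unpack_from(f"<{count}B", buffer, struct.calcsize(HEADER)))
    return bound, fixed

message = pack_subproblem(12.5, [1, 0, 1])
print(unpack_subproblem(message))  # (12.5, [1, 0, 1])
```

Once a subproblem can be packed and unpacked this way, the parallel layer can move it between processors without knowing anything else about the application.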

26
Utilib
  • Predates STL
  • Abstract data types: arrays, heaps, hash tables,
    balanced trees
  • Random number generators
  • Hash tables
  • work well for doubles very close in value
  • Arrays offer
  • Protected access (bounds checking)
  • Sharing
  • PackBuffer methods facilitate parallelization

27
Pieces of PICO
  • PICO requires
  • utilib (for data structures, math, etc)
  • COIN (an IBM-sponsored optimization interface
    standard)
  • Base interface to LP solvers
  • We add more PICO-specific functionality
  • Cut generation library
  • An LP solver
  • Currently support CPLEX, SoPlex, CLP

28
Using A Math Programming Language
  • How easily can one bring up applications?
  • In our world, applications are a moving target:
    we need agility

29
A Mathematical Programming Language (AMPL)
  • AMPL builds the matrix.
  • Nice cross between programming language and LaTeX
    (math view)

30
AMPL-PICO Interface
[Figure: AMPL-PICO data flow. Labels: Model Files, Data Files, AMPL, LP, IP, Exact Solver PICO, Cutting Planes, Compute Approximate Solution]
  • Write cutting-plane and approximate-solution code
    using AMPL variables
  • Mapping transparent

31
AMPL-PICO Interface
  • Standard AMPL interfaces
  • Customized PICO Interface

32
Availability
  • PICO will be free under the GNU Lesser General
    Public License (LGPL)
  • MIP: requires a serial LP solver
  • CPLEX is expensive, but many companies and
    universities have it
  • CLP is free (through COIN)
  • Part of ACRO (A Common Repository for Optimizers)
  • http://software.sandia.gov/Acro/
  • Need password for CVS checkout; otherwise
    tarballs

33
Open Problems (Wish List)
  • Tools we'd like to see
  • Parallel matrix generation from a
    math-programming interface
  • Parallel (sparse) dual simplex solver for linear
    programming
  • Open algorithms questions
  • Ramp-up management: multiple subproblems in
    parallel

34
Development Team
  • Core Team
  • Jonathan Eckstein (RUTCOR): PICO core
  • Bill Hart (Sandia): scheduler, utilib, AMPL
    interface, design, etc.
  • Cindy Phillips (Sandia): MIP layer, MIP
    applications
  • Other Developers
  • Harvey Greenberg (UCD): preprocessor design
  • Vitus Leung (Sandia): preprocessor
  • Tod Morrison (UCD, student): SoPlex interface,
    porting
  • Mikhail Nediak (RUTCOR student, now McMaster):
    MIP heuristic
  • Konrad Borys (RUTCOR student): core
    templatization, heuristic integration
  • Mike Eldred (Sandia): DAKOTA optimization
    framework
  • Ojas Parekh (ex-CMU student, Sandia): SoPlex
    interface
  • Mario Alleva (Sandia): porting