Title: Public Computing Challenges and Solutions
Public Computing - Challenges and Solutions
- Yi Pan
- Professor and Chair of CS
- Professor of CIS
- Georgia State University
- Atlanta, Georgia, USA
- AINA 2007
- May 21, 2007
Outline
- What is Grid Computing?
- Virtual Organizations
- Types of Grids
- Grid Components
- Applications
- Grid Issues
- Conclusions
Outline - continued
- Public Computing and the BOINC Architecture
- Motivation for New Scheduling Strategies
- Scheduling Algorithms
- Testing Environment and Experiments
- MD4 Password Hash Search
- Avalanche Photodiode Gain and Impulse Response
- Gene Sequence Alignment
- Peer to Peer Model and Experiments
- Conclusion and Future Research
What is Grid Computing?
- Analogy is to the power grid
- Heterogeneous and geographically dispersed
- Standards allow for transportation of power
- Standards define the interface with the grid
- Non-trivial overhead of managing movement and storage of power
- Economies of scale compensate for this overhead, allowing for cheap, accessible power
A Computational Power Grid
- Goal is to make computation a utility
- Computational power, data services, and peripherals (graphics accelerators, particle colliders) are provided in a heterogeneous, geographically dispersed way
- Standards allow for transportation of these services
- Standards define the interface with the grid
- Architecture provides for management of resources and controlling access
- Large amounts of computing power should be accessible from anywhere in the grid
Virtual Organizations
- Independent organizations come together to pool grid resources
- Component organizations could be different research institutions, departments within a company, individuals donating computing time, or anything with resources
- Formation of the VO should define participation levels, resources provided, expectations of resource use, accountability, and economic issues such as charges for resources
- Goal is to allow users to exploit resources throughout the VO transparently and efficiently
Types of Grids
- Computational Grid
- Data Grid
- Scavenging Grid
- Peer-to-Peer
- Public Computing
Computational Grids
- Traditionally used to connect high performance computers between organizations
- Increases utilization of geographically dispersed computational resources
- Provides more parallel computational power to individual applications than is feasible for a single organization
- Most traditional grid projects concentrate on these types of grids
- Globus and OGSA
Data Grids
- Distributed data sources
- Queries of distributed data
- Sharing of storage and data management resources
- The D0 Particle Physics Data Grid allows access to both compute and data resources for huge amounts of physics data
- Google
Scavenging Grids
- Harness idle cycles on systems, especially user workstations
- Parallel application must be quite granular to take advantage of large amounts of weak computing power
- Grid system must support terminating and restarting work when systems cease idling
- Condor system from the University of Wisconsin
Peer-to-Peer
- Converging technology with traditional grids
- Contrasts with grids in having little infrastructure and high fault tolerance
- Highly scalable for participation, but difficult to locate and monitor resources
- Current P2P systems like Gnutella, Freenet, and FastTrack concentrate on data services
Public Computing
- Also converging with grid computing
- Often communicates through a central server, in contrast with peer-to-peer technologies
- Again scalable with participation
- Adds even greater impact of multiple administrative domains, as participants are often untrusted and unaccountable
Public Computing Examples
- SETI@home (http://setiathome.ssl.berkeley.edu/): Search for Extraterrestrial Intelligence in radio telescope data (UC Berkeley)
- Has more than 5 million participants
- The most powerful computer, IBM's ASCI White, is rated at 12 TeraFLOPS and costs $110 million. SETI@home currently gets about 15 TeraFLOPS and has cost $500K so far.
More Public Computing Examples
- Folding@home project (http://folding.stanford.edu) for molecular simulation aimed at new drug discovery
- Distributed.net (http://distributed.net) for cracking the RC5 64-bit encryption algorithm; used more than 300,000 nodes over 1,757 days
Grid Components
- Authentication and Authorization
- Resource Information Service
- Monitoring
- Scheduler
- Fault Tolerance
- Communication Infrastructure
Authentication and Authorization
- Important for allowing users to cross the administrative boundaries in a virtual organization
- System security for jobs outside the administrative domain is currently rudimentary
- Work being done on sandboxing, better job control, and development environments
Resource Information Service
- Used in resource discovery
- Leverages existing technologies such as LDAP and UDDI
- Information service must be able to report very current availability and load data
- Balanced with the overhead of updating data
Monitoring
- Raw performance characteristics are not the only measurement of resource performance
- Current and expected loads can have a tremendous impact
- Balance between accurate performance data and the additional overhead of monitoring systems and tracking that data
Scheduler
- Owners of systems are interested in maximizing throughput
- Users are interested in maximizing runtime performance
- Both offer challenges with crossing administrative boundaries
- Unique issues such as co-allocation and co-location
- Interesting work being done in scheduling, like market-based scheduling
Fault Tolerance
- More work exploring fault tolerance in grid systems, leveraging peer-to-peer and public computing research
- Multiple administrative domains in a VO challenge the reliability of resources
- Faults can refer not only to resource failure but violation of service level agreements (SLAs)
- Impact on fault tolerance if there is no accountability for failure
Communication Infrastructure
- Currently most grids have robust communication infrastructure
- As more grids are deployed and used, more concentration must be placed on network QoS and reservation
- Most large applications are currently data rich
- P2P and public computing have experience in communication-poor environments
Applications
- Embarrassingly parallel, data-poor applications in the case of pooling large amounts of weak computing power
- Huge data-intensive, data-rich applications that can take advantage of multiple, parallel supercomputers
- Application-specific grids like Cactus and Nimrod
Grid Issues
- Site autonomy
- Heterogeneous resources
- Co-allocation
- Metrics for resource allocation
- Language for utilizing grids
- Reliability
Site autonomy
- Each component of the grid could be administered by an individual organization participating in the VO
- Each administrative domain has its own policies and procedures surrounding its resources
- Most scheduling and resource management work must be distributed to support this
Heterogeneous resources
- Grid resources will have not only heterogeneous platforms but heterogeneous workloads
- Applications truly exploiting grid resources will need to scale from idle cycles on workstations to huge vector-based HPCs to clusters
- Not only computational power, but also storage, peripherals, reservability, availability, and network connectivity
Co-allocation
- Unique challenges of reserving multiple resources across administrative domains
- Capabilities of resource management may be different for each component of a composite resource
- Failure of allocating components must be handled in a transaction-like manner
- Acceptable substitute components may assist in co-allocating a composite resource
Metrics for resource allocation
- Different scheduling approaches measure performance differently
- Historical performance
- Throughput
- Storage
- Network connectivity
- Cost
- Application-specific performance
- Service level
Language for utilizing grids
- Much of the work in grids is protocol or language work
- Expressive languages needed for negotiating service level, reporting performance or resource capabilities, security, and reserving resources
- Protocol work in authentication and authorization, data transfer, and job management
Summary about Grids
- Grids offer tremendous computation and data storage resources not available in single systems or single clusters
- Application and algorithm design and deployment still either rudimentary or application specific
- Universal infrastructure still in development
- Unique challenges still unsolved, especially in regard to fault tolerance and multiple administrative domains
Public Computing
- Aggregates idle workstations connected to the Internet for performing large scale computations
- Initially seen in volunteer projects such as Distributed.net and SETI@home
- Volunteer computers periodically download work from a project server and complete the work during idle periods
- Currently used in projects that have large workloads on the scale of months or years with trivially parallelizable tasks
BOINC Architecture
- Berkeley Open Infrastructure for Network Computing
- Developed as a generic public computing framework
- Next generation architecture for the SETI@home project
- Open source and encourages use in other public computing projects
BOINC lets you donate computing power to the following projects
- Climateprediction.net: study climate change
- Einstein@home: search for gravitational signals emitted by pulsars
- LHC@home: improve the design of the CERN LHC particle accelerator
- Predictor@home: investigate protein-related diseases
- SETI@home: look for radio evidence of extraterrestrial life
- Cell Computing: biomedical research (Japanese; requires nonstandard client software)
BOINC Architecture (diagram)
Motivation for New Scheduling Strategies
- Many projects requiring large scale computational resources are not of the current public computing scale
- Grid and cluster scale projects are very popular in many scientific computing areas
- Current public computing scheduling does not scale down to these smaller projects
Motivation for New Scheduling Strategies
- Grid scale scheduling for public computing would make public computers a viable alternative or complementary resource to grid systems
- Public computing has the potential to offer a tremendous amount of computing resources from idle systems of organizations or volunteers
- Scavenging grid projects such as Condor indicate interest in harnessing these resources in the grid research community
Scheduling Algorithms
- Current BOINC scheduling algorithm
- New scheduling algorithms:
- First Come, First Serve with target workload of 1 workunit (FCFS-1)
- First Come, First Serve with target workload of 5 workunits (FCFS-5)
- Ant Colony scheduling algorithm
BOINC Scheduling
- Originally designed for unlimited work
- Clients can request as much work as desired, up to a specified limit
- Smaller, limited computational jobs are faced with the challenge of more accurate scheduling
- Too many workunits assigned to a node leads to either redundant computation by other nodes or exhaustion of available workunits
- Too few workunits assigned leads to increased communication overhead
New Scheduling Strategies
- New strategies target computational problems on the scale of many hours or days
- Four primary goals:
- Reduce application execution time
- Increase resource utilization
- No reliance on client supplied information
- Remain application neutral
First Come First Serve Algorithms
- Naïve scheduling algorithms based solely on the frequency of client requests for work
- Server-centric approach that does not depend on client supplied information for scheduling
- At each request for work, the server compares the number of workunits already assigned to a node and sends work to the node based on a target worklevel (see the sketch after this list)
- Two algorithms tested, targeting either a workload of one workunit (FCFS-1) or five workunits (FCFS-5)
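A minimal server-side sketch of the FCFS-k decision. The class and method names are invented for illustration; this is not the actual BOINC code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of FCFS-k: on each request, top the node up to the target worklevel.
public class FcfsScheduler {
    private final int targetLevel;                      // k = 1 for FCFS-1, k = 5 for FCFS-5
    private final Map<String, Integer> assigned = new ConcurrentHashMap<>();

    public FcfsScheduler(int targetLevel) { this.targetLevel = targetLevel; }

    // Called when a node requests work; returns how many workunits to send.
    public int workunitsToSend(String nodeId, int availableInPool) {
        int pending = assigned.getOrDefault(nodeId, 0);
        int deficit = Math.max(0, targetLevel - pending);
        int send = Math.min(deficit, availableInPool);  // never send more than the pool holds
        assigned.merge(nodeId, send, Integer::sum);
        return send;
    }

    // Called when a node reports a completed workunit.
    public void reportCompleted(String nodeId) {
        assigned.merge(nodeId, -1, Integer::sum);
    }
}
```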
Ant Colony Algorithms
- Meta-heuristic modeling the behavior of ants searching for food
- Ants make decisions based on pheromone levels
- Decisions affect pheromone levels to influence future decisions (a standard formulation is shown below)
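In its standard textbook form (not specific to this talk), an ant chooses among options in proportion to pheromone level, and trails evaporate at a fixed rate:

```latex
% Standard ACO choice probability and pheromone update (textbook form, not
% taken from the talk): \alpha weights pheromone influence, \rho is evaporation.
p_{ij} = \frac{\tau_{ij}^{\alpha}}{\sum_{k \in \mathrm{choices}(i)} \tau_{ik}^{\alpha}},
\qquad
\tau_{ij} \leftarrow (1-\rho)\,\tau_{ij} + \Delta\tau_{ij}
```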
Ant Colony Algorithms
- Initial decisions are made at random
- Ants leave a trail of pheromones along their path
- The next ants use pheromone levels to decide
- Still random, since the initial trails were random
Ant Colony Algorithms
- Shorter paths will complete quicker, leading to feedback from the pheromone trail
- An ant at the destination now bases its return decision on pheromone level
- Decisions begin to become ordered
Ant Colony Algorithms
- Repeated reinforcement of the shortest path leads to greater pheromone buildup
- Pheromone trails degrade over time
Ant Colony Algorithms
- At this point the route discovery has converged
- Probabilistic model of route choice allows for random searching of potentially better routes
- Allows escape from local minima or adaptation to changes in the environment
Ant Colony Scheduling
- In the context of scheduling, the scheduler attempts to find an optimal distribution of workunits to processing nodes
- To carry out the analogy, workunits are the "ants", computational power is the "food", and the mapping is the "path"
- Scheduler begins by randomly choosing mappings of workunits to nodes
- As workunits are completed and returned, more powerful nodes are reinforced more often than weaker nodes
- More workunits are sent to more powerful nodes
Ant Colony Scheduling in BOINC
- To take advantage of more workunits on each node, distributions are chosen on batches of workunits
- A percentage of a target batch is sent based on pheromone level
- Due to batching of workunits, server to client communication is consolidated and reduced
- Using the pheromone heuristic ensures nodes get a share of workunits proportional to their computing power
Ant Colony Scheduling in BOINC
- Pheromone levels based on actual performance of completed workunits, not on reported benchmarks of nodes
- Attempts to improve on CPU benchmarks:
- Incorporates communication overhead
- Fluctuations in performance
- Dynamic removal and addition of nodes
- Level can be calculated completely by the server and not on untrusted nodes (see the sketch after this list)
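A sketch of how such a server-side ant colony scheduler might look. The evaporation rate, deposit rule, and batch size are assumptions for illustration; the talk specifies only that pheromone is derived from observed completion performance:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: pheromone per node, reinforced by observed completion speed and
// decayed over time; batch sizes are proportional to a node's pheromone share.
public class AntColonyScheduler {
    private static final double EVAPORATION = 0.1;   // assumed decay rate
    private static final int TARGET_BATCH = 10;      // assumed target batch size
    private final Map<String, Double> pheromone = new ConcurrentHashMap<>();

    // Reinforce: a faster completion deposits more pheromone.
    public void onWorkunitCompleted(String nodeId, double wallClockHours) {
        double deposit = 1.0 / Math.max(wallClockHours, 1e-6);
        pheromone.merge(nodeId, deposit, (old, d) -> (1 - EVAPORATION) * old + d);
    }

    // Allocate: send a share of the target batch proportional to pheromone level.
    public int batchSizeFor(String nodeId) {
        double total = pheromone.values().stream().mapToDouble(Double::doubleValue).sum();
        double level = pheromone.getOrDefault(nodeId, 1.0);  // unknown nodes start neutral
        if (total <= 0) return 1;
        return Math.max(1, (int) Math.round(TARGET_BATCH * level / total));
    }
}
```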
Testing Environment and Experiments
- Testing of new scheduling strategies implemented on a working BOINC system
- Scheduling metrics and data
- Strategies used to schedule three experiments:
- MD4 Password Hash Search
- Avalanche Photodiode Gain and Impulse Response Calculations
- Gene Sequence Alignment
Testing Environment (diagram)
Scheduling Metrics and Data
- All three experiments are measured with the same metrics:
- Application runtime of each scheduling algorithm and the sequential runtime
- Speedup versus sequential runtime for each scheduling algorithm (defined below)
- Workunit distribution of each algorithm
MD4 Password Hash Search
- MD4 is a cryptographic hash used in password security in systems such as Microsoft Windows and the open source SAMBA
- Passwords are stored by computing the MD4 hash of the password and storing the hashed result
- Ensures clear-text passwords are not stored on a system
- When password verification is needed, the supplied password is hashed and compared to the stored hash
- Cryptographic security of MD4 ensures the password cannot be derived from the hash
- Recovering a password is possible through brute force exhaustion of all possible passwords and searching for a matching hash value
MD4 Search Problem Formulation
- MD4 search experiment searches through all possible 6 character passwords
- A standard keyboard allows 94 possible characters in a password
- For 6 character passwords, there are 94^6 (about 6.9 × 10^11) possible passwords
MD4 Search Problem Formulation
- BOINC implementation divides the entire password space into 2,209 workunits of roughly 3.1 × 10^8 possible passwords each (94^6 / 2,209); a sketch of one workunit's loop follows below
- All passwords in the workunit are hashed and compared to a target hash
- Results are sent back to the central server for processing
- All workunits are processed regardless of finding a match
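A sketch of one workunit's search loop. The stock JDK has no MD4, so this assumes the Bouncy Castle provider; the class and method names are illustrative, not the project's actual code:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.Security;
import java.util.Arrays;
import org.bouncycastle.jce.provider.BouncyCastleProvider;

// Enumerate a contiguous index range of the 94^6 password space, hash each
// candidate with MD4, and compare against the target hash.
public class Md4Workunit {
    private static final char[] ALPHABET = buildAlphabet();  // 94 printable ASCII chars

    public static long search(long start, long end, byte[] targetHash) throws Exception {
        Security.addProvider(new BouncyCastleProvider());    // supplies MD4
        MessageDigest md4 = MessageDigest.getInstance("MD4");
        char[] pwd = new char[6];
        for (long index = start; index < end; index++) {
            long n = index;                   // decode the index into a 6-char password
            for (int i = 0; i < 6; i++) { pwd[i] = ALPHABET[(int) (n % 94)]; n /= 94; }
            byte[] hash = md4.digest(new String(pwd).getBytes(StandardCharsets.US_ASCII));
            if (Arrays.equals(hash, targetHash)) return index;  // match found
        }
        return -1;                            // no match in this workunit's range
    }

    private static char[] buildAlphabet() {
        char[] a = new char[94];
        for (int i = 0; i < 94; i++) a[i] = (char) ('!' + i);  // '!' .. '~'
        return a;
    }
}
```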
MD4 Search Problem Formulation
- Problem is ideally suited to the public computing architecture:
- Computationally intensive
- Independent tasks
- Low communication requirements
MD4 Search Results
- Parallel runtimes are measured versus an extrapolated sequential runtime based on the time needed for computing the passwords of one workunit
- Parallel implementation takes on the additional load of scheduling and communication costs
MD4 Search Runtime (runtime charts)
MD4 Search Parallel Runtime (chart)
MD4 Search Runtime
- All three show runtimes significantly lower than sequential
- Ant Colony and FCFS-5 show similar runtimes, lower than FCFS-1
- FCFS-5 shows erratic runtime due to processing and reporting five workunits at a time
MD4 Search Speedup Comparison (chart)
MD4 Search Speedup
- FCFS-1 quickly approaches and maintains a lower peak speedup level due to communication overhead and delay from scheduling requests
- FCFS-1 also suffers from reduced parallelism due to inability to exploit local parallelism on the quad processor system
- FCFS-5 erratically approaches a higher speedup level
- Ant Colony approaches its peak speedup with a pattern similar to FCFS-1 but at a level similar to FCFS-5
MD4 Search Workunit Distribution
- Quad processor system underutilized with the FCFS-1 algorithm
- Remaining systems evenly distributed for all three scheduling algorithms
- Lower speed workstations receive proportionally smaller workloads
MD4 Search Conclusion
- MD4 search is ideally suited to the public computing architecture
- Calculation benefits from larger workloads assigned to nodes to reduce communication overhead
- Ant Colony and FCFS-5 perform similarly, with FCFS-1 performing poorly
Avalanche Photodiode Gain and Impulse Response
- Avalanche Photodiodes (APDs) are used as photodetectors in long-haul fiber-optic systems
- The gain and impulse response of APDs is a stochastic process with a random shape and duration
- This experiment calculates the joint probability distribution function (PDF) of APD gain and impulse response
APD Problem Formulation
- The joint PDF of APD gain and impulse response is based on the position of an input carrier
- This input carrier causes ionization in the APD, leading to additional carriers within a multiplication region
- This avalanche effect leads to a gain in carriers over time
- Due to this avalanche effect, the joint PDF can be calculated iteratively based on the probability of a carrier ionizing and in turn causing additional impacts and ionizations, creating new carriers
APD Problem Formulation
- BOINC implementation parallelizes calculation of the PDF for a carrier at any position in the 360 degrees of the unit circle
- 360 workunits are created corresponding to each of these positions, using identical parameters
- The result of each workunit is a matrix of results with all values for all positions of a carrier and the impulse response for all times
APD Runtime
- Sequential runtime is based on extrapolating total runtime from the average CPU time of a single workunit
- All three parallel schedules show runtimes significantly lower than sequential
- Ant Colony and FCFS-5 show similar runtimes, lower than FCFS-1
- FCFS-5 shows erratic runtime due to processing and reporting five workunits at a time
APD Runtime (runtime charts)
APD Parallel Runtime (chart)
APD Runtime
- Ant Colony has the lowest runtime, followed by FCFS-1 and finally by FCFS-5
- Note the spike in runtime for FCFS-5 at the end of the calculation
- Long runtime of individual workunits accounts for this spike at the end of the calculation for FCFS-5, when the pool of workunits is exhausted
APD Speedup Comparison (chart)
APD Speedup
- Large fluctuations at the beginning of the calculation likely due to constrained bandwidth for output data
- Bandwidth constraint leaves all nodes performing similarly, except for the single local node
APD Workunit Distribution
- Local workstation is the highest performer of the nodes
- Quad processor is the weakest performer
- Shares outbound bandwidth with most other nodes
- Constrained bandwidth of a single network interface dominates any benefit from local parallelism
- Workunits on other nodes randomly distributed due to contention for the communication medium
- Ant Colony allocates the fewest workunits to the quad processor and the most workunits to the local node
APD Conclusion
- APD experiment focuses on the impact of communication overhead due to output data on scheduling strategy
- All three offer significant speedup over sequential, with FCFS-5 performing the worst
- Ant Colony outperforms both naïve algorithms by an increased allocation of work to the best performing node
- Ant Colony benefits from reserving more workunits in the work pool for higher performing nodes at the end of the calculation
Gene Sequence Alignment
- Problem from bioinformatics: finding the best alignment for two sequences of genes based on matching of bases and penalties for insertion of gaps in either sequence
- Alignments of two sequences are scored to determine the best alignment
- Different alignments can offer different scores
Gene Sequence Alignment
- Given two sequences
- A bonus is given for a match in the sequences
- A penalty is applied for a mismatch
Gene Sequence Alignment
- Sequences can be realigned by inserting gaps
- Gaps are penalized
- Resulting scores will differ depending on where gaps are inserted (toy example below)
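As a toy illustration (scores assumed here: match +1, mismatch -1, gap -2), aligning ACGT against AGT with a trailing gap (AGT-) scores 1 - 1 - 1 - 2 = -3, while placing the gap second (A-GT) scores 1 - 2 + 1 + 1 = +1, so the gapped alignment wins despite the penalty.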
Sequence Alignment Problem Formulation
- Finding the best possible alignment is based on a dynamic programming algorithm (see the sketch after this list)
- A scoring matrix is calculated to simultaneously calculate all possible alignments
- Calculating the scoring matrix steps through each position and determines the score for all combinations of gaps
- Once calculated, the best score can be found and backtracked to determine the alignment
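The recurrence being described is the standard Needleman-Wunsch dynamic program. A direct sequential sketch, with placeholder match/mismatch/gap values (not the talk's parameters):

```java
// Standard global-alignment scoring matrix (Needleman-Wunsch form).
// MATCH, MISMATCH, and GAP are placeholder values, not the talk's parameters.
public class ScoringMatrix {
    static final int MATCH = 2, MISMATCH = -1, GAP = -2;

    static int[][] score(String a, String b) {
        int n = a.length(), m = b.length();
        int[][] s = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) s[i][0] = i * GAP;   // all-gap prefixes
        for (int j = 0; j <= m; j++) s[0][j] = j * GAP;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int diag = s[i - 1][j - 1]
                        + (a.charAt(i - 1) == b.charAt(j - 1) ? MATCH : MISMATCH);
                int up = s[i - 1][j] + GAP;               // gap in sequence b
                int left = s[i][j - 1] + GAP;             // gap in sequence a
                s[i][j] = Math.max(diag, Math.max(up, left));
            }
        }
        return s;  // each entry depends on its upper, left, and upper-left neighbors
    }
}
```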
Sequence Alignment Problem Formulation
- Each entry in the scoring matrix depends on adjacent neighbors from the position before
- These dependencies create a pattern depicted in the diagram
Sequence Alignment Problem Formulation
- The dependencies of the scoring matrix make parallelization difficult
- Nodes cannot compute scores until previous dependencies are satisfied
- Maximum parallelism can be achieved by calculating the scores in a diagonal major fashion (see the sketch after this list)
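Diagonal major order can be written as a loop over anti-diagonals; a sketch using the same placeholder scores as the previous sketch, and assuming row 0 and column 0 are already initialized with gap penalties:

```java
// Diagonal-major traversal: all cells with i + j == d form one anti-diagonal
// and depend only on earlier diagonals, so each inner loop is parallelizable.
static void fillDiagonalMajor(int[][] s, String a, String b) {
    int n = a.length(), m = b.length();
    for (int d = 2; d <= n + m; d++) {
        int iLo = Math.max(1, d - m), iHi = Math.min(n, d - 1);
        for (int i = iLo; i <= iHi; i++) {        // independent; could run in parallel
            int j = d - i;
            int diag = s[i - 1][j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 2 : -1);
            s[i][j] = Math.max(diag, Math.max(s[i - 1][j] - 2, s[i][j - 1] - 2));
        }
    }
}
```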
Sequence Alignment Problem Formulation
- BOINC implementation only measures calculation and storage of the solution matrix
- Does not include finding the maximum score and backtracking through the alignment
- Solution matrix is left on the client and not transferred to the central server
- Problem calculates the solution matrix for aligning two generated sequences, each of length 100,000
Sequence Alignment Runtime
- Runtime curves show a slight wave, beginning with a decrease in per unit runtime and later increasing again
- Due to the wavefront completion of tasks in the diagonal major computation, leading to increasing parallelism up to the longest diagonal
- After this midpoint, parallelism decreases
Diagonal Major Execution (diagram)
Sequence Alignment Runtime (runtime charts)
Sequence Alignment Parallel Runtime (chart)
Sequence Alignment Speedup Comparison (chart)
Sequence Alignment Speedup
- FCFS-1 shows a steady curve reflecting a gradual increase in parallelism due to available tasks and a steady decrease in parallelism as the wavefront passes the largest diagonal of the calculation
- FCFS-5 shows a more gradual incline at the beginning of the calculation and a steeper decline toward the end
- Ant Colony shows a steeper incline and gradual decline
Sequence Alignment Speedup
- FCFS-5 enjoys less parallelism at the beginning of the calculation due to allocating many workunits to nodes requesting work with such a small pool to draw from
- As more workunits become available, FCFS-5's aggressive scheduling works to its advantage until workunits begin to become exhausted again
- FCFS-1 is conservative in scheduling throughout
- Ant Colony begins conservatively but occasionally sends multiple workunits to a node
- This leads to a quicker buildup of generated workunits early in the calculation
- Later in the calculation, Ant Colony schedules more aggressively and eventually exhausts the workunit pool similarly to FCFS-5
Sequence Alignment Workunit Distribution
- Mostly random distribution due to task dependency dominating the calculation
- All three scheduling techniques show little preference for any node based on communication or processing resources
Sequence Alignment Conclusion
- Ant Colony provides an interesting mix of attributes of both FCFS-1 and FCFS-5 when scheduling this computation
- It shares the conservative scheduling of FCFS-1 and later aggressive scheduling similar to FCFS-5
- All three parallel computations offer only a slight benefit to the problem due to the task dependency structure
- It should be noted, the theoretical computation time of the sequential algorithm would require a sequential machine with 37.3 GB of memory if no memory reduction techniques are used in storing the solution matrix
Performance Summary
- Ant Colony scheduling offers top performance in all three experiments
- FCFS-1 and FCFS-5 offer varying performance levels depending on the attributes of the target application
- Ant Colony adapts to match or better the best of the competing algorithms
- All three offer acceptable schedules for the parallel applications without relying on client supplied information
Problems with BOINC
- Client server model
- Traffic congestion problem
- Server too busy to handle requests
- Solution: peer to peer model
Peer to Peer Platform
- Written in Java
- Uses the JXTA toolkit to provide communication services and a peer to peer overlay network
- Platform provides basic object and messaging primitives to facilitate peer to peer application development
- Messages between objects can be transparently passed to objects on other peers or to local objects
- Each object in an application runs in its own thread
Distributed Scheduling
- Applications provide their own scheduling and decomposition of work
- Typical pattern of application design (sketched after this list):
- A factory object generates work units from decomposed job data
- Worker objects perform computation
- Result objects consolidate and report results
- The work factory handles distributing work to cooperating peers and consolidating results
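The pattern might be sketched with three small interfaces; these signatures are invented for illustration and are not the platform's actual API:

```java
// Hypothetical shapes for the factory/worker/result pattern described above;
// the real platform's API is not shown in the talk.
interface WorkFactory {
    WorkUnit next();                 // decompose job data into the next work unit
    boolean hasMoreWork();
}

interface Worker {
    Result compute(WorkUnit unit);   // runs in its own thread, possibly on a remote peer
}

interface ResultSink {
    void consolidate(Result r);      // merge and report partial results
}

record WorkUnit(long id, byte[] payload) {}
record Result(long unitId, byte[] data) {}
```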
Distributed Sequence Alignment
- Sequence alignment job begins as a comparison of two complete sequences
- Work factory breaks the complete comparison into work units down to a minimum size
- A work unit can begin processing as soon as its dependencies are available
- Initially, only the upper left corner work unit of the result matrix has all dependencies satisfied
- When a work unit is completed, its adjacent work units become eligible for processing
Distributed Sequence Alignment
- The distributed application attempts to complete work units in squares of four adjacent work units
- As more work units are completed, larger and larger squares of work units become eligible for processing (see the sketch after the diagram)
[Diagram: grid of work units labeled Complete, Eligible, Processing, and Not Ready]
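The eligibility rule can be stated compactly. A sketch over a grid of work-unit blocks, where a block becomes eligible once the neighbor blocks it depends on are complete (names are illustrative):

```java
// Eligibility rule for the blocked solution matrix: block (bi, bj) may be
// processed once the neighbors it depends on (up, left, upper-left) are done.
// Border blocks only need the neighbors that actually exist.
static boolean isEligible(boolean[][] complete, int bi, int bj) {
    if (complete[bi][bj]) return false;                    // already done
    boolean up   = bi == 0 || complete[bi - 1][bj];
    boolean left = bj == 0 || complete[bi][bj - 1];
    boolean diag = bi == 0 || bj == 0 || complete[bi - 1][bj - 1];
    return up && left && diag;
}
```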
Distributed Sequence Alignment
- Peers other than the peer initially starting the job will begin with no work to complete
- The initial peer will broadcast a signal signifying availability of eligible work units
- Peers will attempt to contact a peer advertising work, requesting work
Distributed Sequence Alignment
- A peer with available work will distribute the largest amount of work eligible and mark the work unit as remotely processed
- When a peer completes all work in a work unit, it will report to the peer that initially assigned the work
- Only the adjacent edges of results necessary for computing new work units are reported, to reduce communication
- Complete results are stored at the peer performing the computation
Distributed Sequence Alignment
- After reporting the completion of a work unit, a peer will seek new work from all peers
- Once the initial peer completes all work, the job is done
- Peers could then be queried to report maximum scores and alignments
Experiment
- Peer to peer algorithm was tested aligning two sequences of 160,000 bases each
- Alignments used a typical scoring matrix and gap penalty
- The minimum work unit size for decomposition of the total job was 10,000 bases for each sequence, resulting in 256 work units
Experiment
- The distributed system was executed on a local area network with 2, 4, and 6 fairly similar peers
- The results were compared to a sequential implementation of the algorithm run on one of the peers
- The sequential implementation does a straightforward computation of the complete matrix
- Disk is used to periodically save sections of the matrix, since the complete matrix would exhaust available memory
Runtime
- Runtime is reduced from 1 hour and 9 minutes to 28 minutes
- This is about 2.4 times faster than sequential
- The most dramatic drop is at 2 nodes, with a 1.75 times speedup at 39 minutes
Node Efficiency
- The first peer generally has the highest
efficiency - Average efficiency drops as more nodes are added
Analysis
- Findings are in line with the structure of the sequential problem
- Due to dependencies of tasks on previous tasks, many nodes must initially remain idle while some nodes complete dependent tasks
- As more work becomes eligible for processing, more nodes can work simultaneously
- Later in the computation, fewer work units become eligible for computation due to fewer dependencies
Comparison With CS Model
- Previous work performed the same computation with a client server platform, BOINC
- Aligned two sequences of 100,000 bases each
- Work unit size was 2,000 bases for each sequence
- 2,500 work units
- Performed on 30 nodes
Comparison With CS Model
- Direct comparison is difficult
- Previous job was smaller but with smaller granularity, increasing the number of work units
- Previous sequential portions of alignment seem to be inefficient compared to the current implementation, based on overall sequential job completion time
Comparison With CS Model
- Best comparison is the overall runtime reduction factor
- Runtime reduction factor compares distributed completion time with similar sequential completion time
- Factors out differing performance of sequential aspects of the computation
- BOINC implementation achieved a reduction factor of only about 1.2 times sequential
- Peer to peer achieved 2.4 with only 1/5 the nodes
Comparison With CS Model
- BOINC implementation was impeded by a central server, which was using a slower link to most nodes compared to the peer to peer configuration
- Peer to peer implementation also shows signs of diminishing returns as more nodes are added
- Peer to peer utilizes all participating systems; client server normally uses a server which does not participate in the computation outside of scheduling
Conclusions
- Our scheduling strategy is effective on BOINC
- P2P is more effective than the BOINC client-server model
- Public computing can solve problems with large computing power requirements and huge memory demands
- Potentially replaces supercomputing for certain applications (large grains)
Future Work
- Measure the impact of fault tolerance on these scheduling algorithms
- Measure the impact of work redundancy and work validation
- Continue to benchmark the P2P implementation with more nodes
- Implement Ant Colony scheduling on the P2P model
Future Work
- Currently a peer only seeks more work when it has completed all of its own work
- It does not seek work when waiting for reports from peers to which it has distributed work
- Allowing a peer to seek work while it is waiting for other peers may increase utilization
- Create a more direct comparison with the client server model
- Additional applications and improvements to the base platform
Thank You