Transcript and Presenter's Notes

Title: Public Computing Challenges and Solutions

1
Public Computing - Challenges and Solutions
  • Yi Pan
  • Professor and Chair of CS
  • Professor of CIS
  • Georgia State University
  • Atlanta, Georgia, USA
  • AINA 2007
  • May 21, 2007

2
Outline
  • What is Grid Computing?
  • Virtual Organizations
  • Types of Grids
  • Grid Components
  • Applications
  • Grid Issues
  • Conclusions

3
Outline - continued
  • Public Computing and the BOINC Architecture
  • Motivation for New Scheduling Strategies
  • Scheduling Algorithms
  • Testing Environment and Experiments
  • MD4 Password Hash Search
  • Avalanche Photodiode Gain and Impulse Response
  • Gene Sequence Alignment
  • Peer to Peer Model and Experiments
  • Conclusion and Future Research

7
What is Grid Computing?
  • Analogy is to power grid
  • Heterogeneous and geographically dispersed
  • Standards allow for transportation of power
  • Standards define interface with grid
  • Non-trivial overhead of managing movement and
    storage of power
  • Economies of scale compensate for this overhead
    allowing for cheap, accessible power

10
A Computational Power Grid
  • Goal is to make computation a utility
  • Computational power, data services, peripherals
    (Graphics accelerators, particle colliders) are
    provided in a heterogeneous, geographically
    dispersed way
  • Standards allow for transportation of these
    services
  • Standards define interface with grid
  • Architecture provides for management of resources
    and controlling access
  • Large amounts of computing power should be
    accessible from anywhere in the grid

11
Virtual Organizations
  • Independent organizations come together to pool
    grid resources
  • Component organizations could be different
    research institutions, departments within a
    company, individuals donating computing time, or
    anything with resources
  • Formation of the VO should define participation
    levels, resources provided, expectations of
    resource use, accountability, economic issues
    such as charge for resources
  • Goal is to allow users to exploit resources
    throughout the VO transparently and efficiently

12
Types of Grids
  • Computational Grid
  • Data Grid
  • Scavenging Grid
  • Peer-to-Peer
  • Public Computing

13
Computational Grids
  • Traditionally used to connect high performance
    computers between organizations
  • Increases utilization of geographically dispersed
    computational resources
  • Provides more parallel computational power to
    individual applications than is feasible for a
    single organization
  • Most traditional grid projects concentrate on
    these types of grids
  • Globus and OGSA

14
Data Grids
  • Distributed data sources
  • Queries of distributed data
  • Sharing of storage and data management resources
  • D0 - Particle Physics Data Grid allows access to
    both compute and data resources for huge amounts
    of physics data
  • Google

15
Scavenging Grids
  • Harness idle cycles on systems especially user
    workstations
  • Parallel application must be quite granular to
    take advantage of large amounts of weak computing
    power
  • Grid system must support terminating and
    restarting work when systems cease idling
  • Condor system from University of Wisconsin

16
Peer-to-Peer
  • Converging technology with traditional grids
  • Contrasts with grids by having little
    infrastructure and high fault tolerance
  • Highly scalable for participation, but resources
    are difficult to locate and monitor
  • Current P2P systems like Gnutella, Freenet, and
    FastTrack concentrate on data services

17
Public Computing
  • Also converging with grid computing
  • Often communicates through a central server in
    contrast with peer-to-peer technologies
  • Again scalable with participation
  • Multiple administrative domains have even greater
    impact, as participants are often untrusted and
    unaccountable

18
Public Computing Examples
  • SETI@home (http://setiathome.ssl.berkeley.edu/)
    Search for Extraterrestrial Intelligence in radio
    telescope data (UC Berkeley)
  • Has more than 5 million participants
  • The most powerful computer, IBM's ASCI White, is
    rated at 12 TeraFLOPS and costs $110 million.
    SETI@home currently gets about 15 TeraFLOPS and
    has cost about $500K so far.

19
More Public Computing Examples
  • Folding@Home project (http://folding.stanford.edu)
    for molecular simulation aimed at new drug
    discovery
  • Distributed.net (http://distributed.net) for
    cracking the RC5 64-bit encryption algorithm used
    more than 300,000 nodes over 1,757 days

20
Grid Components
  • Authentication and Authorization
  • Resource Information Service
  • Monitoring
  • Scheduler
  • Fault Tolerance
  • Communication Infrastructure

21
Authentication and Authorization
  • Important for allowing users to cross the
    administrative boundaries in a virtual
    organization
  • System security for jobs outside the
    administrative domain is currently rudimentary
  • Work being done on sandboxing, better job
    control, development environments

22
Resource Information Service
  • Used in resource discovery
  • Leverages existing technologies such as LDAP,
    UDDI
  • Information service must be able to report very
    current availability and load data
  • Balanced with overhead of updating data

23
Monitoring
  • Raw performance characteristics are not the only
    measurement of resource performance
  • Current and expected loads can have a tremendous
    impact
  • Balance between accurate performance data and
    additional overhead of monitoring systems and
    tracking that data

24
Scheduler
  • Owners of systems interested in maximizing
    throughput
  • Users interested in maximizing runtime
    performance
  • Both offer challenges with crossing
    administrative boundaries
  • Unique issues such as co-allocation and
    co-location
  • Interesting work being done in scheduling, such
    as market-based scheduling

25
Fault Tolerance
  • More work exploring fault tolerance in grid
    systems leveraging peer-to-peer and public
    computing research
  • Multiple administrative domains in VO challenge
    the reliability of resources
  • Faults can refer not only to resource failure but
    violation of service level agreements (SLA)
  • Impact on fault tolerance if there is no
    accountability for failure

29
Communication Infrastructure
  • Currently most grids have robust communication
    infrastructure
  • As more grids are deployed and used, more
    attention must be paid to network QoS and
    reservation
  • Most large applications are currently data rich
  • P2P and Public Computing have experience in
    communication-poor environments

30
Applications
  • Embarrassingly parallel, data-poor applications
    in the case of pooling large amounts of weak
    computing power
  • Huge, data-intensive, data-rich applications that
    can take advantage of multiple, parallel
    supercomputers
  • Application specific grids like Cactus and Nimrod

31
Grid Issues
  • Site autonomy
  • Heterogeneous resources
  • Co-allocation
  • Metrics for resource allocation
  • Language for utilizing grids
  • Reliability

32
Site autonomy
  • Each component of the grid could be administered
    by an individual organization participating in
    the VO
  • Each administrative domain has its own policies
    and procedures surrounding their resources
  • Most scheduling and resource management work must
    be distributed to support this

33
Heterogeneous resources
  • Grid resources will have not only heterogeneous
    platforms but heterogeneous workloads
  • Applications truly exploiting grid resources will
    need to scale from idle cycles on workstations,
    huge vector based HPCs, to clusters
  • Not only computation power, also storage,
    peripherals, reservability, availability, network
    connectivity

34
Co-allocation
  • Unique challenges of reserving multiple resources
    across administrative domains
  • Capabilities of resource management may be
    different for each component of a composite
    resource
  • Failure of allocating components must be handled
    in a transaction-like manner
  • Acceptable substitute components may assist in
    co-allocating a composite resource

35
Metrics for resource allocation
  • Different scheduling approaches measure
    performance differently
  • Historical performance
  • Throughput
  • Storage
  • Network connectivity
  • Cost
  • Application specific performance
  • Service level

36
Language for utilizing grids
  • Much of the work in grids is protocol or language
    work
  • Expressive languages needed for negotiating
    service level, reporting performance or resource
    capabilities, security, and reserving resources
  • Protocol work in authentication and
    authorization, data transfer, and job management

37
Summary about Grids
  • Grids offer tremendous computation and data
    storage resources not available in single systems
    or single clusters
  • Application and algorithm design and deployment
    still either rudimentary or application specific
  • Universal infrastructure still in development
  • Unique challenges still unsolved especially in
    regard to fault tolerance and multiple
    administrative domains

38
Public Computing
  • Aggregates idle workstations connected to the
    Internet for performing large scale computations
  • Initially seen in volunteer projects such as
    Distributed.net and SETI@home
  • Volunteer computers periodically download work
    from a project server and complete the work
    during idle periods
  • Currently used in projects that have large
    workloads on the scale of months or years with
    trivially parallelizable tasks

39
BOINC Architecture
  • Berkeley Open Infrastructure for Network
    Computing
  • Developed as a generic public computing framework
  • Next generation architecture for the SETI@home
    project
  • Open source and encourages use in other public
    computing projects

40
BOINC lets you donate computing power to the
following projects:
  • Climateprediction.net: study climate change
  • Einstein@home: search for gravitational signals
    emitted by pulsars
  • LHC@home: improve the design of the CERN LHC
    particle accelerator
  • Predictor@home: investigate protein-related
    diseases
  • SETI@home: look for radio evidence of
    extraterrestrial life
  • Cell Computing: biomedical research (Japanese;
    requires nonstandard client software)

41
BOINC Architecture
42
Motivation for New Scheduling Strategies
  • Many projects requiring large scale computational
    resources not of the current public computing
    scale
  • Grid and cluster scale projects are very popular
    in many scientific computing areas
  • Current public computing scheduling does not
    scale down to these smaller projects

43
Motivation for New Scheduling Strategies
  • Grid-scale scheduling for public computing would
    make public computers a viable alternative or
    complementary resource to grid systems
  • Public computing has the potential to offer a
    tremendous amount of computing resources from
    idle systems of organizations or volunteers
  • Scavenging grid projects such as Condor indicate
    interest in harnessing these resources in the
    grid research community

44
Scheduling Algorithms
  • Current BOINC scheduling algorithm
  • New scheduling algorithms
  • First Come, First Serve with target workload of 1
    workunit (FCFS-1)
  • First Come, First Serve with target workload of 5
    workunits (FCFS-5)
  • Ant Colony Scheduling Algorithm

45
BOINC Scheduling
  • Originally designed for unlimited work
  • Clients can request as much work as desired up to
    a specified limit
  • Smaller, limited computational jobs face the
    challenge of requiring more accurate scheduling
  • Too many workunits assigned to a node leads to
    either redundant computation by other nodes or
    exhaustion of available workunits
  • Too few workunits assigned leads to increased
    communication overhead

46
New Scheduling Strategies
  • New strategies target computational problems on
    the scale of many hours or days
  • Four primary goals
  • Reduce application execution time
  • Increase resource utilization
  • No reliance on client supplied information
  • Remain application neutral

47
First Come First Serve Algorithms
  • Naïve scheduling algorithms based solely on the
    frequency of client requests for work
  • Server-centric approach which does not depend on
    client supplied information for scheduling
  • At each request for work, the server compares the
    number of workunits already assigned to the node
    and sends work to the node based on a target
    work level (see the sketch below)
  • Two algorithms tested targeting either a workload
    of one workunit (FCFS-1) or five workunits
    (FCFS-5)
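
The FCFS behavior described above amounts to topping each node up to a fixed
target of outstanding workunits. The following Python sketch is illustrative
only; the data structures and function names are assumptions, not the actual
BOINC server code:

```python
# Minimal sketch of the FCFS-N idea: on each client request, top the node up
# to a fixed target work level (1 for FCFS-1, 5 for FCFS-5).
from collections import deque

def fcfs_schedule(node_id, assigned, pending, target=5):
    """Return the workunits to send to `node_id` for this request.

    assigned: dict mapping node_id -> number of workunits currently assigned
    pending:  deque of workunits not yet handed out
    target:   desired number of outstanding workunits per node
    """
    to_send = []
    while assigned.get(node_id, 0) < target and pending:
        wu = pending.popleft()
        to_send.append(wu)
        assigned[node_id] = assigned.get(node_id, 0) + 1
    return to_send

# Example: a node with nothing outstanding asks for work under FCFS-5.
pending = deque(range(20))          # 20 hypothetical workunit IDs
assigned = {}
print(fcfs_schedule("node-a", assigned, pending, target=5))  # -> [0, 1, 2, 3, 4]
```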

48
Ant Colony Algorithms
  • Meta-heuristic modeling the behavior of ants
    searching for food
  • Ants make decisions based on pheromone levels
  • Decisions affect pheromone levels to influence
    future decisions

49
Ant Colony Algorithms
  • Initial decisions are made at random
  • Ants leave trail of pheromones along their path
  • Next ants use pheromone levels to decide
  • Still random since initial trails were random

50
Ant Colony Algorithms
  • Shorter paths will complete quicker leading to
    feedback from the pheromone trail
  • Ant at destination now bases return decision on
    pheromone level
  • Decisions begin to become ordered

51
Ant Colony Algorithms
  • Repeated reinforcement of shortest path leads to
    greater pheromone buildup
  • Pheromone trails degrade over time

52
Ant Colony Algorithms
  • At this point the route discovery has converged
  • Probabilistic model of route choice allows for
    random searching of potentially better routes
  • Allows escape from local minima or adaptation to
    changes in environment
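
The ant-colony mechanics outlined in slides 48-52 can be condensed into a few
lines of Python. The routes, path lengths, deposit rule, and evaporation rate
below are illustrative assumptions, not values from the talk:

```python
# Choose routes with probability proportional to pheromone, deposit more
# pheromone on shorter paths, and let trails evaporate over time.
import random

routes = {"A": 4.0, "B": 7.0, "C": 10.0}      # route -> path length (made up)
pheromone = {r: 1.0 for r in routes}          # start with uniform trails
EVAPORATION = 0.1                             # fraction lost each iteration

def choose_route():
    total = sum(pheromone.values())
    pick = random.uniform(0, total)
    for route, level in pheromone.items():
        pick -= level
        if pick <= 0:
            return route
    return route  # fallback for floating-point edge cases

for _ in range(100):
    r = choose_route()
    pheromone[r] += 1.0 / routes[r]           # shorter path -> stronger deposit
    for route in pheromone:
        pheromone[route] *= (1.0 - EVAPORATION)

print(max(pheromone, key=pheromone.get))      # usually the shortest route, "A"
```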

53
Ant Colony Scheduling
  • In the context of scheduling, the scheduler
    attempts to find optimal distribution of
    workunits to processing nodes
  • To carry out the analogy, workunits are the
    ants, computational power is the food, and
    the mapping is the path
  • Scheduler begins by randomly choosing mappings of
    workunits to nodes
  • As workunits are completed and returned, more
    powerful nodes are reinforced more often than
    weaker nodes
  • More workunits are sent to more powerful nodes

54
Ant Colony Scheduling in BOINC
  • To take advantage of more workunits on each node,
    distributions are chosen on batches of workunits
  • A percentage of a target batch is sent based on
    pheromone level
  • Due to batching of workunits, server to client
    communication is consolidated and reduced
  • Using pheromone heuristic ensures nodes get a
    share of workunits proportional to their
    computing power

55
Ant Colony Scheduling in BOINC
  • Pheromone levels based on actual performance of
    completed workunits not on reported benchmarks of
    nodes
  • Attempts to improve on CPU benchmarks by
    accounting for communication overhead,
    fluctuations in performance, and dynamic removal
    and addition of nodes
  • Level can be calculated completely by the server
    and not on untrusted nodes
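
A hedged sketch of the ant colony scheduling idea from slides 53-55: the
server tracks a pheromone level per node from measured completion rates and
hands out a share of a workunit batch proportional to that level. The batch
size, smoothing factor, and function names are assumptions for illustration:

```python
pheromone = {}          # node_id -> pheromone level, maintained on the server
ALPHA = 0.5             # smoothing between old level and newest observation
BATCH_SIZE = 20         # target batch considered at each scheduling decision

def report_completion(node_id, workunits_done, elapsed_seconds):
    """Update a node's pheromone from measured throughput (server-side only)."""
    observed = workunits_done / max(elapsed_seconds, 1e-9)
    old = pheromone.get(node_id, observed)
    pheromone[node_id] = (1 - ALPHA) * old + ALPHA * observed

def workunits_to_send(node_id, pending_count):
    """Share of the target batch proportional to this node's pheromone level."""
    if node_id not in pheromone:
        return min(1, pending_count)          # unknown node: probe with one unit
    share = pheromone[node_id] / sum(pheromone.values())
    return min(max(1, round(share * BATCH_SIZE)), pending_count)

# Example: a fast node and a slow node report, then both ask for work.
report_completion("fast", workunits_done=10, elapsed_seconds=100)
report_completion("slow", workunits_done=2, elapsed_seconds=100)
print(workunits_to_send("fast", pending_count=50))   # larger share of the batch
print(workunits_to_send("slow", pending_count=50))   # smaller share
```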

56
Testing Environment and Experiments
  • Testing of new scheduling strategies implemented
    on a working BOINC system
  • Scheduling metrics and data
  • Strategies used to schedule three experiments
  • MD4 Password Hash Search
  • Avalanche Photodiode Gain and Impulse Response
    Calculations
  • Gene Sequence Alignment

57
Testing Environment
58
Scheduling Metrics and Data
  • All three experiments are measured with the same
    metrics
  • Application runtime of each scheduling algorithm
    and the sequential runtime
  • Speedup versus sequential runtime for each
    scheduling algorithm
  • Workunit Distribution of each algorithm

59
MD4 Password Hash Search
  • MD4 is a cryptographic hash used in password
    security in systems such as Microsoft Windows and
    the open-source Samba
  • Passwords are stored by computing the MD4 hash of
    the password and storing the hashed result
  • Ensures clear-text passwords are not stored on a
    system
  • When password verification is needed, the
    supplied password is hashed and compared to the
    stored hash (see the sketch below)
  • Cryptographic security of MD4 ensures the
    password cannot be derived from the hash
  • Recovering a password is possible through brute
    force exhaustion of all possible passwords and
    searching for a matching hash value
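
A minimal sketch of the hash-and-compare verification described above. Note
that MD4 support in Python's hashlib depends on the local OpenSSL build
(recent versions relegate MD4 to a legacy provider), so its availability here
is an assumption:

```python
import hashlib
import hmac

def md4_hex(password: str) -> str:
    # utf-16-le matches the NT-hash convention on Windows; plain UTF-8 would
    # also illustrate the idea.
    return hashlib.new("md4", password.encode("utf-16-le")).hexdigest()

def verify(supplied: str, stored_hash: str) -> bool:
    # Hash the supplied password and compare; the clear text is never stored.
    return hmac.compare_digest(md4_hex(supplied), stored_hash)

stored = md4_hex("hunter2")           # what the system keeps on disk
print(verify("hunter2", stored))      # True
print(verify("password", stored))     # False
```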

60
MD4 Search Problem Formulation
  • MD4 search experiment searches through all
    possible 6-character passwords
  • A standard keyboard allows 94 possible characters
    in a password
  • For 6-character passwords, there are 94^6 possible
    passwords

61
MD4 Search Problem Formulation
  • BOINC implementation divides the entire password
    space into 2,209 workunits of 4 · 94^4 possible
    passwords each
  • All passwords in the workunit are hashed and
    compared to a target hash
  • Results are sent back to the central server for
    processing
  • All workunits are processed regardless of finding
    a match
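
Because each workunit is a contiguous slice of the 94^6 index space, a client
can brute-force its slice by decoding indices into candidate passwords. The
character ordering, workunit layout, and MD4-via-hashlib call below are
illustrative assumptions, not the project's actual code:

```python
import hashlib
import string

CHARSET = string.ascii_letters + string.digits + string.punctuation  # 94 chars
SPACE = 94 ** 6
WORKUNITS = 2209
WU_SIZE = SPACE // WORKUNITS            # 4 * 94**4 candidates per workunit

def index_to_password(i: int) -> str:
    """Decode an integer index into a 6-character candidate password."""
    chars = []
    for _ in range(6):
        i, r = divmod(i, 94)
        chars.append(CHARSET[r])
    return "".join(chars)

def search_workunit(wu: int, target_hash: str):
    """Hash every candidate in workunit `wu` and report a match, if any."""
    start = wu * WU_SIZE
    for i in range(start, start + WU_SIZE):
        pw = index_to_password(i)
        if hashlib.new("md4", pw.encode()).hexdigest() == target_hash:
            return pw
    return None   # the result (match or not) goes back to the central server

# Example: plant a known password inside workunit 0 and find it again.
target = hashlib.new("md4", index_to_password(12345).encode()).hexdigest()
print(search_workunit(0, target))
```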

62
MD4 Search Problem Formulation
  • Problem is ideally suited to the public computing
    architecture
  • Computationally intensive
  • Independent tasks
  • Low communication requirements

63
MD4 Search Results
  • Parallel runtimes are measured versus an
    extrapolated sequential runtime based on the time
    needed for computing passwords of one workunit
  • Parallel implementation takes on the additional
    load of scheduling and communication costs

64
MD4 Search Runtime
65
MD4 Search Runtime
66
MD4 Search Runtime
67
MD4 Search Parallel Runtime
68
MD4 Search Runtime
  • All three show runtimes significantly lower than
    sequential
  • Ant Colony and FCFS-5 show similar runtimes lower
    than FCFS-1
  • FCFS-5 shows erratic runtime due to processing
    and reporting five workunits at a time

69
MD4 Search Speedup Comparison
70
MD4 Search Speedup
  • FCFS-1 quickly approaches and maintains a lower
    peak speedup level due to communication overhead
    and delay from scheduling requests
  • FCFS-1 also suffers from reduced parallelism due
    to inability to exploit local parallelism on the
    quad processor system
  • FCFS-5 erratically approaches higher speedup
    level
  • Ant colony approaches peak speedup level with a
    similar pattern to FCFS-1 with a level similar to
    FCFS-5

71
MD4 Search Workunit Distribution
72
MD4 Search Workunit Distribution
  • Quad processor system underutilized with FCFS-1
    algorithm
  • Remaining systems evenly distributed for all
    three scheduling algorithms
  • Lower speed workstations receive proportionally
    smaller workloads

73
MD4 Search Conclusion
  • MD4 search is ideally suited to the public
    computing architecture
  • Calculation benefits from larger workloads
    assigned to nodes to reduce communication
    overhead
  • Ant Colony and FCFS-5 perform similarly with
    FCFS-1 performing poorly

74
Avalanche Photodiode Gain and Impulse Response
  • Avalanche Photodiodes (APDs) are used as
    photodetectors in long-haul fiber-optic systems
  • The gain and impulse response of APDs is a
    stochastic process with a random shape and
    duration
  • This experiment calculates the joint probability
    distribution function (PDF) of APD gain and
    impulse response

75
APD Problem Formulation
  • The joint PDF of APD gain and impulse response is
    based on the position of an input carrier
  • This input carrier causes ionization on the APD
    leading to additional carriers within a
    multiplication region
  • This avalanche effect leads to a gain in carriers
    over time
  • Due to this avalanche effect, the joint PDF can
    be calculated iteratively based on the
    probability of a carrier ionizing and in turn
    causing additional impacts and ionizations
    creating new carriers

76
APD Problem Formulation
  • BOINC implementation parallelizes calculation of
    the PDF for any carrier position in 360° of the
    unit circle
  • 360 workunits are created corresponding to each
    of these positions using identical parameters
  • The result of each workunit is a matrix of
    results with all values for all positions of a
    carrier and the impulse response for all times

77
APD Runtime
  • Sequential runtime is based on extrapolating
    total runtime from the average CPU time of a
    single workunit
  • All three parallel schedules show runtimes
    significantly lower than sequential
  • Ant Colony and FCFS-5 show similar runtimes lower
    than FCFS-1
  • FCFS-5 shows erratic runtime due to processing
    and reporting five workunits at a time

78
APD Runtime
79
APD Runtime
80
APD Runtime
81
APD Parallel Runtime
82
APD Runtime
  • Ant Colony has lowest runtime followed by FCFS-1
    and finally by FCFS-5
  • Note the spike in runtime for FCFS-5 at the end
    of the calculation
  • Long runtime of individual workunits accounts for
    this spike at the end of the calculation for
    FCFS-5 when pool of workunits is exhausted

83
APD Speedup Comparison
84
APD Speedup
  • Large fluctuations at the beginning of the
    calculation likely due to constrained bandwidth
    for output data
  • Bandwidth constraint leaves all nodes performing
    similarly except for the single local node

85
Testing Environment
86
APD Workunit Distribution
87
APD Workunit Distribution
  • Local workstation is highest performer of the
    nodes
  • Quad processor is weakest performer
  • Shares outbound bandwidth with most other nodes
  • Constrained bandwidth of single network interface
    dominates any benefit from local parallelism
  • Workunits on other nodes randomly distributed due
    to contention for communication medium
  • Ant colony allocates the fewest workunits to the
    quad processor and most workunits to the local
    node

88
APD Conclusion
  • APD experiment focuses on the impact of
    communication overhead due to output data on
    scheduling strategy
  • All three offer significant speedup over
    sequential with FCFS-5 performing the worst
  • Ant colony outperforms both naïve algorithms by
    an increased allocation of work to the best
    performing node
  • Ant colony benefits from reserving more workunits
    in the work pool for higher performing nodes at
    the end of the calculation

89
Gene Sequence Alignment
  • Problem from bioinformatics: finding the best
    alignment for two sequences of genes based on
    matching of bases and penalties for insertion of
    gaps in either sequence
  • Alignments of two sequences are scored to
    determine the best alignment
  • Different alignments can offer different scores

90
Gene Sequence Alignment
  • Given two sequences
  • A bonus is given for a match in the sequences
  • A penalty is applied for a mismatch

91
Gene Sequence Alignment
  • Sequences can be realigned by inserting gaps
  • Gaps are penalized
  • Resulting scores will differ depending on where
    gaps are inserted

92
Sequence Alignment Problem Formulation
  • Finding the best possible alignment is based on a
    dynamic programming algorithm
  • A scoring matrix is calculated to simultaneously
    evaluate all possible alignments
  • Calculating the scoring matrix steps through each
    position and determines the score for all
    combinations of matches and gaps (see the sketch
    below)
  • Once calculated, the best score can be found and
    backtracked to determine the alignment
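
A minimal sketch of the scoring-matrix calculation described above, in the
style of Needleman-Wunsch global alignment; the match, mismatch, and gap
values are illustrative assumptions since the talk does not give its exact
scoring parameters:

```python
MATCH, MISMATCH, GAP = 2, -1, -2

def score_matrix(a: str, b: str):
    n, m = len(a), len(b)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = i * GAP                  # leading gaps in b
    for j in range(1, m + 1):
        S[0][j] = j * GAP                  # leading gaps in a
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = S[i - 1][j - 1] + (MATCH if a[i - 1] == b[j - 1] else MISMATCH)
            up = S[i - 1][j] + GAP         # gap inserted in b
            left = S[i][j - 1] + GAP       # gap inserted in a
            S[i][j] = max(diag, up, left)  # each entry depends on 3 neighbors
    return S

S = score_matrix("GATTACA", "GCATGCA")
print(S[-1][-1])                           # best global alignment score
```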

93
Sequence Alignment Problem Formulation
  • Each entry in the scoring matrix depends on
    adjacent neighbors from the position before
  • These dependencies create a pattern depicted in
    the diagram

94
Sequence Alignment Problem Formulation
  • The dependencies of the scoring matrix make
    parallelization difficult
  • Nodes cannot compute scores until previous
    dependencies are satisfied
  • Maximum parallelism can be achieved by
    calculating the scores in a diagonal-major
    (wavefront) fashion, as sketched below
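
A small sketch of the diagonal-major (wavefront) traversal: every cell on an
anti-diagonal depends only on earlier diagonals, so each diagonal's cells
could be computed in parallel (shown sequentially here for clarity):

```python
def antidiagonal_order(n: int, m: int):
    """Yield, per wavefront step, the (i, j) cells of an n x m matrix that are
    independent of each other and ready to compute."""
    for d in range(n + m - 1):
        yield [(i, d - i) for i in range(max(0, d - m + 1), min(n - 1, d) + 1)]

# Example on a 4x5 matrix: parallelism grows up to the longest diagonal,
# then shrinks again, matching the runtime wave described on slide 97.
for step, cells in enumerate(antidiagonal_order(4, 5)):
    print(step, len(cells), cells)
```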

95
Sequence Alignment Problem Formulation
  • BOINC implementation only measures calculation
    and storage of the solution matrix
  • Does not include finding the maximum score and
    backtracking through the alignment
  • Solution matrix is left on the client and not
    transferred to the central server
  • Problem calculates the solution matrix for
    aligning two generated sequences each of length
    100,000

96
Sequence Alignment Runtime
97
Sequence Alignment Runtime
  • Runtime curves show a slight wave, beginning with
    a decrease in per-unit runtime and later
    increasing again
  • Due to the wavefront completion of tasks in the
    diagonal major computation leading to increasing
    parallelism up to the longest diagonal
  • After this midpoint, parallelism decreases

98
Diagonal Major Execution
99
Sequence Alignment Runtime
100
Sequence Alignment Runtime
101
Sequence Alignment Parallel Runtime
102
Sequence Alignment Speedup Comparison
103
Sequence Alignment Speedup
  • FCFS-1 shows a steady curve reflecting gradual
    increase in parallelism due to available tasks
    and a steady decrease in parallelism as the
    wavefront passes the largest diagonal of the
    calculation
  • FCFS-5 shows a more gradual incline at the
    beginning of the calculation and steeper decline
    toward the end
  • Ant colony shows a steeper incline and gradual
    decline

104
Sequence Alignment Speedup
  • FCFS-5 enjoys less parallelism at the beginning
    of the calculation due to allocating many
    workunits to nodes requesting work with such a
    small pool to draw from
  • As more workunits become available, FCFS-5's
    aggressive scheduling works to its advantage
    until workunits begin to become exhausted again
  • FCFS-1 is conservative in scheduling throughout
  • Ant Colony begins conservatively but occasionally
    sends multiple workunits to a node
  • This leads to a quicker buildup of generated
    workunits early in the calculation
  • Later in the calculation, Ant Colony schedules
    more aggressively and eventually exhausts the
    workunit pool similarly to FCFS-5

105
Sequence Alignment Workunit Distribution
106
Sequence Alignment Workunit Distribution
  • Mostly random distribution due to task dependency
    dominating the calculation
  • All three scheduling techniques show little
    preference for any node based on communication or
    processing resources

107
Sequence Alignment Conclusion
  • Ant colony provides an interesting mix of
    attributes of both FCFS-1 and FCFS-5 when
    scheduling this computation
  • It shares the conservative scheduling of FCFS-1
    and later aggressive scheduling similar to FCFS-5
  • All three parallel computations offer only a
    slight benefit to the problem due to the task
    dependency structure
  • It should be noted that the sequential algorithm
    would require a machine with 37.3 GB of memory if
    no memory-reduction techniques are used in storing
    the solution matrix
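
The 37.3 GB figure is consistent with storing one 4-byte score for each cell
of the full 100,000 × 100,000 solution matrix; the 4-byte entry size is an
assumption on my part:

```python
# Assumed accounting behind the 37.3 GB figure: one 4-byte score per cell of
# the 100,000 x 100,000 solution matrix (boundary rows and columns ignored).
entries = 100_000 * 100_000
bytes_total = entries * 4
print(bytes_total / 2**30)   # ~37.25 GiB
```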

108
Performance Summary
  • Ant colony scheduling offers top performance in
    all three experiments
  • FCFS-1 and FCFS-5 offer varying performance
    levels depending on the attributes of the target
    application
  • Ant colony adapts to match or better the best of
    the competing algorithms
  • All three offer acceptable schedules for the
    parallel applications without relying on client
    supplied information

109
Problems with BOINC
  • Client-server model
  • Traffic congestion problem
  • Server too busy to handle requests
  • Solution: peer-to-peer model

110
Peer to Peer Platform
  • Written in Java
  • Uses the JXTA toolkit to provide communication
    services and peer to peer overlay network
  • Platform provides basic object and messaging
    primitives to facilitate peer to peer application
    development
  • Messages between objects can be transparently
    passed to objects on other peers or to local
    objects
  • Each object in an application runs in its own
    thread

111
Distributed Scheduling
  • Applications provide their own scheduling and
    decomposition of work
  • Typical pattern of application design (sketched
    after this list)
  • A factory object generates work units from
    decomposed job data
  • Worker objects perform computation
  • Result objects consolidate and report results
  • The work factory handles distributing work to
    cooperating peers and consolidating results
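
A hedged sketch of the factory/worker/result pattern above, with each object
running in its own thread and communicating through message queues. This is a
Python illustration, not the actual Java/JXTA platform; the class and queue
names are assumptions:

```python
import queue
import threading

class Worker(threading.Thread):
    """Pulls work units from its inbox, computes, pushes results onward."""
    def __init__(self, inbox, results):
        super().__init__(daemon=True)
        self.inbox, self.results = inbox, results

    def run(self):
        while True:
            unit = self.inbox.get()
            if unit is None:                          # factory's shutdown signal
                break
            self.results.put((unit, unit * unit))     # stand-in computation

def factory(job, inbox, n_workers):
    """Decompose the job into work units and hand them to workers."""
    for unit in job:
        inbox.put(unit)
    for _ in range(n_workers):
        inbox.put(None)

inbox, results = queue.Queue(), queue.Queue()
workers = [Worker(inbox, results) for _ in range(3)]
for w in workers:
    w.start()
factory(range(10), inbox, n_workers=3)
for w in workers:
    w.join()
collected = [results.get() for _ in range(10)]   # a result object would consolidate these
print(sorted(collected))
```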

112
Distributed Sequence Alignment
  • Sequence Alignment job begins as a comparison of
    two complete sequences
  • Work factory breaks the complete comparison into
    work units up to a minimum size
  • A work unit can begin processing as soon as its
    dependencies are available
  • Initially, only the upper left corner work unit
    of the result matrix has all dependencies
    satisfied
  • When a work unit is completed, its adjacent work
    units become eligible for processing (see the
    sketch below)
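
A small sketch of the dependency tracking described above: a block of the
result matrix becomes eligible once its upper and left neighbor blocks are
complete. The block indexing and function names are assumptions for
illustration:

```python
def make_tracker(rows, cols):
    done = set()

    def eligible():
        """Blocks whose dependencies are satisfied but that are not yet done."""
        ready = set()
        for i in range(rows):
            for j in range(cols):
                if (i, j) in done:
                    continue
                deps_ok = (i == 0 or (i - 1, j) in done) and \
                          (j == 0 or (i, j - 1) in done)
                if deps_ok:
                    ready.add((i, j))
        return ready

    def complete(block):
        done.add(block)

    return eligible, complete

eligible, complete = make_tracker(4, 4)
print(eligible())        # initially only the upper-left block: {(0, 0)}
complete((0, 0))
print(eligible())        # now its right and lower neighbors: {(0, 1), (1, 0)}
```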

113
Distributed Sequence Alignment
  • The distributed application attempts to complete
    work units in squares of four adjacent work units
  • As more work units are completed, larger and
    larger squares of work units become eligible for
    processing

[Diagram: grid of work units labeled Complete, Eligible, Processing, and Not Ready]
114
Distributed Sequence Alignment
  • Peers other than the peer initially starting the
    job will begin with no work to complete
  • The initial peer will broadcast a signal
    signifying availability of eligible work units
  • Peers will attempt to contact a peer advertising
    work in order to request work

115
Distributed Sequence Alignment
  • A peer with available work will distribute the
    largest amount of work eligible and mark the work
    unit as remotely processed
  • When a peer completes all work in a work unit it
    will report to the peer who initially assigned
    the work
  • Only adjacent edges of results necessary for
    computing new work units are reported to reduce
    communication
  • Complete results are stored at the peer
    performing the computation

116
Distributed Sequence Alignment
  • After reporting the completion of a work unit a
    peer will seek new work from all peers
  • Once the initial peer completes all work the job
    is done
  • Peers could then be queried to report maximum
    scores and alignments

117
Experiment
  • Peer to peer algorithm was tested aligning two
    sequences of 160,000 bases each
  • Alignments used a typical scoring matrix and gap
    penalty
  • The minimum work unit size for decomposition of
    the total job was 10,000 bases for each sequence
    resulting in 256 work units

118
Experiment
  • The distributed system was executed on a local
    area network with 2, 4, and 6 fairly similar
    peers
  • The results were compared to a sequential
    implementation of the algorithm run on one of the
    peers
  • The sequential implementation does a
    straightforward computation of the complete
    matrix
  • Disk is used to periodically save the sections of
    the matrix since the complete matrix would
    exhaust available memory

119
Runtime
  • Runtime is reduced from 1 hour and 9 minutes to
    28 minutes
  • This is about 2.4 times faster than sequential
  • The most dramatic drop is at 2 nodes, which runs
    1.75 times faster at 39 minutes
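
As a rough consistency check (assuming speedup is simply sequential time
divided by parallel time, with the reported 1 hour 9 minutes as the
sequential baseline):

```python
# Speedup = sequential time / parallel time, in minutes.
seq = 69            # 1 hour 9 minutes
print(seq / 28)     # ~2.46, reported as "about 2.4 times faster"
print(seq / 39)     # ~1.77, reported as "1.75 times faster" at 2 nodes
```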

120
Node Efficiency
  • The first peer generally has the highest
    efficiency
  • Average efficiency drops as more nodes are added

121
Analysis
  • Findings are in line with the structure of the
    sequential problem
  • Due to dependencies of tasks on previous tasks,
    many nodes must initially remain idle as some
    nodes complete dependent tasks
  • As more work becomes eligible for processing,
    more nodes can work simultaneously
  • Later in the computation fewer work units become
    eligible for computation due to fewer dependencies

122
Comparison With CS Model
  • Previous work performed the same computation with
    a client-server platform, BOINC
  • Aligned two sequences of 100,000 bases each
  • Work unit size was 2,000 bases for each sequence
  • 2500 work units
  • Performed on 30 nodes

123
Comparison With CS Model
  • Direct comparison is difficult
  • Previous job was smaller but with smaller
    granularity increasing number of work units
  • Previous sequential portions of alignment seem to
    be inefficient compared to current implementation
    based on overall sequential job completion time

124
Comparison With CS Model
  • Best comparison is overall runtime reduction
    factor
  • Runtime reduction factor compares distributed
    completion time with similar sequential
    completion time
  • Factors out differing performance of sequential
    aspects of computation
  • BOINC implementation achieved a reduction factor
    of only about 1.2 times sequential
  • Peer to peer achieved 2.4 with only 1/5 the nodes

125
Comparison With CS Model
  • BOINC implementation was impeded by a central
    server which was using a slower link to most
    nodes compared to the peer to peer configuration
  • Peer to peer implementation also shows signs of
    diminishing returns as more nodes are added
  • Peer to peer utilizes all participating systems,
    while client-server normally uses a server that
    does not participate in the computation outside of
    scheduling

126
Conclusions
  • Our scheduling strategy is effective on BOINC
  • P2P is more effective than BOINC Client-server
    model
  • Public computing can solve problems with large
    computing-power requirements and huge memory
    demands
  • Potentially replaces supercomputing for certain
    applications (large-grained)

127
Future Work
  • Measure the impact of fault tolerance on these
    scheduling algorithms
  • Measure the impact of work redundancy and work
    validation
  • Continue to benchmark the P2P implementation with
    more nodes
  • Implement Ant Colony Scheduling on P2P model

128
Future Work
  • Currently a peer only seeks more work when it has
    completed all of its work
  • It does not seek work when waiting for reports
    from peers to which it has distributed work
  • Allowing a peer to seek work while it is waiting
    for other peers may increase utilization
  • Create a more direct comparison with client
    server model
  • Additional applications and improvements to the
    base platform

129
Thank You
  • Questions?