Title: On Grid Performance Evaluation using Synthetic Workloads
1 - On Grid Performance Evaluation using Synthetic Workloads
Carsten Franke, Alexander Papaspyrou, Lars Schley, Baiyi Song, and Ramin Yahyapour
Alexandru Iosup, Dick Epema
PDS Group, ST/EWI, TU Delft
JSSPP 2006
2 - Outline
- A Short Introduction to Grid Computing
- On Grid Performance Evaluation
- Experimental Environments
- Performance Indicators
- General Workload Modeling
- Grid-Specific Workload Modeling
- The GrenchMark Framework
- Future Work
- Conclusions
3 - A Short Introduction to Grid Computing
- Typical grid environment
- Applications !
- Unitary, composite
- Data
- Resources
- Compute (Clusters)
- Storage
- (Dedicated) Network
- Virtual Organizations, Projects
- Groups, Users
- Grids vs. parallel production environments
- Dynamic
- Heterogeneous
- Very large-scale (world)
- No central administration
- => Most resource management problems are NP-hard
4 - Experimental Environments: Real-World Testbeds
- Real-World Testbed
- DAS, NorduGrid, Grid3/OSG, Grid5000
- Pros
- True performance, also shows it works!
- Infrastructure in place
- Cons
- Time-intensive
- Exclusive access (repeatability)
- Controlled environment problem (limited scenarios)
- Workload structure (little or no realistic data)
- What to measure (new environment)
5 - Experimental Environments: Simulated and Emulated Testbeds
- Simulated and Emulated Testbeds
- GridSim, SimGrid, GangSim, MicroGrid
- Essentially a trade-off between precision and speed
- Pros
- Exclusive access (repeatability)
- Controlled environment (unlimited scenarios)
- Cons
- Synthetic grids: what to generate? How to generate? Clusters, disks, network, VOs, groups, users, applications, etc.
- Workload structure (little or no realistic data)
- What to measure (new environment)
- Validity of results (accuracy vs. time)
6 - Grid Performance Evaluation: Current Practice
- Performance Indicators
- Define my own metrics, or use U and AWT/ART, or both
- Workload Structure
- Run my own workload, or use traces that are not validated by peer researchers => do not make comparisons!
- Run benchmarks from typical parallel production environments
- Mostly an "all users are created equal" assumption
Need a common performance evaluation framework for Grid
7 - Grid Performance Evaluation: Current Issues
- Performance Indicators
- What are the metrics for the new environment?
- Workload Structure
- Which general aspects are important?
- Which Grid-specific aspects need to be addressed?
Need a common performance evaluation framework for Grid
8 - Performance Indicators
- Time-, Resource-, and System-Related Metrics
- Traditional: utilization, A(W)RT, A(W)WT, A(W)SD
- New: waste, fairness (or service quality reliability)
- Workload Completion and Failure Metrics
- In Grids, functionality may be even more important than performance
- Workload Completion (WC)
- Task and Enabled Task Completion (TC, ETC)
- System Failure Factor (SFF)
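The traditional indicators on this slide can be illustrated with a small sketch. It assumes that the "W" in A(W)RT and A(W)WT weights each job by its resource consumption (processors times run time), which is one common convention; the slides do not define the weights. The `Job` record and the `metrics` function are hypothetical names, and the slowdown shown is the plain unweighted average.

```python
from dataclasses import dataclass


@dataclass
class Job:
    submit: float  # submission time (s)
    start: float   # start of execution (s)
    end: float     # completion time (s)
    procs: int     # number of processors used

    @property
    def run(self) -> float:
        return self.end - self.start

    @property
    def wait(self) -> float:
        return self.start - self.submit

    @property
    def response(self) -> float:
        return self.end - self.submit

    @property
    def area(self) -> float:
        # Assumed weight: resource consumption = processors x run time.
        return self.procs * self.run


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def metrics(jobs, total_procs):
    """Compute utilization, AWRT, AWWT and average slowdown for finished jobs."""
    areas = [j.area for j in jobs]
    span = max(j.end for j in jobs) - min(j.submit for j in jobs)
    return {
        "utilization": sum(areas) / (total_procs * span),
        "AWRT": weighted_mean([j.response for j in jobs], areas),
        "AWWT": weighted_mean([j.wait for j in jobs], areas),
        # Unweighted slowdown; jobs with zero run time must be filtered first.
        "ASD": sum(j.response / j.run for j in jobs) / len(jobs),
    }


if __name__ == "__main__":
    demo = [Job(0, 0, 100, 4), Job(10, 100, 150, 8)]
    print(metrics(demo, total_procs=8))
```

The completion and failure metrics (WC, TC, ETC, SFF) are left out because the slide gives no definitions for them.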
9 - General Aspects for Workload Modeling
- User/Group/VO model
- Detailed modeling for top-5/10 users, then clustering (use squashed area to group)
- Submission patterns
- Yearly, monthly, weekly, daily
- Do daily patterns exist? (Are Grids truly global?)
- Temporal patterns
- Repeated submission (batches of jobs)
- Job dependencies (composite applications common in Grid(?))
- Feedback
- Empirical rules (don't submit jobs when the system is busy). But: reactive submission tools, co-allocators, evolving applications, etc.
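A daily submission pattern can be sketched as a Poisson arrival process whose rate varies over a 24-hour cycle, sampled by thinning. This is only one possible model, not the one used by the authors; the cosine-shaped `daily_rate` and all of its parameters (`base`, `amplitude`, `peak_hour`) are assumptions chosen for illustration.

```python
import math
import random


def daily_rate(t_hours, base=2.0, amplitude=0.8, peak_hour=14.0):
    """Hypothetical arrival rate (jobs/hour) with a 24-hour cycle."""
    phase = 2 * math.pi * (t_hours - peak_hour) / 24.0
    return base * (1.0 + amplitude * math.cos(phase))


def generate_arrivals(horizon_hours, rate=daily_rate, rate_max=3.6, seed=0):
    """Sample arrival times (hours) from a time-varying Poisson process.

    Uses thinning: candidates are drawn at the constant rate `rate_max`
    (which must be an upper bound on `rate`) and each candidate is kept
    with probability rate(t) / rate_max.
    """
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate_max)
        if t >= horizon_hours:
            return arrivals
        if rng.random() < rate(t) / rate_max:
            arrivals.append(t)


if __name__ == "__main__":
    times = generate_arrivals(24 * 7)
    print(len(times), "jobs submitted in one simulated week")
```

Setting `amplitude=0` removes the daily cycle, which gives a simple way to compare a "truly global" grid against one with a pronounced day/night pattern.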
10 - Grid-Specific Workload Modeling: Computation Management
- Processor co-allocation
- Fixed, non-fixed, semi-fixed jobs
- Job flexibility
- Moldable, evolvable, flexible, -ble
- Other aspects
- Background load: define top jobs (by consumption), model the rest as background load
- Project stage
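The background-load idea can be sketched as follows, assuming that "consumption" means processors times run time and that the remaining jobs are collapsed into a single aggregate figure. The function name `split_background` and the dictionary keys are hypothetical.

```python
def split_background(jobs, top_k):
    """Keep the top_k jobs by consumption; aggregate the rest.

    Each job is a dict with "procs" and "runtime" keys (assumed layout).
    Returns (top_jobs, background_consumption).
    """
    def consumption(job):
        return job["procs"] * job["runtime"]

    ranked = sorted(jobs, key=consumption, reverse=True)
    top, rest = ranked[:top_k], ranked[top_k:]
    return top, sum(consumption(j) for j in rest)


if __name__ == "__main__":
    trace = [
        {"id": 1, "procs": 4, "runtime": 100},
        {"id": 2, "procs": 1, "runtime": 10},
        {"id": 3, "procs": 2, "runtime": 30},
    ]
    top, background = split_background(trace, top_k=1)
    print([j["id"] for j in top], background)
```

A fuller model would also spread the aggregated consumption over time instead of reporting a single total.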
11 - Grid-Specific Workload Modeling: Data Management
- Clearly Defined I/O Requirements
- Files, streams,
- Data location and size
- Replicas
- Replica location
- Other aspects
-
12 - Grid-Specific Workload Modeling: Network Management
- Clearly Defined Network Requirements
- Bandwidth, latency,
- Communication pattern
- Special Situations
- Dedicated paths, other QoS
- Other aspects
- Background load
13 - Grid-Specific Workload Modeling: Locality/Origin Management
- Job issuer and execution site
- Not all VOs are created equal!
- Two-level view: which VO generates the next job? Within a VO, which user generates the next job?
- Three-level view, multi-level view (Project, VO, Group, User)
- (Usage) Service Level Agreements
- Use my system 50% for 7 days, or 20% for 30 days
- Dedicated paths, other QoS
- Other aspects
- Background load pertaining to same (u)SLA
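The two-level view can be sketched as two weighted draws: first the VO, then a user inside that VO. The weights, VO names, and user names below are invented for illustration; in practice they would be estimated from traces.

```python
import random


def next_job_issuer(vo_weights, user_weights, rng):
    """Pick (vo, user) for the next job using a two-level weighted draw.

    vo_weights:   {vo: relative share of submitted jobs}
    user_weights: {vo: {user: relative share within that VO}}
    """
    vos = list(vo_weights)
    vo = rng.choices(vos, weights=[vo_weights[v] for v in vos])[0]
    users = list(user_weights[vo])
    user = rng.choices(users, weights=[user_weights[vo][u] for u in users])[0]
    return vo, user


if __name__ == "__main__":
    rng = random.Random(7)
    # Hypothetical shares: not all VOs (or users) are created equal.
    vo_w = {"physics": 0.7, "biology": 0.3}
    user_w = {
        "physics": {"alice": 0.9, "bob": 0.1},
        "biology": {"carol": 1.0},
    }
    print([next_job_issuer(vo_w, user_w, rng) for _ in range(5)])
```

The same pattern extends to the multi-level view by adding one draw per level (Project, VO, Group, User).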
14 - Grid-Specific Workload Modeling: Failure Modeling
- Error level
- Infrastructure
- Middleware
- Application
- User
- Fault tolerance scheme for submitted jobs
- Capture the system feedback in the model
- Other aspects
-
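As an illustration of combining error levels with a fault tolerance scheme, the sketch below assumes independent per-level failure probabilities and a plain resubmission policy with a retry limit. Both are simplifying assumptions, and `run_with_retries` is a hypothetical helper, not part of any tool named in these slides.

```python
import random

# Error levels as listed on the slide, checked in this order.
ERROR_LEVELS = ("infrastructure", "middleware", "application", "user")


def run_with_retries(failure_prob, max_retries, rng):
    """Simulate one job under simple resubmission.

    failure_prob: {level: probability that this level fails in one attempt}
    Returns (completed, attempts, errors), where errors lists the level
    that caused each failed attempt.
    """
    errors = []
    for attempt in range(1, max_retries + 2):
        failed_level = next(
            (lvl for lvl in ERROR_LEVELS
             if rng.random() < failure_prob.get(lvl, 0.0)),
            None,
        )
        if failed_level is None:
            return True, attempt, errors
        errors.append(failed_level)
    return False, max_retries + 1, errors


if __name__ == "__main__":
    rng = random.Random(3)
    probs = {"infrastructure": 0.05, "middleware": 0.10, "application": 0.02}
    print(run_with_retries(probs, max_retries=2, rng=rng))
```

The returned error list is the kind of system feedback that a workload model could use, for example to stop resubmitting after repeated user-level errors.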
15 - Grid-Specific Workload Modeling: Economic Models
- Pricing
- Application cost
- Application utility
- Other aspects
-
16 - GrenchMark: a Framework for Analyzing, Testing, and Comparing Grids
- What's in a name? grid benchmark => working towards a generic tool for the whole community: help standardizing the testing procedures, but benchmarks are too early; we use synthetic grid workloads instead
- What's it about? A systematic approach to analyzing, testing, and comparing grid settings, based on synthetic workloads
- A set of metrics for analyzing grid settings
- A set of representative grid applications
- Both real and synthetic
- Easy-to-use tools to create synthetic grid workloads
- Flexible, extensible framework
17 - GrenchMark Overview: Easy to Generate and Run Synthetic Workloads
18 - ... but More Complicated Than You Think
- Workload structure
- User-defined and statistical models
- Dynamic job arrival
- Burstiness and self-similarity
- Feedback, background load
- Machine usage assumptions
- Users, VOs
- Metrics
- A(W) Run/Wait/Resp. Time
- Efficiency, MakeSpan
- Failure rate !
- (Grid) notions
- Co-allocation, interactive jobs, malleable,
moldable,
- Measurement methods
- Long workloads
- Saturated / non-saturated system
- Start-up, production, and cool-down scenarios
- Scaling workload to system
- Applications
- Synthetic
- Real
- Workload definition language
- Base language layer
- Extended language layer
- Other
- Can use the same workload for both simulations and real environments
19 - GrenchMark: Iterative Research Roadmap
20 - GrenchMark: Iterative Research Roadmap
Simple functional system: A. Iosup, J. Maassen, R.V. van Nieuwpoort, D.H.J. Epema, Synthetic Grid Workloads with Ibis, KOALA, and GrenchMark, CoreGRID IW, Nov 2005.
21 - GrenchMark: Iterative Research Roadmap
Open-GrenchMark: Community Effort
Complex extensible system: A. Iosup, D.H.J. Epema, GrenchMark: A Framework for Analyzing, Testing, and Comparing Grids, IEEE CCGrid'06, May 2006.
22 - Take home message
- Performance Evaluation of Grid Systems
- need a common performance evaluation framework for grids
- need real grid traces (scheduling, accounting, monitoring, etc.)
- need more research on workload modeling and performance indicators
- Performance indicators
- failure metrics as important as traditional performance metrics
- Workload modeling
- generic workload modeling needs validation based on real grid traces
- computation/data/network management
- locality/origin management
- failure modeling
- economic models
- GrenchMark
- generic tool for the whole community
- generates diverse grid workloads
- easy-to-use, flexible, portable, extensible,
23 - Thank you!
- Questions? Remarks? Observations? All welcome!
- GrenchMark: http://grenchmark.st.ewi.tudelft.nl/
25 - Representative Grid applications (3/4): Composite DAG-based
- DAG-based applications
- Real DAG
- Chain of tools
- Try to model real or predicted (use) cases
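A DAG-based composite application can be represented as a mapping from each task to the tasks it depends on. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to derive a valid submission order and the groups of tasks that may run concurrently. The tool names in the chain are invented for illustration.

```python
from graphlib import TopologicalSorter


def submission_order(dependencies):
    """Return one valid execution order for {task: set_of_prerequisites}."""
    return list(TopologicalSorter(dependencies).static_order())


def waves(dependencies):
    """Group tasks into successive waves; tasks in a wave can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


if __name__ == "__main__":
    # Hypothetical chain of tools: each step consumes the previous output.
    chain = {
        "transform": {"fetch"},
        "analyze": {"transform"},
        "report": {"analyze"},
    }
    print(submission_order(chain))

    # Hypothetical diamond-shaped DAG with two independent middle tasks.
    diamond = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
    print(waves(diamond))
```

`TopologicalSorter` raises `CycleError` if the dependencies contain a cycle, which is a convenient sanity check on generated workloads.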