End-to-end vertical slice

Slides: 9
Provided by: kkee8

1
End-to-end vertical slice
  • Archana Ganapathi, Edward Hunter, Kim Keeton,
    George Porter

2
Motivating problem areas
  • Detecting and diagnosing problems
      • Why did a recent component upgrade cause problems?
      • Where is the scalability hitch in a storm application?
      • What are the implications of a large-scale failure?
  • Configuring the system
      • How to tune 623 DB2 parameters to get desired performance?
  • Verifying high-level expectations about system behavior
      • Has a performance SLA been met?
      • Has a user's data privacy been maintained?
      • Have all copies of a defunct document been expunged?
  • Verifying experimental system correctness
      • Is time dilation in RAMP accurate?

4
Understanding effects of changes
  • What state needs to be captured?
  • How to represent state to support efficient
    comparisons?
  • What support should be provided for human
    debugging?

5
What state needs to be captured?
  • Goal: track changes to system state to enable user to understand
      • What changed since system last worked well
      • How problematic config differs from working configs from similar systems
      • Whether a bug fix actually solves a problem
  • State to capture
      • Persistent state
          • Hardware configuration (e.g., memory and disk capacity, processor speed)
          • Versions for apps, middleware, OS, firmware, etc.
          • Configuration settings for apps, OS, router, etc.
          • User data?
      • Dynamic state
          • Workload inputs (e.g., request rate, request mix)
          • Health of system components (e.g., hardware failure)
          • System response (e.g., app performance, CPU utilization, disk free space)
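
One way to picture the state listed above is as a timestamped snapshot with a persistent part and a dynamic part. The sketch below is only an illustration: the class names, field names, and the `changed_settings` helper are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class PersistentState:
    # Hypothetical grouping of the "persistent state" bullets.
    hardware: dict = field(default_factory=dict)   # e.g. memory, disk, CPU speed
    versions: dict = field(default_factory=dict)   # apps, middleware, OS, firmware
    settings: dict = field(default_factory=dict)   # configuration parameters


@dataclass
class DynamicState:
    # Hypothetical grouping of the "dynamic state" bullets.
    workload: dict = field(default_factory=dict)   # request rate, request mix
    health: dict = field(default_factory=dict)     # component failures
    response: dict = field(default_factory=dict)   # performance, utilization


@dataclass
class Snapshot:
    timestamp: float
    persistent: PersistentState
    dynamic: DynamicState


def changed_settings(old, new):
    """Return {name: (old_value, new_value)} for settings that differ."""
    a, b = old.persistent.settings, new.persistent.settings
    return {k: (a.get(k), b.get(k))
            for k in sorted(set(a) | set(b))
            if a.get(k) != b.get(k)}
```

Comparing the snapshot taken when the system last worked well against the current one then answers the "what changed" question for configuration settings; the same pattern would apply to versions and hardware.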

6
How to represent system state (1)?
  • Basic idea: multi-dimensional tree
  • Part 1: tree for persistent state
      • Lowest-level nodes contain hashes for individual configuration parameter values, software versions, etc.
      • Higher levels aggregate these hashes to represent subsystem state
      • Enables efficient change detection by looking for subsystems (subtrees) that diverge
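
The persistent-state tree described above resembles a Merkle-style hash tree. The following is a minimal sketch under that assumption, not the authors' implementation: the nested-dict input format, the choice of SHA-256, and the function names `build_tree` and `diverging_paths` are all illustrative.

```python
import hashlib


def leaf_hash(value):
    """Hash a single parameter value or version string."""
    return hashlib.sha256(repr(value).encode("utf-8")).hexdigest()


def build_tree(state):
    """Build a hash tree from a nested dict of subsystems and parameters."""
    if not isinstance(state, dict):
        return {"hash": leaf_hash(state), "children": None}
    children = {name: build_tree(sub) for name, sub in sorted(state.items())}
    h = hashlib.sha256()
    for name, child in children.items():
        # A subsystem's hash aggregates its children's names and hashes.
        h.update(name.encode("utf-8"))
        h.update(child["hash"].encode("utf-8"))
    return {"hash": h.hexdigest(), "children": children}


def diverging_paths(a, b, path=()):
    """Return paths whose hashes differ, skipping identical subtrees."""
    if a is None or b is None:
        return [path]                      # present in only one tree
    if a["hash"] == b["hash"]:
        return []                          # whole subtree matches
    if a["children"] is None or b["children"] is None:
        return [path]                      # differing leaf
    out = []
    for name in sorted(set(a["children"]) | set(b["children"])):
        out.extend(diverging_paths(a["children"].get(name),
                                   b["children"].get(name),
                                   path + (name,)))
    return out


# Hypothetical configurations; values are made up for illustration.
working = {"os": {"kernel": "A", "swappiness": 60},
           "db": {"version": "1.0", "bufferpool_pages": 1000}}
broken = {"os": {"kernel": "A", "swappiness": 60},
          "db": {"version": "1.1", "bufferpool_pages": 1000}}
print(diverging_paths(build_tree(working), build_tree(broken)))
# [('db', 'version')]
```

Because the `os` subtree hashes match, the comparison never descends into it; only the diverging `db` subtree is explored.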

7
How to represent system state (2)?
  • Part 2: additional dimensions for capturing dynamic system state
      • Offered load, system health, system response
  • Capture at differing levels of granularity
      • Min, max ranges during version epoch
      • Average value over fixed time windows (e.g., 1 min)
  • Enables differentiation of good vs. bad system states
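
The two granularities above can be sketched as simple summaries over a stream of samples. This is an assumption-laden illustration: the `(timestamp_seconds, value)` sample format and the function names are hypothetical, and an "epoch" is taken here to mean all samples collected while one configuration version was in effect.

```python
from collections import defaultdict


def epoch_range(samples):
    """Coarse summary: (min, max) of a metric over one version epoch.

    Raises ValueError if samples is empty.
    """
    values = [v for _, v in samples]
    return min(values), max(values)


def window_averages(samples, window_s=60):
    """Finer summary: mean value per fixed-size time window.

    Returns {window_start_seconds: average}.
    """
    buckets = defaultdict(list)
    for t, v in samples:
        buckets[int(t // window_s)].append(v)
    return {w * window_s: sum(vs) / len(vs)
            for w, vs in sorted(buckets.items())}


# Made-up CPU utilization samples: (seconds since epoch start, percent).
cpu = [(0, 20.0), (30, 40.0), (60, 90.0), (90, 70.0)]
print(epoch_range(cpu))        # (20.0, 90.0)
print(window_averages(cpu))    # {0: 30.0, 60: 80.0}
```

Attaching such summaries to each version of the persistent-state tree is one way to label a configuration as "good" or "bad" by its observed behavior.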

8
Support for human debugging?
  • Visualization of tree structures
      • Highlight diverging areas
      • Flag problematic behavior
  • Repository of system states for similar systems
      • Enable comparison between similar systems à la PeerPressure
      • Similar configurations with no problems
      • Similar problems and their causes (recommended fixes?)
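
Comparison against a repository of similar systems could start from something as simple as the frequency heuristic below. It is a deliberately simplified stand-in and not PeerPressure's statistical ranking: it only asks, for each setting on the problematic system, what fraction of healthy peers share the same value. The function name and the flat-dict configuration format are assumptions.

```python
def rank_suspects(sick, healthy_peers):
    """Order the sick system's settings from most to least suspicious.

    A setting is more suspicious when fewer healthy peers share its value.
    healthy_peers must be a non-empty list of dicts.
    """
    ranked = []
    for key, value in sick.items():
        agree = sum(1 for peer in healthy_peers if peer.get(key) == value)
        ranked.append((agree / len(healthy_peers), key))
    ranked.sort()
    return [key for _, key in ranked]


# Made-up settings for illustration only.
sick = {"max_conns": 10, "cache_mb": 512, "log_level": "info"}
peers = [
    {"max_conns": 200, "cache_mb": 512, "log_level": "info"},
    {"max_conns": 200, "cache_mb": 512, "log_level": "debug"},
    {"max_conns": 150, "cache_mb": 512, "log_level": "info"},
]
print(rank_suspects(sick, peers))
# ['max_conns', 'log_level', 'cache_mb']
```

Here no healthy peer uses `max_conns = 10`, so that setting is flagged first, while `cache_mb` matches every peer and ranks last.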