Software Steering: Fuzzy Logic, Time Series, and Hidden Markov Models

1
Software Steering: Fuzzy Logic, Time Series,
and Hidden Markov Models
  • Daniel A. Reed
  • Department of Computer Science
  • University of Illinois
  • reed@cs.uiuc.edu
  • http://www-pablo.cs.uiuc.edu

2
SvPablo
  • A graphical source code browser and performance
    capture/correlation tool
  • Allows user to select loops and procedures to
    instrument in C, F77, F90 code. Automatic
    instrumentation for HPF via PGI performance
    interface.
  • Collects performance data and later displays it
    relative to source code line
  • Option for real-time data transmission via
    Autopilot tagged sensors (more later)

3
SvPablo GUI
4
The Next Frontier
  • Emerging applications are irregular
  • adaptive meshes
  • data dependent behavior
  • And they are dynamic
  • time varying resource demands
  • time varying resource availability
  • geographically distributed and heterogeneous
  • computational grids

5
Automatic Tuning Requirements
  • End-to-end, real-time data capture
  • multiple levels
  • hardware, system software, libraries,
    applications
  • multiple granularities
  • microseconds to hours
  • multiple sites
  • geographically distributed computations
  • Intelligent data analysis
  • qualitative feature extraction
  • Dynamic policy selection
  • interactive and automatic

6
Closed Loop Adaptive Control
  • Rationale
  • adaptive applications
  • dynamic demands
  • Research approach
  • monitor resource demands and responses
  • select policies based on observed behavior
  • implement policy changes locally and globally

7
Autopilot Components
  • Sensors and actuators
  • distributed measurement and software control
  • Attached functions
  • sensor/actuator local extension
  • Sensor/actuator manager(s)
  • publish/subscribe
  • Remote client(s) and client handlers
  • remote sensor/actuator interaction
  • Fuzzy logic decision procedures
  • distributed performance control
  • Standard performance daemons
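
The publish/subscribe role of the sensor/actuator manager can be illustrated with a minimal sketch. This is not the Autopilot API: the class `Manager`, its methods `register` and `lookup`, and the attribute names are all hypothetical, chosen only to show how a remote client might find sensors by attribute rather than by location.

```python
class Manager:
    """Hypothetical registry: sensors publish attributes, clients look them up."""

    def __init__(self):
        self._entries = []  # list of (attributes dict, sensor callable)

    def register(self, attributes, sensor):
        """A sensor publishes itself under a set of key/value attributes."""
        self._entries.append((dict(attributes), sensor))

    def lookup(self, **wanted):
        """Return every sensor whose attributes include all wanted pairs."""
        return [
            sensor
            for attrs, sensor in self._entries
            if all(attrs.get(k) == v for k, v in wanted.items())
        ]


manager = Manager()
manager.register({"host": "node1", "metric": "io_rate"}, lambda: 120.0)
manager.register({"host": "node2", "metric": "io_rate"}, lambda: 95.0)
manager.register({"host": "node1", "metric": "cpu_load"}, lambda: 0.7)

# A remote client subscribes by attribute and reads the matching sensors.
io_readings = [read() for read in manager.lookup(metric="io_rate")]
```

The client never names a host; it asks for a metric and receives whatever sensors currently publish it.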

8
Autopilot Components
[Diagram labels: Sensor, Actuator, Attached Function, Local Reduction, Manager, Remote Client, Client Handler, Decision. All built atop Globus.]
9
Autopilot Performance Sensors
  • Three modes
  • intrinsic (procedural)
  • extrinsic (threaded)
  • push (requested by external agent)
  • Two aspects
  • quantitative resource use
  • qualitative request patterns
  • Accessibility
  • wide area publish/subscribe attributes
  • automatic insertion

10
Autopilot Policy Actuators
  • Functions
  • remote process control and update
  • software steering
  • Activation points
  • synchronous (application controls)
  • asynchronous (external agent controls)
  • Building blocks
  • Nexus communication library
  • sensor/actuator registration infrastructure

11
Autopilot Manager Infrastructure
Remote Client Task(s)
Autopilot Manager
12
Autopilot Decision Procedures
  • Control mechanism issues
  • performance data is noisy
  • decisions must not oscillate
  • Classic control theory
  • presumes formalisms
  • resource policies often lack formalism
  • I/O policies and rules
  • network protocol selection
  • Fuzzy logic is an attractive alternative
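
To make the two control issues concrete, here is a minimal sketch of one common way to cope with noisy measurements and oscillating decisions: exponential smoothing followed by a two-threshold (hysteresis) switch. It is an illustration only, not Autopilot's decision procedure; the function name, thresholds, and smoothing factor are assumptions.

```python
def steer(samples, low=0.4, high=0.6, alpha=0.3):
    """Return the policy chosen after each sample ('A' or 'B').

    Samples are smoothed with an exponential moving average, and the
    policy only changes when the smoothed value leaves the [low, high]
    band, so small fluctuations around one threshold cannot flip it.
    """
    policy = "A"
    smoothed = samples[0]
    decisions = []
    for x in samples:
        smoothed = alpha * x + (1 - alpha) * smoothed
        if policy == "A" and smoothed > high:
            policy = "B"
        elif policy == "B" and smoothed < low:
            policy = "A"
        decisions.append(policy)
    return decisions


# Noisy load hovering around 0.5: the policy stays put.
noisy = [0.45, 0.55, 0.48, 0.58, 0.44, 0.56]
# A sustained rise in load triggers a single switch.
rising = [0.5, 0.9, 0.9, 0.9, 0.9]
```

Fixed thresholds like these are exactly the kind of hand-tuned formalism that resource policies often lack, which is part of the appeal of rule-based alternatives.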

13
Fuzzy Logic Rationale
  • Humans rely on qualitative rules
  • If it is RAINING, drive SLOWLY
  • If the system is BUSY, backups should be
    POSTPONED
  • Fuzzy logic expresses these rules formally
  • captures human experience
  • supports contradictory statements
  • processes gray statements

14
Fuzzy Controller Structure
  • Fuzzifier
  • scales and maps input variables to fuzzy sets
  • Inference mechanism
  • approximate reasoning block
  • deduces the control action
  • Compositional Rule of Inference (CRI)
  • Defuzzifier
  • converts fuzzy output values to control signals
  • several defuzzification methods

15
Autopilot Decision Process
Monitor/Control Tasks
Instrumented Tasks
16
Sample Fuzzy Rule Base
rulebase FurnaceRules
var roomtemp(0,100)
  set trapez cold  ( 0, 50, 0, 20 )
  set trapez medium( 50, 70, 10, 10 )
  set trapez hot   ( 80, 100, 20, 0 )
17
Sample Fuzzy Rule Base
var furnace(0,1)
  set triangle off ( 0, 0, 0.1 )
  set triangle half( 0.5, 0.1, 0.1 )
  set triangle full( 1, 0.1, 0 )
// the rules
if ( roomtemp cold ) furnace full
if ( roomtemp medium ) furnace half
if ( roomtemp hot ) furnace off
18
Sample Fuzzy Logic Control
// Fuzzify sensor value
_furnaceRules->roomtemp.value( _sensedTemperature );
// Execute the rule base
_furnaceRules->furnaceRules.base.evaluate_all();
// Defuzzify the outputs
_furnaceRules->furnaceRules.base.defuzzy_all();
// Update actuator in the instrumented task
_newFurnaceIntensity = _furnaceRules->furnace.value();
_heatActuatorClient->changeValue( _newFurnaceIntensity, 1 );
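
The fuzzify, evaluate, defuzzify cycle can be sketched in a few lines of Python using the furnace rule base from the previous slides. This is a minimal illustration, not the Autopilot implementation. It assumes that `trapez (a, b, l, r)` means full membership on [a, b] with linear slopes of width l and r on either side, and it defuzzifies with a weighted average of each output set's peak, which may differ from the method Autopilot uses.

```python
def trapez(x, a, b, left, right):
    """Membership in a trapezoid: 1 on [a, b], linear slopes outside."""
    if a <= x <= b:
        return 1.0
    if x < a:
        return max(0.0, 1.0 - (a - x) / left) if left > 0 else 0.0
    return max(0.0, 1.0 - (x - b) / right) if right > 0 else 0.0


# Input sets for roomtemp, copied from the rule base.
ROOMTEMP = {
    "cold": (0, 50, 0, 20),
    "medium": (50, 70, 10, 10),
    "hot": (80, 100, 20, 0),
}
# Peak of each furnace output set (off, half, full).
FURNACE_PEAK = {"off": 0.0, "half": 0.5, "full": 1.0}
# The rules: if roomtemp is <key>, furnace is <value>.
RULES = {"cold": "full", "medium": "half", "hot": "off"}


def furnace_intensity(temperature):
    """Fuzzify the temperature, fire every rule, and defuzzify."""
    strengths = {
        RULES[name]: trapez(temperature, *params)
        for name, params in ROOMTEMP.items()
    }
    total = sum(strengths.values())
    if total == 0:
        return 0.0  # no rule fired; assumed default
    return sum(FURNACE_PEAK[out] * s for out, s in strengths.items()) / total
```

Under these assumptions a reading of 60 fires the cold rule at strength 0.5 and the medium rule at strength 1.0, so the furnace setting lands between half and full, at about 0.67.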
19
Why I/O Prefetching?
  • Narrow the performance gap
  • Overlap computation with disk I/O to minimize
    application stalls for disk I/O

Processor speed (Moore's Law): 100% / 18 months
Disk rotational latency: 12% / 18 months
Disk transfer bandwidth: 60% / 18 months
[Figure: concurrent disk fetch of blocks B1 and B5, leaving B1 and B5 in memory]
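
Overlapping computation with disk I/O can be sketched with a background thread that fetches the next block while the current one is processed. The example below is illustrative only: `read_block` and `process` are hypothetical stand-ins, and an in-memory dictionary plays the role of the disk.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a disk: block number -> contents.
DISK = {n: bytes([n]) * 4 for n in range(1, 6)}


def read_block(n):
    """Hypothetical blocking read of block n."""
    return DISK[n]


def process(data):
    """Hypothetical computation on one block."""
    return sum(data)


def run_with_prefetch(block_numbers):
    """Process blocks in order, fetching block i+1 while computing on block i."""
    results = []
    if not block_numbers:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(read_block, block_numbers[0])
        for nxt in block_numbers[1:]:
            data = pending.result()                 # wait only if the fetch is late
            pending = pool.submit(read_block, nxt)  # start the next fetch now
            results.append(process(data))           # compute while it is in flight
        results.append(process(pending.result()))
    return results
```

The sketch presumes the next block is known in advance; predicting which block that is remains the hard part of prefetching.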
20
Intelligent I/O Libraries
  • I/O characterization experience
  • policies must match application stimuli
  • policy configuration dependent on many factors
  • hardware capabilities
  • access pattern attributes (consider DVCs)
  • resource contention (other users)
  • Implication
  • static optimization alone is not sufficient
  • adaptive optimization and tuning

21
I/O Prefetching Challenges
  • Time varying accesses
  • application internal variations
  • system utilization fluctuations
  • ESCAT interarrivals
  • non-deterministic
  • non-stationary
  • correlated
  • periodically similar
  • bursty

22
PPFS II Architecture
[Architecture diagram labels: Intelligent Decision Procedures; Sensor Actuator Manager; Metadata Manager; Global Network (Globus); Clients; I/O APIs; ANN/HMM Classifier; ARIMA Classifier; Client Caches; Storage Servers]
23
Access Pattern Exploitation
  • Local (per parallel thread)
  • artificial neural networks (ANNs)
  • hidden Markov models (HMMs)
  • Global (parallel program)
  • temporal algebra
  • Universal
  • across job mix

[Figure: access pattern space with three axes: Sequentiality (sequential, strided, random), Request Size (regular, variable), and Read/Write (read only, write only, read/write)]
24
Neural Net Classification
  • I/O access abstraction
  • file byte offset
  • request size
  • operation type (read/write)
  • Neural network classification
  • operates in real-time
  • access abstraction (input)
  • qualitative classification (output)
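
The input side of this scheme, an abstraction over (offset, size, operation) triples, can be sketched as follows. The classifier here is a simple hand-written heuristic standing in for the neural network, whose architecture and training the slides do not give; the function name and category labels are assumptions.

```python
def classify_accesses(requests):
    """Classify a list of (offset, size, op) requests qualitatively.

    Returns (sequentiality, request_size, read_write), for example
    ('sequential', 'regular', 'read only').
    """
    offsets = [r[0] for r in requests]
    sizes = [r[1] for r in requests]
    ops = {r[2] for r in requests}

    # Gap between the end of one request and the start of the next.
    gaps = [
        offsets[i + 1] - (offsets[i] + sizes[i])
        for i in range(len(requests) - 1)
    ]
    if all(g == 0 for g in gaps):
        sequentiality = "sequential"
    elif len(set(gaps)) == 1:
        sequentiality = "strided"
    else:
        sequentiality = "random"

    request_size = "regular" if len(set(sizes)) == 1 else "variable"

    if ops == {"read"}:
        read_write = "read only"
    elif ops == {"write"}:
        read_write = "write only"
    else:
        read_write = "read/write"
    return sequentiality, request_size, read_write
```

A trained network would tolerate noise in the request stream that exact comparisons like these do not.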

25
Hidden Markov Classification
  • ANN classification limitations
  • training can be difficult
  • qualitative access pattern recognition
  • Hidden Markov models (HMMs)
  • learn access probability distribution functions
  • exploit knowledge of prior executions
  • can recognize non-qualitative patterns
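
As a rough illustration of learning access probabilities from a prior execution, the sketch below trains a plain first-order Markov chain over block numbers and predicts the most likely next block. This is a deliberate simplification: a true hidden Markov model has unobserved states, and the slides do not give the model details. All names are hypothetical.

```python
from collections import Counter, defaultdict


def train(trace):
    """Count block-to-block transitions observed in a prior execution."""
    transitions = defaultdict(Counter)
    for current, following in zip(trace, trace[1:]):
        transitions[current][following] += 1
    return transitions


def predict_next(transitions, block):
    """Return (most likely next block, estimated probability), or None."""
    counts = transitions.get(block)
    if not counts:
        return None
    following, count = counts.most_common(1)[0]
    return following, count / sum(counts.values())


# Trace from a hypothetical earlier run: block 3 is usually followed by 7.
model = train([1, 3, 7, 1, 3, 7, 1, 3, 4])
```

Here `predict_next(model, 3)` picks block 7 with an estimated probability of two thirds, since 7 followed 3 in two of the three observed cases.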

26
HMM Prediction Example
  • Next-block prediction accuracy
  • parallel physics code

27
ARIMA Forecasting
  • Three basic steps
  • model identification
  • parameter estimation
  • forecasting (prediction)
  • ACF (autocorrelation function)
  • linear relation between observation pairs
  • PACF (partial autocorrelation function)
  • conditional correlation
  • intervening observations removed
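
The two diagnostics used for model identification can be computed directly. The sketch below gives the sample ACF and derives the PACF from it with the Durbin-Levinson recursion. It is a minimal illustration with hypothetical function names, not the code used in PPFS II, and it assumes the series is not constant.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def pacf(r):
    """Partial autocorrelations for lags 1..len(r)-1 from ACF values r.

    Uses the Durbin-Levinson recursion; r[0] must be 1.
    """
    result = []
    prev = []  # AR coefficients of the order k-1 model
    for k in range(1, len(r)):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result
```

For a series with period 4, such as `[1, 2, 3, 4] * 5`, the ACF peaks again at lag 4, which is the kind of seasonal structure the identification step looks for.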

28
Adaptive Prefetcher
READ REQUESTS
29
Prefetch Scheduler
  • Build schedules using Forestall algorithm
  • If disk service times > predicted interarrival
    times
  • disk is congested (has queuing delay)
  • otherwise, sufficient disk bandwidth
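
The congestion test on this slide amounts to a comparison of two times. The sketch below shows only that test, not the Forestall algorithm itself, and the function and parameter names are hypothetical. Comparing means is also an assumption; the slide does not say how the two quantities are aggregated.

```python
def disk_is_congested(service_times, predicted_interarrivals):
    """Return True when requests are predicted to arrive faster than
    the disk can serve them, so a queue (and queuing delay) builds up.

    Both arguments are lists of times in the same unit, for example
    milliseconds; their means are compared.
    """
    mean_service = sum(service_times) / len(service_times)
    mean_interarrival = sum(predicted_interarrivals) / len(predicted_interarrivals)
    return mean_service > mean_interarrival
```

When the function returns False, the disk has bandwidth to spare.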

30
Online Time Series Modeling
31
PPFS II Predictor Integration
Application
UNIX API
File Object
Real-time HMM/ARIMA Neural Net Predictor
Sensors
Policy Decision Procedure
Arrival Predictions
PPFS II Cache
UNIX I/O
32
PRISM I/O Trace
  • One season prediction behavior
  • root mean square prediction error: 18%
  • average prediction accuracy: 82%

33
PRISM I/O Trace (Detail)
Gradual improvement in prediction accuracy
34
PRISM I/O Trace Error
35
Cactus Wavetoy Behavior
One-season-ahead prediction accuracy: 87%
Spikes of long computation times
Bursts of short interarrival times
36
Cactus Wavetoy Detail
37
Cactus Wavetoy Performance
38
Cactus Wavetoy Performance
39
Caltech ESCAT ARIMA Predictions
Learning
40
Open I/O Research Problems
  • Analytical models of disk striping
  • to study the performance of very large systems
  • Adaptive disk striping parameterization
  • smoothly adapting policies
  • fuzzy logic with configurable rule bases
  • Space-time tradeoffs
  • redundant storage formats
  • multiple file copies
  • off-line and/or online transformations
  • request scheduling for multiple copies

41
Research Directions
  • Transduction and reduction
  • external daemons and formats
  • SDDF and XML
  • clustering and projection
  • Intelligent prediction and control
  • application signatures
  • compact representations
  • time series/HMMs/neural nets in other domains
  • network latency/bandwidth and I/O
  • Performance contracts
  • specification and validation
  • interval reasoning

42
Large Data Problem
  • Detailed measurements enable
  • flexible post-mortem analysis
  • spatio-temporal correlations
  • tracing is the standard approach
  • Detailed event tracing creates problems
  • (potentially) large data volume
  • application perturbations
  • high storage and analysis costs

43
Complex, Multi-dimensional Data
  • How bad can it be?
  • 50-100 performance metrics
  • hundreds or thousands of concurrent activities
  • millisecond time scales for some measures
  • Implications
  • thousands of points
  • high-dimensional space
  • rapid point trajectories

44
Statistical Data Clustering
  • Goals
  • identify behavioral equivalence classes
  • record data from behavioral representatives
  • reduce aggregate data volume
  • minimize instrumentation perturbations
  • retain advantages of detailed tracing
  • Approach
  • real-time data analysis
  • cluster representative logging

45
Clustering and Program Models
  • Three basic alternatives
  • functional decomposition
  • single program multiple data (SPMD)
  • data parallel (HPF or HPC)
  • De facto equivalence classes
  • functions (different code)
  • SPMD (same code but data dependent)
  • data parallel (same code but data dependent)

46
Performance Metric Trajectories
47
Clustering and Metric Spaces
  • Observations
  • clustering identifies similar trajectories
  • one need only record one from each class
  • Clustering
  • based on metrics, not traces
  • metrics, like traces, evolve over time
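
A minimal sketch of cluster-representative logging: tasks whose metric vectors lie close together are placed in one class, and only the first member of each class is recorded. The simple distance-threshold ("leader") clustering used here is an assumption chosen for brevity; the slides do not name the clustering algorithm, and all identifiers and metric values are hypothetical.

```python
import math


def cluster_representatives(metrics, threshold):
    """Group tasks by metric similarity and pick one representative each.

    metrics maps a task id to its vector of performance metrics.
    Returns {representative task id: [task ids in that cluster]}.
    """
    clusters = {}
    for task, vector in metrics.items():
        for leader in clusters:
            if math.dist(vector, metrics[leader]) <= threshold:
                clusters[leader].append(task)
                break
        else:
            clusters[task] = [task]  # no nearby leader: start a new cluster
    return clusters


# Four tasks, two behaviors (metric values are made up for illustration).
snapshot = {
    "task0": (0.90, 120.0),
    "task1": (0.88, 118.0),
    "task2": (0.20, 15.0),
    "task3": (0.22, 14.0),
}
classes = cluster_representatives(snapshot, threshold=5.0)
```

Only `task0` and `task2` would need detailed tracing here. Because metrics evolve over time, the grouping would have to be recomputed periodically.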

48
SAR Data Reduction
49
Projection Pursuit
  • Statistical clustering
  • reduces the number of data points
  • but not their dimensionality (metric count)
  • Projection pursuit
  • generalization
  • principal component analysis
  • identifies important metrics
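
Principal component analysis, which projection pursuit generalizes, can be sketched in pure Python. The example finds the first principal component by power iteration on the covariance matrix; the metrics with the largest weights in that component are the ones that vary most. It illustrates PCA only, not projection pursuit itself, and the names and data are hypothetical.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return the unit vector of greatest variance among the metric columns.

    rows is a list of equal-length metric vectors, one per observation.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the metrics.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance in the data
        v = [x / norm for x in w]
    return v


# Metric 0 varies widely, metric 1 barely moves.
samples = [(0, 0.0), (1, 0.1), (2, -0.1), (3, 0.0), (4, 0.1)]
component = first_principal_component(samples)
```

In this made-up data set nearly all of the weight falls on metric 0, marking it as the important one. In practice the metrics would be normalized first so that units do not dominate the result.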

50
Application Signatures
  • Compact behavioral descriptions
  • application intrinsic measures
  • decoupled from resource mapping
  • e.g., I/O volume/statement
  • time series analysis
  • phase identification

51
What Is a Grid?
  • Collection of computing resources
  • varying in power or architecture
  • potentially dynamically varying in load and
    reliability
  • Interconnected by network
  • links may vary in bandwidth
  • load may vary dynamically
  • Distribution
  • across room, campus, state, nation, and globe

52
Grid Applications
  • NSF Network for Earthquake Engineering Simulation
    (NEES)
  • integrated instrumentation, collaboration,
    simulation
  • Grid Physics Network (GriPhyN)
  • ATLAS, CMS, LIGO, SDSS
  • distributed analysis of petascale data
  • Globus (Foster and Kesselman)

GriPhyN Physics Grid Network
NSF NEES Earthquake Grid
53
Grid Implications
  • Dynamic optimization requires
  • resource models
  • scheduling
  • real-time measurement and control
  • wide-area infrastructure
  • multiple execution modes
  • Come as you are execution

54
The Grid Big Picture
Dynamic Adaptation
Participants: Ruth Aydt, Andrew Chien, Fran Berman, Jack Dongarra,
Ian Foster, Dennis Gannon, Ken Kennedy, Carl Kesselman,
Lennart Johnsson, and Dan Reed
55
Configurable Object Program
  • Representation of the application
  • dynamic reconfiguration and optimization
  • for distributed targets
  • includes
  • program intermediate code
  • annotations from the compiler (reconfiguration
    strategy)
  • historical information (run profile to now)
  • Reconfiguration strategies
  • aggregation of data regions (submeshes)
  • aggregation of tasks
  • definition/redefinition of parameters
  • used for algorithm selection

56
Program Execution System
To PPS
57
Performance Contracts
  • At the heart of the GrADS Model
  • mechanisms for managing mapping and execution
  • What are they?
  • mappings from resources to performance
  • mechanisms for determining
  • when to interrupt and reschedule
  • Abstract definition
  • random variable r(A, I, C, t0) with a PDF
  • A = application, I = input, C = configuration,
    t0 = time of initiation
  • important statistics: lower and upper bounds (95%
    confidence)
  • issue
  • Is r a derivative at t0? (Wolski)
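
One way to read the abstract definition: estimate lower and upper bounds for r from earlier measurements, then flag a violation when a new measurement falls outside them. The sketch below uses empirical 2.5th and 97.5th percentiles as the 95% bounds. That choice, and every name in the code, is an assumption rather than the GrADS definition.

```python
import math


def contract_bounds(history, confidence=0.95):
    """Empirical lower and upper bounds covering the given fraction of history."""
    ordered = sorted(history)
    n = len(ordered)
    tail = (1.0 - confidence) / 2.0
    lo_index = math.floor(tail * (n - 1))
    hi_index = math.ceil((1.0 - tail) * (n - 1))
    return ordered[lo_index], ordered[hi_index]


def violates(measurement, bounds):
    """True when the measurement falls outside the contract bounds."""
    lower, upper = bounds
    return not (lower <= measurement <= upper)


# Made-up earlier measurements of r for a fixed (A, I, C): 0.0 .. 100.0.
history = [float(x) for x in range(0, 101)]
bounds = contract_bounds(history)
```

A violation is only a signal; whether to interrupt and reschedule is a separate policy decision.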

58
Performance Contracts
[Diagram labels: Wind Tunnel, CFD, Database, and Viz components, annotated with 100 GF/s, 200 MB/s, 2 GB/s, and 10 fps]
  • Goals
  • find a schedule
  • compute, storage, and network resource binding
  • subject to performance constraints
  • continuously validate contract
  • application or system violations possible

59
Performance Contracts
  • Given
  • compact application signature
  • clustering of application intrinsic metrics
  • resource behavioral estimates
  • Project application signature
  • Validate projection against measured behavior

Resources
Metric 1
60
Real-time Multilevel Analysis
  • Multilevel Drilldown
  • multiple sites
  • multiple metrics
  • Autopilot daemons
  • SvPablo instrumentation
  • real-time display
  • Five Components
  • shared views
  • direct manipulation tools
  • multimedia annotations
  • actualization interfaces
  • CAVE to palm

61
Rocket Simulation Case Study
Could Modify Iterations with Actuator
Init
Fluids Code (10 fluid iterations)
  • Code developed by DOE ASCI Center for Simulation
    of Advanced Rockets (CSAR) at UIUC
  • 40,000 lines of Fortran, MPI for communication
    between processes, runs on SGI Origin
  • 200 hours on 128 PEs to simulate 1/2 second of
    burn
  • Ultimately want to model 2 minutes for complete
    booster burn-off

Interpolation
Solids Code: Do 31 Multigrid Solution for Each of the Meshes
(3 for coarse grain mesh, 1 for fine grain)
Convergence Test (Y/n): Check Against a Residual. Best Case, Converge
on First Try
Saves Data, Advances Time Step
Output
62
Virtue CSAR Visualization