Title: Software Steering: Fuzzy Logic, Time Series, and Hidden Markov Models
1 Software Steering: Fuzzy Logic, Time Series, and Hidden Markov Models
- Daniel A. Reed
- Department of Computer Science
- University of Illinois
- reed_at_cs.uiuc.edu
- http://www-pablo.cs.uiuc.edu
2 SvPablo
- A graphical source code browser and performance capture/correlation tool
- Allows user to select loops and procedures to instrument in C, F77, F90 code. Automatic instrumentation for HPF via PGI performance interface.
- Collects performance data and later displays it relative to source code line
- Option for real-time data transmission via Autopilot tagged sensors (more later)
3 SvPablo GUI
4 The Next Frontier
- Emerging applications are irregular
- adaptive meshes
- data dependent behavior
- And they are dynamic
- time varying resource demands
- time varying resource availability
- geographically distributed and heterogeneous
- computational grids
5 Automatic Tuning Requirements
- End-to-end, real-time data capture
- multiple levels
- hardware, system software, libraries, applications
- multiple granularities
- microseconds to hours
- multiple sites
- geographically distributed computations
- Intelligent data analysis
- qualitative feature extraction
- Dynamic policy selection
- interactive and automatic
6 Closed Loop Adaptive Control
- Rationale
- adaptive applications
- dynamic demands
- Research approach
- monitor resource demands and responses
- select policies based on observed behavior
- implement policy changes locally and globally
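The monitor/select/implement cycle above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `select_policy`, `control_loop`, the two policy names, and the threshold of 8 queued requests are hypothetical and are not part of Autopilot.

```python
def select_policy(queue_length, busy_threshold=8):
    """Pick a policy from one observation (hypothetical rule and names)."""
    return "prefetch" if queue_length < busy_threshold else "no_prefetch"


def control_loop(read_sensor, apply_policy, steps):
    """Monitor, select, and implement for a fixed number of steps.

    read_sensor:  callable returning the latest observed queue length
    apply_policy: callable invoked only when the chosen policy changes
    Returns the policy in force after each step.
    """
    current = None
    history = []
    for _ in range(steps):
        observed = read_sensor()          # monitor resource demand
        wanted = select_policy(observed)  # select policy from behavior
        if wanted != current:             # implement only real changes
            apply_policy(wanted)
            current = wanted
        history.append(current)
    return history
```

Calling the actuator only when the policy changes keeps control traffic low. It does not by itself prevent oscillation when observations hover near the threshold.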
7 Autopilot Components
- Sensors and actuators
- distributed measurement and software control
- Attached functions
- sensor/actuator local extension
- Sensor/actuator manager(s)
- publish/subscribe
- Remote client(s) and client handlers
- remote sensor/actuator interaction
- Fuzzy logic decision procedures
- distributed performance control
- Standard performance daemons
8 Autopilot Components
Attached Function
Sensor
Local Reduction
Remote Client
Manager
Decision
Attached Function
Client Handler
Actuator
Local Reduction
All Built Atop Globus
9 Autopilot Performance Sensors
- Three modes
- intrinsic (procedural)
- extrinsic (threaded)
- push (requested by external agent)
- Two aspects
- quantitative resource use
- qualitative request patterns
- Accessibility
- wide area publish/subscribe attributes
- automatic insertion
10 Autopilot Policy Actuators
- Functions
- remote process control and update
- software steering
- Activation points
- synchronous (application controls)
- asynchronous (external agent controls)
- Building blocks
- Nexus communication library
- sensor/actuator registration infrastructure
11 Autopilot Manager Infrastructure
Remote Client Task(s)
Autopilot Manager
12 Autopilot Decision Procedures
- Control mechanism issues
- performance data is noisy
- decisions must not oscillate
- Classic control theory
- presumes formalisms
- resource policies often lack formalism
- I/O policies and rules
- network protocol selection
- Fuzzy logic is an attractive alternative
13 Fuzzy Logic Rationale
- Humans rely on qualitative rules
- If it is RAINING, drive SLOWLY
- If the system is BUSY, backups should be POSTPONED
- Fuzzy logic expresses these rules formally
- captures human experience
- supports contradictory statements
- processes gray statements
14 Fuzzy Controller Structure
- Fuzzifier
- scales and maps input variables to fuzzy sets
- Inference mechanism
- approximate reasoning block
- deduces the control action
- Compositional Rule of Inference (CRI)
- Defuzzifier
- converts fuzzy output values to control signals
- several defuzzification methods
15 Autopilot Decision Process
Monitor/Control Tasks
Instrumented Tasks
16 Sample Fuzzy Rule Base
rulebase FurnaceRules
  var roomtemp(0,100)
    set trapez cold  ( 0, 50, 0, 20 )
    set trapez medium( 50, 70, 10, 10 )
    set trapez hot   ( 80, 100, 20, 0 )
17 Sample Fuzzy Rule Base
  var furnace(0,1)
    set triangle off ( 0, 0, 0.1 )
    set triangle half( 0.5, 0.1, 0.1 )
    set triangle full( 1, 0.1, 0 )
  // the rules
  if ( roomtemp == cold ) furnace = full
  if ( roomtemp == medium ) furnace = half
  if ( roomtemp == hot ) furnace = off
18 Sample Fuzzy Logic Control
// Fuzzify sensor value
_furnaceRules->roomtemp.value( _sensedTemperature );
// Execute the rule base
_furnaceRules->furnaceRules.base.evaluate_all();
// Defuzzify the outputs
_furnaceRules->furnaceRules.base.defuzzy_all();
// Update actuator in the instrumented task
_newFurnaceIntensity = _furnaceRules->furnace.value();
_heatActuatorClient->changeValue( _newFurnaceIntensity, 1 );
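To make the fuzzify / evaluate / defuzzify sequence concrete, here is a small standalone Python sketch of the furnace example. It does not use the Autopilot fuzzy library. Two assumptions are made and should be checked against the library documentation: the four `trapez` parameters are read as (plateau start, plateau end, left slope width, right slope width), and defuzzification is simplified to a weighted average of the output set peaks (0, 0.5, 1) rather than the Compositional Rule of Inference.

```python
def trapezoid(x, left, right, left_width, right_width):
    """Degree of membership of x in a trapezoidal fuzzy set.

    The plateau [left, right] has membership 1; membership falls
    linearly to 0 over left_width and right_width on either side.
    """
    if left <= x <= right:
        return 1.0
    if x < left:
        if left_width == 0:
            return 0.0
        return max(0.0, 1.0 - (left - x) / left_width)
    if right_width == 0:
        return 0.0
    return max(0.0, 1.0 - (x - right) / right_width)


# Input sets copied from the rule base (parameter meaning is assumed).
ROOMTEMP_SETS = {
    "cold": (0, 50, 0, 20),
    "medium": (50, 70, 10, 10),
    "hot": (80, 100, 20, 0),
}
# Peaks of the furnace output triangles.
FURNACE_PEAKS = {"off": 0.0, "half": 0.5, "full": 1.0}
# The three rules: input set -> output set.
RULES = {"cold": "full", "medium": "half", "hot": "off"}


def furnace_setting(temperature):
    """Return a furnace intensity in [0, 1], or None if no rule fires."""
    # Fuzzify: degree to which the temperature belongs to each input set.
    strengths = {name: trapezoid(temperature, *params)
                 for name, params in ROOMTEMP_SETS.items()}
    total = sum(strengths.values())
    if total == 0:
        return None
    # Evaluate rules and defuzzify with a weighted average of output peaks.
    weighted = sum(strength * FURNACE_PEAKS[RULES[name]]
                   for name, strength in strengths.items())
    return weighted / total
```

Because neighboring sets overlap, the output changes gradually as the temperature moves between sets, which is the property that makes fuzzy control attractive when decisions must not oscillate.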
19 Why I/O Prefetching?
- Narrow the performance gap
- Overlap computation with disk I/O to minimize
application stalls for disk I/O
Processor Speed (Moore's Law): 100% / 18 months
Disk Rotational Latency: 12% / 18 months
Disk Transfer Bandwidth: 60% / 18 months
[Timeline figure: blocks B1 and B5, with labels "B1 in memory" and "B5 in memory"]
20 Intelligent I/O Libraries
- I/O characterization experience
- policies must match application stimuli
- policy configuration dependent on many factors
- hardware capabilities
- access pattern attributes (consider DVCs)
- resource contention (other users)
- Implication
- static optimization alone is not sufficient
- adaptive optimization and tuning
21 I/O Prefetching Challenges
- Time varying accesses
- application internal variations
- system utilization fluctuations
- ESCAT interarrivals
- non-deterministic
- non-stationary
- correlated
- periodically similar
- bursty
22 PPFS II Architecture
INTELLIGENT DECISION PROCEDURES
SENSOR ACTUATOR MANAGER
METADATA MANAGER
CLIENT
CLIENT
GLOBAL NETWORK (Globus)
I/O APIs
I/O APIs
ANN/HMM CLASSIFIER
ARIMA CLASSIFIER
CLIENT CACHE
CLIENT CACHE
STORAGE SERVER
STORAGE SERVER
STORAGE SERVER
23 Access Pattern Exploitation
- Local (per parallel thread)
- artificial neural networks (ANNs)
- hidden Markov models (HMMs)
- Global (parallel program)
- temporal algebra
- Universal
- across job mix
Random
Write only
Read/write
Strided
Read only
Sequentiality
Sequential
Regular
Variable
Request Size
Read/Write
24 Neural Net Classification
- I/O access abstraction
- file byte offset
- request size
- operation type (read/write)
- Neural network classification
- operates in real-time
- access abstraction (input)
- qualitative classification (output)
25 Hidden Markov Classification
- ANN classification limitations
- training can be difficult
- qualitative access pattern recognition
- Hidden Markov models (HMMs)
- learn access probability distribution functions
- exploit knowledge of prior executions
- can recognize non-qualitative patterns
26 HMM Prediction Example
- Next-block prediction accuracy
- parallel physics code
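As a rough illustration of next-block prediction from prior executions, the sketch below trains a first-order Markov chain over observed block numbers and measures its prediction accuracy on a later trace. This is a deliberate simplification: the states are directly visible block numbers, not the hidden states of a true HMM, and the function names and traces are hypothetical.

```python
from collections import Counter, defaultdict


def train_transitions(trace):
    """Count block-to-block transitions seen in a prior execution."""
    counts = defaultdict(Counter)
    for current, following in zip(trace, trace[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, block):
    """Most frequently observed successor of block, or None if unseen."""
    if block not in counts:
        return None
    return counts[block].most_common(1)[0][0]


def prediction_accuracy(counts, trace):
    """Fraction of transitions in trace that the model predicts correctly."""
    pairs = list(zip(trace, trace[1:]))
    if not pairs:
        return 0.0
    hits = sum(1 for current, following in pairs
               if predict_next(counts, current) == following)
    return hits / len(pairs)
```

A prefetcher could issue a read for `predict_next(counts, block)` whenever the application touches `block`. How far ahead to predict is a separate policy decision.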
27 ARIMA Forecasting
- Three basic steps
- model identification
- parameter estimation
- forecasting (prediction)
- ACF (autocorrelation function)
- linear relation between observation pairs
- PACF (partial autocorrelation function)
- conditional correlation
- intervening observations removed
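The two diagnostics used for model identification can be computed directly from a series of interarrival times. The sketch below is a minimal pure-Python version: the sample ACF uses the usual biased estimator, and the PACF is obtained with the Durbin-Levinson recursion, which is one standard way to remove the effect of intervening observations. The source does not say which estimator was used, so treat both choices as assumptions. The series must not be constant.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev)  # zero for a constant series
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / var
            for k in range(max_lag + 1)]


def pacf(series, max_lag):
    """Partial autocorrelation for lags 0..max_lag via Durbin-Levinson."""
    r = acf(series, max_lag)
    result = [1.0]
    prev = []  # prev[j] holds phi_{k-1, j+1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append phi_{k,k}.
        prev = [prev[j] - phi_kk * prev[k - 2 - j]
                for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result
```

In the usual identification heuristic, a PACF that cuts off after lag p suggests an AR(p) component, and an ACF that cuts off after lag q suggests an MA(q) component.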
28 Adaptive Prefetcher
READ REQUESTS
29 Prefetch Scheduler
- Build schedules using Forestall algorithm
- If disk service times > predicted interarrival times
- disk is congested (has queuing delay)
- otherwise, sufficient disk bandwidth
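The congestion test above amounts to a single comparison. The sketch below applies it to averages over a recent window. Averaging is an assumption made here for illustration, the function name is hypothetical, and this is not the Forestall algorithm itself.

```python
def disk_is_congested(service_times, predicted_interarrivals):
    """True when requests are predicted to arrive faster than the disk serves them.

    Both arguments are non-empty sequences of times in the same unit.
    Comparing window averages is an illustrative assumption.
    """
    mean_service = sum(service_times) / len(service_times)
    mean_gap = sum(predicted_interarrivals) / len(predicted_interarrivals)
    return mean_service > mean_gap
```

When the test is true, queuing delay builds up, so a scheduler would need to start prefetches earlier. When it is false, the disk has spare bandwidth.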
30 Online Time Series Modeling
31 PPFS II Predictor Integration
Application
UNIX API
File Object
Real-time HMM/ARIMA Neural Net Predictor
Sensors
Policy Decision Procedure
Arrival Predictions
PPFS II Cache
UNIX I/O
32 PRISM I/O Trace
- One-season prediction behavior
- root mean square prediction error: 18%
- average prediction accuracy: 82%
33 PRISM I/O Trace (Detail)
Gradual improvement in prediction accuracy
34 PRISM I/O Trace Error
35 Cactus Wavetoy Behavior
One-season-ahead prediction accuracy: 87%
Spikes of long computation times
Bursts of short interarrival times
36 Cactus Wavetoy Detail
37 Cactus Wavetoy Performance
38 Cactus Wavetoy Performance
39 Caltech ESCAT ARIMA Predictions
Learning
40 Open I/O Research Problems
- Analytical models of disk striping
- to study the performance of very large systems
- Adaptive disk striping parameterization
- smoothly adapting policies
- fuzzy logic with configurable rule bases
- Space-time tradeoffs
- redundant storage formats
- multiple file copies
- off-line and/or online transformations
- request scheduling for multiple copies
41 Research Directions
- Transduction and reduction
- external daemons and formats
- SDDF and XML
- clustering and projection
- Intelligent prediction and control
- application signatures
- compact representations
- time series/HMMs/neural nets in other domains
- network latency/bandwidth and I/O
- Performance contracts
- specification and validation
- interval reasoning
42 Large Data Problem
- Detailed measurements enable
- flexible post-mortem analysis
- spatio-temporal correlations
- tracing is the standard approach
- Detailed event tracing creates problems
- (potentially) large data volume
- application perturbations
- high storage and analysis costs
43 Complex, Multi-dimensional Data
- How bad can it be?
- 50-100 performance metrics
- hundreds or thousands of concurrent activities
- millisecond time scales for some measures
- Implications
- thousands of points
- high-dimensional space
- rapid point trajectories
44 Statistical Data Clustering
- Goals
- identify behavioral equivalence classes
- record data from behavioral representatives
- reduce aggregate data volume
- minimize instrumentation perturbations
- retain advantages of detailed tracing
- Approach
- real-time data analysis
- cluster representative logging
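One simple way to picture cluster representative logging is a single-pass "leader" clustering of per-process metric vectors: each process joins the first representative within a distance threshold, or becomes a new representative. The source does not name the clustering algorithm, so the algorithm, the Euclidean distance, and the `radius` parameter are all assumptions for illustration.

```python
import math


def cluster_representatives(points, radius):
    """Group metric vectors and pick one representative per group.

    points: dict mapping process id -> tuple of metric values
    radius: maximum Euclidean distance to an existing representative
    Returns (representative ids, dict of process id -> representative id).
    """
    representatives = []  # (process id, metric vector) pairs
    assignment = {}
    for pid, vector in points.items():
        for rep_id, rep_vector in representatives:
            if math.dist(vector, rep_vector) <= radius:
                assignment[pid] = rep_id
                break
        else:
            # No representative is close enough: start a new class.
            representatives.append((pid, vector))
            assignment[pid] = pid
    return [rep_id for rep_id, _ in representatives], assignment
```

Only the processes returned as representatives would have detailed traces recorded. The rest contribute just their class membership, which is how data volume and instrumentation perturbation are reduced.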
45 Clustering and Program Models
- Three basic alternatives
- functional decomposition
- single program multiple data (SPMD)
- data parallel (HPF or HPC)
- De facto equivalence classes
- functions (different code)
- SPMD (same code but data dependent)
- data parallel (same code but data dependent)
46 Performance Metric Trajectories
47 Clustering and Metric Spaces
- Observations
- clustering identifies similar trajectories
- one need only record one from each class
- Clustering
- based on metrics, not traces
- metrics, like traces, evolve over time
48 SAR Data Reduction
49 Projection Pursuit
- Statistical clustering
- reduces the number of data points
- but not their dimensionality (metric count)
- Projection pursuit
- generalization
- principal component analysis
- identifies important metrics
50 Application Signatures
- Compact behavioral descriptions
- application intrinsic measures
- decoupled from resource mapping
- e.g., I/O volume/statement
- time series analysis
- phase identification
51 What Is a Grid?
- Collection of computing resources
- varying in power or architecture
- potentially dynamically varying in load and reliability
- Interconnected by network
- links may vary in bandwidth
- load may vary dynamically
- Distribution
- across room, campus, state, nation, and globe
52 Grid Applications
- NSF Network for Earthquake Engineering Simulation (NEES)
- integrated instrumentation, collaboration, simulation
- Grid Physics Network (GriPhyN)
- ATLAS, CMS, LIGO, SDSS
- distributed analysis of petascale data
- Globus Foster and Kesselman
GriPhyN Physics Grid Network
NSF NEES Earthquake Grid
53 Grid Implications
- Dynamic optimization requires
- resource models
- scheduling
- real-time measurement and control
- wide-area infrastructure
- multiple execution modes
- "Come as you are" execution
54 The Grid Big Picture
Dynamic Adaptation
Participants: Ruth Aydt, Andrew Chien, Fran Berman, Jack Dongarra, Ian Foster, Dennis Gannon, Ken Kennedy, Carl Kesselman, Lennart Johnsson, and Dan Reed
55 Configurable Object Program
- Representation of the application
- dynamic reconfiguration and optimization
- for distributed targets
- includes
- program intermediate code
- annotations from the compiler (reconfiguration strategy)
- historical information (run profile to now)
- Reconfiguration strategies
- aggregation of data regions (submeshes)
- aggregation of tasks
- definition/redefinition of parameters
- used for algorithm selection
56 Program Execution System
To PPS
57 Performance Contracts
- At the heart of the GrADS Model
- mechanisms for managing mapping and execution
- What are they?
- mappings from resources to performance
- mechanisms for determining
- when to interrupt and reschedule
- Abstract definition
- random variable r(A,I,C,t0) with a PDF
- A = application, I = input, C = configuration, t0 = time of initiation
- important statistics: lower and upper bounds (95% confidence)
- issue
- Is r a derivative at t0? (Wolski)
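A contract with lower and upper bounds, plus a rule for when to interrupt and reschedule, might look like the sketch below. It is a hedged illustration rather than the GrADS mechanism: the class and function names are hypothetical, and the `patience` parameter (how many consecutive out-of-bounds measurements trigger rescheduling) is an assumption added so that a single noisy sample does not cause a reschedule.

```python
from dataclasses import dataclass


@dataclass
class PerformanceContract:
    lower: float  # lower bound on the performance measure
    upper: float  # upper bound on the performance measure

    def violated(self, measured: float) -> bool:
        """True when a measurement falls outside the contract bounds."""
        return not (self.lower <= measured <= self.upper)


def should_reschedule(contract, measurements, patience=3):
    """True once `patience` consecutive measurements violate the contract."""
    consecutive = 0
    for value in measurements:
        consecutive = consecutive + 1 if contract.violated(value) else 0
        if consecutive >= patience:
            return True
    return False
```

The bounds themselves would come from the distribution of r, for example its 95% confidence limits. Estimating them is outside the scope of this sketch.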
58 Performance Contracts
100 GF/s
Viz
CFD
Wind Tunnel
Viz
200 MB/s
Viz
2 GB/s
Database
10 fps
- Goals
- find a schedule
- compute, storage, and network resource binding
- subject to performance constraints
- continuously validate contract
- application or system violations possible
59 Performance Contracts
- Given
- compact application signature
- clustering of application intrinsic metrics
- resource behavioral estimates
- Project application signature
- Validate projection against measured behavior
Resources
Metric 1
60 Real-time Multilevel Analysis
- Multilevel Drilldown
- multiple sites
- multiple metrics
- Autopilot daemons
- SvPablo instrumentation
- real-time display
- Five Components
- shared views
- direct manipulation tools
- multimedia annotations
- actualization interfaces
- CAVE to palm
61 Rocket Simulation Case Study
Could Modify Iterations with Actuator
Init
Fluids Code (10 fluid iterations)
- Code developed by DOE ASCI Center for Simulation of Advanced Rockets (CSAR) at UIUC
- 40,000 lines of Fortran, MPI for communication between processes, runs on SGI Origin
- 200 hours on 128 PEs to simulate 1/2 second of burn
- Ultimately want to model 2 minutes for complete booster burn-off
Interpolation
Solids Code Do 31 Multigrid Solution for Each
of the Meshes
3 for coarse grain mesh 1 for fine grain
Convergence Test
n
Check Against a Residual Best Case, Converge
on First Try
Y
Saves Data, Advances Time Step
Output
62 Virtue CSAR Visualization