Software Steering: Fuzzy Logic, Time Series, and Hidden Markov Models presentation

Slides: 60

Provided by: debor113

1
Software Steering: Fuzzy Logic, Time Series,
and Hidden Markov Models

Daniel A. Reed
Department of Computer Science
University of Illinois
reed@cs.uiuc.edu
http://www-pablo.cs.uiuc.edu

2
SvPablo

A graphical source code browser and performance
capture/correlation tool
Allows user to select loops and procedures to
instrument in C, F77, F90 code. Automatic
instrumentation for HPF via PGI performance
interface.
Collects performance data and later displays it
relative to source code line
Option for real-time data transmission via
Autopilot tagged sensors (more later)

3
SvPablo GUI
4
The Next Frontier

Emerging applications are irregular
adaptive meshes
data dependent behavior
And they are dynamic
time varying resource demands
time varying resource availability
geographically distributed and heterogeneous
computational grids

5
Automatic Tuning Requirements

End-to-end, real-time data capture
multiple levels
hardware, system software, libraries,
applications
multiple granularities
microseconds to hours
multiple sites
geographically distributed computations
Intelligent data analysis
qualitative feature extraction
Dynamic policy selection
interactive and automatic

6
Closed Loop Adaptive Control

Rationale
adaptive applications
dynamic demands
Research approach
monitor resource demands and responses
select policies based on observed behavior
implement policy changes locally and globally

7
Autopilot Components

Sensors and actuators
distributed measurement and software control
Attached functions
sensor/actuator local extension
Sensor/actuator manager(s)
publish/subscribe
Remote client(s) and client handlers
remote sensor/actuator interaction
Fuzzy logic decision procedures
distributed performance control
Standard performance daemons

8
Autopilot Components
[Diagram labels: Sensor and Actuator (each with Attached Function and Local Reduction), Manager, Remote Client, Client Handler, Decision]
All Built Atop Globus
9
Autopilot Performance Sensors

Three modes
intrinsic (procedural)
extrinsic (threaded)
push (requested by external agent)
Two aspects
quantitative resource use
qualitative request patterns
Accessibility
wide area publish/subscribe attributes
automatic insertion

10
Autopilot Policy Actuators

Functions
remote process control and update
software steering
Activation points
synchronous (application controls)
asynchronous (external agent controls)
Building blocks
Nexus communication library
sensor/actuator registration infrastructure

11
Autopilot Manager Infrastructure
Remote Client Task(s)
Autopilot Manager
12
Autopilot Decision Procedures

Control mechanism issues
performance data is noisy
decisions must not oscillate
Classic control theory
presumes formalisms
resource policies often lack formalism
I/O policies and rules
network protocol selection
Fuzzy logic is an attractive alternative

13
Fuzzy Logic Rationale

Humans rely on qualitative rules
If it is RAINING, drive SLOWLY
If the system is BUSY, backups should be
POSTPONED
Fuzzy logic expresses these rules formally
captures human experience
supports contradictory statements
processes gray statements

14
Fuzzy Controller Structure

Fuzzifier
scales and maps input variables to fuzzy sets
Inference mechanism
approximate reasoning block
deduces the control action
Compositional Rule of Inference (CRI)
Defuzzifier
converts fuzzy output values to control signals
several defuzzification methods

15
Autopilot Decision Process
Monitor/Control Tasks
Instrumented Tasks
16
Sample Fuzzy Rule Base
rulebase FurnaceRules
  var roomtemp(0,100)
    set trapez cold  ( 0, 50, 0, 20 )
    set trapez medium( 50, 70, 10, 10 )
    set trapez hot   ( 80, 100, 20, 0 )
17
Sample Fuzzy Rule Base
  var furnace(0,1)
    set triangle off ( 0, 0, 0.1 )
    set triangle half( 0.5, 0.1, 0.1 )
    set triangle full( 1, 0.1, 0 )
  // the rules
  if ( roomtemp cold ) furnace full
  if ( roomtemp medium ) furnace half
  if ( roomtemp hot ) furnace off
18
Sample Fuzzy Logic Control
// Fuzzify sensor value
_furnaceRules->roomtemp.value( _sensedTemperature );
// Execute the rule base
_furnaceRules->furnaceRules.base.evaluate_all();
// Defuzzify the outputs
_furnaceRules->furnaceRules.base.defuzzy_all();
// Update actuator in the instrumented task
_newFurnaceIntensity = _furnaceRules->furnace.value();
_heatActuatorClient->changeValue( _newFurnaceIntensity, 1 );
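
The fuzzify / evaluate / defuzzify sequence for the furnace rule base can be sketched in a few lines of Python. The meaning of the set parameters is an assumption (plateau endpoints followed by left and right ramp widths for `trapez`; peak followed by ramp widths for `triangle`), and the defuzzifier here is a simple weighted average of output peaks, which may differ from the method Autopilot uses.

```python
# Sketch of the furnace rule base. The parameter meanings below are an
# assumption, not taken from the Autopilot documentation:
#   trapez(a, b, lw, rw): membership 1 on [a, b], ramps of width lw / rw
#   triangle(p, lw, rw):  peak at p, ramps of width lw / rw

def trapez(a, b, lw, rw):
    def mu(x):
        if a <= x <= b:
            return 1.0
        if lw > 0 and a - lw < x < a:
            return (x - (a - lw)) / lw
        if rw > 0 and b < x < b + rw:
            return ((b + rw) - x) / rw
        return 0.0
    return mu

ROOMTEMP = {
    "cold": trapez(0, 50, 0, 20),
    "medium": trapez(50, 70, 10, 10),
    "hot": trapez(80, 100, 20, 0),
}

# Peak of each output triangle: off(0, ...), half(0.5, ...), full(1, ...).
FURNACE_PEAK = {"off": 0.0, "half": 0.5, "full": 1.0}

# if roomtemp is <key> then furnace is <value>
RULES = {"cold": "full", "medium": "half", "hot": "off"}

def furnace_setting(temperature):
    # Fuzzify: degree to which the temperature belongs to each input set.
    strength = {name: mu(temperature) for name, mu in ROOMTEMP.items()}
    # Infer and defuzzify: average of output peaks weighted by rule strength.
    total = sum(strength.values())
    if total == 0.0:
        raise ValueError("temperature outside all fuzzy sets")
    return sum(strength[s] * FURNACE_PEAK[RULES[s]] for s in RULES) / total

for t in (30, 60, 90):
    print(t, round(furnace_setting(t), 3))
```

At 60 degrees both the cold and medium rules fire, so the output lands between half and full instead of jumping from one setting to the other.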
19
Why I/O Prefetching?

Narrow the performance gap
Overlap computation with disk I/O to minimize
application stalls for disk I/O

Improvement rates:
Processor speed (Moore's Law): 100% / 18 months
Disk rotational latency: 12% / 18 months
Disk transfer bandwidth: 60% / 18 months
[Figure: concurrent disk fetch of blocks B1 and B5; labels: B1 in memory, B5 in memory]
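
To see how quickly the gap widens, the rates on this slide can be compounded. The sketch assumes the figures are percentage improvements per 18 months; the six-year horizon is an arbitrary choice for illustration.

```python
# Assumes the slide's figures are percentage improvements per 18 months.
RATES = {
    "processor speed": 1.00,
    "disk rotational latency": 0.12,
    "disk transfer bandwidth": 0.60,
}

def improvement(rate, years):
    """Compound improvement factor after the given number of years."""
    periods = years * 12 / 18
    return (1.0 + rate) ** periods

years = 6  # four 18-month periods
factors = {name: improvement(rate, years) for name, rate in RATES.items()}
for name, factor in factors.items():
    print(f"{name}: {factor:.2f}x after {years} years")

# Ratio of processor gain to disk latency gain: how much the gap widens.
gap = factors["processor speed"] / factors["disk rotational latency"]
print(f"processor/latency gap grows by {gap:.1f}x")
```

Under these assumptions the processor improves 16x in six years while rotational latency improves by roughly 1.6x, which is the gap prefetching tries to hide.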
20
Intelligent I/O Libraries

I/O characterization experience
policies must match application stimuli
policy configuration dependent on many factors
hardware capabilities
access pattern attributes (consider DVCs)
resource contention (other users)
Implication
static optimization alone is not sufficient
adaptive optimization and tuning

21
I/O Prefetching Challenges

Time varying accesses
application internal variations
system utilization fluctuations
ESCAT interarrivals
non-deterministic
non-stationary
correlated
periodically similar
bursty

22
PPFS II Architecture
[Diagram labels: Intelligent Decision Procedures; Sensor/Actuator Manager; Metadata Manager; two Clients, each with I/O APIs and a Client Cache; ANN/HMM Classifier; ARIMA Classifier; Global Network (Globus); three Storage Servers]
23
Access Pattern Exploitation

Local (per parallel thread)
artificial neural networks (ANNs)
hidden Markov models (HMMs)
Global (parallel program)
temporal algebra
Universal
across job mix

[Figure: access pattern space. Axes: Sequentiality (Sequential, Strided, Random); Request Size (Regular, Variable); Read/Write (Read only, Write only, Read/write)]
24
Neural Net Classification

I/O access abstraction
file byte offset
request size
operation type (read/write)
Neural network classification
operates in real-time
access abstraction (input)
qualitative classification (output)
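
The input abstraction and the qualitative output can be illustrated without a neural network. The sketch below is a hand-written rule-based stand-in, not the ANN classifier described here; the labels are borrowed from the access pattern figure, and the function name and sample requests are hypothetical.

```python
def classify(requests):
    """Label a window of (offset, size, op) requests qualitatively."""
    offsets = [off for off, _, _ in requests]
    sizes = {size for _, size, _ in requests}
    ops = {op for _, _, op in requests}
    # Gap between the end of one request and the start of the next.
    gaps = {nxt - (off + size)
            for (off, size, _), nxt in zip(requests, offsets[1:])}

    if gaps == {0}:
        sequentiality = "sequential"
    elif len(gaps) == 1:
        sequentiality = "strided"
    else:
        sequentiality = "random"

    request_size = "regular" if len(sizes) == 1 else "variable"

    if ops == {"read"}:
        read_write = "read only"
    elif ops == {"write"}:
        read_write = "write only"
    else:
        read_write = "read/write"
    return sequentiality, request_size, read_write


sequential = [(0, 4096, "read"), (4096, 4096, "read"), (8192, 4096, "read")]
strided = [(0, 1024, "write"), (4096, 1024, "write"), (8192, 1024, "write")]
scattered = [(500, 100, "read"), (20, 300, "write"), (9000, 50, "read")]

for window in (sequential, strided, scattered):
    print(classify(window))
```

A trained network would replace the fixed rules, but the interface stays the same: byte offset, request size, and operation type in, a qualitative label out.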

25
Hidden Markov Classification

ANN classification limitations
training can be difficult
qualitative access pattern recognition
Hidden Markov models (HMMs)
learn access probability distribution functions
exploit knowledge of prior executions
can recognize non-qualitative patterns
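
As a simplified illustration of learning access probabilities from a prior execution, the sketch below uses a plain first-order Markov chain in which each file block is a visible state. It has no hidden states, so it is a simplification of the HMM approach on this slide; the traces and function names are made up.

```python
from collections import Counter, defaultdict

def train(trace):
    """Count block-to-block transitions in a trace from a prior execution."""
    transitions = defaultdict(Counter)
    for current, following in zip(trace, trace[1:]):
        transitions[current][following] += 1
    return transitions

def predict_next(transitions, current):
    """Return the most frequently observed successor, or None if unseen."""
    if current not in transitions:
        return None
    return transitions[current].most_common(1)[0][0]

def accuracy(transitions, trace):
    """Fraction of next-block predictions that match a new trace."""
    pairs = list(zip(trace, trace[1:]))
    hits = sum(predict_next(transitions, cur) == nxt for cur, nxt in pairs)
    return hits / len(pairs)

# Hypothetical block traces: a strided pattern with one irregular access.
prior_run = [0, 4, 8, 12, 0, 4, 8, 12, 0, 4, 9, 12]
new_run = [0, 4, 8, 12, 0, 4, 8, 12]

model = train(prior_run)
print(predict_next(model, 4))
print(accuracy(model, new_run))
```

The predicted next block is what a prefetcher would request ahead of time; the accuracy function mirrors the next-block prediction accuracy metric.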

26
HMM Prediction Example

Next-block prediction accuracy
parallel physics code

27
ARIMA Forecasting

Three basic steps
model identification
parameter estimation
forecasting (prediction)
ACF (autocorrelation function)
linear relation between observation pairs
PACF (partial autocorrelation function)
conditional correlation
intervening observations removed
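
The three steps can be sketched for the simplest member of the ARIMA family, an AR(1) model. This is an illustration under that assumption only: it computes the sample ACF, estimates the single coefficient from the lag-1 autocorrelation, and forecasts one step ahead. It does not compute the PACF, difference the series, or fit moving-average terms, and the interarrival data is made up.

```python
def acf(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return sum(dev[t] * dev[t + lag] for t in range(n - lag)) / denom

def ar1_forecast(series):
    """One-step forecast from an AR(1) model fitted by the lag-1 ACF."""
    mean = sum(series) / len(series)
    phi = acf(series, 1)                      # parameter estimation
    return mean + phi * (series[-1] - mean)   # forecasting

# Hypothetical interarrival times (ms) that alternate short and long.
interarrivals = [1.0, 2.0] * 10

# Model identification: inspect the ACF at the first few lags.
print([round(acf(interarrivals, k), 2) for k in range(3)])
print(round(ar1_forecast(interarrivals), 3))
```

The strongly negative lag-1 value reflects the alternation, so after a long interarrival the model forecasts a short one.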

28
Adaptive Prefetcher
READ REQUESTS
29
Prefetch Scheduler

Build schedules using Forestall algorithm
If disk service times > predicted interarrival times
disk is congested (has queuing delay)
otherwise, sufficient disk bandwidth
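
The congestion test can be written down directly. The sketch below is not the Forestall algorithm; it only encodes the comparison on this slide, plus a hypothetical helper that shows how queuing delay builds up when predicted requests arrive faster than the disk can serve them.

```python
def disk_is_congested(service_time, predicted_interarrival):
    """True when requests are predicted to arrive faster than they are served."""
    return service_time > predicted_interarrival

def queuing_delays(service_time, predicted_interarrivals):
    """Delay each predicted request waits behind earlier ones at the disk."""
    delays, backlog = [], 0.0
    for gap in predicted_interarrivals:
        # The backlog drains during the gap; the new request then queues.
        backlog = max(0.0, backlog - gap)
        delays.append(backlog)
        backlog += service_time
    return delays

# Hypothetical values in milliseconds.
print(disk_is_congested(10.0, 4.0))
print(disk_is_congested(10.0, 25.0))
print(queuing_delays(10.0, [4.0, 4.0, 25.0]))
```

A scheduler would prefetch further ahead while delays are non-zero and fall back to on-demand fetches when the disk has spare bandwidth.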

30
Online Time Series Modeling
31
PPFS II Predictor Integration
Application
UNIX API
File Object
Real-time HMM/ARIMA Neural Net Predictor
Sensors
Policy Decision Procedure
Arrival Predictions
PPFS II Cache
UNIX I/O
32
PRISM I/O Trace

One season prediction behavior
root mean square prediction error: 18%
average prediction accuracy: 82%

33
PRISM I/O Trace (Detail)
Gradual improvement in prediction accuracy
34
PRISM I/O Trace Error
35
Cactus Wavetoy Behavior
One-season-ahead prediction accuracy: 87%
Spikes of long computation times
Bursts of short interarrival times
36
Cactus Wavetoy Detail
37
Cactus Wavetoy Performance
38
Cactus Wavetoy Performance
39
Caltech ESCAT ARIMA Predictions
Learning
40
Open I/O Research Problems

Analytical models of disk striping
to study the performance of very large systems
Adaptive disk striping parameterization
smoothly adapting policies
fuzzy logic with configurable rule bases
Space-time tradeoffs
redundant storage formats
multiple file copies
off-line and/or online transformations
request scheduling for multiple copies

41
Research Directions

Transduction and reduction
external daemons and formats
SDDF and XML
clustering and projection
Intelligent prediction and control
application signatures
compact representations
time series/HMMs/neural nets in other domains
network latency/bandwidth and I/O
Performance contracts
specification and validation
interval reasoning

42
Large Data Problem

Detailed measurements enable
flexible post-mortem analysis
spatio-temporal correlations
tracing is the standard approach
Detailed event tracing creates problems
(potentially) large data volume
application perturbations
high storage and analysis costs

43
Complex, Multi-dimensional Data

How bad can it be?
50-100 performance metrics
hundreds or thousands of concurrent activities
millisecond time scales for some measures
Implications
thousands of points
high-dimensional space
rapid point trajectories

44
Statistical Data Clustering

Goals
identify behavioral equivalence classes
record data from behavioral representatives
reduce aggregate data volume
minimize instrumentation perturbations
retain advantages of detailed tracing
Approach
real-time data analysis
cluster representative logging

45
Clustering and Program Models

Three basic alternatives
functional decomposition
single program multiple data (SPMD)
data parallel (HPF or HPC++)
De facto equivalence classes
functions (different code)
SPMD (same code but data dependent)
data parallel (same code but data dependent)

46
Performance Metric Trajectories
47
Clustering and Metric Spaces

Observations
clustering identifies similar trajectories
one need only record one from each class
Clustering
based on metrics, not traces
metrics, like traces, evolve over time
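
One way to picture cluster representative logging is leader clustering, used here as a simple stand-in because the slides do not say which clustering algorithm is applied. Each process contributes a vector of metrics; only one representative per cluster would be traced in detail. The metric values and the radius are hypothetical.

```python
import math

def cluster_representatives(points, radius):
    """Leader clustering: each point joins the first representative within
    `radius`, otherwise it becomes the representative of a new cluster."""
    representatives = []   # index of the point chosen to be traced in detail
    membership = []        # cluster id for every point
    for i, p in enumerate(points):
        for cluster_id, rep in enumerate(representatives):
            if math.dist(p, points[rep]) <= radius:
                membership.append(cluster_id)
                break
        else:
            representatives.append(i)
            membership.append(len(representatives) - 1)
    return representatives, membership

# Hypothetical metric vectors (CPU utilization, I/O rate), one per process.
metrics = [(0.90, 0.10), (0.88, 0.12), (0.20, 0.80), (0.22, 0.79), (0.91, 0.09)]

reps, member = cluster_representatives(metrics, radius=0.1)
print(reps)
print(member)
```

Because metrics evolve over time, a real system would rerun the clustering periodically and change representatives as trajectories drift apart.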

48
SAR Data Reduction
49
Projection Pursuit

Statistical clustering
reduces the number of data points
but not their dimensionality (metric count)
Projection pursuit
generalization
principal component analysis
identifies important metrics
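
Principal component analysis, the special case that projection pursuit generalizes, can be sketched with power iteration on the covariance matrix of the metrics. This shows only the first component and is not a projection pursuit implementation; the sample data is invented so that the third metric barely varies.

```python
def first_principal_component(rows, iterations=200):
    """Direction of greatest variance, found by power iteration on the
    covariance matrix of the metric columns."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Hypothetical samples of three metrics; the third barely varies.
samples = [
    (1.0, 2.1, 5.0),
    (2.0, 3.9, 5.1),
    (3.0, 6.2, 4.9),
    (4.0, 7.8, 5.0),
    (5.0, 10.1, 5.1),
]
component = first_principal_component(samples)
print([round(x, 2) for x in component])
```

Metrics with large weights in the component dominate the variation; a metric with a near-zero weight, like the third one here, is a candidate to drop.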

50
Application Signatures

Compact behavioral descriptions
application intrinsic measures
decoupled from resource mapping
e.g., I/O volume/statement
time series analysis
phase identification

51
What Is a Grid?

Collection of computing resources
varying in power or architecture
potentially dynamically varying in load and
reliability
Interconnected by network
links may vary in bandwidth
load may vary dynamically
Distribution
across room, campus, state, nation, and globe

52
Grid Applications

NSF Network for Earthquake Engineering Simulation
(NEES)
integrated instrumentation, collaboration,
simulation
Grid Physics Network (GriPhyN)
ATLAS, CMS, LIGO, SDSS
distributed analysis of petascale data
Globus (Foster and Kesselman)

GriPhyN Physics Grid Network
NSF NEES Earthquake Grid
53
Grid Implications

Dynamic optimization requires
resource models
scheduling
real-time measurement and control
wide-area infrastructure
multiple execution modes
"Come as you are" execution

54
The Grid Big Picture
Dynamic Adaptation
Participants: Ruth Aydt, Andrew Chien, Fran Berman, Jack Dongarra, Ian Foster,
Dennis Gannon, Ken Kennedy, Carl Kesselman, Lennart Johnsson, and Dan Reed
55
Configurable Object Program

Representation of the application
dynamic reconfiguration and optimization
for distributed targets
includes
program intermediate code
annotations from the compiler (reconfiguration
strategy)
historical information (run profile to now)
Reconfiguration strategies
aggregation of data regions (submeshes)
aggregation of tasks
definition/redefinition of parameters
used for algorithm selection

56
Program Execution System
To PPS
57
Performance Contracts

At the heart of the GrADS Model
mechanisms for managing mapping and execution
What are they?
mappings from resources to performance
mechanisms for determining
when to interrupt and reschedule
Abstract definition
random variable r(A,I,C,t0) with a PDF
A = application, I = input, C = configuration, t0 = time of initiation
important statistics: lower and upper bounds (95% confidence)
issue
Is r a derivative at t0? (Wolski)
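
A contract of this kind can be sketched as an interval derived from earlier observations of r for a fixed application, input, and configuration. The percentile-based bounds below are one possible reading of "lower and upper bounds", not the GrADS definition; the history values and function names are hypothetical.

```python
def contract_bounds(samples, confidence=0.95):
    """Empirical lower and upper bounds that contain the central
    `confidence` fraction of previously observed values."""
    ordered = sorted(samples)
    tail = (1.0 - confidence) / 2.0
    lo_index = int(tail * (len(ordered) - 1))
    hi_index = len(ordered) - 1 - lo_index
    return ordered[lo_index], ordered[hi_index]

def violates(measurement, bounds):
    """True when a new measurement falls outside the contract interval."""
    lower, upper = bounds
    return not (lower <= measurement <= upper)

# Hypothetical history of r for fixed A, I, C (e.g. seconds per iteration).
history = [float(v) for v in range(100, 200)]   # 100.0 .. 199.0

bounds = contract_bounds(history)
print(bounds)
print(violates(150.0, bounds), violates(250.0, bounds))
```

A violation is the signal that would prompt the system to consider interrupting and rescheduling the application.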

58
Performance Contracts
[Diagram: wind tunnel scenario with CFD, Database, and Viz components; labeled rates: 100 GF/s, 200 MB/s, 2 GB/s, 10 fps]

Goals
find a schedule
compute, storage, and network resource binding
subject to performance constraints
continuously validate contract
application or system violations possible

59
Performance Contracts

Given
compact application signature
clustering of application intrinsic metrics
resource behavioral estimates
Project application signature
Validate projection against measured behavior

[Figure: axes labeled Resources and Metric 1]
60
Real-time Multilevel Analysis

Multilevel Drilldown
multiple sites
multiple metrics
Autopilot daemons
SvPablo instrumentation
real-time display
Five Components
shared views
direct manipulation tools
multimedia annotations
actualization interfaces
CAVE to palm

61
Rocket Simulation Case Study

Code developed by DOE ASCI Center for Simulation of Advanced Rockets (CSAR) at UIUC
40,000 lines of Fortran, MPI for communication between processes, runs on SGI Origin
200 hours on 128 PEs to simulate 1/2 second of burn
Ultimately want to model 2 minutes for complete booster burn-off

[Flowchart]
Init
Fluids Code (10 fluid iterations); could modify iterations with actuator
Interpolation
Solids Code: do 3+1 multigrid solution for each of the meshes (3 for coarse grain mesh, 1 for fine grain)
Convergence Test (branches: N, Y): check against a residual; best case, converge on first try
Y: saves data, advances time step
Output
62
Virtue CSAR Visualization
