Thermal-Aware Scheduling in Environmentally Coupled Cyber-Physical Distributed Systems

Slides: 43
Provided by: eric396
1
Thermal-Aware Scheduling in Environmentally
Coupled Cyber-Physical Distributed Systems
  • Qinghui Tang
  • Committee
  • Dr. Sandeep Gupta
  • Dr. Martin Reisslein
  • Dr. Loren Schwiebert
  • Dr. Cihan Tepedelenlioglu
  • Dr. Junshan Zhang

Sponsors
2
Presentation Outline
  • Background and Motivation
  • Unified Thermal-Aware Approach
  • Applications of Thermal-Aware Scheduling
  • Summary of Research Results
  • Conclusions
3
Background and Motivation
  • What are Cyber-Physical Systems (CPS)?
  • Computing systems tightly coupled with the physical
    world
  • Environmentally coupled CPS (ECCPDS)
  • Apply interference to the system itself and the
    surrounding environment
  • Increasing deployment of distributed systems
  • Sensor networks
  • Pervasive computing
  • Grid/cluster computing
  • Existing approaches and methodologies do not take
    into account the interference and interactions
    among systems and the environment
  • Emerging systems require a new methodology and
    approach
  • Cross-disciplinary, more complicated applications

4
Environmentally Coupled Distributed CPS
  • Terminologies
  • Interference
  • the negative impact to the environment which
  • Self-interference
  • Environmental interference
  • Cross-interference
  • Interference models
  • Quantitative model
  • Temporal model
  • Spatial model
  • Comprehensive model
  • Individual design approach
  • Network/system operation approach
  • Task scheduling

5
Thermal-Aware ECCPDS
  • We focus on thermal related applications because
  • Correlation between heat dissipation and power
    consumption (energy efficiency)
  • Correlation between temperature change and
    reliability
  • Importance of energy efficiency and system lifetime
  • Direct impact on embedded environment
  • Green technology is the new trend
  • Energy-efficient and environmentally friendly

6
Examples of Task Scheduling of Cyber-Physical
Systems
  • Implanted biomedical sensor networks are used for
    prosthesis or monitoring
  • Sensor nodes work in shifts to accomplish the
    assigned task
  • Task scheduling in the temporal domain
  • Server farms inside data centers
  • Heat dissipation of one server may heat up other
    servers
  • Task scheduling in the spatial domain

7
Unified Thermal-Aware Scheduling for ECCPDS (1)
  • A cyber-physical system with N nodes interacting
    with each other
  • A scheduler divides the total task Ctotal into a
    task vector C = [C1, ..., CN], resulting in a power
    consumption vector P = [P1, ..., PN]
  • Each node i
  • performs a subset Ci of the total task Ctotal
  • consumes power at a certain rate Pi
  • experiences a temperature change Ti depending on
    other nodes' power consumption
  • System objective function W depends on node
    temperatures (and task assignments)

8
Unified Thermal-Aware Scheduling for ECCPDS (2)
  • Problem Statement
  • How to divide the total task Ctotal into C1, C2, ..., CN to minimize/maximize the objective
    function W
  • Generalized Approach
  • Step 1: Profiling the correlation among task, power
    consumption, and temperature rise: function
    Gi(·)
  • estimation, measurement or profiling
  • Step 2: Characterizing the thermal interference
    function Fi(·) and building a fast thermal
    evaluation method
  • Step 3: Formalizing the objective function
    H(·)
  • Step 4: Exploring the design space to find the best
    schedule
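
The four steps can be sketched as a small pipeline. This is only an illustration under stated assumptions: the linear power model in `power_from_task`, the weighted-sum interference in `temperatures`, the peak-temperature objective, and every number in `COUPLING` are hypothetical stand-ins for G, F and H, not the models used in the dissertation. Step 4 is done here by exhaustive search, which is only practical for tiny examples.

```python
from itertools import product

def power_from_task(c):
    """Hypothetical G: idle power plus a load-proportional term (assumed linear)."""
    return 10.0 + 2.0 * c  # illustrative numbers only

def temperatures(powers, coupling, ambient=20.0):
    """Hypothetical F: each node's temperature is ambient plus a weighted sum of all powers."""
    n = len(powers)
    return [ambient + sum(coupling[i][j] * powers[j] for j in range(n)) for i in range(n)]

def objective(temps):
    """Hypothetical H: the hottest node determines the cost W (to be minimized)."""
    return max(temps)

def best_schedule(c_total, coupling):
    """Step 4: try every integer split of c_total over the nodes and keep the best."""
    n = len(coupling)
    best = None
    for split in product(range(c_total + 1), repeat=n):
        if sum(split) != c_total:
            continue  # not a valid division of the total task
        w = objective(temperatures([power_from_task(c) for c in split], coupling))
        if best is None or w < best[0]:
            best = (w, split)
    return best

# coupling[i][j]: assumed temperature rise at node i per unit of power at node j
COUPLING = [
    [0.10, 0.02, 0.00],
    [0.05, 0.10, 0.02],
    [0.08, 0.05, 0.10],
]

w, split = best_schedule(6, COUPLING)
print("task split:", split, "objective W:", round(w, 2))
```

Swapping in a different `objective` (for example, total energy) changes Step 3 without touching Steps 1 and 2, which is the point of keeping the three functions separate.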

9
Related Work
  • Previous research on minimizing thermal
    interference
  • focused on the individual design approach instead of
    the system operation approach
  • used numerical methods for thermal evaluation, which
    are not appropriate for online and real-time
    scheduling
  • failed to consider the cross-interference applied
    by neighboring nodes

10
Dissertation Contributions
  • Proposed a unified methodology and analytical
    technique of analyzing and designing
    interference-minimized distributed systems
  • Verified in two thermal applications
  • Can be applied to other forms of interference
    (e.g., sonic)
  • Verified the methodology by applying the approach
    on two vastly different applications
  • Built an abstract heat model for fast thermal
    evaluation and power consumption prediction
  • Thermal-aware task scheduling for biomedical
    sensor networks
  • IEEE Trans. Biomedical Eng. '07
  • DCOSS'05
  • Minimizing data center cooling energy cost
    through thermal-aware task placement
  • IEEE TPDS special issue on Power-aware Parallel
    and Distributed Computing
  • DASC'06, ICISIP'06, Cluster'07, COMSWARE'07

11
Dissertation Contributions (cont.)
  • Thermal-aware task scheduling for biomedical
    sensor networks
  • Modeling thermal interference of implanted
    biosensors
  • Identifying factors that minimize thermal effects
  • Time-Space function for fast thermal evaluation
  • Minimizing data center cooling energy cost
    through thermal-aware task placement
  • Homogeneous data center with a single task
  • Thermal-aware algorithm based on interference
    characterization
  • Heterogeneous data center with heterogeneous
    tasks
  • Multiple tasks with different timing information

12
  • Application Example: Task Scheduling of
    Biosensor Networks

13
Biosensor Scheduling Overview
  • Implanted biomedical sensor networks are used for
    prosthesis or monitoring
  • Sensor nodes work in shifts to accomplish the
    assigned task
  • Environment interference should be minimized
  • This is task scheduling in the temporal domain
  • Task assignments for multiple time slots
  • Ctotal = 1
  • In each slot, only one node performs the task
14
Biosensor Scheduling Step 1: Profiling the
Correlation
  • Profiling the correlation between power
    consumption and temperature rise Gi(·) with
    the Pennes bioheat equation

[Equation with labeled terms: heat accumulated; heat transfer by conduction; heat transfer by convection; heat by radiation; heat by power dissipation; heat by metabolism]
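
The equation itself did not survive in this transcript. For orientation, a commonly cited form of the Pennes bioheat equation with the same six terms is sketched below. The symbols are assumptions chosen here for illustration (tissue density \rho, specific heat C_p, thermal conductivity K, blood perfusion constant b, blood temperature T_b, specific absorption rate SAR, circuit power density P_c, metabolic heating Q_m) and may differ from the notation on the slide.

```latex
% Assumed notation; the slide's own symbols were lost in extraction.
\underbrace{\rho C_p \frac{\partial T}{\partial t}}_{\text{heat accumulated}}
  = \underbrace{K \nabla^2 T}_{\text{conduction}}
  - \underbrace{b\,(T - T_b)}_{\text{convection (blood perfusion)}}
  + \underbrace{\rho\,\mathrm{SAR}}_{\text{radiation}}
  + \underbrace{P_c}_{\text{power dissipation}}
  + \underbrace{Q_m}_{\text{metabolism}}
```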
15
Biosensor Scheduling Step 2: Characterizing
Thermal Interference F(·)
  • Characterizing cross interference between node i
    and node j as a function of spatial distance and
    temporal distance

[Figure: spatial distance and temporal distance between nodes 1 to 4]
16
Biosensor Scheduling Steps 3 and 4: Exploring
the Design Space
  • The objective function H(·)
  • Searching for the best scheduling sequence by using
    a genetic algorithm
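
A minimal sketch of such a search is given below, assuming a toy interference model: heating decays exponentially with spatial distance, and a fixed fraction of heat remains after each time slot (the temporal distance effect). The node positions, `COOLING`, the peak-temperature objective and all genetic algorithm parameters are hypothetical choices for illustration, not values from the dissertation.

```python
import math
import random

POSITIONS = [(0, 0), (1, 0), (0, 1), (3, 3)]  # hypothetical node coordinates
SLOTS = 8        # number of time slots; one active node per slot
COOLING = 0.6    # assumed fraction of heat that remains after one slot

def spatial_weight(i, j):
    """Assumed spatial term: heating of node j by node i decays with distance."""
    return math.exp(-math.dist(POSITIONS[i], POSITIONS[j]))

def peak_temperature(schedule):
    """Hypothetical H: highest temperature rise seen at any node in any slot."""
    temps = [0.0] * len(POSITIONS)
    peak = 0.0
    for active in schedule:
        # Old heat decays (temporal distance), the active node adds new heat.
        temps = [COOLING * t + spatial_weight(active, j) for j, t in enumerate(temps)]
        peak = max(peak, max(temps))
    return peak

def genetic_search(generations=60, pop_size=30, mutation_rate=0.1, seed=0):
    """Elitist genetic algorithm over sequences of node ids."""
    rng = random.Random(seed)
    n = len(POSITIONS)
    population = [[rng.randrange(n) for _ in range(SLOTS)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=peak_temperature)
        survivors = population[: pop_size // 2]   # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, SLOTS)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [rng.randrange(n) if rng.random() < mutation_rate else g
                     for g in child]              # per-gene mutation
            children.append(child)
        population = survivors + children
    return min(population, key=peak_temperature)

best = genetic_search()
print("schedule:", best, "peak rise:", round(peak_temperature(best), 3))
```

Under this toy model, keeping the same node active in every slot is the worst case, so any schedule the search returns should rotate the task among nodes that are far apart in space or time.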

17
  • Application Example: Thermal-Aware Task
    Scheduling of Data Center

18
Problem Statement of Task Scheduling in Data
Centers
  • Given a total task C, how should it be divided among N
    server nodes to finish the computing task with
    minimal cooling energy cost?
  • Self-interference and cross-interference raise
    the inlet air temperature and should be
    minimized
  • Environment interference (room temperature) is
    not critical
  • Task scheduling in the spatial domain

[Figure: data center with 4 servers; task 30]
19
Conceptual Overview of Thermal-Aware Task
Placement
Server task distribution → Power consumption distribution → Temperature distribution → Energy cost
20
Data Center Preliminary: Layout
  • Outlet temperature Tout
  • Inlet temperature Tin: must be less than 25°C
  • Cold supply temperature Ts
21
Data Center Preliminary: Scheduling vs. Cooling
Cost
[Figure: inlet temperature distribution without cooling and with cooling, for Scheduling 1 and Scheduling 2, each against the 25°C limit]
  • Minimizing the peak inlet temperature is equivalent to
    minimizing the cooling cost
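
A small worked example of why the peak matters, assuming each inlet temperature is the supply temperature plus a schedule-dependent rise: the hottest inlet alone fixes how cold the supply air must be. The rise values below are made up for illustration, and the step from a warmer supply temperature to a lower cooling cost rests on the general observation that cooling units need less energy when they can supply warmer air.

```python
T_LIMIT = 25.0  # inlet temperature limit in °C, as stated on the layout slide

def required_supply_temp(inlet_rise):
    """Highest supply temperature that keeps every inlet at or below T_LIMIT."""
    return T_LIMIT - max(inlet_rise)

# Hypothetical inlet temperature rises (°C above supply air) for the same total task
scheduling_1 = [2.0, 3.5, 9.0, 4.0]   # one pronounced hot spot
scheduling_2 = [4.5, 5.0, 5.5, 4.0]   # flatter profile

for name, rise in (("Scheduling 1", scheduling_1), ("Scheduling 2", scheduling_2)):
    print(name, "needs supply air at", required_supply_temp(rise), "°C")
```

The second schedule has the higher average rise but the lower peak, so it tolerates warmer supply air.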
22
Data Center Step 1: Profiling the Correlation
Gi(·)
  • Server power consumption Pi: depends on the amount
    of computing task
  • Outlet airflow
  • Inlet airflow: a mixture of supplied cold air and
    recirculated hot air
23
Data Center Step 2: Characterizing Cross
Interference F(·)
  • Heat Recirculation Coefficients
  • Analytical
  • Matrix-based
  • Characterizing process
  • Running CFD with various power consumption
    scenarios
  • Calculating recirculation coefficients based on
    the laws of conservation of matter and energy
  • Using the coefficients to predict temperatures without
    running CFD

Tin = Tsup + D P
  • Tin: inlet temperatures
  • Tsup: supplied air temperatures
  • D: heat distribution
  • P: power vector
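
Reading the labels above as the linear relation Tin = Tsup + D·P, the evaluation is a single matrix-vector product. The sketch below uses plain Python lists; the 3-server matrix `D` and the supply and power values are hypothetical numbers chosen only to show the computation.

```python
def inlet_temperatures(t_sup, d, p):
    """Evaluate Tin = Tsup + D * P element by element (no CFD run needed)."""
    n = len(p)
    return [t_sup[i] + sum(d[i][j] * p[j] for j in range(n)) for i in range(len(t_sup))]

# Hypothetical 3-server example; D entries are in °C per watt
D = [
    [0.010, 0.002, 0.000],
    [0.004, 0.010, 0.002],
    [0.006, 0.004, 0.010],
]
T_SUP = [15.0, 15.0, 15.0]   # supplied air temperature seen by each server, °C
P = [200.0, 300.0, 100.0]    # power drawn by each server, W

t_in = inlet_temperatures(T_SUP, D, P)
print([round(t, 2) for t in t_in])
```

Because D is fixed once the room has been characterized, each candidate task placement costs only one such product to evaluate.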
24
Benefit: Fast Thermal Evaluation
  • CFD path: give workload, run CFD simulation (days), extract temperatures (courtesy Flomerics)
  • Fast path: give workload, compute vector from D and Tsup (seconds), yields temperatures Tin
25
Data Center Step 4: Explore Solutions
  • Homogeneous data center with a single task
  • Naïve algorithms without considering cross
    interference
  • Thermal-aware algorithm based on interference
    characterization
  • Heterogeneous data center with heterogeneous
    tasks
  • Multiple tasks with different timing information

26
Recirculation Coefficients
  • Consistent with data center observations
  • Large values are observed along diagonal
  • Strong recirculation among neighboring servers,
    or between bottom servers and top servers

[Figure: recirculation coefficients between source nodes and victim nodes (axes: Sources, Victims; nodes 1 to 50), with labeled coefficients α1-4, α1-5, α1-10, α1-40, α46-50]
27
Fast Thermal Evaluation Results
  • Thermal Evaluation
  • Fast thermal evaluation
  • Acceptable prediction error, less than normal
    temperature fluctuation
  • Energy Efficiency
  • Consistently provides optimal or near-optimal
    energy efficiency
  • Energy savings of 5% to 30%, depending on utilization
    rate

28
Heterogeneous Data Center with Heterogeneous Tasks
[Figure: data center with 4 servers]
  • Change of solution: vector to matrix
  • Change of constraints
29
Multiple Tasks with Different Timing Parameters
[Figure: data center with 4 servers; tasks 35, 30]
  • Change of objective function
  • Change of constraints
30
Conclusion and Future Work
  • Increasingly tightly coupled Cyber-Physical
    Systems require a new methodology for new
    applications
  • Proposed approach
  • Characterizing complicated interference between
    systems and embedded environment
  • Minimizing thermal effects
  • Real-time online decisions
  • Future work in biosensor networks
  • Thermal-aware scheduling for multiple clusters
  • Cross-cluster interference
  • Applying interference minimization to coverage
    and topology applications

31
Conclusion and Future Work (cont.)
  • Future work in data center management
  • Overall data center operation cost
  • Trade-off between cooling cost and computing cost
  • Hardware reliability model, trade-off between
    energy cost and hardware cost
  • Multiple tasks with different priorities and
    deadlines
  • Estimation of execution time
  • Other Challenges in Environmentally Coupled
    Cyber-Physical Systems
  • Online characterization
  • without interrupting normal operation
  • For the case where it is impossible to conduct
    test and verification
  • Unknown environment
  • Investigate the applicability of using the
    methodology on other non-thermal interference
  • Chemical sensors to monitor enzyme reaction
  • Minimizing the chance of being detected in a
    hostile environment
  • Different approaches of modeling interference
  • For the case where interference cannot be
    measured directly

32
Questions?
33
Backup Slides for
34
Review of Two Studied Cyber-Physical Applications
35
Experience Obtained
  • Cross-disciplinary research problems
  • Challenging and promising
  • Incremental Research Approach
  • Extensive survey to identify existing problems
    and gaps between existing solutions
  • Start with a simplified system model, then gradually
    relax system assumptions to obtain a more
    realistic one

[Figure labels: Problem Investigation; Identify interference source and impact; Characterizing Interference; Modeling Interference; Formalizing problem, Explore solutions; Verify solution, Performance comparison; Relax assumption]
36
System Model
Interference causes undesired temperature rise
Heat Exchange
System performance depends on the thermal
distribution
37
Characterizing the Interference Function F(?)
  • Characterizing the interference applied to
    neighboring nodes and the environment
  • Building heat model to characterize
  • Power consumption of each node
  • Heat dissipation of each node
  • Thermal interference to other nodes
  • Conducting fast thermal evaluation
  • Replacing the traditional numerical method to predict
    thermal performance in real time

[Figure labels: a task scheduling result; numerical simulation; fast thermal evaluation; temperature prediction]
38
Application Background
  • What are data centers?
  • Server farms, IT centers, computer rooms
  • Why are they important?
  • Centralized management, powerful computation
    capabilities
  • Backbones of Internet infrastructure
  • Why is thermal management important?
  • Improve reliability
  • Reduce system downtime
  • Save energy cost
  • $400,000 annually to power a data center of 1,000
    volume server units; then how much for this one?
  • More than 40% is cooling cost

39
Data Center Step 2: Characterizing Cross
Interference F(·)
  • Recirculation coefficients
  • Quantified description of recirculation
  • Characterizing process
  • Running CFD with various power consumption
    scenarios
  • Calculating recirculation coefficients based on
    the laws of conservation of matter and energy
  • Using the coefficients to predict temperatures without
    running CFD

[Figure labels: the heat in outlet air partly recirculates to other inlets and partly returns to the AC; the heat in inlet air consists of cold supply air and recirculated heat; power consumption]
40
Data Center Step 2: Fast Thermal Evaluation
  • Based on the law of conservation of energy, after
    some mathematical derivation, we have

[Equation with labeled terms: inlet temperature; supplied cold air; power consumptions; recirculation coefficients; constants that depend on hardware specifications and constant properties of air]
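
The equation did not survive in this transcript. One way such a derivation can go is sketched below, under assumed notation that is not taken from the slide: K_i = \rho f_i c_p collects the air density, the airflow rate of node i and the specific heat of air (the hardware and air constants), K = diag(K_1, ..., K_N), A is the matrix of recirculation coefficients \alpha_{ij} (heat fraction from node i to node j), and all inlets share one supply temperature.

```latex
% Assumed symbols: K_i = \rho f_i c_p, K = \mathrm{diag}(K_i), A = [\alpha_{ij}].
\begin{align*}
P_i &= K_i \left( T^{out}_i - T^{in}_i \right)
  && \text{(power heats the air passing through node } i\text{)} \\
K_i T^{in}_i &= \sum_{j} \alpha_{ji} K_j T^{out}_j
  + \Bigl( K_i - \sum_{j} \alpha_{ji} K_j \Bigr) T_{sup}
  && \text{(inlet air = recirculated air + supplied air)} \\
(K - A^{\top} K)\,\bigl(\vec{T}_{in} - \vec{T}_{sup}\bigr) &= A^{\top} \vec{P}
  && \text{(substitute } \vec{T}_{out} = \vec{T}_{in} + K^{-1}\vec{P}\text{)} \\
\vec{T}_{in} &= \vec{T}_{sup} + \underbrace{(K - A^{\top} K)^{-1} A^{\top}}_{D}\, \vec{P}
\end{align*}
```

In this form the matrix multiplying the power vector plays the role of the heat distribution matrix D, and it depends only on the recirculation coefficients and the constants in K.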
41
Data Center Step 3: Formalizing the Minimization
Problem H(·)
  • Minimizing the maximal inlet temperature
  • Can be converted into a linear or non-linear
    optimization problem
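
As an illustration of the conversion, a min-max objective can be written with an auxiliary variable z that bounds every inlet temperature (the standard epigraph form). The linear power model P_j = a c_j + b, the per-node capacity c_max and the entries d_ij of a heat distribution matrix are assumptions made for this sketch. With continuous task shares it is a linear program; integer task counts or a non-linear power model change the problem class.

```latex
% Assumed: linear power model, per-node capacity c_{max}, heat distribution entries d_{ij}.
\begin{align*}
\min_{c_1,\dots,c_N,\; z} \quad & z \\
\text{subject to} \quad
 & T_{sup} + \sum_{j=1}^{N} d_{ij} P_j \le z, && i = 1,\dots,N \\
 & P_j = a\,c_j + b, && j = 1,\dots,N \\
 & \sum_{j=1}^{N} c_j = C_{total}, \qquad 0 \le c_j \le c_{max}, && j = 1,\dots,N
\end{align*}
```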

42
Airflow Inside Data Centers
  • Observation
  • Airflow patterns are stable (confirmed through
    CFD simulations)
  • Hypothesis
  • The amount of recirculated heat is stable, can be
    quantified as recirculation coefficients
  • Define αij as the percentage of recirculated heat
    from node i to node j
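
A minimal sketch of how such coefficients split outlet heat, written as `alpha[i][j]` for the fraction of node i's outlet heat that reaches node j's inlet. The coefficient values and outlet heat rates are hypothetical, and the remainder of each row is assumed to return to the air conditioner.

```python
def split_outlet_heat(alpha, q_out):
    """Split each node's outlet heat into recirculated and returned parts.

    alpha[i][j]: assumed fraction of node i's outlet heat reaching node j's inlet.
    q_out[i]: heat rate carried by node i's outlet air.
    """
    n = len(q_out)
    # Heat arriving at inlet j from every source i
    to_inlets = [sum(alpha[i][j] * q_out[i] for i in range(n)) for j in range(n)]
    # Whatever is not recirculated returns to the air conditioner
    to_ac = [(1.0 - sum(alpha[i])) * q_out[i] for i in range(n)]
    return to_inlets, to_ac

# Hypothetical coefficients for three nodes (each row sums to less than 1)
ALPHA = [
    [0.05, 0.10, 0.02],
    [0.03, 0.05, 0.12],
    [0.01, 0.04, 0.05],
]
Q_OUT = [500.0, 400.0, 300.0]

to_inlets, to_ac = split_outlet_heat(ALPHA, Q_OUT)
print("recirculated to inlets:", [round(q, 1) for q in to_inlets])
print("returned to AC:", [round(q, 1) for q in to_ac])
```

The two parts add up to the total outlet heat, which is the conservation check used when the coefficients are fitted from CFD runs.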

Courtesy Flomerics