Title: Challenges in Distributed Energy Adaptive Computing
1. Challenges in Distributed Energy Adaptive Computing
2.
- Information and Communication Technology (ICT) has a problem
- Performance-centric -> energy/sustainability-centric
- How do we get there?
3. ICT Power Growth until 2020
- Increase in spite of power-efficient designs
- Clients: 8x in number, 3x in power
- Data centers: > 2x increase
- Network: 3x increase
[Figure labels: Network; Clients; Data Center; Transmission, conversion, and distribution]
4. Current State: Unsustainable Computing
5. Data Center Infrastructure
- Resource intensive: water, cabling, metal, ...
- 50% of power wasted before getting to the racks
6. Distribution Infrastructure
- 10% distribution loss, high carbon impact
[Figure: power distribution path to the IT load. Labels: 115 kV, 13.2 kV, 480 V, 208 V, UPS, IT load, 2.5 MW generator (180 gallons/hour)]
- Stage losses shown: 1.0% (99.0% efficient), 6% (94% efficient), 0.3% (99.7% efficient), 0.5% (99.5% efficient)
- 1% loss in switch gear and conductors
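The stage losses above can be combined into one end-to-end figure. The sketch below assumes the stages sit in series and counts the 1% switch gear and conductor loss as a fifth stage; which loss belongs to which stage is not recoverable from the slide, but the product does not depend on the order.

```python
from math import prod

# Per-stage losses as fractions, taken from the slide. Treating the
# switch gear/conductor loss as a separate series stage is an assumption.
stage_losses = [0.010, 0.06, 0.003, 0.005, 0.01]


def delivered_fraction(losses):
    """Fraction of input power that survives a series of lossy stages."""
    return prod(1.0 - loss for loss in losses)


fraction = delivered_fraction(stage_losses)
total_loss = 1.0 - fraction
print(f"delivered: {fraction:.3f}, lost: {total_loss:.3f}")
```

Under these assumptions the chain delivers a little over 91% of the input power, which is in the same range as the roughly 10% distribution loss quoted on the slide.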
7. 50% of Rack Power Wasted
Component           Total (W)  Used (W)  Comments
CPU                 80         60        Operating at 100% utilization
Fans                50         25        Temp. directed fan at 100% util
Memory (32 GB)      88         24        2 GB DIMMs, 4 W idle, 19 W active
Hard drives         40         10        6 SATA drives, 25% busy
I/O adapters        20         4         25 disk, 15 network
Motherboard         22         12        N/S bridges, devices, VRs, ...
Total DC power      300        135
Power supply loss   50         7         14% -> 5% loss of AC input power
AC input power      350        142       > 50% of power is wasted
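The totals in the table can be checked with a few lines of arithmetic. This sketch assumes the two numeric columns are watts and that "Used" is the part of the provisioned power that does useful work; the variable names are illustrative.

```python
# (component, total watts, used watts) as listed in the table.
components = [
    ("CPU", 80, 60),
    ("Fans", 50, 25),
    ("Memory", 88, 24),
    ("Hard drives", 40, 10),
    ("I/O adapters", 20, 4),
    ("Motherboard", 22, 12),
]
psu_loss_total, psu_loss_used = 50, 7

dc_total = sum(total for _, total, _ in components)
dc_used = sum(used for _, _, used in components)
ac_total = dc_total + psu_loss_total
ac_used = dc_used + psu_loss_used

# Share of the AC input that is not put to use.
wasted_fraction = 1 - ac_used / ac_total
# PSU loss as a share of AC input, for each column.
psu_share_total = psu_loss_total / ac_total
psu_share_used = psu_loss_used / ac_used

print(dc_total, dc_used, ac_total, ac_used)
print(f"wasted: {wasted_fraction:.1%}")
print(f"PSU loss: {psu_share_total:.1%} -> {psu_share_used:.1%}")
```

The sums reproduce the 300/135 W DC and 350/142 W AC rows, the PSU share matches the 14% and 5% in the comments column, and the wasted share comes out just under 60%.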
8. Sustainable Computing
9. Renewable Energy Push
- Limit energy draw from the grid
- Less infrastructure
- Fewer losses
- but variable supply
Need better power adaptability
10. High Temperature DCs
- Chiller-less operation
  - Less energy/materials, but space inefficient
- High-temperature operation
  - Smaller T_outlet - T_inlet
  - More throttling
  - More failure prone (?)
Need smarter thermal adaptability
11. Overdesign
- Overdesign is the norm today
  - Huge power supplies, fans, heat sinks, server cases, high rack capacity, UPS capacity, ...
  - Engineered for the worst case -> rarely encountered
  - Huge power wastage, waste of materials, energy, ...
- What if we right-size everything?
  - Highly energy efficient, but needs smarter control
Better energy adaptability to deal with frugal design
12. Energy Adaptive Computing
- EAC strives for dynamic end-to-end adjustment:
  - Workload adaptation for graceful QoS degradation under energy limitations
  - Infrastructure adaptation to cope with temporary energy deficiencies
- Requires coordinated power/thermal management of computation, network, and storage
- Enhances sustainability of IT infrastructure
13. EAC Instances
14. Client-server EAC
- Transparently adapt to client energy states
  - State: on-AC, normal, low-battery, ...
  - Service contract C_i: setup QoS, operational QoS
- Adaptation challenges
  - Communicating and enforcing contracts
  - Group adaptation of clients forced by network/servers?
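One way to picture the state-to-contract mapping is a small lookup table. This is only a sketch: the class names, the QoS labels, and the specific pairing of states with contracts are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class EnergyState(Enum):
    ON_AC = "on-AC"
    NORMAL = "normal"
    LOW_BATTERY = "low-battery"


@dataclass(frozen=True)
class ServiceContract:
    setup_qos: str        # QoS agreed when the session is set up
    operational_qos: str  # QoS delivered while the session runs


# Hypothetical contract table; the labels are placeholders.
CONTRACTS = {
    EnergyState.ON_AC: ServiceContract("full", "full"),
    EnergyState.NORMAL: ServiceContract("full", "reduced"),
    EnergyState.LOW_BATTERY: ServiceContract("reduced", "minimal"),
}


def contract_for(state: EnergyState) -> ServiceContract:
    """Return the contract a server would apply for a client energy state."""
    return CONTRACTS[state]


print(contract_for(EnergyState.LOW_BATTERY))
```

A table like this keeps the adaptation transparent to the client: the client only reports its state, and the server picks the contract.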
15. Cluster EAC
- Adaptation to intra- and inter-DC limits
  - Multi-level: server, rack, and DC levels
- Adaptation challenges
  - Estimate and collect power deficits/surplus at multiple levels
  - Coordination across a large range of devices
  - Location-based services
  - Coordination across levels
  - Simultaneously handle the client-server loop
16. P2P EAC
- Adaptation based on available energy
  - Content: video resolution, audio coding, ...
  - Network: modulate wireless radio usage (?)
- Energy-proportional use of peer resources
- Energy-driven content replication and reorganization
- Adaptation challenges
  - Satisfying QoS?
  - Balancing src/dest usage vs. relay node energy usage?
17. Challenges: Some Specific Issues
18. Power Estimation Challenges
- Notion of effective power?
  - Additive relationship: workload -> power
  - Why is this hard? Interference
- Available power
  - Determined by power, thermal, and perhaps other issues (noise)
  - Required at multiple levels: facility, enclosure, machine, ...
19. Network Role in EAC
- Energy adaptation
  - Aggressive control of switch/router ports
  - Speed, state, and width controls
  - Traffic consolidation across paths
- Adaptation-induced congestion
  - Propagation (e.g., ECN, EBCN) and response
  - Computation/communication tradeoff?
  - Redirection?
- Network protocol support for adaptation?
20. Other Issues
- EAC security
  - Attacks on power sources
  - Energy attacks on IT, e.g., demanding too much, cyclic demands, ...
- Storage adaptation
  - Storage devices, controllers, and network
- Coordinated end-to-end control is hard!
- Formal models to understand the impact of energy adaptation
21. Energy Adaptation in Data Centers
22. Adaptation Methods
- Workload adaptation
  - Coarse grain: shut down low-priority tasks
  - Fine grain: graceful QoS degradation, e.g., batched service, poorer resolution, ...
- Infrastructure adaptation
  - Operation at lower speeds (DVFS)
  - Effective use of low-power modes and width control
- Workload adaptation is always done first
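The ordering on this slide (workload adaptation before infrastructure adaptation) can be sketched as a three-step response to a power deficit. Everything concrete here is an assumption made for illustration: the `Task` fields, the choice to try fine-grain degradation before coarse-grain shutdown, the `shutdown_below` priority threshold, and the single lump-sum DVFS saving.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int      # lower number = less important (assumption)
    power_w: float     # power drawn at normal QoS
    degraded_w: float  # power drawn at degraded QoS


def adapt(tasks, deficit_w, dvfs_saving_w, shutdown_below=2):
    """Return (actions, remaining_deficit_w) for a power deficit in watts."""
    actions = []
    remaining = deficit_w
    draw = {t.name: t.power_w for t in tasks}
    by_priority = sorted(tasks, key=lambda t: t.priority)

    # Step 1, workload (fine grain): degrade QoS, least important first.
    for t in by_priority:
        if remaining <= 0:
            break
        saving = draw[t.name] - t.degraded_w
        if saving > 0:
            actions.append(("degrade", t.name))
            draw[t.name] = t.degraded_w
            remaining -= saving

    # Step 2, workload (coarse grain): shut down low-priority tasks only.
    for t in by_priority:
        if remaining <= 0:
            break
        if t.priority < shutdown_below:
            actions.append(("shutdown", t.name))
            remaining -= draw[t.name]
            draw[t.name] = 0.0

    # Step 3, infrastructure: lower the speed (DVFS) only if still short.
    if remaining > 0:
        actions.append(("dvfs", "server"))
        remaining -= dvfs_saving_w

    return actions, max(remaining, 0.0)


tasks = [Task("batch", 1, 40.0, 30.0), Task("web", 3, 60.0, 50.0)]
print(adapt(tasks, deficit_w=15.0, dvfs_saving_w=20.0))
print(adapt(tasks, deficit_w=60.0, dvfs_saving_w=20.0))
```

With a small deficit only QoS degradation is used; with a larger one the low-priority task is shut down and DVFS is applied last.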
23. Infrastructure Adaptation
- Need a multilevel scheme
  - Individual assets up to the entire data center
- Need both supply- and demand-side adaptations
24. Supply Side Adaptation
- Supply-side limits
  - Hard caps at higher levels (true limit) vs. soft (artificial) caps at lower levels
  - Limits may be a result of thermal/cooling issues
- Load consolidation
  - An essential part of energy-efficient operation
  - Load consolidation vs. soft capping
- Need to address workload adaptation changes that result from supply increases and decreases
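The relation between a hard cap at a higher level and soft caps below it can be illustrated with a simple split. Proportional-to-demand allocation is an assumption chosen for brevity; the slides do not say how soft caps are derived, and the node names are made up.

```python
def allocate_soft_caps(cap_w, demands_w):
    """Split a parent's cap among children in proportion to their demand.

    If total demand fits under the cap, every child gets what it asks for.
    """
    total = sum(demands_w.values())
    if total <= cap_w:
        return dict(demands_w)
    scale = cap_w / total
    return {name: demand * scale for name, demand in demands_w.items()}


# Hard cap at the data-center level, soft caps for the racks below it.
rack_caps = allocate_soft_caps(900.0, {"rack1": 600.0, "rack2": 400.0})

# The same rule applied one level down, using rack1's soft cap as the limit.
server_caps = allocate_soft_caps(rack_caps["rack1"], {"s1": 300.0, "s2": 300.0})

print(rack_caps)
print(server_caps)
```

Applying the same function at each level gives a multilevel scheme in which only the top cap is a true limit and everything below it is adjustable.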
25. Demand Side Adaptation
- Adaptation to fluctuating demand
  - Transactional workload: migrate queries or app VMs?
- Issues with combined supply- and demand-side adaptations
  - Imbalance: one node squeezed while another has surplus power
  - Ping-pong control: oscillatory migration of workload
  - Error accumulation down the hierarchy
26. A Proposed Algorithm
- Unidirectional control
  - Load migration moves up the hierarchy, from local to global
  - Local migrations are temporary and do not trigger changes to soft caps on supply
- Target node selection
  - Based on bin packing (best-fit decreasing)
  - Allows for more imbalance, which can be exploited for workload consolidation
- Properties
  - Avoids ping-pong, attempts to minimize imbalance
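Best-fit decreasing, the bin-packing heuristic named for target node selection, can be sketched as follows. Treating migrating loads as items sized by their power demand and target nodes as bins sized by their power surplus is an assumption about how the heuristic is applied here; the function and node names are illustrative.

```python
def best_fit_decreasing(demands_w, surplus_w):
    """Place each load on the node whose surplus fits it most tightly.

    Loads are considered in decreasing order of demand. Returns
    (placement, unplaced), where placement maps load -> node.
    """
    remaining = dict(surplus_w)
    placement = {}
    unplaced = []
    ordered = sorted(demands_w.items(), key=lambda kv: kv[1], reverse=True)
    for load, need in ordered:
        # Leftover surplus on every node that can take this load.
        fits = [(cap - need, node) for node, cap in remaining.items() if cap >= need]
        if not fits:
            unplaced.append(load)
            continue
        _, node = min(fits)  # tightest fit wins
        placement[load] = node
        remaining[node] -= need
    return placement, unplaced


placement, unplaced = best_fit_decreasing(
    {"a": 30.0, "b": 20.0, "c": 10.0},
    {"n1": 35.0, "n2": 25.0, "n3": 50.0},
)
print(placement, unplaced)
```

Because each load goes to the tightest fit, large surpluses tend to stay concentrated on a few nodes, which is the kind of imbalance the slide says can be exploited for consolidation. Loads that end up in `unplaced` would presumably be handed to the next level up the hierarchy.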
27. Experimental Results
- Scenario
  - 3 levels, 18 identical servers (44 55)
  - 3 applications, total of 25 app instances
  - Any app can run on any server
  - Demand: Poisson (active power 8 utilization)
28. Migration Frequency
- Migration drivers: consolidation vs. energy deficiency
  - Low utilization -> consolidation; high utilization -> energy deficiency
- Other characteristics
  - Migration frequency low in all cases
  - No ping-pong observed
29. Thermal Impacts
- Additional issues
  - Energy consumption limited by thermal/cooling issues, not energy availability
  - Migrations required to limit temperature
  - Temperature and power have a nonlinear relationship
- Need to account for both power and thermal effects
30. Results w/ Thermal Effects
- Imbalanced cooling
  - Servers 1-14: Ta = 25 °C; servers 15-18: Ta = 40 °C
  - Temperature limit: 65 °C
- Power demand is adjusted by the algorithm to account for higher temperature
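To see why the warmer servers end up with less usable power, consider a deliberately simplified steady-state model, T = Ta + R_th * P. This linear model and the thermal resistance value are assumptions for illustration only; the talk notes that the real temperature/power relationship is nonlinear, and the algorithm's actual adjustment is not shown on the slide.

```python
def thermal_power_cap(t_ambient_c, t_limit_c=65.0, r_th_c_per_w=0.1):
    """Largest steady-state power (W) that keeps T = Ta + R_th * P at or below the limit.

    r_th_c_per_w is a hypothetical thermal resistance in degrees C per watt.
    """
    headroom_c = max(t_limit_c - t_ambient_c, 0.0)
    return headroom_c / r_th_c_per_w


cool_cap = thermal_power_cap(25.0)  # servers 1-14
hot_cap = thermal_power_cap(40.0)   # servers 15-18
print(f"cool: {cool_cap:.0f} W, hot: {hot_cap:.0f} W, ratio: {hot_cap / cool_cap:.3f}")
```

Under this model the ratio of the two caps depends only on the temperature headroom (25 °C vs. 40 °C below the limit), not on the assumed thermal resistance.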
31. Conclusions
- Need to go beyond energy efficiency
  - Design devices/systems to minimize life-cycle energy footprint
  - Creatively adapt to available energy to operate at the edge
- Ongoing/future work
  - Coordinated server, network, and storage management
  - Explore tradeoffs between QoS, power savings, and admission control performance
32. Thank you!
33. Power Inefficiencies
[Figure: power delivery inside the rack. Labels: rack supply, 280V, server PSU, 12, 5V, voltage regulators, CPU, DRAM, Mem controller, fans, adapters, storage]
- Conversion efficiencies shown: 90-95%, 70-90%, 95%
- Wasted leakage and clock power
- Idle wasted power
34. Operating Regimes
35. So, What's the Problem?
- Local constraints and controls -> end-to-end impacts
- DC-to-DC load shift
  - Service disruption and post-shift impact
- Client request to alter content
  - Less or more work for the server
- Potentially conflicting controls
[Figure labels: Client, Client, Network, Core Network]