Title: Event-driven reactive embedded systems
1. Event-driven reactive embedded systems
- Reactive systems perform tasks in response to input events
  - Purely reactive: invoked only to respond to events
- The system can generate events either actively or in response to the environment
- Events are handled by event handlers, which can contain processing tasks
- Real-time constraints apply to all observable events
- Is communication asynchronous?
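The handler model in the bullets above can be sketched as a small dispatch table. This is a minimal illustration, not code from the PicoRadio project: the event names, `register_handler`, `dispatch_event`, and `count_event` are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical event identifiers; the slides do not name specific events. */
enum event_id { EV_RADIO_RX, EV_SENSOR_READY, EV_TIMER, EV_COUNT };

typedef void (*event_handler)(void *ctx);

/* One handler slot per event; NULL means no handler is registered. */
static event_handler handlers[EV_COUNT];
static void *handler_ctx[EV_COUNT];

void register_handler(enum event_id id, event_handler fn, void *ctx)
{
    handlers[id] = fn;
    handler_ctx[id] = ctx;
}

/* Runs the handler for one event. Returns 1 if a handler ran, 0 if the
 * event was dropped because nothing is registered for it. */
int dispatch_event(enum event_id id)
{
    if ((unsigned)id >= EV_COUNT || handlers[id] == NULL)
        return 0;
    handlers[id](handler_ctx[id]);
    return 1;
}

/* Example handler: counts how many times it was invoked. */
void count_event(void *ctx)
{
    int *counter = ctx;
    (*counter)++;
}
```

Nothing runs unless an event arrives, which is the "purely reactive" property: the code between events is idle by construction.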
2. Model of Computation for PicoRadio Protocol Stacks
- The MOC defines the behavior and the interaction of the blocks: sequential behavior, concurrency, communication
- The chosen MOC is Concurrent Extended Finite State Machines (CEFSM)
[Figure: the protocol stack modeled as concurrent EFSMs]
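An extended FSM adds variables and guarded transitions to a plain state machine. The sketch below shows one such machine under assumed names: the states, events, and retry counter are hypothetical and are not taken from the PicoRadio stack. In a CEFSM, several machines like this run concurrently and exchange events; only a single machine is shown here.

```c
/* Hypothetical states and events for a single EFSM. */
enum tx_state { TX_IDLE, TX_WAIT_ACK, TX_FAILED };
enum tx_event { TX_EV_SEND, TX_EV_ACK, TX_EV_TIMEOUT };

struct tx_efsm {
    enum tx_state state;
    int retries;      /* extended state variable */
    int max_retries;  /* parameter used by the guard */
};

void tx_init(struct tx_efsm *m, int max_retries)
{
    m->state = TX_IDLE;
    m->retries = 0;
    m->max_retries = max_retries;
}

/* Applies one event. The next state depends on the current state, the
 * event, and a guard over the extended variable `retries`. */
void tx_step(struct tx_efsm *m, enum tx_event ev)
{
    switch (m->state) {
    case TX_IDLE:
        if (ev == TX_EV_SEND) {
            m->retries = 0;
            m->state = TX_WAIT_ACK;
        }
        break;
    case TX_WAIT_ACK:
        if (ev == TX_EV_ACK) {
            m->state = TX_IDLE;
        } else if (ev == TX_EV_TIMEOUT) {
            if (m->retries < m->max_retries)
                m->retries++;          /* guard holds: retry, same state */
            else
                m->state = TX_FAILED;  /* guard fails: give up */
        }
        break;
    case TX_FAILED:
        break;                         /* absorbing state in this sketch */
    }
}
```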
3. Challenges in Low-Power OS Implementation
- Tight set of constraints for the OS
  - Energy efficiency is the most critical design metric
  - Real-time performance
  - Unprecedentedly high level of integration
  - System heterogeneity and complexity
- Traditional general-purpose OSs are no longer sufficient
  - Developed for broad application
  - OS developed independently of the application
  - Blindly treat tasks as random processes
4. What Makes an Efficient Low-Power Embedded OS
- Target the reactive, event-driven nature of embedded systems
- Support for concurrency in the application architecture
- Aggressive power management
- Minimal overheads
5. Three OS Implementations
- General-purpose multi-tasking OS (eCos)
- TinyOS
  - Targets event-driven communication systems
- Hierarchical Power Manager Scheduler
6. General-purpose Multi-tasking OS
- Originally developed for the PC platform
- Good for supporting several mostly independent applications running in virtual concurrency
- Not designed for heavily coupled processes across a layered protocol stack
- Inter-task communication involves context switching
  - Expensive overhead is tolerable for PC applications
    - Coarse-grain computation block granularity
    - Low communication/switching frequency
  - Far less tolerable for event-driven systems
    - Fine-grain computation block granularity
    - High communication/switching frequency
- No built-in energy management mechanisms
7. Implementing PicoRadio II with a general-purpose OS

    void cyg_user_start(void)
    {
        /* one thread per component; remaining arguments elided on the slide */
        cyg_thread_create(0, task_ui_2_, 0, ...);
        cyg_thread_create(0, task_transport_1_transport_bs, ...);
        cyg_thread_create(0, task_transport_1_transport_remote_, ...);
        ...

        /* threads are created suspended and must be resumed */
        cyg_thread_resume(task_ui_2__handle);
        cyg_thread_resume(task_transport_1_transport_bs__handle);
        cyg_thread_resume(task_transport_1_transport_remote__handle);
        ...
    }

- eCos was chosen for its popularity and availability
- Each component is turned into a thread
- Thread management and communication rely on the OS
- The design tool used is Virtual Component Codesign (VCC) from Cadence Design Systems
8. Inefficient Implementation of PicoRadio II with General-purpose OS
- Processor and memories occupy >70% of the total area despite very low processor utilization (7%)
- 50% of the instruction code is communication overhead
- Massive data memory size of 54K due to
  - Communication overhead
  - Expensive scheduler overhead
  - Memory management and stack allocations
[Figure: PicoRadio II floorplan]
9. TinyOS Basics
- Targets event-driven communication systems
- MOC: CEFSM
- Communication overhead and other OS-related costs are significantly reduced
- Unnecessary performance-degrading polling is eliminated and context switching is minimized
- Application: graph of components
- OS scheduler
- Component
  - Frame (storage)
  - Tasks (concurrency)
  - Commands and event handlers
- Rudimentary power control
  - When no task is present, the CPU goes to sleep
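The "sleep when no task is present" rule can be sketched with a run-to-completion FIFO of tasks. This is an illustration only and is not the TinyOS API: `post_task`, `run_next_task`, `run_until_idle`, `blink_task`, and the queue depth are hypothetical names and values.

```c
#include <stddef.h>

#define MAX_TASKS 8   /* hypothetical queue depth */

typedef void (*task_fn)(void);

static task_fn queue[MAX_TASKS];
static size_t head, count;

/* Posts a task. Returns 0 if the queue is full and the task is lost. */
int post_task(task_fn fn)
{
    if (count == MAX_TASKS)
        return 0;
    queue[(head + count) % MAX_TASKS] = fn;
    count++;
    return 1;
}

/* Runs the oldest pending task to completion. Returns 0 when the queue
 * is empty, which is the point where the CPU could be put to sleep. */
int run_next_task(void)
{
    task_fn fn;

    if (count == 0)
        return 0;
    fn = queue[head];
    head = (head + 1) % MAX_TASKS;
    count--;
    fn();
    return 1;
}

/* Drains the queue and returns how many tasks ran before going idle. */
int run_until_idle(void)
{
    int n = 0;

    while (run_next_task())
        n++;
    return n;
}

/* Example task used to exercise the queue. */
static int blink_count;
void blink_task(void) { blink_count++; }
```

Because tasks run to completion on a single stack, no context switch is needed between them, which is where the overhead savings over a thread-per-component design come from.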
10. Implementing PicoRadio II in TinyOS
- External events from the RF or sensors propagate from the lowest layers up until handled by the higher layers
- The system must process incoming events faster than their arrival rate to prevent event loss
11. General Purpose vs. Event-driven OS
12. General Purpose vs. Event-based OS -- Code Size
Instruction memory size is reduced by 3x.
Data memory size is reduced by 20x.
13. General Purpose vs. Event-based OS -- Performance
General-purpose OS total cycle count: 16365
Event-driven OS total cycle count: 2554
[Figure: charts for the two implementations; surviving values 10.1, 86.9, 85.98, 10.17, labels lost]
14. General Purpose vs. Event-based OS -- Power
- For 0.18 µm technology, VDD = 1.8 V
  - ARM7: 0.25 mW/MHz
  - Read per access for 64K SRAM: 0.407 mW/MHz
  - Write per access for 64K SRAM: 0.447 mW/MHz
- Assume
  - 10% of the instructions involve memory reads and 10% involve writes
  - Power consumption of SRAM scales roughly as the square root of the capacity
Gen. OS: 0.608 mW/MHz
TinyOS: 0.053 mW/MHz
12x reduction in power
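One plausible way to combine the per-MHz figures and the stated assumptions is a weighted sum. The slide does not show how its totals were derived, so this sketch is not a reconstruction of the 0.608 and 0.053 mW/MHz results; the function name and the way the terms are combined are assumptions.

```c
/* Per-MHz power of the CPU plus its SRAM traffic, in mW/MHz.
 * sram_scale rescales the 64K SRAM figures to another capacity. Under
 * the square-root rule it equals sqrt(capacity / 64K), for example 0.5
 * for a 16K SRAM. ASSUMPTION: the terms simply add. */
double power_per_mhz(double cpu, double sram_read, double sram_write,
                     double read_frac, double write_frac, double sram_scale)
{
    return cpu + sram_scale * (read_frac * sram_read +
                               write_frac * sram_write);
}
```

With the slide's figures and a 64K SRAM, `power_per_mhz(0.25, 0.407, 0.447, 0.10, 0.10, 1.0)` evaluates 0.25 + 0.0407 + 0.0447 = 0.3354 mW/MHz.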
15. Evaluating TinyOS
- Research goal: design an energy-efficient OS for domain-specific heterogeneous architectures
- Basic TinyOS concepts are very attractive
  - Its event-driven, asynchronous characteristics naturally support the interactions between modules of vastly different behavior and processing speeds in a heterogeneous system
  - Its simplicity reduces overheads and leads to a more power-efficient implementation
  - Provides some support for multiple flows of control
16. Limitations of TinyOS
- Communication buffers and queues are hidden inside the component
  - Hard to optimize for communication
- Flat component graph structure lacks scalability
- Insufficient support for concurrency
  - One global scheduler handles all tasks
  - Difficulty in specifying applications that have complex concurrent behavior
- No analysis or support for global optimality
  - How do we optimize the management of the system for a performance metric such as power?
- Software-centric approach does not allow full exploration of the integrated, heterogeneous system architecture
  - Rudimentary power management scheme
  - Assumes off-the-shelf components and has no access to customized power-efficient blocks
17. Hierarchical Power Management
- Explicit communication
  - Intelligent buffer and queue management
  - Voltage scaling based on queue length and event rates
- Hierarchical architecture and schedulers
  - Enhances scalability
  - Supports concurrency
  - Exploits locality
- Ability to devise an optimal management policy
  - Scheduling for power
- Highly integrated with power-aware hardware architecture
18. Hierarchy is desirable for complex systems
- Hides low-level details and enhances modularity
- Enhances scalability
- Exploits the concurrency and locality inherent in the system
- The system is partitioned into multiple power domains, and each domain can be further partitioned into sub-domains
- Each level of the hierarchy has its own power scheduler
- Enables power control at various granularities
Reference: "Multi-thread Graph: A system model for real-time embedded software synthesis", Thoen et al.
19. Power states for system blocks
- Awake states
  - Can respond to incoming events
  - Could run at variable frequencies/voltages/power
  - Two sub-states
    - Active states
      - Processing incoming events
    - Idle states
      - Finished processing all events
      - No pending events
- Sleep states
  - Cannot respond to incoming events
  - Need to be woken up from outside
  - Low-power states
[Figure: state diagram with the Awake states (Active, Idle) and the Sleep state]
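The state hierarchy on this slide can be written down as a small transition model. The enum and function names below are hypothetical, and the sketch assumes a woken block enters Idle before it starts processing, which the slide does not specify.

```c
enum power_state { PWR_ACTIVE, PWR_IDLE, PWR_SLEEP };

/* Awake covers both sub-states; only awake blocks respond to events. */
int is_awake(enum power_state s)
{
    return s == PWR_ACTIVE || s == PWR_IDLE;
}

/* State after an event arrives. An awake block starts (or keeps)
 * processing; a sleeping block cannot respond and stays asleep. */
enum power_state on_event(enum power_state s)
{
    return is_awake(s) ? PWR_ACTIVE : PWR_SLEEP;
}

/* State after an active block finishes one event, given how many
 * events are still pending for it. */
enum power_state on_done(enum power_state s, int pending)
{
    if (s != PWR_ACTIVE)
        return s;
    return pending > 0 ? PWR_ACTIVE : PWR_IDLE;
}

/* Wake-up from outside. ASSUMPTION: the block lands in Idle. */
enum power_state on_wakeup(enum power_state s)
{
    return s == PWR_SLEEP ? PWR_IDLE : s;
}
```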
20. Hierarchical System Architecture
- At each level of the hierarchy, the scheduler provides a power management interface for all blocks
- All power state transitions MUST go through the scheduler
- The centralized scheduler has the global vision to implement the optimal management policy
- Individual blocks export power control events to the scheduler
- Data transfer between blocks does not involve the scheduler
  - Minimizes power management overhead
[Figure: Power Scheduler connected to Block 1, Block 2, and Block 3]
21. Hybrid power management policy
- Distributed
  - Each block implements its own sleep policy
  - Queue/buffer-based voltage scaling
- Centralized: the power scheduler has the global vision to implement the optimal policy
  - Static information: power consumption for all the power states and the cost of state transitions
  - Dynamic information
    - Exported control events supplied by individual blocks
    - Knowledge of current power states
  - Exclusive right to change all power states
22. Distributed Power Control: When to Go to Sleep
- How long does the block have to wait in its IDLE state before it goes to sleep?
- Optimized policy formulated by Simunic
  - Formulated based on a time-indexed semi-Markov decision process
  - Minimizes energy under a set of constraints
  - The obtained policy works as a randomized time-out: it decides when to go to sleep after a certain amount of idle time
23. Distributed Power Control: Queue Management
- Scale the voltage based on queue length and event rates (Simunic)
  - Based on M/M/1 queuing theory
  - Goal: keep the processing and queuing delay constant
  - Detect changes in the inter-arrival rate of events or the service rate
  - Optimally calculate the new VDD and frequency
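For an M/M/1 queue with arrival rate λ and service rate μ, the mean time in the system (waiting plus service) is 1/(μ − λ), so holding the delay at a target W requires μ = λ + 1/W. The sketch below applies that relation. It is not the policy from the cited work: the function names are hypothetical, the service rate is assumed proportional to clock frequency, and the choice of VDD for a given frequency is hardware-specific and left out.

```c
/* Mean time an event spends in an M/M/1 system (waiting plus service).
 * Only defined for a stable queue (service_rate > arrival_rate);
 * returns -1.0 otherwise. */
double mm1_mean_delay(double arrival_rate, double service_rate)
{
    if (service_rate <= arrival_rate)
        return -1.0;
    return 1.0 / (service_rate - arrival_rate);
}

/* Service rate that keeps the mean delay at target_delay when events
 * arrive at arrival_rate. */
double required_service_rate(double arrival_rate, double target_delay)
{
    return arrival_rate + 1.0 / target_delay;
}

/* ASSUMPTION: service rate is proportional to clock frequency, so the
 * new frequency is the reference frequency scaled by the rate ratio. */
double scaled_frequency(double f_ref, double service_rate_ref,
                        double service_rate_needed)
{
    return f_ref * service_rate_needed / service_rate_ref;
}
```

For example, if events arrive at 80 per second and the target delay is 50 ms, the required service rate is 80 + 20 = 100 events per second. A block that serves 200 events per second at its reference frequency could then run at half that frequency.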
24. Hardware power control at the basic block level
- Clock gating: gate the clock signal to the block
  - No dynamic power consumption
  - Only static power from leakage current
  - Consumes a considerable amount of power for low duty cycle operation
- Power-down
  - Turn off the power rails
  - No power consumption
- Power-down is much preferred!
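The duty-cycle argument can be made concrete with a simple average-power model. This is a sketch under stated assumptions: the function names are hypothetical, power-down is taken to draw nothing while idle (as the slide states), and the time and energy cost of waking up is ignored.

```c
/* Average power of a block that is clock-gated while idle: leakage
 * power is still drawn for the idle fraction of the time. */
double avg_power_clock_gated(double duty, double p_active, double p_leak)
{
    return duty * p_active + (1.0 - duty) * p_leak;
}

/* Average power of a block that is powered down while idle.
 * ASSUMPTION: zero idle power and no wake-up cost. */
double avg_power_powered_down(double duty, double p_active)
{
    return duty * p_active;
}
```

With made-up numbers (1% duty cycle, 10 mW active, 0.5 mW leakage), the clock-gated average is 0.1 + 0.495 = 0.595 mW against 0.1 mW with power-down, so leakage dominates the budget when the block is rarely active.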
25. Enable block power-down by exporting internal events
- Infrequent, periodic maintenance events, e.g. timer events
  - May require some system resources to be on all the time
  - Prevent the entire block from going to sleep
- Export control signals to be incorporated into the power scheduler
  - Affect the scheduling of other blocks
  - Global statistical information gathering
- Mention memory?
26. Example: The Exportation of Network Layer Control Events for Power Scheduling
- Only certain control events are exported. Data flow does NOT involve the scheduler.
- TurnMACOn! and ResumeTimer! are control signals to be sent to the scheduler
- SendPacketToQueue! populates the transmit queue to the MAC without the scheduler's intervention
[Figure: network-layer state machine. Labels: Init, Parse, Data, Check Table, Intr Proc, Table Update. Input events: MACPacket?, Data?, TimerExpired?, Interest?, Forward?, Update?. Output events: SendPacketToQueue!, TurnMACOn!, ResumeTimer!]
27. Centralized Power Scheduler
- Provides a power management interface for all the blocks
- Manages all the power states
- Has the exclusive right to initiate all power state transitions
  - All power state transitions MUST go through the scheduler
  - Wake-ups are performed only by the scheduler
  - SLEEP is implemented as follows: a block sends a SLEEP request to the scheduler; the scheduler accepts it and puts the block to SLEEP
- Clean, simple approach that avoids the mutual exclusion problem
28. Centralized Power Scheduler
- Intelligent power scheduling policy to minimize power consumption while meeting performance constraints
  - Look-ahead strategy to turn blocks on before they are accessed
  - Keep blocks from going to sleep if they will be accessed later
  - Even if the block will be accessed later, the PS could still put it to sleep if this saves power
- Given the power consumption in the different power states and the cost of power transitions, optimal scheduling can be devised
29. Centralized Power Scheduler: Example
- Initially A and B are sleeping
- PS wakes up A upon detection of incoming events
- A is in the active processing state
- A realizes it needs to send data to B
- A signals the PS
- PS wakes up B
- A sends events/data to B
- A finishes processing and requests to sleep
- PS detects more incoming events for A and demands that A stay awake (sleep request refused)
[Figure: Power Scheduler with Block A and Block B]
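The request/grant protocol in this example can be sketched as follows. All names are hypothetical, and the policy is deliberately simplified: a sleep request is refused whenever events are pending, whereas the scheduler described on the previous slides would also weigh power and transition costs.

```c
enum blk_state { BLK_SLEEP, BLK_AWAKE };

struct block {
    enum blk_state state;
    int pending_events;  /* events the scheduler knows are waiting */
};

/* Scheduler-side wake-up: the only way a block leaves SLEEP. */
void ps_wake(struct block *b)
{
    b->state = BLK_AWAKE;
}

/* An event arrives for a block. The scheduler records it and wakes
 * the block if it is asleep. */
void ps_event_arrived(struct block *b)
{
    b->pending_events++;
    if (b->state == BLK_SLEEP)
        ps_wake(b);
}

/* The block consumes one pending event while awake. */
void block_process_one(struct block *b)
{
    if (b->state == BLK_AWAKE && b->pending_events > 0)
        b->pending_events--;
}

/* The block asks to sleep. Returns 1 if the scheduler grants the
 * request and 0 if it refuses because events are still pending. */
int ps_request_sleep(struct block *b)
{
    if (b->pending_events > 0)
        return 0;
    b->state = BLK_SLEEP;
    return 1;
}
```

Because every transition goes through the `ps_*` functions, a block can never put itself to sleep while the scheduler is about to hand it more work, which is the race the slide's "mutual exclusion" remark refers to.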
30. TinyOS vs. PicoRadio
TinyOS
- Communication buffers and queues are hidden inside the component
  - Hard to optimize for communication
- Flat component graph structure lacks scalability
- Insufficient support for concurrency
- Minimal OS support
- Global scheduler handles all tasks
- Optimality is not explicitly defined
PicoRadio
- Explicit, intelligent buffer and queue management
  - Voltage scaling based on queue length and event rates
  - Prevents event losses
- Hierarchical architecture and schedulers
  - Enhances scalability
  - Supports concurrency
  - Exploits locality
- Ability to devise an optimal scheduling policy