Title: Lic Presentation
1 Lic Presentation
Memory Aware Task Assignment and Scheduling for
Multiprocessor Embedded Systems
Radoslaw Szymanek / Embedded System Design
Radoslaw.Szymanek_at_cs.lth.se
http://www.cs.lth.se/home/Radoslaw_Szymanek
2 Outline
- Introduction
- Problem Formulation and Motivational Example
- CLP Introduction
- CLP Modeling
- Optimization Heuristic and Experimental Results
- Conclusions
3 System Level Synthesis (SLS)
- Multiprocessor embedded systems are designed using CPUs, ASICs, buses, and interconnection links
- The application areas range from signal and image processing to multimedia and telecommunication
- Task graph representation for application
- The main design activities are task assignment and scheduling for a given architecture
- Memory constraints (code and data memory)
4 SLS with memory constraints
[Figure: target architecture (components P1, P2, P3, and A1 with ROM and RAM blocks; interconnect B1, L1, and L2) and annotated task graph]
5 Problem Assumptions and Formulation
- Data-dominated application represented as a directed bipartite acyclic task graph
- Each task is annotated with execution time and code and data memory requirements
- Heterogeneous architecture
- Both tasks and communications are atomic and must be performed in one step
- Find a good CLP model
- Find a good heuristic for memory-constrained, time-minimizing task assignment and scheduling that satisfies all constraints
6 Motivation
- SoC multiprocessor architectures
- Co-design methodology needs tool support
- Memory consideration to decrease cost and power consumption
- System Level design for fast evaluation
7 Motivating example (memory)
[Figure: task graph and architecture (processors P1 and P2, link L1), with Schedule and Data Memory charts over time t]
Task: 1 kB code memory, 4 kB data memory
Communication: 2 kB data memory
8 CLP Introduction
- "Constraint programming represents one of the closest approaches computer science has yet made to the Holy Grail of programming: the user states the problem, the computer solves it."
- Eugene C. Freuder
- CONSTRAINTS, April 1997
9 CLP Introduction
- Relatively young and attractive approach for modeling many types of optimization problems
- Many heterogeneous applications of constraint programming exist today
- State the decision variables that constitute a solution
- State the constraints that a solution must satisfy
- Search for solutions using knowledge you can derive from the constraints
10 Constraint properties
- may specify partial information: a constraint need not uniquely specify the values of its variables,
- non-directional: typically one can infer a constraint on each present variable,
- declarative: specify a relationship, not a procedure to enforce this relationship,
- additive: the order of imposing constraints does not matter,
- rarely independent: typically they share variables.
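The "non-directional" and "partial information" properties can be illustrated with a minimal sketch of bounds propagation for a single precedence constraint a + d <= b. The function name and the (lo, hi) tuple representation are assumptions made for this illustration; they are not taken from the CHIP solver.

```python
def propagate_precedence(a, d, b):
    """Bounds propagation for the constraint a + d <= b.

    a and b are (lo, hi) integer bounds, d is a fixed duration.
    Returns the tightened bounds, or None if the constraint cannot hold.
    Illustrative only: a real finite-domain solver does much more.
    """
    a_lo, a_hi = a
    b_lo, b_hi = b
    # Information flows towards b: b cannot start before a_lo + d.
    b_lo = max(b_lo, a_lo + d)
    # Information also flows towards a: a cannot start after b_hi - d.
    a_hi = min(a_hi, b_hi - d)
    if a_lo > a_hi or b_lo > b_hi:
        return None  # empty domain, constraint violated
    return (a_lo, a_hi), (b_lo, b_hi)
```

With both start times initially in 0..10 and d = 4, the call narrows a to 0..6 and b to 4..10 without fixing either value, which is the partial information the slide refers to.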
11 A simple constraint problem
1. Specify all decision variables and their initial domains
Natural language description: There are three
tasks, namely T1, T2, and T3. Each of these
tasks can execute on either of the two available
processors, P1 and P2. Tasks T1 and T2 send data
to task T3. The tasks should be assigned and
scheduled in such a way that the schedule length
does not exceed 10 seconds.
CLP description:
TP1, TP2, TP3 :: 1..2,
TS1, TS2, TS3 :: 0..10,
Cost :: 0..10,
12 A simple constraint problem
2. Specify all constraints and additional variables
The execution time of task T1 is four seconds on
processor P1 and two seconds on processor P2.
Task T2 requires three and five seconds to
complete execution on processor P1 and P2
respectively. Task T3 always needs three seconds
for execution.
If TP1 = 1 then TD1 = 4, If TP1 = 2 then TD1 = 2,
If TP2 = 1 then TD2 = 3, If TP2 = 2 then TD2 = 5,
TD3 = 3,
13 A simple constraint problem
Tasks T1 and T2 must execute on different
processors. Tasks T1 and T2 send data to task
T3. If two communicating tasks are executed on
different processors there must be at least one
second delay between them so the data can be
transferred. The tasks should be assigned and
scheduled in such a way that the schedule length
does not exceed 10 seconds.
TP1 != TP2,
If TP1 != TP3 then D1 = 1 else D1 = 0, TS1 + TD1 + D1 <= TS3, ...,
Cost >= TS1 + TD1, Cost >= TS2 + TD2, Cost >= TS3 + TD3.
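The three-task problem is small enough to check by exhaustive enumeration. The sketch below is plain Python, not CLP, and performs no constraint propagation; it only confirms what a solver should find. The delay D2 between T2 and T3 is an assumption: the slide elides that constraint, so it is modeled here by analogy with D1.

```python
from itertools import product

def solve(limit=10):
    """Enumerate all assignments and start times; return the cheapest one.

    Returns (cost, (TP1, TP2, TP3), (TS1, TS2, TS3)) or None.
    """
    best = None
    for tp1, tp2, tp3 in product((1, 2), repeat=3):
        if tp1 == tp2:                  # TP1 != TP2
            continue
        td1 = 4 if tp1 == 1 else 2      # duration of T1 depends on processor
        td2 = 3 if tp2 == 1 else 5      # duration of T2 depends on processor
        td3 = 3
        d1 = 1 if tp1 != tp3 else 0     # transfer delay T1 -> T3
        d2 = 1 if tp2 != tp3 else 0     # assumed analogous delay T2 -> T3
        for ts1, ts2, ts3 in product(range(limit + 1), repeat=3):
            if ts1 + td1 + d1 > ts3 or ts2 + td2 + d2 > ts3:
                continue                # precedence violated
            cost = max(ts1 + td1, ts2 + td2, ts3 + td3)
            if cost > limit:
                continue                # schedule too long
            cand = (cost, (tp1, tp2, tp3), (ts1, ts2, ts3))
            if best is None or cand < best:
                best = cand
    return best
```

Under these assumptions the shortest schedule has length 6: T1 runs on P2, T2 and T3 run on P1, and T3 starts at time 3.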
14 Search Tree
15 Modeling
- Constraint Logic Programming (finite domain, CHIP solver)
- Global constraints (cumulative, diffn, sequence, etc.) reduce the model complexity of the synthesis problem and exploit specific features of the problem
- Global constraints are useful for modeling placement problems and graph problems
- Problem-specific search heuristic for an NP-hard problem
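To make the cumulative global constraint concrete, here is a minimal sketch of a checker for an already fixed schedule: at every point in time, the total amount of resource used by running tasks must not exceed the limit. This only verifies a solution; a solver such as CHIP additionally prunes domains during search. The function and parameter names are assumptions made for this illustration.

```python
def cumulative_ok(starts, durations, amounts, limit):
    """Check the cumulative condition for fixed start times.

    Task i occupies amounts[i] units during [starts[i], starts[i] + durations[i]).
    """
    events = []
    for s, d, a in zip(starts, durations, amounts):
        events.append((s, a))        # resource claimed at start
        events.append((s + d, -a))   # resource released at end
    load = 0
    # Sorting puts releases before claims at the same time point,
    # so back-to-back tasks do not count as overlapping.
    for _, delta in sorted(events):
        load += delta
        if load > limit:
            return False
    return True
```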
16 CLP Model
- Decision variables for task
- TS: start time of the task execution
- TP: resource on which the task is executed
- TDP: exact placement of task local data in memory
- Additional variables for task
- TD: task duration
- TCM and TDM: the amount of code and data memory for task execution
17 CLP Model
- Decision variables for data
- DS: start time of the data communication
- DB: resource on which the data is communicated
- DCP and DPP: exact placement of data in memory of the producer and consumer processor
- Additional variables for data
- DD: data communication duration
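The variables listed on the two CLP Model slides can be grouped into records, as in the sketch below. The field names mirror the slide abbreviations; the class names and the precedence check are assumptions added for illustration (the check encodes that a transfer starts after its producer finishes and ends before its consumer starts).

```python
from dataclasses import dataclass

@dataclass
class TaskVars:
    ts: int    # TS: start time of the task execution
    tp: int    # TP: resource on which the task is executed
    tdp: int   # TDP: placement of task local data in memory
    td: int    # TD: task duration
    tcm: int   # TCM: code memory needed by the task
    tdm: int   # TDM: data memory needed by the task

@dataclass
class DataVars:
    ds: int    # DS: start time of the data communication
    db: int    # DB: resource on which the data is communicated
    dcp: int   # DCP: placement of the data in memory
    dpp: int   # DPP: placement of the data in memory
    dd: int    # DD: data communication duration

def precedence_ok(producer: TaskVars, data: DataVars, consumer: TaskVars) -> bool:
    """Assumed precedence: producer ends <= transfer start, transfer end <= consumer start."""
    return (producer.ts + producer.td <= data.ds
            and data.ds + data.dd <= consumer.ts)
```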
18 CLP Model: Task Requirements
19 CLP Model: Data Requirements
[Figure: three rectangles plotted against time, labelled data mem (cons), communication time, and data mem (prod); labels include DS, DS+DD, TSp, TSc+TDc, DA, DD, DCP, DB, DPP, DM, and CU]
20 Simple Example
[Figure: tasks T1, T2, T3 and data D1, D2 drawn as rectangles on P1, P2, B1, and C1; labelled "Diffn constraint"]
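The diffn constraint requires that a set of rectangles do not overlap pairwise. A minimal checker for fixed rectangles is sketched below; the tuple layout (origin x, origin y, width, height) and the function names are assumptions for this illustration. In a scheduling model, x would typically be time and y a resource index or memory address.

```python
from itertools import combinations

def overlaps(r1, r2):
    """True if two rectangles (x, y, width, height) share interior area."""
    x1, y1, w1, h1 = r1
    x2, y2, w2, h2 = r2
    return (x1 < x2 + w2 and x2 < x1 + w1 and
            y1 < y2 + h2 and y2 < y1 + h1)

def diffn_ok(rects):
    """True if no pair of rectangles overlaps (touching edges are allowed)."""
    return not any(overlaps(a, b) for a, b in combinations(rects, 2))
```

For example, a task occupying processor row 0 during time 0..4 and another on the same row during 4..7 satisfy the check, because their rectangles only touch.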
21 Code Memory Constraint
[Figure: tasks T1 to T8 grouped by processor and compared against the code memory limit]
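One plausible reading of the code memory constraint is that the code of every task assigned to a processor stays resident, so the TCM values on each processor must sum to at most that processor's limit. The sketch below checks this for a fixed assignment; the dictionary-based interface is an assumption for illustration.

```python
from collections import defaultdict

def code_memory_ok(tp, tcm, limits):
    """Check per-processor code memory limits.

    tp:     {task: processor} assignment
    tcm:    {task: code memory required}
    limits: {processor: code memory available}
    """
    used = defaultdict(int)
    for task, proc in tp.items():
        used[proc] += tcm[task]   # code is assumed resident for the whole schedule
    return all(used[p] <= limits[p] for p in used)
```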
22 Constraint types
- precedence constraints
- processing resources constraints
- communication resource constraints
- pipelining constraints
- code memory constraints
- data memory constraints
23 Task Assignment and Scheduling Heuristic
[Flowchart]
- Choose a task from the ready task set with min(max(Ti)) to minimize schedule length
- Assign the task to the processor with the minimal implementation cost ci
- Schedule communications so that Ti is minimal
- Assign data memory
- Data memory estimate no. 1 holds? (Y/N branch)
- Data memory estimate no. 2 holds? (Y/N branch)
- Undo all decisions; choose a task which consumes the most data
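The general shape of such a heuristic can be sketched as greedy list scheduling. This is a simplified stand-in, not the heuristic from the slides: it picks the ready task and processor pair with the earliest finish time, and it omits the implementation cost, the data memory estimates, and the undo step. All names and the fixed communication delay are assumptions.

```python
def list_schedule(durations, preds, comm_delay=1):
    """Greedy list scheduling.

    durations: {task: {processor: execution time}}
    preds:     {task: [predecessor tasks]}
    Returns ({task: (processor, start, finish)}, schedule length).
    """
    procs = sorted({p for d in durations.values() for p in d})
    free_at = {p: 0 for p in procs}   # time at which each processor becomes idle
    placed = {}
    while len(placed) < len(durations):
        # A task is ready once all of its predecessors are scheduled.
        ready = [t for t in durations if t not in placed
                 and all(q in placed for q in preds.get(t, []))]
        if not ready:
            raise ValueError("cyclic precedence graph")
        best = None
        for t in sorted(ready):
            for p in procs:
                if p not in durations[t]:
                    continue
                est = free_at[p]
                for q in preds.get(t, []):
                    qp, _, qf = placed[q]
                    # Data from another processor arrives after a delay.
                    est = max(est, qf + (comm_delay if qp != p else 0))
                cand = (est + durations[t][p], t, p, est)
                if best is None or cand < best:
                    best = cand
        finish, t, p, start = best
        placed[t] = (p, start, finish)
        free_at[p] = finish
    return placed, max(f for _, _, f in placed.values())
```

Applied to the three-task example from the earlier slides (without the requirement that T1 and T2 use different processors), this sketch also produces a schedule of length 6.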
24 Execution Cost
Ind LowTS/PTS LowCM/PCM
i-th task, n-th processor
ATS: available time slots, ACM: available code memory
25 Data and Communication Cost
i-th task, n-th processor
26 Estimates
- Estimate no. 1
- where S (Sn) is the set of tasks already scheduled on a processor (processor Pn), tasks tj are direct successors of task ti, and dij is the amount of data communicated between ti and tj.
- Estimate no. 2 uses the global constraint diffn and takes time into account
27 MATAS System
28 Synthesis Results - H.261 example
[Figure: Video Coding Algorithm H.261, including a DCT block]
29 Experimental results: H.261 example
30 Experimental results (random task graphs)
31 Main Contributions
- Definition of the extended task assignment and scheduling problem
- Inclusion of memory constraints to decrease the cost for data-dominated applications
- Specialized search heuristic to solve resource-constrained task assignment and scheduling
- CLP modeling framework to facilitate an efficient, clean, and readable problem definition
32 Conclusions and Future Work
- The synthesis problem is modeled as a constraint satisfaction problem and solved by the proposed heuristic,
- Good coupling between model and search method for efficient search space pruning,
- Memory constraints and pipelined designs taken into account,
- Heterogeneous constraints can be modeled in CLP, an important advantage over other approaches
- Need for our own constraint engine implementation, approximate solutions, mixture of techniques
- Need for better lower bounds, problem-specific global constraints, designer interaction during search
34 Related Work
- J. Madsen, P. Bjorn-Jorgensen, Embedded System Synthesis under Memory Constraints, CODES 99 (GA, only RAM)
- S. Prakash and A. Parker, Synthesis of Application-Specific Heterogeneous Multiprocessor Systems, VLSI Signal Processing, 94 (MILP, no ASICs, optimal)
35 A simple constraint problem
There are three tasks, namely T1, T2, and T3.
Each of these tasks can execute on either of the two
available processors, P1 and P2. Tasks T1 and T2
send data to task T3. Tasks T1 and T2 must
execute on different processors for fault-tolerance
reasons. The execution time of task T1 is
four seconds on processor P1 and two seconds on
processor P2. Task T2 requires three and five
seconds to complete execution on processor P1 and
P2 respectively. Task T3 always needs three
seconds for execution. When two
communicating tasks are executed on different
processors, there must be a one-second delay between
them so the data can be transferred. The tasks
should be assigned and scheduled in such a way
that the schedule length does not exceed 10
seconds.