Title: Predictable Programming on a Precision Timed Architecture
1. Predictable Programming on a Precision Timed Architecture
- Ben Lickly - UC Berkeley
- Isaac Liu - UC Berkeley
- Sungjun Kim - Columbia University
- Hiren D. Patel - UC Berkeley
- Stephen A. Edwards - Columbia University
- Edward A. Lee - UC Berkeley
2. Edwards and Lee - Case for PRET
- In 2007, Edwards and Lee made the case for precision timed computers (PRET machines)
  - Predictability
  - Repeatability
- S. A. Edwards and E. A. Lee, The case for the
precision timed (PRET) machine. In Proceedings of
the 44th Annual Conference on Design Automation
(San Diego, California, June 04 - 08, 2007). DAC
'07. ACM, New York, NY, 264-265.
3. Edwards and Lee - Case for PRET
- Unpredictability
  - Difficulty in determining timing behavior through analysis
- Non-repeatability
  - Different executions may yield different timing behavior
- Brittleness
  - Small changes have big effects on timing behavior
4. Brittleness
- An expensive affair
- Tight coupling of software and hardware
- Reliance on testing for validation
- Upgrading is difficult
- Solution: stockpile
(Image source: www.skycontrol.net)
5. But wait
- Real-time scheduling
- Worst-case execution time
- Detailed model of hardware
- Large engineering effort
- Valid for particular hardware models
- Interrupts, inter-process communication, locks
- Bench testing
- Brittle
Sebastian Altmeyer, Christian Hümbert, Björn
Lisper, and Reinhard Wilhelm. Parametric Timing
Analysis for Complex Architectures. In
Proceedings of the 14th IEEE International
Conference on Embedded and Real-Time Computing
Systems and Applications (RTCSA'08), pages
367-376, Kaohsiung, Taiwan, August 2008. IEEE
Computer Society.
6. Precise Timing and High Performance
Traditional → Alternative
- Caches → Scratchpads
- Deep pipelines → Thread-interleaved pipelines
- Function-only ISAs → ISAs with timing instructions
- Function-only languages → Languages and programming models with timing
- Best-effort communication → Fixed-latency communication
- Time-sharing → Multiple independent processors
7. Outline
- Introduction
- Related Work
- PRET Machine
- Programming Example
- Future Work
- Conclusion
8. Related Work
- Java Optimized Processor
  - Schoeberl et al. 2003
- Timing instructions
  - Ip and Edwards 2006
- Reactive processors
  - Von Hanxleden et al. 2005
  - Salcic et al. 2005
- Virtual Simple Architecture
  - Mueller et al. 2003
9. Semantics of Timing Instructions
- Deadline instructions (Ip and Edwards 2007)
  - Denote the required execution time of a block
  - When decoded:
    - Stall the instruction until the timer value reaches 0
    - Then set the timer to the new value
    deadi t0, 10    ! Straight Line Block 0 (10-cycle budget)
    ...
    deadi t0, 8     ! Straight Line Block 1 (8-cycle budget)
    ...
    deadi t0, 0     ! end of Straight Line Block 1
    ...
L0:
    deadi t0, 10    ! Loop Block (10 cycles per iteration)
    ...
    b L0
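As a rough illustration of how these instructions might be used from C, here is a minimal sketch assuming a hypothetical DEAD() macro that wraps the deadi instruction via GCC inline assembly; the macro name, the t0 operand spelling, and the cycle budgets are illustrative assumptions, not something the slide specifies.

    /* Hypothetical wrapper for the deadline instruction: stall until
     * timer t0 reaches 0, then reload it with the given cycle count. */
    #define DEAD(cycles) __asm__ volatile ("deadi t0, " #cycles)

    volatile int out;

    void blocks(void)
    {
        DEAD(10);          /* Straight Line Block 0 gets a 10-cycle budget      */
        out = 1;           /* ...block 0 body...                                */

        DEAD(8);           /* stalls until the 10 cycles elapse, then opens an
                              8-cycle budget for Straight Line Block 1          */
        out = 2;           /* ...block 1 body...                                */

        DEAD(0);           /* end of block 1; clear the timer before the loop   */
        for (;;) {
            DEAD(10);      /* each Loop Block iteration takes exactly 10 cycles */
            out += 1;      /* ...loop body...                                   */
        }
    }

The key point is that each deadi both enforces the previous block's budget and opens the next one, which is what makes the loop's timing repeatable.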
10. Tracing a Program Fragment
The trace follows the cycle count and the deadline timer t0 as each instruction executes:
  A  deadi t0, 6
  B  sethi %hi(0x3f800000), %g1
  C  or    %g1, 0x200, %g1
  D  st    %g1, [%fp - 12]
  E  deadi t0, 8
  F  ...
Because E is another deadline instruction, it stalls until the 6 cycles claimed by A have elapsed, even if B-D complete earlier, and only then loads t0 with 8.
11. Precision Timed Architecture
- Scratchpad memories
- Round-robin thread scheduling
- Thread-interleaved pipeline
- Time-triggered main memory access
12. Clocks and Memory Hierarchy
- Clocks
  - Main clock
  - Derived clocks
- Instruction and data scratchpad memories
  - 1-cycle access latency
- Main memory
  - 16 MB size
  - Latency of 50 ns
  - Frequency: 250 MHz
  - 13-cycle latency
(Block diagram: Core, DMA, Main Mem.)
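A quick check of the numbers above, assuming the 13-cycle figure is simply the 50 ns main-memory latency expressed in cycles of the 250 MHz clock:

    \lceil 50\,\mathrm{ns} \times 250\,\mathrm{MHz} \rceil = \lceil 12.5 \rceil = 13\ \mathrm{cycles}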
13. Thread-interleaved Pipeline
- Thread stalls
  - Main memory access
  - Deadline instructions
- Replay mechanism
  - Execute the same PC on the next iteration
(Pipeline diagram: Fetch (decrement deadline timers) → F/D → Decode → D/R → Reg. Access (stall if deadline instruction) → R/E → Execute → E/M → Memory (check main memory access) → M/W → WriteBack (increment PC))
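The C sketch below models the round-robin issue and replay behavior described above; the thread count, per-thread state, and output are illustrative assumptions. Each cycle the pipeline fetches from the next hardware thread in turn, and a stalled thread simply re-issues the same PC when its slot comes around again.

    #include <stdio.h>

    #define NUM_THREADS 6   /* assumed: one hardware thread per pipeline stage */

    struct hw_thread {
        unsigned pc;        /* next instruction to issue                  */
        int stalled;        /* waiting on main memory or a deadline timer */
    };

    int main(void)
    {
        struct hw_thread t[NUM_THREADS] = {{0}};
        t[2].stalled = 1;   /* pretend thread 2 is waiting on main memory */

        for (unsigned cycle = 0; cycle < 12; cycle++) {
            unsigned id = cycle % NUM_THREADS;      /* fixed round-robin slot */
            if (t[id].stalled) {
                /* replay mechanism: the same PC is issued next time around */
                printf("cycle %2u: thread %u replays pc=%u\n", cycle, id, t[id].pc);
            } else {
                printf("cycle %2u: thread %u fetches pc=%u\n", cycle, id, t[id].pc);
                t[id].pc++;                         /* PC advances only on issue */
            }
        }
        return 0;
    }

Because consecutive pipeline stages always hold instructions from different threads, one thread's stall never delays the others.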
14. Time-Triggered Access through the Memory Wheel
- Decouples the threads' access patterns
- Time-triggered access
  - Each thread must start and complete its access within its own window
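One way to picture the wheel's rule, as a hedged C sketch (the slot count and window length are illustrative assumptions, not the simulator's actual parameters): the wheel period is divided into one fixed window per thread, and an access may only start if it also finishes inside the requesting thread's window.

    #include <stdbool.h>

    #define NUM_THREADS   6                           /* assumed wheel slots   */
    #define WINDOW_CYCLES 13                          /* assumed window length */
    #define WHEEL_PERIOD  (NUM_THREADS * WINDOW_CYCLES)

    /* Returns true if thread `id` may start an access of `len` cycles now. */
    bool wheel_grants(unsigned id, unsigned cycle, unsigned len)
    {
        unsigned pos   = cycle % WHEEL_PERIOD;        /* position on the wheel */
        unsigned start = id * WINDOW_CYCLES;          /* this thread's window  */
        return pos >= start && pos + len <= start + WINDOW_CYCLES;
    }

Since the window positions depend only on the cycle count, one thread's memory traffic can never delay another's, which is the decoupling the slide refers to.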
15. Tool Flow
- GCC 3.4.4, SystemC 2.2, Python 2.4
16. Simple Mutual Exclusion Example
- Producer followed by Consumer and Observer
- Consumer and Observer execute together
- Loop rate of two rotations of the memory wheel (see the sketch after this slide)
  - 1st rotation: the Producer writes to the shared data
  - 2nd rotation: the Consumer and Observer read from the shared data and write to their outputs
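A hedged C sketch of this pattern, reusing the hypothetical DEAD() macro from slide 9; the rotation length, thread assignment, and variable names are assumptions for illustration, and the threads are assumed to start aligned with the wheel. Both loops have a period of two rotations, with the readers phase-shifted by one rotation so the write always lands in the first rotation and the reads in the second.

    #define DEAD(cycles) __asm__ volatile ("deadi t0, " #cycles)

    #define ROTATION     78                 /* assumed cycles per wheel rotation */
    #define LOOP_PERIOD  (2 * ROTATION)     /* slide: two rotations per loop     */

    volatile int shared_data;               /* producer writes, others read      */
    volatile int consumer_out, observer_out;

    void producer(void)                     /* hardware thread 0 (assumed)       */
    {
        int v = 0;
        for (;;) {
            DEAD(LOOP_PERIOD);              /* fires at t = 0, 2R, 4R, ...       */
            shared_data = ++v;              /* write during the 1st rotation     */
        }
    }

    void consumer(void)                     /* hardware thread 1 (assumed)       */
    {
        DEAD(ROTATION);                     /* one-rotation phase offset         */
        for (;;) {
            DEAD(LOOP_PERIOD);              /* fires at t = R, 3R, 5R, ...       */
            consumer_out = shared_data;     /* read during the 2nd rotation      */
        }
    }

    void observer(void)                     /* hardware thread 2 (assumed)       */
    {
        DEAD(ROTATION);
        for (;;) {
            DEAD(LOOP_PERIOD);
            observer_out = shared_data;     /* reads alongside the consumer      */
        }
    }

The deadlines alone keep the readers out of the producer's rotation, so no locks are needed; that is the mutual exclusion the slide's title refers to.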
17. Video Game Example
(Diagram: three threads: Main-Control, Graphic, and VGA-Driver. Commands flow from the Main-Control Thread to the Graphic Thread through even/odd command queues, which swap when a sync is requested and the odd queue is empty; "Update Screen" is the sync request and "Sync" is signaled after the queue swap. Pixel data flows from the Graphic Thread to the VGA-Driver Thread through even/odd buffers, which swap when a sync is requested and during vertical blank; "Refresh" is the sync request and "Sync" is signaled after the buffer swap.)
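A small C sketch of the buffer-swap handshake on the pixel-data side of the diagram; the names, types, and the placement of the blank-time check are illustrative assumptions. The VGA-driver thread swaps the even/odd buffers only when the graphic thread has requested a sync and the display is in vertical blank, then signals the sync back.

    #include <stdbool.h>
    #include <stdint.h>

    #define FB_PIXELS (640 * 480)

    static uint8_t even_buffer[FB_PIXELS];
    static uint8_t odd_buffer[FB_PIXELS];

    /* Shared between the graphic thread (writer) and the VGA-driver thread. */
    static volatile bool refresh_requested;           /* "Refresh (sync request)"   */
    static volatile bool buffer_swapped;              /* "Sync (after buffer swap)" */
    static uint8_t *volatile draw_buf = even_buffer;  /* graphic thread writes here */
    static uint8_t *volatile scan_buf = odd_buffer;   /* VGA driver reads from here */

    /* Called by the VGA-driver thread once per frame, during vertical blank. */
    void vga_vertical_blank(void)
    {
        if (refresh_requested) {          /* swap only when a sync was requested */
            uint8_t *tmp = draw_buf;
            draw_buf = scan_buf;
            scan_buf = tmp;
            refresh_requested = false;
            buffer_swapped = true;        /* acknowledge back to the graphic thread */
        }
    }

The command-queue swap between the Main-Control and Graphic threads follows the same shape, with "odd queue empty" in place of the vertical-blank condition.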
18. Timing Requirements

Signal            Timing Requirement   Pixel Cycles
V. Sync           64 µs                1611
V. Back-porch     1.02 ms              25679
Draw 480 lines    15.25 ms
V. Front-porch    350 µs               8811
H. Sync           3.77 µs              96
H. Back-porch     1.89 µs              48
Draw 640 pixels   25.42 µs
H. Front-porch    0.64 µs              16
19. Timing Implementation
- Pixel clock using a derived clock
  - 25.175 MHz
- Drawing 16 pixels
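The pixel-cycle counts in the table on slide 18 follow from multiplying each timing requirement by the 25.175 MHz pixel clock, for example:

    64\,\mu\mathrm{s} \times 25.175\,\mathrm{MHz} \approx 1611 \ \text{(V. Sync)}
    \qquad 1.02\,\mathrm{ms} \times 25.175\,\mathrm{MHz} \approx 25679 \ \text{(V. Back-porch)}

Expressed this way, the requirements can presumably serve directly as deadline values for a timer driven by the derived pixel clock.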
20. Future Work
- Architecture
  - DMA
  - DDR2 main memory model
  - Thread synchronization primitives
  - Shared data between threads
- Real-time benchmarks
  - With timing requirements
- Programming models
  - Memory allocation schemes
  - Synchronization
21. Conclusion
- What we want
  - Time as a first-class citizen of embedded computing
  - Predictability
  - Repeatability
- Where we are
  - PRET cycle-accurate simulator
  - Release: http://chess.eecs.berkeley.edu/pret/
23. Extras
24. More on Brittleness
- Small changes may have big effects on timing behavior
- Theorem (Richard's anomalies)
  - If a task set with fixed priorities, execution times, and precedence constraints is optimally scheduled on a fixed number of processors, then increasing the number of processors, reducing execution times, or weakening precedence constraints can increase the schedule length.
- R. L. Graham, Bounds on the performance of scheduling algorithms, in E. G. Coffman, Jr. (ed.), Computer and Job-Shop Scheduling Theory, John Wiley, New York, 1975.
25. Richard's Anomalies
- 9 tasks, 3 processors, a priority list, precedence order, and execution times.
(Gantt chart of the resulting schedule.)
26. Richard's Anomalies: Reducing Execution Times
(Gantt chart of the schedule with reduced execution times.)
27. Richard's Anomalies: More Processors
(Gantt chart of the schedule with an additional processor.)
28. Richard's Anomalies: Changing the Priority List
- L = (T1, T2, T4, T5, T6, T3, T9, T7, T8)
(Gantt chart of the resulting schedule.)
29. Brittleness Again
- In general, all task scheduling strategies are
brittle