Title: Register Window Analysis in ASIPs
1Register Window Analysis in ASIPs
- Vishal P. Bhatt
- M.Tech. CSE, IITD
12th May 2001
Embedded Systems Group
IIT Delhi
2ASIPs - Application Specific Instruction
Processors
- Set of architectural parameters
- Value of each parameter is decided by looking at the application
- Deciding the value of a parameter is a multidimensional optimization problem
3Focus of this project
- Performance analysis for the ASIP parameter,
number of register windows (NRWs)
4Register Windows
- A set of registers
- Typically, the set is divided into three subsets: the out, the in and the local registers
- Overlapping registers: SPARC V8 type architecture
5Overlapping Registers
6Effects of NRWs
(Figure: diagram labelled Memory and Program, showing functions f1 to f5)
7Effects of NRWs
(Figure: diagram labelled Memory and Program, showing functions f1 to f5 and a SPILL)
8Effects of NRWs
(Figure: diagram labelled Memory and Program, showing functions f1 to f5 and a SPILL)
9Effects of NRWs (contd)
- Time penalty: time spent in transferring data to and from memory
- Penalty paid due to lack of enough NRWs
- Increasing NRWs increases cost
- Cost-performance trade-off
10Approach
- Compute number of spills
- Choose a memory access time model
- Compute time penalty
12Spill Count Computation
13Spill Count Computation (contd)
- Problem can be modeled as a regular language recognition problem
- The Problem
- Represent the application as a sequence of c's and r's
- For every value of NRWs, we have a predefined r.e. (regular expression)
- Find the number of matches of each r.e. in the application string
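A minimal sketch of the spill count for one value of NRWs, assuming c denotes a function call and r a return. The slides do not give the regular expressions themselves, so this sketch swaps in a plain window-occupancy counter instead of r.e. matching. The function name `count_spills` and the simplified window model (one window per active function, oldest window spilled on overflow, no window reserved for traps) are assumptions made for illustration.

```python
def count_spills(trace: str, nrw: int) -> int:
    """Count window spills for a call/return trace under a simplified model.

    trace: string of 'c' (call) and 'r' (return) characters (assumed encoding).
    nrw:   number of register windows available.
    """
    if nrw < 1:
        raise ValueError("nrw must be at least 1")

    resident = 1  # windows held in registers; the entry function owns one
    saved = 0     # windows currently spilled to memory
    spills = 0

    for ch in trace:
        if ch == "c":
            if resident == nrw:
                # No free window: the oldest window is written to memory.
                spills += 1
                saved += 1
            else:
                resident += 1
        elif ch == "r":
            if resident > 1:
                resident -= 1
            elif saved > 0:
                # The caller's window was spilled earlier and is restored.
                saved -= 1
            # Otherwise this is the return of the entry function itself.
        else:
            raise ValueError(f"unexpected symbol in trace: {ch!r}")

    return spills
```

Calling `count_spills(trace, nrw)` for each candidate NRWs on the same trace gives one spill count per NRWs, which is the input the time penalty step needs.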
14Spill Count Computation (contd)
15Spill Count Computation (contd)
16Memory Access Time Models
- Processor design goes hand-in-hand with memory design
- Decision diagram for memory configuration has been developed (shown in report)
17ASIP and Memory Design
18Results and Analysis
- Time penalties were generated for four MediaBench applications
- JPEG Encoder
- JPEG Decoder
- MPEG Encoder
- MPEG Decoder
21
- Drastic fall in time penalty as we go from one window to two windows
- Curve slope variation depends on function call distribution across the application
22Example
24Memory Models considered
- Three of the sixteen models considered
25System Configurations
26Total Execution Time
- Penalty time = (No. of penalty words for given NRWs) × (Average memory access time for corresponding system configuration)
- Total Execution time = [4 × (Branch count) + 2 × (Ld_Str count) + 1 × (Others)] × (Cycle time for corresponding system configuration) + (Penalty time for corresponding NRWs)
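A minimal sketch of this computation, assuming the slide reads as: penalty time is the number of penalty words times the average memory access time, and total execution time is the weighted instruction count (4 cycles per branch, 2 per load/store, 1 per other instruction) times the cycle time, plus the penalty time. The function and parameter names are hypothetical, and all times are assumed to be in the same unit.

```python
def penalty_time(penalty_words: int, avg_mem_access_time: float) -> float:
    """Time spent moving spilled words to and from memory."""
    return penalty_words * avg_mem_access_time


def total_execution_time(
    branch_count: int,
    ld_str_count: int,
    other_count: int,
    cycle_time: float,
    penalty: float,
) -> float:
    """Weighted cycle count times cycle time, plus the spill penalty.

    Weights 4, 2 and 1 are the per-instruction cycle counts from the slide.
    """
    cycles = 4 * branch_count + 2 * ld_str_count + 1 * other_count
    return cycles * cycle_time + penalty
```

The penalty term is the only part that depends on NRWs, so the instruction counts need to be collected once per application while the penalty is recomputed per NRWs and per system configuration.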
27Execution time for MPEG Decoder
28Time Overhead for different NRWs (MPEG Decoder)
29Time Overhead for different NRWs (MPEG Decoder)
30Application Parameter (proposal)
- PCAC: Probability of C After C
- PRAC: Probability of R After C
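A minimal sketch of how these two values could be estimated, assuming C is a call, R is a return, and the application is given as a string of 'c' and 'r' symbols. The slides do not state the estimation method; relative frequencies of the symbol that follows each 'c' are used here as an assumption, and the function name is hypothetical.

```python
def call_transition_probabilities(trace: str) -> tuple[float, float]:
    """Return (PCAC, PRAC) estimated from a call/return trace.

    PCAC: fraction of 'c' symbols immediately followed by another 'c'.
    PRAC: fraction of 'c' symbols immediately followed by an 'r'.
    """
    # Collect the symbol that follows every 'c' (the last symbol has no successor).
    after_c = [nxt for cur, nxt in zip(trace, trace[1:]) if cur == "c"]
    if not after_c:
        raise ValueError("trace has no 'c' followed by another symbol")

    pcac = after_c.count("c") / len(after_c)
    prac = after_c.count("r") / len(after_c)
    return pcac, prac
```

Under this reading, a high PCAC indicates deep call chains, which is where additional register windows would be expected to matter most.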
31Results: PCAC values for the considered
applications
33Conclusions
- Spill count computation can be modeled by regular language recognition
- Methodology for predicting time penalty for given NRWs and given application has been proposed
- Time penalty curves for the four MediaBench applications (JPEG encoder, JPEG decoder, MPEG encoder and MPEG decoder) have been generated
34Conclusions(contd)
- Decision diagram for exploring a number of possible memory architectures has been developed
- An application parameter, which can be used for a quick analysis of the application, has been proposed
35Future Suggestions
- Effects of NRWs on multithreaded applications should be studied
- Data generated can be validated by simulating the applications on processor models for different NRWs
- Integration with area and performance models of register windows with different sizes
36Future Suggestions(contd)
- Integration into the ASIP framework
- More accurate penalty estimates should be obtained by taking into account the penalty due to lack of enough registers in the individual windows
- The proposed application parameter should be correlated with the generated curves
43Total Execution Time for different NRWs (MPEG
Decoder)
- Constraint: decoder at the rate of 10 frames/sec
- For memory model 0, the constraint is satisfied for NRW > 5
- For model numbers 3 and 15, the constraint is satisfied for NRW > 3
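A minimal sketch of this kind of constraint check, assuming the total execution time has already been converted to seconds per decoded frame for each NRWs. The function name and the dictionary input are hypothetical; the slides only report the resulting thresholds.

```python
from typing import Optional


def min_nrw_meeting_rate(
    time_per_frame: dict[int, float], frames_per_sec: float
) -> Optional[int]:
    """Smallest NRWs whose per-frame time meets the required frame rate.

    time_per_frame: maps NRWs to execution time per frame, in seconds (assumed unit).
    Returns None when no listed NRWs satisfies the constraint.
    """
    deadline = 1.0 / frames_per_sec  # time budget for one frame
    feasible = [nrw for nrw, t in time_per_frame.items() if t <= deadline]
    return min(feasible) if feasible else None
```

Running this once per memory model gives the per-model threshold directly, which is how a comparison such as the one on this slide could be tabulated.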