Title: A Configurable Simulator for OOO Speculative Execution
By Mustafa Imran Ali ID230203
2 Architecture Modeled
- Fetch logic
  - Trace-driven execution. Branch outcomes explicitly specified.
- Issue logic
  - Issue width configurable
- Functional unit reservation stations (RS)
  - RS count configurable
- Execution units modeled after the MIPS R4000 pipeline (Hennessy and Patterson, Computer Architecture, 3rd Ed.)
  - No. of pipeline stages configurable
- Common data buses (CDBs)
  - No. of CDBs configurable
- ROB and commit logic
  - ROB size and commit capacity configurable
3 Simulation Methodology
- A program trace file written in comma-separated values (CSV) format
- A configuration file to specify values of configurable parameters
- Trace file and configuration file are input to the simulator
4 Architectural Assumptions
- Only load misses are supported. Stores are committed in a single cycle
- Stores use a direct bus to transfer the calculated effective address (EA) into the ROB
- Branch outcomes are written to the ROB using the CDB
- A branch mispredict is handled when the branch instruction reaches the head of the ROB
5 Architectural Assumptions (cont.)
- Dynamic memory disambiguation implemented using a store EA cache
  - A load is only allowed to proceed if there are no pending stores with the same effective address
- Reservation stations issue the first ready instruction detected
  - Not necessarily the oldest instruction
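The two rules on this slide can be sketched in C as follows. This is a minimal illustration, not the simulator's actual code: the type names `StoreEaEntry` and `RsEntry`, their fields, and both function names are hypothetical. The sketch also assumes every pending store in the cache already has its effective address computed, which the slides do not discuss.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical store EA cache entry: one pending (uncommitted) store. */
typedef struct {
    bool     valid;  /* entry currently holds a pending store */
    uint32_t ea;     /* effective address of that store */
} StoreEaEntry;

/* A load may proceed only if no pending store has the same EA. */
bool load_may_proceed(const StoreEaEntry *cache, int n, uint32_t load_ea)
{
    int i;
    assert(cache != NULL && n >= 0);
    for (i = 0; i < n; i++) {
        if (cache[i].valid && cache[i].ea == load_ea)
            return false;
    }
    return true;
}

/* Hypothetical reservation station entry with two source operands. */
typedef struct {
    bool busy;        /* entry holds an instruction */
    bool src1_ready;  /* first operand value is available */
    bool src2_ready;  /* second operand value is available */
} RsEntry;

/* Returns the index of the first ready entry found by a linear scan,
 * or -1 if none is ready. Scan order is slot order, so the selected
 * instruction is not necessarily the oldest one. */
int select_first_ready(const RsEntry *rs, int n)
{
    int i;
    assert(rs != NULL && n >= 0);
    for (i = 0; i < n; i++) {
        if (rs[i].busy && rs[i].src1_ready && rs[i].src2_ready)
            return i;
    }
    return -1;
}
```

Because the scan follows slot order rather than program order, a younger instruction in a low-numbered slot is selected ahead of an older one in a higher slot.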
6 Architectural Assumptions (cont.)
- Access to the available CDBs is arbitrated
- When requests for a CDB arrive, the following priority order is used to grant them
  - Branch FU
  - Div FU
  - LD/ST
  - MULT FU
  - FPADD FU
  - INT ALU FU
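A fixed-priority arbiter of this kind could look like the sketch below. The enum and function names are hypothetical, and the sketch assumes at most one CDB request per FU type per cycle, which the slides do not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* FU types listed in the priority order given on the slide
 * (highest priority first). The names are illustrative. */
typedef enum {
    FU_BRANCH, FU_DIV, FU_LDST, FU_MULT, FU_FPADD, FU_INTALU, FU_COUNT
} FuType;

/* Grants up to num_cdb requests in priority order.
 * request[fu] is true if that FU wants a CDB this cycle;
 * grant[fu] is set to true if it gets one.
 * Returns the number of CDBs granted. */
int cdb_arbitrate(const bool request[FU_COUNT], int num_cdb,
                  bool grant[FU_COUNT])
{
    int fu;
    int granted = 0;

    assert(request != NULL && grant != NULL && num_cdb >= 0);
    for (fu = 0; fu < FU_COUNT; fu++) {
        grant[fu] = request[fu] && granted < num_cdb;
        if (grant[fu])
            granted++;
    }
    return granted;
}
```

With two CDBs and requests from the Div, MULT, and INT ALU units, this sketch grants Div and MULT and leaves the INT ALU result waiting for a later cycle.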
7 List of Configurable Parameters
- ISSUE SIZE
  - The maximum number of instructions examined for parallel issue
- COMMIT SIZE
  - The maximum number of instructions examined in the ROB for commit
- ROB SIZE
  - The number of entries in the reorder buffer
- NUM CDB
  - Number of common data buses
- LSQ SIZE
  - Number of entries in the load/store buffer
- STORE CACHE SIZE
  - Number of entries in the store EA lookup table
8 List of Configurable Parameters (cont.)
- NUMRSBU
- NUMRSINTALU
- NUMRSMULT
- MULTSTAGES
- NUMRSDIV
9 List of Configurable Parameters (cont.)
- DIVCYCLES
- NUMRSFPADD
- FPADDSTAGES
- MISSPROB
- MPPROB
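The parameters listed on these slides could be held in a single C structure and filled from the configuration file one line at a time. The sketch below is illustrative only: the slides do not give the file syntax, so the `NAME = VALUE` line format, the underscore spellings (`ISSUE_SIZE`, `NUM_CDB`, and so on), the struct and function names, and the choice to treat `MISSPROB` and `MPPROB` as fractional values are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the parameters named on the slides. */
typedef struct {
    int issue_size, commit_size, rob_size, num_cdb;
    int lsq_size, store_cache_size;
    int num_rs_bu, num_rs_intalu, num_rs_mult, mult_stages;
    int num_rs_div, div_cycles, num_rs_fpadd, fpadd_stages;
    double miss_prob;  /* MISSPROB, assumed to be a fractional value */
    double mp_prob;    /* MPPROB, assumed to be a fractional value */
} SimConfig;

/* Applies one "NAME = VALUE" line (assumed format).
 * Returns 1 if a known parameter was set, 0 otherwise. */
int config_apply_line(SimConfig *cfg, const char *line)
{
    struct { const char *name; int *field; } ints[] = {
        { "ISSUE_SIZE",       &cfg->issue_size },
        { "COMMIT_SIZE",      &cfg->commit_size },
        { "ROB_SIZE",         &cfg->rob_size },
        { "NUM_CDB",          &cfg->num_cdb },
        { "LSQ_SIZE",         &cfg->lsq_size },
        { "STORE_CACHE_SIZE", &cfg->store_cache_size },
        { "NUMRSBU",          &cfg->num_rs_bu },
        { "NUMRSINTALU",      &cfg->num_rs_intalu },
        { "NUMRSMULT",        &cfg->num_rs_mult },
        { "MULTSTAGES",       &cfg->mult_stages },
        { "NUMRSDIV",         &cfg->num_rs_div },
        { "DIVCYCLES",        &cfg->div_cycles },
        { "NUMRSFPADD",       &cfg->num_rs_fpadd },
        { "FPADDSTAGES",      &cfg->fpadd_stages }
    };
    char name[32];
    double value;
    size_t i;

    assert(cfg != NULL && line != NULL);

    /* Parse an upper-case name, an '=' sign, and a numeric value. */
    if (sscanf(line, " %31[A-Z_] = %lf", name, &value) != 2)
        return 0;

    if (strcmp(name, "MISSPROB") == 0) { cfg->miss_prob = value; return 1; }
    if (strcmp(name, "MPPROB") == 0)   { cfg->mp_prob = value;   return 1; }

    for (i = 0; i < sizeof ints / sizeof ints[0]; i++) {
        if (strcmp(name, ints[i].name) == 0) {
            *ints[i].field = (int)value;
            return 1;
        }
    }
    return 0;  /* unknown parameter name */
}
```

A caller would read the file with `fgets` and pass each line to `config_apply_line`, reporting lines for which it returns 0.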
10 Simulator Structure
- main()
  - readtracefile()
  - readconfigfile()
  - while(NOT EXIT)
    - commit()
    - ROB_update()
    - RS_update()
    - CDB_Arbiter()
    - writeback()
    - execute()
    - issue()
    - fetch()
  - printStatistics()
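The loop on this slide calls the stages from commit back to fetch. The slides do not say why, but one common reason in cycle-level simulators is that evaluating the last stage first lets every stage consume what the previous stage produced in the preceding cycle, without a second copy of the pipeline state. The toy three-stage model below illustrates that idea only; the `Sim` structure, the latch fields, and `run_toy_pipeline` are hypothetical, and the real stages (ROB_update, RS_update, CDB_Arbiter, writeback, execute) are omitted.

```c
#include <assert.h>

#define EMPTY (-1)

/* Toy pipeline state: one-entry latches between stages. */
typedef struct {
    int fetch_latch;       /* instruction id waiting between fetch and issue */
    int issue_latch;       /* instruction id waiting between issue and commit */
    int next_id;           /* next trace entry to fetch */
    int trace_len;         /* number of instructions in the trace */
    int committed;         /* instructions committed so far */
    unsigned long cycles;  /* cycles simulated so far */
} Sim;

static void commit(Sim *s)
{
    if (s->issue_latch != EMPTY) {
        s->committed++;
        s->issue_latch = EMPTY;
    }
}

static void issue(Sim *s)
{
    if (s->issue_latch == EMPTY && s->fetch_latch != EMPTY) {
        s->issue_latch = s->fetch_latch;
        s->fetch_latch = EMPTY;
    }
}

static void fetch(Sim *s)
{
    if (s->fetch_latch == EMPTY && s->next_id < s->trace_len)
        s->fetch_latch = s->next_id++;
}

/* Runs the toy pipeline to completion and returns the cycle count. */
unsigned long run_toy_pipeline(int trace_len)
{
    Sim s = { EMPTY, EMPTY, 0, 0, 0, 0 };

    assert(trace_len >= 0);
    s.trace_len = trace_len;
    while (s.committed < s.trace_len) {
        /* Reverse order: each stage sees last cycle's upstream output. */
        commit(&s);
        issue(&s);
        fetch(&s);
        s.cycles++;
    }
    return s.cycles;
}
```

In this toy model an instruction needs three cycles to pass through the three stages, so a trace of N instructions finishes in N + 2 cycles. Calling the stages in forward order instead would let one instruction pass through all three stages within a single loop iteration.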
11 Block Diagram
[Figure: block diagram showing the Trace, Issue Unit, INT ALU RS, BR UNIT RS, LSQ, DIV UNIT RS, MULT UNIT RS, Arbiter, CDB, ROB, and RF blocks]
12 Metrics Measured
- Cycles to complete
- Issue stall cycles
  - Cycles when no instructions can be issued to the RS
- FU utilization (for each FU)
  - No. of instructions of that FU type / total cycles
- CDB utilization (for each CDB)
  - No. of broadcasts / total cycles
- Cycles per instruction
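The ratios on this slide can be written as small C helpers. This is a sketch with hypothetical names (`SimCounters` and its fields); it assumes the instruction count used for cycles per instruction is the number of committed instructions, which the slides do not specify.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical end-of-run counters. */
typedef struct {
    unsigned long total_cycles;
    unsigned long instructions_committed;
    unsigned long fu_instructions;  /* instructions executed on one FU type */
    unsigned long cdb_broadcasts;   /* broadcasts made on one CDB */
} SimCounters;

/* Safe division: returns 0.0 when the denominator is zero. */
static double ratio(unsigned long num, unsigned long den)
{
    return den == 0 ? 0.0 : (double)num / (double)den;
}

/* FU utilization = instructions of that FU type / total cycles. */
double fu_utilization(const SimCounters *c)
{
    assert(c != NULL);
    return ratio(c->fu_instructions, c->total_cycles);
}

/* CDB utilization = broadcasts on that CDB / total cycles. */
double cdb_utilization(const SimCounters *c)
{
    assert(c != NULL);
    return ratio(c->cdb_broadcasts, c->total_cycles);
}

/* Cycles per instruction = total cycles / committed instructions. */
double cycles_per_instruction(const SimCounters *c)
{
    assert(c != NULL);
    return ratio(c->total_cycles, c->instructions_committed);
}
```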
13 Metrics Measured (cont.)
- Frequency of various issue counts over all execution cycles
- Frequency of various commit counts over all execution cycles
- RS occupancy frequency over all cycles
- ROB occupancy frequency over all cycles
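Each of these frequency metrics amounts to a histogram updated once per cycle. A minimal sketch, assuming a hypothetical `OccupancyHist` type and a fixed upper bound `MAX_TRACKED` on the tracked value (neither comes from the slides):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACKED 64  /* hypothetical upper bound on the tracked value */

/* count[k] is the number of cycles in which the tracked quantity
 * (issue count, commit count, RS or ROB occupancy) was exactly k. */
typedef struct {
    unsigned long count[MAX_TRACKED + 1];
    unsigned long cycles;
} OccupancyHist;

/* Call once per simulated cycle with that cycle's value. */
void hist_record(OccupancyHist *h, int value)
{
    assert(h != NULL && value >= 0 && value <= MAX_TRACKED);
    h->count[value]++;
    h->cycles++;
}

/* Fraction of all recorded cycles in which the value was exactly k. */
double hist_frequency(const OccupancyHist *h, int k)
{
    assert(h != NULL && k >= 0 && k <= MAX_TRACKED);
    if (h->cycles == 0)
        return 0.0;
    return (double)h->count[k] / (double)h->cycles;
}
```

One histogram instance per metric (and per RS group, if occupancy is tracked separately for each FU type) would be enough to produce the distributions listed on this slide.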
14 Simulator Design
- Coded in C
- Compiled using Microsoft Visual C++ 6.0
15 Execution Demonstration
- Register state initializations
  - REGS[1].valid=1, REGS[2].valid=1, REGS[3].valid=1, REGS[8].valid=1, REGS[9].valid=1, REGS[11].valid=1, REGS[12].valid=1, REGS[15].valid=1, REGS[16].valid=1, REGS[17].valid=1
- Sample Program
- ADD R0,R1,R2
- ADD R4,R0,R3
- ADD R7,R4,R0
- ADD R10,R11,R12
- ADD R13,R10,R15
- ADD R13,R16,R17
- ADD R15,R11,R12
- ADD R17,R15,R12
- EXIT
- Dependences annotated on the sample program: RAW, RAW, RAW, WAR, WAW, RAW
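The dependences in the sample program can be checked mechanically. The sketch below counts RAW, WAR, and WAW dependences for three-register instructions. It is an illustration, not part of the simulator: the `Instr` and `HazardCount` types and `count_hazards` are hypothetical, and the counting rule is one possible choice (each source is paired with its nearest earlier writer, and each destination with the earlier readers and the writer found since its last write).

```c
#include <assert.h>
#include <stddef.h>

/* One instruction of the form: dest = src1 op src2 (register numbers). */
typedef struct { int dest, src1, src2; } Instr;

typedef struct { int raw, war, waw; } HazardCount;

static int reads(const Instr *in, int reg)
{
    return in->src1 == reg || in->src2 == reg;
}

HazardCount count_hazards(const Instr *prog, int n)
{
    HazardCount hc = { 0, 0, 0 };
    int i, j, k;

    assert(prog != NULL && n >= 0);
    for (j = 0; j < n; j++) {
        int srcs[2];
        int nsrc;

        srcs[0] = prog[j].src1;
        srcs[1] = prog[j].src2;
        nsrc = (srcs[0] == srcs[1]) ? 1 : 2;  /* count a repeated source once */

        /* RAW: each distinct source depends on its nearest earlier writer. */
        for (k = 0; k < nsrc; k++) {
            for (i = j - 1; i >= 0; i--) {
                if (prog[i].dest == srcs[k]) { hc.raw++; break; }
            }
        }

        /* WAR and WAW on the destination: walk back until the previous
         * writer of the same register is found. */
        for (i = j - 1; i >= 0; i--) {
            if (reads(&prog[i], prog[j].dest))
                hc.war++;
            if (prog[i].dest == prog[j].dest) { hc.waw++; break; }
        }
    }
    return hc;
}

/* The sample program from the slide, with EXIT omitted. */
const Instr sample_program[8] = {
    {  0,  1,  2 },  /* ADD R0,R1,R2    */
    {  4,  0,  3 },  /* ADD R4,R0,R3    */
    {  7,  4,  0 },  /* ADD R7,R4,R0    */
    { 10, 11, 12 },  /* ADD R10,R11,R12 */
    { 13, 10, 15 },  /* ADD R13,R10,R15 */
    { 13, 16, 17 },  /* ADD R13,R16,R17 */
    { 15, 11, 12 },  /* ADD R15,R11,R12 */
    { 17, 15, 12 }   /* ADD R17,R15,R12 */
};
```

Under this counting rule the sample program has 5 RAW, 2 WAR, and 1 WAW dependences. That is two more than the six annotations on the slide, so the slide may mark only a subset of the register pairs.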
16 Results: Cycles
17 Present Implementation
- Completely configurable simulator
- INT ALU in working state
18 Immediate Extension
- Branch unit completion
- Pipelined multiplier completion
- LD/ST unit completion