Title: Dynamic Scheduling Using Tomasulo
1 Dynamic Scheduling Using Tomasulo's Approach
- Salient Characteristics
- Track instruction dependences and availability of operands
- Allow execution as soon as operands are available to avoid RAW hazards
- Use register renaming to avoid WAW and WAR hazards
- A dynamic scheduling scheme, in which hardware reschedules instruction execution to reduce stalls
2 The Structure of a DLX FP Unit
- (see Figure 4.8)
- Instructions are issued in FIFO order from the Instruction Queue
- Reservation stations include the operation and the actual operands
- Load buffers hold the results of outstanding loads
- All results from FP units or load units are put on the common data bus (CDB)
3 Lifecycle of an Instruction
- 1. Issue
- Get an instruction from the Instruction Queue
- Issue it if there is an empty reservation station
- Send operands to the reservation station if they are in the registers
- A load/store operation can issue if there's an available buffer
- If a buffer or reservation station is not available, the instruction stalls due to a structural hazard
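The issue step can be sketched in a few lines of Python. This is only an illustrative model, not the DLX hardware: the function name `issue`, the dictionaries `stations`, `reg_status`, and `regs`, and the single shared pool of stations are all assumptions made for the example.

```python
def issue(instr, stations, reg_status, regs):
    """Try to issue one instruction; return the station name, or None on a stall.

    instr      : (op, dest, src1, src2), e.g. ("MULTD", "F0", "F2", "F4")
    stations   : dict of station name -> dict (an empty dict means "free")
    reg_status : dict of register -> station that will produce its value
    regs       : dict of register -> current value
    All of these names are hypothetical and exist only for this sketch.
    """
    op, dest, src1, src2 = instr
    # Structural hazard: no free reservation station, so issue stalls.
    free = [name for name, st in stations.items() if not st]
    if not free:
        return None
    name = free[0]
    entry = {"op": op}
    for slot, src in (("j", src1), ("k", src2)):
        producer = reg_status.get(src)
        if producer is None:
            # Operand is in the register file: copy the value now.
            entry["V" + slot], entry["Q" + slot] = regs[src], None
        else:
            # Operand is still being computed: record who will produce it.
            entry["V" + slot], entry["Q" + slot] = None, producer
    stations[name] = entry
    reg_status[dest] = name  # later readers of dest wait on this station
    return name
```

With two free stations, a third back-to-back call returns `None`, which models the structural-hazard stall described above.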
4 Lifecycle of an Instruction (Cont'd)
- 2. Execute
- Execute when both operands are available
- Monitor the CDB while waiting for operands
- 3. Write Result
- When the result is available, write it on the CDB
- From the CDB, the result is written into the registers and any reservation station waiting for this result
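The write-result step amounts to a broadcast: every waiting station compares the tag on the CDB with the tags it is waiting for. The sketch below is a simplified Python model under assumed names (`broadcast`, `stations`, `reg_status`, `regs`); it also assumes a register is updated only if it is still waiting on the broadcasting station.

```python
def broadcast(tag, value, stations, reg_status, regs):
    """Put (tag, value) on the CDB; return the stations that became ready.

    stations   : dict of station name -> dict with Vj, Vk, Qj, Qk
                 (an empty dict means the station is free)
    reg_status : dict of register -> station that will produce its value
    regs       : dict of register -> current value
    All names here are hypothetical and exist only for this sketch.
    """
    ready = []
    for name, st in stations.items():
        if not st:
            continue  # free station, nothing to update
        woke = False
        for slot in ("j", "k"):
            if st.get("Q" + slot) == tag:
                # Capture the value and clear the tag it was waiting on.
                st["V" + slot], st["Q" + slot] = value, None
                woke = True
        if woke and st["Qj"] is None and st["Qk"] is None:
            ready.append(name)  # both operands available: may execute
    # Write only the registers that still name this station as producer.
    for reg, producer in list(reg_status.items()):
        if producer == tag:
            regs[reg] = value
            del reg_status[reg]
    return ready
```

A station that receives one operand but still waits on another tag stays out of the returned list, which matches the rule that execution starts only when both operands are available.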
5 Reservation Station Fields
- Every reservation station has six fields
- OP - operation to perform
- Qj, Qk - the reservation stations that will produce the source operands
- Vj, Vk - the values of the source operands
- Busy - indicates that this reservation station is busy
- The register file has a field, Qi
- Qi - the reservation station or buffer that contains the operation whose result is to be stored into the register
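As a rough illustration, the six fields map naturally onto a small record type. The Python class below is a sketch: the class name, the `ready` helper, and the use of `None` to mean "no value" or "no pending producer" are assumptions, not part of the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReservationStation:
    busy: bool = False           # Busy: station is in use
    op: Optional[str] = None     # OP: operation to perform
    vj: Optional[float] = None   # Vj: value of the first source operand
    vk: Optional[float] = None   # Vk: value of the second source operand
    qj: Optional[str] = None     # Qj: station producing the first operand
    qk: Optional[str] = None     # Qk: station producing the second operand

    def ready(self) -> bool:
        """True when the station is busy and waits on no producer."""
        return self.busy and self.qj is None and self.qk is None


# Qi for each register: the station or buffer whose result it will receive.
# A register absent from this (hypothetical) map has no pending writer.
register_qi = {"F0": "Mult1"}
```

For each operand, either the Q field names a producer or the V field holds the value, so a station is ready exactly when both Q fields are empty.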
6 Tomasulo's Algorithm - Example
LD F6, 34(R2)
LD F2, 45(R3)
MULTD F0, F2, F4
SUBD F8, F6, F2
DIVD F10, F0, F6
ADDD F6, F8, F2
See Figures 4.9 and 4.10
See Figure 4.11 for steps in the algorithm
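To see which hazards this sequence contains, the dependences can be enumerated mechanically. The following Python sketch is an illustration only: the tuple encoding of the program and the function `find_hazards` are assumptions, instructions are numbered from 0, and base registers of loads are treated as ordinary sources.

```python
# (opcode, destination, sources) for the six instructions of the example.
program = [
    ("LD",    "F6",  ["R2"]),
    ("LD",    "F2",  ["R3"]),
    ("MULTD", "F0",  ["F2", "F4"]),
    ("SUBD",  "F8",  ["F6", "F2"]),
    ("DIVD",  "F10", ["F0", "F6"]),
    ("ADDD",  "F6",  ["F8", "F2"]),
]


def find_hazards(program):
    """Return (kind, earlier, later, register) for each potential hazard."""
    hazards = []
    for i, (_, dest_i, srcs_i) in enumerate(program):
        dest_live = True  # False once a later instruction overwrites dest_i
        for j in range(i + 1, len(program)):
            _, dest_j, srcs_j = program[j]
            if dest_live and dest_i in srcs_j:
                hazards.append(("RAW", i, j, dest_i))
            if dest_j in srcs_i:
                hazards.append(("WAR", i, j, dest_j))
            if dest_live and dest_j == dest_i:
                hazards.append(("WAW", i, j, dest_i))
                dest_live = False
    return hazards


for hazard in find_hazards(program):
    print(hazard)
```

The WAR entries on F6 (SUBD and DIVD read F6 before ADDD writes it) are the ones register renaming removes: each reader holds either the value of F6 or the tag of its producer in its reservation station, so ADDD may complete early without harm.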
7 Tomasulo's Algorithm - A Loop-Based Example
- Loop: LD F0, 0(R1)
- MULTD F4, F0, F2
- SD 0(R1), F4
- SUBI R1, R1, 8
- BNEZ R1, Loop ; branches if R1 ≠ 0
- If we predict that branches are taken, the loop is unrolled dynamically by the hardware
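Dynamic unrolling works because each issued copy of an instruction gets its own tag, so successive iterations do not conflict on F0 and F4. The Python sketch below illustrates this with made-up tags `T1`, `T2`, ... standing in for reservation station and buffer names; it assumes the branch is predicted taken and leaves out the integer instructions SUBI and BNEZ.

```python
from itertools import count


def rename(instrs):
    """Replace each source register with the tag of its latest producer."""
    tags = count(1)
    latest = {}   # register -> tag of the most recent instruction writing it
    renamed = []
    for op, dest, srcs in instrs:
        # Sources with a pending producer are replaced by that producer's tag.
        new_srcs = [latest.get(s, s) for s in srcs]
        tag = None
        if dest is not None:
            tag = f"T{next(tags)}"   # hypothetical station/buffer name
            latest[dest] = tag
        renamed.append((op, tag, new_srcs))
    return renamed


# FP part of the loop body; SD has no register destination.
body = [
    ("LD",    "F0", ["R1"]),
    ("MULTD", "F4", ["F0", "F2"]),
    ("SD",    None, ["R1", "F4"]),
]

two_iterations = rename(body + body)
for instr in two_iterations:
    print(instr)
```

In the output, the second MULTD depends only on the second LD's tag, so the two iterations can overlap even though both name F0 and F4 in the program text.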
8 Scoreboard - Steps in Execution
- 1. Issue: The scoreboard issues an instruction if
- a. A functional unit for the instruction is free
- b. No other active instruction has the same destination register
- If a structural or WAW hazard exists, the instruction issue stalls.
- 2. Read Operands
- The scoreboard monitors the availability of operands
- When operands become available, execution begins after reading the operands
- RAW hazards are dynamically resolved here
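The two issue conditions can be written as a single predicate. This Python sketch uses assumed names (`can_issue`, `busy_units`, `pending_dests`) and models functional units and registers as plain strings.

```python
def can_issue(instr, busy_units, pending_dests):
    """Return True if the scoreboard may issue instr.

    instr         : (unit, dest) pair, e.g. ("Mult", "F0")
    busy_units    : set of functional units currently in use
    pending_dests : set of destination registers of active instructions
    These names are hypothetical and exist only for this sketch.
    """
    unit, dest = instr
    structural_hazard = unit in busy_units    # condition a fails
    waw_hazard = dest in pending_dests        # condition b fails
    return not (structural_hazard or waw_hazard)
```

Unlike Tomasulo's approach, nothing is renamed here, so a WAW hazard blocks issue rather than being eliminated.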
9 Scoreboard - Steps in Execution (Cont'd)
- 3. Execution
- The functional unit begins execution
- When the result is ready, it notifies the scoreboard
- 4. Write Result
- The scoreboard checks for WAR hazards and stalls writing the result if needed
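The WAR check at write-back can be sketched as follows. This is a simplified Python model: the function `can_write_result`, the list `active` (assumed to be in issue order), and the `read` flag marking whether an instruction has already read its operands are assumptions made for the example.

```python
def can_write_result(writer, active):
    """Return True if active[writer] may write its result now.

    active : list of dicts in issue order, each with
             'dest' (register), 'srcs' (list of registers), and
             'read' (True once the instruction has read its operands).
    The write must wait while an earlier instruction still has to read
    the register that is about to be overwritten (a WAR hazard).
    """
    dest = active[writer]["dest"]
    for earlier in active[:writer]:
        if not earlier["read"] and dest in earlier["srcs"]:
            return False
    return True


active = [
    {"dest": "F10", "srcs": ["F0", "F6"], "read": False},  # DIVD F10, F0, F6
    {"dest": "F6",  "srcs": ["F8", "F2"], "read": True},   # ADDD F6, F8, F2
]
print(can_write_result(1, active))  # ADDD must wait until DIVD reads F6
```

Once the earlier instruction has read its operands, the same call returns `True` and the write can proceed.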