CS 7810 Lecture 8

Provided by: RajeevBala4
1
CS 7810 Lecture 8
Memory Dependence Prediction using Store Sets,
G.Z. Chrysos and J.S. Emer, Proceedings of ISCA-25, 1998
2
Lifetime of a Load
3
LSQ Basics
Ld/St   Address     Data   Completed
Store   Unknown     1000   --
Load    x40000000   --     --
Store   x50000000   --     --
Load    x50000000   --     --
Load    x30000000   --     --
  • An incomplete store stalls all future loads (No Speculation); the
    paper's model is overly conservative because it also waits for
    store values
  • Most of these stalls are unnecessary: they are artificial dependences
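
The No Speculation rule can be sketched against the LSQ contents shown on this slide. This is a minimal illustration, not the paper's hardware: the `Entry` class and the `load_may_issue_no_spec` helper are hypothetical names, and "completed" is assumed to mean that both the address and the value of a store are known.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Entry:
    kind: str                    # "ld" or "st"
    addr: Optional[int]          # None while the address is still unknown
    data: Optional[int] = None
    completed: bool = False      # assumed: address and value are both known

def load_may_issue_no_spec(lsq, idx):
    """A load at position idx may issue only if every older store has completed."""
    return all(e.completed for e in lsq[:idx] if e.kind == "st")

# The five LSQ entries from the slide, oldest first.
lsq = [
    Entry("st", None, 1000),
    Entry("ld", 0x40000000),
    Entry("st", 0x50000000),
    Entry("ld", 0x50000000),
    Entry("ld", 0x30000000),
]
```

With these entries, every load is stalled until the first store completes, including the load to x30000000, which does not match any known store address in the queue.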

4
Aggressive Approach
  • Assume that loads do not conflict with earlier stores: all loads
    and stores execute out of order (Naive Speculation)
  • When there is a conflict, the load behaves like a branch
    mispredict: all subsequent instructions are squashed and re-fetched
  • Expensive: 30-cycle penalty
  • Rename checkpoints for all instructions
  • Re-execute only the dependent instructions? More complex, but
    better performance
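
A rough sketch of how a memory-order violation might be detected and recovered from under naive speculation. The function names and the use of sequence numbers for program order are assumptions made for illustration; real hardware would do this check associatively in the load queue.

```python
def find_violations(executed_loads, store_seq, store_addr):
    """Return the sequence numbers of younger loads that already executed
    to the address this store has just resolved.

    executed_loads maps load sequence number -> address it read from.
    """
    return sorted(seq for seq, addr in executed_loads.items()
                  if seq > store_seq and addr == store_addr)

def squash_from(in_flight, first_bad_seq):
    """Keep only instructions older than the violating load; the load and
    everything after it are squashed and would be re-fetched."""
    return [seq for seq in in_flight if seq < first_bad_seq]

# Loads 3 and 4 executed early; store 2 then resolves to the same address as load 3.
executed_loads = {3: 0x50000000, 4: 0x30000000}
violations = find_violations(executed_loads, store_seq=2, store_addr=0x50000000)
survivors = squash_from([0, 1, 2, 3, 4, 5], violations[0])
```

Squashing from the violating load onward mirrors branch-mispredict recovery, which is why the penalty is comparable.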

5
Ideal Model
  • In the perfect model, loads only wait for conflicting stores:
    no artificial dependences and no memory-order violations

6
False Dependences and Violations
7
Store Sets Concept
  • For every load, keep track of all stores that it has conflicted
    with in the past
  • A load does not issue if members of its store set have not
    finished (dependences are introduced at the time of dispatch)
  • The implementation is easy if:
    • a load depends on only one store
    • a store is present in only one store set
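
The concept can be sketched as an unbounded table keyed by load PC. This is an idealized illustration only (no size limits, exact PCs); the class and method names are hypothetical.

```python
from collections import defaultdict

class IdealStoreSets:
    """Unbounded store sets: one set of store PCs per load PC."""

    def __init__(self):
        self.store_sets = defaultdict(set)   # load PC -> {store PCs}

    def record_violation(self, load_pc, store_pc):
        # A past conflict adds the store to the load's store set.
        self.store_sets[load_pc].add(store_pc)

    def must_wait(self, load_pc, unfinished_store_pcs):
        # The load waits if any older, unfinished store is in its store set.
        members = self.store_sets.get(load_pc, set())
        return bool(members & set(unfinished_store_pcs))
```

A load with an empty store set never waits, so it behaves as under naive speculation until its first violation.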

8
Trivial Implementations
  • Execution time normalized to an ideal store set implementation

9
Ideal Store Set Predictor
  • An occasional memory-order violation can introduce many false
    dependences; hence, use saturating counters
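
One way saturating counters could filter out occasional violations is sketched below. The 2-bit width, the threshold of 2, and the decrement on a false dependence are assumptions for illustration, not details taken from the slides.

```python
from collections import defaultdict

class PairCounters:
    """One saturating counter per (load PC, store PC) pair."""

    def __init__(self, max_value=3, threshold=2):
        self.max_value = max_value     # 3 corresponds to a 2-bit counter
        self.threshold = threshold
        self.counters = defaultdict(int)

    def on_violation(self, load_pc, store_pc):
        key = (load_pc, store_pc)
        self.counters[key] = min(self.max_value, self.counters[key] + 1)

    def on_false_dependence(self, load_pc, store_pc):
        # Called when an enforced dependence turned out to be unnecessary.
        key = (load_pc, store_pc)
        self.counters[key] = max(0, self.counters[key] - 1)

    def predict_dependent(self, load_pc, store_pc):
        return self.counters.get((load_pc, store_pc), 0) >= self.threshold
```

Under these assumptions a single violation does not yet create a dependence, and a pair that stops conflicting eventually drops back below the threshold.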

10
Implementation Overview
  • Every ld/st depends on the last store in its set
  • Causes serialized stores and false dependences

[Figure: five stores (st)]
11
Store Set Implementation
  • Every load and store belongs to one color; keep track of the
    last writer for each color (mispredicts can pose problems)
  • Colors are merged as you discover memory-order violations
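
The color scheme can be sketched with two tables: one mapping an instruction to its color (the SSIT) and one holding the last writer per color (the LFST). The sketch below is an assumption-laden illustration: the class name, the modulo PC indexing, and the use of integer tags for in-flight stores are all hypothetical choices.

```python
class StoreSetTables:
    def __init__(self, ssit_entries=4096, lfst_entries=128):
        self.ssit = [None] * ssit_entries   # PC index -> color (SSID) or None
        self.lfst = [None] * lfst_entries   # color -> tag of last fetched store

    def _index(self, pc):
        return pc % len(self.ssit)          # assumed indexing; real designs hash PC bits

    def dispatch_store(self, pc, tag):
        """Return the tag of the older store this store must follow, if any,
        and become the last writer for its color."""
        ssid = self.ssit[self._index(pc)]
        if ssid is None:
            return None
        previous = self.lfst[ssid]
        self.lfst[ssid] = tag
        return previous

    def dispatch_load(self, pc):
        """Return the tag of the store this load must wait for, if any."""
        ssid = self.ssit[self._index(pc)]
        return None if ssid is None else self.lfst[ssid]

    def store_issued(self, pc, tag):
        """Clear the last-writer entry if this store still owns it."""
        ssid = self.ssit[self._index(pc)]
        if ssid is not None and self.lfst[ssid] == tag:
            self.lfst[ssid] = None
```

Because each store depends on the previous last writer of its color, stores of one color are serialized, which is the cost noted on the Implementation Overview slide.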

12
Store Set Merging
  • Store set merging improves performance by 12
  • Note that merging happens gradually: no need to instantly
    correct all entries in the table
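
A sketch of gradual merging on a memory-order violation. The smaller-color-wins rule is an assumed tie-break for illustration, and the function name, the dictionary-based table, and the `next_ssid` allocator are hypothetical.

```python
def merge_on_violation(ssit, load_key, store_key, next_ssid):
    """Update the colors of the violating load and store; return the next free color.

    Only the two entries involved are rewritten, so other members of the
    losing color move over later, one violation at a time.
    """
    load_color = ssit.get(load_key)
    store_color = ssit.get(store_key)
    if load_color is None and store_color is None:
        ssit[load_key] = ssit[store_key] = next_ssid   # allocate a fresh color
        return next_ssid + 1
    if load_color is None:
        ssit[load_key] = store_color
    elif store_color is None:
        ssit[store_key] = load_color
    else:
        winner = min(load_color, store_color)          # assumed tie-break rule
        ssit[load_key] = ssit[store_key] = winner
    return next_ssid

ssit = {}
n = merge_on_violation(ssit, "L1", "S1", 0)   # L1 and S1 get color 0
n = merge_on_violation(ssit, "L2", "S2", n)   # L2 and S2 get color 1
n = merge_on_violation(ssit, "L1", "S2", n)   # S2 moves to color 0; L2 stays at 1 for now
```

After the third call, L2 still carries the old color, which illustrates that the table does not need to be corrected all at once.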

13
Design Details
  • Merging store sets
  • To deal with occasional dependences and conflicts:
    • clear the table every million cycles
    • use saturating counters for each entry
  • The SSIT (Store Set ID Table) needs 4K entries and the LFST
    (Last Fetched Store Table) needs 128 entries
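
Periodic clearing can be sketched as a check on the cycle count. The function name and the list-based table are hypothetical; only the one-million-cycle interval and the 4K table size come from the slide.

```python
CLEAR_INTERVAL = 1_000_000   # cycles between clears, as stated on the slide
SSIT_ENTRIES = 4096          # 4K entries, as stated on the slide

def maybe_clear(ssit, cycle, interval=CLEAR_INTERVAL):
    """Invalidate every SSIT entry once per interval; return True if cleared."""
    if cycle > 0 and cycle % interval == 0:
        for i in range(len(ssit)):
            ssit[i] = None
        return True
    return False

ssit = [None] * SSIT_ENTRIES
ssit[16] = 5                          # a color learned from an old violation
cleared_early = maybe_clear(ssit, 999_999)
cleared_on_time = maybe_clear(ssit, 1_000_000)
```

Clearing discards stale colors cheaply, at the cost of relearning the dependences that still matter.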

14
Results
15
Related Work
  • Store barrier cache: identify stores that are likely to pose
    conflicts
  • Keep track of all store-load conflict pairs and associatively
    check for dependences while dispatching instructions

16
Next Week's Paper
  • Effective Hardware-Based Prefetching for High-Performance
    Microprocessors, T.F. Chen and J.L. Baer, IEEE Transactions on
    Computers, May 1995
