Hardware-Only Stream Prefetching and Dynamic Access Ordering - PowerPoint PPT Presentation

1
Hardware-Only Stream Prefetching and Dynamic Access Ordering
  • Charles Zhang and Sally A. McKee

2
Motivation
  • Memory system bottleneck
  • Streamed computations
  • Poor cache behavior
  • Good regularity
  • How to make stream computations faster?

3
Stream Prefetching and Dynamic Access Ordering
  • Stream detection
  • Stream prefetching
  • Dynamic access ordering
  • Can DAO improve performance w/o pattern info from software?
  • How much performance difference is possible?

4
Implementation Choices
  • Prefetching
      • Next-line vs stride vs pointer-based
      • Cache vs stream buffers
      • Fixed vs adaptive distances
  • Access ordering
      • With vs without prefetching
      • Optimality vs implementability
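
To make the next-line vs stride choice and the notion of a prefetch distance concrete, here is a minimal Python sketch. It is not the design from the talk: the table layout, the two-access confirmation rule, the 32-byte line size, and all names (StridePrefetcher, RptEntry, next_line_prefetch) are assumptions made for illustration. A real reference prediction table typically uses a richer per-entry state machine.

```python
from dataclasses import dataclass

LINE_SIZE = 32  # bytes per cache line (assumed for illustration)


@dataclass
class RptEntry:
    last_addr: int           # address of the most recent access by this load
    stride: int = 0          # last observed address delta
    confirmed: bool = False  # True once the same nonzero stride repeats


class StridePrefetcher:
    """Toy stride detector indexed by load PC (hypothetical, simplified)."""

    def __init__(self, distance=1):
        self.distance = distance  # fixed distance: strides to run ahead
        self.table = {}           # pc -> RptEntry

    def access(self, pc, addr):
        """Record one load and return a prefetch address, or None."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = RptEntry(last_addr=addr)
            return None
        stride = addr - entry.last_addr
        # Predict only when this stride matches the previous one.
        entry.confirmed = stride != 0 and stride == entry.stride
        entry.stride = stride
        entry.last_addr = addr
        if entry.confirmed:
            return addr + self.distance * stride
        return None


def next_line_prefetch(addr):
    """Next-line policy: start address of the line after the one touched."""
    return (addr // LINE_SIZE + 1) * LINE_SIZE
```

With distance=2, a load at one PC touching addresses 1000, 1040, 1080 yields no prediction for the first two accesses and a prefetch of address 1160 on the third. An adaptive scheme would adjust the distance at run time instead of fixing it in the constructor.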

5
System Model
[Block diagram. Components shown: CPU, IL1, DL1, L2, RPT prefetcher, bus, reordering memory controller, Direct RDRAMs]

6
Reordering Algorithm
  • AEAP (As Early As Possible)
  • Maintains next-issue candidate
  • Chooses between new request and candidate
  • Recomputes candidate w/ each issue
  • Consistency
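
The bullets above can be read as a greedy scheduling loop, sketched below in Python. The slide gives only an outline, so everything beyond it is an assumption: the one-open-row-per-bank timing model, the constants ROW_MISS_PENALTY and BANK_BUSY, the same-address rule used for consistency, and the names AeapController and Request are all hypothetical and do not model real Direct RDRAM timing.

```python
from dataclasses import dataclass

ROW_MISS_PENALTY = 4  # extra cycles to open a different row (assumed)
BANK_BUSY = 2         # cycles a bank stays busy after an issue (assumed)


@dataclass(frozen=True)
class Request:
    seq: int              # arrival order, also the tie-breaker
    addr: int
    bank: int
    row: int
    is_write: bool = False


class AeapController:
    """Toy 'as early as possible' reordering queue (hypothetical model)."""

    def __init__(self):
        self.pending = []
        self.open_row = {}     # bank -> currently open row
        self.bank_free = {}    # bank -> first cycle the bank is free
        self.candidate = None  # next-issue candidate

    def earliest_issue(self, req):
        free = self.bank_free.get(req.bank, 0)
        hit = self.open_row.get(req.bank) == req.row
        return free if hit else free + ROW_MISS_PENALTY

    def _blocked(self, req):
        # Consistency: never pass an older request to the same address
        # when either of the two is a write.
        return any(o.seq < req.seq and o.addr == req.addr
                   and (o.is_write or req.is_write) for o in self.pending)

    def _key(self, req):
        return (self.earliest_issue(req), req.seq)

    def add(self, req):
        """Compare a newly arrived request against the current candidate."""
        self.pending.append(req)
        if self._blocked(req):
            return
        if self.candidate is None or self._key(req) < self._key(self.candidate):
            self.candidate = req

    def issue(self, now):
        """Issue the candidate; return (request, cycle), or None if idle."""
        if self.candidate is None:
            return None
        req = self.candidate
        cycle = max(now, self.earliest_issue(req))
        self.pending.remove(req)
        self.open_row[req.bank] = req.row
        self.bank_free[req.bank] = cycle + BANK_BUSY
        # Bank state changed, so recompute the candidate from scratch.
        ready = [r for r in self.pending if not self._blocked(r)]
        self.candidate = min(ready, key=self._key, default=None)
        return req, cycle
```

In this toy model, three reads to one bank that arrive as row 0, row 1, row 0 are issued as row 0, row 0, row 1: once the first request has opened row 0, the later row-0 request can issue earlier than the row-1 request and is chosen as the candidate.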

7
Experimental Setup
  • SimpleScalar
  • Direct Rambus DRAMs
  • Benchmark suites
      • SPEC95 int and fp
      • Microbenchmarks (Hong et al., HPCA '99)
      • Pointer benchmarks (Austin et al., PLDI '94)

8
Results for Hydro2d
[Plot. Axes: speedup, prefetch distance]

9
Results for Copy (stride 10)
[Plot. Axes: speedup, prefetch distance]

10
Results for Anagram
[Plot. Axes: speedup, prefetch distance]

11
Conclusion
  • Hardware-only access ordering can deliver nontrivial speedups
  • Prefetching and access ordering benefit each other

12
Future Work
  • Comparison w/ non-spatial locality cache prefetching
  • Comparison w/ software and with other hardware approaches
  • Exploring performance for other DRAMs