Lecture 25: Wrap-Up

Slides: 20
Provided by: rajeevbala
Learn more at: https://my.eng.utah.edu

Transcript and Presenter's Notes


1
Lecture 25: Wrap-Up
  • Mid-term II stats
  • High: 91
  • Mean: 73.12
  • Qs 1-3: half the class got 25/25
  • Q 4: only one student got 25/25; almost no one
    mentioned that we'll need a mechanism to
    determine exclusivity
  • Q 5: highest was 22/30; very few mentioned that
    allowing blocks to move would complicate search

2-7
Example Solutions
8
[Figure: eight cores, CPU 0 through CPU 7, each labeled with its own L1D and L1I cache]
9
Tetris?!
10
Non-Uniform Cache Access (NUCA)
  • Many open problems in NUCA and D-NUCA
  • How should search happen?
  • Allocation/replacement/migration policies
  • Managing bandwidth/latency on the network
  • Prefetch mechanisms
  • Selective replication of blocks
  • Efficient write-throughs
  • Power/performance trade-offs
  • P.S. We have simulators, etc., to help model
    such caches, in case anyone is interested
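
To make the search question on this slide concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration and not taken from the lecture: banks sit in a line with latency growing with distance from the core, a static mapping lets the address pick the bank, and a dynamic mapping lets a block live in any bank, so it must be searched for nearest-first. The names and latency constants are hypothetical.

```python
# Hypothetical model: banks in a line; bank i is i hops from the core.
BANK_LATENCY = 3   # assumed cycles to access any one bank
HOP_LATENCY = 1    # assumed cycles per network hop, each way


def bank_access_cycles(bank):
    """Round-trip cycles to probe one bank at the given distance."""
    return BANK_LATENCY + 2 * HOP_LATENCY * bank


def static_lookup(block_addr, num_banks):
    """Static mapping: the address alone selects the bank, so no search."""
    bank = block_addr % num_banks
    return bank, bank_access_cycles(bank)


def incremental_search(block_addr, placement, num_banks):
    """Dynamic mapping: probe banks nearest-first until the block is found.

    `placement` maps block address -> bank currently holding the block.
    Returns (bank, cycles), with bank set to None on a miss.
    """
    cycles = 0
    for bank in range(num_banks):
        cycles += bank_access_cycles(bank)
        if placement.get(block_addr) == bank:
            return bank, cycles
    return None, cycles


def promote(block_addr, placement):
    """Migration on a hit: move the block one bank closer to the core.

    Simplification: ignores whatever block it would displace.
    """
    if placement[block_addr] > 0:
        placement[block_addr] -= 1


placement = {0x40: 3}
print(static_lookup(0x43, 4))                   # (3, 9)
print(incremental_search(0x40, placement, 4))   # (3, 24)
promote(0x40, placement)
print(incremental_search(0x40, placement, 4))   # (2, 15)
```

In this toy model, migration shortens later searches for a hot block, but every lookup now pays for the banks probed before the hit, which is the sense in which letting blocks move complicates search.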

11
Shameless Plug
  • CS 7810: Advanced Architecture
  • Lectures based on seminal (and still relevant)
    papers
  • Not much work, apart from the class project (in
    teams)
  • The class project can involve as little as one
    week's worth of concentrated effort
  • or enough to get a paper out of it
  • You WILL work on novel problems
  • Lots of help from me and other students with the
    simulator

12
3-D
  • Imagine a similar problem in 3D

[Figure: twelve tiles, each labeled C and P]
13
3-D
  • Imagine a similar problem in 3D

[Figure: twelve tiles, each labeled C and P]
Must schedule threads to manage temperature
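
One simple reading of "schedule threads to manage temperature" is a greedy policy that places the most power-hungry thread on the currently coolest core. The sketch below illustrates that idea only; it is not the lecture's method, and the function name, temperature readings, and power estimates are all hypothetical.

```python
def assign_threads(core_temps, thread_powers):
    """Greedy placement: most power-hungry thread goes to the coolest core.

    core_temps: dict core_id -> current temperature (arbitrary units)
    thread_powers: dict thread_id -> estimated power (arbitrary units)
    Returns dict thread_id -> core_id. Assumes at most one thread per core.
    """
    if len(thread_powers) > len(core_temps):
        raise ValueError("more threads than cores")
    coolest_first = sorted(core_temps, key=lambda c: core_temps[c])
    hungriest_first = sorted(
        thread_powers, key=lambda t: thread_powers[t], reverse=True
    )
    return dict(zip(hungriest_first, coolest_first))


temps = {"c0": 80, "c1": 60, "c2": 70}
powers = {"t0": 5, "t1": 20}
print(assign_threads(temps, powers))  # {'t1': 'c1', 't0': 'c2'}
```

A policy for a real 3-D stack would also have to account for heat from vertically adjacent tiles, which this flat sketch ignores.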
14
Single Thread Performance
  • To improve single-thread performance, can even
    schedule a single thread's instructions across
    cores
  • A large window of in-flight instructions is
    needed to mine high ILP
  • This requires high levels of speculation
    (power-hungry!); any solutions?

[Figure: twelve tiles, each labeled C and P]
15
Heterogeneous CMPs (Alpha EVx and Cell)
[Figure: one core labeled in-o (in-order) and two labeled o-o-o (out-of-order)]
16
NASCAR Applied to CPUs !?!
Source: Eric Rotenberg (NCSU)
17
Runahead Execution
Single thread in a baseline architecture
Single thread executing in tandem with a helper thread
18
Reliability
[Figure: two SMT cores (SMT core 1 and SMT core 2) with thread labels P1, C2, P2, C1; annotations "For power" and "For performance"]