RTR: 1 Byte/Kilo-Instruction Race Recording

About This Presentation

Slides: 24

Provided by: min9150

Learn more at: https://research.cs.wisc.edu

Transcript and Presenter's Notes

1
RTR: 1 Byte/Kilo-Instruction Race Recording

Min Xu

Rastislav Bodik
Mark D. Hill
2
Why Do You Need a Recorder?

gcc sim.c
a.out
Segmentation fault

gdb a.out
gdb> run
Program received SIGSEGV.
In get() at hash.c:45
45 a = bucket->d

gcc para-sim.c
a.out
Segmentation fault

gdb a.out
gdb> run
Program exited normally.
gdb>

gcc para-sim.c
a.out
Segmentation fault
Race recorded in log

gdb a.out log
gdb> run
Program received SIGSEGV.
In get() at para-hash.c:67
67 a = bucket->d
3
Ideally
Long recording with a small log
Low runtime overhead
Low cost
gcc para-sim.c
a.out
Segmentation fault
Race recorded in log

gdb a.out log
gdb> run
Program received SIGSEGV.
In get() at para-hash.c:67
67 a = bucket->d
4
Better and Better Recorders
5
A New Recorder
1 Byte/Kilo-Instruction (ASPLOS '06)

This talk covers only RTR
Regulated Transitive Reduction algorithm

Result: One more step toward practical
6
Outline
Race Recording
RTR Algorithm
Compress log during recording → replay more regularly
Results with Commercial Workloads
Conclusion
7
Technically, what's race recording?
8
Race Recording
Thread I: X = 1; X++; print(X)
Thread J: X = X*5
Original vs. Replay: Thread J's update lands at a different point, so print(X) shows X=6 in one run and X=10 in the other
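The two outcomes on this slide can be reproduced with a small simulation. This is only a sketch: it assumes the slide's garbled statements read X = 1; X++; print(X) in Thread I and X = X*5 in Thread J, which is the reading consistent with the two values shown (X=6 and X=10). The step labels and the `run` helper are hypothetical names introduced for illustration.

```python
def run(schedule):
    """Execute one interleaving and return what Thread I's print(X) shows."""
    x = None
    printed = None
    for step in schedule:
        if step == "I:X=1":
            x = 1
        elif step == "I:X++":
            x += 1
        elif step == "J:X=X*5":
            x *= 5
        elif step == "I:print":
            printed = x
    return printed


# Thread J's update lands after X++ in one run and before it in the other.
update_after_increment = ["I:X=1", "I:X++", "J:X=X*5", "I:print"]
update_before_increment = ["I:X=1", "J:X=X*5", "I:X++", "I:print"]

print(run(update_after_increment))   # (1 + 1) * 5
print(run(update_before_increment))  # (1 * 5) + 1
```

Unless the order of the conflicting accesses to X is recorded, a replay is free to pick either schedule, which is the problem a race recorder addresses.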
9
Terminologies and Assumptions
Dependence (black)
Conflicts (red)
[Figure: Thread I and Thread J instruction streams (ld A, add, st B, st C, ld B, ld D, st A, sub, st D), shown once for Recording with a Log and once for Replay]
Goal: Reproduce the same conflicts with minimum log data
10
Regulated Transitive Reduction (RTR)
11
Log All Conflicts
[Figure: Thread I and Thread J instruction streams (ld A, add, st B, st C, st C, ld B, st A, ld D, sub, st C, ld B, st D), with every conflict logged for Replay]
But too many conflicts
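A minimal sketch of what "log all conflicts" could look like, assuming a hypothetical trace of `(thread, instruction_count, op, address)` tuples given in global execution order. Two accesses conflict when they come from different threads, touch the same address, and at least one is a store. The function name and trace format are assumptions for illustration, not the recorder described in the talk.

```python
def log_all_conflicts(trace):
    """Return one (source, destination) dependence per observed conflict."""
    last_writer = {}  # address -> (thread, ic) of the most recent store
    readers = {}      # address -> loads seen since that store
    log = []
    for thread, ic, op, addr in trace:
        here = (thread, ic)
        writer = last_writer.get(addr)
        if op == "ld":
            # Read-after-write conflict with another thread's store.
            if writer and writer[0] != thread:
                log.append((writer, here))
            readers.setdefault(addr, []).append(here)
        elif op == "st":
            # Write-after-write conflict.
            if writer and writer[0] != thread:
                log.append((writer, here))
            # Write-after-read conflicts.
            for reader in readers.get(addr, []):
                if reader[0] != thread:
                    log.append((reader, here))
            last_writer[addr] = here
            readers[addr] = []
    return log


# Hypothetical interleaving, not the one drawn on the slide.
trace = [
    ("I", 1, "st", "B"),
    ("J", 1, "ld", "B"),
    ("I", 2, "st", "C"),
    ("J", 2, "st", "C"),
    ("I", 3, "ld", "C"),
]
conflicts = log_all_conflicts(trace)
print(len(conflicts))
```

Every cross-thread conflict produces a log entry here, so the log grows with the amount of sharing, which is the cost the slide points out.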
12
Netzer's Transitive Reduction (TR)
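[Figure: Thread I and Thread J instruction streams (ld A, add, st B, st C, st C, ld B, st A, ld D, sub, st C, ld B, st D), each numbered 1-6; dependences implied by others are marked "TR reduced" and are not needed for Replay]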
How to further reduce log size?
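The idea behind transitive reduction can be sketched with vector clocks: a dependence is skipped when earlier logged dependences plus program order already imply it. This is a simplified illustration, not necessarily Netzer's exact formulation; the tuple format `(src_thread, src_ic, dst_thread, dst_ic)`, the function name, and the assumption that dependences arrive in global execution order are all hypothetical.

```python
from bisect import bisect_right


def transitive_reduction(deps, num_threads):
    """Keep only the dependences not implied by already-logged ones."""
    # vc[t][s]: latest instruction count of thread s known to precede thread t.
    vc = [[0] * num_threads for _ in range(num_threads)]
    # Per-thread history of (ic, clock) so a source's clock can be looked up
    # as of the source instruction, not as of "now".
    snaps = [[(0, [0] * num_threads)] for _ in range(num_threads)]
    logged = []
    for src_t, src_ic, dst_t, dst_ic in deps:
        if vc[dst_t][src_t] >= src_ic:
            continue  # already implied transitively
        logged.append((src_t, src_ic, dst_t, dst_ic))
        ics = [ic for ic, _ in snaps[src_t]]
        _, clock = snaps[src_t][bisect_right(ics, src_ic) - 1]
        src_clock = list(clock)
        src_clock[src_t] = src_ic
        vc[dst_t] = [max(a, b) for a, b in zip(vc[dst_t], src_clock)]
        snaps[dst_t].append((dst_ic, list(vc[dst_t])))
    return logged


# Threads 0, 1, 2; the third dependence is implied by the first two.
deps = [(0, 2, 1, 3), (1, 4, 2, 1), (0, 1, 2, 2), (0, 3, 1, 6)]
kept = transitive_reduction(deps, 3)
print(kept)
```

In this example, thread 0's instruction 1 already precedes thread 2's instruction 2 through the chain 0:2 → 1:3 and 1:4 → 2:1, so that dependence is dropped from the log.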
13
The Intuition of the RTR Algorithm
After Reduction
14
Stricter Dependences to Aid Vectorization
[Figure: Thread I and Thread J instruction streams (ld A, add, st B, st C, st C, ld B, st A, ld D), each numbered 1-4, with stricter dependences drawn for Replay]
Fewer dependencies to log
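One way to read "stricter" is sketched below, under the assumption that a dependence between a fixed pair of threads is written as `(src_ic, dst_ic)` and that a dependence is stricter when its source is no earlier and its destination is no later. Enforcing the stricter one then also enforces the original through program order. Both helper names are hypothetical, and the slides do not spell out how RTR chooses its stricter dependences.

```python
def is_stricter(strict, orig):
    """True if enforcing `strict` also enforces `orig` via program order."""
    (s_src, s_dst), (o_src, o_dst) = strict, orig
    return s_src >= o_src and s_dst <= o_dst


def covering_dependence(deps):
    """A single dependence at least as strict as every one in `deps`."""
    return (max(src for src, _ in deps), min(dst for _, dst in deps))


original = [(1, 4), (2, 5), (3, 6)]
cover = covering_dependence(original)
print(cover)
print(all(is_stricter(cover, dep) for dep in original))
```

A stricter dependence constrains the replay more than the original execution required, so it trades some replay freedom for a smaller log.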
15
Compress Vectorized Dependencies
[Figure: Thread I and Thread J instruction streams (ld A, add, st B, st C, st C, ld B, st A, ld D, sub, st C, ld B, st D), each numbered 1-6, with the vectorized dependences used for Replay]
TR → RTR: fewer deps, fewer bytes/dep
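To illustrate why regular dependences cost fewer bytes each, here is a sketch that folds runs of dependences with a constant stride into one record. The record layout `(src_ic, dst_ic, src_stride, dst_stride, count)` and both function names are hypothetical; the slides do not give RTR's actual log format.

```python
def compress_strided(deps):
    """Fold consecutive (src_ic, dst_ic) pairs with constant strides."""
    runs = []
    i = 0
    while i < len(deps):
        src0, dst0 = deps[i]
        count, src_stride, dst_stride = 1, 0, 0
        if i + 1 < len(deps):
            src_stride = deps[i + 1][0] - src0
            dst_stride = deps[i + 1][1] - dst0
            count = 2
            # Extend the run while both strides stay the same.
            while (i + count < len(deps)
                   and deps[i + count][0] - deps[i + count - 1][0] == src_stride
                   and deps[i + count][1] - deps[i + count - 1][1] == dst_stride):
                count += 1
        runs.append((src0, dst0, src_stride, dst_stride, count))
        i += count
    return runs


def expand(runs):
    """Inverse of compress_strided, used to check that nothing is lost."""
    return [(src0 + k * ss, dst0 + k * ds)
            for src0, dst0, ss, ds, count in runs
            for k in range(count)]


deps = [(1, 2), (2, 3), (3, 4), (4, 5), (10, 20)]
runs = compress_strided(deps)
print(runs)
```

Five dependences become two records here; the more regular the dependences, the longer each run and the fewer bytes spent per dependence.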
16
Deadlock Avoidance of RTR
[Figure: Thread I and Thread J instruction streams (ld A, add, st B, st C, st C, ld B, st A, ld D, sub, st C, ld B, st D), each numbered 1-6, shown during Recording]
Limit the strict dependencies (see paper)
17
Results with Commercial Workloads
18
Full-system Simulation Method