1
Software Transactions: A Programming-Languages Perspective
  • Dan Grossman
  • University of Washington
  • 27 March 2008

2
Atomic
  • An easier-to-use and harder-to-implement primitive

void deposit(int x) {                     // lock acquire/release
  synchronized(this) { int tmp = balance; tmp += x; balance = tmp; }
}

void deposit(int x) {                     // (behave as if) no interleaved computation
  atomic { int tmp = balance; tmp += x; balance = tmp; }
}
3
Viewpoints
  • Software transactions good for:
  • Software engineering (avoid races and deadlocks)
  • Performance (optimistic "no conflict" without
    locks)
  • Research should be guiding:
  • New hardware with transactional support
  • Software support
  • Semantic mismatch between language and hardware
  • Prediction: hardware for the common/simple case
  • May be fast enough without hardware
  • Lots of nontransactional hardware exists

4
PL Perspective
  • Complementary to lower-level implementation work
  • Motivation:
  • The essence of the advantage over locks
  • Language design:
  • Rigorous high-level semantics
  • Interaction with rest of the language
  • Language implementation:
  • Interaction with modern compilers
  • New optimization needs
  • Answers urgently needed for the multicore era

5
Today, part 1
  • Language design, semantics
  • Motivation/Example: the GC analogy [OOPSLA07]
  • Semantics: strong vs. weak isolation [PLDI07]
    [POPL08]
  • Interaction w/ other features [ICFP05] [SCHEME07]
    [POPL08]
  • Joint work with Intel PSL

6
Today, part 2
  • Implementation
  • On one core [ICFP05] [SCHEME07]
  • Static optimizations for strong isolation
    [PLDI07]
  • Multithreaded transactions
  • Joint work with Intel PSL

7
Code evolution
void deposit()  { synchronized(this) { … } }
void withdraw() { synchronized(this) { … } }
int balance()   { synchronized(this) { … } }
8
Code evolution
void deposit()  { synchronized(this) { … } }
void withdraw() { synchronized(this) { … } }
int balance()   { synchronized(this) { … } }

void transfer(Acct from, int amt) {
  if(from.balance() > amt && amt < maxXfer) {
    from.withdraw(amt);
    this.deposit(amt);
  }
}

9
Code evolution
void deposit()  { synchronized(this) { … } }
void withdraw() { synchronized(this) { … } }
int balance()   { synchronized(this) { … } }

void transfer(Acct from, int amt) {
  synchronized(this) { //race
    if(from.balance() > amt && amt < maxXfer) {
      from.withdraw(amt);
      this.deposit(amt);
    }
  }
}

10
Code evolution
void deposit()  { synchronized(this) { … } }
void withdraw() { synchronized(this) { … } }
int balance()   { synchronized(this) { … } }

void transfer(Acct from, int amt) {
  synchronized(this) {
    synchronized(from) { //deadlock (still)
      if(from.balance() > amt && amt < maxXfer) {
        from.withdraw(amt);
        this.deposit(amt);
      }
    }
  }
}
11
Code evolution
void deposit()  { atomic { … } }
void withdraw() { atomic { … } }
int balance()   { atomic { … } }
12
Code evolution
void deposit()  { atomic { … } }
void withdraw() { atomic { … } }
int balance()   { atomic { … } }

void transfer(Acct from, int amt) {
  //race
  if(from.balance() > amt && amt < maxXfer) {
    from.withdraw(amt);
    this.deposit(amt);
  }
}
13
Code evolution
void deposit()  { atomic { … } }
void withdraw() { atomic { … } }
int balance()   { atomic { … } }

void transfer(Acct from, int amt) {
  atomic { //correct and parallelism-preserving!
    if(from.balance() > amt && amt < maxXfer) {
      from.withdraw(amt);
      this.deposit(amt);
    }
  }
}

14
But can we generalize
  • So transactions sure look appealing
  • But what is the essence of the benefit?

Transactional Memory (TM) is to shared-memory
concurrency as Garbage Collection (GC) is to
memory management
15
GC in 60 seconds
  • Allocate objects in the heap
  • Deallocate objects to reuse heap space
  • If too soon, dangling-pointer dereferences
  • If too late, poor performance / space exhaustion
  • Automate deallocation via reachability
    approximation

16
GC Bottom-line
  • Established technology with widely accepted
    benefits
  • Even though it can perform arbitrarily badly in
    theory
  • Even though you can't always ignore how GC works
    (at a high level)
  • Even though it is still an active research area after 40
    years
  • Now, about that analogy…

17
The problem, part 1
  • Why memory management is hard
  • Balance correctness (avoid dangling pointers)
  • And performance (space waste or exhaustion)
  • Manual approaches require whole-program protocols
  • Example: a manual reference count for each object
  • Must avoid garbage cycles

18
The problem, part 2
  • Manual memory management is non-modular
  • Caller and callee must know what the other
    accesses or deallocates to ensure the right memory is
    live
  • A small change can require wide-scale code
    changes
  • Correctness requires knowing what data subsequent
    computation will access

19
The solution
  • Move whole-program protocol to language
    implementation
  • One-size-fits-most implemented by experts
  • Usually inside the compiler and run-time
  • GC system uses subtle invariants, e.g.
  • Object header-word bits
  • No unknown mature pointers to nursery objects

20
So far
21
Incomplete solution
  • GC is a bad idea when "reachable" is a bad
    approximation of "cannot be deallocated"
  • Weak pointers overcome this fundamental
    limitation
  • Best used by experts for well-recognized idioms
    (e.g., software caches)
  • In the extreme, programmers can encode
    manual memory management on top of GC
  • Destroys most of GC's advantages

22
Circumventing TM
class SpinLock {
  private boolean b = false;
  void acquire() {
    while(true)
      atomic { if(b) continue; b = true; return; }
  }
  void release() { atomic { b = false; } }
}
23
It really keeps going (see the essay)
24
Lesson
  • Transactional memory is to
  • shared-memory concurrency
  • as
  • garbage collection is to
  • memory management
  • Huge but incomplete help for correct, efficient
    software
  • Analogy should help guide transactions research

25
Today, part 1
  • Language design, semantics
  • Motivation/Example: the GC analogy [OOPSLA07]
  • Semantics: strong vs. weak isolation [PLDI07]
    [POPL08], with Katherine Moore
  • Interaction w/ other features [ICFP05] [SCHEME07]
    [POPL08]
  • Joint work with Intel PSL

26
Weak isolation
initially y = 0

Thread 1: atomic { y = 1; x = 3; y = x; }
Thread 2: x = 2; print(y); // 1? 2? 666?

  • Widespread misconception:
  • "Weak isolation violates the all-at-once
    property only if the corresponding lock code has a
    race"
  • (May still be a bad thing, but smart people
    disagree.)
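In this particular example the corresponding lock-based code does have a data race, so the surprising output can be pinned on the program itself; the sketch below (using a hypothetical global lock lk) makes that concrete. The privatization example on the next slides is worse: there the lock code is race-free and weak isolation still misbehaves.

Thread 1: synchronized(lk) { y = 1; x = 3; y = x; }
Thread 2: x = 2; print(y);   // unsynchronized accesses to x and y: a data race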

27
It's worse
  • Privatization: one of several examples where lock
    code works and weak-isolation transactions do not

initially ptr.f == ptr.g

Thread 1:
sync(lk) { r = ptr; ptr = new C(); }
assert(r.f == r.g);

Thread 2:
sync(lk) { ptr.f++; ptr.g++; }

[Diagram: ptr points to an object with fields f and g]

(Example adapted from Rajwar/Larus and Hudson et al.)
28
It's worse
  • (Almost?) every published weak-isolation system
    lets the assertion fail!
  • Eager-update or lazy-update

initially ptr.f == ptr.g

Thread 1:
atomic { r = ptr; ptr = new C(); }
assert(r.f == r.g);

Thread 2:
atomic { ptr.f++; ptr.g++; }
29
The need for semantics
  • Which is wrong: the privatization code or the
    transactions implementation?
  • What other gotchas exist?
  • What language/coding restrictions suffice to
    avoid them?
  • Can programmers correctly use transactions
    without understanding their implementation?
  • What makes an implementation correct?
  • Only a rigorous source-level semantics can answer
    these questions

30
What we did
  • Formal operational semantics for a collection of
    similar languages that have different isolation
    properties
  • Program state allows at most one live
    transaction
  • a; H; e1, …, en  →  a′; H′; e1′, …, en′
  • Multiple languages, including

31
What we did
  • Formal operational semantics for a collection of
    similar languages that have different isolation
    properties
  • Program state allows at most one live
    transaction
  • a; H; e1, …, en  →  a′; H′; e1′, …, en′
  • Multiple languages, including
  • 1. Strong: If one thread is in a transaction,
    no other thread may use shared memory or enter a
    transaction

32
What we did
  • Formal operational semantics for a collection of
    similar languages that have different isolation
    properties
  • Program state allows at most one live
    transaction
  • a; H; e1, …, en  →  a′; H′; e1′, …, en′
  • Multiple languages, including
  • 2. Weak-1-lock: If one thread is in a
    transaction, no other thread may enter a
    transaction

33
What we did
  • Formal operational semantics for a collection of
    similar languages that have different isolation
    properties
  • Program state allows at most one live
    transaction
  • a; H; e1, …, en  →  a′; H′; e1′, …, en′
  • Multiple languages, including
  • 3. Weak-undo: Like weak, plus a transaction may
    abort at any point, undoing its changes and
    restarting

34
A family
  • Now we have a family of languages:
  • Strong: other threads can't use memory
    or start transactions
  • Weak-1-lock: other threads can't start
    transactions
  • Weak-undo: like weak, plus undo/restart
  • So we can study how family members differ and
    conditions under which they are the same
  • Oh, and we have a kooky, ooky name

The AtomsFamily
35
Easy Theorems
  • Theorem
  • Every program behavior in strong is
  • possible in weak-1-lock
  • Theorem
  • weak-1-lock allows behaviors strong does not
  • Theorem
  • Every program behavior in weak-1-lock is
  • possible in weak-undo
  • Theorem (slightly more surprising)
  • weak-undo allows behavior weak-1-lock does not

36
Hard theorems
  • Consider a (formally defined) type system that
    ensures any mutable memory is either
  • Only accessed in transactions, or
  • Only accessed outside transactions
    (a sketch of this discipline follows below)
  • Theorem: If a program type-checks, it has the
    same possible behaviors under strong and
    weak-1-lock
  • Theorem: If a program type-checks, it has the
    same possible behaviors under weak-1-lock and
    weak-undo
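As an illustration (ours, not from the papers; the class and field names are invented), here is what the enforced discipline looks like in the deck's Java-like pseudocode: every mutable field is used either only inside atomic blocks or only outside them.

class Account {
  int balance;      // mutable, accessed only inside atomic blocks
  int auditCount;   // mutable, accessed only outside atomic blocks

  void deposit(int x) {
    atomic { balance += x; }   // transactional accesses only
    auditCount++;              // non-transactional accesses only
  }

  // Rejected by such a type system: balance would be accessed
  // both inside and outside a transaction.
  // int peek() { return balance; }
}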

37
A few months in 1 picture
[Diagram: relationships among the languages strong, strong-undo, weak-1-lock, and weak-undo]
38
Lesson
  • Weak isolation has surprising behavior;
  • formal semantics lets us model the behavior and
  • prove sufficient conditions for avoiding it
  • In other words: with a (too) restrictive type
    system, we get the semantics of strong and the
    performance of weak

39
Today, part 1
  • Language design, semantics
  • Motivation/Example: the GC analogy [OOPSLA07]
  • Semantics: strong vs. weak isolation [PLDI07]
    [POPL08]
  • Interaction w/ other features [ICFP05] [SCHEME07]
    [POPL08]
  • Joint work with Intel PSL

40
What if
  • Real languages need precise semantics for all
    feature interactions. For example:
  • Native calls [Ringenburg]
  • Exceptions [Ringenburg, Kimball]
  • First-class continuations [Kimball]
  • Thread creation [Moore]
  • Java-style class-loading [Hindman]
  • Open: bad interactions with the memory-consistency
    model
  • See joint work with Manson and Pugh [MSPC06]

41
One cool ML thing
  • To the front-end, atomic is just a first-class
    function
  • So yes, you can pass it around
  • Like every other function, it has two run-time
    versions
  • For outside of a transaction (start one)
  • For inside of a transaction (just call the thunk)

Thread.atomic : (unit -> 'a) -> 'a
42
Today, part 2
  • Implementation
  • On one core [ICFP05] [SCHEME07], with Michael
    Ringenburg, Aaron Kimball
  • Static optimizations for strong isolation
    [PLDI07]
  • Multithreaded transactions
  • Joint work with Intel PSL

43
Interleaved execution
  • The uniprocessor (and then some) assumption
  • Threads communicating via shared memory don't
    truly execute in parallel
  • Important special case
  • Uniprocessors still exist
  • Many language implementations assume it
    (e.g., OCaml, Scheme48)
  • Multicore may assign one core to an application

44
Implementing atomic
  • Key pieces:
  • Execution of an atomic block logs writes (see the
    sketch below)
  • If the scheduler pre-empts a thread during atomic,
    roll back the thread
  • Duplicate code so non-atomic code is not slowed
    by logging
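A minimal Java sketch of the first two pieces (our illustration; AtomCaml itself is implemented inside the OCaml runtime and differs in detail): each transactional write first records how to restore the old value, and preemption during the atomic block triggers a rollback.

import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of an undo log for a uniprocessor STM (names are ours).
final class UndoLog {
    private final Deque<Runnable> undo = new ArrayDeque<>();

    // Called by the transactional version of the code before each write.
    void logWrite(Runnable restoreOldValue) {
        undo.push(restoreOldValue);
    }

    // Called if the scheduler preempts the transaction: undo writes in reverse order.
    void rollback() {
        while (!undo.isEmpty()) undo.pop().run();
    }

    // Called on commit: the old values are no longer needed.
    void commit() {
        undo.clear();
    }
}

// Hypothetical use inside a transactional deposit():
//   int old = balance;
//   log.logWrite(() -> balance = old);   // record the undo action first
//   balance = old + x;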

45
Logging efficiency
  • Keep the log small:
  • Don't log reads (the key uniprocessor advantage)
  • Need not log memory allocated after atomic was
    entered
  • Particularly initialization writes
  • Need not log an address more than once
  • To keep logging fast, switch from an array to a
    hashtable after many (~50) log entries (see the
    sketch below)
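A sketch of that array-then-hashtable idea (our illustration; the threshold and names are invented, and a real log also records old values): the duplicate-address check is a cheap linear scan while the log is small, and a hash set takes over once it grows.

import java.util.ArrayList;
import java.util.HashSet;

// Sketch: set of addresses already logged in the current transaction.
final class AddressLog {
    private static final int SWITCH_THRESHOLD = 50;   // illustrative value
    private final ArrayList<Long> entries = new ArrayList<>();
    private HashSet<Long> index = null;                // built after the threshold

    // Returns true if this address has not been logged yet in this transaction.
    boolean add(long address) {
        if (index != null) {                 // large transaction: hash lookup
            if (!index.add(address)) return false;
            entries.add(address);
            return true;
        }
        for (long logged : entries)          // small transaction: linear scan
            if (logged == address) return false;
        entries.add(address);
        if (entries.size() > SWITCH_THRESHOLD)
            index = new HashSet<>(entries);  // switch data structures once
        return true;
    }
}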

46
Representing closures/objects
  • Representation of closures: an interesting
    (and pervasive) design decision
  • OCaml:

[Diagram: an OCaml closure is a heap record (header, code pointer, free
variables); the code pointer refers to the compiled code ("add 3, push, …")]
47
Representing closures/objects
  • Representation of closures: an interesting
    (and pervasive) design decision
  • AtomCaml:
  • bigger closures -- and related GC changes
  • (unnecessary with bytecodes -- but we did it
    anyway)

[Diagram: an AtomCaml closure holds two code pointers (header, code ptr1,
free variables, code ptr2), one for each compiled version of the code
("add 3, push, …")]
48
Representing closures/objects
  • Representation of closures: an interesting
    (and pervasive) design decision
  • AtomCaml alternative:
  • (slower calls in atomic)

[Diagram: alternative layout with code ptr2 stored just before the closure
record (header, code ptr1, free variables)]
49
Evaluation
  • Strong isolation on uniprocessors at little cost
  • See the papers for "in the noise" performance
  • Memory-access overhead
  • Recall: initialization writes need not be logged
  • Rare rollback

50
Lesson
  • Implementing transactions in software for a
    uniprocessor is so efficient it deserves
    special-casing
  • Note: Don't run other multicore services on a
    uniprocessor either

51
Today, part 2
  • Implementation
  • On one core [ICFP05] [SCHEME07]
  • Static optimizations for strong isolation
    [PLDI07], with Steven Balensiefer, Benjamin Hindman
  • Multithreaded transactions
  • Joint work with Intel PSL

52
Strong performance problem
  • Recall uniprocessor overhead

With parallelism
53
Optimizing away strong's cost
Thread-local
Not accessed in transaction
Immutable
  • New static analysis for not-accessed-in-transaction
    (an illustrative example follows below)
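A hypothetical example of what that analysis can establish (the names are ours), in the deck's Java-like pseudocode: no atomic block ever touches log, so under strong isolation the non-transactional accesses to it need no compiler-inserted barriers, while accesses to balance keep them.

class Account {
  int balance;                                 // accessed inside atomic blocks
  StringBuilder log = new StringBuilder();     // never accessed in any atomic block

  void deposit(int x) {
    atomic { balance += x; }                   // transactional code: barriers stay
    log.append("deposit ").append(x);          // provably non-transactional: barriers omitted
  }
}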

54
Not-accessed-in-transaction
  • Revisit overhead of not-in-atomic for strong
    isolation, given information about how data is
    used in atomic

not in atomic
Yet another client of pointer-analysis
55
Analysis details
  • Whole-program, context-insensitive,
    flow-insensitive
  • Scalable, but needs whole program
  • Can be done before method duplication
  • Keep lazy code generation without losing
    precision
  • Given pointer information, just two more passes
  • How is an abstract object accessed
    transactionally?
  • What abstract objects might a non-transactional
    access use?

56
Collaborative effort
  • UW: static analysis using pointer analysis
  • Via Paddle/Soot from McGill
  • Intel PSL: high-performance STM
  • Via compiler and run-time
  • Static analysis annotates bytecodes, so the
    compiler back-end knows what it can omit

57
Benchmarks
[Chart: Tsp benchmark results]
58
Benchmarks
[Chart: JBB benchmark results]
59
Lesson
  • The cost of strong isolation is in the
    nontransactional code; compiler optimizations
    help a lot

60
Today, part 2
  • Implementation
  • On one core [ICFP05] [SCHEME07]
  • Static optimizations for strong isolation
    [PLDI07]
  • Multithreaded transactions, with Aaron Kimball
  • Caveat: ongoing work
  • Joint work with Intel PSL

61
Multithreaded Transactions
  • Most implementations (hw or sw) assume code
    inside a transaction is single-threaded
  • But isolation and parallelism are orthogonal
  • And Amdahl's Law will strike with manycore
  • Language design: need nested transactions
  • Currently modifying Microsoft's Bartok STM
  • Key: correct logging without sacrificing
    parallelism
  • Work perhaps ahead of the technology curve,
    like concurrent garbage collection

62
Credit
  • Semantics: Katherine Moore
  • Uniprocessor: Michael Ringenburg, Aaron Kimball
  • Optimizations: Steven Balensiefer, Ben Hindman
  • Implementing multithreaded transactions: Aaron
    Kimball
  • Memory-model issues: Jeremy Manson, Bill Pugh
  • High-performance strong STM: Tatiana Shpeisman,
    Vijay Menon, Ali-Reza Adl-Tabatabai,
    Richard Hudson, Bratin Saha

wasp.cs.washington.edu
63
Please read
  • High-Level Small-Step Operational Semantics for
    Transactions [POPL08], Katherine F. Moore, Dan
    Grossman
  • The Transactional Memory / Garbage Collection
    Analogy [OOPSLA07], Dan Grossman
  • Software Transactions Meet First-Class
    Continuations [SCHEME07], Aaron Kimball, Dan
    Grossman
  • Enforcing Isolation and Ordering in STM
    [PLDI07], Tatiana Shpeisman, Vijay Menon,
    Ali-Reza Adl-Tabatabai, Steve Balensiefer,
    Dan Grossman, Richard Hudson, Katherine F. Moore,
    Bratin Saha
  • Atomicity via Source-to-Source Translation
    [MSPC06], Benjamin Hindman, Dan Grossman
  • What Do High-Level Memory Models Mean for
    Transactions? [MSPC06], Dan Grossman, Jeremy
    Manson, William Pugh
  • AtomCaml: First-Class Atomicity via Rollback
    [ICFP05], Michael F. Ringenburg, Dan Grossman

64
Lessons
  • Transactions: the garbage collection of shared
    memory
  • Semantics: prove sufficient conditions for
    avoiding weak-isolation anomalies
  • Must define interaction with features like
    exceptions
  • Uniprocessor implementations are worth
    special-casing
  • Compiler optimizations help remove the overhead
    in nontransactional code resulting from strong
    isolation
  • Amdahl's Law suggests multithreaded transactions,
    which we believe we can make scalable