Capriccio: Scalable Threads for Internet Services (von Behren)

1
Capriccio: Scalable Threads for Internet Services
(von Behren)
  • Kenneth Chiu

2
Background
  • Non-blocking I/O, async I/O
    • Non-blocking (NB) I/O
      • Usually doesn't work well for disks.
    • Async I/O
      • Issue a request, get a completion.
  • epoll()/poll()
  • Convoy: tendency for threads to bunch up.
  • Priority inversion
  • Call graph
  • Average, weighted moving average
  • Capriccio: improvisatory style, free form.

3
The Problem
  • Web transactions involve a number of steps
    which must be performed in sequence.
  • For high throughput, we want to service many of
    these requests concurrently.
  • When does concurrency help? When does it not?
  • If we use a single thread per request, we will
    have too many threads.
  • If we multiplex requests on a small set of
    threads, it's more difficult.

4
Read two numbers and add
Event-driven version (state kept explicitly per connection):
    while (true) {
        fd = get_read_ready();
        state = lookup(fd);
        if (state.step == READING_FIRST) {
            c = read(fd, ..., bytes_left);
            if (have enough)
                state.step = READING_SECOND;
        } else if (state.step == READING_SECOND) {
            ...
        }
    }

Threaded version (state kept in the control flow):
    while (true) {
        int n1, n2;
        readexact(fd, &n1, 4);
        readexact(fd, &n2, 4);
        printf("%d\n", n1 + n2);
    }
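The threaded version depends on a readexact() helper that the slide does not define. A minimal sketch of what such a helper might look like follows, assuming POSIX read() and 4-byte integers; add_two() is a hypothetical name for the loop body, not something from the talk.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Read exactly len bytes, looping over short reads.
   Returns 0 on success, -1 on error or premature end of file. */
static int readexact(int fd, void *buf, size_t len) {
    char *p = buf;
    while (len > 0) {
        ssize_t c = read(fd, p, len);
        if (c < 0) {
            if (errno == EINTR) continue;   /* interrupted: retry */
            return -1;
        }
        if (c == 0) return -1;              /* peer closed before we had enough */
        p += c;
        len -= (size_t)c;
    }
    return 0;
}

/* Hypothetical handler: read two 4-byte integers and store their sum.
   All "where am I" state lives in the program counter and the locals. */
static int add_two(int fd, int32_t *sum) {
    int32_t n1, n2;
    assert(sum != NULL);
    if (readexact(fd, &n1, sizeof n1) != 0) return -1;
    if (readexact(fd, &n2, sizeof n2) != 0) return -1;
    *sum = n1 + n2;
    return 0;
}
```

The event-driven version has to store the same progress (which number, how many bytes so far) in an explicit per-descriptor record, because it cannot block in the middle of a number.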
5
Thread Design and Scalability
6
The Case for User-Level Threads
  • Flexibility
    • Level of indirection between applications and
      the kernel, which helps decouple the two.
    • Kernel-level thread scheduling must handle all
      applications. User-level scheduling can be
      tailored to the application.
    • Lightweight, which means we can use zillions of
      them.
  • Performance
    • Cooperative scheduling is nearly free.
    • Uncontended locks do not require a kernel
      crossing. (Why do contended locks require
      kernel crossings?)
  • Disadvantages
    • Non-blocking I/O requires an additional system
      call. (Why?)
    • SMPs: harder to take advantage of multiple
      processors.
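One plausible answer to the "(Why?)" on the additional system call: a kernel thread can issue a single blocking read(), while a user-level package has to ask the kernel whether the descriptor is ready before (or after) attempting the read. The sketch below is an illustration only, assuming POSIX poll() and O_NONBLOCK; wrapped_read() and syscalls_made are hypothetical names, and a real package would switch to another thread where this code waits in poll().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

static int syscalls_made;   /* counts kernel crossings, for illustration only */

/* Put a descriptor into non-blocking mode. Returns -1 on failure. */
static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Emulate a blocking read on a non-blocking descriptor:
   one call to learn the descriptor is readable, one to read it. */
static ssize_t wrapped_read(int fd, void *buf, size_t len) {
    assert(buf != NULL);
    for (;;) {
        struct pollfd p = { .fd = fd, .events = POLLIN };
        syscalls_made++;
        if (poll(&p, 1, -1) < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        syscalls_made++;
        ssize_t c = read(fd, buf, len);
        if (c >= 0) return c;
        if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
            return -1;
        /* spurious wakeup or interruption: wait for readiness again */
    }
}
```

Even when data is already available, this path makes two kernel crossings where a plain blocking read() makes one.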

7
Implementation
  • Context switches
    • Built on a coroutine library.
  • I/O
    • Intercept blocking system calls; use epoll(),
      and AIO for disk.
    • Can be less efficient than making the blocking
      call directly.
  • Scheduling
    • Main scheduling loop looks very much like an
      event-driven application. (What is an EDA?)
    • Makes it relatively easy to switch schedulers.
  • Synchronization
    • Cooperative threading on a uniprocessor (UP).
  • Efficiency
    • All thread-management operations are O(1),
      except the sleep queue.
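To make the "context switches on a coroutine library" and "scheduling loop" bullets concrete, here is a minimal cooperative round-robin scheduler. It is a sketch under stated assumptions: it uses the ucontext routines from glibc (getcontext, makecontext, swapcontext) in place of whatever coroutine library Capriccio used, and coop_spawn, coop_yield, coop_run, and worker are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define MAX_THREADS 4
#define STACK_SIZE  (64 * 1024)

static ucontext_t sched_ctx;                  /* the scheduler's own context */
static ucontext_t thread_ctx[MAX_THREADS];
static int done[MAX_THREADS];
static int nthreads, current;
static char trace[64];                        /* records who ran, in order */
static int trace_len;

/* Give up the CPU voluntarily: save this thread, resume the scheduler. */
static void coop_yield(void) {
    swapcontext(&thread_ctx[current], &sched_ctx);
}

/* Example thread body: log our id three times, yielding after each. */
static void worker(void) {
    for (int step = 0; step < 3; step++) {
        trace[trace_len++] = (char)('A' + current);
        coop_yield();
    }
    done[current] = 1;    /* returning resumes uc_link, i.e. the scheduler */
}

static void coop_spawn(void (*fn)(void)) {
    assert(nthreads < MAX_THREADS);
    ucontext_t *c = &thread_ctx[nthreads];
    getcontext(c);
    c->uc_stack.ss_sp = malloc(STACK_SIZE);   /* never freed in this sketch */
    assert(c->uc_stack.ss_sp != NULL);
    c->uc_stack.ss_size = STACK_SIZE;
    c->uc_link = &sched_ctx;
    makecontext(c, fn, 0);
    done[nthreads++] = 0;
}

/* Main loop: pick the next runnable thread and switch to it. */
static const char *coop_run(void) {
    int live = nthreads;
    while (live > 0) {
        for (current = 0; current < nthreads; current++) {
            if (done[current]) continue;
            swapcontext(&sched_ctx, &thread_ctx[current]);
            if (done[current]) live--;
        }
    }
    trace[trace_len] = '\0';
    return trace;
}
```

Spawning two workers and calling coop_run() interleaves them strictly at the yield points. No locks are needed around trace because, on one processor, nothing else runs between yields.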

8
Benchmarks
  • 2 × 2.4 GHz Xeon, 1 GB memory, 2 × 10K RPM SCSI
    disks, GigE.
  • 2 × 1.2 GHz UltraSPARC III (US III).
  • Linux 2.5.70 with epoll() and AIO.
  • Solaris 8.
  • Thread packages: Capriccio, LinuxThreads, NPTL.

9
Thread Primitives
10
Thread Scalability
  • Producer-consumer

11
Thread Scalability
  • The drop between 100 and 1000 threads is due to
    cache footprint.

12
I/O Performance
  • Pipetest
    • Pass a number of tokens among a set of pipes.
  • Disk scheduling
    • A number of threads perform random 4 KB reads
      from a 1 GB file.
  • Disk I/O through buffer cache
    • 200 threads reading with a fixed miss rate.

13
  • When concurrency is low, performance is poorer.

14
  • Benefits of disk head scheduling.

15
  • I/O served out of the buffer cache.
  • Performance is lower because of AIO overhead.

16
Linked Stack Management
17
Thread Stacks
  • With many threads, the cumulative stack space
    can be quite large.
  • Solution: use a dynamic allocation policy and
    allocate on demand. Link stack chunks together.
  • Problem: how do you link stack chunks together?
    How do you know when to link a new one?

18
Weighted Call Graph
  • Use static analysis to create a weighted call
    graph.
  • Each node is weighted by the maximum stack space
    that the function might consume. (Why is it
    maximum, and not exact?)
  • Now what?
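One thing a weighted call graph gives directly, when it has no cycles, is a worst-case stack bound: the heaviest path from the entry point. The sketch below computes it for a small made-up graph; the frame sizes and edges are invented for illustration and NFUNCS, frame_size, calls, and max_stack are hypothetical names. It also hints at why node weights are maxima: which callee actually runs depends on run-time input, so the analysis must assume the worst.

```c
#include <assert.h>

#define NFUNCS 4

/* Hypothetical acyclic call graph:
   0 (main) -> 1 (a), 2 (b);  1 -> 3 (c);  2 -> 3 (c). */
static const int frame_size[NFUNCS] = { 64, 128, 32, 256 };  /* bytes */
static const int calls[NFUNCS][NFUNCS] = {
    { 0, 1, 1, 0 },
    { 0, 0, 0, 1 },
    { 0, 0, 0, 1 },
    { 0, 0, 0, 0 },
};

/* Worst-case stack bytes needed from the moment f is entered.
   Only valid for acyclic graphs: recursion would never terminate here. */
static int max_stack(int f) {
    assert(f >= 0 && f < NFUNCS);
    int deepest = 0;
    for (int g = 0; g < NFUNCS; g++) {
        if (calls[f][g]) {
            int d = max_stack(g);
            if (d > deepest) deepest = d;
        }
    }
    return frame_size[f] + deepest;
}
```

For this graph, max_stack(0) follows main, a, c and yields 64 + 128 + 256 = 448 bytes, even though a run that only takes the path through b needs less.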

19
Bounds
  • Most real-world programs use recursion, so no
    static bound exists for them.
  • Even without recursion, a static bound wastes
    too much space.
  • Instead, insert checkpoints at key places to
    link in new stack chunks.
  • Chunks are switched right before arguments are
    pushed.
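The bookkeeping behind a checkpoint can be sketched as follows. This is a simulation only: it tracks chunk sizes and links but does not move the stack pointer, which a real implementation has to do in compiler-inserted code. CHUNK_SIZE, struct chunk, struct linked_stack, checkpoint, push_frame, and pop_frame are all hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096           /* assumed default chunk size, in bytes */

struct chunk {
    struct chunk *prev;           /* link back to the chunk we came from */
    size_t used, size;
};

struct linked_stack {
    struct chunk *top;
    int nchunks;
};

/* Checkpoint: make sure `need` bytes are available before the call.
   Returns 1 if a new chunk was linked, 0 if not, -1 on allocation failure. */
static int checkpoint(struct linked_stack *s, size_t need) {
    assert(s != NULL);
    if (s->top != NULL && s->top->size - s->top->used >= need)
        return 0;
    struct chunk *c = malloc(sizeof *c);
    if (c == NULL) return -1;
    c->prev = s->top;
    c->used = 0;
    c->size = need > CHUNK_SIZE ? need : CHUNK_SIZE;
    s->top = c;
    s->nchunks++;
    return 1;
}

/* Caller must have passed a checkpoint covering `bytes`. */
static void push_frame(struct linked_stack *s, size_t bytes) {
    s->top->used += bytes;
}

/* On return, release the frame and unlink a chunk that became empty. */
static void pop_frame(struct linked_stack *s, size_t bytes) {
    s->top->used -= bytes;
    if (s->top->used == 0 && s->top->prev != NULL) {
        struct chunk *dead = s->top;
        s->top = dead->prev;
        free(dead);
        s->nchunks--;
    }
}
```

With a 4096-byte chunk holding a 3000-byte frame, a checkpoint asking for 2000 bytes links a second chunk, and popping that frame unlinks it again.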

20
Placing Checkpoints
  • Make sure there is at least one checkpoint in
    every cycle by inserting checkpoints on back
    edges. (How?) (Is this efficient?)
  • Then make sure the sum of node weights along
    each checkpoint-free path is not too long.
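A standard way to answer the "(How?)" is a depth-first search: an edge that points to a node still on the DFS stack is a back edge, and every cycle contains at least one. The sketch below applies that to a small made-up call graph; whether Capriccio's analysis works exactly this way is an assumption, and N, edge, checkpoint_on, state, and place_checkpoints are hypothetical names.

```c
#include <assert.h>

#define N 4

enum visit { WHITE, GRAY, BLACK };   /* unvisited, on the DFS stack, finished */

/* Hypothetical call graph with one cycle: 0 -> 1 -> 2 -> 1, and 2 -> 3. */
static const int edge[N][N] = {
    { 0, 1, 0, 0 },
    { 0, 0, 1, 0 },
    { 0, 1, 0, 1 },
    { 0, 0, 0, 0 },
};
static int checkpoint_on[N][N];      /* 1 where a call edge gets a checkpoint */
static enum visit state[N];

/* DFS from u; marks every back edge and returns how many were marked. */
static int place_checkpoints(int u) {
    assert(u >= 0 && u < N);
    int placed = 0;
    state[u] = GRAY;
    for (int v = 0; v < N; v++) {
        if (!edge[u][v]) continue;
        if (state[v] == GRAY) {          /* v is an ancestor: back edge */
            checkpoint_on[u][v] = 1;
            placed++;
        } else if (state[v] == WHITE) {
            placed += place_checkpoints(v);
        }
    }
    state[u] = BLACK;
    return placed;
}
```

On this graph the only back edge is 2 -> 1, so one checkpoint breaks the cycle. Bounding the length of the remaining checkpoint-free paths is a second pass not shown here.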

21
  • Function B is executing.
  • Function D, both ways.
  • Recursion.

22
Special Cases
  • Function pointers
    • Difficult, but they try to analyze them.
  • External functions
    • Allow annotations.
    • Alternatively, link in a large chunk.
  • Variable-length arrays
    • C99

23
Question
  • What kind of a problem is this?
  • Is it being solved at the right level?

24
Resource-Aware Scheduling
25
Admission Control
  • We've seen many graphs where performance degrades
    as some variable increases.
  • The goal of scheduling in Capriccio is to keep
    performance in the good part of the curve.

26
Blocking Graph
  • Each node is a location where the program
    blocked.
    • A location is identified by its call chain.
    • Generated at run time.
  • Annotate with resource usage:
    • Average running time (an exponentially weighted
      moving average), memory, stack, sockets, etc.
  • Maintain a run queue for each node. Admit threads
    until resources reach maximum capacity.
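The two mechanisms on this slide, the exponentially weighted moving average and the admit-until-capacity rule, can be sketched in a few lines. The smoothing factor alpha, the units, and the names node_stats, ewma_update, and admit are assumptions made for illustration; the slide does not give Capriccio's actual parameters.

```c
#include <assert.h>
#include <stddef.h>

struct node_stats {
    double avg_runtime_us;   /* smoothed running time at this blocking point */
    int have_sample;
};

/* EWMA: avg = alpha * sample + (1 - alpha) * avg.
   The first sample seeds the average. Returns the updated average. */
static double ewma_update(struct node_stats *n, double sample_us, double alpha) {
    assert(n != NULL && alpha > 0.0 && alpha <= 1.0);
    if (!n->have_sample) {
        n->avg_runtime_us = sample_us;
        n->have_sample = 1;
    } else {
        n->avg_runtime_us = alpha * sample_us
                          + (1.0 - alpha) * n->avg_runtime_us;
    }
    return n->avg_runtime_us;
}

/* Admit a thread from a node's run queue only if its predicted use of a
   resource still fits under that resource's capacity. */
static int admit(double in_use, double predicted_cost, double capacity) {
    return in_use + predicted_cost <= capacity;
}
```

A larger alpha tracks recent behavior more closely; a smaller one smooths out spikes. The scheduler would consult admit() once per resource before picking a thread from a node's queue.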

27
Pitfalls
  • Too many non-linear effects to predict.
  • One solution is to use some kind of
    instrumentation, plus feedback control.
  • But even detecting that is hard.

28
Web Server Test
30
Summary
  • Control flow maintains state. Control flow can be
    swapped for explicit maintenance of state.
  • Threads perform two functions:
    • Maintain state (the logical threads of the
      programming model).
    • Allow concurrency (kernel).
  • We should separate the two, since the overhead of
    concurrency is not necessary when we just want
    to maintain state.
  • Cooperative multitasking has been denigrated
    before, but it can be good.
