Slides About Systems

1
Servers and Threads, Continued
Jeff Chase Duke University
2
Processes and threads
other threads (optional)


Each process has a virtual address space (VAS): a
private name space for the virtual memory it
uses. The VAS is both a "sandbox" and a
"lockbox": it limits what the process can see/do,
and protects its data from others.
From now on, we suppose that a process could have
additional threads. We are not concerned with
how to implement them, but we presume that they
can all make system calls and block
independently.
Each process has a thread bound to the VAS, with
stacks (user and kernel). If we say a process
does something, we really mean its thread does
it. The kernel can suspend/restart the thread
wherever and whenever it wants.
3
Inside your Web server
Server operations:
  • create socket(s)
  • bind to port number(s)
  • listen to advertise port
  • wait for client to arrive on port (select/poll/epoll of
    ports)
  • accept client connection
  • read or recv request
  • write or send response
  • close client socket
Server application (Apache, Tomcat/Java, etc)
accept queue
packet queues
listen queue
disk queue
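The server operations listed above can be sketched with the sockets API roughly as follows. This is a minimal sketch, not how Apache or Tomcat is implemented: the helper names make_listener and serve_one, the backlog of 16, the loopback-only bind, and the fixed reply are all assumptions made for illustration, and error handling is reduced to returning -1.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket, bind it to a port on the
 * loopback interface, and listen. Port 0 asks the kernel to pick a free
 * port. Returns the listening descriptor, or -1 on error. */
int make_listener(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);            /* create socket */
    if (fd < 0) return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||  /* bind to port */
        listen(fd, 16) < 0) {                                    /* advertise port */
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: handle exactly one client, blocking as necessary.
 * Returns the number of request bytes read, or -1 on error. */
int serve_one(int listen_fd) {
    static const char reply[] = "HTTP/1.0 200 OK\r\n\r\nhello\n";
    char request[1024];

    assert(listen_fd >= 0);
    int client = accept(listen_fd, NULL, NULL);          /* accept client connection */
    if (client < 0) return -1;

    ssize_t n = read(client, request, sizeof request);   /* read request */
    if (n >= 0 && write(client, reply, sizeof reply - 1) < 0) {  /* write response */
        n = -1;
    }
    close(client);                                       /* close client socket */
    return (int)n;
}
```

A server built this way would call make_listener once and then call serve_one in a loop. Because serve_one blocks in accept, read, and write, a single loop like this handles only one request at a time.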
4
Web server handling a request
  • Accept Client Connection
  • Read HTTP Request Header
  • Find File
  • Send HTTP Response Header
  • Read File
  • Send Data
Steps may block waiting on disk I/O or waiting on the network.
Want to be able to process requests concurrently.
5
Multi-programmed server: idealized
worker loop
Magic elastic worker pool
Resize worker pool to match incoming request
load: create/destroy workers as needed.
Handle one request, blocking as necessary.
dispatch
Incoming request queue
When request is complete, return to worker pool.
idle workers
Workers wait here for next request dispatch.
Workers could be processes or threads.
6
Multi-process server architecture
Process 1

separate address spaces
Process N
7
Multi-process server architecture
  • Each of P processes can execute one request at a
    time, concurrently with other processes.
  • If a process blocks, the other processes may
    still make progress on other requests.
  • Max requests in service concurrently: P
  • The processes may loop and handle multiple
    requests serially, or can fork a process per
    request.
  • Tradeoffs?
  • Examples:
  • inetd: "internet daemon" for standard
    /etc/services
  • Design pattern for (Web) servers: "prefork" a
    fixed number of worker processes.

8
Example: inetd
  • Classic Unix systems run an inetd internet
    daemon.
  • Inetd receives requests for standard services.
  • Standard services and ports listed in
    /etc/services.
  • inetd listens on the ports and accepts
    connections.
  • For each connection, inetd forks a child process.
  • Child execs the service configured for the port.
  • Child executes the request, then exits.
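The fork-and-exec pattern described above can be sketched as follows. This is a sketch under stated assumptions rather than inetd's actual code: spawn_service and reap are hypothetical helper names, the service path and arguments are supplied by the caller instead of being looked up in a configuration file, and the connection is attached only to the child's standard input and output.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the service program `path` in a child process
 * with the connection `conn_fd` as its stdin and stdout.
 * Returns the child's pid in the parent, or -1 if fork fails. */
pid_t spawn_service(int conn_fd, const char *path, char *const argv[]) {
    assert(conn_fd >= 0);
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: the service talks to the client through fds 0 and 1. */
        dup2(conn_fd, STDIN_FILENO);
        dup2(conn_fd, STDOUT_FILENO);
        if (conn_fd > STDOUT_FILENO) close(conn_fd);
        execv(path, argv);      /* child execs the configured service */
        _exit(127);             /* reached only if exec failed */
    }
    /* Parent: would close its copy of conn_fd and go back to accepting. */
    return pid;
}

/* Wait for a child; return its exit status, or -1 if it did not exit normally. */
int reap(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Each request pays for a fork and an exec, which is one of the tradeoffs of the process-per-request design.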

Apache Modeling Project: http://www.fmc-modeling.org/projects/apache
9
Children of init: inetd
New child processes are created to run network
services. They may be created on demand on
connect attempts from the network for designated
service ports. Should they run as root?
10
Prefork
In the Apache MPM "prefork" option, only one
child polls or accepts at a time: the child at
the head of a queue. This avoids the thundering herd problem.
Apache Modeling Project: http://www.fmc-modeling.org/projects/apache
11
Details, details
Scoreboard keeps track of child/worker
activity, so parent can manage an elastic worker
pool.
12
Multi-threaded server architecture
This structure might have lower cost than the
multi-process architecture if threads are
cheaper than processes.
13
Server structure, recap
  • The server structure discussion motivates
    threads, and illustrates the need for concurrency
    management.
  • We return later to performance impacts and
    effective I/O overlap.
  • A continuing theme of the class presentation:
    Unix systems fall short of the idealized model.
  • Thundering herd problem: when multiple workers
    wake up and contend for an arriving request, one
    worker wins and consumes the request, and the others
    go back to sleep; their work was wasted. Recent
    fix in Linux.
  • Separation of poll/select and accept in the Unix
    syscall interface: multiple workers wake up when
    a socket has new data, but only one can accept
    the request. This is the thundering herd again, and
    it requires an API change to fix.
  • There is no easy way to manage an elastic worker
    pool.
  • Real servers (e.g., Apache/MPM) incorporate lots
    of complexity to overcome these problems. We
    skip this topic.

14
Threads
  • We now enter the topic of threads and concurrency
    control.
  • This will be a focus for several lectures.
  • We start by introducing more detail on thread
    management, and the problem of nondeterminism in
    concurrent execution schedules.
  • Server structure discussion motivates threads,
    but there are other motivations:
  • Harnessing parallel computing power in the
    multicore era
  • Managing concurrent I/O streams
  • Organizing/structuring processing for user
    interface (UI)
  • Threading and concurrency management are
    fundamental to OS kernel implementation:
    processes/threads execute concurrently in the
    kernel address space for system calls and fault
    handling. The kernel is a multithreaded program.
  • So let's get to it.

15
The theater analogy
context (stage)
script
Program
Running a program is like performing a play.
lpcox
16
A Thread
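[Figure: thread t. Labels: thread object or thread control block (TCB), holding name/status etc. and machine state (ucontext_t); stack (int stack[StackSize]) drawn from low to high addresses, with the stack top, an unused region, and a "fencepost" value 0xdeadbeef.]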
17
Example: pthreads
pthread_t threads[N];
int rc;
long t;
rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
if (rc) { /* error */ }

void *PrintHello(void *threadid) {
    long tid;
    tid = (long)threadid;   /* thread number was passed by value in the pointer */
    printf("Hello World! It's me, thread %ld!\n", tid);
    pthread_exit(NULL);
}
http://computing.llnl.gov/tutorials/pthreads/
18
Example: Java Threads (1)
class PrimeThread extends Thread {
    long minPrime;
    PrimeThread(long minPrime) {
        this.minPrime = minPrime;
    }
    public void run() {
        // compute primes larger than minPrime
        . . .
    }
}
PrimeThread p = new PrimeThread(143);
p.start();
http://download.oracle.com/javase/6/docs/api/java/lang/Thread.html
19
Example: Java Threads (2)
class PrimeRun implements Runnable {
    long minPrime;
    PrimeRun(long minPrime) {
        this.minPrime = minPrime;
    }
    public void run() {
        // compute primes larger than minPrime
        . . .
    }
}
PrimeRun p = new PrimeRun(143);
new Thread(p).start();
http://download.oracle.com/javase/6/docs/api/java/lang/Thread.html
20
Thread states and transitions
exited
exit
running
EXIT
yield
The kernel process/thread scheduler governs these
transitions.
sleep
ready
blocked
wakeup
wait, STOP, read, write, listen, receive, etc.
Sleep and wakeup are internal primitives. Wakeup
adds a thread to the scheduler's ready pool: a
set of threads in the ready state.
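The state diagram above can be written down as a small transition function. This is only a sketch: the type and event names (ThreadState, Event, transition) are hypothetical, and the ready-to-running event is called "dispatch" here on the assumption that it corresponds to the scheduler picking the thread to run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { TS_READY, TS_RUNNING, TS_BLOCKED, TS_EXITED } ThreadState;
typedef enum { EV_DISPATCH, EV_YIELD, EV_SLEEP, EV_WAKEUP, EV_EXIT } Event;

/* Apply one transition. Returns true and updates *s if the transition is
 * legal in the diagram; otherwise returns false and leaves *s unchanged. */
bool transition(ThreadState *s, Event e) {
    assert(s != NULL);
    switch (*s) {
    case TS_READY:
        if (e == EV_DISPATCH) { *s = TS_RUNNING; return true; }
        break;
    case TS_RUNNING:
        if (e == EV_YIELD) { *s = TS_READY;   return true; }
        if (e == EV_SLEEP) { *s = TS_BLOCKED; return true; }
        if (e == EV_EXIT)  { *s = TS_EXITED;  return true; }
        break;
    case TS_BLOCKED:
        /* A blocked thread cannot run until something wakes it up. */
        if (e == EV_WAKEUP) { *s = TS_READY; return true; }
        break;
    case TS_EXITED:
        break;              /* no transitions out of exited */
    }
    return false;
}
```

Note that wakeup moves a thread to ready, not to running: the scheduler still has to dispatch it.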
21
Two threads sharing a CPU
concept
reality
context switch
22
CPU Scheduling 101
  • The OS scheduler makes a sequence of moves.
  • Next move: if a CPU core is idle, pick a ready
    thread t from the ready pool and dispatch it (run
    it).
  • Scheduler's choice is "nondeterministic"
  • Scheduler's choice determines interleaving of
    execution

blocked threads
If timer expires, or wait/yield/terminate
ready pool
Wakeup
GetNextToRun
SWITCH()
23
A Rough Idea
Yield() {
    disable;
    next = FindNextToRun();
    ReadyToRun(this);
    Switch(this, next);
    enable;
}
Sleep() {
    disable;
    this->status = BLOCKED;
    next = FindNextToRun();
    Switch(this, next);
    enable;
}
Issues to resolve: What if there are no ready
threads? How does a thread terminate? How does
the first thread start?
24
/*
 * Save context of the calling thread (old),
 * restore registers of the next thread to run
 * (new), and return in context of new.
 */
switch/MIPS (old, new) {
    old->stackTop = SP;
    save RA in old->MachineState[PC];
    save callee registers in old->MachineState;
    restore callee registers from new->MachineState;
    RA = new->MachineState[PC];
    SP = new->stackTop;
    return (to RA);
}
This example (from the old MIPS ISA) illustrates
how context switch saves/restores the user
register context for a thread, efficiently and
without assigning a value directly into the PC.
25
Example: Switch()
Save current stack pointer and caller's return
address in old thread object.
switch/MIPS (old, new) {
    old->stackTop = SP;
    save RA in old->MachineState[PC];
    save callee registers in old->MachineState;
    restore callee registers from new->MachineState;
    RA = new->MachineState[PC];
    SP = new->stackTop;
    return (to RA);
}
Caller-saved registers (if needed) are already
saved on its stack, and restored automatically on
return.
Switch off of old stack and over to new stack.
Return to procedure that called switch in new
thread.
RA is the return address register. It contains
the address that a procedure return instruction
branches to.
26
What to know about context switch
  • The Switch/MIPS example is an illustration for
    those of you who are interested. It is not
    required to study it. But you should understand
    how a thread system would use it (refer to state
    transition diagram)
  • Switch() is a procedure that returns immediately,
    but it returns onto the stack of new thread, and
    not in the old thread that called it.
  • Switch() is called from internal routines to
    sleep or yield (or exit).
  • Therefore, every thread in the blocked or ready
    state has a frame for Switch() on top of its
    stack: it was the last frame pushed on the stack
    before the thread switched out. (Need per-thread
    stacks to block.)
  • The thread create primitive seeds a Switch()
    frame manually on the stack of the new thread,
    since it is too young to have switched before.
  • When a thread switches into the running state, it
    always returns immediately from Switch() back to
    the internal sleep or yield routine, and from
    there back on its way to wherever it goes next.
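The behavior described above, where a switch "returns" on another thread's stack, can be observed at user level with the POSIX ucontext calls. This is a hedged sketch, not the Switch/MIPS routine from the slides: it uses swapcontext as a stand-in for Switch(), and the names worker, record, and run_switch_demo are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

static ucontext_t main_ctx, worker_ctx;  /* saved machine state of each "thread" */
static char worker_stack[64 * 1024];     /* the worker needs its own stack */
static char trace[16];                   /* records the order of execution */
static size_t trace_len;

static void record(char c) {
    assert(trace_len + 1 < sizeof trace);
    trace[trace_len++] = c;
    trace[trace_len] = '\0';
}

static void worker(void) {
    record('w');
    swapcontext(&worker_ctx, &main_ctx); /* "yield": switch back to main */
    record('W');                         /* resumes here when switched in again */
}                                        /* returning continues at uc_link */

/* Run main and worker as two coroutines and return the execution trace. */
const char *run_switch_demo(void) {
    trace_len = 0;
    trace[0] = '\0';

    getcontext(&worker_ctx);
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;      /* where to go when worker returns */
    makecontext(&worker_ctx, worker, 0); /* seed the initial frame by hand */

    record('m');
    swapcontext(&main_ctx, &worker_ctx); /* first switch: worker starts at its entry */
    record('M');
    swapcontext(&main_ctx, &worker_ctx); /* worker returns from its own swapcontext */
    record('!');
    return trace;
}
```

Each call to swapcontext returns only when some other context switches back, which is why every suspended context here is parked inside a swapcontext call. The makecontext call plays the role of seeding the first frame for a thread that has never switched before.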

27
Creating a new thread
  • Also called "forking" a thread
  • Idea: create initial state, put on ready queue
  • Allocate, initialize a new TCB
  • Allocate a new stack
  • Make it look like thread was going to call a
    function:
  • PC points to first instruction in function
  • SP points to new stack
  • Stack contains arguments passed to function
  • Add thread to ready queue
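The creation steps above might look like this in a user-level thread library built on the POSIX ucontext calls. It is a sketch under several assumptions: the TCB layout, thread_create, run_all, and demo_body are hypothetical names, the argument is carried in the TCB rather than placed on the new stack, and threads never yield, so each runs to completion once dispatched.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

typedef enum { T_READY, T_RUNNING, T_EXITED } Status;

typedef struct TCB {
    ucontext_t ctx;          /* saved PC, SP, and registers */
    char *stack;             /* base of the thread's private stack */
    void (*func)(int);       /* function the thread will call */
    int arg;                 /* its argument */
    Status status;
    struct TCB *next;        /* link in the ready queue */
} TCB;

static TCB *ready_head, *ready_tail;   /* FIFO ready queue */
static TCB *current;                   /* thread now on the CPU */
static ucontext_t sched_ctx;           /* context of run_all() */

static void enqueue(TCB *t) {
    t->next = NULL;
    if (ready_tail) ready_tail->next = t; else ready_head = t;
    ready_tail = t;
}

static TCB *dequeue(void) {
    TCB *t = ready_head;
    if (t) {
        ready_head = t->next;
        if (!ready_head) ready_tail = NULL;
    }
    return t;
}

/* First code every new thread runs: call its function, then mark it exited.
 * Returning from here resumes uc_link, which is the scheduler. */
static void thread_start(void) {
    current->func(current->arg);
    current->status = T_EXITED;
}

/* Fork a thread: allocate a TCB and a stack, make the saved context look
 * as if the thread were about to call thread_start, and queue it. */
TCB *thread_create(void (*func)(int), int arg) {
    TCB *t = calloc(1, sizeof *t);               /* allocate, initialize a TCB */
    if (!t) return NULL;
    t->stack = malloc(STACK_SIZE);               /* allocate a new stack */
    if (!t->stack || getcontext(&t->ctx) < 0) {
        free(t->stack);
        free(t);
        return NULL;
    }
    t->ctx.uc_stack.ss_sp = t->stack;            /* SP will point into new stack */
    t->ctx.uc_stack.ss_size = STACK_SIZE;
    t->ctx.uc_link = &sched_ctx;
    makecontext(&t->ctx, thread_start, 0);       /* PC will point at thread_start */
    t->func = func;
    t->arg = arg;
    t->status = T_READY;
    enqueue(t);                                  /* add thread to ready queue */
    return t;
}

/* Dispatch ready threads in FIFO order until none remain. */
void run_all(void) {
    while ((current = dequeue()) != NULL) {
        current->status = T_RUNNING;
        swapcontext(&sched_ctx, &current->ctx);
        assert(current->status == T_EXITED);
        free(current->stack);
        free(current);
    }
}

/* Demo thread body: appends its argument as a decimal digit to demo_log. */
int demo_log;
void demo_body(int arg) { demo_log = demo_log * 10 + arg; }
```

Creating threads with arguments 1 and 2 and then calling run_all leaves demo_log equal to 12, because the ready queue is FIFO. Nothing runs at creation time; a new thread only becomes eligible for dispatch.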

28
Thread control block
[Figure: an address space containing TCB1, TCB2, and TCB3 (each holding PC, SP, registers), the ready queue, and code and a stack for each of three threads. CPU: Thread 1 running, with its PC, SP, and registers.]
29
Thread control block
[Figure: an address space containing TCB2 and TCB3 (each holding PC, SP, registers), the ready queue, one code region, and three stacks. CPU: Thread 1 running, with its PC, SP, and registers.]
30
Kernel threads (native)
[Figure: four threads, each with its own PC and SP. Labels: user mode, kernel mode, scheduler.]
31
User-level threads (green)
[Figure: four threads, each with its own PC and SP. Labels: Sched, user mode, kernel mode, scheduler.]
32
Andrew Birrell
Bob Taylor
33
Concurrency: An Example
int counters[N];
int total;
/*
 * Increment a counter by a specified value, and keep a running sum.
 */
void TouchCount(int tid, int value) {
    counters[tid] += value;
    total += value;
}
34
Reading Between the Lines of C
/*
 * counters and total are global data;
 * tid and value are local data.
 *   counters[tid] += value;
 *   total += value;
 */
load  counters, R1    ; load counters base
load  8(SP), R2       ; load tid index
shl   R2, 2, R2       ; index = index * sizeof(int)
add   R1, R2, R1      ; compute index to array
load  (R1), R2        ; load counters[tid]
load  4(SP), R3       ; load value
add   R2, R3, R2      ; counters[tid] += value
store R2, (R1)        ; store back to counters[tid]
load  total, R2       ; load total
add   R2, R3, R2      ; total += value
store R2, total       ; store total
35
Reading Between the Lines of C
load  total, R2       ; load total
add   R2, R3, R2      ; total += value
store R2, total       ; store total
load add store
load add store
Two executions of this code, so two values are
added to total.
36
Interleaving matters
load  total, R2       ; load total
add   R2, R3, R2      ; total += value
store R2, total       ; store total
load add store
load add store
In this schedule, only one value is added to
total: last writer wins. The scheduler made a
legal move that broke this program.
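The effect of the schedule can be reproduced deterministically by simulating the three instructions one step at a time. This is an illustrative sketch rather than real multithreaded code: SimThread, step_thread, and run_schedule are hypothetical names, each simulated thread has private copies of R2 and R3, and the schedule is given as a string of 'A' and 'B' characters.

```c
#include <assert.h>
#include <stddef.h>

/* One simulated thread executing:
 *   load total, R2 ; add R2, R3, R2 ; store R2, total */
typedef struct {
    int r2;      /* private register: working copy of total */
    int r3;      /* private register: the value to add */
    int step;    /* next instruction: 0 = load, 1 = add, 2 = store, 3 = done */
} SimThread;

/* Execute the next instruction of thread t against the shared total. */
static void step_thread(SimThread *t, int *total) {
    switch (t->step) {
    case 0: t->r2 = *total;        break;   /* load  total, R2  */
    case 1: t->r2 = t->r2 + t->r3; break;   /* add   R2, R3, R2 */
    case 2: *total = t->r2;        break;   /* store R2, total  */
    default: return;                        /* thread already finished */
    }
    t->step++;
}

/* Run two threads that add value_a and value_b to a shared total that
 * starts at 0. Each character of schedule ('A' or 'B') lets that thread
 * execute one instruction. Returns the final total. */
int run_schedule(const char *schedule, int value_a, int value_b) {
    int total = 0;
    SimThread a = { 0, value_a, 0 };
    SimThread b = { 0, value_b, 0 };
    assert(schedule != NULL);
    for (const char *p = schedule; *p != '\0'; p++) {
        step_thread(*p == 'A' ? &a : &b, &total);
    }
    return total;
}
```

With values 5 and 7, the serial schedule "AAABBB" yields 12, while the interleaved schedule "ABABAB" yields 7: both threads load 0, and B's store overwrites A's.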
37
Non-determinism and ordering
Thread A
Thread B
Thread C
Global ordering
Time
Why do we care about the global ordering?
  • Might have dependencies between events
  • Different orderings can produce different
    results
Why is this ordering unpredictable?
  • Can't predict how fast processors will run
38
Non-determinism example
  • y = 10
  • Thread A: x = y + 1
  • Thread B: y = y * 2
  • Possible results?
  • A goes first: x = 11 and y = 20
  • B goes first: y = 20 and x = 21
  • What is shared between threads?
  • Variable y
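The two outcomes listed above can be checked by running the two statements in each order. This is a minimal sketch that treats each statement as atomic, which is an assumption; Result and run_order are hypothetical names.

```c
#include <assert.h>

typedef struct { int x; int y; } Result;

/* Run thread A (x = y + 1) and thread B (y = y * 2) one after the other,
 * in the given order, starting from y = 10. `first` is 'A' or 'B'. */
Result run_order(char first) {
    Result r = { 0, 10 };
    assert(first == 'A' || first == 'B');
    if (first == 'A') {
        r.x = r.y + 1;   /* A reads y = 10, so x = 11 */
        r.y = r.y * 2;   /* B then sets y = 20 */
    } else {
        r.y = r.y * 2;   /* B sets y = 20 */
        r.x = r.y + 1;   /* A reads y = 20, so x = 21 */
    }
    return r;
}
```

The final value of y is 20 either way; only x depends on the order, because x is computed from the shared variable y.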

39
Another example
  • Two threads (A and B)
  • A tries to increment i
  • B tries to decrement i

Thread A: i = 0; while (i < 10) i++;
print "A done."
Thread B: i = 0; while (i > -10) i--;
print "B done."
40
Example continued
  • Who wins?
  • Does someone have to win?

Thread A: i = 0; while (i < 10) i++;
print "A done."
Thread B: i = 0; while (i > -10) i--;
print "B done."
41
Debugging non-determinism
  • Requires worst-case reasoning
  • Eliminate all ways for program to break
  • Debugging is hard
  • Can't test all possible interleavings
  • Bugs may only happen sometimes
  • Heisenbug
  • Re-running program may make the bug disappear
  • Doesn't mean it isn't still there!