Title: Process Management
1Process Management
[Figure: processes P1, P2, ..., Pn run on top of the OS, which runs on the hardware]
- The process is
- the OS's abstraction for execution
- the unit of execution/scheduling
- an instantiation of a program
- also called a job or task
- defines the instruction-at-a-time execution of a program
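2A process's address space
[Figure: layout of a process's address space, from high to low addresses]
0xFFFFFFFF
stack (dynamically allocated memory), top marked by SP
heap (dynamically allocated memory)
static data (data segment)
code (text segment), current instruction marked by PC
0x00000000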
3What's in a process
- A process consists of (at least)
- Code for the running program
- Data for the running program
- An execution stack
- traces the state of procedure calls made
- A program counter (PC), indicating the next instruction
- A set of general-purpose processor registers and their values
- A set of OS resources
- open files, network connections, sound channels, etc.
- The process is a container for all of this state
- named by a process ID (PID)
- just an integer
4Process Data Structures
- How does the OS represent a process in the kernel?
- the OS data structure that represents each process is called the process control block (PCB)
- The PCB is a data structure with many, many fields
- process ID (PID): an integer
- execution state: PC, SP, registers
- memory management info
- UNIX username of owner
- scheduling priority
- accounting info
- pointers into state queues
- In Linux
- defined in task_struct (include/linux/sched.h)
- over 95 fields!!!
5Process Behavior
- All processes alternate between bursts of computing and I/O requests (disk, keyboard, network, etc.)
[Figure: (a) a CPU-bound process; (b) an I/O-bound process]
6Process States
- Each process has an execution state, which indicates what it is currently doing
- ready: waiting to be assigned to the CPU
- could run, but another process has the CPU
- running: executing on the CPU
- is the process that currently controls the CPU
- Question: how many processes can be running simultaneously?
- waiting: waiting for an event, e.g. I/O
- cannot make progress until the event happens
- As a process executes, it moves from state to state
- Windows: run the Task Manager and look up the state of all active processes
- Linux/Unix: type ps aux to see all processes in the system
7Process State Transitions
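[Figure: process state transition diagram]
- New -> Ready: create
- Ready -> Running: schedule
- Running -> Ready: unschedule
- Running -> Waiting: I/O, page fault, etc.
- Waiting -> Ready: I/O done
- Running -> Terminated: kill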
- What can cause state transitions?
8State Queues
- The OS maintains a collection of queues that represent the state of all processes in the system
- typically one queue for each state
- e.g., ready, waiting, etc.
- each PCB is queued onto a state queue according to its current state
- as a process changes state, its PCB is unlinked from one queue and linked onto another
9More State Queues
[Figure: two queues of PCBs, each with a header holding a head pointer and a tail pointer]
- Ready queue: netscape PCB, emacs PCB, ls PCB
- Wait queue: cat PCB, netscape PCB
- There may be many wait queues, one for each type of wait (particular device, timer, message, etc.)
10PCBs and Hardware State
- At any time, there are many processes, each in its own particular state
- running, ready, waiting
- When a process is running, its hardware state is inside the CPU
- PC, SP, registers
- CPU contains current values
- When the OS stops running a process (puts it in the ready/waiting state), it saves the register values in the PCB
- when the OS puts the process in the running state, it loads the hardware registers from the values in that process's PCB
11Context Switch
- The act of switching the CPU from one process to another is called a context switch
- timesharing systems may do 100s or 1000s of switches/s
- Must be very fast, as this is pure overhead
12Process Creation
- One process can create another process
- creator is called the parent
- created process is called the child
- what creates the first process, and when?
- In Unix, the first process is called init, which creates a login process per terminal. When a user logs in, the login process creates a shell process, which is used to enter user commands
- In some systems, the parent defines or donates resources and privileges for its children
- UNIX: child inherits the parent's userID field, etc.
- when a child is created, the parent may either wait for it to finish, or it may continue in parallel
13Unix Process Creation
- UNIX process creation through fork() system call
- creates and initializes a new PCB
- creates a new address space
- initializes the new address space with a copy of the entire contents of the address space of the parent
- initializes kernel resources of the new process with resources of the parent (e.g. open files)
- places the new PCB on the ready queue
- the fork() system call returns twice
- once into the parent, and once into the child
- returns the child's PID to the parent
- returns 0 to the child
14Fork()
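    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        char *name = argv[0];
        int child_pid = fork();
        if (child_pid == 0) {
            /* fork() returned 0: we are the child */
            printf("Child of %s is %d\n", name, child_pid);
            return 0;
        } else {
            /* fork() returned the child's PID: we are the parent */
            printf("My child is %d\n", child_pid);
            return 0;
        } // end-else
    } // end-main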
15Output
- gcc -o testparent testparent.c
- ./testparent
- My child is 486
- Child of testparent is 0
- ./testparent
- Child of testparent is 0
- My child is 488
16Fork Exec
- So how do we start a new program, instead of just forking the old program?
- the exec() system call!
- int exec(char *prog, char **argv)
- exec()
- stops the current process
- loads program prog into the address space
- initializes hardware context, args for new program
- places PCB onto ready queue
- note: does not create a new process!
17Unix Shells
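    #include <stdio.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        char cmd[256];
        while (1) {
            /* get the next command: here, one program name per input line */
            if (fgets(cmd, sizeof cmd, stdin) == NULL)
                return 0;
            cmd[strcspn(cmd, "\n")] = '\0';
            int child_pid = fork();
            if (child_pid == 0) {
                /* child: manipulate STDIN/STDOUT/STDERR fds here, then run cmd */
                execlp(cmd, cmd, (char *)NULL);
                perror("exec failed!");
                _exit(1);
            } else {
                /* parent: wait for the child to finish */
                waitpid(child_pid, NULL, 0);
            }
        }
    }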
18Process Hierarchies
- In some systems, when a process creates another process, the parent process and the child process are associated in some ways
- A process and its descendants form a process group
- When a signal is delivered (e.g. Ctrl-C) to kill a process, the signal is delivered to all processes in the group
19Threads Motivation
- Consider a web server, which forks off copies of itself to handle multiple simultaneous tasks; or imagine we have any parallel program on a multiprocessor
- To execute these, we need to
- create several processes that execute in parallel
- cause each to map to the same address space to share data
- have the OS schedule them in parallel
- This is really inefficient
- space: PCB, page tables, etc.
- time: creating OS structures, fork and copy address space, etc.
20Can we do better?
- What's similar in these processes?
- they all share the same code and data (address space)
- they all share the same privileges
- they all share the same resources (files, sockets, etc.)
- What's different?
- each has its own hardware execution state
- PC, registers, stack pointer, and stack
- Key idea
- separate the concept of
- a process (address space, etc.) from that of
- a minimal thread of control (execution state: PC, etc.)
- this execution state is usually called a thread, or sometimes, a lightweight process
21Threads and Processes
- Most modern OSs (Mach, Chorus, NT, Solaris, Linux) therefore support two entities
- the process, which defines the address space and general process attributes (such as open files, etc.)
- the thread, which defines a sequential execution stream within a process
- A thread is bound to a single process
- processes, however, can have multiple threads executing within them
- sharing data between threads is cheap: all see the same address space
- Threads become the unit of scheduling
- processes are just containers in which threads execute
22The Thread Model
- Each thread has its own stack
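23Recall a process's address space
[Figure: layout of a process's address space, from high to low addresses]
0xFFFFFFFF
stack (dynamically allocated memory), top marked by SP
heap (dynamically allocated memory)
static data (data segment)
code (text segment), current instruction marked by PC
0x00000000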
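24A process's address space with threads
[Figure: layout of an address space shared by three threads, from high to low addresses]
0xFFFFFFFF
thread 1 stack, top marked by SP (T1)
thread 2 stack, top marked by SP (T2)
thread 3 stack, top marked by SP (T3)
heap (dynamically allocated memory)
static data (data segment)
code (text segment), with a separate PC for each thread: PC (T1), PC (T2), PC (T3)
0x00000000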
25Thread Design Space
[Figure: thread design space, classified by threads per process and number of processes]
- one thread/process, one process: MS/DOS
- one thread/process, many processes: older UNIXes
- many threads/process, one process: Java
- many threads/process, many processes: Mach, NT, Chorus, Linux, etc.
26Process/Thread Separation
- Concurrency (multithreading) is useful for
- Improving program structure (modular code)
- Some threads doing I/O, some threads doing computation
- Handling concurrent events (e.g., web servers)
- Building parallel programs (computation speedup)
- So, multithreading is useful even on a uniprocessor
- even though only one thread can run at a time
27Kernel Thread and User-Level Threads
- Who is responsible for creating/managing threads?
- Two answers, in general
- the OS (kernel threads)
- thread creation and management requires system calls
- the user-level process (user-level threads)
- a library linked into the program manages the threads
28Kernel Threads
- OS now manages threads and processes
- all thread operations are implemented in the kernel
- OS schedules all of the threads in a system
- if one thread in a process blocks (e.g. on I/O), the OS knows about it, and can run other threads from that process
- possible to overlap I/O and computation inside a process
29Kernel Threads
- Kernel threads are cheaper than processes
- less state to allocate and initialize
- But, they can still be too expensive
- thread operations are all system calls
- OS must keep state for each thread
- Created by clone system call in Linux
- Similar to fork, but threads share the same
address space
30User-Level Threads
- To make threads cheap and fast, they need to be implemented at the user level
- managed entirely by a user-level library, e.g. libpthreads.a
- Each process keeps track of its own threads in a thread table
31User-Level Threads
- Why is user-level thread management possible?
- threads share the same address space
- therefore the thread manager doesn't need to manipulate address spaces
- threads only differ in hardware contexts (roughly)
- PC, SP, registers
- these can be manipulated by the user-level process itself!
32User-Level Threads
- User-level threads are small and fast
- each thread is represented simply by a PC, registers, a stack, and a small thread control block (TCB)
- creating a thread, switching between threads, and synchronizing threads are done via procedure calls
- no kernel involvement is necessary!
- user-level thread operations can be 10-100x faster than kernel threads as a result
33Performance Example
- On a 700MHz Pentium running Linux 2.2.16
- Processes
- fork/exit: 251 µs
- Kernel threads
- pthread_create()/pthread_join(): 94 µs
- User-level threads
- pthread_create()/pthread_join(): 4.5 µs
34User-Level Thread Limitations
- But, user-level threads aren't perfect
- tradeoff, as with everything else
- User-level threads are invisible to the OS
- there is no integration with the OS
- As a result, the OS can make poor decisions
- blocking a process whose thread initiated I/O,
even though the process has other threads that
are ready to run
35User Level Thread Implementation
- A thread scheduler determines when a thread runs
- it uses queues to keep track of what threads are doing
- just like the OS and processes
- but, implemented at user-level as a library
- run queue: threads currently running
- ready queue: threads ready to run
- wait queue: threads blocked for some reason
- maybe blocked on I/O, maybe blocked on a lock
36Thread Context Switch
- Very simple
- save context of currently running thread
- push machine state onto thread stack
- restore context of the next thread
- pop machine state from the next thread's stack
- return to caller as the new thread
- execution resumes at PC of next thread
- This is all done in assembly language
37Thread Interface
- This is taken from the POSIX pthreads API
- t = pthread_create(attributes, start_procedure)
- creates a new thread of control
- new thread begins executing at start_procedure
- pthread_exit()
- terminates the calling thread
- pthread_join(t)
- waits for the named thread to terminate
- Windows
- CreateThread to start a thread
- WaitForSingleObject(handle) to wait for thread
termination