Title: CS 2200 Threads and Synchronization
1CS 2200 Threads and Synchronization
- (Lectures based on the work of Jay Brockman,
Sharon Hu, Randy Katz, Peter Kogge, Bill Leahy,
Ken MacKenzie, Richard Murphy, and Michael
Niemier)
2Review Mutex
- Mutex -- the mutual exclusion problem
- only one process/thread in a critical section at a time
- motivated by multiprocessors, but applies to uniprocessors too
- Five low-level solutions
- enable/disable interrupts (works in-kernel only)
- lock-free data structures (restrictive!)
- software solutions
- special atomic operations in HW <--- std. solution
- test & set, swap
- speculate & retry! possible future soln?
- See discussion on board, handout for more
3Today Threads
- 1. What are threads & why do you want 'em?
- 2. Styles of thread programming
- intro to POSIX threads (pthreads) API
- 3. Synchronization (again)
- from point of view of programmer, not of hardware
- primitives that block instead of spin-wait
- 4. Implementation
4Terminology
- Process: full-blown virtual machine
- register set + stack
- protected area of memory
- Thread: multiplexed CPU only
- register set + stack
- A process may contain multiple threads. If so,
all threads see the same address space.
5Recall
- Process
- Program Counter
- Registers
- Stack
- Code (Text)
- Data
- Page Table
- etc.
- Processes must be protected from one another.
- For more on threads, see the board! And pages 4-5 of your handout
6Recall
- Context Switching
- Requires considerable work
- What about a single user's application?
- Is there a way to make it more efficient?
- In effect, allow the user to have multiple processes executing in the same space?
- Yes; the solution is threads, or multithreading
7What is Multithreading?
- Technique allowing a program to do multiple tasks
- Example: Java GUIs
- Is it a new technique?
- No, it has existed since the 70s (Concurrent Pascal, Ada tasks, etc.)
- Why now?
- Time has come for this technology
- Emergence of SMPs in particular
8What is a Thread?
- Basic unit of CPU utilization
- A lightweight process (LWP)
- Consists of
- Program Counter
- Register Set
- Stack Space
- Shares with peer threads
- Code
- Data
- OS Resources
- Open files
- Signals
9SMP? Symmetric Multiprocessors
- What is an SMP?
- Multiple CPUs in a single box sharing all the resources such as memory and I/O
- Is a dual-processor SMP more cost effective than two uniprocessor boxes?
- Yes (roughly 20% more for a dual-processor SMP compared to a uniprocessor).
- A modest speedup for a program on a dual-processor SMP over a uniprocessor will make it worthwhile.
- Example: DELL WORKSTATION 650
- 2.4GHz Intel Xeon (Pentium 4)
- 1GB SDRAM memory, 80GB disk, 20/48X CD, 19" monitor, Quadro4 900XGL graphics card, RedHat Linux, 3yrs service
- $2,584; for 2nd processor, add $434
10Threads
- Can be context switched more easily
- Registers and PC
- Not memory management
- Can run on different processors concurrently in an SMP
- Share CPU in a uniprocessor
- May (will) require concurrency control programming, like mutex locks.
11Process Vs. Thread
[Figure: processes P1 and P2 in user space; the kernel holds a PCB for each, plus kernel code and data]
- Two single-threaded applications on one machine
12Process Vs. Thread
[Figure: processes P1 and P2 in user space; the kernel holds a PCB for each, plus kernel code and data]
- P1 is multithreaded; P2 is single-threaded
- Computational state (PC, regs, ...) for each thread
- How different from process state?
13Memory Layout
- Multithreaded program has a per-thread stack
- Heap, static, and code are common to all threads
14Threads and OS
- Programs in a traditional OS are single threaded
- One PC per program (process), one stack, one set of CPU registers
- If a process blocks (say on disk I/O, network communication, etc.), then there is no progress for the program as a whole
15MultiThreaded Operating Systems
- How widespread is support for threads in OS?
- Digital Unix, Sun Solaris, Win9x, Win NT, Win2k, Linux, FreeBSD, etc.
- Process vs. Thread?
- In a single-threaded program, the state of the executing program is contained in a process
- In a multithreaded program, the state of the executing program is contained in several concurrent threads
16Why use threads?
- Multiprocessor
- convenient way to use the multiple processors
- all memory is shared
- Uniprocessor?
- Allows concurrency between I/O and user processing even in a uniprocessor box
17Threads
- Share address space of process
- Cooperate to get job done
- Concurrent?
- May be if the box is a true multiprocessor
- Share the same CPU on a uniprocessor
- Allow programmer to write 1 program that can run
with 1 or more processors
18Thread Programming
- Three common models
- one per processor model
- workpile or pool of threads model
- pipeline model
19One per Processor
[Figure: a main thread synchronizing with worker threads, one worker per physical processor]
- Common strategy in multiprocessing
20Workpile model
- Central pile of work to do
- queue or other data structure
- N threads (> # of processors)
- read unit of work from pile
- do work, possibly generating more work
- add new work to pile
21Pipeline Model
- Good for tolerating I/O delays
- Also used in heterogeneous systems
- e.g. system with specialized processors
22Mailbox
Mailbox
23Threads
- Threaded code different from non-threaded?
- Protection for data shared among threads
- Mutex
- Synchronization among threads
- Way for threads to talk (signal) to one another
- Thread-safe libraries
NOTES: strtok is unsafe for multi-threaded applications. strtok_r is MT-Safe and should be used instead.
24Typical Operation
- Main program creates (or spawns) threads
- Threads may
- perform one task and die
- last for duration of program
- Threads must
- be able to synchronize activity
- communicate with one another
25Example from text
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>   /* atoi, exit */

    int sum;                    /* this data is shared by the threads */
    void *runner(void *param);  /* the thread */

    int main(int argc, char *argv[])
    {
        pthread_t tid;          /* the thread identifier */
        pthread_attr_t attr;    /* set of thread attributes */

        if (argc != 2) {
            fprintf(stderr, "usage: a.out <integer value>\n");
            exit(1);
        }
        if (atoi(argv[1]) < 0) {
            fprintf(stderr, "%d must be >= 0\n", atoi(argv[1]));
            exit(1);
        }

        pthread_attr_init(&attr);                      /* get the default thread attributes */
        pthread_create(&tid, &attr, runner, argv[1]);  /* create the thread */
        pthread_join(tid, NULL);                       /* wait for the thread to exit */
        printf("sum = %d\n", sum);
        return 0;
    }
26Example from text
    /* The thread will begin control in this function */
    void *runner(void *param)
    {
        int upper = atoi(param);
        int i;

        sum = 0;
        if (upper > 0) {
            for (i = 1; i <= upper; i++)
                sum += i;
        }
        pthread_exit(0);
    }
27Programming Support for Threads
- Creation
- pthread_create(top-level procedure, args)
- Termination
- return to top-level procedure
- explicit kill
- Rendezvous
- creator can wait for children
- pthread_join(child_tid)
- Synchronization
- mutex
- condition variables
Why explicit kill? Example: searching a DB, one thread finds the data. Example: user presses stop to stop loading a web page.
28Programming with Threads
- Synchronization
- For coordination of the threads
- Communication
- For inter-thread sharing of data
- Threads can be in different processors
- How to achieve sharing in SMP?
- Software: accomplished by the OS keeping all threads in the same address space
- Hardware: accomplished by hardware shared memory and coherent caches
29Synchronization Primitives
- lock and unlock
- mutual exclusion among threads
- busy-waiting vs. blocking
- pthread_mutex_trylock: no blocking (rare)
- pthread_mutex_lock: blocking
- pthread_mutex_unlock
- condition variables
- pthread_cond_wait: block for a signal
- pthread_cond_signal: signal one waiting thread
- pthread_cond_broadcast: signal all waiting threads
30How to wait?
- 1. spin!
- + easy to implement
- - locks out the thread you're waiting for?
    void spin_mutex_lock(int *mutex)
    {
        while (test_and_set(mutex) == 1)
            ;   /* spin! */
    }

    void spin_mutex_unlock(int *mutex)
    {
        *mutex = 0;
    }
31How to wait?
- 2. switch-spin
- + easy to implement
- - doesn't scale
- + ideal for two threads, though
    void switchspin_mutex_lock(int *mutex)
    {
        while (test_and_set(mutex) == 1)
            thread_yield();   /* let another thread run */
    }

    void switchspin_mutex_unlock(int *mutex)
    {
        *mutex = 0;
    }
32How to wait?
- 3. sleep-spin?
- + easy to implement
- - wasteful
    void sleepspin_mutex_lock(int *mutex)
    {
        while (test_and_set(mutex) == 1)
            usleep(10000);   /* let another thread run */
    }

    void sleepspin_mutex_unlock(int *mutex)
    {
        *mutex = 0;
    }
33How to wait?
- 4. blocking
- - hard to implement
- - expensive
- + absolutely the right thing if you're waiting a long time
34Example
Initially: mutex is unlocked, resource_state is FREE
    lock(mutex);
    while (resource_state == BUSY)
        ;   /* spin */
    resource_state = BUSY;
    unlock(mutex);

    /* use resource */

    lock(mutex);
    resource_state = FREE;
    unlock(mutex);
Will this work? - Yes? - No? - Maybe?
38Example with cond-var
    lock(mutex);
    while (resource_state == BUSY)
        wait(cond_var);   /* implicitly give up mutex while waiting,
                             implicitly re-acquire mutex before returning */
    resource_state = BUSY;
    unlock(mutex);

    /* use resource */

    lock(mutex);
    resource_state = FREE;
    unlock(mutex);
    signal(cond_var);
41pthreads
- Mutex
- Must create mutex variables
- pthread_mutex_t padlock;
- Must initialize mutex variable
- pthread_mutex_init(&padlock, NULL);
- Condition Variable (used for signaling)
- Must create condition variables
- pthread_cond_t non_full;
- Must initialize condition variables
- pthread_cond_init(&non_full, NULL);
42Classic CS Problem Producer Consumer
- Producer
    if (!full) {
        add item to buffer;
        empty = FALSE;
        if (buffer_is_full)
            full = TRUE;
    }
- Consumer
    if (!empty) {
        remove item from buffer;
        full = FALSE;
        if (buffer_is_empty)
            empty = TRUE;
    }
43Example Producer Threads Program
    while (forever) {
        // produce item
        pthread_mutex_lock(&padlock);
        while (full)
            pthread_cond_wait(&non_full, &padlock);
        // add item to buffer
        buffercount++;
        if (buffercount == BUFFERSIZE)
            full = TRUE;
        empty = FALSE;
        pthread_mutex_unlock(&padlock);
        pthread_cond_signal(&non_empty);
    }
44Example Consumer Threads Program
    while (forever) {
        pthread_mutex_lock(&padlock);
        while (empty)
            pthread_cond_wait(&non_empty, &padlock);
        // remove item from buffer
        buffercount--;
        full = FALSE;
        if (buffercount == 0)
            empty = TRUE;
        pthread_mutex_unlock(&padlock);
        pthread_cond_signal(&non_full);
        // consume item
    }
46Threads Implementation
- User level threads
- OS independent
- Scheduler is part of the runtime system
- Thread switch is cheap (save PC, SP, regs)
- Scheduling customizable, i.e., more application control
- Blocking call by thread blocks process
47Threads Implementation
- Solution to blocking problem in user level threads
- Non-blocking version of all system calls
- Switching among user level threads
- Yield voluntarily
- How to make preemptive?
- Timer interrupt from kernel to switch
48Threads Implementation
- Kernel Level
- Expensive thread switch
- Makes sense for blocking calls by threads
- Kernel becomes complicated: process vs. thread scheduling
- Thread packages become non-portable
- Problems common to user and kernel level threads
- Libraries
- Solution is to have thread-safe wrappers to such library calls
49Solaris Threads
- Three kinds
- user, lwp, kernel
- User: any number can be created and attached to lwps
- One-to-one mapping between lwp and kernel threads
- Kernel threads are known to the OS scheduler
- If a kernel thread blocks, the associated lwp and user level threads block as well
50Solaris Terminology
51More Conventional Terminology
[Figure: processes P1, P2, and P3, with the labels "Thread (user-level view)" and "kernel thread (inside the kernel)"]
52Kernel Threads vs. User Threads
- Advantages of kernel threads
- Can be scheduled on multiple CPUs
- Can be preempted by CPU
- Kernel scheduler knows their relative priorities
- Advantages of user threads
- (Unknown to kernel)
- Extremely lightweight: no system call is needed to change threads.
53Things to know?
- 1. The reason threads are around?
- 2. Benefits of increased concurrency?
- 3. Why do we need software controlled "locks" (mutexes) of shared data?
- 4. How can we avoid potential deadlocks/race conditions?
- 5. What is meant by producer/consumer thread synchronization/communication using pthreads?
- 6. Why use a "while" loop around a pthread_cond_wait() call?
- 7. Why should we minimize lock scope (minimize the extent of code within a lock/unlock block)?
- 8. Do you have any control over thread scheduling?
54Some ?s for you to think about
55Questions
- Why are threads becoming increasingly important?
- Widespread influence of Java (GUI's)
- Increasing availability of SMP's
- Importance of computer gaming market
- Logical way of abstracting complex task
- Why use a "while" loop around a pthread_cond_wait() call?
- In order to properly spinlock
- To ensure that the wait gets executed
- To verify that the resource is free
- SMP cache coherency
56Questions
- Why should we minimize lock scope (minimize the extent of code within a lock/unlock block)?
- Allow for more concurrency
- Depends
- 42
- Add a bit
- Do you have any control over thread scheduling?
- Yes
- No
- Why would I want to?
- Maybe
57What's wrong with Semaphores?
    void wait(int *s)     /* P */
    {
        while (*s <= 0)
            ;   /* spin */
        *s = *s - 1;
    }

    void signal(int *s)   /* V */
    {
        *s = *s + 1;
    }
59Solution
- Block thread while waiting.
- How?
- Let's examine a simple threads package...
60Typical API
- Thread_init
- Thread_create
- Thread_yield
- Thread_exit
- Mutex_create
- Mutex_lock
- Mutex_unlock
- Alarm
61Program starts in main
- main is going to call thread_init
Nothing much happening in Threadland
62Thread_init
- Create linked list of thread control blocks
- Context
- Status
- RUNNING
- READY
- DONE
- BLOCKED
- Pointer to mutex thread is waiting for.
- Start timer (Request from OS)
- Which will be handled by "Alarm"
63Thread_Init
I even created a TCB for myself! But, I'm so
lonely...
64Thread_create
- Make a new TCB
- malloc space for the new thread's stack
- Get current context
- Modify context
- stack address
- stack size
- starting address
- Note: context is in TCB
- Add TCB to linked list (Ready)
65Thread_create (3 times)
I think I'll make me a mutex
66Mutex_create
- malloc new mutex variable (struct)
- status UNLOCKED
- Pointer to thread holding lock
67Mutex_create
I just made a mutex!
68Alarm (i.e. a timer interrupt)
Thread_yield
- If no other threads
- return
- Else
- Make current thread's status READY (if not DONE or BLOCKED)
- Make next thread RUNNING
- Context switch current with next
69Timer Interrupt (Alarm)
70Mutex_lock
- If mutex is already locked
- Mark thread as blocked
- Note in TCB mutex being waited on
- Thread_yield
- Else
- Make mutex locked
- Point mutex at thread holding lock
71Lock Mutex
I have the mutex!
72Timer Interrupt (Alarm)
74Try to Lock Mutex (Fails)
- If mutex is already locked
- Mark thread as blocked
- Note in TCB mutex being waited on
- Thread_yield
- Else
- Make mutex locked
- Point mutex at thread holding lock
I want the mutex!
75Timer Interrupt (Alarm)
76Timer Interrupt (Alarm)
77Timer Interrupt (Alarm)
Note: the yield routine can check each blocked thread to see whether its mutex is still locked
Still waiting!
78Timer Interrupt (Alarm)
79Timer Interrupt (Alarm)
Guess I don't need the mutex anymore
80Mutex_unlock
- Check to see if the thread trying to unlock is the holder of the mutex
- Set mutex status to UNLOCKED
81Unlocks Mutex
82Timer Interrupt (Alarm)
Since mutex is unlocked we can change status to
RUNNING
83Same Timer Interrupt (Alarm)
Since mutex is unlocked we can change status to
RUNNING
84Same Timer Interrupt (Alarm)
And give the mutex to the requestor
85Timer Interrupt (Alarm)
I'm done!
86Thread_exit
- Make status DONE
- Thread_yield
87Thread Exits
88Demo
- Sample pthreads program
- (we won't go through this in class; it'd get long and tedious)
89Operation
Scoreboard
- main starts up operation
- producer thread
- Locks scoreboard
- Waits for row to be available
- Marks row as in use
- Unlocks scoreboard
- Fills buffer row with random ints
- Locks scoreboard
- Marks row as sortable
- Unlocks scoreboard
Row   Status
0     AVAILABLE
1     AVAILABLE
2     AVAILABLE
3     AVAILABLE
4     AVAILABLE
5     AVAILABLE
...   ...
#define AVAILABLE 0
#define FILLING   1
#define SORTABLE  2
#define SORTING   3
90Operation
Scoreboard
- main starts up operation
- producer thread
- Locks scoreboard
- Waits for row to be available
- Marks row as in use
- Unlocks scoreboard
- Fills buffer row with random ints
- Locks scoreboard
- Marks row as sortable
- Unlocks scoreboard
Row   Status
0     FILLING
1     AVAILABLE
2     AVAILABLE
3     AVAILABLE
4     AVAILABLE
5     AVAILABLE
...   ...
91Operation
Scoreboard
- main starts up operation
- producer thread
- Locks scoreboard
- Waits for row to be available
- Marks row as in use
- Unlocks scoreboard
- Fills buffer row with random ints
- Locks scoreboard
- Marks row as sortable
- Unlocks scoreboard
Row   Status
0     SORTABLE
1     AVAILABLE
2     AVAILABLE
3     AVAILABLE
4     AVAILABLE
5     AVAILABLE
...   ...
92Operation
Scoreboard
- sorter threads
- Lock scoreboard
- Wait for row to be sortable
- Mark row as sorting
- Unlock scoreboard
- Sort row (using bubblesort)
- Lock scoreboard
- Mark row as available
Row   Status
0     SORTABLE
1     AVAILABLE
2     AVAILABLE
3     AVAILABLE
4     AVAILABLE
5     AVAILABLE
...   ...
93Operation
Scoreboard
- sorter threads
- Lock scoreboard
- Wait for row to be sortable
- Mark row as sorting
- Unlock scoreboard
- Sort row (using bubblesort)
- Lock scoreboard
- Mark row as available
Row   Status
0     SORTING
1     AVAILABLE
2     AVAILABLE
3     AVAILABLE
4     AVAILABLE
5     AVAILABLE
...   ...
94Operation
Scoreboard
- sorter threads
- Lock scoreboard
- Wait for row to be sortable
- Mark row as sorting
- Unlock scoreboard
- Sort row (using bubblesort)
- Lock scoreboard
- Mark row as available
- Runnable as user level threads or kernel level
(lwp/thread).
Row   Status
0     AVAILABLE
1     AVAILABLE
2     AVAILABLE
3     AVAILABLE
4     AVAILABLE
5     AVAILABLE
...   ...
95[Figure: the scoreboard tracks 10 row buffers of 1000 ints each, filled by one producer thread and sorted by 7 sorter threads]
96
    /* td -- Thread Demo */
    #define _REENTRANT          /* must come before the system headers */
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define SIZE      1000      /* Size of sort buffers         */
    #define SCOUNT    7         /* Number of sorters            */
    #define ROWS      10        /* Number of buffers            */
    #define STEPS     22        /* Buffers full of data to sort */
    #define NONEFOUND -1        /* Used in searching            */
97
    /* Allowable states for row buffers (in scoreboard) */
    enum {AVAILABLE, FILLING, SORTABLE, SORTING};
    enum {NO, YES};

    static int data[ROWS][SIZE];   /* The buffers                      */
    static int available;          /* Num of buffers available to fill */
    static int sortable;           /* How many are avail to sort       */
    static int scoreboard[ROWS];   /* Row access scoreboard            */
    static int run;                /* Flag used to shutdown gracefully */
98
    /* Scoreboard mutex lock */
    static pthread_mutex_t scorelock;
    /* The producer can work! */
    static pthread_cond_t pcanwork;
    /* A sorter can work! */
    static pthread_cond_t sorterworkavail;

    /* Function prototypes */
    static void *producer(void *arg);
    static void *sorter(void *arg);
    void sort(int target);
Creating necessary mutex and condition variables
99Threads have id numbers!
    int main(int argc, char *argv[])
    {
        pthread_t producer_id;
        pthread_t sorter_id[SCOUNT];
        pthread_attr_t attr;
        int i;

        available = ROWS;
        sortable = 0;
        run = YES;
100
        pthread_attr_init(&attr);
        /* This binds thread to lwp allowing kernel to
           schedule thread (will allow threads to run on
           different CPUs) */
        pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
        pthread_mutex_init(&scorelock, NULL);
        pthread_cond_init(&pcanwork, NULL);
        pthread_cond_init(&sorterworkavail, NULL);
        for (i = 0; i < ROWS; i++)
            scoreboard[i] = AVAILABLE;

        if (argc == 1)   /* No concurrency */
            pthread_create(&producer_id, NULL, producer, NULL);
        else
            pthread_create(&producer_id, &attr, producer, NULL);
101
        for (i = 0; i < SCOUNT; i++)
        {
            if (argc == 1)
                pthread_create(&sorter_id[i], NULL, sorter, NULL);
            else
                pthread_create(&sorter_id[i], &attr, sorter, NULL);
        }
        printf("main> All threads running\n");
        pthread_join(producer_id, NULL);
102
        /* After the producer is finished we send signals
           to all sorters to wake up and see that they
           should quit */
        for (i = 0; i < SCOUNT; i++)
        {
            pthread_cond_signal(&sorterworkavail);
        }
        for (i = 0; i < SCOUNT; i++)
        {
            pthread_join(sorter_id[i], NULL);
        }
        printf("Normal Termination\n");
        return 0;
    }
103This is the loop which controls the total number of rows we process
    static void *producer(void *arg)
    {
        int pcount;
        int target;
        int i;

        for (pcount = 0; pcount < STEPS; pcount++)
        {
            pthread_mutex_lock(&scorelock);
            while (available == 0)
            {
                pthread_cond_wait(&pcanwork, &scorelock);
            }
            target = NONEFOUND;
            for (i = 0; i < ROWS; i++)
            {
                if (scoreboard[i] == AVAILABLE)
                {
                    target = i;
                    available = available - 1;
                    scoreboard[target] = FILLING;   /* mark row as in use */
                    break;
                }
            }
104
            pthread_mutex_unlock(&scorelock);
            if (target == NONEFOUND)
            {
                printf("Producer cannot find"
                       " available row!\n");
                pthread_exit(NULL);
            }
            printf("p> Filling row %d\n", target);
            for (i = 0; i < SIZE; i++)
            {
                data[target][i] = rand();
            }
            printf("p> Row %d complete\n", target);
            pthread_mutex_lock(&scorelock);
            scoreboard[target] = SORTABLE;
            sortable = sortable + 1;
            pthread_mutex_unlock(&scorelock);
            pthread_cond_signal(&sorterworkavail);
        }
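105This means that we can quit once we finish all sorting
        /* Set the flag under the lock so a sorter cannot miss the change
           between testing run and calling pthread_cond_wait */
        pthread_mutex_lock(&scorelock);
        run = NO;
        pthread_mutex_unlock(&scorelock);
        return NULL;
        /* pthread_exit(NULL); */
    }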
106
    static void *sorter(void *arg)
    {
        int i;
        int target;
        pthread_t me;

        me = pthread_self();
        while (1)
        {
            pthread_mutex_lock(&scorelock);
            while (sortable == 0 && run == YES)
            {
                pthread_cond_wait(&sorterworkavail, &scorelock);
            }
107
            /* If the producer says stop and there is no
               work...exit */
            if (run == NO && available == ROWS)
            {
                printf("S> %lx Exiting..."
                       "prod done no filled rows\n",
                       (unsigned long)me);
                pthread_mutex_unlock(&scorelock);
                pthread_exit(NULL);
            }
            target = NONEFOUND;
            for (i = 0; i < ROWS; i++)
            {
                if (scoreboard[i] == SORTABLE)
                {
                    target = i;
                    sortable = sortable - 1;
                    scoreboard[target] = SORTING;
                    break;
                }
            }
108
            if (target == NONEFOUND)
            {
                /* We get here if the producer is finished
                   and some rows are being sorted but
                   none are available for sorting */
                printf("S> %lx couldn't find thread to "
                       "sort.\n", (unsigned long)me);
                pthread_mutex_unlock(&scorelock);
                pthread_exit(NULL);
            }
            pthread_mutex_unlock(&scorelock);
            printf("S> %lx starting...\n", (unsigned long)me);
            sort(target);
            printf("S> %lx finishing min %d max %d\n",
                   (unsigned long)me, data[target][0],
                   data[target][SIZE-1]);
109
            pthread_mutex_lock(&scorelock);
            scoreboard[target] = AVAILABLE;
            available = available + 1;
            pthread_mutex_unlock(&scorelock);
            pthread_cond_signal(&pcanwork);
        }
    }
110
    void sort(int target)
    {
        int outer;
        int inner;
        int temp;

        for (outer = SIZE - 1; outer > 0; outer--)
        {
            for (inner = 0; inner < outer; inner++)
            {
                if (data[target][inner] > data[target][inner+1])
                {
                    temp = data[target][inner];
                    data[target][inner] = data[target][inner+1];
                    data[target][inner+1] = temp;
                }
            }
        }
    }
(Bubble sort)
111A short bit on networking
1123 kinds of networks
- Massively Parallel Processor (MPP) network
- Typically connects 1000s of nodes over a short distance
- Often banks of computers
- Used for high performance/scientific computing
- Local Area Network (LAN)
- Connects 100s of computers, usually over a few kms
- Most traffic is 1-to-1 (between client and server)
- While MPP traffic is spread over all nodes
- Used to connect workstations together (like in Fitz)
- Wide Area Network (WAN)
- Connects computers distributed throughout the world
- Used by the telecommunications industry
113Some basics
- Before we go into some specific details, we need to define some terms
- Done within the context of a very simple network
- Sending something from Machine A to Machine B, each connected by unidirectional wires
- A lot of this may be review for some of you, but just bear with me for a bit
[Figure: Machine A and Machine B connected by unidirectional wires]
114Some basics prepping a message
- If Machine A wants data from Machine B, it 1st must send a request to B with the address of the data it wants
- Machine B must then send a reply with the data
- Again, overhead starts to rear its ugly head
- In this simple case we need extra bits of data to detect if a message is a new request or a reply to a request
- Kept in header or footer usually
- Software is also involved with the whole process
- How?
115What does the software do?
- Well, for starters, it's everywhere
- Software must translate requests for reads and replies into messages that the network can handle
- A big reason is processes
- Network is shared by 2 computers with different processes
- Must make sure the right message goes to the right process; the OS does this
- This information can be/is included in the header: more overhead
116What does the software do?
- Software also helps with reliability
- SW adds a checksum to each message and acknowledges messages whose checksum is good
- Makes sure that no bits were flipped in transmission, for example
- Also makes sure messages are not lost in transit
- Often done by setting a timer: if no acknowledgement by time x, message is resent
117Sending and receiving a message and SW
- Sending a message
- Application copies data to be sent into an OS buffer
- OS will
- Calculate a checksum, put it in header/trailer, start timer
- OS sends data into network interface HW and tells HW to send
- Receiving a message (almost the reverse of sending)
- System copies data from NW interface HW into OS buffer
- System checks checksum field
- If checksum OK, receiver acknowledges receipt
- If not, message deleted (sender resends after a time)
- If data OK, copy data to user address space; done
- What about the sender?
- If data is good, data deleted from buffer; if time-out, resend
118Performance parameters
- Bandwidth
- Maximum rate at which interconnection network can propagate data once a message is in the network
- Usually headers, overhead bits included in calculation
- Units are usually in megabits/second, not megabytes
- Sometimes see throughput
- Network bandwidth delivered to an application
- Time of Flight
- Time for 1st bit of message to arrive at receiver
- Includes delays of repeaters/switches + length / (m × speed of light) (m determines property of transmission material)
- Transmission Time
- Time required for message to pass through the network
- = size of message divided by the bandwidth
119More performance parameters
- Transport latency
- Time of flight + transmission time
- Time message spends in interconnection network
- But not overhead of pulling out or pushing into the network
- Sender overhead
- Time for µP to inject a message into the interconnection network, including both HW and SW components
- Receiver overhead
- Time for µP to pull a message out of interconnection network, including both HW and SW components
- So, total latency of a message is
- Total latency = Sender overhead + Time of flight + (Message size / Bandwidth) + Receiver overhead
120An example
- Consider a network with the following parameters
- Network has a bandwidth of 10 Mbit/sec
- We're assuming no contention for this bandwidth
- Sending overhead = 230 µsec, receiving overhead = 270 µsec
- We want to send a message of 1000 bytes
- This includes the header
- It will be sent as 1 message (no need to split it up)
- What's the total latency to send the message to a machine
- 100 m apart (assume no repeater delay for this)
- 1000 km apart (not realistic to assume no repeater delay)
121An example, continued
- We'll use the facts that
- The speed of light is 299,792.5 km/s
- Our m value is 0.5
- This means we have a good fiber optic or coaxial cable (more later)
- Let's use: Total latency = Sender overhead + Time of flight + (Message size / Bandwidth) + Receiver overhead
- If the machines are 100 m apart
- If the machines are 1000 km apart
122Some more odds and ends
- Note from the example (with regard to longer distance)
- Time of flight dominates the total latency component
- Repeater delays would factor significantly into the equation
- Message transmission failure rates rise significantly
- It's possible to send other messages with no responses from previous ones
- If you have control of the network
- Can help increase network use by overlapping overheads and transport latencies
- Can simplify the total latency equation to
- Total latency ≈ Overhead + (Message size / Bandwidth)
- Leads to
- Effective bandwidth = Message size / Total latency
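As a worked instance of the simplified formulas, take the earlier example's numbers and assume the overheads are 230 µs and 270 µs, with a 1000-byte (8000-bit) message on a 10 Mbit/s network:

```latex
\begin{align*}
\text{Total latency} &\approx \text{Overhead} + \frac{\text{Message size}}{\text{Bandwidth}}
  = (230 + 270)\,\mu\text{s} + \frac{8000\ \text{bits}}{10\ \text{Mbit/s}}
  = 500\,\mu\text{s} + 800\,\mu\text{s} = 1300\,\mu\text{s} \\
\text{Effective bandwidth} &= \frac{\text{Message size}}{\text{Total latency}}
  = \frac{8000\ \text{bits}}{1300\,\mu\text{s}} \approx 6.15\ \text{Mbit/s}
\end{align*}
```

So even with no time of flight, fixed overhead leaves this message with roughly 62% of the raw 10 Mbit/s.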