System Architecture
(INFT12-212 and 72-212)
Lab Notes for Week 12: POSIX Threads and Synchronisation
1 Introduction to POSIX Threads
We saw in the lecture that a thread is a computational unit within
a process, and shares the process address space with zero or more
other threads. In Unix and Linux, the most common thread API is POSIX
threads, or Pthreads. Much of the following I've borrowed from
POSIX Threads Programming by Blaise Barney.
Pthreads are mapped onto the existing Unix process model. All threads
share the same program machine code, global data area and heap,
but each thread is allocated its own stack. Each thread can thus call
functions, and each function call will have its own local variables.
(Figure not reproduced here. And yes, in it the text should be below
the data and the data should be below the heap!)
Each pthread:
- Exists within a process and uses the process resources.
- Has its own independent flow of control as long as its parent process
exists and the OS supports it.
- Duplicates only the essential resources it needs to be independently
schedulable.
- May share the process resources with other threads that act equally
independently (and dependently).
- Is terminated when its parent process terminates.
- Is "lightweight" because most of the overhead has already been
accomplished through the creation of its process.
And because threads within the same process share resources:
- Changes made by one thread to shared system resources (such as closing
a file) will be seen by all other threads.
- Two pointers having the same value (in one or more pthreads) will
point to the same data, as all pthreads share the same address space.
- Reading and writing to the same memory locations is possible, and
therefore requires explicit synchronization by the programmer.
The functions which comprise the Pthreads API can be informally grouped
into four major groups:
- Thread management: Functions that work directly on threads:
creating, detaching, joining, etc. They also include functions to
set/query thread attributes (joinable, scheduling etc.)
- Mutexes: Functions that deal with synchronization through
a "mutex", which is an abbreviation for "mutual exclusion".
Mutex functions provide for creating, destroying, locking and unlocking
mutexes. These are supplemented by mutex attribute functions that
set or modify attributes associated with mutexes.
- Condition variables: Functions that address communications
between threads that share a mutex based upon programmer specified
conditions. This group includes functions to create, destroy, wait
and signal based upon specified variable values. Functions to set/query
condition variable attributes are also included.
- Synchronization: Functions that manage read/write locks and
barriers.
2 Pthread Management
pthread_create (thread,attr,start_routine,arg)
When a process starts, it has exactly one thread of execution. pthread_create()
is used to create a new thread in the process. The arguments are:
- thread: An opaque, unique identifier for the new thread returned by
the subroutine.
- attr: An opaque attribute object that may be used to set thread attributes.
You can specify a thread attributes object, or NULL for the default
values.
- start_routine: the C routine that the thread will execute once it
is created.
- arg: A single argument that may be passed to start_routine. It must
be passed by reference, as a pointer cast to type void *. NULL may be
used if no argument is to be passed.
The maximum number of threads that may be created by a process is
implementation dependent. Once created, threads are peers and may
create other threads. There is no implied hierarchy or dependency
between threads.
pthread_exit (status)
There are several ways in which a thread may be terminated:
- The thread returns normally from its starting routine. Its work is
done.
- The thread makes a call to the pthread_exit() function - whether
its work is done or not.
- The thread is canceled by another thread via the pthread_cancel()
function.
- The entire process is terminated by a call to either exec() or
exit().
- main() finishes first, without calling pthread_exit() explicitly
itself.
The pthread_exit() function allows the programmer to specify an optional
termination status parameter. This optional parameter is typically
returned to threads "joining" the terminated thread (not covered
in these notes). In functions that execute to completion normally,
you can often dispense with calling pthread_exit() - unless, of
course, you want to pass the optional status code back.
2.1 Example Pthread Program
Download this C program which creates and destroys some pthreads:
pthreadhello.c. Read through the
code. The process starts in main(), and creates several threads. Each
thread starts up in the PrintHello() function, where it prints out
the details of its thread-id and then pthread_exit()s. The main()
thread also does a pthread_exit().
To compile this program on Linux, you need to use this compile command:
cc -o pthreadhello -pthread pthreadhello.c
The extra -pthread argument tells the compiler to compile and link
the program with Pthreads support.
Once the program is compiled, run it and see what it does.
3 Synchronisation and Condition Variables
We don't have time to look at all the things that you can do with
pthreads, so if you are interested in these you need to explore on
your own. However, we will look at condition variables.
Condition variables provide one way for threads to synchronize. While
mutexes implement synchronization by controlling thread access to
data, condition variables allow threads to synchronize based upon
the actual value of data. Without condition variables, the programmer
would need to have threads continually polling (possibly in a critical
section), to check if the condition is met. This can be very resource
consuming since the thread would be continuously busy in this activity.
A condition variable is a way to achieve the same goal without polling.
A condition variable is always used in conjunction with a mutex lock.
A representative sequence for using condition variables is shown below.
Main Thread
===========
- Declare and initialize global data/variables which require
synchronization (such as "count")
- Declare and initialize a condition variable object
- Declare and initialize an associated mutex
- Create threads A and B to do work
Thread A
========
- Do work up to the point where a certain condition must occur
(such as "count" must reach a specified value)
- Lock associated mutex and check value of a global variable
- Call pthread_cond_wait() to perform a blocking wait for signal from
Thread-B. Note that a call to pthread_cond_wait() automatically and
atomically unlocks the associated mutex variable so that it can be
used by Thread-B.
- When signalled, wake up. Mutex is automatically and atomically locked.
- Explicitly unlock mutex
- Continue
Thread B
========
- Do work
- Lock associated mutex
- Change the value of the global variable that Thread-A is waiting upon
- Check value of the global Thread-A wait variable. If it fulfills the
desired condition, signal Thread-A
- Unlock mutex
- Continue
Main Thread
===========
- Continue with rest of the work
In the above style of programming, Thread-B is known as a producer
as it is often producing data, and Thread-A is a consumer,
as it uses the data from the other thread, but it has to wait for
the other thread to make it.
3.1 Example Pthread Condition Program
Download this C program which uses a condition variable: condvar.c.
Read through the code. This simple example code demonstrates the use
of several Pthread condition variable routines. The main routine creates
three threads. Two of the threads perform work and update a "count"
variable. The third thread waits until the count variable reaches
a specified value.
Compile the program using the command:
cc -o condvar -pthread condvar.c
and then run the program. You should see this sort of output:
Starting watch_count(): thread 1
inc_count(): thread 2, count = 1, unlocking mutex
inc_count(): thread 3, count = 2, unlocking mutex
watch_count(): thread 1 going into wait...
inc_count(): thread 3, count = 3, unlocking mutex
inc_count(): thread 2, count = 4, unlocking mutex
inc_count(): thread 3, count = 5, unlocking mutex
inc_count(): thread 2, count = 6, unlocking mutex
inc_count(): thread 3, count = 7, unlocking mutex
inc_count(): thread 2, count = 8, unlocking mutex
inc_count(): thread 3, count = 9, unlocking mutex
inc_count(): thread 2, count = 10, unlocking mutex
inc_count(): thread 3, count = 11, unlocking mutex
inc_count(): thread 2, count = 12 Threshold reached. Just sent signal.
inc_count(): thread 2, count = 12, unlocking mutex
watch_count(): thread 1 Condition signal received.
watch_count(): thread 1 count now = 137.
inc_count(): thread 3, count = 138, unlocking mutex
inc_count(): thread 2, count = 139, unlocking mutex
inc_count(): thread 3, count = 140, unlocking mutex
inc_count(): thread 2, count = 141, unlocking mutex
inc_count(): thread 3, count = 142, unlocking mutex
inc_count(): thread 2, count = 143, unlocking mutex
inc_count(): thread 3, count = 144, unlocking mutex
inc_count(): thread 2, count = 145, unlocking mutex
Main(): Waited on 3 threads. Final value of count = 145. Done.
That's all we have time for with Pthreads, but for more information
you can read POSIX Threads Programming by Blaise Barney.
4 Deadlock Simulator
On the Modern Operating System Simulators
web page, there is a deadlock simulator for Linux. Install the program
on your Debian box in the Linux Lab. This should be as simple as:
- download the simulator tarball from this local copy
- tar vxzf deadlock.tgz
- cd deadlock
Once you have the deadlock simulator installed and working, read the
user manual.
At this point, Warren will have written some more notes to explain
what you should do here. If he hasn't, then you should show some initiative
and explore the simulator yourself! As a start, you can read the user
manual and do some of the exercises suggested there.