System Architecture

(INFT12-212 and 72-212)

Lab Notes for Week 12: POSIX Threads and Synchronisation

1 Introduction to POSIX Threads

We saw in the lecture that a thread is a computational unit within a process, and shares the process address space with zero or more other threads. In Unix and Linux, the most common thread API is POSIX threads, or Pthreads. Much of the following I've borrowed from POSIX Threads Programming by Blaise Barney.

Pthreads are mapped onto the existing Unix process model. All threads share the same program machine code, global data area and heap, but each thread is allocated its own stack. Each thread can thus call functions independently, and each function call will have its own local variables.

(Figure not reproduced here. In it, the text segment should be below the data, and the data should be below the heap.)

Each pthread:

Exists within a process and uses the process resources.
Has its own independent flow of control as long as its parent process exists and the OS supports it.
Duplicates only the essential resources it needs to be independently schedulable.
May share the process resources with other threads that act equally independently (and dependently).
Dies if its process terminates, for example when any thread calls exit().
Is "lightweight" because most of the overhead has already been accomplished through the creation of its process.

And because threads within the same process share resources:

Changes made by one thread to shared system resources (such as closing a file) will be seen by all other threads.
Two pointers having the same value (in one or more pthreads) will point to the same data, as all pthreads share the same address space.
Reading and writing to the same memory locations is possible, and therefore requires explicit synchronization by the programmer.

The functions which comprise the Pthreads API can be informally grouped into four major groups:

Thread management: Functions that work directly on threads - creating, detaching, joining, etc. They also include functions to set/query thread attributes (joinable, scheduling etc.)
Mutexes: Functions that deal with synchronization through a "mutex", which is an abbreviation for "mutual exclusion". Mutex functions provide for creating, destroying, locking and unlocking mutexes. These are supplemented by mutex attribute functions that set or modify attributes associated with mutexes.
Condition variables: Functions that address communications between threads that share a mutex based upon programmer specified conditions. This group includes functions to create, destroy, wait and signal based upon specified variable values. Functions to set/query condition variable attributes are also included.
Synchronization: Functions that manage read/write locks and barriers.

2 Pthread Management

 pthread_create (thread,attr,start_routine,arg)

When a process starts, it has exactly one thread of execution. pthread_create() is used to create a new thread in the process. The arguments are:

thread: An opaque, unique identifier for the new thread returned by the subroutine.
attr: An opaque attribute object that may be used to set thread attributes. You can specify a thread attributes object, or NULL for the default values.
start_routine: the C routine that the thread will execute once it is created.
arg: A single argument that may be passed to start_routine. It must be passed by reference, as a pointer cast to type void *. NULL may be used if no argument is to be passed.

The maximum number of threads that may be created by a process is implementation dependent. Once created, threads are peers and may create other threads. There is no implied hierarchy or dependency between threads.

pthread_exit (status)

There are several ways in which a thread may be terminated:

The thread returns normally from its starting routine. Its work is done.
The thread makes a call to the pthread_exit() function - whether its work is done or not.
The thread is canceled by another thread via the pthread_cancel() function.
The entire process is terminated by a call to either exec() or exit().
The main() function returns first without itself calling pthread_exit(), which terminates the whole process.

The pthread_exit() function allows the programmer to specify an optional termination status parameter. This optional parameter is typically returned to threads "joining" the terminated thread (not covered in these notes). In functions that execute to completion normally, you can often dispense with calling pthread_exit() - unless, of course, you want to pass the optional status code back.

2.1 Example Pthread Program

Download this C program which creates and destroys some pthreads: pthreadhello.c. Read through the code. The process starts in main(), and creates several threads. Each thread starts up in the PrintHello() function, where it prints out the details of its thread-id and then pthread_exit()s. The main() thread also does a pthread_exit().

To compile this program on Linux, you need to use this compile command:

  cc -o pthreadhello  -pthread  pthreadhello.c

The extra -pthread argument tells the compiler and linker to build the program with Pthreads support.

Once the program is compiled, run it and see what it does.

3 Synchronisation and Condition Variables

We don't have time to look at all the things that you can do with pthreads, so if you are interested in these you need to explore on your own. However, we will look at condition variables.

Condition variables provide one way for threads to synchronize. While mutexes implement synchronization by controlling thread access to data, condition variables allow threads to synchronize based upon the actual value of data. Without condition variables, the programmer would need to have threads continually polling (possibly in a critical section), to check if the condition is met. This can be very resource consuming since the thread would be continuously busy in this activity. A condition variable is a way to achieve the same goal without polling.

A condition variable is always used in conjunction with a mutex lock. A representative sequence for using condition variables is shown below.

Main Thread
===========
Declare and initialize global data/variables which require synchronization (such as "count")
Declare and initialize a condition variable object 
Declare and initialize an associated mutex 
Create threads A and B to do work

Thread A                                                        Thread B
========                                                        ========
Do work up to the point where a certain condition must occur    Do work
   (such as "count" must reach a specified value)               Lock associated mutex
Lock associated mutex and check value of a global variable      Change the value of the global variable
Call pthread_cond_wait() to perform a blocking wait               that Thread-A is waiting upon
   for signal from Thread-B. Note that a call to                Check the value of the global variable.
   pthread_cond_wait() automatically and atomically               If it fulfills the desired
   unlocks the associated mutex variable so that it               condition, signal Thread-A
   can be used by Thread-B.
When signalled, wake up. Mutex is automatically and
  atomically locked.
Explicitly unlock mutex                                         Unlock mutex 
Continue                                                        Continue

Main Thread
===========

Continue with rest of the work

In the above style of programming, Thread-B is known as a producer, as it often produces data. Thread-A is a consumer: it uses the data from the other thread, but it has to wait for the other thread to make it.

3.1 Example Pthread Condition Program

Download this C program which uses a condition variable: condvar.c. Read through the code. This simple example code demonstrates the use of several Pthread condition variable routines. The main routine creates three threads. Two of the threads perform work and update a "count" variable. The third thread waits until the count variable reaches a specified value.

Compile the program using the command:

  cc -o condvar  -pthread  condvar.c

and then run the program. You should see this sort of output:

Starting watch_count(): thread 1
inc_count(): thread 2, count = 1, unlocking mutex
inc_count(): thread 3, count = 2, unlocking mutex
watch_count(): thread 1 going into wait...
inc_count(): thread 3, count = 3, unlocking mutex
inc_count(): thread 2, count = 4, unlocking mutex
inc_count(): thread 3, count = 5, unlocking mutex
inc_count(): thread 2, count = 6, unlocking mutex
inc_count(): thread 3, count = 7, unlocking mutex
inc_count(): thread 2, count = 8, unlocking mutex
inc_count(): thread 3, count = 9, unlocking mutex
inc_count(): thread 2, count = 10, unlocking mutex
inc_count(): thread 3, count = 11, unlocking mutex
inc_count(): thread 2, count = 12  Threshold reached. Just sent signal.
inc_count(): thread 2, count = 12, unlocking mutex
watch_count(): thread 1 Condition signal received.
watch_count(): thread 1 count now = 137.
inc_count(): thread 3, count = 138, unlocking mutex
inc_count(): thread 2, count = 139, unlocking mutex
inc_count(): thread 3, count = 140, unlocking mutex
inc_count(): thread 2, count = 141, unlocking mutex
inc_count(): thread 3, count = 142, unlocking mutex
inc_count(): thread 2, count = 143, unlocking mutex
inc_count(): thread 3, count = 144, unlocking mutex
inc_count(): thread 2, count = 145, unlocking mutex
Main(): Waited on 3 threads. Final value of count = 145. Done.

That's all we have time for with Pthreads, but for more information you can read POSIX Threads Programming by Blaise Barney.

4 Deadlock Simulator

On the Modern Operating System Simulators web page, there is a deadlock simulator for Linux. Install the program on your Debian box in the Linux Lab. This should be as simple as:

download the simulator tarball from this local copy
tar vxzf deadlock.tgz
cd deadlock

Once you have the deadlock simulator installed and working, read the user manual.

At this point, Warren will have written some more notes to explain what you should do here. If he hasn't, then you should show some initiative and explore the simulator yourself! As a start, you can read the user manual and do some of the exercises suggested there.

File translated from TeX by TTH, version 3.85.
On 1 Jan 2012, 13:14.