Most simple programs are linear in structure. They start executing at one point, and they continue executing one statement at a time, in a well-defined order, until they terminate. Here's a simple example, in Pascal:
Program foo;
var
  Name : String;
  Age  : Integer;
  Days : Integer;
Begin
  Writeln('Hello! What is your name? ');
  Readln(Name);
  Writeln('How many years old are you, ', Name, '?');
  Readln(Age);
  Days := Age * 365;
  Writeln('You are ', Days, ' days old.');
End.
However, some applications don't lend themselves to such linear structure. Consider a server which receives requests (for example, web-page retrieval requests) to which it must respond. The requests come asynchronously; they can come in bursts, or be evenly distributed. Furthermore, let us assume that the requests may take a long time to process relative to the time between requests. The naive implementation might have the following structure, in C pseudocode:
/** Linear server example */
int main(int argc, char **argv)
{
    while (running)
        if (RequestReceived())
            ProcessRequest(GetNextRequest());
        else
            wait();
    return 0;
}
What happens in the above example if one request takes an extraordinary length of time to process? Everyone else must wait in line for that lengthy request to complete. Quite a backlog could develop.
Consider a computer with multiple processors. The above example would only use a single processor; the other processors might sit idle! While we are processing one request on one processor, we could also be processing another request on another processor, in parallel. Even better, even if we don't have multiple physical processors at our disposal, we can pretend that we do by context switching, otherwise known as multitasking, a concept we should all be familiar with. This paradigm lets us have multiple threads of control in a single program. In other words, multiple separate sections may be executing "simultaneously". On a single-processor system, the threads might not actually be executing simultaneously, but we can make it appear as if they are by alternating between them -- give some processor time to thread A, then some to thread B, then to thread C, and so on. Furthermore, on most systems, threads are lightweight, meaning that the creation of additional threads is not a very expensive (burdensome, resource-consuming, lengthy) task. Our new server might look like this, in C pseudocode:
/** Multithreaded server example */
int main(int argc, char **argv)
{
    while (running)
        if (RequestReceived())
            CreateNewThread(ProcessRequest, GetNextRequest());
        else
            wait();
    return 0;
}
Our new server launches additional threads to process all incoming requests. This is cool -- the server is able to respond immediately to incoming requests and begin processing them. Consider the example of a lengthy request followed by several simpler requests. The server would be able to process these requests in parallel, and, although the CPU's time would still be divided amongst the n tasks, it might be able to finish one or more of the simpler requests while it is concurrently processing the longer request.
If requests really are coming in at a rate too great for the server to process, the background tasks will begin to accumulate; things will get slower and slower. Eventually, if the number of threads increases without bound, we'll run out of memory or some other resource. Bad things will happen. In a real server, we would use more complicated logic -- for instance, we might limit the number of request-processing threads to some finite number (eg, 10) and stipulate that requests received while the maximum number of processing threads has been reached are (a) refused, (b) put in line, or more likely, (c) a combination of (a) and (b). If you did not take these precautions, a malicious attacker could mount a denial of service attack simply by feeding your server requests at a rate too great for it to handle. A simple program can be written to demonstrate this vividly.
Okay, enough background. Let's see how we really create and manipulate threads. This discussion will focus on POSIX threads, which is the threads interface found on most unix systems. POSIX is the IEEE Portable Operating System Interface.
If your system supports POSIX threads, the _POSIX_THREADS macro will be defined in <unistd.h>. This macro can be used to include or exclude thread support at compile time:

#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    printf("Sample Program...\n");
#ifdef _POSIX_THREADS
    printf("This program was compiled with thread support.\n");
#else
    printf("This program was compiled without thread support.\n");
#endif
    return 0;
}

Thread support can also be tested at runtime by calling sysconf() with _SC_THREADS (defined in <unistd.h>) as an argument. sysconf() returns -1 if a feature is not supported, and a positive value (the supported version of the option) otherwise.

#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    printf("Your system %s support threads.\n",
           sysconf(_SC_THREADS) != -1 ? "does" : "does not");
    return 0;
}
The first step in writing a threaded application is to include the POSIX thread ("pthread") library. Include the pthreads header file in your C program:
#include <pthread.h>
And specify the pthread library on the command line of the C compiler (linker):
gcc mycode.c -lpthread
Most pthread functions begin with "pthread", and pthread_create() is no exception:

#include <pthread.h>

int pthread_create(pthread_t *thread,
                   const pthread_attr_t *attr,
                   void *(*start_func)(void *),
                   void *arg);

Creating a thread is surprisingly simple. The arguments to pthread_create() are as follows:
thread is a pointer to a pthread_t, into which pthread_create() puts the identifier of the newly created thread; attr is a pointer to an optional thread attribute structure, or NULL if you don't wish to provide any attributes; start_func is the function you wish to launch as a new thread; and arg is an argument for that function. It's important to note that start_func must be a function taking one void pointer as its only argument and returning a void pointer. More complex data types may be passed by typecasting.
pthread functions return 0 on success; a nonzero return value indicates an error.
The pthread_join() function waits for a thread to finish, if it hasn't exited already. pthread_join() then fetches the return value of the thread's start_func (a void pointer) and places it into a buffer:

#include <pthread.h>

int pthread_join(pthread_t thread, void **buffer);

buffer points to a location where the return value of the thread's start_func may be stored. If buffer == NULL, the return value is not stored anywhere and is lost. thread is the identifier of the thread whose exit you are waiting for; this value was obtained from pthread_create() via its argument of the same name.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>

void *background_task(void *argument)
{
    /* do something in the background */
    return NULL;
}

int main(void)
{
    pthread_t mythread;
    int result;

    if (sysconf(_SC_THREADS) == -1) {
        fprintf(stderr, "Sorry, this program requires threads, "
                        "which aren't supported here.\n");
        exit(1);
    }

    result = pthread_create(&mythread, NULL, background_task, NULL);
    if (result != 0) {
        fprintf(stderr, "For some reason I couldn't create a thread! "
                        "(error = %d)\n", result);
        exit(1);
    }

    pthread_join(mythread, NULL);
    return 0;
}
We've seen that concurrent programming lets us have multiple sections of our program execute simultaneously, and that this is cool. Many applications lend themselves well to concurrent implementations. In the real world, events are asynchronous and must be processed concurrently -- thus it makes sense for our programs to accept this challenge. However, concurrent execution introduces many possibilities for bugs that were not present with linear execution. These bugs can be avoided, but they must be addressed deliberately; the elimination of concurrency-related bugs should not be an afterthought. Concurrency issues must be addressed from the beginning and considered always. A simple example illustrates the hazards involved in concurrent programs. Consider the canonical example of a bank account, from which we may withdraw money:
/** Initial balance */
int balance = 10;

/**
 * Withdraw amount from the account. Returns the amount withdrawn,
 * or zero if there was insufficient balance in the account.
 */
int withdraw(int amount)
{
    if (balance >= amount) {
        balance -= amount;
        return amount;
    } else {
        printf("Insufficient funds!");
        return 0;
    }
}
This example looks benign enough. However, we must remember:
The order of execution of multiple threads is indeterminate.
Consider what would happen if two separate threads called withdraw(10) simultaneously.
Thread A:                        Thread B:
if (balance >= amount) {
                                 if (balance >= amount) {
    balance -= amount;
                                     balance -= amount;
    return (amount);
}
                                     return (amount);
                                 }
Both threads would successfully withdraw $10. The final balance would be -10. This doesn't look too bad, because, although we'd rather the bank customers not be able to withdraw more money than they really have, their final balance is negative, so we do know that they really did withdraw more money than they had. But, this isn't always the case. Consider the statement
balance -= amount;
We have no guarantee that this statement is executed atomically. It is really made up of smaller steps (foo might be a CPU register):

foo = balance;
foo = foo - amount;
balance = foo;
It's easy to see that these elements could be executed in such a way
that the final balance is 0, not -10. Clearly, a means of ensuring
that particular sections of code (sometimes called critical
sections) are never executed by multiple threads simultaneously
is necessary!
Situations where the outcome of a process depends upon the (non-predetermined) thread execution order are called race conditions.
When we want to prevent a section of code from being executed by two threads simultaneously, we usually use what's called a mutual exclusion lock, or mutex. Mutexes are used to eliminate race conditions.
Think of a mutex as an object upon which you may perform two operations. The first operation is acquire: the mutex's acquire function waits until the mutex is not locked, then locks the mutex and returns. The second operation is release, which releases the current lock on the mutex. Each critical section of code, or critical resource, needs a mutex. The mutex guarantees that the acquire and release operations are atomic, meaning that they will not be interrupted. Here's an example, in pseudocode:
int balance = 10;
Mutex account_mutex;

int withdraw(int amount)
{
    account_mutex.acquire();
    if (balance >= amount) {
        balance -= amount;
        account_mutex.release();
        return amount;
    } else {
        printf("Insufficient funds!");
        account_mutex.release();
        return 0;
    }
}
If two threads simultaneously call withdraw(), it is guaranteed that
one will successfully acquire (lock) the mutex first. This thread will
then be able to run through the withdraw() function until it releases
the mutex. During this period of time, the other thread is waiting (we
say that it is blocked) to acquire the mutex. When the first
thread releases the mutex, the second thread will then be able to acquire
the mutex and continue. Thus, the critical section of withdraw(), namely the (balance >= amount) condition and the balance -= amount statement, is protected.
pthreads provides the following mutex functions:
int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *attr);
int pthread_mutex_lock(pthread_mutex_t *);
int pthread_mutex_trylock(pthread_mutex_t *);
int pthread_mutex_unlock(pthread_mutex_t *);
int pthread_mutex_destroy(pthread_mutex_t *);
The pthread_mutex_t datatype is an opaque type representing a mutex. pthread_mutex_init() creates a mutex, and pthread_mutex_destroy() destroys one. The pthread_mutexattr_t parameter specifies options for the mutex; a NULL value for this parameter specifies that the default attributes be used. For the other possibilities, see the pthread_mutex_init manpage.
pthread_mutex_lock() acquires the mutex, blocking until the mutex is acquired. pthread_mutex_unlock() releases the mutex. pthread_mutex_trylock() tries to lock the mutex, but does not block if the mutex is already locked; in that case, it returns an error condition. Our example becomes:
#include <stdio.h>
#include <pthread.h>

pthread_mutex_t account_mutex;
int balance = 10;

int withdraw(int amount)
{
    pthread_mutex_lock(&account_mutex);
    if (balance >= amount) {
        balance -= amount;
        pthread_mutex_unlock(&account_mutex);
        return amount;
    } else {
        pthread_mutex_unlock(&account_mutex);
        printf("Insufficient funds!");
        return 0;
    }
}

int main(int argc, char **argv)
{
    pthread_mutex_init(&account_mutex, NULL);
    /* .... */
    pthread_mutex_destroy(&account_mutex);
    return 0;
}
Note that we really should check the return values of every library function. The manpages specify that the pthread_mutex_* functions return 0 on success, or a nonzero error code on failure. Checking and responding to error conditions is important for resilient programs. Such error checks were omitted from the above example for clarity. At the simplest, each library function can be wrapped with assert, eg:

#include <assert.h>
....
assert( 0 == pthread_mutex_unlock(&account_mutex) );

This won't provide a graceful recovery, but it will prevent a program from continuing in a perhaps undefined manner after an error condition, and will alert you to the exact location of the failure. Otherwise the problem is likely to creep up at an unrelated later point in execution and prove very difficult to debug. (Beware that asserts disappear entirely if NDEBUG is defined at compile time, so this technique is only suitable during development.) In a real program, more graceful error recovery is necessary. C++, Java, Object Pascal, etc, provide exceptions, which make error-recovery code less intrusive, but in straight C we're stuck with if (library_call error condition) { complain }.
We've seen that we need to be careful when using threads, to ensure that race conditions do not occur and that critical sections are properly protected. Even if we are very careful, problems can still creep up: standard library functions are not necessarily thread-safe. Yes, that means that library functions might have critical sections that should be protected but aren't. What can you do? Check to see if the library you want to use is thread-safe; it should say somewhere in the library's documentation, eg, the manpages or the source code. If it's a simple library and you have the source, you can look and see whether it's thread-safe, although it's always possible that you'll overlook something. If the documentation doesn't say anything about thread safety, you can always use a different library. Or you could wrap library functions in mutexes.
The Win32 API supports threads, and if you're unfortunate enough to be required to write software for Windows, you'll either be using Win32 directly or using some wrapper that eventually calls Win32. Here's the Win32 function to create a thread, CreateThread:
HANDLE CreateThread(
LPSECURITY_ATTRIBUTES lpThreadAttributes, // pointer to security attributes
DWORD dwStackSize, // initial thread stack size
LPTHREAD_START_ROUTINE lpStartAddress, // pointer to thread function
LPVOID lpParameter, // argument for new thread
DWORD dwCreationFlags, // creation flags
LPDWORD lpThreadId // pointer to receive thread ID
);
Here's an example of how it is used. Most of the parameters can be set to 0 and
default values will be used instead, although caveats apply. Notably lpThreadId
cannot be NULL in Windows 95 or 98.
#include <windows.h>
#include <stdio.h>
DWORD CALLBACK thread_function(void *myname) {
while (1) {
printf("My name is %s.\n",myname);
Sleep(100);
}
return 0;
}
int main(int argc, char **argv) {
HANDLE h = CreateThread(NULL,0,thread_function,"A",0,NULL);
HANDLE j = CreateThread(NULL,0,thread_function,"B",0,NULL);
CloseHandle(h); // We don't need to keep track of the threads,
CloseHandle(j); // so we can free the handles.
Sleep(1000);
return 0; // the threads are terminated when main() exits
}