New GSI Analysis System GO4

Experiences with the root TThread class

Status: 20.07.1999 / Marc Hemberger, Jörn Adamczewski / GO4 GSI Darmstadt

We tested on SMP machines running Debian (Linux kernel 2.0.36) with the TPosixThread implementation (pthreads) and root 2.21 (recompiled with the flag -DR__THREAD). In the following, we refer to this unchanged version of threaded root as standard root. A summary of modifications, conclusions, and examples can be found at the end.

1.1 Threads and main program are root-CINT Macros

Thread functions and the function main() (which creates the TThread instances) are defined in one macro (threads.C), which is loaded within the root interpreter (.L threads.C); then main() is called.

The libraries /usr/lib/libpthread.so and libThread.so have to be loaded in the main function (using gSystem->Load);
Threads start correctly;
Debugging TThread::Call shows that iPointerType=4, i.e. case G__BYTECODEFUNC is used (section 1.4).

1.2 Main program with Thread call is root-CINT Macro, Threads defined in so-library

Example: task2new.C, libthread1new.so, libthread2new.so, libthread3new.so. The libraries are compiled with a root dictionary; global variables and function names are declared with #pragma link C++ global / function; all globals are defined within the first library.

1.2.1 General

Thread names are recognized and prompted correctly;
Thread functions work;
Debugging of TThread::Call shows that iPointerType=2, i.e. case G__COMPILEDINTERFACEMETHOD is used (section 1.4);
Thread functions do not start before the end of the calling macro (i.e. TApplication::Run is then invoked by the interpreter); this may be a problem if another task is to be run in the same macro that calls the thread.

Possible solutions:

One macro only starts the threads; after threads are running, further macros can be processed in CINT
Thread definitions and the thread starting function tstart() are placed within a shared library; on loading the library, tstart() can be called (or runs automatically). Then CINT can run further macros.
Tasks to be executed in the main function while threads are running in the background are handled by TTimer objects, e.g. histograms filled by threads are updated in the canvas by TTimer::Notify every second

Influence of foreground macros: tested with a TBenchmark in each thread, measuring rtime/cputime of a calculation loop that lies outside TMutex::Lock(). As expected, threads are slowed down in real time by intensive foreground applications (e.g. hsum.C): 1.5 s instead of 1.0 s.

1.2.2 Threads stability and locks

TThread::Lock() uses its own class member fgMainMutex; this mutex does not lock against root X-actions: method XARequest uses class member fgXActMutex instead

TThread::Printf() uses XARequest, which automatically locks fgXActMutex but not fgMainMutex. The sequence

TThread::Lock()
printf(...), cout << ...
TThread::UnLock()

is therefore not the same mechanism: it locks fgMainMutex only.

Apart from fgXActMutex and fgMainMutex, a user TMutex instance myMutex can be created and applied (myMutex=new TMutex; myMutex->Lock(); myMutex->UnLock()).
Creation of objects and allocation of memory by means of operator new have to be locked within threads (via TThread::Lock() or myMutex->Lock() and the corresponding UnLock()). Otherwise conflicts between threads occur during allocation, causing segmentation violations.
Remark: TBenchmark can affect the thread stability if constructed many times. An example with repeated TBenchmark construction ended after < 20 minutes (Fatal in operator new: storage exhausted); without TBenchmark, stable operation for > 46 hours was tested. This is probably caused by the TBenchmark constructor (which contains operator new, see the preceding item), which was not locked in this example.

1.2.3 Global Objects

Simple variables (example thread1.C) or TObjects: access without mutex lock from different threads causes no segmentation violations, but access should be locked to keep the content consistent if different threads both read and write. For example, in threadfill.cxx (a global histogram filled from 2 threads) no segmentation violation occurs with or without mutex lock, but filling the same histogram bin from two threads without locking can cause a loss of entries.
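The loss of entries can be illustrated without root. The following is a minimal sketch in standard C++, assuming that std::mutex plays the role of TThread::Lock()/UnLock() and that one counter stands in for one histogram bin. The names SharedBin, AddEntry, FillBin and FillFromTwoThreads are hypothetical and do not come from threadfill.cxx.

```cpp
#include <atomic>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for one histogram bin shared by two threads.
struct SharedBin {
    std::atomic<long> entries{0};
    std::mutex lock;  // plays the role of TThread::Lock()/UnLock()
};

// Read the bin, then write back the incremented value. The two steps are
// separate, so another thread can slip in between them and its entry is lost.
void AddEntry(SharedBin& bin) {
    long current = bin.entries.load();
    bin.entries.store(current + 1);
}

// Add n entries; the read-modify-write is protected only if useLock is true.
void FillBin(SharedBin& bin, long n, bool useLock) {
    for (long i = 0; i < n; ++i) {
        if (useLock) {
            std::lock_guard<std::mutex> guard(bin.lock);
            AddEntry(bin);
        } else {
            AddEntry(bin);
        }
    }
}

// Fill the same bin from two threads and return the final content.
long FillFromTwoThreads(long perThread, bool useLock) {
    SharedBin bin;
    std::thread t1(FillBin, std::ref(bin), perThread, useLock);
    std::thread t2(FillBin, std::ref(bin), perThread, useLock);
    t1.join();
    t2.join();
    return bin.entries.load();
}
```

With useLock set to true the result is always twice perThread. With useLock set to false the result can be smaller, and by how much varies from run to run.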

1.2.4 Graphics / TCanvas operations

Example: If several threads work on the same canvas (or the same pad, respectively), segmentation violations up to program breakdowns generally occur; they can be provoked by X-actions (mouse motion and clicks);
One single thread working on a global canvas also crashes.

There is no safe way to use threads together with root graphics: all canvas operations must be handled by the main program (e.g. by means of a TTimer to update graphics); threads can access the non-graphic global objects (section 1.2.3), so histograms can be filled by background threads and drawn by the main program.

1.2.5 Synchronisation, Usage of TCondition

A TCondition instance requires a TMutex to lock its Wait and TimedWait methods; one can pass the address of an external mutex to the TCondition constructor (this mutex should not be used elsewhere). If NULL is passed, TCondition creates and uses an internal mutex.

TCondition::Wait() waits until any thread sends a signal on the same condition instance: myCondition1.Wait() reacts to myCondition1.Signal() or myCondition1.Broadcast(); myCondition2.Signal() has no effect.
If several threads wait for the signal of the same TCondition myCondition1, only the first thread reacts to myCondition1.Signal(); to activate myCondition1.Wait() within the second thread another Signal() is required, etc.
If several threads wait for the signal of the same TCondition myCondition1, myCondition1.Broadcast() activates all threads waiting for myCondition1 at once. Remark: in some tests only the first thread's Wait() was activated (depending on whether myCondition1 had been signalled before).
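The difference between Signal() and Broadcast() can be sketched with std::condition_variable, whose notify_one() and notify_all() are used here as stand-ins for the two TCondition methods. The ticket counter is an addition of this sketch and not part of TCondition: it makes a wake-up independent of whether a thread was already waiting when the signal was sent. All names (Gate, Waiter, SignalOne, BroadcastAll, RunSignals, RunBroadcast) are hypothetical.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: one condition shared by several waiting threads.
struct Gate {
    std::mutex mutex;
    std::condition_variable cond;
    int tickets = 0;  // how many waiters may proceed
    int woken = 0;    // how many waiters have proceeded
};

// Body of a waiting thread: blocks until a ticket is available.
void Waiter(Gate& gate) {
    std::unique_lock<std::mutex> lock(gate.mutex);
    gate.cond.wait(lock, [&gate] { return gate.tickets > 0; });
    --gate.tickets;
    ++gate.woken;
}

// Analogue of Signal(): lets exactly one waiter proceed.
void SignalOne(Gate& gate) {
    {
        std::lock_guard<std::mutex> lock(gate.mutex);
        ++gate.tickets;
    }
    gate.cond.notify_one();
}

// Analogue of Broadcast(): lets n waiters proceed at once.
void BroadcastAll(Gate& gate, int n) {
    {
        std::lock_guard<std::mutex> lock(gate.mutex);
        gate.tickets += n;
    }
    gate.cond.notify_all();
}

// Start nWaiters threads and release them with one signal per thread.
int RunSignals(int nWaiters) {
    Gate gate;
    std::vector<std::thread> threads;
    for (int i = 0; i < nWaiters; ++i) threads.emplace_back(Waiter, std::ref(gate));
    for (int i = 0; i < nWaiters; ++i) SignalOne(gate);
    for (auto& t : threads) t.join();
    return gate.woken;
}

// Start nWaiters threads and release them with a single broadcast.
int RunBroadcast(int nWaiters) {
    Gate gate;
    std::vector<std::thread> threads;
    for (int i = 0; i < nWaiters; ++i) threads.emplace_back(Waiter, std::ref(gate));
    BroadcastAll(gate, nWaiters);
    for (auto& t : threads) t.join();
    return gate.woken;
}
```

Checking a shared state variable under the mutex, as the predicate does here, is the usual way to avoid depending on the order of Wait() and Signal() calls.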

myCondition1.TimedWait(secs,nanosecs) waits for myCondition1 until the absolute time in seconds and nanoseconds since the beginning of the epoch (January 1st, 1970) is reached. To use a relative timeout delta, one needs the absolute time at the beginning of the wait. Using root objects:

ULong_t now,then,delta; // seconds; delta is the relative timeout chosen by the caller
TDatime myTime; // root daytime class
myTime.Set(); // myTime set to "now"
now=myTime.Convert(); // to seconds since 1970
then=now+delta; // absolute timeout
Int_t wait=myCondition1.TimedWait(then,0); // waiting

The return value Int_t wait of myCondition1.TimedWait should be 0 if myCondition1.Signal() was received, and nonzero if the timeout was reached. In the standard root version, a wrong check of PthreadDraftVersion in PosixThreadInc.h caused an unsuitable definition of ERRNO(), so TimedWait() returned zero in any case. After changes within PosixThreadInc.h the return values were correct.
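The conversion from a relative to an absolute timeout, and the expected return values, can also be shown at the pthreads level. The following is a minimal sketch using pthread_cond_timedwait directly; it is not the TPosixCondition source. The helper names TimedFlag, SendSignal and WaitRelative are hypothetical.

```cpp
#include <errno.h>
#include <pthread.h>
#include <time.h>

// Hypothetical helper bundling a mutex, a condition and a "signalled" flag.
struct TimedFlag {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    bool signalled;

    TimedFlag() : signalled(false) {
        pthread_mutex_init(&mutex, nullptr);
        pthread_cond_init(&cond, nullptr);
    }
    ~TimedFlag() {
        pthread_cond_destroy(&cond);
        pthread_mutex_destroy(&mutex);
    }
};

// Set the flag and wake one waiting thread.
void SendSignal(TimedFlag& flag) {
    pthread_mutex_lock(&flag.mutex);
    flag.signalled = true;
    pthread_cond_signal(&flag.cond);
    pthread_mutex_unlock(&flag.mutex);
}

// Wait at most deltaSeconds for the flag.
// Returns 0 if the signal was received and ETIMEDOUT if the timeout was reached.
int WaitRelative(TimedFlag& flag, long deltaSeconds) {
    timespec then;
    clock_gettime(CLOCK_REALTIME, &then);  // "now" since the epoch
    then.tv_sec += deltaSeconds;           // absolute timeout

    pthread_mutex_lock(&flag.mutex);
    int rc = 0;
    while (!flag.signalled && rc == 0) {
        rc = pthread_cond_timedwait(&flag.cond, &flag.mutex, &then);
    }
    const bool gotSignal = flag.signalled;
    pthread_mutex_unlock(&flag.mutex);
    return gotSignal ? 0 : rc;
}
```

The loop around pthread_cond_timedwait guards against wake-ups that occur without a signal, so a return value of 0 always means that SendSignal was called.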
The TSemaphore class instantiates its own TMutex, TCondition, and an integer as counter. TSemaphore(Int_t n) creates a semaphore with initial counter value n. TSemaphore::Post() (resp. TSemaphore::Wait()) calls Signal() (resp. Wait()) of its own TCondition if the counter is zero. The counter is always incremented by the method Post and decremented by the method Wait. The TMutex is used as lock.
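How a semaphore can be composed of a mutex, a condition and a counter is sketched below in standard C++. This is the textbook construction and not the TSemaphore source, so details such as when Signal() is called may differ from the root class. The class name CountingSemaphore is hypothetical.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical counting semaphore built from a mutex, a condition and a counter.
class CountingSemaphore {
public:
    explicit CountingSemaphore(int initial) : fValue(initial) {}

    // Block while the counter is zero, then decrement it.
    void Wait() {
        std::unique_lock<std::mutex> lock(fMutex);
        fCond.wait(lock, [this] { return fValue > 0; });
        --fValue;
    }

    // Increment the counter and wake one thread blocked in Wait().
    void Post() {
        {
            std::lock_guard<std::mutex> lock(fMutex);
            ++fValue;
        }
        fCond.notify_one();
    }

    // Current counter value (for inspection only).
    int Value() {
        std::lock_guard<std::mutex> lock(fMutex);
        return fValue;
    }

private:
    std::mutex fMutex;
    std::condition_variable fCond;
    int fValue;
};
```

A semaphore created with the value n lets n calls of Wait() pass immediately; each further Wait() blocks until another thread calls Post().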

1.2.6 Using TSocket with TThreads

Example: server4.cxx and client4.cxx, started from a shared library with the root macro cliserv4.C: server and client work as threads in one root process. Every second the client fills a buffer array and a histogram with numbers, then sends the buffer array via TSocket::SendRaw() and the histogram via TSocket::Send(TMessage* mess). The server receives them with the corresponding TSocket::RecvRaw() and TSocket::Recv(mess) calls; buffer contents and histogram are displayed on the console. This works.
Examples based on hclient.C and hserv.C from the tutorials: 2 clients create one histogram each, which is sent via socket; histogram copies are drawn into a canvas by the server. In this case problems with the socket communication occurred if clients and server were threads within the same process (e.g. often one client starts correctly and the second client never does; the program also crashes without graphics display by the server). However, if the clients are threads in root session 1 and the server is a thread in root session 2 (different processes), the socket communication works (although the usual problems with graphics in TThreads occur when the server thread tries to display the histogram). If the server runs as main program in process 2 and the clients run as threads in process 1, everything works as well as in the tutorial examples.
Further example: server3.cxx runs the server as a standalone program (process 1); client3.cxx and hist3.cxx define two threads in a library, called by the root macro cliserv3.C (process 2). hist creates and fills two global histograms, client sends the histograms into the socket, and the server displays them. The threads are synchronised by means of a global flag (it tells clientsock that clienthist has finished; then the socket is closed).

1.2.7 Exceptions within Threads

Example situation (thread1new.so, thread2new.so, thread3new.so): an exception class hierarchy is defined in the shared library libthreadexcept.so, compiled with -frtti (!) and -fexceptions; this library is linked against the library in which the thread1 function runs with its try and catch blocks; the root macro that calls the threads loads the exception library (gSystem->Load("libthreadexcept.so")) as well as the thread libraries.

thread1 contains a try block with explicit throws of derived exception classes, and catch blocks (in the example, only the exception base class is caught, which handles all derived exceptions; these can be identified by their own console "message").
Exceptions are caught correctly.
Exception-handler of thread1 can terminate all other threads and thread1 itself (see section 1.2.8 for details of thread cancellation).
After an exception within a thread and the subsequent termination of all threads, one further unix process stays alive apart from the root process (for 3 threads: 5 root.exe processes while the threads are running, 2 processes after thread cancellation); without threads there is just 1 root.exe process (besides, there is always the mother process of the root session with the different process name root).
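Catching only the base class while still telling the derived exceptions apart by their message can be sketched as follows. The class names ThreadException, TimeoutException and ManyTimeoutsException and the function RunStep are hypothetical; they are not the classes of libthreadexcept.so.

```cpp
#include <string>

// Hypothetical exception hierarchy: one base class, two derived classes.
class ThreadException {
public:
    virtual ~ThreadException() {}
    virtual std::string Message() const { return "thread exception"; }
};

class TimeoutException : public ThreadException {
public:
    std::string Message() const override { return "timeout"; }
};

class ManyTimeoutsException : public ThreadException {
public:
    std::string Message() const override { return "many timeouts"; }
};

// One step of a thread function: throws a derived exception depending on
// failureKind and catches it through a reference to the base class.
std::string RunStep(int failureKind) {
    try {
        if (failureKind == 1) throw TimeoutException();
        if (failureKind == 2) throw ManyTimeoutsException();
        return "ok";
    } catch (const ThreadException& e) {
        // The virtual call identifies the derived class that was thrown.
        return e.Message();
    }
}
```

Catching by reference is essential here: catching the base class by value would slice the object, and every exception would report the base class message.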

Remark: Generally, exceptions should not be used within interpreted code (root CINT macros).

1.2.8 TThreads Cancellation and CleanUp

To cancel a TThread, cancellation has to be enabled by SetCancelOn(); with SetCancelOff() (default) an external termination is forbidden.
There are two modes: SetCancelAsynchronous() allows termination immediately and at any time (this causes problems if the thread uses e.g. external functions or holds resources when cancelled); SetCancelDeferred() allows termination only at certain points, which can be set with CancelPoint(); default cancel points are e.g. Wait() and TimedWait() calls.
The methods TThread::Kill("threadname") or TThread::Delete(tpoint) cancel the TThread* tpoint with the name "threadname".
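At the pthreads level, deferred cancellation looks roughly like the sketch below. It assumes that SetCancelOn(), SetCancelDeferred() and CancelPoint() correspond to pthread_setcancelstate, pthread_setcanceltype and pthread_testcancel; this mapping is an assumption of the sketch and was not checked against the TPosixThread source. The names Worker and StartAndCancel are hypothetical.

```cpp
#include <pthread.h>
#include <unistd.h>

// Thread function: enables deferred cancellation and loops until cancelled.
static void* Worker(void*) {
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, nullptr);   // assumed analogue of SetCancelOn()
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, nullptr);  // assumed analogue of SetCancelDeferred()
    for (;;) {
        pthread_testcancel();  // explicit cancel point, assumed analogue of CancelPoint()
        usleep(1000);          // simulated work
    }
    return nullptr;  // never reached
}

// Start the worker, request its cancellation and wait for it to end.
// Returns true if the thread ended because it was cancelled.
bool StartAndCancel() {
    pthread_t tid;
    if (pthread_create(&tid, nullptr, Worker, nullptr) != 0) return false;
    pthread_cancel(tid);  // takes effect at the next cancel point
    void* result = nullptr;
    pthread_join(tid, &result);
    return result == PTHREAD_CANCELED;
}
```

With deferred cancellation the thread can only end at a cancel point, so resources acquired between two cancel points can be released in an orderly way.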

There is a stack of pointers to functions that are to be executed after thread termination. Using CleanUpPush((void*) &func, (void*) &arg) a function func with argument arg (arg might be e.g. the pointer to executing TThread) is pushed on the stack; with CleanUpPop(int exe) the last function is popped from the stack; if exe=1, this function is run (if exe=0 it is not run!).

TThread::CleanUp is for Posix the same as CleanUpPop(1)
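The behaviour of such a cleanup stack can be sketched in a few lines. This is an illustration of the push and pop semantics described above and not the TThread implementation; the names CleanupStack, CleanupFn and MarkDone are hypothetical.

```cpp
#include <vector>

// Signature of a cleanup function: it receives one untyped argument.
typedef void (*CleanupFn)(void*);

// Hypothetical cleanup stack with the semantics of CleanUpPush / CleanUpPop.
class CleanupStack {
    struct Item {
        CleanupFn fn;
        void* arg;
    };
    std::vector<Item> fItems;

public:
    // Push the function fn with its argument arg onto the stack.
    void Push(CleanupFn fn, void* arg) {
        Item item = {fn, arg};
        fItems.push_back(item);
    }

    // Pop the last function; run it only if exe is nonzero.
    // Returns false if the stack was empty.
    bool Pop(int exe) {
        if (fItems.empty()) return false;
        Item item = fItems.back();
        fItems.pop_back();
        if (exe) item.fn(item.arg);
        return true;
    }

    // Same as Pop(1), following the remark on TThread::CleanUp above.
    bool CleanUp() { return Pop(1); }
};

// Example cleanup function: counts how often it was run.
void MarkDone(void* arg) { *static_cast<int*>(arg) += 1; }
```

Pop(0) discards the function without running it, which is why a CleanUpPop(0) at the end of a thread function suppresses the pushed handler.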

The function TThread::AfterCancel(tpoint) is pushed to the cleanup stack by default before thread start (in TThread::Fun). The standard root TThread implementation, however, disabled execution of AfterCancel (CleanUpPop(0) after function termination in TThread::Fun). We changed this: now the last function on the stack is executed automatically after thread cancellation. Explicitly calling CleanUpPush and CleanUpPop with a user function of one's own also works.
The TThread cleanup stack does not use the cleanup stack mechanism of the pthreads library (this is commented out in TPosixThread).
Although pthreads cease after Kill(), Delete(), and also after the normal end of the thread function (the Unix processes are gone), the TThread instance TThread* tpoint exists until the delete tpoint command in the macro that created the TThread.

1.3 Main function root CINT macro, TThreads defined and called in so-library

Shared libraries (libtstart.so, libthread1new.so, libthread2new.so) are loaded by root CINT; the function tstart() from libtstart.so, which creates and starts the threads, is called from CINT.

Threads start running on the return from tstart();
Thread names are not recognized (see below);
Segmentation violations occur and there is no text output from the threads;
It makes no difference whether the root dictionaries of the thread libraries are linked against libtstart.so;
When tstart() is called directly within libtstart.so, threads start running on loading the library and then show the same behaviour as above.

Explanation: Since the thread functions are called from compiled code, the problem is the same as described in the following section 1.4.

1.4 Main function, TThread definitions and start entirely compiled as executable

In the standard root it was not possible to launch threads without segmentation violation; to achieve this, changes of the TThread-class were necessary.

Situation before the changes:

The thread name in the message "Thread... requested/is running" is empty;
Any running thread (even functions doing "nothing") causes the messages break:segmentation violation or illegal instruction on stderr;
This is independent of variations of TThread constructors and thread function declarations (no difference between the constructors for joinable and detached TThreads).

Explanation: To call the thread function, TThread::Call uses the method G__p2f2funcname from CINT_shl.c to derive the function name fname (as string*) from the thread function pointer p2f; fname is then used to determine the pointer iPointer2Function by means of the CINT classes G__ClassInfo and G__MethodInfo; this latter pointer is used for the thread function call, depending on the situation of the TThread creation (case of iPointerType).

Debugging shows that G__p2f2funcname returns zero (fname = 0) if the TThread is started from a compiled function; if the TThread is started from the CINT interpreter, one gets the real function name fname.
If fname = 0, then iPointer2Function and iPointerType are not defined (this was the mistake!); although in case of an unknown pointer type the function iPointer2Function is called directly, the thread crashes since iPointer2Function points nowhere.

Therefore we changed the TThread class: before fname is checked, iPointer2Function is always set to the function pointer p2f which was passed to the TThread constructor; if fname = 0 (compiled case), p2f is called directly; otherwise iPointer2Function is set to the appropriate value and the interpreted thread function is executed.
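The logic of this change can be reduced to the following sketch. It is not the modified TThread::Call source: the interpreter lookup is left out entirely, and the names ThreadFn, CallPath, CallResult, CallThreadFunction and Echo are hypothetical. Only the order of the steps follows the description above.

```cpp
// Signature of a thread function.
typedef void* (*ThreadFn)(void*);

// Which branch was taken for the call.
enum CallPath { kDirectCall, kInterpretedCall };

struct CallResult {
    CallPath path;
    void* value;
};

// p2f is the pointer passed to the thread constructor; fname is the result of
// the name lookup and is null when the thread was created from compiled code.
CallResult CallThreadFunction(ThreadFn p2f, const char* fname, void* arg) {
    // Initialise the pointer first, so that it never "points nowhere".
    ThreadFn pointer2Function = p2f;

    CallResult result;
    if (fname == nullptr) {
        // Compiled case: no name was found, so call p2f directly.
        result.path = kDirectCall;
        result.value = pointer2Function(arg);
        return result;
    }

    // Interpreted case: the real code resolves fname through the interpreter
    // here; this sketch simply keeps the pointer it already has.
    result.path = kInterpretedCall;
    result.value = pointer2Function(arg);
    return result;
}

// Example thread function: returns its argument unchanged.
void* Echo(void* arg) { return arg; }
```

The essential point is the first assignment: once the pointer has a valid default, a failed name lookup leads to a direct call and no longer to a crash.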

Further changes: To identify the TThread object by name (using TNamed::GetName) if G__p2f2funcname fails, we defined variants of the TThread(..) constructors. These expect a thread name (char* ) as first argument, the following arguments are the same as for the TThread constructors for detached or joinable threads, respectively. This name is set for the TThread object instead of the G__p2f2funcname return value.

After these changes, the problems described in sections 1.3 and 1.4 were solved.

Note: Threads are requested immediately on the call of TThread::Run() from a compiled program (the corresponding Unix processes then exist, as checked with ps); however, the thread function is not executed (i.e. TThread does not display its "running" message) before TApplication::Run() or gSystem->Run() is called. This is similar to the situation when the thread is called from an interpreter macro: the thread functions do not start before the end of the macro (section 1.2).

Summary of modifications to standard root TThread classes:

Based on version 2.21/04, we introduced the following changes in the TThread classes and related files (we greatly appreciate the work and help of Victor Perevoztchnikov!):

Makefile: set compiler flag -DR__THREAD for compiling ROOT. This steers the compilation process in GPAD_Canvas.cxx, G_VirtualPad.cxx and TCanvas.h, TVirtualPad.h and G2_LinkDef.h for building the libGraf.so.
THREAD_Thread.cxx, TThread.h:
CleanUpPush: modified argument list
CleanUpPop(0) changed to CleanUpPop(1) to handle cleanup stack
method TThread::Call() modified to get pointer to function in thread correctly in compiled code
two additional ctors added to give a thread a specific name; additional private member fNamed introduced
THREAD_PosixCondition.h:
#ifdef PthreadDraftVersion modified
PosixThreadInc.h:
#ifdef PthreadDraftVersion modified (usage of errno)
GPAD_Canvas.cxx: call to Constructor(), Destructor() method in TCanvas-class introduced when calling the ctors for TCanvas.
included a separate Makefile for the libThread.so
created a dummy pthread.h, which is necessary because CINT does not interpret the pthread header file: the include had to be changed in TPosixCondition.h, TPosixMutex.h and TPosixThread.h
created LinkDef-header-file for CINT

Conclusions:

The root TThread classes have been tested in the described configuration. After some changes it is now possible to launch and run pthreads as instances from root environment.
The operator new must be locked within threads by the user to avoid conflicts on allocating memory. Root objects, however, might use the new operator in their constructor and their methods implicitly. As a consequence, any root class method might not be thread safe.
TCanvas operations are not possible in threads.

Examples:

threads.C: Macro to load within root interpreter. Contains thread functions thr1, thr2 which increment global and local counters and print contents on console. Call of function main() after loading starts threads.

thread1.C, thread1.h, thread2.C, thread2.h : sources to compile shared libraries libthread1.so and libthread2.so. Each library defines a thread function which increments global counter and local counter and prints contents on console. Threads are started by macro .x task2new.C from root interpreter.

thread1new.C, thread1new.h, thread2new.C, thread2new.h, thread3new.C, thread3new.h: sources to compile shared libraries libthread1new.so, libthread2new.so, libthread3new.so. Each library defines a thread function. The example shows communication between thread1, thread2, thread3 by means of the TCondition methods Signal(), Wait(), and TimedWait(): thread2 and thread3 wait for the signal myCondition (mySync, resp.) from thread1; on getting the signal, they acknowledge by the signals myAckn (myAckn2, resp.). thread1 uses TimedWait() to receive these acknowledge conditions (in function signal_ackn_wait). This procedure is repeated in a loop, with a sleep of 5 secs in thread1; thus the other threads will wait for thread1. Exceptions are used with an own exception class hierarchy defined in libthreadexcpt.so; if a timeout of the acknowledge signal occurs, an exception is thrown and caught by thread1. On exception, thread1 cancels the other threads; this demonstrates the CleanUp stack and the SetCancel methods. Since the Sleep time of thread2 is increased with its counter in this example, a timeout exception will occur for count>45. Change Sleep(millisecs) in thread2 or thread3 to values greater than 10000 ms to get a many_time_outs exception at once.
usage: after making the shared library, execute macro task2new.C from root interpreter to load this and all related libraries and to start threads

hist3.cxx, hist3.h, client3.cxx, client3.h: sources to compile shared libraries hist3.so, and client3.so. Each library defines a thread function. This example is based on the hserv.C / hclient.C macros in root distribution tutorials. Here client thread sends both global histograms hpx, hpxpy over two sockets to server, which is a separate program without threads server3.cxx. This displays histograms on root graphics. The histograms are filled by thread hist (from hist3.cxx).
usage: start compiled program server3.cxx in one shell; start root session, execute macro cliserv3.C to load shared libraries and to start histogram and client threads.

server4.cxx, server4.h, client4.cxx, client4.h: sources to compile shared libraries server4.so and client4.so. Each library defines a thread function. The example shows data transfer on a socket between two threads. The client thread sends a raw data array and a histogram to the server thread every second. The server thread receives the data and prints contents and histogram summary on the console (no graphics used here).
usage: after making the shared library, execute macro cliserv4.C from root interpreter to load this and all related libraries and to start threads.

threadfill.cxx, threadfill.h: sources to compile an executable program which defines two threads. The function main (no thread) draws a canvas with two pads and creates two histograms (double and float precision); threads thrd1 and thrd2 are started; each thread fills both histograms: thrd1 fills values 3 and 0, thrd2 fills values -3 and 0; the number of filling events can be set by the second command line parameter. The main program uses the TTimer object grUpdate to redraw the histograms every 1000 ms. The histogram contents are displayed at the end of the main function; note that float histogram bins reach their limits after ~1.7e+7 events. If the histogram fill is not locked with TThread::Lock/UnLock (parameter gLock=0, switched with command line argument 1), the zero bin, which should contain the sum of the value -3 bin and the value 3 bin, will lose entries.
usage: "threadfill gLock #EVENTS"
gLock=1 : enable histogram Mutex lock (default); gLock=0 : disable histogram Mutex lock
with #EVENTS number of events to fill (optional); threads and graphics update timer will start; after threads have finished, use canvas window menu "File--QuitRoot" to stop update timer and view histogram contents and thread status


GSI Helmholtzzentrum für Schwerionenforschung, GSI
Planckstr. 1, 64291 Darmstadt, Germany
For all questions and ideas contact: J.Adamczewski@gsi.de or S.Linev@gsi.de
Last update: 27-11-13.