NPTL (Native POSIX Threading Library) Tests and Trace



NPTL Test Campaign #2

~ Final Report ~


  

 Introduction

 

This report concludes the NPTL Tests and Trace project (http://nptl.bullopensource.org). It is designed as a tutorial on testing the NPTL, although the method is not restricted to NPTL and can be used to test any POSIX implementation. It uses information from our second test campaign to illustrate the tutorial, and it also summarizes our activity since the beginning of the project.

This report is focused on the Test activity of the project. You can find information about the NPTL Trace tool on the dedicated website (http://nptltracetool.sourceforge.net) and in the paper we presented at the Ottawa Linux Symposium (proceedings are available here: http://www.linuxsymposium.org/2005/). Note that that paper also deals with the Test activity.

 

 Conclusions on RHEL4 u2 NPTL Component.

 

Here is the list of problems the testsuite has revealed when run on a box powered by two 3GHz Hyperthreaded i686 CPUs. If you are interested in building the same list for another implementation, you can find a HOWTO in the next chapter of this report. The problems shown are not critical, but one should know that there are some divergences between NPTL (or, in many cases, the Linux kernel NPTL relies on) and the POSIX standard.

This list won't be updated, because it would mean a lot of work to keep current with both the NPTL evolution and the POSIX standard corrigenda, and we cannot afford this work.

Limited or missing features

Asynchronous I/O related features
  Related Test: aio_*/*, lio_*/*
  Description: All these tests deal with the AIO POSIX option, which is not fully supported in the base glibc.
  Comments: These tests can give random results, as they rely on machine load and other unpredictable behaviors.
Thread Priority Protection related features
  Related Test: pthread_mutex_getprioceiling/*, pthread_mutexattr_getprioceiling/*, pthread_mutexattr_getprotocol/*, pthread_mutexattr_setprioceiling/*, pthread_mutexattr_setprotocol/*, pthread_h/2-2.c
  Description: These features are also not supported yet in the base glibc.
  Comments: The tests won't link.
Process Sporadic Server
  Related Test: sched_h/*, sched_setparam/*, sched_setscheduler/*
  Description: The Sporadic Server scheduling model is not supported in Linux.
  Comments: The tests dealing with the SCHED_SPORADIC symbol or associated data won't build.
Typed Memory Objects
  Related Test: sys/mman_h/*
  Description: The glibc does not support these options.
  Comments: The tests dealing with symbols such as POSIX_TYPED_MEM_* won't build.
SA_RESTART flag ignored
  Related Test: sigaction/16-*.c
  Description: These tests check that the SA_RESTART flag behaves as expected.
  Comments: This has already been reported, and the report was ignored by the maintainers.

Problematic behaviors

Modification after the end of mmap'ed area
  Related Test: mmap/11-4.c, mmap/11-5.c
  Description: These testcases check whether a modification after the end of an mmap'ed area is written out (POSIX requires that it shall not be).
  Comments: This issue has to be reported.
File's st_ctime and st_mtime are not updated when mmap'ed memory is changed.
  Related Test: mmap/14-1.c
  Description: This test checks that the st_mtime and st_ctime fields of a file are updated when the file is mmap'ed and the corresponding memory is changed.
  Comments: This issue has to be reported.
mmap succeeds when memory locking should be unavailable
  Related Test: mmap/18-1.c
  Description: Tests whether an mmap operation succeeds when a limit has been set on memory locking and new memory is locked automatically.
  Comments: The feature used to set the limit on memory locking (RLIMIT_MEMLOCK) is not described in POSIX. The tested feature is part of the Process Memory Locking option in POSIX. This issue has to be reported.
mmap error when memory area is bigger than the mmap'ed object
  Related Test: mmap/28-1.c
  Description: Create an object, then mmap more memory than the object size.
  Comments: Linux does not return an error. This issue has to be reported.
Error checking in sched_setscheduler
  Related Test: sched_setscheduler/19-5.c
  Description: Tests that an invalid policy cannot be passed to sched_setscheduler.
  Comments: This issue has to be reported.
Thread scheduling and read-write locks.
  Related Test: pthread_rwlock_rdlock/2-{1,2}.c, pthread_rwlock_unlock/3-1.c
  Description: These tests check several situations where POSIX specifies which thread shall acquire the lock.
  Comments: The Linux / NPTL behavior does not obey the POSIX requirements about thread scheduling. This issue has to be reported.
Process scheduling and semaphores
  Related Test: sem_post/8-1.c
  Description: This test checks that higher-priority processes get the semaphore first.
  Comments: It looks like the priority is not taken into account. This issue has to be reported.

Solved or not-critical problems

Timer with CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID clock
  Related Test: timer_create/10-1.c, timer_create/11-1.c
  Description: These tests create a timer using a CPU-time clock, when supported.
  Comments: timer_create returned an error during the test campaign, but it seems to behave fine on more recent Linux kernels. An issue may remain in the test (is the time spent in nanosleep() counted against the CPU-time clock?).
Timer overruns
  Related Test: timer_getoverrun/2-2.c
  Description: This test sets a timer on a masked signal, and then checks that timer_getoverrun returns the correct number of timer expirations.
  Comments: The test reports a failure, but I am not sure the test procedure is correct -- to be verified.
mmap error when no memory is available
  Related Test: mmap/24-2.c
  Description: Tests that mmap returns ENOMEM when the free address space is not sufficient.
  Comments: This issue seems to be solved in recent Linux kernels (tested with 2.6.14).
EOVERFLOW error support (large files extension)
  Related Test: mmap/31-1.c
  Description: This test checks that the EOVERFLOW error is returned.
  Comments: Currently we get an EINVAL error. This test may need the large file support extension.
Non-root user can see scheduling parameters of root's processes.
  Related Test: sched_getparam/6-1.c, sched_getscheduler/7-1.c
  Description: These tests check that a non-root user cannot query scheduling parameters of root's processes.
  Comments: POSIX does not specify whether this operation shall be forbidden, therefore this is not really a bug -- though it may be a security issue.
 

 How to run the tests

 

The testsuite we are using here is the Open POSIX Test Suite (http://posixtest.sourceforge.net/), which we extended and improved during our project (see the Results link at the left for more information). You will need this testsuite (get the archive from their website, or from CVS for the most recent fixes), the GNU make utility, and a C compiler (gcc is suitable). If you need to run the tests on a system without a C compiler, it is possible to compile the tests on one machine and then run them on another, but this is not covered here.

Here are the detailed steps for the Test Campaign 2 on RHAS4:

  1. Install a fresh RedHat Advanced Server 4 Update 2 system. Be sure to install the Development packages. You also need the nptl-devel package to be installed.
  2. Get a fresh CVS copy of the Open POSIX TestSuite:
    cvs -z3 -d:pserver:anonymous:@cvs.sourceforge.net:/cvsroot/posixtest co -P posixtestsuite

Now you only need to set up the flags in the testsuite, and you're ready. Just edit the LDFLAGS file and read the comments inside. To test the NPTL, you probably need to provide a special set of headers and a special library, as NPTL is not the default library on many Linux distributions (at least for the moment).

For RHAS4u2, the following flags are suitable:
-I /usr/include/nptl -L /usr/lib/nptl -D_XOPEN_SOURCE=600 -lpthread -lrt -lm

If you can't find these directories, you probably forgot to install the nptl-devel package.

To test the default thread library (for example, when testing on the Fedora Core 4 distribution), the following LDFLAGS file is suitable:
-D_XOPEN_SOURCE=600 -lpthread -lrt -lm

Now you can run the test suite. Be sure to run it as super-user, because some of the tests need special privileges. It is advised not to run the testsuite on a shared machine, because some of the tests use a lot of resources (CPU, I/O, RAM). You may need to increase the timeout duration in the Makefile if you have a large system (more than 1GB RAM, for example).

On our system, we changed the TIMEOUT parameter to a larger value (2400). This avoids false HUNG statuses, but a run can then take a very long time.

To run the tests, just issue a make command (as root). It will execute all conformance tests, and will save the results in a 'logfile' file. You can also run only a subset of the tests (refer to the test suite documentation for more information). Please note also that some tests will change the system clock of the tested system, and therefore this clock can be slightly desynchronized.

Total time for a run is around 25 min on our test machine (2x Xeon HyperThreaded 3GHz).

We also provided some scalability and stress tests. These tests are located in the stress/threads subdirectory.

For the stress tests (stress*.c), one can use the helper.c script to run all the tests together. This can be useful to stress the system globally, but won't provide very useful information. The other option is to run the tests one-by-one. A stress test will run forever until it fails or is killed with signal SIGUSR1.

For the scalability tests (s-c*.c), there are several compilation options available:

  • SCALABILITY_FACTOR: from 1 (default) to 5. A bigger value will use more resources on the box.
  • PLOT_OUTPUT: when defined, the testcase will output measurement data to be used by the do_plot script provided. This script uses gnuplot to generate graphs of the test execution. Please have a look at the forum to see some examples.

In any case, the scalability tests return a PASSED or FAILED status, determined by a least-squares analysis of the results. If the measurements are constant, the test is PASSED; otherwise it is FAILED.
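The PASSED/FAILED decision can be pictured as a straight-line fit. The sketch below is not the code used by the s-c*.c tests. It assumes that each measurement is a pair (load, duration), fits duration = slope * load + intercept by ordinary least squares, and treats the series as constant when the fitted line changes the duration by less than a tolerance over the measured range. The function names and the 10% tolerance are hypothetical.

```python
def least_squares_line(points):
    """Fit y = slope * x + intercept to (x, y) pairs by ordinary least squares."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def scalability_verdict(points, tolerance=0.10):
    """Return "PASSED" when the fitted trend is flat, "FAILED" otherwise.

    Hypothetical rule: the change predicted by the fitted line over the
    measured x range must stay below `tolerance` times the mean of y.
    """
    slope, _ = least_squares_line(points)
    xs = [x for x, _ in points]
    mean_y = sum(y for _, y in points) / len(points)
    drift = abs(slope) * (max(xs) - min(xs))
    return "PASSED" if drift <= tolerance * abs(mean_y) else "FAILED"
```

A real test would pick its own tolerance; the point is only that a flat fitted line means the measured cost does not grow with the load.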

No scalability or stress test has been run during Test Campaign 2. Please ask on our mailing-list if you need support.

 

 How to analyze the results

 

Once you have run the testsuite, the results are located in the 'logfile' file in the root folder of the testsuite. You may use whatever tool you want to read this file, or read it directly, but the file is rather large (350 KB) and tedious to go through. We have developed a tool to make the tester's life easier, called tslogparser (http://tslogparser.sourceforge.net). This tool provides a more user-friendly interface to access and filter the test results. The installation of tslogparser won't be discussed here -- refer to the INSTALL file in the package for more information. Basically, tslogparser is a web application based on LAMP (Linux, Apache, MySQL, PHP). Once the software is set up, there are two logical steps to store your results in the database:

  1. Save the testsuite description in the database, by providing tslogparser your testsuite archive (tar.gz format).
  2. Save your run results by providing tslogparser the generated 'logfile' file.

You can access our Test Campaigns results through the tslogparser interface by clicking on the Results link at the left.

During our analysis, we wanted to find out which test results were not reliable and which features were not POSIX compliant. For this purpose, we ran the test suite several times with constant parameters and stored each resulting logfile in the database. tslogparser then allowed us to check each run, compare the runs, and display only the differences.

This helped us find which tests produced non-constant results. In Test Campaign 2, we found several cases:

  • The aio_* test results should be ignored, as the glibc does not claim to provide a POSIX compliant AIO library. Moreover, these tests rely on the system load and can therefore give inconsistent results.
  • Some speculative tests can also give random results; this is nothing to worry about, as these tests exercise non-POSIX behaviors, which the user should not rely on.
  • Some tests sometimes return the UNRESOLVED status and sometimes another status. This is normal, by definition of the UNRESOLVED status; it happens, for example, when you're short on free memory. In this case, one should only consider the other status returned.
  • More problematic are tests which sometimes return PASSED and sometimes return FAILED. These tests need further attention. In our series, the problematic tests were: pthread_cond_init tests 1-2 and 2-2; pthread_detach test 4-3; and timer_settime test 9-2.
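The comparison between runs can be sketched in a few lines. This is not how tslogparser is implemented. It assumes each run has already been reduced to a mapping from test name to final status, and the names `unstable_tests` and `needs_attention` are hypothetical. Following the rules above, UNRESOLVED is set aside before comparing, and only tests that return both PASS and FAILED are flagged.

```python
def unstable_tests(runs):
    """Map each test whose status changes between runs to its set of statuses.

    `runs` is a list of dicts {test name: status}, one dict per run.
    """
    seen = {}
    for run in runs:
        for test, status in run.items():
            seen.setdefault(test, set()).add(status)
    return {test: statuses for test, statuses in seen.items() if len(statuses) > 1}


def needs_attention(runs):
    """Return the sorted names of tests that both PASS and FAILED across runs."""
    flagged = []
    for test, statuses in unstable_tests(runs).items():
        # UNRESOLVED is expected to be transient, so only the other statuses count.
        alternate = statuses - {"UNRESOLVED"}
        if {"PASS", "FAILED"} <= alternate:
            flagged.append(test)
    return sorted(flagged)
```

Keeping the two functions separate mirrors the two questions asked during the analysis: which results vary at all, and which of those variations point to a real problem.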

 

Then, we can have a look at a complete run log and examine the results. Here is the meaning of each status:

build FAILED
The test case failed to compile. You should have the compiler message available through tslogparser to find out what's wrong. This error can occur for example for unsupported POSIX optional features.
link SKIP
Some of the testcases are compile-only tests: the test succeeds when the compiler is able to compile, and it never tries to link. This status is therefore to be interpreted as 'test PASS'.
execution UNSUPPORTED
Some tests compile cleanly, but detect either at compilation or at runtime that the tested feature is unavailable in the implementation, or that for some reason the test cannot run properly. These results can be safely ignored.
execution UNTESTED
For some features, when no test has been written, a placeholder test case is present and returns this status as a reminder for developers.
execution UNRESOLVED
Sometimes, an external event prevents the testcase from finishing and reporting its status. This can occur, for example, when a memory allocation fails. In many cases, re-running the testcase will give a clean status. If the test keeps returning UNRESOLVED, you should try to find out why.
execution PASS
The tested feature is present and behaves as expected. If all tests returned this status, life would be simpler!
execution INTERRUPTED
For some reason, the test did not run to completion and was killed by a signal. Most of the time, this is a segmentation fault. You can safely ignore this status for a 'speculative' test, but this should be investigated for normal tests.
execution HUNG
This status is a special case of the previous one; it means that the test was terminated by SIGALRM. As every test is run with an "alarm($TIMEOUT)" pending, it should mean that the test did hang, but this status can also be returned in some other cases. Further analysis may be needed here as well.
execution FAILED
This status means that the test ran to completion and found that the tested assertion does NOT behave as expected in this implementation. For a speculative test, this can be ignored; otherwise it most probably shows a bug or limitation in the tested implementation (which is what we're looking for, remember). tslogparser can give you the text of the tested assertion, assuming that the OPTS description is correct. The last step in this process is to check whether the Bugzilla of the implementation is already aware of the issue and, if not, to open a new report.
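To make the execution statuses concrete, here is a minimal sketch of how a child's termination could be mapped to them. It is not the test suite's own runner. It assumes that exit codes follow the PTS_* convention of the suite's posixtest.h header (0 PASS, 1 FAILED, 2 UNRESOLVED, 4 UNSUPPORTED, 5 UNTESTED; check your copy of the header), and that the return code follows Python's subprocess convention, where a negative value -N means the child was killed by signal N. The function name is hypothetical.

```python
import signal

# Assumed exit codes, after the PTS_* constants in the suite's posixtest.h.
EXIT_STATUS = {0: "PASS", 1: "FAILED", 2: "UNRESOLVED", 4: "UNSUPPORTED", 5: "UNTESTED"}


def classify_execution(returncode):
    """Map a subprocess-style return code to an execution status."""
    if returncode < 0:
        # Killed by a signal: SIGALRM means the timeout alarm fired (HUNG),
        # any other signal (often SIGSEGV) is reported as INTERRUPTED.
        if -returncode == signal.SIGALRM:
            return "HUNG"
        return "INTERRUPTED"
    # Exit codes outside the table are treated as UNRESOLVED in this sketch.
    return EXIT_STATUS.get(returncode, "UNRESOLVED")
```

With the standard subprocess module, `classify_execution(subprocess.run(cmd).returncode)` would give the status of one test binary, where `cmd` is the command line of that binary.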

 

 Project History

 
Dec 1st, 2003 - Mar 30, 2004
Project start, analysis of existing NPTL tests and of the POSIX standard.
Apr, 2004
Light design work for the test cases. OPTS project chosen for submission. Creation of the nptl.bullopensource.org website.
Apr 29, 2004
First test submission to OPTS project.
May 1st - Sep 28, 2004
Test writing, bug submissions, fixes for the first list of functions we defined.
Jul 21, 2004
Analysis of OPTS coverage in NPTL with gcov.
Aug 4, 2004
Approval of the OPTS project inside STP platform.
Oct 12, 2004
First design for the Trace Tool published in the project's forum.
Oct 1st - Oct 21, 2004
First test campaign, start of results analysis.
Nov 18, 2004
First test campaign report published on the website.
Nov 23, 2004
Creation of tslogparser as a separate SourceForge project.
Dec 1st, 2004 - Mar 9, 2005
Test writing, bug submissions, fixes for the second list of functions we defined.
Dec 20, 2004
Creation of the NPTL Trace Tool project on SourceForge.net (nptltracetool).
Jan 5, 2005
First nptltracetool code released in CVS.
Feb 23, 2005
10,000 hits on the website.
Mar 3rd, 2005
This project was chosen for a presentation at the Ottawa Linux Symposium.
Jul 23, 2005
Ottawa Linux Symposium presentation concludes several weeks of preparation. The paper we wrote for this event is available on the OLS website.
Aug - Oct 2005
No activity on this project
Nov 2005
Latest Test Campaign, writing of this report, closure of the forum and the test project. Further work on the trace tool is planned to begin next year.

To see the complete history of the project, see the News archives on the website.

 
 
 

Page maintained by: Tony Reix
Last update: Nov 25th, 2005.