Introduction
This report concludes the NPTL Tests
and Trace project (http://nptl.bullopensource.org).
It is designed as a tutorial on testing the NPTL, although it is not restricted to NPTL
and can be used for testing any POSIX implementation.
It uses information from our second test campaign to illustrate the tutorial,
and it also summarizes our activity since the beginning of the project.
This report focuses on the Test activity of the project. You can find information about the
NPTL Trace tool on the dedicated website
(http://nptltracetool.sourceforge.net) and in the paper
we presented at the Ottawa Linux Symposium
(proceedings are available here: http://www.linuxsymposium.org/2005/).
Note that this paper also covers the Test activity.
Conclusions on RHEL4 u2 NPTL Component.
Here is the list of problems the testsuite revealed when run on a box powered by two 3GHz Hyperthreaded i686 CPUs. If you are interested in building the same list for another implementation,
you can find a HOWTO in the next chapter of this report. The problems shown are not critical, but one should know
that there are some divergences between NPTL (or, in many cases, the Linux kernel that NPTL relies on) and the POSIX standard.
This list won't be updated, because it would take a lot of work to keep current with both the NPTL evolution and the POSIX
standard corrigenda, and we cannot afford this work.
Limited or missing features
| Asynchronous I/O related features |
| aio_*/*, lio_*/* |
| All these tests deal with the AIO POSIX option, which is not fully supported in the base glibc. |
| These tests can give random results, as they rely on machine load and other unpredictable behaviors. |
|
| Thread Priority Protection related features |
|
pthread_mutex_getprioceiling/*,
pthread_mutexattr_getprioceiling/*,
pthread_mutexattr_getprotocol/*,
pthread_mutexattr_setprioceiling/*,
pthread_mutexattr_setprotocol/*,
pthread_h/2-2.c
|
| These features are also not supported yet in the base glibc. |
| The tests won't link. |
|
| Process Sporadic Server |
| sched_h/*, sched_setparam/*, sched_setscheduler/* |
| The Sporadic Server scheduling model is not supported in Linux. |
| The tests dealing with SCHED_SPORADIC symbol or associated data won't build. |
|
| Typed Memory Objects |
| sys/mman_h/* |
| The glibc does not support these options. |
| The tests dealing with symbols such as POSIX_TYPED_MEM_* won't build. |
|
| SA_RESTART flag ignored |
| sigaction/16-*.c |
| These tests check that the SA_RESTART flag behaves as expected. |
| This has already been reported and ignored by maintainers. |
|
| Modification after the end of mmap'ed area |
| mmap/11-4.c, mmap/11-5.c |
| These testcases check whether a modification after the end of an mmap'ed area is written out (POSIX requires that it is not). |
| This issue has to be reported. |
|
| File's st_ctime and st_mtime are not updated when mmap'ed memory is changed. |
| mmap/14-1.c |
| This test checks that the st_mtime and st_ctime fields of a file are updated when the file is mmap'ed and the corresponding memory is changed. |
| This issue has to be reported. |
|
| mmap succeeds when memory locking should be unavailable |
| mmap/18-1.c |
| Tests if an mmap operation succeeds when a limit has been set on memory locking and new memory is locked automatically. |
| The feature used to set the limit on the memory locking (RLIMIT_MEMLOCK) is not described in POSIX.
The tested feature is part of the Process Memory Locking option in POSIX. This issue has to be reported. |
|
| mmap error when memory area is bigger than the mmap'ed object |
| mmap/28-1.c |
| Create an object, then mmap more memory than the object size. |
| Linux does not return an error. This issue has to be reported. |
|
| Error checking in sched_setscheduler |
| sched_setscheduler/19-5.c |
| Tests that an invalid policy cannot be passed to sched_setscheduler. |
| This issue has to be reported. |
|
| Thread scheduling and read-write locks. |
| pthread_rwlock_rdlock/2-{1,2}.c, pthread_rwlock_unlock/3-1.c |
| These tests check several situations where POSIX specifies which thread shall acquire the lock. |
| The Linux / NPTL behavior does not obey the POSIX requirements about thread scheduling. This issue has to be reported. |
|
| Process scheduling and semaphores |
| sem_post/8-1.c |
| This test checks that the higher priority processes get the semaphore first. |
| It looks like the priority is not taken into account. This issue has to be reported. |
|
Solved or not-critical problems
| Timer with CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID clock |
| timer_create/10-1.c, timer_create/11-1.c |
| These tests create a timer using a CPU-time clock, when supported. |
| timer_create returned an error during the test campaign, but it seems to behave fine on more recent Linux kernels.
An issue may remain in the test (is the time spent in nanosleep() counted against the CPU-time clock?). |
|
| Timer overruns |
| timer_getoverrun/2-2.c |
| This test sets a timer on a masked signal, and then checks that timer_getoverrun returns the correct number of timer expirations. |
| The test reports a failure, but I am not sure the test procedure is correct -- to be verified. |
|
| mmap error when no memory is available |
| mmap/24-2.c |
| Tests that mmap returns ENOMEM when the free address space is not sufficient. |
| This issue seems to be solved in recent Linux kernels (tested with 2.6.14). |
|
| EOVERFLOW error support (large files extension) |
| mmap/31-1.c |
| This test checks that the EOVERFLOW error is returned. |
| Currently we get an EINVAL error instead. This test may need the large file support extension. |
|
| Non root user can see scheduling parameters of root's processes. |
| sched_getparam/6-1.c, sched_getscheduler/7-1.c |
| These tests check that a non root user cannot query scheduling parameters of root's processes. |
| POSIX does not specify whether this operation shall be forbidden, therefore this is not really a bug -- though it may be a security issue. |
|
How to run the tests
The testsuite we are using here is the Open POSIX Test Suite
(http://posixtest.sourceforge.net/), which
we extended and improved during our project (see Results link at the left for more information).
You will need this testsuite (get the archive from their website or from CVS for most recent fixes), and
the GNU make utility as well as a C compiler (gcc is suitable). If you need to run the tests on a system
without a C compiler, it is possible to compile the tests on a first machine and then run them
on another, but this is not covered here.
Here are the detailed steps for the Test Campaign 2 on RHAS4:
- Install a fresh RedHat Advanced Server 4 Update 2 system. Be sure to install
the Development packages. You also need the nptl-devel package to be installed.
- Get a fresh CVS copy of the Open POSIX TestSuite:
cvs -z3 -d:pserver:anonymous:@cvs.sourceforge.net:/cvsroot/posixtest co -P posixtestsuite
Now, you only need to set up the flags in the testsuite, and you're ready. Just edit the LDFLAGS
file and read the comments inside. To test the NPTL, you probably need to provide a specific set of headers and libraries, as
NPTL is not the default library on many Linux distributions (at least, for the moment).
For RHAS4u2, the following flags are suitable:
-I /usr/include/nptl -L /usr/lib/nptl -D_XOPEN_SOURCE=600 -lpthread -lrt -lm
If you can't find these directories, you probably forgot to install the nptl-devel package.
To test the default thread library (for example, when testing on the Fedora Core 4 distribution), the following
LDFLAGS file is suitable:
-D_XOPEN_SOURCE=600 -lpthread -lrt -lm
Now, you can run the test suite. Be sure to run it as super-user because some of the tests
need special privileges. It is advised not to run the testsuite on a shared machine, because
some of the tests will use a lot of resources (CPU, IO, RAM). You may need to increase the timeout duration in the
Makefile if you have a large system (more than 1GB RAM for example).
On our system, we changed the TIMEOUT parameter to a larger value (2400).
This avoids false HUNG status -- but a run can then take a very long time.
To run the tests, just issue a make command (as root). It will execute all conformance tests, and will save the results in
a 'logfile' file. You can also run only a subset of the tests (refer to the test suite documentation for more information).
Please note also that some tests will change the system clock of the tested system, and therefore this clock can be slightly desynchronized.
Total time for a run is around 25 min on our test machine (2x Xeon HyperThreaded 3GHz).
We also provided some scalability and stress tests. These tests are located in the
stress/threads subdirectory.
For the stress tests (stress*.c), one can use the helper.c script to run all the tests together.
This can be useful to stress the system globally, but won't provide very useful information. The other option is
to run the tests one-by-one. A stress test will run forever until it fails or is killed with signal SIGUSR1.
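As a sketch of the one-by-one option, a stress test can be given a bounded run time by a small driver that sends SIGUSR1 after a chosen delay. The driver below is only an illustration written in Python and is not part of the test suite; the function name run_stress_for is hypothetical, and so is the binary name in the usage note.

```python
import signal
import subprocess
import time


def run_stress_for(argv, duration_s):
    """Start a stress test, let it run for duration_s seconds, then stop it.

    Returns the child's return code. A negative value -N means the child
    was terminated by signal N, as reported by the subprocess module.
    """
    proc = subprocess.Popen(argv)
    deadline = time.monotonic() + duration_s
    # Poll so that an early exit (the test failing by itself) is noticed.
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return proc.returncode
        time.sleep(0.1)
    # The test is still running: stop it with SIGUSR1.
    proc.send_signal(signal.SIGUSR1)
    return proc.wait()
```

For example, run_stress_for(["./stress1.test"], 600) would let a (hypothetically named) stress test binary run for ten minutes before stopping it.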
For the scalability tests (s-c*.c), there are several compilation options available:
- SCALABILITY_FACTOR: from 1 (default) to 5. A bigger value will use more resources on the box.
- PLOT_OUTPUT: when defined, the testcase will output measurement data to be used in the do_plot script provided.
This script will use the GNUplot software to generate drawings of the test execution. Please have a look at the forum to
see some examples.
In any case, the scalability tests return a status PASSED or FAILED, determined by a least-squares analysis
of the results. If the measures are constant, the test is PASSED, otherwise it is FAILED.
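To make this criterion more concrete, here is a minimal sketch of a least-squares check in Python. It is not the code used by the s-c*.c tests: the function names, the relative-growth criterion and the 5% tolerance are assumptions chosen for illustration only.

```python
def least_squares_slope(xs, ys):
    """Slope of the least-squares line fitted through the points (xs, ys).

    Requires at least two distinct values in xs.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def scalability_verdict(xs, ys, tolerance=0.05):
    """Return "PASSED" if the measures stay roughly constant, else "FAILED".

    xs: load parameter for each measure (for example, a thread count).
    ys: measured duration for each value of xs.
    tolerance: assumed threshold on the fitted growth over the x range,
               relative to the mean measure.
    """
    slope = least_squares_slope(xs, ys)
    mean_y = sum(ys) / len(ys)
    growth = abs(slope) * (max(xs) - min(xs))
    return "PASSED" if growth <= tolerance * abs(mean_y) else "FAILED"
```

With this sketch, a series of identical measures gives a zero slope and is reported as PASSED, while measures that grow linearly with the load are reported as FAILED.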
No scalability or stress test has been run during Test Campaign 2. Please ask on our mailing-list if you need support.
How to analyze the results
Once you have run the testsuite, the results are located in the 'logfile' file in the root folder of the testsuite.
You may use whatever tool you want to read this file, including human reading, but the file is
rather large (350 KB) and tedious to read. We have developed a tool to make the tester's life easier, called
tslogparser (http://tslogparser.sourceforge.net).
This tool provides a more user-friendly interface to access and filter the tests results.
The installation of tslogparser won't be discussed here -- refer to the INSTALL file in the package
for more information. Basically, tslogparser is a web application based on LAMP (Linux, Apache, MySQL, PHP).
There are two logical steps to store your results in the database, once the software is set up:
- Save the testsuite description in the database, by providing tslogparser your testsuite archive (tar.gz format).
- Save your run results by providing tslogparser the generated 'logfile' file.
You can access our Test Campaigns results through the tslogparser interface by clicking on the Results
link at the left.
During our analysis, we wanted to find out which test results were not reliable, and which features were not
POSIX compliant. For this purpose, we ran the test suite several times with constant parameters, and stored
each resulting logfile in the database. tslogparser then allowed us to check each run, compare the runs, and display
only the differences.
This helped us find which tests produced non-constant results. In Test Campaign 2, we found several cases:
- The aio_* test results should be ignored, as the glibc does not claim to provide a POSIX compliant AIO library.
Moreover, these tests rely on the system load and can therefore give inconsistent results.
- Some speculative tests can also give random results; this is nothing to be worried about, as these tests
are meant to test non-POSIX behaviors, and therefore behaviors which the user should not rely on.
- Some tests can sometimes return the UNRESOLVED status, or another status. This is normal, by definition of the
UNRESOLVED status. This is for example the case when you're short on free memory. In this case, one should
only consider the alternate status returned.
- More problematic cases are tests which sometimes return PASSED, and sometimes return FAILED. These tests need further
attention. In our series, the problematic tests were: pthread_cond_init tests 1-2 and 2-2; pthread_detach test 4-3; and timer_settime test 9-2.
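The comparison between runs described above can be sketched in a few lines of Python. This is not how tslogparser is implemented: the representation of a run as a dictionary mapping a test name to its status, the function names and the test names used in the usage note are all assumptions made for illustration.

```python
def unstable_tests(runs):
    """Find tests whose status changes from one run to another.

    runs: list of dicts, one per run, mapping a test name to its status.
    Returns a dict mapping each unstable test name to the sorted list of
    distinct statuses it returned, UNRESOLVED excluded.
    """
    statuses_by_test = {}
    for run in runs:
        for test, status in run.items():
            statuses_by_test.setdefault(test, set()).add(status)

    unstable = {}
    for test, statuses in statuses_by_test.items():
        # UNRESOLVED is expected to show up now and then, so only the
        # alternate statuses are taken into account.
        settled = statuses - {"UNRESOLVED"}
        if len(settled) > 1:
            unstable[test] = sorted(settled)
    return unstable


def needs_attention(unstable):
    """Names of the tests that returned both PASS and FAILED."""
    return sorted(
        test for test, statuses in unstable.items()
        if "PASS" in statuses and "FAILED" in statuses
    )
```

For instance, with two made-up runs where "demo/2-1" returns PASS then FAILED and "demo/3-1" returns UNRESOLVED then PASS, only "demo/2-1" is listed as needing attention; "demo/3-1" is treated as a plain PASS.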
Then, we can have a look at a complete run log and examine the results. Here is the meaning of each status:
- build FAILED
- The test case failed to compile. You should have the compiler message available through tslogparser to find out what's wrong.
This error can occur for example for unsupported POSIX optional features.
- link SKIP
- Some of the testcases are compile-only tests: the test succeeds when the compiler is able to compile, and it never
tries to link. This status is therefore to be interpreted as 'test PASS'.
- execution UNSUPPORTED
- Some tests are able to compile cleanly, but will detect either at compilation or at runtime that
the tested feature is unavailable in the implementation, or that for any reason the test cannot
run properly. These results can be safely ignored.
- execution UNTESTED
- For some features, when no test has been written, a placeholder test case is present and returns this status
as a reminder for developers.
- execution UNRESOLVED
- Sometimes, an external event prevents the testcase from finishing and reporting its status. This can occur for example
when a memory allocation fails. In many cases, re-running this testcase will give a clean status. If the
test keeps on returning UNRESOLVED, you may try and find out why this is happening.
- execution PASS
- The tested feature is present and behaves as expected. If all tests returned this status, life would be simpler!
- execution INTERRUPTED
- For some reason, the test did not run to completion and was killed by a signal. Most of the time,
this is a segmentation fault. You can safely ignore this status for a 'speculative' test, but this
should be investigated for normal tests.
- execution HUNG
- This status is a subset of the previous one; it means that the test terminated because of SIGALRM.
As every test is run with an "alarm($TIMEOUT)" pending, it should mean that the test did hang, but this status
can also be returned in some other cases. Further analysis may be needed here as well.
- execution FAILED
- This status means that the test ran to completion and found that the tested assertion does
NOT behave as expected in this implementation. If it is a speculative test, this can be ignored,
but otherwise this most probably shows a bug / limitation in the tested implementation (which is
what we're looking for, remember). tslogparser can give you the text of the tested assertion,
assuming that the OPTS description is correct. The last step in this process is to check whether the BugZilla related to
the implementation is already aware of the issue and, if not, to open a new report.
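To illustrate the difference between the INTERRUPTED and HUNG statuses, here is a minimal Python sketch of a runner that starts a test with an alarm pending and classifies how it ended. It is not the script used by the test suite: the function name run_with_alarm and the "EXITED" label are assumptions, and the sketch only relies on the fact that a pending alarm is preserved across exec on POSIX systems.

```python
import signal
import subprocess


def run_with_alarm(argv, timeout_s):
    """Run argv with alarm(timeout_s) pending and classify the outcome.

    Returns a (label, returncode) pair, where label is "HUNG" if the
    child was terminated by SIGALRM, "INTERRUPTED" if it was terminated
    by any other signal, and "EXITED" if it ran to completion.
    """
    # preexec_fn runs in the child between fork and exec, so the alarm
    # is armed in the process that will become the test.
    proc = subprocess.Popen(argv, preexec_fn=lambda: signal.alarm(timeout_s))
    returncode = proc.wait()
    if returncode == -signal.SIGALRM:
        return ("HUNG", returncode)
    if returncode < 0:
        # subprocess reports "terminated by signal N" as -N.
        return ("INTERRUPTED", returncode)
    return ("EXITED", returncode)
```

Note that a test which arms its own timer and lets SIGALRM terminate it would also be classified as HUNG by this sketch, which matches the remark that this status can be returned in some other cases.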
Project History
- Dec 1st, 2003 - Mar 30, 2004
- Project start, analysis of existing NPTL tests and the POSIX standard.
- Apr, 2004
- Some design work for the test cases. OPTS project chosen for submission.
Creation of the nptl.bullopensource.org website.
- Apr 29, 2004
- First test submission to OPTS project.
- May 1st - Sep 28, 2004
- Test writing, bug submissions, fixes for the first list of functions we defined.
- Jul 21, 2004
- Analysis of OPTS coverage in NPTL with gcov.
- Aug 4, 2004
- Approval of the OPTS project inside STP platform.
- Oct 12, 2004
- First design for the Trace Tool published in the project's forum.
- Oct 1st - Oct 21, 2004
- First test campaign, start of results analysis.
- Nov 18, 2004
- First test campaign report published on the website.
- Nov 23, 2004
- Creation of tslogparser as a separate SourceForge project.
- Dec 1st, 2004 - Mar 9, 2005
- Test writing, bug submissions, fixes for the second list of functions we defined.
- Dec 20, 2004
- Creation of the NPTL Trace Tool project on SourceForge.net (nptltracetool)
- Jan 5, 2005
- First nptltracetool code released in CVS.
- Feb 23, 2005
- 10,000 hits on the website.
- Mar 3rd, 2005
- This project was chosen for a presentation at the Ottawa Linux Symposium
- Jul 23, 2005
- Ottawa Linux Symposium presentation concludes several weeks of preparation.
The paper we wrote for this event is available on the OLS website.
- Aug - Oct 2005
- No activity on this project
- Nov 2005
- Latest Test Campaign, writing of this report, closure of the forum and the test project.
Further work on the trace tool is planned to begin next year.
For the complete history of the project, see the News archives on the website.