GNU bug report logs - #8788
Weird testsuite failure on NetBSD (parallel tests, background processes)

Package: automake

Reported by: Stefano Lattarini <stefano.lattarini <at> gmail.com>

Date: Thu, 2 Jun 2011 16:46:02 UTC

Severity: normal

Tags: moreinfo, patch

Merged with 10447

Done: Stefano Lattarini <stefano.lattarini <at> gmail.com>

Bug is archived. No further changes may be made.



Report forwarded to owner <at> debbugs.gnu.org, bug-automake <at> gnu.org:
bug#8788; Package automake. (Thu, 02 Jun 2011 16:46:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Stefano Lattarini <stefano.lattarini <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-automake <at> gnu.org. (Thu, 02 Jun 2011 16:46:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Stefano Lattarini <stefano.lattarini <at> gmail.com>
To: bug-automake <at> gnu.org
Subject: Weird testsuite failure on NetBSD (parallel tests,
	background processes)
Date: Thu, 2 Jun 2011 18:43:53 +0200
[Message part 1 (text/plain, inline)]
Hello automakers.

While testing the `testsuite-work' branch on NetBSD 5, I've encountered
a weird failure in the test `parallel-tests3.test', which actually caused
the whole testsuite to crash (!) due to a stray SIGTERM.

I've reduced the failure to the attached testcase, which also exposes the
bug in the older automake release 1.11.1.

Note that, on GNU/Linux and Solaris 10, the testcase passes without
problems.

To reproduce from a freshly extracted 1.11.1 tarball, make sure GNU make
is available in PATH as `gmake', then run:
  $ ./configure && make
  $ cd tests
  $ cp /path/to/saved/foo.test .
  $ sh foo.test
  ...
  gmake[2]: Leaving directory `/tmp/automake-1.11.1/tests/foo.dir'
  gmake[1]: Leaving directory `/tmp/automake-1.11.1/tests/foo.dir'
  [2]   Terminated              ${sleep}
  gmake: *** [check-am] Terminated
  + signal=15
  + Exit 1
  + set +e
  [1]   Terminated+               ${MAKE} -j1 checkexit
   1
  + exit 1
  + exit_status=1
  + set +e
  + cd /tmp/automake-1.11.1/tests
  + test 15 != 0
  + echo foo: caught signal 15
  foo: caught signal 15
  + echo foo: exit 1
  foo: exit 1
  + exit 1

Attached are the config.log file and test logs for /bin/sh, /bin/ksh, and
bash 4.1.

Any idea of what's going on?

Regards,
  Stefano
[ksh.log (text/x-log, inline)]
/tmp/automake-1.11.1/tests:/home/slattarini/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11R7/bin:/usr/X11R6/bin:/usr/pkg/bin:/usr/pkg/sbin:/usr/games:/usr/local/bin:/usr/local/sbin
=== Running test foo.test
+ pwd
/tmp/automake-1.11.1/tests/foo.dir
+ MAKE=gmake
+ export MAKE
+ cat
+ > configure.in 
+ << "END" 
+ cat
+ > Makefile.am 
+ << "END" 
+ cat
+ > foo1.test 
+ << "END" 
+ chmod a+x foo1.test
+ cat
+ > foo2.test 
+ << "END" 
+ chmod a+x foo2.test
+ cat
+ > foo3.test 
+ << "END" 
+ chmod a+x foo3.test
+ aclocal-1.11 -Werror
+ autoconf
+ automake-1.11 --foreign -Werror -Wall -a
+ ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... ./install-sh -c -d
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
checking whether gmake sets $(MAKE)... yes
configure: creating ./config.status
config.status: creating Makefile
+ sleep 2
+ gmake -j1 check
gmake  check-TESTS
gmake[1]: Entering directory `/tmp/automake-1.11.1/tests/foo.dir'
gmake[2]: Entering directory `/tmp/automake-1.11.1/tests/foo.dir'
PASS: foo1.test
+ kill 20320
+ test ! -f test-suite.log
+ sleep 2
PASS: foo2.test
PASS: foo3.test
==================
All 3 tests passed
==================
gmake[2]: Leaving directory `/tmp/automake-1.11.1/tests/foo.dir'
gmake[1]: Leaving directory `/tmp/automake-1.11.1/tests/foo.dir'
Terminated 
+ signal=15
+ Exit 1
gmake: *** [check-am] Terminated
foo: caught signal 15
foo: exit 1
[bash.log (text/x-log, inline)]
/tmp/automake-1.11.1/tests:/home/slattarini/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11R7/bin:/usr/X11R6/bin:/usr/pkg/bin:/usr/pkg/sbin:/usr/games:/usr/local/bin:/usr/local/sbin
=== Running test foo.test
++ pwd
/tmp/automake-1.11.1/tests/foo.dir
+ MAKE=gmake
+ export MAKE
+ cat
+ cat
+ for i in 1 2 3
+ cat
+ chmod a+x foo1.test
+ for i in 1 2 3
+ cat
+ chmod a+x foo2.test
+ for i in 1 2 3
+ cat
+ chmod a+x foo3.test
+ aclocal-1.11 -Werror
+ autoconf
+ automake-1.11 --foreign -Werror -Wall -a
+ ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... ./install-sh -c -d
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
checking whether gmake sets $(MAKE)... yes
configure: creating ./config.status
config.status: creating Makefile
+ sleep 2
+ gmake -j1 check
gmake  check-TESTS
gmake[1]: Entering directory `/tmp/automake-1.11.1/tests/foo.dir'
gmake[2]: Entering directory `/tmp/automake-1.11.1/tests/foo.dir'
PASS: foo1.test
+ kill 22441
+ test '!' -f test-suite.log
+ sleep 2
PASS: foo2.test
PASS: foo3.test
==================
All 3 tests passed
==================
gmake[2]: Leaving directory `/tmp/automake-1.11.1/tests/foo.dir'
gmake[1]: Leaving directory `/tmp/automake-1.11.1/tests/foo.dir'
Terminated
++ signal=15
++ Exit 1
++ set +e
gmake: *** [check-am] Terminated
++ exit 1
++ exit 1
+ exit_status=1
+ set +e
+ cd /tmp/automake-1.11.1/tests
+ case $exit_status,$keep_testdirs in
+ test 15 '!=' 0
+ echo 'foo: caught signal 15'
foo: caught signal 15
+ echo 'foo: exit 1'
foo: exit 1
+ exit 1
[sh.log (text/x-log, inline)]
/tmp/automake-1.11.1/tests:/home/slattarini/bin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/X11R7/bin:/usr/X11R6/bin:/usr/pkg/bin:/usr/pkg/sbin:/usr/games:/usr/local/bin:/usr/local/sbin
=== Running test foo.test
+ pwd
/tmp/automake-1.11.1/tests/foo.dir
+ MAKE=gmake
+ export MAKE
+ cat
+ cat
+ cat
+ chmod a+x foo1.test
+ cat
+ chmod a+x foo2.test
+ cat
+ chmod a+x foo3.test
+ aclocal-1.11 -Werror
+ autoconf
+ automake-1.11 --foreign -Werror -Wall -a
+ ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... ./install-sh -c -d
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
checking whether gmake sets $(MAKE)... yes
configure: creating ./config.status
config.status: creating Makefile
+ gmake -j1 check
+ sleep 2
gmake  check-TESTS
gmake[1]: Entering directory `/tmp/automake-1.11.1/tests/foo.dir'
gmake[2]: Entering directory `/tmp/automake-1.11.1/tests/foo.dir'
PASS: foo1.test
+ kill 26199
+ test ! -f test-suite.log
+ sleep 2
PASS: foo2.test
PASS: foo3.test
==================
All 3 tests passed
==================
gmake[2]: Leaving directory `/tmp/automake-1.11.1/tests/foo.dir'
gmake[1]: Leaving directory `/tmp/automake-1.11.1/tests/foo.dir'
[2]   Terminated              ${sleep}
+ signal=15
+ Exit 1
+ set +e
gmake: *** [check-am] Terminated
[1]   Terminated              ${MAKE} -j1 check
+ exit 1
+ exit 1
+ exit_status=1
+ set +e
+ cd /tmp/automake-1.11.1/tests
+ test 15 != 0
+ echo foo: caught signal 15
foo: caught signal 15
+ echo foo: exit 1
foo: exit 1
+ exit 1
[config.log (text/x-log, inline)]
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.

It was created by GNU Automake configure 1.11.1, which was
generated by GNU Autoconf 2.65.  Invocation command line was

  $ ./configure 

## --------- ##
## Platform. ##
## --------- ##

hostname = gcc70.fsffrance.org
uname -m = amd64
uname -r = 5.1
uname -s = NetBSD
uname -v = NetBSD 5.1 (GENERIC) #0: Sat Nov  6 13:19:33 UTC 2010  builds <at> b6.netbsd.org:/home/builds/ab/netbsd-5-1-RELEASE/amd64/201011061943Z-obj/home/builds/ab/netbsd-5-1-RELEASE/src/sys/arch/amd64/compile/GENERIC

/usr/bin/uname -p = x86_64
/bin/uname -X     = unknown

/bin/arch              = unknown
/usr/bin/arch -k       = unknown
/usr/convex/getsysinfo = unknown
/usr/bin/hostinfo      = unknown
/bin/machine           = unknown
/usr/bin/oslevel       = unknown
/bin/universe          = unknown

PATH: /home/slattarini/bin
PATH: /bin
PATH: /sbin
PATH: /usr/bin
PATH: /usr/sbin
PATH: /usr/X11R7/bin
PATH: /usr/X11R6/bin
PATH: /usr/pkg/bin
PATH: /usr/pkg/sbin
PATH: /usr/games
PATH: /usr/local/bin
PATH: /usr/local/sbin


## ----------- ##
## Core tests. ##
## ----------- ##

configure:1722: checking build system type
configure:1736: result: x86_64-unknown-netbsd5.1
configure:1781: checking for a BSD-compatible install
configure:1849: result: /usr/bin/install -c
configure:1860: checking whether build environment is sane
configure:1910: result: yes
configure:2051: checking for a thread-safe mkdir -p
configure:2090: result: lib/install-sh -c -d
configure:2103: checking for gawk
configure:2133: result: no
configure:2103: checking for mawk
configure:2133: result: no
configure:2103: checking for nawk
configure:2133: result: no
configure:2103: checking for awk
configure:2119: found /usr/bin/awk
configure:2130: result: awk
configure:2141: checking whether make sets $(MAKE)
configure:2163: result: yes
configure:2265: checking for perl
configure:2283: found /usr/pkg/bin/perl
configure:2295: result: /usr/pkg/bin/perl
configure:2314: checking whether /usr/pkg/bin/perl supports ithreads
configure:2337: result: yes
configure:2349: checking for tex
configure:2379: result: no
configure:2396: checking whether autoconf is installed
configure:2401: eval autoconf --version
autoconf (GNU Autoconf) 2.68
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+/Autoconf: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>, <http://gnu.org/licenses/exceptions.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by David J. MacKenzie and Akim Demaille.
configure:2404: $? = 0
configure:2412: result: yes
configure:2419: checking whether autoconf works
configure:2426: cd conftest && eval autoconf -o /dev/null conftest.ac
configure:2429: $? = 0
configure:2438: result: yes
configure:2445: checking whether autoconf is recent enough
configure:2452: cd conftest && eval autoconf -o /dev/null conftest.ac
configure:2455: $? = 0
configure:2464: result: yes
configure:2471: checking whether ln works
configure:2491: result: yes
configure:2506: checking for grep that handles long lines and -e
configure:2564: result: /usr/bin/grep
configure:2569: checking for egrep
configure:2631: result: /usr/bin/grep -E
configure:2636: checking for fgrep
configure:2698: result: /usr/bin/grep -F
configure:2704: checking whether /bin/sh has working 'set -e' with exit trap
configure:2717: result: yes
configure:2876: creating ./config.status

## ---------------------- ##
## Running config.status. ##
## ---------------------- ##

This file was extended by GNU Automake config.status 1.11.1, which was
generated by GNU Autoconf 2.65.  Invocation command line was

  CONFIG_FILES    = 
  CONFIG_HEADERS  = 
  CONFIG_LINKS    = 
  CONFIG_COMMANDS = 
  $ ./config.status 

on gcc70.fsffrance.org

config.status:776: creating Makefile
config.status:776: creating doc/Makefile
config.status:776: creating lib/Automake/Makefile
config.status:776: creating lib/Automake/tests/Makefile
config.status:776: creating lib/Makefile
config.status:776: creating lib/am/Makefile
config.status:776: creating m4/Makefile
config.status:776: creating tests/Makefile
config.status:776: creating tests/defs
config.status:776: creating tests/aclocal-1.11
config.status:776: creating tests/automake-1.11

## ---------------- ##
## Cache variables. ##
## ---------------- ##

ac_cv_build=x86_64-unknown-netbsd5.1
ac_cv_env_build_alias_set=
ac_cv_env_build_alias_value=
ac_cv_env_host_alias_set=
ac_cv_env_host_alias_value=
ac_cv_env_target_alias_set=
ac_cv_env_target_alias_value=
ac_cv_path_EGREP='/usr/bin/grep -E'
ac_cv_path_FGREP='/usr/bin/grep -F'
ac_cv_path_GREP=/usr/bin/grep
ac_cv_path_PERL=/usr/pkg/bin/perl
ac_cv_path_install='/usr/bin/install -c'
ac_cv_prog_AWK=awk
ac_cv_prog_make_make_set=yes
am_cv_autoconf_installed=yes
am_cv_autoconf_version=yes
am_cv_autoconf_works=yes
am_cv_prog_PERL_ithreads=yes
am_cv_prog_ln=ln
am_cv_sh_errexit_works=yes

## ----------------- ##
## Output variables. ##
## ----------------- ##

ACLOCAL='perllibdir="/tmp/automake-1.11.1/lib:./lib" "/tmp/automake-1.11.1/aclocal" --acdir=m4 -I m4'
AMTAR='${SHELL} /tmp/automake-1.11.1/lib/missing --run tar'
APIVERSION='1.11'
AUTOCONF='${SHELL} /tmp/automake-1.11.1/lib/missing --run autoconf'
AUTOHEADER='${SHELL} /tmp/automake-1.11.1/lib/missing --run autoheader'
AUTOMAKE='perllibdir="/tmp/automake-1.11.1/lib:./lib" "/tmp/automake-1.11.1/automake" --libdir=lib'
AWK='awk'
CYGPATH_W='echo'
DEFS='-DPACKAGE_NAME=\"GNU\ Automake\" -DPACKAGE_TARNAME=\"automake\" -DPACKAGE_VERSION=\"1.11.1\" -DPACKAGE_STRING=\"GNU\ Automake\ 1.11.1\" -DPACKAGE_BUGREPORT=\"bug-automake <at> gnu.org\" -DPACKAGE_URL=\"http://www.gnu.org/software/automake/\" -DPACKAGE=\"automake\" -DVERSION=\"1.11.1\"'
ECHO_C=''
ECHO_N='-n'
ECHO_T=''
EGREP='/usr/bin/grep -E'
FGREP='/usr/bin/grep -F'
GREP='/usr/bin/grep'
HELP2MAN='${SHELL} /tmp/automake-1.11.1/lib/missing --run help2man'
INSTALL_DATA='${INSTALL} -m 644'
INSTALL_PROGRAM='${INSTALL}'
INSTALL_SCRIPT='${INSTALL}'
INSTALL_STRIP_PROGRAM='$(install_sh) -c -s'
LIBOBJS=''
LIBS=''
LN='ln'
LTLIBOBJS=''
MAKEINFO='${SHELL} /tmp/automake-1.11.1/lib/missing --run makeinfo'
MKDIR_P='lib/install-sh -c -d'
MODIFICATION_DELAY='2'
PACKAGE='automake'
PACKAGE_BUGREPORT='bug-automake <at> gnu.org'
PACKAGE_NAME='GNU Automake'
PACKAGE_STRING='GNU Automake 1.11.1'
PACKAGE_TARNAME='automake'
PACKAGE_URL='http://www.gnu.org/software/automake/'
PACKAGE_VERSION='1.11.1'
PATH_SEPARATOR=':'
PERL='/usr/pkg/bin/perl'
PERL_THREADS='1'
SET_MAKE=''
SHELL='/bin/ksh'
STRIP=''
TEX=''
VERSION='1.11.1'
am_AUTOCONF='autoconf'
am_AUTOHEADER='autoheader'
am__isrc=''
am__leading_dot='.'
am__tar='${AMTAR} chof - "$$tardir"'
am__untar='${AMTAR} xf -'
bindir='${exec_prefix}/bin'
build='x86_64-unknown-netbsd5.1'
build_alias=''
build_cpu='x86_64'
build_os='netbsd5.1'
build_vendor='unknown'
datadir='${datarootdir}'
datarootdir='${prefix}/share'
docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
dvidir='${docdir}'
exec_prefix='${prefix}'
host_alias=''
htmldir='${docdir}'
includedir='${prefix}/include'
infodir='${datarootdir}/info'
install_sh='${SHELL} /tmp/automake-1.11.1/lib/install-sh'
libdir='${exec_prefix}/lib'
libexecdir='${exec_prefix}/libexec'
localedir='${datarootdir}/locale'
localstatedir='${prefix}/var'
mandir='${datarootdir}/man'
mkdir_p='$(top_builddir)/lib/install-sh -c -d'
oldincludedir='/usr/include'
pdfdir='${docdir}'
pkgvdatadir='${datadir}/automake-1.11'
prefix='/usr/local'
program_transform_name='s,x,x,'
psdir='${docdir}'
sbindir='${exec_prefix}/sbin'
sh_errexit_works='yes'
sharedstatedir='${prefix}/com'
sysconfdir='${prefix}/etc'
target_alias=''

## ----------- ##
## confdefs.h. ##
## ----------- ##

/* confdefs.h */
#define PACKAGE_NAME "GNU Automake"
#define PACKAGE_TARNAME "automake"
#define PACKAGE_VERSION "1.11.1"
#define PACKAGE_STRING "GNU Automake 1.11.1"
#define PACKAGE_BUGREPORT "bug-automake <at> gnu.org"
#define PACKAGE_URL "http://www.gnu.org/software/automake/"
#define PACKAGE "automake"
#define VERSION "1.11.1"

configure: exit 0
[foo.test (application/x-shellscript, inline)]

Information forwarded to bug-automake <at> gnu.org:
bug#8788; Package automake. (Tue, 18 Oct 2011 21:19:02 GMT) Full text and rfc822 format available.

Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Stefano Lattarini <stefano.lattarini <at> gmail.com>
To: bug-automake <at> gnu.org
Cc: 8788 <at> debbugs.gnu.org, bug-autoconf <at> gnu.org
Subject: Re: bug#8788: Weird testsuite failure on NetBSD (parallel tests,
	background processes)
Date: Tue, 18 Oct 2011 23:16:25 +0200
Reference:
 <http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8788>

[Adding bug-autoconf in CC]

On Thursday 02 June 2011, Stefano Lattarini wrote:
> Hello automakers.
> 
> While testing the `testsuite-work' branch on NetBSD 5, I've encountered
> a weird failure in the test `parallel-tests3.test', which actually caused
> the whole testsuite to crash (!) due to a stray SIGTERM.
> 
> [SNIP]
> 
> Any idea of what's going on?
> 
Aha, got it! (I think).  The failure is due to an interaction between some
features of GNU make and some (mis)features of the NetBSD Korn Shell.  Let's
see the details.

[1] The Korn shell gets selected to run the Makefile recipes
-------------------------------------------------------------

On NetBSD, an autoconf-generated configure script will select /bin/ksh as
the $(SHELL) used to execute the Makefile recipes:
 
  $ grep 'SHELL.*=' tests/parallel-tests3.dir/*/config.log
  tests/parallel-tests3.dir/parallel/config.log:SHELL='/bin/ksh'
  tests/parallel-tests3.dir/serial/config.log:SHELL='/bin/ksh'
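
The same lookup can be scripted.  What follows is only a minimal sketch, not
something taken from automake: the helper name `configured_shell' is
hypothetical, and the config.log excerpt is created on the fly purely to keep
the snippet self-contained (its SHELL line mirrors the one shown above).

```shell
# Hypothetical helper (not part of automake or autoconf): print the value
# recorded on the SHELL='...' line of a config.log file.
configured_shell() {
  sed -n "s/^SHELL='\(.*\)'\$/\1/p" "$1"
}

# Self-contained demo on a throwaway config.log excerpt.
tmpdir=$(mktemp -d)
printf '%s\n' "STRIP=''" "SHELL='/bin/ksh'" "TEX=''" > "$tmpdir/config.log"
result=$(configured_shell "$tmpdir/config.log")
echo "recipes would run under: $result"
rm -rf "$tmpdir"
```

In a real build tree the helper would be pointed at the config.log of the
directory of interest instead of the temporary file.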

[2] The Korn shell has quirks w.r.t. signal handling
----------------------------------------------------

NetBSD's Korn Shell is one of those shells which try to "propagate"
terminating signals, as explained in the ``Signal Handling'' node of the
bleeding-edge (as of today still unreleased) autoconf manual; see also these
relevant links:

 <http://lists.gnu.org/archive/html/autoconf-patches/2011-09/msg00005.html>
 <https://lists.gnu.org/archive/html/bug-autoconf/2011-09/msg00004.html>
 <http://mail.opensolaris.org/pipermail/ksh93-integration-discuss/2009-February/004121.html>

And in fact, NetBSD's Korn Shell even seems to propagate a fatal signal
it has received *to its whole process group*!  Let's see a few examples:

 $ /bin/sh -c '/bin/sh -c "kill -15 \$\$"; echo alive'
 [1]   Terminated              /bin/sh -c "kill...
 alive

 $ /bin/ksh -c '/bin/sh -c "kill -15 \$\$"; echo alive'
 Terminated 
 alive

 # ksh apparently terminates its parent
 $ /bin/sh -c '/bin/ksh -c "kill -15 \$\$"; echo alive'
 Terminated

 $ /bin/ksh -c '/bin/ksh -c "kill -15 \$\$"; echo alive'
 Terminated 
 Terminated

Just to be sure, let's try to trace the system calls made by the Korn
shell:

  $ ktrace /bin/sh -c '
  > echo parent: $$
  > ktrace -a /bin/ksh -c "echo child: \$\$; kill -15 \$\$"
  > echo alive
  '
  parent: 20429
  child: 4829
  Terminated

  $ kdump ktrace.out | grep -i sig | grep -v __sig
   4829  1 ksh  CALL  kill(0x12dd, SIGTERM)
   4829  1 ksh  PSIG  SIGTERM caught handler=0x420810 mask=(): code=SI_USER sent by pid=4829, uid=1242)
   4829  1 ksh  CALL  kill(0, SIGTERM)
   4829  1 ksh  PSIG  SIGTERM SIG_DFL: code=SI_USER sent by pid=4829, uid=1242)
  20429  1 sh   PSIG  SIGTERM SIG_DFL: code=SI_USER sent by pid=4829, uid=1242)

(Note that `0x12dd' is decimal 4829).
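
The four one-liners above can be folded into a small probe.  This is a hedged
sketch only: `probe_sigterm' is a hypothetical name, and the verdict depends
entirely on the shells installed on the machine at hand.  Beware that, with a
shell behaving like the NetBSD ksh described here, the probe may well take
down the calling process group too, so it is best run from a throwaway
terminal.

```shell
# Hypothetical helper: run INNER under OUTER, let INNER send SIGTERM to
# itself, and report whether OUTER lived long enough to print "alive".
probe_sigterm() {
  outer=$1
  inner=$2
  # The inner command line is single-quoted for OUTER, so that `$$' is
  # expanded by INNER and refers to INNER's own PID.
  out=$("$outer" -c "\"$inner\" -c 'kill -15 \$\$'; echo alive" 2>/dev/null) || :
  if test "$out" = alive; then
    echo survived
  else
    echo killed
  fi
}

verdict=$(probe_sigterm /bin/sh /bin/sh)
echo "/bin/sh under /bin/sh: $verdict"
```

Passing /bin/ksh as either argument on an affected NetBSD system would be the
way to compare against the transcripts above.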

[3] GNU make propagates signal to the running recipes
-----------------------------------------------------

If GNU make receives a terminating signal while it's updating some target(s), it
propagates that signal to the currently-executing recipe(s):

  $ cat Makefile 
  all: 1 2
  1 2:
       @trap 'echo got SIGTERM; exit 77' 15; while :; do :; done
  $ gmake -j2 &
  [1] 5980
  $ kill $!
  got SIGTERM
  got SIGTERM
  gmake: *** [2] Error 77
  gmake: *** [1] Error 77

(FWIW, I find this to be a helpful and rational behaviour).
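
The recipe side of this can be reproduced without make at all.  The sketch
below is an illustration under stated assumptions, not the test used in this
report: a plain `sh -c' stands in for the recipe shell, the "ready" marker
file is a hypothetical device to avoid signalling before the trap is
installed, and the busy loop gets a `sleep' so that it does not spin.

```shell
# Throwaway directory holding a "ready" marker, so that the signal is
# only sent once the child has installed its trap.
tmpdir=$(mktemp -d)

# Stand-in for a recipe shell: same trap as in the Makefile above.
sh -c 'trap "echo got SIGTERM; exit 77" 15
       : > "$1/ready"
       while :; do sleep 1; done' recipe "$tmpdir" &
pid=$!

# Wait for the marker, then deliver SIGTERM the way a bare `kill' does.
while test ! -f "$tmpdir/ready"; do sleep 1; done
kill "$pid"

# Collect the exit status chosen by the trap handler.
status=0
wait "$pid" || status=$?
echo "recipe shell exited with status $status"
rm -rf "$tmpdir"
```

The handler only runs once the `sleep' in progress has returned, which is why
the status may take up to a second to show up.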

[4] Putting it all together
---------------------------

So here is my diagnosis of what happens when `parallel-tests3.test' is
run on NetBSD with GNU make:

 1) various setup/preparation commands get executed in this script; the
    Korn shell gets selected to run the recipe of the Makefile;
 2) "make -j1 check" is launched in the background:
      cd serial
      $MAKE -j1 check &
 3) some more commands get run, and they conclude before the background
    make process launched in (2) has finished;
 4) the shell executing `parallel-tests3.test' explicitly kills the still
    running background "make" process with a SIGTERM:
      cd ..
      kill $!
 5) GNU make "relays" the SIGTERM to the Korn shell executing the still
    running recipe(s);
 6) in turn, the Korn shell relays the SIGTERM to all processes in its
    process group;
 7) this includes the top-level make process that is running the automake
    testsuite (if any); which explains the crash that is the object of
    this bug report.

I'm not 100% positive that point (7) is completely correct, but I'm running
out of time now, so I'll settle for this explanation; kudos to anyone who
can give some confirmation about the correctness of point (7)!

-*-*-*-

Now, the right fix for the bug is *not* to work around this behaviour
of the Korn shell; rather, we should fix the suspicious logic of the
`parallel-tests3.test' script, which was also causing a testsuite hang
on FreeBSD.  Patch coming up shortly.

And it goes without saying that this horrendous incompatibility of NetBSD's
Korn Shell should be documented in the autoconf manual; I may give it a
shot in the next few days if nobody beats me to it.

Regards,
  Stefano




Information forwarded to bug-automake <at> gnu.org:
bug#8788; Package automake. (Tue, 18 Oct 2011 21:35:02 GMT) Full text and rfc822 format available.

Message #14 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Stefano Lattarini <stefano.lattarini <at> gmail.com>
To: bug-automake <at> gnu.org
Cc: 8788 <at> debbugs.gnu.org, automake-patches <at> gnu.org
Subject: Re: bug#8788: Weird testsuite failure on NetBSD (parallel tests,
	background processes)
Date: Tue, 18 Oct 2011 23:32:35 +0200
[Message part 1 (text/plain, inline)]
[Dropping bug-autoconf from CC]
[Adding automake-patches to CC]

On Tuesday 18 October 2011, Stefano Lattarini wrote:
> Reference:
>  <http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8788>
> 
> On Thursday 02 June 2011, Stefano Lattarini wrote:
> > While testing the `testsuite-work' branch on NetBSD 5, I've encountered
> > a weird failure in the test `parallel-tests3.test', which actually caused
> > the whole testsuite to crash (!) due to a stray SIGTERM.
> > 
> > [SNIP]
> > 
> > Any idea of what's going on?
> > 
> Aha, got it! (I think).  The failure is due to an interaction between some
> features of GNU make and some (mis)features of the NetBSD Korn Shell.
>
> [SNIP]
> 
> Now, the right fix for the bug is *not* to work around this behaviour
> of the Korn shell; rather, we should fix the suspicious logic of the
> `parallel-tests3.test' script, which was also causing a testsuite hang
> on FreeBSD.  Patch coming up shortly.
> 
And here is the promised patch (see attachment).  I will allow a couple of
days for reviews before pushing.

Regards,
  Stefano
[0001-tests-avoid-spurious-failure-in-parallel-tests3.test.patch (text/x-patch, inline)]
From f5b69b8a0d787cf798653fdb975affa9e7ff44b8 Mon Sep 17 00:00:00 2001
Message-Id: <f5b69b8a0d787cf798653fdb975affa9e7ff44b8.1318973480.git.stefano.lattarini <at> gmail.com>
From: Stefano Lattarini <stefano.lattarini <at> gmail.com>
Date: Tue, 18 Oct 2011 21:05:24 +0200
Subject: [PATCH] tests: avoid spurious failure in 'parallel-tests3.test'

This fixes automake bug#8788.

* tests/parallel-tests3.test: To ensure that the serial run of
the dummy testsuite is still ongoing when the parallel run has
terminated, use `kill -0', not a bare `kill'.  This will prevent
a testsuite crash on NetBSD 5.1, and a testsuite hang on FreeBSD
8.2.  Also, since we are at it, try harder to avoid possible
hangs of the script in other unusual situations.
---
 ChangeLog                  |   11 +++++++++++
 tests/parallel-tests3.test |   19 +++++++++++++------
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index a2ecefc..9ed30f0 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2011-10-18  Stefano Lattarini  <stefano.lattarini <at> gmail.com>
+
+	tests: avoid spurious failure in 'parallel-tests3.test'
+	This fixes automake bug#8788.
+	* tests/parallel-tests3.test: To ensure that the serial run of
+	the dummy testsuite is still ongoing when the parallel run has
+	terminated, use `kill -0', not a bare `kill'.  This will prevent
+	a testsuite crash on NetBSD 5.1, and a testsuite hang on FreeBSD
+	8.2.  Also, since we are at it, try harder to avoid possible
+	hangs of the script in other unusual situations.
+
 2011-10-17  Stefano Lattarini  <stefano.lattarini <at> gmail.com>
 
 	tests: fix spurious failure with autoconf 2.62
diff --git a/tests/parallel-tests3.test b/tests/parallel-tests3.test
index a138f90..69ba1d0 100755
--- a/tests/parallel-tests3.test
+++ b/tests/parallel-tests3.test
@@ -70,15 +70,22 @@ $sleep
 : >stdout
 $MAKE -j4 check >> stdout
 cd ..
-kill $!
+# Ensure the tests are really being run in parallel mode: if this is
+# the case, the serial run of the dummy testsuite started above should
+# still be ongoing when the parallel one has terminated.
+kill -0 $!
 cat parallel/stdout
 test `grep -c PASS parallel/stdout` -eq 8
 
-# Wait long enough so that there are no open files any more
-# when the post-test cleanup runs.
-while test ! -f serial/test-suite.log
-do
-  $sleep
+# Wait long enough so that there are no open files any more when the
+# post-test cleanup runs.  But exit after we've waited for two minutes
+# or more, to avoid testsuite hangs in unusual situations (this has
+# already happened).
+i=1
+while test ! -f serial/test-suite.log && test $i -le 120; do
+  i=`expr $i + 1`
+  sleep '1' # Extra quoting to please maintainer-check.
 done
 $sleep
+
 :
-- 
1.7.3.5
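
The two ideas in the patch, probing with `kill -0' and bounding the wait, can
be tried in isolation.  The following is a minimal sketch rather than the
test itself: a background subshell stands in for the serial "$MAKE -j1 check"
run, the marker name `done.log' is hypothetical (it plays the role of
serial/test-suite.log), and the loop gives up after 10 naps instead of 120.

```shell
tmpdir=$(mktemp -d)

# Stand-in for the background serial run: it creates a marker file
# after two seconds.
(sleep 2; : > "$tmpdir/done.log") &
pid=$!

# `kill -0' only checks that the process exists and may be signalled;
# nothing is delivered, so the job keeps running.
if kill -0 "$pid" 2>/dev/null; then
  probe=running
else
  probe=gone
fi
echo "background job is $probe"

# Bounded wait, in the style of the patched loop: give up after a fixed
# number of one-second naps instead of looping forever.
i=1
while test ! -f "$tmpdir/done.log" && test $i -le 10; do
  i=$(expr $i + 1)
  sleep 1
done

wait "$pid"
test -f "$tmpdir/done.log" && finished=yes || finished=no
echo "marker present: $finished"
rm -rf "$tmpdir"
```

Since no signal is ever sent, there is nothing for GNU make or the recipe
shell to relay, which is what sidesteps the process-group kill diagnosed
earlier in this report.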


Added tag(s) patch. Request was from Stefano Lattarini <stefano.lattarini <at> gmail.com> to control <at> debbugs.gnu.org. (Thu, 20 Oct 2011 13:42:02 GMT) Full text and rfc822 format available.

Reply sent to Stefano Lattarini <stefano.lattarini <at> gmail.com>:
You have taken responsibility. (Thu, 20 Oct 2011 19:51:02 GMT) Full text and rfc822 format available.

Notification sent to Stefano Lattarini <stefano.lattarini <at> gmail.com>:
bug acknowledged by developer. (Thu, 20 Oct 2011 19:51:02 GMT) Full text and rfc822 format available.

Message #24 received at 8788-done <at> debbugs.gnu.org (full text, mbox):

From: Stefano Lattarini <stefano.lattarini <at> gmail.com>
To: 8788-done <at> debbugs.gnu.org
Cc: automake-patches <at> gnu.org
Subject: Re: bug#8788: Weird testsuite failure on NetBSD (parallel tests,
	background processes)
Date: Thu, 20 Oct 2011 21:48:48 +0200
On Tuesday 18 October 2011, Stefano Lattarini wrote:
> On Tuesday 18 October 2011, Stefano Lattarini wrote:
> > Reference:
> >  <http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8788>
> >
> > [SNIP]
> >
> > Now, the right fix for the bug is *not* to work around this behaviour
> > of the Korn shell; rather, we should fix the suspicious logic of the
> > `parallel-tests3.test' script, which was also causing a testsuite hang
> > on FreeBSD.  Patch coming up shortly.
> > 
> And here is the promised patch (see attachment).  I will allow a couple of
> days for reviews before pushing.
> 
Pushed now.  I'm thus also closing bug#8788.

Regards,
  Stefano




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 18 Nov 2011 12:24:04 GMT) Full text and rfc822 format available.

bug unarchived. Request was from Stefano Lattarini <stefano.lattarini <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 08 Jan 2012 12:03:02 GMT) Full text and rfc822 format available.

Merged 8788 10447. Request was from Stefano Lattarini <stefano.lattarini <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 08 Jan 2012 12:03:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 05 Feb 2012 12:24:03 GMT) Full text and rfc822 format available.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.