GNU bug report logs - #68860: race condition with make recheck
To reply to this bug, email your comments to 68860 AT debbugs.gnu.org.
Report forwarded to bug-automake <at> gnu.org: bug#68860; Package automake. (Thu, 01 Feb 2024 01:13:02 GMT)
Full text and rfc822 format available.
Acknowledgement sent to Peter Johansson <trojkan <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-automake <at> gnu.org. (Thu, 01 Feb 2024 01:13:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi automakers,
I think I've found a race condition with 'make recheck' that results in
a source file being compiled twice in parallel and resulting in a
failure such as
mv: cannot stat '.deps/foo.Tpo': No such file or directory
In my trimmed down example my Makefile.am looks like:
lib_LIBRARIES = libfoo.a
libfoo_a_SOURCES = foo.cc
check_LIBRARIES = libtest.a
libtest_a_SOURCES = test.cc
TESTS = one.test two.test
TEST_EXTENSIONS = .test
AM_DEFAULT_SOURCE_EXT = .cc
EXTRA_PROGRAMS = $(TESTS)
libtest_a_LIBADD = libfoo.a
LDADD = libtest.a libfoo.a
The problem seems to be that both $(TESTS) and check_LIBRARIES depend on
libfoo.a and trigger compilation of foo.cc. I haven't managed to get the
same problem with 'make check', so I thought comparing the generated
rules for check: and recheck: would be useful.
recheck: all $(check_LIBRARIES)
	<long rule running failed TESTS>
all: config.h
	$(MAKE) $(AM_MAKEFLAGS) all-am
...
check-am: all-am
	$(MAKE) $(AM_MAKEFLAGS) $(check_LIBRARIES)
	$(MAKE) $(AM_MAKEFLAGS) check-TESTS
check: check-am
I can see how the "check-am: all-am" works as firewall against the race
condition. OTOH, in the rule for recheck, 'all' triggers a sub-process
that will build libfoo.a and in the meantime the main process will build
$(check_LIBRARIES) which trigger the building of libfoo.a. My
understanding of parallel make is a bit hazy, but I guess the main
process and sub-process are only talking wrt how many workers they
employ and are not talking about which rules to work on.
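The scenario described above can be imitated without Automake. The following is a minimal stand-alone sketch, not the generated Makefile.in: it assumes GNU make, all file names are illustrative, and the foo.o recipe is a hypothetical stand-in that imitates the compile-then-rename step with sleep and mv. Run it as 'make -j2 recheck' in an empty directory; because the failure is timing-dependent, the mv error may or may not appear on a given run.

```make
# Stand-alone imitation of the rule shapes quoted above (assumes GNU make).
.PHONY: recheck all all-am

# 'all' and 'libtest.a' are independent prerequisites, so under -j
# make is free to work on both at the same time.
recheck: all libtest.a
	@echo "rerun failed tests here"

# As in the generated rule, 'all' recurses into a sub-make.
all:
	$(MAKE) all-am

all-am: libfoo.a

# The check library also needs libfoo.a, this time in the parent make.
libtest.a: libfoo.a
	touch $@

libfoo.a: foo.o
	touch $@

# Hypothetical stand-in for the compile step: write a temporary
# dependency file, "compile", then rename the temporary file.
foo.o:
	echo deps > foo.Tpo
	sleep 1
	touch $@
	mv -f foo.Tpo foo.Po
```

The parent make and the sub-make each decide independently that foo.o is out of date, so both may run its recipe; whichever mv runs second then finds foo.Tpo already renamed.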
I suppose either this is not by design or I'm doing something illegal by
having check_LIBRARIES depend on stuff that is built within 'make all'. I'm
not sure what the best way to fix this would be. One idea would be to
change the rule for recheck to
recheck: all
	$(MAKE) $(AM_MAKEFLAGS) $(check_LIBRARIES)
	<long rule running failed TESTS>
but personally I don't fancy these sub-processes because it feels like
they are the core of the problem for this sort of race condition.
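The proposed rule can be tried without Automake. This is a minimal sketch assuming GNU make, with illustrative targets and a hypothetical foo.o recipe that imitates compile-then-rename. Since recheck has 'all' as its only prerequisite, the sub-make for 'all' has finished before the recipe starts the second sub-make, so the two never overlap.

```make
# Stand-alone sketch of the serialized form (assumes GNU make).
.PHONY: recheck all all-am

# Only 'all' is a prerequisite; the check library is built by a
# sub-make in the recipe, which starts after 'all' has completed.
recheck: all
	$(MAKE) libtest.a
	@echo "rerun failed tests here"

all:
	$(MAKE) all-am

all-am: libfoo.a

libtest.a: libfoo.a
	touch $@

libfoo.a: foo.o
	touch $@

# Hypothetical stand-in for the compile step.
foo.o:
	echo deps > foo.Tpo
	sleep 1
	touch $@
	mv -f foo.Tpo foo.Po
```

With 'make -j2 recheck' the second sub-make finds libfoo.a already up to date, so the foo.o recipe runs only once.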
I have tested with automake 1.16.5 (ubuntu) and 1.16i.
Please find attached a trimmed down example of the problem.
Best Regards,
Peter
[automake.sh (application/x-shellscript, attachment)]
Information forwarded to bug-automake <at> gnu.org: bug#68860; Package automake. (Thu, 01 Feb 2024 22:26:01 GMT)
Message #8 received at 68860 <at> debbugs.gnu.org (full text, mbox):
Hi Peter,
> The problem seems to be that both $(TESTS) and check_LIBRARIES depend on
> libfoo.a and trigger compilation of foo.cc.
Thanks much for the report and analysis. What you wrote looks sensible
to me.
> My understanding of parallel make is a bit hazy,
Me too :(. If anyone else here has a chance to look into this, that
would be great.
> One idea would be to change the rule for recheck to
It looks plausible. Another possibility that comes to mind is to make
the recheck target more parallel to all, i.e., with a recheck-am
target. I'm not sure.
> Please find attached a trimmed down example of the problem.
Thanks again. Will ponder. --best, karl.
Information forwarded to bug-automake <at> gnu.org: bug#68860; Package automake. (Fri, 16 Aug 2024 22:23:02 GMT)
Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):
Hello.
Thank you for reporting the issue.
The attached patch should fix the problem. It may be a bit of an
overkill, perhaps just one of the fixes would suffice, but it seems to
work at least.
I've re-made your useful script into an Automake test. Since
non-deterministic defects may be hard to find and fix, and certainly
harder to test if they're fixed, the new version simply runs parallel
'make recheck' a few times "just in case". Without the fix, the test
failed in the first or the second run. With the fix, the test (which
runs 'make recheck' 5 times) passed 5 times in a row. This *should* be
a decent sample.
All tests with "check" in the name pass.
The test and my patch can, of course, be adapted and further changed.
--
Regards - Bogdan ('bogdro') D. (GNU/Linux & FreeDOS)
X86 assembly (DOS, GNU/Linux): http://bogdro.evai.pl/index-en.php
Soft(EN): http://bogdro.evai.pl/soft http://bogdro.evai.pl/soft4asm
www.Xiph.org www.TorProject.org www.LibreOffice.org www.GnuPG.org
[automake-recheck-race-mail.diff (text/x-patch, attachment)]
Information forwarded to bug-automake <at> gnu.org: bug#68860; Package automake. (Sat, 17 Aug 2024 22:24:01 GMT)
Message #17 received at 68860 <at> debbugs.gnu.org (full text, mbox):
Thanks Bogdan! I will review as soon as I have a chance. --best, karl.
Information forwarded to bug-automake <at> gnu.org: bug#68860; Package automake. (Fri, 23 Aug 2024 21:12:02 GMT)
Message #20 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi.
I've just noticed that bug #68860 (patched) may be a duplicate of
#26471. Different descriptions and error messages, but looks like the
same cause.
--
Regards - Bogdan ('bogdro') D. (GNU/Linux & FreeDOS)
Information forwarded to bug-automake <at> gnu.org: bug#68860; Package automake. (Sun, 25 Aug 2024 16:47:02 GMT)
Message #26 received at 68860 <at> debbugs.gnu.org (full text, mbox):
Thanks much, Bogdan.
-recheck: all %CHECK_DEPS%
+recheck: all-am %CHECK_DEPS%
Do you have a grip on all-am? Looking at handle_all in bin/automake, I
admit I remain baffled as to what all those pieces of all-am are, and
why it's done as it is.
- $output_rules .= "check-am: all-am\n";
+ $output_rules .= "check-am: all-am";
if (@check)
{
- pretty_print_rule ("\t\$(MAKE) \$(AM_MAKEFLAGS)", "\t ", @check);
+ $output_rules .= " @check";
+ #pretty_print_rule ("\t\$(MAKE) \$(AM_MAKEFLAGS)", "\t ", @check);
depend ('.MAKE', 'check-am');
}
+ $output_rules .= "\n";
So I gather the basic fix is to output the check targets as dependencies of
check-am, instead of as sub-makes. That seems a plausible reason and fix
for the parallel bug to me.
Anyway, I will tweak a few words and install this soon. --thanks again, karl.
Information forwarded to bug-automake <at> gnu.org: bug#68860; Package automake. (Sun, 25 Aug 2024 18:45:01 GMT)
Message #29 received at 68860 <at> debbugs.gnu.org (full text, mbox):
Karl Berry <karl <at> freefriends.org>, 2024-08-25 10:45:
> Thanks much, Bogdan.
>
> -recheck: all %CHECK_DEPS%
> +recheck: all-am %CHECK_DEPS%
>
> Do you have a grip on all-am? Looking at handle_all in bin/automake, I
> admit I remain baffled as to what all those pieces of all-am are, and
> why it's done as it is.
To be honest, not really :). At least, not fully. As far as I
understand/remember, those "all-am" were the ones processed
recursively. But, I may be wrong, seeing this comment in handle_all:
# We need to make sure config.h is built before we recurse.
# We also want to make sure that built sources are built
# before any ordinary 'all' targets are run. We can't do this
# by changing the order of dependencies to the "all" because
# that breaks when using parallel makes. Instead we handle
# things explicitly.
So, "all" just checks/remakes config.h before starting "the real work"
in all-am (be it recursive or not, parallel or not).
> - $output_rules .= "check-am: all-am\n";
> + $output_rules .= "check-am: all-am";
> if (@check)
> {
> - pretty_print_rule ("\t\$(MAKE) \$(AM_MAKEFLAGS)", "\t ", @check);
> + $output_rules .= " @check";
> + #pretty_print_rule ("\t\$(MAKE) \$(AM_MAKEFLAGS)", "\t ", @check);
> depend ('.MAKE', 'check-am');
> }
> + $output_rules .= "\n";
>
> So I gather the basic fix is to output the check targets as dependencies of
> check-am, instead of as sub-makes. That seems a plausible reason and fix
> for the parallel bug to me.
Yes, I'm adding the dependencies as I believe they should be. Here
and in check.am. Maybe the check.am is too much (especially seeing
that skipping the dependency on config.h may *not* be desired) and
fixing only the code will be enough.
As is usual with non-deterministic problems, it's not 100% guaranteed
that this fixes the problem. But a few runs of parallel 'make
recheck' seem to confirm it.
> Anyway, I will tweak a few words and install this soon. --thanks again, karl.
No problem. And thanks :)
--
Regards - Bogdan ('bogdro') D. (GNU/Linux & FreeDOS)
Information forwarded to bug-automake <at> gnu.org: bug#68860; Package automake. (Mon, 26 Aug 2024 01:19:02 GMT)
Message #32 received at 68860 <at> debbugs.gnu.org (full text, mbox):
> - $output_rules .= "check-am: all-am\n";
> + $output_rules .= "check-am: all-am";
> if (@check)
> {
> - pretty_print_rule ("\t\$(MAKE) \$(AM_MAKEFLAGS)", "\t ", @check);
> + $output_rules .= " @check";
Looking again, the comment before this code says:
# The check target must depend on the local equivalent of
# 'all', to ensure all the primary targets are built. Then it
# must build the local check rules.
.. which makes sense. We have to make all before we can make check.
Hence the check targets can't be dependencies, since then they would be
run in parallel with make, and the programs built by 'all' might not be
built yet. This explains why they made it a sub-make.
So I'm puzzled as to how all the tests can still be passing. Maybe there
is no test specifically for this? --thanks, karl.
Information forwarded to bug-automake <at> gnu.org: bug#68860; Package automake. (Mon, 26 Aug 2024 19:52:02 GMT)
Message #35 received at 68860 <at> debbugs.gnu.org (full text, mbox):
Karl Berry <karl <at> freefriends.org>, 2024-08-25 19:17:
> > - $output_rules .= "check-am: all-am\n";
> > + $output_rules .= "check-am: all-am";
> > if (@check)
> > {
> > - pretty_print_rule ("\t\$(MAKE) \$(AM_MAKEFLAGS)", "\t ", @check);
> > + $output_rules .= " @check";
>
> Looking again, the comment before this code says:
>
> # The check target must depend on the local equivalent of
> # 'all', to ensure all the primary targets are built. Then it
> # must build the local check rules.
>
> .. which makes sense. We have to make all before we can make check.
> Hence the check targets can't be dependencies, since then they would be
> run in parallel with make, and the programs built by 'all' might not be
> built yet. This explains why they made it a sub-make.
Totally makes sense, and I'm not removing the dependency on all-am.
When I see that the first command of a target is a 'make', I start
thinking that something in dependency management is wrong. It
shouldn't be needed, right? That's one of the jobs 'make' does -
figure out what needs to be built and in what order. So, if the
dependencies were correct in the first place, maybe running 'make'
in a target wouldn't be needed (well, not in the beginning, at least).
That's why I'm adding @check to the dependency list instead of
building it manually as the first command. The dependencies /should/
be computed correctly and built just once (if needed, that is).
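The point about letting a single make process work out the order can be illustrated with a small stand-alone sketch (GNU make assumed; the targets are illustrative stand-ins, not Automake output). With the check library listed as a prerequisite, one process sees every path to libfoo.a and runs its recipe once, even under -j. The ordering concern raised earlier in the thread still applies, though: here libtest.a waits for libfoo.a only because that dependency is declared explicitly, and nothing orders it after the rest of all-am.

```make
# Stand-alone sketch: check library as a prerequisite (assumes GNU make).
.PHONY: check-am all-am

# Check library listed as a prerequisite rather than built by a
# sub-make in the recipe.
check-am: all-am libtest.a
	@echo "run the tests here"

all-am: libfoo.a

# Explicit dependency: this is what makes libtest.a wait for libfoo.a.
libtest.a: libfoo.a
	touch $@

# Reached via all-am and via libtest.a, but built once per make process.
libfoo.a:
	@echo "building $@"
	sleep 1
	touch $@
```

Running 'make -j4 check-am' in an empty directory should print "building libfoo.a" a single time.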
But correct dependencies may exist only in a perfect world.
There probably were reasons to do it this way, like parallel make
(which /should/ work correctly, but maybe not all implementations do)
or some implementations that e.g. don't follow the order and break the
builds because of that, or too many too complicated dependencies to
put on each target, or...
So, what do we do? It has just become a bit scary to apply the
patch, but it looks like it's exactly the dependency list that should
be fixed...
> So I'm puzzled as to how all the tests can still be passing. Maybe there
> is no test specifically for this? --thanks, karl.
Maybe. Or maybe tests pass on the well-behaving GNU Make, but not on
all 'make's. Or I didn't run the "right ones".
--
Regards - Bogdan ('bogdro') D. (GNU/Linux & FreeDOS)
This bug report was last modified 290 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.