GNU bug report logs - #40525
inferior process on core-updates crashes: mmap(PROT_NONE) failed


Package: guix;

Reported by: Christopher Baines <mail <at> cbaines.net>

Date: Thu, 9 Apr 2020 19:46:01 UTC

Severity: serious

Done: Christopher Baines <mail <at> cbaines.net>

Bug is archived. No further changes may be made.



Message #19 received at 40525 <at> debbugs.gnu.org (full text, mbox):

From: Christopher Baines <mail <at> cbaines.net>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 40525 <at> debbugs.gnu.org
Subject: Re: bug#40525: inferior process on core-updates crashes:
 mmap(PROT_NONE) failed
Date: Thu, 16 Apr 2020 18:29:59 +0100
Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi Christopher,
>
> Christopher Baines <mail <at> cbaines.net> skribis:
>
>> I've attached a script that when run should reproduce the issue. I
>> extracted the code relating to lint warnings from the Guix Data
>> Service. The script attached runs this code twice against the inferior,
>> once will often be enough to cause it to crash, but twice should
>> reproduce it more reliably.
>
> Thanks a lot.
>
> Here’s a backtrace from the core dumped by the inferior:

...

> It could be an unbounded growth of libgc’s finalizer table or our weak
> tables as we experienced in <https://bugs.gnu.org/28590>.
>
> We should be able to reproduce it with something like:
>
>   guix time-machine --commit=d523eb5c9c2659cbbaf4eeef3691234ae527ee6a -- \
>     lint -c inputs-should-be-native,license,mirror-url,source-file-name,source-unstable-tarball,derivation,patch-file-names,formatting,synopsis
>
> In top one can see that heap usage keeps growing, which may well be a
> bug in Guix proper rather than in Guile… but it doesn’t crash.
>
> I would propose three actions here:
>
>   1. Run linters under ‘gcprof’ to see what’s eating memory and hopefully
>      find and address the leak.  As a start, maybe just start reducing
>      the list of checkers to see if there’s one of them that’s causing
>      it.
>
>      The ‘derivation’ checker is definitely responsible for a lot of the
>      heap consumption because of the various caches in (guix packages) &
>      co.  Perhaps add calls to ‘invalidate-derivation-caches!’ as in
>      (gnu ci).
>
>   2. Work around the problem in Guix Data Service by running, say, one
>      inferior per checker instead of one inferior for all checkers for
>      all packages.
>
>   3. If #1 didn’t help, let’s see if we can isolate a Guile weak-table
>      bug or something like that.
>
> Thoughts?
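
The first suggested action, narrowing down the list of checkers, could be scripted roughly as follows. This is only a sketch: it reuses the commit and the checker names from the command quoted above, the `RUN` variable and the `lint_command` helper are made up for illustration, and by default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: one `guix lint` invocation per checker, to see which checker
# (if any) accounts for the heap growth or the crash.
# Commit and checker names are copied from the command quoted above.
commit=d523eb5c9c2659cbbaf4eeef3691234ae527ee6a
checkers="inputs-should-be-native license mirror-url source-file-name source-unstable-tarball derivation patch-file-names formatting synopsis"

# Hypothetical helper: print the command line for a single checker.
lint_command() {
    printf 'guix time-machine --commit=%s -- lint -c %s\n' "$commit" "$1"
}

for checker in $checkers; do
    cmd=$(lint_command "$checker")
    if [ "${RUN:-0}" = 1 ]; then
        # RUN=1 (an assumption of this sketch) actually executes the command.
        echo "== $checker"
        $cmd
        echo "exit status: $?"
    else
        # Default: dry run, just show what would be executed.
        echo "$cmd"
    fi
done
```

Run with `RUN=1` while watching the process in top; a checker whose run shows the growth or aborts is the one to look at first.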

Thanks, that's useful to know.

I think I've now managed to find a way of reproducing this without the
inferior getting in the way. I was testing whether triggering garbage
collection in Guile would help avoid the problem, but it actually seems
to cause it. Given the mentions of GC in the above stack trace and the
major version change of libgc, a GC-related bug seems quite likely
here.

I've been testing with a checkout of Guix built with Guix from the
core-updates branch. I think that provides the same broken Guile that
the guix repl is using.

When I just use a checkout of the core-updates branch together with the
guile built from that branch, I get the following odd error:

→ ./pre-inst-env /gnu/store/18hp7flyb3yid3yp49i6qcdq0sbi5l1n-guile-3.0.2/bin/guile ./reproduce-core-updates-mmap-PROT_NONE-failed.scm
guile: warning: failed to install locale
warning: failed to load '(gnu packages abiword)': Function not implemented
error: git-fetch: unbound variable
hint: Did you forget `(use-modules (guix git-download))'?

error: git-version: unbound variable



No idea what's happening there, but when I run ./configure and make
with packages from core-updates, I seem to end up with a setup that
works.

This is the guile I'm using: /gnu/store/18hp7flyb3yid3yp49i6qcdq0sbi5l1n-guile-3.0.2/bin/guile

If you just run the script, you should see:

→ ./pre-inst-env guile ./reproduce-core-updates-mmap-PROT_NONE-failed.scm

;;; ("%package-table-setup" #<hash-table 7f5f329278a0 13275/28099>)
mmap(PROT_NONE) failed
Aborted


For more information, you can pipe the script to the REPL. What you
should see is that it's slow to compute the lint warnings the first
time, but the subsequent times are quick, and it crashes in one of the
(gc) calls.
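
Piping the script to the REPL might look like the sketch below. The script name and the `./pre-inst-env guile` invocation are the ones shown above; the `repl-session.log` file name and the existence checks are assumptions added for illustration.

```shell
#!/bin/sh
# Sketch: feed the reproducer to the Guile REPL on standard input so that
# the result of each top-level form is printed, and keep a copy of the output.
script=./reproduce-core-updates-mmap-PROT_NONE-failed.scm
repl_log=repl-session.log   # hypothetical log file name

if [ -x ./pre-inst-env ] && [ -f "$script" ]; then
    # Reading forms from stdin makes guile behave as a REPL rather than
    # running the file as a script.
    ./pre-inst-env guile < "$script" 2>&1 | tee "$repl_log"
else
    echo "run this from a built Guix checkout containing $script" >&2
fi
```

The last lines of the log should then show which form was being evaluated when the process aborted.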

I'm going to keep looking into this; at least it'll be easier to delve
into Guile now that I can directly control which guile is used.

Thanks,

Chris

[reproduce-core-updates-mmap-PROT_NONE-failed.scm (text/plain, attachment)]
[signature.asc (application/pgp-signature, inline)]
