GNU bug report logs - #76705
31.0.50; igc: crash

Package: emacs;

Reported by: Óscar Fuentes <oscarfv <at> eclipso.eu>

Date: Mon, 3 Mar 2025 04:33:04 UTC

Severity: normal

Found in version 31.0.50

From: Pip Cet <pipcet <at> protonmail.com>
To: Helmut Eller <eller.helmut <at> gmail.com>
Cc: Gerd Möllmann <gerd.moellmann <at> gmail.com>, Óscar Fuentes <oscarfv <at> eclipso.eu>, Eli Zaretskii <eliz <at> gnu.org>, 76705 <at> debbugs.gnu.org
Subject: bug#76705: 31.0.50; igc: crash
Date: Wed, 05 Mar 2025 13:45:59 +0000

"Helmut Eller" <eller.helmut <at> gmail.com> writes:

> On Tue, Mar 04 2025, Pip Cet wrote:
>
>> Gerd, Helmut, Eli:
>>
>> See https://github.com/Ravenbrook/mps/issues/304 for the description of
>> a Linux-specific MPS issue that currently causes hard crashes when we
>> create "too many" mappings for the Linux kernel (this is purely a kernel
>> issue, so no GNU/ prefix here), which causes 'mprotect' to fail with
>> ENOMEM.
>
> For me, this code
> ;; -*- lexical-binding: t -*-
>
> (defun test ()
>   (let* ((npages 120000)
> 	 (pages (make-vector npages nil)))
>     (dotimes (i npages)
>       (aset pages i (make-vector (/ 4096 2) 0)))
>     (message "nmaps: %s"
> 	     (shell-command-to-string
> 	      (format "wc -l /proc/%d/maps" (emacs-pid))))
>     (dotimes (i npages)
>       (when (zerop (% i 2))
> 	(let ((page (aref pages i)))
> 	  (aset page 0 1))))))
>
> crashes with:
> protix.c:117: Emacs fatal error: assertion failed: unreachable code
>
> Though this seems to require >4GB.

3.23user 4.00system 0:07.56elapsed 95%CPU (0avgtext+0avgdata 3723392maxresident)k

with some modifications.  So it seems that ARENA_GRAIN_SIZE doesn't
influence it (it seems to be ProtGranularity, which is bad news).

And, as far as I know, we don't know how eagerly MPS installs memory
barriers and what the worst case is.

>> We have to find a workaround for this.
>
> We could emit a warning (similar to [1]) telling users to increase
> max_map_count, e.g. if COMMIT_LIMIT / PAGE_SIZE > max_map_count.

Yeah, that's the worst case: tell people to reconfigure their kernel so
they can run Emacs.
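
For reference, the check Helmut describes could look roughly like the
sketch below.  Nothing here exists in igc.c or MPS:
read_max_map_count and commit_limit_exceeds_maps are hypothetical
names, and the sketch assumes the worst case of one mapping per page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: read the kernel's per-process mapping limit.
   Returns -1 if the file cannot be read (e.g. not on Linux).  */
long
read_max_map_count (void)
{
  FILE *f = fopen ("/proc/sys/vm/max_map_count", "r");
  long n = -1;
  if (f == NULL)
    return -1;
  if (fscanf (f, "%ld", &n) != 1)
    n = -1;
  fclose (f);
  return n;
}

/* Hypothetical predicate: true if, in the assumed worst case of one
   mapping per page, COMMIT_LIMIT bytes could need more mappings than
   the kernel allows.  An unknown limit never triggers the warning.  */
bool
commit_limit_exceeds_maps (size_t commit_limit, size_t page_size,
                           long max_map_count)
{
  if (max_map_count < 0 || page_size == 0)
    return false;
  return commit_limit / page_size > (size_t) max_map_count;
}
```

A caller would pass its commit limit, the page size, and the result of
read_max_map_count (), and print the warning when the predicate is
true.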

> We could also set COMMIT_LIMIT to max_map_count / PAGE_SIZE, so that MPS
                                                  ^

That should be '*', I hope.  Yes, that's the 256 MB/512 MB limit for MPS memory, except
it's actually less because of all the non-MPS map entries.
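
To make the arithmetic concrete, here is a small sketch of that
ceiling.  The function name mps_map_budget_bytes is hypothetical.  The
figures in the usage note assume the usual Linux default of 65530 for
vm.max_map_count and a granularity of 4 KiB or 8 KiB, values chosen
only because they reproduce the 256 MB/512 MB numbers above.

```c
#include <stddef.h>

/* Hypothetical helper: upper bound, in bytes, on memory that can be
   protected if every GRANULARITY-sized unit ends up in its own
   mapping and USED_MAPS entries are already taken by malloc,
   dlopen, and other non-MPS mappings.  */
size_t
mps_map_budget_bytes (size_t max_map_count, size_t used_maps,
                      size_t granularity)
{
  if (used_maps >= max_map_count)
    return 0;
  return (max_map_count - used_maps) * granularity;
}
```

With 65530 mappings and nothing else mapped, a 4096-byte granularity
gives 268410880 bytes (just under 256 MiB) and an 8192-byte
granularity gives 536821760 bytes (just under 512 MiB).  Every mapping
used elsewhere lowers the bound by one granule.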

> tries harder to reuse memory.  Besides issue #288[4], we would likely
> see situations where MPS gets slower and slower because it can free some
> memory but not enough to make much progress on non-GC tasks.

Doesn't seem much better, and it assumes all of the maps are available
for MPS to use, which isn't true: thousands of them are already used for
malloc, dlopen(), and mmap.

>> Increasing ARENA_GRAIN_SIZE may delay running into the problem, or there
>> might be a way to handle a failed mprotect call and at least recover the
>> Emacs session, but ultimately the maximum safe size of an Emacs session
>> appears to be ARENA_GRAIN_SIZE * /proc/sys/vm/max_map_count, or about
>> 512 MB of MPS memory (minus whatever mappings libraries, malloc, and
>> mmap require).  That's obviously not sufficient.
>
> My guess is that ARENA_GRAIN_SIZE is used for mmap/munmap but the memory
> barriers always work at page granularity.  I could be wrong of course.

That seems correct given your code above, and considering

Size ProtGranularity(void)
{
  /* Individual pages can be protected. */
  return PageSize();
}

which we could also override, of course.

>> One open question is whether Linux is actually capable of merging
>> adjacent mappings when they're 'mprotect'ed to have the same
>> permissions.
>
> That's easy to verify with the attached C code.  Without the second call
> to change_protection, the number of mappings increases until the process
> aborts.

The question was whether removing a memory barrier (making the
protection the same as the surrounding mappings) would *reliably* reduce
the number of mappings again or *sometimes* keep redundant adjacent
mappings with the same protection.

Mappings are coalesced sometimes, as we know, but sometimes redundant
mappings show up in /proc/$$/maps, so it doesn't seem to be reliable
(and even if it were reliable now, which Linux versions does that apply
to?)
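
A way to observe this on a given kernel is sketched below.  This is
not Helmut's attached program (which is not reproduced here), and the
names count_maps and split_and_restore are made up for illustration.
It maps some anonymous pages, makes every other page read-only, then
restores the original protection and records the number of lines in
/proc/self/maps at each step.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count the lines of /proc/self/maps; -1 on error.  */
long
count_maps (void)
{
  FILE *f = fopen ("/proc/self/maps", "r");
  long n = 0;
  int c;
  if (f == NULL)
    return -1;
  while ((c = getc (f)) != EOF)
    if (c == '\n')
      n++;
  fclose (f);
  return n;
}

/* Map NPAGES anonymous pages, make every other page read-only, then
   restore read/write protection.  COUNTS[0..2] receive the mapping
   counts before the split, after the split, and after the restore.
   Returns 0 on success, -1 on failure.  */
int
split_and_restore (size_t npages, long counts[3])
{
  size_t page = (size_t) sysconf (_SC_PAGESIZE);
  size_t len = npages * page;
  int rc = 0;
  char *base = mmap (NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED)
    return -1;

  count_maps ();                /* Let stdio allocate its buffer first.  */
  counts[0] = count_maps ();

  for (size_t i = 0; i < npages && rc == 0; i += 2)
    rc = mprotect (base + i * page, page, PROT_READ);
  counts[1] = count_maps ();

  for (size_t i = 0; i < npages && rc == 0; i += 2)
    rc = mprotect (base + i * page, page, PROT_READ | PROT_WRITE);
  counts[2] = count_maps ();

  munmap (base, len);
  return rc == 0 ? 0 : -1;
}
```

Comparing counts[2] with counts[0] shows whether the kernel merged the
mappings again in that one run.  It says nothing about whether merging
is reliable under other access patterns or on other Linux versions.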

Thanks a lot for your investigation! I'm afraid it looks a lot like we
need to modify MPS to make it usable for ordinary Emacs use cases: it
would have to keep track of the number of mapping discontinuities (I
don't even know how to get that number short of reading /proc/$$/maps
and counting lines), and give up and trigger some memory barriers if
that number gets too close to the limit.
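
The line-counting fallback, plus a "too close to the limit" test, might
be sketched as follows.  This is an illustration only, not MPS code:
count_self_maps and maps_near_limit are hypothetical names, the RESERVE
margin is an arbitrary policy knob, and the counter uses open/read
rather than stdio on the assumption that allocating memory while
counting is undesirable.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Count newline characters in /proc/self/maps using only open/read,
   so counting itself does not allocate.  Returns -1 on error.  */
long
count_self_maps (void)
{
  char buf[4096];
  long lines = 0;
  ssize_t n;
  int fd = open ("/proc/self/maps", O_RDONLY);
  if (fd < 0)
    return -1;
  while ((n = read (fd, buf, sizeof buf)) > 0)
    for (ssize_t i = 0; i < n; i++)
      if (buf[i] == '\n')
        lines++;
  close (fd);
  return n < 0 ? -1 : lines;
}

/* Hypothetical policy: true when fewer than RESERVE mappings remain
   below MAX_MAP_COUNT.  Unknown values (negative) are treated as
   "not near the limit"; a real caller might prefer the opposite.  */
bool
maps_near_limit (long current, long max_map_count, long reserve)
{
  if (current < 0 || max_map_count < 0)
    return false;
  return max_map_count - current < reserve;
}
```

Reading the whole file costs time proportional to the number of
mappings, so this would only be practical as an occasional check, not
on every barrier.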

Pip





This bug report was last modified 162 days ago.
