GNU bug report logs - #78444
30.1; Crash in GC (vector_marked_p)


Package: emacs;

Reported by: George P <georgepanagopo <at> gmail.com>

Date: Thu, 15 May 2025 18:46:01 UTC

Severity: normal

Found in version 30.1


From: Pip Cet <pipcet <at> protonmail.com>
To: George P <georgepanagopo <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, Andrea Corallo <acorallo <at> gnu.org>, 78444 <at> debbugs.gnu.org
Subject: bug#78444: 30.1; Crash in GC (vector_marked_p)
Date: Fri, 30 May 2025 16:10:03 +0000

"George P" <georgepanagopo <at> gmail.com> writes:

> Thanks, Pip! Really appreciate you putting so much effort into this.
> See below for what you asked for.
>
>> Can you check whether eln files were created just before the crash?
>> Over here, the relevant command would be
>>
>> $ ls -Strl ~/.emacs.d/eln-cache/*/*
>>
>> and looking at the last few lines to see whether any files have the
>> creation date in the right range (after you started using eshell but
>> before the crash happened).
>
> Nothing in the directory
> (/u/panagopo/.config/emacs/.local/cache/eln/30.1-1ed0c1e8 for me) that
> is on the same day of the crash. In fact, eshell isn't there at all,
> which probably makes sense as I am using the built-in version, so it
> was already natively-compiled.

Oh, okay.

>> So we first should inspect the memory around 0x196922b0 to find out
>> whether it looks like a valid vector block, and whether the bad word was
>> in this block or in the string object marked just before.  I'd suggest
>> running
>>
>>     x/32gx 0x196922b0
>>
>> to look at the memory following the pointer, and
>>
>>     x/32gx 0x19692200
>>
>> to get some idea of whether it might be the middle of a vectorlike.
>
> (gdb)   x/32gx 0x196922b0
> 0x196922b0: 0xc00000001f000005 0x0000000000000406
> 0x196922c0: 0x000015554efc8fac 0x000000000ffdf6e1
> 0x196922d0: 0x0000000000000016 0x000015554efc8f74
> 0x196922e0: 0xc00000001200a000 0x00001555386f8c00

That looks like a perfectly ordinary bytecode closure in a vector block,
except for the mysterious word 0xffdf6e1 where the constant vector
should be.
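
To make that reading of the dump concrete, here is a minimal sketch that
decodes the header word and the slots that follow it.  It is an
illustration only, not code from Emacs: the bit layout (3 low tag bits,
the mark and pseudovector flags in the top two bits, a 6-bit type field
starting at bit 24, a 12-bit slot count) is an assumption based on a
64-bit build with LSB tags as described in src/lisp.h, and the helper
names decode_header and decode_slot are made up for this example.

```python
# Assumed layout: 64-bit Emacs with LSB tags (see src/lisp.h).
# Verify these constants against your own build before relying on them.
GCTYPEBITS = 3
TAG_MASK = (1 << GCTYPEBITS) - 1
TAG_NAMES = {
    0: "symbol",
    1: "unused",      # no valid Lisp object carries this tag
    2: "fixnum",
    3: "cons",
    4: "string",
    5: "vectorlike",
    6: "fixnum",
    7: "float",
}

ARRAY_MARK_FLAG = 1 << 63           # set by the GC while marking
PSEUDOVECTOR_FLAG = 1 << 62
PSEUDOVECTOR_AREA_BITS = 24
PVEC_TYPE_MASK = 0x3F << PSEUDOVECTOR_AREA_BITS
PSEUDOVECTOR_SIZE_MASK = (1 << 12) - 1


def decode_header(word):
    """Split a vectorlike header word into its assumed fields."""
    return {
        "marked": bool(word & ARRAY_MARK_FLAG),
        "pseudovector": bool(word & PSEUDOVECTOR_FLAG),
        "pvec_type": (word & PVEC_TYPE_MASK) >> PSEUDOVECTOR_AREA_BITS,
        "lisp_slots": word & PSEUDOVECTOR_SIZE_MASK,
    }


def decode_slot(word):
    """Return (kind, payload) for one 64-bit word read as a Lisp_Object."""
    kind = TAG_NAMES[word & TAG_MASK]
    if kind == "fixnum":
        # Fixnums use only two tag bits; treat the word as signed first.
        if word >= 1 << 63:
            word -= 1 << 64
        return kind, word >> (GCTYPEBITS - 1)
    # Everything else: strip the tag to get the untagged pointer bits.
    return kind, word & ~TAG_MASK


if __name__ == "__main__":
    # Words copied from the gdb output quoted above.
    print(decode_header(0xC00000001F000005))
    for w in (0x406, 0x15554EFC8FAC, 0x0FFDF6E1, 0x16, 0x15554EFC8F74):
        kind, payload = decode_slot(w)
        print(f"{w:#018x}  {kind:10s} {payload:#x}")
```

Under those assumptions the header decodes to a marked pseudovector with
five Lisp slots, and four of the five slots decode to fixnums or strings,
which fits a closure's argument template, code, stack depth and
docstring.  The third slot, 0xffdf6e1, carries the unused tag, which is
one way to see that it is not a Lisp object.  The numeric type code is
deliberately not mapped to an enum name here, since the order of that
enum changes between Emacs versions.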

Not only is that word not a Lisp object, it doesn't ring any bells at
all - it's just below 256 MB if it's in bytes.  I'm not sure whether
your memory layout places anything there, and even if it did it would be
a strangely unaligned address.  Just in case, can you run:

    x/32gx 0xffdf600
    p $rsp

Have you customized anything to 256 MB, precisely or approximately?
What are your values for gcmh-high-cons-threshold and
gcmh-low-cons-threshold, assuming that was in use during the crash?  Can
you also try:

    p globals.f_gc_cons_threshold
    p globals.f_string_chars_consed
    p &globals.f_gc_cons_threshold

The rest of the memory dump looks normal; the other bytecode closures in
the block are fine.  I must confess that at this point it looks like
some random C code (we know it's not Lisp, at least) wrote a single
word, one that doesn't ring any bells, to a memory location that it may
have owned at some point but which had since been freed and reused for a
vector block, or for a new vectorlike within the existing vector block.

There's a native comp unit in the block, and a lambda_gc_guard_h table.
Can you run

    p (struct Lisp_String *)0x000000001cfbfe40

to find out which one it is?

Thanks, in any case, even though I'm kind of stumped right now...

Pip





This bug report was last modified 3 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.