GNU bug report logs - #78444
30.1; Crash in GC (vector_marked_p)


Package: emacs

Reported by: George P <georgepanagopo <at> gmail.com>

Date: Thu, 15 May 2025 18:46:01 UTC

Severity: normal

Found in version 30.1


From: Eli Zaretskii <eliz <at> gnu.org>
To: Pip Cet <pipcet <at> protonmail.com>
Cc: acorallo <at> gnu.org, georgepanagopo <at> gmail.com, 78444 <at> debbugs.gnu.org
Subject: bug#78444: 30.1; Crash in GC (vector_marked_p)
Date: Fri, 30 May 2025 19:15:20 +0300
> Date: Fri, 30 May 2025 16:10:03 +0000
> From: Pip Cet <pipcet <at> protonmail.com>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, 78444 <at> debbugs.gnu.org, Andrea Corallo <acorallo <at> gnu.org>
> 
> "George P" <georgepanagopo <at> gmail.com> writes:
> 
> > Thanks, Pip! Really appreciate you putting so much effort into this. See below for what you asked for.
> 
> > > Can you check whether eln files were created just before the crash?
> > > Over here, the relevant command would be
> > >
> > >  $ ls -Strl ~/.emacs.d/eln-cache/*/*
> > >
> > > and looking at the last few lines to see whether any files have the
> > > creation date in the right range (after you started using eshell but
> > > before the crash happened).
> >
> > Nothing in the directory (/u/panagopo/.config/emacs/.local/cache/eln/30.1-1ed0c1e8 for me) is dated the day of the crash. In fact, eshell isn't
> > there at all, which probably makes sense: I am using the built-in version, so it was already natively compiled.
> 
> Oh, okay.
> 
> > > So we first should inspect the memory around 0x196922b0 to find out
> > > whether it looks like a valid vector block, and whether the bad word was
> > > in this block or in the string object marked just before.  I'd suggest
> > > running
> > >
> > >     x/32gx 0x196922b0
> > >
> > > to look at the memory following the pointer, and
> > >
> > >     x/32gx 0x19692200
> > >
> > > to get some idea of whether it might be the middle of a vectorlike.
> >
> > (gdb)   x/32gx 0x196922b0
> > 0x196922b0: 0xc00000001f000005 0x0000000000000406
> > 0x196922c0: 0x000015554efc8fac 0x000000000ffdf6e1
> > 0x196922d0: 0x0000000000000016 0x000015554efc8f74
> > 0x196922e0: 0xc00000001200a000 0x00001555386f8c00
> 
> That looks like a perfectly ordinary bytecode closure in a vector block,
> except for the mysterious word 0xffdf6e1 where the constant vector
> should be.
> 
> Not only is that word not a Lisp object, it doesn't ring any bells at
> all - it's just below 256 MB if it's in bytes.  I'm not sure whether
> your memory layout places anything there, and even if it did it would be
> a strangely unaligned address.  Just in case, can you run:
> 
>     x/32gx 0xffdf600
>     p $rsp
> 
> Have you customized anything to 256 MB, precisely or approximately?
> What are your values for gcmh-high-cons-threshold and
> gcmh-low-cons-threshold, assuming that was in use during the crash?  Can
> you also try:
> 
>     p globals.f_gc_cons_threshold
>     p globals.f_string_chars_consed
>     p &globals.f_gc_cons_threshold
> 
> The rest of the memory dump looks normal; the other bytecode closures in
> the block are fine.  I must confess that it looks at this point like
> some random C code (we know it's not Lisp, at least) wrote a single word
> that doesn't ring any bells to a memory location that it may have owned
> at some point but which had been freed and reused for a vector block, or
> for a new vectorlike within the existing vector block.
> 
> There's a native comp unit in the block, and a lambda_gc_guard_h table.
> Can you
> 
>     p (struct Lisp_String *)0x000000001cfbfe40
> 
> to find out which one it is?
> 
> Thanks, in any case, even though I'm kind of stumped right now...

Can this be due to a GCC optimization bug, whereby it truncates 64-bit
values to 32 bits?  Maybe George should try -fno-tree-sra?
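
For anyone following along, the arithmetic behind the "not a Lisp object" and "just below 256 MB" remarks quoted above can be checked with a short sketch.  It assumes the usual 64-bit layout with three low tag bits and the tag numbering I recall from lisp.h (tag 1 unused); the table and the helper name classify are illustrative only and should be verified against the actual build.  It also shows that the word fits in 32 bits, which is compatible with the truncation idea but does not confirm it.

```python
# Assumed tag numbering (3 low bits); check lisp.h for the real build.
TAG_NAMES = {
    0: "symbol", 1: "unused", 2: "fixnum", 3: "cons",
    4: "string", 5: "vectorlike", 6: "fixnum", 7: "float",
}

def classify(word):
    """Return the assumed Lisp type name for a tagged 64-bit word."""
    return TAG_NAMES[word & 0b111]

# Slot words copied from the gdb dump above (the header word is skipped).
SLOTS = [
    0x0000000000000406,
    0x000015554efc8fac,
    0x000000000ffdf6e1,
    0x0000000000000016,
    0x000015554efc8f74,
]

SUSPECT = 0x0FFDF6E1
MIB_256 = 256 * 1024 * 1024

for w in SLOTS:
    print(f"{w:#018x}  tag={w & 7}  ({classify(w)})")

# Distance of the suspect word below 256 MiB, if read as a byte count.
print("bytes below 256 MiB:", MIB_256 - SUSPECT)

# The word has no bits set above bit 31.
print("fits in 32 bits:", SUSPECT < 2**32)
```

Under these assumptions the neighbouring slots decode as fixnums and strings, while the suspect word carries the unused tag, so it cannot be a valid constant vector.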




This bug report was last modified 43 days ago.


