GNU bug report logs - #68690
Segmentation fault building with native-comp

Package: emacs;

Reported by: john muhl <jm <at> pub.pink>

Date: Wed, 24 Jan 2024 16:44:02 UTC

Severity: normal

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.


Message #32 received at 68690 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: jm <at> pub.pink, 68690 <at> debbugs.gnu.org
Subject: Re: bug#68690: Segmentation fault building with native-comp
Date: Thu, 25 Jan 2024 12:26:29 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: jm <at> pub.pink,  68690 <at> debbugs.gnu.org
> Date: Wed, 24 Jan 2024 18:59:44 -0500
> 
> > Here's the backtrace from GDB:
> >
> >   lisp.h:1784: Emacs fatal error: assertion failed: VECTORLIKEP (a)
> >
> >   Thread 1 hit Breakpoint 1, terminate_due_to_signal (sig=22,
> >       backtrace_limit=2147483647) at emacs.c:442
> >   442       signal (sig, SIG_DFL);
> >   (gdb) bt
> >   #0  terminate_due_to_signal (sig=22, backtrace_limit=2147483647) at emacs.c:442
> >   #1  0x00772401 in die (msg=0xddc80d <b_fwd+233> "VECTORLIKEP (a)",
> >       file=0xddc740 <b_fwd+28> "lisp.h", line=1784) at alloc.c:8062
> >   #2  0x00626a44 in XVECTOR (a=XIL(0x92348b000000000)) at lisp.h:1784
> >   #3  0x00626ace in gc_asize (array=XIL(0x92348b000000000)) at lisp.h:1800
> >   #4  0x00626bba in AREF (array=XIL(0x92348b000000000), idx=1) at lisp.h:1971
> >   #5  0x0063174d in Fcharset_after (pos=make_fixnum(113)) at charset.c:2084
> 
> Hmm... I can't reproduce it here (even with native-comp and
> `--with-wide-int`).

This build is without native-comp, but it's a 32-bit build.  Did you
try that?  I think that's the key to unlock this (see below).

> The above stack frame suggests it might be related
> to commit 33b8d5b6c5a (and hence unrelated to the original bug#68690
> which was a bug in `DOHASH`).
> 
> Any chance you can investigate what is this `0x92348b000000000`?

It's obviously a bogus value, since Lisp objects in this build should
have their high 32 bits zero except for the type tag in the MSBs.
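
To make that concrete, here is a small sketch in Python (not part of
Emacs) that splits the two values printed by GDB into 32-bit halves.
The helper names are made up for illustration, and the 3-bit tag width
is my assumption, not something shown in the backtrace.

```python
# Split 64-bit Lisp object values, as printed by GDB, into 32-bit halves.
# Assumption (not stated in the message): the type tag is the top 3 bits.

def halves(word):
    """Return (high 32 bits, low 32 bits) of a 64-bit value."""
    return (word >> 32) & 0xFFFFFFFF, word & 0xFFFFFFFF

def tag(word, tag_bits=3):
    """Return the top tag_bits bits of a 64-bit value (assumed tag width)."""
    return word >> (64 - tag_bits)

good = 0xa000000009023d88   # cs->attributes as seen in temacs
bad = 0x92348b000000000     # the array argument in the crashing backtrace

for name, value in (("good", good), ("bad", bad)):
    high, low = halves(value)
    print(f"{name}: high={high:#010x} low={low:#010x} tag={tag(value)}")
```

The good value keeps the address reported by xvector (0x9023d88) in its
low half, while the bad value has an all-zero low half and something
pointer-like in the high half, where only a tag should be.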

> It should be a charset's attributes and the "idx=1" is because
> we're using `CHARSET_ATTR_NAME` to extract the name.

It sounds like we are not dumping the charset attributes correctly,
and that also corrupts all the fields of a struct charset following
the attributes.  Here's this charset in temacs:

  Thread 1 hit Breakpoint 2, dump_charset (ctx=0x5f6dad0, cs_i=0)
      at pdumper.c:3224
  3224      dump_field_lv (ctx, &out, cs, &cs->attributes, WEIGHT_NORMAL);
  (gdb) p cs
  $1 = (const struct charset *) 0x1050de0 <charset_table_init>
  (gdb) p *cs
  $2 = {
    id = 0,
    attributes = XIL(0xa000000009023d88),
    dimension = 1,
    code_space = {0, 127, 128, 128, 0, 0, 1, 128, 0, 0, 1, 128, 0, 0, 1},
    code_space_mask = 0x0,
    code_linear_p = 1,
    iso_chars_96 = 0,
    ascii_compatible_p = 1,
    supplementary_p = 0,
    compact_codes_p = 1,
    unified_p = 0,
    iso_final = 66,
    iso_revision = -1,
    emacs_mule_id = 0,
    method = CHARSET_METHOD_OFFSET,
    min_code = 0,
    max_code = 127,
    char_index_offset = 0,
    min_char = 0,
    max_char = 127,
    invalid_code = 128,
    fast_map = "\001", '\000' <repeats 188 times>,
    code_offset = 0
  }
  (gdb) p cs->attributes
  $3 = XIL(0xa000000009023d88)
  (gdb) xtype
  Lisp_Vectorlike
  PVEC_NORMAL_VECTOR
  (gdb) xvector
  $4 = (struct Lisp_Vector *) 0x9023d88
  {make_fixnum(0), XIL(0x2ca0), XIL(0xc0000000091014e0), XIL(0), XIL(0), XIL(0),
    XIL(0), XIL(0), XIL(0), XIL(0)}
  (gdb) p AREF(cs->attributes,1)
  $5 = 11424
  (gdb) xtype
  Lisp_Symbol
  (gdb) xsymbol
  $6 = (struct Lisp_Symbol *) 0x10beda0 <lispsym+11424>
  "ascii"

Looks entirely reasonable, and is the ASCII charset (makes sense since
the ID is zero).

And here's the same charset in emacs, after we restore from dump:

  #5  0x0063174d in Fcharset_after (pos=make_fixnum(113)) at charset.c:2084
  2084      return (CHARSET_NAME (charset));
  (gdb) p charset
  $1 = (struct charset *) 0x9100064
  (gdb) p *charset
  $2 = {
    id = 0,
    attributes = XIL(0x92848b000000000),
    dimension = -1610612736,
    code_space = {1, 0, 127, 128, 128, 0, 0, 1, 128, 0, 0, 1, 128, 0, 0},
    code_space_mask = 0x1 <error: Cannot access memory at address 0x1>,
    code_linear_p = 0,
    iso_chars_96 = 0,
    ascii_compatible_p = 0,
    supplementary_p = 0,
    compact_codes_p = 0,
    unified_p = 0,
    iso_final = 21,
    iso_revision = 66,
    emacs_mule_id = -1,
    method = CHARSET_METHOD_OFFSET,
    min_code = 0,
    max_code = 0,
    char_index_offset = 127,
    min_char = 0,
    max_char = 0,
    invalid_code = 127,
    fast_map = "\200\000\000\000\001", '\000' <repeats 184 times>,
    code_offset = 0
  }

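Note that the attributes are bogus (the low 32 bits are all zero), and
all the fields after that are shifted, by 32 bits as far as I can
tell: code_space now begins with the old dimension value 1, and
iso_revision holds the old iso_final value 66.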
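
The shift can be checked against the two dumps with a small Python
sketch. It only compares the values GDB printed (copied by hand into
the dictionaries below) and assumes each of these fields occupies one
32-bit slot. It does not explain why pdumper produces this layout.

```python
# Values copied from the two GDB dumps (temacs vs. restored emacs).
before = {
    "attributes": 0xa000000009023d88,
    "dimension": 1,
    "code_space": [0, 127, 128, 128, 0, 0, 1, 128, 0, 0, 1, 128, 0, 0, 1],
    "iso_final": 66,
    "iso_revision": -1,
}
after = {
    "dimension": -1610612736,
    "code_space": [1, 0, 127, 128, 128, 0, 0, 1, 128, 0, 0, 1, 128, 0, 0],
    "code_space_mask": 0x1,
    "iso_revision": 66,
    "emacs_mule_id": -1,
}

def as_u32(n):
    """Reinterpret a signed 32-bit integer as unsigned."""
    return n & 0xFFFFFFFF

# Each check asks: did a field pick up the value of its predecessor?
checks = {
    "dimension holds the high half of the old attributes":
        as_u32(after["dimension"]) == before["attributes"] >> 32,
    "code_space[0] holds the old dimension":
        after["code_space"][0] == before["dimension"],
    "rest of code_space is the old code_space moved up by one":
        after["code_space"][1:] == before["code_space"][:-1],
    "code_space_mask holds the old last code_space element":
        after["code_space_mask"] == before["code_space"][-1],
    "iso_revision holds the old iso_final":
        after["iso_revision"] == before["iso_final"],
    "emacs_mule_id holds the old iso_revision":
        after["emacs_mule_id"] == before["iso_revision"],
}

for description, ok in checks.items():
    print("ok  " if ok else "FAIL", description)
```

Every check passes on the values above, which is consistent with
everything from the attributes onward landing one 32-bit slot late.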

So I think we fail to dump the attributes, and my guess is that this
is related to the fact that in this build a pointer is 32-bit wide,
but a Lisp object is a 64-bit data type.

I tried to figure out what is wrong with how we dump this new field,
but got lost in the proverbial twisty little passages of pdumper.c,
all alike.  For example, I cannot understand why some fields which are
Lisp objects are dumped with dump_field_lv while others with
dump_field_lv_or_rawptr, and what is the significance of WEIGHT_NORMAL
vs WEIGHT_STRONG.  Hopefully, the above gives enough information for
you to figure this out.

TIA




This bug report was last modified 1 year and 116 days ago.