GNU bug report logs - #76517
31.0.50; feature/igc 6ff509af3d31 crash on Wayland KDE, (with -g3

Package: emacs

Reported by: Eval Exec <execvy <at> gmail.com>

Date: Mon, 24 Feb 2025 02:28:02 UTC

Severity: normal

Found in version 31.0.50

Done: Pip Cet <pipcet <at> protonmail.com>

Bug is archived. No further changes may be made.


From: Pip Cet <pipcet <at> protonmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: execvy <at> gmail.com, 76517 <at> debbugs.gnu.org
Subject: bug#76517: 31.0.50; feature/igc 6ff509af3d31 crash on Wayland KDE, (with -g3
Date: Mon, 24 Feb 2025 16:17:24 +0000

"Eli Zaretskii" <eliz <at> gnu.org> writes:

>> Cc: 76517 <at> debbugs.gnu.org
>> Date: Mon, 24 Feb 2025 15:49:38 +0000
>> From:  Pip Cet via "Bug reports for GNU Emacs,
>>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>>
>> "Eval Exec" <execvy <at> gmail.com> writes:
>>
>> > Hello,
>> > I'm helping to test feature/igc branch
>>
>> Thanks for the report!
>>
>> At first glance, the problem doesn't seem to be specific to feature/igc.
>>
>> > #16 0x00000000004933d8 in c_string_width (nbytes=<synthetic pointer>,
>> > nchars=<synthetic pointer>, precision=<optimized out>, len=69,
>> >     str=0x7f1ec16860fa "\274\214我中间使用 充电宝给电脑冲了一次电。还行。。。") at
>>
>> This string starts with an incomplete character sequence.
>
> The evaluation of the header-line-format shows the complete string.

The relevant part in the screenshot is "%，我中间使用".

>> As the screenshot at https://imgur.com/a/tON6P7w (why a screenshot?)
>> shows, the last character before that is "%", followed by what looks
>> like "，", a fullwidth comma.
>>
>> It seems the "%" was interpreted as introducing a mode line escape,
>> which used the first byte of the three-byte encoding used for the
>> fullwidth comma.  The remaining bytes were then interpreted as the
>> beginning of a multi-byte character, which ended up out of range and
>> accessing an element of the display_table_ chartab which wasn't defined.
>>
>> So I guess our mode line escapes need to be fixed for multibyte
>> characters, and hopefully no further action is necessary (you might also
>> want to consider not making mode line escapes part of your header
>> lines).
>
> I don't see any "%", but are you saying that some UTF-8 byte sequence

Look at the screenshot.

> of a non-ASCII character that is not the character '%' itself could
> have the '%' byte as part of it?  I thought that was impossible,

No.  I'm saying that display_mode_element scans for a '%', finds it,
takes the next byte, which is the first byte of the fullwidth comma,
passes it to decode_mode_spec, then leaves offset pointing to the second
byte of the multi-byte sequence following the %, and attempts to
continue printing the modeline from that offset, in the middle of a
multi-byte sequence.
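
To make the offset problem concrete, here is a minimal standalone sketch of
the scan. It is not the real display_mode_element; the helper names and the
sample array are hypothetical, and the only behaviour modelled is "find '%',
consume exactly one byte as the spec, resume from there". The sample is '%'
followed by the UTF-8 encoding of U+FF0C (EF BC 8C), which matches the
"\274\214" bytes at the start of the string in the backtrace.

```c
#include <stddef.h>

/* Hypothetical helper: nonzero if B is a UTF-8 continuation byte
   (bit pattern 10xxxxxx).  */
static int
sketch_is_continuation (unsigned char b)
{
  return (b & 0xC0) == 0x80;
}

/* Simplified model of the unpatched scan over a NUL-terminated string:
   find the first '%', consume the single byte after it as the spec,
   and return the offset at which printing would resume.  */
static size_t
sketch_resume_offset_buggy (const unsigned char *s)
{
  size_t offset = 0;

  while (s[offset] != '\0' && s[offset] != '%')
    offset++;
  if (s[offset] == '\0' || s[offset + 1] == '\0')
    return offset;		/* no '%', or nothing after it */

  offset++;			/* skip the '%' */
  offset++;			/* consume one byte as the spec */
  return offset;
}

/* '%' + U+FF0C FULLWIDTH COMMA (EF BC 8C) + ASCII 'x'.  */
static const unsigned char sketch_sample[] =
  { '%', 0xEF, 0xBC, 0x8C, 'x', 0 };
```

With this input the returned offset is 2, and the byte there is 0xBC, a
continuation byte, so decoding resumes in the middle of a character.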

The multi-byte sequence decodes to an out-of-range character (in my
case, c = 0xc427df80), and char_table_ref makes no attempt to verify the
character is in range; CHARTAB_IDX doesn't either, so this code:

#define CHARTAB_IDX(c, depth, min_char)		\
  (((c) - (min_char)) >> chartab_bits[(depth)])

    {
      val = tbl->contents[CHARTAB_IDX (c, 0, 0)];
      if (SUB_CHAR_TABLE_P (val))
	val = sub_char_table_ref (val, c, UNIPROP_TABLE_P (table));
    }

just accesses random memory that isn't anywhere near the char table's
actual contents.
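
A rough illustration of how far out of bounds that index is, as a
standalone sketch rather than Emacs code. The constants are assumptions
about the char-table layout (a largest character of 0x3FFFFF, a top-level
shift of 16 bits, 64 top-level slots), and every name below is
hypothetical. The value is treated as unsigned here to keep the shift well
defined; in Emacs the character is a signed int, so the real index would be
negative instead, which is just as far out of range.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout constants; not taken from the message above.  */
enum
{
  SKETCH_MAX_CHAR = 0x3FFFFF,	/* assumed largest valid character */
  SKETCH_TOP_BITS = 16,		/* assumed chartab_bits[0] */
  SKETCH_TOP_SIZE = 64		/* assumed number of top-level slots */
};

/* The kind of check that neither char_table_ref nor CHARTAB_IDX
   performs, according to the analysis above.  */
static bool
sketch_char_in_range (uint32_t c)
{
  return c <= SKETCH_MAX_CHAR;
}

/* Top-level index computed the way CHARTAB_IDX (c, 0, 0) does:
   (c - 0) >> bits.  */
static uint32_t
sketch_top_index (uint32_t c)
{
  return c >> SKETCH_TOP_BITS;
}
```

Under these assumptions a valid character never produces a top-level index
above 63, while 0xc427df80 produces 0xc427.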

> guaranteed by the way UTF-8 sequences are produced.  AFAIK, ASCII
> bytes can only happen as themselves in UTF-8 encoding.  So when we see
> '%', it cannot be anything but the ASCII character '%'.

It's the next character that matters, the fullwidth comma after the '%'.
Something like this should help:

diff --git a/src/xdisp.c b/src/xdisp.c
index 577d5b1b401..4ee47eea818 100644
--- a/src/xdisp.c
+++ b/src/xdisp.c
@@ -27933,6 +27933,12 @@ display_mode_element (struct it *it, int depth, int field_width, int precision,
 		while ((c = SREF (elt, offset++)) >= '0' && c <= '9')
 		  field = field * 10 + c - '0';
 
+		if (c > 127)
+		  {
+		    offset--;
+		    continue;
+		  }
+
 		/* Don't pad beyond the total padding allowed.  */
 		if (field_width - n > 0 && field > field_width - n)
 		  field = field_width - n;
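
The effect of the added check can be modelled outside Emacs with a small
sketch. This is a simplification with hypothetical names, not the patched
function: it only tracks where scanning resumes after a '%', including the
digit-skipping loop and the "c > 127, so step back" guard from the diff.

```c
#include <stddef.h>

/* Simplified model of the patched scan over a NUL-terminated string.
   Returns the offset at which processing continues after the first '%'.  */
static size_t
sketch_resume_offset_guarded (const unsigned char *s)
{
  size_t offset = 0;
  unsigned char c;

  while (s[offset] != '\0' && s[offset] != '%')
    offset++;
  if (s[offset] == '\0')
    return offset;		/* no '%' at all */

  offset++;			/* skip the '%' */

  /* Skip field-width digits; C ends up as the first non-digit byte.  */
  while ((c = s[offset++]) >= '0' && c <= '9')
    ;

  /* The guard from the diff: a non-ASCII byte is not a spec, so
     un-read it and let it be handled as ordinary text.  The NUL case
     is extra care for this sketch only.  */
  if (c > 127 || c == '\0')
    offset--;

  return offset;
}

/* '%' + U+FF0C FULLWIDTH COMMA (EF BC 8C) + ASCII 'x'.  */
static const unsigned char sketch_guard_sample[] =
  { '%', 0xEF, 0xBC, 0x8C, 'x', 0 };

/* An ordinary spec with a field width: "%12b!".  */
static const unsigned char sketch_ascii_sample[] =
  { '%', '1', '2', 'b', '!', 0 };
```

For the fullwidth-comma sample the resume offset is 1, the lead byte 0xEF,
so the whole character is decoded intact; for the ASCII sample the spec
byte is consumed as before.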

Pip





This bug report was last modified 131 days ago.
