Package: emacs
Reported by: Óscar Fuentes <oscarfv <at> eclipso.eu>
Date: Wed, 30 Jul 2025 20:20:02 UTC
Severity: normal
Found in version 31.0.50
From: Pip Cet <pipcet <at> protonmail.com>
To: Óscar Fuentes <oscarfv <at> eclipso.eu>
Cc: gerd.moellmann <at> gmail.com, Eli Zaretskii <eliz <at> gnu.org>, 79131 <at> debbugs.gnu.org, Yuan Fu <casouri <at> gmail.com>
Subject: bug#79131: 31.0.50; igc: nested signal, SIGSEGV
Date: Sun, 03 Aug 2025 14:47:29 +0000

Pip Cet <pipcet <at> protonmail.com> writes:

> Óscar Fuentes <oscarfv <at> eclipso.eu> writes:
>
>> Eli, Gerd, Pip:
>>
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>>
>>>> #12 add_text_properties_1 (start=<optimized out>, start@entry=0x1f06a, end=<optimized out>,
>>>>     end@entry=0x1f07a, properties=0x7f4fe3c2acc3, object=0x7f4fe645cfbd,
>>>>     object@entry=0x0, set_type=set_type@entry=TEXT_PROPERTY_REPLACE, destructive=destructive@entry=true)
>>>> --Type <RET> for more, q to quit, c to continue without paging--c
>>>>     at ../../emacs/src/textprop.c:1252
>>>>         i = 0x0
>>>>         unchanged = <optimized out>
>>>>         s = 31770
>>>>         len = 3
>>>>         modified = <optimized out>
>>>>         first_time = <optimized out>
>>>
>>> Since this in code that is the result of your local merge, please be
>>> sure to show the source lines corresponding to the call-stack frames
>>> where the signal was raised. Otherwise, we are left guessing what is
>>> line 1252 in your version of textprop.c that could trigger SIGSEGV.
>>> My guess is that it's here:
>>>
>>>
>>>   /* We are at the beginning of interval I, with LEN chars to scan. */
>>>   for (;;)
>>>     {
>>>       eassert (i != 0);
>>>
>>>       if (LENGTH (i) >= len) <<<<<<<<<<<<<<<<
>>>
>>> but I shouldn't be guessing. If my guess is correct, this is some
>>> snafu with intervals in the buffer that happens to be the current one.
>>
>> textprop.c was not touched by the merge, is the same as master.
>>
>>> This tels me that the crash happened insider prepare_menu_bars, which
>>> called pre-redisplay-function. What is your value of
>>> pre-redisplay-functions (note: "functions", plural)?
>>
>> pre-redisplay-functions is a variable defined in ‘simple.el’.
>>
>> Its value is (redisplay--update-region-highlight)
>>
>> However, this is in my new session. The crashed one was running for
>> several days, and it is for sure that it had more features loaded that
>> the current one.
>>
>>> The backtrace
>>> indicates that treesit--pre-redisplay is involved; is that true?
>>
>> I was editing a file with a treesit-based major mode, that's all I can
>> say, as the Elisp backtrace is not available.
>>
>> (gdb) xbacktrace
>> You can't do that without a process to debug.
>>
>> Gerd Möllmann <gerd.moellmann <at> gmail.com> writes:
>>
>>> That would be around here
>>>
>>> textprop.c:
>>> 1251   /* We are at the beginning of interval I, with LEN chars to scan. */
>>> 1252   for (;;)
>>> 1253     {
>>> 1254       eassert (i != 0);
>>> 1255
>>> 1256       if (LENGTH (i) >= len)
>>> 1257         {
>>>
>>> and that probably means i is NULL, which is a pointer to an interval. It
>>> is accessed in LENGTH. Which in would mean that the interval tree is
>>> kaput. Can you reproduce that?
>>
>> No idea how to reproduce it, no.
>>
>>
>> Gerd Möllmann <gerd.moellmann <at> gmail.com> writes:
>>
>>> Gerd Möllmann <gerd.moellmann <at> gmail.com> writes:
>>>
>>>> I'm in the process of merging master, BTW.
>>>
>>> Done.
>>
>> Thanks!
>>
>>
>> Pip Cet <pipcet <at> protonmail.com> writes:
>>
>>> It does look like the interval tree was in an inconsistent state.
>>>
>>> Please run
>>>
>>> p *current_buffer->text
>>
>>
>> (gdb) fr 13
>> #13 0x000055e77414774b in Fadd_text_properties (start=make_fixnum(31770), end=make_fixnum(31774),
>>     properties=<optimized out>, object=XIL(0)) at ../../emacs/src/textprop.c:1308
>> 1308      return add_text_properties_1 (start, end, properties, object,
>> (gdb) p *current_buffer->text
>> $1 = {
>>   beg = 0x55e77e157f80 "",
>>   gpt = 1,
>>   z = 31775,
>
> I think Z = 31775 should mean that the interval tree covers 31775
> characters.
>
>> (gdb) p $i = current_buffer->text->intervals
>> $2 = (INTERVAL) 0x7f4fe5280a28
>> (gdb) p *$i
>> $3 = {
>>   gc_header = {
>>     v = 34955678229,
>>     gcaligned = 21 '\025'
>>   },
>>   total_length = 31770,
>
> But the interval tree only covers 31770 characters.
>
>> (gdb) p *$i
>> $28 = {
>>   gc_header = {
>>     v = 35073135893,
>>     gcaligned = 21 '\025'
>>   },
>>   total_length = 1,
>>   position = 31770,
>>   left = 0x0,
>>   right = 0x0,
>
> Assuming that this interval's "position" cache is correct (I think it
> should be), the code that crashed would try to move on to the next
> interval, which doesn't exist, fall off the end of the world and crash.
>
> But I don't know the interval code that well; is it possible that's a
> valid interval tree if the last few characters don't have properties?
>
> I looked around a little, but I don't really see any MPS-specific code
> which might violate this invariant.
>
> Can you have a look at what current_buffer actually contains? The
> contents should start at current_buffer->text.beg + 1153, after the gap,
> an we're particularly interested in the last few bytes and the bytes
> around PT.
>
> So can you try
>
> p current_buffer->pt
> p *XSTRING(current_buffer->name_)
> p current_buffer->beg + 1153
> p current_buffer->beg + 1153 + 31770 - 64
>
> Thanks!

Also, is it possible you used quit (C-g) shortly before the crash
happened?

My current theory is that the code in adjust_intervals_for_insertion
calls Fmemq and Fassq, but should call memq_no_quit and assq_no_quit.
Fmemq and Fassq sometimes (rarely) signal a quit, which aborts the
execution of the current function and might leave the interval tree in
an inconsistent state.

The function is entered with Vinhibit_quit = Qnil, so if the quit flag
is set, we might quit during the Fmemq or Fassq calls, and then fail to
adjust the length of the intervals, resulting in a broken interval tree.

This would be a bug in both the master and feature/igc branches; my
guess is that a tiny race condition which never happened on the master
branch became a lot more likely under MPS, with its comparatively long
pause times.

I've so far been unable to reproduce this, but it seems to make sense to
me that no quitting should be allowed between the point at which we
modify Z and the point at which the interval tree has been updated to
reflect this change.

Pip
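
The suspected failure mode can be illustrated with a small toy model in C. This is only a sketch, not Emacs code: `toy_buffer`, `toy_insert`, `toy_try_insert` and the two lookup helpers are hypothetical names invented for the illustration, and the non-local exit is modelled with `setjmp`/`longjmp` rather than Emacs's actual quit machinery. It assumes, as the theory above does, that the text length is updated first and the interval bookkeeping afterwards, with a lookup in between that may or may not be allowed to quit.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not an Emacs internal.  */
struct toy_buffer
{
  ptrdiff_t chars;    /* characters in the buffer text */
  ptrdiff_t covered;  /* characters covered by the interval bookkeeping */
};

static jmp_buf toy_quit_target;  /* where a simulated quit lands */
static bool toy_quit_flag;       /* true when a quit is pending */

/* The invariant under discussion: the bookkeeping covers the whole text.  */
bool
toy_consistent (const struct toy_buffer *b)
{
  return b->chars == b->covered;
}

/* Stand-in for a lookup that may signal a quit.  */
void
toy_lookup_may_quit (void)
{
  if (toy_quit_flag)
    {
      toy_quit_flag = false;
      longjmp (toy_quit_target, 1);  /* abandon the caller mid-update */
    }
}

/* Stand-in for a lookup that never quits.  */
void
toy_lookup_no_quit (void)
{
}

/* Insert N characters: update the text length, do a lookup, then
   update the bookkeeping.  A quit during the lookup skips the last step.  */
void
toy_insert (struct toy_buffer *b, ptrdiff_t n, void (*lookup) (void))
{
  assert (n >= 0);
  b->chars += n;
  lookup ();
  b->covered += n;
}

/* Run toy_insert with an optional pending quit.  Return true if the
   insertion ran to completion, false if it was aborted by a quit.  */
bool
toy_try_insert (struct toy_buffer *b, ptrdiff_t n, bool quit_pending,
                void (*lookup) (void))
{
  toy_quit_flag = quit_pending;
  if (setjmp (toy_quit_target) != 0)
    return false;
  toy_insert (b, n, lookup);
  return true;
}
```

Calling `toy_try_insert` with a pending quit and `toy_lookup_may_quit` leaves `chars` ahead of `covered`, so `toy_consistent` fails afterwards; the same call with `toy_lookup_no_quit` completes and keeps the two fields equal. Whether the real code path behaves this way is exactly what remains to be confirmed.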