GNU bug report logs - #76705
31.0.50; igc: crash

Package: emacs;

Reported by: Óscar Fuentes <oscarfv <at> eclipso.eu>

Date: Mon, 3 Mar 2025 04:33:04 UTC

Severity: normal

Found in version 31.0.50

Message #20 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> protonmail.com>
To: Óscar Fuentes <oscarfv <at> eclipso.eu>
Cc: Óscar Fuentes <bug-gnu-emacs <at> gnu.org>,
 76705 <at> debbugs.gnu.org
Subject: Re: bug#76705: 31.0.50; igc: crash
Date: Mon, 03 Mar 2025 15:36:09 +0000

Óscar Fuentes <oscarfv <at> eclipso.eu> writes:

> Pip Cet <pipcet <at> protonmail.com> writes:
>
>> Óscar Fuentes via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> writes:
>>
>>> Emacs just crashed on a session started more than a week ago, IIRC.
>>>
>>> The following backtrace is from the core dump. Sorry for not being more
>>> helpful.
>>
>> Can you try generating a full backtrace ("bt full" should work on the
>> core dump, too)?

Thanks!  Two new leads :-)

> #0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo <at> entry=6, no_tid=no_tid <at> entry=0)
>     at ./nptl/pthread_kill.c:44
>         tid = <optimized out>
>         ret = 0
>         pd = <optimized out>
>         old_mask = {__val = {0}}
>         ret = <optimized out>
> #1  0x00007f487751de2f in __pthread_kill_internal (threadid=<optimized out>, signo=6)
>     at ./nptl/pthread_kill.c:78
> #2  0x00007f48774c9d02 in __GI_raise (sig=sig <at> entry=6) at ../sysdeps/posix/raise.c:26
>         ret = <optimized out>
> #3  0x0000556e0ead9d68 in terminate_due_to_signal
>     (sig=sig <at> entry=6, backtrace_limit=backtrace_limit <at> entry=2147483647) at ../../emacs/src/emacs.c:463
> #4  0x0000556e0ed16d73 in set_state (state=IGC_STATE_DEAD) at ../../emacs/src/igc.c:1023
>         old_state = <optimized out>
>         old_state = <optimized out>
> #5  set_state (state=IGC_STATE_DEAD) at ../../emacs/src/igc.c:1002
>         old_state = <optimized out>
> #6  igc_assert_fail (file=<optimized out>, line=<optimized out>, msg=<optimized out>)
>     at ../../emacs/src/igc.c:306

It's a pity we don't have this message...

> #7  0x0000556e0edd0c10 in shieldFlushEntries ()

So it seems that this function called ProtSet (see the code below), and
ProtSet tail-called mps_lib_assert_fail.  In my build, it only does that
when mprotect fails, which is something that happened in another bug
report.

I assume this is a Linux kernel?  My assumption is that errno is ENOMEM
(you should be able to figure this out from the core dump, but I don't
know how glibc hides errno these days):

	ENOMEM
		Changing the protection  of a memory region  would result in
		the total number of mappings with distinct attributes (e.g.,
		read  versus read/write  protection)  exceeding the  allowed
		maximum.   (For example,  making the  protection of  a range
		PROT_READ in the  middle of a region  currently protected as
		PROT_READ|PROT_WRITE  would result  in  three mappings:  two
		read/write mappings at  each end and a  read-only mapping in
		the middle.)

That sounds like something that might happen.  I assume there's a sysctl
or ulimit controlling this limit.
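
To make the man page example concrete, here is a minimal sketch (Linux only, and not taken from Emacs or MPS) that maps three pages, makes the middle one read-only, and counts how many /proc/self/maps entries cover the region before and after. The helper names mappings_overlapping and split_demo are made up for this illustration, and PROT_READ is hard-coded to its Linux value.

```python
import ctypes
import mmap

PAGE = mmap.PAGESIZE
PROT_READ = 0x1  # assumption: the Linux value of PROT_READ


def mappings_overlapping(start, end):
    """Count /proc/self/maps entries that overlap [start, end)."""
    count = 0
    with open("/proc/self/maps") as maps:
        for line in maps:
            lo, hi = (int(x, 16) for x in line.split()[0].split("-"))
            if lo < end and hi > start:
                count += 1
    return count


def split_demo():
    """Return (mappings before, mappings after) for a 3-page region."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
    libc.mprotect.restype = ctypes.c_int

    # One private anonymous read/write mapping, three pages long.
    region = mmap.mmap(-1, 3 * PAGE,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE)
    anchor = ctypes.c_char.from_buffer(region)
    start = ctypes.addressof(anchor)
    before = mappings_overlapping(start, start + 3 * PAGE)

    # Make only the middle page read-only; the kernel has to split the mapping.
    if libc.mprotect(start + PAGE, PAGE, PROT_READ) != 0:
        raise OSError(ctypes.get_errno(), "mprotect failed")
    after = mappings_overlapping(start, start + 3 * PAGE)

    del anchor  # release the exported buffer so the mmap can be closed
    region.close()
    return before, after


if __name__ == "__main__":
    print(split_demo())
```

Each such split counts against the per-process mapping limit, so a collector that protects many small, scattered ranges can use up mappings much faster than its memory footprint alone would suggest.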

Can you run objdump -h on the core file and report just the number of
"load" sections?  (Over here, I see about 3000 sections, which is a
lot.)  What is the output of

cat /proc/sys/vm/max_map_count

?  It is 65530 here, and that's not a lot more than 3000.
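
A rough sketch of the comparison asked for above: count the "load" rows in saved objdump -h output and set that against the kernel limit. The function names are hypothetical, and SAMPLE is a made-up excerpt in the usual shape of objdump -h output, not data from this core dump.

```python
import re

# Rows of `objdump -h` output whose section name starts with "load",
# e.g. "load1" or "load12a".  Flag lines ("CONTENTS, ALLOC, LOAD, ...")
# do not start with an index number, so they are not counted.
LOAD_ROW = re.compile(r"^\s*\d+\s+load\S*\s", re.MULTILINE)


def count_load_sections(objdump_text):
    """Return the number of "load" sections listed in objdump -h output."""
    return len(LOAD_ROW.findall(objdump_text))


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Return the per-process mapping limit reported by the kernel."""
    with open(path) as f:
        return int(f.read().strip())


# Made-up excerpt for illustration only.
SAMPLE = """\
core:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 note0         00001f48  0000000000000000  0000000000000000  00000548  2**0
                  CONTENTS, READONLY
  1 load1         00001000  0000556e0ea00000  0000000000000000  00003000  2**12
                  CONTENTS, ALLOC, LOAD, READONLY
  2 load2         00002000  0000556e0ea01000  0000000000000000  00004000  2**12
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
"""

if __name__ == "__main__":
    print("load sections in sample:", count_load_sections(SAMPLE))
    print("max_map_count:", read_max_map_count())
```

For a real core dump, the text passed to count_load_sections would be the saved output of objdump -h on that file.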

There are recommendations on the internet to increase this limit, so
I'll try reducing it and seeing whether it causes my Emacs to crash...

If that is indeed the problem, we should probably increase the arena
grain size, which means more than one of the tiny 4 KB pages x86 has
will be mprotected at a time, reducing the number of maps...
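
A back-of-the-envelope sketch of why a larger grain would help, assuming the worst case where every grain ends up with a protection different from its neighbours and ignoring the mappings the rest of the process needs. The 64 KiB grain is an arbitrary value chosen for illustration, not a proposed setting, and the function names are made up.

```python
KIB = 1024
MIB = 1024 * KIB


def worst_case_mappings(protected_bytes, grain_bytes):
    """Upper bound on mappings if adjacent grains never share a protection."""
    return -(-protected_bytes // grain_bytes)  # ceiling division


def bytes_until_limit(max_map_count, grain_bytes):
    """Memory that can alternate protection grain by grain before the limit."""
    return max_map_count * grain_bytes


if __name__ == "__main__":
    for grain in (4 * KIB, 64 * KIB):
        print(grain // KIB, "KiB grain:",
              worst_case_mappings(256 * MIB, grain), "mappings for 256 MiB;",
              bytes_until_limit(65530, grain) // MIB, "MiB until 65530 mappings")
```

Under these assumptions, 256 MiB at a 4 KiB grain can already need 65536 mappings, slightly above the 65530 default, while the same memory at a 64 KiB grain needs at most 4096.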

> #38 Fjson_parse_buffer (nargs=<optimized out>, args=<optimized out>) at ../../emacs/src/json.c:1812
>         count = {bytes = <optimized out>}
>         conf = {object_type = <optimized out>, array_type = <optimized out>, null_object = <optimized out>, false_object = <optimized out>}
>         p = {input_current = 0x556e3342596e "}],\"detail\":\"struct\",\"kind\":23,\"name\":\"ra_struct_0xAB71CCE7u<true>\",\"range\":{\"end\":{\"character\":62,\"line\":9091},\"start\":{\"character\":0,\"line\":9091}},\"selectionRange\":{\"end\":{\"character\":10,\"line\":9091"..., input_begin = 0x556e32ee1f60 "{\"id\":17456,\"jsonrpc\":\"2.0\",\"result\":[{\"children\":[{\"detail\":\"lp0::PluginInfoAdder\",\"kind\":13,\"name\":\"ppa8\",\"range\":{\"end\":{\"character\":33,\"line\":7},\"start\":{\"character\":0,\"line\":7}},\"selectionRange\":"..., input_end = 0x556e33733f5e "", secondary_input_begin = 0x0, secondary_input_end = 0x0, current_line = 1, current_column = 5519886, point_of_current_line = 0, available_depth = 9993, conf = {object_type = json_object_hashtable, array_type = json_array_array, null_object = 0x0, false_object = 0x0}, additional_bytes_count = 0, internal_object_workspace = {0x7f4767c48874, 0x110c2, 0x7f4767c4889c, 0x7f4767c488c4, 0x7f4767c4972d, 0x7f4767c4c63d, 0x7f4767c4cd4d, 0x7f4767c4f9c5, 0x7f4767c53005, 0x7f4767c53ef5, 0x7f4767c54df5, 0x7f4767c55ce5, 0x7f4767c58bf5, 0x7f4767c59ae5, 0x7f4767c5c76d, 0x7f4767c5dd9d, 0x7f4767c60c8d, 0x7f4767c61b7d, 0x7f4767c62a6d, 0x7f4767c6395d, 0x7f4767c6689d, 0x7f4767c6778d, 0x7f4767c6868d, 0x7f4767c6957d, 0x7f4767c6c48d, 0x7f4767c6d37d, 0x7f4767c6e26d, 0x7f4767c6f15d, 0x7f4767c7009d, 0x7f4767c70f8d, 0x7f4767c71e7d, 0x7f4767c74d7d, 0x7f4767c75c6d, 0x7f4767c76b7d, 0x7f4767c77a6d, 0x7f4767c7a9ad, 0x7f4767c7b89d, 0x7f4767c7c78d, 0x7f4767c7d67d, 0x7f4767c80575, 0x7f4767c81465, 0x7f4767c8237d, 0x7f4767c8326d, 0x7f4767c8615d, 0x7f4767c8704d, 0x7f4767c87f3d, 0x7f4767c88e5d, 0x7f4767c89d4d, 0x7f4767c8cc8d, 0x7f4767c8cd64, 0x7f4767c8d4dd, 0x7f4767c8d4f4, 0x7f4767c8d51c, 0x7f4767c8d544, 0x56, 0x7f4767c8d56c, 0x7f4767c8d594, 0x7f4767c8d5bc, 0x7f4767c8d805, 0x7f4767c8d8c4, 0x7f4767c8d93d, 0x7f4767c8d9fc, 0x2, 0x21e}, object_workspace = 0x556e374c43e0, object_workspace_size = 4096, object_workspace_current = 4023, internal_byte_workspace = "9091acterRange(lxw_chart_options::x_scale) *, const lxw_chart_options *))*))s *)bj *)mats))ement *)mat *)...).).)\321\025;\375\177\000\000\000\000\000\000\000\000\000\000\020\370\302\016nU\000\000\b\000\000\000\000\000\000\000\020\321\025;\375\177\000\000\t\000\000\000\000\000\000\000\310\321\025;\375\177\000\000\000\000\000\000\000\000\000\000\020\370\302\016nU\000\000\b\000\000\000\000\000\000\000p\321\025;\375\177\000\000"..., byte_workspace = 0x7ffd3b15d050 "9091acterRange(lxw_chart_options::x_scale) *, const lxw_chart_options *))*))s *)bj *)mats))ement *)mat *)...).).)\321\025;\375\177", byte_workspace_end = 0x7ffd3b15d250 "P\320\025;\375\177", byte_workspace_current = 0x7ffd3b15d054 "acterRange(lxw_chart_options::x_scale) *, const lxw_chart_options *))*))s *)bj *)mats))ement *)mat *)...).).)\321\025;\375\177"}

I see that the object_workspace is large, which means several
rounds of allocation most likely happened.  If this happens a lot, I'm
not sure whether it creates more mappings that count against
max_map_count.

>>> #7  0x0000556e0edd0c10 in shieldFlushEntries ()
>>
>> shieldFlushEntries contains several asserts.  Can you disassemble the
>> function shieldFlushEntries to see which one we hit?
>> (gdb: "disass/s shieldFlushEntries").
>>
>
> (gdb) disass/s shieldFlushEntries
> Dump of assembler code for function shieldFlushEntries:
>    0x0000556e0edd0be1 <+97>:    xor    %r14d,%r14d
>    0x0000556e0edd0be4 <+100>:   xor    %r13d,%r13d
>    0x0000556e0edd0be7 <+103>:   jmp    0x556e0edd0c34 <shieldFlushEntries+180>
>    0x0000556e0edd0be9 <+105>:   nopl   0x0(%rax)
>    0x0000556e0edd0bf0 <+112>:   test   %r13,%r13
>    0x0000556e0edd0bf3 <+115>:   je     0x556e0edd0c91 <shieldFlushEntries+273>
>    0x0000556e0edd0bf9 <+121>:   cmp    %r14,%r13
>    0x0000556e0edd0bfc <+124>:   jae    0x556e0edd0d00 <shieldFlushEntries+384>
>    0x0000556e0edd0c02 <+130>:   mov    %r12d,%edx
>    0x0000556e0edd0c05 <+133>:   mov    %r14,%rsi
>    0x0000556e0edd0c08 <+136>:   mov    %r13,%rdi
>    0x0000556e0edd0c0b <+139>:   call   0x556e0edcf8b0 <ProtSet>
> => 0x0000556e0edd0c10 <+144>:   movzwl 0x30(%r15),%r12d

Thanks! This means that it was actually ProtSet which aborted, and since
it doesn't show up in the backtrace it must have tail-called the
assertion function.  Can you disassemble ProtSet just to make sure that
we're looking at an mprotect failure here?

Pip





This bug report was last modified 162 days ago.

