Package: emacs
Reported by: ali_gnu2 <at> emvision.com
Date: Tue, 12 Mar 2024 20:38:02 UTC
Severity: normal
Done: Po Lu <luangruo <at> yahoo.com>
Bug is archived. No further changes may be made.
Message #8 received at 69762 <at> debbugs.gnu.org:

From: Po Lu <luangruo <at> yahoo.com>
To: ali_gnu2 <at> emvision.com
Cc: 69762 <at> debbugs.gnu.org
Subject: Re: bug#69762: X11 versions of Emacs 29 on sparc fail at startup
Date: Wed, 13 Mar 2024 08:34:34 +0800

ali_gnu2 <at> emvision.com writes:

> Hello,
>
> I maintain the GNU emacs delivered with the
> Solaris OS:
>
>     https://github.com/oracle/solaris-userland/tree/master/components/emacs
>
> We're still at version 28.2. It's getting long in the
> tooth, so I recently tried to move to 29.2. There were no
> issues on x86, but on sparc, I see display problems with
> both the gtk (emacs-gtk) and lucid (emacs-x) versions that
> prevent it from running.
>
> I double checked version 28.2, which we are currently
> delivering, and it has no problem. Then, I built 29.1,
> and it shows the same issues as 29.2. It seems that
> something was introduced early in the 29 series.
>
> With the GTK version, I see 3 different failures on repeated
> attempts:
>
> 1)
>
>     % emacs-gtk
>     Connection lost to X server 'localhost:10.0'
>     When compiled with GTK, Emacs cannot recover from X disconnects.
>     This is a GTK bug: https://gitlab.gnome.org/GNOME/gtk/issues/221
>     For details, see etc/PROBLEMS.
>     Fatal error 6: Aborted
>     Backtrace:
>     /usr/bin/emacs-gtk'emacs_backtrace+0x50 [0x1ffe749825f2ac]
>     /usr/bin/emacs-gtk'terminate_due_to_signal+0xb4 [0x1ffe749822f804]
>     /usr/bin/emacs-gtk'emacs_abort+0x8 [0x1ffe74982609c4]
>     /usr/bin/emacs-gtk'x_connection_closed+0x3cc [0x1ffe74981efd54]
>     /usr/bin/emacs-gtk'x_io_error_quitter+0x3c [0x1ffe74981f00f0]
>     /usr/lib/sparcv9/libX11.so.4.0.0'_XIOError+0x74 [0x1ffe74904f8034]
>     /usr/lib/sparcv9/libX11.so.4.0.0'_XReply+0x360 [0x1ffe74904f3bf0]
>     /usr/lib/sparcv9/libXi.so.5.0.0'XIGetSelectedEvents+0x84 [0x1ffe748ed17044]
>     /usr/bin/emacs-gtk'Fx_create_frame+0x17b4 [0x1ffe74982172fc]
>     /usr/bin/emacs-gtk'funcall_subr+0x120 [0x1ffe74982e6f58]
>     /usr/bin/emacs-gtk'exec_byte_code+0x52c [0x1ffe74983470e0]
>     /usr/bin/emacs-gtk'Ffuncall+0x108 [0x1ffe74982e37d4]
>     /usr/bin/emacs-gtk'Fapply+0x378 [0x1ffe74982e3c98]
>     /usr/bin/emacs-gtk'funcall_subr+0xa8 [0x1ffe74982e6ee0]
>     /usr/bin/emacs-gtk'exec_byte_code+0x52c [0x1ffe74983470e0]
>     /usr/bin/emacs-gtk'apply_lambda+0xd4 [0x1ffe74982ea734]
>     /usr/bin/emacs-gtk'eval_sub+0x630 [0x1ffe74982e88c0]
>     /usr/bin/emacs-gtk'Feval+0x60 [0x1ffe74982ebb2c]
>     /usr/bin/emacs-gtk'internal_condition_case+0x78 [0x1ffe74982e1bbc]
>     /usr/bin/emacs-gtk'top_level_1+0x40 [0x1ffe7498233bd0]
>     /usr/bin/emacs-gtk'internal_catch+0x40 [0x1ffe74982e1ae8]
>     /usr/bin/emacs-gtk'command_loop+0xd8 [0x1ffe749823241c]
>     /usr/bin/emacs-gtk'recursive_edit_1+0xb8 [0x1ffe7498237d4c]
>     /usr/bin/emacs-gtk'Frecursive_edit+0x124 [0x1ffe7498238274]
>     /usr/bin/emacs-gtk'main+0x1f28 [0x1ffe7498231770]
>     /usr/bin/emacs-gtk'_start+0x64 [0x1ffe74980c5c04]
>     Abort (core dumped)
>
> Note that I don't believe that X disconnects are involved.
> This happened immediately at startup.
>
> 2)
>
>     % emacs-gtk
>     [xcb] Unknown sequence number while processing queue
>     [xcb] You called XInitThreads, this is not your fault
>     [xcb] Aborting, sorry about that.
>     Assertion failed: !xcb_xlib_threads_sequence_lost, file /builds/ulhg/mrcarson-trunk_166/components/x11/lib/libX11/libX11-1.8.7/src/xcb_io.c, line 278, function poll_for_event
>     Fatal error 6: Aborted
>     Backtrace:
>     /usr/bin/emacs-gtk'emacs_backtrace+0x50 [0x1fffec39a5f2ac]
>     /usr/bin/emacs-gtk'terminate_due_to_signal+0xb4 [0x1fffec39a2f804]
>     /usr/bin/emacs-gtk'handle_fatal_signal+0x8 [0x1fffec39a609b0]
>     /usr/bin/emacs-gtk'deliver_fatal_thread_signal+0x98 [0x1fffec39a5d4b8]
>     /lib/sparcv9/libc.so.1'__sighndlr+0xc [0x1fffec392c6410]
>     /lib/sparcv9/libc.so.1'call_user_handler+0x400 [0x1fffec392b8cb8]
>     /lib/sparcv9/libc.so.1'sigacthandler+0xd0 [0x1fffec392b90a8]
>     /lib/sparcv9/libc.so.1'__lwp_sigqueue+0x8 [0x1fffec392cb528]
>     /lib/sparcv9/libc.so.1'abort+0xb4 [0x1fffec391e5154]
>     /lib/sparcv9/libc.so.1'_assert_c99+0x64 [0x1fffec391e5ffc]
>     /usr/lib/sparcv9/libX11.so.4.0.0'poll_for_event+0x1fc [0x1fffec31cf2a8c]
>     /usr/lib/sparcv9/libX11.so.4.0.0'poll_for_response+0x2c [0x1fffec31cf2b2c]
>     /usr/lib/sparcv9/libX11.so.4.0.0'_XEventsQueued+0x7c [0x1fffec31cf2e8c]
>     /usr/lib/sparcv9/libX11.so.4.0.0'XFlush+0x1c [0x1fffec31cba45c]
>     /usr/bin/emacs-gtk'x_set_icon_type+0x70 [0x1fffec39a0c640]
>     /usr/bin/emacs-gtk'gui_set_frame_parameters_1+0x1800 [0x1fffec398e0028]
>     /usr/bin/emacs-gtk'gui_default_parameter+0x58 [0x1fffec398e55e8]
>     /usr/bin/emacs-gtk'Fx_create_frame+0xf20 [0x1fffec39a16a68]
>     /usr/bin/emacs-gtk'funcall_subr+0x120 [0x1fffec39ae6f58]
>     /usr/bin/emacs-gtk'exec_byte_code+0x52c [0x1fffec39b470e0]
>     /usr/bin/emacs-gtk'Ffuncall+0x108 [0x1fffec39ae37d4]
>     /usr/bin/emacs-gtk'Fapply+0x378 [0x1fffec39ae3c98]
>     /usr/bin/emacs-gtk'funcall_subr+0xa8 [0x1fffec39ae6ee0]
>     /usr/bin/emacs-gtk'exec_byte_code+0x52c [0x1fffec39b470e0]
>     /usr/bin/emacs-gtk'apply_lambda+0xd4 [0x1fffec39aea734]
>     /usr/bin/emacs-gtk'eval_sub+0x630 [0x1fffec39ae88c0]
>     /usr/bin/emacs-gtk'Feval+0x60 [0x1fffec39aebb2c]
>     /usr/bin/emacs-gtk'internal_condition_case+0x78 [0x1fffec39ae1bbc]
>     /usr/bin/emacs-gtk'top_level_1+0x40 [0x1fffec39a33bd0]
>     /usr/bin/emacs-gtk'internal_catch+0x40 [0x1fffec39ae1ae8]
>     /usr/bin/emacs-gtk'command_loop+0xd8 [0x1fffec39a3241c]
>     /usr/bin/emacs-gtk'recursive_edit_1+0xb8 [0x1fffec39a37d4c]
>     /usr/bin/emacs-gtk'Frecursive_edit+0x124 [0x1fffec39a38274]
>     /usr/bin/emacs-gtk'main+0x1f28 [0x1fffec39a31770]
>     /usr/bin/emacs-gtk'_start+0x64 [0x1fffec398c5c04]
>     Abort (core dumped)
>
> 3)
>
> Sometimes, this clobbers the state of my video card, leaving
> me with a black screen with a large white square in the upper
> left corner. When this happens, I log in remotely, and using
> ps, it appears that my desktop is still running, so the X server
> is unaware that the video hardware has been whacked. I can only
> assume that a bad request is not being caught by the X11 server,
> which is likely a second bug.
>
> With the Lucid version, I normally get this result, after
> a long pause:
>
>     % emacs-x
>
>     Connection lost to X server 'localhost:10.0'
>
> The fact that both the plain X11, and GTK, versions, have
> problems seems to point at some generic non-toolkit specific
> X code. Perhaps this is an endian issue (x86 works fine), or
> maybe there's some uninitialized data involved.
>
> I asked Rainer Orth, who I work with on Solaris gcc issues, and
> who I know is also an emacs user, and he was able to confirm the
> same issues, using a different X server:
>
> >> Given that I'm still on 27.1.90/28.2 myself, I didn't. On the sparc
> >> side, I primarily use emacs to run gdb, which I haven't done in a
> >> while. I can give it a whirl, though.
> >
> > here's what I found: I configured emacs with
> >
> >     configure \
> >         'CFLAGS=-g3 -O2 -m64' \
> >         --prefix=/vol/gnu \
> >         --with-gif=ifavailable
> >     PKG_CONFIG_PATH=/usr/lib/64/pkgconfig:/usr/lib/64/pkgconfig:/usr/share/pkgconfig
> >
> > and built it with the bundled gcc 13.2.0. Started just fine with emacs
> > -nw (to confirm the basics), but then I started it remotely from my
> > ThinLinc session at work (the thin client we use): that resulted in an
> > immediate SEGV of Xvnc and termination of the session. I'm glad I
> > didn't try this from my session at home 😉
> >
> > The Xvnc backtrace is like this:
> >
> > (EE)
> > (EE) Backtrace:
> > (EE) 0: /opt/thinlinc/libexec/Xvnc (xorg_backtrace+0x41) [0x76a2f1]
> > (EE) 1: /opt/thinlinc/libexec/Xvnc (0x400000+0x36d909) [0x76d909]
> > (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f7341b61000+0x14420)
> > [0x7f7341b75420]
> > (EE) 3: /lib/x86_64-linux-gnu/libc.so.6 (0x7f73416d1000+0x18bb38)
> > [0x7f734185cb38]
> > (EE) 4: /opt/thinlinc/libexec/Xvnc (FlushClient+0x348) [0x76d228]
> > (EE) 5: /opt/thinlinc/libexec/Xvnc (WriteToClient+0x100) [0x76d460]
> > (EE) 6: /opt/thinlinc/libexec/Xvnc (ProcXIGetSelectedEvents+0x2a8) [0x54fd98]
> > (EE) 7: /opt/thinlinc/libexec/Xvnc (Dispatch+0x325) [0x71c925]
> > (EE) 8: /opt/thinlinc/libexec/Xvnc (dix_main+0x388) [0x720778]
> > (EE) 9: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xf3) [0x7f73416f5083]
> > (EE) 10: /opt/thinlinc/libexec/Xvnc (0x400000+0xc2fa0) [0x4c2fa0]
> > (EE)
> > (EE) Segmentation fault at address 0x17b40b60
> > (EE)
> >
> > I'm not sure if I can easily get a Xvnc core dump, though.
>
> He also reports the same issue with Debian/sparc64, which
> we think means that it seems not to be an OS-specific issue:
>
> > For comparison's sake, I tried the same with a Debian/sparc64 LDom I
> > have around for some LLVM and GCC work. I installed emacs 29.2 there,
> > too (from packages this time) and was 'rewarded' with exactly the same
> > Xvnc SEGV as before. So there's nothing Solaris specific in here, it
> > seems.
>
> I'd appreciate it if someone who knows this code could have
> a look. I'm happy to try patches or experiments to help narrow
> the issue down further.

Emacs works fine on sparc64-sun-solaris2.10, but the difference is that the X libraries and servers installed there are ancient and predate the introduction of generic events or XInput 2.

The backtrace Rainer produced demonstrates that the client-side abort is a consequence of the X server crashing as it attempts to respond to an XIGetSelectedEvents request. A server crash is _always_ a bug in the X server, whatever the circumstances, so I suggest redirecting your attention to X.Org and building with `--without-xinput2' in the meantime.
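
For anyone applying the interim workaround, a minimal sketch of such a build might look like the following. It assumes it is run from the top of an unpacked Emacs 29.x source tree; EXTRA_FLAGS and WORKAROUND_FLAG are hypothetical variable names used only for illustration, and only the `--without-xinput2' option itself comes from the advice above.

```shell
#!/bin/sh
# Sketch only: configure Emacs without XInput 2 support as a stopgap
# while the X server crash is being investigated.
# Assumption: the current directory is the top of an Emacs 29.x source tree.

# Hypothetical variable: whatever options your build recipe already passes
# (for example --prefix or --with-gif=ifavailable).
EXTRA_FLAGS=${EXTRA_FLAGS:-}

# The workaround option suggested in this message.
WORKAROUND_FLAG=--without-xinput2

if [ -x ./configure ]; then
    # EXTRA_FLAGS is deliberately left unquoted so it splits into words.
    ./configure $EXTRA_FLAGS "$WORKAROUND_FLAG" && make
else
    # Outside a source tree, only report what would have been run.
    echo "no ./configure here; would run: ./configure $EXTRA_FLAGS $WORKAROUND_FLAG"
fi
```

A binary built this way should not reach the XIGetSelectedEvents call seen in the first backtrace, since that call belongs to the XInput 2 code path. The server-side crash itself remains and still needs to be reported to X.Org.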