Package: emacs
Reported by: ali_gnu2 <at> emvision.com
Date: Tue, 12 Mar 2024 20:38:02 UTC
Severity: normal
Done: Po Lu <luangruo <at> yahoo.com>
Bug is archived. No further changes may be made.
Message #5 received at submit <at> debbugs.gnu.org:
From: ali_gnu2 <at> emvision.com
To: bug-gnu-emacs <at> gnu.org
Subject: X11 versions of Emacs 29 on sparc fail at startup
Date: Tue, 12 Mar 2024 11:57:46 -0600

Hello,

I maintain the GNU emacs delivered with the Solaris OS:

    https://github.com/oracle/solaris-userland/tree/master/components/emacs

We're still at version 28.2. It's getting long in the tooth, so I recently tried to move to 29.2. There were no issues on x86, but on sparc, I see display problems with both the gtk (emacs-gtk) and lucid (emacs-x) versions that prevent it from running.

I double checked version 28.2, which we are currently delivering, and it has no problem. Then, I built 29.1, and it shows the same issues as 29.2. It seems that something was introduced early in the 29 series.

With the GTK version, I see 3 different failures on repeated attempts:

1)
    % emacs-gtk
    Connection lost to X server 'localhost:10.0'
    When compiled with GTK, Emacs cannot recover from X disconnects.
    This is a GTK bug: https://gitlab.gnome.org/GNOME/gtk/issues/221
    For details, see etc/PROBLEMS.
    Fatal error 6: Aborted
    Backtrace:
    /usr/bin/emacs-gtk'emacs_backtrace+0x50 [0x1ffe749825f2ac]
    /usr/bin/emacs-gtk'terminate_due_to_signal+0xb4 [0x1ffe749822f804]
    /usr/bin/emacs-gtk'emacs_abort+0x8 [0x1ffe74982609c4]
    /usr/bin/emacs-gtk'x_connection_closed+0x3cc [0x1ffe74981efd54]
    /usr/bin/emacs-gtk'x_io_error_quitter+0x3c [0x1ffe74981f00f0]
    /usr/lib/sparcv9/libX11.so.4.0.0'_XIOError+0x74 [0x1ffe74904f8034]
    /usr/lib/sparcv9/libX11.so.4.0.0'_XReply+0x360 [0x1ffe74904f3bf0]
    /usr/lib/sparcv9/libXi.so.5.0.0'XIGetSelectedEvents+0x84 [0x1ffe748ed17044]
    /usr/bin/emacs-gtk'Fx_create_frame+0x17b4 [0x1ffe74982172fc]
    /usr/bin/emacs-gtk'funcall_subr+0x120 [0x1ffe74982e6f58]
    /usr/bin/emacs-gtk'exec_byte_code+0x52c [0x1ffe74983470e0]
    /usr/bin/emacs-gtk'Ffuncall+0x108 [0x1ffe74982e37d4]
    /usr/bin/emacs-gtk'Fapply+0x378 [0x1ffe74982e3c98]
    /usr/bin/emacs-gtk'funcall_subr+0xa8 [0x1ffe74982e6ee0]
    /usr/bin/emacs-gtk'exec_byte_code+0x52c [0x1ffe74983470e0]
    /usr/bin/emacs-gtk'apply_lambda+0xd4 [0x1ffe74982ea734]
    /usr/bin/emacs-gtk'eval_sub+0x630 [0x1ffe74982e88c0]
    /usr/bin/emacs-gtk'Feval+0x60 [0x1ffe74982ebb2c]
    /usr/bin/emacs-gtk'internal_condition_case+0x78 [0x1ffe74982e1bbc]
    /usr/bin/emacs-gtk'top_level_1+0x40 [0x1ffe7498233bd0]
    /usr/bin/emacs-gtk'internal_catch+0x40 [0x1ffe74982e1ae8]
    /usr/bin/emacs-gtk'command_loop+0xd8 [0x1ffe749823241c]
    /usr/bin/emacs-gtk'recursive_edit_1+0xb8 [0x1ffe7498237d4c]
    /usr/bin/emacs-gtk'Frecursive_edit+0x124 [0x1ffe7498238274]
    /usr/bin/emacs-gtk'main+0x1f28 [0x1ffe7498231770]
    /usr/bin/emacs-gtk'_start+0x64 [0x1ffe74980c5c04]
    Abort (core dumped)

Note that I don't believe that X disconnects are involved. This happened immediately at startup.

2)
    % emacs-gtk
    [xcb] Unknown sequence number while processing queue
    [xcb] You called XInitThreads, this is not your fault
    [xcb] Aborting, sorry about that.
    Assertion failed: !xcb_xlib_threads_sequence_lost, file /builds/ulhg/mrcarson-trunk_166/components/x11/lib/libX11/libX11-1.8.7/src/xcb_io.c, line 278, function poll_for_event
    Fatal error 6: Aborted
    Backtrace:
    /usr/bin/emacs-gtk'emacs_backtrace+0x50 [0x1fffec39a5f2ac]
    /usr/bin/emacs-gtk'terminate_due_to_signal+0xb4 [0x1fffec39a2f804]
    /usr/bin/emacs-gtk'handle_fatal_signal+0x8 [0x1fffec39a609b0]
    /usr/bin/emacs-gtk'deliver_fatal_thread_signal+0x98 [0x1fffec39a5d4b8]
    /lib/sparcv9/libc.so.1'__sighndlr+0xc [0x1fffec392c6410]
    /lib/sparcv9/libc.so.1'call_user_handler+0x400 [0x1fffec392b8cb8]
    /lib/sparcv9/libc.so.1'sigacthandler+0xd0 [0x1fffec392b90a8]
    /lib/sparcv9/libc.so.1'__lwp_sigqueue+0x8 [0x1fffec392cb528]
    /lib/sparcv9/libc.so.1'abort+0xb4 [0x1fffec391e5154]
    /lib/sparcv9/libc.so.1'_assert_c99+0x64 [0x1fffec391e5ffc]
    /usr/lib/sparcv9/libX11.so.4.0.0'poll_for_event+0x1fc [0x1fffec31cf2a8c]
    /usr/lib/sparcv9/libX11.so.4.0.0'poll_for_response+0x2c [0x1fffec31cf2b2c]
    /usr/lib/sparcv9/libX11.so.4.0.0'_XEventsQueued+0x7c [0x1fffec31cf2e8c]
    /usr/lib/sparcv9/libX11.so.4.0.0'XFlush+0x1c [0x1fffec31cba45c]
    /usr/bin/emacs-gtk'x_set_icon_type+0x70 [0x1fffec39a0c640]
    /usr/bin/emacs-gtk'gui_set_frame_parameters_1+0x1800 [0x1fffec398e0028]
    /usr/bin/emacs-gtk'gui_default_parameter+0x58 [0x1fffec398e55e8]
    /usr/bin/emacs-gtk'Fx_create_frame+0xf20 [0x1fffec39a16a68]
    /usr/bin/emacs-gtk'funcall_subr+0x120 [0x1fffec39ae6f58]
    /usr/bin/emacs-gtk'exec_byte_code+0x52c [0x1fffec39b470e0]
    /usr/bin/emacs-gtk'Ffuncall+0x108 [0x1fffec39ae37d4]
    /usr/bin/emacs-gtk'Fapply+0x378 [0x1fffec39ae3c98]
    /usr/bin/emacs-gtk'funcall_subr+0xa8 [0x1fffec39ae6ee0]
    /usr/bin/emacs-gtk'exec_byte_code+0x52c [0x1fffec39b470e0]
    /usr/bin/emacs-gtk'apply_lambda+0xd4 [0x1fffec39aea734]
    /usr/bin/emacs-gtk'eval_sub+0x630 [0x1fffec39ae88c0]
    /usr/bin/emacs-gtk'Feval+0x60 [0x1fffec39aebb2c]
    /usr/bin/emacs-gtk'internal_condition_case+0x78 [0x1fffec39ae1bbc]
    /usr/bin/emacs-gtk'top_level_1+0x40 [0x1fffec39a33bd0]
    /usr/bin/emacs-gtk'internal_catch+0x40 [0x1fffec39ae1ae8]
    /usr/bin/emacs-gtk'command_loop+0xd8 [0x1fffec39a3241c]
    /usr/bin/emacs-gtk'recursive_edit_1+0xb8 [0x1fffec39a37d4c]
    /usr/bin/emacs-gtk'Frecursive_edit+0x124 [0x1fffec39a38274]
    /usr/bin/emacs-gtk'main+0x1f28 [0x1fffec39a31770]
    /usr/bin/emacs-gtk'_start+0x64 [0x1fffec398c5c04]
    Abort (core dumped)

3) Sometimes, this clobbers the state of my video card, leaving me with a black screen with a large white square in the upper left corner. When this happens, I log in remotely, and using ps, it appears that my desktop is still running, so the X server is unaware that the video hardware has been whacked. I can only assume that a bad request is not being caught by the X11 server, which is likely a second bug.

With the Lucid version, I normally get this result, after a long pause:

    % emacs-x
    Connection lost to X server 'localhost:10.0'

The fact that both the plain X11 and GTK versions have problems seems to point at some generic, non-toolkit-specific X code. Perhaps this is an endian issue (x86 works fine), or maybe there's some uninitialized data involved.

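To make the endian guess concrete, here is a minimal sketch of the general class of bug I have in mind. It is not a diagnosis of this failure and is not taken from the Emacs or libXi sources. The helper names and the event number are hypothetical; the only assumption is a bit mask that is filled in byte by byte and then, by mistake, read back as a 32-bit word.

```python
# Illustration only: a bit mask written byte-wise but read back as a
# 32-bit word gives different answers depending on byte order.
# All names and the event number are hypothetical.

def set_bit_bytewise(buf: bytearray, event: int) -> None:
    """Set bit `event` using byte indexing (independent of byte order)."""
    buf[event >> 3] |= 1 << (event & 7)

def bit_set_as_word(buf: bytes, event: int, byteorder: str) -> bool:
    """Test bit `event` after interpreting the first 4 bytes as one word."""
    word = int.from_bytes(buf[:4], byteorder)
    return bool(word & (1 << event))

mask = bytearray(4)
set_bit_bytewise(mask, 6)          # arbitrary example event number

# A little-endian host (x86) happens to see the bit where it expects it.
seen_on_little = bit_set_as_word(mask, 6, "little")

# A big-endian host (sparc) looks at the wrong end of the word.
seen_on_big = bit_set_as_word(mask, 6, "big")

print(mask.hex(), seen_on_little, seen_on_big)
```

Code with this kind of mismatch would work on x86 and misbehave on sparc, which is the pattern I'm seeing, but uninitialized data could produce the same symptoms.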
I asked Rainer Orth, who I work with on Solaris gcc issues, and who I know is also an emacs user, and he was able to confirm the same issues, using a different X server:

>> Given that I'm still on 27.1.90/28.2 myself, I didn't. On the sparc
>> side, I primarily use emacs to run gdb, which I haven't done in a
>> while. I can give it a whirl, though.
>
> here's what I found: I configured emacs with
>
> configure \
> 'CFLAGS=-g3 -O2 -m64' \
> --prefix=/vol/gnu \
> --with-gif=ifavailable
> PKG_CONFIG_PATH=/usr/lib/64/pkgconfig:/usr/lib/64/pkgconfig:/usr/share/pkgconfig
>
> and built it with the bundled gcc 13.2.0. Started just fine with emacs
> -nw (to confirm the basics), but then I started it remotely from my
> ThinLinc session at work (the thin client we use): that resulted in an
> immediate SEGV of Xvnc and termination of the session. I'm glad I
> didn't try this from my session at home 😉
>
> The Xvnc backtrace is like this:
>
> (EE)
> (EE) Backtrace:
> (EE) 0: /opt/thinlinc/libexec/Xvnc (xorg_backtrace+0x41) [0x76a2f1]
> (EE) 1: /opt/thinlinc/libexec/Xvnc (0x400000+0x36d909) [0x76d909]
> (EE) 2: /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f7341b61000+0x14420)
> [0x7f7341b75420]
> (EE) 3: /lib/x86_64-linux-gnu/libc.so.6 (0x7f73416d1000+0x18bb38)
> [0x7f734185cb38]
> (EE) 4: /opt/thinlinc/libexec/Xvnc (FlushClient+0x348) [0x76d228]
> (EE) 5: /opt/thinlinc/libexec/Xvnc (WriteToClient+0x100) [0x76d460]
> (EE) 6: /opt/thinlinc/libexec/Xvnc (ProcXIGetSelectedEvents+0x2a8) [0x54fd98]
> (EE) 7: /opt/thinlinc/libexec/Xvnc (Dispatch+0x325) [0x71c925]
> (EE) 8: /opt/thinlinc/libexec/Xvnc (dix_main+0x388) [0x720778]
> (EE) 9: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xf3) [0x7f73416f5083]
> (EE) 10: /opt/thinlinc/libexec/Xvnc (0x400000+0xc2fa0) [0x4c2fa0]
> (EE)
> (EE) Segmentation fault at address 0x17b40b60
> (EE)
>
> I'm not sure if I can easily get a Xvnc core dump, though.

He also reports the same issue with Debian/sparc64, which suggests that it is not an OS-specific issue:

> For comparison's sake, I tried the same with a Debian/sparc64 LDom I
> have around for some LLVM and GCC work. I installed emacs 29.2 there,
> too (from packages this time) and was 'rewarded' with exactly the same
> Xvnc SEGV as before. So there's nothing Solaris specific in here, it
> seems.

I'd appreciate it if someone who knows this code could have a look. I'm happy to try patches or experiments to help narrow the issue down further.

Thanks.

- Ali