Package: emacs;
Reported by: "admin <at> sonictk.com" <admin <at> sonictk.com>
Date: Fri, 20 Jun 2025 05:10:02 UTC
Severity: normal
To reply to this bug, email your comments to 78846 AT debbugs.gnu.org.
Message #5 received at submit <at> debbugs.gnu.org:
From: "admin <at> sonictk.com" <admin <at> sonictk.com>
To: "bug-gnu-emacs <at> gnu.org" <bug-gnu-emacs <at> gnu.org>
Subject: Emacs hangs non-deterministically when eglot and clangd are used
Date: Fri, 20 Jun 2025 05:09:06 +0000
Hi:

This has been a problem ever since as long as I can remember, but using `eglot` with `clangd` results in Emacs sometimes just hanging.

I work in the Unreal Engine codebase, which results in some pretty massive `clangd` memory usage (sometimes 200+ GB) and thus response times, along with having to spin up multiple `clangd` servers simultaneously.

As far as I can tell, Emacs can hang in cases with as little as two `clangd` servers spun up, but generally this problem repros more often when I have multiple servers spun up. It usually happens on some LSP request, and it's not always deterministic which request it is.

I work on Windows, and I grabbed a callstack of Emacs's main thread when the hang occurs in WinDbg:

```
[0x0] ntdll!NtWaitForMultipleObjects+0x14 0x3df75fdea8 0x7ffdf18abaf0
[0x1] KERNELBASE!WaitForMultipleObjectsEx+0xf0 0x3df75fdeb0 0x7ffdf18ab9ee
[0x2] KERNELBASE!WaitForMultipleObjects+0xe 0x3df75fe1a0 0x7ff635a921f7
[0x3] emacs!sys_select+0x12b7 0x3df75fe1e0 0x7ff635a1147e
[0x4] emacs!really_call_select+0x5e 0x3df75fe2a0 0x7ff635a1273d
[0x5] emacs!thread_select+0x9d 0x3df75fe300 0x7ff6359d9aec
[0x6] emacs!wait_reading_process_output+0x111c 0x3df75fe450 0x7ff6359db4c9
[0x7] emacs!send_process+0x269 0x3df75fe9c0 0x7ff6359dbd11
[0x8] emacs!Fprocess_send_string+0xb1 0x3df75fea70 0x7ff6359b5c5e
[0x9] emacs!exec_byte_code+0x7ee 0x3df75feab0 0x7ff63592eefe
[0xa] emacs!Ffuncall+0xfe 0x3df75feb90 0x7ff63592f3a1
[0xb] emacs!Fapply+0x141 0x3df75fec20 0x7ff6359b5c5e
[0xc] emacs!exec_byte_code+0x7ee 0x3df75fed00 0x7ff63592eefe
[0xd] emacs!Ffuncall+0xfe 0x3df75fede0 0x7ffd967e587e
[0xe] jsonrpc_e62a9c36_c0dd1fe3!F6a736f6e7270632d6e6f74696679_jsonrpc_notify_0+0x4e 0x3df75fee70 0x7ff63592eefe
[0xf] emacs!Ffuncall+0xfe 0x3df75feed0 0x7ffd96ebc1b7
[0x10] eglot_3726620f_1a2b9d18!F65676c6f742d2d63616e63656c2d696e666c696768742d6173796e632d7265717565737473_eglot__cancel_inflight_async_requests_0+0x1a7 0x3df75fef60 0x7ff63592eefe
[0x11] emacs!Ffuncall+0xfe 0x3df75ff030 0x7ffd96ec657e
[0x12] eglot_3726620f_1a2b9d18!F65676c6f742d2d7072652d636f6d6d616e642d686f6f6b_eglot__pre_command_hook_0+0x3e 0x3df75ff0c0 0x7ff63592eefe
[0x13] emacs!Ffuncall+0xfe 0x3df75ff100 0x7ff635927a05
[0x14] emacs!internal_condition_case_n+0x45 0x3df75ff190 0x7ff63584f384
[0x15] emacs!safe_run_hook_funcall+0xa4 0x3df75ff1d0 0x7ff635927ffb
[0x16] emacs!run_hook_with_args+0x9b 0x3df75ff260 0x7ff635868e7c
[0x17] emacs!command_loop_1+0x3cc 0x3df75ff2d0 0x7ff635927839
[0x18] emacs!internal_condition_case+0x39 0x3df75ff4b0 0x7ff63584ca56
[0x19] emacs!command_loop_2+0x26 0x3df75ff4f0 0x7ff6359277a7
[0x1a] emacs!internal_catch+0x37 0x3df75ff520 0x7ff63584c9ec
[0x1b] emacs!command_loop+0x15c 0x3df75ff560 0x7ff6358571e7
[0x1c] emacs!recursive_edit_1+0x87 0x3df75ff670 0x7ff6358575e4
[0x1d] emacs!Frecursive_edit+0xe4 0x3df75ff6c0 0x7ff635aeac77
[0x1e] emacs!main+0x2787 0x3df75ff6f0 0x7ff635701318
[0x1f] emacs!__tmainCRTStartup+0x168 0x3df75ffc20 0x7ff635701426
[0x20] emacs!mainCRTStartup+0x16 0x3df75ffc70 0x7ffdf3d37374
[0x21] KERNEL32!BaseThreadInitThunk+0x14 0x3df75ffca0 0x7ffdf3edcc91
[0x22] ntdll!RtlUserThreadStart+0x21 0x3df75ffcd0 0x0
```

The only workaround, without killing Emacs, is to kill all `clangd.exe` processes - this usually gets Emacs unstuck after a second or so. Sometimes, though, this trick doesn't work, and then the only recourse I have is to kill Emacs.exe entirely.

I've been trying to find time on and off to debug this properly, but today I finally gave up and decided to report this in the hope that maybe someone will see this callstack and know immediately where the bug is.

My emacs build was built from source.
Version information is as follows:

```
In GNU Emacs 31.0.50 (build 5, x86_64-w64-mingw32) of 2025-06-03 built
 on CDW-AQRHE1HHT39
Repository revision: eb788fd8fd2026fa4d29b918ff95b12d8e3e0bab
Repository branch: master
Windowing system distributor 'Microsoft Corp.', version 10.0.19045
System Description: Microsoft Windows 10 Enterprise (v10.0.2009.19045.5965)

Configured using:
 'configure --without-pop --with-imagemagick
 --without-compress-install -without-dbus --with-gnutls
 --with-tree-sitter --without-gconf --with-rsvg --without-gsettings
 --with-mailutils --with-native-compilation --with-modules --with-xml2
 --with-wide-int 'CFLAGS=-O3 -ggdb -fno-math-errno
 -funsafe-math-optimizations -fno-finite-math-only -fno-trapping-math
 -freciprocal-math -fno-rounding-math -fno-signaling-nans
 -fassociative-math -fno-signed-zeros -frename-registers
 -funroll-loops -mtune=native -march=native -fomit-frame-pointer
 -fallow-store-data-races -fno-semantic-interposition
 -floop-parallelize-all -ftree-parallelize-loops=4'
 PKG_CONFIG_PATH=/mingw64/lib/pkgconfig:/mingw64/share/pkgconfig'
```

I haven't been able to find an isolated test case for this to happen deterministically.

Anyone who has ideas for me to try/look in debugging this issue, happy to give them a go. I encounter this fairly frequently, so I should be able to catch it in action.

Thanks.

- Yi Liang
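A note on reading the callstack in this report: the frames from the `jsonrpc` and `eglot` modules carry long symbol names whose hex prefix appears to spell out the Lisp function name. The Python sketch below decodes them, assuming the naming scheme is `F` + hex of the symbol name + `_` + a sanitized copy + `_` + a counter (inferred from the three symbols in this report, not taken from any specification). The helper name `decode_native_symbol` is hypothetical.

```python
import re

# Assumed layout: "F" + hex(symbol name bytes) + "_" + sanitized name + "_" + counter.
_NATIVE_SYM = re.compile(r"^F((?:[0-9a-f]{2})+)_\w+_\d+$")


def decode_native_symbol(symbol: str) -> str:
    """Return the Lisp function name encoded in the hex part of SYMBOL."""
    match = _NATIVE_SYM.match(symbol)
    if match is None:
        raise ValueError(f"unrecognized symbol layout: {symbol!r}")
    return bytes.fromhex(match.group(1)).decode("utf-8")


# The three native-compiled frames from the callstack, innermost first.
frames = [
    "F6a736f6e7270632d6e6f74696679_jsonrpc_notify_0",
    "F65676c6f742d2d63616e63656c2d696e666c696768742d6173796e632d7265717565737473"
    "_eglot__cancel_inflight_async_requests_0",
    "F65676c6f742d2d7072652d636f6d6d616e642d686f6f6b_eglot__pre_command_hook_0",
]

for frame in frames:
    print(decode_native_symbol(frame))
```

Read outermost to innermost, this gives the Lisp-level path into the blocked `process-send-string` call: `eglot--pre-command-hook`, then `eglot--cancel-inflight-async-requests`, then `jsonrpc-notify`.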
Message #8 received at 78846 <at> debbugs.gnu.org:
From: Eli Zaretskii <eliz <at> gnu.org>
To: "admin <at> sonictk.com" <admin <at> sonictk.com>
Cc: 78846 <at> debbugs.gnu.org
Subject: Re: bug#78846: Emacs hangs non-deterministically when eglot and clangd are used
Date: Fri, 20 Jun 2025 10:46:51 +0300
> From: "admin <at> sonictk.com" <admin <at> sonictk.com>
> Date: Fri, 20 Jun 2025 05:09:06 +0000
>
> This has been a problem ever since as long as I can remember, but
> using `eglot` with `clangd` results in Emacs sometimes just hanging.

Does this happen only in Emacs 31, or did you see that in older versions as well?

> I work in the Unreal Engine codebase, which results in some pretty massive `clangd` memory usage (sometimes 200+ GB) and thus response times, along with having to spin up multiple `clangd` servers simultaneously.

How much VM do you have on that system, if memory consumption can be 200+ GB? And what is the memory footprint of Emacs in those cases?

> As far as I can tell, Emacs can hang in cases with as little as two `clangd` servers spun up, but generally this problem repros more often when I have multiple servers spun up. It usually happens on some LSP request, and it's not always deterministic which request it is.

How many LSP servers could you have simultaneously?

> I work on Windows, and I grabbed a callstack of Emacs's main thread when the hang occurs in WinDbg:

This doesn't tell the whole story, because there must be other threads of interest in this case (since one or more clangd processes are being read from and written to).
> ```
> [0x0] ntdll!NtWaitForMultipleObjects+0x14 0x3df75fdea8 0x7ffdf18abaf0
> [0x1] KERNELBASE!WaitForMultipleObjectsEx+0xf0 0x3df75fdeb0 0x7ffdf18ab9ee
> [0x2] KERNELBASE!WaitForMultipleObjects+0xe 0x3df75fe1a0 0x7ff635a921f7
> [0x3] emacs!sys_select+0x12b7 0x3df75fe1e0 0x7ff635a1147e
> [0x4] emacs!really_call_select+0x5e 0x3df75fe2a0 0x7ff635a1273d
> [0x5] emacs!thread_select+0x9d 0x3df75fe300 0x7ff6359d9aec
> [0x6] emacs!wait_reading_process_output+0x111c 0x3df75fe450 0x7ff6359db4c9
> [0x7] emacs!send_process+0x269 0x3df75fe9c0 0x7ff6359dbd11
> [0x8] emacs!Fprocess_send_string+0xb1 0x3df75fea70 0x7ff6359b5c5e

This says that Emacs called process-send-string, and it loops waiting for the queue of sent material to be emptied. This information is not enough to analyze the reason for the hang.

> The only workaround, without killing Emacs, is to kill all `clangd.exe` processes - this usually gets Emacs unstuck after a second or so. Sometimes, though, this trick doesn't work, and then the only recourse I have is to kill Emacs.exe entirely.
>
> I've been trying to find time on and off to debug this properly, but today I finally gave up and decided to report this in the hope that maybe someone will see this callstack and know immediately where the bug is.
>
> My emacs build was built from source. Version information is as follows:
>
> ```
> In GNU Emacs 31.0.50 (build 5, x86_64-w64-mingw32) of 2025-06-03 built
>  on CDW-AQRHE1HHT39
> Repository revision: eb788fd8fd2026fa4d29b918ff95b12d8e3e0bab
> Repository branch: master
> Windowing system distributor 'Microsoft Corp.', version 10.0.19045
> System Description: Microsoft Windows 10 Enterprise (v10.0.2009.19045.5965)

This build is from 2 weeks ago. Please update from master and rebuild, so that the source-level information you report will be easy to follow by looking at the current sources.
> Configured using:
>  'configure --without-pop --with-imagemagick
>  --without-compress-install -without-dbus --with-gnutls
>  --with-tree-sitter --without-gconf --with-rsvg --without-gsettings
>  --with-mailutils --with-native-compilation --with-modules --with-xml2
>  --with-wide-int 'CFLAGS=-O3 -ggdb -fno-math-errno

First, please remove --with-wide-int; it is not needed for 64-bit builds (and is supposed to be a no-op, but who knows?). Also, please don't use -O3, but -O2 instead. The -O3 switch could cause unsafe optimizations.

>  -funsafe-math-optimizations -fno-finite-math-only -fno-trapping-math
>  -freciprocal-math -fno-rounding-math -fno-signaling-nans
>  -fassociative-math -fno-signed-zeros -frename-registers
>  -funroll-loops -mtune=native -march=native -fomit-frame-pointer
>  -fallow-store-data-races -fno-semantic-interposition
>  -floop-parallelize-all -ftree-parallelize-loops=4'

Please also drop all these -fSOMETHING and -fno-SOMETHING switches; they are not needed with -O2. And -fomit-frame-pointer is actually dangerous in Emacs.

>  PKG_CONFIG_PATH=/mingw64/lib/pkgconfig:/mingw64/share/pkgconfig'

Don't know what that is, or why you need it.

> Anyone who has ideas for me to try/look in debugging this issue, happy to give them a go. I encounter this fairly frequently, so I should be able to catch it in action.

Next time Emacs hangs like that, attach GDB to it, then type at the GDB prompt:

  (gdb) thread apply all bt

and post here everything that GDB produces as a result. This will show us the C-level backtraces of all the threads that run in the process, not only of the main thread. When Emacs communicates with external processes, it uses additional threads for that purpose, and it is important to know their state and status.

(Let me know if you need instructions about attaching GDB to a running Emacs process.)

Thanks.
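The GDB step requested above can also be run non-interactively. The following Python sketch is one possible way to do it, assuming `gdb` is on `PATH` and the process ID of the hung Emacs is already known (for example from Task Manager). It relies only on GDB's `-p`, `-batch`, and `-ex` options; the helper names and the output file name `emacs-threads.txt` are hypothetical.

```python
import subprocess
from typing import List


def gdb_backtrace_command(pid: int, gdb: str = "gdb") -> List[str]:
    """Build a GDB command line that attaches to PID, prints every
    thread's backtrace, and then exits."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return [gdb, "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_all_backtraces(pid: int, out_path: str = "emacs-threads.txt") -> str:
    """Run GDB against PID and save everything it prints to OUT_PATH."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,  # keep whatever output exists even if GDB reports an error
    )
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(result.stdout)
        fh.write(result.stderr)
    return out_path


# Example (hypothetical PID): dump_all_backtraces(12345)
```

Saving both stdout and stderr keeps GDB's warnings next to the backtraces, which makes it easier to post everything GDB produced.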