Hi Eli:

Sorry for the delayed response. After removing a bunch of the optimization flags (those were just from some recipe that I copied for compiling Emacs on Windows), the specific problem of Emacs itself deadlocking goes away. Emacs will still occasionally hang on the main thread, and attaching gdb to it via MSYS2 (which is what I use to compile Emacs) doesn't work. However, Emacs recovers on its own after an unpredictable amount of time, which doesn't seem to be related to any of the eglot timeout settings. I haven't had a chance to dig into this further yet, but it's definitely an improvement over having to terminate all clangd processes or restart Emacs.
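
For reference, a trimmed-down configure invocation along the lines suggested in the quoted message below (no --with-wide-int, -O2 instead of -O3, none of the extra -f switches) might look roughly like the following. This is only a sketch: which of the remaining options and CFLAGS are kept (for example keeping -ggdb and dropping the -march/-mtune switches) is an assumption, not a record of the exact command line used for the rebuild.

```shell
# Sketch only: options copied from the configure line quoted below, minus the
# switches that were suggested for removal. Keeping -ggdb and dropping
# -march=native/-mtune=native is an assumption, not the confirmed recipe.
./configure --without-pop --with-imagemagick \
  --without-compress-install --without-dbus --with-gnutls \
  --with-tree-sitter --without-gconf --with-rsvg --without-gsettings \
  --with-mailutils --with-native-compilation --with-modules --with-xml2 \
  'CFLAGS=-O2 -ggdb' \
  PKG_CONFIG_PATH=/mingw64/lib/pkgconfig:/mingw64/share/pkgconfig
```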


From: Eli Zaretskii <eliz@gnu.org>
Sent: Saturday, July 5, 2025 1:14 AM
To: admin@sonictk.com <admin@sonictk.com>
Cc: 78846@debbugs.gnu.org <78846@debbugs.gnu.org>
Subject: Re: bug#78846: Emacs hangs non-deterministically when eglot and clangd are used
 
Ping! Could you please answer my questions, and perhaps provide the
additional information I asked for?  I'd like to make some progress in
investigating this problem.

> Date: Fri, 20 Jun 2025 10:46:51 +0300
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: 78846@debbugs.gnu.org
>
> > From: "admin@sonictk.com" <admin@sonictk.com>
> > Date: Fri, 20 Jun 2025 05:09:06 +0000
> >
> > This has been a problem for as long as I can remember, but
> > using `eglot` with `clangd` results in Emacs sometimes just hanging.
>
> Does this happen only in Emacs 31, or did you see that in older
> versions as well?
>
> > I work in the Unreal Engine codebase, which results in some pretty massive `clangd` memory usage (sometimes 200+ GB) and thus response times, along with having to spin up multiple `clangd` servers simultaneously.
>
> How much VM do you have on that system, if memory consumption can be
> 200+ GB?  And what is the memory footprint of Emacs in those cases?
>
> > As far as I can tell, Emacs can hang in cases with as little as two `clangd` servers spun up, but generally this problem repros more often when I have multiple servers spun up. It usually happens on some LSP request, and it's not always deterministic which request it is.
>
> How many LSP servers could you have simultaneously?
>
> > I work on Windows, and I grabbed a callstack of Emacs's main thread when the hang occurs in WinDbg:
>
> This doesn't tell the whole story, because there must be other
> threads of interest in this case (since one or more clangd processes
> are being read from and written to).
>
> > ```
> > [0x0]   ntdll!NtWaitForMultipleObjects+0x14   0x3df75fdea8   0x7ffdf18abaf0  
> > [0x1]   KERNELBASE!WaitForMultipleObjectsEx+0xf0   0x3df75fdeb0   0x7ffdf18ab9ee  
> > [0x2]   KERNELBASE!WaitForMultipleObjects+0xe   0x3df75fe1a0   0x7ff635a921f7  
> > [0x3]   emacs!sys_select+0x12b7   0x3df75fe1e0   0x7ff635a1147e  
> > [0x4]   emacs!really_call_select+0x5e   0x3df75fe2a0   0x7ff635a1273d  
> > [0x5]   emacs!thread_select+0x9d   0x3df75fe300   0x7ff6359d9aec  
> > [0x6]   emacs!wait_reading_process_output+0x111c   0x3df75fe450   0x7ff6359db4c9  
> > [0x7]   emacs!send_process+0x269   0x3df75fe9c0   0x7ff6359dbd11  
> > [0x8]   emacs!Fprocess_send_string+0xb1   0x3df75fea70   0x7ff6359b5c5e  
>
> This says that Emacs called process-send-string, and it loops waiting
> for the queue of sent material to be emptied.  This information is not
> enough to analyze the reason for the hang.
>
> > The only workaround, without killing Emacs, is to kill all `clangd.exe` processes - this usually gets Emacs unstuck after a second or so. Sometimes, though, this trick doesn't work, and then the only recourse I have is to kill Emacs.exe entirely.
> >
> > I've been trying to find time on and off to debug this properly, but today I finally gave up and decided to report this in the hope that maybe someone will see this callstack and know immediately where the bug is.
> >
> > My emacs build was built from source. Version information is as follows:
> >
> > ```
> > In GNU Emacs 31.0.50 (build 5, x86_64-w64-mingw32) of 2025-06-03 built
> >  on CDW-AQRHE1HHT39
> > Repository revision: eb788fd8fd2026fa4d29b918ff95b12d8e3e0bab
> > Repository branch: master
> > Windowing system distributor 'Microsoft Corp.', version 10.0.19045
> > System Description: Microsoft Windows 10 Enterprise (v10.0.2009.19045.5965)
>
> This build is from 2 weeks ago.  Please update from master and
> rebuild, so that the source-level information you report will be easy
> to follow by looking at the current sources.
>
> > Configured using:
> >  'configure --without-pop --with-imagemagick
> >  --without-compress-install -without-dbus --with-gnutls
> >  --with-tree-sitter --without-gconf --with-rsvg --without-gsettings
> >  --with-mailutils --with-native-compilation --with-modules --with-xml2
> >  --with-wide-int 'CFLAGS=-O3 -ggdb -fno-math-errno
>
> First, please remove the --with-wide-int, it is not needed for 64-bit
> builds (and is supposed to be a no-op, but who knows?).  Also, please
> don't use -O3, but -O2 instead.  The -O3 switch could cause unsafe
> optimizations.
>
> >  -funsafe-math-optimizations -fno-finite-math-only -fno-trapping-math
> >  -freciprocal-math -fno-rounding-math -fno-signaling-nans
> >  -fassociative-math -fno-signed-zeros -frename-registers
> >  -funroll-loops -mtune=native -march=native -fomit-frame-pointer
> >  -fallow-store-data-races -fno-semantic-interposition
> >  -floop-parallelize-all -ftree-parallelize-loops=4'
>
> Please also drop all these -fSOMETHING and -fno-SOMETHING switches;
> they are not needed with -O2.  And -fomit-frame-pointer is actually
> dangerous in Emacs.
>
> >  PKG_CONFIG_PATH=/mingw64/lib/pkgconfig:/mingw64/share/pkgconfig'
>
> Don't know what that is, or why do you need it.
>
> > Anyone who has ideas for me to try/look in debugging this issue, happy to give them a go. I encounter this fairly frequently, so I should be able to catch it in action.
>
> Next time Emacs hangs like that, attach GDB to it, then type at the
> GDB prompt:
>
>   (gdb) thread apply all bt
>
> and post here everything that GDB produces as result.  This will show
> us the C-level backtraces of all the threads that run in the process,
> not only of the main thread.  When Emacs communicates with external
> processes, it uses additional threads for that purpose, and it is
> important to know their state and status.
>
> (Let me know if you need instructions about attaching GDB to a running
> Emacs process.)
>
> Thanks.