#79367 - 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t

GNU bug report logs - #79367
31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t

Package: emacs;

Reported by: Zhengyi Fu <i <at> fuzy.me>

Date: Tue, 2 Sep 2025 06:21:01 UTC

Severity: normal

Found in version 31.0.50

Message #95 received at 79367 <at> debbugs.gnu.org (full text, mbox):

From: Spencer Baugh <sbaugh <at> janestreet.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 14:04:42 -0400

Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Spencer Baugh <sbaugh <at> janestreet.com> >> Cc: Dmitry Gutov <dmitry <at> gutov.dev>, i <at> fuzy.me, 79367 <at> debbugs.gnu.org >> Date: Tue, 02 Sep 2025 15:33:46 -0400 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> > I can code the proposed fix in a few minutes, if there's agreement to >> > what I proposed to do. I'm still uncertain what I proposed is >> > agreed-upon. So let me reiterate that: >> > >> > . if the process on behalf of which we called >> > server_accept_connection was not locked to some thread, undo what >> > make_process did to the new process when it called pset_thread >> > . otherwise, call set_proc_thread to lock the new process to the >> > same thread as the one which caused this call >> >> Whatever you do, please post your change for review rather than just >> pushing it. > > Will do. > >> >> I don't want us to spend a lot of time arguing revert-or-no-revert now, >> >> but could you, Eli, try to decide on a full example of a buggy scenario >> >> which is really solved by thread locking being? >> > >> > I already described that in so many words. >> >> Can you send that full description again? Perhaps I can translate it >> from words into code for you. > > Look at my posts as part of discussing bug#79201. For example: OK, attempting to excerpt the parts where you describe what could go wrong when a thread is unlocked: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#88 > Wouldn't we want the thread to be able to read the output > from its process after the wait? Without marking the descriptors with > the thread, that output could have been read by some other thread > during the wait, and then our thread will hang forever if it calls > accept-process-output after the wait ended -- a much more serious > problem, and in a program that has no bug per se. https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#94 > > > An output from a process could have been read by a thread other than > > > the one that started the process, thus "starving" the thread that > > > started the process from reading its output. > > > > That has been possible, and common, since threads were originally added. > > And caused strange bugs, whereby a thread that started a process would > hang in accept-process-output because some other thread inadvertently > read the output from that process. https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#100 > If you study the code in wait_reading_process_output, you will > immediately see what kind of godawful mess will that cause with > reading output from several processes started by several threads. > > wait_reading_process_output was written under an implicit assumption > that there's only one thread that handles all the subprocesses. > Therefore, it consistently checks all of the I/O descriptors for > available input from the subprocesses, and consumes that as soon as > it's available. That cannot possibly work reliably, let alone > predictably, when several threads are involved. This is so obvious > from reading the code that I'm astonished I need to even explain it. Note that we need to make sure wait_reading_process_output works reliably in this case for the sake of processes which aren't locked to threads. So defaulting to locking processes to threads doesn't actually help with this. > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#106 > Moreover, IME many weird problems and buggy behavior in Lisp programs > using threads were caused by the wrong thread inadvertently reading > output of a process, which then left the thread that waits for that > process's output stuck waiting forever. If anything, we should make > process locking to its thread stronger, not weaker, if we want > relatively simple threaded programs to work reliably. So, to summarize, it sounds like your issue with not locking processes to threads is: Without locking processes to the creating thread, a process's output can be read by some other thread during the wait, and then our thread will hang forever if it calls accept-process-output after the wait ended. This is unfortunately not detailed enough to translate into code. I've never seen a real-world program which would be vulnerable to this problem, as described here. I can make up toy programs which meet this description which would hang, but I'm not sure they're actually what you're concerned about. Especially because they aren't anything like real-world programs. Could you give some more details about the scenario, so I can try to write a test demonstrating the problem? Or could you point to a real-world program which is vulnerable to this problem, so I can turn it into a test?

This bug report was last modified 7 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #79367 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t

GNU bug report logs - #79367
31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t