GNU bug report logs -
#79367
31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t
Previous Next
Full log
View this message in rfc822 format
Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Spencer Baugh <sbaugh <at> janestreet.com>
>> Cc: Dmitry Gutov <dmitry <at> gutov.dev>, i <at> fuzy.me, 79367 <at> debbugs.gnu.org
>> Date: Tue, 02 Sep 2025 15:33:46 -0400
>>
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>>
>> > I can code the proposed fix in a few minutes, if there's agreement to
>> > what I proposed to do. I'm still uncertain what I proposed is
>> > agreed-upon. So let me reiterate that:
>> >
>> > . if the process on behalf of which we called
>> > server_accept_connection was not locked to some thread, undo what
>> > make_process did to the new process when it called pset_thread
>> > . otherwise, call set_proc_thread to lock the new process to the
>> > same thread as the one which caused this call
>>
>> Whatever you do, please post your change for review rather than just
>> pushing it.
>
> Will do.
>
>> >> I don't want us to spend a lot of time arguing revert-or-no-revert now,
>> >> but could you, Eli, try to decide on a full example of a buggy scenario
>> >> which is really solved by thread locking being?
>> >
>> > I already described that in so many words.
>>
>> Can you send that full description again? Perhaps I can translate it
>> from words into code for you.
>
> Look at my posts as part of discussing bug#79201. For example:
OK, attempting to excerpt the parts where you describe what could go
wrong when a thread is unlocked:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#88
> Wouldn't we want the thread to be able to read the output
> from its process after the wait? Without marking the descriptors with
> the thread, that output could have been read by some other thread
> during the wait, and then our thread will hang forever if it calls
> accept-process-output after the wait ended -- a much more serious
> problem, and in a program that has no bug per se.
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#94
> > > An output from a process could have been read by a thread other than
> > > the one that started the process, thus "starving" the thread that
> > > started the process from reading its output.
> >
> > That has been possible, and common, since threads were originally added.
>
> And caused strange bugs, whereby a thread that started a process would
> hang in accept-process-output because some other thread inadvertently
> read the output from that process.
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#100
> If you study the code in wait_reading_process_output, you will
> immediately see what kind of godawful mess will that cause with
> reading output from several processes started by several threads.
>
> wait_reading_process_output was written under an implicit assumption
> that there's only one thread that handles all the subprocesses.
> Therefore, it consistently checks all of the I/O descriptors for
> available input from the subprocesses, and consumes that as soon as
> it's available. That cannot possibly work reliably, let alone
> predictably, when several threads are involved. This is so obvious
> from reading the code that I'm astonished I need to even explain it.
Note that we need to make sure wait_reading_process_output works
reliably in this case for the sake of processes which aren't locked to
threads. So defaulting to locking processes to threads doesn't actually
help with this.
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#106
> Moreover, IME many weird problems and buggy behavior in Lisp programs
> using threads were caused by the wrong thread inadvertently reading
> output of a process, which then left the thread that waits for that
> process's output stuck waiting forever. If anything, we should make
> process locking to its thread stronger, not weaker, if we want
> relatively simple threaded programs to work reliably.
So, to summarize, it sounds like your issue with not locking processes
to threads is:
Without locking processes to the creating thread, a process's output
can be read by some other thread during the wait, and then our thread
will hang forever if it calls accept-process-output after the wait
ended.
This is unfortunately not detailed enough to translate into code. I've
never seen a real-world program which would be vulnerable to this
problem, as described here. I can make up toy programs which meet this
description which would hang, but I'm not sure they're actually what
you're concerned about. Especially because they aren't anything like
real-world programs.
Could you give some more details about the scenario, so I can try to
write a test demonstrating the problem? Or could you point to a
real-world program which is vulnerable to this problem, so I can turn it
into a test?
This bug report was last modified 7 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.