GNU bug report logs -
#79201
30.1.90; set-process-thread can permanently break fd_callback_info slots
Previous Next
Full log
View this message in rfc822 format
Eli Zaretskii <eliz <at> gnu.org> writes:
>> Cc: 79201 <at> debbugs.gnu.org, dmitry <at> gutov.dev, app-emacs-dev <at> janestreet.com
>> Date: Fri, 08 Aug 2025 16:34:10 -0400
>> From: Spencer Baugh via "Bug reports for GNU Emacs,
>> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
>> Here's a patch to fix it by initializing these fields.
>
> Thanks, but I don't think this is right. First,
> update_processes_for_thread_death clears the .thread member of
> fd_callback_info of those processes that were locked to the dying
> thread. Second, wait_reading_process_output_unwind clears the
> .waiting_thread member of fd_callback_info for all descriptors, when
> we exit wait_reading_process_output (which is called by
> accept-process-output).
Yes.
> And finally, IMO it is wrong for the add_*_fd functions to overwrite
> these two members, because they are set and reset by other functions,
> mainly those which compute the wait mask for the pselect call.
> add_*_fd should not touch these members.
I think we should at least add an eassert in those functions which
checks that .waiting_thread is null. It is never valid for that to be
non-null in add_*_fd.
> I believe in this case update_processes_for_thread_death doesn't do
> its job because your scenario kills the process before the thread
> dies, and update_processes_for_thread_death only looks at the
> processes that are in Vprocess_alist. So it is possible that all
> that's missing is that delete_read_fd and delete_write_fd (which are
> called from deactivate_process) should clear the .thread member of
> fd_callback_info corresponding to the process's input and output
> descriptors.
>
> If that doesn't fix the issue with your reproducer, please investigate
> why, and let's take it from there.
Yes, that does fix the issue with my reproducer.
But I think it may cause other issues. delete_read_fd and
delete_write_fd are called from many other places. For example,
delete_write_fd is called before calling add_process_read_fd on (I
think?) the same fd in wait_reading_process_output. Clearing the
.thread there would break the guarantees of set-process-thread.
Further complicating the issue: I think there's deeper issues with
set-process-thread. There's also incorrect information in the manual
about it:
processes.texi:
> by default a process is locked to the thread that created it. When a
> process is locked to a thread, output from the process can only be
> accepted by that thread.
...
> @defun set-process-thread process thread
> Set the locking thread of @var{process} to @var{thread}. @var{thread}
> may be @code{nil}, in which case the process is unlocked.
> @end defun
This is not true. By default .thread is NULL, i.e., processes are not
locked to any thread.
I think this behavior was just never implemented. But, it doesn't seem
to have been needed, since no-one has complained about the lack of it in
the last 13 years since threads were released.
I did a quick survey of 300 ELPA packages and there's no use of
set-process-thread in any of them. And set-process-thread is not useful
in practice if processes don't start out locked to a particular thread,
since trying to lock a process to a thread if it's already unlocked is
inherently racy.
So I think there are two good options. Either:
A. Make the code match the manual by locking processes to threads by
default, and somehow fix the issues this causes with breaking
fd_callback_info slots.
Or:
B. Deprecate set-process-thread and make it a no-op. Simply delete this
incorrect section of the manual.
My preference is for B, since this also reduces complexity and will
slightly improve performance.
This bug report was last modified today.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.