GNU bug report logs -
#79334
[PATCH] Don't release thread select lock unnecessarily
Previous Next
Full log
Message #44 received at 79334 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Spencer Baugh <sbaugh <at> janestreet.com> writes:
> Spencer Baugh <sbaugh <at> janestreet.com> writes:
>> While doing this, I had another thought...
>>
>> Do you know if we have any code which handles the fact that
>> delete-process might be called on a process while another thread is
>> calling pselect on its fds?
>>
>> I don't think we do, which suggests that might also be problematic.
>> I'll work on a fix which encompasses that as well.
>
> OK, I think this patch fixes the issue.
>
> Many different pieces of code in Emacs can close file descriptors. We
> need to be generically careful to not close file descriptors which some
> other thread is currently waiting on. In my patch, we do this by
> letting the waiting_thread itself close those file descriptors, after it
> returns from select.
>
> Unfortunately I don't have a reliable test which produces the failure
> case, but in this patch I added a fprintf when we close a deferred fd.
> This prints whenever some thread would have closed an fd that another
> thread was waiting on. If I run the following program (with emacs -Q
> --batch):
>
> (defun my-break-thread ()
> (let ((proc (make-process
> :name "foo"
> :command '("sleep" "1"))))
> (while (process-live-p proc)
> (accept-process-output proc 0.05))))
> (while t
> (make-thread #'my-break-thread "thread1")
> (thread-join (make-thread #'my-break-thread "thread2")))
>
> I get about one print every 10 seconds. Each print is a problem case
> which has been averted, so I do think this patch fixes the bug.
>
> (Obviously, we should remove the fprintf before applying the change)
Of course, I immediately found a bug in my patch (I wasn't clearing
waiting_thread after closing a deferred fd). Fixed in the attached.
In related news: running process-tests.el in a loop also triggers my
"Closing deferred fd" message. Which suggests that src/process-tests.el
was probably also suffering from bugs in this area, which could have
been causing hangs or flaky failures. Which should now hopefully be
fixed.
[0001-Defer-closing-file-descriptors-used-by-other-threads.patch (text/x-patch, inline)]
From 9e46aa50f5de7d719a6e427fe44bb6b8b8e226fc Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh <at> janestreet.com>
Date: Fri, 29 Aug 2025 15:39:00 -0400
Subject: [PATCH] Defer closing file descriptors used by other threads
* src/process.c (other_thread_is_waiting_for): Add.
(clear_fd_callback_data): Don't clear WAITING_FOR_CLOSE_FD.
(close_process_fd): Set WAITING_FOR_CLOSE_FD for fds with a
waiting_thread.
(wait_reading_process_output): Close fds with
WAITING_FOR_CLOSE_FD. (bug#79334)
---
src/process.c | 46 ++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 42 insertions(+), 4 deletions(-)
diff --git a/src/process.c b/src/process.c
index b33bebb9a35..9180b92c61f 100644
--- a/src/process.c
+++ b/src/process.c
@@ -448,7 +448,9 @@ make_lisp_proc (struct Lisp_Process *p)
/* This descriptor refers to a process. */
PROCESS_FD = 8,
/* A non-blocking connect. Only valid if FOR_WRITE is set. */
- NON_BLOCKING_CONNECT_FD = 16
+ NON_BLOCKING_CONNECT_FD = 16,
+ /* This fd is waiting to be closed. */
+ WAITING_FOR_CLOSE_FD = 32,
};
static struct fd_callback_data
@@ -466,14 +468,30 @@ make_lisp_proc (struct Lisp_Process *p)
struct thread_state *waiting_thread;
} fd_callback_info[FD_SETSIZE];
+static bool
+other_thread_is_waiting_for (struct fd_callback_data* elem)
+{
+ return elem->waiting_thread != NULL && elem->waiting_thread != current_thread;
+}
+
static void
clear_fd_callback_data(struct fd_callback_data* elem)
{
elem->func = NULL;
elem->data = NULL;
- elem->flags = 0;
elem->thread = NULL;
- elem->waiting_thread = NULL;
+ if (other_thread_is_waiting_for (elem))
+ {
+ /* Preserve WAITING_FOR_CLOSE_FD; see process_close_fd. Leaving
+ flags non-zero means recompute_max_desc won't reduce max_desc
+ past this element. */
+ elem->flags &= WAITING_FOR_CLOSE_FD;
+ }
+ else
+ {
+ elem->flags = 0;
+ elem->waiting_thread = NULL;
+ }
}
@@ -2113,7 +2131,11 @@ close_process_fd (int *fd_addr)
if (0 <= fd)
{
*fd_addr = -1;
- emacs_close (fd);
+ if (other_thread_is_waiting_for (&fd_callback_info[fd]))
+ /* Let waiting_thread handle closing the fd. */
+ fd_callback_info[fd].flags = WAITING_FOR_CLOSE_FD;
+ else
+ emacs_close (fd);
}
}
@@ -5816,6 +5838,22 @@ wait_reading_process_output (intmax_t time_limit, int nsecs, int read_kbd,
/* Make C-g and alarm signals set flags again. */
clear_waiting_for_input ();
+ /* Close fds which other threads wanted to close. */
+ for (int fd = 0; fd <= max_desc; ++fd)
+ {
+ if (fd_callback_info[fd].waiting_thread == current_thread
+ && (fd_callback_info[fd].flags == WAITING_FOR_CLOSE_FD))
+ {
+ fprintf (stderr, "closing deferred fd %d\n", fd);
+ emacs_close (fd);
+ /* Ignore any events which happened on this fd. */
+ FD_CLR (fd, &Available);
+ FD_CLR (fd, &Writeok);
+ clear_fd_callback_data (&fd_callback_info[fd]);
+ }
+ }
+ recompute_max_desc ();
+
/* If we woke up due to SIGWINCH, actually change size now. */
do_pending_window_change (0);
--
2.43.7
This bug report was last modified 8 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.