GNU bug report logs - #79334
[PATCH] Don't release thread select lock unnecessarily

Previous Next

Package: emacs;

Reported by: Spencer Baugh <sbaugh <at> janestreet.com>

Date: Thu, 28 Aug 2025 21:10:02 UTC

Severity: normal

Tags: patch

Full log


View this message in rfc822 format

From: Eli Zaretskii <eliz <at> gnu.org>
To: Spencer Baugh <sbaugh <at> janestreet.com>
Cc: 79334 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu, dmitry <at> gutov.dev
Subject: bug#79334: [PATCH] Don't release thread select lock unnecessarily, [PATCH] Defer closing file descriptors used by other threads
Date: Tue, 02 Sep 2025 17:39:39 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com>
> Cc: 79334 <at> debbugs.gnu.org,  eggert <at> cs.ucla.edu,  dmitry <at> gutov.dev
> Date: Tue, 02 Sep 2025 09:38:15 -0400
> 
> >> But if there was an EBADF, then Available is left in an undefined state (see pselect(1)) and we no longer
> >> know what descriptors we originally set waiting_thread on.
> >
> > In that case pselect returns a negative result, no?  And if that
> > happens, we don't examine the descriptors at all, right?  So why is
> > this situation a problem we need to handle with this additional fd_set?
> 
> Because we currently call emacs_abort when we get EBADF:

So, if the danger of getting EBADF due to threads-and-processes
mishaps is real (I'm saying "if" because I don't think we had bug
reports about that), we should change how we treat EBADF in that case.
Which should probably be a separate discussion anyway.  In any case,
the distinct return value of pselect in that case means that none of
the code which handles other return values should be bothered about
this possibility.

> >> Though, I guess we could alternatively just assume that EBADF is always spurious.
> >
> > Assume and do what?
> 
> Follow one of the other two branches in the above snippet of code:
> 
> - Treat EBADF the same way we do EINTR: there were no events on any fds, so
>   we continue running wait_reading_process_output.
> 
> - Treat EBADF the same way we treat other errno values: signal an error.
>   That would at least be better than calling emacs_abort.

Let's discuss this issue separately, okay?  We are already trying to
cover too many different scenarios, which makes this discussion hard
to follow and the probability of losing focus high.

> >> You misread me: I said SIGCHLD is unsafe even without threads.
> >
> > Please describe how.
> 
> 1. Call wait_reading_process_output
> 2. wait_reading_process_output calls compute_input_wait_mask, including
>    SOME-FD in the mask.
> 3. Receive SIGCHLD, calling handle_child_signal
> 4. handle_child_signal calls delete_read_fd for SOME-FD
> 5. Signal handler finishes
> 6. wait_reading_process_output calls select and returns events for
>    SOME-FD, even though SOME-FD has been deleted from fd_callback_info
>    and the process associated with it has died.  It seems possible that
>    this can cause an issue in some code path.
> 
> This couldn't happen before the refactoring which added
> fd_callback_info, because handle_child_signal directly mutated the
> fd_set which is passed to pselect.

Are you talking about systems that don't use the "child signal pipe"?
Because if that pipe is in use, then the SIGCHLD handler writes to it,
and that will cause pselect to return with that descriptor in
Available, and we then handle the death of the process.  In addition,
delete_read_fd just clears the data from the fd_callback_info of the
descriptor, and the loop that examines the bits in Available is AFAICT
careful enough to test these fields of the callback info before doing
anything because the descriptor was returned by pselect as ready.  So
what kind of problems could have happened in step 6 above?

And once again, this is a separate issue, so let's discuss it
separately.  In this bug, let's focus on what happens in status_notify
and in general when we deactivate/delete a process which exited.  Can
we return to that now?




This bug report was last modified 8 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.