GNU bug report logs - #79334
[PATCH] Don't release thread select lock unnecessarily
Message #89 received at 79334 <at> debbugs.gnu.org:
> From: Spencer Baugh <sbaugh <at> janestreet.com>
> Cc: 79334 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu, dmitry <at> gutov.dev
> Date: Tue, 02 Sep 2025 10:59:39 -0400
>
> Eli Zaretskii <eliz <at> gnu.org> writes:
>
> >> Because we currently call emacs_abort when we get EBADF:
> >
> > So, if the danger of getting EBADF due to threads-and-processes
> > mishaps is real (I'm saying "if" because I don't think we had bug
> > reports about that),
>
> It is real. I have seen it plenty at my site. I described this issue
> in my initial email. It is the issue I care about solving. This bug is
> my report of the issue. Solving it is my highest priority.
OK, so let's see if fixing deletion of processes fixes that problem.
If you see that in your usage, it should be relatively easy to test.
Then we can take it from there.
> > Let's discuss this issue separately, okay? We are already trying to
> > cover too many different scenarios, which makes this discussion hard
> > to follow and the probability of losing focus high.
>
> It's not a separate issue, it's the entire point of this bug. So yes,
> let's stay focused on this issue and not other issues.
>
> Which of these two options do you think we should go with?
I would like to fix the process deletion and closing of the descriptors
first, because that can definitely be one reason for EBADF. We can
then discuss additional measures. From where I stand, it is best to
completely eliminate EBADF due to these problems; if we succeed, the
code which deals with EBADF could maybe be left as it is, since this
should then be some spurious problem with no known reasons. If it
turns out that we cannot eliminate EBADF completely by fixing the
process-deactivation code, then yes, we will need to do something when
we get EBADF as result.
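
To make the failure mode concrete, here is a minimal standalone sketch (not Emacs code) of how waiting on a descriptor that was closed while it is still in the wait mask produces EBADF. The helper names poll_one_fd, select_on_stale_fd and select_on_live_fd are hypothetical, and plain select with a zero timeout stands in for the real pselect call.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait on a single descriptor with a zero timeout.
   Returns 0 if select succeeded, or the errno value if it failed.  */
static int
poll_one_fd (int fd)
{
  fd_set readfds;
  struct timeval timeout = { 0, 0 };

  FD_ZERO (&readfds);
  FD_SET (fd, &readfds);
  if (select (fd + 1, &readfds, NULL, NULL, &timeout) < 0)
    return errno;
  return 0;
}

/* Close a descriptor but keep waiting on it, as if the wait mask had
   not been updated when the process was deleted.  Returns the errno
   reported by select, or -1 if the pipe could not be created.  */
static int
select_on_stale_fd (void)
{
  int fds[2];
  int err;

  if (pipe (fds) != 0)
    return -1;
  close (fds[0]);               /* Closed, but still in the wait mask.  */
  err = poll_one_fd (fds[0]);
  close (fds[1]);
  return err;
}

/* Same wait on a descriptor that is still open, for comparison.  */
static int
select_on_live_fd (void)
{
  int fds[2];
  int err;

  if (pipe (fds) != 0)
    return -1;
  err = poll_one_fd (fds[0]);
  close (fds[0]);
  close (fds[1]);
  return err;
}
```

In this sketch select_on_stale_fd reports EBADF while select_on_live_fd reports 0, which is why removing the descriptor from every wait mask before closing it should eliminate this particular source of the error.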
So what happened to the changes in the code that handles deactivation
of a process? Can we return to that and arrive at some agreed-upon
solution? AFAIR, last time I asked you why not fix these problems
inside status_notify (and the few other places which use
FOR_EACH_PROCESS).
> > Because if that pipe is in use, then the SIGCHLD handler writes to it,
> > and that will cause pselect to return with that descriptor in
> > Available, and we then handle the death of the process. In addition,
> > delete_read_fd just clears the data from the fd_callback_info of the
> > descriptor, and the loop that examines the bits in Available is AFAICT
> > careful enough to test these fields of the callback info before doing
> > anything because the descriptor was returned by pselect as ready. So
> > what kind of problems could have happened in step 6 above?
>
> I'm not sure.
That's my reading of the code. Of course, I could be wrong, but then
we need a description of the scenario details which causes such
problems, including the control and data flow which make it possible.
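
For reference, the mechanism described in the quoted paragraph can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the actual Emacs implementation: child_pipe, on_sigchld, install_child_wakeup and child_wakeup_pending are hypothetical names, and the real code keeps per-descriptor callback info that this sketch omits.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical self-pipe: [0] is the read end, [1] the write end.  */
static int child_pipe[2] = { -1, -1 };

/* SIGCHLD handler: write one byte so that a thread blocked in pselect
   wakes up.  write is async-signal-safe; errno is preserved.  */
static void
on_sigchld (int sig)
{
  int saved_errno = errno;
  char byte = 0;
  ssize_t ignored = write (child_pipe[1], &byte, 1);

  (void) sig;
  (void) ignored;
  errno = saved_errno;
}

/* Create the pipe and install the handler.  Returns 0 on success.  */
static int
install_child_wakeup (void)
{
  struct sigaction sa;

  if (pipe (child_pipe) != 0)
    return -1;
  memset (&sa, 0, sizeof sa);
  sa.sa_handler = on_sigchld;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  return sigaction (SIGCHLD, &sa, NULL);
}

/* Poll the pipe with a zero timeout.  Returns 1 if a wakeup byte was
   pending (and consumes it), 0 if not, -1 on error.  The FD_ISSET test
   mirrors checking the descriptor before acting on it.  */
static int
child_wakeup_pending (void)
{
  fd_set available;
  struct timespec timeout = { 0, 0 };
  char byte;
  int n;

  FD_ZERO (&available);
  FD_SET (child_pipe[0], &available);
  n = pselect (child_pipe[0] + 1, &available, NULL, NULL, &timeout, NULL);
  if (n <= 0)
    return n;
  if (!FD_ISSET (child_pipe[0], &available))
    return 0;
  if (read (child_pipe[0], &byte, 1) != 1)
    return -1;
  return 1;
}
```

After install_child_wakeup succeeds, a delivered SIGCHLD makes the next child_wakeup_pending call return 1, at which point the caller would go on to handle the death of the process. Any scenario that breaks this needs to show where the descriptor is closed or reused between the handler's write and the check.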
> > And once again, this is a separate issue, so let's discuss it
> > separately.
>
> Okay. Let me just say again that I suspect there are bugs here which
> can be triggered by both threads and signal handlers. We can leave it
> at that.
Yes, I agree that signals add a non-trivial complexity to this stuff,
due to their asynchronous nature and their ability to run code on the
"wrong" thread.