GNU bug report logs - #79333
31.0.50; Processes (still) aren't actually locked to threads

Package: emacs;

Reported by: Spencer Baugh <sbaugh <at> janestreet.com>

Date: Thu, 28 Aug 2025 19:46:02 UTC

Severity: normal

Found in version 31.0.50


From: Spencer Baugh <sbaugh <at> janestreet.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, 79333 <at> debbugs.gnu.org
Subject: bug#79333: 31.0.50; Processes (still) aren't actually locked to threads
Date: Fri, 29 Aug 2025 12:06:58 -0400
On Fri, Aug 29, 2025, 11:53 AM Eli Zaretskii <eliz <at> gnu.org> wrote:

> > From: Spencer Baugh <sbaugh <at> janestreet.com>
> > Cc: dmitry <at> gutov.dev,  79333 <at> debbugs.gnu.org
> > Date: Fri, 29 Aug 2025 11:20:32 -0400
> >
> > Eli Zaretskii <eliz <at> gnu.org> writes:
> >
> > > If you are saying that two arbitrary independently-written pieces of
> > > code can get in trouble if they are lumped together to run by the same
> > > Lisp program in two separate threads, then I agree.
> >
> > I guess that's what I'm saying.  But the Lisp program here is just
> > "Emacs".  This combination of two independent pieces of code just
> > automatically happens when a user is using one package which is using
> > timers, and another package which is using threads.  Which of course
> > happens all the time without anyone choosing to do it.
> >
> > For example, one package might add a find-file-hook which starts a
> > subprocess, then another package might add a find-file-hook which starts
> > a thread.  Then when the two hooks run in succession, it would cause
> > this problem.
>
> It's possible that we should have some guidelines for such situations.
> But this is way far in the future, from where I stand: right now,
> taking some processing, which works single-threaded and making it run
> from a separate thread doesn't work well, and we should first make
> sure that's solved.
>

What would the guidelines be?  I don't believe there's any way to fix this
problem other than by unlocking every process you create.
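
As a rough sketch of what "unlocking every process you create" could look
like, assuming the existing `set-process-thread` function (a nil thread
argument removes the lock) and a hypothetical helper name
`demo-start-unlocked` that is not part of Emacs:

```elisp
;;; -*- lexical-binding: t -*-
;; Sketch only: `demo-start-unlocked' is a hypothetical helper,
;; not an Emacs API.
(defun demo-start-unlocked (name command)
  "Start COMMAND (a list of strings) as process NAME, not locked to a thread."
  (let ((proc (make-process :name name
                            :buffer nil
                            :command command)))
    ;; Passing nil as THREAD unlocks the process, so any thread
    ;; is allowed to read its output.
    (set-process-thread proc nil)
    proc))
```

Every package that starts a process would have to do something like this
itself, which is why it does not seem like a workable guideline.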

> > > However, having a function that starts a process, but doesn't process
> > > its output, and another function that doesn't start any processes, but
> > > does accept output from subprocesses, is an unusual thing to do.
> >
> > Ah, I guess you're referring to the explicit accept-process-output call.
> > I think that was a confusing part of my example, because it was not
> > necessary to cause the issue.
> >
> > Here's a more refined example:
> >
> > ;; Package 1 (perhaps run in a find-file-hook)
> > (run-at-time .3 nil #'async-shell-command "sleep 1 && echo foobar && sleep inf")
> > ;; Package 2 (perhaps run in a find-file-hook)
> > (make-thread
> >  (lambda ()
> >    (sit-for 1)
> >    (thread-join (make-thread (lambda () (while t (sit-for 1)))))))
> >
> > The shell command started by package 1 will sometimes hang forever
> > without producing output.
>
> I believe this is because of that issue with status_notify.  At least,
> we should fix that before we revisit the above and see if anything
> else needs to be fixed there.
>

This bug still happens even with my initial fixes for the status_notify
issue.  But sure, we can fix that first and then come back to this one.

(I want to make sure we don't release Emacs 32 with the change that I
believe breaks existing thread programs, but as long as we resolve the
issues before then, I'm in no rush.  I've just reverted the change at my
site anyway.)

> > > What I have in mind is a different case, which I think is much more
> > > common, at least at this stage of using Lisp threads in Emacs.  It's a
> > > case where one takes a single-threaded Lisp program, and runs it from
> > > a separate thread so as to avoid blocking Emacs's main thread.  In
> > > that case, the same thread will both start the process and expect to
> > > be able to process its output (because that's how single-threaded Lisp
> > > programs work), and therefore having the process locked by default
> > > lets such code work as expected when it is run from a thread.
> > > Especially if you take several such programs, each with its own
> > > subprocess, and let them all run from several different threads at the
> > > same time.
> >
> > Yes, I definitely want that real-world case to work right.  I agree that
> > that is a very important case.  But I think it already works right with
> > processes not locked to threads for any non-buggy program.
>
> That's not my experience.  If random threads get return values from
> accept-process-output, you can easily have a thread whose
> accept-process-output call never returns until timeout, because the
> output was already read by another thread.
>

I know you don't have much time to work on this, but it would really help
if you could give a concrete example program that demonstrates this.

> > (I personally have written or used lots of code like that with threads,
> > and the fact that processes were not fully locked to threads did not
> > cause problems.
>
> And locking them does cause problems?
>

Yes.  Such as in the example I was describing above.

> > For example, a program like this will work correctly even in a thread:
> >
> > (let ((proc (make-process ...)))
> >   (accept-process-output proc))
> >
> > This would run the filter functions in the same thread, because output
> > from PROC can't be read by another thread until we do a thread switch,
> > which will only happen when we call accept-process-output.
>
> What matters is which thread gets first to the pselect call.  That's
> unpredictable, because it's racy.
>

It is not racy in this example.  Even without locking.

> > If the program was instead something like:
> >
> > (let ((proc (make-process ...)))
> >   (sit-for 1)
> >   (accept-process-output proc))
> >
> > then the (accept-process-output proc) might block because the sit-for
> > can thread switch.  But this program is already buggy, since sit-for
> > runs wait_reading_process_output which could read the output from PROC.
>
> A program can easily call sit-for indirectly, because sit-for is
> called all over the place in Emacs.
>

That's my point.  This second example program is buggy whether threads are
used or not.
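
For comparison, here is a sketch of a version that does not depend on which
call happens to read the output: the output is collected in a filter, and
the code waits for the sentinel rather than for a single
accept-process-output call.  The name `demo-run-and-collect` is
hypothetical, and this is only an illustration of the pattern, not a fix
for the hang described in this bug:

```elisp
;;; -*- lexical-binding: t -*-
;; Sketch only: `demo-run-and-collect' is a hypothetical helper,
;; not an Emacs API.
(defun demo-run-and-collect (command)
  "Run COMMAND (a list of strings) and return its output as a string."
  (let* ((chunks nil)
         (done nil)
         (proc (make-process
                :name "demo-collect"
                :command command
                :connection-type 'pipe
                ;; The filter records output no matter which call
                ;; (sit-for, accept-process-output, ...) reads it.
                :filter (lambda (_proc string) (push string chunks))
                ;; The sentinel marks completion once the process has
                ;; stopped running.
                :sentinel (lambda (proc _event)
                            (unless (process-live-p proc)
                              (setq done t))))))
    ;; This may read some or all of the output; that is fine here,
    ;; because the filter has already stored it.
    (sit-for 1)
    (while (not done)
      (accept-process-output proc 0.1))
    (apply #'concat (nreverse chunks))))
```

For example, (demo-run-and-collect '("echo" "foobar")) waits for the
process to finish and returns whatever the filter collected.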


This bug report was last modified 9 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.