GNU bug report logs - #79333
31.0.50; Processes (still) aren't actually locked to threads


Package: emacs;

Reported by: Spencer Baugh <sbaugh <at> janestreet.com>

Date: Thu, 28 Aug 2025 19:46:02 UTC

Severity: normal

Found in version 31.0.50



Message #29 received at 79333 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Spencer Baugh <sbaugh <at> janestreet.com>
Cc: dmitry <at> gutov.dev, 79333 <at> debbugs.gnu.org
Subject: Re: bug#79333: 31.0.50; Processes (still) aren't actually locked to threads
Date: Fri, 29 Aug 2025 18:53:49 +0300

> From: Spencer Baugh <sbaugh <at> janestreet.com>
> Cc: dmitry <at> gutov.dev,  79333 <at> debbugs.gnu.org
> Date: Fri, 29 Aug 2025 11:20:32 -0400
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > If you are saying that two arbitrary independently-written pieces of
> > code can get in trouble if they are lumped together to run by the same
> > Lisp program in two separate threads, then I agree.
> 
> I guess that's what I'm saying.  But the Lisp program here is just
> "Emacs".  This combination of two independent pieces of code just
> automatically happens when a user is using one package which is using
> timers, and another package which is using threads.  Which of course
> happens all the time without anyone choosing to do it.
> 
> For example, one package might add a find-file-hook which starts a
> subprocess, then another package might add a find-file-hook which starts
> a thread.  Then when the two hooks run in succession, it would cause
> this problem.

It's possible that we should have some guidelines for such situations.
But that is far in the future from where I stand: right now, taking
some processing that works single-threaded and running it from a
separate thread doesn't work well, and we should first make sure
that's solved.

> > However, having a function that starts a process, but doesn't process
> > its output, and another function that doesn't start any processes, but
> > does accept output from subprocesses, is an unusual thing to do.
> 
> Ah, I guess you're referring to the explicit accept-process-output call.
> I think that was a confusing part of my example, because it was not
> necessary to cause the issue.
> 
> Here's a more refined example:
> 
> ;; Package 1 (perhaps run in a find-file-hook)
> (run-at-time .3 nil #'async-shell-command "sleep 1 && echo foobar && sleep inf")
> ;; Package 2 (perhaps run in a find-file-hook)
> (make-thread
>  (lambda ()
>    (sit-for 1)
>    (thread-join (make-thread (lambda () (while t (sit-for 1)))))))
> 
> The shell command started by package 1 will sometimes hang forever
> without producing output.

I believe this is because of that issue with status_notify.  At least,
we should fix that before we revisit the above and see if anything
else needs to be fixed there.

> > What I have in mind is a different case, which I think is much more
> > common, at least at this stage of using Lisp threads in Emacs.  It's a
> > case where one takes a single-threaded Lisp program, and runs it from
> > a separate thread so as to avoid blocking Emacs's main thread.  In
> > that case, the same thread will both start the process and expect to
> > be able to process its output (because that's how single-threaded Lisp
> > programs work), and therefore having the process locked by default
> > lets such code work as expected when it is run from a thread.
> > Especially if you take several such programs, each with its own
> > subprocess, and let them all run from several different threads at the
> > same time.
> 
> Yes, I definitely want that real-world case to work right.  I agree that
> that is a very important case.  But I think it already works right with
> processes not locked to threads for any non-buggy program.

That's not my experience.  If random threads get return values from
accept-process-output, you can easily have a thread whose
accept-process-output call does not return until it times out,
because the output was already read by another thread.
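
To make the scenario concrete, here is a minimal sketch of a
single-threaded program that starts a process and reads its output
from the same thread, with the lock stated explicitly.  The `my-'
names are hypothetical and are not part of Emacs; the sketch assumes
that process locking behaves as documented for `set-process-thread',
which is the very thing under discussion in this report.

```elisp
;; Sketch only: the `my-' names are hypothetical, not part of Emacs.
(defun my-filter-collect (proc string)
  "Process filter: append STRING to PROC's `my-output' property."
  (process-put proc 'my-output
               (concat (process-get proc 'my-output) string)))

(defun my-run-locked (command)
  "Run shell COMMAND from the current thread and return its output.
The process is explicitly locked to the calling thread first."
  (let ((proc (make-process :name "my-locked"
                            :command (list "sh" "-c" command)
                            :connection-type 'pipe
                            :noquery t
                            :filter #'my-filter-collect
                            :sentinel #'ignore)))
    ;; Processes are documented as locked to their creating thread by
    ;; default; setting the lock again only states the intent.
    (set-process-thread proc (current-thread))
    ;; Poll with a short timeout until the process has exited.
    (while (process-live-p proc)
      (accept-process-output proc 0.1))
    (or (process-get proc 'my-output) "")))

;; Hypothetical usage: run the same program from a separate thread.
;; (thread-join (make-thread (lambda () (my-run-locked "echo foobar"))))
```

Keeping the output on the process's property list, rather than in a
closure, means the sketch does not depend on lexical-binding being
enabled.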

> (I personally have written or used lots of code like that with threads,
> and the fact that processes were not fully locked to threads did not
> cause problems.

And locking them does cause problems?

> For example, a program like this will work correctly even in a thread:
> 
> (let ((proc (make-process ...)))
>   (accept-process-output proc))
> 
> This would run the filter functions in the same thread, because output
> from PROC can't be read by another thread until we do a thread switch,
> which will only happen when we call accept-process-output.

What matters is which thread gets to the pselect call first.  That's
unpredictable, because it's racy.

> If the program was instead something like:
> 
> (let ((proc (make-process ...)))
>   (sit-for 1)
>   (accept-process-output proc))
> 
> then the (accept-process-output proc) might block because the sit-for
> can thread switch.  But this program is already buggy, since sit-for
> runs wait_reading_process_output which could read the output from PROC.

A program can easily call sit-for indirectly, because sit-for is
called all over the place in Emacs.
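
For the single-thread half of this point, a minimal sketch: since
any wait may read the output, a program can test state recorded by
its filter instead of relying on the return value of one particular
accept-process-output call.  The `my-' names are hypothetical, and
this does nothing about another thread winning the race, which is
what locking is for.

```elisp
;; Sketch only: the `my-' names are hypothetical, not part of Emacs.
(defun my-accumulate (proc string)
  "Process filter: append STRING to PROC's `my-seen' property."
  (process-put proc 'my-seen
               (concat (process-get proc 'my-seen) string)))

(defun my-seen-p (proc regexp)
  "Return non-nil if the output recorded for PROC matches REGEXP."
  (and (string-match-p regexp (or (process-get proc 'my-seen) "")) t))

(defun my-wait-for (proc regexp timeout)
  "Wait up to TIMEOUT seconds for PROC's recorded output to match REGEXP.
Return non-nil on a match, regardless of which call read the output."
  (let ((deadline (+ (float-time) timeout)))
    (while (and (not (my-seen-p proc regexp))
                (process-live-p proc)
                (< (float-time) deadline))
      (accept-process-output proc 0.1))
    (my-seen-p proc regexp)))
```

With `my-accumulate' installed as the filter, an intervening sit-for
that consumes the output still leaves it recorded, so a later
`my-wait-for' returns without blocking.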

This is why locking processes is better: it makes the program more
predictable.




This bug report was last modified 9 days ago.
