GNU bug report logs - #79367
31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t

Previous Next

Package: emacs;

Reported by: Zhengyi Fu <i <at> fuzy.me>

Date: Tue, 2 Sep 2025 06:21:01 UTC

Severity: normal

Found in version 31.0.50

Full log


View this message in rfc822 format

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org
Subject: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t
Date: Thu, 4 Sep 2025 04:33:31 +0300
On 03/09/2025 14:53, Eli Zaretskii wrote:

>>>> I don't want us to spend a lot of time arguing revert-or-no-revert now,
>>>> but could you, Eli, try to decide on a full example of a buggy scenario
>>>> which is really solved by thread locking being?
>>>
>>> I already described that in so many words.  That is all I can afford.
>>
>> We need a fuller scenario. If not with code, then a full use case
>> described in English.
> 
> I described some of them, and just now posted a list of my messages in
> bug#79201 where I did that.  That's the best I can afford doing at
> this time, sorry.

Thanks. Will follow the subthread that sprung from this paragraph.

>> I'm pretty sure more of your time is being wasted on answering the
>> emails with objections. Regrettably.
> 
> Yes.  And I still consider this a worthwhile investment, if as result
> we will become on the same page regarding how the relevant code works
> and what should be our expectations from it.

I hope to get there as well.

>> IIUC the existing protection did very little in practice, all of these
>> years: it only stopped some thread calling 'accept-process-output' with
>> an explicit reference of a process belonging to another thread.
> 
> That's not "very little" at all.  And if some Lisp program called
> set-process-thread, it would do exactly what is now done by default.

Right. But we don't know of any that did.

>> Any kind
>> of buggy code would have to try hard to obtain that reference in the
>> first place - so that would almost never happen anyway. The new
>> protections seem a lot more strict.
> 
> Yes, it is stricter.  Because that's the intent, as is clear both from
> the code and from the documentation, and also because it makes Lisp
> programs much easier to reason about.
> 
> I also don't quite see a disaster this change is causing to code which
> disregarded the documented restriction.  All we had till now is a
> single bug report, which clearly indicates that my change was
> incomplete, and is easy to fix.  I fail to understand the amount of
> excitement due to a change that turns out not to be different from
> many other changes we make in development.

Spencer also previously mentioned breakage in some existing code, there 
was an example in another thread (not about native compilation).

> Here's one recent example:
> 
>    https://lists.gnu.org/archive/html/emacs-devel/2025-09/msg00002.html
> 
> This even broke the build, which is clearly a serious regression.  And
> the solution was that Mattias quickly installed two followup changes;
> case closed.  Good job, Mattias!

I'm good with fixing this bug report the expedient way.

>> TBC I think we should continue to support locking, and your current fix
>> seems productive in this regard, but whether locking should be by
>> default is not obvious to me - and at least two smart devs who worked
>> with Emacs Lisp threads say no. That seems like grounds to re-evaluate.
>> Even if we reach the current default we'll have documented our
>> conclusions better rather than just saying "this is so".
> 
> I'm not saying we cannot revisit the decision to lock processes to
> threads.  But given that the original design and implementation
> clearly tell us that was the intent, I think we should start from what
> we have, and only change it if we decide that the current default is
> wrong in too many important cases.  For now all we have is 2 cases:
> one was solved yesterday by unlocking the process, the other is the
> one discussed here, which uncovered a flaw in the change I installed,
> and should be easy to fix.  That's not enough to declare the idea of
> locking wrong as the default, let alone bankrupt.

Certainly.

> For such a
> far-reaching conclusion we need much more data points, of both kinds.
> My knowledge of this code and my experience, based on 12 years of
> developing, maintaining, and fixing bugs in it, is that letting random
> threads read output from subprocesses causes many unexpected behaviors
> and thus triggers subtle bugs and non-determinism in seemingly valid
> Lisp programs.

I have an inkling that even in the single-threaded case 
accept-process-output can behave unpredictably enough that people use it 
in a certain way (in a loop, with low timeout). Which could smooth over 
unpredictabilities across multiple threads as well.

There are some exceptions to that, though.

> You think differently, but please don't dismiss my
> opinions on this too quickly, given my familiarity with the code in
> this area and the amount of time I spent on it.

Just to be clear, I haven't formed an opinion on this myself yet. Just 
hoping for a more thorough discussion. Sorry if it comes across differently.




This bug report was last modified 7 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.