Package: emacs;
Reported by: Zhengyi Fu <i <at> fuzy.me>
Date: Tue, 2 Sep 2025 06:21:01 UTC
Severity: normal
Found in version 31.0.50
To reply to this bug, email your comments to 79367 AT debbugs.gnu.org.
Toggle the display of automated, internal messages from the tracker.
View this report as an mbox folder, status mbox, maintainer mbox
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 06:21:02 GMT) Full text and rfc822 format available.Zhengyi Fu <i <at> fuzy.me>
:bug-gnu-emacs <at> gnu.org
.
(Tue, 02 Sep 2025 06:21:02 GMT) Full text and rfc822 format available.Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Zhengyi Fu <i <at> fuzy.me> To: bug-gnu-emacs <at> gnu.org Subject: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 14:18:58 +0800
Hi Emacs maintainers, I have been suffering from this bug for a few weeks. I spent a few days finding minimal steps to reproduce the bug: - Install diff-hl and magit - Set `diff-hl-update-async' to t and enable global-diff-hl-mode - Enable server-mode - Open a file in a git repo, make some changes and save it - Run `magit-stage-buffer-file' to stage the changes - Run `revert-buffer' in the file buffer, and make sure the diff-hl fringes are displayed - Run `magit-commit-create' At this moment, a COMMIT_EDITMSG buffer should appear. But actually sometimes no buffer appears. If you run `list-processes', you can see that the git process and the server connection are all there, as shown below. git 542014 run magit-process: emacs_test /dev/pts/1 Main git --no-pager --literal-pathspecs -c core.preloadindex=true -c log.showSignature=false -c color.ui=false -c color.diff=false -c diff.noPrefix=false commit -- server -- listen -- -- Main (network server on /run/user/150425139/emacs/server) server <2> -- open -- -- Main (network connection to t:/run/user/150425139/emacs/server) The commit message buffer will pop up, if you eval (set-process-thread (get-process "server <2>") nil) Further debugging through gdb revealed more details: the `thread' member of the fd_callback_info entry for the server connection references the diff-hl-async thread, which is already exited. Simply adding a `set_proc_thread' in `server_accept_connections' like the following seems to resolve the issue, but I'm not sure if this is the correct fix. diff --git a/src/process.c b/src/process.c index d6efac5479d..326a36716d5 100644 --- a/src/process.c +++ b/src/process.c @@ -5115,6 +5115,8 @@ server_accept_connection (Lisp_Object server, int channel) /* Client processes for accepted connections are not stopped initially. */ if (!EQ (p->filter, Qt)) add_process_read_fd (s); + set_proc_thread (p, current_thread); + if (s > max_desc) max_desc = s; In GNU Emacs 31.0.50 (build 6, x86_64-pc-linux-gnu, GTK+ Version 3.24.41) of 2025-09-02 built on james-fu-ubuntu2404 Repository revision: b953dc679c53d8ae26770762bcb2601389146768 Repository branch: heads/master Windowing system distributor 'Microsoft Corporation', version 11.0.12010000 System Description: Ubuntu 24.04.3 LTS Configured using: 'configure --without-all --with-threads --enable-checking 'CFLAGS=-O1 -ggdb3 -pipe'' Configured features: GLIB GMP PDUMPER SECCOMP THREADS X11 XIM XINERAMA XRANDR GTK3 Important settings: value of $LANG: en_US.UTF-8 locale-coding-system: utf-8-unix Major mode: Fundamental Minor modes in effect: global-git-commit-mode: t magit-auto-revert-mode: t auto-revert-mode: t server-mode: t global-diff-hl-mode: t diff-hl-mode: t straight-use-package-mode: t straight-package-neutering-mode: t tooltip-mode: t global-eldoc-mode: t eldoc-mode: t show-paren-mode: t electric-indent-mode: t mouse-wheel-mode: t tool-bar-mode: t menu-bar-mode: t file-name-shadow-mode: t global-font-lock-mode: t font-lock-mode: t blink-cursor-mode: t minibuffer-regexp-mode: t line-number-mode: t indent-tabs-mode: t transient-mark-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t Load-path shadows: /var/lib/domain_home/auth.hpicorp.net/fujames/.config/emacs/straight/build/transient/transient hides /var/lib/domain_home/auth.hpicorp.net/fujames/src/emacs_test/lisp/transient /var/lib/domain_home/auth.hpicorp.net/fujames/.config/emacs/straight/build/seq/seq hides /var/lib/domain_home/auth.hpicorp.net/fujames/src/emacs_test/lisp/emacs-lisp/seq /var/lib/domain_home/auth.hpicorp.net/fujames/.config/emacs/straight/build/compat/compat hides /var/lib/domain_home/auth.hpicorp.net/fujames/src/emacs_test/lisp/emacs-lisp/compat Features: (shadow sort mail-extr emacsbug lisp-mnt bug-reference thingatpt help-fns radix-tree cl-print debug backtrace find-func display-line-numbers face-remap magit-submodule magit-blame magit-stash magit-reflog magit-bisect magit-push magit-pull magit-fetch magit-clone magit-remote magit-commit magit-sequence magit-notes magit-worktree magit-tag magit-merge magit-branch magit-reset magit-files magit-refs magit-status magit package url-handlers magit-repos magit-apply magit-wip magit-log which-func imenu magit-diff smerge-mode diff git-commit magit-core magit-autorevert autorevert filenotify magit-margin magit-transient magit-process with-editor shell pcomplete comint ansi-osc ansi-color magit-mode transient pp edmacro kmacro browse-url xdg url url-proxy url-privacy url-expand url-methods url-history url-cookie generate-lisp-file url-domsuf url-util url-parse auth-source icons json map url-vars benchmark magit-git magit-base magit-section format-spec cursor-sensor crm llama eieio byte-opt eieio-core cond-let compat server vc-git files-x diff-hl log-view log-edit message sendmail mailcap yank-media puny dired dired-loaddefs rfc822 mml mml-sec password-cache epa derived epg rfc6068 epg-config gnus-util text-property-search time-date mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045 mm-util ietf-drums mail-prsvr mailabbrev mail-utils gmm-utils mailheader ring add-log pcvs-util vc-dir ewoc vc vc-dispatcher diff-mode track-changes easy-mmode magit-autoloads pcase with-editor-autoloads transient-autoloads magit-section-autoloads llama-autoloads cond-let-autoloads compat-autoloads info seq-autoloads diff-hl-autoloads finder-inf straight-autoloads cl-seq cl-extra help-mode straight subr-x cl-macs gv cl-loaddefs cl-lib bytecomp byte-compile rmc iso-transl tooltip cconv eldoc paren electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel term/x-win x-win term/common-win x-dnd touch-screen tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic indonesian philippine cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite emoji-zwj charscript charprop case-table epa-hook jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs theme-loaddefs faces cus-face macroexp files window text-properties overlay sha1 md5 base64 format env code-pages mule custom widget keymap hashtable-print-readable backquote threads dynamic-setting gtk x-toolkit x multi-tty move-toolbar make-network-process tty-child-frames emacs) Memory information: ((conses 16 161628 25408) (symbols 48 16281 0) (strings 32 52533 3112) (string-bytes 1 1564345) (vectors 16 32763) (vector-slots 8 308800 12722) (floats 8 75 448) (intervals 56 559 0) (buffers 984 21))
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 12:34:03 GMT) Full text and rfc822 format available.Message #8 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Zhengyi Fu <i <at> fuzy.me>, Spencer Baugh <sbaugh <at> janestreet.com>, Dmitry Gutov <dmitry <at> gutov.dev> Cc: 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 15:32:57 +0300
> From: Zhengyi Fu <i <at> fuzy.me> > Date: Tue, 02 Sep 2025 14:18:58 +0800 > > Hi Emacs maintainers, > > I have been suffering from this bug for a few weeks. > > I spent a few days finding minimal steps to reproduce the bug: > > - Install diff-hl and magit > - Set `diff-hl-update-async' to t and enable global-diff-hl-mode > - Enable server-mode > > - Open a file in a git repo, make some changes and save it > - Run `magit-stage-buffer-file' to stage the changes > - Run `revert-buffer' in the file buffer, and make sure the diff-hl > fringes are displayed > - Run `magit-commit-create' > > At this moment, a COMMIT_EDITMSG buffer should appear. But actually > sometimes no buffer appears. > > If you run `list-processes', you can see that the git process and the > server connection are all there, as shown below. > > git 542014 run magit-process: emacs_test /dev/pts/1 Main git --no-pager --literal-pathspecs -c core.preloadindex=true -c log.showSignature=false -c color.ui=false -c color.diff=false -c diff.noPrefix=false commit -- > server -- listen -- -- Main (network server on /run/user/150425139/emacs/server) > server <2> -- open -- -- Main (network connection to t:/run/user/150425139/emacs/server) > > > The commit message buffer will pop up, if you eval > (set-process-thread (get-process "server <2>") nil) > > Further debugging through gdb revealed more details: the `thread' > member of the fd_callback_info entry for the server connection > references the diff-hl-async thread, which is already exited. > > Simply adding a `set_proc_thread' in `server_accept_connections' like > the following seems to resolve the issue, but I'm not sure if this is > the correct fix. > > diff --git a/src/process.c b/src/process.c > index d6efac5479d..326a36716d5 100644 > --- a/src/process.c > +++ b/src/process.c > @@ -5115,6 +5115,8 @@ server_accept_connection (Lisp_Object server, int channel) > /* Client processes for accepted connections are not stopped initially. */ > if (!EQ (p->filter, Qt)) > add_process_read_fd (s); > + set_proc_thread (p, current_thread); > + > if (s > max_desc) > max_desc = s; Thanks. I believe you are right, since server_accept_connection calls make_process to create a new process object, which should by default be locked to the thread which created it. Spencer and Dmitry, do you agree?
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 13:09:02 GMT) Full text and rfc822 format available.Message #11 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Spencer Baugh <sbaugh <at> janestreet.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: Zhengyi Fu <i <at> fuzy.me>, Dmitry Gutov <dmitry <at> gutov.dev>, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 09:08:14 -0400
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Zhengyi Fu <i <at> fuzy.me> >> Date: Tue, 02 Sep 2025 14:18:58 +0800 >> >> Hi Emacs maintainers, >> >> I have been suffering from this bug for a few weeks. >> >> I spent a few days finding minimal steps to reproduce the bug: >> >> - Install diff-hl and magit >> - Set `diff-hl-update-async' to t and enable global-diff-hl-mode >> - Enable server-mode >> >> - Open a file in a git repo, make some changes and save it >> - Run `magit-stage-buffer-file' to stage the changes >> - Run `revert-buffer' in the file buffer, and make sure the diff-hl >> fringes are displayed >> - Run `magit-commit-create' >> >> At this moment, a COMMIT_EDITMSG buffer should appear. But actually >> sometimes no buffer appears. >> >> If you run `list-processes', you can see that the git process and the >> server connection are all there, as shown below. >> >> git 542014 run magit-process: emacs_test /dev/pts/1 Main git --no-pager --literal-pathspecs -c core.preloadindex=true -c log.showSignature=false -c color.ui=false -c color.diff=false -c diff.noPrefix=false commit -- >> server -- listen -- -- Main (network server on /run/user/150425139/emacs/server) >> server <2> -- open -- -- Main (network connection to t:/run/user/150425139/emacs/server) >> >> >> The commit message buffer will pop up, if you eval >> (set-process-thread (get-process "server <2>") nil) >> >> Further debugging through gdb revealed more details: the `thread' >> member of the fd_callback_info entry for the server connection >> references the diff-hl-async thread, which is already exited. >> >> Simply adding a `set_proc_thread' in `server_accept_connections' like >> the following seems to resolve the issue, but I'm not sure if this is >> the correct fix. >> >> diff --git a/src/process.c b/src/process.c >> index d6efac5479d..326a36716d5 100644 >> --- a/src/process.c >> +++ b/src/process.c >> @@ -5115,6 +5115,8 @@ server_accept_connection (Lisp_Object server, int channel) >> /* Client processes for accepted connections are not stopped initially. */ >> if (!EQ (p->filter, Qt)) >> add_process_read_fd (s); >> + set_proc_thread (p, current_thread); >> + >> if (s > max_desc) >> max_desc = s; > > Thanks. > > I believe you are right, since server_accept_connection calls > make_process to create a new process object, which should by default > be locked to the thread which created it. > > Spencer and Dmitry, do you agree? No. At worst, the locking of the new connection should match that of the existing process. The new connection should not randomly be locked to the thread which happened to run wait_reading_process_output. >> Further debugging through gdb revealed more details: the `thread' >> member of the fd_callback_info entry for the server connection >> references the diff-hl-async thread, which is already exited. This aspect sounds similar to bug#79201. Zhengyi, I assume you're testing with an Emacs built from source. Can you check whether the Emacs you're building includes commit 37325ed5a9c7f62c35b368d9530496ed31404940, which was pushed just a couple weeks ago? If not, can you pull and test again? If that doesn't fix it (or you already had that commit), can you try reverting commits 034d755f2f21088b97fdb0a34d846c39fcdbf46d then c93be71e45d4cebeb017c813426228e579e9381d and test again?
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 13:48:02 GMT) Full text and rfc822 format available.Message #14 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Zhengyi Fu <i <at> fuzy.me> To: Spencer Baugh <sbaugh <at> janestreet.com> Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 21:46:56 +0800
Spencer Baugh <sbaugh <at> janestreet.com> writes: >> >> Spencer and Dmitry, do you agree? > > No. At worst, the locking of the new connection should match that of > the existing process. The new connection should not randomly be locked > to the thread which happened to run wait_reading_process_output. > >>> Further debugging through gdb revealed more details: the `thread' >>> member of the fd_callback_info entry for the server connection >>> references the diff-hl-async thread, which is already exited. > > This aspect sounds similar to bug#79201. > > Zhengyi, I assume you're testing with an Emacs built from source. Can > you check whether the Emacs you're building includes commit > 37325ed5a9c7f62c35b368d9530496ed31404940, which was pushed just a couple > weeks ago? If not, can you pull and test again? Yes, it includes that commit. > If that doesn't fix it (or you already had that commit), can you try > reverting commits 034d755f2f21088b97fdb0a34d846c39fcdbf46d then > c93be71e45d4cebeb017c813426228e579e9381d and test again? After reverting those two commits, it worked fine.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 13:57:02 GMT) Full text and rfc822 format available.Message #17 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Spencer Baugh <sbaugh <at> janestreet.com> To: Zhengyi Fu <i <at> fuzy.me> Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 09:56:47 -0400
Zhengyi Fu <i <at> fuzy.me> writes: > Spencer Baugh <sbaugh <at> janestreet.com> writes: >>>> Further debugging through gdb revealed more details: the `thread' >>>> member of the fd_callback_info entry for the server connection >>>> references the diff-hl-async thread, which is already exited. >> >> This aspect sounds similar to bug#79201. >> >> Zhengyi, I assume you're testing with an Emacs built from source. Can >> you check whether the Emacs you're building includes commit >> 37325ed5a9c7f62c35b368d9530496ed31404940, which was pushed just a couple >> weeks ago? If not, can you pull and test again? > > Yes, it includes that commit. > >> If that doesn't fix it (or you already had that commit), can you try >> reverting commits 034d755f2f21088b97fdb0a34d846c39fcdbf46d then >> c93be71e45d4cebeb017c813426228e579e9381d and test again? > > After reverting those two commits, it worked fine. Thanks for testing. I believe the right fix is to revert those commits, since they are already breaking existing code using threads.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 14:19:01 GMT) Full text and rfc822 format available.Message #20 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 17:17:38 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com> > Cc: Zhengyi Fu <i <at> fuzy.me>, Dmitry Gutov <dmitry <at> gutov.dev>, > 79367 <at> debbugs.gnu.org > Date: Tue, 02 Sep 2025 09:08:14 -0400 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > >> --- a/src/process.c > >> +++ b/src/process.c > >> @@ -5115,6 +5115,8 @@ server_accept_connection (Lisp_Object server, int channel) > >> /* Client processes for accepted connections are not stopped initially. */ > >> if (!EQ (p->filter, Qt)) > >> add_process_read_fd (s); > >> + set_proc_thread (p, current_thread); > >> + > >> if (s > max_desc) > >> max_desc = s; > > > > Thanks. > > > > I believe you are right, since server_accept_connection calls > > make_process to create a new process object, which should by default > > be locked to the thread which created it. > > > > Spencer and Dmitry, do you agree? > > No. At worst, the locking of the new connection should match that of > the existing process. The new connection should not randomly be locked > to the thread which happened to run wait_reading_process_output. But make_process, called by server_accept_connection, already did this: pset_thread (p, Fcurrent_thread ()); So if we want to lock this new process to some other thread, we should fix that part as well to be consistent. And why do you think this thread is a random one? If everything works as intended, it is the thread which created the existing process, because the descriptor whose presence in Available triggers the call to server_accept_connection is one of those on which pselect waited, and should have belonged to the current thread. If that is not true, then it's a separate problem which should be fixed, perhaps as part of the related discussions. > >> Further debugging through gdb revealed more details: the `thread' > >> member of the fd_callback_info entry for the server connection > >> references the diff-hl-async thread, which is already exited. > > This aspect sounds similar to bug#79201. > > Zhengyi, I assume you're testing with an Emacs built from source. Can > you check whether the Emacs you're building includes commit > 37325ed5a9c7f62c35b368d9530496ed31404940, which was pushed just a couple > weeks ago? If not, can you pull and test again? > > If that doesn't fix it (or you already had that commit), can you try > reverting commits 034d755f2f21088b97fdb0a34d846c39fcdbf46d then > c93be71e45d4cebeb017c813426228e579e9381d and test again? What will that prove? Those changes are not going away, so trying that is just wasting everyone's time and energy. I thought we were past that...
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 14:34:02 GMT) Full text and rfc822 format available.Message #23 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Spencer Baugh <sbaugh <at> janestreet.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 10:33:12 -0400
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Spencer Baugh <sbaugh <at> janestreet.com> >> Cc: Zhengyi Fu <i <at> fuzy.me>, Dmitry Gutov <dmitry <at> gutov.dev>, >> 79367 <at> debbugs.gnu.org >> Date: Tue, 02 Sep 2025 09:08:14 -0400 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> >> --- a/src/process.c >> >> +++ b/src/process.c >> >> @@ -5115,6 +5115,8 @@ server_accept_connection (Lisp_Object server, int channel) >> >> /* Client processes for accepted connections are not stopped initially. */ >> >> if (!EQ (p->filter, Qt)) >> >> add_process_read_fd (s); >> >> + set_proc_thread (p, current_thread); >> >> + >> >> if (s > max_desc) >> >> max_desc = s; >> > >> > Thanks. >> > >> > I believe you are right, since server_accept_connection calls >> > make_process to create a new process object, which should by default >> > be locked to the thread which created it. >> > >> > Spencer and Dmitry, do you agree? >> >> No. At worst, the locking of the new connection should match that of >> the existing process. The new connection should not randomly be locked >> to the thread which happened to run wait_reading_process_output. > > But make_process, called by server_accept_connection, already did > this: > > pset_thread (p, Fcurrent_thread ()); > > So if we want to lock this new process to some other thread, we should > fix that part as well to be consistent. True, that's a bug. > And why do you think this thread is a random one? If everything works > as intended, it is the thread which created the existing process, > because the descriptor whose presence in Available triggers the call > to server_accept_connection is one of those on which pselect waited, > and should have belonged to the current thread. Not if the existing process was unlocked. Then this is a random thread. >> >> Further debugging through gdb revealed more details: the `thread' >> >> member of the fd_callback_info entry for the server connection >> >> references the diff-hl-async thread, which is already exited. >> >> This aspect sounds similar to bug#79201. >> >> Zhengyi, I assume you're testing with an Emacs built from source. Can >> you check whether the Emacs you're building includes commit >> 37325ed5a9c7f62c35b368d9530496ed31404940, which was pushed just a couple >> weeks ago? If not, can you pull and test again? >> >> If that doesn't fix it (or you already had that commit), can you try >> reverting commits 034d755f2f21088b97fdb0a34d846c39fcdbf46d then >> c93be71e45d4cebeb017c813426228e579e9381d and test again? > > What will that prove? Those changes are not going away, so trying > that is just wasting everyone's time and energy. I thought we were > past that... No, we're not past that, I still think those changes are wrong. Especially because this bug report shows that those changes are indeed breaking existing code. I repeatedly said that those changes would cause new issues to be reported, and now it is happening. Maybe it doesn't convince you that the changes need to be reverted, but you should at least be giving it a little more credence now.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 14:44:02 GMT) Full text and rfc822 format available.Message #26 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 17:42:56 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com> > Cc: Dmitry Gutov <dmitry <at> gutov.dev>, Eli Zaretskii <eliz <at> gnu.org>, > 79367 <at> debbugs.gnu.org > Date: Tue, 02 Sep 2025 09:56:47 -0400 > > Zhengyi Fu <i <at> fuzy.me> writes: > > Spencer Baugh <sbaugh <at> janestreet.com> writes: > >>>> Further debugging through gdb revealed more details: the `thread' > >>>> member of the fd_callback_info entry for the server connection > >>>> references the diff-hl-async thread, which is already exited. > >> > >> This aspect sounds similar to bug#79201. > >> > >> Zhengyi, I assume you're testing with an Emacs built from source. Can > >> you check whether the Emacs you're building includes commit > >> 37325ed5a9c7f62c35b368d9530496ed31404940, which was pushed just a couple > >> weeks ago? If not, can you pull and test again? > > > > Yes, it includes that commit. > > > >> If that doesn't fix it (or you already had that commit), can you try > >> reverting commits 034d755f2f21088b97fdb0a34d846c39fcdbf46d then > >> c93be71e45d4cebeb017c813426228e579e9381d and test again? > > > > After reverting those two commits, it worked fine. > > Thanks for testing. > > I believe the right fix is to revert those commits, since they are > already breaking existing code using threads. I object to doing that, for reasons that were already abundantly discussed. An alternative fix was suggested by the OP that is IMO a step in the right direction. IOW, this scenario reveals yet another call to make_process (which sets the process's 'thread' member, something that was always done there) which isn't followed by marking the I/O descriptors with that same thread. So TRT is to add that missing call to set_proc_thread.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 15:03:02 GMT) Full text and rfc822 format available.Message #29 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 18:02:39 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com> > Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > Date: Tue, 02 Sep 2025 10:33:12 -0400 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > > But make_process, called by server_accept_connection, already did > > this: > > > > pset_thread (p, Fcurrent_thread ()); > > > > So if we want to lock this new process to some other thread, we should > > fix that part as well to be consistent. > > True, that's a bug. > > > And why do you think this thread is a random one? If everything works > > as intended, it is the thread which created the existing process, > > because the descriptor whose presence in Available triggers the call > > to server_accept_connection is one of those on which pselect waited, > > and should have belonged to the current thread. > > Not if the existing process was unlocked. Then this is a random thread. Then let's fix the case of an unlocked process (which AFAIU is not what happened here). But if the process was locked to a particular thread, as that is the default, then do you agree that this code should be run by that very thread, at least nominally? > >> If that doesn't fix it (or you already had that commit), can you try > >> reverting commits 034d755f2f21088b97fdb0a34d846c39fcdbf46d then > >> c93be71e45d4cebeb017c813426228e579e9381d and test again? > > > > What will that prove? Those changes are not going away, so trying > > that is just wasting everyone's time and energy. I thought we were > > past that... > > No, we're not past that, I still think those changes are wrong. > Especially because this bug report shows that those changes are indeed > breaking existing code. > > I repeatedly said that those changes would cause new issues to be > reported, and now it is happening. Maybe it doesn't convince you that > the changes need to be reverted, but you should at least be giving it a > little more credence now. This particular breakage is simply because my change was incomplete: it left some processes not locked to their thread, and the solution proposed by the OP, which fixed that blunder, is simpler and has fewer downsides than backing out the process locking to threads. This happens all the time in development, and the conclusion is rarely that the original changes should be backed out, certainly not if there are simpler fixes.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 15:20:02 GMT) Full text and rfc822 format available.Message #32 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Spencer Baugh <sbaugh <at> janestreet.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 11:19:21 -0400
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Spencer Baugh <sbaugh <at> janestreet.com> >> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org >> Date: Tue, 02 Sep 2025 10:33:12 -0400 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> > But make_process, called by server_accept_connection, already did >> > this: >> > >> > pset_thread (p, Fcurrent_thread ()); >> > >> > So if we want to lock this new process to some other thread, we should >> > fix that part as well to be consistent. >> >> True, that's a bug. >> >> > And why do you think this thread is a random one? If everything works >> > as intended, it is the thread which created the existing process, >> > because the descriptor whose presence in Available triggers the call >> > to server_accept_connection is one of those on which pselect waited, >> > and should have belonged to the current thread. >> >> Not if the existing process was unlocked. Then this is a random thread. > > Then let's fix the case of an unlocked process (which AFAIU is not > what happened here). But if the process was locked to a particular > thread, as that is the default, then do you agree that this code > should be run by that very thread, at least nominally? Yes. But we can't rely on that since of course the process could be unlocked. Our code needs to behave correctly in both cases. >> >> If that doesn't fix it (or you already had that commit), can you try >> >> reverting commits 034d755f2f21088b97fdb0a34d846c39fcdbf46d then >> >> c93be71e45d4cebeb017c813426228e579e9381d and test again? >> > >> > What will that prove? Those changes are not going away, so trying >> > that is just wasting everyone's time and energy. I thought we were >> > past that... >> >> No, we're not past that, I still think those changes are wrong. >> Especially because this bug report shows that those changes are indeed >> breaking existing code. >> >> I repeatedly said that those changes would cause new issues to be >> reported, and now it is happening. Maybe it doesn't convince you that >> the changes need to be reverted, but you should at least be giving it a >> little more credence now. > > This particular breakage is simply because my change was incomplete: > it left some processes not locked to their thread, and the solution > proposed by the OP, which fixed that blunder, is simpler and has fewer > downsides than backing out the process locking to threads. This > happens all the time in development, and the conclusion is rarely that > the original changes should be backed out, certainly not if there are > simpler fixes. That's true, but that doesn't change the fact that your change was broken and caused this bug. And I am annoyed that I have to debug issues caused by your own broken changes while you refuse the simple solution of backing out the broken change. Could we at least turn off your change by default, if we can't back it out? I would be happy to add code to do that. That will give us more time to iron out the issues with it.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 16:14:02 GMT) Full text and rfc822 format available.Message #35 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 19:13:32 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com> > Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > Date: Tue, 02 Sep 2025 11:19:21 -0400 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > >> > And why do you think this thread is a random one? If everything works > >> > as intended, it is the thread which created the existing process, > >> > because the descriptor whose presence in Available triggers the call > >> > to server_accept_connection is one of those on which pselect waited, > >> > and should have belonged to the current thread. > >> > >> Not if the existing process was unlocked. Then this is a random thread. > > > > Then let's fix the case of an unlocked process (which AFAIU is not > > what happened here). But if the process was locked to a particular > > thread, as that is the default, then do you agree that this code > > should be run by that very thread, at least nominally? > > Yes. > > But we can't rely on that since of course the process could be unlocked. > Our code needs to behave correctly in both cases. If you mean the case where the original process was not locked to any thread, I agree: the fix should include both cases. It is easy to distinguish between a process that wasn't locked (in which case the new one shouldn't be locked, either), and the case where it was locked to a thread. If you mean something else, please elaborate. > > This particular breakage is simply because my change was incomplete: > > it left some processes not locked to their thread, and the solution > > proposed by the OP, which fixed that blunder, is simpler and has fewer > > downsides than backing out the process locking to threads. This > > happens all the time in development, and the conclusion is rarely that > > the original changes should be backed out, certainly not if there are > > simpler fixes. > > That's true, but that doesn't change the fact that your change was > broken and caused this bug. This happens all the time in development, to all of us. When it does, we install follow-up changes to fix the regressions. We don't usually revert the original ones, unless we learn that their ideas are unworkable, or cause much worse problems. This is not that case, not yet. > And I am annoyed that I have to debug issues caused by your own > broken changes while you refuse the simple solution of backing out > the broken change. It happens all the time in Emacs development that someone makes a mistake and someone else needs to debug and fix this. If you are annoyed in this case, just leave the debugging to me. Unlike some other people who make changes and disappear, I'm here, day in and day out, and when a bug due to my changes is reported, it is usually very high priority to me. (In this case, I responded within an hour of reading the bug report. Before you did, btw.) But backing out changes is not a "simple solution", because those changes were made for a reason. > Could we at least turn off your change by default, if we can't back it > out? I would be happy to add code to do that. That will give us more > time to iron out the issues with it. If we turn this off by default, then any problems, like this one, which need to be fixed as followup, will never materialize. So I must say no in this case. (I also don't think this strategy is correct in any software development in general.) If your use cases suffer from this, you can use set-process-thread to unlock your processes, as a stopgap. Although I would encourage you not to do that and report the problems instead, so that they could be debugged and fixed ASAP.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 16:29:01 GMT) Full text and rfc822 format available.Message #38 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Dmitry Gutov <dmitry <at> gutov.dev> To: Eli Zaretskii <eliz <at> gnu.org>, Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 2 Sep 2025 19:28:05 +0300
On 02/09/2025 18:02, Eli Zaretskii wrote: >>>> If that doesn't fix it (or you already had that commit), can you try >>>> reverting commits 034d755f2f21088b97fdb0a34d846c39fcdbf46d then >>>> c93be71e45d4cebeb017c813426228e579e9381d and test again? >>> What will that prove? Those changes are not going away, so trying >>> that is just wasting everyone's time and energy. I thought we were >>> past that... >> No, we're not past that, I still think those changes are wrong. >> Especially because this bug report shows that those changes are indeed >> breaking existing code. >> >> I repeatedly said that those changes would cause new issues to be >> reported, and now it is happening. Maybe it doesn't convince you that >> the changes need to be reverted, but you should at least be giving it a >> little more credence now. > This particular breakage is simply because my change was incomplete: > it left some processes not locked to their thread, and the solution > proposed by the OP, which fixed that blunder, is simpler and has fewer > downsides than backing out the process locking to threads. This > happens all the time in development, and the conclusion is rarely that > the original changes should be backed out, certainly not if there are > simpler fixes. I do believe that when a change in a long-standing implementation causes an immediate issue we first consider reverting, especially when there is no consensus on the approach. A one-liner change is probably not a big deal by itself, but like Spencer said, the new process should at least follow the locking of its original thread, or something like this. And we still don't have a proposed fix on that. I don't want us to spend a lot of time arguing revert-or-no-revert now, but could you, Eli, try to decide on a full example of a buggy scenario which is really solved by thread locking being? Like, one that you consider a prime example. The present bug report doesn't count, since it's caused by an incomplete implementation. We'll need a specific piece of Lisp code, to be able to try different underlying implementations with it. So far I see two significant reasons to default to locking: - When we have scheduled some code to run on a different thread (filter/sentinel), there is no simple way to switch to a previous thread, to run it there. Unlike with the current buffer, for example. - Enabling locking later if a process starts without being locked to a thread, could be error-prone. But both only really work only if we understand specific problems being solved by locking in the first place.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 19:19:01 GMT) Full text and rfc822 format available.Message #41 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Dmitry Gutov <dmitry <at> gutov.dev> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 22:18:20 +0300
> Date: Tue, 2 Sep 2025 19:28:05 +0300 > Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org > From: Dmitry Gutov <dmitry <at> gutov.dev> > > > This particular breakage is simply because my change was incomplete: > > it left some processes not locked to their thread, and the solution > > proposed by the OP, which fixed that blunder, is simpler and has fewer > > downsides than backing out the process locking to threads. This > > happens all the time in development, and the conclusion is rarely that > > the original changes should be backed out, certainly not if there are > > simpler fixes. > > I do believe that when a change in a long-standing implementation causes > an immediate issue we first consider reverting, especially when there is > no consensus on the approach. No, we don't. Reverting is actually the last resort, reserved for plain and clear mistakes, and for changes that cause grave regressions and cannot be fixed soon enough. > A one-liner change is probably not a big deal by itself, but like > Spencer said, the new process should at least follow the locking of its > original thread, or something like this. And we still don't have a > proposed fix on that. I can code the proposed fix in a few minutes, if there's agreement to what I proposed to do. I'm still uncertain what I proposed is agreed-upon. So let me reiterate that: . if the process on behalf of which we called server_accept_connection was not locked to some thread, undo what make_process did to the new process when it called pset_thread . otherwise, call set_proc_thread to lock the new process to the same thread as the one which caused this call > I don't want us to spend a lot of time arguing revert-or-no-revert now, > but could you, Eli, try to decide on a full example of a buggy scenario > which is really solved by thread locking being? I already described that in so many words. That is all I can afford. And please keep in mind that processes were being locked to threads since day one, just incompletely so. I didn't invent it just yesterday out of the blue: it's the original design of processes with threads. And if that is not good enough for you, then we will have to agree to disagree, and leave it at that. I stand by my opinion.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 19:34:02 GMT) Full text and rfc822 format available.Message #44 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Spencer Baugh <sbaugh <at> janestreet.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, Dmitry Gutov <dmitry <at> gutov.dev>, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 02 Sep 2025 15:33:46 -0400
Eli Zaretskii <eliz <at> gnu.org> writes: >> Date: Tue, 2 Sep 2025 19:28:05 +0300 >> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org >> From: Dmitry Gutov <dmitry <at> gutov.dev> >> >> > This particular breakage is simply because my change was incomplete: >> > it left some processes not locked to their thread, and the solution >> > proposed by the OP, which fixed that blunder, is simpler and has fewer >> > downsides than backing out the process locking to threads. This >> > happens all the time in development, and the conclusion is rarely that >> > the original changes should be backed out, certainly not if there are >> > simpler fixes. >> >> I do believe that when a change in a long-standing implementation causes >> an immediate issue we first consider reverting, especially when there is >> no consensus on the approach. > > No, we don't. Reverting is actually the last resort, reserved for > plain and clear mistakes, and for changes that cause grave regressions > and cannot be fixed soon enough. > >> A one-liner change is probably not a big deal by itself, but like >> Spencer said, the new process should at least follow the locking of its >> original thread, or something like this. And we still don't have a >> proposed fix on that. > > I can code the proposed fix in a few minutes, if there's agreement to > what I proposed to do. I'm still uncertain what I proposed is > agreed-upon. So let me reiterate that: > > . if the process on behalf of which we called > server_accept_connection was not locked to some thread, undo what > make_process did to the new process when it called pset_thread > . otherwise, call set_proc_thread to lock the new process to the > same thread as the one which caused this call Whatever you do, please post your change for review rather than just pushing it. >> I don't want us to spend a lot of time arguing revert-or-no-revert now, >> but could you, Eli, try to decide on a full example of a buggy scenario >> which is really solved by thread locking being? > > I already described that in so many words. Can you send that full description again? Perhaps I can translate it from words into code for you.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 19:47:01 GMT) Full text and rfc822 format available.Message #47 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Dmitry Gutov <dmitry <at> gutov.dev> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Tue, 2 Sep 2025 22:45:54 +0300
On 02/09/2025 22:18, Eli Zaretskii wrote: >> I do believe that when a change in a long-standing implementation causes >> an immediate issue we first consider reverting, especially when there is >> no consensus on the approach. > > No, we don't. Reverting is actually the last resort, reserved for > plain and clear mistakes, and for changes that cause grave regressions > and cannot be fixed soon enough. A "mistake" is often in the eyes of the beholder. The lack of consensus is something more objective. >> A one-liner change is probably not a big deal by itself, but like >> Spencer said, the new process should at least follow the locking of its >> original thread, or something like this. And we still don't have a >> proposed fix on that. > > I can code the proposed fix in a few minutes, if there's agreement to > what I proposed to do. I'm still uncertain what I proposed is > agreed-upon. So let me reiterate that: > > . if the process on behalf of which we called > server_accept_connection was not locked to some thread, undo what > make_process did to the new process when it called pset_thread > . otherwise, call set_proc_thread to lock the new process to the > same thread as the one which caused this call Sounds about right. >> I don't want us to spend a lot of time arguing revert-or-no-revert now, >> but could you, Eli, try to decide on a full example of a buggy scenario >> which is really solved by thread locking being? > > I already described that in so many words. That is all I can afford. We need a fuller scenario. If not with code, then a full use case described in English. I'm pretty sure more of your time is being wasted on answering the emails with objections. Regrettably. > And please keep in mind that processes were being locked to threads > since day one, just incompletely so. I didn't invent it just > yesterday out of the blue: it's the original design of processes with > threads. IIUC the existing protection did very little in practice, all of these years: it only stopped some thread calling 'accept-process-output' with an explicit reference of a process belonging to another thread. Any kind of buggy code would have to try hard to obtain that reference in the first place - so that would almost never happen anyway. The new protections seem a lot more strict. TBC I think we should continue to support locking, and your current fix seems productive in this regard, but whether locking should be by default is not obvious to me - and at least two smart devs who worked with Emacs Lisp threads say no. That seems like grounds to re-evaluate. Even if we reach the current default we'll have documented our conclusions better rather than just saying "this is so".
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Tue, 02 Sep 2025 21:22:06 GMT) Full text and rfc822 format available.Message #50 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Dmitry Gutov <dmitry <at> gutov.dev> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 3 Sep 2025 00:21:09 +0300
On 02/09/2025 22:45, Dmitry Gutov wrote: > We need a fuller scenario. If not with code, then a full use case > described in English. Maybe to start something, do we expect some Comint functionality to be broken when background threads exist and perhaps call 'accept-process-output' with nil, in a loop? Such as comint-redirect-results-list-from-process or comint-proc-query, for example. Or among third party code, inf-ruby-completions. All of these call 'accept-process-output' with their own process object.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 11:27:02 GMT) Full text and rfc822 format available.Message #53 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 14:25:58 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com> > Cc: Dmitry Gutov <dmitry <at> gutov.dev>, i <at> fuzy.me, 79367 <at> debbugs.gnu.org > Date: Tue, 02 Sep 2025 15:33:46 -0400 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > > I can code the proposed fix in a few minutes, if there's agreement to > > what I proposed to do. I'm still uncertain what I proposed is > > agreed-upon. So let me reiterate that: > > > > . if the process on behalf of which we called > > server_accept_connection was not locked to some thread, undo what > > make_process did to the new process when it called pset_thread > > . otherwise, call set_proc_thread to lock the new process to the > > same thread as the one which caused this call > > Whatever you do, please post your change for review rather than just > pushing it. Will do. > >> I don't want us to spend a lot of time arguing revert-or-no-revert now, > >> but could you, Eli, try to decide on a full example of a buggy scenario > >> which is really solved by thread locking being? > > > > I already described that in so many words. > > Can you send that full description again? Perhaps I can translate it > from words into code for you. Look at my posts as part of discussing bug#79201. For example: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#88 https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#94 https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#100 https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#106
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 11:55:02 GMT) Full text and rfc822 format available.Message #56 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Dmitry Gutov <dmitry <at> gutov.dev> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 14:53:42 +0300
> Date: Tue, 2 Sep 2025 22:45:54 +0300 > Cc: sbaugh <at> janestreet.com, i <at> fuzy.me, 79367 <at> debbugs.gnu.org > From: Dmitry Gutov <dmitry <at> gutov.dev> > > On 02/09/2025 22:18, Eli Zaretskii wrote: > > >> I do believe that when a change in a long-standing implementation causes > >> an immediate issue we first consider reverting, especially when there is > >> no consensus on the approach. > > > > No, we don't. Reverting is actually the last resort, reserved for > > plain and clear mistakes, and for changes that cause grave regressions > > and cannot be fixed soon enough. > > A "mistake" is often in the eyes of the beholder. The lack of consensus > is something more objective. I said "clear mistakes". Those aren't in the eyes of the beholder, they are clear to everyone. > > . if the process on behalf of which we called > > server_accept_connection was not locked to some thread, undo what > > make_process did to the new process when it called pset_thread > > . otherwise, call set_proc_thread to lock the new process to the > > same thread as the one which caused this call > > Sounds about right. Thanks, will do. > >> I don't want us to spend a lot of time arguing revert-or-no-revert now, > >> but could you, Eli, try to decide on a full example of a buggy scenario > >> which is really solved by thread locking being? > > > > I already described that in so many words. That is all I can afford. > > We need a fuller scenario. If not with code, then a full use case > described in English. I described some of them, and just now posted a list of my messages in bug#79201 where I did that. That's the best I can afford doing at this time, sorry. > I'm pretty sure more of your time is being wasted on answering the > emails with objections. Regrettably. Yes. And I still consider this a worthwhile investment, if as result we will become on the same page regarding how the relevant code works and what should be our expectations from it. > > And please keep in mind that processes were being locked to threads > > since day one, just incompletely so. I didn't invent it just > > yesterday out of the blue: it's the original design of processes with > > threads. > > IIUC the existing protection did very little in practice, all of these > years: it only stopped some thread calling 'accept-process-output' with > an explicit reference of a process belonging to another thread. That's not "very little" at all. And if some Lisp program called set-process-thread, it would do exactly what is now done by default. > Any kind > of buggy code would have to try hard to obtain that reference in the > first place - so that would almost never happen anyway. The new > protections seem a lot more strict. Yes, it is stricter. Because that's the intent, as is clear both from the code and from the documentation, and also because it makes Lisp programs much easier to reason about. I also don't quite see a disaster this change is causing to code which disregarded the documented restriction. All we had till now is a single bug report, which clearly indicates that my change was incomplete, and is easy to fix. I fail to understand the amount of excitement due to a change that turns out not to be different from many other changes we make in development. Here's one recent example: https://lists.gnu.org/archive/html/emacs-devel/2025-09/msg00002.html This even broke the build, which is clearly a serious regression. And the solution was that Mattias quickly installed two followup changes; case closed. Good job, Mattias! > TBC I think we should continue to support locking, and your current fix > seems productive in this regard, but whether locking should be by > default is not obvious to me - and at least two smart devs who worked > with Emacs Lisp threads say no. That seems like grounds to re-evaluate. > Even if we reach the current default we'll have documented our > conclusions better rather than just saying "this is so". I'm not saying we cannot revisit the decision to lock processes to threads. But given that the original design and implementation clearly tell us that was the intent, I think we should start from what we have, and only change it if we decide that the current default is wrong in too many important cases. For now all we have is 2 cases: one was solved yesterday by unlocking the process, the other is the one discussed here, which uncovered a flaw in the change I installed, and should be easy to fix. That's not enough to declare the idea of locking wrong as the default, let alone bankrupt. For such a far-reaching conclusion we need much more data points, of both kinds. My knowledge of this code and my experience, based on 12 years of developing, maintaining, and fixing bugs in it, is that letting random threads read output from subprocesses causes many unexpected behaviors and thus triggers subtle bugs and non-determinism in seemingly valid Lisp programs. You think differently, but please don't dismiss my opinions on this too quickly, given my familiarity with the code in this area and the amount of time I spent on it.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 11:58:02 GMT) Full text and rfc822 format available.Message #59 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Dmitry Gutov <dmitry <at> gutov.dev> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 14:57:13 +0300
> Date: Wed, 3 Sep 2025 00:21:09 +0300 > From: Dmitry Gutov <dmitry <at> gutov.dev> > Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org > > On 02/09/2025 22:45, Dmitry Gutov wrote: > > We need a fuller scenario. If not with code, then a full use case > > described in English. > > Maybe to start something, do we expect some Comint functionality to be > broken when background threads exist and perhaps call > 'accept-process-output' with nil, in a loop? > > Such as comint-redirect-results-list-from-process or comint-proc-query, > for example. If the process is locked to the thread which runs comint-redirect-results-list-from-process, I wouldn't expect problems there. But it would be good for someone to look into this, sure. > Or among third party code, inf-ruby-completions. All of these call > 'accept-process-output' with their own process object. Same, I think.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 12:18:01 GMT) Full text and rfc822 format available.Message #62 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Dmitry Gutov <dmitry <at> gutov.dev> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 3 Sep 2025 15:17:42 +0300
On 03/09/2025 14:57, Eli Zaretskii wrote: >> Maybe to start something, do we expect some Comint functionality to be >> broken when background threads exist and perhaps call >> 'accept-process-output' with nil, in a loop? >> >> Such as comint-redirect-results-list-from-process or comint-proc-query, >> for example. > If the process is locked to the thread which runs > comint-redirect-results-list-from-process, I wouldn't expect problems > there. But it would be good for someone to look into this, sure. Right, I meant the unlocked configuration (this could be an evidence of it being bad default).
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 13:16:02 GMT) Full text and rfc822 format available.Message #65 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Dmitry Gutov <dmitry <at> gutov.dev> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 16:15:42 +0300
> Date: Wed, 3 Sep 2025 15:17:42 +0300 > Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org > From: Dmitry Gutov <dmitry <at> gutov.dev> > > On 03/09/2025 14:57, Eli Zaretskii wrote: > >> Maybe to start something, do we expect some Comint functionality to be > >> broken when background threads exist and perhaps call > >> 'accept-process-output' with nil, in a loop? > >> > >> Such as comint-redirect-results-list-from-process or comint-proc-query, > >> for example. > > If the process is locked to the thread which runs > > comint-redirect-results-list-from-process, I wouldn't expect problems > > there. But it would be good for someone to look into this, sure. > > Right, I meant the unlocked configuration (this could be an evidence of > it being bad default). I'm not familiar with the expectations of this function. Is it expected to wait in a loop until PROCESS exits, and process the output only then? If yes, the only question is what happens when PROCESS exits while another thread has the global lock. This is related to a problem discussed in bug#79334, so maybe we should revisit this when that bug is deemed solved.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 14:14:02 GMT) Full text and rfc822 format available.Message #68 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: sbaugh <at> janestreet.com, i <at> fuzy.me, dmitry <at> gutov.dev Cc: 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 17:13:42 +0300
> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > Date: Wed, 03 Sep 2025 14:25:58 +0300 > From: Eli Zaretskii <eliz <at> gnu.org> > > > > . if the process on behalf of which we called > > > server_accept_connection was not locked to some thread, undo what > > > make_process did to the new process when it called pset_thread > > > . otherwise, call set_proc_thread to lock the new process to the > > > same thread as the one which caused this call > > > > Whatever you do, please post your change for review rather than just > > pushing it. > > Will do. The proposed patch is below. Zhengyi Fu, can you please apply it to the current master, and see if it solves the original problem? diff --git a/src/process.c b/src/process.c index d6efac5..0c41ec6 100644 --- a/src/process.c +++ b/src/process.c @@ -5078,6 +5078,10 @@ server_accept_connection (Lisp_Object server, int channel) fcntl (s, F_SETFL, O_NONBLOCK); p = XPROCESS (proc); + /* make_process calls pset_thread, but if the server process is not + locked to any thread, we need to undo what make_process did. */ + if (NILP (ps->thread)) + pset_thread (p, Qnil); /* Build new contact information for this setup. */ contact = Fcopy_sequence (ps->childp); @@ -5117,6 +5121,13 @@ server_accept_connection (Lisp_Object server, int channel) add_process_read_fd (s); if (s > max_desc) max_desc = s; + /* If the server process is locked to this thread, lock the client + process to the same thread. */ + if (!NILP (ps->thread)) + { + eassert (XTHREAD (ps->thread) == current_thread); + set_proc_thread (p, XTHREAD (ps->thread)); + } /* Setup coding system for new process based on server process. This seems to be the proper thing to do, as the coding system
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 14:43:02 GMT) Full text and rfc822 format available.Message #71 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Zhengyi Fu <i <at> fuzy.me> To: Eli Zaretskii <eliz <at> gnu.org> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 22:42:26 +0800
Eli Zaretskii <eliz <at> gnu.org> writes: >> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org >> Date: Wed, 03 Sep 2025 14:25:58 +0300 >> From: Eli Zaretskii <eliz <at> gnu.org> >> >> > > . if the process on behalf of which we called >> > > server_accept_connection was not locked to some thread, undo what >> > > make_process did to the new process when it called pset_thread >> > > . otherwise, call set_proc_thread to lock the new process to the >> > > same thread as the one which caused this call >> > >> > Whatever you do, please post your change for review rather than just >> > pushing it. >> >> Will do. > > The proposed patch is below. > > Zhengyi Fu, can you please apply it to the current master, and see if > it solves the original problem? It does resolve my original problem. But I think we need to call set_proc_thread when the server process is not locked as well. Otherwise the `thread' member of the fd_callback_info may still be a dangling pointer. > diff --git a/src/process.c b/src/process.c > index d6efac5..0c41ec6 100644 > --- a/src/process.c > +++ b/src/process.c > @@ -5078,6 +5078,10 @@ server_accept_connection (Lisp_Object server, int channel) > fcntl (s, F_SETFL, O_NONBLOCK); > > p = XPROCESS (proc); > + /* make_process calls pset_thread, but if the server process is not > + locked to any thread, we need to undo what make_process did. */ > + if (NILP (ps->thread)) > + pset_thread (p, Qnil); > > /* Build new contact information for this setup. */ > contact = Fcopy_sequence (ps->childp); > @@ -5117,6 +5121,13 @@ server_accept_connection (Lisp_Object server, int channel) > add_process_read_fd (s); > if (s > max_desc) > max_desc = s; > + /* If the server process is locked to this thread, lock the client > + process to the same thread. */ > + if (!NILP (ps->thread)) > + { > + eassert (XTHREAD (ps->thread) == current_thread); > + set_proc_thread (p, XTHREAD (ps->thread)); > + } > > /* Setup coding system for new process based on server process. > This seems to be the proper thing to do, as the coding system
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 14:54:01 GMT) Full text and rfc822 format available.Message #74 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Zhengyi Fu <i <at> fuzy.me> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 17:53:17 +0300
> From: Zhengyi Fu <i <at> fuzy.me> > Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > Date: Wed, 03 Sep 2025 22:42:26 +0800 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > > Zhengyi Fu, can you please apply it to the current master, and see if > > it solves the original problem? > > It does resolve my original problem. Thanks for testing. > But I think we need to call set_proc_thread when the server process > is not locked as well. Otherwise the `thread' member of the > fd_callback_info may still be a dangling pointer. Oh, you mean the below? I agree, it is safer. diff --git a/src/process.c b/src/process.c index d6efac5..2cfff61 100644 --- a/src/process.c +++ b/src/process.c @@ -5078,6 +5078,10 @@ server_accept_connection (Lisp_Object server, int channel) fcntl (s, F_SETFL, O_NONBLOCK); p = XPROCESS (proc); + /* make_process calls pset_thread, but if the server process is not + locked to any thread, we need to undo what make_process did. */ + if (NILP (ps->thread)) + pset_thread (p, Qnil); /* Build new contact information for this setup. */ contact = Fcopy_sequence (ps->childp); @@ -5117,6 +5121,16 @@ server_accept_connection (Lisp_Object server, int channel) add_process_read_fd (s); if (s > max_desc) max_desc = s; + /* If the server process is locked to this thread, lock the client + process to the same thread, otherwise clear the thread of its I/O + descriptors. */ + if (NILP (ps->thread)) + set_proc_thread (p, NULL); + else + { + eassert (XTHREAD (ps->thread) == current_thread); + set_proc_thread (p, XTHREAD (ps->thread)); + } /* Setup coding system for new process based on server process. This seems to be the proper thing to do, as the coding system
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 14:59:01 GMT) Full text and rfc822 format available.Message #77 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Zhengyi Fu <i <at> fuzy.me> To: Eli Zaretskii <eliz <at> gnu.org> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 22:57:50 +0800
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Zhengyi Fu <i <at> fuzy.me> >> Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org >> Date: Wed, 03 Sep 2025 22:42:26 +0800 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> > Zhengyi Fu, can you please apply it to the current master, and see if >> > it solves the original problem? >> >> It does resolve my original problem. > > Thanks for testing. > >> But I think we need to call set_proc_thread when the server process >> is not locked as well. Otherwise the `thread' member of the >> fd_callback_info may still be a dangling pointer. > > Oh, you mean the below? I agree, it is safer. > > diff --git a/src/process.c b/src/process.c > index d6efac5..2cfff61 100644 > --- a/src/process.c > +++ b/src/process.c > @@ -5078,6 +5078,10 @@ server_accept_connection (Lisp_Object server, int channel) > fcntl (s, F_SETFL, O_NONBLOCK); > > p = XPROCESS (proc); > + /* make_process calls pset_thread, but if the server process is not > + locked to any thread, we need to undo what make_process did. */ > + if (NILP (ps->thread)) > + pset_thread (p, Qnil); > > /* Build new contact information for this setup. */ > contact = Fcopy_sequence (ps->childp); > @@ -5117,6 +5121,16 @@ server_accept_connection (Lisp_Object server, int channel) > add_process_read_fd (s); > if (s > max_desc) > max_desc = s; > + /* If the server process is locked to this thread, lock the client > + process to the same thread, otherwise clear the thread of its I/O > + descriptors. */ > + if (NILP (ps->thread)) > + set_proc_thread (p, NULL); > + else > + { > + eassert (XTHREAD (ps->thread) == current_thread); > + set_proc_thread (p, XTHREAD (ps->thread)); > + } > > /* Setup coding system for new process based on server process. > This seems to be the proper thing to do, as the coding system Yes.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 15:01:02 GMT) Full text and rfc822 format available.Message #80 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Spencer Baugh <sbaugh <at> janestreet.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: Zhengyi Fu <i <at> fuzy.me>, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 11:00:19 -0400
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Zhengyi Fu <i <at> fuzy.me> >> Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org >> Date: Wed, 03 Sep 2025 22:42:26 +0800 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> > Zhengyi Fu, can you please apply it to the current master, and see if >> > it solves the original problem? >> >> It does resolve my original problem. > > Thanks for testing. > >> But I think we need to call set_proc_thread when the server process >> is not locked as well. Otherwise the `thread' member of the >> fd_callback_info may still be a dangling pointer. > > Oh, you mean the below? I agree, it is safer. We should not do that - if the thread member of fd_callback_info is a dangling pointer, that indicates there's a bug elsewhere.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 15:21:01 GMT) Full text and rfc822 format available.Message #83 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 18:20:17 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com> > Cc: Zhengyi Fu <i <at> fuzy.me>, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev > Date: Wed, 03 Sep 2025 11:00:19 -0400 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > >> From: Zhengyi Fu <i <at> fuzy.me> > >> Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > >> Date: Wed, 03 Sep 2025 22:42:26 +0800 > >> > >> Eli Zaretskii <eliz <at> gnu.org> writes: > >> > >> > Zhengyi Fu, can you please apply it to the current master, and see if > >> > it solves the original problem? > >> > >> It does resolve my original problem. > > > > Thanks for testing. > > > >> But I think we need to call set_proc_thread when the server process > >> is not locked as well. Otherwise the `thread' member of the > >> fd_callback_info may still be a dangling pointer. > > > > Oh, you mean the below? I agree, it is safer. > > We should not do that - if the thread member of fd_callback_info is a > dangling pointer, that indicates there's a bug elsewhere. So how about adding an assertion before we clear it?
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 16:04:01 GMT) Full text and rfc822 format available.Message #86 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Spencer Baugh <sbaugh <at> janestreet.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 12:03:42 -0400
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Spencer Baugh <sbaugh <at> janestreet.com> >> Cc: Zhengyi Fu <i <at> fuzy.me>, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev >> Date: Wed, 03 Sep 2025 11:00:19 -0400 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> >> From: Zhengyi Fu <i <at> fuzy.me> >> >> Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org >> >> Date: Wed, 03 Sep 2025 22:42:26 +0800 >> >> >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> >> >> > Zhengyi Fu, can you please apply it to the current master, and see if >> >> > it solves the original problem? >> >> >> >> It does resolve my original problem. >> > >> > Thanks for testing. >> > >> >> But I think we need to call set_proc_thread when the server process >> >> is not locked as well. Otherwise the `thread' member of the >> >> fd_callback_info may still be a dangling pointer. >> > >> > Oh, you mean the below? I agree, it is safer. >> >> We should not do that - if the thread member of fd_callback_info is a >> dangling pointer, that indicates there's a bug elsewhere. > > So how about adding an assertion before we clear it? Sure. fd_callback_info.thread specifically should always be NULL at the time we're using the fd_callback_info slot for some new fd, so we should assert that.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 16:44:02 GMT) Full text and rfc822 format available.Message #89 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 19:42:47 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com> > Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev > Date: Wed, 03 Sep 2025 12:03:42 -0400 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > >> >> But I think we need to call set_proc_thread when the server process > >> >> is not locked as well. Otherwise the `thread' member of the > >> >> fd_callback_info may still be a dangling pointer. > >> > > >> > Oh, you mean the below? I agree, it is safer. > >> > >> We should not do that - if the thread member of fd_callback_info is a > >> dangling pointer, that indicates there's a bug elsewhere. > > > > So how about adding an assertion before we clear it? > > Sure. fd_callback_info.thread specifically should always be NULL at the > time we're using the fd_callback_info slot for some new fd, so we should > assert that. So, this: diff --git a/src/process.c b/src/process.c index d6efac5..8f5be5e 100644 --- a/src/process.c +++ b/src/process.c @@ -5078,6 +5078,10 @@ server_accept_connection (Lisp_Object server, int channel) fcntl (s, F_SETFL, O_NONBLOCK); p = XPROCESS (proc); + /* make_process calls pset_thread, but if the server process is not + locked to any thread, we need to undo what make_process did. */ + if (NILP (ps->thread)) + pset_thread (p, Qnil); /* Build new contact information for this setup. */ contact = Fcopy_sequence (ps->childp); @@ -5117,6 +5121,19 @@ server_accept_connection (Lisp_Object server, int channel) add_process_read_fd (s); if (s > max_desc) max_desc = s; + /* If the server process is locked to this thread, lock the client + process to the same thread, otherwise clear the thread of its I/O + descriptors. */ + if (NILP (ps->thread)) + { + eassert (!fd_callback_info[p->infd].thread); + set_proc_thread (p, NULL); + } + else + { + eassert (XTHREAD (ps->thread) == current_thread); + set_proc_thread (p, XTHREAD (ps->thread)); + } /* Setup coding system for new process based on server process. This seems to be the proper thing to do, as the coding system
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 17:08:02 GMT) Full text and rfc822 format available.Message #92 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Spencer Baugh <sbaugh <at> janestreet.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 13:07:08 -0400
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Spencer Baugh <sbaugh <at> janestreet.com> >> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev >> Date: Wed, 03 Sep 2025 12:03:42 -0400 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> >> >> But I think we need to call set_proc_thread when the server process >> >> >> is not locked as well. Otherwise the `thread' member of the >> >> >> fd_callback_info may still be a dangling pointer. >> >> > >> >> > Oh, you mean the below? I agree, it is safer. >> >> >> >> We should not do that - if the thread member of fd_callback_info is a >> >> dangling pointer, that indicates there's a bug elsewhere. >> > >> > So how about adding an assertion before we clear it? >> >> Sure. fd_callback_info.thread specifically should always be NULL at the >> time we're using the fd_callback_info slot for some new fd, so we should >> assert that. > > So, this: > > diff --git a/src/process.c b/src/process.c > index d6efac5..8f5be5e 100644 > --- a/src/process.c > +++ b/src/process.c > @@ -5078,6 +5078,10 @@ server_accept_connection (Lisp_Object server, int channel) > fcntl (s, F_SETFL, O_NONBLOCK); > > p = XPROCESS (proc); > + /* make_process calls pset_thread, but if the server process is not > + locked to any thread, we need to undo what make_process did. */ > + if (NILP (ps->thread)) > + pset_thread (p, Qnil); We should probably just unconditionally do: pset_thread (p, ps->thread) here. > /* Build new contact information for this setup. */ > contact = Fcopy_sequence (ps->childp); > @@ -5117,6 +5121,19 @@ server_accept_connection (Lisp_Object server, int channel) > add_process_read_fd (s); > if (s > max_desc) > max_desc = s; > + /* If the server process is locked to this thread, lock the client > + process to the same thread, otherwise clear the thread of its I/O > + descriptors. */ > + if (NILP (ps->thread)) > + { > + eassert (!fd_callback_info[p->infd].thread); > + set_proc_thread (p, NULL); Yes, this is the right assertion. But it should be outside the if/else because it's true even if ps->thread isn't nil. This whole if/else conditional should probably be just: eassert (!fd_callback_info[p->infd].thread); if (!NILP (p->thread)) { eassert (XTHREAD (ps->thread) == current_thread); set_proc_thread (p, XTHREAD (p->thread)); } It's not necessary to call set_proc_thread(p, NULL) in the other case since it will just set .thread to NULL, which we're already asserting anyway. > + } > + else > + { > + eassert (XTHREAD (ps->thread) == current_thread); > + set_proc_thread (p, XTHREAD (ps->thread)); > + }
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Wed, 03 Sep 2025 18:05:01 GMT) Full text and rfc822 format available.Message #95 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Spencer Baugh <sbaugh <at> janestreet.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Wed, 03 Sep 2025 14:04:42 -0400
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Spencer Baugh <sbaugh <at> janestreet.com> >> Cc: Dmitry Gutov <dmitry <at> gutov.dev>, i <at> fuzy.me, 79367 <at> debbugs.gnu.org >> Date: Tue, 02 Sep 2025 15:33:46 -0400 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> > I can code the proposed fix in a few minutes, if there's agreement to >> > what I proposed to do. I'm still uncertain what I proposed is >> > agreed-upon. So let me reiterate that: >> > >> > . if the process on behalf of which we called >> > server_accept_connection was not locked to some thread, undo what >> > make_process did to the new process when it called pset_thread >> > . otherwise, call set_proc_thread to lock the new process to the >> > same thread as the one which caused this call >> >> Whatever you do, please post your change for review rather than just >> pushing it. > > Will do. > >> >> I don't want us to spend a lot of time arguing revert-or-no-revert now, >> >> but could you, Eli, try to decide on a full example of a buggy scenario >> >> which is really solved by thread locking being? >> > >> > I already described that in so many words. >> >> Can you send that full description again? Perhaps I can translate it >> from words into code for you. > > Look at my posts as part of discussing bug#79201. For example: OK, attempting to excerpt the parts where you describe what could go wrong when a thread is unlocked: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#88 > Wouldn't we want the thread to be able to read the output > from its process after the wait? Without marking the descriptors with > the thread, that output could have been read by some other thread > during the wait, and then our thread will hang forever if it calls > accept-process-output after the wait ended -- a much more serious > problem, and in a program that has no bug per se. https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#94 > > > An output from a process could have been read by a thread other than > > > the one that started the process, thus "starving" the thread that > > > started the process from reading its output. > > > > That has been possible, and common, since threads were originally added. > > And caused strange bugs, whereby a thread that started a process would > hang in accept-process-output because some other thread inadvertently > read the output from that process. https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#100 > If you study the code in wait_reading_process_output, you will > immediately see what kind of godawful mess will that cause with > reading output from several processes started by several threads. > > wait_reading_process_output was written under an implicit assumption > that there's only one thread that handles all the subprocesses. > Therefore, it consistently checks all of the I/O descriptors for > available input from the subprocesses, and consumes that as soon as > it's available. That cannot possibly work reliably, let alone > predictably, when several threads are involved. This is so obvious > from reading the code that I'm astonished I need to even explain it. Note that we need to make sure wait_reading_process_output works reliably in this case for the sake of processes which aren't locked to threads. So defaulting to locking processes to threads doesn't actually help with this. > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#106 > Moreover, IME many weird problems and buggy behavior in Lisp programs > using threads were caused by the wrong thread inadvertently reading > output of a process, which then left the thread that waits for that > process's output stuck waiting forever. If anything, we should make > process locking to its thread stronger, not weaker, if we want > relatively simple threaded programs to work reliably. So, to summarize, it sounds like your issue with not locking processes to threads is: Without locking processes to the creating thread, a process's output can be read by some other thread during the wait, and then our thread will hang forever if it calls accept-process-output after the wait ended. This is unfortunately not detailed enough to translate into code. I've never seen a real-world program which would be vulnerable to this problem, as described here. I can make up toy programs which meet this description which would hang, but I'm not sure they're actually what you're concerned about. Especially because they aren't anything like real-world programs. Could you give some more details about the scenario, so I can try to write a test demonstrating the problem? Or could you point to a real-world program which is vulnerable to this problem, so I can turn it into a test?
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Thu, 04 Sep 2025 01:34:03 GMT) Full text and rfc822 format available.Message #98 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Dmitry Gutov <dmitry <at> gutov.dev> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Thu, 4 Sep 2025 04:33:31 +0300
On 03/09/2025 14:53, Eli Zaretskii wrote: >>>> I don't want us to spend a lot of time arguing revert-or-no-revert now, >>>> but could you, Eli, try to decide on a full example of a buggy scenario >>>> which is really solved by thread locking being? >>> >>> I already described that in so many words. That is all I can afford. >> >> We need a fuller scenario. If not with code, then a full use case >> described in English. > > I described some of them, and just now posted a list of my messages in > bug#79201 where I did that. That's the best I can afford doing at > this time, sorry. Thanks. Will follow the subthread that sprung from this paragraph. >> I'm pretty sure more of your time is being wasted on answering the >> emails with objections. Regrettably. > > Yes. And I still consider this a worthwhile investment, if as result > we will become on the same page regarding how the relevant code works > and what should be our expectations from it. I hope to get there as well. >> IIUC the existing protection did very little in practice, all of these >> years: it only stopped some thread calling 'accept-process-output' with >> an explicit reference of a process belonging to another thread. > > That's not "very little" at all. And if some Lisp program called > set-process-thread, it would do exactly what is now done by default. Right. But we don't know of any that did. >> Any kind >> of buggy code would have to try hard to obtain that reference in the >> first place - so that would almost never happen anyway. The new >> protections seem a lot more strict. > > Yes, it is stricter. Because that's the intent, as is clear both from > the code and from the documentation, and also because it makes Lisp > programs much easier to reason about. > > I also don't quite see a disaster this change is causing to code which > disregarded the documented restriction. All we had till now is a > single bug report, which clearly indicates that my change was > incomplete, and is easy to fix. I fail to understand the amount of > excitement due to a change that turns out not to be different from > many other changes we make in development. Spencer also previously mentioned breakage in some existing code, there was an example in another thread (not about native compilation). > Here's one recent example: > > https://lists.gnu.org/archive/html/emacs-devel/2025-09/msg00002.html > > This even broke the build, which is clearly a serious regression. And > the solution was that Mattias quickly installed two followup changes; > case closed. Good job, Mattias! I'm good with fixing this bug report the expedient way. >> TBC I think we should continue to support locking, and your current fix >> seems productive in this regard, but whether locking should be by >> default is not obvious to me - and at least two smart devs who worked >> with Emacs Lisp threads say no. That seems like grounds to re-evaluate. >> Even if we reach the current default we'll have documented our >> conclusions better rather than just saying "this is so". > > I'm not saying we cannot revisit the decision to lock processes to > threads. But given that the original design and implementation > clearly tell us that was the intent, I think we should start from what > we have, and only change it if we decide that the current default is > wrong in too many important cases. For now all we have is 2 cases: > one was solved yesterday by unlocking the process, the other is the > one discussed here, which uncovered a flaw in the change I installed, > and should be easy to fix. That's not enough to declare the idea of > locking wrong as the default, let alone bankrupt. Certainly. > For such a > far-reaching conclusion we need much more data points, of both kinds. > My knowledge of this code and my experience, based on 12 years of > developing, maintaining, and fixing bugs in it, is that letting random > threads read output from subprocesses causes many unexpected behaviors > and thus triggers subtle bugs and non-determinism in seemingly valid > Lisp programs. I have an inkling that even in the single-threaded case accept-process-output can behave unpredictably enough that people use it in a certain way (in a loop, with low timeout). Which could smooth over unpredictabilities across multiple threads as well. There are some exceptions to that, though. > You think differently, but please don't dismiss my > opinions on this too quickly, given my familiarity with the code in > this area and the amount of time I spent on it. Just to be clear, I haven't formed an opinion on this myself yet. Just hoping for a more thorough discussion. Sorry if it comes across differently.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Thu, 04 Sep 2025 04:49:02 GMT) Full text and rfc822 format available.Message #101 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Thu, 04 Sep 2025 07:48:05 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com> > Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev > Date: Wed, 03 Sep 2025 13:07:08 -0400 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > > + /* make_process calls pset_thread, but if the server process is not > > + locked to any thread, we need to undo what make_process did. */ > > + if (NILP (ps->thread)) > > + pset_thread (p, Qnil); > > We should probably just unconditionally do: > pset_thread (p, ps->thread) > here. That's a bit dangerous, since the I/O descriptors of the new process are not yet set, so the call to pset_thread there, in case ps->thread is non-nil, would do a partial job. We could instead move the pset_thread call to later, where set_proc_thread is called, but I preferred to undo what make_process did ASAP. > > > /* Build new contact information for this setup. */ > > contact = Fcopy_sequence (ps->childp); > > @@ -5117,6 +5121,19 @@ server_accept_connection (Lisp_Object server, int channel) > > add_process_read_fd (s); > > if (s > max_desc) > > max_desc = s; > > + /* If the server process is locked to this thread, lock the client > > + process to the same thread, otherwise clear the thread of its I/O > > + descriptors. */ > > + if (NILP (ps->thread)) > > + { > > + eassert (!fd_callback_info[p->infd].thread); > > + set_proc_thread (p, NULL); > > Yes, this is the right assertion. But it should be outside the if/else > because it's true even if ps->thread isn't nil. I don't mind moving it out. > This whole if/else conditional should probably be just: > > eassert (!fd_callback_info[p->infd].thread); > if (!NILP (p->thread)) > { > eassert (XTHREAD (ps->thread) == current_thread); > set_proc_thread (p, XTHREAD (p->thread)); > } > > It's not necessary to call set_proc_thread(p, NULL) in the other case > since it will just set .thread to NULL, which we're already asserting > anyway. The assertions compile to nothing in a production build. Also, I don't like the caller to assume too much about what set_proc_thread does, it's not future-proof. So I'd prefer to leave that call in place.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Thu, 04 Sep 2025 05:20:02 GMT) Full text and rfc822 format available.Message #104 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Thu, 04 Sep 2025 08:18:54 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com> > Cc: i <at> fuzy.me, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > Date: Wed, 03 Sep 2025 14:04:42 -0400 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=79201#100 > > > If you study the code in wait_reading_process_output, you will > > immediately see what kind of godawful mess will that cause with > > reading output from several processes started by several threads. > > > > wait_reading_process_output was written under an implicit assumption > > that there's only one thread that handles all the subprocesses. > > Therefore, it consistently checks all of the I/O descriptors for > > available input from the subprocesses, and consumes that as soon as > > it's available. That cannot possibly work reliably, let alone > > predictably, when several threads are involved. This is so obvious > > from reading the code that I'm astonished I need to even explain it. > > Note that we need to make sure wait_reading_process_output works > reliably in this case for the sake of processes which aren't locked to > threads. So defaulting to locking processes to threads doesn't actually > help with this. Not if we consider unlocked processes a rare and unusual thing to do, in which case programmers that use this are responsible for the dealing with the results. We made such a change in comp.el, but I cannot say I'm too happy with it, and hope we will revisit that later and find a better solution. We do need to make sure nothing disastrous happens in that case, like crashes or EBADF or Emacs hung and unresponsive, but that's just a small part of "works reliably", and is not my primary concern here. My primary concern is that the results of letting processes be unlocked will cause many unexpected and randomly-unpredictable results, which will avert people from using threads. Because who will want to use an infrastructure that makes working Lisp programs unpredictable when run from a non-main thread? > So, to summarize, it sounds like your issue with not locking processes > to threads is: > > Without locking processes to the creating thread, a process's output > can be read by some other thread during the wait, and then our thread > will hang forever if it calls accept-process-output after the wait > ended. Either "hang forever", or fail to work because the expected output is "stolen" by another thread, or run the process filter in the context of another thread (where, for example, things like the current buffer and match-data are different), or some other such unexpected conditions. > This is unfortunately not detailed enough to translate into code. I've > never seen a real-world program which would be vulnerable to this > problem, as described here. I can make up toy programs which meet this > description which would hang, but I'm not sure they're actually what > you're concerned about. Especially because they aren't anything like > real-world programs. I toy program that looks valid and runs in a single-threaded Emacs, but hangs or fails when run from a non-main thread is already a problem. Whether it is interesting for us depends on the program. A toy program which demonstrates an issue that real-world programs will meet is interesting. In any case, it is clear to me, and should be clear to anyone else, that taking a Lisp program written for a single-threaded Emacs and running it from a non-main thread will be made easier if the output from the sub-processes started by that program will be read only by that thread, because that is closer to the assumptions under which the original code was written. > Could you give some more details about the scenario, so I can try to > write a test demonstrating the problem? Or could you point to a > real-world program which is vulnerable to this problem, so I can turn it > into a test? I've never used or debugged a real-world program where these aspects come into play. My experience is almost exclusively with small "toy" programs and test programs, the ones we have in the test suite and also those submitted as part of relevant bug reports. You may wish looking at running the Gnus function that fetches articles from a separate thread (I think someone tried that at some point), and you may wish talking to Michael Albinus, who tried using threads in Tramp (I think he has that on a branch somewhere?).
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Thu, 04 Sep 2025 16:42:03 GMT) Full text and rfc822 format available.Message #107 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Spencer Baugh <sbaugh <at> janestreet.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Thu, 04 Sep 2025 12:41:04 -0400
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Spencer Baugh <sbaugh <at> janestreet.com> >> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev >> Date: Wed, 03 Sep 2025 13:07:08 -0400 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> > + /* make_process calls pset_thread, but if the server process is not >> > + locked to any thread, we need to undo what make_process did. */ >> > + if (NILP (ps->thread)) >> > + pset_thread (p, Qnil); >> >> We should probably just unconditionally do: >> pset_thread (p, ps->thread) >> here. > > That's a bit dangerous, since the I/O descriptors of the new process > are not yet set, so the call to pset_thread there, in case ps->thread > is non-nil, would do a partial job. We could instead move the > pset_thread call to later, where set_proc_thread is called, but I > preferred to undo what make_process did ASAP. I'm confused. - It's just duplicating what make_process did, so I expect it's essentially a no-op, just simpler code. - Why would it be dangerous even if that wasn't the case? We're still setting up this process, it's not visible to any Lisp code, so what bad event could even happen? >> >> > /* Build new contact information for this setup. */ >> > contact = Fcopy_sequence (ps->childp); >> > @@ -5117,6 +5121,19 @@ server_accept_connection (Lisp_Object server, int channel) >> > add_process_read_fd (s); >> > if (s > max_desc) >> > max_desc = s; >> > + /* If the server process is locked to this thread, lock the client >> > + process to the same thread, otherwise clear the thread of its I/O >> > + descriptors. */ >> > + if (NILP (ps->thread)) >> > + { >> > + eassert (!fd_callback_info[p->infd].thread); >> > + set_proc_thread (p, NULL); >> >> Yes, this is the right assertion. But it should be outside the if/else >> because it's true even if ps->thread isn't nil. > > I don't mind moving it out. Great. >> This whole if/else conditional should probably be just: >> >> eassert (!fd_callback_info[p->infd].thread); >> if (!NILP (p->thread)) >> { >> eassert (XTHREAD (ps->thread) == current_thread); >> set_proc_thread (p, XTHREAD (p->thread)); >> } >> >> It's not necessary to call set_proc_thread(p, NULL) in the other case >> since it will just set .thread to NULL, which we're already asserting >> anyway. > > The assertions compile to nothing in a production build. Then maybe we should turn this assertion on by default in production builds, since threads are still encountering problems, and it would help catch problems. Zhengyi says he saw in a debugger this field was set to non-NULL at this point, which indicates a bug, so it would help tracking that down. > Also, I don't like the caller to assume too much about what > set_proc_thread does, it's not future-proof. So I'd prefer to leave > that call in place. That's fair. Though, I think the way set_proc_thread works is a bit confusing, so I'm inclined to refactor it in the near future anyway. But, the reason I'd like to take out the call is that it will mask errors created elsewhere by clearing the field.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Thu, 04 Sep 2025 19:23:02 GMT) Full text and rfc822 format available.Message #110 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Thu, 04 Sep 2025 22:22:29 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com> > Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev > Date: Thu, 04 Sep 2025 12:41:04 -0400 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > >> We should probably just unconditionally do: > >> pset_thread (p, ps->thread) > >> here. > > > > That's a bit dangerous, since the I/O descriptors of the new process > > are not yet set, so the call to pset_thread there, in case ps->thread > > is non-nil, would do a partial job. We could instead move the > > pset_thread call to later, where set_proc_thread is called, but I > > preferred to undo what make_process did ASAP. > > I'm confused. > > - It's just duplicating what make_process did, so I expect it's > essentially a no-op, just simpler code. > > - Why would it be dangerous even if that wasn't the case? We're still > setting up this process, it's not visible to any Lisp code, so what > bad event could even happen? Having some part of server_accept_connection, even a small part, run with the new process locked to a thread introduces a window during which the process is in an incorrect state, and I would like to avoid that if possible. Admittedly, in this case the window is very small, but it is still there. Some future changes could actually trip on that. > >> It's not necessary to call set_proc_thread(p, NULL) in the other case > >> since it will just set .thread to NULL, which we're already asserting > >> anyway. > > > > The assertions compile to nothing in a production build. > > Then maybe we should turn this assertion on by default in production > builds, since threads are still encountering problems, and it would help > catch problems. Production builds are not for catching problems, at least not in my book. A production build should try to avoid aborts if possible, because an abort might mean loss of edits. That's why we have eassert to begin with: we use the period of time that code is on the development branch to catch errors and fix them. > Zhengyi says he saw in a debugger this field was set to > non-NULL at this point, which indicates a bug, so it would help tracking > that down. Building with --enable-checking will allow us to do that.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Thu, 04 Sep 2025 19:31:01 GMT) Full text and rfc822 format available.Message #113 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Spencer Baugh <sbaugh <at> janestreet.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Thu, 04 Sep 2025 15:30:30 -0400
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Spencer Baugh <sbaugh <at> janestreet.com> >> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev >> Date: Thu, 04 Sep 2025 12:41:04 -0400 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> >> We should probably just unconditionally do: >> >> pset_thread (p, ps->thread) >> >> here. >> > >> > That's a bit dangerous, since the I/O descriptors of the new process >> > are not yet set, so the call to pset_thread there, in case ps->thread >> > is non-nil, would do a partial job. We could instead move the >> > pset_thread call to later, where set_proc_thread is called, but I >> > preferred to undo what make_process did ASAP. >> >> I'm confused. >> >> - It's just duplicating what make_process did, so I expect it's >> essentially a no-op, just simpler code. >> >> - Why would it be dangerous even if that wasn't the case? We're still >> setting up this process, it's not visible to any Lisp code, so what >> bad event could even happen? > > Having some part of server_accept_connection, even a small part, run > with the new process locked to a thread introduces a window during > which the process is in an incorrect state, and I would like to avoid > that if possible. Admittedly, in this case the window is very small, > but it is still there. Some future changes could actually trip on > that. It's already locked to a thread by make_process.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Thu, 04 Sep 2025 22:48:02 GMT) Full text and rfc822 format available.Message #116 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Dmitry Gutov <dmitry <at> gutov.dev> To: Eli Zaretskii <eliz <at> gnu.org>, Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Fri, 5 Sep 2025 01:47:00 +0300
On 04/09/2025 08:18, Eli Zaretskii wrote: >> This is unfortunately not detailed enough to translate into code. I've >> never seen a real-world program which would be vulnerable to this >> problem, as described here. I can make up toy programs which meet this >> description which would hang, but I'm not sure they're actually what >> you're concerned about. Especially because they aren't anything like >> real-world programs. > > I toy program that looks valid and runs in a single-threaded Emacs, > but hangs or fails when run from a non-main thread is already a > problem. Whether it is interesting for us depends on the program. A > toy program which demonstrates an issue that real-world programs will > meet is interesting. A toy program could be a good example if you can choose a specific one and point out how it related to a real program (e.g. being a simplified case). We could take such a program and see whether it indeed exhibits issues multi-threaded (when unlocked) but stays reliable in the single threaded case and multi-threaded when locked. Both aspects seem testable. >> Could you give some more details about the scenario, so I can try to >> write a test demonstrating the problem? Or could you point to a >> real-world program which is vulnerable to this problem, so I can turn it >> into a test? > > I've never used or debugged a real-world program where these aspects > come into play. My experience is almost exclusively with small "toy" > programs and test programs, the ones we have in the test suite and > also those submitted as part of relevant bug reports. You may wish > looking at running the Gnus function that fetches articles from a > separate thread (I think someone tried that at some point), and you > may wish talking to Michael Albinus, who tried using threads in Tramp > (I think he has that on a branch somewhere?). There two are interesting. Last I heard, Tramp's branch wasn't stable for everyday use (citing hangs, IIRC), but it has a set-process-thread call, locking connection's process to thread explicitly: https://cgit.git.savannah.gnu.org/cgit/emacs.git/tree/lisp/net/tramp.el?h=feature/tramp-thread-safe#n2759 For Gnus, I can't find a branch in Savannah, but Dick Mao's Emacs fork claims non-blocking Gnus among features. And it unlocks nnimap's process: https://github.com/commercial-emacs/commercial-emacs/blob/6d3ff0428337a734136d00cf2294aa9d3dbcff0e/lisp/gnus/nnimap.el#L592 Maybe someone could test it out.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Fri, 05 Sep 2025 05:55:01 GMT) Full text and rfc822 format available.Message #119 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Spencer Baugh <sbaugh <at> janestreet.com> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Fri, 05 Sep 2025 08:54:11 +0300
> From: Spencer Baugh <sbaugh <at> janestreet.com> > Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev > Date: Thu, 04 Sep 2025 15:30:30 -0400 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > >> From: Spencer Baugh <sbaugh <at> janestreet.com> > >> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev > >> Date: Thu, 04 Sep 2025 12:41:04 -0400 > >> > >> Eli Zaretskii <eliz <at> gnu.org> writes: > >> > >> >> We should probably just unconditionally do: > >> >> pset_thread (p, ps->thread) > >> >> here. > >> > > >> > That's a bit dangerous, since the I/O descriptors of the new process > >> > are not yet set, so the call to pset_thread there, in case ps->thread > >> > is non-nil, would do a partial job. We could instead move the > >> > pset_thread call to later, where set_proc_thread is called, but I > >> > preferred to undo what make_process did ASAP. > >> > >> I'm confused. > >> > >> - It's just duplicating what make_process did, so I expect it's > >> essentially a no-op, just simpler code. > >> > >> - Why would it be dangerous even if that wasn't the case? We're still > >> setting up this process, it's not visible to any Lisp code, so what > >> bad event could even happen? > > > > Having some part of server_accept_connection, even a small part, run > > with the new process locked to a thread introduces a window during > > which the process is in an incorrect state, and I would like to avoid > > that if possible. Admittedly, in this case the window is very small, > > but it is still there. Some future changes could actually trip on > > that. > > It's already locked to a thread by make_process. I feel that we are splitting hair, so I went ahead and installed the last agreed-to version of the patch.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Fri, 05 Sep 2025 06:53:02 GMT) Full text and rfc822 format available.Message #122 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Zhengyi Fu <i <at> fuzy.me> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, Spencer Baugh <sbaugh <at> janestreet.com>, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Fri, 05 Sep 2025 14:51:42 +0800
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Spencer Baugh <sbaugh <at> janestreet.com> >> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev >> Date: Thu, 04 Sep 2025 15:30:30 -0400 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> >> From: Spencer Baugh <sbaugh <at> janestreet.com> >> >> Cc: i <at> fuzy.me, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev >> >> Date: Thu, 04 Sep 2025 12:41:04 -0400 >> >> >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> >> >> >> We should probably just unconditionally do: >> >> >> pset_thread (p, ps->thread) >> >> >> here. >> >> > >> >> > That's a bit dangerous, since the I/O descriptors of the new process >> >> > are not yet set, so the call to pset_thread there, in case ps->thread >> >> > is non-nil, would do a partial job. We could instead move the >> >> > pset_thread call to later, where set_proc_thread is called, but I >> >> > preferred to undo what make_process did ASAP. >> >> >> >> I'm confused. >> >> >> >> - It's just duplicating what make_process did, so I expect it's >> >> essentially a no-op, just simpler code. >> >> >> >> - Why would it be dangerous even if that wasn't the case? We're still >> >> setting up this process, it's not visible to any Lisp code, so what >> >> bad event could even happen? >> > >> > Having some part of server_accept_connection, even a small part, run >> > with the new process locked to a thread introduces a window during >> > which the process is in an incorrect state, and I would like to avoid >> > that if possible. Admittedly, in this case the window is very small, >> > but it is still there. Some future changes could actually trip on >> > that. >> >> It's already locked to a thread by make_process. > > I feel that we are splitting hair, so I went ahead and installed the > last agreed-to version of the patch. I got an assertion failure when testing the latest master.
[Message part 2 (text/plain, inline)]
Thread 1 "emacs" hit Breakpoint 1, die (msg=msg <at> entry=0x55555588fd38 "!fd_callback_info[p->infd].thread", file=file <at> entry=0x555555879464 "process.c", line=line <at> entry=5127) at alloc.c:7302 7302 { (gdb) bt #0 die (msg=msg <at> entry=0x55555588fd38 "!fd_callback_info[p->infd].thread", file=file <at> entry=0x555555879464 "process.c", line=line <at> entry=5127) at alloc.c:7302 #1 0x0000555555806a6a in server_accept_connection (channel=14, server=0x555556bce4bd) at process.c:5127 #2 wait_reading_process_output (time_limit=time_limit <at> entry=30, nsecs=nsecs <at> entry=0, read_kbd=read_kbd <at> entry=-1, do_display=do_display <at> entry=true, wait_for_cell=wait_for_cell <at> entry=0x0, wait_proc=wait_proc <at> entry=0x0, just_wait_proc=0) at process.c:5989 #3 0x00005555555a3188 in sit_for (timeout=<optimized out>, reading=reading <at> entry=true, display_option=display_option <at> entry=1) at dispnew.c:6978 #4 0x000055555570b8ba in read_char (commandflag=1, map=map <at> entry=0x555556aa02b3, prev_event=0x0, used_mouse_menu=used_mouse_menu <at> entry=0x7fffffffd8bb, end_time=end_time <at> entry=0x0) at keyboard.c:2925 #5 0x000055555570e164 in read_key_sequence (keybuf=keybuf <at> entry=0x7fffffffda10, prompt=prompt <at> entry=0x0, dont_downcase_last=dont_downcase_last <at> entry=false, can_return_switch_frame=can_return_switch_frame <at> entry=true, fix_current_buffer=fix_current_buffer <at> entry=true, prevent_redisplay=prevent_redisplay <at> entry=false, disable_text_conversion_p=false) at keyboard.c:11146 #6 0x000055555570feda in command_loop_1 () at keyboard.c:1424 #7 0x00005555557a5ba8 in internal_condition_case (bfun=bfun <at> entry=0x55555570fcb2 <command_loop_1>, handlers=handlers <at> entry=0x90, hfun=hfun <at> entry=0x555555700325 <cmd_error>) at eval.c:1690 #8 0x00005555556f8b3b in command_loop_2 (handlers=handlers <at> entry=0x90) at keyboard.c:1163 #9 0x00005555557a5aa7 in internal_catch (tag=tag <at> entry=0x118e0, func=func <at> entry=0x5555556f8b14 <command_loop_2>, arg=arg <at> entry=0x90) at eval.c:1370 #10 0x00005555556f8af1 in command_loop () at keyboard.c:1141 #11 0x00005555556ffdf1 in recursive_edit_1 () at keyboard.c:749 #12 0x0000555555700181 in Frecursive_edit () at keyboard.c:832 #13 0x00005555556f8650 in main (argc=6, argv=<optimized out>) at emacs.c:2629 (gdb) fr 1 #1 0x0000555555806a6a in server_accept_connection (channel=14, server=0x555556bce4bd) at process.c:5127 5127 eassert (!fd_callback_info[p->infd].thread); (gdb) info locals buffer = 0x0 contact = 0x555556a903e3 host = 0x30 service = <optimized out> ps = <optimized out> p = 0x555556bd4c08 s = 16 saddr = {sa = {sa_family = 1, sa_data = "/run/user/1504"}, in = {sin_family = 1, sin_port = 29231, sin_addr = {s_addr = 1966042741}, sin_zero = "ser/1504"}, in6 = {sin6_family = 1, sin6_port = 29231, sin6_flowinfo = 1966042741, sin6_addr = {__in6_u = {__u6_addr8 = "ser/150425139/em", __u6_addr16 = {25971, 12146, 13617, 13360, 13618, 13105, 12089, 28005}, __u6_addr32 = {796026227, 875574577, 858862898, 1835347769}}}, sin6_scope_id = 796091233}, un = {sun_family = 1, sun_path = "/run/user/150425139/emacs/server\000UUU\000\000\000\000\000\000\000\000\000\000\370?UUU\000\000?pUUU\000\000\300\207\272h\000\000\000\000\001", '\000' <repeats 36 times>}} len = 35 count = <optimized out> args = {0x7fffffffcf84, 0x55555686ace4, 0xa, 0x5724a633450d2100, 0x7fffffffd190, 0x555555a714e0, 0x555555c2a7f0, 0x0, 0x400000000a000000, 0x400000003f000000, 0x7fffffffd1b0} nargs = <optimized out> host_format_in = 0x7fffffffd004 host_format_in6 = 0x7fffffffcfe4 procname_format_in = 0x7fffffffcfc4 procname_format_in6 = 0x7fffffffcfa4 procname_format_default = 0x7fffffffcf84 name = <optimized out> proc = 0x555556bd4c0d dash = <optimized out> nl = <optimized out> host_string = <optimized out> open_from = <optimized out> (gdb) p *p $3 = {header = {size = 4611686018578444307}, tty_name = 0x0, name = 0x555556b274c4, command = 0x0, filter = 0xef8cc0, sentinel = 0xef74e0, log = 0x0, buffer = 0x0, childp = 0x555556a903e3, plist = 0x555556a90163, type = 0xd4d0, mark = 0x555556bd4d25, status = 0xf810, decode_coding_system = 0x0, decoding_buf = 0x0, encode_coding_system = 0x0, encoding_buf = 0x0, write_queue = 0x0, stderrproc = 0x0, thread = 0x5555558eb005 <main_thread+5>, pid = 0, infd = 16, nbytes_read = 0, outfd = 16, open_fd = {16, -1, -1, -1, -1, -1}, tick = 0, update_tick = 0, decoding_carryover = 0, read_output_delay = 0, adaptive_read_buffering = 0, read_output_skip = false, readmax = 65536, kill_without_query = false, pty_in = false, pty_out = false, inherit_coding_system_flag = false, alive = false, raw_status_new = false, is_non_blocking_client = false, is_server = false, raw_status = 0, backlog = 0, port = 0, socktype = 0, dns_request = 0x0} (gdb) p fd_callback_info[16] $4 = {func = 0x0, data = 0x0, flags = 9, thread = 0x555556c5d598, waiting_thread = 0x0}
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Fri, 05 Sep 2025 10:42:02 GMT) Full text and rfc822 format available.Message #125 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Michael Albinus <michael.albinus <at> gmx.de> To: Dmitry Gutov <dmitry <at> gutov.dev> Cc: i <at> fuzy.me, Spencer Baugh <sbaugh <at> janestreet.com>, Eli Zaretskii <eliz <at> gnu.org>, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Fri, 05 Sep 2025 12:40:52 +0200
Dmitry Gutov <dmitry <at> gutov.dev> writes: Hi, > Last I heard, Tramp's branch wasn't stable for everyday use (citing > hangs, IIRC), but it has a set-process-thread call, locking > connection's process to thread explicitly: > > https://cgit.git.savannah.gnu.org/cgit/emacs.git/tree/lisp/net/tramp.el?h=feature/tramp-thread-safe#n2759 It's years ago, but AFAIR the knockout for me was bug#32426 (merged with bug#25214). And there are further open thread-related bugs from that time I have marked locally: bug#32487, bug#33135, bug#41220. Best regards, Michael.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Fri, 05 Sep 2025 10:43:02 GMT) Full text and rfc822 format available.Message #128 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Zhengyi Fu <i <at> fuzy.me> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Fri, 05 Sep 2025 13:42:09 +0300
> From: Zhengyi Fu <i <at> fuzy.me> > Cc: Spencer Baugh <sbaugh <at> janestreet.com>, i <at> fuzy.me, > 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev > Date: Fri, 05 Sep 2025 14:51:42 +0800 > > > I feel that we are splitting hair, so I went ahead and installed the > > last agreed-to version of the patch. > > I got an assertion failure when testing the latest master. What is the recipe for reproducing this?
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Fri, 05 Sep 2025 11:19:01 GMT) Full text and rfc822 format available.Message #131 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: i <at> fuzy.me, sbaugh <at> janestreet.com, dmitry <at> gutov.dev Cc: 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Fri, 05 Sep 2025 14:18:07 +0300
> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > Date: Fri, 05 Sep 2025 13:42:09 +0300 > From: Eli Zaretskii <eliz <at> gnu.org> > > > From: Zhengyi Fu <i <at> fuzy.me> > > Cc: Spencer Baugh <sbaugh <at> janestreet.com>, i <at> fuzy.me, > > 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev > > Date: Fri, 05 Sep 2025 14:51:42 +0800 > > > > > I feel that we are splitting hair, so I went ahead and installed the > > > last agreed-to version of the patch. > > > > I got an assertion failure when testing the latest master. > > What is the recipe for reproducing this? Also, does the below fix the problem? diff --git a/src/process.c b/src/process.c index fa003c2..e39e02f 100644 --- a/src/process.c +++ b/src/process.c @@ -4831,7 +4831,11 @@ deactivate_process (Lisp_Object proc) /* Beware SIGCHLD hereabouts. */ for (i = 0; i < PROCESS_OPEN_FDS; i++) - close_process_fd (&p->open_fd[i]); + { + close_process_fd (&p->open_fd[i]); + fd_callback_info[p->open_fd[i]].thread = NULL; + fd_callback_info[p->open_fd[i]].waiting_thread = NULL; + } inchannel = p->infd; eassert (inchannel < FD_SETSIZE);
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Fri, 05 Sep 2025 13:12:02 GMT) Full text and rfc822 format available.Message #134 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Zhengyi Fu <i <at> fuzy.me> To: Eli Zaretskii <eliz <at> gnu.org> Cc: dmitry <at> gutov.dev, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Fri, 05 Sep 2025 21:11:22 +0800
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Zhengyi Fu <i <at> fuzy.me> >> Cc: Spencer Baugh <sbaugh <at> janestreet.com>, i <at> fuzy.me, >> 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev >> Date: Fri, 05 Sep 2025 14:51:42 +0800 >> >> > I feel that we are splitting hair, so I went ahead and installed the >> > last agreed-to version of the patch. >> >> I got an assertion failure when testing the latest master. > > What is the recipe for reproducing this? The same steps as I described in the original bug report. It repeated the steps a few times beforing triggering the assertion failure.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Fri, 05 Sep 2025 13:20:02 GMT) Full text and rfc822 format available.Message #137 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Zhengyi Fu <i <at> fuzy.me> To: Eli Zaretskii <eliz <at> gnu.org> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Fri, 05 Sep 2025 21:19:02 +0800
Eli Zaretskii <eliz <at> gnu.org> writes: >> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org >> Date: Fri, 05 Sep 2025 13:42:09 +0300 >> From: Eli Zaretskii <eliz <at> gnu.org> >> >> > From: Zhengyi Fu <i <at> fuzy.me> >> > Cc: Spencer Baugh <sbaugh <at> janestreet.com>, i <at> fuzy.me, >> > 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev >> > Date: Fri, 05 Sep 2025 14:51:42 +0800 >> > >> > > I feel that we are splitting hair, so I went ahead and installed the >> > > last agreed-to version of the patch. >> > >> > I got an assertion failure when testing the latest master. >> >> What is the recipe for reproducing this? > > Also, does the below fix the problem? > > diff --git a/src/process.c b/src/process.c > index fa003c2..e39e02f 100644 > --- a/src/process.c > +++ b/src/process.c > @@ -4831,7 +4831,11 @@ deactivate_process (Lisp_Object proc) > /* Beware SIGCHLD hereabouts. */ > > for (i = 0; i < PROCESS_OPEN_FDS; i++) > - close_process_fd (&p->open_fd[i]); > + { > + close_process_fd (&p->open_fd[i]); > + fd_callback_info[p->open_fd[i]].thread = NULL; > + fd_callback_info[p->open_fd[i]].waiting_thread = NULL; > + } > > inchannel = p->infd; > eassert (inchannel < FD_SETSIZE); No. I can still reproduce this after applying the patch.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Fri, 05 Sep 2025 13:27:03 GMT) Full text and rfc822 format available.Message #140 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Zhengyi Fu <i <at> fuzy.me> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Fri, 05 Sep 2025 16:26:04 +0300
> From: Zhengyi Fu <i <at> fuzy.me> > Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > Date: Fri, 05 Sep 2025 21:19:02 +0800 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > >> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > >> Date: Fri, 05 Sep 2025 13:42:09 +0300 > >> From: Eli Zaretskii <eliz <at> gnu.org> > >> > >> > From: Zhengyi Fu <i <at> fuzy.me> > >> > Cc: Spencer Baugh <sbaugh <at> janestreet.com>, i <at> fuzy.me, > >> > 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev > >> > Date: Fri, 05 Sep 2025 14:51:42 +0800 > >> > > >> > > I feel that we are splitting hair, so I went ahead and installed the > >> > > last agreed-to version of the patch. > >> > > >> > I got an assertion failure when testing the latest master. > >> > >> What is the recipe for reproducing this? > > > > Also, does the below fix the problem? > > > > diff --git a/src/process.c b/src/process.c > > index fa003c2..e39e02f 100644 > > --- a/src/process.c > > +++ b/src/process.c > > @@ -4831,7 +4831,11 @@ deactivate_process (Lisp_Object proc) > > /* Beware SIGCHLD hereabouts. */ > > > > for (i = 0; i < PROCESS_OPEN_FDS; i++) > > - close_process_fd (&p->open_fd[i]); > > + { > > + close_process_fd (&p->open_fd[i]); > > + fd_callback_info[p->open_fd[i]].thread = NULL; > > + fd_callback_info[p->open_fd[i]].waiting_thread = NULL; > > + } > > > > inchannel = p->infd; > > eassert (inchannel < FD_SETSIZE); > > No. I can still reproduce this after applying the patch. Thanks, then please show the value of the offending descriptor which causes the assertion violation. For that, please run Emacs udner GDB, perform the recipe that reproduces the problem, and then type these GDB commands: (gdb) frame 1 (gdb) print p->infd and tell what that produces.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Fri, 05 Sep 2025 13:37:02 GMT) Full text and rfc822 format available.Message #143 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Zhengyi Fu <i <at> fuzy.me> To: Eli Zaretskii <eliz <at> gnu.org> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Fri, 05 Sep 2025 21:36:16 +0800
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Zhengyi Fu <i <at> fuzy.me> >> Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org >> Date: Fri, 05 Sep 2025 21:19:02 +0800 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> >> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org >> >> Date: Fri, 05 Sep 2025 13:42:09 +0300 >> >> From: Eli Zaretskii <eliz <at> gnu.org> >> >> >> >> > From: Zhengyi Fu <i <at> fuzy.me> >> >> > Cc: Spencer Baugh <sbaugh <at> janestreet.com>, i <at> fuzy.me, >> >> > 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev >> >> > Date: Fri, 05 Sep 2025 14:51:42 +0800 >> >> > >> >> > > I feel that we are splitting hair, so I went ahead and installed the >> >> > > last agreed-to version of the patch. >> >> > >> >> > I got an assertion failure when testing the latest master. >> >> >> >> What is the recipe for reproducing this? >> > >> > Also, does the below fix the problem? >> > >> > diff --git a/src/process.c b/src/process.c >> > index fa003c2..e39e02f 100644 >> > --- a/src/process.c >> > +++ b/src/process.c >> > @@ -4831,7 +4831,11 @@ deactivate_process (Lisp_Object proc) >> > /* Beware SIGCHLD hereabouts. */ >> > >> > for (i = 0; i < PROCESS_OPEN_FDS; i++) >> > - close_process_fd (&p->open_fd[i]); >> > + { >> > + close_process_fd (&p->open_fd[i]); >> > + fd_callback_info[p->open_fd[i]].thread = NULL; >> > + fd_callback_info[p->open_fd[i]].waiting_thread = NULL; >> > + } >> > >> > inchannel = p->infd; >> > eassert (inchannel < FD_SETSIZE); >> >> No. I can still reproduce this after applying the patch. > > Thanks, then please show the value of the offending descriptor which > causes the assertion violation. For that, please run Emacs udner GDB, > perform the recipe that reproduces the problem, and then type these > GDB commands: > > (gdb) frame 1 > (gdb) print p->infd > > and tell what that produces. The p->infd value is 8.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Fri, 05 Sep 2025 14:40:06 GMT) Full text and rfc822 format available.Message #146 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Zhengyi Fu <i <at> fuzy.me> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Fri, 05 Sep 2025 17:39:10 +0300
> From: Zhengyi Fu <i <at> fuzy.me> > Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > Date: Fri, 05 Sep 2025 21:36:16 +0800 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > > Thanks, then please show the value of the offending descriptor which > > causes the assertion violation. For that, please run Emacs udner GDB, > > perform the recipe that reproduces the problem, and then type these > > GDB commands: > > > > (gdb) frame 1 > > (gdb) print p->infd > > > > and tell what that produces. > > The p->infd value is 8. OK, please repeat this several times to make sure it's always 8. If it is, then let's try to see which code leaves the .thread field uncleared. Like this (please run from the src directory where you have the 'emacs' executable you have built): $ gdb ./emacs ... (gdb) break process.c:8731 (gdb) run Emacs will start and stop in init_process_emacs, on line 8731 of process.c. Then: (gdb) watch fd_callback_info[8].thread (gdb) commands > bt 5 > continue > end (gdb) continue The above sets a watchpoint which will stop Emacs each time we assign some value to fd_callback_info[8].thread, and GDB will then show the first 5 frames of the call-stack, then continue. Now run your recipe, until it gets SIGABRT, and post everything GDB produced until that point. I hope that will tell us which code fails to clear the .thread member. Thanks.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 05:37:02 GMT) Full text and rfc822 format available.Message #149 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Zhengyi Fu <i <at> fuzy.me> To: Eli Zaretskii <eliz <at> gnu.org> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Sat, 06 Sep 2025 13:35:49 +0800
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Zhengyi Fu <i <at> fuzy.me> >> Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org >> Date: Fri, 05 Sep 2025 21:36:16 +0800 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> > Thanks, then please show the value of the offending descriptor which >> > causes the assertion violation. For that, please run Emacs udner GDB, >> > perform the recipe that reproduces the problem, and then type these >> > GDB commands: >> > >> > (gdb) frame 1 >> > (gdb) print p->infd >> > >> > and tell what that produces. >> >> The p->infd value is 8. > > OK, please repeat this several times to make sure it's always 8. > > If it is, then let's try to see which code leaves the .thread field > uncleared. Like this (please run from the src directory where you > have the 'emacs' executable you have built): > > $ gdb ./emacs > ... > (gdb) break process.c:8731 > (gdb) run > > Emacs will start and stop in init_process_emacs, on line 8731 of > process.c. Then: > > (gdb) watch fd_callback_info[8].thread > (gdb) commands > > bt 5 > > continue > > end > (gdb) continue > > The above sets a watchpoint which will stop Emacs each time we assign > some value to fd_callback_info[8].thread, and GDB will then show the > first 5 frames of the call-stack, then continue. Now run your recipe, > until it gets SIGABRT, and post everything GDB produced until that > point. I hope that will tell us which code fails to clear the .thread > member. > > Thanks. The fd is 8 when running Emacs in terminal. The fd is always 16 when running Emacs in X11. GNU gdb (Fedora Linux) 16.3-1.fc42 Copyright (C) 2024 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-redhat-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <https://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from ./emacs... SIGINT is used by the debugger. Are you sure you want to change it? (y or n) [answered Y; input not from terminal] DISPLAY = :0.0 TERM = foot Breakpoint 1 at 0x556a1d: file emacs.c, line 442. Breakpoint 2 at 0x5211c6: file xterm.c, line 27074. Breakpoint 3 at 0x600880: file eval.c, line 3108. (gdb) break process.c:8731 Breakpoint 4 at 0x6610d8: file process.c, line 8732. (gdb) run Starting program: /home/zhengyi/src/emacs_master/src/emacs -Q -l /home/zhengyi/.config/emacs/straight/repos/straight.el/bootstrap.el --eval ' '\(progn\ ' '\(straight-use-package\ \'diff-hl\)' '\(straight-use-package\ \'magit\)' '\(setq\ diff-hl-update-async\ t\)' '\(global-diff-hl-mode\)' '\) [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". Breakpoint 4, init_process_emacs (sockfd=sockfd <at> entry=-1) at process.c:8732 8732 num_pending_connects = 0; Missing rpms, try: dnf --enablerepo='*debug*' install gtk3-debuginfo-3.24.49-2.fc42.x86_64 zlib-ng-compat-debuginfo-2.2.5-2.fc42.x86_64 pango-debuginfo-1.56.4-1.fc42.x86_64 harfbuzz-debuginfo-10.4.0-1.fc42.x86_64 atk-debuginfo-2.56.3-1.fc42.x86_64 cairo-gobject-debuginfo-1.18.2-3.fc42.x86_64 cairo-debuginfo-1.18.2-3.fc42.x86_64 gdk-pixbuf2-debuginfo-2.42.12-12.fc42.x86_64 glib2-debuginfo-2.84.4-1.fc42.x86_64 libX11-debuginfo-1.8.12-1.fc42.x86_64 libX11-xcb-debuginfo-1.8.12-1.fc42.x86_64 libxcb-debuginfo-1.17.0-5.fc42.x86_64 libXrandr-debuginfo-1.5.4-5.fc42.x86_64 libXinerama-debuginfo-1.1.5-8.fc42.x86_64 libXfixes-debuginfo-6.0.1-5.fc42.x86_64 libXext-debuginfo-1.3.6-3.fc42.x86_64 ncurses-libs-debuginfo-6.5-5.20250125.fc42.x86_64 glibc-debuginfo-2.41-11.fc42.x86_64 gmp-debuginfo-6.3.0-4.fc42.x86_64 libXcomposite-debuginfo-0.4.6-5.fc42.x86_64 fontconfig-debuginfo-2.16.0-2.fc42.x86_64 fribidi-debuginfo-1.0.16-2.fc42.x86_64 libepoxy-debuginfo-1.5.10-9.fc42.x86_64 libXi-debuginfo-1.8.2-2.fc42.x86_64 at-spi2-atk-debuginfo-2.56.3-1.fc42.x86_64 libcloudproviders-debuginfo-0.3.6-1.fc42.x86_64 libtinysparql-debuginfo-3.9.2-1.fc42.x86_64 libwayland-client-debuginfo-1.24.0-1.fc42.x86_64 libxkbcommon-debuginfo-1.8.1-1.fc42.x86_64 libwayland-cursor-debuginfo-1.24.0-1.fc42.x86_64 libwayland-egl-debuginfo-1.24.0-1.fc42.x86_64 libXcursor-debuginfo-1.2.3-2.fc42.x86_64 libXdamage-debuginfo-1.1.6-5.fc42.x86_64 libthai-debuginfo-0.1.29-10.fc42.x86_64 freetype-debuginfo-2.13.3-2.fc42.x86_64 graphite2-debuginfo-1.3.14-18.fc42.x86_64 libpng-debuginfo-1.6.44-2.fc42.x86_64 libXrender-debuginfo-0.9.12-2.fc42.x86_64 pixman-debuginfo-0.46.2-1.fc42.x86_64 libjpeg-turbo-debuginfo-3.1.0-2.fc42.x86_64 libmount-debuginfo-2.40.4-7.fc42.x86_64 libselinux-debuginfo-3.8-2.fc42.x86_64 libffi-debuginfo-3.4.6-5.fc42.x86_64 pcre2-debuginfo-10.45-1.fc42.x86_64 libXau-debuginfo-1.0.12-2.fc42.x86_64 libxml2-debuginfo-2.12.10-1.fc42.x86_64 at-spi2-core-debuginfo-2.56.3-1.fc42.x86_64 dbus-libs-debuginfo-1.16.0-3.fc42.x86_64 libgcc-debuginfo-15.2.1-1.fc42.x86_64 json-glib-debuginfo-1.10.6-2.fc42.x86_64 sqlite-libs-debuginfo-3.47.2-2.fc42.x86_64 libdatrie-debuginfo-0.2.13-11.fc42.x86_64 bzip2-libs-debuginfo-1.0.8-20.fc42.x86_64 libbrotli-debuginfo-1.1.0-6.fc42.x86_64 libblkid-debuginfo-2.40.4-7.fc42.x86_64 xz-libs-debuginfo-5.8.1-2.fc42.x86_64 systemd-libs-debuginfo-257.7-1.fc42.x86_64 libcap-debuginfo-2.73-2.fc42.x86_64 (gdb) watch fd_callback_info[16].thread Hardware watchpoint 5: fd_callback_info[16].thread (gdb) commands Type commands for breakpoint(s) 5, one per line. End with a line saying just "end". >bt 5 >continue >end (gdb) continue Continuing. [New Thread 0x7fffe6bfe6c0 (LWP 741835)] [New Thread 0x7fffe63fd6c0 (LWP 741836)] [New Thread 0x7fffe5bfc6c0 (LWP 741837)] [New Thread 0x7fffe4fff6c0 (LWP 741838)] [New Thread 0x7fffd7fff6c0 (LWP 741839)] [New Thread 0x7fffd77fe6c0 (LWP 741840)] [Detaching after vfork from child process 741841] [Detaching after vfork from child process 741842] [Detaching after vfork from child process 741843] [Detaching after vfork from child process 741845] [Detaching after vfork from child process 741846] [Detaching after vfork from child process 741847] [Detaching after vfork from child process 741848] [Detaching after vfork from child process 741849] [Detaching after vfork from child process 741850] [Detaching after vfork from child process 741851] [Detaching after vfork from child process 741852] [Detaching after vfork from child process 741853] [Detaching after vfork from child process 741854] [Detaching after vfork from child process 741865] [Detaching after vfork from child process 741866] [Detaching after vfork from child process 741867] [Detaching after vfork from child process 741868] [Detaching after vfork from child process 741869] [Detaching after vfork from child process 741870] [Detaching after vfork from child process 741871] [Detaching after vfork from child process 741872] [Detaching after vfork from child process 741873] [Detaching after vfork from child process 741874] [Detaching after vfork from child process 741885] [Detaching after vfork from child process 741896] [Detaching after vfork from child process 741897] [Detaching after vfork from child process 741898] [Detaching after vfork from child process 741899] [Detaching after vfork from child process 741900] [Detaching after vfork from child process 741901] [Detaching after vfork from child process 741902] [Detaching after vfork from child process 741903] [Detaching after vfork from child process 741904] [Detaching after vfork from child process 741905] [Detaching after vfork from child process 741906] [Detaching after vfork from child process 741907] [Detaching after vfork from child process 741908] [Detaching after vfork from child process 741909] [Detaching after vfork from child process 741910] [Detaching after vfork from child process 741911] [Detaching after vfork from child process 741912] [Detaching after vfork from child process 741913] [Detaching after vfork from child process 741914] [Detaching after vfork from child process 741915] [Detaching after vfork from child process 741916] [Detaching after vfork from child process 741917] [Detaching after vfork from child process 741918] [Detaching after vfork from child process 741919] [Detaching after vfork from child process 741920] [Detaching after vfork from child process 741921] [Detaching after vfork from child process 741922] [Detaching after vfork from child process 741923] [Detaching after vfork from child process 741924] [Detaching after vfork from child process 741925] [Detaching after vfork from child process 741926] [Detaching after vfork from child process 741927] [Detaching after vfork from child process 741928] [Detaching after vfork from child process 741929] [Detaching after vfork from child process 741930] Thread 1 "emacs" hit Hardware watchpoint 5: fd_callback_info[16].thread Old value = (struct thread_state *) 0x0 New value = (struct thread_state *) 0x74dba0 <main_thread> set_proc_thread (proc=proc <at> entry=0x1eef868, thrd=0x74dba0 <main_thread>) at process.c:1457 1457 eassert (proc->outfd < FD_SETSIZE); #0 set_proc_thread (proc=proc <at> entry=0x1eef868, thrd=0x74dba0 <main_thread>) at process.c:1457 #1 0x000000000065e232 in server_accept_connection (server=<optimized out>, channel=<optimized out>) at process.c:5134 #2 wait_reading_process_output (time_limit=time_limit <at> entry=30, nsecs=nsecs <at> entry=0, read_kbd=read_kbd <at> entry=-1, do_display=do_display <at> entry=true, wait_for_cell=wait_for_cell <at> entry=0x0, wait_proc=wait_proc <at> entry=0x0, just_wait_proc=0) at process.c:5990 #3 0x000000000041217a in sit_for (timeout=<optimized out>, reading=reading <at> entry=true, display_option=display_option <at> entry=1) at dispnew.c:6978 #4 0x000000000056b4cb in read_char (commandflag=1, map=map <at> entry=0x1ed2c83, prev_event=0x0, used_mouse_menu=used_mouse_menu <at> entry=0x7fffffffd62b, end_time=end_time <at> entry=0x0) at keyboard.c:2925 [Detaching after vfork from child process 741972] [Detaching after vfork from child process 741973] [Detaching after vfork from child process 741974] [Detaching after vfork from child process 741975] [Detaching after vfork from child process 741976] [Detaching after vfork from child process 741977] [Detaching after vfork from child process 741978] [Detaching after vfork from child process 741979] [Detaching after vfork from child process 741980] [Detaching after vfork from child process 741981] [Detaching after vfork from child process 741982] [Detaching after vfork from child process 741983] [Detaching after vfork from child process 741984] [Detaching after vfork from child process 741985] [Detaching after vfork from child process 741986] [Detaching after vfork from child process 741987] [Detaching after vfork from child process 741988] [Detaching after vfork from child process 741989] [Detaching after vfork from child process 741990] [Detaching after vfork from child process 741991] [Detaching after vfork from child process 741992] [Detaching after vfork from child process 741993] [Detaching after vfork from child process 741994] [Detaching after vfork from child process 741995] [Detaching after vfork from child process 741996] [Detaching after vfork from child process 741997] [Detaching after vfork from child process 741998] [Detaching after vfork from child process 741999] [Detaching after vfork from child process 742000] [Detaching after vfork from child process 742001] [Detaching after vfork from child process 742002] [Detaching after vfork from child process 742003] [Detaching after vfork from child process 742004] [Detaching after vfork from child process 742005] [Detaching after vfork from child process 742006] [Detaching after vfork from child process 742007] [Detaching after vfork from child process 742008] [Detaching after vfork from child process 742009] [Detaching after vfork from child process 742020] [Detaching after vfork from child process 742021] [Detaching after vfork from child process 742022] [Detaching after vfork from child process 742023] [Detaching after vfork from child process 742024] [Detaching after vfork from child process 742025] [Detaching after vfork from child process 742026] [Detaching after vfork from child process 742027] [Detaching after vfork from child process 742028] [Detaching after vfork from child process 742029] [Detaching after vfork from child process 742030] [Detaching after vfork from child process 742031] [Detaching after vfork from child process 742032] [New Thread 0x7fffc9fff6c0 (LWP 742033)] [Thread 0x7fffc9fff6c0 (LWP 742033) exited] Thread 1 "emacs" hit Hardware watchpoint 5: fd_callback_info[16].thread Old value = (struct thread_state *) 0x74dba0 <main_thread> New value = (struct thread_state *) 0x0 clear_fd_callback_data (elem=0x7ec320 <fd_callback_info+640>) at process.c:476 476 elem->waiting_thread = NULL; #0 clear_fd_callback_data (elem=0x7ec320 <fd_callback_info+640>) at process.c:476 #1 delete_keyboard_wait_descriptor (desc=desc <at> entry=16) at process.c:8341 #2 0x0000000000657ad1 in delete_read_fd (fd=16) at process.c:519 #3 deactivate_process (proc=proc <at> entry=0x1eef86d) at process.c:4852 #4 0x0000000000657b4a in remove_process (proc=proc <at> entry=0x1eef86d) at process.c:965 Lisp Backtrace: "delete-process" (0xe6bff1a0) "with-editor-return" (0xe6bff120) "with-editor-cancel" (0xffffd320) "funcall-interactively" (0xffffd318) "call-interactively" (0xe6bff070) "command-execute" (0xffffd838) [Detaching after vfork from child process 742034] [Detaching after vfork from child process 742035] [Detaching after vfork from child process 742036] [Detaching after vfork from child process 742037] [Detaching after vfork from child process 742038] [Detaching after vfork from child process 742039] [Detaching after vfork from child process 742040] [Detaching after vfork from child process 742041] [Detaching after vfork from child process 742042] [Detaching after vfork from child process 742043] [Detaching after vfork from child process 742054] [Detaching after vfork from child process 742055] [Detaching after vfork from child process 742056] [Detaching after vfork from child process 742057] [Detaching after vfork from child process 742058] [Detaching after vfork from child process 742059] [Detaching after vfork from child process 742060] [Detaching after vfork from child process 742061] [Detaching after vfork from child process 742062] [Detaching after vfork from child process 742063] [Detaching after vfork from child process 742064] [Detaching after vfork from child process 742065] [Detaching after vfork from child process 742076] [Detaching after vfork from child process 742087] [Detaching after vfork from child process 742088] [Detaching after vfork from child process 742089] [Detaching after vfork from child process 742090] [Detaching after vfork from child process 742091] [Detaching after vfork from child process 742092] [Detaching after vfork from child process 742093] [Detaching after vfork from child process 742094] [Detaching after vfork from child process 742095] [Detaching after vfork from child process 742096] [Detaching after vfork from child process 742097] [Detaching after vfork from child process 742098] [Detaching after vfork from child process 742099] [Detaching after vfork from child process 742100] [Detaching after vfork from child process 742101] [Detaching after vfork from child process 742102] [Thread 0x7fffd7fff6c0 (LWP 741839) exited] [Thread 0x7fffe4fff6c0 (LWP 741838) exited] [Detaching after vfork from child process 742103] [Detaching after vfork from child process 742104] [Detaching after vfork from child process 742105] [Detaching after vfork from child process 742106] [Detaching after vfork from child process 742107] [Detaching after vfork from child process 742108] [Detaching after vfork from child process 742109] [Detaching after vfork from child process 742110] [Detaching after vfork from child process 742111] [Detaching after vfork from child process 742112] [Detaching after vfork from child process 742113] [Detaching after vfork from child process 742114] [Detaching after vfork from child process 742115] [Detaching after vfork from child process 742126] [Detaching after vfork from child process 742127] [New Thread 0x7fffd7fff6c0 (LWP 742128)] [Detaching after vfork from child process 742129] [Switching to Thread 0x7fffd7fff6c0 (LWP 742128)] Thread 9 "diff-hl--update" hit Hardware watchpoint 5: fd_callback_info[16].thread Old value = (struct thread_state *) 0x0 New value = (struct thread_state *) 0x1321ae0 0x0000000000651f83 in set_proc_thread (proc=proc <at> entry=0x1b83050, thrd=0x1321ae0) at process.c:1460 1460 } #0 0x0000000000651f83 in set_proc_thread (proc=proc <at> entry=0x1b83050, thrd=0x1321ae0) at process.c:1460 #1 0x0000000000659a8e in create_process (process=0x1b83055, new_argv=0x7fffd7ffcdd0, current_dir=0x1f29a94) at process.c:2272 #2 Fmake_process (nargs=<optimized out>, args=<optimized out>) at process.c:2090 #3 0x0000000000606144 in funcall_subr (subr=<optimized out>, numargs=numargs <at> entry=6, args=args <at> entry=0x7fffd7ffd088) at eval.c:3265 #4 0x0000000000608801 in funcall_general (fun=<optimized out>, numargs=numargs <at> entry=6, args=args <at> entry=0x7fffd7ffd088) at eval.c:3121 Lisp Backtrace: 0x761740 PVEC_SUBR "apply" (0x1faad78) "make-process <at> with-editor-process-filter" (0xd7ffd308) "apply" (0x1faacd0) "make-process" (0xd7ffd598) "apply" (0x1faac70) "start-process" (0xd7ffd818) "apply" (0x1faabf8) 0xf525cd80 PVEC_CLOSURE "apply" (0x1faab80) "start-file-process <at> with-editor-process-filter" (0xd7ffdd78) "apply" (0x1faab18) "start-file-process" (0xd7ffe028) "apply" (0x1faaab8) "vc-do-command" (0xd7ffe2d8) "apply" (0x1faa9e0) "vc-git-command" (0xd7ffe588) "apply" (0x1faa930) "vc-git-diff" (0xd7ffe828) "apply" (0x1faa8a0) "vc-call-backend" (0x1faa810) "diff-hl-diff-against-reference" (0x1faa7a8) "diff-hl-changes-buffer" (0x1faa760) "diff-hl-changes" (0x1faa6b0) "diff-hl--update" (0x1faa668) "diff-hl--update-safe" (0x1faa638) 0x1f51d38 PVEC_CLOSURE [Detaching after vfork from child process 742130] [Thread 0x7fffd7fff6c0 (LWP 742128) exited] [Detaching after vfork from child process 742141] [Detaching after vfork from child process 742142] [Detaching after vfork from child process 742143] [Detaching after vfork from child process 742144] [Detaching after vfork from child process 742145] [Detaching after vfork from child process 742146] [Detaching after vfork from child process 742147] [Detaching after vfork from child process 742148] [Detaching after vfork from child process 742149] [Detaching after vfork from child process 742150] [Detaching after vfork from child process 742151] [Detaching after vfork from child process 742152] [Detaching after vfork from child process 742153] [Detaching after vfork from child process 742154] [Detaching after vfork from child process 742155] [Detaching after vfork from child process 742156] [Detaching after vfork from child process 742157] [Detaching after vfork from child process 742158] [Detaching after vfork from child process 742159] [Detaching after vfork from child process 742160] [Detaching after vfork from child process 742161] process.c:5128: Emacs fatal error: assertion failed: !fd_callback_info[p->infd].thread [Switching to Thread 0x7ffff5d91ac0 (LWP 741828)] Thread 1 "emacs" hit Breakpoint 1, terminate_due_to_signal (sig=sig <at> entry=6, backtrace_limit=backtrace_limit <at> entry=2147483647) at emacs.c:442 442 { Missing rpms, try: dnf --enablerepo='*debug*' install gvfs-client-debuginfo-1.57.2-1.fc42.x86_64 fcitx5-gtk3-debuginfo-5.1.4-1.fc42.x86_64 fcitx5-gtk-debuginfo-5.1.4-1.fc42.x86_64 libstdc++-debuginfo-15.2.1-1.fc42.x86_64 rsvg-pixbuf-loader-debuginfo-2.60.0-2.fc42.x86_64 librsvg2-debuginfo-2.60.0-2.fc42.x86_64 libdav1d-debuginfo-1.5.1-1.fc42.x86_64 gdk-pixbuf2-modules-extra-debuginfo-2.42.12-3.fc42.x86_64 (gdb)
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 07:29:01 GMT) Full text and rfc822 format available.Message #152 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Zhengyi Fu <i <at> fuzy.me> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Sat, 06 Sep 2025 10:27:51 +0300
> From: Zhengyi Fu <i <at> fuzy.me> > Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > Date: Sat, 06 Sep 2025 13:35:49 +0800 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > >> From: Zhengyi Fu <i <at> fuzy.me> > >> Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > >> Date: Fri, 05 Sep 2025 21:36:16 +0800 > >> > >> Eli Zaretskii <eliz <at> gnu.org> writes: > >> > >> > Thanks, then please show the value of the offending descriptor which > >> > causes the assertion violation. For that, please run Emacs udner GDB, > >> > perform the recipe that reproduces the problem, and then type these > >> > GDB commands: > >> > > >> > (gdb) frame 1 > >> > (gdb) print p->infd > >> > > >> > and tell what that produces. > >> > >> The p->infd value is 8. > > > > OK, please repeat this several times to make sure it's always 8. > > > > If it is, then let's try to see which code leaves the .thread field > > uncleared. Like this (please run from the src directory where you > > have the 'emacs' executable you have built): > > > > $ gdb ./emacs > > ... > > (gdb) break process.c:8731 > > (gdb) run > > > > Emacs will start and stop in init_process_emacs, on line 8731 of > > process.c. Then: > > > > (gdb) watch fd_callback_info[8].thread > > (gdb) commands > > > bt 5 > > > continue > > > end > > (gdb) continue > > > > The above sets a watchpoint which will stop Emacs each time we assign > > some value to fd_callback_info[8].thread, and GDB will then show the > > first 5 frames of the call-stack, then continue. Now run your recipe, > > until it gets SIGABRT, and post everything GDB produced until that > > point. I hope that will tell us which code fails to clear the .thread > > member. > > > > Thanks. > > The fd is 8 when running Emacs in terminal. > The fd is always 16 when running Emacs in X11. Thanks, please try the patch below, instead of the incorrect one I asked you to try before: diff --git a/src/process.c b/src/process.c index fa003c2..736098f 100644 --- a/src/process.c +++ b/src/process.c @@ -4831,7 +4831,11 @@ deactivate_process (Lisp_Object proc) /* Beware SIGCHLD hereabouts. */ for (i = 0; i < PROCESS_OPEN_FDS; i++) - close_process_fd (&p->open_fd[i]); + { + fd_callback_info[p->open_fd[i]].thread = NULL; + fd_callback_info[p->open_fd[i]].waiting_thread = NULL; + close_process_fd (&p->open_fd[i]); + } inchannel = p->infd; eassert (inchannel < FD_SETSIZE);
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 08:29:01 GMT) Full text and rfc822 format available.Message #155 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Zhengyi Fu <i <at> fuzy.me> To: Eli Zaretskii <eliz <at> gnu.org> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Sat, 06 Sep 2025 16:28:40 +0800
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Zhengyi Fu <i <at> fuzy.me> >> Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org >> Date: Sat, 06 Sep 2025 13:35:49 +0800 >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> >> From: Zhengyi Fu <i <at> fuzy.me> >> >> Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org >> >> Date: Fri, 05 Sep 2025 21:36:16 +0800 >> >> >> >> Eli Zaretskii <eliz <at> gnu.org> writes: >> >> >> >> > Thanks, then please show the value of the offending descriptor which >> >> > causes the assertion violation. For that, please run Emacs udner GDB, >> >> > perform the recipe that reproduces the problem, and then type these >> >> > GDB commands: >> >> > >> >> > (gdb) frame 1 >> >> > (gdb) print p->infd >> >> > >> >> > and tell what that produces. >> >> >> >> The p->infd value is 8. >> > >> > OK, please repeat this several times to make sure it's always 8. >> > >> > If it is, then let's try to see which code leaves the .thread field >> > uncleared. Like this (please run from the src directory where you >> > have the 'emacs' executable you have built): >> > >> > $ gdb ./emacs >> > ... >> > (gdb) break process.c:8731 >> > (gdb) run >> > >> > Emacs will start and stop in init_process_emacs, on line 8731 of >> > process.c. Then: >> > >> > (gdb) watch fd_callback_info[8].thread >> > (gdb) commands >> > > bt 5 >> > > continue >> > > end >> > (gdb) continue >> > >> > The above sets a watchpoint which will stop Emacs each time we assign >> > some value to fd_callback_info[8].thread, and GDB will then show the >> > first 5 frames of the call-stack, then continue. Now run your recipe, >> > until it gets SIGABRT, and post everything GDB produced until that >> > point. I hope that will tell us which code fails to clear the .thread >> > member. >> > >> > Thanks. >> >> The fd is 8 when running Emacs in terminal. >> The fd is always 16 when running Emacs in X11. > > Thanks, please try the patch below, instead of the incorrect one I > asked you to try before: > > diff --git a/src/process.c b/src/process.c > index fa003c2..736098f 100644 > --- a/src/process.c > +++ b/src/process.c > @@ -4831,7 +4831,11 @@ deactivate_process (Lisp_Object proc) > /* Beware SIGCHLD hereabouts. */ > > for (i = 0; i < PROCESS_OPEN_FDS; i++) > - close_process_fd (&p->open_fd[i]); > + { > + fd_callback_info[p->open_fd[i]].thread = NULL; > + fd_callback_info[p->open_fd[i]].waiting_thread = NULL; > + close_process_fd (&p->open_fd[i]); > + } > > inchannel = p->infd; > eassert (inchannel < FD_SETSIZE); This resolves the problem. The assertion failure no longer occurs when testing with the patch.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 09:45:02 GMT) Full text and rfc822 format available.Message #158 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Zhengyi Fu <i <at> fuzy.me> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Sat, 06 Sep 2025 12:44:29 +0300
> From: Zhengyi Fu <i <at> fuzy.me> > Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > Date: Sat, 06 Sep 2025 16:28:40 +0800 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > >> From: Zhengyi Fu <i <at> fuzy.me> > >> Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > >> Date: Sat, 06 Sep 2025 13:35:49 +0800 > >> > >> Eli Zaretskii <eliz <at> gnu.org> writes: > >> > >> >> From: Zhengyi Fu <i <at> fuzy.me> > >> >> Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > >> >> Date: Fri, 05 Sep 2025 21:36:16 +0800 > >> >> > >> >> Eli Zaretskii <eliz <at> gnu.org> writes: > >> >> > >> >> > Thanks, then please show the value of the offending descriptor which > >> >> > causes the assertion violation. For that, please run Emacs udner GDB, > >> >> > perform the recipe that reproduces the problem, and then type these > >> >> > GDB commands: > >> >> > > >> >> > (gdb) frame 1 > >> >> > (gdb) print p->infd > >> >> > > >> >> > and tell what that produces. > >> >> > >> >> The p->infd value is 8. > >> > > >> > OK, please repeat this several times to make sure it's always 8. > >> > > >> > If it is, then let's try to see which code leaves the .thread field > >> > uncleared. Like this (please run from the src directory where you > >> > have the 'emacs' executable you have built): > >> > > >> > $ gdb ./emacs > >> > ... > >> > (gdb) break process.c:8731 > >> > (gdb) run > >> > > >> > Emacs will start and stop in init_process_emacs, on line 8731 of > >> > process.c. Then: > >> > > >> > (gdb) watch fd_callback_info[8].thread > >> > (gdb) commands > >> > > bt 5 > >> > > continue > >> > > end > >> > (gdb) continue > >> > > >> > The above sets a watchpoint which will stop Emacs each time we assign > >> > some value to fd_callback_info[8].thread, and GDB will then show the > >> > first 5 frames of the call-stack, then continue. Now run your recipe, > >> > until it gets SIGABRT, and post everything GDB produced until that > >> > point. I hope that will tell us which code fails to clear the .thread > >> > member. > >> > > >> > Thanks. > >> > >> The fd is 8 when running Emacs in terminal. > >> The fd is always 16 when running Emacs in X11. > > > > Thanks, please try the patch below, instead of the incorrect one I > > asked you to try before: > > > > diff --git a/src/process.c b/src/process.c > > index fa003c2..736098f 100644 > > --- a/src/process.c > > +++ b/src/process.c > > @@ -4831,7 +4831,11 @@ deactivate_process (Lisp_Object proc) > > /* Beware SIGCHLD hereabouts. */ > > > > for (i = 0; i < PROCESS_OPEN_FDS; i++) > > - close_process_fd (&p->open_fd[i]); > > + { > > + fd_callback_info[p->open_fd[i]].thread = NULL; > > + fd_callback_info[p->open_fd[i]].waiting_thread = NULL; > > + close_process_fd (&p->open_fd[i]); > > + } > > > > inchannel = p->infd; > > eassert (inchannel < FD_SETSIZE); > > This resolves the problem. The assertion failure no longer occurs when testing with the patch. Thanks, now installed on the master branch.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 10:24:02 GMT) Full text and rfc822 format available.Message #161 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Zhengyi Fu <i <at> fuzy.me> To: Eli Zaretskii <eliz <at> gnu.org> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Sat, 06 Sep 2025 18:23:03 +0800
Eli Zaretskii <eliz <at> gnu.org> writes: > Thanks, now installed on the master branch. Thank you very much!
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 10:38:02 GMT) Full text and rfc822 format available.Message #164 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Zhengyi Fu <i <at> fuzy.me> Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Sat, 06 Sep 2025 13:36:54 +0300
> From: Zhengyi Fu <i <at> fuzy.me> > Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > Date: Sat, 06 Sep 2025 18:23:03 +0800 > > Eli Zaretskii <eliz <at> gnu.org> writes: > > > Thanks, now installed on the master branch. > > Thank you very much! Thanks for helping debug this.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 14:07:02 GMT) Full text and rfc822 format available.Message #167 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Vincenzo Pupillo <v.pupillo <at> gmail.com> To: Zhengyi Fu <i <at> fuzy.me>, bug-gnu-emacs <at> gnu.org Cc: dmitry <at> gutov.dev, sbaugh <at> janestreet.com, Eli Zaretskii <eliz <at> gnu.org>, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Sat, 06 Sep 2025 16:06:27 +0200
[Message part 1 (text/plain, inline)]
Ciao Eli, after the commit 6c150961fd07e19b6c871d8963d6b9826ec8140f (HEAD) to the master emacs --daemon (with my configuration) now uses 100% of the CPU even if nothing at all is being done in a client. Checking out 6b42b974ceabba8b0215498d4a6eb5048d91514d and recompiling fix the issue. Could the issue be related to this commit on src/process.c? Thanks. Vincenzo In GNU Emacs 31.0.50 (build 13, x86_64-pc-linux-gnu, GTK+ Version 3.24.49, cairo version 1.18.2) of 2025-09-06 built on fedora Repository revision: 2ecced627bc6553003bc32e282629273d2f9c454 Repository branch: master System Description: Fedora Linux 42 (KDE Plasma Desktop Edition) Configured using: 'configure --disable-gc-mark-trace --with-native-compilation=aot --with-tree-sitter --with-cairo --without-pop --without-imagemagick --prefix=/opt/emacs_pgtk --with-file-notification=inotify --enable-link-time-optimization --with-xinput2 --with-pgtk 'CFLAGS= -O2 -mtune=native -march=native -pipe '' Configured features: ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 MODULES NATIVE_COMP NOTIFY INOTIFY PDUMPER PGTK PNG SECCOMP SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER WEBP XIM GTK3 ZLIB Important settings: value of $LANG: it_IT.UTF-8 value of $XMODIFIERS: @im=none locale-coding-system: utf-8-unix Major mode: Lisp Interaction Minor modes in effect: marginalia-mode: t xclip-mode: t recentf-mode: t which-key-mode: t server-mode: t yas-minor-mode: t global-git-gutter-mode: t editorconfig-mode: t persp-mode: t global-projection-hook-mode: t corfu-popupinfo-mode: t corfu-history-mode: t global-corfu-mode: t corfu-mode: t vertico-mode: t override-global-mode: t windmove-mode: t electric-pair-mode: t which-function-mode: t save-place-mode: t winner-mode: t minibuffer-electric-default-mode: t global-hl-line-mode: t savehist-mode: t delete-selection-mode: t xterm-mouse-mode: t tooltip-mode: t global-eldoc-mode: t eldoc-mode: t show-paren-mode: t electric-indent-mode: t mouse-wheel-mode: t file-name-shadow-mode: t context-menu-mode: t global-font-lock-mode: t font-lock-mode: t minibuffer-regexp-mode: t column-number-mode: t line-number-mode: t indent-tabs-mode: t transient-mark-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t temp-buffer-resize-mode: t Load-path shadows: /home/vincenzo/.emacs.d/elpa/transient-20250903.1516/transient hides /opt/ emacs_pgtk/share/emacs/31.0.50/lisp/transient /home/vincenzo/.emacs.d/elpa/standard-themes-2.2.0/theme-loaddefs hides /opt/ emacs_pgtk/share/emacs/31.0.50/lisp/theme-loaddefs Features: (shadow sort mail-extr emacsbug lisp-mnt message yank-media puny citre-lang-fileref citre-tags citre-ctags citre-readtags citre-readtags-tables citre-backend-interface citre-common-tag citre-common-util dired-x dired dired-loaddefs rfc822 mml mml-sec epa derived epg rfc6068 epg-config gnus-util mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr mail-utils help-fns radix-tree mule-util vertico-sort tramp-cmds ediff ediff-merg ediff-mult ediff-wind ediff-diff ediff-help ediff-init ediff-util smerge-mode diff-mode track-changes diff mvn vterm tramp trampver tramp-integration tramp-message tramp-compat parse-time iso8601 time-date tramp-loaddefs face-remap color term disp-table shell pcomplete ehelp find-func vterm-module term/xterm xterm multiple-cursors mc-separate-operations rectangular-region-mode mc-mark-pop mc-edit-lines mc-hide-unmatched-lines-mode mc-mark-more sgml-mode facemenu dom mc-cycle-cursors multiple-cursors-core rect marginalia rg files-x vc vc-dispatcher rg-info-hack rg-menu rg-ibuffer rg-result wgrep-rg wgrep rg-history rg-header ibuf-ext ibuffer ibuffer-loaddefs grep xclip recentf tree-widget which-key add-log compile comint ansi-osc ansi-color server yasnippet-snippets yasnippet git-gutter editorconfig editorconfig-core editorconfig-core-handle editorconfig-fnmatch comp comp-cstr warnings comp-run comp-common perspective advice thingatpt rx ido projection-hook projection-core projection-core-match projection-core-type projection-core-log projection-core-cache projection-core-misc projection-core-completion s init custom-vp-gen custom-elixir custom-lua custom-rust custom-dart custom-java custom-go custom-perl custom-php custom-c custom-sql custom-clojure custom-lisp custom-web custom-org custom-markup citre-config eldoc-box corfu-popupinfo corfu-history corfu project consult bookmark text-property-search orderless vertico use-package-ensure use-package-bind-key bind-key treesit standard-dark-theme standard-themes cl-extra help-mode windmove use-package-core elec-pair which-func imenu saveplace winner ring minibuf-eldef easy-mmode hl-line savehist delsel xt-mouse custom-function transient format-spec edmacro kmacro compat custom-variable fonts cus-edit pp cus-start cus-load wid-edit cape-autoloads cargo-autoloads citre-autoloads clj-refactor-autoloads cider-autoloads clojure-mode-autoloads composer-autoloads consult-dir-autoloads consult-lsp-autoloads consult-autoloads corfu-autoloads cuda-mode-autoloads debbugs-autoloads denote-explore-autoloads denote-markdown-autoloads denote-org-autoloads denote-autoloads devdocs-autoloads dired-rsync-autoloads disaster-autoloads docker-autoloads aio-autoloads eldoc-box-autoloads expand-region-autoloads extempore-mode-autoloads flutter-autoloads git-gutter-autoloads go-translate-autoloads gptel-autoloads highlight-indentation-autoloads hover-autoloads iedit-autoloads inflections-autoloads kotlin-ts-mode-autoloads lice-autoloads lsp-dart-autoloads dart-mode-autoloads lsp-java-autoloads dap-mode-autoloads lsp-docker-autoloads bui-autoloads lsp-pyright-autoloads lsp-python-refly-autoloads lsp-treemacs-autoloads lsp-ui-autoloads lsp-mode-autoloads magit-autoloads pcase magit-section-autoloads llama-autoloads cond-let-autoloads marginalia-autoloads mathjax-autoloads meson-mode-autoloads multiple-cursors-autoloads mvn-autoloads nexus-autoloads orderless-autoloads ox-pandoc-autoloads paredit-autoloads parseedn-autoloads parseclj-autoloads pdd-autoloads perspective-autoloads php-runtime-autoloads plantuml-mode-autoloads deflate-autoloads popup-autoloads projection-multi-autoloads compile-multi-autoloads projection-autoloads pyenv-mode-autoloads pythonic-autoloads pyvenv-autoloads qml-mode-autoloads queue-autoloads request-autoloads restclient-autoloads rg-autoloads rustic-autoloads markdown-mode-autoloads f-autoloads rust-mode-autoloads rvm-autoloads sesman-autoloads sly-autoloads spinner-autoloads standard-themes-autoloads tablist-autoloads transient-autoloads treemacs-autoloads cfrs-autoloads posframe-autoloads ht-autoloads hydra-autoloads lv-autoloads pfuture-autoloads ace-window-autoloads avy-autoloads s-autoloads dash-autoloads vertico-autoloads vterm-autoloads web-mode-autoloads wgrep-autoloads info with-editor-autoloads xclip-autoloads xcscope-autoloads xr-autoloads xterm-color-autoloads yaml-autoloads yasnippet-snippets-autoloads yasnippet-autoloads zig-mode-autoloads reformatter-autoloads package browse-url xdg url url-proxy url-privacy url-expand url-methods url-history url-cookie generate-lisp-file url-domsuf url-util mailcap url-handlers url-parse auth-source cl-seq eieio eieio-core cl-macs icons password-cache json subr-x map byte-opt gv bytecomp byte-compile url-vars cl-loaddefs cl-lib early-init rmc iso-transl tooltip cconv eldoc paren electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel term/pgtk-win pgtk-win term/common-win touch-screen pgtk-dnd tool-bar dnd fontset image regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic indonesian philippine cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese composite emoji-zwj charscript charprop case-table epa-hook jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs theme-loaddefs faces cus-face macroexp files window text-properties overlay sha1 md5 base64 format env code-pages mule custom widget keymap hashtable-print-readable backquote threads dbusbind inotify dynamic-setting system-font-setting font-render-setting cairo gtk pgtk lcms2 multi-tty move-toolbar make-network-process tty-child-frames native-compile emacs) Memory information: ((conses 16 361739 33614) (symbols 48 27047 0) (strings 32 92022 7674) (string-bytes 1 3573207) (vectors 16 43793) (vector-slots 8 520774 9737) (floats 8 280 6) (intervals 56 1425 0) (buffers 1064 12)) In data sabato 6 settembre 2025 12:36:54 Ora legale dell’Europa centrale, Eli Zaretskii ha scritto: > > From: Zhengyi Fu <i <at> fuzy.me> > > Cc: sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org > > Date: Sat, 06 Sep 2025 18:23:03 +0800 > > > > Eli Zaretskii <eliz <at> gnu.org> writes: > > > Thanks, now installed on the master branch. > > > > Thank you very much! > > Thanks for helping debug this.
[profile_cpu.txt (text/plain, attachment)]
[profile_mem.txt (text/plain, attachment)]
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 14:08:01 GMT) Full text and rfc822 format available.bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 14:43:02 GMT) Full text and rfc822 format available.Message #173 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Vincenzo Pupillo <v.pupillo <at> gmail.com> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, bug-gnu-emacs <at> gnu.org, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Sat, 06 Sep 2025 17:41:33 +0300
> From: Vincenzo Pupillo <v.pupillo <at> gmail.com> > Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev, > Eli Zaretskii <eliz <at> gnu.org> > Date: Sat, 06 Sep 2025 16:06:27 +0200 > > Ciao Eli, after the commit 6c150961fd07e19b6c871d8963d6b9826ec8140f (HEAD) to > the master emacs --daemon (with my configuration) now uses 100% of the CPU > even if nothing at all is being done in a client. > Checking out 6b42b974ceabba8b0215498d4a6eb5048d91514d and recompiling fix the > issue. Could the issue be related to this commit on src/process.c? I cannot reproduce this. Please show a recipe starting from how you start the daemon and then how you start the client. I tried the naïve thing, but 'top' didn't show any 100% CPU consumption by either the daemon or the client.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 14:43:02 GMT) Full text and rfc822 format available.bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 14:55:02 GMT) Full text and rfc822 format available.Message #179 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: v.pupillo <at> gmail.com Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Sat, 06 Sep 2025 17:54:30 +0300
> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev > Date: Sat, 06 Sep 2025 17:41:33 +0300 > From: Eli Zaretskii <eliz <at> gnu.org> > > > From: Vincenzo Pupillo <v.pupillo <at> gmail.com> > > Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev, > > Eli Zaretskii <eliz <at> gnu.org> > > Date: Sat, 06 Sep 2025 16:06:27 +0200 > > > > Ciao Eli, after the commit 6c150961fd07e19b6c871d8963d6b9826ec8140f (HEAD) to > > the master emacs --daemon (with my configuration) now uses 100% of the CPU > > even if nothing at all is being done in a client. > > Checking out 6b42b974ceabba8b0215498d4a6eb5048d91514d and recompiling fix the > > issue. Could the issue be related to this commit on src/process.c? > > I cannot reproduce this. Please show a recipe starting from how you > start the daemon and then how you start the client. I tried the naïve > thing, but 'top' didn't show any 100% CPU consumption by either the > daemon or the client. However, I've just installed something to make that change safer. So before you post the recipe, please check if the latest master still exhibits the problematic behavior.
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 16:34:01 GMT) Full text and rfc822 format available.Message #182 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Vincenzo Pupillo <v.pupillo <at> gmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Sat, 06 Sep 2025 18:33:07 +0200
In data sabato 6 settembre 2025 16:54:30 Ora legale dell’Europa centrale, Eli Zaretskii ha scritto: > > Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, > > dmitry <at> gutov.dev Date: Sat, 06 Sep 2025 17:41:33 +0300 > > From: Eli Zaretskii <eliz <at> gnu.org> > > > > > From: Vincenzo Pupillo <v.pupillo <at> gmail.com> > > > Cc: sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev, > > > > > > Eli Zaretskii <eliz <at> gnu.org> > > > > > > Date: Sat, 06 Sep 2025 16:06:27 +0200 > > > > > > Ciao Eli, after the commit 6c150961fd07e19b6c871d8963d6b9826ec8140f > > > (HEAD) to the master emacs --daemon (with my configuration) now uses > > > 100% of the CPU even if nothing at all is being done in a client. > > > Checking out 6b42b974ceabba8b0215498d4a6eb5048d91514d and recompiling > > > fix the issue. Could the issue be related to this commit on > > > src/process.c?> > > I cannot reproduce this. Please show a recipe starting from how you > > start the daemon and then how you start the client. I tried the naïve > > thing, but 'top' didn't show any 100% CPU consumption by either the > > daemon or the client. > > However, I've just installed something to make that change safer. So > before you post the recipe, please check if the latest master still > exhibits the problematic behavior. Thank you Eli, your latest patch works. Vincenzo
bug-gnu-emacs <at> gnu.org
:bug#79367
; Package emacs
.
(Sat, 06 Sep 2025 17:07:02 GMT) Full text and rfc822 format available.Message #185 received at 79367 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Vincenzo Pupillo <v.pupillo <at> gmail.com> Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, dmitry <at> gutov.dev, 79367 <at> debbugs.gnu.org Subject: Re: bug#79367: 31.0.50; magit-commit sometimes doesn't work if diff-hl-update-async is t Date: Sat, 06 Sep 2025 20:06:35 +0300
> From: Vincenzo Pupillo <v.pupillo <at> gmail.com> > Cc: i <at> fuzy.me, sbaugh <at> janestreet.com, 79367 <at> debbugs.gnu.org, dmitry <at> gutov.dev > Date: Sat, 06 Sep 2025 18:33:07 +0200 > > > > > Ciao Eli, after the commit 6c150961fd07e19b6c871d8963d6b9826ec8140f > > > > (HEAD) to the master emacs --daemon (with my configuration) now uses > > > > 100% of the CPU even if nothing at all is being done in a client. > > > > Checking out 6b42b974ceabba8b0215498d4a6eb5048d91514d and recompiling > > > > fix the issue. Could the issue be related to this commit on > > > > src/process.c?> > > > I cannot reproduce this. Please show a recipe starting from how you > > > start the daemon and then how you start the client. I tried the naïve > > > thing, but 'top' didn't show any 100% CPU consumption by either the > > > daemon or the client. > > > > However, I've just installed something to make that change safer. So > > before you post the recipe, please check if the latest master still > > exhibits the problematic behavior. > Thank you Eli, your latest patch works. Great, thanks for testing.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.