From unknown Sat Aug 16 00:33:26 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#79201 <79201@debbugs.gnu.org> To: bug#79201 <79201@debbugs.gnu.org> Subject: Status: 30.1.90; set-process-thread can permanently break fd_callback_info slots Reply-To: bug#79201 <79201@debbugs.gnu.org> Date: Sat, 16 Aug 2025 07:33:26 +0000 retitle 79201 30.1.90; set-process-thread can permanently break fd_callback= _info slots reassign 79201 emacs submitter 79201 Spencer Baugh severity 79201 normal thanks From debbugs-submit-bounces@debbugs.gnu.org Fri Aug 08 13:06:42 2025 Received: (at submit) by debbugs.gnu.org; 8 Aug 2025 17:06:42 +0000 Received: from localhost ([127.0.0.1]:39107 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ukQYD-0001zN-Fh for submit@debbugs.gnu.org; Fri, 08 Aug 2025 13:06:42 -0400 Received: from lists.gnu.org ([2001:470:142::17]:54830) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ukQY9-0001yw-9z for submit@debbugs.gnu.org; Fri, 08 Aug 2025 13:06:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ukQXz-00046P-3a for bug-gnu-emacs@gnu.org; Fri, 08 Aug 2025 13:06:27 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ukQXx-0001O0-Hb for bug-gnu-emacs@gnu.org; Fri, 08 Aug 2025 13:06:26 -0400 From: Spencer Baugh To: bug-gnu-emacs@gnu.org Subject: 30.1.90; set-process-thread can permanently break fd_callback_info slots X-Debbugs-Cc: Date: Fri, 08 Aug 2025 13:06:22 -0400 Message-ID: MIME-Version: 1.0 Content-Type: text/plain DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1754672782; bh=KljyH+MMI5XoF7nvIawQwQdw1yIDcoNwV1rt8Od5Uq8=; h=From:To:Cc:Subject:Date; b=yyU+qNyXDlE3vEFxQcbQQEURyuLOTOWjuVglZJsMl/o3fixCVL3odgb5bflQkxWBi 3GJ5T9rKIqYIptfgxEUxRmHGURYFY28hgEYFVQkkDk1fmjWrKJl5wfzS7r9FPeMqUa 2ucWVCE/nguOVaU4VHRkc2lr4CvnTfYa7TPRtHaKRBA1/8zYu8QIIQdGb86gkwJacw /FoM8pG8inbCjhLm5usuKEPoG/GUSJvaN8Jp95kETf3h8oqGV304qNcnV+mtRaAADA IHJr5o+CfWyf5n5ynLlgPRbg43/rav0VsrF/hVRvZOwo/6O7tSWbjyENDgDlhezXZG 4BhcKBtZDfTqA== Received-SPF: pass client-ip=64.215.233.18; envelope-from=sbaugh@janestreet.com; helo=mxout5.mail.janestreet.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: 0.9 (/) X-Debbugs-Envelope-To: submit Cc: dmitry@gutov.dev, app-emacs-dev@janestreet.com, John Wiegley X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.1 (/) Using set-process-thread sets the fd_callback_info[fd].thread slot, which is not cleared on process exit. As a result fd_callback_info[fd].thread can contain a dangling pointer to a dead thread, which means that fd_callback_info[fd] will be permanently broken, and any new process created which uses that slot will also be broken. Reproduction: the following code will hang forever when the (delete-process proc1) line is present, but terminates just fine when that line is deleted. (setq my-thread (make-thread (lambda () (while t (sleep-for 60))))) (setq proc1 (make-process :name "proc1" :command '("sleep" "inf"))) ; Set the process's thread to my-thread (set-process-thread proc1 my-thread) ; Delete the process so fd_callback_info[fd].thread doesn't get cleared ; on thread exit. (delete-process proc1) ; Kill my-thread, just for completeness. (thread-signal my-thread 'error nil) ; Start up a seemingly completely unrelated process; Emacs will never be ; able to read from this process because it's reusing ; fd_callback_info[fd] and .thread is pointing to a dead thread. (setq proc2 (make-process :name "proc2" :command '("sh" "-c" "echo hi; sleep inf"))) (message "waiting on proc2") (while (null (accept-process-output proc2))) In fact, SIGINTing this process while it's hanging will cause Emacs to segfault. It looks like current_thread is NULL? Not sure why that happens, possibly an unrelated bug. Anyway, probably we should be setting fd_callback_info[fd].thread to NULL also when the process exits. But, I suggest we should further also set it to NULL (and also waiting_thread) when we start using a fd_callback_info[fd] slot, e.g. in add_read_fd. That avoids the possibility of permanent contamination of fd_callback_info slots, which I think is possible in some other ways, though I haven't been able to reproduce it yet... In GNU Emacs 30.1.90 (build 8, x86_64-pc-linux-gnu, X toolkit, cairo version 1.15.12, Xaw scroll bars) of 2025-08-06 built on igm-qws-u22796a Repository revision: a7392a6cea9cb7d77fab044a8e8cbcb012b5d6c7 Repository branch: emacs-30 Windowing system distributor 'The X.Org Foundation', version 11.0.12011000 System Description: Rocky Linux 8.10 (Green Obsidian) Configured using: 'configure --with-x-toolkit=lucid --without-gpm --without-gconf --without-gsettings --without-selinux --without-imagemagick --with-modules --with-gif=no --with-cairo --with-rsvg --without-compress-install --with-tree-sitter --with-native-compilation=aot PKG_CONFIG_PATH=/usr/local/home/garnish/libtree-sitter/0.22.6-1/lib/pkgconfig/' Configured features: CAIRO DBUS FREETYPE GLIB GMP GNUTLS HARFBUZZ JPEG LIBSYSTEMD LIBXML2 MODULES NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER X11 XDBE XIM XINPUT2 XPM LUCID ZLIB Important settings: value of $LANG: en_US.UTF-8 locale-coding-system: utf-8-unix From debbugs-submit-bounces@debbugs.gnu.org Fri Aug 08 14:22:37 2025 Received: (at submit) by debbugs.gnu.org; 8 Aug 2025 18:22:37 +0000 Received: from localhost ([127.0.0.1]:39187 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ukRjh-0005Or-By for submit@debbugs.gnu.org; Fri, 08 Aug 2025 14:22:37 -0400 Received: from lists.gnu.org ([2001:470:142::17]:42026) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ukRjd-0005OW-Oy for submit@debbugs.gnu.org; Fri, 08 Aug 2025 14:22:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ukRjX-0000zH-UW for bug-gnu-emacs@gnu.org; Fri, 08 Aug 2025 14:22:28 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ukRjW-0002LL-Os; Fri, 08 Aug 2025 14:22:26 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:Date:References:In-Reply-To:Subject:To: From; bh=yj/GfTY9KEawM2KH2xezFnRbbqqxVKLA9Ra5140tahg=; b=aYlloKhbFQyHDuV1/x2X COdvFKWjJCZb02RIxrt4WjstJOBq0Vpf0+mfD7bF/itK261U/egsTsBXcaioAZSNwwIPp3P1RErlS ol+3vNeTPrkU1eZgdVdBGhZxygnCdF7fRAGGcC3fx8ZCbd2wo+FK4JmwqpJBWOsBClNjwdRpAWIoH kkp1Py1H0aU5x+E7q1A/QYg1M72pmqNV4lPhJqHVAQ+VdjBR3aU4m3qFZF0NOYWBXH0WW+9AMkazG sYrzdZkEXwHtfzelh4/MwHxPQROcNcA396H8O9VGFh25Fw5mlnaGAYvmaboLLA0N5SWyuSatLu0Lk ARYoKVOwpWbAdQ==; X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdefgdduvdeghedvucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfurfetoffkrfgpnffqhgenuceu rghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujf gurhephffvvefujghffffkfgggtgfgsehtqhertddtreejnecuhfhrohhmpedflfhohhhn ucghihgvghhlvgihfdcuoehjohhhnhifsehgnhhurdhorhhgqeenucggtffrrghtthgvrh hnpeeuiedukefftedvgeelkeduueduueevjedvvdegvdefffejhfehkeefhfejffdtheen ucffohhmrghinhepnhgvfigrrhhtihhsrghnshdrtghomhenucevlhhushhtvghrufhiii gvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehjohhhnhifodhmvghsmhhtphgruhht hhhpvghrshhonhgrlhhithihqdeikeejkedtleeggedqudejjeehfeekudeiqdhjohhhnh ifpeepghhnuhdrohhrghesnhgvfigrrhhtihhsrghnshdrtghomhdpnhgspghrtghpthht ohepgedpmhhouggvpehsmhhtphhouhhtpdhrtghpthhtoheprghpphdqvghmrggtshdqug gvvhesjhgrnhgvshhtrhgvvghtrdgtohhmpdhrtghpthhtohepughmihhtrhihsehguhht ohhvrdguvghvpdhrtghpthhtohepsghughdqghhnuhdqvghmrggtshesghhnuhdrohhrgh dprhgtphhtthhopehssggruhhghhesjhgrnhgvshhtrhgvvghtrdgtohhm X-ME-Proxy: Feedback-ID: ib64945b7:Fastmail From: "John Wiegley" To: Spencer Baugh Subject: Re: 30.1.90; set-process-thread can permanently break fd_callback_info slots In-Reply-To: (Spencer Baugh's message of "Fri, 08 Aug 2025 13:06:22 -0400") References: Date: Fri, 08 Aug 2025 11:22:23 -0700 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: submit Cc: dmitry@gutov.dev, bug-gnu-emacs@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) >>>>> Spencer Baugh writes: > (while (null (accept-process-output proc2))) Based on the what the manual says, shouldn=E2=80=99t this be? (while (accept-process-output proc2)) > Anyway, probably we should be setting fd_callback_info[fd].thread to > NULL also when the process exits. But, I suggest we should further also > set it to NULL (and also waiting_thread) when we start using a > fd_callback_info[fd] slot, e.g. in add_read_fd. That avoids the > possibility of permanent contamination of fd_callback_info slots, which > I think is possible in some other ways, though I haven't been able to > reproduce it yet... This sounds like an important fix to me. I haven=E2=80=99t run into this pr= oblem yet myself, but the logic above tracks. --=20 John Wiegley GPG fingerprint =3D 4710 CF98 AF9B 327B B80F http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2 From debbugs-submit-bounces@debbugs.gnu.org Fri Aug 08 16:34:20 2025 Received: (at 79201) by debbugs.gnu.org; 8 Aug 2025 20:34:21 +0000 Received: from localhost ([127.0.0.1]:39303 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ukTnA-0002rw-ED for submit@debbugs.gnu.org; Fri, 08 Aug 2025 16:34:20 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:51027) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ukTn6-0002rd-NA for 79201@debbugs.gnu.org; Fri, 08 Aug 2025 16:34:17 -0400 From: Spencer Baugh To: "John Wiegley" Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots In-Reply-To: (John Wiegley's message of "Fri, 08 Aug 2025 11:22:23 -0700") References: Date: Fri, 08 Aug 2025 16:34:10 -0400 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1754685251; bh=I3FbpNRBX4ykTP3wB9s7YN/9nHlOx8XaKH7yUcW6ROA=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=X7JhBKJI8EsnkhWF1tkYS53T8tQRNcxCV9qxQtCmS3FMQZ/aIryxN8yq6Tf8yaKpU JN9uNOPUYcDx3G005tGC63gRdoiv5+1fzfuS6+E0KmVwUgSEvRYCulDasneaZV3Tb4 iJIQP17zaoddJuDwIuQexS2Z0XYrDrBE7DYsfQQiyczj2m5EtehsNFXf/R7JMzJakc 2kV54I0uspkMcUfK2Be0yGPxFyDDjVDPNQ6ofQNZbZ9J5IlPDunotdOpx5p+ugVClm Qxs2QjlQKLUb7E1nVbDZisTOLBtjTSyZLQqCo/kEySOppR06SXxfXIF3G/DTCE3inX 2j5cDLrMNhG3w== X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable "John Wiegley" writes: >>>>>> Spencer Baugh writes: > >> (while (null (accept-process-output proc2))) > > Based on the what the manual says, shouldn=E2=80=99t this be? > > (while (accept-process-output proc2)) No, I'm trying to run accept-process-output repeatedly until proc2 produces any output at all, which will cause accept-process-output to return non-nil. >> Anyway, probably we should be setting fd_callback_info[fd].thread to >> NULL also when the process exits. But, I suggest we should further also >> set it to NULL (and also waiting_thread) when we start using a >> fd_callback_info[fd] slot, e.g. in add_read_fd. That avoids the >> possibility of permanent contamination of fd_callback_info slots, which >> I think is possible in some other ways, though I haven't been able to >> reproduce it yet... > > This sounds like an important fix to me. I haven=E2=80=99t run into this = problem yet > myself, but the logic above tracks. Here's a patch to fix it by initializing these fields. --=-=-= Content-Type: text/x-patch Content-Disposition: inline; filename=0001-Initialize-fd_callback_info-thread-variables-on-reus.patch >From d97986f4a46a229c3a19a4e554e16b2104018388 Mon Sep 17 00:00:00 2001 From: Spencer Baugh Date: Fri, 8 Aug 2025 16:33:20 -0400 Subject: [PATCH] Initialize fd_callback_info thread variables on reuse By using set-process-thread and then delete-process, fd_callback_info[fd].thread will contain a dangling pointer to a dead thread, making that fd number permanently broken and Emacs unable to read output from it. Handle all such issues by initializing .thread and .waiting_thread to NULL at the time we start using an fd_callback_info[fd] slot. * src/process.c (initialize_fd_callback_info): Add. (add_read_fd, add_process_read_fd, add_write_fd) (add_non_blocking_write_fd, add_keyboard_wait_descriptor): Call initialize_fd_callback_info. (bug#79201) --- src/process.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/src/process.c b/src/process.c index e61ec425f7e..d5d0db1943e 100644 --- a/src/process.c +++ b/src/process.c @@ -467,6 +467,15 @@ make_lisp_proc (struct Lisp_Process *p) } fd_callback_info[FD_SETSIZE]; +static void +initialize_fd_callback_info(int fd) +{ + eassert (0 <= fd && fd < FD_SETSIZE); + + fd_callback_info[fd].thread = NULL; + fd_callback_info[fd].waiting_thread = NULL; +} + /* Add a file descriptor FD to be monitored for when read is possible. When read is possible, call FUNC with argument DATA. */ @@ -475,7 +484,6 @@ add_read_fd (int fd, fd_callback func, void *data) { add_keyboard_wait_descriptor (fd); - eassert (0 <= fd && fd < FD_SETSIZE); fd_callback_info[fd].func = func; fd_callback_info[fd].data = data; } @@ -490,14 +498,13 @@ add_non_keyboard_read_fd (int fd, fd_callback func, void *data) static void add_process_read_fd (int fd) { - eassert (fd >= 0 && fd < FD_SETSIZE); + initialize_fd_callback_info(fd); eassert (fd_callback_info[fd].func == NULL); fd_callback_info[fd].flags &= ~KEYBOARD_FD; fd_callback_info[fd].flags |= FOR_READ; if (fd > max_desc) max_desc = fd; - eassert (0 <= fd && fd < FD_SETSIZE); fd_callback_info[fd].flags |= PROCESS_FD; } @@ -522,7 +529,7 @@ delete_read_fd (int fd) void add_write_fd (int fd, fd_callback func, void *data) { - eassert (fd >= 0 && fd < FD_SETSIZE); + initialize_fd_callback_info(fd); fd_callback_info[fd].func = func; fd_callback_info[fd].data = data; @@ -534,7 +541,7 @@ add_write_fd (int fd, fd_callback func, void *data) static void add_non_blocking_write_fd (int fd) { - eassert (fd >= 0 && fd < FD_SETSIZE); + initialize_fd_callback_info(fd); eassert (fd_callback_info[fd].func == NULL); fd_callback_info[fd].flags |= FOR_WRITE | NON_BLOCKING_CONNECT_FD; @@ -8296,7 +8303,7 @@ remove_slash_colon (Lisp_Object name) add_keyboard_wait_descriptor (int desc) { #ifdef subprocesses /* Actually means "not MSDOS". */ - eassert (desc >= 0 && desc < FD_SETSIZE); + initialize_fd_callback_info(desc); fd_callback_info[desc].flags &= ~PROCESS_FD; fd_callback_info[desc].flags |= (FOR_READ | KEYBOARD_FD); if (desc > max_desc) -- 2.39.3 --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Sat Aug 09 06:33:53 2025 Received: (at 79201) by debbugs.gnu.org; 9 Aug 2025 10:33:54 +0000 Received: from localhost ([127.0.0.1]:40331 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ukgtd-0008LM-GH for submit@debbugs.gnu.org; Sat, 09 Aug 2025 06:33:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:34062) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ukgta-0008L4-2J for 79201@debbugs.gnu.org; Sat, 09 Aug 2025 06:33:51 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ukgtT-0003TG-Uj; Sat, 09 Aug 2025 06:33:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=OMeqb9hjlSky0DWdZKrn1H9RnAwf/MhIDwipqTqTwYI=; b=D0hvJf+naIo3 Flm3C95vJcEGdb35DlLNHKxqrBCNABkfm+f3OeGg9N56PfOSQCZ4zQX2V///dLKd9UuYDa601Sfv0 z3DJIxiiCDLd2zkBjyx1r+34nQGBFEShtlYin3bq2xN/CWMgb9RVOkYAKVqtNobalTK9Y+dpOWAft qy5n27FTsSc4tVEIqjHRiwzhyQdRLAO5JR9N9/NU9qjeyLhCXxHBEaTyZW2/X15CBfFn6vQ/kSgFK IXhRpqqNsP5Dqv0eK1SVsihD+nhbhURbWI02OeVSYG0y7M89kif2R7wG4cW6jlNEJqBKXTTGvVRKf wW6bjHvCFKJJRVUFdcmNpQ==; Date: Sat, 09 Aug 2025 13:33:40 +0300 Message-Id: <86ectkn5rv.fsf@gnu.org> From: Eli Zaretskii To: Spencer Baugh In-Reply-To: (bug-gnu-emacs@gnu.org) Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots References: X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, app-emacs-dev@janestreet.com > Date: Fri, 08 Aug 2025 16:34:10 -0400 > From: Spencer Baugh via "Bug reports for GNU Emacs, > the Swiss army knife of text editors" > Here's a patch to fix it by initializing these fields. Thanks, but I don't think this is right. First, update_processes_for_thread_death clears the .thread member of fd_callback_info of those processes that were locked to the dying thread. Second, wait_reading_process_output_unwind clears the .waiting_thread member of fd_callback_info for all descriptors, when we exit wait_reading_process_output (which is called by accept-process-output). And finally, IMO it is wrong for the add_*_fd functions to overwrite these two members, because they are set and reset by other functions, mainly those which compute the wait mask for the pselect call. add_*_fd should not touch these members. I believe in this case update_processes_for_thread_death doesn't do its job because your scenario kills the process before the thread dies, and update_processes_for_thread_death only looks at the processes that are in Vprocess_alist. So it is possible that all that's missing is that delete_read_fd and delete_write_fd (which are called from deactivate_process) should clear the .thread member of fd_callback_info corresponding to the process's input and output descriptors. If that doesn't fix the issue with your reproducer, please investigate why, and let's take it from there. From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 11 12:03:33 2025 Received: (at 79201) by debbugs.gnu.org; 11 Aug 2025 16:03:33 +0000 Received: from localhost ([127.0.0.1]:49857 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ulUzk-0006Qr-C6 for submit@debbugs.gnu.org; Mon, 11 Aug 2025 12:03:33 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:46991) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ulUza-0006QH-2y for 79201@debbugs.gnu.org; Mon, 11 Aug 2025 12:03:26 -0400 From: Spencer Baugh To: Eli Zaretskii Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots In-Reply-To: <86ectkn5rv.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 09 Aug 2025 13:33:40 +0300") References: <86ectkn5rv.fsf@gnu.org> Date: Mon, 11 Aug 2025 12:03:16 -0400 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1754928196; bh=4zW8lHefI7kCXEYvtMYxhF85f4pFsqFcus+PeWu9Cvc=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=l7ftICXTSElXp1IdRn6OUnetyhgqgJuDCqTSsngfseNZ/tLAfPqLkE/UDf9KH8gRB 6vzplT1otB1b4cnGPPkdS6QeVPaoSaA4QaAXYrnPof+qeIaW/f6pDG0+t3Mwy7GgbX SAZsJiq/iaeny1aWCdrolbdVXUniVDU547jkzCLsSRGDmNtvdm8nOTRaTRUH1t2ccx B6CYPfoI/VGmn3xoydHrThJozeR1kibPdWkh4tnP3eyPka34ZVxaW5bFL7H3zJBVvZ Ji85nIQsaWZFkYuv/kdevgwW+sHuG80LuRcs/djcTEgZWQuRbYZeVv7CIway3rOX2/ T/1cL1vefZd6g== X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Eli Zaretskii writes: >> Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, app-emacs-dev@janestreet.com >> Date: Fri, 08 Aug 2025 16:34:10 -0400 >> From: Spencer Baugh via "Bug reports for GNU Emacs, >> the Swiss army knife of text editors" >> Here's a patch to fix it by initializing these fields. > > Thanks, but I don't think this is right. First, > update_processes_for_thread_death clears the .thread member of > fd_callback_info of those processes that were locked to the dying > thread. Second, wait_reading_process_output_unwind clears the > .waiting_thread member of fd_callback_info for all descriptors, when > we exit wait_reading_process_output (which is called by > accept-process-output). Yes. > And finally, IMO it is wrong for the add_*_fd functions to overwrite > these two members, because they are set and reset by other functions, > mainly those which compute the wait mask for the pselect call. > add_*_fd should not touch these members. I think we should at least add an eassert in those functions which checks that .waiting_thread is null. It is never valid for that to be non-null in add_*_fd. > I believe in this case update_processes_for_thread_death doesn't do > its job because your scenario kills the process before the thread > dies, and update_processes_for_thread_death only looks at the > processes that are in Vprocess_alist. So it is possible that all > that's missing is that delete_read_fd and delete_write_fd (which are > called from deactivate_process) should clear the .thread member of > fd_callback_info corresponding to the process's input and output > descriptors. > > If that doesn't fix the issue with your reproducer, please investigate > why, and let's take it from there. Yes, that does fix the issue with my reproducer. But I think it may cause other issues. delete_read_fd and delete_write_fd are called from many other places. For example, delete_write_fd is called before calling add_process_read_fd on (I think?) the same fd in wait_reading_process_output. Clearing the .thread there would break the guarantees of set-process-thread. Further complicating the issue: I think there's deeper issues with set-process-thread. There's also incorrect information in the manual about it: processes.texi: > by default a process is locked to the thread that created it. When a > process is locked to a thread, output from the process can only be > accepted by that thread. ... > @defun set-process-thread process thread > Set the locking thread of @var{process} to @var{thread}. @var{thread} > may be @code{nil}, in which case the process is unlocked. > @end defun This is not true. By default .thread is NULL, i.e., processes are not locked to any thread. I think this behavior was just never implemented. But, it doesn't seem to have been needed, since no-one has complained about the lack of it in the last 13 years since threads were released. I did a quick survey of 300 ELPA packages and there's no use of set-process-thread in any of them. And set-process-thread is not useful in practice if processes don't start out locked to a particular thread, since trying to lock a process to a thread if it's already unlocked is inherently racy. So I think there are two good options. Either: A. Make the code match the manual by locking processes to threads by default, and somehow fix the issues this causes with breaking fd_callback_info slots. Or: B. Deprecate set-process-thread and make it a no-op. Simply delete this incorrect section of the manual. My preference is for B, since this also reduces complexity and will slightly improve performance. From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 11 12:51:38 2025 Received: (at 79201) by debbugs.gnu.org; 11 Aug 2025 16:51:38 +0000 Received: from localhost ([127.0.0.1]:49981 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ulVkH-00035S-VX for submit@debbugs.gnu.org; Mon, 11 Aug 2025 12:51:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54274) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ulVkD-00035A-8n for 79201@debbugs.gnu.org; Mon, 11 Aug 2025 12:51:34 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ulVk6-0007h3-08; Mon, 11 Aug 2025 12:51:26 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=3PHYeUPHX9iayx2MsmhJGbVdsKMUz96v+uBDd8Ol5q4=; b=m5D5e8ma9dOP KV7Hr6gNafDyJCjFsTezQJgAgqZSftGVI8CNgah17K/g9x1PCK0tkc0/ox8icABDGJySMBb6ZxsRE v35Z/jaciwF2swYqKSa/GjhZAA5BIFfzuM0jJOcRLfOGtCG2V/1WDectdDgu+Iwg5+sjrTL94+mal 62AIr7a9FfCU6OS7hY8RVVEX4pCnSfbq7GW8FRiNAqrs6jiJadyy4i+A9dsj/zhICy8mD6QQ2cY7x oddbPNDQcFDHncgS1CROeLzxyZekq4m9q3Tx+jpQqT3F1DNkWItNI2t5W9daA7yVRsKT6pVXBNaYE +F+W0d6UD5QswTL/qZOL6A==; Date: Mon, 11 Aug 2025 19:51:08 +0300 Message-Id: <86y0rpajk3.fsf@gnu.org> From: Eli Zaretskii To: Spencer Baugh In-Reply-To: (message from Spencer Baugh on Mon, 11 Aug 2025 12:03:16 -0400) Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots References: <86ectkn5rv.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Spencer Baugh > Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, > app-emacs-dev@janestreet.com > Date: Mon, 11 Aug 2025 12:03:16 -0400 > > > And finally, IMO it is wrong for the add_*_fd functions to overwrite > > these two members, because they are set and reset by other functions, > > mainly those which compute the wait mask for the pselect call. > > add_*_fd should not touch these members. > > I think we should at least add an eassert in those functions which > checks that .waiting_thread is null. It is never valid for that to be > non-null in add_*_fd. Maybe, maybe not. I'm not yet sure. Let's discuss this later. > > I believe in this case update_processes_for_thread_death doesn't do > > its job because your scenario kills the process before the thread > > dies, and update_processes_for_thread_death only looks at the > > processes that are in Vprocess_alist. So it is possible that all > > that's missing is that delete_read_fd and delete_write_fd (which are > > called from deactivate_process) should clear the .thread member of > > fd_callback_info corresponding to the process's input and output > > descriptors. > > > > If that doesn't fix the issue with your reproducer, please investigate > > why, and let's take it from there. > > Yes, that does fix the issue with my reproducer. Then we should do that, I think. > But I think it may cause other issues. delete_read_fd and > delete_write_fd are called from many other places. For example, > delete_write_fd is called before calling add_process_read_fd on (I > think?) the same fd in wait_reading_process_output. I don't see these calls to delete an fd before adding it, please point to the code you had in mind. When a descriptor is deleted, its information is completely forgotten, so there's no reason to keep the .thread and .waiting_thread members. > Clearing the .thread there would break the guarantees of > set-process-thread. I don't think set-process-thread has anything to do with this, see below. > Further complicating the issue: I think there's deeper issues with > set-process-thread. There's also incorrect information in the manual > about it: > > processes.texi: > > by default a process is locked to the thread that created it. When a > > process is locked to a thread, output from the process can only be > > accepted by that thread. > ... > > @defun set-process-thread process thread > > Set the locking thread of @var{process} to @var{thread}. @var{thread} > > may be @code{nil}, in which case the process is unlocked. > > @end defun > > This is not true. By default .thread is NULL, i.e., processes are not > locked to any thread. set-process-thread doesn't set the .thread member of the descriptors, it sets the .thread member of the process object. Here's the code from make-process that processes.texi has in mind: static Lisp_Object make_process (Lisp_Object name) { struct Lisp_Process *p = allocate_process (); /* Initialize Lisp data. Note that allocate_process initializes all Lisp data to nil, so do it only for slots which should not be nil. */ pset_status (p, Qrun); pset_mark (p, Fmake_marker ()); pset_thread (p, Fcurrent_thread ()); <<<<<<<<<<<<<<<<<<<< And here's accept-process-output heeding this: if (! NILP (process)) { CHECK_PROCESS (process); struct Lisp_Process *proc = XPROCESS (process); /* Can't wait for a process that is dedicated to a different thread. */ if (!NILP (proc->thread) && !BASE_EQ (proc->thread, Fcurrent_thread ())) { Lisp_Object proc_thread_name = XTHREAD (proc->thread)->name; error ("Attempt to accept output from process %s locked to thread %s", SDATA (proc->name), STRINGP (proc_thread_name) ? SDATA (proc_thread_name) : SDATA (Fprin1_to_string (proc->thread, Qt, Qnil))); } The .thread and .waiting_thread are set when we prepare descriptors for passing them to pselect, they are a different (although related) mechanism. > I think this behavior was just never implemented. But, it doesn't seem > to have been needed, since no-one has complained about the lack of it in > the last 13 years since threads were released. See above: it's both implemented and working. You are welcome to add tests for that. > So I think there are two good options. Either: > > A. Make the code match the manual by locking processes to threads by > default, and somehow fix the issues this causes with breaking > fd_callback_info slots. > > Or: > B. Deprecate set-process-thread and make it a no-op. Simply delete this > incorrect section of the manual. > > My preference is for B, since this also reduces complexity and will > slightly improve performance. Given the code I show above, do you still see a problem? If yes, please tell the details. From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 11 13:20:19 2025 Received: (at 79201) by debbugs.gnu.org; 11 Aug 2025 17:20:19 +0000 Received: from localhost ([127.0.0.1]:50052 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ulWC2-0004Ok-CM for submit@debbugs.gnu.org; Mon, 11 Aug 2025 13:20:18 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:40993) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ulWBw-0004Lz-Lf for 79201@debbugs.gnu.org; Mon, 11 Aug 2025 13:20:14 -0400 From: Spencer Baugh To: Eli Zaretskii Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots In-Reply-To: <86y0rpajk3.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 11 Aug 2025 19:51:08 +0300") References: <86ectkn5rv.fsf@gnu.org> <86y0rpajk3.fsf@gnu.org> Date: Mon, 11 Aug 2025 13:20:05 -0400 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1754932805; bh=XOdPv65UgLnMIYETyKeVUcCeVG+IshJY+nmZio6EUrM=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=xL7L/p7QE6PRqge+7PO6GbC6JMBkhKbLp12TXzdHlmc8ExEZb6+iOi136etiEfQNL ffBuyqpm5K9tM15wqAWX+0/IZUhoyrKS4KIIpcwJjyvSnAZ7xFLJXoqGCDmZ1Dusk1 BTeZfshNeRGJB8TOwuRMnbjpNRVL+G0oFd0KKt7kbyhYNQjF1Z1oFq+7wzKWzH3fcV iuA2uwFAokX689EdCrD6JJTz9dH2D6RJS4Y2Kv8sFfPDbYkzGiayoFsdJ7AnW61rTn XMIQsEZXmxx2X1lfbOGdpc2tZmnvmB5jlLszbCvHkzPVFKshKCab5SaPWvv3lTmgON iZAzfL8jUQc+Q== X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Eli Zaretskii writes: >> From: Spencer Baugh >> Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, >> app-emacs-dev@janestreet.com >> Date: Mon, 11 Aug 2025 12:03:16 -0400 >> >> > And finally, IMO it is wrong for the add_*_fd functions to overwrite >> > these two members, because they are set and reset by other functions, >> > mainly those which compute the wait mask for the pselect call. >> > add_*_fd should not touch these members. >> >> I think we should at least add an eassert in those functions which >> checks that .waiting_thread is null. It is never valid for that to be >> non-null in add_*_fd. > > Maybe, maybe not. I'm not yet sure. Let's discuss this later. > >> > I believe in this case update_processes_for_thread_death doesn't do >> > its job because your scenario kills the process before the thread >> > dies, and update_processes_for_thread_death only looks at the >> > processes that are in Vprocess_alist. So it is possible that all >> > that's missing is that delete_read_fd and delete_write_fd (which are >> > called from deactivate_process) should clear the .thread member of >> > fd_callback_info corresponding to the process's input and output >> > descriptors. >> > >> > If that doesn't fix the issue with your reproducer, please investigate >> > why, and let's take it from there. >> >> Yes, that does fix the issue with my reproducer. > > Then we should do that, I think. > >> But I think it may cause other issues. delete_read_fd and >> delete_write_fd are called from many other places. For example, >> delete_write_fd is called before calling add_process_read_fd on (I >> think?) the same fd in wait_reading_process_output. > > I don't see these calls to delete an fd before adding it, please point > to the code you had in mind. When a descriptor is deleted, its > information is completely forgotten, so there's no reason to keep the > .thread and .waiting_thread members. > >> Clearing the .thread there would break the guarantees of >> set-process-thread. > > I don't think set-process-thread has anything to do with this, see > below. set-process-thread is the only code which ever sets the .thread field of fd_callback_info to a non-null value. >> Further complicating the issue: I think there's deeper issues with >> set-process-thread. There's also incorrect information in the manual >> about it: >> >> processes.texi: >> > by default a process is locked to the thread that created it. When a >> > process is locked to a thread, output from the process can only be >> > accepted by that thread. >> ... >> > @defun set-process-thread process thread >> > Set the locking thread of @var{process} to @var{thread}. @var{thread} >> > may be @code{nil}, in which case the process is unlocked. >> > @end defun >> >> This is not true. By default .thread is NULL, i.e., processes are not >> locked to any thread. > > set-process-thread doesn't set the .thread member of the descriptors, No, it does. Here's the code: DEFUN ("set-process-thread", Fset_process_thread, Sset_process_thread, 2, 2, 0, doc: /* Set the locking thread of PROCESS to be THREAD. If THREAD is nil, the process is unlocked. */) (Lisp_Object process, Lisp_Object thread) { ... if (proc->infd >= 0) fd_callback_info[proc->infd].thread = tstate; eassert (proc->outfd < FD_SETSIZE); if (proc->outfd >= 0) fd_callback_info[proc->outfd].thread = tstate; return thread; } I think set-process-thread should not do this, though. It should only set the thread member of the process object. It should not set the thread member of the descriptors. > it sets the .thread member of the process object. Here's the code > from make-process that processes.texi has in mind: > > static Lisp_Object > make_process (Lisp_Object name) > { > struct Lisp_Process *p = allocate_process (); > /* Initialize Lisp data. Note that allocate_process initializes all > Lisp data to nil, so do it only for slots which should not be nil. */ > pset_status (p, Qrun); > pset_mark (p, Fmake_marker ()); > pset_thread (p, Fcurrent_thread ()); <<<<<<<<<<<<<<<<<<<< > > And here's accept-process-output heeding this: > > if (! NILP (process)) > { > CHECK_PROCESS (process); > struct Lisp_Process *proc = XPROCESS (process); > > /* Can't wait for a process that is dedicated to a different > thread. */ > if (!NILP (proc->thread) && !BASE_EQ (proc->thread, Fcurrent_thread ())) > { > Lisp_Object proc_thread_name = XTHREAD (proc->thread)->name; > > error ("Attempt to accept output from process %s locked to thread %s", > SDATA (proc->name), > STRINGP (proc_thread_name) > ? SDATA (proc_thread_name) > : SDATA (Fprin1_to_string (proc->thread, Qt, Qnil))); > } > > The .thread and .waiting_thread are set when we prepare descriptors > for passing them to pselect, they are a different (although related) > mechanism. > >> I think this behavior was just never implemented. But, it doesn't seem >> to have been needed, since no-one has complained about the lack of it in >> the last 13 years since threads were released. > > See above: it's both implemented and working. You are welcome to add > tests for that. Yes, sorry, I missed that there are two separate mechanisms. I think set-process-thread should continue setting the thread member of the process object, but not the thread member of the fd_callback_info struct. >> So I think there are two good options. Either: >> >> A. Make the code match the manual by locking processes to threads by >> default, and somehow fix the issues this causes with breaking >> fd_callback_info slots. >> >> Or: >> B. Deprecate set-process-thread and make it a no-op. Simply delete this >> incorrect section of the manual. >> >> My preference is for B, since this also reduces complexity and will >> slightly improve performance. > > Given the code I show above, do you still see a problem? If yes, > please tell the details. Yes. Let's just make set-process-thread stop setting the thread member of the fd_callback_info struct. From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 11 13:27:04 2025 Received: (at 79201) by debbugs.gnu.org; 11 Aug 2025 17:27:04 +0000 Received: from localhost ([127.0.0.1]:50063 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ulWIZ-0004iw-JN for submit@debbugs.gnu.org; Mon, 11 Aug 2025 13:27:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58224) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ulWIU-0004iO-HL for 79201@debbugs.gnu.org; Mon, 11 Aug 2025 13:27:00 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ulWIL-0005JA-Eo; Mon, 11 Aug 2025 13:26:49 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=MiRWYuHmssKtM85VIrREvEUuequSh1sxWKOWNSmpaGA=; b=Nrf7Z9fNd8Qn YHJ31us9+D553BIG3LeKRmb7pQlKrTOgrF8w7H/cyslA8BHj16UJaPdw8Lu9rewf8LaUbRxVThtUc SH2f+shcdOxTcHHh3IEgxBqYzrLMKDwHTIz9vNDX46cEy4H7bY0ozyia1JbYLCybLNHNaa4Z0ulOh zbO+M3dp3gxNC/lx7ymddViDRtiQFHgHkV08fyNC2ss/3LG9txK/gHmXPpfAObpnrvxo72yxPWzOi I9mg+RbMDrqHhPSuhav6v4BzEnx5LD2dRRLOkFbqaYL3VZ92Y1pTf30BQyJLJXuO4HVWELfX+7DOH BuWamlFxN0PRjIGJ2miCNA==; Date: Mon, 11 Aug 2025 20:26:38 +0300 Message-Id: <86tt2dahwx.fsf@gnu.org> From: Eli Zaretskii To: Spencer Baugh In-Reply-To: (message from Spencer Baugh on Mon, 11 Aug 2025 13:20:05 -0400) Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots References: <86ectkn5rv.fsf@gnu.org> <86y0rpajk3.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Spencer Baugh > Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, > app-emacs-dev@janestreet.com > Date: Mon, 11 Aug 2025 13:20:05 -0400 > > Yes, sorry, I missed that there are two separate mechanisms. I think > set-process-thread should continue setting the thread member of the > process object, but not the thread member of the fd_callback_info struct. Why not? set-process-thread can be called when the descriptors for the process are already set up. That's unlike make-process, which indeed doesn't set the .thread member of the descriptors. > Yes. Let's just make set-process-thread stop setting the thread member > of the fd_callback_info struct. I don't agree it shouldn't see above. What problems do you see as the result of that? From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 11 13:58:18 2025 Received: (at 79201) by debbugs.gnu.org; 11 Aug 2025 17:58:18 +0000 Received: from localhost ([127.0.0.1]:50116 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ulWmn-00066U-4P for submit@debbugs.gnu.org; Mon, 11 Aug 2025 13:58:18 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:47329) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ulWmf-000668-8q for 79201@debbugs.gnu.org; Mon, 11 Aug 2025 13:58:10 -0400 From: Spencer Baugh To: Eli Zaretskii Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots In-Reply-To: <86tt2dahwx.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 11 Aug 2025 20:26:38 +0300") References: <86ectkn5rv.fsf@gnu.org> <86y0rpajk3.fsf@gnu.org> <86tt2dahwx.fsf@gnu.org> Date: Mon, 11 Aug 2025 13:58:02 -0400 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1754935082; bh=Q43Tl5wPcRHplAXRIS1u4vgc9IJNNqXwaYPSxvsZh2g=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=jzzgurJiJ0GazERFQCYuEwpNSDmm27TvI/X3ZEdIZjOoO8l5JjwhZdz+KhS7k2iWj dgU/OQOqeERM7oHn8ozbXW+dA4hVq3n493tzr6rY9B9oGhygy4Jr3I2Ai2bUlu5t4h ks7BHT23G4FzczRa+ijhjPA3O8QJEraMaXupwoFD9uoCyimLb/+wXeGO90RnNEiRTE Z1iYMbEIwP/1AQmGg+BSBeoE4Pz4NhYwCziGvbwRdFBghBU2yTN4HAJyFE6sJyixM7 5m3bQS26083FDPCuEEoAlgIOFbiS2nlwcPf3JBKq/3equ2sG8MlBUYt/oCXh8izw94 SpBVEUvp0UUwg== X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) --=-=-= Content-Type: text/plain Eli Zaretskii writes: >> From: Spencer Baugh >> Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, >> app-emacs-dev@janestreet.com >> Date: Mon, 11 Aug 2025 13:20:05 -0400 >> >> Yes, sorry, I missed that there are two separate mechanisms. I think >> set-process-thread should continue setting the thread member of the >> process object, but not the thread member of the fd_callback_info struct. > > Why not? set-process-thread can be called when the descriptors for > the process are already set up. That's unlike make-process, which > indeed doesn't set the .thread member of the descriptors. It's useless and causes this bug. >> Yes. Let's just make set-process-thread stop setting the thread member >> of the fd_callback_info struct. > > I don't agree it shouldn't see above. What problems do you see as the > result of that? It causes this bug. Here's an attached patch which makes this change. The thread member of fd_callback_info is then never set, so it can be entirely deleted. This doesn't need an announcement, because the behavior of "calling accept-process-output on a process locked to a different thread" still works the same way it always did. There's no (non-bug) user-visible change. --=-=-= Content-Type: text/x-patch Content-Disposition: inline; filename=0001-Remove-thread-member-of-fd_callback_info.patch >From b5311d3af0d020f718b62f239f9e360379276230 Mon Sep 17 00:00:00 2001 From: Spencer Baugh Date: Mon, 11 Aug 2025 13:54:14 -0400 Subject: [PATCH] Remove thread member of fd_callback_info Previously, if set-process-thread was called on a process after its creation, we would set the thread field in the fd_callback_info for that process's file descriptors. This behavior isn't useful, and cauases bugs since this field could stay set to a dangling pointer if delete-process is called before thread exit. So simply remove it. * src/process.c (compute_input_wait_mask) (compute_non_process_wait_mask, compute_non_keyboard_wait_mask) (compute_write_mask, update_processes_for_thread_death) (Fset_process_thread): Remove thread member of fd_callback_info (bug#79201) --- src/process.c | 34 ++-------------------------------- 1 file changed, 2 insertions(+), 32 deletions(-) diff --git a/src/process.c b/src/process.c index e61ec425f7e..1a5c99139a6 100644 --- a/src/process.c +++ b/src/process.c @@ -591,9 +591,6 @@ compute_input_wait_mask (fd_set *mask) eassert (max_desc < FD_SETSIZE); for (fd = 0; fd <= max_desc; ++fd) { - if (fd_callback_info[fd].thread != NULL - && fd_callback_info[fd].thread != current_thread) - continue; if (fd_callback_info[fd].waiting_thread != NULL && fd_callback_info[fd].waiting_thread != current_thread) continue; @@ -614,9 +611,6 @@ compute_non_process_wait_mask (fd_set *mask) eassert (max_desc < FD_SETSIZE); for (fd = 0; fd <= max_desc; ++fd) { - if (fd_callback_info[fd].thread != NULL - && fd_callback_info[fd].thread != current_thread) - continue; if (fd_callback_info[fd].waiting_thread != NULL && fd_callback_info[fd].waiting_thread != current_thread) continue; @@ -638,9 +632,6 @@ compute_non_keyboard_wait_mask (fd_set *mask) eassert (max_desc < FD_SETSIZE); for (fd = 0; fd <= max_desc; ++fd) { - if (fd_callback_info[fd].thread != NULL - && fd_callback_info[fd].thread != current_thread) - continue; if (fd_callback_info[fd].waiting_thread != NULL && fd_callback_info[fd].waiting_thread != current_thread) continue; @@ -662,9 +653,6 @@ compute_write_mask (fd_set *mask) eassert (max_desc < FD_SETSIZE); for (fd = 0; fd <= max_desc; ++fd) { - if (fd_callback_info[fd].thread != NULL - && fd_callback_info[fd].thread != current_thread) - continue; if (fd_callback_info[fd].waiting_thread != NULL && fd_callback_info[fd].waiting_thread != current_thread) continue; @@ -976,12 +964,6 @@ update_processes_for_thread_death (Lisp_Object dying_thread) struct Lisp_Process *proc = XPROCESS (process); pset_thread (proc, Qnil); - eassert (proc->infd < FD_SETSIZE); - if (proc->infd >= 0) - fd_callback_info[proc->infd].thread = NULL; - eassert (proc->outfd < FD_SETSIZE); - if (proc->outfd >= 0) - fd_callback_info[proc->outfd].thread = NULL; } } } @@ -1451,25 +1433,13 @@ DEFUN ("set-process-thread", Fset_process_thread, Sset_process_thread, (Lisp_Object process, Lisp_Object thread) { struct Lisp_Process *proc; - struct thread_state *tstate; CHECK_PROCESS (process); - if (NILP (thread)) - tstate = NULL; - else - { - CHECK_THREAD (thread); - tstate = XTHREAD (thread); - } + if (!NILP (thread)) + CHECK_THREAD (thread); proc = XPROCESS (process); pset_thread (proc, thread); - eassert (proc->infd < FD_SETSIZE); - if (proc->infd >= 0) - fd_callback_info[proc->infd].thread = tstate; - eassert (proc->outfd < FD_SETSIZE); - if (proc->outfd >= 0) - fd_callback_info[proc->outfd].thread = tstate; return thread; } -- 2.39.3 --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 11 14:10:25 2025 Received: (at 79201) by debbugs.gnu.org; 11 Aug 2025 18:10:25 +0000 Received: from localhost ([127.0.0.1]:50135 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ulWyW-0006iP-8f for submit@debbugs.gnu.org; Mon, 11 Aug 2025 14:10:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38398) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ulWyI-0006d6-2r for 79201@debbugs.gnu.org; Mon, 11 Aug 2025 14:10:13 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ulWy8-0004pB-PL; Mon, 11 Aug 2025 14:10:00 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=/L8phWaLJshFElQ1kp+tGLDWI6H+vNN0XHtW42yJgRM=; b=k4oVwhIqsDE7 SBzfwX8kF4MEi+uljmMzoMM2cB2v5oI3vuBW5tWPmleWF9O9YwhXcfL0kNQLn0yEpS+uJonE+G8mA Jur8OsH48jUOATfJbsqW77DKbRkW8z5FVs8K0xWpHMTJcDjRYm8ZEs9WkgOvsr937wpTsGP5tfocB +FPgYmoN4w+ZYmKkDUs9BrBK/AFpOUvJWCMYRreQyenTWTb2kzeonkRYuppOEVY/bzJOuf3BPIUcB tp96yKTUnytBjrGKCU7Guhv8POyB0uM0yVZMtAptC/NTjtNebY4e6hTWjOMNQAZGV1iDCRYDVpJ2o uSphJFzm+0Wijxw7kRfq1g==; Date: Mon, 11 Aug 2025 21:09:30 +0300 Message-Id: <86pld1afxh.fsf@gnu.org> From: Eli Zaretskii To: Spencer Baugh In-Reply-To: (message from Spencer Baugh on Mon, 11 Aug 2025 13:58:02 -0400) Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots References: <86ectkn5rv.fsf@gnu.org> <86y0rpajk3.fsf@gnu.org> <86tt2dahwx.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Spencer Baugh > Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, > app-emacs-dev@janestreet.com > Date: Mon, 11 Aug 2025 13:58:02 -0400 > > >> Yes, sorry, I missed that there are two separate mechanisms. I think > >> set-process-thread should continue setting the thread member of the > >> process object, but not the thread member of the fd_callback_info struct. > > > > Why not? set-process-thread can be called when the descriptors for > > the process are already set up. That's unlike make-process, which > > indeed doesn't set the .thread member of the descriptors. > > It's useless and causes this bug. You ignored my explanation why I think it is not useless. Please don't ignore it. What happens if set-process-thread is called when the descriptors are already set? > >> Yes. Let's just make set-process-thread stop setting the thread member > >> of the fd_callback_info struct. > > > > I don't agree it shouldn't see above. What problems do you see as the > > result of that? > > It causes this bug. Didn't you agree that cleaning up in delete_read_fd and delete_write_fd solves the problem? The problem would not exists if the thread dies before the process exits or is terminated, right? > Here's an attached patch which makes this change. The thread member of > fd_callback_info is then never set, so it can be entirely deleted. Sorry, I cannot accept such changes without a very good reason. This is not some random code or a typo, this stuff was deliberately written like that, and written for a reason. If you disagree with the design, let's first make sure we understand the design and its rationale, and then please explain why the design is wrong. It is not enough to show a single bug, especially if it can be fixed in other ways that don't remove essential parts of the design. > This doesn't need an announcement, because the behavior of "calling > accept-process-output on a process locked to a different thread" still > works the same way it always did. There's no (non-bug) user-visible > change. Except if set-process-thread is called at some un-opportune moment. From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 11 14:53:09 2025 Received: (at 79201) by debbugs.gnu.org; 11 Aug 2025 18:53:09 +0000 Received: from localhost ([127.0.0.1]:50269 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ulXds-0000MT-NS for submit@debbugs.gnu.org; Mon, 11 Aug 2025 14:53:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54340) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ulXdl-0000Ls-8H for 79201@debbugs.gnu.org; Mon, 11 Aug 2025 14:53:02 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ulXdd-0004OK-OQ; Mon, 11 Aug 2025 14:52:53 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=vXvSYS50AGCaZ+Smhtc6o+p15+PLjWo0/urLxIyHKyI=; b=BbmNc0M/upW8 unMrtJs698ZsU0qFhb+2eWahMGJ8/H3qsLU11z6eKDtuuHXJITibhZYDRhn3MDNIu8Z32a1ujbw1R bVc0MvagdDKr8OGs2GoCYUzgYjl1u/RcS7QnOQb08qHF7UktxwvSQX8Qk5EGuohH+rmqXelQbw5re ES6vcZWCsGLChAgC2oKEuoBReb7A8Aps4wDC4aAtAeuSsl8tkA/IwIhxKf25D5Xiihdu2WUATWfMY H3OB58M2CURARW3SQTTHjENClnqe0G22VAGEMr2SBEjpO7jiXq+f0mvw5bCPhEH821qfvvH8cwa2n 4dX1B56Qt/y5+WL7pxR0+A==; Date: Mon, 11 Aug 2025 21:52:48 +0300 Message-Id: <86jz39adxb.fsf@gnu.org> From: Eli Zaretskii To: sbaugh@janestreet.com In-Reply-To: <86pld1afxh.fsf@gnu.org> (message from Eli Zaretskii on Mon, 11 Aug 2025 21:09:30 +0300) Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots References: <86ectkn5rv.fsf@gnu.org> <86y0rpajk3.fsf@gnu.org> <86tt2dahwx.fsf@gnu.org> <86pld1afxh.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, > app-emacs-dev@janestreet.com > Date: Mon, 11 Aug 2025 21:09:30 +0300 > From: Eli Zaretskii > > Sorry, I cannot accept such changes without a very good reason. This > is not some random code or a typo, this stuff was deliberately written > like that, and written for a reason. If you disagree with the design, > let's first make sure we understand the design and its rationale, and > then please explain why the design is wrong. It is not enough to show > a single bug, especially if it can be fixed in other ways that don't > remove essential parts of the design. > > > This doesn't need an announcement, because the behavior of "calling > > accept-process-output on a process locked to a different thread" still > > works the same way it always did. There's no (non-bug) user-visible > > change. > > Except if set-process-thread is called at some un-opportune moment. I actually think that if there is a bug, then it's in make-process: it should set the .thread member of the process's input and output descriptors with the current thread, like set-process-thread does. In addition to cleaning up in delete_read_fd and delete_write_fd, this should fix the problems discussed here. From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 11 18:32:54 2025 Received: (at 79201) by debbugs.gnu.org; 11 Aug 2025 22:32:54 +0000 Received: from localhost ([127.0.0.1]:50552 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ulb4Q-0002C6-Ps for submit@debbugs.gnu.org; Mon, 11 Aug 2025 18:32:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42282) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ulb4N-0002Bn-2D for 79201@debbugs.gnu.org; Mon, 11 Aug 2025 18:32:44 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ulb4D-0002v4-Dy; Mon, 11 Aug 2025 18:32:33 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=Subject:References:In-Reply-To:To:From:Date: MIME-Version; bh=CK/V4+yQN35jkhpOZoP2zsMMfIWGydesl83jLNtrryI=; b=IFvAp7wRDGKv p8sUpSAlaKxD6y+T6YJdmH3sybxry6iquOeNm0VE72y70noRcqEixUAU3J+NXBdkHe6du9xPXb4CK 1T3Uc915/fFRBlYda71ZNVQ9BIFyg9cXWvb5AVmeqJ7JXbvW3vaHldQABBblQuPiZYbUgEVCnNqam Pl971BT0v/J30CwztQgvdkgSrwyD4+f5JfT/T1Uo9OoH/r+R+tyoRpUB5JNVLUeokXBmUPsa8xH6v 4BY/8UCEJeMf8H0W3703n49zLm0ggr/S35CAPJBmP/SHau9qvhEKBMFkByRd4egtgEISXi2Mr+ra9 XTROsFyxfCGm0jxfevuIwQ==; X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdefgddufeefieeiucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfurfetoffkrfgpnffqhgenuceu rghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujf gurhepofggfffhvfevkfgjfhfutgfgsehtjeertdertddtnecuhfhrohhmpedflfhohhhn ucghihgvghhlvgihfdcuoehjohhhnhifsehgnhhurdhorhhgqeenucggtffrrghtthgvrh hnpedvveehheettdfgvdeuvdejuedtheektdfhjeeiueejvdfhleeggfeljeetjefgvden ucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehjohhhnh ifodhmvghsmhhtphgruhhthhhpvghrshhonhgrlhhithihqdeikeejkedtleeggedqudej jeehfeekudeiqdhjohhhnhifpeepghhnuhdrohhrghesnhgvfigrrhhtihhsrghnshdrtg homhdpnhgspghrtghpthhtohepiedpmhhouggvpehsmhhtphhouhhtpdhrtghpthhtohep jeelvddtudesuggvsggsuhhgshdrghhnuhdrohhrghdprhgtphhtthhopeguihgtkhdrrh drtghhihgrnhhgsehgmhgrihhlrdgtohhmpdhrtghpthhtohepvghlihiisehgnhhurdho rhhgpdhrtghpthhtohepughmihhtrhihsehguhhtohhvrdguvghvpdhrtghpthhtoheprg hpphdqvghmrggtshdquggvvhesjhgrnhgvshhtrhgvvghtrdgtohhmpdhrtghpthhtohep shgsrghughhhsehjrghnvghsthhrvggvthdrtghomh X-ME-Proxy: Feedback-ID: ib64945b7:Fastmail X-Mailer: MessagingEngine.com Webmail Interface MIME-Version: 1.0 X-ThreadId: Ta412df8d01fbaeac Date: Mon, 11 Aug 2025 15:32:09 -0700 From: "John Wiegley" To: dick.r.chiang@gmail.com, "Eli Zaretskii" Message-Id: In-Reply-To: <87frdxk6v7.fsf@dick> References: <86ectkn5rv.fsf@gnu.org> <86y0rpajk3.fsf@gnu.org> <86tt2dahwx.fsf@gnu.org> <86pld1afxh.fsf@gnu.org> <87frdxk6v7.fsf@dick> Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, Spencer Baugh , app-emacs-dev@janestreet.com, dmitry@gutov.dev X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) This is neither helpful nor constructive. If there are changes to be made, please provide the evidence and justification asked for. Insulting maintainers will not motivate the result you seek. John On Mon, Aug 11, 2025, at 12:15 PM, dick.r.chiang@gmail.com wrote: > EZ> This is not some random code or a typo > > Yeah it is. You just don't know it yet. > > Brother Spencer, why bother arguing with an idiot? > > Incidentally, the fd_callback_info cruft admits sporadic hangs > documented in Bug#49897. They only get exposed when using multiple > threads, which very few emacs programs do. Fixed in my fork. From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 12 00:06:47 2025 Received: (at 79201) by debbugs.gnu.org; 12 Aug 2025 04:06:47 +0000 Received: from localhost ([127.0.0.1]:51081 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ulgHe-0001Iz-J3 for submit@debbugs.gnu.org; Tue, 12 Aug 2025 00:06:47 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45324) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ulgHZ-0001Ii-Pz for 79201@debbugs.gnu.org; Tue, 12 Aug 2025 00:06:43 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ulgHP-0003kS-F6; Tue, 12 Aug 2025 00:06:31 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-Version:Date:References:In-Reply-To:Subject:To: From; bh=e/VdAILI4GwAi1QeF2pNI2tSh+Do+uThiQ3E4/9kPQA=; b=KhEQIt+OOSseXLK+VpiA UMQBh6HHY79ZMZpwoskpyx1hykAWlJbFcualhdAJpqxzDvRnK9CjIak7ZETN35YA+tbV/jR7oRozW aDtnc/cQwdKmgPtkNoae7ea4ZybWVZmZdY+0n42UFYHcfOVLDSAGYEtFFnVokGZvV+aupCKDb3/TO 4iGwjgJ2kh6iNgP0atdTyH9msVMiTi4f1O/oWWrjr8Vu2rUFqIQZI3luSNorfjyrNHImeLPrdda9+ jdv7x/G2kxHOT5BIx9NI3pgzOCk7KmKuUJl/JFMtbfywDff3AfC85XR+DhUpLQJUZLkdUEWkbs6gg rJEgIeMg/XvnFQ==; X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdefgddufeegfeefucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfurfetoffkrfgpnffqhgenuceu rghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujf gurhephffvvefujghffffkfgggtgfgsehtqhertddtreejnecuhfhrohhmpeflohhhnhcu hghivghglhgvhicuoehjohhhnhifsehgnhhurdhorhhgqeenucggtffrrghtthgvrhhnpe eiffdthedviedvtdfffeffffevkeeludejieeltddufeehfefhhfefgeejteefgeenucff ohhmrghinhepnhgvfigrrhhtihhsrghnshdrtghomhenucevlhhushhtvghrufhiiigvpe dtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehjohhhnhifodhmvghsmhhtphgruhhthhhp vghrshhonhgrlhhithihqdeikeejkedtleeggedqudejjeehfeekudeiqdhjohhhnhifpe epghhnuhdrohhrghesnhgvfigrrhhtihhsrghnshdrtghomhdpnhgspghrtghpthhtohep iedpmhhouggvpehsmhhtphhouhhtpdhrtghpthhtoheprghpphdqvghmrggtshdquggvvh esjhgrnhgvshhtrhgvvghtrdgtohhmpdhrtghpthhtohepughmihhtrhihsehguhhtohhv rdguvghvpdhrtghpthhtohepjeelvddtudesuggvsggsuhhgshdrghhnuhdrohhrghdprh gtphhtthhopehssggruhhghhesjhgrnhgvshhtrhgvvghtrdgtohhmpdhrtghpthhtohep vghlihiisehgnhhurdhorhhgpdhrtghpthhtohepughitghkrdhrrdgthhhirghnghesgh hmrghilhdrtghomh X-ME-Proxy: Feedback-ID: ib64945b7:Fastmail From: John Wiegley To: Dick Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots In-Reply-To: (Dick's message of "Mon, 11 Aug 2025 18:55:27 -0400") References: <86ectkn5rv.fsf@gnu.org> <86y0rpajk3.fsf@gnu.org> <86tt2dahwx.fsf@gnu.org> <86pld1afxh.fsf@gnu.org> <87frdxk6v7.fsf@dick> Date: Mon, 11 Aug 2025 21:06:25 -0700 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, Spencer Baugh , Eli Zaretskii , app-emacs-dev@janestreet.com, dmitry@gutov.dev X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) >>>>> Dick writes: > I remember it taking weeks to trace the irreproducible hangs with > non-blocking gnus. It would probably take a day to refresh my memory and > generate an MRE. I can see you=E2=80=99ve had this attitude toward the mailing list for many= years, Dick. I=E2=80=99m not interested. If you ever want to collaborate, we=E2=80= =99re here. --=20 John Wiegley GPG fingerprint =3D 4710 CF98 AF9B 327B B80F http://newartisans.com 60E1 46C4 BD1A 7AC1 4BA2 From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 12 18:16:22 2025 Received: (at 79201) by debbugs.gnu.org; 12 Aug 2025 22:16:22 +0000 Received: from localhost ([127.0.0.1]:55240 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ulxI5-0007wz-O1 for submit@debbugs.gnu.org; Tue, 12 Aug 2025 18:16:22 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:52521) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1ulxI2-0007wk-Cy for 79201@debbugs.gnu.org; Tue, 12 Aug 2025 18:16:19 -0400 From: Spencer Baugh To: Eli Zaretskii Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots In-Reply-To: <86ectkn5rv.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 09 Aug 2025 13:33:40 +0300") References: <86ectkn5rv.fsf@gnu.org> Date: Tue, 12 Aug 2025 18:16:12 -0400 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1755036972; bh=+oJ0mG7iwN2gkQrvLhHloie4PZyYltFNlb7xFpZUs7s=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=ZDKc/1Bk21WugSojWX1+d3eWqZOWW/g1BZ+V9nra8EEzYyuAX+9u4A7wRGYx/6ZKB ejP8uVRuzMVf60IFty7bs55DQqEAaUKVCk8lzW8ZYRseSJWf7Mu31BQCOMSyo9cvp+ dWK+DopdAPj9t7dliAmOMiBGqgrgc5uv/d0vK3e4JPhB77kFzbwXipUjYQz38NJvY1 TXM6bOWssslUkIlfA3pnClAA49/20Ftm71i5d6mfolMIEL4FicRE+hLXAkUm5QZ0Ay lPwAAxSOiqmQn+7z3dCjfOblby+UlFLYRoS6UJdQksgn5xoqAePwVnYVWKsxl1/nKZ 05quf5vxt+h5A== X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Anyway, I found a reproducer for the other bug I mentioned, where waiting_thread could also get contaminated and break fd_callback_info slots. It seems to be quite a bad bug: it will happen spontaneously just from running multiple threads interacting with processes. If you run the following simple Lisp code after applying the following patch (which is just to make the breakage more obvious; without the patch, Emacs would be left in a silently broken state), Emacs will abort in a few seconds - though it probably varies based on your system. This indicates that .waiting_thread is somehow being left at a non-NULL value even after the thread has left wait_reading_process_output. Any ideas? (defun my-break-thread () (let ((proc (make-process :name "foo" :command '("sleep" "1")))) (while (process-live-p proc) (accept-process-output proc 0.05)))) (while t (make-thread #'my-break-thread "thread1") (thread-join (make-thread #'my-break-thread "thread2"))) diff --git a/src/process.c b/src/process.c index 1a5c99139a6..294c204ae96 100644 --- a/src/process.c +++ b/src/process.c @@ -467,6 +467,16 @@ make_lisp_proc (struct Lisp_Process *p) } fd_callback_info[FD_SETSIZE]; +static void +assert_fd_callback_info_unused(int fd) +{ + eassert (0 <= fd && fd < FD_SETSIZE); + if (fd_callback_info[fd].waiting_thread != NULL) { + fprintf (stderr, "waiting_thread non-NULL\n"); + terminate_due_to_signal (SIGABRT, INT_MAX); + } +} + /* Add a file descriptor FD to be monitored for when read is possible. When read is possible, call FUNC with argument DATA. */ @@ -475,7 +485,6 @@ add_read_fd (int fd, fd_callback func, void *data) { add_keyboard_wait_descriptor (fd); - eassert (0 <= fd && fd < FD_SETSIZE); fd_callback_info[fd].func = func; fd_callback_info[fd].data = data; } @@ -490,14 +499,13 @@ add_non_keyboard_read_fd (int fd, fd_callback func, void *data) static void add_process_read_fd (int fd) { - eassert (fd >= 0 && fd < FD_SETSIZE); + assert_fd_callback_info_unused(fd); eassert (fd_callback_info[fd].func == NULL); fd_callback_info[fd].flags &= ~KEYBOARD_FD; fd_callback_info[fd].flags |= FOR_READ; if (fd > max_desc) max_desc = fd; - eassert (0 <= fd && fd < FD_SETSIZE); fd_callback_info[fd].flags |= PROCESS_FD; } @@ -522,7 +530,7 @@ delete_read_fd (int fd) void add_write_fd (int fd, fd_callback func, void *data) { - eassert (fd >= 0 && fd < FD_SETSIZE); + assert_fd_callback_info_unused(fd); fd_callback_info[fd].func = func; fd_callback_info[fd].data = data; @@ -534,7 +542,7 @@ add_write_fd (int fd, fd_callback func, void *data) static void add_non_blocking_write_fd (int fd) { - eassert (fd >= 0 && fd < FD_SETSIZE); + assert_fd_callback_info_unused(fd); eassert (fd_callback_info[fd].func == NULL); fd_callback_info[fd].flags |= FOR_WRITE | NON_BLOCKING_CONNECT_FD; @@ -8266,7 +8274,7 @@ remove_slash_colon (Lisp_Object name) add_keyboard_wait_descriptor (int desc) { #ifdef subprocesses /* Actually means "not MSDOS". */ - eassert (desc >= 0 && desc < FD_SETSIZE); + assert_fd_callback_info_unused(desc); fd_callback_info[desc].flags &= ~PROCESS_FD; fd_callback_info[desc].flags |= (FOR_READ | KEYBOARD_FD); if (desc > max_desc) From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 13 07:52:52 2025 Received: (at 79201) by debbugs.gnu.org; 13 Aug 2025 11:52:52 +0000 Received: from localhost ([127.0.0.1]:56608 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1umA2G-0003Yv-7O for submit@debbugs.gnu.org; Wed, 13 Aug 2025 07:52:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54268) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1umA29-0003YR-Um for 79201@debbugs.gnu.org; Wed, 13 Aug 2025 07:52:48 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1umA21-0002KV-8t; Wed, 13 Aug 2025 07:52:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=Ez/GkdOcyT2g6k4zNyqUdqT+vXInsMnPfF2Q4BUjUMc=; b=OeP29ezK3Gnc emRhL66Zcz4ENdtxhCIkbZW7a+TQ9NA9YGVQ+e1kTNAPdzIrp1X/IO0p+BCFelBBbwOxGOaDovFpx ZBB22Pl5jSw/5C1TMThiQsWKnlN6teKBzzPkwTVTYCS9dDF3afb5+y/SMmMWIWM+a+jUToTHh47lj /SFN/DLBNBTqfddB4IZHerunJMn5y4N9FGRuTQTbXwpFdfq9PzCmBfwGhqw9t34SbKbPmQ4XLHG1P gJjt+wUjuwY9wiVuE9YbbeoyA1c9n3NfuOpaWo8KT3OKbhYJLz4ZM0RCrbVpZap3mI7wyrTyeHh9q GkX5DIotB26ddwSrUkHPnQ==; Date: Wed, 13 Aug 2025 14:52:04 +0300 Message-Id: <868qjno2vv.fsf@gnu.org> From: Eli Zaretskii To: Spencer Baugh In-Reply-To: (message from Spencer Baugh on Tue, 12 Aug 2025 18:16:12 -0400) Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots References: <86ectkn5rv.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Spencer Baugh > Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, > app-emacs-dev@janestreet.com > Date: Tue, 12 Aug 2025 18:16:12 -0400 > > > Anyway, I found a reproducer for the other bug I mentioned, where > waiting_thread could also get contaminated and break fd_callback_info > slots. It seems to be quite a bad bug: it will happen spontaneously > just from running multiple threads interacting with processes. > > If you run the following simple Lisp code after applying the following > patch (which is just to make the breakage more obvious; without the > patch, Emacs would be left in a silently broken state), Emacs will abort > in a few seconds - though it probably varies based on your system. > > This indicates that .waiting_thread is somehow being left at a non-NULL > value even after the thread has left wait_reading_process_output. > > Any ideas? I'm not sure I understand your idea behind assert_fd_callback_info_unused. wait_reading_process_output calls thread_select in a loop, so a given thread which called wait_reading_process_output could spend more than one loop iteration without leaving the function. Each iteration, such a thread will call add_process_read_fd and other similar functions, to set up the descriptors it will wait on, and then call thread_select. When it calls thread_select, it releases the global lock, so some other thread could grab the lock and start running. Then this other thread could see the .waiting_thread member non-NULL, because the thread which marked the descriptor as being waited on by it is stuck in a pselect call, called from wait_reading_process_output via thread_select. Could the above scenario be the reason for what you see? IOW, why did you expect to find the .waiting_thread members cleared whenever some thread calls assert_fd_callback_info_unused, and why did you say "even after the thread has left wait_reading_process_output"? From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 13 12:46:49 2025 Received: (at 79201) by debbugs.gnu.org; 13 Aug 2025 16:46:49 +0000 Received: from localhost ([127.0.0.1]:58395 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1umEci-00025G-MN for submit@debbugs.gnu.org; Wed, 13 Aug 2025 12:46:49 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:38905) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1umEce-00024z-Sz for 79201@debbugs.gnu.org; Wed, 13 Aug 2025 12:46:45 -0400 From: Spencer Baugh To: Eli Zaretskii Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots In-Reply-To: <868qjno2vv.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 13 Aug 2025 14:52:04 +0300") References: <86ectkn5rv.fsf@gnu.org> <868qjno2vv.fsf@gnu.org> Date: Wed, 13 Aug 2025 12:46:39 -0400 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1755103599; bh=NOnc+fHDQU26nf6gTDI0IFAht4jeZ7w+rdGx5Z1rEZo=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=SaucGoKx6A2x8MR0ywm/SWnFLUG9J4iMDWHXR5D1DvZJPxUdySC7BCht1E3g/1WqM NdaDhvCJzmEvI+J42vxd7BdBOPqU/E/h8wub1yG8ZwA9NZW9SULi8JazmG8DA4qGw8 IeXVUrfbm7kaOScOzyjNvOKoA5UQm+caOkm2PVIulhwykZNRAfwo96xTil2jVl7ECS vp/Ndh/KoO/ZClb0jeGz7APONCYw4UMq8LphziHkWuZZmwYI7sKgkeaBqLjfaWtBPG z/aNRjRAZvQ/pjGeYvMc5P8dl10HAIEaYTdWliiVyS9M0LENfKEOjNV3E2OY9K8ePw +qr0128Wve1Ow== X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Eli Zaretskii writes: >> From: Spencer Baugh >> Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, >> app-emacs-dev@janestreet.com >> Date: Tue, 12 Aug 2025 18:16:12 -0400 >> >> >> Anyway, I found a reproducer for the other bug I mentioned, where >> waiting_thread could also get contaminated and break fd_callback_info >> slots. It seems to be quite a bad bug: it will happen spontaneously >> just from running multiple threads interacting with processes. >> >> If you run the following simple Lisp code after applying the following >> patch (which is just to make the breakage more obvious; without the >> patch, Emacs would be left in a silently broken state), Emacs will abort >> in a few seconds - though it probably varies based on your system. >> >> This indicates that .waiting_thread is somehow being left at a non-NULL >> value even after the thread has left wait_reading_process_output. >> >> Any ideas? > > I'm not sure I understand your idea behind > assert_fd_callback_info_unused. wait_reading_process_output calls > thread_select in a loop, so a given thread which called > wait_reading_process_output could spend more than one loop iteration > without leaving the function. Yes. > Each iteration, such a thread will call add_process_read_fd and other > similar functions, to set up the descriptors it will wait on, and then > call thread_select. That is incorrect. add_process_read_fd is not usually called from wait_reading_process_output or thread_select. It's only called from make-process. There is one branch where add_process_read_fd is called in wait_reading_process_output, for async network connections, but that's not being hit here, since we aren't using async network connections. > When it calls thread_select, it releases the global lock, so some > other thread could grab the lock and start running. Yes. > Then this other thread could see the .waiting_thread member non-NULL, > because the thread which marked the descriptor as being waited on by > it is stuck in a pselect call, called from wait_reading_process_output > via thread_select. Yes. For existing file descriptors, the waiting_thread member should often be non-NULL, and this tells other threads not to wait on that file descriptor. > Could the above scenario be the reason for what you see? No, because the non-nil waiting_thread is seen in make-process, for a process and file descriptor that is newly created. If you run the reproducer, the assertion is hit in make-process. Here's the backtrace. #0 0x0000000000422fbb in terminate_due_to_signal (sig=sig@entry=6, backtrace_limit=backtrace_limit@entry=2147483647) at emacs.c:443 #1 0x000000000061f59e in assert_fd_callback_info_unused (fd=9) at process.c:476 #2 0x000000000061f5e7 in assert_fd_callback_info_unused (fd=fd@entry=9) at process.c:509 #3 0x000000000061f5e7 in add_process_read_fd (fd=fd@entry=9) at process.c:502 #4 0x0000000000625138 in create_process (current_dir=XIL(0x7fffc4005684), new_argv=0x7fffcbbfd2c0, process=XIL(0x7fffc0000de5)) at process.c:2238 #5 0x0000000000625138 in Fmake_process (nargs=, args=) at process.c:2058 #6 0x00000000005d094c in eval_sub (form=) at lisp.h:2231 #7 0x00000000005d1c91 in Flet (args=XIL(0x7ffff73accc3)) at eval.c:1164 #8 0x00000000005d0a60 in eval_sub (form=) at lisp.h:2231 #9 0x00000000005d1130 in Fprogn (body=) at eval.c:445 #10 0x00000000005d1130 in funcall_lambda (fun=XIL(0x8ead4d), nargs=0, arg_vector=0x7fffcbbfd740) at eval.c:3429 #11 0x00000000005cd6e3 in Ffuncall (nargs=nargs@entry=1, args=args@entry=0x7fffcbbfd738) at eval.c:3172 #12 0x000000000064a560 in invoke_thread_function () at thread.c:747 #13 0x00000000005cbfb2 in internal_condition_case (bfun=bfun@entry=0x64a530 , handlers=handlers@entry=XIL(0x30), hfun=hfun@entry=0x649d80 ) at eval.c:1690 #14 0x000000000064a448 in run_thread (state=0x8eb4b8) at lisp.h:1164 #15 0x00007fffee6081ea in start_thread () at /lib64/libpthread.so.0 #16 0x00007fffeae39953 in clone () at /lib64/libc.so.6 Lisp Backtrace: "make-process" (0xcbbfd490) "let" (0xcbbfd608) "my-break-thread" (0xcbbfd740) From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 13 12:58:02 2025 Received: (at 79201) by debbugs.gnu.org; 13 Aug 2025 16:58:02 +0000 Received: from localhost ([127.0.0.1]:58430 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1umEnZ-0002j0-MY for submit@debbugs.gnu.org; Wed, 13 Aug 2025 12:58:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44890) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1umEnU-0002iB-BU for 79201@debbugs.gnu.org; Wed, 13 Aug 2025 12:57:57 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1umEnK-0000Vm-Q8; Wed, 13 Aug 2025 12:57:46 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=C8nbRzR3KKWyMNTgNG96vBvrpL8XvqCECCZ1bi7Rf0I=; b=LFj9XSxstozh oejTePpV+aGvmM6eWY7oHRnWhO+g/GJKBa4KzADgQPeSNrOsU+WV4DP2RFQSx2suTRRKAfEFtpFHN zrq5kT/A2JqaJaizeFmTm3pmSHVyoo9dRsRTe8x4ly/BN8dHY6raCGZbbn801lQ8grmXL522Nqk/v GGcxxY9reUmkAPS5PvCBNtyBCPxpB5X38R1tOY5qbmzB8rHeLbqsi1R218hp6wuYq2JryD0SrZinV ny3xam8qzyuueGLa9JC/A1b0Z+DhuRZofBE/ZhZQApzMuZamaOZwKJrij4FBgHWt5fqR4MizQSZ8G KoXPWGAGcOAGverur3akdQ==; Date: Wed, 13 Aug 2025 19:57:41 +0300 Message-Id: <86tt2bma62.fsf@gnu.org> From: Eli Zaretskii To: Spencer Baugh In-Reply-To: (message from Spencer Baugh on Wed, 13 Aug 2025 12:46:39 -0400) Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots References: <86ectkn5rv.fsf@gnu.org> <868qjno2vv.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Spencer Baugh > Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, > app-emacs-dev@janestreet.com > Date: Wed, 13 Aug 2025 12:46:39 -0400 > > > Could the above scenario be the reason for what you see? > > No, because the non-nil waiting_thread is seen in make-process, for a > process and file descriptor that is newly created. Then I think the descriptors was closed without clearing the .waiting_thread member. So that goes back to what I suggested up-thread: delete_read_fd and delete_write_fd should clear these members. They are called either after closing the descriptor or when we stop reading/writing to them, so their .waiting_thread member is no longer useful. From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 13 14:36:39 2025 Received: (at 79201) by debbugs.gnu.org; 13 Aug 2025 18:36:39 +0000 Received: from localhost ([127.0.0.1]:58614 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1umGL1-0007hU-6v for submit@debbugs.gnu.org; Wed, 13 Aug 2025 14:36:39 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:50233) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1umGKw-0007hB-MV for 79201@debbugs.gnu.org; Wed, 13 Aug 2025 14:36:35 -0400 From: Spencer Baugh To: Eli Zaretskii Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots In-Reply-To: <86tt2bma62.fsf@gnu.org> (Eli Zaretskii's message of "Wed, 13 Aug 2025 19:57:41 +0300") References: <86ectkn5rv.fsf@gnu.org> <868qjno2vv.fsf@gnu.org> <86tt2bma62.fsf@gnu.org> Date: Wed, 13 Aug 2025 14:36:29 -0400 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1755110189; bh=L0yn3FAqqMtdNgqTum3QQPlwpsih7CeraeP144jF0aY=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=t5JC33HPD4qtDSUXr71AsAeVebDOKOnqgIUHUsEq8/p4OnU4oPIBB1fUN0O8Pu926 NKRJzX1MQe3YYZEV7ibK9oOuDvd9uPeIBzysWkw0OXtIq4Y1B3U1Pf4rWL1jJjT2eC +yPjC73aeKhvcWuqKcLqIvIkr0yymOf8mQg3soihhbpjNA1omKRSELF4tFrgJz3NlY In54bTu9Gyxza64mw/JW/q6GBkOcQEY82JmdbCAnsugLJ3T9I6x5NHitnTttd/+WPu mEg0OSfE3CDuQZFW2hHmW7aDY1aNQvklBGT+Oep1zQpWnvAWj+PqQQdGh0kKXLReyA RURON91oW7yNw== X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) Eli Zaretskii writes: >> From: Spencer Baugh >> Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, >> app-emacs-dev@janestreet.com >> Date: Wed, 13 Aug 2025 12:46:39 -0400 >> >> > Could the above scenario be the reason for what you see? >> >> No, because the non-nil waiting_thread is seen in make-process, for a >> process and file descriptor that is newly created. > > Then I think the descriptors was closed without clearing the > .waiting_thread member. So that goes back to what I suggested > up-thread: delete_read_fd and delete_write_fd should clear these > members. They are called either after closing the descriptor or when > we stop reading/writing to them, so their .waiting_thread member is no > longer useful. That doesn't explain why this is happening, though. Descriptors are always closed without clearing the .waiting_thread member. Clearing .waiting_thread is clear_waiting_thread_info's job. Why isn't it happening? One theory: clear_waiting_thread_info only clears .waiting_thread up to max_desc. That's not it, though, because the bug still happens (albeit slower) if I do: @@ -670,7 +678,7 @@ clear_waiting_thread_info (void) int fd; eassert (max_desc < FD_SETSIZE); - for (fd = 0; fd <= max_desc; ++fd) + for (fd = 0; fd < FD_SETSIZE; ++fd) { if (fd_callback_info[fd].waiting_thread == current_thread) fd_callback_info[fd].waiting_thread = NULL; From debbugs-submit-bounces@debbugs.gnu.org Thu Aug 14 00:40:00 2025 Received: (at 79201) by debbugs.gnu.org; 14 Aug 2025 04:40:00 +0000 Received: from localhost ([127.0.0.1]:60515 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1umPks-00050s-Pl for submit@debbugs.gnu.org; Thu, 14 Aug 2025 00:40:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:45150) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1umPkh-00050K-MK for 79201@debbugs.gnu.org; Thu, 14 Aug 2025 00:39:54 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1umPkY-0001tR-Ag; Thu, 14 Aug 2025 00:39:38 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=aUFIvktKW9DlXucOiBtaN5WN9XXkqySMySCYtjYc9oY=; b=W/tvOMBXrZC3 qrExmz3Q4zhI+uoQcRSJaryV1IwYpeNYBZTWUmqRp/nhHWWrAMW5G0uhnEdMfXENARvV4RsGX8Lap tkTljkMun/L9Gr7KGBN8ptzjKBqPANmUFeeEowcKsdhZskYW9kAmQlYI5BvqxJlW5KEZ67JI+z0Z+ lTFUsvTkywMZMYaJq2+mroCooO/tSbo2gG4dKN7UQNfCeepGVKoawzAwuwFNgaIIHJ431lCeRqSb+ heB6W3g6HXfjnVIlyqJ8Y/4BNAbvDGBV3YQNBTDlbBfIfYZn0dFH4yc7O4GoKEOgBlMLGF5hAcPtw f7S0T/oAaAUhsuew4TXwWA==; Date: Thu, 14 Aug 2025 07:39:34 +0300 Message-Id: <86qzxems8p.fsf@gnu.org> From: Eli Zaretskii To: Spencer Baugh In-Reply-To: (message from Spencer Baugh on Wed, 13 Aug 2025 14:36:29 -0400) Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots References: <86ectkn5rv.fsf@gnu.org> <868qjno2vv.fsf@gnu.org> <86tt2bma62.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Spencer Baugh > Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, > app-emacs-dev@janestreet.com > Date: Wed, 13 Aug 2025 14:36:29 -0400 > > Eli Zaretskii writes: > > >> From: Spencer Baugh > >> Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, > >> app-emacs-dev@janestreet.com > >> Date: Wed, 13 Aug 2025 12:46:39 -0400 > >> > >> > Could the above scenario be the reason for what you see? > >> > >> No, because the non-nil waiting_thread is seen in make-process, for a > >> process and file descriptor that is newly created. > > > > Then I think the descriptors was closed without clearing the > > .waiting_thread member. So that goes back to what I suggested > > up-thread: delete_read_fd and delete_write_fd should clear these > > members. They are called either after closing the descriptor or when > > we stop reading/writing to them, so their .waiting_thread member is no > > longer useful. > > That doesn't explain why this is happening, though. Descriptors are > always closed without clearing the .waiting_thread member. Clearing > .waiting_thread is clear_waiting_thread_info's job. Why isn't it > happening? clear_waiting_thread_info is called when some thread returns from wait_reading_process_output. If a thread didn't return, but is instead stuck in the call to pselect, another thread can run and access the descriptors. I think if you want to understand exactly how this happens, you need to trace the places that close the descriptors. AFAIU, the scenario where make-process reuses a descriptor already marked with .waiting_thread of another thread is because that descriptor was recently closed. Otherwise, it's not possible for make-process to get a descriptor which is already in use. From debbugs-submit-bounces@debbugs.gnu.org Thu Aug 14 02:01:34 2025 Received: (at 79201) by debbugs.gnu.org; 14 Aug 2025 06:01:35 +0000 Received: from localhost ([127.0.0.1]:60670 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1umR1p-00011E-JD for submit@debbugs.gnu.org; Thu, 14 Aug 2025 02:01:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:46060) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1umR1j-00010x-0S for 79201@debbugs.gnu.org; Thu, 14 Aug 2025 02:01:28 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1umR1c-0003BG-0c; Thu, 14 Aug 2025 02:01:20 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=6a2Uune5wCdwjOJOSyyF0HsjGpj4Tyw+D9Y+eXm8TX0=; b=MR0eoFkPB/GP UgpCXf/dDFaMI/W0JMYsMSmGpbuN2CkAq5TG7F95nkzD++UAMggVbm+hoTWCSqA2XWqLVN9FFY9Vo 7b3jOe9gcJL6Fw0OLZWkXUyXwQTukAWbpk+DDnXq2nl34F7aRHbn5fX+4Ha7DO4mrwr1osw4ZVa2h uqvqFKvfrpbw5IiFoXJFgaiTttlDSWOpUBjKBjds7VRoiZFJYR69rRoD0/fLZjvK4TjUweZOSBLFh MbiaPQjcwlBH1H2Hpyjv2FvEDMJMyQu6UpbK88yHJbGAZszhwO7B1b4PUI5IbmAKd9+PAH/RRm9d8 DS5soeKq1FCmcWCGCjqtQw==; Date: Thu, 14 Aug 2025 09:01:13 +0300 Message-Id: <86cy8ymogm.fsf@gnu.org> From: Eli Zaretskii To: sbaugh@janestreet.com In-Reply-To: <86qzxems8p.fsf@gnu.org> (message from Eli Zaretskii on Thu, 14 Aug 2025 07:39:34 +0300) Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots References: <86ectkn5rv.fsf@gnu.org> <868qjno2vv.fsf@gnu.org> <86tt2bma62.fsf@gnu.org> <86qzxems8p.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, > app-emacs-dev@janestreet.com > Date: Thu, 14 Aug 2025 07:39:34 +0300 > From: Eli Zaretskii > > > That doesn't explain why this is happening, though. Descriptors are > > always closed without clearing the .waiting_thread member. Clearing > > .waiting_thread is clear_waiting_thread_info's job. Why isn't it > > happening? > > clear_waiting_thread_info is called when some thread returns from > wait_reading_process_output. If a thread didn't return, but is > instead stuck in the call to pselect, another thread can run and > access the descriptors. > > I think if you want to understand exactly how this happens, you need > to trace the places that close the descriptors. AFAIU, the scenario > where make-process reuses a descriptor already marked with > .waiting_thread of another thread is because that descriptor was > recently closed. Otherwise, it's not possible for make-process to get > a descriptor which is already in use. And also note that handle_child_signal, which is the SIGCHLD handler, and is therefore invoked asynchronously, calls delete_read_fd on the input descriptor of the process that died. Which might be highly relevant for your recipe, because it calls "sleep 1", which dies after 1 sec without producing any output. From debbugs-submit-bounces@debbugs.gnu.org Thu Aug 14 12:21:26 2025 Received: (at 79201) by debbugs.gnu.org; 14 Aug 2025 16:21:26 +0000 Received: from localhost ([127.0.0.1]:34289 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1umahh-0002Dl-M5 for submit@debbugs.gnu.org; Thu, 14 Aug 2025 12:21:26 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:33289) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1umahd-0002DL-2z for 79201@debbugs.gnu.org; Thu, 14 Aug 2025 12:21:22 -0400 From: Spencer Baugh To: Eli Zaretskii Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots In-Reply-To: (Spencer Baugh's message of "Wed, 13 Aug 2025 14:36:29 -0400") References: <86ectkn5rv.fsf@gnu.org> <868qjno2vv.fsf@gnu.org> <86tt2bma62.fsf@gnu.org> Date: Thu, 14 Aug 2025 12:21:15 -0400 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1755188475; bh=hJ8YfCX/voH3xtekokrn77XgdbwnGZ0FJGTXovct7Ps=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=zAib4+GIx23QqoawQQVndd02CXdJqvWseRebtW+D/y6P+l4Pl/BmTsoFxzceKtf7e 34xgAAN+qmK1B/IFs6nGl57CQQ9mPvL3+KKvv3P0900NE7rseASzbvsqeQxpkPHwDC LNexOU+Ek873LlkDZ2uU7DyTg+MmRYGPT8iDnMUGkvxz/33Z5WMYDbjzJMPan4KnpG 4ZzPGG10FXino3rNpOMWJZ3OX4EV5UZkQ3bRrt5iGfi4cq6FYLMvMCQ9rXyh6682bK e/bAZXAkYXjxVG8aHMUtNYF9drM8zGxMZLwJbrxfSOvwj+oFG3loCSz0kqJOX4/23h 7hLSvOYB3SxyQ== X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) --=-=-= Content-Type: text/plain Spencer Baugh writes: > One theory: clear_waiting_thread_info only clears .waiting_thread up to > max_desc. OK, I figured it out. This is indeed the sole issue. With this patch: diff --git a/src/process.c b/src/process.c index 1a5c99139a6..5db427a4a99 100644 --- a/src/process.c +++ b/src/process.c @@ -670,10 +670,15 @@ clear_waiting_thread_info (void) int fd; eassert (max_desc < FD_SETSIZE); - for (fd = 0; fd <= max_desc; ++fd) + for (fd = 0; fd < FD_SETSIZE; ++fd) { - if (fd_callback_info[fd].waiting_thread == current_thread) + if (fd_callback_info[fd].waiting_thread == current_thread) { + if (fd > max_desc) { + fprintf (stderr, "fd too high: %d > max_desc=%d\n", fd, max_desc); + terminate_due_to_signal (SIGABRT, INT_MAX); + } fd_callback_info[fd].waiting_thread = NULL; + } } } And my earlier reproducer (but not my earlier patch), we quickly get an abort, indicating that there's an fd_callback_info slot which would be left with a non-NULL .waiting_thread. The cause is that recompute_max_desc runs when a file descriptor is deleted, and reduces max_desc. But when it does so, that means clear_waiting_thread_info will not get the chance to clear the waiting_thread field in fd_callback_info slots above the new max_desc. My initial thought was that recompute_max_desc should zero the fd_callback_info[fd] entry when it reduces max_desc. But it turns out to be equivalent, and simpler, to zero fd_callback_info[desc] completely instead of just setting flags to 0 right before we call recompute_max_desc. I do this in the attached patch, which fixes the bug. --=-=-= Content-Type: text/x-patch Content-Disposition: inline; filename=0001-Zero-fd_callback_info-when-deleting-an-fd.patch >From 975741cc141c45c4d857f51098a35d98885920f4 Mon Sep 17 00:00:00 2001 From: Spencer Baugh Date: Thu, 14 Aug 2025 12:17:23 -0400 Subject: [PATCH] Zero fd_callback_info when deleting an fd .waiting_thread and .thread could be left set to non-NULL values in a deleted fd_callback_info entry. These would never be cleared by e.g. clear_waiting_thread_info since that only clears fd_callback_info entries up to max_desc. Clear fd_callback_info entirely when deleting an entry. * src/process.c (delete_write_fd) (delete_keyboard_wait_descriptor): Set fd_callback_info completely to zero. (bug#79201) (deactivate_process): Remove unnecessary recompute_max_desc. --- src/process.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/src/process.c b/src/process.c index 1a5c99139a6..36717e39289 100644 --- a/src/process.c +++ b/src/process.c @@ -574,8 +574,7 @@ delete_write_fd (int fd) fd_callback_info[fd].flags &= ~(FOR_WRITE | NON_BLOCKING_CONNECT_FD); if (fd_callback_info[fd].flags == 0) { - fd_callback_info[fd].func = 0; - fd_callback_info[fd].data = 0; + fd_callback_info[fd] = (struct fd_callback_data) {}; if (fd == max_desc) recompute_max_desc (); @@ -4809,8 +4808,6 @@ deactivate_process (Lisp_Object proc) delete_read_fd (inchannel); if ((fd_callback_info[inchannel].flags & NON_BLOCKING_CONNECT_FD) != 0) delete_write_fd (inchannel); - if (inchannel == max_desc) - recompute_max_desc (); } } @@ -8282,7 +8279,7 @@ delete_keyboard_wait_descriptor (int desc) #ifdef subprocesses eassert (desc >= 0 && desc < FD_SETSIZE); - fd_callback_info[desc].flags &= ~(FOR_READ | KEYBOARD_FD | PROCESS_FD); + fd_callback_info[desc] = (struct fd_callback_data) {}; if (desc == max_desc) recompute_max_desc (); -- 2.39.3 --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Fri Aug 15 07:52:18 2025 Received: (at 79201) by debbugs.gnu.org; 15 Aug 2025 11:52:18 +0000 Received: from localhost ([127.0.0.1]:37910 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1umsyo-0003Jl-CV for submit@debbugs.gnu.org; Fri, 15 Aug 2025 07:52:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:36606) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1umsyl-0003JR-VC for 79201@debbugs.gnu.org; Fri, 15 Aug 2025 07:52:17 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1umsye-00067p-Te; Fri, 15 Aug 2025 07:52:08 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=UvbxkY8AGJ/Dt4Cvpf2gdv0qtOO5q5+UtNsHnolZCQQ=; b=jCzKzWEm0rBJ 9+jJCaSXUEv8J0XH30DGCkTuyW1npQCbwWygkYl+diggScNAaDS+9qNTASlTHNtDXbHtXHser1CJu kHrUkfKJE2GeUav+8xOIO2U/ht+Zm9CXO2w/9NJ9efYoGt8eUpQxUrLAln649fgt7WjTdBJpyOC50 eEHOXxIyJNrXLMUMzEMiZDFuBCdIh3zxGoQ6Wlj7jINpyz+1Jxm329x4fIZo2PQa7kHGN76QIPPG8 Ps2o/feopZul5ZR6p1AlBNLMsiSCm6uwgcPPwh2b+w/CQwsdb0WT01daweT9r8oUUgrMUU5CxM5kw 481JejlOAi3kIyCL24PHjQ==; Date: Fri, 15 Aug 2025 14:52:04 +0300 Message-Id: <86349sn6or.fsf@gnu.org> From: Eli Zaretskii To: Spencer Baugh In-Reply-To: (message from Spencer Baugh on Thu, 14 Aug 2025 12:21:15 -0400) Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots References: <86ectkn5rv.fsf@gnu.org> <868qjno2vv.fsf@gnu.org> <86tt2bma62.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Spencer Baugh > Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, > app-emacs-dev@janestreet.com > Date: Thu, 14 Aug 2025 12:21:15 -0400 > > The cause is that recompute_max_desc runs when a file descriptor is > deleted, and reduces max_desc. But when it does so, that means > clear_waiting_thread_info will not get the chance to clear the > waiting_thread field in fd_callback_info slots above the new max_desc. Makes sense, thanks. > My initial thought was that recompute_max_desc should zero the > fd_callback_info[fd] entry when it reduces max_desc. But it turns out > to be equivalent, and simpler, to zero fd_callback_info[desc] completely > instead of just setting flags to 0 right before we call > recompute_max_desc. > > I do this in the attached patch, which fixes the bug. Thanks, the patch seems OK with a minor comment: > --- a/src/process.c > +++ b/src/process.c > @@ -574,8 +574,7 @@ delete_write_fd (int fd) > fd_callback_info[fd].flags &= ~(FOR_WRITE | NON_BLOCKING_CONNECT_FD); > if (fd_callback_info[fd].flags == 0) > { > - fd_callback_info[fd].func = 0; > - fd_callback_info[fd].data = 0; > + fd_callback_info[fd] = (struct fd_callback_data) {}; Please just clear the .thread and .waiting_thread members. This way, the code is easier to read, and it is also easier to find all the places which assign something to these members. From debbugs-submit-bounces@debbugs.gnu.org Fri Aug 15 10:22:30 2025 Received: (at 79201) by debbugs.gnu.org; 15 Aug 2025 14:22:30 +0000 Received: from localhost ([127.0.0.1]:39134 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1umvK9-0003GG-Im for submit@debbugs.gnu.org; Fri, 15 Aug 2025 10:22:30 -0400 Received: from mxout5.mail.janestreet.com ([64.215.233.18]:33391) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1umvK6-0003Fx-9V for 79201@debbugs.gnu.org; Fri, 15 Aug 2025 10:22:27 -0400 From: Spencer Baugh To: Eli Zaretskii Subject: Re: bug#79201: 30.1.90; set-process-thread can permanently break fd_callback_info slots In-Reply-To: <86349sn6or.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 15 Aug 2025 14:52:04 +0300") References: <86ectkn5rv.fsf@gnu.org> <868qjno2vv.fsf@gnu.org> <86tt2bma62.fsf@gnu.org> <86349sn6or.fsf@gnu.org> Date: Fri, 15 Aug 2025 10:22:20 -0400 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=waixah; t=1755267740; bh=mxxiEpNz3S9+hzNz2k4Y1PH0fm0WLn23PgvvI6K/Flo=; h=From:To:Cc:Subject:In-Reply-To:References:Date; b=hRh4ZIu5i1QxAize//qmKxriJnffHiMM6H91bV8hkKIOzAITzVK/oXRW0P99mQrZx 381cFuqnzBd/RD6gdP1EL+fVmW7B/clzQjKksKBMJaEcK/yvw7afQjqZVIdYjGh76F gy9u2hgUd4/ZeYxKT2rsdkD68OT7QybuZkLZ93kerjHnxWsxmFRvC4OwJott1WmP4s T455Vn+gv2Qz3NHIqOOB95j4Ei8zMOaTQ1G9uMtuz3T9YNHAvGrUVA/mb1xQqkxxON DXLtzHkHYhNWLq/EXnDpddhHJh8tDYa/ZuVhc9i7eugJw+60BYGRjbF+9z9xNB1r0y IvpRfZVRGqbvQ== X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79201 Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, app-emacs-dev@janestreet.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) --=-=-= Content-Type: text/plain Eli Zaretskii writes: >> From: Spencer Baugh >> Cc: 79201@debbugs.gnu.org, dmitry@gutov.dev, johnw@gnu.org, >> app-emacs-dev@janestreet.com >> Date: Thu, 14 Aug 2025 12:21:15 -0400 >> >> The cause is that recompute_max_desc runs when a file descriptor is >> deleted, and reduces max_desc. But when it does so, that means >> clear_waiting_thread_info will not get the chance to clear the >> waiting_thread field in fd_callback_info slots above the new max_desc. > > Makes sense, thanks. > >> My initial thought was that recompute_max_desc should zero the >> fd_callback_info[fd] entry when it reduces max_desc. But it turns out >> to be equivalent, and simpler, to zero fd_callback_info[desc] completely >> instead of just setting flags to 0 right before we call >> recompute_max_desc. >> >> I do this in the attached patch, which fixes the bug. > > Thanks, the patch seems OK with a minor comment: > >> --- a/src/process.c >> +++ b/src/process.c >> @@ -574,8 +574,7 @@ delete_write_fd (int fd) >> fd_callback_info[fd].flags &= ~(FOR_WRITE | NON_BLOCKING_CONNECT_FD); >> if (fd_callback_info[fd].flags == 0) >> { >> - fd_callback_info[fd].func = 0; >> - fd_callback_info[fd].data = 0; >> + fd_callback_info[fd] = (struct fd_callback_data) {}; > > Please just clear the .thread and .waiting_thread members. This way, > the code is easier to read, and it is also easier to find all the > places which assign something to these members. Agreed. Done in the attached patch. --=-=-= Content-Type: text/x-patch Content-Disposition: inline; filename=0001-Zero-fd_callback_info-when-deleting-an-fd.patch >From e6bddffd8b8e11e598f2ca75d7915184d38def4e Mon Sep 17 00:00:00 2001 From: Spencer Baugh Date: Thu, 14 Aug 2025 12:17:23 -0400 Subject: [PATCH] Zero fd_callback_info when deleting an fd .waiting_thread and .thread could be left set to non-NULL values in a deleted fd_callback_info entry. These would never be cleared by e.g. clear_waiting_thread_info since that only clears fd_callback_info entries up to max_desc. Clear fd_callback_info entirely when deleting an entry. * src/process.c (clear_fd_callback_data): Add. (delete_write_fd, delete_keyboard_wait_descriptor): Call clear_fd_callback_data. (bug#79201) (delete_read_fd): Remove duplicated clearing code. (deactivate_process): Remove duplicate recompute_max_desc. --- src/process.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/src/process.c b/src/process.c index 1a5c99139a6..042df7108eb 100644 --- a/src/process.c +++ b/src/process.c @@ -466,6 +466,16 @@ make_lisp_proc (struct Lisp_Process *p) struct thread_state *waiting_thread; } fd_callback_info[FD_SETSIZE]; +static void +clear_fd_callback_data(struct fd_callback_data* elem) +{ + elem->func = NULL; + elem->data = NULL; + elem->flags = 0; + elem->thread = NULL; + elem->waiting_thread = NULL; +} + /* Add a file descriptor FD to be monitored for when read is possible. When read is possible, call FUNC with argument DATA. */ @@ -507,13 +517,6 @@ add_process_read_fd (int fd) delete_read_fd (int fd) { delete_keyboard_wait_descriptor (fd); - - eassert (0 <= fd && fd < FD_SETSIZE); - if (fd_callback_info[fd].flags == 0) - { - fd_callback_info[fd].func = 0; - fd_callback_info[fd].data = 0; - } } /* Add a file descriptor FD to be monitored for when write is possible. @@ -574,8 +577,7 @@ delete_write_fd (int fd) fd_callback_info[fd].flags &= ~(FOR_WRITE | NON_BLOCKING_CONNECT_FD); if (fd_callback_info[fd].flags == 0) { - fd_callback_info[fd].func = 0; - fd_callback_info[fd].data = 0; + clear_fd_callback_data(&fd_callback_info[fd]); if (fd == max_desc) recompute_max_desc (); @@ -4809,8 +4811,6 @@ deactivate_process (Lisp_Object proc) delete_read_fd (inchannel); if ((fd_callback_info[inchannel].flags & NON_BLOCKING_CONNECT_FD) != 0) delete_write_fd (inchannel); - if (inchannel == max_desc) - recompute_max_desc (); } } @@ -8282,7 +8282,7 @@ delete_keyboard_wait_descriptor (int desc) #ifdef subprocesses eassert (desc >= 0 && desc < FD_SETSIZE); - fd_callback_info[desc].flags &= ~(FOR_READ | KEYBOARD_FD | PROCESS_FD); + clear_fd_callback_data(&fd_callback_info[desc]); if (desc == max_desc) recompute_max_desc (); -- 2.39.3 --=-=-=--