Package: emacs;
Reported by: Helmut Eller <eller.helmut <at> gmail.com>
Date: Thu, 10 Aug 2023 15:06:02 UTC
Severity: normal
Tags: patch
Found in version 30.0.50
View this message in rfc822 format
From: Helmut Eller <eller.helmut <at> gmail.com> To: 65211 <at> debbugs.gnu.org Subject: bug#65211: 30.0.50; threads and accept-process-output Date: Thu, 10 Aug 2023 17:04:48 +0200
[Message part 1 (text/plain, inline)]
The test case from below hangs when executed with emacs -Q --batch -l apo-tests.el -f ert-run-tests-batch-and-exit It's not 100% reproducible, but it seems to hang 80% of the time. The test is fairly complicated and involves a TLS connection, a sub-process and a background thread. The background thread starts a sub-process and reads all its output until the process terminates. Then the main thread opens a TLS connection, sends a HTTP request and tries to read the response. At this point, accept-process-output hangs even though there is output available. I think the reason for this problem is that the sub-process uses a file descriptor (usually number 6) and sets fd_callback_info[6].waiting_thread to the background thread (in compute_non_keyboard_wait_mask) . Later, the TLS connection receives a file descriptor with the same number (6) because the sub-process closed it already. That would be fine, however fd_callback_info[6].waiting_thread is still set to the background thread. So this time compute_non_keyboard_wait_mask doesn't include 6 in the wait mask. Because 6 is not in the wait mask, emacs_gnutls_record_check_pending will not be called and Emacs doesn't notice the buffered output. The intention in the code seems to be that clear_waiting_thread_info resets fd_callback_info[6].waiting_thread to NULL, however by the time that clear_waiting_thread_info is called, max_desc was reduced to 4, because somebody closed file descriptors 5 and 6. So clear_waiting_thread_info doesn't touch fd_callback_info[6]. A possible solution would be to reset the .waiting_thread field in delete_read_fd, like so: diff --git a/src/process.c b/src/process.c index 08cb810ec13..74d0bf252ab 100644 --- a/src/process.c +++ b/src/process.c @@ -513,6 +513,9 @@ delete_read_fd (int fd) { fd_callback_info[fd].func = 0; fd_callback_info[fd].data = 0; + + if (fd_callback_info[fd].waiting_thread == current_thread) + fd_callback_info[fd].waiting_thread = NULL; } } With this change, the test doesn't hang. Helmut
[apo-tests.el (text/plain, inline)]
;; -*- lexical-binding: t -*- (require 'ert) (require 'gnutls) (defun t--read (proc nchars) "Read at least NCHARS characters from process PROC. Return 'ok on success; otherwise return 'not-live." (let* ((buf (process-buffer proc)) (size (+ (buffer-size buf) nchars)) (result nil)) (t--accept-process-output proc 0) (while (eq result nil) (cond ((>= (buffer-size buf) size) (setq result 'ok)) ((not (process-live-p proc)) (setq result 'not-live)) (t (t--accept-process-output proc nil)))) result)) ;; a wrapper around accept-process-output for debugging (defun t--accept-process-output (proc timeout) (let ((buf (process-buffer proc))) (message "%s accept-process-output %S %s %s …" (current-thread) proc (process-status proc) timeout) (let ((val (accept-process-output proc timeout))) (message "%s accept-process-output %S %s %s => %S %d" (current-thread) proc (process-status proc) timeout val (buffer-size buf)) val))) (defun t--read-all (proc) (while (pcase-exhaustive (t--read proc 1) ('ok t) ('not-live nil))) (with-current-buffer (process-buffer proc) (buffer-string))) (defun t--make-buffer (name) (with-current-buffer (generate-new-buffer name t) (buffer-disable-undo) (set-buffer-multibyte nil) (current-buffer))) (ert-deftest t-tls () (thread-join (make-thread (lambda () (let* ((proc (make-process :name "cat" :command (list "cat") :sentinel (lambda (_ _)) :buffer (t--make-buffer "cat")))) (process-send-eof proc) (should (equal (t--read-all proc) "")))))) (let* ((host "www.example.net") (proc (make-network-process :name "tls" :host host :service 443 :tls-parameters (cons 'gnutls-x509pki (gnutls-boot-parameters :hostname host)) :buffer (t--make-buffer "tls") :sentinel (lambda (_ _)))) (request (format "GET / HTTP/1.1\r\nHost: %s\r\n\r\n" host))) (process-send-string proc request) (process-send-eof proc) (let* ((response (t--read-all proc))) (should (string-match "^HTTP/1.1 200 OK" response))))) ;; Local Variables: ;; read-symbol-shorthands: (("t-" . "apo-tests-")) ;; End:
[Message part 3 (text/plain, inline)]
In GNU Emacs 30.0.50 (build 71, x86_64-pc-linux-gnu, GTK+ Version 3.24.37, cairo version 1.16.0) of 2023-08-10 built on caladan Repository revision: 164588b174774eba0c3bd6999633a39bed748195 Repository branch: master Windowing system distributor 'The X.Org Foundation', version 11.0.12101007 System Description: Debian GNU/Linux 12 (bookworm) Configured using: 'configure --enable-checking=yes --with-xpm=ifavailable --with-gif=ifavailable 'CFLAGS=-g -O1'' Configured features: CAIRO DBUS FREETYPE GLIB GMP GNUTLS GSETTINGS HARFBUZZ JPEG LIBSELINUX LIBSYSTEMD LIBXML2 MODULES NOTIFY INOTIFY PDUMPER PNG SECCOMP SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XINPUT2 GTK3 ZLIB Important settings: value of $LANG: C.UTF-8 locale-coding-system: utf-8-unix
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.