Spencer Baugh writes:

> Spencer Baugh writes:
>> While doing this, I had another thought...
>>
>> Do you know if we have any code which handles the fact that
>> delete-process might be called on a process while another thread is
>> calling pselect on its fds?
>>
>> I don't think we do, which suggests that might also be problematic.
>> I'll work on a fix which encompasses that as well.
>
> OK, I think this patch fixes the issue.
>
> Many different pieces of code in Emacs can close file descriptors. We
> need to be generically careful to not close file descriptors which some
> other thread is currently waiting on. In my patch, we do this by
> letting the waiting_thread itself close those file descriptors, after it
> returns from select.
>
> Unfortunately I don't have a reliable test which produces the failure
> case, but in this patch I added a fprintf when we close a deferred fd.
> This prints whenever some thread would have closed an fd that another
> thread was waiting on. If I run the following program (with emacs -Q
> --batch):
>
> (defun my-break-thread ()
>   (let ((proc (make-process
>                :name "foo"
>                :command '("sleep" "1"))))
>     (while (process-live-p proc)
>       (accept-process-output proc 0.05))))
> (while t
>   (make-thread #'my-break-thread "thread1")
>   (thread-join (make-thread #'my-break-thread "thread2")))
>
> I get about one print every 10 seconds. Each print is a problem case
> which has been averted, so I do think this patch fixes the bug.
>
> (Obviously, we should remove the fprintf before applying the change)

Of course, I immediately found a bug in my patch (I wasn't clearing
waiting_thread after closing a deferred fd). Fixed in the attached.

In related news: running process-tests.el in a loop also triggers my
"Closing deferred fd" message. Which suggests that src/process-tests.el
was probably also suffering from bugs in this area, which could have
been causing hangs or flaky failures. Which should now hopefully be
fixed.
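
For anyone reading without the attachment, the deferred-close idea can be
sketched roughly as below. This is not the patch itself, only a minimal
illustration of the bookkeeping. Apart from waiting_thread, every name
here (fd_slot, fd_table, MAX_FDS, begin_wait, close_or_defer, end_wait)
is made up for the example, threads are represented by opaque pointers,
and the sketch assumes the caller already holds whatever lock serializes
access to the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_FDS 64              /* hypothetical table size for the sketch */

/* Hypothetical per-fd bookkeeping; only waiting_thread echoes the email. */
struct fd_slot {
  const void *waiting_thread;   /* thread currently selecting on this fd */
  bool close_deferred;          /* someone else asked for it to be closed */
};

static struct fd_slot fd_table[MAX_FDS];

/* Record that thread SELF is about to wait on FD in select.  */
static void begin_wait(int fd, const void *self)
{
  assert(0 <= fd && fd < MAX_FDS);
  fd_table[fd].waiting_thread = self;
}

/* Close FD on behalf of thread SELF, unless another thread is waiting
   on it, in which case only mark it.  Returns true if FD was closed
   immediately, false if the close was deferred.  */
static bool close_or_defer(int fd, const void *self)
{
  assert(0 <= fd && fd < MAX_FDS);
  struct fd_slot *s = &fd_table[fd];
  if (s->waiting_thread != NULL && s->waiting_thread != self) {
    s->close_deferred = true;
    return false;
  }
  s->waiting_thread = NULL;
  s->close_deferred = false;
  close(fd);
  return true;
}

/* Called by thread SELF after select returns: release every fd it was
   waiting on and perform the closes that were deferred meanwhile.
   Returns the number of deferred fds closed.  */
static int end_wait(const void *self)
{
  int closed = 0;
  for (int fd = 0; fd < MAX_FDS; fd++) {
    struct fd_slot *s = &fd_table[fd];
    if (s->waiting_thread != self)
      continue;
    s->waiting_thread = NULL;   /* the step the first patch forgot */
    if (s->close_deferred) {
      s->close_deferred = false;
      close(fd);
      closed++;
    }
  }
  return closed;
}
```

The point of the design is that the fd number stays valid for as long as
some thread might still pass it to select, so it cannot be reused by an
unrelated open while that thread is waiting.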