#75275 - 30.0.92; `make-thread` bug on macOS 15.2

GNU bug report logs - #75275
30.0.92; `make-thread` bug on macOS 15.2

Package: emacs;

Reported by: Stefan Kangas <stefankangas <at> gmail.com>

Date: Thu, 2 Jan 2025 04:58:01 UTC

Severity: normal

Tags: confirmed

Found in versions 30.0.92, 31.0.50, 30.0.93

View this message in rfc822 format

From: Alan Third <alan <at> idiocy.org> To: Gerd Möllmann <gerd.moellmann <at> gmail.com> Cc: 75275 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>, stefankangas <at> gmail.com Subject: bug#75275: 30.0.92; `make-thread` bug on macOS 15.2 Date: Thu, 2 Jan 2025 11:03:50 +0000

On Thu, Jan 02, 2025 at 11:04:11AM +0100, Gerd Möllmann wrote: > Eli Zaretskii <eliz <at> gnu.org> writes: > > >> From: Gerd Möllmann <gerd.moellmann <at> gmail.com> > >> Cc: stefankangas <at> gmail.com, alan <at> idiocy.org, 75275 <at> debbugs.gnu.org > >> Date: Thu, 02 Jan 2025 09:41:38 +0100 > >> > >> Gerd Möllmann <gerd.moellmann <at> gmail.com> writes: > >> > >> > Eli Zaretskii <eliz <at> gnu.org> writes: > >> > > >> >> So should we add a condition before calling [NSApp run] that we are in > >> >> the main thread? > >> > > >> > ATM, I don't understand how we land in that line in ns_select_1 if not > >> > [NSThread isMainThread]. Maybe I need new glasses. I asked Stefan if he > >> > can see something in LLDB. > >> > >> It must something in here: > >> > >> if (![NSThread isMainThread] > >> || (timeout && timeout->tv_sec == 0 && timeout->tv_nsec == 0)) > >> thread_select (pselect, nfds, readfds, writefds, > >> exceptfds, timeout, sigmask); > >> > >> Should we return here? Yes. We used to be but it was removed: 9370a4763aacbb9278b5be9c92a2484e3652bc29 I don't know why it was removed, but I'd bet that, at the very least, the isMainThread check should have moved with the 'NSApp == nil' check. > > I don't know. Is there anything in the following code that can be > > relevant to a non-main thread? Note that non-main threads can > > legitimately call wait_reading_process_output, which calls ns_select. > > For example, what happens if a non-main Lisp thread starts a > > sub-process? we do expect to be able to read the output from that > > sub-process. My take on how this works was that in a non-main thread ns_select should just act like pselect, hence it used to literally just call pselect and return. This would hopefully allow Emacs to handle IO correctly without having to make the NS runloop code do things it can't do. FWIW, I still think the NS code in its current form is unsuitable for multi-threaded use and must be rewritten. > Really hard to tell. Perhaps someone could try to follow what I write > below and tell if it makes sense? Everything in ns_select_1. > > 1. I think this code must run in a non-main thread: > > if (nr > 0) > { > pthread_mutex_lock (&select_mutex); > ... set some variables ... > /* Inform fd_handler that select should be called. */ > c = 'g'; > emacs_write_sig (selfds[1], &c, 1); > } > > selfds is apparently some pipe, NS-specific. The function fd_handler is > called when writing to the pipe I assume. fd_handler is set up like > this > > [NSThread detachNewThreadSelector:@selector (fd_handler:) > toTarget:NSApp > withObject:nil]; > > Looks to me like it runs in a thread of its own. fd_handler then > pselects on the fd sets set in the if above. That looks like it is > relevant to reading process output. And that means we may _not_ return > from ns_select_1 early when ![NSThread isMainThread]. > > else if (nr == 0 && timeout) > { > /* No file descriptor, just a timeout, no need to wake fd_handler. */ > double time = timespectod (*timeout); > timed_entry = [[NSTimer scheduledTimerWithTimeInterval: time > target: NSApp > selector: > @selector (timeout_handler:) > userInfo: 0 > repeats: NO] > retain]; > } No, none of that needs to run when we're not in the main thread. fd_handler run pselect in a separate thread because the NS main thread has to run the ns main thread run loop to handle incoming IO from the window system. The NS run loop can emulate parts of pselect, but not the whole thing, so we are required to run both the NS runloop and pselect simultaneously, hence fd_handler. If we don't need to run the runloop, i.e. we're in a non-main thread, then we can just run pselect directly and ignore fd_handler. > 2. This code > > else if (nr == 0 && timeout) > { > /* No file descriptor, just a timeout, no need to wake fd_handler. */ > double time = timespectod (*timeout); > timed_entry = [[NSTimer scheduledTimerWithTimeInterval: time > target: NSApp > selector: > @selector (timeout_handler:) > userInfo: 0 > repeats: NO] > retain]; > } > > means basically only to send an app-defined event after a timeout. I > interpret this as "leave the NS event loop to let Emacs do things > after a timeout". Looks okay to me. Correct. In more detail it sends an "App defined" event to the main thread which signals to the run loop to stop itself. > 3. This > > else /* No timeout and no file descriptors, can this happen? */ > { > /* Send appdefined so we exit from the loop. */ > ns_send_appdefined (-1); > } > > is likely also okay because send_app_defined has code checking for > being in the main thread. This will send the app defined event to the main thread run loop. The code in ns_send_appdefined actually instructs the main thread runloop to send itself the event if called from a non-main thread. > 4. The [NSApp run] follows, and it can under no circumstances be done > in a mon-main thread. We should put that in an if for sure. > > if ([NSThread isMainThread]) [NSApp run]; In this circumstance no. In *Step each thread has its own run loop and event queue, so if you call [NSApp run] on a sub-thread it will look at its own queue, which in this case likely has nothing on it, and it will run forever. That's by design, that's how you're supposed to write NS apps. We obviously don't want to do that. > 5. The code below is another enigma. > > I can't figure out why that is done, and what > last_appdefined_event_data is for. But since it is run today, I'd > propose to just let it run. I don't see that it does immediate harm. It's essentially just working out how the run loop was terminated (by fd_handler, by some window system input, or by timeout) and creating suitable return values by, for example, gathering the results of the pselect run in fd_handler. Basically, none of this needs to run in a sub-thread. We should be able to just run pselect directly and return. Perhaps there's some other edge case that prevents that, but I suspect it was just overlooked. After all, nobody understands this code. -- Alan Third

This bug report was last modified 219 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #75275 30.0.92; `make-thread` bug on macOS 15.2

GNU bug report logs - #75275
30.0.92; `make-thread` bug on macOS 15.2