GNU bug report logs - #36609
27.0.50; Possible race-condition in threading implementation
Reported by: Andreas Politz <politza <at> hochschule-trier.de>
Date: Thu, 11 Jul 2019 20:52:02 UTC
Severity: normal
Tags: fixed
Found in version 27.0.50
Fixed in version 28.1
Done: dick <dick.r.chiang <at> gmail.com>
Bug is archived. No further changes may be made.
On Fri, Jul 12, 2019 at 1:51 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
> > From: Pip Cet <pipcet <at> gmail.com>
> > Date: Fri, 12 Jul 2019 13:40:15 +0000
> > Cc: politza <at> hochschule-trier.de, 36609 <at> debbugs.gnu.org
> >
> > When a thread is signalled (by thread-signal, which sets another
> > thread's error_symbol) while the signalled thread is inside a
> > select(), thread_select() will return non-locally for that thread, and
> > we fail to release an internal GLib lock through
> > g_main_context_release(). That's the first bug.
>
> We should either release the global lock before the thread exits, or
> defer the acting upon the signal until later. We cannot disable the
> signal handling altogether because it is entirely legitimate to signal
> another thread, and when we do, that other thread will _always_ be
> inside thread_select.

Really? What about thread-yield?

> For the main thread, handling the signal in that situation shouldn't
> be a problem, because it is not going to exit. Right?

I think the main thread can still fail to release the lock...
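
To make the first bug concrete, here is a toy model in plain C. It is only a sketch: it uses no GLib and no Emacs code, and every name in it (context_holders, fake_thread_select, select_leaky, select_guarded, holders_after_signal) is hypothetical. The non-local exit is modelled with setjmp/longjmp. The leaky variant jumps past the release call, while the guarded variant intercepts the jump, releases the lock, and then lets the jump continue.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GLib main context lock: counts how
   many acquisitions have not been released yet.  */
static int context_holders = 0;

static void
context_acquire (void)
{
  context_holders++;
}

static void
context_release (void)
{
  assert (context_holders > 0);
  context_holders--;
}

/* Hypothetical stand-in for thread_select: if the thread was
   signalled, it exits non-locally instead of returning.  */
static void
fake_thread_select (jmp_buf *handler, bool signalled)
{
  if (signalled)
    longjmp (*handler, 1);
}

/* Leaky variant: the non-local exit skips context_release.  */
static void
select_leaky (jmp_buf *outer, bool signalled)
{
  context_acquire ();
  fake_thread_select (outer, signalled);
  context_release ();
}

/* Guarded variant: catch the non-local exit locally, release the
   lock, then continue the exit towards the outer handler.  */
static void
select_guarded (jmp_buf *outer, bool signalled)
{
  jmp_buf inner;

  context_acquire ();
  if (setjmp (inner) == 0)
    {
      fake_thread_select (&inner, signalled);
      context_release ();
    }
  else
    {
      context_release ();
      longjmp (*outer, 1);
    }
}

/* Run one signalled select and report how many acquisitions are
   still outstanding afterwards.  */
int
holders_after_signal (bool guarded)
{
  jmp_buf outer;

  context_holders = 0;
  if (setjmp (outer) == 0)
    {
      if (guarded)
        select_guarded (&outer, true);
      else
        select_leaky (&outer, true);
    }
  return context_holders;
}
```

In this model holders_after_signal (false) leaves one acquisition outstanding and holders_after_signal (true) leaves none. The model says nothing about which of the two options discussed above (release before exiting, or defer the signal) is preferable in the real code.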
> > When xg_select() fails to acquire the internal GLib lock, it simply
> > does a select() on the remaining file descriptors:
>
> Why does it fail to acquire that lock?

Because another thread holds it, either an Emacs or a non-Emacs
thread. In both cases, I think we might miss events unless we return
with errno == EINTR.
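
A small model of why events can be missed, again only as a sketch: descriptors are represented as bits in a mask, and the names watched_fds and event_seen are hypothetical, not taken from xgselect.c. When the context was not acquired, only the non-GLib descriptors are watched, so activity on a GLib descriptor does not wake the select.

```c
#include <assert.h>
#include <stdbool.h>

/* Each bit stands for one file descriptor (toy representation).
   Returns the set of descriptors the select call would watch.  */
unsigned
watched_fds (unsigned emacs_fds, unsigned glib_fds, bool context_acquired)
{
  unsigned watched = emacs_fds;

  /* GLib's descriptors are only added when we own the context.  */
  if (context_acquired)
    watched |= glib_fds;
  return watched;
}

/* True if at least one ready descriptor is among the watched ones,
   i.e. the select call would wake up for it.  */
bool
event_seen (unsigned watched, unsigned ready)
{
  return (watched & ready) != 0;
}
```

For example, with emacs_fds = 0x1 and glib_fds = 0x4, an event on descriptor 0x4 is seen when the context was acquired and is not seen when it was not.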
> >   context_acquired = g_main_context_acquire (context);
> >   /* FIXME: If we couldn't acquire the context, we just silently proceed
> >      because this function handles more than just glib file descriptors.
> >      Note that, as implemented, this failure is completely silent: there is
> >      no feedback to the caller.  */
> >
> > This seems like a second, albeit documented, bug to me. I think we're
> > risking not waking up from the actual select because another
> > (non-Emacs) thread happened to hold the main context at the time.
>
> So what is the proposal for that? spin waiting for the lock?

I don't know, to be honest, but I'm afraid there's currently no better
way. There's a way to take the lock without spinning, but it's been
deprecated.
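
For reference, spin-waiting in its simplest bounded form could look like the sketch below. It uses a C11 atomic_flag as a stand-in for the context lock. The names context_lock, try_acquire_context_lock, release_context_lock and spin_acquire are hypothetical, and nothing here calls GLib. A real version would also have to decide what to do between attempts and after giving up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the context lock.  */
static atomic_flag context_lock = ATOMIC_FLAG_INIT;

/* One non-blocking attempt; true if we now hold the lock.  */
bool
try_acquire_context_lock (void)
{
  return !atomic_flag_test_and_set (&context_lock);
}

void
release_context_lock (void)
{
  atomic_flag_clear (&context_lock);
}

/* Retry up to MAX_ATTEMPTS times.  Returns the number of attempts
   used on success, or -1 if the lock stayed busy throughout.  */
int
spin_acquire (int max_attempts)
{
  for (int attempt = 1; attempt <= max_attempts; attempt++)
    if (try_acquire_context_lock ())
      return attempt;
  return -1;
}
```

With the lock free, spin_acquire (3) succeeds on the first attempt. While the lock is held it returns -1 after three attempts, which is the point at which the caller would have to choose between proceeding without the context and reporting the failure.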
This bug report was last modified 4 years and 27 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.