GNU bug report logs - #36609
27.0.50; Possible race-condition in threading implementation

Package: emacs;

Reported by: Andreas Politz <politza <at> hochschule-trier.de>

Date: Thu, 11 Jul 2019 20:52:02 UTC

Severity: normal

Tags: fixed

Found in version 27.0.50

Fixed in version 28.1

Done: dick <dick.r.chiang <at> gmail.com>

Bug is archived. No further changes may be made.

From: Pip Cet <pipcet <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 36609 <at> debbugs.gnu.org, politza <at> hochschule-trier.de
Subject: bug#36609: 27.0.50; Possible race-condition in threading implementation
Date: Fri, 12 Jul 2019 14:34:44 +0000

On Fri, Jul 12, 2019 at 1:51 PM Eli Zaretskii <eliz <at> gnu.org> wrote:
> > From: Pip Cet <pipcet <at> gmail.com>
> > Date: Fri, 12 Jul 2019 13:40:15 +0000
> > Cc: politza <at> hochschule-trier.de, 36609 <at> debbugs.gnu.org
> >
> > When a thread is signalled (by thread-signal, which sets another
> > thread's error_symbol) while the signalled thread is inside a
> > select(), thread_select() will return non-locally for that thread, and
> > we fail to release an internal GLib lock through
> > g_main_context_release(). That's the first bug.
>
> We should either release the global lock before the thread exits, or
> defer the acting upon the signal until later.  We cannot disable the
> signal handling altogether because it is entirely legitimate to signal
> another thread, and when we do, that other thread will _always_ be
> inside thread_select.

Really? What about thread-yield?

> For the main thread, handling the signal in that situation shouldn't
> be a problem, because it is not going to exit.  Right?

I think the main thread can still fail to release the lock...
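
To make the first bug concrete, here is a minimal sketch in plain C, not
Emacs or GLib code. Every name in it (mock_context_acquire,
mock_thread_select, leaky_select, careful_select, context_leaked_after)
is a hypothetical stand-in, and setjmp/longjmp only models the non-local
exit taken when a signalled thread leaves its select. The "careful"
variant follows the "defer acting upon the signal until later" idea
quoted above; it is one possible shape, not a proposed patch.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical model of "who owns the GLib main context". */
static bool context_owned = false;
/* Where a signalled thread ends up after its non-local exit. */
static jmp_buf signal_handler_point;

static bool mock_context_acquire(void)
{
    if (context_owned)
        return false;
    context_owned = true;
    return true;
}

static void mock_context_release(void)
{
    assert(context_owned);
    context_owned = false;
}

/* Stand-in for the inner select: exits non-locally if signalled. */
static void mock_thread_select(bool signal_pending)
{
    if (signal_pending)
        longjmp(signal_handler_point, 1);
}

/* Leaky shape: the release is skipped when the select never returns. */
static void leaky_select(bool signal_pending)
{
    bool acquired = mock_context_acquire();
    mock_thread_select(signal_pending);
    if (acquired)
        mock_context_release();
}

/* Deferred shape: wait, release, and only then act on the signal. */
static void careful_select(bool signal_pending)
{
    bool acquired = mock_context_acquire();
    mock_thread_select(false);
    if (acquired)
        mock_context_release();
    if (signal_pending)
        longjmp(signal_handler_point, 1);
}

/* Runs one select variant with a pending signal and reports whether
   the context is still owned afterwards. */
static bool context_leaked_after(void (*select_fn)(bool))
{
    context_owned = false;
    if (setjmp(signal_handler_point) == 0)
        select_fn(true);
    return context_owned;
}
```

Calling context_leaked_after (leaky_select) leaves the mock context
owned, while context_leaked_after (careful_select) does not. In this
model it makes no difference whether the caller is the main thread.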

> > When xg_select() fails to acquire the internal GLib lock, it simply
> > does a select() on the remaining file descriptors:
>
> Why does it fail to acquire that lock?

Because another thread holds it, either an Emacs or a non-Emacs
thread. In both cases, I think we might miss events unless we return
with errno == EINTR.
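
A small model of the difference, again in plain C with hypothetical
names (silent_select, eintr_select) and plain counters in place of real
file descriptor sets. It assumes that "return with errno == EINTR" means
reporting the failed acquire as an interrupted call so the caller
retries; that is one reading of the idea, not what xg_select() does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Silent shape: without the context, the GLib descriptors are simply
   not watched, and the caller cannot tell that anything was skipped. */
static int silent_select(bool context_acquired, int emacs_ready, int glib_ready)
{
    assert(emacs_ready >= 0 && glib_ready >= 0);
    int nready = emacs_ready;
    if (context_acquired)
        nready += glib_ready;
    return nready;
}

/* EINTR shape: a failed acquire is reported like an interrupted call,
   so the caller knows the result is incomplete and can try again. */
static int eintr_select(bool context_acquired, int emacs_ready, int glib_ready)
{
    assert(emacs_ready >= 0 && glib_ready >= 0);
    if (!context_acquired) {
        errno = EINTR;
        return -1;
    }
    return emacs_ready + glib_ready;
}
```

With two GLib events pending and the context held elsewhere,
silent_select (false, 0, 2) reports nothing ready, whereas
eintr_select (false, 0, 2) fails visibly with EINTR.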

> >   context_acquired = g_main_context_acquire (context);
> >   /* FIXME: If we couldn't acquire the context, we just silently proceed
> >      because this function handles more than just glib file descriptors.
> >      Note that, as implemented, this failure is completely silent: there is
> >      no feedback to the caller.  */
> >
> > This seems like a second, albeit documented, bug to me. I think we're
> > risking not waking up from the actual select because another
> > (non-Emacs) thread happened to hold the main context at the time.
>
> So what is the proposal for that? spin waiting for the lock?

I don't know, to be honest, but I'm afraid there's currently no better
way. There's a way to take the lock without spinning, but it's been
deprecated.
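
For reference, "spinning" here would amount to something like the
bounded retry loop below. This is a plain C sketch with hypothetical
names (mock_try_acquire, spin_acquire, attempts_until_free); a real
version would call the GLib acquire function and would need to decide
what to do between attempts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: number of failed attempts before the owner lets go. */
static int attempts_until_free = 0;

static bool mock_try_acquire(void)
{
    if (attempts_until_free > 0) {
        attempts_until_free--;
        return false;
    }
    return true;
}

/* Retries the acquire up to max_attempts times.  Returns the attempt
   number that succeeded, or -1 if the context never became free. */
static int spin_acquire(int max_attempts)
{
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; attempt++)
        if (mock_try_acquire())
            return attempt;
    return -1;
}
```

The bound matters: if a non-Emacs thread holds the context for a long
time, an unbounded loop would burn CPU for as long as it does.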




This bug report was last modified 4 years and 27 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.