GNU bug report logs - #21380
25.0.50; GTK-induced segfault when scheduling timer from window-configuration-change-hook

Package: emacs;

Reported by: Pip Cet <pipcet <at> gmail.com>

Date: Sun, 30 Aug 2015 12:52:02 UTC

Severity: normal

Found in version 25.0.50

Fixed in version 29.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.


Message #53 received at 21380 <at> debbugs.gnu.org:

From: Pip Cet <pipcet <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 21380 <at> debbugs.gnu.org
Subject: Re: bug#21380: 25.0.50; GTK-induced segfault when scheduling timer
 from window-configuration-change-hook
Date: Tue, 1 Sep 2015 10:20:11 +0000
On Mon, Aug 31, 2015 at 2:31 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:

> > Yes for this particular segfault.
>
> Can you show a patch that fixes the original segfault in your use
> case?


Attached. Note that either one of those changes should work. I'll test this
patch some more using my original code and see whether it blows up.

> I'm afraid I lost track, what with all the different scenarios
> and potential solutions being thrown at this.  We should install the
> fix, assuming it's clean.
>

I think we should fix three things:
 - concat shouldn't rely on its argument remaining unchanged in length
 - the timer list copy should happen with block_input/unblock_input wrapped
around it
 - we shouldn't call do_pending_window_change from QUIT [already installed.
Thanks, martin!]

Any one of these is enough to prevent the original segfault. All but the
second also prevent the bizarre-elisp-induced segfault I came up with
later. And I'm perfectly happy for today with the number of hooks called
from QUIT reduced by one, rather than insisting on reducing them to zero
right away.
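
As a rough illustration of the first point, here is a toy C model, not the
actual Emacs source. It measures a list, lets a hypothetical hook modify the
list (standing in for Lisp run from QUIT), and then copies without trusting
the earlier measurement. The names `copy_list_checked`, `grow_hook` and
`shrink_hook` are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy singly linked list standing in for a Lisp list. */
struct node {
    int value;
    struct node *next;
};

/* Hypothetical stand-in for code that may run between measuring
   and copying, e.g. a hook that edits the list being copied. */
typedef void (*hook_fn)(struct node *head);

static size_t list_length(const struct node *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* Copy at most `cap` values from the list into `out`.
   The copy loop checks both the measured length and the end of the
   list, so it stays in bounds whether the hook grew or shrank the list.
   Returns the number of values actually copied. */
static size_t copy_list_checked(struct node *head, hook_fn hook,
                                int *out, size_t cap)
{
    size_t measured = list_length(head);
    if (measured > cap)
        measured = cap;

    if (hook != NULL)
        hook(head);             /* the list may change here */

    size_t i = 0;
    for (struct node *p = head; p != NULL && i < measured; p = p->next)
        out[i++] = p->value;

    assert(i <= cap);
    return i;
}

static struct node extra_node = { 99, NULL };

/* Example hook: append one node, so the list outgrows the measurement. */
static void grow_hook(struct node *head)
{
    if (head == NULL)
        return;
    while (head->next != NULL)
        head = head->next;
    head->next = &extra_node;
}

/* Example hook: truncate the list after its first node. */
static void shrink_hook(struct node *head)
{
    if (head != NULL)
        head->next = NULL;
}
```

A copy loop that trusted only `measured` would walk past the end of a
shortened list, and one that trusted only the list's end would write past
`out` if the list grew. Checking both avoids either overrun.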

> > *No* for similar segfaults that I think pose equally severe problems: if
> > any other function calls concat/copy-sequence on data that is modified by
> > window-configuration-change-hook, it *should* still be possible to produce
> > the segfault.
>
> Emacs gives you enough rope to hang yourself; there's nothing new
> here.  We should strive to protect the Emacs internals so that they
> won't cause segfaults, but in user code any bets are off, and "don't
> do that" is a valid response to whoever does such things.
>

It's always good to know what the philosophy is behind the way the code
works, so thank you for that, really.

> > So it wouldn't even be safe for window-configuration-change-hook to add a
> > timer to the timer list, because the outer frame might be in the middle of
> > creating a copy of the timer list for some Lisp code that hasn't blocked
> > input. (As in my example below)
>
> Futzing with timers from within some hooks is indeed fundamentally
> dangerous.


Well, doing anything from window-configuration-change-hook is dangerous. My
idea was to schedule an immediate timer from it to get out of the danger
zone to do the actual work, but that backfired...


> But we should still try to minimize the probability of a
> crash, especially when it's Emacs itself who makes the offending copy,
> because people do dangerous things all the time, and expect them to
> work.  In this case, blocking input should do, I think.
>

I agree.
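
To make the blocking idea concrete, here is a minimal toy model in C of a
nesting "input blocked" counter. It is only a sketch of the general pattern
and not the Emacs implementation of block_input/unblock_input; every name
prefixed with `toy_` is invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

static int input_blocked_depth = 0;  /* nesting depth of blocked sections */
static bool pending_event = false;   /* an event arrived while blocked */
static int handled_events = 0;       /* how many times the handler ran */

/* Stand-in for whatever would run hooks in response to an event. */
static void handle_event(void)
{
    handled_events++;
}

static void toy_block_input(void)
{
    input_blocked_depth++;
}

/* Leaving the outermost blocked section runs any deferred handler. */
static void toy_unblock_input(void)
{
    assert(input_blocked_depth > 0);
    if (--input_blocked_depth == 0 && pending_event) {
        pending_event = false;
        handle_event();
    }
}

/* An event is handled at once unless input is blocked, in which
   case it is only recorded and handled later. */
static void toy_event_arrives(void)
{
    if (input_blocked_depth > 0)
        pending_event = true;
    else
        handle_event();
}
```

Wrapping the copy of a shared list between `toy_block_input()` and
`toy_unblock_input()` means the handler cannot run, and so cannot modify the
list, until the copy is complete. In this model several events arriving
while blocked collapse into a single deferred call.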

> > I really don't think QUIT should run any Lisp hooks, to be honest.
>
> I don't think this limitation could fly.  It will disable a lot of
> useful use patterns, and the outcry will be loud and clear.
>

Okay.

> > If I'm wrong and QUIT should be able to run Lisp hooks, concat needs to
> > be fixed not to rely on its argument's size being unchanged after the
> > make_sequence call.
>
> That can't do any harm, so let's do it, too.
>

Cool.


> > As far as I can tell, that should be reproducible. Also as far as I can
> > tell, it's merely a matter of luck that an X resize doesn't happen at the
> > point where I interrupted the program to artificially trigger the segfault.
> > However, I admit that it is a separate issue, less likely to occur in
> > practice, and I'll open another bug for it if that's the way you prefer
> > things.
>
> But if input is blocked, as it would be in the case of copying
> timer-list inside timer_check, the X events will not be acted upon,
> and the problem will not happen, right?
>

Indeed, that relies on bizarre elisp code deliberately doing silly things...


> IOW, the above situation is a case of a user shooting herself in the
> foot by having that particular function in the hook and that
> particular code that copies timer-list (which is an internal variable
> unwise users should not touch).  Am I right?
>

I think you are. I'm not sure whether the timer code in timer.el does
anything to the timer list that might count as dangerous, but that's
possibly the only legitimate Lisp user of timer-list.
[0001-Fix-potential-race-conditions-bug-23380.patch (text/x-patch, attachment)]

This bug report was last modified 3 years and 77 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.