GNU bug report logs - #21394
25.0.50; Segfault when displaying unprintable character in echo area while frames are being created

Previous Next

Package: emacs;

Reported by: Pip Cet <pipcet <at> gmail.com>

Date: Tue, 1 Sep 2015 16:45:02 UTC

Severity: normal

Tags: patch

Found in version 25.0.50

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21394 in the body.
You can then email your comments to 21394 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#21394; Package emacs. (Tue, 01 Sep 2015 16:45:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Pip Cet <pipcet <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 01 Sep 2015 16:45:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 25.0.50; Segfault when displaying unprintable character in echo area
 while frames are being created
Date: Tue, 1 Sep 2015 16:43:13 +0000
[Message part 1 (text/plain, inline)]
Recipe:

Launch emacs -Q.

C-x C-f emacs-bug-038.el
M-x eval-buffer

Wait for approximately one minute.

The contents of emacs-bug-038.el are:

------
(run-with-timer 0 .01 (lambda () (message (make-string 1 0))))

(while t
  (let ((f (make-frame)))
    (sit-for .1)
    (delete-frame f)))
------

The repository is unmodified Emacs HEAD; while the CFLAGS contain an extra
option to dump gimple trees, I believe that should have no influence on the
produced code. I have attached a full gdb log.

I have been unable to reproduce this bug with emacs -Q --load
emacs-bug-038.el.

Please let me know if I can provide any further information or do anything
else to help fix this bug.

In GNU Emacs 25.0.50.14 (x86_64-unknown-linux-gnu, GTK+ Version 3.16.6)
 of 2015-09-01
Repository revision: da88ad6cb1e98ef9f28462648ab53d81d22dd189
Windowing system distributor 'The X.Org Foundation', version 11.0.11702000
System Description:    Debian GNU/Linux unstable (sid)

Configured using:
 'configure 'CFLAGS=-O0 -g3 -fdump-tree-gimple-vops-verbose-raw-lineno''

Configured features:
XPM JPEG TIFF GIF PNG RSVG SOUND DBUS GCONF GSETTINGS NOTIFY LIBSELINUX
GNUTLS LIBXML2 FREETYPE XFT ZLIB TOOLKIT_SCROLL_BARS GTK3 X11

Important settings:
  value of $LANG: en_US.UTF-8
  locale-coding-system: utf-8-unix

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr emacsbug message dired format-spec
rfc822 mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231
mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums
mm-util help-fns help-mode easymenu cl-loaddefs pcase cl-lib mail-prsvr
mail-utils time-date mule-util tooltip eldoc electric uniquify
ediff-hook vc-hooks lisp-float-type mwheel x-win term/common-win x-dnd
tool-bar dnd fontset image regexp-opt fringe tabulated-list newcomment
elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow timer
select scroll-bar mouse jit-lock font-lock syntax facemenu font-core
frame cl-generic cham georgian utf-8-lang misc-lang vietnamese tibetan
thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek romanian
slovak czech european ethiopic indian cyrillic chinese charscript
case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer
cl-preloaded nadvice loaddefs button faces cus-face macroexp files
text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote dbusbind inotify
dynamic-setting system-font-setting font-render-setting move-toolbar gtk
x-toolkit x multi-tty make-network-process emacs)

Memory information:
((conses 16 81444 9134)
 (symbols 48 19017 0)
 (miscs 40 42 136)
 (strings 32 13315 4665)
 (string-bytes 1 380651)
 (vectors 16 11238)
 (vector-slots 8 413731 5198)
 (floats 8 131 60)
 (intervals 56 189 0)
 (buffers 976 11)
 (heap 1024 18466 983))
[Message part 2 (text/html, inline)]
[emacs-bug-041.txt (text/plain, attachment)]
[emacs-bug-038.el (text/x-emacs-lisp, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21394; Package emacs. (Tue, 01 Sep 2015 16:58:02 GMT) Full text and rfc822 format available.

Message #8 received at 21394 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Pip Cet <pipcet <at> gmail.com>
Cc: 21394 <at> debbugs.gnu.org
Subject: Re: bug#21394: 25.0.50;
 Segfault when displaying unprintable character in echo area
 while	frames are being created
Date: Tue, 01 Sep 2015 19:57:19 +0300
> Date: Tue, 1 Sep 2015 16:43:13 +0000
> From: Pip Cet <pipcet <at> gmail.com>
> 
> Launch emacs -Q.
> 
> C-x C-f emacs-bug-038.el
> M-x eval-buffer
> 
> Wait for approximately one minute.

I waited for 1.5 minutes, and saw no crash.  But that doesn't surprise
me, since these problems are very elusive.

> 26484	  if (it->what == IT_CHARACTER)
> 26485	    {
> 26486	      XChar2b char2b;
> 26487	      struct face *face = FACE_FROM_ID (it->f, it->face_id);
> 26488	      struct font *font = face->font;
> 26489	      struct font_metrics *pcm = NULL;
> 26490	      int boff;			/* Baseline offset.  */
> 26491	
> 26492	      if (font == NULL)
> (gdb) p face
> $1 = (struct face *) 0x0
> (gdb) p it->face_id
> $2 = 18
> (gdb) p it->f->face_cache->used
> $3 = 15

This is another incarnation of several similar crashes we've seen for
the past year or so (you can see them on the bug tracker).  I was
never able to find the code path which is responsible for this strange
situation, where a (supposedly) realized face is somehow not cached.
If you can somehow produce a trace of face caching and un-caching in
this recipe, with backtraces to show who calls those, perhaps we could
finally nail this sucker.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21394; Package emacs. (Tue, 01 Sep 2015 17:31:02 GMT) Full text and rfc822 format available.

Message #11 received at 21394 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 21394 <at> debbugs.gnu.org
Subject: Re: bug#21394: 25.0.50; Segfault when displaying unprintable
 character in echo area while frames are being created
Date: Tue, 1 Sep 2015 17:30:36 +0000
[Message part 1 (text/plain, inline)]
On Tue, Sep 1, 2015 at 4:57 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:

> > Date: Tue, 1 Sep 2015 16:43:13 +0000
> > From: Pip Cet <pipcet <at> gmail.com>
> >
> > Launch emacs -Q.
> >
> > C-x C-f emacs-bug-038.el
> > M-x eval-buffer
> >
> > Wait for approximately one minute.
>
> I waited for 1.5 minutes, and saw no crash.  But that doesn't surprise
> me, since these problems are very elusive.
>

Well, obviously it's a race condition and happens randomly. It certainly
seems to be running a lot longer with the debugging code you requested...


> > 26484   if (it->what == IT_CHARACTER)
> > 26485     {
> > 26486       XChar2b char2b;
> > 26487       struct face *face = FACE_FROM_ID (it->f, it->face_id);
> > 26488       struct font *font = face->font;
> > 26489       struct font_metrics *pcm = NULL;
> > 26490       int boff;                 /* Baseline offset.  */
> > 26491
> > 26492       if (font == NULL)
> > (gdb) p face
> > $1 = (struct face *) 0x0
> > (gdb) p it->face_id
> > $2 = 18
> > (gdb) p it->f->face_cache->used
> > $3 = 15
>
> This is another incarnation of several similar crashes we've seen for
> the past year or so (you can see them on the bug tracker).


I hope you're right, obviously. I'll try finding them on the bug tracker,
but if you have bug #s that would be very much appreciated.


> If you can somehow produce a trace of face caching and un-caching in
> this recipe, with backtraces to show who calls those, perhaps we could
> finally nail this sucker.
>

I think I've got what you want here. However, I'm not sure how far back in
the log is relevant to you; I'm attaching the end of the log here, but will
send the full log by private email, if that's okay?

I've used these gdb commands:

----
b free_realized_faces
b uncache_face
b cache_face
command 1
bt
c
end
command 2
p c->used
bt
c
end
command 3
p c->used
bt
c
end
----

Please feel free to respond with a list of alternative commands, and I'll
be happy to attempt producing another backtrace here.

I'm currently trying with the same commands with .gdbinit loaded, so we
have xbacktraces, but xbacktrace is really slow so it might take a while
(or never trigger the bug, obviously). It appears to be slower by about a
factor of 100, so this will probably run for a few hours.
[Message part 2 (text/html, inline)]
[emacs-bug-046.txt (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21394; Package emacs. (Tue, 01 Sep 2015 18:09:02 GMT) Full text and rfc822 format available.

Message #14 received at 21394 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Pip Cet <pipcet <at> gmail.com>
Cc: 21394 <at> debbugs.gnu.org
Subject: Re: bug#21394: 25.0.50;
 Segfault when displaying unprintable character in echo area while
 frames are being created
Date: Tue, 01 Sep 2015 21:08:57 +0300
> Date: Tue, 1 Sep 2015 17:33:10 +0000
> From: Pip Cet <pipcet <at> gmail.com>
> 
> This is the full log, including all my backspacing (so the end might be more
> readable in the edited log I've sent to the list). I'm not sure what the size
> limits are for sending attachments to the list.
> [...]
> (gdb) p face
> $1787 = (struct face *) 0x0
> (gdb) p it
> $1788 = (struct it *) 0x7fffffff9de0
> (gdb) p it->face_id
> $1789 = 18
> $1790 = (struct frame *) 0x129f9e0
> (gdb) p it->f->face_cache->used
> $1791 = 15

I see the problematic face always has face ID of 18, and the 'used'
field is always 15 when the segfault strikes.  So I guess the next
step is to make the breakpoint in cache_face conditional on i being
18, and then see whether c->used is set to 19 during that call to
cache_face.  If it does, then a watchpoint (by location) on c->used
should show which code makes the value smaller.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21394; Package emacs. (Tue, 01 Sep 2015 18:53:01 GMT) Full text and rfc822 format available.

Message #17 received at 21394 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 21394 <at> debbugs.gnu.org
Subject: Re: bug#21394: 25.0.50; Segfault when displaying unprintable
 character in echo area while frames are being created
Date: Tue, 1 Sep 2015 18:52:44 +0000
[Message part 1 (text/plain, inline)]
On Tue, Sep 1, 2015 at 6:08 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:

> > Date: Tue, 1 Sep 2015 17:33:10 +0000
> > From: Pip Cet <pipcet <at> gmail.com>
> >
> > This is the full log, including all my backspacing (so the end might be
> more
> > readable in the edited log I've sent to the list). I'm not sure what the
> size
> > limits are for sending attachments to the list.
> > [...]
> > (gdb) p face
> > $1787 = (struct face *) 0x0
> > (gdb) p it
> > $1788 = (struct it *) 0x7fffffff9de0
> > (gdb) p it->face_id
> > $1789 = 18
> > $1790 = (struct frame *) 0x129f9e0
> > (gdb) p it->f->face_cache->used
> > $1791 = 15
>
> I see the problematic face always has face ID of 18, and the 'used'
> field is always 15 when the segfault strikes.  So I guess the next
> step is to make the breakpoint in cache_face conditional on i being
> 18,


i is used in two different ways in that function, as a face hash and as an
index into faces_by_id. I assume you mean the latter?


> and then see whether c->used is set to 19 during that call to
> cache_face.  If it does, then a watchpoint (by location) on c->used
> should show which code makes the value smaller.
>

So I wrote a perl script to set a watchpoint on c->used whenever we
allocate a new face cache c in make_face_cache, and clear the watchpoint
when we hit free_face_cache. Output attached, but do let me know what else
you would like watched. I think that has all the information your approach
would have given us.
[Message part 2 (text/html, inline)]
[emacs-bug-050.txt.gz (application/x-gzip, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21394; Package emacs. (Tue, 01 Sep 2015 19:38:01 GMT) Full text and rfc822 format available.

Message #20 received at 21394 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Pip Cet <pipcet <at> gmail.com>
Cc: 21394 <at> debbugs.gnu.org
Subject: Re: bug#21394: 25.0.50;
 Segfault when displaying unprintable character in echo area while
 frames are being created
Date: Tue, 01 Sep 2015 22:37:16 +0300
> Date: Tue, 1 Sep 2015 18:52:44 +0000
> From: Pip Cet <pipcet <at> gmail.com>
> Cc: 21394 <at> debbugs.gnu.org
> 
>     I see the problematic face always has face ID of 18, and the 'used'
>     field is always 15 when the segfault strikes. So I guess the next
>     step is to make the breakpoint in cache_face conditional on i being
>     18,
> 
> i is used in two different ways in that function, as a face hash and as an
> index into faces_by_id. I assume you mean the latter?

Yes.

>     and then see whether c->used is set to 19 during that call to
>     cache_face. If it does, then a watchpoint (by location) on c->used
>     should show which code makes the value smaller.
> 
> So I wrote a perl script to set a watchpoint on c->used whenever we allocate a
> new face cache c in make_face_cache, and clear the watchpoint when we hit
> free_face_cache. Output attached, but do let me know what else you would like
> watched. I think that has all the information your approach would have given
> us.

Sounds like the face ID of 18 is "remembered" somewhere, and not
"forgotten" when the face cache is free'd?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21394; Package emacs. (Tue, 01 Sep 2015 20:21:01 GMT) Full text and rfc822 format available.

Message #23 received at 21394 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 21394 <at> debbugs.gnu.org
Subject: Re: bug#21394: 25.0.50; Segfault when displaying unprintable
 character in echo area while frames are being created
Date: Tue, 1 Sep 2015 20:19:58 +0000
[Message part 1 (text/plain, inline)]
This code in xdisp.c looks very suspicious to me:

------
static int last_escape_glyph_face_id = (1 << FACE_ID_BITS);
static int last_escape_glyph_merged_face_id = 0;

static int
merge_escape_glyph_face (struct it *it)
{
  int face_id;

  if (it->f == last_escape_glyph_frame
      && it->face_id == last_escape_glyph_face_id)
    face_id = last_escape_glyph_merged_face_id;
  else
    {
      /* Merge the `escape-glyph' face into the current face.  */
      face_id = merge_faces (it->f, Qescape_glyph, 0, it->face_id);
      last_escape_glyph_frame = it->f;
      last_escape_glyph_face_id = it->face_id;
      last_escape_glyph_merged_face_id = face_id;
    }
  return face_id;
}
------

That caches a face id in last_escape_glyph_merged_face_id, which is cleared
only in redisplay_internal(). But message() doesn't call
redisplay_internal(), it calls try_window() directly (xdisp.c:10687) (and
resize_window before that, which blows up).

This patch appears, so far, to run without a segfault (I neglected to
record timings so can't give you a p-value, but back of the envelope it's
>99% that it fixes the segfault), and it fixes what I think is a possible
explanation for the segfault, so my current plan is to leave it running
overnight and ask for inclusion (of this or an equivalent, cleaner patch,
of course) if it doesn't blow up.

However, I have yet to look at the other bugs and doubt this is the whole
story.


On Tue, Sep 1, 2015 at 7:37 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:

> > Date: Tue, 1 Sep 2015 18:52:44 +0000
> > From: Pip Cet <pipcet <at> gmail.com>
> > Cc: 21394 <at> debbugs.gnu.org
> >
> >     I see the problematic face always has face ID of 18, and the 'used'
> >     field is always 15 when the segfault strikes. So I guess the next
> >     step is to make the breakpoint in cache_face conditional on i being
> >     18,
> >
> > i is used in two different ways in that function, as a face hash and as
> an
> > index into faces_by_id. I assume you mean the latter?
>
> Yes.
>
> >     and then see whether c->used is set to 19 during that call to
> >     cache_face. If it does, then a watchpoint (by location) on c->used
> >     should show which code makes the value smaller.
> >
> > So I wrote a perl script to set a watchpoint on c->used whenever we
> allocate a
> > new face cache c in make_face_cache, and clear the watchpoint when we hit
> > free_face_cache. Output attached, but do let me know what else you would
> like
> > watched. I think that has all the information your approach would have
> given
> > us.
>
> Sounds like the face ID of 18 is "remembered" somewhere, and not
> "forgotten" when the face cache is free'd?
>
[Message part 2 (text/html, inline)]
[0001-Forget-cached-face-ids-when-displaying-echo-area-mes.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21394; Package emacs. (Wed, 02 Sep 2015 15:17:01 GMT) Full text and rfc822 format available.

Message #26 received at 21394 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Pip Cet <pipcet <at> gmail.com>
Cc: 21394 <at> debbugs.gnu.org
Subject: Re: bug#21394: 25.0.50;
 Segfault when displaying unprintable character in echo area while
 frames are being created
Date: Wed, 02 Sep 2015 18:16:22 +0300
> Date: Tue, 1 Sep 2015 20:19:58 +0000
> From: Pip Cet <pipcet <at> gmail.com>
> Cc: 21394 <at> debbugs.gnu.org
> 
> That caches a face id in last_escape_glyph_merged_face_id, which is cleared
> only in redisplay_internal(). But message() doesn't call redisplay_internal(),
> it calls try_window() directly (xdisp.c:10687) (and resize_window before that,
> which blows up).

I think you are right, thanks.

> This patch appears, so far, to run without a segfault

Can you please verify that when the (unmodified) code does segfault,
the offending face ID is indeed stored in
last_escape_glyph_merged_face_id?  I'd like to be sure we have really
found the source of that face ID.

Also, I think that those 4 lines of code should be in a function, and
that function should also be called from free_realized_faces, where we
set c->used to zero.  I couldn't convince myself that
free_realized_faces could never be called while redisplay runs (and
even if now it isn't, I'd like us to be more future-proof in this
respect).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#21394; Package emacs. (Wed, 02 Sep 2015 16:09:01 GMT) Full text and rfc822 format available.

Message #29 received at 21394 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 21394 <at> debbugs.gnu.org
Subject: Re: bug#21394: 25.0.50; Segfault when displaying unprintable
 character in echo area while frames are being created
Date: Wed, 2 Sep 2015 16:08:08 +0000
[Message part 1 (text/plain, inline)]
On Wed, Sep 2, 2015 at 3:16 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:

> > This patch appears, so far, to run without a segfault
>

No segfault overnight.


> Can you please verify that when the (unmodified) code does segfault,
> the offending face ID is indeed stored in
> last_escape_glyph_merged_face_id?  I'd like to be sure we have really
> found the source of that face ID.
>

It is, indeed, and last_escape_glyph_frame also equals it->f.

Revised patch attached.
[Message part 2 (text/html, inline)]
[0001-Forget-cached-face-ids-when-displaying-echo-area-mes.patch (text/x-patch, attachment)]

Added tag(s) patch. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Fri, 04 Sep 2015 05:19:01 GMT) Full text and rfc822 format available.

Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Sat, 05 Sep 2015 08:13:02 GMT) Full text and rfc822 format available.

Notification sent to Pip Cet <pipcet <at> gmail.com>:
bug acknowledged by developer. (Sat, 05 Sep 2015 08:13:03 GMT) Full text and rfc822 format available.

Message #36 received at 21394-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Pip Cet <pipcet <at> gmail.com>
Cc: 21394-done <at> debbugs.gnu.org
Subject: Re: bug#21394: 25.0.50;
 Segfault when displaying unprintable character in echo area while
 frames are being created
Date: Sat, 05 Sep 2015 11:12:09 +0300
> Date: Wed, 2 Sep 2015 16:08:08 +0000
> From: Pip Cet <pipcet <at> gmail.com>
> Cc: 21394 <at> debbugs.gnu.org
> 
> On Wed, Sep 2, 2015 at 3:16 PM, Eli Zaretskii <eliz <at> gnu.org> wrote:
> 
>     > This patch appears, so far, to run without a segfault
>     
> No segfault overnight.
> 
>     Can you please verify that when the (unmodified) code does segfault,
>     the offending face ID is indeed stored in
>     last_escape_glyph_merged_face_id? I'd like to be sure we have really
>     found the source of that face ID.
> 
> It is, indeed, and last_escape_glyph_frame also equals it->f.
> 
> Revised patch attached.

Thanks, I pushed it with a few minor changes.

I'm closing this bug.  feel free to reopen if something else emerges
that needs to be taken care of.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 03 Oct 2015 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 9 years and 319 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.