GNU bug report logs - #75477
31.0.50; scratch/igc: crash on the latest commit

Previous Next

Package: emacs;

Reported by: Ihor Radchenko <yantar92 <at> posteo.net>

Date: Fri, 10 Jan 2025 13:38:02 UTC

Severity: normal

Found in version 31.0.50

Done: Pip Cet <pipcet <at> protonmail.com>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Pip Cet <pipcet <at> protonmail.com>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: Gerd Möllmann <gerd.moellmann <at> gmail.com>, 75477 <at> debbugs.gnu.org
Subject: bug#75477: 31.0.50; scratch/igc: crash on the latest commit
Date: Fri, 10 Jan 2025 19:01:53 +0000
Pip Cet <pipcet <at> protonmail.com> writes:

> Pip Cet <pipcet <at> protonmail.com> writes:
>
>> Pip Cet <pipcet <at> protonmail.com> writes:
>>
>>> "Ihor Radchenko" <yantar92 <at> posteo.net> writes:
>>>
>>>> Gerd Möllmann <gerd.moellmann <at> gmail.com> writes:
>>>>
>>>>> Don't know if that helps, but M-x igc-root-stats can be used to display
>>>>> information about known roots. Maybe one can see there if the number of
>>>>> roots increases over time, which would indicate if there is something
>>>>> like a "root leak", for example by using xfree instead of igc_xfree.
>>>>
>>>> I noticed that creating a new frame took longer and longer over time
>>>> recently. Up to a dozen of seconds.
>>>
>>> Thanks! Trying to reproduce that here with:
>>>
>>> ./src/emacs -Q --eval '(run-with-timer 1.0 1.0 (lambda () (delete-frame (make-frame))))'
>>>
>>> indicates 8 xzalloc-ambig roots apparently leaked per frame created
>>> (after the fix I just pushed).  Ouch.  Even if we xfree() those, that's
>>> a great number of heap words incorrectly declared to be ambiguous roots,
>>> which may hide other bugs.
>>>
>>> No apparent leak with --with-x-toolkit=no, so we know where to look.
>>
>> Or not.  It's down to one leak/frame now, which is still bad, but I
>
> It's weird bug day: I'm seeing one leak/frame sometimes, sometimes it's
> two leaks/frame, and I expected the following patch to give me a unique
> call chain to a root that isn't freed:
>
> diff --git a/src/igc.c b/src/igc.c
> index f034aae9460..cac9cd5501c 100644
> --- a/src/igc.c
> +++ b/src/igc.c
> @@ -858,6 +858,7 @@ igc_check_fwd (void *client, bool is_vector)
>    void *start, *end;
>    const char *label;
>    bool ambig;
> +  void *caller[4];
>  };
>  
>  typedef struct igc_root igc_root;
> @@ -3217,7 +3218,11 @@ igc_xzalloc_ambig (size_t size)
>    void *end = (char *) p + size;
>    if (end == p)
>      end = (char *) p + IGC_ALIGN_DFLT;
> -  root_create_ambig (global_igc, p, end, "xzalloc-ambig");
> +  struct igc_root_list *r = root_create_ambig (global_igc, p, end, "xzalloc-ambig");
> +  r->d.caller[0] = __builtin_return_address (0);
> +  r->d.caller[1] = __builtin_return_address (1);
> +  r->d.caller[2] = __builtin_return_address (2);
> +  r->d.caller[3] = __builtin_return_address (3);
>    return p;
>  }
>  
>
> However, while I do see what I think are the 100 leaks after running
>
> ./src/emacs -Q --eval '(dotimes (i 100) (delete-frame (make-frame)))'
>
> they have different call chains.
>
> I'm using
>
> p global_igc->roots[0]
> while 1
> p *$.next
> end
>
> in GDB, and I was expecting the leaked roots to be among the first
> values printed.
>
> Is there something obvious I'm doing wrong?  Or are we really creating
> menuitems in such a way that we usually leak one, but it's random which
> one?

I'm suspicious of this code in gtkutil.c:

void
free_frame_tool_bar (struct frame *f)
{
  xp_output *x = f->output_data.xp;

  if (x->toolbar_widget)
    {
      struct xg_frame_tb_info *tbinfo;
      GtkWidget *top_widget = x->toolbar_widget;

      block_input ();
      if (x->toolbar_is_packed)
        {
          if (x->toolbar_in_hbox)
            gtk_container_remove (GTK_CONTAINER (x->hbox_widget),
                                  top_widget);
          else
            gtk_container_remove (GTK_CONTAINER (x->vbox_widget),
                                  top_widget);
        }
      else
        gtk_widget_destroy (x->toolbar_widget);

      x->toolbar_widget = 0;
      x->toolbar_widget = 0;

Not least because that first assignment seems like a dead store (maybe
GCC should warn about those rather than complaining about perfectly
valid code ;-) ).

I don't see how the toolbar packing code ensures x->toolbar_widget no
longer needs to be freed.

However, free_frame_tool_bar isn't called at all, it seems, if we create
a frame, then destroy it.  Calling it from xg_free_frame_widgets crashes
because free_frame_tool_bar calls code which assumes there's a valid
window.  Removing that call gets us down to one leak/frame with a unique
call chain:

  imc = gtk_im_multicontext_new ();
  g_object_ref (imc);
  gtk_im_context_set_use_preedit (imc, TRUE);

  g_signal_connect_data (G_OBJECT (imc), "commit",
			 G_CALLBACK (xg_im_context_commit),
			 glib_user_data (f), free_glib_user_data,
			 0);
  g_signal_connect (G_OBJECT (imc), "preedit-changed",
		    G_CALLBACK (xg_im_context_preedit_changed), NULL);
  g_signal_connect (G_OBJECT (imc), "preedit-end",
		    G_CALLBACK (xg_im_context_preedit_end), NULL);
  FRAME_X_OUTPUT (f)->im_context = imc;

  g_signal_connect (G_OBJECT (wfixed), "key-press-event",
		    G_CALLBACK (xg_widget_key_press_event_cb),
		    NULL);

This creates an artificial extra reference to imc; there's a paired
call in xg_free_frame_widgets, but we need an extra call to destroy the
extra ref as well as the ref gtk_im_multicontext_new already created for
us.

Removing the call to g_object_ref "fixes" things, but I assume it was
there for a reason.  Long story short, this diff "works" but needs to be
redone properly:

diff --git a/src/gtkutil.c b/src/gtkutil.c
index e1949b4a06d..87f9aa854e1 100644
--- a/src/gtkutil.c
+++ b/src/gtkutil.c
@@ -1762,7 +1762,7 @@ xg_create_frame_widgets (struct frame *f)
 			 0);
 
   imc = gtk_im_multicontext_new ();
-  g_object_ref (imc);
+  // g_object_ref (imc);
   gtk_im_context_set_use_preedit (imc, TRUE);
 
   g_signal_connect_data (G_OBJECT (imc), "commit",
@@ -1907,6 +1907,7 @@ xg_create_frame_outer_widgets (struct frame *f)
 void
 xg_free_frame_widgets (struct frame *f)
 {
+  free_frame_tool_bar (f);
   if (FRAME_GTK_OUTER_WIDGET (f))
     {
       xp_output *x = f->output_data.xp;
@@ -6232,10 +6233,9 @@ free_frame_tool_bar (struct frame *f)
             gtk_container_remove (GTK_CONTAINER (x->vbox_widget),
                                   top_widget);
         }
-      else
-        gtk_widget_destroy (x->toolbar_widget);
+      if (x->toolbar_widget)
+	gtk_widget_destroy (x->toolbar_widget);
 
-      x->toolbar_widget = 0;
       x->toolbar_widget = 0;
       x->toolbar_is_packed = false;
       FRAME_TOOLBAR_TOP_HEIGHT (f) = FRAME_TOOLBAR_BOTTOM_HEIGHT (f) = 0;
@@ -6255,7 +6255,7 @@ free_frame_tool_bar (struct frame *f)
                              NULL);
         }
 
-      adjust_frame_size (f, -1, -1, 2, 0, Qtool_bar_lines);
+      // adjust_frame_size (f, -1, -1, 2, 0, Qtool_bar_lines);
 
       unblock_input ();
     }

However, even if a fix like that works, the GTK code simply creates too
many ambiguous unprotected roots for MPS to deal with: thousands of them
per frame.  We should fix the "unprotected", then the "ambiguous", then
the "roots", or at least start doing that until performance becomes
acceptable again.

I suspect the only reason it doesn't make my Emacs unusable is that I
don't use menu bars or tool bars.

Pip





This bug report was last modified 88 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.