Package: emacs;
Reported by: Ihor Radchenko <yantar92 <at> posteo.net>
Date: Fri, 10 Jan 2025 13:38:02 UTC
Severity: normal
Found in version 31.0.50
Done: Pip Cet <pipcet <at> protonmail.com>
Bug is archived. No further changes may be made.
Message #71 received at 75477 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Ihor Radchenko <yantar92 <at> posteo.net> Cc: Gerd Möllmann <gerd.moellmann <at> gmail.com>, 75477 <at> debbugs.gnu.org Subject: Re: bug#75477: 31.0.50; scratch/igc: crash on the latest commit Date: Fri, 10 Jan 2025 19:01:53 +0000
Pip Cet <pipcet <at> protonmail.com> writes: > Pip Cet <pipcet <at> protonmail.com> writes: > >> Pip Cet <pipcet <at> protonmail.com> writes: >> >>> "Ihor Radchenko" <yantar92 <at> posteo.net> writes: >>> >>>> Gerd Möllmann <gerd.moellmann <at> gmail.com> writes: >>>> >>>>> Don't know if that helps, but M-x igc-root-stats can be used to display >>>>> information about known roots. Maybe one can see there if the number of >>>>> roots increases over time, which would indicate if there is something >>>>> like a "root leak", for example by using xfree instead of igc_xfree. >>>> >>>> I noticed that creating a new frame took longer and longer over time >>>> recently. Up to a dozen of seconds. >>> >>> Thanks! Trying to reproduce that here with: >>> >>> ./src/emacs -Q --eval '(run-with-timer 1.0 1.0 (lambda () (delete-frame (make-frame))))' >>> >>> indicates 8 xzalloc-ambig roots apparently leaked per frame created >>> (after the fix I just pushed). Ouch. Even if we xfree() those, that's >>> a great number of heap words incorrectly declared to be ambiguous roots, >>> which may hide other bugs. >>> >>> No apparent leak with --with-x-toolkit=no, so we know where to look. >> >> Or not. It's down to one leak/frame now, which is still bad, but I > > It's weird bug day: I'm seeing one leak/frame sometimes, sometimes it's > two leaks/frame, and I expected the following patch to give me a unique > call chain to a root that isn't freed: > > diff --git a/src/igc.c b/src/igc.c > index f034aae9460..cac9cd5501c 100644 > --- a/src/igc.c > +++ b/src/igc.c > @@ -858,6 +858,7 @@ igc_check_fwd (void *client, bool is_vector) > void *start, *end; > const char *label; > bool ambig; > + void *caller[4]; > }; > > typedef struct igc_root igc_root; > @@ -3217,7 +3218,11 @@ igc_xzalloc_ambig (size_t size) > void *end = (char *) p + size; > if (end == p) > end = (char *) p + IGC_ALIGN_DFLT; > - root_create_ambig (global_igc, p, end, "xzalloc-ambig"); > + struct igc_root_list *r = root_create_ambig (global_igc, p, end, "xzalloc-ambig"); > + r->d.caller[0] = __builtin_return_address (0); > + r->d.caller[1] = __builtin_return_address (1); > + r->d.caller[2] = __builtin_return_address (2); > + r->d.caller[3] = __builtin_return_address (3); > return p; > } > > > However, while I do see what I think are the 100 leaks after running > > ./src/emacs -Q --eval '(dotimes (i 100) (delete-frame (make-frame)))' > > they have different call chains. > > I'm using > > p global_igc->roots[0] > while 1 > p *$.next > end > > in GDB, and I was expecting the leaked roots to be among the first > values printed. > > Is there something obvious I'm doing wrong? Or are we really creating > menuitems in such a way that we usually leak one, but it's random which > one? I'm suspicious of this code in gtkutil.c: void free_frame_tool_bar (struct frame *f) { xp_output *x = f->output_data.xp; if (x->toolbar_widget) { struct xg_frame_tb_info *tbinfo; GtkWidget *top_widget = x->toolbar_widget; block_input (); if (x->toolbar_is_packed) { if (x->toolbar_in_hbox) gtk_container_remove (GTK_CONTAINER (x->hbox_widget), top_widget); else gtk_container_remove (GTK_CONTAINER (x->vbox_widget), top_widget); } else gtk_widget_destroy (x->toolbar_widget); x->toolbar_widget = 0; x->toolbar_widget = 0; Not least because that first assignment seems like a dead store (maybe GCC should warn about those rather than complaining about perfectly valid code ;-) ). I don't see how the toolbar packing code ensures x->toolbar_widget no longer needs to be freed. However, free_frame_tool_bar isn't called at all, it seems, if we create a frame, then destroy it. Calling it from xg_free_frame_widgets crashes because free_frame_tool_bar calls code which assumes there's a valid window. Removing that call gets us down to one leak/frame with a unique call chain: imc = gtk_im_multicontext_new (); g_object_ref (imc); gtk_im_context_set_use_preedit (imc, TRUE); g_signal_connect_data (G_OBJECT (imc), "commit", G_CALLBACK (xg_im_context_commit), glib_user_data (f), free_glib_user_data, 0); g_signal_connect (G_OBJECT (imc), "preedit-changed", G_CALLBACK (xg_im_context_preedit_changed), NULL); g_signal_connect (G_OBJECT (imc), "preedit-end", G_CALLBACK (xg_im_context_preedit_end), NULL); FRAME_X_OUTPUT (f)->im_context = imc; g_signal_connect (G_OBJECT (wfixed), "key-press-event", G_CALLBACK (xg_widget_key_press_event_cb), NULL); This creates an artificial extra reference to imc; there's a paired call in xg_free_frame_widgets, but we need an extra call to destroy the extra ref as well as the ref gtk_im_multicontext_new already created for us. Removing the call to g_object_ref "fixes" things, but I assume it was there for a reason. Long story short, this diff "works" but needs to be redone properly: diff --git a/src/gtkutil.c b/src/gtkutil.c index e1949b4a06d..87f9aa854e1 100644 --- a/src/gtkutil.c +++ b/src/gtkutil.c @@ -1762,7 +1762,7 @@ xg_create_frame_widgets (struct frame *f) 0); imc = gtk_im_multicontext_new (); - g_object_ref (imc); + // g_object_ref (imc); gtk_im_context_set_use_preedit (imc, TRUE); g_signal_connect_data (G_OBJECT (imc), "commit", @@ -1907,6 +1907,7 @@ xg_create_frame_outer_widgets (struct frame *f) void xg_free_frame_widgets (struct frame *f) { + free_frame_tool_bar (f); if (FRAME_GTK_OUTER_WIDGET (f)) { xp_output *x = f->output_data.xp; @@ -6232,10 +6233,9 @@ free_frame_tool_bar (struct frame *f) gtk_container_remove (GTK_CONTAINER (x->vbox_widget), top_widget); } - else - gtk_widget_destroy (x->toolbar_widget); + if (x->toolbar_widget) + gtk_widget_destroy (x->toolbar_widget); - x->toolbar_widget = 0; x->toolbar_widget = 0; x->toolbar_is_packed = false; FRAME_TOOLBAR_TOP_HEIGHT (f) = FRAME_TOOLBAR_BOTTOM_HEIGHT (f) = 0; @@ -6255,7 +6255,7 @@ free_frame_tool_bar (struct frame *f) NULL); } - adjust_frame_size (f, -1, -1, 2, 0, Qtool_bar_lines); + // adjust_frame_size (f, -1, -1, 2, 0, Qtool_bar_lines); unblock_input (); } However, even if a fix like that works, the GTK code simply creates too many ambiguous unprotected roots for MPS to deal with: thousands of them per frame. We should fix the "unprotected", then the "ambiguous", then the "roots", or at least start doing that until performance becomes acceptable again. I suspect the only reason it doesn't make my Emacs unusable is that I don't use menu bars or tool bars. Pip
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.