GNU bug report logs - #43389
28.0.50; Emacs memory leaks


Package: emacs;

Reported by: Michael Heerdegen <michael_heerdegen <at> web.de>

Date: Mon, 14 Sep 2020 00:44:01 UTC

Severity: normal

Merged with 43395, 43876, 44666

Found in version 28.0.50

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.



Message #141 received at 43389 <at> debbugs.gnu.org:

From: Trevor Bentley <trevor <at> trevorbentley.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 43389 <at> debbugs.gnu.org, 
Subject: Re: bug#43389: 28.0.50; Emacs memory leaks
Date: Wed, 11 Nov 2020 22:15:21 +0100
> Thanks.  This trace doesn't show how many bytes were allocated, 
> does it?  Without that it is hard to judge whether these GnuTLS 
> calls could be the culprit.  Because the full trace shows other 
> calls to malloc, for example this: 

It doesn't show the size of the individual allocations, but it 
indirectly shows the size of the heap.  Each brk() line like this 
one is the start of an entry:

0.000000 brk(0x55f5ed93e000)       = 0x55f5ed93e000 

Where the first field is the time elapsed since the previous brk()
call, and the argument in parentheses is the requested new end of
the heap (the program break), not a size.  Subtracting the
argument of the previous call from the argument of the next one
shows how much the heap has been extended.  In this capture,
subtracting the first from the last shows that the heap grew by
8,683,520 bytes, and summing the relative timestamps shows that
this happened in 90.71 seconds.  It's growing at about 100KB/sec
at this point.
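
For anyone who wants to repeat that arithmetic on their own capture, here is a rough Python sketch.  It assumes the trace lines look like the brk() line quoted above (relative timestamp, argument, return value); the function name heap_growth and the second and third sample lines are made up for illustration and are not taken from the real trace.

```python
import re

# Matches relative-timestamp strace lines such as:
#   0.000000 brk(0x55f5ed93e000)       = 0x55f5ed93e000
# Lines that do not have a hexadecimal argument (e.g. brk(NULL)) are skipped.
BRK_LINE = re.compile(
    r"^\s*(\d+\.\d+)\s+brk\((0x[0-9a-fA-F]+)\)\s*=\s*(0x[0-9a-fA-F]+)"
)


def heap_growth(lines):
    """Return (bytes_grown, seconds_elapsed) across the matched brk() calls."""
    breaks = []
    elapsed = 0.0
    for line in lines:
        match = BRK_LINE.match(line)
        if match is None:
            continue
        delta_t, _requested, result = match.groups()
        # The first entry's timestamp is the time before the capture's
        # first brk(), so only later entries count toward the elapsed time.
        if breaks:
            elapsed += float(delta_t)
        # Use the returned program break, which is what the kernel granted.
        breaks.append(int(result, 16))
    if len(breaks) < 2:
        return 0, 0.0
    return breaks[-1] - breaks[0], elapsed


# Hypothetical sample: only the first line appears in the real trace.
sample = [
    "0.000000 brk(0x55f5ed93e000)       = 0x55f5ed93e000",
    "1.500000 brk(0x55f5ed95f000)       = 0x55f5ed95f000",
    "0.500000 brk(0x55f5ed980000)       = 0x55f5ed980000",
]

grown, seconds = heap_growth(sample)
print(f"heap grew by {grown} bytes in {seconds:.2f} s")
if seconds > 0:
    print(f"average rate: {grown / seconds:.0f} bytes/s")
```

Feeding it the full capture should give the same totals as the manual subtraction, provided every brk() line in the file has this shape.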

Also, keep in mind that this is brk().  There could have been any 
number of malloc() calls in between, zero or millions, but these 
are the ones that couldn't find any unused blocks and had to 
extend the heap.

> I'm not sure how Emacs could be the culprit here.  If GnuTLS is 
> the culprit (and as explained above, this is not certain at this 
> point), perhaps upgrading to a newer GnuTLS version or reporting 
> this to GnuTLS developers would allow some progress. 

I think you are right: GnuTLS was probably a symptom, not a cause.
I took a while to respond because I tried running emacs in 
Valgrind's Massif heap debugging tool, and it took forever.  Some 
results are in now, and it looks like GnuTLS wasn't present in the 
leak this time around.

First of all, if you aren't familiar with Massif (as I wasn't), it 
captures occasional snapshots of the whole heap and all
allocations, and lets you dump a tree-view of those allocations 
later with the "ms_print" tool.  The timestamps are fairly 
useless, as they are in "number of instructions executed."  Here 
are three files from my investigation:

The raw massif output:

http://trevorbentley.com/massif.out.3364630

The *full* tree output:

http://trevorbentley.com/ms_print.3364630.txt

The tree output showing only entries above 10% usage:

http://trevorbentley.com/ms_print.thresh10.3364630.txt

What you can see from the handy ASCII graph at the top is that 
memory usage was chugging along, growing upwards for a couple of 
days, and then spiked very quickly up to just over 4GB over a few 
hours.
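
The same spike can be located without the graph by reading the raw massif output directly.  The following is only a sketch: it assumes the file is the usual plain-text layout in which each snapshot has a "time=" line followed by a "mem_heap_B=" line, and the function names read_snapshots and largest_jump are made up here.

```python
def read_snapshots(lines):
    """Return a list of (time, heap_bytes) pairs, one per snapshot."""
    snapshots = []
    time = None
    for line in lines:
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue
        if key == "time":
            time = int(value)
        elif key == "mem_heap_B" and time is not None:
            snapshots.append((time, int(value)))
            time = None
    return snapshots


def largest_jump(snapshots):
    """Return (index, bytes) of the biggest heap increase between
    consecutive snapshots, or (None, 0) if the heap never grew."""
    best_index, best_delta = None, 0
    for i in range(1, len(snapshots)):
        delta = snapshots[i][1] - snapshots[i - 1][1]
        if delta > best_delta:
            best_index, best_delta = i, delta
    return best_index, best_delta


def report(path):
    """Print the largest single jump found in a massif output file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        snapshots = read_snapshots(handle)
    index, delta = largest_jump(snapshots)
    if index is None:
        print("heap never grew between snapshots")
    else:
        print(f"largest jump: {delta} bytes, ending at snapshot {index} "
              f"(time={snapshots[index][0]})")
```

Calling report("massif.out.3364630") on a downloaded copy of the raw file should point at the snapshots where the climb to roughly 4GB begins; the time value is in whatever unit the run recorded, which here is instructions executed.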

If you scroll down to the very last checkpoint (the 10% threshold 
file is better for this), you can see where most of the memory is 
used.  Very large sums of memory, but from different sources. 
1.7GB from lisp_align_malloc (nearly all from Fcons), 1.4GB from 
lmalloc (half from allocate_vector_block), 700MB from lrealloc 
(mostly from enlarge_buffer_text).

There were no large buffers open, but there were long-lived 
network sockets and plenty of timers.  I didn't check, but I'd say 
the largest buffer was up to a couple of megabytes, since 
emacs-slack logs fairly heavily.

I'm not sure what to make of this, really.  It seems like a 
general, sudden-onset, intense craving for more memory while not 
particularly doing much.  I could blindly suggest extreme memory 
fragmentation problems, but that doesn't seem very likely.

It's trivial to reproduce, but takes 3-5 days, so not exactly 
handy to debug.  Let me know if you have any requests for the next 
iteration before I kill it.  It's running in Valgrind again.

Thanks,

-Trevor




This bug report was last modified 4 years and 58 days ago.


