GNU bug report logs - #74547
31.0.50; igc: assertion failed in buffer.c

Package: emacs;

Reported by: Óscar Fuentes <oscarfv <at> telefonica.net>

Date: Tue, 26 Nov 2024 18:36:02 UTC

Severity: normal

Found in version 31.0.50

Done: Óscar Fuentes <oscarfv <at> telefonica.net>

Bug is archived. No further changes may be made.

From: Geza Herman <geza.herman <at> gmail.com>
To: Pip Cet <pipcet <at> protonmail.com>
Cc: Gerd Möllmann <gerd.moellmann <at> gmail.com>, 74547 <at> debbugs.gnu.org, Óscar Fuentes <oscarfv <at> telefonica.net>
Subject: bug#74547: 31.0.50; igc: assertion failed in buffer.c
Date: Wed, 4 Dec 2024 20:11:18 +0100

On 12/1/24 22:15, Pip Cet wrote:
> "Geza Herman" <geza.herman <at> gmail.com> writes:
>>     On 12/1/24 16:48, Pip Cet wrote:
>> Gerd Möllmann <gerd.moellmann <at> gmail.com> writes:
>>
>>     Back then, the future of the new GC was a question, so Gerd said
>>     (https://lists.gnu.org/archive/html/emacs-devel/2024-03/msg00544.html)
>>     that
>>     "Please don't take my GC efforts into consideration. That may succeed
>>     or not. But this is also a matter of good design, using the stack,
>>     (which BTW pdumper does, too), vs. bad design." That's why we went with
>>     the fastest implementation that doesn't use lisp vectors for storage.
>>     But we suspected that this JSON parser design will likely cause a
>>     problem with the new GC. So I think even if it turned out that the
>>     current problem was not caused by the parser, I still think that there
>>     should be something done about this JSON parser design to eliminate
>>     this potential problem. The lisp vector based approach was reverted
>>     because it added an extra pressure to the GC. For large JSON messages,
>>     it doesn't matter too much, but when the JSON is small, the extra GC
>>     time made the parser measurably slower. But, as far as I remember, that
>>     version hadn't have the small internal storage optimization yet. If we
>>     convert back to the vector based approach, the extra GC pressure will
>>     be smaller (compared to the original vector based approach without the
>>     internal storage), as for smaller sizes the vector won't be actually
>>     used.
>>     Géza
> Thank you for the summary, that makes sense. Is there a standard corpus
> of JSON documents that you use to benchmark the code? That would be very
> helpful, I think, since Eli correctly points out JSON parsing
> performance is critical.

I'm not aware of such a corpus. When I developed the new JSON parser,
the performance difference was so large that it was obvious the new
parser was faster. But I did run benchmarks on JSON generated by LSP
communication (maybe I can share this data set if there is interest, but
I need to anonymize it first), and I also ran a benchmark on all the
JSON files I found on my computer.

This time, though, the performance difference is expected to be
smaller: using Lisp vectors shouldn't have a very large effect on
performance. I'd check the performance with JSONs that are small, but
still large enough that the (non-internal) object_workspace actually
gets used (make sure to run a lot of iterations, so that the amortized
GC time is included in the result). For larger JSONs, we shouldn't see
a difference, as all the other allocations (which store the actual
result of the parsing) should hide the additional Lisp vector
allocation cost. At least, this is my theory.


> My gut feeling is that we should get rid of the object_workspace
> entirely, instead modifying the general Lisp code to avoid performance
> issues (and sacrifice some memory in the process, on most systems).
object_workspace is grown only once during the lifetime of one parse.
Once it has grown to the needed size, the only extra cost when parsing a
value is copying the data from the object_workspace to its final place.
A truncate-based solution does the same copy, but it also needs to grow
the hash table/array for each value, so it executes more allocations and
copies than the current solution. So I'd prefer that we keep
object_workspace. If the only solution is to convert it to a Lisp
vector, then I think we should do that. But again, this is just my
theory. If we try the truncate-based solution and it turns out not to
be significantly slower, then it can be a good solution as well.
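
To make the cost argument above concrete, here is a minimal sketch of a
workspace with small internal storage that grows on demand and then
copies each finished value to an exactly sized result. It is an
illustration only, not the code in Emacs's json.c: the names
(workspace, workspace_push, workspace_finish), the inline size of 8
slots, the doubling growth policy, and the use of plain ints in place
of Lisp values are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration: names and sizes are assumptions,
   not the definitions used in Emacs's json.c.  */
enum { INTERNAL_SLOTS = 8 };

typedef struct
{
  int internal[INTERNAL_SLOTS]; /* small inline storage, no allocation */
  int *slots;                   /* points at internal[] or a heap block */
  size_t used;                  /* number of slots currently in use */
  size_t capacity;              /* number of slots available in slots[] */
  size_t grow_count;            /* how many times the storage was grown */
} workspace;

static void
workspace_init (workspace *w)
{
  w->slots = w->internal;
  w->used = 0;
  w->capacity = INTERNAL_SLOTS;
  w->grow_count = 0;
}

/* Append V, growing the storage (by doubling, an assumption) only when
   it is full.  Returns 0 on success, -1 on allocation failure.  */
static int
workspace_push (workspace *w, int v)
{
  if (w->used == w->capacity)
    {
      size_t new_cap = w->capacity * 2;
      int *p;
      if (w->slots == w->internal)
        {
          /* First growth: move from inline storage to the heap.  */
          p = malloc (new_cap * sizeof *p);
          if (!p)
            return -1;
          memcpy (p, w->internal, w->used * sizeof *p);
        }
      else
        {
          p = realloc (w->slots, new_cap * sizeof *p);
          if (!p)
            return -1;
        }
      w->slots = p;
      w->capacity = new_cap;
      w->grow_count++;
    }
  w->slots[w->used++] = v;
  return 0;
}

/* Finish a nested value whose elements start at index FIRST: copy them
   into an exactly sized result and pop them off the workspace.  The
   workspace keeps its capacity, so later values reuse it.  */
static int *
workspace_finish (workspace *w, size_t first, size_t *n_out)
{
  assert (first <= w->used);
  size_t n = w->used - first;
  int *result = malloc ((n ? n : 1) * sizeof *result);
  if (!result)
    return NULL;
  memcpy (result, w->slots + first, n * sizeof *result);
  w->used = first;
  *n_out = n;
  return result;
}

static void
workspace_free (workspace *w)
{
  if (w->slots != w->internal)
    free (w->slots);
}
```

In this sketch, values that fit in the inline slots never touch the
heap, and once the workspace has reached the size a document needs,
finishing a value costs one exactly sized allocation plus one copy.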





This bug report was last modified 155 days ago.

