Package: emacs
Reported by: Gerd Möllmann <gerd.moellmann <at> gmail.com>
Date: Sat, 19 Apr 2025 16:06:02 UTC
Severity: normal
Found in version 31.0.50
Message #215 received at 77924 <at> debbugs.gnu.org:

From: Gerd Möllmann <gerd.moellmann <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Eli Zaretskii <eliz <at> gnu.org>, stefankangas <at> gmail.com, 77924 <at> debbugs.gnu.org
Subject: Re: bug#77924: 31.0.50; [Feature branch] Change marker implementation
Date: Fri, 25 Apr 2025 12:18:29 +0200

Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> BTW, I did a run of the complete benchmark suite (plus some extra
> micro-benchmarks).  This compares the base of the branch to the tip of
> the text-index branch:
>
> | test                     |   BASE (s) | gc (s) | TINDEX (s) | gc (s) | err |
> |--------------------------+------------+--------+------------+--------+-----|
> | bubble                   |       4.46 |   0.00 |       4.57 |   0.00 |  0% |
> | bubble-no-cons           |      15.04 |   0.00 |      14.71 |   0.00 |  0% |
> | bytecomp                 |       3.67 |   0.00 |       3.72 |   0.00 |  0% |
> | dhrystone                |      10.30 |   0.00 |      10.13 |   0.00 |  0% |
> | eieio                    |       4.06 |   0.00 |       4.10 |   0.00 |  0% |
> | fibn                     |       3.22 |   0.00 |       3.34 |   0.00 |  0% |
> | fibn-named-let           |       4.11 |   0.00 |       4.08 |   0.00 |  0% |
> | fibn-rec                 |       6.81 |   0.00 |       6.88 |   0.00 |  0% |
> | fibn-tc                  |       5.34 |   0.00 |       5.12 |   0.00 |  0% |
> | flet                     |      10.53 |   0.00 |      10.68 |   0.00 |  0% |
> | font-lock                |       1.30 |   0.00 |       1.29 |   0.00 |  0% |
> | inclist                  |      24.65 |   0.00 |      16.21 |   0.00 |  0% |
> | inclist-type-hints       |      24.56 |   0.00 |      16.23 |   0.00 |  0% |
> | listlen-tc               |       5.19 |   0.00 |       5.16 |   0.00 |  0% |
> | map-closure              |       9.86 |   0.00 |       9.26 |   0.00 |  0% |
> | markers-charbyte-1G/0*1  |      37.38 |   0.00 |       5.54 |   0.00 |  2% |
> | markers-charbyte-1G/0*10 |     283.68 |   0.00 |       7.25 |   0.00 |  1% |
> | markers-charbyte-1G/1*1  |       3.64 |   0.00 |       0.97 |   0.00 |  5% |
> | markers-charbyte-1G/1*10 |      26.11 |   0.00 |       2.69 |   0.00 |  2% |
> | markers-save-excursion-0 |      14.07 |   1.48 |      15.58 |   0.00 |  1% |
> | markers-save-excursion-3 |      12.97 |   1.34 |      14.99 |   0.00 |  0% |
> | nbody                    |       4.95 |   0.00 |       4.80 |   0.00 |  0% |
> | pack-unpack              |       0.85 |   0.00 |       0.84 |   0.00 |  0% |
> | pack-unpack-old          |       2.82 |   0.00 |       2.77 |   0.00 |  1% |
> | pcase                    |      10.37 |   0.00 |      10.35 |   0.00 |  0% |
> | pidigits                 |      10.43 |   0.00 |       9.09 |   0.00 |  2% |
> | regexp-cubic-200         |       1.48 |   0.00 |       1.48 |   0.00 |  1% |
> | regexp-cubic-400         |       2.31 |   0.00 |       2.32 |   0.00 |  1% |
> | regexp-exponential       |       3.68 |   0.00 |       3.69 |   0.00 |  0% |
> | regexp-gnumsg            |       0.92 |   0.00 |       0.92 |   0.00 |  0% |
> | regexp-linear-100        |       1.07 |   0.00 |       1.08 |   0.00 |  0% |
> | regexp-linear-1000       |       1.01 |   0.00 |       1.01 |   0.00 |  0% |
> | regexp-linear-5000       |       1.01 |   0.00 |       1.01 |   0.00 |  0% |
> | regexp-quadratic-1000    |       1.06 |   0.00 |       1.07 |   0.00 |  1% |
> | regexp-quadratic-5000    |       2.65 |   0.00 |       2.66 |   0.00 |  1% |
> | regexp-tuareg            |       0.46 |   0.00 |       0.44 |   0.00 |  0% |
> | scroll                   |       1.33 |   0.00 |       1.33 |   0.00 |  0% |
> | scroll-nonascii          |       2.68 |   0.00 |       2.60 |   0.00 |  0% |
> | search                   |      29.42 |   0.00 |      29.55 |   0.00 |  0% |
> | search/m50k              |      29.02 |   0.00 |      29.55 |   0.00 |  0% |
> | search/nolookup          |      24.84 |   0.00 |      24.99 |   0.00 |  0% |
> | search/p02               |      55.03 |   0.00 |      52.91 |   0.00 |  0% |
> | search/p32               |      31.22 |   0.00 |      31.12 |   0.00 |  0% |
> | smie                     |       2.98 |   0.00 |       3.03 |   0.00 |  0% |
> | smie-nonascii            |       5.82 |   0.00 |      15.96 |   0.00 |  0% |
> |--------------------------+------------+--------+------------+--------+-----|
>
> Most of the benchmarks are mostly unaffected.  Comments:
>
> - the `err` column shows the maximum of the std-dev for the BASE and the
>   TINDEX (both were run 10 times).
>   This is run on a fairly old server, because it's the machine with
>   least variability to which I have access (no DVFS effects, plenty of
>   idle cores and RAM not to be too affected by other processes, ...).
>
> - The `markers-charbyte-*` benchmarks are micro-benchmarks testing the
>   charpos->bytepos conversion by making many calls of `char-after` to
>   random positions in the buffer.  The fact that we go *much* faster there
>   is just normal: this is the poor behavior the patch is intended to fix.
>   IOW, not going much faster would be a big disappointment.
>
> - The `markers-save-excursion-*` benchmarks are meant to test the
>   performance of a "cheap" save-excursion.  Here, the BASE code is very
>   efficient and my previous "sorted-array-of-markers-with-gap" had trouble
>   staying competitive.  Here we see that the new branch is a bit slower
>   but not by very much.  Also note that it GCs less, presumably because
>   the marker objects are smaller (4 words vs 6 words) so GCs are a bit
>   less frequent.
>
> - `inclist` is significantly faster.  I have no clue what's up with that.
>   This micro-benchmark presumably doesn't get anywhere near markers or
>   charpos or bytepos.
>   FWIW, I ran those benchmarks on a slightly different pair of builds (I
>   honestly can't say what's different) and there was no difference in
>   that other case.  AFAICT 16.2s is the "right" answer and the 24.5s
>   shown here is a weird corner case that's better ignored.
>   Welcome to the wonderful world of micro-benchmarks!
>
> - `pidigits` is slightly faster, and I again have no clue why.
>   Probably some unrelated effect.  In this case, my other pair of builds
>   showed approximately the same difference, so it seems to be "a bit
>   less of a fluke"?  In any case, I'd ignore this one as well.
>
> - the `search*` benchmarks are designed to test the bytepos->charpos
>   conversions done during regexp matching because of the lookups for the
>   `syntax-table` text properties.  I wrote them back when we had
>   a serious performance problem there, but nowadays the `master` branch
>   handles this very well, so we see the branch is not able to get any
>   benefit: the bytepos to convert is basically always right next to the
>   last one, so we never need to consult anything like markers or the
>   text-index to compute the charpos.
>
> - `smie` is a test which re-indents `xmenu.c` (using the
>   SMIE-based `sm-c-mode` rather than CC-mode).  Since `xmenu.c` is
>   an ASCII file, the text-index can't be very useful so it's no wonder
>   that we don't see any performance difference.
>
> - Finally `smie-nonascii` is the same as `smie` except that it uses
>   a file that's identical to `xmenu.c` but where all the letters of
>   comments/strings/identifiers were replaced by non-ASCII ones.
>   Here, we see that TINDEX is *much* slower than BASE.
>   This doesn't seem to be a fluke: I see the exact same performance
>   difference in my other pair of builds.
>
> This last one is a serious problem that we need to address before we can
> merge the branch.
>
> FWIW, I also ran those benchmarks on the same machine with Debian's
> Emacs-30.1 and the results are basically identical to those of
> BASE above.

I've got the new benchmarks from elpa.git, and ran the scroll and smie
benchmarks on my otherwise idle Mac mini M1, GUI Emacs.

Scroll:

* master

Results

| test            | non-gc (s) | gc (s) | gcs | total (s) | err |
|-----------------+------------+--------+-----+-----------+-----|
| scroll          |       1.35 |   0.34 |  22 |      1.69 |  0% |
| scroll-nonascii |       3.24 |   0.77 |  50 |      4.01 |  0% |
|-----------------+------------+--------+-----+-----------+-----|
| total           |       4.59 |   1.11 |  72 |      5.70 |  0% |

* text-index

Results

| test            | non-gc (s) | gc (s) | gcs | total (s) | err |
|-----------------+------------+--------+-----+-----------+-----|
| scroll          |       1.34 |   0.33 |  22 |      1.68 |  0% |
| scroll-nonascii |       3.16 |   0.75 |  49 |      3.91 |  0% |
|-----------------+------------+--------+-----+-----------+-----|
| total           |       4.51 |   1.08 |  71 |      5.59 |  0% |

My summary, together with what I got with tamil.txt yesterday:
indistinguishable, if not a tad better with text-index.

Smie:

* master

Results

| test          | non-gc (s) | gc (s) | gcs | total (s) | err |
|---------------+------------+--------+-----+-----------+-----|
| font-lock     |       0.52 |   0.32 |  24 |      0.84 |  0% |
| smie          |       1.19 |   0.50 |  36 |      1.69 |  0% |
| smie-nonascii |       2.13 |   0.53 |  39 |      2.66 |  0% |
|---------------+------------+--------+-----+-----------+-----|
| total         |       3.83 |   1.35 |  99 |      5.19 |  0% |

* text-index

Results

| test          | non-gc (s) | gc (s) | gcs | total (s) | err |
|---------------+------------+--------+-----+-----------+-----|
| font-lock     |       0.51 |   0.31 |  23 |      0.82 |  0% |
| smie          |       1.18 |   0.45 |  33 |      1.63 |  0% |
| smie-nonascii |       5.49 |   0.48 |  35 |      5.98 |  0% |
|---------------+------------+--------+-----+-----------+-----|
| total         |       7.19 |   1.24 |  92 |      8.43 |  0% |

That also shows the difference in smie-nonascii, i.e. in a C file
containing multi-byte characters (one should suffice to make
Z != Z_BYTE, and the index is used).
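
For readers unfamiliar with the charpos->bytepos problem discussed above,
here is a minimal sketch of the general idea of a sparse position index
over UTF-8 text.  It is not the code of the text-index branch: STRIDE,
build_index and charpos_to_bytepos are hypothetical names, offsets are
0-based (Emacs positions are 1-based), and the text is assumed to be
valid UTF-8.  The index records the byte offset of every STRIDE-th
character, so a lookup scans at most STRIDE characters instead of the
whole buffer.  In pure-ASCII text charpos equals bytepos and no index is
needed, which is the Z == Z_BYTE case mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stride: one index entry every STRIDE characters.  */
enum { STRIDE = 4 };

/* Nonzero if byte B starts a UTF-8 character, i.e. is not a
   continuation byte of the form 10xxxxxx.  */
static int
char_head_p (unsigned char b)
{
  return (b & 0xC0) != 0x80;
}

/* Starting at character FROM_CHAR located at byte FROM_BYTE, walk
   forward to character CHARPOS and return its byte offset.
   Returns NBYTES if CHARPOS is at or beyond the end of the text.  */
static size_t
scan_forward (const char *text, size_t nbytes,
              size_t from_char, size_t from_byte, size_t charpos)
{
  size_t c = from_char, b = from_byte;
  while (c < charpos && b < nbytes)
    {
      b++;                      /* Step over the head byte...  */
      while (b < nbytes && !char_head_p ((unsigned char) text[b]))
        b++;                    /* ...and its continuation bytes.  */
      c++;
    }
  return b;
}

/* Fill INDEX with the byte offsets of characters 0, STRIDE,
   2*STRIDE, ...; return the number of entries written.  */
static size_t
build_index (const char *text, size_t nbytes,
             size_t *index, size_t max_entries)
{
  size_t n = 0, c = 0;
  for (size_t b = 0; b < nbytes; b++)
    {
      if (!char_head_p ((unsigned char) text[b]))
        continue;
      if (c % STRIDE == 0 && n < max_entries)
        index[n++] = b;
      c++;
    }
  return n;
}

/* Convert CHARPOS to a byte offset: jump to the nearest index entry
   at or before CHARPOS, then scan the remaining characters.  */
static size_t
charpos_to_bytepos (const char *text, size_t nbytes,
                    const size_t *index, size_t n_entries,
                    size_t charpos)
{
  assert (n_entries > 0);       /* Non-empty text has an entry for char 0.  */
  size_t k = charpos / STRIDE;
  if (k >= n_entries)
    k = n_entries - 1;
  return scan_forward (text, nbytes, k * STRIDE, index[k], charpos);
}
```

The choice of STRIDE trades memory for lookup time; a real
implementation also has to keep the index valid across insertions and
deletions, which this sketch does not attempt.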