GNU bug report logs - #77924
31.0.50; [Feature branch] Change marker implementation

Package: emacs

Reported by: Gerd Möllmann <gerd.moellmann <at> gmail.com>

Date: Sat, 19 Apr 2025 16:06:02 UTC

Severity: normal

Found in version 31.0.50

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: gerd.moellmann <at> gmail.com, stefankangas <at> gmail.com, 77924 <at> debbugs.gnu.org
Subject: bug#77924: 31.0.50; [Feature branch] Change marker implementation
Date: Fri, 25 Apr 2025 00:19:42 -0400

BTW, I did a run of the complete benchmark suite (plus some extra
micro-benchmarks).  This compares the base of the branch to the tip of
the text-index branch:

  | test                     |   BASE (s) | gc (s) | TINDEX (s) | gc (s) | err |
  |--------------------------+------------+--------+------------+--------+-----|
  | bubble                   |       4.46 |   0.00 |       4.57 |   0.00 |  0% |
  | bubble-no-cons           |      15.04 |   0.00 |      14.71 |   0.00 |  0% |
  | bytecomp                 |       3.67 |   0.00 |       3.72 |   0.00 |  0% |
  | dhrystone                |      10.30 |   0.00 |      10.13 |   0.00 |  0% |
  | eieio                    |       4.06 |   0.00 |       4.10 |   0.00 |  0% |
  | fibn                     |       3.22 |   0.00 |       3.34 |   0.00 |  0% |
  | fibn-named-let           |       4.11 |   0.00 |       4.08 |   0.00 |  0% |
  | fibn-rec                 |       6.81 |   0.00 |       6.88 |   0.00 |  0% |
  | fibn-tc                  |       5.34 |   0.00 |       5.12 |   0.00 |  0% |
  | flet                     |      10.53 |   0.00 |      10.68 |   0.00 |  0% |
  | font-lock                |       1.30 |   0.00 |       1.29 |   0.00 |  0% |
  | inclist                  |      24.65 |   0.00 |      16.21 |   0.00 |  0% |
  | inclist-type-hints       |      24.56 |   0.00 |      16.23 |   0.00 |  0% |
  | listlen-tc               |       5.19 |   0.00 |       5.16 |   0.00 |  0% |
  | map-closure              |       9.86 |   0.00 |       9.26 |   0.00 |  0% |
  | markers-charbyte-1G/0*1  |      37.38 |   0.00 |       5.54 |   0.00 |  2% |
  | markers-charbyte-1G/0*10 |     283.68 |   0.00 |       7.25 |   0.00 |  1% |
  | markers-charbyte-1G/1*1  |       3.64 |   0.00 |       0.97 |   0.00 |  5% |
  | markers-charbyte-1G/1*10 |      26.11 |   0.00 |       2.69 |   0.00 |  2% |
  | markers-save-excursion-0 |      14.07 |   1.48 |      15.58 |   0.00 |  1% |
  | markers-save-excursion-3 |      12.97 |   1.34 |      14.99 |   0.00 |  0% |
  | nbody                    |       4.95 |   0.00 |       4.80 |   0.00 |  0% |
  | pack-unpack              |       0.85 |   0.00 |       0.84 |   0.00 |  0% |
  | pack-unpack-old          |       2.82 |   0.00 |       2.77 |   0.00 |  1% |
  | pcase                    |      10.37 |   0.00 |      10.35 |   0.00 |  0% |
  | pidigits                 |      10.43 |   0.00 |       9.09 |   0.00 |  2% |
  | regexp-cubic-200         |       1.48 |   0.00 |       1.48 |   0.00 |  1% |
  | regexp-cubic-400         |       2.31 |   0.00 |       2.32 |   0.00 |  1% |
  | regexp-exponential       |       3.68 |   0.00 |       3.69 |   0.00 |  0% |
  | regexp-gnumsg            |       0.92 |   0.00 |       0.92 |   0.00 |  0% |
  | regexp-linear-100        |       1.07 |   0.00 |       1.08 |   0.00 |  0% |
  | regexp-linear-1000       |       1.01 |   0.00 |       1.01 |   0.00 |  0% |
  | regexp-linear-5000       |       1.01 |   0.00 |       1.01 |   0.00 |  0% |
  | regexp-quadratic-1000    |       1.06 |   0.00 |       1.07 |   0.00 |  1% |
  | regexp-quadratic-5000    |       2.65 |   0.00 |       2.66 |   0.00 |  1% |
  | regexp-tuareg            |       0.46 |   0.00 |       0.44 |   0.00 |  0% |
  | scroll                   |       1.33 |   0.00 |       1.33 |   0.00 |  0% |
  | scroll-nonascii          |       2.68 |   0.00 |       2.60 |   0.00 |  0% |
  | search                   |      29.42 |   0.00 |      29.55 |   0.00 |  0% |
  | search/m50k              |      29.02 |   0.00 |      29.55 |   0.00 |  0% |
  | search/nolookup          |      24.84 |   0.00 |      24.99 |   0.00 |  0% |
  | search/p02               |      55.03 |   0.00 |      52.91 |   0.00 |  0% |
  | search/p32               |      31.22 |   0.00 |      31.12 |   0.00 |  0% |
  | smie                     |       2.98 |   0.00 |       3.03 |   0.00 |  0% |
  | smie-nonascii            |       5.82 |   0.00 |      15.96 |   0.00 |  0% |
  |--------------------------+------------+--------+------------+--------+-----|
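
To put numbers on the rows that move, the TINDEX/BASE ratio can be
recomputed from the table.  The following is only a reading aid in Python
(not part of the benchmark suite): the timings are copied from the rows
above, while the helper names `ratio` and `outliers` and the 10%
threshold are arbitrary choices made for this sketch.

```python
# (BASE seconds, TINDEX seconds), copied from the table above.
timings = {
    "markers-charbyte-1G/0*1": (37.38, 5.54),
    "markers-charbyte-1G/0*10": (283.68, 7.25),
    "markers-save-excursion-0": (14.07, 15.58),
    "inclist": (24.65, 16.21),
    "pidigits": (10.43, 9.09),
    "smie": (2.98, 3.03),
    "smie-nonascii": (5.82, 15.96),
}


def ratio(base, tindex):
    """TINDEX time relative to BASE; below 1.0 means TINDEX is faster."""
    return tindex / base


def outliers(table, threshold=0.10):
    """Rows whose ratio deviates from 1.0 by more than `threshold`."""
    return {
        name: ratio(base, tindex)
        for name, (base, tindex) in table.items()
        if abs(ratio(base, tindex) - 1.0) > threshold
    }


for name, r in outliers(timings).items():
    print(f"{name:26s} TINDEX/BASE = {r:.2f}")
```

With these inputs, `markers-charbyte-1G/0*10` comes out at roughly 0.03
(about 39x faster) and `smie-nonascii` at roughly 2.74 (about 2.7x
slower), while `smie` stays within the threshold and is not listed.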

Most of the benchmarks are essentially unaffected.  Comments:

- the `err` column shows the maximum of the std-dev for the BASE and the
  TINDEX (both were run 10 times).
  This is run on a fairly old server, because it's the machine with the
  least variability to which I have access (no DVFS effects, plenty of
  idle cores and RAM so as not to be too affected by other processes, ...).

- The `markers-charbyte-*` benchmarks are micro-benchmarks testing the
  charpos->bytepos conversion by making many calls of `char-after` to
  random positions in the buffer.  The fact that we go *much* faster there
  is just normal: this is the poor behavior the patch is intended to fix.
  IOW, not going much faster would be a big disappointment.

- The `markers-save-excursion-*` benchmarks are meant to test the
  performance of a "cheap" save-excursion.  Here, the BASE code is very
  efficient and my previous "sorted-array-of-markers-with-gap" had trouble
  staying competitive.  Here we see that the new branch is a bit slower
  but not by very much.  Also note that it GCs less, presumably because
  the marker objects are smaller (4 words vs 6 words) so GCs are a bit
  less frequent.

- `inclist` is significantly faster.  I have no clue what's up with that.
  This micro-benchmark presumably doesn't get anywhere near markers or
  charpos or bytepos.
  FWIW, I ran those benchmarks on a slightly different pair of builds (I
  honestly can't say what's different) and there was no difference in
  that other case.  AFAICT 16.2s is the "right" answer and the 24.5s
  shown here is a weird corner case that's better ignored.
  Welcome to the wonderful world of micro-benchmarks!

- `pidigits` is slightly faster, and I again have no clue why.
  Probably some unrelated effect.  In this case, my other pair of builds
  showed approximately the same difference, so it seems to be "a bit
  less of a fluke"?  In any case, I'd ignore this one as well.

- the `search*` benchmarks are designed to test the bytepos->charpos
  conversions done during regexp matching because of the lookups for the
  `syntax-table` text properties.  I wrote them back when we had
  a serious performance problem there, but nowadays the `master` branch
  handles this very well, so we see the branch is not able to get any
  benefit: the bytepos to convert is basically always right next to the
  last one, so we never need to consult anything like markers or the
  text-index to compute the charpos.

- `smie` is a test which re-indents `xmenu.c` (using the
  SMIE-based `sm-c-mode` rather than CC-mode).  Since `xmenu.c` is
  an ASCII file, the text-index can't be very useful so it's no wonder
  that we don't see any performance difference.

- Finally `smie-nonascii` is the same as `smie` except that it uses
  a file that's identical to `xmenu.c` but where all the letters of
  comments/strings/identifiers were replaced by non-ASCII ones.
  Here, we see that TINDEX is *much* slower than BASE.
  This doesn't seem to be a fluke: I see the exact same performance
  difference in my other pair of builds.

This last one is a serious problem that we need to address before we can
merge the branch.
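
For readers less familiar with the charpos->bytepos conversion that the
comments above keep coming back to, here is a toy Python model of the
general idea.  It is a sketch under loud assumptions: UTF-8 stands in for
Emacs's internal encoding, positions are 0-based, and `ToyTextIndex` with
its `stride` parameter is a hypothetical illustration, not the data
structure used on the branch.  It only shows why cached positions help
with non-ASCII text and why they cannot help with pure ASCII, where
bytepos and charpos coincide anyway.

```python
def utf8_len(lead_byte):
    """Length in bytes of the UTF-8 sequence starting with `lead_byte`."""
    if lead_byte < 0x80:
        return 1
    if lead_byte < 0xE0:
        return 2
    if lead_byte < 0xF0:
        return 3
    return 4


def naive_charpos_to_bytepos(text, charpos):
    """Reference answer: re-encode everything before `charpos` (O(n))."""
    return len(text[:charpos].encode("utf-8"))


class ToyTextIndex:
    """Hypothetical index caching the byte offset of every `stride`-th char."""

    def __init__(self, text, stride=4):
        self.data = text.encode("utf-8")
        self.stride = stride
        self.byte_at = []  # byte_at[k] is the byte offset of char k * stride
        bytepos = 0
        # Include the end-of-text position so it can be looked up too.
        for charpos in range(len(text) + 1):
            if charpos % stride == 0:
                self.byte_at.append(bytepos)
            if charpos < len(text):
                bytepos += len(text[charpos].encode("utf-8"))

    def charpos_to_bytepos(self, charpos):
        """Jump to the nearest cached position, then scan < stride chars."""
        k = charpos // self.stride
        bytepos = self.byte_at[k]
        for _ in range(charpos - k * self.stride):
            bytepos += utf8_len(self.data[bytepos])
        return bytepos


text = "aé漢😀b"
index = ToyTextIndex(text, stride=2)
print([index.charpos_to_bytepos(i) for i in range(len(text) + 1)])
```

The lookup cost is bounded by `stride` regardless of how far the position
is from the start of the text, which is the property the
`markers-charbyte-*` micro-benchmarks exercise with random positions.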

FWIW, I also ran those benchmarks on the same machine with Debian's
Emacs-30.1 and the results are basically identical to those of
BASE above.
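
As a footnote on the `err` column described in the first comment, this is
one way such a figure could be computed.  It is only a guess at the
method: the sketch assumes the std-dev is the sample standard deviation
taken relative to the mean (since the column is shown as a percentage),
and the function names are made up for the illustration.

```python
from statistics import mean, stdev


def rel_stddev(samples):
    """Sample standard deviation as a fraction of the mean."""
    return stdev(samples) / mean(samples)


def err_column(base_runs, tindex_runs):
    """Maximum of the relative std-dev of the two sets of runs."""
    return max(rel_stddev(base_runs), rel_stddev(tindex_runs))


# Hypothetical timings, invented purely to exercise the functions;
# they are NOT measurements from the benchmark run.
base_runs = [10.0, 10.0, 10.0, 10.0]
tindex_runs = [9.0, 11.0, 9.0, 11.0]
print(f"err = {err_column(base_runs, tindex_runs):.0%}")
```

Taking the maximum rather than the average means a row is flagged as
noisy as soon as either build shows variability.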


        Stefan





This bug report was last modified 106 days ago.