GNU bug report logs - #63040
30.0.50; Performance of buf_bytepos_to_charpos when a buffer has large number of markers

Package: emacs;

Reported by: Ihor Radchenko <yantar92 <at> posteo.net>

Date: Sun, 23 Apr 2023 19:40:01 UTC

Severity: normal

Found in version 30.0.50


Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Ihor Radchenko <yantar92 <at> posteo.net>
To: bug-gnu-emacs <at> gnu.org
Subject: 30.0.50; Performance of buf_bytepos_to_charpos when a buffer has
 large number of markers
Date: Sun, 23 Apr 2023 19:41:40 +0000
[Message part 1 (text/plain, inline)]
Hi,

When investigating `re-search-forward' performance in
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=58558 (bug#58558), I
noticed that buf_bytepos_to_charpos is taking most of the CPU time,
according to perf stats.

This was partially caused by `parse-sexp-lookup-properties', but even
after working around the text property issue, buf_bytepos_to_charpos
still shows up on top of the perf profile.

Since one of the apparent bottlenecks in buf_bytepos_to_charpos is

for (tail = BUF_MARKERS (b); tail; tail = tail->next)

which obviously scales with the number of markers in the buffer, I
decided to add a cut-off, as in the attached patch (the limit of 50 is
arbitrary).

Surprisingly, this simple change reduced my Org agenda generation times
from 20 seconds down to 3-4 seconds!

I am sure that my dumb approach is not the best way to improve the
performance, but this place in buf_bytepos_to_charpos is clearly
something that can be optimized.
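
To illustrate the idea outside of Emacs, here is a minimal, self-contained
sketch of a capped scan over a singly linked list of markers. It is not the
code from src/marker.c: `struct toy_marker`, `closest_marker_capped` and the
`limit` argument are hypothetical names made up for this example, and the
real function keeps track of a bracketing pair of known positions rather
than a single closest marker.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for an Emacs marker: a known
   (byte position, character position) pair in a singly linked list.  */
struct toy_marker
{
  ptrdiff_t bytepos;
  ptrdiff_t charpos;
  struct toy_marker *next;
};

/* Look at no more than LIMIT markers starting from HEAD and return the
   one whose byte position is closest to BYTEPOS.  Return NULL if the
   list is empty or LIMIT is not positive.  The cost is bounded by LIMIT
   instead of growing with the total number of markers.  */
struct toy_marker *
closest_marker_capped (struct toy_marker *head, ptrdiff_t bytepos, int limit)
{
  struct toy_marker *best = NULL;
  ptrdiff_t best_dist = 0;
  int seen = 0;

  for (struct toy_marker *tail = head; tail && seen < limit; tail = tail->next)
    {
      seen++;

      /* Absolute distance in bytes between this marker and the target.  */
      ptrdiff_t dist = (tail->bytepos > bytepos
                        ? tail->bytepos - bytepos
                        : bytepos - tail->bytepos);

      if (!best || dist < best_dist)
        {
          best = tail;
          best_dist = dist;
        }
    }

  return best;
}
```

The trade-off in this sketch is that a marker beyond the first LIMIT entries
is never considered, even if it is much closer to the target. The caller may
then have to cover a longer distance from the returned position, but the
scan itself no longer scales with the length of the marker list.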

[0001-src-marker.c-buf_bytepos_to_charpos-Limit-marker-sea.patch (text/x-patch, inline)]
From a6ff6268bdc42a7dfedc6729d4232a2ae149da56 Mon Sep 17 00:00:00 2001
Message-Id: <a6ff6268bdc42a7dfedc6729d4232a2ae149da56.1682278830.git.yantar92 <at> posteo.net>
From: Ihor Radchenko <yantar92 <at> posteo.net>
Date: Sun, 23 Apr 2023 21:31:46 +0200
Subject: [PATCH] * src/marker.c (buf_bytepos_to_charpos): Limit marker search

Limit searching across buffer markers to first 50 markers and thus
avoid performance scaling with the number of markers.

I got 5x `re-search-forward' speed improvement in my setup with this
dumb change.
---
 src/marker.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/marker.c b/src/marker.c
index e42c49a5434..008a76c49e6 100644
--- a/src/marker.c
+++ b/src/marker.c
@@ -348,8 +348,10 @@ buf_bytepos_to_charpos (struct buffer *b, ptrdiff_t bytepos)
   if (b == cached_buffer && BUF_MODIFF (b) == cached_modiff)
     CONSIDER (cached_bytepos, cached_charpos);
 
-  for (tail = BUF_MARKERS (b); tail; tail = tail->next)
+  int i = 0;
+  for (tail = BUF_MARKERS (b); tail && i < 50; tail = tail->next)
     {
+      i++;
       CONSIDER (tail->bytepos, tail->charpos);
 
       /* If we are down to a range of 50 chars,
-- 
2.40.0

[Message part 3 (text/plain, inline)]
In GNU Emacs 30.0.50 (build 2, x86_64-pc-linux-gnu, GTK+ Version
 3.24.37, cairo version 1.17.8) of 2023-04-23 built on localhost
Repository revision: ca875e3947e29d222554a05583068c49a56ed8ca
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12101008
System Description: Gentoo Linux

Configured using:
 'configure --with-native-compilation'


-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>

This bug report was last modified 353 days ago.
