GNU bug report logs - #63225
Compiling regexp patterns (and REGEXP_CACHE_SIZE in search.c)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Tags: patch
Hello,
I am now studying the performance of the Org mode parser on huge Org files.
I noticed that `org-element-parse-buffer' spends a significant (~10%)
fraction of CPU time simply compiling regexp patterns.
This happens because the Org parser performs a huge number of repeated
regexp searches as it incrementally parses the buffer. The searches
happen on a fixed set of regexp patterns (several dozen).
I was able to get rid of the regex compilation-related slowdown simply
by increasing REGEXP_CACHE_SIZE 10x (see the attached patch).
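To illustrate why the cache size matters so much here, below is a toy model of a fixed-size, least-recently-used pattern cache. It is only a sketch, not the code in search.c: the names (`toy_cache`, `count_compiles`) are made up for this example, integer ids stand in for compiled patterns, and the figure of 40 patterns is an assumed stand-in for "several dozen". The capacities 20 and 200 correspond to the 10x increase to 200 proposed in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a fixed-size LRU pattern cache (hypothetical names;
   this is NOT the implementation in Emacs's search.c).
   Slot 0 always holds the most recently used pattern.  */
#define MAX_SLOTS 256

struct toy_cache {
    size_t capacity;            /* number of usable slots */
    size_t used;                /* slots currently filled */
    int pattern_id[MAX_SLOTS];  /* stand-in for a compiled pattern */
    unsigned long compiles;     /* cache misses, i.e. recompilations */
};

static void toy_cache_init(struct toy_cache *c, size_t capacity)
{
    assert(capacity > 0 && capacity <= MAX_SLOTS);
    c->capacity = capacity;
    c->used = 0;
    c->compiles = 0;
}

/* Look up pattern ID; on a miss, count a compilation and reuse the
   least recently used slot if the cache is already full.  */
static void toy_cache_lookup(struct toy_cache *c, int id)
{
    size_t i;
    for (i = 0; i < c->used; i++)
        if (c->pattern_id[i] == id)
            break;
    if (i == c->used) {
        c->compiles++;
        if (c->used < c->capacity)
            c->used++;
        i = c->used - 1;        /* fresh slot, or the LRU slot to evict */
    }
    /* Move the entry to the front by shifting slots [0, i) down by one.  */
    memmove(&c->pattern_id[1], &c->pattern_id[0],
            i * sizeof c->pattern_id[0]);
    c->pattern_id[0] = id;
}

/* Cycle ROUNDS times through N_PATTERNS distinct patterns and return
   how many compilations a cache of the given CAPACITY performs.  */
static unsigned long count_compiles(size_t capacity, int n_patterns,
                                    int rounds)
{
    struct toy_cache c;
    toy_cache_init(&c, capacity);
    for (int r = 0; r < rounds; r++)
        for (int p = 0; p < n_patterns; p++)
            toy_cache_lookup(&c, p);
    return c.compiles;
}
```

In this model, cycling 100 times through 40 patterns with 20 slots recompiles on every single search (4000 compilations), because each pattern is evicted before it is needed again. With 200 slots, each pattern is compiled exactly once (40 compilations). Real access patterns are less regular than a strict cycle, so the actual gain will differ.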
Does anyone know if there are potential side effects of this increase if
applied across Emacs? Or, alternatively, could Emacs provide a way to
store compiled regexp patterns from Elisp (similar to what
`treesit-query-compile' does)?
I suspect that storing pre-compiled patterns may benefit a number of
major modes that have to perform complex regexp matching.
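As a rough picture of what "storing a pre-compiled pattern" means, here is a minimal sketch using POSIX `regcomp`/`regexec`. This is an analogy only: Emacs uses its own regexp engine rather than POSIX regex, no such Elisp API is implied to exist, and the `compiled_pattern` type and `pattern_*` helpers are hypothetical names for this example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* A pattern that is compiled once and then reused by its owner,
   bypassing any shared cache (hypothetical helper type).  */
struct compiled_pattern {
    regex_t re;
    int ok;     /* nonzero once RE holds a successfully compiled pattern */
};

/* Compile SRC as a POSIX extended regexp.  Returns nonzero on success.  */
static int pattern_compile(struct compiled_pattern *p, const char *src)
{
    assert(p != NULL && src != NULL);
    p->ok = (regcomp(&p->re, src, REG_EXTENDED | REG_NOSUB) == 0);
    return p->ok;
}

/* Match TEXT against the stored pattern without recompiling it.  */
static int pattern_matches(const struct compiled_pattern *p,
                           const char *text)
{
    return p->ok && regexec(&p->re, text, 0, NULL, 0) == 0;
}

/* Release the compiled pattern; safe to call more than once.  */
static void pattern_free(struct compiled_pattern *p)
{
    if (p->ok) {
        regfree(&p->re);
        p->ok = 0;
    }
}
```

A parser would call `pattern_compile` once per pattern at startup (for example with `"^\\*+ "`, which matches a line starting with one or more stars and a space) and then call `pattern_matches` in its inner loop, so the compilation cost is paid once regardless of how many other patterns are in use.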
Best,
Ihor
In GNU Emacs 30.0.50 (build 4, x86_64-pc-linux-gnu, GTK+ Version
3.24.37, cairo version 1.17.8) of 2023-05-02 built on localhost
Repository revision: a0a71ca12d585bca5173775f08eabae553e15659
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version 11.0.12101008
System Description: Gentoo Linux
Configured using:
'configure --with-native-compilation'
[0001-src-search.c-REGEXP_CACHE_SIZE-Increase-to-200.patch (text/patch, attachment)]
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
This bug report was last modified 2 years and 38 days ago.