GNU bug report logs -
#57245
29.0.50; M-> in a large XML file (without long lines) is slow
On 17.08.2022 14:24, Eli Zaretskii wrote:
>> Date: Tue, 16 Aug 2022 22:32:23 +0300
>> Cc: 57245 <at> debbugs.gnu.org
>> From: Dmitry Gutov <dgutov <at> yandex.ru>
>>
>> On 16.08.2022 19:54, Eli Zaretskii wrote:
>>> Stefan, can you see why syntax-related stuff in sgml-mode is so heavy
>>> here?
>>
>> nxml-syntax-propertize might well be heavier than average, but the delay
>> scales linearly with the size of the file.
>
> Which is generally not a good scaling factor, especially if the
> coefficient is quite large (as it seems to be in this case).

Someone can work on the coefficient, but any accurate parser has to scan
the buffer from the beginning at least once.

Migration to tree-sitter might give us a better coefficient later, but
the principle will remain.
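To illustrate the point about scanning from the beginning at least once, here is a minimal sketch in Python. It is a toy model only, not how syntax-ppss or nxml-syntax-propertize are implemented: the class name CommentStateCache, the step parameter, and the restriction to XML comments are all assumptions made for the example. The syntactic state at a position depends on everything before it, so the first query near the end of the text costs a full scan, while later queries can resume from a cached checkpoint.

```python
from bisect import bisect_right

OPEN, CLOSE = "<!--", "-->"


class CommentStateCache:
    """Toy model: answers 'is POS inside an XML comment?' using cached checkpoints."""

    def __init__(self, text, step=1024):
        self.text = text
        self.step = step          # hypothetical checkpoint spacing, in characters
        self.positions = [0]      # checkpoint offsets, ascending
        self.states = [False]     # in-comment flag at each checkpoint
        self.chars_scanned = 0    # total characters scanned so far

    def in_comment(self, pos):
        pos = max(0, min(pos, len(self.text)))
        # Resume from the nearest checkpoint at or before POS.
        i = bisect_right(self.positions, pos) - 1
        start = cur = self.positions[i]
        state = self.states[i]
        text = self.text
        # A position in the middle of a delimiter gets the state that
        # follows the delimiter, because delimiters are consumed whole.
        while cur < pos:
            if not state and text.startswith(OPEN, cur):
                state, cur = True, cur + len(OPEN)
            elif state and text.startswith(CLOSE, cur):
                state, cur = False, cur + len(CLOSE)
            else:
                cur += 1
            # Record a checkpoint only beyond the last one, at a token boundary.
            if cur >= self.positions[-1] + self.step:
                self.positions.append(cur)
                self.states.append(state)
        self.chars_scanned += cur - start
        return state
```

Under these assumptions, the first jump to the end of a large text pays for the whole scan, which is the linear cost discussed above. A second query nearby scans fewer than step characters. The cache improves repeated queries, but it cannot remove the initial pass.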
>> Which seems to be exactly the behavior the "font-lock narrowing" was
>> supposed to guard from?
>
> No. It wasn't supposed to fix modes that foolishly scan the buffer
> from BOB to point.
You might want to choose words better.
> It was supposed to fix modes which scan from the
> beginning of line, and that is (a) only a problem when lines are very
> long, and (b) much harder to solve in the mode itself, because
> font-lock very frequently uses anchored regexps and otherwise likes to
> start from BOL, and syntax processing also likes starting from BOL.

syntax-wholeline-max handles that problem.

Though it might depend on what you mean by "anchored regexps".

> Btw, does nXML and/or sgml-mode use libxml for their analysis? If
> not, why not? wouldn't that be faster (and possibly more accurate)?

That might be "a simple matter of coding".

But we do need syntax-propertize to run, so that user commands can
rely on proper syntax information in the buffer. It remains to be seen
whether xml-parse-region is a good base for nxml-syntax-propertize, and
how much of a performance improvement it can bring (with all the string
marshaling around).

nXML also probably handles invalid documents better, which might or
might not be important.
This bug report was last modified 2 years and 304 days ago.