#38104 - 27.0.50; elixir-mode fontification is very slow

GNU bug report logs - #38104
27.0.50; elixir-mode fontification is very slow

Package: emacs;

Reported by: Dmitry Gutov <dgutov <at> yandex.ru>

Date: Thu, 7 Nov 2019 15:41:02 UTC

Severity: normal

Found in version 27.0.50

Done: Dmitry Gutov <dgutov <at> yandex.ru>

Bug is archived. No further changes may be made.

Message #20 received at 38104 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: 38104 <at> debbugs.gnu.org
Subject: Re: bug#38104: 27.0.50; elixir-mode fontification is very slow
Date: Tue, 26 Nov 2019 20:32:29 +0100

On 26 Nov 2019, at 17:26, Dmitry Gutov <dgutov <at> yandex.ru> wrote:

> elixir-mode does use rx, heavily. Albeit with a thin wrapper.

As it turned out, rx is fine (now); elixir-mode, not quite.

In elixir-mode.el, we have

  (identifiers . ,(rx (one-or-more (any "A-Z" "a-z" "_"))
                      (zero-or-more (any "A-Z" "a-z" "0-9" "_"))
                      (optional (or "?" "!"))))

First, this regex is suboptimal: the first character of an identifier should occur exactly once, or you get bad backtracking behaviour. Just remove the one-or-more construct:

  (identifiers . ,(rx (any "A-Z" "a-z" "_")
                      (zero-or-more (any "A-Z" "a-z" "0-9" "_"))
                      (optional (or "?" "!"))))

This definition is then used in several places, but two in particular are of interest to us:

  ;; Module attributes
  (,(elixir-rx (and "@" (1+ identifiers)))

The construct (1+ identifiers) was perhaps meant to match multiple identifiers, but it doesn't (no separator); it just matches an identifier in several ways, which again leads to bad backtracking behaviour. The same problem here:

  ;; Map keys
  (,(elixir-rx (group (and (one-or-more identifiers) ":")) space)

Remove the 1+ and one-or-more and it's fast again.

Why did this "work" with the old rx implementation? Because that code had a nasty bug: it did not bracket definitions in rx-constituents properly. Example:

  (let ((rx-constituents (cons '(hello . "HELLO") rx-constituents)))
    (rx-to-string '(1+ hello) t))
  => "HELLO+"

The new rx implementation does not suffer from this bug.

The result in your case is that the old rx, when translating (1+ identifiers), only tacked the "+" onto whatever regexp 'identifiers' produced, resulting in

  "[A-Z_a-z]+[0-9A-Z_a-z]*[!?]?+"

which is a lot faster, since only the final [!?] is repeated twice (and it probably doesn't match very often).
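The backtracking described in this message can be observed directly. The following is a minimal sketch, not code from elixir-mode.el: the `demo-` names are hypothetical, and the two regexps are built with plain `rx-to-string` rather than the `elixir-rx` wrapper. It assumes an Emacs whose rx brackets the repeated form correctly (27 or later). The test input is a run of identifier characters with no ":" anywhere, so the map-key pattern must fail, and on failure the repeated variant may try many ways of splitting the run into identifiers. No timings are quoted here because they depend on the Emacs version and its regexp engine.

```elisp
(require 'rx)
(require 'benchmark)

;; Hypothetical names for illustration only; not taken from elixir-mode.el.
(defconst demo-identifier
  '(seq (any "A-Z" "a-z" "_")
        (zero-or-more (any "A-Z" "a-z" "0-9" "_"))
        (optional (or "?" "!")))
  "rx form for one identifier; the first character is matched exactly once.")

;; Map-key pattern with the redundant repetition around the identifier.
(defconst demo-key-slow
  (rx-to-string `(seq (group (one-or-more ,demo-identifier) ":") space) t))

;; Same pattern with the repetition removed, as the message suggests.
(defconst demo-key-fast
  (rx-to-string `(seq (group ,demo-identifier ":") space) t))

;; Time one failing match for each variant.  The input contains only
;; identifier characters and no ":", so neither regexp can match.
;; Returns a list (SLOW-SECONDS FAST-SECONDS).
(let ((input (make-string 18 ?a)))
  (list (car (benchmark-run 1 (string-match-p demo-key-slow input)))
        (car (benchmark-run 1 (string-match-p demo-key-fast input)))))
```

Both variants accept the same map keys (for example, each matches "foo: 1" at position 0); they differ only in how much work a failing match can cost. Increasing the length passed to `make-string` by a few characters at a time should show whether the slow variant's cost grows much faster than the fast one's on a given Emacs.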
