#38104 - 27.0.50; elixir-mode fontification is very slow

GNU bug report logs - #38104
27.0.50; elixir-mode fontification is very slow

Package: emacs;

Reported by: Dmitry Gutov <dgutov <at> yandex.ru>

Date: Thu, 7 Nov 2019 15:41:02 UTC

Severity: normal

Found in version 27.0.50

Done: Dmitry Gutov <dgutov <at> yandex.ru>

Bug is archived. No further changes may be made.

Message #25 received at 38104-done <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 38104-done <at> debbugs.gnu.org
Subject: Re: bug#38104: 27.0.50; elixir-mode fontification is very slow
Date: Wed, 27 Nov 2019 23:58:46 +0200

Hi Mattias,

On 26.11.2019 21:32, Mattias Engdegård wrote:

> As it turned out, rx is fine (now); elixir-mode, not quite. In elixir-mode.el, we have
>
>   (identifiers . ,(rx (one-or-more (any "A-Z" "a-z" "_"))
>                       (zero-or-more (any "A-Z" "a-z" "0-9" "_"))
>                       (optional (or "?" "!"))))
>
> First, this regex is suboptimal: the first character of an identifier should occur exactly once, or you get bad backtracking behaviour. Just remove the one-or-more construct:
>
>   (identifiers . ,(rx (any "A-Z" "a-z" "_")
>                       (zero-or-more (any "A-Z" "a-z" "0-9" "_"))
>                       (optional (or "?" "!"))))
>
> This definition is then used in several places, but two in particular are of interest to us:
>
>   ;; Module attributes
>   (,(elixir-rx (and "@" (1+ identifiers)))
>
> The construct (1+ identifiers) was perhaps meant to match multiple identifiers, but it doesn't (no separator); it just matches an identifier in several ways, which again leads to bad backtracking behaviour.
> The same problem here:
>
>   ;; Map keys
>   (,(elixir-rx (group (and (one-or-more identifiers) ":")) space)
>
> Remove the 1+ and one-or-more and it's fast again.

That makes a lot of sense. I removed these one-or-more's and 1+ (and a few others), and it became fast again. I'll send a patch upstream. Thanks for your help!

(Looking at the tracker, they have a minor version of this change submitted already).

> Why did this "work" with the old rx implementation? Because that code had a nasty bug: it does not bracket definitions in rx-constituents properly. Example:
>
>   (let ((rx-constituents (cons '(hello . "HELLO") rx-constituents)))
>     (rx-to-string '(1+ hello) t))
>   => "HELLO+"
>
> The new rx implementation does not suffer from this bug.
>
> The result in your case is that the old rx, when translating (1+ identifiers), only tacked the "+" onto whatever regexp 'identifiers' produced, resulting in
>
>   "[A-Z_a-z]+[0-9A-Z_a-z]*[!?]?+"
>
> which is a lot faster, since only the final [!?] is repeated twice (and it probably doesn't match very often).

It's funny to think how someone probably beat the current code into submission by trial and error.
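To make the "matches an identifier in several ways" point concrete, here is a minimal sketch in Python rather than Emacs Lisp, so it does not exercise Emacs's regexp engine. It assumes a direct translation of the corrected `identifiers` regexp into Python's `re` syntax, and the helper name `count_splits` is made up for this illustration. It counts how many ways a string can be cut into consecutive identifiers, which is roughly the set of alternatives a backtracking matcher may have to explore for `(1+ identifiers)` when the overall match fails.

```python
import re
from functools import lru_cache

# Assumed Python translation of the corrected `identifiers` regexp from the
# message: one leading letter or underscore, then word characters, then an
# optional "?" or "!".
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*[?!]?")


def count_splits(text: str) -> int:
    """Count the ways `text` can be cut into one or more consecutive identifiers.

    Hypothetical helper for illustration only. Each split corresponds to one
    way a repeated group such as (1+ identifiers) could match the same text.
    """

    @lru_cache(maxsize=None)
    def ways(start: int) -> int:
        # Reaching the end of the text means one complete split was found.
        if start == len(text):
            return 1
        total = 0
        for end in range(start + 1, len(text) + 1):
            # Does text[start:end] form exactly one identifier?
            if IDENT.fullmatch(text, start, end):
                total += ways(end)
        return total

    return ways(0)


if __name__ == "__main__":
    for name in ("abc", "a1", "module_attribute"):
        print(name, count_splits(name))
```

For a name made only of letters and underscores, every position can start a new identifier, so the count doubles with each extra character: `count_splits("abc")` is 4 and `count_splits("module_attribute")` is 2 ** 15. Without the repetition around `identifiers` there is only one way to match each name, which fits the observation that removing the `1+` and `one-or-more` makes fontification fast again.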

This bug report was last modified 5 years and 235 days ago.

