GNU bug report logs - #39595
M-x compile still very line-length weak

Package: emacs;

Reported by: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>

Date: Fri, 14 Feb 2020 02:47:02 UTC

Severity: minor

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 39595 in the body.
You can then email your comments to 39595 AT debbugs.gnu.org in the normal way.

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#39595; Package emacs. (Fri, 14 Feb 2020 02:47:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 14 Feb 2020 02:47:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
To: bug-gnu-emacs <at> gnu.org
Subject: M-x compile still very line-length weak
Date: Thu, 13 Feb 2020 13:51:57 +0800
Compare M-x compile on make aaa vs. make bbb
$ cat Makefile
aaa:; perl -we 'print " "  x 9999;' #finishes right away.
bbb:; perl -we 'print "\n" x 9999;' #takes several seconds, even on the latest hardware.

(Indeed, on even longer lines we even see both the words "exit" and "Compiling" at the same time in the modeline.)
emacs-version "26.3"




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39595; Package emacs. (Fri, 14 Feb 2020 11:19:02 GMT) Full text and rfc822 format available.

Message #8 received at 39595 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
Cc: 39595 <at> debbugs.gnu.org
Subject: Re: #39595: M-x compile still very line-length weak
Date: Fri, 14 Feb 2020 12:18:29 +0100
> aaa:; perl -we 'print " "  x 9999;' #finishes right away.
> bbb:; perl -we 'print "\n" x 9999;' #takes several seconds, even on the latest hardware. 

(The comments seem to have been swapped around, but we get the idea.)

This is not a rare edge case. Long lines are not uncommon in compilation output, and a sluggish M-x compile reflects badly on Emacs since it is a commonly used function.

The main culprit seems to be 'omake' -- try removing it from compilation-error-regexp-alist. There is still an annoying delay; further investigation is needed. (For instance, 'msft' occurs twice; this must be a mistake.)
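
A minimal sketch of that experiment, assuming the default setup in which compilation-error-regexp-alist is a list of rule names (symbols). It only affects the running session and is meant for testing, not as a recommended permanent setting:

```elisp
(require 'compile)

;; Drop the `omake' rule from the active rule list for this session.
;; `remq' returns a copy, so the original list is not modified in place.
(setq compilation-error-regexp-alist
      (remq 'omake compilation-error-regexp-alist))
```

Re-running M-x compile on the long-line Makefile targets afterwards should show whether 'omake' accounts for most of the delay.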





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39595; Package emacs. (Fri, 14 Feb 2020 16:28:02 GMT) Full text and rfc822 format available.

Message #11 received at 39595 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>,
 Eli Zaretskii <eliz <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Paul Pogonyshev <pogonyshev <at> gmail.com>
Cc: 39595 <at> debbugs.gnu.org
Subject: Re: #39595: M-x compile still very line-length weak
Date: Fri, 14 Feb 2020 17:27:39 +0100
Dan, in your example you used a long line of spaces. Presumably that is representative of your particular use, but different message parsers are sensitive to different kinds of long lines:

* 'omake' in compilation-error-regexp-alist is indeed what makes Emacs unusably slow with long lines of spaces.

* 'msft' and 'watcom' are both expensive with long lines of spaces, but not as bad as 'omake'. Maybe these regexps can be tuned further.

* 'msft' occurs twice by mistake; the last one should be removed. This helps a bit.

* 'maven' is still expensive for long lines of non-spaces; see bug#3441. Anchoring the match at line-start would fix it:

(rx bol
    (? "["
       (or "ERROR" (group "WARNING") (group "INFO"))
       "]"
       (+ " "))
    (group
     (not (in "\n "))
     (* (or (not (in "\n :"))
            (: " "
               (not (in "\n/-")))
            (: ":"
               (not (in "\n ["))))))
    ":["
    (group (+ digit))
    ","
    (group (+ digit))
    "] ")

Is that correct? (CC:ing Paul Pogonyshev, who worked on that regexp in bug#20556.)
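
To illustrate why anchoring at line start matters, here is a rough timing sketch. The helper name my/search-seconds and the pattern are illustrative assumptions: the pattern is a simplified stand-in, not the actual maven rule. The expectation is that the unanchored search retries the match at every position of the long line, while the anchored one fails quickly everywhere except at line start; actual timings depend on the Emacs version and machine.

```elisp
(require 'benchmark)

(defun my/search-seconds (regexp text)
  "Return the seconds taken by one search for REGEXP in TEXT."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-TIME); keep ELAPSED.
    (car (benchmark-run 1 (re-search-forward regexp nil t)))))

(let* ((line (make-string 5000 ?a))   ; one long line without spaces or colons
       ;; Simplified stand-in for a "FILE:[LINE,COL] " style rule.
       (body "[^\n :]+:\\[[0-9]+,[0-9]+\\] ")
       (unanchored (my/search-seconds body line))
       (anchored (my/search-seconds (concat "^" body) line)))
  (message "unanchored: %.4fs  anchored: %.4fs" unanchored anchored))
```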

I suggest we disable omake by default --- although a nice tool, it was never widely used, and OCaml programmers tend to use Dune (or plain Make) these days. The omake rule will still be there for those who need it, but the majority shouldn't bear the cost.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39595; Package emacs. (Fri, 14 Feb 2020 17:02:02 GMT) Full text and rfc822 format available.

Message #14 received at 39595 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 39595 <at> debbugs.gnu.org, pogonyshev <at> gmail.com, monnier <at> iro.umontreal.ca,
 jidanni <at> jidanni.org
Subject: Re: #39595: M-x compile still very line-length weak
Date: Fri, 14 Feb 2020 19:00:56 +0200
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Fri, 14 Feb 2020 17:27:39 +0100
> Cc: 39595 <at> debbugs.gnu.org
> 
> I suggest we disable omake by default --- although a nice tool, it was never widely used, and OCaml programmers tend to use Dune (or plain Make) these days. The omake rule will still be there for those who need it, but the majority shouldn't bear the cost.

Is there some forum where the relevant people could be asked about
this?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39595; Package emacs. (Fri, 14 Feb 2020 22:48:02 GMT) Full text and rfc822 format available.

Message #17 received at 39595 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 39595 <at> debbugs.gnu.org, pogonyshev <at> gmail.com, monnier <at> iro.umontreal.ca,
 jidanni <at> jidanni.org
Subject: Re: #39595: M-x compile still very line-length weak
Date: Fri, 14 Feb 2020 23:47:43 +0100
14 feb. 2020 kl. 18.00 skrev Eli Zaretskii <eliz <at> gnu.org>:

> Is there some forum where the relevant people could be asked about
> this?

Not sure where to go for that. The problem is really in Emacs's hacky implementation: when 'omake' is included in compilation-error-regexp-alist, many other regexps are rewritten in a way that makes them potentially slower. This is why it's not an ideal feature to have enabled by default.

Attached are two patches: one that anchors the regexp for Maven, and one that speeds up 'msft' and 'watcom' by eliminating the same repetition-after-repetition flaw in each (not much different from those found by the latest relint/xr scan posted to emacs-devel).

[0001-Speed-up-maven-compilation-error-message-regexp.patch (application/octet-stream, attachment)]
[0002-Speed-up-msft-and-watcom-compilation-error-regexps.patch (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39595; Package emacs. (Sat, 15 Feb 2020 01:30:02 GMT) Full text and rfc822 format available.

Message #20 received at 39595 <at> debbugs.gnu.org (full text, mbox):

From: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 39595 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Paul Pogonyshev <pogonyshev <at> gmail.com>
Subject: Re: #39595: M-x compile still very line-length weak
Date: Sat, 15 Feb 2020 09:28:49 +0800
(Yeah I got my comments backwards.)
Anyway I recall perl is fast on regexps, newlines or not.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39595; Package emacs. (Sat, 15 Feb 2020 07:36:02 GMT) Full text and rfc822 format available.

Message #23 received at 39595 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 39595 <at> debbugs.gnu.org, pogonyshev <at> gmail.com, monnier <at> iro.umontreal.ca,
 jidanni <at> jidanni.org
Subject: Re: #39595: M-x compile still very line-length weak
Date: Sat, 15 Feb 2020 09:35:42 +0200
> From: Mattias Engdegård <mattiase <at> acm.org>
> Date: Fri, 14 Feb 2020 23:47:43 +0100
> Cc: jidanni <at> jidanni.org, monnier <at> iro.umontreal.ca, pogonyshev <at> gmail.com,
>         39595 <at> debbugs.gnu.org
> 
> > Is there some forum where the relevant people could be asked about
> > this?
> 
> Not sure where to go for that. The problem is really in Emacs's hacky implementation: when 'omake' is included in compilation-error-regexp-alist, many other regexps are rewritten in a way that makes them potentially slower. This is why it's not an ideal feature to have enabled by default.

I'm okay with disabling 'omake' if we have nowhere else to ask.

Thanks.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39595; Package emacs. (Sat, 15 Feb 2020 13:58:02 GMT) Full text and rfc822 format available.

Message #26 received at 39595 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
Cc: 39595 <at> debbugs.gnu.org,
 Mattias Engdegård <mattiase <at> acm.org>,
 Eli Zaretskii <eliz <at> gnu.org>, Paul Pogonyshev <pogonyshev <at> gmail.com>
Subject: Re: #39595: M-x compile still very line-length weak
Date: Sat, 15 Feb 2020 08:57:19 -0500
> Anyway I recall perl is fast on regexps, newlines or not.

That's just a reputation.
In reality, maybe its constant is lower than that of Emacs's regexp
matcher, and maybe it implements a few more optimisations, but it
suffers from the same explosion as Emacs's regexp matcher with regexps
like the one under discussion (i.e. when Emacs's regexps are slow,
it's because of the nasty complexity introduced by backtracking, and
Perl's regexps do backtracking more or less as much as Emacs's).


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39595; Package emacs. (Sat, 15 Feb 2020 16:47:01 GMT) Full text and rfc822 format available.

Message #29 received at 39595 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 39595 <at> debbugs.gnu.org, pogonyshev <at> gmail.com, monnier <at> iro.umontreal.ca,
 jidanni <at> jidanni.org
Subject: Re: #39595: M-x compile still very line-length weak
Date: Sat, 15 Feb 2020 17:45:54 +0100
15 feb. 2020 kl. 08.35 skrev Eli Zaretskii <eliz <at> gnu.org>:

> I'm okay with disabling 'omake' if we have nowhere else to ask.

We may not have to, after all.  Reading the OMake sources, it very much looks like errors are indented by exactly 6 spaces, which means that we can replace (* " ") with (? "      ") (an optional run of exactly six spaces), which is a lot faster.

Having done that, it turned out that recognising ruby-Test::Unit errors depended on the old 'omake' regexp rewriting (another reason to disable omake by default, perhaps), so that regexp had to be fixed as well.

Along with the two previous patches (for msft, watcom and maven), this should reduce the cost of long lines to something more tolerable for the time being.
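
The change from an unbounded run of spaces to an optional six-space prefix can be sketched as follows. This is an illustration only, not the code in compile.el: the function name my/allow-indentation and the FILE:LINE: rule are hypothetical.

```elisp
(defun my/allow-indentation (regexp exact)
  "Return REGEXP (which must start with \"^\") rewritten to accept indentation.
If EXACT is non-nil, accept either no indentation or exactly six spaces;
otherwise accept any number of leading spaces."
  (concat (if exact
              ;; Bounded: nothing, or exactly six spaces.
              (concat "^\\(?:" (make-string 6 ?\s) "\\)?")
            ;; Unbounded: any run of spaces.
            "^ *")
          ;; Drop the original leading "^" before appending.
          (substring regexp 1)))

(let ((rule "^\\([^ \n:]+\\):\\([0-9]+\\):")   ; hypothetical FILE:LINE: rule
      (line (concat (make-string 6 ?\s) "foo.ml:12: syntax error")))
  ;; Both rewritten forms accept the six-space-indented message.
  (list (string-match (my/allow-indentation rule nil) line)
        (string-match (my/allow-indentation rule t) line)))
```

The bounded form gives the matcher at most two prefixes to try per line, whereas the unbounded form has to walk through an arbitrarily long run of spaces before it can fail.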

[0001-Make-OMake-support-slightly-less-expensive-bug-39595.patch (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39595; Package emacs. (Sun, 16 Feb 2020 12:16:02 GMT) Full text and rfc822 format available.

Message #32 received at 39595 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
Cc: 39595 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Paul Pogonyshev <pogonyshev <at> gmail.com>
Subject: Re: #39595: M-x compile still very line-length weak
Date: Sun, 16 Feb 2020 13:15:09 +0100
To wrap it up, here are the three patches (intended to be used together). The Maven patch was tweaked further for efficiency.

Dan, is this satisfactory?

[0001-Speed-up-maven-compilation-error-message-regexp.patch (application/octet-stream, attachment)]
[0002-Speed-up-msft-and-watcom-compilation-error-regexps.patch (application/octet-stream, attachment)]
[0003-Make-OMake-support-slightly-less-expensive-bug-39595.patch (application/octet-stream, attachment)]


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#39595; Package emacs. (Sun, 16 Feb 2020 15:38:01 GMT) Full text and rfc822 format available.

Message #35 received at 39595 <at> debbugs.gnu.org (full text, mbox):

From: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 39595 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Paul Pogonyshev <pogonyshev <at> gmail.com>
Subject: Re: #39595: M-x compile still very line-length weak
Date: Sun, 16 Feb 2020 23:37:43 +0800
>>>>> "ME" == Mattias Engdegård <mattiase <at> acm.org> writes:

ME> Dan, is this satisfactory?

I bet it does!
(All I know is I just use Debian sid. So in two years...)




Reply sent to Mattias Engdegård <mattiase <at> acm.org>:
You have taken responsibility. (Mon, 17 Feb 2020 11:08:02 GMT) Full text and rfc822 format available.

Notification sent to 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>:
bug acknowledged by developer. (Mon, 17 Feb 2020 11:08:02 GMT) Full text and rfc822 format available.

Message #40 received at 39595-done <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
Cc: 39595-done <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Paul Pogonyshev <pogonyshev <at> gmail.com>
Subject: Re: #39595: M-x compile still very line-length weak
Date: Mon, 17 Feb 2020 12:07:37 +0100
16 feb. 2020 kl. 16.37 skrev 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>:

> I bet it does!
> (All I know is I just use Debian sid. So in two years...)

Very well, pushed to emacs-27.

For future work, there seem to be more opportunities for speeding up the remaining regexps. In particular:

* Try to anchor matches at bol when possible.
* Avoid infinite repetitions (of spaces, etc) when the exact amount is known.
* Reject impossible matches as early as possible.
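
As a starting point for that kind of work, a rough sketch for ranking the stock rules by how long each takes on a given sample line. The function name my/rank-compilation-rules is hypothetical, and the sketch times each raw regexp in isolation, so it ignores the extra rewriting applied when 'omake' is enabled and any interaction between rules:

```elisp
(require 'compile)
(require 'benchmark)

(defun my/rank-compilation-rules (line)
  "Return an alist of (RULE . SECONDS) for searching LINE, slowest first."
  (let (result)
    (with-temp-buffer
      (insert line "\n")
      (dolist (entry compilation-error-regexp-alist-alist)
        ;; Each entry has the form (SYMBOL REGEXP FILE LINE ...).
        (let ((regexp (cadr entry)))
          (when (stringp regexp)
            (goto-char (point-min))
            (push (cons (car entry)
                        (car (benchmark-run 1
                               (re-search-forward regexp nil t))))
                  result)))))
    (sort result (lambda (a b) (> (cdr a) (cdr b))))))

;; Example: the long line of spaces from the original report.
;; (my/rank-compilation-rules (make-string 9999 ?\s))
```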





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 16 Mar 2020 11:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 5 years and 99 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.