GNU bug report logs - #18109
24.4.50; `compilation-error-regexp-alist-alist': wrong regexp for Maven


Package: emacs;

Reported by: Filipp Gunbin <fgunbin <at> fastmail.fm>

Date: Fri, 25 Jul 2014 20:41:02 UTC

Severity: normal

Tags: moreinfo

Found in version 24.4.50

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 18109 in the body.
You can then email your comments to 18109 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Fri, 25 Jul 2014 20:41:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Filipp Gunbin <fgunbin <at> fastmail.fm>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 25 Jul 2014 20:41:04 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Filipp Gunbin <fgunbin <at> fastmail.fm>
To: bug-gnu-emacs <at> gnu.org
Subject: 24.4.50;
 `compilation-error-regexp-alist-alist': wrong regexp for Maven
Date: Fri, 25 Jul 2014 21:33:18 +0400
Below is the corrected (I hope) version of the regexp.

=== modified file 'lisp/progmodes/compile.el'
--- lisp/progmodes/compile.el	2014-05-29 03:45:29 +0000
+++ lisp/progmodes/compile.el	2014-07-22 18:33:53 +0000
@@ -211,12 +211,9 @@
     (jikes-file
      "^\\(?:Found\\|Issued\\) .* compiling \"\\(.+\\)\":$" 1 nil nil 0)
 
-
-    ;; This used to be pathologically slow on long lines (Bug#3441),
-    ;; due to matching filenames via \\(.*?\\).  This might be faster.
     (maven
      ;; Maven is a popular free software build tool for Java.
-     "\\([^ \n]\\(?:[^\n :]\\| [^-/\n]\\|:[^ \n]\\)*?\\):\\[\\([0-9]+\\),\\([0-9]+\\)\\] " 1 2 3)
+      "\\(?:\\[ERROR\\]\\s-+\\)?\\([^[\n]+\\):\\[\\([[:digit:]]+\\),\\([[:digit:]]+\\)\\]" 1 2 3)
 
     (jikes-line
      "^ *\\([0-9]+\\)\\.[ \t]+.*\n +\\(<-*>\n\\*\\*\\* \\(?:Error\\|Warnin\\(g\\)\\)\\)"


-- 
    Filipp




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Sat, 26 Jul 2014 07:23:01 GMT) Full text and rfc822 format available.

Message #8 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Glenn Morris <rgm <at> gnu.org>
To: Filipp Gunbin <fgunbin <at> fastmail.fm>
Cc: 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50;
 `compilation-error-regexp-alist-alist': wrong regexp for Maven
Date: Sat, 26 Jul 2014 03:22:24 -0400
Please explain why it is wrong, and show an example of what it is
supposed to match. The current one in etc/compilation.txt is:

* maven 2.0.9

symbol: maven

FooBar.java:[111,53] no interface expected here





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Mon, 28 Jul 2014 12:31:01 GMT) Full text and rfc822 format available.

Message #11 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Filipp Gunbin <fgunbin <at> fastmail.fm>
To: Glenn Morris <rgm <at> gnu.org>
Cc: Filipp Gunbin <fgunbin <at> fastmail.fm>, 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50;
 `compilation-error-regexp-alist-alist': wrong regexp for Maven
Date: Mon, 28 Jul 2014 16:30:06 +0400
Glenn,

On 26/07/2014 03:22 -0400, Glenn Morris wrote:

> Please explain why it is wrong, and show an example of what it is
> supposed to match. The current one in etc/compilation.txt is:
>
> * maven 2.0.9
>
> symbol: maven
>
> FooBar.java:[111,53] no interface expected here

Oh yes, sorry for the bad report.

While the original regexp catches these errors:

D:\cygwin\root\my-project\src\main\java\my\project\controllers\MainController.java:[27,12] error: cannot find symbol
[ERROR]   symbol:   class SomeService
  location: class MainController

it does not catch these, which Maven emits sometimes:

[ERROR] D:\cygwin\root\my-project\src\main\java\my\project\controllers\MainController.java:[21,1] error: cannot find symbol
[ERROR]   symbol: class Controller

My version catches both.

Also, the original regexp seems to be more complicated than it really
should be.

Tested on Maven 2.2.1 and 3.0.4.
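
To try the two sample lines above against the patched pattern outside
Emacs, here is a minimal sketch in Python.  It is a hand transliteration
of the regexp from the patch into Python `re` syntax (`\s-+` becomes
`\s+`, `[[:digit:]]` becomes `[0-9]`), so it only approximates what
compile.el does.  The names PROPOSED_MAVEN, parse, PATH, plain and
tagged are made up for illustration.

```python
import re

# Hand transliteration of the patched Emacs regexp into Python `re` syntax.
# This is an approximation for experimenting, not the code in compile.el.
PROPOSED_MAVEN = re.compile(
    r"(?:\[ERROR\]\s+)?"   # optional "[ERROR]" tag followed by whitespace
    r"([^\[\n]+)"          # group 1: file name (no "[" and no newline)
    r":\[([0-9]+),"        # group 2: line
    r"([0-9]+)\]"          # group 3: column
)

def parse(line):
    """Return (file, line, column) for a Maven-style message, or None."""
    m = PROPOSED_MAVEN.search(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), int(m.group(3))

# The Windows path from the two sample messages in this report.
PATH = r"D:\cygwin\root\my-project\src\main\java\my\project\controllers\MainController.java"

plain = PATH + ":[27,12] error: cannot find symbol"
tagged = "[ERROR] " + PATH + ":[21,1] error: cannot find symbol"

print(parse(plain))    # untagged message
print(parse(tagged))   # message with the [ERROR] tag
print(parse("[ERROR]   symbol: class Controller"))  # continuation line
```

Both message lines should yield the same file name with their own line
and column, and the "symbol:" continuation line should yield None
because it contains no ":[line,column]" part.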

-- 
    Filipp




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Sun, 03 Aug 2014 15:13:02 GMT) Full text and rfc822 format available.

Message #14 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Colascione <dancol <at> dancol.org>
To: Filipp Gunbin <fgunbin <at> fastmail.fm>, Glenn Morris <rgm <at> gnu.org>
Cc: 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong
 regexp for Maven
Date: Sun, 03 Aug 2014 08:12:23 -0700
On 07/28/2014 05:30 AM, Filipp Gunbin wrote:
> Glenn,
> 
> On 26/07/2014 03:22 -0400, Glenn Morris wrote:
> 
>> Please explain why it is wrong, and show an example of what it is
>> supposed to match. The current one in etc/compilation.txt is:
>>
>> * maven 2.0.9
>>
>> symbol: maven
>>
>> FooBar.java:[111,53] no interface expected here
> 
> Oh yes, sorry for the bad report.
> 
> While the original regexp catches these errors:
> 
> D:\cygwin\root\my-project\src\main\java\my\project\controllers\MainController.java:[27,12] error: cannot find symbol
> [ERROR]   symbol:   class SomeService
>   location: class MainController
> 
> it does not catch these, which Maven emits sometimes:
> 
> [ERROR] D:\cygwin\root\my-project\src\main\java\my\project\controllers\MainController.java:[21,1] error: cannot find symbol
> [ERROR]   symbol: class Controller
> 
> My version catches both.
> 
> Also, the original regexp seems to be more complicated than it really
> should be.
> 
> Tested on Maven 2.2.1 and 3.0.4.
> 

Would you please consider converting this regexp to rx form?



Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Wed, 09 Sep 2020 11:17:01 GMT) Full text and rfc822 format available.

Message #17 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Filipp Gunbin <fgunbin <at> fastmail.fm>
Cc: Glenn Morris <rgm <at> gnu.org>, 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist':
 wrong regexp for Maven
Date: Wed, 09 Sep 2020 13:16:18 +0200
Filipp Gunbin <fgunbin <at> fastmail.fm> writes:

> it does not catch these, which Maven emits sometimes:
>
> [ERROR] D:\cygwin\root\my-project\src\main\java\my\project\controllers\MainController.java:[21,1] error: cannot find symbol
> [ERROR]   symbol: class Controller
>
> My version catches both.

There wasn't much of a follow-up on this afterwards (six years ago), but
this regexp was rewritten to use rx this year:

    (maven
     ;; Maven is a popular free software build tool for Java.
     ,(rx bol
          ;; It is unclear whether the initial [type] tag is always present.
          (? "["
             (or "ERROR" (group-n 1 "WARNING") (group-n 2 "INFO"))
             "] ")
          (group-n 3                    ; File
                   (not (any "\n ["))
                   (* (or (not (any "\n :"))
                          (: " " (not (any "\n/-")))
                          (: ":" (not (any "\n ["))))))
          ":["
          (group-n 4 (+ digit))         ; Line
          ","
          (group-n 5 (+ digit))         ; Column
          "] ")
     3 4 5 (1 . 2))

Looking at the new version, it does seem more similar to the proposed
patch than it was before the rewrite.

So does this work satisfactorily now (i.e., in Emacs 28)?
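
For anyone who wants to check the rx form above against sample output
without starting Emacs, here is a rough sketch in Python.  The pattern
is a hand transliteration into Python `re` syntax and only an
approximation of the Emacs behaviour.  The trailing "3 4 5 (1 . 2)" is
read here as: group 3 is the file, 4 the line, 5 the column, and a match
of group 1 or group 2 marks the message as a warning or an info message
respectively.  The names MAVEN_RX, classify and SAMPLE are invented, and
the [WARNING] line in the sample is hypothetical, not real Maven output.

```python
import re

# Hand transliteration of the rx form into Python `re` syntax; an
# approximation for experimenting, not the code in compile.el.
MAVEN_RX = re.compile(
    # Optional [type] tag, e.g. "[ERROR] ".
    r"^(?:\[(?:ERROR|(?P<warning>WARNING)|(?P<info>INFO))\] )?"
    # File name.
    r"(?P<file>[^\n \[](?:[^\n :]| [^\n/-]|:[^\n \[])*)"
    # ":[line,column] " with a trailing space.
    r":\[(?P<line>[0-9]+),(?P<col>[0-9]+)\] ",
    re.MULTILINE,
)

def classify(text):
    """Return a list of (kind, file, line, column) for matches in text."""
    results = []
    for m in MAVEN_RX.finditer(text):
        if m.group("warning"):
            kind = "warning"
        elif m.group("info"):
            kind = "info"
        else:
            kind = "error"
        results.append(
            (kind, m.group("file"), int(m.group("line")), int(m.group("col")))
        )
    return results

PATH = r"D:\cygwin\root\my-project\src\main\java\my\project\controllers\MainController.java"

SAMPLE = "\n".join([
    PATH + ":[27,12] error: cannot find symbol",
    "[ERROR]   symbol:   class SomeService",
    "[ERROR] " + PATH + ":[21,1] error: cannot find symbol",
    "[WARNING] /src/Foo.java:[3,7] hypothetical warning text",
])

for entry in classify(SAMPLE):
    print(entry)
```

In this transliteration the tagged and the untagged message are both
picked up with the bare file name, and the "symbol:" continuation line
is skipped.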

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) moreinfo. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Wed, 09 Sep 2020 11:17:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Thu, 03 Dec 2020 15:00:02 GMT) Full text and rfc822 format available.

Message #22 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Filipp Gunbin <fgunbin <at> fastmail.fm>
Cc: Glenn Morris <rgm <at> gnu.org>, 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist':
 wrong regexp for Maven
Date: Thu, 03 Dec 2020 15:59:19 +0100
Lars Ingebrigtsen <larsi <at> gnus.org> writes:

> Looking at the new version, it does seem more similar to the proposed
> patch than it was before the rewrite.
>
> So does this work satisfactorily now (i.e., in Emacs 28)?

More information was requested, but no response was given within a few
months, so I'm closing this bug report.  If the problem still exists,
please respond to this email and we'll reopen the bug report.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




bug closed, send any further explanations to 18109 <at> debbugs.gnu.org and Filipp Gunbin <fgunbin <at> fastmail.fm> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Thu, 03 Dec 2020 15:00:03 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Fri, 04 Dec 2020 18:13:01 GMT) Full text and rfc822 format available.

Message #27 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Filipp Gunbin <fgunbin <at> fastmail.fm>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: Glenn Morris <rgm <at> gnu.org>, 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist':
 wrong regexp for Maven
Date: Fri, 04 Dec 2020 21:11:56 +0300
On 03/12/2020 15:59 +0100, Lars Ingebrigtsen wrote:

> Lars Ingebrigtsen <larsi <at> gnus.org> writes:
>
>> Looking at the new version, it does seem more similar to the proposed
>> patch than it was before the rewrite.
>>
>> So does this work satisfactorily now (i.e., in Emacs 28)?
>
> More information was requested, but no response was given within a few
> months, so I'm closing this bug report.  If the problem still exists,
> please respond to this email and we'll reopen the bug report.

Thanks for looking at this, the regexp now seems to catch both my
examples from years ago.

Filipp




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Fri, 04 Dec 2020 19:23:02 GMT) Full text and rfc822 format available.

Message #30 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Filipp Gunbin <fgunbin <at> fastmail.fm>, Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': 
 wrong regexp for Maven
Date: Fri, 4 Dec 2020 20:22:41 +0100
> Thanks for looking at this, the regexp now seems to catch both my examples from years ago. 

We weren't sure whether messages always were prefixed by [ERROR] etc or could occur without such tags. The Maven documentation and source tree didn't help much, but perhaps I was looking in the wrong places. Could you help resolve the issue?





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Sat, 05 Dec 2020 22:23:01 GMT) Full text and rfc822 format available.

Message #33 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Filipp Gunbin <fgunbin <at> fastmail.fm>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist':
 wrong regexp for Maven
Date: Sun, 06 Dec 2020 01:21:46 +0300
On 04/12/2020 20:22 +0100, Mattias Engdegård wrote:

>> Thanks for looking at this, the regexp now seems to catch both my examples from years ago. 
>
> We weren't sure whether messages always were prefixed by [ERROR] etc
> or could occur without such tags. The Maven documentation and source
> tree didn't help much, but perhaps I was looking in the wrong
> places. Could you help resolve the issue?

Now the regexp seems to catch both prefixed and non-prefixed messages,
what else should be resolved here?

Filipp




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Sun, 06 Dec 2020 09:33:02 GMT) Full text and rfc822 format available.

Message #36 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Filipp Gunbin <fgunbin <at> fastmail.fm>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong
 regexp for Maven
Date: Sun, 6 Dec 2020 10:32:27 +0100
5 dec. 2020 kl. 23.21 skrev Filipp Gunbin <fgunbin <at> fastmail.fm>:

> Now the regexp seems to catch both prefixed and non-prefixed messages,
> what else should be resolved here?

Ah, yes. Apart from the examples in compilation.txt, I could not find any evidence for non-prefixed messages ever being emitted. Since I'm not a Maven user myself, it would be useful to know if the non-prefixed example was just an oversight or an actual occurrence.

It is not a major problem, but I like to do the job properly. Having patterns that match more than their strict minimum can be troublesome for two reasons: a regexp may accidentally catch a message intended for another pattern, and it may slow down message matching. Both have been issues several times in the past, which is why I'm wary of having too-loose regexps in the list.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Sun, 06 Dec 2020 14:23:01 GMT) Full text and rfc822 format available.

Message #39 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Filipp Gunbin <fgunbin <at> fastmail.fm>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist':
 wrong regexp for Maven
Date: Sun, 06 Dec 2020 17:22:21 +0300
On 06/12/2020 10:32 +0100, Mattias Engdegård wrote:

> 5 dec. 2020 kl. 23.21 skrev Filipp Gunbin <fgunbin <at> fastmail.fm>:
>
>> Now the regexp seems to catch both prefixed and non-prefixed messages,
>> what else should be resolved here?
>
> Ah, yes. Apart from the examples in compilation.txt, I could not find
> any evidence for non-prefixed messages ever being emitted. Since I'm
> not a Maven user myself, it would be useful to know if the
> non-prefixed example was just an oversight or an actual occurrence.
>
> It is not a major problem, but I like to do the job properly. Having
> patterns that match more than their strict minimum can be troublesome
> for two reasons: a regexp may accidentally catch a message intended
> for another pattern, and it may slow down message matching. Both have
> been issues several times in the past, which is why I'm wary of having
> too-loose regexps in the list.

Hm, I rarely use Maven these days (many projects switched to Gradle),
and I'm not on Windows any more, so I cannot reproduce the original
problem now.  If you think a non-prefixed message is very improbable,
just make the regexp stricter, and let's see whether someone reports
it again.

Filipp




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Sun, 06 Dec 2020 15:06:02 GMT) Full text and rfc822 format available.

Message #42 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Filipp Gunbin <fgunbin <at> fastmail.fm>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong
 regexp for Maven
Date: Sun, 6 Dec 2020 16:05:20 +0100
6 dec. 2020 kl. 15.22 skrev Filipp Gunbin <fgunbin <at> fastmail.fm>:

> Hm, I rarely use Maven these days (many projects switched to Gradle),
> and I'm not on Windows any more, so I cannot reproduce the original
> problem now.  If you think a non-prefixed message is very improbable,
> just make the regexp stricter, and let's see whether someone reports
> it again.

Thank you, maybe we should indeed do that.

It is good to have someone who knows Gradle! That pattern could use some work as well. It currently is (in rx form):

(rx bol
   (| (group "w") nonl)
   ":"
   (* " ")     ; ??
   (group
    (? (in "A-Za-z") ":")
    (+ (not (in "\n:"))))
   ":"
   (* " ")     ; ??
   "("
   (group (+ (in "0-9")))
   ","
   (* " ")     ; ??
   (group (+ (in "0-9")))
   ")")

but the examples (from compilation.txt) look like:

e: /src/Test.kt: (34, 15): foo: bar
w: /src/Test.kt: (34, 15): foo: bar

Thus it looks like we can expect exactly one space each after the first and second colon and after the comma, instead of zero-or-more spaces (the '??' comments above). As a Gradle user, can you confirm this?

The way the pattern is written makes it prone to matching other messages entirely or partly, with potential negative consequences for correctness, performance or both.
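
The difference between the two readings can be tried out with a small
Python sketch.  LOOSE is a hand transliteration of the rx form above
into Python `re` syntax.  STRICT is an assumed variant that simply
replaces each zero-or-more-spaces with exactly one space; it is not
taken from any committed patch.  The input "x:foo:(1,2)" is a made-up
line standing in for unrelated tool output.

```python
import re

# Transliteration of the current rx form: zero or more spaces allowed.
LOOSE = re.compile(
    r"^(?:(w)|.): *((?:[A-Za-z]:)?[^\n:]+): *\(([0-9]+), *([0-9]+)\)",
    re.MULTILINE,
)

# Assumed stricter variant: exactly one space in each of the three places.
STRICT = re.compile(
    r"^(?:(w)|.): ((?:[A-Za-z]:)?[^\n:]+): \(([0-9]+), ([0-9]+)\)",
    re.MULTILINE,
)

# The two examples from compilation.txt quoted above.
error_line = "e: /src/Test.kt: (34, 15): foo: bar"
warning_line = "w: /src/Test.kt: (34, 15): foo: bar"

# Hypothetical output from some other tool, without any spaces.
stray_line = "x:foo:(1,2)"

for name, pattern in (("loose", LOOSE), ("strict", STRICT)):
    for line in (error_line, warning_line, stray_line):
        m = pattern.match(line)
        print(name, repr(line), m.groups() if m else None)
```

Both patterns accept the two documented examples with identical groups,
while only the loose one also accepts the made-up space-free line.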





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Sun, 06 Dec 2020 15:26:02 GMT) Full text and rfc822 format available.

Message #45 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Filipp Gunbin <fgunbin <at> fastmail.fm>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong
 regexp for Maven
Date: Sun, 6 Dec 2020 16:25:27 +0100
> Thus it looks like we can expect exactly one space each after the first and second colon and after the comma, instead of zero-or-more spaces

Looking at https://github.com/JetBrains/kotlin/commit/ffe8ae3840d7b9bdc82170c8181031f05ced68bd, it looks likely; here is a proposed patch.

[0001-Stricter-gradle-kotlin-message-pattern.patch (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Mon, 07 Dec 2020 10:42:01 GMT) Full text and rfc822 format available.

Message #48 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Filipp Gunbin <fgunbin <at> fastmail.fm>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist':
 wrong regexp for Maven
Date: Mon, 07 Dec 2020 13:41:09 +0300
On 06/12/2020 16:05 +0100, Mattias Engdegård wrote:

> Thus it looks like we can expect exactly one space each after the first and second colon and after the comma, instead of zero-or-more spaces (the '??' comments above). As a Gradle user, can you confirm this?
>
> The way the pattern is written makes it prone to matching other messages entirely or partly, with potential negative consequences for correctness, performance or both.

I was the one who put those quantifiers there, and I don't object to
making the regexps stricter.

But, we just need to be aware that Java tools usually don't expect the
output to be parsed.  Like, an IDE uses Gradle's API to run it, and
Gradle uses compiler API to compile - this way none of them have to
parse anything.  So they output something that can be parsed, yes, but
the format could change at any time.  That is why I'm more inclined to
making regexps more _lax_, not the other way around (and fix the
problems with them once they appear).

Filipp




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Mon, 07 Dec 2020 13:50:02 GMT) Full text and rfc822 format available.

Message #51 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Filipp Gunbin <fgunbin <at> fastmail.fm>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong
 regexp for Maven
Date: Mon, 7 Dec 2020 14:49:40 +0100
7 dec. 2020 kl. 11.41 skrev Filipp Gunbin <fgunbin <at> fastmail.fm>:

> I was the one who put those quantifiers there, and I don't object to
> making the regexps stricter.

It would be unfair to blame you for that! After all, that's how most of the other patterns were written, and for logical reasons: it seems intuitive and sensible to make the rules as loose as possible in case the format changes or there is otherwise a variation in the output. If the observed messages contain a single space in one place then standard practice has been to tolerate any number of spaces there, maybe even zero.

However, experience tells us that this intuition is wrong. Output formats do in fact tend to remain unchanged: Emacs and other editors, IDEs and other code are parsing them, and they are not all equally tolerant or in the same way. There is thus a self-reinforcing effect: the tool keeps output stable because we expect it to. (When output formats do change, it tends to be for good reasons and regexp tolerance is then rarely useful.)

> But, we just need to be aware that Java tools usually don't expect the
> output to be parsed.

Yes they do! The very composition of something like the gradle-kotlin output

e: FILENAME: (LINE, COL): MESSAGE

is so strict and formalised that it was definitely made with machine-readability in mind.

>  That is why I'm more inclined to
> making regexps more _lax_, not the other way around (and fix the
> problems with them once they appear).

As we have found out the hard way, the cost of lax patterns is insidious and diffuse until the mess really has to be sorted out -- and by then it's hard to get hold of the various people involved who have long since disappeared or forgotten all about what they wrote years ago. Patterns are added independently of one another but interact in unexpected ways.

Thus, better to keep patterns strict, and only alter them when and if tool output changes; it is then clear exactly what needs to be done and why. For most rules this never becomes necessary.





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Mon, 07 Dec 2020 20:08:02 GMT) Full text and rfc822 format available.

Message #54 received at 18109 <at> debbugs.gnu.org (full text, mbox):

From: Filipp Gunbin <fgunbin <at> fastmail.fm>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 18109 <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist':
 wrong regexp for Maven
Date: Mon, 07 Dec 2020 23:07:44 +0300
On 07/12/2020 14:49 +0100, Mattias Engdegård wrote:

> However, experience tells us that this intuition is wrong. Output
> formats do in fact tend to remain unchanged: Emacs and other editors,
> IDEs and other code are parsing them, and they are not all equally
> tolerant or in the same way. There is thus a self-reinforcing effect:
> the tool keeps output stable because we expect it to. (When output
> formats do change, it tends to be for good reasons and regexp
> tolerance is then rarely useful.)

I would be very happy if this were true (I'm not saying the opposite
is true, but I have a feeling that few in the Java world care about how
the error parses in Emacs).

>> But, we just need to be aware that Java tools usually don't expect the
>> output to be parsed.
>
> Yes they do! The very composition of something like the gradle-kotlin output
>
> e: FILENAME: (LINE, COL): MESSAGE
>
> is so strict and formalised that it was definitely made with
> machine-readability in mind.

I doubt that any modern-or-so Java IDE will parse any error messages,
given that build tools and compilers have APIs.  At the level of build
tools, I can speak only for Gradle, and (to the best of my knowledge) it
doesn't, whether it is invoking compilers or other tools such as
checkstyle plugins.

>>  That is why I'm more inclined to
>> making regexps more _lax_, not the other way around (and fix the
>> problems with them once they appear).
>
> As we have found out the hard way, the cost of lax patterns is
> insidious and diffuse until the mess really has to be sorted out --
> and by then it's hard to get hold of the various people involved who
> have long since disappeared or forgotten all about what they wrote years
> ago. Patterns are added independently of one another but interact in
> unexpected ways.
>
> Thus, better to keep patterns strict, and only alter them when and if
> tool output changes; it is then clear exactly what needs to be done
> and why. For most rules this never becomes necessary.

Just wondering - did we really have that many problems caused by bad
performance of compilation regexps?  Because if we did, then maybe we
should look at other approaches, like trying to detect the compiler
used, and narrow the set of regexps based on it.  It's natural to expect
that many different people would edit these regexps when something
doesn't work for them, and expecting that you will always come and fix
the things up would not be very fair to you :-)

Filipp




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Wed, 09 Dec 2020 18:43:02 GMT) Full text and rfc822 format available.

Message #57 received at 18109-done <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Filipp Gunbin <fgunbin <at> fastmail.fm>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 18109-done <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist': wrong
 regexp for Maven
Date: Wed, 9 Dec 2020 19:41:27 +0100
7 dec. 2020 kl. 21.07 skrev Filipp Gunbin <fgunbin <at> fastmail.fm>:

> [...] I have a feeling that few in Java world care about how the
> error parses in Emacs).

Most likely. On the other hand, lack of interest in the output format can also imply that it's unlikely to change.

> I doubt that any modern-or-so Java IDE will parse any error messages,
> given that build tools and compilers have APIs.

Quite possible, but the very emission of formalised messages to stdout/stderr means that this mode of usage is still acknowledged as somewhat common and useful.

> - did we really have that many problems caused by bad
> performance of compilation regexps?  Because if we did, then maybe we
> should look at other approaches, like trying to detect the compiler
> used, and narrow the set of regexps based on it.

This is hard to do in any practical way, not least because a single message buffer may consist of the combined output of dozens of different tools -- compilers, linters, build tools, spell checkers, testing, stack traces, packaging, and so on. Not to mention the practical difficulty of going from the string 'make' to 'GCC version 11.2'.

That things work reasonably anyway is very much thanks to the prevalence of a few fairly common formats, such as GNU (file:line: message).

>  It's natural to expect
> that many different people would edit these regexps when something
> doesn't work for them, and expecting that you will always come and fix
> the things up would not be very fair to you :-)

Very considerate, thank you! There seems to be a fairly good flow of reports when something doesn't work. (A more modern and inviting bug-reporting system would probably help but that is a completely different matter.)

I'm pushing the proposed tightening of gradle-kotlin because the principle is right, and even if the Java world internally prefers APIs for composing tools, a tighter regexp in Emacs helps performance and accuracy for other patterns. Loose regexps form a sort of tragedy of the commons.

It seems that we also have forgotten to close the bug; doing that now. Thank you again for the insightful comments!





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#18109; Package emacs. (Thu, 10 Dec 2020 13:13:02 GMT) Full text and rfc822 format available.

Message #60 received at 18109-done <at> debbugs.gnu.org (full text, mbox):

From: Filipp Gunbin <fgunbin <at> fastmail.fm>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>, 18109-done <at> debbugs.gnu.org
Subject: Re: bug#18109: 24.4.50; `compilation-error-regexp-alist-alist':
 wrong regexp for Maven
Date: Thu, 10 Dec 2020 16:12:25 +0300
On 09/12/2020 19:41 +0100, Mattias Engdegård wrote:

> Quite possible, but the very emission of formalised messages to
> stdout/stderr means that this mode of usage is still acknowledged as
> somewhat common and useful.

Yes, sure.

>> - did we really have that many problems caused by bad
>> performance of compilation regexps?  Because if we did, then maybe we
>> should look at other approaches, like trying to detect the compiler
>> used, and narrow the set of regexps based on it.
>
> This is hard to do in any practical way, not least because a
> single message buffer may consist of the combined output of dozens of
> different tools -- compilers, linters, build tools, spell checkers,
> testing, stack traces, packaging, and so on. Not to mention the
> practical difficulty of going from the string 'make' to 'GCC version
> 11.2'.
>
> That things work reasonably anyway is very much thanks to the
> prevalence of a few fairly common formats, such as GNU (file:line:
> message).

Yes, btw I see that the "gnu" regexp sometimes captures messages which I
expect to be captured by the "javac" regexp.  This is not that unexpected,
given the occasional similarity between formats...  I'll look into that
later.

>>  It's natural to expect
>> that many different people would edit these regexps when something
>> doesn't work for them, and expecting that you will always come and fix
>> the things up would not be very fair to you :-)
>
> Very considerate, thank you! There seems to be a fairly good flow of
> reports when something doesn't work. (A more modern and inviting
> bug-reporting system would probably help but that is a completely
> different matter.)
>
> I'm pushing the proposed tightening of gradle-kotlin because the
> principle is right, and even if the Java world internally prefers APIs
> for composing tools, a tighter regexp in Emacs helps performance and
> accuracy for other patterns. Loose regexps form a sort of tragedy of
> the commons.
>
> It seems that we also have forgotten to close the bug; doing that
> now. Thank you again for the insightful comments!

Thank you for the careful work.

Filipp




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 08 Jan 2021 12:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 4 years and 241 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.