GNU bug report logs - #74307
30.0.92; emacs-lisp font-locking word regexp

Package: emacs;

Reported by: Roland Winkler <winkler <at> gnu.org>

Date: Mon, 11 Nov 2024 06:30:02 UTC

Severity: normal

Merged with 74308

Found in version 30.0.92

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 74307 in the body.
You can then email your comments to 74307 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#74307; Package emacs. (Mon, 11 Nov 2024 06:30:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Roland Winkler <winkler <at> gnu.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 11 Nov 2024 06:30:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Roland Winkler <winkler <at> gnu.org>
To: bug-gnu-emacs <at> gnu.org
Subject: 30.0.92; emacs-lisp font-locking word regexp
Date: Mon, 11 Nov 2024 00:28:34 -0600
Starting from emacs -Q, put the following into a buffer with
emacs-lisp-mode

  (setq foo "\\<foo\\>")

The part "foo\\" of the string "\\<foo\\>" will get
font-lock-variable-name-face, which looks odd.

I believe, this is due to a clause in lisp-mode.el that says

         ;; Words inside \\[], \\<>, \\{} or \\`' tend to be for
         ;; `substitute-command-keys'.

But this assumption is not always correct, in particular if ">" is
preceded by "\\", which happens when constructing regexps.
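
The behavior described above can be reproduced outside font-lock with a minimal sketch. It assumes the symbol pattern has the shape "word constituent, symbol constituent, or backslash plus any character", repeated one or more times. The names `demo-symbol-regexp`, `demo-old-regexp` and `demo-group-1` are hypothetical and are not taken from lisp-mode.el. Because a backslash followed by any character counts as part of a symbol, the second pair of backslashes is absorbed into group 1 before the closing ">" is matched.

```elisp
;; Hypothetical stand-in for the symbol pattern: one or more word
;; constituents, symbol constituents, or backslash-escaped characters.
(defconst demo-symbol-regexp "\\(?:\\sw\\|\\s_\\|\\\\.\\)+")

;; Two literal backslashes, "<", the symbol as group 1, then ">".
(defconst demo-old-regexp
  (concat "\\\\\\\\<\\(" demo-symbol-regexp "\\)>"))

(defun demo-group-1 (regexp text)
  "Return the text captured by group 1 of REGEXP in TEXT, or nil."
  ;; Syntax classes such as \\sw depend on the active syntax table,
  ;; so match under the Emacs Lisp one.
  (with-syntax-table emacs-lisp-mode-syntax-table
    (when (string-match regexp text)
      (match-string 1 text))))

;; The Lisp string "\\\\<foo\\\\>" holds the buffer text \\<foo\\>.
(demo-group-1 demo-old-regexp "\\\\<foo\\\\>")
;; => "foo\\\\"   (that is, the buffer text foo\\)
```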




Merged 74307 74308. Request was from Eli Zaretskii <eliz <at> gnu.org> to control <at> debbugs.gnu.org. (Mon, 11 Nov 2024 12:31:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74307; Package emacs. (Thu, 14 Nov 2024 08:12:01 GMT) Full text and rfc822 format available.

Message #10 received at 74307 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Roland Winkler <winkler <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 74307 <at> debbugs.gnu.org
Subject: Re: bug#74307: 30.0.92; emacs-lisp font-locking word regexp
Date: Thu, 14 Nov 2024 10:11:43 +0200
> From: Roland Winkler <winkler <at> gnu.org>
> Date: Mon, 11 Nov 2024 00:28:34 -0600
> 
> Starting from emacs -Q, put the following into a buffer with
> emacs-lisp-mode
> 
>   (setq foo "\\<foo\\>")
> 
> The part "foo\\" of the string "\\<foo\\>" will get
> font-lock-variable-name-face, which looks odd.
> 
> I believe, this is due to a clause in lisp-mode.el that says
> 
>          ;; Words inside \\[], \\<>, \\{} or \\`' tend to be for
>          ;; `substitute-command-keys'.
> 
> But this assumption is not always correct, in particular if ">" is
> preceded by "\\", which happens when constructing regexps.

I believe you are saying that in

         (,(rx "\\\\" (or (seq "<" (group-n 1 lisp-mode-symbol) ">")
                          (seq "{" (group-n 1 lisp-mode-symbol) "}")))
          (1 font-lock-variable-name-face prepend))

we should use something like the below instead?

     (,(rx "\\\\" (or (seq "<" (group-n 1 lisp-mode-symbol) (not "\\\\") ">")
                      (seq "{" (group-n 1 lisp-mode-symbol) (not "\\\\") "}"))

And similarly for \\[] etc.?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74307; Package emacs. (Thu, 14 Nov 2024 16:25:02 GMT) Full text and rfc822 format available.

Message #13 received at 74307 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 74307 <at> debbugs.gnu.org, Roland Winkler <winkler <at> gnu.org>
Subject: Re: bug#74307: 30.0.92; emacs-lisp font-locking word regexp
Date: Thu, 14 Nov 2024 11:24:48 -0500
> I believe you are saying that in
>
>          (,(rx "\\\\" (or (seq "<" (group-n 1 lisp-mode-symbol) ">")
>                           (seq "{" (group-n 1 lisp-mode-symbol) "}")))
>           (1 font-lock-variable-name-face prepend))
>
> we should use something like the below instead?
>
>      (,(rx "\\\\" (or (seq "<" (group-n 1 lisp-mode-symbol) (not "\\\\") ">")
>                       (seq "{" (group-n 1 lisp-mode-symbol) (not "\\\\") "}"))
>
> And similarly for \\[] etc.?

Sounds good to me.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74307; Package emacs. (Thu, 14 Nov 2024 16:50:01 GMT) Full text and rfc822 format available.

Message #16 received at 74307 <at> debbugs.gnu.org (full text, mbox):

From: Roland Winkler <winkler <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 74307 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#74307: 30.0.92; emacs-lisp font-locking word regexp
Date: Thu, 14 Nov 2024 10:49:35 -0600
On Thu, Nov 14 2024, Eli Zaretskii wrote:
> we should use something like the below instead?
>
>      (,(rx "\\\\" (or (seq "<" (group-n 1 lisp-mode-symbol) (not "\\\\") ">")
>                       (seq "{" (group-n 1 lisp-mode-symbol) (not "\\\\") "}"))

Yes, thanks.  (This is my first real-world encounter with rx.  Otherwise
I would have proposed it myself.)

> And similarly for \\[] etc.?

I do not know in what context backslash-quoted right square brackets may
appear in regexps.  But certainly, they do not make sense in the context
of substitute-command-keys either.  So excluding here backslash-quoted
right square brackets is probably for the better, too.




Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Sat, 16 Nov 2024 14:24:02 GMT) Full text and rfc822 format available.

Notification sent to Roland Winkler <winkler <at> gnu.org>:
bug acknowledged by developer. (Sat, 16 Nov 2024 14:24:02 GMT) Full text and rfc822 format available.

Message #21 received at 74307-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 74307-done <at> debbugs.gnu.org, winkler <at> gnu.org
Subject: Re: bug#74307: 30.0.92; emacs-lisp font-locking word regexp
Date: Sat, 16 Nov 2024 16:22:55 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Roland Winkler <winkler <at> gnu.org>,  74307 <at> debbugs.gnu.org
> Date: Thu, 14 Nov 2024 11:24:48 -0500
> 
> > I believe you are saying that in
> >
> >          (,(rx "\\\\" (or (seq "<" (group-n 1 lisp-mode-symbol) ">")
> >                           (seq "{" (group-n 1 lisp-mode-symbol) "}")))
> >           (1 font-lock-variable-name-face prepend))
> >
> > we should use something like the below instead?
> >
> >      (,(rx "\\\\" (or (seq "<" (group-n 1 lisp-mode-symbol) (not "\\\\") ">")
> >                       (seq "{" (group-n 1 lisp-mode-symbol) (not "\\\\") "}"))
> >
> > And similarly for \\[] etc.?
> 
> Sounds good to me.

Thanks, installed on master, and closing the bug.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74307; Package emacs. (Thu, 05 Dec 2024 07:23:02 GMT) Full text and rfc822 format available.

Message #29 received at 74307 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 74307 <at> debbugs.gnu.org, Roland Winkler <winkler <at> gnu.org>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#74307: 30.0.92; emacs-lisp font-locking word regexp
Date: Thu, 05 Dec 2024 09:20:16 +0200
>>   (setq foo "\\<foo\\>")
>> 
>> The part "foo\\" of the string "\\<foo\\>" will get
>> font-lock-variable-name-face, which looks odd.
>> 
>> I believe, this is due to a clause in lisp-mode.el that says
>> 
>>          ;; Words inside \\[], \\<>, \\{} or \\`' tend to be for
>>          ;; `substitute-command-keys'.
>> 
>> But this assumption is not always correct, in particular if ">" is
>> preceded by "\\", which happens when constructing regexps.
>
> I believe you are saying that in
>
>          (,(rx "\\\\" (or (seq "<" (group-n 1 lisp-mode-symbol) ">")
>                           (seq "{" (group-n 1 lisp-mode-symbol) "}")))
>           (1 font-lock-variable-name-face prepend))
>
> we should use something like the below instead?
>
>      (,(rx "\\\\" (or (seq "<" (group-n 1 lisp-mode-symbol) (not "\\\\") ">")
>                       (seq "{" (group-n 1 lisp-mode-symbol) (not "\\\\") "}"))

The problem is that this removes highlighting from the last character
because it doesn't get into the group:

  (rx (seq "[" (group-n 1 lisp-mode-symbol) (not "\\") "]"))
  => "\\[\\(?1:\\(?:\\w\\|\\s_\\|\\\\.\\)+\\)[^\\]]"

A possible solution is to move (not "\\") inside the group:

  (rx (seq "[" (group-n 1 (seq lisp-mode-symbol (not "\\"))) "]"))
  => "\\[\\(?1:\\(?:\\w\\|\\s_\\|\\\\.\\)+[^\\]\\)]"

But this removes highlighting completely from the reported case of
(setq foo "\\<foo\\>").  However, I guess it should not have highlighting
anyway because this is an incorrect syntax of `substitute-command-keys'
that should match only \\[], \\<>, \\{} or \\`' without the second \\




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74307; Package emacs. (Thu, 05 Dec 2024 07:39:02 GMT) Full text and rfc822 format available.

Message #32 received at 74307 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: 74307 <at> debbugs.gnu.org, winkler <at> gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#74307: 30.0.92; emacs-lisp font-locking word regexp
Date: Thu, 05 Dec 2024 09:38:48 +0200
> From: Juri Linkov <juri <at> linkov.net>
> Cc: Roland Winkler <winkler <at> gnu.org>,  Stefan Monnier
>  <monnier <at> iro.umontreal.ca>,  74307 <at> debbugs.gnu.org
> Date: Thu, 05 Dec 2024 09:20:16 +0200
> 
> >>   (setq foo "\\<foo\\>")
> >> 
> >> The part "foo\\" of the string "\\<foo\\>" will get
> >> font-lock-variable-name-face, which looks odd.
> >> 
> >> I believe, this is due to a clause in lisp-mode.el that says
> >> 
> >>          ;; Words inside \\[], \\<>, \\{} or \\`' tend to be for
> >>          ;; `substitute-command-keys'.
> >> 
> >> But this assumption is not always correct, in particular if ">" is
> >> preceded by "\\", which happens when constructing regexps.
> >
> > I believe you are saying that in
> >
> >          (,(rx "\\\\" (or (seq "<" (group-n 1 lisp-mode-symbol) ">")
> >                           (seq "{" (group-n 1 lisp-mode-symbol) "}")))
> >           (1 font-lock-variable-name-face prepend))
> >
> > we should use something like the below instead?
> >
> >      (,(rx "\\\\" (or (seq "<" (group-n 1 lisp-mode-symbol) (not "\\\\") ">")
> >                       (seq "{" (group-n 1 lisp-mode-symbol) (not "\\\\") "}"))
> 
> The problem is that this removes highlighting from the last character
> because it doesn't get into the group:
> 
>   (rx (seq "[" (group-n 1 lisp-mode-symbol) (not "\\") "]"))
>   => "\\[\\(?1:\\(?:\\w\\|\\s_\\|\\\\.\\)+\\)[^\\]]"
> 
> A possible solution is to move (not "\\") inside the group:
> 
>   (rx (seq "[" (group-n 1 (seq lisp-mode-symbol (not "\\"))) "]"))
>   => "\\[\\(?1:\\(?:\\w\\|\\s_\\|\\\\.\\)+[^\\]\\)]"
> 
> But this removes highlighting completely from the reported case of
> (setq foo "\\<foo\\>").  However, I guess it should not have highlighting
> anyway because this is an incorrect syntax of `substitute-command-keys'
> that should match only \\[], \\<>, \\{} or \\`' without the second \\

Sorry, I don't understand: the change which was supposed to fix this
was already installed.  If you are saying it caused regressions, could
you please show a recipe for reproducing those regressions?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74307; Package emacs. (Thu, 05 Dec 2024 07:49:01 GMT) Full text and rfc822 format available.

Message #35 received at 74307 <at> debbugs.gnu.org (full text, mbox):

From: Juri Linkov <juri <at> linkov.net>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 74307 <at> debbugs.gnu.org, winkler <at> gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#74307: 30.0.92; emacs-lisp font-locking word regexp
Date: Thu, 05 Dec 2024 09:47:05 +0200
>> The problem is that this removes highlighting from the last character
>> because it doesn't get into the group:
>> 
>>   (rx (seq "[" (group-n 1 lisp-mode-symbol) (not "\\") "]"))
>>   => "\\[\\(?1:\\(?:\\w\\|\\s_\\|\\\\.\\)+\\)[^\\]]"
>> 
>> A possible solution is to move (not "\\") inside the group:
>> 
>>   (rx (seq "[" (group-n 1 (seq lisp-mode-symbol (not "\\"))) "]"))
>>   => "\\[\\(?1:\\(?:\\w\\|\\s_\\|\\\\.\\)+[^\\]\\)]"
>> 
>> But this removes highlighting completely from the reported case of
>> (setq foo "\\<foo\\>").  However, I guess it should not have highlighting
>> anyway because this is an incorrect syntax of `substitute-command-keys'
>> that should match only \\[], \\<>, \\{} or \\`' without the second \\
>
> Sorry, I don't understand: the change which was supposed to fix this
> was already installed.

Ah, so (setq foo "\\<foo\\>") should not be highlighted.  Ok, then
indeed (not "\\") should be inside the group.

> If you are saying it caused regressions, could you please show
> a recipe for reproducing those regressions?

A recipe is to put the following two lines into a buffer with
emacs-lisp-mode:

  (setq foo "\\<foo\\>")
  (setq foo "\\<foo>")

The first foo should not be highlighted, the second currently is
highlighted partially without the last character.  Here is the fix:

diff --git a/lisp/emacs-lisp/lisp-mode.el b/lisp/emacs-lisp/lisp-mode.el
index 99980a44ddf..95fbae48bb6 100644
--- a/lisp/emacs-lisp/lisp-mode.el
+++ b/lisp/emacs-lisp/lisp-mode.el
@@ -491,16 +491,16 @@ lisp-mode--search-key
          ;; Words inside \\[], \\<>, \\{} or \\`' tend to be for
          ;; `substitute-command-keys'.
          (,(rx "\\\\" (or (seq "["
-                               (group-n 1 lisp-mode-symbol) (not "\\") "]")
+                               (group-n 1 (seq lisp-mode-symbol (not "\\"))) "]")
                           (seq "`" (group-n 1
                                      ;; allow multiple words, e.g. "C-x a"
                                      lisp-mode-symbol (* " " lisp-mode-symbol))
                                "'")))
           (1 font-lock-constant-face prepend))
          (,(rx "\\\\" (or (seq "<"
-                               (group-n 1 lisp-mode-symbol) (not "\\") ">")
+                               (group-n 1 (seq lisp-mode-symbol (not "\\"))) ">")
                           (seq "{"
-                               (group-n 1 lisp-mode-symbol) (not "\\") "}")))
+                               (group-n 1 (seq lisp-mode-symbol (not "\\"))) "}")))
           (1 font-lock-variable-name-face prepend))
          ;; Ineffective backslashes (typically in need of doubling).
          ("\\(\\\\\\)\\([^\"\\]\\)"




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74307; Package emacs. (Thu, 05 Dec 2024 08:05:02 GMT) Full text and rfc822 format available.

Message #38 received at 74307 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> linkov.net>
Cc: 74307 <at> debbugs.gnu.org, winkler <at> gnu.org, monnier <at> iro.umontreal.ca
Subject: Re: bug#74307: 30.0.92; emacs-lisp font-locking word regexp
Date: Thu, 05 Dec 2024 10:04:03 +0200
> From: Juri Linkov <juri <at> linkov.net>
> Cc: winkler <at> gnu.org,  monnier <at> iro.umontreal.ca,  74307 <at> debbugs.gnu.org
> Date: Thu, 05 Dec 2024 09:47:05 +0200
> 
> > Sorry, I don't understand: the change which was supposed to fix this
> > was already installed.
> 
> Ah, so (setq foo "\\<foo\\>") should not be highlighted.  Ok, then
> indeed (not "\\") should be inside the group.
> 
> > If you are saying it caused regressions, could you please show
> > a recipe for reproducing those regressions?
> 
> A recipe is to put the following two lines into a buffer with
> emacs-lisp-mode:
> 
>   (setq foo "\\<foo\\>")
>   (setq foo "\\<foo>")
> 
> The first foo should not be highlighted, the second currently is
> highlighted partially without the last character.  Here is the fix:

Thanks, please install on master.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 02 Jan 2025 12:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 262 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.