GNU bug report logs - #74415
29.4; mouse-start-end does not respect syntax-table text properties

Package: emacs;

Reported by: Guillaume Brunerie <guillaume.brunerie <at> gmail.com>

Date: Mon, 18 Nov 2024 10:25:02 UTC

Severity: normal

Found in version 29.4

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 74415 in the body.
You can then email your comments to 74415 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#74415; Package emacs. (Mon, 18 Nov 2024 10:25:02 GMT)

Acknowledgement sent to Guillaume Brunerie <guillaume.brunerie <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Mon, 18 Nov 2024 10:25:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Guillaume Brunerie <guillaume.brunerie <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 29.4; mouse-start-end does not respect syntax-table text properties
Date: Mon, 18 Nov 2024 11:12:49 +0100
Hello,

The `mouse-start-end` function (used in particular when double clicking on an
opening parenthesis to select the region from that opening parenthesis to the
matching closing parenthesis) does not respect syntax-table text properties to
determine if the character at point is an opening parenthesis. Looking at the
code, it is at line 1941 in lisp/mouse.el (and there are more instances further
down in the same file)
(https://github.com/emacs-mirror/emacs/blob/eee0ed8442aa78320a3e578ab290df145fb49624/lisp/mouse.el#L1941):

    ((and (= mode 1)
          (= start end)
          (char-after start)
          (= (char-syntax (char-after start)) ?\())

Note that the documentation of `char-syntax` says:

> If you’re trying to determine the syntax of characters in the buffer, this is
> probably the wrong function to use, because it can’t take ‘syntax-table’ text
> properties into account. Consider using ‘syntax-after’ instead.

The line just below does use `syntax-after`, but it looks like something that
was added later to work around a related bug. I think this function should be
refactored to only use `syntax-after` instead of `char-syntax`.
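
As a minimal sketch of the difference: the buffer contents, the syntax
table setup, and the helper name `my-demo-syntax-at-bracket` below are
assumptions made up for illustration, not code from mouse.el.

```elisp
(defun my-demo-syntax-at-bracket ()
  "Compare `char-syntax' and `syntax-after' on a \"<\" carrying a text property.
Hypothetical demo helper; not part of Emacs."
  (with-temp-buffer
    ;; Assumed setup: a syntax table in which "<" is plain punctuation.
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?< "." table)
      (set-syntax-table table))
    (insert "a<b>")
    ;; Mark the "<" at position 2 as an open paren matching ">".
    (put-text-property 2 3 'syntax-table (string-to-syntax "(>"))
    ;; `syntax-after' only consults the property when this is non-nil.
    (setq-local parse-sexp-lookup-properties t)
    (list (char-syntax (char-after 2))
          (syntax-class-to-char (syntax-class (syntax-after 2))))))
```

Calling `(my-demo-syntax-at-bracket)` should return `(?. ?\()`: the
first element comes from the syntax table alone (punctuation), the
second reflects the text property (open parenthesis).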

For context, I'm writing my own TypeScript major mode where the < and > symbols
are sometimes balanced delimiters, sometimes not (determined via Tree-sitter). I
could get most things working using syntax-table text properties, like
forward-sexp and show-paren-mode, but not double-click selection due to this
issue.

Related: https://lists.gnu.org/archive/html/bug-gnu-emacs/2016-02/msg00988.html
In that issue, the opposite problem occurred, double clicking on an unmatched
open parenthesis did not take into account the "punctuation" text property. But
the fix only fixed one half of the issue (when a text property makes a
parenthesis into not a parenthesis) and I am facing the other half (a text
property makes a punctuation character into a parenthesis).

Best,
Guillaume




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74415; Package emacs. (Sun, 24 Nov 2024 10:00:02 GMT)

Message #8 received at 74415 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Guillaume Brunerie <guillaume.brunerie <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 74415 <at> debbugs.gnu.org
Subject: Re: bug#74415: 29.4;
 mouse-start-end does not respect syntax-table text properties
Date: Sun, 24 Nov 2024 11:59:15 +0200
> From: Guillaume Brunerie <guillaume.brunerie <at> gmail.com>
> Date: Mon, 18 Nov 2024 11:12:49 +0100
> 
> The `mouse-start-end` function (used in particular when double clicking on an
> opening parenthesis to select the region from that opening parenthesis to the
> matching closing parenthesis) does not respect syntax-table text properties to
> determine if the character at point is an opening parenthesis. Looking at the
> code, it is at line 1941 in lisp/mouse.el (and there are more instances further
> down in the same file)
> (https://github.com/emacs-mirror/emacs/blob/eee0ed8442aa78320a3e578ab290df145fb49624/lisp/mouse.el#L1941):
> 
>     ((and (= mode 1)
>           (= start end)
>           (char-after start)
>           (= (char-syntax (char-after start)) ?\())
> 
> Note that the documentation of `char-syntax` says:
> 
> > If you’re trying to determine the syntax of characters in the buffer, this is
> > probably the wrong function to use, because it can’t take ‘syntax-table’ text
> > properties into account. Consider using ‘syntax-after’ instead.
> 
> The line just below does use `syntax-after`, but it looks like something that
> was added later to work around a related bug. I think this function should be
> refactored to only use `syntax-after` instead of `char-syntax`.
> 
> For context, I'm writing my own TypeScript major mode where the < and > symbols
> are sometimes balanced delimiters, sometimes not (determined via Tree-sitter). I
> could get most things working using syntax-table text properties, like
> forward-sexp and show-paren-mode, but not double-click selection due to this
> issue.
> 
> Related: https://lists.gnu.org/archive/html/bug-gnu-emacs/2016-02/msg00988.html
> In that issue, the opposite problem occurred, double clicking on an unmatched
> open parenthesis did not take into account the "punctuation" text property. But
> the fix only fixed one half of the issue (when a text property makes a
> parenthesis into not a parenthesis) and I am facing the other half (a text
> property makes a punctuation character into a parenthesis).

Sounds reasonable.

Stefan, is there any reason not to use syntax-after everywhere in
mouse.el?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74415; Package emacs. (Mon, 25 Nov 2024 23:16:02 GMT)

Message #11 received at 74415 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Guillaume Brunerie <guillaume.brunerie <at> gmail.com>, 74415 <at> debbugs.gnu.org
Subject: Re: bug#74415: 29.4; mouse-start-end does not respect syntax-table
 text properties
Date: Mon, 25 Nov 2024 18:15:12 -0500
> Stefan, is there any reason not to use syntax-after everywhere in
> mouse.el?

Not that I can think of, no.
AFAIU, this code simply predates `syntax-after`.


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74415; Package emacs. (Thu, 28 Nov 2024 16:03:02 GMT)

Message #14 received at 74415 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: guillaume.brunerie <at> gmail.com, Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 74415 <at> debbugs.gnu.org
Subject: Re: bug#74415: 29.4; mouse-start-end does not respect syntax-table
 text properties
Date: Thu, 28 Nov 2024 18:02:10 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Guillaume Brunerie <guillaume.brunerie <at> gmail.com>,  74415 <at> debbugs.gnu.org
> Date: Mon, 25 Nov 2024 18:15:12 -0500
> 
> > Stefan, is there any reason not to use syntax-after everywhere in
> > mouse.el?
> 
> Not that I can think of, no.
> AFAIU, this code simply predates `syntax-after`.

Thanks.  Guillaume, does the patch below give good results?

diff --git a/lisp/mouse.el b/lisp/mouse.el
index 410e52b..766c4d8 100644
--- a/lisp/mouse.el
+++ b/lisp/mouse.el
@@ -632,7 +632,9 @@ context-menu-region
     (with-current-buffer (window-buffer (posn-window (event-end click)))
       (when (let* ((pos (posn-point (event-end click)))
                    (char (when pos (char-after pos))))
-              (or (and char (eq (char-syntax char) ?\"))
+              (or (and char (eq (syntax-class-to-char
+                                 (syntax-class (syntax-after pos)))
+                                ?\"))
                   (nth 3 (save-excursion (syntax-ppss pos)))))
         (define-key-after submenu [mark-string]
           `(menu-item "String"
@@ -1890,7 +1892,8 @@ mouse-skip-word
 If `mouse-1-double-click-prefer-symbols' is non-nil, skip over symbol.
 If DIR is positive skip forward; if negative, skip backward."
   (let* ((char (following-char))
-	 (syntax (char-to-string (char-syntax char)))
+	 (syntax (char-to-string
+                  (syntax-class-to-char (syntax-class (syntax-after (point))))))
          sym)
     (cond ((and mouse-1-double-click-prefer-symbols
                 (setq sym (bounds-of-thing-at-point 'symbol)))
@@ -1938,7 +1941,9 @@ mouse-start-end
         ((and (= mode 1)
               (= start end)
 	      (char-after start)
-              (= (char-syntax (char-after start)) ?\())
+              (= (syntax-class-to-char
+                  (syntax-class (syntax-after start)))
+                 ?\())
          (if (/= (syntax-class (syntax-after start)) 4) ; raw syntax code for ?\(
              ;; This happens in CC Mode when unbalanced parens in CPP
              ;; constructs are given punctuation syntax with
@@ -1953,7 +1958,9 @@ mouse-start-end
         ((and (= mode 1)
               (= start end)
 	      (char-after start)
-              (= (char-syntax (char-after start)) ?\)))
+              (= (syntax-class-to-char
+                  (syntax-class (syntax-after start)))
+                 ?\)))
          (if (/= (syntax-class (syntax-after start)) 5) ; raw syntax code for ?\)
              ;; See above comment about CC Mode.
              (signal 'scan-error (list "Unbalanced parentheses" start start))
@@ -1965,7 +1972,9 @@ mouse-start-end
 	((and (= mode 1)
               (= start end)
 	      (char-after start)
-              (= (char-syntax (char-after start)) ?\"))
+              (= (syntax-class-to-char
+                  (syntax-class (syntax-after start)))
+                 ?\"))
 	 (let ((open (or (eq start (point-min))
 			 (save-excursion
 			   (goto-char (- start 1))




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74415; Package emacs. (Thu, 28 Nov 2024 20:04:02 GMT)

Message #17 received at 74415 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: guillaume.brunerie <at> gmail.com, 74415 <at> debbugs.gnu.org
Subject: Re: bug#74415: 29.4; mouse-start-end does not respect syntax-table
 text properties
Date: Thu, 28 Nov 2024 15:03:21 -0500
> Thanks.  Guillaume, does the patch below give good results?

The patch LGTM, but it brought up the following side note: maybe we
should add a `syntax-class-of-char` so we can replace

    (= (syntax-class-to-char FOO)
       BAR))

with

    (= FOO
       (syntax-class-of-char BAR))

where `syntax-class-of-char` can be pre-computed during compilation
(when BAR is a constant, which would be always the case in your patch).

Tho, maybe even better would be a `(syntax-class-p BAR FOO)`.

Anyway, nothing urgent.
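
A rough sketch of what such a helper could look like. The names
`my-syntax-class-of-char` and `my-open-paren-at-p`, and the
implementation via `string-to-syntax`, are assumptions for
illustration only; a real version would additionally need something
like a compiler macro for the result to be folded at compile time.

```elisp
(defun my-syntax-class-of-char (class-char)
  "Return the integer syntax class designated by CLASS-CHAR.
For example, ?\\( designates class 4 (open parenthesis).
Hypothetical helper; not part of Emacs."
  ;; `string-to-syntax' parses a syntax descriptor such as "(" into a
  ;; raw syntax descriptor, from which `syntax-class' extracts the code.
  (syntax-class (string-to-syntax (string class-char))))

(defun my-open-paren-at-p (pos)
  "Return non-nil if the character at POS has open-parenthesis syntax.
Hypothetical helper showing the comparison in the suggested direction."
  (eql (syntax-class (syntax-after pos))
       (my-syntax-class-of-char ?\()))
```

With this shape the buffer-dependent value is compared as an integer
against a constant, instead of being converted to a character first.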


        Stefan





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74415; Package emacs. (Thu, 05 Dec 2024 07:39:02 GMT)

Message #20 received at 74415 <at> debbugs.gnu.org:

From: Guillaume Brunerie <guillaume.brunerie <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 74415 <at> debbugs.gnu.org
Subject: Re: bug#74415: 29.4; mouse-start-end does not respect syntax-table
 text properties
Date: Thu, 5 Dec 2024 07:13:20 +0100
On Thu, 28 Nov 2024 at 17:02, Eli Zaretskii <eliz <at> gnu.org> wrote:
> Thanks.  Guillaume, does the patch below give good results?

Thank you, I haven’t managed to apply the patch locally yet but I
think that would work (I’m on Emacs 29.4, but I guess I might need to
get the development version of Emacs? The patch seems to fail on my
mouse.el).
But one thing I want to point out is that the two `(signal 'scan-error
[...])` seem to be dead code now, as they test the exact opposite of
what the previous test now does.
So I guess it should be implemented a bit differently if you want to
preserve the current behavior (have a signal 'scan-error' when double
clicking on unbalanced parentheses in CC mode).




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74415; Package emacs. (Sat, 07 Dec 2024 12:34:01 GMT)

Message #23 received at 74415 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Guillaume Brunerie <guillaume.brunerie <at> gmail.com>
Cc: monnier <at> iro.umontreal.ca, 74415 <at> debbugs.gnu.org
Subject: Re: bug#74415: 29.4; mouse-start-end does not respect syntax-table
 text properties
Date: Sat, 07 Dec 2024 14:33:01 +0200
> From: Guillaume Brunerie <guillaume.brunerie <at> gmail.com>
> Date: Thu, 5 Dec 2024 07:13:20 +0100
> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 74415 <at> debbugs.gnu.org
> 
> On Thu, 28 Nov 2024 at 17:02, Eli Zaretskii <eliz <at> gnu.org> wrote:
> > Thanks.  Guillaume, does the patch below give good results?
> 
> Thank you, I haven’t managed to apply the patch locally yet but I
> think that would work (I’m on Emacs 29.4, but I guess I might need to
> get the development version of Emacs? The patch seems to fail on my
> mouse.el).
> But one thing I want to point out is that the two `(signal 'scan-error
> [...])` seem to be dead code now, as they test the exact opposite of
> what the previous test now does.

Sorry, I don't understand.  The changes I proposed didn't touch the
lines that signal errors.  Are you saying that those errors were dead
code before these changes as well?  If not, could you please elaborate
on the issues you see with the changes I proposed?

> So I guess it should be implemented a bit differently if you want to
> preserve the current behavior (have a signal 'scan-error' when double
> clicking on unbalanced parentheses in CC mode).





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74415; Package emacs. (Tue, 10 Dec 2024 05:11:03 GMT)

Message #26 received at 74415 <at> debbugs.gnu.org:

From: Guillaume Brunerie <guillaume.brunerie <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: monnier <at> iro.umontreal.ca, 74415 <at> debbugs.gnu.org
Subject: Re: bug#74415: 29.4; mouse-start-end does not respect syntax-table
 text properties
Date: Mon, 9 Dec 2024 20:21:49 +0100
On Sat, 7 Dec 2024 at 13:33, Eli Zaretskii <eliz <at> gnu.org> wrote:
>
> > From: Guillaume Brunerie <guillaume.brunerie <at> gmail.com>
> > Date: Thu, 5 Dec 2024 07:13:20 +0100
> > Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 74415 <at> debbugs.gnu.org
> >
> > On Thu, 28 Nov 2024 at 17:02, Eli Zaretskii <eliz <at> gnu.org> wrote:
> > > Thanks.  Guillaume, does the patch below give good results?
> >
> > Thank you, I haven’t managed to apply the patch locally yet but I
> > think that would work (I’m on Emacs 29.4, but I guess I might need to
> > get the development version of Emacs? The patch seems to fail on my
> > mouse.el).
> > But one thing I want to point out is that the two `(signal 'scan-error
> > [...])` seem to be dead code now, as they test the exact opposite of
> > what the previous test now does.
>
> Sorry, I don't understand.  The changes I proposed didn't touch the
> lines that signal errors.  Are you saying that those errors were dead
> code before these changes as well?  If not, could you please elaborate
> on the issues you see with the changes I proposed?

No, it was not dead code before, but changing the outer condition
makes it impossible for both the outer condition and the inner
condition to be true at the same time.
The current code is the following (inside a cond)

((and (= mode 1)
      (= start end)
      (char-after start)
      (= (char-syntax (char-after start)) ?\())
 (if (/= (syntax-class (syntax-after start)) 4) ; raw syntax code for ?\(
     ;; This happens in CC Mode when unbalanced parens in CPP
     ;; constructs are given punctuation syntax with
     ;; syntax-table text properties.  (2016-02-21).
     (signal 'scan-error (list "Containing expression ends prematurely"
                               start start))
   (list start
         (save-excursion
           (goto-char start)
           (forward-sexp 1)
           (point)))))

So the 'scan-error happens when the character is a parenthesis
character according to the syntax table (that's what is tested by (=
(char-syntax (char-after start)) ?\()) but has a text property telling
Emacs to treat it as something other than a parenthesis.
Changing the "char-syntax" test to (= (syntax-class-to-char
(syntax-class (syntax-after start))) ?\()) makes it so that the
'scan-error happens if the character is both a parenthesis and not a
parenthesis according to text properties, which is not possible.

In other words, it is not possible for both (= (syntax-class-to-char
(syntax-class (syntax-after start))) ?\()) and (/= (syntax-class
(syntax-after start)) 4) to be simultaneously true, as they are the
exact opposite of each other, so when they are nested conditions, the
inner one becomes dead code.
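
This can be checked mechanically by looping over all 16 syntax class
codes and collecting any class for which both tests succeed. The
helper name `my-contradictory-syntax-classes` is made up for this
sketch.

```elisp
(defun my-contradictory-syntax-classes ()
  "Return the syntax classes for which both nested tests would succeed.
Hypothetical helper; not part of Emacs."
  (let (result)
    ;; Syntax class codes run from 0 (whitespace) to 15 (string fence).
    (dotimes (class 16)
      (when (and (= (syntax-class-to-char class) ?\() ; patched outer test
                 (/= class 4))                         ; unchanged inner test
        (push class result)))
    (nreverse result)))
```

`(my-contradictory-syntax-classes)` returns nil: no syntax class
satisfies both conditions, so the inner branch cannot be reached once
the outer test uses `syntax-after`.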

That said, it only seems to happen in rare edge cases (as in
https://lists.gnu.org/archive/html/bug-gnu-emacs/2016-02/msg00988.html),
as regular unmatched parentheses do seem to still trigger a different
"Unbalanced parentheses" scan error.

Apart from that, I have now managed to test the patch, and I can
confirm that it fixes my original issue.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#74415; Package emacs. (Fri, 13 Dec 2024 17:07:01 GMT)

Message #29 received at 74415 <at> debbugs.gnu.org:

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Guillaume Brunerie <guillaume.brunerie <at> gmail.com>
Cc: Eli Zaretskii <eliz <at> gnu.org>, 74415 <at> debbugs.gnu.org
Subject: Re: bug#74415: 29.4; mouse-start-end does not respect syntax-table
 text properties
Date: Fri, 13 Dec 2024 12:05:55 -0500
> No, it was not dead code before, but changing the outer condition
> makes it impossible for both the outer condition and the inner
> condition to be true at the same time.
> The current code is the following (inside a cond)
>
> ((and (= mode 1)
>       (= start end)
>       (char-after start)
>       (= (char-syntax (char-after start)) ?\())
>  (if (/= (syntax-class (syntax-after start)) 4) ; raw syntax code for ?\(
>      ;; This happens in CC Mode when unbalanced parens in CPP
>      ;; constructs are given punctuation syntax with
>      ;; syntax-table text properties.  (2016-02-21).
>      (signal 'scan-error (list "Containing expression ends prematurely"
>                                start start))
>    (list start
>          (save-excursion
>            (goto-char start)
>            (forward-sexp 1)
>            (point)))))

I have the strong impression that this reflects the fact that the
if+signal was a workaround which we're now replacing with an actual fix.


        Stefan





Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Sat, 04 Jan 2025 10:55:01 GMT)

Notification sent to Guillaume Brunerie <guillaume.brunerie <at> gmail.com>:
bug acknowledged by developer. (Sat, 04 Jan 2025 10:55:02 GMT)

Message #34 received at 74415-done <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: guillaume.brunerie <at> gmail.com, 74415-done <at> debbugs.gnu.org
Subject: Re: bug#74415: 29.4; mouse-start-end does not respect syntax-table
 text properties
Date: Sat, 04 Jan 2025 12:54:21 +0200
> From: Stefan Monnier <monnier <at> iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz <at> gnu.org>,  74415 <at> debbugs.gnu.org
> Date: Fri, 13 Dec 2024 12:05:55 -0500
> 
> > No, it was not dead code before, but changing the outer condition
> > makes it impossible for both the outer condition and the inner
> > condition to be true at the same time.
> > The current code is the following (inside a cond)
> >
> > ((and (= mode 1)
> >       (= start end)
> >       (char-after start)
> >       (= (char-syntax (char-after start)) ?\())
> >  (if (/= (syntax-class (syntax-after start)) 4) ; raw syntax code for ?\(
> >      ;; This happens in CC Mode when unbalanced parens in CPP
> >      ;; constructs are given punctuation syntax with
> >      ;; syntax-table text properties.  (2016-02-21).
> >      (signal 'scan-error (list "Containing expression ends prematurely"
> >                                start start))
> >    (list start
> >          (save-excursion
> >            (goto-char start)
> >            (forward-sexp 1)
> >            (point)))))
> 
> I have the strong impression that this reflects the fact that the
> if+signal was a workaround which we're now replacing with an actual fix.

Evidently.  So I've now installed my changes on the master branch,
after removing the unneeded code which signals an error, and I'm
therefore closing this bug.
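
With the if+signal removed, the open-parenthesis clause presumably
reduces to the patched test followed directly by the `forward-sexp`
branch. A sketch under that assumption, wrapped in a hypothetical
helper `my-open-paren-span` so that it can be evaluated on its own
(the committed code may differ in detail):

```elisp
(defun my-open-paren-span (start)
  "Return (START END) for the balanced expression opening at START.
Return nil if the character at START lacks open-parenthesis syntax.
Hypothetical helper modelled on the clause in `mouse-start-end'."
  (when (and (char-after start)
             ;; Honors syntax-table text properties, unlike `char-syntax'.
             (= (syntax-class-to-char (syntax-class (syntax-after start)))
                ?\())
    (list start
          (save-excursion
            (goto-char start)
            (forward-sexp 1)
            (point)))))
```

For example, in a temporary buffer containing "(a b) c",
`(my-open-paren-span 1)` returns `(1 6)`.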




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 01 Feb 2025 12:24:07 GMT)

This bug report was last modified 197 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.