GNU bug report logs - #68445
[PATCH] Problem with python--treesit-syntax-propertize

Package: emacs;

Reported by: kobarity <kobarity <at> gmail.com>

Date: Sun, 14 Jan 2024 09:16:01 UTC

Severity: normal

Tags: patch

Done: Dmitry Gutov <dmitry <at> gutov.dev>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Dmitry Gutov <dmitry <at> gutov.dev>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#68445: closed ([PATCH] Problem with python--treesit-syntax-propertize)
Date: Fri, 26 Jan 2024 01:16:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Fri, 26 Jan 2024 03:15:24 +0200
with message-id <c3f80ad8-238e-4d36-9d4d-6cdaa862046b <at> gutov.dev>
and subject line Re: [PATCH] Problem with python--treesit-syntax-propertize
has caused the debbugs.gnu.org bug report #68445,
regarding [PATCH] Problem with python--treesit-syntax-propertize
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
68445: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=68445
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: kobarity <kobarity <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: [PATCH] Problem with python--treesit-syntax-propertize
Date: Sun, 14 Jan 2024 18:15:07 +0900
[Message part 3 (text/plain, inline)]
Hi,

I found a problem with python--treesit-syntax-propertize, which was
recently introduced by the patch for Bug#67977.

1. emacs -Q
2. Open a file in python-ts-mode with the following contents:

#+begin_src python
"""Docstring.

test.
"""
S = """string."""
#+end_src

3. Place point on the third line.
4. M-q
5. An empty line is inserted.
6. M-q
7. The string literal on the last line is split as follows:

S = ""

"string."""

This problem does not occur in python-mode.
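
The split in step 7 still parses as valid Python, so it can silently change
what the program does. The following is only a small sketch to show that
effect; the helper name run is made up for illustration, and the two source
strings are the last statement of the reproducer before and after the second
M-q.

```python
# Last statement of the reproducer, before and after the second M-q.
before = 'S = """string."""\n'
after = 'S = ""\n\n"string."""\n'


def run(source: str) -> str:
    """Execute SOURCE and return the value bound to S."""
    namespace = {}
    exec(compile(source, "<reproducer>", "exec"), namespace)
    return namespace["S"]


print(repr(run(before)))  # 'string.'
print(repr(run(after)))   # ''
```

In the split version, S is bound to the empty string, and the next line is a
separate expression statement made of two adjacent literals ("string." and
"").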

The direct cause of this problem is that the string-delimiter property
set in the docstring is removed.  python--treesit-syntax-propertize is
called to set the property, but it fails to set it properly.  Here is
the trace of python--treesit-syntax-propertize from step 4 above.

======================================================================
1 -> (python--treesit-syntax-propertize 1 45)
1 <- python--treesit-syntax-propertize: nil
======================================================================
1 -> (python--treesit-syntax-propertize 16 45)
1 <- python--treesit-syntax-propertize: nil

In the second call, python--treesit-syntax-propertize is called with
START set to 16, which is a position inside the docstring.

It seems to me that python--treesit-syntax-propertize assumes that the
START argument is outside the triple-quoted string.  So one solution
might be to change START to the start of the string if it is within a
string, as in the attached patch.  However, I'm not sure this is the
right approach.  Should we use
syntax-propertize-extend-region-functions?
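
The idea of moving START back to the beginning of the enclosing string can be
illustrated outside Emacs. What follows is only a Python sketch, not the
attached patch: it assumes the standard tokenize module as a stand-in for
tree-sitter, and the function name extend_start_to_string is made up for
illustration. Positions are 1-based, as in an Emacs buffer.

```python
import io
import tokenize

# The reproducer from step 2; position 16 is the "t" of "test.".
SOURCE = '"""Docstring.\n\ntest.\n"""\nS = """string."""\n'


def extend_start_to_string(source: str, start: int) -> int:
    """Return START, moved back to the opening quote if it is inside a string."""
    # 0-based offset at which each line begins.
    line_offsets = [0]
    for line in source.splitlines(keepends=True):
        line_offsets.append(line_offsets[-1] + len(line))

    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        # Convert (row, column) pairs to 1-based positions; tok_end is exclusive.
        tok_start = line_offsets[tok.start[0] - 1] + tok.start[1] + 1
        tok_end = line_offsets[tok.end[0] - 1] + tok.end[1] + 1
        if tok_start < start < tok_end:
            return tok_start
    return start


print(extend_start_to_string(SOURCE, 16))  # 1
print(extend_start_to_string(SOURCE, 27))  # 27
```

A position that is already outside any string literal, such as 27 here, is
returned unchanged.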

--
In GNU Emacs 30.0.50 (build 5, x86_64-pc-linux-gnu, X toolkit, cairo
 version 1.16.0, Xaw scroll bars) of 2024-01-13 built on ubuntu
Repository revision: 106cd9aafe8248ef91d7e89161adc5f912ea54eb
Repository branch: master
System Description: Ubuntu 22.04.3 LTS
[0001-Fix-python-treesit-syntax-propertize.patch (text/plain, attachment)]
[Message part 5 (message/rfc822, inline)]
From: Dmitry Gutov <dmitry <at> gutov.dev>
To: kobarity <kobarity <at> gmail.com>
Cc: Yuan Fu <casouri <at> gmail.com>, 68445-done <at> debbugs.gnu.org
Subject: Re: [PATCH] Problem with python--treesit-syntax-propertize
Date: Fri, 26 Jan 2024 03:15:24 +0200
On 23/01/2024 16:14, kobarity wrote:
> 
> Dmitry Gutov wrote:
>>
>> On 22/01/2024 17:44, kobarity wrote:
>>> Hi,
>>>
>>> Dmitry Gutov wrote:
>>>> On 21/01/2024 16:47, kobarity wrote:
>>>>> I am resending my mail, as I made a mistake in X-Debbugs-CC.
>>>> Was it supposed to appear in the bug's thread? I don't see it anywhere.
>>>
>>> My first mail was registered as Bug#68445, and my patch is there.
>>>
>>> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=68445
>>>
>>> It says:
>>>
>>> Report forwarded to casouri <at> gmail.com, dmitry@.gutov.dev, bug-gnu-emacs <at> gnu.org:
>>>
>>> The extra period is my mistake and it may have caused the problem.
>>> I'm sorry for the confusion.
>>
>> Yeah, but even so that's odd: I'm subscribed to the bug tracker, so
>> the email should have at least arrived in my inbox, but it did not.
> 
> I agree.  I can't find my first mail in the bug-gnu-emacs archive.
> 
>>>> I think there is also another approach--handle two different types of
>>>> nodes separately, instead of just string_content, so we don't have to
>>>> start from the beginning of the literal. Like this:
>>>>
>>>> diff --git a/lisp/progmodes/python.el b/lisp/progmodes/python.el
>>>> index e2f614f52c2..4f8b0cb9473 100644
>>>> --- a/lisp/progmodes/python.el
>>>> +++ b/lisp/progmodes/python.el
>>>> @@ -1361,13 +1361,15 @@ python--treesit-syntax-propertize
>>>>        (while (re-search-forward (rx (or "\"\"\"" "'''")) end t)
>>>>          (let ((node (treesit-node-at (point))))
>>>>            ;; The triple quotes surround a non-empty string.
>>>> -        (when (equal (treesit-node-type node) "string_content")
>>>> -          (let ((start (treesit-node-start node))
>>>> -                (end (treesit-node-end node)))
>>>> -            (put-text-property (1- start) start
>>>> -                               'syntax-table (string-to-syntax "|"))
>>>> -            (put-text-property end (min (1+ end) (point-max))
>>>> -                               'syntax-table (string-to-syntax "|"))))))))
>>>> +        (cond
>>>> +         ((equal (treesit-node-type node) "string_content")
>>>> +          (put-text-property (1- (treesit-node-start node))
>>>> +                             (treesit-node-start node)
>>>> +                             'syntax-table (string-to-syntax "|")))
>>>> +         ((and (equal (treesit-node-type node) "string_end")
>>>> +               (= (treesit-node-start node) (- (point) 3)))
>>>> +          (put-text-property (- (point) 3) (- (point) 2)
>>>> +                             'syntax-table (string-to-syntax "|"))))))))
>>>>
>>>>    
>>>>    ;;; Indentation
>>>>
>>>
>>> This approach seems better than my patch, but it does not seem to
>>> address the following special case.
>>>
>>> #+begin_src python
>>> """a""""""b"""
>>> #+end_src
>>
>> All right, try the patch below, please. It also covers the case of the
>> empty literal.
> 
> Thanks, it looks good to me.
> 
>> I've tried to find a case where it would behave poorly (e.g. by
>> misdetecting three quotes from a combination of some other string
>> literals), but couldn't. E.g.,
>>
>>    s = '''asdasd'
>>
>> is not a concatenation. It's always an error, at least according to
>> the TS grammar.
> 
> I think the TS grammar is correct, because this example is also an
> error according to the Python interpreter.

Thanks for testing! Installed, and closing.
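
The two literals discussed above can be checked against the Python
interpreter itself. This is only a sketch using the standard tokenize module
and the built-in compile; the helper names string_tokens and is_valid are
made up for illustration, and it says nothing about how the tree-sitter
grammar reaches the same result.

```python
import io
import tokenize


def string_tokens(source: str) -> list[str]:
    """Return the text of every string token in SOURCE."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.STRING]


def is_valid(source: str) -> bool:
    """Return True if SOURCE compiles without a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Two adjacent triple-quoted literals, concatenated by the compiler.
print(string_tokens('"""a""""""b"""\n'))  # ['"""a"""', '"""b"""']
print(eval('"""a""""""b"""'))             # ab

# The empty triple-quoted literal is a single token.
print(string_tokens('""""""\n'))          # ['""""""']

# An unterminated triple-quoted string, not a concatenation.
print(is_valid("s = '''asdasd'\n"))       # False
```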


This bug report was last modified 1 year and 174 days ago.
