GNU bug report logs - #36432
26.2; SMIE does not request forward tokens when point is at point-max


Package: emacs;

Reported by: Sam Halliday <sam.halliday <at> gmail.com>

Date: Sat, 29 Jun 2019 12:15:01 UTC

Severity: normal

Found in version 26.2

To reply to this bug, email your comments to 36432 AT debbugs.gnu.org.




Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#36432; Package emacs. (Sat, 29 Jun 2019 12:15:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Sam Halliday <sam.halliday <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sat, 29 Jun 2019 12:15:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Sam Halliday <sam.halliday <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 26.2; SMIE does not request forward tokens when point is at point-max
Date: Sat, 29 Jun 2019 13:14:01 +0100
SMIE (via `indent-for-tab-command') does not request forward tokens
from the lexer when point is at `point-max'.

This might sound like a strange bug report: why should smie expect
there to be any tokens when it is already at point-max? The answer is:
virtual tokens. For example, Haskell may have many closing curly
brackets that live at the end of the buffer.

A workaround is to add a few stray newlines to the end of any buffer
that uses SMIE for indentation. Then SMIE will request the next tokens
(even though there is only whitespace left until the end of the
buffer) and will receive at least one of those virtuals.
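
To make the idea of a virtual token at `point-max' concrete, here is a
minimal sketch of a forward lexer that keeps returning "}" after point
has reached the end of the buffer. It is not the haskell-tng lexer:
the `my-' names and the simple counter are assumptions made purely for
illustration, and a real lexer would derive the closers from the
layout state rather than from a counter.

```elisp
(require 'smie)

;; Hypothetical state: how many virtual "}" tokens are still owed at
;; the end of the buffer.  A real layout-aware lexer would compute this.
(defvar-local my-pending-virtual-closers 0
  "Number of virtual \"}\" tokens still to emit at `point-max'.")

(defun my-forward-token ()
  "Return the next token, emitting virtual \"}\" tokens at `point-max'.
Point does not move while the virtual tokens are returned."
  ;; Skip whitespace and comments, as `smie-default-forward-token' does.
  (forward-comment (point-max))
  (if (and (eobp) (> my-pending-virtual-closers 0))
      (progn
        (setq my-pending-virtual-closers (1- my-pending-virtual-closers))
        "}")
    ;; Ordinary text: defer to the default lexer, which returns ""
    ;; once nothing is left.
    (smie-default-forward-token)))
```

A mode would install such a function through the FORWARD-TOKEN
argument of `smie-setup'. The virtual tokens can only be seen if the
caller asks for a token while point is already at `point-max'.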





Message #8 received at 36432 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Sam Halliday <sam.halliday <at> gmail.com>
Cc: 36432 <at> debbugs.gnu.org
Subject: Re: bug#36432: 26.2;
 SMIE does not request forward tokens when point is at point-max
Date: Sat, 29 Jun 2019 15:23:11 +0300
> From: Sam Halliday <sam.halliday <at> gmail.com>
> Date: Sat, 29 Jun 2019 13:14:01 +0100
> 
> SMIE (via `indent-for-tab-command') does not request forward tokens
> from the lexer when point is at `point-max'.
> 
> This might sound like a strange bug report: why should smie expect
> there to be any tokens when it is already at point-max? The answer is:
> virtual tokens. For example, Haskell may have many closing curly
> brackets that live at the end of the buffer.

??? There can be nothing at point-max, as that position is beyond the
last buffer position.  Did you mean the position just before that?  Or
am I missing something here?





Message #11 received at 36432 <at> debbugs.gnu.org (full text, mbox):

From: Sam Halliday <sam.halliday <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 36432 <at> debbugs.gnu.org
Subject: Re: bug#36432: 26.2; SMIE does not request forward tokens when point
 is at point-max
Date: Sat, 29 Jun 2019 13:34:06 +0100
On 29/06/2019, Eli Zaretskii <eliz <at> gnu.org> wrote:
>> From: Sam Halliday <sam.halliday <at> gmail.com>
>> Date: Sat, 29 Jun 2019 13:14:01 +0100
>>
>> SMIE (via `indent-for-tab-command') does not request forward tokens
>> from the lexer when point is at `point-max'.
>>
>> This might sound like a strange bug report: why should smie expect
>> there to be any tokens when it is already at point-max? The answer is:
>> virtual tokens. For example, Haskell may have many closing curly
>> brackets that live at the end of the buffer.
>
> ??? There can be nothing at point-max, as that position is beyond the
> last buffer position.  Did you mean the position just before that?  Or
> am I missing something here?
>

I mean at point-max.

Consider this layout algorithm

https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/haskell-tng-layout.el

and this lexer

https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/haskell-tng-lexer.el

that can continue to produce tokens even when the point is at the very
end of the buffer.

e.g.

input https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/test/src/layout.hs
with layout https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/test/src/layout.hs.layout
just tokens https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/test/src/layout.hs.lexer

note the trailing } that exists at the very end of the file. SMIE
always misses this.





Message #14 received at 36432 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Sam Halliday <sam.halliday <at> gmail.com>
Cc: 36432 <at> debbugs.gnu.org
Subject: Re: bug#36432: 26.2; SMIE does not request forward tokens when point
 is at point-max
Date: Sat, 29 Jun 2019 15:42:27 +0300
> From: Sam Halliday <sam.halliday <at> gmail.com>
> Date: Sat, 29 Jun 2019 13:34:06 +0100
> Cc: 36432 <at> debbugs.gnu.org
> 
> > ??? There can be nothing at point-max, as that position is beyond the
> > last buffer position.  Did you mean the position just before that?  Or
> > am I missing something here?
> >
> 
> I mean at point-max.
> 
> Consider this layout algorithm
> 
> https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/haskell-tng-layout.el
> 
> and this lexer
> 
> https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/haskell-tng-lexer.el
> 
> that can continue to produce tokens even when the point is at the very
> end of the buffer.

So you create an illusion of characters beyond the EOB?

How would Emacs know this is the case?  Why don't you also override
point-max to make it consistent with those illusory characters?





Message #17 received at 36432 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: sam.halliday <at> gmail.com
Cc: 36432 <at> debbugs.gnu.org
Subject: Re: bug#36432: 26.2;
 SMIE does not request forward tokens when point is at point-max
Date: Sat, 29 Jun 2019 15:51:07 +0300
> Date: Sat, 29 Jun 2019 15:42:27 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 36432 <at> debbugs.gnu.org
> 
> How would Emacs know this is the case?  Why don't you also override
> point-max to make it consistent with those illusory characters?

Or actually insert those characters at EOB, but make them invisible?





Message #20 received at 36432 <at> debbugs.gnu.org (full text, mbox):

From: Sam Halliday <sam.halliday <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 36432 <at> debbugs.gnu.org
Subject: Re: bug#36432: 26.2; SMIE does not request forward tokens when point
 is at point-max
Date: Sat, 29 Jun 2019 13:51:31 +0100
On 29/06/2019, Eli Zaretskii <eliz <at> gnu.org> wrote:
>> From: Sam Halliday <sam.halliday <at> gmail.com>
>> Date: Sat, 29 Jun 2019 13:34:06 +0100
>> Cc: 36432 <at> debbugs.gnu.org
>>
>> > ??? There can be nothing at point-max, as that position is beyond the
>> > last buffer position.  Did you mean the position just before that?  Or
>> > am I missing something here?
>> >
>>
>> I mean at point-max.
>>
>> Consider this layout algorithm
>>
>> https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/haskell-tng-layout.el
>>
>> and this lexer
>>
>> https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/haskell-tng-lexer.el
>>
>> that can continue to produce tokens even when the point is at the very
>> end of the buffer.
>
> So you create an illusion of characters beyond the EOB?
>
> How would Emacs know this is the case?

When testing it is possible to keep polling the lexer until it returns
nil when at point-max, rather than looking at `point-max` and giving
up. I think that could work in general inside SMIE.
https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/test/haskell-tng-lexer-test.el
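
As a rough sketch of that polling approach (not the code from the test
file linked above), the helper below keeps calling a forward lexer
until it returns nil or the empty string, without consulting
`point-max' itself. The function name and the LIMIT safety cap are
assumptions added for illustration.

```elisp
(require 'smie)

(defun my-collect-forward-tokens (lexer &optional limit)
  "Call LEXER repeatedly from point and return the tokens it yields.
Stop when LEXER returns nil or \"\", or after LIMIT tokens (default
1000), so a lexer that never finishes cannot loop forever."
  (let ((remaining (or limit 1000))
        (tokens '())
        token)
    (while (and (> remaining 0)
                (setq token (funcall lexer))
                (not (string= token "")))
      (push token tokens)
      (setq remaining (1- remaining)))
    (nreverse tokens)))
```

Because the loop only trusts the lexer's return value, any virtual
tokens produced at the very end of the buffer are collected too.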

I suspect the example forward lexer, from the documentation
https://www.gnu.org/software/emacs/manual/html_mono/elisp.html#SMIE-Lexer
would be ok in this situation. I'd be concerned that existing lexers
would throw an error if they were polled when at the beginning/end of
the buffer unexpectedly.

BTW, this also happens at the start of the buffer. SMIE doesn't ask
for backwards tokens when at the beginning.

> Why don't you also override
> point-max to make it consistent with those illusory characters?

Hmm, that is a workaround worth exploring. I'm not sure what the
consequences would be of changing something so fundamental. I think
changing SMIE would probably be easier, even with a monkey patch of
the relevant function or advice. I can have a go at trying to do that.
I just need to figure out exactly which function is doing the check.





Message #23 received at 36432 <at> debbugs.gnu.org (full text, mbox):

From: Sam Halliday <sam.halliday <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 36432 <at> debbugs.gnu.org
Subject: Re: bug#36432: 26.2; SMIE does not request forward tokens when point
 is at point-max
Date: Sat, 29 Jun 2019 14:01:52 +0100
On 29/06/2019, Eli Zaretskii <eliz <at> gnu.org> wrote:
>> Date: Sat, 29 Jun 2019 15:42:27 +0300
>> From: Eli Zaretskii <eliz <at> gnu.org>
>> Cc: 36432 <at> debbugs.gnu.org
>>
>> How would Emacs know this is the case?  Why don't you also override
>> point-max to make it consistent with those illusory characters?
>
> Or actually insert those characters at EOB, but make them invisible?

I think I'd like to avoid doing that unless I do it for all of the
virtual tokens. I don't know how to do that without it impacting the
underlying source code. Is there more documentation about doing it
that way? It'd be a drastic change in how the lexer is written.

BTW I had a look through smie.el and I can't see anywhere obvious
where (eobp) or (point-max) are called that would lead to the bug I'm
seeing. I'll most likely have to step debug at some point to get to
the bottom of this.





Message #26 received at 36432 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Sam Halliday <sam.halliday <at> gmail.com>
Cc: 36432 <at> debbugs.gnu.org
Subject: Re: bug#36432: 26.2; SMIE does not request forward tokens when point
 is at point-max
Date: Sat, 29 Jun 2019 16:06:17 +0300
> From: Sam Halliday <sam.halliday <at> gmail.com>
> Date: Sat, 29 Jun 2019 13:51:31 +0100
> Cc: 36432 <at> debbugs.gnu.org
> 
> > So you create an illusion of characters beyond the EOB?
> >
> > How would Emacs know this is the case?
> 
> When testing it is possible to keep polling the lexer until it returns
> nil when at point-max, rather than looking at `point-max` and giving
> up. I think that could work in general inside SMIE.
> https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/test/haskell-tng-lexer-test.el

But SMIE is just an application on top of Emacs's basic handling of
buffer positions.  The assumption that there can be nothing at EOB is
hardcoded into many Emacs primitives, into its display engine, and
into core Lisp infrastructure.  You are playing with fire trying to
force Emacs to think there are some characters beyond EOB.  Just grep the
C sources for ZV, and you will see the enormous height of the hill you
will need to fight up.  I wouldn't recommend that to anyone.

It should be easier to modify SMIE to take characters from a string,
then you could put whatever you want into that string.  Or maybe SMIE
already supports reading from strings, I don't know.

> BTW, this also happens at the start of the buffer. SMIE doesn't ask
> for backwards tokens when at the beginning.

For the same basic reasons.





Message #29 received at 36432 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Sam Halliday <sam.halliday <at> gmail.com>
Cc: 36432 <at> debbugs.gnu.org
Subject: Re: bug#36432: 26.2; SMIE does not request forward tokens when point
 is at point-max
Date: Sat, 29 Jun 2019 16:08:01 +0300
> From: Sam Halliday <sam.halliday <at> gmail.com>
> Date: Sat, 29 Jun 2019 14:01:52 +0100
> Cc: 36432 <at> debbugs.gnu.org
> 
> On 29/06/2019, Eli Zaretskii <eliz <at> gnu.org> wrote:
> >> Date: Sat, 29 Jun 2019 15:42:27 +0300
> >> From: Eli Zaretskii <eliz <at> gnu.org>
> >> Cc: 36432 <at> debbugs.gnu.org
> >>
> >> How would Emacs know this is the case?  Why don't you also override
> >> point-max to make it consistent with those illusory characters?
> >
> > Or actually insert those characters at EOB, but make them invisible?
> 
> I think I'd like to avoid doing that unless I do it for all of the
> virtual tokens. I don't know how to do that without it impacting the
> underlying source code. Is there more documentation about doing it
> that way? It'd be a drastic change in how the lexer is written.

Why?  Does the lexer only pay attention to visible characters?

> BTW I had a look through smie.el and I can't see anywhere obvious
> where (eobp) or (point-max) are called that would lead to the bug I'm
> seeing.

It's likely in lower-level code.  Like I said: this assumption is
everywhere in Emacs.





Message #32 received at 36432 <at> debbugs.gnu.org (full text, mbox):

From: Sam Halliday <sam.halliday <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 36432 <at> debbugs.gnu.org
Subject: Re: bug#36432: 26.2; SMIE does not request forward tokens when point
 is at point-max
Date: Sat, 29 Jun 2019 14:13:16 +0100
On 29/06/2019, Eli Zaretskii <eliz <at> gnu.org> wrote:
> It should be easier to modify SMIE to take characters from a string,
> then you could put whatever you want into that string.  Or maybe SMIE
> already supports reading from strings, I don't know.

I agree. SMIE should be operating on a list of tokens and a lookup
from those tokens to the original buffer (and content). In most cases
SMIE is written this way but I suspect there are a few lapses.





Message #35 received at 36432 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Sam Halliday <sam.halliday <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca> 
Cc: 36432 <at> debbugs.gnu.org
Subject: Re: bug#36432: 26.2; SMIE does not request forward tokens when point
 is at point-max
Date: Sat, 29 Jun 2019 16:43:25 +0300
> From: Sam Halliday <sam.halliday <at> gmail.com>
> Date: Sat, 29 Jun 2019 14:13:16 +0100
> Cc: 36432 <at> debbugs.gnu.org
> 
> On 29/06/2019, Eli Zaretskii <eliz <at> gnu.org> wrote:
> > It should be easier to modify SMIE to take characters from a string,
> > then you could put whatever you want into that string.  Or maybe SMIE
> > already supports reading from strings, I don't know.
> 
> I agree. SMIE should be operating on a list of tokens and a lookup
> from those tokens to the original buffer (and content). In most cases
> SMIE is written this way but I suspect there are a few lapses.

In any case, I'm CC'ing Stefan who knows much more about SMIE than I
do.  Apologies in advance if Stefan says my fears have no basis.





Message #38 received at 36432 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Sam Halliday <sam.halliday <at> gmail.com>
Cc: 36432 <at> debbugs.gnu.org
Subject: Re: bug#36432: 26.2;
 SMIE does not request forward tokens when point is at point-max
Date: Sat, 29 Jun 2019 17:39:21 -0400
> SMIE (via `indent-for-tab-command') does not request forward tokens
> from the lexer when point is at `point-max'.

After looking at the smie.el code I think this bug report is not
sufficiently detailed: it definitely sometimes does, and I don't see any
obvious place where it doesn't.  Can you clarify if it happens during
something like smie-forward-sexp or rather within the smie-indent*
code itself?

Or do you mean when you trigger indent-according-to-mode with point at EOB?


        Stefan






Message #41 received at 36432 <at> debbugs.gnu.org (full text, mbox):

From: Sam Halliday <sam.halliday <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 36432 <at> debbugs.gnu.org
Subject: Re: bug#36432: 26.2; SMIE does not request forward tokens when point
 is at point-max
Date: Sun, 30 Jun 2019 09:50:38 +0100
I'm seeing this when doing indentation.

e.g. in https://gitlab.com/tseenshe/haskell-tng.el/blob/tng/test/src/indentation.hs
move the point to the end of the last line and do a
`newline-and-indent'. Then do it again when you have two newlines after
that last point. The results are different.
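
One generic way to check whether the forward lexer is consulted at all
in each case is to wrap the buffer-local `smie-forward-token-function'
and record every call. This is only a sketch and is separate from the
haskell-tng debug commands mentioned below; the `my-' names are
hypothetical.

```elisp
(require 'smie)
(require 'nadvice)

(defvar my-token-log nil
  "Newest-first list of (POSITION AT-EOB TOKEN) entries.")

(defun my-log-forward-token (orig &rest args)
  "Call ORIG with ARGS, recording where it was called and what it returned."
  (let* ((pos (point))
         (token (apply orig args)))
    ;; AT-EOB is non-nil when the lexer was asked for a token at `point-max'.
    (push (list pos (= pos (point-max)) token) my-token-log)
    token))

(defun my-enable-forward-token-log ()
  "Wrap this buffer's `smie-forward-token-function' with the logger."
  (add-function :around (local 'smie-forward-token-function)
                #'my-log-forward-token))
```

After calling `my-enable-forward-token-log' in the buffer and running
`newline-and-indent' in both situations, the entries in `my-token-log'
show whether any request was made with point at the end of the buffer.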

BTW, in addition to the edebug support you've added, I also have

(bind-key "C-M-<return>" 'haskell-tng-smie:debug-newline haskell-tng-mode-map)
(bind-key "C-M-<tab>" 'haskell-tng-smie:debug-tab haskell-tng-mode-map)

that are useful for seeing what's going on, with some haskell-tng
specific things.

On 29/06/2019, Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>> SMIE (via `indent-for-tab-command') does not request forward tokens
>> from the lexer when point is at `point-max'.
>
> After looking at the smie.el code I think this bug report is not
> sufficiently detailed: it definitely sometimes does, and I don't see any
> obvious place where it doesn't.  Can you clarify if it happens during
> something like smie-forward-sexp or rather within the smie-indent*
> code itself?
>
> Or do you mean when you trigger indent-according-to-mode with point at EOB?
>
>
>         Stefan
>
>




This bug report was last modified 5 years and 351 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.