GNU bug report logs - #61726
[PATCH] Eglot: Support positionEncoding capability

Previous Next

Package: emacs;

Reported by: Augusto Stoffel <arstoffel <at> gmail.com>

Date: Thu, 23 Feb 2023 08:06:01 UTC

Severity: normal

Tags: patch

Done: João Távora <joaotavora <at> gmail.com>

Bug is archived. No further changes may be made.

Full log


Message #62 received at 61726 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Augusto Stoffel <arstoffel <at> gmail.com>
Cc: 61726 <at> debbugs.gnu.org, joaotavora <at> gmail.com
Subject: Re: bug#61726: [PATCH] Eglot: Support positionEncoding capability
Date: Fri, 24 Feb 2023 10:38:35 +0200
> From: Augusto Stoffel <arstoffel <at> gmail.com>
> Cc: joaotavora <at> gmail.com,  61726 <at> debbugs.gnu.org
> Date: Fri, 24 Feb 2023 08:18:30 +0100
> 
> On Fri, 24 Feb 2023 at 08:43, Eli Zaretskii wrote:
> 
> > It does? then please humor me by walking me through the code and the
> > patch to show how that would work after applying the patch.
> 
> +            :general
> +            (list
> +             :positionEncodings ["utf-32" "utf-8" "utf-16"])
>              :experimental eglot--{})))

Is "UTF-32" an LSP thing and terminology?  Because I'd prefer a
different name if we can.  At least for our internal nomenclature,
let's use "codepoint" or "character" instead.

> -(defun eglot-current-column () (- (point) (line-beginning-position)))
> +(defun eglot-current-column ()
> +  "Calculate current column, counting Unicode codepoints."
> +  (- (point) (line-beginning-position)))

Can we please take this opportunity to get rid of the confusing
"column" terminology?  As became evident from this discussion, we are
not talking columns here, we are talking offsets in characters from
BOL.  So something like "pos" or "linepos" or "line-offset" should be
better.

João, are you okay with such a sweeping change in all of eglot.el?

> +(defun eglot--current-column-utf-8 ()
> +  "Calculate current column, counting bytes."
> +  (- (position-bytes (point)) (position-bytes (line-beginning-position))))

As discussed, position-bytes is incorrect.  You should instead do
something like

  (length (encode-coding-string
           (buffer-substring-no-properties (point)
                                           (line-beginning-position))
           'utf-8-unix t))

Also, for 100% reliable results, we should bind
inhibit-field-text-motion to t when calling line-beginning-position.

> +(defun eglot--move-to-column-utf-8 (column)
> +  "Move to COLUMN, regarded as a byte offset."
> +  (goto-char (min (byte-to-position
> +                   (+ (position-bytes (line-beginning-position)) column))
> +                  (line-end-position))))

Likewise here.

> @@ -1515,14 +1536,20 @@ eglot--lsp-position-to-point
>        (forward-line (min most-positive-fixnum
>                           (plist-get pos-plist :line)))
>        (unless (eobp) ;; if line was excessive leave point at eob
> -        (let ((tab-width 1)
> +        (let ((movefn (or eglot-move-to-column-function
> +                          (pcase (plist-get (eglot--capabilities (eglot-current-server))
> +                                            :positionEncoding)
> +                            ("utf-32" #'eglot-move-to-column)
> +                            ("utf-8" #'eglot--move-to-column-utf-8)
> +                            (_ #'eglot-move-to-lsp-abiding-column))))
> +              (tab-width 1)
                  ^^^^^^^^^^^
This last part shouldn't be necessary: we should move by characters,
not by columns.  Why is it necessary?

> I hope this helps clarifying things.

Yes, thank you very much.




This bug report was last modified 2 years and 138 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.