GNU bug report logs - #27403
26.0.50; Indentation misalignment with Unicode code points >65535
Reported by: Adam Niederer <adam.niederer <at> gmail.com>
Date: Sat, 17 Jun 2017 06:55:03 UTC
Severity: normal
Found in version 26.0.50
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
> From: Adam Niederer <adam.niederer <at> gmail.com>
> Date: Sat, 17 Jun 2017 02:28:41 -0400
>
> Hello, I believe I've found an indentation issue. To reproduce, start
> emacs, create a buffer in js-mode, paste in this code, and press C-x h
> TAB to indent the buffer:
>
> let x = /* 👍 */ { foo: 0
> bar: 0 }
>
> let x = /* ☺ */ { foo: 0
> bar: 0 }
>
> Both 25.2 and 26.0.50 add one extra space before "bar" in the first
> snippet with U+1F44D THUMBS UP SIGN in the comment, whereas the
> second snippet with U+263A WHITE SMILING FACE properly aligns "bar" with
> "foo".
That's because U+1F44D is a double-width character:
(char-width ?👍) => 2
while U+263A is not double-width.
So as long as indentation works in columns and not in pixels, this is
a "feature".
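To illustrate the column arithmetic outside Emacs, here is a minimal
Python sketch. It is not how Emacs computes char-width; it only
approximates it by assuming that characters whose Unicode East Asian
Width is W or F take two columns and everything else takes one
(combining and zero-width characters are ignored). The helper name
display_columns is hypothetical.

```python
import unicodedata


def display_columns(text):
    """Approximate how many fixed-width columns `text` occupies.

    Assumption: East Asian Width "W" or "F" means two columns,
    anything else means one. This is a rough stand-in for Emacs's
    char-width, not a reimplementation of it.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


# Text preceding "foo" in each snippet from the report.
thumbs = "let x = /* \U0001F44D */ { "  # U+1F44D THUMBS UP SIGN
smiley = "let x = /* \u263A */ { "      # U+263A WHITE SMILING FACE

# Same number of characters, but the column where "foo" starts differs.
print(len(thumbs), len(smiley))
print(display_columns(thumbs), display_columns(smiley))
```

With Unicode 9 or later character data, the first prefix should come
out one column wider than the second, which is the one-space
difference in where "bar" gets aligned.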
> This appears to happen whenever the character in the comment needs a
> surrogate pair.
I don't believe surrogates have anything to do with this, since Emacs
works with Unicode codepoints, not their UTF-16 encodings.
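The difference between code points and UTF-16 code units can be
checked in any language whose strings are sequences of code points.
A small Python sketch, where utf16_units is a hypothetical helper and
not an Emacs function:

```python
def utf16_units(ch):
    """Number of 16-bit code units needed to encode `ch` in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


thumbs = "\U0001F44D"  # U+1F44D, above U+FFFF
smiley = "\u263A"      # U+263A, inside the Basic Multilingual Plane

for ch in (thumbs, smiley):
    # len() counts code points; utf16_units() counts UTF-16 code units.
    print(hex(ord(ch)), len(ch), utf16_units(ch))
```

U+1F44D needs a surrogate pair in UTF-16, yet it is still a single
code point, so a program that counts code points sees one character
in both snippets. The extra column comes from the display width, not
from the encoding.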
> Interestingly, pressing TAB with one's point on the second line of each
> snippet to dedent the line yields a correct result for both symbols:
>
> "👍", {"a": 2,
> "b": 3}
>
> "☺", {"a":2,
> "b":3}
Which is probably a subtle bug: this should behave like the first
snippet.
Thanks.
This bug report was last modified 3 years and 200 days ago.