GNU bug report logs -
#64420
string-width of … is 2 in CJK environments
Message #53 received at 64420 <at> debbugs.gnu.org (full text, mbox):
Hi, I'm the reporter of the issue
(https://github.com/company-mode/company-mode/issues/1388) filed against the
Emacs company package. The project owner of company suggested that I comment
here on the matter of character-width-table, so here are my thoughts.
There are many characters marked as A (Ambiguous) width in the file
https://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt, which is
part of the Unicode 15.0.0 Character Database. The characters in the
General Punctuation block (U+2000..U+206F) are marked as either N (Neutral)
or A (Ambiguous) width, and the ellipsis character (U+2026) is marked as A.
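These classifications can be checked without reading the data file by hand. The following is only a small sketch in Python, assuming that the `unicodedata` module bundled with the interpreter is an acceptable stand-in for EastAsianWidth.txt; its Unicode version depends on the Python release and may not be 15.0.0.

```python
import unicodedata
from collections import Counter

# Unicode version of the bundled database (may differ from 15.0.0),
# followed by the East Asian Width class of the ellipsis U+2026.
print(unicodedata.unidata_version, unicodedata.east_asian_width("\u2026"))

# Tally the East Asian Width classes across the General Punctuation
# block (U+2000..U+206F), one entry per code point.
block_classes = Counter(
    unicodedata.east_asian_width(chr(cp)) for cp in range(0x2000, 0x2070)
)
print(dict(block_classes))
```

The second line of output shows how many code points of the block fall into each class in the bundled database.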
There is also a recommendation on how to treat ambiguous-width characters
in non-East Asian contexts in the Unicode 15.0.0 East Asian Width Technical
Report (http://www.unicode.org/reports/tr11/).
Quoting from the TR:
> 5 Recommendations
>
> When processing or displaying data
>
> • Ambiguous characters behave like wide or narrow characters depending
> on the context (language tag, script identification, associated font,
> source of data, or explicit markup; all can provide the context). If the
> context cannot be established reliably, they should be treated as narrow
> characters by default.
My understanding of how the report treats ambiguous width is that context is
paramount, and that the recommended default is narrow in non-East Asian
contexts or whenever the context cannot be established reliably.
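To make that reading concrete, here is a minimal sketch in Python of a width function that follows the recommendation. It is not how Emacs computes `string-width`; the function name `display_width` and the `ambiguous_wide` parameter are hypothetical, and combining or zero-width characters are deliberately ignored to keep the example short.

```python
import unicodedata


def display_width(text: str, ambiguous_wide: bool = False) -> int:
    """Estimate the column width of text (hypothetical helper).

    Wide (W) and Fullwidth (F) characters count as 2 columns.
    Ambiguous (A) characters count as 2 only when the caller has
    established an East Asian context via ambiguous_wide=True;
    otherwise they are narrow, as the TR recommends by default.
    Simplification: every other character counts as 1 column.
    """
    width = 0
    for ch in text:
        eaw = unicodedata.east_asian_width(ch)
        if eaw in ("W", "F"):
            width += 2
        elif eaw == "A" and ambiguous_wide:
            width += 2
        else:
            width += 1
    return width


print(display_width("\u2026"))                       # ellipsis, default context
print(display_width("\u2026", ambiguous_wide=True))  # ellipsis, CJK context
```

The point of the explicit parameter is that the caller supplies the context, and narrow is used only as the fallback when no context is given.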
How about in practice? I tested how a few ambiguous-width characters are
rendered by terminals on several operating systems:
macOS Mojave - built-in terminal, kitty, iTerm2
Rendered as narrow characters regardless of the locale/font setting.
Windows 11 - old and new terminal
Rendered as narrow characters regardless of the locale/font setting.
Ubuntu 20 - gnome-terminal
The user can set the width of ambiguous characters to either narrow
(default) or wide through a compatibility option.
I'm surprised gnome-terminal has this option. However, it seems incomplete:
when I try to delete an ambiguous-width character rendered as a wide one,
the terminal messes up its cursor position, whereas deleting a genuinely
wide character works fine.
So I think the proper default width for ambiguous-width characters is
narrow, and there should be options for setting the width of those
characters. However, such a change of the default value might break Emacs
packages that rely on the CJK language environment.
All in all, I think providing comprehensive options to change the width of
ambiguous-width characters would be desirable.
This bug report was last modified 2 years ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.