GNU bug report logs -
#64420
string-width of … is 2 in CJK environments
> Date: Tue, 11 Jul 2023 05:13:57 +0300
> Cc: 64420 <at> debbugs.gnu.org
> From: Dmitry Gutov <dmitry <at> gutov.dev>
>
> >> Would it be appropriate to fix the entry for … in that table either way?
> >
> > "Fix" in what way? In most language-environments we get
> >
> > (char-width ?…) => 1
> >
> > What's wrong with that?
>
> It returns 2 in Chinese-BIG5, while the actual metrics of the char don't
> change.
I explained why this happens and why Emacs works that way. If
something in my explanation is unclear, please ask more specific
questions.
> >> Or does that not match the principle with which those entries are done?
> >
> > Sorry, I don't understand the question: what principle are you talking
> > about?
>
> The principles by which we fill in the said char-table, which we fill "by
> hand": e.g., which characters to include, and which to leave with
> "automatic" metrics.
We fill the table by hand, but the data is synchronized with the
Unicode Standard, and is reviewed each time we import a new Unicode
version. The tweaking of the char-width tables in CJK locales is due
to the issue I explained in my previous message:
> >>> In CJK locales, most characters are double-width because those locales
> >>> use fonts where the glyphs are wider. Or at least this is the theory.
> >>> string-pixel-width is free from these assumptions because it actually
> >>> measures the font glyphs.
> What I meant is, string-pixel-width must be slower than string-width
> because it uses a temp buffer and actual measurements, whereas the
> latter function only does a table lookup, more or less (N times).
It is slower, yes, but much more accurate. TANSTAAFL.
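To see the difference concretely, something like the following sketch
could be evaluated in a GUI session. It assumes Emacs 29 or later, where
string-pixel-width is available, and the helper name my/width-report is
hypothetical, not part of Emacs.

```elisp
;; Sketch only: compare the table-based width with the measured width.
(require 'subr-x)  ; harmless if string-pixel-width is already autoloaded

(defun my/width-report (string)
  "Return (COLUMNS . PIXELS) for STRING.
`my/width-report' is a hypothetical helper used for illustration."
  (cons (string-width string)          ; columns, looked up per character
        (string-pixel-width string)))  ; pixels, measured from the font glyphs

;; The column count depends on the language environment in effect;
;; the pixel count depends on the font actually used for display.
(my/width-report "…")
```

Dividing the pixel value by (frame-char-width) gives a rough column
count that can be compared against the table-based result.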
This bug report was last modified 2 years and 2 days ago.