GNU bug report logs - #64420
string-width of … is 2 in CJK environments


Package: emacs;

Reported by: Dmitry Gutov <dmitry <at> gutov.dev>

Date: Sun, 2 Jul 2023 12:58:02 UTC

Severity: normal



Message #23 received at 64420 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 64420 <at> debbugs.gnu.org
Subject: Re: bug#64420: string-width of … is 2 in CJK environments
Date: Tue, 11 Jul 2023 05:13:57 +0300

On 07/07/2023 09:29, Eli Zaretskii wrote:
>> Date: Fri, 7 Jul 2023 05:13:50 +0300
>> Cc: 64420 <at> debbugs.gnu.org
>> From: Dmitry Gutov <dmitry <at> gutov.dev>
>>
>> On 02/07/2023 16:43, Eli Zaretskii wrote:
>>>> Is there some inherent reason why string-width differs from the result
>>>> of the above expression
>>> Because string-width doesn't consult the actual metrics of the font.
>>> It uses a char-table that we set "by hand".
>>
>> Would it be appropriate to fix the entry for … in that table either way?
> 
> "Fix" in what way?  In most language-environments we get
> 
>    (char-width ?…) => 1
> 
> What's wrong with that?

It returns 2 in Chinese-BIG5, while the actual metrics of the char don't 
change.
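
For anyone who wants to check this locally, here is a minimal sketch that reads the table entry while a given language environment is active. The helper name my/char-width-in-env is hypothetical (it is not part of Emacs), and set-language-environment has other side effects such as changing default coding systems, so this is best tried in a scratch session. The value returned may also depend on the Emacs version.

```elisp
;; Hypothetical helper for experimenting; not part of Emacs.
(defun my/char-width-in-env (char env)
  "Return `char-width' of CHAR while language environment ENV is active.
The previous language environment is restored afterwards."
  (let ((orig current-language-environment))
    (unwind-protect
        (progn
          (set-language-environment env)
          (char-width char))
      ;; Switch back even if something above signals an error.
      (set-language-environment orig))))

;; #x2026 is U+2026 HORIZONTAL ELLIPSIS, the character from this report.
(my/char-width-in-env #x2026 "Chinese-BIG5")
```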

>> Or does that not match the principle with which those entries are done?
> 
> Sorry, I don't understand the question: what principle are you talking
> about?

The principles by which we fill in said char-table "by hand", e.g. 
which characters to include and which to leave with "automatic" 
metrics.

>>>> and especially only does that on CJK?
>>> In CJK locales, most characters are double-width because those locales
>>> use fonts where the glyphs are wider.  Or at least this is the theory.
>>> string-pixel-width is free from these assumptions because it actually
>>> measures the font glyphs.
>>
>> I'm guessing it's somewhat slower because of that too
> 
> It isn't.  The entries in char-width-table are set up when you switch
> to the language-environment which requires that; see, for example,
> lisp/language/chinese.el where we call set-language-info-alist for any
> Chinese-* language-environment.

What I meant is, string-pixel-width must be slower than string-width 
because it uses a temp buffer and actual measurements, whereas the 
latter function only does a table lookup, more or less (N times).
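
To make the "table lookup, N times" point concrete, here is a rough sketch. my/table-width is a hypothetical approximation that just sums char-width per character; the real string-width also deals with things like compositions and display properties, so the two can differ. my/compare-width-timings is likewise a hypothetical helper for timing both functions with benchmark-run; I'm not quoting any numbers since they depend on the machine and the font.

```elisp
(require 'benchmark) ; `benchmark-run'
(require 'subr-x)    ; `string-pixel-width' (Emacs 29)

;; Hypothetical approximation of the table-lookup approach.
(defun my/table-width (str)
  "Sum `char-width' over the characters of STR."
  (apply #'+ (mapcar #'char-width str)))

;; Hypothetical helper: time N calls of each function on STR.
(defun my/compare-width-timings (str n)
  "Return a list of `benchmark-run' results for STR, N repetitions each.
The first element is for `string-width', the second for
`string-pixel-width'."
  (list (benchmark-run n (string-width str))
        (benchmark-run n (string-pixel-width str))))
```

Something like (my/compare-width-timings "some text" 1000) in a graphical session should show the difference, if there is one worth caring about.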

>>>> (defun company--string-width (str)
>>>>      (if (display-graphic-p)
>>>>          (ceiling (/ (string-pixel-width str)
>>>>                      (float (default-font-width))))
>>>>        (string-width str)))
>>> Yes, definitely.  (Actually, display-multi-font-p is better than
>>> display-graphic-p, but in practice they will return the same value.)
>>
>> Could you suggest a similar alternative to move-to-column?
> 
> Try this:
> 
>     (vertical-motion (cons (/ (float PIXELS) (default-font-width)) 0))

Thank you. I just used the column values I was already working with. I'm 
trying whole-pixelwise addressing in the next version, but the better 
precision seems to necessitate a whole new approach, using 
string-pixel-width and the space :width display spec. Seems to be 
working okay too, in my brief testing.
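
Roughly, the idea looks like the sketch below. This is not the actual code from the package; my/pad-to-pixel-width is a hypothetical name, and it assumes an Emacs that has string-pixel-width (Emacs 29). It pads a string to a target pixel width with a single space carrying a display property, where (N) in the :width value means N pixels.

```elisp
(require 'subr-x) ; `string-pixel-width' (Emacs 29)

;; Hypothetical helper illustrating the approach; not the package's code.
(defun my/pad-to-pixel-width (str target-pixels)
  "Return STR padded on the right to TARGET-PIXELS pixels wide.
If STR is already at least that wide, return it unchanged."
  (let ((missing (- target-pixels (string-pixel-width str))))
    (if (> missing 0)
        ;; One space stretched to exactly MISSING pixels.
        (concat str
                (propertize " " 'display `(space :width (,missing))))
      str)))
```

The padding only has a pixel-exact effect on graphical displays; on a text terminal one would still fall back to string-width and ordinary spaces.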




This bug report was last modified 2 years ago.
