GNU bug report logs - #64420
string-width of … is 2 in CJK environments


Package: emacs;

Reported by: Dmitry Gutov <dmitry <at> gutov.dev>

Date: Sun, 2 Jul 2023 12:58:02 UTC

Severity: normal


Message #41 received at 64420 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 64420 <at> debbugs.gnu.org
Subject: Re: bug#64420: string-width of … is 2 in CJK environments
Date: Wed, 12 Jul 2023 04:17:03 +0300

On 11/07/2023 21:45, Eli Zaretskii wrote:

>>> Once again, the assumption behind this "feature" of the CJK
>>> language-environments is that whoever uses those environments has the
>>> terminal emulators configured to use fonts where "…" and its ilk have
>>> double size.  Of course, if you just switch language-environment on a
>>> system that is otherwise configured for non-CJK locale, the terminal
>>> emulator fonts will not magically change, and you get what you see.
>>
>> Does "…" actually have double width in some of their fonts?
> 
> That's the assumption, yes.  (And not only this one character, you can
> see which characters we assume have the same width in the function I
> pointed out earlier in this thread, which we run when the
> language-environment is switched to something CJK.)  It was definitely
> correct at some point in the past, but the big question is whether it
> is still correct.  I don't know who can tell us that nowadays.

Whole ranges of characters, I see.

>> This report stems from an issue opened on GitHub for company-mode (see
>> the first message) by somebody who, as I understand it, hails from one
>> of those countries (I haven't confirmed which), and who apparently has
>> to work with the "Chinese-BIG5" language environment.
>>
>> Are you saying that they misconfigured their system somehow, e.g. that
>> Chinese-BIG5 is expected to be used with a certain set of default system
>> fonts which have "…" at double width?
> 
> Either their systems are misconfigured, or the assumption about the
> width of those characters is no longer true, at least not in a vast
> enough majority of cases.  If we cannot get definitive answers, maybe
> we should have an optional feature that disables the redefinition of
> char-width for characters that Unicode does not define as "wide", and
> then see whether someone still needs such tweaking of char-width.
> 
>>> If you worry that something in your package might not work well for
>>> some users due to this issue, how about giving them a user-level
>>> option to change the char-width of this character to 1?
>>
>> It's been suggested that we alter char-width-table dynamically too, as
>> one option. I was just hoping to clarify that we don't carry an
>> erroneous entry for this particular character.
> 
> Whether it's "erroneous" or not depends on what fonts are actually
> used.  char-width-table cannot know that, so we are guessing there.
> 
>> If we did, it would be easier for me to direct the users to the fix in
>> Emacs 29/30 and delay the rollout of the new popup rendering feature a
>> little. That feature will need a fair amount of testing anyway, given
>> the nature of the change.
> 
> We will not change the width in Emacs 29: that is too much for a
> release branch, definitely at this point in the release cycle.  For
> Emacs 30, if we want to change this, I'd rather do it as described
> above, leaving the "fire escape" to get back the old behavior.  It
> would be nice to hear from as many CJK users as possible which
> characters in the widely used fonts are really double-width -- this
> will help us decide what exactly to change in
> use-cjk-char-width-table.

All right. I'll try to get more info from the issue reporter, at least.
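
For illustration, here is a minimal sketch of the user-level override discussed above: forcing the width of "…" back to 1 by editing char-width-table. The names my-narrow-chars and my-force-narrow-chars are hypothetical and exist neither in Emacs nor in company-mode. The sketch also assumes that set-language-environment-hook runs after the CJK environment has installed its own width table, which should be verified on the Emacs version in use.

```elisp
;; Hypothetical user option: characters that should always count as
;; single-width, regardless of the language environment.
(defvar my-narrow-chars '(?…)
  "Characters whose `char-width' should be forced to 1.")

(defun my-force-narrow-chars ()
  "Set the width of every character in `my-narrow-chars' to 1.
This edits the currently active `char-width-table' in place."
  (dolist (ch my-narrow-chars)
    (set-char-table-range char-width-table ch 1)))

;; Assumption: this hook runs after the language environment's own
;; setup, so the override is reapplied after each switch.
(add-hook 'set-language-environment-hook #'my-force-narrow-chars)

;; Apply once for the current session as well.
(my-force-narrow-chars)
```

After this, (string-width "…") should agree with a terminal font that renders the ellipsis in a single column. Users whose fonts really draw it double-width would simply leave the override out.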

>> Further, string-pixel-width and buffer-text-pixel-size have only been
>> added in Emacs 29. Any chance you know some replacement I could use to
>> backport the functionality to work in Emacs 25 or 26?
>> buffer-text-pixel-size is defined in C.
> 
> You could use window-text-pixel-size instead.

Either I'm doing something wrong, or this function's behavior was
different in Emacs 28. There were some changes to it during Emacs 29's
development cycle, but I'm not sure which one would have that effect.

Anyway, with this definition:

(defun pixel-width (string)
  (if (zerop (length string))
      0
    ;; Keeping a work buffer around is more efficient than creating a
    ;; new temporary buffer.
    (with-current-buffer (get-buffer-create " *string-pixel-width*")
      ;; `display-line-numbers-mode' may be enabled in internal buffers,
      ;; which breaks the width calculation, so disable it (bug#59311).
      (when (bound-and-true-p display-line-numbers-mode)
        (display-line-numbers-mode -1))
      (delete-region (point-min) (point-max))
      (insert string)
      (save-window-excursion
        (set-window-buffer nil (current-buffer))
        (car
         (window-text-pixel-size nil nil nil t))))))

In Emacs 29, (pixel-width "abc") returns 54 here (on a 4K screen).

But no matter what I do, it returns 0 in my Emacs 28.2 (from the
official tarball).

To get some more info: if I remove the 'car' call, the value that 
window-text-pixel-size returns is (54 . 36) in Emacs 29 and (0 . 108) in 
Emacs 28.2.




This bug report was last modified 2 years and 1 day ago.

