From unknown Sun Aug 17 10:23:00 2025
Subject: bug#63029: [BUG?] format inconsistency in deciding string widths on different locales
From: Ruijie Yu
To: 63029@debbugs.gnu.org
X-Debbugs-Original-To: bug-gnu-emacs@gnu.org
Date: Sun, 23 Apr 2023 18:23:02 +0800
User-agent: mu4e 1.9.22; emacs 30.0.50
Content-Type: text/plain; charset=utf-8

Hello,

I don't quite know yet whether this is a bug in Emacs. Here are the
observed results, and note the unicode character:

--8<---------------cut here---------------start------------->8---
$ for locale in {en_US,fr_FR,de_DE,zh_CN,ja_JA}.UTF-8; do
    printf "$locale\t"
    LANG="$locale" src/emacs -Q -batch \
        -eval '(message "%S" (format "%-5.5s" "1234…"))'
  done
--8<---------------cut here---------------end--------------->8---

This results in the following output:

--8<---------------cut here---------------start------------->8---
en_US.UTF-8 "1234…"
fr_FR.UTF-8 "1234…"
de_DE.UTF-8 "1234…"
zh_CN.UTF-8 "1234 "
ja_JA.UTF-8 "1234 "
--8<---------------cut here---------------end--------------->8---

Notice that in zh_CN and ja_JA, we have a space instead of the expected
ellipsis character.

If this is expected behavior, how do we know how "wide" the `format'
function thinks any given character is? In other words, why _does_ it
think "…" should be two-character wide? And how do we, the elisp users,
get this information?

I tried to dive into the C code for `styled_format', but got lost.

Thanks.

----------
Reproduced on this in-source build:

In GNU Emacs 30.0.50 (build 2, x86_64-pc-linux-gnu, GTK+ Version
3.24.37, cairo version 1.17.8) of 2023-04-23 built on fw.net.yu
Repository revision: 3badd2358d5f0af71887ee1cc9d39c2f312b6888
Repository branch: master
System Description: Arch Linux

Configured using:
 'configure --sysconfdir=/etc --prefix=/usr --localstatedir=/var
 --with-cairo --with-harfbuzz --with-libsystemd --with-modules
 --with-pgtk --with-native-compilation CFLAGS=-Og'

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSYSTEMD LIBXML2 MODULES NATIVE_COMP NOTIFY INOTIFY
PDUMPER PGTK PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS TREE_SITTER WEBP XIM GTK3 ZLIB

-- 
Best,

RY

[Please note that this mail might go to spam due to some
misconfiguration in my mail server -- still investigating.]

From unknown Sun Aug 17 10:23:00 2025
Subject: bug#63029: [BUG?] format inconsistency in deciding string widths on different locales
From: Ihor Radchenko
To: Ruijie Yu
Cc: 63029@debbugs.gnu.org
Date: Sun, 23 Apr 2023 11:06:38 +0000
Message-ID: <87a5yyuabl.fsf@localhost>
Content-Type: text/plain; charset=utf-8

Ruijie Yu via "Bug reports for GNU Emacs, the Swiss army knife of text
editors" writes:

>         -eval '(message "%S" (format "%-5.5s" "1234…"))'
> ...
> en_US.UTF-8 "1234…"
> fr_FR.UTF-8 "1234…"
> de_DE.UTF-8 "1234…"
> zh_CN.UTF-8 "1234 "
> ja_JA.UTF-8 "1234 "

Context: https://orgmode.org/list/sdv7cu4ugk2.fsf@netyu.xyz

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at .
Support Org development at ,
or support my work at

From unknown Sun Aug 17 10:23:00 2025
Subject: bug#63029: [BUG?]
 format inconsistency in deciding string widths on different locales
From: Ihor Radchenko
To: Ruijie Yu
Cc: 63029@debbugs.gnu.org
Date: Sun, 23 Apr 2023 11:08:33 +0000
Message-ID: <877cu2ua8e.fsf@localhost>
Content-Type: text/plain; charset=utf-8

Ruijie Yu via "Bug reports for GNU Emacs, the Swiss army knife of text
editors" writes:

> en_US.UTF-8 "1234…"
> fr_FR.UTF-8 "1234…"
> de_DE.UTF-8 "1234…"
> zh_CN.UTF-8 "1234 "
> ja_JA.UTF-8 "1234 "

I can reproduce on the latest master, Emacs 28, Emacs 27, and Emacs 26.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at .
Support Org development at ,
or support my work at

From unknown Sun Aug 17 10:23:00 2025
Subject: bug#63029: [BUG?] format inconsistency in deciding string widths on different locales
From: Eli Zaretskii
To: Ruijie Yu
Cc: 63029@debbugs.gnu.org
Date: Sun, 23 Apr 2023 17:19:10 +0300
Message-Id: <83bkjeznoh.fsf@gnu.org>
Content-type: text/plain; charset=utf-8

> Date: Sun, 23 Apr 2023 18:23:02 +0800
> From: Ruijie Yu via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors"
>
> I don't quite know yet whether this is a bug in Emacs. Here are the
> observed results, and note the unicode character:
>
> --8<---------------cut here---------------start------------->8---
> $ for locale in {en_US,fr_FR,de_DE,zh_CN,ja_JA}.UTF-8; do
>     printf "$locale\t"
>     LANG="$locale" src/emacs -Q -batch \
>         -eval '(message "%S" (format "%-5.5s" "1234…"))'
>   done
> --8<---------------cut here---------------end--------------->8---
>
> This results in the following output:
>
> --8<---------------cut here---------------start------------->8---
> en_US.UTF-8 "1234…"
> fr_FR.UTF-8 "1234…"
> de_DE.UTF-8 "1234…"
> zh_CN.UTF-8 "1234 "
> ja_JA.UTF-8 "1234 "
> --8<---------------cut here---------------end--------------->8---
>
> Notice that in zh_CN and ja_JA, we have a space instead of the expected
> ellipsis character.
>
>
> If this is expected behavior, how do we know how "wide" the `format'
> function thinks any given character is? In other words, why _does_ it
> think "…" should be two-character wide?

This is a kludgey feature: in CJK locales some characters are always
considered double-width. See code in characters.el that begins with a
comment around line 1140. The function use-cjk-char-width-table
defined there is invoked (via the setup-function of the language
environment) when the language environment in Emacs is set to one of
those CJK locales.

The reason for this is that in CJK fonts these characters are supposed
to be rendered using full-width glyphs.

See also bug#54138 and
https://lists.gnu.org/archive/html/emacs-devel/2022-02/msg00917.html.

> And how do we, the elisp users, get this information?

I don't understand this question. Please elaborate: what information
do you want to get, besides the width of the characters (which is
accessible via char-width-table).

From unknown Sun Aug 17 10:23:00 2025
Subject: bug#63029: [BUG?]
 format inconsistency in deciding string widths on different locales
From: Ruijie Yu
To: Eli Zaretskii
Cc: 63029@debbugs.gnu.org
Date: Sun, 23 Apr 2023 22:23:16 +0800
References: <83bkjeznoh.fsf@gnu.org>
In-reply-to: <83bkjeznoh.fsf@gnu.org>
User-agent: mu4e 1.9.22; emacs 30.0.50
Content-Type: text/plain; charset=utf-8

Eli Zaretskii writes:

>> If this is expected behavior, how do we know how "wide" the `format'
>> function thinks any given character is? In other words, why _does_ it
>> think "…" should be two-character wide?
>
> This is a kludgey feature: in CJK locales some characters are always
> considered double-width. See code in characters.el that begins with a
> comment around line 1140. The function use-cjk-char-width-table
> defined there is invoked (via the setup-function of the language
> environment) when the language environment in Emacs is set to one of
> those CJK locales.
>
> The reason for this is that in CJK fonts these characters are supposed
> to be rendered using full-width glyphs.
>
> See also bug#54138 and
> https://lists.gnu.org/archive/html/emacs-devel/2022-02/msg00917.html.

Thanks for the link. I have found the answer in your response there.

>> And how do we, the elisp users, get this information?
>
> I don't understand this question. Please elaborate: what information
> do you want to get, besides the width of the characters (which is
> accessible via char-width-table).

You mentioning `char-width-table' here and `char-width' on the linked
thread precisely answered my question. I was looking for `char-width'
without knowing its name. Thanks.

-- 
Best,

RY

[Please note that this mail might go to spam due to some
misconfiguration in my mail server -- still investigating.]

From unknown Sun Aug 17 10:23:00 2025
Subject: bug#63029: [BUG?]
 format inconsistency in deciding string widths on different locales
From: Eli Zaretskii
To: Ruijie Yu
Cc: 63029@debbugs.gnu.org
Date: Sun, 23 Apr 2023 17:32:43 +0300
Message-Id: <838reizn1w.fsf@gnu.org>
In-Reply-To: (message from Ruijie Yu on Sun, 23 Apr 2023 22:23:16 +0800)
References: <83bkjeznoh.fsf@gnu.org>

> From: Ruijie Yu
> Cc: 63029@debbugs.gnu.org
> Date: Sun, 23 Apr 2023 22:23:16 +0800
>
>
> Eli Zaretskii writes:
>
> > See also bug#54138 and
> > https://lists.gnu.org/archive/html/emacs-devel/2022-02/msg00917.html.
>
> Thanks for the link. I have found the answer in your response there.
>
> >> And how do we, the elisp users, get this information?
> >
> > I don't understand this question. Please elaborate: what information
> > do you want to get, besides the width of the characters (which is
> > accessible via char-width-table).
>
> You mentioning `char-width-table' here and `char-width' on the linked
> thread precisely answered my question. I was looking for `char-width'
> without knowing its name. Thanks.

OK, so can we close this issue?

Btw, the recommended method of computing the width of a string is via
string-pixel-width.

From unknown Sun Aug 17 10:23:00 2025
MIME-Version: 1.0
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: Ruijie Yu
Subject: bug#63029: closed (Re: bug#63029: [BUG?] format inconsistency in deciding string widths on different locales)
Reply-To: 63029@debbugs.gnu.org
Date: Sun, 23 Apr 2023 14:40:01 +0000
Content-Type: multipart/mixed; boundary="----------=_1682260801-8350-1"

This is a multi-part message in MIME format...
------------=_1682260801-8350-1
Content-Disposition: inline
Content-Type: text/plain; charset="utf-8"

Your bug report

#63029: [BUG?] format inconsistency in deciding string widths on different locales

which was filed against the emacs package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 63029@debbugs.gnu.org.

-- 
63029: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=63029
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1682260801-8350-1
Content-Type: message/rfc822
Content-Disposition: inline

From: Ruijie Yu
To: Eli Zaretskii
Cc: 63029-done@debbugs.gnu.org
Subject: Re: bug#63029: [BUG?] format inconsistency in deciding string widths on different locales
Date: Sun, 23 Apr 2023 22:38:33 +0800
References: <83bkjeznoh.fsf@gnu.org> <838reizn1w.fsf@gnu.org>
In-reply-to: <838reizn1w.fsf@gnu.org>
User-agent: mu4e 1.9.22; emacs 30.0.50
Content-Type: text/plain

Eli Zaretskii writes:

> OK, so can we close this issue?

We can -- done.

> Btw, the recommended method of computing the width of a string is via
> string-pixel-width.

Will take a look at this function. Thanks.

-- 
Best,

RY

[Please note that this mail might go to spam due to some
misconfiguration in my mail server -- still investigating.]

------------=_1682260801-8350-1--
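
The width lookup discussed in this thread can be sketched in Emacs Lisp
roughly as follows. This is a minimal sketch and not code from the
thread: `my-format-width-report' is a hypothetical helper name, and the
values it returns depend on the language environment Emacs derived from
the locale. It relies only on `char-width', `string-width',
`truncate-string-to-width' and `format'.

```elisp
;; Sketch only: `my-format-width-report' is a hypothetical helper,
;; not part of Emacs and not taken from the thread.
(defun my-format-width-report (string)
  "Return a plist describing how wide Emacs considers STRING.
The values depend on the current `char-width-table', which CJK
language environments extend so that some characters count as
two columns."
  (list
   ;; Column width of each character, looked up one by one.
   :char-widths (mapcar #'char-width string)
   ;; Column width of the whole string.
   :string-width (string-width string)
   ;; The same call as in the report.
   :formatted (format "%-5.5s" string)
   ;; Explicit truncation to at most 5 columns.
   :truncated (truncate-string-to-width string 5)))

(message "%S" (my-format-width-report "1234…"))
```

Running this under different LANG values, as in the loop from the
original report, should show where the difference comes from: according
to the thread, the ellipsis counts as one column in the en_US, fr_FR and
de_DE runs and as two columns in the zh_CN and ja_JA runs, which is why
`format' drops it and pads with a space there.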