From unknown Sat Aug 16 13:49:44 2025
Subject: bug#50336: Width format specifier is calculated wrong for nb_NO locale
From: Carl-Erik Kopseng
To: 50336@debbugs.gnu.org
Date: Thu, 2 Sep 2021 14:49:43 +0200
X-GNU-PR-Message: report 50336
X-GNU-PR-Package: coreutils

I just noticed that the width specifier for numeric parameters does some
weird calculations when the specified locale is `nb_NO.utf8`. For instance,
the number formatting rules for US and NO both result in the same number of
characters (with ' ' instead of ','), but the Norwegian version lacks two
spaces in the padded output. This must be a bug, no?

```
$ LC_NUMERIC=en_US.utf8 printf "%'7d%s\n" 1234 XXX
  1,234XXX

$ LC_NUMERIC=nb_NO.utf8 printf "%'7d%s\n" 1234 XXX
1 234XXX
```

Package: coreutils
Essential: yes
Priority: required
Section: utils
Installed-Size: 8248
Origin: Ubuntu
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Architecture: amd64
Multi-Arch: foreign
Version: 8.32-4ubuntu2
Pre-Depends: libacl1 (>= 2.2.23), libattr1 (>= 1:2.4.44), libc6 (>= 2.32), libgmp10, libselinux1 (>= 3.1~)
Filename: pool/main/c/coreutils/coreutils_8.32-4ubuntu2_amd64.deb
Size: 1353100
MD5sum: 1818b348429f95bffb99fe80bf965b5c
Description: GNU core utilities
Original-Maintainer: Michael Stone <mstone@debian.org>
SHA1: d56d97d420f317d7988e77e957bce95c630bc20a
SHA256: 1c04fd5a7d4f343beed3b56e37c105f5dffbc58728ab3a6d6bd05bfab8ab289c
SHA512: b2ef6a601307cadfcd6bf072e59a306d54267294bdd635fab44f70b0cdccb871b64c15644fac0fdd342e4a1eb19145f155588ba97ede04ba683d5bf72c9995c4
Homepage: http://gnu.org/software/coreutils
Task: minimal
Description-md5: d0d975dec3625409d24be1238cede238

-- 
Carl-Erik Kopseng

From unknown Sat Aug 16 13:49:44 2025
Subject: bug#50336: Width format specifier is calculated wrong for nb_NO locale
From: Pádraig Brady
To: Carl-Erik Kopseng, 50336@debbugs.gnu.org
Date: Thu, 2 Sep 2021 16:14:54 +0100
Message-ID: <3235a22e-21ce-c913-43eb-ebd662b14c9b@draigBrady.com>
X-GNU-PR-Message: followup 50336
X-GNU-PR-Package: coreutils

tag 50336 notabug
close 50336
stop

On 02/09/2021 13:49, Carl-Erik Kopseng wrote:
> I just noticed that the width specifier for numeric parameters does some
> weird calculations when the specified locale is `nb_NO.utf8`. For instance,
> the number formatting rules for US and NO both result in the same number of
> characters (with ' ' instead of ','), but the Norwegian version lacks two
> spaces in the padded output. This must be a bug, no?
>
> ```
> $ LC_NUMERIC=en_US.utf8 printf "%'7d%s\n" 1234 XXX
>   1,234XXX
>
> $ LC_NUMERIC=nb_NO.utf8 printf "%'7d%s\n" 1234 XXX
> 1 234XXX
> ```

Note one must be careful with printf, as a shell builtin is often what
actually runs. That is not at issue here though, as both the shell and
coreutils call down to the libc printf implementation.

The particular issue is that the grouping character used in the nb_NO.utf8
locale is multi-byte. Specifically: e2 80 af (U+202F NARROW NO-BREAK SPACE
encoded as UTF-8). That character counts as 3 bytes, and the printf
implementation is counting bytes, not characters or display cells. The
formatted number is therefore already 7 bytes long, so a field width of 7
adds no padding. Given the usual consideration is display width, it probably
should be counting display cells, but that's an issue for libc, not
coreutils.

Note coreutils does need to handle alignment in various places, and for
that it uses the following module to handle this more generally:
https://github.com/coreutils/coreutils/blob/master/gl/lib/mbsalign.c

Closing this as not a coreutils-specific bug.

cheers,
Pádraig
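
The byte-versus-character counting described in the reply can be reproduced without the nb_NO locale installed. The following is a minimal sketch in Python, not the libc code: `pad_by_bytes` and `pad_by_chars` are hypothetical helper names introduced only for illustration, and counting code points is used here as a stand-in for display cells, which happens to agree for this particular string but is not a general width calculation.

```python
# Illustration only: mimics the padding arithmetic, not glibc's implementation.

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE; its UTF-8 encoding is e2 80 af


def pad_by_bytes(text: str, width: int) -> str:
    """Right-align by counting UTF-8 bytes, as the reply says printf does."""
    missing = max(0, width - len(text.encode("utf-8")))
    return " " * missing + text


def pad_by_chars(text: str, width: int) -> str:
    """Right-align by counting code points (an assumed proxy for display cells)."""
    missing = max(0, width - len(text))
    return " " * missing + text


us = "1,234"                # en_US grouping: 5 characters, 5 bytes
no = "1" + NNBSP + "234"    # nb_NO grouping: 5 characters, 7 bytes

for label, value in (("en_US", us), ("nb_NO", no)):
    print(label, "chars:", len(value), "bytes:", len(value.encode("utf-8")))

# Byte counting pads the en_US string but leaves the nb_NO string untouched,
# because it already occupies the full 7-byte field.
print(repr(pad_by_bytes(us, 7)))
print(repr(pad_by_bytes(no, 7)))

# Counting characters instead restores the two leading spaces.
print(repr(pad_by_chars(no, 7)))
```

For real alignment work, a width function that accounts for combining and double-width characters would be needed; the mbsalign module linked in the reply is what coreutils itself uses for that purpose.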