From unknown Mon Aug 18 18:01:43 2025
From: Ulrich Mueller
To: 36887@debbugs.gnu.org
Cc: base-system@gentoo.org
Subject: bug#36887: coreutils-8.31: printf chokes on \u0041
Date: Thu, 01 Aug 2019 13:02:26 +0200

[Forwarding bug https://bugs.gentoo.org/680244 as requested by the
Gentoo package maintainer.]

According to printf(1):

   Interpreted sequences are:
   [...]

   \uHHHH Unicode (ISO/IEC 10646) character with hex value HHHH (4 digits)

   \UHHHHHHHH
          Unicode character with hex value HHHHHHHH (8 digits)

It does not work, though:

   $ /usr/bin/printf '\u0041\n'
   /usr/bin/printf: invalid universal character name \u0041
   $ /usr/bin/printf '\U00000041\n'
   /usr/bin/printf: invalid universal character name \U00000041

Other tools interpret the sequence correctly:

   $ printf '\u0041\n'   # bash
   A
   $ echo -e '\u0041'    # bash
   A
   $ zsh -c "echo -e '\u0041'"
   A
   $ emacs -Q --batch --eval '(princ "\u0041\n")'
   A
   $ python -c "print ('\u0041')"
   A
   $ ruby -e 'print("\u0041\n")'
   A

From unknown Mon Aug 18 18:01:43 2025
From: Pádraig Brady
To: Ulrich Mueller, 36887@debbugs.gnu.org
Cc: base-system@gentoo.org
Subject: bug#36887: coreutils-8.31: printf chokes on \u0041
Date: Thu, 1 Aug 2019 14:09:08 +0100
Message-ID: <4041c88e-d9cb-ad28-df50-7cefde550733@draigBrady.com>

On 01/08/19 12:02, Ulrich Mueller wrote:
> [Forwarding bug https://bugs.gentoo.org/680244 as requested by the
> Gentoo package maintainer.]
>
> According to printf(1):
>
>    Interpreted sequences are:
>    [...]
>
>    \uHHHH Unicode (ISO/IEC 10646) character with hex value HHHH (4 digits)
>
>    \UHHHHHHHH
>           Unicode character with hex value HHHHHHHH (8 digits)
>
> It does not work, though:
>
>    $ /usr/bin/printf '\u0041\n'
>    /usr/bin/printf: invalid universal character name \u0041
>    $ /usr/bin/printf '\U00000041\n'
>    /usr/bin/printf: invalid universal character name \U00000041
>
> Other tools interpret the sequence correctly:
>
>    $ printf '\u0041\n'   # bash
>    A
>    $ echo -e '\u0041'    # bash
>    A
>    $ zsh -c "echo -e '\u0041'"
>    A
>    $ emacs -Q --batch --eval '(princ "\u0041\n")'
>    A
>    $ python -c "print ('\u0041')"
>    A
>    $ ruby -e 'print("\u0041\n")'
>    A

I agree this is a bit surprising.

The full manual states:
"Unicode characters in the ranges
U+0000...U+009F, U+D800...U+DFFF cannot be specified by this syntax,
except for U+0024 ($), U+0040 (@), and U+0060 (`)."

This was previously discussed at:
https://lists.gnu.org/archive/html/bug-coreutils/2008-05/threads.html#00067

From unknown Mon Aug 18 18:01:43 2025
From: Ulrich Mueller
To: Pádraig Brady
Cc: base-system@gentoo.org, 36887@debbugs.gnu.org
Subject: bug#36887: coreutils-8.31: printf chokes on \u0041
Date: Thu, 01 Aug 2019 22:18:41 +0200
References: <4041c88e-d9cb-ad28-df50-7cefde550733@draigBrady.com>
In-Reply-To: <4041c88e-d9cb-ad28-df50-7cefde550733@draigBrady.com>
 ("Pádraig Brady"'s message of "Thu, 1 Aug 2019 14:09:08 +0100")

>>>>> On Thu, 01 Aug 2019, Pádraig Brady wrote:

> I agree this is a bit surprising.

Indeed, it most certainly violates the principle of least surprise.
Especially, it means that a shell script that will run in bash won't
run in a shell that doesn't have a built-in printf.

> The full manual states:
> "Unicode characters in the ranges
> U+0000...U+009F, U+D800...U+DFFF cannot be specified by this syntax,
> except for U+0024 ($), U+0040 (@), and U+0060 (`)."

> This was previously discussed at:
> https://lists.gnu.org/archive/html/bug-coreutils/2008-05/threads.html#00067

So, there are reasons for this restriction in C99. However, I fail to
see how those reasons would apply to printf. Except for the surrogates
U+D800...U+DFFF, it looks like an arbitrary restriction, which only
makes the printf implementation incompatible with other GNU programs
(like Bash and Emacs).

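The range rule quoted from the manual can be written out as a small check. This is a minimal sketch in POSIX shell, not the coreutils implementation. The function name `ucn_allowed` is hypothetical, and it assumes the code point is passed as bare hex digits.

```shell
# Hypothetical helper (not part of coreutils): succeed when the code point,
# given as hex digits without a prefix, may be written as \uHHHH or
# \UHHHHHHHH under the rule quoted from the manual above.
ucn_allowed() {
  cp=$((0x$1))
  case $cp in
    36|64|96) return 0 ;;   # U+0024 ($), U+0040 (@), U+0060 (grave accent)
  esac
  if [ "$cp" -le 159 ]; then
    return 1                # U+0000...U+009F are rejected
  fi
  if [ "$cp" -ge 55296 ] && [ "$cp" -le 57343 ]; then
    return 1                # U+D800...U+DFFF (surrogates) are rejected
  fi
  return 0
}
```

Under this sketch `ucn_allowed 0041` fails, while `ucn_allowed 0040` and `ucn_allowed 00e9` succeed, which is consistent with the error reported for `\u0041`.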
From unknown Mon Aug 18 18:01:43 2025
From: Paul Eggert
Organization: UCLA Computer Science Department
To: Ulrich Mueller, Pádraig Brady
Cc: base-system@gentoo.org, 36887@debbugs.gnu.org
Subject: bug#36887: coreutils-8.31: printf chokes on \u0041
Date: Thu, 1 Aug 2019 16:37:44 -0700
References: <4041c88e-d9cb-ad28-df50-7cefde550733@draigBrady.com>

Ulrich Mueller wrote:
> Except for the surrogates
> U+D800...U+DFFF, it looks like an arbitrary restriction

It's not entirely arbitrary. Because of the restriction, coreutils printf
doesn't have to worry about what this command should do:

printf '\u0025d\n' 1 2

Does this print a single line "%d", or two lines "1" and "2"? There are good
arguments either way, and one can easily construct even-stranger examples.

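The same byte, `%` (0x25), is also reachable through the octal escape `\045`, which printf already accepts, so the existing behaviour for that escape can be observed directly. This is a minimal sketch, assuming a POSIX-style printf in which a backslash escape in the format operand is written out as a byte rather than re-read as the start of a conversion. Standard error is discarded because some implementations may warn about the unused arguments.

```shell
# The format operand holds the octal escape for '%' followed by 'd'.
# Capture what printf writes; the arguments 1 and 2 stay unused when the
# escaped '%' is treated as a literal byte.
out=$(printf '\045d\n' 1 2 2>/dev/null)
printf 'printf wrote: %s\n' "$out"
```

With bash's builtin, and with implementations that follow the POSIX description of `\ddd`, this is expected to report `%d` once rather than `1` and `2`.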
From unknown Mon Aug 18 18:01:43 2025
From: Ulrich Mueller
To: Paul Eggert
Cc: base-system@gentoo.org, Pádraig Brady, 36887@debbugs.gnu.org
Subject: bug#36887: coreutils-8.31: printf chokes on \u0041
Date: Fri, 02 Aug 2019 10:00:03 +0200
References: <4041c88e-d9cb-ad28-df50-7cefde550733@draigBrady.com>
In-Reply-To: (Paul Eggert's message of "Thu, 1 Aug 2019 16:37:44 -0700")

>>>>> On Fri, 02 Aug 2019, Paul Eggert wrote:

> It's not entirely arbitrary. Because of the restriction, coreutils
> printf doesn't have to worry about what this command should do:

> printf '\u0025d\n' 1 2

Seems quite obvious, it should do the same as these commands:

printf '\045d\n' 1 2
printf '\x25d\n' 1 2

This is different from C behaviour, because printf(3) doesn't deal with
backslash escapes at all, which are interpreted earlier during parsing
of the string literal. That's why I think the C reasoning doesn't apply
here.

From unknown Mon Aug 18 18:01:43 2025
From: L A Walsh
To: Paul Eggert
Cc: Ulrich Mueller, base-system@gentoo.org, Pádraig Brady,
 36887@debbugs.gnu.org
Subject: bug#36887: coreutils-8.31: printf chokes on \u0041
Date: Fri, 02 Aug 2019 03:14:51 -0700
Message-ID: <5D440D1B.2020004@tlinx.org>
References: <4041c88e-d9cb-ad28-df50-7cefde550733@draigBrady.com>

On 2019/08/01 16:37, Paul Eggert wrote:
> Ulrich Mueller wrote:
>
>> Except for the surrogates
>> U+D800...U+DFFF, it looks like an arbitrary restriction
>>
>
> It's not entirely arbitrary. Because of the restriction, coreutils printf
> doesn't have to worry about what this command should do:
>
> printf '\u0025d\n' 1 2
>
> Does this print a single line "%d", or two lines "1" and "2"? There are good
> arguments either way, and one can easily construct even-stranger examples.
>
There are no format characters in the initial line, so only the 1st
argument is interpreted. You can't do multiple interpretations since
if you do there's no stopping point, (i.e. a hex-encode of a hex-encode
of '%d\n')

From unknown Mon Aug 18 18:01:43 2025
From: Ulrich Mueller
To: Pádraig Brady
Cc: 36887@debbugs.gnu.org
Subject: bug#36887: coreutils-8.31: printf chokes on \u0041
Date: Wed, 07 Jun 2023 16:16:28 +0200

Can this bug be closed? AFAICS it is fixed since coreutils-9.2.

Relevant commit:
https://git.savannah.gnu.org/cgit/coreutils.git/commit/src/printf.c?id=0925e8a0f413ecf9004153d89b312b385b20d0ee

From unknown Mon Aug 18 18:01:43 2025
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: Ulrich Mueller
Subject: bug#36887: closed (Re: bug#36887: coreutils-8.31: printf chokes on
 \u0041)
References: <5b24e888-d47b-be16-4e46-8bf3e6bff425@draigBrady.com>
Reply-To: 36887@debbugs.gnu.org
Date: Wed, 07 Jun 2023 14:58:02 +0000
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----------=_1686149882-15589-1"

This is a multi-part message in MIME format...

------------=_1686149882-15589-1
Content-Type: text/plain; charset="utf-8"

Your bug report

#36887: coreutils-8.31: printf chokes on \u0041

which was filed against the coreutils package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 36887@debbugs.gnu.org.

-- 
36887: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=36887
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1686149882-15589-1
Content-Type: message/rfc822

From: Pádraig Brady
To: Ulrich Mueller
Cc: 36887-done@debbugs.gnu.org
Subject: Re: bug#36887: coreutils-8.31: printf chokes on \u0041
Date: Wed, 7 Jun 2023 15:57:01 +0100
Message-ID: <5b24e888-d47b-be16-4e46-8bf3e6bff425@draigBrady.com>

On 07/06/2023 15:16, Ulrich Mueller wrote:
> Can this bug be closed? AFAICS it is fixed since coreutils-9.2.
>
> Relevant commit:
> https://git.savannah.gnu.org/cgit/coreutils.git/commit/src/printf.c?id=0925e8a0f413ecf9004153d89b312b385b20d0ee

Marked as done.

thanks!
Pádraig

------------=_1686149882-15589-1
Content-Type: message/rfc822

From: Ulrich Mueller
To: bug-coreutils@gnu.org
Cc: base-system@gentoo.org
Subject: coreutils-8.31: printf chokes on \u0041
Date: Thu, 01 Aug 2019 13:02:26 +0200

[Forwarding bug https://bugs.gentoo.org/680244 as requested by the
Gentoo package maintainer.]

According to printf(1):

   Interpreted sequences are:
   [...]

   \uHHHH Unicode (ISO/IEC 10646) character with hex value HHHH (4 digits)

   \UHHHHHHHH
          Unicode character with hex value HHHHHHHH (8 digits)

It does not work, though:

   $ /usr/bin/printf '\u0041\n'
   /usr/bin/printf: invalid universal character name \u0041
   $ /usr/bin/printf '\U00000041\n'
   /usr/bin/printf: invalid universal character name \U00000041

Other tools interpret the sequence correctly:

   $ printf '\u0041\n'   # bash
   A
   $ echo -e '\u0041'    # bash
   A
   $ zsh -c "echo -e '\u0041'"
   A
   $ emacs -Q --batch --eval '(princ "\u0041\n")'
   A
   $ python -c "print ('\u0041')"
   A
   $ ruby -e 'print("\u0041\n")'
   A

------------=_1686149882-15589-1--
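The thread notes that a script relying on `\u0041` works with bash's builtin printf but not with every standalone printf, so a script that must run on both can probe before relying on it. This is a minimal sketch in POSIX shell; the function name `print_capital_a` is hypothetical. It assumes only that an implementation lacking `\u` support either fails or writes something other than `A`.

```shell
# Hypothetical helper: write "A" and a newline, using \u0041 only when the
# printf found by this shell accepts it, else the octal escape \101.
print_capital_a() {
  if out=$(printf '\u0041' 2>/dev/null) && [ "$out" = "A" ]; then
    printf '%s\n' "$out"
  else
    printf '\101\n'
  fi
}

print_capital_a
```

The octal form is the fallback because `\ddd` is the escape the thread itself cites as already working (`\045`), whereas `\x` and `\u` support varies between implementations.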