GNU bug report logs - #54388
printf doesn't handle multi-byte values

Package: coreutils;

Reported by: Pádraig Brady <P <at> draigBrady.com>

Date: Mon, 14 Mar 2022 15:39:02 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Pádraig Brady <P <at> draigBrady.com>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#54388: closed (printf doesn't handle multi-byte values)
Date: Fri, 18 Mar 2022 15:01:01 +0000
[Message part 1 (text/plain, inline)]
Your message dated Fri, 18 Mar 2022 14:59:42 +0000
with message-id <fd2fd96e-f1d6-40a3-85a1-18817c15400d <at> draigBrady.com>
and subject line Re: bug#54388: printf doesn't handle multi-byte values
has caused the debbugs.gnu.org bug report #54388,
regarding printf doesn't handle multi-byte values
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
54388: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=54388
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Pádraig Brady <P <at> draigBrady.com>
To: Christoph Anton Mitterer <calestyo <at> scientia.org>,
 Report bugs to <bug-coreutils <at> gnu.org>
Subject: printf doesn't handle multi-byte values
Date: Mon, 14 Mar 2022 15:38:23 +0000
On 14/03/2022 03:27, Christoph Anton Mitterer wrote:
> Hey Pádraig.
> 
> I just wanted to ask whether the following could be a bug in printf.
> 
> POSIX says [0] that, e.g.:
>     printf '%d\n' \"3
> should give the numeric value of the character, and that "in a locale
> with multi-byte characters, the value of a character is intended to be
> the value of the equivalent of the wchar_t representation of the
> character".
> 
> In bash:
> $ printf '%d\n' $'"\u2208'
> 8712
> 
> Here printf is bash's built-in, and it works.
> 
> 
> But using GNU coreutils' printf (version 8.32):
> $ /usr/bin/printf '%d\n' $'"\u2208'
> /usr/bin/printf: warning: ��: character(s) following character constant have been ignored
> 226
> 
> 
> Do I have some wrong assumptions or should I report that as a bug?
> 
> 
> Thanks,
> Chris.
> 
> 
> [0] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html

This is a limitation of the current coreutils printf, which only handles single-byte characters.
This email will open an issue in our bug tracker.

To summarize:
$ ord() { printf "0x%x\n" "'$1"; }  # bash's printf
$ ord 3
0x33
$ ord $'\u2208'
0x2208

$ ord() { env printf "0x%x\n" "'$1"; }  # coreutils' printf
$ ord 3
0x33
$ ord $'\u2208'
0xprintf: warning: ��: character(s) following character constant have been ignored
e2
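
The two results differ because U+2208 is encoded in UTF-8 as the three bytes e2 88 88: coreutils' printf reports only the first byte (0xe2, i.e. 226), while bash reports the decoded code point (0x2208, i.e. 8712). A minimal sketch of that decoding, assuming well-formed UTF-8 input, is shown here. The helper name utf8_ord is hypothetical and is not part of bash or coreutils; it only illustrates the byte arithmetic and is not how either printf is implemented.

```shell
# utf8_ord: hypothetical helper (not from bash or coreutils).
# Prints the decimal code point of the first UTF-8 sequence in $1.
# Assumes well-formed UTF-8; continuation bytes are not validated.
utf8_ord() {
    # od prints each byte as an unsigned decimal, regardless of locale.
    set -- $(printf '%s' "$1" | od -An -tu1)
    [ "$#" -gt 0 ] || return 1          # empty input
    b1=$1
    if [ "$b1" -lt 128 ]; then          # 1-byte sequence (ASCII)
        cp=$b1
    elif [ "$b1" -ge 240 ]; then        # 4-byte sequence: 11110xxx
        cp=$(( ((b1 & 7) << 18) | (($2 & 63) << 12) | (($3 & 63) << 6) | ($4 & 63) ))
    elif [ "$b1" -ge 224 ]; then        # 3-byte sequence: 1110xxxx
        cp=$(( ((b1 & 15) << 12) | (($2 & 63) << 6) | ($3 & 63) ))
    elif [ "$b1" -ge 192 ]; then        # 2-byte sequence: 110xxxxx
        cp=$(( ((b1 & 31) << 6) | ($2 & 63) ))
    else
        return 1                        # stray continuation byte
    fi
    printf '%d\n' "$cp"
}

# U+2208 written as explicit UTF-8 bytes (octal e2 88 88), so the
# example does not depend on the shell's $'\u...' support or locale.
elem=$(printf '\342\210\210')
utf8_ord 3          # prints 51   (0x33)
utf8_ord "$elem"    # prints 8712 (0x2208)
```

For the three-byte case the arithmetic is (0xe2 & 0x0f) << 12 = 8192, (0x88 & 0x3f) << 6 = 512, and 0x88 & 0x3f = 8, which sum to 8712.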

cheers,
Pádraig


[Message part 3 (message/rfc822, inline)]
From: Pádraig Brady <P <at> draigBrady.com>
To: calestyo <at> scientia.org, 54388-done <at> debbugs.gnu.org
Subject: Re: bug#54388: printf doesn't handle multi-byte values
Date: Fri, 18 Mar 2022 14:59:42 +0000
[Message part 4 (text/plain, inline)]
On 14/03/2022 15:38, Pádraig Brady wrote:
> On 14/03/2022 03:27, Christoph Anton Mitterer wrote:
>> Hey Pádraig.
>>
>> I just wanted to ask whether the following could be a bug in printf.
>>
>> POSIX says [0] that, e.g.:
>>      printf '%d\n' \"3
>> should give the numeric value of the character, and that "in a locale
>> with multi-byte characters, the value of a character is intended to be
>> the value of the equivalent of the wchar_t representation of the
>> character".
>>
>> In bash:
>> $ printf '%d\n' $'"\u2208'
>> 8712
>>
>> Here printf is bash's built-in, and it works.
>>
>>
>> But using GNU coreutils' printf (version 8.32):
>> $ /usr/bin/printf '%d\n' $'"\u2208'
>> /usr/bin/printf: warning: ��: character(s) following character constant have been ignored
>> 226
>>
>>
>> Do I have some wrong assumptions or should I report that as a bug?
>>
>>
>> Thanks,
>> Chris.
>>
>>
>> [0] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html
> 
> This is a limitation of the current coreutils printf, which only handles single-byte characters.
> This email will open an issue in our bug tracker.
> 
> To summarize:
> $ ord() { printf "0x%x\n" "'$1"; }  # bash's printf
> $ ord 3
> 0x33
> $ ord $'\u2208'
> 0x2208
> 
> $ ord() { env printf "0x%x\n" "'$1"; }  # coreutils' printf
> $ ord 3
> 0x33
> $ ord $'\u2208'
> 0xprintf: warning: ��: character(s) following character constant have been ignored
> e2

The attached patch should fix this.

Marking this as done.

cheers,
Pádraig
[printf-mb-values.patch (text/x-patch, attachment)]

This bug report was last modified 3 years and 115 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.