GNU bug report logs -
#54388
printf doesn't handle multi-byte values
Reported by: Pádraig Brady <P <at> draigBrady.com>
Date: Mon, 14 Mar 2022 15:39:02 UTC
Severity: normal
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
Your bug report
#54388: printf doesn't handle multi-byte values
which was filed against the coreutils package, has been closed.
The explanation is attached below, along with your original report.
If you require more details, please reply to 54388 <at> debbugs.gnu.org.
--
54388: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=54388
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
[Message part 3 (text/plain, inline)]
On 14/03/2022 15:38, Pádraig Brady wrote:
> On 14/03/2022 03:27, Christoph Anton Mitterer wrote:
>> Hey Pádraig.
>>
>> I just wanted to ask whether the following could be a bug in printf:
>>
>> POSIX says [0] that, e.g.:
>> printf '%d\n' \"3
>> should give the numeric value of the character, and that "in a locale
>> with multi-byte characters, the value of a character is intended to be
>> the value of the equivalent of the wchar_t representation of the
>> character".
>>
>> In bash:
>> $ printf '%d\n' $'"\u2208'
>> 8712
>>
>> Here printf is bash's built-in, and it works.
>>
>>
>> But using GNU coreutils' printf (version 8.32):
>> $ /usr/bin/printf '%d\n' $'"\u2208'
>> /usr/bin/printf: warning: ��: character(s) following character constant have been ignored
>> 226
>>
>>
>> Do I have some wrong assumptions or should I report that as a bug?
>>
>>
>> Thanks,
>> Chris.
>>
>>
>> [0] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html
>
> This is a limitation of coreutils printf, which currently handles only single-byte characters.
> This email will open an issue in our bug tracker.
>
> To summarize:
> $ ord() { printf "0x%x\n" "'$1"; } # bash's printf
> $ ord 3
> 0x33
> $ ord $'\u2208'
> 0x2208
>
> $ ord() { env printf "0x%x\n" "'$1"; } # coreutils' printf
> $ ord 3
> 0x33
> $ ord $'\u2208'
> 0xprintf: warning: ��: character(s) following character constant have been ignored
> e2
The attached should fix this up.
Marking this as done.
cheers,
Pádraig
[printf-mb-values.patch (text/x-patch, attachment)]
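To make the numbers in the report concrete: the 226 (0xe2) printed by coreutils printf is the first byte of the three-byte UTF-8 encoding of U+2208 (e2 88 88), whereas the 8712 (0x2208) printed by bash is the decoded code point. The following is only a minimal POSIX shell sketch of that decoding, assuming UTF-8 input. The helper name utf8_ord is hypothetical; it is not part of coreutils or bash, and it is not the attached patch.

```shell
# Hypothetical helper: print the code point (decimal) of the first
# UTF-8 encoded character in $1. Assumes well-formed UTF-8; a stray
# continuation byte as the first byte is not detected.
utf8_ord() {
    [ -n "$1" ] || return 1
    # Dump the argument's bytes as unsigned decimal numbers and load
    # them into the positional parameters.
    set -- $(printf '%s' "$1" | od -An -tu1)
    b0=$1
    # The lead byte gives the payload bits and the number of
    # continuation bytes (n) that follow.
    if   [ "$b0" -lt 128 ]; then cp=$b0;          n=0   # 0xxxxxxx (ASCII)
    elif [ "$b0" -ge 240 ]; then cp=$((b0 & 7));  n=3   # 11110xxx
    elif [ "$b0" -ge 224 ]; then cp=$((b0 & 15)); n=2   # 1110xxxx
    else                         cp=$((b0 & 31)); n=1   # 110xxxxx
    fi
    # Each continuation byte (10xxxxxx) contributes its low 6 bits.
    while [ "$n" -gt 0 ] && [ "$#" -gt 1 ]; do
        shift
        cp=$(( (cp << 6) | ($1 & 63) ))
        n=$((n - 1))
    done
    printf '%d\n' "$cp"
}

# The bytes e2 88 88 written as octal escapes, so no locale is involved.
utf8_ord "$(printf '\342\210\210')"    # prints 8712
utf8_ord 3                             # prints 51
```

Spelling out the bytes with octal escapes keeps the example independent of how the shell expands $'\u2208' in the current locale.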
[Message part 5 (message/rfc822, inline)]
On 14/03/2022 03:27, Christoph Anton Mitterer wrote:
> Hey Pádraig.
>
> I just wanted to ask whether the following could be a bug in printf:
>
> POSIX says [0] that, e.g.:
> printf '%d\n' \"3
> should give the numeric value of the character, and that "in a locale
> with multi-byte characters, the value of a character is intended to be
> the value of the equivalent of the wchar_t representation of the
> character".
>
> In bash:
> $ printf '%d\n' $'"\u2208'
> 8712
>
> Here printf is bash's built-in, and it works.
>
>
> But using GNU coreutils' printf (version 8.32):
> $ /usr/bin/printf '%d\n' $'"\u2208'
> /usr/bin/printf: warning: ��: character(s) following character constant have been ignored
> 226
>
>
> Do I have some wrong assumptions or should I report that as a bug?
>
>
> Thanks,
> Chris.
>
>
> [0] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html
This is a limitation of coreutils printf, which currently handles only single-byte characters.
This email will open an issue in our bug tracker.
To summarize:
$ ord() { printf "0x%x\n" "'$1"; } # bash's printf
$ ord 3
0x33
$ ord $'\u2208'
0x2208
$ ord() { env printf "0x%x\n" "'$1"; } # coreutils' printf
$ ord 3
0x33
$ ord $'\u2208'
0xprintf: warning: ��: character(s) following character constant have been ignored
e2
cheers,
Pádraig
This bug report was last modified 3 years and 66 days ago.