GNU bug report logs - #54388
printf doesn't handle multi-byte values
Reported by: Pádraig Brady <P <at> draigBrady.com>
Date: Mon, 14 Mar 2022 15:39:02 UTC
Severity: normal
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
On 14/03/2022 03:27, Christoph Anton Mitterer wrote:
> Hey Pádraig.
>
> I just wanted to ask, whether the following could be a bug in printf:
>
> POSIX says[0], that e.g.:
> printf '%d\n' \"3
> should give the numeric value of the character, and that "in a locale
> with multi-byte characters, the value of a character is intended to be
> the value of the equivalent of the wchar_t representation of the
> character".
>
> In bash:
> $ printf '%d\n' $'"\u2208'
> 8712
>
> here the printf is bash's built-in printf, and there it works.
>
>
> But using GNU coreutils' printf (version 8.32):
> $ /usr/bin/printf '%d\n' $'"\u2208'
> /usr/bin/printf: warning: ��: character(s) following character constant have been ignored
> 226
>
>
> Do I have some wrong assumptions or should I report that as a bug?
>
>
> Thanks,
> Chris.
>
>
> [0] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html
This is a limitation of coreutils printf, which currently only handles single-byte characters.
This email will open an issue in our bug tracker.
To summarize:
$ ord() { printf "0x%x\n" "'$1"; } # bash's printf
$ ord 3
0x33
$ ord $'\u2208'
0x2208
$ ord() { env printf "0x%x\n" "'$1"; } # coreutils' printf
$ ord 3
0x33
$ ord $'\u2208'
0xprintf: warning: ��: character(s) following character constant have been ignored
e2
cheers,
Pádraig
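
For illustration: the 226 (0xe2) that coreutils printf reports above is the first byte of the UTF-8 encoding of U+2208 (e2 88 88), whereas bash reports the decoded code point 8712 (0x2208). The following is only a sketch of that decoding in POSIX shell, assuming a UTF-8 encoded, well-formed argument. `utf8_ord` is a hypothetical helper name, not part of coreutils or bash, and this is not how either printf is implemented.

```shell
# Hypothetical helper (not part of coreutils or bash): print the Unicode
# code point, in decimal, of the first UTF-8 character in $1.
# Assumption: $1 is valid UTF-8; no validation is attempted.
# Note: POSIX sh has no local variables, so b1, cp and n are globals.
utf8_ord() {
    # Replace the positional parameters with the argument's bytes,
    # each written as an unsigned decimal number.
    set -- $(printf '%s' "$1" | od -An -tu1)
    [ "$#" -gt 0 ] || return 1              # empty argument: nothing to decode

    b1=$1
    shift
    # The lead byte determines how many continuation bytes follow
    # and how many payload bits the lead byte itself carries.
    if   [ "$b1" -lt 128 ]; then cp=$b1;          n=0   # 0xxxxxxx (ASCII)
    elif [ "$b1" -lt 224 ]; then cp=$((b1 & 31)); n=1   # 110xxxxx
    elif [ "$b1" -lt 240 ]; then cp=$((b1 & 15)); n=2   # 1110xxxx
    else                         cp=$((b1 & 7));  n=3   # 11110xxx
    fi

    # Each continuation byte (10xxxxxx) contributes its low 6 bits.
    while [ "$n" -gt 0 ]; do
        cp=$(( (cp << 6) | ($1 & 63) ))
        shift
        n=$((n - 1))
    done

    printf '%d\n' "$cp"
}
```

With this sketch, `utf8_ord 3` prints 51 (0x33) and `utf8_ord "$(printf '\342\210\210')"` prints 8712 (0x2208), the value bash's printf gives above. The octal escapes spell out the bytes of U+2208 so the call does not depend on `$'...'` support in the shell.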
This bug report was last modified 3 years and 65 days ago.