On 04/06/2014 12:17 AM, Jan Novak wrote:
>
> Hello,
>
> printf string format counts bytes instead of chars, which leads to broken output ...
> (the same problem occurs with bash built in printf)
>
>
> just try this:
>
> $ echo $LANG
> us_US.UTF-8
>
>
> $ printf "|%3s|\n" "a"
> |  a|
>
> $ printf "|%3s|\n" "á" (char is a-acute)
> | á|
>
> expected output:
> |  á|
>
> Is there some easy solution ?
>
> TIA for the answer
Yes, printf follows the C standard, which only considers bytes.
awk does respect characters in width specifiers though:
$ awk 'BEGIN{printf "|%3s|\n", "á"}'
|  á|
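As a stopgap in the shell, the padding can also be computed by hand.
This is only a rough sketch: padleft is a made-up helper name, it assumes
bash and UTF-8 encoded input, and it counts code points rather than display
columns, so combining or double-width characters would still misalign.

```shell
#!/bin/bash
# padleft WIDTH STRING  (hypothetical helper, not part of coreutils or bash)
# Right-align STRING in a field of WIDTH characters, assuming UTF-8 input.
padleft() {
    local width=$1 str=$2 chars pad
    # In UTF-8 every character has exactly one byte outside 0x80-0xBF,
    # so deleting the continuation bytes and counting what is left
    # gives the number of code points, whatever the current locale is.
    chars=$(printf '%s' "$str" | LC_ALL=C tr -d '\200-\277' | wc -c)
    pad=$((width - chars))
    if [ "$pad" -lt 0 ]; then
        pad=0
    fi
    # '%*s' with an empty argument emits exactly $pad spaces.
    printf '%*s%s' "$pad" '' "$str"
}

printf '|%s|\n' "$(padleft 3 "á")"
```

Left alignment or truncation would need the same kind of manual counting.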
I don't think we'd be able to change the current operation of printf,
for backwards compatibility reasons. We might, though, be able to leverage
the existing multibyte-character-aware
alignment/truncation code in:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=gl/lib/mbsalign.c;hb=HEAD

thanks,
Pádraig.