GNU bug report logs - #17196
multibyte: printf: %s counts bytes instead of characters

Package: coreutils;

Reported by: Jan Novak <jn <at> turbo.sk>

Date: Sat, 5 Apr 2014 23:22:01 UTC

Severity: wishlist

From: Leslie S Satenstein <lsatenstein <at> yahoo.com>
To: Jan Novak <jn <at> turbo.sk>
Cc: "17196 <at> debbugs.gnu.org" <17196 <at> debbugs.gnu.org>
Subject: bug#17196: UTF-8 printf string formatting problem
Date: Thu, 8 May 2014 19:16:45 -0700 (PDT)
Perhaps printf() needs wide-character extensions, exposed through new % conversion characters.

 
Regards 

 Leslie

Mr. Leslie Satenstein
SENT FROM MY OPEN SOURCE LINUX SYSTEM.




>________________________________
> From: Pádraig Brady <P <at> draigBrady.com>
>To: Jan Novak <jn <at> turbo.sk> 
>Cc: 17196 <at> debbugs.gnu.org 
>Sent: Sunday, April 6, 2014 6:15 AM
>Subject: bug#17196: UTF-8 printf string formatting problem
> 
>
>On 04/06/2014 12:17 AM, Jan Novak wrote:
>> Hello,
>> 
>> printf's %s format counts bytes instead of characters when padding, which leads to misaligned output
>> (the same problem occurs with the bash built-in printf).
>> 
>> 
>> just try this:
>> 
>> $ echo $LANG
>> us_US.UTF-8
>> 
>> 
>> $ printf "|%3s|\n" "a"
>> |  a|
>> 
>> $ printf "|%3s|\n" "á"     (char is a-acute)
>> | á|
>> 
>> expected output:
>> |  á|
>> 
>> Is there an easy solution?
>> 
>> TIA for the answer
>
>Yes, printf follows the C standard, which considers only bytes.
>awk, however, does respect characters in width specifiers:
>
>  $ awk 'BEGIN{printf "|%3s|\n", "á"}'
>  |  á|
>
>I don't think we could change the current behavior of printf,
>for backward-compatibility reasons. We might, though, be able to leverage
>the existing multibyte-aware alignment/truncation code in:
>http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=blob;f=gl/lib/mbsalign.c;hb=HEAD
>
>thanks,
>Pádraig.
>
>
>
>
>
>
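The byte-versus-character mismatch described above can also be worked around in plain shell. The following is only a sketch, not something proposed in this thread: the helper names utf8_len and pad_left are hypothetical, the input is assumed to be valid UTF-8, and it counts code points rather than display columns, so combining marks and double-width characters would still misalign.

```shell
#!/bin/sh
# Hypothetical helpers, written for illustration only.

# Count UTF-8 code points in $1: every byte that is not a continuation
# byte (0x80-0xBF, octal 200-277) starts a new character.
utf8_len() {
  printf '%s' "$1" | LC_ALL=C tr -d '\200-\277' | wc -c
}

# Right-align $2 in a field that is $1 characters wide.
pad_left() {
  pl_width=$1
  pl_str=$2
  pl_chars=$(utf8_len "$pl_str")
  pl_bytes=$(printf '%s' "$pl_str" | wc -c)
  # printf pads by bytes, so widen the field by the number of
  # continuation bytes; the C locale makes every byte one character.
  pl_field=$((pl_width + pl_bytes - pl_chars))
  LC_ALL=C printf "%${pl_field}s" "$pl_str"
}

printf '|%s|\n' "$(pad_left 3 "á")"   # |  á|
```

For "á" (two bytes, one character) the field is widened from 3 to 4 bytes, which yields the two leading spaces the reporter expected. ASCII-only strings are unaffected because their byte and character counts are equal.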

This bug report was last modified 6 years and 250 days ago.
