GNU bug report logs - #10880
multibyte: tr: TR operates on bytes, not characters


Package: coreutils;

Reported by: "Marton Kadar" <marton.kadar <at> mail.com>

Date: Fri, 24 Feb 2012 17:31:02 UTC

Severity: wishlist

Merged with 9365, 9569, 12192, 13362


From: Eric Blake <eblake <at> redhat.com>
To: Marton Kadar <marton.kadar <at> mail.com>
Cc: 10880 <at> debbugs.gnu.org
Subject: bug#10880: instead of characters, tr works on bytes
Date: Fri, 24 Feb 2012 20:28:41 -0700
On 02/24/2012 07:29 AM, Marton Kadar wrote:
> Don't know which is the official way to report a bug in 'tr'
> so I will copy to this list too. CC me on replies as I am not
> subscribing.

Sending mail to coreutils <at> gnu.org _is_ what creates a bug on
debbugs.gnu.org, so you have managed to create a duplicate.  Paul Eggert
has already merged 9365, 10880, and 9569, so now, replying to any one of
those three is merely adding information to the same report.

>>
>> Let us try to delete a character and see if it worked:
>>
>> $ echo árvíz | tr -d á | od -c
>> 0000000   r   v 255   z  \n
>> 0000005

Please keep in mind that upstream coreutils has not yet been converted to
multibyte support.  Your example shows one of the places where multibyte
support is required, and therefore where you cannot expect things to
work yet.  No one has yet contributed a maintainable patch that does not
penalize single-byte locales, at least not upstream.  Several distros
apply their own UTF-8 patches, but a bug in a distro-patched tr should
be reported to your distro and not upstream.
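
To make the od output quoted above concrete, here is a minimal Python
sketch of what a byte-oriented delete does to that input.  It is an
illustration only, not the tr source: it assumes the terminal encoding
is UTF-8 (so "á" is the byte pair C3 A1 and "í" is C3 AD), and the
variable names are made up for this example.

```python
# Assumption: input and operand are both UTF-8 encoded.
text = "\u00e1rv\u00edz\n"           # "árvíz" followed by a newline
data = text.encode("utf-8")          # C3 A1 72 76 C3 AD 7A 0A

# Byte-wise deletion: every byte of the operand "á" is deleted
# independently, wherever it occurs in the input.
delete_set = set("\u00e1".encode("utf-8"))   # {0xC3, 0xA1}
bytewise = bytes(b for b in data if b not in delete_set)

# Character-wise deletion: what the reporter expected to happen.
charwise = text.replace("\u00e1", "").encode("utf-8")

print(bytewise)    # the lead byte C3 of "í" is gone, leaving a stray AD
print(charwise)    # "í" survives intact as C3 AD
print(oct(bytewise[2]))
```

The stray byte AD is octal 255, which matches the "255" in the od
output: deleting "á" also removed the C3 lead byte that "í" shares with
it, leaving five bytes instead of the six a character-aware delete
would produce.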

>> I'll check the source for tr myself although never coded in C.
>> This should be a trivial fix.

Alas, dealing with multibyte characters without penalizing single-byte
locales is NOT trivial, or it would have been done long ago.

-- 
Eric Blake   eblake <at> redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


This bug report was last modified 6 years and 305 days ago.


