GNU bug report logs - #69488
tr (question)

Package: coreutils

Reported by: lacsaP Patatetom <patatetom <at> gmail.com>

Date: Fri, 1 Mar 2024 15:35:02 UTC

Severity: normal


From: lacsaP Patatetom <patatetom <at> gmail.com>
To: 69488 <at> debbugs.gnu.org
Subject: bug#69488: tr (question)
Date: Fri, 1 Mar 2024 16:33:29 +0100
Hi,

I did a few tests with tr and I'm surprised by the results...

$ echo éèçà
éèçà

These characters are each encoded in UTF-8 as 2 bytes:

$ echo éèçà | xxd
00000000: c3a9 c3a8 c3a7 c3a0 0a                   .........

Now I use tr to remove non-printable characters:

$ echo éèçà | tr -cd '[:print:]'
$ echo éèçà | tr -cd '[:print:]' | wc
      0       0       0

All characters are deleted by tr.
Now I want to keep the "é" character:

$ echo éèçà | tr -cd '[:print:]é'
��

Why do the "�" characters appear?
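
One way to narrow this down is to look at the bytes tr actually emits rather than at what the terminal draws. The following is a minimal sketch, assuming GNU tr and od are available; the octal escapes \303\251 are the two UTF-8 bytes of "é" taken from the xxd dump above, the sample helper is hypothetical, and LC_ALL=C is an assumption added here so that the result does not depend on the caller's locale.

```shell
# Hypothetical helper: emit the test string "éèçà\n" as raw UTF-8 bytes
# (c3 a9, c3 a8, c3 a7, c3 a0, 0a) without relying on this file's encoding.
sample() {
    printf '\303\251\303\250\303\247\303\240\n'
}

# Same filter as above: keep printable bytes plus the two bytes of "é".
# LC_ALL=C (an assumption) forces byte-by-byte handling in tr.
kept=$(sample | LC_ALL=C tr -cd '[:print:]\303\251' | od -An -tx1 | tr -d ' \n')

echo "$kept"   # prints c3a9c3c3c3
```

If the dump matches, tr has kept every c3 and a9 byte individually: the c3 lead bytes of è, ç and à survive without their second byte, and a lone c3 is not valid UTF-8, which terminals commonly draw as "�".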

regards, lacsaP.

This bug report was last modified 1 year and 104 days ago.


