GNU bug report logs - #9252
cut does not yet support unicode characters

Package: coreutils

Reported by: Danilo Moraes <moraesdno <at> gmail.com>

Date: Sat, 6 Aug 2011 01:54:06 UTC

Severity: normal

Tags: notabug

Merged with 9253

Done: Bob Proulx <bob <at> proulx.com>

Bug is archived. No further changes may be made.

Message #5 received at submit <at> debbugs.gnu.org:

From: Danilo Moraes <moraesdno <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: a bug in cut
Date: Fri, 5 Aug 2011 12:39:14 -0300
I think I have found a small bug. Consider this:

a=danilo
echo $a | cut -c -5 # shows danil

a=dánilo
echo $a | cut -c -5 # shows dáni

The -b option gives the same result. cut does not seem to handle accented
letters correctly.
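
The byte-level behaviour can be reproduced with a minimal sketch like the one below. It assumes the input is UTF-8, where "á" is stored as two bytes; the variable names are only illustrative, and the accented letter is written as explicit octal bytes so the result does not depend on the terminal encoding.

```shell
# "dánilo" written with explicit UTF-8 bytes (assumption: UTF-8 input).
word=$(printf 'd\303\241nilo')

# Count bytes: the six letters occupy seven bytes because of the "á".
byte_count=$(printf '%s' "$word" | wc -c | tr -d ' ')

# Select the first five bytes, which cover only four letters.
first_five_bytes=$(printf '%s\n' "$word" | cut -b -5)

echo "$byte_count"        # 7
echo "$first_five_bytes"  # dáni
```

If -c is treated the same as -b, as the documentation quoted below says, this would explain why only four letters are shown.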

I read this in the info pages:

`-c CHARACTER-LIST'
`--characters=CHARACTER-LIST'
     Select for printing only the characters in positions listed in
     CHARACTER-LIST.  The same as `-b' for now, but
     internationalization will change that.  Tabs and backspaces are
     treated like any other character; they take up 1 character.  If an
     output delimiter is specified, (see the description of
     `--output-delimiter'), then output that string between ranges of
     selected bytes.

"The same as `-b' for now, but internationalization will change that."
Does this solve my problem? How does it work?

Thanks,

Danilo S. Morães

This bug report was last modified 13 years and 293 days ago.
