GNU bug report logs - #9253
cut does not yet support unicode characters


Package: coreutils

Reported by: Danilo Moraes <moraesdno <at> gmail.com>

Date: Sat, 6 Aug 2011 01:54:07 UTC

Severity: normal

Tags: notabug

Merged with 9252

Done: Bob Proulx <bob <at> proulx.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 9253 in the body.
You can then email your comments to 9253 AT debbugs.gnu.org in the normal way.



Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#9253; Package coreutils. (Sat, 06 Aug 2011 01:54:07 GMT)

Acknowledgement sent to Danilo Moraes <moraesdno <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 06 Aug 2011 01:54:07 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Danilo Moraes <moraesdno <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: bug in cut - more information
Date: Fri, 5 Aug 2011 13:22:18 -0300
I think I have found a small bug in cut. Consider this:

a=danilo
echo $a | cut -c -5 # shows danil

a=dánilo
echo $a | cut -c -5 # shows dáni

The -b option behaves the same way. cut seems to mishandle accented letters.

I read this in the info pages:

`-c CHARACTER-LIST'
`--characters=CHARACTER-LIST'
     Select for printing only the characters in positions listed in
     CHARACTER-LIST.  The same as `-b' for now, but
     internationalization will change that.  Tabs and backspaces are
     treated like any other character; they take up 1 character.  If an
     output delimiter is specified, (see the description of
     `--output-delimiter'), then output that string between ranges of
     selected bytes.

"The same as `-b' for now, but internationalization will change that." Has this not been changed yet?

This is my locale:

LANG=pt_BR.UTF-8
LANGUAGE=pt_BR:pt:en
LC_CTYPE="pt_BR.UTF-8"
LC_NUMERIC="pt_BR.UTF-8"
LC_TIME="pt_BR.UTF-8"
LC_COLLATE="pt_BR.UTF-8"
LC_MONETARY="pt_BR.UTF-8"
LC_MESSAGES="pt_BR.UTF-8"
LC_PAPER="pt_BR.UTF-8"
LC_NAME="pt_BR.UTF-8"
LC_ADDRESS="pt_BR.UTF-8"
LC_TELEPHONE="pt_BR.UTF-8"
LC_MEASUREMENT="pt_BR.UTF-8"
LC_IDENTIFICATION="pt_BR.UTF-8"
LC_ALL=

The cut version is: cut (GNU coreutils) 7.4

Thanks,

Danilo S. Morães
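
The result reported above follows from byte counting: in UTF-8, "á" occupies two bytes, so the first five bytes of "dánilo" cover only four characters. A minimal sketch of that arithmetic, assuming a POSIX shell and using only `cut -b`, which is byte-based in every locale (the variable names are illustrative, and the word is spelled with octal escapes so the bytes are explicit):

```shell
# "dánilo" with "á" written as its two UTF-8 bytes (octal 303 241).
word='d\303\241nilo'

# Count the bytes in the word (no trailing newline is printed).
byte_count=$(printf "$word" | wc -c | tr -d ' ')

# Keep the first five bytes of the line, as cut -b -5 does.
first_five=$(printf "$word\n" | cut -b -5)

echo "bytes: $byte_count"              # bytes: 7
echo "first five bytes: $first_five"   # first five bytes: dáni
```

With cut 7.4, where the documentation quoted above describes `-c` as the same as `-b`, `cut -c -5` yields the same four characters.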

Forcibly Merged 9252 9253. Request was from Bob Proulx <bob <at> proulx.com> to control <at> debbugs.gnu.org. (Sat, 06 Aug 2011 17:21:02 GMT)

Changed bug title to 'cut does not yet support unicode characters' from 'bug in cut - more information'. Request was from Bob Proulx <bob <at> proulx.com> to control <at> debbugs.gnu.org. (Sat, 06 Aug 2011 17:21:02 GMT)

Added tag(s) notabug. Request was from Bob Proulx <bob <at> proulx.com> to control <at> debbugs.gnu.org. (Sat, 06 Aug 2011 17:21:02 GMT)

Bug closed; send any further explanations to 9252 <at> debbugs.gnu.org and Danilo Moraes <moraesdno <at> gmail.com>. Request was from Bob Proulx <bob <at> proulx.com> to control <at> debbugs.gnu.org. (Sat, 06 Aug 2011 17:21:03 GMT)

Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 04 Sep 2011 11:24:04 GMT)

This bug report was last modified 13 years and 293 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.