GNU bug report logs - #9740: Bug in sort
Reported by: Lluís Padró <padro <at> lsi.upc.edu>
Date: Wed, 12 Oct 2011 18:49:02 UTC
Severity: normal
Tags: notabug
Done: Eric Blake <eblake <at> redhat.com>
Bug is archived. No further changes may be made.
Message #5 received at submit <at> debbugs.gnu.org:
I found a bug in the "sort" utility that occurs under UTF-8 locales, even
though no character beyond basic ASCII is involved.
I'm using "sort (GNU coreutils) 7.4" from package
"coreutils-7.4-2ubuntu3" on Ubuntu Lucid 10.04.03 LTS.
A short reproduction of the error follows below.
Thank you,
Lluis
------------------------------------------------
## test file for "sort"
~$ cat testfile
abc Z
ab Z
abcd Z
abce Z
## Let's set the C locale
~$ export LC_ALL="C"
~$ locale
LANG=en_US.UTF-8
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=C
## sort works as expected
~$ sort testfile
ab Z
abc Z
abcd Z
abce Z
## Let's try another locale
~$ export LC_ALL="en_US.UTF-8"
~$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
## Sort fails. Shorter words are sorted after longer words with the same prefix.
~$ sort testfile
abcd Z
abce Z
abc Z
ab Z
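
For anyone reproducing this, both orderings above can be recreated without depending on which locales are installed. The sketch below is only an approximation. It assumes (this is not stated in the report) that the en_US.UTF-8 collation effectively ignores spaces and letter case on its first comparison pass, and it mimics that by sorting on a key with spaces removed and letters lowercased. The variable names and the awk-built key are illustrative and are not part of sort itself.

```shell
#!/bin/sh
# Recreate the report's test file in a temporary location.
testfile=$(mktemp)
printf '%s\n' 'abc Z' 'ab Z' 'abcd Z' 'abce Z' > "$testfile"

# Byte-wise comparison (C locale): the space (0x20) sorts before every letter,
# so a shorter word followed by a space precedes longer words sharing its prefix.
c_order=$(LC_ALL=C sort "$testfile")

# ASSUMPTION: approximate the UTF-8 locale's first comparison pass by
# building a key with spaces dropped and case folded, then sorting on it.
# This is a simplification, not the C library's actual collation algorithm.
approx_order=$(
  awk '{ k = $0; gsub(/ /, "", k); print tolower(k) "\t" $0 }' "$testfile" |
    LC_ALL=C sort |   # sort by the synthetic key
    cut -f2-          # strip the key, keep the original line
)

printf 'C locale order:\n%s\n\n' "$c_order"
printf 'Approximate en_US.UTF-8 order:\n%s\n' "$approx_order"

rm -f "$testfile"
```

If that assumption holds, "abc Z" is compared as "abcz", which sorts after "abcdz" and "abcez". That would make the second ordering expected behavior and is consistent with the "notabug" tag on this report.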
This bug report was last modified 13 years and 228 days ago.