GNU bug report logs - #9740
Bug in sort

Package: coreutils

Reported by: Lluís Padró <padro <at> lsi.upc.edu>

Date: Wed, 12 Oct 2011 18:49:02 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.


Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Lluís Padró <padro <at> lsi.upc.edu>
To: bug-coreutils <at> gnu.org
Subject: Bug in sort
Date: Wed, 12 Oct 2011 20:41:46 +0200

I found a bug in the "sort" utility that occurs under UTF-8 locales, even
though no characters beyond basic ASCII are involved.

I'm using "sort (GNU coreutils) 7.4" from the package
"coreutils-7.4-2ubuntu3" on Ubuntu Lucid 10.04.3 LTS.

A short reproduction of the error follows below.

  Thank you,

     Lluis

------------------------------------------------
## test file for "sort"
~$ cat testfile
abc Z
ab Z
abcd Z
abce Z

## Let's set the C locale
~$ export LC_ALL="C"
~$ locale
LANG=en_US.UTF-8
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=C

## sort works as expected
~$ sort testfile
ab Z
abc Z
abcd Z
abce Z

## Let's try another locale
~$ export LC_ALL="en_US.UTF-8"
~$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

## Sort fails. Shorter words are sorted after longer words with the same prefix.
~$ sort testfile
abcd Z
abce Z
abc Z
ab Z
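
The report is tagged "notabug", so the second ordering is presumably what the locale's collation rules produce. One common explanation, assumed here rather than taken from the report, is that the en_US.UTF-8 collation ignores spaces and case on its first comparison pass, so "abc Z" is compared roughly as "abcz". The Python sketch below is only a simplified model of that idea, not the actual glibc algorithm; the function names are hypothetical. It reproduces both orderings shown in the transcript.

```python
# Simplified model only: real locale collation (for example glibc's
# en_US.UTF-8 rules) is far more elaborate than this.
# Both function names are hypothetical, chosen for illustration.

LINES = ["abc Z", "ab Z", "abcd Z", "abce Z"]


def c_locale_order(lines):
    """Compare lines byte by byte, as sort does under LC_ALL=C.

    Assumes pure ASCII input, as in the test file from the report.
    """
    return sorted(lines, key=lambda s: s.encode("ascii"))


def space_ignoring_order(lines):
    """Model a collation whose first pass ignores spaces and case.

    The untouched line is used only to break ties between lines
    whose first-pass keys are equal.
    """
    def key(line):
        primary = "".join(ch.lower() for ch in line if not ch.isspace())
        return (primary, line)

    return sorted(lines, key=key)


if __name__ == "__main__":
    print(c_locale_order(LINES))
    print(space_ignoring_order(LINES))
```

Under this model the space before "Z" never takes part in the first comparison, so "abcdz" and "abcez" sort before "abcz", which sorts before "abz". When byte order is wanted, setting LC_ALL=C for the sort invocation, as in the first half of the transcript, gives the expected result.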






