GNU bug report logs - #18893
Bug with Gnu sort program in coreutils 8.4


Package: coreutils;

Reported by: Michael Yang <pstester2015 <at> gmail.com>

Date: Wed, 29 Oct 2014 22:22:02 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.



Report forwarded to bug-coreutils <at> gnu.org:
bug#18893; Package coreutils. (Wed, 29 Oct 2014 22:22:02 GMT)

Acknowledgement sent to Michael Yang <pstester2015 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Wed, 29 Oct 2014 22:22:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Michael Yang <pstester2015 <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: Bug with Gnu sort program in coreutils 8.4
Date: Wed, 29 Oct 2014 17:58:23 -0400
Hi,

There might be a bug in the “sort” program in GNU coreutils 8.4, present at
least in CentOS 6 x86_64.  It’s not immediately obvious to me whether or
not this bug has been reported before.

Given the following:

CC = gcc
CC = aCC
CCFLAGS =
CC = cc

sort (GNU coreutils) 8.4 yields:

CC = aCC
CC = cc
CCFLAGS =
CC = gcc

… the 3rd line is out-of-order.  In comparison, sort (GNU coreutils) 8.14
in cygwin yields:

CC = aCC
CC = cc
CC = gcc
CCFLAGS =

… which is correct.

Added tag(s) notabug. Request was from Eric Blake <eblake <at> redhat.com> to control <at> debbugs.gnu.org. (Wed, 29 Oct 2014 22:37:02 GMT)

Reply sent to Eric Blake <eblake <at> redhat.com>:
You have taken responsibility. (Wed, 29 Oct 2014 22:37:02 GMT)

Notification sent to Michael Yang <pstester2015 <at> gmail.com>:
bug acknowledged by developer. (Wed, 29 Oct 2014 22:37:03 GMT)

Message #12 received at 18893-done <at> debbugs.gnu.org:

From: Eric Blake <eblake <at> redhat.com>
To: Michael Yang <pstester2015 <at> gmail.com>, 18893-done <at> debbugs.gnu.org
Subject: Re: bug#18893: Bug with Gnu sort program in coreutils 8.4
Date: Wed, 29 Oct 2014 16:36:23 -0600
tag 18893 notabug
thanks

On 10/29/2014 03:58 PM, Michael Yang wrote:

> There might be a bug in the “sort” program in GNU coreutils 8.4, present at
> least in CentOS 6 x86_64.  It’s not immediately obvious to me whether or
> not this bug has been reported before.

Thanks for the report.  However, it has been frequently reported, to the
point that it has a FAQ entry:

https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021

> sort (GNU coreutils) 8.4 yields:
>
> CC = aCC
> CC = cc
> CCFLAGS =
> CC = gcc

You can use the --debug flag to see what is going on (well, you can when
using new enough sort; 8.4 is rather old these days, and while there
HAVE been sort bug fixes in the meantime, they are for rather obscure
corner cases and not for your issue).

$ printf 'CC = aCC\nCC = cc\nCCFLAGS =\nCC = gcc\n' | sort --debug
sort: using ‘en_US.UTF-8’ sorting rules
CC = aCC
________
CC = cc
_______
CCFLAGS =
_________
CC = gcc
________

I'm guessing that on your CentOS box, your locale is set to en_US.UTF-8,
or some similar locale which collates case-insensitively and ignores
punctuation.  In such a collation sequence, you are comparing 'ccflags'
vs. 'ccgcc', and the final output order is correct.
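
As a rough illustration of that comparison, the sketch below strips blanks
and '=' and folds case before doing a plain byte sort.  The helper name
fold_key is hypothetical, and this only approximates what the en_US.UTF-8
collation does (the real rules are more involved); it is meant to show why
'ccflags' lands before 'ccgcc', nothing more.

```shell
# Rough stand-in for a collation that ignores punctuation and case:
# delete blanks and '=', fold to lower case, then compare bytes.
# Assumption: this only approximates the glibc en_US.UTF-8 rules.
fold_key() {
  tr -d ' =' | tr '[:upper:]' '[:lower:]'
}

folded_sorted=$(printf 'CC = gcc\nCC = aCC\nCCFLAGS =\nCC = cc\n' \
  | fold_key | LC_ALL=C sort)
printf '%s\n' "$folded_sorted"
```

The keys come out as ccacc, cccc, ccflags, ccgcc, which is the same relative
order as the lines in the en_US.UTF-8 output above.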

> … the 3rd line is out-of-order.  In comparison, sort (GNU coreutils) 8.14
> in cygwin yields:

The version of sort makes no difference; the result depends entirely on
the locale.  (By the way, cygwin now ships with 8.23, so you may want
to upgrade.)  On your cygwin box, I'm guessing that you are using the C
locale.  And even if you are using the en_US locale there, remember that
the cygwin locale definitions come from Windows, not glibc, so the two
sets of locale writers may have made different choices (that is, while
the glibc en_US locale ignores punctuation, maybe the Windows en_US
locale does not).  At any rate, on your CentOS box, you can force the C
locale to get the same behavior as cygwin seemed to give by default:

$ printf 'CC = aCC\nCC = cc\nCCFLAGS =\nCC = gcc\n' | LC_ALL=C sort --debug
sort: using simple byte comparison
CC = aCC
________
CC = cc
_______
CC = gcc
________
CCFLAGS =
_________
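
To see why byte comparison puts 'CC = gcc' ahead of 'CCFLAGS =', it may help
to look at the third character of each line: a space in one, 'F' in the
other.  The sketch below prints their byte values using the POSIX printf
leading-quote form; byte_of is a hypothetical helper name, and an
ASCII-compatible codeset is assumed.

```shell
# Print the decimal byte value of the first character of the argument.
# Relies on POSIX printf: a leading quote yields the character's code.
byte_of() {
  printf '%d\n' "'$1"
}

byte_of ' '   # space, third character of "CC = gcc"
byte_of 'F'   # third character of "CCFLAGS ="
```

The space has the smaller value (32 versus 70), so in the C locale every
'CC = ...' line sorts before 'CCFLAGS ='.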

Therefore, I'm closing this as not a bug, but feel free to respond if
you have further comments or questions.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


Information forwarded to bug-coreutils <at> gnu.org:
bug#18893; Package coreutils. (Wed, 29 Oct 2014 22:42:02 GMT)

Message #15 received at 18893 <at> debbugs.gnu.org:

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Michael Yang <pstester2015 <at> gmail.com>, 18893 <at> debbugs.gnu.org
Subject: Re: bug#18893: Bug with Gnu sort program in coreutils 8.4
Date: Wed, 29 Oct 2014 23:41:17 +0100
tag 18893 notabug
thanks

On 10/29/2014 10:58 PM, Michael Yang wrote:
> sort (GNU coreutils) 8.4 yields:
> 
> CC = aCC
> CC = cc
> CCFLAGS =
> CC = gcc

Newer builds of sort include a --debug flag that shows you what is going on:

  $ printf "CC = gcc\nCC = aCC\nCCFLAGS =\nCC = cc\n" | src/sort --debug
  src/sort: using ‘en_US.UTF-8’ sorting rules
  CC = aCC
  ________
  CC = cc
  _______
  CCFLAGS =
  _________
  CC = gcc
  ________

versus

  $ printf "CC = gcc\nCC = aCC\nCCFLAGS =\nCC = cc\n" | LC_ALL=C src/sort --debug
  src/sort: using simple byte comparison
  CC = aCC
  ________
  CC = cc
  _______
  CC = gcc
  ________
  CCFLAGS =
  _________

You have hit an FAQ:
https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021

Your current locale has chosen a collation sequence that ignores blanks
and the equal sign, so sort is sorting correctly.  Set LC_ALL in the
environment of sort to a different locale if you want bytewise sorting.
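
One way to do that without changing the locale of the whole session is to
set LC_ALL only for the sort invocation.  The wrapper below is a minimal
sketch; the function name bytesort is hypothetical and not part of
coreutils.

```shell
# Hypothetical helper: run sort with byte-wise (C locale) collation,
# leaving the calling shell's locale settings untouched.
bytesort() {
  LC_ALL=C sort "$@"
}

printf 'CC = gcc\nCC = aCC\nCCFLAGS =\nCC = cc\n' | bytesort
```

With this input the output should match the LC_ALL=C listing above, with
'CCFLAGS =' last.  Any options or file names given to bytesort are passed
through to sort unchanged.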

Have a nice day,
Berny






bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 27 Nov 2014 12:24:04 GMT)

This bug report was last modified 10 years and 211 days ago.


