GNU bug report logs - #18893
Bug with Gnu sort program in coreutils 8.4
Reported by: Michael Yang <pstester2015 <at> gmail.com>
Date: Wed, 29 Oct 2014 22:22:02 UTC
Severity: normal
Tags: notabug
Done: Eric Blake <eblake <at> redhat.com>
Bug is archived. No further changes may be made.
Report forwarded to bug-coreutils <at> gnu.org: bug#18893; Package coreutils. (Wed, 29 Oct 2014 22:22:02 GMT)
Acknowledgement sent to Michael Yang <pstester2015 <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Wed, 29 Oct 2014 22:22:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Hi,
There might be a bug in the “sort” program in GNU coreutils 8.4, present at
least in CentOS 6 x86_64. It’s not immediately obvious to me whether or
not this bug has been reported before.
Given the following:
CC = gcc
CC = aCC
CCFLAGS =
CC = cc
sort (GNU coreutils) 8.4 yields:
CC = aCC
CC = cc
CCFLAGS =
CC = gcc
… the 3rd line is out-of-order. In comparison, sort (GNU coreutils) 8.14
in cygwin yields:
CC = aCC
CC = cc
CC = gcc
CCFLAGS =
… which is correct.
Added tag(s) notabug. Request was from Eric Blake <eblake <at> redhat.com> to control <at> debbugs.gnu.org. (Wed, 29 Oct 2014 22:37:02 GMT)
Reply sent to Eric Blake <eblake <at> redhat.com>: You have taken responsibility. (Wed, 29 Oct 2014 22:37:02 GMT)
Notification sent to Michael Yang <pstester2015 <at> gmail.com>: bug acknowledged by developer. (Wed, 29 Oct 2014 22:37:03 GMT)
Message #12 received at 18893-done <at> debbugs.gnu.org:
tag 18893 notabug
thanks
On 10/29/2014 03:58 PM, Michael Yang wrote:
> There might be a bug in the “sort” program in GNU coreutils 8.4, present at
> least in CentOS 6 x86_64. It’s not immediately obvious to me whether or
> not this bug has been reported before.
Thanks for the report. However, it has been frequently reported, to the
point that it has a FAQ entry:
https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021
> sort (GNU coreutils) 8.4 yields:
>
> CC = aCC
> CC = cc
> CCFLAGS =
> CC = gcc
You can use the --debug flag to see what is going on (well, you can when
using new enough sort; 8.4 is rather old these days, and while there
HAVE been sort bug fixes in the meantime, they are for rather obscure
corner cases and not for your issue).
$ printf 'CC = aCC\nCC = cc\nCCFLAGS =\nCC = gcc\n' | sort --debug
sort: using ‘en_US.UTF-8’ sorting rules
CC = aCC
________
CC = cc
_______
CCFLAGS =
_________
CC = gcc
________
I'm guessing that on your CentOS box, your locale is set to en_US.UTF-8,
or some similar locale which collates case-insensitively and ignores
punctuation. In such a collation sequence, you are comparing 'ccflags'
vs. 'ccgcc', and the final output order is correct.
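The comparison described above can be approximated by hand: build a key for each line by deleting blanks and the equal sign and folding to lower case, then sort bytewise on that key. This is only a rough sketch of the first-pass comparison in a locale like en_US.UTF-8, not glibc's actual collation algorithm, and `approx_locale_sort` is a name made up for this illustration.

```shell
# Hypothetical helper: approximate a punctuation- and case-ignoring
# first-pass comparison. Reads lines on stdin, writes them sorted.
approx_locale_sort() {
  tab=$(printf '\t')
  while IFS= read -r line; do
    # Key: the line with blanks and '=' removed, folded to lower case.
    key=$(printf '%s' "$line" | LC_ALL=C tr -d ' =' | LC_ALL=C tr 'A-Z' 'a-z')
    printf '%s\t%s\n' "$key" "$line"
  done | LC_ALL=C sort -t "$tab" -k1,1 | cut -f2-
}

printf 'CC = gcc\nCC = aCC\nCCFLAGS =\nCC = cc\n' | approx_locale_sort
```

The keys are 'ccacc', 'cccc', 'ccflags' and 'ccgcc', so the lines come out in the same order as the reported 8.4 output, with 'CCFLAGS =' before 'CC = gcc'.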
> … the 3rd line is out-of-order. In comparison, sort (GNU coreutils) 8.14
> in cygwin yields:
The version of sort makes no difference; rather, it is entirely up to
the locale (and by the way, cygwin now ships with 8.23, so you may want
to upgrade); on your cygwin box, I'm guessing that you are using the C
locale. And even if you are using the en_US locale there, you must
remember that the cygwin locale definitions come from Windows, not
glibc, and therefore may differ in what the two locale writers thought
would make sense (that is, while the glibc en_US locale ignores
punctuation, maybe the Windows en_US locale does not). At any rate, on
your CentOS box, you can force the C locale to get the same behavior as
cygwin seemed to give by default:
$ printf 'CC = aCC\nCC = cc\nCCFLAGS =\nCC = gcc\n' | LC_ALL=C sort --debug
sort: using simple byte comparison
CC = aCC
________
CC = cc
_______
CC = gcc
________
CCFLAGS =
_________
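The bytewise result can be checked by hand. 'CC = gcc' and 'CCFLAGS =' share the prefix 'CC', so the third byte decides: a space (0x20) in one line, 'F' (0x46) in the other. A small sketch using `od`; `third_byte` is a hypothetical helper name for this illustration.

```shell
# Hypothetical helper: print the third byte of a string in hex.
third_byte() {
  # -j2 skips two bytes, -N1 reads one, -An drops the offset column.
  printf '%s' "$1" | od -An -tx1 -j2 -N1 | tr -d ' \n'
  echo
}

third_byte 'CC = gcc'    # prints 20 (space)
third_byte 'CCFLAGS ='   # prints 46 ('F')
```

Since 0x20 is less than 0x46, every 'CC = ...' line sorts before 'CCFLAGS =' in the C locale.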
Therefore, I'm closing this as not a bug, but feel free to respond if
you have further comments or questions.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Information forwarded to bug-coreutils <at> gnu.org: bug#18893; Package coreutils. (Wed, 29 Oct 2014 22:42:02 GMT)
Message #15 received at 18893 <at> debbugs.gnu.org:
tag 18893 notabug
thanks
On 10/29/2014 10:58 PM, Michael Yang wrote:
> sort (GNU coreutils) 8.4 yields:
>
> CC = aCC
> CC = cc
> CCFLAGS =
> CC = gcc
Newer builds of sort include a --debug flag that shows you what is going on:
$ printf "CC = gcc\nCC = aCC\nCCFLAGS =\nCC = cc\n" | src/sort --debug
src/sort: using ‘en_US.UTF-8’ sorting rules
CC = aCC
________
CC = cc
_______
CCFLAGS =
_________
CC = gcc
________
versus
$ printf "CC = gcc\nCC = aCC\nCCFLAGS =\nCC = cc\n" | LC_ALL=C src/sort --debug
src/sort: using simple byte comparison
CC = aCC
________
CC = cc
_______
CC = gcc
________
CCFLAGS =
_________
You have hit an FAQ:
https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021
Your current locale has chosen a collation sequence that ignores blanks
and the equal sign, so sort is sorting correctly. Set LC_ALL=C in the
environment of sort if you want bytewise sorting.
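In a script, the override can be scoped to the one command that needs it, so everything else keeps the user's locale. A minimal sketch; `sort_bytewise` is a hypothetical wrapper name, not part of coreutils.

```shell
# Hypothetical wrapper: bytewise collation for this sort call only.
sort_bytewise() {
  LC_ALL=C sort "$@"
}

# Show the settings the rest of the script still sees (varies by system).
printf 'LC_ALL=%s LANG=%s\n' "${LC_ALL:-unset}" "${LANG:-unset}"

printf 'CC = gcc\nCC = aCC\nCCFLAGS =\nCC = cc\n' | sort_bytewise
```

The wrapper passes its arguments through unchanged, so options such as -u or -k keep working; only the collation is pinned.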
Have a nice day,
Berny
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 27 Nov 2014 12:24:04 GMT)
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.