GNU bug report logs - #14226
Sort -c takes in account fields that were outside sorting scope

Package: coreutils;

Reported by: Camion SPAM <camion_spam-gnubugs <at> yahoo.fr>

Date: Thu, 18 Apr 2013 17:11:02 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Camion SPAM <camion_spam-gnubugs <at> yahoo.fr>
Subject: bug#14226: closed (Re: bug#14226: Sort -c takes in account fields
 that were outside sorting scope)
Date: Thu, 18 Apr 2013 19:59:03 +0000
[Message part 1 (text/plain, inline)]
Your bug report

#14226: Sort -c takes in account fields that were outside sorting scope

which was filed against the coreutils package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 14226 <at> debbugs.gnu.org.

-- 
14226: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=14226
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Eric Blake <eblake <at> redhat.com>
To: Camion SPAM <camion_spam-gnubugs <at> yahoo.fr>
Cc: 14226-done <at> debbugs.gnu.org
Subject: Re: bug#14226: Sort -c takes in account fields that were outside
	sorting scope
Date: Thu, 18 Apr 2013 13:53:24 -0600
[Message part 3 (text/plain, inline)]
tag 14226 notabug
thanks

On 04/18/2013 09:04 AM, Camion SPAM wrote:
> The following commands report an error on equal lines because fields outside the sorting scope were not sorted

How refreshing to get a non-FAQ report on sort - you made me actually do
some research!  The fact that you used LANG=C to pin the locale is also
nice (most people aren't aware that most reported non-bugs in sort are
due to locale issues).  However, I still think sort is doing the right
thing.

> 
> $ cat <<'.' |
>> AAA AAA
>> BBB BBB
>> ZZZ CCC
>> DDD DDD
>> BBC EEE
>> BBD EEE
>> BBC EEE
>> BBE EEE
>> CCC FFF
>> DDD GGG
>> EEE HHH
>> .
>> LANG=C sort -k 2,2 -c
> sort: -:7: disorder: BBC EEE

POSIX says:
"Except when the -u option is specified, lines that otherwise compare
equal shall be ordered as if none of the options -d, -f, -i, -n, or -k
were present (but with -r still in effect, if it was specified) and with
all bytes in the lines significant to the comparison."
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html

In your example, you did not use -u, and the key you specified was
duplicated between two rows, so POSIX requires sort to break the tie by
comparing the entire line, and the entire line is indeed different.
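
A minimal sketch of that tie-break, reduced to the two lines that trigger
the report. It assumes a POSIX-conforming sort (GNU sort here); the
variable names `input` and `default_check` are hypothetical and exist only
for this illustration.

```shell
# Two lines whose second field ties (EEE) but whose first field is out of order.
input=$(printf 'BBD EEE\nBBC EEE\n')

# The key -k 2,2 compares equal, so sort falls back to comparing whole lines;
# "BBD EEE" sorts after "BBC EEE", so the check is expected to fail.
if printf '%s\n' "$input" | LC_ALL=C sort -k 2,2 -c 2>/dev/null; then
    default_check=sorted
else
    default_check=disorder
fi
echo "default: $default_check"
```

The check fails on the whole-line comparison, not on the key itself.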

For comparison purposes, I checked out /usr/bin/sort on Solaris 10; it
has the same behavior of declaring your input unsorted.
/usr/xpg4/bin/sort on the same machine is not POSIX compliant, in that
it lacks -C, and treats -c like the POSIX -C; but it also had non-zero
exit status on your sample.

If you don't like the POSIX behavior of a mandated entire line as a sort
key of final resort, then you should use the GNU extension -s; I
tested that 'LC_ALL=C sort -k2,2 -c -s' has no problems with your
example.  To see the difference between using and not using the entire
line as the final sort key, replace -c with --debug, both with and
without -s (you can't use -c and --debug at the same time,
unfortunately).  However, remember that not all sort implementations
have -s, so there is no standard way to get the behavior you are after.
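
As a rough illustration, the reported sample can be checked again with -s
added. This sketch assumes a sort that supports -s (GNU sort does); the
names `sample` and `stable_check` are hypothetical.

```shell
# The sample input from the original report.
sample=$(cat <<'EOF'
AAA AAA
BBB BBB
ZZZ CCC
DDD DDD
BBC EEE
BBD EEE
BBC EEE
BBE EEE
CCC FFF
DDD GGG
EEE HHH
EOF
)

# -s disables the last-resort whole-line comparison, so lines that tie on
# the second field are accepted in whatever order they appear.
if printf '%s\n' "$sample" | LC_ALL=C sort -s -k 2,2 -c 2>/dev/null; then
    stable_check=sorted
else
    stable_check=disorder
fi
echo "with -s: $stable_check"
```

Because the second field is already in order, only the key ties matter
here, and -s is what lets them pass.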

I'm closing this as not a bug, although you may continue to add comments
or questions to this topic.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

[signature.asc (application/pgp-signature, attachment)]
[Message part 5 (message/rfc822, inline)]
From: Camion SPAM <camion_spam-gnubugs <at> yahoo.fr>
To: "bug-coreutils <at> gnu.org" <bug-coreutils <at> gnu.org>
Subject: Sort -c takes in account fields that were outside sorting scope
Date: Thu, 18 Apr 2013 16:04:37 +0100 (BST)
[Message part 6 (text/plain, inline)]
The following commands report an error on equal lines because fields outside the sorting scope were not sorted

$ cat <<'.' |
> AAA AAA
> BBB BBB
> ZZZ CCC
> DDD DDD
> BBC EEE
> BBD EEE
> BBC EEE
> BBE EEE
> CCC FFF
> DDD GGG
> EEE HHH
> .
> LANG=C sort -k 2,2 -c
sort: -:7: disorder: BBC EEE

This bug report was last modified 12 years and 112 days ago.
