GNU bug report logs - #9562
unexpected sort behaviour

Package: coreutils;

Reported by: vijay krishna <krishna.vijay4444 <at> gmail.com>

Date: Tue, 20 Sep 2011 16:23:02 UTC

Severity: normal

Tags: notabug

Merged with 9561

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.


Report forwarded to bug-coreutils <at> gnu.org:
bug#9562; Package coreutils. (Tue, 20 Sep 2011 16:23:02 GMT)

Acknowledgement sent to vijay krishna <krishna.vijay4444 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Tue, 20 Sep 2011 16:23:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: vijay krishna <krishna.vijay4444 <at> gmail.com>
To: bug-coreutils <bug-coreutils <at> gnu.org>
Subject: unexpected sort behaviour
Date: Tue, 20 Sep 2011 04:51:51 -0700
Hello Team,

  May I please know the reason for the following behaviour of the sort
command...

------------------
0 $ sort -k 2 bug2_file2
512 b1
512 b101

0 $ sort bug2_file1
b101 512
b1 512

0 $ sort -k 1 bug2_file1
b101 512
b1 512
------------------

  The output is the same with the options '-s', '-g', '-n', '-d' (used one
at a time).



------------------
0 $ sort --version
sort (GNU coreutils) 5.97
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software.  You may redistribute copies of it under the terms of
the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
There is NO WARRANTY, to the extent permitted by law.
Written by Mike Haertel and Paul Eggert.


Regards,
Krishna

Reply sent to Eric Blake <eblake <at> redhat.com>:
You have taken responsibility. (Tue, 20 Sep 2011 16:33:01 GMT)

Notification sent to vijay krishna <krishna.vijay4444 <at> gmail.com>:
bug acknowledged by developer. (Tue, 20 Sep 2011 16:33:02 GMT)

Message #10 received at 9562-done <at> debbugs.gnu.org (full text, mbox):

From: Eric Blake <eblake <at> redhat.com>
To: vijay krishna <krishna.vijay4444 <at> gmail.com>
Cc: 9562-done <at> debbugs.gnu.org
Subject: Re: bug#9562: unexpected sort behaviour
Date: Tue, 20 Sep 2011 10:26:58 -0600
force-merge 9562 9561
tag 9562 notabug
thanks

On 09/20/2011 05:51 AM, vijay krishna wrote:
> Hello Team,
>
>    May I please know the reason for the following behaviour of the sort
> command...
>

Thanks for the report; however, this is not a bug.  As mentioned in the 
FAQ, you are encountering this behavior because of your choice of locale:

https://www.gnu.org/software/coreutils/faq/#Sort-does-not-sort-in-normal-order_0021

> 0 $ sort -k 1 bug2_file1
> b101 512
> b1 512
> ------------------

> sort (GNU coreutils) 5.97

Newer sort also comes with a --debug option that would help explain your 
predicament (5.97 is YEARS old; the latest is 8.13, with numerous bug 
fixes, although none of the behavior you show is affected by any of 
those bug fixes).

$ printf 'b101 512\nb1 512\n' | LC_ALL=C sort -k1 --debug
sort: using simple byte comparison
b1 512
______
______
b101 512
________
________

$ printf 'b101 512\nb1 512\n' | sort -k1 --debug
sort: using `en_US.UTF-8' sorting rules
sort: leading blanks are significant in key 1; consider also specifying `b'
b101 512
________
________
b1 512
______
______

$ printf 'b101 512\nb1 512\n' | sort -k1,1 --debug
sort: using `en_US.UTF-8' sorting rules
sort: leading blanks are significant in key 1; consider also specifying `b'
b1 512
__
______
b101 512
____
________


In the en_US.UTF-8 locale, collation is done by dictionary ordering,
where whitespace is insignificant to the collation; and specifying
-k1 instead of the more precise -k1,1 means that you are sorting the
entire line instead of just the first field of the line.  Since "b1512"
collates greater than "b101512" in en_US collation rules (the third
character, "5", sorts after "0"), the same applies to "b1 512" and
"b101 512".  Notice how use of -k1,1 changed the output by comparing
only "b1" and "b101", or how use of LC_ALL=C changed the output by
switching to bytewise collation with no dictionary sorting, where space
becomes significant.
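
The two workarounds described above can be tried together in a short sketch. It assumes a POSIX shell and a `sort` that honours LC_ALL; the variable names (`data`, `c_order`, `key_order`) are made up for illustration and are not part of the original report. Both commands pin the C locale so the result does not depend on which locales are installed.

```shell
# Sample data from the report: two lines with space-separated fields.
data=$(printf 'b101 512\nb1 512\n')

# Workaround 1: bytewise collation over the whole line.
# Space (0x20) sorts before '0' (0x30), so "b1 512" comes first.
c_order=$(printf '%s\n' "$data" | LC_ALL=C sort)

# Workaround 2: limit the key to the first field with -k1,1.
# Only "b1" and "b101" are compared; "b1" is a prefix, so it sorts first.
key_order=$(printf '%s\n' "$data" | LC_ALL=C sort -k1,1)

printf '%s\n' "$c_order"
# prints:
# b1 512
# b101 512
```

Setting LC_ALL=C for the single command, rather than exporting it, leaves the rest of the session in its usual locale.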

-- 
Eric Blake   eblake <at> redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org




Added tag(s) notabug. Request was from Eric Blake <eblake <at> redhat.com> to control <at> debbugs.gnu.org. (Tue, 20 Sep 2011 17:50:02 GMT)

Forcibly Merged 9561 9562. Request was from Eric Blake <eblake <at> redhat.com> to control <at> debbugs.gnu.org. (Tue, 20 Sep 2011 17:55:01 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Wed, 19 Oct 2011 11:24:05 GMT)

This bug report was last modified 13 years and 325 days ago.
