GNU bug report logs - #28847
Maybe a bug in "sort (GNU coreutils) 8.4" report

Package: coreutils

Reported by: kakaxixi777 <kakaxixi777 <at> gmail.com>

Date: Sun, 15 Oct 2017 09:30:02 UTC

Severity: normal

Tags: notabug

Merged with 28846

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 28847 in the body.
You can then email your comments to 28847 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-coreutils <at> gnu.org:
bug#28847; Package coreutils. (Sun, 15 Oct 2017 09:30:02 GMT)

Acknowledgement sent to kakaxixi777 <kakaxixi777 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sun, 15 Oct 2017 09:30:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: kakaxixi777 <kakaxixi777 <at> gmail.com>
To: "bug-coreutils <at> gnu.org" <bug-coreutils <at> gnu.org>
Subject: Maybe a bug in "sort (GNU coreutils) 8.4" report
Date: Sun, 15 Oct 2017 15:58:47 +0800
[Message part 1 (text/html, inline)]

Forcibly Merged 28846 28847. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Sun, 15 Oct 2017 23:38:01 GMT)

Added tag(s) notabug. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Sun, 15 Oct 2017 23:38:01 GMT)

Bug closed, send any further explanations to 28846 <at> debbugs.gnu.org and Tree Big <kakaxixi777 <at> gmail.com>. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Sun, 15 Oct 2017 23:38:02 GMT)

Information forwarded to bug-coreutils <at> gnu.org:
bug#28847; Package coreutils. (Mon, 16 Oct 2017 11:50:02 GMT)

Message #14 received at 28847 <at> debbugs.gnu.org:

From: Eric Blake <eblake <at> redhat.com>
To: kakaxixi777 <kakaxixi777 <at> gmail.com>, 28847 <at> debbugs.gnu.org
Subject: Re: bug#28847: Maybe a bug in "sort (GNU coreutils) 8.4" report
Date: Mon, 16 Oct 2017 06:49:35 -0500
[Message part 1 (text/plain, inline)]
tag 28847 notabug
thanks

On 10/15/2017 02:58 AM, kakaxixi777 wrote:
>    Dear coreutils :
>    I am a Research and Development Engineer in IT. I met a situation when
>    I use “sort” command in Linux shell which could be a bug for the "sort"
>    command. So I hope you read this email, thank you !
>    The whole command I used was :
>    sort test.txt
>    And the result was :
>    20171012|3|2059517|-|8|-|20-24|2|-|2.0|2.0
>    20171012|3|2059517|-|82|-|25-29|2|-|13.0|12.0
>    20171012|3|2059517|-|8|-|-2|-2|-|71.0|64.0
>    20171012|3|2059517|-|82|-|30-34|0|-|2.0|1.0
>    The content in test.txt was:
>    20171012|3|2059517|-|8|-|20-24|2|-|2.0|2.0
>    20171012|3|2059517|-|82|-|25-29|2|-|13.0|12.0
>    20171012|3|2059517|-|8|-|-2|-2|-|71.0|64.0
>    20171012|3|2059517|-|82|-|30-34|0|-|2.0|1.0

Your situation is a FAQ:
https://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021

Most likely, you are sorting in a locale that does not treat punctuation
with the same weight as digits, such as en_US.UTF8.  If you'll notice,
the substring '8202' sorts before '8225' which in turn is before '8227'
and finally '8230', once you've ignored the punctuation in '8|-|20-2',
'82|-|25', and so forth.
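
As a rough illustration of the punctuation-ignoring comparison described above, the sketch below builds a digits-only key for each line and sorts on that key byte-wise. It is only an approximation under stated assumptions: real locale collation is multi-level and differs between C libraries and versions, and the `tr`-based key and the names `data`, `tab`, and `approx` are hypothetical, chosen just for this example.

```shell
# The four lines from the report (contents of test.txt).
data='20171012|3|2059517|-|8|-|20-24|2|-|2.0|2.0
20171012|3|2059517|-|82|-|25-29|2|-|13.0|12.0
20171012|3|2059517|-|8|-|-2|-2|-|71.0|64.0
20171012|3|2059517|-|82|-|30-34|0|-|2.0|1.0'

tab=$(printf '\t')

# Assumption: model "ignore punctuation" by deleting every non-digit.
# Prefix each line with that key, sort on the key byte-wise, drop the key.
approx=$(
  printf '%s\n' "$data" |
  while IFS= read -r line; do
    key=$(printf '%s' "$line" | tr -cd '0-9')
    printf '%s\t%s\n' "$key" "$line"
  done |
  LC_ALL=C sort -t "$tab" -k1,1 |
  cut -f2-
)

printf '%s\n' "$approx"
```

With these four lines the digit-only keys continue with 8202, 8225, 8227, and 8230 after their common prefix, so the sketch leaves the input order unchanged, which matches what the reporter saw.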

>    Which means the “sort” command didn't work, because I think the correct
>    result should be :
>    20171012|3|2059517|-|8|-|20-24|2|-|2.0|2.0
>    20171012|3|2059517|-|8|-|20-24|2|-|2.0|2.0
>    20171012|3|2059517|-|82|-|25-29|2|-|13.0|12.0
>    20171012|3|2059517|-|82|-|25-29|2|-|13.0|12.0

Well, this isn't the right result either, as it duplicates two lines
and omits the other two (did you copy and paste incorrectly?).

>    The version of "sort" command I use is : sort --version
>    "sort (GNU coreutils) 8.4

This version is rather old; we are now at 8.28.  In any version from
8.6 onward, you can use sort's --debug feature to see where your
expectations are going wrong (99% of reports about sort misbehavior
turn out to be misuse of either command-line options or the current
locale).  Observe the difference:

$ printf '20171012|3|2059517|-|8|-|20-24|2|-|2.0|2.0\n20171012|3|2059517|-|82|-|25-29|2|-|13.0|12.0\n20171012|3|2059517|-|8|-|-2|-2|-|71.0|64.0\n' |
    LC_ALL=en_US.UTF8 sort --debug
sort: using ‘en_US.UTF8’ sorting rules
20171012|3|2059517|-|8|-|20-24|2|-|2.0|2.0
__________________________________________
20171012|3|2059517|-|82|-|25-29|2|-|13.0|12.0
_____________________________________________
20171012|3|2059517|-|8|-|-2|-2|-|71.0|64.0
__________________________________________

$ printf '20171012|3|2059517|-|8|-|20-24|2|-|2.0|2.0\n20171012|3|2059517|-|82|-|25-29|2|-|13.0|12.0\n20171012|3|2059517|-|8|-|-2|-2|-|71.0|64.0\n' |
    LC_ALL=C sort --debug
sort: using simple byte comparison
20171012|3|2059517|-|82|-|25-29|2|-|13.0|12.0
_____________________________________________
20171012|3|2059517|-|8|-|-2|-2|-|71.0|64.0
__________________________________________
20171012|3|2059517|-|8|-|20-24|2|-|2.0|2.0
__________________________________________

And if you want the lines containing '|8|' to sort before the lines
containing '|82|', then you can't use plain sort (which compares the
whole line).  Instead, use the -t, -k, and -n options to tell sort how
the fields are separated, which fields to sort on, and that those keys
should be treated as numbers rather than as character strings (when
sorting an entire line in ASCII, digits sort before |).
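
A minimal sketch of such an invocation, assuming the fifth '|'-separated field (8 or 82) is the one that should drive the order. The tie-break on field 7 and the variable names `data` and `sorted` are assumptions made for this example, not something taken from the report.

```shell
# The four lines from the report (contents of test.txt).
data='20171012|3|2059517|-|8|-|20-24|2|-|2.0|2.0
20171012|3|2059517|-|82|-|25-29|2|-|13.0|12.0
20171012|3|2059517|-|8|-|-2|-2|-|71.0|64.0
20171012|3|2059517|-|82|-|30-34|0|-|2.0|1.0'

# -t '|'   : fields are separated by '|'
# -k5,5n   : primary key is field 5 alone, compared as a number
# -k7,7    : assumed tie-break, field 7 compared as a string
# LC_ALL=C : byte-wise comparison for the string key
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -t '|' -k5,5n -k7,7)

printf '%s\n' "$sorted"
```

Here both lines whose fifth field is 8 come before both lines whose fifth field is 82. Within the 8 group, '-2' precedes '20-24' because '-' sorts before digits in byte order.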

>    I am not sure if it is a bug in "sort" command in Linux Shell or maybe
>    it's only my problems in using it.

I think I've demonstrated where the problem was, so I'm closing this as
not a bug.  Feel free to reply with further questions on the topic, though.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 13 Nov 2017 12:24:05 GMT)
