GNU bug report logs -
#67690
Bug in command sort?
To reply to this bug, email your comments to 67690 AT debbugs.gnu.org.
Report forwarded to bug-coreutils <at> gnu.org: bug#67690; Package coreutils. (Thu, 07 Dec 2023 14:51:02 GMT)
Acknowledgement sent to Oleg Moiseichuk <MetamAdeptus <at> gmx.net>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Thu, 07 Dec 2023 14:51:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Hello!
I've got a list of IP addresses, each prefixed with its frequency counter (see the attached file list-1.txt). I need to sort them from most frequent to least frequent. I tried this command:
sort -t '.' -n -k 1.1,1.8r -k 1.9 -k 2,2 -k 3,3 -k 4,4 list-1.txt
But I've got some weird results.
Ok, I merged these counters with IP addresses using awk (file list-2.txt). Now they use the same separator and I can simplify the command:
sort -t '.' -n -k 1,1r -k 2,2 -k 3,3 -k 4,4 -k 5,5 list-2.txt > sorted-a.txt
The output looks properly sorted, but some entries with the counters 13 and 10 are misplaced.
Strangely enough, when I use the default (ascending) order, they are sorted correctly:
sort -t '.' -n -k 1,1 -k 2,2 -k 3,3 -k 4,4 -k 5,5 list-2.txt > sorted-b.txt
Is this a bug, or am I doing something wrong?
I checked this on Ubuntu 22.04; the sort version is 8.32.
--
Best regards,
Oleg Moiseichuk
[list-1.txt (text/plain, attachment)]
[list-2.txt (text/plain, attachment)]
[sorted-a.txt (text/plain, attachment)]
[sorted-b.txt (text/plain, attachment)]
Information forwarded to bug-coreutils <at> gnu.org: bug#67690; Package coreutils. (Thu, 07 Dec 2023 15:38:02 GMT)
Message #8 received at 67690 <at> debbugs.gnu.org (full text, mbox):
tag 67690 notabug
close 67690
stop
On 07/12/2023 14:36, Oleg Moiseichuk via GNU coreutils Bug Reports wrote:
> Hello!
>
> I've got a list of IP addresses, each prefixed with its frequency counter (see the attached file list-1.txt). I need to sort them from most frequent to least frequent. I tried this command:
> sort -t '.' -n -k 1.1,1.8r -k 1.9 -k 2,2 -k 3,3 -k 4,4 list-1.txt
> But I've got some weird results.
Right, once you have multiple delimiters you generally need to adjust the data first.
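One way such an adjustment could look is sketched below. The exact awk command the reporter used is not shown in this thread, and the input layout (a counter, whitespace, then the address, as `uniq -c` would print it) is an assumption, as are the sample lines themselves.

```shell
# Hypothetical input in "count address" form (not taken from list-1.txt).
counted='     13 10.0.0.2
      9 10.0.0.1'

# Join the counter and the address with '.', so every field shares one delimiter.
merged=$(printf '%s\n' "$counted" | awk '{ print $1 "." $2 }')
printf '%s\n' "$merged"
```

After this step each line has five '.'-separated fields, which is the shape the `-t '.'` commands in this report expect.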
> Ok, I merged these counters with IP addresses using awk (file list-2.txt). Now they use the same separator and I can simplify the command:
> sort -t '.' -n -k 1,1r -k 2,2 -k 3,3 -k 4,4 -k 5,5 list-2.txt > sorted-a.txt
> The output looks properly sorted, but some entries with the counters 13 and 10 are misplaced.
> Strangely enough, when I use the default (ascending) order, they are sorted correctly:
> sort -t '.' -n -k 1,1 -k 2,2 -k 3,3 -k 4,4 -k 5,5 list-2.txt > sorted-b.txt
You're using the correct approach here, but missed this from the docs:
"A position in a sort field specified with ‘-k’ may have any of the
option letters ‘MbdfghinRrV’ appended to it, in which case no global
ordering options are inherited by that particular field."
I.e. the 'r' is cancelling out the global 'n' for that key, so field 1 is compared as text (in reverse) rather than numerically.
So you need to specify both options on that field, like:
sort -t '.' -n -k 1,1rn -k 2,2 -k 3,3 -k 4,4 -k 5,5 list-2.txt
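A minimal sketch of the difference, assuming GNU sort and the C locale. The four sample lines are made up for illustration and are not taken from the attached files.

```shell
# Hypothetical sample in the merged "count.a.b.c.d" form (not from list-2.txt).
sample='10.10.0.0.4
100.10.0.0.3
9.10.0.0.1
13.10.0.0.2'

# 'r' alone on the key: the global -n is not inherited,
# so field 1 is compared as text, in reverse.
text_order=$(printf '%s\n' "$sample" |
  LC_ALL=C sort -t '.' -n -k 1,1r -k 2,2 -k 3,3 -k 4,4 -k 5,5)

# 'rn' on the key: field 1 is compared numerically, in reverse.
num_order=$(printf '%s\n' "$sample" |
  LC_ALL=C sort -t '.' -n -k 1,1rn -k 2,2 -k 3,3 -k 4,4 -k 5,5)

printf 'with -k 1,1r:\n%s\n\nwith -k 1,1rn:\n%s\n' "$text_order" "$num_order"
```

With `-k 1,1r` the counters come out as 9, 13, 100, 10 (reverse text order); with `-k 1,1rn` they come out as 100, 13, 10, 9.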
cheers,
Pádraig
p.s. the --debug option can be useful with sort to
help identify what's being compared, and various edge cases.
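For instance, a rough sketch on two made-up lines (hypothetical data, assuming GNU sort). The exact diagnostic wording varies between coreutils versions, so no output is reproduced here.

```shell
# Hypothetical two-line sample. --debug marks the part of each line used by
# each key and prints diagnostics about the comparison on stderr.
debug_out=$(printf '13.10.0.0.2\n9.10.0.0.1\n' |
  LC_ALL=C sort --debug -t '.' -k 1,1rn -k 2,2n 2>&1)
printf '%s\n' "$debug_out"
```

The underlined spans show which characters each `-k` key covers, which makes it easier to spot a key that is being compared in an unintended way.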
This bug report was last modified 1 year and 197 days ago.