GNU bug report logs -
#72617
sort -n loses lines.
Reported by: Simon B <simon.buongiorno <at> gmail.com>
Date: Wed, 14 Aug 2024 09:14:01 UTC
Severity: normal
Tags: notabug
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
Report forwarded to bug-coreutils <at> gnu.org: bug#72617; Package coreutils. (Wed, 14 Aug 2024 09:14:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hello,
The output of my grep command is:
# grep -i "sshd" /root/access.report | egrep -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
64.227.127.122
172.169.5.249
172.169.6.164
172.168.40.186
13.64.194.111
71.6.134.231
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
71.6.134.231
45.56.94.150
45.155.91.30
35.86.214.67
35.84.0.64
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
35.94.106.188
115.231.78.8
34.223.41.242
35.84.141.160
159.65.29.253
115.231.78.8
The expected output of sort (sort -urbn) is:
178.128.44.128
172.169.6.164
172.169.5.249
172.168.40.186
159.65.29.253
142.93.95.82
115.231.78.8
71.6.134.231
64.227.127.122
45.56.94.150
45.155.91.30
35.94.106.188
35.86.214.67
35.84.141.160
35.84.0.64
34.223.41.242
13.64.194.111
The actual output of sort is:
# grep -i "sshd" /root/access.report | egrep -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' | sort -urbn
178.128.44.128
172.169.5.249
172.168.40.186
159.65.29.253
142.93.95.82
115.231.78.8
71.6.134.231
64.227.127.122
45.56.94.150
45.155.91.30
35.94.106.188
35.86.214.67
35.84.0.64
34.223.41.242
13.64.194.111
The expected output is only achieved by calling sort twice:
# grep -i "sshd" /root/access.report | egrep -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' | sort -urb | sort -nr
178.128.44.128
172.169.6.164
172.169.5.249
172.168.40.186
159.65.29.253
142.93.95.82
115.231.78.8
71.6.134.231
64.227.127.122
45.56.94.150
45.155.91.30
35.94.106.188
35.86.214.67
35.84.141.160
35.84.0.64
34.223.41.242
13.64.194.111
Regards
Simon
Message #8 received at 72617 <at> debbugs.gnu.org (full text, mbox):
tag 72617 notabug
close 72617
stop
On 14/08/2024 09:43, Simon B wrote:
> Hello,
>
> The output of my grep command is:
>
> # grep -i "sshd" /root/access.report | egrep -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
> 64.227.127.122
> 172.169.5.249
> 172.169.6.164
Adding the --debug option shows the issue.
I.e. the first '.' is treated as a decimal point, so only the leading 'NNN.NNN' part of each address is compared as a number:
$ sort --debug -rbn -s ips
sort: text ordering performed using ‘en_IE.UTF-8’ sorting rules
sort: note numbers use ‘.’ as a decimal point in this locale
178.128.44.128
_______
172.169.6.164
_______
172.169.5.249
_______
Taking the example for sorting IPv4 addresses from the manual
shows the desired comparisons being performed:
$ sort --debug -t '.' -k 1,1rn -k 2,2rn -k 3,3rn -k 4,4rn -u ips
sort: text ordering performed using ‘en_IE.UTF-8’ sorting rules
sort: numbers use ‘.’ as a decimal point in this locale
178.128.44.128
___
___
__
___
172.169.6.164
___
___
_
___
172.169.5.249
___
___
_
___
cheers,
Pádraig
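The per-field command above can be tried on a small sample. The following is a minimal sketch, not part of the original thread: the function name `sort_ipv4_desc` is hypothetical, the sample addresses are taken from the report, and `LC_ALL=C` is an assumption added only to make the run independent of the user's locale.

```shell
# Hypothetical helper: sort IPv4 addresses from stdin in descending order.
# -t '.' splits each line into fields on dots, each -k N,Nrn compares one
# octet as its own reverse-numeric key, and -u drops lines whose four
# keys all compare equal (i.e. duplicate addresses).
sort_ipv4_desc() {
    LC_ALL=C sort -t '.' -k 1,1rn -k 2,2rn -k 3,3rn -k 4,4rn -u
}

# Sample input taken from the report, including one duplicate.
printf '%s\n' 35.84.0.64 172.169.5.249 172.169.6.164 35.84.141.160 172.169.5.249 |
    sort_ipv4_desc
# Output:
# 172.169.6.164
# 172.169.5.249
# 35.84.141.160
# 35.84.0.64
```

Because every octet is its own key, 172.169.6.164 and 172.169.5.249 differ on the third key and both survive -u.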
Added tag(s) notabug. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Wed, 14 Aug 2024 09:51:02 GMT)
Bug closed, send any further explanations to 72617 <at> debbugs.gnu.org and Simon B <simon.buongiorno <at> gmail.com>. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Wed, 14 Aug 2024 09:51:03 GMT)
Message #15 received at 72617 <at> debbugs.gnu.org (full text, mbox):
On 14/08/2024 11:04, Simon B wrote:
> Hi Pádraig
>
> I am largely satisfied by your great explanation,
>
> I am still confused why lines go "missing" though.
>
> Even if the dot is being interpreted, it still should not lose the
> line containing 172.169.6.164
This is actually well explained in the online documentation,
so I won't repeat it here. See the --unique description at:
https://www.gnu.org/software/coreutils/sort
cheers,
Pádraig
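For readers without the manual at hand: with -u, sort outputs only the first of a run of lines that compare equal, and the usual last-resort whole-line comparison is disabled. Under -n, two addresses that share the same leading "NNN.NNN" prefix therefore compare equal, and one of them is dropped. A minimal sketch of that effect, using two addresses from the report (the variable names are illustrative, and `LC_ALL=C` is an assumption added for reproducibility):

```shell
# Both lines parse to the number 172.169 under -n; parsing stops at the
# second '.', so the remaining octets are never looked at.
with_u=$(printf '%s\n' 172.169.5.249 172.169.6.164 | LC_ALL=C sort -n -u | wc -l)

# Without -u nothing is discarded; equal-comparing lines are only ordered
# by the last-resort comparison.
without_u=$(printf '%s\n' 172.169.5.249 172.169.6.164 | LC_ALL=C sort -n | wc -l)

echo "sort -n -u keeps $with_u line(s); sort -n keeps $without_u line(s)"
```

On this input the first count is 1 and the second is 2, which is the same mechanism that removed 172.169.6.164 in the report.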
Message #18 received at 72617 <at> debbugs.gnu.org (full text, mbox):
On Wed, 14 Aug 2024 at 11:48, Pádraig Brady <P <at> draigbrady.com> wrote:
>
> tag 72617 notabug
> close 72617
> stop
>
> On 14/08/2024 09:43, Simon B wrote:
> > Hello,
> >
> > The output of my grep command is:
> >
> > # grep -i "sshd" /root/access.report | egrep -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
> > 64.227.127.122
> > 172.169.5.249
> > 172.169.6.164
>
> Adding the --debug option shows the issue.
> I.e. the '.' being considered as part of a number:
>
> $ sort --debug -rbn -s ips
> sort: text ordering performed using ‘en_IE.UTF-8’ sorting rules
> sort: note numbers use ‘.’ as a decimal point in this locale
> 178.128.44.128
> _______
> 172.169.6.164
> _______
> 172.169.5.249
> _______
>
>
> Taking the example for sorting IPv4 addresses from the manual,
> shows the desired comparisons being performed:
>
> $ sort --debug -t '.' -k 1,1rn -k 2,2rn -k 3,3rn -k 4,4rn -u ips
> sort: text ordering performed using ‘en_IE.UTF-8’ sorting rules
> sort: numbers use ‘.’ as a decimal point in this locale
> 178.128.44.128
> ___
> ___
> __
> ___
> 172.169.6.164
> ___
> ___
> _
> ___
> 172.169.5.249
> ___
> ___
> _
> ___
>
Hi Pádraig
I am largely satisfied by your great explanation,
I am still confused why lines go "missing" though.
Unsorted output:
# grep -i "sshd" /root/access.report | egrep -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' | wc -l
41
Sorted once:
# grep -i "sshd" /root/access.report | egrep -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' | sort -urbn | wc -l
15
Sorted workaround:
# grep -i "sshd" /root/access.report | egrep -o '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' | sort -urb | sort -nr | wc -l
17
Even if the dot is being interpreted, it still should not lose the
line containing 172.169.6.164
Regards
Simon
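The 15 versus 17 counts above can be reproduced without the original log file. This is a sketch under stated assumptions: the variable `ips` is a hypothetical stand-in for the grep output, holding the 17 distinct addresses from the report plus two repeats, and `LC_ALL=C` is added only to pin the locale.

```shell
# The 17 distinct addresses from the report, plus two repeated lines.
ips='178.128.44.128
172.169.6.164
172.169.5.249
172.168.40.186
159.65.29.253
142.93.95.82
115.231.78.8
71.6.134.231
64.227.127.122
45.56.94.150
45.155.91.30
35.94.106.188
35.86.214.67
35.84.141.160
35.84.0.64
34.223.41.242
13.64.194.111
178.128.44.128
142.93.95.82'

# One pass: -u deduplicates on the numeric value seen by -n (e.g. 172.169),
# so addresses sharing a two-octet prefix collapse into one line.
once=$(printf '%s\n' "$ips" | LC_ALL=C sort -urbn | wc -l)

# Two passes: the first sort deduplicates on the whole line, the second
# only reorders and discards nothing.
twice=$(printf '%s\n' "$ips" | LC_ALL=C sort -urb | LC_ALL=C sort -nr | wc -l)

echo "sort -urbn: $once lines"
echo "sort -urb | sort -nr: $twice lines"
```

On this sample the one-pass pipeline yields 15 lines and the two-pass pipeline 17, because the pairs 172.169.x.x and 35.84.x.x each share a numeric prefix.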
Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 12 Sep 2024 11:24:07 GMT)