GNU bug report logs - #72617
sort -n loses lines.

Package: coreutils

Reported by: Simon B <simon.buongiorno <at> gmail.com>

Date: Wed, 14 Aug 2024 09:14:01 UTC

Severity: normal

Tags: notabug

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.


Report forwarded to bug-coreutils <at> gnu.org:
bug#72617; Package coreutils. (Wed, 14 Aug 2024 09:14:01 GMT)

Acknowledgement sent to Simon B <simon.buongiorno <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Wed, 14 Aug 2024 09:14:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Simon B <simon.buongiorno <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: sort -n loses lines.
Date: Wed, 14 Aug 2024 10:43:06 +0200
Hallo,

The output of my grep command is:

# grep -i "sshd" /root/access.report | egrep -o
 '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
64.227.127.122
172.169.5.249
172.169.6.164
172.168.40.186
13.64.194.111
71.6.134.231
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
178.128.44.128
71.6.134.231
45.56.94.150
45.155.91.30
35.86.214.67
35.84.0.64
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
142.93.95.82
35.94.106.188
115.231.78.8
34.223.41.242
35.84.141.160
159.65.29.253
115.231.78.8


The expected output of sort (sort -urbn) is:

178.128.44.128
172.169.6.164
172.169.5.249
172.168.40.186
159.65.29.253
142.93.95.82
115.231.78.8
71.6.134.231
64.227.127.122
45.56.94.150
45.155.91.30
35.94.106.188
35.86.214.67
35.84.141.160
35.84.0.64
34.223.41.242
13.64.194.111


The actual output of sort is:

# grep -i "sshd" /root/access.report | egrep -o
 '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
| sort -urbn
178.128.44.128
172.169.5.249
172.168.40.186
159.65.29.253
142.93.95.82
115.231.78.8
71.6.134.231
64.227.127.122
45.56.94.150
45.155.91.30
35.94.106.188
35.86.214.67
35.84.0.64
34.223.41.242
13.64.194.111

The expected output is only achieved by calling sort twice:

# grep -i "sshd" /root/access.report | egrep -o
 '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
| sort -urb | sort -nr
178.128.44.128
172.169.6.164
172.169.5.249
172.168.40.186
159.65.29.253
142.93.95.82
115.231.78.8
71.6.134.231
64.227.127.122
45.56.94.150
45.155.91.30
35.94.106.188
35.86.214.67
35.84.141.160
35.84.0.64
34.223.41.242
13.64.194.111


Regards

Simon
Information forwarded to bug-coreutils <at> gnu.org:
bug#72617; Package coreutils. (Wed, 14 Aug 2024 09:51:02 GMT)

Message #8 received at 72617 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Simon B <simon.buongiorno <at> gmail.com>, 72617 <at> debbugs.gnu.org
Subject: Re: bug#72617: sort -n loses lines.
Date: Wed, 14 Aug 2024 10:48:53 +0100
tag 72617 notabug
close 72617
stop

On 14/08/2024 09:43, Simon B wrote:
> Hallo,
> 
> The output of my grep command is:
> 
> # grep -i "sshd" /root/access.report | egrep -o
>   '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
> 64.227.127.122
> 172.169.5.249
> 172.169.6.164

Adding the --debug option shows the issue: the first '.' is treated as a
decimal point, so only the leading part of each line (e.g. "178.128") is
compared as a number:

$ sort --debug -rbn -s ips
sort: text ordering performed using ‘en_IE.UTF-8’ sorting rules
sort: note numbers use ‘.’ as a decimal point in this locale
178.128.44.128
_______
172.169.6.164
_______
172.169.5.249
_______
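
As a rough illustration of what the underlines mean (a sketch, not part of the original message): with -n only the leading "172.169" of each line is the key, so two different addresses can tie. The -s flag makes such a tie visible, because tied lines then come out in input order. LC_ALL=C and the two sample addresses are assumptions chosen for reproducibility; they stand in for the reporter's locale and full input.

```shell
# Same two addresses, fed in both orders.
# LC_ALL=C (an assumption) fixes '.' as the decimal point.
a=$(printf '172.169.5.249\n172.169.6.164\n' | LC_ALL=C sort -rn -s)
b=$(printf '172.169.6.164\n172.169.5.249\n' | LC_ALL=C sort -rn -s)

# With -s there is no last-resort comparison, so tied lines keep
# their input order and each result simply echoes its input.
printf 'first order:\n%s\nsecond order:\n%s\n' "$a" "$b"
```

Each result repeats its input order, which indicates that sort sees the two lines as equal under -rn.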


Taking the example for sorting IPv4 addresses from the manual
shows the desired comparisons being performed:

$ sort --debug -t '.' -k 1,1rn -k 2,2rn -k 3,3rn -k 4,4rn -u ips
sort: text ordering performed using ‘en_IE.UTF-8’ sorting rules
sort: numbers use ‘.’ as a decimal point in this locale
178.128.44.128
___
    ___
        __
           ___
172.169.6.164
___
    ___
        _
          ___
172.169.5.249
___
    ___
        _
          ___
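
A minimal sketch of the per-octet approach on a handful of addresses taken from the report (the sample subset and LC_ALL=C are assumptions for reproducibility). Each -k N,Nrn key restricts the numeric comparison to one dot-separated field, so -u should only drop lines whose four octets all match.

```shell
# Sample input: a subset of the reporter's addresses, with one exact duplicate.
ips='45.56.94.150
45.155.91.30
172.169.5.249
172.169.6.164
35.84.0.64
35.84.141.160
172.169.5.249'

# Sort descending by each octet in turn; -u removes only true duplicates.
sorted=$(printf '%s\n' "$ips" |
  LC_ALL=C sort -t '.' -k 1,1rn -k 2,2rn -k 3,3rn -k 4,4rn -u)

printf '%s\n' "$sorted"
```

Here both 172.169.x.x lines survive, and 45.155.91.30 sorts ahead of 45.56.94.150, which plain -rn would order the other way round because it reads them as 45.155 and 45.56.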


cheers,
Pádraig




Added tag(s) notabug. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Wed, 14 Aug 2024 09:51:02 GMT)

bug closed, send any further explanations to 72617 <at> debbugs.gnu.org and Simon B <simon.buongiorno <at> gmail.com> Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Wed, 14 Aug 2024 09:51:03 GMT)

Information forwarded to bug-coreutils <at> gnu.org:
bug#72617; Package coreutils. (Wed, 14 Aug 2024 11:36:01 GMT)

Message #15 received at 72617 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Simon B <simon.buongiorno <at> gmail.com>
Cc: 72617 <at> debbugs.gnu.org
Subject: Re: bug#72617: sort -n loses lines.
Date: Wed, 14 Aug 2024 12:33:45 +0100
On 14/08/2024 11:04, Simon B wrote:
> Hi Pádraig
> 
> I am largely satisfied by your great explanation,
> 
> I am still confused why lines go "missing" though.
> 

> Even if the dot is being interpreted, it still should not lose the
> line containing 172.169.6.164

This is actually well explained in the online documentation,
so I won't repeat it here.  See the --unique description at:
https://www.gnu.org/software/coreutils/sort
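
For readers who do not follow the link: the --unique description says, in essence, that -u keeps only the first of a run of lines that compare equal under the active keys, and that sort -n -u looks only at the initial numeric string whereas sort -n | uniq looks at the whole line. The following sketch contrasts the two on three sample lines (the sample data and LC_ALL=C are assumptions for reproducibility).

```shell
# Two distinct addresses sharing the numeric prefix 35.84, plus one exact duplicate.
ips='35.84.0.64
35.84.141.160
35.84.0.64'

# -u compares only the numeric key (35.84 on every line): one line is left.
by_key=$(printf '%s\n' "$ips" | LC_ALL=C sort -rn -u)

# uniq compares whole lines, so only the exact duplicate is dropped.
by_line=$(printf '%s\n' "$ips" | LC_ALL=C sort -rn | uniq)

printf 'sort -rn -u:\n%s\n\nsort -rn | uniq:\n%s\n' "$by_key" "$by_line"
```

This would also explain why the two-step workaround in the report (sort -urb | sort -nr) returns 17 lines: the first sort deduplicates on whole lines, and the second only reorders them.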

cheers,
Pádraig




Information forwarded to bug-coreutils <at> gnu.org:
bug#72617; Package coreutils. (Wed, 14 Aug 2024 11:49:02 GMT)

Message #18 received at 72617 <at> debbugs.gnu.org:

From: Simon B <simon.buongiorno <at> gmail.com>
To: P <at> draigbrady.com
Cc: 72617 <at> debbugs.gnu.org
Subject: Re: bug#72617: sort -n loses lines.
Date: Wed, 14 Aug 2024 12:04:58 +0200
Hi Pádraig

I am largely satisfied by your great explanation,

I am still confused why lines go "missing" though.

Unsorted output:
# grep -i "sshd" /root/access.report | egrep -o
'(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
| wc -l
41
Sorted once:
# grep -i "sshd" /root/access.report |  egrep -o
'(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
| sort -urbn | wc -l
15
Sorted workaround:
# grep -i "sshd" /root/access.report | egrep -o
'(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
| sort -urb | sort -nr | wc -l
17

Even if the dot is being interpreted, it still should not lose the
line containing 172.169.6.164

Regards

Simon




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 12 Sep 2024 11:24:07 GMT)
