GNU bug report logs - #41563
Possible bug with 'sort -Vr' version sorting
To reply to this bug, email your comments to 41563 AT debbugs.gnu.org.
Report forwarded to bug-coreutils <at> gnu.org: bug#41563; Package coreutils. (Wed, 27 May 2020 15:04:02 GMT)
Acknowledgement sent to Danie de Jager <danie.dejager <at> striata.com>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Wed, 27 May 2020 15:04:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi,
I use sort -Vr to sort version numbers. I noticed this discrepancy on
the latest kernel version from CentOS 7.8.
command to get output:
# ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue | sort -Vr
3.10.0-1127.el7.x86_64
3.10.0-1127.8.2.el7.x86_64
3.10.0-1062.18.1.el7.x86_64
I'd expect the middle value to be the highest version number. Is this
by design or a bug? If it is a bug please let me know if I must log it
somewhere.
Version details:
# sort --version
sort (GNU coreutils) 8.22
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Mike Haertel and Paul Eggert.
Regards,
Danie de Jager
Information forwarded to bug-coreutils <at> gnu.org: bug#41563; Package coreutils. (Wed, 27 May 2020 15:24:02 GMT)
Message #8 received at 41563 <at> debbugs.gnu.org (full text, mbox):
Hi,
On Wed, May 27, 2020 at 02:07:32PM +0200, Danie de Jager via GNU coreutils Bug Reports wrote:
> I use sort -Vr to sort version numbers. I noticed this discrepancy on
> the latest kernel version from Centos 7.8.
>
> command to get output:
> # ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue
> | sort -Vr
> 3.10.0-1127.el7.x86_64
> 3.10.0-1127.8.2.el7.x86_64
> 3.10.0-1062.18.1.el7.x86_64
>
> I'd expect the middle value to be the highest version number. Is this
> by design or a bug? If it is a bug please let me know if I must log it
> somewhere.
I'd say this is by design:
Sorting compares runs of non-digits, then runs of digits. Thus each
"dot" (.) terminates a run of digits. The "problem" is an unbalanced
number of digit and non-digit runs in the version numbers.
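To make the runs concrete, here is a minimal shell sketch that splits two of the reported strings into alternating digit and non-digit runs. The helper name `runs` is hypothetical and not part of coreutils; it only illustrates the splitting, not the full comparison that sort -V performs.

```shell
# Hypothetical helper (not part of coreutils): print the alternating
# digit / non-digit runs of a string, separated by spaces.
runs() {
  printf '%s\n' "$1" | grep -oE '[0-9]+|[^0-9]+' | paste -sd ' ' -
}

runs 3.10.0-1127.el7.x86_64
runs 3.10.0-1127.8.2.el7.x86_64
```

Both strings share the runs up to `1127`. The next non-digit run is `.el` in the first string and `.` in the second, so that is the first point at which the two can differ.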
See the following two manual sections:
http://www.gnu.org/software/coreutils/manual/coreutils.html#Version_002dsort-ordering-rules
http://www.gnu.org/software/coreutils/manual/coreutils.html#Punctuation-Characters
The "version sort" is based on Debian's version sort, but differs from it.
It seems as if Red Hat version numbers follow different rules.
HTH,
Erik
--
Be water, my friend.
-- Bruce Lee
Information forwarded to bug-coreutils <at> gnu.org: bug#41563; Package coreutils. (Thu, 28 May 2020 06:49:01 GMT)
Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):
On Wednesday, May 27, 2020 2:07:32 PM CEST Danie de Jager via GNU coreutils
Bug Reports wrote:
> Hi,
>
> I use sort -Vr to sort version numbers. I noticed this discrepancy on
> the latest kernel version from Centos 7.8.
>
> command to get output:
> # ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue | sort -Vr
>
> 3.10.0-1127.el7.x86_64
> 3.10.0-1127.8.2.el7.x86_64
> 3.10.0-1062.18.1.el7.x86_64
It is the underscore in the .x86_64 suffix that breaks the version compare
algorithm. If you replace the underscore with an alphabetic character, it
sorts as you expect:
# ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue | \
sed 's/x86_64/x86X64/' | sort -Vr | sed 's/x86X64/x86_64/'
3.10.0-1127.8.2.el7.x86_64
3.10.0-1127.el7.x86_64
3.10.0-1062.18.1.el7.x86_64
Kamil
Information forwarded to bug-coreutils <at> gnu.org: bug#41563; Package coreutils. (Thu, 28 May 2020 08:17:02 GMT)
Message #17 received at submit <at> debbugs.gnu.org (full text, mbox):
Thank you for your response! I'll use it accordingly.
On Thu, 28 May 2020 at 08:48, Kamil Dudka <kdudka <at> redhat.com> wrote:
>
> On Wednesday, May 27, 2020 2:07:32 PM CEST Danie de Jager via GNU coreutils
> Bug Reports wrote:
> > Hi,
> >
> > I use sort -Vr to sort version numbers. I noticed this discrepancy on
> > the latest kernel version from Centos 7.8.
> >
> > command to get output:
> > # ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue
> >
> > | sort -Vr
> >
> > 3.10.0-1127.el7.x86_64
> > 3.10.0-1127.8.2.el7.x86_64
> > 3.10.0-1062.18.1.el7.x86_64
>
> It is the underscore in the .x86_64 suffix what breaks the version compare
> algorithm. If you replace the underscore by an alphabetic character, it
> sorts as you expect:
>
> # ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue | \
> sed 's/x86_64/x86X64/' | sort -Vr | sed 's/x86X64/x86_64/'
>
> 3.10.0-1127.8.2.el7.x86_64
> 3.10.0-1127.el7.x86_64
> 3.10.0-1062.18.1.el7.x86_64
>
> Kamil
Information forwarded to bug-coreutils <at> gnu.org: bug#41563; Package coreutils. (Thu, 28 May 2020 09:14:01 GMT)
Message #23 received at 41563 <at> debbugs.gnu.org (full text, mbox):
Hi,
On Thu, May 28, 2020 at 08:48:16AM +0200, Kamil Dudka wrote:
> On Wednesday, May 27, 2020 2:07:32 PM CEST Danie de Jager via GNU coreutils
> Bug Reports wrote:
> >
> > I use sort -Vr to sort version numbers. I noticed this discrepancy on
> > the latest kernel version from Centos 7.8.
> >
> > command to get output:
> > # ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue | sort -Vr
> >
> > 3.10.0-1127.el7.x86_64
> > 3.10.0-1127.8.2.el7.x86_64
> > 3.10.0-1062.18.1.el7.x86_64
>
> It is the underscore in the .x86_64 suffix what breaks the version compare
> algorithm. If you replace the underscore by an alphabetic character, it
> sorts as you expect:
>
> # ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue | \
> sed 's/x86_64/x86X64/' | sort -Vr | sed 's/x86X64/x86_64/'
>
> 3.10.0-1127.8.2.el7.x86_64
> 3.10.0-1127.el7.x86_64
> 3.10.0-1062.18.1.el7.x86_64
That is interesting. The underscore can be replaced by a digit or even
removed as well. Replacing it with a dot (.) does not help.
This differs from Debian's "dpkg --compare-versions", where the results
of the comparison do not change by replacing the underscore with a
digit or character, or by removing it (the underscore is identified as
problematic, though):
$ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86_64 lt 3.10.0-1127.el7.x86_64 && echo less
dpkg: warning: version '3.10.0-1127.8.2.el7.x86_64' has bad syntax: invalid character in revision number
dpkg: warning: version '3.10.0-1127.el7.x86_64' has bad syntax: invalid character in revision number
less
$ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86.64 lt 3.10.0-1127.el7.x86.64 && echo less
less
$ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86X64 lt 3.10.0-1127.el7.x86X64 && echo less
less
$ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86164 lt 3.10.0-1127.el7.x86164 && echo less
less
$ dpkg --compare-versions 3.10.0-1127.8.2.el7.x8664 lt 3.10.0-1127.el7.x8664 && echo less
less
The way I read the GNU Coreutils documentation, removing the underscore
should not affect the version sort comparison result.
Thanks,
Erik
--
There is no remedy for anything in life.
-- Ernest Hemingway
Information forwarded to bug-coreutils <at> gnu.org: bug#41563; Package coreutils. (Thu, 28 May 2020 11:02:01 GMT)
Message #26 received at 41563 <at> debbugs.gnu.org (full text, mbox):
On Thursday, May 28, 2020 11:02:43 AM CEST Erik Auerswald wrote:
> On Thu, May 28, 2020 at 08:48:16AM +0200, Kamil Dudka wrote:
> > It is the underscore in the .x86_64 suffix what breaks the version compare
> > algorithm. If you replace the underscore by an alphabetic character, it
> > sorts as you expect:
> >
> > # ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue | \
> >
> > sed 's/x86_64/x86X64/' | sort -Vr | sed 's/x86X64/x86_64/'
> >
> > 3.10.0-1127.8.2.el7.x86_64
> > 3.10.0-1127.el7.x86_64
> > 3.10.0-1062.18.1.el7.x86_64
>
> That is interesting. The underscore can be replaced by a digit or even
> removed as well. Replacing it with a dot (.) does not help.
If there is no underscore, the .el7.x86X64 suffix is recognized as a file
extension. See the corresponding documentation:
https://www.gnu.org/software/coreutils/manual/html_node/Special-handling-of-file-extensions.html
> This differs from Debian's "dpkg --compare-versions", where the results
> of the comparison do not change by replacing the underscore with a
> digit or character, or by removing it (the underscore is identified as
> problematic, though):
The problem is that `dpkg --compare-versions` expects version numbers only.
It does not work well if you feed it with file names including extensions:
$ dpkg --compare-versions 3.10.0-1127.8.2 '>>' 3.10.0-1127 && echo '>>' || echo '<='
>>
$ dpkg --compare-versions 3.10.0-1127.8.2.bz2 '>>' 3.10.0-1127.bz2 && echo '>>' || echo '<='
<=
> $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86_64 lt 3.10.0-1127.el7.x86_64 && echo less
> dpkg: warning: version '3.10.0-1127.8.2.el7.x86_64' has bad syntax: invalid character in revision number
> dpkg: warning: version '3.10.0-1127.el7.x86_64' has bad syntax: invalid character in revision number
> less
> $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86.64 lt 3.10.0-1127.el7.x86.64 && echo less
> less
> $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86X64 lt 3.10.0-1127.el7.x86X64 && echo less
> less
> $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86164 lt 3.10.0-1127.el7.x86164 && echo less
> less
> $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x8664 lt 3.10.0-1127.el7.x8664 && echo less
> less
>
> The way I read the GNU Coreutils documentation, removing the underscore
> should not affect the version sort comparison result.
Not really. See the link above to the documentation that covers this part.
Kamil
> Thanks,
> Erik
Information forwarded to bug-coreutils <at> gnu.org: bug#41563; Package coreutils. (Thu, 28 May 2020 12:05:02 GMT)
Message #29 received at 41563 <at> debbugs.gnu.org (full text, mbox):
Hi,
On Thu, May 28, 2020 at 01:01:05PM +0200, Kamil Dudka wrote:
> On Thursday, May 28, 2020 11:02:43 AM CEST Erik Auerswald wrote:
> > On Thu, May 28, 2020 at 08:48:16AM +0200, Kamil Dudka wrote:
> > > It is the underscore in the .x86_64 suffix what breaks the version compare
> > > algorithm. If you replace the underscore by an alphabetic character, it
> > > sorts as you expect:
> > >
> > > # ls -t /boot/vmlinuz-* | sed "s/\/boot\/vmlinuz-//g" | grep -v rescue | \
> > >
> > > sed 's/x86_64/x86X64/' | sort -Vr | sed 's/x86X64/x86_64/'
> > >
> > > 3.10.0-1127.8.2.el7.x86_64
> > > 3.10.0-1127.el7.x86_64
> > > 3.10.0-1062.18.1.el7.x86_64
> >
> > That is interesting. The underscore can be replaced by a digit or even
> > removed as well. Replacing it with a dot (.) does not help.
>
> If there is no underscore, the .el7.x86X64 suffix is recognized as file
> extension. See the corresponding documentation:
>
> https://www.gnu.org/software/coreutils/manual/html_node/Special-handling-of-file-extensions.html
Ah, el7.x86X64 or el7.x86164 is seen as an extension (i.e., a sequence
of suffixes), but el7.x86.64 or el7.x86_64 is not. Since .8.2 does not
contain a letter, it is not seen as part of the extension. Very subtle,
but documented.
Trivia: the usual 7-Zip extension .7z is not a suffix (i.e., a file
extension) for this algorithm, according to the documented definition.
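The documented definition can be tried out with a small shell sketch. It assumes the suffix pattern `(\.[A-Za-z~][A-Za-z0-9~]*)*$` from the manual section linked above, and uses a hypothetical helper, `strip_suffix`, that is not part of coreutils. The file name `archive.7z` is made up for illustration. The sketch only shows which trailing part would count as a file extension, not how sort -V is implemented.

```shell
# Hypothetical helper (not part of coreutils): remove the trailing part
# that matches the assumed suffix pattern, leaving the string that would
# be compared first.
strip_suffix() {
  printf '%s\n' "$1" | LC_ALL=C sed -E 's/(\.[A-Za-z~][A-Za-z0-9~]*)+$//'
}

strip_suffix 3.10.0-1127.8.2.el7.x86X64   # suffix .el7.x86X64 is removed
strip_suffix 3.10.0-1127.el7.x86_64       # unchanged: "_" cannot appear in a suffix
strip_suffix 3.10.0-1127.el7.x86.64       # unchanged: ".64" starts with a digit
strip_suffix archive.7z                   # unchanged: ".7z" starts with a digit
```

With the extension removed, the first string reduces to `3.10.0-1127.8.2`, and its counterpart `3.10.0-1127.el7.x86X64` reduces to `3.10.0-1127`.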
Thus, by changing the platform indicator to look like a file extension,
and relying on the distribution version information being interpreted
as a file extension as well, you create a file extension where initially
there was none. This file extension is then ignored for the comparison,
unless that comparison results in equality. This seems to be a useful
hack when working with Red Hat products.
Fascinating. :-)
> > This differs from Debian's "dpkg --compare-versions", where the results
> > of the comparison do not change by replacing the underscore with a
> > digit or character, or by removing it (the underscore is identified as
> > problematic, though):
>
> The problem is that `dpkg --compare-versions` expects version numbers only.
> It does not work well if you feed it with file names including extensions:
I did not, as you can see in the examples. I gave version information
to dpkg, though not Debian version information. So of course this is
illegal input and the GIGO principle applies.
> $ dpkg --compare-versions 3.10.0-1127.8.2 '>>' 3.10.0-1127 && echo '>>' || echo '<='
> >>
> $ dpkg --compare-versions 3.10.0-1127.8.2.bz2 '>>' 3.10.0-1127.bz2 && echo '>>' || echo '<='
> <=
>
> > $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86_64 lt 3.10.0-1127.el7.x86_64 && echo less
> > dpkg: warning: version '3.10.0-1127.8.2.el7.x86_64' has bad syntax: invalid character in revision number
> > dpkg: warning: version '3.10.0-1127.el7.x86_64' has bad syntax: invalid character in revision number
> > less
> > $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86.64 lt 3.10.0-1127.el7.x86.64 && echo less
> > less
> > $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86X64 lt 3.10.0-1127.el7.x86X64 && echo less
> > less
> > $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x86164 lt 3.10.0-1127.el7.x86164 && echo less
> > less
> > $ dpkg --compare-versions 3.10.0-1127.8.2.el7.x8664 lt 3.10.0-1127.el7.x8664 && echo less
> > less
> >
> > The way I read the GNU Coreutils documentation, removing the underscore
> > should not affect the version sort comparison result.
>
> Not really. See the link above to the documentation that covers this part.
Yes, you are correct. I find this quite surprising, and see it as another
example where --version-sort fails to deliver on the short form promise
of "natural sort." I am well aware that the long form description shows
that the sorting order is not "natural," but rather strange IMHO.
$ sort --help | grep -- --version-sort
-V, --version-sort natural sort of (version) numbers within text
But then I do not even understand what is "natural" about version numbers
anyway. ;-)
Thanks,
Erik
--
[M]ost parts of this industry just work by chance.
-- Thomas Gleixner
This bug report was last modified 5 years and 25 days ago.