GNU bug report logs - #23951
sort -k1.4,1.6n -u HAVE BUGS!!!
Reported by: David Pan <nanospeed <at> 139.com>
Date: Tue, 12 Jul 2016 03:15:01 UTC
Severity: normal
Tags: notabug
Done: Assaf Gordon <assafgordon <at> gmail.com>
Bug is archived. No further changes may be made.
Report forwarded to bug-coreutils <at> gnu.org: bug#23951; Package coreutils. (Tue, 12 Jul 2016 03:15:01 GMT)
Acknowledgement sent to David Pan <nanospeed <at> 139.com>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Tue, 12 Jul 2016 03:15:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Hello,
It seems that I have found a bug when using the command:
sort -k1.4n -u
Please refer to the attachment for details.
[sort-u.bug (application/octet-stream, attachment)]
Information forwarded to bug-coreutils <at> gnu.org: bug#23951; Package coreutils. (Tue, 12 Jul 2016 04:28:02 GMT)
Message #8 received at 23951 <at> debbugs.gnu.org:
Hello,
> On Jul 11, 2016, at 21:38, David Pan <nanospeed <at> 139.com> wrote:
>
> It seems that I have found a bug when using the command:
>
> sort -k1.4n -u
> Please refer to the attachment for details.
You have not indicated exactly what the suspected incorrect output is.
Could you detail the output you expected versus the output you got (i.e., which command you think has the bug)? Also, the subject line lists "-k1.4,1.6n", but I do not see such an invocation in the attached file.
To help troubleshoot sort-related issues, recent versions of GNU sort support the "--debug" option, which prints additional information about the keys being compared.
It would be useful to attach the output of such a command, for example:
======
$ sort --debug -k1.4n -u aa
sort: using ‘en_US.UTF-8’ sorting rules
sort: leading blanks are significant in key 1; consider also specifying 'b'
sort: key 1 is numeric and spans multiple fields
c3a1.ecld.com
   __
c3m2.ecld.com
   __
c3s15.ecld.com
   ___
c3s16.ecld.com
   ___
c3s17.ecld.com
   ___
c3s18.ecld.com
   ___
c3s19.ecld.com
   ___
c3s20.ecld.com
   ___
c3s21.ecld.com
   ___
c3s25.ecld.com
   ___
======
regards,
-assaf
P.S.
If you do test a newer version of coreutils, please be sure to use version 8.19 or newer, as it contains a fix for a "sort -u" bug which was introduced in 8.6 ( http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=eb3f5b3b3de8c6ca005a701f09bff43d778aece7 ).
Information forwarded to bug-coreutils <at> gnu.org: bug#23951; Package coreutils. (Thu, 14 Jul 2016 00:40:02 GMT)
Message #11 received at 23951 <at> debbugs.gnu.org:
tag 23951 notabug
close 23951
stop
Hello,
Quoting an off-list email from David:
> When I use "sort -k1.4n -u", system output miss [c3m1.ecld.com].
> then "sort -k1.4n aa |sort -u" the [c3m1.ecld.com] appear again.
This is not a bug but correct behavior.
The parameter "-k1.4n" means that the compared key starts at the fourth character of the first field, which here is the digit (e.g. "1", "2"), and is compared numerically.
Thus the compared key of both "c3a1" and "c3m1" is "1", and "-u" outputs only one of them.
Running "sort -u" separately treats the entire line as the key, thus "c3a1" and "c3m1" are different.
The following demonstrates this:
$ printf "aaa1\nbbb2\nccc1\n" | sort -k1.4n
aaa1
ccc1
bbb2
$ printf "aaa1\nbbb2\nccc1\n" | sort -k1.4n -u
aaa1
bbb2
$ printf "aaa1\nbbb2\nccc1\n" | sort -k1.4n | sort -u
aaa1
bbb2
ccc1
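In the first two examples, the letters do not affect the key comparison. Because the key is "-k1.4",
the first three characters are ignored, and the compared values are the digits 1, 2, 1.
(Without "-u", lines with equal keys are then ordered by a last-resort comparison of the whole line, which is why "aaa1" precedes "ccc1" in the first example.)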
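As a rough illustration of which part of each line "-k1.4" looks at, the sketch below uses 'cut -c4-' to print everything from the fourth character onward. This only approximates sort's key extraction, and it assumes single-field lines with no leading blanks, as in the examples above; the variable name 'keys' is hypothetical.

```shell
# Sketch: approximate the text that a -k1.4 key starts from
# by dropping the first three characters of each line.
# Assumption: one field per line, no leading blanks.
keys=$(printf 'aaa1\nbbb2\nccc1\n' | cut -c4-)
printf '%s\n' "$keys"
```

For this input the remaining text is 1, 2 and 1, so "aaa1" and "ccc1" share the same key.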
In the second example, asking for unique values refers to unique keys, meaning that only one line with key "1"
is printed (aaa1).
In the third example, the additional 'sort' overrides the ordering of the first 'sort': it re-sorts all lines alphabetically, using the entire line as the key, and then prints the unique lines.
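If the intent was to order lines by the numeric key while dropping only fully identical lines, one possible approach (not something proposed in this thread) is to add a second key covering the whole first field, so that "-u" treats two lines as duplicates only when both keys compare equal. A minimal sketch, assuming single-field lines as above and using LC_ALL=C to keep the second comparison byte-based; the variable name 'out' is hypothetical.

```shell
# Sketch: numeric order on the key starting at character 4, with the
# whole first field as a second key so -u only drops identical lines.
# Assumption: one field per line; LC_ALL=C forces byte-wise comparison.
out=$(printf 'aaa1\nbbb2\nccc1\naaa1\n' | LC_ALL=C sort -u -k1.4n -k1,1)
printf '%s\n' "$out"
```

Here the repeated "aaa1" is dropped, while "ccc1" is kept because it differs from "aaa1" on the second key.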
Recent versions of 'sort' support the '--debug' option, which helps troubleshoot such cases:
===
$ printf "aaa1\nbbb2\nccc1\n" | sort --debug -k1.4n
sort: using ‘en_US.UTF-8’ sorting rules
sort: leading blanks are significant in key 1; consider also specifying 'b'
sort: key 1 is numeric and spans multiple fields
aaa1
   _
____
ccc1
   _
____
bbb2
   _
____
===
As such, I'm closing this bug, but discussion can continue by replying to this thread.
regards,
- assaf
Added tag(s) notabug. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 24 Oct 2018 22:12:02 GMT)
Bug closed, send any further explanations to 23951 <at> debbugs.gnu.org and David Pan <nanospeed <at> 139.com>. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 24 Oct 2018 22:12:02 GMT)
Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 22 Nov 2018 12:24:09 GMT)