GNU bug report logs - #23951
sort -k1.4,1.6n -u HAVE BUGS!!!


Package: coreutils;

Reported by: David Pan <nanospeed <at> 139.com>

Date: Tue, 12 Jul 2016 03:15:01 UTC

Severity: normal

Tags: notabug

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 23951 in the body.
You can then email your comments to 23951 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-coreutils <at> gnu.org:
bug#23951; Package coreutils. (Tue, 12 Jul 2016 03:15:01 GMT)

Acknowledgement sent to David Pan <nanospeed <at> 139.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Tue, 12 Jul 2016 03:15:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: David Pan <nanospeed <at> 139.com>
To: bug-coreutils <bug-coreutils <at> gnu.org>
Subject: sort -k1.4,1.6n -u HAVE BUGS!!!
Date: Tue, 12 Jul 2016 09:38:44 +0800 (CST)
[Message part 1 (text/plain, inline)]

Hello,

I seem to have found a bug when using the command:

sort -k1.4n -u

Please refer to the attachment for details.

[sort-u.bug (application/octet-stream, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#23951; Package coreutils. (Tue, 12 Jul 2016 04:28:02 GMT)

Message #8 received at 23951 <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: David Pan <nanospeed <at> 139.com>
Cc: 23951 <at> debbugs.gnu.org
Subject: Re: bug#23951: sort -k1.4,1.6n -u HAVE BUGS!!!
Date: Tue, 12 Jul 2016 00:27:38 -0400
Hello,

> On Jul 11, 2016, at 21:38, David Pan <nanospeed <at> 139.com> wrote:
> 
> I seem to have found a bug when using the command:
> 
> sort -k1.4n -u
> Please refer to the attachment for details.

You have not indicated which output you believe is incorrect.
Could you describe the output you expected versus the output you got (i.e., which command you think has the bug)? Also, the subject line lists "-k1.4,1.6n", but I do not see that invocation in the attached file.

To help troubleshoot sort-related issues, recent versions of GNU sort support a "--debug" option, which prints additional information about the keys being compared.
It would be useful to attach the output of such a command, for example:

======
$ sort --debug -k1.4n -u aa 
sort: using ‘en_US.UTF-8’ sorting rules
sort: leading blanks are significant in key 1; consider also specifying 'b'
sort: key 1 is numeric and spans multiple fields
c3a1.ecld.com
   __
c3m2.ecld.com
   __
c3s15.ecld.com
   ___
c3s16.ecld.com
   ___
c3s17.ecld.com
   ___
c3s18.ecld.com
   ___
c3s19.ecld.com
   ___
c3s20.ecld.com
   ___
c3s21.ecld.com
   ___
c3s25.ecld.com
   ___

======


regards,
 -assaf


P.S.
If you do test a newer version of coreutils, please be sure to use version 8.19 or newer, as it contains a fix for a "sort -u" bug which was introduced in 8.6 ( http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=eb3f5b3b3de8c6ca005a701f09bff43d778aece7 ).





Information forwarded to bug-coreutils <at> gnu.org:
bug#23951; Package coreutils. (Thu, 14 Jul 2016 00:40:02 GMT)

Message #11 received at 23951 <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: David Pan <nanospeed <at> 139.com>
Cc: 23951 <at> debbugs.gnu.org
Subject: Re: bug#23951: sort -k1.4,1.6n -u HAVE BUGS!!!
Date: Wed, 13 Jul 2016 20:38:52 -0400
tag 23951 notabug
close 23951
stop

Hello,

Quoting an off-list email from David:
> When I use "sort -k1.4n -u", the output is missing [c3m1.ecld.com].
> With "sort -k1.4n aa | sort -u", [c3m1.ecld.com] appears again.

This is not a bug but correct behavior.

The parameter "-k1.4n" means that the compared key starts at the fourth character of the first field,
which here is the digit (e.g. "1", "2"), and is compared numerically.
Thus the compared key of both "c3a1" and "c3m1" is "1", and "-u" outputs only one of them.
Running "sort -u" separately treats the entire line as the key, so "c3a1" and "c3m1" are different.

The following will demonstrate:

  $ printf "aaa1\nbbb2\nccc1\n" | sort -k1.4n
  aaa1
  ccc1
  bbb2

  $ printf "aaa1\nbbb2\nccc1\n" | sort -k1.4n -u
  aaa1
  bbb2

  $ printf "aaa1\nbbb2\nccc1\n" | sort -k1.4n | sort -u
  aaa1
  bbb2
  ccc1

In the first two examples, the letters play no part in the key comparison. Because the key is "-k1.4",
the first three characters are ignored, and the compared values are the digits 1,2,1.
(In the first example, lines with equal keys are then ordered by a last-resort comparison of the whole line.)
In the second example, "-u" applies uniqueness to keys rather than to whole lines, so only one line with key "1"
is printed (aaa1).
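
One rough way to see the text each key starts from is to cut the first three characters off every line, which approximates what "-k1.4" selects for this input. This is only an illustrative sketch using 'cut', not how 'sort' works internally, and the variable name 'keys' is made up for the example.

```shell
# Show the portion of each line that "-k1.4" uses as its key:
# everything from character 4 onward (here, just the digit).
keys=$(printf 'aaa1\nbbb2\nccc1\n' | cut -c4-)
printf '%s\n' "$keys"
# prints:
# 1
# 2
# 1
```

Two of the three lines share the key "1", which is why "-u" keeps only one of them.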

In the third example, the second 'sort' overrides the ordering of the first: it re-sorts all lines alphabetically using the entire line as the key, then prints the unique lines.
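
If the goal is to keep the numeric key order of the first example while dropping only lines that are identical in full, two approaches that may work are sketched below. The sample input adds a repeated "aaa1" line for illustration, the variable names are hypothetical, and LC_ALL=C is assumed so that the result does not depend on the locale.

```shell
# Sample input: the three lines from the thread plus one repeated line.
input=$(printf 'aaa1\nbbb2\nccc1\naaa1\n')

# Option 1: sort on the numeric key only, then let uniq drop adjacent identical lines.
# Lines with equal keys fall back to a whole-line comparison, so duplicates end up adjacent.
by_uniq=$(printf '%s\n' "$input" | LC_ALL=C sort -k1.4n | uniq)

# Option 2: add a second key covering the whole line, so -u treats two lines
# as duplicates only when both keys compare equal.
by_two_keys=$(printf '%s\n' "$input" | LC_ALL=C sort -u -k1.4n -k1)

printf '%s\n' "$by_uniq"
# prints:
# aaa1
# ccc1
# bbb2
```

Both variants should give the same three lines here; the second avoids the extra process but needs the additional key to be spelled out.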

Recent versions of 'sort' support the '--debug' option, which helps troubleshoot such cases:
===
$ printf "aaa1\nbbb2\nccc1\n" | sort --debug -k1.4n 
sort: using ‘en_US.UTF-8’ sorting rules
sort: leading blanks are significant in key 1; consider also specifying 'b'
sort: key 1 is numeric and spans multiple fields
aaa1
   _
____
ccc1
   _
____
bbb2
   _
____

===

As such, I'm closing this bug, but discussion can continue by replying to this thread.

regards,
 - assaf






Added tag(s) notabug. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 24 Oct 2018 22:12:02 GMT)

bug closed, send any further explanations to 23951 <at> debbugs.gnu.org and David Pan <nanospeed <at> 139.com>. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 24 Oct 2018 22:12:02 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 22 Nov 2018 12:24:09 GMT)

