GNU bug report logs - #35636
bug report sort command

Package: coreutils

Reported by: Michele Liberi <mliberi <at> gmail.com>

Date: Wed, 8 May 2019 14:29:02 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Michele Liberi <mliberi <at> gmail.com>
Subject: bug#35636: closed (Re: bug#35636: bug report sort command)
Date: Wed, 08 May 2019 14:43:03 +0000
Your bug report

#35636: bug report sort command

which was filed against the coreutils package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 35636 <at> debbugs.gnu.org.

-- 
35636: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=35636
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Eric Blake <eblake <at> redhat.com>
To: Michele Liberi <mliberi <at> gmail.com>, 35636-done <at> debbugs.gnu.org
Subject: Re: bug#35636: bug report sort command
Date: Wed, 8 May 2019 09:41:58 -0500
tag 35636 notabug
thanks

On 5/8/19 3:35 AM, Michele Liberi wrote:
> I verified the following bug is there in:
> 
>    - sort (GNU coreutils) 8.21
>    - sort (GNU coreutils) 8.22
>    - sort (GNU coreutils) 8.23
> 
> *Input file:*
> # cat sort.in
> 1|a|x
> 2|b|x
> 3|aa|x
> 4|bb|x
> 5|c|x
> 
> 
> *shell command and output:*
> # sort -t'|' -k2 <sort.in
> 3|aa|x
> 1|a|x
> 4|bb|x
> 2|b|x
> 5|c|x

Let's use --debug to see what sort really did:

$ sort --debug -t'|' -k2 <sort.in
sort: using ‘en_US.UTF-8’ sorting rules
3|aa|x
  ____
______
1|a|x
  ___
_____
4|bb|x
  ____
______
2|b|x
  ___
_____
5|c|x
  ___
_____


Since you did not specify an ending field, the key runs from field 2 to
the end of the line, so you are comparing the string "aa|x" with "a|x",
and the string "a|x" with "bb|x". In the en_US.UTF-8 locale, punctuation
is ignored on the first-order pass through strcoll(), which means you
are effectively comparing "aax" with "ax", and "ax" with "bbx", so the
sort is correct. But even in a locale that does not ignore punctuation:

$ LC_ALL=C sort --debug -t'|' -k2 <sort.in
sort: using simple byte comparison
3|aa|x
  ____
______
1|a|x
  ___
_____
4|bb|x
  ____
______
2|b|x
  ___
_____
5|c|x
  ___
_____

the sort is still correct, since ASCII '|' (0x7C) sorts after ASCII 'a'
(0x61), which puts "aa|x" before "a|x". Your real problem is that you
are sorting on too much data; you need to try again with the key limited
to exactly the second field:

$ sort --debug -t'|' -k2,2 <sort.in
sort: using ‘en_US.UTF-8’ sorting rules
1|a|x
  _
_____
3|aa|x
  __
______
2|b|x
  _
_____
4|bb|x
  __
______
5|c|x
  _
_____

where sort can now see that "a" is a prefix of "aa", because the key no
longer bleeds into the rest of the line.
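
As a minimal sketch of the difference between the two key specifications,
the following script runs both forms against the same five input lines.
It assumes a POSIX shell and a sort that accepts -t and -k; the function
name input and the variable names are made up for this illustration, and
LC_ALL=C is forced so that the result does not depend on the installed
locales.

```shell
#!/bin/sh
# Emit the five-line input from the report (no temporary file needed).
# "input" is a hypothetical helper name used only in this sketch.
input() {
    printf '%s\n' '1|a|x' '2|b|x' '3|aa|x' '4|bb|x' '5|c|x'
}

# Open-ended key: runs from field 2 to the end of the line,
# so the compared strings are "a|x", "b|x", "aa|x", "bb|x", "c|x".
open_key=$(input | LC_ALL=C sort -t'|' -k2)

# Closed key: exactly field 2, so the compared strings are
# "a", "b", "aa", "bb", "c".
closed_key=$(input | LC_ALL=C sort -t'|' -k2,2)

# Byte values that decide the open-ended comparison in the C locale.
a_code=$(printf '%d' "'a")
bar_code=$(printf '%d' "'|")

printf 'sort -k2:\n%s\n\n' "$open_key"
printf 'sort -k2,2:\n%s\n\n' "$closed_key"
printf "'a' = %s, '|' = %s\n" "$a_code" "$bar_code"
```

With the open-ended key, "aa|x" and "a|x" first differ at the second
byte, where 'a' compares lower than '|', so line 3 is placed before
line 1. With the closed key, "a" is a proper prefix of "aa" and line 1
comes first.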


> 
> *I expected key "a" to come before key "aa" and key "b" to come before
> key "bb".*

Your expectations are at odds with your incomplete command line.  sort
is behaving as required; therefore, I'm closing this as not a bug. But
feel free to reply if you have further questions.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

[Message part 5 (message/rfc822, inline)]
From: Michele Liberi <mliberi <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: bug report sort command
Date: Wed, 8 May 2019 10:35:01 +0200
I verified the following bug is there in:

   - sort (GNU coreutils) 8.21
   - sort (GNU coreutils) 8.22
   - sort (GNU coreutils) 8.23

*Input file:*
# cat sort.in
1|a|x
2|b|x
3|aa|x
4|bb|x
5|c|x


*shell command and output:*
# sort -t'|' -k2 <sort.in
3|aa|x
1|a|x
4|bb|x
2|b|x
5|c|x

*I expected key "a" to come before key "aa" and key "b" to come before
key "bb".*

This bug report was last modified 6 years and 52 days ago.

