GNU bug report logs - #22109
Sort gives incorrect order when changing delimiters


Package: coreutils;

Reported by: Ed Brambley <edbrambley <at> gmail.com>

Date: Mon, 7 Dec 2015 16:17:03 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.


From: Assaf Gordon <assafgordon <at> gmail.com>
To: Ed Brambley <edbrambley <at> gmail.com>, 22109 <at> debbugs.gnu.org
Subject: bug#22109: Sort gives incorrect order when changing delimiters
Date: Mon, 7 Dec 2015 11:49:39 -0500
tag 22109 notabug
close 22109
stop

Hello Ed,

On 12/07/2015 10:36 AM, Ed Brambley wrote:
> The following problem came to light following a StackOverflow question [1]. The lexical ordering of sort appears to depend on the delimiter used, and I believe it shouldn't. As a minimal example:
>
> ### Correct ordering ###
> $ printf "1,a,1\n2,aa,2" | LC_ALL=C sort -k2 -t,
> 1,a,1
> 2,aa,2
>
> ### Incorrect ordering by replacing the "," delimiter by "~" ###
> $ printf "1~a~1\n2~aa~2" | LC_ALL=C sort -k2 -t~
> 2~aa~2
> 1~a~1
>

This is not a bug in 'sort'; the key option is simply being used incorrectly.

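The parameter "-k2" means: use a key that starts at the second field *and runs through all characters until the end of the line*.
In this case the delimiter that follows the second field (',' or '~') is part of the key and does come into play: in the C locale ',' sorts before 'a', while '~' sorts after it.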
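To make the byte comparison concrete, here is a small sketch (not part of the original report) that prints the byte values involved and then sorts the two "-k2" keys on their own. It assumes a POSIX shell whose printf accepts the leading-quote form for character codes.

```shell
# Byte values of the characters that decide the comparison (C locale).
for c in ',' 'a' '~'; do
    printf '%s = %d\n' "$c" "'$c"
done

# The keys that -k2 actually compares, sorted on their own.
printf 'a,1\naa,2\n' | LC_ALL=C sort    # "a,1" sorts first: ',' < 'a'
printf 'a~1\naa~2\n' | LC_ALL=C sort    # "aa~2" sorts first: 'a' < '~'
```

The keys first differ at their second byte, so the delimiter is compared against the second 'a' of "aa", and that alone decides the order.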

The correct usage is to specify the key as "-k2,2", meaning: sort by the second field alone (equal keys are then resolved by comparing the entire line, unless --stable is used).

    $ printf "1~a~1\n2~aa~2" | LC_ALL=C sort -k2,2 -t~
    1~a~1
    2~aa~2
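The whole-line fallback for equal keys can be seen with a made-up input in which both lines share the same second field. This is only a sketch; the sample lines are illustrative and not taken from the report.

```shell
# Both lines have the same second field ("a"), so -k2,2 sees equal keys.

# Default: ties are broken by comparing the entire line, so 1~a~y comes first.
printf '2~a~x\n1~a~y\n' | LC_ALL=C sort -t '~' -k2,2

# With --stable the fallback is disabled and the input order is kept.
printf '2~a~x\n1~a~y\n' | LC_ALL=C sort --stable -t '~' -k2,2
```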


Using sort's "--debug" option will illustrate the difference (note the underscores marking the part of each line that is used as the key; the second row of underscores, spanning the whole line, is the last-resort comparison):

Incorrect usage (-k2):

    $ printf "1~a~1\n2~aa~2" | LC_ALL=C sort --debug -k2 -t~
    sort: using simple byte comparison
    2~aa~2
      ____
    ______
    1~a~1
      ___
    _____


Better usage (-k2,2):

    $ printf "1~a~1\n2~aa~2" | LC_ALL=C sort --debug -k2,2 -t~
    sort: using simple byte comparison
    1~a~1
      _
    _____
    2~aa~2
      __
    ______
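As a quick check that the order no longer depends on the delimiter once the key is limited to the second field, the following sketch runs the same data through both delimiters. The loop and the reversed input order are illustrative choices, not part of the original report.

```shell
# Feed the lines in reverse order so that sort has to reorder them.
for d in ',' '~'; do
    printf '2%saa%s2\n1%sa%s1\n' "$d" "$d" "$d" "$d" |
        LC_ALL=C sort -t "$d" -k2,2
done
```

In both passes the line whose second field is "a" should come before the one whose second field is "aa".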




regards,
 - assaf





This bug report was last modified 9 years and 166 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.