GNU bug report logs - #58153
hungry sort eats lines

Package: coreutils;

Reported by: DrSlony <bugs <at> londonlight.org>

Date: Wed, 28 Sep 2022 21:42:01 UTC

Severity: normal

Tags: notabug

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: Pádraig Brady <P <at> draigBrady.com>
To: DrSlony <bugs <at> londonlight.org>, 58153 <at> debbugs.gnu.org
Cc: Kamil Dudka <kdudka <at> redhat.com>
Subject: bug#58153: hungry sort eats lines
Date: Wed, 28 Sep 2022 23:36:42 +0100

On 28/09/2022 22:13, DrSlony wrote:
> Hey
> 
> printf '%s\n' "key;foo" "key0;bar0" | sort -Vu -t ';' --key=1,1
> 
> sort 8.32 outputs:
>       key;foo
>       key0;bar0
>
> sort 9.1 outputs:
>       key;foo
>
> "key0;bar0" is missing.

You're using version sort, and '0' is special to version sorting.
See https://www.debian.org/doc/debian-policy/ch-controlfields.html#version
and in particular this portion of the documented comparison algorithm:

"Then the initial part of the remainder of each string which consists entirely
of digit characters is determined. The numerical values of these two parts are
compared, and any difference found is returned as the result of the comparison.
For these purposes an empty string (which can only occur at the end of one or
both version strings being compared) counts as zero."

You can see this in the simplified example:

# Use --check to see whether any lines compare equal and would be dropped:
$ printf '%s\n' "key" "key0" | sort -u -C -V || echo equal
equal

# Here 1 is treated differently:
$ printf '%s\n' "key" "key1" | sort -u -c -V && echo different
different
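
The digit-part rule quoted above can be sketched in plain shell. This is only
an illustration of that one comparison step, not sort's actual implementation,
and the function names leading_digits_value and compare_digit_parts are
hypothetical. After the shared prefix "key", the remainders being compared are
"" versus "0" in the first case and "" versus "1" in the second.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric value of the leading run of
# digits in "$1". An empty run counts as zero, as in the quoted rule.
leading_digits_value() {
    # Drop everything from the first non-digit character onward.
    digits=${1%%[!0-9]*}
    # Strip leading zeros so the value is never read as octal.
    while [ "${digits#0}" != "$digits" ]; do
        digits=${digits#0}
    done
    printf '%s\n' "${digits:-0}"
}

# Hypothetical helper: compare only the digit parts of two remainders.
compare_digit_parts() {
    a=$(leading_digits_value "$1")
    b=$(leading_digits_value "$2")
    if [ "$a" -eq "$b" ]; then
        echo equal
    else
        echo different
    fi
}

compare_digit_parts "" "0"    # prints "equal"
compare_digit_parts "" "1"    # prints "different"
```

Under this rule a missing digit part and "0" have the same numeric value, so
"key" and "key0" tie, while "key" and "key1" do not.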

I agree this is surprising, but version sorting has lots of edge cases,
so it's probably best to stick to the documented algorithm here.
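
If both lines need to survive, one possible workaround is to let sort -V do
only the ordering and to drop duplicates on the exact text of the key in a
separate step. This is a sketch under that assumption, not something proposed
in this thread; the function name dedupe_by_exact_key is hypothetical, and -s
and -V are GNU sort options.

```shell
#!/bin/sh
# Hypothetical helper: version-sort on field 1 (';'-separated), then keep
# the first line seen for each distinct key string.
dedupe_by_exact_key() {
    # -s keeps the input order of lines whose keys compare equal.
    sort -s -V -t ';' -k 1,1 |
        # awk compares the key text exactly, so "key" and "key0" differ.
        awk -F ';' '!seen[$1]++'
}

printf '%s\n' "key;foo" "key0;bar0" "key;dup" | dedupe_by_exact_key
```

Here the "key0" line is kept because awk treats "key" and "key0" as different
strings, while the second "key" line is dropped as a true duplicate key.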

thanks,
Pádraig




This bug report was last modified 2 years and 319 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.