GNU bug report logs - #76290
"sort -u" vs "sort -h -u": possible bug

Package: coreutils

Reported by: Rupert Gallagher <ruga <at> protonmail.com>

Date: Fri, 14 Feb 2025 17:01:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


Message #20 received at 76290 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Rupert Gallagher <ruga <at> protonmail.com>
Cc: "76290 <at> debbugs.gnu.org" <76290 <at> debbugs.gnu.org>
Subject: Re: "sort -u" vs "sort -h -u": possible bug
Date: Mon, 17 Feb 2025 15:31:10 -0800

On 2025-02-17 15:13, Rupert Gallagher wrote:
> ~ $ echo -e "a1\na2" | sort
> a1
> a2
> 
> ~ $ echo -e "a1\na2" | sort -h
> a1
> a2
> 
> Since A = B, the result of -u must be the same on both sets, by logic.

By that logic, since the output of these two commands:

  echo -e 'a1\na2' | sort
  echo -e 'a1\na2' | sort -n

are the same, the result of -u must be the same on both sets. But this 
logic is wrong, in the sense that it disagrees both with longstanding 
practice and with the POSIX.1-2024 standard 
<https://pubs.opengroup.org/onlinepubs/9799919799/utilities/sort.html>, 
which say that plain 'sort' uses the entire line as a key whereas 'sort 
-n' uses a leading integer prefix (which in this example is empty so the 
keys compare equal).
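
A minimal sketch of that difference, assuming a POSIX shell and a 'sort' 
that follows the key rules described above. The variable names 'plain' 
and 'numeric' are illustrative only and are not part of the original 
report:

```shell
# Illustrative only: 'plain' and 'numeric' are hypothetical variable names.
# LC_ALL=C pins the collation order so the result does not depend on locale.

# Plain 'sort -u': the whole line is the key, so a1 and a2 are distinct.
plain=$(printf 'a1\na2\n' | LC_ALL=C sort -u)

# 'sort -n -u': the key is the leading numeric prefix, which is empty for
# both lines, so they compare equal and -u keeps only one of them.
numeric=$(printf 'a1\na2\n' | LC_ALL=C sort -n -u)

# Report how many lines each variant kept.
printf 'sort -u keeps %s line(s)\n' "$(printf '%s\n' "$plain" | grep -c '')"
printf 'sort -n -u keeps %s line(s)\n' "$(printf '%s\n' "$numeric" | grep -c '')"
```

Without -u the two commands print the same lines, because equal keys do 
not remove anything; the key definition only becomes visible once -u 
discards lines whose keys compare equal.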

I get it that 'sort' doesn't behave the way you expected. But that's a 
mismatch of expectations vs implementation, not a bug in the implementation.





