GNU bug report logs - #11220
uniq -d and -Du bug?


Package: coreutils

Reported by: phil colbourn <philcolbourn <at> gmail.com>

Date: Wed, 11 Apr 2012 06:25:01 UTC

Severity: normal

Tags: notabug

Done: Eric Blake <eblake <at> redhat.com>

Bug is archived. No further changes may be made.



Message #19 received at 11220-done <at> debbugs.gnu.org (full text, mbox):

From: phil colbourn <philcolbourn <at> gmail.com>
To: 11220-done <at> debbugs.gnu.org
Subject: Re: bug#11220: uniq -d and -Du bug?
Date: Thu, 12 Apr 2012 22:47:39 +1000
Thanks Jim and Eric for your replies.

Jim, perhaps info's version of option -d should be used in uniq's man page?

> In isolation, uniq prints the last instance of the duplicated line, and

I think it prints the first line, not the last:

printf "a1\na2\na3\na4" | uniq -w 1
a1
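
The same thing can be checked without relying on -w, by using the POSIX -f option to skip a leading field. This is only a small sketch, and the input is made up for illustration (it is not from the original report):

```shell
# Made-up input: three adjacent lines that differ only in their first field.
# -f 1 tells uniq to ignore the first field when comparing lines,
# so all three lines count as duplicates of each other.
printf 'a1 x\na2 x\na3 x\n' | uniq -f 1
# prints: a1 x   (the first line of the run, not the last)
```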


> uniq -u suppresses the output of the 4th line.


I have lost you here. -u suppresses any lines with duplicates.

printf "a1\na2\na3\na4" | uniq -u -w 1
(no output)

I suspect you mean -Du?

printf "a1\na2\na3\na4" | uniq -Du -w 1
a1
a2
a3


>  In isolation, -D says to
> output the first three lines which are normally omitted because they
> have duplicates, in addition to the 4th line that is printed by default.
>

But,

printf "a1\na2\na3\na4" | uniq -D -w 1
a1
a2
a3
a4

and default is this:

printf "a1\na2\na3\na4" | uniq -w 1
a1

So, if I understand your logic correctly, and if I correct it to refer to
the 1st rather than the 4th duplicate, then -Du should give me

a2
a3
a4


>  So in combination, -Du says to print the lines with subsequent
> duplicates (the first three lines) but to suppress the output line that
> corresponds to the last input line that ends a sequence of duplicates
> (the 4th line).
>
>



> Perhaps we can document this behavior better.  Or perhaps we can change
> the behavior of -D (but at risk of breaking existing clients that depend
> on the current behavior).  But we can't change -u or -d behavior.
>
>
I think changing behaviour of a utility is dangerous - I thought it was a
bug, but both you and Jim have indicated that it is poor documentation.



> Put another way, per POSIX, the default behavior is subtractive (remove
> any line with a subsequent duplicate), -d is subtractive (remove any
> line with no duplicate), and -u is subtractive (remove any last line
> that had a prior duplicate), and GNU -D is additive (print any line with
> a subsequent duplicate, to counter the initial default).
>
>
Whilst not exactly following your previous notes, I don't think this
explains uniq's behaviour.
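
For comparison, here is a minimal sketch of the three POSIX modes run against exact duplicate lines, so that -w does not enter into it. The `input` helper and its data are my own assumption for illustration, not something from Eric's reply:

```shell
# Hypothetical helper: one run of four identical lines between two unique lines.
input() { printf '1\na\na\na\na\n2\n'; }

input | uniq      # default: one line per run          -> 1, a, 2
input | uniq -d   # only runs that are repeated        -> a
input | uniq -u   # only lines that are not repeated   -> 1, 2
```

In each of these modes a run of duplicates yields at most one output line; only GNU -D prints several lines from the same run.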

> At this point, I will claim that the behavior is intended, and therefore
> close out the bug.  But if you are willing to submit documentation
> patches, or even code patches accompanied by extensive test cases to
> demonstrate the corner cases of any new behavior, feel free to continue
> to reply to this bug report.
>
>
Having now read info's uniq pages, I think that

1) -Du is undefined behaviour
2) current output makes no sense.

e.g.

printf "1\na1\na2\na3\na4\n2" | uniq -w 1
1
a1
2
(comply)

printf "1\na1\na2\na3\na4\n2" | uniq -u -w 1
1
2
(comply)

printf "1\na1\na2\na3\na4\n2" | uniq -D -w 1
a1
a2
a3
a4
(comply)

printf "1\na1\na2\na3\na4\n2" | uniq -Du -w 1
a1
a2
a3

I think this one makes no sense.

But... this behaviour IS exactly what I need so if this can be documented
then I would be happy - others might not be.
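
Until that is documented, the same output can probably be had without depending on -Du at all. The following is only a sketch in awk; the function name `all_but_last_of_group` is hypothetical and not part of coreutils, and it assumes, like -w 1, that lines are compared on their first character only:

```shell
# Hypothetical helper (not part of coreutils): for each run of adjacent lines
# that share the same first character, print every line except the run's last.
# Lines that have no adjacent duplicate are not printed at all.
all_but_last_of_group() {
  awk '
    {
      key = substr($0, 1, 1)         # compare on the first character, like -w 1
      if (NR > 1 && key == prevkey)  # the previous line has a later duplicate
        print prev
      prev = $0
      prevkey = key
    }
  '
}

printf '1\na1\na2\na3\na4\n2\n' | all_but_last_of_group
# prints a1, a2 and a3, one per line
```

Because each line is only printed once its successor is known to match, the last line of every run is dropped, which is the output shown above for -Du.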


On a side note, why can't the official info pages be automatically
converted into man pages to avoid discrepancies?

This bug report was last modified 13 years and 136 days ago.


