Thanks Jim and Eric for your replies.

Jim, perhaps the info page's description of option -d should be used in uniq's man page?

In isolation, uniq prints the last instance of the duplicated line, and

I think it prints the first line, not the last:

printf "a1\na2\na3\na4" | uniq -w 1
a1
 
uniq -u suppresses the output of the 4th line.

You have lost me here. -u suppresses every line that has duplicates.

printf "a1\na2\na3\na4" | uniq -u -w 1
(no output)

I suspect you mean -Du?

printf "a1\na2\na3\na4" | uniq -Du -w 1
a1
a2
a3
 
 In isolation, -D says to
output the first three lines which are normally omitted because they
have duplicates, in addition to the 4th line that is printed by default.

But -D on its own gives:

printf "a1\na2\na3\na4" | uniq -D -w 1
a1
a2
a3
a4

and the default is this:

printf "a1\na2\na3\na4" | uniq -w 1
a1

So, if I understand your logic correctly, and correct it to refer to the 1st rather than the 4th duplicate, then -Du should give me

a2
a3
a4
 
 So in combination, -Du says to print the lines with subsequent
duplicates (the first three lines) but to suppress the output line that
corresponds to the last input line that ends a sequence of duplicates
(the 4th line).



 
Perhaps we can document this behavior better.  Or perhaps we can change
the behavior of -D (but at risk of breaking existing clients that depend
on the current behavior).  But we can't change -u or -d behavior.


I think changing the behaviour of a utility is dangerous. I thought this was a bug, but both you and Jim have indicated that it is a case of poor documentation.

 
Put another way, per POSIX, the default behavior is subtractive (remove
any line with a subsequent duplicate), -d is subtractive (remove any
line with no duplicate), and -u is subtractive (remove any last line
that had a prior duplicate), and GNU -D is additive (print any line with
a subsequent duplicate, to counter the initial default).


Although I do not entirely follow your previous notes, I don't think this explains uniq's behaviour.

At this point, I will claim that the behavior is intended, and therefore
close out the bug.  But if you are willing to submit documentation
patches, or even code patches accompanied by extensive test cases to
demonstrate the corner cases of any new behavior, feel free to continue
to reply to this bug report.

 
Having now read info's uniq pages, I think that

1) -Du is undefined behaviour
2) the current output makes no sense.

For example:

printf "1\na1\na2\na3\na4\n2" | uniq -w 1
1
a1
2
(complies with the info page)

printf "1\na1\na2\na3\na4\n2" | uniq -u -w 1
1
2
(complies with the info page)

printf "1\na1\na2\na3\na4\n2" | uniq -D -w 1
a1
a2
a3
a4
(complies with the info page)

printf "1\na1\na2\na3\na4\n2" | uniq -Du -w 1
a1
a2
a3

I think this one makes no sense.
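
For anyone who wants to reproduce the comparison in one go, here is a minimal sketch. The helper names sample and run_uniq are mine, purely for illustration, and it assumes GNU uniq, since -w and -D are GNU extensions. It feeds the same six-line input through each option combination discussed above and labels each result; I have deliberately not hard-coded any expected output, so it simply shows whatever the installed version does.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): emits the six-line test input
# used in the examples above, without a trailing newline, as in the report.
sample() {
    printf '1\na1\na2\na3\na4\n2'
}

# Illustrative helper (hypothetical name): runs uniq with the given options,
# always comparing only the first character of each line (-w 1).
run_uniq() {
    sample | uniq "$@" -w 1
}

# $opts is left unquoted on purpose so that "-D -u" splits into two options
# and the empty string expands to no option at all.
for opts in "" "-u" "-D" "-D -u"; do
    printf '== uniq %s -w 1 ==\n' "$opts"
    run_uniq $opts
done
```

Running it prints one labelled block per option set, which makes it easy to see the -D and -D -u results next to each other.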

But... this behaviour IS exactly what I need, so if it can be documented then I would be happy, though others might not be.


On a side note, why can't the official info pages be automagically converted into man pages to avoid discrepancies?