tag 11220 notabug
thanks

On 04/10/2012 11:43 PM, phil colbourn wrote:
> What should this print?
>
> echo -e 'aa\naa\naa\n' | uniq -d

Thanks for the report.  POSIX requires this to print only a single
instance of 'aa', whether or not -d is in effect; coreutils does this by
outputting the last line in a series of duplicates.  The point of -d is
to suppress the single-line outputs that do not have a corresponding
duplicate input, not to output all instances of a duplicated line.

By the way, 'echo -e' is not portable; POSIX recommends you use printf
instead.

> Now, -D and -u means 'print all duplicate lines' and 'only print unique
> lines'.

-D is not specified by POSIX.  However, -u is defined by POSIX to
suppress output lines that have a corresponding duplicate input.

> I think this should print all lines since union of all unique lines and all
> duplicate lines is all lines.
>
> Therefore -Du prints first N-1 matching lines and not last matching line.

In isolation, uniq prints the last instance of the duplicated line, and
uniq -u suppresses the output of the 4th line.  In isolation, -D says to
output the first three lines which are normally omitted because they
have duplicates, in addition to the 4th line that is printed by default.
So in combination, -Du says to print the lines with subsequent
duplicates (the first three lines) but to suppress the output line that
corresponds to the last input line that ends a sequence of duplicates
(the 4th line).

Perhaps we can document this behavior better.  Or perhaps we can change
the behavior of -D (but at risk of breaking existing clients that depend
on the current behavior).  But we can't change -u or -d behavior.

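The POSIX-specified cases can be checked with a short sketch like the
following.  It assumes a POSIX-conforming uniq on the PATH and uses
printf rather than 'echo -e'; the helper function and variable names are
illustrative only.  It deliberately sticks to the default, -d, and -u,
since -D is a GNU extension whose combination with -u is the behavior
under discussion.

```shell
#!/bin/sh
# Illustrative helper: exactly three identical lines, no trailing empty line.
three_aa() { printf 'aa\naa\naa\n'; }

# Default: one output line per run of adjacent duplicates.
default_out=$(three_aa | uniq)

# -d: keep only runs that contain duplicates, still one line per run.
dup_out=$(three_aa | uniq -d)

# -u: keep only lines with no adjacent duplicate; this input has none.
uniq_out=$(three_aa | uniq -u)

printf 'default: [%s]\n' "$default_out"
printf '     -d: [%s]\n' "$dup_out"
printf '     -u: [%s]\n' "$uniq_out"
```

With this input, the default and -d cases each yield a single 'aa', and
-u yields nothing.
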
Put another way, per POSIX, the default behavior is subtractive (remove
any line with a subsequent duplicate), -d is subtractive (remove any
line with no duplicate), -u is subtractive (remove any last line that
had a prior duplicate), and GNU -D is additive (print any line with a
subsequent duplicate, to counter the initial default).

> Are these bugs?

At this point, I will claim that the behavior is intended, and therefore
close out the bug.  But if you are willing to submit documentation
patches, or even code patches accompanied by extensive test cases to
demonstrate the corner cases of any new behavior, feel free to continue
to reply to this bug report.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org