GNU bug report logs -
#22236
Not exactly a bug...
Message #8 received at 22236 <at> debbugs.gnu.org (full text, mbox):
tag 22236 notabug
close 22236
thanks
Hello Todd,
> On Dec 25, 2015, at 13:37, Todd Shandelman <todd.shandelman <at> gmail.com> wrote:
[...]
> So it looks like that for chars, 'uniq' has options to compare only the first N chars, or *all but* the first N chars.
>
> Whereas for fields, 'uniq' has only the option to skip the first N fields, but has no corresponding option to compare *only* the first N fields.
>
> Why this lack of symmetry?
This lack of symmetry originates from the POSIX standard, which codified the features that existed at the time:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/uniq.html
GNU Coreutils' uniq program has added a few more features, and there is a working plan to add the ability to use specific fields ( http://lists.gnu.org/archive/html/coreutils/2013-02/msg00082.html , http://lists.gnu.org/archive/html/coreutils/2013-09/msg00047.html ), but this has not yet been integrated into the main program. Perhaps it will be in a future version.
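To make the asymmetry concrete, here is a small sketch of the three existing options, assuming GNU uniq (the -w option is a GNU extension and is not in POSIX). The file demo.txt and the variable names are made up for illustration:

```shell
# Three sample lines in a scratch directory (illustrative data only).
tmpdir=$(mktemp -d)
printf '%s\n' 'ab 1' 'ab 2' 'cd 2' > "$tmpdir/demo.txt"

# -w 2: compare only the first 2 characters (GNU extension).
only_first=$(uniq -w 2 "$tmpdir/demo.txt")
# -s 2: skip the first 2 characters and compare the rest.
skip_chars=$(uniq -s 2 "$tmpdir/demo.txt")
# -f 1: skip the first field and compare the rest.
skip_field=$(uniq -f 1 "$tmpdir/demo.txt")

printf '%s\n' "-w 2:" "$only_first" "-s 2:" "$skip_chars" "-f 1:" "$skip_field"
rm -rf "$tmpdir"
```

With this data, -w 2 should keep 'ab 1' and 'cd 2', while -s 2 and -f 1 should both keep 'ab 1' and 'ab 2'. There is no field-based counterpart to -w.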
> And what do I do when I need that missing functionality, to compare only an initial subset of fields in each line?
To print lines that are unique with respect to specific fields, you can use 'sort'.
For example, given the following sample input file:
$ cat input.txt
1 A 10 x 100
5 B 14 z 104
2 A 11 x 101
3 B 12 y 102
4 B 13 z 103
Print one line for each distinct combination of values in columns 2 and 4 (the first such line in the input is kept):
$ sort -k2,2 -k4,4 -s -u input.txt
1 A 10 x 100
3 B 12 y 102
5 B 14 z 104
This can be extended to include as many fields as you need.
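If the original line order matters, one possible alternative (not part of the 'sort' approach above) is a common awk idiom. This is only a sketch, assuming a POSIX awk; it recreates the same sample input in a scratch directory, and the names tmpdir and result are illustrative:

```shell
# Recreate the sample input in a scratch directory (illustrative only).
tmpdir=$(mktemp -d)
cat > "$tmpdir/input.txt" <<'EOF'
1 A 10 x 100
5 B 14 z 104
2 A 11 x 101
3 B 12 y 102
4 B 13 z 103
EOF

# Print a line only the first time its (column 2, column 4) pair is seen.
# Unlike sort -u, this keeps the input order and does not sort.
result=$(awk '!seen[$2, $4]++' "$tmpdir/input.txt")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

This should print the lines starting with 1, 5 and 3, in that order. The trade-off is that awk keeps every distinct key in memory, whereas sort can spill to temporary files for very large inputs.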
If the fields are consecutive, you can specify them as a single range:
$ cat input2.txt
A x 1 97
B x 1 96
A x 1 99
A x 1 98
$ sort -k1,3 -u input2.txt
A x 1 97
B x 1 96
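Note that 'sort -u' collapses matching lines wherever they occur, whereas 'uniq' only compares adjacent lines. If you need the adjacent-lines behaviour on the first N fields without sorting, a rough awk sketch follows. It assumes a POSIX awk; the parameter n and the variable names are hypothetical, and this is not an existing uniq option:

```shell
# Recreate the second sample input in a scratch directory (illustrative only).
tmpdir=$(mktemp -d)
cat > "$tmpdir/input2.txt" <<'EOF'
A x 1 97
B x 1 96
A x 1 99
A x 1 98
EOF

# n is the number of leading fields to compare (an assumed parameter name).
adjacent=$(awk -v n=3 '{
    key = ""
    for (i = 1; i <= n && i <= NF; i++) key = key SUBSEP $i
    # Print the line unless its key equals the key of the previous line.
    if (NR == 1 || key != prev) print
    prev = key
}' "$tmpdir/input2.txt")
printf '%s\n' "$adjacent"
rm -rf "$tmpdir"
```

On this input only the last line is dropped, because 'A x 1 99' follows a 'B' line and therefore does not count as a repeat.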
regards,
- assaf
This bug report was last modified 6 years and 212 days ago.