GNU bug report logs - #10287
[wishlist] uniq can remove non adjacent lines


Package: coreutils

Reported by: Stéphane Blondon <stephane.blondon <at> gmail.com>

Date: Tue, 13 Dec 2011 02:52:01 UTC

Severity: wishlist

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.


From: Jim Meyering <jim <at> meyering.net>
To: Bob Proulx <bob <at> proulx.com>
Cc: 10287 <at> debbugs.gnu.org, Stéphane Blondon <stephane.blondon <at> gmail.com>
Subject: bug#10287: [wishlist] uniq can remove non adjacent lines
Date: Tue, 13 Dec 2011 09:46:14 +0100

Bob Proulx wrote:

> Stéphane Blondon wrote:
>> I think `uniq` should have an additional option (for example -a,
>> --all) to remove duplicate lines even when they are not adjacent.
>>
>> The man page explains a workaround based on `sort` but it can be
>> complex to use. A few weeks ago, I had to `uniq`-ize random numbers,
>> and the sort couldn't really work. Fortunately, the order was not
>> important so using `sort | uniq | sort --random-sort` was an
>> acceptable solution. I imagine cases based on other tools like `top`
>> could be a problem too.
>
> If you want to print only the first of a unique line then this perl
> one-liner will do it.
>
>   perl -lne 'print $_ if ! defined $a{$_}; $a{$_}=$_;'

Thanks, but with large files, isn't it better to store not
the full line, but rather a constant?

  perl -lne 'print $_ if ! defined $seen{$_}; $seen{$_}=1'

(actually, using "1" could be seen as misleading, since 0 would also
work; undef would not, because the test is "defined")

I think you can drop the "l".
I have a slight preference for this:

  perl -ne 'defined $seen{$_} or print; $seen{$_}=1'




This bug report was last modified 13 years and 246 days ago.


