GNU bug report logs - #10365
[PATCH] uniq: add ability to skip last N chars or fields

Package: coreutils;

Reported by: Adrien Kunysz <adrien <at> kunysz.be>

Date: Sun, 25 Dec 2011 19:33:02 UTC

Severity: normal

Tags: patch

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

Message #20 received at 10365 <at> debbugs.gnu.org (full text, mbox):

From: Adrien Kunysz <adrien <at> kunysz.be>
To: Jim Meyering <jim <at> meyering.net>
Cc: 10365 <at> debbugs.gnu.org, Pádraig Brady <P <at> draigBrady.com>
Subject: Re: bug#10365: [PATCH] uniq: add ability to skip last N chars or fields
Date: Mon, 26 Dec 2011 19:00:36 +0000
On Mon, Dec 26, 2011 at 05:42:25PM +0100, Jim Meyering wrote:
> Pádraig Brady wrote:
> 
> > On 12/25/2011 12:54 PM, Adrien Kunysz wrote:
> >> * doc/coreutils.texi: document the new feature
> >> * src/uniq.c (find_end): new function
> >> (check_file): use find_end() to determine when to stop comparing
> >> (usage): document the new feature
> >> (main): expose the new feature to user
> >> * tests/misc/uniq: add tests to exercise the new code
> >> ---
> >>  doc/coreutils.texi |   17 +++++++++++++
> >>  src/uniq.c         |   69 +++++++++++++++++++++++++++++++++++++++++++++++++---
> >>  tests/misc/uniq    |   15 +++++++++++
> >>  3 files changed, 97 insertions(+), 4 deletions(-)
> >>
> >> I have recently found myself wishing I could have uniq(1) skip
> >> the last N fields before comparison. I am aware of the rev(1) trick
> >> but I don't find it very satisfactory. So I ended up patching uniq
> >> and implementing the feature for character skipping as well.
> >>
> >> Documentation and tests included. Tests have also been run within
> >> Valgrind on x86_64.
> >
> > Thank you for being so thorough.
> >
> > Hmm, this is quite unusual functionality.
> > I was about to merge this with a previous feature request:
> > http://debbugs.gnu.org/5832
> > But in fact supporting --key would not provide this functionality.
> >
> > Why does `rev | uniq -f | rev` not suffice for you?

It just doesn't look very nice to me but I admit it actually works fine.
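
For reference, a minimal sketch of the rev(1) workaround discussed above. The
wrapper name uniq_skip_last_fields is hypothetical and is not part of the
patch or of coreutils. It assumes rev (from util-linux) and uniq are
available. Reversing each line turns the last N fields into the first N
fields, so the existing -f option can skip them.

```shell
# Hypothetical helper (not part of coreutils): collapse adjacent lines that
# differ only in their last N whitespace-separated fields.
# Assumes rev(1) from util-linux and a uniq that supports -f.
uniq_skip_last_fields() {
    # Reverse each line, skip the (now leading) N fields, reverse back.
    rev | uniq -f "$1" | rev
}

# "a 1" and "a 2" differ only in their last field, so only the first is kept.
printf 'a 1\na 2\nb 3\n' | uniq_skip_last_fields 1
# prints:
# a 1
# b 3
```

As with plain uniq, only adjacent lines are compared, so the input usually
needs to be grouped or sorted first.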

> > BTW you would need to start the copyright assignment process for
> > this feature, but we'd have to decide if it is generally useful enough
> > to proceed. Perhaps a concrete example would help.

I ended up refactoring my script in such a way that I don't need either
so I don't even have a concrete use case for this any more :) If anybody
finds this useful enough to be merged I am happy to go through the
copyright assignment process.

> I agree that it's borderline.
> If we add this functionality, I'd prefer to do it without adding new
> options.  Instead, just accept negative values for N in the three
> options that accept counts:
> 
>     $ uniq --help|grep -w N
>       -f, --skip-fields=N   avoid comparing the first N fields
>       -s, --skip-chars=N    avoid comparing the first N characters
>       -w, --check-chars=N   compare no more than N characters in lines

I initially wanted to implement it by using negative values for -f but
then realised it would mean you can't say "-f2 -F3" for example.
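
To make the three existing count options concrete, here is a short sketch of
their behaviour with non-negative N. The sample data is made up for
illustration. Negative values are only a proposal in this thread and are not
shown.

```shell
# -f N: skip the first N fields before comparing.
printf 'x apple\ny apple\nz berry\n' | uniq -f 1
# prints:
# x apple
# z berry

# -s N: skip the first N characters before comparing.
printf '1apple\n2apple\n3berry\n' | uniq -s 1
# prints:
# 1apple
# 3berry

# -w N: compare no more than N characters per line.
printf 'apple1\napple2\nberry3\n' | uniq -w 5
# prints:
# apple1
# berry3
```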

I wasn't aware of the feature request for --key and I think that
certainly looks more useful (with or without supporting negative field
indexes). I might try to write a patch for that later but don't hold
your breath.

This bug report was last modified 13 years and 209 days ago.
