GNU bug report logs - #24929
comm enhancement proposal: --print-summary --quiet


Package: coreutils

Reported by: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>

Date: Sat, 12 Nov 2016 09:46:02 UTC

Severity: normal

Tags: fixed

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 24929 in the body.
You can then email your comments to 24929 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-coreutils <at> gnu.org:
bug#24929; Package coreutils. (Sat, 12 Nov 2016 09:46:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 12 Nov 2016 09:46:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
To: bug-coreutils <at> gnu.org
Subject: comm enhancement proposal: --print-summary --quiet
Date: Sat, 12 Nov 2016 07:24:35 +0800
Please add a comm --print-summary and --quiet, so we wouldn't have to write
$ comm FILE1 FILE2|perl -nwe '
/^\t+/;
$h{ length $& || 0 }++;

END {
    @L = ( "Lines in 1st ", "Lines in 2nd ", "Lines in both" );
    printf "%s: %5d\n", $L[$_], $h{$_} for sort keys %h;
}
'
Lines in 1st :   601
Lines in 2nd :   437
Lines in both:  2417

(Which would get fooled by leading tabs in the files anyway.)
(Or add "only": 'Lines only in 1st'...)

--print-summary
        Print totals at end.

--quiet
        Suppress file content output.

In fact my formatting is ugly. Make it look like the output of
$ wc -l FILE1 FILE2

Thanks.
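
The leading-tab caveat mentioned above can be reproduced with a one-line file. The following is only a sketch: the scratch file names are illustrative, and the "naive" count is a simplified stand-in for the indentation-based tally in the Perl one-liner, not a translation of it. It compares that tally against letting comm select the column itself with -23.

```shell
# Scratch files (names are illustrative). file1 holds one line that
# itself begins with a tab; file2 is empty.
tmp=$(mktemp -d)
printf '\tx\n' > "$tmp/file1"
: > "$tmp/file2"
tab=$(printf '\t')

# Naive approach: treat every output line without a leading tab as column 1.
# grep -c exits non-zero when the count is 0, hence the "|| true".
naive=$(comm "$tmp/file1" "$tmp/file2" | grep -c -v "^$tab" || true)

# Column selection: -23 keeps only column 1, so no indentation is parsed.
exact=$(( $(comm -23 "$tmp/file1" "$tmp/file2" | wc -l) ))

echo "naive: $naive, exact: $exact"
rm -rf "$tmp"
```

The naive tally reports 0 lines unique to the first file, while the column-selecting form reports 1, because the data line's own tab is indistinguishable from comm's column indentation.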





Message #8 received at 24929 <at> debbugs.gnu.org (full text, mbox):

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>,
 24929 <at> debbugs.gnu.org
Subject: Re: bug#24929: comm enhancement proposal: --print-summary --quiet
Date: Mon, 14 Nov 2016 11:30:03 +0100
On 11/12/2016 12:24 AM, 積丹尼 Dan Jacobson wrote:
> Please add a comm --print-summary and --quiet, so we wouldn't have to write
> $ comm FILE1 FILE2|perl -nwe '
> /^\t+/;
> $h{ length $& || 0 }++;
> 
> END {
>     @L = ( "Lines in 1st ", "Lines in 2nd ", "Lines in both" );
>     printf "%s: %5d\n", $L[$_], $h{$_} for sort keys %h;
> }
> '
> Lines in 1st :   601
> Lines in 2nd :   437
> Lines in both:  2417

This sounds like a domain of diffstat(1), doesn't it?

Have a nice day,
Berny





Message #11 received at 24929 <at> debbugs.gnu.org (full text, mbox):

From: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
To: Bernhard Voelker <mail <at> bernhard-voelker.de>
Cc: 24929 <at> debbugs.gnu.org
Subject: Re: bug#24929: comm enhancement proposal: --print-summary --quiet
Date: Mon, 14 Nov 2016 22:38:02 +0800
>>>>> "BV" == Bernhard Voelker <mail <at> bernhard-voelker.de> writes:

BV> This sounds like a domain of diffstat(1), doesn't it?

Even if it is,
and even if one could understand http://invisible-island.net/diffstat/ ,
comm(1) already has everything it needs to produce the totals.
It would be far easier for comm to report them itself,
surely with just a few more lines of code,
than to make the user find and install a whole other program.





Message #14 received at 24929 <at> debbugs.gnu.org (full text, mbox):

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>,
 24929 <at> debbugs.gnu.org
Subject: Re: bug#24929: comm enhancement proposal: --print-summary --quiet
Date: Thu, 17 Nov 2016 03:51:25 +0100
On 11/12/2016 12:24 AM, 積丹尼 Dan Jacobson wrote:
> Please add [...]

> --print-summary
>         Print totals at end.
> 
> --quiet
>         Suppress file content output.

Just for fun (...), I've put the requested functionality into the
attached patch.

I'm only 60:40 for adding this to coreutils, as this may be considered
feature creep bloating the code; OTOH the size of the additional
code is not so scary, so I'll leave the decision up to the other CU
maintainers.

BTW: --quiet is not needed, because you can use "-123". ;-)

Have a nice day,
Berny
[0001-comm-add-total-option.patch (text/x-patch, attachment)]
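
The remark about "-123" can be checked directly: each of -1, -2 and -3 suppresses one output column, so giving all three leaves comm with no lines to print. A minimal sketch, assuming two small sorted scratch files whose names are illustrative:

```shell
# Scratch directory and sample data (names are illustrative only).
tmp=$(mktemp -d)
printf '%s\n' a b c > "$tmp/file1"
printf '%s\n' b c d > "$tmp/file2"

# -1, -2 and -3 each suppress one column; together they suppress all of them.
quiet_output=$(comm -123 "$tmp/file1" "$tmp/file2")

if [ -z "$quiet_output" ]; then
    echo "comm -123 printed nothing"
fi
rm -rf "$tmp"
```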


Message #17 received at 24929 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Bernhard Voelker <mail <at> bernhard-voelker.de>,
 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>,
 24929 <at> debbugs.gnu.org
Subject: Re: bug#24929: comm enhancement proposal: --print-summary --quiet
Date: Thu, 17 Nov 2016 10:12:24 +0000
On 17/11/16 02:51, Bernhard Voelker wrote:
> On 11/12/2016 12:24 AM, 積丹尼 Dan Jacobson wrote:
>> Please add [...]
> 
>> --print-summary
>>         Print totals at end.
>>
>> --quiet
>>         Suppress file content output.
> 
> Just for fun (...), I've put the requested functionality into the
> attached patch.
> 
> I'm only 60:40 for adding this to coreutils, as this may be considered
> feature creep bloating the code; OTOH the size of the additional
> code is not so scary, so I'll leave the decision up to the other CU
> maintainers.
> 
> BTW: --quiet is not needed, because you can use "-123". ;-)

Usually you'd want counts separately from each other
and separate from the data itself, in which case wc -l suffices:

 $ echo Lines in both = $(comm -12 file1 file2 | wc -l)
 $ echo Lines only in 1st = $(comm -23 file1 file2 | wc -l)
 $ echo Lines only in 2nd = $(comm -13 file1 file2 | wc -l)

So this is in the efficiency/convenience category.
I'm 50:50 on it (which means it goes in without further feedback).

thanks for the patch!
Pádraig.
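
The three wc -l pipelines above can be wrapped into one helper. This is only a sketch: the function name comm_counts is hypothetical, and both inputs are assumed to be sorted as comm requires. Each count is a separate comm pass, so the files are read three times, which is the efficiency cost referred to above.

```shell
# Print one labelled count per category for two sorted files.
# comm_counts is a hypothetical helper name, not part of coreutils.
comm_counts() {
    # $(( ... )) strips the padding that some wc implementations add.
    printf 'Lines only in 1st: %d\n' "$(( $(comm -23 "$1" "$2" | wc -l) ))"
    printf 'Lines only in 2nd: %d\n' "$(( $(comm -13 "$1" "$2" | wc -l) ))"
    printf 'Lines in both:     %d\n' "$(( $(comm -12 "$1" "$2" | wc -l) ))"
}

# Usage with illustrative scratch files.
tmp=$(mktemp -d)
printf '%s\n' a b c > "$tmp/file1"
printf '%s\n' b c d > "$tmp/file2"
comm_counts "$tmp/file1" "$tmp/file2"
rm -rf "$tmp"
```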





Message #20 received at 24929 <at> debbugs.gnu.org (full text, mbox):

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Pádraig Brady <P <at> draigBrady.com>,
 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>,
 24929 <at> debbugs.gnu.org
Subject: Re: bug#24929: comm enhancement proposal: --print-summary --quiet
Date: Thu, 17 Nov 2016 11:18:21 +0100
On 11/17/2016 11:12 AM, Pádraig Brady wrote:
> Usually you'd want counts separately from each other
> and separate from the data itself, in which case wc -l suffices:
> 
>  $ echo Lines in both = $(comm -12 file1 file2 | wc -l)
>  $ echo Lines only in 1st = $(comm -23 file1 file2 | wc -l)
>  $ echo Lines only in 2nd = $(comm -13 file1 file2 | wc -l)
> 
> So this is in the efficiency/convenience category.

You mean to change the --total flag to accept an argument?

  $ echo Lines in both = $(    comm -123 --total=3 file1 file2)
  $ echo Lines only in 1st = $(comm -123 --total=1 file1 file2)
  $ echo Lines only in 2nd = $(comm -123 --total=2 file1 file2)

and

  $ echo Lines only in 1st or 2nd = $(comm -123 --total=1,2 file1 file2)

?

Have a nice day,
Berny





Message #23 received at 24929 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Bernhard Voelker <mail <at> bernhard-voelker.de>,
 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>,
 24929 <at> debbugs.gnu.org
Subject: Re: bug#24929: comm enhancement proposal: --print-summary --quiet
Date: Thu, 17 Nov 2016 10:32:16 +0000
On 17/11/16 10:18, Bernhard Voelker wrote:
> On 11/17/2016 11:12 AM, Pádraig Brady wrote:
>> Usually you'd want counts separately from each other
>> and separate from the data itself, in which case wc -l suffices:
>>
>>  $ echo Lines in both = $(comm -12 file1 file2 | wc -l)
>>  $ echo Lines only in 1st = $(comm -23 file1 file2 | wc -l)
>>  $ echo Lines only in 2nd = $(comm -13 file1 file2 | wc -l)
>>
>> So this is in the efficiency/convenience category.
> 
> You mean to change the --total flag to accept an argument?
> 
>   $ echo Lines in both = $(    comm -123 --total=3 file1 file2)
>   $ echo Lines only in 1st = $(comm -123 --total=1 file1 file2)
>   $ echo Lines only in 2nd = $(comm -123 --total=2 file1 file2)
> 
> and
> 
>   $ echo Lines only in 1st or 2nd = $(comm -123 --total=1,2 file1 file2)

Sorry I meant if you only wanted a single count,
then the existing tools suffice.

Your implementation is fine as is I think.

thanks,
Pádraig





Message #26 received at 24929 <at> debbugs.gnu.org (full text, mbox):

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: Pádraig Brady <P <at> draigBrady.com>,
 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>,
 24929 <at> debbugs.gnu.org
Subject: Re: bug#24929: comm enhancement proposal: --print-summary --quiet
Date: Thu, 17 Nov 2016 11:43:53 +0100
On 11/17/2016 11:32 AM, Pádraig Brady wrote:
> Your implementation is fine as is I think.

Okay, thanks.
As I'm only slightly pro and you are 50:50, I'll wait for someone
else's opinion (/decision) whether to push.

Thanks & have a nice day,
Berny





Message #29 received at 24929 <at> debbugs.gnu.org (full text, mbox):

From: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: Bernhard Voelker <mail <at> bernhard-voelker.de>, 24929 <at> debbugs.gnu.org
Subject: Re: bug#24929: comm enhancement proposal: --print-summary --quiet
Date: Sat, 19 Nov 2016 05:43:24 +0800
The example is confusing, as it just happens to result in 1 2 3,

$ printf '%s\n' 1 2 3 4     > file1
$ printf '%s\n'   2 3 4 5 6 > file2
$ comm --total -123 file1 file2
1       2       3       total

So please use

$ printf '%s\n' 0   2 3   5 6       > file1
$ printf '%s\n'   1 2   4   6 7 8 9 > file2
$ comm --total -123 file1 file2
3       5       2       total

Also add a note: "However, --total is a GNU extension. For a portable way
to make totals, use wc":

$ echo Lines only in 1st = $(comm -23 file1 file2 | wc -l)
$ echo Lines only in 2nd = $(comm -13 file1 file2 | wc -l)
$ echo Lines in both =     $(comm -12 file1 file2 | wc -l)
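
The portability note can be turned into a small wrapper that prefers --total and falls back to wc -l where comm rejects the option. This is a sketch under stated assumptions: the function name comm_total is hypothetical, both inputs must already be sorted, and a non-zero exit from "comm --total" is taken to mean the option is unsupported.

```shell
# Print "only-in-1st<TAB>only-in-2nd<TAB>in-both<TAB>total" for two sorted files.
# comm_total is a hypothetical helper name, not part of coreutils.
comm_total() {
    if out=$(comm --total -123 "$1" "$2" 2>/dev/null); then
        # GNU comm with --total support: a single pass over both files.
        printf '%s\n' "$out"
    else
        # Portable fallback: three passes, one per column.
        printf '%d\t%d\t%d\ttotal\n' \
            "$(( $(comm -23 "$1" "$2" | wc -l) ))" \
            "$(( $(comm -13 "$1" "$2" | wc -l) ))" \
            "$(( $(comm -12 "$1" "$2" | wc -l) ))"
    fi
}

# Usage with the sample data proposed above (scratch file names are illustrative).
tmp=$(mktemp -d)
printf '%s\n' 0   2 3   5 6       > "$tmp/file1"
printf '%s\n'   1 2   4   6 7 8 9 > "$tmp/file2"
comm_total "$tmp/file1" "$tmp/file2"
rm -rf "$tmp"
```

With that data both branches print 3, 5 and 2 followed by "total", separated by tabs.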





Message #32 received at 24929 <at> debbugs.gnu.org (full text, mbox):

From: Bernhard Voelker <mail <at> bernhard-voelker.de>
To: 積丹尼 Dan Jacobson <jidanni <at> jidanni.org>,
 Pádraig Brady <P <at> draigBrady.com>
Cc: 24929 <at> debbugs.gnu.org
Subject: Re: bug#24929: comm enhancement proposal: --print-summary --quiet
Date: Tue, 22 Nov 2016 14:14:09 +0100
On 11/18/2016 10:43 PM, 積丹尼 Dan Jacobson wrote:
> The example is confusing, as it just happens to result in 1 2 3,
> 
> $ printf '%s\n' 1 2 3 4     > file1
> $ printf '%s\n'   2 3 4 5 6 > file2
> $ comm --total -123 file1 file2
> 1       2       3       total
> 
> So please use
> 
> $ printf '%s\n' 0   2 3   5 6       > file1
> $ printf '%s\n'   1 2   4   6 7 8 9 > file2
> $ comm --total -123 file1 file2
> 3       5       2       total

I see the point. I changed the example data to 'a b c ...' which I think
is even easier to read and understand.

> Also add a note: "However, --total is a GNU extension. For a portable way
> to make totals, use wc":
> 
> $ echo Lines only in 1st = $(comm -23 file1 file2 | wc -l)
> $ echo Lines only in 2nd = $(comm -13 file1 file2 | wc -l)
> $ echo Lines in both =     $(comm -12 file1 file2 | wc -l)

I'll squash in the attached, and push soon.

Thanks & have a nice day,
Berny
[comm-total-doc-amendmend.diff (text/x-patch, attachment)]


Message #35 received at 24929 <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
To: 24929 <at> debbugs.gnu.org
Subject: Re: bug#24929: comm enhancement proposal: --print-summary --quiet
Date: Sun, 28 Oct 2018 01:22:58 -0600
tags 24929 fixed
close 24929
stop

(triaging old bugs)

Pushed here:

comm: add --total option
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=b50a151346c42816034b5c26266eb753b7dbe737


Closing as "fixed".

-assaf





Added tag(s) fixed. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 28 Oct 2018 07:24:02 GMT) Full text and rfc822 format available.

bug closed, send any further explanations to 24929 <at> debbugs.gnu.org and 積丹尼 Dan Jacobson <jidanni <at> jidanni.org> Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Sun, 28 Oct 2018 07:24:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 25 Nov 2018 12:24:06 GMT) Full text and rfc822 format available.
