GNU bug report logs - #17075
diff - - exits immediately


Package: diffutils;

Reported by: karl <at> freefriends.org (Karl Berry)

Date: Sun, 23 Mar 2014 21:22:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.




Report forwarded to bug-diffutils <at> gnu.org:
bug#17075; Package diffutils. (Sun, 23 Mar 2014 21:22:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to karl <at> freefriends.org (Karl Berry):
New bug report received and forwarded. Copy sent to bug-diffutils <at> gnu.org. (Sun, 23 Mar 2014 21:22:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: karl <at> freefriends.org (Karl Berry)
To: bug-diffutils <at> gnu.org
Subject: diff - - exits immediately
Date: Sun, 23 Mar 2014 21:20:47 GMT
With diff 3.3, running
  diff - -
exits immediately, and successfully, for me.  (Compiled from the original
GNU source, running on CentOS 6.5.)

It seems like it should either read stdin twice (probably too much
trouble), or read stdin once and then abort when it can't be read again,
or just abort immediately.  Or something, just not success.

FWIW ...

karl




Information forwarded to bug-diffutils <at> gnu.org:
bug#17075; Package diffutils. (Sun, 23 Mar 2014 21:48:01 GMT) Full text and rfc822 format available.

Message #8 received at 17075 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Karl Berry <karl <at> freefriends.org>, 17075 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#17075: diff - - exits immediately
Date: Sun, 23 Mar 2014 14:47:13 -0700
Karl Berry wrote:
> It seems like it should either read stdin twice (probably too much
> trouble), or read stdin once and then abort when it can't be read again,
> or just abort immediately.  Or something, just not success.

I don't see why 'diff' should be prohibited from optimizing the case 
'diff A A'.  'diff' should be allowed to read 'A' just once (or even not 
at all, which is what 'diff' actually does).  '-' is just a special case 
of this.
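
As a rough illustration of the shortcut described above, the following Python sketch decides from metadata alone whether two operands name the same file, treating '-' as the standard input descriptor. The helper names `operand_stat` and `operands_identical` and the `stdin_fd` parameter are hypothetical, and this is not taken from diff's source; it only assumes that matching device and inode numbers are an acceptable identity test.

```python
import os


def operand_stat(name, stdin_fd=0):
    """Return stat metadata for an operand; '-' means the stdin descriptor."""
    if name == "-":
        return os.fstat(stdin_fd)
    return os.stat(name)


def operands_identical(name_a, name_b, stdin_fd=0):
    """True if both operands refer to the same file, judged by metadata only.

    Nothing is opened or read. Hypothetical helper, not diff's real code.
    """
    return os.path.samestat(
        operand_stat(name_a, stdin_fd), operand_stat(name_b, stdin_fd)
    )
```

Under this sketch, `operands_identical("-", "-")` is true for any kind of standard input, because both operands resolve to the same descriptor.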




Information forwarded to bug-diffutils <at> gnu.org:
bug#17075; Package diffutils. (Mon, 24 Mar 2014 12:34:02 GMT) Full text and rfc822 format available.

Message #11 received at 17075 <at> debbugs.gnu.org (full text, mbox):

From: Eric Blake <eblake <at> redhat.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Karl Berry <karl <at> freefriends.org>,
 17075 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#17075: bug#17075: diff - - exits immediately
Date: Mon, 24 Mar 2014 06:33:51 -0600
On 03/23/2014 03:47 PM, Paul Eggert wrote:
> Karl Berry wrote:
>> It seems like it should either read stdin twice (probably too much
>> trouble), or read stdin once and then abort when it can't be read again,
>> or just abort immediately.  Or something, just not success.
> 
> I don't see why 'diff' should be prohibited from optimizing the case
> 'diff A A'.  'diff' should be allowed to read 'A' just once (or even not
> at all, which is what 'diff' actually does).  '-' is just a special case
> of this.

POSIX states in XCU 1.4:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html

 Unless otherwise stated, the use of multiple instances of '-' to mean
standard input in a single command produces unspecified results.

and its description for 'diff' does not place any other requirements on
double '-', so we are free to behave however we want.  I personally
think that special-casing '-' to read stdin twice makes more sense, at
least when fstat(0) says that stdin is not a regular file (the way that
'cat - -' behaves differently for stdin used twice).  I agree with the
optimization of not reading a file at all if we know the argument is not
'-', or if the argument is '-' but fstat(0) says the file is regular.
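
The fstat(0) test mentioned above can be sketched in Python as follows. The function name `fd_is_regular_file` is made up for illustration; the sketch only shows how a program could tell a regular file from a pipe or terminal on standard input, not what any diff release actually does.

```python
import os
import stat


def fd_is_regular_file(fd=0):
    """Return True if the descriptor refers to a regular file.

    fd defaults to 0 (standard input). Pipes, terminals and sockets
    yield False.
    """
    return stat.S_ISREG(os.fstat(fd).st_mode)
```

With such a check, the proposal amounts to skipping the read only when the result is true, and reading standard input otherwise.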

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


Information forwarded to bug-diffutils <at> gnu.org:
bug#17075; Package diffutils. (Mon, 24 Mar 2014 15:28:01 GMT) Full text and rfc822 format available.

Message #14 received at 17075 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Eric Blake <eblake <at> redhat.com>, Karl Berry <karl <at> freefriends.org>, 
 17075 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#17075: bug#17075: diff - - exits immediately
Date: Mon, 24 Mar 2014 08:27:17 -0700
Eric Blake wrote:
> special-casing '-' to read stdin twice makes more sense, at
> least when fstat(0) says that stdin is not a regular file (the way that
> 'cat - -' behaves differently for stdin used twice).

The difference in behavior is inherent to what the two commands need to 
do.  'cat - -' must read standard input, whereas 'diff - -' needn't. 
It's like 'cmp -s - -'.

Here's another difference, which is also OK: 'diff -q - /etc/passwd' 
need not read standard input until EOF.  It can simply read stdin until 
it finds a difference, just as 'cmp -s - /etc/passwd' can.  'cat - 
/etc/passwd' doesn't have that liberty.
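
To make the "read until it finds a difference" point concrete, here is a minimal Python sketch of an early-exit comparison. The name `streams_differ` and the chunk size are assumptions for illustration, and the code is not how 'diff -q' or 'cmp -s' is implemented; it assumes buffered binary streams whose read(n) returns n bytes unless end of input is reached.

```python
import io


def streams_differ(stream_a, stream_b, chunk_size=4096):
    """Return True as soon as the two binary streams are known to differ.

    Reads both streams chunk by chunk and stops at the first mismatch,
    so the remainder of either stream may be left unread.
    """
    while True:
        chunk_a = stream_a.read(chunk_size)
        chunk_b = stream_b.read(chunk_size)
        if chunk_a != chunk_b:
            return True
        if not chunk_a:  # both streams ended at the same point
            return False


# Two 10001-byte inputs that differ in their very first byte.
first = io.BytesIO(b"x" + b"a" * 10000)
second = io.BytesIO(b"y" + b"a" * 10000)
differ = streams_differ(first, second, chunk_size=16)
consumed = first.tell()  # 16: only the first chunk was read
```

A concatenating program such as 'cat' has no equivalent shortcut, since its output depends on every byte of its input.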




Information forwarded to bug-diffutils <at> gnu.org:
bug#17075; Package diffutils. (Mon, 24 Mar 2014 18:11:01 GMT) Full text and rfc822 format available.

Message #17 received at 17075 <at> debbugs.gnu.org (full text, mbox):

From: karl <at> freefriends.org (Karl Berry)
To: eggert <at> cs.ucla.edu
Cc: 17075 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#17075: diff - - exits immediately
Date: Mon, 24 Mar 2014 18:10:19 GMT
    diff A A

Couldn't there be side effects missed by not reading the input?
E.g., if stdin is a pipe, or maybe a named pipe, or a special
file.  Doesn't seem equivalent to regular files to me, as Eric says.

Anyway, I won't argue for a change in behavior.  I was just surprised
that stdin was not read at all.  Perhaps the help message could get a
one-line addition:

< If a FILE is '-', read standard input.
-
> If a FILE is '-', read standard input.
> If FILES are identical strings, nothing is read.

Not sure that's exactly right, but it's the best my brain can do right now.

The manual explicitly says:
    As a special case, `diff - -' compares a copy of standard input to
    itself.
That does not seem accurate to me (perhaps it was once).  It doesn't
read stdin at all, it just exits zero immediately.

Not that any of this is a big deal, of course.

thanks,
karl




Information forwarded to bug-diffutils <at> gnu.org:
bug#17075; Package diffutils. (Tue, 25 Mar 2014 00:07:02 GMT) Full text and rfc822 format available.

Message #20 received at 17075 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Karl Berry <karl <at> freefriends.org>
Cc: 17075 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#17075: diff - - exits immediately
Date: Mon, 24 Mar 2014 17:06:05 -0700
Karl Berry wrote:
> Couldn't there be side effects missed by not reading the input?

Sure, but the invoker shouldn't rely on those side effects, from 'diff' 
or from 'cmp' or from 'head' or from any other program that need not 
read all its input.

> Perhaps the help message could get a one-line addition:

Unfortunately all the one-liners I can think of are wrong.  For example, 
diff must read all its input even when given two identical file names A 
and A, e.g., if it's also given the -DFOO option.  And even without 
-DFOO, 'diff A A' reports an error if A does not exist.

Another amusing example: 'diff A A' can succeed even if A is unreadable:

$ umask 777
$ echo foo >A
$ ls -l A
---------- 1 eggert eggert 4 Mar 24 17:02 A
$ diff A A
$ echo $?
0
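
The transcript above is consistent with a check that consults only file metadata: stat() does not need read permission on the file itself, yet it still fails for a name that does not exist. The Python sketch below illustrates that under stated assumptions; `identical_by_metadata` is a hypothetical helper, not diff's code.

```python
import os
import tempfile


def identical_by_metadata(name_a, name_b):
    """Compare two names by stat() only; file contents are never opened.

    Raises FileNotFoundError if either name does not exist.
    """
    return os.path.samestat(os.stat(name_a), os.stat(name_b))


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "A")
    with open(path, "w") as handle:
        handle.write("foo\n")
    os.chmod(path, 0)  # mode ----------, like the file in the transcript
    unreadable_ok = identical_by_metadata(path, path)

    missing = os.path.join(workdir, "missing")
    try:
        identical_by_metadata(missing, missing)
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```

Here `unreadable_ok` ends up true even though the file has no permission bits set, while the missing name produces an error instead of a silent success.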





Information forwarded to bug-diffutils <at> gnu.org:
bug#17075; Package diffutils. (Tue, 25 Mar 2014 21:13:02 GMT) Full text and rfc822 format available.

Message #23 received at 17075 <at> debbugs.gnu.org (full text, mbox):

From: karl <at> freefriends.org (Karl Berry)
To: eggert <at> cs.ucla.edu
Cc: 17075 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#17075: diff - - exits immediately
Date: Tue, 25 Mar 2014 21:12:37 GMT
    Unfortunately all the one-liners I can think of are wrong.

Yeah, ok, forget that.

How about more than one line of explanation in the manual, then?
At least to avoid the wrong implication in the manual now, even if we
don't want to say anything very specific about the behavior.  For instance:

  Given the same file name twice, @code{diff} will ordinarily
  immediately report success, but may or may not read the file even
  once, depending on other options (e.g., @code{-D}) or the situation
  (e.g., if the file exists).  This includes the case where the file is
  standard input, that is, @code{diff - -}.

Something ...

Thanks,
k




Information forwarded to bug-diffutils <at> gnu.org:
bug#17075; Package diffutils. (Tue, 25 Mar 2014 23:43:02 GMT) Full text and rfc822 format available.

Message #26 received at 17075 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Karl Berry <karl <at> freefriends.org>
Cc: 17075 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#17075: diff - - exits immediately
Date: Tue, 25 Mar 2014 16:42:19 -0700
I dunno, even that sounds dubious, as it's incomplete.  For example, 
when A and B are different files, 'diff -q A B' reads neither A nor B if 
it determines via the 'stat' syscall that the files are different sizes.
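
A size-based shortcut of this kind might look like the Python sketch below. The function name `sizes_prove_difference` is hypothetical and the code is only an illustration, not diff's implementation; it assumes that st_size is meaningful only for regular files, and it answers only the yes/no question that '-q' asks.

```python
import os
import stat


def sizes_prove_difference(name_a, name_b):
    """Return True if stat() alone shows the two files must differ.

    Applies only when both are regular files, where st_size is reliable.
    False means "cannot tell without reading", not "the files are equal".
    """
    stat_a = os.stat(name_a)
    stat_b = os.stat(name_b)
    both_regular = stat.S_ISREG(stat_a.st_mode) and stat.S_ISREG(stat_b.st_mode)
    return both_regular and stat_a.st_size != stat_b.st_size
```

Files of equal size still have to be read, so the shortcut can only ever confirm a difference.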

More generally, I'm not sure it's necessary or wise to describe exactly 
the optimizations 'diff' uses to avoid reading files.  Quite possibly, 
though, I'm not understanding the problem that caused you to file the 
bug report in the first place.




Information forwarded to bug-diffutils <at> gnu.org:
bug#17075; Package diffutils. (Tue, 25 Mar 2014 23:56:01 GMT) Full text and rfc822 format available.

Message #29 received at 17075 <at> debbugs.gnu.org (full text, mbox):

From: Karl Berry <karl <at> freefriends.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 17075 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#17075: diff - - exits immediately
Date: Tue, 25 Mar 2014 17:55:31 -0600
    though, I'm not understanding the problem that caused you to file the 
    bug report in the first place.

I was surprised that "diff - -" did not read stdin.

Sure, what I wrote is incomplete; completeness wasn't the goal (and
surely isn't desired).  Avoiding user surprise was the goal.

Here is the one sentence in the manual which I think should be changed,
regardless of anything else:
    As a special case, `diff - -' compares a copy of standard input to
    itself.
I suppose some tortuous interpretation could be made to consider that
technically not false, but the straightforward implication is that it
reads stdin.  Even just deleting the sentence and replacing it with
nothing would be better than leaving it, seems to me.

But I think it would be better to say *something* about the fact that
diff does not always read its input, in the event that it can determine the
result via other methods.  This is so unlike virtually every other
program, as we've discussed in this thread, that I think it deserves
mention.  If it counts for anything, I've been using Unix for 30+ years
and have worked (a tiny bit) on the diff source, the diff manual, and tons
of other utilities, and I was *still* surprised.

best,
karl




Information forwarded to bug-diffutils <at> gnu.org:
bug#17075; Package diffutils. (Tue, 25 Mar 2014 23:57:02 GMT) Full text and rfc822 format available.

Message #32 received at 17075 <at> debbugs.gnu.org (full text, mbox):

From: Karl Berry <karl <at> freefriends.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 17075 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#17075: diff - - exits immediately
Date: Tue, 25 Mar 2014 17:55:57 -0600
    More generally, I'm not sure it's necessary or wise to describe exactly 
    the optimizations 'diff' uses to avoid reading files.  

I completely agree and that was never (intended to be) my suggestion.

k




Information forwarded to bug-diffutils <at> gnu.org:
bug#17075; Package diffutils. (Wed, 26 Mar 2014 01:07:02 GMT) Full text and rfc822 format available.

Message #35 received at 17075 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Karl Berry <karl <at> freefriends.org>
Cc: 17075 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#17075: diff - - exits immediately
Date: Tue, 25 Mar 2014 18:06:53 -0700
Karl Berry wrote:
> I think it would be better to say *something* about the fact that
> diff does not always read its input, in the event that it can determine the
> result via other methods.

Thanks, that sounds good; I installed the attached.

> This is so unlike virtually every other program

Hmm, well, in diff's defense, lots of commonly-used programs avoid 
reading some or all their input in some cases, including 'grep', 'head', 
'tail', 'sed', 'awk', 'dd', 'more', and 'od'.  It's not just 'diff' and 
'cmp'.
[0001-doc-improve-documentation-about-reading-and-stdin.patch (text/plain, attachment)]

Information forwarded to bug-diffutils <at> gnu.org:
bug#17075; Package diffutils. (Thu, 27 Mar 2014 21:41:02 GMT) Full text and rfc822 format available.

Message #38 received at 17075 <at> debbugs.gnu.org (full text, mbox):

From: karl <at> freefriends.org (Karl Berry)
To: eggert <at> cs.ucla.edu
Cc: 17075 <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#17075: diff - - exits immediately
Date: Thu, 27 Mar 2014 21:40:08 GMT
    Hmm, well, in diff's defense, lots of commonly-used programs avoid 
    reading some or all their input in some cases, 

Sure.  But in most other cases, the user says something to imply the
partial read, e.g., it would actually be surprising in the other
direction if head -10 read more than 10 lines.  But head -10 - does read
stdin ... anyway, I don't argue that diff is unique in this regard, only
that in practice it comes up more and is more surprising when it does.

Thanks for the doc patch, looks good to me.

karl




Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Sun, 30 Mar 2014 05:05:03 GMT) Full text and rfc822 format available.

Notification sent to karl <at> freefriends.org (Karl Berry):
bug acknowledged by developer. (Sun, 30 Mar 2014 05:05:03 GMT) Full text and rfc822 format available.

Message #43 received at 17075-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Karl Berry <karl <at> freefriends.org>
Cc: 17075-done <at> debbugs.gnu.org
Subject: Re: [bug-diffutils] bug#17075: diff - - exits immediately
Date: Sat, 29 Mar 2014 22:04:25 -0700
Karl Berry wrote:
> Thanks for the doc patch, looks good to me.

Thanks; closing the bug report.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 27 Apr 2014 11:24:05 GMT) Full text and rfc822 format available.



