GNU bug report logs - #17075
diff - - exits immediately
Reported by: karl <at> freefriends.org (Karl Berry)
Date: Sun, 23 Mar 2014 21:22:02 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 17075 in the body.
You can then email your comments to 17075 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-diffutils <at> gnu.org: bug#17075; Package diffutils. (Sun, 23 Mar 2014 21:22:02 GMT)
Acknowledgement sent to karl <at> freefriends.org (Karl Berry): New bug report received and forwarded. Copy sent to bug-diffutils <at> gnu.org. (Sun, 23 Mar 2014 21:22:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
With diff 3.3, running
diff - -
exits immediately, and successfully, for me. (Compiled from the original
GNU source, running on CentOS 6.5.)
It seems like it should either read stdin twice (probably too much
trouble), or read stdin once and then abort when it can't be read again,
or just abort immediately. Or something, just not success.
FWIW ...
karl
Information forwarded to bug-diffutils <at> gnu.org: bug#17075; Package diffutils. (Sun, 23 Mar 2014 21:48:01 GMT)
Message #8 received at 17075 <at> debbugs.gnu.org (full text, mbox):
Karl Berry wrote:
> It seems like it should either read stdin twice (probably too much
> trouble), or read stdin once and then abort when it can't be read again,
> or just abort immediately. Or something, just not success.
I don't see why 'diff' should be prohibited from optimizing the case
'diff A A'. 'diff' should be allowed to read 'A' just once (or even not
at all, which is what 'diff' actually does). '-' is just a special case
of this.
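
[Editorial sketch] The shortcut described above can be modelled in a few lines. This is a toy model under stated assumptions, not GNU diff's source: the names is_same_file and toy_diff_status are hypothetical, and file identity is decided from os.stat() metadata alone, so the contents are never opened when both arguments name the same file.

```python
import os


def is_same_file(path_a: str, path_b: str) -> bool:
    """Return True if both paths name the same underlying file.

    Only metadata is consulted (device and inode numbers), so the
    file contents are never opened or read.
    """
    return os.path.samestat(os.stat(path_a), os.stat(path_b))


def toy_diff_status(path_a: str, path_b: str) -> int:
    """Toy exit status: 0 for 'no differences', 1 for 'files differ'."""
    if is_same_file(path_a, path_b):
        # Shortcut: a file cannot differ from itself, so skip reading it.
        return 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return 0 if fa.read() == fb.read() else 1
```

In this model toy_diff_status("A", "A") returns 0 without opening A, and it raises an error if A does not exist, because os.stat() fails first.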
Information forwarded to bug-diffutils <at> gnu.org: bug#17075; Package diffutils. (Mon, 24 Mar 2014 12:34:02 GMT)
Message #11 received at 17075 <at> debbugs.gnu.org (full text, mbox):
On 03/23/2014 03:47 PM, Paul Eggert wrote:
> Karl Berry wrote:
>> It seems like it should either read stdin twice (probably too much
>> trouble), or read stdin once and then abort when it can't be read again,
>> or just abort immediately. Or something, just not success.
>
> I don't see why 'diff' should be prohibited from optimizing the case
> 'diff A A'. 'diff' should be allowed to read 'A' just once (or even not
> at all, which is what 'diff' actually does). '-' is just a special case
> of this.
POSIX states in XCU 1.4:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html
Unless otherwise stated, the use of multiple instances of '-' to mean
standard input in a single command produces unspecified results.
and its description for 'diff' does not place any other requirements on
double '-', so we are free to behave however we want. I personally
think that special-casing '-' to read stdin twice makes more sense, at
least when fstat(0) says that stdin is not a regular file (the way that
'cat - -' behaves differently for stdin used twice). I agree with the
optimization of not reading a file at all if we know the argument is not
'-', or if the argument is '-' but fstat(0) says the file is regular.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
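
[Editorial sketch] The fstat(0) test mentioned in the message above can be illustrated as follows. This is a minimal sketch, assuming a POSIX-like system; the helper name fd_is_regular_file is hypothetical and is not taken from diffutils.

```python
import os
import stat


def fd_is_regular_file(fd: int) -> bool:
    """Return True if the open descriptor refers to a regular file.

    A pipe, terminal, or other special file returns False. Such input
    can be consumed only once, which is why reading it "twice" would
    need special handling.
    """
    return stat.S_ISREG(os.fstat(fd).st_mode)
```

Calling fd_is_regular_file(0) answers the question for standard input: it is expected to be true when the program is started with input redirected from a regular file, and false when standard input is a pipe.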
Information forwarded to bug-diffutils <at> gnu.org: bug#17075; Package diffutils. (Mon, 24 Mar 2014 15:28:01 GMT)
Message #14 received at 17075 <at> debbugs.gnu.org (full text, mbox):
Eric Blake wrote:
> special-casing '-' to read stdin twice makes more sense, at
> least when fstat(0) says that stdin is not a regular file (the way that
> 'cat - -' behaves differently for stdin used twice).
The difference in behavior is inherent to what the two commands need to
do. 'cat - -' must read standard input, whereas 'diff - -' needn't.
It's like 'cmp -s - -'.
Here's another difference, which is also OK: 'diff -q - /etc/passwd'
need not read standard input until EOF. It can simply read stdin until
it finds a difference, just as 'cmp -s - /etc/passwd' can. 'cat -
/etc/passwd' doesn't have that liberty.
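
[Editorial sketch] The early exit described above, where a yes/no comparison stops at the first difference instead of reading to EOF, can be sketched like this. It is an illustration only, not the code of 'diff -q' or 'cmp -s'; the name streams_differ is hypothetical, and the sketch assumes buffered binary streams, whose read(n) returns fewer than n bytes only at end of input.

```python
import io
from typing import BinaryIO


def streams_differ(a: BinaryIO, b: BinaryIO, chunk_size: int = 4096) -> bool:
    """Return True as soon as the two streams are known to differ.

    Input is consumed chunk by chunk; once a mismatch is found the
    rest of both streams is left unread.
    """
    while True:
        chunk_a = a.read(chunk_size)
        chunk_b = b.read(chunk_size)
        if chunk_a != chunk_b:
            return True   # first difference found: stop reading
        if not chunk_a:
            return False  # both streams ended with no difference


# Usage: only the first chunk of each stream is read here.
left = io.BytesIO(b"abc" + b"x" * 10000)
right = io.BytesIO(b"abd" + b"x" * 10000)
found_difference = streams_differ(left, right, chunk_size=4)
```

A program that must copy its input, such as 'cat', has no equivalent shortcut: it has to keep reading until EOF.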
Information forwarded to bug-diffutils <at> gnu.org: bug#17075; Package diffutils. (Mon, 24 Mar 2014 18:11:01 GMT)
Message #17 received at 17075 <at> debbugs.gnu.org (full text, mbox):
> diff A A
Couldn't there be side effects missed by not reading the input?
E.g., if stdin is a pipe, or maybe a named pipe, or a special
file. Doesn't seem equivalent to regular files to me, as Eric says.
Anyway, I won't argue for a change in behavior. I was just surprised
that stdin was not read at all. Perhaps the help message could get a
one-line addition:
< If a FILE is '-', read standard input.
---
> If a FILE is '-', read standard input.
> If FILES are identical strings, nothing is read.
Not sure that's exactly right, but it's the best my brain can do right now.
The manual explicitly says:
As a special case, `diff - -' compares a copy of standard input to
itself.
That does not seem accurate to me (perhaps it was once). It doesn't
read stdin at all, it just exits zero immediately.
Not that any of this is a big deal, of course.
thanks,
karl
Information forwarded to bug-diffutils <at> gnu.org: bug#17075; Package diffutils. (Tue, 25 Mar 2014 00:07:02 GMT)
Message #20 received at 17075 <at> debbugs.gnu.org (full text, mbox):
Karl Berry wrote:
> Couldn't there be side effects missed by not reading the input?
Sure, but the invoker shouldn't rely on those side effects, from 'diff'
or from 'cmp' or from 'head' or from any other program that need not
read all its input.
> Perhaps the help message could get a one-line addition:
Unfortunately all the one-liners I can think of are wrong. For example,
diff must read all its input even when given two identical file names A
and A, e.g., if it's also given the -DFOO option. And even without
-DFOO, 'diff A A' reports an error if A does not exist.
Another amusing example: 'diff A A' can succeed even if A is unreadable:
$ umask 777
$ echo foo >A
$ ls -l A
---------- 1 eggert eggert 4 Mar 24 17:02 A
$ diff A A
$ echo $?
0
Information forwarded to bug-diffutils <at> gnu.org: bug#17075; Package diffutils. (Tue, 25 Mar 2014 21:13:02 GMT)
Message #23 received at 17075 <at> debbugs.gnu.org (full text, mbox):
> Unfortunately all the one-liners I can think of are wrong.
Yeah, ok, forget that.
How about more than one line of explanation in the manual, then?
At least to avoid the wrong implication in the manual now, even if we
don't want to say anything very specific about the behavior. For instance:
Given the same file name twice, @code{diff} will ordinarily
immediately report success, but may or may not read the file even
once, depending on other options (e.g., @code{-D}) or the situation
(e.g., if the file exists). This includes the case where the file is
standard input, that is, @code{diff - -}.
Something ...
Thanks,
k
Information forwarded to bug-diffutils <at> gnu.org: bug#17075; Package diffutils. (Tue, 25 Mar 2014 23:43:02 GMT)
Message #26 received at 17075 <at> debbugs.gnu.org (full text, mbox):
I dunno, even that sounds dubious, as it's incomplete. For example,
when A and B are different files, 'diff -q A B' reads neither A nor B if
it determines via the 'stat' syscall that the files are different sizes.
More generally, I'm not sure it's necessary or wise to describe exactly
the optimizations 'diff' uses to avoid reading files. Quite possibly,
though, I'm not understanding the problem that caused you to file the
bug report in the first place.
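
[Editorial sketch] The size check mentioned above can be modelled as follows. This is a hedged illustration of the idea, not diffutils code: the name sizes_prove_difference is hypothetical, and the shortcut only helps when a plain "differ or not" answer is wanted, as with 'diff -q'.

```python
import os
import stat


def sizes_prove_difference(path_a: str, path_b: str) -> bool:
    """Return True if metadata alone shows the two files must differ.

    Sizes are meaningful only for regular files, so anything else
    (pipes, devices, directories) gives False, meaning "read to decide".
    """
    st_a = os.stat(path_a)
    st_b = os.stat(path_b)
    both_regular = stat.S_ISREG(st_a.st_mode) and stat.S_ISREG(st_b.st_mode)
    return both_regular and st_a.st_size != st_b.st_size
```

A False result does not mean the files are equal; it only means the contents have to be compared.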
Information forwarded to bug-diffutils <at> gnu.org: bug#17075; Package diffutils. (Tue, 25 Mar 2014 23:56:01 GMT)
Message #29 received at 17075 <at> debbugs.gnu.org (full text, mbox):
> though, I'm not understanding the problem that caused you to file the
> bug report in the first place.
I was surprised that "diff - -" did not read stdin.
Sure, what I wrote is incomplete; completeness wasn't the goal (and
surely isn't desired). Avoiding user surprise was the goal.
Here is the one sentence in the manual which I think should be changed,
regardless of anything else:
As a special case, `diff - -' compares a copy of standard input to
itself.
I suppose some torturous interpretation could be made to consider that
technically not false, but the straightforward implication is that it
reads stdin. Even just deleting the sentence and replacing it with
nothing would be better than leaving it, seems to me.
But I think it would be better to say *something* about the fact that
diff does not always read its input, in the event that it can determine the
result via other methods. This is so unlike virtually every other
program, as we've discussed in this thread, that I think it deserves
mention. If it counts for anything, I've been using Unix for 30+ years
and have worked (a tiny bit) on the diff source, the diff manual, and tons
of other utilities, and I was *still* surprised.
best,
karl
Information forwarded to bug-diffutils <at> gnu.org: bug#17075; Package diffutils. (Tue, 25 Mar 2014 23:57:02 GMT)
Message #32 received at 17075 <at> debbugs.gnu.org (full text, mbox):
> More generally, I'm not sure it's necessary or wise to describe exactly
> the optimizations 'diff' uses to avoid reading files.
I completely agree and that was never (intended to be) my suggestion.
k
Information forwarded to bug-diffutils <at> gnu.org: bug#17075; Package diffutils. (Wed, 26 Mar 2014 01:07:02 GMT)
Message #35 received at 17075 <at> debbugs.gnu.org (full text, mbox):
Karl Berry wrote:
> I think it would be better to say *something* about the fact that
> diff does not always read its input, in the event that it can determine the
> result via other methods.
Thanks, that sounds good; I installed the attached.
> This is so unlike virtually every other program
Hmm, well, in diff's defense, lots of commonly-used programs avoid
reading some or all their input in some cases, including 'grep', 'head',
'tail', 'sed', 'awk', 'dd', 'more', and 'od'. It's not just 'diff' and
'cmp'.
[0001-doc-improve-documentation-about-reading-and-stdin.patch (text/plain, attachment)]
Information forwarded to bug-diffutils <at> gnu.org: bug#17075; Package diffutils. (Thu, 27 Mar 2014 21:41:02 GMT)
Message #38 received at 17075 <at> debbugs.gnu.org (full text, mbox):
> Hmm, well, in diff's defense, lots of commonly-used programs avoid
> reading some or all their input in some cases,
Sure. But in most other cases, the user says something to imply the
partial read, e.g., it would actually be surprising in the other
direction if head -10 read more than 10 lines. But head -10 - does read
stdin ... anyway, I don't argue that diff is unique in this regard, only
that in practice it comes up more and is more surprising when it does.
Thanks for the doc patch, looks good to me.
karl
Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>: You have taken responsibility. (Sun, 30 Mar 2014 05:05:03 GMT)
Notification sent to karl <at> freefriends.org (Karl Berry): bug acknowledged by developer. (Sun, 30 Mar 2014 05:05:03 GMT)
Message #43 received at 17075-done <at> debbugs.gnu.org (full text, mbox):
Karl Berry wrote:
> Thanks for the doc patch, looks good to me.
Thanks; closing the bug report.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 27 Apr 2014 11:24:05 GMT)