GNU bug report logs -
#8490
dd reads random number of records from pipes - named or otherwise - coreutils 8.9
Reported by: Jesse Gordon <jesseg <at> nikola.com>
Date: Wed, 13 Apr 2011 02:46:02 UTC
Severity: normal
Done: Eric Blake <eblake <at> redhat.com>
Bug is archived. No further changes may be made.
Message #24 received at 8490-done <at> debbugs.gnu.org (full text, mbox):
On 04/13/2011 05:36 PM, Jesse Gordon wrote:
>> Short reads are not an error, but are a real phenomenon when reading
>> from pipes.
>>
> I agree - short reads from pipes are real. But I don't see why they
> should ever need to cause dd to skip data from the pipe.
dd is not skipping data. It is stopping after exactly count=1000 reads,
just like you asked; so it is ending shy of the amount of data you
thought you were getting.
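
To illustrate the point about count= limiting the number of reads rather
than the number of full blocks, here is a minimal Python sketch. It is not
dd's code. The names copy_count_reads and slow_producer_read are
hypothetical, and the slow producer is simulated by writing 3 bytes into
the pipe just before each read.

```python
import os

def copy_count_reads(read_fn, bs, count):
    """Issue exactly `count` reads of up to `bs` bytes; keep whatever each returns."""
    out = bytearray()
    for _ in range(count):
        block = read_fn(bs)
        if not block:       # end of input
            break
        out += block        # a short block still uses up one of the `count` reads
    return bytes(out)

r, w = os.pipe()
chunks = iter([b"abc"] * 4)

def slow_producer_read(n):
    # Simulated producer: only 3 bytes have arrived by the time of each read.
    os.write(w, next(chunks))
    return os.read(r, n)

data = copy_count_reads(slow_producer_read, bs=8, count=4)
os.close(r)
os.close(w)
print(len(data))  # 12, not 8 * 4 = 32
```

Nothing is dropped here: every byte that was read is kept. The copy just
ends after four reads, well short of 32 bytes.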
>>> dd doesn't seem to abort early when reading hard drives, even if the
>>> block size isn't the same as the hard drive IO read size, or even if it
>>> has to wait a few ms for the drive to seek.
>> That's because hard drives, being physical devices, have all of their
>> data handy at once.
> Is that really true?
From the application's perspective - yes. With pipes, the kernel has to
schedule another process to run, and that other process can take an
indefinite amount of time producing data. Since the kernel has no
control over when other processes will actually produce that data, a
short read is the only viable answer that doesn't deadlock the system.
But with files, even if the kernel has to swap out in order to continue
reading from the disk, it has complete control over the data and does
not have to wait on any external process, so it faces no arbitrary
waits. It may have a finite (and even long) wait while spinning to the
next sector for the next portion of data, but that wait is independent
of all other system activity. The kernel can therefore afford to avoid
short reads when reading from devices, since there is no way to deadlock
the system while getting the rest of the data.
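
The difference can be observed directly with read(). The following is a
small Python sketch, assuming a Linux-like system. The temporary file and
the 4096-byte request size are arbitrary choices for illustration.

```python
import os
import tempfile

# Pipe: the producer has written only 5 bytes so far.
r, w = os.pipe()
os.write(w, b"hello")
pipe_block = os.read(r, 4096)    # returns what is available, not 4096 bytes
os.close(r)
os.close(w)

# Regular file: all the data is already there for the kernel to fetch.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"x" * 8192)
    os.lseek(fd, 0, os.SEEK_SET)
    file_block = os.read(fd, 4096)   # in practice returns the full request
finally:
    os.close(fd)
    os.unlink(path)

print(len(pipe_block), len(file_block))
```

The pipe read returns as soon as some data is available, while the file
read is satisfied in full because enough data exists before end of file.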
>>> However, iflag=fullblock seems to fix it.
>> That's one fix. But it's GNU-specific. If you want the POSIX-compliant
>> fix, then use ibs/obs instead of bs.
>>
> Setting instead ibs and obs does _NOT_ fix. Try it yourself! :-)
I stand corrected. On re-reading POSIX, I concur that there is no way,
using _just_ POSIX options, to require that a particular amount of input
data be read regardless of short reads; you _have_ to use the GNU
extension of iflag=fullblock.
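
The idea behind a full-block read is to keep calling read() until the
requested amount has accumulated or the input ends. Below is a minimal
Python sketch of that loop. It is an approximation of the behavior, not
GNU dd's actual implementation, and read_full and producer are
hypothetical names. The producer thread stands in for a slow writer on
the other end of the pipe.

```python
import os
import threading
import time

def read_full(fd, n):
    """Keep reading until n bytes are collected or end of input is reached."""
    buf = bytearray()
    while len(buf) < n:
        chunk = os.read(fd, n - len(buf))
        if not chunk:        # EOF: return the final, possibly partial, block
            break
        buf += chunk
    return bytes(buf)

def producer(fd):
    # Hypothetical slow writer: ten 3-byte chunks with a pause between them.
    for _ in range(10):
        os.write(fd, b"abc")
        time.sleep(0.01)
    os.close(fd)

r, w = os.pipe()
t = threading.Thread(target=producer, args=(w,))
t.start()

blocks = []
while True:
    block = read_full(r, 8)
    if not block:
        break
    blocks.append(block)

t.join()
os.close(r)
print([len(b) for b in blocks])  # [8, 8, 8, 6]
```

However the writer's timing falls, every block except the last is a full
8 bytes, so counting blocks again corresponds to counting bytes.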
> I can't think of a single scenario where any script would rely on dd
> dropping bytes from the input pipe.
> Can anyone else?
It doesn't matter what you think; the problem is that you can't change
existing behavior. You can add new options that give new behavior, but
there are 40 years' worth of scripts that rely on existing behavior, so
even if _you_ can't think of someone who wants short reads from a pipe,
someone else may already be wanting it, and even relying on it, because
it is standardized that way.
Maybe the best thing to do is work on having POSIX standardize the GNU
extension of iflag=fullblock.
>
> I now see the real problem: The POSIX document is not aware of pipes. It
> states that certain things should be certain ways and that if read()
> gets a short read, it should count it as a partial record and write it
> as such.
>
> And the dd authors have just obediently followed the letter of the POSIX
> not realizing it's for a context that does not include pipes.
POSIX is very much aware of pipes. POSIX was written long after dd was
written, and standardized existing practice. It's not POSIX that got it
wrong, but the original dd implementors. But they got it wrong so long
ago that people have come to rely on that behavior, and the only way to
get new behavior is to mandate a new option.
--
Eric Blake eblake <at> redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
This bug report was last modified 14 years and 99 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.