On 04/13/2011 12:28 PM, Jesse Gordon wrote:
> 
> 
> On 4/13/2011 7:07 AM, Eric Blake wrote:
>> On 04/12/2011 03:02 PM, Jesse Gordon wrote:
>>> I can't believe such an obvious bug would exist this long, but on the
>>> other hand the test is so simple I can't see where it's user error.
>> Thanks for the report.  And you are correct in surmising that it is user
>> error and not a bug in dd.
>>
>>> dd, when reading from stdin or from a named pipe sometimes (but not
>>> always) reads a random number of records a bit less than what it should.
>> Rather, dd reads as many bytes as possible, but unless that is less than
>> PIPE_BUF, it is not guaranteed to be an atomic read.  In turn, if you
>> have asked dd to pad out partial reads into complete writes, then that
>> explains your problem.  Unfortunately, it is rather easy to do this
>> without realizing it; the POSIX wording on how dd behaves is rather
>> detailed.
>>
> How have I asked dd to pad out partial read? I'm not specifying pad or
> sync or anything.

Sorry, I assumed there was a conv=sync in the mix; without that, there
is no padding (a partial read becomes a complete write with no padding).

> And why is reading from a pipe a partial read when there is neither EOF
> nor error?

Because the writer (yes) is getting ahead of the reader (dd), and is not
writing in the same block size as the reader.  For the sake of argument,
let's suppose that yes uses stdio, which buffers to 4096 bytes before it
calls write().  Then the kernel swaps over to dd, which does four reads
of 1000 bytes each, then another read() which only has 96 bytes
available immediately without swapping back to yes.  So the kernel gives
dd a short read.
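
To make that concrete, here is a small Python sketch (an illustration
only, not anything dd or yes actually runs).  It assumes the 4096-byte
write from the hypothetical above, and that the pipe can hold 4096
bytes without blocking, which is true on Linux.  The name
demo_short_read is made up for the example.

```python
import os

def demo_short_read():
    """Write 4096 bytes into a pipe, then read it back 1000 bytes at a time."""
    r, w = os.pipe()
    os.write(w, b"y\n" * 2048)      # one 4096-byte write, like a flushed stdio buffer
    sizes = []
    for _ in range(5):              # five read() calls of up to 1000 bytes each
        sizes.append(len(os.read(r, 1000)))
    os.close(w)                     # writer stays open until here: no EOF was involved
    os.close(r)
    return sizes

print(demo_short_read())
```

The fifth read() returns only the 96 bytes that were left in the pipe,
even though the write end is still open and nothing went wrong.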

> That reads in ibs bytes quite nicely from a pipe. It waits for all the
> data to fill into buffer, and only bails for the legitimate reasons --
> like EOF, or some real error.

Short reads are not an error, but are a real phenomenon when reading
from pipes.

> 
> If POSIX really requires dd to abort a read for any reason other than
> EOF or an error, then I'm dumbfounded. To me, it seems obvious that the
> rule should be "When asked to read from a pipe, don't quit till it's
> done or becomes impossible."

POSIX expects the following (and note carefully that bs=nnn is MUCH
different than ibs=nnn obs=nnn):

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/dd.html

If the bs= expr operand is specified and no conversions other than sync,
noerror, or notrunc are requested, the data returned from each input
block shall be written as a separate output block; if the read returns
less than a full block and the sync conversion is not specified, the
resulting output block shall be the same size as the input block. If the
bs= expr operand is not specified, or a conversion other than sync,
noerror, or notrunc is requested, the input shall be processed and
collected into full-sized output blocks until the end of the input is
reached.
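
A rough model of those two rules, as a Python sketch.  It only tracks
block sizes, ignores conv=sync and count=, and the function names and
the sample list of read sizes are made up for illustration.

```python
def output_blocks_bs(read_sizes):
    """bs= case: every read, short or not, is written as its own output block."""
    return list(read_sizes)

def output_blocks_ibs_obs(read_sizes, obs):
    """ibs=/obs= case: input is collected into obs-sized output blocks."""
    blocks = []
    buffered = 0
    for n in read_sizes:
        buffered += n
        while buffered >= obs:      # emit every full-sized block we can
            blocks.append(obs)
            buffered -= obs
    if buffered:
        blocks.append(buffered)     # leftover partial block at end of input
    return blocks

# Hypothetical sequence of read() return values, including short reads.
reads = [1000, 1000, 1000, 1000, 96, 1000, 904]
print(output_blocks_bs(reads))
print(output_blocks_ibs_obs(reads, 1000))
```

With bs the 96-byte and 904-byte short reads show up as short output
blocks; with ibs/obs the same 6000 bytes come out as six 1000-byte
blocks.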

> dd doesn't seem to abort early when reading hard drives, even if the
> block size isn't the same as the hard drive IO read size, or even if it
> has to wait a few ms for the drive to seek.

That's because for hard drives the kernel can fetch all of the
requested data itself.  It doesn't have to task swap over to another
process to get more bytes, so in practice read() blocks until the full
request is satisfied, right up until EOF.

> However, iflag=fullblock seems to fix it.

That's one fix.  But it's GNU-specific.  If you want the POSIX-compliant
fix, then use ibs/obs instead of bs.
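
Conceptually, reading a full block just means looping on read() until
the block is complete or EOF is hit.  The Python sketch below shows
that idea; it is not GNU dd's actual code, and read_full and
demo_full_blocks are hypothetical names.  The writer thread and its
sleep only simulate a writer that delivers data in uneven chunks.

```python
import os
import threading
import time

def read_full(fd, size):
    """Keep calling read() until `size` bytes arrive or EOF is reached."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:               # empty read means EOF
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def demo_full_blocks():
    r, w = os.pipe()

    def writer():
        os.write(w, b"x" * 4096)    # first chunk, not a multiple of 1000
        time.sleep(0.1)             # reader catches up and has to wait
        os.write(w, b"x" * 904)     # remainder, 5000 bytes in total
        os.close(w)

    t = threading.Thread(target=writer)
    t.start()
    sizes = []
    while True:
        block = read_full(r, 1000)
        if not block:
            break
        sizes.append(len(block))
    t.join()
    os.close(r)
    return sizes

print(demo_full_blocks())
```

Every block comes back as 1000 bytes regardless of how the writer's
chunks line up, because the loop waits for the rest instead of
returning the short read.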

>>
> I still cannot fathom why it would ever be acceptable to abort early
> when there's no error and no EOF and the pipe is still sending data.

> Is there actually EVER a real reason for dd to need to abort a read when
> there's no EOF and no error? Why would POSIX require this?

It's called 40 years of history.  That was the original way dd was
written, back when the default medium was _not_ disks, but tapes, and
tapes had variable size blocks.  It made sense for the default back
then.  And changing it now _WILL_ break existing scripts that have come
to rely on the standardized behavior, even if the standardized behavior
makes no sense if dd were being developed from scratch today.

> So the question is why is it reading partial records?

Because that's the way pipes behave.

> I really have a hard time believing that POSIX requires dd to abort a
> pipe read just because the data wasn't ready quick enough.

It is NOT aborting a pipe read; it is doing exactly what you told it:
writing as soon as read() returns, even if read() returned a short
count, because you specified bs.

> Can someone point me to where POSIX requires this current behavior of
> dd?

I just did.

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org