On 04/13/2011 12:28 PM, Jesse Gordon wrote:
> On 4/13/2011 7:07 AM, Eric Blake wrote:
>> On 04/12/2011 03:02 PM, Jesse Gordon wrote:
>>> I can't believe such an obvious bug would exist this long, but on the
>>> other hand the test is so simple I can't see where it's user error.
>> Thanks for the report.  And you are correct in surmising that it is user
>> error and not a bug in dd.
>>
>>> dd, when reading from stdin or from a named pipe sometimes (but not
>>> always) reads a random number of records a bit less than what it should.
>> Rather, dd reads as many bytes as possible, but unless that is less than
>> PIPE_BUF, it is not guaranteed to be an atomic read.  In turn, if you
>> have asked dd to pad out partial reads into complete writes, then that
>> explains your problem.  Unfortunately, it is rather easy to do this
>> without realizing it; the POSIX wording on how dd behaves is rather
>> detailed.
>>
> How have I asked dd to pad out partial read?  I'm not specifying pad or
> sync or anything.

Sorry, I assumed there was a conv=sync in the mix; without that, there
is no padding (a partial read is simply written out as an output block
of the same short size).

> And why is reading from a pipe a partial read when there is neither EOF
> nor error?

Because the writer (yes) is getting ahead of the reader (dd), and is not
writing in the same block size as the reader.  For the sake of argument,
let's suppose that yes uses stdio, which buffers to 4096 bytes before it
calls write().  Then the kernel swaps over to dd, which does four reads
of 1000 bytes each, then another read() which only has 96 bytes
available immediately without swapping back to yes.  So the kernel gives
dd a short read.

> That reads in ibs bytes quite nicely from a pipe.  It waits for all the
> data to fill into buffer, and only bails for the legitimate reasons --
> like EOF, or some real error.

Short reads are not an error, but are a real phenomenon when reading
from pipes.
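
To make the 4096/1000 arithmetic above concrete, here is a rough sketch
in Python.  It is only an illustration, not dd's actual source: the
function names are made up, and the 4096-byte writes stand in for the
assumed stdio buffering of yes.  The first reader issues one read() per
block and keeps whatever size comes back; the second keeps calling
read() until the block is full or EOF is reached.

```python
import os
import threading
import time


def single_reads(fd, block_size):
    """One read() per block; returns the size of each block received."""
    sizes = []
    while True:
        chunk = os.read(fd, block_size)
        if not chunk:  # an empty result means EOF
            return sizes
        sizes.append(len(chunk))


def full_block_reads(fd, block_size):
    """Retry read() until each block is full or EOF is reached."""
    sizes = []
    while True:
        got = 0
        while got < block_size:
            chunk = os.read(fd, block_size - got)
            if not chunk:  # EOF in the middle of (or before) a block
                break
            got += len(chunk)
        if got == 0:
            return sizes
        sizes.append(got)
        if got < block_size:  # last, partial block at EOF
            return sizes


def demo_short_read():
    """Writer delivers one 4096-byte chunk, then the reader drains it."""
    r, w = os.pipe()
    os.write(w, b"y\n" * 2048)  # 4096 bytes, like one flushed stdio buffer
    os.close(w)
    sizes = single_reads(r, 1000)
    os.close(r)
    return sizes


def demo_full_block():
    """Writer delivers two 4096-byte chunks with a pause between them."""
    r, w = os.pipe()

    def writer():
        for _ in range(2):
            os.write(w, b"y\n" * 2048)
            time.sleep(0.05)  # let the reader catch up and drain the pipe
        os.close(w)

    t = threading.Thread(target=writer)
    t.start()
    sizes = full_block_reads(r, 1000)
    t.join()
    os.close(r)
    return sizes


if __name__ == "__main__":
    print("one read() per block:", demo_short_read())
    print("retry until full:    ", demo_full_block())
```

In the first demo the fifth read() finds only 96 bytes in the pipe and
returns them, with no EOF and no error involved.  In the second, the
retry loop hides the gap between the writer's chunks, so every block is
full except the final one at EOF.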

> If POSIX really requires dd to abort a read for any reason other than
> EOF or an error, then I'm dumbfounded.  To me, it seems obvious that the
> rule should be "When asked to read from a pipe, don't quit till it's
> done or becomes impossible."

POSIX expects the following (and note carefully that bs=nnn is MUCH
different than ibs=nnn obs=nnn):

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/dd.html

    If the bs=expr operand is specified and no conversions other than
    sync, noerror, or notrunc are requested, the data returned from each
    input block shall be written as a separate output block; if the read
    returns less than a full block and the sync conversion is not
    specified, the resulting output block shall be the same size as the
    input block.  If the bs=expr operand is not specified, or a
    conversion other than sync, noerror, or notrunc is requested, the
    input shall be processed and collected into full-sized output blocks
    until the end of the input is reached.

> dd doesn't seem to abort early when reading hard drives, even if the
> block size isn't the same as the hard drive IO read size, or even if it
> has to wait a few ms for the drive to seek.

That's because hard drives, being physical devices, have all of their
data handy at once.  The kernel doesn't have to task swap over to
another process to get more bytes, but can proceed with the full read
request right up until EOF.

> However, iflag=fullblock seems to fix it.

That's one fix.  But it's GNU-specific.  If you want the POSIX-compliant
fix, then use ibs/obs instead of bs.

> I still cannot fathom why it would ever be acceptable to abort early
> when there's no error and no EOF and the pipe is still sending data.
> Is there actually EVER a real reason for dd to need to abort a read when
> there's no EOF and no error?  Why would POSIX require this?

It's called 40 years of history.
That was the original way dd was written, back when the default medium
was _not_ disks, but tapes, and tapes had variable size blocks.  It made
sense for the default back then.  And changing it now _WILL_ break
existing scripts that have come to rely on the standardized behavior,
even if the standardized behavior would make no sense were dd being
developed from scratch today.

> So the question is why is it reading partial records?

Because that's the way pipes behave.

> I really have a hard time believing that POSIX requires dd to abort a
> pipe read just because the data wasn't ready quick enough.

It is NOT aborting a pipe read; it is doing exactly what you told it,
and writing as soon as read() returns, even if read() returned fewer
bytes than requested, because you specified bs.

> Can someone point me to where POSIX requires this current behavior of
> dd?

I just did.

-- 
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org