From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 10 05:25:56 2010
Message-ID: <4CDA7261.4060800@aircomp.aero>
Date: Wed, 10 Nov 2010 11:22:25 +0100
From: Lucia Rotger
To: bug-coreutils@gnu.org
Subject: dd strangeness

I see this behavior in Solaris, Linux and BSD dd: if I send a big enough
file they all read it short at the end of the stream.

This works as expected:

# cat /dev/zero | dd bs=512 count=293601280 | wc

I get the expected results, dd reads exactly 293601280 blocks and wc
sees 150323855360 characters, 140 GB

Whereas substituting cat for zfs send doesn't:

# zfs send | dd bs=512 count=293601280 | wc

The output of one of the runs is

293590463+10817 records in
293590463+10817 records out

and the bytes counted by wc are < 140 GB. The zfs command sends 600 GB,
so obviously dd should not run short.

BSD and Linux dd were used on BSD and Linux machines, respectively,
piping the stream with nc.

Since this happens with three different implementations of dd I'm
thinking of a design flaw but I've never encountered it before. I'm
testing sdd (a dd replacement) and will see what happens, but it'll take
5 hours still.

There seems to be something going on in dd with different input and
output block sizes since both sdd and this
https://bugs.launchpad.net/ubuntu/+source/rsyslog/+bug/517773 hint at
it: "The dd process requires a ridiculous amount of CPU during startup,
though, since it is running with bs=1 to not miss stuff". But I don't
know if that's what's happening here.
According to man dd, bs sets ibs and obs. bs=512 is the last attempt I
made but I've tried combinations of the bs and count parameters (always
to make a size of 140 GB) to no avail, nothing seems to work with a big
stream. I still haven't tried bs=1 as I think it would take weeks to go
through but maybe I'm wrong.

If I try with smaller files, up to hundreds of MBs dd works fine, but I
can't tell at what size it breaks or under which circumstances or why.

From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 10 10:25:06 2010
Message-ID: <4CDABA65.9030808@draigBrady.com>
Date: Wed, 10 Nov 2010 15:29:41 +0000
From: Pádraig Brady
To: Lucia Rotger
Cc: 7362-done@debbugs.gnu.org
Subject: Re: bug#7362: dd strangeness
In-Reply-To: <4CDA7261.4060800@aircomp.aero>

On 10/11/10 10:22, Lucia Rotger wrote:
> I see this behavior in Solaris, Linux and BSD dd: if I send a big enough
> file they all read it short at the end of the stream.
>
> This works as expected:
>
> # cat /dev/zero | dd bs=512 count=293601280 | wc
>
> I get the expected results, dd reads exactly 293601280 blocks and wc
> sees 150323855360 characters, 140 GB
>
> Whereas substituting cat for zfs send doesn't:
>
> # zfs send | dd bs=512 count=293601280 | wc

different write sizes to the pipe mean
in the latter case, dd will get short reads.
IMHO dd is doing the wrong/most surprising thing here,
but it can't be changed for compatibility reasons.
You can get coreutils dd to do what you want with:

dd iflag=fullblock

cheers,
Pádraig.
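
The short-read behaviour described in the reply above can be reproduced with a small pipeline. The following is only a minimal sketch, assuming GNU coreutils dd (7.0 or later, so that iflag=fullblock exists); the producer function and the variable names are made up for illustration and do not come from the thread.

```shell
# Hypothetical producer: two small writes separated by a pause, so the
# reader's first read() returns before the second chunk has been written.
producer() { echo 1; sleep 1; echo 2; }

# count=1 without fullblock: dd issues one read(), receives only the
# 2 bytes available at that moment, counts them as a partial block, stops.
short=$(producer | dd bs=512 count=1 2>/dev/null | wc -c)

# count=1 with iflag=fullblock (assumes GNU dd): dd keeps calling read()
# until it has 512 bytes or hits end of input, so both chunks get through.
full=$(producer | dd bs=512 count=1 iflag=fullblock 2>/dev/null | wc -c)

echo "without fullblock: $short bytes"
echo "with fullblock:    $full bytes"
```

With GNU dd this should report 2 bytes for the first pipeline and 4 for the second, which is the same effect as the "+10817" partial records in the original report: each partial block still uses up one unit of count.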
From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 10 11:02:42 2010
Message-ID: <4CDAC336.6030105@draigBrady.com>
Date: Wed, 10 Nov 2010 16:07:18 +0000
From: Pádraig Brady
To: 7362@debbugs.gnu.org
Subject: Re: bug#7362: dd strangeness
In-Reply-To: <4CDABA65.9030808@draigBrady.com>

On 10/11/10 15:29, Pádraig Brady wrote:
> On 10/11/10 10:22, Lucia Rotger wrote:
>> I see this behavior in Solaris, Linux and BSD dd: if I send a big enough
>> file they all read it short at the end of the stream.
>>
>> This works as expected:
>>
>> # cat /dev/zero | dd bs=512 count=293601280 | wc
>>
>> I get the expected results, dd reads exactly 293601280 blocks and wc
>> sees 150323855360 characters, 140 GB
>>
>> Whereas substituting cat for zfs send doesn't:
>>
>> # zfs send | dd bs=512 count=293601280 | wc
>
> different write sizes to the pipe mean
> in the latter case, dd will get short reads.
> IMHO dd is doing the wrong/most surprising thing here,
> but it can't be changed for compatibility reasons.
> You can get coreutils dd to do what you want with:
>
> dd iflag=fullblock

BTW here are my notes on some possible changes in this area:

`dd conv=sync` seems to do the wrong thing with pipes.
I.E. it pads out short reads. Why would one ever want that?

Should sync for pipes imply fullblock?

Should sync for pipes without fullblock give a warning?

Should specifying a count (and bs I suppose) without fullblock
when reading from pipes give a warning?

cheers,
Pádraig

From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 10 12:46:21 2010
Message-ID: <4CDADAC4.7020708@aircomp.aero>
Date: Wed, 10 Nov 2010 18:47:48 +0100
From: Lucia Rotger
To: Pádraig Brady
Subject: Re: bug#7362: dd strangeness
In-Reply-To: <4CDABA65.9030808@draigBrady.com>
Cc: 7362-done@debbugs.gnu.org

Thanks, I'll try it, but it's not in the info or man pages.

Also, the section under FLAGS is badly formatted with the different
options all in one line. You might want to fix that.

Lucia

On 10/11/2010 16:29, Pádraig Brady wrote:
> On 10/11/10 10:22, Lucia Rotger wrote:
>> I see this behavior in Solaris, Linux and BSD dd: if I send a big enough
>> file they all read it short at the end of the stream.
>>
>> This works as expected:
>>
>> # cat /dev/zero | dd bs=512 count=293601280 | wc
>>
>> I get the expected results, dd reads exactly 293601280 blocks and wc
>> sees 150323855360 characters, 140 GB
>>
>> Whereas substituting cat for zfs send doesn't:
>>
>> # zfs send | dd bs=512 count=293601280 | wc
>
> different write sizes to the pipe mean
> in the latter case, dd will get short reads.
> IMHO dd is doing the wrong/most surprising thing here,
> but it can't be changed for compatibility reasons.
> You can get coreutils dd to do what you want with:
>
> dd iflag=fullblock
>
> cheers,
> Pádraig.
>

From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 10 13:52:06 2010
Message-ID: <4CDAEAEF.7070604@cs.ucla.edu>
Date: Wed, 10 Nov 2010 10:56:47 -0800
From: Paul Eggert
Organization: UCLA Computer Science Department
To: Pádraig Brady
Cc: 7362@debbugs.gnu.org
Subject: Re: bug#7362: dd strangeness
In-Reply-To: <4CDAC336.6030105@draigBrady.com>

On 11/10/10 08:07, Pádraig Brady wrote:

> BTW here are my notes on some possible changes in this area:
>
> `dd conv=sync` seems to do the wrong thing with pipes.
> I.E. it pads out short reads. Why would one ever want that?

Nobody ever wants it but (as you note) it's required by the
standard.

> Should sync for pipes imply fullblock?

I think that would be a good idea, but it should be done only
if POSIXLY_CORRECT is not set, since it contradicts the standard.

> Should sync for pipes without fullblock give a warning?

I would say "no". If POSIXLY_CORRECT, the warning would arguably
violate the standard. If not, there's no need for a warning.

> Should specifying a count (and bs I suppose) without fullblock
> when reading from pipes give a warning?

No, for the same reason.

Do you have time to implement this change? If not, I suppose
I could look into it. It shouldn't be that hard. The hardest
part would be the documentation.

Regardless of whether we make such a change, we should document
the issue, as this is a common problem.

From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 10 17:19:09 2010
Message-ID: <4CDB1B71.4020807@draigBrady.com>
Date: Wed, 10 Nov 2010 22:23:45 +0000
From: Pádraig Brady
To: Lucia Rotger
Cc: 7362-done@debbugs.gnu.org
Subject: Re: bug#7362: dd strangeness
In-Reply-To: <4CDADAC4.7020708@aircomp.aero>

On 10/11/10 17:47, Lucia Rotger wrote:
> Thanks, I'll try it, but it's not in the info or man pages.
>
> Also, the section under FLAGS is badly formatted with the different
> options all in one line. You might want to fix that.

I think all that was fixed up.
`dd iflag=fullblock` was implemented in coreutils-7.0

cheers,
Pádraig.

From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 10 20:14:26 2010
Message-ID: <4CDB4485.2020607@draigBrady.com>
Date: Thu, 11 Nov 2010 01:19:01 +0000
From: Pádraig Brady
To: Paul Eggert
Cc: 7362@debbugs.gnu.org
Subject: Re: bug#7362: dd strangeness
In-Reply-To: <4CDAEAEF.7070604@cs.ucla.edu>

On 10/11/10 18:56, Paul Eggert wrote:
> On 11/10/10 08:07, Pádraig Brady wrote:
>
>> BTW here are my notes on some possible changes in this area:
>>
>> `dd conv=sync` seems to do the wrong thing with pipes.
>> I.E. it pads out short reads. Why would one ever want that?
>
> Nobody ever wants it but (as you note) it's required by the
> standard.
>
>> Should sync for pipes imply fullblock?
>
> I think that would be a good idea, but it should be done only
> if POSIXLY_CORRECT is not set, since it contradicts the standard.
>
>> Should sync for pipes without fullblock give a warning?
>
> I would say "no". If POSIXLY_CORRECT, the warning would arguably
> violate the standard. If not, there's no need for a warning.

Agreed on all the above.

>
>> Should specifying a count (and bs I suppose) without fullblock
>> when reading from pipes give a warning?
>
> No, for the same reason.

Or maybe give the warning when POSIXLY_CORRECT was not set?
>
> Do you have time to implement this change? If not, I suppose
> I could look into it. It shouldn't be that hard. The hardest
> part would be the documentation.
>
> Regardless of whether we make such a change, we should document
> the issue, as this is a common problem.

I'll look into it at some stage but not soon.
Feel free to do it if you like.

cheers,
Pádraig.

From debbugs-submit-bounces@debbugs.gnu.org Thu Nov 11 00:56:24 2010
Message-ID: <4CDB86A1.7030008@cs.ucla.edu>
Date: Wed, 10 Nov 2010 22:01:05 -0800
From: Paul Eggert
Organization: UCLA Computer Science Department
To: Pádraig Brady
Cc: 7362@debbugs.gnu.org
Subject: Re: bug#7362: dd strangeness
In-Reply-To: <4CDB4485.2020607@draigBrady.com>

On 11/10/2010 05:19 PM, Pádraig Brady wrote:
>> >
>>> >> Should specifying a count (and bs I suppose) without fullblock
>>> >> when reading from pipes give a warning?
>> >
>> > No, for the same reason.
> Or maybe give the warning when POSIXLY_CORRECT was not set?

I wouldn't bother. Let's just have dd do the right thing quietly.

I am a bit tied down too doing other things, but will send you email if I start
working on it.

From debbugs-submit-bounces@debbugs.gnu.org Thu Nov 11 04:28:14 2010
Message-ID: <4CDBB840.400@draigBrady.com>
Date: Thu, 11 Nov 2010 09:32:48 +0000
From: Pádraig Brady
To: Paul Eggert
Cc: 7362@debbugs.gnu.org
Subject: Re: bug#7362: dd strangeness
In-Reply-To: <4CDB86A1.7030008@cs.ucla.edu>

On 11/11/10 06:01, Paul Eggert wrote:
> On 11/10/2010 05:19 PM, Pádraig Brady wrote:
>>>>
>>>>>> Should specifying a count (and bs I suppose) without fullblock
>>>>>> when reading from pipes give a warning?
>>>>
>>>> No, for the same reason.
>> Or maybe give the warning when POSIXLY_CORRECT was not set?
>
> I wouldn't bother. Let's just have dd do the right thing quietly.

Oh right, do the auto fullblock for that case also.
Agreed.

> I am a bit tied down too doing other things, but will send you email if I start
> working on it.

cheers,
Pádraig.
From debbugs-submit-bounces@debbugs.gnu.org Fri Nov 12 09:51:47 2010
Message-ID: <4CDD54DF.4030602@aircomp.aero>
Date: Fri, 12 Nov 2010 15:53:19 +0100
From: Lucia Rotger
To: Pádraig Brady
Cc: 7362-done@debbugs.gnu.org
Subject: Re: bug#7362: dd strangeness
In-Reply-To: <4CDABA65.9030808@draigBrady.com>

On 10/11/2010 16:29, Pádraig Brady wrote:
> On 10/11/10 10:22, Lucia Rotger wrote:
>> I see this behavior in Solaris, Linux and BSD dd: if I send a big enough
>> file they all read it short at the end of the stream.
>>
>> This works as expected:
>>
>> # cat /dev/zero | dd bs=512 count=293601280 | wc
>>
>> I get the expected results, dd reads exactly 293601280 blocks and wc
>> sees 150323855360 characters, 140 GB
>>
>> Whereas substituting cat for zfs send doesn't:
>>
>> # zfs send | dd bs=512 count=293601280 | wc
>
> different write sizes to the pipe mean
> in the latter case, dd will get short reads.
> IMHO dd is doing the wrong/most surprising thing here,
> but it can't be changed for compatibility reasons.
> You can get coreutils dd to do what you want with:
>
> dd iflag=fullblock

Thanks for the explanation. Now, if I may abuse your time a bit more,
how exactly is a block size determined when reading from a pipe? is it
the size returned by each read()? And why does dd stop short at the last
block?

Sorry about those problems in the man page being already fixed. I
checked and I have dd version 6.1, these Ubuntu guys don't update very
well.

I have somewhat solved it in a more portable way doing

# zfs send | dd obs=512 | dd bs=512 count=293601280 | wc

I guess as long as no writes from zfs send are smaller than 512 bytes.
It works nicely now.

I suppose adding ibs=1 to the first dd would work with everything but it
would probably take ages to finish.

Thanks

From debbugs-submit-bounces@debbugs.gnu.org Fri Nov 12 10:15:37 2010
Message-ID: <4CDD5B27.9040201@draigBrady.com>
Date: Fri, 12 Nov 2010 15:20:07 +0000
From: Pádraig Brady
To: Lucia Rotger
Cc: 7362-done@debbugs.gnu.org
Subject: Re: bug#7362: dd strangeness
In-Reply-To: <4CDD54DF.4030602@aircomp.aero>

On 12/11/10 14:53, Lucia Rotger wrote:
> On 10/11/2010 16:29, Pádraig Brady wrote:
>> On 10/11/10 10:22, Lucia Rotger wrote:
>>> I see this behavior in Solaris, Linux and BSD dd: if I send a big enough
>>> file they all read it short at the end of the stream.
>>>
>>> This works as expected:
>>>
>>> # cat /dev/zero | dd bs=512 count=293601280 | wc
>>>
>>> I get the expected results, dd reads exactly 293601280 blocks and wc
>>> sees 150323855360 characters, 140 GB
>>>
>>> Whereas substituting cat for zfs send doesn't:
>>>
>>> # zfs send | dd bs=512 count=293601280 | wc
>>
>> different write sizes to the pipe mean
>> in the latter case, dd will get short reads.
>> IMHO dd is doing the wrong/most surprising thing here,
>> but it can't be changed for compatibility reasons.
>> You can get coreutils dd to do what you want with:
>>
>> dd iflag=fullblock
>
> Thanks for the explanation.
> Now, if I may abuse your time a bit more,
> how exactly is a block size determined when reading from a pipe? is it
> the size returned by each read()? And why does dd stop short at the last
> block?
>
> Sorry about those problems in the man page being already fixed. I
> checked and I have dd version 6.1, these Ubuntu guys don't update very
> well.
>
> I have somewhat solved it in a more portable way doing
>
> # zfs send | dd obs=512 | dd bs=512 count=293601280 | wc

The above works because the middle dd coalesces the short reads,
and so writes 512 bytes at a time, and so the next dd
will not have short reads. This will not generally work
as you increase the size.

To illustrate what the middle dd is doing:

$ (echo 1; sleep 1; echo 2) | strace -e read,write dd obs=512
read(0, "1\n", 512) = 2
read(0, "2\n", 512) = 2
read(0, "", 512) = 0
write(1, "1\n2\n", 4) = 4

> I suppose adding ibs=1 to the first dd would work with everything but it
> would probably take ages to finish.

right.

The crux of your problem is specifying a count.
If you didn't the short reads wouldn't matter.
If you do need to specify a count, then the
iflag=fullblock available since 7.0 will work.

cheers,
Pádraig.

From unknown Sat Sep 13 23:21:40 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Sat, 11 Dec 2010 12:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks # This fakemail brought to you by your local debbugs # administrator From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 02 08:07:54 2011 Received: (at control) by debbugs.gnu.org; 2 Feb 2011 13:07:55 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PkcR8-00049R-P8 for submit@debbugs.gnu.org; Wed, 02 Feb 2011 08:07:54 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1PkcR7-00049F-3d for control@debbugs.gnu.org; Wed, 02 Feb 2011 08:07:53 -0500 Received: (qmail 28466 invoked from network); 2 Feb 2011 13:16:17 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 2 Feb 2011 13:16:17 -0000 Message-ID: <4D49589E.4020703@draigBrady.com> Date: Wed, 02 Feb 2011 13:14:06 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: control@debbugs.gnu.org X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Score: 0.8 (/) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: 0.8 (/) unarchive 7362 From debbugs-submit-bounces@debbugs.gnu.org Wed Feb 02 08:17:34 2011 Received: (at 7362) by debbugs.gnu.org; 2 Feb 2011 13:17:34 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PkcaT-00057d-V0 for submit@debbugs.gnu.org; Wed, 02 Feb 2011 08:17:34 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 
1PkcaS-00057R-E2 for 7362@debbugs.gnu.org; Wed, 02 Feb 2011 08:17:32 -0500 Received: (qmail 30449 invoked from network); 2 Feb 2011 13:25:57 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 2 Feb 2011 13:25:57 -0000 Message-ID: <4D495AE2.8080404@draigBrady.com> Date: Wed, 02 Feb 2011 13:23:46 +0000 From: =?ISO-8859-1?Q?P=E1draig_Brady?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: 7362@debbugs.gnu.org Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> In-Reply-To: <4CDAC336.6030105@draigBrady.com> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) On 10/11/10 16:07, Pádraig Brady wrote: > On 10/11/10 15:29, Pádraig Brady wrote: >> On 10/11/10 10:22, Lucia Rotger wrote: >>> I see this behavior in Solaris, Linux and BSD dd: if I send a big enough >>> file they all read it short at the end of the stream. >>> >>> This works as expected: >>> >>> # cat /dev/zero | dd bs=512 count=293601280 | wc >>> >>> I get the expected results, dd reads exactly 293601280 blocks and wc >>> sees 150323855360 characters, 140 GB >>> >>> Whereas substituting cat for zfs send doesn't: >>> >>> # zfs send | dd bs=512 count=293601280 | wc >> >> different write sizes to the pipe mean >> in the later case, dd will get short reads. >> IMHO dd is doing the wrong/most surprising thing here, >> but it can't be changed for compatibility reasons. 
>> You can get coreutils dd to do what you want with: >> >> dd iflag=fullblock > > BTW here are my notes on some possible changes in this area: > > `dd conv=sync` seems to do the wrong thing with pipes. > I.E. it pads out short reads. Why would one ever want that? > > Should sync for pipes imply fullblock? > > Should sync for pipes without fullblock give a warning? > > Should specifying a count (and bs I suppose) without fullblock > when reading from pipes give a warning? This looks like another candidate to auto enable fullblock for. https://bugzilla.redhat.com/show_bug.cgi?id=614605 I.E. oflag=direct cheers, Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Fri Feb 25 07:19:52 2011 Received: (at 7362) by debbugs.gnu.org; 25 Feb 2011 12:19:54 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PsweG-0007sZ-G9 for submit@debbugs.gnu.org; Fri, 25 Feb 2011 07:19:52 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1PsweE-0007sN-Es for 7362@debbugs.gnu.org; Fri, 25 Feb 2011 07:19:51 -0500 Received: (qmail 45645 invoked from network); 25 Feb 2011 12:19:44 -0000 Received: from unknown (HELO ?192.168.2.25?) 
(84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 25 Feb 2011 12:19:44 -0000 Message-ID: <4D679E57.8010104@draigBrady.com> Date: Fri, 25 Feb 2011 12:19:35 +0000 From: =?ISO-8859-1?Q?P=E1draig_Brady?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: 7362@debbugs.gnu.org Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> In-Reply-To: <4D495AE2.8080404@draigBrady.com> X-Enigmail-Version: 1.0.1 Content-Type: multipart/mixed; boundary="------------000701050105020802060600" X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) This is a multi-part message in MIME format. --------------000701050105020802060600 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit On 02/02/11 13:23, Pádraig Brady wrote: > This looks like another candidate to auto enable fullblock for. > https://bugzilla.redhat.com/show_bug.cgi?id=614605 > I.E. oflag=direct Attached is a proposed solution to this. I'm worried about the last condition though where we enable 'fullblock' when both count and bs are specified. For example this would still work: # Output first 2 parts $ (echo part1; sleep 1; echo part2; sleep 1; echo discard) | dd count=2 obs=1 2>/dev/null part1 part2 However this would not: # Output first 2 parts, each being up to 4096 bytes $ (echo part1; sleep 1; echo part2; sleep 1; echo discard) | dd count=2 ibs=4096 obs=1 2>/dev/null part1 part2 discard So how contrived is the last example, given how brittle such a construct is? cheers, Pádraig. 
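The difference between counting short reads and counting full blocks can be reproduced with a small sketch. It assumes GNU dd 7.0 or later (for iflag=fullblock) and a writer slow enough that each chunk arrives as its own read; the helper name slow_writer and the variable names are made up for illustration. Here fullblock is requested explicitly rather than inferred as in the proposed patch.

```shell
# Hypothetical helper: three writes separated by pauses, so a reader on
# the pipe will most likely see each one as a separate short read.
slow_writer() {
  printf 'part1\n'; sleep 1
  printf 'part2\n'; sleep 1
  printf 'discard\n'
}

# count=2 counts reads, not filled 4096-byte blocks, so two short reads
# satisfy it and the third chunk is never copied.
without_fullblock=$(slow_writer | dd count=2 ibs=4096 obs=1 2>/dev/null)

# With iflag=fullblock (GNU dd), dd keeps reading until an input block is
# full or EOF is reached; all 20 bytes fit in the first block, so
# everything is copied.
with_fullblock=$(slow_writer | dd count=2 ibs=4096 obs=1 iflag=fullblock 2>/dev/null)

printf '%s\n---\n%s\n' "$without_fullblock" "$with_fullblock"
```

On a typical system the first variable should hold only part1 and part2, while the second also contains discard, which is the same change in output that the second example above would see if fullblock were enabled automatically.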
--------------000701050105020802060600 Content-Type: text/x-patch; name="dd-fullblock.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="dd-fullblock.diff" diff --git a/doc/coreutils.texi b/doc/coreutils.texi index ea35afe..9167537 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -8089,7 +8089,7 @@ Do not truncate the output file. @opindex sync @r{(padding with @acronym{ASCII} @sc{nul}s)} Pad every input block to size of @samp{ibs} with trailing zero bytes. When used with @samp{block} or @samp{unblock}, pad with spaces instead of -zero bytes. +zero bytes. This implies the @samp{fullblock} flag. @item fdatasync @opindex fdatasync @@ -8135,8 +8135,8 @@ output file to be truncated before being appended to. @cindex concurrent I/O Use concurrent I/O mode for data. This mode performs direct I/O and drops the @acronym{POSIX} requirement to serialize all I/O to the same file. -A file cannot be opened in CIO mode and with a standard open at the -same time. +A file cannot be opened in CIO mode and with a standard open at the same time. +This implies the @samp{fullblock} flag. @item direct @opindex direct @@ -8146,6 +8146,7 @@ Note that the kernel may impose restrictions on read or write buffer sizes. For example, with an ext4 destination file system and a linux-based kernel, using @samp{oflag=direct} will cause writes to fail with @code{EINVAL} if the output buffer size is not a multiple of 512. +This implies the @samp{fullblock} flag. @item directory @opindex directory diff --git a/src/dd.c b/src/dd.c index daddc1e..5b56970 100644 --- a/src/dd.c +++ b/src/dd.c @@ -1075,6 +1075,19 @@ scanargs (int argc, char *const *argv) conversions_mask |= C_TWOBUFS; } + /* Enable 'fullblock' as one wouldn't want random + padding applied, when reading from a pipe for example. 
*/ + if (conversions_mask & C_SYNC) + input_flags |= O_FULLBLOCK; + /* Enable 'fullblock' with 'direct' or 'cio' as again if reading from + a pipe, we're constrained in how we write to output. */ + else if ((input_flags | output_flags) & (O_DIRECT | O_CIO)) + input_flags |= O_FULLBLOCK; + /* Enable 'fullblock' if we're reading a specific number of blocks, + with a specific block size. */ + else if (max_records && max_records != (uintmax_t) -1 && input_blocksize) + input_flags |= O_FULLBLOCK; + if (input_blocksize == 0) input_blocksize = DEFAULT_BLOCKSIZE; if (output_blocksize == 0) --------------000701050105020802060600-- From debbugs-submit-bounces@debbugs.gnu.org Fri Feb 25 12:29:40 2011 Received: (at 7362) by debbugs.gnu.org; 25 Feb 2011 17:29:41 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Pt1U3-00074c-P4 for submit@debbugs.gnu.org; Fri, 25 Feb 2011 12:29:40 -0500 Received: from smtp.cs.ucla.edu ([131.179.128.62]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Pt1U1-00074P-3k for 7362@debbugs.gnu.org; Fri, 25 Feb 2011 12:29:38 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 2A4C339E80FF; Fri, 25 Feb 2011 09:29:31 -0800 (PST) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Ttw1CUiKhafB; Fri, 25 Feb 2011 09:29:30 -0800 (PST) Received: from [192.168.1.10] (pool-71-189-109-235.lsanca.fios.verizon.net [71.189.109.235]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id AE70039E80F7; Fri, 25 Feb 2011 09:29:30 -0800 (PST) Message-ID: <4D67E6F4.9020308@cs.ucla.edu> Date: Fri, 25 Feb 2011 09:29:24 -0800 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.13) Gecko/20101208 Thunderbird/3.1.7 MIME-Version: 1.0 To: 
=?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> In-Reply-To: <4D679E57.8010104@draigBrady.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.9 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.9 (--) On 02/25/2011 04:19 AM, Pádraig Brady wrote: > Attached is a proposed solution to this. My kneejerk reaction is that it tries to do too much inferring, and ends up being more complicated than giving the user more control. If we're going to change the default to be not compatible with POSIX, we need to give the user a way to get the POSIX behavior, something that's less subtle than POSIXLY_CORRECT. I suggest that we add a new option that is the inverse of "fullblock". We can call it "partblock", say. Then, we can say that the default is "fullblock" normally, but it is "partblock" if POSIXLY_CORRECT and if bs= is given and if no conversions other than sync, noerror, and notrunc are given. Anyway, I'm just thinking out loud to some extent, and further comments are welcome.
From debbugs-submit-bounces@debbugs.gnu.org Mon Feb 28 04:41:44 2011 Received: (at 7362) by debbugs.gnu.org; 28 Feb 2011 09:41:44 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Ptzbs-0004kp-Az for submit@debbugs.gnu.org; Mon, 28 Feb 2011 04:41:44 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1Ptzbp-0004ka-Hq for 7362@debbugs.gnu.org; Mon, 28 Feb 2011 04:41:42 -0500 Received: (qmail 15230 invoked from network); 28 Feb 2011 09:41:35 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 28 Feb 2011 09:41:35 -0000 Message-ID: <4D6B6DB7.2060707@draigBrady.com> Date: Mon, 28 Feb 2011 09:41:11 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> In-Reply-To: <4D67E6F4.9020308@cs.ucla.edu> X-Enigmail-Version: 1.0.1 Content-Type: multipart/mixed; boundary="------------010101090209070604040005" X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) This is a multi-part message in MIME format. 
--------------010101090209070604040005 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On 25/02/11 17:29, Paul Eggert wrote: > On 02/25/2011 04:19 AM, Pádraig Brady wrote: > >> Attached is a proposed solution to this. > > My kneejerk reaction is that it tries to do too much inferring, > and ends up being more complicated than giving the user more control. > > If we're going to change the default to be not compatible with POSIX, > we need to give the user a way to get the POSIX behavior, something > that's less subtle than POSIXLY_CORRECT. I suggest that we add > a new option that is the inverse of "fullblock". We can call it > "partblock", say. > > Then, we can say that the default is "fullblock" normally, but it is > "partblock" if POSIXLY_CORRECT and if bs= is given and if no conversions > other than sync, noerror, and notrunk are given. > > Anyway, I'm just thinking out loud to some extent, and further > comments are welcome. Hmm, it's better to be explicit but I think defaulting to "fullblock" is too risky. As an interim step at least, how about just warning as per the attached. cheers, Pádraig. --------------010101090209070604040005 Content-Type: text/x-patch; name="dd-fullblock.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="dd-fullblock.diff" >From 987438bfe5f7a38b7736f0aaca9d8377b4281408 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= Date: Fri, 25 Feb 2011 12:27:25 +0000 Subject: [PATCH] dd: suggest iflag=fullblock where appropriate * NEWS: Mention the change in behavior. * doc/coreutils.texi: Document when iflag=fullblock is auto suggested. src/dd.c (scan_args): Suggest O_FULLBLOCK when probably needed. 
--- NEWS | 6 ++++++ doc/coreutils.texi | 3 +++ src/dd.c | 32 ++++++++++++++++++++++++++++++++ 3 files changed, 41 insertions(+), 0 deletions(-) diff --git a/NEWS b/NEWS index a367d8d..e82bc02 100644 --- a/NEWS +++ b/NEWS @@ -8,6 +8,12 @@ GNU coreutils NEWS -*- outline -*- delimiter and an unbounded range like "-f1234567890-". [bug introduced in coreutils-5.3.0] +** Changes in behavior + + dd now suggests using iflag=fullblock with oflag=direct or conv=sync + where short reads can have adverse effects. This option is also + suggested if the user indicates a specific amount of data to read. + * Noteworthy changes in release 8.10 (2011-02-04) [stable] diff --git a/doc/coreutils.texi b/doc/coreutils.texi index ea35afe..da8e80b 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -8216,6 +8216,9 @@ Accumulate full blocks from input. The @code{read} system call may return early if a full block is not available. When that happens, continue calling @code{read} to fill the remainder of the block. +Using this flag is suggested when @samp{cio}, @samp{direct} +or @samp{conv=sync} are used, or when a specific amount of input +data is specified. This flag can be used only with @code{iflag}. @end table diff --git a/src/dd.c b/src/dd.c index daddc1e..1a0e177 100644 --- a/src/dd.c +++ b/src/dd.c @@ -1075,6 +1075,38 @@ scanargs (int argc, char *const *argv) conversions_mask |= C_TWOBUFS; } + if (!(input_flags & O_FULLBLOCK) && !getenv ("POSIXLY_CORRECT")) + { + bool fb_warned = false; + /* Suggest 'fullblock' as one wouldn't want random + padding applied, when reading from a pipe for example. */ + if (conversions_mask & C_SYNC) + { + error (0, 0, _("warning: 'iflag=fullblock' is suggested when " + "padding input blocks")); + fb_warned = true; + } + /* Suggest 'fullblock' with 'direct' or 'cio' as again if reading from + a pipe, we're constrained in how we write to output.
*/ + if ((input_flags | output_flags) & (O_DIRECT | O_CIO)) + { + error (0, 0, _("warning: 'iflag=fullblock' is suggested when " + "writing with direct I/O constraints")); + fb_warned = true; + } + /* Suggest 'fullblock' if we're reading a specific number of blocks, + with a specific block size. */ + if (max_records && max_records != (uintmax_t) -1 && input_blocksize) + { + error (0, 0, _("warning: 'iflag=fullblock' is suggested when " + "reading a specific amount")); + fb_warned = true; + } + + if (fb_warned) + error (0, 0, _("Set POSIXLY_CORRECT to disable this warning")); + } + if (input_blocksize == 0) input_blocksize = DEFAULT_BLOCKSIZE; if (output_blocksize == 0) -- 1.7.4 --------------010101090209070604040005-- From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 01 04:50:18 2011 Received: (at 7362) by debbugs.gnu.org; 1 Mar 2011 09:50:18 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuMDh-0003Au-VZ for submit@debbugs.gnu.org; Tue, 01 Mar 2011 04:50:18 -0500 Received: from smtp.cs.ucla.edu ([131.179.128.62]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuMDg-0003Ai-2D for 7362@debbugs.gnu.org; Tue, 01 Mar 2011 04:50:17 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 1154D39E80DC; Tue, 1 Mar 2011 01:50:10 -0800 (PST) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id dBAY51biba9x; Tue, 1 Mar 2011 01:50:08 -0800 (PST) Received: from [192.168.1.10] (pool-71-189-109-235.lsanca.fios.verizon.net [71.189.109.235]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id C570B39E8083; Tue, 1 Mar 2011 01:50:08 -0800 (PST) Message-ID: <4D6CC150.4000900@cs.ucla.edu> Date: Tue, 01 Mar 2011 01:50:08 -0800 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 
(X11; U; Linux i686; en-US; rv:1.9.2.13) Gecko/20101208 Thunderbird/3.1.7 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> In-Reply-To: <4D6B6DB7.2060707@draigBrady.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.9 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.9 (--) On 02/28/2011 01:41 AM, Pádraig Brady wrote: > Hmm, it's better to be explicit but I think defaulting to "fullblock" > is too risky. As an interim step at least, how about just warning > as per the attached. Ouch. This is a pain to think about. But here are some thoughts anyway: * I went back and reread POSIX, and "dd" is allowed to issue diagnostics to stderr whenever it likes. So we don't need to worry about POSIXLY_CORRECT if all we want to do is issue diagnostics. We can issue them regardless of POSIXLY_CORRECT. * I don't understand the business with C_SYNC. People who use conv=sync know what they're doing, or ought to; there's little point giving them a warning. * For (O_DIRECT | O_CIO), surely this matters only for output_flags. If these bits are set in input_flags then O_FULLBLOCK is irrelevant, no? * If we care about max_records we should also care about skip_records, since short reads matter when skipping in a pipe, too. * Since POSIX doesn't specify the direct or cio flags, we're free to have them silently enable iflag=fullblock.
But it doesn't sound right to do that. Instead, we should set conversions_mask |= C_TWOBUFS, because the input and output blocksizes might differ. * If we suggest ibs=whatever rather than iflag=fullblock, our suggestions will be portable to other POSIX implementations, which is a plus. * Rather than warn about potential problems, how about diagnosing the problems only when they actually occur? That would help us avoid crying wolf. Here's a proposed patch that tries to embody all the above, except that I haven't done the documentation (I figure we should get the behavior right first....): >From 85e3716f918e8163695a85d40fe8c561634c9e2e Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Tue, 1 Mar 2011 01:34:57 -0800 Subject: [PATCH] dd: avoid or diagnose some problems with short reads MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * src/dd.c (iread): Diagnose short reads when they mess up counts. (scanargs): If oflag=direct or oflag=cio, use C_TWOBUFS so that the output blocks are typically full. Derived from a suggestion by Pádraig Brady in: http://lists.gnu.org/archive/html/bug-coreutils/2011-02/msg00150.html --- src/dd.c | 27 ++++++++++++++++++++++++++- 1 files changed, 26 insertions(+), 1 deletions(-) diff --git a/src/dd.c b/src/dd.c index daddc1e..41ad7a3 100644 --- a/src/dd.c +++ b/src/dd.c @@ -802,7 +802,29 @@ iread (int fd, char *buf, size_t size) process_signals (); nread = read (fd, buf, size); if (! (nread < 0 && errno == EINTR)) - return nread; + { + static ssize_t prev_nread; + static bool warned; + + if (nread != 0 && 0 < prev_nread && prev_nread < size + && iread_fnc == iread + && ! (conversions_mask & C_TWOBUFS) + && (skip_records + || (0 < max_records && max_records < (uintmax_t) -1)) + && !
warned) + { + unsigned long int prev = prev_nread; + unsigned long int ibs = input_blocksize; + error (0, 0, _("warning: short read (%lu bytes)"), prev); + error (0, 0, + _("Perhaps you wanted ibs=%lu rather than bs=%lu?"), + ibs, ibs); + warned = true; + } + + prev_nread = nread; + return nread; + } } } @@ -1075,6 +1097,9 @@ scanargs (int argc, char *const *argv) conversions_mask |= C_TWOBUFS; } + if (output_flags & (O_DIRECT | O_CIO)) + conversions_mask |= C_TWOBUFS; + if (input_blocksize == 0) input_blocksize = DEFAULT_BLOCKSIZE; if (output_blocksize == 0) -- 1.7.4 From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 01 06:27:50 2011 Received: (at 7362) by debbugs.gnu.org; 1 Mar 2011 11:27:50 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuNk5-0005L8-RE for submit@debbugs.gnu.org; Tue, 01 Mar 2011 06:27:50 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1PuNk2-0005Ku-Vt for 7362@debbugs.gnu.org; Tue, 01 Mar 2011 06:27:47 -0500 Received: (qmail 56769 invoked from network); 1 Mar 2011 11:27:40 -0000 Received: from unknown (HELO ?192.168.2.25?)
(84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 1 Mar 2011 11:27:40 -0000 Message-ID: <4D6CD80F.8070708@draigBrady.com> Date: Tue, 01 Mar 2011 11:27:11 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> In-Reply-To: <4D6CC150.4000900@cs.ucla.edu> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) On 01/03/11 09:50, Paul Eggert wrote: > On 02/28/2011 01:41 AM, Pádraig Brady wrote: >> Hmm, it's better to be explicit but I think defaulting to "fullblock" >> is too risky. As an interim step at least, how about just warning >> as per the attached. > > Ouch. This is a pain to think about. But here are some thoughts anyway: > > * I went back and reread POSIX, and "dd" is allowed to issue > diagnostics to stderr whenever it likes. So we don't need to > worry about POSIXLY_CORRECT if all we want to do is issue diagnostics. > We can issue them regardless of POSIXLY_CORRECT. Checking POSIXLY_CORRECT allows one to disable the warnings > > * I don't understand the business with C_SYNC. People who use > conv=sync know what they're doing, or ought to; there's little > point giving them a warning. 
They might think the short read might only apply to the end of the input? But fair enough, it's not as important as the O_DIRECT issue. > * For (O_DIRECT | O_CIO), surely this matters only for output_flags. > If these bits are set in input_flags then O_FULLBLOCK is irrelevant, no? True. > * If we care about max_records we should also care about skip_records, > since short reads matter when skipping in a pipe, too. True. > * Since POSIX doesn't specify the direct or cio flags, we're free > to have them silently enable iflag=fullblock. But it doesn't sound > right to do that. Instead, we should set conversions_mask |= C_TWOBUFS, > because the input and output blocksizes might differ. > > * If we suggest ibs=whatever rather than iflag=fullblock, our > suggestions will be portable to other POSIX implementations, which > is a plus. So the standard way to accumulate short reads to a full write, is to specify separate ibs and obs (we'd probably want to prompt about setting obs too for efficiency). However I think that would mess up with a specific count (1 for each partial read) or with conv=sync,noerror > > * Rather than warn about potential problems, how about diagnosing the > problems only when they actually occur? That would help us avoid > crying wolf. I like that idea, except that users might only hit an issue on particular runs, or when moving from a test file to a pipe in production. cheers, Pádraig. 
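The interaction between count and separate ibs/obs discussed above can be sketched with portable options only. This is a minimal illustration under stated assumptions: the writer is slow enough that each chunk arrives as its own read, the block size is small (as noted earlier in the thread, the two-stage trick does not generally hold as the size grows), and the names slow_writer, single, counted and reblocked are invented for the example.

```shell
# Hypothetical helper: three 4-byte writes separated by pauses.
slow_writer() {
  printf 'AAAA'; sleep 1
  printf 'BBBB'; sleep 1
  printf 'CCCC'
}

# bs=8 count=1: the first short read (4 bytes) already uses up the count.
single=$(slow_writer | dd bs=8 count=1 2>/dev/null)

# Separate ibs/obs coalesce output, but count still counts each partial
# read as one record, so only the first chunk is copied.
counted=$(slow_writer | dd ibs=8 obs=8 count=1 2>/dev/null)

# A first dd re-blocks the stream into 8-byte writes; the second dd then
# sees one full block for count=1.
reblocked=$(slow_writer | dd ibs=8 obs=8 2>/dev/null | dd bs=8 count=1 2>/dev/null)

printf '%s\n%s\n%s\n' "$single" "$counted" "$reblocked"
```

With the timing assumption above, the first two variables should contain only AAAA and the third AAAABBBB, which is why a count combined with short reads needs either a re-blocking stage or iflag=fullblock.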
From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 01 07:02:33 2011 Received: (at 7362) by debbugs.gnu.org; 1 Mar 2011 12:02:33 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuOHg-00069T-Tk for submit@debbugs.gnu.org; Tue, 01 Mar 2011 07:02:33 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1PuOHe-00069D-Vx for 7362@debbugs.gnu.org; Tue, 01 Mar 2011 07:02:31 -0500 Received: (qmail 63315 invoked from network); 1 Mar 2011 12:02:24 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 1 Mar 2011 12:02:24 -0000 Message-ID: <4D6CE033.9030103@draigBrady.com> Date: Tue, 01 Mar 2011 12:01:55 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> In-Reply-To: <4D6CD80F.8070708@draigBrady.com> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) On 01/03/11 11:27, Pádraig Brady wrote: > On 01/03/11 09:50, Paul Eggert wrote: >> * Since POSIX doesn't specify the direct or cio flags, we're free >> 
to have them silently enable iflag=fullblock. But it doesn't sound >> right to do that. Instead, we should set conversions_mask |= C_TWOBUFS, >> because the input and output blocksizes might differ. >> >> * If we suggest ibs=whatever rather than iflag=fullblock, our >> suggestions will be portable to other POSIX implementations, which >> is a plus. > > So the standard way to accumulate short reads to a full write, > is to specify separate ibs and obs (we'd probably want to prompt about > setting obs too for efficiency). However I think that would mess up > with a specific count (1 for each partial read) or with conv=sync,noerror Oh right, you're warning for short reads with a specific count and discounting the conv=sync,noerror case. From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 01 12:46:02 2011 Received: (at 7362) by debbugs.gnu.org; 1 Mar 2011 17:46:02 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuTe6-0006Bf-88 for submit@debbugs.gnu.org; Tue, 01 Mar 2011 12:46:02 -0500 Received: from smtp.cs.ucla.edu ([131.179.128.62]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuTe0-0006BB-3S for 7362@debbugs.gnu.org; Tue, 01 Mar 2011 12:46:00 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 03AC739E80DC; Tue, 1 Mar 2011 09:45:50 -0800 (PST) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id tn-MAI7DMTcx; Tue, 1 Mar 2011 09:45:49 -0800 (PST) Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 8664939E8083; Tue, 1 Mar 2011 09:45:49 -0800 (PST) Message-ID: <4D6D30C6.7000607@cs.ucla.edu> Date: Tue, 01 Mar 2011 09:45:42 -0800 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; U; 
Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Thunderbird/3.1.7 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> In-Reply-To: <4D6CD80F.8070708@draigBrady.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable X-Spam-Score: -3.3 (---) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.3 (---) On 03/01/2011 03:27 AM, Pádraig Brady wrote: > So the standard way to accumulate short reads to a full write, > is to specify separate ibs and obs (we'd probably want to prompt about > setting obs too for efficiency) Yes, good point, the diagnostic should suggest ibs=N obs=N (instead of just ibs=N). By the way, the relationship between fullblock and ibs=N obs=N is a curious one, one that I don't fully understand. If you have ibs=N obs=N, why would you need fullblock? This should probably be documented (preferably by someone who understands it :-). >> * Rather than warn about potential problems, how about diagnosing the >> problems only when they actually occur? That would help us avoid >> crying wolf. > > I like that idea, except that users might only hit an issue > on particular runs, or when moving from a test file > to a pipe in production. 
True in both cases, but in practice these problems would be pretty rare compared to the problem of dd crying wolf. It's fairly common for people to use dd bs=N to extract sections of regular files, e.g., "dd bs=1k count=1 seek=1 if=bigfile", and issuing warnings in these cases (when dd operates perfectly well) would cause unnecessary confusion. > Checking POSIXLY_CORRECT allows one to disable the warnings Yes, but POSIXLY_CORRECT should be used only to make a program conform to POSIX when the default behavior does not conform. It shouldn't be used for auxiliary purposes such as suppressing warnings. If we want an option to suppress warnings, we should add one; but we shouldn't overload POSIXLY_CORRECT. From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 01 13:12:29 2011 Received: (at 7362) by debbugs.gnu.org; 1 Mar 2011 18:12:29 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuU3h-0006m0-5X for submit@debbugs.gnu.org; Tue, 01 Mar 2011 13:12:29 -0500 Received: from mx1.redhat.com ([209.132.183.28]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuU3e-0006ln-2A for 7362@debbugs.gnu.org; Tue, 01 Mar 2011 13:12:27 -0500 Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p21ICJ1s015937 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Tue, 1 Mar 2011 13:12:19 -0500 Received: from [10.3.113.122] (ovpn-113-122.phx2.redhat.com [10.3.113.122]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id p21ICIYo006479; Tue, 1 Mar 2011 13:12:18 -0500 Message-ID: <4D6D3701.6060806@redhat.com> Date: Tue, 01 Mar 2011 11:12:17 -0700 From: Eric Blake Organization: Red Hat User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Lightning/1.0b3pre Mnenhy/0.8.3 Thunderbird/3.1.7 
MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> In-Reply-To: <4D6D30C6.7000607@cs.ucla.edu> X-Enigmail-Version: 1.1.2 OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="------------enigD258F89F275FA933DC1A3E3F" X-Scanned-By: MIMEDefang 2.67 on 10.5.11.12 X-Spam-Score: -10.3 (----------) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org, =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -10.3 (----------) This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigD258F89F275FA933DC1A3E3F Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 03/01/2011 10:45 AM, Paul Eggert wrote: >> Checking POSIXLY_CORRECT allows one to disable the warnings > > Yes, but POSIXLY_CORRECT should be used only to make a program conform > to POSIX when the default behavior does not conform. It shouldn't > be used for auxiliary purposes such as suppressing warnings. If we > want an option to suppress warnings, we should add one; but we > shouldn't overload POSIXLY_CORRECT. 
Agreed - POSIXLY_CORRECT is not the right knob to use, since we are already asserting that the warnings don't violate POSIX (indeed, POSIX states under stderr that "Diagnostic messages may also be written to standard error.", line 83304 of POSIX 2008). We already have status=noxfer as a way to suppress one class of output; we could add status=nowarn to suppress this new warning. -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org --------------enigD258F89F275FA933DC1A3E3F Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBCAAGBQJNbTcBAAoJEKeha0olJ0Nqk94H/Rae2X6SfVTfCm65UzC+D4MJ gAtukNzBa21+OV9v/hnT1iop9StF4NfAdKDGz9MIemfA+VR4imD+9AqNejoePdXH wjsA/zcZtlPsqDrNshsKj2peJhYMpKN5ctuUepKNPojZit6agVw41a5vMgFdpipM JblAa0aI61E0fIjzvOjfAUDbGmbrCFB1lYscinTvOK7clP6Ny57rUEmvSgfJgx0k 5uoD7dadr+Jjk1PF4IdP+6lFBeU0z9Y0IsywjpbVIbwobpHNQhJ/M34O7Meh0zyW ePqe75RREfvan6cclrgBZtfdi3I7x4yHXuNrtOa+bLTFR1MY9INHTLGVB0RpNa8= =6vfc -----END PGP SIGNATURE----- --------------enigD258F89F275FA933DC1A3E3F-- From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 01 14:30:20 2011 Received: (at 7362) by debbugs.gnu.org; 1 Mar 2011 19:30:20 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuVH2-0000HR-Ip for submit@debbugs.gnu.org; Tue, 01 Mar 2011 14:30:20 -0500 Received: from smtp.cs.ucla.edu ([131.179.128.62]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuVH1-0000HE-73 for 7362@debbugs.gnu.org; Tue, 01 Mar 2011 14:30:20 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 
107BA39E80DF; Tue, 1 Mar 2011 11:30:13 -0800 (PST) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id E6LPvp1bCx3r; Tue, 1 Mar 2011 11:30:12 -0800 (PST) Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 9FC0139E80DB; Tue, 1 Mar 2011 11:30:12 -0800 (PST) Message-ID: <4D6D4944.4070608@cs.ucla.edu> Date: Tue, 01 Mar 2011 11:30:12 -0800 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Thunderbird/3.1.7 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> In-Reply-To: <4D6D30C6.7000607@cs.ucla.edu> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -3.3 (---) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.3 (---) On 03/01/2011 09:45 AM, Paul Eggert wrote: > By the way, the relationship between fullblock and ibs=N obs=N is > a curious one, one that I don't fully understand. If you have > ibs=N obs=N, why would you need fullblock? This should probably > be documented (preferably by someone who understands it :-). 
In looking into this some more, I did find one difference (or at least, something that *should* be a difference). POSIX says that when an input record has an odd size, conv=swab should output the last byte as-is. If dd obeyed POSIX, ibs=100 obs=100 conv=swab would differ from bs=100 iflag=fullblock conv=swab in that the latter would swap every pair of input bytes, regardless of how many bytes are returned by individual 'read' system calls, whereas the former would not swap the last byte of each input record that has an odd size. Except -- GNU dd doesn't conform to POSIX here! It swaps bytes in the ibs=100 obs=100 conv=swab case, even when an input record has an odd number of bytes and POSIX says the last byte shouldn't be swapped. For example: (echo ab; sleep 1; echo cd) | dd ibs=100 conv=swab 2>/dev/null POSIX says the output should be: ba dc (and Solaris dd agrees with POSIX here). But GNU dd outputs: bac (with a blank line at the end). I suspect that this incompatibility was put in before the fullblock flag was added, because of the need to be able to swap bytes reliably. However, this need is better served by the fullblock option, so we should remove the incompatibility. From debbugs-submit-bounces@debbugs.gnu.org Tue Mar 01 16:40:41 2011 Received: (at 7362) by debbugs.gnu.org; 1 Mar 2011 21:40:41 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuXJA-0003Ao-Si for submit@debbugs.gnu.org; Tue, 01 Mar 2011 16:40:41 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1PuXJ7-0003Aa-JO for 7362@debbugs.gnu.org; Tue, 01 Mar 2011 16:40:38 -0500 Received: (qmail 70923 invoked from network); 1 Mar 2011 21:40:31 -0000 Received: from unknown (HELO ?192.168.2.25?) 
(84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 1 Mar 2011 21:40:31 -0000 Message-ID: <4D6D67B0.9070908@draigBrady.com> Date: Tue, 01 Mar 2011 21:40:00 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> In-Reply-To: <4D6D30C6.7000607@cs.ucla.edu> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) On 01/03/11 17:45, Paul Eggert wrote: > On 03/01/2011 03:27 AM, Pádraig Brady wrote: > >> So the standard way to accumulate short reads to a full write, >> is to specify separate ibs and obs (we'd probably want to prompt about >> setting obs too for efficiency) > > Yes, good point, the diagnostic should suggest ibs=N obs=N > (instead of just ibs=N). > > By the way, the relationship between fullblock and ibs=N obs=N is > a curious one, one that I don't fully understand. If you have > ibs=N obs=N, why would you need fullblock? This should probably > be documented (preferably by someone who understands it :-). Well as I understand it, it's to do with 'count'. count refers to the number of input reads, both partial and full. 
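A minimal sketch of that counting behavior, assuming GNU dd (iflag=fullblock is not in POSIX) and assuming the pipe delivers the two writes as separate reads, which is typical but not guaranteed. The producer helper is a hypothetical name used only for this illustration:

```shell
#!/bin/sh
# Assumption: GNU dd. iflag=fullblock is a GNU extension, not POSIX.
# "producer" is a made-up helper: it writes "ab", pauses, then writes "cd",
# so a reader on the pipe usually sees two 2-byte reads, not one 4-byte read.
producer() { printf ab; sleep 1; printf cd; }

# count=1 stops after one read, even if that read was only partial.
short=$(producer | dd bs=4 count=1 2>/dev/null)

# With fullblock, dd keeps calling read until the 4-byte block is full
# (or EOF) before it counts the block against count=1.
full=$(producer | dd bs=4 count=1 iflag=fullblock 2>/dev/null)

printf 'count=1:                 %s\n' "$short"
printf 'count=1 iflag=fullblock: %s\n' "$full"
```

On a typical run the first line shows only the bytes from the first short read, while the second shows all four bytes.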
So the advice to use iflag=fullblock is probably safer, especially when a count (or skip) is specified. cheers, Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Wed Mar 02 07:54:04 2011 Received: (at 7362) by debbugs.gnu.org; 2 Mar 2011 12:54:04 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PulZ5-0007kj-JS for submit@debbugs.gnu.org; Wed, 02 Mar 2011 07:54:03 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1PulZ3-0007kF-QX for 7362@debbugs.gnu.org; Wed, 02 Mar 2011 07:54:02 -0500 Received: (qmail 4651 invoked from network); 2 Mar 2011 12:53:55 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 2 Mar 2011 12:53:55 -0000 Message-ID: <4D6E3DC1.7060106@draigBrady.com> Date: Wed, 02 Mar 2011 12:53:21 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D67B0.9070908@draigBrady.com> In-Reply-To: <4D6D67B0.9070908@draigBrady.com> X-Enigmail-Version: 1.0.1 Content-Type: multipart/mixed; boundary="------------050808090500050303040304" X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: 
debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) This is a multi-part message in MIME format. --------------050808090500050303040304 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On 01/03/11 21:40, Pádraig Brady wrote: > On 01/03/11 17:45, Paul Eggert wrote: >> On 03/01/2011 03:27 AM, Pádraig Brady wrote: >> >>> So the standard way to accumulate short reads to a full write, >>> is to specify separate ibs and obs (we'd probably want to prompt about >>> setting obs too for efficiency) >> >> Yes, good point, the diagnostic should suggest ibs=N obs=N >> (instead of just ibs=N). >> >> By the way, the relationship between fullblock and ibs=N obs=N is >> a curious one, one that I don't fully understand. If you have >> ibs=N obs=N, why would you need fullblock? This should probably >> be documented (preferably by someone who understands it :-). > > Well as I understand it, it's to do with 'count'. > count refers to the number of input reads, > both partial and full. > > So the advice to use iflag=fullblock is probably safer, > especially when a count (or skip) is specified. Thinking about it more, we should at least split up the patch. So for the oflag=direct case the attached just enables fullblock (as using C_TWOBUFS would require more mem, CPU, and also messes up if the user specified a count). I'm not sure we should try to be more clever than this, and accept that dd is a low level tool that can be used in a myriad of ways. cheers, Pádraig. --------------050808090500050303040304 Content-Type: text/x-patch; name="dd-fullblock.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="dd-fullblock.diff" >From be0af61dd2288f1e6df4e28d6d955395e3780e7a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= Date: Fri, 25 Feb 2011 12:27:25 +0000 Subject: [PATCH] dd: enable iflag=fullblock for oflag=direct or oflag=cio * NEWS: Mention the change in behavior. 
* doc/coreutils.texi: Document when iflag=fullblock is implied. * src/dd.c (scan_args): Enable O_FULLBLOCK when needed. --- NEWS | 5 +++++ doc/coreutils.texi | 1 + src/dd.c | 8 ++++++++ 3 files changed, 14 insertions(+), 0 deletions(-) diff --git a/NEWS b/NEWS index a367d8d..8627a0b 100644 --- a/NEWS +++ b/NEWS @@ -8,6 +8,11 @@ GNU coreutils NEWS -*- outline -*- delimiter and an unbounded range like "-f1234567890-". [bug introduced in coreutils-5.3.0] +** Changes in behavior + + dd now enables iflag=fullblock with oflag=direct or oflag=cio + where short reads can have adverse effects. + * Noteworthy changes in release 8.10 (2011-02-04) [stable] diff --git a/doc/coreutils.texi b/doc/coreutils.texi index ea35afe..5df2386 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -8216,6 +8216,7 @@ Accumulate full blocks from input. The @code{read} system call may return early if a full block is not available. When that happens, continue calling @code{read} to fill the remainder of the block. +This flag is implied with @samp{oflag=cio} or @samp{oflag=direct}. This flag can be used only with @code{iflag}. @end table diff --git a/src/dd.c b/src/dd.c index daddc1e..b2f4bf3 100644 --- a/src/dd.c +++ b/src/dd.c @@ -1085,6 +1085,14 @@ scanargs (int argc, char *const *argv) if (input_flags & (O_DSYNC | O_SYNC)) input_flags |= O_RSYNC; + /* Enable 'fullblock' with 'direct' or 'cio' as we don't want to + write partial blocks to output, which would disable O_DIRECT. An + alternative would be to enable C_TWOBUFS to accumulate full output + blocks. However that wouldn't work when a count is specified, and + is also less efficient. 
*/ + if (output_flags & (O_DIRECT | O_CIO)) + input_flags |= O_FULLBLOCK; + if (output_flags & O_FULLBLOCK) { error (0, 0, "%s: %s", _("invalid output flag"), "'fullblock'"); -- 1.7.4 --------------050808090500050303040304-- From debbugs-submit-bounces@debbugs.gnu.org Wed Mar 02 08:13:19 2011 Received: (at 7362) by debbugs.gnu.org; 2 Mar 2011 13:13:19 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Pulri-0008BX-Rr for submit@debbugs.gnu.org; Wed, 02 Mar 2011 08:13:19 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1Pulrg-0008BJ-Si for 7362@debbugs.gnu.org; Wed, 02 Mar 2011 08:13:17 -0500 Received: (qmail 8265 invoked from network); 2 Mar 2011 13:13:10 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 2 Mar 2011 13:13:10 -0000 Message-ID: <4D6E4244.6090504@draigBrady.com> Date: Wed, 02 Mar 2011 13:12:36 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D4944.4070608@cs.ucla.edu> In-Reply-To: <4D6D4944.4070608@cs.ucla.edu> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) On 01/03/11 19:30, Paul Eggert wrote: > On 03/01/2011 09:45 AM, Paul Eggert wrote: >> By the way, the relationship between fullblock and ibs=N obs=N is >> a curious one, one that I don't fully understand. If you have >> ibs=N obs=N, why would you need fullblock? This should probably >> be documented (preferably by someone who understands it :-). > > In looking into this some more, I did find one difference > (or at least, something that *should* be a difference). > > POSIX says that when an input record has > an odd size, conv=swab should output the last byte as-is. > If dd obeyed POSIX, ibs=100 obs=100 conv=swab would differ > from bs=100 iflag=fullblock conv=swab in that the latter would > swap every pair of input bytes, regardless of how many bytes > are returned by individual 'read' system calls, whereas the > former would not swap the last byte of each input record that > has an odd size. > > Except -- GNU dd doesn't conform to POSIX here! It swaps > bytes in the ibs=100 obs=100 conv=swab case, even when an > input record has an odd number of bytes and POSIX says the > last byte shouldn't be swapped. > > For example: > > (echo ab; sleep 1; echo cd) | dd ibs=100 conv=swab 2>/dev/null > > POSIX says the output should be: > > ba > dc > > (and Solaris dd agrees with POSIX here). But GNU dd outputs: > > bac > > > (with a blank line at the end). > > I suspect that this incompatibility was put in before the fullblock > flag was added, because of the need to be able to swap bytes reliably. > However, this need is better served by the fullblock option, so we > should remove the incompatibility. Oh good spot. 
One can use an odd block size to test directly: [solaris-10]$ echo abcde | dd bs=3 conv=swab 2>/dev/null baced [solaris-10]$ printf "abcd\n" | dd bs=3 conv=swab 2>/dev/null bac [solaris-10]$ printf "abcd" | dd bs=3 conv=swab 2>/dev/null bacSegmentation Fault (core dumped) So solaris is not being POSIX compliant either :) Seriously though, changing to output the trailing byte directly might break existing scripts that are swapping from a pipe. I suppose we could warn to use fullblock if (!input_seekable && C_SWAB && !ibs%2 && nread%2) though that would warn for your example above. So I'm not sure how to proceed here. cheers, Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Wed Mar 02 08:36:43 2011 Received: (at 7362) by debbugs.gnu.org; 2 Mar 2011 13:36:43 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PumEM-0000GG-IT for submit@debbugs.gnu.org; Wed, 02 Mar 2011 08:36:43 -0500 Received: from mx.meyering.net ([82.230.74.64]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PumEK-0000G2-7Y for 7362@debbugs.gnu.org; Wed, 02 Mar 2011 08:36:41 -0500 Received: by rho.meyering.net (Acme Bit-Twister, from userid 1000) id 8BFF5600A2; Wed, 2 Mar 2011 14:36:33 +0100 (CET) From: Jim Meyering To: =?iso-8859-1?Q?P=E1draig?= Brady Subject: Re: bug#7362: dd strangeness In-Reply-To: <4D6E3DC1.7060106@draigBrady.com> (=?iso-8859-1?Q?=22P=E1drai?= =?iso-8859-1?Q?g?= Brady"'s message of "Wed, 02 Mar 2011 12:53:21 +0000") References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D67B0.9070908@draigBrady.com> <4D6E3DC1.7060106@draigBrady.com> Date: Wed, 02 Mar 2011 14:36:33 +0100 Message-ID: 
<87bp1teja6.fsf@rho.meyering.net> Lines: 47 MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -5.8 (-----) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org, Paul Eggert X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -5.8 (-----) Pádraig Brady wrote: > On 01/03/11 21:40, Pádraig Brady wrote: >> On 01/03/11 17:45, Paul Eggert wrote: >>> On 03/01/2011 03:27 AM, Pádraig Brady wrote: >>> >>>> So the standard way to accumulate short reads to a full write, >>>> is to specify separate ibs and obs (we'd probably want to prompt about >>>> setting obs too for efficiency) >>> >>> Yes, good point, the diagnostic should suggest ibs=N obs=N >>> (instead of just ibs=N). >>> >>> By the way, the relationship between fullblock and ibs=N obs=N is >>> a curious one, one that I don't fully understand. If you have >>> ibs=N obs=N, why would you need fullblock? This should probably >>> be documented (preferably by someone who understands it :-). >> >> Well as I understand it, it's to do with 'count'. >> count refers to the number of input reads, >> both partial and full. >> >> So the advice to use iflag=fullblock is probably safer, >> especially when a count (or skip) is specified. > > Thinking about it more, we should at least split up the patch. > So for the oflag=direct case the attached just enables fullblock > (as using C_TWOBUFS would require more mem, CPU, and also messes > up if the user specified a count). > > I'm not sure we should try to be more clever than this, > and accept that dd is a low level tool that can be > used in a myriad of ways. ... 
> Subject: [PATCH] dd: enable iflag=fullblock for oflag=direct or oflag=cio > > * NEWS: Mention the change in behavior. > * doc/coreutils.texi: Document when iflag=fullblock is implied. > * src/dd.c (scan_args): Enable O_FULLBLOCK when needed. ... > +** Changes in behavior > + > + dd now enables iflag=fullblock with oflag=direct or oflag=cio > + where short reads can have adverse effects. Thanks. This looks fine to me. It is so targeted and affects dd only when a non-POSIX flag is specified, that I can't imagine it would cause any trouble. From debbugs-submit-bounces@debbugs.gnu.org Wed Mar 02 12:56:02 2011 Received: (at 7362) by debbugs.gnu.org; 2 Mar 2011 17:56:02 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuqHJ-0006xN-Ki for submit@debbugs.gnu.org; Wed, 02 Mar 2011 12:56:02 -0500 Received: from smtp.cs.ucla.edu ([131.179.128.62]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuqHG-0006x4-Gf for 7362@debbugs.gnu.org; Wed, 02 Mar 2011 12:55:59 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 7646039E80F0; Wed, 2 Mar 2011 09:55:52 -0800 (PST) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id TXaJSqnOKJUB; Wed, 2 Mar 2011 09:55:51 -0800 (PST) Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id E593939E80DB; Wed, 2 Mar 2011 09:55:51 -0800 (PST) Message-ID: <4D6E84A7.1020805@cs.ucla.edu> Date: Wed, 02 Mar 2011 09:55:51 -0800 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Thunderbird/3.1.7 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#7362: dd 
strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D67B0.9070908@draigBrady.com> <4D6E3DC1.7060106@draigBrady.com> In-Reply-To: <4D6E3DC1.7060106@draigBrady.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable X-Spam-Score: -3.3 (---) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.3 (---) On 03/02/2011 04:53 AM, Pádraig Brady wrote: > + /* Enable 'fullblock' with 'direct' or 'cio' as we don't want to > + write partial blocks to output, which would disable O_DIRECT. An > + alternative would be to enable C_TWOBUFS to accumulate full output > + blocks. However that wouldn't work when a count is specified, and > + is also less efficient. */ > + if (output_flags & (O_DIRECT | O_CIO)) > + input_flags |= O_FULLBLOCK; I'm afraid this patch feels wrong somehow. It's conflating four issues: counting, efficiency, disabling O_DIRECT, and disabling O_CIO. Some thoughts: 1. Offhand I don't see why O_CIO has anything to do with blocking; why is it mentioned here? 2. If C_TWOBUFS is already in effect, then dd needn't set O_FULLBLOCK, since C_TWOBUFS already prevents the disabling of O_DIRECT. 3. The counting issue is independent of oflag=direct or oflag=cio: any solution for counting should work regardless of oflag settings. The more I think about it, the more I suspect that this O_FULLBLOCK-inferring code should be omitted. 
It adds little benefit, as it covers a rare combination of flags that is likely to be used only by experts, who should know the gotchas in this area anyway. And the extra complexity in documentation will penalize everybody who reads it, even the non-experts. From debbugs-submit-bounces@debbugs.gnu.org Wed Mar 02 18:07:27 2011 Received: (at 7362) by debbugs.gnu.org; 2 Mar 2011 23:07:27 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Puv8h-0006F1-2B for submit@debbugs.gnu.org; Wed, 02 Mar 2011 18:07:27 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1Puv8f-0006Ep-Ez for 7362@debbugs.gnu.org; Wed, 02 Mar 2011 18:07:26 -0500 Received: (qmail 14467 invoked from network); 2 Mar 2011 23:07:19 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 2 Mar 2011 23:07:19 -0000 Message-ID: <4D6ECD83.1040500@draigBrady.com> Date: Wed, 02 Mar 2011 23:06:43 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D67B0.9070908@draigBrady.com> <4D6E3DC1.7060106@draigBrady.com> <4D6E84A7.1020805@cs.ucla.edu> In-Reply-To: <4D6E84A7.1020805@cs.ucla.edu> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org 
X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) On 02/03/11 17:55, Paul Eggert wrote: > On 03/02/2011 04:53 AM, Pádraig Brady wrote: >> + /* Enable 'fullblock' with 'direct' or 'cio' as we don't want to >> + write partial blocks to output, which would disable O_DIRECT. An >> + alternative would be to enable C_TWOBUFS to accumulate full output >> + blocks. However that wouldn't work when a count is specified, and >> + is also less efficient. */ >> + if (output_flags & (O_DIRECT | O_CIO)) >> + input_flags |= O_FULLBLOCK; > > I'm afraid this patch feels wrong somehow. It's conflating four > issues: counting, efficiency, disabling O_DIRECT, and disabling O_CIO. > Some thoughts: > > 1. Offhand I don't see why O_CIO has anything to do with blocking; > why is it mentioned here? Well in my quick googling it seemed to be a superset of O_DIRECT mode, with the same alignment constraints. It seems to fall back to a standard (synchronous) write if the alignment is not maintained (we should always be OK there), so I thought it safer to add it given its close relationship with O_DIRECT. I'll remove it until I can actually test this. > > 2. If C_TWOBUFS is already in effect, then dd needn't set > O_FULLBLOCK, since C_TWOBUFS already prevents the disabling of > O_DIRECT. True. But it shouldn't cause an issue. I'll leave as is to simplify the code and docs. I'll amend the comment. > 3. The counting issue is independent of oflag=direct or oflag=cio: > any solution for counting should work regardless of oflag settings. True. I'll amend the comment. > The more I think about it, the more I suspect that this > O_FULLBLOCK-inferring code should be omitted. 
It adds little benefit, as > it covers a rare combination of flags that is likely to be used only > by experts, who should know the gotchas in this area anyway. And the > extra complexity in documentation will penalize everybody who reads > it, even the non-experts. Well it has caught people out recently: https://bugzilla.redhat.com/show_bug.cgi?id=614605 https://lkml.org/lkml/2011/2/22/746 I'm still inclined to add this. Thanks for all the insightful comments on this BTW. dd really is hairy. Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Wed Mar 02 18:30:08 2011 Received: (at 7362) by debbugs.gnu.org; 2 Mar 2011 23:30:08 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PuvUe-0006jp-4l for submit@debbugs.gnu.org; Wed, 02 Mar 2011 18:30:08 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1PuvUb-0006j5-Iv for 7362@debbugs.gnu.org; Wed, 02 Mar 2011 18:30:06 -0500 Received: (qmail 17839 invoked from network); 2 Mar 2011 23:29:59 -0000 Received: from unknown (HELO ?192.168.2.25?) 
(84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 2 Mar 2011 23:29:59 -0000 Message-ID: <4D6ED2D3.8060106@draigBrady.com> Date: Wed, 02 Mar 2011 23:29:23 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D67B0.9070908@draigBrady.com> <4D6E3DC1.7060106@draigBrady.com> <4D6E84A7.1020805@cs.ucla.edu> In-Reply-To: <4D6E84A7.1020805@cs.ucla.edu> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) On 02/03/11 17:55, Paul Eggert wrote: > On 03/02/2011 04:53 AM, Pádraig Brady wrote: >> + /* Enable 'fullblock' with 'direct' or 'cio' as we don't want to >> + write partial blocks to output, which would disable O_DIRECT. An >> + alternative would be to enable C_TWOBUFS to accumulate full output >> + blocks. However that wouldn't work when a count is specified, and >> + is also less efficient. */ >> + if (output_flags & (O_DIRECT | O_CIO)) >> + input_flags |= O_FULLBLOCK; > > I'm afraid this patch feels wrong somehow. It's conflating four > issues: counting, efficiency, disabling O_DIRECT, and disabling O_CIO. > Some thoughts: > > 1. 
Offhand I don't see why O_CIO has anything to do with blocking; > why is it mentioned here? > > 2. If C_TWOBUFS is already in effect, then dd needn't set > O_FULLBLOCK, since C_TWOBUFS already prevents the disabling of > O_DIRECT. > > 3. The counting issue is independent of oflag=direct or oflag=cio: > any solution for counting should work regardless of oflag settings. > > The more I think about it, the more I suspect that this > O_FULLBLOCK-inferring code should be omitted. It adds little benefit, as > it covers a rare combination of flags that is likely to be used only > by experts, who should know the gotchas in this area anyway. And the > extra complexity in documentation will penalize everybody who reads > it, even the non-experts. Hmm I actually thought of something that may be better. The O_DIRECT -> normal switch done in iwrite() for the last write, should only happen once. If it happens more than that, we can print a warning, suggesting the use of iflag=fullblock. cheers, Pádraig. 
From debbugs-submit-bounces@debbugs.gnu.org Wed Mar 02 19:11:25 2011 Received: (at 7362) by debbugs.gnu.org; 3 Mar 2011 00:11:26 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Puw8b-0007eU-ID for submit@debbugs.gnu.org; Wed, 02 Mar 2011 19:11:25 -0500 Received: from smtp.cs.ucla.edu ([131.179.128.62]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Puw8Z-0007eH-N2 for 7362@debbugs.gnu.org; Wed, 02 Mar 2011 19:11:24 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 8710E39E80FA; Wed, 2 Mar 2011 16:11:17 -0800 (PST) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id K-dtXVDQzIDS; Wed, 2 Mar 2011 16:11:16 -0800 (PST) Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id CC41439E80F5; Wed, 2 Mar 2011 16:11:15 -0800 (PST) Message-ID: <4D6EDCA3.3070402@cs.ucla.edu> Date: Wed, 02 Mar 2011 16:11:15 -0800 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Thunderbird/3.1.7 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D67B0.9070908@draigBrady.com> <4D6E3DC1.7060106@draigBrady.com> <4D6E84A7.1020805@cs.ucla.edu> <4D6ED2D3.8060106@draigBrady.com> In-Reply-To: <4D6ED2D3.8060106@draigBrady.com> Content-Type: text/plain; 
charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable X-Spam-Score: -3.3 (---) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.3 (---) On 03/02/2011 03:29 PM, Pádraig Brady wrote: > The O_DIRECT -> normal switch done in iwrite() for the > last write, should only happen once. > If it happens more than that, we can print a warning, > suggesting the use of iflag=fullblock. Yes, that sounds like a better approach. Thanks for thinking this through. I looked into O_CIO and it appears that it does imply O_DIRECT, but O_DIRECT does not have the problems on AIX (which has O_CIO) that I guess it has on GNU/Linux. Quoting: "The use of Direct I/O requires that certain alignment and length restrictions be met by the application’s I/O requests. Table 1 lists these requirements for JFS2. Failure to meet these requirements causes reads and writes to be done using normal cached I/O, but after the data is transferred to the application buffer, the cached copy is discarded." which suggests that the fcntl call in iwrite is not needed on AIX. I don't know if this is worth optimizing, but perhaps it's worth a comment as to why we worry about O_DIRECT but not O_CIO in iwrite. 
From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 04 02:44:30 2011 Received: (at 7362) by debbugs.gnu.org; 4 Mar 2011 07:44:30 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PvPgb-00028m-Ni for submit@debbugs.gnu.org; Fri, 04 Mar 2011 02:44:30 -0500 Received: from smtp.cs.ucla.edu ([131.179.128.62]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PvPgZ-00028Y-8M for 7362@debbugs.gnu.org; Fri, 04 Mar 2011 02:44:28 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 269A839E80F7; Thu, 3 Mar 2011 23:44:21 -0800 (PST) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zj5vze4WpDoD; Thu, 3 Mar 2011 23:44:20 -0800 (PST) Received: from [192.168.1.10] (pool-71-189-109-235.lsanca.fios.verizon.net [71.189.109.235]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 1251339E80E0; Thu, 3 Mar 2011 23:44:20 -0800 (PST) Message-ID: <4D70984F.3020109@cs.ucla.edu> Date: Thu, 03 Mar 2011 23:44:15 -0800 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.14) Gecko/20110223 Thunderbird/3.1.8 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D4944.4070608@cs.ucla.edu> <4D6E4244.6090504@draigBrady.com> In-Reply-To: <4D6E4244.6090504@draigBrady.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.9 (--) 
X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.9 (--) On 03/02/2011 05:12 AM, Pádraig Brady wrote: > So I'm not sure how to proceed here. Before worrying about swab, we should at least try to get a warning out for the short-read case we discussed earlier. Here's a revision of my earlier proposal, with a higher-quality warning. Quite possibly we can fold a swab warning into this code later, but one thing at a time. 2011-03-04 Paul Eggert dd: avoid or diagnose some problems with short reads * src/dd.c (iread): Diagnose short reads when they mess up counts. Derived from a suggestion by Pádraig Brady in: http://lists.gnu.org/archive/html/bug-coreutils/2011-02/msg00150.html diff --git a/NEWS b/NEWS index a367d8d..32cc7b0 100644 --- a/NEWS +++ b/NEWS @@ -39,6 +39,10 @@ GNU coreutils NEWS -*- outline -*- reproduce them efficiently in the output file. mv also benefits when it resorts to copying, e.g., between file systems. + dd bs=BYTES, with either count=BLOCKS or skip=BLOCKS and without + iflag=fullblock, now warns if a short block is read (not at end of + file). This helps avoid confusion when counting or skipping bytes. + join now supports -o 'auto' which will automatically infer the output format from the first line in each file, to ensure the same number of fields are output for each line. diff --git a/src/dd.c b/src/dd.c index daddc1e..39fa3ab 100644 --- a/src/dd.c +++ b/src/dd.c @@ -796,14 +796,43 @@ process_signals (void) static ssize_t iread (int fd, char *buf, size_t size) { - while (true) + ssize_t nread; + + do { - ssize_t nread; process_signals (); nread = read (fd, buf, size); - if (! 
(nread < 0 && errno == EINTR)) - return nread; } + while (nread < 0 && errno == EINTR); + + if (0 < nread) + { + /* If bs=SIZE is given and iflag=fullblock is not, warn if a + short block was read (not at EOF), and either count=BLOCKS or + skip=BLOCKS is also given. This helps avoid confusion when + counting or skipping bytes. */ + static bool warned; + static ssize_t prev_nread; + + if (! warned && iread_fnc == iread + && 0 < prev_nread && prev_nread < size + && (skip_records + || (0 < max_records && max_records < (uintmax_t) -1))) + { + uintmax_t prev = prev_nread; + error (0, 0, ngettext (("warning: short read (%"PRIuMAX" byte); " + "suggest iflag=fullblock"), + ("warning: short read (%"PRIuMAX" bytes); " + "suggest iflag=fullblock"), + select_plural (prev)), + prev); + warned = true; + } + + prev_nread = nread; + } + + return nread; } /* Wrapper around iread function to accumulate full blocks. */ From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 04 05:10:45 2011 Received: (at 7362) by debbugs.gnu.org; 4 Mar 2011 10:10:45 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PvRy9-0006FL-4i for submit@debbugs.gnu.org; Fri, 04 Mar 2011 05:10:45 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1PvRy7-0006F9-F2 for 7362@debbugs.gnu.org; Fri, 04 Mar 2011 05:10:44 -0500 Received: (qmail 27547 invoked from network); 4 Mar 2011 10:10:37 -0000 Received: from unknown (HELO ?192.168.2.25?) 
(84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 4 Mar 2011 10:10:37 -0000 Message-ID: <4D70BA72.600@draigBrady.com> Date: Fri, 04 Mar 2011 10:09:54 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D4944.4070608@cs.ucla.edu> <4D6E4244.6090504@draigBrady.com> <4D70984F.3020109@cs.ucla.edu> In-Reply-To: <4D70984F.3020109@cs.ucla.edu> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) On 04/03/11 07:44, Paul Eggert wrote: > + /* If bs=SIZE is given and iflag=fullblock is not, warn if a Do you check that bs= is specified? Do you want to as it's independent of the counting issue? Anyway, with this patch the following slightly contrived example will warn: # Output first 2 parts $ (echo part1; sleep 1; echo part2; sleep 1; echo discard) | dd count=2 obs=1 2>/dev/null part1 part2 So I'm a bit wary about adding this at all. cheers, Pádraig. 
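The short-read effect behind the example above can be reproduced with a small shell sketch. It assumes a dd that supports iflag=fullblock (GNU dd does) and that each printf reaches the pipe as a separate write. The function name short_read_demo is made up for illustration and is not part of any patch in this thread.

```shell
# Hypothetical helper: feed dd two 4-byte chunks one second apart and ask for
# a single 8-byte block. Extra dd operands (e.g. iflag=fullblock) go in "$@".
# stderr is discarded so dd's statistics and any warning stay out of the way.
short_read_demo() {
  { (printf 'aaaa'; sleep 1; printf 'bbbb') | dd bs=8 count=1 "$@"; } 2>/dev/null
}

# Without fullblock, the first read usually returns only the 4 bytes that are
# already in the pipe, and count=1 is satisfied by that partial block.
short_read_demo; echo

# With fullblock, dd keeps reading until the 8-byte block is complete.
short_read_demo iflag=fullblock; echo
```

The first call is timing dependent, which is exactly the hazard discussed in this thread: count= counts reads, not bytes, unless full blocks are accumulated.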
From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 04 13:04:12 2011 Received: (at 7362) by debbugs.gnu.org; 4 Mar 2011 18:04:12 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PvZMJ-0000cZ-Tb for submit@debbugs.gnu.org; Fri, 04 Mar 2011 13:04:12 -0500 Received: from smtp.cs.ucla.edu ([131.179.128.62]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PvZMG-0000cM-Su for 7362@debbugs.gnu.org; Fri, 04 Mar 2011 13:04:09 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id C884C39E80F7; Fri, 4 Mar 2011 10:04:02 -0800 (PST) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id YOUsQwzA7Q6L; Fri, 4 Mar 2011 10:04:02 -0800 (PST) Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 530DD39E80F0; Fri, 4 Mar 2011 10:04:02 -0800 (PST) Message-ID: <4D712992.2040108@cs.ucla.edu> Date: Fri, 04 Mar 2011 10:04:02 -0800 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Thunderbird/3.1.7 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D4944.4070608@cs.ucla.edu> <4D6E4244.6090504@draigBrady.com> <4D70984F.3020109@cs.ucla.edu> <4D70BA72.600@draigBrady.com> In-Reply-To: <4D70BA72.600@draigBrady.com> Content-Type: text/plain; charset=UTF-8; 
format=flowed Content-Transfer-Encoding: quoted-printable X-Spam-Score: -3.3 (---) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.3 (---) On 03/04/2011 02:09 AM, Pádraig Brady wrote: > On 04/03/11 07:44, Paul Eggert wrote: >> + /* If bs=SIZE is given and iflag=fullblock is not, warn if a > > Do you check that bs= is specified? I meant to, but I inadvertently deleted that part of the change, which meant that the code didn't implement the comment correctly. Sorry about that; see below. > Anyway, with this patch the following slightly contrived example will warn: The following further patch, which fixes the abovementioned typo, should address that problem. --- a/src/dd.c +++ b/src/dd.c @@ -814,7 +814,7 @@ iread (int fd, char *buf, size_t size) static bool warned; static ssize_t prev_nread; - if (! warned && iread_fnc == iread + if (! warned && ! (conversions_mask & C_TWOBUFS) && iread_fnc == iread && 0 < prev_nread && prev_nread < size && (skip_records || (0 < max_records && max_records < (uintmax_t) -1))) From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 04 17:42:54 2011 Received: (at 7362) by debbugs.gnu.org; 4 Mar 2011 22:42:54 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Pvdi1-00009L-VH for submit@debbugs.gnu.org; Fri, 04 Mar 2011 17:42:54 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1Pvdhy-000097-Vm for 7362@debbugs.gnu.org; Fri, 04 Mar 2011 17:42:51 -0500 Received: (qmail 59823 invoked from network); 4 Mar 2011 22:42:44 -0000 Received: from unknown (HELO ?192.168.2.25?) 
(84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 4 Mar 2011 22:42:44 -0000 Message-ID: <4D716AB7.3030202@draigBrady.com> Date: Fri, 04 Mar 2011 22:41:59 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D4944.4070608@cs.ucla.edu> <4D6E4244.6090504@draigBrady.com> <4D70984F.3020109@cs.ucla.edu> <4D70BA72.600@draigBrady.com> <4D712992.2040108@cs.ucla.edu> In-Reply-To: <4D712992.2040108@cs.ucla.edu> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) On 04/03/11 18:04, Paul Eggert wrote: > On 03/04/2011 02:09 AM, Pádraig Brady wrote: >> On 04/03/11 07:44, Paul Eggert wrote: >>> + /* If bs=SIZE is given and iflag=fullblock is not, warn if a >> >> Do you check that bs= is specified? > > I meant to, but I inadvertently deleted that part of the change, > which meant that the code didn't implement the comment correctly. > Sorry about that; see below. > >> Anyway, with this patch the following slightly contrived example will >> warn: > > The following further patch, which fixes the abovementioned typo, > should address that problem. 
> > --- a/src/dd.c > +++ b/src/dd.c > @@ -814,7 +814,7 @@ iread (int fd, char *buf, size_t size) > static bool warned; > static ssize_t prev_nread; > > - if (! warned && iread_fnc == iread > + if (! warned && ! (conversions_mask & C_TWOBUFS) && iread_fnc == > iread > && 0 < prev_nread && prev_nread < size Looks good! I can't think of an example where this might erroneously warn. > && (skip_records > || (0 < max_records && max_records < (uintmax_t) -1))) If we more aggressively warn by removing the check for the counts above, then we would directly address http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8171 Unfortunately that would give unwanted warnings though, in some cases. cheers, Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 04 20:45:11 2011 Received: (at 7362) by debbugs.gnu.org; 5 Mar 2011 01:45:11 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PvgYQ-0004Mi-Rw for submit@debbugs.gnu.org; Fri, 04 Mar 2011 20:45:11 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1PvgYO-0004MV-Vw for 7362@debbugs.gnu.org; Fri, 04 Mar 2011 20:45:09 -0500 Received: (qmail 82786 invoked from network); 5 Mar 2011 01:45:02 -0000 Received: from unknown (HELO ?192.168.2.25?) 
(84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 5 Mar 2011 01:45:02 -0000 Message-ID: <4D719570.4020403@draigBrady.com> Date: Sat, 05 Mar 2011 01:44:16 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D67B0.9070908@draigBrady.com> <4D6E3DC1.7060106@draigBrady.com> <4D6E84A7.1020805@cs.ucla.edu> <4D6ED2D3.8060106@draigBrady.com> In-Reply-To: <4D6ED2D3.8060106@draigBrady.com> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) I'm going to apply the following. commit decea1c17bcbce3f70bfb50b7de66de5eaba9b91 Author: Pádraig Brady Date: Fri Feb 25 12:27:25 2011 +0000 dd: warn when we disable oflag=direct not at EOF An alternative to this is to auto enable iflag=fullblock when oflag=direct and bs= is specified. It was thought better though, to warn about the specific issue, and give full control of dd's options to the user. * src/dd.c (iwrite): Warn, when we write after having disabled O_DIRECT. 
See https://bugzilla.redhat.com/show_bug.cgi?id=614605 diff --git a/src/dd.c b/src/dd.c index daddc1e..fd468a6 100644 --- a/src/dd.c +++ b/src/dd.c @@ -837,6 +837,12 @@ iwrite (int fd, char const *buf, size_t size) { size_t total_written = 0; + if ((output_flags & O_DIRECT) && w_partial == 1) + { + error (0, 0, _("dd: warning: partial read; oflag=direct disabled; " + "suggest iflag=fullblock")); + } + if ((output_flags & O_DIRECT) && size < output_blocksize) { int old_flags = fcntl (STDOUT_FILENO, F_GETFL); From debbugs-submit-bounces@debbugs.gnu.org Fri Mar 04 22:29:18 2011 Received: (at 7362) by debbugs.gnu.org; 5 Mar 2011 03:29:18 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PviBC-0006dL-1A for submit@debbugs.gnu.org; Fri, 04 Mar 2011 22:29:18 -0500 Received: from smtp.cs.ucla.edu ([131.179.128.62]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PviB9-0006d8-6B for 7362@debbugs.gnu.org; Fri, 04 Mar 2011 22:29:16 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 2B56639E80F9; Fri, 4 Mar 2011 19:29:09 -0800 (PST) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id rkaVBFi+ibVo; Fri, 4 Mar 2011 19:29:08 -0800 (PST) Received: from [192.168.1.10] (pool-71-189-109-235.lsanca.fios.verizon.net [71.189.109.235]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id AFE3E39E80F7; Fri, 4 Mar 2011 19:29:08 -0800 (PST) Message-ID: <4D71AE04.7020307@cs.ucla.edu> Date: Fri, 04 Mar 2011 19:29:08 -0800 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.14) Gecko/20110223 Thunderbird/3.1.8 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> 
<4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D67B0.9070908@draigBrady.com> <4D6E3DC1.7060106@draigBrady.com> <4D6E84A7.1020805@cs.ucla.edu> <4D6ED2D3.8060106@draigBrady.com> <4D719570.4020403@draigBrady.com> In-Reply-To: <4D719570.4020403@draigBrady.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.9 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.9 (--) On 03/04/2011 05:44 PM, Pádraig Brady wrote: > + if ((output_flags & O_DIRECT) && w_partial == 1) > + { > + error (0, 0, _("dd: warning: partial read; oflag=direct disabled; " > + "suggest iflag=fullblock")); This diagnostic looks wrong. w_partial means there was a partial *write*, not a partial *read*. Anyway, we should use a better overall solution for partial reads; I'll try to come up with one. 
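The read/write distinction is visible in dd's own statistics, where "records in" and "records out" each report full+partial block counts. The sketch below is only an illustration: it assumes GNU dd's message format in the C locale, and dd_stats is a hypothetical helper name. With bs= given, each partial read is written out as a partial block, so the two counters tend to move together, which may be why a check on the write counter seemed to track partial reads.

```shell
# Hypothetical helper: print only dd's first two statistics lines
# ("F+P records in" and "F+P records out") for a slow two-chunk input.
# dd's stderr goes to the pipe; its copied data is thrown away.
dd_stats() {
  (printf 'aaaa'; sleep 1; printf 'bbbb') | dd bs=8 "$@" 2>&1 >/dev/null | head -n 2
}

# Usually two partial input blocks and two partial output blocks.
dd_stats

# One full 8-byte block in, one full block out.
dd_stats iflag=fullblock
```

The first call depends on timing, so its counts are typical rather than guaranteed.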
From debbugs-submit-bounces@debbugs.gnu.org Sat Mar 05 03:18:01 2011 Received: (at 7362) by debbugs.gnu.org; 5 Mar 2011 08:18:01 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Pvmga-0005CN-K0 for submit@debbugs.gnu.org; Sat, 05 Mar 2011 03:18:01 -0500 Received: from smtp.cs.ucla.edu ([131.179.128.62]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1PvmgY-0005CB-7D for 7362@debbugs.gnu.org; Sat, 05 Mar 2011 03:17:59 -0500 Received: from localhost (localhost.localdomain [127.0.0.1]) by smtp.cs.ucla.edu (Postfix) with ESMTP id 5436F39E80F7; Sat, 5 Mar 2011 00:17:52 -0800 (PST) X-Virus-Scanned: amavisd-new at smtp.cs.ucla.edu Received: from smtp.cs.ucla.edu ([127.0.0.1]) by localhost (smtp.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id S4Gf5gTQ7iQJ; Sat, 5 Mar 2011 00:17:51 -0800 (PST) Received: from [192.168.1.10] (pool-71-189-109-235.lsanca.fios.verizon.net [71.189.109.235]) by smtp.cs.ucla.edu (Postfix) with ESMTPSA id 85B0839E80E0; Sat, 5 Mar 2011 00:17:51 -0800 (PST) Message-ID: <4D71F1A9.4030102@cs.ucla.edu> Date: Sat, 05 Mar 2011 00:17:45 -0800 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.14) Gecko/20110223 Thunderbird/3.1.8 MIME-Version: 1.0 To: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D67B0.9070908@draigBrady.com> <4D6E3DC1.7060106@draigBrady.com> <4D6E84A7.1020805@cs.ucla.edu> <4D6ED2D3.8060106@draigBrady.com> <4D719570.4020403@draigBrady.com> <4D71AE04.7020307@cs.ucla.edu> In-Reply-To: 
<4D71AE04.7020307@cs.ucla.edu> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Score: -2.9 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.8 (--) On 03/04/2011 07:29 PM, Paul Eggert wrote: > we should use a better overall solution for partial > reads; I'll try to come up with one. Here is a proposed change for that, incorporating the recent suggestions plus a couple more ideas: From 11b600b65dce686ee9320b5d851fbd1b0beaa832 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Sat, 5 Mar 2011 00:14:25 -0800 Subject: [PATCH] dd: diagnose some problems with partial reads * src/dd.c (warn_partial_read): New static var. (iread): Diagnose partial reads if needed. (iwrite): Don't diagnose them here; not needed any more. (scanargs): Determine whether partial reads should be diagnosed. --- src/dd.c | 53 +++++++++++++++++++++++++++++++++++++++++------------ 1 files changed, 41 insertions(+), 12 deletions(-) diff --git a/src/dd.c b/src/dd.c index 6069671..3472442 100644 --- a/src/dd.c +++ b/src/dd.c @@ -199,6 +199,9 @@ static int input_seek_errno; static uintmax_t input_offset; static bool input_offset_overflow; +/* True if a partial read should be diagnosed. */ +static bool warn_partial_read; + /* Records truncated by conv=block. */ static uintmax_t r_truncate = 0; @@ -894,14 +897,35 @@ invalidate_cache (int fd, off_t len) static ssize_t iread (int fd, char *buf, size_t size) { - while (true) + ssize_t nread; + + do { - ssize_t nread; process_signals (); nread = read (fd, buf, size); - if (! 
(nread < 0 && errno == EINTR)) - return nread; } + while (nread < 0 && errno == EINTR); + + if (0 < nread && warn_partial_read) + { + static ssize_t prev_nread; + + if (0 < prev_nread && prev_nread < size) + { + uintmax_t prev = prev_nread; + error (0, 0, ngettext (("warning: partial read (%"PRIuMAX" byte); " + "suggest iflag=fullblock"), + ("warning: partial read (%"PRIuMAX" bytes); " + "suggest iflag=fullblock"), + select_plural (prev)), + prev); + warn_partial_read = false; + } + + prev_nread = nread; + } + + return nread; } /* Wrapper around iread function to accumulate full blocks. */ @@ -935,12 +959,6 @@ iwrite (int fd, char const *buf, size_t size) { size_t total_written = 0; - if ((output_flags & O_DIRECT) && w_partial == 1) - { - error (0, 0, _("dd: warning: partial read; oflag=direct disabled; " - "suggest iflag=fullblock")); - } - if ((output_flags & O_DIRECT) && size < output_blocksize) { int old_flags = fcntl (STDOUT_FILENO, F_GETFL); @@ -1175,7 +1193,7 @@ scanargs (int argc, char *const *argv) input_blocksize = output_blocksize = blocksize; else { - /* POSIX says dd aggregates short reads into + /* POSIX says dd aggregates partial reads into output_blocksize if bs= is not specified. */ conversions_mask |= C_TWOBUFS; } @@ -1195,6 +1213,17 @@ scanargs (int argc, char *const *argv) error (0, 0, "%s: %s", _("invalid output flag"), "'fullblock'"); usage (EXIT_FAILURE); } + + /* Warn about partial reads if bs=SIZE is given and iflag=fullblock + is not, and if counting or skipping bytes or using direct I/O. + This helps to avoid confusion with miscounts, and to avoid issues + with direct I/O on GNU/Linux. */ + warn_partial_read = + (! (conversions_mask & C_TWOBUFS) && ! (input_flags & O_FULLBLOCK) + && (skip_records + || (0 < max_records && max_records < (uintmax_t) -1) + || (input_flags | output_flags) & O_DIRECT)); + iread_fnc = ((input_flags & O_FULLBLOCK) ? 
iread_fullblock : iread); @@ -1758,7 +1787,7 @@ dd_copy (void) There are 3 reasons why there might be unskipped blocks/bytes: 1. file is too small 2. pipe has not enough data - 3. short reads */ + 3. partial reads */ if (us_blocks || (!input_offset_overflow && us_bytes)) { error (0, 0, -- 1.7.4 From debbugs-submit-bounces@debbugs.gnu.org Sat Mar 05 04:49:59 2011 Received: (at 7362) by debbugs.gnu.org; 5 Mar 2011 09:49:59 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1Pvo7a-0007DX-C0 for submit@debbugs.gnu.org; Sat, 05 Mar 2011 04:49:58 -0500 Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98]) by debbugs.gnu.org with smtp (Exim 4.69) (envelope-from ) id 1Pvo7X-0007DK-Gv for 7362@debbugs.gnu.org; Sat, 05 Mar 2011 04:49:56 -0500 Received: (qmail 34136 invoked from network); 5 Mar 2011 09:49:49 -0000 Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218) by mail1.slb.deg.dub.stisp.net with SMTP; 5 Mar 2011 09:49:49 -0000 Message-ID: <4D72070D.3040002@draigBrady.com> Date: Sat, 05 Mar 2011 09:49:01 +0000 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3 MIME-Version: 1.0 To: Paul Eggert Subject: Re: bug#7362: dd strangeness References: <4CDA7261.4060800@aircomp.aero> <4CDABA65.9030808@draigBrady.com> <4CDAC336.6030105@draigBrady.com> <4D495AE2.8080404@draigBrady.com> <4D679E57.8010104@draigBrady.com> <4D67E6F4.9020308@cs.ucla.edu> <4D6B6DB7.2060707@draigBrady.com> <4D6CC150.4000900@cs.ucla.edu> <4D6CD80F.8070708@draigBrady.com> <4D6D30C6.7000607@cs.ucla.edu> <4D6D67B0.9070908@draigBrady.com> <4D6E3DC1.7060106@draigBrady.com> <4D6E84A7.1020805@cs.ucla.edu> <4D6ED2D3.8060106@draigBrady.com> <4D719570.4020403@draigBrady.com> <4D71AE04.7020307@cs.ucla.edu> <4D71F1A9.4030102@cs.ucla.edu> In-Reply-To: <4D71F1A9.4030102@cs.ucla.edu> X-Enigmail-Version: 1.0.1 Content-Type: text/plain; 
charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.7 (--) X-Debbugs-Envelope-To: 7362 Cc: 7362@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -2.7 (--) On 05/03/11 08:17, Paul Eggert wrote: > On 03/04/2011 07:29 PM, Paul Eggert wrote: >> we should use a better overall solution for partial >> reads; I'll try to come up with one. > > Here is a proposed change for that, incorporating > the recent suggestions plus a couple more ideas: OK I like this since it consolidates the code. It's slightly less general in that it would erroneously warn with conv=sync oflag=direct, but that's not a practical concern. So Ack. cheers, Pádraig. From unknown Sat Sep 13 23:21:40 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Sat, 02 Apr 2011 11:24:05 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator
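Note for readers of this thread: the warning added by the patch suggests iflag=fullblock, which makes dd keep reading until a whole input block has been collected instead of counting a short read from a pipe as one record. The idea can be sketched roughly as below. This is only an illustration under stated assumptions, not the coreutils code: read_full_block and demo_full_block are hypothetical names invented for this note, and the 300 + 212 byte split is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not taken from src/dd.c): keep calling read()
   until SIZE bytes have arrived, end of input is reached, or a real
   error occurs.  Returns the number of bytes stored in BUF, or -1 on
   error.  */
ssize_t
read_full_block (int fd, char *buf, size_t size)
{
  size_t total = 0;

  assert (buf != NULL);
  while (total < size)
    {
      ssize_t n = read (fd, buf + total, size - total);
      if (n < 0)
        {
          if (errno == EINTR)
            continue;           /* interrupted by a signal: retry */
          return -1;            /* genuine read error */
        }
      if (n == 0)
        break;                  /* end of input */
      total += (size_t) n;
    }
  return (ssize_t) total;
}

/* Hypothetical demo: push one 512-byte block through a pipe as two
   smaller writes, then return how many bytes read_full_block()
   collects for that block (or -1 if the setup fails).  */
ssize_t
demo_full_block (void)
{
  int fds[2];
  char chunk[300];
  char block[512];
  ssize_t got;

  if (pipe (fds) != 0)
    return -1;

  memset (chunk, 0, sizeof chunk);
  if (write (fds[1], chunk, 300) != 300
      || write (fds[1], chunk, 212) != 212)
    {
      close (fds[0]);
      close (fds[1]);
      return -1;
    }
  close (fds[1]);               /* writer done: reader will see EOF */

  got = read_full_block (fds[0], block, sizeof block);
  close (fds[0]);
  return got;
}
```

Calling demo_full_block() from a small main() is enough to try it. A single plain read() on the same descriptor is allowed to return fewer bytes than requested, which is how the "+10817" partial records in the original report arise when bs= and count= are used on a pipe without iflag=fullblock.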