From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 10 13:29:51 2014
From: "Polehn, Mike A"
To: "bug-coreutils@gnu.org"
Subject: The Linux cp command has bugs
Date: Fri, 10 Oct 2014 17:25:23 +0000
Message-ID: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com>

cp --version: 8.21, running on Fedora 20, version
3.16.3-200.fc20.x86_64, with latest updates

The Linux copy command (cp) has problems

Problem: I need to copy a tree of 1000s of files to another directory
that is a git directory that has a whole bunch of additional build
files, so a diff between the directories will not do any good.

If the files are copied over the git directory I can do what I need to
do, since I need to see if there are differences in any of the files.

Using: cp -f -r

For each file being copied it asked:

cp: overwrite XXXXXXXXXXXXXXXXX?

So the force command does not work, since it should skip the asking
about doing an overwrite. If the force command is supposed to act
differently, then there should be an additional argument, because
answering yes 1000s of times is not very smart...

Also, since there are a lot of files, if I accidentally hit return
before y, cp moves on to the next file, which implies to me that the
file was not copied, which gets to be a problem when 1000s of files are
copied. I also assumed that 'y' implies the data was copied.

From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 10 14:02:28 2014
From: "Polehn, Mike A"
To: "18681@debbugs.gnu.org" <18681@debbugs.gnu.org>
Subject: cp Specific fail example
Date: Fri, 10 Oct 2014 18:01:59 +0000
Message-ID: <745DB4B8861F8E4B9849C970520ABBF148CAC028@ORSMSX102.amr.corp.intel.com>

######### get and check out version
[root@F20-v3 ~]# cd /usr/src
[root@F20-v3 src]# git clone git://dpdk.org/dpdk
[root@F20-v3 src]# cd dpdk
[root@F20-v3 dpdk]# git tag
v1.2.3r0
v1.2.3r1
v1.2.3r2
v1.2.3r3
v1.2.3r4
v1.3.0r0
v1.3.1r0
v1.3.1r1
v1.3.1r2
v1.3.1r3
v1.4.0r0
v1.4.1r0
v1.4.1r1
v1.4.1r2
v1.5.0r0
v1.5.0r1
v1.5.0r2
v1.5.1r0
v1.5.1r1
v1.5.1r2
v1.5.2r0
v1.5.2r1
v1.5.2r2
v1.6.0r0
v1.6.0r1
v1.6.0r2
v1.7.0
v1.7.0-rc1
v1.7.0-rc2
v1.7.0-rc3
v1.7.0-rc4
v1.7.1
v1.8.0-rc1
[root@F20-test dpdk]# git checkout -b map_v1.7.1 v1.7.1
Switched to a new branch 'map_v1.7.1'

### download dpdk 1.7.1 files from http://dpdk.org/download
### put in /usr/src directory and untar:
[root@F20-v3 src]# tar -xf dpdk-1.7.1.tar.gz
[root@F20-v3 src]# dir
dpdk  dpdk-1.7.1  dpdk-1.7.1.tar.gz
[root@F20-v3 src]# cp -f -r dpdk-1.7.1/* dpdk/
cp: overwrite ‘dpdk/app/test/test_lpm6.c’? y
cp: overwrite ‘dpdk/app/test/test_rwlock.c’? y
cp: overwrite ‘dpdk/app/test/test_table_ports.h’? y
cp: overwrite ‘dpdk/app/test/test_logs.c’? y
cp: overwrite ‘dpdk/app/test/test_pmd_ring.c’? y
cp: overwrite ‘dpdk/app/test/test_table_tables.h’?
cp: overwrite ‘dpdk/app/test/test_lpm.c’?
cp: overwrite ‘dpdk/app/test/test_malloc.c’?
cp: overwrite ‘dpdk/app/test/test_errno.c’? y
cp: overwrite ‘dpdk/app/test/test_hash.c’?
cp: overwrite ‘dpdk/app/test/test_table_acl.h’? y

note: asking question on each file and moving to next file even when
not entering n or y

From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 10 14:13:56 2014
From: Paul Eggert
To: "Polehn, Mike A", 18681@debbugs.gnu.org
Subject: Re: bug#18681: The Linux cp command has bugs
Date: Fri, 10 Oct 2014 11:13:44 -0700
Message-ID: <543821D8.5070306@cs.ucla.edu>
In-Reply-To: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com>

Polehn, Mike A wrote:
> Using: cp -f -r
>
> For each file being copied it asked:
>
> cp: overwrite XXXXXXXXXXXXXXXXX?

That's not what I observe here (see below).  Perhaps there's something
else going on, maybe an alias.  For example, I couldn't get the cp to
work without also using -T.  Can you please give an exact recipe for
reproducing the problem on your platform?

$ mkdir a b
$ echo a >a/f
$ echo b >b/f
$ cp -f -r -T a b
$ cat b/f
a

From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 10 14:17:43 2014
From: Paul Eggert
To: "Polehn, Mike A", "18681@debbugs.gnu.org" <18681@debbugs.gnu.org>
Subject: Re: bug#18681: cp Specific fail example
Date: Fri, 10 Oct 2014 11:17:34 -0700
Message-ID: <543822BE.1030408@cs.ucla.edu>
In-Reply-To: <745DB4B8861F8E4B9849C970520ABBF148CAC028@ORSMSX102.amr.corp.intel.com>

I do not observe the symptoms that you report.  See below.  My guess is
that you've aliased 'cp' to 'cp -i', which is probably a mistake.

$ git clone git://dpdk.org/dpdk
Cloning into 'dpdk'...
remote: Counting objects: 16249, done.
remote: Compressing objects: 100% (3976/3976), done.
remote: Total 16249 (delta 12964), reused 15109 (delta 12122)
Receiving objects: 100% (16249/16249), 12.79 MiB | 1.20 MiB/s, done.
Resolving deltas: 100% (12964/12964), done.
Checking connectivity... done.
$ cd dpdk
$ git checkout -b map_v1.7.1 v1.7.1
Switched to a new branch 'map_v1.7.1'
$ pwd
/tmp/d/dpdk
$ cd ..
$ wget http://dpdk.org/browse/dpdk/snapshot/dpdk-1.7.1.tar.gz
--2014-10-10 11:15:44--  http://dpdk.org/browse/dpdk/snapshot/dpdk-1.7.1.tar.gz
Resolving dpdk.org (dpdk.org)... 92.243.14.124
Connecting to dpdk.org (dpdk.org)|92.243.14.124|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-gzip]
Saving to: ‘dpdk-1.7.1.tar.gz’

    [ <=> ] 8,281,609   1.17MB/s   in 7.5s

2014-10-10 11:15:52 (1.06 MB/s) - ‘dpdk-1.7.1.tar.gz’ saved [8281609]

$ tar -xf dpdk-1.7.1.tar.gz
$ cp -f -r dpdk-1.7.1/* dpdk/
$

From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 10 15:13:48 2014
From: Assaf Gordon
To: "Polehn, Mike A", 18681@debbugs.gnu.org
Subject: Re: bug#18681: The Linux cp command has bugs
Date: Fri, 10 Oct 2014 15:13:44 -0400
Message-ID: <54382FE8.5030904@gmail.com>
In-Reply-To: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com>

Hello Mike,

On 10/10/2014 01:25 PM, Polehn, Mike A wrote:
> Problem need to copy a tree of 1000s of files to another directory
> that is a git directory that has a whole bunch of additional build
> files, so diff between the directories will not do any good.
>

This is slightly off-topic, but if you want to compare only files
managed by git (ignoring other files in current directory), perhaps the
following would help:

     # Download and extract the tarball
     wget -q http://dpdk.org/browse/dpdk/snapshot/dpdk-1.7.1.tar.gz
     tar -xf dpdk-1.7.1.tar.gz

     # Clone the git repo with specific branch, checkout the relevant branch
     # (or go to an existing checked-out repository directory)
     git clone git://dpdk.org/dpdk
     cd dpdk
     git checkout -b map_v1.7.1 v1.7.1

     # For each file managed by git (with 'git ls'),
     # compare it to the corresponding file in the other directory:
     git ls -0 | xargs -0 -I% diff -q % ../dpdk-1.7.1/%

Regards,
  -gordon

From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 10 15:15:22 2014
From: Assaf Gordon
To: "Polehn, Mike A", 18681@debbugs.gnu.org
Subject: Re: bug#18681: The Linux cp command has bugs
Date: Fri, 10 Oct 2014 15:15:18 -0400
Message-ID: <54383046.7000202@gmail.com>
In-Reply-To: <54382FE8.5030904@gmail.com>

Sorry, had a typo:

On 10/10/2014 03:13 PM, Assaf Gordon wrote:
>      # For each file managed by git (with 'git ls'),
>      # compare it to the corresponding file in the other directory:
>      git ls -0 | xargs -0 -I% diff -q % ../dpdk-1.7.1/%
>

Should be:

     git ls -z | xargs -0 -I% diff -q % ../dpdk-1.7.1/%

From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 10 15:46:48 2014
Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga102.fm.intel.com with ESMTP; 10 Oct 2014 12:46:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.04,693,1406617200"; d="scan'208";a="612647473" Received: from orsmsx109.amr.corp.intel.com ([10.22.240.7]) by fmsmga002.fm.intel.com with ESMTP; 10 Oct 2014 12:46:26 -0700 Received: from orsmsx152.amr.corp.intel.com (10.22.226.39) by ORSMSX109.amr.corp.intel.com (10.22.240.7) with Microsoft SMTP Server (TLS) id 14.3.195.1; Fri, 10 Oct 2014 12:46:26 -0700 Received: from orsmsx102.amr.corp.intel.com ([169.254.1.8]) by ORSMSX152.amr.corp.intel.com ([169.254.8.178]) with mapi id 14.03.0195.001; Fri, 10 Oct 2014 12:46:26 -0700 From: "Polehn, Mike A" To: Paul Eggert , "18681@debbugs.gnu.org" <18681@debbugs.gnu.org> Subject: RE: bug#18681: cp Specific fail example Thread-Topic: bug#18681: cp Specific fail example Thread-Index: Ac/ktEDcd/I3b6NmTXm5MFvYkoBcNgAPOH8AAAyQWRA= Date: Fri, 10 Oct 2014 19:46:25 +0000 Message-ID: <745DB4B8861F8E4B9849C970520ABBF148CAC0B9@ORSMSX102.amr.corp.intel.com> References: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com> <745DB4B8861F8E4B9849C970520ABBF148CAC028@ORSMSX102.amr.corp.intel.com> <543822BE.1030408@cs.ucla.edu> In-Reply-To: <543822BE.1030408@cs.ucla.edu> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.22.254.138] Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 MIME-Version: 1.0 X-Spam-Score: -6.0 (------) X-Debbugs-Envelope-To: 18681 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -6.0 (------) SGkgUGF1bCENCg0KVGhhbmsgeW91IGZvciB5b3VyIHF1aWNrIHJlc3BvbnNlIQ0KDQpZb3Ugd2Vy ZSBsb2dnZWQgaW4gYXMgYSBub3JtYWwgdXNlci4NCkkgd2FzIGxvZ2dlZCBpbiBhcyByb290Lg0K 
DQpJIHRyaWVkIGFzIG5vcm1hbCB1c2VyIGFuZCBpdCB3b3JrZWQgdGhlIHNhbWUgYXMgeW91Lg0K DQpIb3dldmVyLCBsb2dnZWQgaW4gYXMgcm9vdCBhbmQgdGhlIGVycm9yIG9jY3VycmVkIGFzIGJl Zm9yZS4NCg0KRGlkIGEgc2VhcmNoIGZvciAnY3AgLUknIGFuZCBmb3VuZCBpdCBmb3Igcm9vdDoN Cg0KW3Jvb3RARjIwLXYzIH5dIyBmaW5kIC9yb290IC10eXBlIGYgLXByaW50IHx4YXJncyBncmVw IC1IbiAiY3AgaSINCi9yb290Ly5iYXNocmM6NjphbGlhcyBjcD0nY3AgLWknDQovcm9vdC8uY3No cmM6NjphbGlhcyBjcCAnY3AgLWknDQovcm9vdC8udGNzaHJjOjY6YWxpYXMgY3AgJ2NwIC1pJw0K W3Jvb3RARjIwLXYzIH5dIyBmaW5kIC9ldGMgLXR5cGUgZiAtcHJpbnQgfHhhcmdzIGdyZXAgLUhu ICJjcCBpIg0KW3Jvb3RARjIwLXYzIH5dIyBmaW5kIC9ob21lL21pa2UgLXR5cGUgZiAtcHJpbnQg fHhhcmdzIGdyZXAgLUhuICJjcCBpIg0KL2hvbWUvbWlrZS9kcGRrLTEuNy4xL2V4YW1wbGVzL3Zo b3N0L21haW4uYzoyNDU2OiAgICAgICAgICAgICAgICJtYnVmX2Rlc3Ryb3lfemNwIGlzOiAlZFxu IiwNCi9ob21lL21pa2UvZHBkay0xLjcuMS9leGFtcGxlcy92aG9zdC9tYWluLmM6MjQ3NDogICAg ICAgICAgICAgICAibWJ1Zl9kZXN0cm95X3pjcCBpczogJWRcbiIsDQovaG9tZS9taWtlL2RwZGst MS43LjEvZXhhbXBsZXMvdmhvc3QvbWFpbi5jOjI0Nzg6ICAgICAgICAgICAgICAgIm1idWZfZGVz dHJveV96Y3AgaXMgOiAlZFxuIiwNCi9ob21lL21pa2UvZHBkay9leGFtcGxlcy92aG9zdC9tYWlu LmM6MjQ1NjogICAgICAgICAgICAgIm1idWZfZGVzdHJveV96Y3AgaXM6ICVkXG4iLA0KL2hvbWUv bWlrZS9kcGRrL2V4YW1wbGVzL3Zob3N0L21haW4uYzoyNDc0OiAgICAgICAgICAgICAibWJ1Zl9k ZXN0cm95X3pjcCBpczogJWRcbiIsDQovaG9tZS9taWtlL2RwZGsvZXhhbXBsZXMvdmhvc3QvbWFp bi5jOjI0Nzg6ICAgICAgICAgICAgICJtYnVmX2Rlc3Ryb3lfemNwIGlzIDogJWRcbiIsDQoNCkJ1 dCB0aGVyZSBpcyBzdGlsbCBhbiBlcnJvciBmb3IgaW50ZXJhY3RpdmU6DQoNCltyb290QEYyMC12 MyBzcmNdIyBjcCAtZiAtciBkcGRrLTEuNy4xLyogZHBkay8NCmNwOiBvdmVyd3JpdGUgw6JkcGRr L2FwcC90ZXN0L3Rlc3RfbHBtNi5jw6I/IHkNCmNwOiBvdmVyd3JpdGUgw6JkcGRrL2FwcC90ZXN0 L3Rlc3Rfcndsb2NrLmPDoj8geQ0KY3A6IG92ZXJ3cml0ZSDDomRwZGsvYXBwL3Rlc3QvdGVzdF90 YWJsZV9wb3J0cy5ow6I/IHkNCmNwOiBvdmVyd3JpdGUgw6JkcGRrL2FwcC90ZXN0L3Rlc3RfbG9n cy5jw6I/IHkNCmNwOiBvdmVyd3JpdGUgw6JkcGRrL2FwcC90ZXN0L3Rlc3RfcG1kX3JpbmcuY8Oi PyB5DQpjcDogb3ZlcndyaXRlIMOiZHBkay9hcHAvdGVzdC90ZXN0X3RhYmxlX3RhYmxlcy5ow6I/ 
DQpjcDogb3ZlcndyaXRlIMOiZHBkay9hcHAvdGVzdC90ZXN0X2xwbS5jw6I/DQpjcDogb3Zlcndy aXRlIMOiZHBkay9hcHAvdGVzdC90ZXN0X21hbGxvYy5jw6I/DQpjcDogb3ZlcndyaXRlIMOiZHBk ay9hcHAvdGVzdC90ZXN0X2Vycm5vLmPDoj8geQ0KY3A6IG92ZXJ3cml0ZSDDomRwZGsvYXBwL3Rl c3QvdGVzdF9oYXNoLmPDoj8NCmNwOiBvdmVyd3JpdGUgw6JkcGRrL2FwcC90ZXN0L3Rlc3RfdGFi bGVfYWNsLmjDoj8geQ0KDQpEaWRuJ3QgYW5zd2VyIHllcyBvciBubyBmb3Igc29tZSBvZiB0aGVz ZSBhbmQgdGhleSBtb3ZlZCBvbiBhbnl3YXksIGluZGljYXRpbmcgdGhlIGludGVyYWN0aXZlIG1v ZGUgaXMgbm90IG9wZXJhdGluZyBhcyBleHBlY3RlZC4NCg0KSXQgaXMgYSBnb29kIGlkZWEgYXMg cm9vdCBub3QgdG8gYmUgb3ZlcndyaXRpbmcgZmlsZXMsIHNvIEkgY2FuIHVuZGVyc3RhbmQgdGhl ICJjcCAtaSIgdXNhZ2UgZm9yIHJvb3QuDQoNCkhvd2V2ZXIsIHNvbWUgb2YgdGhlIHJlYXNvbiBm b3IgdXNpbmcgcm9vdCBpcyB0byBkbyBzb21ldGhpbmcgdGhhdCB5b3UgbWF5IG5vdCBiZSBhYmxl IHRvIGRvIGFzIGEgbm9ybWFsIHVzZXIuIFNvIGJlaW5nIGFibGUgdG8gb3ZlcnJpZGUgdGhlIC1p IHdpdGggYSAtZiB3b3VsZCBiZSBoaWdobHkgZGVzaXJhYmxlLg0KDQpNaWtlDQoNCi0tLS0tT3Jp Z2luYWwgTWVzc2FnZS0tLS0tDQpGcm9tOiBQYXVsIEVnZ2VydCBbbWFpbHRvOmVnZ2VydEBjcy51 Y2xhLmVkdV0gDQpTZW50OiBGcmlkYXksIE9jdG9iZXIgMTAsIDIwMTQgMTE6MTggQU0NClRvOiBQ b2xlaG4sIE1pa2UgQTsgMTg2ODFAZGViYnVncy5nbnUub3JnDQpTdWJqZWN0OiBSZTogYnVnIzE4 NjgxOiBjcCBTcGVjaWZpYyBmYWlsIGV4YW1wbGUNCg0KSSBkbyBub3Qgb2JzZXJ2ZSB0aGUgc3lt cHRvbXMgdGhhdCB5b3UgcmVwb3J0LiAgU2VlIGJlbG93LiAgTXkgZ3Vlc3MgaXMgdGhhdCB5b3Un dmUgYWxpYXNlZCAnY3AnIHRvICdjcCAtaScsIHdoaWNoIGlzIHByb2JhYmx5IGEgbWlzdGFrZS4N Cg0KJCBnaXQgY2xvbmUgZ2l0Oi8vZHBkay5vcmcvZHBkaw0KQ2xvbmluZyBpbnRvICdkcGRrJy4u Lg0KcmVtb3RlOiBDb3VudGluZyBvYmplY3RzOiAxNjI0OSwgZG9uZS4NCnJlbW90ZTogQ29tcHJl c3Npbmcgb2JqZWN0czogMTAwJSAoMzk3Ni8zOTc2KSwgZG9uZS4NCnJlbW90ZTogVG90YWwgMTYy NDkgKGRlbHRhIDEyOTY0KSwgcmV1c2VkIDE1MTA5IChkZWx0YSAxMjEyMikgUmVjZWl2aW5nIG9i amVjdHM6IDEwMCUgKDE2MjQ5LzE2MjQ5KSwgMTIuNzkgTWlCIHwgMS4yMCBNaUIvcywgZG9uZS4N ClJlc29sdmluZyBkZWx0YXM6IDEwMCUgKDEyOTY0LzEyOTY0KSwgZG9uZS4NCkNoZWNraW5nIGNv bm5lY3Rpdml0eS4uLiBkb25lLg0KJCBjZCBkcGRrDQokIGdpdCBjaGVja291dCAtYiBtYXBfdjEu 
Ny4xIHYxLjcuMQ0KU3dpdGNoZWQgdG8gYSBuZXcgYnJhbmNoICdtYXBfdjEuNy4xJw0KJCBwd2QN Ci90bXAvZC9kcGRrDQokIGNkIC4uDQokIHdnZXQgaHR0cDovL2RwZGsub3JnL2Jyb3dzZS9kcGRr L3NuYXBzaG90L2RwZGstMS43LjEudGFyLmd6DQotLTIwMTQtMTAtMTAgMTE6MTU6NDQtLSAgaHR0 cDovL2RwZGsub3JnL2Jyb3dzZS9kcGRrL3NuYXBzaG90L2RwZGstMS43LjEudGFyLmd6DQpSZXNv bHZpbmcgZHBkay5vcmcgKGRwZGsub3JnKS4uLiA5Mi4yNDMuMTQuMTI0IENvbm5lY3RpbmcgdG8g ZHBkay5vcmcgKGRwZGsub3JnKXw5Mi4yNDMuMTQuMTI0fDo4MC4uLiBjb25uZWN0ZWQuDQpIVFRQ IHJlcXVlc3Qgc2VudCwgYXdhaXRpbmcgcmVzcG9uc2UuLi4gMjAwIE9LDQpMZW5ndGg6IHVuc3Bl Y2lmaWVkIFthcHBsaWNhdGlvbi94LWd6aXBdIFNhdmluZyB0bzog4oCYZHBkay0xLjcuMS50YXIu Z3rigJkNCg0KICAgICBbICAgICAgICAgICAgICAgICAgIDw9PiAgICAgICAgICAgICAgICAgXSA4 LDI4MSw2MDkgICAxLjE3TUIvcyAgIGluIDcuNXMNCg0KMjAxNC0xMC0xMCAxMToxNTo1MiAoMS4w NiBNQi9zKSAtIOKAmGRwZGstMS43LjEudGFyLmd64oCZIHNhdmVkIFs4MjgxNjA5XQ0KDQokIHRh ciAteGYgZHBkay0xLjcuMS50YXIuZ3oNCiQgY3AgLWYgLXIgZHBkay0xLjcuMS8qIGRwZGsvDQok DQo= From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 10 16:01:50 2014 Received: (at 18681) by debbugs.gnu.org; 10 Oct 2014 20:01:50 +0000 Received: from localhost ([127.0.0.1]:40442 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XcgNp-0007L3-V2 for submit@debbugs.gnu.org; Fri, 10 Oct 2014 16:01:50 -0400 Received: from mga09.intel.com ([134.134.136.24]:17661) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XcgNm-0007Kq-Ps for 18681@debbugs.gnu.org; Fri, 10 Oct 2014 16:01:47 -0400 Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga102.jf.intel.com with ESMTP; 10 Oct 2014 12:55:21 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.04,694,1406617200"; d="scan'208";a="586929687" Received: from orsmsx103.amr.corp.intel.com ([10.22.225.130]) by orsmga001.jf.intel.com with ESMTP; 10 Oct 2014 13:00:53 -0700 Received: from orsmsx111.amr.corp.intel.com (10.22.240.12) by ORSMSX103.amr.corp.intel.com (10.22.225.130) with Microsoft SMTP Server (TLS) id 14.3.195.1; Fri, 10 Oct 2014 13:00:53 -0700 Received: 
from orsmsx102.amr.corp.intel.com ([169.254.1.8]) by ORSMSX111.amr.corp.intel.com ([169.254.11.167]) with mapi id 14.03.0195.001; Fri, 10 Oct 2014 13:00:53 -0700 From: "Polehn, Mike A" To: Assaf Gordon , "18681@debbugs.gnu.org" <18681@debbugs.gnu.org> Subject: RE: bug#18681: The Linux cp command has bugs Thread-Topic: bug#18681: The Linux cp command has bugs Thread-Index: Ac/kryqklasbsNZsTlSHbqZfRZOUTQASdDgAAA16ixA= Date: Fri, 10 Oct 2014 20:00:52 +0000 Message-ID: <745DB4B8861F8E4B9849C970520ABBF148CAC0EA@ORSMSX102.amr.corp.intel.com> References: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com> <54382FE8.5030904@gmail.com> In-Reply-To: <54382FE8.5030904@gmail.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.22.254.138] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Spam-Score: -6.0 (------) X-Debbugs-Envelope-To: 18681 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -6.0 (------) Hi Assaf! Thank you for your quick response! There is always multiple ways to do things. The git tool has a diff tool bu= ilt in that makes file comparison easy. I have run across multiple times that copying one tree over another is desi= rable. In another bug message thread, we found that the cause was cp alias to 'cp = -i' for root user was the actual cause. This still left the incorrect operation of the interactive operation when b= oth -i and -f is used. I think that in some cases the need of override the '-i' with '-f' maybe ve= ry desirable. So maybe having the '-f' cancel or override the '-i' operatio= n might be a good change. Thanks! 
Mike -----Original Message----- From: Assaf Gordon [mailto:assafgordon@gmail.com]=20 Sent: Friday, October 10, 2014 12:14 PM To: Polehn, Mike A; 18681@debbugs.gnu.org Subject: Re: bug#18681: The Linux cp command has bugs Hello Mike, On 10/10/2014 01:25 PM, Polehn, Mike A wrote:> > Problem need to copy a tree of 1000s of files to another directory=20 > that is a git directory that has a whole bunch of additional build=20 > files, so diff between the directories will not do any good. > This is slightly off-topic, but if you want to compare only files managed b= y git (ignoring other files in current directory), perhaps the following wo= uld help: # Download and extract the tarball wget -q http://dpdk.org/browse/dpdk/snapshot/dpdk-1.7.1.tar.gz tar -xf dpdk-1.7.1.tar.gz # Clone the git repo with specific branch, checkout the relevant branc= h # (or go to an existing checked-out repository directory) git clone git://dpdk.org/dpdk cd dpdk git checkout -b map_v1.7.1 v1.7.1 # For each file managed by git (with 'git ls'), # compare it to the corresponding file in the other directory: git ls -0 | xargs -0 -I% diff -q % ../dpdk-1.7.1/% Regards, -gordon From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 10 16:55:04 2014 Received: (at 18681) by debbugs.gnu.org; 10 Oct 2014 20:55:05 +0000 Received: from localhost ([127.0.0.1]:40476 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XchDM-0000IE-4p for submit@debbugs.gnu.org; Fri, 10 Oct 2014 16:55:04 -0400 Received: from mx1.redhat.com ([209.132.183.28]:25745) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XchDJ-0000Hl-43 for 18681@debbugs.gnu.org; Fri, 10 Oct 2014 16:55:02 -0400 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s9AKsx2w020542 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Fri, 10 Oct 2014 16:54:59 -0400 Received: from 
[10.3.113.90] (ovpn-113-90.phx2.redhat.com [10.3.113.90]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s9AKsxjE007933; Fri, 10 Oct 2014 16:54:59 -0400 Message-ID: <543847A3.9050009@redhat.com> Date: Fri, 10 Oct 2014 14:54:59 -0600 From: Eric Blake Organization: Red Hat, Inc. User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.1.1 MIME-Version: 1.0 To: "Polehn, Mike A" , Assaf Gordon , "18681@debbugs.gnu.org" <18681@debbugs.gnu.org> Subject: Re: bug#18681: The Linux cp command has bugs References: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com> <54382FE8.5030904@gmail.com> <745DB4B8861F8E4B9849C970520ABBF148CAC0EA@ORSMSX102.amr.corp.intel.com> In-Reply-To: <745DB4B8861F8E4B9849C970520ABBF148CAC0EA@ORSMSX102.amr.corp.intel.com> OpenPGP: url=http://people.redhat.com/eblake/eblake.gpg Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="KuU5Fc4amWse5ar6wKBpmJig8Q33X3JLf" X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22 X-Spam-Score: -6.0 (------) X-Debbugs-Envelope-To: 18681 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -6.0 (------) This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --KuU5Fc4amWse5ar6wKBpmJig8Q33X3JLf Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On 10/10/2014 02:00 PM, Polehn, Mike A wrote: > This still left the incorrect operation of the interactive operation wh= en both -i and -f is used. The behavior of -i vs. 
-f interaction is required by POSIX; in particular, POSIX is explicit that -i and -f are NOT a toggle switch of one another, but each turns on slightly different, somewhat overlapping, changes in behavior (so specifying both is different from specifying one in isolation). We can't change what either one of those flags means. If there is another mode of operation that is also useful, then it needs yet another flag.

At one point in the past, we had --reply={yes,no,query} to try and offer a third mode, but it had confusing semantics and we ended up pulling it because of the confusion it could cause. At the time we pulled it, we admitted that 'rsync' has some modes of operation that might be better suited for the particular modes that people seemed to be requesting when they thought that --reply would do the trick (and usually, what they thought --reply would do and what it actually did were different, which is why we removed it to avoid confusion). We have also added a --no-clobber option, which is somewhat of a compromise (what some people thought --reply=no would do, --no-clobber actually does better).

So adding a new option is not out of the question, but you'd have to have well-defined semantics of what it should do, and how it differs from either normal mode, '-i' mode, '-f' mode, '-i -f' mode, or '--no-clobber' mode.
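
To make the difference between these modes concrete, here is a minimal sketch, assuming GNU cp and a throwaway directory. The file names (src, dst_if, dst_f, dst_n) are invented for illustration and do not come from the thread, and the exit status of a declined or skipped copy is ignored because it differs between coreutils releases.

```shell
# Scratch area; every path below is hypothetical.
work=$(mktemp -d)
echo new > "$work/src"

# -i together with -f: cp still asks. Answering "n" on stdin keeps the old file.
echo old > "$work/dst_if"
printf 'n\n' | cp -i -f "$work/src" "$work/dst_if" 2>/dev/null || true

# -f alone: no prompt, the existing destination is overwritten.
echo old > "$work/dst_f"
cp -f "$work/src" "$work/dst_f"

# --no-clobber: never prompts and never overwrites an existing file.
echo old > "$work/dst_n"
cp --no-clobber "$work/src" "$work/dst_n" 2>/dev/null || true

# Show what each destination holds afterwards.
for f in dst_if dst_f dst_n; do
  printf '%s: %s\n' "$f" "$(cat "$work/$f")"
done
```

Only the plain -f case should end up with the new contents; the other two leave the destination untouched, which is the overlap Eric describes.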
--=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --KuU5Fc4amWse5ar6wKBpmJig8Q33X3JLf Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg iQEcBAEBCAAGBQJUOEejAAoJEKeha0olJ0NqKJoH/2RD9wm3l+f/hhvUaMJGbp2k SJDG9SNmevsxagEDbejLzP6Ysj2TWWWMNEt/ly+LlZ8IgZlTFQMCQWFfZ/Wue74e aHCTnnEMIK1uqi5Cip5zBkQV0mCk2hgVgRzf7gfKK3ulPlEFz7ISkJKMtv65vMBc L3KtT0XUPo9/61GFfrv6zkmEI5DTDHg+jNsV18QkCoy1xL4TzsLyTr/EzfVluuqU IsU4+1WC4ouP62DKKOxU6/dabffDCHtZeYX5/weV3vbZ0t9O/Hiv0kQtzQ3Sj9WX oEKvPNxDiBSORwFJLYyuLAAoJsdDpXUSRMx5pPhkcpgmlkRQRG3lXHddQLHwpKI= =pc5M -----END PGP SIGNATURE----- --KuU5Fc4amWse5ar6wKBpmJig8Q33X3JLf-- From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 10 19:36:20 2014 Received: (at 18681) by debbugs.gnu.org; 10 Oct 2014 23:36:20 +0000 Received: from localhost ([127.0.0.1]:40543 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XcjjQ-0005ni-1K for submit@debbugs.gnu.org; Fri, 10 Oct 2014 19:36:20 -0400 Received: from joseki.proulx.com ([216.17.153.58]:42763) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XcjjN-0005nZ-DC for 18681@debbugs.gnu.org; Fri, 10 Oct 2014 19:36:18 -0400 Received: from hysteria.proulx.com (hysteria.proulx.com [192.168.230.119]) by joseki.proulx.com (Postfix) with ESMTP id 5DE5B21229; Fri, 10 Oct 2014 17:36:16 -0600 (MDT) Received: by hysteria.proulx.com (Postfix, from userid 1000) id 42DF22DC4F; Fri, 10 Oct 2014 17:36:16 -0600 (MDT) Date: Fri, 10 Oct 2014 17:36:16 -0600 From: Bob Proulx To: "Polehn, Mike A" Subject: Re: bug#18681: cp Specific fail example Message-ID: <20141010172253719078437@bob.proulx.com> References: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com> 
<745DB4B8861F8E4B9849C970520ABBF148CAC028@ORSMSX102.amr.corp.intel.com> <543822BE.1030408@cs.ucla.edu> <745DB4B8861F8E4B9849C970520ABBF148CAC0B9@ORSMSX102.amr.corp.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <745DB4B8861F8E4B9849C970520ABBF148CAC0B9@ORSMSX102.amr.corp.intel.com> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Score: -1.0 (-) X-Debbugs-Envelope-To: 18681 Cc: 18681@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

Polehn, Mike A wrote:
> Did a search for 'cp -i' and found it for root:
>
> [root@F20-v3 ~]# find /root -type f -print |xargs grep -Hn "cp -i"
> /root/.bashrc:6:alias cp='cp -i'
> /root/.cshrc:6:alias cp 'cp -i'
> /root/.tcshrc:6:alias cp 'cp -i'

It might be easier to guess that there is an alias and look for it. :-)

  # alias cp
  alias cp='cp -i'
  # type cp
  cp is aliased to `cp -i'

> But there is still an error for interactive:
>
> [root@F20-v3 src]# cp -f -r dpdk-1.7.1/* dpdk/

Since you know that "cp" in the above is "cp -i" then you know the command is actually "cp -i -f -r dpdk-1.7.1/* dpdk/" which you don't want there. Try it without the alias in play.

The normal way in a /bin/sh derived environment is to simply quote the command. If you quote the command then it won't do alias expansion. The usual method of quoting is with a backslash.

  # \cp -f -r dpdk-1.7.1/* dpdk/

However the canonical method is to use "env" since the above doesn't work in csh derived shells. Therefore you will find suggestions to use env to wrap the command and avoid alias expansion like this. It is often offered when we don't know if you are using a sh or csh derived command line shell. (This env trick is one I learned on this list some years ago.)

  # env cp -f -r dpdk-1.7.1/* dpdk/

And of course you can always unalias the command too.

  # unalias cp

> It is a good idea as root not to be overwriting files, so I can
> understand the "cp -i" usage for root.

Personally I simply realize that the tools are sharp kitchen knives and I always handle sharp kitchen knives carefully. Trying to put safety shields on them simply gets in the way. It tends to cause problems such as you are seeing here. I usually remove those aliases on systems I administer.

> However, some of the reason for using root is to do something that
> you may not be able to do as a normal user. So being able to
> override the -i with a -f would be highly desirable.

Right. And you can. You have the power. Just do it. Avoid the alias with \cp (or the env trick) and then you won't have the -i in play. Or remove the alias from the environment.

There is the burden upon the root superuser that they have great power. With great power comes great responsibility. Being root means you are a pilot not a passenger. There is an old saying in flying, "Fly the airplane. Don't let the airplane fly you." Hopefully the meaning is obvious even to the non-pilot.

Meanwhile... I would be one of those suggesting that perhaps you should try using rsync instead of cp. The cp command is lean and mean by comparison to rsync (and should stay that way). But rsync has many attractive features for doing large copies.
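
As a rough sketch of the alias-avoidance forms mentioned in this thread, the snippet below sets up the same 'cp -i' alias and then overwrites files without being prompted. It assumes a sh-style shell such as bash; the scratch directory and the dst_a/dst_b/dst_c names are hypothetical. Standard input is redirected from /dev/null so that any stray prompt would be declined rather than hang.

```shell
# Hypothetical scratch files for illustration only.
work=$(mktemp -d)
echo new > "$work/src"
for name in a b c; do echo old > "$work/dst_$name"; done

# Mimic the distribution default that caused the prompts.
# (Non-interactive bash expands aliases only when expand_aliases is set.)
shopt -s expand_aliases 2>/dev/null || true
alias cp='cp -i'

# Each form below skips alias expansion, so no -i is added.
\cp -f "$work/src" "$work/dst_a" </dev/null         # quoted command name
env cp -f "$work/src" "$work/dst_b" </dev/null      # env runs the cp found in PATH
command cp -f "$work/src" "$work/dst_c" </dev/null  # "command" is not itself aliased

# Remove the alias again so later commands are unaffected.
unalias cp

cat "$work/dst_a" "$work/dst_b" "$work/dst_c"
```

All three destinations should now hold the new contents, since none of the three invocations ever saw the -i option.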
Bob From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 10 19:49:03 2014 Received: (at 18681) by debbugs.gnu.org; 10 Oct 2014 23:49:03 +0000 Received: from localhost ([127.0.0.1]:40612 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Xcjvi-0006AO-RR for submit@debbugs.gnu.org; Fri, 10 Oct 2014 19:49:03 -0400 Received: from mailgw01.kcn.ne.jp ([61.86.7.208]:41487) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Xcjvf-00069w-TM for 18681@debbugs.gnu.org; Fri, 10 Oct 2014 19:49:01 -0400 Received: from imp02 (mailgw6.kcn.ne.jp [61.86.15.232]) by mailgw01.kcn.ne.jp (Postfix) with ESMTP id 63C407FE25 for <18681@debbugs.gnu.org>; Sat, 11 Oct 2014 08:48:55 +0900 (JST) Received: from mail09.kcn.ne.jp ([61.86.6.188]) by imp02 with bizsmtp id 1bov1p00843QJrh01bovQD; Sat, 11 Oct 2014 08:48:55 +0900 X-OrgRCPT: 18681@debbugs.gnu.org Received: from [10.120.1.60] (i118-21-128-66.s30.a048.ap.plala.or.jp [118.21.128.66]) by mail09.kcn.ne.jp (Postfix) with ESMTPA id 1CDED1BD00C3; Sat, 11 Oct 2014 08:48:55 +0900 (JST) Date: Sat, 11 Oct 2014 08:48:54 +0900 From: Norihiro Tanaka To: "Polehn, Mike A" Subject: Re: bug#18681: The Linux cp command has bugs In-Reply-To: <745DB4B8861F8E4B9849C970520ABBF148CAC0EA@ORSMSX102.amr.corp.intel.com> References: <54382FE8.5030904@gmail.com> <745DB4B8861F8E4B9849C970520ABBF148CAC0EA@ORSMSX102.amr.corp.intel.com> Message-Id: <20141011084853.46A5.27F6AC2D@kcn.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Mailer: Becky! ver. 
2.65.07 [ja] X-Spam-Score: -1.0 (-) X-Debbugs-Envelope-To: 18681 Cc: "18681@debbugs.gnu.org" <18681@debbugs.gnu.org>, Assaf Gordon X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

Hi Polehn,

The -f option does not mean `suppress interactive' in cp. It attempts to unlink a destination file that cannot be opened for writing, and then tries again. That is different from the -f option in mv. As the behavior is clearly defined in POSIX, as Eric says, we won't be able to change it.

BTW, I don't like the alias `cp -i', so I always remove it from .bashrc immediately after installing a distribution. (^_^)

If you want to cancel the alias temporarily, you can define another alias such as `cpf', and/or use one of the following instead of `cp':

  - command cp -f
  - /bin/cp -f
  - ( unalias cp; cp -f ... )

Even if a new option `-F' were added to cp to suppress interactive prompting, we would need to use -F for cp and -f for mv to do it.
From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 10 20:38:32 2014 Received: (at 18681) by debbugs.gnu.org; 11 Oct 2014 00:38:32 +0000 Received: from localhost ([127.0.0.1]:40633 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Xckhb-0007Vi-1U for submit@debbugs.gnu.org; Fri, 10 Oct 2014 20:38:31 -0400 Received: from mail-ig0-f180.google.com ([209.85.213.180]:58844) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XckhX-0007VW-BV for 18681@debbugs.gnu.org; Fri, 10 Oct 2014 20:38:28 -0400 Received: by mail-ig0-f180.google.com with SMTP id uq10so4752427igb.13 for <18681@debbugs.gnu.org>; Fri, 10 Oct 2014 17:38:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=2p/qLjlAGqPIzhV3MpKrbPdOjtv+knABpxVGCPKmiVA=; b=Zv58QMwY+XrzhPTn1Xvw1MCs1srpfiAAuesWVYFEu3zun6/h1f5dg4utjVx6jwSwAD EIcHJWPssQyYCO8/8mdlhMaEVi118EY8Ehq1tGOKkkTz/EbrgEGjadfT8c9HiB3E0OQ2 hnRuC2ikvZzDZvNtVu1pBQfyCcT8c8EVaMEF6LCg4nkfhvINRxjd3SZji8XhHomVG+8n lO1403KWJ5jYc0XVXDIEP+ZSL6NbE2Uxp9qmUkW+TfC5u65Rs93RVRTwu1SZNR12fDm2 g7BYBYQfBVVmmigmpCS57fOTCVDLldj3dqdg2z0AVN3jXwvx/l+Ww+iTuLuN1Bxh/iN0 OEeA== X-Received: by 10.50.142.97 with SMTP id rv1mr11591465igb.11.1412987906658; Fri, 10 Oct 2014 17:38:26 -0700 (PDT) MIME-Version: 1.0 Received: by 10.64.110.228 with HTTP; Fri, 10 Oct 2014 17:38:06 -0700 (PDT) In-Reply-To: <20141011084853.46A5.27F6AC2D@kcn.ne.jp> References: <54382FE8.5030904@gmail.com> <745DB4B8861F8E4B9849C970520ABBF148CAC0EA@ORSMSX102.amr.corp.intel.com> <20141011084853.46A5.27F6AC2D@kcn.ne.jp> From: Jon Stanley Date: Fri, 10 Oct 2014 20:38:06 -0400 Message-ID: Subject: Re: bug#18681: The Linux cp command has bugs To: Norihiro Tanaka Content-Type: text/plain; charset=UTF-8 X-Spam-Score: -0.7 (/) X-Debbugs-Envelope-To: 18681 Cc: "18681@debbugs.gnu.org" <18681@debbugs.gnu.org>, Assaf Gordon , "Polehn, Mike A" X-BeenThere: 
debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/) On Fri, Oct 10, 2014 at 7:48 PM, Norihiro Tanaka wrote: > If you temporarily want to cancel the the alias, you can define an another > alias as `cpf', and/or can use below instead of `cp' Note that (in bash at least) you can prefix the command with a backslash (\) to override an alias for that invocation, and is what I typically do: $ \cp From debbugs-submit-bounces@debbugs.gnu.org Sat Oct 11 13:27:11 2014 Received: (at 18681) by debbugs.gnu.org; 11 Oct 2014 17:27:11 +0000 Received: from localhost ([127.0.0.1]:41268 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Xd0Ri-0001na-Ky for submit@debbugs.gnu.org; Sat, 11 Oct 2014 13:27:11 -0400 Received: from ishtar.tlinx.org ([173.164.175.65]:40531 helo=Ishtar.hs.tlinx.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Xcsvf-0004ZW-9E for 18681@debbugs.gnu.org; Sat, 11 Oct 2014 05:25:36 -0400 Received: from [192.168.4.12] (Athenae [192.168.4.12]) by Ishtar.hs.tlinx.org (8.14.7/8.14.4/SuSE Linux 0.8) with ESMTP id s9B9OW8u043237; Sat, 11 Oct 2014 02:24:35 -0700 Message-ID: <5438F74F.70301@tlinx.org> Date: Sat, 11 Oct 2014 02:24:31 -0700 From: Linda Walsh User-Agent: Thunderbird MIME-Version: 1.0 To: Bob Proulx Subject: Re: bug#18681: cp Specific fail example References: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com> <745DB4B8861F8E4B9849C970520ABBF148CAC028@ORSMSX102.amr.corp.intel.com> <543822BE.1030408@cs.ucla.edu> <745DB4B8861F8E4B9849C970520ABBF148CAC0B9@ORSMSX102.amr.corp.intel.com> <20141010172253719078437@bob.proulx.com> In-Reply-To: <20141010172253719078437@bob.proulx.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -1.0 (-) 
X-Debbugs-Envelope-To: 18681 X-Mailman-Approved-At: Sat, 11 Oct 2014 13:27:08 -0400 Cc: 18681@debbugs.gnu.org, "Polehn, Mike A" X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) Bob Proulx wrote: > Meanwhile... I would be one of those suggesting that perhaps you > should try using rsync instead of cp. The cp command is lean and mean > by comparison to rsync (and should stay that way). But rsync has many > attractive features for doing large copies. > ---- fwiw...--- Like large execution times... from the latest snapshot on my system -- I use rsync to only move differences between yesterday and "today[whenever new snap is taken]"... it was a larger than normal snap -- most only take 75-90 minutes...but rsync (these are the script messages) with some debugging output still turned on... even an rm over the resulting diff took 101 seconds... then cp comes along.. even w/a sync it would still be under a minute. I.e. rsync copied just the diffs to "/home.diff", then find with "-empty -delete" is used to get rid of empty dirs (rsync creates many of these). then a static partition is created to hold the "diff" output -- and cp took walked and copied the tree in 12s. (output wasn't flushed, but it's not that long.. 141008030705, /dev/Data/Home-2014.10.08-03.07.05=>CODE(0xbf24a0), f=>CODE(0xbf24e8), d=>{su=>"64k", sw=>1}, i=>{maxpct=>10, size=>256}, s=>{size=>4096}} About to copy base-diff dir to static Copying diffs to dated static snap...Time: 0m, 12s. 
mklabel@ /home/.snapdir/@GMT-2014.10.08-03.07.05/./._snapdat_=snap_copy_complete after copy2staticsnap: complete From debbugs-submit-bounces@debbugs.gnu.org Sun Oct 12 20:54:08 2014 Received: (at 18681) by debbugs.gnu.org; 13 Oct 2014 00:54:08 +0000 Received: from localhost ([127.0.0.1]:42274 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XdTtn-0003ng-TB for submit@debbugs.gnu.org; Sun, 12 Oct 2014 20:54:08 -0400 Received: from joseki.proulx.com ([216.17.153.58]:52715) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XdTtl-0003nZ-DR for 18681@debbugs.gnu.org; Sun, 12 Oct 2014 20:54:06 -0400 Received: from hysteria.proulx.com (hysteria.proulx.com [192.168.230.119]) by joseki.proulx.com (Postfix) with ESMTP id 0A43721229; Sun, 12 Oct 2014 18:54:04 -0600 (MDT) Received: by hysteria.proulx.com (Postfix, from userid 1000) id C543B2DC48; Sun, 12 Oct 2014 18:54:03 -0600 (MDT) Date: Sun, 12 Oct 2014 18:54:03 -0600 From: Bob Proulx To: Linda Walsh Subject: Re: bug#18681: cp Specific fail example Message-ID: <20141012182228828257097@bob.proulx.com> Mail-Followup-To: 18681@debbugs.gnu.org, "Polehn, Mike A" References: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com> <745DB4B8861F8E4B9849C970520ABBF148CAC028@ORSMSX102.amr.corp.intel.com> <543822BE.1030408@cs.ucla.edu> <745DB4B8861F8E4B9849C970520ABBF148CAC0B9@ORSMSX102.amr.corp.intel.com> <20141010172253719078437@bob.proulx.com> <5438F74F.70301@tlinx.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5438F74F.70301@tlinx.org> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Score: -0.0 (/) X-Debbugs-Envelope-To: 18681 Cc: 18681@debbugs.gnu.org, "Polehn, Mike A" X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: 18681@debbugs.gnu.org, "Polehn, Mike A" List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) Linda Walsh wrote: > Bob Proulx wrote: > > Meanwhile... I would be one of those suggesting that perhaps you > > should try using rsync instead of cp. The cp command is lean and > > mean by comparison to rsync (and should stay that way). But rsync > > has many attractive features for doing large copies. > > ---- fwiw...--- > Like large execution times... from the latest snapshot on my system -- > I use rsync to only move differences between yesterday and "today[whenever > new snap is taken]"... it was a larger than normal snap -- most only > take 75-90 minutes...but rsync (these are the script messages) with some > debugging output still turned on... even an rm over the resulting diff > took 101 seconds... then cp comes along.. even w/a sync it would > still be under a minute. Wow. Just to be clear an rsync copy took 75 to 90 minutes but a cp copy took less than 1 minute? I find that very suspicious. I never see that much difference between them. Are you sure the difference wasn't that the data was cached into ram by the rsync and therefore the second run with cp just ran with the warmed up cache? With a large data set and a large ram that is plausible. > I.e. rsync copied just the diffs to "/home.diff", then > find with "-empty -delete" is used to get rid of empty dirs (rsync > creates many of these). then a static partition is created to hold > the "diff" output -- and cp took walked and copied the tree in 12s. > (output wasn't flushed, but it's not that long.. If rsync wasn't so slow at local I/O...*sigh*.... The advantage of rsync is that it can be interrupted and restarted and the restarted process will efficiently avoid doing work that is already done. An interrupted and restarted cp will perform the same work again from start to finish. If I am doing a simple copy from A to B then I use 'cp -av A B'. 
If I am doing it the second time then I will use rsync to avoid repeating previously done work 'rsync -av A B'.

If I want progress indication... If I want placement of backup files in a particular directory... If I want other fancy features that are provided by rsync then it is worth it to use rsync.

  $ du -s coreutils
  238920  coreutils
  $ find coreutils -type f | wc -l
  15013

  $ rm -rf junk/coreutils
  # echo 3 > /proc/sys/vm/drop_caches
  $ time cp -a coreutils junk/
  real    1m2.137s
  user    0m0.140s
  sys     0m1.724s

  $ rm -rf junk/coreutils
  $ time cp -a coreutils junk/
  real    0m2.492s
  user    0m0.060s
  sys     0m1.064s

  $ rm -rf junk/coreutils
  # echo 3 > /proc/sys/vm/drop_caches
  $ time rsync -a coreutils junk/
  real    1m5.473s
  user    0m1.280s
  sys     0m2.112s

  $ rm -rf junk/coreutils
  $ time rsync -a coreutils junk/
  real    0m3.215s
  user    0m1.184s
  sys     0m1.536s

For normal use cp is a little faster than rsync. Or rather rsync is a little slower than cp. But not enough to make a difference for typical operations. Having the file system cache warmed up makes a *HUGE* difference. Much larger than any other difference. For copies that take hours to run I am probably going to value the restart ability more than raw speed. YMMV.
Bob From debbugs-submit-bounces@debbugs.gnu.org Sun Oct 12 22:14:39 2014 Received: (at 18681) by debbugs.gnu.org; 13 Oct 2014 02:14:39 +0000 Received: from localhost ([127.0.0.1]:42306 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XdV9h-00065s-VF for submit@debbugs.gnu.org; Sun, 12 Oct 2014 22:14:38 -0400 Received: from nm33-vm3.bullet.mail.gq1.yahoo.com ([98.136.216.242]:50259) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XdV9e-00065i-Ac for 18681@debbugs.gnu.org; Sun, 12 Oct 2014 22:14:35 -0400 Received: from [127.0.0.1] by nm33.bullet.mail.gq1.yahoo.com with NNFMP; 13 Oct 2014 02:14:33 -0000 Received: from [98.137.12.60] by nm33.bullet.mail.gq1.yahoo.com with NNFMP; 13 Oct 2014 02:11:46 -0000 Received: from [98.139.212.151] by tm5.bullet.mail.gq1.yahoo.com with NNFMP; 13 Oct 2014 02:11:45 -0000 Received: from [98.139.212.213] by tm8.bullet.mail.bf1.yahoo.com with NNFMP; 13 Oct 2014 02:11:45 -0000 Received: from [127.0.0.1] by omp1022.mail.bf1.yahoo.com with NNFMP; 13 Oct 2014 02:11:45 -0000 X-Yahoo-Newman-Property: ymail-4 X-Yahoo-Newman-Id: 771793.65049.bm@omp1022.mail.bf1.yahoo.com Received: (qmail 44546 invoked by uid 60001); 13 Oct 2014 02:11:45 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1413166305; bh=dekp3CUUr9DwFqStsMqfDE9D7oGFAWNwoOmynmpiaWk=; h=References:Message-ID:Date:From:Reply-To:Subject:To:In-Reply-To:MIME-Version:Content-Type; b=Y41j9xcGTDlqUzlGuaRM3lak3jeTU97tB/ItIk2ju1ypjU0esat24FSHv6uxFGaacfCKCfU0PZBNgbH2PFYSU3C44Nwv1ezQIbAe2bt+1kGNuxbAOmnQ2Jg+6AMKZAuvPi19tm73JRf3mc9XGYEF3udXyNUQel0S35+JrD769zA= X-YMail-OSG: 6HI34ycVM1nUdRy_nxcVtJ0TGnLRWsWUI4T9fJReQBaeN2K KwvK5fqHQerY5cLPyvcYXHTQJs6LvXQ0GOYSE1gtCrVVTEtdId3jNdsvWefB oU_B8l1I6ri.qTy54HVSN9CmNqdrpgqaAWYfRYfQSgqquuUE5IQ2HdwG91EL Lc9ZVj_UfTAlcah33aXTrzC0b8YD_fwIUju7CDungzaTL2CmHwJyHtSYtuoF nYF2mJCh2mZQl2e2yZTr5Xcm9JUIyvTA1rytg9q8unrtvEulrWV4RqMXVlUJ 
bCfBqm7SkdiqoQpB.LrAJkYPTyOlt7aUGwSdZ83oLFDx7oPaSwTqHKqaam66 6ZIxLKTgm_xgBL3KwOKQuA7WXLjgk0avDflKZ4eb88S1asbi_zJY9H.kTxoK JfUtu6OWahg9Y1OyF4oDeGsCPRjyrT0J4.jrTV8MQZkOrjTE0F0U5pNZBwjc rSABCkvJEb7CmMqze.1x0W_nTHANSE.k25vCKf7zezcsq7sFj.YgwSyasrmR I_x9thyPSGtEA5rirNsvuaA0Is2DsiD8anhG7umdevdoTMQf3dz5BKrMFffJ _B6cd4mayKm6wPyRj5ByWM5s- Received: from [70.27.253.89] by web142603.mail.bf1.yahoo.com via HTTP; Sun, 12 Oct 2014 19:11:45 PDT X-Rocket-MIMEInfo: 002.001, RnVydGhlciB0byBCb2IncyBleHBsYW5hdGlvbiwKSWYgeW91IHdlcmUgdG8gY29weSBhIDEwZ2lnIGZpbGUgYWNyb3NzIHRoZSBpbnRlcm5ldC4gY3Agd291bGQgd29yayBqdXN0IGZpbmUgYW5kIGNvdWxkIHRha2Ugc2V2ZXJhbCBob3Vycy4gIEJ1dCBzdXBwb3NlIHRoZXJlIHdhcyBhbiBlcnJvciBpbiB0aGUgdHJhbnNtaXNzaW9uIChiYWQgYmxvY2spIG9yIHlvdSBoYWQgdG8gc3RvcCBhbmQgcmVzdGFydC4geW91IHdvdWxkIG5lZWQgdG8gcmVkbyBjcCBhbmQgY29weSB0aGUgZmlsZSBmcm9tIHRoZSBiZWcBMAEBAQE- X-Mailer: YahooMailWebService/0.8.203.696 References: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com> <745DB4B8861F8E4B9849C970520ABBF148CAC028@ORSMSX102.amr.corp.intel.com> <543822BE.1030408@cs.ucla.edu> <745DB4B8861F8E4B9849C970520ABBF148CAC0B9@ORSMSX102.amr.corp.intel.com> <20141010172253719078437@bob.proulx.com> <5438F74F.70301@tlinx.org> <20141012182228828257097@bob.proulx.com> Message-ID: <1413166305.24386.YahooMailNeo@web142603.mail.bf1.yahoo.com> Date: Sun, 12 Oct 2014 19:11:45 -0700 From: Leslie S Satenstein Subject: Re: bug#18681: cp Specific fail example To: "18681@debbugs.gnu.org" <18681@debbugs.gnu.org>, "Polehn, Mike A" In-Reply-To: <20141012182228828257097@bob.proulx.com> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="-908725958-316188267-1413166305=:24386" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 18681 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list Reply-To: Leslie S Satenstein List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: 
"Debbugs-submit" X-Spam-Score: 0.0 (/) ---908725958-316188267-1413166305=:24386 Content-Type: text/plain; charset=us-ascii Further to Bob's explanation, If you were to copy a 10gig file across the internet. cp would work just fine and could take several hours. But suppose there was an error in the transmission (bad block) or you had to stop and restart. you would need to redo cp and copy the file from the beginning. Rsync would take a checksum of the parts of the file on the remote, and compare it to the host. It would restart at the first detected bad file offset. Regards Leslie Mr. Leslie Satenstein Montreal, Quebec, Canada >________________________________ > From: Bob Proulx >To: Linda Walsh >Cc: 18681@debbugs.gnu.org; "Polehn, Mike A" >Sent: Sunday, October 12, 2014 8:54 PM >Subject: bug#18681: cp Specific fail example > > >Linda Walsh wrote: >> Bob Proulx wrote: >> > Meanwhile... I would be one of those suggesting that perhaps you >> > should try using rsync instead of cp. The cp command is lean and >> > mean by comparison to rsync (and should stay that way). But rsync >> > has many attractive features for doing large copies. >> >> ---- fwiw...--- >> Like large execution times... from the latest snapshot on my system -- >> I use rsync to only move differences between yesterday and "today[whenever >> new snap is taken]"... it was a larger than normal snap -- most only >> take 75-90 minutes...but rsync (these are the script messages) with some >> debugging output still turned on... even an rm over the resulting diff >> took 101 seconds... then cp comes along.. even w/a sync it would >> still be under a minute. > >Wow. Just to be clear an rsync copy took 75 to 90 minutes but a cp >copy took less than 1 minute? I find that very suspicious. I never >see that much difference between them. Are you sure the difference >wasn't that the data was cached into ram by the rsync and therefore >the second run with cp just ran with the warmed up cache? 
With a >large data set and a large ram that is plausible. > >> I.e. rsync copied just the diffs to "/home.diff", then >> find with "-empty -delete" is used to get rid of empty dirs (rsync >> creates many of these). then a static partition is created to hold >> the "diff" output -- and cp took walked and copied the tree in 12s. >> (output wasn't flushed, but it's not that long.. >It appears that you are using features from rsync that do not exist in >cp. Therefore the work being done in the task isn't equivalent work. >In that case it is probably quite reasonable for rsync to be slower >than cp. > >Also consider that if cp were to acquire all of the enhancements that >have been requested for cp as time has gone by then cp would be just >as featureful (bloated!) as rsync and likely just as slow as rsync >too. This is something to consider every time someone asks for a >creeping feature to cp. Especially if they say they want the feature >in cp because it is faster than rsync. The natural progression is >that cp would become rsync. > >> If rsync wasn't so slow at local I/O...*sigh*.... > >The advantage of rsync is that it can be interrupted and restarted and >the restarted process will efficiently avoid doing work that is >already done. An interrupted and restarted cp will perform the same >work again from start to finish. > >If I am doing a simple copy from A to B then I use 'cp -av A B'. If I >am doing it the second time then I will use rsync to avoid repeating >previously done work 'rsync -av A B'. > >If I want progress indication... If I want placement of backup files >in a particular directory... If I want other fancy features that are >provided by rsync then it is worth it to use rsync. 
> > $ du -s coreutils > 238920 coreutils > $ find coreutils -type f | wc -l > 15013 > > $ rm -rf junk/coreutils > # echo 3 > /proc/sys/vm/drop_caches > $ time cp -a coreutils junk/ > real 1m2.137s > user 0m0.140s > sys 0m1.724s > > $ rm -rf junk/coreutils > $ time cp -a coreutils junk/ > real 0m2.492s > user 0m0.060s > sys 0m1.064s > > $ rm -rf junk/coreutils > # echo 3 > /proc/sys/vm/drop_caches > $ time rsync -a coreutils junk/ > real 1m5.473s > user 0m1.280s > sys 0m2.112s > > $ rm -rf junk/coreutils > $ time rsync -a coreutils junk/ > real 0m3.215s > user 0m1.184s > sys 0m1.536s > >For normal use cp is a little faster than rsync. Or rather rsync is a >little slower than cp. But not enough to make a difference for >typical operations. Having the file system cache warmed up makes a >*HUGE* difference. Much larger than any other difference. For copies >that take hours to run I am probably going to value the restart >ability more than raw speed. YMMV. > > > > > >Bob > > > > > > ---908725958-316188267-1413166305=:24386 Content-Type: text/html; charset=us-ascii
Further to Bob's explanation,
If you were to copy a 10 GB file across the internet, cp would work just fine and could take several hours.  But suppose there was an error in the transmission (a bad block), or you had to stop and restart: you would need to rerun cp and copy the file from the beginning.  rsync would take checksums of the parts of the file already on the remote and compare them to the host's copy, then restart at the first detected bad file offset.
 

Regards

 Leslie
Mr. Leslie Satenstein
Montreal, Quebec, Canada



From: Bob Proulx <bob@proulx.com>
To: Linda Walsh <bash@tlinx.org>
Cc: 18681@debbugs.gnu.org; "Polehn, Mike A" <mike.a.polehn@intel.com>
Sent: Sunday, October 12, 2014 8:54 PM
Subject: bug#18681: cp Specific fail example

Linda Walsh wrote:
> Bob Proulx wrote:
> > Meanwhile...  I would be one of those suggesting that perhaps you
> > should try using rsync instead of cp.  The cp command is lean and
> > mean by comparison to rsync (and should stay that way).  But rsync
> > has many attractive features for doing large copies.
>
> ---- fwiw...---
> Like large execution times... from the latest snapshot on my system --
> I use rsync to only move differences between  yesterday and "today[whenever
> new snap is taken]"... it was a larger than normal snap -- most only
> take 75-90 minutes...but rsync (these are the script messages) with some
> debugging output still turned on... even an rm over the resulting diff
> took 101 seconds... then cp comes along.. even w/a sync it would
> still be under a minute.

Wow.  Just to be clear an rsync copy took 75 to 90 minutes but a cp
copy took less than 1 minute?  I find that very suspicious.  I never
see that much difference between them.  Are you sure the difference
wasn't that the data was cached into ram by the rsync and therefore
the second run with cp just ran with the warmed up cache?  With a
large data set and a large ram that is plausible.

> I.e. rsync copied just the diffs to "/home.diff", then
> find with "-empty -delete" is used to get rid of empty dirs (rsync
> creates many of these).  then a static partition is created to hold
> the "diff" output -- and cp walked and copied the tree in 12s.
> (output wasn't flushed, but it's not that long.. <a minute...).

It appears that you are using features from rsync that do not exist in
cp.  Therefore the work being done in the task isn't equivalent work.
In that case it is probably quite reasonable for rsync to be slower
than cp.

Also consider that if cp were to acquire all of the enhancements that
have been requested for cp as time has gone by then cp would be just
as featureful (bloated!) as rsync and likely just as slow as rsync
too.  This is something to consider every time someone asks for a
creeping feature to cp.  Especially if they say they want the feature
in cp because it is faster than rsync.  The natural progression is
that cp would become rsync.

> If rsync wasn't so slow at local I/O...*sigh*....

The advantage of rsync is that it can be interrupted and restarted and
the restarted process will efficiently avoid doing work that is
already done.  An interrupted and restarted cp will perform the same
work again from start to finish.

If I am doing a simple copy from A to B then I use 'cp -av A B'.  If I
am doing it the second time then I will use rsync to avoid repeating
previously done work 'rsync -av A B'.
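
A minimal sketch of that two-pass workflow, run against a scratch tree. The temporary directory, the file names, and the `--itemize-changes` flag are illustrative additions and not part of the original message; if rsync is not installed, the script falls back to a second `cp -a`, which simply recopies everything.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical source tree A and destination directory B.
mkdir -p "$work/A/sub" "$work/B"
echo one > "$work/A/one.txt"
echo two > "$work/A/sub/two.txt"

# First pass: plain copy (creates B/A).
cp -a "$work/A" "$work/B/"

# The source changes before the second pass.
echo three > "$work/A/three.txt"

# Second pass: rsync skips files whose size and mtime already match.
if command -v rsync >/dev/null 2>&1; then
    rsync -a --itemize-changes "$work/A" "$work/B/"
else
    cp -a "$work/A" "$work/B/"   # fallback: repeats all the work
fi
```

With rsync present, the itemized output should list only the new file and the directory whose timestamp changed, since `cp -a` preserved the timestamps of everything else.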

If I want progress indication...  If I want placement of backup files
in a particular directory...  If I want other fancy features that are
provided by rsync then it is worth it to use rsync.

  $ du -s coreutils
  238920  coreutils
  $ find coreutils -type f | wc -l
  15013

  $ rm -rf junk/coreutils
  # echo 3 > /proc/sys/vm/drop_caches
  $ time cp -a coreutils junk/
  real    1m2.137s
  user    0m0.140s
  sys    0m1.724s

  $ rm -rf junk/coreutils
  $ time cp -a coreutils junk/
  real    0m2.492s
  user    0m0.060s
  sys    0m1.064s

  $ rm -rf junk/coreutils
  # echo 3 > /proc/sys/vm/drop_caches
  $ time rsync -a coreutils junk/
  real    1m5.473s
  user    0m1.280s
  sys    0m2.112s

  $ rm -rf junk/coreutils
  $ time rsync -a coreutils junk/
  real    0m3.215s
  user    0m1.184s
  sys    0m1.536s

For normal use cp is a little faster than rsync.  Or rather rsync is a
little slower than cp.  But not enough to make a difference for
typical operations.  Having the file system cache warmed up makes a
*HUGE* difference.  Much larger than any other difference.  For copies
that take hours to run I am probably going to value the restart
ability more than raw speed.  YMMV.




Bob
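
To put numbers on the comparison above, the four "real" times from the benchmark can be turned into ratios. This is only arithmetic on the figures already quoted (1m2.137s is 62.137 s, and so on); the variable names are made up for the sketch.

```shell
#!/bin/sh
set -eu
# "real" times from the benchmark above, converted to seconds.
ratios=$(awk 'BEGIN {
    cp_cold = 62.137; cp_warm = 2.492
    rs_cold = 65.473; rs_warm = 3.215
    printf "cp    cold/warm: %.1fx\n", cp_cold / cp_warm
    printf "rsync cold/warm: %.1fx\n", rs_cold / rs_warm
    printf "rsync/cp (cold): %.2fx\n", rs_cold / cp_cold
    printf "rsync/cp (warm): %.2fx\n", rs_warm / cp_warm
}')
echo "$ratios"
# cp    cold/warm: 24.9x
# rsync cold/warm: 20.4x
# rsync/cp (cold): 1.05x
# rsync/cp (warm): 1.29x
```

The cache effect (roughly 20x to 25x) dwarfs the tool difference (5% cold, 29% warm), which is the point being made.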





---908725958-316188267-1413166305=:24386--

From debbugs-submit-bounces@debbugs.gnu.org Sun Oct 12 22:46:25 2014
Received: (at 18681) by debbugs.gnu.org; 13 Oct 2014 02:46:25 +0000
Received: from localhost ([127.0.0.1]:42314 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XdVeS-0006xp-4Q for submit@debbugs.gnu.org; Sun, 12 Oct 2014 22:46:25 -0400
Received: from ishtar.tlinx.org ([173.164.175.65]:44518 helo=Ishtar.hs.tlinx.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XdVeN-0006xd-V2 for 18681@debbugs.gnu.org; Sun, 12 Oct 2014 22:46:21 -0400
Received: from [192.168.4.12] (Athenae [192.168.4.12]) by Ishtar.hs.tlinx.org (8.14.7/8.14.4/SuSE Linux 0.8) with ESMTP id s9D2jJ4Y008184; Sun, 12 Oct 2014 19:45:23 -0700
Message-ID: <543B3CBF.3070009@tlinx.org>
Date: Sun, 12 Oct 2014 19:45:19 -0700
From: Linda Walsh
User-Agent: Thunderbird
MIME-Version: 1.0
To: 18681@debbugs.gnu.org, "Polehn, Mike A"
Subject: Re: bug#18681: cp Specific fail example
References: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com> <745DB4B8861F8E4B9849C970520ABBF148CAC028@ORSMSX102.amr.corp.intel.com> <543822BE.1030408@cs.ucla.edu> <745DB4B8861F8E4B9849C970520ABBF148CAC0B9@ORSMSX102.amr.corp.intel.com> <20141010172253719078437@bob.proulx.com> <5438F74F.70301@tlinx.org> <20141012182228828257097@bob.proulx.com>
In-Reply-To: <20141012182228828257097@bob.proulx.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 18681

Bob Proulx wrote:
> Wow. Just to be clear an rsync copy took 75 to 90 minutes but a cp
---
Actually in the case I used for illustration, it was 110 minutes, but
that was longer than normal. Last night's figures:

: rsync took 87m, 34s [which is fairly quick given the size of the diffs.]
: Empty-directory removal took 1m, 58s
: Find used space for /home.diff...sz=2.5GB, min=3.1GB, extsz=4.0MB, n-ext'=806
: Copying diffs to dated static snap...Time: 0m, 17s.

It wasn't a copy, but a diff between 2 volumes (the same volume, but
one is a ~24+hour snapshot started on the previous run). So I look at
the differences between two temporal copies then copy that to a 3rd
partition that starts out empty.

So rsync is comparing file times (doesn't do file reads, _by_
_default_, unless it needs to move the data, as indicated by size and
timestamps) -- examines all file time/dates on my 'home' partition, and
compares those against a mostly-the-same active LVM snapshot.

Out of 871G, on the long day, it found ~5G of changes -- last night
was only 3G... varies based on how much change happened to the volume
over the period... smallest size now is 600m, largest I've seen has
been about 18G.

Once the *difference* is on the 3rd volume ("home.diff"), I destroy
the active snapshot created 'yesterday', then recreate it as a
dynamically sized static -- enough to hold the diff. Then cp is used
to move whatever "diffs" were put on the "diff" volume by rsync.

So those diffs -- most of them are _likely_ to be in memory -- AND as
I mentioned, I didn't do a sync after the copy (it happens
automatically, but isn't included in the timing).

But if I used rsync to do that exact same copy, it would take at least
2-3 times as long... actually... hold on... I can copy it from that
partition made yesterday ... into the diff partition.. but will tar up
the source to prime the cache...

This is the volume:

  > df .
  Filesystem                          Size  Used Avail Use% Mounted on
  /dev/Data/Home-2014.10.08-03.07.05  5.5G  4.4G  1.1G  81%\
    /home/.snapdir/@GMT-2014.10.08-03.07.05
  Ishtar:.snapdir/@GMT-2014.10.08-03.07.05> du -sh .
  4.4G    .

ok... running cp 1st, then remove, then rsync...:

  Ishtar:.snapdir/@GMT-2014.10.08-03.07.05> \
    time sudo cp -a . /home.diff/.
  6.39sec 0.15usr 6.23sys (99.81% cpu)
  Ishtar:.snapdir/@GMT-2014.10.08-03.07.05> \
    time sudo rm -fr /home.diff/.
  1.69sec 0.03usr 1.64sys (99.43% cpu)
  Ishtar:.snapdir/@GMT-2014.10.08-03.07.05> \
    time sudo rsync -aHAX . /home.diff/.
  20.83sec 27.02usr 11.68sys (185.84% cpu)

----185% cpu!... hey! that's cheating and still 3x slower... here's 1
core:

  Ishtar:.snapdir/@GMT-2014.10.08-03.07.05> \
    time sudo rm -fr /home.diff/.
  1.73sec 0.03usr 1.69sys (99.39% cpu)
  Ishtar:.snapdir/@GMT-2014.10.08-03.07.05> \
    time sudo taskset -a 02 rsync -aHAX . /home.diff/.
  38.52sec 25.92usr 11.90sys (98.18% cpu)

--- so limiting it to 1 cpu... 6x slower. (remember this is all in
memory buffered)

Note... rsync has been sped up slightly over the past couple of years
and 'cp' has slowed down somewhat over the same time period, so these
diffs used to be worse.

Then 'cp' is used to copy the image on 'home.diff' to the dynamically
sized

> copy took less than 1 minute? I find that very suspicious.
---
Well, hopefully the above explanation is more clear and highlights
what we wanted to measure.

>
> It appears that you are using features from rsync that do not exist in
> cp. Therefore the work being done in the task isn't equivalent work.
> In that case it is probably quite reasonable for rsync to be slower
> than cp.
----
Yup... Never would argue differently, but for what it does, rsync is
still pig slow, but when the amount of data you need to move is
hundreds of times smaller than the total, it can't be beat!

>
> Also consider that if cp were to acquire all of the enhancements that
> have been requested for cp as time has gone by then cp would be just
> as featureful (bloated!) as rsync and likely just as slow as rsync
> too.
----
Nope...rsync is slow because it does everything over a client
server model --- even when it is local. So everything is written through
a pipe .. that's why it can't come close to cp -- and why cp would never
be so slow -- I can't imagine it using a pipe to copy a file anywhere!

> This is something to consider every time someone asks for a
> creeping feature to cp. Especially if they say they want the feature
> in cp because it is faster than rsync. The natural progression is
> that cp would become rsync.
----
Not even! Note. cp already has a comparison function
built in that it uses during "cp -u"... but it doesn't go through
pipes. It used to use larger buffer sizes or maybe tell posix
to pre-alloc the destination space, dunno, but it used to be
faster.. I can't say for certain, but it seems to be using
smaller buffer sizes. Another reason rsync is so slow -- uses
a relatively small i/o size 1-4k last I looked. I've asked them
to increase it, but going through a pipe it won't help a lot.

This is from a different email on the rsync list from 7/26:

One might ask why rsync is so slow -- copying 800G from 1 partition to
another via xfsdump/restore takes a bit under 2 hours, or about
170MB/s, but with rsync, on the same partition with rsync transferring
less than 1/1000th as much (700MB [in a differential as I mentioned
above]), it took ~70-80 minutes... or about 163kB/s.

Transfer speeds depend on many factors. One of the largest is transfer
size (how much is transferred with 1 write/read). Transferring 1GB, @
1-meg at a time, took 2.08s read, and 1.56s to write (using direct
io). Transfer it in 4K chunks: 37.28s, to read, and 43.02s to write.
1k buffers are 4x slower than that!

Also in rsync, they've added the posix calls to reserve
space in the target location for a file being copied in.
Specifically, this is to lower disk fragmentation (does
cp do anything like that, been a while since I looked).
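
The transfer-size effect described above can be tried with dd by copying the same data at two block sizes. This is only a sketch: the 8 MiB scratch file, the `/dev/zero` source, and the two sizes are arbitrary choices for illustration, and direct I/O (`oflag=direct`) is left out because not every filesystem supports it. It will not reproduce the figures quoted above, since a file this small stays in the page cache.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# 8 MiB of zeros as a stand-in source file (size chosen arbitrarily).
dd if=/dev/zero of="$work/src" bs=1M count=8 2>/dev/null

# Same data, two transfer sizes. dd reports record counts and
# throughput on stderr after each run.
for bs in 1M 4k; do
    echo "== bs=$bs =="
    dd if="$work/src" of="$work/out.$bs" bs="$bs"
done
```

The first run moves the file in 8 reads and writes, the second in 2048; on real disks and larger files, that difference in system-call count is where the time goes.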
>
>> If rsync wasn't so slow at local I/O...*sigh*....
>
> The advantage of rsync is that it can be interrupted and restarted and
> the restarted process will efficiently avoid doing work that is
> already done. An interrupted and restarted cp will perform the same
> work again from start to finish.
----
I wouldn't trust that it would. If you interrupt it at exactly
the wrong time, I'd be afraid some file might get set with the right
data but the wrong Meta info (acls, primarily).

>
> If I am doing a simple copy from A to B then I use 'cp -av A B'. If I
> am doing it the second time then I will use rsync to avoid repeating
> previously done work 'rsync -av A B'.
---
Wouldn't cp -auv A B do the same?

>
> If I want progress indication... If I want placement of backup files
> in a particular directory... If I want other fancy features that are
> provided by rsync then it is worth it to use rsync.
>
>   $ du -s coreutils
>   238920  coreutils
>   $ find coreutils -type f | wc -l
>   15013
>
>   $ rm -rf junk/coreutils
>   # echo 3 > /proc/sys/vm/drop_caches
>   $ time cp -a coreutils junk/
>   real    1m2.137s
>   user    0m0.140s
>   sys     0m1.724s
>
>   $ rm -rf junk/coreutils
>   $ time cp -a coreutils junk/
>   real    0m2.492s
>   user    0m0.060s
>   sys     0m1.064s
>
>   $ rm -rf junk/coreutils
>   # echo 3 > /proc/sys/vm/drop_caches
>   $ time rsync -a coreutils junk/
>   real    1m5.473s
>   user    0m1.280s
>   sys     0m2.112s
>
>   $ rm -rf junk/coreutils
>   $ time rsync -a coreutils junk/
>   real    0m3.215s
>   user    0m1.184s
>   sys     0m1.536s
---
By default cp -a transfers acls and ext-attrs and preserves
hard links. Rsync doesn't do any of that by default.
You need to use "-aHAX" to compare them ... you have to call them
out as 'extra' with rsync, so the above test may not be what it seems.
Though if you don't use ACL's (which I do), then maybe the above
is almost reasonable. Still.. should use -aHAX

Is your rsync newer? i.e. does it have the posix-pre-alloc
hints?... Mine has a pre-alloc patch, but I think that was
suse-added and not the one in the mainline code. Not sure.

  rsync --version
  rsync version 3.1.0 protocol version 31
  64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
  socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
  append, ACLs, xattrs, iconv, symtimes, prealloc, SLP

I don't think mine does yet...

>
> For normal use cp is a little faster than rsync. Or rather rsync is a
> little slower than cp. But not enough to make a difference for
> typical operations. Having the file system cache warmed up makes a
> *HUGE* difference. Much larger than any other difference. For copies
> that take hours to run I am probably going to value the restart
> ability more than raw speed. YMMV.
----
I'll value the accuracy of xfsdump/restore...

Throw a few TB copies at rsync -- where all the data
won't fit in memory.... it also, I'm told, has problems with
hardlinks, acls and xattrs slowing it down, so it may be a
matter of usage...

BUT all that said... note that I DO USE it... for the
job I'm doing in my snapper script, nothing else will.

Cheers!
Linda
(don't ya just love performance talk?)
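
The flag mismatch raised in this message (`cp -a` versus `rsync -a` without `-H`) can be checked on a tiny tree with one hard-linked pair. This sketch assumes GNU `stat` for its `-c` format option; the directory and file names are invented for the example, and the rsync half only runs when rsync is installed.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir "$work/src"
echo data > "$work/src/a"
ln "$work/src/a" "$work/src/b"      # a and b share one inode

# cp -a implies --preserve=all, which includes hard links.
cp -a "$work/src" "$work/cp_dst"
cp_ino_a=$(stat -c %i "$work/cp_dst/a")
cp_ino_b=$(stat -c %i "$work/cp_dst/b")
echo "cp -a:     a=$cp_ino_a b=$cp_ino_b"

# rsync -a does not include -H, so the pair becomes two separate files.
if command -v rsync >/dev/null 2>&1; then
    rsync -a  "$work/src/" "$work/rs_plain/"
    rsync -aH "$work/src/" "$work/rs_links/"
    echo "rsync -a:  link count $(stat -c %h "$work/rs_plain/a")"
    echo "rsync -aH: link count $(stat -c %h "$work/rs_links/a")"
fi
```

With `cp -a` the two destination names report the same inode. The rsync runs should report a link count of 1 without `-H` and 2 with it.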
From debbugs-submit-bounces@debbugs.gnu.org Sun Oct 19 19:53:37 2014
Received: (at 18681) by debbugs.gnu.org; 19 Oct 2014 23:53:38 +0000
Received: from localhost ([127.0.0.1]:56162 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Xg0I4-0002nL-CS for submit@debbugs.gnu.org; Sun, 19 Oct 2014 19:53:37 -0400
Received: from joseki.proulx.com ([216.17.153.58]:59346) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Xg0I0-0002nB-VS for 18681@debbugs.gnu.org; Sun, 19 Oct 2014 19:53:34 -0400
Received: from hysteria.proulx.com (hysteria.proulx.com [192.168.230.119]) by joseki.proulx.com (Postfix) with ESMTP id A26C121229; Sun, 19 Oct 2014 17:53:31 -0600 (MDT)
Received: by hysteria.proulx.com (Postfix, from userid 1000) id 705922DC19; Sun, 19 Oct 2014 17:53:31 -0600 (MDT)
Date: Sun, 19 Oct 2014 17:53:31 -0600
From: Bob Proulx
To: Linda Walsh
Subject: Re: bug#18681: cp Specific fail example
Message-ID: <20141019171045034315311@bob.proulx.com>
References: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com> <745DB4B8861F8E4B9849C970520ABBF148CAC028@ORSMSX102.amr.corp.intel.com> <543822BE.1030408@cs.ucla.edu> <745DB4B8861F8E4B9849C970520ABBF148CAC0B9@ORSMSX102.amr.corp.intel.com> <20141010172253719078437@bob.proulx.com> <5438F74F.70301@tlinx.org> <20141012182228828257097@bob.proulx.com> <543B3CBF.3070009@tlinx.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <543B3CBF.3070009@tlinx.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-Spam-Score: -1.4 (-)
X-Debbugs-Envelope-To: 18681
Cc: 18681@debbugs.gnu.org, "Polehn, Mike A"

Linda Walsh wrote:
> Bob Proulx wrote:
> > Also consider that if cp were to acquire all of the enhancements
> > that have been requested for cp as time has gone by then cp would
> > be just as featureful (bloated!) as rsync and likely just as slow
> > as rsync too.
>
> Nope...rsync is slow because it does everything over a client
> server model --- even when it is local. So everything is written through
> a pipe .. that's why it can't come close to cp -- and why cp would never
> be so slow -- I can't imagine it using a pipe to copy a file anywhere!

The client-server structure of rsync is required for copying between
systems. Saying that cp doesn't have it isn't fair if cp were to add
every requested feature. I am sure that if I search the archives I
would find a request to add client-server structure to cp to support
copying from system to system. :-)

Now I will proactively agree that it would be nice if rsync detected
that it was all running locally and didn't fork and instead ran
everything in one process like cp does. But I could see that coming
to rsync at some time in the future. It is an often requested
feature.

> > This is something to consider every time someone asks for a
> > creeping feature to cp. Especially if they say they want the feature
> > in cp because it is faster than rsync. The natural progression is
> > that cp would become rsync.
>
> Not even! Note. cp already has a comparison function
> built in that it uses during "cp -u"...

I am not convinced of the robustness of 'cp -u ...' interrupt, repeat,
interrupt repeat. It wasn't intended for that mode. I am suspicious.
Is there any code path that could leave a new file in the target area
that would avoid copy? Not sure. Newer meets the -u test but isn't
an exact copy if the time stamp were older in the original. But with
rsync I know it will correct for this during a subsequent run.

> built in that it uses during "cp -u"... but it doesn't go through
> pipes. It used to use larger buffer sizes or maybe tell posix
> to pre-alloc the destination space, dunno, but it used to be
> faster.. I can't say for certain, but it seems to be using

Often the data sizes we work with grow larger over time making the
same task feel slower because we are actually dealing with more data
now. Files include audio. Files include video. Standard def becomes
high def. "Difficult to see. Always in motion is the future."

> smaller buffer sizes. Another reason rsync is so slow -- uses
> a relatively small i/o size 1-4k last I looked. I've asked them
> to increase it, but going through a pipe it won't help a lot.

Nod. Rsync was designed for the network use case. It could benefit
with some tuning for the local case. A topic for the rsync list.

> Also in rsync, they've added the posix calls to reserve
> space in the target location for a file being copied in.
> Specifically, this is to lower disk fragmentation (does
> cp do anything like that, been a while since I looked).

I don't know. It would be worth a look.

> > The advantage of rsync is that it can be interrupted and restarted and
> > the restarted process will efficiently avoid doing work that is
> > already done. An interrupted and restarted cp will perform the same
> > work again from start to finish.
>
> I wouldn't trust that it would. If you interrupt it at exactly
> the wrong time, I'd be afraid some file might get set with the right
> data but the wrong Meta info (acls, primarily).

The design of rsync is to copy the file to a temporary name beside the
intended target. After the copy the timestamps are set. After the
timestamps are set the file is renamed into place. An interrupt that
happens before that rename time will cause the temporary file to be
removed. An interrupt that happens after the rename is, well, after
that and the copy is already done. Since rename on the local file
system is atomic this is guaranteed to function robustly.
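
The copy-then-rename sequence described above can be sketched in shell. This is not rsync's code, only the same pattern built from coreutils; the helper name `copy_via_rename` and the dot-prefixed temporary name are made up for the illustration.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical helper: copy $1 to $2 through a temporary name that
# lives in the same directory as the final target.
copy_via_rename() {
    src=$1
    dst=$2
    tmp=$(mktemp "$(dirname "$dst")/.$(basename "$dst").XXXXXX")
    # If interrupted before the rename, drop the temporary file.
    trap 'rm -f "$tmp"; exit 130' INT TERM
    cp -p "$src" "$tmp"     # data first; -p also carries mode and timestamps
    mv -f "$tmp" "$dst"     # same directory, so this is a plain rename
    trap - INT TERM
}

printf 'payload\n' > "$work/src.dat"
copy_via_rename "$work/src.dat" "$work/dst.dat"
ls -a "$work"
```

At no point does a partially written file exist under the final name: a reader sees either no `dst.dat` (or the old one) or the complete new one.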
(As long as you aren't using a buggy file system that changes the
order of operations. That isn't cool. But of course it was famously
seen in ext4 for a while. Fortunately sanity has prevailed and ext4
doesn't do that for this operation anymore. Okay to use now.)

> > If I am doing a simple copy from A to B then I use 'cp -av A B'. If I
> > am doing it the second time then I will use rsync to avoid repeating
> > previously done work 'rsync -av A B'.
>
> Wouldn't cp -auv A B do the same?

Do I have to go look at the source code to verify that it doesn't? :-(

I assume it doesn't without looking. I assume cp copies in place. I
assume that cp does not make a temporary file off to the side and
rename it into place once it is done and has set the timestamps. I
assume that cp copies to the named destination directly and updates
the timestamps afterward. That creates a window of time when the file
is in place but has not had the timestamp placed on it yet.

Which means that if the cp is interrupted on a large file that it will
have started the copy but will not have finished it at the moment that
it is interrupted. The new file will be in place with a new
timestamp. The second run with cp -u will avoid overwriting the file
because the timestamp is newer. However the contents of the file will
be incomplete, or at least not matching the source copy at the time of
the second copy.

If my assumptions in the above are wrong please correct me. I will
learn something. But the operating model would need to be the same
portably across all portable systems covered by posix before I would
consider it actually safe to use.

> > If I want progress indication... If I want placement of backup files
> > in a particular directory... If I want other fancy features that are
> > provided by rsync then it is worth it to use rsync.
> > ...trimmed simple benchmark...
> > $ time cp -a coreutils junk/
>
> By default cp -a transfers acls and ext-attrs and preserves
> hard links. Rsync doesn't do any of that by default.
> You need to use "-aHAX" to compare them ...

Good catch. :-)

> you have to call them
> out as 'extra' with rsync, so the above test may not be what it seems.
> Though if you don't use ACL's (which I do), then maybe the above
> is almost reasonable. Still.. should use -aHAX

I didn't have any hard links, ACLs, or extended attributes in the test
case, so it shouldn't matter for the above.

> Is your rsync newer? i.e. does it have the posix-pre-alloc
> hints?... Mine has a pre-alloc patch, but I think that was
> suse-added and not the one in the mainline code. Not sure.
>
> rsync --version
> rsync version 3.1.0 protocol version 31
> 64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
> socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
> append, ACLs, xattrs, iconv, symtimes, prealloc, SLP

I happened to run that test on Debian Sid and it is 3.1.1. However
Debian Stable, which I have most widely deployed, has 3.0.9. So you
are both ahead of and behind me at the same time. :-)

> Throw a few TB copies at rsync -- where all the data
> won't fit in memory.... it also, I'm told, has problems with
> hardlinks, acls and xattrs slowing it down, so it may be a
> matter of usage...

I have had problems running rsync with -H for large data sets. Bad
enough that I recommend against it. Don't do it! I don't know
anything about -A and -X. But rsync -a is fine for very large data
sets.

> BUT all that said... note that I DO USE it... for the
> job I'm doing in my snapper script, nothing else will.

Yes. It is too useful to be without!

> (don't ya just love performance talk?)

Except that we should have moved all of this to the discussion list.
I feel guilty to have continued it. We have drifted well away from
the original bug report. The one with the terrible title. If this
continues let's take it over to the coreutils discussion list for
further conversation about it.
Bob

From debbugs-submit-bounces@debbugs.gnu.org Sun Oct 19 20:04:21 2014
Received: (at 18681) by debbugs.gnu.org; 20 Oct 2014 00:04:21 +0000
Received: from localhost ([127.0.0.1]:56167 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Xg0SS-00034z-Mw for submit@debbugs.gnu.org; Sun, 19 Oct 2014 20:04:21 -0400
Received: from joseki.proulx.com ([216.17.153.58]:59387) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Xg0SR-00034o-2j; Sun, 19 Oct 2014 20:04:19 -0400
Received: from hysteria.proulx.com (hysteria.proulx.com [192.168.230.119]) by joseki.proulx.com (Postfix) with ESMTP id D489C21229; Sun, 19 Oct 2014 18:04:17 -0600 (MDT)
Received: by hysteria.proulx.com (Postfix, from userid 1000) id BE0762DC19; Sun, 19 Oct 2014 18:04:17 -0600 (MDT)
Date: Sun, 19 Oct 2014 18:04:17 -0600
From: Bob Proulx
To: "Polehn, Mike A" , Assaf Gordon , "18681@debbugs.gnu.org" <18681@debbugs.gnu.org>
Subject: Re: bug#18681: The Linux cp command has bugs
Message-ID: <20141019175744099449260@bob.proulx.com>
References: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com> <54382FE8.5030904@gmail.com> <745DB4B8861F8E4B9849C970520ABBF148CAC0EA@ORSMSX102.amr.corp.intel.com> <543847A3.9050009@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <543847A3.9050009@redhat.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-Spam-Score: -1.4 (-)
X-Debbugs-Envelope-To: 18681

close 18681
thanks

Eric Blake wrote:
> Polehn, Mike A wrote:
> > This still left the incorrect operation of the interactive
> > operation when both -i and -f is used.
>
> The behavior of -i vs. -f interaction is required by POSIX; in
> particular, POSIX is explicit that -i and -f are NOT a toggle switch of
> one another, but each turns on slightly different, somewhat overlapping,
> changes in behavior (so specifying both is different from specifying one
> in isolation). We can't change what either one of those flags means.

This bug log included some serious topic drift of which I contributed
to myself. In order to atone for that I am going to triage this as
saying that the behavior is intended and standardized and therefore
won't be changed. Now that we understand this the bug ticket can be
closed. Further discussion can be continued and it will all be logged
and read by the subscribed.

> If there is another mode of operation that is also useful, then it needs
> yet another flag. At one point in the past, we had
> --reply={yes,no,query} to try and offer a third mode, but it had
> confusing semantics and we ended up pulling it because of the confusion
> it could cause. At the time we pulled it, we admitted that 'rsync' has
> some modes of operations that might be better suited for the particular
> modes that people seemed to be requesting when they thought that
> --reply would do the trick (and usually, what they thought --reply would
> do and what it actually did were different, which is why we removed it
> to avoid confusion). We have also added a --no-clobber option, which is
> somewhat of a compromise (what some people thought --reply=no would do,
> --no-clobber actually does better).

Good summary!

> So adding a new option is not out of the question, but you'd have to
> have well-defined semantics of what it should do, and how it differs
> from either normal mode, '-i' mode, '-f' mode, '-i -f' mode, or
> '--no-clobber' mode.

If the readers of this ticket think there is an enhancement request to
be filed for cp then please file a wishlist bug with the proposal. A
reference to this log can be made if desired. Let me suggest that the
proposal first be made on the coreutils discussion list where it can
be discussed and shaped and then after that has been done file a
wishlist bug of the result in order to track its progress through the
code and release.

Bob

From debbugs-submit-bounces@debbugs.gnu.org Mon Oct 20 02:20:40 2014
Received: (at submit) by debbugs.gnu.org; 20 Oct 2014 06:20:40 +0000
Received: from localhost ([127.0.0.1]:56258 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Xg6Kd-0005o8-Bo for submit@debbugs.gnu.org; Mon, 20 Oct 2014 02:20:40 -0400
Received: from eggs.gnu.org ([208.118.235.92]:50179) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Xg6KZ-0005ns-FR for submit@debbugs.gnu.org; Mon, 20 Oct 2014 02:20:37 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Xg6KL-0007TK-56 for submit@debbugs.gnu.org; Mon, 20 Oct 2014 02:20:30 -0400
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level:
X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:36243) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Xg6KL-0007T0-2w for submit@debbugs.gnu.org; Mon, 20 Oct 2014 02:20:21 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:51882) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Xg6KE-0006x6-Rl for bug-coreutils@gnu.org; Mon, 20 Oct 2014 02:20:21 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Xg6K8-0007S4-G4 for bug-coreutils@gnu.org; Mon, 20 Oct 2014 02:20:14 -0400
Received: from ishtar.tlinx.org ([173.164.175.65]:34435 helo=Ishtar.hs.tlinx.org) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Xg6K7-0007Hr-Va for bug-coreutils@gnu.org; Mon, 20 Oct 2014 02:20:08 -0400
Received: from [192.168.4.12] (Athenae [192.168.4.12]) by Ishtar.hs.tlinx.org
(8.14.7/8.14.4/SuSE Linux 0.8) with ESMTP id s9K6K0XF024209; Sun, 19 Oct 2014 23:20:02 -0700 Message-ID: <5444A990.3030602@tlinx.org> Date: Sun, 19 Oct 2014 23:20:00 -0700 From: Linda Walsh User-Agent: Thunderbird MIME-Version: 1.0 To: Bob Proulx Subject: Re: bug#18681: cp Specific fail example References: <745DB4B8861F8E4B9849C970520ABBF148CABFDB@ORSMSX102.amr.corp.intel.com> <745DB4B8861F8E4B9849C970520ABBF148CAC028@ORSMSX102.amr.corp.intel.com> <543822BE.1030408@cs.ucla.edu> <745DB4B8861F8E4B9849C970520ABBF148CAC0B9@ORSMSX102.amr.corp.intel.com> <20141010172253719078437@bob.proulx.com> <5438F74F.70301@tlinx.org> <20141012182228828257097@bob.proulx.com> <543B3CBF.3070009@tlinx.org> <20141019171045034315311@bob.proulx.com> In-Reply-To: <20141019171045034315311@bob.proulx.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x (no timestamps) [generic] X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: submit Cc: "Polehn, Mike A" , bug-coreutils@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) Bob Proulx wrote: > Linda Walsh wrote: >> Bob Proulx wrote: >>> Also consider that if cp were to acquire all of the enhancements >>> that have been requested for cp as time has gone by then cp would >>> be just as featureful (bloated!) as rsync and likely just as slow >>> as rsync too. >> Nope...rsync is slow because it does everything over a client >> server model --- even when it is local. So everything is written through >> a pipe .. 
that's why it can't come close to cp -- and why cp would never >> be so slow -- I can't imagine it using a pipe to copy a file anywhere! > > The client-server structure of rsync is required for copying between > systems. Saying that cp doesn't have it isn't fair if cp were to add > every requested feature. --- cp was designed for local->local copy. rsync was designed for local->remote synchronization (thus 'r(emote) sync'. Saying it isn't fair to compare code quality between a java->'native code compiler' and a compiler developed for a native platform is entirely fair -- because both started out with different design goals -- thus each ends up with pluses and minus that are an effect of that goal. If you claim comparing such effects isn't fair, then it's not fair to compare any different algorithm with another because algorithms inherently have their pluses and minuses and are often chosen for use in a particular situation because of those pluses and minuses. So lets compare using 'cp' with rsync in copying a remote file. The choice of tools depends on the quality of the remote connection, but in most remote connections, "today", reliability isn't usually an issue as they flow over TCP and file transfer protocols like NFS or CIFS also have checks to allow users to reconnect after an interruption (like a machine reboot). Depending on timeout settings, 'cp' already has a restart over-remove ability when used with NFS or CIFS. CIFS doesn't tolerate a system reboot in the middle of a copy, whereas NFS can recover from such if the client uses hard mounts. But for a local network, I regularly use 'cp' with CIFS and it does a faster job than rsync -- over a reliable local network. > I am sure that if I search the archives I > would find a request to add client-server structure to cp to support > copying from system to system. :-) ---- We are comparing where the tools are at _not_ where they _could_ have been had previous algorithm choices been ignored. 
We are talking about a local->local copy (in the base note), so
glossing over the slowness of rsync in doing such is entirely fair.

If you want some level of recovery after interrupt, NFS is a better
choice for a local network -- client connections can continue even
after a server reboot. But if we are talking local->local reliability,
the simple, close solution would be SMB/CIFS. Using a 1GB file as an
example (and throwing in a 'dd' for comparison):

  > time rsync 1G ishtar:/home/law/1G
  20.13sec 1.29usr 2.68sys (19.73% cpu)
  > time cp 1G /h/.
  6.94sec 0.01usr 1.10sys (16.16% cpu)
  > time dd if=1G of=/h/1G bs=256M oflag=direct
  4+0 records in
  4+0 records out
  1073741824 bytes (1.1 GB) copied, 3.4694 s, 309 MB/s
  3.50sec 0.00usr 0.51sys (14.64% cpu)

Here again, we see rsync doing the same job as cp but taking about 3x
the time. For a single file over a local net 'dd' is a better bet.

>
> Now I will proactively agree that it would be nice if rsync detected
> that it was all running locally and didn't fork and instead ran
> everything in one process like cp does. But I could see that coming
> to rsync at some time in the future. It is an often requested
> feature.
---
For many years.

>>> This is something to consider every time someone asks for a
>>> creeping feature to cp. Especially if they say they want the feature
>>> in cp because it is faster than rsync. The natural progression is
>>> that cp would become rsync.
>> Not even! Note. cp already has a comparison function
>> built in that it uses during "cp -u"...
>
> I am not convinced of the robustness of 'cp -u ...' interrupt, repeat,
> interrupt repeat. It wasn't intended for that mode.
---
Neither is rsync in its default mode. It compares timestamps and size,
nothing more. I'd be suspicious of either rsync OR cp's chances in such
a situation. But USUALLY, people don't interrupt a copy many times --
or even once, so cp is usually faster...
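
The two comparison rules being argued over here can be made concrete with a small sketch. It is only an illustration, not code from cp or rsync: the function names are hypothetical, and the two tests are simplified models of the documented behaviour ("cp -u" copies when the source is newer than the destination or the destination is missing; rsync's default quick check skips files whose size and modification time both match). The demo fakes an interrupted copy by writing a truncated destination and stamping it with a newer mtime.

```python
import os
import tempfile


def update_check_says_copy(src, dst):
    """Timestamp-only test, modelled loosely on 'cp -u' (hypothetical helper)."""
    if not os.path.exists(dst):
        return True
    return os.stat(src).st_mtime_ns > os.stat(dst).st_mtime_ns


def quick_check_says_copy(src, dst):
    """Size-and-mtime test, modelled loosely on rsync's default quick check."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or s.st_mtime_ns != d.st_mtime_ns


def demo():
    """Return (update-check verdict, quick-check verdict) for a truncated, newer dst."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as f:
            f.write(b"x" * 1000)
        # Simulate an interrupted copy: only part of the data arrived.
        with open(dst, "wb") as f:
            f.write(b"x" * 100)
        # Give the partial file a timestamp one second newer than the source.
        newer = os.stat(src).st_mtime_ns + 10**9
        os.utime(dst, ns=(newer, newer))
        return update_check_says_copy(src, dst), quick_check_says_copy(src, dst)


if __name__ == "__main__":
    print(demo())
```

In this model the timestamp-only rule skips the truncated file while the size-and-mtime rule recopies it. Neither rule looks at file contents, so a destination with the same size and timestamp but different data would pass both.
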
> Is there any code path that could leave a new file in the target area
> that would avoid copy? Not sure. Newer meets the -u test but isn't
> an exact copy if the time stamp were older in the original. But with
> rsync I know it will correct for this during a subsequent run.
---
Not necessarily. It doesn't do checksumming by default. Certainly, if
you used rsync with '-u', rsync will not be much better in recovery,
since target files with more recent timestamps may be left in the
target dir. I don't think rsync or cp trap a control-c-abort to cleanup
target files.

>
>> built in that it uses during "cp -u"... but it doesn't go through
>> pipes. It used to use larger buffer sizes or maybe tell posix
>> to pre-alloc the destination space, dunno, but it used to be
>> faster.. I can't say for certain, but it seems to be using
>
> Often the data sizes we work with grow larger over time making the
> same task feel slower because we are actually dealing with more data
> now.
---
I was comparing copy times with same files, not from years ago to now.

>> Another reason rsync is so slow -- uses
>> a relatively small i/o size 1-4k last I looked. I've asked them
>> to increase it, but going through a pipe it won't help alot.
>
> Nod. Rsync was designed for the network use case. It could benefit
> with some tuning for the local case. A topic for the rsync list.
---
Been there, done that. Still comparing current-to-current, not
hypotheticals.

>
>> Also in rsync, they've added the posix calls to reserve
>> space in the target location for a file being copied in.
>> Specifically, this is to lower disk fragmentation (does
>> cp do anything like that, been a while since I looked).
>
> I don't know. It would be worth a look.
>
>>> The advantage of rsync is that it can be interrupted and restarted and
>>> the restarted process will efficiently avoid doing work that is
>>> already done. An interrupted and restarted cp will perform the same
>>> work again from start to finish.
>> I wouldn't trust that it would. If you interrupt it at exactly
>> the wrong time, I'd be afraid some file might get set with the right
>> data but the wrong Meta info (acls, primarily).
>
> The design of rsync is to copy the file to a temporary name beside the
> intended target. After the copy the timestamps are set. After that
> the timestamps are set the file is renamed into place. An interrupt
> that happens before that rename time will cause the temporary file to
> be removed. An interrupt that happens after the rename is, well,
> after that and the copy is already done. Since rename on the local
> file system is atomic this is guaranteed to function robustly. (As
> long as you aren't using a buggy file system that changes the order of
> operations. That isn't cool. But of course it was famously seen in
> ext4 for a while. Fortunately sanity has prevailed and ext4 doesn't
> do that for this operation anymore. Okay to use now.)
>
>>> If I am doing a simple copy from A to B then I use 'cp -av A B'. If I
>>> am doing it the second time then I will use rsync to avoid repeating
>>> previously done work 'rsync -av A B'.
>> Wouldn't cp -auv A B do the same?
>
> Do I have to go look at the source code to verify that it doesn't? :-(
---
My timing says cp is 20x faster for that 1G file case. It also shows
that rsync doesn't use a tmp file in the update case

  > time cp -au 1G /h
  0.03sec 0.00usr 0.03sys (79.47% cpu)
  > cp -au 1G /h
  > time rsync -au 1G ishtar:/home/law/1G
  0.60sec 0.06usr 0.09sys (25.12% cpu)

>
> I assume it doesn't without looking. I assume cp copies in place. I
> assume that cp does not make a temporary file off to the side and
> rename it into place once it is done and has set the timestamps.
---
I assume rsync doesn't either -- if it is comparing against a file
already in place, for it to transfer the whole file... nope.

> I assume that cp copies to the named destination directly and updates
> the timestamps afterward.
> That creates a window of time when the file
> is in place but has not had the timestamp placed on it yet.
>
> Which means that if the cp is interrupted on a large file that it will
> have started the copy but will not have finished it at the moment that
> it is interrupted. The new file will be in place with a new
> timestamp. The second run with cp -u will avoid overwriting the file
> because the timestamp is newer. However the contents of the file will
> be incomplete, or at least not matching the source copy at the time of
> the second copy.
>
> If my assumptions in the above are wrong please correct me. I will
> learn something. But the operating model would need to be the same
> portably across all portable systems covered by posix before I would
> consider it actually safe to use.
---
Same happens in rsync -- no tmp file is involved. It compares time
stamps and doesn't copy.

>
>>> If I want progress indication... If I want placement of backup files
>>> in a particular directory... If I want other fancy features that are
>>> provided by rsync then it is worth it to use rsync.
>>> ...trimmed simple benchmark...
>>> $ time cp -a coreutils junk/
>> By default cp -a transfers acls and ext-attrs and preserves
>> hard links. Rsync doesn't do any of that by default.
>> You need to use "-aHAX" to compare them ...
>
> Good catch. :-)
>
>> you have to call them
>> out as 'extra' with rsync, so the above test may not be what it seems.
>> Though if you don't use ACL's (which I do), then maybe the above
>> is almost reasonable. Still.. should use -aHAX
>
> I didn't have any hard links, ACLs, or extended attributes in the test
> case it shouldn't matter for the above.
>
>> Is your rsync newer? i.e. does it have the posix-pre-alloc
>> hints?... Mine has a pre-alloc patch, but I think that was
>> suse-added and not the one in the mainline code. Not sure.
>>
>> rsync --version
>> rsync version 3.1.0 protocol version 31
>> 64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
>> socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
>> append, ACLs, xattrs, iconv, symtimes, prealloc, SLP
>
> I happened to run that test on Debian Sid and it is 3.1.1. However
> Debian Stable, which I have most widely deployed, has 3.0.9. So you
> are both ahead of and behind me at the same time. :-)
>
>> Throw a few TB copies at rsync -- where all the data
>> won't fit in memory.... it also, I'm told, has problems with
>> hardlinks, acls and xattrs slowing it down, so it may be a
>> matter of usage...
>
> I have had problems running rsync with -H for large data sets. Bad
> enough that I recommend against it. Don't do it! I don't know
> anything about -A and -X. But rsync -a is fine for very large data
> sets.
----
But then you can't compare to 'cp' which does handle that case.

>> (don't ya just love performance talk?)
>
> Except that we should have moved all of this to the discussion list.
---
:-( ?discussion list? -- bugs-coreutils? (don't know about others)...

'sides, I didn't bring up rsync, all I added was "If rsync wasn't so
slow at local I/O...*sigh*.... " It's good for when you need "diffs",
but not as a general replacement for 'cp'.

From unknown Sat Jun 14 18:58:26 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Mon, 17 Nov 2014 12:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
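
For readers following the interrupt-safety argument in this thread, the copy-to-a-temporary-name-then-rename design that Bob describes can be sketched roughly as below. This is a minimal illustration under stated assumptions, not rsync's or cp's actual code: the function name copy_via_rename and the ".copy-" temporary prefix are hypothetical, and only the data and timestamps are handled (no ACLs, xattrs or ownership).

```python
import os
import shutil
import tempfile


def copy_via_rename(src, dst):
    """Copy src to dst so that dst only ever appears complete (hypothetical helper)."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # The temporary file lives beside the target so the final rename
    # stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir, prefix=".copy-")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
            out.flush()
            os.fsync(out.fileno())
        # Set the timestamps on the temporary file before it becomes visible.
        st = os.stat(src)
        os.utime(tmp_path, ns=(st.st_atime_ns, st.st_mtime_ns))
        # Atomically move the finished copy into place.
        os.replace(tmp_path, dst)
    except BaseException:
        # On any failure or interrupt before the rename, drop the partial file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this ordering, an interrupt before os.replace leaves any existing destination untouched, and an interrupt after it finds the copy already complete with its timestamps set. The cost is temporary extra space for one full copy of the file in the target directory.
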