GNU bug report logs - #6906
[PATCH] cp: copy entirely-sparse files oodles faster


Package: coreutils

Reported by: Paul Eggert <eggert <at> cs.ucla.edu>

Date: Wed, 25 Aug 2010 05:37:02 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Paul Eggert <eggert <at> cs.ucla.edu>
Subject: bug#6906: closed (Re: bug#6906: [PATCH] cp: copy entirely-sparse
 files oodles faster)
Date: Thu, 11 Oct 2018 02:15:04 +0000
[Message part 1 (text/plain, inline)]
Your bug report

#6906: [PATCH] cp: copy entirely-sparse files oodles faster

which was filed against the coreutils package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 6906 <at> debbugs.gnu.org.

-- 
6906: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=6906
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Assaf Gordon <assafgordon <at> gmail.com>, Jim Meyering <jim <at> meyering.net>
Cc: 6906-done <at> debbugs.gnu.org
Subject: Re: bug#6906: [PATCH] cp: copy entirely-sparse files oodles faster
Date: Wed, 10 Oct 2018 19:14:19 -0700
Assaf Gordon wrote:
> Can this be closed as out-dated?

Yes, that's fine. Closing.

[Message part 3 (message/rfc822, inline)]
From: Paul Eggert <eggert <at> cs.ucla.edu>
To: bug-coreutils <at> gnu.org
Subject: [PATCH] cp: copy entirely-sparse files oodles faster
Date: Tue, 24 Aug 2010 22:37:02 -0700
(By "oodles faster" I mean "as much faster as you like".
The benchmark below shows a 2800x speedup.)

In response to an idea by Kit Westneat for GNU tar reported in
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>,
Eric Blake wrote:

> Meanwhile, if you are indeed correct that there are easy ways to detect
> completely sparse files, even when the ioctl or SEEK_HOLE directives are
> not present, then the coreutils cp(1) hole iteration routine should
> probably be taught that corner case to recognize an entirely sparse file
> as a single hole.

Here's a patch to coreutils to implement this idea.  It's based on a patch
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html> that
I just now installed into GNU tar.  I think of it as a quick first cut
at a full fiemap / SEEK_HOLE implementation, but unlike the full
implementation this optimization does not depend on any special ioctls
or lseek extensions, so it should work on any POSIX or POSIX-like host.
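
To illustrate the check described above, here is a minimal sketch, not the
patch itself.  The helper name entirely_sparse is hypothetical, and the
sketch reads the st_blocks field directly, whereas the patch below goes
through the ST_NBLOCKS macro.  It assumes the file system reports zero
allocated blocks for a file that consists of a single hole.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of coreutils: report whether a file
   looks entirely sparse, i.e. it has a nonzero size but no allocated
   blocks.  Assumption: st_blocks (a POSIX XSI field) is reported
   accurately by the underlying file system.  */
bool
entirely_sparse (struct stat const *sb)
{
  return sb->st_size != 0 && sb->st_blocks == 0;
}
```

A caller would typically fstat the already-open source descriptor and pass
the resulting structure to this helper before deciding whether to read
any data.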

On a simple benchmark this sped up GNU cp by a factor of 2800
(measuring by real-time seconds) on my host:

   $ truncate -s 10GB bigfile
   $ time old/cp bigfile bigfile-slow

   real    2m3.231s
   user    0m1.497s
   sys     0m5.738s
   $ time new/cp bigfile bigfile-fast

   real    0m0.044s
   user    0m0.000s
   sys     0m0.002s
   $ ls -ls bigfile*
   0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:11 bigfile
   0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-fast
   0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-slow

From 2e535b590d675e6d96f954c1f840d678fb133f6a Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Tue, 24 Aug 2010 22:20:55 -0700
Subject: [PATCH] cp: copy entirely-sparse files oodles faster

* src/copy.c (copy_reg): Bypass reads if the file is entirely
sparse.  Idea suggested by Kit Westneat via Bernd Shubert in
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>
for the Lustre file system.  Implementation stolen from my patch
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html>
to GNU tar.  On my machine this sped up a cp benchmark, which
copied a 10 GB entirely-sparse file on an NFS file system, by a
factor of 2800 in real seconds.
---
 src/copy.c |   18 +++++++++++++++---
 1 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/copy.c b/src/copy.c
index 6d11ed8..1e79523 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -669,10 +669,21 @@ copy_reg (char const *src_name, char const *dst_name,
 #endif
         }
 
-      /* If not making a sparse file, try to use a more-efficient
-         buffer size.  */
-      if (! make_holes)
+      if (make_holes)
         {
+          /* For speed, bypass reads if the file is entirely sparse.  */
+
+          if (src_open_sb.st_size != 0 && ST_NBLOCKS (src_open_sb) == 0)
+            {
+              n_read_total = src_open_sb.st_size;
+              goto set_dest_size;
+            }
+        }
+      else
+        {
+          /* Not making a sparse file, so try to use a more-efficient
+             buffer size.  */
+
           /* Compute the least common multiple of the input and output
              buffer sizes, adjusting for outlandish values.  */
           size_t blcm_max = MIN (SIZE_MAX, SSIZE_MAX) - buf_alignment_slop;
@@ -788,6 +799,7 @@ copy_reg (char const *src_name, char const *dst_name,
 
       if (last_write_made_hole)
         {
+        set_dest_size:
           if (ftruncate (dest_desc, n_read_total) < 0)
             {
               error (0, errno, _("truncating %s"), quote (dst_name));
-- 
1.7.2





