GNU bug report logs -
#6906
[PATCH] cp: copy entirely-sparse files oodles faster
Reported by: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Wed, 25 Aug 2010 05:37:02 UTC
Severity: normal
Tags: patch
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
Your bug report
#6906: [PATCH] cp: copy entirely-sparse files oodles faster
which was filed against the coreutils package, has been closed.
The explanation is attached below, along with your original report.
If you require more details, please reply to 6906 <at> debbugs.gnu.org.
--
6906: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=6906
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
Assaf Gordon wrote:
> Can this be closed as out-dated?
Yes, that's fine. Closing.
[Message part 3 (message/rfc822, inline)]
(By "oodles faster" I mean "as much faster as you like".
The benchmark below shows a 2800x speedup.)
In response to an idea by Kit Westneat for GNU tar reported in
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>,
Eric Blake wrote:
> Meanwhile, if you are indeed correct that there are easy ways to detect
> completely sparse files, even when the ioctl or SEEK_HOLE directives are
> not present, then the coreutils cp(1) hole iteration routine should
> probably be taught that corner case to recognize an entirely sparse file
> as a single hole.
Here's a patch to coreutils to implement this idea. It's based on a patch
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html> that
I just now installed into GNU tar. I think of it as a quick first cut
at a full fiemap / SEEK_HOLE implementation, but unlike the full
implementation this optimization does not depend on any special ioctls
or lseek extensions, so it should work on any POSIX or POSIX-like host.
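The portable check described above can be sketched as a small predicate on a `struct stat`. This is only an illustration: the helper name `looks_entirely_sparse` is hypothetical, and it reads `st_blocks` directly where the patch uses the `ST_NBLOCKS` macro. It also assumes the file system reports allocated blocks accurately.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of the patch.
   Return true if SB describes a file that has a nonzero size but no
   allocated blocks, i.e. a file that appears to be one big hole.
   Assumption: st_blocks stands in for the ST_NBLOCKS macro. */
static bool
looks_entirely_sparse (struct stat const *sb)
{
  return sb->st_size != 0 && sb->st_blocks == 0;
}
```

A caller would fill the structure with `fstat` on the open source descriptor and skip the read loop when the predicate holds.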
On a simple benchmark this sped up GNU cp by a factor of 2800
(measuring by real-time seconds) on my host:
$ truncate -s 10GB bigfile
$ time old/cp bigfile bigfile-slow
real 2m3.231s
user 0m1.497s
sys 0m5.738s
$ time new/cp bigfile bigfile-fast
real 0m0.044s
user 0m0.000s
sys 0m0.002s
$ ls -ls bigfile*
0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:11 bigfile
0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-fast
0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-slow
From 2e535b590d675e6d96f954c1f840d678fb133f6a Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Tue, 24 Aug 2010 22:20:55 -0700
Subject: [PATCH] cp: copy entirely-sparse files oodles faster
* src/copy.c (copy_reg): Bypass reads if the file is entirely
sparse. Idea suggested by Kit Westneat via Bernd Schubert in
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>
for the Lustre file system. Implementation stolen from my patch
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html>
to GNU tar. On my machine this sped up a cp benchmark, which
copied a 10 GB entirely-sparse file on an NFS file system, by a
factor of 2800 in real seconds.
---
src/copy.c | 18 +++++++++++++++---
1 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/src/copy.c b/src/copy.c
index 6d11ed8..1e79523 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -669,10 +669,21 @@ copy_reg (char const *src_name, char const *dst_name,
 #endif
         }
 
-      /* If not making a sparse file, try to use a more-efficient
-         buffer size. */
-      if (! make_holes)
+      if (make_holes)
         {
+          /* For speed, bypass reads if the file is entirely sparse. */
+
+          if (src_open_sb.st_size != 0 && ST_NBLOCKS (src_open_sb) == 0)
+            {
+              n_read_total = src_open_sb.st_size;
+              goto set_dest_size;
+            }
+        }
+      else
+        {
+          /* Not making a sparse file, so try to use a more-efficient
+             buffer size. */
+
           /* Compute the least common multiple of the input and output
              buffer sizes, adjusting for outlandish values. */
           size_t blcm_max = MIN (SIZE_MAX, SSIZE_MAX) - buf_alignment_slop;
@@ -788,6 +799,7 @@ copy_reg (char const *src_name, char const *dst_name,
 
       if (last_write_made_hole)
         {
+        set_dest_size:
           if (ftruncate (dest_desc, n_read_total) < 0)
             {
               error (0, errno, _("truncating %s"), quote (dst_name));
--
1.7.2
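As a standalone illustration of what the patch does inside copy_reg, the two steps (detect an all-hole source, then set the destination size with ftruncate instead of reading) can be sketched outside coreutils as follows. The function name `copy_if_entirely_sparse` and its return convention are hypothetical, `st_blocks` stands in for `ST_NBLOCKS`, and the destination is assumed to be a freshly created, empty file.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of the patch.
   If SRC_FD refers to a file that has a nonzero size but no allocated
   blocks, extend the empty file DEST_FD to the same size without
   reading or writing any data.
   Return 1 if the shortcut was taken, 0 if the caller must fall back
   to an ordinary read/write copy, and -1 (with errno set) on error. */
static int
copy_if_entirely_sparse (int src_fd, int dest_fd)
{
  struct stat sb;

  if (fstat (src_fd, &sb) < 0)
    return -1;

  /* Same test as in the patch, with st_blocks standing in for
     the ST_NBLOCKS macro. */
  if (sb.st_size == 0 || sb.st_blocks != 0)
    return 0;

  /* Assumption: DEST_FD is empty, so ftruncate only extends it. */
  if (ftruncate (dest_fd, sb.st_size) < 0)
    return -1;

  return 1;
}
```

A return value of 0 is not an error; it only means the source has allocated blocks (or is empty) and the usual copy loop still has to run.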
This bug report was last modified 6 years and 311 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.