GNU bug report logs - #6906
[PATCH] cp: copy entirely-sparse files oodles faster
Reported by: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Wed, 25 Aug 2010 05:37:02 UTC
Severity: normal
Tags: patch
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
Your message dated Wed, 10 Oct 2018 19:14:19 -0700
with message-id <c645c4d7-16eb-c15c-0fd7-ab63177c60e2 <at> cs.ucla.edu>
and subject line Re: bug#6906: [PATCH] cp: copy entirely-sparse files oodles faster
has caused the debbugs.gnu.org bug report #6906,
regarding [PATCH] cp: copy entirely-sparse files oodles faster
to be marked as done.
(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)
--
6906: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=6906
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
(By "oodles faster" I mean "as much faster as you like".
The benchmark below shows a 2800x speedup.)
In response to an idea by Kit Westneat for GNU tar reported in
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>,
Eric Blake wrote:
> Meanwhile, if you are indeed correct that there are easy ways to detect
> completely sparse files, even when the ioctl or SEEK_HOLE directives are
> not present, then the coreutils cp(1) hole iteration routine should
> probably be taught that corner case to recognize an entirely sparse file
> as a single hole.
Here's a patch to coreutils to implement this idea. It's based on a patch
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html> that
I just now installed into GNU tar. I think of it as a quick first cut
at a full fiemap / SEEK_HOLE implementation, but unlike the full
implementation this optimization does not depend on any special ioctls
or lseek extensions, so it should work on any POSIX or POSIX-like host.
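The check described above can be sketched in a few lines of C. This is a minimal illustration rather than the coreutils code: the helper names stat_looks_entirely_sparse and path_looks_entirely_sparse are hypothetical, and the sketch reads st_blocks directly instead of going through the ST_NBLOCKS macro, assuming a POSIX-like host that provides that field.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper, not taken from coreutils: true if the stat data
   describes a nonempty file with no allocated blocks.  This mirrors the
   condition in the patch, but reads st_blocks directly instead of using
   the ST_NBLOCKS macro.  */
static bool
stat_looks_entirely_sparse (struct stat const *st)
{
  return st->st_size != 0 && st->st_blocks == 0;
}

/* Hypothetical wrapper: stat PATH and apply the check above.
   Returns false if stat fails.  */
static bool
path_looks_entirely_sparse (char const *path)
{
  struct stat st;
  return stat (path, &st) == 0 && stat_looks_entirely_sparse (&st);
}
```

The sketch assumes the file system reports allocated blocks accurately. If a host reported zero blocks for a file that does contain data, the check would give the wrong answer.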
On a simple benchmark this sped up GNU cp by a factor of 2800
(measured in real, i.e. wall-clock, seconds) on my host:
$ truncate -s 10GB bigfile
$ time old/cp bigfile bigfile-slow
real 2m3.231s
user 0m1.497s
sys 0m5.738s
$ time new/cp bigfile bigfile-fast
real 0m0.044s
user 0m0.000s
sys 0m0.002s
$ ls -ls bigfile*
0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:11 bigfile
0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-fast
0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-slow
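For readers who want to build a similar test file from C rather than with truncate(1), the following is a rough sketch. The helper name make_sparse_temp is hypothetical. It creates a temporary file and sets its size with ftruncate without writing any data, which is roughly what "truncate -s" does to a new file.

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a temporary file of SIZE bytes without
   writing any data.  NAME_TEMPLATE must end in "XXXXXX", as mkstemp
   requires, and is overwritten with the actual file name.  On success,
   store the file's stat data in *ST and return an open descriptor;
   on failure return -1.  */
static int
make_sparse_temp (char *name_template, off_t size, struct stat *st)
{
  int fd = mkstemp (name_template);
  if (fd < 0)
    return -1;
  if (ftruncate (fd, size) != 0 || fstat (fd, st) != 0)
    {
      close (fd);
      unlink (name_template);
      return -1;
    }
  return fd;
}
```

After a successful call, st->st_size holds the requested size, and st->st_blocks holds the allocation count that the first column of "ls -ls" is derived from. Whether that count is zero depends on the file system, so the sketch does not assume it.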
From 2e535b590d675e6d96f954c1f840d678fb133f6a Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Tue, 24 Aug 2010 22:20:55 -0700
Subject: [PATCH] cp: copy entirely-sparse files oodles faster
* src/copy.c (copy_reg): Bypass reads if the file is entirely
sparse. Idea suggested by Kit Westneat via Bernd Shubert in
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>
for the Lustre file system. Implementation stolen from my patch
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html>
to GNU tar. On my machine this sped up a cp benchmark, which
copied a 10 GB entirely-sparse file on an NFS file system, by a
factor of 2800 in real seconds.
---
src/copy.c | 18 +++++++++++++++---
1 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/src/copy.c b/src/copy.c
index 6d11ed8..1e79523 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -669,10 +669,21 @@ copy_reg (char const *src_name, char const *dst_name,
 #endif
         }
-      /* If not making a sparse file, try to use a more-efficient
-         buffer size. */
-      if (! make_holes)
+      if (make_holes)
         {
+          /* For speed, bypass reads if the file is entirely sparse. */
+
+          if (src_open_sb.st_size != 0 && ST_NBLOCKS (src_open_sb) == 0)
+            {
+              n_read_total = src_open_sb.st_size;
+              goto set_dest_size;
+            }
+        }
+      else
+        {
+          /* Not making a sparse file, so try to use a more-efficient
+             buffer size. */
+
           /* Compute the least common multiple of the input and output
              buffer sizes, adjusting for outlandish values. */
           size_t blcm_max = MIN (SIZE_MAX, SSIZE_MAX) - buf_alignment_slop;
@@ -788,6 +799,7 @@ copy_reg (char const *src_name, char const *dst_name,
       if (last_write_made_hole)
         {
+        set_dest_size:
           if (ftruncate (dest_desc, n_read_total) < 0)
             {
               error (0, errno, _("truncating %s"), quote (dst_name));
--
1.7.2
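Outside the context of copy_reg, the control flow of the patch can be sketched as a standalone function. This is an illustration under stated assumptions, not the coreutils implementation: the name copy_with_sparse_shortcut is hypothetical, the destination is assumed to be an empty regular file, st_blocks is read directly instead of via ST_NBLOCKS, and the slow path is a plain read/write loop without cp's hole detection or buffer-size tuning.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sketch, not the coreutils implementation: copy the
   contents of SRC to the empty regular file DST.  If SRC reports a
   nonzero size and no allocated blocks, skip reading and just extend
   DST to the same size.  Return 0 on success, -1 on failure.  */
static int
copy_with_sparse_shortcut (int src, int dst)
{
  struct stat st;
  if (fstat (src, &st) != 0)
    return -1;

  /* Fast path: the source looks entirely sparse.  */
  if (st.st_size != 0 && st.st_blocks == 0)
    return ftruncate (dst, st.st_size);

  /* Slow path: ordinary read/write loop.  */
  char buf[65536];
  for (;;)
    {
      ssize_t n_read = read (src, buf, sizeof buf);
      if (n_read == 0)
        return 0;
      if (n_read < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      for (ssize_t off = 0; off < n_read; )
        {
          ssize_t n_written = write (dst, buf + off, n_read - off);
          if (n_written < 0)
            {
              if (errno == EINTR)
                continue;
              return -1;
            }
          off += n_written;
        }
    }
}
```

The fast path never touches the source data, which is where the speedup would come from. The cost is that the result is only as trustworthy as the block count the file system reports: if a nonempty file with real data were reported as having zero blocks, this sketch would produce a destination full of zeros.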
[Message part 3 (message/rfc822, inline)]
Assaf Gordon wrote:
> Can this be closed as out-dated?
Yes, that's fine. Closing.
This bug report was last modified 6 years and 311 days ago.