GNU bug report logs - #6906
[PATCH] cp: copy entirely-sparse files oodles faster
Reported by: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Wed, 25 Aug 2010 05:37:02 UTC
Severity: normal
Tags: patch
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
Message #8 received at 6906 <at> debbugs.gnu.org (full text, mbox):
Paul Eggert wrote:
> (By "oodles faster" I mean "as much faster as you like".
> The benchmark below shows a 2800x speedup.)
>
> In response to an idea by Kit Westneat for GNU tar reported in
> <http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>,
> Eric Blake wrote:
>
>> Meanwhile, if you are indeed correct that there are easy ways to detect
>> completely sparse files, even when the ioctl or SEEK_HOLE directives are
>> not present, then the coreutils cp(1) hole iteration routine should
>> probably be taught that corner case to recognize an entirely sparse file
>> as a single hole.
>
> Here's a patch to coreutils to implement this idea. It's based on a patch
> <http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html> that
> I just now installed into GNU tar. I think of it as a quick first cut
> at full fiemap / SEEK_HOLE implementation, but unlike the full
> implementation this optimization does not depend on any special ioctls
> or lseek extensions, so it should work on any POSIX or POSIX-like host.
>
> On a simple benchmark this sped up GNU cp by a factor of 2800
> (measuring by real-time seconds) on my host:
>
> $ truncate -s 10GB bigfile
> $ time old/cp bigfile bigfile-slow
>
> real 2m3.231s
> user 0m1.497s
> sys 0m5.738s
> $ time new/cp bigfile bigfile-fast
>
> real 0m0.044s
> user 0m0.000s
> sys 0m0.002s
> $ ls -ls bigfile*
> 0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:11 bigfile
> 0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-fast
> 0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-slow
>
> From 2e535b590d675e6d96f954c1f840d678fb133f6a Mon Sep 17 00:00:00 2001
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Tue, 24 Aug 2010 22:20:55 -0700
> Subject: [PATCH] cp: copy entirely-sparse files oodles faster
>
> * src/copy.c (copy_reg): Bypass reads if the file is entirely
> sparse. Idea suggested by Kit Westneat via Bernd Shubert in
> <http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>
> for the Lustre file system. Implementation stolen from my patch
> <http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html>
> to GNU tar. On my machine this sped up a cp benchmark, which
> copied a 10 GB entirely-sparse file on an NFS file system, by a
> factor of 2800 in real seconds.
Hi Paul,
Somehow I didn't see this patch from you until now, while looking
through the hundreds of outstanding (but mostly resolved) bugs at
http://debbugs.gnu.org/coreutils. Sorry about that.
Now that we have FIEMAP support (and, by the looks of things,
will soon have SEEK_HOLE support in cp and in the Linux kernel),
do you think adding support for this special case is worthwhile?
I could go either way.
If so, would you care to rebase it for 8.13?
coreutils-8.12 will probably be released soon, to adjust FIEMAP
support so that it does not collide with the combination of XFS, 2.6.39
release-candidate kernels, and so-called "unwritten extents".