GNU bug report logs -
#6131
[PATCH]: fiemap support for efficient sparse file copy
Previous Next
Reported by: "jeff.liu" <jeff.liu <at> oracle.com>
Date: Fri, 7 May 2010 14:16:02 UTC
Severity: normal
Tags: patch
Done: Jim Meyering <jim <at> meyering.net>
Bug is archived. No further changes may be made.
Full log
View this message in rfc822 format
On Tue, Jun 15, 2010 at 02:28:43PM -0700, Paul Eggert wrote:
> I looked at the proposed fiemap patch for cp and see another issue
> with it, which needs to be thought through more carefully.
>
> The proposed fiemap code tries to copy the _physical_ holes in the
> source file. But currently cp copies the _logical_ holes: that is, it
> skips over writing (and thus attempts to create physical holes)
> whenever the source file contains a block of zeros, regardless of
> whether that block of zeros is physically stored in the file system.
> The current cp behavior is often the desired behavior (particularly
> for cp --sparse=always) and the fiemap code shouldn't alter that.
>
> For example, if a fiemap_extent has the FIEMAP_EXTENT_UNWRITTEN flag
> set, cp should treat that as a hole, because the extent is all zeros.
> (This will greatly help performance in some cases.) Also, if an input
> extent is read and a block of it is found to be zeros, cp should skip
> over that block when writing.
I pretty much agree with everything you wrote above. Fiemap is
introduced into cp(1) to optimize the existing --sparse behavior. So we
should be holding to that behavior.
In fact, I'll go one further. While I haven't looked at the
actual cp code in a little bit, I do not think that fiemap is used for
the non-sparse copy. I contend that fiemap should be used for all
copies if the capability exists. Even if cp is doing a non-sparse copy,
there's no need for it to call read(2) on a 1GB hole in the file. It
can write zeros just fine on its own.
Basically, these capabilities (fiemap, SEEK_HOLE) are ways to
avoid reading zeros we can determine without reading and scanning. We
should always take advantage of that.
> If the proposed fiemap code was intended to copy the physical holes,
> then the code wasn't successful in implementing the intent, as there's
> no guarantee that the extents of the destination are the same as the
> extents of the source. If it is useful to have 'cp' copy the physical
> holes, this could be supported via a new cp option or options, but
> surely the default should continue to be to copy logical holes.
The goal was not physical holes, it was logical ones as you
surmise. I can't see any way to ensure the file has identical extents.
I don't know any filesystem that guarantees such allocations on
write(2).
I do think there might be an interesting performance
optimization to be had, but it is independent of fiemap. Above, you
describe the usual "cp --sparse" behavior of reading allocated data and
checking for zeroes. If you're optimizing for space usage -- what "cp
--sparse" is defined to do -- you definitely want to skip writing blocks
that are all zero. But it actually creates a crappier allocation and
I/O pattern for some files.
Imagine you have a data extent that is alternating blocks of
zeroes and ones:
hexdump -C /tmp/blk
00000000 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 |................|
*
00001000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00002000 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 |................|
*
00003000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00004000
... (continued for 4MB in this example)
00400000
Classic "cp --sparse" will spend CPU time scanning this extent, which
can be megabytes in size. It will then write out every other block,
which is terribly slow, and it leavs a horribly fragmented destination
file. The result is even more disastrous if those holes ever get filled
in.
I'm, of course, exaggerating a worst-case here. But folks
looking at I/O performance would love detection like "here's a linear
MB, sure it has some zero blocks, but it's faster to write the entire MB
than to chop it up."
Joel
--
Life's Little Instruction Book #456
"Send your loved one flowers. Think of a reason later."
Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker <at> oracle.com
Phone: (650) 506-8127
This bug report was last modified 14 years and 119 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.