GNU bug report logs - #6131
[PATCH]: fiemap support for efficient sparse file copy

Package: coreutils;

Reported by: "jeff.liu" <jeff.liu <at> oracle.com>

Date: Fri, 7 May 2010 14:16:02 UTC

Severity: normal

Tags: patch

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

From: Joel Becker <Joel.Becker <at> oracle.com>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: Sunil Mushran <sunil.mushran <at> oracle.com>, Paul Eggert <eggert <at> CS.UCLA.EDU>, bug-coreutils <at> gnu.org, Jim Meyering <jim <at> meyering.net>, "jeff.liu" <jeff.liu <at> oracle.com>, Chris Mason <chris.mason <at> oracle.com>, Tao Ma <tao.ma <at> oracle.com>
Subject: bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Date: Thu, 15 Jul 2010 15:12:58 -0700

On Thu, Jul 15, 2010 at 12:51:36AM +0100, Pádraig Brady wrote:
> On 14/07/10 18:45, Paul Eggert wrote:

	First and foremost, I re-concur with the broad strokes of the
--sparse={always,never,auto} conversation.  I think you all knew that,
though ;-)

> > It's not just fiemap.  It's also the Solaris interface with SEEK_HOLE
> > and SEEK_DATA.  The change should involve a module that isolates these
> > low-level details from copy.c.  copy.c should ask the new module for the
> > locations of the holes (or the non-holes: that could be more convenient).
> > On traditional hosts without fiemap or SEEK_DATA, the module should report
> > that it doesn't know where the holes are; this can let copy.c resort to
> > the existing heuristic of looking at the size and the disk usage and
> > using the --sparse=always approach if the file "smells" like it's sparse.

	While I think the final result wants to support both fiemap and
SEEK_HOLE, I think baby steps are in order.  If we just implement fiemap
right now, we can later turn that into init_extent_detection() and 
get_next_extent().
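
One possible shape for that pair of functions, built on the Linux FS_IOC_FIEMAP ioctl, is sketched below. Only the names init_extent_detection() and get_next_extent() come from this message; struct extent_scan, EXTENT_BATCH, free_extent_detection(), and the 1/0/-1 return convention are assumptions made for illustration. A SEEK_HOLE backend could later sit behind the same two calls.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fiemap.h>   /* struct fiemap, struct fiemap_extent */
#include <linux/fs.h>       /* FS_IOC_FIEMAP */

enum { EXTENT_BATCH = 32 };  /* extents requested per ioctl (arbitrary) */

/* Hypothetical iterator state; not part of any existing API.  */
struct extent_scan
{
  int fd;
  uint64_t next_start;       /* logical offset at which to resume */
  bool done;                 /* last extent already delivered */
  unsigned int count, index; /* extents in the current batch / next one */
  struct fiemap *fm;         /* request header followed by extent array */
};

static bool
init_extent_detection (struct extent_scan *scan, int fd)
{
  memset (scan, 0, sizeof *scan);
  scan->fd = fd;
  scan->fm = malloc (sizeof *scan->fm
                     + EXTENT_BATCH * sizeof (struct fiemap_extent));
  return scan->fm != NULL;
}

static void
free_extent_detection (struct extent_scan *scan)
{
  free (scan->fm);
  scan->fm = NULL;
}

/* Return 1 and fill *EXT with the next extent, 0 when there are no more,
   or -1 if the ioctl fails (for example, FIEMAP is unsupported), in which
   case the caller falls back to an ordinary copy.  */
static int
get_next_extent (struct extent_scan *scan, struct fiemap_extent *ext)
{
  if (scan->index == scan->count)
    {
      if (scan->done)
        return 0;
      memset (scan->fm, 0, sizeof *scan->fm);
      scan->fm->fm_start = scan->next_start;
      scan->fm->fm_length = FIEMAP_MAX_OFFSET - scan->next_start;
      scan->fm->fm_flags = FIEMAP_FLAG_SYNC;  /* flush before mapping */
      scan->fm->fm_extent_count = EXTENT_BATCH;
      if (ioctl (scan->fd, FS_IOC_FIEMAP, scan->fm) < 0)
        return -1;
      scan->count = scan->fm->fm_mapped_extents;
      scan->index = 0;
      if (scan->count == 0)
        {
          scan->done = true;
          return 0;
        }
    }
  *ext = scan->fm->fm_extents[scan->index++];
  scan->next_start = ext->fe_logical + ext->fe_length;
  if (ext->fe_flags & FIEMAP_EXTENT_LAST)
    scan->done = true;
  return 1;
}
```

A caller would loop on get_next_extent() until it returns 0, copying each reported range, and treat -1 on the first call as "no extent information available".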

> >> 2. Performance optimization, invoke fallocate(2) if an extent flag is UNWRITTEN
> > 
> > This doesn't sound right.  A FIEMAP_EXTENT_UNWRITTEN extent is all zeros, and
> > so it should act as if it were a hole.  The goal is not to copy the exact
> > fiemap structure of the source (that's impossible): the goal is to use as
> > little time and space as possible.

	What he said.  If you find an FIEMAP_EXTENT_UNWRITTEN extent,
you just skip it.  It is a hole for the purposes of copying.  If someone
really wants to clone the extent layout, they can use reflink(8).
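
In code, treating unwritten extents as holes amounts to a single flag test. The sketch below uses the real FIEMAP_EXTENT_UNWRITTEN flag and struct fiemap_extent from <linux/fiemap.h>; the helper names extent_needs_copy() and bytes_to_copy() are hypothetical and exist only to show the idea.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/fiemap.h>   /* struct fiemap_extent, FIEMAP_EXTENT_UNWRITTEN */

/* An unwritten (preallocated) extent reads back as zeros, so for the
   purposes of copying it is handled exactly like a hole.  */
static bool
extent_needs_copy (struct fiemap_extent const *ext)
{
  return !(ext->fe_flags & FIEMAP_EXTENT_UNWRITTEN);
}

/* Total number of bytes that actually have to be read and written.  */
static uint64_t
bytes_to_copy (struct fiemap_extent const *exts, size_t n)
{
  uint64_t total = 0;
  for (size_t i = 0; i < n; i++)
    if (extent_needs_copy (&exts[i]))
      total += exts[i].fe_length;
  return total;
}
```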

> > It's not clear to me that the fiemap stuff can be cleanly separated
> > from the fallocate stuff.  To some extent they're the same issue.
> > If they can easily be separated, that's better of course.
> 
> I see fiemap as optimizing reads,
> posix_fallocate() as optimizing writing zeros
> and fallocate() as optimizing allocation.
> 
> So not having thought much about implementation details,
> it seems like they could be logically separated.

	I think they should absolutely be separated.  The fiemap patch
doesn't have to do anything with fallocate()/posix_fallocate() on the
write side.
	Let's get a happy fiemap patch.  Then a happy
[posix_]fallocate() patch.  Then a happy SEEK_HOLE patch.

Joel

-- 

"For every complex problem there exists a solution that is brief,
     concise, and totally wrong."
                                        -Unknown

Joel Becker
Consulting Software Developer
Oracle
E-mail: joel.becker <at> oracle.com
Phone: (650) 506-8127




This bug report was last modified 14 years and 119 days ago.
