GNU bug report logs - #6131
[PATCH]: fiemap support for efficient sparse file copy

Package: coreutils

Reported by: "jeff.liu" <jeff.liu <at> oracle.com>

Date: Fri, 7 May 2010 14:16:02 UTC

Severity: normal

Tags: patch

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.


Message #248 received at submit <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> CS.UCLA.EDU>
To: "jeff.liu" <jeff.liu <at> oracle.com>
Cc: Sunil Mushran <sunil.mushran <at> oracle.com>, bug-coreutils <at> gnu.org,
	Jim Meyering <jim <at> meyering.net>, Joel Becker <Joel.Becker <at> oracle.com>,
	Chris Mason <chris.mason <at> oracle.com>,
	Pádraig Brady <P <at> draigBrady.com>,
	Tao Ma <tao.ma <at> oracle.com>
Subject: Re: bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Date: Thu, 15 Jul 2010 15:51:40 -0700

>>> This doesn't sound right.  A FIEMAP_EXTENT_UNWRITTEN extent is all zeros, and
>>> so it should act as if it were a hole.  The goal is not to copy the exact
>>> fiemap structure of the source (that's impossible): the goal is to use as
>>> little time and space as possible.

> A FIEMAP_EXTENT_UNWRITTEN extent is marked as allocated, although
> reading it through the filesystem returns zeros.  So why not use
> fallocate(2) to deal with it?  IMHO, it meets the goal of using as
> little time and space as possible.  Am I missing something?

It's faster to simply skip around that extent while reading it, and to
skip around it when writing it, than to allocate it with fallocate
when writing it.  Logically, a FIEMAP_EXTENT_UNWRITTEN extent is a
hole, and should be optimized when reading and writing, just like any
hole.
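
To make the point concrete, here is a minimal sketch (not taken from the
patch under discussion) of classifying extents so that unwritten ones are
handled like holes.  It assumes a Linux system that provides
<linux/fiemap.h>; the helper names extent_needs_copy and bytes_to_copy
are hypothetical and exist only for illustration.

```c
#include <linux/fiemap.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the extent holds bytes that must be
   read from the source and written to the destination.  An unwritten
   extent reads back as zeros, so it is treated like a hole.  */
static bool
extent_needs_copy (const struct fiemap_extent *ext)
{
  return (ext->fe_flags & FIEMAP_EXTENT_UNWRITTEN) == 0;
}

/* Hypothetical helper: number of bytes a copy would actually transfer
   for the N extents in EXTS, skipping unwritten extents.  */
static uint64_t
bytes_to_copy (const struct fiemap_extent *exts, size_t n)
{
  uint64_t total = 0;

  for (size_t i = 0; i < n; i++)
    if (extent_needs_copy (&exts[i]))
      total += exts[i].fe_length;

  return total;
}
```

With a map of three extents where the middle one is unwritten, only the
first and last contribute to bytes_to_copy; the middle one costs neither
a read nor a write.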

>> I see fiemap as optimizing reads,
>> posix_fallocate() as optimizing writing zeros
>> and fallocate() as optimizing allocation.

It may not be quite that simple.  Some platforms won't have fallocate
and so posix_fallocate will have to do double duty as optimizing
allocation too.  Also, lseek is part of the process of optimizing
reads, and of optimizing writing zeros.  Most important, the
heuristics for optimizing the writes should use info derived from
optimizing the reads.
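
As a rough illustration of the write side reusing what the read side
learned, the sketch below copies only the ranges known to hold data and
uses lseek to step over everything else in both files.  It is an
assumption-laden example, not coreutils code: struct data_range and
copy_ranges are hypothetical names, the range list is assumed to come
from some earlier extent scan, and EINTR handling is omitted for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record of one data-bearing range found while reading.  */
struct data_range
{
  off_t offset;
  off_t length;
};

/* Copy only the N_RANGES ranges in RANGES from SRC_FD to DST_FD,
   seeking past the gaps so they stay holes in the destination.
   Return 0 on success, -1 on error.  */
static int
copy_ranges (int src_fd, int dst_fd, const struct data_range *ranges,
             size_t n_ranges, off_t total_size)
{
  char buf[64 * 1024];

  for (size_t i = 0; i < n_ranges; i++)
    {
      off_t left = ranges[i].length;

      /* Skip whatever precedes this range, on both sides.  */
      if (lseek (src_fd, ranges[i].offset, SEEK_SET) < 0
          || lseek (dst_fd, ranges[i].offset, SEEK_SET) < 0)
        return -1;

      while (left > 0)
        {
          size_t want = left < (off_t) sizeof buf ? (size_t) left : sizeof buf;
          ssize_t got = read (src_fd, buf, want);

          if (got < 0)
            return -1;
          if (got == 0)
            break;              /* Source ended before the range did.  */

          for (ssize_t done = 0; done < got; )
            {
              ssize_t put = write (dst_fd, buf + done, (size_t) (got - done));
              if (put < 0)
                return -1;
              done += put;
            }
          left -= got;
        }
    }

  /* Fix the final length so that a trailing hole is preserved.  */
  return ftruncate (dst_fd, total_size);
}
```

Whether the skipped regions really end up unallocated depends on the
destination file system; on one without sparse-file support they are
simply filled with zeros.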

I'm not objecting to breaking these improvements into two or three
pieces, if someone wants to do that.  However, it shouldn't be
required to break them up; it's OK if someone wants to do it all at
once.  (This stuff is not that hard, after all.)  I was planning to
give it a shot at some point but obviously have not done so yet.





This bug report was last modified 14 years and 119 days ago.
