GNU bug report logs - #6131
[PATCH]: fiemap support for efficient sparse file copy

Previous Next

Package: coreutils;

Reported by: "jeff.liu" <jeff.liu <at> oracle.com>

Date: Fri, 7 May 2010 14:16:02 UTC

Severity: normal

Tags: patch

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.

Full log


Message #122 received at submit <at> debbugs.gnu.org (full text, mbox):

From: "jeff.liu" <jeff.liu <at> oracle.com>
To: Jim Meyering <jim <at> meyering.net>, Paul Eggert <eggert <at> CS.UCLA.EDU>
Cc: Sunil Mushran <sunil.mushran <at> oracle.com>,
	Chris Mason <chris.mason <at> oracle.com>, bug-coreutils <at> gnu.org,
	Joel Becker <Joel.Becker <at> oracle.com>, Tao Ma <tao.ma <at> oracle.com>
Subject: Re: bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Date: Wed, 09 Jun 2010 15:33:18 +0800
Jim Meyering wrote:
> Paul Eggert wrote:
>> Jeff Liu and Jim Meyering wrote:
>>> diff --git a/src/fiemap.h b/src/fiemap.h
>>> new file mode 100644
>>> index 0000000..d33293b
>>> --- /dev/null
>>> +++ b/src/fiemap.h
> 
> Hi Paul,
> 
> Thanks for the review.
> 
>> Why is this file necessary?  fiemap.h is included only if it exists,
>> right?  Shouldn't we just use the kernel's fiemap.h rather than
>> copying it here and assuming kernel internals?
> 
> The ioctl is available/usable in 2.6.27 and newer that do not publish
> this file.  For example, it's in F13's (2.6.33's) /usr/include/linux/fiemap.h,
> as well as the one in debian unstable's 2.6.32, but probably
> not in much older kernels.
> 
> Hmm..  I see that it's available even in F11's 2.6.30.9-x
> 
> It would be better to include <linux/fiemap.h> when present,
> else our copy of that header.  Then, eventually, the else
> clause will become unused.  Note that even on newer kernels,
> the linux/* headers are optional.
> 
> Eventually we'll have a hard requirement on kernel headers --
> at least when building against a linux kernel.
> 
>>>          if (lseek (src_fd, ext_logical, SEEK_SET) < 0LL)
>> For this sort of thing, please just use "0" rather than "0LL".
>> "0" is easier to read and it has the same effect here.
> 
> Included in the patch below.
> 
>>>              char buf[buf_size];
>> This assumes C99, since buf_size is not known at compile time.
>> Also, isn't there a potential segmentation-violation problem
>> if buf_size is sufficiently large?
>>
>> More generally, since the caller is already allocating a buffer of the
>> appropiate size, shouldn't we just reuse that buffer, rather than
>> allocating a new one?  That would avoid the problems of assuming
>> C99 and possible segmentation violations.
> 
> Good point.  Thanks.
> We can definitely avoid that allocation.
> Do you feel like writing the patch?
> 
> I've just pushed this series to a branch, "fiemap-copy",
> so others can follow along and contribute more easily.
> 
>>>   char fiemap_buf[4096];
>>>   struct fiemap *fiemap = (struct fiemap *) fiemap_buf;
>>>   struct fiemap_extent *fm_ext = &fiemap->fm_extents[0];
>>>   uint32_t count = ((sizeof fiemap_buf - sizeof (*fiemap))
>>>                     / sizeof (struct fiemap_extent));
>> This code isn't portable, since fiemap_buf is only char-aligned, and
>> struct fiemap may well require stricter alignment.  The code will work
>> on the x86 despite the alignment problem, but that's just a happy
>> coincidence.
>>
>> A lesser point: the code assumes that 'struct fiemap' is sufficiently
>> small (considerably less than 4096 bytes in size); I expect that this
>> is universally true but we might as well check this assumption, since
>> it's easy to do so without any runtime overhead.
>>
>> So I propose something like this instead:
>>
>>    union { struct fiemap f; char c[4096]; } fiemap_buf;
>>    struct fiemap *fiemap = &fiemap_buf.f;
>>    struct fiemap_extent *fm_ext = &fiemap->fm_extents[0];
>>    enum { count = (sizeof fiemap_buf - sizeof *fiemap) / sizeof *fm_ext };
>>    verify (count != 0);
Cooool!!

Thanks Paul and Jim for the sharing.


Regards,
-Jeff

> 
> I've done this in your name:
> 
> From fffa2e8661a27978927fcc8afb6873631a753292 Mon Sep 17 00:00:00 2001
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> Date: Wed, 9 Jun 2010 08:15:07 +0200
> Subject: [PATCH] copy.c: ensure proper alignment of fiemap buffer
> 
> * src/copy.c (fiemap_copy): Ensure that our fiemap buffer
> is large enough and well-aligned.
> Replace "0LL" with equivalent "0" as 3rd argument to lseek.
> ---
>  src/copy.c |   15 ++++++++-------
>  1 files changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/src/copy.c b/src/copy.c
> index f629771..27e083a 100644
> --- a/src/copy.c
> +++ b/src/copy.c
> @@ -171,11 +171,12 @@ fiemap_copy (int src_fd, int dest_fd, size_t buf_size,
>               char const *dst_name, bool *normal_copy_required)
>  {
>    bool last = false;
> -  char fiemap_buf[4096];
> -  struct fiemap *fiemap = (struct fiemap *) fiemap_buf;
> +  union { struct fiemap f; char c[4096]; } fiemap_buf;
> +  struct fiemap *fiemap = &fiemap_buf.f;
>    struct fiemap_extent *fm_ext = &fiemap->fm_extents[0];
> -  uint32_t count = ((sizeof fiemap_buf - sizeof (*fiemap))
> -                    / sizeof (struct fiemap_extent));
> +  enum { count = (sizeof fiemap_buf - sizeof *fiemap) / sizeof *fm_ext };
> +  verify (count != 0);
> +
>    off_t last_ext_logical = 0;
>    uint64_t last_ext_len = 0;
>    uint64_t last_read_size = 0;
> @@ -185,7 +186,7 @@ fiemap_copy (int src_fd, int dest_fd, size_t buf_size,
>    /* This is required at least to initialize fiemap->fm_start,
>       but also serves (in mid 2010) to appease valgrind, which
>       appears not to know the semantics of the FIEMAP ioctl. */
> -  memset (fiemap_buf, 0, sizeof fiemap_buf);
> +  memset (&fiemap_buf, 0, sizeof fiemap_buf);
> 
>    do
>      {
> @@ -220,13 +221,13 @@ fiemap_copy (int src_fd, int dest_fd, size_t buf_size,
>            off_t ext_logical = fm_ext[i].fe_logical;
>            uint64_t ext_len = fm_ext[i].fe_length;
> 
> -          if (lseek (src_fd, ext_logical, SEEK_SET) < 0LL)
> +          if (lseek (src_fd, ext_logical, SEEK_SET) < 0)
>              {
>                error (0, errno, _("cannot lseek %s"), quote (src_name));
>                return false;
>              }
> 
> -          if (lseek (dest_fd, ext_logical, SEEK_SET) < 0LL)
> +          if (lseek (dest_fd, ext_logical, SEEK_SET) < 0)
>              {
>                error (0, errno, _("cannot lseek %s"), quote (dst_name));
>                return false;
> --
> 1.7.1.501.g23b46
> 
> 
> Also, I've squashed this clean-up patch onto Jeff's original,
> since ext_len is unsigned (of type uint64_t).
> 
> From bad13e737c683757a2ed05404564d8863c5da30e Mon Sep 17 00:00:00 2001
> From: Jim Meyering <meyering <at> redhat.com>
> Date: Wed, 9 Jun 2010 08:24:39 +0200
> Subject: [PATCH] remove 0 <
> 
> ---
>  src/copy.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/src/copy.c b/src/copy.c
> index 27e083a..f149be4 100644
> --- a/src/copy.c
> +++ b/src/copy.c
> @@ -240,7 +240,7 @@ fiemap_copy (int src_fd, int dest_fd, size_t buf_size,
>                last = true;
>              }
> 
> -          while (0 < ext_len)
> +          while (ext_len)
>              {
>                char buf[buf_size];
> 
> --
> 1.7.1.501.g23b46
> 
> 
> 


-- 
With Windows 7, Microsoft is asserting legal control over your computer and is using this power to
abuse computer users.




This bug report was last modified 14 years and 119 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.