Package: coreutils
Reported by: "jeff.liu" <jeff.liu <at> oracle.com>
Date: Fri, 7 May 2010 14:16:02 UTC
Severity: normal
Tags: patch
Done: Jim Meyering <jim <at> meyering.net>
Bug is archived. No further changes may be made.
Message #44 received at submit <at> debbugs.gnu.org:
From: "jeff.liu" <jeff.liu <at> oracle.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: Sunil Mushran <sunil.mushran <at> oracle.com>, Tao Ma <tao.ma <at> oracle.com>, bug-coreutils <at> gnu.org, Joel Becker <Joel.Becker <at> oracle.com>, Chris Mason <chris.mason <at> oracle.com>
Subject: Re: bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Date: Fri, 21 May 2010 23:51:03 +0800
Hi Jim,

This is the revised version. It fixes the fiemap-start offset calculation by
moving it out of the 'for (i = 0; i < fiemap->fm_mapped_extents; i++)' loop.

I do not have a 64-bit machine for testing at the moment. Of the two cases
below, the first was run only on x86, under valgrind, for the non-extent file
copy. It works for me; could you help verify on x64?

The second case tests that the logical offsets and lengths of the non-sparse
extents in the copied file are identical to those in the source file. `ex' is a
test tool I wrote in C to show extent info through the FIEMAP ioctl(2); it
steps through each extent of a file and prints the logical offset, extent
length, and physical offset.

jeff@jeff-laptop:~/opensource_dev/coreutils$ dd if=/dev/null of=/ext4/sp1 bs=1 seek=2G
jeff@jeff-laptop:~/opensource_dev/coreutils$ valgrind --version
valgrind-3.3.0-Debian
jeff@jeff-laptop:~/opensource_dev/coreutils$ valgrind ./src/cp --sparse=always /ext4/sp1 /ext4/sp2
==13678== Memcheck, a memory error detector.
==13678== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==13678== Using LibVEX rev 1804, a library for dynamic binary translation.
==13678== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==13678== Using valgrind-3.3.0-Debian, a dynamic binary instrumentation framework.
==13678== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==13678== For more details, rerun with: -v
==13678==
==13678==
==13678== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 23 from 1)
==13678== malloc/free: in use at exit: 0 bytes in 0 blocks.
==13678== malloc/free: 71 allocs, 71 frees, 10,255 bytes allocated.
==13678== For counts of detected errors, rerun with: -v
==13678== All heap blocks were freed -- no leaks are possible.
From 2a2df00acbcc9cdaef723f23efccb65d761d9093 Mon Sep 17 00:00:00 2001

jeff@jeff-laptop:~/opensource_dev/coreutils$ ./src/cp --sparse=always /ocfs2/sparse_dir/sparse_4 /ext4/sp2
jeff@jeff-laptop:~/opensource_dev/coreutils$ ./ex /ext4/sp2
Extents in file "/ext4/sp2": 14
Extents returned: 14
Logical: ###[       0] Ext length: ###[   65536] Physical: ###[352321536]
Logical: ###[   98304] Ext length: ###[   32768] Physical: ###[352419840]
Logical: ###[  229376] Ext length: ###[   32768] Physical: ###[352550912]
Logical: ###[  458752] Ext length: ###[   65536] Physical: ###[352780288]
Logical: ###[  950272] Ext length: ###[   65536] Physical: ###[353271808]
Logical: ###[ 1966080] Ext length: ###[   32768] Physical: ###[354287616]
Logical: ###[ 3932160] Ext length: ###[   65536] Physical: ###[356253696]
Logical: ###[ 7897088] Ext length: ###[   65536] Physical: ###[360218624]
Logical: ###[15826944] Ext length: ###[   65536] Physical: ###[384925696]
Logical: ###[31719424] Ext length: ###[   65536] Physical: ###[1004797952]
Logical: ###[63471616] Ext length: ###[   65536] Physical: ###[1011384320]
Logical: ###[126976000] Ext length: ###[   65536] Physical: ###[1016168448]
Logical: ###[254017536] Ext length: ###[   65536] Physical: ###[1025769472]
Logical: ###[508100608] Ext length: ###[   32768] Physical: ###[1036582912]

jeff@jeff-laptop:~/opensource_dev/coreutils$ ./src/cp --sparse=always /ext4/sp2 /ext4/sp2_fiemap
jeff@jeff-laptop:~/opensource_dev/coreutils$ ./ex /ext4/sp2_fiemap
Extents in file "/ext4/sp2_fiemap": 14
Extents returned: 14
Logical: ###[       0] Ext length: ###[   65536] Physical: ###[1040187392]
Logical: ###[   98304] Ext length: ###[   32768] Physical: ###[1040285696]
Logical: ###[  229376] Ext length: ###[   32768] Physical: ###[1040416768]
Logical: ###[  458752] Ext length: ###[   65536] Physical: ###[1040646144]
Logical: ###[  950272] Ext length: ###[   65536] Physical: ###[1041137664]
Logical: ###[ 1966080] Ext length: ###[   32768] Physical: ###[1042153472]
Logical: ###[ 3932160] Ext length: ###[   65536] Physical: ###[1044119552]
Logical: ###[ 7897088] Ext length: ###[   65536] Physical: ###[1048084480]
Logical: ###[15826944] Ext length: ###[   65536] Physical: ###[1056014336]
Logical: ###[31719424] Ext length: ###[   65536] Physical: ###[1063518208]
Logical: ###[63471616] Ext length: ###[   65536] Physical: ###[1070104576]
Logical: ###[126976000] Ext length: ###[   65536] Physical: ###[1125220352]
Logical: ###[254017536] Ext length: ###[   65536] Physical: ###[1134821376]
Logical: ###[508100608] Ext length: ###[   32768] Physical: ###[1145634816]

From 056bb15018466cc2b6b7ae2603fb41b6f61fa084 Mon Sep 17 00:00:00 2001
From: Jie Liu <jeff.liu <at> oracle.com>
Date: Fri, 21 May 2010 22:49:03 +0800
Subject: [PATCH 1/1] cp: Add FIEMAP support for efficient sparse file copy

* src/fiemap.h: Add fiemap.h for fiemap ioctl(2) support.
Copied from linux's include/linux/fiemap.h, with minor formatting changes.
* src/copy.c (copy_reg): Now, when `cp' is invoked with the --sparse=[WHEN]
option, we will try to do a FIEMAP-copy if the underlying file system
supports it, and fall back to a normal copy if it fails.
Signed-off-by: Jie Liu <jeff.liu <at> oracle.com>
---
 src/copy.c   |  153 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 src/fiemap.h |  102 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 255 insertions(+), 0 deletions(-)
 create mode 100644 src/fiemap.h

diff --git a/src/copy.c b/src/copy.c
index c16cef6..f32a676 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -63,6 +63,10 @@

 #include <sys/ioctl.h>

+#ifndef HAVE_FIEMAP
+# include "fiemap.h"
+#endif
+
 #ifndef HAVE_FCHOWN
 # define HAVE_FCHOWN false
 # define fchown(fd, uid, gid) (-1)
@@ -149,6 +153,135 @@ clone_file (int dest_fd, int src_fd)
 #endif
 }

+#ifdef __linux__
+# ifndef FS_IOC_FIEMAP
+#  define FS_IOC_FIEMAP _IOWR ('f', 11, struct fiemap)
+# endif
+/* Perform FIEMAP(available in mainline 2.6.27) copy if possible.
+   Call ioctl(2) with FS_IOC_FIEMAP to efficiently map file allocation
+   except for holes.  So the overhead to deal with holes with lseek(2) in
+   normal copy could be saved.  This would result in much faster backups
+   for any kind of sparse file.  */
+static bool
+fiemap_copy_ok (int src_fd, int dest_fd, size_t buf_size,
+                off_t src_total_size, char const *src_name,
+                char const *dst_name, bool *normal_copy_required)
+{
+  bool fail = false;
+  bool last = false;
+  char fiemap_buf[4096];
+  struct fiemap *fiemap = (struct fiemap *)fiemap_buf;
+  struct fiemap_extent *fm_ext = &fiemap->fm_extents[0];
+  uint32_t count = (sizeof (fiemap_buf) - sizeof (*fiemap)) /
+                    sizeof (struct fiemap_extent);
+  off_t last_ext_logical = 0;
+  uint64_t last_ext_len = 0;
+  uint64_t last_read_size = 0;
+  unsigned int i = 0;
+
+  memset (fiemap, 0, sizeof fiemap_buf);
+  do
+    {
+      fiemap->fm_length = FIEMAP_MAX_OFFSET;
+      fiemap->fm_extent_count = count;
+
+      /* When ioctl(2) fails, fall back to the normal copy only if it
+         is the first time we met.  */
+      if (ioctl (src_fd, FS_IOC_FIEMAP, fiemap) < 0)
+        {
+          /* If `i > 0', then at least one ioctl(2) has been performed before.  */
+          if (i == 0)
+            *normal_copy_required = true;
+          return false;
+        }
+
+      /* If 0 extents are returned, then more ioctls are not needed.  */
+      if (fiemap->fm_mapped_extents == 0)
+        break;
+
+      for (i = 0; i < fiemap->fm_mapped_extents; i++)
+        {
+          assert (fm_ext[i].fe_logical <= OFF_T_MAX);
+
+          off_t ext_logical = fm_ext[i].fe_logical;
+          uint64_t ext_len = fm_ext[i].fe_length;
+
+          if (lseek (src_fd, ext_logical, SEEK_SET) < 0LL)
+            {
+              error (0, errno, _("cannot lseek %s"), quote (src_name));
+              return fail;
+            }
+
+          if (lseek (dest_fd, ext_logical, SEEK_SET) < 0LL)
+            {
+              error (0, errno, _("cannot lseek %s"), quote (dst_name));
+              return fail;
+            }
+
+          if (fm_ext[i].fe_flags & FIEMAP_EXTENT_LAST)
+            {
+              last_ext_logical = ext_logical;
+              last_ext_len = ext_len;
+              last = true;
+            }
+
+          while (0 < ext_len)
+            {
+              char buf[buf_size];
+
+              /* Avoid reading into the holes if the left extent
+                 length is shorter than the buffer size.  */
+              if (ext_len < buf_size)
+                buf_size = ext_len;
+
+              ssize_t n_read = read (src_fd, buf, buf_size);
+              if (n_read < 0)
+                {
+#ifdef EINTR
+                  if (errno == EINTR)
+                    continue;
+#endif
+                  error (0, errno, _("reading %s"), quote (src_name));
+                  return fail;
+                }
+
+              if (n_read == 0)
+                {
+                  /* Figure out how many bytes read from the last extent.  */
+                  last_read_size = last_ext_len - ext_len;
+                  break;
+                }
+
+              if (full_write (dest_fd, buf, n_read) != n_read)
+                {
+                  error (0, errno, _("writing %s"), quote (dst_name));
+                  return fail;
+                }
+
+              ext_len -= n_read;
+            }
+        }
+      fiemap->fm_start = (fm_ext[i-1].fe_logical + fm_ext[i-1].fe_length);
+    } while (! last);
+
+  /* If a file ends up with holes, the sum of the last extent logical offset
+     and the read-returned size will be shorter than the actual size of the
+     file.  Use ftruncate to extend the length of the destination file.  */
+  if (last_ext_logical + last_read_size < src_total_size)
+    {
+      if (ftruncate (dest_fd, src_total_size) < 0)
+        {
+          error (0, errno, _("extending %s"), quote (dst_name));
+          return fail;
+        }
+    }
+
+  return ! fail;
+}
+#else
+static bool fiemap_copy_ok (ignored) { errno = ENOTSUP; return false; }
+#endif
+
 /* FIXME: describe */
 /* FIXME: rewrite this to use a hash table so we avoid the quadratic
    performance hit that's probably noticeable only on trees deeper
@@ -679,6 +812,25 @@ copy_reg (char const *src_name, char const *dst_name,
 #endif
         }

+      if (make_holes)
+        {
+          bool require_normal_copy = false;
+          /* Perform efficient FIEMAP copy for sparse files, fall back to the
+             standard copy only if the ioctl(2) fails.  */
+          if (fiemap_copy_ok (source_desc, dest_desc, buf_size,
+                              src_open_sb.st_size, src_name,
+                              dst_name, &require_normal_copy))
+            goto preserve_metadata;
+          else
+            {
+              if (! require_normal_copy)
+                {
+                  return_val = false;
+                  goto close_src_and_dst_desc;
+                }
+            }
+        }
+
       /* If not making a sparse file, try to use a more-efficient
          buffer size.  */
       if (! make_holes)
@@ -807,6 +959,7 @@ copy_reg (char const *src_name, char const *dst_name,
         }
     }

+preserve_metadata:
   if (x->preserve_timestamps)
     {
       struct timespec timespec[2];
diff --git a/src/fiemap.h b/src/fiemap.h
new file mode 100644
index 0000000..d33293b
--- /dev/null
+++ b/src/fiemap.h
@@ -0,0 +1,102 @@
+/* FS_IOC_FIEMAP ioctl infrastructure.
+   Some portions copyright (C) 2007 Cluster File Systems, Inc
+   Authors: Mark Fasheh <mfasheh <at> suse.com>
+            Kalpak Shah <kalpak.shah <at> sun.com>
+            Andreas Dilger <adilger <at> sun.com>.  */
+
+/* Copied from kernel, modified to respect GNU code style by Jie Liu.  */
+
+#ifndef _LINUX_FIEMAP_H
+# define _LINUX_FIEMAP_H
+
+# include <linux/types.h>
+
+struct fiemap_extent
+{
+  /* Logical offset in bytes for the start of the extent
+     from the beginning of the file.  */
+  uint64_t fe_logical;
+
+  /* Physical offset in bytes for the start of the extent
+     from the beginning of the disk.  */
+  uint64_t fe_physical;
+
+  /* Length in bytes for this extent.  */
+  uint64_t fe_length;
+
+  uint64_t fe_reserved64[2];
+
+  /* FIEMAP_EXTENT_* flags for this extent.  */
+  uint32_t fe_flags;
+
+  uint32_t fe_reserved[3];
+};
+
+struct fiemap
+{
+  /* Logical offset(inclusive) at which to start mapping(in).  */
+  uint64_t fm_start;
+
+  /* Logical length of mapping which userspace wants(in).  */
+  uint64_t fm_length;
+
+  /* FIEMAP_FLAG_* flags for request(in/out).  */
+  uint32_t fm_flags;
+
+  /* Number of extents that were mapped(out).  */
+  uint32_t fm_mapped_extents;
+
+  /* Size of fm_extents array(in).  */
+  uint32_t fm_extent_count;
+
+  uint32_t fm_reserved;
+
+  /* Array of mapped extents(out).  */
+  struct fiemap_extent fm_extents[0];
+};
+
+/* The maximum offset that can be mapped for a file.  */
+# define FIEMAP_MAX_OFFSET       (~0ULL)
+
+/* Sync file data before map.  */
+# define FIEMAP_FLAG_SYNC        0x00000001
+
+/* Map extended attribute tree.  */
+# define FIEMAP_FLAG_XATTR       0x00000002
+
+# define FIEMAP_FLAGS_COMPAT     (FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR)
+
+/* Last extent in file.  */
+# define FIEMAP_EXTENT_LAST              0x00000001
+
+/* Data location unknown.  */
+# define FIEMAP_EXTENT_UNKNOWN           0x00000002
+
+/* Location still pending.  Sets EXTENT_UNKNOWN.  */
+# define FIEMAP_EXTENT_DELALLOC          0x00000004
+
+/* Data can not be read while fs is unmounted.  */
+# define FIEMAP_EXTENT_ENCODED           0x00000008
+
+/* Data is encrypted by fs.  Sets EXTENT_NO_BYPASS.  */
+# define FIEMAP_EXTENT_DATA_ENCRYPTED    0x00000080
+
+/* Extent offsets may not be block aligned.  */
+# define FIEMAP_EXTENT_NOT_ALIGNED       0x00000100
+
+/* Data mixed with metadata.  Sets EXTENT_NOT_ALIGNED.  */
+# define FIEMAP_EXTENT_DATA_INLINE       0x00000200
+
+/* Multiple files in block.  Sets EXTENT_NOT_ALIGNED.  */
+# define FIEMAP_EXTENT_DATA_TAIL         0x00000400
+
+/* Space allocated, but not data (i.e. zero).  */
+# define FIEMAP_EXTENT_UNWRITTEN         0x00000800
+
+/* File does not natively support extents.  Result merged for efficiency.  */
+# define FIEMAP_EXTENT_MERGED            0x00001000
+
+/* Space shared with other files.  */
+# define FIEMAP_EXTENT_SHARED            0x00002000
+
+#endif
--
1.5.4.3

Cheers,
-Jeff
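
The source of the `ex' tool is not included in this message. For anyone who wants to reproduce extent listings like the ones above, here is a minimal sketch of how such a tool might walk a file's extents with FS_IOC_FIEMAP. It is an illustration under stated assumptions, not the tool that produced the output: the function names extents_per_buffer and dump_extents are hypothetical, the output format only imitates the listing above, and it uses the kernel's own <linux/fiemap.h> and <linux/fs.h> rather than the src/fiemap.h copy from the patch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fiemap.h>   /* struct fiemap, FIEMAP_* (kernel header) */
#include <linux/fs.h>       /* FS_IOC_FIEMAP */

/* Hypothetical helper: how many extent records fit after the fiemap
   header in a buffer of BUF_BYTES bytes (same arithmetic as the
   `count' computation in the patch).  */
size_t
extents_per_buffer (size_t buf_bytes)
{
  if (buf_bytes < sizeof (struct fiemap))
    return 0;
  return (buf_bytes - sizeof (struct fiemap)) / sizeof (struct fiemap_extent);
}

/* Hypothetical helper: print every extent of FD to OUT, one per line.
   Return the number of extents printed, or -1 with errno set when the
   ioctl fails (for example, when the file system lacks FIEMAP).  */
long
dump_extents (int fd, FILE *out)
{
  enum { BUF_BYTES = 4096 };
  struct fiemap *fm = calloc (1, BUF_BYTES);
  long total = 0;
  int last = 0;

  if (fm == NULL)
    return -1;

  fm->fm_length = FIEMAP_MAX_OFFSET;
  fm->fm_extent_count = (uint32_t) extents_per_buffer (BUF_BYTES);

  while (!last)
    {
      if (ioctl (fd, FS_IOC_FIEMAP, fm) < 0)
        {
          int saved_errno = errno;
          free (fm);
          errno = saved_errno;
          return -1;
        }

      /* No extents returned: nothing (more) to map.  */
      if (fm->fm_mapped_extents == 0)
        break;

      for (uint32_t i = 0; i < fm->fm_mapped_extents; i++)
        {
          const struct fiemap_extent *e = &fm->fm_extents[i];

          fprintf (out,
                   "Logical: ###[%8llu] Ext length: ###[%8llu] "
                   "Physical: ###[%8llu]\n",
                   (unsigned long long) e->fe_logical,
                   (unsigned long long) e->fe_length,
                   (unsigned long long) e->fe_physical);

          /* The next request starts right after this extent.  */
          fm->fm_start = e->fe_logical + e->fe_length;
          if (e->fe_flags & FIEMAP_EXTENT_LAST)
            last = 1;
          total++;
        }
    }

  free (fm);
  return total;
}
```

A caller would open the file read-only and pass the descriptor, e.g. dump_extents (fd, stdout), treating a -1 return as "FIEMAP not available here". With the struct layout shown in the patch (a 32-byte header and 56-byte extent records), a 4096-byte buffer holds 72 extents per ioctl call.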