GNU bug report logs - #79139
cp --reflink truncates sparse files on ZFS


Package: coreutils;

Reported by: Leah Neukirchen <leah <at> vuxu.org>

Date: Fri, 1 Aug 2025 15:02:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>


From: Leah Neukirchen <leah <at> vuxu.org>
To: 79139 <at> debbugs.gnu.org
Subject: bug#79139: cp --reflink truncates sparse files on ZFS
Date: Fri, 01 Aug 2025 19:33:05 +0200
I debugged this further:

The issue requires a combination of conditions that rarely occur together:
- source and destination must be on different mountpoints, so FICLONE fails
- the fallback copy_file_range usually copies at most 2GB per call on ZFS,
  but it seems to be able to copy more at once when copying from a
  snapshot.

The problem is that the return value then comes back as a negative
number.  It was not clear to me at first how that happens, since ssize_t
is a signed 64-bit type here and can hold the value, but gdb agrees:

Breakpoint 1, copy_file_range (infd=infd@entry=3, pinoff=pinoff@entry=0x0, outfd=outfd@entry=4, poutoff=poutoff@entry=0x0, length=137304735744,
    flags=flags@entry=0) at ../sysdeps/unix/sysv/linux/copy_file_range.c:27
27      {
(gdb) fin
Run till exit from #0  copy_file_range (infd=infd@entry=3, pinoff=pinoff@entry=0x0, outfd=outfd@entry=4, poutoff=poutoff@entry=0x0, length=137304735744,
    flags=flags@entry=0) at ../sysdeps/unix/sysv/linux/copy_file_range.c:27
sparse_copy (src_fd=src_fd@entry=3, dest_fd=dest_fd@entry=4, abuf=abuf@entry=0x7fffffffd9d8, buf_size=buf_size@entry=262144, hole_size=0,
    punch_holes=punch_holes@entry=true, allow_reflink=true, src_name=0x7fffffffe3d7 "/.zfs/snapshot/pre-fixup/var/lib/libvirt/images/celestis.img",
    dst_name=0x7fffffffe414 "celestis.img", max_n_read=137304735744, total_n_read=0x7fffffffd9e0, last_write_made_hole=0x7fffffffd9d0) at src/copy.c:344
344             if (n_copied == 0)
Value returned is $2 = -134217728

Then the error branch is triggered and the code mistakenly reads errno
(which is still 18, EXDEV, from the failed FICLONE), so is_CLONENOTSUP
is true.  We leave the loop without reporting an error, total_n_read is
still 0, and cp ends up truncating the destination because it thinks the
source file has shrunk.  Unfortunate.
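
The failure path above hinges on errno being consulted after a call that
did not set it.  A minimal sketch of a more defensive check follows.  The
enum and function names are hypothetical; this is not the coreutils code
or its eventual fix.  It assumes the only legitimate results are -1 (with
errno set), 0 (end of file), or a positive count no larger than what was
requested, and it treats anything else as bogus rather than trusting
errno.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical classification of a copy_file_range-style return value. */
enum copy_result
{
  COPY_PROGRESS,   /* 1..requested bytes were copied */
  COPY_EOF,        /* 0: nothing left to copy */
  COPY_ERRNO,      /* -1: errno was set by this call */
  COPY_BOGUS       /* anything else: errno may be stale, do not read it */
};

static enum copy_result
classify_copy_result (ssize_t n_copied, size_t requested)
{
  if (n_copied == -1)
    return COPY_ERRNO;
  if (n_copied == 0)
    return COPY_EOF;
  /* Negative values other than -1, or more bytes than asked for,
     cannot come from a well-behaved call.  */
  if (n_copied < 0 || (size_t) n_copied > requested)
    return COPY_BOGUS;
  return COPY_PROGRESS;
}
```

For the value seen in the gdb session, classify_copy_result (-134217728,
137304735744) yields COPY_BOGUS, so a caller could report a failure
instead of acting on the errno left behind by FICLONE.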

I think the return value gets corrupted in glibc, see:
https://github.com/bminor/glibc/blob/d9a348d0927c7a1aec5caf3df3fcd36956b3eb23/nptl/cancellation.c#L66

long int
__syscall_cancel (__syscall_arg_t a1, __syscall_arg_t a2,
		  __syscall_arg_t a3, __syscall_arg_t a4,
		  __syscall_arg_t a5, __syscall_arg_t a6,
		  __SYSCALL_CANCEL7_ARG_DEF __syscall_arg_t nr)
{
  int r = __internal_syscall_cancel (a1, a2, a3, a4, a5, a6,
				     __SYSCALL_CANCEL7_ARG nr);
  return __glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (r))
	 ? SYSCALL_ERROR_LABEL (INTERNAL_SYSCALL_ERRNO (r))
	 : r;
}

Here, r should be a long int: declared as int, it truncates the 64-bit
syscall result to 32 bits before it is returned.
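
The truncation can be reproduced in isolation.  The following is a
minimal sketch, not glibc code: the helper names are hypothetical, and it
assumes a compiler where converting an out-of-range value to a signed
32-bit type wraps around (as GCC and Clang do).  It passes a 64-bit
result through a 32-bit variable, the way the `int r` above does, and
widens it again for the caller.

```c
#include <stdint.h>

/* Hypothetical stand-in for the `int r = ...` assignment: keep only the
   low 32 bits of a 64-bit syscall result, then widen for the caller.  */
static int64_t
through_int (int64_t syscall_result)
{
  int32_t r = (int32_t) (uint32_t) syscall_result;  /* high bits are lost */
  return r;                                         /* sign-extended */
}

/* What the wrapper would return if r had the full width.  */
static int64_t
through_long (int64_t syscall_result)
{
  int64_t r = syscall_result;
  return r;
}
```

With the length from the gdb session, through_int (137304735744) gives
-134217728 (0x1FF8000000 cut down to 0xF8000000), which is exactly the
value gdb reported, while through_long returns 137304735744 unchanged.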

As a workaround, copy_max could be clamped to 2GB.
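
A minimal sketch of such a clamp, under stated assumptions: the macro and
function names are hypothetical and this is not an actual coreutils
patch.  Since exactly 2 GiB (2^31) would itself not fit in a 32-bit int,
the limit chosen here is slightly lower, INT_MAX rounded down to a 4 KiB
boundary.

```c
#include <stddef.h>

/* Hypothetical per-call limit: just under 2 GiB, so that a returned
   byte count always fits in a signed 32-bit int.  */
#define COPY_CHUNK_LIMIT ((size_t) 0x7ffff000)

/* Return the length to request from a single copy_file_range call.  */
static size_t
clamp_copy_len (size_t wanted)
{
  return wanted < COPY_CHUNK_LIMIT ? wanted : COPY_CHUNK_LIMIT;
}
```

With this in place, a request for 137304735744 bytes would be issued as
repeated calls of at most 0x7ffff000 bytes each, so a truncated return
value could no longer turn negative.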

P.S.: Why does coreutils cat not fail as well? It checks the return
value against -1, which this value is not...

-- 
Leah Neukirchen  <leah <at> vuxu.org>  https://leahneukirchen.org/





