GNU bug report logs - #51433
cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE

Previous Next

Package: coreutils;

Reported by: Janne Heß <janne+coreutils <at> hess.ooo>

Date: Wed, 27 Oct 2021 11:56:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Janne Heß <janne+coreutils <at> hess.ooo>, 51433 <at> debbugs.gnu.org
Subject: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Mon, 1 Nov 2021 13:27:15 +0000
On 01/11/2021 05:52, Paul Eggert wrote:
> On 10/31/21 09:13, Pádraig Brady wrote:
> 
>> The attached uses statfs()->f_type which is usually available,
>> to avoid using SEEK_DATA on ZFS.  This should be fairly lightweight I
>> think,
>> and only used for files that might be sparse.
> 
> Couldn't we be even lazier about invoking fstatfs? It needs to be
> invoked only if the file appears to be sparse (as you mentioned), and
> also only if lseek+SEEK_DATA fails (with errno==ENXIO) at a position
> less than the file's length, and also at most once per input file
> descriptor.
> 
> When copying multiple files (e.g., cp -r) we could also cache the
> previous file's st_dev and ZFS status, so in the typical case we could
> skip fstatfs entirely except for the first sparsish file that satisfies
> the abovementioned criteria.

I was thinking that sparse may be enough of a filter,
but these are valid suggestions.

> Also, as I understand it the problem occurs with OpenZFS but not on
> Solaris or its descendants, so we don't need to do any of this if __sun
> is defined.

Oh interesting.
If that was actually the case we could probably default
to allowing SEEK_DATA, and only disallow when f_type == ZFS,
i.e. allow SEEK_DATA on solaris.

> Also, if we're going to use fstatfs it'd probably be better to hoist all
> the fstatfs portability mess out of stat.c and into a new file (say,
> gl/lib/pstatfs.c) that gives us a portable way of getting statfs-like
> data out of filesystems, and using that new file in both stat.c and copy.c.

.. and tail.c. Good suggestion.

> One more idea: I've seen reports that the problem goes away if you use
> 'read' to read just the first byte of the false hole; that might be a
> simpler workaround than getting into the fstatfs portability mess.

Indeed. It would be really nice to get a solution supporting
existing, and future openzfs installations.

>> We could also disable SEEK_DATA for remote file systems,
>> but that would be a pity since SEEK_DATA seems especially useful there.
> 
> Absolutely.
> 
> Don't know if you're following
> <https://github.com/openzfs/zfs/issues/11900> but rincebrain reports a
> working workaround on the OpenZFS side tho more testing needs to be
> done. This OpenZFS bugs affects GNU tar and star (and surely other
> programs) as well as coreutils, and we may be better off overall if we
> merely ask people to fix their ZFS implementation rather than our trying
> to work around the bug in coreutils.

I see a slew of recent messages re workarounds in openzfs.
It's good to see some movement there, as it's definitely
best to address this in zfs.

cheers,
Pádraig




This bug report was last modified 3 years and 173 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.