GNU bug report logs - #51433
cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE

Package: coreutils

Reported by: Janne Heß <janne+coreutils <at> hess.ooo>

Date: Wed, 27 Oct 2021 11:56:01 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: Pádraig Brady <P <at> draigBrady.com>
To: Janne Heß <janne+coreutils <at> hess.ooo>, 51433 <at> debbugs.gnu.org
Subject: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Date: Wed, 3 Nov 2021 15:37:58 +0000

On 27/10/2021 11:00, Janne Heß wrote:
> Hi everyone,
> 
> I packaged coreutils 9.0 for NixOS and we found breakages that seemed to be very random during builds of packages
> that use the updated coreutils in their build process. It's really hard to tell the main cause but it seems like the issues
> are caused by binaries that are corrupted after cp copied them from /tmp to /nix. The issue arises both when the
> directories are on the same filesystem and when /tmp is on tmpfs.
> Upon further inspection/bisection we figured out these issues are caused by a6eaee501f6ec0c152abe88640203a64c390993e.
> This seems to happen on ZFS and indeed on the main coreutils mailing list there is a ZFS issue linked [1].
> The testsuite was patched in 61c81ffaacb0194dec31297bc1aa51be72315858 so it doesn't detect this issue anymore,
> but the issue still very much happens in the real world.
> 
> We have found this to happen while building the completions for a Go tool (jx), which seems to be the same
> issue as [2]. The tool is built, copied using cp, and then run, at which point it segfaults.
> 
> Building another package (peertube) on x86_64-linux on ext4 also fails with strange errors in the
> test suite, something about "Error: The service is no longer running". This does not happen when the mentioned
> coreutils commit is undone by replacing #ifdef with #if 0 [3].
> 
> We have also seen this issue on Darwin when building Alacritty but only happening on some machines
> but we were not able to pin it down any further there so this might be related or it might not.
> 
> Since the issue is so random, we started wondering if it might be related to -frandom-seed which changes in NixOS
> when rebuilding a package [4]. A thing to note here is that Nix does a lot of sandboxing stuff during builds which
> includes mount namespaces so a Kernel bug is not out of the question. All of these issues happened during Nix builds,
> coreutils 9.0 never made it out of the NixOS staging environment due to the builds breaking. We will probably disable
> the new code paths as outlined above so the issue is contained for NixOS users and does not hit any production environments.
> 
> [1]: https://github.com/openzfs/zfs/issues/11900
> [2]: https://github.com/golang/go/issues/48636
> [3]: https://raw.githubusercontent.com/NixOS/nixpkgs/bf0531b4f8a2de4ff2700797fb211a90c951786e/pkgs/tools/misc/coreutils/disable-seek-hole.patch
> [4]: https://github.com/NixOS/nixpkgs/pull/141684#issuecomment-952339263

Looks like there is a WIP fix for OpenZFS mentioned at [1],
where mmap'd regions were not being flushed:
https://github.com/openzfs/zfs/commit/f2eebe07

So this should unblock enabling coreutils 9 at some stage at least.
Now that they know what's going on, I've asked at [1]
how programs might best distinguish buggy instances of OpenZFS.

cheers,
Pádraig
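
Editorial illustration (not part of the original message): for readers unfamiliar with the mechanism under discussion, here is a minimal sketch of how a copy based on SEEK_DATA/SEEK_HOLE walks a file. It is written in Python rather than the C used by cp, and the function names data_extents and sparse_copy are hypothetical; this is not the coreutils implementation. It assumes a platform where os.SEEK_DATA and os.SEEK_HOLE are available (for example Linux). The point it tries to show is that such a copy trusts the filesystem's report of where data lives, so a range wrongly reported as a hole (for instance because written data has not yet been flushed) would come out as zeros in the destination.

```python
import errno
import os


def data_extents(fd):
    """Yield (start, end) byte ranges that the filesystem reports as data."""
    size = os.fstat(fd).st_size
    offset = 0
    while offset < size:
        try:
            # Jump to the next region reported as containing data.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:  # nothing but a hole up to EOF
                return
            raise
        # Find where that data region ends (EOF counts as a hole).
        end = os.lseek(fd, start, os.SEEK_HOLE)
        if end <= start:  # defensive: avoid looping forever
            return
        yield start, end
        offset = end


def sparse_copy(src_path, dst_path):
    """Copy only the reported data extents; skipped ranges stay as holes."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        for start, end in data_extents(src.fileno()):
            pos = start
            while pos < end:
                chunk = os.pread(src.fileno(), min(1 << 20, end - pos), pos)
                if not chunk:
                    break
                os.pwrite(dst.fileno(), chunk, pos)
                pos += len(chunk)
        # Extend the destination to full length so a trailing hole is kept.
        dst.truncate(size)
```

Anything the filesystem reports as a hole is never read by sparse_copy, which is why a copy made this way can differ from the source if the hole report is wrong. Comparing the result against a plain byte-for-byte read of the source is one way to check for that.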



