From unknown Mon Aug 18 02:05:16 2025
From: bug#51433 <51433@debbugs.gnu.org>
To: bug#51433 <51433@debbugs.gnu.org>
Subject: Status: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Reply-To: bug#51433 <51433@debbugs.gnu.org>
Date: Mon, 18 Aug 2025 09:05:16 +0000

retitle 51433 cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
reassign 51433 coreutils
submitter 51433 Janne Heß
severity 51433 normal
thanks

From debbugs-submit-bounces@debbugs.gnu.org Wed Oct 27 07:55:40 2021
From: Janne Heß
To: bug-coreutils@gnu.org
Date: Wed, 27 Oct 2021 12:00:06 +0200
Subject: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
Message-ID: <2cba43-61792300-71-753b5f00@59006960>

Hi everyone,

I packaged coreutils 9.0 for NixOS and we found breakages that seemed to be very random during builds of packages
that use the updated coreutils in their build process.
It's really hard to tell the main cause but it seems like the issues
are caused by binaries that are corrupted after cp copied them from /tmp to /nix. The issue arises both when the
directories are on the same filesystem and when /tmp is on tmpfs.
Upon further inspection/bisection we figured out these issues are caused by a6eaee501f6ec0c152abe88640203a64c390993e.
This seems to happen on ZFS and indeed on the main coreutils mailing list there is a ZFS issue linked [1].
The testsuite was patched in 61c81ffaacb0194dec31297bc1aa51be72315858 so it doesn't detect this issue anymore,
but the issue still very much happens in the real world.

We have found this to happen while building the completions for a go tool (jx) which seems to be the same
issue as [2]. The tool is built, copied using cp, and called which causes a segfault to happen.

Building another package (peertube) on x86_64-linux on ext4 also fails with strange errors in the
test suite, something about "Error: The service is no longer running". This does not happen when the mentioned
coreutils commit is undone by replacing #ifdef with #if 0 [3].

We have also seen this issue on Darwin when building Alacritty but only happening on some machines
but we were not able to pin it down any further there so this might be related or it might not.

Since the issue is so random, we started wondering if it might be related to -frandom-seed which changes in NixOS
when rebuilding a package [4]. A thing to note here is that Nix does a lot of sandboxing stuff during builds which
includes mount namespaces so a Kernel bug is not out of the question. All of these issues happened during Nix builds,
coreutils 9.0 never made it out of the NixOS staging environment due to the builds breaking. We will probably disable
the new code paths as outlined above so the issue is contained for NixOS users and does not hit any production environments.
[1]: https://github.com/openzfs/zfs/issues/11900
[2]: https://github.com/golang/go/issues/48636
[3]: https://raw.githubusercontent.com/NixOS/nixpkgs/bf0531b4f8a2de4ff2700797fb211a90c951786e/pkgs/tools/misc/coreutils/disable-seek-hole.patch
[4]: https://github.com/NixOS/nixpkgs/pull/141684#issuecomment-952339263

From debbugs-submit-bounces@debbugs.gnu.org Wed Oct 27 11:36:39 2021
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
To: Janne Heß, 51433@debbugs.gnu.org
References: <2cba43-61792300-71-753b5f00@59006960>
From: Pádraig Brady
Message-ID: <30cdf0da-cb33-8e41-8b41-b6dff043e182@draigBrady.com>
Date: Wed, 27 Oct 2021 16:36:29 +0100
In-Reply-To: <2cba43-61792300-71-753b5f00@59006960>

On 27/10/2021 11:00, Janne Heß wrote:
> Hi everyone,
>
> I packaged coreutils 9.0 for NixOS and we found breakages that seemed to be very random during builds of packages
> that use the updated coreutils in their build process. It's really hard to tell the main cause but it seems like the issues
> are caused by binaries that are corrupted after cp copied them from /tmp to /nix. The issue arises both when the
> directories are on the same filesystem and when /tmp is on tmpfs.
> Upon further inspection/bisection we figured out these issues are caused by a6eaee501f6ec0c152abe88640203a64c390993e.
> This seems to happen on ZFS and indeed on the main coreutils mailing list there is a ZFS issue linked [1].
> The testsuite was patched in 61c81ffaacb0194dec31297bc1aa51be72315858 so it doesn't detect this issue anymore,
> but the issue still very much happens in the real world.
>
> We have found this to happen while building the completions for a go tool (jx) which seems to be the same
> issue as [2]. The tool is built, copied using cp, and called which causes a segfault to happen.
>
> Building another package (peertube) on x86_64-linux on ext4 also fails with strange errors in the
> test suite, something about "Error: The service is no longer running". This does not happen when the mentioned
> coreutils commit is undone by replacing #ifdef with #if 0 [3].
>
> We have also seen this issue on Darwin when building Alacritty but only happening on some machines
> but we were not able to pin it down any further there so this might be related or it might not.
>
> Since the issue is so random, we started wondering if it might be related to -frandom-seed which changes in NixOS
> when rebuilding a package [4]. A thing to note here is that Nix does a lot of sandboxing stuff during builds which
> includes mount namespaces so a Kernel bug is not out of the question. All of these issues happened during Nix builds,
> coreutils 9.0 never made it out of the NixOS staging environment due to the builds breaking. We will probably disable
> the new code paths as outlined above so the issue is contained for NixOS users and does not hit any production environments.
>
> [1]: https://github.com/openzfs/zfs/issues/11900
> [2]: https://github.com/golang/go/issues/48636
> [3]: https://raw.githubusercontent.com/NixOS/nixpkgs/bf0531b4f8a2de4ff2700797fb211a90c951786e/pkgs/tools/misc/coreutils/disable-seek-hole.patch
> [4]: https://github.com/NixOS/nixpkgs/pull/141684#issuecomment-952339263

We know about the ZFS issue with SEEK_HOLE:
https://lists.gnu.org/archive/html/coreutils/2021-10/msg00021.html

I've asked the user having nixos issues on darwin
whether they're using the zfs on darwin port,
or at least what file system is being copied from there.

This is awkward to handle unfortunately.
All I can think of now is to identify the file system type
for each source file, and disable SEEK_HOLE on zfs at least.

thanks for the info,
Pádraig

From debbugs-submit-bounces@debbugs.gnu.org Thu Oct 28 03:56:25 2021
Content-Type: multipart/mixed; boundary="------------KU0s1mD4iUtAat9UpEKMjzlc"
Date: Thu, 28 Oct 2021 00:56:11 -0700
From: Paul Eggert
To: Janne Heß
Cc: 51433@debbugs.gnu.org
References: <2cba43-61792300-71-753b5f00@59006960>
Organization: UCLA Computer Science Department
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
In-Reply-To: <2cba43-61792300-71-753b5f00@59006960>

This is a multi-part message in MIME format.
--------------KU0s1mD4iUtAat9UpEKMjzlc
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 10/27/21 03:00, Janne Heß wrote:
> Building another package (peertube) on x86_64-linux on ext4 also fails with strange errors in the
> test suite, something about "Error: The service is no longer running". This does not happen when the mentioned
> coreutils commit is undone by replacing #ifdef with #if 0 [3].

So the problem is not limited to ZFS? Which means that even if we
implemented Pádraig's suggestion and disabled SEEK_HOLE on zfs, we'd
still run into problems? That's really puzzling. Particularly since it's
not clear what program is generating the diagnostic "The service is no
longer running", or how it's related to GNU cp.

Anyway, the ZFS issue sounds like a serious bug in lseek+SEEK_DATA that
really needs to be fixed. This is not just a coreutils issue, as other
programs use SEEK_DATA.

I assume the ZFS bug (if the bug is related to ZFS, anyway) is a race
condition of some sort; at least, that's what the trace in
suggests.

In particular, I was struck that the depthcharge.config file that 'cp'
was reading from was created by some other process, this way:

[pid 3014182] openat(AT_FDCWD,
"/build/guybrush/tmp/portage/sys-boot/depthcharge-0.0.1-r3237/image/firmware/guybrush/depthcharge/depthcharge.config",
O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 4
[pid 3014182] fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
[pid 3014182] ioctl(4, TCGETS, 0x7ffd919d61c0) = -1 ENOTTY
(Inappropriate ioctl for device)
[pid 3014182] lseek(3, 0, SEEK_CUR) = 0
[pid 3014182] lseek(3, 0, SEEK_DATA) = 0
[pid 3014182] lseek(3, 0, SEEK_HOLE) = 9608
[pid 3014182] copy_file_range(3, [0], 4, [0], 9608, 0) = 9608
[pid 3014182] lseek(3, 0, SEEK_CUR) = 9608
[pid 3014182] lseek(3, 9608, SEEK_DATA) = -1 ENXIO (No such device or
address)
[pid 3014182] lseek(3, 0, SEEK_END) = 9608
[pid 3014182] ftruncate(4, 9608) = 0
[pid 3014182] close(4) = 0

So, one hypothesis is that ZFS's implementation of copy_file_range does
not set up data structures appropriately for cp's later use of
lseek+SEEK_DATA when reading depthcharge.config. That is, from cp's
point of view, the ftruncate(4, 9608) has been executed but the
copy_file_range(3, [0], 4, [0], 9608, 0) has not been executed yet (it's
cached somewhere, no doubt).

If my guess is right, then an fdatasync or fsync on cp's input might
work around common instances of this ZFS bug. Could you try the
attached coreutils patch, and see whether it works around the bug? Or
perhaps change 'fdatasync' with 'fsync' in the attached patch? Thanks.
--------------KU0s1mD4iUtAat9UpEKMjzlc
Content-Type: text/x-patch; charset=UTF-8; name="0001-cp-attempt-to-work-around-ZFS-bug.patch"
Content-Disposition: attachment; filename="0001-cp-attempt-to-work-around-ZFS-bug.patch"
Content-Transfer-Encoding: base64

RnJvbSA0NTFjMTVmY2I3MmUwOWFmYmE0OTlmMTg4ZWI2ZTAzY2VkNDliMWNhIE1vbiBTZXAg
MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1
PgpEYXRlOiBUaHUsIDI4IE9jdCAyMDIxIDAwOjQxOjU1IC0wNzAwClN1YmplY3Q6IFtQQVRD
SF0gY3A6IGF0dGVtcHQgdG8gd29yayBhcm91bmQgWkZTIGJ1ZwoKKiBzcmMvY29weS5jIChp
bmZlcl9zY2FudHlwZSk6IEF0dGVtcHQgYSBwYXJ0aWFsIHdvcmthcm91bmQgZm9yIGEKTGlu
dXggZmlsZSBzeXN0ZW0gYnVnIChzZWUgQnVnIzUxNDMzKS4KLS0tCiBzcmMvY29weS5jIHwg
MTAgKysrKysrKysrKwogMSBmaWxlIGNoYW5nZWQsIDEwIGluc2VydGlvbnMoKykKCmRpZmYg
LS1naXQgYS9zcmMvY29weS5jIGIvc3JjL2NvcHkuYwppbmRleCBjYjkwMThmOTMuLjZmNDU2
NmQ5ZiAxMDA2NDQKLS0tIGEvc3JjL2NvcHkuYworKysgYi9zcmMvY29weS5jCkBAIC0xMDk3
LDYgKzEwOTcsMTYgQEAgaW5mZXJfc2NhbnR5cGUgKGludCBmZCwgc3RydWN0IHN0YXQgY29u
c3QgKnNiLAogCiAjaWZkZWYgU0VFS19IT0xFCiAgIHNjYW5faW5mZXJlbmNlLT5leHRfc3Rh
cnQgPSBsc2VlayAoZmQsIDAsIFNFRUtfREFUQSk7CisKKyAgLyogSWYgbHNlZWsgZmFpbHMg
d2l0aCBFTlhJTywgaXQgbWF5IGJlIGEgTGludXggZmlsZSBzeXN0ZW0gYnVnCisgICAgIDxo
dHRwczovL2J1Z3MuZ251Lm9yZy81MTQzMz4uICBUcnkgdG8gd29yayBhcm91bmQgaXQgYnkK
KyAgICAgZmRhdGFzeW5jaW5nIHRoZSBpbnB1dCBhbmQgdGhlbiByZXRyeWluZyB0aGUgbHNl
ZWsuICAqLworICBpZiAoc2Nhbl9pbmZlcmVuY2UtPmV4dF9zdGFydCA8IDAgJiYgZXJybm8g
PT0gRU5YSU8pCisgICAgeworICAgICAgZmRhdGFzeW5jIChmZCk7CisgICAgICBzY2FuX2lu
ZmVyZW5jZS0+ZXh0X3N0YXJ0ID0gbHNlZWsgKGZkLCAwLCBTRUVLX0RBVEEpOworICAgIH0K
KwogICBpZiAoMCA8PSBzY2FuX2luZmVyZW5jZS0+ZXh0X3N0YXJ0KQogICAgIHJldHVybiBM
U0VFS19TQ0FOVFlQRTsKICAgZWxzZSBpZiAoZXJybm8gIT0gRUlOVkFMICYmICFpc19FTk9U
U1VQIChlcnJubykpCi0tIAoyLjMxLjEKCg==
--------------KU0s1mD4iUtAat9UpEKMjzlc--

From debbugs-submit-bounces@debbugs.gnu.org Thu Oct 28 09:54:26 2021
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
To: Paul Eggert, Janne Heß
Cc: 51433@debbugs.gnu.org
References: <2cba43-61792300-71-753b5f00@59006960>
From: Pádraig Brady
Message-ID: <3e2b31ad-008c-afa6-2b30-416109800905@draigBrady.com>
Date: Thu, 28 Oct 2021 14:54:12 +0100

On 28/10/2021 08:56, Paul Eggert wrote:
> On 10/27/21 03:00, Janne Heß wrote:
>> Building another package (peertube) on x86_64-linux on ext4 also fails with strange errors in the
>> test suite, something about "Error: The service is no longer running". This does not happen when the mentioned
>> coreutils commit is undone by replacing #ifdef with #if 0 [3].
>
> So the problem is not limited to ZFS? Which means that even if we
> implemented Pádraig's suggestion and disabled SEEK_HOLE on zfs, we'd
> still run into problems? That's really puzzling. Particularly since it's
> not clear what program is generating the diagnostic "The service is no
> longer running", or how it's related to GNU cp.
>
> Anyway, the ZFS issue sounds like a serious bug in lseek+SEEK_DATA that
> really needs to be fixed. This is not just a coreutils issue, as other
> programs use SEEK_DATA.
>
> I assume the ZFS bug (if the bug is related to ZFS, anyway) is a race
> condition of some sort; at least, that's what the trace in
> suggests.
>
> In particular, I was struck that the depthcharge.config file that 'cp'
> was reading from was created by some other process, this way:
>
> [pid 3014182] openat(AT_FDCWD,
> "/build/guybrush/tmp/portage/sys-boot/depthcharge-0.0.1-r3237/image/firmware/guybrush/depthcharge/depthcharge.config",
> O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 4
> [pid 3014182] fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> [pid 3014182] ioctl(4, TCGETS, 0x7ffd919d61c0) = -1 ENOTTY
> (Inappropriate ioctl for device)
> [pid 3014182] lseek(3, 0, SEEK_CUR) = 0
> [pid 3014182] lseek(3, 0, SEEK_DATA) = 0
> [pid 3014182] lseek(3, 0, SEEK_HOLE) = 9608
> [pid 3014182] copy_file_range(3, [0], 4, [0], 9608, 0) = 9608
> [pid 3014182] lseek(3, 0, SEEK_CUR) = 9608
> [pid 3014182] lseek(3, 9608, SEEK_DATA) = -1 ENXIO (No such device or
> address)
> [pid 3014182] lseek(3, 0, SEEK_END) = 9608
> [pid 3014182] ftruncate(4, 9608) = 0
> [pid 3014182] close(4) = 0
>
> So, one hypothesis is that ZFS's implementation of copy_file_range does
> not set up data structures appropriately for cp's later use of
> lseek+SEEK_DATA when reading depthcharge.config. That is, from cp's
> point of view, the ftruncate(4, 9608) has been executed but the
> copy_file_range(3, [0], 4, [0], 9608, 0) has not been executed yet (it's
> cached somewhere, no doubt).
>
> If my guess is right, then an fdatasync or fsync on cp's input might
> work around common instances of this ZFS bug. Could you try the
> attached coreutils patch, and see whether it works around the bug? Or
> perhaps change 'fdatasync' with 'fsync' in the attached patch? Thanks.

Further debugging from Nix folks suggest ZFS was in consideration always,
as invalid artifacts were written to a central cache from ZFS backed hosts.
So we should at least change the comment in the patch to only mention ZFS.

Also it seems like fsync() does avoid the ZFS issue as mentioned in:
https://github.com/openzfs/zfs/issues/11900

BTW I'm slightly worried about retrying SEEK_DATA as
FreeBSD 9.1 has a bug with large sparse files at least
where it takes ages for SEEK_DATA to return:
  36.13290615 lseek(3,0x0,SEEK_DATA) = -32768 (0xffff8000)
If ENXIO is not set in that case, then there is no issue.

Also I'm not sure restricting sync to ENXIO is general enough,
as an strace from a problematic cp, from the github issue above is:
  lseek(3, 0, SEEK_DATA) = 0
  lseek(3, 0, SEEK_HOLE) = 131072
  lseek(3, 0, SEEK_SET) = 0
  read(3, "\177ELF\2\1"..., 131072) = 131072
  write(4, "\177ELF\2\"..., 131072) = 131072
  lseek(3, 131072, SEEK_DATA) = -1 ENXIO
  ftruncate(4, 3318813) = 0

For completeness, there is another SEEK_DATA issue on ZFS,
but only a perf one, not a correctness one.
This was a perf issue we saw in our tests and handled with:
https://git.sv.gnu.org/cgit/coreutils.git/commit/?id=v9.0-3-g61c81ffaa
That case is where ZFS has an async operation that needs to complete
before SEEK_DATA can determine a sparse file is empty (return ENXIO).
That seems to be controlled with zfs_dmu_offset_next_sync.
One can see that perf issue with:

  rm f3 f4 && truncate -s 1T f3 && sleep 2 && timeout 1 src/cp --reflink=never f3 f4 && echo ok

If you change the `sleep 2` to a `sleep 4`, then "ok" is echoed.
Sometimes `sleep 3` works, so this seems to be the zfs timer.
Changing to a smaller file doesn't seem to change anything.
Using `sync f3`, `sync .`, or `sync` rather than `sleep 4` doesn't help.
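
Returning to the correctness issue: the sync-and-retry idea can be made concrete with a minimal sketch in C. This is not the attached patch and not what coreutils does. The names seek_data_sync_retry and demo_on_temp_file are hypothetical, the code assumes a Linux/glibc system where SEEK_DATA is available, and it uses fsync because that is the call reported to avoid the issue (whether fdatasync suffices is still an open question here). Unlike a check at offset 0 only, it retries after ENXIO at any offset, which is the case the second strace above shows.

```c
#define _GNU_SOURCE             /* assumption: glibc, exposes SEEK_DATA */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the offset of the next data at or after
   POS in FD.  If the first probe fails with ENXIO ("no data at or after
   POS"), sync the input and probe once more, in case the file system
   had not yet accounted for recently written data.  On failure return
   -1 with errno set by the last lseek call.  */
static off_t
seek_data_sync_retry (int fd, off_t pos)
{
  assert (0 <= pos);
  off_t r = lseek (fd, pos, SEEK_DATA);
  if (r < 0 && errno == ENXIO)
    {
      (void) fsync (fd);        /* best effort; the retry decides */
      r = lseek (fd, pos, SEEK_DATA);
    }
  return r;
}

/* Demo on a scratch file: data starts at offset 0, and probing at end
   of file still fails with ENXIO after the retry.  Return 0 if both
   hold, -1 otherwise.  */
static int
demo_on_temp_file (void)
{
  char path[] = "/tmp/seekdemoXXXXXX";
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;
  int ok = (write (fd, "hello", 5) == 5
            && seek_data_sync_retry (fd, 0) == 0
            && seek_data_sync_retry (fd, 5) == -1
            && errno == ENXIO);
  close (fd);
  unlink (path);
  return ok ? 0 : -1;
}
```

The cost is one extra fsync plus one extra lseek whenever ENXIO is genuine, for example at the end of every file that ends in data, so this trades some speed on all file systems for the workaround.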

cheers,
Pádraig

From debbugs-submit-bounces@debbugs.gnu.org Thu Oct 28 15:11:52 2021
Date: Thu, 28 Oct 2021 12:11:41 -0700
To: Pádraig Brady
Cc: Janne Heß, 51433@debbugs.gnu.org
References: <2cba43-61792300-71-753b5f00@59006960> <3e2b31ad-008c-afa6-2b30-416109800905@draigBrady.com>
From: Paul Eggert
Organization: UCLA Computer Science Department
Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE
In-Reply-To: <3e2b31ad-008c-afa6-2b30-416109800905@draigBrady.com>

On 10/28/21 06:54, Pádraig Brady wrote:

> Further debugging from Nix folks suggest ZFS was in consideration always,
> as invalid artifacts were written to a central cache from ZFS backed hosts.
> So we should at least change the comment in the patch to only mention ZFS.

Yes, that sounds reasonable.

This ZFS bug sounds pretty serious, though. Apparently it affects star
and other programs too. I'm not sure we should attempt to work around it
in coreutils, if the workarounds penalize everybody not using ZFS.

Is it cheap to check whether a file is actually in a ZFS filesystem?
(Don't know how this'd work with loopback mounts, NFS, etc.) If so, it
might be better to simply fdatasync (or even fsync) every input file
that's on ZFS, until we know the ZFS bugs are fixed.

In theory we could fdatasync/fsync every input file on every platform.
It'd be a shame to do that, though; that would slow down everybody
merely to work around this ZFS bug.

> Also it seems like fsync() does avoid the ZFS issue as mentioned in:
> https://github.com/openzfs/zfs/issues/11900

Yes. I'm hoping that fdatasync suffices as it's lighter-weight. But if
fsync is needed we can use fsync.

> BTW I'm slightly worried about retrying SEEK_DATA as
> FreeBSD 9.1 has a bug with large sparse files at least
> where it takes ages for SEEK_DATA to return:
>   36.13290615 lseek(3,0x0,SEEK_DATA) = -32768 (0xffff8000)
> If ENXIO is not set in that case, then there is no issue.

Wait - lseek returns a number less than -1?! We could easily check for
that FreeBSD bug, perhaps as an independent patch; this shouldn't
require any extra syscalls.

Also please see
. It appears
that ZFS has significant bugs in this area on FreeBSD, bugs that haven't
been fixed yet. That bug report does suggest that an fsync (and I hope
fdatasync) works around the bugs.

> Also I'm not sure restricting sync to ENXIO is general enough,
> as an strace from a problematic cp, from the github issue above is:
>   lseek(3, 0, SEEK_DATA) = 0
>   lseek(3, 0, SEEK_HOLE) = 131072
>   lseek(3, 0, SEEK_SET) = 0
>   read(3, "\177ELF\2\1"..., 131072) = 131072
>   write(4, "\177ELF\2\"..., 131072) = 131072
>   lseek(3, 131072, SEEK_DATA) = -1 ENXIO
>   ftruncate(4, 3318813) = 0

How about if we also do an fdatasync+retry after that 2nd lseek yields
ENXIO? Would that suffice to work around the ZFS bug? Would it be too
much of a performance penalty for non-ZFS users?
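
The no-extra-syscall check for the FreeBSD oddity mentioned above (lseek returning a value below -1) might look roughly like the following sketch. The function name seek_data_result_is_sane is hypothetical and this is not code from coreutils; it only inspects the value lseek already returned, relying on the fact that a successful SEEK_DATA never reports an offset before the one requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical helper: decide whether RESULT is a plausible return
   value for lseek (fd, REQUESTED, SEEK_DATA).  -1 is the ordinary
   failure indication (the caller then inspects errno); any success
   must be at or after REQUESTED.  Values such as -32768 are neither,
   so they are reported as not sane.  */
static bool
seek_data_result_is_sane (off_t result, off_t requested)
{
  assert (0 <= requested);
  return result == -1 || requested <= result;
}
```

A caller that gets false back could fall back to a plain read loop for that file instead of trusting the sparse-file scan.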
From debbugs-submit-bounces@debbugs.gnu.org Thu Oct 28 16:59:16 2021 Received: (at 51433) by debbugs.gnu.org; 28 Oct 2021 20:59:16 +0000 Received: from localhost ([127.0.0.1]:53371 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mgCUW-0004BL-5y for submit@debbugs.gnu.org; Thu, 28 Oct 2021 16:59:16 -0400 Received: from mail-wm1-f48.google.com ([209.85.128.48]:33284) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mgCUS-0004B6-Cd for 51433@debbugs.gnu.org; Thu, 28 Oct 2021 16:59:14 -0400 Received: by mail-wm1-f48.google.com with SMTP id j35-20020a05600c1c2300b0032caeca81b7so5356929wms.0 for <51433@debbugs.gnu.org>; Thu, 28 Oct 2021 13:59:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=sender:subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=YHs61xQ/B+z3j64xJx08CpupRpoD9cXQ6WnpblA9RSE=; b=ipa6idGHjE4yooUnXkc9aJQyhgp0yRiTp/OS1UYYlVfbf4JB38HtN/XvWWJZx7roSL 7WPEbFuSO4qHYQXJvyutQZgTjnv+Re9yWcYo4hC3Oo5qUWg04FCmAVJqX52alOj2rJgc Mku1OGB8DtAbBSbv8OW+OsjGx6G9aDaRq7RMwJ3Kucf1NdZbN+n4NSDFBfx3a8kcU2wM rioOKqe1xc23jIjkb/Qo0d5TSaX6m3KVIYpjAOnPIEOLu0GoXlwwb6+G+EvyVuTGLr2s FLUgWbuugnsAMTNlbJDX5tKYpo3zoGjhdunSs1GXJxCg8HJARIoppRigJ67fdae0fCzx SVJA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:sender:subject:to:cc:references:from:message-id :date:user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=YHs61xQ/B+z3j64xJx08CpupRpoD9cXQ6WnpblA9RSE=; b=GkQR61mJXs+GvSxf+XNW0Rkf70i5pBfzONJA6+7yGLoLbYVtQ1Aun1Dqku8KzPDQyj u34mqyLIvK948mB/H+0YAsLZRsLJeqU0IxXxg4HvnHeBwQBaxrFGcJZowGhmCKk3cJZd 64wmUZNfUbvBdSUlNpqgr8IQKgTzRl4goS5iKsNkviRBMwmLF2X2DOe30gzI+01mWQ5V jOuIyYVxMr/saVhlUy672LHp9U8wnj1qDJ0SE3d59TguTSNxaDrtRjOiCm+AJnx0KGFB TthphZ11S36NmOOSQCW9b9fArfaap+swxRXwMO5xysQOKjhWPYemUZImtT4T5VWdWYCO /Eow== X-Gm-Message-State: 
AOAM530Sl/Kn2xd+1PhnqXb+0x2WGxuZqeJYTV9vvZkvYF2mfLoChueV sSmXMZGXRdCWckGuKgzq7yu4tOBwAlLQBzyb X-Google-Smtp-Source: ABdhPJxVE7qH0VITAGdUDgbxF/5ftdWyKy6g95Tn4pPyqRHLUsTU2XezJ0QSOhGSIvlTwlD7W/33Qg== X-Received: by 2002:a05:600c:2909:: with SMTP id i9mr15006012wmd.74.1635454745804; Thu, 28 Oct 2021 13:59:05 -0700 (PDT) Received: from localhost.localdomain (86-40-129-104-dynamic.agg2.lod.rsl-rtd.eircom.net. [86.40.129.104]) by smtp.googlemail.com with UTF8SMTPSA id l2sm7844156wmi.1.2021.10.28.13.59.04 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 28 Oct 2021 13:59:05 -0700 (PDT) Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE To: Paul Eggert References: <2cba43-61792300-71-753b5f00@59006960> <3e2b31ad-008c-afa6-2b30-416109800905@draigBrady.com> From: =?UTF-8?Q?P=c3=a1draig_Brady?= Message-ID: Date: Thu, 28 Oct 2021 21:59:03 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Score: 0.4 (/) X-Debbugs-Envelope-To: 51433 Cc: =?UTF-8?Q?Janne_He=c3=9f?= , 51433@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.6 (/) On 28/10/2021 20:11, Paul Eggert wrote: > On 10/28/21 06:54, Pádraig Brady wrote: > >> Further debugging from Nix folks suggest ZFS was in consideration always, >> as invalid artifacts were written to a central cache from ZFS backed hosts. >> So we should at least change the comment in the patch to only mention ZFS. > > Yes, that sounds reasonable. > > This ZFS bug sounds pretty serious, though. Apparently it affects star > and other programs too. 
I'm not sure we should attempt to work around it > in coreutils, if the workarounds penalize everybody not using ZFS. > > Is it cheap to check whether a file is actually in a ZFS filesystem? > (Don't know how this'd work with loopback mounts, NFS, etc.) If so, it > might be better to simply fdatasync (or even fsync) every input file > that's on ZFS, until we know the ZFS bugs are fixed. > > In theory we could fdatasync/fsync every input file on every platform. > It'd be a shame to do that, though; that would slow down everybody > merely to work around this ZFS bug. > >> Also it seems like fsync() does avoid the ZFS issue as mentioned in: >> https://github.com/openzfs/zfs/issues/11900 > > Yes. I'm hoping that fdatasync suffices as it's lighter-weight. But if > fsync is needed we can use fsync. > >> BTW I'm slightly worried about retrying SEEK_DATA as >> FreeBSD 9.1 has a bug with large sparse files at least >> where it takes ages for SEEK_DATA to return: >>   36.13290615 lseek(3,0x0,SEEK_DATA)         = -32768 (0xffff8000) >> If ENXIO is not set in that case, then there is no issue. > > Wait - lseek returns a number less than -1?! We could easily check for > that FreeBSD bug, perhaps as an independent patch; this shouldn't > require any extra syscalls. > > Also please see > . It appears > that ZFS has significant bugs in this area on FreeBSD, bugs that haven't > been fixed yet. That bug report does suggest that an fsync (and I hope > fdatasync) works around the bugs. 
> >> Also I'm not sure restricting sync to ENXIO is general enough, >> as an strace from a problematic cp, from the github issue above is: >>   lseek(3, 0, SEEK_DATA)            = 0 >>   lseek(3, 0, SEEK_HOLE)            = 131072 >>   lseek(3, 0, SEEK_SET)             = 0 >>   read(3, "\177ELF\2\1"..., 131072) = 131072 >>   write(4, "\177ELF\2\"..., 131072) = 131072 >>   lseek(3, 131072, SEEK_DATA)       = -1 ENXIO >>   ftruncate(4, 3318813)             = 0 > > How about if we also do an fdatasync+retry after that 2nd lseek yields > ENXIO? Would that suffice to work around the ZFS bug? Would it be too > much of a performance penalty for non-ZFS users? I don't think there is anything special about first or second lseek(). ZFS just seems to be returning ENXIO depending on the write pattern of the file. Since we currently always have an expected SEEK_DATA ENXIO at end of file, doing a sync on any ENXIO would be best avoided. I wonder after getting a SEEK_DATA ENXIO, might we correlate we really are at end of file by checking SEEK_HOLE also returns ENXIO? If not we could do the datasync and try SEEK_DATA again. I've not seen enough syscall traces to be confident of that though. BTW we only attempt the SEEK_HOLE scanning when our heuristic of st_blocks being less than st_size indicates there may be holes. It's a bit surprising that that is the case for these elf binaries. Perhaps zfs is updating st_blocks async, or perhaps there are runs of zeros in these files that a linker or whatever is sparsifying? 
cheers, Pádraig From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 29 21:24:50 2021 Received: (at 51433) by debbugs.gnu.org; 30 Oct 2021 01:24:50 +0000 Received: from localhost ([127.0.0.1]:56340 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mgd73-0004Ha-OB for submit@debbugs.gnu.org; Fri, 29 Oct 2021 21:24:49 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:44902) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mgd70-0004HK-8i for 51433@debbugs.gnu.org; Fri, 29 Oct 2021 21:24:48 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 2D2E5160071; Fri, 29 Oct 2021 18:24:40 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id H3nGnrYv2pk1; Fri, 29 Oct 2021 18:24:39 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 522A91600B8; Fri, 29 Oct 2021 18:24:39 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id sIYegCb494cL; Fri, 29 Oct 2021 18:24:39 -0700 (PDT) Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 1C68D160071; Fri, 29 Oct 2021 18:24:39 -0700 (PDT) Content-Type: multipart/mixed; boundary="------------jHRmISFtN0NO008kVaPrNzFo" Message-ID: <49b12d1d-575e-3bb2-af3e-e8ae0ab740eb@cs.ucla.edu> Date: Fri, 29 Oct 2021 18:24:38 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0 Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE Content-Language: en-US From: Paul Eggert To: =?UTF-8?Q?P=c3=a1draig_Brady?= References: <2cba43-61792300-71-753b5f00@59006960> <3e2b31ad-008c-afa6-2b30-416109800905@draigBrady.com> Organization: 
UCLA Computer Science Department In-Reply-To: X-Spam-Score: -2.4 (--) X-Debbugs-Envelope-To: 51433 Cc: =?UTF-8?Q?Janne_He=c3=9f?= , 51433@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---) This is a multi-part message in MIME format. --------------jHRmISFtN0NO008kVaPrNzFo Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit On 10/28/21 12:11, Paul Eggert wrote: > Wait - lseek returns a number less than -1?! We could easily check for > that FreeBSD bug, perhaps as an independent patch; this shouldn't > require any extra syscalls. I installed the attached patch to do this. This doesn't fix coreutils bug#51433; it merely makes 'cp' and similar programs more likely to detect and report the FreeBSD 9.1 bug you mentioned. --------------jHRmISFtN0NO008kVaPrNzFo Content-Type: text/x-patch; charset=UTF-8; name="0001-cp-defend-better-against-FreeBSD-9.1-zfs-bug.patch" Content-Disposition: attachment; filename="0001-cp-defend-better-against-FreeBSD-9.1-zfs-bug.patch" Content-Transfer-Encoding: base64 RnJvbSA2NzQ5MTJjNmY3YzQ1YjVmMGIxZjc3ZGE4OTY2M2NhYzAyYzExOGY3IE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBGcmksIDI5IE9jdCAyMDIxIDE4OjAxOjM0IC0wNzAwClN1YmplY3Q6IFtQQVRD SF0gY3A6IGRlZmVuZCBiZXR0ZXIgYWdhaW5zdCBGcmVlQlNEIDkuMSB6ZnMgYnVnCk1JTUUt VmVyc2lvbjogMS4wCkNvbnRlbnQtVHlwZTogdGV4dC9wbGFpbjsgY2hhcnNldD1VVEYtOApD b250ZW50LVRyYW5zZmVyLUVuY29kaW5nOiA4Yml0CgpQcm9ibGVtIHJlcG9ydGVkIGJ5IFDD oWRyYWlnIEJyYWR5IChCdWcjNTE0MzMjMTQpLgoqIHNyYy9jb3B5LmMgKGxzZWVrX2NvcHks IGluZmVyX3NjYW50eXBlKTogUmVwb3J0IGFuIGVycm9yIGlmCmxzZWVrIHdpdGggU0VFS19E QVRBIG9yIFNFRUtfSE9MRSByZXR1cm5zIGxlc3MgdGhhbiAtMSwKYXMgdGhpcyBpcyBhbiBs 
c2VlayBidWcuCi0tLQogc3JjL2NvcHkuYyB8IDE1ICsrKysrKy0tLS0tLS0tLQogMSBmaWxl IGNoYW5nZWQsIDYgaW5zZXJ0aW9ucygrKSwgOSBkZWxldGlvbnMoLSkKCmRpZmYgLS1naXQg YS9zcmMvY29weS5jIGIvc3JjL2NvcHkuYwppbmRleCBjYjkwMThmOTMuLjFjYmM5NDgwYyAx MDA2NDQKLS0tIGEvc3JjL2NvcHkuYworKysgYi9zcmMvY29weS5jCkBAIC01MzAsNyArNTMw LDcgQEAgbHNlZWtfY29weSAoaW50IHNyY19mZCwgaW50IGRlc3RfZmQsIGNoYXIgKmJ1Ziwg c2l6ZV90IGJ1Zl9zaXplLAogICAgICAgb2ZmX3QgZXh0X2VuZCA9IGxzZWVrIChzcmNfZmQs IGV4dF9zdGFydCwgU0VFS19IT0xFKTsKICAgICAgIGlmIChleHRfZW5kIDwgMCkKICAgICAg ICAgewotICAgICAgICAgIGlmIChlcnJubyAhPSBFTlhJTykKKyAgICAgICAgICBpZiAoISAo ZXh0X2VuZCA9PSAtMSAmJiBlcnJubyA9PSBFTlhJTykpCiAgICAgICAgICAgICBnb3RvIGNh bm5vdF9sc2VlazsKICAgICAgICAgICBleHRfZW5kID0gc3JjX3RvdGFsX3NpemU7CiAgICAg ICAgICAgaWYgKGV4dF9lbmQgPD0gZXh0X3N0YXJ0KQpAQCAtNjA3LDEyICs2MDcsOCBAQCBs c2Vla19jb3B5IChpbnQgc3JjX2ZkLCBpbnQgZGVzdF9mZCwgY2hhciAqYnVmLCBzaXplX3Qg YnVmX3NpemUsCiAgICAgICAgIH0KIAogICAgICAgZXh0X3N0YXJ0ID0gbHNlZWsgKHNyY19m ZCwgZGVzdF9wb3MsIFNFRUtfREFUQSk7Ci0gICAgICBpZiAoZXh0X3N0YXJ0IDwgMCkKLSAg ICAgICAgewotICAgICAgICAgIGlmIChlcnJubyAhPSBFTlhJTykKLSAgICAgICAgICAgIGdv dG8gY2Fubm90X2xzZWVrOwotICAgICAgICAgIGJyZWFrOwotICAgICAgICB9CisgICAgICBp ZiAoZXh0X3N0YXJ0IDwgMCAmJiAhIChleHRfc3RhcnQgPT0gLTEgJiYgZXJybm8gPT0gRU5Y SU8pKQorICAgICAgICBnb3RvIGNhbm5vdF9sc2VlazsKICAgICB9CiAKICAgLyogV2hlbiB0 aGUgc291cmNlIGZpbGUgZW5kcyB3aXRoIGEgaG9sZSwgd2UgaGF2ZSB0byBkbyBhIGxpdHRs ZSBtb3JlIHdvcmssCkBAIC0xMDk3LDEwICsxMDkzLDExIEBAIGluZmVyX3NjYW50eXBlIChp bnQgZmQsIHN0cnVjdCBzdGF0IGNvbnN0ICpzYiwKIAogI2lmZGVmIFNFRUtfSE9MRQogICBz Y2FuX2luZmVyZW5jZS0+ZXh0X3N0YXJ0ID0gbHNlZWsgKGZkLCAwLCBTRUVLX0RBVEEpOwot ICBpZiAoMCA8PSBzY2FuX2luZmVyZW5jZS0+ZXh0X3N0YXJ0KQorICBpZiAoMCA8PSBzY2Fu X2luZmVyZW5jZS0+ZXh0X3N0YXJ0CisgICAgICB8fCAoc2Nhbl9pbmZlcmVuY2UtPmV4dF9z dGFydCA9PSAtMSAmJiBlcnJubyA9PSBFTlhJTykpCiAgICAgcmV0dXJuIExTRUVLX1NDQU5U WVBFOwogICBlbHNlIGlmIChlcnJubyAhPSBFSU5WQUwgJiYgIWlzX0VOT1RTVVAgKGVycm5v KSkKLSAgICByZXR1cm4gZXJybm8gPT0gRU5YSU8gPyBMU0VFS19TQ0FOVFlQRSA6IEVSUk9S 
X1NDQU5UWVBFOworICAgIHJldHVybiBFUlJPUl9TQ0FOVFlQRTsKICNlbmRpZgogCiAgIHJl dHVybiBaRVJPX1NDQU5UWVBFOwotLSAKMi4zMS4xCgo= --------------jHRmISFtN0NO008kVaPrNzFo-- From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 29 22:04:13 2021 Received: (at 51433) by debbugs.gnu.org; 30 Oct 2021 02:04:14 +0000 Received: from localhost ([127.0.0.1]:56363 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mgdjB-0005MZ-Lv for submit@debbugs.gnu.org; Fri, 29 Oct 2021 22:04:13 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:49114) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mgdj9-0005MK-Cd for 51433@debbugs.gnu.org; Fri, 29 Oct 2021 22:04:12 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 76CB4160071; Fri, 29 Oct 2021 19:04:05 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id VCOBOEMr9JJK; Fri, 29 Oct 2021 19:04:04 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 8E2341600B8; Fri, 29 Oct 2021 19:04:04 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id TjvbXONwswj3; Fri, 29 Oct 2021 19:04:04 -0700 (PDT) Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 6B0F6160071; Fri, 29 Oct 2021 19:04:04 -0700 (PDT) Content-Type: multipart/mixed; boundary="------------7s5B0LB0vH9QE6PdPbKgxmOz" Message-ID: Date: Fri, 29 Oct 2021 19:04:04 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0 Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE Content-Language: en-US To: =?UTF-8?Q?P=c3=a1draig_Brady?= References: 
<2cba43-61792300-71-753b5f00@59006960> <3e2b31ad-008c-afa6-2b30-416109800905@draigBrady.com> From: Paul Eggert Organization: UCLA Computer Science Department In-Reply-To: X-Spam-Score: -2.4 (--) X-Debbugs-Envelope-To: 51433 Cc: =?UTF-8?Q?Janne_He=c3=9f?= , 51433@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---) This is a multi-part message in MIME format. --------------7s5B0LB0vH9QE6PdPbKgxmOz Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit On 10/28/21 13:59, Pádraig Brady wrote: > I wonder after getting a SEEK_DATA ENXIO, might we correlate > we really are at end of file by checking SEEK_HOLE also returns ENXIO? Wouldn't SEEK_HOLE return the current offset, instead of ENXIO? That is, if the underlying data structure is wrong and claims that the rest of the file is a hole, wouldn't SEEK_HOLE merely repeat that information? > Perhaps zfs is updating st_blocks async, or perhaps there are > runs of zeros in these files that a linker or whatever is sparsifying? Even if st_blocks was out-of-date, that's just a heuristic and the later lseeks should still work. I don't think the files contain actual holes. I don't see an easy workaround for the ZFS bug, unless we want to slow down 'cp' for everybody. This really needs to be fixed on the ZFS side. The attached patch to OpenZFS might work around the bug (I haven't tested it; perhaps someone who uses ZFS could give it a try).
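
To make the two outcomes discussed above concrete, here is a small C sketch that classifies the result of a follow-up SEEK_HOLE after SEEK_DATA has already failed with ENXIO. The names tail_verdict and classify_after_enxio are hypothetical. The sketch assumes Linux-like semantics in which both SEEK_DATA and SEEK_HOLE fail with ENXIO at or beyond end of file, and it takes the lseek results as plain arguments so that no particular file system is needed to try it.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>

enum tail_verdict
{
  TAIL_AT_EOF,         /* SEEK_HOLE also failed with ENXIO */
  TAIL_REPORTED_HOLE,  /* SEEK_HOLE returned POS itself */
  TAIL_UNCLEAR         /* any other result */
};

/* Hypothetical helper.  Use it only after lseek (fd, pos, SEEK_DATA)
   has failed with ENXIO.  HOLE_RET and HOLE_ERRNO are the return value
   and errno of a follow-up lseek (fd, pos, SEEK_HOLE).  */
enum tail_verdict
classify_after_enxio (off_t pos, off_t hole_ret, int hole_errno)
{
  assert (pos >= 0);
  if (hole_ret < 0)
    return hole_errno == ENXIO ? TAIL_AT_EOF : TAIL_UNCLEAR;
  if (hole_ret == pos)
    return TAIL_REPORTED_HOLE;
  return TAIL_UNCLEAR;
}
```

With the offsets from the strace quoted earlier in the thread, classify_after_enxio (131072, 131072, 0) yields TAIL_REPORTED_HOLE. A genuine trailing hole and a file system that wrongly claims one would give the same verdict, so the extra call can at most separate both of them from a true end of file.
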
--------------7s5B0LB0vH9QE6PdPbKgxmOz Content-Type: text/x-patch; charset=UTF-8; name="zfs.patch" Content-Disposition: attachment; filename="zfs.patch" Content-Transfer-Encoding: base64 ZGlmZiAtLWdpdCBhL21vZHVsZS96ZnMvemZzX3Zub3BzLmMgYi9tb2R1bGUvemZzL3pmc192 bm9wcy5jCmluZGV4IDliZDc1YzAxMS4uMWQxYmVmMDc5IDEwMDY0NAotLS0gYS9tb2R1bGUv emZzL3pmc192bm9wcy5jCisrKyBiL21vZHVsZS96ZnMvemZzX3Zub3BzLmMKQEAgLTEwMCw3 ICsxMDAsMTMgQEAgemZzX2hvbGV5X2NvbW1vbih6bm9kZV90ICp6cCwgdWxvbmdfdCBjbWQs IGxvZmZfdCAqb2ZmKQogCWVsc2UKIAkJaG9sZSA9IEJfRkFMU0U7CiAKKwkvKiBXb3JrIGFy b3VuZCBPcGVuWkZTIGJ1ZyAxMTkwMAorCSAgIDxodHRwczovL2dpdGh1Yi5jb20vb3Blbnpm cy96ZnMvaXNzdWVzLzExOTAwPi4gICovCisjIGlmIDAKIAllcnJvciA9IGRtdV9vZmZzZXRf bmV4dChaVE9aU0IoenApLT56X29zLCB6cC0+el9pZCwgaG9sZSwgJm5vZmYpOworIyBlbHNl CisJZXJyb3IgPSBFQlVTWTsKKyMgZmkKIAogCWlmIChlcnJvciA9PSBFU1JDSCkKIAkJcmV0 dXJuIChTRVRfRVJST1IoRU5YSU8pKTsK --------------7s5B0LB0vH9QE6PdPbKgxmOz-- From debbugs-submit-bounces@debbugs.gnu.org Sat Oct 30 06:22:24 2021 Received: (at 51433) by debbugs.gnu.org; 30 Oct 2021 10:22:24 +0000 Received: from localhost ([127.0.0.1]:56685 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mglVH-0001tV-TZ for submit@debbugs.gnu.org; Sat, 30 Oct 2021 06:22:24 -0400 Received: from mail-wm1-f46.google.com ([209.85.128.46]:36794) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mglVF-0001tG-Rs for 51433@debbugs.gnu.org; Sat, 30 Oct 2021 06:22:22 -0400 Received: by mail-wm1-f46.google.com with SMTP id z11-20020a1c7e0b000000b0030db7b70b6bso13619700wmc.1 for <51433@debbugs.gnu.org>; Sat, 30 Oct 2021 03:22:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=sender:subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=rz80ywnG5JJPjhxryU60PB+8kZVyaqbzTT4J2PsbnN0=; b=fsqfGnrksjLpv5E8WwnFcc8CV6+VPcg0zhVgigAvffjPRc2Iqf8nDZ+nPUEHjyRNa1 
7DXELyDTkjQeIVXHw1t3S55yJhke+zUxs2WbvKQ2JRFSCUsDvaeN2y++N8TevWvQ72/5 9DwhPRIjDID8FBrbf5mDu6EU9uukGdsYdoY9FpLJ2f3huod5YjTvUM4cRKui4PPiFwcp gf/UijXq1UvutI/YKkfvGheZRuDdiUsar1+UNbEQAy+qZ1XevJNld67+0apKGoOWsAW5 +3mR9/fLkR5KL1Ht4rzTRo+dTFN5fclbQooheKZg7YMlaN8o3VjMGifNESz3HqHfHgZ7 TjpA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:sender:subject:to:cc:references:from:message-id :date:user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=rz80ywnG5JJPjhxryU60PB+8kZVyaqbzTT4J2PsbnN0=; b=cuvBcDBUfqg5NsIPoHh3ZnP0ETcfOnd/+09/6Gq6VVALE4Bnr4ZyH8dxvt2ZcmN/n+ LbzCWsFoe0cztiYeNBnB0PWm/GMVm9R7LwAyXcO+4wUr7eLQM7n9xGus2EfCXXaRawNB Ly4SiHOWyXlCmfva8LrcLHjsdGBeedt11uqDIWaqlV6mQfFNarx7j0lE4huX/i4yPpj9 K8LkfBC9cq4spcEACAa8xKEoXjSLoq6V2qkLFva4siE620SLANn6GAu3AvDIuxiRoWSp PIy76lpYHmn0SID/3A7u7DjX6uD9I+N8IoQcCyOj9yrxtlACMYlHe8VZnc42K+7TZXSH aIAQ== X-Gm-Message-State: AOAM5324qwVtmciBfHfDk1ySUUDvA/9n+2DrmrXHcq5Sds3XjMc0vq0c JX0El+xbXHpkvuHQLVxlfCruJTuHJeC3ZUm8 X-Google-Smtp-Source: ABdhPJzkDjpbe3vmwUL1it5Tcun2Wv3FDTHyE9dQhp/X1EmPOiYLOCzHQIxMP20CSmOQrGl/ZVa3lA== X-Received: by 2002:a05:600c:2f17:: with SMTP id r23mr192141wmn.93.1635589335552; Sat, 30 Oct 2021 03:22:15 -0700 (PDT) Received: from localhost.localdomain (86-40-129-104-dynamic.agg2.lod.rsl-rtd.eircom.net. 
[86.40.129.104]) by smtp.googlemail.com with UTF8SMTPSA id l2sm11800000wmi.1.2021.10.30.03.22.14 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sat, 30 Oct 2021 03:22:15 -0700 (PDT) Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE To: Paul Eggert References: <2cba43-61792300-71-753b5f00@59006960> <3e2b31ad-008c-afa6-2b30-416109800905@draigBrady.com> <49b12d1d-575e-3bb2-af3e-e8ae0ab740eb@cs.ucla.edu> From: =?UTF-8?Q?P=c3=a1draig_Brady?= Message-ID: <1c8aa5b2-8ee0-b7d6-693a-88f40c416d60@draigBrady.com> Date: Sat, 30 Oct 2021 11:22:13 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0 MIME-Version: 1.0 In-Reply-To: <49b12d1d-575e-3bb2-af3e-e8ae0ab740eb@cs.ucla.edu> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Score: 0.4 (/) X-Debbugs-Envelope-To: 51433 Cc: =?UTF-8?Q?Janne_He=c3=9f?= , 51433@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.6 (/) On 30/10/2021 02:24, Paul Eggert wrote: > On 10/28/21 12:11, Paul Eggert wrote: >> Wait - lseek returns a number less than -1?! We could easily check for >> that FreeBSD bug, perhaps as an independent patch; this shouldn't >> require any extra syscalls. > > I installed the attached patch to do this. This doesn't fix coreutils > bug#51433; it merely makes 'cp' and similar programs more likely to > detect and report the FreeBSD 9.1 bug you mentioned. On further inspection it seems the bug is in truss :/ I.e. the only bug on FreeBSD 9.1 is that the SEEK_DATA takes ages, but the returned values are correct. I.e. SEEK_DATA was returning 1099511595008 in this case which truss was reporting in 32 bit rather than 64 bit. 
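
The arithmetic behind that observation can be checked with a few lines of C. The helper name as_signed_low32 is hypothetical, and how truss formats values internally is an assumption here; the sketch only shows that keeping the low 32 bits of 1099511595008 (which is 32768 less than 2^40) and reading them as a signed number gives the -32768 (0xffff8000) seen in the trace.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the value a tracer would print if it kept only
   the low 32 bits of a 64-bit return value and showed them as signed.  */
int64_t
as_signed_low32 (int64_t value)
{
  uint32_t low = (uint32_t) value;   /* well defined: reduces modulo 2^32 */
  return low >= UINT32_C (0x80000000)
         ? (int64_t) low - INT64_C (4294967296)
         : (int64_t) low;
}

/* Self-check with the numbers from this thread.  */
void
check_truss_example (void)
{
  assert (as_signed_low32 (INT64_C (1099511595008)) == -32768);
}
```
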
I see other inconsistencies in truss output, so it can only be relied on for what syscall was called, rather than what was passed/returned. Given this patch doesn't change any operation, it may be best to revert to avoid the complexity? cheers, Pádraig From debbugs-submit-bounces@debbugs.gnu.org Sat Oct 30 13:08:04 2021 Received: (at 51433) by debbugs.gnu.org; 30 Oct 2021 17:08:04 +0000 Received: from localhost ([127.0.0.1]:58360 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mgrpr-0004RL-O1 for submit@debbugs.gnu.org; Sat, 30 Oct 2021 13:08:03 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:34074) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mgrpp-0004QP-GQ for 51433@debbugs.gnu.org; Sat, 30 Oct 2021 13:08:02 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 04F761600EB; Sat, 30 Oct 2021 10:07:55 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id KWxpwSIa3aG6; Sat, 30 Oct 2021 10:07:54 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id E06981600FD; Sat, 30 Oct 2021 10:07:53 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id WsYoMvfYmZPa; Sat, 30 Oct 2021 10:07:53 -0700 (PDT) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id A9BF01600EB; Sat, 30 Oct 2021 10:07:53 -0700 (PDT) Content-Type: multipart/mixed; boundary="------------dolldGIAXY69b1MVG3CG4G2L" Message-ID: Date: Sat, 30 Oct 2021 10:07:53 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.1.2 Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE 
Content-Language: en-US To: =?UTF-8?Q?P=c3=a1draig_Brady?= References: <2cba43-61792300-71-753b5f00@59006960> <3e2b31ad-008c-afa6-2b30-416109800905@draigBrady.com> <49b12d1d-575e-3bb2-af3e-e8ae0ab740eb@cs.ucla.edu> <1c8aa5b2-8ee0-b7d6-693a-88f40c416d60@draigBrady.com> From: Paul Eggert Organization: UCLA Computer Science Department In-Reply-To: <1c8aa5b2-8ee0-b7d6-693a-88f40c416d60@draigBrady.com> X-Spam-Score: -2.4 (--) X-Debbugs-Envelope-To: 51433 Cc: =?UTF-8?Q?Janne_He=c3=9f?= , 51433@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---) This is a multi-part message in MIME format. --------------dolldGIAXY69b1MVG3CG4G2L Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit On 10/30/21 03:22, Pádraig Brady wrote: > Given this patch doesn't change any operation, > it may be best to revert to avoid the complexity? Yes, and thanks for checking. I installed the attached. The main bug of course remains unfixed, as it's a ZFS bug that needs fixing in ZFS.
--------------dolldGIAXY69b1MVG3CG4G2L Content-Type: text/x-patch; charset=UTF-8; name="0001-cp-revert-unnecessary-FreeBSD-workaround.patch" Content-Disposition: attachment; filename="0001-cp-revert-unnecessary-FreeBSD-workaround.patch" Content-Transfer-Encoding: base64 RnJvbSBiNDE0MjVmYmY3ZGM2ZDI1YTFhNGQyZmUzMjI4NjNmYWI1OTdkNjVhIE1vbiBTZXAg MTcgMDA6MDA6MDAgMjAwMQpGcm9tOiBQYXVsIEVnZ2VydCA8ZWdnZXJ0QGNzLnVjbGEuZWR1 PgpEYXRlOiBTYXQsIDMwIE9jdCAyMDIxIDEwOjAwOjEwIC0wNzAwClN1YmplY3Q6IFtQQVRD SF0gY3A6IHJldmVydCB1bm5lY2Vzc2FyeSBGcmVlQlNEIHdvcmthcm91bmQKTUlNRS1WZXJz aW9uOiAxLjAKQ29udGVudC1UeXBlOiB0ZXh0L3BsYWluOyBjaGFyc2V0PVVURi04CkNvbnRl bnQtVHJhbnNmZXItRW5jb2Rpbmc6IDhiaXQKClRoYXQgd2FzIGEgZmFsc2UgYWxhcm0gZHVl IHRvIGEgYnVnIGluIEZyZWVCU0QgOS4xIHRydXNzOwpzZWUgUMOhZHJhaWcgQnJhZHnigJlz IHJlcG9ydCAoQnVnIzUxNDMzIzI5KS4KKiBzcmMvY29weS5jIChsc2Vla19jb3B5LCBpbmZl cl9zY2FudHlwZSk6IERvbuKAmXQgYm90aGVyIGNoZWNraW5nCndoZXRoZXIgbHNlZWsgcmV0 dXJuZWQgLTEuICBUaGlzIGRvZXNu4oCZdCBlbnRpcmVseSByZXZlcnQgdGhlCnByZXZpb3Vz IGNoYW5nZSwgYXMgaXQga2VlcHMgdGhlIGNvZGUgc2ltcGxpZmljYXRpb24gb2YgdGhlCnBy ZXZpb3VzIGNoYW5nZSB3aGlsZSByZXZlcnRpbmcgdGhlIGNoZWNrIGZvciAtMS4KLS0tCiBz cmMvY29weS5jIHwgNyArKystLS0tCiAxIGZpbGUgY2hhbmdlZCwgMyBpbnNlcnRpb25zKCsp LCA0IGRlbGV0aW9ucygtKQoKZGlmZiAtLWdpdCBhL3NyYy9jb3B5LmMgYi9zcmMvY29weS5j CmluZGV4IDFjYmM5NDgwYy4uYTY1MjNlZDk3IDEwMDY0NAotLS0gYS9zcmMvY29weS5jCisr KyBiL3NyYy9jb3B5LmMKQEAgLTUzMCw3ICs1MzAsNyBAQCBsc2Vla19jb3B5IChpbnQgc3Jj X2ZkLCBpbnQgZGVzdF9mZCwgY2hhciAqYnVmLCBzaXplX3QgYnVmX3NpemUsCiAgICAgICBv ZmZfdCBleHRfZW5kID0gbHNlZWsgKHNyY19mZCwgZXh0X3N0YXJ0LCBTRUVLX0hPTEUpOwog ICAgICAgaWYgKGV4dF9lbmQgPCAwKQogICAgICAgICB7Ci0gICAgICAgICAgaWYgKCEgKGV4 dF9lbmQgPT0gLTEgJiYgZXJybm8gPT0gRU5YSU8pKQorICAgICAgICAgIGlmIChlcnJubyAh PSBFTlhJTykKICAgICAgICAgICAgIGdvdG8gY2Fubm90X2xzZWVrOwogICAgICAgICAgIGV4 dF9lbmQgPSBzcmNfdG90YWxfc2l6ZTsKICAgICAgICAgICBpZiAoZXh0X2VuZCA8PSBleHRf c3RhcnQpCkBAIC02MDcsNyArNjA3LDcgQEAgbHNlZWtfY29weSAoaW50IHNyY19mZCwgaW50 
IGRlc3RfZmQsIGNoYXIgKmJ1Ziwgc2l6ZV90IGJ1Zl9zaXplLAogICAgICAgICB9CiAKICAg ICAgIGV4dF9zdGFydCA9IGxzZWVrIChzcmNfZmQsIGRlc3RfcG9zLCBTRUVLX0RBVEEpOwot ICAgICAgaWYgKGV4dF9zdGFydCA8IDAgJiYgISAoZXh0X3N0YXJ0ID09IC0xICYmIGVycm5v ID09IEVOWElPKSkKKyAgICAgIGlmIChleHRfc3RhcnQgPCAwICYmIGVycm5vICE9IEVOWElP KQogICAgICAgICBnb3RvIGNhbm5vdF9sc2VlazsKICAgICB9CiAKQEAgLTEwOTMsOCArMTA5 Myw3IEBAIGluZmVyX3NjYW50eXBlIChpbnQgZmQsIHN0cnVjdCBzdGF0IGNvbnN0ICpzYiwK IAogI2lmZGVmIFNFRUtfSE9MRQogICBzY2FuX2luZmVyZW5jZS0+ZXh0X3N0YXJ0ID0gbHNl ZWsgKGZkLCAwLCBTRUVLX0RBVEEpOwotICBpZiAoMCA8PSBzY2FuX2luZmVyZW5jZS0+ZXh0 X3N0YXJ0Ci0gICAgICB8fCAoc2Nhbl9pbmZlcmVuY2UtPmV4dF9zdGFydCA9PSAtMSAmJiBl cnJubyA9PSBFTlhJTykpCisgIGlmICgwIDw9IHNjYW5faW5mZXJlbmNlLT5leHRfc3RhcnQg fHwgZXJybm8gPT0gRU5YSU8pCiAgICAgcmV0dXJuIExTRUVLX1NDQU5UWVBFOwogICBlbHNl IGlmIChlcnJubyAhPSBFSU5WQUwgJiYgIWlzX0VOT1RTVVAgKGVycm5vKSkKICAgICByZXR1 cm4gRVJST1JfU0NBTlRZUEU7Ci0tIAoyLjMyLjAKCg== --------------dolldGIAXY69b1MVG3CG4G2L-- From debbugs-submit-bounces@debbugs.gnu.org Sun Oct 31 12:13:57 2021 Received: (at 51433) by debbugs.gnu.org; 31 Oct 2021 16:13:57 +0000 Received: from localhost ([127.0.0.1]:32937 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mhDT2-0002yT-RO for submit@debbugs.gnu.org; Sun, 31 Oct 2021 12:13:57 -0400 Received: from mail-wr1-f46.google.com ([209.85.221.46]:38778) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mhDT0-0002yD-8Y for 51433@debbugs.gnu.org; Sun, 31 Oct 2021 12:13:54 -0400 Received: by mail-wr1-f46.google.com with SMTP id u18so24695335wrg.5 for <51433@debbugs.gnu.org>; Sun, 31 Oct 2021 09:13:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=sender:subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language; bh=DbMcfu4aO5hbDNWoCpRgqHNh4MTpoPJ65HlZIfcgdo4=; b=BCXdamc9wvqCxQgOILaClA2diBkd8ESLQxru84ZTUORco6t9p0C52IOJRu6+BoAx7t 
sq0DAoSbKlQOUiQcqXOsXA5/r23tHvM/Qgc275ozPMNNeiDQo17+fpiBIT7lKtAFrcXm r9xu162OZc3MgeTU1UQVGM9/iS6+H75g6axzO62/Zt5D+ZkEblnzq5AGeE4+nJOKBaQp eZOWlbqxruWMgo7ZGVAX18CHOFRXZiizh7QyrRnsYlIg0cbnJzac+0SzcNdon8vD4SWz wKDRBfKviFaZIe23SikrK6qx9+71yeq2KTNljR6fem13/jUs/+bt02UCXgWIW5NzDyEJ oKGw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:sender:subject:to:cc:references:from:message-id :date:user-agent:mime-version:in-reply-to:content-language; bh=DbMcfu4aO5hbDNWoCpRgqHNh4MTpoPJ65HlZIfcgdo4=; b=RkLtI/Ai1Vl7UCRkeCHQfa7F8zkEN50+0urrAS5QRDZHiZdKHFI74ZvpL7w/ofYdWs k+JBzNzM16Vl5sXUB+yJVjwh20DI54ymEmCziVOVkLjoCIFp2a9r3EQM/5onjrafqCqK YK/Cmv1Tn2JcvvuMQGJsxJXTHw6ynAYOGm9LfEy0YJL6crhdoznPWMclYIDgu85jvlPW /dfXaqv1vEjNowbmQfuZVSei1mif0wkpitOwv1P1U175nSe2gD7MZbAcQcwIRrYzSt9J 3ZsUzV9WYJDOMtHM4iljulZi3KwAA2z366Tiy+7VQ4NYiMJirsiA6DJr0NPJq73MzIpv cgyg== X-Gm-Message-State: AOAM530oF8L/28qsLPEHWmKeBkIzI93ym1dkuvI14VUw3PwWea67JXZv wpqb/jVIzHYE6RoeXhNWIV3Q6+IBDHw= X-Google-Smtp-Source: ABdhPJwtjqBW/myOGdv4QJz2bAG+CdJc3XzJNuEVFsHqc57fj89P1iwQaW+NbWVZ1LscuIDCKaexZQ== X-Received: by 2002:adf:e390:: with SMTP id e16mr31072176wrm.217.1635696827728; Sun, 31 Oct 2021 09:13:47 -0700 (PDT) Received: from localhost.localdomain (86-40-129-104-dynamic.agg2.lod.rsl-rtd.eircom.net. 
[86.40.129.104]) by smtp.googlemail.com with UTF8SMTPSA id q84sm4359744wme.3.2021.10.31.09.13.46 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sun, 31 Oct 2021 09:13:46 -0700 (PDT) Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE To: Paul Eggert References: <2cba43-61792300-71-753b5f00@59006960> <3e2b31ad-008c-afa6-2b30-416109800905@draigBrady.com> From: =?UTF-8?Q?P=c3=a1draig_Brady?= Message-ID: <62818902-0b61-64ee-b4d9-d2e2342b8788@draigBrady.com> Date: Sun, 31 Oct 2021 16:13:45 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/mixed; boundary="------------B4F3544DCAEA6DEA16F2B72C" Content-Language: en-US X-Spam-Score: 0.4 (/) X-Debbugs-Envelope-To: 51433 Cc: =?UTF-8?Q?Janne_He=c3=9f?= , 51433@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.6 (/) This is a multi-part message in MIME format. --------------B4F3544DCAEA6DEA16F2B72C Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit On 28/10/2021 20:11, Paul Eggert wrote: > On 10/28/21 06:54, Pádraig Brady wrote: > >> Further debugging from Nix folks suggest ZFS was in consideration always, >> as invalid artifacts were written to a central cache from ZFS backed hosts. >> So we should at least change the comment in the patch to only mention ZFS. > > Yes, that sounds reasonable. > > This ZFS bug sounds pretty serious, though. Apparently it affects star > and other programs too. I'm not sure we should attempt to work around it > in coreutils, if the workarounds penalize everybody not using ZFS. Yes this is an awkward situation. Though given the frequency of the cp/zfs combo, I think we should try to provide some mitigation. 
> Is it cheap to check whether a file is actually in a ZFS filesystem?
> (Don't know how this'd work with loopback mounts, NFS, etc.) If so, it
> might be better to simply fdatasync (or even fsync) every input file
> that's on ZFS, until we know the ZFS bugs are fixed.

The attached uses statfs()->f_type which is usually available,
to avoid using SEEK_DATA on ZFS.  This should be fairly lightweight I think,
and only used for files that might be sparse.

There may still be issues with NFS backed by ZFS,
but that should be rarer and more centrally managed,
thus more amenable to possible future ZFS fixes, or mitigations.

We could also disable SEEK_DATA for remote file systems,
but that would be a pity since SEEK_DATA seems especially useful there.

cheers,
Pádraig

--------------B4F3544DCAEA6DEA16F2B72C
Content-Type: text/x-patch; charset=UTF-8; name="zfs-seek-data-skip.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="zfs-seek-data-skip.patch"

>From b7b4926b2f52cfa2ab7d33742355beaf9a08c695 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?=
Date: Sun, 31 Oct 2021 15:38:29 +0000
Subject: [PATCH] copy: avoid SEEK_DATA corruption on ZFS

Avoid corruption when using SEEK_DATA on ZFS, as discussed in:
https://github.com/openzfs/zfs/issues/11900

* src/copy.c (functional_seek_data): A new function
that returns true when we're sure we're not copying from ZFS.
Note systems like solaris which doesn't have statfs()->f_type available,
will be assumed to not have a usable SEEK_DATA.  Most systems
in use should have this info available.
(infer_scantype): After a file is determined to perhaps be sparse,
call functional_seek_data to ensure it's not on ZFS.
* init.cfg (seek_data_capable_): Skip on ZFS.
* NEWS: Mention the bug avoidance.
Addresses https://bugs.gnu.org/51433
---
 NEWS       |  4 ++++
 init.cfg   | 12 ++++++++++++
 src/copy.c | 36 +++++++++++++++++++++++++++++++++++-
 3 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 086da03ae..78a73ab05 100644
--- a/NEWS
+++ b/NEWS
@@ -4,6 +4,10 @@ GNU coreutils NEWS -*- outline -*-
 
 ** Bug fixes
 
+  cp will avoid corruption bugs in the ZFS file system by avoiding use
+  of lseek(..., SEEK_DATA), thus using potentially slower hole detection.
+  [bug triggered in coreutils-9.0]
+
   chmod -R no longer exits with error status when encountering symlinks.
   All files would be processed correctly, but the exit status was incorrect.
   [bug introduced in coreutils-9.0]
diff --git a/init.cfg b/init.cfg
index b92f717f5..0f97a6712 100644
--- a/init.cfg
+++ b/init.cfg
@@ -541,6 +541,18 @@ seek_data_capable_()
     return 1
   fi
 
+  # Skip on zfs due to various SEEK_DATA issues in its implementation
+  fstype=$(stat -f -c%t "$@") # Ensure we have f_type, as that's what copy uses
+  if test -z "$fstype" || test "$fstype" = '?'; then
+    warn_ 'seek_data_capable_: stat(1) failed: assuming not SEEK_DATA capable'
+    return 1
+  fi
+  fsname=$(stat -f -c%T "$@")
+  if test "$fsname" = 'zfs'; then
+    warn_ 'seek_data_capable_: zfs detected: SEEK_DATA is disabled'
+    return 1
+  fi
+
   # Use timeout if available to skip cases where SEEK_DATA takes a long time.
   # We saw FreeBSD 9.1 take 35s to return from SEEK_DATA for a 1TiB empty file.
   # Note lseek() is uninterruptible on FreeBSD 9.1, but it does eventually
diff --git a/src/copy.c b/src/copy.c
index a6523ed97..6e19b8740 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1080,6 +1080,33 @@ union scan_inference
   off_t ext_start;
 };
 
+/* SEEK_DATA on ZFS has many issues as described at:
+   https://github.com/openzfs/zfs/issues/11900
+   so return FALSE for ZFS (or inability to detect fs type).  */
+
+#ifdef SEEK_HOLE
+# include "fs.h"
+# if HAVE_SYS_STATFS_H
+#  include <sys/statfs.h>
+# elif HAVE_SYS_VFS_H
+#  include <sys/vfs.h>
+# endif
+static bool
+functional_seek_data (int fd)
+{
+# if HAVE_FSTATFS && HAVE_STRUCT_STATFS_F_TYPE
+  struct statfs buf;
+  if (fstatfs (fd, &buf) == 0)
+    {
+      if (buf.f_type != S_MAGIC_ZFS)
+        return true;
+    }
+# endif
+
+  return false;
+}
+#endif
+
 /* Return how to scan a file with descriptor FD and stat buffer SB.
    Store any information gathered into *SCAN.  */
 static enum scantype
@@ -1092,7 +1119,14 @@ infer_scantype (int fd, struct stat const *sb,
     return PLAIN_SCANTYPE;
 
 #ifdef SEEK_HOLE
-  scan_inference->ext_start = lseek (fd, 0, SEEK_DATA);
+  if (! functional_seek_data (fd))
+    {
+      errno = ENOTSUP;
+      scan_inference->ext_start = -1;
+    }
+  else
+    scan_inference->ext_start = lseek (fd, 0, SEEK_DATA);
+
   if (0 <= scan_inference->ext_start || errno == ENXIO)
     return LSEEK_SCANTYPE;
   else if (errno != EINVAL && !is_ENOTSUP (errno))
-- 
2.26.2

--------------B4F3544DCAEA6DEA16F2B72C--

From debbugs-submit-bounces@debbugs.gnu.org Sun Oct 31 12:23:58 2021 Received: (at 51433) by debbugs.gnu.org; 31 Oct 2021 16:23:58 +0000 Received: from localhost ([127.0.0.1]:32971 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mhDcj-0003GB-S1 for submit@debbugs.gnu.org; Sun, 31 Oct 2021 12:23:58 -0400 Received: from mail-wm1-f44.google.com ([209.85.128.44]:39826) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mhDcf-0003Fe-Rd for 51433@debbugs.gnu.org; Sun, 31 Oct 2021 12:23:54 -0400 Received: by mail-wm1-f44.google.com with SMTP id b2-20020a1c8002000000b0032fb900951eso7392926wmd.4 for <51433@debbugs.gnu.org>; Sun, 31 Oct 2021 09:23:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=sender:subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=CqPrQuv3sC0VCmCOuW8Zvh/qJHammel9q5Mo0I+0GnU=;
b=XAGx/0v9879HO1hvtpIG+mX/Pry8RTCreZNbzwsLyQn8tsRDjCbTn+PZLnMCFCJcY8 nue13vTwi7PmNSpU2FOpeqyiHQC7EigJ7vHxv6y9eeaiRcixuAdEaN2EP3iNk37w2Q6T UBtOowL1BTkqx4VnfKz/0qik1I/Og8HSeAME3wx9qvAq82QHaLE7rMSx1WnbOn9x75uw n73q0mUE+3u19pcY3al76tPZhDdk1q7rp3sdRmbi/ZD8OjWCG+Dx7Qp8IvkcFoZEq3ex ZZDG5VgqggQ3Bw6Rmd1BqsZdNBllYebn1B9doUl4SC6/HPeQQfZAWhA1amyn3AFIo8RS gqlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:sender:subject:to:cc:references:from:message-id :date:user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=CqPrQuv3sC0VCmCOuW8Zvh/qJHammel9q5Mo0I+0GnU=; b=biHuYUWFstWEEQrPVOi+7+U4nDsaJqkTGwm75gQxU+mXJr1v1LHJDPhqV0duoR66Hz Q5uuRJxcdCK7SkvMZsS2EbQOt+vhZjQoulq3YU9OUMLw2Vf3BKH+c+b3lVJCbCMLdOYb Oym1dgm3aI+5m/i/x42J2xNE/MKtQZfF4lrYm2CPYl9gdwvbmURqKhKWaG1QTJAPA3oh khjbDatn0Ctr2XaxbWwgiSWPU1DHYioguOj9ISBItPvVSXiPxEPBEXeauQBxnjSzd8XG ZVQDtnnxqVrRjAWricvH3SS9NXlXPOICWXEhAT0tTsLSkcNIgAr3Hvsk0XPFhahGD/Ce PmTQ== X-Gm-Message-State: AOAM533bcCml8spLOqEfWgDt1ODVD0fYxWZNdMs6BYdPN8K3k2EVkvbt lT863WJ/WmPtooMkhiVZy+FKIZXBkx4= X-Google-Smtp-Source: ABdhPJxdEd+CeS87JJpbQqd3cfe5cx9AF419+Uuoriw8VxsqEv47fPK/q3S6gT9To0NckkJ8zFmSGA== X-Received: by 2002:a7b:ce02:: with SMTP id m2mr9702772wmc.166.1635697427554; Sun, 31 Oct 2021 09:23:47 -0700 (PDT) Received: from localhost.localdomain (86-40-129-104-dynamic.agg2.lod.rsl-rtd.eircom.net. 
[86.40.129.104]) by smtp.googlemail.com with UTF8SMTPSA id o17sm14301881wmq.11.2021.10.31.09.23.46 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sun, 31 Oct 2021 09:23:47 -0700 (PDT) Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE To: Paul Eggert References: <2cba43-61792300-71-753b5f00@59006960> <3e2b31ad-008c-afa6-2b30-416109800905@draigBrady.com> From: =?UTF-8?Q?P=c3=a1draig_Brady?= Message-ID: <3e740a88-56dc-5b81-dea6-296719759044@draigBrady.com> Date: Sun, 31 Oct 2021 16:23:45 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Score: 0.4 (/) X-Debbugs-Envelope-To: 51433 Cc: =?UTF-8?Q?Janne_He=c3=9f?= , 51433@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.6 (/)

On 30/10/2021 03:04, Paul Eggert wrote:
> On 10/28/21 13:59, Pádraig Brady wrote:
>
>> I wonder after getting a SEEK_DATA ENXIO, might we correlate
>> we really are at end of file by checking SEEK_HOLE also returns ENXIO?
>
> Wouldn't SEEK_HOLE return the current offset, instead of ENXIO? That is,
> if the underlying data structure is wrong and claims that the rest of
> the file is a hole, wouldn't SEEK_HOLE merely repeat that information?

What I was thinking is that SEEK_DATA returning ENXIO suggests end of file.
If we're not actually at EOF a SEEK_HOLE at SEEK_CUR+1 might not indicate ENXIO.
In which case we could do an fdatasync().
All wild speculation though depending on the ZFS bug.

If the bug is fixed and we have a good reproducer,
we might be able to do something like this to cater
for existing buggy ZFS installations.
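
A minimal sketch of the fdatasync idea above, assuming Linux lseek semantics. The helper name seek_data_retry_after_sync is hypothetical and does not come from coreutils, and whether a flush actually clears stale hole information on an affected ZFS is unverified. The helper trusts ENXIO from SEEK_DATA only at or past EOF; before EOF it flushes once and asks again:

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc exposes SEEK_DATA only with _GNU_SOURCE.  The fallback value is
   the Linux one and is an assumption on any other system.  */
#ifndef SEEK_DATA
# define SEEK_DATA 3
#endif

/* Hypothetical helper: like lseek (fd, pos, SEEK_DATA), but if the kernel
   answers ENXIO ("no data from here on") while POS is still before EOF,
   flush the file once and ask again.  Moves the file offset like lseek.  */
static off_t
seek_data_retry_after_sync (int fd, off_t pos)
{
  assert (0 <= fd && 0 <= pos);

  off_t data = lseek (fd, pos, SEEK_DATA);
  if (0 <= data || errno != ENXIO)
    return data;

  struct stat st;
  if (fstat (fd, &st) != 0)
    return -1;
  if (st.st_size <= pos)
    {
      /* ENXIO at or past EOF is the normal end-of-file answer.  */
      errno = ENXIO;
      return -1;
    }

  /* ENXIO before EOF: either a genuine trailing hole or stale hole info.
     Flush and retry once; ignore a sync failure and let lseek decide.  */
  (void) fdatasync (fd);
  return lseek (fd, pos, SEEK_DATA);
}
```

A genuinely sparse tail still ends in ENXIO after the retry, so on a healthy file system the extra cost is one fdatasync per file that ends in a hole.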
cheers, Pádraig From debbugs-submit-bounces@debbugs.gnu.org Mon Nov 01 01:52:13 2021 Received: (at 51433) by debbugs.gnu.org; 1 Nov 2021 05:52:13 +0000 Received: from localhost ([127.0.0.1]:33563 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mhQEv-0002mU-Ao for submit@debbugs.gnu.org; Mon, 01 Nov 2021 01:52:13 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:52760) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mhQEt-0002mJ-2L for 51433@debbugs.gnu.org; Mon, 01 Nov 2021 01:52:12 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 89F6016005C; Sun, 31 Oct 2021 22:52:05 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id qXuckK3l55XK; Sun, 31 Oct 2021 22:52:04 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 9778A1600B8; Sun, 31 Oct 2021 22:52:04 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id xhOBdq-yijig; Sun, 31 Oct 2021 22:52:04 -0700 (PDT) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 67EAB16005C; Sun, 31 Oct 2021 22:52:04 -0700 (PDT) Message-ID: <4c36195a-72f5-e591-7ee5-ef9464702b06@cs.ucla.edu> Date: Sun, 31 Oct 2021 22:52:04 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.1.2 Content-Language: en-US To: =?UTF-8?Q?P=c3=a1draig_Brady?= References: <2cba43-61792300-71-753b5f00@59006960> <3e2b31ad-008c-afa6-2b30-416109800905@draigBrady.com> <62818902-0b61-64ee-b4d9-d2e2342b8788@draigBrady.com> From: Paul Eggert Organization: UCLA Computer Science Department Subject: Re: bug#51433: cp 9.0 sometimes fails with 
SEEK_DATA/SEEK_HOLE In-Reply-To: <62818902-0b61-64ee-b4d9-d2e2342b8788@draigBrady.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.4 (--) X-Debbugs-Envelope-To: 51433 Cc: =?UTF-8?Q?Janne_He=c3=9f?= , 51433@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---)

On 10/31/21 09:13, Pádraig Brady wrote:
> The attached uses statfs()->f_type which is usually available,
> to avoid using SEEK_DATA on ZFS.  This should be fairly lightweight I
> think,
> and only used for files that might be sparse.

Couldn't we be even lazier about invoking fstatfs? It needs to be
invoked only if the file appears to be sparse (as you mentioned), and
also only if lseek+SEEK_DATA fails (with errno==ENXIO) at a position
less than the file's length, and also at most once per input file
descriptor.

When copying multiple files (e.g., cp -r) we could also cache the
previous file's st_dev and ZFS status, so in the typical case we could
skip fstatfs entirely except for the first sparsish file that satisfies
the abovementioned criteria.

Also, as I understand it the problem occurs with OpenZFS but not on
Solaris or its descendants, so we don't need to do any of this if __sun
is defined.

Also, if we're going to use fstatfs it'd probably be better to hoist all
the fstatfs portability mess out of stat.c and into a new file (say,
gl/lib/pstatfs.c) that gives us a portable way of getting statfs-like
data out of filesystems, and using that new file in both stat.c and copy.c.
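
The st_dev caching suggested above could look roughly like this. It is a sketch only, assuming Linux (fstatfs from <sys/vfs.h> and a usable f_type). The names on_zfs_cached, ZFS_MAGIC_SKETCH and fstatfs_calls are hypothetical and exist only for illustration; the magic value is the one coreutils names S_MAGIC_ZFS:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/vfs.h>    /* Linux fstatfs and struct statfs.f_type */
#include <unistd.h>

/* ZFS superblock magic as reported in statfs.f_type on Linux.  */
#define ZFS_MAGIC_SKETCH 0x2FC12FC1

/* Counts real fstatfs calls, only so the caching can be observed.  */
static int fstatfs_calls;

/* Hypothetical helper: return true if the file behind FD, whose stat data
   is *SB, appears to live on ZFS.  The answer for the most recently seen
   st_dev is cached, so a run of files on one device costs one fstatfs.  */
static bool
on_zfs_cached (int fd, struct stat const *sb)
{
  static bool have_cache;
  static dev_t cached_dev;
  static bool cached_is_zfs;

  assert (sb);
  if (have_cache && cached_dev == sb->st_dev)
    return cached_is_zfs;

  struct statfs fs;
  fstatfs_calls++;
  if (fstatfs (fd, &fs) != 0)
    return true;   /* Unknown file system: be conservative, do not cache.  */

  have_cache = true;
  cached_dev = sb->st_dev;
  cached_is_zfs = (fs.f_type == ZFS_MAGIC_SKETCH);
  return cached_is_zfs;
}
```

Treating a failed fstatfs as "possibly ZFS" mirrors the patch's choice to return false from functional_seek_data when the file system type cannot be detected.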
One more idea: I've seen reports that the problem goes away if you use
'read' to read just the first byte of the false hole; that might be a
simpler workaround than getting into the fstatfs portability mess.

> We could also disable SEEK_DATA for remote file systems,
> but that would be a pity since SEEK_DATA seems especially useful there.

Absolutely.

Don't know if you're following
but rincebrain reports a
working workaround on the OpenZFS side tho more testing needs to be
done. This OpenZFS bug affects GNU tar and star (and surely other
programs) as well as coreutils, and we may be better off overall if we
merely ask people to fix their ZFS implementation rather than our trying
to work around the bug in coreutils.

From debbugs-submit-bounces@debbugs.gnu.org Mon Nov 01 09:27:25 2021 Received: (at 51433) by debbugs.gnu.org; 1 Nov 2021 13:27:25 +0000 Received: from localhost ([127.0.0.1]:34013 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mhXLR-0008Hl-0R for submit@debbugs.gnu.org; Mon, 01 Nov 2021 09:27:25 -0400 Received: from mail-wr1-f43.google.com ([209.85.221.43]:36818) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mhXLP-0008HW-JE for 51433@debbugs.gnu.org; Mon, 01 Nov 2021 09:27:24 -0400 Received: by mail-wr1-f43.google.com with SMTP id s13so21014190wrb.3 for <51433@debbugs.gnu.org>; Mon, 01 Nov 2021 06:27:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=sender:subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=ahXqXxdf6r4UtRqkehOTxYsnmQt1XWQip9axNMoA2KU=; b=Ja5A+I71FuWVdwI91DasMPu3hZoDxhXTzTc2H7/cPMbtCE3iVpGBn91flHEQ3Zf0TQ uUoKXV5HA5LRtWn4n613x3dPBlfQhX5hIMvpIrnA7Q0e4vBbK1PDXLV7z+cdONS5bkoH 7Ghv/IdV5lgHB0uIC1rVUR6b1nXwjen2w3IM5UUh25RUeUd8R2yGA9kYuQq0P6Ieu3O/ R+eXPo3SKeYgn8go8bc8vwEQM8b2rM/Sdmq4s0Oi7PaPl3cqauktKggzKCBwSn5fop2d
Vev2oFkavdmnBzSTn17zcOx4U9BZJWC78ZwD/FhLhuYzFQSm+NQGDdX7rwdBjeININjF dfmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:sender:subject:to:cc:references:from:message-id :date:user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=ahXqXxdf6r4UtRqkehOTxYsnmQt1XWQip9axNMoA2KU=; b=yP/LWUQ+vfCyYogbTrPSdArOAM9U7Ld7xS3hNuuzqPAuYppCwufvxgOqMLA8wl8daE 7C2C4xnzPm5LAxYEn8dDnW9g1QHsP+CZgw9GHs75nJDMR0rDCZoBYXZ6xG0ch4wtQNQY 6ByuI/iIxlD5o/J+NmUjmqw7NWO5qL3bBQ6m/HIqzcApqZTl4vwEhgfYSGA0HyrgNl7F JUHmMVEL3u1hcAyTf2G+Rg+KGIT2s0xdRgWY77Q2DpULCqeYpm52X8L/WP5WBhRDZhha G+iwVSm6+mI8T8DKdJxhttK7DvxkulpCELDh1bKTfVmSSTV9EB2mba239lB2hLJj4fVM ihHw== X-Gm-Message-State: AOAM531+NPNWJArYacLVACJJcdGSkw2BITY2T6gUDH1kk3MVUoLIg7PL T7V5UhbRzyHvdgI76bQYsmaF4SYf1WiDwmAp X-Google-Smtp-Source: ABdhPJzRqifa69AWZwrRZiT6pr8gCUsaNrXuSAIBatvI42cU8mwAAwM6ALO7br2sX/lPlZ6NnSdh5A== X-Received: by 2002:adf:a413:: with SMTP id d19mr38458647wra.246.1635773237392; Mon, 01 Nov 2021 06:27:17 -0700 (PDT) Received: from localhost.localdomain (86-40-129-104-dynamic.agg2.lod.rsl-rtd.eircom.net. 
[86.40.129.104]) by smtp.googlemail.com with UTF8SMTPSA id t12sm12825865wmq.44.2021.11.01.06.27.15 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 01 Nov 2021 06:27:16 -0700 (PDT) Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE To: Paul Eggert References: <2cba43-61792300-71-753b5f00@59006960> <3e2b31ad-008c-afa6-2b30-416109800905@draigBrady.com> <62818902-0b61-64ee-b4d9-d2e2342b8788@draigBrady.com> <4c36195a-72f5-e591-7ee5-ef9464702b06@cs.ucla.edu> From: =?UTF-8?Q?P=c3=a1draig_Brady?= Message-ID: Date: Mon, 1 Nov 2021 13:27:15 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0 MIME-Version: 1.0 In-Reply-To: <4c36195a-72f5-e591-7ee5-ef9464702b06@cs.ucla.edu> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Score: 0.4 (/) X-Debbugs-Envelope-To: 51433 Cc: =?UTF-8?Q?Janne_He=c3=9f?= , 51433@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.6 (/) On 01/11/2021 05:52, Paul Eggert wrote: > On 10/31/21 09:13, Pádraig Brady wrote: > >> The attached uses statfs()->f_type which is usually available, >> to avoid using SEEK_DATA on ZFS.  This should be fairly lightweight I >> think, >> and only used for files that might be sparse. > > Couldn't we be even lazier about invoking fstatfs? It needs to be > invoked only if the file appears to be sparse (as you mentioned), and > also only if lseek+SEEK_DATA fails (with errno==ENXIO) at a position > less than the file's length, and also at most once per input file > descriptor. 
> > When copying multiple files (e.g., cp -r) we could also cache the > previous file's st_dev and ZFS status, so in the typical case we could > skip fstatfs entirely except for the first sparsish file that satisfies > the abovementioned criteria. I was thinking that sparse may be enough of a filter, but these are valid suggestions. > Also, as I understand it the problem occurs with OpenZFS but not on > Solaris or its descendants, so we don't need to do any of this if __sun > is defined. Oh interesting. If that was actually the case we could probably default to allowing SEEK_DATA, and only disallow when f_type == ZFS, i.e. allow SEEK_DATA on solaris. > Also, if we're going to use fstatfs it'd probably be better to hoist all > the fstatfs portability mess out of stat.c and into a new file (say, > gl/lib/pstatfs.c) that gives us a portable way of getting statfs-like > data out of filesystems, and using that new file in both stat.c and copy.c. .. and tail.c. Good suggestion. > One more idea: I've seen reports that the problem goes away if you use > 'read' to read just the first byte of the false hole; that might be a > simpler workaround than getting into the fstatfs portability mess. Indeed. It would be really nice to get a solution supporting existing, and future openzfs installations. >> We could also disable SEEK_DATA for remote file systems, >> but that would be a pity since SEEK_DATA seems especially useful there. > > Absolutely. > > Don't know if you're following > but rincebrain reports a > working workaround on the OpenZFS side tho more testing needs to be > done. This OpenZFS bugs affects GNU tar and star (and surely other > programs) as well as coreutils, and we may be better off overall if we > merely ask people to fix their ZFS implementation rather than our trying > to work around the bug in coreutils. I see a slew of recent messages re workarounds in openzfs. It's good to see some movement there, as it's definitely best to address this in zfs. 
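
For the one-byte read idea quoted above, a rough sketch under the assumption that those reports are accurate, which has not been confirmed here. next_data_after_probe is a hypothetical name, not coreutils code; it reads a single byte at the start of any claimed hole and then repeats the SEEK_DATA query:

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc exposes SEEK_DATA only with _GNU_SOURCE.  The fallback value is
   the Linux one and is an assumption on any other system.  */
#ifndef SEEK_DATA
# define SEEK_DATA 3
#endif

/* Hypothetical helper: return the offset of the next data at or after POS,
   or -1 with errno set.  Before trusting a reported hole, read one byte at
   its start and query again.  Moves the file offset like lseek.  */
static off_t
next_data_after_probe (int fd, off_t pos)
{
  assert (0 <= fd && 0 <= pos);

  off_t data = lseek (fd, pos, SEEK_DATA);
  if (data == pos)
    return data;          /* POS is already in data: no hole claimed.  */
  if (data < 0 && errno != ENXIO)
    return -1;            /* SEEK_DATA unsupported, or a real error.  */

  /* A hole is claimed to start at POS.  Touch its first byte, then ask
     again; pread returns 0 at EOF, which is fine here.  */
  char byte;
  if (pread (fd, &byte, 1, pos) < 0)
    return -1;
  return lseek (fd, pos, SEEK_DATA);
}
```

On a correct file system the second query returns the same answer as the first, so the only cost is one extra pread and lseek per hole.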
cheers, Pádraig From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 03 11:38:09 2021 Received: (at 51433) by debbugs.gnu.org; 3 Nov 2021 15:38:09 +0000 Received: from localhost ([127.0.0.1]:40962 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1miIL3-0004j7-BY for submit@debbugs.gnu.org; Wed, 03 Nov 2021 11:38:09 -0400 Received: from mail-wr1-f44.google.com ([209.85.221.44]:44853) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1miIL1-0004if-QF for 51433@debbugs.gnu.org; Wed, 03 Nov 2021 11:38:08 -0400 Received: by mail-wr1-f44.google.com with SMTP id d13so4184304wrf.11 for <51433@debbugs.gnu.org>; Wed, 03 Nov 2021 08:38:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=sender:subject:to:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=SCqXwI0Q4c2hgOxddpb7R3mkf7a6jN3fE19i689NXKU=; b=fYcRzpqr5pUhfCSiBW8/P21gsZKO9xPEkeEbETAh7rq+u3KBOYU7fZCN0P1DLvMEZI zxQsDHLw36ndM03nHTTESEN9UWKjmfcFy25lSut4zmUf34m27cKVDHJuVP90eGDU5Bu5 CGWnkxt0n7RUxYhLlCHLjLBqmUD4G2CK1Wz0Zy3mfgl/tgcNAtzTOuX51Okj3LF0sF63 HvaXES4o+nIxpYOAXFgV8Muq4oJ6D3jMYU+OpiEI1KQne8TQSinJ2RsL/4uO1Vs1Fhcp AMGMRoO6q7N+MXkqIw4epbt2siHLYk7wRiKw4sW5gyXV9LgJcOATRAlhOGHh/C6wtoLi bj9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:sender:subject:to:references:from:message-id :date:user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=SCqXwI0Q4c2hgOxddpb7R3mkf7a6jN3fE19i689NXKU=; b=kZgBEqwK4/mkE+IspXmIk8EpZrsil995jy5aFzo1/UbKTbzUnXCVGTRSKxnybZ8jl2 dAYggJ/5Jx8b4hEKGrDwwL7XghCKJ1EHP0vP+UVMJlFjEciUl9e+bJ1J/XqeT86YEX1l ItRIQ0afga2oXtG1GINwJ2kvT2vHsolwbCqpfJyUDMfp5knxdBh5LqwZkSnKae7CDy8I DIaAaNi01PIxpWD/d6LO92AA6+OwPMmzSFbA3q7+u2CmVUKkTGNCmj6rxloDw170uo48 JdgWA18fctPArnVpff4mIPW+Ys6s/jZPQe9EInMwbjckHMt8FtDdDEcYJjX9JWmV/1nE 8XQw== X-Gm-Message-State: 
AOAM531lmFOL5FJr+h6prSxsqU/aNzPwv+DfL/b7mmyh84CYYjKAJCMD BS1Rb61j3/G2M95St4avbAD1WZV8WwYefv1M X-Google-Smtp-Source: ABdhPJzy+T1AePRbS3DxZpQBburRJ+zu4JAVpZPxSPsajW/gyy9PbOaJzc4s1wGivcNs8/GC05h8lg== X-Received: by 2002:a05:6000:1a8b:: with SMTP id f11mr26297521wry.409.1635953881354; Wed, 03 Nov 2021 08:38:01 -0700 (PDT) Received: from localhost.localdomain (86-40-129-104-dynamic.agg2.lod.rsl-rtd.eircom.net. [86.40.129.104]) by smtp.googlemail.com with UTF8SMTPSA id b9sm2290461wrx.24.2021.11.03.08.37.59 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 03 Nov 2021 08:38:00 -0700 (PDT) Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE To: =?UTF-8?Q?Janne_He=c3=9f?= , 51433@debbugs.gnu.org References: <2cba43-61792300-71-753b5f00@59006960> From: =?UTF-8?Q?P=c3=a1draig_Brady?= Message-ID: <4e5c0dea-5843-1f2c-3a5b-742ea1f145ae@draigBrady.com> Date: Wed, 3 Nov 2021 15:37:58 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:84.0) Gecko/20100101 Thunderbird/84.0 MIME-Version: 1.0 In-Reply-To: <2cba43-61792300-71-753b5f00@59006960> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit X-Spam-Score: 0.4 (/) X-Debbugs-Envelope-To: 51433 X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.6 (/) On 27/10/2021 11:00, Janne Heß wrote: > Hi everyone, > > I packaged coreutils 9.0 for NixOS and we found breakages that seemed to be very random during builds of packages > that use the updated coreutils in their build process. It's really hard to tell the main cause but it seems like the issues > are caused by binaries that are corrupted after cp copied them from /tmp to /nix. The issue arises both when the > directories are on the same filesystem and when /tmp is on tmpfs. 
> Upon further inspection/bisection we figured out these issues are caused by a6eaee501f6ec0c152abe88640203a64c390993e. > This seems to happen on ZFS and indeed on the main coreutils mailing list there is a ZFS issue linked [1]. > The testsuite was patched in 61c81ffaacb0194dec31297bc1aa51be72315858 so it doesn't detect this issue anymore, > but the issue still very much happens in the real world. > > We have found this to happen while building the completions for a go tool (jx) which seems to be the same > issue as [2]. The tool is built, copied using cp, and called which causes a segfault to happen. > > Building another package (peertube) on x86_64-linux on ext4 also fails with strange errors in the > test suite, something about "Error: The service is no longer running". This does not happen when the mentioned > coreutils commit is undone by replacing #ifdef with #if 0 [3]. > > We have also seen this issue on Darwin when building Alacritty but only happening on some machines > but we were not able to pin it down any further there so this might be related or it might not. > > Since the issue is so random, we started wondering if it might be related to -frandom-seed which changes in NixOS > when rebuilding a package [4]. A thing to note here is that Nix does a lot of sandboxing stuff during builds which > includes mount namespaces so a Kernel bug is not out of the question. All of these issues happened during Nix builds, > coreutils 9.0 never made it out of the NixOS staging environment due to the builds breaking. We will probably disable > the new code paths as outlined above so the issue is contained for NixOS users and does not hit any production environments. 
> > [1]: https://github.com/openzfs/zfs/issues/11900 > [2]: https://github.com/golang/go/issues/48636 > [3]: https://raw.githubusercontent.com/NixOS/nixpkgs/bf0531b4f8a2de4ff2700797fb211a90c951786e/pkgs/tools/misc/coreutils/disable-seek-hole.patch > [4]: https://github.com/NixOS/nixpkgs/pull/141684#issuecomment-952339263 Looks like there is a WIP fix for OpenZFS mentioned at [1], where mmap'd regions were not being flushed: https://github.com/openzfs/zfs/commit/f2eebe07 So this should unblock enabling coreutils 9 at some stage at least. I've asked at [1] now they know what's going on, how programs might best distinguish buggy instances of openzfs. cheers, Pádraig From debbugs-submit-bounces@debbugs.gnu.org Wed Nov 03 13:57:40 2021 Received: (at 51433) by debbugs.gnu.org; 3 Nov 2021 17:57:41 +0000 Received: from localhost ([127.0.0.1]:41099 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1miKW4-0000Mt-PZ for submit@debbugs.gnu.org; Wed, 03 Nov 2021 13:57:40 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:40818) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1miKW0-0000MU-Uq for 51433@debbugs.gnu.org; Wed, 03 Nov 2021 13:57:39 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 11DA11600AD; Wed, 3 Nov 2021 10:57:30 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id wV8QvKOsZjVH; Wed, 3 Nov 2021 10:57:29 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 67C511600C3; Wed, 3 Nov 2021 10:57:29 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id Kv32V79otZHJ; Wed, 3 Nov 2021 10:57:29 -0700 (PDT) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com 
[172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 3F0681600AD; Wed, 3 Nov 2021 10:57:29 -0700 (PDT) Message-ID: <9ee6f3f2-2727-7b93-c6c3-6a91c72e85a2@cs.ucla.edu> Date: Wed, 3 Nov 2021 10:57:28 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.1.2 Content-Language: en-US To: =?UTF-8?Q?P=c3=a1draig_Brady?= References: <2cba43-61792300-71-753b5f00@59006960> <4e5c0dea-5843-1f2c-3a5b-742ea1f145ae@draigBrady.com> From: Paul Eggert Organization: UCLA Computer Science Department Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE In-Reply-To: <4e5c0dea-5843-1f2c-3a5b-742ea1f145ae@draigBrady.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable X-Spam-Score: -2.4 (--) X-Debbugs-Envelope-To: 51433 Cc: =?UTF-8?Q?Janne_He=c3=9f?= , 51433@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.4 (---)

On 11/3/21 08:37, Pádraig Brady wrote:
> So this should unblock enabling coreutils 9 at some stage at least.

That's good news.

> I've asked at [1] now they know what's going on,
> how programs might best distinguish buggy instances of openzfs.

Maybe "configure" could do a "modinfo zfs" or a "cat
/sys/modinfo/zfs/version"? Admittedly that's *quite* a hack and of
course wouldn't be that reliable.

PS. Is there some way to arrange for this OpenZFS patch to be backported
to Fedora relatively quickly? (I ask as a Fedora user.
:-) From debbugs-submit-bounces@debbugs.gnu.org Mon Nov 08 02:04:38 2021 Received: (at 51433) by debbugs.gnu.org; 8 Nov 2021 07:04:38 +0000 Received: from localhost ([127.0.0.1]:55878 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mjyhp-0000UR-RF for submit@debbugs.gnu.org; Mon, 08 Nov 2021 02:04:37 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:55954) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mjyho-0000U7-4X for 51433@debbugs.gnu.org; Mon, 08 Nov 2021 02:04:36 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 0384E1600C4; Sun, 7 Nov 2021 23:04:29 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id ZikoysRm4YKP; Sun, 7 Nov 2021 23:04:28 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 0A5771600FD; Sun, 7 Nov 2021 23:04:28 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id wncQ9e8l1nqU; Sun, 7 Nov 2021 23:04:27 -0800 (PST) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id D5BE91600C4; Sun, 7 Nov 2021 23:04:27 -0800 (PST) Message-ID: <4f329cd7-1a5b-cef1-28d4-dbe60f8613d1@cs.ucla.edu> Date: Sun, 7 Nov 2021 23:04:27 -0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.1 Content-Language: en-US To: 51433@debbugs.gnu.org From: Paul Eggert Organization: UCLA Computer Science Department Subject: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 51433 Cc: =?UTF-8?Q?Janne_He=c3=9f?= 
X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) It looks like the patch to OpenZFS might not suffice; see: https://github.com/openzfs/zfs/issues/11900#issuecomment-962812974 It looks like the OpenZFS folks are planning to look at it more this week. There is an obvious workaround (though it hurts performance).... From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 27 21:44:03 2022 Received: (at 51433-done) by debbugs.gnu.org; 28 Jan 2022 02:44:03 +0000 Received: from localhost ([127.0.0.1]:57073 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1nDHF4-0000nc-T8 for submit@debbugs.gnu.org; Thu, 27 Jan 2022 21:44:03 -0500 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:36366) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1nDHF4-0000n3-1A for 51433-done@debbugs.gnu.org; Thu, 27 Jan 2022 21:44:02 -0500 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 7EBC7160126; Thu, 27 Jan 2022 18:43:56 -0800 (PST) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id fJIaDcCzMMyu; Thu, 27 Jan 2022 18:43:55 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id A6F00160133; Thu, 27 Jan 2022 18:43:55 -0800 (PST) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id qc510N9z5qEk; Thu, 27 Jan 2022 18:43:55 -0800 (PST) Received: from [192.168.1.9] (cpe-172-91-119-151.socal.res.rr.com [172.91.119.151]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 822DE160126; Thu, 27 Jan 2022 18:43:55 -0800 (PST) Message-ID: 
<392ff9be-2c63-eddd-71eb-de130021bcee@cs.ucla.edu> Date: Thu, 27 Jan 2022 18:43:55 -0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0 Subject: Re: bug#51433: cp 9.0 sometimes fails with SEEK_DATA/SEEK_HOLE Content-Language: en-US From: Paul Eggert To: 51433-done@debbugs.gnu.org References: <4f329cd7-1a5b-cef1-28d4-dbe60f8613d1@cs.ucla.edu> Organization: UCLA Computer Science Department In-Reply-To: <4f329cd7-1a5b-cef1-28d4-dbe60f8613d1@cs.ucla.edu> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: -3.4 (---) X-Debbugs-Envelope-To: 51433-done Cc: =?UTF-8?Q?Janne_He=c3=9f?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.4 (----) On 11/7/21 23:04, Paul Eggert wrote: > https://github.com/openzfs/zfs/issues/11900#issuecomment-962812974 Apparently the OpenZFS bug has been fixed, as behlendorf closed it 20 days ago. Since there doesn't seem to be a good way for coreutils to work around the bug, and the bug potentially affects all apps that use SEEK_DATA, I'm taking the liberty of closing the coreutils bug report. Thanks for reporting it. From unknown Mon Aug 18 02:05:16 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Fri, 25 Feb 2022 12:24:06 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator