GNU bug report logs -
#61386
[PATCH] cp,mv,install: Disable sparse copy on macOS
Previous Next
Reported by: George Valkov <gvalkov <at> gmail.com>
Date: Thu, 9 Feb 2023 09:36:03 UTC
Severity: normal
Tags: patch
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
Full log
Message #11 received at 61386 <at> debbugs.gnu.org (full text, mbox):
On 09/02/2023 15:57, George Valkov wrote:
>
>
>> On 2023-02-09, at 1:56 PM, Pádraig Brady <P <at> draigBrady.com> wrote:
>>
>> On 09/02/2023 09:20, George Valkov wrote:
>>> Due to a bug in macOS, sparse copies are corrupted on virtual disks
>>> formatted with APFS. HFS is not affected. Affected are coreutils
>>> install, and gcp when compiled with SEEK_HOLE, as well as macOS Finder.
>>> While reading the entire file returns valid data, scanning for
>>> allocated segments may return holes where valid data is present.
>>> In this case a sparse copy does not contain these segments and returns
>>> zeroes instead. Once the virtual disk is dismounted and then
>>> mounted again, a sparse copy produces correct results.
>>> This breaks OpenWRT build on macOS. Details:
>>> https://github.com/openwrt/openwrt/pull/11960
>>> https://github.com/openwrt/openwrt/pull/11960#issuecomment-1423185579
>>> Signed-off-by: Georgi Valkov <gvalkov <at> gmail.com>
>>> ---
>>> src/copy.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>> diff --git a/src/copy.c b/src/copy.c
>>> index e16fedb28..4f56138a6 100644
>>> --- a/src/copy.c
>>> +++ b/src/copy.c
>>> @@ -1065,7 +1065,7 @@ infer_scantype (int fd, struct stat const *sb,
>>> return PLAIN_SCANTYPE;
>>> }
>>> -#ifdef SEEK_HOLE
>>> +#if defined(SEEK_HOLE) && !defined(__APPLE__)
>>> scan_inference->ext_start = lseek (fd, 0, SEEK_DATA);
>>> if (0 <= scan_inference->ext_start || errno == ENXIO)
>>> return LSEEK_SCANTYPE;
>>
>>
>> Thanks for the detailed report.
>> The patch might very well be appropriate.
>
> Hi! Let’s test the ideas you have first, and fall-back to the patch.
>
> In October 2021 macOS Finder was also affected, and that points directly at Apple.
> I tested again today, they have fixed Finder. After performing a copy in Finder,
> coreutils cp produces a good copy. I have to run
> make toolchain/gcc/initial/{clean,compile} -j 16
> before I can reproduce it again with the same file.
>
> So while Apple didn't fix the underlaying issue with APFS, they did
> provide a solution for Finder. And we can make coreutils work correctly too.
That suggests that Finder may have sync'd the file.
Now syncing has overheads of course, so not an option to take lightly.
We may be safer just doing the normal copy on __APPLE__ as per your orig patch.
A dtruss of Finder might be instructive BTW.
Why I suspect syncing is that old fiemap code we used to have in copy
needed the FIEMAP_FLAG_SYNC flag to avoid bugs like this on some file systems.
>> We avoided copy_file_range() for a long time on Linux for example
>> until it became usable.
>>
>> Thanks for confirming the latest git is also impacted.
>
> Yes, I tested on master today.
>
>> I see you're on macOS 12.
>> macOS 13 supports fclonefileat() which may
>> avoid the issue, or also be impacted?
>
> Yes, I use macOS 12. There are too many complains about 13.
> I can boot macOS 13 recovery, so that might be an option,
> if I can setup a working build environment.
> Can you invite someone with macOS 13 to test the new API?
> Github has hosted runners. But I’ve never used one. And I think they use macOS 12.
OK let's assume fclonefileat() on macOS 13 is fine.
That's just a thing to consider if one is testing on macOS 13,
rather than jumping through hoops to test it.
>> I wonder might an fdatasync(fd) avoid your issue,
>> which we might do if defined(__APPLE__)?
>
> Send me patches, and I will test them.
>
> #ifdef SEEK_HOLE
> fdatasync(fd); // this did not work, is fd writable? Do we need to call it on the disk image?
> scan_inference->ext_start = lseek (fd, 0, SEEK_DATA);
> if (0 <= scan_inference->ext_start || errno == ENXIO)
> return LSEEK_SCANTYPE;
Oh right interesting. When you say doesn't work, do you mean the sync
fails, or just doesn't help. Note the separate coreutils sync util
only opens with write mode on AIX and Cygwin.
Relatedly you could just try that external `sync file && cp file dest` also to test the sync hypothesis.
If the sync doesn't work, or requires write mode
then that might be enough to decide to just avoid SEEK_DATA on __APPLE__ for now.
> It would be best if we test on an Intel Mac with T2 chip. I use MBP 2019 16”.
> That security chip and the encrypted physical disk with APFS might also affect our results.
>
> Georgi Valkov
> httpstorm.com
> nano RTOS
>
> off-topic: can you please point me at API to take a disk offline or lock it for exclusive access?
> I wrote a disk cloning tool and I have that functionality on Windows, I’d like to add it to Linux, BSD and macOS.
Note it's best to keep this on list.
cheers,
Pádraig
This bug report was last modified 2 years and 123 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.