GNU bug report logs - #79267
cp --sparse=auto heuristic fails on a squashfs mounted drive.
Jeremy Allison wrote:
> It turns out that: lseek(3, 0, SEEK_HOLE) returns end-of-file for a
> sparse file copied from a Linux squashfs mounted drive. This breaks
> the --sparse=auto heuristic that detects a sparse file.
The reason is that Squashfs supports sparse files but has never
implemented SEEK_HOLE/SEEK_DATA, forcing applications to do their own
hole discovery. This was done for the following reason.
Squashfs supports sparse holes at the granularity of the block, but
the block size in Squashfs is by default 128 Kbytes (and can be up to
1 Mbyte). In contrast, most Linux filesystems use 4K block sizes.
This means any Squashfs SEEK_HOLE/SEEK_DATA implementation will not
behave like other Linux filesystems, because it won't report sparseness
at the 4K granularity that most people or programs will expect it to.
As a result, a program may miss holes that exist in the file.
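To illustrate both the granularity point and what "doing their own hole discovery" can look like, here is a minimal sketch in Python. The function name find_zero_runs and its block_size parameter are hypothetical, chosen only for this example. It scans for all-zero blocks, which is an assumption about how an application might look for hole candidates, not a description of what cp or any other tool actually does. Note that a run of zeros is only a candidate: the bytes may be allocated on disk.

```python
def find_zero_runs(path, block_size=4096):
    """Return (offset, length) pairs for runs of all-zero blocks in path.

    The file is examined in block_size chunks; a chunk counts as a hole
    candidate only if every byte in it is zero.
    """
    runs = []
    run_start = None
    offset = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            if not any(block):            # every byte in this chunk is zero
                if run_start is None:
                    run_start = offset
            elif run_start is not None:   # a zero run has just ended
                runs.append((run_start, offset - run_start))
                run_start = None
            offset += len(block)
    if run_start is not None:             # the file ended inside a zero run
        runs.append((run_start, offset - run_start))
    return runs
```

For a file laid out as 4 KiB of data, 8 KiB of zeros and 4 KiB of data, a scan with block_size=4096 finds the zero run, while a scan with block_size=131072 (the Squashfs default block size) finds nothing, because no whole 128 KiB block is entirely zero.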
I have always considered it better not to support something than to
support it in a way that people won't expect, following the
principle of least surprise.
> lseek(3, 0, SEEK_DATA) = 0
> fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
> lseek(3, 0, SEEK_HOLE) = 417792
This is the behaviour of the default llseek() implementation in the
Linux kernel VFS for an lseek() with SEEK_HOLE: it seeks to the
virtual hole at the end of the file.
See
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/read_write.c#n102
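A minimal Python sketch of how a program might recognise this result follows. The names hole_info_is_trivial and probe are hypothetical. The check is a heuristic under stated assumptions: it only tests whether SEEK_DATA at offset 0 returns 0 and SEEK_HOLE at offset 0 returns the file size, which is the pattern in the strace output above.

```python
import errno
import os

def hole_info_is_trivial(size, data_off, hole_off):
    """True if the seek results show only the implicit hole at end of file."""
    return size > 0 and data_off == 0 and hole_off == size

def probe(path):
    """Run the two seeks from offset 0 on path and classify the result."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        try:
            data_off = os.lseek(fd, 0, os.SEEK_DATA)
            hole_off = os.lseek(fd, 0, os.SEEK_HOLE)
        except OSError as e:
            if e.errno == errno.ENXIO:
                # No data at or after offset 0 (for example an empty file
                # or one that is a single hole): not the trivial pattern.
                return False
            raise
        return hole_info_is_trivial(size, data_off, hole_off)
    finally:
        os.close(fd)
```

With the values from the strace output, hole_info_is_trivial(417792, 0, 417792) is true. A genuinely non-sparse file gives the same answer on any filesystem, so this result alone cannot tell "no holes" apart from "holes not reported". A caller would need some other signal, such as comparing st_blocks against st_size, before falling back to its own scan.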
I am not subscribed to this mailing list, so please CC me on replies.
Thanks
Phillip
---
Squashfs author and maintainer
This bug report was last modified 17 days ago.