GNU bug report logs - #5918: [dd] conv=sparse option
Reported by: Heinrich Langos <henrik-gnu <at> prak.org>
Date: Sat, 10 Apr 2010 00:33:02 UTC
Severity: normal
Tags: fixed
Done: Assaf Gordon <assafgordon <at> gmail.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 5918 in the body.
You can then email your comments to 5918 AT debbugs.gnu.org in the normal way.
Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#5918; Package coreutils. (Sat, 10 Apr 2010 00:33:02 GMT)
Acknowledgement sent to Heinrich Langos <henrik-gnu <at> prak.org>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 10 Apr 2010 00:33:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Hello Andreas, Samuel and list,
sorry to pick up such an old thread, but I stumbled upon it while
looking for an efficient way to "re-sparse" files that contain a
lot of zero blocks but
1) had already been expanded
or
2) are being expanded due to pipes.
On Sun, Dec 30, 2007 at 10:19:54AM +0100, Andreas Schwab wrote:
> Samuel Thibault <samuel.thibault <at> ens-lyon.org> writes:
>
> > Some time ago, I wrote a conv=sparse option for dd, attached is the
> > patch.
>
> How is it different from cp --sparse=always?
I'd say in enough ways to make such an option highly desirable.
a) "dd" will maintain an existing of=target file including the inode
number, thus respecting existing hard links. "cp" will, depending
on the other options given (e.g. "-a"), either maintain or break
existing hard links to an existing target file.
b) "dd" could read a stream from a device or stdin and write it directly
to a sparse file. There is no need to "dd" from e.g. a block device to a
file and afterwards do a "cp --sparse=always file sparse-file". This would
save a lot of disk space, I/O operations and time.
example transcript for a) :
1 hlangos <at> jukebox:~/sparse$ ls -lis
2 total 1984
3 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:55 non-sparse
4 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:55 non-sparse2
5 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:55 non-sparse3
6 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:55 non-sparse4
7 114690 0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 sparse
8 hlangos <at> jukebox:~/sparse$ cp sparse non-sparse
9 hlangos <at> jukebox:~/sparse$ ls -lis
10 total 0
11 114692 0 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:56 non-sparse
12 114692 0 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:56 non-sparse2
13 114692 0 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:56 non-sparse3
14 114692 0 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:56 non-sparse4
15 114690 0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 sparse
16 hlangos <at> jukebox:~/sparse$ dd if=/dev/zero bs=1 count=500000 of=non-sparse
17 500000+0 records in
18 500000+0 records out
19 500000 bytes (500 kB) copied, 3.96621 s, 126 kB/s
20 hlangos <at> jukebox:~/sparse$ ls -lis
21 total 1984
22 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:57 non-sparse
23 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:57 non-sparse2
24 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:57 non-sparse3
25 114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:57 non-sparse4
26 114690 0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 sparse
27 hlangos <at> jukebox:~/sparse$ cp -a sparse non-sparse
28 hlangos <at> jukebox:~/sparse$ ls -lis
29 total 1488
30 114691 0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 non-sparse
31 114692 496 -rw-r--r-- 3 hlangos hlangos 500000 2010-04-10 01:57 non-sparse2
32 114692 496 -rw-r--r-- 3 hlangos hlangos 500000 2010-04-10 01:57 non-sparse3
33 114692 496 -rw-r--r-- 3 hlangos hlangos 500000 2010-04-10 01:57 non-sparse4
34 114690 0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 sparse
35 hlangos <at> jukebox:~/sparse$
As you can see in line 30, a new "non-sparse" file has been created with a different inode
number, while the link count of the other "non-sparse*" files has been reduced.
I'd very much like to see the patch make it into "dd", though I think it might be
better to integrate that function as "oflag=sparse" instead of "conv=sparse".
After all, you don't convert data but change the way the output is written.
cheers
-henrik
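The difference described in a) comes down to whether a tool rewrites the existing target in place or removes it and creates a new file. The following is a minimal Python sketch of those two behaviours, not the actual logic of "dd" or "cp"; the function names and file names are hypothetical, and it assumes a filesystem that supports hard links.

```python
import os
import tempfile


def overwrite_in_place(path, data):
    """Open the existing file with truncation and rewrite it (same inode)."""
    with open(path, "wb") as f:
        f.write(data)


def replace_with_new_file(path, data):
    """Unlink the target first, then create a fresh file (new inode)."""
    os.unlink(path)
    with open(path, "wb") as f:
        f.write(data)


def demo():
    """Return (inode_kept, link_count_of_second_name) after each strategy."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "non-sparse")
        second = os.path.join(d, "non-sparse2")
        with open(first, "wb") as f:
            f.write(b"x" * 1000)
        os.link(first, second)  # two names, one inode
        inode = os.stat(first).st_ino

        overwrite_in_place(first, b"y" * 1000)
        kept = (os.stat(first).st_ino == inode, os.stat(second).st_nlink)

        replace_with_new_file(first, b"z" * 1000)
        broken = (os.stat(first).st_ino == inode, os.stat(second).st_nlink)
        return kept, broken


if __name__ == "__main__":
    print(demo())
```

After the in-place rewrite both names still share one inode; after the unlink-and-recreate step the first name points at a new inode and the link count of the second name drops by one, which matches what line 30 of the transcript shows.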
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#5918; Package coreutils. (Sat, 10 Apr 2010 15:34:02 GMT)
Message #8 received at 5918 <at> debbugs.gnu.org:
On 10/04/10 01:28, Heinrich Langos wrote:
> Hello Andreas, Samuel and list,
>
> sorry to pick up such an old thread, but I stumbled upon it while
> looking for an efficient way to "re-sparse" files that contain a
> lot of zero blocks but
> 1) had already been expanded
> or
> 2) are being expanded due to pipes.
>
> On Sun, Dec 30, 2007 at 10:19:54AM +0100, Andreas Schwab wrote:
>> Samuel Thibault <samuel.thibault <at> ens-lyon.org> writes:
>>
>>> Some time ago, I wrote a conv=sparse option for dd, attached is the
>>> patch.
>>
>> How is it different from cp --sparse=always?
>
> I'd say in enough ways to make such an option highly desirable.
>
> a) "dd" will maintain an existing of=target file including the inode
> number, thus respecting existing hard links. "cp" will depending
> on the other options given (e.g. "-a") maintain or break existing
> hard links to an existing target file.
I don't think that's possible as holes can only be created at the end of a file.
Well I think NTFS supports punching holes in the "middle" but it's not common.
>
> b) "dd" could read a stream from a device or stdin and write it directly
> to a sparse file. no need to "dd" from e.g. a block device to a file and
> afterwards do a "cp --sparse=always file sparse-file". this will save a
> lot of disk space, io operations and time.
This seems to work:
cp --sparse=always /dev/stdin file
cheers,
Pádraig.
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#5918; Package coreutils. (Sat, 10 Apr 2010 17:18:01 GMT)
Message #11 received at 5918 <at> debbugs.gnu.org:
Pádraig Brady wrote on Sat 10 Apr 2010 16:33:07 +0100:
> On 10/04/10 01:28, Heinrich Langos wrote:
> > a) "dd" will maintain an existing of=target file including the inode
> > number, thus respecting existing hard links. "cp" will depending
> > on the other options given (e.g. "-a") maintain or break existing
> > hard links to an existing target file.
>
> I don't think that's possible as holes can only be created at the end of a file.
> Well I think NTFS supports punching holes in the "middle" but it's not common.
I believe there's demand for supporting punching holes in the middle
of files and it will eventually show up in Linux. For instance, the
combination of IDE TRIM support and virtualization can allow virtualized
guests to take as little disk space as possible in file-backed virtual
disks.
Samuel
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#5918; Package coreutils. (Sun, 11 Apr 2010 14:02:01 GMT)
Message #14 received at 5918 <at> debbugs.gnu.org:
Pádraig Brady wrote:
> On 10/04/10 01:28, Heinrich Langos wrote:
>> Hello Andreas, Samuel and list,
>>
>> sorry to pick up such an old thread, but I stumbled upon it while
>> looking for an efficient way to "re-sparse" files that contain a
>> lot of zero blocks but
>> 1) had already been expanded
>> or
>> 2) are being expanded due to pipes.
>>
>> On Sun, Dec 30, 2007 at 10:19:54AM +0100, Andreas Schwab wrote:
>>> Samuel Thibault <samuel.thibault <at> ens-lyon.org> writes:
>>>
>>>> Some time ago, I wrote a conv=sparse option for dd, attached is the
>>>> patch.
>>>
>>> How is it different from cp --sparse=always?
>>
>> I'd say in enough ways to make such an option highly desirable.
>>
>> a) "dd" will maintain an existing of=target file including the inode
>> number, thus respecting existing hard links. "cp" will depending
>> on the other options given (e.g. "-a") maintain or break existing
>> hard links to an existing target file.
>
> I don't think that's possible as holes can only be created at the end of a file.
> Well I think NTFS supports punching holes in the "middle" but it's not common.
I would like at least cp to be able to copy sparse files efficiently,
and considering the FIEMAP patches that Jeff Liu is working on, we
don't have long to wait.
BTW, I'm pretty sure it is possible to punch a hole in the middle of
a file with XFS. Maybe with other CoW file systems, too?
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#5918; Package coreutils. (Tue, 13 Apr 2010 10:32:02 GMT)
Message #17 received at submit <at> debbugs.gnu.org:
On Sat, Apr 10, 2010 at 06:46:13PM +0200, Samuel Thibault wrote:
> Pádraig Brady wrote on Sat 10 Apr 2010 16:33:07 +0100:
> > On 10/04/10 01:28, Heinrich Langos wrote:
> > > a) "dd" will maintain an existing of=target file including the inode
> > > number, thus respecting existing hard links. "cp" will depending
> > > on the other options given (e.g. "-a") maintain or break existing
> > > hard links to an existing target file.
> >
> > I don't think that's possible as holes can only be created at the end of a file.
> > Well I think NTFS supports punching holes in the "middle" but it's not common.
I was not advocating support for punching holes in existing files (though
this is what I want to do in the end).
I was only interested in an option that would create sparse output files by
seeking forward in the output file whenever runs of zero bytes occur in the
input stream. This is completely independent of the filesystem underneath. It
should even work if the FS doesn't support holes at all, as long as the
interface is POSIX compliant.
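The seek-forward approach described above can be sketched in a few lines of Python. This is an illustration only, not the code of the proposed patch; the function name write_sparse and the 4096-byte block size are assumptions made for the example, and whether the skipped ranges actually become holes depends on the filesystem.

```python
import io
import os
import tempfile


def write_sparse(src, dst, block_size=4096):
    """Copy src to dst, seeking over all-zero blocks instead of writing them.

    dst must be a seekable binary file that starts out empty (for example
    opened with mode "wb"); otherwise skipped ranges would keep old data.
    """
    zeros = bytes(block_size)
    while True:
        block = src.read(block_size)
        if not block:
            break
        if block == zeros[:len(block)]:
            # Skip ahead; the gap reads back as zeros and may become a hole.
            dst.seek(len(block), os.SEEK_CUR)
        else:
            dst.write(block)
    # Seeking alone does not extend a file, so fix the final length
    # explicitly in case the input ended with zero blocks.
    dst.truncate(dst.tell())


def demo():
    """Round-trip some data with zero runs; return (size, contents_match)."""
    data = b"A" * 4096 + bytes(8192) + b"B" * 100 + bytes(5000)
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "out")
        with open(path, "wb") as f:
            write_sparse(io.BytesIO(data), f)
        with open(path, "rb") as f:
            return os.path.getsize(path), f.read() == data


if __name__ == "__main__":
    print(demo())
```

The logical size and contents of the output are identical to the input on any POSIX filesystem; only the number of allocated blocks differs where holes are supported. Because the input is read sequentially, the same loop works for a pipe or a block device.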
> I believe there's demand for supporting punching holes in the middle
> of files and it will eventually show up in Linux. For instance, the
> combination of IDE TRIM support and virtualization can allow virtualized
> guest to take as less disk space as possible in file-backed virtual
> disks.
True. "Thin provisioning", that is, the use of disk images that start out
by allocating only the used disk blocks and grow on demand, suffers from
the inability of guest systems to "give back" unused blocks whenever a block
is released. There are ways around it, but none of them work very well with
a live guest.
The question there is whether this is a task for "dd" or for a specialized
tool.
-henrik
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#5918; Package coreutils. (Tue, 13 Apr 2010 12:16:02 GMT)
Message #20 received at submit <at> debbugs.gnu.org:
On Sat, Apr 10, 2010 at 04:33:07PM +0100, Pádraig Brady wrote:
> On 10/04/10 01:28, Heinrich Langos wrote:
> > Hello Andreas, Samuel and list,
> >
> > sorry to pick up such an old thread, but I stumbled upon it while
> > looking for an efficient way to "re-sparse" files that contain a
> > lot of zero blocks but
> > 1) had already been expanded
> > or
> > 2) are being expanded due to pipes.
> >
> > On Sun, Dec 30, 2007 at 10:19:54AM +0100, Andreas Schwab wrote:
> >> Samuel Thibault <samuel.thibault <at> ens-lyon.org> writes:
> >>
> >>> Some time ago, I wrote a conv=sparse option for dd, attached is the
> >>> patch.
> >>
> >> How is it different from cp --sparse=always?
> >
> > I'd say in enough ways to make such an option highly desirable.
> >
> > a) "dd" will maintain an existing of=target file including the inode
> > number, thus respecting existing hard links. "cp" will depending
> > on the other options given (e.g. "-a") maintain or break existing
> > hard links to an existing target file.
>
> I don't think that's possible as holes can only be created at the end of a file.
> Well I think NTFS supports punching holes in the "middle" but it's not common.
>
> >
> > b) "dd" could read a stream from a device or stdin and write it directly
> > to a sparse file. no need to "dd" from e.g. a block device to a file and
> > afterwards do a "cp --sparse=always file sparse-file". this will save a
> > lot of disk space, io operations and time.
>
> This seems to work:
> cp --sparse=always /dev/stdin file
Yep. That worked!
> hlangos <at> pc-hlangos:~/zaurus$ ls -lisa foo
> 958477 4 -rw-r--r-- 1 hlangos hlangos 3072 2010-04-13 12:12 foo
> hlangos <at> pc-hlangos:~/zaurus$ dd if=/dev/zero bs=1k count=100 | cp --sparse=always /dev/stdin foo
> 100+0 records in
> 100+0 records out
> 102400 bytes (102 kB) copied, 0.0802346 s, 1.3 MB/s
> hlangos <at> pc-hlangos:~/zaurus$ ls -lisa foo
> 958477 0 -rw-r--r-- 1 hlangos hlangos 102400 2010-04-13 14:06 foo
It doesn't change the target file's inode (and also maintains the existing
hard links).
Cheers
-henrik
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#5918; Package coreutils. (Mon, 22 Nov 2010 16:16:02 GMT)
Message #23 received at 5918 <at> debbugs.gnu.org:
On 11/04/10 15:01, Jim Meyering wrote:
> Pádraig Brady wrote:
>> On 10/04/10 01:28, Heinrich Langos wrote:
>>> Hello Andreas, Samuel and list,
>>>
>>> sorry to pick up such an old thread, but I stumbled upon it while
>>> looking for an efficient way to "re-sparse" files that contain a
>>> lot of zero blocks but
>>> 1) had already been expanded
>>> or
>>> 2) are being expanded due to pipes.
>>>
>>> On Sun, Dec 30, 2007 at 10:19:54AM +0100, Andreas Schwab wrote:
>>>> Samuel Thibault <samuel.thibault <at> ens-lyon.org> writes:
>>>>
>>>>> Some time ago, I wrote a conv=sparse option for dd, attached is the
>>>>> patch.
>>>>
>>>> How is it different from cp --sparse=always?
>>>
>>> I'd say in enough ways to make such an option highly desirable.
>>>
>>> a) "dd" will maintain an existing of=target file including the inode
>>> number, thus respecting existing hard links. "cp" will depending
>>> on the other options given (e.g. "-a") maintain or break existing
>>> hard links to an existing target file.
>>
>> I don't think that's possible as holes can only be created at the end of a file.
>> Well I think NTFS supports punching holes in the "middle" but it's not common.
>
> I would like at least cp to be able to copy sparse files efficiently,
> and considering the FIEMAP patches that Jeff Liu is working on, we
> don't have long to wait.
>
> BTW, I'm pretty sure it is possible to punch a hole in the middle of
> a file with XFS. Maybe with other CoW file systems, too?
They're just now adding FALLOC_FL_PUNCH_HOLE to fallocate() in the Linux kernel.
It's supported by xfs and ocfs2; other filesystems will, for now, return EOPNOTSUPP.
fallocate(..,FALLOC_FL_PUNCH_HOLE) will return EOPNOTSUPP on older kernels,
even for xfs and ocfs2.
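A caller therefore has to treat EOPNOTSUPP as "not available here" and fall back to something else. The following Python sketch shows that pattern by calling fallocate() through ctypes. It assumes a 64-bit Linux system whose C library exports fallocate(); the flag values are copied from <linux/falloc.h>, and the names punch_hole and demo are hypothetical.

```python
import ctypes
import ctypes.util
import errno
import os
import tempfile

# Flag values from <linux/falloc.h>. Per fallocate(2), PUNCH_HOLE must be
# combined with KEEP_SIZE.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02


def punch_hole(fd, offset, length):
    """Try to deallocate [offset, offset + length) in the open file fd.

    Returns True on success and False when the libc, kernel, or filesystem
    does not support hole punching. Other errors raise OSError.
    """
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    fallocate = getattr(libc, "fallocate", None)
    if fallocate is None:
        return False
    # Assumption: off_t is 64 bits wide (true on 64-bit Linux).
    fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                          ctypes.c_int64, ctypes.c_int64]
    fallocate.restype = ctypes.c_int
    mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
    if fallocate(fd, mode, offset, length) == 0:
        return True
    err = ctypes.get_errno()
    if err in (errno.EOPNOTSUPP, errno.ENOSYS):
        return False
    raise OSError(err, os.strerror(err))


def demo():
    """Punch an 8 KiB hole into a 16 KiB file; return (ok, file_contents)."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"x" * 16384)
        f.flush()
        ok = punch_hole(f.fileno(), 4096, 8192)
        f.seek(0)
        return ok, f.read()


if __name__ == "__main__":
    ok, data = demo()
    print(ok, len(data))
```

When the call succeeds, the file keeps its size and the punched range reads back as zeros; when it is unsupported, the file is left untouched and the caller can decide how to proceed.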
cheers,
Pádraig.
Information forwarded to bug-coreutils <at> gnu.org: bug#5918; Package coreutils. (Wed, 10 Oct 2018 16:45:02 GMT)
Message #26 received at 5918 <at> debbugs.gnu.org:
tags 5918 fixed
close 5918
stop
Hello,
Coreutils version 8.16 (released 2012) gained the "dd conv=sparse" option; see
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=4e776faa8482ae630d2ea9bc767298e664f07ba9
Closing this bug.
regards,
- assaf
Added tag(s) fixed. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 10 Oct 2018 16:45:02 GMT)
bug closed, send any further explanations to 5918 <at> debbugs.gnu.org and Heinrich Langos <henrik-gnu <at> prak.org>. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 10 Oct 2018 16:45:03 GMT)
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 08 Nov 2018 12:24:05 GMT)
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.