GNU bug report logs - #5918
[dd] conv=sparse option


Package: coreutils

Reported by: Heinrich Langos <henrik-gnu <at> prak.org>

Date: Sat, 10 Apr 2010 00:33:02 UTC

Severity: normal

Tags: fixed

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.


From: Heinrich Langos <henrik-gnu <at> prak.org>
To: Andreas Schwab <schwab <at> suse.de>
Cc: samuel.thibault <at> ens-lyon.org, 5918 <at> debbugs.gnu.org
Subject: bug#5918: [dd] conv=sparse option
Date: Sat, 10 Apr 2010 02:28:57 +0200
Hello Andreas, Samuel and list,

Sorry to pick up such an old thread, but I stumbled upon it while
looking for an efficient way to "re-sparse" files that contain a
lot of zero blocks but
1) have already been expanded
or
2) are being expanded because they pass through a pipe.

On Sun, Dec 30, 2007 at 10:19:54AM +0100, Andreas Schwab wrote:
> Samuel Thibault <samuel.thibault <at> ens-lyon.org> writes:
> 
> > Some time ago, I wrote a conv=sparse option for dd, attached is the
> > patch.
> 
> How is it different from cp --sparse=always?

I'd say in enough ways to make such an option highly desirable.

a) "dd" writes into an existing of=target file in place, keeping its inode
   number and thus respecting existing hard links. "cp" will, depending
   on the other options given (e.g. "-a"), either maintain or break
   existing hard links to the target file.

b) "dd" could read a stream from a device or stdin and write it directly
   to a sparse file. There is no need to "dd" from e.g. a block device to a
   file and afterwards do a "cp --sparse=always file sparse-file". This will
   save a lot of disk space, I/O operations and time.
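
To make a) and b) a bit more concrete, here is a rough sketch in Python of
what such an option boils down to. It is not Samuel's patch and not how "dd"
is implemented, just the general technique: read the stream block by block,
and seek over all-zero blocks instead of writing them. The name write_sparse
and the 4096-byte default block size are my own choices for illustration.

```python
import os


def write_sparse(src, dst_path, block_size=4096):
    """Copy the binary stream `src` into `dst_path`, skipping all-zero blocks.

    The target is truncated and rewritten in place (no unlink, no rename),
    so an existing file keeps its inode number and its hard links.
    Returns the number of bytes consumed from `src`.
    """
    # O_TRUNC empties an existing file without replacing its inode.
    fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    total = 0
    with os.fdopen(fd, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if block.count(0) == len(block):
                # Only zero bytes: move the file offset forward instead of
                # writing, which leaves a hole if the filesystem allows it.
                dst.seek(len(block), os.SEEK_CUR)
            else:
                dst.write(block)
            total += len(block)
        # Fix the final length so that a trailing run of zeros is kept.
        dst.truncate(total)
    return total
```

Called with sys.stdin.buffer as src, this can sit at the end of a pipe, so
the expanded copy never has to hit the disk. Whether the skipped ranges
really end up as holes depends on the filesystem.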


example transcript for a) :

     1  hlangos <at> jukebox:~/sparse$ ls -lis                                  
     2  total 1984                                                         
     3  114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:55 non-sparse
     4  114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:55 non-sparse2
     5  114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:55 non-sparse3
     6  114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:55 non-sparse4
     7  114690   0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 sparse
     8  hlangos <at> jukebox:~/sparse$ cp sparse non-sparse
     9  hlangos <at> jukebox:~/sparse$ ls -lis
    10  total 0
    11  114692 0 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:56 non-sparse
    12  114692 0 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:56 non-sparse2
    13  114692 0 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:56 non-sparse3
    14  114692 0 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:56 non-sparse4
    15  114690 0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 sparse
    16  hlangos <at> jukebox:~/sparse$ dd if=/dev/zero bs=1 count=500000 of=non-sparse
    17  500000+0 records in
    18  500000+0 records out
    19  500000 bytes (500 kB) copied, 3.96621 s, 126 kB/s
    20  hlangos <at> jukebox:~/sparse$ ls -lis
    21  total 1984
    22  114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:57 non-sparse
    23  114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:57 non-sparse2
    24  114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:57 non-sparse3
    25  114692 496 -rw-r--r-- 4 hlangos hlangos 500000 2010-04-10 01:57 non-sparse4
    26  114690   0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 sparse
    27  hlangos <at> jukebox:~/sparse$ cp -a sparse non-sparse
    28  hlangos <at> jukebox:~/sparse$ ls -lis
    29  total 1488
    30  114691   0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 non-sparse
    31  114692 496 -rw-r--r-- 3 hlangos hlangos 500000 2010-04-10 01:57 non-sparse2
    32  114692 496 -rw-r--r-- 3 hlangos hlangos 500000 2010-04-10 01:57 non-sparse3
    33  114692 496 -rw-r--r-- 3 hlangos hlangos 500000 2010-04-10 01:57 non-sparse4
    34  114690   0 -rw-r--r-- 1 hlangos hlangos 500000 2010-04-10 01:56 sparse
    35  hlangos <at> jukebox:~/sparse$

As you can see in line 30, a new "non-sparse" file has been created with a different
inode number, while the link count of the other "non-sparse*" files has been reduced.


I'd very much like to see the patch make it into "dd", though I think it might be
better to integrate that function as "oflag=sparse" instead of "conv=sparse".
After all, you don't convert the data; you only change the way the output is written.

cheers
-henrik






This bug report was last modified 6 years and 228 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.