GNU bug report logs - #7649
sparse files and commands

Package: coreutils

Reported by: support <at> sigmagames.com

Date: Wed, 15 Dec 2010 20:41:02 UTC

Severity: normal

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

From: Eric Blake <eblake <at> redhat.com>
To: support <at> sigmagames.com
Cc: 7649 <at> debbugs.gnu.org
Subject: bug#7649: sparse files and commands
Date: Wed, 15 Dec 2010 14:16:04 -0700
On 12/15/2010 02:01 PM, support <at> sigmagames.com wrote:
> Hi,
>   Just an idea:
> I am working with sparse text (ASCII) files and need command-line options.
> For instance, cat could have --sparse=text,
> meaning two things:
> 1) it is a sparse file, so skip sparse records.
> 2) it is text, so if (byte == 0) skip it; that is sure to work.
> Maybe do this for all commands.

Rather than modify lots of commands to add a new option, why not use
existing features of existing commands to do what you need?

tr -d '\0'

is a great way to remove NUL bytes from a file, including the holes of
a sparse file, which read back as NUL bytes.  Then, wherever you want to
use a sparse file but ignore the NUL bytes, you feed the command through
a pipe and let tr do the work for you.

cmp <(tr -d '\0' < file) otherfile
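
The cmp line above uses process substitution, which needs bash, ksh, or
zsh.  As a rough sketch of the same idea in plain sh, the filtered data
can be piped to cmp on standard input instead.  The helper name
cmp_ignoring_nul and the temporary file names are made up for this
illustration; they are not part of coreutils.

```shell
#!/bin/sh
# Hypothetical helper (not a coreutils command): compare file $1 with
# file $2 while ignoring every NUL byte in $1, holes included.
cmp_ignoring_nul() {
    # "cmp - FILE" reads the first operand from standard input.
    tr -d '\0' < "$1" | cmp - "$2"
}

# Demo with assumed file names: build a text file with a gap in the middle.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

printf 'hello\n' > "$tmp/sparse.txt"
# Write the second line at offset 1 MiB; the skipped range reads as NULs.
printf 'world\n' | dd of="$tmp/sparse.txt" bs=1 seek=1048576 conv=notrunc 2>/dev/null

printf 'hello\nworld\n' > "$tmp/plain.txt"

cmp_ignoring_nul "$tmp/sparse.txt" "$tmp/plain.txt" &&
    echo "identical apart from NUL bytes"
```

The same pattern should work for other readers (grep, wc, sort, and so
on): put tr -d '\0' in front of the pipe and leave the consumer as is.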

Meanwhile, I can certainly see a case for teaching tr(1) to optimize
deletion of just the NUL byte in order to be more efficient on sparse
files.  In fact, we are already in the middle of optimizing cp based on
sparseness, and have plans to optimize other tools like cmp and tar to
reuse the framework developed for cp to detect sparse blocks where the
system supports such information, so tr is an easy tool to add to that
list of potential beneficiaries.
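
To see from the shell whether a file is actually stored sparsely, one
rough check is to compare its apparent size with the space allocated
for it.  The sketch below is only an illustration of that check, not
the detection framework being developed for cp; sparse_report is a
hypothetical helper, and the allocated figure depends on the file
system.

```shell
#!/bin/sh
# Hypothetical helper: print a file's apparent size in bytes and the
# disk space allocated to it in KiB.  A much smaller allocated figure
# suggests the file contains holes.
sparse_report() {
    apparent=$(wc -c < "$1" | tr -d ' ')
    allocated=$(du -k "$1" | awk '{print $1}')
    printf '%s bytes apparent, %s KiB allocated\n' "$apparent" "$allocated"
}

# Demo: a 1 MiB file whose only written byte is the last one.
f=$(mktemp)
printf 'x' | dd of="$f" bs=1 seek=1048575 conv=notrunc 2>/dev/null
sparse_report "$f"
rm -f "$f"
```

On a file system without hole support, the two figures will be roughly
equal even for a file created this way.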

That way, what you want to do works now (albeit not at optimal speed)
and might get faster in the future (if we teach tr to special-case NUL
elision); and we don't have to spend time retrofitting a new option onto
scores of programs (where you would have to wait for the new coreutils
to propagate to the machines you use).

-- 
Eric Blake   eblake <at> redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org


This bug report was last modified 6 years and 209 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.