GNU bug report logs - #7649: sparse files and commands
Reported by: support <at> sigmagames.com
Date: Wed, 15 Dec 2010 20:41:02 UTC
Severity: normal
Done: Assaf Gordon <assafgordon <at> gmail.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 7649 in the body.
You can then email your comments to 7649 AT debbugs.gnu.org in the normal way.
Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#7649; Package coreutils. (Wed, 15 Dec 2010 20:41:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent to support <at> sigmagames.com: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Wed, 15 Dec 2010 20:41:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi,
Just an idea:
I am working with sparse text (ASCII) files and need command-line options for them.
For instance, cat could have --sparse=text,
meaning two things:
1) the file is sparse, so skip the sparse records;
2) the file is text, so if (byte == 0) skip it; this is sure to work.
Maybe do this for all commands.
Thanks,
Charles Carter, President, Sigma Software, Inc.
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#7649; Package coreutils. (Wed, 15 Dec 2010 21:10:03 GMT)
Message #8 received at 7649 <at> debbugs.gnu.org (full text, mbox):
On 12/15/2010 02:01 PM, support <at> sigmagames.com wrote:
> Hi,
> Just an idea:
> Am working with sparse text (ascii) files and need command line options.
> For instance, cat could have --sparse=text
> meaning 2 things:
> 1) it is a sparse file and so skip sparse recs.
> 2) it is text and so if(byte==0) skip it, it is sure to work.
> maybe do this for all commands.
Rather than modify lots of commands to add a new option, why not use
existing features of existing commands to do what you need?
  tr -d '\0'
is a great way to remove NUL bytes (including sparse blocks) from a
file. Then, for everywhere you want to use a sparse file but ignore the
NUL bytes, you instead just pass a pipe with tr doing the work for you.
  cmp <(tr -d '\0' < file) otherfile
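
A minimal sketch of that pipeline, for illustration only. The file names are hypothetical and live in a temporary directory, and a file with embedded NUL bytes stands in for a sparse text file. The sketch uses a plain pipe into `cmp -` rather than process substitution so that it also runs under a POSIX `sh`.

```shell
#!/bin/sh
# Sketch: strip NUL bytes with tr, then compare against a NUL-free reference.
# All paths below are hypothetical test files in a temporary directory.
tmpdir=$(mktemp -d)

# A text file with embedded NUL bytes, standing in for a sparse text file.
printf 'hello\000\000\000world\n' > "$tmpdir/file"
# The same text without the NUL bytes.
printf 'helloworld\n' > "$tmpdir/otherfile"

# Delete the NULs and let cmp read the filtered stream from stdin ("-").
if tr -d '\0' < "$tmpdir/file" | cmp -s - "$tmpdir/otherfile"; then
    result=same
else
    result=differ
fi
echo "$result"

rm -rf "$tmpdir"
```

Because `tr` still reads every byte, including the zeros the kernel synthesizes for holes, this is correct but not fast on very sparse input.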
Meanwhile, I can certainly see a case for teaching tr(1) to optimize
deletion of just the NUL byte in order to be more efficient on sparse
files. In fact, we are already in the middle of optimizing cp based on
sparseness, and have plans to optimize other tools like cmp and tar to
reuse the framework developed for cp to detect sparse blocks where the
system supports such information, so tr is an easy tool to add to that
list of potential beneficiaries.
That way, what you want to do works now (albeit not with optimal speed)
and might improve in the future (if we teach tr to special case NUL
elision); and we don't have to spend time retrofitting a new option onto
scores of programs (where you would have to wait for the new coreutils
to propagate to the machines you use).
--
Eric Blake eblake <at> redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#7649; Package coreutils. (Wed, 15 Dec 2010 22:07:02 GMT)
Message #11 received at 7649 <at> debbugs.gnu.org (full text, mbox):
[please keep the list in the loop, and don't top-post on technical lists]
On 12/15/2010 03:20 PM, support <at> sigmagames.com wrote:
> THANKS, I see what you mean.
> Also food for thought, IF OS supports it, skip file blocks that are all null in a sparse file.
> This way less IO and less processing.
That's what I was alluding to - we are working on adding fiemap support
here:
http://git.savannah.gnu.org/cgit/coreutils.git/log/?h=fiemap-copy
as well as a proposal for a hole iterator on OSs that support it (Linux
supports it for ext4 and btrfs via the fiemap ioctl; Solaris supports it via
lseek(,SEEK_HOLE)):
http://lists.gnu.org/archive/html/bug-coreutils/2010-07/msg00117.html
Note that an explicit block of all 0s cannot be efficiently skipped;
this only works for sparse files (where the block is completely
represented in the metadata of the file, and is implicitly all 0s).
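
To make the distinction concrete, here is a hedged sketch that builds one file consisting of a single hole and one file of explicitly written zeros. It assumes GNU coreutils (`truncate` is not POSIX) and hypothetical file names. Whether the first file really occupies fewer blocks depends on the filesystem, so the `du` figures are printed without any claim about their values.

```shell
#!/bin/sh
# Sketch: a hole and a block of written zeros read back as the same bytes.
# Paths are hypothetical; truncate is assumed to be GNU coreutils.
tmpdir=$(mktemp -d)

# 1 MiB file that is all hole: only its length is recorded,
# provided the filesystem supports sparse files.
truncate -s 1048576 "$tmpdir/hole"

# 1 MiB file whose zero bytes were written explicitly.
dd if=/dev/zero of="$tmpdir/zeros" bs=1048576 count=1 2>/dev/null

# Readers see identical contents in both files.
if cmp -s "$tmpdir/hole" "$tmpdir/zeros"; then
    contents=identical
else
    contents=different
fi
hole_size=$(wc -c < "$tmpdir/hole")
echo "contents: $contents, apparent size of hole file: $hole_size"

# Allocated space (in KiB) may differ; the numbers are filesystem-dependent.
du -k "$tmpdir/hole" "$tmpdir/zeros"

rm -rf "$tmpdir"
```

The two files are indistinguishable to a reader, which is why only metadata-level queries such as fiemap or SEEK_HOLE can tell a tool that a region is safe to skip.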
Also, there has been movement to get Linux to add support for punching
holes into an existing file (right now, the only way to make a file more
sparse is to create a new file with the same contents while leaving
holes where the source had all-0 blocks, but that's obviously not as
efficient as modifying a file in-place to add a hole).
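
The copy-based approach described above can be sketched with GNU cp's `--sparse=always` option, which asks cp to create a hole in the destination wherever the source contains a long enough run of zero bytes. The file names are hypothetical, and the space actually saved depends on the filesystem, so the sketch only checks that the contents are preserved.

```shell
#!/bin/sh
# Sketch: make a sparser copy of a file that contains written zero blocks.
# Paths are hypothetical; --sparse=always is a GNU cp option.
tmpdir=$(mktemp -d)

# Source: 1 MiB of explicitly written zeros followed by a little text.
dd if=/dev/zero of="$tmpdir/dense" bs=1048576 count=1 2>/dev/null
printf 'tail\n' >> "$tmpdir/dense"

# Copy, turning runs of zeros into holes where the filesystem allows it.
cp --sparse=always "$tmpdir/dense" "$tmpdir/sparse"

# The copy must read back byte-for-byte identical to the source.
if cmp -s "$tmpdir/dense" "$tmpdir/sparse"; then
    copy_status=identical
else
    copy_status=different
fi
echo "copy is $copy_status"

# Allocated space in KiB for comparison; results vary by filesystem.
du -k "$tmpdir/dense" "$tmpdir/sparse"

rm -rf "$tmpdir"
```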
Once hole iteration is implemented (and right now, cp is our guinea
pig), then we can port that effort to a number of other programs (tr,
cmp, tar, ...) to make for more efficient I/O on files where we can
behave differently if we know that a block is sparse.
--
Eric Blake eblake <at> redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
Information forwarded to bug-coreutils <at> gnu.org: bug#7649; Package coreutils. (Tue, 30 Oct 2018 07:56:01 GMT)
Message #14 received at 7649 <at> debbugs.gnu.org (full text, mbox):
close 7649
stop
(triaging old bugs)
With no follow-ups in 7 years, I'm closing this bug.
-assaf
bug closed, send any further explanations to 7649 <at> debbugs.gnu.org and support <at> sigmagames.com. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Tue, 30 Oct 2018 07:56:02 GMT)
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 27 Nov 2018 12:24:08 GMT)
This bug report was last modified 6 years and 209 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.