GNU bug report logs - #23113
parallel gzip processes trash hard disks, need larger buffers
Message #14 received at 23113 <at> debbugs.gnu.org:
Here are some other approaches which may help:
1. Use gzopen() from zlib to compress the 10GB file as it is generated.
This uses only one CPU core and writes strictly sequentially
(no random writes), but that may be enough in some cases; a minimal
C sketch appears after this list.
2. The output from gzip is written 32 KiB at a time, so a large output file
is grown in many small increments. Buffering the output from gzip
into larger blocks may therefore help, too. Try:
gzip ... | dd obs=... of=...
For example, obs=1M makes dd collect the 32 KiB pieces and write them
out as 1 MiB blocks. (A C equivalent of this rebuffering is sketched
at the end of this message.)
3. Similarly, dd can buffer the input to gzip:
dd if=... ibs=... obs=... | gzip ...
4. dd can also be used to create multiple streams of input
from a single file:
(dd if=file ibs=... skip=0*N count=N obs=... | gzip ... ) &
(dd if=file ibs=... skip=1*N count=N obs=... | gzip ... ) &
(dd if=file ibs=... skip=2*N count=N obs=... | gzip ... ) &
(dd if=file ibs=... skip=3*N count=N obs=... | gzip ... ) &
However dd does not perform arithmetic, so each product j*N must be
written as a literal value; for example, with N=1000 the four commands
would use skip=0, skip=1000, skip=2000, and skip=3000. Each backgrounded
gzip also needs its own output file so that the four compressed streams
do not interleave.
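To make approach 1 concrete, here is a minimal sketch of streaming
compression through zlib's gzopen()/gzwrite()/gzclose(); build it with
"cc sketch.c -lz". The output name "output.gz", the 64 KiB chunk size,
and the use of stdin as the stand-in data generator are illustrative
assumptions, not details from the original report:

#include <stdio.h>
#include <zlib.h>

int main(void)
{
    gzFile out = gzopen("output.gz", "wb");  /* illustrative name */
    if (out == NULL) {
        fprintf(stderr, "gzopen failed\n");
        return 1;
    }

    char buf[64 * 1024];  /* 64 KiB read chunk, an arbitrary choice */
    size_t n;
    /* stdin stands in for whatever generates the 10GB of data;
       each chunk is compressed and appended sequentially. */
    while ((n = fread(buf, 1, sizeof buf, stdin)) > 0) {
        if (gzwrite(out, buf, (unsigned)n) == 0) {
            fprintf(stderr, "gzwrite failed\n");
            gzclose(out);
            return 1;
        }
    }

    if (gzclose(out) != Z_OK) {
        fprintf(stderr, "gzclose failed\n");
        return 1;
    }
    return 0;
}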
The dd utility program is quite versatile!
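Along those lines, the write-side buffering that dd's obs= provides
takes only a few lines of C. The sketch below is an assumed-equivalent
filter (the name "rebuffer" is hypothetical), with the 1 MiB output
block size an arbitrary illustrative value; used as
"gzip ... | ./rebuffer > file.gz" it plays the same role as
"dd obs=1M of=file.gz" above:

#include <stdio.h>
#include <unistd.h>

#define OBS (1024 * 1024)  /* output block size, like dd's obs=1M */

/* Write all len bytes, retrying on short writes. */
static int write_all(int fd, const char *p, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

int main(void)
{
    static char buf[OBS];
    size_t fill = 0;
    ssize_t n;

    /* Collect small reads until a full OBS-sized block is ready,
       then issue one large write, so the output file grows in
       1 MiB steps instead of 32 KiB ones. */
    while ((n = read(STDIN_FILENO, buf + fill, OBS - fill)) > 0) {
        fill += (size_t)n;
        if (fill == OBS) {
            if (write_all(STDOUT_FILENO, buf, OBS) < 0) {
                perror("write");
                return 1;
            }
            fill = 0;
        }
    }
    if (n < 0) {
        perror("read");
        return 1;
    }
    /* Flush any final partial block. */
    if (fill > 0 && write_all(STDOUT_FILENO, buf, fill) < 0) {
        perror("write");
        return 1;
    }
    return 0;
}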