GNU bug report logs - #34110
du: add dual-column showing apparent-size and disk-size


Package: coreutils;

Reported by: René J.V. Bertin <rjvbertin <at> gmail.com>

Date: Wed, 16 Jan 2019 22:04:02 UTC

Severity: wishlist


From: René J.V. Bertin <rjvbertin <at> gmail.com>
To: 34110 <at> debbugs.gnu.org
Subject: bug#34110: feature request: dual-column du output, showing "real" and "on-disk" sizes (and about that "apparent-size" concept)
Date: Wed, 16 Jan 2019 21:13:15 +0100
Hi,

I hope feature requests are acceptable here.

Now that more and more filesystems support compression, it becomes more interesting to compare the actual file/directory (content) size with the corresponding on-disk size. Currently you have to call du twice to do that, which quickly becomes cumbersome in practice (command lines, parsing the output) and repeats the same I/O operations.

The code already obtains both size values at the same time, so it would make sense to do both calculations in one pass and provide an option to display the regular and "apparent-size" values as two columns of output. My guess is that the cost of calculating both values is negligible compared to the cost of the stat() call (and thus that there is no need to complicate the code with "calculate this and/or that" conditionals).

The option could be called --both, --columns (-C) or --two (-T).
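
To illustrate the idea, here is a minimal sketch in Python (not du's actual C code). It stats every entry once and accumulates both totals from that single result. The function names dual_sizes and dual_line, the column order (disk usage first, then apparent size) and the raw byte units are assumptions made for this example. It does not try to reproduce du's exact accounting rules.

```python
import os


def dual_sizes(root):
    """Return (apparent_bytes, disk_bytes) for root and everything below it.

    Hypothetical helper: each entry is stat'ed exactly once and both totals
    are taken from that one result. Hard links are counted once.
    """
    seen = set()
    totals = [0, 0]  # [apparent, disk]

    def add(path):
        st = os.lstat(path)  # a single call yields st_size and st_blocks
        key = (st.st_dev, st.st_ino)
        if key in seen:
            return
        seen.add(key)
        totals[0] += st.st_size
        # POSIX: st_blocks is counted in 512-byte units (not on Windows).
        totals[1] += st.st_blocks * 512

    add(root)
    # os.walk does not follow symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            add(os.path.join(dirpath, name))
    return totals[0], totals[1]


def dual_line(path):
    """Format one two-column line: disk usage, apparent size, path."""
    apparent, disk = dual_sizes(path)
    return f"{disk}\t{apparent}\t{path}"
```

Calling print(dual_line("/mnt/.npm")) would then give both numbers for that tree from one traversal, instead of two du runs.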

I'd also reconsider the "apparent-size" term as I think it is confusing and ambiguous. Consider this, taken from a ZFS dataset with gzip-9 compression (and copies=1; du v8.30):

%> du -hcs /Volumes/nif64/tmp/.npm/ ; du -hcs --apparent-size /Volumes/nif64/tmp/.npm/
340M    /Volumes/nif64/tmp/.npm/
180M    /Volumes/nif64/tmp/.npm/

Same folder on btrfs (mounted with compress=lzo):
%> du -hcs /mnt/.npm/ ; du -hcs --apparent-size  /mnt/.npm
198M    /mnt/.npm/
181M    /mnt/.npm

According to `du --help`, the apparent-size option reports a size that is not the actual disk usage. The numbers above seem to show the opposite.
If anything, I find the concept of "apparent size" more appropriate for the size a file occupies on the storage medium, because ultimately the storage device will not give you more than "struct stat : st_size" bytes on an uncompressed filesystem.
Put another way: with "--apparent-size", du returns the actual file size; without it, du returns how large the file appears to be (judging from its disk footprint).

For comparison, the same folder on a Mac with HFS+:
%> du -hcs /Volumes/VMs/.npm ; du -hcs --apparent-size /Volumes/VMs/.npm
198M    /Volumes/VMs/.npm
181M    /Volumes/VMs/.npm

The same again, with HFS+ compression (zip-9):
%> du -hcs /Volumes/VMs/.npm ; du -hcs --apparent-size /Volumes/VMs/.npm
115M    /Volumes/VMs/.npm
148M    /Volumes/VMs/.npm

Thoughts?

Thanks,
R.





This bug report was last modified 6 years and 151 days ago.


