GNU bug report logs - #7399
du: add "--hash-all-files" option


Package: coreutils

Reported by: Evgeny Kapun <abacabadabacaba <at> gmail.com>

Date: Sun, 14 Nov 2010 14:00:04 UTC

Severity: wishlist



Message #11 received at 7399 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: Evgeny Kapun <abacabadabacaba <at> gmail.com>, 7399 <at> debbugs.gnu.org
Subject: Re: bug#7399: du may count one file multiple times if it is visible
	through multiple mounts
Date: Mon, 15 Nov 2010 09:06:05 +0100

Paul Eggert wrote:

> On 11/14/2010 05:16 AM, Evgeny Kapun wrote:
>> Some kernels, such as Linux, permit mounting one filesystem multiple
>> times. This can make multiple paths refer to the same file, although
>> neither hard nor symbolic links are involved.
>
> GNU du (as well as a lot of other programs I expect) doesn't work well
> in such environments, which do not conform to POSIX requirements for
> file system link counts.  GNU du could easily be fixed to handle these
> environments, but at a substantial runtime cost in the normal case,
> because it'd have to hash every file it runs across, not just files
> with link counts > 1 or that result from multiple arguments.
>
> One possible workaround is to add an option, --hash-all-files say, which causes
> du to hash every file it runs across, and thus not double-count files
> in such cases.

du.c already has an internal hash_all variable, and it so happens you
can set it by using du's --files0-from= option.  This should do the trick:

  find dir -print0 | du --files0-from=-

Obviously that's a bit of a kludge.
We shouldn't require a separate find process (and disabling
du's internal traversal code) just to turn this on, so adding
that option might make sense.
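
For illustration only, here is a rough shell sketch of what "hash every
file" amounts to: remember each (device, inode) pair and count its blocks
only the first time it is seen. It assumes GNU find (for -printf with %D,
%i and %b) and awk. The function names sum_unique_blocks and du_unique are
hypothetical helpers, not part of coreutils, and the result only
approximates what du itself would report.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of coreutils.

# Read lines of the form "DEV:INO BLOCKS" (512-byte blocks) on stdin.
# Add up the blocks of each DEV:INO key only the first time it appears,
# then print the total in 1 KiB units, rounded up.
sum_unique_blocks() {
    awk '!seen[$1]++ { blocks += $2 }
         END { print int((blocks + 1) / 2) }'
}

# Walk the tree under "$1" and count every file once, no matter how
# many paths lead to it.  Assumes GNU find's -printf directives:
# %D = device number, %i = inode number, %b = 512-byte blocks used.
du_unique() {
    find "$1" -printf '%D:%i %b\n' | sum_unique_blocks
}
```

Something like "du_unique dir" would then print a single total. Unlike du,
this keeps no per-directory subtotals, which is one reason an option in du
itself would still be preferable.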




This bug report was last modified 6 years and 223 days ago.


