GNU bug report logs - #39850
"du" command can not count some files


Package: coreutils

Reported by: Hyunho Cho <mug896 <at> naver.com>

Date: Sun, 1 Mar 2020 08:06:02 UTC

Severity: normal



Message #11 received at 39850 <at> debbugs.gnu.org:

From: Bob Proulx <bob <at> proulx.com>
To: Hyunho Cho <mug896 <at> naver.com>
Cc: 39850 <at> debbugs.gnu.org
Subject: Re: bug#39850: "du" command can not count some files
Date: Mon, 2 Mar 2020 23:01:08 -0700

Hyunho Cho wrote:
> $ find /usr/bin -type f | wc -l
> 2234
> 
> $ find /usr/bin -type f -print0 | du -b --files0-from=- | wc -l
> 2222

Hard links.  Files that are hard linked are only counted once by du,
since du is summing up disk usage and the data behind a set of hard
links occupies disk space only once.
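
A small sketch of that behavior, assuming GNU coreutils and a
filesystem that supports hard links.  The scratch directory and the
file names a and b are made up for illustration.

```shell
# Scratch directory (hypothetical); nothing here touches /usr/bin.
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/a"
ln "$tmp/a" "$tmp/b"            # b is a second name for the same inode

# Default: du remembers the inode and lists it only once.
default_lines=$(du -b "$tmp/a" "$tmp/b" | wc -l)

# With -l (--count-links) every name gets its own line.
count_links_lines=$(du -l -b "$tmp/a" "$tmp/b" | wc -l)

echo "default: $default_lines line(s), with -l: $count_links_lines line(s)"
rm -rf "$tmp"
```

With two names for one file, the default run should print one line
fewer than the -l run, which is the same kind of gap as 2222 versus
2234 above.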

Add the du -l option if you want to count hard linked files multiple
times.

  find /usr/bin -type f -print0 | du -l -b --files0-from=- | wc -l

That will overstate the total disk usage, however, as it will report
the same disk space once for each hard link.  But it all depends upon
what you are trying to count.
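
To make the double counting concrete, here is a sketch under the same
assumptions (GNU coreutils, hard links supported, made-up scratch
files).  One 1000-byte file is given two names and summed both ways.

```shell
# Scratch directory (hypothetical) holding one 1000-byte file, two names.
tmp=$(mktemp -d)
head -c 1000 /dev/zero > "$tmp/a"
ln "$tmp/a" "$tmp/b"

# Sum of apparent sizes with each inode counted once (du default).
sum_once=$(du -b "$tmp/a" "$tmp/b" | awk '{sum+=$1} END{print sum}')

# Sum with -l: the same 1000 bytes are reported for each name.
sum_per_link=$(du -l -b "$tmp/a" "$tmp/b" | awk '{sum+=$1} END{print sum}')

echo "without -l: $sum_once bytes, with -l: $sum_per_link bytes"
rm -rf "$tmp"
```

The -l sum matches what summing stat -c %s over every name would give,
but it is not the space actually occupied on disk.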

> $ du -b $( find /usr/bin -type f ) | wc -l
> 2222

  du -l -b $( find /usr/bin -type f ) | wc -l

> $ find /usr/bin -type f -exec stat -c %s {} + | awk '{sum+=$1} END{ print sum}'
> 1296011570
> 
> $ find /usr/bin -type f -print0 | du -b --files0-from=- | awk '{sum+=$1} END{ print sum}'
> 1282350388

  find /usr/bin -type f -print0 | du -l -b --files0-from=- | awk '{sum+=$1} END{ print sum}'

> $ diff <( find /usr/bin -type f | sort ) <( find /usr/bin -type f -print0 | du --files0-from=-  | cut -f 2  | sort )

  diff <( find /usr/bin -type f | sort ) <( find /usr/bin -type f -print0 | du -l --files0-from=-  | cut -f 2  | sort )
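
Another way to see which names are involved is to ask find for files
with a link count above one.  This is only a sketch on a made-up
scratch directory; -printf is a GNU find extension, and the names
single, orig and alias are hypothetical.

```shell
# Scratch directory (hypothetical): one ordinary file, one hard-linked pair.
tmp=$(mktemp -d)
printf 'x\n' > "$tmp/single"
printf 'y\n' > "$tmp/orig"
ln "$tmp/orig" "$tmp/alias"

# -links +1 matches files whose link count is greater than one.
# Printing the inode number first lets sort group names of the same file.
linked=$(find "$tmp" -type f -links +1 -printf '%i %p\n' | sort -n)
printf '%s\n' "$linked"

linked_count=$(printf '%s\n' "$linked" | wc -l)
rm -rf "$tmp"
```

Pointed at /usr/bin instead of the scratch directory, the same find
expression would list the hard linked names there, assuming all of
their links live under that directory.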

I am surprised you didn't try du on each file in addition to stat -c %s
on each file when you were summing them up. :-)

  find /usr/bin -type f -exec du -b {} \; | awk '{sum+=$1} END{ print sum}'
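
Run this way, each du process sees a single file and has no earlier
link to remember, so nothing is skipped.  A sketch on made-up scratch
files, assuming GNU coreutils, comparing that sum with the stat sum:

```shell
# Scratch directory (hypothetical) holding one 1000-byte file, two names.
tmp=$(mktemp -d)
head -c 1000 /dev/zero > "$tmp/a"
ln "$tmp/a" "$tmp/b"

# One du process per file: every name is counted.
du_each=$(find "$tmp" -type f -exec du -b {} \; | awk '{sum+=$1} END{print sum}')

# stat reports the size of every name as well.
stat_sum=$(find "$tmp" -type f -exec stat -c %s {} + | awk '{sum+=$1} END{print sum}')

echo "du per file: $du_each bytes, stat: $stat_sum bytes"
rm -rf "$tmp"
```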

Bob




This bug report was last modified 5 years and 109 days ago.


