GNU bug report logs - #28657
Random sort-order in du

Package: coreutils;

Reported by: Holger Klene <h.klene <at> gmx.de>

Date: Sat, 30 Sep 2017 20:47:02 UTC

Severity: normal

Tags: notabug

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 28657 in the body.
You can then email your comments to 28657 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-coreutils <at> gnu.org:
bug#28657; Package coreutils. (Sat, 30 Sep 2017 20:47:02 GMT)

Acknowledgement sent to Holger Klene <h.klene <at> gmx.de>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 30 Sep 2017 20:47:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Holger Klene <h.klene <at> gmx.de>
To: bug-coreutils <at> gnu.org
Subject: Random sort-order in du
Date: Sat, 30 Sep 2017 22:45:24 +0200
Hello!

I'm using Back-In-Time for backups. Out of curiosity, I wanted to know how big the snapshots
are and came across the FAQ:

https://github.com/bit-team/backintime/wiki/FAQ#how-can-i-check-if-my-snapshots-are-incremental-using-hard-links

They recommend:

du -hd1 /media/<USER>/backintime/<HOST>/<USER>/1/

Say I have a subdirectory structure in that folder like this:

+ 2015
|	+ FileA 1GB Inode 123
+ 2016
|	+ FileA 1GB Inode 123
+ 2017
	+ FileA 1GB Inode 123

Now du returns something like:
1G	2016
0G	2015
0G	2017
1G	.

So the hard-linked FileA is reported only once (this is actually the desired behavior, so that it is
not reported thrice).

But now the traversal order determines which folder the size is reported against, and I had to
learn that the order of items in the filesystem is unpredictable:

https://serverfault.com/questions/406229/ensuring-a-repeatable-directory-ordering-in-linux#answer-675748

I'd like to suggest adding some sort-controlling options, like those known from ls --sort.

The desired behavior in this case would be either alphabetical or reverse-alphabetical order, to
attribute FileA to its first or last appearance in the tree, respectively.

Thanks

Holger

PS: coreutils 8.25 with ext4 file-system

-- 
|_|/    MfG
| |\    Holger Klene

PGP Key ID: 0x22FFE57E

Information forwarded to bug-coreutils <at> gnu.org:
bug#28657; Package coreutils. (Sat, 30 Sep 2017 21:09:02 GMT)

Message #8 received at 28657 <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Holger Klene <h.klene <at> gmx.de>, 28657 <at> debbugs.gnu.org
Subject: Re: bug#28657: Random sort-order in du
Date: Sat, 30 Sep 2017 15:08:35 -0600
Hello,

On 2017-09-30 02:45 PM, Holger Klene wrote:
> 
> du -hd1 /media/<USER>/backintime/<HOST>/<USER>/1/
[...]
> Now du returns something like:
> 1G	2016
> 0G	2015
> 0G	2017
> 1G	.
> 
> [...]
>
> But now the traversal order determines which folder the size is reported against, and I had to
> learn that the order of items in the filesystem is unpredictable:
> 
[...]
> The desired behavior in this case would be either alphabetical or reverse-alphabetical order, to
> attribute FileA to its first or last appearance in the tree, respectively.

'du' can report the directories in the order they are given on the
command line, and so you can use a slightly longer command to force
directory order.

Example 1:

Create a tiny directory structure example:

    mkdir -p data/{a,b,c}
    seq 10000 > data/a/1
    ln data/a/1 data/b/2
    ln data/a/1 data/c/3

Default output is unordered, as you've observed:

    $ du -hd1 data
    580K	data/c
    4.0K	data/b
    4.0K	data/a
    592K	data

But if you use shell globbing, the shell expands the pattern in sorted
order, so the directories are passed to 'du' alphabetically:

    $ du -c -hd1 data/*
    580K	data/a
    4.0K	data/b
    4.0K	data/c
    588K	total
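
The effect of argument order can be checked directly. The following is
a minimal sketch, assuming a 'du' that counts a multiply-linked file
only once per invocation (as GNU du does); the temporary directory, the
names and the file size are illustrative and not taken from the report:

```shell
#!/bin/sh
# Sketch: a hard-linked file is charged to whichever directory du sees first.
set -eu
tmp=$(mktemp -d)          # scratch area; remove "$tmp" by hand when done
cd "$tmp"
mkdir -p data/a data/b data/c
seq 100000 > data/a/1     # one real file of a few hundred KiB ...
ln data/a/1 data/b/2      # ... plus two more names for the same inode
ln data/a/1 data/c/3

# Forward order: data/a comes first, so it carries the file's size.
forward=$(du -s data/a data/b data/c | sort -k1,1nr | head -n 1 | cut -f2)
# Reverse order: data/c comes first, so it carries the file's size.
reverse=$(du -s data/c data/b data/a | sort -k1,1nr | head -n 1 | cut -f2)

echo "largest with forward order: $forward"
echo "largest with reverse order: $reverse"
```

Whichever directory is named first is reported as the large one, which
is why fixing the argument order fixes the attribution.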

If you want a more complicated order (e.g. reverse order),
you can combine "find" (to list the directories), "sort" (to sort in
your desired order) and "du --files0-from" to read the file list from stdin:

    $ find data -maxdepth 1 -mindepth 1 -type d -print0 \
           | sort -z -k1r,1 \
           | du --files0-from=- -c -h
    580K	data/c
    4.0K	data/b
    4.0K	data/a
    588K	total

Note the "find -print0" and "sort -z", which make both tools use
NUL-terminated filenames (instead of newline-terminated ones);
"du --files0-from" requires NUL-terminated input.
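
A short sketch of why the NUL terminator matters, assuming GNU find,
sort and du; the directory names (one of which deliberately contains a
newline) are made up for illustration:

```shell
#!/bin/sh
# Sketch: newline-terminated lists break on names that contain a newline,
# while NUL-terminated lists keep every name intact.
set -eu
tmp=$(mktemp -d)          # scratch area; remove "$tmp" by hand when done
cd "$tmp"
mkdir -p data/plain "data/with
newline"

# Two directories, but a newline-terminated listing yields three lines.
newline_count=$(find data -mindepth 1 -maxdepth 1 -type d -print | wc -l)
# Counting NUL terminators gives the true number of names.
nul_count=$(find data -mindepth 1 -maxdepth 1 -type d -print0 | tr -cd '\0' | wc -c)

echo "newline-terminated entries: $newline_count"
echo "NUL-terminated entries:     $nul_count"

# du receives exactly two names and adds a grand total.
find data -mindepth 1 -maxdepth 1 -type d -print0 \
    | sort -z \
    | du --files0-from=- -c -s
```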


Another tip:
If you have filenames with mixed letters and numbers or numbers with
different width (e.g. "2017.7" and "2017.11") you can use
"sort -z -k1V,1" to sort them correctly.

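As a small sketch of the difference (assuming GNU sort; the three names
are hypothetical), compare a plain byte-wise sort with the version sort:

```shell
#!/bin/sh
# Sketch: byte-wise sort vs. version sort on NUL-terminated names.
set -eu

# Byte-wise comparison goes character by character, so "2017.10" sorts
# before "2017.7" because '1' < '7'.
plain=$(printf '%s\0' 2017.11 2017.7 2017.10 \
    | LC_ALL=C sort -z | tr '\0' '\n' | paste -sd ' ' -)

# Version sort compares each run of digits as a number, so 7 < 10 < 11.
version=$(printf '%s\0' 2017.11 2017.7 2017.10 \
    | sort -z -k1V,1 | tr '\0' '\n' | paste -sd ' ' -)

echo "plain:   $plain"      # plain:   2017.10 2017.11 2017.7
echo "version: $version"    # version: 2017.7 2017.10 2017.11
```
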
Hope this helps,
 - assaf

Information forwarded to bug-coreutils <at> gnu.org:
bug#28657; Package coreutils. (Sat, 30 Sep 2017 21:42:02 GMT)

Message #11 received at 28657 <at> debbugs.gnu.org:

From: Holger Klene <h.klene <at> gmx.de>
To: Assaf Gordon <assafgordon <at> gmail.com>
Cc: 28657 <at> debbugs.gnu.org
Subject: Re: bug#28657: Random sort-order in du
Date: Sat, 30 Sep 2017 23:41:19 +0200
* Assaf Gordon <assafgordon <at> gmail.com> [30.09.17 15:08]:
> But if you use shell-globbing, the shell will automatically
> order the directory alphabetically:
> 
>     $ du -c -hd1 data/*
>     580K	data/a
>     4.0K	data/b
>     4.0K	data/c
>     588K	total
> 
> Hope this helps,
>  - assaf

Who would have thought of passing multiple directories ...

This is the solution to my problem, though I had better write it down so I don't forget it, as
I'm quite sure I won't remember it in a month :-)

Thank you for the immediate reply.

Holger

-- 
|_|/    MfG
| |\    Holger Klene

PGP Key ID: 0x22FFE57E

Added tag(s) notabug. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Sat, 30 Sep 2017 21:46:02 GMT)

Reply sent to Assaf Gordon <assafgordon <at> gmail.com>:
You have taken responsibility. (Sat, 30 Sep 2017 21:46:02 GMT)

Notification sent to Holger Klene <h.klene <at> gmx.de>:
bug acknowledged by developer. (Sat, 30 Sep 2017 21:46:03 GMT)

Message #18 received at 28657-done <at> debbugs.gnu.org:

From: Assaf Gordon <assafgordon <at> gmail.com>
To: Holger Klene <h.klene <at> gmx.de>
Cc: 28657-done <at> debbugs.gnu.org
Subject: Re: bug#28657: Random sort-order in du
Date: Sat, 30 Sep 2017 15:45:49 -0600
tag 28657 notabug
stop


Hello,


On 2017-09-30 03:41 PM, Holger Klene wrote:
> * Assaf Gordon <assafgordon <at> gmail.com> [30.09.17 15:08]:
>> But if you use shell-globbing, the shell will automatically
>> order the directory alphabetically:
> 
>> $ du -c -hd1 data/*
>  
> 
> Who would have thought of passing multiple directories ...
> 
> This is the solution to my problem, though I had better write it down so I
> don't forget it, as I'm quite sure I won't remember it in a month :-)
>

Thank you for confirming.

I'm thus marking this bug report as "done", but discussion can continue
by replying to this thread.

regards,
 - assaf

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 29 Oct 2017 11:24:05 GMT)

This bug report was last modified 7 years and 230 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.