From unknown Mon Jun 16 16:06:07 2025 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-Mailer: MIME-tools 5.509 (Entity 5.509) Content-Type: text/plain; charset=utf-8 From: bug#6557 <6557@debbugs.gnu.org> To: bug#6557 <6557@debbugs.gnu.org> Subject: Status: du sometimes miscounts directories, and files whose link count equals 1 Reply-To: bug#6557 <6557@debbugs.gnu.org> Date: Mon, 16 Jun 2025 23:06:07 +0000 retitle 6557 du sometimes miscounts directories, and files whose link count= equals 1 reassign 6557 coreutils submitter 6557 Paul Eggert severity 6557 normal thanks From debbugs-submit-bounces@debbugs.gnu.org Sat Jul 03 02:41:20 2010 Received: (at submit) by debbugs.gnu.org; 3 Jul 2010 06:41:20 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OUwPg-00043j-3M for submit@debbugs.gnu.org; Sat, 03 Jul 2010 02:41:20 -0400 Received: from mx10.gnu.org ([199.232.76.166]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OUwPd-00043e-Db for submit@debbugs.gnu.org; Sat, 03 Jul 2010 02:41:18 -0400 Received: from lists.gnu.org ([199.232.76.165]:37331) by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.60) (envelope-from ) id 1OUwPa-0001JR-B4 for submit@debbugs.gnu.org; Sat, 03 Jul 2010 02:41:14 -0400 Received: from [140.186.70.92] (port=44531 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1OUwPY-0004E7-D5 for bug-coreutils@gnu.org; Sat, 03 Jul 2010 02:41:13 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,T_RP_MATCHES_RCVD autolearn=unavailable version=3.3.1 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69) (envelope-from ) id 1OUwPX-0005RS-64 for bug-coreutils@gnu.org; Sat, 03 Jul 2010 02:41:12 -0400 Received: from kiwi.cs.ucla.edu ([131.179.128.19]:49424) by eggs.gnu.org 
with esmtp (Exim 4.69) (envelope-from ) id 1OUwPW-0005RH-RF for bug-coreutils@gnu.org; Sat, 03 Jul 2010 02:41:11 -0400 Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by kiwi.cs.ucla.edu (8.13.8+Sun/8.13.8/UCLACS-6.0) with ESMTP id o636f82r014535 for ; Fri, 2 Jul 2010 23:41:09 -0700 (PDT) Message-ID: <4C2EDB84.5050302@cs.ucla.edu> Date: Fri, 02 Jul 2010 23:41:08 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4 MIME-Version: 1.0 To: bug-coreutils@gnu.org Subject: du sometimes miscounts directories, and files whose link count equals 1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-detected-operating-system: by eggs.gnu.org: Solaris 10 (beta) X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.6, seldom 2.4 (older, 4) X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -5.1 (-----) (I found this bug by code inspection while doing the du performance improvement reported in: http://lists.gnu.org/archive/html/bug-coreutils/2010-07/msg00014.html ) Unless -l is given, du is not supposed to count the same file more than once. It optimizes this test by not bothering to put a file into the hash table if its link count is 1, or if it is a directory. But this optimization is not correct if -L is given (because the same link-count-1 file, or directory, can be seen via symbolic links) or if two or more arguments are given (because the same such file can be seen under multiple arguments). The optimization should be suppressed if -L is given, or if multiple arguments are given. 
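The decision described above can be sketched in isolation. This is a minimal illustration only, not code from du.c: the helper name must_hash and its parameters are hypothetical, standing in for du's opt_count_all flag, the hash_all flag the patch introduces, S_ISDIR (sb->st_mode), and sb->st_nlink. It assumes that a "true" result means the file's (device, inode) pair has to be checked against the hash table before its size is counted.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of du.c.  Return true if a file's
   (dev, ino) pair must be looked up in the hash table before the
   file is allowed to contribute to the totals.  */
static bool
must_hash (bool count_all, bool hash_all, bool is_dir, unsigned long nlink)
{
  /* With -l every link is counted, so no duplicate detection is done.  */
  if (count_all)
    return false;

  /* With -L, or with more than one argument, any file or directory
     may be reached more than once, so everything is hashed.  */
  if (hash_all)
    return true;

  /* Otherwise the shortcut is safe: only a non-directory with more
     than one hard link can be reached twice.  */
  return !is_dir && 1 < nlink;
}
```

Under these assumptions, a regular file with link count 1 gives must_hash (false, false, false, 1) == false, so nothing detects the second visit when it is reachable twice; with hash_all set, the same call returns true and the second visit can be detected.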
Here is a patch, with a couple of test cases for it. This patch assumes the du performance fix, but I can prepare an independent patch if you like. ----- Don't miscount directories or link-count-1 files seen multiple times. * NEWS: Mention this. * src/du.c (hash_all): New static var. (process_file): Use it. (main): Set it. * tests/du/hard-link: Add a couple of test cases to help make sure this bug stays squashed. diff --git a/NEWS b/NEWS index 2493ef8..82190d9 100644 --- a/NEWS +++ b/NEWS @@ -42,6 +42,11 @@ GNU coreutils NEWS -*- outline -*- Also errors are no longer suppressed for unsupported file types, and relative sizes are restricted to supported file types. +** Bug fixes + + du no longer multiply counts a file that is a directory or whose + link count is 1, even if the file is reached multiple times by + following symlinks or via multiple arguments. * Noteworthy changes in release 8.5 (2010-04-23) [stable] diff --git a/src/du.c b/src/du.c index bc24861..739be73 100644 --- a/src/du.c +++ b/src/du.c @@ -121,6 +121,9 @@ static bool apparent_size = false; /* If true, count each hard link of files with multiple links. */ static bool opt_count_all = false; +/* If true, hash all files to look for hard links. */ +static bool hash_all; + /* If true, output the NUL byte instead of a newline at the end of each line. */ static bool opt_nul_terminate_output = false; @@ -457,8 +460,7 @@ process_file (FTS *fts, FTSENT *ent) via a hard link, then don't let it contribute to the sums. */ if (skip || (!opt_count_all - && ! S_ISDIR (sb->st_mode) - && 1 < sb->st_nlink + && (hash_all || (! S_ISDIR (sb->st_mode) && 1 < sb->st_nlink)) && ! hash_ins (sb->st_ino, sb->st_dev))) { /* Note that we must not simply return here. @@ -876,11 +878,20 @@ main (int argc, char **argv) quote (files_from)); ai = argv_iter_init_stream (stdin); + + /* It's not easy here to count the arguments, so assume the + worst. */ + hash_all = true; } else { char **files = (optind < argc ? 
argv + optind : cwd_only); ai = argv_iter_init_argv (files); + + /* Hash all dev,ino pairs if there are multiple arguments, or if + following non-command-line symlinks, because in either case a + file with just one hard link might be seen more than once. */ + hash_all = (optind + 1 < argc || symlink_deref_bits == FTS_LOGICAL); } if (!ai) diff --git a/tests/du/hard-link b/tests/du/hard-link index 7e4f51a..e22320b 100755 --- a/tests/du/hard-link +++ b/tests/du/hard-link @@ -26,24 +26,40 @@ fi . $srcdir/test-lib.sh mkdir -p dir/sub -( cd dir && { echo non-empty > f1; ln f1 f2; echo non-empty > sub/F; } ) - - -# Note that for this first test, we transform f1 or f2 -# (whichever name we find first) to f_. That is necessary because, -# depending on the type of file system, du could encounter either of those -# two hard-linked files first, thus listing that one and not the other. -du -a --exclude=sub dir \ - | sed 's/^[0-9][0-9]* //' | sed 's/f[12]/f_/' > out || fail=1 -echo === >> out -du -a --exclude=sub --count-links dir \ - | sed 's/^[0-9][0-9]* //' | sort -r >> out || fail=1 +( cd dir && + { echo non-empty > f1 + ln f1 f2 + ln -s f1 f3 + echo non-empty > sub/F; } ) + +du -a -L --exclude=sub --count-links dir \ + | sed 's/^[0-9][0-9]* //' | sort -r > out || fail=1 + +# For these tests, transform f1 or f2 or f3 (whichever name is find +# first) to f_. That is necessary because, depending on the type of +# file system, du could encounter any of those linked files first, +# thus listing that one and not the others. 
+for args in '-L' 'dir' '-L dir' +do + echo === >> out + du -a --exclude=sub $args dir \ + | sed 's/^[0-9][0-9]* //' | sed 's/f[123]/f_/' >> out || fail=1 +done + cat <<\EOF > exp +dir/f3 +dir/f2 +dir/f1 +dir +=== dir/f_ dir === -dir/f2 -dir/f1 +dir/f_ +dir/f_ +dir +=== +dir/f_ dir EOF From debbugs-submit-bounces@debbugs.gnu.org Sat Jul 03 04:18:31 2010 Received: (at 6557) by debbugs.gnu.org; 3 Jul 2010 08:18:31 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OUxvi-0004fp-Vw for submit@debbugs.gnu.org; Sat, 03 Jul 2010 04:18:31 -0400 Received: from smtp1-g21.free.fr ([212.27.42.1]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OUxvf-0004fk-HN for 6557@debbugs.gnu.org; Sat, 03 Jul 2010 04:18:29 -0400 Received: from mx.meyering.net (unknown [82.230.74.64]) by smtp1-g21.free.fr (Postfix) with ESMTP id 7858D9400FE; Sat, 3 Jul 2010 10:18:19 +0200 (CEST) Received: by rho.meyering.net (Acme Bit-Twister, from userid 1000) id 562AAE1D; Sat, 3 Jul 2010 10:18:18 +0200 (CEST) From: Jim Meyering To: Paul Eggert Subject: Re: bug#6557: du sometimes miscounts directories, and files whose link count equals 1 In-Reply-To: <4C2EDB84.5050302@cs.ucla.edu> (Paul Eggert's message of "Fri, 02 Jul 2010 23:41:08 -0700") References: <4C2EDB84.5050302@cs.ucla.edu> Date: Sat, 03 Jul 2010 10:18:18 +0200 Message-ID: <87aaq9dlmd.fsf@meyering.net> Lines: 140 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Score: -2.2 (--) X-Debbugs-Envelope-To: 6557 Cc: 6557@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.4 (---) Paul Eggert wrote: > (I found this bug by code inspection while doing the du performance > improvement reported in: > 
http://lists.gnu.org/archive/html/bug-coreutils/2010-07/msg00014.html > ) > > Unless -l is given, du is not supposed to count the same file more > than once. It optimizes this test by not bothering to put a file into > the hash table if its link count is 1, or if it is a directory. But > this optimization is not correct if -L is given (because the same > link-count-1 file, or directory, can be seen via symbolic links) or if > two or more arguments are given (because the same such file can be > seen under multiple arguments). The optimization should be suppressed > if -L is given, or if multiple arguments are given. > > Here is a patch, with a couple of test cases for it. This patch > assumes the du performance fix, but I can prepare an independent > patch if you like. Thanks! Actually, that patch applies just fine, as-is. However, it induces this new "make check" test failure: FAIL: du/files0-from (exit: 1) ============================== du (GNU coreutils) 8.5.75-569b2 Copyright (C) 2010 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later . This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Written by Torbjorn Granlund, David MacKenzie, Paul Eggert, and Jim Meyering. f-extra-arg... missing... minus-in-stdin... empty... empty-nonreg... nul-1... nul-2... 1... 1a... 2... files0-from: test 2: stdout mismatch, comparing 2.O (actual) and 2.1 (expected) *** 2.O Sat Jul 3 09:28:08 2010 --- 2.1 Sat Jul 3 09:28:08 2010 *************** *** 1 **** --- 1,2 ---- 0 g + 0 g 2a... files0-from: test 2a: stdout mismatch, comparing 2a.O (actual) and 2a.1 (expected) *** 2a.O Sat Jul 3 09:28:08 2010 --- 2a.1 Sat Jul 3 09:28:08 2010 *************** *** 1 **** --- 1,2 ---- 0 g + 0 g zero-len... 
That's because with the unpatched "du", a command like this, with a
duplicate argument, prints two lines, while the patched version prints
only one:

    $ seq 100 > g; du g g
    4       g
    4       g
    $ seq 100 > g; ./du g g
    4       g

Note that the vendor versions of "du" from at least Solaris 10,
openBSD, netBSD and freeBSD print both lines.
I prefer the new semantics, especially when using --total:

    $ seq 100 > g; du --total g g
    4       g
    4       g
    8       total
    $ seq 100 > g; ./du --total g g
    4       g
    4       total

You can get some of the old semantics by using -l:

    $ seq 100 > g; ./du -l --total g g
    4       g
    4       g
    8       total

What do you think of breaking with that tradition? POSIX does appear
to say that for each "FILE" argument du must print a line, but it also
mentions how with linked files, the space must be counted only once.
You can definitely consider listing the same file twice as being
analogous to a file being hard-linked.

An alternative might be to do this,

    $ seq 100 > g; du --total g g
    4       g
    0       g
    4       total

but this is too prone to misinterpretation both by people
and by code that parses du output.

So I'm inclined to go with your approach.

-------------------------------------
This is the additional patch we'd need to make the failing
test accept your new output. You're welcome to merge
it into yours.

diff --git a/tests/du/files0-from b/tests/du/files0-from index 620246d..860fc6a 100755 --- a/tests/du/files0-from +++ b/tests/du/files0-from @@ -70,15 +70,15 @@ my @Tests = {IN=>{f=>"g\0"}}, {AUX=>{g=>''}}, {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ], - # two file names, no final NUL + # two identical file names, no final NUL ['2', '--files0-from=-', '<', {IN=>{f=>"g\0g"}}, {AUX=>{g=>''}}, - {OUT=>"0\tg\n0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ], + {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ], - # two file names, with final NUL + # two identical file names, with final NUL ['2a', '--files0-from=-', '<', {IN=>{f=>"g\0g\0"}}, {AUX=>{g=>''}}, - {OUT=>"0\tg\n0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ], + {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ], # Ensure that $prog processes FILEs following a zero-length name. ['zero-len', '--files0-from=-', '<', From debbugs-submit-bounces@debbugs.gnu.org Sat Jul 03 04:36:14 2010 Received: (at 6557) by debbugs.gnu.org; 3 Jul 2010 08:36:14 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OUyCs-0004n3-Ay for submit@debbugs.gnu.org; Sat, 03 Jul 2010 04:36:14 -0400 Received: from smtp1-g21.free.fr ([212.27.42.1]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OUyCn-0004my-Va for 6557@debbugs.gnu.org; Sat, 03 Jul 2010 04:36:12 -0400 Received: from mx.meyering.net (unknown [82.230.74.64]) by smtp1-g21.free.fr (Postfix) with ESMTP id E74E6940060; Sat, 3 Jul 2010 10:36:01 +0200 (CEST) Received: by rho.meyering.net (Acme Bit-Twister, from userid 1000) id 7B76BE4D3; Sat, 3 Jul 2010 10:36:00 +0200 (CEST) From: Jim Meyering To: Paul Eggert Subject: Re: bug#6557: du sometimes miscounts directories, and files whose link count equals 1 In-Reply-To: <87aaq9dlmd.fsf@meyering.net> (Jim Meyering's message of "Sat, 03 Jul 2010 10:18:18 +0200") References: <4C2EDB84.5050302@cs.ucla.edu> <87aaq9dlmd.fsf@meyering.net> Date: Sat, 03 Jul 2010 10:36:00 +0200 Message-ID: 
<871vbldksv.fsf@meyering.net> Lines: 202 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Score: -3.4 (---) X-Debbugs-Envelope-To: 6557 Cc: 6557@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.4 (---) Jim Meyering wrote: > Paul Eggert wrote: >> (I found this bug by code inspection while doing the du performance >> improvement reported in: >> http://lists.gnu.org/archive/html/bug-coreutils/2010-07/msg00014.html >> ) >> >> Unless -l is given, du is not supposed to count the same file more >> than once. It optimizes this test by not bothering to put a file into >> the hash table if its link count is 1, or if it is a directory. But >> this optimization is not correct if -L is given (because the same >> link-count-1 file, or directory, can be seen via symbolic links) or if >> two or more arguments are given (because the same such file can be >> seen under multiple arguments). The optimization should be suppressed >> if -L is given, or if multiple arguments are given. >> >> Here is a patch, with a couple of test cases for it. This patch >> assumes the du performance fix, but I can prepare an independent >> patch if you like. > > Thanks! > Actually, that patch applies just fine, as-is. > However, it induces this new "make check" test failure: ... > This is the additional patch we'd need to make the failing > failing test accept your new output. You're welcome to merge > it into yours. Actually I did that. Here's the adjusted patch, for review. Note the "du: " prefix on the one-line log summary -- that's the part that goes into the Subject below. Plus, I shortened it. Also, I added a log line for the tests/du/files0-from change. (BTW, the following is the output from "git format-patch --stdout -1". 
It's easy to apply that by saving it in a FILE, then running "git am FILE") >From efe53cc72b599979ea292754ecfe8abf7c839d22 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Fri, 2 Jul 2010 23:41:08 -0700 Subject: [PATCH] du: don't miscount duplicate directories or link-count-1 files * NEWS: Mention this. * src/du.c (hash_all): New static var. (process_file): Use it. (main): Set it. * tests/du/hard-link: Add a couple of test cases to help make sure this bug stays squashed. * tests/du/files0-from: Adjust existing tests to reflect change in semantics with duplicate arguments. --- NEWS | 5 +++++ src/du.c | 15 +++++++++++++-- tests/du/files0-from | 8 ++++---- tests/du/hard-link | 44 ++++++++++++++++++++++++++++++-------------- 4 files changed, 52 insertions(+), 20 deletions(-) diff --git a/NEWS b/NEWS index 3a24925..b02a223 100644 --- a/NEWS +++ b/NEWS @@ -38,6 +38,11 @@ GNU coreutils NEWS -*- outline -*- Also errors are no longer suppressed for unsupported file types, and relative sizes are restricted to supported file types. +** Bug fixes + + du no longer multiply counts a file that is a directory or whose + link count is 1, even if the file is reached multiple times by + following symlinks or via multiple arguments. * Noteworthy changes in release 8.5 (2010-04-23) [stable] diff --git a/src/du.c b/src/du.c index a90568e..4d6e03a 100644 --- a/src/du.c +++ b/src/du.c @@ -132,6 +132,9 @@ static bool apparent_size = false; /* If true, count each hard link of files with multiple links. */ static bool opt_count_all = false; +/* If true, hash all files to look for hard links. */ +static bool hash_all; + /* If true, output the NUL byte instead of a newline at the end of each line. */ static bool opt_nul_terminate_output = false; @@ -518,8 +521,7 @@ process_file (FTS *fts, FTSENT *ent) via a hard link, then don't let it contribute to the sums. */ if (skip || (!opt_count_all - && ! S_ISDIR (sb->st_mode) - && 1 < sb->st_nlink + && (hash_all || (! 
S_ISDIR (sb->st_mode) && 1 < sb->st_nlink)) && ! hash_ins (sb->st_ino, sb->st_dev))) { /* Note that we must not simply return here. @@ -937,11 +939,20 @@ main (int argc, char **argv) quote (files_from)); ai = argv_iter_init_stream (stdin); + + /* It's not easy here to count the arguments, so assume the + worst. */ + hash_all = true; } else { char **files = (optind < argc ? argv + optind : cwd_only); ai = argv_iter_init_argv (files); + + /* Hash all dev,ino pairs if there are multiple arguments, or if + following non-command-line symlinks, because in either case a + file with just one hard link might be seen more than once. */ + hash_all = (optind + 1 < argc || symlink_deref_bits == FTS_LOGICAL); } if (!ai) diff --git a/tests/du/files0-from b/tests/du/files0-from index 620246d..860fc6a 100755 --- a/tests/du/files0-from +++ b/tests/du/files0-from @@ -70,15 +70,15 @@ my @Tests = {IN=>{f=>"g\0"}}, {AUX=>{g=>''}}, {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ], - # two file names, no final NUL + # two identical file names, no final NUL ['2', '--files0-from=-', '<', {IN=>{f=>"g\0g"}}, {AUX=>{g=>''}}, - {OUT=>"0\tg\n0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ], + {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ], - # two file names, with final NUL + # two identical file names, with final NUL ['2a', '--files0-from=-', '<', {IN=>{f=>"g\0g\0"}}, {AUX=>{g=>''}}, - {OUT=>"0\tg\n0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ], + {OUT=>"0\tg\n"}, {OUT_SUBST=>'s/^\d+/0/'} ], # Ensure that $prog processes FILEs following a zero-length name. ['zero-len', '--files0-from=-', '<', diff --git a/tests/du/hard-link b/tests/du/hard-link index 7e4f51a..e22320b 100755 --- a/tests/du/hard-link +++ b/tests/du/hard-link @@ -26,24 +26,40 @@ fi . $srcdir/test-lib.sh mkdir -p dir/sub -( cd dir && { echo non-empty > f1; ln f1 f2; echo non-empty > sub/F; } ) - - -# Note that for this first test, we transform f1 or f2 -# (whichever name we find first) to f_. 
That is necessary because, -# depending on the type of file system, du could encounter either of those -# two hard-linked files first, thus listing that one and not the other. -du -a --exclude=sub dir \ - | sed 's/^[0-9][0-9]* //' | sed 's/f[12]/f_/' > out || fail=1 -echo === >> out -du -a --exclude=sub --count-links dir \ - | sed 's/^[0-9][0-9]* //' | sort -r >> out || fail=1 +( cd dir && + { echo non-empty > f1 + ln f1 f2 + ln -s f1 f3 + echo non-empty > sub/F; } ) + +du -a -L --exclude=sub --count-links dir \ + | sed 's/^[0-9][0-9]* //' | sort -r > out || fail=1 + +# For these tests, transform f1 or f2 or f3 (whichever name is find +# first) to f_. That is necessary because, depending on the type of +# file system, du could encounter any of those linked files first, +# thus listing that one and not the others. +for args in '-L' 'dir' '-L dir' +do + echo === >> out + du -a --exclude=sub $args dir \ + | sed 's/^[0-9][0-9]* //' | sed 's/f[123]/f_/' >> out || fail=1 +done + cat <<\EOF > exp +dir/f3 +dir/f2 +dir/f1 +dir +=== dir/f_ dir === -dir/f2 -dir/f1 +dir/f_ +dir/f_ +dir +=== +dir/f_ dir EOF -- 1.7.2.rc1.192.g262ff From debbugs-submit-bounces@debbugs.gnu.org Sat Jul 03 21:48:49 2010 Received: (at 6557) by debbugs.gnu.org; 4 Jul 2010 01:48:49 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OVEK9-0003oE-0z for submit@debbugs.gnu.org; Sat, 03 Jul 2010 21:48:49 -0400 Received: from kiwi.cs.ucla.edu ([131.179.128.19]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OVEK6-0003o7-Kz for 6557@debbugs.gnu.org; Sat, 03 Jul 2010 21:48:48 -0400 Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by kiwi.cs.ucla.edu (8.13.8+Sun/8.13.8/UCLACS-6.0) with ESMTP id o641mfVH021109; Sat, 3 Jul 2010 18:48:41 -0700 (PDT) Message-ID: <4C2FE878.3080807@cs.ucla.edu> Date: Sat, 03 Jul 2010 18:48:40 -0700 From: Paul Eggert Organization: UCLA Computer Science Department 
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4 MIME-Version: 1.0 To: Jim Meyering Subject: Re: bug#6557: du sometimes miscounts directories, and files whose link count equals 1 References: <4C2EDB84.5050302@cs.ucla.edu> <87aaq9dlmd.fsf@meyering.net> <871vbldksv.fsf@meyering.net> In-Reply-To: <871vbldksv.fsf@meyering.net> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Spam-Score: -2.1 (--) X-Debbugs-Envelope-To: 6557 Cc: 6557@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.3 (---) On 07/03/10 01:36, Jim Meyering wrote: > Here's the adjusted patch, for review. Yes, thanks, that looks good and it works for me. > Also, I added a log line for the tests/du/files0-from change. > (BTW, the following is the output from "git format-patch --stdout -1". > It's easy to apply that by saving it in a FILE, then running "git am FILE") Yes, and here's a proposed change to README-hacking to try to record this advice, along with some other good advice you've given me recently: >From ded44a4b21f50faf40aa70695bec20b3822cffd1 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Sat, 3 Jul 2010 18:44:16 -0700 Subject: [PATCH] Add advice about ChangeLogs and synchronizing submodules. * README-hacking: Adjust accordingly. 
--- README-hacking | 29 +++++++++++++++++++++++++++++ 1 files changed, 29 insertions(+), 0 deletions(-) diff --git a/README-hacking b/README-hacking index fecbf9e..02cb277 100644 --- a/README-hacking +++ b/README-hacking @@ -39,6 +39,12 @@ which are extracted from other source packages: $ ./bootstrap +To use the most-recent gnulib (as opposed to the gnulib version that +the package last synchronized to), do this next: + + $ git submodule foreach git pull origin master + $ git commit -a -m 'build: update gnulib submodule to latest' + And there you are! Just $ ./configure --quiet #[--enable-gcc-warnings] [*] @@ -60,6 +66,29 @@ to use recent system headers. If you configure with this option, and spot a problem, please be sure to send the report to the bug reporting address of this package, and not to that of gnulib, even if the problem seems to originate in a gnulib-provided file. + +* Submitting patches + +If you develop a fix or a new feature, please send it to the +appropriate bug-reporting address as reported by the --help option of +each program. One way to do this is to use vc-dwim +), as follows. + + Run the command "vc-dwim --help", copy its definition of the + "git-changelog-symlink-init" function into your shell, and then run + this function at the top-level directory of the package. + + Edit the ChangeLog file that this command creates, creating a + properly-formatted entry according to the GNU coding standards + . + + Run the command "vc-dwim" and make sure its output looks good. + + Run "vc-dwim --commit". + + Run the command "git format-patch --stdout -1", and email its output + in, using the the output's subject line. + ----- Copyright (C) 2002-2010 Free Software Foundation, Inc. 
-- 1.7.0.4 From debbugs-submit-bounces@debbugs.gnu.org Sun Jul 04 02:37:10 2010 Received: (at 6557) by debbugs.gnu.org; 4 Jul 2010 06:37:10 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OVIpB-0005X5-RB for submit@debbugs.gnu.org; Sun, 04 Jul 2010 02:37:10 -0400 Received: from smtp1-g21.free.fr ([212.27.42.1]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OVIp7-0005Vt-FA for 6557@debbugs.gnu.org; Sun, 04 Jul 2010 02:37:07 -0400 Received: from mx.meyering.net (unknown [82.230.74.64]) by smtp1-g21.free.fr (Postfix) with ESMTP id 3371E9400A4; Sun, 4 Jul 2010 08:36:56 +0200 (CEST) Received: by rho.meyering.net (Acme Bit-Twister, from userid 1000) id CB574E642; Sun, 4 Jul 2010 08:36:55 +0200 (CEST) From: Jim Meyering To: Paul Eggert Subject: Re: bug#6557: du sometimes miscounts directories, and files whose link count equals 1 In-Reply-To: <4C2FE878.3080807@cs.ucla.edu> (Paul Eggert's message of "Sat, 03 Jul 2010 18:48:40 -0700") References: <4C2EDB84.5050302@cs.ucla.edu> <87aaq9dlmd.fsf@meyering.net> <871vbldksv.fsf@meyering.net> <4C2FE878.3080807@cs.ucla.edu> Date: Sun, 04 Jul 2010 08:36:55 +0200 Message-ID: <878w5rda7s.fsf@meyering.net> Lines: 98 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Score: -2.6 (--) X-Debbugs-Envelope-To: 6557 Cc: 6557@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.4 (---) Paul Eggert wrote: > On 07/03/10 01:36, Jim Meyering wrote: > >> Here's the adjusted patch, for review. > > Yes, thanks, that looks good and it works for me. I've pushed that du fix. >> Also, I added a log line for the tests/du/files0-from change. 
>> (BTW, the following is the output from "git format-patch --stdout -1". >> It's easy to apply that by saving it in a FILE, then running "git am FILE") > > Yes, and here's a proposed change to README-hacking to try to record > this advice, along with some other good advice you've given me recently: Thanks! > Subject: [PATCH] Add advice about ChangeLogs and synchronizing submodules. I like to put a "doc: " at the beginning of such summary lines and to omit the trailing ".": doc: add advice about ChangeLogs and synchronizing submodules > * README-hacking: Adjust accordingly. > --- > README-hacking | 29 +++++++++++++++++++++++++++++ > 1 files changed, 29 insertions(+), 0 deletions(-) > > diff --git a/README-hacking b/README-hacking > index fecbf9e..02cb277 100644 > --- a/README-hacking > +++ b/README-hacking > @@ -39,6 +39,12 @@ which are extracted from other source packages: > > $ ./bootstrap > > +To use the most-recent gnulib (as opposed to the gnulib version that > +the package last synchronized to), do this next: > + > + $ git submodule foreach git pull origin master > + $ git commit -a -m 'build: update gnulib submodule to latest' In general, I try to ensure that each gnulib-updating change remains in a commit all by itself[*], partly because they are relatively likely to conflict -- esp. if I do the update on a branch, later update to a different version on the trunk and try to rebase. If it's a commit by itself it's trivial to avoid trouble: just remove the commit before rebasing the branch. So maybe this, instead? $ git commit -m 'build: update gnulib submodule to latest' gnulib [*] However, when a gnulib change induces a matching change in coreutils, the gnulib-updating part obviously belongs with the coreutils-changing deltas. > And there you are! Just > > $ ./configure --quiet #[--enable-gcc-warnings] [*] > @@ -60,6 +66,29 @@ to use recent system headers. 
If you configure with this option, > and spot a problem, please be sure to send the report to the bug > reporting address of this package, and not to that of gnulib, even > if the problem seems to originate in a gnulib-provided file. > + > +* Submitting patches > + > +If you develop a fix or a new feature, please send it to the > +appropriate bug-reporting address as reported by the --help option of > +each program. One way to do this is to use vc-dwim > +), as follows. > + > + Run the command "vc-dwim --help", copy its definition of the > + "git-changelog-symlink-init" function into your shell, and then run > + this function at the top-level directory of the package. This (above and below) is precisely the process I use. Thanks for documenting it. It may sound a little tortuous, but has some hidden benefits. > + Edit the ChangeLog file that this command creates, creating a > + properly-formatted entry according to the GNU coding standards > + . > + > + Run the command "vc-dwim" and make sure its output looks good. > + > + Run "vc-dwim --commit". > + > + Run the command "git format-patch --stdout -1", and email its output > + in, using the the output's subject line. ---------------^^^ ^^^ "make syntax-check" spotted the doubled "the". You're welcome to push the result. 
From debbugs-submit-bounces@debbugs.gnu.org Sun Jul 04 18:50:34 2010 Received: (at 6557) by debbugs.gnu.org; 4 Jul 2010 22:50:34 +0000 Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OVY1C-0004BW-50 for submit@debbugs.gnu.org; Sun, 04 Jul 2010 18:50:34 -0400 Received: from kiwi.cs.ucla.edu ([131.179.128.19]) by debbugs.gnu.org with esmtp (Exim 4.69) (envelope-from ) id 1OVY1A-0004BP-3p for 6557@debbugs.gnu.org; Sun, 04 Jul 2010 18:50:33 -0400 Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200]) by kiwi.cs.ucla.edu (8.13.8+Sun/8.13.8/UCLACS-6.0) with ESMTP id o64MoPhe028854; Sun, 4 Jul 2010 15:50:26 -0700 (PDT) Message-ID: <4C311030.1050209@cs.ucla.edu> Date: Sun, 04 Jul 2010 15:50:24 -0700 From: Paul Eggert Organization: UCLA Computer Science Department User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4 MIME-Version: 1.0 To: Jim Meyering Subject: Re: bug#6557: du sometimes miscounts directories, and files whose link count equals 1 References: <4C2EDB84.5050302@cs.ucla.edu> <87aaq9dlmd.fsf@meyering.net> <871vbldksv.fsf@meyering.net> <4C2FE878.3080807@cs.ucla.edu> <878w5rda7s.fsf@meyering.net> In-Reply-To: <878w5rda7s.fsf@meyering.net> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Spam-Score: -3.3 (---) X-Debbugs-Envelope-To: 6557 Cc: 6557@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: debbugs-submit-bounces@debbugs.gnu.org Errors-To: debbugs-submit-bounces@debbugs.gnu.org X-Spam-Score: -3.3 (---) On 07/03/10 23:36, Jim Meyering wrote: > So maybe this, instead? > > $ git commit -m 'build: update gnulib submodule to latest' gnulib Sure, that's better. > "make syntax-check" spotted the doubled "the". > > You're welcome to push the result. 

Thanks, I did that, with the two fixes noted above.

From debbugs-submit-bounces@debbugs.gnu.org Sun Jul 04 19:03:22 2010
Received: (at 6557) by debbugs.gnu.org; 4 Jul 2010 23:03:22 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
    by debbugs.gnu.org with esmtp (Exim 4.69)
    (envelope-from )
    id 1OVYDa-0004Gz-Ga
    for submit@debbugs.gnu.org; Sun, 04 Jul 2010 19:03:22 -0400
Received: from kiwi.cs.ucla.edu ([131.179.128.19])
    by debbugs.gnu.org with esmtp (Exim 4.69)
    (envelope-from )
    id 1OVYDX-0004Gu-S7
    for 6557@debbugs.gnu.org; Sun, 04 Jul 2010 19:03:21 -0400
Received: from [131.179.64.200] (Penguin.CS.UCLA.EDU [131.179.64.200])
    by kiwi.cs.ucla.edu (8.13.8+Sun/8.13.8/UCLACS-6.0) with ESMTP id o64N3CMw028986;
    Sun, 4 Jul 2010 16:03:13 -0700 (PDT)
Message-ID: <4C31132F.40201@cs.ucla.edu>
Date: Sun, 04 Jul 2010 16:03:11 -0700
From: Paul Eggert
Organization: UCLA Computer Science Department
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4
MIME-Version: 1.0
To: Jim Meyering
Subject: Re: bug#6557: du sometimes miscounts directories, and files whose link count equals 1
References: <4C2EDB84.5050302@cs.ucla.edu> <87aaq9dlmd.fsf@meyering.net>
In-Reply-To: <87aaq9dlmd.fsf@meyering.net>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Spam-Score: -2.1 (--)
X-Debbugs-Envelope-To: 6557
Cc: 6557@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -3.3 (---)

On 07/03/10 01:18, Jim Meyering wrote:

> Note that the vendor versions of "du" from at least Solaris 10,
> openBSD, netBSD and freeBSD print both lines.
> I prefer the new semantics, especially when using --total:

Yes, the new semantics make more sense.
If you prefer the traditional semantics, you can still get them, by
using "du A; du B" rather than "du A B". In contrast, there's no way
to get the new (and better) semantics if all you have is the
traditional behavior. This is another argument for staying with the
new semantics.

GNU du had already diverged from the traditional semantics, in that
it kept track of hard links across argument boundaries, which
traditional du does not. (This behavior is documented in the
coreutils manual.)

Solaris 10 du -L is clearly busted, by the way, in that it counts
files multiply if their link count is 1. I wouldn't be surprised if
the BSD du implementations are busted too. This behavior is not
reasonable (and clearly doesn't conform to POSIX). So in some sense
that weakens the argument of following the precedent of these older
implementations in this area.

> What do you think of breaking with that tradition? POSIX does appear
> to say that for each "FILE" argument du must print a line, but it also
> mentions how with linked files, the space must be counted only once.
> You can definitely consider listing the same file twice as being
> analogous to a file being hard-linked.

The POSIX requirements are contradictory, and clearly the authors had
not thought through the implications. When they're contradictory we
should do the best we can, and perhaps get POSIX fixed at some point
to clearly allow the new GNU behavior (as well as clearly allowing the
traditional behavior of course; right now POSIX does neither).

Thanks for fixing the test cases to match the new behavior. I had
only run the test case that I had updated, and should have run them
all (my only defense being that I'm using a circa-2003 desktop to
test....).
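
As an illustration of the semantics discussed above, here is a minimal
Python sketch; it is not the coreutils implementation. It assumes that
a file is identified by its (device, inode) pair and that one "seen"
set is shared across all arguments. The names du and usage are
hypothetical, and the sketch sums apparent sizes (st_size) rather than
allocated blocks to stay short.

```python
import os
import stat


def usage(path, seen):
    """Return the bytes used under path, or None if path was already counted."""
    st = os.lstat(path)                 # like du without -L: do not follow symlinks
    key = (st.st_dev, st.st_ino)        # assumed identity of a file
    if key in seen:
        return None                     # reached before, so do not count it again
    seen.add(key)
    total = st.st_size                  # apparent size; a real du counts blocks
    if stat.S_ISDIR(st.st_mode):
        with os.scandir(path) as entries:
            for entry in entries:
                sub = usage(entry.path, seen)
                if sub is not None:
                    total += sub
    return total


def du(args):
    """Return [(argument, bytes)], counting each file once across all arguments."""
    seen = set()                        # shared across arguments: the new semantics
    report = []
    for arg in args:
        total = usage(arg, seen)
        if total is not None:           # an argument already counted gets no line
            report.append((arg, total))
    return report
```

With this sketch, du(["A", "A"]) reports A once, while calling
du(["A"]) twice starts from a fresh "seen" set each time and so
reproduces the traditional "du A; du B" result.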

From debbugs-submit-bounces@debbugs.gnu.org Wed Jul 14 12:19:56 2010
Received: (at control) by debbugs.gnu.org; 14 Jul 2010 16:19:56 +0000
Received: from localhost ([127.0.0.1] helo=debbugs.gnu.org)
    by debbugs.gnu.org with esmtp (Exim 4.69)
    (envelope-from )
    id 1OZ4gd-0007gF-Gp
    for submit@debbugs.gnu.org; Wed, 14 Jul 2010 12:19:55 -0400
Received: from mail1.slb.deg.dub.stisp.net ([84.203.253.98])
    by debbugs.gnu.org with smtp (Exim 4.69)
    (envelope-from )
    id 1OZ4gb-0007fv-FQ
    for control@debbugs.gnu.org; Wed, 14 Jul 2010 12:19:53 -0400
Received: (qmail 4256 invoked from network); 14 Jul 2010 16:20:00 -0000
Received: from unknown (HELO ?192.168.2.25?) (84.203.137.218)
    by mail1.slb.deg.dub.stisp.net with SMTP; 14 Jul 2010 16:20:00 -0000
Message-ID: <4C3DE361.50300@draigBrady.com>
Date: Wed, 14 Jul 2010 17:18:41 +0100
From: Pádraig Brady
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3
MIME-Version: 1.0
To: control@debbugs.gnu.org
X-Enigmail-Version: 1.0.1
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Spam-Score: -1.9 (-)
X-Debbugs-Envelope-To: control
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.11
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: debbugs-submit-bounces@debbugs.gnu.org
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
X-Spam-Score: -1.8 (-)

close 6557 8.6

From unknown Mon Jun 16 16:06:07 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Thu, 12 Aug 2010 11:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator