GNU bug report logs - #7389
rename() over NFS

Previous Next

Package: coreutils;

Reported by: Bruno Haible <bruno <at> clisp.org>

Date: Sat, 13 Nov 2010 11:55:02 UTC

Severity: normal

Tags: patch

Merged with 7394

To reply to this bug, email your comments to 7389 AT debbugs.gnu.org.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7389; Package coreutils. (Sat, 13 Nov 2010 11:55:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Bruno Haible <bruno <at> clisp.org>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 13 Nov 2010 11:55:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>,
 bug-coreutils <at> gnu.org
Cc: bug-gnulib <at> gnu.org, Eric Blake <eblake <at> redhat.com>,
	"Gary V. Vaughan" <gary <at> gnu.org>
Subject: Re: rename() over NFS
Date: Sat, 13 Nov 2010 12:58:56 +0100
[bug-coreutils readers: This is a reply to
 <http://lists.gnu.org/archive/html/bug-gnulib/2010-11/msg00155.html>].

Paul Eggert wrote:
> > What should we do?
> >   a) Patch the test so that it uses a readdir() loop to detect the absence of
> >      the file even when stat() pretends it's still present. Or
> >   b) Use an rpl_rename override that will make the unit test work.
> 
> It's long been well-known that NFS is not POSIX-compliant.
> If we start down the path of putting wrappers around NFS to make it POSIX-compliant,
> we'll have a lot of work to do and a job that will probably never end,
> and we'll make coreutils slow down for the common case even though
> in practice the problems are rare and users don't care about them.
> 
> So, my vote is for (a).  Or maybe even (c): skip the test if it's
> running atop NFS.
> 
> I've been observing similar test failures for months, by the way, but
> haven't bothered to report them, because, hey! it's NFS! of course it's
> going to screw up in those cases!

But NFS is widely used, in small networks of 2 to 50 Linux/Unix machines.
Users are running shell scripts in such situations, with lots of coreutils
commands.

Bruno




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7389; Package coreutils. (Sat, 13 Nov 2010 12:08:01 GMT) Full text and rfc822 format available.

Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Bruno Haible <bruno <at> clisp.org>
Cc: bug-gnulib <at> gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>, bug-coreutils <at> gnu.org,
	Eric Blake <eblake <at> redhat.com>
Subject: Re: rename() over NFS
Date: Sat, 13 Nov 2010 13:11:41 +0100
Bruno Haible wrote:
> [bug-coreutils readers: This is a reply to
>  <http://lists.gnu.org/archive/html/bug-gnulib/2010-11/msg00155.html>].
>
> Paul Eggert wrote:
>> > What should we do?
>> >   a) Patch the test so that it uses a readdir() loop to detect the absence of
>> >      the file even when stat() pretends it's still present. Or
>> >   b) Use an rpl_rename override that will make the unit test work.
>>
>> It's long been well-known that NFS is not POSIX-compliant.
>> If we start down the path of putting wrappers around NFS to make it
>> POSIX-compliant,
>> we'll have a lot of work to do and a job that will probably never end,
>> and we'll make coreutils slow down for the common case even though
>> in practice the problems are rare and users don't care about them.
>>
>> So, my vote is for (a).  Or maybe even (c): skip the test if it's
>> running atop NFS.
>>
>> I've been observing similar test failures for months, by the way, but
>> haven't bothered to report them, because, hey! it's NFS! of course it's
>> going to screw up in those cases!
>
> But NFS is widely used, in small networks of 2 to 50 Linux/Unix machines.
> Users are running shell scripts in such situations, with lots of coreutils
> commands.

Hi Bruno,

Rename is fundamental enough and such NFS-specific problems rare enough
that I would prefer not to impose a rename wrapper on everyone.  I agree
with Paul that this is a very steep slippery slope.  If we start down
it in earnest, we may end up making our tools less maintainable in the
long run, and for only very marginal benefit.

However, if you see ways to improve things without impacting performance
or maintainability, I'm all for it.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7389; Package coreutils. (Sat, 13 Nov 2010 14:09:01 GMT) Full text and rfc822 format available.

Message #11 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Jim Meyering <jim <at> meyering.net>
Cc: bug-gnulib <at> gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>, bug-coreutils <at> gnu.org,
	Eric Blake <eblake <at> redhat.com>
Subject: Re: rename() over NFS
Date: Sat, 13 Nov 2010 15:13:15 +0100
Jim Meyering wrote:
> if you see ways to improve things without impacting performance
> or maintainability, I'm all for it.

Maintainability wouldn't be impacted, because the fix would be to use
lib/rename.c, with
  #define RENAME_DEST_EXISTS_BUG 1
  #define RENAME_HARD_LINK_BUG 1

But performance would be impacted, since 2 lstat() calls would happen
before every rename().

The bug may not have a big impact, since coreutils 'mv' handles a move of a
directory to a directory differently anyway, and few other programs use
rename() on directories.

So, assuming Eric also agrees with approach (a), here's a proposal for
fixing the testsuite:


2010-11-13  Bruno Haible  <bruno <at> clisp.org>

	rename, renameat: Avoid test failures at NFS mounted locations.
	* tests/test-rename.h (dentry_exists, assert_nonexistent): New
	functions.
	(test_rename): Use assert_nonexistent.
	* tests/test-rename.c: Include <dirent.h>.
	* tests/test-renameat.c: Likewise.
	Reported by Gary V. Vaughan <gary <at> gnu.org>.

--- tests/test-rename.h.orig	Sat Nov 13 14:35:50 2010
+++ tests/test-rename.h	Sat Nov 13 14:35:23 2010
@@ -20,6 +20,49 @@
    appropriate headers are already included.  If PRINT, warn before
    skipping symlink tests with status 77.  */
 
+/* Tests whether a file, given by a file name without slashes, exists in
+   the current directory, by scanning the directory entries.  */
+static bool
+dentry_exists (const char *filename)
+{
+  bool exists = false;
+  DIR *dir = opendir (".");
+
+  ASSERT (dir != NULL);
+  for (;;)
+    {
+      struct dirent *d = readdir (dir);
+      if (d == NULL)
+        break;
+      if (strcmp (d->d_name, filename) == 0)
+        {
+          exists = true;
+          break;
+        }
+    }
+  ASSERT (closedir (dir) == 0);
+  return exists;
+}
+
+/* Asserts that a specific file, given by a file name without slashes, does
+   not exist in the current directory.  */
+static void
+assert_nonexistent (const char *filename)
+{
+  struct stat st;
+
+  /* The usual way to test the presence of a file is via stat() or lstat().  */
+  errno = 0;
+  if (stat (filename, &st) == -1)
+    ASSERT (errno == ENOENT);
+  else
+    /* But after renaming a directory over an empty directory on an NFS-mounted
+       file system, on Linux 2.6.18, for a period of 30 seconds the old
+       directory name is "present" according to stat() but "nonexistent"
+       according to dentry_exists().  */
+    ASSERT (!dentry_exists (filename));
+}
+
 static int
 test_rename (int (*func) (char const *, char const *), bool print)
 {
@@ -226,9 +269,7 @@
     }
     { /* Full onto empty.  */
       ASSERT (func (BASE "dir", BASE "dir2") == 0);
-      errno = 0;
-      ASSERT (stat (BASE "dir", &st) == -1);
-      ASSERT (errno == ENOENT);
+      assert_nonexistent (BASE "dir");
       ASSERT (stat (BASE "dir2/file", &st) == 0);
       /* Files present here:
            {BASE}file
@@ -263,9 +304,7 @@
        */
       {
         ASSERT (func (BASE "dir", BASE "dir2/") == 0);
-        errno = 0;
-        ASSERT (stat (BASE "dir", &st) == -1);
-        ASSERT (errno == ENOENT);
+        assert_nonexistent (BASE "dir");
         ASSERT (stat (BASE "dir2/file", &st) == 0);
       }
       /* Files present here:
--- tests/test-rename.c.orig	Sat Nov 13 14:35:50 2010
+++ tests/test-rename.c	Sat Nov 13 14:15:34 2010
@@ -21,6 +21,7 @@
 #include "signature.h"
 SIGNATURE_CHECK (rename, int, (char const *, char const *));
 
+#include <dirent.h>
 #include <errno.h>
 #include <fcntl.h>
 #include <stdbool.h>
--- tests/test-renameat.c.orig	Sat Nov 13 14:35:50 2010
+++ tests/test-renameat.c	Sat Nov 13 14:15:30 2010
@@ -23,6 +23,7 @@
 #include "signature.h"
 SIGNATURE_CHECK (renameat, int, (int, char const *, int, char const *));
 
+#include <dirent.h>
 #include <fcntl.h>
 #include <errno.h>
 #include <stdbool.h>




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7389; Package coreutils. (Sat, 13 Nov 2010 14:27:02 GMT) Full text and rfc822 format available.

Message #14 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Bruno Haible <bruno <at> clisp.org>
Cc: bug-gnulib <at> gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>, bug-coreutils <at> gnu.org,
	Eric Blake <eblake <at> redhat.com>
Subject: Re: rename() over NFS
Date: Sat, 13 Nov 2010 15:30:48 +0100
Bruno Haible wrote:
> Jim Meyering wrote:
>> if you see ways to improve things without impacting performance
>> or maintainability, I'm all for it.
>
> Maintainability wouldn't be impacted, because the fix would be to use
> lib/rename.c, with
>   #define RENAME_DEST_EXISTS_BUG 1
>   #define RENAME_HARD_LINK_BUG 1
>
> But performance would be impacted, since 2 lstat() calls would happen
> before every rename().

That sounds like it would be excessive.
Currently, moving 100 regular files into a subdir
incurs 100% of its time in 100 rename and 200 stat calls:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 79.69    0.000051           1       100           rename
 20.31    0.000013           0       200       100 lstat
  0.00    0.000000           0         8           read

Adding a rename wrapper and doubling the number of lstat calls
per file would represent a significant slow-down.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7389; Package coreutils. (Sat, 13 Nov 2010 21:20:02 GMT) Full text and rfc822 format available.

Message #17 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Eric Blake <eblake <at> redhat.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: bug-gnulib <at> gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>,
	Bruno Haible <bruno <at> clisp.org>, bug-coreutils <at> gnu.org
Subject: Re: rename() over NFS
Date: Sat, 13 Nov 2010 14:24:11 -0700
[Message part 1 (text/plain, inline)]
On 11/13/2010 07:30 AM, Jim Meyering wrote:
> Bruno Haible wrote:
>> Jim Meyering wrote:
>>> if you see ways to improve things without impacting performance
>>> or maintainability, I'm all for it.
>>
>> Maintainability wouldn't be impacted, because the fix would be to use
>> lib/rename.c, with
>>   #define RENAME_DEST_EXISTS_BUG 1
>>   #define RENAME_HARD_LINK_BUG 1
>>
>> But performance would be impacted, since 2 lstat() calls would happen
>> before every rename().
> 
> That sounds like it would be excessive.
> Currently, moving 100 regular files into a subdir
> incurs 100% of its time in 100 rename and 200 stat calls:
> 
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  79.69    0.000051           1       100           rename
>  20.31    0.000013           0       200       100 lstat
>   0.00    0.000000           0         8           read
> 
> Adding a rename wrapper and doubling the number of lstat calls
> per file would represent a significant slow-down.

The two lstat() calls occur before coreutils calls rename().  And it
appears that the NFS rename() bug that we are concerned about is only in
the case of renaming one directory onto another empty one; which is
generally only possible via 'mv -T a b' (where b is the empty directory
to be overwritten by directory a).

So maybe it's still possible to do a relatively-maintainable patch to
coreutils that detects when we are about to call rename() onto an
existing directory (basically, when 'mv -T' is in effect), and manually
call rmdir("b") before renaming "a", which should work around the NFS
bug, all without needing a rename() wrapper.  At any rate, the gnulib
documentation would need a patch describing the bug, and stating that
gnulib does not work around it.

But I'm not sure if changing mv to avoid this corner-case POSIX
violation on NFS not renaming directories properly is made any better by
inserting the rmdir() call, since that also feels like it is a POSIX
violation for not making an atomic rename attempt.  Maybe all the more
we can do is document this, and ping the kernel folks to see if they can
fix their NFS implementation bug.

-- 
Eric Blake   eblake <at> redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

[signature.asc (application/pgp-signature, attachment)]

Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7389; Package coreutils. (Sat, 13 Nov 2010 21:22:02 GMT) Full text and rfc822 format available.

Message #20 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Eric Blake <eblake <at> redhat.com>
To: Bruno Haible <bruno <at> clisp.org>
Cc: bug-coreutils <at> gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>, bug-gnulib <at> gnu.org,
	Jim Meyering <jim <at> meyering.net>
Subject: Re: rename() over NFS
Date: Sat, 13 Nov 2010 14:26:15 -0700
[Message part 1 (text/plain, inline)]
On 11/13/2010 07:13 AM, Bruno Haible wrote:
> So, assuming Eric also agrees with approach (a), here's a proposal for
> fixing the testsuite:
> 
> 
> 2010-11-13  Bruno Haible  <bruno <at> clisp.org>
> 
> 	rename, renameat: Avoid test failures at NFS mounted locations.
> 	* tests/test-rename.h (dentry_exists, assert_nonexistent): New
> 	functions.
> 	(test_rename): Use assert_nonexistent.
> 	* tests/test-rename.c: Include <dirent.h>.
> 	* tests/test-renameat.c: Likewise.
> 	Reported by Gary V. Vaughan <gary <at> gnu.org>.

For now, I'm okay with weakening the testsuite to ignore the NFS bug;
please apply this patch, but let's also document the bug.

-- 
Eric Blake   eblake <at> redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

[signature.asc (application/pgp-signature, attachment)]

Merged 7389 7394. Request was from era eriksson <era <at> iki.fi> to control <at> debbugs.gnu.org. (Thu, 30 Aug 2012 08:08:02 GMT) Full text and rfc822 format available.

Added tag(s) patch. Request was from era eriksson <era <at> iki.fi> to control <at> debbugs.gnu.org. (Thu, 30 Aug 2012 08:08:02 GMT) Full text and rfc822 format available.

This bug report was last modified 12 years and 298 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.