Hello, Pádraig,

I was actually the one to identify the bug and analyze the code underlying it (Illia, my colleague, was kind enough to volunteer to file the bug report and attempt a patch), and so I may be in the better position to follow up on this. I appreciate your addressing the matter so promptly.

We are indeed provisionally using --remove-destination; however, we would prefer to use --force instead for the atomicity it affords.

Regarding the proposed solution, I think that it well addresses the present issue. But there are related issues that I should raise while we are discussing it and that may influence its resolution.

To begin, it seems problematic to make the return value in the case of hard-link creation dependent on whether the source and the destination are associated with the same device ID, as the purpose of the same_file_ok function seems to be to indicate whether it matters that the files may somehow be the same. If their file systems are different, that is not a problem with the files' being the same; it is a problem with their being on different file systems. The resulting error message (that the operation cannot be performed because the files are the same) is therefore incorrect and misleading. Perhaps, as it seems to me, this check should rather be performed outside this function and associated with its own error message.

In fact, even if the source and destination are completely unrelated, this function may still return false and cause the same-file error message to be displayed. For example, let [source] be a regular file or directory on one file system, and let [destination] be a symbolic link on another. Then the following command will result in a same-file error, even if [destination] does not refer to [source]:

cp --recursive --link --no-dereference [source] [destination]

Moreover, the name and description of the function seem misleading. One would think from reading these that the source and destination files passed to it have already been determined to be equivalent in some way, and that the function is meant to determine whether it is all right in that case to overwrite the destination; however, it appears that the function is called whenever the destination already exists, regardless of its relation to the source. This discrepancy can lead to misunderstandings and problems such as the kind last described. Ideally, the function's name and description would be made more accurate and precise, and the function used strictly accordingly.

Assuming we were to move the file-system check outside the same_file_ok function, we would be left with the following code within the function:

/* It's ok to remove a destination symbolic link when creating a symbolic link or hard link. */
if (S_ISLNK (dst_sb_link->st_mode) && (x->symbolic_link || x->hard_link))
  return true;

But as we earlier in the function handle the cases when both the source and the destination are symbolic links, and there does not seem to be a reason to constrain overwriting of symbolic links otherwise (at least within this function), we can perhaps simplify this even further to the following:

/* At this point, it's ok to remove a destination symbolic link. */
if (S_ISLNK (dst_sb_link->st_mode))
  return true;

But I have not analyzed all the relevant code to determine the full impact of this change.

Finally, it would seem to make sense for the same_file_ok function to return true immediately if the source and destination files passed to it have different inode numbers or are on different file systems (and are thus not the same file). In fact, this is done in lines 1488-89 if DEREF_NEVER is not set; however, it is not done if DEREF_NEVER is set. For reference, those lines read as follows:

if (!same)
  return true;

Cursory analysis suggests that if these lines were unconditionally executed near the top of the function, certain problems might be avoided.

Cheers,
Jason

On Sat, May 5, 2018 at 8:18 PM Pádraig Brady <P@draigbrady.com> wrote:
On 04/05/18 16:37, Illia Bobyr wrote:
> Hello,
>
> I have found a bug in "cp -rfs".
>
> Steps to reproduce:
>
> 1. Given "path1" and "path2" are on different devices.
> 2. $ touch "path1/file"
> 3. $ cd path2/; ln -s path1/file
> 4. $ cp --symbolic-link --force --recursive path1/file .
>
> Expected:
> The link is overwritten with an exact copy.
>
> Actual result:
> cp shows an error:
>     cp: 'path1/file' and './file' are the same file
>
> This bug was introduced in
>
> http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=376967889ed7ed561e46ff6d88a66779db62737a
>
> Specifically this hunk:
>
> diff --git a/src/copy.c b/src/copy.c
> index e3832c2..9dbd536 100644
> --- a/src/copy.c
> <http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/copy.c?id=2f69dba5df8caaf9eda658c1808b1379e9949f22>
> +++ b/src/copy.c
> <http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/copy.c?id=376967889ed7ed561e46ff6d88a66779db62737a>
> @@ -46,6 +46,7 @@
> #include "file-set.h"
> #include "filemode.h"
> #include "filenamecat.h"
> +#include "force-link.h"
> #include "full-write.h"
> #include "hash.h"
> #include "hash-triple.h"
> @@ -1623,11 +1624,13 @@ same_file_ok (char const *src_name, struct stat
> const *src_sb,
> }
> }
> - /* It's ok to remove a destination symlink. But that works only when we
> - unlink before opening the destination and when the source and destination
> - files are on the same partition. */
> - if (x->unlink_dest_before_opening
> - && S_ISLNK (dst_sb_link->st_mode))
> + /* It's ok to remove a destination symlink. But that works only
> + when creating symbolic links, or when the source and destination
> + are on the same file system and when creating hard links or when
> + unlinking before opening the destination. */
> + if (x->symbolic_link
> + || ((x->hard_link || x->unlink_dest_before_opening)
> + && S_ISLNK (dst_sb_link->st_mode)))
> return dst_sb_link->st_dev == src_sb_link->st_dev;
> if (x->dereference == DEREF_NEVER)
>
> Two patches that fix the issue are attached.
> They are against the current master in
> https://github.com/coreutils/coreutils
> The changes are also here:
>
> https://github.com/coreutils/coreutils/compare/master...ilya-bobyr:master
>
> Thank you, Illia Bobyr
>
Thanks for the careful analysis of this hairy code.

Your use case works with --remove, which is similar
to the very recent https://bugs.gnu.org/31335
which also requests --force to replace symlinks.
Your case is slightly different and a change from previous behavior.

Note the specific reason for the change is not the
hunk you mentioned, but this hunk in cp.c which used to
make --force --symlink also imply --remove.

  /* If --force (-f) was specified and we're in link-creation mode,
     first remove any existing destination file.  */
  if (x.unlink_dest_after_failed_open && (x.hard_link || x.symbolic_link))
    x.unlink_dest_before_opening = true;

That was also changed in the commit you referenced,
as we no longer unconditionally unlink() for atomicity reasons.
I.E. we now create a temp symlink and rename it over
the existing destination.  If you don't require these new
guarantees then --remove will work for your use case.

So back to the hunk you mentioned.
I think the logic in the hunk you referenced was suspect
before commit 3769678 and only became significant as now
hit due to --remove no longer being set unconditionally for -sf.
Your analysis wrt x->symbolic_link being independent
of the src and dst devices is sound.

Though things aren't quite right as one can now nuke a file like:

  $ touch file
  $ ln -s file l1
  $ cp -s -f l1 file

That would be a regression in commit 3769678 not in your change.

What I've currently have to address both these cases is:

diff --git a/src/copy.c b/src/copy.c
index 4998c83..806323e 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -1627,14 +1627,17 @@ same_file_ok (char const *src_name, struct stat const *src_sb,
         }
     }

-  /* It's ok to remove a destination symlink.  But that works only
-     when creating symbolic links, or when the source and destination
-     are on the same file system and when creating hard links or when
-     unlinking before opening the destination.  */
-  if (x->symbolic_link
-      || ((x->hard_link || x->unlink_dest_before_opening)
-          && S_ISLNK (dst_sb_link->st_mode)))
-    return dst_sb_link->st_dev == src_sb_link->st_dev;
+  if (S_ISLNK (dst_sb_link->st_mode))
+    {
+      /* It's ok to replace a destination symlink. */
+      if (x->symbolic_link)
+        return true;
+
+      /* Or when hard links are possible.
+         TODO: Analyze this case further.  */
+      if (x->hard_link)
+        return dst_sb_link->st_dev == src_sb_link->st_dev;
+    }

   if (x->dereference == DEREF_NEVER)
     {

I'll test a few more cases, and add tests and NEWS.

Thanks again for pinpointing these issues!

Pádraig