Package: coreutils;
Reported by: Gian Piero Carrubba <gpiero <at> rm-rf.it>
Date: Fri, 23 Aug 2013 21:55:02 UTC
Severity: normal
Tags: fixed
Merged with 23120
Done: Bernhard Voelker <mail <at> bernhard-voelker.de>
Bug is archived. No further changes may be made.
View this message in rfc822 format
From: Pádraig Brady <P <at> draigBrady.com> To: Bernhard Voelker <mail <at> bernhard-voelker.de> Cc: Gian Piero Carrubba <gpiero <at> rm-rf.it>, 15173 <at> debbugs.gnu.org Subject: bug#15173: [cp] --link overrides dereference settings Date: Tue, 29 Oct 2013 18:27:50 +0000
On 10/20/2013 02:11 AM, Bernhard Voelker wrote: > Hello Gian, > > On 08/23/2013 11:40 PM, Gian Piero Carrubba wrote: >> Please keep me in Cc: as I'm not subscribed. >> >> I failed to find references to discussions about this (intended?) >> behaviour, so I'm filing this report. Please forgive me if I've missed >> something elementary. > > thanks for your bug report, the thorough thoughts about it and the > patch proposal. This looks like a valid issue... >> $ echo testfile > file; ln -s file link; opts="-l -Ll -Hl"; >> for o in $opts; do cp $o link cp${o}; done; >> ls -li >> >> total 4 >> 3358742 lrwxrwxrwx 4 gpiero gpiero 4 Aug 23 22:27 cp-Hl -> file >> 3358742 lrwxrwxrwx 4 gpiero gpiero 4 Aug 23 22:27 cp-Ll -> file >> 3358742 lrwxrwxrwx 4 gpiero gpiero 4 Aug 23 22:27 cp-l -> file >> 3358741 -rw-r--r-- 1 gpiero gpiero 9 Aug 23 22:27 file >> 3358742 lrwxrwxrwx 4 gpiero gpiero 4 Aug 23 22:27 link -> file >> >> I would have expected both 'cp-Hl' and 'cp-Ll' to be hardlinked to >> 'file', not to 'link'. > > I'd also say that the above is an error ... or at least contradicts > to what is documented in the code (copy.c, line 2377): > > Yet cp, invoked with '--link --no-dereference', > should not follow the link. Yes I would think that `ln -L` and `cp --link -L` should be equivalent in this regard. >> --- old-coreutils/src/copy.c 2013-08-23 22:16:15.938940800 +0200 >> +++ new-coreutils/src/copy.c 2013-08-23 22:16:15.942940823 +0200 >> @@ -1566,10 +1566,15 @@ >> be created as a symbolic link to SRC_NAME. */ >> static bool >> create_hard_link (char const *src_name, char const *dst_name, >> - bool replace, bool verbose) >> + bool replace, bool verbose, bool dereference) >> { >> - /* We want to guarantee that symlinks are not followed. */ >> - bool link_failed = (linkat (AT_FDCWD, src_name, AT_FDCWD, dst_name, 0) != 0); >> + int flags; >> + if (dereference) >> + flags = AT_SYMLINK_FOLLOW; >> + else >> + flags = 0; /* We want to guarantee that symlinks are not followed. */ >> + >> + bool link_failed = (linkat (AT_FDCWD, src_name, AT_FDCWD, dst_name, flags) != 0); >> >> /* If the link failed because of an existing destination, >> remove that file and then call link again. */ > > Instead of passing 'flags' to linkat (of which I don't know if it works > on all platforms), I'd rather dereference the symlink manually: > > char *src_name_new = canonicalize_filename_mode (src_name, CAN_EXISTING); Actually since linkat() has defined semantics, _and thanks to gnulib, guaranteed semantics_ we should just leverage the gnulib magic here, and use linkat with flags. >> AFAICS, at least in one case the patch changes the behaviour of cp: >> >> $ echo testfile > file; ln -s file link; >> cp -l link cp-l.old; ../src/cp -l link cp-l.new; >> ls -li >> >> total 8 >> 8912975 -rw-r--r-- 2 gpiero gpiero 9 Aug 23 22:58 cp-l.new >> 8913010 lrwxrwxrwx 2 gpiero gpiero 4 Aug 23 22:58 cp-l.old -> file >> 8912975 -rw-r--r-- 2 gpiero gpiero 9 Aug 23 22:58 file >> 8913010 lrwxrwxrwx 2 gpiero gpiero 4 Aug 23 22:58 link -> file > > Not ideal ... > >> This is caused by the following code: >> >> $ tail -n+1135 src/cp.c | head -n8 >> if (x.dereference == DEREF_UNDEFINED) >> { >> if (x.recursive) >> /* This is compatible with FreeBSD. */ >> x.dereference = DEREF_NEVER; >> else >> x.dereference = DEREF_ALWAYS; >> } Interesting. You'd nearly think it should be the other way around, given linking to symlinks when copying to other directories, may give invalid symlinks if they link outside the copied hierarchy. > Good analysis! > My guess is that this because POSIX [1] allows either behavior: > > If source_file is a file of type symbolic link: > [...] > If the -R option was specified: > If none of the options -H, -L, nor -P were specified, it is > unspecified which of -H, -L, or -P will be used as a default. > > [1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cp.html > > This means that coreutils has chosen in between what makes most sense > and compatibility to other implementations. > >> Not sure why this snippet is there[0], but it seems to me that it leads >> to inconsistent behaviour: >> >> $ echo testfile > file; ln -s file link; >> cp link cp; cp -r link cp-r; >> ls -li >> >> total 8 >> 3358743 -rw-r--r-- 1 gpiero gpiero 9 Aug 23 23:06 cp >> 3358744 lrwxrwxrwx 1 gpiero gpiero 4 Aug 23 23:06 cp-r -> file >> 3358741 -rw-r--r-- 1 gpiero gpiero 9 Aug 23 23:06 file >> 3358742 lrwxrwxrwx 1 gpiero gpiero 4 Aug 23 23:06 link -> file >> >> I think it is counterintuitive that '--recursive' had effects on the >> referentiation of the link.[1] > > I think that is correct as POSIX requires this (see also in [1] above). > > Instead, I'd change the above initialization of x.dereference in the > case the --link option is specified. > > I have wrapped it for you into a Git patch (below), yet without a test > case. > > This is the result: > > $ : > file; ln -s file link > $ for opt in "" -l -lH -lL -lP -r ; do cp $opt link cp$opt ; done > $ ls -ldogi link file cp* Nice basis for a test. > 15335993 -rw-r--r-- 1 0 Oct 20 03:09 cp > 15335992 lrwxrwxrwx 3 4 Oct 20 03:08 cp-l -> file > 15335991 -rw-r--r-- 3 0 Oct 20 03:08 cp-lH > 15335991 -rw-r--r-- 3 0 Oct 20 03:08 cp-lL > 15335992 lrwxrwxrwx 3 4 Oct 20 03:08 cp-lP -> file > 15335994 lrwxrwxrwx 1 4 Oct 20 03:09 cp-r -> file > 15335991 -rw-r--r-- 3 0 Oct 20 03:08 file > 15335992 lrwxrwxrwx 3 4 Oct 20 03:08 link -> file The above look correct. For comparison on FreeBSD 9.1 1755429 -rw-r--r-- 1 pbrady pbrady - 0 Oct 29 12:16 cp 1755427 -rw-r--r-- 5 pbrady pbrady - 0 Oct 29 12:16 cp-l 1755427 -rw-r--r-- 5 pbrady pbrady - 0 Oct 29 12:16 cp-lH 1755427 -rw-r--r-- 5 pbrady pbrady - 0 Oct 29 12:16 cp-lL 1755427 -rw-r--r-- 5 pbrady pbrady - 0 Oct 29 12:16 cp-lP 1755430 -rw-r--r-- 1 pbrady pbrady - 0 Oct 29 12:16 cp-r 1755427 -rw-r--r-- 5 pbrady pbrady - 0 Oct 29 12:16 file 1755428 lrwxr-xr-x 1 pbrady pbrady - 4 Oct 29 12:16 link -> file And from older 8.10 GNU cp we have: 2368286 -rw-rw-r--. 1 0 Oct 29 18:19 cp 2368262 lrwxrwxrwx. 4 4 Oct 29 18:19 cp-l -> file 2368262 lrwxrwxrwx. 4 4 Oct 29 18:19 cp-lH -> file 2368262 lrwxrwxrwx. 4 4 Oct 29 18:19 cp-lL -> file 2368298 lrwxrwxrwx. 1 4 Oct 29 18:19 cp-lP -> file 2368299 lrwxrwxrwx. 1 4 Oct 29 18:19 cp-r -> file 2368218 -rw-rw-r--. 1 0 Oct 29 18:19 file 2368262 lrwxrwxrwx. 4 4 Oct 29 18:19 link -> file In both the above we seem to be just getting the default dereferencing behavior of link(1). > WDYT? > > BTW: tests/cp/same-file.sh now fails with the patch. Yes better to consolidate the functionality, before doing the tests to avoid awkward test churn. > Have a nice day, > Berny > > >>From fca06077e11d6338b35029bc61fbee5dbfb5ab66 Mon Sep 17 00:00:00 2001 > From: Gian Piero Carrubba <gpiero <at> rm-rf.it> > Date: Sun, 20 Oct 2013 03:04:41 +0200 > Subject: [PATCH] cp: fix --link --dereference [DRAFT] > > * src/copy.c (create_hard_link): Add a bool 'dereference' parameter, > and canonicalize the given src_name when dereference is true. > (copy_internal): Add a value for the above new parameter in all > three calls to create_hard_link: true if the src_name should be > dereferenced before creating the hard link, false otherwise. > * src/cp.c (main): after parsing the options, if x.dereference is > still DEFEF_UNDEFINED and the --links option was specified, initialize s/--links/--link/ > x.dereference to DEREF_NEVER. > * NEWS (Changes in behavior): Mention the change. > > This fixes http://bugs.gnu.org/15173 > --- > NEWS | 4 ++++ > src/copy.c | 42 ++++++++++++++++++++++++++++++------------ > src/cp.c | 2 +- > 3 files changed, 35 insertions(+), 13 deletions(-) > > diff --git a/NEWS b/NEWS > index 39dd3d6..c1d637c 100644 > --- a/NEWS > +++ b/NEWS > @@ -70,6 +70,10 @@ GNU coreutils NEWS -*- outline -*- > > ** Changes in behavior > > + cp --link --dereference now dereferences a symbolic link as source files > + before creating the hard link in the destination. > + Previously, it would create a hard link of the symbolic link. > + > dd status=none now suppresses all non fatal diagnostic messages, > not just the transfer counts. > > diff --git a/src/copy.c b/src/copy.c > index f66ab46..fa4d734 100644 > --- a/src/copy.c > +++ b/src/copy.c > @@ -1558,18 +1558,23 @@ restore_default_fscreatecon_or_die (void) > _("failed to restore the default file creation context")); > } > > -/* Create a hard link DST_NAME to SRC_NAME, honoring the REPLACE and > - VERBOSE settings. Return true upon success. Otherwise, diagnose > - the failure and return false. > - If SRC_NAME is a symbolic link it will not be followed. If the system > - doesn't support hard links to symbolic links, then DST_NAME will > - be created as a symbolic link to SRC_NAME. */ > +/* Create a hard link DST_NAME to SRC_NAME, honoring the REPLACE, > + VERBOSE and DEREFERENCE settings. Return true upon success. > + Otherwise, diagnose the failure and return false. > + If SRC_NAME is a symbolic link it will not be followed unless DEREFERENCE > + is true. If the system doesn't support hard links to symbolic links, > + then DST_NAME will be created as a symbolic link to SRC_NAME. */ > static bool > create_hard_link (char const *src_name, char const *dst_name, > - bool replace, bool verbose) > + bool replace, bool verbose, bool dereference) > { > + char *src_deref = NULL; > + if (dereference) > + src_deref = canonicalize_filename_mode (src_name, CAN_EXISTING); Probably best avoided as mentioned above. > /* We want to guarantee that symlinks are not followed. */ > - bool link_failed = (linkat (AT_FDCWD, src_name, AT_FDCWD, dst_name, 0) != 0); > + bool link_failed = (linkat (AT_FDCWD, dereference ? src_deref : src_name, > + AT_FDCWD, dst_name, 0) != 0); > > /* If the link failed because of an existing destination, > remove that file and then call link again. */ > @@ -1578,13 +1583,17 @@ create_hard_link (char const *src_name, char const *dst_name, > if (unlink (dst_name) != 0) > { > error (0, errno, _("cannot remove %s"), quote (dst_name)); > + free (src_deref); > return false; > } > if (verbose) > printf (_("removed %s\n"), quote (dst_name)); > - link_failed = (linkat (AT_FDCWD, src_name, AT_FDCWD, dst_name, 0) != 0); > + link_failed = (linkat (AT_FDCWD, dereference ? src_deref : src_name, > + AT_FDCWD, dst_name, 0) != 0); > } > > + free (src_deref); > + > if (link_failed) > { > error (0, errno, _("cannot create hard link %s to %s"), > @@ -1748,7 +1757,10 @@ copy_internal (char const *src_name, char const *dst_name, > /* Note we currently replace DST_NAME unconditionally, > even if it was a newer separate file. */ > if (! create_hard_link (earlier_file, dst_name, true, > - x->verbose)) > + x->verbose, > + x->dereference == DEREF_ALWAYS > + || (x->dereference == DEREF_COMMAND_LINE_ARGUMENTS > + && command_line_arg))) > { > goto un_backup; > } > @@ -2078,7 +2090,10 @@ copy_internal (char const *src_name, char const *dst_name, > } > else > { > - if (! create_hard_link (earlier_file, dst_name, true, x->verbose)) > + if (! create_hard_link (earlier_file, dst_name, true, x->verbose, > + x->dereference == DEREF_ALWAYS > + || (x->dereference == DEREF_COMMAND_LINE_ARGUMENTS > + && command_line_arg))) > goto un_backup; > > return true; > @@ -2389,7 +2404,10 @@ copy_internal (char const *src_name, char const *dst_name, > && !(LINK_FOLLOWS_SYMLINKS && S_ISLNK (src_mode) > && x->dereference == DEREF_NEVER)) > { > - if (! create_hard_link (src_name, dst_name, false, false)) > + if (! create_hard_link (src_name, dst_name, false, false, > + x->dereference == DEREF_ALWAYS > + || (x->dereference == DEREF_COMMAND_LINE_ARGUMENTS > + && command_line_arg))) Lots and copy and paste there. Maybe use a helper like: static inline bool _GL_ATTRIBUTE_PURE should_dereference (const struct cp_options *x, bool cli_arg) { return x->dereference == DEREF_ALWAYS || (x->dereference == DEREF_COMMAND_LINE_ARGUMENTS && cli_arg); } > goto un_backup; > } > else if (S_ISREG (src_mode) > diff --git a/src/cp.c b/src/cp.c > index 7bc8630..6efb152 100644 > --- a/src/cp.c > +++ b/src/cp.c > @@ -1135,7 +1135,7 @@ main (int argc, char **argv) > > if (x.dereference == DEREF_UNDEFINED) > { > - if (x.recursive) > + if (x.recursive || x.hard_link) > /* This is compatible with FreeBSD. */ > x.dereference = DEREF_NEVER; > else > Thanks to both of you for looking into this awkward issue. Pádraig.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.