GNU bug report logs - #18499
Possible mv race for hardlinks (rhbz #1141368 )

Previous Next

Package: coreutils;

Reported by: ovasik <at> redhat.com

Date: Thu, 18 Sep 2014 10:54:02 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: Pádraig Brady <P <at> draigBrady.com>
To: ovasik <at> redhat.com
Cc: 18499 <at> debbugs.gnu.org, Boris Ranto <branto <at> redhat.com>, Miklos Szeredi <miklos <at> szeredi.hu>
Subject: bug#18499: Possible mv race for hardlinks (rhbz #1141368 )
Date: Sun, 16 Nov 2014 12:06:46 +0000
On 16/11/14 11:38, Ondrej Vasik wrote:
> On Fri, 2014-11-14 at 23:36 +0000, Pádraig Brady wrote:
>> On 13/11/14 16:06, Boris Ranto wrote:
>>> On Thu, 2014-11-13 at 15:41 +0000, Pádraig Brady wrote:
>>>> On 13/11/14 13:58, Boris Ranto wrote:
>>>>> On Wed, 2014-09-24 at 17:18 +0100, Pádraig Brady wrote:
> ...
>>>> We've three options that I see:
>>>>
>>>> 1. Simulate but be susceptible to data loss races (what we do now)
>>>> 2. Ignore rename() issue and leave both files present (what FreeBSD does)
>>>> 3. Fail with a "files are identical" error (what Solaris does)
>>>>
>>
>> I see this was previously discussed in http://bugs.gnu.org/10686
>> and it was mentioned there that the 2008 POSIX standard explicitly
>> states that any of the three approaches above is appropriate.
>>
>>> Personally, I like the Solaris way the best (if we can't get the system
>>> call that would assure the atomicity in this case) -- i.e. let the user
>>> know that we can't do that. It is also in tune with what the link man
>>> page says for hard links:
>>>
>>>    "This  new  name  may be used exactly as the old one for any
>>>     operation;"
>>>
>>> As we error out on 'mv a a', we should probably also threat 'mv a b'
>>> where a and b are the same file that way.
>>>
>>> As opposed to FreeBSD solution, the user will know that he needs to run
>>> 'rm a' to accomplish what he actually wanted to do in the first place.
>>>
>>>> Now 1 is also an "issue" when the source and dest files are on separate file systems.
>>>> I.E. mv will need to unlink() in that case.  So from a high level, mv is foregoing the
>>>> atomicity of the rename in this edge case to provide consistent behavior across
>>>> all cases.  The question boils down to whether atomicity in this edge case is required.
>>>> Could you expand on a use case that requires this atomicity?
>>>>
>>> There actually is a real world use case. The original red hat bugzilla
>>> (1141368) was filed because people were hitting this issue in a
>>> distributed environment (via GlusterFS) -- i.e. two people were renaming
>>> files at the same time from different locations.
>>
>> Following those bugs indicates it was an internal test case that was failing.
>> I.E. it's an unlikely edge case.
>>
>> I'm in a bit of a quandary with this one.
>> Seeing as this (consistent high level) behavior has been in use for the last 11 years:
>> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=fc6073d6
>> without a "real" customer complaint, I'm tending to leave as is,
>> but push for better behavior from the kernel.
> 
> There was at least one complaint about non-atomicity
> ( http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/12985 ). Jim
> tried to address this with
> http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=63feb84a2db5246fb71df45884589b914679110c , however this non-atomicity still remains at least on this one place shown by the test-case in the mentioned rhbz #1141368. I have asked if there are some real world usecases - other than this artificial one - in the original bugzilla.

Yes that was for a much more important case of _any_ hardlinked file,
rather than two that are hardlinked to each other.

>> I.E. add a RENAME_REPLACE flag to renameat2(), which will
>> do the rename atomically for hardlinks, so we could do something like:
>>
>>   #if !HAVE_RENAME_REPLACE
>>   if (same_file (a, b))
>>     unlink (src); /* as we do now.  */
>>   #endif
>>   rename (a, b);
>>
>> thanks,
>> Pádraig.
>>
> 
> Personally, although this one reproducer scenario is artificial, it
> still brings potential data loss - and this is always bad. Flag in
> kernel sounds like reasonable compromise - to not change existing
> behaviour allowed by POSIX. It may take some time, though.

Right, though my point was we have time to fix this correctly,
since no-one has reported this particular case in a practical use case
in the 10 years it was possible.

Also all cases would still not be handled if we changed this, for example:
  mv /backup1/file /backup2/file & mv /backup2/file /backup1/file

> However - as it is relatively corner case - change to Solaris approach
> should be IMHO relatively safe (for other cases, behaviour will not
> change at all).

If we change this, it's much more likely that people will start complaining
about their non overlapping mv instances failing.

Hence I'm 55:45 for leaving as is and fixing in the right place (the kernel).

thanks,
Pádraig.




This bug report was last modified 10 years and 180 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.