GNU bug report logs - #21460
Race condition in tests/tail-2/assert.sh
Reported by: ludo <at> gnu.org (Ludovic Courtès)
Date: Fri, 11 Sep 2015 16:24:02 UTC
Severity: normal
Merged with 21459
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
On 11/09/15 23:49, Pádraig Brady wrote:
> On 11/09/15 21:55, Ludovic Courtès wrote:
>> Paul Eggert <eggert <at> cs.ucla.edu> skribis:
>>
>>> Ludovic Courtès wrote:
>>>> I think the problem happens when ‘tail’ opens ‘foo’ right in between
>>>> the two notifications: ‘foo’ is still there, and so ‘tail’ doesn’t
>>>> report anything.
>>>>
>>>> Does that make sense?
>>>
>>> Yes, though if the link count is indeed zero, I'm surprised that
>>> 'tail' can open the file -- that sounds like a bug in the kernel.
>>
>> Attached is a reproducer; just run it in a loop for a couple of seconds:
>>
>> --8<---------------cut here---------------start------------->8---
>> $ while ./a.out ; do : ; done
>> funny, errno = Success, nlink = 0
>> Aborted (core dumped)
>> --8<---------------cut here---------------end--------------->8---
>>
>> I’m not sure if that’s a kernel bug. Strictly speaking, inotify works
>> as expected: we get a notification for nlink--, which doesn’t mean the
>> file has vanished.
>
> Interesting. It does seem that the IN_ATTRIB is sent before the st_nlink--
> takes effect? That could be a bug. Or it could be a dcache coherency
> issue where the name still references the st_nlink==0 inode.
>
> Note that recheck() just open()s and close()s the file in this case,
> but since it doesn't close() the original fd, there will be
> no IN_DELETE_SELF event.
>
> If the above kernel behavior can be explained and is acceptable,
> I suppose we could augment recheck() with something like:
>
> diff --git a/src/tail.c b/src/tail.c
> index f916d74..e9d5337 100644
> --- a/src/tail.c
> +++ b/src/tail.c
> @@ -1046,6 +1046,18 @@ recheck (struct File_spec *f, bool blocking)
> close_fd (f->fd, pretty_name (f));
>
> }
> + else if (new_stats.st_nlink == 0) /* XXX: what about multi-linked files. */
> + {
> + /* It was seen on Linux that a file could be opened
> + even though unlinked as the directory entry (cache)
> + is updated after the IN_ATTRIB is sent for the nlink--. */
> +
> + error (0, f->errnum, _("%s has become inaccessible"),
> + quote (pretty_name (f)));
> +
> + close_fd (fd, pretty_name (f));
> + close_fd (f->fd, pretty_name (f));
> + f->fd = -1;
> + }
> else
> {
>
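The st_nlink == 0 test in the proposed hunk relies on fstat() reporting a zero link count through a descriptor whose name has already been removed. A minimal sketch of that behaviour follows. The helper names fd_is_unlinked and check_temp_file are hypothetical and not part of tail.c, and /tmp is assumed to be writable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not from tail.c.  Return 1 if FD refers to a file
   with no remaining links, 0 if it still has at least one link, and -1
   if fstat() fails.  */
int
fd_is_unlinked (int fd)
{
  struct stat st;

  assert (fd >= 0);
  if (fstat (fd, &st) != 0)
    return -1;
  return st.st_nlink == 0;
}

/* Create a temporary file, then record what fd_is_unlinked() reports
   before and after its only name is removed.  The descriptor stays open
   across the unlink(), as tail's f->fd does.  Returns 0 on success.  */
int
check_temp_file (int *before, int *after)
{
  char path[] = "/tmp/nlink-demo-XXXXXX";   /* assumed writable location */
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;

  *before = fd_is_unlinked (fd);
  int rc = unlink (path);
  *after = fd_is_unlinked (fd);

  close (fd);
  return rc;
}
```

If the file had a second hard link, st_nlink would stay above zero after the unlink(), which is the case flagged by the XXX comment in the hunk.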
>> The conclusion for ‘tail’ would be to wait for the IN_DELETE_SELF event
>> before considering the file to be gone. WDYT?
>
> As mentioned above, tail references the file until it can't open it,
> so the IN_DELETE_SELF is only generated upon the close_fd(f->fd) above.
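The point about IN_DELETE_SELF can be checked outside of tail with a small sketch, assuming Linux inotify and a writable /tmp. It watches a file, unlinks it while a descriptor is still open, and collects the event masks seen before and after that descriptor is closed. The names drain_event_masks and watch_unlink_with_open_fd are hypothetical and do not come from tail.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Drain every event currently queued on the non-blocking inotify
   descriptor IFD and return the bitwise OR of their masks.  */
uint32_t
drain_event_masks (int ifd)
{
  _Alignas (struct inotify_event) char buf[4096];
  uint32_t seen = 0;
  ssize_t len;

  assert (ifd >= 0);
  while ((len = read (ifd, buf, sizeof buf)) > 0)
    for (char *p = buf; p < buf + len; )
      {
        const struct inotify_event *ev = (const struct inotify_event *) p;
        seen |= ev->mask;
        p += sizeof *ev + ev->len;
      }
  return seen;
}

/* Watch a temporary file, unlink it while a descriptor is still open,
   and report the masks seen before and after that descriptor is closed.
   Returns 0 on success, -1 if the setup fails.  */
int
watch_unlink_with_open_fd (uint32_t *before_close, uint32_t *after_close)
{
  char path[] = "/tmp/tail-watch-XXXXXX";   /* assumed writable location */
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;

  int ifd = inotify_init1 (IN_NONBLOCK);
  if (ifd < 0
      || inotify_add_watch (ifd, path, IN_ATTRIB | IN_DELETE_SELF) < 0
      || unlink (path) != 0)
    {
      if (ifd >= 0)
        close (ifd);
      close (fd);
      unlink (path);
      return -1;
    }

  *before_close = drain_event_masks (ifd);  /* fd is still open here */
  close (fd);                               /* drop the last reference */
  *after_close = drain_event_masks (ifd);
  close (ifd);
  return 0;
}
```

On a typical Linux filesystem I would expect *before_close to contain IN_ATTRIB but not IN_DELETE_SELF, and *after_close to contain IN_DELETE_SELF, which matches the behaviour described above. The sketch does not reproduce the race itself, only the ordering of the two events.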
Google reminded me of this!
https://lists.gnu.org/archive/html/coreutils/2015-07/msg00015.html
I.e., this is the same issue that Assaf noticed,
and that I thought was restricted to older kernels.
That has an alternate fix attached.
cheers,
Pádraig.
This bug report was last modified 9 years and 232 days ago.