GNU bug report logs - #21460
Race condition in tests/tail-2/assert.sh

Previous Next

Package: coreutils;

Reported by: ludo <at> gnu.org (Ludovic Courtès)

Date: Fri, 11 Sep 2015 16:24:02 UTC

Severity: normal

Merged with 21459

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

Full log


View this message in rfc822 format

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Pádraig Brady <P <at> draigBrady.com>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#21460: closed (Race condition in tests/tail-2/assert.sh)
Date: Fri, 02 Oct 2015 15:51:02 +0000
[Message part 1 (text/plain, inline)]
Your message dated Fri, 2 Oct 2015 16:50:22 +0100
with message-id <560EA7BE.8010305 <at> draigBrady.com>
and subject line Re: bug#21460: Race condition in tests/tail-2/assert.sh
has caused the debbugs.gnu.org bug report #21460,
regarding Race condition in tests/tail-2/assert.sh
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
21460: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=21460
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: ludo <at> gnu.org (Ludovic Courtès)
To: bug-coreutils <at> gnu.org
Cc: bug-guix <at> gnu.org
Subject: Race condition in tests/tail-2/assert.sh
Date: Fri, 11 Sep 2015 18:23:31 +0200
Hello,

We have observed intermittent failures of tests/tail-2/assert.sh
(Coreutils 8.24, libc 2.22), especially showing up on relatively slow
machines (armhf and mips64el.)

The failure is with ‘tail --follow=name’, which, in inotify mode, would
fail to report that file ‘foo’ has been deleted.

The strace of a correct execution (x86_64) is like this:

--8<---------------cut here---------------start------------->8---
lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
inotify_init()                          = 5
write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3
open("a", O_RDONLY|O_NONBLOCK)          = 6
lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstat(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0
close(6)                                = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
open("foo", O_RDONLY|O_NONBLOCK)        = 6
lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0
close(6)                                = 0
fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16
open("foo", O_RDONLY|O_NONBLOCK)        = -1 ENOENT (No such file or directory)
lstat("foo", 0x7ffee174c3b0)            = -1 ENOENT (No such file or directory)
write(2, "tail: ", 6)                   = 6
write(2, "foo", 3)                      = 3
write(2, ": No such file or directory", 27) = 27
write(2, "\n", 1)                       = 1
close(4)                                = 0
read(5, "\3\0\0\0\0\4\0\0\0\0\0\0\0\0\0\0", 20) = 16
inotify_rm_watch(5, 3)                  = -1 EINVAL (Invalid argument)
--8<---------------cut here---------------end--------------->8---

For a *failing* execution (armhf), we get:

--8<---------------cut here---------------start------------->8---
lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
inotify_init()                          = 5
write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3
open("a", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
close(6)                                = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
close(6)                                = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16
open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178938, f_bavail=34274467, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
close(6)                                = 0
read(5,
--8<---------------cut here---------------end--------------->8---

In both cases, reading from the inotify file descriptor (number 5)
returns a notification for watch #3 (corresponding to ‘foo’), with mask
IN_ATTRIB (value 4).

However, the open("foo") call that immediately follows does *not* return
ENOENT in the failing case: The file is still there.

The kernel’s ‘vfs_unlink’ goes like this:

        fsnotify_link_count(target);                    /* IN_ATTRIB */
        d_delete(dentry); /* fsnotify_nameremove → IN_DELETE for “.” */

So, ‘tail’ first receives the IN_ATTRIB notification corresponding to
the link count decrease on ‘foo’; at that point, ‘foo’ is still
available.  And then, ‘tail’ should receive the IN_DELETE_SELF
notification, at which point ‘foo’ would have been actually been
unlinked.  But in practice we don’t seem to be receiving IN_DELETE_SELF,
even in the succeeding case.

I think the problem happens when ‘tail’ opens ‘foo’ right in between of
the two notifications: ‘foo’ is still there, and so ‘tail’ doesn’t
report anything.

Does that make sense?

Thanks,
Ludo’.


[Message part 3 (message/rfc822, inline)]
From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>, Ludovic Courtès
 <ludo <at> gnu.org>
Cc: 21460-done <at> debbugs.gnu.org, bug-guix <at> gnu.org
Subject: Re: bug#21460: Race condition in tests/tail-2/assert.sh
Date: Fri, 2 Oct 2015 16:50:22 +0100
[Message part 4 (text/plain, inline)]
On 12/09/15 03:22, Pádraig Brady wrote:
> On 12/09/15 02:09, Paul Eggert wrote:
>> Pádraig Brady wrote:
>>> +  else if (new_stats.st_nlink == 0) /* XXX: what about multi-linked files.  */
>>
>> That comment was my thought exactly.  It appears to be a kernel bug that really 
>> needs to get fixed in the kernel; there just doesn't seem to be a reliable 
>> workaround in user space.
> 
> I double checked with kernel guys, and Al Viro
> essentially said the inode and directory operations are independent.
> https://lkml.org/lkml/2015/9/11/790
> 
> So we probably need to handle the IN_DELETE event
> on the directory to cater for this race, as done in:
> https://lists.gnu.org/archive/html/coreutils/2015-07/msg00015.html

I'll apply the attached later.

thanks,
Pádraig

[tail-unlink-notification-race.patch (text/x-patch, attachment)]

This bug report was last modified 9 years and 232 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.