GNU bug report logs - #21459
Race condition in tests/tail-2/assert.sh

Previous Next

Package: coreutils;

Reported by: ludo <at> gnu.org (Ludovic Courtès)

Date: Fri, 11 Sep 2015 16:24:02 UTC

Severity: normal

Merged with 21460

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21459 in the body.
You can then email your comments to 21459 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-coreutils <at> gnu.org:
bug#21459; Package coreutils. (Fri, 11 Sep 2015 16:24:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to ludo <at> gnu.org (Ludovic Courtès):
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Fri, 11 Sep 2015 16:24:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: bug-coreutils <at> gnu.org
Cc: bug-guix <at> gnu.org
Subject: Race condition in tests/tail-2/assert.sh
Date: Fri, 11 Sep 2015 18:23:31 +0200
Hello,

We have observed intermittent failures of tests/tail-2/assert.sh
(Coreutils 8.24, libc 2.22), especially showing up on relatively slow
machines (armhf and mips64el.)

The failure is with ‘tail --follow=name’, which, in inotify mode, would
fail to report that file ‘foo’ has been deleted.

The strace of a correct execution (x86_64) is like this:

--8<---------------cut here---------------start------------->8---
lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
inotify_init()                          = 5
write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3
open("a", O_RDONLY|O_NONBLOCK)          = 6
lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstat(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0
close(6)                                = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
open("foo", O_RDONLY|O_NONBLOCK)        = 6
lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0
close(6)                                = 0
fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16
open("foo", O_RDONLY|O_NONBLOCK)        = -1 ENOENT (No such file or directory)
lstat("foo", 0x7ffee174c3b0)            = -1 ENOENT (No such file or directory)
write(2, "tail: ", 6)                   = 6
write(2, "foo", 3)                      = 3
write(2, ": No such file or directory", 27) = 27
write(2, "\n", 1)                       = 1
close(4)                                = 0
read(5, "\3\0\0\0\0\4\0\0\0\0\0\0\0\0\0\0", 20) = 16
inotify_rm_watch(5, 3)                  = -1 EINVAL (Invalid argument)
--8<---------------cut here---------------end--------------->8---

For a *failing* execution (armhf), we get:

--8<---------------cut here---------------start------------->8---
lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
inotify_init()                          = 5
write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3
open("a", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
close(6)                                = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
close(6)                                = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16
open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178938, f_bavail=34274467, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
close(6)                                = 0
read(5,
--8<---------------cut here---------------end--------------->8---

In both cases, reading from the inotify file descriptor (number 5)
returns a notification for watch #3 (corresponding to ‘foo’), with mask
IN_ATTRIB (value 4).

However, the open("foo") call that immediately follows does *not* return
ENOENT in the failing case: The file is still there.

The kernel’s ‘vfs_unlink’ goes like this:

        fsnotify_link_count(target);                    /* IN_ATTRIB */
        d_delete(dentry); /* fsnotify_nameremove → IN_DELETE for “.” */

So, ‘tail’ first receives the IN_ATTRIB notification corresponding to
the link count decrease on ‘foo’; at that point, ‘foo’ is still
available.  And then, ‘tail’ should receive the IN_DELETE_SELF
notification, at which point ‘foo’ would have been actually been
unlinked.  But in practice we don’t seem to be receiving IN_DELETE_SELF,
even in the succeeding case.

I think the problem happens when ‘tail’ opens ‘foo’ right in between of
the two notifications: ‘foo’ is still there, and so ‘tail’ doesn’t
report anything.

Does that make sense?

Thanks,
Ludo’.




Forcibly Merged 21459 21460. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Fri, 11 Sep 2015 18:20:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-coreutils <at> gnu.org:
bug#21459; Package coreutils. (Fri, 11 Sep 2015 20:35:02 GMT) Full text and rfc822 format available.

Message #10 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Ludovic Courtès <ludo <at> gnu.org>, 21459 <at> debbugs.gnu.org
Cc: bug-guix <at> gnu.org
Subject: Re: bug#21459: Race condition in tests/tail-2/assert.sh
Date: Fri, 11 Sep 2015 21:34:25 +0100
On 11/09/15 17:23, Ludovic Courtès wrote:
> Hello,
> 
> We have observed intermittent failures of tests/tail-2/assert.sh
> (Coreutils 8.24, libc 2.22), especially showing up on relatively slow
> machines (armhf and mips64el.)
> 
> The failure is with ‘tail --follow=name’, which, in inotify mode, would
> fail to report that file ‘foo’ has been deleted.
> 
> The strace of a correct execution (x86_64) is like this:
> 
> --8<---------------cut here---------------start------------->8---
> lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> inotify_init()                          = 5
> write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25
> inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
> inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2
> inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
> inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3
> open("a", O_RDONLY|O_NONBLOCK)          = 6
> lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> fstat(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0
> close(6)                                = 0
> fstat(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> open("foo", O_RDONLY|O_NONBLOCK)        = 6
> lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0
> close(6)                                = 0
> fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16
> open("foo", O_RDONLY|O_NONBLOCK)        = -1 ENOENT (No such file or directory)
> lstat("foo", 0x7ffee174c3b0)            = -1 ENOENT (No such file or directory)
> write(2, "tail: ", 6)                   = 6
> write(2, "foo", 3)                      = 3
> write(2, ": No such file or directory", 27) = 27
> write(2, "\n", 1)                       = 1
> close(4)                                = 0
> read(5, "\3\0\0\0\0\4\0\0\0\0\0\0\0\0\0\0", 20) = 16
> inotify_rm_watch(5, 3)                  = -1 EINVAL (Invalid argument)
> --8<---------------cut here---------------end--------------->8---
> 
> For a *failing* execution (armhf), we get:
> 
> --8<---------------cut here---------------start------------->8---
> lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> inotify_init()                          = 5
> write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25
> inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
> inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2
> inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
> inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3
> open("a", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
> lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> fstat64(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
> close(6)                                = 0
> fstat64(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
> open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
> lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
> close(6)                                = 0
> fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16
> open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
> lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178938, f_bavail=34274467, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
> close(6)                                = 0
> read(5,
> --8<---------------cut here---------------end--------------->8---
> 
> In both cases, reading from the inotify file descriptor (number 5)
> returns a notification for watch #3 (corresponding to ‘foo’), with mask
> IN_ATTRIB (value 4).
> 
> However, the open("foo") call that immediately follows does *not* return
> ENOENT in the failing case: The file is still there.
> 
> The kernel’s ‘vfs_unlink’ goes like this:
> 
>         fsnotify_link_count(target);                    /* IN_ATTRIB */
>         d_delete(dentry); /* fsnotify_nameremove → IN_DELETE for “.” */
> 
> So, ‘tail’ first receives the IN_ATTRIB notification corresponding to
> the link count decrease on ‘foo’; at that point, ‘foo’ is still
> available.  And then, ‘tail’ should receive the IN_DELETE_SELF
> notification, at which point ‘foo’ would have been actually been
> unlinked.  But in practice we don’t seem to be receiving IN_DELETE_SELF,
> even in the succeeding case.
> 
> I think the problem happens when ‘tail’ opens ‘foo’ right in between of
> the two notifications: ‘foo’ is still there, and so ‘tail’ doesn’t
> report anything.
> 
> Does that make sense?

Thanks for looking into it in such detail.
IN_DELETE_SELF is only sent when the inode is removed,
not on the unlink itself (as tail has the file open).
One of the awkward gotchas with inotify being inode oriented.

I.E. I wonder is this a cache coherency issue,
with tail referencing the stale inode in the open()
before the dcache is updated.

It'll be a while before I can look at this.

thanks again,
Pádraig.




Information forwarded to bug-coreutils <at> gnu.org:
bug#21459; Package coreutils. (Fri, 11 Sep 2015 20:35:04 GMT) Full text and rfc822 format available.

Merged 21459 21460. Request was from ludo <at> gnu.org (Ludovic Courtès) to control <at> debbugs.gnu.org. (Fri, 11 Sep 2015 21:11:01 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 31 Oct 2015 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 9 years and 235 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.