Package: coreutils;
Reported by: ludo <at> gnu.org (Ludovic Courtès)
Date: Fri, 11 Sep 2015 16:24:02 UTC
Severity: normal
Merged with 21460
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21459 in the body.
You can then email your comments to 21459 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
View this report as an mbox folder, status mbox, maintainer mbox
bug-coreutils <at> gnu.org
:bug#21459
; Package coreutils
.
(Fri, 11 Sep 2015 16:24:02 GMT) Full text and rfc822 format available.ludo <at> gnu.org (Ludovic Courtès)
:bug-coreutils <at> gnu.org
.
(Fri, 11 Sep 2015 16:24:02 GMT) Full text and rfc822 format available.Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: ludo <at> gnu.org (Ludovic Courtès) To: bug-coreutils <at> gnu.org Cc: bug-guix <at> gnu.org Subject: Race condition in tests/tail-2/assert.sh Date: Fri, 11 Sep 2015 18:23:31 +0200
Hello, We have observed intermittent failures of tests/tail-2/assert.sh (Coreutils 8.24, libc 2.22), especially showing up on relatively slow machines (armhf and mips64el.) The failure is with ‘tail --follow=name’, which, in inotify mode, would fail to report that file ‘foo’ has been deleted. The strace of a correct execution (x86_64) is like this: --8<---------------cut here---------------start------------->8--- lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 inotify_init() = 5 write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25 inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1 inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2 inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1 inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3 open("a", O_RDONLY|O_NONBLOCK) = 6 lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 fstat(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0 close(6) = 0 fstat(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 open("foo", O_RDONLY|O_NONBLOCK) = 6 lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0 close(6) = 0 fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16 open("foo", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or directory) lstat("foo", 0x7ffee174c3b0) = -1 ENOENT (No such file or directory) write(2, "tail: ", 6) = 6 write(2, "foo", 3) = 3 write(2, ": No such file or directory", 27) = 27 write(2, "\n", 1) = 1 close(4) = 0 read(5, "\3\0\0\0\0\4\0\0\0\0\0\0\0\0\0\0", 20) = 16 inotify_rm_watch(5, 3) = -1 EINVAL (Invalid argument) --8<---------------cut here---------------end--------------->8--- For a *failing* execution (armhf), we get: --8<---------------cut here---------------start------------->8--- lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 inotify_init() = 5 write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25 inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1 inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2 inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1 inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3 open("a", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6 lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 fstat64(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0 close(6) = 0 fstat64(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6 lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0 close(6) = 0 fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16 open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6 lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178938, f_bavail=34274467, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0 close(6) = 0 read(5, --8<---------------cut here---------------end--------------->8--- In both cases, reading from the inotify file descriptor (number 5) returns a notification for watch #3 (corresponding to ‘foo’), with mask IN_ATTRIB (value 4). However, the open("foo") call that immediately follows does *not* return ENOENT in the failing case: The file is still there. The kernel’s ‘vfs_unlink’ goes like this: fsnotify_link_count(target); /* IN_ATTRIB */ d_delete(dentry); /* fsnotify_nameremove → IN_DELETE for “.” */ So, ‘tail’ first receives the IN_ATTRIB notification corresponding to the link count decrease on ‘foo’; at that point, ‘foo’ is still available. And then, ‘tail’ should receive the IN_DELETE_SELF notification, at which point ‘foo’ would have been actually been unlinked. But in practice we don’t seem to be receiving IN_DELETE_SELF, even in the succeeding case. I think the problem happens when ‘tail’ opens ‘foo’ right in between of the two notifications: ‘foo’ is still there, and so ‘tail’ doesn’t report anything. Does that make sense? Thanks, Ludo’.
Pádraig Brady <P <at> draigBrady.com>
to control <at> debbugs.gnu.org
.
(Fri, 11 Sep 2015 18:20:02 GMT) Full text and rfc822 format available.bug-coreutils <at> gnu.org
:bug#21459
; Package coreutils
.
(Fri, 11 Sep 2015 20:35:02 GMT) Full text and rfc822 format available.Message #10 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Pádraig Brady <P <at> draigBrady.com> To: Ludovic Courtès <ludo <at> gnu.org>, 21459 <at> debbugs.gnu.org Cc: bug-guix <at> gnu.org Subject: Re: bug#21459: Race condition in tests/tail-2/assert.sh Date: Fri, 11 Sep 2015 21:34:25 +0100
On 11/09/15 17:23, Ludovic Courtès wrote: > Hello, > > We have observed intermittent failures of tests/tail-2/assert.sh > (Coreutils 8.24, libc 2.22), especially showing up on relatively slow > machines (armhf and mips64el.) > > The failure is with ‘tail --follow=name’, which, in inotify mode, would > fail to report that file ‘foo’ has been deleted. > > The strace of a correct execution (x86_64) is like this: > > --8<---------------cut here---------------start------------->8--- > lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 > lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 > inotify_init() = 5 > write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25 > inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1 > inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2 > inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1 > inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3 > open("a", O_RDONLY|O_NONBLOCK) = 6 > lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 > fstat(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 > fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0 > close(6) = 0 > fstat(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 > open("foo", O_RDONLY|O_NONBLOCK) = 6 > lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 > fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 > fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0 > close(6) = 0 > fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 > read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16 > open("foo", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or directory) > lstat("foo", 0x7ffee174c3b0) = -1 ENOENT (No such file or directory) > write(2, "tail: ", 6) = 6 > write(2, "foo", 3) = 3 > write(2, ": No such file or directory", 27) = 27 > write(2, "\n", 1) = 1 > close(4) = 0 > read(5, "\3\0\0\0\0\4\0\0\0\0\0\0\0\0\0\0", 20) = 16 > inotify_rm_watch(5, 3) = -1 EINVAL (Invalid argument) > --8<---------------cut here---------------end--------------->8--- > > For a *failing* execution (armhf), we get: > > --8<---------------cut here---------------start------------->8--- > lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 > lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 > inotify_init() = 5 > write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25 > inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1 > inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2 > inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1 > inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3 > open("a", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6 > lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 > fstat64(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 > fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0 > close(6) = 0 > fstat64(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 > open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6 > lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 > fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 > fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0 > close(6) = 0 > fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 > read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16 > open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6 > lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 > fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 > fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178938, f_bavail=34274467, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0 > close(6) = 0 > read(5, > --8<---------------cut here---------------end--------------->8--- > > In both cases, reading from the inotify file descriptor (number 5) > returns a notification for watch #3 (corresponding to ‘foo’), with mask > IN_ATTRIB (value 4). > > However, the open("foo") call that immediately follows does *not* return > ENOENT in the failing case: The file is still there. > > The kernel’s ‘vfs_unlink’ goes like this: > > fsnotify_link_count(target); /* IN_ATTRIB */ > d_delete(dentry); /* fsnotify_nameremove → IN_DELETE for “.” */ > > So, ‘tail’ first receives the IN_ATTRIB notification corresponding to > the link count decrease on ‘foo’; at that point, ‘foo’ is still > available. And then, ‘tail’ should receive the IN_DELETE_SELF > notification, at which point ‘foo’ would have been actually been > unlinked. But in practice we don’t seem to be receiving IN_DELETE_SELF, > even in the succeeding case. > > I think the problem happens when ‘tail’ opens ‘foo’ right in between of > the two notifications: ‘foo’ is still there, and so ‘tail’ doesn’t > report anything. > > Does that make sense? Thanks for looking into it in such detail. IN_DELETE_SELF is only sent when the inode is removed, not on the unlink itself (as tail has the file open). One of the awkward gotchas with inotify being inode oriented. I.E. I wonder is this a cache coherency issue, with tail referencing the stale inode in the open() before the dcache is updated. It'll be a while before I can look at this. thanks again, Pádraig.
bug-coreutils <at> gnu.org
:bug#21459
; Package coreutils
.
(Fri, 11 Sep 2015 20:35:04 GMT) Full text and rfc822 format available.ludo <at> gnu.org (Ludovic Courtès)
to control <at> debbugs.gnu.org
.
(Fri, 11 Sep 2015 21:11:01 GMT) Full text and rfc822 format available.Debbugs Internal Request <help-debbugs <at> gnu.org>
to internal_control <at> debbugs.gnu.org
.
(Sat, 31 Oct 2015 11:24:04 GMT) Full text and rfc822 format available.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.