Package: coreutils;
Reported by: ludo <at> gnu.org (Ludovic Courtès)
Date: Fri, 11 Sep 2015 16:24:02 UTC
Severity: normal
Merged with 21460
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: ludo <at> gnu.org (Ludovic Courtès) To: bug-coreutils <at> gnu.org Cc: bug-guix <at> gnu.org Subject: Race condition in tests/tail-2/assert.sh Date: Fri, 11 Sep 2015 18:23:31 +0200
Hello, We have observed intermittent failures of tests/tail-2/assert.sh (Coreutils 8.24, libc 2.22), especially showing up on relatively slow machines (armhf and mips64el.) The failure is with ‘tail --follow=name’, which, in inotify mode, would fail to report that file ‘foo’ has been deleted. The strace of a correct execution (x86_64) is like this: --8<---------------cut here---------------start------------->8--- lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 inotify_init() = 5 write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25 inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1 inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2 inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1 inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3 open("a", O_RDONLY|O_NONBLOCK) = 6 lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 fstat(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0 close(6) = 0 fstat(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 open("foo", O_RDONLY|O_NONBLOCK) = 6 lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0 close(6) = 0 fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16 open("foo", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or directory) lstat("foo", 0x7ffee174c3b0) = -1 ENOENT (No such file or directory) write(2, "tail: ", 6) = 6 write(2, "foo", 3) = 3 write(2, ": No such file or directory", 27) = 27 write(2, "\n", 1) = 1 close(4) = 0 read(5, "\3\0\0\0\0\4\0\0\0\0\0\0\0\0\0\0", 20) = 16 inotify_rm_watch(5, 3) = -1 EINVAL (Invalid argument) --8<---------------cut here---------------end--------------->8--- For a *failing* execution (armhf), we get: --8<---------------cut here---------------start------------->8--- lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 inotify_init() = 5 write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25 inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1 inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2 inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1 inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3 open("a", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6 lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 fstat64(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0 close(6) = 0 fstat64(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0 open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6 lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0 close(6) = 0 fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16 open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6 lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178938, f_bavail=34274467, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0 close(6) = 0 read(5, --8<---------------cut here---------------end--------------->8--- In both cases, reading from the inotify file descriptor (number 5) returns a notification for watch #3 (corresponding to ‘foo’), with mask IN_ATTRIB (value 4). However, the open("foo") call that immediately follows does *not* return ENOENT in the failing case: The file is still there. The kernel’s ‘vfs_unlink’ goes like this: fsnotify_link_count(target); /* IN_ATTRIB */ d_delete(dentry); /* fsnotify_nameremove → IN_DELETE for “.” */ So, ‘tail’ first receives the IN_ATTRIB notification corresponding to the link count decrease on ‘foo’; at that point, ‘foo’ is still available. And then, ‘tail’ should receive the IN_DELETE_SELF notification, at which point ‘foo’ would have been actually been unlinked. But in practice we don’t seem to be receiving IN_DELETE_SELF, even in the succeeding case. I think the problem happens when ‘tail’ opens ‘foo’ right in between of the two notifications: ‘foo’ is still there, and so ‘tail’ doesn’t report anything. Does that make sense? Thanks, Ludo’.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.