From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 11 12:23:47 2015
From: ludo@gnu.org (Ludovic Courtès)
To: bug-coreutils@gnu.org
Cc: bug-guix@gnu.org
Subject: Race condition in tests/tail-2/assert.sh
Date: Fri, 11 Sep 2015 18:23:31 +0200
Message-ID: <87wpvw2ad8.fsf@gnu.org>

Hello,

We have observed intermittent failures of tests/tail-2/assert.sh
(Coreutils 8.24, libc 2.22), especially showing up on relatively slow
machines (armhf and mips64el).

The failure is with ‘tail --follow=name’, which, in inotify mode, would
fail to report that file ‘foo’ has been deleted.

The strace of a correct execution (x86_64) is like this:

--8<---------------cut here---------------start------------->8---
lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
inotify_init() = 5
write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3
open("a", O_RDONLY|O_NONBLOCK) = 6
lstat("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstat(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0
close(6) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
open("foo", O_RDONLY|O_NONBLOCK) = 6
lstat("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstatfs(6, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=16738210, f_bfree=1219533, f_bavail=367617, f_files=4259840, f_ffree=2406492, f_fsid={1622537548, 1497272261}, f_namelen=255, f_frsize=4096}) = 0
close(6) = 0
fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16
open("foo", O_RDONLY|O_NONBLOCK) = -1 ENOENT (No such file or directory)
lstat("foo", 0x7ffee174c3b0) = -1 ENOENT (No such file or directory)
write(2, "tail: ", 6) = 6
write(2, "foo", 3) = 3
write(2, ": No such file or directory", 27) = 27
write(2, "\n", 1) = 1
close(4) = 0
read(5, "\3\0\0\0\0\4\0\0\0\0\0\0\0\0\0\0", 20) = 16
inotify_rm_watch(5, 3) = -1 EINVAL (Invalid argument)
--8<---------------cut here---------------end--------------->8---

For a *failing* execution (armhf), we get:

--8<---------------cut here---------------start------------->8---
lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
inotify_init() = 5
write(1, "==> a <==\nx\n\n==> foo <==\n", 25) = 25
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "a", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 2
inotify_add_watch(5, ".", IN_ATTRIB|IN_MOVED_TO|IN_CREATE) = 1
inotify_add_watch(5, "foo", IN_MODIFY|IN_ATTRIB|IN_DELETE_SELF|IN_MOVE_SELF) = 3
open("a", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
lstat64("a", {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
close(6) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=2, ...}) = 0
open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178942, f_bavail=34274471, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
close(6) = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
read(5, "\3\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0", 20) = 16
open("foo", O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 6
lstat64("foo", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstat64(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
fstatfs64(6, 88, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=37378337, f_bfree=36178938, f_bavail=34274467, f_files=9502720, f_ffree=9405759, f_fsid={1592050960, 1812457140}, f_namelen=255, f_frsize=4096, f_flags=4128}) = 0
close(6) = 0
read(5,
--8<---------------cut here---------------end--------------->8---

In both cases, reading from the inotify file descriptor (number 5)
returns a notification for watch #3 (corresponding to ‘foo’), with mask
IN_ATTRIB (value 4).  However, the open("foo") call that immediately
follows does *not* return ENOENT in the failing case: the file is still
there.

The kernel’s ‘vfs_unlink’ goes like this:

  fsnotify_link_count(target);  /* IN_ATTRIB */
  d_delete(dentry);  /* fsnotify_nameremove → IN_DELETE for “.” */

So, ‘tail’ first receives the IN_ATTRIB notification corresponding to
the link count decrease on ‘foo’; at that point, ‘foo’ is still
available.  And then, ‘tail’ should receive the IN_DELETE_SELF
notification, at which point ‘foo’ would have actually been unlinked.
But in practice we don’t seem to be receiving IN_DELETE_SELF, even in
the succeeding case.

I think the problem happens when ‘tail’ opens ‘foo’ right in between of
the two notifications: ‘foo’ is still there, and so ‘tail’ doesn’t
report anything.

Does that make sense?

Thanks,
Ludo’.

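The 16-byte buffers that read(5, ...) returns in the traces above can be
decoded by hand to see which watch and which event mask they carry.  The
following is only a minimal sketch, not code from ‘tail’: it assumes the
traces come from little-endian machines and that each buffer holds one
‘struct inotify_event’ header with no name attached (wd, mask, cookie,
len, four bytes each).  The names ‘decoded_event’, ‘le32’,
‘decode_event’, ‘first_event’ and ‘second_event’ are hypothetical and
exist for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/inotify.h>

/* Hypothetical helper type: the four fixed fields of an inotify event.  */
struct decoded_event
{
  int32_t wd;       /* watch descriptor */
  uint32_t mask;    /* event mask, e.g. IN_ATTRIB */
  uint32_t cookie;  /* only meaningful for rename events */
  uint32_t len;     /* length of the optional name that follows */
};

/* Read a 32-bit little-endian value from BUF (assumption: the traced
   machines are little-endian).  */
uint32_t
le32 (const unsigned char *buf)
{
  return (uint32_t) buf[0]
         | (uint32_t) buf[1] << 8
         | (uint32_t) buf[2] << 16
         | (uint32_t) buf[3] << 24;
}

/* Decode the 16-byte event header stored at BUF.  */
struct decoded_event
decode_event (const unsigned char *buf)
{
  struct decoded_event ev;

  assert (buf != NULL);
  ev.wd = (int32_t) le32 (buf);
  ev.mask = le32 (buf + 4);
  ev.cookie = le32 (buf + 8);
  ev.len = le32 (buf + 12);
  return ev;
}

/* First buffer read from descriptor 5 (same bytes in both traces).  */
const unsigned char first_event[16] = { 3, 0, 0, 0,  4, 0, 0, 0 };

/* Second buffer read from descriptor 5 in the successful trace.  */
const unsigned char second_event[16] = { 3, 0, 0, 0,  0, 4, 0, 0 };
```

Under these assumptions the first buffer decodes to watch 3 with mask
IN_ATTRIB (0x4), and the second buffer of the successful trace decodes
to watch 3 with mask 0x400, which <sys/inotify.h> defines as
IN_DELETE_SELF.  Calling decode_event () on either array from a small
main () and printing the fields is enough to check a trace by hand.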
From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 11 13:18:46 2015 Received: (at submit) by debbugs.gnu.org; 11 Sep 2015 17:18:46 +0000 Received: from localhost ([127.0.0.1]:57217 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaRyH-0006Wa-Jz for submit@debbugs.gnu.org; Fri, 11 Sep 2015 13:18:45 -0400 Received: from eggs.gnu.org ([208.118.235.92]:42032) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaRyG-0006WS-1X for submit@debbugs.gnu.org; Fri, 11 Sep 2015 13:18:44 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaRyE-0000FH-VX for submit@debbugs.gnu.org; Fri, 11 Sep 2015 13:18:43 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:38557) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaRyE-0000FD-SE for submit@debbugs.gnu.org; Fri, 11 Sep 2015 13:18:42 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52029) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaRyD-0001s5-S2 for bug-guix@gnu.org; Fri, 11 Sep 2015 13:18:42 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaRyA-0000Cq-NA for bug-guix@gnu.org; Fri, 11 Sep 2015 13:18:41 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:49510) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaRyA-0000BF-Ho; Fri, 11 Sep 2015 13:18:38 -0400 Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 15A1F1601BA; Fri, 11 Sep 2015 10:18:37 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id sTXPccxEaj5X; Fri, 11 Sep 2015 10:18:36 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu 
(Postfix) with ESMTP id 619E9161066; Fri, 11 Sep 2015 10:18:36 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id t0br2SiZprPJ; Fri, 11 Sep 2015 10:18:36 -0700 (PDT) Received: from [192.168.1.9] (pool-100-32-155-148.lsanca.fios.verizon.net [100.32.155.148]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 402911601BA; Fri, 11 Sep 2015 10:18:36 -0700 (PDT) Subject: Re: bug#21460: Race condition in tests/tail-2/assert.sh To: =?UTF-8?Q?Ludovic_Court=c3=a8s?= , 21460@debbugs.gnu.org References: <87wpvw2ad8.fsf@gnu.org> From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: <55F30CEC.7060102@cs.ucla.edu> Date: Fri, 11 Sep 2015 10:18:36 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <87wpvw2ad8.fsf@gnu.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: submit Cc: bug-guix@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.0 (----) Ludovic Court=C3=A8s wrote: > I think the problem happens when =E2=80=98tail=E2=80=99 opens =E2=80=98= foo=E2=80=99 right in between of > the two notifications: =E2=80=98foo=E2=80=99 is still there, and so =E2= =80=98tail=E2=80=99 doesn=E2=80=99t > report anything. > > Does that make sense? 
Yes, though if the link count is indeed zero, I'm surprised that 'tail' c= an open=20 the file -- that sounds like a bug in the kernel. If there is such a kernel bug and 'tail' can open a file with a link coun= t of=20 zero, that would explain why 'tail' does not immediately receive an=20 IN_DELETE_SELF notification: after all, the file is open (by 'tail' itsel= f) so=20 it should not be deleted even if it has a link count of zero. If so, it = appears=20 that there's another kernel bug later: when 'tail' closes the file's last= file=20 descriptor, the file should be deleted and an IN_DELETE_SELF notification= should=20 be sent to 'tail'. From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 11 14:19:23 2015 Received: (at control) by debbugs.gnu.org; 11 Sep 2015 18:19:23 +0000 Received: from localhost ([127.0.0.1]:57282 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaSux-0002fT-7O for submit@debbugs.gnu.org; Fri, 11 Sep 2015 14:19:23 -0400 Received: from mail2.vodafone.ie ([213.233.128.44]:46058) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaSuv-0002fJ-5D for control@debbugs.gnu.org; Fri, 11 Sep 2015 14:19:21 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AhsLALEa81VtTIH4/2dsb2JhbABdglFSgRcIgWWBFU6/M4JONYErTAEBAQEBAYELQQWDfQoqVA0CBRYLAgsDAgECATkGAgIIDQgBAYguAahLj2WFb45XgSKEVo11gUMFlVaWDZFmY4FKAQEIAgGCKz2Db4ZjAgEC Received: from unknown (HELO localhost.localdomain) ([109.76.129.248]) by mail2.vodafone.ie with ESMTP; 11 Sep 2015 19:19:19 +0100 Message-ID: <55F31B26.8090406@draigBrady.com> Date: Fri, 11 Sep 2015 19:19:18 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: GNU bug tracker automated control server Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Spam-Score: 2.0 (++) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has 
identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: forcemerge 21459 21460 stop [...] Content analysis details: (2.0 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at http://www.dnswl.org/, no trust [213.233.128.44 listed in list.dnswl.org] 1.8 MISSING_SUBJECT Missing Subject: header 0.2 NO_SUBJECT Extra score for no subject X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 2.0 (++) X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: forcemerge 21459 21460 stop [...] 
Content analysis details: (2.0 points, 10.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at http://www.dnswl.org/, no trust [213.233.128.44 listed in list.dnswl.org] 1.8 MISSING_SUBJECT Missing Subject: header 0.2 NO_SUBJECT Extra score for no subject forcemerge 21459 21460 stop From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 11 16:55:13 2015 Received: (at submit) by debbugs.gnu.org; 11 Sep 2015 20:55:13 +0000 Received: from localhost ([127.0.0.1]:57421 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaVLl-0002Xc-0N for submit@debbugs.gnu.org; Fri, 11 Sep 2015 16:55:13 -0400 Received: from eggs.gnu.org ([208.118.235.92]:36058) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaVLi-0002XO-PD for submit@debbugs.gnu.org; Fri, 11 Sep 2015 16:55:11 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaVLh-0006dY-Hn for submit@debbugs.gnu.org; Fri, 11 Sep 2015 16:55:10 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:48998) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaVLh-0006dQ-FC for submit@debbugs.gnu.org; Fri, 11 Sep 2015 16:55:09 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:46053) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaVLg-0002ia-Fy for bug-guix@gnu.org; Fri, 11 Sep 2015 16:55:09 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaVLc-0006TJ-FV for bug-guix@gnu.org; Fri, 11 Sep 2015 16:55:08 -0400 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:48734) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaVLc-0006TB-C4; Fri, 11 Sep 2015 
16:55:04 -0400 Received: from reverse-83.fdn.fr ([80.67.176.83]:33120 helo=pluto) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1ZaVLb-0000dP-Ia; Fri, 11 Sep 2015 16:55:03 -0400 From: ludo@gnu.org (Ludovic =?utf-8?Q?Court=C3=A8s?=) To: Paul Eggert Subject: Re: bug#21460: Race condition in tests/tail-2/assert.sh References: <87wpvw2ad8.fsf@gnu.org> <55F30CEC.7060102@cs.ucla.edu> X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 25 Fructidor an 223 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x3D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-unknown-linux-gnu Date: Fri, 11 Sep 2015 22:55:01 +0200 In-Reply-To: <55F30CEC.7060102@cs.ucla.edu> (Paul Eggert's message of "Fri, 11 Sep 2015 10:18:36 -0700") Message-ID: <87a8ssad7e.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). 
X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: submit Cc: 21460@debbugs.gnu.org, bug-guix@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Paul Eggert skribis: > Ludovic Court=C3=A8s wrote: >> I think the problem happens when =E2=80=98tail=E2=80=99 opens =E2=80=98f= oo=E2=80=99 right in between of >> the two notifications: =E2=80=98foo=E2=80=99 is still there, and so =E2= =80=98tail=E2=80=99 doesn=E2=80=99t >> report anything. >> >> Does that make sense? > > Yes, though if the link count is indeed zero, I'm surprised that > 'tail' can open the file -- that sounds like a bug in the kernel. Attached is a reproducer; just run it in a loop for a couple of seconds: --8<---------------cut here---------------start------------->8--- $ while ./a.out ; do : ; done funny, errno =3D Success, nlink =3D 0 Aborted (core dumped) --8<---------------cut here---------------end--------------->8--- I=E2=80=99m not sure if that=E2=80=99s a kernel bug. Strictly speaking, in= otify works as expected: we get a notification for nlink--, which doesn=E2=80=99t mean = the file has vanished. The conclusion for =E2=80=98tail=E2=80=99 would be to wait for the IN_DELET= E_SELF event before considering the file to be gone. WDYT? (That =E2=80=98inotify_rm_watch=E2=80=99 returns EINVAL *is* a bug IMO, but= not worrisome.) Thanks, Ludo=E2=80=99. 
--=-=-= Content-Type: text/plain Content-Disposition: inline; filename=inotify.c Content-Description: the inotify race #define _GNU_SOURCE 1 #include #include #include #include #include #include #include #include #include int main () { int file = creat ("foo", S_IRUSR | S_IWUSR); assert_perror (errno); close (file); int notifications = inotify_init (); assert_perror (errno); int watch = inotify_add_watch (notifications, "foo", IN_MODIFY | IN_ATTRIB | IN_DELETE_SELF | IN_MOVE_SELF); assert_perror (errno); if (fork () == 0) { unlink ("foo"); assert_perror (errno); exit (EXIT_SUCCESS); } struct inotify_event event; ssize_t count = read (notifications, &event, sizeof event); assert (count == sizeof event); assert (event.mask == IN_ATTRIB); struct stat st; stat ("foo", &st); if (errno != ENOENT) { printf ("funny, errno = %m, nlink = %li\n", st.st_nlink); abort (); } count = read (notifications, &event, sizeof event); assert (count == sizeof event); assert (event.mask == IN_DELETE_SELF); /* Bug #2: this returns EINVAL for no good reason. 
*/ /* inotify_rm_watch (notifications, watch); */ /* assert_perror (errno); */ return EXIT_SUCCESS; } --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 11 17:11:00 2015 Received: (at request) by debbugs.gnu.org; 11 Sep 2015 21:11:00 +0000 Received: from localhost ([127.0.0.1]:57435 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaVb1-0002u2-Uk for submit@debbugs.gnu.org; Fri, 11 Sep 2015 17:11:00 -0400 Received: from eggs.gnu.org ([208.118.235.92]:38699) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaVb0-0002tu-0D for request@debbugs.gnu.org; Fri, 11 Sep 2015 17:10:58 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaVaw-0005By-EB for request@debbugs.gnu.org; Fri, 11 Sep 2015 17:10:57 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,T_RP_MATCHES_RCVD autolearn=disabled version=3.3.2 Received: from fencepost.gnu.org ([2001:4830:134:3::e]:48903) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaVaw-0005Bt-BH for request@debbugs.gnu.org; Fri, 11 Sep 2015 17:10:54 -0400 Received: from reverse-83.fdn.fr ([80.67.176.83]:33994 helo=pluto) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128) (Exim 4.82) (envelope-from ) id 1ZaVav-0007QJ-Ks for request@debbugs.gnu.org; Fri, 11 Sep 2015 17:10:54 -0400 From: ludo@gnu.org (Ludovic =?utf-8?Q?Court=C3=A8s?=) To: request@debbugs.gnu.org Subject: merge X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 25 Fructidor an 223 de la =?utf-8?Q?R=C3=A9volution?= X-PGP-Key-ID: 0x3D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-unknown-linux-gnu Date: Fri, 11 Sep 2015 23:10:51 +0200 Message-ID: <87y4gc8xwk.fsf@gnu.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 
Content-Type: text/plain X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::e X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: request X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) merge 21459 21460 thanks (I wonder why 2 reports for Coreutils and 0 for Guix were created.) From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 11 18:50:12 2015 Received: (at submit) by debbugs.gnu.org; 11 Sep 2015 22:50:12 +0000 Received: from localhost ([127.0.0.1]:57476 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaX91-0006kq-Gl for submit@debbugs.gnu.org; Fri, 11 Sep 2015 18:50:11 -0400 Received: from eggs.gnu.org ([208.118.235.92]:34311) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaX8z-0006kh-7S for submit@debbugs.gnu.org; Fri, 11 Sep 2015 18:50:09 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaX8x-00009k-VI for submit@debbugs.gnu.org; Fri, 11 Sep 2015 18:50:08 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:58638) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaX8x-00009U-T0 for submit@debbugs.gnu.org; Fri, 11 Sep 2015 18:50:07 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44293) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaX8w-0007c6-UT for bug-guix@gnu.org; Fri, 11 Sep 2015 18:50:07 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaX8r-0008NF-VP for bug-guix@gnu.org; Fri, 11 Sep 2015 
18:50:06 -0400 Received: from mail2.vodafone.ie ([213.233.128.44]:29014) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaX8r-0008IY-HN; Fri, 11 Sep 2015 18:50:01 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AroFACha81VtTIH4/2dsb2JhbABdDoJDUlRpglxOsH+LCIYBAoE2TAEBAQEBAYELhCMBAQEDASMECwFGBQsLDQsCAgUWCwICCQMCAQIBRQYBDAEFAgEBiCIMAbdghW+OKQEBAQEGAQEBAQEdgSKEVoV4hQ0HgmmBQwEEhyYLhUaBLocxhQqKAocBkWZjg0M/PTOKHQEBAQ Received: from unknown (HELO localhost.localdomain) ([109.76.129.248]) by mail2.vodafone.ie with ESMTP; 11 Sep 2015 23:49:58 +0100 Message-ID: <55F35A96.2090906@draigBrady.com> Date: Fri, 11 Sep 2015 23:49:58 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: =?UTF-8?B?THVkb3ZpYyBDb3VydMOocw==?= , Paul Eggert Subject: Re: bug#21460: Race condition in tests/tail-2/assert.sh References: <87wpvw2ad8.fsf@gnu.org> <55F30CEC.7060102@cs.ucla.edu> <87a8ssad7e.fsf@gnu.org> In-Reply-To: <87a8ssad7e.fsf@gnu.org> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: submit Cc: 21460@debbugs.gnu.org, bug-guix@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) On 11/09/15 21:55, Ludovic Courtès wrote: > Paul Eggert skribis: > >> Ludovic Courtès wrote: >>> I think the problem happens when ‘tail’ opens ‘foo’ right in between of >>> the two notifications: ‘foo’ is still there, and so ‘tail’ doesn’t >>> report anything. 
>>> >>> Does that make sense? >> >> Yes, though if the link count is indeed zero, I'm surprised that >> 'tail' can open the file -- that sounds like a bug in the kernel. > > Attached is a reproducer; just run it in a loop for a couple of seconds: > > --8<---------------cut here---------------start------------->8--- > $ while ./a.out ; do : ; done > funny, errno = Success, nlink = 0 > Aborted (core dumped) > --8<---------------cut here---------------end--------------->8--- > > I’m not sure if that’s a kernel bug. Strictly speaking, inotify works > as expected: we get a notification for nlink--, which doesn’t mean the > file has vanished. Interesting. It does seem that the IN_ATTRIB is sent before the st_nlink-- takes effect? That could be a bug. Or it could be a dcache coherency issue where the name still references the st_nlink==0 inode. Note recheck() just open() and close() the file in this case, but since it doesn't close() the original fd, then there will be no IN_DELETE_SELF event. If the above kernel behavior can be explained and is acceptable, I suppose we could augment recheck() with something like: diff --git a/src/tail.c b/src/tail.c index f916d74..e9d5337 100644 --- a/src/tail.c +++ b/src/tail.c @@ -1046,6 +1046,18 @@ recheck (struct File_spec *f, bool blocking) close_fd (f->fd, pretty_name (f)); } + else if (new_stats.st_nlink == 0) /* XXX: what about multi-linked files. */ + { + /* It was seen on Linux that a file could be opened + even though unlinked as the directory entry (cache) + is updated after the IN_ATTRIB is sent for the nlink--. */ + + error (0, f->errnum, _("%s has become inaccessible"), + quote (pretty_name (f))); + + close_fd (fd, pretty_name (f)); + close_fd (f->fd, pretty_name (f)); + f->fd = -1; else { /* No changes detected, so close new fd. */ > The conclusion for ‘tail’ would be to wait for the IN_DELETE_SELF event > before considering the file to be gone. WDYT? 
As mentioned above, tail references the file until it can't open it, so the IN_DELETE_SELF is only generated upon the close_fd(f->fd) above. thanks, Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 11 19:48:47 2015 Received: (at submit) by debbugs.gnu.org; 11 Sep 2015 23:48:47 +0000 Received: from localhost ([127.0.0.1]:57496 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaY3i-000847-PE for submit@debbugs.gnu.org; Fri, 11 Sep 2015 19:48:47 -0400 Received: from eggs.gnu.org ([208.118.235.92]:44466) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaY3h-000840-4i for submit@debbugs.gnu.org; Fri, 11 Sep 2015 19:48:45 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaY3f-0003JY-L1 for submit@debbugs.gnu.org; Fri, 11 Sep 2015 19:48:44 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:42303) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaY3f-0003JQ-Hd for submit@debbugs.gnu.org; Fri, 11 Sep 2015 19:48:43 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:54459) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaY3e-0000GT-7O for bug-guix@gnu.org; Fri, 11 Sep 2015 19:48:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaY3b-0003Hn-03 for bug-guix@gnu.org; Fri, 11 Sep 2015 19:48:42 -0400 Received: from mail2.vodafone.ie ([213.233.128.44]:54957) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaY3a-0003Gj-R2; Fri, 11 Sep 2015 19:48:38 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: 
AjMGABNn81VtTIH4/2dsb2JhbABdDoJDUh81aYJcTrB/iwYKhXkCgTdMAQEBAQEBgQuEJAEBBAECIAQLAUYQCw0LAgIFIQICDwIXLwYBDAEFAgEBiC4BCLdIhW+OJwEBAQEBBQEBAQEBAQEbgSKEVoV4hQ0HgmmBQwWHJguFRoEuhzGFCoUOhHSHAZFmY4NDPz0zAYocAQEB Received: from unknown (HELO localhost.localdomain) ([109.76.129.248]) by mail2.vodafone.ie with ESMTP; 12 Sep 2015 00:48:36 +0100 Message-ID: <55F36854.90306@draigBrady.com> Date: Sat, 12 Sep 2015 00:48:36 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: =?UTF-8?B?THVkb3ZpYyBDb3VydMOocw==?= , Paul Eggert Subject: Re: bug#21460: Race condition in tests/tail-2/assert.sh References: <87wpvw2ad8.fsf@gnu.org> <55F30CEC.7060102@cs.ucla.edu> <87a8ssad7e.fsf@gnu.org> <55F35A96.2090906@draigBrady.com> In-Reply-To: <55F35A96.2090906@draigBrady.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: submit Cc: 21460@debbugs.gnu.org, Assaf Gordon , bug-guix@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) On 11/09/15 23:49, Pádraig Brady wrote: > On 11/09/15 21:55, Ludovic Courtès wrote: >> Paul Eggert skribis: >> >>> Ludovic Courtès wrote: >>>> I think the problem happens when ‘tail’ opens ‘foo’ right in between of >>>> the two notifications: ‘foo’ is still there, and so ‘tail’ doesn’t >>>> report anything. >>>> >>>> Does that make sense? 
>>> >>> Yes, though if the link count is indeed zero, I'm surprised that >>> 'tail' can open the file -- that sounds like a bug in the kernel. >> >> Attached is a reproducer; just run it in a loop for a couple of seconds: >> >> --8<---------------cut here---------------start------------->8--- >> $ while ./a.out ; do : ; done >> funny, errno = Success, nlink = 0 >> Aborted (core dumped) >> --8<---------------cut here---------------end--------------->8--- >> >> I’m not sure if that’s a kernel bug. Strictly speaking, inotify works >> as expected: we get a notification for nlink--, which doesn’t mean the >> file has vanished. > > Interesting. It does seem that the IN_ATTRIB is sent before the st_nlink-- > takes effect? That could be a bug. Or it could be a dcache coherency > issue where the name still references the st_nlink==0 inode. > > Note recheck() just open() and close() the file in this case, > but since it doesn't close() the original fd, then there will be > no IN_DELETE_SELF event. > > If the above kernel behavior can be explained and is acceptable, > I suppose we could augment recheck() with something like: > > diff --git a/src/tail.c b/src/tail.c > index f916d74..e9d5337 100644 > --- a/src/tail.c > +++ b/src/tail.c > @@ -1046,6 +1046,18 @@ recheck (struct File_spec *f, bool blocking) > close_fd (f->fd, pretty_name (f)); > > } > + else if (new_stats.st_nlink == 0) /* XXX: what about multi-linked files. */ > + { > + /* It was seen on Linux that a file could be opened > + even though unlinked as the directory entry (cache) > + is updated after the IN_ATTRIB is sent for the nlink--. */ > + > + error (0, f->errnum, _("%s has become inaccessible"), > + quote (pretty_name (f))); > + > + close_fd (fd, pretty_name (f)); > + close_fd (f->fd, pretty_name (f)); > + f->fd = -1; > + } > else > { > >> The conclusion for ‘tail’ would be to wait for the IN_DELETE_SELF event >> before considering the file to be gone. WDYT? 
> > As mentioned above, tail references the file until it can't open it, > so the IN_DELETE_SELF is only generated upon the close_fd(f->fd) above. Google reminded me of this! https://lists.gnu.org/archive/html/coreutils/2015-07/msg00015.html I.e., this is the same issue that Assaf noticed, and that I thought was restricted to older kernels. That has an alternate fix attached. cheers, Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 11 21:09:45 2015 Received: (at submit) by debbugs.gnu.org; 12 Sep 2015 01:09:45 +0000 Received: from localhost ([127.0.0.1]:57515 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaZK4-0001Sf-T9 for submit@debbugs.gnu.org; Fri, 11 Sep 2015 21:09:45 -0400 Received: from eggs.gnu.org ([208.118.235.92]:56926) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaZK3-0001SX-E8 for submit@debbugs.gnu.org; Fri, 11 Sep 2015 21:09:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaZK2-0004x8-Lf for submit@debbugs.gnu.org; Fri, 11 Sep 2015 21:09:43 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([208.118.235.17]:52950) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaZK2-0004x4-Ii for submit@debbugs.gnu.org; Fri, 11 Sep 2015 21:09:42 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:38691) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaZK1-0003AU-QJ for bug-guix@gnu.org; Fri, 11 Sep 2015 21:09:42 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaZJy-0004vB-L1 for bug-guix@gnu.org; Fri, 11 Sep 2015 21:09:41 -0400 Received: from zimbra.cs.ucla.edu ([131.179.128.68]:36280) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaZJy-0004tw-DF; Fri, 11 Sep 2015 21:09:38 -0400 Received:
from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id 9BEC216106A; Fri, 11 Sep 2015 18:09:36 -0700 (PDT) Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id O1J6EQRHQLt1; Fri, 11 Sep 2015 18:09:35 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by zimbra.cs.ucla.edu (Postfix) with ESMTP id B2DB716106D; Fri, 11 Sep 2015 18:09:35 -0700 (PDT) X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu Received: from zimbra.cs.ucla.edu ([127.0.0.1]) by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id YBc1T3y-biWQ; Fri, 11 Sep 2015 18:09:35 -0700 (PDT) Received: from [192.168.1.9] (pool-100-32-155-148.lsanca.fios.verizon.net [100.32.155.148]) by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 52BDC16106A; Fri, 11 Sep 2015 18:09:35 -0700 (PDT) Subject: Re: bug#21460: Race condition in tests/tail-2/assert.sh To: =?UTF-8?Q?P=c3=a1draig_Brady?= , =?UTF-8?Q?Ludovic_Court=c3=a8s?= References: <87wpvw2ad8.fsf@gnu.org> <55F30CEC.7060102@cs.ucla.edu> <87a8ssad7e.fsf@gnu.org> <55F35A96.2090906@draigBrady.com> From: Paul Eggert Organization: UCLA Computer Science Department Message-ID: <55F37B4E.6060600@cs.ucla.edu> Date: Fri, 11 Sep 2015 18:09:34 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <55F35A96.2090906@draigBrady.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 208.118.235.17 X-Spam-Score: -4.0 (----) X-Debbugs-Envelope-To: submit Cc: 21460@debbugs.gnu.org, bug-guix@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -4.0 (----) Pádraig Brady wrote: > + else if (new_stats.st_nlink == 0) /* XXX: what about multi-linked files. */ That comment was my thought exactly. It appears to be a kernel bug that really needs to get fixed in the kernel; there just doesn't seem to be a reliable workaround in user space. From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 11 22:22:22 2015 Received: (at submit) by debbugs.gnu.org; 12 Sep 2015 02:22:22 +0000 Received: from localhost ([127.0.0.1]:57551 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaaSM-0003Ak-Gz for submit@debbugs.gnu.org; Fri, 11 Sep 2015 22:22:22 -0400 Received: from eggs.gnu.org ([208.118.235.92]:37597) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZaaSK-0003Ac-Qm for submit@debbugs.gnu.org; Fri, 11 Sep 2015 22:22:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaaSJ-0001oG-VK for submit@debbugs.gnu.org; Fri, 11 Sep 2015 22:22:20 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_40 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:53541) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaaSJ-0001oC-T6 for submit@debbugs.gnu.org; Fri, 11 Sep 2015 22:22:19 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:47597) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaaSI-0005FW-KA for bug-guix@gnu.org; Fri, 11 Sep 2015 22:22:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZaaSF-0001na-Fh for bug-guix@gnu.org; Fri, 11 Sep 2015 22:22:18 -0400 Received: from mail1.vodafone.ie ([213.233.128.43]:57339) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZaaSF-0001nT-AA; Fri, 11 Sep 2015 22:22:15 -0400
X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AjcGABWM81VtTR2s/2dsb2JhbABdDoJDUh81aYFHgRVOvAYKhXkCgTlMAQEBAQEBgQuEJAEBBAECIA8BRhALDQsCAgUWCwICCQMCAQIBFi8GAQwBBwEBiC4BCLc6hW+OIwEBAQEBBQEBAQEBARyBIoRWhXiFDQeCaYFDBZVWhQqFDoR0hwGRZmOCERyBFj89MwGKHAEBAQ Received: from unknown (HELO localhost.localdomain) ([109.77.29.172]) by mail1.vodafone.ie with ESMTP; 12 Sep 2015 03:22:13 +0100 Message-ID: <55F38C54.8070306@draigBrady.com> Date: Sat, 12 Sep 2015 03:22:12 +0100 From: =?UTF-8?B?UMOhZHJhaWcgQnJhZHk=?= User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: Paul Eggert , =?UTF-8?B?THVkb3ZpYyBDb3VydMOocw==?= Subject: Re: bug#21460: Race condition in tests/tail-2/assert.sh References: <87wpvw2ad8.fsf@gnu.org> <55F30CEC.7060102@cs.ucla.edu> <87a8ssad7e.fsf@gnu.org> <55F35A96.2090906@draigBrady.com> <55F37B4E.6060600@cs.ucla.edu> In-Reply-To: <55F37B4E.6060600@cs.ucla.edu> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: submit Cc: 21460@debbugs.gnu.org, bug-guix@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) On 12/09/15 02:09, Paul Eggert wrote: > Pádraig Brady wrote: >> + else if (new_stats.st_nlink == 0) /* XXX: what about multi-linked files. */ > > That comment was my thought exactly. It appears to be a kernel bug that really > needs to get fixed in the kernel; there just doesn't seem to be a reliable > workaround in user space. 
I double checked with kernel guys, and Al Viro essentially said the inode and directory operations are independent. https://lkml.org/lkml/2015/9/11/790 So we probably need to handle the IN_DELETE event on the directory to cater for this race, as done in: https://lists.gnu.org/archive/html/coreutils/2015-07/msg00015.html cheers, Pádraig. From debbugs-submit-bounces@debbugs.gnu.org Fri Oct 02 11:50:43 2015 Received: (at submit) by debbugs.gnu.org; 2 Oct 2015 15:50:43 +0000 Received: from localhost ([127.0.0.1]:52327 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Zi2bZ-0001J5-Q6 for submit@debbugs.gnu.org; Fri, 02 Oct 2015 11:50:42 -0400 Received: from eggs.gnu.org ([208.118.235.92]:59577) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Zi2bW-0001Ih-T1 for submit@debbugs.gnu.org; Fri, 02 Oct 2015 11:50:39 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Zi2bP-0006sg-Av for submit@debbugs.gnu.org; Fri, 02 Oct 2015 11:50:33 -0400 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50 autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:41338) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zi2bP-0006ri-8w for submit@debbugs.gnu.org; Fri, 02 Oct 2015 11:50:31 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:41340) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zi2bN-0001Gk-Mo for bug-guix@gnu.org; Fri, 02 Oct 2015 11:50:31 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1Zi2bK-0006jc-5v for bug-guix@gnu.org; Fri, 02 Oct 2015 11:50:29 -0400 Received: from mail1.vodafone.ie ([213.233.128.43]:55168) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1Zi2bJ-0006j5-Ss; Fri, 02 Oct 2015 11:50:26 -0400 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: 
AuMHAMumDlZtTKRQ/2dsb2JhbABeDoJHUlRoBsAACoV2AQICgTVMAQEBAQEBgQuEJAEBAQQBAiAEUhALDQQDAQIBCRYLAgIJAwIBAgEWJwgGAQkDBgIBAYguAQMFtzCFb45WAQEBAQEBAQEBAQEBAQEBAQEBARmFeIV5hHwRB4JpgUMFjQWId4JMgkuFGIUFhwiLAIc+Y4IRHYEWPz0zAYl3AQEB Received: from unknown (HELO localhost.localdomain) ([109.76.164.80]) by mail1.vodafone.ie with ESMTP; 02 Oct 2015 16:50:22 +0100 Subject: Re: bug#21460: Race condition in tests/tail-2/assert.sh To: Paul Eggert , =?UTF-8?Q?Ludovic_Court=c3=a8s?= References: <87wpvw2ad8.fsf@gnu.org> <55F30CEC.7060102@cs.ucla.edu> <87a8ssad7e.fsf@gnu.org> <55F35A96.2090906@draigBrady.com> <55F37B4E.6060600@cs.ucla.edu> <55F38C54.8070306@draigBrady.com> From: =?UTF-8?Q?P=c3=a1draig_Brady?= Message-ID: <560EA7BE.8010305@draigBrady.com> Date: Fri, 2 Oct 2015 16:50:22 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <55F38C54.8070306@draigBrady.com> Content-Type: multipart/mixed; boundary="------------010202010309070009010007" X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2001:4830:134:3::11 X-Spam-Score: -5.0 (-----) X-Debbugs-Envelope-To: submit Cc: 21460-done@debbugs.gnu.org, bug-guix@gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -5.0 (-----) This is a multi-part message in MIME format. --------------010202010309070009010007 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit On 12/09/15 03:22, Pádraig Brady wrote: > On 12/09/15 02:09, Paul Eggert wrote: >> Pádraig Brady wrote: >>> + else if (new_stats.st_nlink == 0) /* XXX: what about multi-linked files. */ >> >> That comment was my thought exactly. 
It appears to be a kernel bug that really >> needs to get fixed in the kernel; there just doesn't seem to be a reliable >> workaround in user space. > > I double checked with kernel guys, and Al Viro > essentially said the inode and directory operations are independent. > https://lkml.org/lkml/2015/9/11/790 > > So we probably need to handle the IN_DELETE event > on the directory to cater for this race, as done in: > https://lists.gnu.org/archive/html/coreutils/2015-07/msg00015.html I'll apply the attached later. thanks, Pádraig --------------010202010309070009010007 Content-Type: text/x-patch; name="tail-unlink-notification-race.patch" Content-Transfer-Encoding: 8bit Content-Disposition: attachment; filename="tail-unlink-notification-race.patch" >From 07ca8a227e04f9fb7bb0b21968056a562b8c2f83 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?P=C3=A1draig=20Brady?= Date: Thu, 2 Jul 2015 08:41:25 +0100 Subject: [PATCH] tail: handle kernel dentry unlink race MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Avoid the intermittent loss of "... has become inaccessible" messages. That would cause tests/tail-2/assert.sh to fail sometimes, mainly on uniprocessor systems. * src/tail.c (tail_forever_inotify): Also monitor IN_DELETE events on the directory, to avoid a dentry unlink()..open() race, where the open() on the deleted file was seen to succeed after an unlink() and after a subsequent IN_ATTRIB was sent to tail. Note an IN_ATTRIB is sent on the monitored file to indicate the change in number of links, and we can't just use a decrease in the number of links to determine the file being unlinked, due to the possibility of the file having multiple links. Reported by Assaf Gordon and Ludovic Courtès.
Fixes http://bugs.gnu.org/21460 --- src/tail.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/src/tail.c b/src/tail.c index f916d74..7a34b0b 100644 --- a/src/tail.c +++ b/src/tail.c @@ -1429,8 +1429,8 @@ tail_forever_inotify (int wd, struct File_spec *f, size_t n_files, /* It's fine to add the same directory more than once. In that case the same watch descriptor is returned. */ f[i].parent_wd = inotify_add_watch (wd, dirlen ? f[i].name : ".", - (IN_CREATE | IN_MOVED_TO - | IN_ATTRIB)); + (IN_CREATE | IN_DELETE + | IN_MOVED_TO | IN_ATTRIB)); f[i].name[dirlen] = prev; @@ -1619,9 +1619,16 @@ tail_forever_inotify (int wd, struct File_spec *f, size_t n_files, fspec = &(f[j]); - /* Adding the same inode again will look up any existing wd. */ - int new_wd = inotify_add_watch (wd, f[j].name, inotify_wd_mask); - if (new_wd < 0) + int new_wd = -1; + bool deleting = !! (ev->mask & IN_DELETE); + + if (! deleting) + { + /* Adding the same inode again will look up any existing wd. */ + new_wd = inotify_add_watch (wd, f[j].name, inotify_wd_mask); + } + + if (! deleting && new_wd < 0) { if (errno == ENOSPC || errno == ENOMEM) { @@ -1641,7 +1648,7 @@ tail_forever_inotify (int wd, struct File_spec *f, size_t n_files, /* This will be false if only attributes of file change. */ bool new_watch = fspec->wd < 0 || new_wd != fspec->wd; - if (new_watch) + if (! deleting && new_watch) { if (0 <= fspec->wd) { @@ -1683,7 +1690,7 @@ tail_forever_inotify (int wd, struct File_spec *f, size_t n_files, if (! 
fspec) continue; - if (ev->mask & (IN_ATTRIB | IN_DELETE_SELF | IN_MOVE_SELF)) + if (ev->mask & (IN_ATTRIB | IN_DELETE | IN_DELETE_SELF | IN_MOVE_SELF)) { /* Note for IN_MOVE_SELF (the file we're watching has been clobbered via a rename) we leave the watch -- 2.5.0 --------------010202010309070009010007-- From unknown Tue Jun 17 22:23:12 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Sat, 31 Oct 2015 11:24:04 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. thanks # This fakemail brought to you by your local debbugs # administrator
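
The idea behind the patch above (watch the parent directory for IN_DELETE instead of waiting for an event on the file itself) can be illustrated with a small standalone C fragment. This is only a minimal sketch under stated assumptions, not the code in src/tail.c: the function names dir_events_report_delete and unlink_is_reported_on_directory, the /tmp directory template, the file name "foo" and the 2-second poll timeout are all hypothetical choices made for the example. It assumes Linux with inotify available.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <poll.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Hypothetical helper, not from tail.c: scan LEN bytes of inotify
   events in BUF and report whether any is IN_DELETE for entry NAME.  */
bool
dir_events_report_delete (const char *buf, size_t len, const char *name)
{
  size_t off = 0;
  while (off + sizeof (struct inotify_event) <= len)
    {
      struct inotify_event ev;
      memcpy (&ev, buf + off, sizeof ev);   /* Avoid unaligned access.  */
      if (off + sizeof ev + ev.len > len)
        break;                              /* Truncated event.  */
      const char *ev_name = buf + off + sizeof ev;
      if ((ev.mask & IN_DELETE) && ev.len > 0
          && strcmp (ev_name, name) == 0)
        return true;
      off += sizeof ev + ev.len;
    }
  return false;
}

/* Hypothetical demo: create a file in a private directory, unlink it
   while still holding it open (as tail does), and check whether a
   watch on the directory reports IN_DELETE for it.
   Return 1 if reported, 0 if not, -1 on a setup failure.  */
int
unlink_is_reported_on_directory (void)
{
  char dir[] = "/tmp/inotify-demo-XXXXXX";
  char path[sizeof dir + sizeof "/foo"];
  char buf[4096];
  struct pollfd pfd;
  int result = -1;

  if (! mkdtemp (dir))
    return -1;
  snprintf (path, sizeof path, "%s/foo", dir);

  int ifd = inotify_init1 (IN_CLOEXEC);
  int fd = open (path, O_CREAT | O_WRONLY, 0600);
  if (0 <= ifd && 0 <= fd
      && 0 <= inotify_add_watch (ifd, dir, IN_DELETE)
      && unlink (path) == 0)
    {
      /* FD is deliberately still open here, so no IN_DELETE_SELF
         would be generated for the file itself yet.  */
      pfd.fd = ifd;
      pfd.events = POLLIN;
      result = 0;
      if (0 < poll (&pfd, 1, 2000))
        {
          ssize_t n = read (ifd, buf, sizeof buf);
          if (0 < n && dir_events_report_delete (buf, (size_t) n, "foo"))
            result = 1;
        }
    }

  if (0 <= fd)
    close (fd);
  if (0 <= ifd)
    close (ifd);
  unlink (path);                /* Fails harmlessly if already removed.  */
  rmdir (dir);
  return result;
}
```

The demo keeps the descriptor open across the unlink() on purpose, since that mirrors the situation discussed in this thread: the directory-level event does not depend on the last reference to the inode being dropped, which is why it seems a more dependable signal than IN_ATTRIB plus a re-open of the name.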