From unknown Fri Sep 05 08:41:29 2025
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.509 (Entity 5.509)
Content-Type: text/plain; charset=utf-8
From: bug#16361 <16361@debbugs.gnu.org>
To: bug#16361 <16361@debbugs.gnu.org>
Subject: Status: [wishlist] improve freshness checking in compile cache
Reply-To: bug#16361 <16361@debbugs.gnu.org>
Date: Fri, 05 Sep 2025 15:41:29 +0000

retitle 16361 [wishlist] improve freshness checking in compile cache
reassign 16361 guile
submitter 16361 Zefram
severity 16361 wishlist
tag 16361 notabug wontfix
thanks

From debbugs-submit-bounces@debbugs.gnu.org Sun Jan 05 18:44:24 2014
Received: (at submit) by debbugs.gnu.org; 5 Jan 2014 23:44:24 +0000
Received: from localhost ([127.0.0.1]:37207 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1VzxMl-0004zG-Qv for submit@debbugs.gnu.org; Sun, 05 Jan 2014 18:44:24 -0500
Received: from eggs.gnu.org ([208.118.235.92]:46282) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1Vzwob-0003zb-TD for submit@debbugs.gnu.org; Sun, 05 Jan 2014 18:09:06 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VzwoT-0002yP-Au for submit@debbugs.gnu.org; Sun, 05 Jan 2014 18:09:05 -0500
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org
X-Spam-Level:
X-Spam-Status: No, score=-0.0 required=5.0 tests=BAYES_40 autolearn=disabled version=3.3.2
Received: from lists.gnu.org ([2001:4830:134:3::11]:59515) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VzwoT-0002yL-7Z for submit@debbugs.gnu.org; Sun, 05 Jan 2014 18:08:57 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:47999) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VzwoN-0005IV-7S for bug-guile@gnu.org; Sun, 05 Jan 2014 18:08:57 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1VzwoH-0002y3-8d for
    bug-guile@gnu.org; Sun, 05 Jan 2014 18:08:51 -0500
Received: from river.fysh.org ([5.135.154.127]:53153) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1VzwoH-0002xz-1x for bug-guile@gnu.org; Sun, 05 Jan 2014 18:08:45 -0500
Received: from zefram by river.fysh.org with local (Exim 4.80 #2 (Debian)) id 1VzwoD-0000U7-EV; Sun, 05 Jan 2014 23:08:41 +0000
Date: Sun, 5 Jan 2014 23:08:41 +0000
From: Zefram
To: bug-guile@gnu.org
Subject: compile cache confused about file identity
Message-ID: <20140105230841.GF30283@fysh.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value).
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -4.0 (----)
X-Debbugs-Envelope-To: submit
X-Mailman-Approved-At: Sun, 05 Jan 2014 18:44:19 -0500
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -4.0 (----)

The automatic cache of compiled versions of scripts in guile-2.0.9
identifies scripts mainly by name, and partially by mtime. This is not
actually sufficient: it is easily misled by a pathname that refers to
different files at different times. Test case:

$ echo '(display "aaa\n")' >t13
$ echo '(display "bbb\n")' >t14
$ guile-2.0 t13
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;;       or pass the --no-auto-compile argument to disable.
;;; compiling /home/zefram/usr/guile/t13
;;; compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t13.go
aaa
$ mv t14 t13
$ guile-2.0 t13
aaa

You can see that the mtime is not fully used here: the cache is misapplied
even if there is a delay of seconds between the creations of the two
script files. The cache's mtime check will only notice a mismatch if
the script currently seen under the supplied name was modified later
than when the previous script was *compiled*.

Obviously, in this test case the cache could trivially distinguish the
two script files by looking at the inode numbers. On its own the inode
number isn't sufficient, but exact match on device, inode number, and
mtime would be far superior to the current behaviour, only going wrong
in the presence of deliberate timestamp manipulation. As a bonus, if
the cache were actually *keyed* by inode number and device, rather than
by pathname, it would retain the caching of compilation across renamings
of the script.

Or, even better, the cache could be keyed by a cryptographic hash of the
file contents. This would be immune even to timestamp manipulation, and
would preserve the cached compilation even across the script being copied
to a fresh file or being edited and reverted. This would be a cache
worthy of the name. The only downside is the expense of computing the
hash, but I expect this is small compared to the expense of compilation.

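A minimal sketch of the content-hash keying suggested above, written in Python rather than in Guile's own implementation languages. The names content_key, cached_compile and demo are hypothetical, the in-memory dict stands in for the on-disk ccache directory, and SHA-256 is an assumed choice of hash; none of this is Guile's actual code. The demo replays the t13/t14 test case with a stand-in compiler.

```python
import hashlib
import os
import tempfile


def content_key(path, chunk_size=65536):
    """Return a hex SHA-256 digest of the file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_compile(path, cache, compile_fn):
    """Look up compiled output by content hash; compile only on a miss."""
    key = content_key(path)
    if key not in cache:
        cache[key] = compile_fn(path)
    return cache[key]


def demo():
    """Replay the t13/t14 test case against the hash-keyed cache."""
    calls = []

    def fake_compile(path):
        # Stand-in for the real compiler: records the call, returns the text.
        calls.append(path)
        with open(path) as f:
            return f.read()

    cache = {}  # hypothetical in-memory stand-in for the ccache directory
    with tempfile.TemporaryDirectory() as d:
        t13 = os.path.join(d, "t13")
        t14 = os.path.join(d, "t14")
        with open(t13, "w") as f:
            f.write('(display "aaa\\n")\n')
        with open(t14, "w") as f:
            f.write('(display "bbb\\n")\n')
        first = cached_compile(t13, cache, fake_compile)
        os.replace(t14, t13)  # equivalent of: mv t14 t13
        second = cached_compile(t13, cache, fake_compile)
        third = cached_compile(t13, cache, fake_compile)
    return first, second, third, len(calls)
```

Under these assumptions demo() compiles once per distinct file content, so the renamed script is recompiled and the third run is a cache hit. The cost, as the report notes, is reading and hashing the whole script on every run.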
Debian incarnation of this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734178

-zefram

From debbugs-submit-bounces@debbugs.gnu.org Wed Jan 15 16:16:41 2014
Received: (at control) by debbugs.gnu.org; 15 Jan 2014 21:16:41 +0000
Received: from localhost ([127.0.0.1]:52494 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3XpI-00028Y-IJ for submit@debbugs.gnu.org; Wed, 15 Jan 2014 16:16:40 -0500
Received: from world.peace.net ([96.39.62.75]:50140) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1W3XpE-00028C-M2 for control@debbugs.gnu.org; Wed, 15 Jan 2014 16:16:36 -0500
Received: from 209-6-91-212.c3-0.smr-ubr1.sbo-smr.ma.cable.rcn.com ([209.6.91.212] helo=yeeloong) by world.peace.net with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1W3Xp9-0005Pd-0i; Wed, 15 Jan 2014 16:16:31 -0500
From: Mark H Weaver
To: control@debbugs.gnu.org
Date: Wed, 15 Jan 2014 16:14:11 -0500
Message-ID: <87vbxlrlr0.fsf@netris.org>
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 2.0 (++)
X-Spam-Report: Spam detection software, running on the system "debbugs.gnu.org", has
 identified this incoming email as possible spam. The original message
 has been attached to this so you can view it (if it isn't spam) or label
 similar future email. If you have any questions, see
 the administrator of that system for details.
 Content preview: retitle 16451 autogen.sh fails on FreeBSD 9.1 retitle 16359
 "guild list" lists nothing (Guile 2.0.9 on Debian) retitle 16360 "guile help
 COMMAND" crashes (Guile 2.0.9 on Debian) retitle 16361 [wishlist] improve
 freshness checking in compile cache severity 16361 wishlist retitle 16362
 compiler doesn't preserve distinctness of literals thanks [...]
 Content analysis details: (2.0 points, 10.0 required)
  pts rule name              description
 ---- ---------------------- --------------------------------------------------
  1.8 MISSING_SUBJECT        Missing Subject: header
  0.2 NO_SUBJECT             Extra score for no subject

retitle 16451 autogen.sh fails on FreeBSD 9.1
retitle 16359 "guild list" lists nothing (Guile 2.0.9 on Debian)
retitle 16360 "guile help COMMAND" crashes (Guile 2.0.9 on Debian)
retitle 16361 [wishlist] improve freshness checking in compile cache
severity 16361 wishlist
retitle 16362 compiler doesn't preserve distinctness of literals
thanks

From debbugs-submit-bounces@debbugs.gnu.org Wed Oct 01 15:23:21 2014
Received: (at 16361) by debbugs.gnu.org; 1 Oct 2014 19:23:21 +0000
Received: from localhost ([127.0.0.1]:57717 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XZPUe-0002IX-Nj for submit@debbugs.gnu.org; Wed, 01 Oct 2014 15:23:21 -0400
Received: from world.peace.net ([96.39.62.75]:44244) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1XZPUb-0002IH-Q0; Wed, 01 Oct 2014 15:23:18 -0400
Received: from c-24-62-95-23.hsd1.ma.comcast.net ([24.62.95.23] helo=yeeloong.lan) by world.peace.net with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1XZPUU-0001Kd-9o; Wed, 01 Oct 2014 15:23:10 -0400
From: Mark H Weaver
To: Zefram
Subject: Re: bug#16361: compile cache confused about file identity
References: <20140105230841.GF30283@fysh.org>
Date: Wed, 01 Oct 2014 15:22:58 -0400
In-Reply-To: <20140105230841.GF30283@fysh.org> (zefram@fysh.org's message of "Sun, 5 Jan 2014 23:08:41 +0000")
Message-ID: <87zjdf7fxp.fsf@yeeloong.lan>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 16361
Cc: 16361@debbugs.gnu.org, request@debbugs.gnu.org
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 0.0 (/)

tags 16361 + notabug wontfix
close 16361
thanks

Zefram writes:

> The automatic cache of compiled versions of scripts in guile-2.0.9
> identifies scripts mainly by name, and partially by mtime. This is not
> actually sufficient: it is easily misled by a pathname that refers to
> different files at different times. Test case:
>
> $ echo '(display "aaa\n")' >t13
> $ echo '(display "bbb\n")' >t14
> $ guile-2.0 t13
> ;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
> ;;;       or pass the --no-auto-compile argument to disable.
> ;;; compiling /home/zefram/usr/guile/t13
> ;;; compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t13.go
> aaa
> $ mv t14 t13
> $ guile-2.0 t13
> aaa
>
> You can see that the mtime is not fully used here: the cache is misapplied
> even if there is a delay of seconds between the creations of the two
> script files. The cache's mtime check will only notice a mismatch if
> the script currently seen under the supplied name was modified later
> than when the previous script was *compiled*.
>
> Obviously, in this test case the cache could trivially distinguish the
> two script files by looking at the inode numbers. On its own the inode
> number isn't sufficient, but exact match on device, inode number, and
> mtime would be far superior to the current behaviour, only going wrong
> in the presence of deliberate timestamp manipulation. As a bonus, if
> the cache were actually *keyed* by inode number and device, rather than
> by pathname, it would retain the caching of compilation across renamings
> of the script.
>
> Or, even better, the cache could be keyed by a cryptographic hash of the
> file contents. This would be immune even to timestamp manipulation, and
> would preserve the cached compilation even across the script being copied
> to a fresh file or being edited and reverted. This would be a cache
> worthy of the name. The only downside is the expense of computing the
> hash, but I expect this is small compared to the expense of compilation.

You could make the same complaint about 'make', 'rsync', or any number
of other programs. It's true that a cryptographic hash would be more
robust, but it would also be considerably more expensive in the common
case where the .go file is already in the cache.

I don't think it's worth paying this cost every time a .go file is
loaded, to guard against the unlikely scenario you outlined above. The
mtime check is very widely used, and accepted practice.

I'm closing this ticket.

     Mark

From unknown Fri Sep 05 08:41:29 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Thu, 30 Oct 2014 11:24:04 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator

From debbugs-submit-bounces@debbugs.gnu.org Wed May 13 06:45:49 2015
Received: (at control) by debbugs.gnu.org; 13 May 2015 10:45:49 +0000
Received: from localhost ([127.0.0.1]:43113 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YsUAe-0007g1-St for submit@debbugs.gnu.org; Wed, 13 May 2015 06:45:49 -0400
Received: from river.fysh.org ([5.135.154.127]:43745 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YsUAb-0007fq-BX for control@debbugs.gnu.org; Wed, 13 May 2015 06:45:46 -0400
Received: from zefram by river.fysh.org with local (Exim 4.80 #2 (Debian)) id 1YsUAX-0008OU-Gy; Wed, 13 May 2015 11:45:41 +0100
Date: Wed, 13 May 2015 11:45:41 +0100
From: Zefram
To: control@debbugs.gnu.org
Subject: Re: bug#16361 acknowledged by developer (Re: bug#16361: compile cache confused about file identity)
Message-ID: <20150513104541.GA26475@fysh.org>
References:
    <87zjdf7fxp.fsf@yeeloong.lan> <20140105230841.GF30283@fysh.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: control
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -0.0 (/)

unarchive 16361
thanks

-zefram

From debbugs-submit-bounces@debbugs.gnu.org Wed May 13 07:07:48 2015
Received: (at 16361) by debbugs.gnu.org; 13 May 2015 11:07:48 +0000
Received: from localhost ([127.0.0.1]:43156 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YsUVv-0001FX-IN for submit@debbugs.gnu.org; Wed, 13 May 2015 07:07:47 -0400
Received: from river.fysh.org ([5.135.154.127]:44412 ident=Debian-exim) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YsUVt-0001FO-UR for 16361@debbugs.gnu.org; Wed, 13 May 2015 07:07:46 -0400
Received: from zefram by river.fysh.org with local (Exim 4.80 #2 (Debian)) id 1YsUVn-0000cq-ND; Wed, 13 May 2015 12:07:39 +0100
Date: Wed, 13 May 2015 12:07:39 +0100
From: Zefram
To: 16361@debbugs.gnu.org
Subject: Re: bug#16361: compile cache confused about file identity
Message-ID: <20150513110739.GB26475@fysh.org>
References: <20140105230841.GF30283@fysh.org> <87zjdf7fxp.fsf@yeeloong.lan>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <87zjdf7fxp.fsf@yeeloong.lan>
X-Spam-Score: -0.0 (/)
X-Debbugs-Envelope-To: 16361
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -0.0 (/)

Mark H Weaver wrote:
>You could make the same complaint about 'make', 'rsync', or any number
>of other programs.

Not really. make does use this type of freshness check, but it's used
in a specific situation where the freshness issue is immediately obvious
and is part of the program's visible primary concern. That's quite
unlike guile's compile cache, which as the name suggests is a cache.
It's meant to be unobtrusive, and the cache semantics are not a direct
part of the transaction that is ostensibly taking place, of running a
program that happens to be written in Scheme. Those circumstances, of
running an arbitrary program, are much broader than circumstances in
which make's freshness checks become relevant. make also gets a pass
from having always worked this way, whereas guile used to not cache
compilations.

rsync, by contrast, does not use this type of freshness checking;
I believe it uses a hash mechanism.

>                    It's true that a cryptographic hash would be more
>robust, but it would also be considerably more expensive in the common
>case where the .go file is already in the cache.
>
>I don't think it's worth paying this cost every time

OK, you can rule that suggestion out, but I think you have erred in
jumping from that to wontfix on the general problem. You have not
addressed my prior suggestion of identifying programs by exact match on
device, inode number, and mtime. (File size could also be included.)
This freshness check is very cheap, because it's just a few fixed-size
fields from the stat structure, and you're already necessarily doing a
stat on the program file. Using the identifying fields as the cache key
even saves you a stat on the cached file. Although not quite as
effective as a hash comparison, it would be a huge practical improvement
over the current filename-and-inexact-mtime comparison.

-zefram

From unknown Fri Sep 05 08:41:29 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Wed, 10 Jun 2015 11:24:06 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator
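
The cheap stat-based check proposed in the last reply of this thread can be sketched as follows, assuming Python and hypothetical names (FileIdentity, identity, fresh_by_mtime_order, fresh_by_identity, write_script, demo). fresh_by_mtime_order models the ordering comparison as the report describes it and is not taken from Guile's source. Timestamps are pinned with os.utime so the example does not depend on timing.

```python
import os
import tempfile
from collections import namedtuple

# Hypothetical record of the stat fields named in the thread.
FileIdentity = namedtuple("FileIdentity", "dev ino mtime_ns size")


def identity(path):
    """Collect device, inode, mtime and size from a single stat call."""
    st = os.stat(path)
    return FileIdentity(st.st_dev, st.st_ino, st.st_mtime_ns, st.st_size)


def fresh_by_mtime_order(path, compiled_at_ns):
    """Ordering check as described in the report (assumed model): the
    cache entry is accepted unless the source is newer than the compile."""
    return os.stat(path).st_mtime_ns <= compiled_at_ns


def fresh_by_identity(path, recorded):
    """Exact-match check: every recorded stat field must be unchanged."""
    return identity(path) == recorded


def write_script(path, text, mtime_s):
    """Create a script and pin its mtime to a fixed number of seconds."""
    with open(path, "w") as f:
        f.write(text)
    ns = mtime_s * 1_000_000_000
    os.utime(path, ns=(ns, ns))


def demo():
    """Replay the t13/t14 test case; return (mtime verdict, identity verdict)."""
    with tempfile.TemporaryDirectory() as d:
        t13 = os.path.join(d, "t13")
        t14 = os.path.join(d, "t14")
        write_script(t13, '(display "aaa\\n")\n', 1000)
        write_script(t14, '(display "bbb\\n")\n', 1010)
        recorded = identity(t13)               # taken when t13 is "compiled"
        compiled_at_ns = 1020 * 1_000_000_000  # compile time, after both writes
        os.replace(t14, t13)                   # equivalent of: mv t14 t13
        return (fresh_by_mtime_order(t13, compiled_at_ns),
                fresh_by_identity(t13, recorded))
```

Under these assumptions demo() returns (True, False): the ordering check accepts the stale cache entry because the replacement script is older than the compile, while the exact match rejects it because the recorded fields no longer agree. Both checks need only the one stat call on the script.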