Package: emacs;
Reported by: Lin Sun <sunlin7 <at> hotmail.com>
Date: Sat, 10 May 2025 00:14:02 UTC
Severity: normal
Tags: patch
Done: Eli Zaretskii <eliz <at> gnu.org>
View this message in rfc822 format
From: Lynn Winebarger <owinebar <at> gmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: sunlin7 <at> hotmail.com, acorallo <at> gnu.org, 78340 <at> debbugs.gnu.org Subject: bug#78340: [PATCH] New option for comp *.eln file name by the file timestamp of *.el Date: Mon, 12 May 2025 19:47:54 -0400
On Mon, May 12, 2025 at 12:14 PM Eli Zaretskii <eliz <at> gnu.org> wrote: > > > From: Lynn Winebarger <owinebar <at> gmail.com> > > Date: Mon, 12 May 2025 11:37:13 -0400 > > Cc: acorallo <at> gnu.org, sunlin7 <at> hotmail.com, 78340 <at> debbugs.gnu.org > > > > There's nothing there saying the check is to make sure that the code in the .eln is safe, only that the eln > > filename is unique when a library is recompiled with different source. There's nothing that implies that it > > would be *unsafe* to load a library generated from a version with a different content hash than the one that > > happens to exist on disk at the moment. A compile-time time stamp would probably be adequate for the > > stated purpose of ensuring a unqiue file name. > > So now you are saying that what we do is NOT enough, and more should > be done to make sure the loaded .eln file is safe? > > Or are you saying that, because (in your opinion) the current tests > are not enough, we could well throw them away completely, because it > will save us some milliseconds per file? > > Or what is it that you are saying? I claimed none of those things. I was trying to understand what you meant by "No, the ABI hash is the evidence for the Emacs binary. We need evidence for the .eln file's code." That way I might be able to assist the reporter in coming up with an approach that would solve his issue and have a chance of being accepted. I am sympathetic to wanting to make emacs startup take less time in general. I elaborated my guesses as to what the reason might be because I hadn't seen a concrete explanation of the hazard being avoided. Functionally, as far as I can tell, the content hash only serves to ensure that ELN files generated from distinct source files will have distinct names, so that dlopen will not confuse them, even if they map to the same relative path from the root lisp directory. That's certainly how I read the comment in the source. If I am missing something, please correct me. > > Whatever you claim about the current code, it withstood the test of 2 > major Emacs releases: I've yet top hear any report about Emacs > crashing because it loaded an incompatible .eln file. That's not > nothing. I'm not sure why you seem to be taking offense. It's clearly a conservative approach that has avoided failures that might have given the system a reputation for unreliability. > > I don't see anything in comp.c storing the content hash in the .eln file for verification purposes. It looks like > > if I modify an elisp library, then copy the existing .eln to have a filename with the content hash from the new > > version, but without any modification of the eln contents, it should load the same way the older byte-compiled > > version would. Are you (or Andrea) saying that would risk a crash? > > Am I missing something? > > Maliciously tampering with files can still crash Emacs, yes. That's > not what those safeguards are about. > I don't think the test I suggested as an experiment to determine the actual possibility of crashes even approaches "maliciously tampering with files". To the best of my ability to judge what would happen based on the compiler infrastructure, there's no hazard in that experiment of a crash happening, only that the new code would be ignored, just as it would be if "load-prefer-newer" is nil for byte-compiled files. But I'm more than happy to be corrected by a concrete example. If I'm incorrect, I would prefer to know sooner than later. > > If the only purpose of the content hash is to certify that some valid elisp source (with respect to the running > > emacs) is known to exist, then emacs should be able to keep a table in memory loaded at startup and just > > consult that for verification. That should address any concern about crashing, at least, unless I have > > completely misunderstood the concern about (which is very possible). > > You lost me here. What would be the contents of that table, how it > will be produced, and how it will be used? Actually, this very simple idea occurred to me in the course of this discussion. Just have the compilers (both byte- and native-) keep a log file with the following items in each line: -------- <source file path relative to load path (i.e. argument to "load" or require> <file path relative to emacs lisp source directory as used by native compiler> <file basename>-<md5 hash of relative path> <content hash of source code that was compiled> <basename of compiled library (whether .elc or .eln> <content hash of compiled file> -------- Emacs can load this log (part from the installation or invocation location and part from the user emacs directory) at startup. Then, at least for compiled libraries when "load-prefer-newer" is nil, the correct library to load can be determined entirely from internal records, without hitting the disk at all, as long as the compiled file exists where expected (and it can even be verified if required). In the worst case, the current behavior can still be followed. With that approach, emacs actually has an authoritative source of content hashes that its compiler has verified as proper elisp code (to the extent that is the concern), and in the extremis, the library with that name can be verified to be the output of the compiler (whether from the byte compiler or the native compiler). I don't think that's necessary, but it would become possible, and it's not that hard to obtain the option. This would actually avoid a lot of performance related issues with packages and long load-paths to boot that derive from system libraries being the most likely to be required but at the end of the path to be searched, although that is just a fringe benefit. > > And, btw, what does all that have to do with the current bug, which is > very specific and narrow? It sounds like you'd like to start a > discussion on emacs-devel where you would like to suggest how to > redesign the way we validate and load *.eln files. It's a separate > discussion with much broader scope and implications. Well, if the bug is specifically "add an option to use file stat times to determine validity", then you're correct. I took the reporter to be saying "20% of my emacs startup time is due to this check, I'd like to avoid that because I don't understand what the purpose of this check is when this other check seems just as good, here's a patch". I think the approach I suggested above would be much more fruitful for the OP if they want to contribute something with a chance of being accepted. It's not trivial like the approach implemented by the submitted patch, but it isn't insurmountably complex or conceptually difficult to grasp. OTOH, I got engrossed in the issue at hand (as I understand it) and totally forgot this was from bug-emacs and not emacs-devel already, so you might be right about that being the appropriate forum for my last suggestion above. Lynn
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.