Package: emacs;
Reported by: Lin Sun <sunlin7 <at> hotmail.com>
Date: Sat, 10 May 2025 00:14:02 UTC
Severity: normal
Tags: patch
Done: Eli Zaretskii <eliz <at> gnu.org>
Message #68 received at 78340 <at> debbugs.gnu.org (full text, mbox):
From: Lynn Winebarger <owinebar <at> gmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: sunlin7 <at> hotmail.com, acorallo <at> gnu.org, 78340 <at> debbugs.gnu.org Subject: Re: bug#78340: [PATCH] New option for comp *.eln file name by the file timestamp of *.el Date: Mon, 12 May 2025 19:52:00 -0400
On Mon, May 12, 2025 at 7:47 PM Lynn Winebarger <owinebar <at> gmail.com> wrote: > > On Mon, May 12, 2025 at 12:14 PM Eli Zaretskii <eliz <at> gnu.org> wrote: > > > > > From: Lynn Winebarger <owinebar <at> gmail.com> > > > Date: Mon, 12 May 2025 11:37:13 -0400 > > > Cc: acorallo <at> gnu.org, sunlin7 <at> hotmail.com, 78340 <at> debbugs.gnu.org > > > > > > There's nothing there saying the check is to make sure that the code in the .eln is safe, only that the eln > > > filename is unique when a library is recompiled with different source. There's nothing that implies that it > > > would be *unsafe* to load a library generated from a version with a different content hash than the one that > > > happens to exist on disk at the moment. A compile-time time stamp would probably be adequate for the > > > stated purpose of ensuring a unqiue file name. > > > > So now you are saying that what we do is NOT enough, and more should > > be done to make sure the loaded .eln file is safe? > > > > Or are you saying that, because (in your opinion) the current tests > > are not enough, we could well throw them away completely, because it > > will save us some milliseconds per file? > > > > Or what is it that you are saying? > > I claimed none of those things. I was trying to understand what you > meant by "No, the ABI hash is the evidence for the Emacs binary. We > need evidence for the .eln file's code." That way I might be able to > assist the reporter in coming up with an approach that would solve his > issue and have a chance of being accepted. I am sympathetic to > wanting to make emacs startup take less time in general. > I elaborated my guesses as to what the reason might be because I > hadn't seen a concrete explanation of the hazard being avoided. > Functionally, as far as I can tell, the content hash only serves to > ensure that ELN files generated from distinct source files will have > distinct names, so that dlopen will not confuse them, even if they map > to the same relative path from the root lisp directory. That's > certainly how I read the comment in the source. If I am missing > something, please correct me. > > > > > Whatever you claim about the current code, it withstood the test of 2 > > major Emacs releases: I've yet top hear any report about Emacs > > crashing because it loaded an incompatible .eln file. That's not > > nothing. > > I'm not sure why you seem to be taking offense. It's clearly a > conservative approach that has avoided failures that might have given > the system a reputation for unreliability. > > > > I don't see anything in comp.c storing the content hash in the .eln file for verification purposes. It looks like > > > if I modify an elisp library, then copy the existing .eln to have a filename with the content hash from the new > > > version, but without any modification of the eln contents, it should load the same way the older byte-compiled > > > version would. Are you (or Andrea) saying that would risk a crash? > > > Am I missing something? > > > > Maliciously tampering with files can still crash Emacs, yes. That's > > not what those safeguards are about. > > > I don't think the test I suggested as an experiment to determine the > actual possibility of crashes even approaches "maliciously tampering > with files". To the best of my ability to judge what would happen > based on the compiler infrastructure, there's no hazard in that > experiment of a crash happening, only that the new code would be > ignored, just as it would be if "load-prefer-newer" is nil for > byte-compiled files. But I'm more than happy to be corrected by a > concrete example. If I'm incorrect, I would prefer to know sooner > than later. > > > > If the only purpose of the content hash is to certify that some valid elisp source (with respect to the running > > > emacs) is known to exist, then emacs should be able to keep a table in memory loaded at startup and just > > > consult that for verification. That should address any concern about crashing, at least, unless I have > > > completely misunderstood the concern about (which is very possible). > > > > You lost me here. What would be the contents of that table, how it > > will be produced, and how it will be used? > > Actually, this very simple idea occurred to me in the course of this > discussion. Just have the compilers (both byte- and native-) keep a > log file with the following items in each line: I forgot the most important item! > -------- > <source file path relative to load path (i.e. argument to "load" or require> > <file path relative to emacs lisp source directory as used by native compiler> > <file basename>-<md5 hash of relative path> > <content hash of source code that was compiled> <time stamp at which the compiled file was written> !!! > <basename of compiled library (whether .elc or .eln> > <content hash of compiled file> > -------- > Emacs can load this log (part from the installation or invocation > location and part from the user emacs directory) at startup. Then, at > least for compiled libraries when "load-prefer-newer" is nil, the > correct library to load can be determined entirely from internal > records, without hitting the disk at all, as long as the compiled file > exists where expected (and it can even be verified if required). In > the worst case, the current behavior can still be followed. > With that approach, emacs actually has an authoritative source of > content hashes that its compiler has verified as proper elisp code (to > the extent that is the concern), and in the extremis, the library with > that name can be verified to be the output of the compiler (whether > from the byte compiler or the native compiler). I don't think that's > necessary, but it would become possible, and it's not that hard to > obtain the option. This would actually avoid a lot of performance > related issues with packages and long load-paths to boot that derive > from system libraries being the most likely to be required but at the > end of the path to be searched, although that is just a fringe > benefit. > > > > > And, btw, what does all that have to do with the current bug, which is > > very specific and narrow? It sounds like you'd like to start a > > discussion on emacs-devel where you would like to suggest how to > > redesign the way we validate and load *.eln files. It's a separate > > discussion with much broader scope and implications. > > Well, if the bug is specifically "add an option to use file stat times > to determine validity", then you're correct. I took the reporter to > be saying "20% of my emacs startup time is due to this check, I'd like > to avoid that because I don't understand what the purpose of this > check is when this other check seems just as good, here's a patch". I > think the approach I suggested above would be much more fruitful for > the OP if they want to contribute something with a chance of being > accepted. It's not trivial like the approach implemented by the > submitted patch, but it isn't insurmountably complex or conceptually > difficult to grasp. > > OTOH, I got engrossed in the issue at hand (as I understand it) and > totally forgot this was from bug-emacs and not emacs-devel already, so > you might be right about that being the appropriate forum for my last > suggestion above. > > Lynn
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.