Package: emacs;
Reported by: Lynn Winebarger <owinebar <at> gmail.com>
Date: Wed, 25 Jun 2025 22:01:05 UTC
Severity: normal
View this message in rfc822 format
From: Lynn Winebarger <owinebar <at> gmail.com> To: Pip Cet <pipcet <at> protonmail.com> Cc: 78898 <at> debbugs.gnu.org Subject: bug#78898: Make read/readevalloop reentrant Date: Sun, 29 Jun 2025 00:24:21 -0400
On Sat, Jun 28, 2025 at 10:57 AM Pip Cet <pipcet <at> protonmail.com> wrote: > > "Lynn Winebarger" <owinebar <at> gmail.com> writes: > > > On Fri, Jun 27, 2025 at 5:42 AM Pip Cet <pipcet <at> protonmail.com> wrote: > >> > >> "Lynn Winebarger" <owinebar <at> gmail.com> writes: > >> > >> Thanks for providing an updated patch! It applies and, with the changes > >> below, it seems to work for me. > >> > >> I haven't really looked at the code yet, just tried to get it to work. > >> I did notice that, at least with debug CFLAGS, Emacs startup is much > >> slower than it was before. > > > > That's interesting - I use the debug CFLAGS, and I was surprised that > > I didn't notice any slowdown at startup. > > It's entirely possible it's just the nativecomp abi hash changing, I > didn't think of that! In any case, it's too early to benchmark these > changes, IMHO. > > In particular, I tried running "make check", and while I've reduced the > number of local test failures a little (see below), there are still some > mysterious ones. > > > I replaced a bunch of static variables with indirect references to a > > stack-allocated "frame" data structure, so I expected some > > pessimization. > > That doesn't sound like it would be noticeable, at least when > optimization is on. > > > I did use "register" for the frame parameter, in the hope that the > > compiler would at least generate code that wouldn't be much worse than > > for ordinary stack-allocated variables. > > I think "register" provides no benefit (it makes non-optimizing GCC put > things into registers, sometimes, which is undesirable for debugging) > and harms readability, but others disagree. > > > There are other optimization opportunities, I just got bitten by > > Knuth's "root of all evil" enough that I gave up on optimizing before > > getting the logic right. > > Sounds like the right approach to me. > > > Are you using native compilation in your build? The comp-abi-hash > > changes since I added some subrs. Could you tell me the configuration? > > My build/test is on x86_64 Linux. > > Same here. > > > If I isolate the changes to just the re-entrancy, I could try making > > read0 a monolithic procedure so the read stack is a local variable, > > and making inline versions of the read char procedures in that > > procedure, and use the tricks in bytecode.c for using computed goto in > > read0 to use local variables uniformly in read0. That should take > > care of some of the pessimization, at least. > > I'd suggest attempting that only if it turns out to be absolutely > necessary! That would hurt readability of this code a lot, and we need > more people to understand it. > > >> Is there any possibility of splitting this into several smaller patches? > >> I get the impression it would be a lot of work, but maybe I'm wrong. > > > > Making read0/readevalloop re-entrant requires a global control-flow > > transformation, so most of it is necessary. Two pieces that aren't > > strictly necessary, but are in it because this was actually the last > > step in my initial development of a "noneditor" mode for running > > command-line type emacs-lisp programs with non-standard dump-files. > > Even if it isn't possible to split the changeset into functional > patches, splitting it into separate patches may make the diffs more > readable purely for syntactic reasons. The first patch could merely add > unused macro arguments to READCHAR etc; that would help diff (the > program), in subsequent patches, to anchor the diff in the right > place. (For the same reason, I'd refrain from whitespace changes and > drive-by fixes). > > > FIrst, the load_source_file step, because I want to try building a > > minimal command-line compiler for boot strapping and async compiling > > and think it ought to be able to handle all the source files in the > > emacs lisp distribution without having to load every character set and > > coding system available into the dump file. > > > Second, I modified lisp_file_lexical_cookie to allow the > > lexical-binding cookie on any line in a leading comment block, rather > > than just the first line. Now that the cookie is being warned about, > > I would prefer it if the system didn't force users to violate their > > style requirements by making the first line obscenely long. However, > > it makes use of the "lread_rewind_input" to get around possible > > limitations on ungetc. > > > > If I remove those two pieces, I think the rest will probably work. > > I think removing them for now would be a good idea. The first change > sounds like a great idea, but does it really depend on a reentrant > reader? > > >> > I tried explicitly forcing emacs-internal as the coding-system for > >> > load_source_file, but without any noticeable difference. > >> > >> Hmm. I think the problem there is that you specbind > >> Qcoding_system_for_read to Qemacs_internal while calling > >> dump-emacs-portable. The pdumper doesn't unwind specbind bindings, so > >> the Vcoding_system_for_read variable is dumped with that value, and > >> restored with that value from the dump, and then things go wrong because > >> other code assumes Vcoding_system_for_read is nil. > > > > I'm sure that is a big, but it didn't fix the problem for me (ignoring > > the the 2 non-fixes). > > I'm not sure what "the problem" is here; IIRC, the pdumper change fixes > some of the differences in the generated .el files, while the non-fixes > can be dropped if we use coding = Qutf_8 instead of coding = > Qno_conversion in Fload. (But I'm not sure that's the right fix). The real problem is that I developed that 1000-yard coding stare looking at this too long. Somehow I got infected with the idea that elc files were loaded as raw text, but in fact, the code in readchar always decodes them according to either emac-internal or emacs-mule and so are always multi-byte. Once I made that change I got a working build with no complaints from the build process, at least. I'll do the cleanups we've discussed and send the revised patch (or patches, per your recommendation).
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.