On Fri, Jun 27, 2025 at 5:42 AM Pip Cet <pipcet@protonmail.com> wrote:
>
> "Lynn Winebarger" <owinebar@gmail.com> writes:
>
> Thanks for providing an updated patch!  It applies and, with the changes
> below, it seems to work for me.
>
> I haven't really looked at the code yet, just tried to get it to work.
> I did notice that, at least with debug CFLAGS, Emacs startup is much
> slower than it was before.

That's interesting. I build with the debug CFLAGS too, and I was surprised that I didn't notice any slowdown at startup.  I replaced a bunch of static variables with indirect references to a stack-allocated "frame" data structure, so I expected some pessimization.  I did use "register" for the frame parameter, in the hope that the compiler would at least generate code that wouldn't be much worse than for ordinary stack-allocated variables.  There are other optimization opportunities, but I have been bitten by Knuth's "root of all evil" often enough that I gave up on optimizing before getting the logic right.

Are you using native compilation in your build?  The comp-abi-hash changes, since I added some subrs.  Could you tell me your configuration?  My build/test is on x86_64 Linux.
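
To make the static-to-frame change concrete, here is a minimal sketch, not code from the patch.  The struct and function names (read_frame, frame_readchar, frame_unreadchar) are hypothetical; the point is only that state which used to live in file-scope statics is reached through a frame pointer, so two readers can be active at once.

```c
#include <stddef.h>

/* Hypothetical reader state.  In the non-reentrant style these fields
   would be file-scope static variables.  */
struct read_frame
{
  const char *input;   /* text being read */
  size_t pos;          /* index of the next unread character */
  int unread_char;     /* one-character pushback, or -1 if none */
};

static void
read_frame_init (struct read_frame *f, const char *input)
{
  f->input = input;
  f->pos = 0;
  f->unread_char = -1;
}

/* Return the next character, or -1 at end of input.  The "register"
   qualifier on the parameter is only a hint to the compiler.  */
static int
frame_readchar (register struct read_frame *f)
{
  if (f->unread_char >= 0)
    {
      int c = f->unread_char;
      f->unread_char = -1;
      return c;
    }
  unsigned char c = (unsigned char) f->input[f->pos];
  if (c == '\0')
    return -1;
  f->pos++;
  return c;
}

/* Push C back so that the next frame_readchar on F returns it.  */
static void
frame_unreadchar (register struct read_frame *f, int c)
{
  f->unread_char = c;
}
```

Every access goes through f, which is where the extra indirection (and the possible slowdown) comes from compared with a plain static or local variable.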

If I isolate the changes to just the re-entrancy, I could try making read0 a monolithic procedure so that the read stack is a local variable, inlining the read-char procedures into it, and using the computed-goto tricks from bytecode.c so that read0 uses local variables uniformly.  That should take care of some of the pessimization, at least.
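
As a rough illustration of the computed-goto idea (not the actual read0, and not copied from bytecode.c), the sketch below keeps all of its state in locals of one procedure and dispatches on the character class through a table of label addresses.  It assumes GCC or Clang, since "labels as values" is a GNU C extension; the function name and the character classes are made up for the example.

```c
/* Return the maximum parenthesis nesting depth in S.
   Requires the GNU C "labels as values" extension (GCC, Clang).  */
static int
max_paren_depth (const char *s)
{
  /* One dispatch target per character class.  */
  static void *const dispatch[] =
    { &&do_other, &&do_open, &&do_close, &&do_end };
  int depth = 0, max = 0;   /* all state lives in local variables */

#define CLASS(c) ((c) == '(' ? 1 : (c) == ')' ? 2 : (c) == '\0' ? 3 : 0)
#define NEXT() goto *dispatch[CLASS (*s)]

  NEXT ();

 do_other:
  s++;
  NEXT ();

 do_open:
  s++;
  if (++depth > max)
    max = depth;
  NEXT ();

 do_close:
  s++;
  if (depth > 0)
    depth--;
  NEXT ();

 do_end:
  return max;

#undef NEXT
#undef CLASS
}
```

Because there are no helper calls, the compiler is free to keep s, depth and max in registers for the whole loop, which is the effect I would be hoping for in read0.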

>
> Is there any possibility of splitting this into several smaller patches?
> I get the impression it would be a lot of work, but maybe I'm wrong.

Making read0/readevalloop re-entrant requires a global control-flow transformation, so most of it is necessary.  Two pieces aren't strictly necessary, but are in the patch because this was actually the last step in my initial development of a "noneditor" mode for running command-line type emacs-lisp programs with non-standard dump files.

First, the load_source_file step, because I want to try building a minimal command-line compiler for bootstrapping and async compiling, and I think it ought to be able to handle all the source files in the emacs lisp distribution without having to load every available character set and coding system into the dump file.

Second, I modified lisp_file_lexical_cookie to allow the lexical-binding cookie on any line in a leading comment block, rather than just the first line.  Now that a missing cookie triggers a warning, I would prefer it if the system didn't force users to violate their style requirements by making the first line obscenely long.  However, that change relies on "lread_rewind_input" to get around possible limitations on ungetc.
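
A simplified sketch of the relaxed cookie search, for illustration only.  It works on an in-memory string rather than a stream (so it needs no rewind), and it makes assumptions the patch may not share: the leading comment block is taken to be the run of consecutive lines starting with ';', and only the literal value "t" followed by a space, tab or ';' is recognized.  All function names here are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Find NEEDLE in [P, END); return its start or NULL.  */
static const char *
find_in (const char *p, const char *end, const char *needle)
{
  size_t n = strlen (needle);
  for (; (size_t) (end - p) >= n; p++)
    if (memcmp (p, needle, n) == 0)
      return p;
  return NULL;
}

/* True if LINE[0..LEN) contains "-*- ... lexical-binding: t ... -*-".  */
static bool
line_has_cookie (const char *line, size_t len)
{
  static const char var[] = "lexical-binding:";
  const char *end = line + len;

  const char *open = find_in (line, end, "-*-");
  if (!open)
    return false;
  const char *name = find_in (open + 3, end, var);
  if (!name)
    return false;

  const char *val = name + strlen (var);
  while (val < end && (*val == ' ' || *val == '\t'))
    val++;
  if (val == end || *val != 't')
    return false;
  val++;
  /* The value must be exactly "t".  */
  if (val < end && *val != ' ' && *val != '\t' && *val != ';')
    return false;

  return find_in (val, end, "-*-") != NULL;
}

/* True if any line of the leading comment block of TEXT has the cookie.  */
static bool
has_lexical_cookie (const char *text)
{
  const char *p = text;
  while (*p == ';')   /* still inside the leading comment block */
    {
      const char *eol = strchr (p, '\n');
      size_t len = eol ? (size_t) (eol - p) : strlen (p);
      if (line_has_cookie (p, len))
        return true;
      if (!eol)
        break;
      p = eol + 1;
    }
  return false;
}
```

With this, a file whose first line is a long ";;; foo.el --- description" and whose second line is ";; -*- lexical-binding: t -*-" is accepted, while a cookie that appears after the first non-comment line is not.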

If I remove those two pieces, I think the rest will probably work.

> > I tried explicitly forcing emacs-internal as the coding-system for
> > load_source_file, but without any noticeable difference.
>
> Hmm.  I think the problem there is that you specbind
> Qcoding_system_for_read to Qemacs_internal while calling
> dump-emacs-portable.  The pdumper doesn't unwind specbind bindings, so
> the Vcoding_system_for_read variable is dumped with that value, and
> restored with that value from the dump, and then things go wrong because
> other code assumes Vcoding_system_for_read is nil.

I'm sure that is a bug, but fixing it didn't resolve the problem for me (ignoring the 2 non-fixes).  It looks like the real issue is the bootstrap generation of charset and coding system files by elisp programs run by bootstrap-emacs.  Some of those generated files are basically compiled elisp, so they should be loaded like elc files.  I'm not sure why they aren't generated as elc files in the first place.

I'm addressing that issue by building out a primitive set-auto-code function that detects coding cookies in the header, and by modifying the lisp file generation code to put the coding cookie in the header instead of in the local variables section.  Then I check whether the coding system is actually defined and use it if so, or no-conversion if not.  I am also only adding definitions for utf-8 and utf-8-emacs if the system is not dumping (and wasn't initialized from a dump file either).  That should restore the previous behavior.  Maybe it will also remove some of the unidata-related fragility currently required in loadup.el, I'm not sure.
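
The fallback rule can be sketched as follows.  This is only the decision logic, with a hypothetical hard-coded table standing in for whatever coding systems are defined at that point in the bootstrap; none of these names come from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the set of coding systems defined so far.  */
static const char *const defined_systems[] = { "utf-8", "utf-8-emacs" };

static int
coding_system_defined_p (const char *name)
{
  size_t n = sizeof defined_systems / sizeof defined_systems[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp (defined_systems[i], name) == 0)
      return 1;
  return 0;
}

/* COOKIE_NAME is the coding system named in the file header, or NULL
   if no cookie was found.  Use it only if it is already defined;
   otherwise read the file without conversion.  */
static const char *
choose_coding_system (const char *cookie_name)
{
  if (cookie_name && coding_system_defined_p (cookie_name))
    return cookie_name;
  return "no-conversion";
}
```

So a header naming utf-8 is honored, while a header naming a coding system that has not been defined yet falls back to no-conversion instead of signaling an error.
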
Thanks for the additional notes; they have really helped!

Lynn