On Tue, Jul 22, 2025, 8:28 AM Mattias Engdegård wrote:

> 21 juli 2025 kl. 17.29 skrev Lynn Winebarger :
>
> > For my next increment, I want to try managing the read stack with
> > alloca and friends. The way it is now, memory allocated for that
> > stack is never recovered.
>
> Not sure why that would be a problem as the stack is usually small.
> (`alloca` is not without its own problems, especially in Emacs; I
> prefer not using it.)

I'd ask that you withhold judgement on using alloca here until I submit
something. Considering that the read_entry stack is implemented as part
of an optimization of C recursion, allocating it on the C stack is
quite natural. What I have in mind should actually simplify the code.

The memory-leak aspect is more observation than alarm. The leak is only
safe because the stack is implemented as a global static variable.
Getting rid of that property means either freeing the stack at the end
of the initial entry to read, or allocating the read_entry stack from
the garbage-collected heap.

> > I also tried eliminating base_sp from read0, but it failed. IOW,
> > read0 is getting recursively invoked in the middle of reading an
> > expression. So the size of the read stack may not be bounded by the
> > maximum depth of a single expression. I want to measure what's going
> > on there.
>
> `read0` must be re-entrant because the \N{CHARNAME} notation makes us
> call out to Lisp and possibly load some files (for the char names). We
> could avoid this if it seems likely that it would be worth the trouble.

Thanks, that also answers the question "Can read0 be called during a
call to read-from-string?" posed by Stefan. read-from-string is called
from the pdump loader when an eln library is initialized. An occurrence
of that notation in that context must be either very unlikely or
carefully orchestrated by the pdump loader.

> > Also, I'd like some concrete cases for measuring the impact of the
> > reader performance on emacs startup. When I look at the profiles of
> > the dump recipes, the time spent in lread.c is tiny compared to
> > evaluating the code being read, even when loading byte-compiled code.
>
> It could be that the reader is already fairly fast (it's seen some
> improvement lately) but also that profiling isn't easy. If you don't
> find what you are looking for in a profile it usually means the time
> is spent elsewhere. In Emacs that is often the allocator and GC but
> not always.

Right, all the more reason to have well-defined benchmarking. That's
why I stated it the way I did: "time spent in lread.c". I'm not sure
how Stefan was measuring the performance of the reader. I will hazard a
guess that the performance of the reader is more noticeable when the
code being loaded is already compiled. It may be even more so when all
the code in the dump is compiled as well, especially if there isn't an
explicit call to GC after every call to load. I'm all for making the
reader faster; even small improvements can matter.

> Complicating things for little or no benefit, not so much. Removing
> ancient cruft that no longer matters, yes please.

The change I have in mind should make the code simpler and make the
read_entry stack thread-safe. I doubt it will make a huge difference to
performance one way or the other, provided the stack-growth increment
is reasonably tuned. Hand-written Lisp code should definitely not
require a huge read stack. My guess is that, say, 30 entries would be
conservative for all but the most pathological cases of a single Lisp
expression.
I have no grasp of how deeply nested the byte code emitted by the
compiler might get, though. A histogram of the depth of the expressions
in the source and byte-compiled files of the Emacs tree should be
plenty for that tuning.

Lynn
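
P.S. In case it helps, here is an untested sketch of how I would
collect that histogram from Lisp: read every top-level form in the .el
and .elc files under a directory, compute how deeply containers nest in
each form, and tally the depths. The function names and the tree
location are only placeholders, and circular data (#N=/#N#) is not
handled.

  (defun my-sexp-depth (form)
    "Return the container nesting depth of FORM.
  Atoms count as 0; each enclosing list, vector, record or byte-code
  object adds 1.  Assumes FORM is not circular."
    (cond
     ((consp form)
      (let ((d 0))
        ;; Elements along the list spine are one level deeper; a dotted
        ;; tail counts as an element of the same list.
        (while (consp form)
          (setq d (max d (my-sexp-depth (car form))))
          (setq form (cdr form)))
        (when form
          (setq d (max d (my-sexp-depth form))))
        (1+ d)))
     ((or (vectorp form) (recordp form) (byte-code-function-p form))
      ;; Descend into byte-code objects too, since the constants vector
      ;; of #[...] is where most of the nesting in .elc files lives.
      (let ((d 0))
        (dotimes (i (length form))
          (setq d (max d (my-sexp-depth (aref form i)))))
        (1+ d)))
     (t 0)))

  (defun my-form-depths (file)
    "Return a list of depths, one per top-level form read from FILE."
    (with-temp-buffer
      (insert-file-contents file)
      (let (depths)
        (condition-case nil
            (while t
              (push (my-sexp-depth (read (current-buffer))) depths))
          (end-of-file depths)))))

  (defun my-depth-histogram (dir)
    "Alist of (DEPTH . COUNT) over top-level forms in .el/.elc under DIR."
    (let ((hist (make-hash-table)) pairs)
      (dolist (file (directory-files-recursively dir "\\.elc?\\'"))
        ;; Skip files that fail to read rather than aborting the scan.
        (ignore-errors
          (dolist (d (my-form-depths file))
            (puthash d (1+ (gethash d hist 0)) hist))))
      (maphash (lambda (k v) (push (cons k v) pairs)) hist)
      (sort pairs (lambda (a b) (< (car a) (car b))))))

Calling (my-depth-histogram "~/src/emacs") on a checkout should then
give a depth/count alist to eyeball, with .el and .elc results easy to
split out if the distributions turn out to differ.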