On Tue, Jul 22, 2025, 8:28 AM Mattias Engdegård wrote:

> 21 juli 2025 kl. 17.29 skrev Lynn Winebarger :
>
> > For my next increment, I want to try managing the read stack with
> > alloca and friends. The way it is now, memory allocated for that
> > stack is never recovered.
>
> Not sure why that would be a problem as the stack is usually small.
> (`alloca` is not without its own problems, especially in Emacs; I
> prefer not using it.)

I'd ask that you withhold judgement on using alloca here until I submit
something. Considering that the read_entry stack is implemented as part
of an optimization of C recursion, allocating it on the C stack is
quite natural. What I have in mind should actually simplify the code.

The memory-leak aspect is more observation than alarm. The leak is only
safe because the stack is implemented as a global static variable.
Getting rid of that property means either freeing the stack at the end
of the initial entry to read, or allocating the read_entry stack from
the garbage-collected heap.

> > I also tried eliminating base_sp from read0, but it failed. IOW,
> > read0 is getting recursively invoked in the middle of reading an
> > expression. So the size of the read stack may not be bounded by the
> > maximum depth of a single expression. I want to measure what's going
> > on there.
>
> `read0` must be re-entrant because the \N{CHARNAME} notation makes us
> call out to Lisp and possibly load some files (for the char names). We
> could avoid this if it seems likely that it would be worth the trouble.

Thanks, that also answers the question "Can read0 be called during a
call to read-from-string?" posed by Stefan. read-from-string is called
from the pdump loader when an eln library is initialized. An occurrence
of that notation in that context must be either very unlikely or
carefully orchestrated by the pdump loader.

> > Also, I'd like some concrete cases for measuring the impact of the
> > reader performance on emacs startup. When I look at the profiles of
> > the dump recipes, the time spent in lread.c is tiny compared to
> > evaluating the code being read, even when loading byte-compiled code.
>
> It could be that the reader is already fairly fast (it's seen some
> improvement lately) but also that profiling isn't easy. If you don't
> find what you are looking for in a profile it usually means the time
> is spent elsewhere. In Emacs that is often the allocator and GC but
> not always.

Right, all the more reason to have well-defined benchmarking. That's
why I stated it the way I did: "time spent in lread.c". I'm not sure
how Stefan was measuring the performance of the reader. I will hazard a
guess that the performance of the reader is more noticeable when the
code being loaded is already compiled. It may be even more so when all
the code in the dump is compiled as well, especially if there isn't an
explicit call to GC after every call to load. I'm all for making the
reader faster; even small improvements can matter.

> Complicating things for little or no benefit, not so much. Removing
> ancient cruft that no longer matters, yes please.

The change I have in mind should make the code simpler and make the
read_entry stack thread-safe. I doubt it will make a huge difference to
performance one way or the other, provided the stack-growth increment
is reasonably tuned. Hand-written Lisp code should definitely not
require a huge read stack. My guess is that, say, 30 entries would be
conservative for all but the most pathological cases of a single Lisp
expression.
I have no grasp of how deeply nested the byte code emitted by the
compiler might get, though. A histogram of the depth of the expressions
in the source and byte-compiled files of the Emacs tree should be
plenty for that tuning.

Lynn
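
P.S. In case it helps, here is an untested sketch of how I would
collect that histogram from Lisp: read every top-level form in the .el
and .elc files under a directory, compute how deeply containers nest in
each form, and tally the depths. The function names and the tree
location are only placeholders, and circular data (#N=/#N#) is not
handled.

  (defun my-sexp-depth (form)
    "Return the container nesting depth of FORM.
  Atoms count as 0; each enclosing list, vector, record or byte-code
  object adds 1.  Assumes FORM is not circular."
    (cond
     ((consp form)
      (let ((d 0))
        ;; Elements along the list spine are one level deeper; a dotted
        ;; tail counts as an element of the same list.
        (while (consp form)
          (setq d (max d (my-sexp-depth (car form))))
          (setq form (cdr form)))
        (when form
          (setq d (max d (my-sexp-depth form))))
        (1+ d)))
     ((or (vectorp form) (recordp form) (byte-code-function-p form))
      ;; Descend into byte-code objects too, since the constants vector
      ;; of #[...] is where most of the nesting in .elc files lives.
      (let ((d 0))
        (dotimes (i (length form))
          (setq d (max d (my-sexp-depth (aref form i)))))
        (1+ d)))
     (t 0)))

  (defun my-form-depths (file)
    "Return a list of depths, one per top-level form read from FILE."
    (with-temp-buffer
      (insert-file-contents file)
      (let (depths)
        (condition-case nil
            (while t
              (push (my-sexp-depth (read (current-buffer))) depths))
          (end-of-file depths)))))

  (defun my-depth-histogram (dir)
    "Alist of (DEPTH . COUNT) over top-level forms in .el/.elc under DIR."
    (let ((hist (make-hash-table)) pairs)
      (dolist (file (directory-files-recursively dir "\\.elc?\\'"))
        ;; Skip files that fail to read rather than aborting the scan.
        (ignore-errors
          (dolist (d (my-form-depths file))
            (puthash d (1+ (gethash d hist 0)) hist))))
      (maphash (lambda (k v) (push (cons k v) pairs)) hist)
      (sort pairs (lambda (a b) (< (car a) (car b))))))

Calling (my-depth-histogram "~/src/emacs") on a checkout should then
give a depth/count alist to eyeball, with .el and .elc results easy to
split out if the distributions turn out to differ.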