Package: emacs;
Reported by: Lynn Winebarger <owinebar <at> gmail.com>
Date: Wed, 25 Jun 2025 22:01:05 UTC
Severity: normal
From: Lynn Winebarger <owinebar <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 78898 <at> debbugs.gnu.org
Subject: bug#78898: Make read/readevalloop reentrant
Date: Sun, 29 Jun 2025 08:16:26 -0400

On Sun, Jun 29, 2025 at 12:51 AM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> > It's true that read and readevalloop are distinct, but, ignoring the
> > readfun parameter of readevalloop, any static variables read0 is dependent
> > on, readevalloop is also dependent on.  It's also true that not every entry
> > to read0 is an entry to readevalloop, but practically every entry to
> > readevalloop will enter read0.  And they both depend on the state accessed
> > by readchar and unreadchar.
>
> I'm afraid I still have no idea why the above is relevant.
>
> > So, readevalloop can never really be reentrant as long as
> > read0/readchar/unreadchar are not.
>
> That's obviously not true: every `load` calls `readevalloop` and from
> there it can call other `load`s.  So when file A `require`s file B which
> `require`s file C, you're using `readevalloop` in a reentrant way.

I think we are talking at cross-purposes here.  It's certainly true that readevalloop and read0 are being used recursively.  But my understanding of "X is reentrant" is something like "X is robust against failures due to recursive calls".  According to Wikipedia, the term is even stronger and implies a procedure can be safely called concurrently.

>
> > My initial (current) approach has been to identify all the private
> > state in lread.c accessed
>
> You tell me how you go about solving your problem, but I don't know
> what is the problem you're trying to solve.
>

(a) My personal immediate problem is evaluating arbitrary programs supplied by a user of a bare temacs as a command line argument, but I don't think that is in itself very motivating for emacs maintainers.
(b) For future maintainers, making it easier to debug recursive usage of the reader/readevalloop.
(c) Incremental improvement in concurrent usage of read0, at least with respect to C variables.  Dealing with the global symbol table is a blocker that isn't specific to the reader implementation so much as LISP semantics.
(d) Possibly cleaner handling of interrupts and signals raised while in read0 or readevalloop.

If the C runtime is intended to be preserved in amber (or reduced in scope), then I suppose the possibility of increasing the safety of future additional uses of these procedures from C will not be very motivating.

> > Initially, I had tried implementing a simple read-eval loop for processing
> > sequences of expressions in a string, similar to "load" for files or
> > "eval-buffer".
>
> You can easily write this in ELisp, so I don't see how that relates to
> any part of the C code.

Well, in case (a), there's a chicken and egg problem, but I don't think you will find that very motivating.

>
> > I considered that even if I determined the proximate cause of the issue, it
> > would be difficult to tell whether any particular solution would be TRT or
> > just a band-aid.  So, I suppose the most immediate advantage for me is
> > improving my ability to debug recursive invocations of read0/readevalloop
> > by looking at the explicit stack of frames.
>
> How/when&why do you expect/want `read` to be called recursively?

It can definitely be entered multiple times if a stream being read blocks and the command loop executes another function that calls read.  That can clearly happen when reading from user-supplied procedures or a raw stdio stream (i.e. from "load").  I am not at all confident that I know all the ways a maybe_quit() call can occur in primitives called by read0 (directly or indirectly) to rule it out of occurring while reading from a string.  You definitely have the advantage of me there.

If read0 can only access maybe_quit through the calls to that function that actually appear in lread.c, then maybe such a recursion can't happen.  But that is a global property that has to be enforced when working on code outside of lread.c - maybe the maintainers all know which functions are guaranteed to be "safe" in that way, but I'm not sure how the rest of us would know what those are.
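
The difference between "used recursively" and "reentrant" can be sketched with a toy reader.  This is only an illustration under stated assumptions, not code from lread.c: every name below (shared_readchar, toy_reader, and so on) is hypothetical.  It shows a one-character pushback slot kept in a file-scope static leaking into a nested read on a different stream, while the same logic with explicit per-read state does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy reader; none of these names come from lread.c. */

/* Variant 1: the pushback slot is a file-scope static shared by all reads. */
static int shared_unread = -1;

static int shared_readchar(const char **pos)
{
    if (shared_unread >= 0) {
        int c = shared_unread;
        shared_unread = -1;
        return c;
    }
    return **pos ? (unsigned char) *(*pos)++ : -1;
}

static void shared_unreadchar(int c)
{
    shared_unread = c;
}

/* Variant 2: each read in progress owns its cursor and pushback slot. */
struct toy_reader {
    const char *pos;
    int unread;               /* -1 when the slot is empty */
};

static int toy_readchar(struct toy_reader *r)
{
    assert(r != NULL);
    if (r->unread >= 0) {
        int c = r->unread;
        r->unread = -1;
        return c;
    }
    return *r->pos ? (unsigned char) *r->pos++ : -1;
}

static void toy_unreadchar(struct toy_reader *r, int c)
{
    r->unread = c;
}

/* The outer read peeks at "ab" (reads 'a', pushes it back); a nested read
   on "xy" then starts before the outer read resumes.  Returns the first
   character the nested read sees. */
int nested_char_shared(void)
{
    const char *outer = "ab";
    const char *inner = "xy";
    shared_unread = -1;
    shared_unreadchar(shared_readchar(&outer));
    return shared_readchar(&inner);   /* sees the outer stream's 'a' */
}

int nested_char_explicit(void)
{
    struct toy_reader outer = { "ab", -1 };
    struct toy_reader inner = { "xy", -1 };
    toy_unreadchar(&outer, toy_readchar(&outer));
    return toy_readchar(&inner);      /* sees its own stream's 'x' */
}
```

In this sketch both variants can be called recursively without crashing, but only the second keeps the streams separate when one read begins while another is suspended.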

>
> > So, for example, I've implemented an input stream "reader-input-stream"
> > that can be used arbitrarily by code being evaluated for characters from
> > the "current" input stream, which is pretty much impossible in the current
> > implementation.  Its main use is for "readfun" in readevalloop.
> > Implementing "eval-stream", which evaluates all input streams as programs
> > uniformly, becomes trivial.
>
> I don't know what "evaluates all input streams as programs
> uniformly" means, nor do I know what "eval-stream" refers to (is that
> the name of a standard function in some language?  What does it do?).

I added it in the 0002-...patch in one of my responses to Pip Cet above.  Interpreted languages are characterized not only by the existence of "eval" for evaluating individual top-level expressions, but also by some form of "eval-program" that reads a textual representation of a sequence of top-level expressions, parsing and evaluating one expression at a time.  Elisp has three such operators - load, eval-buffer, and eval-region - each independently responsible for ensuring the safety of mutual recursion with the others.  I am using "eval-stream" to mean an "eval-program" that can accept any of the types of input streams defined in the elisp manual.

Lynn
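
The "eval-program over any stream type" idea can be sketched in C as a single read-eval loop written against a small stream interface.  This is a toy under stated assumptions, not the patch mentioned above and not an Emacs API: the names toy_stream, toy_read, toy_eval_stream, eval_string and eval_region are hypothetical, an "expression" is just an unsigned decimal integer, and "evaluating" it means adding it to a running total.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stream abstraction; not an Emacs API. */
struct toy_stream {
    int (*next)(void *ctx);   /* next character, or -1 at end of input */
    void *ctx;
    int unread;               /* one-character pushback, -1 when empty */
};

static int stream_getc(struct toy_stream *s)
{
    if (s->unread >= 0) {
        int c = s->unread;
        s->unread = -1;
        return c;
    }
    return s->next(s->ctx);
}

static void stream_ungetc(struct toy_stream *s, int c)
{
    s->unread = c;
}

/* Backend 1: a NUL-terminated string; ctx points at a cursor. */
static int string_next(void *ctx)
{
    const char **cursor = ctx;
    return **cursor ? (unsigned char) *(*cursor)++ : -1;
}

/* Backend 2: a region [pos, end) of a character buffer. */
struct toy_region {
    const char *buf;
    size_t pos, end;
};

static int region_next(void *ctx)
{
    struct toy_region *r = ctx;
    return r->pos < r->end ? (unsigned char) r->buf[r->pos++] : -1;
}

/* "read": parse one unsigned decimal integer; returns 0 at end of input. */
static int toy_read(struct toy_stream *s, long *out)
{
    int c;
    long value = 0;

    while ((c = stream_getc(s)) != -1 && !isdigit(c))
        ;                     /* skip everything that is not a digit */
    if (c == -1)
        return 0;
    do {
        value = value * 10 + (c - '0');
    } while ((c = stream_getc(s)) != -1 && isdigit(c));
    if (c != -1)
        stream_ungetc(s, c);  /* give the delimiter back to the stream */
    *out = value;
    return 1;
}

/* "eval-stream": read and evaluate one expression at a time. */
long toy_eval_stream(struct toy_stream *s)
{
    long total = 0, value;
    while (toy_read(s, &value))
        total += value;
    return total;
}

long eval_string(const char *text)
{
    const char *cursor = text;
    struct toy_stream s = { string_next, &cursor, -1 };
    return toy_eval_stream(&s);
}

long eval_region(const char *buf, size_t start, size_t end)
{
    struct toy_region r = { buf, start, end };
    struct toy_stream s = { region_next, &r, -1 };
    return toy_eval_stream(&s);
}
```

The point of the design is that the loop in toy_eval_stream is written once; eval_string and eval_region differ only in how they build the stream, and the pushback state lives in the stream rather than in a global.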