GNU bug report logs - #78898
Make read/readevalloop reentrant

Package: emacs;

Reported by: Lynn Winebarger <owinebar <at> gmail.com>

Date: Wed, 25 Jun 2025 22:01:05 UTC

Severity: normal

To reply to this bug, email your comments to 78898 AT debbugs.gnu.org.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Wed, 25 Jun 2025 22:01:07 GMT) Full text and rfc822 format available.

Acknowledgement sent to Lynn Winebarger <owinebar <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 25 Jun 2025 22:01:07 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Make read/readevalloop reentrant
Date: Wed, 25 Jun 2025 17:59:53 -0400

[Message part 1 (text/plain, inline)]

In the recursive sense, not the "safe for concurrency" sense.

I want to use readevalloop to process command-line arguments, and my
initial attempts failed to handle '"(load "loadup.el") (princ
"Done")'.  So I took a stab at making all the entry points synchronize
with an explicit stack of continuation frames shared between read0 and
readevalloop.

The attached diff is against master from a couple of days ago.  For
me, a configure/build (from scratch) completes *almost* everything
(including the standard AOT native-compiled bits) - it chokes on
lisp/leim/ja-dic.el due to a failure in a compiled function in
bytecomp.elc.  There are clearly still some issues where the new code
is not dealing with unibyte/multibyte correctly.  For example, running
emacs -Q shows the initial scratch buffer contents as:

;; This buffer is for text that is not saved, and for Lisp evaluation.
;; To create a file, visit it with â€˜C-x C-fâ€™ and enter text in its buffer.

This is definitely not ready for committing, but I'm hoping someone
else finds this of interest and maybe can spot where I'm going wrong
with the unibyte/multibyte handling.  Or what is going on with
ja-dic.el.  The error I'm seeing there is in the execution of some
byte-code, where some bit-wise operator gets a non-numeric argument.
My only idea at present is to diff the byte-compiled libraries and see
what's going on there - theoretically, there shouldn't be any change
in the results of the byte-compiler.

Thanks,
Lynn

[reentrant-read.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Thu, 26 Jun 2025 23:17:05 GMT) Full text and rfc822 format available.

Message #8 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Thu, 26 Jun 2025 19:15:57 -0400

[Message part 1 (text/plain, inline)]

On Wed, Jun 25, 2025 at 6:01 PM Lynn Winebarger <owinebar <at> gmail.com> wrote:
>
> In the recursive sense, not the "safe for concurrency" sense.
>
> I want to use readevalloop to process command-line arguments, and my
> initial attempts failed to handle '"(load "loadup.el") (princ
> "Done")'.  So I took a stab at making all the entry points synchronize
> with an explicit stack of continuation frames shared between read0 and
> readevalloop.
>
> The attached diff is against master from a couple of days ago.  For
> me, a configure/build (from scratch) completes *almost* everything
> (including the standard AOT native-compiled bits) - it chokes on
> lisp/leim/ja-dic.el due to a failure in a compiled function in
> bytecomp.elc.  There are clearly still some issues where the new code
> is not dealing with unibyte/multibyte correctly.  For example, running
> emacs -Q shows the initial scratch buffer contents as:
>
> ;; This buffer is for text that is not saved, and for Lisp evaluation.
> ;; To create a file, visit it with â€˜C-x C-fâ€™ and enter text in its buffer.
>
> This is definitely not ready for committing, but I'm hoping someone
> else finds this of interest and maybe can spot where I'm going wrong
> with the unibyte/multibyte handling.  Or what is going on with
> ja-dic.el.  The error I'm seeing there is in the execution of some
> byte-code, where some bit-wise operator gets a non-numeric argument.
> My only idea at present is to diff the byte-compiled libraries and see
> what's going on there - theoretically, there shouldn't be any change
> in the results of the byte-compiler.


It was pointed out to me that the patch did not actually apply cleanly
to master.
I'm attaching a patch generated by 'git format-path' after installing
the modified lread.c to a freshly cloned master.  That failed to
build, so this patch is not quite the same.  The one issue I mentioned
about the initial scratch buffer seems to have resolved, at least.
When I start it without '-Q', though, the button that precedes the
missing lexical-binding cookie message is labeled with 'â\226_'
instead of whatever the text should be.  The 'ja-dic.el' file still
produces the same error, but putting a pre-built copy of ja-dic.elc in
the source tree allowed the build to complete.
My configuration options are "--enable-checking=yes,glyphs
--enable-check-lisp-object-type 'CFLAGS=-O0 -g3 -gdwarf-5
-fvar-tracking -gstatement-frontiers'
--infodir=/usr/local/share/emacs/31/share/info
--mandir=/usr/local/share/emacs/31/share/man
--docdir=/usr/local/share/emacs/31/share/doc/emacs
--localedir=/usr/local/share/emacs/31/share/locale --with-x --with-xim
--with-sound --with-xpm --with-jpeg --with-tiff --with-gif --with-png
--with-rsvg --with-imagemagick --with-dbus --with-gpm
--with-x-toolkit=gtk3 --with-toolkit-scroll-bars --with-libotf
--with-m17n-flt --with-cairo --with-xwidgets --disable-build-details
--without-pop --with-mailutils --without-hesiod --with-gameuser=:games
--with-kerberos --with-kerberos5 --with-file-notification=inotify
--with-modules --with-tree-sitter NATIVE_COMPILATION_AOT=yes"
Although it is not currently building all the libraries natively AOT.
Which is probably good, because the native compiler seems to be even
slower than usual while bootstrapping.

[0001-Changes-to-make-entry-to-read0-and-readevalloop-recu.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Fri, 27 Jun 2025 06:46:02 GMT) Full text and rfc822 format available.

Message #11 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Fri, 27 Jun 2025 02:45:22 -0400

[Message part 1 (text/plain, inline)]

I tried explicitly forcing emacs-internal as the coding-system for
load_source_file, but without any noticeable difference.
The button preceding the warning messages should be a stop sign emoji.
Using the built emacs to view emacs-lisp/warning.el shows the stop
sign emoji clearly, but it is apparently mangled in the generated
byte-code in warning.elc.
I also built a clean emacs from commit
2bdcf0250acecdb0719203ae711aedf5baad1783.  When I compared the source
elisp files (after building) I see differences in these files:
lisp/international/uni-bidi.el
lisp/international/uni-brackets.el
lisp/international/uni-category.el
lisp/international/uni-combining.el
lisp/international/uni-comment.el
lisp/international/uni-decomposition.el
lisp/international/uni-old-name.el
lisp/international/uni-name.el
lisp/leim/leim-list.el
lisp/leim/ja-dic/ja-dic.el
lisp/net/tramp-loaddefs.el
lisp/textmodes/reftex-loaddefs.el
lisp/loaddefs.el
lisp/finder-inf.el
lisp/cus-load.el

In the byte-compiled files, I see differences for:
lisp/emacs-lisp/eieio-opt.elc
lisp/erc/erc-nicks.elc
lisp/erc/erc.elc
lisp/eshell/em-cmpl.elc
lisp/gnus/gnus-group.elc
lisp/gnus/gnus-srvr.elc
lisp/gnus/gnus-topic.elc
lisp/gnus/gnus-sieve.elc
lisp/gnus/gnus-search.elc
lisp/gnus/message.elc
lisp/gnus/nnfeed.elc
lisp/gnus/nnml.elc
lisp/gnus/nnimap.elc
lisp/gnus/nnagent.elc
lisp/gnus/nnmaildir.elc
lisp/gnus/gnus-start.elc
lisp/gnus/gnus-sum.elc
lisp/gnus/nnmairix.elc
lisp/gnus/nnselect.elc
lisp/net/tramp-cache.elc
lisp/net/tramp-cmds.elc
lisp/net/tramp-gvfs.elc
lisp/net/tramp-sh.elc
lisp/obsolete/nnir.elc
lisp/org/org-agenda.elc
lisp/org/org-lint.elc
lisp/org/ox-ascii.elc
lisp/org/ox-odt.elc
lisp/org/org.elc
lisp/progmodes/eglot.elc
lisp/progmodes/cc-engine.elc
lisp/progmodes/cc-mode.elc
lisp/progmodes/js.elc
lisp/use-package/use-package-core.elc
lisp/loaddefs.elc

I'm not clear on why loaddefs.el should be different.  That, and
eieio-opt.elc being different is unexpected.

Lynn

[0002-Try-explicit-use-of-emacs-internal-coding-in-load_so.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Fri, 27 Jun 2025 09:21:02 GMT) Full text and rfc822 format available.

Message #14 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> protonmail.com>
To: Lynn Winebarger <owinebar <at> gmail.com>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Fri, 27 Jun 2025 09:19:59 +0000

"Lynn Winebarger" <owinebar <at> gmail.com> writes:

> I tried explicitly forcing emacs-internal as the coding-system for
> load_source_file, but without any noticeable difference.

Hmm.  I think the problem there is that you specbind
Qcoding_system_for_read to Qemacs_internal while calling
dump-emacs-portable.  The pdumper doesn't unwind specbind bindings, so
the Vcoding_system_for_read variable is dumped with that value, and
restored with that value from the dump, and then things go wrong because
other code assumes Vcoding_system_for_read is non-nil.

> The button preceding the warning messages should be a stop sign emoji.
> Using the built emacs to view emacs-lisp/warning.el shows the stop
> sign emoji clearly, but it is apparently mangled in the generated
> byte-code in warning.elc.

To me, it looks like it's encoded in warnings.elc as a UTF-8 sequence.
But you force coding_system_for_read to emacs-internal, so it's read as
garbage.  The same problem, by the way, applies to ja-dic-cnv.el, which
contains verbatim UTF-8 data which is then copied into ja-dic-cnv.elc
and misread.




> I also built a clean emacs from commit
> 2bdcf0250acecdb0719203ae711aedf5baad1783.  When I compared the source
> elisp files (after building) I see differences in these files:
> lisp/international/uni-bidi.el
> lisp/international/uni-brackets.el
> lisp/international/uni-category.el
> lisp/international/uni-combining.el
> lisp/international/uni-comment.el
> lisp/international/uni-decomposition.el
> lisp/international/uni-old-name.el
> lisp/international/uni-name.el
> lisp/leim/leim-list.el
> lisp/leim/ja-dic/ja-dic.el
> lisp/net/tramp-loaddefs.el
> lisp/textmodes/reftex-loaddefs.el
> lisp/loaddefs.el
> lisp/finder-inf.el
> lisp/cus-load.el
>
> In the byte-compiled files, I see differences for:
> lisp/emacs-lisp/eieio-opt.elc
> lisp/erc/erc-nicks.elc
> lisp/erc/erc.elc
> lisp/eshell/em-cmpl.elc
> lisp/gnus/gnus-group.elc
> lisp/gnus/gnus-srvr.elc
> lisp/gnus/gnus-topic.elc
> lisp/gnus/gnus-sieve.elc
> lisp/gnus/gnus-search.elc
> lisp/gnus/message.elc
> lisp/gnus/nnfeed.elc
> lisp/gnus/nnml.elc
> lisp/gnus/nnimap.elc
> lisp/gnus/nnagent.elc
> lisp/gnus/nnmaildir.elc
> lisp/gnus/gnus-start.elc
> lisp/gnus/gnus-sum.elc
> lisp/gnus/nnmairix.elc
> lisp/gnus/nnselect.elc
> lisp/net/tramp-cache.elc
> lisp/net/tramp-cmds.elc
> lisp/net/tramp-gvfs.elc
> lisp/net/tramp-sh.elc
> lisp/obsolete/nnir.elc
> lisp/org/org-agenda.elc
> lisp/org/org-lint.elc
> lisp/org/ox-ascii.elc
> lisp/org/ox-odt.elc
> lisp/org/org.elc
> lisp/progmodes/eglot.elc
> lisp/progmodes/cc-engine.elc
> lisp/progmodes/cc-mode.elc
> lisp/progmodes/js.elc
> lisp/use-package/use-package-core.elc
> lisp/loaddefs.elc
>
> I'm not clear on why loaddefs.el should be different.  That, and
> eieio-opt.elc being different is unexpected.

Did you use a parallel build?  As our defsubst expansion is
unpredictable in parallel builds, there may be bytecode differences that
will simply go away if you rebuild (bootstrap) without the "-j" flag.  I
suspect that's what happened here, since I don't see a difference in
that file.


>
> Lynn
>
> [2. text/x-patch; 0002-Try-explicit-use-of-emacs-internal-coding-in-load_so.patch]...

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Fri, 27 Jun 2025 09:43:01 GMT) Full text and rfc822 format available.

Message #17 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> protonmail.com>
To: Lynn Winebarger <owinebar <at> gmail.com>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Fri, 27 Jun 2025 09:41:58 +0000

"Lynn Winebarger" <owinebar <at> gmail.com> writes:

Thanks for providing an updated patch!  It applies and, with the changes
below, it seems to work for me.

I haven't really looked at the code yet, just tried to get it to work.
I did notice that, at least with debug CFLAGS, Emacs startup is much
slower than it was before.

Is there any possibility of splitting this into several smaller patches?
I get the impression it would be a lot of work, but maybe I'm wrong.

> I tried explicitly forcing emacs-internal as the coding-system for
> load_source_file, but without any noticeable difference.

Hmm.  I think the problem there is that you specbind
Qcoding_system_for_read to Qemacs_internal while calling
dump-emacs-portable.  The pdumper doesn't unwind specbind bindings, so
the Vcoding_system_for_read variable is dumped with that value, and
restored with that value from the dump, and then things go wrong because
other code assumes Vcoding_system_for_read is nil.

> The button preceding the warning messages should be a stop sign emoji.
> Using the built emacs to view emacs-lisp/warning.el shows the stop
> sign emoji clearly, but it is apparently mangled in the generated
> byte-code in warning.elc.

To me, it looks like it's encoded in warnings.elc as a UTF-8 sequence.
But you force coding_system_for_read to emacs-internal, so it's read as
garbage.  The same problem, by the way, applies to ja-dic-cnv.el, which
contains verbatim UTF-8 data which is then copied into ja-dic-cnv.elc
and misread, which causes the wrong-type-argument you've described.

Here's a non-fix:

diff --git a/lisp/emacs-lisp/warnings.el b/lisp/emacs-lisp/warnings.el
index 6c77d57a6ba..ae02bc2a94b 100644
--- a/lisp/emacs-lisp/warnings.el
+++ b/lisp/emacs-lisp/warnings.el
@@ -1,4 +1,4 @@
-;;; warnings.el --- log and display warnings  -*- lexical-binding:t -*-
+;;; warnings.el --- log and display warnings  -*- lexical-binding:t; no-byte-compile: t -*-

 ;; Copyright (C) 2002-2025 Free Software Foundation, Inc.

diff --git a/lisp/international/ja-dic-cnv.el b/lisp/international/ja-dic-cnv.el
index 901d1742af5..2c76291a8f3 100644
--- a/lisp/international/ja-dic-cnv.el
+++ b/lisp/international/ja-dic-cnv.el
@@ -1,4 +1,4 @@
-;;; ja-dic-cnv.el --- convert a Japanese dictionary (SKK-JISYO.L) to Emacs Lisp  -*- lexical-binding: t; -*-
+;;; ja-dic-cnv.el --- convert a Japanese dictionary (SKK-JISYO.L) to Emacs Lisp  -*- lexical-binding: t; no-byte-compile: t -*-

 ;; Copyright (C) 2001-2025 Free Software Foundation, Inc.

And these two look like necessary fixes to me:

diff --git a/src/lread.c b/src/lread.c
index cb8559f455c..6dd06614270 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -1217,7 +1217,7 @@ #define READCHAR_BEGIN(frame) \
   while (0)

 #define READCHAR_FROM_FILE_BEGIN(frame) \
-  unsigned char buf[4]; \
+  unsigned char buf[MAX_MULTIBYTE_LENGTH]; \
   READCHAR_BEGIN(frame); \
   c = readbyte_from_file (frame); \
   if (c < 0) \
diff --git a/src/pdumper.c b/src/pdumper.c
index b3de90bfa03..536716bdba0 100644
--- a/src/pdumper.c
+++ b/src/pdumper.c
@@ -4243,6 +4243,7 @@ DEFUN ("dump-emacs-portable",
   ctx->old_process_environment = Vprocess_environment;
   Vprocess_environment = Qnil;

+  Vcoding_system_for_read = Qnil;
   {
     USE_SAFE_ALLOCA;

With this patch, I can build Emacs and launch it.

The rest of the email is about the specific differences you mentioned.

> I also built a clean emacs from commit
> 2bdcf0250acecdb0719203ae711aedf5baad1783.  When I compared the source
> elisp files (after building) I see differences in these files:
> lisp/international/uni-bidi.el
> lisp/international/uni-brackets.el
> lisp/international/uni-category.el
> lisp/international/uni-combining.el
> lisp/international/uni-comment.el
> lisp/international/uni-decomposition.el
> lisp/international/uni-old-name.el
> lisp/international/uni-name.el

On my system, these files include the path to the emacs directory, so
the differences would make sense unless you built both trees in the same
directory.

> lisp/leim/leim-list.el
> lisp/leim/ja-dic/ja-dic.el

No differences here, after the patch.

> lisp/net/tramp-loaddefs.el

Or here; as tramp makes heavy use of defsubst, in parallel builds, there
will often be differences in the bytecode/generated files produced for
tramp.

> lisp/textmodes/reftex-loaddefs.el
> lisp/finder-inf.el

No differences here.

> lisp/loaddefs.el
> lisp/cus-load.el

Differences here, but they look harmless.

> In the byte-compiled files, I see differences for:
> lisp/emacs-lisp/eieio-opt.elc
> lisp/erc/erc-nicks.elc
> lisp/erc/erc.elc
> lisp/eshell/em-cmpl.elc
> lisp/gnus/gnus-group.elc
> lisp/gnus/gnus-srvr.elc
> lisp/gnus/gnus-topic.elc
> lisp/gnus/gnus-sieve.elc
> lisp/gnus/gnus-search.elc
> lisp/gnus/message.elc
> lisp/gnus/nnfeed.elc
> lisp/gnus/nnml.elc
> lisp/gnus/nnimap.elc
> lisp/gnus/nnagent.elc
> lisp/gnus/nnmaildir.elc
> lisp/gnus/gnus-start.elc
> lisp/gnus/gnus-sum.elc
> lisp/gnus/nnmairix.elc
> lisp/gnus/nnselect.elc
> lisp/net/tramp-cache.elc
> lisp/net/tramp-cmds.elc
> lisp/net/tramp-gvfs.elc
> lisp/net/tramp-sh.elc
> lisp/obsolete/nnir.elc
> lisp/org/org-agenda.elc
> lisp/org/org-lint.elc
> lisp/org/ox-ascii.elc
> lisp/org/ox-odt.elc
> lisp/org/org.elc
> lisp/progmodes/eglot.elc
> lisp/progmodes/cc-engine.elc
> lisp/progmodes/cc-mode.elc
> lisp/progmodes/js.elc
> lisp/use-package/use-package-core.elc
> lisp/loaddefs.elc

I haven't looked over all these files.  Some of them include the source
path, but see below for eieio-opt.elc.

> I'm not clear on why loaddefs.el should be different.  That, and
> eieio-opt.elc being different is unexpected.

Did you use a parallel build?  As our defsubst expansion is
unpredictable in parallel builds, there may be bytecode differences that
will simply go away if you rebuild (bootstrap) without the "-j" flag.  I
suspect that's what happened here, since I don't see a difference in
that file.

It might be worth it to build both trees in the same file system
directory, and to use a non-parallel make, then run diff -aur a/lisp
b/lisp to look for actual differences.  (I'd suggest disabling native
compilation for now if that takes too long).

Pip

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sat, 28 Jun 2025 10:10:05 GMT) Full text and rfc822 format available.

Message #20 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Lynn Winebarger <owinebar <at> gmail.com>,
 Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Mattias Engdegård <mattiase <at> acm.org>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sat, 28 Jun 2025 13:08:44 +0300

> From: Lynn Winebarger <owinebar <at> gmail.com>
> Date: Wed, 25 Jun 2025 17:59:53 -0400
> 
> In the recursive sense, not the "safe for concurrency" sense.
> 
> I want to use readevalloop to process command-line arguments, and my
> initial attempts failed to handle '"(load "loadup.el") (princ
> "Done")'.  So I took a stab at making all the entry points synchronize
> with an explicit stack of continuation frames shared between read0 and
> readevalloop.
> 
> The attached diff is against master from a couple of days ago.  For
> me, a configure/build (from scratch) completes *almost* everything
> (including the standard AOT native-compiled bits) - it chokes on
> lisp/leim/ja-dic.el due to a failure in a compiled function in
> bytecomp.elc.  There are clearly still some issues where the new code
> is not dealing with unibyte/multibyte correctly.  For example, running
> emacs -Q shows the initial scratch buffer contents as:
> 
> ;; This buffer is for text that is not saved, and for Lisp evaluation.
> ;; To create a file, visit it with â€˜C-x C-fâ€™ and enter text in its buffer.
> 
> This is definitely not ready for committing, but I'm hoping someone
> else finds this of interest and maybe can spot where I'm going wrong
> with the unibyte/multibyte handling.  Or what is going on with
> ja-dic.el.  The error I'm seeing there is in the execution of some
> byte-code, where some bit-wise operator gets a non-numeric argument.
> My only idea at present is to diff the byte-compiled libraries and see
> what's going on there - theoretically, there shouldn't be any change
> in the results of the byte-compiler.

Stefan and Mattias, any comments to this proposal?

Do we really want readevalloop be recursive this way?  What does this
gain us?

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sat, 28 Jun 2025 13:51:07 GMT) Full text and rfc822 format available.

Message #23 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Pip Cet <pipcet <at> protonmail.com>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sat, 28 Jun 2025 09:50:07 -0400

[Message part 1 (text/plain, inline)]

On Fri, Jun 27, 2025 at 5:42 AM Pip Cet <pipcet <at> protonmail.com> wrote:
>
> "Lynn Winebarger" <owinebar <at> gmail.com> writes:
>
> Thanks for providing an updated patch!  It applies and, with the changes
> below, it seems to work for me.
>
> I haven't really looked at the code yet, just tried to get it to work.
> I did notice that, at least with debug CFLAGS, Emacs startup is much
> slower than it was before.

That's interesting - I use the debug CFLAGS, and I was surprised that I
didn't notice any slowdown at startup.  I replaced a bunch of static
variables with indirect references to a stack-allocated "frame" data
structure, so I expected some pessimization.  I did use "register" for the
frame parameter, in the hope that the compiler would at least generate code
that wouldn't be much worse than for ordinary stack-allocated variables.
There are other optimization opportunities, I just got bitten by Knuth's
"root of all evil" enough that I gave up on optimizing before getting the
logic right.  Are you using native compilation in your build?  The
comp-abi-hash changes since I added some subrs. Could you tell me the
configuration?  My build/test is on x86_64 Linux.

If I isolate the changes to just the re-entrancy, I could try making read0
a monolithic procedure so the read stack is a local variable, and making
inline versions of the read char procedures in that procedure, and use the
tricks in bytecode.c for using computed goto in read0 to use local
variables uniformly in read0.  That should take care of some of the
pessimization, at least.

>
> Is there any possibility of splitting this into several smaller patches?
> I get the impression it would be a lot of work, but maybe I'm wrong.

Making read0/readevalloop re-entrant requires a global control-flow
transformation, so most of it is necessary. Two pieces that aren't strictly
necessary, but are in it because this was actually the last step in my
initial development of a "noneditor" mode for running command-line type
emacs-lisp programs with non-standard dump-files.

FIrst, the load_source_file step, because I want to try building a minimal
command-line compiler for boot strapping and async compiling and think it
ought to be able to handle all the source files in the emacs lisp
distribution without having to load every character set and coding system
available into the dump file.

Second, I modified lisp_file_lexical_cookie to allow the lexical-binding
cookie on any line in a leading comment block, rather than just the first
line.  Now that the cookie is being warned about, I would prefer it if the
system didn't force users to violate their style requirements by making the
first line obscenely long.  However, it makes use of the
"lread_rewind_input" to get around possible limitations on ungetc.

If I remove those two pieces, I think the rest will probably work.

> > I tried explicitly forcing emacs-internal as the coding-system for
> > load_source_file, but without any noticeable difference.
>
> Hmm.  I think the problem there is that you specbind
> Qcoding_system_for_read to Qemacs_internal while calling
> dump-emacs-portable.  The pdumper doesn't unwind specbind bindings, so
> the Vcoding_system_for_read variable is dumped with that value, and
> restored with that value from the dump, and then things go wrong because
> other code assumes Vcoding_system_for_read is nil.

I'm sure that is a big, but it didn't fix the problem for me (ignoring the
the 2 non-fixes).  It looks like the real issue is the bootstrap generation
of charset and coding system files by elisp programs run by
bootstrap-emacs.  Some of those generated files are basically compiled
elisp, so should be loaded like elc files.  I'm not sure why they aren't
generated as elc files in the first place.

I'm addressing that issue by building our a primitive set-auto-code
function that detects coding cookies in the header, and modifying the lisp
file generation code to create coding cookies instead of in the local
variable section.  Then, check if the coding system is actually defined and
use it if so, or no-conversion if not.  Then, I am only adding definitions
for utf-8 and utf-8-emacs if the system is not dumping (but isn't
initialized from a dump file either).  That should restore the previous
behavior.  Maybe it will also remove some of the unidata-related fragility
currently required in loadup.el, I'm not sure.
Thanks for the additional notes, they have really helped!

Lynn

[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sat, 28 Jun 2025 14:59:02 GMT) Full text and rfc822 format available.

Message #26 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> protonmail.com>
To: Lynn Winebarger <owinebar <at> gmail.com>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sat, 28 Jun 2025 14:57:53 +0000

"Lynn Winebarger" <owinebar <at> gmail.com> writes:

> On Fri, Jun 27, 2025 at 5:42 AM Pip Cet <pipcet <at> protonmail.com> wrote:
>>
>> "Lynn Winebarger" <owinebar <at> gmail.com> writes:
>>
>> Thanks for providing an updated patch!  It applies and, with the changes
>> below, it seems to work for me.
>>
>> I haven't really looked at the code yet, just tried to get it to work.
>> I did notice that, at least with debug CFLAGS, Emacs startup is much
>> slower than it was before.
>
> That's interesting - I use the debug CFLAGS, and I was surprised that
> I didn't notice any slowdown at startup.

It's entirely possible it's just the nativecomp abi hash changing, I
didn't think of that!  In any case, it's too early to benchmark these
changes, IMHO.

In particular, I tried running "make check", and while I've reduced the
number of local test failures a little (see below), there are still some
mysterious ones.

> I replaced a bunch of static variables with indirect references to a
> stack-allocated "frame" data structure, so I expected some
> pessimization.

That doesn't sound like it would be noticeable, at least when
optimization is on.

> I did use "register" for the frame parameter, in the hope that the
> compiler would at least generate code that wouldn't be much worse than
> for ordinary stack-allocated variables.

I think "register" provides no benefit (it makes non-optimizing GCC put
things into registers, sometimes, which is undesirable for debugging)
and harms readability, but others disagree.

> There are other optimization opportunities, I just got bitten by
> Knuth's "root of all evil" enough that I gave up on optimizing before
> getting the logic right.

Sounds like the right approach to me.

> Are you using native compilation in your build?  The comp-abi-hash
> changes since I added some subrs. Could you tell me the configuration?
> My build/test is on x86_64 Linux.

Same here.

> If I isolate the changes to just the re-entrancy, I could try making
> read0 a monolithic procedure so the read stack is a local variable,
> and making inline versions of the read char procedures in that
> procedure, and use the tricks in bytecode.c for using computed goto in
> read0 to use local variables uniformly in read0.  That should take
> care of some of the pessimization, at least.

I'd suggest attempting that only if it turns out to be absolutely
necessary!  That would hurt readability of this code a lot, and we need
more people to understand it.

>> Is there any possibility of splitting this into several smaller patches?
>> I get the impression it would be a lot of work, but maybe I'm wrong.
>
> Making read0/readevalloop re-entrant requires a global control-flow
> transformation, so most of it is necessary. Two pieces that aren't
> strictly necessary, but are in it because this was actually the last
> step in my initial development of a "noneditor" mode for running
> command-line type emacs-lisp programs with non-standard dump-files.

Even if it isn't possible to split the changeset into functional
patches, splitting it into separate patches may make the diffs more
readable purely for syntactic reasons.  The first patch could merely add
unused macro arguments to READCHAR etc; that would help diff (the
program), in subsequent patches, to anchor the diff in the right
place.  (For the same reason, I'd refrain from whitespace changes and
drive-by fixes).

> FIrst, the load_source_file step, because I want to try building a
> minimal command-line compiler for boot strapping and async compiling
> and think it ought to be able to handle all the source files in the
> emacs lisp distribution without having to load every character set and
> coding system available into the dump file.

> Second, I modified lisp_file_lexical_cookie to allow the
> lexical-binding cookie on any line in a leading comment block, rather
> than just the first line.  Now that the cookie is being warned about,
> I would prefer it if the system didn't force users to violate their
> style requirements by making the first line obscenely long.  However,
> it makes use of the "lread_rewind_input" to get around possible
> limitations on ungetc.
>
> If I remove those two pieces, I think the rest will probably work.

I think removing them for now would be a good idea.  The first change
sounds like a great idea, but does it really depend on a reentrant
reader?

>> > I tried explicitly forcing emacs-internal as the coding-system for
>> > load_source_file, but without any noticeable difference.
>>
>> Hmm.  I think the problem there is that you specbind
>> Qcoding_system_for_read to Qemacs_internal while calling
>> dump-emacs-portable.  The pdumper doesn't unwind specbind bindings, so
>> the Vcoding_system_for_read variable is dumped with that value, and
>> restored with that value from the dump, and then things go wrong because
>> other code assumes Vcoding_system_for_read is nil.
>
> I'm sure that is a big, but it didn't fix the problem for me (ignoring
> the the 2 non-fixes).

I'm not sure what "the problem" is here; IIRC, the pdumper change fixes
some of the differences in the generated .el files, while the non-fixes
can be dropped if we use coding = Qutf_8 instead of coding =
Qno_conversion in Fload.  (But I'm not sure that's the right fix).

> It looks like the real issue is the bootstrap generation of charset
> and coding system files by elisp programs run by bootstrap-emacs.
> Some of those generated files are basically compiled elisp, so should
> be loaded like elc files.  I'm not sure why they aren't generated as
> elc files in the first place.

I'm not sure how .elc files should be loaded, TBH.  Your current code
sets coding = Qno_conversion, but some of the .elc files (like
warnings.elc) contain unescaped UTF-8.  Setting coding = Qutf_8 seems to
work a bit better.

> I'm addressing that issue by building our a primitive set-auto-code
> function that detects coding cookies in the header, and modifying the
> lisp file generation code to create coding cookies instead of in the
> local variable section.  Then, check if the coding system is actually
> defined and use it if so, or no-conversion if not.  Then, I am only
> adding definitions for utf-8 and utf-8-emacs if the system is not
> dumping (but isn't initialized from a dump file either).  That should
> restore the previous behavior.  Maybe it will also remove some of the
> unidata-related fragility currently required in loadup.el, I'm not
> sure.

That sounds more complicated than what we do now.  I don't understand
what changed to make such additional complexities necessary, and I don't
see the connection between coding systems and making lread.c reentrant.

> Thanks for the additional notes, they have really helped!

Feel free to ignore this diff, it serves mostly to show up the places
where I changed something in an attempt to reduce test case failures
(some offsets expected as extracted from a larger diff).  The changes
themselves should be treated as highly questionable :-)

diff --git a/src/lread.c b/src/lread.c
index cb8559f455c..b9cb57c7a75 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -1217,7 +1218,7 @@ #define READCHAR_BEGIN(frame) \
   while (0)
 
 #define READCHAR_FROM_FILE_BEGIN(frame) \
-  unsigned char buf[4]; \
+  unsigned char buf[MAX_MULTIBYTE_LENGTH]; \
   READCHAR_BEGIN(frame); \
   c = readbyte_from_file (frame); \
   if (c < 0) \
@@ -3000,7 +3001,7 @@ DEFUN ("load", Fload, Sload, 1, 5, 0,
     }
   /* we may or may not need this */
   lread_frame this_frame = INIT_LREAD_FRAME ();
-  Lisp_Object coding = Qno_conversion;
+  Lisp_Object coding = Qutf_8;
   this_frame.loop_dynamic_frame = SPECPDL_INDEX();
   if (is_module || is_native_elisp)
     {
@@ -3891,7 +3892,8 @@ nonstandard_read (Lisp_Object frameptr)
 {
   register lread_frame *frame = xmint_pointer (frameptr);
   Lisp_Object val;
-  if (!NILP (frame->readfun))
+  if (!NILP (frame->readfun)
+      && !NILP (Fequal (frame->start, frame->end)))
     {
       val = calln (frame->readfun, Qreader_input_stream);
 
@@ -4124,7 +4126,7 @@ enter_buffer_readevalloop (Lisp_Object buffer,
     tem = printflag;
   if (NILP (filename))
     filename = BVAR (XBUFFER (buf), filename);
-  if (NILP (Ffile_name_absolute_p (filename)))
+  if (!NILP (filename) && NILP (Ffile_name_absolute_p (filename)))
     full_name = Fexpand_file_name (filename, Qnil);
   else
     full_name = filename;
@@ -7540,6 +7535,7 @@ activate_lread_frame (lread_frame *new_frame, lread_frame_type new_type)
   switch (frame->intype)
     {
     case LREAD_INPUT_FILE:
+    case LREAD_INPUT_FILE_MULE:
       if (frame->input.f.stream == stdin)
 	{
 	  frame->unread_char = stdin_infile.unread_char;


Pip

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sat, 28 Jun 2025 16:24:02 GMT) Full text and rfc822 format available.

Message #29 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Lynn Winebarger <owinebar <at> gmail.com>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sat, 28 Jun 2025 12:22:57 -0400

> I want to use readevalloop to process command-line arguments, and my
> initial attempts failed to handle '"(load "loadup.el") (princ
> "Done")'.  So I took a stab at making all the entry points synchronize
> with an explicit stack of continuation frames shared between read0 and
> readevalloop.

I'm afraid with the `Subject:` and the above paragraph, I'm still not
able to figure out what you're trying to achieve (and the patch is
quite large).

Also `read` and `readevalloop` are quite different.  Do you want both to
be re-entrant?  Is it in both cases for the same reason?
AFAIK `readevalloop` is also re-entrant in that you can call it
recursively from its "eval" part.  What do you mean by "process
command-line arguments"?


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sun, 29 Jun 2025 02:39:02 GMT) Full text and rfc822 format available.

Message #32 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sat, 28 Jun 2025 22:38:09 -0400

[Message part 1 (text/plain, inline)]

On Sat, Jun 28, 2025, 12:23 PM Stefan Monnier <monnier <at> iro.umontreal.ca>
wrote:

> > I want to use readevalloop to process command-line arguments, and my
> > initial attempts failed to handle '"(load "loadup.el") (princ
> > "Done")'.  So I took a stab at making all the entry points synchronize
> > with an explicit stack of continuation frames shared between read0 and
> > readevalloop.
>
> I'm afraid with the `Subject:` and the above paragraph, I'm still not
> able to figure out what you're trying to achieve (and the patch is
> quite large).
>
> Also `read` and `readevalloop` are quite different.  Do you want both to
> be re-entrant?  Is it in both cases for the same reason?
> AFAIK `readevalloop` is also re-entrant in that you can call it
> recursively from its "eval" part.  What do you mean by "process
> command-line arguments"?
>

It's true that read and readevalloop are distinct, but, ignoring the
readfun parameter of readevalloop, any static variables read0 is dependent
on, readevalloop is also dependent on.  It's also true that not every entry
to read0 is an entry to readevalloop, but practically every entry to
readevalloop will enter read0.  And they both depend on the state accessed
by readchar and unreadchar.  So, readevalloop can never really be reentrant
as long as read0/readchar/unreadchar are not.  My initial (current)
approach has been to identify all the private state in lread.c accessed
(directly or indirectly) and create a single data structure representing
that state as a collective stack frame for essentially all the procedures
of lread.c in the transitive closure of readevalloop 's call graph.  It's
true that this data structure is bigger than necessary, but readevalloop
only has to allocate one frame that will be used by each invocation of
read0 it makes in that activation frame of readevalloop.

Initially, I had tried implementing a simple read-eval loop for processing
sequences of expressions in a string, similar to "load" for files or
"eval-buffer".  I had implemented a new temacs mode, eval-args, which skips
loadup and all other command line options processing, to evaluate programs
given as command line arguments. I made read_internal_start reentrant with
respect to the obvious string state variables using unwind-protect and
specbind techniques.  However, it wasn't enough, because while it worked
for simple cases like "(princ comp-abi-hash) (terpri)" or even a simple
repl on stdin, the program '(load "loadup.el") (print "Finished")' would
complete loading load up, but immediately exit without printing
"Finished".  Or, if I typed '(load "loadup.el")' into that simple repl, it
would immediately exit after completing the load.

I considered that even if I determined the proximate cause of the issue, it
would be difficult to tell whether any particular solution would be TRT or
just a band-aid. So, I suppose the most immediate advantage for me is
improving my ability to debug recursive invocations of read0/readevalloop
by looking at the explicit stack of frames.  In lread.c's current state,
you can't tell just from looking at the static variables how or whether
they are related to one another, and in what order, at least not without
travelling along the backtrace and digging through the specpdl entries.

So, for example, I've implemented an input stream "reader-input-stream"
that can be used arbitrarily by code being evaluated for characters from
the "current" input stream, which is pretty much impossible in the current
implementation.  It's main use is for "readfun" in readevalloop.
Implementing "eval-stream", which evaluates all input streams as programs
uniformly, becomes trivial.

Another consideration is just the incremental removal of some barriers to
concurrency .  Really, the "current_frame" variable should be thread local,
the way specpdl variables are.  It just seemed like too much for my first
cut at this.  Obviously, it's just a drop in the bucket, but with the
progress of the igc branch, effective concurrency is looking more feasible.

That said, the patch was derived by dropping the lread.c from that other
project into a recent master, and hacking on it a bit to make it function
on a stand-alone basis.  So I can cut out some of the elements of the patch
that aren't really needed for re-entrancy.

Some of the changes (like moving a lot of definitions/declarations to the
top of the file) are just because I had to identify the functions that were
being changed to take the single "frame" parameter and keep track of them
somehow.  There was a lot of trial and error in figuring out what
information belonged to the "input" based on the type of stream it
represented, versus the fields that were properly part of the general logic
in read0/readevalloop/etc.  I'm sure there's room for improvement, but my
brain is a bit fried from combing over the code.

I also made a minor effort to differentiate between the case where
readevalloop initiates and eval and is responsible for handling errors,
versus the case where an intervening evaluation began due to an interrupt
(i.e. while processing input events or a timer in a maybe_quit()
somewhere).  It's half-baked because I only implemented it for calls to
maybe_quit() that appear in lread.c, just to see if I understood how I
might make it work.  In particular, at least for signals processed during
those particular maybe_quit calls, An unhandled end-of-file error generated
during one of those intervening evaluations will not be intercepted by
readevalloop and mistaken for ending the current reader stream.  Instead,
all such spurious signals are passed down to the entry point of that
activation of readevalloop (or read) and then.rethrown appropriately.  That
piece isn't really necessary, as any real solution will require overhauling
the way interrupts are handled.

At any rate, I'm working on cutting down the patch.  Pip asked that it be
split into smaller chunks.  I can do that, but if I don't think I'd want to
guarantee that the project would necessarily even compile after applying a
subset of the patches, even when applied in order.  Is that acceptable?

Lynn

[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sun, 29 Jun 2025 04:25:03 GMT) Full text and rfc822 format available.

Message #35 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Pip Cet <pipcet <at> protonmail.com>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sun, 29 Jun 2025 00:24:21 -0400

On Sat, Jun 28, 2025 at 10:57 AM Pip Cet <pipcet <at> protonmail.com> wrote:
>
> "Lynn Winebarger" <owinebar <at> gmail.com> writes:
>
> > On Fri, Jun 27, 2025 at 5:42 AM Pip Cet <pipcet <at> protonmail.com> wrote:
> >>
> >> "Lynn Winebarger" <owinebar <at> gmail.com> writes:
> >>
> >> Thanks for providing an updated patch!  It applies and, with the changes
> >> below, it seems to work for me.
> >>
> >> I haven't really looked at the code yet, just tried to get it to work.
> >> I did notice that, at least with debug CFLAGS, Emacs startup is much
> >> slower than it was before.
> >
> > That's interesting - I use the debug CFLAGS, and I was surprised that
> > I didn't notice any slowdown at startup.
>
> It's entirely possible it's just the nativecomp abi hash changing, I
> didn't think of that!  In any case, it's too early to benchmark these
> changes, IMHO.
>
> In particular, I tried running "make check", and while I've reduced the
> number of local test failures a little (see below), there are still some
> mysterious ones.
>
> > I replaced a bunch of static variables with indirect references to a
> > stack-allocated "frame" data structure, so I expected some
> > pessimization.
>
> That doesn't sound like it would be noticeable, at least when
> optimization is on.
>
> > I did use "register" for the frame parameter, in the hope that the
> > compiler would at least generate code that wouldn't be much worse than
> > for ordinary stack-allocated variables.
>
> I think "register" provides no benefit (it makes non-optimizing GCC put
> things into registers, sometimes, which is undesirable for debugging)
> and harms readability, but others disagree.
>
> > There are other optimization opportunities, I just got bitten by
> > Knuth's "root of all evil" enough that I gave up on optimizing before
> > getting the logic right.
>
> Sounds like the right approach to me.
>
> > Are you using native compilation in your build?  The comp-abi-hash
> > changes since I added some subrs. Could you tell me the configuration?
> > My build/test is on x86_64 Linux.
>
> Same here.
>
> > If I isolate the changes to just the re-entrancy, I could try making
> > read0 a monolithic procedure so the read stack is a local variable,
> > and making inline versions of the read char procedures in that
> > procedure, and use the tricks in bytecode.c for using computed goto in
> > read0 to use local variables uniformly in read0.  That should take
> > care of some of the pessimization, at least.
>
> I'd suggest attempting that only if it turns out to be absolutely
> necessary!  That would hurt readability of this code a lot, and we need
> more people to understand it.
>
> >> Is there any possibility of splitting this into several smaller patches?
> >> I get the impression it would be a lot of work, but maybe I'm wrong.
> >
> > Making read0/readevalloop re-entrant requires a global control-flow
> > transformation, so most of it is necessary. Two pieces that aren't
> > strictly necessary, but are in it because this was actually the last
> > step in my initial development of a "noneditor" mode for running
> > command-line type emacs-lisp programs with non-standard dump-files.
>
> Even if it isn't possible to split the changeset into functional
> patches, splitting it into separate patches may make the diffs more
> readable purely for syntactic reasons.  The first patch could merely add
> unused macro arguments to READCHAR etc; that would help diff (the
> program), in subsequent patches, to anchor the diff in the right
> place.  (For the same reason, I'd refrain from whitespace changes and
> drive-by fixes).
>
> > FIrst, the load_source_file step, because I want to try building a
> > minimal command-line compiler for boot strapping and async compiling
> > and think it ought to be able to handle all the source files in the
> > emacs lisp distribution without having to load every character set and
> > coding system available into the dump file.
>
> > Second, I modified lisp_file_lexical_cookie to allow the
> > lexical-binding cookie on any line in a leading comment block, rather
> > than just the first line.  Now that the cookie is being warned about,
> > I would prefer it if the system didn't force users to violate their
> > style requirements by making the first line obscenely long.  However,
> > it makes use of the "lread_rewind_input" to get around possible
> > limitations on ungetc.
> >
> > If I remove those two pieces, I think the rest will probably work.
>
> I think removing them for now would be a good idea.  The first change
> sounds like a great idea, but does it really depend on a reentrant
> reader?
>
> >> > I tried explicitly forcing emacs-internal as the coding-system for
> >> > load_source_file, but without any noticeable difference.
> >>
> >> Hmm.  I think the problem there is that you specbind
> >> Qcoding_system_for_read to Qemacs_internal while calling
> >> dump-emacs-portable.  The pdumper doesn't unwind specbind bindings, so
> >> the Vcoding_system_for_read variable is dumped with that value, and
> >> restored with that value from the dump, and then things go wrong because
> >> other code assumes Vcoding_system_for_read is nil.
> >
> > I'm sure that is a big, but it didn't fix the problem for me (ignoring
> > the the 2 non-fixes).
>
> I'm not sure what "the problem" is here; IIRC, the pdumper change fixes
> some of the differences in the generated .el files, while the non-fixes
> can be dropped if we use coding = Qutf_8 instead of coding =
> Qno_conversion in Fload.  (But I'm not sure that's the right fix).

The real problem is that I developed that 1000-yard coding stare
looking at this too long.  Somehow I got infected with the idea that
elc files were loaded as raw text, but in fact, the code in readchar
always decodes them according to either emac-internal or emacs-mule
and so are always multi-byte.  Once I made that change I got  a
working build with no complaints from the build process, at least.

I'll do the cleanups we've discussed and send the revised patch (or
patches, per your recommendation).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sun, 29 Jun 2025 04:37:03 GMT) Full text and rfc822 format available.

Message #38 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sun, 29 Jun 2025 00:36:21 -0400

On Sat, Jun 28, 2025 at 10:38 PM Lynn Winebarger <owinebar <at> gmail.com> wrote:
>
>
>
> On Sat, Jun 28, 2025, 12:23 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>>
>> > I want to use readevalloop to process command-line arguments, and my
>> > initial attempts failed to handle '"(load "loadup.el") (princ
>> > "Done")'.  So I took a stab at making all the entry points synchronize
>> > with an explicit stack of continuation frames shared between read0 and
>> > readevalloop.
>>
>> I'm afraid with the `Subject:` and the above paragraph, I'm still not
>> able to figure out what you're trying to achieve (and the patch is
>> quite large).
>>
>> Also `read` and `readevalloop` are quite different.  Do you want both to
>> be re-entrant?  Is it in both cases for the same reason?
>> AFAIK `readevalloop` is also re-entrant in that you can call it
>> recursively from its "eval" part.  What do you mean by "process
>> command-line arguments"?
>
>
> It's true that read and readevalloop are distinct, but, ignoring the readfun parameter of readevalloop, any static variables read0 is dependent on, readevalloop is also dependent on.  It's also true that not every entry to read0 is an entry to readevalloop, but practically every entry to readevalloop will enter read0.  And they both depend on the state accessed by readchar and unreadchar.  So, readevalloop can never really be reentrant as long as read0/readchar/unreadchar are not.  My initial (current) approach has been to identify all the private state in lread.c accessed (directly or indirectly) and create a single data structure representing that state as a collective stack frame for essentially all the procedures of lread.c in the transitive closure of readevalloop 's call graph.  It's true that this data structure is bigger than necessary, but readevalloop only has to allocate one frame that will be used by each invocation of read0 it makes in that activation frame of readevalloop.
>
Just to clarify my original wording - I approached this conceptually
using the technique for transforming an imperative-style program with
global state into a purely functional one.  Transforming the original
program into continuation-passing style makes every function a
continuation, then making every global a parameter of every function,
with assignments/side-effects being achieved by changing the value
being passed for the corresponding parameter.  Then, the continuations
(which means the functions, there is no difference in CPS) can be
simplified to only pass the globals required for themselves or the
continuations they call.  Then you can take an entry point
(readevalloop in this case), and its parameter set will be the union
of the parameters of the transitive closure of its continuations, plus
whatever parameters it alone requires.  That's why I referred to the
"frame" as a continuation frame, as it captures the union of all the
continuations called by readevalloop, excepting the one from the eval
call.

Lynn

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sun, 29 Jun 2025 04:52:02 GMT) Full text and rfc822 format available.

Message #41 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Lynn Winebarger <owinebar <at> gmail.com>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sun, 29 Jun 2025 00:51:30 -0400

> It's true that read and readevalloop are distinct, but, ignoring the
> readfun parameter of readevalloop, any static variables read0 is dependent
> on, readevalloop is also dependent on.  It's also true that not every entry
> to read0 is an entry to readevalloop, but practically every entry to
> readevalloop will enter read0.  And they both depend on the state accessed
> by readchar and unreadchar.

I'm afraid I still have no idea why the above is relevant.

> So, readevalloop can never really be reentrant as long as
> read0/readchar/unreadchar are not.

That's obviously not true: every `load` calls `readevalloop` and from
there it can all other `load`s.  So when file A `require`s file B which
`require`s file C, you're using `readevalloop` in a reentrant way.

> My initial (current) approach has been to identify all the private
> state in lread.c accessed

You tell me how you go about solving your problem, but I don't know
what is the problem you're trying to solve.

> Initially, I had tried implementing a simple read-eval loop for processing
> sequences of expressions in a string, similar to "load" for files or
> "eval-buffer".

You can easily write this in ELisp, so I don't see how that relates to
any part of the C code.

> I considered that even if I determined the proximate cause of the issue, it
> would be difficult to tell whether any particular solution would be TRT or
> just a band-aid. So, I suppose the most immediate advantage for me is
> improving my ability to debug recursive invocations of read0/readevalloop
> by looking at the explicit stack of frames.

How/when&why do you expect/want `read` to be called recursively?

> So, for example, I've implemented an input stream "reader-input-stream"
> that can be used arbitrarily by code being evaluated for characters from
> the "current" input stream, which is pretty much impossible in the current
> implementation.  It's main use is for "readfun" in readevalloop.
> Implementing "eval-stream", which evaluates all input streams as programs
> uniformly, becomes trivial.

I don't know what "evaluates all input streams as programs
uniformly" means, nor do I know what "eval-stream" refers to (is that
the name of a standard function in some language?  What does it do?).


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sun, 29 Jun 2025 12:17:04 GMT) Full text and rfc822 format available.

Message #44 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sun, 29 Jun 2025 08:16:26 -0400

On Sun, Jun 29, 2025 at 12:51 AM Stefan Monnier
<monnier <at> iro.umontreal.ca> wrote:
>
> > It's true that read and readevalloop are distinct, but, ignoring the
> > readfun parameter of readevalloop, any static variables read0 is dependent
> > on, readevalloop is also dependent on.  It's also true that not every entry
> > to read0 is an entry to readevalloop, but practically every entry to
> > readevalloop will enter read0.  And they both depend on the state accessed
> > by readchar and unreadchar.
>
> I'm afraid I still have no idea why the above is relevant.
>
> > So, readevalloop can never really be reentrant as long as
> > read0/readchar/unreadchar are not.
>
> That's obviously not true: every `load` calls `readevalloop` and from
> there it can all other `load`s.  So when file A `require`s file B which
> `require`s file C, you're using `readevalloop` in a reentrant way.

I think we are talking at cross-purposes here.  It's certainly true
that readevalloop and read0 are being used recursively.  But my
understanding of "X is reentrant" is something like "X is robust
against failures due to recursive calls".  According to Wikipedia, the
term is even stronger and implies a procedure can be safely called
concurrently.

>
> > My initial (current) approach has been to identify all the private
> > state in lread.c accessed
>
> You tell me how you go about solving your problem, but I don't know
> what is the problem you're trying to solve.
>

(a) My personal immediate problem is evaluating arbitrary programs
supplied a user of a bare temacs as a command line argument, but I
don't think that is in itself very motivating for emacs maintainers
(b) For future maintainers, making it easier to debug recursive usage
of the reader/readevalloop.
(c) Incremental improvement in concurrent usage of read0, at least
with respect to C variables.  Dealing with the global symbol table is
a blocker that isn't specific to the reader implementation so much as
LISP semantics.
(d) Possibly cleaner handling of interrupts and signals raised while
in read0 or readevalloop.

If the C runtime is intended to be preserved in amber (or reduced in
scope), then I suppose the possibility of increasing the safety of
futures additional uses of these procedures from C will not be very
motivating.

> > Initially, I had tried implementing a simple read-eval loop for processing
> > sequences of expressions in a string, similar to "load" for files or
> > "eval-buffer".
>
> You can easily write this in ELisp, so I don't see how that relates to
> any part of the C code.

Well, in case (a), there's a chicken and egg problem, but I don't
think you will find that very motivating.
>
> > I considered that even if I determined the proximate cause of the issue, it
> > would be difficult to tell whether any particular solution would be TRT or
> > just a band-aid. So, I suppose the most immediate advantage for me is
> > improving my ability to debug recursive invocations of read0/readevalloop
> > by looking at the explicit stack of frames.
>
> How/when&why do you expect/want `read` to be called recursively?

It can definitely be entered multiple times if a stream being read
blocks and the command loop executes another function that calls read.
That can clearly happen when reading from user-supplied procedures or
a raw stdio stream (i.e. from "load").  I am not at all confident that
I know all the ways a maybe_quit() call can occur in primitives called
by read0 (directly or indirectly) to rule it out of occuring while
reading from a string.  You definitely have the advantage of me there.
If read0 can only access maybe_quit through the calls to that function
that actually appear in lread.c, then maybe such a recursion can't
happen.  But that is a global property that has to be enforced when
working on code outside of lread.c - maybe the maintainers all know
which functions are guaranteed to be "safe" in that way, but I'm not
sure how the rest of us would know what those are.

>
> > So, for example, I've implemented an input stream "reader-input-stream"
> > that can be used arbitrarily by code being evaluated for characters from
> > the "current" input stream, which is pretty much impossible in the current
> > implementation.  It's main use is for "readfun" in readevalloop.
> > Implementing "eval-stream", which evaluates all input streams as programs
> > uniformly, becomes trivial.
>
> I don't know what "evaluates all input streams as programs
> uniformly" means, nor do I know what "eval-stream" refers to (is that
> the name of a standard function in some language?  What does it do?).

I added it in the 0002-...patch in one of my responses to Pip Cet
above.  Interpreted languages are characterized not only be the
existence of "eval" for evaluating individual top-level expressions,
but some form of "eval-program" that reads a textual representation of
a sequence of top-level expressions, parsing and evaluating one
expression at a time.  Elisp has three such operators - load,
eval-buffer,and eval-region, each independently responsible for
ensuring the safety of mutual recursion with the others.  I am using
"eval-stream" to mean an "eval-program" that can accept any of the
types of input streams defined in the elisp manual.

Lynn

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sun, 29 Jun 2025 17:04:04 GMT) Full text and rfc822 format available.

Message #47 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Lynn Winebarger <owinebar <at> gmail.com>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sun, 29 Jun 2025 13:03:38 -0400

> (a) My personal immediate problem is evaluating arbitrary programs
> supplied a user of a bare temacs as a command line argument, but I
> don't think that is in itself very motivating for emacs maintainers

That's still very vague:

    temacs .../foo.el

satisfies my understanding of "evaluating arbitrary programs
supplied [by] a user of a bare temacs as a command line argument" and
I can't see why that would require a reentrant `read`.

If you mean

    temacs ... '(progn (blabla))'

I still can't see why you'd need a reentrant `read` for that.

Any hope you can give us an actual concrete scenario?

>> How/when&why do you expect/want `read` to be called recursively?
> It can definitely be entered multiple times if a stream being read
> blocks and the command loop executes another function that calls read.
> That can clearly happen when reading from user-supplied procedures or
> a raw stdio stream (i.e. from "load").

A concrete scenario would help, here as well.

In any case, what I gather from this exchange is that your main
motivation is to stop `read` relying so much on global variables just
because It's Right (the benefits you describe don't seem to be ones you
personally need).

I can definitely go along with that.  The reason why I care so much
about the motivation behind the change is that it affects the tradeoffs
(e.g. impact on performance or code maintenance).  Have you made any
measurements about the impact on the speed for `read`?
Emacs startup time is significantly affected by the time to `read`.

Given the patch size, I barely started looking at it, but see some
comments below.


        Stefan


> -#define lread_fd_p	(fd >= 0)
> -#define lread_close	emacs_close
> -#define lread_fstat	sys_fstat
> -#define lread_read_quit	emacs_read_quit
> -#define lread_lseek	lseek
> -
> -#define file_stream		FILE *
> -#define file_seek		fseek
> -#define file_stream_valid_p(p)	(p)
> -#define file_stream_close	emacs_fclose
> -#define file_stream_invalid	NULL
> -#define file_get_char		getc
> +#define lread_fd_p      (fd >= 0)
> +#define lread_close     emacs_close
> +#define lread_fstat     sys_fstat
> +#define lread_read_quit emacs_read_quit
> +#define lread_lseek     lseek
> +
> +#define file_stream             FILE *
> +#define file_seek               fseek
> +#define file_stream_valid_p(p)  (p)
> +#define file_stream_close       emacs_fclose
> +#define file_stream_invalid     NULL
> +#define file_get_char           getc

Please avoid changing whitespace in parts of the code you don't
otherwise touch.

> +struct inbuffer
> +{
> +  Lisp_Object buffer;
> +};
> +#define INIT_INBUFFER()                                 \
> +  ((struct inbuffer)                                    \
> +   {                                                    \
> +     .buffer = Qnil                                     \
> +   } )

I think INIT_INBUFFER can just as well be a static function.  And more
importantly, it would benefit from taking the buffer as argument so you
don't have this intermediate state where the struct has been created but
not yet really initialized (other than with those dummy values).

Same for the other INIT_*

> +typedef union lread_input
> +{
> +  unsigned char uninit[LREAD_INPUT_SIZE];
> +  struct infile f;
> +  struct instring s;
> +  struct inbuffer b;
> +  struct inmarker m;
> +  struct inproc p;
> +  struct inindirect frame;
> +  struct innull n;
> +} lread_input;

I'd put the `lread_input_type` right alongside it inside a `struct`, so
we know they always come in pairs.

Also, why do you need the "uninit" case?


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Mon, 30 Jun 2025 02:29:03 GMT) Full text and rfc822 format available.

Message #50 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sun, 29 Jun 2025 22:28:34 -0400

On Sun, Jun 29, 2025 at 1:03 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
> > (a) My personal immediate problem is evaluating arbitrary programs
> > supplied a user of a bare temacs as a command line argument, but I
> > don't think that is in itself very motivating for emacs maintainers
>
> That's still very vague:
>
>     temacs .../foo.el
>
> satisfies my understanding of "evaluating arbitrary programs
> supplied [by] a user of a bare temacs as a command line argument" and
> I can't see why that would require a reentrant `read`.
>
> If you mean
>
>     temacs ... '(progn (blabla))'
>
> I still can't see why you'd need a reentrant `read` for that.

Basically the latter, because I was looking at using it Makefiles and
didn't want to deal with generating little files to load.  If I was
willing to limit the strings to single expressions, you'd be right,
but why should the form a program is input (string, file, buffer,
whatever) determine what's allowed?
I did try to get by with just save/restore on the string variables,
but it failed when evaluating '(load "loadup.el") (print "Finished")".
The failure certainly seems like an issue caused by some subtlety in
the way recursive use of readevalloop is implemented, but I can't
bring myself to try and figure out a workaround for the specific
brokenness when I could just rework it to be sure its reentrant, then
see if that fixes the issue.  Then, even if I lost the bet that
re-entrancy was the issue, I"ve at least gotten something from the
effort, whereas if I just hack away looking for some magic fix I'll
never feel like I actually solved the issue.
Or maybe the problem I'm solving is my ignorance of how the emacs
runtime works in practice?  A lack of amusing pasttimes?  A stubborn
refusal to accept things that do not make sense to me?

>
> Any hope you can give us an actual concrete scenario?
>
I keep giving you one, you just consider it unnecessary.

> >> How/when&why do you expect/want `read` to be called recursively?
> > It can definitely be entered multiple times if a stream being read
> > blocks and the command loop executes another function that calls read.
> > That can clearly happen when reading from user-supplied procedures or
> > a raw stdio stream (i.e. from "load").
>
> A concrete scenario would help, here as well.

Ok, there are calls to maybe quit in readbyte_from_stdio where the
file returns EOF due to an interrupt.  So, if one function is reading
a lisp expression when this occurs, then the user type "M-: (read
<something>)", you will have a recursive re-entry to read.  Or any
other source of lisp evaluation that might happen while processing
events - who knows, it really has nothing to do with the read that is
stuck waiting for input (but not blocked).

>
> In any case, what I gather from this exchange is that your main
> motivation is to stop `read` relying so much on global variables just
> because It's Right (the benefits you describe don't seem to be ones you
> personally need).

I mean, I guess.  I just don't like spending a lot of time
understanding the finer details of how a fragile system fails instead
of just simplifying it into a form where the fragility can be reduced
if not eliminated?

> I can definitely go along with that.  The reason why I care so much
> about the motivation behind the change is that it affects the tradeoffs
> (e.g. impact on performance or code maintenance).  Have you made any
> measurements about the impact on the speed for `read`?
> Emacs startup time is significantly affected by the time to `read`.

Not yet.  I wanted to make sure it actually functioned and preserved
the intended behavior first.  Although I personally haven't noticed
any real impact, and I expected to since I'm running -O0.  Maybe it
would be more noticeable with optimization enabled - I don't know if
using the register keyword for the frame actually achieved my hope
that performance wouldn't be much worse than having to access local
variables from the stack.  But I'm sure you know Knuth's maxim.
>
> Given the patch size, I barely started looking at it, but see some
> comments below.
>
Thanks!

> > -#define lread_fd_p   (fd >= 0)
> > -#define lread_close  emacs_close
> > -#define lread_fstat  sys_fstat
> > -#define lread_read_quit      emacs_read_quit
> > -#define lread_lseek  lseek
> > -
> > -#define file_stream          FILE *
> > -#define file_seek            fseek
> > -#define file_stream_valid_p(p)       (p)
> > -#define file_stream_close    emacs_fclose
> > -#define file_stream_invalid  NULL
> > -#define file_get_char                getc
> > +#define lread_fd_p      (fd >= 0)
> > +#define lread_close     emacs_close
> > +#define lread_fstat     sys_fstat
> > +#define lread_read_quit emacs_read_quit
> > +#define lread_lseek     lseek
> > +
> > +#define file_stream             FILE *
> > +#define file_seek               fseek
> > +#define file_stream_valid_p(p)  (p)
> > +#define file_stream_close       emacs_fclose
> > +#define file_stream_invalid     NULL
> > +#define file_get_char           getc
>
> Please avoid changing whitespace in parts of the code you don't
> otherwise touch.

Yeah, I goofed on that.

> > +struct inbuffer
> > +{
> > +  Lisp_Object buffer;
> > +};
> > +#define INIT_INBUFFER()                                 \
> > +  ((struct inbuffer)                                    \
> > +   {                                                    \
> > +     .buffer = Qnil                                     \
> > +   } )
>
> I think INIT_INBUFFER can just as well be a static function.  And more
> importantly, it would benefit from taking the buffer as argument so you
> don't have this intermediate state where the struct has been created but
> not yet really initialized (other than with those dummy values).
>
> Same for the other INIT_*

It's not really intended to be a procedure, just a static initializer.
I haven't regularly programmed in C since the decade I learned the
language from the recently published K&R 2nd edition, so my style
discipline is pretty low.  I probably got bit by an identifier used
for member access  getting expanded as an object-like macro and
started making all my macros function-like for a while.
As for the static initializers themselves, I'm happy to take some
guidance on style preferences.  I'm just going by the emacs code base
and the material on cppreference.com.  I kept on getting complaints
from gcc about the form of the initializers, so I ended up just
writing everything out explicitly just to be done with it. It took me
forever to work out a "missing braces" message was referring to a
specpdl_ref initializer in the INIT_LREAD_FRAME.

>
> > +typedef union lread_input
> > +{
> > +  unsigned char uninit[LREAD_INPUT_SIZE];
> > +  struct infile f;
> > +  struct instring s;
> > +  struct inbuffer b;
> > +  struct inmarker m;
> > +  struct inproc p;
> > +  struct inindirect frame;
> > +  struct innull n;
> > +} lread_input;
>
> I'd put the `lread_input_type` right alongside it inside a `struct`, so
> we know they always come in pairs.

The details of those data representations definitely need some
attention.  I will probably fine-tune them to follow the approach used
by the pseudovector types.

>
> Also, why do you need the "uninit" case?

That's my unfamiliarity with C99 and zero-initializing a union and
being certain it zeros the entire thing and not just the first
alternative.  It seemed like { 0 } wasn't zero initializing members of
structures that were also structures, but I could have been mistaken -
that wasn't the interesting thing I was working out, just an annoying
hoop to jump through at the time.  I may just be waving a dead chicken
here.

Lynn

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Mon, 30 Jun 2025 03:56:02 GMT) Full text and rfc822 format available.

Message #53 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Lynn Winebarger <owinebar <at> gmail.com>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sun, 29 Jun 2025 23:54:59 -0400

>>     temacs ... '(progn (blabla))'
>>
>> I still can't see why you'd need a reentrant `read` for that.
>
> Basically the latter, because I was looking at using it Makefiles and
> didn't want to deal with generating little files to load.  If I was
> willing to limit the strings to single expressions, you'd be right,
> but why should the form a program is input (string, file, buffer,
> whatever) determine what's allowed?

I don't understand the connection.
Hmm... oh, is it because when you use `read` on a string it can read
only the first sexp?  If so, I kinda see your point, but since you know
you're starting from a string, you can use `read-from-string` which
returns the info needed for the loop that reads all the sexps.

> I did try to get by with just save/restore on the string variables,
> but it failed when evaluating '(load "loadup.el") (print "Finished")".

I still can't imagine what that "just save/restore on the string
variables" looked like.

> The failure certainly seems like an issue caused by some subtlety in
> the way recursive use of readevalloop is implemented, but I can't

From which I gather the above "just save/restore on the string
variables" somehow involved `readevalloop`, but I don't know how.

> bring myself to try and figure out a workaround for the specific
> brokenness when I could just rework it to be sure its reentrant, then
> see if that fixes the issue.

🙂
Seems like a big hammer for that little problem.
But if it can get you to improve our code, who are we to judge, eh?

> Or maybe the problem I'm solving is my ignorance of how the Emacs
> runtime works in practice?  A lack of amusing pasttimes?  A stubborn
> refusal to accept things that do not make sense to me?

You're at the right place.

>> Any hope you can give us an actual concrete scenario?
> I keep giving you one, you just consider it unnecessary.

No, I just can't imagine how you get from the problem you describe to an
attempted solution that uses `readevalloop` (from there I see how you
got to the re-entrancy thingy).

>> >> How/when&why do you expect/want `read` to be called recursively?
>> > It can definitely be entered multiple times if a stream being read
>> > blocks and the command loop executes another function that calls read.
>> > That can clearly happen when reading from user-supplied procedures or
>> > a raw stdio stream (i.e. from "load").
>>
>> A concrete scenario would help, here as well.
>
> Ok, there are calls to maybe quit in readbyte_from_stdio where the
> file returns EOF due to an interrupt.  So, if one function is reading
> a lisp expression when this occurs, then the user type "M-: (read
> <something>)", you will have a recursive re-entry to read.

Hmm... IIRC `maybe_quit` is not supposed to run ELisp code.
The more likely path to the problem would seem to be via a stream
that's a function.

In any case, if you don't know of a recipe to reproduce the problem,
don't bother looking for one, I was asking because I thought you had
bumped into one, so you could just exhibit it, I didn't mean to ask you
to reverse-engineer it from the code.

[ FWIW, I definitely believe you that it's possible.  But I expect we
  have *many* potential problems like these in our code base, so unless
  we have a clear way to solve a whole class of them I'd rather focus on
  the ones that are known to occur in existing code.  ]

> It's not really intended to be a procedure, just a static initializer.

I understand, but functions are just so much better behaved than macros
that it's worth avoiding macros when a function can be used just as well.

> I haven't regularly programmed in C since the decade I learned the
> language from the recently published K&R 2nd edition, so my style
> discipline is pretty low.

That was my assumption.  Since then, compilers have gotten much better
at inlining functions, so you can do

    init_inbuffer (&x);

and they'll generate the same code as it does with your

    struct inbuffer x = INIT_INBUFFER ();

[ Well, admittedly, not with `-O0`.  ]

Note also that

    struct inbuffer x = { .buffer = buf };

should work just as well, so maybe "nothing" is an even better option
than either.

> It took me forever to work out a "missing braces" message was
> referring to a specpdl_ref initializer in the INIT_LREAD_FRAME.

🙂

>> Also, why do you need the "uninit" case?
> That's my unfamiliarity with C99 and zero-initializing a union
> and being certain it zeros the entire thing and not just the
> first alternative.

Ah I see.  AFAIK `memset` is the standard solution.
`grep` shows it's used extensively in our code for that very purpose.
[ Current C compilers routinely replace calls to `memset` and `memcpy`
  with open-coded machine code when the size is known at compile-time.  ]


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Tue, 01 Jul 2025 06:44:03 GMT) Full text and rfc822 format available.

Message #56 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Tue, 1 Jul 2025 02:43:04 -0400

[Message part 1 (text/plain, inline)]

Hi, Stefan,
I've attached an updated patch that removes (I think) the
modifications that aren't actually for re-entrancy of
read/readevalloop.
I think I got all of the unintentional whitespace changes backed out as well.
I also applied one of Pip's changes, and the fix that lets it compile
and start without a hitch, setting file streams to load as multibyte,
since the utf-8-emacs decoding was always in the readchar function.
I ran "make check", and attached a short elisp file I used to load the
ones with failures as well as the results of "M-x ert t".
It looks like eval-buffer doesn't restore the buffer state correctly
at least some of the time.

Also, the rx-tests fail because (load "<...>/rx-tests.elc") has a
failure in read0 - something is screwy that seems to be related to the
#@10 near the beginning, since readevalloop successfully evaluates the
first exec-byte-code form (the whole thing is available as "val" in
the frame when I look through the backtrace), but read0 starts the
next expression 10 characters back at a "]".  Which, I don't know how
it goes back by 10 characters when it's just reading a stdio stream
one character at a time.  I'll have to investigate further.


On Sun, Jun 29, 2025 at 11:55 PM Stefan Monnier
<monnier <at> iro.umontreal.ca> wrote:
>
> >>     temacs ... '(progn (blabla))'
> >>
> >> I still can't see why you'd need a reentrant `read` for that.
> >
> > Basically the latter, because I was looking at using it Makefiles and
> > didn't want to deal with generating little files to load.  If I was
> > willing to limit the strings to single expressions, you'd be right,
> > but why should the form a program is input (string, file, buffer,
> > whatever) determine what's allowed?
>
> I don't understand the connection.
> Hmm... oh, is it because when you use `read` on a string it can read
> only the first sexp?  If so, I kinda see your point, but since you know
> you're starting from a string, you can use `read-from-string` which
> returns the info needed for the loop that reads all the sexps.
>
> > I did try to get by with just save/restore on the string variables,
> > but it failed when evaluating '(load "loadup.el") (print "Finished")".
>
> I still can't imagine what that "just save/restore on the string
> variables" looked like.
>
> > The failure certainly seems like an issue caused by some subtlety in
> > the way recursive use of readevalloop is implemented, but I can't
>
> From which I gather the above "just save/restore on the string
> variables" somehow involved `readevalloop`, but I don't know how.
>
> > bring myself to try and figure out a workaround for the specific
> > brokenness when I could just rework it to be sure its reentrant, then
> > see if that fixes the issue.
>
> 🙂
> Seems like a big hammer for that little problem.
> But if it can get you to improve our code, who are we to judge, eh?
>
> > Or maybe the problem I'm solving is my ignorance of how the Emacs
> > runtime works in practice?  A lack of amusing pasttimes?  A stubborn
> > refusal to accept things that do not make sense to me?
>
> You're at the right place.

Also, this kind of exercise is just table stakes if I ever want to try
my hand at some of the ideas I've been bouncing around emacs-devel for
years now.  Like encapsulating the lisp machine so a single process
can have multiple lisps running, then work out distributed data
structures, etc.  Having a gc that supports concurrency will remove
the biggest blocker for that.
>
> >> Any hope you can give us an actual concrete scenario?
> > I keep giving you one, you just consider it unnecessary.
>
> No, I just can't imagine how you get from the problem you describe to an
> attempted solution that uses `readevalloop` (from there I see how you
> got to the re-entrancy thingy).
>
Maybe it will make more sense once I submit the crazy feature I've
implemented that this started with.  I needed a basic trampoline in
place of command-loop, since Frecursive_edit requires things that
aren't available without loadup.el loaded.

> >> >> How/when&why do you expect/want `read` to be called recursively?
> >> > It can definitely be entered multiple times if a stream being read
> >> > blocks and the command loop executes another function that calls read.
> >> > That can clearly happen when reading from user-supplied procedures or
> >> > a raw stdio stream (i.e. from "load").
> >>
> >> A concrete scenario would help, here as well.
> >
> > Ok, there are calls to maybe quit in readbyte_from_stdio where the
> > file returns EOF due to an interrupt.  So, if one function is reading
> > a lisp expression when this occurs, then the user type "M-: (read
> > <something>)", you will have a recursive re-entry to read.
>
> Hmm... IIRC `maybe_quit` is not supposed to run ELisp code.
> The more likely path to the problem would seem to be via a stream
> that's a function.
>
> In any case, if you don't know of a recipe to reproduce the problem,
> don't bother looking for one, I was asking because I thought you had
> bumped into one, so you could just exhibit it, I didn't mean to ask you
> to reverse-engineer it from the code.
>
> [ FWIW, I definitely believe you that it's possible.  But I expect we
>   have *many* potential problems like these in our code base, so unless
>   we have a clear way to solve a whole class of them I'd rather focus on
>   the ones that are known to occur in existing code.  ]
>
> > It's not really intended to be a procedure, just a static initializer.
>
> I understand, but functions are just so much better behaved than macros
> that it's worth avoiding macros when a function can be used just as well.
>
> > I haven't regularly programmed in C since the decade I learned the
> > language from the recently published K&R 2nd edition, so my style
> > discipline is pretty low.
>
> That was my assumption.  Since then, compilers have gotten much better
> at inlining functions, so you can do
>
>     init_inbuffer (&x);
>
> and they'll generate the same code as it does with your
>
>     struct inbuffer x = INIT_INBUFFER ();
>
> [ Well, admittedly, not with `-O0`.  ]
>
> Note also that
>
>     struct inbuffer x = { .buffer = buf };
>
> should work just as well, so maybe "nothing" is an even better option
> than either.

I noticed that, now, I was only using those macros in one place each,
so I've just done away with the macros and written them out (in the
attached patch).

>
> > It took me forever to work out a "missing braces" message was
> > referring to a specpdl_ref initializer in the INIT_LREAD_FRAME.
>
> 🙂
>
> >> Also, why do you need the "uninit" case?
> > That's my unfamiliarity with C99 and zero-initializing a union
> > and being certain it zeros the entire thing and not just the
> > first alternative.
>
> Ah I see.  AFAIK `memset` is the standard solution.
> `grep` shows it's used extensively in our code for that very purpose.
> [ Current C compilers routinely replace calls to `memset` and `memcpy`
>   with open-coded machine code when the size is known at compile-time.  ]

I don't have a style discipline (at least not once GNU's style is
chosen - my preferences run to more compact presentations).  But I do
have a kind of aesthetic preference for writing constant values as
literals (compound literals in this case).  I mean, that is definitely
a LISP heritage that's shown up in C and other imperative languages.

I may follow up on some of the other points later, but maybe I should
give you a chance to look at the cleaned up patch.

Lynn

[0001-Changes-to-make-entry-to-read0-and-readevalloop-recu.patch (text/x-patch, attachment)]

[bad-tests.el (text/x-emacs-lisp, attachment)]

[ert-failures.el (text/x-emacs-lisp, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Tue, 01 Jul 2025 23:49:03 GMT) Full text and rfc822 format available.

Message #59 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Tue, 1 Jul 2025 19:47:45 -0400

[Message part 1 (text/plain, inline)]

On Sun, Jun 29, 2025 at 11:55 PM Stefan Monnier
<monnier <at> iro.umontreal.ca> wrote:
>
> That was my assumption.  Since then, compilers have gotten much better
> at inlining functions, so you can do
>
>     init_inbuffer (&x);
>
> and they'll generate the same code as it does with your
>
>     struct inbuffer x = INIT_INBUFFER ();
>
> [ Well, admittedly, not with `-O0`.  ]
>
> Note also that
>
>     struct inbuffer x = { .buffer = buf };
>
> should work just as well, so maybe "nothing" is an even better option
> than either.
>

I've replaced the READCHAR_* macros with inline functions.  GDB just
cannot do anything with function-like macros.  I also noticed and
fixed the bug that was causing Fload to fail trying to load
rx-tests.elc.
None of the other test failures went away, but there were no new ones either.

[0002-Transform-READCHAR_-macros-into-inline-functions.patch (text/x-patch, attachment)]

[ert-failures-0002.el (text/x-emacs-lisp, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Fri, 04 Jul 2025 02:49:02 GMT) Full text and rfc822 format available.

Message #62 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: 78898 <at> debbugs.gnu.org
Cc: Pip Cet <pipcet <at> protonmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Thu, 3 Jul 2025 22:47:38 -0400

[Message part 1 (text/plain, inline)]

The attached patch (in addition to the preceding 2) eliminates most of
the errors from "make check".  I've attached the ones from edebug.  I
see "edebug-tests-break-in-lambda-out-of-defining-context" fail in my
build of the master checkout, so that leaves two unexplained failures,
edebug-tests-step-into-function and
edebug-tests-trace-function-call-and-return.  I currently have no idea
what is going on.  The other 67 (or thereabouts) errors were fixed by
1. making the control flow more faithful to the original for
eval-region [ about 35 tests].  I split LREAD_FRAME_MULTIPLE into
LREAD_FRAME_INTEGRAL (where all lisp state changes are integrated) and
LREAD_FRAME_DYNAMIC_WIND (where eval-region provides some protection
of the current buffer state).  The previous version of the code relied
on !NILP(start) to prompt this protective wind-down/wind-up behavior
between read and eval.
2. Ensure that if readfun is called and generates an eof due to the
point moving to the end of the region, any expression returned gets
evaluated before the eof is honored in readevalloop.
3. ensuring that calling "load-read-function" uses the buffer as a
stream argument if that is the input for the frame.  Edebug relies on
this undocumented behavior in advising load-read-function to
instrument code being debugged [20+ tests].
4. Getting rid of the new signal handlers used to bypass frames - may
have been responsible for one or two test failures.  They were mainly
an experiment to understand how the dynamic stack works and if a more
explicit dynamic-wind type behavior would work if code rand while
processing a maybe_quit.  I'm not sure if this changes an observed
test results, but it doesn't really do anything useful beyond testing
how the idea might actually work.  So, calls to read are still
susceptible to eof signals generated but not handled by elisp code run
while reading the stream.

I can't perceive any difference in startup time with the debug build
compiled without support for native compilation.

Overall results of 'make check'
SUMMARY OF TEST RESULTS
-----------------------
Files examined: 533
Ran 8119 tests, 7959 results as expected, 27 unexpected, 133 skipped
4 files contained unexpected results:
  lisp/vc/vc-tests/vc-tests.log
  lisp/progmodes/python-tests.log
  lisp/progmodes/c-ts-mode-tests.log
  lisp/emacs-lisp/edebug-tests.log

For the non-edebug tests, I get the same failures for my reference
checkout on my test system.
I'd be happy to take any hints on those two remaining edebug
regressions.  I have no idea what the difference of interest is
between those two and all the other similar looking tests is, in terms
of reliance on the behavior of
eval-region/eval-buffer/load-read-function.

[0003-Reverted-over-regularization-of-control-flow.patch (text/x-patch, attachment)]

[ert-failures-0003.el (text/x-emacs-lisp, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Fri, 04 Jul 2025 21:07:03 GMT) Full text and rfc822 format available.

Message #65 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: 78898 <at> debbugs.gnu.org
Cc: Pip Cet <pipcet <at> protonmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Fri, 4 Jul 2025 17:06:16 -0400

[Message part 1 (text/plain, inline)]

I'm attaching a unified patch with a more cogent commit message.  The
previous patch series had a number of misfeatures in the initial patch
backed out by subsequent patches.  This patch should remove the need
to review code that is not ultimately used.

Lynn

On Thu, Jul 3, 2025 at 10:47 PM Lynn Winebarger <owinebar <at> gmail.com> wrote:
>
> The attached patch (in addition to the preceding 2) eliminates most of
> the errors from "make check".  I've attached the ones from edebug.  I
> see "edebug-tests-break-in-lambda-out-of-defining-context" fail in my
> build of the master checkout, so that leaves two unexplained failures,
> edebug-tests-step-into-function and
> edebug-tests-trace-function-call-and-return.  I currently have no idea
> what is going on.  The other 67 (or thereabouts) errors were fixed by
> 1. making the control flow more faithful to the original for
> eval-region [ about 35 tests].  I split LREAD_FRAME_MULTIPLE into
> LREAD_FRAME_INTEGRAL (where all lisp state changes are integrated) and
> LREAD_FRAME_DYNAMIC_WIND (where eval-region provides some protection
> of the current buffer state).  The previous version of the code relied
> on !NILP(start) to prompt this protective wind-down/wind-up behavior
> between read and eval.
> 2. Ensure that if readfun is called and generates an eof due to the
> point moving to the end of the region, any expression returned gets
> evaluated before the eof is honored in readevalloop.
> 3. ensuring that calling "load-read-function" uses the buffer as a
> stream argument if that is the input for the frame.  Edebug relies on
> this undocumented behavior in advising load-read-function to
> instrument code being debugged [20+ tests].
> 4. Getting rid of the new signal handlers used to bypass frames - may
> have been responsible for one or two test failures.  They were mainly
> an experiment to understand how the dynamic stack works and if a more
> explicit dynamic-wind type behavior would work if code rand while
> processing a maybe_quit.  I'm not sure if this changes an observed
> test results, but it doesn't really do anything useful beyond testing
> how the idea might actually work.  So, calls to read are still
> susceptible to eof signals generated but not handled by elisp code run
> while reading the stream.
>
> I can't perceive any difference in startup time with the debug build
> compiled without support for native compilation.
>
> Overall results of 'make check'
> SUMMARY OF TEST RESULTS
> -----------------------
> Files examined: 533
> Ran 8119 tests, 7959 results as expected, 27 unexpected, 133 skipped
> 4 files contained unexpected results:
>   lisp/vc/vc-tests/vc-tests.log
>   lisp/progmodes/python-tests.log
>   lisp/progmodes/c-ts-mode-tests.log
>   lisp/emacs-lisp/edebug-tests.log
>
> For the non-edebug tests, I get the same failures for my reference
> checkout on my test system.
> I'd be happy to take any hints on those two remaining edebug
> regressions.  I have no idea what the difference of interest is
> between those two and all the other similar looking tests is, in terms
> of reliance on the behavior of
> eval-region/eval-buffer/load-read-function.

[0001-Make-read-readevalloop-jointly-reentrant.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sat, 05 Jul 2025 10:14:02 GMT) Full text and rfc822 format available.

Message #68 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattias.engdegard <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Pip Cet <pipcet <at> protonmail.com>, 78898 <at> debbugs.gnu.org,
 Stefan Monnier <monnier <at> iro.umontreal.ca>,
 Lynn Winebarger <owinebar <at> gmail.com>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sat, 5 Jul 2025 12:13:07 +0200

28 juni 2025 kl. 12.08 skrev Eli Zaretskii <eliz <at> gnu.org>:

> Do we really want readevalloop be recursive this way?  What does this
> gain us?

In principle there should be no reason for readevalloop being non-reentrant at all and remedying this should be an obvious improvement, but I cannot give a concrete example of how it would benefit us immediately.

That said, I think the approach here is a bit too much at once; I'd rather nibble away at state a little at a time since much of lread.c is messy legacy code.

More to the point, I have a patch for lread.c which cleans up some poor code, fixes some bugs and most importantly improves reader performance. I'd rather that go in first, as it provides tangible improvements and so that I won't have to undo Lynn's changes when applying it. (Further work in bug#70988).

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sat, 05 Jul 2025 18:08:02 GMT) Full text and rfc822 format available.

Message #71 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Mattias Engdegård <mattias.engdegard <at> gmail.com>
Cc: Pip Cet <pipcet <at> protonmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 78898 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sat, 5 Jul 2025 14:07:25 -0400

[Message part 1 (text/plain, inline)]

On Sat, Jul 5, 2025 at 6:13 AM Mattias Engdegård
<mattias.engdegard <at> gmail.com> wrote:
>
> 28 juni 2025 kl. 12.08 skrev Eli Zaretskii <eliz <at> gnu.org>:
>
> > Do we really want readevalloop be recursive this way?  What does this
> > gain us?
>
> In principle there should be no reason for readevalloop being non-reentrant at all and remedying this should be an obvious improvement, but I cannot give a concrete example of how it would benefit us immediately.
>
> That said, I think the approach here is a bit too much at once; I'd rather nibble away at state a little at a time since much of lread.c is messy legacy code.
>
> More to the point, I have a patch for lread.c which cleans up some poor code, fixes some bugs and most importantly improves reader performance. I'd rather that go in first, as it provides tangible improvements and so that I won't have to undo Lynn's changes when applying it. (Further work in bug#70988).

Hi, Mattias,
I'm not sure which version of the patch you're referring to.  The last
one removed a number of changes not necessary for re-entrancy.
One thing all of the versions did was address the initial issue in bug
#70988 by making the "multibyte" decision about the input of any given
entry to the users of "READCHAR/UNREAD" only at entry or when it could
be changed by an intervening "eval".  Instead of having a single
version of readchar, there are different versions for each combination
of input type and multibyte-ness that are straight-line code.  The
correct version is chosen at initial entry or when restoring the state
of any entry point after an eval.  I do still have two outstanding
test failures from make check.  I haven't tried the check-expensive or
check-all targets.

Do you have benchmarks I can use for concrete speed measurements?  I
haven't noticed any particular slow-down with my version, and it's
compiled without native-compile so it should be using the reader
pretty heavily.  That said, I'm using a -O0 build, so an optimized
build might make more efficient use of statically allocated variables.

I'm attaching the last version of my patch.  It should be contained in
the bug record already, but I don't know how convenient or
inconvenient that is.  It's still a non-trivial patch.

Lynn

[0001-Make-read-readevalloop-jointly-reentrant.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sun, 06 Jul 2025 18:21:02 GMT) Full text and rfc822 format available.

Message #74 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: 78898 <at> debbugs.gnu.org
Cc: Pip Cet <pipcet <at> protonmail.com>, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sun, 6 Jul 2025 14:20:02 -0400

On Thu, Jul 3, 2025 at 10:47 PM Lynn Winebarger <owinebar <at> gmail.com> wrote:
>
> The attached patch (in addition to the preceding 2) eliminates most of
> the errors from "make check".  I've attached the ones from edebug.  I
> see "edebug-tests-break-in-lambda-out-of-defining-context" fail in my
> build of the master checkout, so that leaves two unexplained failures,
> edebug-tests-step-into-function and
> edebug-tests-trace-function-call-and-return.  I currently have no idea
> what is going on.

When I start the built emacs with
./src/emacs -Q -l ../test/lisp/emacs-lisp/edebug-tests.el
Then enter "M-x ert RET edebug-tests-trace-function-call-and-return
RET", the edebug-tests-trace-function-call-and-return test passes.  So
I'm thinking there is an issue with the test framework itself for this
one?
When I run edebug-tests-step-into-function similarly, it still fails,
but I am having difficulty understanding the testing framework in the
first place.  The test does not seem to open a buffer for executing
the test at all, it just throws a cl-block-error.

Lynn

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Wed, 09 Jul 2025 10:58:02 GMT) Full text and rfc822 format available.

Message #77 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> protonmail.com>
To: Lynn Winebarger <owinebar <at> gmail.com>
Cc: 78898 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Wed, 09 Jul 2025 10:57:34 +0000

"Lynn Winebarger" <owinebar <at> gmail.com> writes:
> I'd be happy to take any hints on those two remaining edebug
> regressions.  I have no idea what the difference of interest is
> between those two and all the other similar looking tests is, in terms
> of reliance on the behavior of
> eval-region/eval-buffer/load-read-function.

I think the difference is rather subtle, and it's not clear to me the
old behavior, which edebug relies on, is actually better.

The old readevalloop skips whitespace and comments before calling
Vload_read_function.  The new code, AFAIT, only calls
lread_skip_insignificant if use_read0.

The problem is that edebug assumes the point position when its
Vload_read_function is called is actually the beginning of the form; it
then uses

    (end-of-defun)
    (beginning-of-defun)

to jump to what it thinks is the beginning of the defun.  But point was
still in the previous defun (before the separating whitespace), so it's
actually the beginning of the previous defun, so we fail to instrument
the correct defun in edebug-tests-step-into-function.

diff --git a/src/lread.c b/src/lread.c
index 3be2f8bd7d2..b247b6251b5 100644
--- a/src/lread.c
+++ b/src/lread.c
@@ -3796,10 +3796,10 @@ readevalloop (Lisp_Object frameptr)
       frame->read_expr = false;
       /* Handle the most common case first */
       /* not sure why the Vpurify_flag/will_dump_p case is special */
+      lread_skip_insignificant (frame);
       if (use_read0)
 	{
 	/* avoid condition case */
-	  lread_skip_insignificant (frame);
 	  if (frame->read_eof)
 	      break;
 	  val = read0 (frame);

appears to fix these test failures.

Pip

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Wed, 16 Jul 2025 02:23:01 GMT) Full text and rfc822 format available.

Message #80 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Pip Cet <pipcet <at> protonmail.com>
Cc: 78898 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Tue, 15 Jul 2025 22:22:01 -0400

On Wed, Jul 9, 2025 at 6:57 AM Pip Cet <pipcet <at> protonmail.com> wrote:
>
> "Lynn Winebarger" <owinebar <at> gmail.com> writes:
> > I'd be happy to take any hints on those two remaining edebug
> > regressions.  I have no idea what the difference of interest is
> > between those two and all the other similar looking tests is, in terms
> > of reliance on the behavior of
> > eval-region/eval-buffer/load-read-function.
>
> I think the difference is rather subtle, and it's not clear to me the
> old behavior, which edebug relies on, is actually better.
>
> The old readevalloop skips whitespace and comments before calling
> Vload_read_function.  The new code, AFAIT, only calls
> lread_skip_insignificant if use_read0.

That was on purpose.  Actually, there were a number of places I "poked
the bear" by simplifying things I did not see the point of to either
show it was unnecessary or find the undocumented interface promise.
Most of it was deep in the middle of transforming the code/data
structures though (and middle of the night, likely).  In particular,
I'd like to add a "syntax-object" lisp type and augment the reader to
track whitespace and comments in such objects so that I can implement
a "rewriting-pcase" efficiently in read0.  Rewriting the text of the
code means the comments associated with a piece of code should move
with the text, after all.  And since load-read-function *could* be an
arbitrary lisp function, there is no a priori reason to believe it
promises to skip whitespace and comments with no effect.  At least,
there was nothing documented to that effect.
Also, I'm not sure that edebug relies on that behavior so much as the
edebug test framework.

>
> The problem is that edebug assumes the point position when its
> Vload_read_function is called is actually the beginning of the form; it
> then uses
>
>     (end-of-defun)
>     (beginning-of-defun)
>
> to jump to what it thinks is the beginning of the defun.  But point was
> still in the previous defun (before the separating whitespace), so it's
> actually the beginning of the previous defun, so we fail to instrument
> the correct defun in edebug-tests-step-into-function.

Thanks for this.  I think that resolves all the known inconsistent
behaviors. Now I just have to take the final version and rebase it (or
is it "merge" it?) against Mattias's updates.  Oh, and use the patch I
provided in "bug#79001: Retain profile image files generated during
build" to get some concrete, statistically significant measurements of
the impact.

Lynn

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Wed, 16 Jul 2025 07:19:02 GMT) Full text and rfc822 format available.

Message #83 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 78898 <at> debbugs.gnu.org
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Tue, 15 Jul 2025 23:46:45 -0400

[Message part 1 (text/plain, inline)]

On Sun, Jun 29, 2025 at 1:03 PM Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:
>
>
> I can definitely go along with that.  The reason why I care so much
> about the motivation behind the change is that it affects the tradeoffs
> (e.g. impact on performance or code maintenance).  Have you made any
> measurements about the impact on the speed for `read`?
> Emacs startup time is significantly affected by the time to `read`.
>
I've modified the build process to retain profiling data generated
from using the --enable-profiling option.  See bug#79001: "Retain
profile image files generated during build" for a patch.  I built
emacs for commit 2bdcf0250acecdb0719203ae711aedf5baad1783 as the
baseline and then that commit + the last patchset I provided using:
CC=/usr/local/bin/gcc \
     LD=/usr/local/bin/ld \
     PKG_CONFIG_PATH=/usr/local/lib64/pkgconfig \
    ../configure --enable-checking='yes,glyphs'
--enable-check-lisp-object-type  \
    CFLAGS='-O0 -g3 -gdwarf-5 -fvar-tracking -gstatement-frontiers' \
    --infodir=/usr/local/share/emacs/31/share/info \
    --mandir=/usr/local/share/emacs/31/share/man \
    --docdir=/usr/local/share/emacs/31/share/doc/emacs \
    --localedir=/usr/local/share/emacs/31/share/locale \
    --with-x --with-xim --with-sound --with-xpm --with-jpeg \
    --with-tiff --with-gif --with-png --with-rsvg --with-imagemagick \
    --with-dbus --with-gpm --with-x-toolkit=gtk3 --with-toolkit-scroll-bars \
    --with-libotf --with-m17n-flt --with-cairo  --with-xwidgets \
    --disable-build-details --without-pop --with-mailutils --without-hesiod \
    --with-gameuser=:games --with-kerberos --with-kerberos5 \
    --with-file-notification=inotify --with-modules --with-tree-sitter \
    --with-native-compilation=no --enable-profiling=all
make -j $(nproc) PROFILE_BUILD=yes

The list of ELC jobs run in the BUILD/lisp directory is given in the
attached ELC-gmon-out-files.txt for reference.  I summarized them
using gprof -s, then produced 4 analysis output files for each (hence
8 attached) in BUILD/lisp:

Just the lread functions:
gprof --flat-profile=lread.c ../src/temacs temacs-gmon-ELC.sum
>../ELC-lread-flat-profile-<id>.txt
gprof --graph=lread.c -k lread.c/eval.c ../src/temacs
temacs-gmon-ELC.sum >../ELC-lread-call-graph-<id>.txt

These are for reference - the functions of lread.c only consume about
1.5-2% of the total profile time according to these:
gprof --flat-profile ../src/temacs temacs-gmon-ELC.sum
>../ELC-flat-profile-<id>.txt
gprof --graph ../src/temacs temacs-gmon-ELC.sum >../ELC-call-graph-<id>.txt

The identifiers are 2bdcf0250acecdb0719203ae711aedf5baad1783 for the
baseline and 2bdcf0250acecdb0719203ae711aedf5baad1783-reentrant-reader
for the patched version.
Since these are -O0 builds, it's hard to say too much for cases where
macros have been replaced by inline function calls.  For read0 itself,
there is an increase in "self-seconds" of 3%, presumably due to using
pointers into a data structure instead of static variables.  The same
comment applies to a number of other functions called by read0, but
the penalty for those appears substantially higher.

For some reason I do not understand, the time spent in oblookup
decreased by 9% after the patch.

At any rate, I don't want to spend too much effort plumbing these
results since Mattias has checked in some meaningful changes to
lread.c, so (among other things), after I rebase (or merge?) the patch
to a recent commit, at least there will be more inline function versus
inline function comparison, instead of inline versus macro.

Also, I should be able to make the patch from bug #79001 able to use
gcov or perf for profiling instead of gprof, if that would be of
interest, with fairly little effort.  With a bit of elbow grease, I
could make the elisp profiler run whenever some environment variable
like ELISP_MON_OUT is defined (with the output file name to use), that
could run at the same time as gcov is in effect, I think.

Lynn

[ELC-lread-flat-profile-2bdcf0250acecdb0719203ae711aedf5baad1783.txt (text/plain, attachment)]

[ELC-lread-call-graph-2bdcf0250acecdb0719203ae711aedf5baad1783.txt (text/plain, attachment)]

[ELC-flat-profile-2bdcf0250acecdb0719203ae711aedf5baad1783.txt (text/plain, attachment)]

[ELC-call-graph-2bdcf0250acecdb0719203ae711aedf5baad1783.txt (text/plain, attachment)]

[ELC-lread-flat-profile-2bdcf0250acecdb0719203ae711aedf5baad1783-reentrant-reader.txt (text/plain, attachment)]

[ELC-flat-profile-2bdcf0250acecdb0719203ae711aedf5baad1783-reentrant-reader.txt (text/plain, attachment)]

[ELC-lread-call-graph-2bdcf0250acecdb0719203ae711aedf5baad1783-reentrant-reader.txt (text/plain, attachment)]

[ELC-call-graph2bdcf0250acecdb0719203ae711aedf5baad1783-reentrant-reader.txt (text/plain, attachment)]

[ELC-gmon-out-files.txt (text/plain, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Mon, 21 Jul 2025 15:30:03 GMT) Full text and rfc822 format available.

Message #86 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Mattias Engdegård <mattias.engdegard <at> gmail.com>
Cc: Pip Cet <pipcet <at> protonmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 78898 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Mon, 21 Jul 2025 11:29:36 -0400

[Message part 1 (text/plain, inline)]

On Sat, Jul 5, 2025, 6:13 AM Mattias Engdegård <mattias.engdegard <at> gmail.com>
wrote:

> 28 juni 2025 kl. 12.08 skrev Eli Zaretskii <eliz <at> gnu.org>:
>
> > Do we really want readevalloop be recursive this way?  What does this
> > gain us?
>
> In principle there should be no reason for readevalloop being
> non-reentrant at all and remedying this should be an obvious improvement,
> but I cannot give a concrete example of how it would benefit us immediately.
>
> That said, I think the approach here is a bit too much at once; I'd rather
> nibble away at state a little at a time since much of lread.c is messy
> legacy code.

For my next increment, I want to try managing the read stack with alloca
and friends.  The way it is now, memory allocated for that stack is never
recovered.

I also tried eliminating base_sp from read0, but it failed. IOW, read0 is
getting recursively invoked in the middle of reading an expression.  So the
size of the read stack may not be bounded by the maximum depth of a single
expression.  I want to measure what's going on there.

Also, I'd like some concrete cases for measuring the impact of the reader
performance on emacs startup.  When I look at the profiles of the dump
recipes, the time spent in lread.c is tiny compared to evaluating the code
being read, even when loading byte-compiled code.

Lynn

[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Tue, 22 Jul 2025 12:30:02 GMT) Full text and rfc822 format available.

Message #89 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattias.engdegard <at> gmail.com>
To: Lynn Winebarger <owinebar <at> gmail.com>
Cc: Pip Cet <pipcet <at> protonmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 78898 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Tue, 22 Jul 2025 14:28:52 +0200

21 juli 2025 kl. 17.29 skrev Lynn Winebarger <owinebar <at> gmail.com>:

> For my next increment, I want to try managing the read stack with alloca and friends.  The way it is now, memory allocated for that stack is never recovered.

Not sure why that would be a problem as the stack is usually small. (`alloca` is not without its own problems, especially in Emacs; I prefer not using it.)

> I also tried eliminating base_sp from read0, but it failed. IOW, read0 is getting recursively invoked in the middle of reading an expression.  So the size of the read stack may not be bounded by the maximum depth of a single expression.  I want to measure what's going on there.

`read0` must be re-entrant because the \N{CHARNAME} notation makes us call out to Lisp and possibly load some files (for the char names). We could avoid this if it seems likely that it would be worth the trouble.

> Also, I'd like some concrete cases for measuring the impact of the reader performance on emacs startup.  When I look at the profiles of the dump recipes, the time spent in lread.c is tiny compared to evaluating the code being read, even when loading byte-compiled code.

It could be that the reader is already fairly fast (it's seen some improvement lately) but also that profiling isn't easy. If you don't find what you are looking for in a profile it usually means the time is spent elsewhere. In Emacs that is often the allocator and GC but not always.

I'm all for making the reader faster; even small improvements can matter. Complicating things for little or no benefit, not so much. Removing ancient cruft that no longer matters, yes please.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Tue, 22 Jul 2025 16:49:02 GMT) Full text and rfc822 format available.

Message #92 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Mattias Engdegård <mattias.engdegard <at> gmail.com>
Cc: Pip Cet <pipcet <at> protonmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 78898 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Tue, 22 Jul 2025 12:47:38 -0400

[Message part 1 (text/plain, inline)]

On Tue, Jul 22, 2025, 8:28 AM Mattias Engdegård <mattias.engdegard <at> gmail.com>
wrote:

> 21 juli 2025 kl. 17.29 skrev Lynn Winebarger <owinebar <at> gmail.com>:
>
> > For my next increment, I want to try managing the read stack with alloca
> and friends.  The way it is now, memory allocated for that stack is never
> recovered.
>
> Not sure why that would be a problem as the stack is usually small.
> (`alloca` is not without its own problems, especially in Emacs; I prefer
> not using it.)
>

I'd ask that you withhold judgement on using alloca here until I submit
something.  Considering the read_entry stack is implemented as part of an
optimization of C recursion, allocating it on the C stack is quite
natural.  What I have in mind should actually simplify the code.

The memory leak aspect is more observation than alarm.  The leak is only
safe because the stack is implemented as a global static variable.  Getting
rid of that property means either freeing the stack at the end of the
initial entry to read, or allocating the read_entry stack from the garbage
collected heap.

> I also tried eliminating base_sp from read0, but it failed. IOW, read0 is
> getting recursively invoked in the middle of reading an expression.  So the
> size of the read stack may not be bounded by the maximum depth of a single
> expression.  I want to measure what's going on there.
>
> `read0` must be re-entrant because the \N{CHARNAME} notation makes us call
> out to Lisp and possibly load some files (for the char names). We could
> avoid this if it seems likely that it would be worth the trouble.
>

Thanks, that also answers the question "Can read0 be called during a call
to read-from-string?" posed by Stefan.

read-from-string is called from the pdump loader when an eln library is
initialized.  An occurrence of that notation must be either very unlikely
or carefully orchestrated by the pdump loader.

> Also, I'd like some concrete cases for measuring the impact of the reader
> performance on emacs startup.  When I look at the profiles of the dump
> recipes, the time spent in lread.c is tiny compared to evaluating the code
> being read, even when loading byte-compiled code.
>
> It could be that the reader is already fairly fast (it's seen some
> improvement lately) but also that profiling isn't easy. If you don't find
> what you are looking for in a profile it usually means the time is spent
> elsewhere. In Emacs that is often the allocator and GC but not always.
>

Right, all the more reason to have well-defined benchmarking. That's why I
stated it the way I did: "time spent in lread.c".  I'm not sure how Stefan
was measuring the performance of the reader.

I will hazard a guess that the performance of the reader is more noticeable
when the code being loaded is already compiled.  It may be even more so
when all the code in the dump is compiled as well, especially if there
isn't an explicit call to GC after every call to load.

I'm all for making the reader faster; even small improvements can matter.
> Complicating things for little or no benefit, not so much. Removing ancient
> cruft that no longer matters, yes please.

The change I have in mind should make the code simpler and make the
read_entry stack thread-safe.  I doubt it will make a huge difference to
performance one way or the other, if the size of the stack growth is
reasonably tuned.

Hand written lisp code should definitely not require a huge read stack.  My
guess is that, say, 30 entries would be conservative for all but the most
pathological cases for a single lisp expression.  I have no grasp on how
deep the byte code emitted by the compiler might go, though.  A histogram
of the depth of expressions in the source and byte code of the emacs tree
should be plenty for that tuning.

Lynn

[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Tue, 22 Jul 2025 18:42:02 GMT) Full text and rfc822 format available.

Message #95 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Lynn Winebarger <owinebar <at> gmail.com>
Cc: Pip Cet <pipcet <at> protonmail.com>,
 Mattias Engdegård <mattias.engdegard <at> gmail.com>,
 78898 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Tue, 22 Jul 2025 14:40:50 -0400

> The memory leak aspect is more observation than alarm.  The leak is only
> safe because the stack is implemented as a global static variable.  Getting
> rid of that property means either freeing the stack at the end of the
> initial entry to read, or allocating the read_entry stack from the garbage
> collected heap.

BTW memory that's not returned to the OS but that Emacs can reuse later
is generally not considered as a leak.  E.g., most `malloc`
implementations have the property of (virtually) never bothering to
return freed memory to the OS.

>> `read0` must be re-entrant because the \N{CHARNAME} notation makes us call
>> out to Lisp and possibly load some files (for the char names).

Oh, right, I'd forgotten about this one.   I seem to remember bugs
caused by this (or something similar where we lazily load some Unicode
table).  It's particularly tricky because this Lisp code is only ever
run at most once per session, so the "test coverage" is very poor.

> read-from-string is called from the pdump loader when an eln library is
> initialized.  An occurrence of that notation must be either very unlikely
> or carefully orchestrated by the pdump loader.

Yup: the \N{CHARNAME} notation cannot occur too early in the build.
And we know it won't occur at all in the (non-bootstrap) dump because
the \N{CHARNAME} notation is never used in the `.elc` or `.eln` files
we generate (because `print.c` never generates it).

> Right, all the more reason to have well-defined benchmarking. That's why I
> stated it the way I did: "time spent in lread.c".  I'm not sure how Stefan
> was measuring the performance of the reader.

I'd suggest you simply concatenate a bunch of `.elc` files, until you
get something "large enough" to make the timing easy to measure.
Then measure the time to read that file (put the content inside
a buffer, wrap it inside a pair of `(...)`, then (benchmark-run
(read))`).

[ Back before the pdump code was written, we considered a much simpler
  solution of saving the contents of the obarray into a big `.eld` file as
  a kind of "dump", i.e. basically using one big `.elc` file instead of
  the current `.pdmp`.
  It's easy to make it work (much easier than the pdump code), but it's
  significantly slower.  ]

> I will hazard a guess that the performance of the reader is more noticeable
> when the code being loaded is already compiled.

Yes, we don't care about the speed of reading `.el` files nearly as
much as the speed to read `.elc` or `.eld` files.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Wed, 23 Jul 2025 22:55:02 GMT) Full text and rfc822 format available.

Message #98 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Pip Cet <pipcet <at> protonmail.com>,
 Mattias Engdegård <mattias.engdegard <at> gmail.com>,
 78898 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Wed, 23 Jul 2025 15:53:51 -0700

[Message part 1 (text/plain, inline)]

On Tue, Jul 22, 2025, 2:40 PM Stefan Monnier <monnier <at> iro.umontreal.ca>
wrote:

> > The memory leak aspect is more observation than alarm.  The leak is only
> > safe because the stack is implemented as a global static variable.
> Getting
> > rid of that property means either freeing the stack at the end of the
> > initial entry to read, or allocating the read_entry stack from the
> garbage
> > collected heap.
>
> BTW memory that's not returned to the OS but that Emacs can reuse later
> is generally not considered as a leak.  E.g., most `malloc`
> implementations have the property of (virtually) never bothering to
> return freed memory to the OS.
>

Terminology aside, the memory used for the stack is never freed for use by
other parts of Emacs.  I think the code would be cleaner, and possibly a
little faster, with one of the following approaches:

(1) Just reserve a conservatively large amount of stack space at the start
and abort if it's exceeded.  A lisp variable could be used to let the user
change the size of the reserved space between activations of read0, like
the maximum lisp recursion depth.

(2). Allocate an array in the C stack on entry to read0 for the reader
stack.  During the parse, if all array entries are used, just use C
recursion to call read0.

(3) Reverse the order of call and return in the recursion of option (2).
Meaning, enter read0 with a trampoline that allocates the reader stack with
alloca.  On entry to the read_obj label,  check that there is enough stack
space for the maximum number of entries that can be pushed before control
returns to read_obj.  If there is not enough space, return to the
trampoline.  The trampoline calls read0 in a do...while loop as long as the
stack is nonempty, then returns.  Since the read stack is implemented in a
simple trampoline function, each alloca just extends the initial reader
stack array, so growing that array never requires copying.

I'm going to try implementing (3), since it should be efficient even if
multiple threads were used to run independent lisp VMs.

There's also (3'), in which the trampoline is implemented by putting a
block scope around the parsing code in read0, and putting a trampoline
label just before that block.  That would probably work with -O0, but I
have no idea if the optimizer could reorder code in such a way that the
stack allocations end up being non-contiguous and there are live variable
definitions between them.  It would be nice to replace the "return/call"
sequence with a simple "goto".

> read-from-string is called from the pdump loader when an eln library is
> > initialized.  An occurrence of that notation must be either very unlikely
> > or carefully orchestrated by the pdump loader.
>
> Yup: the \N{CHARNAME} notation cannot occur too early in the build.
> And we know it won't occur at all in the (non-bootstrap) dump because
> the \N{CHARNAME} notation is never used in the `.elc` or `.eln` files
> we generate (because `print.c` never generates it).
>

I'll put this in the "very unlikely" category, at least in configurations
supported by Emacs maintainers.

> Right, all the more reason to have well-defined benchmarking. That's why I
> > stated it the way I did: "time spent in lread.c".  I'm not sure how
> Stefan
> > was measuring the performance of the reader.
>
> I'd suggest you simply concatenate a bunch of `.elc` files, until you
> get something "large enough" to make the timing easy to measure.
> Then measure the time to read that file (put the content inside
> a buffer, wrap it inside a pair of `(...)`, then (benchmark-run
> (read))`).
>

So the circular expression syntax supports creating such nested
expressions?  I didn't notice read0 recurring to the internal start (to get
fresh hash tables for the object map).  I don't see how such an artificial
list of byte code could be guaranteed to be read correctly if those hash
tables are not cleared between top level expressions in the file.

Lynn

[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sat, 26 Jul 2025 16:01:01 GMT) Full text and rfc822 format available.

Message #101 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Lynn Winebarger <owinebar <at> gmail.com>
Cc: Pip Cet <pipcet <at> protonmail.com>,
 Mattias Engdegård <mattias.engdegard <at> gmail.com>,
 78898 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sat, 26 Jul 2025 12:00:08 -0400

>> BTW memory that's not returned to the OS but that Emacs can reuse
>> later is generally not considered as a leak.  E.g., most `malloc`
>> implementations have the property of (virtually) never bothering to
>> return freed memory to the OS.
> Terminology aside, the memory used for the stack is never freed for
> use by other parts of Emacs.

Yup.  Still not considered a leak.  🙁
It may be a waste of memory, of course, but we use similar approaches in
other places without worrying very much about it.  For example, we grow
hashtables on-demand in `puthash`, but we never shrink them back in
`remhash`.

That doesn't mean I'm opposed to changing the read code to be more
careful with its use of memory, but I don't consider it a serious
concern, so code clarity, speed, reentrancy, and things like that are
more important.

>> Yup: the \N{CHARNAME} notation cannot occur too early in the build.
>> And we know it won't occur at all in the (non-bootstrap) dump because
>> the \N{CHARNAME} notation is never used in the `.elc` or `.eln` files
>> we generate (because `print.c` never generates it).
>
> I'll put this in the "very unlikely" category, at least in configurations
> supported by Emacs maintainers.

🙂

>> I'd suggest you simply concatenate a bunch of `.elc` files, until you
>> get something "large enough" to make the timing easy to measure.
>> Then measure the time to read that file (put the content inside
>> a buffer, wrap it inside a pair of `(...)`, then (benchmark-run
>> (read))`).
>
> So the circular expression syntax supports creating such nested
> expressions?

Yes:

    ELISP> '(#1=4 #1# #1=5 #1#)
    (4 4 5 5)

    ELISP> 

> I didn't notice read0 recurring to the internal start (to get
> fresh hash tables for the object map).  I don't see how such an artificial
> list of byte code could be guaranteed to be read correctly if those hash
> tables are not cleared between top level expressions in the file.

The #N# references in a `.elc` file would fail when loaded if they refer
to a #N= that's not in the same file.  After concatenation, since the
#N# refers to the "most recent" #N= it should always refer to the
"right" one.


        Stefan

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Mon, 28 Jul 2025 04:33:01 GMT) Full text and rfc822 format available.

Message #104 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Pip Cet <pipcet <at> protonmail.com>,
 Mattias Engdegård <mattias.engdegard <at> gmail.com>,
 78898 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Mon, 28 Jul 2025 00:32:30 -0400

[Message part 1 (text/plain, inline)]

On Sat, Jul 26, 2025 at 12:00 PM Stefan Monnier
<monnier <at> iro.umontreal.ca> wrote:
>
> >> BTW memory that's not returned to the OS but that Emacs can reuse
> >> later is generally not considered as a leak.  E.g., most `malloc`
> >> implementations have the property of (virtually) never bothering to
> >> return freed memory to the OS.
> > Terminology aside, the memory used for the stack is never freed for
> > use by other parts of Emacs.
>
> Yup.  Still not considered a leak.  🙁
> It may be a waste of memory, of course, but we use similar approaches in
> other places without worrying very much about it.  For example, we grow
> hashtables on-demand in `puthash`, but we never shrink them back in
> `remhash`.
>
> That doesn't mean I'm opposed to changing the read code to be more
> careful with its use of memory, but I don't consider it a serious
> concern, so code clarity, speed, reentrancy, and things like that are
> more important.

Ok, the attached patch bootstraps with -O0 without native compilation.
I haven't done any benchmarking or tuning, or tested with optimization
turned on, but it shows the technique.  The changes don't seem to have
leaked beyond the obviously required scope.  The patch does eliminate
one unwind_protect (per entry to read0) which is made unnecessary by
using alloca.

If this approach works in the reader, the same technique could be
applied to making tail recursion safe for space in the bytecode VM.

> >> I'd suggest you simply concatenate a bunch of `.elc` files, until you
> >> get something "large enough" to make the timing easy to measure.
> >> Then measure the time to read that file (put the content inside
> >> a buffer, wrap it inside a pair of `(...)`, then (benchmark-run
> >> (read))`).
> >
> > So the circular expression syntax supports creating such nested
> > expressions?
>
> Yes:
>
>     ELISP> '(#1=4 #1# #1=5 #1#)
>     (4 4 5 5)
>
>     ELISP>
>
> > I didn't notice read0 recurring to the internal start (to get
> > fresh hash tables for the object map).  I don't see how such an artificial
> > list of byte code could be guaranteed to be read correctly if those hash
> > tables are not cleared between top level expressions in the file.
>
> The #N# references in a `.elc` file would fail when loaded if they refer
> to a #N= that's not in the same file.  After concatenation, since the
> #N# refers to the "most recent" #N= it should always refer to the
> "right" one.

I need to go back and check some elc files, because it seemed like
some of the byte-code had #N= definitions after #N# uses, which could
happen if the enumeration is performed breadth-first rather than
depth-first.

In any case, for benchmarking purposes:
* the cost of allocating the structure being read should count toward
the cost of read, but the cost of growing those hash tables repeatedly
should not.
* none of the constructed expressions are released, so the cost of
expanding the heap size (from the OS) is included.

One quibble I would have is that there is some difference in the
details of reading a character from a buffer and reading from a file.
It just seems weird because ELC files are never loaded into a buffer,
and buffers are not exactly simple.  An alternative would be to memory
map elc files that fstat indicates are regular.  That might be a
useful change in itself.

Lynn

[0001-Allocate-reader-stack-on-C-stack.patch (application/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Mon, 28 Jul 2025 11:17:02 GMT) Full text and rfc822 format available.

Message #107 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Pip Cet <pipcet <at> protonmail.com>,
 Mattias Engdegård <mattias.engdegard <at> gmail.com>,
 78898 <at> debbugs.gnu.org, Eli Zaretskii <eliz <at> gnu.org>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Mon, 28 Jul 2025 07:15:34 -0400

[Message part 1 (text/plain, inline)]

On Mon, Jul 28, 2025 at 12:32 AM Lynn Winebarger <owinebar <at> gmail.com> wrote:
>
> On Sat, Jul 26, 2025 at 12:00 PM Stefan Monnier
> <monnier <at> iro.umontreal.ca> wrote:
> >
> > >> BTW memory that's not returned to the OS but that Emacs can reuse
> > >> later is generally not considered as a leak.  E.g., most `malloc`
> > >> implementations have the property of (virtually) never bothering to
> > >> return freed memory to the OS.
> > > Terminology aside, the memory used for the stack is never freed for
> > > use by other parts of Emacs.
> >
> > Yup.  Still not considered a leak.  🙁
> > It may be a waste of memory, of course, but we use similar approaches in
> > other places without worrying very much about it.  For example, we grow
> > hashtables on-demand in `puthash`, but we never shrink them back in
> > `remhash`.
> >
> > That doesn't mean I'm opposed to changing the read code to be more
> > careful with its use of memory, but I don't consider it a serious
> > concern, so code clarity, speed, reentrancy, and things like that are
> > more important.
>
> Ok, the attached patch bootstraps with -O0 without native compilation.
> I haven't done any benchmarking or tuning, or tested with optimization
> turned on, but it shows the technique.  The changes don't seem to have
> leaked beyond the obviously required scope.  The patch does eliminate
> one unwind_protect (per entry to read0) which is made unnecessary by
> using alloca.
>
This version eliminates the trampoline function and uses C block scope
and goto in place of return/call.  I have no idea if C semantics
guarantee the assumptions about jumping out of block scope and
variables allocated on the stack.  It seems to work with -O0.

Lynn

[0001-Allocate-reader-stack-on-C-stack.patch (text/x-patch, attachment)]

[0002-Inline-read0_trampoline-at-return-site.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Mon, 28 Jul 2025 11:58:01 GMT) Full text and rfc822 format available.

Message #110 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattias.engdegard <at> gmail.com>
To: Lynn Winebarger <owinebar <at> gmail.com>
Cc: Pip Cet <pipcet <at> protonmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 78898 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Mon, 28 Jul 2025 13:57:15 +0200

24 juli 2025 kl. 00.53 skrev Lynn Winebarger <owinebar <at> gmail.com>:

> (1) Just reserve a conservatively large amount of stack space at the start and abort if it's exceeded.  A lisp variable could be used to let the user change the size of the reserved space between activations of read0, like the maximum lisp recursion depth.
> 
> (2). Allocate an array in the C stack on entry to read0 for the reader stack.  During the parse, if all array entries are used, just use C recursion to call read0.
> 
> (3) Reverse the order of call and return in the recursion of option (2).  Meaning, enter read0 with a trampoline that allocates the reader stack with alloca.  On entry to the read_obj label,  check that there is enough stack space for the maximum number of entries that can be pushed before control returns to read_obj.  If there is not enough space, return to the trampoline.  The trampoline calls read0 in a do...while loop as long as the stack is nonempty, then returns.  Since the read stack is implemented in a simple trampoline function, each alloca just extends the initial reader stack array, so growing that array never requires copying.

Thank you, but it's not quite what we had in mind so I think we'll pass on that for now.

More generally, it's not entirely clear what concrete problem you are actually trying to solve.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Tue, 29 Jul 2025 00:11:02 GMT) Full text and rfc822 format available.

Message #113 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Mattias Engdegård <mattias.engdegard <at> gmail.com>
Cc: Pip Cet <pipcet <at> protonmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 78898 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Mon, 28 Jul 2025 20:09:56 -0400

[Message part 1 (text/plain, inline)]

On Mon, Jul 28, 2025 at 7:57 AM Mattias Engdegård
<mattias.engdegard <at> gmail.com> wrote:
>
> 24 juli 2025 kl. 00.53 skrev Lynn Winebarger <owinebar <at> gmail.com>:
>
> > (1) Just reserve a conservatively large amount of stack space at the start and abort if it's exceeded.  A lisp variable could be used to let the user change the size of the reserved space between activations of read0, like the maximum lisp recursion depth.
> >
> > (2). Allocate an array in the C stack on entry to read0 for the reader stack.  During the parse, if all array entries are used, just use C recursion to call read0.
> >
> > (3) Reverse the order of call and return in the recursion of option (2).  Meaning, enter read0 with a trampoline that allocates the reader stack with alloca.  On entry to the read_obj label,  check that there is enough stack space for the maximum number of entries that can be pushed before control returns to read_obj.  If there is not enough space, return to the trampoline.  The trampoline calls read0 in a do...while loop as long as the stack is nonempty, then returns.  Since the read stack is implemented in a simple trampoline function, each alloca just extends the initial reader stack array, so growing that array never requires copying.
>
The attached patch set implements what I described as option (3') in
the original email.  I've tested the resulting code both with and
without native compilation at -O0, and without native compilation at
the default optimization level, and no new test failures are
introduced (versus commit e3380ed6db7a1c1574c4b39d58f9200a3ffab).  In
fact one test failure went away in one of the configurations, though I
have no idea why.  It implements

> Thank you, but it's not quite what we had in mind so I think we'll pass on that for now.
>
> More generally, it's not entirely clear what concrete problem you are actually trying to solve.

Well, this is the first time in my adult life I haven't had an
employer claiming to own every idea or even potential production of
intellectual property regardless of whether it is properly in the
course of employment, so I'm letting myself have some fun with a
program I've liked for most of that adult life.  So, there's no simple
concrete problem beyond the statement of the bug itself.  I think
Stefan had my number on that.

But, I do have a number of ideas for longer term projects to improve
my favorite editor, like making the lisp machine more concurrency
friendly.  For example, eliminating static variables/global state is
part of that, and the reader ecosystem seems like a good place for me
to start.  I also want to add lisp objects for capturing grammatical
structure at a more abstract level than tree-sitter (e.g. supporting
an updated Semantic framework), and integrating the construction of
such objects into the lisp reader would be one project.

If you look at the code after applying all three patches, I think the
code is actually simpler, and possibly faster.  I can add guard rails
for dealing with the risk of stack overflow.  If you really need to
support unbounded expression depth, I can special case allocation of
stack to handle extraordinarily deep expressions.  However, I'm not
sure how much benefit there is to being able to process an ELC file
that has a correct header followed by, say  a million left parentheses
and then the EOF.  If a user wants to handle such outliers, I think
they can be handled without the level of performance provided for more
realistic cases.   For example, by allocating a memory segment with
one read stack entry for every byte of the input file to use as the
read stack.

Lynn

[0001-Allocate-reader-stack-on-C-stack.patch (text/x-patch, attachment)]

[0003-Manage-GC-root-for-read-stack.patch (text/x-patch, attachment)]

[0002-Inline-read0_trampoline-at-return-site.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Wed, 30 Jul 2025 13:10:02 GMT) Full text and rfc822 format available.

Message #116 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Mattias Engdegård <mattias.engdegard <at> gmail.com>
Cc: Pip Cet <pipcet <at> protonmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 78898 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Wed, 30 Jul 2025 09:08:45 -0400

[Message part 1 (text/plain, inline)]

On Mon, Jul 28, 2025 at 8:09 PM Lynn Winebarger <owinebar <at> gmail.com> wrote:
> If you look at the code after applying all three patches, I think the
> code is actually simpler, and possibly faster.  I can add guard rails
> for dealing with the risk of stack overflow.  If you really need to
> support unbounded expression depth, I can special case allocation of
> stack to handle extraordinarily deep expressions.  However, I'm not
> sure how much benefit there is to being able to process an ELC file
> that has a correct header followed by, say  a million left parentheses
> and then the EOF.  If a user wants to handle such outliers, I think
> they can be handled without the level of performance provided for more
> realistic cases.   For example, by allocating a memory segment with
> one read stack entry for every byte of the input file to use as the
> read stack.

I reviewed the material on storage duration in C on cppreference.com.
Except for variable length arrays, the storage of objects in variables
is determined on block entry, not at the declaration.  As long as no
VLAs are declared in the function scope past the read_obj label, the
block scope wrapping the code from that label to the end is
unnecessary.  So I removed that, reverted the whitespace changes, and
squashed it all into one change.

Without the added block scope, the change becomes very simple.   No
new test failures (relative to the baseline commit of master) have
been added.  I can add a slow path for the case where the stack is
almost overflowing or the input is just ridiculously complicated.  For
files being loaded, I'd prefer to add a different source type that
either memory maps the file or slurps it into memory so we can safely
gauge a worst-case size for the slow path.  A similar approach would
apply to buffers, since they are already in memory.  For other stream
types being read, I think erroring out on expressions that would
overflow the stack is reasonable, given I can build emacs and run all
the tests without an overflow.  Stefan's preferred benchmark technique
would be problematic though.  It would almost certainly either
overflow or take the slow path.

I haven't rebased my development branch to the updates in lread.c on
master this morning.  I evidently don't understand how to maintain the
bare clone I made of the emacs repo to let myself use multiple local
working trees.

[0001-Simplify-the-read-stack.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sat, 02 Aug 2025 11:43:02 GMT) Full text and rfc822 format available.

Message #119 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Mattias Engdegård <mattias.engdegard <at> gmail.com>
Cc: Pip Cet <pipcet <at> protonmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 78898 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sat, 2 Aug 2025 07:42:25 -0400

On Wed, Jul 30, 2025 at 9:08 AM Lynn Winebarger <owinebar <at> gmail.com> wrote:
> Without the added block scope, the change becomes very simple.   No
> new test failures (relative to the baseline commit of master) have
> been added.  I can add a slow path for the case where the stack is
> almost overflowing or the input is just ridiculously complicated.  For
> files being loaded, I'd prefer to add a different source type that
> either memory maps the file or slurps it into memory so we can safely
> gauge a worst-case size for the slow path.  A similar approach would
> apply to buffers, since they are already in memory.  For other stream
> types being read, I think erroring out on expressions that would
> overflow the stack is reasonable, given I can build emacs and run all
> the tests without an overflow.  Stefan's preferred benchmark technique
> would be problematic though.  It would almost certainly either
> overflow or take the slow path.

Another way to make this safer is to stop accumulating sequences on
the stack and "reduce" those stack entries early (with Fcons).  That
way, Stefan's benchmark would only add 1 to the maximum stack depth,
rather than making the number of top-level forms the minimum stack
depth.
That will be my next increment.
Lynn

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#78898; Package emacs. (Sat, 02 Aug 2025 12:41:02 GMT) Full text and rfc822 format available.

Message #122 received at 78898 <at> debbugs.gnu.org (full text, mbox):

From: Lynn Winebarger <owinebar <at> gmail.com>
To: Mattias Engdegård <mattias.engdegard <at> gmail.com>
Cc: Pip Cet <pipcet <at> protonmail.com>, Eli Zaretskii <eliz <at> gnu.org>,
 78898 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#78898: Make read/readevalloop reentrant
Date: Sat, 2 Aug 2025 08:39:41 -0400

On Sat, Aug 2, 2025 at 7:42 AM Lynn Winebarger <owinebar <at> gmail.com> wrote:
> On Wed, Jul 30, 2025 at 9:08 AM Lynn Winebarger <owinebar <at> gmail.com> wrote:
> > Without the added block scope, the change becomes very simple.   No
> > new test failures (relative to the baseline commit of master) have
> > been added.  I can add a slow path for the case where the stack is
> > almost overflowing or the input is just ridiculously complicated.  For
> > files being loaded, I'd prefer to add a different source type that
> > either memory maps the file or slurps it into memory so we can safely
> > gauge a worst-case size for the slow path.  A similar approach would
> > apply to buffers, since they are already in memory.  For other stream
> > types being read, I think erroring out on expressions that would
> > overflow the stack is reasonable, given I can build emacs and run all
> > the tests without an overflow.  Stefan's preferred benchmark technique
> > would be problematic though.  It would almost certainly either
> > overflow or take the slow path.
>
> Another way to make this safer is to stop accumulating sequences on
> the stack and "reduce" those stack entries early (with Fcons).  That
> way, Stefan's benchmark would only add 1 to the maximum stack depth,
> rather than making the number of top-level forms the minimum stack
> depth.
> That will be my next increment.
Retract that.  It's already being done that way, so the maximum stack
depth should be approximately the maximum nesting depth of the
s-expression being read.

I'm a little confused about the reluctance to just allocate the read
stack on the C stack and aborting the read if it's too deep.

Then my next step may be to add a source type for sucking elc files
straight into memory (or memory mapping if available), to eliminate
the need for copying to a read buffer in the most common case.

Lynn

This bug report was last modified 12 days ago.

Previous Next

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #78898 Make read/readevalloop reentrant

GNU bug report logs - #78898
Make read/readevalloop reentrant