#70368 - [PATCH] Use a dedicated type to represent interpreted-function values

GNU bug report logs - #70368
[PATCH] Use a dedicated type to represent interpreted-function values

Package: emacs;

Reported by: Stefan Monnier <monnier <at> iro.umontreal.ca>

Date: Sat, 13 Apr 2024 19:58:03 UTC

Severity: normal

Tags: patch

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.

View this message in rfc822 format

From: Stefan Monnier <monnier <at> iro.umontreal.ca> To: Eli Zaretskii <eliz <at> gnu.org> Cc: 70368 <at> debbugs.gnu.org Subject: bug#70368: [PATCH] Use a dedicated type to represent interpreted-function values Date: Sun, 14 Apr 2024 09:49:23 -0400

> I don't think I understand the implications of this on compatibility > of byte-code. Will the byte-code produced by Emacs 30 after these > changes be compatible or incompatible with previous versions of Emacs? The byte code is unchanged, so compatible both ways. Some new `.elc` files may be incompatible with older Emacsen if they happen to contain a non-compiled function (e.g. via a macro which somehow gets an interpreted function during the expansion (e.g. via `eval` of by fetching it from a not-yet-compiled file) and then stashes it in the returned code). FWIW, I don't see any occurrence of that in Emacs's own files, for example. > And what about the compatibility in the other direction? There are two kinds of incompatibilities it introduces, in my experience: - "soft" incompatibilities for code which tries to display the kind of the object it receives (such as in a completion table that wants to add an icon indicating if something is a macro, a compiled function, etc...). Such code will still work but may display less informative info because it may fail to recognize the new objects as being interpreted functions. - "real" incompatibilities for code which digs inside the entrails of functions to try and extract specific information. This may fail when faced with the new interpreted functions but should be easy to fix. As long as such code only tries to extract info and does it via `help-function-arglist`, `documentation`, and `interactive-form`, there's no problem, but some packages may use code inherited from a long time ago when `help-function-arglist` didn't exist, or written by coders who didn't know better. I know Hyperbole used to do that, but I believe that's been fixed since. - "hard" incompatibilities for code which really digs inside the code of functions. `vc.el` did that a long time ago, `kmacro.el` did as well until OClosures, and Buttercup (a NonGNU ELPA package) did until a few months ago. There are probably one or two packages out there that will be affected like Buttercup would have been. FWIW, the Buttercup case was a mix of "hard" and "soft": it really looked inside the code, but used that only to provide more informative messages and had a fallback case when the code was compiled, so it would still work OK. >> The first point above means that `help-function-arglist`, >> `documentation`, and `interactive-form`s don't need to >> distinguish interpreted and bytecode functions any more. > Why is such a distinction needed? Indeed, why? [ More seriously, what I'm saying above is that for those three functions, the code used for byte-code objects now also works on interpreted functions by virtue of both objects sharing the same underlying representation. ] >> Main benefits of the change: >> >> - We can now reliably distinguish a list from a function value. >> This removes some ambiguity in cases where the data can be made of >> a list or a function, such as `run-hooks` or completion tables. >> - `cl-defmethod` can dispatch on `interactive-function`. >> Dispatch on `function` also works now for interpreted functions (but still >> won't work for functions represented as lists or as symbols, of course). >> - Function values are now self-evaluating. That was already the case >> when byte-compiled, but not when interpreted since >> (eval '(closure ...)) signals a void-function error. >> That also avoids false-positive warnings about "don't quote your lambdas" >> when doing things like `(mapcar ',func ...)`. > > Are there no downsides, none whatever? It would sound too good to be > true... There's the backward incompatibilty. > Do the above changes require changes to .gdbinit? E.g., I see that we > use PVEC_COMPILED there. Indeed, I missed that, thanks, fixed. > This new #f syntax is not documented anywhere, AFAICT. If this is the > new printed representation (and maybe also read syntax?) of functions, It's only a cl-print representation, it's not `read`able. The `read`able representation uses the #[...] syntax also used for bytecode functions. > it should be documented, like we do with other printed representations. Where would that be? I don't see where we document the #f(compiled-function ...) used for byte-code, which was my inspiration for the #f(lambda ...). >> --- a/lisp/emacs-lisp/byte-opt.el >> +++ b/lisp/emacs-lisp/byte-opt.el >> @@ -164,7 +164,7 @@ byte-compile-inline-expand >> ;; The byte-code will be really inlined in byte-compile-unfold-bcf. >> (byte-compile--check-arity-bytecode form fn) >> `(,fn ,@(cdr form))) >> - ((or `(lambda . ,_) `(closure . ,_)) >> + ((pred interpreted-function-p) > > What does this change mean for backward compatibility? Nothing, it just reflects the fact that this code previously received interpreted function in the form of (closure ...) and now it receives them in the form of `interpreted-function` objects. >> diff --git a/lisp/emacs-lisp/bytecomp.el b/lisp/emacs-lisp/bytecomp.el >> index fb3278c08ab..c82f8ba6ae2 100644 >> --- a/lisp/emacs-lisp/bytecomp.el >> +++ b/lisp/emacs-lisp/bytecomp.el >> @@ -2900,9 +2900,14 @@ byte-compile-output-as-comment >> (defun byte-compile--reify-function (fun) >> "Return an expression which will evaluate to a function value FUN. >> FUN should be an interpreted closure." >> - (pcase-let* ((`(closure ,env ,args . ,body) fun) >> - (`(,preamble . ,body) (macroexp-parse-body body)) >> - (renv ())) >> + (let* ((args (aref fun 0)) >> + (body (aref fun 1)) >> + (env (aref fun 2)) >> + (docstring (function-documentation fun)) >> + (iform (interactive-form fun)) >> + (preamble `(,@(if docstring (list docstring)) >> + ,@(if iform (list iform)))) >> + (renv ())) > > And this? Same: this adjust the internal code of `byte-compile` so it works on the new representation of function values. >> @@ -5571,7 +5575,7 @@ display-call-tree >> " <compiled macro>" >> " <macro>")) >> ((eq 'lambda (car f)) >> - "<function>") >> + "<function-like list>") > ^^^^^^^^^^^^^^^^^^^^ > Should this be documented somewhere? This should be an extremely rare occurrence, and `display-call-tree` itself is not documented, so I doubt it's worth the trouble. If you prefer I can keep it as `<function>`. >> +DEFUN ("closurep", Fclosurep, Sclosurep, >> + 1, 1, 0, >> + doc: /* Return t if OBJECT is a function object. */) > > If the doc string is correct, then why is the function called > 'closurep'? It's against mnemonic memory. The term "closure" is kind of funny, indeed: often it's used to say "this is a piece of code packaged with its environment", but in most cases where it's used all functions are like that, so the precise meaning of "closure" can be argued to be exactly the same as "function", just with a different connotation. Sometimes the connotation is to insist on the fact that it captured some variables, but object it's used to distinguish between the abstract notion of functions and their runtime representation, where "closure" means "an object which represents a function". IOW, the docstring above is meant to say "a function *object*" rather than "a *function* object". Compare with `functionp`: Return t if OBJECT is a function. Maybe I should say something like: Return t if OBJECT is a closure, i.e. a function object. ? >> +DEFUN ("interpreted-function-p", Finterpreted_function_p, >> + Sinterpreted_function_p, 1, 1, 0, >> + doc: /* Return t if OBJECT is an interpreted function value. */) > ^^^^^^^^^^^^^^ > "function value"? what's that? In general, `eval` takes an source expression and returns a value. (lambda (x) (+ x y)) is a list of length 3 which represents a function as a source expression and `eval` turns it into a value (presumably a closure which will remember the current binding of `y`). (closure .. (x) (+ x y)) used to be the representation used for interpreted function *values*. >> + (Lisp_Object object) >> +{ >> + if (CLOSUREP (object) && CONSP (AREF (object, CLOSURE_CODE))) >> return Qt; >> return Qnil; > > These new primitives should be documented in the ELisp manual, and > perhaps also mentioned in NEWS. OK. >> +DEFUN ("make-interpreted-closure", Fmake_interpreted_closure, >> + Smake_interpreted_closure, 3, 5, 0, >> + doc: /* Make an interpreted closure. > > Please try to mention the arguments in the first sentence of doc > string. > > And this primitive should also be documented in the ELisp manual. OK. >> +ARGS should be the list of formal arguments. >> +BODY should be a non-empty list of forms. >> +ENV should be a lexical environment, like the second argument of `eval'. >> +IFORM if non-nil should be of the form (interactive ...). */) >> + (Lisp_Object args, Lisp_Object body, Lisp_Object env, >> + Lisp_Object docstring, Lisp_Object iform) >> +{ >> + CHECK_CONS (body); /* Make sure it's not confused with byte-code! */ >> + if (!NILP (iform)) >> + { >> + iform = Fcdr (iform); >> + return CALLN (Fmake_byte_code, >> + args, body, env, Qnil, docstring, >> + NILP (Fcdr (iform)) >> + ? Fcar (iform) >> + : CALLN (Fvector, XCAR (iform), XCDR (iform))); >> + } >> + else if (!NILP (docstring)) >> + return CALLN (Fmake_byte_code, args, body, env, Qnil, docstring); >> + else >> + return CALLN (Fmake_byte_code, args, body, env); > > I'm probably missing something, but if the doc string says "make an > _interpreted_ closure", why does the implementation call > make-byte-code? Isn't byte-code a kind-of antithesis of > "interpreted"? Because `make-byte-code` is the function that we already have which creates PVEC_CLOSURE (previous called PVEC_COMPILED) objects. It used to be used exclusively to create byte-code functions but now that same representation is used for interpreted function. I could rename it but we already have `make-closure` for something slightly different, and I don't see much benefit to the rename. I'll add a comment explaining why we use `Fmake_byte_code`, Stefan

This bug report was last modified 1 year and 22 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #70368 [PATCH] Use a dedicated type to represent interpreted-function values

GNU bug report logs - #70368
[PATCH] Use a dedicated type to represent interpreted-function values