GNU bug report logs - #51982
Erroneous handling of local variables in byte-compiled nested lambdas

Package: emacs;

Reported by: Paul Pogonyshev <pogonyshev <at> gmail.com>

Date: Fri, 19 Nov 2021 20:32:02 UTC

Severity: normal

Tags: patch

Fixed in version 29.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.


Message #58 received at 51982 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Michael Heerdegen <michael_heerdegen <at> web.de>,
 Paul Pogonyshev <pogonyshev <at> gmail.com>, 51982 <at> debbugs.gnu.org
Subject: Re: bug#51982: Erroneous handling of local variables in byte-compiled
 nested lambdas
Date: Wed, 1 Dec 2021 17:04:44 +0100

On 30 Nov 2021 at 23:41, Stefan Monnier <monnier <at> iro.umontreal.ca> wrote:

> [ We could also force dynamically-scoped code to go through (a neutered
>  version of) cconv.el , so that bytecomp.el and byte-opt.el can presume
>  that `let*` doesn't exist any more.  ]

Yes, a dynbind frontend would be handy for other reasons (some syntactic normalisation in case we can't do it in macroexpand-all).

> BTW, have you checked the impact on byte-code quality?

With respect to these patches? Yes: the B patch gives slightly better code because materialising the accessor (internal-get-closed-var N) is as cheap as, or cheaper than, even a stack variable access. But the difference is small, and since the case is rare it's probably insignificant.

In fact, there is probably a way of making them produce identical code by constant-propagating such forms in the optimiser. Who knows, might give unexpected improvements to existing code as well. Time for an experiment!

>>> These two tests are identical aren't they?
>> No, they exercise different code paths (let and let*).
> 
> Then that deserves a comment ;-)

Will do.
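
As a rough illustration of why a `let` test and a `let*` test are not redundant, here is a hypothetical pair of forms. The actual test cases from the patch are not shown in this message, so the forms and all `demo-` names below are made up for this sketch. Both forms shadow a captured variable; only the binding construct differs.

```elisp
;;; -*- lexical-binding: t -*-
;; Hypothetical sketch, not the tests from the patch.
;; In both forms the inner function F closes over the parameter X,
;; and a later binding shadows X before F is called.

(defconst demo-let-form
  '(lambda (x)
     (let ((f (lambda () x)))
       (let ((x 3)
             (y (funcall f)))
         (list x y))))
  "Shadowing with `let': Y's initialiser is evaluated before X is rebound.")

(defconst demo-let*-form
  '(lambda (x)
     (let ((f (lambda () x)))
       (let* ((x 3)
              (y (funcall f)))
         (list x y))))
  "Shadowing with `let*': Y's initialiser already sees the new X.")

(defun demo-run-interpreted (form arg)
  "Evaluate FORM with lexical binding and call the result with ARG."
  (funcall (eval form t) arg))

(defun demo-run-compiled (form arg)
  "Byte-compile FORM with lexical binding and call the result with ARG."
  (let ((lexical-binding t))
    (funcall (byte-compile form) arg)))
```

Interpreted, both forms return (3 10) for an argument of 10, because F must always see the outer X. Comparing `demo-run-interpreted` with `demo-run-compiled` on the same form is one way to check how a given Emacs build handles each code path.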

>>> Looks good (better than patch A).
>> 
>> And here I was prepared to apply patch A since it's slightly more
>> conservative and it seems to be a rare problem anyway.
>> I've now split the patches in a more sensible (and easily reviewed) way: the
>> first corresponds to patch A, and the second is the diff to B. Take a second
>> look before making up your mind.
>> 
>>> You say "On the other hand, patch B does abuse the cconv data structures
>>> a little (but it works!)" so the code should say something about
>>> this abuse.  At least I failed to see where the abuse lies.
>> 
>> There are comments and doc strings such as
>> 
>>  EXTEND is a list of variables which might need to be accessed even
>>  from places where they are shadowed, because some part of ENV causes
>>  them to be used at places where they originally did not
>>  directly appear.
>> 
>> but with the B patch we put things into `extend` that are not strictly
>> variables but (internal-get-closed-var N).
> 
> See below, I think we don't need to put them there.
> 
>> Similarly, `env` has entries like (VAR . (apply-partially F ARG1 ARG2 ..))
>> where the ARGi are always treated as variables but now they can be access
>> forms as well.
> 
> I don't think the current code assumes that ARGs are vars here.
> You're probably right that it used to be the case and it's not any more,
> but that shouldn't cause problems.  The risk I can see is if one of
> those ARGs is an expression which refers to a var which gets shadowed,
> in which case `cconv--remap-llv` won't rewrite it the way it should.
> But I think with your code ARG will either be a simple var or something
> of the form (internal-get-closed-var N) so we should be safe.
> 
>> @@ -304,6 +304,22 @@ cconv--convert-funcbody
>>             `(,@(nreverse special-forms) ,@(macroexp-unprogn body))))
>>       funcbody)))
>> 
>> +(defun cconv--lifted-arg (var env)
>> +  "The argument to use for VAR in λ-lifted calls according to ENV."
>> +  (let ((mapping (cdr (assq var env))))
>> +    (pcase-exhaustive mapping
>> +      (`(internal-get-closed-var . ,_)
>> +       ;; The variable is captured.
>> +       mapping)
>> +      (`(car-safe (internal-get-closed-var . ,_))
>> +       ;; The variable is mutably captured; skip
>> +       ;; the indirection step because the variable is
>> +       ;; passed "by reference" to the λ-lifted function.
>> +       (cadr mapping))
>> +      ((or '() `(car-safe ,(pred symbolp)))
>> +       ;; The variable is not captured; use the (shadowed) variable value.
>> +       var))))
> 
> The docstring or comment at the beginning should mention this function
> is specifically for shadowed vars.

Right.
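
To make the three branches easier to follow, here is a standalone copy of the quoted function under a hypothetical name, together with a made-up ENV that has one entry per branch. The mapping shapes are taken from the `pcase` patterns in the patch; the variable names and slot numbers are assumptions for illustration only.

```elisp
;;; -*- lexical-binding: t -*-
;; Standalone copy of the quoted function under a hypothetical name,
;; so it can be tried without patching cconv.el.
(defun demo-lifted-arg (var env)
  "Return the argument to pass for shadowed VAR in a λ-lifted call, per ENV."
  (let ((mapping (cdr (assq var env))))
    (pcase-exhaustive mapping
      (`(internal-get-closed-var . ,_)
       ;; Captured: pass the accessor form itself.
       mapping)
      (`(car-safe (internal-get-closed-var . ,_))
       ;; Mutably captured: drop the `car-safe' indirection.
       (cadr mapping))
      ((or '() `(car-safe ,(pred symbolp)))
       ;; Not captured: use the variable itself.
       var))))

;; Made-up ENV with one entry per branch; D has no entry at all.
(defconst demo-env
  '((a . (internal-get-closed-var 0))
    (b . (car-safe (internal-get-closed-var 1)))
    (c . (car-safe c))))

(demo-lifted-arg 'a demo-env) ; => (internal-get-closed-var 0)
(demo-lifted-arg 'b demo-env) ; => (internal-get-closed-var 1)
(demo-lifted-arg 'c demo-env) ; => c
(demo-lifted-arg 'd demo-env) ; => d
```

For `c` in this sample, `var` and `(cadr mapping)` are the same symbol, so either choice in the last branch would give the same result.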

> Also, if mapping is of the form (car-safe SYMBOL) is `var` really the
> correct answer?  Shouldn't it still be (cadr mapping)?

Can there ever be a difference? I don't think so, but prove me wrong!
(If you manage to do that, you will have found a second bug in the original code.)

For context, this is the case where a variable is mutated by a λ-lifted inner function (one that doesn't escape). The variable will be wrapped in a cons but retain its name. Example:

(lambda (x)
  (let ((f (lambda () (setq x (1+ x)))))
    (let ((x 3))
      (list x (funcall f)))))
->
(lambda (x)
  (let ((x (list x))) 
    (let ((f (lambda (x) (setcar x (1+ (car-safe x))))))
      (let ((x 3)
            (closed-x x))
        (list x (funcall f closed-x))))))
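
The two forms above can be checked against each other in the interpreter. This is only a sketch: the `demo-` names are hypothetical, and the second form is the hand-written conversion from this message, not actual cconv output.

```elisp
;;; -*- lexical-binding: t -*-
(defconst demo-source-form
  '(lambda (x)
     (let ((f (lambda () (setq x (1+ x)))))
       (let ((x 3))
         (list x (funcall f)))))
  "The source form: F mutates the outer X, which is then shadowed.")

(defconst demo-converted-form
  '(lambda (x)
     (let ((x (list x)))
       (let ((f (lambda (x) (setcar x (1+ (car-safe x))))))
         (let ((x 3)
               (closed-x x))
           (list x (funcall f closed-x))))))
  "The hand-converted form: X is boxed in a cons and passed explicitly.")

(defun demo-call (form arg)
  "Evaluate FORM with lexical binding and call the result with ARG."
  (funcall (eval form t) arg))

(demo-call demo-source-form 10)    ; => (3 11)
(demo-call demo-converted-form 10) ; => (3 11)
```

In both cases the first element is the shadowing X (3) and the second is the incremented outer X, so the conversion preserves the meaning for this input.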

> Side note: I don't understand why we `(cons closedsym`, since that
> `closedsym` can never appear in another binding (since it's fresh).

Maybe it's to satisfy the invariant checked by the assertion at the top?

> I don't much like this `symbolp` test (which fundamentally seems to
> be trying to recover the information about which branch of the `pcase`
> we're coming from in `cconv--lifted-arg`).

That's precisely what it is trying to do and no, I don't like it much either.

I suppose cconv--lifted-arg could be made a local function; we could then access and mutate local variables. There is something poetically self-referential about that, but I'm not overly fond of the closure creation overhead (better than it once was, but still too high).

>  It at least deserves
> a comment explaining why it's doing the right thing.

> If we can remove this `symbolp` test recovering info about provenance of
> the result of `cconv--lifted-arg` then I think option B is better, but
> otherwise I prefer option A.

I don't see any alternative that is obviously better so I'm applying patch A. We can still go with B later on if we want; the changes are minor.

Good comments, thank you very much!





