GNU bug report logs - #16362
compiler doesn't preserve distinctness of literals

Package: guile

Reported by: Zefram <zefram <at> fysh.org>

Date: Sun, 5 Jan 2014 23:45:13 UTC

Severity: normal

Tags: notabug, wontfix

Done: Mark H Weaver <mhw <at> netris.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 16362 in the body.
You can then email your comments to 16362 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-guile <at> gnu.org:
bug#16362; Package guile. (Sun, 05 Jan 2014 23:45:13 GMT)

Acknowledgement sent to Zefram <zefram <at> fysh.org>:
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Sun, 05 Jan 2014 23:45:13 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Zefram <zefram <at> fysh.org>
To: bug-guile <at> gnu.org
Subject: compiler disrespects referential integrity
Date: Sun, 5 Jan 2014 23:13:47 +0000

The guile-2.0.9 compiler doesn't preserve the distinctness of mutable
objects that are referenced in code via the read-eval (#.) facility.
(I'm not mutating the code itself, only quoted objects.)  The interpreter,
and for comparison guile-1.8, do preserve object identity, allowing
read-eval to be used to incorporate direct object references into code.
Test case:

$ cat t9
(cond-expand
  (guile-2 (defmacro compile-time f `(eval-when (compile eval) ,@f)))
  (else (defmacro compile-time f `(begin ,@f))))
(compile-time (fluid-set! read-eval? #t))
(compile-time (define aaa (cons 1 2)))
(set-car! '#.aaa 5)
(write '#.aaa)
(newline)
(write '(1 . 2))
(newline)
$ guile-1.8 t9
(5 . 2)
(1 . 2)
$ guile-2.0 --no-auto-compile t9
(5 . 2)
(1 . 2)
$ guile-2.0 t9
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;;       or pass the --no-auto-compile argument to disable.
;;; compiling /home/zefram/usr/guile/t9
;;; compiled /home/zefram/.cache/guile/ccache/2.0-LE-8-2.0/home/zefram/usr/guile/t9.go
(5 . 2)
(5 . 2)
$ guile-2.0 t9
(5 . 2)
(5 . 2)

In the test case, the explicitly-constructed pair aaa is conflated with
the pair literal (1 . 2), and so the runtime modification of aaa (which
is correctly mutable) affects the literal.

This issue seems closely related to the problem described at
<http://debbugs.gnu.org/cgi/bugreport.cgi?bug=11198>, wherein the compiler
is entirely unable to handle code incorporating references to some kinds
of object.  In that case the failure mode is a compile-time error, so
the problem can be worked around.  The failure mode with pairs, silent
misbehaviour, is a more serious problem.  Between them, these problems
break most of the interesting uses for read-eval, albeit only when using
the compiler.

Debian incarnation of this bug report:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=734157

-zefram




Information forwarded to bug-guile <at> gnu.org:
bug#16362; Package guile. (Wed, 15 Jan 2014 20:00:03 GMT)

Message #8 received at 16362 <at> debbugs.gnu.org:

From: Mark H Weaver <mhw <at> netris.org>
To: Zefram <zefram <at> fysh.org>
Cc: 16362 <at> debbugs.gnu.org, request <at> debbugs.gnu.org
Subject: Re: bug#16362: compiler disrespects referential integrity
Date: Wed, 15 Jan 2014 14:57:05 -0500

tags 16362 notabug
thanks

Zefram <zefram <at> fysh.org> writes:
> The guile-2.0.9 compiler doesn't preserve the distinctness of mutable
> objects that are referenced in code via the read-eval (#.) facility.
> (I'm not mutating the code itself, only quoted objects.)

I'm sorry that you've written code that assumes that this is allowed,
but in Scheme all literals are immutable.

> The interpreter, and for comparison guile-1.8, do preserve object
> identity, allowing read-eval to be used to incorporate direct object
> references into code.

It worked by accident in Guile 1.8, but there's simply no way to support
this robustly in an ahead-of-time compiler, which must serialize all
literals to an object file.

    Thanks,
      Mark
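
A minimal sketch of the portable alternative that Mark's reply implies, assuming the goal is only a mutable pair that stays distinct from any quoted constant. The pair is allocated at run time with cons instead of being spliced into a quote form with #. so nothing depends on the identity of a literal. The name aaa is borrowed from the test case above.

```scheme
;; A pair allocated at run time is mutable and is never merged
;; with a quoted constant by the compiler.
(define aaa (cons 1 2))

(set-car! aaa 5)

(write aaa)        ; prints (5 . 2)
(newline)
(write '(1 . 2))   ; prints (1 . 2), the constant is untouched
(newline)
```

Unlike the #. version, this should behave the same under the interpreter and the compiler, because no quoted datum is ever mutated.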




Added tag(s) notabug. Request was from Mark H Weaver <mhw <at> netris.org> to control <at> debbugs.gnu.org. (Wed, 15 Jan 2014 20:00:04 GMT)

Information forwarded to bug-guile <at> gnu.org:
bug#16362; Package guile. (Wed, 15 Jan 2014 21:03:02 GMT)

Message #13 received at 16362 <at> debbugs.gnu.org:

From: Zefram <zefram <at> fysh.org>
To: Mark H Weaver <mhw <at> netris.org>
Cc: 16362 <at> debbugs.gnu.org
Subject: Re: bug#16362: compiler disrespects referential integrity
Date: Wed, 15 Jan 2014 21:02:51 +0000

Mark H Weaver wrote:
>I'm sorry that you've written code that assumes that this is allowed,
>but in Scheme all literals are immutable.

It's not a literal: the object was not constructed by the action of
the reader.  It was constructed by non-literal means, and merely *passed
through* the reader.

That's not to say your not-a-bug opinion is wrong, though.  Scheme as
defined by RnRS certainly doesn't support this kind of thing.  It treats
the print form of an expression as primary, and so doesn't like having
anything unprintable in the object form.

>It worked by accident in Guile 1.8,

This is the bit that's really news to me.  *Scheme* doesn't support
it, but *Guile* is more than just Scheme, and I presumed that it was
intentional that it took a more enlightened view of what constitutes
an expression.  If that was just an accident, then what you actually
support ought to be documented.  In principle it would also be a good
idea to enforce this restriction in the interpreter, to avoid having
this incompatibility between interpreter and compiler of the `same'
implementation.

>                                    but there's simply no way to support
>this robustly in an ahead-of-time compiler, which must serialize all
>literals to an object file.

Sure there is.  The object in question is eminently serialisable: it
contains only references to other serialisable data.  All that needs
to change is to distinguish between actual literal pairs (that can be
merged) and non-literals whose distinct identity needs to be preserved.
This might well be painful to add to your existing code, given the
way you represent pairs.  But that's a difficulty with the specific
implementation, not an inherent limitation of compilation.

-zefram




Changed bug title to 'compiler doesn't preserve distinctness of literals' from 'compiler disrespects referential integrity' Request was from Mark H Weaver <mhw <at> netris.org> to control <at> debbugs.gnu.org. (Wed, 15 Jan 2014 21:17:02 GMT)

Information forwarded to bug-guile <at> gnu.org:
bug#16362; Package guile. (Wed, 15 Jan 2014 22:18:02 GMT)

Message #18 received at 16362 <at> debbugs.gnu.org:

From: Mark H Weaver <mhw <at> netris.org>
To: Zefram <zefram <at> fysh.org>
Cc: 16362 <at> debbugs.gnu.org
Subject: Re: bug#16362: compiler disrespects referential integrity
Date: Wed, 15 Jan 2014 17:15:14 -0500

Zefram <zefram <at> fysh.org> writes:

> Mark H Weaver wrote:
>>I'm sorry that you've written code that assumes that this is allowed,
>>but in Scheme all literals are immutable.
>
> It's not a literal: the object was not constructed by the action of
> the reader.  It was constructed by non-literal means, and merely *passed
> through* the reader.

In Scheme terminology, an expression of the form (quote <datum>) is a
literal.  Where that <datum> came from is not relevant to the definition
of "literal".

> That's not to say your not-a-bug opinion is wrong, though.  Scheme as
> defined by RnRS certainly doesn't support this kind of thing.  It treats
> the print form of an expression as primary, and so doesn't like having
> anything unprintable in the object form.
>
>>It worked by accident in Guile 1.8,
>
> This is the bit that's really news to me.  *Scheme* doesn't support
> it, but *Guile* is more than just Scheme, and I presumed that it was
> intentional that it took a more enlightened view of what constitutes
> an expression.  If that was just an accident, then what you actually
> support ought to be documented.

Where does it say in the documentation that this is allowed?

To my mind, Guile documents itself as Scheme plus extensions, but you
cannot determine what extensions you can depend on by experiment.  If a
given extension is not documented, then you cannot depend on it.

> In principle it would also be a good idea to enforce this restriction
> in the interpreter, to avoid having this incompatibility between
> interpreter and compiler of the `same' implementation.

Perhaps, but there are always going to be discernable differences
between multiple implementations of the same language.

>>                                    but there's simply no way to support
>>this robustly in an ahead-of-time compiler, which must serialize all
>>literals to an object file.
>
> Sure there is.  The object in question is eminently serialisable: it
> contains only references to other serialisable data.

Yes, but the identity of the objects cannot in general be preserved by
serialization where multiple object files and multiple Guile sessions
are involved.

Consider this: you serialize an object to one file, and then the same
object to a second file.  Now you load them both in from a different
Guile session.  How can the Guile loader know whether these two objects
should have the same identity or be distinct?

> All that needs to change is to distinguish between actual literal
> pairs (that can be merged) and non-literals whose distinct identity
> needs to be preserved.

That information is not preserved by the reader.

> This might well be painful to add to your existing code, given the
> way you represent pairs.  But that's a difficulty with the specific
> implementation, not an inherent limitation of compilation.

There are inherent limitations to serialization.  In the general case,
the identity of mutable objects cannot be reliably preserved.

For example, how do you correctly serialize a procedure produced by
make-counter?

  (define (make-counter)
    (let ((n 0))
      (lambda ()
        (set! n (+ n 1)) n)))

      Mark
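
The loss of identity described in this message can be seen without any object files. The following is a minimal sketch, assuming Guile's call-with-output-string and call-with-input-string. The helper round-trip is hypothetical: it stands in for "serialize, then load" using write and read, and is not how Guile's compiled-file format works.

```scheme
;; round-trip is a hypothetical helper: write an object to a string,
;; then read a fresh copy back from that string.
(define (round-trip obj)
  (call-with-input-string
   (call-with-output-string (lambda (port) (write obj port)))
   read))

(define p (cons 1 2))
(define a (round-trip p))   ; stands in for the copy in a first file
(define b (round-trip p))   ; stands in for the copy in a second file

(display (equal? a b)) (newline)   ; #t: same structure
(display (eq? a b)) (newline)      ; #f: two distinct pairs
(display (eq? a p)) (newline)      ; #f: neither is the original
```

Nothing in the written form records that a and b came from the same object, which is the information a loader would need in order to reunite them.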




Information forwarded to bug-guile <at> gnu.org:
bug#16362; Package guile. (Thu, 16 Jan 2014 01:58:02 GMT)

Message #21 received at 16362 <at> debbugs.gnu.org:

From: Zefram <zefram <at> fysh.org>
To: Mark H Weaver <mhw <at> netris.org>
Cc: 16362 <at> debbugs.gnu.org
Subject: Re: bug#16362: compiler disrespects referential integrity
Date: Thu, 16 Jan 2014 01:57:44 +0000

Mark H Weaver wrote:
>In Scheme terminology, an expression of the form (quote <datum>) is a
>literal.

Ah, sorry, I see your usage now.  R6RS speaks of that kind of expression
being a "literal expression".  (Elsewhere it uses "literal" in the sense
I was using it, referring to the readable representation of an object.)
Section 5.10 "Storage model" says "It is desirable for constants (i.e. the
values of literal expressions) to reside in read-only memory."  So in
the Scheme model, whatever that <datum> in the expression is, it's a
"constant".  Of course, that's in the RnRS view of expressions that
ignores the homoiconic representation.  It's assuming that these
"constants" will always be "literal" in the sense I was using.

>Where does it say in the documentation that this is allowed?

It doesn't: as far as I can see it doesn't document that aspect of the
language at all.  It would be nice if it did.

>To my mind, Guile documents itself as Scheme plus extensions,

I thought the documentation was attempting to document the language that
Guile implements per se.  It doesn't generally just refer to RnRS for the
language definition; it actually tells you most of what it could have
referred to RnRS for.  For example, it fully describes tail recursion,
without any reference to RnRS.  It's good that it does this, and it
would be good for it to be more complete in the areas such as this where
it's lacking.

So maybe I got the wrong impression of the documentation's role.  As the
documentation doesn't describe expressions in the RnRS character-based
way, I got the impression that Guile had not necessarily adopted that
restriction.  As it doesn't describe expressions in the homoiconic way
either, I interpreted it as silent on the issue, making experimentation
appropriate to determine the intent.

Maybe the documentation should have a note about its relationship
to the Scheme language definition: say which things it tries to be
authoritative about.

>cannot determine what extensions you can depend on by experiment.

Fair point, and I'm not bitter about my experiment turning out to have
this limited applicability.

>Consider this: you serialize an object to one file, and then the same
>object to a second file.  Now you load them both in from a different
>Guile session.  How can the Guile loader know whether these two objects
>should have the same identity or be distinct?

That's an interesting case, and I suppose I wouldn't expect that to
preserve identity.  I also wouldn't expect you to serialise an I/O port.
But the case I'm concerned about is a standalone script, being compiled
as a whole, and the objects it's setting up at compile time are made of
ordinary data.

I think some of our difference of opinion here comes because you're
mainly thinking of the compiler as something to apply to modules, so
you expect to deal with many compiled files in one session, whereas I'm
thinking about compilation of a program as a whole.  Your viewpoint is
the more general.

>For example, how do you correctly serialize a procedure produced by
>make-counter?

Assuming we're only serialising it to one file, it shouldn't be any more
difficult than my test case with a mutable pair.  The procedure object
needs to contain a reference to the body expression and a reference to
the lexical environment that it closed over.  The lexical environment
contains the binding of the symbol "n" to a variable, which contains
some current numeric value.  That variable is the basic mutable item
whose identity needs to be maintained through serialisation.  If we have
multiple procedures generated by make-counter, they'll have distinct
variables, and therefore distinct lexical environments, and therefore
be distinct procedures, though they'll share bodies.

The only part of this that looks at all difficult to me is that you may
have compiled the function body down to VM code, which is not exactly
a normal Lisp object and needs its own serialisation arrangements.
Presumably you already have that solved in order to compile code that
contains function definitions.  Aside from that it's all ordinary
Lisp objects that look totally serialisable.  What do you think is the
difficult part?

-zefram
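
A short sketch of the point about distinct variables, reusing the make-counter definition quoted earlier in the thread. The names c1 and c2 are illustrative only. It shows the state that any serialisation scheme would have to keep separate for each procedure.

```scheme
(define (make-counter)
  (let ((n 0))
    (lambda ()
      (set! n (+ n 1)) n)))

;; Each call to make-counter closes over its own variable n.
(define c1 (make-counter))
(define c2 (make-counter))

(display (c1)) (newline)            ; 1
(display (c1)) (newline)            ; 2
(display (c2)) (newline)            ; 1, c2 has its own n
(display (eqv? c1 c2)) (newline)    ; #f: distinct procedures
```

The two procedures share a body but not a binding for n, so merging them, or merging their variables, would change the program's behaviour.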




Information forwarded to bug-guile <at> gnu.org:
bug#16362; Package guile. (Wed, 01 Oct 2014 19:05:02 GMT)

Message #24 received at 16362 <at> debbugs.gnu.org:

From: Mark H Weaver <mhw <at> netris.org>
To: Zefram <zefram <at> fysh.org>
Cc: 16362 <at> debbugs.gnu.org, request <at> debbugs.gnu.org
Subject: Re: bug#16362: compiler disrespects referential integrity
Date: Wed, 01 Oct 2014 15:04:00 -0400

tags 16362 + notabug wontfix
close 16362
thanks

I'm sorry that you came to depend on the undocumented behavior of
earlier versions of Guile, but the Scheme standards are quite clear that
literals are immutable and that no guarantees are made about preserving
object identity as seen by eq? or eqv?.  To my knowledge we never made
any promises that this would work, and we can't make it work properly in
the general case in our new ahead-of-time compilation model.

I'm closing this ticket.

      Mark




Added tag(s) wontfix. Request was from Mark H Weaver <mhw <at> netris.org> to control <at> debbugs.gnu.org. (Wed, 01 Oct 2014 19:05:03 GMT)

bug closed, send any further explanations to 16362 <at> debbugs.gnu.org and Zefram <zefram <at> fysh.org> Request was from Mark H Weaver <mhw <at> netris.org> to control <at> debbugs.gnu.org. (Wed, 01 Oct 2014 19:05:04 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 30 Oct 2014 11:24:05 GMT)

This bug report was last modified 10 years and 233 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.