GNU bug report logs - #20087
'gensym' is not guaranteed to return a fresh symbol
Reported by: ludo <at> gnu.org (Ludovic Courtès)
Date: Wed, 11 Mar 2015 17:16:02 UTC
Severity: normal
Done: Andy Wingo <wingo <at> pobox.com>
Bug is archived. No further changes may be made.
Report forwarded to bug-guile <at> gnu.org: bug#20087; Package guile. (Wed, 11 Mar 2015 17:16:02 GMT)
Acknowledgement sent to ludo <at> gnu.org (Ludovic Courtès): New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Wed, 11 Mar 2015 17:16:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
‘gensym’ returns interned symbols, but the algorithm to determine the
new symbol is simplistic and predictable.
Thus, one can arrange to produce a symbol before ‘gensym’ does, leading
‘gensym’ to return a symbol that’s not fresh (in terms of ‘eq?’), as is
the case with the second call to ‘gensym’ here:
--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (gensym "x")
$1 = x379
scheme@(guile-user)> 'x405
$2 = x405
scheme@(guile-user)> (gensym "x")
$3 = x405
--8<---------------cut here---------------end--------------->8---
Should we worry about it? I think it may have hard-to-anticipate
security implications.
Ludo’.
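The transcript above depends on guessing the counter by eye. A minimal sketch of the same experiment as a script follows. It assumes that gensym names are the prefix followed by a decimal counter that advances by one per call, which is what the transcript suggests but is not a documented guarantee; the variable names are illustrative only.

```scheme
;; Sketch: try to provoke a gensym collision from a script.
;; Assumption: the name is PREFIX followed by a decimal counter.
(define prefix "x")

(define first-sym (gensym prefix))

;; Numeric suffix of the symbol just generated, or #f if the name does
;; not have the assumed shape.
(define first-count
  (string->number
   (substring (symbol->string first-sym) (string-length prefix))))

;; Intern the next 100 candidate names before gensym gets to them.
(define pre-interned
  (if first-count
      (map (lambda (i)
             (string->symbol
              (string-append prefix
                             (number->string (+ first-count i 1)))))
           (iota 100))
      '()))

(define second-sym (gensym prefix))

;; #t means gensym handed back a symbol that already existed.
(define collided? (and (memq second-sym pre-interned) #t))

(display (if collided?
             "gensym returned a pre-interned symbol\n"
             "no collision observed\n"))
```

If the assumption about the naming scheme does not hold on a given Guile version, the script simply reports that no collision was observed.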
Information forwarded to bug-guile <at> gnu.org: bug#20087; Package guile. (Fri, 18 Mar 2016 18:10:02 GMT)
Message #8 received at 20087 <at> debbugs.gnu.org:
Hello
I agree, this goes against the main assumption people have about gensym.
I was able to reproduce the bug.
Here's a patch to libguile/symbol.c which fixes this behavior by
incrementing the gensym counter in a loop until it creates a fresh
symbol.
[guile-gensym-fix.diff (text/x-diff, attachment)]
Information forwarded to bug-guile <at> gnu.org: bug#20087; Package guile. (Tue, 22 Mar 2016 05:25:01 GMT)
Message #11 received at 20087 <at> debbugs.gnu.org:
ludo <at> gnu.org (Ludovic Courtès) writes:
> ‘gensym’ returns interned symbols, but the algorithm to determine the
> new symbol is simplistic and predictable.
>
> Thus, one can arrange to produce a symbol before ‘gensym’ does, leading
> ‘gensym’ to return a symbol that’s not fresh (in terms of ‘eq?’), as is
> the case with the second call to ‘gensym’ here:
rain1 <at> openmailbox.org writes:
> I agree, this goes against the main assumption people have about
> gensym. I was able to reproduce the bug.
>
> Here's a patch to libguile/symbol.c which fixes this behavior by
> incrementing the gensym counter in a loop until it creates a fresh
> symbol.
I've considered this idea in the past, but it only avoids collisions
with symbols that have been interned before the gensym. It does not
avoid collisions with symbols interned *after* the gensym. Obviously,
there's no way to avoid such collisions.
Therefore, we must unfortunately live with the possibility of
collisions. Furthermore, if we add a requirement for deterministic
behavior (which I think we must), then we must live with the fact that
intentional collisions can be trivially achieved.
With this in mind, I'm not sure it makes sense to add code that
attempts, but fails, to eliminate the possibility of collisions.
If we cannot eliminate the possibility of collisions, and we cannot
avoid intentional collisions, what can we do? I think the best we can
hope for is to significantly reduce the probability of _unintentional_
collisions, perhaps by starting the gensym counter at a large number.
The other thing we can do is to clearly document these inherent problems
with gensym, so that they will not be misused for jobs for which they
are not appropriate.
What do you think?
Mark
Information forwarded to bug-guile <at> gnu.org: bug#20087; Package guile. (Tue, 22 Mar 2016 07:59:02 GMT)
Message #14 received at 20087 <at> debbugs.gnu.org:
Mark H Weaver <mhw <at> netris.org> skribis:
> I've considered this idea in the past, but it only avoids collisions
> with symbols that have been interned before the gensym. It does not
> avoid collisions with symbols interned *after* the gensym. Obviously,
> there's no way to avoid such collisions.
Yeah, good point.
> If we cannot eliminate the possibility of collisions, and we cannot
> avoid intentional collisions, what can we do? I think the best we can
> hope for is to significantly reduce the probability of _unintentional_
> collisions, perhaps by starting the gensym counter at a large number.
I’m not sure if that would help.
One thing that could help avoid unintentional collisions is to
automatically add whitespace before the number, such that:
(gensym "x") => #{x 123}#
(This is already the case when called with no arguments.)
> The other thing we can do is to clearly document these inherent problems
> with gensym, so that they will not be misused for jobs for which they
> are not appropriate.
I think we should add a sentence to that effect in the manual.
Thanks,
Ludo’.
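The whitespace idea can be approximated today by putting the space in the prefix. The sketch below assumes that gensym uses the given prefix verbatim at the start of the name; `spaced-gensym` is a hypothetical helper, not part of Guile.

```scheme
;; Hypothetical helper: append a space to the prefix so that the
;; generated name has the shape "x <counter>".
(define (spaced-gensym prefix)
  (gensym (string-append prefix " ")))

(define s (spaced-gensym "x"))

;; Reading a bare token such as x123 cannot produce this name, because
;; the name contains a space.
(write s)
(newline)
```

This only lowers the chance of accidental collisions. The symbol is still interned, so `string->symbol` with the same name, or the `#{...}#` read syntax, still yields the very same symbol.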
Information forwarded to bug-guile <at> gnu.org: bug#20087; Package guile. (Tue, 22 Mar 2016 11:30:02 GMT)
Message #17 received at 20087 <at> debbugs.gnu.org:
On 2016-03-22 05:24, Mark H Weaver wrote:
> ludo <at> gnu.org (Ludovic Courtès) writes:
>> ‘gensym’ returns interned symbols, but the algorithm to determine the
>> new symbol is simplistic and predictable.
>>
>> Thus, one can arrange to produce a symbol before ‘gensym’ does, leading
>> ‘gensym’ to return a symbol that’s not fresh (in terms of ‘eq?’), as is
>> the case with the second call to ‘gensym’ here:
>
> rain1 <at> openmailbox.org writes:
>> I agree, this goes against the main assumption people have about
>> gensym. I was able to reproduce the bug.
>>
>> Here's a patch to libguile/symbol.c which fixes this behavior by
>> incrementing the gensym counter in a loop until it creates a fresh
>> symbol.
>
> I've considered this idea in the past, but it only avoids collisions
> with symbols that have been interned before the gensym. It does not
> avoid collisions with symbols interned *after* the gensym. Obviously,
> there's no way to avoid such collisions.
Thanks for looking over the patch I sent!
One expects gensym to create a fresh symbol, something not EQ? to any
symbol that already exists. It is an important property to be able to
rely on, and this patch achieves that.
As for symbols interned afterwards, do you mean something like this?
------------------------
scheme@(guile-user)> (define a (gensym "x"))
scheme@(guile-user)> a
$1 = x280
scheme@(guile-user)> (eq? a (string->symbol "x280"))
$2 = #t
------------------------
In most Lisps, gensym creates an uninterned symbol. I think that would
stop the previous example from giving #t. I could write a patch for this
if wanted.
Information forwarded to bug-guile <at> gnu.org: bug#20087; Package guile. (Tue, 22 Mar 2016 18:08:02 GMT)
Message #20 received at 20087 <at> debbugs.gnu.org:
rain1 <at> openmailbox.org writes:
> On 2016-03-22 05:24, Mark H Weaver wrote:
>> ludo <at> gnu.org (Ludovic Courtès) writes:
>>> ‘gensym’ returns interned symbols, but the algorithm to determine the
>>> new symbol is simplistic and predictable.
>>>
>>> Thus, one can arrange to produce a symbol before ‘gensym’ does, leading
>>> ‘gensym’ to return a symbol that’s not fresh (in terms of ‘eq?’), as is
>>> the case with the second call to ‘gensym’ here:
>>
>> rain1 <at> openmailbox.org writes:
>>> I agree, this goes against the main assumption people have about
>>> gensym. I was able to reproduce the bug.
>>>
>>> Here's a patch to libguile/symbol.c which fixes this behavior by
>>> incrementing the gensym counter in a loop until it creates a fresh
>>> symbol.
>>
>> I've considered this idea in the past, but it only avoids collisions
>> with symbols that have been interned before the gensym. It does not
>> avoid collisions with symbols interned *after* the gensym. Obviously,
>> there's no way to avoid such collisions.
>
> Thanks for looking over the patch I sent!
>
> One expects gensym to create a fresh symbol, something not EQ? to
> any symbol that already exists. It is an important property to be able
> to rely on, and this patch achieves that.
Can you give a (non-contrived) example of an application that requires
the property you stated above, but does not rely on avoiding collisions
with symbols interned after the gensym?
I’m open to the idea that such applications exist, but at the moment I
cannot think of one :)
> As for symbols interned afterwards, do you mean something like this?
>
> ------------------------
> scheme@(guile-user)> (define a (gensym "x"))
> scheme@(guile-user)> a
> $1 = x280
> scheme@(guile-user)> (eq? a (string->symbol "x280"))
> $2 = #t
> ------------------------
Right. Another example would be using ‘read’ after the gensym, on input
that contains a symbol of the same name.
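Both routes to a later collision can be checked in a few lines. This is a minimal sketch that relies only on the behavior described in this thread, namely that ‘gensym’ returns an interned symbol; the variable names are illustrative.

```scheme
(define a (gensym "x"))

;; Route 1: read a datum from text that happens to contain the name.
(define from-read
  (call-with-input-string (symbol->string a) read))

;; Route 2: look the name up with string->symbol.
(define from-string
  (string->symbol (symbol->string a)))

;; Both comparisons are expected to print #t while gensym returns
;; interned symbols.
(display (eq? a from-read))
(newline)
(display (eq? a from-string))
(newline)
```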
> In most Lisps, gensym creates an uninterned symbol. I think that would
> stop the previous example from giving #t.
Indeed, it would solve this problem, but we cannot change the behavior
of Guile's ‘gensym’ in this way, since it would break a lot of existing
code.
By the way, I looked at our manual entry for ‘gensym’, and it includes
the following text:
The symbols generated by ‘gensym’ are _likely_ to be unique, since
their names begin with a space and it is only otherwise possible to
generate such symbols if a programmer goes out of their way to do so.
Uniqueness can be guaranteed by instead using uninterned symbols
(*note Symbol Uninterned::), though they can’t be usefully written out
and read back in.
We have ‘make-symbol’ for creating uninterned symbols, although you must
provide the exact name of the returned symbol.
> I could write a patch for this if wanted.
It would be nice to have another procedure, maybe ‘uninterned-gensym’
(I’m not sure what to call it, names are hard :) which would be like
‘gensym’ but would return an uninterned symbol, and thus reliably avoid
collisions.
If you’d like to contribute such a procedure, that would be welcome.
It is our policy to ask contributors to assign copyright to the Free
Software Foundation. Would you be willing to do this?
Mark
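For illustration, a procedure of the kind suggested above could be sketched in Scheme on top of ‘make-symbol’. The name `uninterned-gensym`, the private counter and the default prefix are all assumptions made for this sketch; nothing here is an existing Guile API apart from ‘make-symbol’ and ‘symbol-interned?’.

```scheme
;; Hypothetical procedure, sketched for illustration only.
;; The counter exists just to make printed names distinguishable;
;; freshness comes from make-symbol, which never interns its result.
(define uninterned-gensym
  (let ((counter 0))
    (lambda* (#:optional (prefix " g"))
      (set! counter (+ counter 1))
      (make-symbol (string-append prefix (number->string counter))))))

(define u (uninterned-gensym "x"))

;; An interned symbol with exactly the same name as U.
(define v (string->symbol (symbol->string u)))

;; Both lines are expected to print #f: U is not interned, and it is
;; distinct from the interned symbol that shares its name.
(display (symbol-interned? u))
(newline)
(display (eq? u v))
(newline)
```

The counter update is not protected against concurrent callers, which only affects the printed names, not freshness. As the manual text quoted above notes, such symbols cannot be usefully written out and read back in.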
Information forwarded to bug-guile <at> gnu.org: bug#20087; Package guile. (Wed, 23 Mar 2016 17:56:01 GMT)
Message #23 received at 20087 <at> debbugs.gnu.org:
ludo <at> gnu.org (Ludovic Courtès) writes:
> Mark H Weaver <mhw <at> netris.org> skribis:
>
>> If we cannot eliminate the possibility of collisions, and we cannot
>> avoid intentional collisions, what can we do? I think the best we can
>> hope for is to significantly reduce the probability of _unintentional_
>> collisions, perhaps by starting the gensym counter at a large number.
>
> I’m not sure if that would help.
>
> One thing that could help avoid unintentional collisions is to
> automatically add whitespace before the number, such that:
>
> (gensym "x") => #{x 123}#
I think this is a good idea.
>> The other thing we can do is to clearly document these inherent problems
>> with gensym, so that they will not be misused for jobs for which they
>> are not appropriate.
>
> I think we should add a sentence to that effect in the manual.
It turns out the manual already has the following text in the ‘gensym’
entry, which I think is sufficient.
The symbols generated by ‘gensym’ are _likely_ to be unique, since
their names begin with a space and it is only otherwise possible to
generate such symbols if a programmer goes out of their way to do so.
Uniqueness can be guaranteed by instead using uninterned symbols
(*note Symbol Uninterned::), though they can’t be usefully written out
and read back in.
What do you think?
Mark
Information forwarded to bug-guile <at> gnu.org: bug#20087; Package guile. (Thu, 24 Mar 2016 08:46:02 GMT)
Message #26 received at 20087 <at> debbugs.gnu.org:
Mark H Weaver <mhw <at> netris.org> skribis:
> ludo <at> gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <mhw <at> netris.org> skribis:
>>
>>> If we cannot eliminate the possibility of collisions, and we cannot
>>> avoid intentional collisions, what can we do? I think the best we can
>>> hope for is to significantly reduce the probability of _unintentional_
>>> collisions, perhaps by starting the gensym counter at a large number.
>>
>> I’m not sure if that would help.
>>
>> One thing that could help avoid unintentional collisions is to
>> automatically add whitespace before the number, such that:
>>
>> (gensym "x") => #{x 123}#
>
> I think this is a good idea.
>
>>> The other thing we can do is to clearly document these inherent problems
>>> with gensym, so that they will not be misused for jobs for which they
>>> are not appropriate.
>>
>> I think we should add a sentence to that effect in the manual.
>
> It turns out the manual already has the following text in the ‘gensym’
> entry, which I think is sufficient.
>
> The symbols generated by ‘gensym’ are _likely_ to be unique, since
> their names begin with a space and it is only otherwise possible to
> generate such symbols if a programmer goes out of their way to do so.
> Uniqueness can be guaranteed by instead using uninterned symbols
> (*note Symbol Uninterned::), though they can’t be usefully written out
> and read back in.
>
> What do you think?
Oh indeed, I guess I had overlooked that.
Ludo’.
Information forwarded to bug-guile <at> gnu.org: bug#20087; Package guile. (Thu, 23 Jun 2016 13:50:02 GMT)
Message #29 received at 20087 <at> debbugs.gnu.org:
On Thu 24 Mar 2016 09:45, ludo <at> gnu.org (Ludovic Courtès) writes:
> Mark H Weaver <mhw <at> netris.org> skribis:
>
>> It turns out the manual already has the following text in the ‘gensym’
>> entry, which I think is sufficient.
>>
>> The symbols generated by ‘gensym’ are _likely_ to be unique, since
>> their names begin with a space and it is only otherwise possible to
>> generate such symbols if a programmer goes out of their way to do so.
>> Uniqueness can be guaranteed by instead using uninterned symbols
>> (*note Symbol Uninterned::), though they can’t be usefully written out
>> and read back in.
>>
>> What do you think?
>
> Oh indeed, I guess I had overlooked that.
I just pushed something to master to error when serializing an
uninterned symbol. Otherwise compiling an uninterned symbol effectively
interns it! I am not sure that we can apply such a fix in 2.0 though as
who knows, maybe someone is compiling something with symbols made with
make-symbol. WDYT? If you agree we can close this bug.
Andy
Information forwarded to bug-guile <at> gnu.org: bug#20087; Package guile. (Thu, 23 Jun 2016 14:15:01 GMT)
Message #32 received at 20087 <at> debbugs.gnu.org:
Andy Wingo <wingo <at> pobox.com> skribis:
> On Thu 24 Mar 2016 09:45, ludo <at> gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <mhw <at> netris.org> skribis:
>>
>>> It turns out the manual already has the following text in the ‘gensym’
>>> entry, which I think is sufficient.
>>>
>>> The symbols generated by ‘gensym’ are _likely_ to be unique, since
>>> their names begin with a space and it is only otherwise possible to
>>> generate such symbols if a programmer goes out of their way to do so.
>>> Uniqueness can be guaranteed by instead using uninterned symbols
>>> (*note Symbol Uninterned::), though they can’t be usefully written out
>>> and read back in.
>>>
>>> What do you think?
>>
>> Oh indeed, I guess I had overlooked that.
>
> I just pushed something to master to error when serializing an
> uninterned symbol. Otherwise compiling an uninterned symbol effectively
> interns it! I am not sure that we can apply such a fix in 2.0 though as
> who knows, maybe someone is compiling something with symbols made with
> make-symbol. WDYT? If you agree we can close this bug.
That makes sense to me.
Thanks!
Ludo’.
Reply sent to Andy Wingo <wingo <at> pobox.com>: You have taken responsibility. (Thu, 23 Jun 2016 16:06:02 GMT)
Notification sent to ludo <at> gnu.org (Ludovic Courtès): bug acknowledged by developer. (Thu, 23 Jun 2016 16:06:02 GMT)
Message #37 received at 20087-done <at> debbugs.gnu.org:
On Thu 23 Jun 2016 16:13, ludo <at> gnu.org (Ludovic Courtès) writes:
> Andy Wingo <wingo <at> pobox.com> skribis:
>
>> I just pushed something to master to error when serializing an
>> uninterned symbol. Otherwise compiling an uninterned symbol effectively
>> interns it! I am not sure that we can apply such a fix in 2.0 though as
>> who knows, maybe someone is compiling something with symbols made with
>> make-symbol. WDYT? If you agree we can close this bug.
>
> That makes sense to me.
Closing then. Thanks for the review :)
Andy
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 22 Jul 2016 11:24:05 GMT)