GNU bug report logs -
#62009
29.0.60; Emacs crashes on setf symbol-name
Previous Next
To reply to this bug, email your comments to 62009 AT debbugs.gnu.org.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Mon, 06 Mar 2023 19:28:01 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Daniel Mendler <mail <at> daniel-mendler.de>
:
New bug report received and forwarded. Copy sent to
bug-gnu-emacs <at> gnu.org
.
(Mon, 06 Mar 2023 19:28:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Execute the following in the scratch buffer:
(setf (aref (symbol-name 'car) 1) ?o)
Emacs crashes with a segmentation fault. Is this a well-known issue? I
could reproduce the problem on Emacs 27 and 29. Should there be some
mechanism to protect the strings of symbols?
I found the snippet on reddit:
https://old.reddit.com/r/emacs/comments/11ix6yu/ive_found_what_ive_been_looking_for/jb4ah5v/
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Tue, 07 Mar 2023 14:20:02 GMT)
Full text and
rfc822 format available.
Message #8 received at 62009 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Daniel Mendler <mail <at> daniel-mendler.de> writes:
> Execute the following in the scratch buffer:
>
> (setf (aref (symbol-name 'car) 1) ?o)
>
> Emacs crashes with a segmentation fault. Is this a well-known issue? I
> could reproduce the problem on Emacs 27 and 29. Should there be some
> mechanism to protect the strings of symbols?
>
> I found the snippet on reddit:
> https://old.reddit.com/r/emacs/comments/11ix6yu/ive_found_what_ive_been_looking_for/jb4ah5v/
Can't access reddit, but can reproduce in recent master (6fb8a4dff7ef).
To test, first put this file under emacs.git/src/:
[test.el (text/plain, attachment)]
[Message part 3 (text/plain, inline)]
$ make; cd src
Then do the following for each symbol:
- setf
- find-file
- with-current-buffer
- buffer-file-name
$ ./emacs -Q -batch -l test.el -eval '(foo (quote setf))'
[1] "setf"
[2] "sxtf"
$ ./emacs -Q -batch -l test.el -eval '(foo (quote find-file))'
[1] "find-file"
[2] "fxnd-file"
And these below below: aref, null, car, cdr, save-current-buffer
$ ./emacs -Q -batch -l test.el -eval '(foo (quote aref))'
[1] "aref"
Fatal error 11: Segmentation fault
Backtrace:
...
My observation is that symbols "introduced" via C defuns and defmacros
exhibit this problem, whereas those introduced via Elisp defuns and
defmacros do not. No symbols introduced via defvars exhibit this
problem, as shown above with buffer-file-name.
Seeing that it is a segfault, maybe the setf is trying to modify
readonly memory produced by the C defuns and defmacros? If that is the
case, *if* we allow such modifications, we should make the memory
readwrite; *otherwise* maybe we should no-op, warn, or err in setf and
friends when we see readonly memory blocks?
With this collection of GDB commands:
[debug.gdb (text/plain, attachment)]
[Message part 5 (text/plain, inline)]
And this GDB command line option:
$ gdb -x debug.gdb --batch --args ./emacs -Q -batch -l ../test.el -eval '(foo (quote car))' > car.backtrace
I get the backtrace (attached below) for setf + symbol-name + 'car as
reported by OP.
[car.backtrace (text/plain, attachment)]
[Message part 7 (text/plain, inline)]
HTH.
--
Best,
RY
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Tue, 07 Mar 2023 15:46:01 GMT)
Full text and
rfc822 format available.
Message #11 received at 62009 <at> debbugs.gnu.org (full text, mbox):
Daniel Mendler <mail <at> daniel-mendler.de> writes:
> Execute the following in the scratch buffer:
>
> (setf (aref (symbol-name 'car) 1) ?o)
>
> Emacs crashes with a segmentation fault. Is this a well-known issue? I
> could reproduce the problem on Emacs 27 and 29. Should there be some
> mechanism to protect the strings of symbols?
I vaguely remember this has been discussed some time ago, but I don't
find a bug report about it. Maybe it had been on emacs-dev.
Maybe the outcome was something like that we can't protect everybody in
every case from shooting in the own foot, I don't recall.
Michael.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Tue, 07 Mar 2023 17:09:01 GMT)
Full text and
rfc822 format available.
Message #14 received at 62009 <at> debbugs.gnu.org (full text, mbox):
On 3/7/23 16:45, Michael Heerdegen wrote:
> Daniel Mendler <mail <at> daniel-mendler.de> writes:
>
>> Execute the following in the scratch buffer:
>>
>> (setf (aref (symbol-name 'car) 1) ?o)
>>
>> Emacs crashes with a segmentation fault. Is this a well-known issue? I
>> could reproduce the problem on Emacs 27 and 29. Should there be some
>> mechanism to protect the strings of symbols?
>
> Maybe the outcome was something like that we can't protect everybody in
> every case from shooting in the own foot, I don't recall.
Maybe it would be possible to introduce a flag which marks strings as
"frozen"? Then we could ensure that no mutations of such frozen string
happen. Freezing strings (vectors or pairs) may be generally useful
beyond preventing such issues.
Daniel
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Tue, 07 Mar 2023 17:41:01 GMT)
Full text and
rfc822 format available.
Message #17 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 62009 <at> debbugs.gnu.org
> Date: Tue, 7 Mar 2023 18:08:43 +0100
> From: Daniel Mendler <mail <at> daniel-mendler.de>
>
> On 3/7/23 16:45, Michael Heerdegen wrote:
> > Daniel Mendler <mail <at> daniel-mendler.de> writes:
> >
> >> Execute the following in the scratch buffer:
> >>
> >> (setf (aref (symbol-name 'car) 1) ?o)
> >>
> >> Emacs crashes with a segmentation fault. Is this a well-known issue? I
> >> could reproduce the problem on Emacs 27 and 29. Should there be some
> >> mechanism to protect the strings of symbols?
> >
> > Maybe the outcome was something like that we can't protect everybody in
> > every case from shooting in the own foot, I don't recall.
>
> Maybe it would be possible to introduce a flag which marks strings as
> "frozen"? Then we could ensure that no mutations of such frozen string
> happen.
AFAICT, they _are_ frozen. These names are in read-only memory, where
you cannot write. That's why Emacs crashes, AFAIU: the code is trying
to write to protected memory.
Just don't do that, cause it's gonna hurt...
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Thu, 09 Mar 2023 21:12:02 GMT)
Full text and
rfc822 format available.
Message #20 received at 62009 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
>> Cc: Stefan Monnier <monnier <at> iro.umontreal.ca>, 62009 <at> debbugs.gnu.org
>> Date: Tue, 7 Mar 2023 18:08:43 +0100
>> From: Daniel Mendler <mail <at> daniel-mendler.de>
>>
>> On 3/7/23 16:45, Michael Heerdegen wrote:
>> > Daniel Mendler <mail <at> daniel-mendler.de> writes:
>> >
>> >> Execute the following in the scratch buffer:
>> >>
>> >> (setf (aref (symbol-name 'car) 1) ?o)
>> >>
>> >> Emacs crashes with a segmentation fault. Is this a well-known issue? I
>> >> could reproduce the problem on Emacs 27 and 29. Should there be some
>> >> mechanism to protect the strings of symbols?
>> >
>> > Maybe the outcome was something like that we can't protect everybody in
>> > every case from shooting in the own foot, I don't recall.
>>
>> Maybe it would be possible to introduce a flag which marks strings as
>> "frozen"? Then we could ensure that no mutations of such frozen string
>> happen.
>
> AFAICT, they _are_ frozen. These names are in read-only memory, where
> you cannot write. That's why Emacs crashes, AFAIU: the code is trying
> to write to protected memory.
>
> Just don't do that, cause it's gonna hurt...
Is it not possible to detect this before the illegal memory access, and
raise a signal in Emacs Lisp?
--
Philip Kaludercic
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 07:12:02 GMT)
Full text and
rfc822 format available.
Message #23 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> From: Philip Kaludercic <philipk <at> posteo.net>
> Cc: Daniel Mendler <mail <at> daniel-mendler.de>, michael_heerdegen <at> web.de,
> monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org
> Date: Thu, 09 Mar 2023 21:11:13 +0000
>
> > AFAICT, they _are_ frozen. These names are in read-only memory, where
> > you cannot write. That's why Emacs crashes, AFAIU: the code is trying
> > to write to protected memory.
> >
> > Just don't do that, cause it's gonna hurt...
>
> Is it not possible to detect this before the illegal memory access, and
> raise a signal in Emacs Lisp?
It won't be easy, if at all possible. And I'm not sure we even want
to do that. What would be the purpose of supporting such a use of
Emacs?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 08:46:01 GMT)
Full text and
rfc822 format available.
Message #26 received at 62009 <at> debbugs.gnu.org (full text, mbox):
On Fri, 10 Mar 2023 at 09:11, Eli Zaretskii wrote:
>> Is it not possible to detect this before the illegal memory access, and
>> raise a signal in Emacs Lisp?
>
> It won't be easy, if at all possible. And I'm not sure we even want
> to do that. What would be the purpose of supporting such a use of
> Emacs?
What is the purpose of supporting mutation of symbol names in general?
(aset (symbol-name 'find-file) 1 ?o)
(fboundp 'find-file)
=> nil
This one doesn't crash Emacs, but wreaks havoc, maybe in even worse
ways.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 08:48:01 GMT)
Full text and
rfc822 format available.
Message #29 received at 62009 <at> debbugs.gnu.org (full text, mbox):
On Fri, 10 Mar 2023 at 09:45, Augusto Stoffel wrote:
> On Fri, 10 Mar 2023 at 09:11, Eli Zaretskii wrote:
>
>>> Is it not possible to detect this before the illegal memory access, and
>>> raise a signal in Emacs Lisp?
>>
>> It won't be easy, if at all possible. And I'm not sure we even want
>> to do that. What would be the purpose of supporting such a use of
>> Emacs?
>
> What is the purpose of supporting mutation of symbol names in general?
>
> (aset (symbol-name 'find-file) 1 ?o)
> (fboundp 'find-file)
> => nil
>
> This one doesn't crash Emacs, but wreaks havoc, maybe in even worse
> ways.
(To clarify, I think this of course should raise a signal.)
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 09:41:02 GMT)
Full text and
rfc822 format available.
Message #32 received at 62009 <at> debbugs.gnu.org (full text, mbox):
>>> AFAICT, they _are_ frozen. These names are in read-only memory, where
>>> you cannot write. That's why Emacs crashes, AFAIU: the code is trying
>>> to write to protected memory.
>>>
>>> Just don't do that, cause it's gonna hurt...
>>
>> Is it not possible to detect this before the illegal memory access, and
>> raise a signal in Emacs Lisp?
>
> It won't be easy, if at all possible. And I'm not sure we even want to
> do that. What would be the purpose of supporting such a use of Emacs?
>
Instead of raising a signal, I suggest:
diff --git a/src/data.c b/src/data.c
index 0f1d881e00b..76867d6787e 100644
--- a/src/data.c
+++ b/src/data.c
@@ -780,7 +780,7 @@ DEFUN ("symbol-name", Fsymbol_name, Ssymbol_name, 1,
1, 0,
CHECK_SYMBOL (symbol);
name = SYMBOL_NAME (symbol);
- return name;
+ return build_string (SSDATA (name));
}
DEFUN ("bare-symbol", Fbare_symbol, Sbare_symbol, 1, 1, 0,
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 10:32:02 GMT)
Full text and
rfc822 format available.
Message #35 received at 62009 <at> debbugs.gnu.org (full text, mbox):
On 3/10/23 10:40, Gregory Heytings wrote:
> Instead of raising a signal, I suggest:
>
> diff --git a/src/data.c b/src/data.c
> index 0f1d881e00b..76867d6787e 100644
> --- a/src/data.c
> +++ b/src/data.c
> @@ -780,7 +780,7 @@ DEFUN ("symbol-name", Fsymbol_name, Ssymbol_name, 1,
> 1, 0,
>
> CHECK_SYMBOL (symbol);
> name = SYMBOL_NAME (symbol);
> - return name;
> + return build_string (SSDATA (name));
> }
>
> DEFUN ("bare-symbol", Fbare_symbol, Sbare_symbol, 1, 1, 0,
Creating a string is not a good idea since it will lead to an
unacceptably large performance overhead. Raising an exception upon
modification would be the best approach.
Daniel
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 11:00:01 GMT)
Full text and
rfc822 format available.
Message #38 received at 62009 <at> debbugs.gnu.org (full text, mbox):
>
> Creating a string is not a good idea since it will lead to an
> unacceptably large performance overhead.
>
Is "symbol-name" a function that is used in performance-critical code?
And did you actually measure that performance overhead before concluding
that it it "unacceptably large"? According to my measurements, creating a
string from a symbol name costs about 100 CPU cycles.
>
> Raising an exception upon modification would be the best approach.
>
That would also come with a performance overhead, as there is currently no
way to distinguist strings that are used for symbol names from other
strings. Not to mention the added complexity in the code.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 11:11:02 GMT)
Full text and
rfc822 format available.
Message #41 received at 62009 <at> debbugs.gnu.org (full text, mbox):
On 3/10/23 11:59, Gregory Heytings wrote:
> Is "symbol-name" a function that is used in performance-critical code?
> And did you actually measure that performance overhead before concluding
> that it it "unacceptably large"? According to my measurements, creating a
> string from a symbol name costs about 100 CPU cycles.
Yes, of course, for example completion. It would add cost all over the
place in many packages. Also note the added GC pressure. It is not a
good idea to change symbol-name now to allocate strings.
>> Raising an exception upon modification would be the best approach.
>>
>
> That would also come with a performance overhead, as there is currently no
> way to distinguist strings that are used for symbol names from other
> strings. Not to mention the added complexity in the code.
One could check if the string is located in read-only memory. Or one
could add a flag bit to the string data structure (and possibly to other
data structures too). Freezing data structures such that they become
read-only is a generally useful feature. There won't be any performance
overhead of the check since a branch not taken is fast thanks to the
branch predictor.
Daniel
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 11:25:01 GMT)
Full text and
rfc822 format available.
Message #44 received at 62009 <at> debbugs.gnu.org (full text, mbox):
On Fri, 10 Mar 2023 at 12:09, Daniel Mendler wrote:
> On 3/10/23 11:59, Gregory Heytings wrote:
>> That would also come with a performance overhead, as there is currently no
>> way to distinguist strings that are used for symbol names from other
>> strings. Not to mention the added complexity in the code.
>
> One could check if the string is located in read-only memory. Or one
> could add a flag bit to the string data structure (and possibly to other
> data structures too). Freezing data structures such that they become
> read-only is a generally useful feature. There won't be any performance
> overhead of the check since a branch not taken is fast thanks to the
> branch predictor.
Note also that in-place modification of strings is arbitrarily costly,
cf. (aset "ascii" 0 ?😼). Not to mention that it's rarely a good or
necessary move to make, as far as programming style is concerned.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 11:31:02 GMT)
Full text and
rfc822 format available.
Message #47 received at 62009 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 10 Mar 2023 12:09:59 +0100, Daniel Mendler <mail <at> daniel-mendler.de> said:
Daniel> One could check if the string is located in read-only memory. Or one
Daniel> could add a flag bit to the string data structure (and possibly to other
Daniel> data structures too). Freezing data structures such that they become
Daniel> read-only is a generally useful feature. There won't be any performance
Daniel> overhead of the check since a branch not taken is fast thanks to the
Daniel> branch predictor.
We already have such a flag:
/* Number of characters in string; MSB is used as the mark bit. */
ptrdiff_t size;
/* If nonnegative, number of bytes in the string (which is multibyte).
If negative, the string is unibyte:
-1 for data normally allocated
-2 for data in rodata (C string constants)
-3 for data that must be immovable (used for bytecode) */
ptrdiff_t size_byte;
Try this:
diff --git a/src/lisp.h b/src/lisp.h
index 1276285e2f2..80bbb047824 100644
--- a/src/lisp.h
+++ b/src/lisp.h
@@ -1685,6 +1685,8 @@ SREF (Lisp_Object string, ptrdiff_t index)
INLINE void
SSET (Lisp_Object string, ptrdiff_t index, unsigned char new)
{
+ if (XSTRING (string)->u.s.size_byte == -2)
+ Fsignal (Qsetting_constant, string);
SDATA (string)[index] = new;
}
INLINE ptrdiff_t
Robert
--
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 11:37:01 GMT)
Full text and
rfc822 format available.
Message #50 received at 62009 <at> debbugs.gnu.org (full text, mbox):
On 3/10/23 12:30, Robert Pluim wrote:
>>>>>> On Fri, 10 Mar 2023 12:09:59 +0100, Daniel Mendler <mail <at> daniel-mendler.de> said:
>
> Daniel> One could check if the string is located in read-only memory. Or one
> Daniel> could add a flag bit to the string data structure (and possibly to other
> Daniel> data structures too). Freezing data structures such that they become
> Daniel> read-only is a generally useful feature. There won't be any performance
> Daniel> overhead of the check since a branch not taken is fast thanks to the
> Daniel> branch predictor.
>
> We already have such a flag:
>
> /* Number of characters in string; MSB is used as the mark bit. */
> ptrdiff_t size;
> /* If nonnegative, number of bytes in the string (which is multibyte).
> If negative, the string is unibyte:
> -1 for data normally allocated
> -2 for data in rodata (C string constants)
> -3 for data that must be immovable (used for bytecode) */
> ptrdiff_t size_byte;
Thanks! That's good. Given that a read only flag already exists, it is
easy to fix the issue. We just have to make sure that the size is
negative for the symbol names and add a check in `aset`.
Daniel
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 11:50:01 GMT)
Full text and
rfc822 format available.
Message #53 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> From: Augusto Stoffel <arstoffel <at> gmail.com>
> Cc: Philip Kaludercic <philipk <at> posteo.net>, michael_heerdegen <at> web.de,
> mail <at> daniel-mendler.de, monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org
> Date: Fri, 10 Mar 2023 09:45:11 +0100
>
> On Fri, 10 Mar 2023 at 09:11, Eli Zaretskii wrote:
>
> >> Is it not possible to detect this before the illegal memory access, and
> >> raise a signal in Emacs Lisp?
> >
> > It won't be easy, if at all possible. And I'm not sure we even want
> > to do that. What would be the purpose of supporting such a use of
> > Emacs?
>
> What is the purpose of supporting mutation of symbol names in general?
We don't. Whoever does this is on their own.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 11:51:01 GMT)
Full text and
rfc822 format available.
Message #56 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> From: Augusto Stoffel <arstoffel <at> gmail.com>
> Cc: Philip Kaludercic <philipk <at> posteo.net>, michael_heerdegen <at> web.de,
> mail <at> daniel-mendler.de, monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org
> Date: Fri, 10 Mar 2023 09:47:33 +0100
>
> On Fri, 10 Mar 2023 at 09:45, Augusto Stoffel wrote:
>
> > On Fri, 10 Mar 2023 at 09:11, Eli Zaretskii wrote:
> >
> >>> Is it not possible to detect this before the illegal memory access, and
> >>> raise a signal in Emacs Lisp?
> >>
> >> It won't be easy, if at all possible. And I'm not sure we even want
> >> to do that. What would be the purpose of supporting such a use of
> >> Emacs?
> >
> > What is the purpose of supporting mutation of symbol names in general?
> >
> > (aset (symbol-name 'find-file) 1 ?o)
> > (fboundp 'find-file)
> > => nil
> >
> > This one doesn't crash Emacs, but wreaks havoc, maybe in even worse
> > ways.
>
> (To clarify, I think this of course should raise a signal.)
Why bother? Emacs is not in the business of preventing Lisp
programmers from shooting themselves in the foot, certainly not when
that incurs runtime overhead, even a small one.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 11:54:02 GMT)
Full text and
rfc822 format available.
Message #59 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 10 Mar 2023 09:40:31 +0000
> From: Gregory Heytings <gregory <at> heytings.org>
> cc: Philip Kaludercic <philipk <at> posteo.net>, michael_heerdegen <at> web.de,
> mail <at> daniel-mendler.de, monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org
>
>
> Instead of raising a signal, I suggest:
>
> diff --git a/src/data.c b/src/data.c
> index 0f1d881e00b..76867d6787e 100644
> --- a/src/data.c
> +++ b/src/data.c
> @@ -780,7 +780,7 @@ DEFUN ("symbol-name", Fsymbol_name, Ssymbol_name, 1,
> 1, 0,
>
> CHECK_SYMBOL (symbol);
> name = SYMBOL_NAME (symbol);
> - return name;
> + return build_string (SSDATA (name));
> }
>
> DEFUN ("bare-symbol", Fbare_symbol, Sbare_symbol, 1, 1, 0,
No, we will NOT increase GC pressure in Emacs just because someone
could do a silly and nonsensical thing. No way.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 11:58:01 GMT)
Full text and
rfc822 format available.
Message #62 received at 62009 <at> debbugs.gnu.org (full text, mbox):
>
> diff --git a/src/lisp.h b/src/lisp.h
> index 1276285e2f2..80bbb047824 100644
> --- a/src/lisp.h
> +++ b/src/lisp.h
> @@ -1685,6 +1685,8 @@ SREF (Lisp_Object string, ptrdiff_t index)
> INLINE void
> SSET (Lisp_Object string, ptrdiff_t index, unsigned char new)
> {
> + if (XSTRING (string)->u.s.size_byte == -2)
> + Fsignal (Qsetting_constant, string);
> SDATA (string)[index] = new;
> }
> INLINE ptrdiff_t
>
That flag is useful only for the first part of the bug: setting the symbol
name of a function defined in C. It does not prevent changing symbol
names in general, e.g. (aset (symbol-name 'find-file) 1 ?o).
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 12:00:02 GMT)
Full text and
rfc822 format available.
Message #65 received at 62009 <at> debbugs.gnu.org (full text, mbox):
>
> No, we will NOT increase GC pressure in Emacs just because someone could
> do a silly and nonsensical thing. No way.
>
I agree with you that this is not really a bug, so doing nothing is fine
for me, too.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 12:00:03 GMT)
Full text and
rfc822 format available.
Message #68 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 10 Mar 2023 10:59:03 +0000
> From: Gregory Heytings <gregory <at> heytings.org>
> cc: Eli Zaretskii <eliz <at> gnu.org>, Philip Kaludercic <philipk <at> posteo.net>,
> michael_heerdegen <at> web.de, monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org,
> Augusto Stoffel <arstoffel <at> gmail.com>
>
> > Creating a string is not a good idea since it will lead to an
> > unacceptably large performance overhead.
>
> Is "symbol-name" a function that is used in performance-critical code?
Yes, it is. Just grep for it. We even call it from C quite a few
times. And processing is just one aspect of that; memory and GC is
another, not less important.
> And did you actually measure that performance overhead before concluding
> that it it "unacceptably large"?
Anything greater than zero is unacceptably large from where I stand,
when the other side of the balance is the use case against which this
protects.
> > Raising an exception upon modification would be the best approach.
>
> That would also come with a performance overhead, as there is currently no
> way to distinguist strings that are used for symbol names from other
> strings. Not to mention the added complexity in the code.
Which is why we should do neither.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 12:01:02 GMT)
Full text and
rfc822 format available.
Message #71 received at 62009 <at> debbugs.gnu.org (full text, mbox):
On 3/10/23 12:50, Eli Zaretskii wrote:
> Why bother? Emacs is not in the business of preventing Lisp
> programmers from shooting themselves in the foot, certainly not when
> that incurs runtime overhead, even a small one.
Of course Elisp is in the business of preventing programmers from
shooting themselves in the foot, otherwise we would extend Emacs in C.
It is not only that Elisp is easier to write thanks to macros and other
conveniences, but also because it is safer!
I fully agree with you that we should not introduce a performance
regression, in particular not one which increases GC pressure badly.
Furthermore I agree that this is a minor bug which only occurs as an
edge case when some specific strings are mutated.
However the cost of fixing this bug is minor, since Elisp already
supports read-only data stuctures as pointed out by Robert. We may only
need an additional check in aset which won't introduce significant costs
thanks to the branch predictor of CPUs. As Augusto pointed out, aset on
strings can already be very costly because the string must be
reallocated. In comparison, the additional check will be free.
Daniel
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 12:11:02 GMT)
Full text and
rfc822 format available.
Message #74 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> From: Augusto Stoffel <arstoffel <at> gmail.com>
> Cc: Gregory Heytings <gregory <at> heytings.org>, Eli Zaretskii <eliz <at> gnu.org>,
> Philip Kaludercic <philipk <at> posteo.net>, michael_heerdegen <at> web.de,
> monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org
> Date: Fri, 10 Mar 2023 12:23:54 +0100
>
> On Fri, 10 Mar 2023 at 12:09, Daniel Mendler wrote:
>
> > On 3/10/23 11:59, Gregory Heytings wrote:
> >> That would also come with a performance overhead, as there is currently no
> >> way to distinguist strings that are used for symbol names from other
> >> strings. Not to mention the added complexity in the code.
> >
> > One could check if the string is located in read-only memory. Or one
> > could add a flag bit to the string data structure (and possibly to other
> > data structures too). Freezing data structures such that they become
> > read-only is a generally useful feature. There won't be any performance
> > overhead of the check since a branch not taken is fast thanks to the
> > branch predictor.
>
> Note also that in-place modification of strings is arbitrarily costly,
> cf. (aset "ascii" 0 ?😼). Not to mention that it's rarely a good or
> necessary move to make, as far as programming style is concerned.
Be this tru as it may, we will not constrain what Lisp programs can
legitimately do just because we think it is "rarely a good move".
That's against our long-time policy, which is explain why something
might not be a good idea, but otherwise don't block that, letting the
unwise cope with the consequences of their unwise actions.
IOW, we encourage Lisp programmers to DTRT and not use dangerous
practices, but don't block them if they want to.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 12:13:02 GMT)
Full text and
rfc822 format available.
Message #77 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: Gregory Heytings <gregory <at> heytings.org>, Philip Kaludercic
> <philipk <at> posteo.net>, michael_heerdegen <at> web.de,
> monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org, Eli Zaretskii
> <eliz <at> gnu.org>, Augusto Stoffel <arstoffel <at> gmail.com>
> Date: Fri, 10 Mar 2023 12:30:48 +0100
>
> diff --git a/src/lisp.h b/src/lisp.h
> index 1276285e2f2..80bbb047824 100644
> --- a/src/lisp.h
> +++ b/src/lisp.h
> @@ -1685,6 +1685,8 @@ SREF (Lisp_Object string, ptrdiff_t index)
> INLINE void
> SSET (Lisp_Object string, ptrdiff_t index, unsigned char new)
> {
> + if (XSTRING (string)->u.s.size_byte == -2)
> + Fsignal (Qsetting_constant, string);
"Setting constant" is misleading.
And again, why do that at all? It's a waste of cycles, incurred on
_everyone_, for an extremely rare use case that is explicitly
discouraged. We are not the TSA, and should not adopt their policy of
punishing the innocent 99.99% on behalf of a handful of villains.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 12:14:01 GMT)
Full text and
rfc822 format available.
Message #80 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 10 Mar 2023 12:36:17 +0100
> Cc: Gregory Heytings <gregory <at> heytings.org>,
> Philip Kaludercic <philipk <at> posteo.net>, michael_heerdegen <at> web.de,
> monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org, Eli Zaretskii
> <eliz <at> gnu.org>, Augusto Stoffel <arstoffel <at> gmail.com>
> From: Daniel Mendler <mail <at> daniel-mendler.de>
>
> > /* Number of characters in string; MSB is used as the mark bit. */
> > ptrdiff_t size;
> > /* If nonnegative, number of bytes in the string (which is multibyte).
> > If negative, the string is unibyte:
> > -1 for data normally allocated
> > -2 for data in rodata (C string constants)
> > -3 for data that must be immovable (used for bytecode) */
> > ptrdiff_t size_byte;
>
> Thanks! That's good. Given that a read only flag already exists, it is
> easy to fix the issue. We just have to make sure that the size is
> negative for the symbol names and add a check in `aset`.
Let's not do that!
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 12:25:02 GMT)
Full text and
rfc822 format available.
Message #83 received at 62009 <at> debbugs.gnu.org (full text, mbox):
On 3/10/23 13:13, Eli Zaretskii wrote:
>> Date: Fri, 10 Mar 2023 12:36:17 +0100
>> Cc: Gregory Heytings <gregory <at> heytings.org>,
>> Philip Kaludercic <philipk <at> posteo.net>, michael_heerdegen <at> web.de,
>> monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org, Eli Zaretskii
>> <eliz <at> gnu.org>, Augusto Stoffel <arstoffel <at> gmail.com>
>> From: Daniel Mendler <mail <at> daniel-mendler.de>
>>
>>> /* Number of characters in string; MSB is used as the mark bit. */
>>> ptrdiff_t size;
>>> /* If nonnegative, number of bytes in the string (which is multibyte).
>>> If negative, the string is unibyte:
>>> -1 for data normally allocated
>>> -2 for data in rodata (C string constants)
>>> -3 for data that must be immovable (used for bytecode) */
>>> ptrdiff_t size_byte;
>>
>> Thanks! That's good. Given that a read only flag already exists, it is
>> easy to fix the issue. We just have to make sure that the size is
>> negative for the symbol names and add a check in `aset`.
>
> Let's not do that!
Why not? There won't be a performance cost.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 12:36:01 GMT)
Full text and
rfc822 format available.
Message #86 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 10 Mar 2023 13:00:34 +0100
> Cc: philipk <at> posteo.net, michael_heerdegen <at> web.de, monnier <at> iro.umontreal.ca,
> 62009 <at> debbugs.gnu.org, Robert Pluim <rpluim <at> gmail.com>,
> Augusto Stoffel <arstoffel <at> gmail.com>
> From: Daniel Mendler <mail <at> daniel-mendler.de>
>
> On 3/10/23 12:50, Eli Zaretskii wrote:
> > Why bother? Emacs is not in the business of preventing Lisp
> > programmers from shooting themselves in the foot, certainly not when
> > that incurs runtime overhead, even a small one.
>
> Of course Elisp is in the business of preventing programmers from
> shooting themselves in the foot, otherwise we would extend Emacs in C.
We disagree here, and this is a very fundamental disagreement, which
basically means continuing this argument is pointless, since we have
no common basis.
> I fully agree with you that we should not introduce a performance
> regression, in particular not one which increases GC pressure badly.
> Furthermore I agree that this is a minor bug which only occurs as an
> edge case when some specific strings are mutated.
>
> However the cost of fixing this bug is minor
No, it isn't, not in my book.
Sorry, I object to any change to cater for this use case.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 12:46:02 GMT)
Full text and
rfc822 format available.
Message #89 received at 62009 <at> debbugs.gnu.org (full text, mbox):
On 3/10/23 13:35, Eli Zaretskii wrote:
>> Date: Fri, 10 Mar 2023 13:00:34 +0100
>> Cc: philipk <at> posteo.net, michael_heerdegen <at> web.de, monnier <at> iro.umontreal.ca,
>> 62009 <at> debbugs.gnu.org, Robert Pluim <rpluim <at> gmail.com>,
>> Augusto Stoffel <arstoffel <at> gmail.com>
>> From: Daniel Mendler <mail <at> daniel-mendler.de>
>>
>> On 3/10/23 12:50, Eli Zaretskii wrote:
>>> Why bother? Emacs is not in the business of preventing Lisp
>>> programmers from shooting themselves in the foot, certainly not when
>>> that incurs runtime overhead, even a small one.
>>
>> Of course Elisp is in the business of preventing programmers from
>> shooting themselves in the foot, otherwise we would extend Emacs in C.
>
> We disagree here, and this is a very fundamental disagreement, which
> basically means continuing this argument is pointless, since we have
> no common basis.
I don't see that the disagreement is that strong. For example aset
signals an error if you try to access elements out of bounds.
(aset "abc" 3 ?x) -> args-out-of-range
So there are clearly use cases where signaling an error is justified. In
other cases you claim signaling an error is unjustified and a crash is
better. I don't like the crashing. That's the whole disagreement. I
suspect that you also don't like if Emacs crashes. Maybe it doesn't
bother you in this case, but in others.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 12:58:02 GMT)
Full text and
rfc822 format available.
Message #92 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 10 Mar 2023 13:45:11 +0100
> Cc: philipk <at> posteo.net, michael_heerdegen <at> web.de, monnier <at> iro.umontreal.ca,
> 62009 <at> debbugs.gnu.org, rpluim <at> gmail.com, arstoffel <at> gmail.com
> From: Daniel Mendler <mail <at> daniel-mendler.de>
>
> > We disagree here, and this is a very fundamental disagreement, which
> > basically means continuing this argument is pointless, since we have
> > no common basis.
>
> I don't see that the disagreement is that strong. For example aset
> signals an error if you try to access elements out of bounds.
>
> (aset "abc" 3 ?x) -> args-out-of-range
yes, because that's a frequent programmatic mistake.
> So there are clearly use cases where signaling an error is justified.
Of course. It's just that the use case being discussed is not one of
them.
> In other cases you claim signaling an error is unjustified and a
> crash is better.
I said nothing of the kind. I said it was unjustified in the
particular case which is being discussed here.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 13:09:02 GMT)
Full text and
rfc822 format available.
Message #95 received at 62009 <at> debbugs.gnu.org (full text, mbox):
On 3/10/23 13:57, Eli Zaretskii wrote:
>> Date: Fri, 10 Mar 2023 13:45:11 +0100
>> Cc: philipk <at> posteo.net, michael_heerdegen <at> web.de, monnier <at> iro.umontreal.ca,
>> 62009 <at> debbugs.gnu.org, rpluim <at> gmail.com, arstoffel <at> gmail.com
>> From: Daniel Mendler <mail <at> daniel-mendler.de>
>>
>>> We disagree here, and this is a very fundamental disagreement, which
>>> basically means continuing this argument is pointless, since we have
>>> no common basis.
>>
>> I don't see that the disagreement is that strong. For example aset
>> signals an error if you try to access elements out of bounds.
>>
>> (aset "abc" 3 ?x) -> args-out-of-range
>
> yes, because that's a frequent programmatic mistake.
Agree.
>> So there are clearly use cases where signaling an error is justified.
>
> Of course. It's just that the use case being discussed is not one of
> them.
I agree with you, that this is not a common mistake. But it is still a
mistake and we could easily protect the user from it.
This is a general question - do we want to prevent and catch most
programmer mistakes or not? You seem to lean on rather not catching some
errors which are rare and I am in favor of catching more errors if the
costs permit. In this case, the costs are small in my book (I cannot
look into your book and understand how you come to your conclusions).
Given that Robert already pointed out that objects can be marked as
read-only, all the necessary infrastructure is already in place, making
this a cheap change.
Catching more mistakes improves overall robustness of Emacs and
generally there are also security considerations which should be taken
into account. It may not matter in this case, but ensuring that the
Elisp runtime is memory safe as much as possible is a worthy goal.
>> In other cases you claim signaling an error is unjustified and a
>> crash is better.
>
> I said nothing of the kind. I said it was unjustified in the
> particular case which is being discussed here.
You clearly said that you object to any measures which fix this issue.
This means you prefer if Emacs crashes, instead of it signaling an error.
Daniel
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 13:20:01 GMT)
Full text and
rfc822 format available.
Message #98 received at 62009 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Fri, 10 Mar 2023 14:12:25 +0200, Eli Zaretskii <eliz <at> gnu.org> said:
>> From: Robert Pluim <rpluim <at> gmail.com>
>> Cc: Gregory Heytings <gregory <at> heytings.org>, Philip Kaludercic
>> <philipk <at> posteo.net>, michael_heerdegen <at> web.de,
>> monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org, Eli Zaretskii
>> <eliz <at> gnu.org>, Augusto Stoffel <arstoffel <at> gmail.com>
>> Date: Fri, 10 Mar 2023 12:30:48 +0100
>>
>> diff --git a/src/lisp.h b/src/lisp.h
>> index 1276285e2f2..80bbb047824 100644
>> --- a/src/lisp.h
>> +++ b/src/lisp.h
>> @@ -1685,6 +1685,8 @@ SREF (Lisp_Object string, ptrdiff_t index)
>> INLINE void
>> SSET (Lisp_Object string, ptrdiff_t index, unsigned char new)
>> {
>> + if (XSTRING (string)->u.s.size_byte == -2)
>> + Fsignal (Qsetting_constant, string);
Eli> "Setting constant" is misleading.
True. I was lazy and picked the first one I found.
Eli> And again, why do that at all? It's a waste of cycles, incurred on
Eli> _everyone_, for an extremely rare use case that is explicitly
Eli> discouraged. We are not the TSA, and should not adopt their policy of
Eli> punishing the innocent 99.99% on behalf of a handful of villains.
I wasnʼt seriously proposing it for inclusion, just pointing out that
it was possible.
Robert
--
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 15:04:02 GMT)
Full text and
rfc822 format available.
Message #101 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> Date: Fri, 10 Mar 2023 14:08:44 +0100
> Cc: philipk <at> posteo.net, michael_heerdegen <at> web.de, monnier <at> iro.umontreal.ca,
> 62009 <at> debbugs.gnu.org, rpluim <at> gmail.com, arstoffel <at> gmail.com
> From: Daniel Mendler <mail <at> daniel-mendler.de>
>
> >> (aset "abc" 3 ?x) -> args-out-of-range
> >
> > yes, because that's a frequent programmatic mistake.
>
> Agree.
>
> >> So there are clearly use cases where signaling an error is justified.
> >
> > Of course. It's just that the use case being discussed is not one of
> > them.
>
> I agree with you, that this is not a common mistake. But it is still a
> mistake and we could easily protect the user from it.
I don't agree with "easily". And I think "not common" is an
understatement.
> This is a general question - do we want to prevent and catch most
> programmer mistakes or not?
Depends on the mistakes and the price.
> You seem to lean on rather not catching some errors which are rare
> and I am in favor of catching more errors if the costs permit.
We disagree about the importance of the mistake, and we disagree about
the costs of handling it.
In addition to the runtime costs, in terms of CPU and memory/GC,
there's also the aspect of introducing into Emacs a feature that we
will have to support for the observable future. What if we want at
some point to change how these strings are store, and that change
makes these mistakes even harder to support? We are taking upon
ourselves an obligation that we could regret, for the benefit of silly
code that shouldn't be written in the first place. That kind of
trade-off makes no sense.
> Catching more mistakes improves overall robustness of Emacs and
> generally there are also security considerations which should be taken
> into account. It may not matter in this case, but ensuring that the
> Elisp runtime is memory safe as much as possible is a worthy goal.
The disagreement is not about principles, it's about this particular
case.
> >> In other cases you claim signaling an error is unjustified and a
> >> crash is better.
> >
> > I said nothing of the kind. I said it was unjustified in the
> > particular case which is being discussed here.
>
> You clearly said that you object to any measures which fix this issue.
> This means you prefer if Emacs crashes, instead of it signaling an error.
That's your conclusion, not something I said. What I did say is that
the nature and the rarity of the issue do not justify the costs.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 18:57:01 GMT)
Full text and
rfc822 format available.
Message #104 received at 62009 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Philip Kaludercic <philipk <at> posteo.net>
>> Cc: Daniel Mendler <mail <at> daniel-mendler.de>, michael_heerdegen <at> web.de,
>> monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org
>> Date: Thu, 09 Mar 2023 21:11:13 +0000
>>
>> > AFAICT, they _are_ frozen. These names are in read-only memory, where
>> > you cannot write. That's why Emacs crashes, AFAIU: the code is trying
>> > to write to protected memory.
>> >
>> > Just don't do that, cause it's gonna hurt...
>>
>> Is it not possible to detect this before the illegal memory access, and
>> raise a signal in Emacs Lisp?
>
> It won't be easy, if at all possible. And I'm not sure we even want
> to do that. What would be the purpose of supporting such a use of
> Emacs?
The point is not that much that a signal is raised, but that anything
else but segfaulting is done.
Emacs is generally not "safe", but the idea of someone stuffing an
expression like the one mentioned in this bug report into the autoload
of a package just to cause chaos is uncompfortable. Of course, you can
already do
;;;###autoload (delete-directory "~" t)
or
;;;###autoload (fset 'eval 'ignore)
so maybe my point is moot.
--
Philip Kaludercic
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Fri, 10 Mar 2023 22:02:02 GMT)
Full text and
rfc822 format available.
Message #107 received at 62009 <at> debbugs.gnu.org (full text, mbox):
On 10/03/2023 14:24, Daniel Mendler wrote:
> On 3/10/23 13:13, Eli Zaretskii wrote:
>>> Date: Fri, 10 Mar 2023 12:36:17 +0100
>>> Cc: Gregory Heytings<gregory <at> heytings.org>,
>>> Philip Kaludercic<philipk <at> posteo.net>,michael_heerdegen <at> web.de,
>>> monnier <at> iro.umontreal.ca,62009 <at> debbugs.gnu.org, Eli Zaretskii
>>> <eliz <at> gnu.org>, Augusto Stoffel<arstoffel <at> gmail.com>
>>> From: Daniel Mendler<mail <at> daniel-mendler.de>
>>>
>>>> /* Number of characters in string; MSB is used as the mark bit. */
>>>> ptrdiff_t size;
>>>> /* If nonnegative, number of bytes in the string (which is multibyte).
>>>> If negative, the string is unibyte:
>>>> -1 for data normally allocated
>>>> -2 for data in rodata (C string constants)
>>>> -3 for data that must be immovable (used for bytecode) */
>>>> ptrdiff_t size_byte;
>>> Thanks! That's good. Given that a read only flag already exists, it is
>>> easy to fix the issue. We just have to make sure that the size is
>>> negative for the symbol names and add a check in `aset`.
>> Let's not do that!
> Why not? There won't be a performance cost.
Perhaps we could use some exact benchmark results.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Sat, 11 Mar 2023 07:08:01 GMT)
Full text and
rfc822 format available.
Message #110 received at 62009 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
>> Date: Fri, 10 Mar 2023 09:40:31 +0000
>> From: Gregory Heytings <gregory <at> heytings.org>
>> cc: Philip Kaludercic <philipk <at> posteo.net>, michael_heerdegen <at> web.de,
>> mail <at> daniel-mendler.de, monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org
>>
>>
>> Instead of raising a signal, I suggest:
>>
>> diff --git a/src/data.c b/src/data.c
>> index 0f1d881e00b..76867d6787e 100644
>> --- a/src/data.c
>> +++ b/src/data.c
>> @@ -780,7 +780,7 @@ DEFUN ("symbol-name", Fsymbol_name, Ssymbol_name, 1,
>> 1, 0,
>>
>> CHECK_SYMBOL (symbol);
>> name = SYMBOL_NAME (symbol);
>> - return name;
>> + return build_string (SSDATA (name));
>> }
>>
>> DEFUN ("bare-symbol", Fbare_symbol, Sbare_symbol, 1, 1, 0,
>
> No, we will NOT increase GC pressure in Emacs just because someone
> could do a silly and nonsensical thing. No way.
Can't we make puresize.h check (in addition to whether or not the string
is in pure space) whether or not the string lies in read-only segments
of the executable?
Or maybe put all of string data of DEFSYM'd symbols in pure space, since
Faset already checks that the string is not in pure space.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Sat, 11 Mar 2023 07:09:02 GMT)
Full text and
rfc822 format available.
Message #113 received at 62009 <at> debbugs.gnu.org (full text, mbox):
Robert Pluim <rpluim <at> gmail.com> writes:
>>>>>> On Fri, 10 Mar 2023 12:09:59 +0100, Daniel Mendler <mail <at> daniel-mendler.de> said:
>
> Daniel> One could check if the string is located in read-only memory. Or one
> Daniel> could add a flag bit to the string data structure (and possibly to other
> Daniel> data structures too). Freezing data structures such that they become
> Daniel> read-only is a generally useful feature. There won't be any performance
> Daniel> overhead of the check since a branch not taken is fast thanks to the
> Daniel> branch predictor.
>
> We already have such a flag:
>
> /* Number of characters in string; MSB is used as the mark bit. */
> ptrdiff_t size;
> /* If nonnegative, number of bytes in the string (which is multibyte).
> If negative, the string is unibyte:
> -1 for data normally allocated
> -2 for data in rodata (C string constants)
> -3 for data that must be immovable (used for bytecode) */
> ptrdiff_t size_byte;
>
> Try this:
>
> diff --git a/src/lisp.h b/src/lisp.h
> index 1276285e2f2..80bbb047824 100644
> --- a/src/lisp.h
> +++ b/src/lisp.h
> @@ -1685,6 +1685,8 @@ SREF (Lisp_Object string, ptrdiff_t index)
> INLINE void
> SSET (Lisp_Object string, ptrdiff_t index, unsigned char new)
> {
> + if (XSTRING (string)->u.s.size_byte == -2)
> + Fsignal (Qsetting_constant, string);
> SDATA (string)[index] = new;
> }
> INLINE ptrdiff_t
>
> Robert
Why does this have to be in SSET and not in PURE_P?
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Sat, 11 Mar 2023 07:48:02 GMT)
Full text and
rfc822 format available.
Message #116 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: Gregory Heytings <gregory <at> heytings.org>, michael_heerdegen <at> web.de,
> mail <at> daniel-mendler.de, philipk <at> posteo.net, monnier <at> iro.umontreal.ca,
> 62009 <at> debbugs.gnu.org
> Date: Sat, 11 Mar 2023 15:07:05 +0800
>
> > No, we will NOT increase GC pressure in Emacs just because someone
> > could do a silly and nonsensical thing. No way.
>
> Can't we make puresize.h check (in addition to whether or not the string
> is in pure space) whether or not the string lies in read-only segments
> of the executable?
>
> Or maybe put all of string data of DEFSYM'd symbols in pure space, since
> Faset already checks that the string is not in pure space.
Let me remind us all that we intend to toss pure space soon. So let's
not build any new features on what pure space means and does,
certainly not for such marginal use cases. We don't want to take upon
ourselves any jobs that might prove difficult to keep doing when some
of the underlying infrastructure changes, because it would be
ridiculous to have this little tail wag the Emacs dog when the time
comes.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Sat, 11 Mar 2023 15:17:02 GMT)
Full text and
rfc822 format available.
Message #119 received at 62009 <at> debbugs.gnu.org (full text, mbox):
>
> Sorry, I object to any change to cater for this use case.
>
IMO a reasonable change here would be to update the docstring of
symbol-name, which only says "Return SYMBOL's name, a string.", with a
warning similar to the one in the manual:
Warning: Changing the string by substituting characters does change the
name of the symbol, but fails to update the obarray, so don't do it!
Perhaps we could also explicitly mention, in the docstring and/or in the
manual, that doing that can also crash Emacs.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Sat, 11 Mar 2023 15:38:02 GMT)
Full text and
rfc822 format available.
Message #122 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 11 Mar 2023 15:16:36 +0000
> From: Gregory Heytings <gregory <at> heytings.org>
> cc: Daniel Mendler <mail <at> daniel-mendler.de>, philipk <at> posteo.net,
> michael_heerdegen <at> web.de, rpluim <at> gmail.com, monnier <at> iro.umontreal.ca,
> 62009 <at> debbugs.gnu.org, arstoffel <at> gmail.com
>
>
> >
> > Sorry, I object to any change to cater for this use case.
> >
>
> IMO a reasonable change here would be to update the docstring of
> symbol-name, which only says "Return SYMBOL's name, a string.", with a
> warning similar to the one in the manual:
>
> Warning: Changing the string by substituting characters does change the
> name of the symbol, but fails to update the obarray, so don't do it!
That is okay, of course. My objection was to the code changes.
> Perhaps we could also explicitly mention, in the docstring and/or in the
> manual, that doing that can also crash Emacs.
I think being a bit vague here and talking about "dangers" should be
good enough, since not every such code will crash, and probably also
not on every platform and in every build configuration.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Mon, 13 Mar 2023 08:08:02 GMT)
Full text and
rfc822 format available.
Message #125 received at 62009 <at> debbugs.gnu.org (full text, mbox):
>>>>> On Sat, 11 Mar 2023 15:07:56 +0800, Po Lu <luangruo <at> yahoo.com> said:
Po Lu> Why does this have to be in SSET and not in PURE_P?
That would work as well, although I thought purespace was going away?
Robert
--
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Mon, 13 Mar 2023 08:29:02 GMT)
Full text and
rfc822 format available.
Message #128 received at 62009 <at> debbugs.gnu.org (full text, mbox):
Robert Pluim <rpluim <at> gmail.com> writes:
>>>>>> On Sat, 11 Mar 2023 15:07:56 +0800, Po Lu <luangruo <at> yahoo.com> said:
>
> Po Lu> Why does this have to be in SSET and not in PURE_P?
>
> That would work as well, although I thought purespace was going away?
>
> Robert
Once pure space goes away, we can change the call to PURE_P in Faset to
only check if the string is read-only.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Mon, 13 Mar 2023 11:47:01 GMT)
Full text and
rfc822 format available.
Message #131 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Philip Kaludercic
> <philipk <at> posteo.net>, michael_heerdegen <at> web.de, Gregory Heytings
> <gregory <at> heytings.org>, monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org,
> Eli Zaretskii <eliz <at> gnu.org>, Augusto Stoffel <arstoffel <at> gmail.com>
> Date: Mon, 13 Mar 2023 09:07:20 +0100
>
> >>>>> On Sat, 11 Mar 2023 15:07:56 +0800, Po Lu <luangruo <at> yahoo.com> said:
>
> Po Lu> Why does this have to be in SSET and not in PURE_P?
>
> That would work as well, although I thought purespace was going away?
At some point, yes. So we shouldn't build any new features or
improvements on aspects that will disappear when that happens.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Mon, 13 Mar 2023 11:48:02 GMT)
Full text and
rfc822 format available.
Message #134 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> From: Po Lu <luangruo <at> yahoo.com>
> Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Philip Kaludercic
> <philipk <at> posteo.net>, michael_heerdegen <at> web.de, Gregory Heytings
> <gregory <at> heytings.org>, monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org,
> Eli Zaretskii <eliz <at> gnu.org>, Augusto Stoffel <arstoffel <at> gmail.com>
> Date: Mon, 13 Mar 2023 16:28:10 +0800
>
> Robert Pluim <rpluim <at> gmail.com> writes:
>
> >>>>>> On Sat, 11 Mar 2023 15:07:56 +0800, Po Lu <luangruo <at> yahoo.com> said:
> >
> > Po Lu> Why does this have to be in SSET and not in PURE_P?
> >
> > That would work as well, although I thought purespace was going away?
> >
> > Robert
>
> Once pure space goes away, we can change the call to PURE_P in Faset to
> only check if the string is read-only.
There's no reason to believe someone will remember that. Just don't
add anything that will require changes when we remove pure space
(except if it's specific to the unexec build).
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Mon, 13 Mar 2023 11:52:01 GMT)
Full text and
rfc822 format available.
Message #137 received at 62009 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Po Lu <luangruo <at> yahoo.com>
>> Cc: Daniel Mendler <mail <at> daniel-mendler.de>, Philip Kaludercic
>> <philipk <at> posteo.net>, michael_heerdegen <at> web.de, Gregory Heytings
>> <gregory <at> heytings.org>, monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org,
>> Eli Zaretskii <eliz <at> gnu.org>, Augusto Stoffel <arstoffel <at> gmail.com>
>> Date: Mon, 13 Mar 2023 16:28:10 +0800
>>
>> Robert Pluim <rpluim <at> gmail.com> writes:
>>
>> >>>>>> On Sat, 11 Mar 2023 15:07:56 +0800, Po Lu <luangruo <at> yahoo.com> said:
>> >
>> > Po Lu> Why does this have to be in SSET and not in PURE_P?
>> >
>> > That would work as well, although I thought purespace was going away?
>> >
>> > Robert
>>
>> Once pure space goes away, we can change the call to PURE_P in Faset to
>> only check if the string is read-only.
>
> There's no reason to believe someone will remember that. Just don't
> add anything that will require changes when we remove pure space
> (except if it's specific to the unexec build).
OK, will keep that in mind.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Sat, 18 Mar 2023 22:47:02 GMT)
Full text and
rfc822 format available.
Message #140 received at 62009 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
>
> I think being a bit vague here and talking about "dangers" should be
> good enough, since not every such code will crash, and probably also not
> on every platform and in every build configuration.
>
Is the attached patch okay?
[Improve-warning-about-changing-the-string-returned-b.patch (text/x-diff, attachment)]
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Sun, 19 Mar 2023 06:05:01 GMT)
Full text and
rfc822 format available.
Message #143 received at 62009 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 18 Mar 2023 22:46:35 +0000
> From: Gregory Heytings <gregory <at> heytings.org>
> cc: philipk <at> posteo.net, michael_heerdegen <at> web.de, mail <at> daniel-mendler.de,
> rpluim <at> gmail.com, monnier <at> iro.umontreal.ca, 62009 <at> debbugs.gnu.org,
> arstoffel <at> gmail.com
>
> > I think being a bit vague here and talking about "dangers" should be
> > good enough, since not every such code will crash, and probably also not
> > on every platform and in every build configuration.
>
> Is the attached patch okay?
Yes, thanks.
Information forwarded
to
bug-gnu-emacs <at> gnu.org
:
bug#62009
; Package
emacs
.
(Sun, 19 Mar 2023 21:21:01 GMT)
Full text and
rfc822 format available.
Message #146 received at 62009 <at> debbugs.gnu.org (full text, mbox):
>> Is the attached patch okay?
>
> Yes, thanks.
>
Thanks, pushed (b7f0333355).
This bug report was last modified 2 years and 88 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.