Package: emacs;
Reported by: Pip Cet <pipcet <at> protonmail.com>
Date: Fri, 3 Jan 2025 14:31:02 UTC
Severity: normal
To reply to this bug, email your comments to 75318 AT debbugs.gnu.org.
Toggle the display of automated, internal messages from the tracker.
View this report as an mbox folder, status mbox, maintainer mbox
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Fri, 03 Jan 2025 14:31:02 GMT) Full text and rfc822 format available.Pip Cet <pipcet <at> protonmail.com>
:bug-gnu-emacs <at> gnu.org
.
(Fri, 03 Jan 2025 14:31:03 GMT) Full text and rfc822 format available.Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: bug-gnu-emacs <at> gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca> Subject: Lisp_Subr header and layout Date: Fri, 03 Jan 2025 14:29:58 +0000
Lisp_Subr::command_modes aren't marked by GC; the vectorlike_header of Lisp_Subr is often incorrect; Lisp_Subr fields don't follow the usual order This is not an acute bug, but it's problematic code which might cause a bug if something changes. struct Lisp_Subr contains a "Lisp_Object command_modes" field, whether or not Emacs is built with native-comp support. It's only marked by GC if Emacs is built with native-comp support. Putting arbitrary values in there would cause crashes. Right now, command_modes is always Qnil in non-native-comp builds, so this isn't a problem as nil doesn't need to be marked explicitly by GC; it does, however, waste some space (5 KB on 32-bit machines). I think making the existence of the command_modes field depend on nativecomp would be the best solution; if we want to use a command_modes field for non-nativecomp builds, in the future, we have to change the GC marking code and the pdumper code, but I really don't see the point of allocating and GC-marking a field that we know is always nil on the off chance it might be useful in the future. After this change, struct Lisp_Subr contains no Lisp_Object fields, and if we want to add one (most likely because ::doc should be a Lisp_Object rather than an EMACS_INT) again, we should consider making it an ordinary PVEC, with the Lisp_Object fields first and the POD C data at the end. This would eventually allow us to remove the special-casing of subrs in the GC code, but this would also require us to use the expected pvec header word value for built-in subrs; right now, we don't. I'll prepare patches for those two changes (remove command_modes; correct pvec headers for subrs); not the third (reorder Lisp_Subr fields) because it's a nativecomp change, and there may be a small performance impact. Pip
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Tue, 21 Jan 2025 13:06:01 GMT) Full text and rfc822 format available.Message #8 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: 75318 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca Subject: Re: bug#75318: Lisp_Subr header and layout Date: Tue, 21 Jan 2025 13:04:48 +0000
"Pip Cet via \"Bug reports for GNU Emacs, the Swiss army knife of text editors\"" <bug-gnu-emacs <at> gnu.org> writes: > I'll prepare patches for those two changes (remove command_modes; > correct pvec headers for subrs); not the third (reorder Lisp_Subr > fields) because it's a nativecomp change, and there may be a small > performance impact. The first patch is a simple bug fix. Is this okay? PVECHEADERSIZE uses "lispsize" and "restsize", but allocate_pseudovector uses "memlen" and "lisplen". That's confusing. Someone was confused, the header value was wrong but it never caused any trouble. Still, it's good to fix this, and it caused trouble for feature/igc. Ideally, we would fix the confusing situation. My suggestion is to always pass all three sizes. Please let me know if that's desirable. From 08fd00deb5d4d7ea73d2959577d2b4ab9855f1c2 Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet <at> protonmail.com> Subject: [PATCH 1/2] Correct PVEC header for main_thread (bug#75318) * src/thread.c (main_thread): Pass correct restsize to PVECHEADERSIZE. --- src/thread.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/thread.c b/src/thread.c index 5610f8be0dd..5539fae15ab 100644 --- a/src/thread.c +++ b/src/thread.c @@ -50,7 +50,9 @@ static_assert (GCALIGNED (union aligned_thread_state)); .header.size = PVECHEADERSIZE (PVEC_THREAD, PSEUDOVECSIZE (struct thread_state, event_object), - VECSIZE (struct thread_state)), + VECSIZE (struct thread_state) - + PSEUDOVECSIZE (struct thread_state, + event_object)), .m_last_thing_searched = LISPSYM_INITIALLY (Qnil), .m_saved_last_thing_searched = LISPSYM_INITIALLY (Qnil), .name = LISPSYM_INITIALLY (Qnil), -- 2.47.1
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Tue, 21 Jan 2025 23:12:02 GMT) Full text and rfc822 format available.Message #11 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: 75318 <at> debbugs.gnu.org Cc: Po Lu <luangruo <at> yahoo.com>, monnier <at> iro.umontreal.ca Subject: Re: bug#75318: Lisp_Subr header and layout Date: Tue, 21 Jan 2025 23:11:13 +0000
"Pip Cet via \"Bug reports for GNU Emacs, the Swiss army knife of text editors\"" <bug-gnu-emacs <at> gnu.org> writes: > I'll prepare patches for those two changes (remove command_modes; > correct pvec headers for subrs); not the third (reorder Lisp_Subr > fields) because it's a nativecomp change, and there may be a small > performance impact. The second patch fixes Lisp_Subr initialization: the old code set the header to an incorrect value first (PVEC_SUBR was in the type field, but the PSEUDOVECTOR_FLAG was not set, making it look like a huge ordinary vector). Then defsubr wrote to the Lisp_Subr structure to fix that, by calling XSETPVECTYPE: this didn't affect the type field, but did set the PSEUDOVECTOR_FLAG. The size fields always indicated a size of zero. But three static Lisp_Subr structures weren't defined via defsubr, and did set the PSEUDOVECTOR_FLAG initially, but otherwise contained verbatim copies of the DEFUN macro's initialization code. Ideally, we'd turn these three into plain DEFUNs, with docstrings indicating they're internal functions. I'd also like to turn the underscores into ordinary dashes in the Lisp function names if we do that. If we don't, we might at least want to move them to the .subrs section, or move them out of function scope (function-scope static variables aren't easy to debug with gdb). This patch splits up the macros so they're reusable, initializes using LISPSYM_INITIALLY rather than lisp_h_Qnil, ensures that all Lisp_Object fields are initialized, uses explicit field-name initializers, and uses c-mode default indentation (this is incorrect in one place, but it's hardly noticeable). This change alone is not sufficient to make the .subrs section read-only. See store_function_docstring and pdumper. Like the first one, this second patch fixes things that appear to be unintended, but that never caused any bugs on the master branch. The oddities did cause problems when playing with feature/igc and other GC stuff, though. From a67f72e44f5cc9b4af48f0ac3503d3286b0c91a0 Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet <at> protonmail.com> Subject: [PATCH 2/2] Fix Lisp_Subr initialization peculiarities (bug#75318) * src/lisp.h (SUBR_HEADER, SUBR_INITIALIZER) (SUBR_NATIVECOMP_INITIALIZER): New macros. (DEFUN): Use them. Initialize the header directly to a correct value. * src/alloc.c (Swatch_gc_cons_threshold, Swatch_gc_cons_percentage): * src/sfntfont.c (Scompare_font_entities): Use SUBR_INITIALIZER. * src/lread.c (defsubr): Don't call XSETPVECTYPE to initialize pseudovector flag. --- src/alloc.c | 10 ++++------ src/lisp.h | 29 +++++++++++++++++++++++++---- src/lread.c | 1 - src/sfntfont.c | 10 +++------- 4 files changed, 32 insertions(+), 18 deletions(-) diff --git a/src/alloc.c b/src/alloc.c index c4e2ff52015..3425f913362 100644 --- a/src/alloc.c +++ b/src/alloc.c @@ -8208,16 +8208,14 @@ syms_of_alloc (void) Lisp_Object watcher; static union Aligned_Lisp_Subr Swatch_gc_cons_threshold = - {{{ PSEUDOVECTOR_FLAG | (PVEC_SUBR << PSEUDOVECTOR_AREA_BITS) }, - { .a4 = watch_gc_cons_threshold }, - 4, 4, "watch_gc_cons_threshold", {0}, lisp_h_Qnil}}; + SUBR_INITIALIZER ("watch_gc_cons_threshold", watch_gc_cons_threshold, + 4, 4, NULL); XSETSUBR (watcher, &Swatch_gc_cons_threshold.s); Fadd_variable_watcher (Qgc_cons_threshold, watcher); static union Aligned_Lisp_Subr Swatch_gc_cons_percentage = - {{{ PSEUDOVECTOR_FLAG | (PVEC_SUBR << PSEUDOVECTOR_AREA_BITS) }, - { .a4 = watch_gc_cons_percentage }, - 4, 4, "watch_gc_cons_percentage", {0}, lisp_h_Qnil}}; + SUBR_INITIALIZER ("watch_gc_cons_percentage", watch_gc_cons_percentage, + 4, 4, NULL); XSETSUBR (watcher, &Swatch_gc_cons_percentage.s); Fadd_variable_watcher (Qgc_cons_percentage, watcher); DEFSYM (Qalloc, "alloc"); diff --git a/src/lisp.h b/src/lisp.h index 8b870119315..e9c975b81e8 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -3481,15 +3481,36 @@ CHECK_SUBR (Lisp_Object x) A null string means call interactively with no arguments. `doc' is documentation for the user. */ +#define SUBR_HEADER { .size = \ + PVECHEADERSIZE (PVEC_SUBR, 0, VECSIZE (union Aligned_Lisp_Subr)) } + +#ifdef HAVE_NATIVE_COMP +#define SUBR_NATIVECOMP_INITIALIZER \ + .native_comp_u = LISPSYM_INITIALLY (Qnil), \ + .lambda_list = LISPSYM_INITIALLY (Qnil), \ + .type = LISPSYM_INITIALLY (Qnil), +#else +#define SUBR_NATIVECOMP_INITIALIZER +#endif + +#define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ + {{ .header = SUBR_HEADER, \ + .function = { .a ## maxargs = fnname }, \ + .min_args = minargs, \ + .max_args = maxargs, \ + .symbol_name = lname, \ + .intspec = { .string = intspec_value }, \ + .command_modes = LISPSYM_INITIALLY (Qnil), \ + SUBR_NATIVECOMP_INITIALIZER \ + }} + /* This version of DEFUN declares a function prototype with the right arguments, so we can catch errors with maxargs at compile-time. */ #define DEFUN(lname, fnname, sname, minargs, maxargs, intspec, doc) \ SUBR_SECTION_ATTRIBUTE \ static union Aligned_Lisp_Subr sname = \ - {{{ PVEC_SUBR << PSEUDOVECTOR_AREA_BITS }, \ - { .a ## maxargs = fnname }, \ - minargs, maxargs, lname, {intspec}, lisp_h_Qnil}}; \ - Lisp_Object fnname + SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec); \ + Lisp_Object fnname /* defsubr (Sname); is how we define the symbol for function `name' at start-up time. */ diff --git a/src/lread.c b/src/lread.c index e5d1ba0e6bb..279d384eed6 100644 --- a/src/lread.c +++ b/src/lread.c @@ -5475,7 +5475,6 @@ defsubr (union Aligned_Lisp_Subr *aname) struct Lisp_Subr *sname = &aname->s; Lisp_Object sym, tem; sym = intern_c_string (sname->symbol_name); - XSETPVECTYPE (sname, PVEC_SUBR); XSETSUBR (tem, sname); set_symbol_function (sym, tem); #ifdef HAVE_NATIVE_COMP diff --git a/src/sfntfont.c b/src/sfntfont.c index 5752ff81c1c..b7a8b7c2671 100644 --- a/src/sfntfont.c +++ b/src/sfntfont.c @@ -1985,13 +1985,9 @@ sfntfont_compare_font_entities (Lisp_Object a, Lisp_Object b) fields are set within the first than in the second. */ static union Aligned_Lisp_Subr Scompare_font_entities = - { - { - { PSEUDOVECTOR_FLAG | (PVEC_SUBR << PSEUDOVECTOR_AREA_BITS), }, - { .a2 = sfntfont_compare_font_entities, }, - 2, 2, "sfntfont_compare_font_entities", {0}, lisp_h_Qnil, - }, - }; + SUBR_INITIALIZER ("sfntfont_compare_font_entities", + sfntfont_compare_font_entities, + 2, 2, NULL); /* Return a list of font-entities matching the specified FONT_SPEC. */ -- 2.47.1
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Tue, 21 Jan 2025 23:48:02 GMT) Full text and rfc822 format available.Message #14 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: 75318 <at> debbugs.gnu.org Cc: monnier <at> iro.umontreal.ca Subject: Re: bug#75318: Lisp_Subr header and layout Date: Tue, 21 Jan 2025 23:46:45 +0000
Pip Cet <pipcet <at> protonmail.com> writes: > "Pip Cet via \"Bug reports for GNU Emacs, the Swiss army knife of text editors\"" <bug-gnu-emacs <at> gnu.org> writes: > >> I'll prepare patches for those two changes (remove command_modes; >> correct pvec headers for subrs); not the third (reorder Lisp_Subr >> fields) because it's a nativecomp change, and there may be a small >> performance impact. > > The second patch fixes Lisp_Subr initialization: the old code set the This third, small patch moves the command_modes field into the HAVE_NATIVE_COMP #ifdef. It adjusts Fcommand_modes to return Qnil for builtin subrs in the !HAVE_NATIVE_COMP case, and adjusts pdumper, fixing a latent bug: it used dump_field_emacs_ptr instead of dump_field_lv. As that might have been tricky to debug, I briefly confirmed that no similar bugs remain by compiling with this and --enable-check-lisp-object-type: #define CHECK_POINTER(p) ((0) ? (void)(*(p)) : (void) (0)) #define dump_field_emacs_ptr(ctx, out, in_start, in_field) (CHECK_POINTER(*(in_field)), dump_field_emacs_ptr(ctx, out, in_start, in_field)) (similarly for other functions using a void * which itself must point to a pointer, not a Lisp_Object). This change isn't about saving memory; on my system, it doesn't, since GCC aligns all Lisp Subrs to 32-byte boundaries on my system. From 7b1384a7e71f783f3946e7aaad092e8095386387 Mon Sep 17 00:00:00 2001 From: Pip Cet <pipcet <at> protonmail.com> Subject: [PATCH] Remove unused command_modes from Lisp_Subr structure (bug#75318) This field isn't marked by the garbage collector in !HAVE_NATIVE_COMP builds, so using it in those would have led to very tricky bugs. * src/data.c (Fcommand_modes) [!HAVE_NATIVE_COMP]: Return Qnil for built-in subrs. * src/lisp.h (struct Lisp_Subr) [!HAVE_NATIVE_COMP]: (SUBR_NATIVECOMP_INITIALIZER): (SUBR_INITIALIZER): Remove command_modes field. * src/pdumper.c (dump_subr): Don't dump command_modes field unless necessary. When it is necessary, use dump_field_lv, not dump_field_emacs_ptr. --- src/data.c | 2 ++ src/lisp.h | 4 ++-- src/pdumper.c | 3 +-- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/src/data.c b/src/data.c index dcaa5756ebe..bb7238065f5 100644 --- a/src/data.c +++ b/src/data.c @@ -1234,7 +1234,9 @@ DEFUN ("command-modes", Fcommand_modes, Scommand_modes, 1, 1, 0, if (SUBRP (fun)) { +#ifdef HAVE_NATIVE_COMP return XSUBR (fun)->command_modes; +#endif } else if (CLOSUREP (fun)) { diff --git a/src/lisp.h b/src/lisp.h index 9c31630cc1a..46e940dad09 100644 --- a/src/lisp.h +++ b/src/lisp.h @@ -2226,12 +2226,12 @@ CHAR_TABLE_SET (Lisp_Object ct, int idx, Lisp_Object val) const char *string; Lisp_Object native; } intspec; - Lisp_Object command_modes; /* Positive values: offset into etc/DOC. Negative values: one's complement of index into the native comp unit's vector of documentation strings. */ EMACS_INT doc; #ifdef HAVE_NATIVE_COMP + Lisp_Object command_modes; Lisp_Object native_comp_u; char *native_c_name; Lisp_Object lambda_list; @@ -3486,6 +3486,7 @@ #define SUBR_HEADER { .size = \ #ifdef HAVE_NATIVE_COMP #define SUBR_NATIVECOMP_INITIALIZER \ + .command_modes = LISPSYM_INITIALLY (Qnil), \ .native_comp_u = LISPSYM_INITIALLY (Qnil), \ .lambda_list = LISPSYM_INITIALLY (Qnil), \ .type = LISPSYM_INITIALLY (Qnil), @@ -3500,7 +3501,6 @@ #define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ .max_args = maxargs, \ .symbol_name = lname, \ .intspec = { .string = intspec_value }, \ - .command_modes = LISPSYM_INITIALLY (Qnil), \ SUBR_NATIVECOMP_INITIALIZER \ }} diff --git a/src/pdumper.c b/src/pdumper.c index e83c7bcf9a1..64afbdd3065 100644 --- a/src/pdumper.c +++ b/src/pdumper.c @@ -2992,16 +2992,15 @@ dump_subr (struct dump_context *ctx, const struct Lisp_Subr *subr) COLD_OP_NATIVE_SUBR, make_lisp_ptr ((void *) subr, Lisp_Vectorlike)); dump_field_lv (ctx, &out, subr, &subr->intspec.native, WEIGHT_NORMAL); - dump_field_lv (ctx, &out, subr, &subr->command_modes, WEIGHT_NORMAL); } else { dump_field_emacs_ptr (ctx, &out, subr, &subr->symbol_name); dump_field_emacs_ptr (ctx, &out, subr, &subr->intspec.string); - dump_field_emacs_ptr (ctx, &out, subr, &subr->command_modes); } DUMP_FIELD_COPY (&out, subr, doc); #ifdef HAVE_NATIVE_COMP + dump_field_lv (ctx, &out, subr, &subr->command_modes, WEIGHT_NORMAL); dump_field_lv (ctx, &out, subr, &subr->native_comp_u, WEIGHT_NORMAL); if (!NILP (subr->native_comp_u)) dump_field_fixup_later (ctx, &out, subr, &subr->native_c_name); -- 2.47.1
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Wed, 22 Jan 2025 12:54:01 GMT) Full text and rfc822 format available.Message #17 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Stefan Monnier <monnier <at> iro.umontreal.ca> To: Pip Cet <pipcet <at> protonmail.com> Cc: Po Lu <luangruo <at> yahoo.com>, 75318 <at> debbugs.gnu.org Subject: Re: bug#75318: Lisp_Subr header and layout Date: Wed, 22 Jan 2025 07:53:31 -0500
> Ideally, we'd turn these three into plain DEFUNs, with docstrings > indicating they're internal functions. I'd also like to turn the > underscores into ordinary dashes in the Lisp function names if we > do that. +1 Does anyone know why they're not plain DEFUNs? Stefan
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Wed, 22 Jan 2025 13:19:01 GMT) Full text and rfc822 format available.Message #20 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: luangruo <at> yahoo.com, monnier <at> iro.umontreal.ca, 75318 <at> debbugs.gnu.org Subject: Re: bug#75318: Lisp_Subr header and layout Date: Wed, 22 Jan 2025 15:17:54 +0200
> Cc: Po Lu <luangruo <at> yahoo.com>, monnier <at> iro.umontreal.ca > Date: Tue, 21 Jan 2025 23:11:13 +0000 > From: Pip Cet via "Bug reports for GNU Emacs, > the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> > > This patch splits up the macros so they're reusable, initializes using > LISPSYM_INITIALLY rather than lisp_h_Qnil, ensures that all Lisp_Object > fields are initialized, uses explicit field-name initializers, and uses > c-mode default indentation (this is incorrect in one place, but it's > hardly noticeable). While I realize that this is a battle lost long ago, I cannot resist lamenting the once-simple way many lisp.h macros and inline functions were defined, which is all but gone now, replaced by the never-ending rabbit hole of macros that call macros that call macros that call... Add to this the fact that each macro/inline function has several different definitions, whose conditions are sometimes so complicated that the only reasonable way to quickly understand what definition a particular build uses is to fire up good-old GDB and evaluate the macros there, and you get the picture. For example, in the below: > static union Aligned_Lisp_Subr Swatch_gc_cons_percentage = > - {{{ PSEUDOVECTOR_FLAG | (PVEC_SUBR << PSEUDOVECTOR_AREA_BITS) }, > - { .a4 = watch_gc_cons_percentage }, > - 4, 4, "watch_gc_cons_percentage", {0}, lisp_h_Qnil}}; what is lisp_h_Qnil is relatively easy to understand, whereas the replacement, viz.: > +#define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ > + {{ .header = SUBR_HEADER, \ > + .function = { .a ## maxargs = fnname }, \ > + .min_args = minargs, \ > + .max_args = maxargs, \ > + .symbol_name = lname, \ > + .intspec = { .string = intspec_value }, \ > + .command_modes = LISPSYM_INITIALLY (Qnil), \ requires one to look up LISPSYM_INITIALLY, which is #define LISPSYM_INITIALLY(name) \ TAG_PTR_INITIALLY (Lisp_Symbol, (intptr_t) ((i##name) * sizeof *lispsym)) and then you need to go down the rabbit hole: TAG_PTR_INITIALLY => LISP_INITIALLY => LISP_WORD_TAG => VALBITS Wonderful, right? (No, I don't expect you to change anything in your patch, just realize the plight.) > +#ifdef HAVE_NATIVE_COMP > +#define SUBR_NATIVECOMP_INITIALIZER \ > + .native_comp_u = LISPSYM_INITIALLY (Qnil), \ > + .lambda_list = LISPSYM_INITIALLY (Qnil), \ > + .type = LISPSYM_INITIALLY (Qnil), > +#else > +#define SUBR_NATIVECOMP_INITIALIZER > +#endif > + > +#define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ > + {{ .header = SUBR_HEADER, \ > + .function = { .a ## maxargs = fnname }, \ > + .min_args = minargs, \ > + .max_args = maxargs, \ > + .symbol_name = lname, \ > + .intspec = { .string = intspec_value }, \ > + .command_modes = LISPSYM_INITIALLY (Qnil), \ > + SUBR_NATIVECOMP_INITIALIZER \ > + }} > + To make the depth of the proverbial rabbit hole at least to some extent smaller here, could we please refrain from defining SUBR_NATIVECOMP_INITIALIZER, and instead include the necessary stuff directly in SUBR_INITIALIZER, with an #ifdef? Or do you envision enough other users of SUBR_NATIVECOMP_INITIALIZER to justify yet another macro? Thanks.
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Wed, 22 Jan 2025 13:32:02 GMT) Full text and rfc822 format available.Message #23 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com>, Andrea Corallo <acorallo <at> gnu.org> Cc: monnier <at> iro.umontreal.ca, 75318 <at> debbugs.gnu.org Subject: Re: bug#75318: Lisp_Subr header and layout Date: Wed, 22 Jan 2025 15:31:00 +0200
> Cc: monnier <at> iro.umontreal.ca > Date: Tue, 21 Jan 2025 23:46:45 +0000 > From: Pip Cet via "Bug reports for GNU Emacs, > the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> > > diff --git a/src/lisp.h b/src/lisp.h > index 9c31630cc1a..46e940dad09 100644 > --- a/src/lisp.h > +++ b/src/lisp.h > @@ -2226,12 +2226,12 @@ CHAR_TABLE_SET (Lisp_Object ct, int idx, Lisp_Object val) > const char *string; > Lisp_Object native; > } intspec; > - Lisp_Object command_modes; > /* Positive values: offset into etc/DOC. Negative values: one's > complement of index into the native comp unit's vector of > documentation strings. */ > EMACS_INT doc; > #ifdef HAVE_NATIVE_COMP > + Lisp_Object command_modes; > Lisp_Object native_comp_u; > char *native_c_name; > Lisp_Object lambda_list; IME, having different definitions of data structures under different build conditions is a problematic solution: it makes it harder to make safe changes that don't break some build. Many/most people don't try to build all of the configurations Emacs supports when they make changes, so it is easy to write code which references a struct member that doesn't exist in some build configuration, and thus break it, and miss the fact that build is broken by not trying it. (It is unrealistic to expect everyone who has write access to the repository to test-build all the possible configurations Emacs supports as a prerequisite for installing a change.) So: do we really need to remove command_modes from the struct in a build without native compilation? What bad things will happen if we leave it there? If we would like to be a bit more efficient by avoiding a few function calls, like here: > --- a/src/pdumper.c > +++ b/src/pdumper.c > @@ -2992,16 +2992,15 @@ dump_subr (struct dump_context *ctx, const struct Lisp_Subr *subr) > COLD_OP_NATIVE_SUBR, > make_lisp_ptr ((void *) subr, Lisp_Vectorlike)); > dump_field_lv (ctx, &out, subr, &subr->intspec.native, WEIGHT_NORMAL); > - dump_field_lv (ctx, &out, subr, &subr->command_modes, WEIGHT_NORMAL); > } > else > { > dump_field_emacs_ptr (ctx, &out, subr, &subr->symbol_name); > dump_field_emacs_ptr (ctx, &out, subr, &subr->intspec.string); > - dump_field_emacs_ptr (ctx, &out, subr, &subr->command_modes); > } then it should be okay to #ifdef away those calls when !HAVE_NATIVE_COMP, as that doesn't run the risks mentioned above. I also would like to hear from Andrea about this patch. Thanks.
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Wed, 22 Jan 2025 14:26:02 GMT) Full text and rfc822 format available.Message #26 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: luangruo <at> yahoo.com, monnier <at> iro.umontreal.ca, 75318 <at> debbugs.gnu.org Subject: Re: bug#75318: Lisp_Subr header and layout Date: Wed, 22 Jan 2025 14:25:22 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Cc: Po Lu <luangruo <at> yahoo.com>, monnier <at> iro.umontreal.ca >> Date: Tue, 21 Jan 2025 23:11:13 +0000 >> From: Pip Cet via "Bug reports for GNU Emacs, >> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> >> >> This patch splits up the macros so they're reusable, initializes using >> LISPSYM_INITIALLY rather than lisp_h_Qnil, ensures that all Lisp_Object >> fields are initialized, uses explicit field-name initializers, and uses >> c-mode default indentation (this is incorrect in one place, but it's >> hardly noticeable). > > While I realize that this is a battle lost long ago, I cannot resist > lamenting the once-simple way many lisp.h macros and inline functions > were defined, which is all but gone now, replaced by the never-ending > rabbit hole of macros that call macros that call macros that call... > Add to this the fact that each macro/inline function has several > different definitions, whose conditions are sometimes so complicated > that the only reasonable way to quickly understand what definition a > particular build uses is to fire up good-old GDB and evaluate the > macros there, and you get the picture. I do. I agree completely. I don't know how to fix it. XIL is necessary for CHECK_LISP_OBJECT_TYPE (and there seems to be a consensus we can't unconditionally enable that one yet), but I don't see why we can't make an exception for Qnil and make the simple definition of lisp_h_Qnil apply to this specific symbol in all circumstances: #define Qnil XIL(0) If we used that instead, it would make LISPSYM_INITIALLY slightly harder to find because fewer places would use it. > For example, in the below: > >> static union Aligned_Lisp_Subr Swatch_gc_cons_percentage = >> - {{{ PSEUDOVECTOR_FLAG | (PVEC_SUBR << PSEUDOVECTOR_AREA_BITS) }, >> - { .a4 = watch_gc_cons_percentage }, >> - 4, 4, "watch_gc_cons_percentage", {0}, lisp_h_Qnil}}; > > what is lisp_h_Qnil is relatively easy to understand, whereas the > replacement, viz.: >> +#define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ >> + {{ .header = SUBR_HEADER, \ >> + .function = { .a ## maxargs = fnname }, \ >> + .min_args = minargs, \ >> + .max_args = maxargs, \ >> + .symbol_name = lname, \ >> + .intspec = { .string = intspec_value }, \ >> + .command_modes = LISPSYM_INITIALLY (Qnil), \ > > requires one to look up LISPSYM_INITIALLY, which is If you're saying that LISPSYM_INITIALLY is bad, I agree completely. If you're saying a single LISPSYM_INITIALLY in DEFUN, which is well-documented to the point that users do not have to read the definition, is worse than three of them in C files, I don't. > #define LISPSYM_INITIALLY(name) \ > TAG_PTR_INITIALLY (Lisp_Symbol, (intptr_t) ((i##name) * sizeof *lispsym)) > > and then you need to go down the rabbit hole: > > TAG_PTR_INITIALLY > => LISP_INITIALLY > => LISP_WORD_TAG > => VALBITS > > Wonderful, right? I had forgotten how hard it was to get used to this. I think we should definitely avoid making the situation worse, even if that means we can't do fun stuff. > (No, I don't expect you to change anything in your patch, just realize > the plight.) Thank you, in any case. >> +#ifdef HAVE_NATIVE_COMP >> +#define SUBR_NATIVECOMP_INITIALIZER \ >> + .native_comp_u = LISPSYM_INITIALLY (Qnil), \ >> + .lambda_list = LISPSYM_INITIALLY (Qnil), \ >> + .type = LISPSYM_INITIALLY (Qnil), >> +#else >> +#define SUBR_NATIVECOMP_INITIALIZER >> +#endif >> + >> +#define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ >> + {{ .header = SUBR_HEADER, \ >> + .function = { .a ## maxargs = fnname }, \ >> + .min_args = minargs, \ >> + .max_args = maxargs, \ >> + .symbol_name = lname, \ >> + .intspec = { .string = intspec_value }, \ >> + .command_modes = LISPSYM_INITIALLY (Qnil), \ >> + SUBR_NATIVECOMP_INITIALIZER \ >> + }} >> + > > To make the depth of the proverbial rabbit hole at least to some > extent smaller here, could we please refrain from defining > SUBR_NATIVECOMP_INITIALIZER, and instead include the necessary stuff > directly in SUBR_INITIALIZER, with an #ifdef? Or do you envision > enough other users of SUBR_NATIVECOMP_INITIALIZER to justify yet > another macro? Do you mean #ifdef HAVE_NATIVE_COMP #define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ {{ .header = SUBR_HEADER, \ .function = { .a ## maxargs = fnname }, \ .min_args = minargs, \ .max_args = maxargs, \ .symbol_name = lname, \ .intspec = { .string = intspec_value }, \ .command_modes = LISPSYM_INITIALLY (Qnil), \ .native_comp_u = LISPSYM_INITIALLY (Qnil), \ .lambda_list = LISPSYM_INITIALLY (Qnil), \ .type = LISPSYM_INITIALLY (Qnil), }} #else #define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ {{ .header = SUBR_HEADER, \ .function = { .a ## maxargs = fnname }, \ .min_args = minargs, \ .max_args = maxargs, \ .symbol_name = lname, \ .intspec = { .string = intspec_value }, \ }} #endif (and, I assume, the same thing for SUBR_HEADER?) Avoiding the repetition appears to require a second macro no matter what I do. While *I* realize *that* battle was lost long ago, I still think nativecomp subrs should have had their own PVEC type. It's too late for that now; let's not revisit a discussion that I remember as very fair, and which definitely concluded that we'd share the PVEC type. I'm thinking (very vaguely; no patch expected in 2025) about proposing to reduce all Lisp_Objects in the subr object to a simple plist, though, or a vector if we have to. I think 32 bytes is a very good size for a subr to have, and that's header, plist (Lisp_Object so first), minargs+maxargs+4-byte gap, function pointer union. It's easier to GC-trace an ordinary PVEC with lisplen=1 and restlen=2 without worrying about whether it is a subr, I tihnk. Pip
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Wed, 22 Jan 2025 15:29:02 GMT) Full text and rfc822 format available.Message #29 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: luangruo <at> yahoo.com, monnier <at> iro.umontreal.ca, 75318 <at> debbugs.gnu.org Subject: Re: bug#75318: Lisp_Subr header and layout Date: Wed, 22 Jan 2025 17:27:57 +0200
> Date: Wed, 22 Jan 2025 14:25:22 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: 75318 <at> debbugs.gnu.org, luangruo <at> yahoo.com, monnier <at> iro.umontreal.ca > > > For example, in the below: > > > >> static union Aligned_Lisp_Subr Swatch_gc_cons_percentage = > >> - {{{ PSEUDOVECTOR_FLAG | (PVEC_SUBR << PSEUDOVECTOR_AREA_BITS) }, > >> - { .a4 = watch_gc_cons_percentage }, > >> - 4, 4, "watch_gc_cons_percentage", {0}, lisp_h_Qnil}}; > > > > what is lisp_h_Qnil is relatively easy to understand, whereas the > > replacement, viz.: > > >> +#define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ > >> + {{ .header = SUBR_HEADER, \ > >> + .function = { .a ## maxargs = fnname }, \ > >> + .min_args = minargs, \ > >> + .max_args = maxargs, \ > >> + .symbol_name = lname, \ > >> + .intspec = { .string = intspec_value }, \ > >> + .command_modes = LISPSYM_INITIALLY (Qnil), \ > > > > requires one to look up LISPSYM_INITIALLY, which is > > If you're saying that LISPSYM_INITIALLY is bad, I agree completely. It isn't bad, it's that replacing lisp_h_Qnil with it makes the code harder to understand (without going crazy with M-.). As for LISPSYM_INITIALLY, its name is the main source of confusion: the INITIALLY part all but prevents any attempt to get a clear idea what the macro does, without actually looking up all the way to the basics. Unless on remembers that by heart, of course (I don't). > > To make the depth of the proverbial rabbit hole at least to some > > extent smaller here, could we please refrain from defining > > SUBR_NATIVECOMP_INITIALIZER, and instead include the necessary stuff > > directly in SUBR_INITIALIZER, with an #ifdef? Or do you envision > > enough other users of SUBR_NATIVECOMP_INITIALIZER to justify yet > > another macro? > > Do you mean > > #ifdef HAVE_NATIVE_COMP > #define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ > {{ .header = SUBR_HEADER, \ > .function = { .a ## maxargs = fnname }, \ > .min_args = minargs, \ > .max_args = maxargs, \ > .symbol_name = lname, \ > .intspec = { .string = intspec_value }, \ > .command_modes = LISPSYM_INITIALLY (Qnil), \ > .native_comp_u = LISPSYM_INITIALLY (Qnil), \ > .lambda_list = LISPSYM_INITIALLY (Qnil), \ > .type = LISPSYM_INITIALLY (Qnil), > }} > #else > #define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ > {{ .header = SUBR_HEADER, \ > .function = { .a ## maxargs = fnname }, \ > .min_args = minargs, \ > .max_args = maxargs, \ > .symbol_name = lname, \ > .intspec = { .string = intspec_value }, \ > }} > #endif > > (and, I assume, the same thing for SUBR_HEADER?) No. I meant to use #ifdef inside the macro. But if that is impossible in a C macro, I'd even support making this an inline function. Anyway, this is a minor comment, so if it's hard to do better, let's not go overboard. I'll survive.
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Wed, 22 Jan 2025 15:31:02 GMT) Full text and rfc822 format available.Message #32 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: Andrea Corallo <acorallo <at> gnu.org>, monnier <at> iro.umontreal.ca, 75318 <at> debbugs.gnu.org Subject: Re: bug#75318: Lisp_Subr header and layout Date: Wed, 22 Jan 2025 15:30:24 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Cc: monnier <at> iro.umontreal.ca >> Date: Tue, 21 Jan 2025 23:46:45 +0000 >> From: Pip Cet via "Bug reports for GNU Emacs, >> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> >> >> diff --git a/src/lisp.h b/src/lisp.h >> index 9c31630cc1a..46e940dad09 100644 >> --- a/src/lisp.h >> +++ b/src/lisp.h >> @@ -2226,12 +2226,12 @@ CHAR_TABLE_SET (Lisp_Object ct, int idx, Lisp_Object val) >> const char *string; >> Lisp_Object native; >> } intspec; >> - Lisp_Object command_modes; >> /* Positive values: offset into etc/DOC. Negative values: one's >> complement of index into the native comp unit's vector of >> documentation strings. */ >> EMACS_INT doc; >> #ifdef HAVE_NATIVE_COMP >> + Lisp_Object command_modes; >> Lisp_Object native_comp_u; >> char *native_c_name; >> Lisp_Object lambda_list; > > IME, having different definitions of data structures under different > build conditions is a problematic solution: it makes it harder to make That statement applies to the status quo before the patch just as much as it does to the new status, though, right? Lisp_Subr had three conditional fields before, plus one that's unconditionally compiled in but only usable if one of the other three is non-nil. Now it has four conditional fields, which are unusable if they don't exist because the compiler won't let you. > safe changes that don't break some build. Many/most people don't try This doesn't; it's impossible ta safely use the command_modes field for the reasons I explained. And making it safe without using it at all seems both wasteful and all but impossible: we'd have to test the code somehow, making up usages of a field that hasn't been in real use for a long time. > to build all of the configurations Emacs supports when they make > changes, so it is easy to write code which references a struct member > that doesn't exist in some build configuration, and thus break it, and > miss the fact that build is broken by not trying it. (It is > unrealistic to expect everyone who has write access to the repository > to test-build all the possible configurations Emacs supports as a > prerequisite for installing a change.) I agree, but I don't see how that applies here. Before the patch, if someone had started using nativecomp-only command_modes, we would have had very tricky GC/pdumper bugs which would have taken much longer to resolve than simply removing an unused field. The new patch does still make it possible for someone using only nativecomp builds to accidentally use the field for non-nativecomp subrs. If someone tried to build that tree without nativecomp, they would get a compiler error, which is the best we can do to warn them of this situation, given the constraint that we use the same structure for nativecomp subrs and primitive ones. Of course it would be better if we statically asserted that we're looking at a nativecomp subr before accessing the field, but we've discussed that and concluded sharing a C structure is a constraint which I obeyed in this patch. If you think this field is significant enough to warrant it (it's used once in comp.c, once in data.c, and three times by pdumper.c; two of the pdumper usages can be merged because the buggy one would have to be replaced by the non-buggy one anyway. The third is an assertion), we could add accessors, but it's simply not. > So: do we really need to remove command_modes from the struct in a > build without native compilation? What bad things will happen if we Yes. > leave it there? If we would like to be a bit more efficient by > avoiding a few function calls, like here: People might use it, and run into the various bugs that have accumulated. It's better to remove the unusable field (and there's all that talk about not wasting two words of staticvec, but wasting 1,000 words is suddenly okay?). >> --- a/src/pdumper.c >> +++ b/src/pdumper.c >> @@ -2992,16 +2992,15 @@ dump_subr (struct dump_context *ctx, const struct Lisp_Subr *subr) >> COLD_OP_NATIVE_SUBR, >> make_lisp_ptr ((void *) subr, Lisp_Vectorlike)); >> dump_field_lv (ctx, &out, subr, &subr->intspec.native, WEIGHT_NORMAL); >> - dump_field_lv (ctx, &out, subr, &subr->command_modes, WEIGHT_NORMAL); >> } >> else >> { >> dump_field_emacs_ptr (ctx, &out, subr, &subr->symbol_name); >> dump_field_emacs_ptr (ctx, &out, subr, &subr->intspec.string); >> - dump_field_emacs_ptr (ctx, &out, subr, &subr->command_modes); >> } > > then it should be okay to #ifdef away those calls when > !HAVE_NATIVE_COMP, as that doesn't run the risks mentioned above. That is precisely what I did in the code you cite: the two - lines you show turned into a single + line behind an #ifdef, which is *precisely* what we do for the other three fields which are usable in equivalent circumstances. Adding a separate #ifdef for two lines that should have been identical but weren't, a bug that never became relevant only because the field was unused. (The bug would have been avoided by some macro-based check-this-is-a-non-void-pointer code, see the discussion before). > I also would like to hear from Andrea about this patch. Absolutely, let's. Thanks for your comments. You have convinced me that the .intspec.native union member also shouldn't exist if we're building without nativecomp. As we've seen with Lisp_Subr->doc, using the same field for different purposes is fragile and leads to bugs. I won't revise the patch for now as this minor change doesn't really change things significantly: intspec.native is only set in one place and read in another one, plus GC/pdumper bookkeeping. A comment would suffice, too. Pip
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Wed, 22 Jan 2025 16:09:02 GMT) Full text and rfc822 format available.Message #35 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: acorallo <at> gnu.org, monnier <at> iro.umontreal.ca, 75318 <at> debbugs.gnu.org Subject: Re: bug#75318: Lisp_Subr header and layout Date: Wed, 22 Jan 2025 18:07:43 +0200
> Date: Wed, 22 Jan 2025 15:30:24 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: Andrea Corallo <acorallo <at> gnu.org>, 75318 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca > > "Eli Zaretskii" <eliz <at> gnu.org> writes: > > >> Cc: monnier <at> iro.umontreal.ca > >> Date: Tue, 21 Jan 2025 23:46:45 +0000 > >> From: Pip Cet via "Bug reports for GNU Emacs, > >> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> > >> > >> diff --git a/src/lisp.h b/src/lisp.h > >> index 9c31630cc1a..46e940dad09 100644 > >> --- a/src/lisp.h > >> +++ b/src/lisp.h > >> @@ -2226,12 +2226,12 @@ CHAR_TABLE_SET (Lisp_Object ct, int idx, Lisp_Object val) > >> const char *string; > >> Lisp_Object native; > >> } intspec; > >> - Lisp_Object command_modes; > >> /* Positive values: offset into etc/DOC. Negative values: one's > >> complement of index into the native comp unit's vector of > >> documentation strings. */ > >> EMACS_INT doc; > >> #ifdef HAVE_NATIVE_COMP > >> + Lisp_Object command_modes; > >> Lisp_Object native_comp_u; > >> char *native_c_name; > >> Lisp_Object lambda_list; > > > > IME, having different definitions of data structures under different > > build conditions is a problematic solution: it makes it harder to make > > That statement applies to the status quo before the patch just as much > as it does to the new status, though, right? Could be; I didn't check. It's a general statement. > > safe changes that don't break some build. Many/most people don't try > > This doesn't; it's impossible ta safely use the command_modes field for > the reasons I explained. Can you point me to the explanation, or explain again? > And making it safe without using it at all seems both wasteful and all > but impossible: we'd have to test the code somehow, making up usages of > a field that hasn't been in real use for a long time. I just wanted to avoid breaking a build by referencing the struct member in a build that doesn't have it. There's no need to ensure the member is always used safely. Doesn't initializing it to nil solve this? > > > to build all of the configurations Emacs supports when they make > > changes, so it is easy to write code which references a struct member > > that doesn't exist in some build configuration, and thus break it, and > > miss the fact that build is broken by not trying it. (It is > > unrealistic to expect everyone who has write access to the repository > > to test-build all the possible configurations Emacs supports as a > > prerequisite for installing a change.) > > I agree, but I don't see how that applies here. Before the patch, if > someone had started using nativecomp-only command_modes, we would have > had very tricky GC/pdumper bugs which would have taken much longer to > resolve than simply removing an unused field. Well, both problems exist, they just push us in opposite directions. > The new patch does still make it possible for someone using only > nativecomp builds to accidentally use the field for non-nativecomp > subrs. If someone tried to build that tree without nativecomp, they > would get a compiler error, which is the best we can do to warn them of > this situation, given the constraint that we use the same structure for > nativecomp subrs and primitive ones. OTOH, adding code that references the field without #ifdef, and installing it without trying a non-nativecomp build, will break everybody who builds --without-native-compilation. > People might use it, and run into the various bugs that have > accumulated. It's better to remove the unusable field (and there's all > that talk about not wasting two words of staticvec, but wasting 1,000 > words is suddenly okay?). If this leads to such a significant memory waste, then indeed what I proposed is a non-starter, unfortunately.
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Wed, 22 Jan 2025 16:09:03 GMT) Full text and rfc822 format available.Message #38 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: luangruo <at> yahoo.com, monnier <at> iro.umontreal.ca, 75318 <at> debbugs.gnu.org Subject: Re: bug#75318: Lisp_Subr header and layout Date: Wed, 22 Jan 2025 16:07:52 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Wed, 22 Jan 2025 14:25:22 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: 75318 <at> debbugs.gnu.org, luangruo <at> yahoo.com, monnier <at> iro.umontreal.ca >> >> > For example, in the below: >> > >> >> static union Aligned_Lisp_Subr Swatch_gc_cons_percentage = >> >> - {{{ PSEUDOVECTOR_FLAG | (PVEC_SUBR << PSEUDOVECTOR_AREA_BITS) }, >> >> - { .a4 = watch_gc_cons_percentage }, >> >> - 4, 4, "watch_gc_cons_percentage", {0}, lisp_h_Qnil}}; >> > >> > what is lisp_h_Qnil is relatively easy to understand, whereas the >> > replacement, viz.: >> >> >> +#define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ >> >> + {{ .header = SUBR_HEADER, \ >> >> + .function = { .a ## maxargs = fnname }, \ >> >> + .min_args = minargs, \ >> >> + .max_args = maxargs, \ >> >> + .symbol_name = lname, \ >> >> + .intspec = { .string = intspec_value }, \ >> >> + .command_modes = LISPSYM_INITIALLY (Qnil), \ >> > >> > requires one to look up LISPSYM_INITIALLY, which is >> >> If you're saying that LISPSYM_INITIALLY is bad, I agree completely. > > It isn't bad, it's that replacing lisp_h_Qnil with it makes the code > harder to understand (without going crazy with M-.). > As for LISPSYM_INITIALLY, its name is the main source of confusion: > the INITIALLY part all but prevents any attempt to get a clear idea > what the macro does, without actually looking up all the way to the > basics. Unless on remembers that by heart, of course (I don't). I think that second paragraph means that LISPSYM_INITIALLY is, at best, an unfortunate workaround which may one day be retired. So, IMHO, yes, it's bad in an absolute way, but it may be the best option anyway. (In GDB, with .gdbinit loaded, "p LISPSYM_INITIALLY (Qnil)" prints an unhelpful "{0x0}"; "p lisp_h_Qnil" prints "{0}". ptype thinks the first is a Lisp_Object (but expands the type to its obscure definition), the second claims to be a singleton array of int. While all of these are unhelpful, the last one is dangerous, and we can avoid it by using the unfortunately-named LISPSYM_INITIALLY. However, I think Qnil should be usable in static context. If someone working on a new GC can't do that (I speak from experience)), they most likely cannot implement LISPSYM_INITIALLY at all, and have to resort to some kind of constructor extension; GNU C has one, C++ supports them natively. None of that means we should be forced to keep the name, which is confusing and misleading. >> > To make the depth of the proverbial rabbit hole at least to some >> > extent smaller here, could we please refrain from defining >> > SUBR_NATIVECOMP_INITIALIZER, and instead include the necessary stuff >> > directly in SUBR_INITIALIZER, with an #ifdef? Or do you envision >> > enough other users of SUBR_NATIVECOMP_INITIALIZER to justify yet >> > another macro? >> >> Do you mean >> >> #ifdef HAVE_NATIVE_COMP >> #define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ >> {{ .header = SUBR_HEADER, \ >> .function = { .a ## maxargs = fnname }, \ >> .min_args = minargs, \ >> .max_args = maxargs, \ >> .symbol_name = lname, \ >> .intspec = { .string = intspec_value }, \ >> .command_modes = LISPSYM_INITIALLY (Qnil), \ >> .native_comp_u = LISPSYM_INITIALLY (Qnil), \ >> .lambda_list = LISPSYM_INITIALLY (Qnil), \ >> .type = LISPSYM_INITIALLY (Qnil), >> }} >> #else >> #define SUBR_INITIALIZER(lname, fnname, minargs, maxargs, intspec_value) \ >> {{ .header = SUBR_HEADER, \ >> .function = { .a ## maxargs = fnname }, \ >> .min_args = minargs, \ >> .max_args = maxargs, \ >> .symbol_name = lname, \ >> .intspec = { .string = intspec_value }, \ >> }} >> #endif >> >> (and, I assume, the same thing for SUBR_HEADER?) > > No. I meant to use #ifdef inside the macro. But if that is > impossible in a C macro, I'd even support making this an inline > function. My understanding is you cannot use #ifdef inside a macro in GNU C; there might be a way to use _Pragma in other C versions, but it's horrible. Similarly, as the standard C we're attempting to use doesn't have constexpr and inlining functions is always optional, I'm pretty sure it's still impossible to statically initialize a variable with the result of calling a function, inline or otherwise. Pip
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Wed, 22 Jan 2025 16:39:02 GMT) Full text and rfc822 format available.Message #41 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: acorallo <at> gnu.org, monnier <at> iro.umontreal.ca, 75318 <at> debbugs.gnu.org Subject: Re: bug#75318: Lisp_Subr header and layout Date: Wed, 22 Jan 2025 16:38:26 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Wed, 22 Jan 2025 15:30:24 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: Andrea Corallo <acorallo <at> gnu.org>, 75318 <at> debbugs.gnu.org, monnier <at> iro.umontreal.ca >> >> "Eli Zaretskii" <eliz <at> gnu.org> writes: >> >> >> Cc: monnier <at> iro.umontreal.ca >> >> Date: Tue, 21 Jan 2025 23:46:45 +0000 >> >> From: Pip Cet via "Bug reports for GNU Emacs, >> >> the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> >> >> >> >> diff --git a/src/lisp.h b/src/lisp.h >> >> index 9c31630cc1a..46e940dad09 100644 >> >> --- a/src/lisp.h >> >> +++ b/src/lisp.h >> >> @@ -2226,12 +2226,12 @@ CHAR_TABLE_SET (Lisp_Object ct, int idx, Lisp_Object val) >> >> const char *string; >> >> Lisp_Object native; >> >> } intspec; >> >> - Lisp_Object command_modes; >> >> /* Positive values: offset into etc/DOC. Negative values: one's >> >> complement of index into the native comp unit's vector of >> >> documentation strings. */ >> >> EMACS_INT doc; >> >> #ifdef HAVE_NATIVE_COMP >> >> + Lisp_Object command_modes; >> >> Lisp_Object native_comp_u; >> >> char *native_c_name; >> >> Lisp_Object lambda_list; >> > >> > IME, having different definitions of data structures under different >> > build conditions is a problematic solution: it makes it harder to make >> >> That statement applies to the status quo before the patch just as much >> as it does to the new status, though, right? > > Could be; I didn't check. It's a general statement. > >> > safe changes that don't break some build. Many/most people don't try >> >> This doesn't; it's impossible ta safely use the command_modes field for >> the reasons I explained. > > Can you point me to the explanation, or explain again? GC doesn't trace the field, pdumper doesn't save it; the sole remaining user expects (and insists) on a Qnil value. Iff we wanted to fix the first we'd have to add new users for testing, if we want to fix and test the second we'd have to add pre-dump users, which is harder. Pdumper also asserts that the field isn't used, so it's possible other things will break, because I'm sure that assertion was put there for a reason. >> And making it safe without using it at all seems both wasteful and all >> but impossible: we'd have to test the code somehow, making up usages of >> a field that hasn't been in real use for a long time. > > I just wanted to avoid breaking a build by referencing the struct > member in a build that doesn't have it. There's no need to ensure the > member is always used safely. Doesn't initializing it to nil solve > this? I don't understand that at all, sorry. If anyone uses the field for anything but checking its nil-ness in a non-nativecomp build, they either get a compile-time error or a very frustrating inexplicable crash. Changing from the latter to the former is the best we can do. >> The new patch does still make it possible for someone using only >> nativecomp builds to accidentally use the field for non-nativecomp >> subrs. If someone tried to build that tree without nativecomp, they >> would get a compiler error, which is the best we can do to warn them of >> this situation, given the constraint that we use the same structure for >> nativecomp subrs and primitive ones. > > OTOH, adding code that references the field without #ifdef, and > installing it without trying a non-nativecomp build, will break > everybody who builds --without-native-compilation. Indeed. It'd be nice to statically check more things before we push to master (and, generally, compiling a C program and checking it for errors are two separate things; the same is true for texinfo, it seems). I honestly think we should add a separate "lint" pass (by this or any other name) sooner rather than later, but right now, we don't. The best we can do is CI and breaking the build rather than crashing inexplicably. (Random example: print.c exposes the string "unreadeable-function" (meaning "unreadable-function") to Lisp and expects Lisp to provide it in some contexts. Obviously not something GCC can check for during every build, but if we'd run a weekly typo lint, we could have caught it in time. I've let the resident typo expert know, but this probably requires a bug report and discussion now). But, again, pie-in-the-sky dreams of a better build system aside, this is strictly about compile-time errors vs run-time bugs. >> People might use it, and run into the various bugs that have >> accumulated. It's better to remove the unusable field (and there's all >> that talk about not wasting two words of staticvec, but wasting 1,000 >> words is suddenly okay?). > > If this leads to such a significant memory waste, then indeed what I > proposed is a non-starter, unfortunately. Insignificant for my current system, unfortunately, which aligns static Lisp_Subr to 32-byte boundaries (for good reason). However, as there are many systems with different alignments and sizes, saving a 32-byte line every fourth time we remove a word has the same expected benefit as regaining 8 bytes the second we do so.
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Wed, 22 Jan 2025 16:40:02 GMT) Full text and rfc822 format available.Message #44 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Eli Zaretskii <eliz <at> gnu.org> To: Pip Cet <pipcet <at> protonmail.com> Cc: luangruo <at> yahoo.com, monnier <at> iro.umontreal.ca, 75318 <at> debbugs.gnu.org Subject: Re: bug#75318: Lisp_Subr header and layout Date: Wed, 22 Jan 2025 18:39:12 +0200
> Date: Wed, 22 Jan 2025 16:07:52 +0000 > From: Pip Cet <pipcet <at> protonmail.com> > Cc: 75318 <at> debbugs.gnu.org, luangruo <at> yahoo.com, monnier <at> iro.umontreal.ca > > (In GDB, with .gdbinit loaded, "p LISPSYM_INITIALLY (Qnil)" prints an > unhelpful "{0x0}"; "p lisp_h_Qnil" prints "{0}". ptype thinks the first > is a Lisp_Object (but expands the type to its obscure definition), the > second claims to be a singleton array of int. I guess your build is with check-lisp-objects? Because here I get (gdb) p LISPSYM_INITIALLY (Qnil) $1 = 0 (gdb) p lisp_h_Qnil $3 = 0 (gdb) ptype LISPSYM_INITIALLY(Qnil) type = long long (gdb) ptype lisp_h_Qnil type = int
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Wed, 22 Jan 2025 17:59:01 GMT) Full text and rfc822 format available.Message #47 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Eli Zaretskii <eliz <at> gnu.org> Cc: luangruo <at> yahoo.com, monnier <at> iro.umontreal.ca, 75318 <at> debbugs.gnu.org Subject: Re: bug#75318: Lisp_Subr header and layout Date: Wed, 22 Jan 2025 17:57:48 +0000
"Eli Zaretskii" <eliz <at> gnu.org> writes: >> Date: Wed, 22 Jan 2025 16:07:52 +0000 >> From: Pip Cet <pipcet <at> protonmail.com> >> Cc: 75318 <at> debbugs.gnu.org, luangruo <at> yahoo.com, monnier <at> iro.umontreal.ca >> >> (In GDB, with .gdbinit loaded, "p LISPSYM_INITIALLY (Qnil)" prints an >> unhelpful "{0x0}"; "p lisp_h_Qnil" prints "{0}". ptype thinks the first >> is a Lisp_Object (but expands the type to its obscure definition), the >> second claims to be a singleton array of int. > > I guess your build is with check-lisp-objects? Because here I get Sorry, meant to mention that and forgot. It is. > (gdb) p LISPSYM_INITIALLY (Qnil) > $1 = 0 > (gdb) p lisp_h_Qnil > $3 = 0 > (gdb) ptype LISPSYM_INITIALLY(Qnil) > type = long long > (gdb) ptype lisp_h_Qnil > type = int I guess your build is with wide Emacs ints, as usual :-) I'd also forgotten about "whatis", which splits the difference and tells me the first is a Lisp_Word, not a Lisp_Object. Pip
bug-gnu-emacs <at> gnu.org
:bug#75318
; Package emacs
.
(Thu, 23 Jan 2025 12:00:03 GMT) Full text and rfc822 format available.Message #50 received at 75318 <at> debbugs.gnu.org (full text, mbox):
From: Pip Cet <pipcet <at> protonmail.com> To: Stefan Monnier <monnier <at> iro.umontreal.ca> Cc: Po Lu <luangruo <at> yahoo.com>, 75318 <at> debbugs.gnu.org Subject: Re: bug#75318: Lisp_Subr header and layout Date: Thu, 23 Jan 2025 11:59:04 +0000
"Stefan Monnier" <monnier <at> iro.umontreal.ca> writes: >> Ideally, we'd turn these three into plain DEFUNs, with docstrings >> indicating they're internal functions. I'd also like to turn the >> underscores into ordinary dashes in the Lisp function names if we >> do that. > > +1 > Does anyone know why they're not plain DEFUNs? The sfntfont.c isn't safe to call from Lisp (it assumes it's passed two vectors of the right size). Po Lu, I assume it was originally a normal DEFUN and then modified for performance reasons? Are those reasons significant enough to keep the code as it is, in your opinion, until a better solution than turning it into a safe DEFUN is found? (I must confess I don't understand the code at all: the first results are the ones with the fewest number of fields set? Is the list reversed somewhere else? I'm not saying this code needs more comments, of course. If others can understand it easily, I may just have a blind spot here.) About the alloc.c non-DEFUNs: No idea, couldn't find anything, it looks weird in C-h v gc-cons-threshold. Nothing in the commit (97ffa339b6d67cebcbefbdfaa2880214adab639c) or bug discussion (https://debbugs.gnu.org/cgi/bugreport.cgi?bug=37006). The comment in lisp.h, above Lisp_Subr says: /* This structure describes a built-in function. It is generated by the DEFUN macro only. defsubr makes it into a Lisp object. */ The least we could do is adjust that by expanding the second line. (In the nativecomp case, all three lines need fixing.) The watcher is exposed to Lisp and can be called directly or removed, so that's not it. The underscored name is also printed in C-h v gc-cons-threshold, so hiding it isn't the reason either. (My drive-by proposal in bug#75632 is to improve display of variable watchers, for example by starting with: diff --git a/lisp/help-fns.el b/lisp/help-fns.el index 9324cf85454..f4f595aedbe 100644 --- a/lisp/help-fns.el +++ b/lisp/help-fns.el @@ -1712,7 +1712,9 @@ help-fns--var-watchpoints (when watchpoints (princ " Calls these functions when changed: ") ;; FIXME: Turn function names into hyperlinks. - (princ watchpoints) + (dolist (watchpoint watchpoints) + (princ (help-fns-function-name watchpoint)) + (princ "\n ")) (terpri)))) And then fixing help-fns-function-name so we can remove the FIXME.) No matter whether we change help-fns.el, the name is likely to remain in the output, so the underscores are confusing; and giving it a proper symbol seems like a good idea to me, not just because we can then add the symbol rather than the subr to the watchers list, and that prints nicely. If there aren't any good reasons, I'll send a patch turning all three into DEFUNs and we can discuss that further. Definitely going to push this particular change until there's consensus, though. Pip
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.