GNU bug report logs - #78686
feature/igc: [PATCH] Make charset_table an exact root

Previous Next

Package: emacs;

Reported by: Helmut Eller <eller.helmut <at> gmail.com>

Date: Tue, 3 Jun 2025 19:45:08 UTC

Severity: normal

Tags: patch

Full log


Message #29 received at 78686 <at> debbugs.gnu.org (full text, mbox):

From: Pip Cet <pipcet <at> protonmail.com>
To: Helmut Eller <eller.helmut <at> gmail.com>
Cc: 78686 <at> debbugs.gnu.org, Gerd Möllmann <gerd <at> gnu.org>
Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root
Date: Fri, 06 Jun 2025 07:58:45 +0000
"Helmut Eller" <eller.helmut <at> gmail.com> writes:

> On Thu, Jun 05 2025, Pip Cet wrote:
>
> [...]
>>>> So let's get rid of that unnecessary root.
>>
>> I still think we should do that.  Do you think getting rid of the root
>> would actually make things worse?
>
> Probably not.

Okay.  I'll push that change, then.

> [...]
>>>> 1. undumped Emacs sessions.  I don't think these are worth optimizing
>>>> for.  Do you?
>>>
>>> No. But there is also no need to make them gratuitously different from
>>> normal sessions.
>>
>> I agree, but the decision has been made on the master branch to do just
>> that (static storage pre-pdump; pdump storage post-pdump; xmalloc if
>> grown), and if we want to revisit it we need to fix it on the master
>> branch.
>
> Sounds like the inmates are running the asylum :-)

I don't think mental-health analogies are useful here, to be honest.

I'd like to propose the following patch for *master*:

From 7f2e728623aed6ac04bd129f769b455c27f548dd Mon Sep 17 00:00:00 2001
From: Pip Cet <pipcet <at> protonmail.com>
Subject: [PATCH 1/2] Remove charset_table_init (bug#78686)

In usual post-pdump Emacs sessions, charset_table is set by pdumper to
point to a table of sufficient size contained in the pdump.  Stop
allocating ~59KB of BSS space for such sessions.

* src/charset.c (Fdefine_charset_internal): Grow 'charset_table' to
180 entries right away if this code is ever reached, which it isn't in
ordinary post-dump Emacs sessions.
(charset_table_init): Remove variable.
(syms_of_charset): Initialize 'charset_table' to NULL.
---
 src/charset.c | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/src/charset.c b/src/charset.c
index 3264d75f92e..df669b71e46 100644
--- a/src/charset.c
+++ b/src/charset.c
@@ -1132,7 +1132,7 @@ DEFUN ("define-charset-internal", Fdefine_charset_internal,
 	  int old_size = charset_table_size;
 	  ptrdiff_t new_size = old_size;
 	  struct charset *new_table =
-	    xpalloc (0, &new_size, 1,
+	    xpalloc (0, &new_size, (new_size == 0) ? 180 : 1,
 		     min (INT_MAX, MOST_POSITIVE_FIXNUM),
                      sizeof *charset_table);
           memcpy (new_table, charset_table, old_size * sizeof *new_table);
@@ -2340,18 +2340,6 @@ init_charset_once (void)
   PDUMPER_REMEMBER_SCALAR (charset_ksc5601);
 }
 
-/* Allocate an initial charset table that is large enough to handle
-   Emacs while it is bootstrapping.  As of September 2011, the size
-   needs to be at least 166; make it a bit bigger to allow for future
-   expansion.
-
-   Don't make the value so small that the table is reallocated during
-   bootstrapping, as glibc malloc calls larger than just under 64 KiB
-   during an initial bootstrap wreak havoc after dumping; see the
-   M_MMAP_THRESHOLD value in alloc.c, plus there is an extra overhead
-   internal to glibc malloc and perhaps to Emacs malloc debugging.  */
-static struct charset charset_table_init[180];
-
 void
 syms_of_charset (void)
 {
@@ -2377,8 +2365,10 @@ syms_of_charset (void)
   staticpro (&Vcharset_hash_table);
   Vcharset_hash_table = CALLN (Fmake_hash_table, QCtest, Qeq);
 
-  charset_table = charset_table_init;
-  charset_table_size = ARRAYELTS (charset_table_init);
+  /* charset_table is special-cased in pdumper.c, to make it point into
+     the dump when it is loaded.  */
+  charset_table = NULL;
+  charset_table_size = 0;
   PDUMPER_REMEMBER_SCALAR (charset_table_size);
   charset_table_used = 0;
   PDUMPER_REMEMBER_SCALAR (charset_table_used);
-- 
2.48.1

This fixes the problem of allocating BSS space.  This is the one problem
that I've understood your patch to fix.  I'm not saying that there
aren't other problems it fixes, and I'd like to understand what I'm
missing here, but I cannot think of any.

>>> Exactly: my patch copies the charset table to the place where it
>>> belongs, the space in the BSS section reserved for it.  Just like
>>> builtin symbols and subrs are copied to their space in the BSS section.
>>
>> I don't think those two special cases (lispsym has very good reasons for
>> being special; subrs aren't in the BSS section) warrant introducing a
>> third special case.
>
> So subrs end up in the .subrs section.  Sill, subrs are copied back from
> the dump.

The only reason for that, that I'm aware of, is precisely that they are
initialized, and creating dynamic objects from static ones was
considered less efficient than using the static ones directly (this is
probably not true).  So that reason doesn't apply to charset_table,
which is not initialized.

The reason for lispsym to be static is that we need the address for
Lisp_Object <-> struct Lisp_Symbol * conversion, and we also want iQxxx
to be constant because it's used in initializers.

Note that lispsym and subrs caused a lot of trouble, some of it very
recent, because it's hard to enforce the proper alignment on some
systems (DJGPP).  Allegedly, it's impossible to do so at all on some
other systems, but I don't believe that to be correct; still, we have
the code and pay the maintenance price for it.

In any case, none of these arguments apply to charset_table.  I think it
was static because it made unexec easier.  Unexec's gone, so let's get
rid of it.

>>> I like my patch better.
>>
>> Okay.  Still, the first single-line patch would improve things, right?
>
> Yes.  Still not as good as my approach.

I'm sorry, but can you try once more to point out what the advantage of
your approach is?  I simply do not see it, and I'd like to understand.

Pip





This bug report was last modified 69 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.