GNU bug report logs -
#78686
feature/igc: [PATCH] Make charset_table an exact root
Previous Next
Full log
Message #23 received at 78686 <at> debbugs.gnu.org (full text, mbox):
Pip Cet <pipcet <at> protonmail.com> writes:
> "Helmut Eller" <eller.helmut <at> gmail.com> writes:
>
>> On Wed, Jun 04 2025, Pip Cet wrote:
>>
>>> As far as I can tell, there's just one significant problem: we have a
>>> root covering charset_table_init even though that array is unused in
>>> pdumped Emacs sessions. It's not a huge problem, just 180 zero words
>>> fixed as Lisp objects, but it is unsatisfying and we should fix it.
>>
>> It's more like 60kb of zeroes.
>
> Oops, you're right. All the more reason to remove that root right away.
>
>>> So let's get rid of that unnecessary root.
>
> I still think we should do that. Do you think getting rid of the root
> would actually make things worse? If you don't, I'll push that change
> first.
>
>>> That leaves two very minor problems:
>>>
>>> 1. undumped Emacs sessions. I don't think these are worth optimizing
>>> for. Do you?
>>
>> No. But there is also no need to make them gratuitously different from
>> normal sessions.
>
> I agree, but the decision has been made on the master branch to do just
> that (static storage pre-pdump; pdump storage post-pdump; xmalloc if
> grown), and if we want to revisit it we need to fix it on the master
> branch.
>
>>> 2. Emacs sessions which grow charset_table beyond the 180 entries in the
>>> dump. Does this happen? In any case, we do handle the case correctly,
>>> leaking memory but giving the user access to as many charsets as they
>>> desire.
>>
>> Defining new charsets is presumably an exotic thing to do.
>
> Still, I see no reason it wouldn't have worked before, and it can't
> possibly work after your patch.
>
>>> There is also the long-term goal of making it so all objects in the
>>> pdump section can be relocated elsewhere, allowing us to free the
>>> section. We should do so by tracing objects in the dump precisely so
>>> they can be moved out.
>>
>> As my patch does, in contrast to your idea.
>
> Your patch does not make the charset table movable or protectable.
>
>>>> Second, I don't like the practice of using the GC header to create
>>>> "shadow types".
>>>
>>> Noted. That's a good argument for not introducing new ones, but your
>>> patch doesn't introduce or delete any; it just copies things back from
>>> the dump heap (which is never freed) to an unprotected root in the BSS
>>> section or to xmalloc'd memory.
>>
>> Exactly: my patch copies the charset table to the place where it
>> belongs, the space in the BSS section reserved for it. Just like
>> builtin symbols and subrs are copied to their space in the BSS section.
>
> I don't think those two special cases (lispsym has very good reasons for
> being special; subrs aren't in the BSS section) warrant introducing a
> third special case.
>
>>>> Maybe, depending on how this charset table business goes.
>>>
>>> I don't see the dependency.
>>
>> My motivation may depend on it.
>
> Let me just state that removing roots, particularly ambiguous ones, is a
> very good thing, and you've identified one (the charset table one) where
> it is particularly easy to do so. I just don't understand why you don't
> want to remove it, but make it necessary instead. But the situation
> with kbd_buffer is different: it's currently a large ambiguous root, and
> reducing the amount of time we spend scanning it is a good thing, even
> if it stays a root.
>
> And, again, the charset table situation in general should be improved,
> on the master branch, and such improvements would trickle down to
> feature/igc.
>
>>> As far as I can tell, we need this patch. It removes the extraneous
>>> root.
>>
>> I like my patch better.
>
> Okay. Still, the first single-line patch would improve things, right?
> If so, there's no reason not to push the patch we presumably agree
> improves things (even though we both think it's not ideal) while
> discussing the one we disagree about.
>
> Obviously, just a suggestion. I've CC'd Gerd in case he thinks I'm
> being silly and we should just apply your original patch (in that case,
> I'd try to verify it doesn't introduce new crashes if charset_table is
> grown pre-dump), and fix the bug which I think would introduce such
> crashes.
>
I've read Helmut's patch, and I think I understand the idea, but let me
check: The idea is to get rid of the special handling of charset_table
in the MPS heap after loading a dump. It is currently a pinned object by
having an ambiguous root point to the charset_table object in the MPS
heap. Helmut rather copies it back to where it came from, and the root
that is used for pinning can go. The charset_table is instead an exact
root. Hope that's more or less what happens.
Makes sense to me 🤷. I find it a bit simpler, because the pinning root
is gone, which I find a good thing. One thing less to understand. One
pinned object less.
I did not understand everything in the following mails. I'm probably
missing things that came up/were discussed somewhere else. I'd rather
not sprinkle comments when I don't really know what's going on.
This bug report was last modified 65 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.