From debbugs-submit-bounces@debbugs.gnu.org Tue Jun 03 15:44:23 2025 Received: (at submit) by debbugs.gnu.org; 3 Jun 2025 19:44:23 +0000 Received: from localhost ([127.0.0.1]:41854 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uMXYZ-0002LS-DH for submit@debbugs.gnu.org; Tue, 03 Jun 2025 15:44:22 -0400 Received: from lists.gnu.org ([2001:470:142::17]:58516) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1uMXYS-0002KT-SE for submit@debbugs.gnu.org; Tue, 03 Jun 2025 15:44:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uMXYG-0007E2-9O for bug-gnu-emacs@gnu.org; Tue, 03 Jun 2025 15:44:00 -0400 Received: from mail-wr1-x430.google.com ([2a00:1450:4864:20::430]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1uMXYC-0005SY-UZ for bug-gnu-emacs@gnu.org; Tue, 03 Jun 2025 15:43:59 -0400 Received: by mail-wr1-x430.google.com with SMTP id ffacd0b85a97d-3a361b8a664so5814469f8f.3 for ; Tue, 03 Jun 2025 12:43:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1748979834; x=1749584634; darn=gnu.org; h=mime-version:user-agent:message-id:date:subject:to:from:from:to:cc :subject:date:message-id:reply-to; bh=Id1Kfprj60ypRt//v5QW8hyjlk0VIXyT9qLsAW/rEgw=; b=ZUst/QAy7l7VkN5tauurKDI+ueVssIvVHwz3okS8yl2VBdEN0bFvSvI5HnAl/QQ8AP eRFPDx4pLi+X88/lYmmFmTFsw8lhp59+CJFvyHT79HuciAOsg4YFPE1vSYjGD0OxsM2S Wp7z7DxJTaXq6DiZntqiccQOxO1KSj/4qXX1zyeldFwYU4YRWQSVwLYwuzP6LG58hyi/ fAg1+Xff5jdzaRkci2jaza24E3ryDN1FLBibdwqDmMZApPqnN3tL/LHQd1b4JUFR979e B6NsGRe3k7ubnTi61Jk2A5QgRWkGusCJAtvayAIyfwqNuiKPMa2uykd1L3ABq5F5eW1t hEfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1748979834; x=1749584634; h=mime-version:user-agent:message-id:date:subject:to:from :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Id1Kfprj60ypRt//v5QW8hyjlk0VIXyT9qLsAW/rEgw=; b=wMaM/FGIeHRCxRU6O9V+NUsQTJSjMcGyDjJZghEHa9xJX4NVWQ91pvOauGckecPaIo q1IOUlvHYduy2cMgfmO5YGeujalWYFjXP+RIikLJ08rGuw67/rda5Kveb7AVp2P/O/8u sAA0+gUvsDNaGLgYQJ6crttHUTRJSaTezgwWH63ykql+A1189oqlfZLlVXZfCa4BBFfz mvgpq2XgMCZ2Nms7dpNV2agySoOZhoTLdZ1F8bP10r/lwmeyWXhWuBwstH2IU/ULS4sZ QVsCGs42nQBIUkeNK1+vgXFIySkFr0c67RH9GIBxNojLoM/YQqYlZZp5XBo5/jHFeizj xSWg== X-Gm-Message-State: AOJu0YyukDyn4EsBpG9ngJ3zV58/TO19m1cJWQhDbRmcnntNE4T/AyGa RfnFlrj+lPO6wwz6GERX8jnxmi7zMaJC57SY9LtDIpe58iWDbc5Yeq75q0SqAw== X-Gm-Gg: ASbGnctzkEfwo49hmcqetcKw9i14hnsYpwht8wYytKEaNKtPDub8/vHW7w4sxyqEqn2 klhGm9bVxctqKqUAguHaSAmK2ShdulmmPiTJ5p1QU3oR3IuY1kXphQSOxjm2UIdG3DrO4ELMSKY 8PmRh4LQK5lyyOy+NoZRSZY9PuzgDfb3YhPFDgV5PJUXApmchiR7Ulc2zBywqd0FhLKt0y5OU1K WA6xTpdAmTmeprTjMqVircRLaR5aA3YjYM9WyUhPjvsfnw9a5eM1FkBn3c/luM8u2UDdOXv1S8j /cmuOkwjglqr6fr5hTXb5nR/djjAwHMUaZYrQxD4pyiguBOUyk/M90JiKy8= X-Google-Smtp-Source: AGHT+IHNswQHst+f26a25fmVT9RTVtHVFU0tQjNlplMrwqXa4fYlICL8xd60L2kp8H/R+5F1J3jWMw== X-Received: by 2002:a05:6000:2389:b0:3a4:deb9:8964 with SMTP id ffacd0b85a97d-3a51d91f899mr48273f8f.17.1748979834239; Tue, 03 Jun 2025 12:43:54 -0700 (PDT) Received: from caladan ([89.107.110.32]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-450d7fc281csm179914285e9.40.2025.06.03.12.43.53 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 03 Jun 2025 12:43:53 -0700 (PDT) From: Helmut Eller To: bug-gnu-emacs@gnu.org Subject: feature/igc: [PATCH] Make charset_table an exact root X-Debbugs-Cc: Date: Tue, 03 Jun 2025 21:43:52 +0200 Message-ID: <87r000k3uv.fsf@gmail.com> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Received-SPF: pass client-ip=2a00:1450:4864:20::430; envelope-from=eller.helmut@gmail.com; helo=mail-wr1-x430.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: 1.0 (+) X-Debbugs-Envelope-To: submit X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.0 (/) --=-=-= Content-Type: text/plain Here is a proposal to move charset_table out of the GC heap and make it an exact root area instead. --=-=-= Content-Type: text/x-diff Content-Disposition: attachment; filename=0001-Make-charset_table-an-exact-root.patch >From ded35d23523a18b5c61dd7e7c2de92cb2dafc7ff Mon Sep 17 00:00:00 2001 From: Helmut Eller Date: Tue, 3 Jun 2025 17:30:33 +0200 Subject: [PATCH] Make charset_table an exact root Previously, the charset_table was located in various places: 1. in charset_table_init during bootstrap (ambiguous root) 2. on the GC heap (loaded from the dump; pinned) 3. on the malloc heap when it needed to grow (ambiguous root) Now, the charset table is copied back to charset_table_init after loading the dump and it is scanned as an exact root area. * src/igc.h (igc_root_create_area): New API. (igc_scan_charset_table): New. * src/igc.c (igc_scan_charset_table): Implementation. (root_create_charset_table): Use it. (clone_charset_table, clone_charset): New helpers. (igc_on_pdump_loaded): Use it. (fix_charset_table): Deleted. * src/charset.c (Fdefine_charset_internal): Create/destory root areas when allocating/freeing charset tables. --- src/charset.c | 19 +++++++----- src/igc.c | 84 ++++++++++++++++++++++++++++++--------------------- src/igc.h | 4 +++ 3 files changed, 65 insertions(+), 42 deletions(-) diff --git a/src/charset.c b/src/charset.c index 9cbc899b354..2ee6fa0345c 100644 --- a/src/charset.c +++ b/src/charset.c @@ -1132,17 +1132,20 @@ DEFUN ("define-charset-internal", Fdefine_charset_internal, coding_system.charbuf[i] entries, which are 'int'. */ int old_size = charset_table_size; ptrdiff_t new_size = old_size; - struct charset *new_table = + struct charset *new_table + = xpalloc (0, &new_size, 1, + min (INT_MAX, MOST_POSITIVE_FIXNUM), + sizeof *charset_table); #ifdef HAVE_MPS - igc_xpalloc_ambig -#else - xpalloc + memset (new_table, 0, new_size * sizeof *new_table); + igc_root_create_area (new_table, new_table + new_size, + igc_scan_charset_table, NULL, "charset-table"); #endif - (0, &new_size, 1, - min (INT_MAX, MOST_POSITIVE_FIXNUM), - sizeof *charset_table); memcpy (new_table, charset_table, old_size * sizeof *new_table); - charset_table = new_table; +#ifdef HAVE_MPS + igc_destroy_root_with_start (charset_table); +#endif + charset_table = new_table; charset_table_size = new_size; /* FIXME: This leaks memory, as the old charset_table becomes unreachable. If the old charset table is charset_table_init diff --git a/src/igc.c b/src/igc.c index f63b0fe8c88..67aefb1b273 100644 --- a/src/igc.c +++ b/src/igc.c @@ -1725,6 +1725,20 @@ scan_tty_list (mps_ss_t ss, void *start, void *end, void *closure) return MPS_RES_OK; } +mps_res_t +igc_scan_charset_table (struct igc_ss *s, void *start, void *end, void *closure) +{ + mps_ss_t ss = (mps_ss_t) s; + struct charset *table = start; + MPS_SCAN_BEGIN (ss) + { + for (size_t i = 0, used = charset_table_used; i < used; i++) + IGC_FIX12_OBJ (ss, &table[i].attributes); + } + MPS_SCAN_END (ss); + return MPS_RES_OK; +} + /*********************************************************************** Default pad, fwd, ... ***********************************************************************/ @@ -1941,20 +1955,6 @@ fix_handler (mps_ss_t ss, struct handler *h) return MPS_RES_OK; } -static mps_res_t -fix_charset_table (mps_ss_t ss, struct charset *table, size_t nbytes) -{ - igc_assert (table == charset_table); - igc_assert (nbytes == (charset_table_size * sizeof (struct charset))); - MPS_SCAN_BEGIN (ss) - { - for (size_t i = 0, len = nbytes / sizeof (struct charset); i < len; i++) - IGC_FIX12_OBJ (ss, &table[i].attributes); - } - MPS_SCAN_END (ss); - return MPS_RES_OK; -} - static mps_res_t fix_vector (mps_ss_t ss, struct Lisp_Vector *v); static mps_res_t fix_marker_vector (mps_ss_t ss, struct Lisp_Vector *v); static mps_res_t fix_weak_hash_table_strong_part (mps_ss_t ss, struct Lisp_Weak_Hash_Table_Strong_Part *t); @@ -2025,6 +2025,7 @@ dflt_scan_obj (mps_ss_t ss, mps_addr_t base_start, mps_addr_t base_limit, case IGC_OBJ_DUMPED_BUFFER_TEXT: case IGC_OBJ_DUMPED_BIGNUM_DATA: case IGC_OBJ_DUMPED_BYTES: + case IGC_OBJ_DUMPED_CHARSET_TABLE: /* Can occur in the dump. */ break; @@ -2080,11 +2081,6 @@ dflt_scan_obj (mps_ss_t ss, mps_addr_t base_start, mps_addr_t base_limit, fix_blv); break; - case IGC_OBJ_DUMPED_CHARSET_TABLE: - IGC_FIX_CALL (ss, fix_charset_table (ss, (struct charset *)client, - obj_size (header))); - break; - case IGC_OBJ_WEAK_HASH_TABLE_STRONG_PART: IGC_FIX_CALL_FN (ss, struct Lisp_Weak_Hash_Table_Strong_Part, client, fix_weak_hash_table_strong_part); @@ -2982,8 +2978,8 @@ root_create_bc (struct igc_thread_list *t) static void root_create_charset_table (struct igc *gc) { - root_create_ambig (gc, charset_table_init, - charset_table_init + ARRAYELTS (charset_table_init), + root_create_exact (gc, &charset_table_init, &charset_table_init + 1, + (mps_area_scan_t) igc_scan_charset_table, "charset-table"); } @@ -3069,6 +3065,14 @@ igc_root_create_n (Lisp_Object start[], size_t n) return root_create_exact_n (start, n); } +void +igc_root_create_area (void *start, void *end, igc_scan_area_t f, void *closure, + const char *label) +{ + root_create (global_igc, start, end, mps_rank_exact (), (mps_area_scan_t) f, + closure, false, label); +} + static igc_root_list * root_find (void *start) { @@ -5150,7 +5154,26 @@ check_dump (mps_addr_t start, mps_addr_t end) return true; } -static mps_addr_t pinned_objects_in_dump[3]; +static void +clone_charset (struct charset *src, struct charset *dst) +{ + *dst = *src; + if (src->code_space_mask) + { + size_t nbytes = 256; + dst->code_space_mask = xzalloc (nbytes); + memcpy (dst->code_space_mask, src->code_space_mask, nbytes); + } +} + +static void +clone_charset_table (size_t used, size_t size, struct charset src[size], + struct charset dst[size]) +{ + igc_assert (global_igc->park_count > 0); + for (size_t i = 0; i < used; i++) + clone_charset (&src[i], &dst[i]); +} /* Called from pdumper_load. [HOT_START, HOT_END) is the hot section of the dump. [COL_START, COLD_END) is the cold section of the @@ -5190,20 +5213,13 @@ igc_on_pdump_loaded (void *dump_base, void *hot_start, void *hot_end, set_header (hot_end, IGC_OBJ_PAD, discardable_size, 0); /* Ignore relocs */ set_header (heap_end, IGC_OBJ_PAD, relocs_size, 0); + /* Copy charset_table out of the GC heap */ + igc_assert (ARRAYELTS (charset_table_init) == charset_table_size); + clone_charset_table (charset_table_used, charset_table_size, + charset_table, charset_table_init); + charset_table = charset_table_init; eassert (check_dump (h, cold_end)); - /* Pin some stuff in the dump */ - mps_addr_t pinned_roots[] = { - charset_table, - cold_start, /* code_space_masks */ - cold_user_data_start, - }; - static_assert (sizeof pinned_roots == sizeof pinned_objects_in_dump); - memcpy (pinned_objects_in_dump, pinned_roots, sizeof pinned_roots); - igc_root_create_ambig (pinned_objects_in_dump, - (uint8_t *) pinned_objects_in_dump - + sizeof pinned_objects_in_dump, - "dump-pins"); } void * diff --git a/src/igc.h b/src/igc.h index 5e24c7c338a..7275eaf937f 100644 --- a/src/igc.h +++ b/src/igc.h @@ -152,7 +152,11 @@ #define EMACS_IGC_H void igc_root_destroy_comp_unit (struct Lisp_Native_Comp_Unit *u); void igc_root_destroy_comp_unit_eph (struct Lisp_Native_Comp_Unit *u); void *igc_root_create_n (Lisp_Object start[], size_t n); +void igc_root_create_area (void *start, void *end, igc_scan_area_t f, + void *closure, const char *label); void igc_destroy_root_with_start (void *start); +igc_scan_result_t igc_scan_charset_table (struct igc_ss *, void *, void *, + void *); size_t igc_header_size (void); char *igc_dump_finish_obj (void *client, enum igc_obj_type type, char *base, char *end); -- 2.39.5 --=-=-=-- From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 04 09:10:28 2025 Received: (at 78686) by debbugs.gnu.org; 4 Jun 2025 13:10:28 +0000 Received: from localhost ([127.0.0.1]:48737 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uMnsy-0001Py-2K for submit@debbugs.gnu.org; Wed, 04 Jun 2025 09:10:28 -0400 Received: from mail-10630.protonmail.ch ([79.135.106.30]:18085) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1uMnsu-0001P5-7V for 78686@debbugs.gnu.org; Wed, 04 Jun 2025 09:10:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com; s=protonmail3; t=1749042616; x=1749301816; bh=NEtdUDez1PpvWrNQWu+dXY2eNnhLRhd784sXWjRJ1kA=; h=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References: Feedback-ID:From:To:Cc:Date:Subject:Reply-To:Feedback-ID: Message-ID:BIMI-Selector:List-Unsubscribe:List-Unsubscribe-Post; b=UTAMjxyrxkyJ+zt0GSKpkh54OGmwP7/mSC/TVyBO2bSpp1DkbLQXxZUxo8gXLmsms YUyfl88/tlcEHyJqWobWZKZcgUXOX2wQIrLyU/M9cqy6UcbNYlE1eawk6SXR1X5ByL /F0sqHafLJdL5xq55YDkn6zhYElUKMU94OY5ph0d0tyRiB5vYRyh1N4dmGFcATAIuY 6oxziU3yaq4DBV8PqSJ5Bo1L7aTpd1uW+iXVI5crukJZXkIszz1BCEdwR23jvCjmpv VpZV69WvwZAgdmYMO6K+0pzIitXP2uRIYFMfdbuA9uNbLwN18SXgpgEh3FyeI+80yk 0CU5qcFihEm5w== Date: Wed, 04 Jun 2025 13:10:12 +0000 To: Helmut Eller From: Pip Cet Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root Message-ID: <87bjr3ejpr.fsf@protonmail.com> In-Reply-To: <87r000k3uv.fsf@gmail.com> References: <87r000k3uv.fsf@gmail.com> Feedback-ID: 112775352:user:proton X-Pm-Message-ID: ade9e6814df905f440c8c6d3f1332e4f1a0b9154 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 78686 Cc: 78686@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) "Helmut Eller" writes: > Here is a proposal to move charset_table out of the GC heap and make it > an exact root area instead. > Previously, the charset_table was located in various places: > 1. in charset_table_init during bootstrap (ambiguous root) > 2. on the GC heap (loaded from the dump; pinned) > 3. on the malloc heap when it needed to grow (ambiguous root) > > Now, the charset table is copied back to charset_table_init after > loading the dump and it is scanned as an exact root area. Thanks. Seems like the wrong approach to me: This removes the case we want, which is case 2, by restoring the case we don't want, by adding new and larger unprotected roots. Shouldn't we simply make it so if igc is in use, we only ever use case 2? This code doesn't handle charset_table growing above the initial size, then dumping. We should attempt to support unusual dumps like that. There is a warning about making charset_table_init[] large enough so it doesn't grow during bootstrapping, but I think that comment is obsolete and refers to unexec problems. It should be fine now, I think, but there really is no reason for an optimization that applies only to temacs, anyway. At the very least, this deserves an emacs_abort (not an igc_assert which might be disabled) and a comment explaining why it's not usually hit: a silent failure and crash is a bad way of indicating to a user they need to bump the initial charset_table_size. Scanning only to charset_used, rather than to the end of the array, is a questionable optimization: scanning a small number of Qnils is better than relying on what's potentially another thread making memory modifications so they become visible in the right order. charset_table should move to the cold section of the dump if it's no longer expected to be used after restoring the dump. I don't think "clone_charset_table" is a good name for the only thing this function does, which is to restore a dumped charset table, making the dumped copy unsafe to use in the process. I don't understand why it needs to assert the arena is parked. That simply makes future changes more difficult. So why not just turn IGC_OBJ_DUMPED_CHARSET_TABLE into IGC_OBJ_CHARSET_TABLE, remove some igc_asserts, and avoid roots altogether, ensuring that root_create_charset_table is called only in temacs, never in pdumped emacs? Also, I think removing pinned_objects_in_dump is a separate change. Are you looking at kbd_buffer next? That's a very large ambiguous root, but a little tricky, not least due to the GCC bug. Pip From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 04 14:56:07 2025 Received: (at 78686) by debbugs.gnu.org; 4 Jun 2025 18:56:07 +0000 Received: from localhost ([127.0.0.1]:52541 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uMtHS-00037s-9Z for submit@debbugs.gnu.org; Wed, 04 Jun 2025 14:56:06 -0400 Received: from mail-wr1-x430.google.com ([2a00:1450:4864:20::430]:49177) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1uMtHQ-000372-7t for 78686@debbugs.gnu.org; Wed, 04 Jun 2025 14:56:04 -0400 Received: by mail-wr1-x430.google.com with SMTP id ffacd0b85a97d-3a3673e12c4so84566f8f.2 for <78686@debbugs.gnu.org>; Wed, 04 Jun 2025 11:56:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1749063358; x=1749668158; darn=debbugs.gnu.org; h=mime-version:user-agent:message-id:date:references:in-reply-to :subject:cc:to:from:from:to:cc:subject:date:message-id:reply-to; bh=uOPsM5WgB5hQGMN9zhVtaiSTSQs3P5OgMX1Tlu9LaYs=; b=jXmDtP+wknOlsn++UFAmRCMhzltd7FVdw5+g2ch5pydw7SY/yHTlTS1w+OmZ2IUlVQ rmOc/9UNfdQfUglwyB/tG1c3BRixjNQW/m/hBRSNAChB1frt5AqSIZpGXqrI4PqZPhFR AYiRS6631CwoSnx3756sdOg182bYrb3MDRMFmrbFzlBS5FjX9QJ7qyk0v94TRJE5d55e 4LsBSvsTWx6xKDKvWTcouJg8jqVkDovSoWA11UtMTd1RKRw0puDVhWlj8iB8lhJBCUTy aNpfFNNNMzy7mppyC4ki4zmxTdpyK5l+msZflHt9hQAlqvfXpRlMxmlUASnWKCrQ4gUB EaMQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1749063358; x=1749668158; h=mime-version:user-agent:message-id:date:references:in-reply-to :subject:cc:to:from:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=uOPsM5WgB5hQGMN9zhVtaiSTSQs3P5OgMX1Tlu9LaYs=; b=NtvSzXIub2adadtlDg7Ygahz29OskmryUR6PQ1r+Xiy8znHSoVsrF1px+VlHGrHKoK tQo4s14A6auSvOsGDgJGkyeO4/9JS1lieF6Xq5btUJ8Kd+zM0FUym5qC4akIiKyrehs9 A4Eu0r6ma0kgoqQ43VYsYFxVLgn6U/YYpTtLOMHknLCcdCoIlsGAN7sT5e/Eukejp/YV 0sdZ5cHpYrNjl8uOojlb6CyA7l0txGGRdFsxV+IYtWD+0q5PrEEgecFjqHwA+jjM4xC5 TYwX++bjVKcqDpxrzhemGV2GzvL6lEHefJdFE+BUImtnEBPji3oqe50k5wdE9YgZ2RdV 0vlQ== X-Gm-Message-State: AOJu0YzXhQgSIlSFNrSpnBpKB5LLuZbJyB/2onLy0Jmmfx0W5T2LGD/T ouaxSe+JNZwXt4Fdj+YSVJ80jaw7o1oXxntbiMEGh+avt/jBpqKR9lC1K0k7vA== X-Gm-Gg: ASbGncsMkKgwqBkjISsNszfcFdClDZnjyD0ZactoYlNnQ58w/61NOCNqRTNXnl0M1po Ym0O3JLP7Js7OhsYih4Ua/Ls0I0vGF/REd5RlN8dD77B/cdiYzFA/SJVg65FAUqd+TN5x7e042b DTXC6LL98TGL8uhmFP7Zef/bQN165jnrEEHm1H8NvSLKmOSU6NVskZsics7ZyUazSZi/hnMwwro O7CobJPUqNinwvCLlmeL+ZC7IGpzO12S2RuPEHTy7sXQ4dt2uuwi9EMc8UgnYeNplTNdBRE5pVo 3GeHmao5zkMa+fo3FaOi6L26UT8N0rZMlMtg41uuVRScEZhh X-Google-Smtp-Source: AGHT+IEoLkn0lASNi8amihdMcD5NEy9p8WKSR2/RVKLxgU9ZsRPu35HU/KNT4Z1GbGgDNmcfjt3JSw== X-Received: by 2002:a05:6000:240d:b0:3a4:d99b:f51d with SMTP id ffacd0b85a97d-3a51d8efd25mr3834415f8f.11.1749063357661; Wed, 04 Jun 2025 11:55:57 -0700 (PDT) Received: from caladan ([89.107.110.32]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-3a521c64d22sm2251045f8f.48.2025.06.04.11.55.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Jun 2025 11:55:57 -0700 (PDT) From: Helmut Eller To: Pip Cet Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root In-Reply-To: <87bjr3ejpr.fsf@protonmail.com> References: <87r000k3uv.fsf@gmail.com> <87bjr3ejpr.fsf@protonmail.com> Date: Wed, 04 Jun 2025 20:55:56 +0200 Message-ID: <87qzzziber.fsf@gmail.com> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 78686 Cc: 78686@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) On Wed, Jun 04 2025, Pip Cet wrote: > "Helmut Eller" writes: > >> Here is a proposal to move charset_table out of the GC heap and make it >> an exact root area instead. > >> Previously, the charset_table was located in various places: > >> 1. in charset_table_init during bootstrap (ambiguous root) >> 2. on the GC heap (loaded from the dump; pinned) >> 3. on the malloc heap when it needed to grow (ambiguous root) >> >> Now, the charset table is copied back to charset_table_init after >> loading the dump and it is scanned as an exact root area. > > Thanks. Seems like the wrong approach to me: > > This removes the case we want, which is case 2, by restoring the case we > don't want, by adding new and larger unprotected roots. Shouldn't we > simply make it so if igc is in use, we only ever use case 2? Not necessarily. My ideal solution would be to replace the charset.attributes field with something that doesn't need to be traced, e.g. an index into a Lisp_Vector of attributes. [...] > So why not just turn IGC_OBJ_DUMPED_CHARSET_TABLE into > IGC_OBJ_CHARSET_TABLE, remove some igc_asserts, and avoid roots > altogether, ensuring that root_create_charset_table is called only in > temacs, never in pdumped emacs? First, The IGC_OBJ_DUMPED_FOO types only exist because changing the GC heap format was easier than wrestling with the dumper. There was the expectation that the dumper will eventually be extended to support what MPS actually needs: sections that correspond to MPS pools. Second, I don't like the practice of using the GC header to create "shadow types". I'd prefer to make them pvec types or keep 'em out of the GC heap. [...] > Are you looking at kbd_buffer next? That's a very large ambiguous root, > but a little tricky, not least due to the GCC bug. Maybe, depending on how this charset table business goes. Helmut From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 04 16:16:02 2025 Received: (at 78686) by debbugs.gnu.org; 4 Jun 2025 20:16:02 +0000 Received: from localhost ([127.0.0.1]:53199 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uMuWm-0005Mr-TK for submit@debbugs.gnu.org; Wed, 04 Jun 2025 16:16:02 -0400 Received: from mail-10630.protonmail.ch ([79.135.106.30]:26851) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1uMuWk-00050q-82 for 78686@debbugs.gnu.org; Wed, 04 Jun 2025 16:15:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com; s=protonmail3; t=1749068149; x=1749327349; bh=DDFMohfGpxBOv3bOslRcaER2LKJn/PDA7RmYyupn6KI=; h=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References: Feedback-ID:From:To:Cc:Date:Subject:Reply-To:Feedback-ID: Message-ID:BIMI-Selector:List-Unsubscribe:List-Unsubscribe-Post; b=nTMWmnXiEg0c1e5szwwDGgWlK2orXO1w+WSxifbd2+JFXVS5ySpqc1D0MU2laDhcP 3sCtf42wGmxLIyNB4zepGL3CGOurtpldb2uwhNJ2YfUan3n1ZQEjLuJTEcdtTBMXQT yqzJFKRGjd4QlwDotXaw5hVN8aYuGvjNVEPryEpurN5EXT8ntrBKkkdzx4rf3U317M HDQT6bxaIFRsPO+lcdan/um3apRwYpw52VXYQ2EAw95fQfMrfMaJp3QPYljkQ6jy3H n3E/3zcj+v742WN/3IpPFNuUiJS+F6m9pjf4PoW8tKC4IK39660VIt9KgAcjpTQ0zo LXOV/oae+nJaQ== Date: Wed, 04 Jun 2025 20:15:46 +0000 To: Helmut Eller From: Pip Cet Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root Message-ID: <87cybj8dqo.fsf@protonmail.com> In-Reply-To: <87qzzziber.fsf@gmail.com> References: <87r000k3uv.fsf@gmail.com> <87bjr3ejpr.fsf@protonmail.com> <87qzzziber.fsf@gmail.com> Feedback-ID: 112775352:user:proton X-Pm-Message-ID: 445735e5b15450a204dfa8ea0c649b5bbcded4ae MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 78686 Cc: 78686@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) "Helmut Eller" writes: > On Wed, Jun 04 2025, Pip Cet wrote: > >> "Helmut Eller" writes: >> >>> Here is a proposal to move charset_table out of the GC heap and make it >>> an exact root area instead. >> >>> Previously, the charset_table was located in various places: >> >>> 1. in charset_table_init during bootstrap (ambiguous root) >>> 2. on the GC heap (loaded from the dump; pinned) >>> 3. on the malloc heap when it needed to grow (ambiguous root) >>> >>> Now, the charset table is copied back to charset_table_init after >>> loading the dump and it is scanned as an exact root area. >> >> Thanks. Seems like the wrong approach to me: >> >> This removes the case we want, which is case 2, by restoring the case we >> don't want, by adding new and larger unprotected roots. Shouldn't we >> simply make it so if igc is in use, we only ever use case 2? > > Not necessarily. My ideal solution would be to replace the > charset.attributes field with something that doesn't need to be traced, > e.g. an index into a Lisp_Vector of attributes. I agree the charset design in general could do with some attention. In particular, we're still leaking the old charset tables when growing for the second or subsequent times on the master branch, and I really think if there were a good reason to do so someone would have updated the comment by now. Plus, there is no way to undefine charsets. However, such changes should not be specific to feature/igc. What problem are we trying to solve here? As far as I can tell, there's just one significant problem: we have a root covering charset_table_init even though that array is unused in pdumped Emacs sessions. It's not a huge problem, just 180 zero words fixed as Lisp objects, but it is unsatisfying and we should fix it. So let's get rid of that unnecessary root. That leaves two very minor problems: 1. undumped Emacs sessions. I don't think these are worth optimizing for. Do you? 2. Emacs sessions which grow charset_table beyond the 180 entries in the dump. Does this happen? In any case, we do handle the case correctly, leaking memory but giving the user access to as many charsets as they desire. There is also the long-term goal of making it so all objects in the pdump section can be relocated elsewhere, allowing us to free the section. We should do so by tracing objects in the dump precisely so they can be moved out. We should not do so by explicitly memcpying objects from the dump to xmalloc'd memory. >> So why not just turn IGC_OBJ_DUMPED_CHARSET_TABLE into >> IGC_OBJ_CHARSET_TABLE, remove some igc_asserts, and avoid roots >> altogether, ensuring that root_create_charset_table is called only in >> temacs, never in pdumped emacs? > > First, The IGC_OBJ_DUMPED_FOO types only exist because changing the GC > heap format was easier than wrestling with the dumper. There was the > expectation that the dumper will eventually be extended to support what > MPS actually needs: sections that correspond to MPS pools. I'm not agreeing or disagreeing with that statement. I think the MPS tradeoff of not having snapshots and requiring client code to handle dump/restore scenarios on its own was probably a good choice, and I see no way of changing it now. > Second, I don't like the practice of using the GC header to create > "shadow types". Noted. That's a good argument for not introducing new ones, but your patch doesn't introduce or delete any; it just copies things back from the dump heap (which is never freed) to an unprotected root in the BSS section or to xmalloc'd memory. > I'd prefer to make them pvec types or keep 'em out of > the GC heap. I agree with the sentiment, but turning struct charset into a pseudovector would be a significant change which should be done on the master branch, too, IMHO, and would need to be discussed first. I assume there were good reasons for making it a flat table rather than a vector of pseudovectors (it does save one level of indirection), but I'm not sure those are still relevant. >> Are you looking at kbd_buffer next? That's a very large ambiguous root, >> but a little tricky, not least due to the GCC bug. > > Maybe, depending on how this charset table business goes. I don't see the dependency. As far as I can tell, we need this patch. It removes the extraneous root. >From a13ef4ad9b97455c58682b4f616272a1c09c4b8f Mon Sep 17 00:00:00 2001 From: Pip Cet Subject: [PATCH] Destroy the charset_table_init root when pdump is loaded (bug#78686) At this point, pdumper has changed charset_table to point into the dump, not to charset_table_init, which remains unused for the rest of the Emacs session. No need for a root. * src/igc.c (igc_on_pdump_loaded): Destroy unnecessary root. --- src/igc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/igc.c b/src/igc.c index f63b0fe8c88..ab61fa7d3d6 100644 --- a/src/igc.c +++ b/src/igc.c @@ -5204,6 +5204,8 @@ igc_on_pdump_loaded (void *dump_base, void *hot_start= , void *hot_end, =09=09=09 (uint8_t *) pinned_objects_in_dump =09=09=09 + sizeof pinned_objects_in_dump, =09=09=09 "dump-pins"); + + igc_destroy_root_with_start (charset_table_init); } =20 void * --=20 2.48.1 We should also add a comment explaining that pinned_objects_in_dump[1] is necessary to keep the code space masks around: >From 48e21dc0e27e9d0a19579c5cd13cbad43533ada8 Mon Sep 17 00:00:00 2001 From: Pip Cet Subject: [PATCH] ; * src/igc.c (fix_charset_table): Add comment about untra= ced ref. --- src/igc.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/igc.c b/src/igc.c index ab61fa7d3d6..7ff263255f9 100644 --- a/src/igc.c +++ b/src/igc.c @@ -1950,6 +1950,9 @@ fix_charset_table (mps_ss_t ss, struct charset *table= , size_t nbytes) { for (size_t i =3D 0, len =3D nbytes / sizeof (struct charset); i < len= ; i++) IGC_FIX12_OBJ (ss, &table[i].attributes); + /* FIXME/igc: fix table[i].code_space_mask, once that's no longer an + interior pointer. Currently, it's protected explicitly by + pinned_objects_in_dump[1] =3D=3D code_space_masks. */ } MPS_SCAN_END (ss); return MPS_RES_OK; --=20 2.48.1 Finally, once we're convinced that it's not needed for anything else, we can do this: >From 7111cb4562bbd89f12d6a7bf284ec95f93ccf69d Mon Sep 17 00:00:00 2001 From: Pip Cet Subject: [PATCH] Make charset_table an exact root (bug#78686) * src/igc.c (fix_charset_table): Adjust comment. (igc_on_pdump_loaded): Turn charset_table into an exact rather than an ambiguous (pinned) root. --- src/igc.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/src/igc.c b/src/igc.c index 7ff263255f9..70e8eac49e9 100644 --- a/src/igc.c +++ b/src/igc.c @@ -1952,7 +1952,7 @@ fix_charset_table (mps_ss_t ss, struct charset *table= , size_t nbytes) IGC_FIX12_OBJ (ss, &table[i].attributes); /* FIXME/igc: fix table[i].code_space_mask, once that's no longer an interior pointer. Currently, it's protected explicitly by - pinned_objects_in_dump[1] =3D=3D code_space_masks. */ + pinned_objects_in_dump[0] =3D=3D code_space_masks. */ } MPS_SCAN_END (ss); return MPS_RES_OK; @@ -5153,7 +5153,7 @@ check_dump (mps_addr_t start, mps_addr_t end) return true; } =20 -static mps_addr_t pinned_objects_in_dump[3]; +static mps_addr_t pinned_objects_in_dump[2]; =20 /* Called from pdumper_load. [HOT_START, HOT_END) is the hot section of the dump. [COL_START, COLD_END) is the cold section of the @@ -5195,9 +5195,11 @@ igc_on_pdump_loaded (void *dump_base, void *hot_star= t, void *hot_end, set_header (heap_end, IGC_OBJ_PAD, relocs_size, 0); =20 eassert (check_dump (h, cold_end)); + + igc_root_create_exact_ptr (&charset_table); + /* Pin some stuff in the dump */ mps_addr_t pinned_roots[] =3D { - charset_table, cold_start, /* code_space_masks */ cold_user_data_start, }; --=20 2.48.1 WDYT? Pip From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 05 04:09:23 2025 Received: (at 78686) by debbugs.gnu.org; 5 Jun 2025 08:09:23 +0000 Received: from localhost ([127.0.0.1]:58436 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uN5f8-0004Q9-UA for submit@debbugs.gnu.org; Thu, 05 Jun 2025 04:09:23 -0400 Received: from mail-wr1-x432.google.com ([2a00:1450:4864:20::432]:61820) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1uN5f6-0004Pd-O3 for 78686@debbugs.gnu.org; Thu, 05 Jun 2025 04:09:21 -0400 Received: by mail-wr1-x432.google.com with SMTP id ffacd0b85a97d-3a503d9ef59so537091f8f.3 for <78686@debbugs.gnu.org>; Thu, 05 Jun 2025 01:09:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1749110954; x=1749715754; darn=debbugs.gnu.org; h=mime-version:user-agent:message-id:date:references:in-reply-to :subject:cc:to:from:from:to:cc:subject:date:message-id:reply-to; bh=4mIMfAm/zu3Jqj088+Nr/ChyUMPTYuICdbD4AOTMj88=; b=QE2Fgfonr9LwNuMEq5/7RdVZwZPWsZQ83OZJrsqjX3OfnQSdEYF204DYhI9tAh660N V/EZG1mMTA0+UKaAbNLfvyM2wiGetShUy3OYN+wqgzOTYbJapmAFN3TAE6J1CWc1tvjd dYiA2LvObwDZ881eAmoURiZFWsZKpQyvmlk6B4y0svLhXRHC+DSl4R7YvF83ZFyBq6CK /XV3HG9kyHifusdPOKmDhYLQ0+pMzT/oMaEC6dQFhmN/9PMDoU1kA1ycjm24EqBOJsK7 XDW1ykNf9xvce1YvUSwuayljM/NdhaCWR+DBu+0pWnjxnri9FzlbGmQ7JdzqIupTcplP ymDQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1749110954; x=1749715754; h=mime-version:user-agent:message-id:date:references:in-reply-to :subject:cc:to:from:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=4mIMfAm/zu3Jqj088+Nr/ChyUMPTYuICdbD4AOTMj88=; b=uRVr0uWAREuGOnSUrCNSiIxpt9FxHiJW+yMD76PkgU4wW0/sOP1Oi4iiTVyR3+x9dA yQR0WgaKhTlcScOZbM2r3Sc37t6hIWChMEAdcVner7eziU1lLSpvrrlRjUOThetXd051 88wC+X4abeqMYleKTizLh+DTNdYpQMu2U2/QdLSqKV/SAX2IScoPw3YwEq2Z3WNCXMn3 APEUMJuFRwlAq++9sguj8/uD2tArxMh44bBqGQFNtNZjrZZ1WKgmrb3qkwm8q0iMy8IT SNP2dHg3Aov0oAH0V1bwuGszo8yBn+PTQxTjut1vim87I7jbnXNl9hJQbTMCUYePcTMc RtxQ== X-Gm-Message-State: AOJu0YwgZ7xQma1BmLBf4etNcSwHC/1KRVk7duqAYM3P0tCl9uMfYWvr ooH0rhnfMIB/IuxgT7/00KuXfHVnXNJKgPFUcyVHGQqDLzyEaib9ucXiGe7oqg== X-Gm-Gg: ASbGncs/OMqubDEd4XkqHr2lV1j1kp+DqC4zXMnXQvbRY8z6l/M6NlfArgK1KwcJK8a mSf73A+95y7ZxOJV3loudvb6H2hdON+SGpdftHXOq0td6yxOex2V325fOE8w+3tEOU93Kw4vVqu 2g7/uN1R9cBXtzU3Lv3JLxG+tSCpP2YubDkb5COmz8qMKnkEREW1360KS7C7xRgrMyZfRq/Cmiv DsWi0h7XoACNakgz8p7TPE+1dzKkeH8YjRr+yP+9B7xINcwlJwfK4W1Mtz0eWNwko+AtjbHN+JZ VQjJqnsUtu5KVPcauuZlkwRmxzc+rfnfnrCZjWfaNRTjTbz+ X-Google-Smtp-Source: AGHT+IFGbc2SCjp0p1XDGDN7Aqwrwmuf9e8/P3zhP6rckX8ZjqWZJOaf1/UZAYnYS7hjZWmM44UFTg== X-Received: by 2002:a05:6000:4287:b0:3a4:d6ed:8e00 with SMTP id ffacd0b85a97d-3a51d959f72mr5108826f8f.33.1749110954076; Thu, 05 Jun 2025 01:09:14 -0700 (PDT) Received: from caladan ([89.107.110.32]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-3a4efe5b87csm24164860f8f.12.2025.06.05.01.09.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 05 Jun 2025 01:09:13 -0700 (PDT) From: Helmut Eller To: Pip Cet Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root In-Reply-To: <87cybj8dqo.fsf@protonmail.com> References: <87r000k3uv.fsf@gmail.com> <87bjr3ejpr.fsf@protonmail.com> <87qzzziber.fsf@gmail.com> <87cybj8dqo.fsf@protonmail.com> Date: Thu, 05 Jun 2025 10:09:12 +0200 Message-ID: <875xhaip93.fsf@gmail.com> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 78686 Cc: 78686@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) On Wed, Jun 04 2025, Pip Cet wrote: > As far as I can tell, there's just one significant problem: we have a > root covering charset_table_init even though that array is unused in > pdumped Emacs sessions. It's not a huge problem, just 180 zero words > fixed as Lisp objects, but it is unsatisfying and we should fix it. It's more like 60kb of zeroes. > So let's get rid of that unnecessary root. > > That leaves two very minor problems: > > 1. undumped Emacs sessions. I don't think these are worth optimizing > for. Do you? No. But there is also no need to make them gratuitously different from normal sessions. > 2. Emacs sessions which grow charset_table beyond the 180 entries in the > dump. Does this happen? In any case, we do handle the case correctly, > leaking memory but giving the user access to as many charsets as they > desire. Defining new charsets is presumably an exotic thing to do. > There is also the long-term goal of making it so all objects in the > pdump section can be relocated elsewhere, allowing us to free the > section. We should do so by tracing objects in the dump precisely so > they can be moved out. As my patch does, in contrast to your idea. [...] >> Second, I don't like the practice of using the GC header to create >> "shadow types". > > Noted. That's a good argument for not introducing new ones, but your > patch doesn't introduce or delete any; it just copies things back from > the dump heap (which is never freed) to an unprotected root in the BSS > section or to xmalloc'd memory. Exactly: my patch copies the charset table to the place where it belongs, the space in the BSS section reserved for it. Just like builtin symbols and subrs are copied to their space in the BSS section. [...] >> Maybe, depending on how this charset table business goes. > > I don't see the dependency. My motivation may depend on it. > As far as I can tell, we need this patch. It removes the extraneous > root. I like my patch better. Helmut From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 05 06:50:04 2025 Received: (at 78686) by debbugs.gnu.org; 5 Jun 2025 10:50:04 +0000 Received: from localhost ([127.0.0.1]:59564 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uN8Ad-0008Jv-Go for submit@debbugs.gnu.org; Thu, 05 Jun 2025 06:50:04 -0400 Received: from mail-4322.protonmail.ch ([185.70.43.22]:56865) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1uN8Ab-0008JA-Ba for 78686@debbugs.gnu.org; Thu, 05 Jun 2025 06:50:02 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com; s=protonmail3; t=1749120594; x=1749379794; bh=xrRNY9s5ImoF7BgNiDdgJ+4sOl3HSHQXzTD1JF96/lU=; h=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References: Feedback-ID:From:To:Cc:Date:Subject:Reply-To:Feedback-ID: Message-ID:BIMI-Selector:List-Unsubscribe:List-Unsubscribe-Post; b=TXvsg3WrT5JbUn3vjJLCgoJ6lZj997lAMOIctA3tNTmH352hUJSsK0rhzKmSsh8h3 Bw53R5O1IAKUkCftjCUt5MRBSf0eFNqFeHoNnMl+5r7lnZYXNoN2vaTGvY/25Meqqy OryOtepFm1N4xVdVgXTzPhj21TI8aly33clRhPYAQQ+mr2rgNGpgqvFFXBEnAzyVw9 81cp7QXUtztPesmnsVgd/NFy6Nf7LLihF7kVRKlAk+m661zi3tvFHFTWhkfMzz8osv Ddh6ofRogGXTeR9UvmYjomJD6dycTPTZDseqlCN3/qwLtQzQAGfHOXij6Jx5I8jv5B HYh15o0PJhMvg== Date: Thu, 05 Jun 2025 10:49:50 +0000 To: Helmut Eller From: Pip Cet Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root Message-ID: <874iwu8nuf.fsf@protonmail.com> In-Reply-To: <875xhaip93.fsf@gmail.com> References: <87r000k3uv.fsf@gmail.com> <87bjr3ejpr.fsf@protonmail.com> <87qzzziber.fsf@gmail.com> <87cybj8dqo.fsf@protonmail.com> <875xhaip93.fsf@gmail.com> Feedback-ID: 112775352:user:proton X-Pm-Message-ID: 7200496d98d68683f7d9b8aad98685e7324f4ff1 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 78686 Cc: 78686@debbugs.gnu.org, =?utf-8?Q?Gerd_M=C3=B6llmann?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) "Helmut Eller" writes: > On Wed, Jun 04 2025, Pip Cet wrote: > >> As far as I can tell, there's just one significant problem: we have a >> root covering charset_table_init even though that array is unused in >> pdumped Emacs sessions. It's not a huge problem, just 180 zero words >> fixed as Lisp objects, but it is unsatisfying and we should fix it. > > It's more like 60kb of zeroes. Oops, you're right. All the more reason to remove that root right away. >> So let's get rid of that unnecessary root. I still think we should do that. Do you think getting rid of the root would actually make things worse? If you don't, I'll push that change first. >> That leaves two very minor problems: >> >> 1. undumped Emacs sessions. I don't think these are worth optimizing >> for. Do you? > > No. But there is also no need to make them gratuitously different from > normal sessions. I agree, but the decision has been made on the master branch to do just that (static storage pre-pdump; pdump storage post-pdump; xmalloc if grown), and if we want to revisit it we need to fix it on the master branch. >> 2. Emacs sessions which grow charset_table beyond the 180 entries in the >> dump. Does this happen? In any case, we do handle the case correctly, >> leaking memory but giving the user access to as many charsets as they >> desire. > > Defining new charsets is presumably an exotic thing to do. Still, I see no reason it wouldn't have worked before, and it can't possibly work after your patch. >> There is also the long-term goal of making it so all objects in the >> pdump section can be relocated elsewhere, allowing us to free the >> section. We should do so by tracing objects in the dump precisely so >> they can be moved out. > > As my patch does, in contrast to your idea. Your patch does not make the charset table movable or protectable. >>> Second, I don't like the practice of using the GC header to create >>> "shadow types". >> >> Noted. That's a good argument for not introducing new ones, but your >> patch doesn't introduce or delete any; it just copies things back from >> the dump heap (which is never freed) to an unprotected root in the BSS >> section or to xmalloc'd memory. > > Exactly: my patch copies the charset table to the place where it > belongs, the space in the BSS section reserved for it. Just like > builtin symbols and subrs are copied to their space in the BSS section. I don't think those two special cases (lispsym has very good reasons for being special; subrs aren't in the BSS section) warrant introducing a third special case. >>> Maybe, depending on how this charset table business goes. >> >> I don't see the dependency. > > My motivation may depend on it. Let me just state that removing roots, particularly ambiguous ones, is a very good thing, and you've identified one (the charset table one) where it is particularly easy to do so. I just don't understand why you don't want to remove it, but make it necessary instead. But the situation with kbd_buffer is different: it's currently a large ambiguous root, and reducing the amount of time we spend scanning it is a good thing, even if it stays a root. And, again, the charset table situation in general should be improved, on the master branch, and such improvements would trickle down to feature/igc. >> As far as I can tell, we need this patch. It removes the extraneous >> root. > > I like my patch better. Okay. Still, the first single-line patch would improve things, right? If so, there's no reason not to push the patch we presumably agree improves things (even though we both think it's not ideal) while discussing the one we disagree about. Obviously, just a suggestion. I've CC'd Gerd in case he thinks I'm being silly and we should just apply your original patch (in that case, I'd try to verify it doesn't introduce new crashes if charset_table is grown pre-dump), and fix the bug which I think would introduce such crashes. Pip From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 05 07:46:45 2025 Received: (at 78686) by debbugs.gnu.org; 5 Jun 2025 11:46:45 +0000 Received: from localhost ([127.0.0.1]:59984 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uN93U-0008VD-Fm for submit@debbugs.gnu.org; Thu, 05 Jun 2025 07:46:45 -0400 Received: from mail-wr1-x42c.google.com ([2a00:1450:4864:20::42c]:44050) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1uN93R-0008UX-H5 for 78686@debbugs.gnu.org; Thu, 05 Jun 2025 07:46:42 -0400 Received: by mail-wr1-x42c.google.com with SMTP id ffacd0b85a97d-3a375e72473so418813f8f.0 for <78686@debbugs.gnu.org>; Thu, 05 Jun 2025 04:46:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1749123995; x=1749728795; darn=debbugs.gnu.org; h=content-transfer-encoding:mime-version:user-agent:message-id:date :references:in-reply-to:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=E9c38OO3OfY4yFB4xXXAD/Jbg2ph0/yvSPhi7K+Po3c=; b=ISsD+2WhurEXca1ZrMhJpYOvByBzh9UIISaBAmyz7rWwe0BSI+lZG+ashQ1GJ6eyey sPIeWJMatuvifKlanBFf77NImj+y3LrrX7cA42pBoDmwH7ZJGyfX5iRLhP4ZlcAC1gps eDaM9+8583jH3kUCaoJOQ67b3jiqKR0jGwG+sRa74+xwy8ZtSgVeIE0iXFv5mQE8rE4P lXNBv4wwpTNszKJLH6+icW76FKFqYpSVy8CfVYapmz2FbhZxxbQWIa8aCEE/v2WPFlaT 3vR9wV7P345Q0qFPJtgZQkHmUtpymha1akzFTaziiQ+RETM6mviFsJXlcbVhNsyguZi1 N36g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1749123995; x=1749728795; h=content-transfer-encoding:mime-version:user-agent:message-id:date :references:in-reply-to:subject:cc:to:from:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=E9c38OO3OfY4yFB4xXXAD/Jbg2ph0/yvSPhi7K+Po3c=; b=YRZIp1yBqWTDGieACvTRtiryrcVZ1j1BUObNc6egN8+ESTt/PM+VQVeoIeI2+rH51q wQWth2PU9EQdimnn7LRsH4eLCvnqUsQUaeAMnRVjlG/Zj5f2uzB2ON6N2hfL7pdx9F9k Pyx3JR/FptnPtg8tSOfOwh0GYW2T13reB6pQKwixh1F6dTq6FC0X6IC6+QDUmavEGh4k X+hko5M1D2TOIrIbngvSVigF9P673wmctVk6jwi12A9axlgHzvQb1zD6uVAfLH1m/uuI 0D8xzqe0DS4Z3wtQFBcCEobA51Yd2l/ATE6PQkOwS/mrjDmndZIIg0lINKzjp3UGXESj A/hQ== X-Forwarded-Encrypted: i=1; AJvYcCWRtv1G9l8doSzbNdI5Wmpbu1uwd4t9TF8yqE4XibuK1G2eE7igPFz3LUAE1cecP1qVNfVICA==@debbugs.gnu.org X-Gm-Message-State: AOJu0YxApQKtisHzHcDU2NxtPTtzVvod5E4pMjIubMXq7WupueffncnQ 7Bao//w65DzUhu8EdcaMI4JQxLeOvzMWiMP6oqsYmonnFLy9RKbqbMbs X-Gm-Gg: ASbGncs5keZal+eGp2uf4proBp2lFzYwe3YwiPVoLWjmdJKQ2Rq6AngbDxGaputm7eq KvoGA+XqKp5vXLjaCg52jYclVDY/iPv6jC+AS69BuxMxH59dEX5JV4OkQoiUvyIhIE2RiTQ/eyu 0AEWWY5moPACAqGF5v8zRNDztb0Fot7ULCPqgK0bgcdMPJGJSL8Lb6vz+BGaf2XHJxVan8zzvnw VpImA0YyphqeU9kTplkZFCZv7KpkkyZOajUvRxQ3NooSKnSxdxfpt6BwRg1U5jRbQjreqmDSnKP 9KN6KWbYh4Tnkv0zn/iUrz6gCKWjZMsgp5J02f/eT7UCrjfS2e4mIZvl3fi63EZ5tzHnekbLRTv ChFGZ2Ob7g5LuQA3BXs/mIl0tZ9K7W/wLZhCgYS+kjuLtdbszzpzf/A== X-Google-Smtp-Source: AGHT+IEAmAgkS9tobGekS1By5kgZVZLiRa2Txrk7miK8T5ukgSqFz4UzUvIRHCL5cxzr/wN4Mn3beg== X-Received: by 2002:a05:6000:402c:b0:3a4:d53d:be22 with SMTP id ffacd0b85a97d-3a51dc4c8d0mr5035876f8f.58.1749123994991; Thu, 05 Jun 2025 04:46:34 -0700 (PDT) Received: from pro2 (p200300e0b7095600688e4fdf01bce601.dip0.t-ipconnect.de. [2003:e0:b709:5600:688e:4fdf:1bc:e601]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-3a4efe6cd15sm24702867f8f.39.2025.06.05.04.46.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 05 Jun 2025 04:46:34 -0700 (PDT) From: =?utf-8?Q?Gerd_M=C3=B6llmann?= To: Pip Cet Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root In-Reply-To: <874iwu8nuf.fsf@protonmail.com> References: <87r000k3uv.fsf@gmail.com> <87bjr3ejpr.fsf@protonmail.com> <87qzzziber.fsf@gmail.com> <87cybj8dqo.fsf@protonmail.com> <875xhaip93.fsf@gmail.com> <874iwu8nuf.fsf@protonmail.com> Date: Thu, 05 Jun 2025 13:46:33 +0200 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 78686 Cc: 78686@debbugs.gnu.org, Gerd =?utf-8?Q?M=C3=B6llmann?= , Helmut Eller X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) Pip Cet writes: > "Helmut Eller" writes: > >> On Wed, Jun 04 2025, Pip Cet wrote: >> >>> As far as I can tell, there's just one significant problem: we have a >>> root covering charset_table_init even though that array is unused in >>> pdumped Emacs sessions. It's not a huge problem, just 180 zero words >>> fixed as Lisp objects, but it is unsatisfying and we should fix it. >> >> It's more like 60kb of zeroes. > > Oops, you're right. All the more reason to remove that root right away. > >>> So let's get rid of that unnecessary root. > > I still think we should do that. Do you think getting rid of the root > would actually make things worse? If you don't, I'll push that change > first. > >>> That leaves two very minor problems: >>> >>> 1. undumped Emacs sessions. I don't think these are worth optimizing >>> for. Do you? >> >> No. But there is also no need to make them gratuitously different from >> normal sessions. > > I agree, but the decision has been made on the master branch to do just > that (static storage pre-pdump; pdump storage post-pdump; xmalloc if > grown), and if we want to revisit it we need to fix it on the master > branch. > >>> 2. Emacs sessions which grow charset_table beyond the 180 entries in the >>> dump. Does this happen? In any case, we do handle the case correctly, >>> leaking memory but giving the user access to as many charsets as they >>> desire. >> >> Defining new charsets is presumably an exotic thing to do. > > Still, I see no reason it wouldn't have worked before, and it can't > possibly work after your patch. > >>> There is also the long-term goal of making it so all objects in the >>> pdump section can be relocated elsewhere, allowing us to free the >>> section. We should do so by tracing objects in the dump precisely so >>> they can be moved out. >> >> As my patch does, in contrast to your idea. > > Your patch does not make the charset table movable or protectable. > >>>> Second, I don't like the practice of using the GC header to create >>>> "shadow types". >>> >>> Noted. That's a good argument for not introducing new ones, but your >>> patch doesn't introduce or delete any; it just copies things back from >>> the dump heap (which is never freed) to an unprotected root in the BSS >>> section or to xmalloc'd memory. >> >> Exactly: my patch copies the charset table to the place where it >> belongs, the space in the BSS section reserved for it. Just like >> builtin symbols and subrs are copied to their space in the BSS section. > > I don't think those two special cases (lispsym has very good reasons for > being special; subrs aren't in the BSS section) warrant introducing a > third special case. > >>>> Maybe, depending on how this charset table business goes. >>> >>> I don't see the dependency. >> >> My motivation may depend on it. > > Let me just state that removing roots, particularly ambiguous ones, is a > very good thing, and you've identified one (the charset table one) where > it is particularly easy to do so. I just don't understand why you don't > want to remove it, but make it necessary instead. But the situation > with kbd_buffer is different: it's currently a large ambiguous root, and > reducing the amount of time we spend scanning it is a good thing, even > if it stays a root. > > And, again, the charset table situation in general should be improved, > on the master branch, and such improvements would trickle down to > feature/igc. > >>> As far as I can tell, we need this patch. It removes the extraneous >>> root. >> >> I like my patch better. > > Okay. Still, the first single-line patch would improve things, right? > If so, there's no reason not to push the patch we presumably agree > improves things (even though we both think it's not ideal) while > discussing the one we disagree about. > > Obviously, just a suggestion. I've CC'd Gerd in case he thinks I'm > being silly and we should just apply your original patch (in that case, > I'd try to verify it doesn't introduce new crashes if charset_table is > grown pre-dump), and fix the bug which I think would introduce such > crashes. > I've read Helmut's patch, and I think I understand the idea, but let me check: The idea is to get rid of the special handling of charset_table in the MPS heap after loading a dump. It is currently a pinned object by having an ambiguous root point to the charset_table object in the MPS heap. Helmut rather copies it back to where it came from, and the root that is used for pinning can go. The charset_table is instead an exact root. Hope that's more or less what happens. Makes sense to me =F0=9F=A4=B7. I find it a bit simpler, because the pinnin= g root is gone, which I find a good thing. One thing less to understand. One pinned object less. I did not understand everything in the following mails. I'm probably missing things that came up/were discussed somewhere else. I'd rather not sprinkle comments when I don't really know what's going on. From debbugs-submit-bounces@debbugs.gnu.org Thu Jun 05 12:41:36 2025 Received: (at 78686) by debbugs.gnu.org; 5 Jun 2025 16:41:36 +0000 Received: from localhost ([127.0.0.1]:35821 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uNDeq-0002Vv-Ap for submit@debbugs.gnu.org; Thu, 05 Jun 2025 12:41:36 -0400 Received: from mail-ed1-x52c.google.com ([2a00:1450:4864:20::52c]:47272) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1uNDen-0002VS-9T for 78686@debbugs.gnu.org; Thu, 05 Jun 2025 12:41:34 -0400 Received: by mail-ed1-x52c.google.com with SMTP id 4fb4d7f45d1cf-6020ff8d51dso2195644a12.2 for <78686@debbugs.gnu.org>; Thu, 05 Jun 2025 09:41:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1749141687; x=1749746487; darn=debbugs.gnu.org; h=mime-version:user-agent:message-id:date:references:in-reply-to :subject:cc:to:from:from:to:cc:subject:date:message-id:reply-to; bh=2OCZnNTcGMcurFLhNN5mvJZi2POLjoB2o2bZVxrm360=; b=SDAqAUTgNsNUYbmVLvPXCZOU9eQ2La4C8Gp1b/i9CxDgMzQ6Br62jEtbEVg840ITmB 8dhL4bKkA4RpCS0GzcAarkyaPKq0UIrHTqjWCCJmhB5KE5R4HRqY1I6Fvd42vVqDykSX 4giV3AR1ROtxy4NsnQ+2KPxF3S6D6LyysQsnCY8kXbXuUls1TlcD1k4RXO2a42yqBhVw VXgLpdSpFNZbFnlV4Bg4Qf4v2ZYEZvXhgT0t5/MT7XoGfFgP+04wyH/eDATlUiX6NEPm MPxHegrPcPFXZWlHa5VqbT22RxoiUFrkX/B0qWQqBrKOjRvepyF/3Y5pXrQligNmDINB JYHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1749141687; x=1749746487; h=mime-version:user-agent:message-id:date:references:in-reply-to :subject:cc:to:from:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=2OCZnNTcGMcurFLhNN5mvJZi2POLjoB2o2bZVxrm360=; b=pEKM3SyDC0LVx0w7yOb5E1ONmiMF+STzbAUf/sZygygul/xFJoRbIQ100XEJpWvYAh iJAsxsdx917VyRHvfrkIZ+s8alu0YCIogoCpaB+WYXEKnpo/qWmm8+7YeynY1SYpB6Y3 4Esyvj8ebMqSen/61rh1x+hCt8GwbopSdXKMfRKTWuRUS2pI+YkbYCmfnAwbPPiFnlFt 3lpF8Xksj4mDl23vjgmzRdupROePj+T+7UUEvQtm5Aiuw/uF0gh+1Dqd+u7tJZ2JtMsD ItzLnK7yGhg8iHC2FT9AENSQvdm2U7N8mgiTW0uHsPm8Yy2YDUO69SF4CRjhj4ZvGPxR SR1g== X-Gm-Message-State: AOJu0YwIvlpx6jeN7M788QWWvt8dZxI0weTvezVyRbD0QF243EQwlKiB 7sbjgZcRp5lgtCIkS2JB7bkA4yPtIhc/qYZ0aTcxsSM/f2wxBSwlP0Y4WdJUXA== X-Gm-Gg: ASbGncsLJuTgUhhDckAMgIBttZ/2CxjEzvUT7zX52aIy+rJCjtSLsKc98ghYLmh0fzw Qwh1tOZG+nzKA8w2VRyV/kTmiAKHsOtWlUuhrJYUOkWxCgUw0abioQnmD4OOECzQya2K2hGU/Q4 G2ri3Q95IK4d3Xg8KNoeXRClfHjONvwV3LOLSzACw+f8BB3BuSqINmLtivt+u9uBmbAGd5RDCm5 osoxVXLqP5senKaaMH66+ompLph2aC+JTF3YSvRgHk+Sl6qAyL8xE/MFDyW/WRoAtET2R5yyt2r 9drIcO+dHEyOOP/lpdRg8DMc8Zk1mSL/ZCi4JNeuogMv7CE5 X-Google-Smtp-Source: AGHT+IHKs9B5KCJ3WghRJJBeljcYiJ88SI2hG0tuDxP3YPIVWXhNSPYx4HqDNOWcJpuWdQh1WEYjWQ== X-Received: by 2002:a05:6402:42c3:b0:606:9211:e293 with SMTP id 4fb4d7f45d1cf-606ea15f726mr5932250a12.9.1749141686757; Thu, 05 Jun 2025 09:41:26 -0700 (PDT) Received: from caladan ([89.107.110.32]) by smtp.gmail.com with ESMTPSA id 4fb4d7f45d1cf-6070a0b401dsm2001833a12.73.2025.06.05.09.41.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 05 Jun 2025 09:41:26 -0700 (PDT) From: Helmut Eller To: Pip Cet Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root In-Reply-To: <874iwu8nuf.fsf@protonmail.com> References: <87r000k3uv.fsf@gmail.com> <87bjr3ejpr.fsf@protonmail.com> <87qzzziber.fsf@gmail.com> <87cybj8dqo.fsf@protonmail.com> <875xhaip93.fsf@gmail.com> <874iwu8nuf.fsf@protonmail.com> Date: Thu, 05 Jun 2025 18:41:25 +0200 Message-ID: <87ldq6gmyy.fsf@gmail.com> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 78686 Cc: 78686@debbugs.gnu.org, Gerd =?utf-8?Q?M=C3=B6llmann?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) On Thu, Jun 05 2025, Pip Cet wrote: [...] >>> So let's get rid of that unnecessary root. > > I still think we should do that. Do you think getting rid of the root > would actually make things worse? Probably not. [...] >>> 1. undumped Emacs sessions. I don't think these are worth optimizing >>> for. Do you? >> >> No. But there is also no need to make them gratuitously different from >> normal sessions. > > I agree, but the decision has been made on the master branch to do just > that (static storage pre-pdump; pdump storage post-pdump; xmalloc if > grown), and if we want to revisit it we need to fix it on the master > branch. Sounds like the inmates are running the asylum :-) [...] >> Exactly: my patch copies the charset table to the place where it >> belongs, the space in the BSS section reserved for it. Just like >> builtin symbols and subrs are copied to their space in the BSS section. > > I don't think those two special cases (lispsym has very good reasons for > being special; subrs aren't in the BSS section) warrant introducing a > third special case. So subrs end up in the .subrs section. Sill, subrs are copied back from the dump. [...] >> I like my patch better. > > Okay. Still, the first single-line patch would improve things, right? Yes. Still not as good as my approach. Helmut From debbugs-submit-bounces@debbugs.gnu.org Fri Jun 06 03:59:00 2025 Received: (at 78686) by debbugs.gnu.org; 6 Jun 2025 07:59:00 +0000 Received: from localhost ([127.0.0.1]:41493 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uNRye-0007FT-6a for submit@debbugs.gnu.org; Fri, 06 Jun 2025 03:59:00 -0400 Received: from mail-10628.protonmail.ch ([79.135.106.28]:52821) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1uNRya-0007FC-Vx for 78686@debbugs.gnu.org; Fri, 06 Jun 2025 03:58:58 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com; s=protonmail3; t=1749196730; x=1749455930; bh=FjpnGJUOmHglpZzU8H6D19Cs47cZhYS8MBA/mxRhU4w=; h=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References: Feedback-ID:From:To:Cc:Date:Subject:Reply-To:Feedback-ID: Message-ID:BIMI-Selector:List-Unsubscribe:List-Unsubscribe-Post; b=ZIcznvIKfQcOjZs6uf/jk8xcIno7ZtLJ9X117Avq4UQsnpz4nVI9BpExo+xoWCqAP uBD96RgQEin2lElLJNiX8YRtuRMwwT99QCn4BgsAgQ2TDWILxaXWuD8GGCX2eFDBGV R7EIWowwDwvUQu2GOOi62EXlHNJOiRoChmtmZxn29EifAMbOw6LpT4Muf2FJxYtuv/ wg4SgZgcr1XkgyWcwRAVl78dSaxYCWC9iqx7ZmXkKKdBIdMHeKAjInMW1/iiAOM3Lv HN3Z3GoUrVe4fenCOrXKfW3SedJOWIW4YUHZvBrbNSxi8CD+WLUNnV4ksJODNolql1 2V0XUgx177dOw== Date: Fri, 06 Jun 2025 07:58:45 +0000 To: Helmut Eller From: Pip Cet Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root Message-ID: <87h60t713g.fsf@protonmail.com> In-Reply-To: <87ldq6gmyy.fsf@gmail.com> References: <87r000k3uv.fsf@gmail.com> <87bjr3ejpr.fsf@protonmail.com> <87qzzziber.fsf@gmail.com> <87cybj8dqo.fsf@protonmail.com> <875xhaip93.fsf@gmail.com> <874iwu8nuf.fsf@protonmail.com> <87ldq6gmyy.fsf@gmail.com> Feedback-ID: 112775352:user:proton X-Pm-Message-ID: 246fbc81af80d5275e9f6df6d9e5787097f1dae1 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 78686 Cc: 78686@debbugs.gnu.org, =?utf-8?Q?Gerd_M=C3=B6llmann?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) "Helmut Eller" writes: > On Thu, Jun 05 2025, Pip Cet wrote: > > [...] >>>> So let's get rid of that unnecessary root. >> >> I still think we should do that. Do you think getting rid of the root >> would actually make things worse? > > Probably not. Okay. I'll push that change, then. > [...] >>>> 1. undumped Emacs sessions. I don't think these are worth optimizing >>>> for. Do you? >>> >>> No. But there is also no need to make them gratuitously different from >>> normal sessions. >> >> I agree, but the decision has been made on the master branch to do just >> that (static storage pre-pdump; pdump storage post-pdump; xmalloc if >> grown), and if we want to revisit it we need to fix it on the master >> branch. > > Sounds like the inmates are running the asylum :-) I don't think mental-health analogies are useful here, to be honest. I'd like to propose the following patch for *master*: >From 7f2e728623aed6ac04bd129f769b455c27f548dd Mon Sep 17 00:00:00 2001 From: Pip Cet Subject: [PATCH 1/2] Remove charset_table_init (bug#78686) In usual post-pdump Emacs sessions, charset_table is set by pdumper to point to a table of sufficient size contained in the pdump. Stop allocating ~59KB of BSS space for such sessions. * src/charset.c (Fdefine_charset_internal): Grow 'charset_table' to 180 entries right away if this code is ever reached, which it isn't in ordinary post-dump Emacs sessions. (charset_table_init): Remove variable. (syms_of_charset): Initialize 'charset_table' to NULL. --- src/charset.c | 20 +++++--------------- 1 file changed, 5 insertions(+), 15 deletions(-) diff --git a/src/charset.c b/src/charset.c index 3264d75f92e..df669b71e46 100644 --- a/src/charset.c +++ b/src/charset.c @@ -1132,7 +1132,7 @@ DEFUN ("define-charset-internal", Fdefine_charset_int= ernal, =09 int old_size =3D charset_table_size; =09 ptrdiff_t new_size =3D old_size; =09 struct charset *new_table =3D -=09 xpalloc (0, &new_size, 1, +=09 xpalloc (0, &new_size, (new_size =3D=3D 0) ? 180 : 1, =09=09 min (INT_MAX, MOST_POSITIVE_FIXNUM), sizeof *charset_table); memcpy (new_table, charset_table, old_size * sizeof *new_table); @@ -2340,18 +2340,6 @@ init_charset_once (void) PDUMPER_REMEMBER_SCALAR (charset_ksc5601); } =20 -/* Allocate an initial charset table that is large enough to handle - Emacs while it is bootstrapping. As of September 2011, the size - needs to be at least 166; make it a bit bigger to allow for future - expansion. - - Don't make the value so small that the table is reallocated during - bootstrapping, as glibc malloc calls larger than just under 64 KiB - during an initial bootstrap wreak havoc after dumping; see the - M_MMAP_THRESHOLD value in alloc.c, plus there is an extra overhead - internal to glibc malloc and perhaps to Emacs malloc debugging. */ -static struct charset charset_table_init[180]; - void syms_of_charset (void) { @@ -2377,8 +2365,10 @@ syms_of_charset (void) staticpro (&Vcharset_hash_table); Vcharset_hash_table =3D CALLN (Fmake_hash_table, QCtest, Qeq); =20 - charset_table =3D charset_table_init; - charset_table_size =3D ARRAYELTS (charset_table_init); + /* charset_table is special-cased in pdumper.c, to make it point into + the dump when it is loaded. */ + charset_table =3D NULL; + charset_table_size =3D 0; PDUMPER_REMEMBER_SCALAR (charset_table_size); charset_table_used =3D 0; PDUMPER_REMEMBER_SCALAR (charset_table_used); --=20 2.48.1 This fixes the problem of allocating BSS space. This is the one problem that I've understood your patch to fix. I'm not saying that there aren't other problems it fixes, and I'd like to understand what I'm missing here, but I cannot think of any. >>> Exactly: my patch copies the charset table to the place where it >>> belongs, the space in the BSS section reserved for it. Just like >>> builtin symbols and subrs are copied to their space in the BSS section. >> >> I don't think those two special cases (lispsym has very good reasons for >> being special; subrs aren't in the BSS section) warrant introducing a >> third special case. > > So subrs end up in the .subrs section. Sill, subrs are copied back from > the dump. The only reason for that, that I'm aware of, is precisely that they are initialized, and creating dynamic objects from static ones was considered less efficient than using the static ones directly (this is probably not true). So that reason doesn't apply to charset_table, which is not initialized. The reason for lispsym to be static is that we need the address for Lisp_Object <-> struct Lisp_Symbol * conversion, and we also want iQxxx to be constant because it's used in initializers. Note that lispsym and subrs caused a lot of trouble, some of it very recent, because it's hard to enforce the proper alignment on some systems (DJGPP). Allegedly, it's impossible to do so at all on some other systems, but I don't believe that to be correct; still, we have the code and pay the maintenance price for it. In any case, none of these arguments apply to charset_table. I think it was static because it made unexec easier. Unexec's gone, so let's get rid of it. >>> I like my patch better. >> >> Okay. Still, the first single-line patch would improve things, right? > > Yes. Still not as good as my approach. I'm sorry, but can you try once more to point out what the advantage of your approach is? I simply do not see it, and I'd like to understand. Pip From debbugs-submit-bounces@debbugs.gnu.org Fri Jun 06 12:35:16 2025 Received: (at 78686) by debbugs.gnu.org; 6 Jun 2025 16:35:16 +0000 Received: from localhost ([127.0.0.1]:44575 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uNa2G-0002ar-4F for submit@debbugs.gnu.org; Fri, 06 Jun 2025 12:35:16 -0400 Received: from mail-ej1-x62e.google.com ([2a00:1450:4864:20::62e]:45112) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1uNa2E-0002Y2-2n for 78686@debbugs.gnu.org; Fri, 06 Jun 2025 12:35:14 -0400 Received: by mail-ej1-x62e.google.com with SMTP id a640c23a62f3a-ad1d1f57a01so414130166b.2 for <78686@debbugs.gnu.org>; Fri, 06 Jun 2025 09:35:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1749227708; x=1749832508; darn=debbugs.gnu.org; h=mime-version:user-agent:message-id:date:references:in-reply-to :subject:cc:to:from:from:to:cc:subject:date:message-id:reply-to; bh=k2cFdjAaJKP7D45PYBc6GCJbLMNZX+cEocprhXuyTsg=; b=ZGqhgsDDAiA5WvO0bJ+2Vta1tRaJ67uRYkpsmXujS65qFYBU5pswQiTpwrZoOm6Qaq TH1n2ftYG0CpiojkPclt2d6dFQ9D4uRTWl8j/HAYcd2M94tRUVsBUzIlXD3vp90JSL1/ EnvNPA4UjJ/f+ycxzkVxxcpY8xVQOnW7+TVth6Dfu69pO0xWA2guEqsLUPJGDbvsI3qF kpA64bayNRDzziGjfDR6Sz4ivVE9BTyexDInJ6dTkXGDK4FQKfWAGThtCif/HhWpNchU CAN/pbQS24Mw+EN6KQH4AvmEHDrrs4ABt8U4DTFy1yWo9jb1alyGvLZoAdwZqT8MSphO a+3Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1749227708; x=1749832508; h=mime-version:user-agent:message-id:date:references:in-reply-to :subject:cc:to:from:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=k2cFdjAaJKP7D45PYBc6GCJbLMNZX+cEocprhXuyTsg=; b=FY+1ssMFAOKlRmvoaRNNOd2RuxYLDmjvNuHMvP/C9rR+vogMMZdkhmQwT7C73+oM14 jiib4MYMAfYjdtBehpGKtVwMP9qnoK5kzOjicLp7XFt5mymnPu5PNLU16lCdu0k4WROW +gu/f7nlbCJfhSvJBASHfvoiCrlqhppR7AuAXf+uhBpxQkBhNsS5nyPTpa5PFKyxDEkT pwaRAI83QL3SxUR6uHQFxALFgNCVq/TFLUhrFiONy6AHUUCzdT3BswHL53n79qd/SFUc k0hEoGI7atUvqyXIlt2iedqt9jCnBFYSUi2fqaO30TuAOatFIbXXy+HGXEdNtEdtcIx8 PjiQ== X-Gm-Message-State: AOJu0YxvSxgzKChsu9yIE9NPBIB4F0bPvJeVsw+6DogpHPMiaWTVa1F7 6RBwRtF3BqyEQEzUZNnFfsRtJvgG7oqXMeVWeRuFMHzvSjH+LHTDnMLw X-Gm-Gg: ASbGnct/iI3HrEtOlgPaV6tesFEJDtAm26EPXhA2DYVclMzg+8EkCvWrlhQfpsJgfSp h1184hc8ih/CdRvxNrw7wyNC4fzXGji1y9jbjAS3e7DSv7Zr01ysM0ebWiDlTCHgMqsai64RPKV jjOgClCRSOwssvRbWHP0u/+uy2FvvTD540ZxWUCsNgjYzCpy21wnxvJz4dfKxhLk6BG58Wlznfi rKwqQonzHID/kBknYTEilC1dSFMtbp7vt/vBbUf2JPveMYWwUdA+f2dVI/DxXQMbAhOjP/HYN0z U9wMYB/7kg96++8JRARpcc7yKVbqQAjanSnA3Ie+RFDZFW02cEvQ8pwY/WpSe58eNFV3Dfg4apa yvFuffuIN X-Google-Smtp-Source: AGHT+IGRdYogyQtI3giZVt1OsjagozshrGgfwasU8GBUksnJ9A6eqvwLvv6sJiQheLLP/J0vHmNghQ== X-Received: by 2002:a17:907:930a:b0:ad5:72d4:85ee with SMTP id a640c23a62f3a-ade1ab5ee12mr378988766b.53.1749227706888; Fri, 06 Jun 2025 09:35:06 -0700 (PDT) Received: from caladan (dial-184179.pool.broadband44.net. [212.46.184.179]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-ade1d7542b1sm140545366b.35.2025.06.06.09.35.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 06 Jun 2025 09:35:06 -0700 (PDT) From: Helmut Eller To: Pip Cet Subject: Re: bug#78686: feature/igc: [PATCH] Make charset_table an exact root In-Reply-To: <87h60t713g.fsf@protonmail.com> References: <87r000k3uv.fsf@gmail.com> <87bjr3ejpr.fsf@protonmail.com> <87qzzziber.fsf@gmail.com> <87cybj8dqo.fsf@protonmail.com> <875xhaip93.fsf@gmail.com> <874iwu8nuf.fsf@protonmail.com> <87ldq6gmyy.fsf@gmail.com> <87h60t713g.fsf@protonmail.com> Date: Fri, 06 Jun 2025 18:35:05 +0200 Message-ID: <8734cchlqe.fsf@gmail.com> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 78686 Cc: 78686@debbugs.gnu.org, Gerd =?utf-8?Q?M=C3=B6llmann?= X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) On Fri, Jun 06 2025, Pip Cet wrote: [...] >This fixes the problem of allocating BSS space. This is the one problem >that I've understood your patch to fix. I'm not saying that there >aren't other problems it fixes, and I'd like to understand what I'm >missing here, but I cannot think of any. There was nothing wrong with placing the charset table in the BSS section, because the table hardly ever grows. [...] >> Yes. Still not as good as my approach. > >I'm sorry, but can you try once more to point out what the advantage of >your approach is? I simply do not see it, and I'd like to understand. - fewer pinned objects - temacs and normal emacs are more similar Helmut