From debbugs-submit-bounces@debbugs.gnu.org Sat Aug 23 22:16:23 2025
Date: Sun, 24 Aug 2025 11:16:05 +0900
Message-ID: <86qzx1xy56.wl-shingo.fg8@gmail.com>
From: Shingo Tanaka
To: bug-gnu-emacs@gnu.org
Subject: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config

MS Windows has a utf-8 configuration with setting "Beta: Use Unicode UTF-8 for
worldwide language support" to on in its language settings.  In Japanese
environment, system coding is cp932 (MS version of Japanese SJIS) without the
setting but becomes cp65001 (utf-8) with the setting on.

Emacs looks like successfully detecting the change because locale-coding-system
gets from cp932 to cp65001 as expected, and working as expected overall.

However, I found a bug that format-time-string returns a day of the week string
encoded wrongly - with cp932 even in locale-coding-system is cp65001.

Conditions:
- Windows 11 Pro 24H2 with Japanese setting and "Beta: Use Unicode UTF-8
  for worldwide language support" on
- Emacs 30.2 (latest, https://ftp.gnu.org/gnu/emacs/windows/emacs-30/)

Here is how to reproduce.
1. Run Emacs with no-init-file
2. Go to *scratch* buffer and evaluate:
   (format-time-string "%y,%d,%m %A" (date-to-time (concat "2025,1,Jan")))
3. You will get below wrongly encoded string:
   "25,01,01 \220\205\227j\223\372"
4. evaluate:
   (decode-coding-string
    (format-time-string "%y,%d,%m %A" (date-to-time (concat "2025,1,Jan")))
    'cp932)
5. You will get below correctly decoded string:
   #("25,01,01 水曜日" 9 12 (charset cp932-2-byte))

This issue doesn't happen when "Beta: Use Unicode UTF-8 for worldwide language
support" off (locale-coding-system is cp932).

If you have any question or need further information, please let me know.

Regards,
Shingo

From debbugs-submit-bounces@debbugs.gnu.org Sun Aug 24 02:16:12 2025
Date: Sun, 24 Aug 2025 09:15:59 +0300
Message-Id: <86bjo58ctc.fsf@gnu.org>
From: Eli Zaretskii
To: Shingo Tanaka, Paul Eggert, Bruno Haible
Cc: 79296@debbugs.gnu.org
In-Reply-To: <86qzx1xy56.wl-shingo.fg8@gmail.com> (message from Shingo Tanaka on Sun, 24 Aug 2025 11:16:05 +0900)
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
References: <86qzx1xy56.wl-shingo.fg8@gmail.com>

> Date: Sun, 24 Aug 2025 11:16:05 +0900
> From: Shingo Tanaka
>
> MS Windows has a utf-8 configuration with setting "Beta: Use Unicode UTF-8 for
> worldwide language support" to on in its language settings.  In Japanese
> environment, system coding is cp932 (MS version of Japanese SJIS) without the
> setting but becomes cp65001 (utf-8) with the setting on.
>
> Emacs looks like successfully detecting the change because locale-coding-system
> gets from cp932 to cp65001 as expected, and working as expected overall.
>
> However, I found a bug that format-time-string returns a day of the week string
> encoded wrongly - with cp932 even in locale-coding-system is cp65001.
>
> Conditions:
> - Windows 11 Pro 24H2 with Japanese setting and "Beta: Use Unicode UTF-8
>   for worldwide language support" on
> - Emacs 30.2 (latest, https://ftp.gnu.org/gnu/emacs/windows/emacs-30/)
>
> Here is how to reproduce.
> 1. Run Emacs with no-init-file
> 2. Go to *scratch* buffer and evaluate:
>    (format-time-string "%y,%d,%m %A" (date-to-time (concat "2025,1,Jan")))
> 3. You will get below wrongly encoded string:
>    "25,01,01 \220\205\227j\223\372"
> 4. evaluate:
>    (decode-coding-string
>     (format-time-string "%y,%d,%m %A" (date-to-time (concat "2025,1,Jan")))
>     'cp932)
> 5. You will get below correctly decoded string:
>    #("25,01,01 水曜日" 9 12 (charset cp932-2-byte))
>
> This issue doesn't happen when "Beta: Use Unicode UTF-8 for worldwide language
> support" off (locale-coding-system is cp932).

Thanks.

I think this is an issue with Gnulib, whose nstrftime function we use
to format the time in Emacs: it seems to produce time strings encoded
in cp932 even though the UTF-8 support is turned on on MS-Windows.
I've added the Gnulib folks to the discussion.

Bruno and Paul, does Gnulib's nstrftime support the UTF-8 system
codepage on MS-Windows?  I see some COMPILE_WIDE preprocessor
conditions in the source, but it is not clear to me whether it is
necessary for Unicode support, and whether strftime (as opposed to
wcsftime) from the Windows C runtime properly supports this "beta
feature".  Do you happen to know?

The "\220\205\227j\223\372" bytestream shown above is AFAICT the correct
text properly encoded in cp932, so if we cannot get Windows and Gnulib
to produce a UTF-8 string in this case, we might need as the last
resort to use cp932 when decoding time strings, even if
locale-coding-system is UTF-8 on MS-Windows.

Shingo Tanaka, could you please tell what is the value of
w32-multibyte-code-page on your system, both when "Beta: Use Unicode
UTF-8 for worldwide language support" is ON and when it is OFF?
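
Editorial aside: the claim that the "\220\205\227j\223\372" bytes are valid cp932 but not valid UTF-8 can be checked outside Emacs. The following is a minimal sketch in Python (not used anywhere in the thread; the variable names are illustrative only), relying solely on the standard `cp932` and `utf-8` codecs. It does not reproduce the Windows or Gnulib behaviour itself, only the byte-level observation.

```python
# Bytes from the report, written with the same octal escapes Emacs printed.
raw = b"\220\205\227j\223\372"

# Decoding as cp932 recovers the Japanese weekday name from the report.
weekday = raw.decode("cp932")

# The same bytes are rejected by a strict UTF-8 decoder, which is consistent
# with Emacs leaving them as raw bytes when locale-coding-system is cp65001.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# For comparison: the bytes a UTF-8 encoding of the same text would contain.
expected_utf8 = weekday.encode("utf-8")

print(weekday, valid_utf8, expected_utf8)
```

The six reported bytes map to three double-byte cp932 characters, whereas the UTF-8 encoding of the same text takes nine bytes, so the two encodings cannot be confused here.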

From debbugs-submit-bounces@debbugs.gnu.org Sun Aug 24 03:13:33 2025
From: Bruno Haible
To: Shingo Tanaka, Eli Zaretskii
Cc: 79296@debbugs.gnu.org, Paul Eggert
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
Date: Sun, 24 Aug 2025 09:13:22 +0200
Message-ID: <5043940.kys9EeIHyz@nimes>
Organization: GNU
In-Reply-To: <86bjo58ctc.fsf@gnu.org>
References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <86bjo58ctc.fsf@gnu.org>

Hi Eli,

> > 2. Go to *scratch* buffer and evaluate:
> >    (format-time-string "%y,%d,%m %A" (date-to-time (concat "2025,1,Jan")))
> > 3. You will get below wrongly encoded string:
> >    "25,01,01 \220\205\227j\223\372"
> ...
> I think this is an issue with Gnulib, whose nstrftime function we use
> to format the time in Emacs: it seems to produce time strings encoded
> in cp932 even though the UTF-8 support is turned on on MS-Windows.
> I've added the Gnulib folks to the discussion.
>
> Bruno and Paul, does Gnulib's nstrftime support the UTF-8 system
> codepage on MS-Windows?

* Facts:
  - Gnulib supports the UTF-8 system codepage of Windows, since 2024-12-23.
    It includes some unit tests, namely gnulib/tests/*w32utf8* .
  - This UTF-8 system codepage is only supported with Microsoft UCRT, not
    with the MSVCRT. At compile time, this configuration can be tested via
    '#ifdef _UCRT'. (This is true for both the mingw and the MSVC toolchains.)

* Hypothesis 1:
  The Gnulib support included in Emacs 30.2 is older than 2024-12-23.

* Hypothesis 2:
  The Gnulib support included in Emacs 30.2 misses the commits
  https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=927a70e0853345315570f051fd6996cfeb7b4d96
  https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=9f7ff4f423cd805866cd4edef806c32393621df0
  https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=00211fc69c926d6c8f6e3f3cf1d8802623db2af9
  https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=8e795a8d9f8c3269a3d30d0d1adbaf0ea9ad4a84

* Hypothesis 3:
  The Emacs 30.2 binaries are linked with MSVCRT, not with UCRT.

* Hypothesis 4:
  Enabling the option "Beta: Use Unicode UTF-8 for worldwide language support"
  has a different effect than creating a .manifest file like the Gnulib
  test suite does.


Hypothesis 4 sounds unlikely.

> I see some COMPILE_WIDE preprocessor
> conditions in the source, but it is not clear to me whether it is
> necessary for Unicode support

This COMPILE_WIDE condition is needed only by glibc for the wcsftime() function.
It is not used by Gnulib. It is not needed for i18n or Unicode support.

* Actions:
  - Bruno: Add a unit test for nstrftime in w32utf8 mode.
  - Eli or Paul: Disprove hypotheses 1, 2, 3.

> Shingo Tanaka, could you please tell what is the value of
> w32-multibyte-code-page on your system, both when "Beta: Use Unicode
> UTF-8 for worldwide language support" is ON and when it is OFF?

Yes, this info would be useful.

Bruno

From debbugs-submit-bounces@debbugs.gnu.org Sun Aug 24 03:27:47 2025
Date: Sun, 24 Aug 2025 10:27:34 +0300
Message-Id: <86349h89i1.fsf@gnu.org>
From: Eli Zaretskii
To: Bruno Haible, Corwin Brust
Cc: 79296@debbugs.gnu.org, eggert@cs.ucla.edu, shingo.fg8@gmail.com
In-Reply-To: <5043940.kys9EeIHyz@nimes> (message from Bruno Haible on Sun, 24 Aug 2025 09:13:22 +0200)
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <86bjo58ctc.fsf@gnu.org> <5043940.kys9EeIHyz@nimes>

> From: Bruno Haible
> Cc: Paul Eggert, 79296@debbugs.gnu.org
> Date: Sun, 24 Aug 2025 09:13:22 +0200
>
> > > 2. Go to *scratch* buffer and evaluate:
> > >    (format-time-string "%y,%d,%m %A" (date-to-time (concat "2025,1,Jan")))
> > > 3. You will get below wrongly encoded string:
> > >    "25,01,01 \220\205\227j\223\372"
> > ...
> > I think this is an issue with Gnulib, whose nstrftime function we use
> > to format the time in Emacs: it seems to produce time strings encoded
> > in cp932 even though the UTF-8 support is turned on on MS-Windows.
> > I've added the Gnulib folks to the discussion.
> >
> > Bruno and Paul, does Gnulib's nstrftime support the UTF-8 system
> > codepage on MS-Windows?
>
> * Facts:
>   - Gnulib supports the UTF-8 system codepage of Windows, since 2024-12-23.
>     It includes some unit tests, namely gnulib/tests/*w32utf8* .
>   - This UTF-8 system codepage is only supported with Microsoft UCRT, not
>     with the MSVCRT. At compile time, this configuration can be tested via
>     '#ifdef _UCRT'. (This is true for both the mingw and the MSVC toolchains.)
>
> * Hypothesis 1:
>   The Gnulib support included in Emacs 30.2 is older than 2024-12-23.
>
> * Hypothesis 2:
>   The Gnulib support included in Emacs 30.2 misses the commits
>   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=927a70e0853345315570f051fd6996cfeb7b4d96
>   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=9f7ff4f423cd805866cd4edef806c32393621df0
>   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=00211fc69c926d6c8f6e3f3cf1d8802623db2af9
>   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=8e795a8d9f8c3269a3d30d0d1adbaf0ea9ad4a84
>
> * Hypothesis 3:
>   The Emacs 30.2 binaries are linked with MSVCRT, not with UCRT.
>
> * Hypothesis 4:
>   Enabling the option "Beta: Use Unicode UTF-8 for worldwide language support"
>   has a different effect than creating a .manifest file like the Gnulib
>   test suite does.
>
>
> Hypothesis 4 sounds unlikely.
>
> > I see some COMPILE_WIDE preprocessor
> > conditions in the source, but it is not clear to me whether it is
> > necessary for Unicode support
>
> This COMPILE_WIDE condition is needed only by glibc for the wcsftime() function.
> It is not used by Gnulib. It is not needed for i18n or Unicode support.
>
> * Actions:
>   - Bruno: Add a unit test for nstrftime in w32utf8 mode.
>   - Eli or Paul: Disprove hypotheses 1, 2, 3.

Hypothesis 3 is actually for Corwin (CC'ed), since he built that
binary.

> > Shingo Tanaka, could you please tell what is the value of
> > w32-multibyte-code-page on your system, both when "Beta: Use Unicode
> > UTF-8 for worldwide language support" is ON and when it is OFF?
>
> Yes, this info would be useful.

Thanks.

From debbugs-submit-bounces@debbugs.gnu.org Sun Aug 24 04:14:50 2025
Date: Sun, 24 Aug 2025 17:14:37 +0900
Message-ID: <87jz2tkufm.wl-shingo.fg8@gmail.com>
From: Shingo Tanaka
To: Bruno Haible, Eli Zaretskii
Cc: 79296@debbugs.gnu.org, Paul Eggert, Shingo Tanaka
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
In-Reply-To: <5043940.kys9EeIHyz@nimes>
References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <86bjo58ctc.fsf@gnu.org> <5043940.kys9EeIHyz@nimes>

Hi Eli, Bruno,

> > Shingo Tanaka, could you please tell what is the value of
> > w32-multibyte-code-page on your system, both when "Beta: Use Unicode
> > UTF-8 for worldwide language support" is ON and when it is OFF?
>
> Yes, this info would be useful.

Here is the results.  Looks like the value of w32-multibyte-code-page is wrong?

- "Beta: Use Unicode UTF-8 for worldwide language support": OFF

w32-system-coding-system
cp932

w32-multibyte-code-page
932

w32-ansi-code-page
932

- "Beta: Use Unicode UTF-8 for worldwide language support": ON

w32-system-coding-system
cp65001

w32-multibyte-code-page
0

w32-ansi-code-page
65001

Please let me know if you need further information.

-- 
Shingo

From debbugs-submit-bounces@debbugs.gnu.org Sun Aug 24 05:12:32 2025
Date: Sun, 24 Aug 2025 12:12:20 +0300
Message-Id: <86qzx16q2z.fsf@gnu.org>
From: Eli Zaretskii
To: Shingo Tanaka
Cc: 79296@debbugs.gnu.org, shingo.fg8@gmail.com, eggert@cs.ucla.edu, bruno@clisp.org
In-Reply-To: <87jz2tkufm.wl-shingo.fg8@gmail.com> (message from Shingo Tanaka on Sun, 24 Aug 2025 17:14:37 +0900)
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <86bjo58ctc.fsf@gnu.org> <5043940.kys9EeIHyz@nimes> <87jz2tkufm.wl-shingo.fg8@gmail.com>

> Date: Sun, 24 Aug 2025 17:14:37 +0900
> From: Shingo Tanaka
> Cc: Shingo Tanaka,
>     Paul Eggert,
>     79296@debbugs.gnu.org
>
> Hi Eli, Bruno,
>
> > > Shingo Tanaka, could you please tell what is the value of
> > > w32-multibyte-code-page on your system, both when "Beta: Use Unicode
> > > UTF-8 for worldwide language support" is ON and when it is OFF?
> >
> > Yes, this info would be useful.
>
> Here is the results.  Looks like the value of w32-multibyte-code-page is wrong?
>
> - "Beta: Use Unicode UTF-8 for worldwide language support": OFF
>
> w32-system-coding-system
> cp932
>
> w32-multibyte-code-page
> 932
>
> w32-ansi-code-page
> 932
>
> - "Beta: Use Unicode UTF-8 for worldwide language support": ON
>
> w32-system-coding-system
> cp65001
>
> w32-multibyte-code-page
> 0
>
> w32-ansi-code-page
> 65001
>
> Please let me know if you need further information.

Thanks.  This means w32-multibyte-code-page doesn't provide a good way
of detecting Japanese Windows where the system codepage was changed to
be UTF-8.  What other aspects of your environment can be evidence that
this is the case?

Please tell what do the following produce when evaluated via "M-:" in
a running Emacs session.  Please show the values both when "Beta: Use
Unicode UTF-8 for worldwide language support" is ON and when it is
OFF.

  (w32-get-current-locale-id)
  (w32-get-locale-info (w32-get-current-locale-id))
  (w32-get-default-locale-id)
  (w32-get-locale-info (w32-get-default-locale-id))
  (w32-get-console-codepage)
  (w32-get-console-output-codepage)

Hopefully, some of these will allow us to identify the combination of
a Japanese Windows with UTF-8 as a system codepage.
From debbugs-submit-bounces@debbugs.gnu.org Sun Aug 24 05:50:09 2025 Received: (at 79296) by debbugs.gnu.org; 24 Aug 2025 09:50:09 +0000 Received: from localhost ([127.0.0.1]:43665 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uq7MX-000896-AY for submit@debbugs.gnu.org; Sun, 24 Aug 2025 05:50:09 -0400 Received: from mail-pg1-x529.google.com ([2607:f8b0:4864:20::529]:45276) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1uq7MT-00084B-Bb for 79296@debbugs.gnu.org; Sun, 24 Aug 2025 05:50:07 -0400 Received: by mail-pg1-x529.google.com with SMTP id 41be03b00d2f7-b47174beb13so2400190a12.2 for <79296@debbugs.gnu.org>; Sun, 24 Aug 2025 02:50:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1756028999; x=1756633799; darn=debbugs.gnu.org; h=mime-version:user-agent:references:in-reply-to:subject:cc:to:from :message-id:date:from:to:cc:subject:date:message-id:reply-to; bh=Qsoh/XjslN07dbmEhEx/y+ZKtYRmKcmicllPDI8lCoo=; b=i/yaz9UYTVEAHGD9MgF90tMxSUNmlq6+yxjuWLhwiRuUMHvt+MHO1ZipcQR4in/2qc q7eKFaQw2HL2Ke7fiONaI1AWg66K2NRR4pIIpJHvK8UYS6K1qVbnX+BhRnrmf3dgpcCU /58JxJvVP5sZMBS7ptn6OjDuBc7Cp2yjDkVTtwCjdXkKblkffjGRoUWX6QrIt3zfu88b bEOAU0bD+8U1OaabOeFGzTCHBEpbu9aJ3Us0ySc8aA/IPgfpt6R/WVdzf4L7N0sF+U+7 Aoaqm/SH+BhA2T8wzepWct93EfxqIKV3EAfKTRPsu4k14C2+JC2xm5jGpnqEdo8YpkML Glfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1756028999; x=1756633799; h=mime-version:user-agent:references:in-reply-to:subject:cc:to:from :message-id:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=Qsoh/XjslN07dbmEhEx/y+ZKtYRmKcmicllPDI8lCoo=; b=ZcmS25t9vC5zz3KM7HvSleTddxn7rrV0F6UJ9cINRjDEDXdstRD/RmsplFCKqFjh9a 1yofN9AQ6gi1UXB4o4GXj21POBDKD1N+p8WuxViah6gCrDZ3KM+lRz/2W2G1lDKZkbQi 1L/TgCV0HNR2xtYorvaF/Ba3tN3n6s95dUIc8XdOUnnNiMidKEPLtDm4rqm4DbOiUE6g 
3PqL8eEuXNxP9OS91twVNQ2Ap6rlDjjVM5vWG4v4eTIcXE3Nr5R3vMorLMo03ocpZUrz J+zOOwf6vcvHYuFcip6e7z42p9Dpwx6KiTRjoEMBdlDytZf4Q9SrjE4WzQ9ckVw6Log6 ToCQ== X-Forwarded-Encrypted: i=1; AJvYcCVCVy8jEcp9eMyb9Hk/j9xpJSSa1zDYbZc41VTDA1KWHz19R7/XfibkEj0RXQYFuRZ0SnMyQQ==@debbugs.gnu.org X-Gm-Message-State: AOJu0YxRH+YZHeF8Z2Almaf/kA/kyMIRwffqA/ptbuaUKoA7U4sE3i/J WZtub0ePSr79K0IVstJXFPCbY7JNv94g8kaLd8Qz+v/lfE43N7VvAxY4 X-Gm-Gg: ASbGncuKlQVhKc9sObUhQqjQmSSD23Y/7at1BS0ivvKrts3kT9hnV69fPYoAEVBHs9X +TRA+xhGbZ5iQuSjYKoJA5WsI/qwtXfoTnSnGlMgVjE+wa+foxRdCVaq5rc0T4RBXiBAZjE7wOS h7m8NwbVw7OXV3p4h3gf3ussm2XqQgRO9bVscRcKfvbIKd1bXzRcSICDIyqgNvlooeSlgJkgxXX shCAUia/3z4OfVgV/BwMPMzL4ixtf/W26NSMC8xQ7UP2KW/5aECEJtvVXMgwSmAUWxQZWtKUQ86 o9qZ+X4MM6+Y1KU2908QwPGF0ckv6Sr1QAY1RL0rwT/XoTpPMvIl0iTQ7nyqiDJIqcK4lFoLgJ/ 3NDZsvXp+R4hRUjfKBv4N9nTnyYSG6m4p4SbF7ZjTFq8c5A== X-Google-Smtp-Source: AGHT+IEQsxiU1DEcpxb9ftfIvWSmSTXU5SC/bJ2H/cnHqOi1jRVKKZrO8rVFZ8nvQaazPaQV0qcgFA== X-Received: by 2002:a17:902:f645:b0:240:3915:99d8 with SMTP id d9443c01a7336-2462ef542d9mr116883135ad.47.1756028998957; Sun, 24 Aug 2025 02:49:58 -0700 (PDT) Received: from DESKTOP-CQRAB2T.gmail.com ([240d:1a:6f4:6b00:416d:20e:a728:efd0]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2466889d267sm39768865ad.136.2025.08.24.02.49.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 24 Aug 2025 02:49:58 -0700 (PDT) Date: Sun, 24 Aug 2025 18:49:54 +0900 Message-ID: <86tt1xf3r1.wl-shingo.fg8@gmail.com> From: Shingo Tanaka To: Eli Zaretskii Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config In-Reply-To: <86qzx16q2z.fsf@gnu.org> References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <86bjo58ctc.fsf@gnu.org> <5043940.kys9EeIHyz@nimes> <87jz2tkufm.wl-shingo.fg8@gmail.com> <86qzx16q2z.fsf@gnu.org> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM-LB/1.14.9 (=?ISO-8859-4?Q?Goj=F2?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/30.2 (x86_64-w64-mingw32) 
MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: text/plain; charset=US-ASCII X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 79296 Cc: 79296@debbugs.gnu.org, bruno@clisp.org, eggert@cs.ucla.edu, Shingo Tanaka X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.7 (/)

On Mon, 25 Aug 2025 03:12:20 +0900,
Eli Zaretskii wrote:
>
> Thanks.  This means w32-multibyte-code-page doesn't provide a good way
> of detecting Japanese Windows where the system codepage was changed to
> be UTF-8.  What other aspects of your environment can be evidence that
> this is the case?  Please tell what the following produce when
> evaluated via "M-:" in a running Emacs session.  Please show the
> values both when "Beta: Use Unicode UTF-8 for worldwide language
> support" is ON and when it is OFF.
>
>   (w32-get-current-locale-id)
>   (w32-get-locale-info (w32-get-current-locale-id))
>   (w32-get-default-locale-id)
>   (w32-get-locale-info (w32-get-default-locale-id))
>   (w32-get-console-codepage)
>   (w32-get-console-output-codepage)

Here you are.
- "Beta: Use Unicode UTF-8 for worldwide language support": OFF

  w32-system-coding-system: cp932
  w32-multibyte-code-page: 932
  w32-ansi-code-page: 932
  (w32-get-current-locale-id): 1041
  (w32-get-locale-info (w32-get-current-locale-id)): "JPN"
  (w32-get-default-locale-id): 1041
  (w32-get-locale-info (w32-get-default-locale-id)): "JPN"
  (w32-get-console-codepage): 932
  (w32-get-console-output-codepage): 932

- "Beta: Use Unicode UTF-8 for worldwide language support": ON

  w32-system-coding-system: cp65001
  w32-multibyte-code-page: 0
  w32-ansi-code-page: 65001
  (w32-get-current-locale-id): 1041
  (w32-get-locale-info (w32-get-current-locale-id)): "JPN"
  (w32-get-default-locale-id): 1041
  (w32-get-locale-info (w32-get-default-locale-id)): "JPN"
  (w32-get-console-codepage): 65001
  (w32-get-console-output-codepage): 65001

--
Shingo

From debbugs-submit-bounces@debbugs.gnu.org Sun Aug 24 06:41:51 2025 Received: (at 79296) by debbugs.gnu.org; 24 Aug 2025 10:41:51 +0000 Received: from localhost ([127.0.0.1]:43832 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uq8AZ-0002J1-2k for submit@debbugs.gnu.org; Sun, 24 Aug 2025 06:41:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:54514) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1uq8AW-0002Ik-76 for 79296@debbugs.gnu.org; Sun, 24 Aug 2025 06:41:49 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uq8AQ-0000qi-Hp; Sun, 24 Aug 2025 06:41:42 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=GPEGUzos+FMag6Z6dl833z8ZCKP8iGwvVtmg8+d8Ea4=; b=cLWvofvSxC5a eIjliWmcUua1jmVKnyduTbbh2RYjFdfGIbeRM/TBxU8vllketJ3AY6iu4yqbdjJXn+XLhOyprsHD9 hTzevcFQzkIVHrLXXLjfi9iNJlAyS4hCivrXjiU7+hQ5v7raCVuIwFaN7mMoPGI4tbw0GVhh/S8KD
zDBCfNGltj3pvcPQTLsGFvS20J8dW3u5kP/bQDt5vt/i42Y/WjulZqEg1rnFlH8sTSySLB0fl0h9e RbMgIRQ4mldA6y2Nx2Vsr8j+/5miOGlGbc2h4vXfAQTLMDulryU/rGS1pDDoifuAMfpAGDVAGBDZs Mwkbv8aJZFkKs56EiE4F0A==; Date: Sun, 24 Aug 2025 13:41:39 +0300 Message-Id: <86ms7p6ly4.fsf@gnu.org> From: Eli Zaretskii To: Bruno Haible In-Reply-To: <5043940.kys9EeIHyz@nimes> (message from Bruno Haible on Sun, 24 Aug 2025 09:13:22 +0200) Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <86bjo58ctc.fsf@gnu.org> <5043940.kys9EeIHyz@nimes> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79296 Cc: 79296@debbugs.gnu.org, eggert@cs.ucla.edu, shingo.fg8@gmail.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Bruno Haible > Cc: Paul Eggert , 79296@debbugs.gnu.org > Date: Sun, 24 Aug 2025 09:13:22 +0200 > > * Hypothesis 1: > The Gnulib support included in Emacs 30.2 is older than 2024-12-23. How does one know? I see Paul last ran admin/merge-gnulib on the emacs-30 release branch on Aug 2, 2025, but maybe this is not what I should be looking at? In any case, Dec 2024 sounds too old even for the release branch. 
> * Hypothesis 2:
>   The Gnulib support included in Emacs 30.2 misses the commits
>   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=927a70e0853345315570f051fd6996cfeb7b4d96
>   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=9f7ff4f423cd805866cd4edef806c32393621df0
>   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=00211fc69c926d6c8f6e3f3cf1d8802623db2af9
>   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=8e795a8d9f8c3269a3d30d0d1adbaf0ea9ad4a84

These commits are in Gnulib files that are not used in Emacs.  What
are their effects on the issue at hand, which is the non-ASCII strings
produced by Gnulib's nstrftime?

>   - This UTF-8 system codepage is only supported with Microsoft UCRT, not
>     with the MSVCRT. At compile time, this configuration can be tested via
>     '#ifdef _UCRT'. (This is true for both the mingw and the MSVC toolchains.)

What is it in UCRT that is required for Gnulib to support the UTF-8
system codepage on Windows, in particular for strftime?  IOW, what
does the UCRT implementation of libc do that the MSVCRT one doesn't,
that affects this aspect of Gnulib's strftime?

> * Hypothesis 4:
>   Enabling the option "Beta: Use Unicode UTF-8 for worldwide language support"
>   has a different effect than creating a .manifest file like the Gnulib
>   test suite does.
>

This is about defining a process-specific codepage, which is not what
happens in this case.  So I don't think it's relevant.

> * Actions:
>   - Bruno: Add a unit test for nstrftime in w32utf8 mode.

I'd be interested to see how this test works for you.

>   - Eli or Paul: Disprove hypotheses 1, 2, 3.
>
> > Shingo Tanaka, could you please tell what is the value of
> > w32-multibyte-code-page on your system, both when "Beta: Use Unicode
> > UTF-8 for worldwide language support" is ON and when it is OFF?
>
> Yes, this info would be useful.
The upshot is that we can only reliably know the system's language ID (0x11), but it is still a mystery for me where did strftime take cp932 with which it encoded the time-related strings. Because all the other APIs I know about which report codepages all say it's UTF-8. From debbugs-submit-bounces@debbugs.gnu.org Sun Aug 24 11:32:31 2025 Received: (at 79296) by debbugs.gnu.org; 24 Aug 2025 15:32:32 +0000 Received: from localhost ([127.0.0.1]:46076 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uqChr-0002og-H7 for submit@debbugs.gnu.org; Sun, 24 Aug 2025 11:32:31 -0400 Received: from mail.cs.ucla.edu ([131.179.128.66]:40416) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1uqChp-0002oQ-OA for 79296@debbugs.gnu.org; Sun, 24 Aug 2025 11:32:30 -0400 Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id D91FD3C306631; Sun, 24 Aug 2025 08:32:23 -0700 (PDT) Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10032) with ESMTP id Pe_7NNhKqq5s; Sun, 24 Aug 2025 08:32:23 -0700 (PDT) Received: from localhost (localhost [127.0.0.1]) by mail.cs.ucla.edu (Postfix) with ESMTP id B07303C306632; Sun, 24 Aug 2025 08:32:23 -0700 (PDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.cs.ucla.edu B07303C306632 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cs.ucla.edu; s=9D0B346E-2AEB-11ED-9476-E14B719DCE6C; t=1756049543; bh=tT8ExYT7fLoehTN4VmXQLOKjw3wOOGnmRKR+AbPHnW0=; h=Message-ID:Date:MIME-Version:To:From; b=fED4J+M/CA5ipP8+L04SjwGwSzZLq8vlI9RLim/PBRcxKKRXsJkgLesh5jscPZCCf IQpZTvD9cp8eIZS0fLzW8AJqZ9uVhbacj6izrvFj8m1WZ89qZSMm1yvFOMILTTkXos wxA7VBRMLLvhhjJ1SlwpeIsb1ebSrbpfoMsrsAr8lp4QdmQqOgK48sbWdyz0bpvQBY s54bFUK90BgvwP5mtHUl7u4YuPNE/I86yORv40EeNd8g3RSLZwUNsloXufO550fYD3 Z3C8/zkjALv5mF0WfGuiAPFtAESDVIFykVm5TVaV6FeH4tFtHoahY/0rWpgnzU/PO4 Xi4GbJzqgCafg== X-Virus-Scanned: amavis at 
mail.cs.ucla.edu Received: from mail.cs.ucla.edu ([127.0.0.1]) by localhost (mail.cs.ucla.edu [127.0.0.1]) (amavis, port 10026) with ESMTP id 0lAv8oywPE1R; Sun, 24 Aug 2025 08:32:23 -0700 (PDT) Received: from penguin.cs.ucla.edu (47-154-18-19.fdr01.snmn.ca.ip.frontiernet.net [47.154.18.19]) by mail.cs.ucla.edu (Postfix) with ESMTPSA id 898E13C306631; Sun, 24 Aug 2025 08:32:23 -0700 (PDT) Message-ID: <79dfe790-3a38-4b8b-92ae-955f1b18535a@cs.ucla.edu> Date: Sun, 24 Aug 2025 08:32:23 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config To: Eli Zaretskii References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <86bjo58ctc.fsf@gnu.org> <5043940.kys9EeIHyz@nimes> <86ms7p6ly4.fsf@gnu.org> Content-Language: en-US From: Paul Eggert Organization: UCLA Computer Science Department In-Reply-To: <86ms7p6ly4.fsf@gnu.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 79296 Cc: 79296@debbugs.gnu.org, Bruno Haible , shingo.fg8@gmail.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) On 2025-08-24 03:41, Eli Zaretskii wrote: >> From: Bruno Haible >> Date: Sun, 24 Aug 2025 09:13:22 +0200 >> >> * Hypothesis 1: >> The Gnulib support included in Emacs 30.2 is older than 2024-12-23. > > How does one know? I see Paul last ran admin/merge-gnulib on the > emacs-30 release branch on Aug 2, 2025, but maybe this is not what I > should be looking at? In any case, Dec 2024 sounds too old even for > the release branch. I ran admin/merge-gnulib on bleeding-edge Gnulib as usual, so the release branch should have Gnulib as of August 2. 
From debbugs-submit-bounces@debbugs.gnu.org Mon Aug 25 18:34:36 2025 Received: (at 79296) by debbugs.gnu.org; 25 Aug 2025 22:34:36 +0000 Received: from localhost ([127.0.0.1]:51887 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uqflr-0001jx-Q9 for submit@debbugs.gnu.org; Mon, 25 Aug 2025 18:34:36 -0400 Received: from mo4-p01-ob.smtp.rzone.de ([85.215.255.52]:45191) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1uqflf-0001jW-Rf for 79296@debbugs.gnu.org; Mon, 25 Aug 2025 18:34:26 -0400 ARC-Seal: i=1; a=rsa-sha256; t=1756161257; cv=none; d=strato.com; s=strato-dkim-0002; b=a8vVdjbcrYW9nxMNogrkxVTXg44hh0YIodJoi4qMTjF096mpkZrllgjLtTfeT4panQ fbIw1h1biF0T1UhbWwJZM82x6TVNUhorAF8YS7Gs7WDBv8rxMXvyk5mks3SZmWeQ/lgK sNWrH/VkFSlemBRwS+d1jyABqqumtfxq3XdwTI7Tio4imlYM1nz7cGtEKmjZQAz00r5i UD/eRudZOXXG3SUcYVS01XqyRnne8L4PFu7JpU/sOkN8VWbwVGbmlY6Xl3U40xa1sG/w VcoCBuYKplv2ULO2+e+C+8Dvr9SNPN6VxhXJ3Oc2zHZ3wf9IcqKAP+3YCnOLs3RtIKJ1 d9cQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; t=1756161257; s=strato-dkim-0002; d=strato.com; h=References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Cc:Date: From:Subject:Sender; bh=QrHflULI/ZxJOWDK3E4WKbrhG4h14qpqZ7CwLZd+Gq8=; b=OQYZu/1ltiDUpjUNo8ungW+kPkTqtzoa56b7gjOsZ5Xsn2Ka/Hna8rD4QLgxC/Ob+1 txZI0D6QWDGyo1D9cgD2EcTqytOwyTQfXbO3hCW9WZw3pJM0prvbquLDoWsBTjJWJOxL cxEdWGPA2Sa0MwF0F797c1SGO+7sZD7sD2/wQblP4KywQ+w8ot8T7MvCsSEegK2XrYku oahglZXejTB1Tu9YaFfENrwKeF0XwoYvjfJ3CZc+c54xFKC+6oHIA0XymlM6b1WCJn6G XwsxpO33XFafaWVQITHylE1Y2K1bTt15qFkn9Y5yrTf00dtQyl5HVyLOdH1iglPUn+QI Codg== ARC-Authentication-Results: i=1; strato.com; arc=none; dkim=none X-RZG-CLASS-ID: mo01 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; t=1756161257; s=strato-dkim-0002; d=clisp.org; h=References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Cc:Date: From:Subject:Sender; bh=QrHflULI/ZxJOWDK3E4WKbrhG4h14qpqZ7CwLZd+Gq8=; 
b=cCgOIaykt74chW9upnWpZH4y9bOtCb3xcyPE1t6IrZELrHkYe1zjCkxNy+yptuY8R7 eOmOk0JRI04PQ+6bvJfMrilVYya+oRFHPprC7okAZIclfV2Hn3ZpLpm3jz0AncnlnotZ 2HSTgvJDevb4BBH0jGx7PxtrovPHYL3zuXLU1lSYZiMxeQbDNTHN/JffkDZBgpfqVMqV O2aUQ4jHAYUtC7XtSWF74qtnS1VPSY7+kBIKYz5Q0mEgmB0FSfRbi3Z7UNSwuTXNNXC9 D6N5WNDp8kcAbprq9kEIf4VxeL+hU40Vak5vS0OVNdEYExJqoIbk+e5lMBHP4Yl6fy0f 6g3A== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; t=1756161257; s=strato-dkim-0003; d=clisp.org; h=References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Cc:Date: From:Subject:Sender; bh=QrHflULI/ZxJOWDK3E4WKbrhG4h14qpqZ7CwLZd+Gq8=; b=6vMB33/tbST1bqQHal7zTo3/FHth+OgoEjVCLMCMthnWdu/BVhKtTzAPdLukVtdSge r3mW6/Mn3DKxZEd4BjAg== X-RZG-AUTH: ":Ln4Re0+Ic/6oZXR1YgKryK8brlshOcZlLnY4jECd2hdUURIbZgL8PX2QiTuZ3cdB8X/nqmydFjit8tJ0y+qFDHfUY/Ykqrbk" Received: from nimes.localnet by smtp.strato.de (RZmta 52.1.2 AUTH) with ESMTPSA id N9ae6317PMYGSfB (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256 bits)) (Client did not present a certificate); Tue, 26 Aug 2025 00:34:16 +0200 (CEST) From: Bruno Haible To: Eli Zaretskii , Corwin Brust Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config Date: Tue, 26 Aug 2025 00:34:16 +0200 Message-ID: <4361825.HVULnkfqZJ@nimes> Organization: GNU In-Reply-To: <86ms7p6ly4.fsf@gnu.org> References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <5043940.kys9EeIHyz@nimes> <86ms7p6ly4.fsf@gnu.org> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" X-Spam-Score: 0.0 (/) X-Debbugs-Envelope-To: 79296 Cc: 79296@debbugs.gnu.org, eggert@cs.ucla.edu, shingo.fg8@gmail.com X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) > * Actions: > - Bruno: Add a unit test for nstrftime in 
>     w32utf8 mode.

Done. The test verifies that nstrftime produces the Japanese weekday
in UTF-8 encoding. It passes, provided the locale name used is
"Japanese_Japan.65001", *not* "Japanese_Japan.932".

Eli Zaretskii wrote:
> > * Hypothesis 1:
> >   The Gnulib support included in Emacs 30.2 is older than 2024-12-23.
>
> How does one know?  I see Paul last ran admin/merge-gnulib on the
> emacs-30 release branch on Aug 2, 2025

With Paul's newer comment, this hypothesis is falsified.

> > * Hypothesis 2:
> >   The Gnulib support included in Emacs 30.2 misses the commits
> >   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=927a70e0853345315570f051fd6996cfeb7b4d96
> >   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=9f7ff4f423cd805866cd4edef806c32393621df0
> >   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=00211fc69c926d6c8f6e3f3cf1d8802623db2af9
> >   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=8e795a8d9f8c3269a3d30d0d1adbaf0ea9ad4a84
>
> These commits are in Gnulib files that are not used in Emacs.  What
> are their effects on the issue at hand, which is the non-ASCII strings
> produced by Gnulib's nstrftime?

That's most likely the problem, then. For Emacs, the third commit should be
the essential one: It forces a setlocale() argument that ends in ".65001",
thus telling the Microsoft UCRT that you want the UTF-8 environment.

> >   - This UTF-8 system codepage is only supported with Microsoft UCRT, not
> >     with the MSVCRT. At compile time, this configuration can be tested via
> >     '#ifdef _UCRT'. (This is true for both the mingw and the MSVC toolchains.)
>
> What is it in UCRT that is required for Gnulib to support the UTF-8
> system codepage on Windows, in particular for strftime?  IOW, what
> does the UCRT implementation of libc do that the MSVCRT one doesn't,
> that affects this aspect of Gnulib's strftime?
Microsoft's UCRT has many changes compared to MSVCRT, probably worth of 10 years of development. Support for the UTF-8 environment is certainly only one of the many improvements. So, the remaining hypotheses are: * Hypothesis 2: The string that Emacs passes to the setlocale() function does not end in ".65001". * Hypothesis 3: The Emacs 30.2 binaries are linked with MSVCRT, not with UCRT. -> Corwin? Bruno From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 26 07:19:10 2025 Received: (at 79296) by debbugs.gnu.org; 26 Aug 2025 11:19:10 +0000 Received: from localhost ([127.0.0.1]:54580 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uqrhi-0007GB-H9 for submit@debbugs.gnu.org; Tue, 26 Aug 2025 07:19:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38762) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.84_2) (envelope-from ) id 1uqrhR-0007EF-Bc for 79296@debbugs.gnu.org; Tue, 26 Aug 2025 07:18:54 -0400 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uqrhG-0006sl-Ut; Tue, 26 Aug 2025 07:18:38 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=k4KYmjlzt4K5G1gjQJSHVJsQMp5fbhTk0fDL0EOPyWo=; b=k0F1SHyrP8XP W1Eqt9B9doJNvIdKtEhDyfRzP7++QGzUzSUclgxI20uOvEg6QT/X5Ml/9KpYE2+6K1opN8NgoJsKt w27sd9eqwFNxxQNvCDf9F81u4XMjOh4RmSrpctrtzhijCCVDVF3bpoHVF1HkAbQoN8iHESu45nDpU uQCHvF4yy3AaVqE+zUTYd4rihFVvB64h0FdbywYTppIYM3Qt7GQItDgjYURMxtUCYZJEXA5Tbm+eo b7UijsJf6WmkZ8+x50MuzVF5/hhOwy0XfFfKjQUG3Z2HaYJhHr+fIpQ3waRr05PCHBMuqqXO6cNir msU0/ycyuaVHXhI8g1Ptug==; Date: Tue, 26 Aug 2025 14:18:34 +0300 Message-Id: <86o6s2uy9h.fsf@gnu.org> From: Eli Zaretskii To: Bruno Haible , corwin@bru.st, shingo.fg8@gmail.com In-Reply-To: <4361825.HVULnkfqZJ@nimes> (message from Bruno Haible on Tue, 26 Aug 2025 
00:34:16 +0200) Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <5043940.kys9EeIHyz@nimes> <86ms7p6ly4.fsf@gnu.org> <4361825.HVULnkfqZJ@nimes> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 79296 Cc: 79296@debbugs.gnu.org, eggert@cs.ucla.edu X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Bruno Haible > Cc: shingo.fg8@gmail.com, eggert@cs.ucla.edu, 79296@debbugs.gnu.org > Date: Tue, 26 Aug 2025 00:34:16 +0200 > > > * Actions: > > - Bruno: Add a unit test for nstrftime in w32utf8 mode. > > Done. The test verifies that nstrftime produces the Japanese weekday > in UTF-8 encoding. It passes, provided the locale name used is > "Japanese_Japan.65001", *not* "Japanese_Japan.932". Thanks. See below about that, in the context of Emacs. > > > * Hypothesis 2: > > > The Gnulib support included in Emacs 30.2 misses the commits > > > https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=927a70e0853345315570f051fd6996cfeb7b4d96 > > > https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=9f7ff4f423cd805866cd4edef806c32393621df0 > > > https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=00211fc69c926d6c8f6e3f3cf1d8802623db2af9 > > > https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=8e795a8d9f8c3269a3d30d0d1adbaf0ea9ad4a84 > > > > These commits are in Gnulib files that are not used in Emacs. What > > are their effects on the issue at hand, which is the non-ASCII strings > > produced by Gnulib's nstrftime? > > That's most likely the problem, then. 
> For Emacs, the third commit should be
> the essential one: It forces a setlocale() argument that ends in ".65001",
> thus telling the Microsoft UCRT that you want the UTF-8 environment.

Emacs by default calls setlocale with the argument of "", thus setting
up to use the default system locale.  Are you saying that a call like

   setlocale (LC_TIME, "");

is insufficient to force UTF-8 encoding of time-related strings, on
MS-Windows with the UTF-8 system-codepage feature turned on?  Can you
try running your tests with a locale of "" and see if the codeset is
set to UTF-8 or codepage 65001?

> > >   - This UTF-8 system codepage is only supported with Microsoft UCRT, not
> > >     with the MSVCRT. At compile time, this configuration can be tested via
> > >     '#ifdef _UCRT'. (This is true for both the mingw and the MSVC toolchains.)
> >
> > What is it in UCRT that is required for Gnulib to support the UTF-8
> > system codepage on Windows, in particular for strftime?  IOW, what
> > does the UCRT implementation of libc do that the MSVCRT one doesn't,
> > that affects this aspect of Gnulib's strftime?
>
> Microsoft's UCRT has many changes compared to MSVCRT, probably worth of 10 years
> of development. Support for the UTF-8 environment is certainly only one of
> the many improvements.

Any details beyond that general consideration?  Are you saying that
MSVCRT doesn't support codepage 65001 as a codeset of a locale,
whereas UCRT does?  Do the tests you wrote fail when linked with
MSVCRT?

> So, the remaining hypotheses are:
>
> * Hypothesis 2:
>   The string that Emacs passes to the setlocale() function does not end in ".65001".

AFAIU, it shouldn't, not if Windows does TRT with the default locale
when the UTF-8 option is turned on.

However, since this is Emacs, Shingo Tanaka could test this by setting
the Lisp variable system-time-locale to the string
"Japanese_Japan.65001" and repeating the test presented at the
beginning of this discussion.
Assuming that the build is a UCRT build (Corwin?), this should fix the problem, if your analysis is correct. > * Hypothesis 3: > The Emacs 30.2 binaries are linked with MSVCRT, not with UCRT. > -> Corwin? Corwin? From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 26 10:19:30 2025 Received: (at 79296) by debbugs.gnu.org; 26 Aug 2025 14:19:30 +0000 Received: from localhost ([127.0.0.1]:55945 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1uquWI-0002f3-0j for submit@debbugs.gnu.org; Tue, 26 Aug 2025 10:19:30 -0400 Received: from mail-pf1-x429.google.com ([2607:f8b0:4864:20::429]:44203) by debbugs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1uquWD-0002ek-Uw for 79296@debbugs.gnu.org; Tue, 26 Aug 2025 10:19:26 -0400 Received: by mail-pf1-x429.google.com with SMTP id d2e1a72fcca58-77057266cb8so2000337b3a.0 for <79296@debbugs.gnu.org>; Tue, 26 Aug 2025 07:19:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1756217959; x=1756822759; darn=debbugs.gnu.org; h=mime-version:user-agent:references:in-reply-to:subject:cc:to:from :message-id:date:from:to:cc:subject:date:message-id:reply-to; bh=LYmEhw8pPhU2fCHObNk/Sg4QuLzXJaxsxyBLtC6MTrs=; b=PyQu2oX/pKJ8ROob2Vyn+zPNMLi24iF6I4qZB3fkJjTOwo6iSQ91uYTT/6WDoiMpMV YcNLhnzK2yUNUO+kI4MOEC+fxgOWzjF8UBGrxgPiPk9b8j8OkRglIqOwxVKtfpBPn86B P+NiZpUpxtNMse/g9B7P5ZV96yXE2vpRKbiLfTaz/bJ8bIb7nr5cqBkgWZiG5T2zrEbf vkhQZDvyrmiFPuYy1zzxYAmx6HnpGBwfL7ZJCtGt5bIIPtoI33CuIiuDhNSlDjS47/J8 fRNl/p4IzJkEG0ghWKz4TroTUa4E4LghiDjANbkZiv5iomVgGJeMZMEUbYoNOA37/XeA uUmw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1756217959; x=1756822759; h=mime-version:user-agent:references:in-reply-to:subject:cc:to:from :message-id:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=LYmEhw8pPhU2fCHObNk/Sg4QuLzXJaxsxyBLtC6MTrs=; 
b=xScbv8cXApi5VEs+EHdNkNtYa/JdYRQ0V54kRVJsMA+2OdO+oJPvH4+l357AFXg0gS nv240cT14/ziSYayCueDPSsqiR59rnCHtdLvFzDrxZ2CLWrmVVl8Uy5SyHl4kHe0gUPR ICQgOwmKoO/zjk1sMIj3hZTTWUzDRN4rOGW9t3zzBqJ9aBWsOpajjzrxR8d+cJByEkbB Q5VvKVGEc6Z0aTxairX2/nfWXKM19B7FTNxw20khYLixKPXwawT8A/8q6TZYIpykcBbY neeP/JAVMm90pJ+3uP7Imz16mvOpHFSE1upwTvP9RgpcBL9dyuPo3oa8Hfd5QL2UH0OP 6VCw== X-Forwarded-Encrypted: i=1; AJvYcCWvtLL5G0e+OgTZbKwWygyrFzKvhGXAVMcqiAIXrXtuKf9RKEsVvkpfFeXFFs6xzNfQyAsgMQ==@debbugs.gnu.org X-Gm-Message-State: AOJu0YxUTFMxI6kjqTHuf0NA5jqisow6P/NfHLumdjG0BjnUCQCaONt2 qSMuud3pS+YD+1EU2oV6U1eblX60+XqxRqKIRP1Qa2uxG3IJv+idPSTH X-Gm-Gg: ASbGncv8CxbOFPDNfnsdLd7VWcStpr4kXtb9tQ33HEZRcfuWCqlDY75yJA4fzwjqrSV KEPV+ZpjUTALi9pLZyRYa0ab6YG7DWmK/+81pBSp/Q4lJizUmGjhPCPdLbzbduS4mdkKCfs8FO1 0if1dhUqTOKtXvvOB9NpG1rPbQzaXMG+fUxk7lMpTiX+B7vvzqqnCBix90Z7Jl5mtnmJa0OfWsV 0B6ghxk8sY5XgPef4QfeRkieM5jxn9l53wJPP4Z15qjr1vDvxc/pUdew9MJXGHvAw5CxkM5WflH QJqtiFq2guXDgA9+bx3WDmAOmGsVZTPq9tZ95NxToLX6o39R8lgqXw/QQSJ8dyPHrEQpJNr0vu3 zdhlsdxgDYJelleonKjMpaFm8YEQ5Q226VJU6s90JwqEsheBEmUOkCKCFGA== X-Google-Smtp-Source: AGHT+IFfi5XlAPq9LhZCAIP2/aOR9xVYGeyL1qYyu0UHHQHr913JIEoK00+NSYEyg1+NI3NMsE3Djw== X-Received: by 2002:a17:902:d48e:b0:246:441f:f144 with SMTP id d9443c01a7336-246441ff51fmr211140665ad.56.1756217959164; Tue, 26 Aug 2025 07:19:19 -0700 (PDT) Received: from DESKTOP-CQRAB2T.gmail.com ([240d:1a:6f4:6b00:6cd1:bac0:4170:9474]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2466889f5dasm97889455ad.147.2025.08.26.07.19.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Aug 2025 07:19:18 -0700 (PDT) Date: Tue, 26 Aug 2025 23:19:14 +0900 Message-ID: <86ikia5fod.wl-shingo.fg8@gmail.com> From: Shingo Tanaka To: Eli Zaretskii Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config In-Reply-To: <86o6s2uy9h.fsf@gnu.org> References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <5043940.kys9EeIHyz@nimes> <86ms7p6ly4.fsf@gnu.org> 
<4361825.HVULnkfqZJ@nimes> <86o6s2uy9h.fsf@gnu.org>
Cc: 79296@debbugs.gnu.org, shingo.fg8@gmail.com, corwin@bru.st, Bruno Haible, eggert@cs.ucla.edu

On Wed, 27 Aug 2025 05:18:34 +0900,
Eli Zaretskii wrote:
>
> However, since this is Emacs, Shingo Tanaka could test this by setting
> the Lisp variable system-time-locale to the string
> "Japanese_Japan.65001" and repeating the test presented at the
> beginning of this discussion. Assuming that the build is a UCRT build
> (Corwin?), this should fix the problem, if your analysis is correct.

Here is the result. Unfortunately it doesn't fix the issue.

(setq system-time-locale "Japanese_Japan.65001")
"Japanese_Japan.65001"

(format-time-string "%y,%d,%m %A" (date-to-time (concat "2025,1,Jan")))
"25,01,01 \220\205\227j\223\372"

Regards,
Shingo

From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 26 11:50:02 2025
Date: Tue, 26 Aug 2025 18:49:23 +0300
Message-Id: <86a53mulq4.fsf@gnu.org>
From: Eli Zaretskii
To: Shingo Tanaka
In-Reply-To: <86ikia5fod.wl-shingo.fg8@gmail.com> (message from Shingo Tanaka on Tue, 26 Aug 2025 23:19:14 +0900)
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <5043940.kys9EeIHyz@nimes> <86ms7p6ly4.fsf@gnu.org> <4361825.HVULnkfqZJ@nimes> <86o6s2uy9h.fsf@gnu.org> <86ikia5fod.wl-shingo.fg8@gmail.com>
Cc: 79296@debbugs.gnu.org, shingo.fg8@gmail.com, corwin@bru.st, bruno@clisp.org, eggert@cs.ucla.edu

> Date: Tue, 26 Aug 2025 23:19:14 +0900
> From: Shingo Tanaka
> Cc: Bruno Haible,
>  corwin@bru.st,
>  shingo.fg8@gmail.com,
>  eggert@cs.ucla.edu,
>  79296@debbugs.gnu.org
>
> On Wed, 27 Aug 2025 05:18:34 +0900,
> Eli Zaretskii wrote:
> >
> > However, since this is Emacs, Shingo Tanaka could test this by setting
> > the Lisp variable system-time-locale to the string
> > "Japanese_Japan.65001" and repeating the test presented at the
> > beginning of this discussion. Assuming that the build is a UCRT build
> > (Corwin?), this should fix the problem, if your analysis is correct.
>
> Here is the result. Unfortunately it doesn't fix the issue.
>
> (setq system-time-locale "Japanese_Japan.65001")
> "Japanese_Japan.65001"
>
> (format-time-string "%y,%d,%m %A" (date-to-time (concat "2025,1,Jan")))
> "25,01,01 \220\205\227j\223\372"

OK, now let's try to establish whether your Emacs was linked against
UCRT or MSVCRT. If you have objdump.exe (part of Binutils) installed,
please do

  objdump -p /path/to/emacs.exe | fgrep "DLL Name"

and see if the output includes msvcrt.dll (case-insensitive) or
ucrtbase.dll.

If you don't have objdump, try the dependency walker
(https://www.dependencywalker.com/) instead, or Process Explorer with
its lower panel set to show DLLs. Look for msvcrt.dll or ucrtbase.dll.

Thanks.
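
Where neither objdump nor a GUI tool is at hand, the check described above can be roughed out in a few lines of Python. This is only a heuristic sketch and was not posted in the thread: the function name `guess_crt` and the marker table are hypothetical, and the code simply searches the raw bytes of the executable for the DLL names that a PE import table stores as ASCII strings. It also assumes that UCRT builds may import the `api-ms-win-crt-*` forwarder DLLs instead of naming `ucrtbase.dll` directly. A marker that happens to occur elsewhere in the file gives a false positive, so objdump or dumpbin remain the authoritative check.

```python
# Hypothetical marker table: lowercase DLL-name fragments mapped to the C
# runtime they suggest. Illustrative only, not an exhaustive list.
CRT_MARKERS = {
    b"msvcrt.dll": "MSVCRT",
    b"ucrtbase.dll": "UCRT",
    b"api-ms-win-crt-": "UCRT",  # assumption: forwarder DLLs used by UCRT builds
}


def guess_crt(image: bytes) -> set:
    """Return the set of C runtimes whose DLL names occur in the raw image bytes."""
    lowered = image.lower()  # DLL names are case-insensitive on Windows
    return {crt for marker, crt in CRT_MARKERS.items() if marker in lowered}
```

To try it, read the executable in binary mode and pass the bytes, for example `guess_crt(open("emacs.exe", "rb").read())`. An empty set means that none of the assumed markers was found, not that the binary uses no C runtime.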
From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 26 12:19:11 2025
From: Bruno Haible
To: Eli Zaretskii
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
Date: Tue, 26 Aug 2025 18:19:00 +0200
Message-ID: <4324413.uijzvN6y4y@nimes>
Organization: GNU
In-Reply-To: <86a53mulq4.fsf@gnu.org>
References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <86ikia5fod.wl-shingo.fg8@gmail.com> <86a53mulq4.fsf@gnu.org>
Cc: 79296@debbugs.gnu.org, corwin@bru.st, shingo.fg8@gmail.com, eggert@cs.ucla.edu

Eli Zaretskii wrote:
> OK, now let's try to establish whether your Emacs was linked against
> UCRT or MSVCRT.

I just did that:

> - Emacs 30.2 (latest, https://ftp.gnu.org/gnu/emacs/windows/emacs-30/)

Downloaded and unpacked it, and ran
  $ dumpbin /imports emacs.exe

Result: It is linked against msvcrt.dll.

So, there is no way to make these binaries work right in the UTF-8
environment of Windows.

Bruno

From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 26 17:08:25 2025
From: Bruno Haible
To: Eli Zaretskii
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
Date: Tue, 26 Aug 2025 23:08:16 +0200
Message-ID: <7697164.KhUVIng19X@nimes>
Organization: GNU
In-Reply-To: <86o6s2uy9h.fsf@gnu.org>
References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <4361825.HVULnkfqZJ@nimes> <86o6s2uy9h.fsf@gnu.org>
Cc: 79296@debbugs.gnu.org, corwin@bru.st, shingo.fg8@gmail.com, eggert@cs.ucla.edu

Eli Zaretskii wrote:
> Emacs by default calls setlocale with the argument of "", thus setting
> up to use the default system locale.

OK.

> Are you saying that a call like
>
>   setlocale (LC_TIME, "");
>
> is insufficient to force UTF-8 encoding of time-related strings, on
> MS-Windows with the UTF-8 system-codepage feature turned on?

No, with the Windows UCRT libc and the enabled UTF-8 setting/checkbox
this is enough to get nstrftime() to produce UTF-8 encoded output.
That's what I can infer by playing with variations of my unit test.

On GNU systems, you will also need
  setlocale (LC_CTYPE, "");
because glibc requires that the LC_TIME and LC_CTYPE categories specify
the same encoding. (This is a kind of sanity check in glibc.)

> Can you
> try running your tests with a locale of "" and see if the codeset is
> set to UTF-8 or codepage 65001?

If I use
  setlocale (LC_ALL, "");
instead of just
  setlocale (LC_TIME, "");
then - again, in UCRT only - MB_CUR_MAX gets set to >= 4, which indicates
a UTF-8 encoding.

Even without a setlocale invocation, GetACP() returns 65001, since that's
the direct effect of the UTF-8 setting/checkbox.

> > Microsoft's UCRT has many changes compared to MSVCRT, probably worth of 10 years
> > of development. Support for the UTF-8 environment is certainly only one of
> > the many improvements.
>
> Any details beyond that general consideration? Are you saying that
> MSVCRT doesn't support codepage 65001 as a codeset of a locale,
> whereas UCRT does?

Yes, that's what I'm saying. With MSVCRT, there is no way to get a
MB_CUR_MAX value > 2, which means no UTF-8 support.

> Do the tests you wrote fail when linked with MSVCRT?

Yes, the tests already fail at the 'MB_CUR_MAX >= 4' assertion when linked
with MSVCRT.

Bruno

From debbugs-submit-bounces@debbugs.gnu.org Tue Aug 26 20:05:51 2025
From: Bruno Haible
To: Eli Zaretskii
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
Date: Wed, 27 Aug 2025 02:05:42 +0200
Message-ID: <17780947.5xaa3U7HCr@nimes>
Organization: GNU
In-Reply-To: <86o6s2uy9h.fsf@gnu.org>
References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <4361825.HVULnkfqZJ@nimes> <86o6s2uy9h.fsf@gnu.org>
Cc: 79296@debbugs.gnu.org, corwin@bru.st, shingo.fg8@gmail.com, eggert@cs.ucla.edu

Eli Zaretskii wrote:
> Any details beyond that general consideration? Are you saying that
> MSVCRT doesn't support codepage 65001 as a codeset of a locale,
> whereas UCRT does? Do the tests you wrote fail when linked with
> MSVCRT?

Tried it now: running that unit test in the Windows UTF-8 environment, linked
against MSVCRT:

  * GetACP() returns 65001. Which is not surprising, since GetACP() is a
    Windows API, not a libc API.

  * setlocale (LC_ALL, "") fails. [This was the Gnulib setlocale() override.
    I assume the MSVCRT setlocale failed in the same way.]

  * If you ignore the setlocale failure, MB_CUR_MAX is not >= 4. Meaning
    that the locale encoding is not UTF-8.

MSVCRT supports only MB_CUR_MAX == 1 or == 2.

Looking at the output of "dumpbin /imports emacs.exe", I see that the Emacs
binary uses the following symbols from MSVCRT:

    6C ___lc_codepage_func
    6F ___mb_cur_max_func
   188 _getmbcp
   240 _mbschr
   252 _mbsinc
   256 _mbslwr
   27A _mbsncpy
   27E _mbsnextc
   28C _mbspbrk
   28E _mbsrchr
   302 _snprintf
   33C _stricmp
   343 _strlwr
   34A _strnicmp
   4B1 fprintf
   4D4 isalpha
   4DC isspace
   4EB isxdigit
   4EF localeconv
   51E setlocale
   534 strerror
   535 strftime
   556 tolower
   557 toupper
   55D vfprintf

Most of these are sensitive to the locale encoding and therefore will
not produce the expected results for a UTF-8 environment.

Additionally, the Emacs binary uses several DLLs, some of which
also use locale-aware functions from libc. These DLLs will not
work as expected either.

So, the only reasonable way forward, for supporting the Windows UTF-8
environment, is to produce two sets of binaries for Emacs:
  - one set of .exe and .dlls linked with MSVCRT, for use on old
    Windows versions,
  - one set of .exe and .dlls linked with UCRT, for use on Windows
    versions from 2019 or newer [1].

For producing such binaries with only Free Software (no MSVC compiler,
no MSVC header files) one can use MSYS2. For a year or two already
it supports two target environments:
  - mingw-w64 with MSVCRT,
  - mingw-w64 with UCRT.
These two development environments are very similar, which means that
the Makefile will need very few adaptations.

Bruno

[1] https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 27 08:05:31 2025
Date: Wed, 27 Aug 2025 15:04:45 +0300
Message-Id: <86v7m9t1gi.fsf@gnu.org>
From: Eli Zaretskii
To: Bruno Haible
In-Reply-To: <17780947.5xaa3U7HCr@nimes> (message from Bruno Haible on Wed, 27 Aug 2025 02:05:42 +0200)
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <4361825.HVULnkfqZJ@nimes> <86o6s2uy9h.fsf@gnu.org> <17780947.5xaa3U7HCr@nimes>
Cc: 79296@debbugs.gnu.org, corwin@bru.st, shingo.fg8@gmail.com, eggert@cs.ucla.edu

> From: Bruno Haible
> Cc: corwin@bru.st, shingo.fg8@gmail.com, eggert@cs.ucla.edu,
>  79296@debbugs.gnu.org
> Date: Wed, 27 Aug 2025 02:05:42 +0200
>
> Eli Zaretskii wrote:
> > Any details beyond that general consideration? Are you saying that
> > MSVCRT doesn't support codepage 65001 as a codeset of a locale,
> > whereas UCRT does? Do the tests you wrote fail when linked with
> > MSVCRT?
>
> Tried it now: running that unit test in the Windows UTF-8 environment, linked
> against MSVCRT:
>
>   * GetACP() returns 65001. Which is not surprising, since GetACP() is a
>     Windows API, not a libc API.
>
>   * setlocale (LC_ALL, "") fails. [This was the Gnulib setlocale() override.
>     I assume the MSVCRT setlocale failed in the same way.]
>
>   * If you ignore the setlocale failure, MB_CUR_MAX is not >= 4. Meaning
>     that the locale encoding is not UTF-8.
>
> MSVCRT supports only MB_CUR_MAX == 1 or == 2.

Thanks for this info.

> Looking at the output of "dumpbin /imports emacs.exe", I see that the Emacs
> binary uses the following symbols from MSVCRT:
>
>     6C ___lc_codepage_func
>     6F ___mb_cur_max_func
>    188 _getmbcp
>    240 _mbschr
>    252 _mbsinc
>    256 _mbslwr
>    27A _mbsncpy
>    27E _mbsnextc
>    28C _mbspbrk
>    28E _mbsrchr
>    302 _snprintf
>    33C _stricmp
>    343 _strlwr
>    34A _strnicmp
>    4B1 fprintf
>    4D4 isalpha
>    4DC isspace
>    4EB isxdigit
>    4EF localeconv
>    51E setlocale
>    534 strerror
>    535 strftime
>    556 tolower
>    557 toupper
>    55D vfprintf

Those are in most cases used only when w32-unicode-filenames is turned
off, which is supposed to happen only on Windows 9X (or in debugging).
The rest are used at startup, when the system locale and the
corresponding encoding machinery is not yet set up.

But yes, if turning on this UTF-8 feature doesn't make these functions
in MSVCRT use UTF-8 as the multibyte encoding, things will fall apart
in subtle ways when non-ASCII strings are involved.

> Additionally, the Emacs binary uses several DLLs, some of which
> also use locale-aware functions from libc. These DLLs will not
> work as expected either.

That's a separate issue, and it doesn't get resolved by linking Emacs
with UCRT. That's because, AFAIK, if a DLL was linked against MSVCRT
at its build time, it will continue using MSVCRT even when called from
a program that uses UCRT. So a person who wants to use UTF-8 as the
system codepage will need to make sure _all_ of the optional libraries
used by Emacs were also linked with UCRT.

Moreover, the source code of those libraries should be UTF-8 aware.
For example, it should use multibyte-aware functions for walking a
string by character, instead of assuming that each byte is a separate
character. And how many ported Unix and GNU libraries are aware of
that? As a simple example, it's enough to have something like

  char filename[MAX_PATH];

to run the risk of overflowing the stack buffer if the file name is
non-ASCII, encoded in UTF-8, and is long enough. (Emacs handles this
particular problem in its own code, but many external libraries don't.)

> So, the only reasonable way forward, for supporting the Windows UTF-8
> environment, is to produce two sets of binaries for Emacs:
>   - one set of .exe and .dlls linked with MSVCRT, for use on old
>     Windows versions,
>   - one set of .exe and .dlls linked with UCRT, for use on Windows
>     versions from 2019 or newer [1].

The Emacs project doesn't produce binaries. That is left to distros.
The MS-Windows binaries on the GNU FTP site are produced by Corwin, who
volunteered for this job, so it is up to him what he wants to support
and how much he would agree to complicate his job.

Windows versions before Vista (perhaps even before Windows 8.1) are
already unsupported by those binaries, since MSYS2 dropped support for
them, so the resulting binaries depend on APIs and DLLs that older
systems don't have, and will thus refuse to run on those older systems.

In addition, linking Emacs itself against UCRT is not enough, see
above.

For these reasons, I stand by my opinion that UTF-8 support on Windows
is not yet ready for prime time, and advise against turning it on if
one wants to use Emacs reliably on MS-Windows. MS knew what they were
doing when they designated this feature "Beta".

As a stopgap, we could introduce Windows-specific variables in Emacs
through which users could specify the encoding to decode time strings
and perhaps other strings if needed, instead of automatically falling
back on locale-coding-system. Then users like Shingo Tanaka could say

  (setq w32-time-coding-system 'cp932)

and have the time strings decoded correctly.
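
The decoding step that such a stopgap would perform can be illustrated outside Emacs. The sketch below is in Python rather than Emacs Lisp, and the helper names `decode_time_string` and `is_valid_utf8` are made up for the illustration; `w32-time-coding-system` itself is only the variable proposed in the message above, not something the thread shows to exist. The code takes the exact bytes that format-time-string returned earlier in this report and shows that they are not valid UTF-8, while decoding them as codepage 932 yields the expected weekday name.

```python
# The bytes Emacs returned for %A, written with the same octal escapes
# that appear in the bug report.
RAW = b"\220\205\227j\223\372"


def decode_time_string(raw: bytes, coding: str = "cp932") -> str:
    """Decode a strftime result using an explicitly chosen coding system."""
    return raw.decode(coding)


def is_valid_utf8(raw: bytes) -> bool:
    """Report whether the bytes could have been produced by a UTF-8 locale."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(RAW))       # False: MSVCRT did not produce UTF-8
print(decode_time_string(RAW))  # 水曜日
```

The result matches the weekday string that the UCRT build prints directly later in the thread, which is consistent with MSVCRT's strftime still emitting codepage 932 text even though the system codepage is reported as 65001.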
From debbugs-submit-bounces@debbugs.gnu.org Wed Aug 27 09:54:46 2025
Date: Wed, 27 Aug 2025 22:54:30 +0900
Message-ID: <86349cdg4p.wl-shingo.fg8@gmail.com>
From: Shingo Tanaka
To: Bruno Haible
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
In-Reply-To: <17780947.5xaa3U7HCr@nimes>
References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <4361825.HVULnkfqZJ@nimes> <86o6s2uy9h.fsf@gnu.org> <17780947.5xaa3U7HCr@nimes>
Cc: 79296@debbugs.gnu.org, Eli Zaretskii, corwin@bru.st, shingo.fg8@gmail.com, eggert@cs.ucla.edu

On Wed, 27 Aug 2025 18:05:42 +0900,
Bruno Haible wrote:
>
> For producing such binaries with only Free Software (no MSVC compiler,
> no MSVC header files) one can use MSYS2. For a year or two already
> it supports two target environments:
>   - mingw-w64 with MSVCRT,
>   - mingw-w64 with UCRT.
> These two development environments are very similar, which means that
> the Makefile will need very few adaptations.

I've just installed the UCRT version of MSYS2 Emacs and ran it with no
init file.

~> pacman -S mingw-w64-ucrt-x86_64-emacs
~> /ucrt64/bin/runemacs --no-init-file

I confirmed that the issue I reported doesn't happen, even with
"Beta: Use Unicode UTF-8 for worldwide language support" on.

In *scratch* buffer:

(format-time-string "%y,%d,%m %A" (date-to-time (concat "2025,1,Jan")))
"25,01,01 水曜日"

w32-system-coding-system
cp65001

w32-multibyte-code-page
0

w32-ansi-code-page
65001

(w32-get-current-locale-id)
1041

(w32-get-locale-info (w32-get-current-locale-id))
"JPN"

(w32-get-default-locale-id)
1041

(w32-get-locale-info (w32-get-default-locale-id))
"JPN"

(w32-get-console-codepage)
65001

(w32-get-console-output-codepage)
65001

Regards,
Shingo

From debbugs-submit-bounces@debbugs.gnu.org Sat Aug 30 05:25:22 2025
Date: Sat, 30 Aug 2025 12:25:09 +0300
Message-Id: <863499qhze.fsf@gnu.org>
From: Eli Zaretskii
To: Shingo Tanaka
In-Reply-To: <86349cdg4p.wl-shingo.fg8@gmail.com> (message from Shingo Tanaka on Wed, 27 Aug 2025 22:54:30 +0900)
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
References: <86qzx1xy56.wl-shingo.fg8@gmail.com> <4361825.HVULnkfqZJ@nimes> <86o6s2uy9h.fsf@gnu.org> <17780947.5xaa3U7HCr@nimes> <86349cdg4p.wl-shingo.fg8@gmail.com>
Cc: 79296-done@debbugs.gnu.org, shingo.fg8@gmail.com, corwin@bru.st, bruno@clisp.org, eggert@cs.ucla.edu

> Date: Wed, 27 Aug 2025 22:54:30 +0900
> From: Shingo Tanaka
> Cc: Eli Zaretskii,
>  corwin@bru.st,
>  shingo.fg8@gmail.com,
>  eggert@cs.ucla.edu,
>  79296@debbugs.gnu.org
>
> On Wed, 27 Aug 2025 18:05:42 +0900,
> Bruno Haible wrote:
> >
> > For producing such binaries with only Free Software (no MSVC compiler,
> > no MSVC header files) one can use MSYS2. For a year or two already
> > it supports two target environments:
> >   - mingw-w64 with MSVCRT,
> >   - mingw-w64 with UCRT.
> > These two development environments are very similar, which means that
> > the Makefile will need very few adaptations.
>
> I've just installed the UCRT version of MSYS2 Emacs and ran it with no
> init file.
>
> ~> pacman -S mingw-w64-ucrt-x86_64-emacs
> ~> /ucrt64/bin/runemacs --no-init-file
>
> I confirmed that the issue I reported doesn't happen, even with
> "Beta: Use Unicode UTF-8 for worldwide language support" on.
>
> In *scratch* buffer:
>
> (format-time-string "%y,%d,%m %A" (date-to-time (concat "2025,1,Jan")))
> "25,01,01 水曜日"
>
> w32-system-coding-system
> cp65001
>
> w32-multibyte-code-page
> 0
>
> w32-ansi-code-page
> 65001
>
> (w32-get-current-locale-id)
> 1041
>
> (w32-get-locale-info (w32-get-current-locale-id))
> "JPN"
>
> (w32-get-default-locale-id)
> 1041
>
> (w32-get-locale-info (w32-get-default-locale-id))
> "JPN"
>
> (w32-get-console-codepage)
> 65001
>
> (w32-get-console-output-codepage)
> 65001

Thanks. I've now added a new section to the Emacs w32 FAQ about these
issues, and I'm therefore closing this bug.