#79296 - 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config

GNU bug report logs - #79296
30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config

Package: emacs;

Reported by: Shingo Tanaka <shingo.fg8 <at> gmail.com>

Date: Sun, 24 Aug 2025 02:17:02 UTC

Severity: normal

Found in version 30.2

Done: Eli Zaretskii <eliz <at> gnu.org>

Message #26 received at 79296 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Bruno Haible <bruno <at> clisp.org>
Cc: 79296 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu, shingo.fg8 <at> gmail.com
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
Date: Sun, 24 Aug 2025 13:41:39 +0300

> From: Bruno Haible <bruno <at> clisp.org>
> Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 79296 <at> debbugs.gnu.org
> Date: Sun, 24 Aug 2025 09:13:22 +0200
>
> * Hypothesis 1:
>   The Gnulib support included in Emacs 30.2 is older than 2024-12-23.

How does one know?  I see Paul last ran admin/merge-gnulib on the
emacs-30 release branch on Aug 2, 2025, but maybe this is not what I
should be looking at?  In any case, Dec 2024 sounds too old even for
the release branch.

> * Hypothesis 2:
>   The Gnulib support included in Emacs 30.2 misses the commits
>   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=927a70e0853345315570f051fd6996cfeb7b4d96
>   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=9f7ff4f423cd805866cd4edef806c32393621df0
>   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=00211fc69c926d6c8f6e3f3cf1d8802623db2af9
>   https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=8e795a8d9f8c3269a3d30d0d1adbaf0ea9ad4a84

These commits are in Gnulib files that are not used in Emacs.  What
are their effects on the issue at hand, which is the non-ASCII strings
produced by Gnulib's nstrftime?

>   - This UTF-8 system codepage is only supported with Microsoft UCRT, not
>     with the MSVCRT. At compile time, this configuration can be tested via
>     '#ifdef _UCRT'. (This is true for both the mingw and the MSVC toolchains.)

What is it in UCRT that is required for Gnulib to support the UTF-8
system codepage on Windows, in particular for strftime?  IOW, what
does the UCRT implementation of libc do that the MSVCRT one doesn't,
that affects this aspect of Gnulib's strftime?

> * Hypothesis 4:
>   Enabling the option "Beta: Use Unicode UTF-8 for worldwide language support"
>   has a different effect than creating a .manifest file like the Gnulib
>   test suite does.
>   <https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page>

This is about defining a process-specific codepage, which is not what
happens in this case.  So I don't think it's relevant.

> * Actions:
>   - Bruno: Add a unit test for nstrftime in w32utf8 mode.

I'd be interested to see how this test works for you.

>   - Eli or Paul: Disprove hypotheses 1, 2, 3.
>
> > Shingo Tanaka, could you please tell what is the value of
> > w32-multibyte-code-page on your system, both when "Beta: Use Unicode
> > UTF-8 for worldwide language support" is ON and when it is OFF?
>
> Yes, this info would be useful.

The upshot is that we can only reliably know the system's language ID
(0x11), but it is still a mystery to me where strftime got cp932, with
which it encoded the time-related strings, because all the other APIs
I know of that report codepages say it's UTF-8.
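For readers following the thread, the symptom under discussion can be sketched outside Emacs. The Python snippet below is only an illustration under stated assumptions: the helper names `primary_language_id` and `decode_as` are hypothetical, the weekday name is an arbitrary sample, and nothing here reproduces what Emacs or Gnulib's nstrftime actually do. It shows two things: that the language ID 0x11 is the low 10 bits of the Japanese LANGID 0x0411, and that bytes produced in cp932 do not come out intact when the consumer assumes they are UTF-8.

```python
# Illustrative sketch only; the helper names are hypothetical and are not
# part of Emacs, Gnulib, or the Windows API.

def primary_language_id(langid: int) -> int:
    """Return the primary language ID: the low 10 bits of a Windows LANGID."""
    return langid & 0x3FF


def decode_as(raw: bytes, assumed_codec: str) -> str:
    """Decode raw bytes with the codec the consumer assumes they use."""
    return raw.decode(assumed_codec, errors="replace")


# A sample Japanese weekday name, standing in for strftime output.
weekday = "日曜日"

# The producer encodes in cp932, while the consumer expects UTF-8.
raw_cp932 = weekday.encode("cp932")
raw_utf8 = weekday.encode("utf-8")

garbled = decode_as(raw_cp932, "utf-8")    # wrong assumption about the bytes
recovered = decode_as(raw_cp932, "cp932")  # matching assumption

print(hex(primary_language_id(0x0411)))
print(len(raw_cp932), len(raw_utf8))
print(recovered == weekday, garbled == weekday)
```

The mismatch only shows up for non-ASCII text, which fits the report being about Japanese time strings rather than about format-time-string in general.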

This bug report was last modified 21 days ago.

