GNU bug report logs - #79296
30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config

Package: emacs;

Reported by: Shingo Tanaka <shingo.fg8 <at> gmail.com>

Date: Sun, 24 Aug 2025 02:17:02 UTC

Severity: normal

Found in version 30.2

Done: Eli Zaretskii <eliz <at> gnu.org>


Message #11 received at 79296 <at> debbugs.gnu.org (full text, mbox):

From: Bruno Haible <bruno <at> clisp.org>
To: Shingo Tanaka <shingo.fg8 <at> gmail.com>, Eli Zaretskii <eliz <at> gnu.org>
Cc: 79296 <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>
Subject: Re: bug#79296: 30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
Date: Sun, 24 Aug 2025 09:13:22 +0200

Hi Eli,

> > 2. Go to *scratch* buffer and evaluate:
> >    (format-time-string "%y,%d,%m %A" (date-to-time (concat "2025,1,Jan")))
> > 3. You will get below wrongly encoded string:
> >    "25,01,01 \220\205\227j\223\372"
> ...
> I think this is an issue with Gnulib, whose nstrftime function we use
> to format the time in Emacs: it seems to produce time strings encoded
> in cp932 even though the UTF-8 support is turned on on MS-Windows.
> I've added the Gnulib folks to the discussion.
> 
> Bruno and Paul, does Gnulib's nstrftime support the UTF-8 system
> codepage on MS-Windows?
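
The cp932 reading of the quoted result can be checked independently of Emacs. The following is a minimal sketch in Python, not part of the original report: it rebuilds the six bytes from the octal escapes shown in step 3 and decodes them. The variable names are illustrative only.

```python
import datetime

# Bytes of the weekday part of the reported result,
# written in the report as "\220\205\227j\223\372".
raw = bytes([0o220, 0o205, 0o227, ord("j"), 0o223, 0o372])

# Interpreted as cp932 (Windows Japanese), the bytes form a weekday name.
as_cp932 = raw.decode("cp932")
print(as_cp932)

# Interpreted as UTF-8, the same bytes are rejected as malformed.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print("valid UTF-8:", valid_utf8)

# 2025-01-01, the date in the formatted output, fell on a Wednesday
# (weekday() counts Monday as 0).
weekday_index = datetime.date(2025, 1, 1).weekday()
print("weekday index:", weekday_index)
```

The bytes decode to the Japanese word for Wednesday, which matches the date in the output, so the string content is right and only its encoding differs from what Emacs expected.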

* Facts:
  - Gnulib supports the UTF-8 system codepage of Windows, since 2024-12-23.
    It includes some unit tests, namely gnulib/tests/*w32utf8*.
  - This UTF-8 system codepage is only supported with Microsoft UCRT, not
    with the MSVCRT. At compile time, this configuration can be tested via
    '#ifdef _UCRT'. (This is true for both the mingw and the MSVC toolchains.)

* Hypothesis 1:
  The Gnulib support included in Emacs 30.2 is older than 2024-12-23.

* Hypothesis 2:
  The Gnulib support included in Emacs 30.2 is missing these commits:
  https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=927a70e0853345315570f051fd6996cfeb7b4d96
  https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=9f7ff4f423cd805866cd4edef806c32393621df0
  https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=00211fc69c926d6c8f6e3f3cf1d8802623db2af9
  https://gitweb.git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=8e795a8d9f8c3269a3d30d0d1adbaf0ea9ad4a84

* Hypothesis 3:
  The Emacs 30.2 binaries are linked with MSVCRT, not with UCRT.

* Hypothesis 4:
  Enabling the option "Beta: Use Unicode UTF-8 for worldwide language support"
  has a different effect than creating a .manifest file like the Gnulib
  test suite does.
  <https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page>

Hypothesis 4 sounds unlikely.
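
One way to gather evidence about hypothesis 4 is to look at the ANSI code page a process actually sees with the beta option switched ON and OFF. The sketch below is an illustration under assumptions, not something from this thread: it calls the Win32 function GetACP through Python's ctypes, and the helper names active_code_page and describe are made up here. It reports the code page of the Python process, which may differ from the Emacs process if the two executables carry different manifests, and it says nothing about whether a binary is linked with UCRT or MSVCRT.

```python
import sys

UTF8_CODE_PAGE = 65001  # Windows code page identifier for UTF-8


def active_code_page():
    """Return this process's ANSI code page on Windows, or None elsewhere."""
    if sys.platform != "win32":
        return None
    import ctypes  # imported lazily; windll exists only on Windows
    return int(ctypes.windll.kernel32.GetACP())


def describe(acp):
    """Turn a code page number (or None) into a short human-readable line."""
    if acp is None:
        return "not running on Windows"
    if acp == UTF8_CODE_PAGE:
        return "ANSI code page is UTF-8 (65001)"
    return f"ANSI code page is {acp}, not UTF-8"


print(describe(active_code_page()))
```

Running it once with the option ON and once with it OFF would show whether the system-wide setting reaches ordinary processes at all on the affected machine.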

> I see some COMPILE_WIDE preprocessor
> conditions in the source, but it is not clear to me whether it is
> necessary for Unicode support

This COMPILE_WIDE condition is needed only by glibc for the wcsftime() function.
It is not used by Gnulib. It is not needed for i18n or Unicode support.

* Actions:
  - Bruno: Add a unit test for nstrftime in w32utf8 mode.
  - Eli or Paul: Disprove hypotheses 1, 2, 3.

  > Shingo Tanaka, could you please tell what is the value of
  > w32-multibyte-code-page on your system, both when "Beta: Use Unicode
  > UTF-8 for worldwide language support" is ON and when it is OFF?

  Yes, this info would be useful.

Bruno

This bug report was last modified 21 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.