GNU bug report logs -
#79296
30.2; format-time-string returns wrongly encoded string in MS Windows Japanese with cp65001 beta config
Message #50 received at 79296 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii wrote:
> Any details beyond that general consideration? Are you saying that
> MSVCRT doesn't support codepage 65001 as a codeset of a locale,
> whereas UCRT does? Do the tests you wrote fail when linked with
> MSVCRT?
Tried it now: running that unit test in the Windows UTF-8 environment, linked
against MSVCRT:
* GetACP() returns 65001, which is not surprising, since GetACP() is a
Windows API, not a libc API.
* setlocale (LC_ALL, "") fails. [This was the Gnulib setlocale() override.
I assume the MSVCRT setlocale failed in the same way.]
* If you ignore the setlocale failure, MB_CUR_MAX is less than 4, which
means that the locale encoding is not UTF-8.
MSVCRT supports only MB_CUR_MAX == 1 or == 2.
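The three checks above can be sketched roughly as follows. This is not the
Gnulib unit test itself, only a minimal illustration; the names
locale_probe, probe_locale and could_be_utf8 are made up for this example,
and the GetACP() call is compiled only on Windows.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#ifdef _WIN32
# include <windows.h>
#endif

/* Hypothetical record of what the environment reports.  */
struct locale_probe
{
  unsigned int acp;   /* GetACP() on Windows, 0 elsewhere */
  int setlocale_ok;   /* nonzero if setlocale (LC_ALL, "") succeeded */
  int mb_cur_max;     /* MB_CUR_MAX after the setlocale attempt */
};

/* Query the Windows API and the C runtime separately, because the two
   can disagree about the encoding.  */
struct locale_probe
probe_locale (void)
{
  struct locale_probe p;
#ifdef _WIN32
  p.acp = GetACP ();
#else
  p.acp = 0;
#endif
  p.setlocale_ok = (setlocale (LC_ALL, "") != NULL);
  p.mb_cur_max = (int) MB_CUR_MAX;
  return p;
}

/* A UTF-8 character occupies up to 4 bytes, so a locale whose encoding
   is UTF-8 must report MB_CUR_MAX >= 4.  */
int
could_be_utf8 (int mb_cur_max)
{
  return mb_cur_max >= 4;
}

/* Print the probe result in a form comparable to the list above.  */
void
print_probe (const struct locale_probe *p)
{
  printf ("GetACP() = %u\n", p->acp);
  printf ("setlocale (LC_ALL, \"\") %s\n",
          p->setlocale_ok ? "succeeded" : "failed");
  printf ("MB_CUR_MAX = %d (%s)\n", p->mb_cur_max,
          could_be_utf8 (p->mb_cur_max)
          ? "could be UTF-8" : "cannot be UTF-8");
}
```

Calling probe_locale() and then print_probe() from main() in a binary linked
against MSVCRT would be expected to show the mismatch described above: an
active code page of 65001 together with an MB_CUR_MAX of 1 or 2.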
Looking at the output of "dumpbin /imports emacs.exe", I see that the Emacs
binary uses the following symbols from MSVCRT:
6C ___lc_codepage_func
6F ___mb_cur_max_func
188 _getmbcp
240 _mbschr
252 _mbsinc
256 _mbslwr
27A _mbsncpy
27E _mbsnextc
28C _mbspbrk
28E _mbsrchr
302 _snprintf
33C _stricmp
343 _strlwr
34A _strnicmp
4B1 fprintf
4D4 isalpha
4DC isspace
4EB isxdigit
4EF localeconv
51E setlocale
534 strerror
535 strftime
556 tolower
557 toupper
55D vfprintf
Most of these are sensitive to the locale encoding and therefore
will not produce the expected results in a UTF-8 environment.
Additionally, the Emacs binary uses several DLLs, some of which
also use locale-aware functions from libc. These DLLs will not
work as expected either.
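To illustrate with one function from the list: strftime returns bytes in
whatever encoding the C runtime's locale uses, so a caller that assumes
UTF-8 will misdecode them when the runtime is using something else. The
sketch below only makes those bytes visible; month_name_bytes and hex_dump
are hypothetical helper names, not Emacs or Gnulib functions.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format the name of month MON (0..11) with
   strftime.  Returns the number of bytes written, or 0 on failure.
   The bytes are in the encoding of the current LC_TIME locale.  */
size_t
month_name_bytes (char *buf, size_t size, int mon)
{
  struct tm tm;
  memset (&tm, 0, sizeof tm);
  tm.tm_mon = mon;
  tm.tm_mday = 1;
  tm.tm_year = 100;           /* years since 1900, i.e. 2000 */
  return strftime (buf, size, "%B", &tm);
}

/* Hypothetical helper: render LEN bytes as space-separated hex digits
   in OUT, so the encoding can be inspected without decoding it.  */
void
hex_dump (const char *bytes, size_t len, char *out, size_t out_size)
{
  size_t pos = 0;
  if (out_size == 0)
    return;
  out[0] = '\0';
  /* Each byte needs at most 3 characters plus the terminating NUL.  */
  for (size_t i = 0; i < len && pos + 4 <= out_size; i++)
    pos += (size_t) snprintf (out + pos, out_size - pos,
                              i ? " %02x" : "%02x",
                              (unsigned char) bytes[i]);
}
```

Running both helpers after setlocale (LC_ALL, "") and comparing the hex
output with the UTF-8 encoding of the expected month name is one way to see
whether the runtime and the caller agree on the encoding.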
So, the only reasonable way forward, for supporting the Windows UTF-8
environment, is to produce two sets of binaries for Emacs:
- one set of .exe and .dlls linked with MSVCRT, for use on old
Windows versions,
- one set of .exe and .dlls linked with UCRT, for use on Windows
versions from 2019 or newer [1].
For producing such binaries with only Free Software (no MSVC compiler,
no MSVC header files) one can use MSYS2. For a year or two now,
it has supported two target environments:
- mingw-w64 with MSVCRT,
- mingw-w64 with UCRT.
These two development environments are very similar, which means that
the Makefile will need very few adaptations.
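If the sources ever need to tell the two builds apart, a compile-time check
along these lines might be enough. It assumes that the runtime headers
define the _UCRT macro when targeting UCRT, which I believe is the case for
mingw-w64 but which should be verified; crt_flavor is a hypothetical name.

```c
#include <stdio.h>   /* any runtime header, so that _UCRT is visible if set */

/* Hypothetical helper: report which C runtime this binary was built
   against.  Assumption: _UCRT is defined by the headers for UCRT builds
   and left undefined for MSVCRT builds.  */
const char *
crt_flavor (void)
{
#if defined _WIN32 && defined _UCRT
  return "UCRT";
#elif defined _WIN32
  return "MSVCRT or another non-UCRT runtime";
#else
  return "non-Windows";
#endif
}
```

A build could print crt_flavor() in its version output, so that bug reports
show which of the two sets of binaries is in use.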
Bruno
[1] https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page
This bug report was last modified 21 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.