GNU bug report logs - #75207
29.4; Path conversion from native codepage to UTF-8 fails when Windows is set by default to UTF-8

Package: emacs;

Reported by: michal <at> 0lock.xyz

Date: Mon, 30 Dec 2024 18:30:02 UTC

Severity: wishlist

Found in version 29.4

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.


From: Eli Zaretskii <eliz <at> gnu.org>
To: michal <at> 0lock.xyz
Cc: 75207 <at> debbugs.gnu.org
Subject: bug#75207: Fwd: bug#75207: 29.4; Path conversion from native codepage to UTF-8 fails when Windows is set by default to UTF-8
Date: Sat, 04 Jan 2025 11:30:34 +0200
> Cc: 75207 <at> debbugs.gnu.org
> Date: Fri, 03 Jan 2025 17:25:31 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
> 
> > I debugged a bit and it looks like w32_ansi_code_page is set to 1252 at some point.
> 
> AFAICT, that happens when we load the pdumper file.
> 
> > M-: w32-multibyte-code-page -> 0
> > M-: locale-coding-system -> cp65001
> > M-: file-name-coding-system -> nil
> > M-: default-file-name-coding-system -> cp65001
> 
> OK, I think this confirms my hypothesis.  I'll try to come up with a
> patch, probably tomorrow.

The patch is below, and it is for the master branch of the Emacs Git
repository.

> > > If I send you a C-level patch, are you able to build Emacs after patching it,
> > > preferably the master branch of our Git repository?
> > 
> > Sure.
> 
> OK, but you'll need to build Emacs with a different system codepage to
> see the effects of the fix.

This still stands: to fully test the patch, please change your system
codepage after building Emacs and then start Emacs and see if
everything works as expected.

diff --git a/src/emacs.c b/src/emacs.c
index c1e0c9f..896f219 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -1419,7 +1419,18 @@ android_emacs_init (int argc, char **argv, char *dump_file)
 
 #ifdef HAVE_PDUMPER
   if (attempt_load_pdump)
-    initial_emacs_executable = load_pdump (argc, argv, dump_file);
+    {
+      initial_emacs_executable = load_pdump (argc, argv, dump_file);
+#ifdef WINDOWSNT
+  /* Reinitialize the codepage for file names, needed to decode
+     non-ASCII file names during startup.  This is needed because
+     loading the pdumper file above assigns to those variables values
+     from the dump stage, which might be incorrect, if dumping was done
+     on a different system.  */
+      if (dumped_with_pdumper_p ())
+	w32_init_file_name_codepage ();
+#endif
+    }
 #else
   ptrdiff_t bufsize;
   initial_emacs_executable = find_emacs_executable (argv[0], &bufsize);
diff --git a/src/w32.c b/src/w32.c
index a493991..deeca03 100644
--- a/src/w32.c
+++ b/src/w32.c
@@ -1685,6 +1685,19 @@ w32_init_file_name_codepage (void)
 {
   file_name_codepage = CP_ACP;
   w32_ansi_code_page = CP_ACP;
+#ifdef HAVE_PDUMPER
+  /* If we were dumped with pdumper, this function will be called after
+     loading the pdumper file, and needs to reset the following
+     variables that come from the dump stage, which could be on a
+     different system with different default codepages.  Then, the
+     correct value of w32-ansi-code-page will be assigned by
+     globals_of_w32fns, which is called from 'main'.  Until that call
+     happens, w32-ansi-code-page will have the value of CP_ACP, which
+     stands for the default ANSI codepage.  The other variables will be
+     computed by codepage_for_filenames below.  */
+  Vdefault_file_name_coding_system = Qnil;
+  Vfile_name_coding_system = Qnil;
+#endif
 }
 
 /* Produce a Windows ANSI codepage suitable for encoding file names.
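
The ordering problem the patch addresses can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Emacs implementation: every name prefixed with model_ or MODEL_ is hypothetical, the dumped value 1252 is taken from the codepage reported in this thread, and the real logic in codepage_for_filenames and globals_of_w32fns is reduced here to a single cached variable.

```c
#include <assert.h>

/* Toy model only: these names are hypothetical stand-ins, not Emacs code.  */

#define MODEL_CP_ACP 0u   /* plays the role of CP_ACP: "ask the system" */

/* Plays the role of a codepage variable that is saved in the dump file.  */
static unsigned model_cached_codepage = MODEL_CP_ACP;

/* Loading the dump overwrites live state with whatever the build
   machine recorded, here assumed to be codepage 1252.  */
static void
model_load_dump (void)
{
  model_cached_codepage = 1252;
}

/* The reset step: forget the dumped value so that it is recomputed.  */
static void
model_reinit_codepage (void)
{
  model_cached_codepage = MODEL_CP_ACP;
}

/* Return the codepage used for file names: the cached one if set,
   otherwise the running system's codepage SYSTEM_CP.  */
static unsigned
model_effective_codepage (unsigned system_cp)
{
  if (model_cached_codepage == MODEL_CP_ACP)
    model_cached_codepage = system_cp;
  return model_cached_codepage;
}

/* Simulate startup on a system whose codepage is SYSTEM_CP, with or
   without the reset after loading the dump.  */
unsigned
model_startup (unsigned system_cp, int reset_after_load)
{
  assert (system_cp != MODEL_CP_ACP);
  model_load_dump ();
  if (reset_after_load)
    model_reinit_codepage ();
  return model_effective_codepage (system_cp);
}
```

In this model, model_startup (65001, 0) returns the stale value 1252 carried over from the dump, while model_startup (65001, 1) returns 65001, the codepage of the running system.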




This bug report was last modified 189 days ago.
