Package: emacs
Reported by: Glenn Morris <rgm <at> gnu.org>
Date: Mon, 4 Nov 2013 18:46:01 UTC
Severity: normal
Tags: fixed
Found in version 24.3
Fixed in version 28.1
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
From: Eli Zaretskii <eliz <at> gnu.org>
To: Glenn Morris <rgm <at> gnu.org>
Cc: 15803 <at> debbugs.gnu.org
Subject: bug#15803: default-file-name-coding-system: utf-8 better than latin-1 these days?
Date: Fri, 08 Dec 2017 11:46:29 +0200

> From: Glenn Morris <rgm <at> gnu.org>
> Cc: 15803 <at> debbugs.gnu.org
> Date: Mon, 04 Dec 2017 19:35:05 -0500
>
> Eli Zaretskii wrote:
>
> > Perhaps on Posix systems, but not elsewhere.
>
> I assume non-POSIX is newspeak for MS-Windows (native and DOS).

I didn't say "non-Posix"; you did.  MS-Windows is definitely not a
Posix system, but whether it is the only one, I don't know.  Are we
sure all macOS/Darwin systems are sufficiently Posix in this aspect?
AFAIR they use quite different encoding methods for file names
(canonical normalization etc.).

> > And if we make the change, we should make sure building Emacs in a
> > non-ASCII directory still works.
>
> It works fine for me on G/L to have source, build, and install
> directories be distinct non-ASCII directories.

Was it in a UTF-8 locale or in a non-UTF-8 locale?  The latter is the
potentially problematic case, AFAIR.

> (Emacs works, that is,
> but makeinfo 5.1 fails to find @include files in non-ASCII directories,
> so I wonder how common such setups are.)

Building a release tarball doesn't require makeinfo.

> BTW, it feels very dated to me to have discussion of Windows 9X in the
> Emacs manual section on file-name-coding.

We still try to support it, and the aspects of file-name encoding
related to it are definitely non-trivial.  Everything described there
is in the code.

> diff --git i/doc/emacs/mule.texi w/doc/emacs/mule.texi
> index 78f77cb..5fc44a6 100644
> --- i/doc/emacs/mule.texi
> +++ w/doc/emacs/mule.texi
> @@ -1214,11 +1214,8 @@ system can encode.
>
>  If @code{file-name-coding-system} is @code{nil}, Emacs uses a
>  default coding system determined by the selected language environment,
> -and stored in the @code{default-file-name-coding-system} variable.
> -@c FIXME? Is this correct? What is the "default language environment"?
> -In the default language environment, non-@acronym{ASCII} characters in
> -file names are not encoded specially; they appear in the file system
> -using the internal Emacs representation.
> +and stored in the @code{default-file-name-coding-system} variable
> +(normally UTF-8).

Not sure why you removed the sentence which had the FIXME comment.  Is
it in any way related to the issue at hand?

>  @cindex file-name encoding, MS-Windows
>  @vindex w32-unicode-filenames
> diff --git i/lisp/international/mule-cmds.el w/lisp/international/mule-cmds.el
> index 9d22d6e..192f0e9 100644
> --- i/lisp/international/mule-cmds.el
> +++ w/lisp/international/mule-cmds.el
> @@ -1797,10 +1797,11 @@ The default status is as follows:
>       'raw-text)
>
>    (set-default-coding-systems nil)
> -  (setq default-sendmail-coding-system 'iso-latin-1)
> -  ;; On Darwin systems, this should be utf-8-unix, but when this file is loaded
> -  ;; that is not yet defined, so we set it in set-locale-environment instead.
> -  (setq default-file-name-coding-system 'iso-latin-1-unix)
> +  (setq default-sendmail-coding-system 'utf-8)
> +  (setq default-file-name-coding-system (if (memq system-type
> +                                                  '(window-nt ms-dos))
> +                                            'iso-latin-1-unix
> +                                          'utf-8-unix))

Why are we changing sendmail-coding-system?  It has nothing to do with
file names, AFAIK.

>    ;; Preserve eol-type from existing default-process-coding-systems.
>    ;; On non-unix-like systems in particular, these may have been set
>    ;; carefully by the user, or by the startup code, to deal with the
> @@ -1816,8 +1817,10 @@ The default status is as follows:
>          (input-coding
>           (condition-case nil
>               (coding-system-change-text-conversion
> -              (cdr default-process-coding-system) 'iso-latin-1)
> -           (coding-system-error 'iso-latin-1))))
> +              (cdr default-process-coding-system)
> +              (if (memq system-type '(window-nt ms-dos)) 'iso-latin-1 'utf-8))
> +           (coding-system-error
> +            (if (memq system-type '(window-nt ms-dos)) 'iso-latin-1 'utf-8)))))
>      (setq default-process-coding-system
>            (cons output-coding input-coding)))

And this changes the default encoding used to communicate with
sub-processes.  Why?  We never talked about a wholesale change of all
the defaults to UTF-8, that is a much more broad issue than just
encoding of file names.

> diff --git i/lisp/mh-e/mh-comp.el w/lisp/mh-e/mh-comp.el
> index 98067ce..25118cd 100644
> --- i/lisp/mh-e/mh-comp.el
> +++ w/lisp/mh-e/mh-comp.el
> @@ -304,6 +304,7 @@ message and scan line."
>    (let ((draft-buffer (current-buffer))
>          (file-name buffer-file-name)
>          (config mh-previous-window-config)
> +        ;; FIXME this is subtly different to select-message-coding-system.
>          (coding-system-for-write
>           (if (and (local-variable-p 'buffer-file-coding-system
>                                      (current-buffer)) ;XEmacs needs two args
> @@ -315,7 +316,7 @@ message and scan line."
>             (or (and (boundp 'sendmail-coding-system) sendmail-coding-system)
>                 (and (default-boundp 'buffer-file-coding-system)
>                      (default-value 'buffer-file-coding-system))
> -               'iso-latin-1))))
> +               'utf-8))))

Changes like that in MH-E should be communicated to the MH-E
developer; I'm not sure he is reading this list.

And you never answered my question about the rationale:

> Btw, why does the default matter so much?  Once Emacs starts up
> default-file-name-coding-system on GNU/Linux is set to UTF-8, if the
> locale says so.  Is this just an aesthetic issue?