GNU bug report logs - #15803
default-file-name-coding-system: utf-8 better than latin-1 these days?

Package: emacs;

Reported by: Glenn Morris <rgm <at> gnu.org>

Date: Mon, 4 Nov 2013 18:46:01 UTC

Severity: normal

Tags: fixed

Found in version 24.3

Fixed in version 28.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.


Message #18 received at 15803 <at> debbugs.gnu.org (full text, mbox):

From: Glenn Morris <rgm <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 15803 <at> debbugs.gnu.org
Subject: Re: bug#15803: default-file-name-coding-system: utf-8 better than latin-1 these days?
Date: Mon, 11 Dec 2017 20:38:15 -0500

Eli Zaretskii wrote:

> Are we sure all macOS/Darwin systems are sufficiently Posix in this
> aspect?

Emacs on Darwin has been unconditionally using utf-8 for over a decade.
It's special-cased in mule-cmds, as visible in the diff I sent.

>> It works fine for me on G/L to have source, build, and install
>> directories be distinct non-ASCII directories.
>
> Was it in a UTF-8 locale or in a non-UTF-8 locale?  The latter is the
> potentially problematic case, AFAIR.

I had LANG=en_US.UTF-8. I've repeated with LANG=en_US. Still works.

>>    If @code{file-name-coding-system} is @code{nil}, Emacs uses a
>>  default coding system determined by the selected language environment,
>> -and stored in the @code{default-file-name-coding-system} variable.
>> -@c FIXME?  Is this correct?  What is the "default language environment"?
>> -In the default language environment, non-@acronym{ASCII} characters in
>> -file names are not encoded specially; they appear in the file system
>> -using the internal Emacs representation.
>> +and stored in the @code{default-file-name-coding-system} variable
>> +(normally UTF-8).
>
> Not sure why you removed the sentence which had the FIXME comment.  Is
> it in any way related to the issue at hand?

I wrote the FIXME comment. In 5 years, no-one has addressed it.
Defaulting to UTF-8 makes it no longer relevant, so it seems better to
remove it.

> Why are we changing sendmail-coding-system?  It has nothing to do with
> file names, AFAIK.

I'm changing all (3) things that currently default to latin-1 to default to
utf-8.

>> Btw, why does the default matter so much?  Once Emacs starts up
>> default-file-name-coding-system on GNU/Linux is set to UTF-8, if the
>> locale says so.  Is this just an aesthetic issue?

utf-8 is the sensible, "modern" (i.e., non-ancient) default.
If there is no reason to use latin-1, Emacs should use utf-8.
I'm not claiming it's critical.

Take it or leave it, as you wish.





