#13936 - Default to UTF-8 for most Emacs source files

GNU bug report logs - #13936
Default to UTF-8 for most Emacs source files

Package: emacs;

Reported by: Paul Eggert <eggert <at> cs.ucla.edu>

Date: Tue, 12 Mar 2013 21:23:01 UTC

Severity: wishlist

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

Message #38 received at 13936 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Kenichi Handa <handa <at> gnu.org>
Cc: 13936 <at> debbugs.gnu.org
Subject: Re: bug#13936: Default to UTF-8 for most Emacs source files
Date: Wed, 20 Mar 2013 09:43:38 -0700

On 03/20/13 01:18, Kenichi Handa wrote:

> Among CJK files, I think K(orean) files can be in UTF-8
> without problem.

It's easy enough to convert the K files to UTF-8 too, and I'll propose a patch to do that in followup email.

> Are there any people familiar with Korean situation?

Sorry, I don't know. For what it's worth, when I use Emacs to convert TUTORIAL.ko to UTF-8 and back, the result is identical to the original, so no information is lost by making that change. (This is not true for TUTORIAL.ja.)

I have another question. Shouldn't it be OK to convert Elisp source files such as leim/quail/japanese.el to UTF-8 as well? Emacs internally converts their text to UTF-8 while compiling them, so the corresponding .elc files are in UTF-8 already, and there should be no functional difference if we convert the .el files to UTF-8.

Converting these files to UTF-8 would fix an inconsistency in Emacs behavior. For example, if I visit the file leim/quail/japanese.el I see this definition:

    (defvar quail-japanese-use-double-n nil
      "If non-nil, use type \"nn\" to insert ん.")

where the character 'ん' is displayed using code point 0x2473 in charset japanese-jisx0208. But if I *use* the above definition string, by typing "C-h v quail-japanese-use-double-n RET", the help string that I see has been translated to UTF-8, so Emacs displays that character using code point 0x3093 in charset unicode instead. It would be better if the runtime behavior matched the source code, and an easy way to do that would be to convert the source code to UTF-8.

Here is the list of the remaining .el files that I'd like to convert to UTF-8:

    leim/quail/cyril-jis.el
    leim/quail/hanja-jis.el
    leim/quail/japanese.el
    leim/quail/py-punct.el
    leim/quail/pypunct-b5.el
    lisp/international/ja-dic-cnv.el
    lisp/international/ja-dic-utl.el
    lisp/international/kinsoku.el
    lisp/international/kkc.el
    lisp/international/titdic-cnv.el
    lisp/language/japan-util.el
    lisp/language/japanese.el
    lisp/term/x-win.el

x-win.el is a special case, since it has two "Kana: Fixme:" lines talking about problems when converting to UTF-8 -- evidently these are issues in our current setup anyway since Emacs converts the text to UTF-8 before compiling it.
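
For readers who want to reproduce the two checks described above outside Emacs, here is a minimal sketch in Python. It is an illustration only: it uses Python's built-in codecs rather than Emacs's own coding systems, so its results need not match what Emacs does for a given file. The function names are hypothetical, and the choice of EUC-KR for the sample text is an assumption, since the message does not say which encoding TUTORIAL.ko used.

```python
def roundtrips_via_utf8(raw: bytes, legacy: str) -> bool:
    """Return True if raw survives legacy -> UTF-8 -> legacy unchanged.

    `legacy` is a Python codec name (an assumption; Emacs coding-system
    names and conversion rules differ from Python's).
    """
    try:
        text = raw.decode(legacy)
        utf8 = text.encode("utf-8")
        return utf8.decode("utf-8").encode(legacy) == raw
    except UnicodeError:
        # Bytes that cannot be decoded or re-encoded do not round-trip.
        return False


def jisx0208_code_point(ch: str) -> int:
    """Return the JIS X 0208 code point of a single character.

    EUC-JP stores a JIS X 0208 character as two bytes with the high bit
    set, so clearing that bit in each byte recovers the code point.
    """
    hi, lo = ch.encode("euc_jp")
    return ((hi & 0x7F) << 8) | (lo & 0x7F)


if __name__ == "__main__":
    sample = "한글".encode("euc_kr")  # sample text, not the real tutorial
    print(roundtrips_via_utf8(sample, "euc_kr"))
    print(hex(jisx0208_code_point("ん")), hex(ord("ん")))
```

The second function shows the same pair of code points mentioned in the message: the character has one number in the japanese-jisx0208 charset and a different one in Unicode, even though it is the same character.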

This bug report was last modified 12 years and 101 days ago.

