GNU bug report logs - #57531
28.1; Character encoding missing for "eo"

Package: emacs;

Reported by: Jonathan Reeve <jonathan <at> jonreeve.com>

Date: Thu, 1 Sep 2022 19:34:02 UTC

Severity: normal

Tags: moreinfo

Found in version 28.1

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

From: Gregory Heytings <gregory <at> heytings.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Jonathan Reeve <jonathan <at> jonreeve.com>, 57531 <at> debbugs.gnu.org
Subject: bug#57531: 28.1; Character encoding missing for "eo"
Date: Sun, 04 Sep 2022 23:35:37 +0000
>> The problem is in this line from `locale-language-names'. Here's what 
>> it says:
>>
>> `("eo" . "Esperanto")'
>>
>> Here's what it should say:
>>
>> `("eo" "Esperanto" utf-8)'
>
> That's only correct for glibc systems, though, as I already explained. I 
> found no authoritative place on the Internet which would mandate that 
> the Esperanto locale should use or prefer UTF-8 as its encoding.
>
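
For readers comparing the two entry shapes quoted above, here is a minimal sketch of how they differ. It assumes entries take either the form (LOCALE . LANG-ENV) or the form (LOCALE LANG-ENV CODING); the helper `my/entry-preferred-coding' is hypothetical and is not part of Emacs.

```elisp
;; Hypothetical helper, for illustration only.
(defun my/entry-preferred-coding (entry)
  "Return the coding system named in ENTRY, or nil if it names none.
ENTRY is assumed to be (LOCALE . LANG-ENV) or (LOCALE LANG-ENV CODING)."
  (let ((value (cdr entry)))
    ;; In the dotted form, the cdr is a plain string, not a list.
    (and (consp value) (cadr value))))

;; The entry as quoted in the report names no coding system.
(my/entry-preferred-coding '("eo" . "Esperanto"))

;; The proposed entry names utf-8 explicitly.
(my/entry-preferred-coding '("eo" "Esperanto" utf-8))

;; Inspect the entry in the running Emacs.
(assoc "eo" locale-language-names)
```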

I don't think it's possible to find a truly authoritative source of 
information about an artificial language.  One semi-authoritative source 
is Bertilo Wennergren, who is (according to Wikipedia) a member of the 
Esperanto Academy and "holds the post of director of the Academy's General 
Dictionary section".  He appears to be the expert on that matter (namely 
computer encodings for Esperanto), and explains on his website that:

Latin 3 is made for Esperanto and for the Galician, Maltese and Turkish 
languages. However, few computer programs support Latin 3, and some bodies 
have even directly discouraged the use of Latin 3. The Turks currently 
prefer the character code Latin 5 (ISO 8859-9). Esperantists also 
currently prefer and should prefer Unicode instead of Latin 3. [1, 
translation from Google]

He also gives instructions on how to configure a GNU/Linux distribution 
for Esperanto:

To be able to use Esperanto well in Linux, it is necessary that the system 
uses a Unicode locale. Fortunately, more or less all Linux distributions 
currently use Unicode locales by default. To check which character code 
your system's locale uses, type the following command: "locale charmap". 
If the answer appears "UTF-8" (that is the most commonly used code 
representation of Unicode), then everything about character code in your 
locale is already in order. [2, translation from Google]

Amusingly, at the bottom of that page one finds:

It is also possible to speak Esperanto in the powerful text editor 
"Emacs", but I know nothing about "Emacs".  I myself mainly use the Vim 
editor. Here are instructions for installing and configuring Unicode Vim 
7.

So it seems safer to assume that the coding system is UTF-8 when the 
locale is "eo" (which IIUC is what the above suggested change does), and 
to expect users who would not like that default to add

(prefer-coding-system 'iso-latin-3)

in their init file, than to assume ISO-8859-3 when the locale is "eo" 
(which IIUC is what Emacs currently does), and to expect users who do not 
like that default to add

(prefer-coding-system 'utf-8)

in their init file.
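
Users who share one init file across several locales could apply either override only when the locale is Esperanto. The following is a sketch under stated assumptions, not a recommendation: `my/esperanto-locale-p' and `my/current-locale' are hypothetical helpers, and the LC_ALL, LC_CTYPE, LANG lookup order is my assumption about which variable should win.

```elisp
;; Hypothetical helpers, for illustration only; not part of Emacs.
(defun my/esperanto-locale-p (locale)
  "Return non-nil if LOCALE, e.g. \"eo.UTF-8\", names an Esperanto locale."
  (let ((case-fold-search nil))         ; match lowercase "eo" only
    (and (stringp locale)
         ;; "eo" alone, or "eo" followed by a territory, codeset or modifier.
         (string-match-p "\\`eo\\(?:[_.@]\\|\\'\\)" locale))))

(defun my/current-locale ()
  "Return the first non-empty value of LC_ALL, LC_CTYPE and LANG, or nil."
  (let (found)
    (dolist (var '("LC_ALL" "LC_CTYPE" "LANG") found)
      (let ((value (getenv var)))
        (when (and (not found) value (not (equal value "")))
          (setq found value))))))

;; Override the default only under an Esperanto locale.
(when (my/esperanto-locale-p (my/current-locale))
  (prefer-coding-system 'utf-8))        ; or 'iso-latin-3
```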

[1] https://bertilow.com/html/signokodoj/latino3.html

[2] https://bertilow.com/komputo/linukso.html



