GNU bug report logs -
#16292
24.3.50; info docs now contain single straight quotes instead of `'
Reported by: Gregor Zattler <grfz <at> gmx.de>
Date: Sun, 29 Dec 2013 22:10:01 UTC
Severity: wishlist
Found in version 24.3.50
Fixed in version 24.4
Done: Glenn Morris <rgm <at> gnu.org>
Bug is archived. No further changes may be made.
Message #60 received at 16292 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 30 Dec 2013 21:58:31 -0800
> From: Paul Eggert <eggert <at> cs.ucla.edu>
> CC: 16292 <at> debbugs.gnu.org, grfz <at> gmx.de
>
> Eli Zaretskii wrote:
>
> > (names of people) so using Latin-1 doesn't hamper users'
> > ability to read the manual in any way
>
> Most of the non-ASCII words are people's names, but many are not,
> and often ASCIIfying these would hurt the manual.
> These include symbols (e.g., "¬"), examples of encoding
> ("@samp{Naïve} is encoded as @samp{=?iso-8859-1?q?Na=EFve?=}"),
> calendars ("Bahá'í"), the names of GNU programs ("真 Gnus"),
> and configuration examples ("écrit" in email configuration).

I don't think we care about the encoding of a handful of words, as
long as the bulk of the manual, including markup and quotes, is
legible. I only mentioned Latin-1 because it seemed to cover most of
the non-ASCII characters. But I don't insist on it. Neither do I
insist on a single-byte encoding of those few words and names; in
particular, UTF-8 will do -- but only for the non-ASCII text in the
manuals.

> Nowadays, on GNUish and POSIXish systems in the Emacs target
> audience, there's more usage of UTF-8 than of Latin-1. On
> Ubuntu and Fedora, for example, the default locale for US
> English is en_US.utf8. Hence, converting info files to
> Latin-1 would hurt standalone info users in the typical
> setup on GNUish and POSIXish platforms.

It hurts them in a very small number of places, most or all of which
don't affect in any way the ability of the reader to read and
understand the presented material.

As I said above, I won't object to having the non-ASCII words encoded
in UTF-8, as long as it doesn't affect the (single and double) quote
characters, and any other characters/strings (like '#' and '=>') we
use for describing the Emacs and Lisp features.

The problem here is that @documentencoding is virulent when you use
UTF-8: it affects the quotes, not just the non-ASCII text in the
Texinfo sources. This is unlike any other value of @documentencoding.
And that is the only problem that bothers me, and IMO should bother
us all.

Perhaps a possible solution would be to customize OPEN_QUOTE_SYMBOL
and CLOSE_QUOTE_SYMBOL (although I'm not sure they affect double
quotes), or to edit the Info files with Sed to replace Unicode quote
characters with some ASCII characters. The rest of the non-ASCII text
can be left intact, in UTF-8.
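
The second option could look roughly like the sketch below. It is
written in Python rather than Sed, and it assumes the Unicode quotes in
question are U+2018/U+2019 and U+201C/U+201D; the helper name
asciify_quotes is hypothetical. Changing byte lengths may also shift
the offsets recorded in an Info file's tag table, so treat this as an
illustration of the substitution only.

```python
# Hypothetical helper: map Unicode quotation marks back to ASCII and
# leave every other non-ASCII character (names, symbols) untouched.
# Assumption: these four code points are the quotes to be replaced.
QUOTE_MAP = str.maketrans({
    "\u2018": "`",   # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK
    "\u201c": '"',   # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',   # RIGHT DOUBLE QUOTATION MARK
})


def asciify_quotes(text: str) -> str:
    """Return text with Unicode quotes replaced by ASCII ones."""
    return text.translate(QUOTE_MAP)


if __name__ == "__main__":
    # The quotes become ASCII; the non-ASCII letter is kept as-is.
    print(asciify_quotes("\u2018car\u2019 and Na\u00efve"))
```

Only the quote characters are rewritten, so names and symbols in the
manuals would stay in UTF-8.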

> Perhaps Microsoft Windows users are different, and typically
> use Latin-1 or some other unibyte encoding.

This has nothing to do with Windows; I first hit the problem on a
GNU/Linux machine that was configured with a non-UTF-8 locale. The
reason I haven't seen the problem since last March is that I still use
makeinfo from Texinfo 4.13, which doesn't affect the quote characters
when a @documentencoding of UTF-8 is specified. So the Info files I
produce when I build Emacs don't suffer from this misfeature.

This bug report was last modified 11 years and 18 days ago.