GNU bug report logs - #20499
[PROPOSED PATCH] C-x 8 shorthands for curved quotes, Euro, etc.

Package: emacs;

Reported by: Paul Eggert <eggert <at> cs.ucla.edu>

Date: Mon, 4 May 2015 01:15:03 UTC

Severity: wishlist

Tags: patch

Merged with 16082

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Ivan Shmakov <ivan <at> siamics.net>
Cc: 20499 <at> debbugs.gnu.org
Subject: bug#20499: [PROPOSED PATCH] C-x 8 shorthands for curved quotes, Euro, etc.
Date: Thu, 07 May 2015 00:53:32 -0700
> 	I believe that both C-x 8 . and C-x 8 u are too convenient to be
> 	dropped without more discussion.  For one thing, · seems more
> 	“common” a character than İ.

In Turkish and Azerbaijani the reverse is true.  And since RMS requested dotted 
I and dotless i my assumption was that Turkish is of some importance.  Dotted 
sequences are the natural ways to type these characters as well as other dotted 
letters ĊċĖėĠġĿŀŻż in the proposal (used variously in Lithuanian, Maltese, and 
Polish), so there is a pretty strong case to usurp "C-x 8 .".

The case for usurping "C-x 8 u" is even stronger, since "C-x 8 u" merely 
duplicates the equally short "C-x 8 m", some easily typed symbol is needed to 
denote breve, and "u" looks more like breve than any other ASCII character does.

>       Other than that, C-x 8 . . feels
> 	easier to type than C-x 8 SPC.

Good point, and I've done this in the attached patch.

>  > -;;; iso-transl.el --- keyboard input definitions for ISO 8859-1  -*- coding: utf-8 -*-
>  > +;;; iso-transl.el --- keyboard input for ISO characters -*- coding: utf-8 -*-
>
> 	I guess we may safely state “ISO 10646” here.

Thanks, done in the attached patch.

>  > +;; This package supports all characters defined by ISO 8859-1,
>  > +;; along with many other Latin characters and a few other characters
>  > +;; commonly used in English and basic math.
>
> 	… And may also mention it here.

Thanks, also done.

>  >      ("-"    . [?­])
>  > -    ("*."   . [?·])
>
> 	The removal above doesn’t seem to be strictly necessary.  The
> 	same for the *= and *u ones.

Thanks, fixed in the attached patch.

> 	… Also, did you consider generating this list automatically,
> 	based on the codepoint properties already known to Emacs?
> 	Something along the lines of the function MIMEd, which readily
> 	produces a list of entries for the following 133 characters.
> 	(Three spaces added for symmetry purposes.)
>
>     À Á Â Ã Ä È É Ê Ë Ì Í Î Ï Ñ Ò Ó Ô Õ Ö Ù Ú Û Ü Ý
>     à á â ã ä è é ê ë ì í î ï ñ ò ó ô õ ö ù ú û ü ý
>     ÿ   Ā ā Ć ć Ĉ ĉ Č č Ď ď Ē ē Ě ě Ĝ ĝ Ĥ ĥ Ĩ ĩ Ī ī Ĵ ĵ Ĺ ĺ
>     Ľ ľ Ń ń Ň ň Ō ō Ŕ ŕ Ř ř Ś ś Ŝ ŝ Š š Ť ť Ũ ũ Ū ū Ŵ ŵ Ŷ ŷ
>     Ÿ   Ź ź Ž ž Ǎ ǎ Ǐ ǐ Ǒ ǒ Ǔ ǔ Ǧ ǧ Ǩ ǩ   ǰ Ǵ ǵ Ǹ ǹ Ș ș Ț ț
>     Ȟ ȟ Ȳ ȳ

Sorry, I don't really follow the code that you attached.  Although I suppose it 
comes from a decomposition table, I don't know what the table was designed for, 
and it's not clear to me how it's relevant.  Anyway, most of those letters are 
either in iso-transl.el now, or are in the previously proposed patch.  Here are 
the exceptional (i.e., missing even in the previously proposed patch) letters, 
along with some comments about these exceptions:

> Ǎ ǎ Ǐ ǐ Ǒ ǒ Ǔ ǔ Ǹ ǹ

These are for toned Pinyin but this list is incomplete.  If we wanted to cover 
toned Pinyin, we'd also need Ǖ ǖ Ǘ ǘ Ǚ ǚ Ǜ ǜ.  Coming up with two-character 
abbreviations for all these might be tricky.  Most Pinyin usage omits the tones.

> Ǧ ǧ Ǩ ǩ

These are Skolt Sami but this list is also incomplete; we'd also need Ʒ Ǥ ǥ Ǯ ǯ 
ʒ at least.

> ǰ

What language uses this?  I couldn't find one.

> Ǵ ǵ

Good catch.  These are used for transliteration from Serbian and Macedonian.  We 
should also include Ḱ ḱ as they are also needed.  Included in the attached patch.

> Ȟ ȟ

Used in Finnish Kalo, which is quite obscure.

> Ȳ ȳ

Used in Livonian, but for that we'd also need a whole bunch of other letters, 
including Ǟ ǟ Ḑ ḑ Ȫ ȫ Ȭ ȭ Ȯ ȯ Ȱ and I've probably omitted some.  Plus, modern 
Livonian doesn't seem to be using Ȳ ȳ any more....

Anyway, part of what's going on here is that the proposed list doesn't cover 
every Latin character in the ISO 10646 repertoire (that'd be a large set), but 
instead is limited to what appear to be reasonably commonly used letters.  Admittedly 
this is not universal but one must cut things off somewhere, and it would be odd 
to add only partial coverage for toned Pinyin, Livonian, etc.
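
For readers unfamiliar with the decomposition-based approach discussed above, 
here is a minimal sketch in Python (not the Emacs Lisp function that was 
attached to the quoted message, which is not reproduced here).  It derives a 
two-character shorthand from a character's canonical Unicode decomposition.  
The ACCENT_KEYS table and the shorthand function are hypothetical names chosen 
for illustration; the "u" and "." entries follow the breve and dot-above 
choices argued for in this message, and the rest are assumptions.

```python
import unicodedata

# Hypothetical table: combining mark -> ASCII prefix key (illustrative only).
ACCENT_KEYS = {
    "\u0300": "`",   # combining grave accent
    "\u0301": "'",   # combining acute accent
    "\u0302": "^",   # combining circumflex accent
    "\u0303": "~",   # combining tilde
    "\u0308": '"',   # combining diaeresis
    "\u0306": "u",   # combining breve
    "\u0307": ".",   # combining dot above
}


def shorthand(ch):
    """Return a two-character shorthand for ch, or None if none can be derived."""
    parts = unicodedata.decomposition(ch).split()
    # Accept only canonical decompositions of the form "BASE MARK";
    # compatibility decompositions start with a tag such as "<compat>".
    if len(parts) != 2 or parts[0].startswith("<"):
        return None
    base, mark = (chr(int(p, 16)) for p in parts)
    key = ACCENT_KEYS.get(mark)
    # Skip marks with no assigned key and bases that are not plain ASCII.
    if key is None or not base.isascii():
        return None
    return key + base
```

Under these assumptions a character such as İ (canonically I plus dot above) 
gets a shorthand, whereas dotless ı has no decomposition at all and toned 
Pinyin ǖ decomposes to a non-ASCII base (ü) plus a second mark, so neither can 
be derived this way.  Any such generated list would still need hand-written 
entries for the remaining characters.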

>  > --------------090904020002020306060104
>  > Content-Type: text/x-patch;
>  >  name="0001-C-x-8-shorthands-for-curved-quotes-Euro-etc.patch"
>
> 	This MIME part sure wants ‘; charset=UTF-8’.  Otherwise, Gnus
> 	does no decoding, and Emacs shows the contents with the likes of
> 	\304\260.

Hmm, it works for me.  I use Thunderbird to read the top level message, and it 
spins off an Emacs to display the attachment with no problem.  The web-site 
archive at <http://bugs.gnu.org/20499#60> also works for me with Firefox.

It's common for people to send the output of "git send-email" as attachments; if 
this doesn't work with Gnus I suppose a Gnus user (i.e. not me :-) should file a 
bug report.  I looked around the net and found other Gnus users with similar 
problems and some code that worked for them; please see 
<http://bewatermyfriend.org/p/2011/00a/> and/or 
<http://blog.printf.net/articles/tag/emacs/>.  But this stuff appeared to be 
several years old and this leads me to hope that maybe recent-enough Gnus 
versions will do the right thing already.
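
As an illustration of the \304\260 symptom quoted above, here is a small 
Python sketch (independent of Gnus or Thunderbird; the variable names are 
made up for the example).  The two octal escapes are the UTF-8 encoding of a 
single character, so they show up as raw bytes when no charset is applied and 
as the intended letter once the bytes are decoded as UTF-8.

```python
# The two bytes shown as \304\260 when the attachment is left undecoded.
raw = bytes([0o304, 0o260])

# Decoding as UTF-8 yields one character: U+0130, capital I with dot above.
decoded = raw.decode("utf-8")

# Decoding the same bytes as Latin-1 instead yields two unrelated characters.
misread = raw.decode("latin-1")

print(repr(raw), decoded, misread)
```

This only shows why a missing or wrong charset matters for the patch's 
non-ASCII characters; it does not say which mail client is at fault.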

[0001-C-x-8-shorthands-for-curved-quotes-Euro-etc.patch (text/x-patch, attachment)]

This bug report was last modified 4 years and 343 days ago.
