Package: emacs;
Reported by: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Mon, 4 May 2015 01:15:03 UTC
Severity: wishlist
Tags: patch
Merged with 16082
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
From: Eli Zaretskii <eliz <at> gnu.org>
To: rms <at> gnu.org
Cc: ivan <at> siamics.net, 20499 <at> debbugs.gnu.org
Subject: bug#20499: [PROPOSED PATCH] C-x 8 shorthands for curved quotes, Euro, etc.
Date: Wed, 06 May 2015 19:27:44 +0300

> Date: Wed, 06 May 2015 09:09:26 -0400
> From: Richard Stallman <rms <at> gnu.org>
> CC: ivan <at> siamics.net, 20499 <at> debbugs.gnu.org
>
> > > > > Would admin/unidata/UnicodeData.txt do?
> > >
> > > It doesn't do the job, because it doesn't contain the characters
> > > themselves.
>
> > You mean, the glyphs?
>
> Yes, exactly.
>
> > (It does show the codepoint, so you can easily
> > display the character via "C-x 8 RET".)
>
> You mean, one character at a time?
>
> I want to be able to scan quickly through the buffer looking at
> lots of characters to find the one I want. If I have to type
> a command for _each character_, just to see it, that is useless
> for the purpose.

Maybe I don't understand the use case you have in mind. I thought the
use case was that you already know the character's name, at least
approximately, and want to look up its code, to type it faster.

> C-x 8 RET is even worse than that, because it requires
> _copying_ the name of the character. To actually see the character
> point is on requires
>
>     M-f C-f C-SPC C-s ; C-b M-w C-a C-x 8 RET C-y SPC

"C-x 8 RET" accepts the codepoint in hex, so if you are already
looking at the line that defines the character, all you need is to
type a 4-, sometimes 5-hex-digit number. And if you want to type the
name, "C-x 8 RET" provides completion, so no need for such a
complicated dance for copying the name.

> I could make that a keyboard macro and repeat it many times
> to get all these codes into the buffer. It would take a long time.
> Furthermore, it would show only one character per line,
> so few characters would appear on the screen at any time.
> To look at them all would require lots of scrolling.

I don't really see how looking for a character with your eyes could be
a convenient feature, except in very corner situations with a small
number of simple-looking characters. Even for Latin characters, there
are many similar shapes, like Ả and Ă or Ő and Ố, and they are spread
all over the Unicode range.
How would you go about finding your character, if all you have is some
vague idea of its shape (which, btw, could look quite different with
different fonts)? Sounds like a very inefficient way to me.

I think we must assume the user has some idea about the character:
either its approximate name, or at least the block or script to which
it belongs. Then we could display some reasonably manageable subset
of characters.

We could further help by asking about the base character (the above
examples have either A or O as their base character), because if the
user knows that, with some scripts the number of potential candidates
will go down drastically. But even when the base character is known,
the number of candidates is not negligible: e.g., there are 46
characters in the Unicode database that are somehow related to A.

> The buffer should be divided into stanzas, each one labeled with the
> name of its script or portion thereof.

Not sure what you mean by "script" here. Emacs currently knows about
almost 100 scripts defined by Unicode, so even displaying a couple of
lines for each one will make a large buffer. Isn't it better to allow
the user to specify one, with completion?

> > As for showing the glyphs, visiting a file with a large number of
> > characters runs a high risk of being an annoyance due to the
> > corresponding fonts being unavailable on the system.
>
> We could set up a way to test whether a code point can be
> displayed, and skip scripts that can't be displayed.

Alas, we don't know which cannot be displayed until we've tried and
failed.

> So if we provide such a command, IMO we should prompt for a block of
> codepoints, and display only that block.
>
> It is inconvenient to expect users to know the codepoint values.

Unicode blocks have names, so providing completion for them would do
the job, I think.
The entire Unicode codespace is divided into about 200 blocks, so if
the user knows, or can guess, the one she needs, that will probably
limit the search for the character to some reasonable quantity.

Moreover, some scripts share the same blocks, and vice versa. So
being able to specify just scripts or just blocks is not enough; we
need both.

I think we need all these methods, possibly more, because you may not
necessarily know or guess easily where to look. For example, there
are certain characters that appear as mathematical symbols in addition
to their "normal" places, so unless the user already knows in which
block to look, they will find the "base character" method very useful,
and without it could very well miss their character.

> Suppose I want to see Greek letters -- I have no idea what codepoints
> those are, and I should not need to know them in order to specify
> "Greek letters".

You'd only need to know "Greek", and all the Greek blocks will be
displayed. If you happen to know more, like "Greek Extended", it will
further limit the number of characters to view. And, of course, there
are complications: you might think it's a Greek character, but it
could really be a math symbol or a Cyrillic character instead.

> The header line for each script could have a [hide] or [show] button
> to select visibility of that script. Initially they could all be
> hidden, and the user would expose those that she is interested in.

A 100-button buffer is not very convenient, especially when you have
only an approximate idea about the script you are after (e.g., is that
funny shape part of the "Miscellaneous Technical" block or "Geometric
Shapes"?)
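
The "base character" lookup suggested in the message could be sketched
roughly as follows. This is only an illustration in Python, not Emacs
Lisp and not code from the bug report; the helper names
chars_with_base and format_candidates are hypothetical. It looks at
canonical (NFD) decompositions only, so its results need not match the
count of A-related characters quoted above, which may also include
compatibility variants.

```python
import sys
import unicodedata


def chars_with_base(base):
    """Return characters whose canonical decomposition starts with `base`.

    Hypothetical helper: scans the whole codespace and keeps every
    character that NFD-decomposes into `base` plus at least one more
    code point (typically combining marks).
    """
    matches = []
    for cp in range(sys.maxunicode + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # skip surrogate code points
        ch = chr(cp)
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) > 1 and decomposed[0] == base:
            matches.append(ch)
    return matches


def format_candidates(chars):
    """Format each candidate as 'U+XXXX  <glyph>  <name>' for display."""
    lines = []
    for ch in chars:
        name = unicodedata.name(ch, "<unnamed>")
        lines.append("U+%04X  %s  %s" % (ord(ch), ch, name))
    return lines
```

Calling format_candidates(chars_with_base("A")) gives one line per
candidate with its codepoint, glyph and name, i.e. a subset small
enough to scan by eye, provided a font for those characters is
available.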