GNU bug report logs - #58326
Reading unicode user inputs from minibuffer

Package: emacs

Reported by: uzibalqa <uzibalqa <at> proton.me>

Date: Thu, 6 Oct 2022 03:43:02 UTC

Severity: normal

Tags: notabug

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 58326 in the body.
You can then email your comments to 58326 AT debbugs.gnu.org in the normal way.

Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#58326; Package emacs. (Thu, 06 Oct 2022 03:43:02 GMT)

Acknowledgement sent to uzibalqa <uzibalqa <at> proton.me>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 06 Oct 2022 03:43:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: uzibalqa <uzibalqa <at> proton.me>
To: "bug-gnu-emacs <at> gnu.org" <bug-gnu-emacs <at> gnu.org>
Subject: Reading unicode user inputs from minibuffer
Date: Thu, 06 Oct 2022 03:42:14 +0000

I am using "read-char-by-name" to read Unicode code points from the user as input for "glasses-separator".
Because "glasses-separator" requires a string, I have to call (string (read-char-by-name "hex: ")).
This means that users cannot type "\u2192" and have to type "#x2192" instead.  Yet with "completing-read",
the candidate list can contain "\u2192", which works fine.  I am also unsure whether this is inconsistent
with "display-fill-column-indicator-character", which also takes a Unicode character.

Could setting "glasses-separator" be simplified?  Could "read-char-by-name" be extended to accept
escapes like "\u2192", or is there some other function that handles the different Unicode input forms from the minibuffer better?

-----------------------------------------------------

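(require 'glasses)  ; defines glasses-separator and glasses-set-overlay-properties

(defun camel-glasses (hexcode)
  "Split CamelCase phrases using the separator string HEXCODE."
  ;; The candidates are one-character strings; the Lisp reader resolves \u.
  (interactive (list (completing-read "hexcode: " '("\u27A4" "\u25BA") nil t)))
  (setq glasses-separator hexcode)
  (glasses-set-overlay-properties))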

----------------------------------------------

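(require 'glasses)  ; defines glasses-separator and glasses-set-overlay-properties

(defun camel-glasses (hexcode)
  "Split CamelCase phrases using the separator string HEXCODE."
  ;; read-char-by-name returns a character, so convert it to a string.
  (interactive (list (string (read-char-by-name "hex: "))))
  (setq glasses-separator hexcode)
  (glasses-set-overlay-properties))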

-------------------------------------

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58326; Package emacs. (Thu, 06 Oct 2022 08:46:02 GMT)

Message #8 received at 58326 <at> debbugs.gnu.org:

From: Robert Pluim <rpluim <at> gmail.com>
To: 58326 <at> debbugs.gnu.org
Cc: uzibalqa <uzibalqa <at> proton.me>
Subject: Re: bug#58326: Reading unicode user inputs from minibuffer
Date: Thu, 06 Oct 2022 10:45:24 +0200

>>>>> On Thu, 06 Oct 2022 03:42:14 +0000, uzibalqa via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> said:

    uzibalqa> I am using "read-char-by-name" to read utf8 hex codes
    uzibalqa> from user for input to "glasses-separator".  But because
    uzibalqa> "glasses-separator" requires a string I have to do
    uzibalqa> (string (read-char-by-name "hex: ")).  Meaning that
    uzibalqa> users cannot pass "\u2192", but have to use "#x2192".
    uzibalqa> Yet, using "completing-read", the list can contain
    uzibalqa> "\u2192", which works fine.  I am also unsure whether
    uzibalqa> there is an inconsistency with
    uzibalqa> display-fill-column-indicator-character which also takes
    uzibalqa> unicode.

    uzibalqa> Could the setting up of "glasses-separator" be
    uzibalqa> simplified?  Could "read-char-by-name" be extended to
    uzibalqa> accept hexcodes like "\u2192", or is there some other
    uzibalqa> function that can handle the different unicode inputs
    uzibalqa> from minibuffer better?

I suggest you read the docstring for `read-char-by-name' more
carefully:

    Accept a name like "CIRCULATION FUNCTION", a hexadecimal
    number like "2A10", or a number in hash notation (e.g.,
    "#x2a10" for hex, "10r10768" for decimal, or "#o25020" for
    octal).  Treat otherwise-ambiguous strings like "BED" (U+1F6CF)
    as names, not numbers.

Robert
--

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58326; Package emacs. (Thu, 06 Oct 2022 11:52:01 GMT)

Message #11 received at 58326 <at> debbugs.gnu.org:

From: uzibalqa <uzibalqa <at> proton.me>
To: Robert Pluim <rpluim <at> gmail.com>
Cc: 58326 <at> debbugs.gnu.org
Subject: Re: bug#58326: Reading unicode user inputs from minibuffer
Date: Thu, 06 Oct 2022 11:51:06 +0000

------- Original Message -------
On Thursday, October 6th, 2022 at 8:45 AM, Robert Pluim <rpluim <at> gmail.com> wrote:


> >>>>> On Thu, 06 Oct 2022 03:42:14 +0000, uzibalqa via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> said:
>
> uzibalqa> I am using "read-char-by-name" to read utf8 hex codes
> uzibalqa> from user for input to "glasses-separator". But because
> uzibalqa> "glasses-separator" requires a string I have to do
> uzibalqa> (string (read-char-by-name "hex: ")). Meaning that
> uzibalqa> users cannot pass "\u2192", but have to use "#x2192".
> uzibalqa> Yet, using "completing-read", the list can contain
> uzibalqa> "\u2192", which works fine. I am also unsure whether
> uzibalqa> there is an inconsistency with
> uzibalqa> display-fill-column-indicator-character which also takes
> uzibalqa> unicode.
>
> uzibalqa> Could the setting up of "glasses-separator" be
> uzibalqa> simplified? Could "read-char-by-name" be extended to
> uzibalqa> accept hexcodes like "\u2192", or is there some other
> uzibalqa> function that can handle the different unicode inputs
> uzibalqa> from minibuffer better?
>
> I suggest you read the docstring for `read-char-by-name' more
> carefully:
>
>     Accept a name like "CIRCULATION FUNCTION", a hexadecimal
>     number like "2A10", or a number in hash notation (e.g.,
>     "#x2a10" for hex, "10r10768" for decimal, or "#o25020" for
>     octal). Treat otherwise-ambiguous strings like "BED" (U+1F6CF)
>     as names, not numbers.

I have read the docstring; this discussion is not about the docstring.
Using "\u2192" should be perfectly fine for a Unicode code point.  One often sees
things like "?\u2192".  After all, Emacs provides several kinds of escape
syntax for specifying non-ASCII characters, including
"?\uxxxx" in addition to the hexadecimal character code escape sequence
(see 2.4.3.2 General Escape Syntax).

For instance, "glasses-separator" accepts "\u2192", yet the user cannot
type the same thing at a minibuffer prompt that reads a character.  The same
applies to "display-fill-column-indicator-character", where "?\u2503" is
perfectly acceptable.
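
For reference, here is a minimal sketch of how the Lisp reader treats the escape forms mentioned above, assuming standard Emacs Lisp read syntax.  The `my-' names are hypothetical and exist only for this illustration.  The point it tries to show is that "\u2192" is resolved when source code is read, whereas text typed at a minibuffer prompt arrives as raw characters.

```elisp
;; Reader escapes are resolved when Lisp source is read.
(defconst my-arrow-char ?\u2192
  "Hypothetical example: the same integer as #x2192.")

(defconst my-arrow-string "\u2192"
  "Hypothetical example: a string holding one character, U+2192.")

;; Text typed by a user is not passed through the reader.  Typing
;; backslash, u, 2, 1, 9, 2 yields this six-character string:
(defconst my-typed-text "\\u2192"
  "Hypothetical example: raw minibuffer text, six characters long.")
```

So a list of candidates written in source as '("\u27A4" "\u25BA") already holds the characters themselves, not the escape text.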

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58326; Package emacs. (Thu, 06 Oct 2022 12:22:02 GMT)

Message #14 received at 58326 <at> debbugs.gnu.org:

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: uzibalqa <uzibalqa <at> proton.me>
Cc: 58326 <at> debbugs.gnu.org, Robert Pluim <rpluim <at> gmail.com>
Subject: Re: bug#58326: Reading unicode user inputs from minibuffer
Date: Thu, 06 Oct 2022 14:21:27 +0200

This doesn't seem to be about any bugs in Emacs, so I'm closing this bug
report.

If you need help with using Emacs, please use the mailing lists that
exist for that purpose.

Added tag(s) notabug. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Thu, 06 Oct 2022 12:22:02 GMT)

bug closed, send any further explanations to 58326 <at> debbugs.gnu.org and uzibalqa <uzibalqa <at> proton.me> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Thu, 06 Oct 2022 12:22:02 GMT)

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58326; Package emacs. (Thu, 06 Oct 2022 16:35:02 GMT)

Message #21 received at 58326 <at> debbugs.gnu.org:

From: uzibalqa <uzibalqa <at> proton.me>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 58326 <at> debbugs.gnu.org, Robert Pluim <rpluim <at> gmail.com>
Subject: Re: bug#58326: Reading unicode user inputs from minibuffer
Date: Thu, 06 Oct 2022 16:34:01 +0000

------- Original Message -------
On Thursday, October 6th, 2022 at 12:21 PM, Lars Ingebrigtsen <larsi <at> gnus.org> wrote:


> This doesn't seem to be about any bugs in Emacs, so I'm closing this bug
> report.
> 
> If you need help with using Emacs, please use the mailing lists that
> exist for that purpose.

It is about the limitation of not accepting \uN.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58326; Package emacs. (Thu, 06 Oct 2022 17:22:02 GMT)

Message #24 received at 58326 <at> debbugs.gnu.org:

From: Robert Pluim <rpluim <at> gmail.com>
To: uzibalqa <uzibalqa <at> proton.me>
Cc: 58326 <at> debbugs.gnu.org, Lars Ingebrigtsen <larsi <at> gnus.org>
Subject: Re: bug#58326: Reading unicode user inputs from minibuffer
Date: Thu, 06 Oct 2022 19:21:18 +0200

>>>>> On Thu, 06 Oct 2022 16:34:01 +0000, uzibalqa <uzibalqa <at> proton.me> said:

    uzibalqa> ------- Original Message -------
    uzibalqa> On Thursday, October 6th, 2022 at 12:21 PM, Lars Ingebrigtsen <larsi <at> gnus.org> wrote:


    >> This doesn't seem to be about any bugs in Emacs, so I'm closing this bug
    >> report.
    >> 
    >> If you need help with using Emacs, please use the mailing lists that
    >> exist for that purpose.

    uzibalqa> It is about limitation on not taking \uN. 

`read-char-by-name' accepts N or #xN, so why would it need extending
to accept \uN?

Robert
--

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58326; Package emacs. (Thu, 06 Oct 2022 17:53:01 GMT)

Message #27 received at 58326 <at> debbugs.gnu.org:

From: uzibalqa <uzibalqa <at> proton.me>
To: Robert Pluim <rpluim <at> gmail.com>
Cc: 58326 <at> debbugs.gnu.org, Lars Ingebrigtsen <larsi <at> gnus.org>
Subject: Re: bug#58326: Reading unicode user inputs from minibuffer
Date: Thu, 06 Oct 2022 17:52:05 +0000

------- Original Message -------
On Thursday, October 6th, 2022 at 5:21 PM, Robert Pluim <rpluim <at> gmail.com> wrote:

> >>>>> On Thu, 06 Oct 2022 16:34:01 +0000, uzibalqa <uzibalqa <at> proton.me> said:
>
> uzibalqa> ------- Original Message -------
> uzibalqa> On Thursday, October 6th, 2022 at 12:21 PM, Lars Ingebrigtsen <larsi <at> gnus.org> wrote:
>
> >> This doesn't seem to be about any bugs in Emacs, so I'm closing this bug
> >> report.
> >>
> >> If you need help with using Emacs, please use the mailing lists that
> >> exist for that purpose.
>
> uzibalqa> It is about limitation on not taking \uN.
>
> `read-char-by-name' accepts N or #xN, so why would it need extending
> to accept \uN?

Because \uN is also an accepted notation, and it is used in Elisp source
code elsewhere.  Accepting "N" from users is satisfactory, though.
At times I feel that decisions about what to accept and what not to accept
are completely arbitrary.  I am of the school of thought that if there are three
valid possibilities, one could simply support all three.  Why deal with just
two of them?

There is another problem.  Suppose one decides to use a list of candidates, passing
Unicode characters through "completing-read".  In that case only codes written as
"\u25BA" work; using "25BA" or "#x25BA" is futile.  These are complications that can be avoided.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#58326; Package emacs. (Fri, 07 Oct 2022 00:47:02 GMT)

Message #30 received at 58326 <at> debbugs.gnu.org:

From: uzibalqa <uzibalqa <at> proton.me>
To: uzibalqa <uzibalqa <at> proton.me>
Cc: 58326 <at> debbugs.gnu.org, Robert Pluim <rpluim <at> gmail.com>,
 Lars Ingebrigtsen <larsi <at> gnus.org>
Subject: Re: bug#58326: Reading unicode user inputs from minibuffer
Date: Fri, 07 Oct 2022 00:45:45 +0000

------- Original Message -------
On Thursday, October 6th, 2022 at 5:52 PM, uzibalqa via "Bug reports for GNU Emacs, the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org> wrote:


> ------- Original Message -------
> On Thursday, October 6th, 2022 at 5:21 PM, Robert Pluim <rpluim <at> gmail.com> wrote:
>
> > >>>>> On Thu, 06 Oct 2022 16:34:01 +0000, uzibalqa <uzibalqa <at> proton.me> said:
> >
> > uzibalqa> ------- Original Message -------
> > uzibalqa> On Thursday, October 6th, 2022 at 12:21 PM, Lars Ingebrigtsen <larsi <at> gnus.org> wrote:
> >
> > >> This doesn't seem to be about any bugs in Emacs, so I'm closing this bug
> > >> report.
> > >>
> > >> If you need help with using Emacs, please use the mailing lists that
> > >> exist for that purpose.
> >
> > uzibalqa> It is about limitation on not taking \uN.
> >
> > `read-char-by-name' accepts N or #xN, so why would it need extending
> > to accept \uN?
>
> Because \uN is also an acceptable declaration as has been used in elisp source
> code in other routines. Although accepting "N" from users is satisfactory.
> At times I feel that certain decisions on what to accept and what not to accept
> are completely arbitrary. I am of the school of thought that if there are three
> valid possibilities, one could simply support the three. Why deal with just
> two of them.
>
> There is also another problem, suppose one decides to use a list, passing utf codes
> through "completing-read". In such case only codes inputted as "\u25BA" would work.
> Using "25BA" and "#x25BA" is futile. These are complications that can be avoided.

There is a serious inconsistency here.

I can do

(let ((code (completing-read "Opt: " '("\u27A4") nil t)))
  (setq glasses-separator code))

This works, but it fails with "#x27A4".

With read-char-by-name, on the other hand,

(let ((code (string (read-char-by-name "Opt: "))))
  (setq glasses-separator code))

works with "#x27A4", but fails with "\u27A4".

(setq glasses-separator code) accepts a string, whether "\u27A4" or "#x27A4", but minibuffer
input is affected by which function is called, "completing-read" or "read-char-by-name".
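
One possible way around the difference, sketched here under stated assumptions rather than as anything Emacs provides, is a small wrapper that normalizes the typed text before it is used.  The function name `my-char-spec-to-string' is hypothetical.  It accepts the three spellings discussed in this thread (\uN, #xN, and bare hex N) and returns a one-character string suitable for "glasses-separator".  Unlike "read-char-by-name", it does not handle character names, so a string like "BED" is treated as hex.

```elisp
(defun my-char-spec-to-string (spec)
  "Return a one-character string for SPEC, or nil if SPEC is not recognized.
SPEC is raw text as typed by a user: \"\\u25BA\", \"#x25BA\", or \"25BA\".
This is a hypothetical helper, not part of Emacs."
  (let ((case-fold-search t))           ; accept both 25ba and 25BA
    ;; Optional \u or #x prefix, then hex digits, anchored to the whole string.
    (when (string-match "\\`\\(?:\\\\u\\|#x\\)?\\([0-9a-f]+\\)\\'" spec)
      (let ((code (string-to-number (match-string 1 spec) 16)))
        ;; Reject values outside the Unicode range.
        (when (and (characterp code) (<= code #x10FFFF))
          (string code))))))

;; Possible use (an assumption about how one might wire it up):
;; (setq glasses-separator
;;       (my-char-spec-to-string (read-string "Separator: ")))
```

The prefix is optional in the regexp, so all three spellings reduce to the same hex digits before conversion.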

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 04 Nov 2022 11:24:06 GMT)

This bug report was last modified 2 years and 287 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
