GNU bug report logs - #29837
UTF-16 char display problems and the macOS "character palette"


Package: emacs

Reported by: Alan Third <alan <at> idiocy.org>

Date: Sun, 24 Dec 2017 16:02:02 UTC

Severity: normal

Tags: fixed

Fixed in version 27.1

Done: Alan Third <alan <at> idiocy.org>

Bug is archived. No further changes may be made.



Message #23 received at 29837 <at> debbugs.gnu.org (full text, mbox):

From: Philipp Stephani <p.stephani2 <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Alan Third <alan <at> idiocy.org>, 29837 <at> debbugs.gnu.org
Subject: Re: bug#29837: UTF-16 char display problems and the macOS "character
 palette"
Date: Mon, 25 Dec 2017 20:13:55 +0000
Eli Zaretskii <eliz <at> gnu.org> wrote on Sun, 24 Dec 2017 at 20:35:

> > Date: Sun, 24 Dec 2017 19:28:07 +0000
> > From: Alan Third <alan <at> idiocy.org>
> > Cc: 29837 <at> debbugs.gnu.org
> >
> > If I try to select utf-16 I get this
> >
> >     set-keyboard-coding-system: Unsuitable coding system for keyboard: utf-16
> >
> > and I used tab completion to find which other coding systems were
> > available but all the ones beginning utf-16 that I tried return the
> > same message.
>
> Oh, I now recollect that Handa-san said at some point that keyboard
> input doesn't support UTF-16...
>
> How do other macOS programs read UTF-16 keyboard input?  Maybe you
> could use the same way to read the sequences, and then decode them
> internally as UTF-16 using coding.c facilities, and feed them into the
> Emacs event queue?  Just a thought.
>
>
IIUC Emacs receives the input as a single UTF-16 string (in insertText:),
then iterates over the UTF-16 code units, converting each one into an Emacs
event. That's wrong no matter whether the input comes from the character
palette or from the keyboard; normal keyboard layouts just happen not to
contain non-BMP characters. The loop needs to account for surrogates: a
high surrogate followed by a low surrogate must be combined into a single
code point and delivered as one event.
As a small optimization (which is warranted because the function is
probably called on every keystroke), this should use -[NSString
getCharacters:range:] to copy all the UTF-16 code units to a buffer first,
to avoid repeated calls to characterAtIndex:.
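
To make the surrogate handling concrete, here is a minimal sketch in plain C
of a loop that walks a buffer of UTF-16 code units (such as one filled by
getCharacters:range:) and produces one code point per character. This is not
the code that went into Emacs; the names utf16_to_code_points,
is_high_surrogate, is_low_surrogate and REPLACEMENT_CHAR are hypothetical,
and mapping unpaired surrogates to U+FFFD is an assumed policy, not
something this thread specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed policy: unpaired surrogates become U+FFFD. */
#define REPLACEMENT_CHAR 0xFFFDu

static int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low_surrogate(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

/* Decode LEN UTF-16 code units from UNITS into code points stored in OUT.
   OUT must have room for at least LEN entries.  Returns the number of
   code points written.  Hypothetical helper, for illustration only.  */
static size_t
utf16_to_code_points(const uint16_t *units, size_t len, uint32_t *out)
{
  assert(units != NULL || len == 0);
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    {
      uint16_t u = units[i];
      if (is_high_surrogate(u) && i + 1 < len
          && is_low_surrogate(units[i + 1]))
        {
          /* Combine the pair into one non-BMP code point.  */
          uint32_t hi = (uint32_t) (u - 0xD800);
          uint32_t lo = (uint32_t) (units[i + 1] - 0xDC00);
          out[n++] = 0x10000u + ((hi << 10) | lo);
          i++;  /* The low surrogate has been consumed.  */
        }
      else if (is_high_surrogate(u) || is_low_surrogate(u))
        out[n++] = REPLACEMENT_CHAR;  /* Unpaired surrogate.  */
      else
        out[n++] = u;  /* Ordinary BMP character.  */
    }
  return n;
}
```

In the real handler, each resulting code point would presumably become one
input event, instead of one event per code unit as described above.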

This bug report was last modified 7 years and 139 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.