GNU bug report logs - #10299
Emacs doesn't handle Unicode characters in keyboard layout on MS Windows


Package: emacs

Reported by: Joakim Hårsman <joakim.harsman <at> gmail.com>

Date: Wed, 14 Dec 2011 20:42:02 UTC

Severity: normal

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.



Message #8 received at 10299 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Joakim Hårsman <joakim.harsman <at> gmail.com>
Cc: jasonr <at> gnu.org, 10299 <at> debbugs.gnu.org, handa <at> m17n.org
Subject: Re: bug#10299: Emacs doesn't handle Unicode characters in keyboard
	layout on MS Windows
Date: Thu, 15 Dec 2011 01:22:11 -0500

> Date: Wed, 14 Dec 2011 21:39:28 +0100
> From: Joakim Hårsman <joakim.harsman <at> gmail.com>
> 
> However, Emacs doesn't seem to handle the case when the keyboard
> layout contains characters not available in the ANSI code page, and
> just prints a question mark character instead.

Yes, Emacs on Windows uses the ANSI codepage to read the keyboard
input.  Does it help to play with the value of keyboard-coding-system?

> For certain characters,
> a character that is visually similar to the actual character is
> printed instead of a question mark. For example, if I use a layout
> where AltGr+O produces U+2218 RING OPERATOR, Emacs prints U+00B0
> DEGREE SIGN instead. The degree sign is available in Windows 1252,
> the default ANSI code page on my system, but the ring operator
> isn't.

I'm guessing that this is Windows trying to translate the characters
to the ANSI codepage behind the scenes.

> However, if the layout maps AltGr+R to U+220A SMALL ELEMENT OF, Emacs
> just prints a question mark, presumably because Windows 1252 doesn't
> contain a reasonable replacement for that character.

Will inputting these characters with "C-x 8 RET 0220a RET" or "C-x 8
RET SMALL ELEMENT OF RET" be a good enough solution for you?  You can
input any Unicode character by its name or codepoint using "C-x 8 RET".
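
For illustration only (this is Python, not Emacs code): the two forms
accepted by "C-x 8 RET" identify the same character.  The sketch below
uses Python's standard unicodedata module to look the character up both
ways; the variable names are made up for the example.

```python
import unicodedata

# Look the character up by its Unicode name, as in
# "C-x 8 RET SMALL ELEMENT OF RET".
by_name = unicodedata.lookup("SMALL ELEMENT OF")

# Look it up by hexadecimal codepoint, as in "C-x 8 RET 0220a RET".
# Leading zeros do not change the value.
by_codepoint = chr(int("0220a", 16))

# Both routes yield the same single character.
assert by_name == by_codepoint
print(f"U+{ord(by_name):04X} {unicodedata.name(by_name)}")
```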

> I'd be happy to help debug this but I have no idea where to even
> start. Is there an easy way to find out if it's the C code that
> clobbers the character or if it happens in lisp for example?

I don't think there is any "clobbering".  Emacs deliberately converts the
Unicode characters to the current locale's ANSI codepage.  I think
(but I'm not sure) the reason is that Emacs cannot use UTF-16 for
keyboard input.  Perhaps Jason and Handa-san could comment on this.
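
A rough sketch of the effect, in Python rather than anything Emacs
actually runs: converting text to Windows-1252 keeps U+00B0 but cannot
represent U+2218 or U+220A.  It assumes Python's "cp1252" codec as a
stand-in for the ANSI codepage; that codec substitutes "?" and does not
reproduce any best-fit replacement Windows may apply (such as the
degree sign reported for U+2218).  The helper names are hypothetical.

```python
# Assumption: Python's "cp1252" codec approximates the Windows-1252
# ANSI codepage. It does not perform Windows' best-fit substitutions.

def to_ansi(text, codepage="cp1252"):
    """Encode text to the codepage, replacing unmappable characters with '?'."""
    return text.encode(codepage, errors="replace")

def is_representable(ch, codepage="cp1252"):
    """Return True if the character has an exact mapping in the codepage."""
    try:
        ch.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False

# The three characters mentioned in this report.
for ch in ("\u00b0", "\u2218", "\u220a"):
    print(f"U+{ord(ch):04X}", is_representable(ch), to_ansi(ch))
```

Once a character has been replaced like this, the original codepoint
cannot be recovered from the converted byte.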




This bug report was last modified 12 years and 290 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.