Package: emacs
Reported by: Joakim Hårsman <joakim.harsman <at> gmail.com>
Date: Wed, 14 Dec 2011 20:42:02 UTC
Severity: normal
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
From: Joakim Hårsman <joakim.harsman <at> gmail.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 10299 <at> debbugs.gnu.org
Subject: bug#10299: Emacs doesn't handle Unicode characters in keyboard layout on MS Windows
Date: Tue, 20 Dec 2011 22:16:53 +0100

On 18 December 2011 19:13, Eli Zaretskii <eliz <at> gnu.org> wrote:
>> Date: Sun, 18 Dec 2011 18:31:55 +0100
>> From: Joakim Hårsman <joakim.harsman <at> gmail.com>
>>
>> > That's good news. However, I'm puzzled: are you saying that the code
>> > points passed by Windows to Emacs for the characters generated by MKLC
>> > are outside the Unicode BMP, i.e. larger than 65535? If so, what code
>> > points are they?
>>
>> No, none of the characters I needed are outside the BMP.
>>
>> WM_CHAR encodes the codepoint in UTF-16 inside wParam, while
>> WM_UNICHAR uses UTF-32. So if I press something which gives U+2218
>> RING OPERATOR, I get a WM_CHAR event with a wParam of 2228248 or
>> 0x220018.
>
> ??? UTF-16 encodes the characters in the BMP as themselves, i.e. a
> single 16-bit value that is numerically identical to the codepoint.
> That is, you should have gotten 0x2218. What am I missing?
>
>> I experimented a bit, and CreateWindowW isn't needed after all. As
>> long as I use RegisterClassW and GetMessageW, things work. I'm unsure
>> if it's TranslateMessage that translates the key press to a question
>> mark or if it's GetMessage that does it on receiving the message.
>
> Question marks are a sign that Windows tried to convert the character
> to its ANSI equivalent, and failed. I.e., it means that Windows
> thought the program asked for ANSI encoded characters. So it's
> probably TranslateMessage that did it.
>
>> I'll try to get frame titles working again as well, then I can
>> probably switch on os_subtype in two or three places and Windows 95
>> won't be affected at all. Do you think that is a good plan?
>
> Yes, thanks.

I've fixed the issues with the frame titles, and everything appears to
work. There are a number of issues I find very confusing, however.
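
For reference, the UTF-16 rule quoted above (a BMP code point is a single
16-bit unit equal to the code point itself) can be written out as a short
sketch. utf16_encode is a hypothetical helper introduced only for
illustration; it is not an Emacs or Win32 function.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-16.  Returns the number of
   16-bit units written to OUT (1 or 2), or 0 for an invalid code point.
   Hypothetical helper, not part of Emacs or the Win32 API.  */
static size_t
utf16_encode (uint32_t cp, uint16_t out[2])
{
  /* Surrogate code points and anything above U+10FFFF are not encodable.  */
  if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return 0;

  /* BMP: the single code unit is numerically identical to the code point.  */
  if (cp < 0x10000)
    {
      out[0] = (uint16_t) cp;
      return 1;
    }

  /* Supplementary planes: split the 20-bit offset into a surrogate pair.  */
  cp -= 0x10000;
  out[0] = (uint16_t) (0xD800 + (cp >> 10));
  out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));
  return 2;
}
```

Under this rule U+2218 encodes as the single unit 0x2218, which is why a
wParam of 0x220018 does not look like plain UTF-16.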

Here's the state of my changes as of now:

=== modified file 'src/w32fns.c'
--- src/w32fns.c 2011-12-04 08:02:42 +0000
+++ src/w32fns.c 2011-12-20 20:46:40 +0000
@@ -1697,10 +1697,10 @@
       if (FRAME_W32_WINDOW (f))
         {
           if (STRING_MULTIBYTE (name))
-            name = ENCODE_SYSTEM (name);
+                name = ENCODE_SYSTEM (name);

           BLOCK_INPUT;
-          SetWindowText (FRAME_W32_WINDOW (f), SDATA (name));
+          SetWindowTextW (FRAME_W32_WINDOW (f), SDATA (name));
           UNBLOCK_INPUT;
         }
     }
@@ -1746,7 +1746,7 @@
             name = ENCODE_SYSTEM (name);

           BLOCK_INPUT;
-          SetWindowText (FRAME_W32_WINDOW (f), SDATA (name));
+          SetWindowTextW (FRAME_W32_WINDOW (f), SDATA (name));
           UNBLOCK_INPUT;
         }
     }
@@ -1785,7 +1785,7 @@
 static BOOL
 w32_init_class (HINSTANCE hinst)
 {
-  WNDCLASS wc;
+  WNDCLASSW wc;

   wc.style = CS_HREDRAW | CS_VREDRAW;
   wc.lpfnWndProc = (WNDPROC) w32_wnd_proc;
@@ -1796,9 +1796,9 @@
   wc.hCursor = w32_load_cursor (IDC_ARROW);
   wc.hbrBackground = NULL; /* GetStockObject (WHITE_BRUSH); */
   wc.lpszMenuName = NULL;
-  wc.lpszClassName = EMACS_CLASS;
+  wc.lpszClassName = L"Emacs";

-  return (RegisterClass (&wc));
+  return (RegisterClassW (&wc));
 }

 static HWND
@@ -2248,7 +2248,7 @@

   msh_mousewheel = RegisterWindowMessage (MSH_MOUSEWHEEL);

-  while (GetMessage (&msg, NULL, 0, 0))
+  while (GetMessageW (&msg, NULL, 0, 0))
     {
       if (msg.hwnd == NULL)
         {
@@ -2915,8 +2915,21 @@

     case WM_SYSCHAR:
     case WM_CHAR:
-      post_character_message (hwnd, msg, wParam, lParam,
-                              w32_get_key_modifiers (wParam, lParam));
+      if (wParam > 255 )
+        {
+          unsigned short lo = wParam & 0x0000FFFF;
+          unsigned short hi = (wParam & 0xFFFF0000) >> 8;
+          wParam = hi | lo;
+
+          W32Msg wmsg;
+          wmsg.dwModifiers = w32_get_key_modifiers (wParam, lParam);
+          signal_user_input ();
+          my_post_msg (&wmsg, hwnd, WM_UNICHAR, wParam, lParam);
+
+        }
+      else
+        post_character_message (hwnd, msg, wParam, lParam,
+                                w32_get_key_modifiers (wParam, lParam));
       break;

     case WM_UNICHAR:

I should probably also only do this on NT (to avoid breaking stuff on
Windows 95), but that should be easy to fix.
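
To make the WM_CHAR branch easier to follow, here is a standalone sketch of
the same arithmetic applied to the value reported for U+2218. It uses
fixed-width types in place of WPARAM and unsigned short so that it builds
outside Windows; recombine_wparam is a hypothetical name and does not exist
in w32fns.c.

```c
#include <stdint.h>

/* Mirror of the lo/hi recombination in the patch's WM_CHAR branch.
   Hypothetical helper for illustration only; uint16_t stands in for the
   16-bit unsigned short used on Windows.  */
static uint16_t
recombine_wparam (uint32_t wparam)
{
  /* Low word, e.g. 0x0018 for wparam == 0x220018.  */
  uint16_t lo = (uint16_t) (wparam & 0x0000FFFFu);

  /* High word shifted right by 8 (not 16) and truncated to 16 bits,
     e.g. 0x220000 >> 8 == 0x2200.  Only the low byte of the high word
     survives, as the high byte of the result.  */
  uint16_t hi = (uint16_t) ((wparam & 0xFFFF0000u) >> 8);

  return (uint16_t) (hi | lo);
}
```

For 0x220018 (decimal 2228248) this yields 0x2218. Note that the result is
only meaningful while each of the two words holds a single byte, which
matches the observed value but is an assumption about the general case.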

There are a couple of very weird things going on, however:

1. Why is wParam encoded in a weird format spread over the lo and hi
   word of the wParam DWORD?

2. Why does sending 8-bit strings to SetWindowTextW work, but sending
   8-bit strings to SetWindowTextA for a window with a "Unicode" window
   class only use the first character?

My guess would be that the correct solution for 2 is to always encode
frame captions in utf-16le before sending them to SetWindowTextW.
However, I'm not sure what the best way to do this is. I figure I should
use something like this:

    Lisp_Object encoding = intern_c_string ("utf-16le-dos");
    name = code_convert_string_norecord (name, encoding, 1);
    SetWindowTextW (FRAME_W32_WINDOW (f), SDATA (name));

Sadly that didn't work (I still get single-character frame captions), and
I never managed to get gdb on Windows to print Lisp objects correctly, so
I had a hard time understanding why it didn't work. Looking at the data
that actually gets sent to SetWindowText might make things clearer.

Anyway, the current patch works fine as far as I can tell, but it's a bit
disconcerting to not know *why* things work the way they do.
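
As an illustration of the byte layout involved in question 2, the sketch
below widens an ASCII caption to UTF-16LE and then measures it the way a
narrow-string consumer would. Both helper names are hypothetical, and this
is not a diagnosis of what Windows does internally. It only shows that
UTF-16LE text for ASCII characters has a zero byte after every character,
so any code that reads it as an 8-bit string stops after the first one.

```c
#include <stddef.h>
#include <string.h>

/* Widen a pure-ASCII string to UTF-16LE bytes, terminator included.
   Returns the number of bytes written, or 0 if OUT is too small or the
   input contains a non-ASCII byte.  Hypothetical helper; in Emacs the
   conversion would go through the coding-system machinery instead.  */
static size_t
ascii_to_utf16le (const char *s, unsigned char *out, size_t outsize)
{
  size_t n = strlen (s);
  size_t i;

  if (outsize < 2 * (n + 1))
    return 0;

  for (i = 0; i <= n; i++)      /* i == n writes the 16-bit terminator */
    {
      if ((unsigned char) s[i] > 0x7F)
        return 0;
      out[2 * i] = (unsigned char) s[i];   /* low byte first */
      out[2 * i + 1] = 0;                  /* high byte is zero for ASCII */
    }
  return 2 * (n + 1);
}

/* Length an 8-bit string reader would see in BUF: it stops at the first
   zero byte.  Hypothetical helper for illustration.  */
static size_t
narrow_length (const unsigned char *buf)
{
  return strlen ((const char *) buf);
}
```

With the caption "Emacs", ascii_to_utf16le writes 12 bytes ('E', 0, 'm', 0,
and so on), and narrow_length reports 1 for that buffer. A single-character
caption is at least consistent with UTF-16 data being interpreted as a
narrow string somewhere along the way, though that remains a guess.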