From unknown Thu Aug 14 18:37:43 2025
Subject: bug#19994: 25.0.50; Unicode keyboard input on Windows
To: 19994@debbugs.gnu.org
Date: Tue, 3 Mar 2015 15:09:49 -0800
From: Ilya Zakharevich
Message-ID: <20150303230949.GA29784@math.berkeley.edu>

I’m working on a patch to make Unicode keyboard input work properly on
Windows (in graphic mode).

The problems with the current implementation stem from the facts that

  • on Windows, it IS possible to implement a bullet-proof system of
    Unicode input (at least, for GUI applications);

  • However, how to do it is completely undocumented.
    [See
       http://search.cpan.org/~ilyaz/UI-KeyboardLayout/lib/UI/KeyboardLayout.pm#Keyboard_input_on_Windows:_interaction_of_applications_and_the_kernel
    ]

So, essentially, all developers of applications try to design their
own set of heuristic approaches which

  • cover several keyboard layouts they can put their hands on;

  • more or less follow the design goals of their applications.

The approach taken by Emacs is to break the keyboard keys (VK’s) into
several groups, and treat different groups differently.  Only the keys
on the main island of the keyboard may input characters.  Moreover, only
the most common combinations of modifiers are allowed to be used for
character input.  (In addition, there are plain bugs — like treating
UTF-16 as if it were UTF-32.)

[I gave a very terse description on
   https://groups.google.com/forum/?hl=en#!search/emacs$20keyboard$20windows$20ilya/gnu.emacs.help/ZHpZK2YfFuo/aAyZFUxrFeEJ
]

The “correct” approach should proceed in exactly the opposite
direction: if a keypress produces a character, it should be treated as
a character — no matter where on the physical keyboard the key
resides, and which modifiers were pressed.

The patch below

  • Implements this “primacy of characters” doctrine;

  • As far as I could see, is compatible with the current work of Emacs
    on “simple keyboard layouts”;

  • Worked at some moment (before I started a massive addition of
    comments ;-] — and maybe it is still working, I did not touch it for a
    month);

  • (Currently) ignores the indent coding rules;

  • Passes all the tests thrown at it by my super-puper-all-bells-and-whistles
    layouts; see e.g.
       http://k.ilyaz.org/windows/izKeys-visual-maps.html#examples

  • Is not bullet-proof:

    ∘ I use one heuristic to detect which modifiers are “consumed” by the
      character input, and which are “on top” of character input;

    ∘ It does not (same as the current Emacs) support
      Unicode-entered-by-Alt-numbers.

  • Does not fix a bug with UTF-16 of stand-alone (pumped to us) WM_CHAR’s.

If I ever find more time to work on it, I plan to:

  1) Add yet more documentation;

  2) Change a little bit the logic of detection of consumed/extra
     modifiers.  This change may be cosmetic only — or maybe, with some
     extremely devilish layouts, it may be beneficial.

     (I have not seen layouts where this change would matter, though!
     And I looked through the source code of hundred(s).)

  3) Bring it in sync with the Emacs coding style.

Meanwhile, I would greatly appreciate all input related to the current
state of the patch.  (I *HOPE* that I did not break (many!) special cases
in the current implementation — but such things are hard to be sure of!)

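The consumed/on-top distinction from the list above can be looked at in isolation. What follows is a minimal, portable C sketch, assuming the return convention of VkKeyScanW (virtual-key code in the low byte; shift-state flags in the high byte: 1 for Shift, 2 for Ctrl, 4 for Alt). The helper name alt_is_on_top and the SCAN_* macros are hypothetical and do not appear in the patch; only the bit test mirrors the one the patch applies.

```c
#include <stdbool.h>

/* Shift-state flags in the high-order byte of a VkKeyScanW()-style
   result.  The macro names are illustrative only.  */
#define SCAN_SHIFT 0x0100
#define SCAN_CTRL  0x0200
#define SCAN_ALT   0x0400

/* Hypothetical helper: return true when the character that was
   delivered is also reachable on the SAME key with at most Shift
   held.  In that case Alt did not take part in producing the
   character, so it is "on top" and should be reported as a modifier.
   SCAN_RESULT is what a VkKeyScanW()-style lookup returned for the
   character; PRESSED_VK is the virtual key of the actual keypress.  */
static bool
alt_is_on_top (short scan_result, unsigned pressed_vk)
{
  /* Low byte: the virtual key that produces the character.  */
  bool same_key = (unsigned) (scan_result & 0xFF) == pressed_vk;
  /* Anything above the Shift flag (Ctrl, Alt, ...) means the
     character needs more than Shift; -1 ("no such key") also fails.  */
  bool at_most_shift = (scan_result & ~(SCAN_SHIFT | 0xFF)) == 0;

  return same_key && at_most_shift;
}
```

For instance, a lookup result of 0x0041 for a keypress on virtual key 0x41 makes the helper return true (Alt stays a modifier), while 0x0641 (Ctrl+Alt required, i.e. a typical AltGr position) makes it return false, so Alt would be treated as consumed and stripped.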
Thanks for the parts of Emacs which ARE working great,
Ilya

=======================================================
--- w32fns.c-ini  2015-01-30 15:33:23.505201400 -0800
+++ w32fns.c  2015-02-15 02:46:12.070091800 -0800
@@ -2832,6 +2832,126 @@ post_character_message (HWND hwnd, UINT
   my_post_msg (&wmsg, hwnd, msg, wParam, lParam);
 }
 
+static int
+get_wm_chars (HWND aWnd, int *buf, int buflen, int ignore_ctrl, int ctrl, int *ctrl_cnt, int *is_dead, int vk, int exp)
+{
+  MSG msg;
+  int i = buflen, doubled = 0, code_unit; /* If doubled is at the end, ignore it */
+  if (ctrl_cnt)
+    *ctrl_cnt = 0;
+  if (is_dead)
+    *is_dead = -1;
+  while (buflen &&    /* Should be called only when w32_unicode_gui */
+         PeekMessageW(&msg, aWnd, WM_KEYFIRST, WM_KEYLAST, PM_NOREMOVE | PM_NOYIELD) &&
+         (msg.message == WM_CHAR || msg.message == WM_SYSCHAR ||
+          msg.message == WM_DEADCHAR || msg.message == WM_SYSDEADCHAR || msg.message == WM_UNICHAR)) { /* Not contiguous */
+    int dead;
+
+    GetMessageW(&msg, aWnd, msg.message, msg.message);
+    dead = (msg.message == WM_DEADCHAR || msg.message == WM_SYSDEADCHAR);
+    if (is_dead)
+      *is_dead = (dead ? msg.wParam : -1);
+    if (dead)
+      continue;
+    code_unit = msg.wParam;
+    if (doubled) { /* had surrogate */
+      if (msg.message == WM_UNICHAR || code_unit < 0xDC00 || code_unit > 0xDFFF) {
+        /* Mismatched first surrogate.  Pass both code units as if they were two characters. */
+        *buf++ = doubled;
+        if (!--buflen) // Drop the second char if at the end of the buffer
+          return i;
+      } else {
+        code_unit = (doubled << 10) + code_unit - 0x35FDC00;
+      }
+      doubled = 0;
+    } else if (code_unit >= 0xD800 && code_unit <= 0xDBFF) {
+      doubled = code_unit;
+      continue;
+    } /* We handle mismatched second surrogate the same as a normal character. */
+    /* The only "fake" characters delivered by ToUnicode() or TranslateMessage() are:
+       0x01 .. 0x1a for Control-chars,
+       0x00 and 0x1b .. 0x1f for Control- []\@^_
+       0x7f for Control-BackSpace
+       0x20 for Control-Space */
+    if (ignore_ctrl && (code_unit < 0x20 || code_unit == 0x7f || (code_unit == 0x20 && ctrl))) {
+      /* Non-character payload in a WM_CHAR (Ctrl-something pressed).  Ignore. */
+      if (ctrl_cnt)
+        (*ctrl_cnt)++;
+      continue;
+    }
+    if (code_unit < 0x7f &&
+        ((vk >= VK_NUMPAD0 && vk <= VK_DIVIDE) ||
+         (exp && ((vk >= VK_PRIOR && vk <= VK_DOWN) ||
+                  vk == VK_INSERT || vk == VK_DELETE || vk == VK_CLEAR))) &&
+        strchr("0123456789/*-+.,", code_unit)) /* Traditionally, Emacs translates these to characters later, in `self-insert-character' */
+      continue;
+    *buf++ = code_unit;
+    buflen--;
+  }
+  return i - buflen;
+}
+
+int
+deliver_wm_chars (int do_translate, HWND hwnd, UINT msg, UINT wParam, UINT lParam)
+{
+  /* An "old style" keyboard description may assign up to 125 UTF-16 code points to a keypress.
+     (However, the "old style" TranslateMessage() would deliver at most 16 of them.)  Be on a
+     safe side, and prepare to treat many more. */
+  int ctrl_cnt, buf[1024], count, is_dead;
+
+  if (do_translate) {
+    MSG windows_msg = { hwnd, msg, wParam, lParam, 0, {0,0} };
+
+    windows_msg.time = GetMessageTime ();
+    TranslateMessage (&windows_msg);
+  }
+  count = get_wm_chars (hwnd, buf, sizeof(buf)/sizeof(*buf), 1,
+                        /* The message may have been synthesized by who knows what; be conservative. */
+                        modifier_set (VK_LCONTROL) || modifier_set (VK_RCONTROL) || modifier_set (VK_CONTROL),
+                        &ctrl_cnt, &is_dead, wParam, (lParam & 0x1000000L) != 0);
+  if (count) {
+    W32Msg wmsg;
+    int *b = buf, strip_Alt = 1;
+
+    /* wParam is checked when converting CapsLock to Shift */
+    wmsg.dwModifiers = do_translate ? w32_get_key_modifiers (wParam, lParam) : 0;
+
+    /* What follows is just heuristics; the correct treatment requires non-destructive ToUnicode(). */
+    if (wmsg.dwModifiers & ctrl_modifier) /* If ctrl-something delivers chars, ctrl and the rest should be hidden */
+      wmsg.dwModifiers = wmsg.dwModifiers & shift_modifier;
+    /* In many keyboard layouts, (left) Alt is not changing the character.  Unless we are in this situation, strip Alt/Meta. */
+    if (wmsg.dwModifiers & (alt_modifier | meta_modifier) && /* If alt-something delivers non-ASCIIchars, alt should be hidden */
+        count == 1 && *b < 0x10000) {
+      SHORT r = VkKeyScanW( *b );
+
+      fprintf(stderr, "VkKeyScanW %#06x %#04x\n", (int)r, wParam);
+      if ((r & 0xFF) == wParam && !(r & ~0x1FF)) { /* Char available without Alt modifier, so Alt is "on top" */
+        if (*b > 0x7f && ('A' <= wParam && wParam <= 'Z'))
+          return 0; /* Another branch below would convert it to Alt-Latin char via wParam */
+        strip_Alt = 0;
+      }
+    }
+    if (strip_Alt)
+      wmsg.dwModifiers = wmsg.dwModifiers & ~(alt_modifier | meta_modifier);
+
+    signal_user_input ();
+    while (count--)
+      {
+        fprintf(stderr, "unichar %#06x\n", *b);
+        my_post_msg (&wmsg, hwnd, WM_UNICHAR, *b++, lParam);
+      }
+    if (!ctrl_cnt) /* Process ALSO as ctrl */
+      return 1;
+    else
+      fprintf(stderr, "extra ctrl char\n");
+    return -1;
+  } else if (is_dead >= 0) {
+    fprintf(stderr, "dead %#06x\n", is_dead);
+    return 1;
+  }
+  return 0;
+}
+
 /* Main window procedure */
 
 static LRESULT CALLBACK
@@ -3007,7 +3127,6 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
       /* Synchronize modifiers with current keystroke.  */
       sync_modifiers ();
       record_keydown (wParam, lParam);
-      wParam = map_keypad_keys (wParam, (lParam & 0x1000000L) != 0);
 
       windows_translate = 0;
 
@@ -3117,6 +3236,45 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
           wParam = VK_NUMLOCK;
           break;
         default:
+          if (w32_unicode_gui) {
+            /* If this event generates characters or deadkeys, do not interpret
+               it as a "raw combination of modifiers and keysym".  Hide
+               deadkeys, and use the generated character(s) instead of the
+               keysym.  (Backward compatibility: exceptions for numpad keys
+               generating 0-9 . , / * - +, and for extra-Alt combined with a
+               non-Latin char.)
+
+               Try to not report modifiers which have effect on which
+               character or deadkey is generated.
+
+               Example (contrived): if rightAlt-? generates f (on a Cyrillic
+               keyboard layout), and Ctrl, leftAlt do not affect the generated
+               character, one wants to report Ctrl-leftAlt-f if the user
+               presses Ctrl-leftAlt-rightAlt-?. */
+            int res;
+#if 0
+            /* Some of WM_CHAR may be fed to us directly, some are results of
+               TranslateMessage().  Using 0 as the first argument (in a
+               separate call) might help us distinguish these two cases.
+
+               However, the keypress feeders would most probably expect the
+               "standard" message pump, when TranslateMessage() is called on
+               EVERY KeyDown/Keyup event.  So they may feed us Down-Ctrl
+               Down-FAKE Char-o and expect us to recognize it as Ctrl-o.
+               Using 0 as the first argument would interfere with this. */
+            deliver_wm_chars (0, hwnd, msg, wParam, lParam);
+#endif
+            /* Processing the generated WM_CHAR messages *WHILE* we handle
+               KEYDOWN/UP event is the best choice, since without any fuss,
+               we know all 3 of: scancode, virtual keycode, and expansion.
+               (Additionally, one knows boundaries of expansion of different
+               keypresses.) */
+            res = deliver_wm_chars (1, hwnd, msg, wParam, lParam);
+            windows_translate = -( res != 0 );
+            if (res > 0) /* Bound to character(s) or a deadkey */
+              break;
+          } /* Some branches after this one may be not needed */
+          wParam = map_keypad_keys (wParam, (lParam & 0x1000000L) != 0);
           /* If not defined as a function key, change it to a WM_CHAR message.  */
           if (wParam > 255 || !lispy_function_keys[wParam])
             {
@@ -3184,6 +3342,8 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
             }
         }
 
+      if (windows_translate == -1)
+        break;
     translate:
       if (windows_translate)
         {
=======================================================

In GNU Emacs 25.0.50.20 (i686-pc-mingw32)
 of 2015-02-08 on BUCEFAL
Repository revision: d5e3922e08587e7eb9e5aec2e9f84cbda405f857
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
 `configure --prefix=/k/test'

Configured features:
SOUND NOTIFY ACL

Important settings:
  value of $LANG: ENU
  locale-coding-system: cp1252

Major mode: Fundamental

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  line-number-mode: t

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.

Load-path shadows:
None found.

Features:
(shadow sort gnus-util mail-extr emacsbug message dired format-spec
rfc822 mml easymenu mml-sec mm-decode mm-bodies mm-encode mail-parse
rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045
ietf-drums mm-util help-fns mail-prsvr mail-utils time-date tooltip
eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel
dos-w32 ls-lisp disp-table w32-win w32-vars tool-bar dnd fontset image
regexp-opt fringe tabulated-list newcomment elisp-mode lisp-mode
prog-mode register page menu-bar rfn-eshadow timer select scroll-bar
mouse jit-lock font-lock syntax facemenu font-core frame cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev
minibuffer cl-preloaded nadvice loaddefs button faces cus-face macroexp
files text-properties overlay sha1 md5 base64 format env code-pages mule
custom widget hashtable-print-readable backquote make-network-process
w32notify w32 multi-tty emacs)

Memory information:
((conses 8 80324 9864)
 (symbols 32 17968 0)
 (miscs 32 85 128)
 (strings 16 12688 4007)
 (string-bytes 1 324435)
 (vectors 8 9470)
 (vector-slots 4 390690 6074)
 (floats 8 65 62)
 (intervals 28 243 45)
 (buffers 516 13))

From unknown Thu Aug 14 18:37:43 2025
Subject: bug#19994: 25.0.50; Unicode keyboard input on Windows
To: Ilya Zakharevich
Cc: 19994@debbugs.gnu.org
Date: Wed, 04 Mar 2015 20:01:01 +0200
From: Eli Zaretskii
In-reply-to: <20150303230949.GA29784@math.berkeley.edu>
Message-id: <83bnk8prqa.fsf@gnu.org>
References: <20150303230949.GA29784@math.berkeley.edu>

> Date: Tue, 3 Mar 2015 15:09:49 -0800
> From: Ilya Zakharevich
>
> I’m working on a patch to make Unicode keyboard input work properly on
> Windows (in graphic mode).

Thanks!

> The patch below
>
>   • Implements this “primacy of characters” doctrine;
>
>   • As far as I could see, is compatible with the current work of Emacs
>     on “simple keyboard layouts”;
>
>   • Worked at some moment (before I started a massive addition of
>     comments ;-] — and maybe it is still working, I did not touch it for a
>     month);
>
>   • (Currently) ignores the indent coding rules;
>
>   • Passes all the tests thrown at it by my super-puper-all-bells-and-whistles
>     layouts; see e.g.
>        http://k.ilyaz.org/windows/izKeys-visual-maps.html#examples

Any chance of coming up with a few tests for this code, and adding
them to the test/ directory?

> If I ever find more time to work on it, I plan to:
>
>   1) Add yet more documentation;
>
>   2) Change a little bit the logic of detection of consumed/extra
>      modifiers.  This change may be cosmetic only — or maybe, with some
>      extremely devilish layouts, it may be beneficial.
>
>      (I have not seen layouts where this change would matter, though!
>      And I looked through the source code of hundred(s).)
>
>   3) Bring it in sync with the Emacs coding style.

I suggest, indeed, to clean up the code so we could commit it to the
master branch.  That way, it will get wider testing, and we can fix
whatever problems it might cause.  Any deficiencies that don't cause
regressions wrt the current code can be fixed later, or even not at
all (if we decide them to not be important enough).

Question: did you try this code with IME input methods?

> Meanwhile, I would greatly appreciate all input related to the current
> state of the patch.

Some of that (but not much) below.

> +static int
> +get_wm_chars (HWND aWnd, int *buf, int buflen, int ignore_ctrl, int ctrl, int
                            ^^^^^^^^
Why 'int' and not 'wchar_t'?

> +  while (buflen &&    /* Should be called only when w32_unicode_gui */
> +         PeekMessageW(&msg, aWnd, WM_KEYFIRST, WM_KEYLAST, PM_NOREMOVE | PM_NOYIELD) &&

Indeed, any "wide" APIs should only be called when w32_unicode_gui is
on, and there should be alternative code for when w32_unicode_gui is
off.  We still try to support Windows 9X.

> +      if (msg.message == WM_UNICHAR || code_unit < 0xDC00 || code_unit > 0xDFFF) {
> +        /* Mismatched first surrogate.  Pass both code units as if they were two characters. */
> +        *buf++ = doubled;
> +        if (!--buflen) // Drop the second char if at the end of the buffer
> +          return i;
> +      } else {
> +        code_unit = (doubled << 10) + code_unit - 0x35FDC00;
> +      }
> +      doubled = 0;
> +    } else if (code_unit >= 0xD800 && code_unit <= 0xDBFF) {

Either explain the "magic" constants in comments, or, better, use
macros with descriptive names.

> +  int ctrl_cnt, buf[1024], count, is_dead;

I think buf[] should be an array of wchar_t.  Also, will this code
work for the non-w32_unicode_gui mode?

> +  if (count) {
> +    W32Msg wmsg;
> +    int *b = buf, strip_Alt = 1;

Likewise with 'b'.

> +      SHORT r = VkKeyScanW( *b );

VkKeyScanW should be called only if w32_unicode_gui is on.  (Or maybe
the caller is only called when w32_unicode_gui is on, in which case
maybe we should have an eassert there.)

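As an illustration of the "macros with descriptive names" suggestion above, here is a minimal sketch of how the surrogate constants could be spelled out. All macro and function names below are hypothetical; they are taken neither from the patch nor from the Emacs sources. The sketch only shows where 0x35FDC00 comes from: it is the value that turns (high << 10) + low into 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00).

```c
/* UTF-16 surrogate ranges, as fixed by the Unicode standard.
   The names are illustrative only.  */
#define HIGH_SURROGATE_MIN 0xD800
#define HIGH_SURROGATE_MAX 0xDBFF
#define LOW_SURROGATE_MIN  0xDC00
#define LOW_SURROGATE_MAX  0xDFFF
/* First code point outside the Basic Multilingual Plane.  */
#define SUPPLEMENTARY_BASE 0x10000

/* Folding the three subtractions of the textbook formula into one
   constant gives 0x3600000 + 0xDC00 - 0x10000 == 0x35FDC00.  */
#define SURROGATE_OFFSET \
  ((HIGH_SURROGATE_MIN << 10) + LOW_SURROGATE_MIN - SUPPLEMENTARY_BASE)

static int
is_high_surrogate (int code_unit)
{
  return code_unit >= HIGH_SURROGATE_MIN && code_unit <= HIGH_SURROGATE_MAX;
}

static int
is_low_surrogate (int code_unit)
{
  return code_unit >= LOW_SURROGATE_MIN && code_unit <= LOW_SURROGATE_MAX;
}

/* Combine a matched surrogate pair into one Unicode code point.
   The caller must have checked both halves with the predicates above.  */
static int
combine_surrogates (int high, int low)
{
  return (high << 10) + low - SURROGATE_OFFSET;
}
```

With these definitions, the pair 0xD83D 0xDE00 combines to 0x1F600, and the extreme pairs 0xD800 0xDC00 and 0xDBFF 0xDFFF give 0x10000 and 0x10FFFF, the two ends of the supplementary range.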
> +      fprintf(stderr, "VkKeyScanW %#06x %#04x\n", (int)r, wParam);
> +      if ((r & 0xFF) == wParam && !(r & ~0x1FF)) { /* Char available without Alt modifier, so Alt is "on top" */
> +        if (*b > 0x7f && ('A' <= wParam && wParam <= 'Z'))
> +          return 0; /* Another branch below would convert it to Alt-Latin char via wParam */
> +        strip_Alt = 0;
> +      }
> +    }
> +    if (strip_Alt)
> +      wmsg.dwModifiers = wmsg.dwModifiers & ~(alt_modifier | meta_modifier);
> +
> +    signal_user_input ();
> +    while (count--)
> +      {
> +        fprintf(stderr, "unichar %#06x\n", *b);
> +        my_post_msg (&wmsg, hwnd, WM_UNICHAR, *b++, lParam);
> +      }
> +    if (!ctrl_cnt) /* Process ALSO as ctrl */
> +      return 1;
> +    else
> +      fprintf(stderr, "extra ctrl char\n");
> +    return -1;
> +  } else if (is_dead >= 0) {
> +    fprintf(stderr, "dead %#06x\n", is_dead);
> +    return 1;
> +  }

Lots of debugging output here that should be removed.

Thanks again for working on this.

From unknown Thu Aug 14 18:37:43 2025
Subject: bug#19994: 25.0.50; Unicode keyboard input on Windows
To: Eli Zaretskii
Cc: 19994@debbugs.gnu.org
Date: Thu, 5 Mar 2015 16:43:32 -0800
From: Ilya Zakharevich
Message-ID: <20150306004332.GA4927@math.berkeley.edu>
References: <20150303230949.GA29784@math.berkeley.edu> <83bnk8prqa.fsf@gnu.org>
In-Reply-To: <83bnk8prqa.fsf@gnu.org>

On Wed, Mar 04, 2015 at 08:01:01PM +0200, Eli Zaretskii wrote:
> > +static int
> > +get_wm_chars (HWND aWnd, int *buf, int buflen, int ignore_ctrl, int ctrl, int
>                             ^^^^^^^^
> Why 'int' and not 'wchar_t'?

This is for Unicode chars.  They won’t fit into (Windows’ style) wchar_t.

> > +  while (buflen &&    /* Should be called only when w32_unicode_gui */
> > +         PeekMessageW(&msg, aWnd, WM_KEYFIRST, WM_KEYLAST, PM_NOREMOVE | PM_NOYIELD) &&
>
> Indeed, any "wide" APIs should only be called when w32_unicode_gui is
> on, and there should be alternative code for when w32_unicode_gui is
> off.  We still try to support Windows 9X.

The caller ensures this.  Yes, assert() would be beneficial here.

> > +  int ctrl_cnt, buf[1024], count, is_dead;
>
> I think buf[] should be an array of wchar_t.  Also, will this code
> work for the non-w32_unicode_gui mode?

This code is pure-GUI.  For non-GUI “bindable” input on Windows the
major hurdle is that

  (A) I know no way to distinguish a “prefix key” (deadkey) keypress from
      a keypress which should trigger user bindings;

  (B) with “non-destructive ToUnicode()”, one WOULD be able to distinguish
      these two cases, — but I have no clue how to find out the current
      keyboard layout of a console session.

      (There are a lot of examples of code which returns the keyboard
      layout of a window; — but these examples do not work for console
      sessions.  I suppose that the reason is that the window is actually
      owned by a system process, and one does not have permissions to
      access its properties.)

Thanks,
Ilya

From unknown Thu Aug 14 18:37:43 2025
Subject: bug#19994: 25.0.50; Unicode keyboard input on Windows
To: Ilya Zakharevich
Cc: 19994@debbugs.gnu.org
Date: Fri, 06 Mar 2015 12:52:08 +0200
From: Eli Zaretskii
In-reply-to: <20150306004332.GA4927@math.berkeley.edu>
Message-id: <838ufao0tj.fsf@gnu.org>
References: <20150303230949.GA29784@math.berkeley.edu> <83bnk8prqa.fsf@gnu.org> <20150306004332.GA4927@math.berkeley.edu>

> Date: Thu, 5 Mar 2015 16:43:32 -0800
> From: Ilya Zakharevich
> Cc: 19994@debbugs.gnu.org
>
> On Wed, Mar 04, 2015 at 08:01:01PM +0200, Eli Zaretskii wrote:
> > > +static int
> > > +get_wm_chars (HWND aWnd, int *buf, int buflen, int ignore_ctrl, int ctrl, int
> >                             ^^^^^^^^
> > Why 'int' and not 'wchar_t'?
>
> This is for Unicode chars.  They won’t fit into (Windows’ style) wchar_t.

Right.

> > Also, will this code work for the non-w32_unicode_gui mode?
>
> This code is pure-GUI.  For non-GUI “bindable” input on Windows the
> major hurdle is that

No, that's not what I meant.  I meant GUI sessions in which
w32_unicode_gui is zero, i.e. Windows 9X systems.  Console input is a
different matter (and is handled separately, see w32inevt.c).

Thanks.

From unknown Thu Aug 14 18:37:43 2025
Subject: bug#19994: 25.0.50; Unicode keyboard input on Windows
To: Eli Zaretskii
Cc: 19994@debbugs.gnu.org
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1425642006; bh=6A0R4ROVFyquo0UWvLJlwkvUO5bcnlipaiPJmG2uq24=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From:Subject; b=mW7vWLkxfM8f/vwEnoES4vLoS1c0mG7gf/Vx/q4ZRFAWTWzA0BbyFDoG7MjOG/Q909HoowtVNDkPZEWYPBa/zMTwYKiOB/NSUxRLvaWg6ldY3LnmUSj89LYgAQbc3t84MrZKDd+17cejVRvOnV5bvpJkHJegQ+cB4cLi1GDlrlJaAXXHsrnVDznZJryRrlDy6dgY2CPKVa5fAN3+0vppnWYtbGG8OSIXXpJ03yKHk3xKEYoTKZtM3ma/PkCN5fJkzzhuD3fZHsdwHM1R1ot7SLrO/ZPQ/X8MWLJpnpZctkfY9xq/TmRo7V6qHynhho/cwz9ErJXe7paik5kYP3SZzg== Received: from [98.137.12.175] by nm30.bullet.mail.gq1.yahoo.com with NNFMP; 06 Mar 2015 11:40:06 -0000 Received: from [208.71.42.193] by tm14.bullet.mail.gq1.yahoo.com with NNFMP; 06 Mar 2015 11:40:06 -0000 Received: from [127.0.0.1] by smtp204.mail.gq1.yahoo.com with NNFMP; 06 Mar 2015 11:40:06 -0000 X-Yahoo-Newman-Id: 556613.50167.bm@smtp204.mail.gq1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: 1ORsnbIVM1ktvzA7P5ll9NmFekmdZRIFzG1PEsP_cYQCPBF 9Km3lljzdJfNG17wxTgh9ZNMlFNQoX7gBXnsrImSwOMsEG1YmEXzRRv3iaBH UVgyN8chVrsFkvWAm2m60L7yzJGVGFKv_TkwRWEfcTtGepoTBQDVffaqlIqL GMa.FMPa4eoFuKM7C_lt3y06bbySTCGJBgIKAT_BLQBfnbkMNyx.tpI_jSAS oDto7roPKUL96_YfPZHbv0ZFySujZ3ja_ptb7ocdMTnhRl_Ptht662BQbGPT 1eEmQGoiTecuhJkOiOHCxuzyaqYSD33ygqHQK0VtpMcQw7EkQl_Ssk7jjHqT 1EB.F44IxKkJQpXkPEuAGCCBs7g3MiK19sa1NJmB8pwOzVrKNxP_UkpUhIRr uYWlSPSpncJ_2f7MkYE7TmVWPyq9ZY17xUY01RHGLMIEoK_b581tk9muYtNU DH9sLp6U5ErkeIVIgZFz18MyAmA7xBPk7dnjuX58Kg0gu4bLocsYolwaJIRB P...Yuvw70VcUg6y5jyyWwCuol9P4o7uZyVH_ySVZuUg8scRz X-Yahoo-SMTP: oLSY3dWswBBqoBVzCkLl_RIsw6heKMxu8wpEbARv1SU- Date: Fri, 6 Mar 2015 03:40:03 -0800 From: Ilya Zakharevich Message-ID: <20150306114003.GB11886@math.berkeley.edu> References: <20150303230949.GA29784@math.berkeley.edu> <83bnk8prqa.fsf@gnu.org> <20150306004332.GA4927@math.berkeley.edu> <838ufao0tj.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: 
<838ufao0tj.fsf@gnu.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) On Fri, Mar 06, 2015 at 12:52:08PM +0200, Eli Zaretskii wrote: > > > Also, will this code work for the non-w32_unicode_gui mode? > > > > This code is pure-GUI. For non-GUI “bindable” input on Windows the > > major hurdle is that > > No, that's not what I meant. I meant GUI sessions in which > w32_unicode_gui is zero, i.e. Windows 9X systems. Unless w32_unicode_gui is set, the changes made by this patch are a NOP. Ilya From unknown Thu Aug 14 18:37:43 2025 X-Loop: help-debbugs@gnu.org Subject: bug#19994: 25.0.50; Unicode keyboard input on Windows Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 06 Mar 2015 14:02:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 19994 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Ilya Zakharevich Cc: 19994@debbugs.gnu.org Reply-To: Eli Zaretskii Received: via spool by 19994-submit@debbugs.gnu.org id=B19994.142565046423998 (code B ref 19994); Fri, 06 Mar 2015 14:02:02 +0000 Received: (at 19994) by debbugs.gnu.org; 6 Mar 2015 14:01:04 +0000 Received: from localhost ([127.0.0.1]:37002 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YTsoH-0006EQ-3e for submit@debbugs.gnu.org; Fri, 06 Mar 2015 09:01:04 -0500 Received: from mtaout27.012.net.il ([80.179.55.183]:48943) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YTsoB-0006E6-Au for 19994@debbugs.gnu.org; Fri, 06 Mar 2015 09:00:59 -0500 Received: from conversion-daemon.mtaout27.012.net.il by mtaout27.012.net.il (HyperSendmail v2007.08) id <0NKS00G00MBM2X00@mtaout27.012.net.il> for 
 19994@debbugs.gnu.org; Fri, 06 Mar 2015 15:55:23 +0200 (IST)
Received: from HOME-C4E4A596F7 ([87.69.4.28]) by mtaout27.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0NKS00FYFMOA2S20@mtaout27.012.net.il>; Fri, 06 Mar 2015 15:55:23 +0200 (IST)
Date: Fri, 06 Mar 2015 16:00:51 +0200
From: Eli Zaretskii
In-reply-to: <20150306114003.GB11886@math.berkeley.edu>
X-012-Sender: halo1@inter.net.il
Message-id: <831tl2ns30.fsf@gnu.org>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-transfer-encoding: 8BIT
References: <20150303230949.GA29784@math.berkeley.edu> <83bnk8prqa.fsf@gnu.org> <20150306004332.GA4927@math.berkeley.edu> <838ufao0tj.fsf@gnu.org> <20150306114003.GB11886@math.berkeley.edu>
X-Spam-Score: 1.0 (+)
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 1.0 (+)

> Date: Fri, 6 Mar 2015 03:40:03 -0800
> From: Ilya Zakharevich
> Cc: 19994@debbugs.gnu.org
>
> On Fri, Mar 06, 2015 at 12:52:08PM +0200, Eli Zaretskii wrote:
> > > > Also, will this code work for the non-w32_unicode_gui mode?
> > >
> > > This code is pure-GUI.  For non-GUI “bindable” input on Windows the
> > > major hurdle is that
> >
> > No, that's not what I meant.  I meant GUI sessions in which
> > w32_unicode_gui is zero, i.e. Windows 9X systems.
>
> Unless w32_unicode_gui is set, the changes made by this patch are a NOP.

That's fine, thanks.
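
[Editorial aside on the 'int' versus 'wchar_t' point from earlier in this
exchange.]  On Windows, wchar_t is a 16-bit UTF-16 code unit, so a code
point above U+FFFF arrives as two WM_CHAR messages (a surrogate pair) and
has to be combined into a wider integer.  The following is only a minimal
sketch of that arithmetic; the helper names are hypothetical and do not
appear in the patch.  It also shows where the constant 0x35FDC00 used in
get_wm_chars comes from: (0xD800 << 10) + 0xDC00 - 0x10000.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the patch.  */

/* True if U is a UTF-16 high (first) surrogate.  */
int is_high_surrogate (uint32_t u) { return u >= 0xD800 && u <= 0xDBFF; }

/* True if U is a UTF-16 low (second) surrogate.  */
int is_low_surrogate (uint32_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

/* True if code point CP fits into one 16-bit code unit.  */
int fits_in_utf16_unit (uint32_t cp) { return cp <= 0xFFFF; }

/* Textbook form: strip the surrogate tags, then add the 0x10000 offset.  */
uint32_t
combine_surrogates (uint32_t hi, uint32_t lo)
{
  return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
}

/* Same computation with the three constants folded into one, in the
   form the patch uses: (0xD800 << 10) + 0xDC00 - 0x10000 == 0x35FDC00.  */
uint32_t
combine_surrogates_folded (uint32_t hi, uint32_t lo)
{
  return (hi << 10) + lo - 0x35FDC00;
}
```

For example, the pair 0xD83D 0xDE00 combines to U+1F600 in both forms, a
value that does not fit into a 16-bit wchar_t.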
From unknown Thu Aug 14 18:37:43 2025 X-Loop: help-debbugs@gnu.org Subject: bug#19994: 25.0.50; Unicode keyboard input on Windows Resent-From: Ilya Zakharevich Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Wed, 01 Jul 2015 10:08:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 19994 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Eli Zaretskii Cc: 19994@debbugs.gnu.org Received: via spool by 19994-submit@debbugs.gnu.org id=B19994.1435745248612 (code B ref 19994); Wed, 01 Jul 2015 10:08:02 +0000 Received: (at 19994) by debbugs.gnu.org; 1 Jul 2015 10:07:28 +0000 Received: from localhost ([127.0.0.1]:35218 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZAEvO-00009m-Ks for submit@debbugs.gnu.org; Wed, 01 Jul 2015 06:07:27 -0400 Received: from nm16-vm5.bullet.mail.gq1.yahoo.com ([98.137.177.253]:59385) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZAEvL-00009X-05 for 19994@debbugs.gnu.org; Wed, 01 Jul 2015 06:07:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1435745237; bh=Wz234/5ogZhGbRVRhlsOH1L5fd1LCHsM1TWyZ8G3MZA=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From:Subject; b=Cr7IfVjTNhhSrOCEt/kuq8KyRKrcTPwvlaBLGubnfuDn6X/El1JYkkNapIBH7UxHF3K/PnSVUNiT04obV0qsdgYlb//N9VDhaHbb80aOAAPPyoPchqKzYXt01c8+LGc4fVaWnJJQFbUpgrPMGGYEkSa2/oYKhKre+Fkn1p4VYHLR0MasrUEcKbmq0/HLXg/Edgw5Qrw06nyNIfbgtDIhLR/Jlr9BmyylM+vKAbup1xwkUbiO3LxZcG8Uo9ExTTpGsp78hQykurgsfpuOzv/6mgW4xnNZgpLGvHHoANpFSOx0Dh1UUC7lkpN6KTp5dwAwDfMUuPJ6wDzlrf59PF3bCQ== Received: from [98.137.12.190] by nm16.bullet.mail.gq1.yahoo.com with NNFMP; 01 Jul 2015 10:07:17 -0000 Received: from [98.136.164.64] by tm11.bullet.mail.gq1.yahoo.com with NNFMP; 01 Jul 2015 10:07:16 -0000 Received: from [127.0.0.1] by smtp226.mail.gq1.yahoo.com with NNFMP; 01 Jul 2015 10:07:16 -0000 X-Yahoo-Newman-Id: 537671.50999.bm@smtp226.mail.gq1.yahoo.com X-Yahoo-Newman-Property: 
ymail-3 X-YMail-OSG: XLDXFYoVM1mVsX_KlNODDp1r3CNk3rtRTnQa9UlmDEvL08g AE4_nVb6vPIg9_G_UG5Ihe8dnvI_b2O7j3Fa7BBq_spmvmHcBtoAMnjtelaE MwJ_gGRco_pbYm80vSBNexNFoiU8s745a1Tb6B38zzuUf9_s2kHS8l6z7pap fFi6gEciJ3KZsS6Xibl2W8E8aX05mbQUyf9h.HFAboRcp4NnX30P5Rp6zAbh 6hivna4priNq_BmN7kpEU9W20SPuLGct4ev9y16KhkCdQdzHQ6n9XPeDsGxX qTTVekjguav3fcphENbd2F7YpeWShrb5iTeEZHJuQ9XjmWpUJdZDgtaelfw. WdxArrRHWXv1K7RhWFsk1No4nJIEIIqbOeiflvnN3.etd7HUtixaRAY.Y9F4 uqxKPTc9x9SqFWNyPwkazfrmPTiiWZIuvNGvRhDweB0Ju9ELrz_1W_pWqX4t cdMHK8nWUdloCeWqPdRWBd7wC4qpgF1HYg2ZbvpRaPdvFnSPUlEVFUQsp_CG 54P84BuDL0TbLeYQOqly0V4evROxTo62KaLwtIUxz X-Yahoo-SMTP: oLSY3dWswBBqoBVzCkLl_RIsw6heKMxu8wpEbARv1SU- Date: Wed, 1 Jul 2015 03:07:12 -0700 From: Ilya Zakharevich Message-ID: <20150701100712.GA24175@math.berkeley.edu> References: <20150303230949.GA29784@math.berkeley.edu> <83bnk8prqa.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <83bnk8prqa.fsf@gnu.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.0 (/) On Wed, Mar 04, 2015 at 08:01:01PM +0200, Eli Zaretskii wrote: > > Date: Tue, 3 Mar 2015 15:09:49 -0800 > > From: Ilya Zakharevich > > > > I’m working on a patch to make Unicode keyboard input to work properly on > > Windows (in graphic mode). > I suggest, indeed, to clean up the code so we could commit it to the > master branch. That way, it will get wider testing, and we can fix > whatever problems it might cause. Any deficiencies that don't cause > regressions wrt the current code can be fixed later, or even not at > all (if we decide them to not be important enough). 
I had no time to work on the code itself, but
  • I fixed the formatting,
  • I pumped up the docs,
  • I put in the suggested eassert().

----------------

As before, the patch
  • defines two new static functions,
  • delays the modification of wParam until it is needed (moves 1 LoC in
    w32_wnd_proc()), and
  • adds 8 LoC to w32_wnd_proc().

The call to these static functions is conditional on w32_unicode_gui.

Enjoy,
Ilya

--- w32fns.c-ini 2015-01-30 15:33:23.505201400 -0800
+++ w32fns.c 2015-07-01 02:56:30.787672000 -0700
@@ -2832,6 +2832,233 @@ post_character_message (HWND hwnd, UINT
   my_post_msg (&wmsg, hwnd, msg, wParam, lParam);
 }

+static int
+get_wm_chars (HWND aWnd, int *buf, int buflen, int ignore_ctrl, int ctrl,
+              int *ctrl_cnt, int *is_dead, int vk, int exp)
+{
+  MSG msg;
+  /* If doubled is at the end, ignore it */
+  int i = buflen, doubled = 0, code_unit;
+
+  if (ctrl_cnt)
+    *ctrl_cnt = 0;
+  if (is_dead)
+    *is_dead = -1;
+  eassert(w32_unicode_gui);
+  while (buflen
+         /* Should be called only when w32_unicode_gui: */
+         && PeekMessageW(&msg, aWnd, WM_KEYFIRST, WM_KEYLAST,
+                         PM_NOREMOVE | PM_NOYIELD)
+         && (msg.message == WM_CHAR || msg.message == WM_SYSCHAR
+             || msg.message == WM_DEADCHAR || msg.message == WM_SYSDEADCHAR
+             || msg.message == WM_UNICHAR))
+    {
+      /* We extract character payload, but in this call we handle only the
+         characters which come BEFORE the next keyup/keydown message. */
+      int dead;
+
+      GetMessageW(&msg, aWnd, msg.message, msg.message);
+      dead = (msg.message == WM_DEADCHAR || msg.message == WM_SYSDEADCHAR);
+      if (is_dead)
+        *is_dead = (dead ? msg.wParam : -1);
+      if (dead)
+        continue;
+      code_unit = msg.wParam;
+      if (doubled)
+        {
+          /* had surrogate */
+          if (msg.message == WM_UNICHAR
+              || code_unit < 0xDC00 || code_unit > 0xDFFF)
+            { /* Mismatched first surrogate.
+                 Pass both code units as if they were two characters. */
+              *buf++ = doubled;
+              if (!--buflen)
+                return i; /* Drop the 2nd char if at the end of the buffer.
+                             */
+            }
+          else /* see https://en.wikipedia.org/wiki/UTF-16 */
+            {
+              code_unit = (doubled << 10) + code_unit - 0x35FDC00;
+            }
+          doubled = 0;
+        }
+      else if (code_unit >= 0xD800 && code_unit <= 0xDBFF)
+        {
+          /* Handle mismatched 2nd surrogate the same as a normal character. */
+          doubled = code_unit;
+          continue;
+        }
+
+      /* The only "fake" characters delivered by ToUnicode() or
+         TranslateMessage() are:
+         0x01 .. 0x1a for Ctrl-letter, Enter, Tab, Ctrl-Break, Esc, Backspace
+         0x00 and 0x1b .. 0x1f for Control- []\@^_
+         0x7f for Control-BackSpace
+         0x20 for Control-Space */
+      if (ignore_ctrl
+          && (code_unit < 0x20 || code_unit == 0x7f
+              || (code_unit == 0x20 && ctrl)))
+        {
+          /* Non-character payload in a WM_CHAR
+             (Ctrl-something pressed, see above).  Ignore, and report. */
+          if (ctrl_cnt)
+            (*ctrl_cnt)++;
+          continue;
+        }
+      /* Traditionally, Emacs would ignore the character payload of VK_NUMPAD*
+         keys, and would treat them later via `function-key-map'.  In addition
+         to usual 102-key NUMPAD keys, this map also treats `kp-'-variants of
+         space, tab, enter, separator, equal.  TAB and EQUAL, apparently,
+         cannot be generated on Win-GUI branch.  ENTER is already handled
+         by the code above.  According to `lispy_function_keys', kp_space is
+         generated by not-extended VK_CLEAR.  (kp-tab != VK_OEM_NEC_EQUAL!).
+
+         We do similarly for backward-compatibility, but ignore only the
+         characters restorable later by `function-key-map'.
+         */
+      if (code_unit < 0x7f
+          && ((vk >= VK_NUMPAD0 && vk <= VK_DIVIDE)
+              || (exp && ((vk >= VK_PRIOR && vk <= VK_DOWN) ||
+                          vk == VK_INSERT || vk == VK_DELETE || vk == VK_CLEAR)))
+          && strchr("0123456789/*-+.,", code_unit))
+        continue;
+      *buf++ = code_unit;
+      buflen--;
+    }
+  return i - buflen;
+}
+
+#ifdef DBG_WM_CHARS
+# define FPRINTF_WM_CHARS(ARG) fprintf ARG
+#else
+# define FPRINTF_WM_CHARS(ARG) 0
+#endif
+
+int
+deliver_wm_chars (int do_translate, HWND hwnd, UINT msg, UINT wParam,
+                  UINT lParam, int legacy_alt_meta)
+{
+  /* An "old style" keyboard description may assign up to 125 UTF-16 code
+     points to a keypress.
+     (However, the "old style" TranslateMessage() would deliver at most 16 of
+     them.)  To be on the safe side, prepare to treat many more. */
+  int ctrl_cnt, buf[1024], count, is_dead;
+
+  /* Since the keypress processing logic of Windows has a lot of state, it
+     is important to call TranslateMessage() for every keyup/keydown, AND
+     do it exactly once.  (The actual change of state is done by
+     ToUnicode[Ex](), which is called by TranslateMessage().  So one can
+     call ToUnicode[Ex]() instead.)
+
+     The "usual" message pump calls TranslateMessage() for EVERY event.
+     Emacs calls TranslateMessage() very selectively (is it needed for doing
+     some tricky stuff with Win95???  With newer Windows, selectiveness is,
+     most probably, not needed - and harms a lot).
+
+     So, with the usual message pump, the following call to TranslateMessage()
+     is not needed (and is going to be VERY harmful).  With Emacs' message
+     pump, the call is needed. */
+  if (do_translate) {
+    MSG windows_msg = { hwnd, msg, wParam, lParam, 0, {0,0} };
+
+    windows_msg.time = GetMessageTime ();
+    TranslateMessage (&windows_msg);
+  }
+  count = get_wm_chars (hwnd, buf, sizeof(buf)/sizeof(*buf), 1,
+                        /* The message may have been synthesized by
+                           who knows what; be conservative.
+                           */
+                        modifier_set (VK_LCONTROL)
+                        || modifier_set (VK_RCONTROL)
+                        || modifier_set (VK_CONTROL),
+                        &ctrl_cnt, &is_dead, wParam,
+                        (lParam & 0x1000000L) != 0);
+  if (count) {
+    W32Msg wmsg;
+    int *b = buf, strip_Alt = 1;
+
+    /* wParam is checked when converting CapsLock to Shift */
+    wmsg.dwModifiers = do_translate
+      ? w32_get_key_modifiers (wParam, lParam) : 0;
+
+    /* What follows is just heuristics; the correct treatment requires
+       non-destructive ToUnicode():
+         http://search.cpan.org/~ilyaz/UI-KeyboardLayout/lib/UI/KeyboardLayout.pm#Can_an_application_on_Windows_accept_keyboard_events?_Part_IV:_application-specific_modifiers
+
+       What one needs to find is:
+         * which of the present modifiers AFFECT the resulting char(s)
+           (so should be stripped, since their EFFECT is "already
+           taken into account" in the string in buf), and
+         * which modifiers are not affecting buf, so should be reported to
+           the application for further treatment.
+
+       Example: assume that we know:
+         (A) lCtrl+rCtrl+rAlt modifiers with VK_A key produce a Latin "f"
+             ("may be logical" with a JCUKEN-flavored Russian keyboard flavor);
+         (B) removing any one of lCtrl, rCtrl, rAlt changes the produced char;
+         (C) Win-modifier is not affecting the produced character
+             (this is the common case: happens with all "standard" layouts).
+
+       Suppose the user presses Win+lCtrl+rCtrl+rAlt modifiers with VK_A.
+       What is the intent of the user?  We need to guess the intent to decide
+       which event to deliver to the application.
+
+       This looks like a reasonable logic: since Win- modifier does not affect
+       the output string, the user was pressing Win for SOME OTHER purpose.
+       So the user wanted to generate Win-SOMETHING event.  Now, what is
+       something?  If one takes the mantra that "character payload is more
+       important than the combination of keypresses which resulted in this
+       payload", then one should ignore lCtrl+rCtrl+rAlt, ignore VK_A, and
+       assume that the user wanted to generate Win-f.
+
+       Unfortunately, without non-destructive ToUnicode(), checking (B) and (C)
+       is out of the question.  So we use heuristics (hopefully, covering
+       99.9999% of cases).
+     */
+
+    /* If ctrl-something delivers chars, ctrl and the rest should be hidden;
+       so the consumer of key-event won't interpret it as an accelerator. */
+    if (wmsg.dwModifiers & ctrl_modifier)
+      wmsg.dwModifiers = wmsg.dwModifiers & shift_modifier;
+    /* In many keyboard layouts, (left) Alt is not changing the character.
+       Unless we are in this situation, strip Alt/Meta. */
+    if (wmsg.dwModifiers & (alt_modifier | meta_modifier)
+        /* If alt-something delivers non-ASCII chars, alt should be hidden */
+        && count == 1 && *b < 0x10000)
+      {
+        SHORT r = VkKeyScanW( *b );
+
+        FPRINTF_WM_CHARS((stderr, "VkKeyScanW %#06x %#04x\n", (int)r, wParam));
+        if ((r & 0xFF) == wParam && !(r & ~0x1FF))
+          {
+            /* Char available without Alt modifier, so Alt is "on top" */
+            if (legacy_alt_meta
+                && *b > 0x7f && ('A' <= wParam && wParam <= 'Z'))
+              /* For backward-compatibility with older Emacsen, let
+                 this be processed by another branch below (which would convert
+                 it to Alt-Latin char via wParam). */
+              return 0;
+            strip_Alt = 0;
+          }
+      }
+    if (strip_Alt)
+      wmsg.dwModifiers = wmsg.dwModifiers & ~(alt_modifier | meta_modifier);
+
+    signal_user_input ();
+    while (count--)
+      {
+        FPRINTF_WM_CHARS((stderr, "unichar %#06x\n", *b));
+        my_post_msg (&wmsg, hwnd, WM_UNICHAR, *b++, lParam);
+      }
+    if (!ctrl_cnt) /* Process ALSO as ctrl */
+      return 1;
+    else
+      FPRINTF_WM_CHARS((stderr, "extra ctrl char\n"));
+    return -1;
+  } else if (is_dead >= 0) {
+    FPRINTF_WM_CHARS((stderr, "dead %#06x\n", is_dead));
+    return 1;
+  }
+  return 0;
+}
+
 /* Main window procedure */

 static LRESULT CALLBACK
@@ -3007,7 +3234,6 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
       /* Synchronize modifiers with current keystroke.
          */
       sync_modifiers ();
       record_keydown (wParam, lParam);
-      wParam = map_keypad_keys (wParam, (lParam & 0x1000000L) != 0);

       windows_translate = 0;

@@ -3117,6 +3343,46 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
           wParam = VK_NUMLOCK;
           break;
         default:
+          if (w32_unicode_gui) {
+            /* If this event generates characters or deadkeys, do not interpret
+               it as a "raw combination of modifiers and keysym".  Hide
+               deadkeys, and use the generated character(s) instead of the
+               keysym.  (Backward compatibility: exceptions for numpad keys
+               generating 0-9 . , / * - +, and for extra-Alt combined with a
+               non-Latin char.)
+
+               Try to not report modifiers which have effect on which
+               character or deadkey is generated.
+
+               Example (contrived): if rightAlt-? generates f (on a Cyrillic
+               keyboard layout), and Ctrl, leftAlt do not affect the generated
+               character, one wants to report Ctrl-leftAlt-f if the user
+               presses Ctrl-leftAlt-rightAlt-?. */
+            int res;
+#if 0
+            /* Some of WM_CHAR may be fed to us directly, some are results of
+               TranslateMessage().  Using 0 as the first argument (in a
+               separate call) might help us distinguish these two cases.
+
+               However, the keypress feeders would most probably expect the
+               "standard" message pump, when TranslateMessage() is called on
+               EVERY KeyDown/Keyup event.  So they may feed us Down-Ctrl
+               Down-FAKE Char-o and expect us to recognize it as Ctrl-o.
+               Using 0 as the first argument would interfere with this. */
+            deliver_wm_chars (0, hwnd, msg, wParam, lParam, 1);
+#endif
+            /* Processing the generated WM_CHAR messages *WHILE* we handle
+               KEYDOWN/UP event is the best choice, since without any fuss,
+               we know all 3 of: scancode, virtual keycode, and expansion.
+               (Additionally, one knows boundaries of expansion of different
+               keypresses.)
+               */
+            res = deliver_wm_chars (1, hwnd, msg, wParam, lParam, 1);
+            windows_translate = -( res != 0 );
+            if (res > 0) /* Bound to character(s) or a deadkey */
+              break;
+            /* deliver_wm_chars() may make some branches after this vestigial */
+          }
+          wParam = map_keypad_keys (wParam, (lParam & 0x1000000L) != 0);
           /* If not defined as a function key, change it to a WM_CHAR message. */
           if (wParam > 255 || !lispy_function_keys[wParam])
             {
@@ -3184,6 +3450,8 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
             }
         }

+      if (windows_translate == -1)
+        break;
     translate:
       if (windows_translate)
         {

From unknown Thu Aug 14 18:37:43 2025
X-Loop: help-debbugs@gnu.org
Subject: bug#19994: 25.0.50; Unicode keyboard input on Windows
Resent-From: Ilya Zakharevich
Original-Sender: "Debbugs-submit"
Resent-CC: bug-gnu-emacs@gnu.org
Resent-Date: Thu, 09 Jul 2015 00:04:01 +0000
Resent-Message-ID:
Resent-Sender: help-debbugs@gnu.org
X-GNU-PR-Message: followup 19994
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords:
To: Eli Zaretskii
Cc: 19994@debbugs.gnu.org
Received: via spool by 19994-submit@debbugs.gnu.org id=B19994.143640019722160 (code B ref 19994); Thu, 09 Jul 2015 00:04:01 +0000
Received: (at 19994) by debbugs.gnu.org; 9 Jul 2015 00:03:17 +0000
Received: from localhost ([127.0.0.1]:44570 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZCzJ4-0005lJ-I6 for submit@debbugs.gnu.org; Wed, 08 Jul 2015 20:03:17 -0400
Received: from nm12-vm7.bullet.mail.gq1.yahoo.com ([98.136.218.206]:51838) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZCzIz-0005l3-9u for 19994@debbugs.gnu.org; Wed, 08 Jul 2015 20:03:12 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1436400183; bh=DxzVPgUsJPiR72x2TfhbUTqOKsPmabVRe2B5y5vrjy8=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From:Subject;
b=uifTgYcVQuQmro7nz2nb2zTaezB3Lh0h/pBC/1u/vbFCIVJgtpmTGyVY87IZxtMiUXd2btI3MW9CA+ULfECATdpU2XgRU296E+OfsO4eX3rosKcO5kIqx7Af5Zi/8WWpQj0fefDELjGtANPxKzAIfIRRKToO2gBXx0KU31LafXAXdKM0FjmgenjPaXCpZ4RMnh632kIHpnHeTE4VjBVHPTsQpdO7c67XMV88/mkmesrq9fYv7aJBkhhoBDY5v/AGKQCSilUUKDLwxXyKrLE0pRHJfe6R/nkYCuSJihgbK0IkUBbnG05+Htt23WPy1LIMJ0ZFhXDwiMRFMA2ftVzOXw== Received: from [98.137.12.62] by nm12.bullet.mail.gq1.yahoo.com with NNFMP; 09 Jul 2015 00:03:03 -0000 Received: from [208.71.42.199] by tm7.bullet.mail.gq1.yahoo.com with NNFMP; 09 Jul 2015 00:03:03 -0000 Received: from [127.0.0.1] by smtp210.mail.gq1.yahoo.com with NNFMP; 09 Jul 2015 00:03:03 -0000 X-Yahoo-Newman-Id: 303987.68193.bm@smtp210.mail.gq1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: azpXjMsVM1mnSmZ2QmGIqfSxsnVJ1UhbI1tef.aZbbjiAZ_ QrZeU0bZt925AhhRFMEcH9S52RpA.PH60x1RNB1aDbCEJrU.OjemLzBB8nV3 t1yR.nsH.Bw7_angeoRqmTpkikryxCYlybjMTE85Kjen4WfoZg.ywzw8Tmyp 0fCs9HI2QSAnjorTS0.Z5QRFl6nyQPGBi_KMlVi7DZK69SlK8bDuWp9Tuett dDU9JYcS_e12vkEniqxVM_PAIh7HzX7cRe3AX9BfgnsAXWcQi9k1xDgiVEvt mc3P6PnlKbkCYFeNpwg7mQ2nkZduhCvBiq.rbJEAn2ATXCc0dI8OY0BieQku MLF5WhoJzip9wjerfEH.1FOQxNo_v6H4RzQnwPiITUirrFXOAij1aLBZ.ZKt bojDIAiHdMdZpVDRCXQEYCinkY6BtKVV7h0j3ZaNVJT0d7Le0S2lLkxRw8v0 iqND5Es3U6MJr4pMlFmxUWsSJpKWjBpTSihJA6v0ig8qsb0nCRgd1ukDJqOF tu2fcd4b.BZS3MGpFNFi_zWoxOJHkiRdmR83e2eS_dRPR X-Yahoo-SMTP: oLSY3dWswBBqoBVzCkLl_RIsw6heKMxu8wpEbARv1SU- Date: Wed, 8 Jul 2015 17:02:59 -0700 From: Ilya Zakharevich Message-ID: <20150709000259.GA7163@math.berkeley.edu> References: <20150303230949.GA29784@math.berkeley.edu> <83bnk8prqa.fsf@gnu.org> <20150701100712.GA24175@math.berkeley.edu> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="Q68bSM7Ycu6FN28Q" Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20150701100712.GA24175@math.berkeley.edu> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: 
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 0.0 (/)

--Q68bSM7Ycu6FN28Q
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

On Wed, Jul 01, 2015 at 03:07:12AM -0700, Ilya Zakharevich wrote:
> On Wed, Mar 04, 2015 at 08:01:01PM +0200, Eli Zaretskii wrote:
> > I suggest, indeed, to clean up the code so we could commit it to the
> > master branch.  That way, it will get wider testing, and we can fix

> I had no time to work on the code itself, but
>   • I fixed the formatting,
>   • I pumped up the docs,
>   • I put in the suggested eassert().

The variant I sent was too primitive: it did not cover a (common?) use
case in which (with AltGr layouts) leftCtrl+rightCtrl behaves differently
from pressing AltGr:
  • leftCtrl+rightCtrl would trigger C-M-key;
  • AltGr would enter the character payload.

This update
  (0) fixes two formatting-style omissions;
  (A) adds A LOAD of new comments;
  (B) treats such important cases (as above) separately;
  (z) marks a piece of old code which does not make any sense
      (see the last chunk in the relative patch).

Notes:
  • In (B), there are some decisions to make.  I encapsulate these
    decisions into two strings.  For best results, these strings should
    be user-customizable.  However, currently they are just put into
    C #defines.  When I have spent more time on this, and if these
    customizations turn out to be useful, one can make them into Lisp
    variables.

  • There is a bug in the (old) Emacs code which prevents some cases
    treated in (B) from being really useful.  I have not fixed it yet.
    To see the bug:
      ∘ switch to a layout with AltGr;
      ∘ assume that AltGr-s produces ß (as with US International);
      ∘ pressing AltGr-rightControl-s produces Meta-ß;
      ∘ pressing rightControl-AltGr-s produces C-M-s.
    (I do not think this effect is intentional.)
  • And, BTW, is it documented anywhere that leftControl-rightControl-key
    produces C-M-key?

I include two patches:
  □ absolute (ignore the previous patches)
  □ relative (with whitespace ignored), for reading.

Enjoy,
Ilya

--Q68bSM7Ycu6FN28Q
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="w32fns.c-diff-v2"

--- w32fns.c-ini 2015-01-30 15:33:23.505201400 -0800
+++ w32fns.c 2015-07-08 16:32:11.187197700 -0700
@@ -2832,6 +2832,413 @@ post_character_message (HWND hwnd, UINT
   my_post_msg (&wmsg, hwnd, msg, wParam, lParam);
 }

+static int
+get_wm_chars (HWND aWnd, int *buf, int buflen, int ignore_ctrl, int ctrl,
+              int *ctrl_cnt, int *is_dead, int vk, int exp)
+{
+  MSG msg;
+  /* If doubled is at the end, ignore it */
+  int i = buflen, doubled = 0, code_unit;
+
+  if (ctrl_cnt)
+    *ctrl_cnt = 0;
+  if (is_dead)
+    *is_dead = -1;
+  eassert(w32_unicode_gui);
+  while (buflen
+         /* Should be called only when w32_unicode_gui: */
+         && PeekMessageW(&msg, aWnd, WM_KEYFIRST, WM_KEYLAST,
+                         PM_NOREMOVE | PM_NOYIELD)
+         && (msg.message == WM_CHAR || msg.message == WM_SYSCHAR
+             || msg.message == WM_DEADCHAR || msg.message == WM_SYSDEADCHAR
+             || msg.message == WM_UNICHAR))
+    {
+      /* We extract character payload, but in this call we handle only the
+         characters which come BEFORE the next keyup/keydown message. */
+      int dead;
+
+      GetMessageW(&msg, aWnd, msg.message, msg.message);
+      dead = (msg.message == WM_DEADCHAR || msg.message == WM_SYSDEADCHAR);
+      if (is_dead)
+        *is_dead = (dead ? msg.wParam : -1);
+      if (dead)
+        continue;
+      code_unit = msg.wParam;
+      if (doubled)
+        {
+          /* had surrogate */
+          if (msg.message == WM_UNICHAR
+              || code_unit < 0xDC00 || code_unit > 0xDFFF)
+            { /* Mismatched first surrogate.
+                 Pass both code units as if they were two characters. */
+              *buf++ = doubled;
+              if (!--buflen)
+                return i; /* Drop the 2nd char if at the end of the buffer.
+                             */
+            }
+          else /* see https://en.wikipedia.org/wiki/UTF-16 */
+            {
+              code_unit = (doubled << 10) + code_unit - 0x35FDC00;
+            }
+          doubled = 0;
+        }
+      else if (code_unit >= 0xD800 && code_unit <= 0xDBFF)
+        {
+          /* Handle mismatched 2nd surrogate the same as a normal character. */
+          doubled = code_unit;
+          continue;
+        }
+
+      /* The only "fake" characters delivered by ToUnicode() or
+         TranslateMessage() are:
+         0x01 .. 0x1a for Ctrl-letter, Enter, Tab, Ctrl-Break, Esc, Backspace
+         0x00 and 0x1b .. 0x1f for Control- []\@^_
+         0x7f for Control-BackSpace
+         0x20 for Control-Space */
+      if (ignore_ctrl
+          && (code_unit < 0x20 || code_unit == 0x7f
+              || (code_unit == 0x20 && ctrl)))
+        {
+          /* Non-character payload in a WM_CHAR
+             (Ctrl-something pressed, see above).  Ignore, and report. */
+          if (ctrl_cnt)
+            (*ctrl_cnt)++;
+          continue;
+        }
+      /* Traditionally, Emacs would ignore the character payload of VK_NUMPAD*
+         keys, and would treat them later via `function-key-map'.  In addition
+         to usual 102-key NUMPAD keys, this map also treats `kp-'-variants of
+         space, tab, enter, separator, equal.  TAB and EQUAL, apparently,
+         cannot be generated on Win-GUI branch.  ENTER is already handled
+         by the code above.  According to `lispy_function_keys', kp_space is
+         generated by not-extended VK_CLEAR.  (kp-tab != VK_OEM_NEC_EQUAL!).
+
+         We do similarly for backward-compatibility, but ignore only the
+         characters restorable later by `function-key-map'. */
+      if (code_unit < 0x7f
+          && ((vk >= VK_NUMPAD0 && vk <= VK_DIVIDE)
+              || (exp && ((vk >= VK_PRIOR && vk <= VK_DOWN) ||
+                          vk == VK_INSERT || vk == VK_DELETE || vk == VK_CLEAR)))
+          && strchr("0123456789/*-+.,", code_unit))
+        continue;
+      *buf++ = code_unit;
+      buflen--;
+    }
+  return i - buflen;
+}
+
+#ifdef DBG_WM_CHARS
+# define FPRINTF_WM_CHARS(ARG) fprintf ARG
+#else
+# define FPRINTF_WM_CHARS(ARG) 0
+#endif
+
+/* This is a heuristic only.  This is supposed to track the state of the
+   finite automaton in the language environment of Windows.
+
+   However, separate windows (if they have different language
+   environments!) should have different values.  Moreover, switching to a
+   non-Emacs window with the same language environment, and using (dead)keys
+   there would change the value stored in the kernel, but not this value. */
+static int after_deadkey = 0;
+
+int
+deliver_wm_chars (int do_translate, HWND hwnd, UINT msg, UINT wParam,
+                  UINT lParam, int legacy_alt_meta)
+{
+  /* An "old style" keyboard description may assign up to 125 UTF-16 code
+     points to a keypress.
+     (However, the "old style" TranslateMessage() would deliver at most 16 of
+     them.)  To be on the safe side, prepare to treat many more. */
+  int ctrl_cnt, buf[1024], count, is_dead, after_dead = (after_deadkey != -1);
+
+  /* Since the keypress processing logic of Windows has a lot of state, it
+     is important to call TranslateMessage() for every keyup/keydown, AND
+     do it exactly once.  (The actual change of state is done by
+     ToUnicode[Ex](), which is called by TranslateMessage().  So one can
+     call ToUnicode[Ex]() instead.)
+
+     The "usual" message pump calls TranslateMessage() for EVERY event.
+     Emacs calls TranslateMessage() very selectively (is it needed for doing
+     some tricky stuff with Win95???  With newer Windows, selectiveness is,
+     most probably, not needed - and harms a lot).
+
+     So, with the usual message pump, the following call to TranslateMessage()
+     is not needed (and is going to be VERY harmful).  With Emacs' message
+     pump, the call is needed. */
+  if (do_translate)
+    {
+      MSG windows_msg = { hwnd, msg, wParam, lParam, 0, {0,0} };
+
+      windows_msg.time = GetMessageTime ();
+      TranslateMessage (&windows_msg);
+    }
+  count = get_wm_chars (hwnd, buf, sizeof(buf)/sizeof(*buf), 1,
+                        /* The message may have been synthesized by
+                           who knows what; be conservative.
+                           */
+                        modifier_set (VK_LCONTROL)
+                        || modifier_set (VK_RCONTROL)
+                        || modifier_set (VK_CONTROL),
+                        &ctrl_cnt, &is_dead, wParam,
+                        (lParam & 0x1000000L) != 0);
+  if (count)
+    {
+      W32Msg wmsg;
+      DWORD console_modifiers = construct_console_modifiers ();
+      int *b = buf, strip_Alt = 1, strip_ExtraMods = 1, hairy = 0;
+      char *type_CtrlAlt = NULL;
+
+      /* XXXX In fact, there may be another case when we need to do the same:
+         What happens if the string defined in the LIGATURES has length
+         0?  Probably, we will get count==0, but the state of the finite
+         automaton would reset to 0??? */
+      after_deadkey = -1;
+
+      /* wParam is checked when converting CapsLock to Shift; this is a clone
+         of w32_get_key_modifiers (). */
+      wmsg.dwModifiers = w32_kbd_mods_to_emacs (console_modifiers, wParam);
+
+      /* What follows is just heuristics; the correct treatment requires
+         non-destructive ToUnicode():
+           http://search.cpan.org/~ilyaz/UI-KeyboardLayout/lib/UI/KeyboardLayout.pm#Can_an_application_on_Windows_accept_keyboard_events?_Part_IV:_application-specific_modifiers
+
+         What one needs to find is:
+           * which of the present modifiers AFFECT the resulting char(s)
+             (so should be stripped, since their EFFECT is "already
+             taken into account" in the string in buf), and
+           * which modifiers are not affecting buf, so should be reported to
+             the application for further treatment.
+
+         Example: assume that we know:
+           (A) lCtrl+rCtrl+rAlt modifiers with VK_A key produce a Latin "f"
+               ("may be logical" in JCUKEN-flavored Russian keyboard flavors);
+           (B) removing any of lCtrl, rCtrl, rAlt changes the produced char;
+           (C) Win-modifier is not affecting the produced character
+               (this is the common case: happens with all "standard" layouts).
+
+         Suppose the user presses Win+lCtrl+rCtrl+rAlt modifiers with VK_A.
+         What is the intent of the user?  We need to guess the intent to decide
+         which event to deliver to the application.
+ + This looks like a reasonable logic: since Win- modifier doesn't affect + the output string, the user was pressing Win for SOME OTHER purpose. + So the user wanted to generate Win-SOMETHING event. Now, what is + something? If one takes the mantra that "character payload is more + important than the combination of keypresses which resulted in this + payload", then one should ignore lCtrl+rCtrl+rAlt, ignore VK_A, and + assume that the user wanted to generate Win-f. + + Unfortunately, without non-destructive ToUnicode(), checking (B),(C) + is out of question. So we use heuristics (hopefully, covering + 99.9999% of cases). + */ + + /* Another thing to watch for is a possibility to use AltGr-* and + Ctrl-Alt-* with different semantic. + + Background: the layout defining the KLLF_ALTGR bit are treated + specially by the kernel: when VK_RMENU (=rightAlt, =AltGr) is pressed + (released), a press (release) of VK_LCONTROL is emulated (unless Ctrl + is already down). As a result, any press/release of AltGr is seen + by applications as a press/release of lCtrl AND rAlt. This is + applicable, in particular, to ToUnicode[Ex](). (Keyrepeat is covered + the same way!) + + NOTE: it IS possible to see bare rAlt even with KLLF_ALTGR; but this + requires a good finger coordination: doing (physically) + Down-lCtrl Down-rAlt Up-lCtrl Down-a + (doing quick enough, so that key repeat of rAlt [which would + generate new "fake" Down-lCtrl events] does not happens before 'a' + is down) results in no "fake" events, so the application will see + only rAlt down when 'a' is pressed. (However, fake Up-lCtrl WILL + be generated when rAlt goes UP.) + + In fact, note also that KLLF_ALTGR does not prohibit construction of + rCtrl-rAlt (just press them in this order!). + + Moreover: "traditional" layouts do not define distinct modifier-masks + for VK_LMENU and VK_RMENU (same for VK_L/RCONTROL). Instead, they + rely on the KLLF_ALTGR bit to make the behaviour of VK_LMENU and + VK_RMENU distinct. 
As a corollary, for such layouts, the produced + character is the same for AltGr-* (=rAlt-*) and Ctrl-Alt-* (in any + combination of handedness). For description of masks, see + + http://search.cpan.org/~ilyaz/UI-KeyboardLayout/lib/UI/KeyboardLayout.pm#Keyboard_input_on_Windows,_Part_I:_what_is_the_kernel_doing? + + By default, Emacs was using these coincidences via the following + heuristics: it was treating: + (*) keypresses with lCtrl-rAlt modifiers as if they are carrying + ONLY the character payload (no matter what the actual keyboard + was defining: if lCtrl-lAlt-b was delivering U+05df=beta, then + Emacs saw [beta]; if lCtrl-lAlt-b was undefined in the layout, + the keypress was completely ignored), and + (*) keypresses with the other combinations of handedness of Ctrl-Alt + modifiers (e.g., lCtrl-lAlt) as if they NEVER carry a character + payload (so they were reported "raw": if lCtrl-lAlt-b was + delivering beta, then Emacs saw event [C-A-b], and not [beta]). + This worked good for "traditional" layouts: users could type both + AltGr-x and Ctrl-Alt-x, and one was a character, another a bindable + event. + + However, for layouts which deliver different characters for AltGr-x + and lCtrl-lAlt-x, this scheme makes the latter character unaccessible + in Emacs. While it is easy to access functionality of [C-M-x] in + Emacs by other means (for example, by the `controlify' prefix, or + using lCtrl-rCtrl-x, or rCtrl-rAlt-x [in this order]), missing + characters cannot be reconstructed without a tedious manual work. */ + + /* These two cases are often going to be distinguishable, since at most + one of these character is defined with KBDCTRL | KBDMENU modifier + bitmap. (This heuristic breaks if both lCtrl-lAlt- AND lCtrl-rAlt- + are translated to modifier bitmaps distinct from KBDCTRL | KBDMENU, + or in the cases when lCtrl-lAlt-* and lCtrl-rAlt-* are generally + different, but lCtrl-lAlt-x and lCtrl-rAlt-x happen to deliver the + same character.) 
+ + So we have 2 chunks of info: + (A) is it lCtrl-rAlt-, or lCtrl-lAlt, or some other combination? + (B) is the delivered character defined with KBDCTRL | KBDMENU bits? + Basing on (A) and (B), we should decide whether to ignore the + delivered character. (Before, Emacs was completely ignoring (B), and + was treating the 3-state of (A) as a bit.) This means that we have 6 + bits of customization. + + Additionally, a presence of two Ctrl down may be AltGr-rCtrl-.*/ + + /* Strip all non-Shift modifiers if: + - more than one UTF-16 code point delivered (can't call VkKeyScanW ()) + - or the character is a result of combining with a prefix key. */ + if (!after_dead && count == 1 && *b < 0x10000) + { + if (console_modifiers & (RIGHT_ALT_PRESSED | LEFT_ALT_PRESSED) + && console_modifiers & (RIGHT_CTRL_PRESSED | LEFT_CTRL_PRESSED)) + { + type_CtrlAlt = "bB"; /* generic bindable Ctrl-Alt- modifiers */ + if (console_modifiers & (LEFT_CTRL_PRESSED | RIGHT_CTRL_PRESSED) + == (LEFT_CTRL_PRESSED | RIGHT_CTRL_PRESSED)) + /* double-Ctrl: + e.g. AltGr-rCtrl on some layouts (in this order!) 
*/ + type_CtrlAlt = "dD"; + else if (console_modifiers + & (LEFT_CTRL_PRESSED | LEFT_ALT_PRESSED) + == (LEFT_CTRL_PRESSED | LEFT_ALT_PRESSED)) + type_CtrlAlt = "lL"; /* Ctrl-Alt- modifiers on the left */ + else if (!NILP (Vw32_recognize_altgr) + && (console_modifiers + & (RIGHT_ALT_PRESSED | LEFT_CTRL_PRESSED)) + == (RIGHT_ALT_PRESSED | LEFT_CTRL_PRESSED)) + type_CtrlAlt = "gG"; /* modifiers as in AltGr */ + } + else if (wmsg.dwModifiers & (alt_modifier | meta_modifier) + || (console_modifiers + & (RIGHT_WIN_PRESSED | RIGHT_WIN_PRESSED + | APPS_PRESSED | SCROLLLOCK_ON))) + { + /* pure Alt (or combination of Alt, Win, APPS, scrolllock */ + type_CtrlAlt = "aA"; + } + if (type_CtrlAlt) + { + /* Out of bound bitmap: */ + SHORT r = VkKeyScanW( *b ), bitmap = 0x1FF; + + FPRINTF_WM_CHARS((stderr, "VkKeyScanW %#06x %#04x\n", (int)r, + wParam)); + if ((r & 0xFF) == wParam) + bitmap = r>>8; /* *b is reachable via simple interface */ + if (*type_CtrlAlt == 'a') /* Simple Alt seen */ + { + if ((bitmap & ~1) == 0) /* 1: KBDSHIFT */ + { + /* In "traditional" layouts, Alt without Ctrl does not + change the delivered character. This detects this + situation; it is safe to report this as Alt-something + - as opposed to delivering the reported character + without modifiers. */ + if (legacy_alt_meta + && *b > 0x7f && ('A' <= wParam && wParam <= 'Z')) + /* For backward-compatibility with older Emacsen, let + this be processed by another branch below (which + would convert it to Alt-Latin char via wParam). */ + return 0; + } + else + { + hairy = 1; + } + } + /* Check whether the delivered character(s) is accessible via + KBDCTRL | KBDALT ( | KBDSHIFT ) modifier mask (which is 7). */ + else if ((bitmap & ~1) != 6) + { + /* The character is not accessible via plain Ctrl-Alt(-Shift) + (which is, probably, same as AltGr) modifiers. 
+ Either it was after a prefix key, or is combined with + modifier keys which we don't see, or there is an asymmetry + between left-hand and right-hand modifiers, or other hairy + stuff. */ + hairy = 1; + } + /* The best solution is to delegate these tough (but rarely + needed) choices to the user. Temporarily (???), it is + implemented as C macros. + + Essentially, there are 3 things to do: return 0 (handle to the + legacy processing code [ignoring the character payload]; keep + some modifiers (so that they will be processed by the binding + system [on top of the character payload]; strip modifiers [so + that `self-insert' is going to be triggered with the character + payload]). + + The default below should cover 99.9999% of cases: + (a) strip Alt- in the hairy case only; + (stripping = not ignoring) + (l) for lAlt-lCtrl, ignore the char in simple cases only; + (g) for what looks like AltGr, ignore the modifiers; + (d) for what looks like lCtrl-rCtrl-Alt (probably + AltGr-rCtrl), ignore the character in simple cases only; + (b) for other cases of Ctrl-Alt, ignore the character in + simple cases only. + + Essentially, in all hairy cases, and in looks-like-AltGr case, + we keep the character, ignoring the modifiers. In all the + other cases, we ignore the delivered character. 
+ */ +#define S_TYPES_TO_IGNORE_CHARACTER_PAYLOAD "aldb" +#define S_TYPES_TO_REPORT_CHARACTER_PAYLOAD_WITH_MODIFIERS "" + if (strchr(S_TYPES_TO_IGNORE_CHARACTER_PAYLOAD, + type_CtrlAlt[hairy])) + return 0; + /* if in neither list, report all the modifiers we see COMBINED + WITH the reported character */ + if (strchr(S_TYPES_TO_REPORT_CHARACTER_PAYLOAD_WITH_MODIFIERS, + type_CtrlAlt[hairy])) + strip_ExtraMods = 0; + } + } + if (strip_ExtraMods) + wmsg.dwModifiers = wmsg.dwModifiers & shift_modifier; + + signal_user_input (); + while (count--) + { + FPRINTF_WM_CHARS((stderr, "unichar %#06x\n", *b)); + my_post_msg (&wmsg, hwnd, WM_UNICHAR, *b++, lParam); + } + if (!ctrl_cnt) /* Process ALSO as ctrl */ + return 1; + else + FPRINTF_WM_CHARS((stderr, "extra ctrl char\n")); + return -1; + } + else if (is_dead >= 0) + { + FPRINTF_WM_CHARS((stderr, "dead %#06x\n", is_dead)); + after_deadkey = is_dead; + return 1; + } + return 0; +} + /* Main window procedure */ static LRESULT CALLBACK @@ -2948,6 +3355,15 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA /* Inform lisp thread of keyboard layout changes. */ my_post_msg (&wmsg, hwnd, msg, wParam, lParam); + /* The state of the finite automaton is separate per every input + language environment (so it does not change when one switches + to a different window with the same environment). Moreover, + the experiments show that the state is not remembered when + one switches back to the pre-previous environment. */ + after_deadkey = -1; + + /* XXXX??? What follows is a COMPLETE misunderstanding of Windows! */ + /* Clear dead keys in the keyboard state; for simplicity only preserve modifier key states. */ { @@ -3007,7 +3423,6 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA /* Synchronize modifiers with current keystroke. 
*/ sync_modifiers (); record_keydown (wParam, lParam); - wParam = map_keypad_keys (wParam, (lParam & 0x1000000L) != 0); windows_translate = 0; @@ -3117,6 +3532,46 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA wParam = VK_NUMLOCK; break; default: + if (w32_unicode_gui) { + /* If this event generates characters or deadkeys, do not interpret + it as a "raw combination of modifiers and keysym". Hide + deadkeys, and use the generated character(s) instead of the + keysym. (Backward compatibility: exceptions for numpad keys + generating 0-9 . , / * - +, and for extra-Alt combined with a + non-Latin char.) + + Try to not report modifiers which have effect on which + character or deadkey is generated. + + Example (contrived): if rightAlt-? generates f (on a Cyrillic + keyboard layout), and Ctrl, leftAlt do not affect the generated + character, one wants to report Ctrl-leftAlt-f if the user + presses Ctrl-leftAlt-rightAlt-?. */ + int res; +#if 0 + /* Some of WM_CHAR may be fed to us directly, some are results of + TranslateMessage(). Using 0 as the first argument (in a + separate call) might help us distinguish these two cases. + + However, the keypress feeders would most probably expect the + "standard" message pump, when TranslateMessage() is called on + EVERY KeyDown/Keyup event. So they may feed us Down-Ctrl + Down-FAKE Char-o and expect us to recognize it as Ctrl-o. + Using 0 as the first argument would interfere with this. */ + deliver_wm_chars (0, hwnd, msg, wParam, lParam, 1); +#endif + /* Processing the generated WM_CHAR messages *WHILE* we handle + KEYDOWN/UP event is the best choice, since withoug any fuss, + we know all 3 of: scancode, virtual keycode, and expansion. + (Additionally, one knows boundaries of expansion of different + keypresses.) 
*/ + res = deliver_wm_chars (1, hwnd, msg, wParam, lParam, 1); + windows_translate = -( res != 0 ); + if (res > 0) /* Bound to character(s) or a deadkey */ + break; + /* deliver_wm_chars() may make some branches after this vestigal */ + } + wParam = map_keypad_keys (wParam, (lParam & 0x1000000L) != 0); /* If not defined as a function key, change it to a WM_CHAR message. */ if (wParam > 255 || !lispy_function_keys[wParam]) { @@ -3184,6 +3639,8 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA } } + if (windows_translate == -1) + break; translate: if (windows_translate) { --Q68bSM7Ycu6FN28Q Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="w32fns.c-diff-v2-relative" --- w32fns.c-sent2 2015-07-01 02:56:30.787672000 -0700 +++ w32fns.c 2015-07-08 16:32:11.187197700 -0700 @@ -2932,6 +2932,15 @@ get_wm_chars (HWND aWnd, int *buf, int b # define FPRINTF_WM_CHARS(ARG) 0 #endif +/* This is a heuristic only. This is supposed to track the state of the + finite automaton in the language environment of Windows. + + However, separate windows (if with the same different language + environments!) should have different values. Moreover, switching to a + non-Emacs window with the same language environment, and using (dead)keys + there would change the value stored in the kernel, but not this value. */ +static int after_deadkey = 0; + int deliver_wm_chars (int do_translate, HWND hwnd, UINT msg, UINT wParam, UINT lParam, int legacy_alt_meta) @@ -2940,7 +2949,7 @@ deliver_wm_chars (int do_translate, HWND points to a keypress. (However, the "old style" TranslateMessage() would deliver at most 16 of them.) Be on a safe side, and prepare to treat many more. 
*/ - int ctrl_cnt, buf[1024], count, is_dead; + int ctrl_cnt, buf[1024], count, is_dead, after_dead = (after_deadkey != -1); /* Since the keypress processing logic of Windows has a lot of state, it is important to call TranslateMessage() for every keyup/keydown, AND @@ -2956,7 +2965,8 @@ deliver_wm_chars (int do_translate, HWND So, with the usual message pump, the following call to TranslateMessage() is not needed (and is going to be VERY harmful). With Emacs' message pump, the call is needed. */ - if (do_translate) { + if (do_translate) + { MSG windows_msg = { hwnd, msg, wParam, lParam, 0, {0,0} }; windows_msg.time = GetMessageTime (); @@ -2970,13 +2980,22 @@ deliver_wm_chars (int do_translate, HWND || modifier_set (VK_CONTROL), &ctrl_cnt, &is_dead, wParam, (lParam & 0x1000000L) != 0); - if (count) { + if (count) + { W32Msg wmsg; - int *b = buf, strip_Alt = 1; - - /* wParam is checked when converting CapsLock to Shift */ - wmsg.dwModifiers = do_translate - ? w32_get_key_modifiers (wParam, lParam) : 0; + DWORD console_modifiers = construct_console_modifiers (); + int *b = buf, strip_Alt = 1, strip_ExtraMods = 1, hairy = 0; + char *type_CtrlAlt = NULL; + + /* XXXX In fact, there may be another case when we need to do the same: + What happens if the string defined in the LIGATURES has length + 0? Probably, we will get count==0, but the state of the finite + automaton would reset to 0??? */ + after_deadkey = -1; + + /* wParam is checked when converting CapsLock to Shift; this is a clone + of w32_get_key_modifiers (). 
*/ + wmsg.dwModifiers = w32_kbd_mods_to_emacs (console_modifiers, wParam); /* What follows is just heuristics; the correct treatement requires non-destructive ToUnicode(): @@ -2991,8 +3010,8 @@ deliver_wm_chars (int do_translate, HWND Example: assume that we know: (A) lCtrl+rCtrl+rAlt modifiers with VK_A key produce a Latin "f" - ("may be logical" with a JCUKEN-flavored Russian keyboard flavor); - (B) removing any one of lCtrl, rCtrl, rAlt changes the produced char; + ("may be logical" in JCUKEN-flavored Russian keyboard flavors); + (B) removing any of lCtrl, rCtrl, rAlt changes the produced char; (C) Win-modifier is not affecting the produced character (this is the common case: happens with all "standard" layouts). @@ -3000,7 +3019,7 @@ deliver_wm_chars (int do_translate, HWND What is the intent of the user? We need to guess the intent to decide which event to deliver to the application. - This looks like a reasonable logic: wince Win- modifier does not affect + This looks like a reasonable logic: since Win- modifier doesn't affect the output string, the user was pressing Win for SOME OTHER purpose. So the user wanted to generate Win-SOMETHING event. Now, what is something? If one takes the mantra that "character payload is more @@ -3008,38 +3027,196 @@ deliver_wm_chars (int do_translate, HWND payload", then one should ignore lCtrl+rCtrl+rAlt, ignore VK_A, and assume that the user wanted to generate Win-f. - Unfortunately, without non-destructive ToUnicode(), checking (B) and (C) - is out of question. So we use heuristics (hopefully, covering 99.9999% - of cases). + Unfortunately, without non-destructive ToUnicode(), checking (B),(C) + is out of question. So we use heuristics (hopefully, covering + 99.9999% of cases). */ - /* If ctrl-something delivers chars, ctrl and the rest should be hidden; - so the consumer of key-event won't interpret it as an accelerator. 
*/ - if (wmsg.dwModifiers & ctrl_modifier) - wmsg.dwModifiers = wmsg.dwModifiers & shift_modifier; - /* In many keyboard layouts, (left) Alt is not changing the character. - Unless we are in this situation, strip Alt/Meta. */ - if (wmsg.dwModifiers & (alt_modifier | meta_modifier) - /* If alt-something delivers non-ASCIIchars, alt should be hidden */ - && count == 1 && *b < 0x10000) - { - SHORT r = VkKeyScanW( *b ); + /* Another thing to watch for is a possibility to use AltGr-* and + Ctrl-Alt-* with different semantic. - FPRINTF_WM_CHARS((stderr, "VkKeyScanW %#06x %#04x\n", (int)r, wParam)); - if ((r & 0xFF) == wParam && !(r & ~0x1FF)) - { - /* Char available without Alt modifier, so Alt is "on top" */ + Background: the layout defining the KLLF_ALTGR bit are treated + specially by the kernel: when VK_RMENU (=rightAlt, =AltGr) is pressed + (released), a press (release) of VK_LCONTROL is emulated (unless Ctrl + is already down). As a result, any press/release of AltGr is seen + by applications as a press/release of lCtrl AND rAlt. This is + applicable, in particular, to ToUnicode[Ex](). (Keyrepeat is covered + the same way!) + + NOTE: it IS possible to see bare rAlt even with KLLF_ALTGR; but this + requires a good finger coordination: doing (physically) + Down-lCtrl Down-rAlt Up-lCtrl Down-a + (doing quick enough, so that key repeat of rAlt [which would + generate new "fake" Down-lCtrl events] does not happens before 'a' + is down) results in no "fake" events, so the application will see + only rAlt down when 'a' is pressed. (However, fake Up-lCtrl WILL + be generated when rAlt goes UP.) + + In fact, note also that KLLF_ALTGR does not prohibit construction of + rCtrl-rAlt (just press them in this order!). + + Moreover: "traditional" layouts do not define distinct modifier-masks + for VK_LMENU and VK_RMENU (same for VK_L/RCONTROL). Instead, they + rely on the KLLF_ALTGR bit to make the behaviour of VK_LMENU and + VK_RMENU distinct. 
As a corollary, for such layouts, the produced + character is the same for AltGr-* (=rAlt-*) and Ctrl-Alt-* (in any + combination of handedness). For description of masks, see + + http://search.cpan.org/~ilyaz/UI-KeyboardLayout/lib/UI/KeyboardLayout.pm#Keyboard_input_on_Windows,_Part_I:_what_is_the_kernel_doing? + + By default, Emacs was using these coincidences via the following + heuristics: it was treating: + (*) keypresses with lCtrl-rAlt modifiers as if they are carrying + ONLY the character payload (no matter what the actual keyboard + was defining: if lCtrl-lAlt-b was delivering U+05df=beta, then + Emacs saw [beta]; if lCtrl-lAlt-b was undefined in the layout, + the keypress was completely ignored), and + (*) keypresses with the other combinations of handedness of Ctrl-Alt + modifiers (e.g., lCtrl-lAlt) as if they NEVER carry a character + payload (so they were reported "raw": if lCtrl-lAlt-b was + delivering beta, then Emacs saw event [C-A-b], and not [beta]). + This worked good for "traditional" layouts: users could type both + AltGr-x and Ctrl-Alt-x, and one was a character, another a bindable + event. + + However, for layouts which deliver different characters for AltGr-x + and lCtrl-lAlt-x, this scheme makes the latter character unaccessible + in Emacs. While it is easy to access functionality of [C-M-x] in + Emacs by other means (for example, by the `controlify' prefix, or + using lCtrl-rCtrl-x, or rCtrl-rAlt-x [in this order]), missing + characters cannot be reconstructed without a tedious manual work. */ + + /* These two cases are often going to be distinguishable, since at most + one of these character is defined with KBDCTRL | KBDMENU modifier + bitmap. (This heuristic breaks if both lCtrl-lAlt- AND lCtrl-rAlt- + are translated to modifier bitmaps distinct from KBDCTRL | KBDMENU, + or in the cases when lCtrl-lAlt-* and lCtrl-rAlt-* are generally + different, but lCtrl-lAlt-x and lCtrl-rAlt-x happen to deliver the + same character.) 
+ + So we have 2 chunks of info: + (A) is it lCtrl-rAlt-, or lCtrl-lAlt, or some other combination? + (B) is the delivered character defined with KBDCTRL | KBDMENU bits? + Basing on (A) and (B), we should decide whether to ignore the + delivered character. (Before, Emacs was completely ignoring (B), and + was treating the 3-state of (A) as a bit.) This means that we have 6 + bits of customization. + + Additionally, a presence of two Ctrl down may be AltGr-rCtrl-.*/ + + /* Strip all non-Shift modifiers if: + - more than one UTF-16 code point delivered (can't call VkKeyScanW ()) + - or the character is a result of combining with a prefix key. */ + if (!after_dead && count == 1 && *b < 0x10000) + { + if (console_modifiers & (RIGHT_ALT_PRESSED | LEFT_ALT_PRESSED) + && console_modifiers & (RIGHT_CTRL_PRESSED | LEFT_CTRL_PRESSED)) + { + type_CtrlAlt = "bB"; /* generic bindable Ctrl-Alt- modifiers */ + if (console_modifiers & (LEFT_CTRL_PRESSED | RIGHT_CTRL_PRESSED) + == (LEFT_CTRL_PRESSED | RIGHT_CTRL_PRESSED)) + /* double-Ctrl: + e.g. AltGr-rCtrl on some layouts (in this order!) 
*/ + type_CtrlAlt = "dD"; + else if (console_modifiers + & (LEFT_CTRL_PRESSED | LEFT_ALT_PRESSED) + == (LEFT_CTRL_PRESSED | LEFT_ALT_PRESSED)) + type_CtrlAlt = "lL"; /* Ctrl-Alt- modifiers on the left */ + else if (!NILP (Vw32_recognize_altgr) + && (console_modifiers + & (RIGHT_ALT_PRESSED | LEFT_CTRL_PRESSED)) + == (RIGHT_ALT_PRESSED | LEFT_CTRL_PRESSED)) + type_CtrlAlt = "gG"; /* modifiers as in AltGr */ + } + else if (wmsg.dwModifiers & (alt_modifier | meta_modifier) + || (console_modifiers + & (RIGHT_WIN_PRESSED | RIGHT_WIN_PRESSED + | APPS_PRESSED | SCROLLLOCK_ON))) + { + /* pure Alt (or combination of Alt, Win, APPS, scrolllock */ + type_CtrlAlt = "aA"; + } + if (type_CtrlAlt) + { + /* Out of bound bitmap: */ + SHORT r = VkKeyScanW( *b ), bitmap = 0x1FF; + + FPRINTF_WM_CHARS((stderr, "VkKeyScanW %#06x %#04x\n", (int)r, + wParam)); + if ((r & 0xFF) == wParam) + bitmap = r>>8; /* *b is reachable via simple interface */ + if (*type_CtrlAlt == 'a') /* Simple Alt seen */ + { + if ((bitmap & ~1) == 0) /* 1: KBDSHIFT */ + { + /* In "traditional" layouts, Alt without Ctrl does not + change the delivered character. This detects this + situation; it is safe to report this as Alt-something + - as opposed to delivering the reported character + without modifiers. */ if (legacy_alt_meta && *b > 0x7f && ('A' <= wParam && wParam <= 'Z')) /* For backward-compatibility with older Emacsen, let - this be processed by another branch below (which would convert - it to Alt-Latin char via wParam). */ + this be processed by another branch below (which + would convert it to Alt-Latin char via wParam). */ + return 0; + } + else + { + hairy = 1; + } + } + /* Check whether the delivered character(s) is accessible via + KBDCTRL | KBDALT ( | KBDSHIFT ) modifier mask (which is 7). */ + else if ((bitmap & ~1) != 6) + { + /* The character is not accessible via plain Ctrl-Alt(-Shift) + (which is, probably, same as AltGr) modifiers. 
+ Either it was after a prefix key, or is combined with + modifier keys which we don't see, or there is an asymmetry + between left-hand and right-hand modifiers, or other hairy + stuff. */ + hairy = 1; + } + /* The best solution is to delegate these tough (but rarely + needed) choices to the user. Temporarily (???), it is + implemented as C macros. + + Essentially, there are 3 things to do: return 0 (handle to the + legacy processing code [ignoring the character payload]; keep + some modifiers (so that they will be processed by the binding + system [on top of the character payload]; strip modifiers [so + that `self-insert' is going to be triggered with the character + payload]). + + The default below should cover 99.9999% of cases: + (a) strip Alt- in the hairy case only; + (stripping = not ignoring) + (l) for lAlt-lCtrl, ignore the char in simple cases only; + (g) for what looks like AltGr, ignore the modifiers; + (d) for what looks like lCtrl-rCtrl-Alt (probably + AltGr-rCtrl), ignore the character in simple cases only; + (b) for other cases of Ctrl-Alt, ignore the character in + simple cases only. + + Essentially, in all hairy cases, and in looks-like-AltGr case, + we keep the character, ignoring the modifiers. In all the + other cases, we ignore the delivered character. 
+ */ +#define S_TYPES_TO_IGNORE_CHARACTER_PAYLOAD "aldb" +#define S_TYPES_TO_REPORT_CHARACTER_PAYLOAD_WITH_MODIFIERS "" + if (strchr(S_TYPES_TO_IGNORE_CHARACTER_PAYLOAD, + type_CtrlAlt[hairy])) return 0; - strip_Alt = 0; + /* if in neither list, report all the modifiers we see COMBINED + WITH the reported character */ + if (strchr(S_TYPES_TO_REPORT_CHARACTER_PAYLOAD_WITH_MODIFIERS, + type_CtrlAlt[hairy])) + strip_ExtraMods = 0; } } - if (strip_Alt) - wmsg.dwModifiers = wmsg.dwModifiers & ~(alt_modifier | meta_modifier); + if (strip_ExtraMods) + wmsg.dwModifiers = wmsg.dwModifiers & shift_modifier; signal_user_input (); while (count--) @@ -3052,8 +3229,11 @@ deliver_wm_chars (int do_translate, HWND else FPRINTF_WM_CHARS((stderr, "extra ctrl char\n")); return -1; - } else if (is_dead >= 0) { + } + else if (is_dead >= 0) + { FPRINTF_WM_CHARS((stderr, "dead %#06x\n", is_dead)); + after_deadkey = is_dead; return 1; } return 0; @@ -3175,6 +3355,15 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA /* Inform lisp thread of keyboard layout changes. */ my_post_msg (&wmsg, hwnd, msg, wParam, lParam); + /* The state of the finite automaton is separate per every input + language environment (so it does not change when one switches + to a different window with the same environment). Moreover, + the experiments show that the state is not remembered when + one switches back to the pre-previous environment. */ + after_deadkey = -1; + + /* XXXX??? What follows is a COMPLETE misunderstanding of Windows! */ + /* Clear dead keys in the keyboard state; for simplicity only preserve modifier key states. 
*/ { --Q68bSM7Ycu6FN28Q-- From unknown Thu Aug 14 18:37:43 2025 X-Loop: help-debbugs@gnu.org Subject: bug#19994: 25.0.50; Unicode keyboard input on Windows Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Fri, 31 Jul 2015 09:24:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 19994 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: Ilya Zakharevich Cc: 19994@debbugs.gnu.org Reply-To: Eli Zaretskii Received: via spool by 19994-submit@debbugs.gnu.org id=B19994.143833459217717 (code B ref 19994); Fri, 31 Jul 2015 09:24:02 +0000 Received: (at 19994) by debbugs.gnu.org; 31 Jul 2015 09:23:12 +0000 Received: from localhost ([127.0.0.1]:35374 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZL6X1-0004bg-Br for submit@debbugs.gnu.org; Fri, 31 Jul 2015 05:23:11 -0400 Received: from mtaout22.012.net.il ([80.179.55.172]:38624) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZL6Wx-0004bV-5h for 19994@debbugs.gnu.org; Fri, 31 Jul 2015 05:23:08 -0400 Received: from conversion-daemon.a-mtaout22.012.net.il by a-mtaout22.012.net.il (HyperSendmail v2007.08) id <0NSC00F00HVS2H00@a-mtaout22.012.net.il> for 19994@debbugs.gnu.org; Fri, 31 Jul 2015 12:23:05 +0300 (IDT) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout22.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0NSC00FN1I2F2M00@a-mtaout22.012.net.il>; Fri, 31 Jul 2015 12:23:05 +0300 (IDT) Date: Fri, 31 Jul 2015 12:23:00 +0300 From: Eli Zaretskii In-reply-to: <20150709000259.GA7163@math.berkeley.edu> X-012-Sender: halo1@inter.net.il Message-id: <83d1z8wuiz.fsf@gnu.org> MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-transfer-encoding: 8BIT References: <20150303230949.GA29784@math.berkeley.edu> <83bnk8prqa.fsf@gnu.org> <20150701100712.GA24175@math.berkeley.edu> <20150709000259.GA7163@math.berkeley.edu> X-Spam-Score: 1.0 (+) X-BeenThere: 
debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 1.0 (+)

> Date: Wed, 8 Jul 2015 17:02:59 -0700
> From: Ilya Zakharevich
> Cc: 19994@debbugs.gnu.org
>
> On Wed, Jul 01, 2015 at 03:07:12AM -0700, Ilya Zakharevich wrote:
> > On Wed, Mar 04, 2015 at 08:01:01PM +0200, Eli Zaretskii wrote:
>
> > > I suggest, indeed, to clean up the code so we could commit it to the
> > > master branch. That way, it will get wider testing, and we can fix
>
> > I had no time to work on the code itself, but
> >   • I fixed the formatting,
> >   • I pumped up the docs,
> >   • I put in the suggested eassert().
>
> The variant I sent was too primitive -- it was not covering a (common?)
> usage case when (with AltGr-layouts) leftCtrl+rightCtrl was behaving
> differently than pressing AltGr:
>   • leftCtrl+rightCtrl would trigger C-M-key;
>   • altGr would enter the character payload.
>
> This update
>
>   (0) fixes two formatting-style omissions;
>
>   (A) adds A LOAD of new comments;
>   (B) treats such important cases (as above) separately;
>
>   (z) Marks a piece of old code which does not make any sense.
>       (see the last chunk in the relative patch)
>
> Notes:
>
>   • In (B), there are some decisions to make. I encapsulate these
>     decisions into two strings. For best result, these strings should
>     be user-customizable. However, currently they are just put into
>     C #defines.
>
>     When I sit on this more, and if these customizations turn out to
>     be useful, one can make them into Lisp variables.
>
>   • There is a bug in the (old) Emacs code which prevents some cases
>     treated in (B) from being really useful. I did not fix it yet.
>
>     To see the bug:
>       ∘ switch to layout with AltGr;
>       ∘ assume that AltGr-s produces ß (as with US International);
>       ∘ pressing AltGr-rightControl-s produces Meta-ß;
>       ∘ pressing rightControl-AltGr-s produces C-M-s.
>     (I do not think this effect is intentional.)
>
>   • And, BTW, is it documented anywhere that
>     leftControl-rightControl-key produces C-M-key?
>
> I include two patches:
>   □ absolute (ignore the previous patches)
>   □ relative (with whitespace ignored) -- for reading.

Thanks. I committed this in your name, with a few minor stylistic
changes, and also fixed a few typos in the comments. Sorry for a long
delay in doing that.

I also added a new variable, w32-use-fallback-wm-chars-method, which,
when non-nil, makes Emacs use the old code from before your changes.
This is meant to be a handy debugging aid, in case we discover some
issues with the new code.

Do you think there are any user-visible effects of your changes that
are worthy of mentioning in NEWS? If so, please propose the text for
NEWS.

I leave it up to you to decide whether this bug should be closed, or
if there's something else to be done about it.

Thanks again for working on this.
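
For readers following the thread, the two decision strings mentioned
in the quoted notes can be pictured with a small standalone sketch.
This is not the committed Emacs code: the SK_* bits, classify_ctrl_alt
and decide are hypothetical names introduced here for illustration,
and the w32-recognize-altgr check and the pure-Alt ('a') case of the
patch are left out. Only the two #define strings and the
type[hairy] lookup are taken from the patch. The mask comparisons are
parenthesized explicitly, because in C `==' binds tighter than `&', so
`mods & mask == mask' would parse as `mods & (mask == mask)'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the console modifier bits; the names and
   values are assumptions made for this sketch only.  */
#define SK_RIGHT_ALT  0x01
#define SK_LEFT_ALT   0x02
#define SK_RIGHT_CTRL 0x04
#define SK_LEFT_CTRL  0x08

/* The two decision strings, copied from the patch.  */
#define S_TYPES_TO_IGNORE_CHARACTER_PAYLOAD "aldb"
#define S_TYPES_TO_REPORT_CHARACTER_PAYLOAD_WITH_MODIFIERS ""

enum sk_action { SK_LEGACY, SK_CHAR_WITH_MODS, SK_CHAR_ONLY };

/* Return a two-letter tag (index 0: simple case, index 1: hairy case),
   or NULL when Ctrl and Alt are not both down.  */
static const char *
classify_ctrl_alt (unsigned mods)
{
  if (!(mods & (SK_RIGHT_ALT | SK_LEFT_ALT))
      || !(mods & (SK_RIGHT_CTRL | SK_LEFT_CTRL)))
    return NULL;
  /* Note the parentheses around each `&' expression.  */
  if ((mods & (SK_LEFT_CTRL | SK_RIGHT_CTRL))
      == (SK_LEFT_CTRL | SK_RIGHT_CTRL))
    return "dD";		/* double Ctrl */
  if ((mods & (SK_LEFT_CTRL | SK_LEFT_ALT))
      == (SK_LEFT_CTRL | SK_LEFT_ALT))
    return "lL";		/* Ctrl-Alt on the left */
  if ((mods & (SK_RIGHT_ALT | SK_LEFT_CTRL))
      == (SK_RIGHT_ALT | SK_LEFT_CTRL))
    return "gG";		/* modifiers as in AltGr */
  return "bB";			/* any other Ctrl-Alt combination */
}

/* Look the tag up in the two strings, as the patch does.  */
static enum sk_action
decide (const char *type, int hairy)
{
  char tag;

  assert (type != NULL && (hairy == 0 || hairy == 1));
  tag = type[hairy];
  if (strchr (S_TYPES_TO_IGNORE_CHARACTER_PAYLOAD, tag))
    return SK_LEGACY;		/* ignore the character payload */
  if (strchr (S_TYPES_TO_REPORT_CHARACTER_PAYLOAD_WITH_MODIFIERS, tag))
    return SK_CHAR_WITH_MODS;	/* character plus modifiers */
  return SK_CHAR_ONLY;		/* strip modifiers, keep the character */
}
```

With these defaults, an AltGr-like combination in the simple case maps
to SK_CHAR_ONLY, left Ctrl with left Alt in the simple case maps to
SK_LEGACY, and every hairy (upper-case) tag maps to SK_CHAR_ONLY, which
matches the summary in the patch comment.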
From unknown Thu Aug 14 18:37:43 2025 X-Loop: help-debbugs@gnu.org Subject: bug#19994: 25.0.50; Unicode keyboard input on Windows Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sat, 01 Aug 2015 07:41:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 19994 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: ilya@math.berkeley.edu Cc: 19994@debbugs.gnu.org Reply-To: Eli Zaretskii Received: via spool by 19994-submit@debbugs.gnu.org id=B19994.143841481331966 (code B ref 19994); Sat, 01 Aug 2015 07:41:02 +0000 Received: (at 19994) by debbugs.gnu.org; 1 Aug 2015 07:40:13 +0000 Received: from localhost ([127.0.0.1]:36237 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZLROu-0008JU-N8 for submit@debbugs.gnu.org; Sat, 01 Aug 2015 03:40:13 -0400 Received: from mtaout23.012.net.il ([80.179.55.175]:38111) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZLROr-0008JK-2v for 19994@debbugs.gnu.org; Sat, 01 Aug 2015 03:40:10 -0400 Received: from conversion-daemon.a-mtaout23.012.net.il by a-mtaout23.012.net.il (HyperSendmail v2007.08) id <0NSE00E007QSL200@a-mtaout23.012.net.il> for 19994@debbugs.gnu.org; Sat, 01 Aug 2015 10:40:07 +0300 (IDT) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout23.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0NSE00EMG7YUE490@a-mtaout23.012.net.il>; Sat, 01 Aug 2015 10:40:07 +0300 (IDT) Date: Sat, 01 Aug 2015 10:40:05 +0300 From: Eli Zaretskii In-reply-to: <83d1z8wuiz.fsf@gnu.org> X-012-Sender: halo1@inter.net.il Message-id: <83y4hvv4mi.fsf@gnu.org> References: <20150303230949.GA29784@math.berkeley.edu> <83bnk8prqa.fsf@gnu.org> <20150701100712.GA24175@math.berkeley.edu> <20150709000259.GA7163@math.berkeley.edu> <83d1z8wuiz.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: 
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 1.0 (+)

> Date: Fri, 31 Jul 2015 12:23:00 +0300
> From: Eli Zaretskii
> Cc: 19994@debbugs.gnu.org
>
> Thanks.  I committed this in your name, with a few minor stylistic
> changes, and also fixed a few typos in the comments.  Sorry for a long
> delay in doing that.
>
> I also added a new variable, w32-use-fallback-wm-chars-method, which,
> when non-nil, makes Emacs use the old code from before your changes.
> This is meant to be a handy debugging aid, in case we discover some
> issues with the new code.
>
> Do you think there are any user-visible effects of your changes that
> are worthy of mentioning in NEWS?  If so, please propose the text for
> NEWS.
>
> I leave it up to you to decide whether this bug should be closed, or
> if there's something else to be done about it.

Here's one problem evidently caused by the new code: invoke "emacs -Q"
and type "M-x" after it starts => you will see "x" being inserted into
*scratch*.  This doesn't happen if w32-use-fallback-wm-chars-method is
non-nil.

This is a one-time problem: all the subsequent "M-x" are handled
correctly.  It sounds like some initialization somewhere is missing?

Could you please look into that ASAP?  TIA.
From unknown Thu Aug 14 18:37:43 2025 X-Loop: help-debbugs@gnu.org Subject: bug#19994: 25.0.50; Unicode keyboard input on Windows Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Sun, 02 Aug 2015 14:43:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 19994 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: To: ilya@math.berkeley.edu Cc: 19994@debbugs.gnu.org Reply-To: Eli Zaretskii Received: via spool by 19994-submit@debbugs.gnu.org id=B19994.143852656327118 (code B ref 19994); Sun, 02 Aug 2015 14:43:02 +0000 Received: (at 19994) by debbugs.gnu.org; 2 Aug 2015 14:42:43 +0000 Received: from localhost ([127.0.0.1]:37467 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZLuTK-00073J-IG for submit@debbugs.gnu.org; Sun, 02 Aug 2015 10:42:42 -0400 Received: from mtaout23.012.net.il ([80.179.55.175]:41589) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1ZLuTI-000738-3X for 19994@debbugs.gnu.org; Sun, 02 Aug 2015 10:42:41 -0400 Received: from conversion-daemon.a-mtaout23.012.net.il by a-mtaout23.012.net.il (HyperSendmail v2007.08) id <0NSG00J00M34R400@a-mtaout23.012.net.il> for 19994@debbugs.gnu.org; Sun, 02 Aug 2015 17:42:38 +0300 (IDT) Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout23.012.net.il (HyperSendmail v2007.08) with ESMTPA id <0NSG00JY8M72QM00@a-mtaout23.012.net.il>; Sun, 02 Aug 2015 17:42:38 +0300 (IDT) Date: Sun, 02 Aug 2015 17:42:30 +0300 From: Eli Zaretskii In-reply-to: <83y4hvv4mi.fsf@gnu.org> X-012-Sender: halo1@inter.net.il Message-id: <83k2tdvjjd.fsf@gnu.org> References: <20150303230949.GA29784@math.berkeley.edu> <83bnk8prqa.fsf@gnu.org> <20150701100712.GA24175@math.berkeley.edu> <20150709000259.GA7163@math.berkeley.edu> <83d1z8wuiz.fsf@gnu.org> <83y4hvv4mi.fsf@gnu.org> X-Spam-Score: 1.0 (+) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , 
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: 1.0 (+)

> Date: Sat, 01 Aug 2015 10:40:05 +0300
> From: Eli Zaretskii
> Cc: 19994@debbugs.gnu.org
>
> Here's one problem evidently caused by the new code: invoke "emacs -Q"
> and type "M-x" after it starts => you will see "x" being inserted into
> *scratch*.  This doesn't happen if w32-use-fallback-wm-chars-method is
> non-nil.
>
> This is a one-time problem: all the subsequent "M-x" are handled
> correctly.  It sounds like some initialization somewhere is missing?

I've found that the simple change below fixes this problem.  I
committed it; if you feel it's not the right fix, please propose an
alternative.

Thanks.

commit 0afb8fab99951262e81d6095302de4c84d7e8847
Author: Eli Zaretskii
Date:   Sun Aug 2 17:40:19 2015 +0300

    Fix handling of 1st keystroke on MS-Windows

    * src/w32fns.c (globals_of_w32fns): Initialize after_deadkey to -1.
    This is needed to correctly handle the session's first keystroke,
    if it has any modifiers.  (Bug#19994)

diff --git a/src/w32fns.c b/src/w32fns.c
index 1c72974..31d23c4 100644
--- a/src/w32fns.c
+++ b/src/w32fns.c
@@ -9442,6 +9442,8 @@ typedef USHORT (WINAPI * CaptureStackBackTrace_proc) (ULONG, ULONG, PVOID *,
   else
     w32_unicode_gui = 0;
 
+  after_deadkey = -1;
+
   /* MessageBox does not work without this when linked to
      comctl32.dll 6.0.  */
   InitCommonControls ();

From unknown Thu Aug 14 18:37:43 2025
MIME-Version: 1.0
X-Mailer: MIME-tools 5.505 (Entity 5.505)
X-Loop: help-debbugs@gnu.org
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: Ilya Zakharevich
Subject: bug#19994: closed (Re: bug#19994: 25.0.50; Unicode keyboard input on Windows)
Message-ID:
References: <20150303230949.GA29784@math.berkeley.edu>
X-Gnu-PR-Message: they-closed 19994
X-Gnu-PR-Package: emacs
Reply-To: 19994@debbugs.gnu.org
Date: Wed, 12 Aug 2020 16:33:04 +0000
Content-Type: multipart/mixed; boundary="----------=_1597249984-23397-1"

This is a multi-part message in MIME format...

------------=_1597249984-23397-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Your bug report

#19994: 25.0.50; Unicode keyboard input on Windows

which was filed against the emacs package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 19994@debbugs.gnu.org.

--=20 19994: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D19994 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1597249984-23397-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 19994-done) by debbugs.gnu.org; 12 Aug 2020 16:32:39 +0000 Received: from localhost ([127.0.0.1]:45306 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k5tg6-000649-FZ for submit@debbugs.gnu.org; Wed, 12 Aug 2020 12:32:38 -0400 Received: from mail-yb1-f169.google.com ([209.85.219.169]:43067) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1k5tg0-00063T-EP for 19994-done@debbugs.gnu.org; Wed, 12 Aug 2020 12:32:33 -0400 Received: by mail-yb1-f169.google.com with SMTP id m200so1674501ybf.10 for <19994-done@debbugs.gnu.org>; Wed, 12 Aug 2020 09:32:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:in-reply-to:references:user-agent :mime-version:date:message-id:subject:to:cc; bh=xsQvDuDqFmaqlPbWosdCZodxrVm9Mo0TLd3EmpKDpTU=; b=SHl3MuvZtw7As2XyqwLPqbRt/fRn+/FHkKcascqzsp0vkJ4h9frOgEYgf7u5JIPB6c dZj/nYWD4xH1quPvGBdTWrpqE5brqlBD32e5brQBd9e8QkCjTjAFba0hDShxF19gESG7 W896S+ijSu15GitkoyWir+YlCW04B4dlSMzEMbwWqLPMqerX18PW1+rZ3/xUu1RJ0lL5 0OEW1cKwYGD3oozF+p6pUFGucF+mBIItV1nxlLP143iR/M10UcvQBlxPRpM8NI4cswAj nk/IZiAHv+UKAFH5nLC3KThzs/T7/dg/qHqCIzLaj1jBNrjwfyFOEAbl/yRe+QqdhjBW NBPA== X-Gm-Message-State: AOAM530MdZBXpLMWHavWh5/wIYS5br8065igFkTV7N1A7jTgkN1gXyKP Y8aoKjLGywo12qenxQD4Y2eYJfZplzAq3NSXaPY= X-Google-Smtp-Source: ABdhPJw19fE2RT/7sAZNVNOVRvY6mFYrzjiExj2TGJtd9VpLy+DS7GRFDyt+C2ku9naSccZ12riawwxDf2DxYAARpPM= X-Received: by 2002:a25:b88b:: with SMTP id w11mr256815ybj.129.1597249947047; Wed, 12 Aug 2020 09:32:27 -0700 (PDT) Received: from 753933720722 named unknown by gmailapi.google.com with HTTPREST; Wed, 12 Aug 2020 09:32:26 -0700 From: Stefan Kangas In-Reply-To: 
 <83k2tdvjjd.fsf@gnu.org> (Eli Zaretskii's message of "Sun, 02 Aug 2015 17:42:30 +0300")
References: <20150303230949.GA29784@math.berkeley.edu>
 <83bnk8prqa.fsf@gnu.org>
 <20150701100712.GA24175@math.berkeley.edu>
 <20150709000259.GA7163@math.berkeley.edu>
 <83d1z8wuiz.fsf@gnu.org>
 <83y4hvv4mi.fsf@gnu.org>
 <83k2tdvjjd.fsf@gnu.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
MIME-Version: 1.0
Date: Wed, 12 Aug 2020 09:32:26 -0700
Message-ID:
Subject: Re: bug#19994: 25.0.50; Unicode keyboard input on Windows
To: Eli Zaretskii
Content-Type: text/plain; charset="UTF-8"
X-Spam-Score: 0.0 (/)
X-Debbugs-Envelope-To: 19994-done
Cc: 19994-done@debbugs.gnu.org, ilya@math.berkeley.edu
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.18
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -1.0 (-)

Eli Zaretskii writes:

>> Date: Sat, 01 Aug 2015 10:40:05 +0300
>> From: Eli Zaretskii
>> Cc: 19994@debbugs.gnu.org
>>
>> Here's one problem evidently caused by the new code: invoke "emacs -Q"
>> and type "M-x" after it starts => you will see "x" being inserted into
>> *scratch*.  This doesn't happen if w32-use-fallback-wm-chars-method is
>> non-nil.
>>
>> This is a one-time problem: all the subsequent "M-x" are handled
>> correctly.  It sounds like some initialization somewhere is missing?
>
> I've found that the simple change below fixes this problem.  I
> committed it; if you feel it's not the right fix, please propose an
> alternative.

It seems like the patch here was installed, an additional fix was
committed, and there has been no further progress within 5 years.
I'm therefore closing this bug report.

Best regards, Stefan Kangas ------------=_1597249984-23397-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at submit) by debbugs.gnu.org; 3 Mar 2015 23:10:14 +0000 Received: from localhost ([127.0.0.1]:34330 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YSvx5-0004o6-H6 for submit@debbugs.gnu.org; Tue, 03 Mar 2015 18:10:13 -0500 Received: from eggs.gnu.org ([208.118.235.92]:54399) by debbugs.gnu.org with esmtp (Exim 4.80) (envelope-from ) id 1YSvx2-0004nf-0f for submit@debbugs.gnu.org; Tue, 03 Mar 2015 18:10:09 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YSvwu-0007d3-D9 for submit@debbugs.gnu.org; Tue, 03 Mar 2015 18:10:02 -0500 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on eggs.gnu.org X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,T_DKIM_INVALID autolearn=disabled version=3.3.2 Received: from lists.gnu.org ([2001:4830:134:3::11]:58599) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YSvwu-0007cs-9M for submit@debbugs.gnu.org; Tue, 03 Mar 2015 18:10:00 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56113) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YSvwr-0005mE-ND for bug-gnu-emacs@gnu.org; Tue, 03 Mar 2015 18:10:00 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YSvwn-0007Wz-Vv for bug-gnu-emacs@gnu.org; Tue, 03 Mar 2015 18:09:57 -0500 Received: from nm22-vm7.bullet.mail.gq1.yahoo.com ([98.136.217.70]:43395) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YSvwn-0007Vx-Ig for bug-gnu-emacs@gnu.org; Tue, 03 Mar 2015 18:09:53 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1425424192; bh=My7jnpCUBEYHRUR4tstqz2azTsNPFSefnssFO3QSvCo=; h=Date:From:To:Subject:From:Subject; 
b=eTb8aTrjDUf8AWgeB3ZwfYIFIWXW7EmpqQved0oKxzOkatc/2u7yljxwWCaeQj0ffDL5M5FlQXh0x4E9KdtPu4mAf8MpmJuURq45I81svRt3MdFR4oucjPh7W+8W/uVkIxRwo3Ga/I6HC8cqvY8VPUII3ozFD72BhQrAru1aNNUBDxRZ7lhW9PsLbLY6bD99vtr14VYLj7huE0WFy3zVYO23j7ImE7A5HPIs6eT8DgTH3ZA6TUCaERTcg3h+oi6iN+wplAa0xCuV6A8jBioGDomCJKoIPNQOIY7pJ6k8i7kOemIEGa6GShPqV5q39QwSJZ2wnmA4/eD52mZbgftUkg== Received: from [98.137.12.60] by nm22.bullet.mail.gq1.yahoo.com with NNFMP; 03 Mar 2015 23:09:52 -0000 Received: from [208.71.42.194] by tm5.bullet.mail.gq1.yahoo.com with NNFMP; 03 Mar 2015 23:09:52 -0000 Received: from [127.0.0.1] by smtp205.mail.gq1.yahoo.com with NNFMP; 03 Mar 2015 23:09:52 -0000 X-Yahoo-Newman-Id: 547531.29739.bm@smtp205.mail.gq1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: YDoBvOIVM1l6qCFNjekFMal3CU6ssiWu5uZAbkr6oysMfV6 9n1VRLuWIcN4WlrPm895XzOZN6Yh4PG2ygQXxaDBuCXs7iQeRUrXKuXnpJaB EP9eV1t2isuhWPQ6u6FCeoQ_IyRV_XtH34NML8IMDtlNTYQvQeaDybQUNaOa .AAJKSM42.b0n1WFE1ySw5H3_3Sv16SniHVOJVsdyB.bshYsMC_bCX_BUBu7 DLgtYEWJlJCXHJZlrEUeHQNkK0JHLX7e6XY.HOhvcvbymtv1pv_dCOPzQX9r 0xaAdt_0HN43JGNVbXl.YDqGwoRipXuMZMXcebH1Bl8b7_Q2zRSZI5oMyDrP 6FYWwKsvhAkx6_Cg2nJR9TQMJcf96dGuBtyardMHAj2wo2JnnrPTuhFPPN3G ctBqDyKGRPoRAoDdGc1FYJ1kwyK06MPT4mH2oJi4Mn42UpLkqb5D0OMCwdUI L6FwVIxuDsWkFQCWOMEpxL6Nu6ryUSeofi4k9Y9fo.Z6tvjNND5ROeqZc4YE yPQN4HgqQKKWRrIJB2LR8vcZlGkfP0zrI3s_V41JlPZP7ViEVCToCIR6cRbd DG55oseXI9MokVmfACfptB7tZXqFh3KFcrMXGMrtFeS1P1LLV020ziga2mUI bFaNbY2UNwGNE5zrIE_96YRHppUSx7Ztu0eJkj32kpaNimspLV4d0ogQMW2A F3imtAMSOt5oS2JQIUmcs2VxAsYSF7c6kyfLPZlx_jmK375TZW5OagHtj55l Ds3SJl3mEBvJuiUgd6ibNLvJcY7AEDO1pJP87gdG9puT5Wi8tqf9Rc86XZmY ggpq.crWTMtxs1c81svkLcsgGzUMnTyYkcEpTCszZjJEwWBqgCteiaJbCDCw T4UkfzXE- X-Yahoo-SMTP: oLSY3dWswBBqoBVzCkLl_RIsw6heKMxu8wpEbARv1SU- Date: Tue, 3 Mar 2015 15:09:49 -0800 From: Ilya Zakharevich To: bug-gnu-emacs@gnu.org Subject: 25.0.50; Unicode keyboard input on Windows Message-ID: <20150303230949.GA29784@math.berkeley.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: 
 inline
Content-Transfer-Encoding: 8bit
User-Agent: Mutt/1.5.21 (2010-09-15)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value).
X-Received-From: 2001:4830:134:3::11
X-Spam-Score: -5.0 (-----)
X-Debbugs-Envelope-To: submit
X-BeenThere: debbugs-submit@debbugs.gnu.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: debbugs-submit-bounces@debbugs.gnu.org
Sender: "Debbugs-submit"
X-Spam-Score: -5.0 (-----)

I’m working on a patch to make Unicode keyboard input work properly
on Windows (in graphic mode).

The problems with the current implementation stem from two facts:

  • On Windows, it IS possible to implement a bullet-proof system of
    Unicode input (at least, for GUI applications);
  • However, how to do it is completely undocumented.

  [See
    http://search.cpan.org/~ilyaz/UI-KeyboardLayout/lib/UI/KeyboardLayout.pm#Keyboard_input_on_Windows:_interaction_of_applications_and_the_kernel
  ]

So, essentially, all developers of applications try to design their
own set of heuristic approaches which

  • cover several keyboard layouts they can put their hands on;
  • more or less follow the design goals of their applications.

The approach taken by Emacs is to break the keyboard keys (VK’s) into
several groups, and treat different groups differently.  Only the keys
on the main island of the keyboard may input characters.  Moreover,
only the most common combinations of modifiers are allowed to be used
for character input.  (In addition, there are plain bugs, like
treating UTF-16 as if it were UTF-32.)
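
To make the UTF-16 versus UTF-32 remark concrete, here is a minimal sketch of combining surrogate pairs into code points. It is not taken from Emacs or from the patch; the function name utf16_to_utf32 and its signature are hypothetical, and passing unpaired surrogates through unchanged is an assumption chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an Emacs function): decode UTF-16 code
   units into Unicode code points.  A high surrogate (0xD800..0xDBFF)
   followed by a low surrogate (0xDC00..0xDFFF) becomes one code point
   above 0xFFFF; every other code unit, including an unpaired
   surrogate, is copied through unchanged.  Returns the number of code
   points written, never more than out_len.  */
size_t
utf16_to_utf32 (const uint16_t *in, size_t in_len,
                uint32_t *out, size_t out_len)
{
  size_t n = 0;

  for (size_t i = 0; i < in_len && n < out_len; i++)
    {
      uint32_t cu = in[i];

      if (cu >= 0xD800 && cu <= 0xDBFF
          && i + 1 < in_len
          && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF)
        {
          /* Standard UTF-16 decoding of a surrogate pair.  */
          cu = 0x10000 + ((cu - 0xD800) << 10) + (in[i + 1] - 0xDC00);
          i++;                  /* The low surrogate is consumed too.  */
        }
      out[n++] = cu;
    }
  return n;
}
```

Code that treats each 16-bit unit as a complete character would report two characters for the pair 0xD83D 0xDE00, where this sketch yields the single code point 0x1F600.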

  [I gave a very terse description on
     https://groups.google.com/forum/?hl=en#!search/emacs$20keyboard$20windows$20ilya/gnu.emacs.help/ZHpZK2YfFuo/aAyZFUxrFeEJ
  ]

The “correct” approach should proceed in exactly the opposite
direction: if a keypress produces a character, it should be treated as
a character, no matter where on the physical keyboard the key resides
and which modifiers were pressed.

The patch below

  • Implements this “primacy of characters” doctrine;
  • As far as I could see, is compatible with the current behavior of
    Emacs on “simple keyboard layouts”;
  • Worked at some point (before I started a massive addition of
    comments ;-] and it may still be working; I have not touched it
    for a month);
  • (Currently) ignores the indentation coding rules;
  • Passes all the tests thrown at it by my
    super-puper-all-bells-and-whistles layouts; see e.g.
       http://k.ilyaz.org/windows/izKeys-visual-maps.html#examples
  • Is not bullet-proof:
      ∘ I use one heuristic to detect which modifiers are “consumed” by
        the character input, and which are “on top” of character input;
      ∘ It does not (same as the current Emacs) support
        Unicode-entered-by-Alt-numbers.
  • Does not fix a bug with UTF-16 of stand-alone (pumped to us)
    WM_CHAR’s.

If I ever find more time to work on it, I plan to:

  1) Add yet more documentation;
  2) Change a little bit the logic of detection of consumed/extra
     modifiers.  This change may be cosmetic only, or, with some
     extremely devilish layouts, it may be beneficial.  (I have not
     seen layouts where this change would matter, though!  And I
     looked through the source code of hundred(s).)
  3) Bring it in sync with the Emacs coding style.

Meanwhile, I would greatly appreciate all input related to the current
state of the patch.  (I *HOPE* that I did not break (many!) special
cases in the current implementation, but such things are hard to be
sure of!)
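
The heuristic for “consumed” versus “on top” modifiers mentioned in the list above can be sketched in isolation. The helper name below is hypothetical. It only restates the bit test that the patch applies to the value returned by VkKeyScanW, whose low byte is the virtual-key code and whose high byte is the shift state (bit 0 Shift, bit 1 Ctrl, bit 2 Alt). It takes that value as a plain argument so that it compiles without the Windows headers.

```c
#include <stdbool.h>

/* Hypothetical helper restating the test from the patch.  SCAN is a
   value in the format returned by VkKeyScanW for the generated
   character; VK is the virtual-key code of the key actually pressed.
   Returns true when the character can be typed on that same key with
   at most Shift held, i.e. Alt did not contribute to producing the
   character and so counts as a modifier "on top" of it.  */
static bool
char_available_without_alt (short scan, unsigned vk)
{
  unsigned r = (unsigned short) scan; /* -1 ("no such key") -> 0xFFFF */

  return (r & 0xFF) == vk && (r & ~0x1FFu) == 0;
}
```

With illustrative values, scan 0x0141 (Shift plus key 0x41) against vk 0x41 gives true, so Alt would be kept as a modifier. Scan 0x0645 (Ctrl plus Alt plus key 0x45) against vk 0x45 gives false, so Alt would be treated as consumed and stripped.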

Thanks for the parts of Emacs which ARE working great,
Ilya

=======================================================
--- w32fns.c-ini	2015-01-30 15:33:23.505201400 -0800
+++ w32fns.c	2015-02-15 02:46:12.070091800 -0800
@@ -2832,6 +2832,126 @@ post_character_message (HWND hwnd, UINT
   my_post_msg (&wmsg, hwnd, msg, wParam, lParam);
 }
 
+static int
+get_wm_chars (HWND aWnd, int *buf, int buflen, int ignore_ctrl, int ctrl, int *ctrl_cnt, int *is_dead, int vk, int exp)
+{
+  MSG msg;
+  int i = buflen, doubled = 0, code_unit; /* If doubled is at the end, ignore it */
+  if (ctrl_cnt)
+    *ctrl_cnt = 0;
+  if (is_dead)
+    *is_dead = -1;
+  while (buflen &&	/* Should be called only when w32_unicode_gui */
+	 PeekMessageW(&msg, aWnd, WM_KEYFIRST, WM_KEYLAST, PM_NOREMOVE | PM_NOYIELD) &&
+	 (msg.message == WM_CHAR || msg.message == WM_SYSCHAR ||
+	  msg.message == WM_DEADCHAR || msg.message == WM_SYSDEADCHAR || msg.message == WM_UNICHAR)) { /* Not contiguous */
+    int dead;
+
+    GetMessageW(&msg, aWnd, msg.message, msg.message);
+    dead = (msg.message == WM_DEADCHAR || msg.message == WM_SYSDEADCHAR);
+    if (is_dead)
+      *is_dead = (dead ? msg.wParam : -1);
+    if (dead)
+      continue;
+    code_unit = msg.wParam;
+    if (doubled) {		/* had surrogate */
+      if (msg.message == WM_UNICHAR || code_unit < 0xDC00 || code_unit > 0xDFFF) {
+	/* Mismatched first surrogate.  Pass both code units as if they were two characters. */
+	*buf++ = doubled;
+	if (!--buflen)		// Drop the second char if at the end of the buffer
+	  return i;
+      } else {
+	code_unit = (doubled << 10) + code_unit - 0x35FDC00;
+      }
+      doubled = 0;
+    } else if (code_unit >= 0xD800 && code_unit <= 0xDBFF) {
+      doubled = code_unit;
+      continue;
+    }	/* We handle mismatched second surrogate the same as a normal character. */
+    /* The only "fake" characters delivered by ToUnicode() or TranslateMessage() are:
+       0x01 .. 0x1a for Control-chars,
+       0x00 and 0x1b .. 0x1f for Control- []\@^_
+       0x7f for Control-BackSpace
+       0x20 for Control-Space */
+    if (ignore_ctrl && (code_unit < 0x20 || code_unit == 0x7f || (code_unit == 0x20 && ctrl))) {
+      /* Non-character payload in a WM_CHAR (Ctrl-something pressed).  Ignore. */
+      if (ctrl_cnt)
+	(*ctrl_cnt)++;	/* Increment the counter itself, not the pointer */
+      continue;
+    }
+    if (code_unit < 0x7f &&
+	((vk >= VK_NUMPAD0 && vk <= VK_DIVIDE) ||
+	 (exp && ((vk >= VK_PRIOR && vk <= VK_DOWN) ||
+		  vk == VK_INSERT || vk == VK_DELETE || vk == VK_CLEAR))) &&
+	strchr("0123456789/*-+.,", code_unit)) /* Traditionally, Emacs translates these to characters later, in `self-insert-character' */
+      continue;
+    *buf++ = code_unit;
+    buflen--;
+  }
+  return i - buflen;
+}
+
+int
+deliver_wm_chars (int do_translate, HWND hwnd, UINT msg, UINT wParam, UINT lParam)
+{
+  /* An "old style" keyboard description may assign up to 125 UTF-16 code points to a keypress.
+     (However, the "old style" TranslateMessage() would deliver at most 16 of them.)  Be on a
+     safe side, and prepare to treat many more. */
+  int ctrl_cnt, buf[1024], count, is_dead;
+
+  if (do_translate) {
+    MSG windows_msg = { hwnd, msg, wParam, lParam, 0, {0,0} };
+
+    windows_msg.time = GetMessageTime ();
+    TranslateMessage (&windows_msg);
+  }
+  count = get_wm_chars (hwnd, buf, sizeof(buf)/sizeof(*buf), 1,
+			/* The message may have been synthesized by who knows what; be conservative. */
+			modifier_set (VK_LCONTROL) || modifier_set (VK_RCONTROL) || modifier_set (VK_CONTROL),
+			&ctrl_cnt, &is_dead, wParam, (lParam & 0x1000000L) != 0);
+  if (count) {
+    W32Msg wmsg;
+    int *b = buf, strip_Alt = 1;
+
+    /* wParam is checked when converting CapsLock to Shift */
+    wmsg.dwModifiers = do_translate ? w32_get_key_modifiers (wParam, lParam) : 0;
+
+    /* What follows is just heuristics; the correct treatment requires non-destructive ToUnicode(). */
+    if (wmsg.dwModifiers & ctrl_modifier)	/* If ctrl-something delivers chars, ctrl and the rest should be hidden */
+      wmsg.dwModifiers = wmsg.dwModifiers & shift_modifier;
+    /* In many keyboard layouts, (left) Alt is not changing the character.  Unless we are in this situation, strip Alt/Meta. */
+    if (wmsg.dwModifiers & (alt_modifier | meta_modifier) && /* If alt-something delivers non-ASCII chars, alt should be hidden */
+	count == 1 && *b < 0x10000) {
+      SHORT r = VkKeyScanW( *b );
+
+      fprintf(stderr, "VkKeyScanW %#06x %#04x\n", (int)r, wParam);
+      if ((r & 0xFF) == wParam && !(r & ~0x1FF)) { /* Char available without Alt modifier, so Alt is "on top" */
+	if (*b > 0x7f && ('A' <= wParam && wParam <= 'Z'))
+	  return 0;		/* Another branch below would convert it to Alt-Latin char via wParam */
+	strip_Alt = 0;
+      }
+    }
+    if (strip_Alt)
+      wmsg.dwModifiers = wmsg.dwModifiers & ~(alt_modifier | meta_modifier);
+
+    signal_user_input ();
+    while (count--)
+      {
+	fprintf(stderr, "unichar %#06x\n", *b);
+	my_post_msg (&wmsg, hwnd, WM_UNICHAR, *b++, lParam);
+      }
+    if (!ctrl_cnt)		/* Process ALSO as ctrl */
+      return 1;
+    else
+      fprintf(stderr, "extra ctrl char\n");
+    return -1;
+  } else if (is_dead >= 0) {
+    fprintf(stderr, "dead %#06x\n", is_dead);
+    return 1;
+  }
+  return 0;
+}
+
 /* Main window procedure */
 
 static LRESULT CALLBACK
@@ -3007,7 +3127,6 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
       /* Synchronize modifiers with current keystroke.  */
       sync_modifiers ();
       record_keydown (wParam, lParam);
-      wParam = map_keypad_keys (wParam, (lParam & 0x1000000L) != 0);
 
       windows_translate = 0;
 
@@ -3117,6 +3236,45 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
 	  wParam = VK_NUMLOCK;
 	  break;
 	default:
+	  if (w32_unicode_gui) {
+	    /* If this event generates characters or deadkeys, do not interpret
+	       it as a "raw combination of modifiers and keysym".  Hide
+	       deadkeys, and use the generated character(s) instead of the
+	       keysym.   (Backward compatibility: exceptions for numpad keys
+	       generating 0-9 . , / * - +, and for extra-Alt combined with a
+	       non-Latin char.)
+
+	       Try to not report modifiers which have effect on which
+	       character or deadkey is generated.
+
+	       Example (contrived): if rightAlt-? generates f (on a Cyrillic
+	       keyboard layout), and Ctrl, leftAlt do not affect the generated
+	       character, one wants to report Ctrl-leftAlt-f if the user
+	       presses Ctrl-leftAlt-rightAlt-?.  */
+	    int res;
+#if 0
+	    /* Some of WM_CHAR may be fed to us directly, some are results of
+	       TranslateMessage().  Using 0 as the first argument (in a
+	       separate call) might help us distinguish these two cases.
+
+	       However, the keypress feeders would most probably expect the
+	       "standard" message pump, when TranslateMessage() is called on
+	       EVERY KeyDown/Keyup event.  So they may feed us Down-Ctrl
+	       Down-FAKE Char-o and expect us to recognize it as Ctrl-o.
+	       Using 0 as the first argument would interfere with this.   */
+	    deliver_wm_chars (0, hwnd, msg, wParam, lParam);
+#endif
+	    /* Processing the generated WM_CHAR messages *WHILE* we handle
+	       KEYDOWN/UP event is the best choice, since without any fuss,
+	       we know all 3 of: scancode, virtual keycode, and expansion.
+	       (Additionally, one knows boundaries of expansion of different
+	       keypresses.)  */
+	    res = deliver_wm_chars (1, hwnd, msg, wParam, lParam);
+	    windows_translate = -( res != 0 );
+	    if (res > 0)		/* Bound to character(s) or a deadkey */
+	      break;
+	  }			/* Some branches after this one may be not needed */
+	  wParam = map_keypad_keys (wParam, (lParam & 0x1000000L) != 0);
 	  /* If not defined as a function key, change it to a WM_CHAR message.  */
 	  if (wParam > 255 || !lispy_function_keys[wParam])
 	    {
@@ -3184,6 +3342,8 @@ w32_wnd_proc (HWND hwnd, UINT msg, WPARA
 	    }
 	}
 
+    if (windows_translate == -1)
+      break;
     translate:
       if (windows_translate)
 	{
=======================================================

In GNU Emacs 25.0.50.20 (i686-pc-mingw32)
 of 2015-02-08 on BUCEFAL
Repository revision: d5e3922e08587e7eb9e5aec2e9f84cbda405f857
Windowing system distributor `Microsoft Corp.', version 6.1.7601
Configured using:
 `configure --prefix=/k/test'

Configured features:
SOUND NOTIFY ACL

Important settings:
  value of $LANG: ENU
  locale-coding-system: cp1252

Major mode: Fundamental

Minor modes in effect:
  tooltip-mode: t
  global-eldoc-mode: t
  electric-indent-mode: t
  mouse-wheel-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  blink-cursor-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  buffer-read-only: t
  line-number-mode: t

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.

Load-path shadows:
None found.

Features: (shadow sort gnus-util mail-extr emacsbug message dired format-spec rfc822 mml easymenu mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util help-fns mail-prsvr mail-utils time-date tooltip eldoc electric uniquify ediff-hook vc-hooks lisp-float-type mwheel dos-w32 ls-lisp disp-table w32-win w32-vars tool-bar dnd fontset image regexp-opt fringe tabulated-list newcomment elisp-mode lisp-mode prog-mode register page menu-bar rfn-eshadow timer select scroll-bar mouse jit-lock font-lock syntax facemenu font-core frame cham georgian utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean japanese hebrew greek romanian slovak czech european ethiopic indian cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev minibuffer cl-preloaded nadvice loaddefs button faces cus-face macroexp files text-properties overlay sha1 md5 base64 format env code-pages mule custom widget hashtable-print-readable backquote make-network-process w32notify w32 multi-tty emacs) Memory information: ((conses 8 80324 9864) (symbols 32 17968 0) (miscs 32 85 128) (strings 16 12688 4007) (string-bytes 1 324435) (vectors 8 9470) (vector-slots 4 390690 6074) (floats 8 65 62) (intervals 28 243 45) (buffers 516 13)) ------------=_1597249984-23397-1--