From unknown Sun Jun 22 11:32:31 2025
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
From: bug#29837 <29837@debbugs.gnu.org>
To: bug#29837 <29837@debbugs.gnu.org>
Subject: Status: UTF-16 char display problems and the macOS "character palette"
Reply-To: bug#29837 <29837@debbugs.gnu.org>
Date: Sun, 22 Jun 2025 18:32:31 +0000

retitle 29837 UTF-16 char display problems and the macOS "character palette"
reassign 29837 emacs
submitter 29837 Alan Third
severity 29837 normal
tag 29837 fixed
thanks

From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 24 11:01:09 2017
Date: Sun, 24 Dec 2017 16:00:53 +0000
From: Alan Third
To: bug-gnu-emacs@gnu.org
Subject: UTF-16 char display problems and the macOS "character palette"
Message-ID: <20171224160053.GA71863@breton.holly.idiocy.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="82I3+IH0IqGh5yIs"

--82I3+IH0IqGh5yIs
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

Hi,

I’ve had a go at enabling the macOS character palette, which is just a
virtual keyboard that helps you to enter special characters, emoji,
etc.

It’s easy enough to bring it up (patch attached) but some special
characters are put into Emacs incorrectly. I think the problem is that
we have multi code‐point UTF‐16 characters, and when they are ‘typed’
into Emacs they are entered as individual 16 bit code‐points and are
therefore displayed as a series of blank spaces.

An example is '🢫' (RIGHTWARDS FRONT-TILTED SHADOWED WHITE ARROW).
If I enter it using C‐x 8 RET, it appears correctly, but if I use the
character palette it shows up as two blank spaces. Describe-char
reveals these to be HIGH SURROGATE-D83E and LOW SURROGATE-DCAB, in
that order.

I can’t work out if Emacs should be able to handle these multi
code‐point characters being entered from a ‘keyboard’ input or not. If
so, does anyone have any idea what I need to do?

(Another minor irritation is that some characters (like pointing
hands) seem to insert the desired character then follow up with
VARIATION SELECTOR-15. I assume this is supposed to tell us what
colour we want the hand? If so should it be displayed?)

-- 
Alan Third

--82I3+IH0IqGh5yIs
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="0001-Add-macOS-character-palette.patch"

>From ad16b98288abe91732217535e308ae445303ab59 Mon Sep 17 00:00:00 2001
From: Alan Third
Date: Sun, 24 Dec 2017 15:40:03 +0000
Subject: [PATCH] Add macOS character-palette

---
 lisp/term/ns-win.el |  8 ++++++++
 src/nsfns.m         | 14 ++++++++++++++
 src/nsterm.m        |  7 ++++++-
 3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/lisp/term/ns-win.el b/lisp/term/ns-win.el
index d512e8e506..7955ae0cb0 100644
--- a/lisp/term/ns-win.el
+++ b/lisp/term/ns-win.el
@@ -144,6 +144,8 @@ global-map
 (define-key global-map [?\s-z] 'undo)
 (define-key global-map [?\s-|] 'shell-command-on-region)
 (define-key global-map [s-kp-bar] 'shell-command-on-region)
+;; The key-chord below is C-s-SPC
+(define-key global-map [C-s-268632064] 'ns-do-show-character-palette)
 ;; (as in Terminal.app)
 (define-key global-map [s-right] 'ns-next-frame)
 (define-key global-map [s-left] 'ns-prev-frame)
@@ -575,6 +577,12 @@ ns-do-emacs-info-panel
   (interactive)
   (ns-emacs-info-panel))
 
+(declare-function ns-show-character-palette "nsfns.m" ())
+
+(defun ns-do-show-character-palette ()
+  (interactive)
+  (ns-show-character-palette))
+
 (defun ns-next-frame ()
   "Switch to next visible frame."
   (interactive)
diff --git a/src/nsfns.m b/src/nsfns.m
index 05605bf657..402771e2f8 100644
--- a/src/nsfns.m
+++ b/src/nsfns.m
@@ -3135,6 +3135,19 @@ The position is returned as a cons cell (X . Y) of the
                 (pt.y - screen.frame.origin.y)));
 }
 
+DEFUN ("ns-show-character-palette",
+       Fns_show_character_palette,
+       Sns_show_character_palette, 0, 0, 0,
+       doc: /* Show the macOS character palette. */)
+     (void)
+{
+  struct frame *f = SELECTED_FRAME ();
+  EmacsView *view = FRAME_NS_VIEW (f);
+  [NSApp orderFrontCharacterPalette:view];
+
+  return Qnil;
+}
+
 /* ==========================================================================
 
     Class implementations
@@ -3326,6 +3339,7 @@ - (NSString *)panel: (id)sender userEnteredFilename: (NSString *)filename
   defsubr (&Sns_frame_restack);
   defsubr (&Sns_set_mouse_absolute_pixel_position);
   defsubr (&Sns_mouse_absolute_pixel_position);
+  defsubr (&Sns_show_character_palette);
   defsubr (&Sx_display_mm_width);
   defsubr (&Sx_display_mm_height);
   defsubr (&Sx_display_screens);
diff --git a/src/nsterm.m b/src/nsterm.m
index 07ac8f978f..65a9aac4a7 100644
--- a/src/nsterm.m
+++ b/src/nsterm.m
@@ -6284,11 +6284,16 @@ flag set (this is probably a bug in the OS).
 - (void)insertText: (id)aString
 {
   int code;
-  int len = [(NSString *)aString length];
+  int len;
   int i;
 
   NSTRACE ("[EmacsView insertText:]");
 
+  if ([aString isKindOfClass:[NSAttributedString class]])
+    aString = [aString string];
+
+  len = [(NSString *)aString length];
+
   if (NS_KEYLOG)
     NSLog (@"insertText '%@'\tlen = %d", aString, len);
   processingCompose = NO;
-- 
2.14.3

--82I3+IH0IqGh5yIs--

From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 24 11:56:35 2017
Date: Sun, 24 Dec 2017 18:56:29 +0200
Message-Id: <83bmiojc8y.fsf@gnu.org>
From: Eli Zaretskii
To: Alan Third
In-reply-to: <20171224160053.GA71863@breton.holly.idiocy.org> (message from Alan Third on Sun, 24 Dec 2017 16:00:53 +0000)
Subject: Re: bug#29837: UTF-16 char display problems and the macOS "character palette"
References: <20171224160053.GA71863@breton.holly.idiocy.org>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: 29837@debbugs.gnu.org

> Date: Sun, 24 Dec 2017 16:00:53 +0000
> From: Alan Third
>
> It’s easy enough to bring it up (patch attached) but some special
> characters are put into Emacs incorrectly. I think the problem is that
> we have multi code‐point UTF‐16 characters, and when they are ‘typed’
> into Emacs they are entered as individual 16 bit code‐points and are
> therefore displayed as a series of blank spaces.
>
> An example is '🢫' (RIGHTWARDS FRONT-TILTED SHADOWED WHITE ARROW). If I
> enter it using C‐x 8 RET, it appears correctly, but if I use the
> character palette it shows up as two blank spaces. Describe-char
> reveals these to be HIGH SURROGATE-D83E and LOW SURROGATE-DCAB, in
> that order.

You need to tell Emacs that keyboard input is in UTF-16. Did you try
"C-x RET k"?

> (Another minor irritation is that some characters (like pointing
> hands) seem to insert the desired character then follow up with
> VARIATION SELECTOR-15. I assume this is supposed to tell us what
> colour we want the hand? If so should it be displayed?)

Emacs doesn't yet support variation selectors. Patches to add that
are welcome (I guess it will need some change in our interface with
font back-ends?).

From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 24 13:23:31 2017
Date: Sun, 24 Dec 2017 18:23:21 +0000
From: Alan Third
To: Eli Zaretskii
Subject: Re: bug#29837: UTF-16 char display problems and the macOS "character palette"
Message-ID: <20171224182321.GA72021@breton.holly.idiocy.org>
References: <20171224160053.GA71863@breton.holly.idiocy.org>
 <83bmiojc8y.fsf@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <83bmiojc8y.fsf@gnu.org>
Cc: 29837@debbugs.gnu.org

On Sun, Dec 24, 2017 at 06:56:29PM +0200, Eli Zaretskii wrote:
> > An example is '🢫' (RIGHTWARDS FRONT-TILTED SHADOWED WHITE ARROW). If I
> > enter it using C‐x 8 RET, it appears correctly, but if I use the
> > character palette it shows up as two blank spaces. Describe-char
> > reveals these to be HIGH SURROGATE-D83E and LOW SURROGATE-DCAB, in
> > that order.
>
> You need to tell Emacs that keyboard input is in UTF-16. Did you try
> "C-x RET k"?

I have now but I can’t find a utf-16 option that is ‘suitable’ for
keyboard input.

-- 
Alan Third

From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 24 13:57:12 2017
Date: Sun, 24 Dec 2017 20:57:04 +0200
Message-Id: <834logj6nz.fsf@gnu.org>
From: Eli Zaretskii
To: Alan Third
In-reply-to: <20171224182321.GA72021@breton.holly.idiocy.org> (message from Alan Third on Sun, 24 Dec 2017 18:23:21 +0000)
Subject: Re: bug#29837: UTF-16 char display problems and the macOS "character palette"
References: <20171224160053.GA71863@breton.holly.idiocy.org>
 <83bmiojc8y.fsf@gnu.org>
 <20171224182321.GA72021@breton.holly.idiocy.org>
MIME-version: 1.0
Content-type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: 29837@debbugs.gnu.org

> Date: Sun, 24 Dec 2017 18:23:21 +0000
> From: Alan Third
> Cc: 29837@debbugs.gnu.org
>
> > You need to tell Emacs that keyboard input is in UTF-16. Did you try
> > "C-x RET k"?
>
> I have now but I can’t find a utf-16 option that is ‘suitable’ for
> keyboard input.

What do you mean by "option" and by "suitable"?

From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 24 14:28:18 2017
Date: Sun, 24 Dec 2017 19:28:07 +0000
From: Alan Third
To: Eli Zaretskii
Subject: Re: bug#29837: UTF-16 char display problems and the macOS "character palette"
Message-ID: <20171224192807.GA73590@breton.holly.idiocy.org>
References: <20171224160053.GA71863@breton.holly.idiocy.org>
 <83bmiojc8y.fsf@gnu.org>
 <20171224182321.GA72021@breton.holly.idiocy.org>
 <834logj6nz.fsf@gnu.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <834logj6nz.fsf@gnu.org>
Cc: 29837@debbugs.gnu.org

On Sun, Dec 24, 2017 at 08:57:04PM +0200, Eli Zaretskii wrote:
> > Date: Sun, 24 Dec 2017 18:23:21 +0000
> > From: Alan Third
> > Cc: 29837@debbugs.gnu.org
> >
> > > You need to tell Emacs that keyboard input is in UTF-16. Did you try
> > > "C-x RET k"?
> >
> > I have now but I can’t find a utf-16 option that is ‘suitable’ for
> > keyboard input.
>
> What do you mean by "option" and by "suitable"?

If I try to select utf-16 I get this

    set-keyboard-coding-system: Unsuitable coding system for keyboard: utf-16

and I used tab completion to find which other coding systems were
available but all the ones beginning utf-16 that I tried return the
same message.

-- 
Alan Third

From debbugs-submit-bounces@debbugs.gnu.org Sun Dec 24 14:35:01 2017
Date: Sun, 24 Dec 2017 21:34:37 +0200
Message-Id: <83zi67j4xe.fsf@gnu.org>
From: Eli Zaretskii
To: Alan Third
In-reply-to: <20171224192807.GA73590@breton.holly.idiocy.org> (message from Alan Third on Sun, 24 Dec 2017 19:28:07 +0000)
Subject: Re: bug#29837: UTF-16 char display problems and the macOS "character palette"
References: <20171224160053.GA71863@breton.holly.idiocy.org>
 <83bmiojc8y.fsf@gnu.org>
 <20171224182321.GA72021@breton.holly.idiocy.org>
 <834logj6nz.fsf@gnu.org>
 <20171224192807.GA73590@breton.holly.idiocy.org>
Cc: 29837@debbugs.gnu.org

> Date: Sun, 24 Dec 2017 19:28:07 +0000
> From: Alan Third
> Cc: 29837@debbugs.gnu.org
>
> If I try to select utf-16 I get this
>
>     set-keyboard-coding-system: Unsuitable coding system for keyboard: utf-16
>
> and I used tab completion to find which other coding systems were
> available but all the ones beginning utf-16 that I tried return the
> same message.

Oh, I now recollect that Handa-san said at some point that keyboard
input doesn't support UTF-16...

How do other macOS programs read UTF-16 keyboard input? Maybe you
could use the same way to read the sequences, and then decode them
internally as UTF-16 using coding.c facilities, and feed them into the
Emacs event queue? Just a thought.

From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 25 15:14:13 2017
MIME-Version: 1.0
References: <20171224160053.GA71863@breton.holly.idiocy.org>
 <83bmiojc8y.fsf@gnu.org>
 <20171224182321.GA72021@breton.holly.idiocy.org>
 <834logj6nz.fsf@gnu.org>
 <20171224192807.GA73590@breton.holly.idiocy.org>
 <83zi67j4xe.fsf@gnu.org>
In-Reply-To: <83zi67j4xe.fsf@gnu.org>
From: Philipp Stephani
Date: Mon, 25 Dec 2017 20:13:55 +0000
Subject: Re: bug#29837: UTF-16 char display problems and the macOS "character palette"
To: Eli Zaretskii
Content-Type: multipart/alternative; boundary="001a114c563ee1d2b405612fce1d"
Cc: Alan Third, 29837@debbugs.gnu.org

--001a114c563ee1d2b405612fce1d
Content-Type: text/plain; charset="UTF-8"

Eli Zaretskii wrote on Sun, 24 Dec 2017 at 20:35:

> > Date: Sun, 24 Dec 2017 19:28:07 +0000
> > From: Alan Third
> > Cc: 29837@debbugs.gnu.org
> >
> > If I try to select utf-16 I get this
> >
> >     set-keyboard-coding-system: Unsuitable coding system for keyboard: utf-16
> >
> > and I used tab completion to find which other coding systems were
> > available but all the ones beginning utf-16 that I tried return the
> > same message.
>
> Oh, I now recollect that Handa-san said at some point that keyboard
> input doesn't support UTF-16...
>
> How do other macOS programs read UTF-16 keyboard input? Maybe you
> could use the same way to read the sequences, and then decode them
> internally as UTF-16 using coding.c facilities, and feed them into the
> Emacs event queue? Just a thought.
> > IIUC Emacs receives the input as a single UTF-16 string (in insertText), then iterates over the UTF-16 code units, converting each into an Emacs event. That's wrong, no matter whether the input comes from the character palette or from the keyboard; normal keyboard layouts just happen to not contain non-BMP characters. The loop needs to account for surrogates. As a small optimization (which is warranted because the function is probably called on every keystroke), this should use [NSString getCharacters:range:] to copy all the UTF-16 code units to a buffer first, to avoid repeated calls to characterAtIndex. --001a114c563ee1d2b405612fce1d Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


Eli Zaretskii <eliz@gnu.org> wrote on Sun, 24 Dec 2017 at 20:35:

> > Date: Sun, 24 Dec 2017 19:28:07 +0000
> > From: Alan Third <alan@idiocy.org>
> > Cc: 29837@debbugs.gnu.org
> >
> > If I try to select utf-16 I get this
> >
> >     set-keyboard-coding-system: Unsuitable coding system for keyboard: utf-16
> >
> > and I used tab completion to find which other coding systems were
> > available but all the ones beginning utf-16 that I tried return the
> > same message.
>
> Oh, I now recollect that Handa-san said at some point that keyboard
> input doesn't support UTF-16...
>
> How do other macOS programs read UTF-16 keyboard input?  Maybe you
> could use the same way to read the sequences, and then decode them
> internally as UTF-16 using coding.c facilities, and feed them into the
> Emacs event queue?  Just a thought.


IIUC Emacs receives the input as a single UTF-16 string (in insertText), then iterates over the UTF-16 code units, converting each into an Emacs event. That's wrong, no matter whether the input comes from the character palette or from the keyboard; normal keyboard layouts just happen to not contain non-BMP characters. The loop needs to account for surrogates.
As a small optimization (which is warranted because the function is probably called on every keystroke), this should use [NSString getCharacters:range:] to copy all the UTF-16 code units to a buffer first, to avoid repeated calls to characterAtIndex.
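
For illustration only, here is a minimal sketch of a surrogate-aware loop like the one described above, written in plain C rather than the Objective-C of the real input handler. It is not the fix that was committed for this bug. The function name decode_utf16, the array-based interface, and the choice to emit U+FFFD for an unpaired surrogate are all assumptions made for the example; the input array merely stands in for the buffer that [NSString getCharacters:range:] would fill.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode LEN UTF-16 code units from UNITS into
   Unicode code points stored in OUT, and return how many code points
   were written.  OUT must have room for at least LEN entries.  */
size_t
decode_utf16 (const uint16_t *units, size_t len, uint32_t *out)
{
  size_t n = 0;

  for (size_t i = 0; i < len; i++)
    {
      uint32_t u = units[i];

      if (u >= 0xD800 && u <= 0xDBFF
          && i + 1 < len
          && units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF)
        {
          /* High surrogate followed by a low surrogate: combine the
             pair into one code point above the BMP and skip the
             second unit.  */
          uint32_t lo = units[++i];
          out[n++] = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
        }
      else if (u >= 0xD800 && u <= 0xDFFF)
        {
          /* Unpaired surrogate: substitute U+FFFD (an assumption of
             this sketch, not something stated in the thread).  */
          out[n++] = 0xFFFD;
        }
      else
        {
          /* Ordinary BMP character: one code unit, one code point.  */
          out[n++] = u;
        }
    }

  return n;
}
```

With this shape, each decoded code point (rather than each code unit) would become one input event, so the pair 0xD83D 0xDE00 yields the single character U+1F600 instead of two separate surrogate events.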
--001a114c563ee1d2b405612fce1d-- From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 25 16:07:30 2017 Received: (at 29837) by debbugs.gnu.org; 25 Dec 2017 21:07:30 +0000 Received: from localhost ([127.0.0.1]:52201 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eTZy6-0004J3-A1 for submit@debbugs.gnu.org; Mon, 25 Dec 2017 16:07:30 -0500 Received: from mail-qt0-f177.google.com ([209.85.216.177]:43495) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eTZy4-0004Iq-Ik for 29837@debbugs.gnu.org; Mon, 25 Dec 2017 16:07:28 -0500 Received: by mail-qt0-f177.google.com with SMTP id w10so43384640qtb.10 for <29837@debbugs.gnu.org>; Mon, 25 Dec 2017 13:07:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:from:date:message-id:subject:to:cc; bh=rvooZLsrvz8YvuUnquICgAY9XT+rbewSEfdL0pCrUO8=; b=lIXdfPRHmBvXz37pbcGmXuYOOb+Be6jLA24eiTV9inh+F4OMmKPNx7nRayHNIwlw4S I4tt+rWKSEQcqFa9mtd9fD2zhr8N7lylW41rah8UvHtqwH3c69IWMOm4fGo1KM31xWFC nsTa0hIiVFBM7cG/C04ZJAeL2PghYj7VfUCaex15rxmfhANPAt42EE5oSuTHtzsryWZq xTZmMYz87WPuyQWUwXiP2mkWf5/JK7IBZAjFUQkUI6NtR9oMSSgGcX4WPecgy2AYRobg HO2CNjl21GkfpReQx6PqYCF+Z2GBKS/IoQJfgt8FUv7ih98u9Y1ilyW759Z203osw29z tSDw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:from:date:message-id :subject:to:cc; bh=rvooZLsrvz8YvuUnquICgAY9XT+rbewSEfdL0pCrUO8=; b=sAhGwzBA2BuBWDX6PmEzMlbeAyJwDHeXXY0EqQB0U7z2iiwUi3bggIORxDrYrBN4U3 FHsOr0v7cAY31m7x1NGKDgCFgFJIQMaqh6H9g89DKbCh2lB8Z7kMxA0jdfbR7rQPXElo OYyaYLEqDpWBImtRBpv1C6uPahHgekGm4xfdXQwtgiDiApPwLD1r3DZ95LjnbezvLkDw HqGy08N8xJ4QWpMyITesl1DIorqtV3Wys6hxbHtt0mkREUgbRZyO6tqVyrgarxMhbb7N AlnsmPNArVlmJf+7gfI0dYC9+lRvdNuvGXfMfNDrTVZWcJoZNMDGFpG8f7Qskd0893nL sD4g== X-Gm-Message-State: AKGB3mJSqqt1KFLg1wH40JemWcxDNN9ru7eL8d9LXU0oOtPtFeH4UBas +hYiS+N0LTGLdP0C5+achCSdoVtcFuY1nZQHXeQ= X-Google-Smtp-Source: 
ACJfBouwwNnM9wHuF8hgUJsgShql6BZoXsNhJOhxirsbN9bOizPzXA5FMDFmbm9AtpbN2ToNk9j1qTCV++tVT1HNlpk= X-Received: by 10.237.60.206 with SMTP id e14mr25359112qtf.157.1514236043177; Mon, 25 Dec 2017 13:07:23 -0800 (PST) MIME-Version: 1.0 References: <20171224160053.GA71863@breton.holly.idiocy.org> <83bmiojc8y.fsf@gnu.org> <20171224182321.GA72021@breton.holly.idiocy.org> <834logj6nz.fsf@gnu.org> <20171224192807.GA73590@breton.holly.idiocy.org> <83zi67j4xe.fsf@gnu.org> From: Philipp Stephani Date: Mon, 25 Dec 2017 21:07:12 +0000 Message-ID: Subject: Re: bug#29837: UTF-16 char display problems and the macOS "character palette" To: Eli Zaretskii Content-Type: multipart/alternative; boundary="001a11414e1e75b2700561308d14" X-Spam-Score: 0.2 (/) X-Debbugs-Envelope-To: 29837 Cc: Alan Third , 29837@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.2 (/) --001a11414e1e75b2700561308d14 Content-Type: text/plain; charset="UTF-8" Philipp Stephani schrieb am Mo., 25. Dez. 2017 um 21:13 Uhr: > > > Eli Zaretskii schrieb am So., 24. Dez. 2017 um 20:35 Uhr: > >> > Date: Sun, 24 Dec 2017 19:28:07 +0000 >> > From: Alan Third >> > Cc: 29837@debbugs.gnu.org >> > >> > If I try to select utf-16 I get this >> > >> > set-keyboard-coding-system: Unsuitable coding system for keyboard: >> utf-16 >> > >> > and I used tab completion to find which other coding systems were >> > available but all the ones beginning utf-16 that I tried return the >> > same message. >> >> Oh, I now recollect that Handa-san said at some point that keyboard >> input doesn't support UTF-16... >> >> How do other macOS programs read UTF-16 keyboard input? 
Maybe you >> could use the same way to read the sequences, and then decode them >> internally as UTF-16 using coding.c facilities, and feed them into the >> Emacs event queue? Just a thought. >> >> > IIUC Emacs receives the input as a single UTF-16 string (in insertText) ... > On a somewhat related note, insertText: is itself deprecated and should be replaced with insertText:replacementRange:. --001a11414e1e75b2700561308d14 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


Philipp Stephani <p.stephani2@gmail.com> wrote on Mon, 25 Dec 2017 at 21:13:

> Eli Zaretskii <eliz@gnu.org> wrote on Sun, 24 Dec 2017 at 20:35:
>
>> > Date: Sun, 24 Dec 2017 19:28:07 +0000
>> > From: Alan Third <alan@idiocy.org>
>> > Cc: 29837@debbugs.gnu.org
>> >
>> > If I try to select utf-16 I get this
>> >
>> >     set-keyboard-coding-system: Unsuitable coding system for keyboard: utf-16
>> >
>> > and I used tab completion to find which other coding systems were
>> > available but all the ones beginning utf-16 that I tried return the
>> > same message.
>>
>> Oh, I now recollect that Handa-san said at some point that keyboard
>> input doesn't support UTF-16...
>>
>> How do other macOS programs read UTF-16 keyboard input?  Maybe you
>> could use the same way to read the sequences, and then decode them
>> internally as UTF-16 using coding.c facilities, and feed them into the
>> Emacs event queue?  Just a thought.
>
> IIUC Emacs receives the input as a single UTF-16 string (in insertText) ...

On a somewhat related note, insertText: is itself deprecated and should be replaced with insertText:replacementRange:.
--001a11414e1e75b2700561308d14-- From debbugs-submit-bounces@debbugs.gnu.org Mon Dec 25 20:34:36 2017 Received: (at 29837) by debbugs.gnu.org; 26 Dec 2017 01:34:36 +0000 Received: from localhost ([127.0.0.1]:52256 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eTe8a-0002Ab-GE for submit@debbugs.gnu.org; Mon, 25 Dec 2017 20:34:36 -0500 Received: from mail-wr0-f169.google.com ([209.85.128.169]:36330) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eTe8W-0002AM-Am for 29837@debbugs.gnu.org; Mon, 25 Dec 2017 20:34:32 -0500 Received: by mail-wr0-f169.google.com with SMTP id u19so27295802wrc.3 for <29837@debbugs.gnu.org>; Mon, 25 Dec 2017 17:34:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=XnaGZNsl1pPr4ryRYK8WRxVzjigNs5Eh3lHN2jZep9o=; b=SdlDMOf2Rm1aElKI3X1jZ1yFhAAqzFah+lhWZmjjumZfkDTDmvLoJ6YLxt0Tc2GhV3 vSzvJy1TSptwxw8cPMul+H9jgyFK94Z5RWjSw1L2k5+Vs5JMf8BmptYq7sbGnvQVVd1W Dl1mwOmC9mYGXSHsFNCEl7goCPr9vvLYp0FVBhx7b/c1rKYvV55+aOG0EI2Jw1YWXDmF 9KJFafStzF4M3nhKDl2Z+6xTWEfOvTJ5dj4adPTubuu6R5uz9YMi18MQiOcsj/NVs5WJ K7SGyDx1XVonhrpEaKn44IKb1q+jioXPYXncwWh/z0s3SsO4LNlDuw9B/bgaVJnTE02m 4pAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition :content-transfer-encoding:in-reply-to:user-agent; bh=XnaGZNsl1pPr4ryRYK8WRxVzjigNs5Eh3lHN2jZep9o=; b=DxI5ZkctO8MjoQGjymztSwsGMX0lEXJhdaaLCTuumZWa84Ebp26A13p9Z/mqAaMK9c THfPeDBnXqV6rFK6jSAUiLo6foQcCWZMSojhWWmWRMUsGWq5/6L949I04lfInxw3FKqe MliXxT+1qv3W54CeCLUtwJz9uPXCF3d8lIwxDmP7QlN9LzqhVwKEt0kRPqmpRFDAfiv6 hHKEYk8h+XZOLyvby4cGQTfaoHl4ZjjjHH8nY1J+dneAjDPlKSHgVC3CwCNZmifZvcZv lgFvb4huWj0FuRMGW+WSVuFo2lRG91tUTWklkuybWS4Rh+jzEX4khpbBj/xL/SDEODnW yGOA== 
X-Gm-Message-State: AKGB3mKqknnllPMQXLvJ/orAom34wheC5CbG+QVvoDIl58n7ANqJTQNv mEShg8CzrDX2nK4vyph4VIk= X-Google-Smtp-Source: ACJfBoutKjyXjPd3jFiQhSaP3JXfIGhY3IJPmlmhAtaqiFg1ZnfOSEr5x34uRz/kiZU/ld1Zt2L8gQ== X-Received: by 10.223.153.72 with SMTP id x66mr25925665wrb.209.1514252066469; Mon, 25 Dec 2017 17:34:26 -0800 (PST) Received: from breton.holly.idiocy.org (ip6-2001-08b0-03f8-8129-e50b-ef10-9192-e044.holly.idiocy.org. [2001:8b0:3f8:8129:e50b:ef10:9192:e044]) by smtp.gmail.com with ESMTPSA id k25sm38070088wrk.11.2017.12.25.17.34.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 25 Dec 2017 17:34:25 -0800 (PST) Date: Tue, 26 Dec 2017 01:34:23 +0000 From: Alan Third To: Philipp Stephani Subject: Re: bug#29837: UTF-16 char display problems and the macOS "character palette" Message-ID: <20171226013423.GB79310@breton.holly.idiocy.org> References: <20171224160053.GA71863@breton.holly.idiocy.org> <83bmiojc8y.fsf@gnu.org> <20171224182321.GA72021@breton.holly.idiocy.org> <834logj6nz.fsf@gnu.org> <20171224192807.GA73590@breton.holly.idiocy.org> <83zi67j4xe.fsf@gnu.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.9.1 (2017-09-22) X-Spam-Score: 0.3 (/) X-Debbugs-Envelope-To: 29837 Cc: Eli Zaretskii , 29837@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.3 (/) On Mon, Dec 25, 2017 at 08:13:55PM +0000, Philipp Stephani wrote: > IIUC Emacs receives the input as a single UTF-16 string (in > insertText), then iterates over the UTF-16 code units, converting > each into an Emacs event. 
That's wrong, no matter whether the input > comes from the character palette or from the keyboard; normal > keyboard layouts just happen to not contain non-BMP characters. The > loop needs to account for surrogates. I finally came to this conclusion myself. I now know a lot more about UTF‐16 than I did yesterday. :) Wish I’d looked at my email earlier, though. > As a small optimization (which is warranted because the function is > probably called on every keystroke), this should use [NSString > getCharacters:range:] to copy all the UTF-16 code units to a buffer > first, to avoid repeated calls to characterAtIndex. Presumably the vast majority of input will consist of just one code unit, though? -- Alan Third From debbugs-submit-bounces@debbugs.gnu.org Sun Jan 07 15:43:20 2018 Received: (at control) by debbugs.gnu.org; 7 Jan 2018 20:43:21 +0000 Received: from localhost ([127.0.0.1]:39855 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eYHmq-00074Y-OS for submit@debbugs.gnu.org; Sun, 07 Jan 2018 15:43:20 -0500 Received: from mail-wr0-f175.google.com ([209.85.128.175]:35080) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1eYHmp-00074M-55 for control@debbugs.gnu.org; Sun, 07 Jan 2018 15:43:19 -0500 Received: by mail-wr0-f175.google.com with SMTP id l16so1160651wrb.2 for ; Sun, 07 Jan 2018 12:43:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20161025; h=sender:date:message-id:to:from:subject; bh=xDtu2hGkmjH1C19hznMjDlmQt2pgZhkh5i1tqtJjBuk=; b=f0lGDR27Owi/RTYWIcWDrTRwhBuBAXLtXY5K1PRHwHODBccuLEzDzb2dnwI8tMDdIg XQ/ihzsELupxjOaSXB2X7U5xpXQGIq8T/huVchjK+FErJuhx+5F2curVkaebDIexN8fW LfnfLiPrZnM2LobBtVie1QL4nmeDTBJ+BagTYaRgRj8Oo+9d40ipXgfV4HnvLUhQhhbx qd+3sQ+F2rPYrX4kJlq2NDDWQ06j5u2wL8G60BlebplENn11Vc9YbRfMqhyCVhZVmcOD 8P2KCy0TO1MPb0BUv8BZmtHJz2f86BMF4M0WS32oe4dxb2OdK2nEMIZhulyhkrZ1VyPd 9qYg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:sender:date:message-id:to:from:subject; bh=xDtu2hGkmjH1C19hznMjDlmQt2pgZhkh5i1tqtJjBuk=; b=sxN4sXs4CE/8A/b86htYnO95AbNqjWqeW4k5eT5V62kHpBOANwfWVp4rxxtkm1xdyh j2Z4at6B8Uy66HtzdvsrRKv5MzlIfstn9akW1GYm0XWwsjZa1770Ac/m0ckl89SweZcQ Yr5HWI6PJLTVeI/hlIG3KZRXSojqdzhJOhqOSFaeXNasJOzm+eN3TYp6o12BvV94l9GY X2uyt8sPKaGZo+S0V1eNg48AsaNj970ybYWRRYcyfH6KSTs4U6R7jzuY7kGHdBjf4AOU pWxuVLHM+s5RLtRlf7j3Nj0LZcd9Wdx9+lclUGzwKd3YbOsxZUP1EK86KUXHkgGKgPA2 WDAQ== X-Gm-Message-State: AKGB3mLIvtgnuj+jON8+j2MSMwkKxNFRsClJn43lF4lsto9JZ31fAw5J YGCf38LhD4BRE7GDxP/vUYjgiTx3 X-Google-Smtp-Source: ACJfBos80ORFdlvMk044hEdxC4YtToVGoTfxchxRn9aEFkCL11doGsCajn09LpczfPt31Y9BAODmzA== X-Received: by 10.223.202.4 with SMTP id o4mr1706233wrh.226.1515357792754; Sun, 07 Jan 2018 12:43:12 -0800 (PST) Received: from breton.holly.idiocy.org (ip6-2001-08b0-03f8-8129-0c9f-d078-0073-e913.holly.idiocy.org. [2001:8b0:3f8:8129:c9f:d078:73:e913]) by smtp.gmail.com with ESMTPSA id q7sm5483794wrf.31.2018.01.07.12.43.11 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sun, 07 Jan 2018 12:43:11 -0800 (PST) Date: Sun, 07 Jan 2018 20:42:55 +0000 Message-Id: To: control@debbugs.gnu.org From: Alan Third Subject: control message for bug #29837 X-Spam-Score: 0.5 (/) X-Debbugs-Envelope-To: control X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: 0.5 (/) tags 29837 fixed close 29837 27.1 From unknown Sun Jun 22 11:32:31 2025 Received: (at fakecontrol) by fakecontrolmessage; To: internal_control@debbugs.gnu.org From: Debbugs Internal Request Subject: Internal Control Message-Id: bug archived. Date: Mon, 05 Feb 2018 12:24:05 +0000 User-Agent: Fakemail v42.6.9 # This is a fake control message. # # The action: # bug archived. 
thanks # This fakemail brought to you by your local debbugs # administrator