From unknown Fri Aug 15 14:16:21 2025
Subject: bug#4848: 23.1.50; \u and \x in string
From: Richard Stallman
Reply-To: rms@gnu.org, 4848@debbugs.gnu.org
To: emacs-pretest-bug@gnu.org
Date: Mon, 02 Nov 2009 00:31:17 -0500

"\ue1" gives the error "Non-hex digit used for Unicode escape".
Why doesn't it work to give the Unicode character á?

Note that \xe1 does not work for this any more.  It gives a different
character, which displays as \341 and is described as follows by C-x =.

    Char: \341 (4194273, #o17777741, #x3fffe1, raw-byte) point=442 of 2980 (15%) column=0

That too is confusing, and certainly not documented clearly where \x
is explained.  Is there any way to specify unicode e1 with \x?

In GNU Emacs 23.1.50.4 (mipsel-unknown-linux-gnu, GTK+ Version 2.12.12)
 of 2009-08-11 on theobromine2
configured using `configure 'CFLAGS=-O0 -g -Wno-pointer-sign' 'mipsel-unknown-linux-gnu' 'build_alias=mipsel-unknown-linux-gnu' 'host_alias=mipsel-unknown-linux-gnu' 'target_alias=mipsel-unknown-linux-gnu''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t

Major mode: RMAIL Edit

Minor modes in effect:
  shell-dirtrack-mode: t
  diff-auto-refine-mode: t
  gpm-mouse-mode: t
  display-battery-mode: t
  tooltip-mode: t
  tool-bar-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t
  abbrev-mode: t

Recent input:
b R TAB RET ESC < C-u C-n C-u C-u C-n C-u C-n C-n C-n
C-n C-f 4 b o u t C-_ C-x b o u t - 2 2 RET C-a C-p
C-x 4 b R TAB RET C-u ESC x c o m p a r e RET C-x o
C-x o C-x b RET C-b C-b C-b C-b | ESC C-x C-x C-s C-x
b RET C-x o C-b C-b C-x ESC ESC ESC p ESC p RET C-x
o C-x o C-x o C-x C-g C-x 4 b RET C-a ESC f C-f C-@
ESC C-f ESC w ESC : C-y RET C-x o ESC : ( l o o k i
n g - a t SPC C-y ) RET C-x o C-e ESC b ESC d 2 4 0
ESC C-x C-x o ESC : ESC p RET C-x = C-x o o C-_ C-x
o ESC : ESC p C-e ESC DEL ESC DEL ESC DEL " \ 2 4 0
DEL DEL DEL x a 0 " ) RET C-u C-x = C-\ a ' C-g e C-x
= C-f a ' C-b C-x = ESC : ESC p C-e C-b C-b ESC DEL
DEL C-\ a ' C-e RET C-x = ESC : ESC p C-e C-b C-b DEL
\ 3 4 1 RET C-x = ESC : ESC p C-e C-b C-b DEL DEL DEL
x e 1 RET C-x = ESC : ESC p C-e C-b C-b C-b C-b DEL
u C-e RET ESC : ESC p C-e C-b C-b C-b C-b ESC u C-e
RET ESC : ESC p C-e C-b C-b C-b C-b 0 0 C-e RET ESC
x r e p o r t SPC e m a c s SPC b u g RET

Recent messages:
Char: á (225, #o341, #xe1) point=1382 of 28873 (5%) column=57
t
Char: á (225, #o341, #xe1) point=1382 of 28873 (5%) column=57
nil
Char: á (225, #o341, #xe1) point=1382 of 28873 (5%) column=57
nil
Char: á (225, #o341, #xe1) point=1382 of 28873 (5%) column=57
let: Non-hex digit used for Unicode escape [2 times]
t
Source file `/home/rms/emacs-cvs/lisp/mail/emacsbug.el' newer than byte-compiled file

Load-path shadows:
None found.

From unknown Fri Aug 15 14:16:21 2025
Subject: bug#4848: 23.1.50; \u and \x in string
From: Stefan Monnier
To: rms@gnu.org
Cc: 4848@debbugs.gnu.org, emacs-pretest-bug@gnu.org
Date: Mon, 02 Nov 2009 02:17:10 -0500
In-Reply-To: (Richard Stallman's message of "Mon, 02 Nov 2009 00:31:17 -0500")

> "\ue1" gives the error "Non-hex digit used for Unicode escape".
> Why doesn't it work to give the Unicode character á?

I think you mean \u00e1

> Note that \xe1 does not work for this any more.

Indeed, this refers to the byte 225 rather than to the char 225.

> That too is confusing, and certainly not documented clearly where \x
> is explained.  Is there any way to specify unicode e1 with \x?

\x00e1 also works like \u00e1.


        Stefan

From unknown Fri Aug 15 14:16:21 2025
Subject: bug#4848: 23.1.50; \u and \x in string
From: Jason Rumney
To: Stefan Monnier, 4848@debbugs.gnu.org
CC: rms@gnu.org
Date: Mon, 02 Nov 2009 15:33:49 +0800
Message-ID: <4AEE8B5D.1000505@gnu.org>

Stefan Monnier wrote:
>> "\ue1" gives the error "Non-hex digit used for Unicode escape".
>> Why doesn't it work to give the Unicode character á?
>
> I think you mean \u00e1

I think the error message means "Insufficient hex digits used for
Unicode escape".

>> Note that \xe1 does not work for this any more.
>
> Indeed, this refers to the byte 225 rather than to the char 225.
>
> \x00e1 also works like \u00e1.

That is definitely confusing.

From unknown Fri Aug 15 14:16:21 2025
Subject: bug#4848: 23.1.50; \u and \x in string
From: Richard Stallman
Reply-To: rms@gnu.org, 4848@debbugs.gnu.org
To: Stefan Monnier
CC: 4848@debbugs.gnu.org, emacs-pretest-bug@gnu.org
Date: Tue, 03 Nov 2009 08:39:00 -0500
In-reply-to: (message from Stefan Monnier on Mon, 02 Nov 2009 02:17:10 -0500)

    > "\ue1" gives the error "Non-hex digit used for Unicode escape".
    > Why doesn't it work to give the Unicode character á?

    I think you mean \u00e1

Why shouldn't \ue1 work?

    > Note that \xe1 does not work for this any more.

    Indeed, this refers to the byte 225 rather than to the char 225.

This needs to be documented.  But is it a good meaning for \x?  It
will rarely be useful this way.  Also, is it an incompatible change?

From unknown Fri Aug 15 14:16:21 2025
Subject: bug#4848: 23.1.50; \u and \x in string
From: Stefan Monnier
To: rms@gnu.org
Cc: 4848@debbugs.gnu.org, emacs-pretest-bug@gnu.org
Date: Tue, 03 Nov 2009 09:49:54 -0500
In-Reply-To: (Richard Stallman's message of "Tue, 03 Nov 2009 08:39:00 -0500")

>> "\ue1" gives the error "Non-hex digit used for Unicode escape".
>> Why doesn't it work to give the Unicode character á?
> I think you mean \u00e1
> Why shouldn't \ue1 work?

Because the \u format is \uNNNN with exactly 4 hex digits.

>> Note that \xe1 does not work for this any more.
> Indeed, this refers to the byte 225 rather than to the char 225.
> This needs to be documented.  But is it a good meaning for \x?  It
> will rarely be useful this way.  Also, is it an incompatible change?

I haven't managed to keep track of all the changes w.r.t how we treat
\NNN vs \xMM vs \xMMMMM and how it impacts whether the resulting string
is unibyte or multibyte.  My understanding is that there have been
several incompatible changes in this area (and some of those were
inevitable).
E.g. in Emacs-22:

   ELISP> "\222"
   "\222"
   ELISP> "\xa4"
   "\xa4"
   ELISP> (multibyte-string-p "\222")
   nil
   ELISP> (multibyte-string-p "\xa4")
   t
   ELISP> (multibyte-string-p "\xa45")
   t
   ELISP>

whereas in Emacs-23.1:

   ELISP> "\222"
   "\222"
   ELISP> "\xa4"
   "\244"
   ELISP> (multibyte-string-p "\222")
   nil
   ELISP> (multibyte-string-p "\xa4")
   nil
   ELISP> (multibyte-string-p "\xa45")
   t
   ELISP>

Of course, given the fact that char-numbers have changed, the backward
compatibility of \xNNNN is irrelevant since they do not represent the
same char any more.
        Stefan

From unknown Fri Aug 15 14:16:21 2025
Subject: bug#4848: 23.1.50; \u and \x in string
From: Eli Zaretskii
To: rms@gnu.org, 4848@debbugs.gnu.org
Cc: monnier@iro.umontreal.ca
Date: Tue, 03 Nov 2009 20:35:40 +0200
Message-id: <83tyxbbgn7.fsf@gnu.org>

> From: Richard Stallman
> Date: Tue, 03 Nov 2009 08:39:00 -0500
> Cc: emacs-pretest-bug@gnu.org, 4848@emacsbugs.donarmstrong.com
>
> This needs to be documented.

I'm not sure what you wanted to be documented.  Is the description in
"(elisp)General Escape Syntax" what you were looking for?

From unknown Fri Aug 15 14:16:21 2025
Subject: bug#4848: 23.1.50; \u and \x in string
From: Richard Stallman
Reply-To: rms@gnu.org, 4848@debbugs.gnu.org
To: Stefan Monnier
CC: 4848@debbugs.gnu.org, emacs-pretest-bug@gnu.org
Date: Wed, 04 Nov 2009 20:57:07 -0500
In-reply-to: (message from Stefan Monnier on Tue, 03 Nov 2009 09:49:54 -0500)

    > Why shouldn't \ue1 work?

    Because the \u format is \uNNNN with exactly 4 hex digits.

In other words, "it doesn't work because we decided it shouldn't work".
But why shouldn't it work?  Why shouldn't two digits be allowed?
Is there a good reason not to allow that?

From unknown Fri Aug 15 14:16:21 2025
Subject: bug#4848: 23.1.50; \u and \x in string
From: Richard Stallman
Reply-To: rms@gnu.org, 4848@debbugs.gnu.org
To: Eli Zaretskii
CC: 4848@debbugs.gnu.org, monnier@iro.umontreal.ca
Date: Wed, 04 Nov 2009 20:56:45 -0500
In-reply-to: <83tyxbbgn7.fsf@gnu.org> (message from Eli Zaretskii on Tue, 03 Nov 2009 20:35:40 +0200)
References: <83tyxbbgn7.fsf@gnu.org>

    I'm not sure what you wanted to be documented.  Is the description in
    "(elisp)General Escape Syntax" what you were looking for?

The version I have is from August.  If it has been substantially
improved since then, maybe it is good.  The text from August was
inadequate and even wrong:

      To use hex, write a question mark followed by a backslash, @samp{x},
    and the hexadecimal character code.  You can use any number of hex
    digits, so you can represent any character code in this way.
    Thus, @samp{?\x41} for the character @kbd{A}, @samp{?\x1} for the
    character @kbd{C-a}, and @code{?\x8e0} for the Latin-1 character
    @iftex
    @samp{@`a}.
    @end iftex
    @ifnottex
    @samp{a} with grave accent.
    @end ifnottex

And here is something from Non-ASCII In Strings:

      You can also represent a multibyte non-@acronym{ASCII} character with its
    character code: use a hex escape, @samp{\x@var{nnnnnnn}}, with as many
    digits as necessary.  (Multibyte non-@acronym{ASCII} character codes are all
    greater than 256.)  Any character which is not a valid hex digit
    terminates this construct.  If the next character in the string could be
    interpreted as a hex digit, write @w{@samp{\ }} (backslash and space) to
    terminate the hex escape---for example, @w{@samp{\x8e0\ }} represents
    one character, @samp{a} with grave accent.  @w{@samp{\ }} in a string
    constant is just like backslash-newline; it does not contribute any
    character to the string, but it does terminate the preceding hex escape.

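The escapes described in the quoted manual text can be tried out
directly.  What follows is only a sketch, assuming Emacs 23 or later,
where character codes are Unicode code points; under that assumption
"a with grave accent" is #xe0 (U+00E0), and the #x8e0 in the August
text appears to be the older Emacs 22 code for the same character.

```elisp
;; Sketch only; assumes Emacs 23 or later (Unicode character codes).

;; A hex character escape takes as many hex digits as follow it.
(= ?\x41 ?A)                 ; U+0041, the letter A
(= ?\x1 ?\C-a)               ; control-A
(= ?\xe0 ?\u00e0)            ; a with grave accent is #xe0 here

;; In a string, backslash-space ends the hex escape and adds nothing
;; to the string, so the "1" stays a separate character.
(string= "\x41\ 1" "A1")

;; Without the terminator the following hex digit is absorbed into
;; the escape, giving the single character #x411.
(= (length "\x411") 1)
(= (aref "\x411" 0) #x411)
```

Each form should evaluate to t, for instance with C-x C-e in the
*scratch* buffer.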
From unknown Fri Aug 15 14:16:21 2025
Subject: bug#4848: 23.1.50; \u and \x in string
From: Stefan Monnier
To: rms@gnu.org
Cc: 4848@debbugs.gnu.org
Date: Wed, 04 Nov 2009 21:48:04 -0500
In-Reply-To: (Richard Stallman's message of "Wed, 04 Nov 2009 20:57:07 -0500")

>> Why shouldn't \ue1 work?
> Because the \u format is \uNNNN with exactly 4 hex digits.
> In other words, "it doesn't work because we decided it shouldn't work".
> But why shouldn't it work?  Why shouldn't two digits be allowed?
> Is there a good reason not to allow that?

I think the \u format is taken from C and it doesn't have an "end" like
our \x format has.  So for example "\u11111" means (concat "\u1111" "1").


        Stefan

From debbugs-submit-bounces@debbugs.gnu.org Sun Jul 24 00:51:20 2011
From: Chong Yidong
To: control@debbugs.gnu.org
Subject: severity 4848 wishlist
Date: Sun, 24 Jul 2011 00:51:10 -0400
Message-ID: <87r55g46yp.fsf@stupidchicken.com>

severity 4848 wishlist
thanks

From unknown Fri Aug 15 14:16:21 2025
From: help-debbugs@gnu.org (GNU bug Tracking System)
To: rms@gnu.org
Subject: bug#4848: closed (Re: bug#4848: 23.1.50; \u and \x in string)
Reply-To: 4848@debbugs.gnu.org
Date: Tue, 14 Jun 2016 02:46:02 +0000
Content-Type: multipart/mixed; boundary="----------=_1465872362-13929-1"

This is a multi-part message in MIME format...

------------=_1465872362-13929-1
Content-Type: text/plain; charset="utf-8"

Your bug report

#4848: 23.1.50; \u and \x in string

which was filed against the emacs package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 4848@debbugs.gnu.org.

-- 
4848: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4848
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems

------------=_1465872362-13929-1
Content-Type: message/rfc822

From: Noam Postavsky
Date: Mon, 13 Jun 2016 22:45:33 -0400
Subject: Re: bug#4848: 23.1.50; \u and \x in string
To: 4848-done@debbugs.gnu.org

"Non-ASCII In Strings" now (24.5) says the following which explains
about "\xN" producing unibyte characters.

You can also use hexadecimal escape sequences (‘\xN’) and octal escape sequences (‘\N’) in string constants. *But beware:* If a string constant contains hexadecimal or octal escape sequences, and these escape sequences all specify unibyte characters (i.e., less than 256), and there are no other literal non-ASCII characters or Unicode-style escape sequences in the string, then Emacs automatically assumes that it is a unibyte string. That is to say, it assumes that all non-ASCII characters occurring in the string are 8-bit raw bytes. ------------=_1465872362-13929-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at submit) by emacsbugs.donarmstrong.com; 2 Nov 2009 05:31:20 +0000 X-Spam-Checker-Version: SpamAssassin 3.2.5-bugs.debian.org_2005_01_02 (2008-06-10) on rzlab.ucr.edu X-Spam-Level: X-Spam-Bayes: score:0.5 Bayes not run. spammytokens:Tokens not available. hammytokens:Tokens not available. X-Spam-Status: No, score=-3.3 required=4.0 tests=AWL,FOURLA,IMPRONONCABLE_2 autolearn=no version=3.2.5-bugs.debian.org_2005_01_02 Received: from fencepost.gnu.org (fencepost.gnu.org [140.186.70.10]) by rzlab.ucr.edu (8.14.3/8.14.3/Debian-5) with ESMTP id nA25VIF7028847 for ; Sun, 1 Nov 2009 21:31:19 -0800 Received: from rms by fencepost.gnu.org with local (Exim 4.67) (envelope-from ) id 1N4pVd-0008CS-LQ; Mon, 02 Nov 2009 00:31:17 -0500 Content-Type: text/plain; charset=ISO-8859-15 From: Richard Stallman To: emacs-pretest-bug@gnu.org Subject: 23.1.50; \u and \x in string Reply-to: rms@gnu.org Message-Id: Date: Mon, 02 Nov 2009 00:31:17 -0500 "\ue1" gives the error "Non-hex digit used for Unicode escape". Why doesn't it work to give the Unicode character á? Note that \xe1 does not work for this any more. It gives a different character, which displays as \341 and is described as follows by C-x =.
Char: \341 (4194273, #o17777741, #x3fffe1, raw-byte) point=442 of 2980 (15%) column=0 That too is confusing, and certainly not documented clearly where \x is explained. Is there any way to specify unicode e1 with \x? In GNU Emacs 23.1.50.4 (mipsel-unknown-linux-gnu, GTK+ Version 2.12.12) of 2009-08-11 on theobromine2 configured using `configure 'CFLAGS=-O0 -g -Wno-pointer-sign' 'mipsel-unknown-linux-gnu' 'build_alias=mipsel-unknown-linux-gnu' 'host_alias=mipsel-unknown-linux-gnu' 'target_alias=mipsel-unknown-linux-gnu'' Important settings: value of $LC_ALL: nil value of $LC_COLLATE: nil value of $LC_CTYPE: nil value of $LC_MESSAGES: nil value of $LC_MONETARY: nil value of $LC_NUMERIC: nil value of $LC_TIME: nil value of $LANG: en_US.UTF-8 value of $XMODIFIERS: nil locale-coding-system: utf-8-unix default-enable-multibyte-characters: t Major mode: RMAIL Edit Minor modes in effect: shell-dirtrack-mode: t diff-auto-refine-mode: t gpm-mouse-mode: t display-battery-mode: t tooltip-mode: t tool-bar-mode: t menu-bar-mode: t file-name-shadow-mode: t global-font-lock-mode: t font-lock-mode: t global-auto-composition-mode: t auto-composition-mode: t auto-encryption-mode: t auto-compression-mode: t line-number-mode: t transient-mark-mode: t abbrev-mode: t Recent input: b R TAB RET ESC < C-u C-n C-u C-u C-n C-u C-n C-n C-n C-n C-f 4 b o u t C-_ C-x b o u t - 2 2 RET C-a C-p C-x 4 b R TAB RET C-u ESC x c o m p a r e RET C-x o C-x o C-x b RET C-b C-b C-b C-b | ESC C-x C-x C-s C-x b RET C-x o C-b C-b C-x ESC ESC ESC p ESC p RET C-x o C-x o C-x o C-x C-g C-x 4 b RET C-a ESC f C-f C-@ ESC C-f ESC w ESC : C-y RET C-x o ESC : ( l o o k i n g - a t SPC C-y ) RET C-x o C-e ESC b ESC d 2 4 0 ESC C-x C-x o ESC : ESC p RET C-x = C-x o o C-_ C-x o ESC : ESC p C-e ESC DEL ESC DEL ESC DEL " \ 2 4 0 DEL DEL DEL x a 0 " ) RET C-u C-x = C-\ a ' C-g e C-x = C-f a ' C-b C-x = ESC : ESC p C-e C-b C-b ESC DEL DEL C-\ a ' C-e RET C-x = ESC : ESC p C-e C-b C-b DEL \ 3 4 1 RET C-x = ESC : ESC p C-e 
C-b C-b DEL DEL DEL x e 1 RET C-x = ESC : ESC p C-e C-b C-b C-b C-b DEL u C-e RET ESC : ESC p C-e C-b C-b C-b C-b ESC u C-e RET ESC : ESC p C-e C-b C-b C-b C-b 0 0 C-e RET ESC x r e p o r t SPC e m a c s SPC b u g RET Recent messages: Char: =e1 (225, #o341, #xe1) point=1382 of 28873 (5%) column=57 t Char: =e1 (225, #o341, #xe1) point=1382 of 28873 (5%) column=57 nil Char: =e1 (225, #o341, #xe1) point=1382 of 28873 (5%) column=57 nil Char: =e1 (225, #o341, #xe1) point=1382 of 28873 (5%) column=57 let: Non-hex digit used for Unicode escape [2 times] t Source file `/home/rms/emacs-cvs/lisp/mail/emacsbug.el' newer than byte-compiled file Load-path shadows: None found. ------------=_1465872362-13929-1--