GNU bug report logs -
#4848
23.1.50; \u and \x in string
Reported by: rms <at> gnu.org
Date: Mon, 2 Nov 2009 05:35:06 UTC
Severity: wishlist
Done: Noam Postavsky <npostavs <at> users.sourceforge.net>
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
Your message dated Mon, 13 Jun 2016 22:45:33 -0400
with message-id <CAM-tV-8c5e=oZri6vDFiqwZW+HrsN14YB4z_M=_h0jjGRKP=ag <at> mail.gmail.com>
and subject line Re: bug#4848: 23.1.50; \u and \x in string
has caused the debbugs.gnu.org bug report #4848,
regarding 23.1.50; \u and \x in string
to be marked as done.
(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)
--
4848: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4848
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
"\ue1" gives the error "Non-hex digit used for Unicode escape".
Why doesn't it work to give the Unicode character á?
Note that \xe1 does not work for this any more.
It gives a different character, which displays as \341 and
is described as follows by C-x =.
Char: \341 (4194273, #o17777741, #x3fffe1, raw-byte) point=442 of 2980 (15%) column=0
That too is confusing, and certainly not documented clearly where \x
is explained. Is there any way to specify unicode e1 with \x?
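For illustration, a minimal sketch of the two behaviors described above. It assumes that `\u` takes exactly four hex digits (which is what the "Non-hex digit" error suggests, since "\ue1" supplies only two) and that a lone "\xe1" is read as a unibyte string. The expressions are meant to be evaluated one at a time, for example with M-: or in *scratch*.

```elisp
;; \u wants four hex digits, so U+00E1 is written with leading zeros.
(string-to-char "\u00e1")               ; => 225 (#xe1)
(multibyte-string-p "\u00e1")           ; => t

;; "\xe1" alone is a unibyte string holding the byte #xe1.  Converting it
;; to multibyte (as happens on insertion into a multibyte buffer) gives
;; the raw-byte character that C-x = reported.
(aref (string-to-multibyte "\xe1") 0)   ; => 4194273 (#x3fffe1)
```

The last value matches the 4194273 shown by C-x = in the report, which is why the character displays as \341 rather than as á.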
In GNU Emacs 23.1.50.4 (mipsel-unknown-linux-gnu, GTK+ Version 2.12.12)
of 2009-08-11 on theobromine2
configured using `configure 'CFLAGS=-O0 -g -Wno-pointer-sign' 'mipsel-unknown-linux-gnu' 'build_alias=mipsel-unknown-linux-gnu' 'host_alias=mipsel-unknown-linux-gnu' 'target_alias=mipsel-unknown-linux-gnu''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en_US.UTF-8
value of $XMODIFIERS: nil
locale-coding-system: utf-8-unix
default-enable-multibyte-characters: t
Major mode: RMAIL Edit
Minor modes in effect:
shell-dirtrack-mode: t
diff-auto-refine-mode: t
gpm-mouse-mode: t
display-battery-mode: t
tooltip-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
global-auto-composition-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
abbrev-mode: t
Recent input:
b R TAB RET ESC < C-u C-n C-u C-u C-n C-u C-n C-n C-n
C-n C-f 4 b o u t C-_ C-x b o u t - 2 2 RET C-a C-p
C-x 4 b R TAB RET C-u ESC x c o m p a r e RET C-x o
C-x o C-x b RET C-b C-b C-b C-b | ESC C-x C-x C-s C-x
b RET C-x o C-b C-b C-x ESC ESC ESC p ESC p RET C-x
o C-x o C-x o C-x C-g C-x 4 b RET C-a ESC f C-f C-@
ESC C-f ESC w ESC : C-y RET C-x o ESC : ( l o o k i
n g - a t SPC C-y ) RET C-x o C-e ESC b ESC d 2 4 0
ESC C-x C-x o ESC : ESC p RET C-x = C-x o o C-_ C-x
o ESC : ESC p C-e ESC DEL ESC DEL ESC DEL " \ 2 4 0
DEL DEL DEL x a 0 " ) RET C-u C-x = C-\ a ' C-g e C-x
= C-f a ' C-b C-x = ESC : ESC p C-e C-b C-b ESC DEL
DEL C-\ a ' C-e RET C-x = ESC : ESC p C-e C-b C-b DEL
\ 3 4 1 RET C-x = ESC : ESC p C-e C-b C-b DEL DEL DEL
x e 1 RET C-x = ESC : ESC p C-e C-b C-b C-b C-b DEL
u C-e RET ESC : ESC p C-e C-b C-b C-b C-b ESC u C-e
RET ESC : ESC p C-e C-b C-b C-b C-b 0 0 C-e RET ESC
x r e p o r t SPC e m a c s SPC b u g RET
Recent messages:
Char: =e1 (225, #o341, #xe1) point=1382 of 28873 (5%) column=57
t
Char: =e1 (225, #o341, #xe1) point=1382 of 28873 (5%) column=57
nil
Char: =e1 (225, #o341, #xe1) point=1382 of 28873 (5%) column=57
nil
Char: =e1 (225, #o341, #xe1) point=1382 of 28873 (5%) column=57
let: Non-hex digit used for Unicode escape [2 times]
t
Source file `/home/rms/emacs-cvs/lisp/mail/emacsbug.el' newer than byte-compiled file
Load-path shadows:
None found.
[Message part 3 (message/rfc822, inline)]
The manual node "Non-ASCII In Strings" now (as of 24.5) says the following, which
explains why "\xN" produces unibyte characters:
You can also use hexadecimal escape sequences (‘\xN’) and octal
escape sequences (‘\N’) in string constants. *But beware:* If a string
constant contains hexadecimal or octal escape sequences, and these
escape sequences all specify unibyte characters (i.e., less than 256),
and there are no other literal non-ASCII characters or Unicode-style
escape sequences in the string, then Emacs automatically assumes that it
is a unibyte string. That is to say, it assumes that all non-ASCII
characters occurring in the string are 8-bit raw bytes.
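The rule quoted above can be checked with a few expressions. This is only a sketch, not part of the manual text: it uses the standard functions `multibyte-string-p` and `string`, and the choice of #xe1 simply follows the original report.

```elisp
;; Only a hex escape below 256: the string is taken to be unibyte.
(multibyte-string-p "\xe1")             ; => nil

;; Add a Unicode-style escape and the whole string becomes multibyte;
;; the \xe1 part is then kept as an 8-bit raw byte, not as U+00E1.
(multibyte-string-p "\xe1\u00e1")       ; => t

;; One way to get the character from its hex code: build the string from
;; the character literal ?\xe1 (the integer 225) instead of a string escape.
(equal (string ?\xe1) "\u00e1")         ; => t
```

Going through the character literal sidesteps the unibyte assumption, because `string` is given a character code rather than a string constant.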
This bug report was last modified 9 years and 36 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.