GNU bug report logs - #68751
29.1; "\x0e0" is a multibyte string
Message #14 received at 68751 <at> debbugs.gnu.org:
> Date: Sat, 27 Jan 2024 06:46:36 +0000
> From: Christopher Yeleighton <giecrilj <at> stegny.2a.pl>
>
> Info (elisp) Non-ASCII in Strings says:
>
> > If a string constant contains hexadecimal or octal escape sequences, and these
> > escape sequences all specify unibyte characters (i.e., less than 256),
> > and there are no other literal non-ASCII characters or Unicode-style
> > escape sequences in the string, then Emacs automatically assumes that it
> > is a unibyte string.
>
> I believe it should say:
>
> | (i.e., less than 256 and octal or written with 2 hexadecimal digits),
Right. I modified the text to that effect.
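
A minimal sketch of the distinction the corrected wording draws, which can be evaluated with `M-:` or in `*scratch*`. The results in the comments follow the behavior described in this report for Emacs 29.1; other versions are not covered here.

```elisp
;; Two hex digits, value >= #x80: read as a raw byte, so the string is unibyte.
(multibyte-string-p "\xe0")    ; => nil

;; Octal escape below 256: also read as a raw byte, so the string is unibyte.
(multibyte-string-p "\340")    ; => nil

;; Three hex digits: read as the character U+00E0, so the string is multibyte.
(multibyte-string-p "\x0e0")   ; => t
```
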
> and additionally
>
> | Unibyte characters embedded in multibyte string constants evaluate to private character codes,
> | e.g. "\x0a0\xa0" equals "\x0a0\x3fffa0".
I didn't make this change because I don't see how it is useful.
First, "evaluate" is confusing here. Also, "private character codes"
is confusing/incorrect, as it could be interpreted to mean Emacs
somehow uses the PUA of Unicode codespace, which it doesn't. Finally,
when Emacs converts a raw byte from its single-byte representation to
its multibyte representation is an obscure matter, largely defined by
ad-hoc compatibility considerations, and doesn't belong in the ELisp
manual.
I think this bug can be closed now.
This bug report was last modified 1 year and 133 days ago.