GNU bug report logs -
#68751
29.1; "\x0e0" is a multibyte string
To reply to this bug, email your comments to 68751 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#68751; Package emacs.
(Sat, 27 Jan 2024 06:31:01 GMT)
Acknowledgement sent to Christopher Yeleighton <giecrilj <at> stegny.2a.pl>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org.
(Sat, 27 Jan 2024 06:31:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
M-: (multibyte-string-p "\x0e0") RET
> t
In GNU Emacs 29.1 (build 1, x86_64-pc-linux-gnu, GTK+ Version 3.24.38,
cairo version 1.17.8)
Windowing system distributor 'The X.Org Foundation', version 11.0.12101010
System Description: Arch Linux
Configured using:
'configure --sysconfdir=/etc --prefix=/usr --libexecdir=/usr/lib
--with-tree-sitter --localstatedir=/var --with-cairo
--disable-build-details --with-harfbuzz --with-libsystemd
--with-modules --with-x-toolkit=gtk3 'CFLAGS=-march=x86-64
-mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2
-Wformat -Werror=format-security -fstack-clash-protection
-fcf-protection -g
-ffile-prefix-map=/build/emacs/src=/usr/src/debug/emacs -flto=auto'
'LDFLAGS=-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto=auto''
Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSYSTEMD LIBXML2 M17N_FLT MODULES NOTIFY INOTIFY
PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF TOOLKIT_SCROLL_BARS
TREE_SITTER WEBP X11 XDBE XIM XINPUT2 XPM GTK3 ZLIB
Important settings:
value of $LANG: pl_PL.UTF-8
locale-coding-system: utf-8-unix
Major mode: Info
Minor modes in effect:
shell-dirtrack-mode: t
tooltip-mode: t
global-eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
isearch-fold-quotes-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
buffer-read-only: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
None found.
Features:
(xref project lpr thai-util thai-word repeat mailalias mailclient
textsec uni-scripts idna-mapping ucs-normalize uni-confusable
textsec-check facemenu shadow sort mail-extr emacsbug message yank-media
puny rfc822 mml mml-sec epa derived epg rfc6068 epg-config gnus-util
mm-decode mm-bodies mm-encode mail-parse rfc2231 mailabbrev gmm-utils
mailheader sendmail rfc2047 rfc2045 ietf-drums mm-util mail-prsvr
mail-utils browse-url url url-proxy url-privacy url-expand url-methods
url-history url-cookie generate-lisp-file url-domsuf url-util url-parse
url-vars mailcap mule-util info tar-mode arc-mode archive-mode sh-script
rx smie treesit executable files-x conf-mode shell pcomplete comint
ansi-osc ansi-color ring dired-aux dired dired-loaddefs noutline outline
icons two-column kmacro debug backtrace find-func face-remap shortdoc
text-property-search cl-extra cl-print erc-lang erc-goodies erc iso8601
auth-source cl-seq eieio eieio-core cl-macs password-cache json map pp
format-spec erc-backend erc-networks byte-opt gv bytecomp byte-compile
erc-common erc-compat erc-loaddefs thingatpt help-fns radix-tree
jka-compr misearch multi-isearch time-date subr-x rfc1345 quail
help-mode cl-loaddefs cl-lib rmc iso-transl tooltip cconv eldoc paren
electric uniquify ediff-hook vc-hooks lisp-float-type elisp-mode mwheel
term/x-win x-win term/common-win x-dnd tool-bar dnd fontset image
regexp-opt fringe tabulated-list replace newcomment text-mode lisp-mode
prog-mode register page tab-bar menu-bar rfn-eshadow isearch easymenu
timer select scroll-bar mouse jit-lock font-lock syntax font-core
term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
theme-loaddefs faces cus-face macroexp files window text-properties
overlay sha1 md5 base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads dbusbind inotify lcms2
dynamic-setting system-font-setting font-render-setting cairo
move-toolbar gtk x-toolkit xinput2 x multi-tty make-network-process
emacs)
Memory information:
((conses 16 514130 81836)
(symbols 48 26162 44)
(strings 32 129312 5724)
(string-bytes 1 2808230)
(vectors 16 59943)
(vector-slots 8 1889204 140248)
(floats 8 953 203)
(intervals 56 27501 657)
(buffers 984 37))
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#68751; Package emacs.
(Sat, 27 Jan 2024 06:47:01 GMT)
Message #8 received at 68751 <at> debbugs.gnu.org (full text, mbox):
Info (elisp) Non-ASCII in Strings says:
> If a string constant contains hexadecimal or octal escape sequences,
> and these escape sequences all specify unibyte characters (i.e., less
> than 256), and there are no other literal non-ASCII characters or
> Unicode-style escape sequences in the string, then Emacs automatically
> assumes that it is a unibyte string.
I believe it should say:
| (i.e., less than 256 and octal or written with 2 hexadecimal digits),
and additionally
| Unibyte characters embedded in multibyte string constants evaluate to
| private character codes,
| e.g. "\x0a0\xa0" equals "\x0a0\x3fffa0".
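The proposed wording can be checked with a minimal sketch, assuming the reader behavior of Emacs 29 as described in this report. The forms are illustrative only and are not taken from the manual; they suggest that, for a code below 256, the number of hexadecimal digits decides whether the reader produces a raw byte or a character:

```elisp
;; One or two hex digits with a value >= #x80: read as a raw byte,
;; so the string constant is unibyte.
(multibyte-string-p "\xe0")    ; ⇒ nil
;; An octal escape below 256 is likewise read as a raw byte.
(multibyte-string-p "\340")    ; ⇒ nil
;; Three or more hex digits: read as the character with that code,
;; so the string constant is multibyte.
(multibyte-string-p "\x0e0")   ; ⇒ t
;; A Unicode-style escape also yields a multibyte string.
(multibyte-string-p "\u00e0")  ; ⇒ t
```

Each form can be evaluated with M-: or in the *scratch* buffer.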
On 27.01.2024 06:31, GNU bug Tracking System wrote:
> Thank you for filing a new bug report with debbugs.gnu.org.
>
> This is an automatically generated reply to let you know your message
> has been received.
>
> Your message is being forwarded to the package maintainers and other
> interested parties for their attention; they will reply in due course.
>
> Your message has been sent to the package maintainer(s):
> bug-gnu-emacs <at> gnu.org
>
> If you wish to submit further information on this problem, please
> send it to 68751 <at> debbugs.gnu.org.
>
> Please do not send mail to help-debbugs <at> gnu.org unless you wish
> to report a problem with the Bug-tracking system.
>
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#68751; Package emacs.
(Sat, 27 Jan 2024 07:40:01 GMT)
Message #11 received at 68751 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 27 Jan 2024 06:23:45 +0000
> From: Christopher Yeleighton <giecrilj <at> stegny.2a.pl>
>
> M-: (multibyte-string-p "\x0e0") RET
>
> > t
Why do you think this is a problem? U+00E0 is à, a non-ASCII
character, so it has a multibyte representation.
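A short sketch of the point made in this reply, assuming Emacs 29 behavior: the three-digit escape is read as the character whose code is #xe0 (à), which takes two bytes in the internal representation, whereas the two-digit form is a single raw byte.

```elisp
;; "\x0e0" holds one character, à, whose code is #xe0.
(aref "\x0e0" 0)                     ; ⇒ 224, i.e. #xe0
(format "U+%04X" (aref "\x0e0" 0))   ; ⇒ "U+00E0"
(length "\x0e0")                     ; ⇒ 1 (one character)
(string-bytes "\x0e0")               ; ⇒ 2 (multibyte representation)
;; "\xe0" is a unibyte string holding the single raw byte #xe0.
(string-bytes "\xe0")                ; ⇒ 1
```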
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#68751; Package emacs.
(Sat, 27 Jan 2024 08:19:02 GMT)
Message #14 received at 68751 <at> debbugs.gnu.org (full text, mbox):
> Date: Sat, 27 Jan 2024 06:46:36 +0000
> From: Christopher Yeleighton <giecrilj <at> stegny.2a.pl>
>
> Info (elisp) Non-ASCII in Strings says:
>
> > If a string constant contains hexadecimal or octal escape sequences,
> > and these escape sequences all specify unibyte characters (i.e., less
> > than 256), and there are no other literal non-ASCII characters or
> > Unicode-style escape sequences in the string, then Emacs automatically
> > assumes that it is a unibyte string.
>
> I believe it should say:
>
> | (i.e., less than 256 and octal or written with 2 hexadecimal digits),
Right. I modified the text to that effect.
> and additionally
>
> | Unibyte characters embedded in multibyte string constants evaluate to
> | private character codes,
> | e.g. "\x0a0\xa0" equals "\x0a0\x3fffa0".
I didn't make this change because I don't see how it is useful.
First, "evaluate" is confusing here. Also, "private character codes"
is confusing/incorrect, as it could be interpreted to mean Emacs
somehow uses the PUA of Unicode codespace, which it doesn't. Finally,
when Emacs converts from a single-byte representation of a raw byte to
its multibyte representation is an obscure matter largely defined by
ad-hoc compatibility considerations, and doesn't belong to the ELisp
manual.
I think this bug can be closed now.
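For readers who want to see the raw-byte conversion discussed above, here is a minimal sketch using standard conversion functions. It assumes a current Emacs, where raw bytes #x80 through #xff map to the character codes #x3fff80 through #x3fffff; it illustrates the mapping and is not proposed manual text.

```elisp
;; A raw byte and its multibyte character code.
(format "#x%x" (unibyte-char-to-multibyte #xa0))      ; ⇒ "#x3fffa0"
(format "#x%x" (multibyte-char-to-unibyte #x3fffa0))  ; ⇒ "#xa0"
;; Converting a unibyte string keeps the raw byte as such a character.
(format "#x%x" (aref (string-to-multibyte "\xa0") 0)) ; ⇒ "#x3fffa0"
;; The comparison quoted from the report, which it states holds:
(equal "\x0a0\xa0" "\x0a0\x3fffa0")
```

The last form is left without an expected value here; the report states that the two strings are equal.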
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#68751; Package emacs.
(Sat, 27 Jan 2024 08:31:02 GMT)
Message #17 received at 68751 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/html, inline)]
bug Marked as fixed in versions 30.1.
Request was from Stefan Kangas <stefankangas <at> gmail.com> to control <at> debbugs.gnu.org.
(Fri, 02 Feb 2024 08:05:02 GMT)
This bug report was last modified 1 year and 133 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.