GNU bug report logs -
#9608
24.0.50; Emacs lisp reader thinks no-break space is 0x08a0 (should be 0x00a0)
Reported by: "David M. Cooke" <david.m.cooke <at> gmail.com>
Date: Tue, 27 Sep 2011 00:03:02 UTC
Severity: normal
Found in version 24.0.50
Done: Andreas Schwab <schwab <at> linux-m68k.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 9608 in the body.
You can then email your comments to 9608 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#9608; Package emacs. (Tue, 27 Sep 2011 00:03:02 GMT)
Acknowledgement sent to "David M. Cooke" <david.m.cooke <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 27 Sep 2011 00:03:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
After reading through lread.c (I was writing an Emacs Lisp lexer for
syntax highlighting in Pygments), I discovered that it treats the Unicode
character U+08A0 as whitespace (with the comment "NBSP"). I believe this
was meant to be U+00A0 (NO-BREAK SPACE), as the code point U+08A0 has no
character assigned to it yet (it lies between the Samaritan and the
Devanagari blocks).
Additionally, you can see this by running the following Lisp code:
(mapcar (lambda (sym) (string-as-unibyte (symbol-name sym)))
        (read "(a b c\u00a0d e\u08a0f g \u00a0 h i \u08a0 j)"))
This gives the result
("a" "b" "c\302\240d" "e" "f" "g" "\302\240" "h" "i" "j")
where we can see U+00A0 (utf-8: "\302\240") is being treated as a
symbol-constituent character, whereas U+08A0 is whitespace.
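The result above can be reproduced outside Emacs with a rough model of the tokenization. The following Python sketch is not the reader from lread.c; it only splits on a configurable set of whitespace characters and prints non-ASCII bytes in the same octal form as the output above. The names `OLD_WS`, `FIXED_WS`, `split_symbols`, and `octal_utf8` are hypothetical and chosen for illustration.

```python
# Illustrative model only: the real reader does far more than split on
# whitespace. All names below are hypothetical.
BASIC_WS = " \t\n\r\f"
OLD_WS = BASIC_WS + "\u08a0"    # behaviour described in the report
FIXED_WS = BASIC_WS + "\u00a0"  # behaviour the report argues for


def split_symbols(text, whitespace):
    """Split text into tokens, treating each char in `whitespace` as a separator."""
    tokens, current = [], []
    for ch in text:
        if ch in whitespace:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def octal_utf8(s):
    """Show non-ASCII UTF-8 bytes as octal escapes, e.g. U+00A0 -> \\302\\240."""
    return "".join(chr(b) if b < 0x80 else "\\%03o" % b for b in s.encode("utf-8"))


SAMPLE = "a b c\u00a0d e\u08a0f g \u00a0 h i \u08a0 j"

old = [octal_utf8(t) for t in split_symbols(SAMPLE, OLD_WS)]
fixed = [octal_utf8(t) for t in split_symbols(SAMPLE, FIXED_WS)]

print(" ".join(old))    # a b c\302\240d e f g \302\240 h i j
print(" ".join(fixed))  # a b c d e\340\242\240f g h i \340\242\240 j
```

With U+08A0 in the whitespace set, the token list matches the reported output; swapping in U+00A0 makes the no-break space act as a separator instead, and U+08A0 (UTF-8 bytes \340\242\240) becomes part of a symbol name.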
The changes to the whitespace handling were introduced in bzr revision
78902 (on 2007-07-30, which is a few weeks after a discussion about
handling NO-BREAK SPACE on the mailing list). I'm guessing using 0x8a0
was just a thinko.
cheers,
David M. Cooke <david.m.cooke <at> gmail.com>
In GNU Emacs 24.0.50.2 (x86_64-apple-darwin10.7.0, NS apple-appkit-1038.35)
of 2011-05-27 on mars.lan
Windowing system distributor `Apple', version 10.3.1138
configured using `configure '--with-ns''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en_CA.UTF-8
value of $XMODIFIERS: nil
locale-coding-system: utf-8-unix
default enable-multibyte-characters: t
Major mode: Lisp Interaction
Minor modes in effect:
tooltip-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
Recent input:
( s <backspace> m a p c a r SPC ' s y m b o l - n a
m e SPC ( e v a l SPC " ( a SPC b SPC c \ u 0 0 a 0
d SPC e \ u 0 8 a 0 d <backspace> f ) " ) ) C-j q <down-mouse-1>
<mouse-1> # <down-mouse-1> <mouse-1> C-j q <down-mouse-1>
<mouse-1> ' C-e C-j q <backspace> <left> <left> <left>
<left> <left> <left> <left> <left> <left> <left> <left>
<left> <left> <left> <left> <left> <left> <left> <left>
<left> <left> <left> <left> <left> <left> <left> <backspace>
<backspace> " <left> <left> <backspace> <backspace>
<backspace> <backspace> r e a d C-e C-j <up> <left>
<left> <left> <left> <left> SPC g SPC \ u 0 0 a 0 SPC
h SPC i SPC \ u 0 8 a 0 SPC j C-e C-j <escape> x r
e m p o r <backspace> <backspace> <backspace> <backspace>
p o r <tab> <return>
Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Entering debugger...
Back to top level.
Entering debugger...
Back to top level.
Entering debugger...
Back to top level.
Load-path shadows:
None found.
Features:
(shadow sort gnus-util time-date mail-extr message format-spec rfc822
mml mml-sec mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047
rfc2045 ietf-drums mm-util mail-prsvr mailabbrev mail-utils gmm-utils
mailheader emacsbug help-mode easymenu view debug tooltip ediff-hook
vc-hooks lisp-float-type mwheel ns-win tool-bar dnd fontset image fringe
lisp-mode register page menu-bar rfn-eshadow timer select scroll-bar
mouse jit-lock font-lock syntax facemenu font-core frame cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev
minibuffer loaddefs button faces cus-face files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process dbusbind ns
multi-tty emacs)
Reply sent to Andreas Schwab <schwab <at> linux-m68k.org>: You have taken responsibility. (Tue, 27 Sep 2011 08:46:02 GMT)
Notification sent to "David M. Cooke" <david.m.cooke <at> gmail.com>: bug acknowledged by developer. (Tue, 27 Sep 2011 08:46:02 GMT)
Message #10 received at 9608-done <at> debbugs.gnu.org (full text, mbox):
"David M. Cooke" <david.m.cooke <at> gmail.com> writes:
> The changes to the whitespace handling were introduced in bzr revision
> 78902 (on 2007-07-30, which is a few weeks after a discussion about
> handling NO-BREAK SPACE on the mailing list).
That was before the unicode merge.
> I'm guessing using 0x8a0 was just a thinko.
No, it was the correct number at that time, when Emacs used the mule
encoding internally.
Andreas.
--
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
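The point in the reply above is that one numeric literal can name different characters under different internal encodings. The Python sketch below models this under a stated assumption: that the pre-Unicode internal code of a Latin-1 right-half character equals its ISO 8859-1 byte plus 0x800. That offset is an assumption chosen because it reproduces the 0x8a0 value mentioned in this thread; it is not taken from the Emacs sources, and `LEGACY_LATIN1_OFFSET` and `legacy_code` are hypothetical names.

```python
# Assumption for illustration: old internal code = Latin-1 byte + 0x800.
# LEGACY_LATIN1_OFFSET and legacy_code are hypothetical, not Emacs APIs.
LEGACY_LATIN1_OFFSET = 0x800


def legacy_code(latin1_byte):
    """Model of the old internal code for a Latin-1 byte in 0xA0..0xFF."""
    if not 0xA0 <= latin1_byte <= 0xFF:
        raise ValueError("expected a Latin-1 right-half byte")
    return latin1_byte + LEGACY_LATIN1_OFFSET


nbsp_unicode = ord("\u00a0")             # code point in a Unicode-based encoding
nbsp_legacy = legacy_code(nbsp_unicode)  # value under the modelled old encoding

print(hex(nbsp_unicode), hex(nbsp_legacy))  # 0xa0 0x8a0
```

Under this model, a hard-coded 0x8a0 was correct while the old encoding was in use and stopped meaning NO-BREAK SPACE once characters were numbered by their Unicode code points.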
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 25 Oct 2011 11:24:03 GMT)
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.