GNU bug report logs - #1006
garbled unicode characters in M-x term
Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>: bug#1006; Package emacs.
Acknowledgement sent to Andreas Politz <politza <at> fh-trier.de>: New bug report received and forwarded. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>.
Message #5 received at submit <at> emacsbugs.donarmstrong.com (full text, mbox):
Please write in English if possible, because the Emacs maintainers
usually do not have translators to read other languages for them.
Your bug report will be posted to the bug-gnu-emacs <at> gnu.org mailing list,
and to the gnu.emacs.bug news group.
Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:
Problem: Under certain circumstances, multibyte characters in M-x term
become garbled and display as single-byte escape sequences.
Example: Debian's aptitude (character U+2592)
From a post I made to gnu.emacs.help:
Ok, I think I found the problem. term uses `binary' as its input coding
system. After it has examined the input, it inserts the relevant/visible
parts of it into the buffer. Only at this point does it decode the bytes
with the appropriate coding system (variable: locale-coding-system).
At some point it splits the input string to make it fit the width of
the `terminal'. The problem is that it measures bytes, not characters.
So the 3-byte character in question in aptitude, which is mostly in the
last column, gets split across two strings, one holding 1 byte and the
other 2 bytes. These two strings, when decoded and inserted
independently, result in what was described as the problem.
The solution would be to decode the string before checking its length.
-ap
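The byte-versus-character split described above can be reproduced outside
Emacs. What follows is a minimal Python sketch of the same failure mode, not
code from term.el: WIDTH, the sample line and both function names are
hypothetical, and it assumes one character per column, which does not hold
for wide characters.

```python
# Illustration only: this mimics the failure mode, it is not term.el code.
WIDTH = 4  # hypothetical terminal width, in columns

line = "abc\u2592"          # 4 characters; U+2592 sits in the last column
raw = line.encode("utf-8")  # 6 bytes: b"abc\xe2\x96\x92"


def split_bytes_then_decode(data: bytes, width: int) -> list[str]:
    """Buggy order: cut by byte count, then decode each piece on its own."""
    chunks = [data[i:i + width] for i in range(0, len(data), width)]
    # Bytes that no longer form a valid UTF-8 sequence are rendered as
    # backslash escapes, similar to the garbled display in the report.
    return [c.decode("utf-8", errors="backslashreplace") for c in chunks]


def decode_then_split(data: bytes, width: int) -> list[str]:
    """Proposed order: decode first, then cut by character count."""
    text = data.decode("utf-8")
    return [text[i:i + width] for i in range(0, len(text), width)]


# The 3-byte character is cut after its first byte, so the two pieces
# carry the escaped bytes \xe2 and \x96\x92 instead of one character.
print(split_bytes_then_decode(raw, WIDTH))

# Decoding before measuring keeps the character whole on a single line.
print(decode_then_split(raw, WIDTH))
```

With the buggy order the character is split into a 1-byte and a 2-byte
fragment, matching the description above; with the proposed order the line
fits the width exactly.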
If Emacs crashed, and you have the Emacs process in the gdb debugger,
please include the output from the following gdb commands:
`bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/usr/share/emacs/22.2/etc/DEBUG for instructions.
In GNU Emacs 22.2.1 (i486-pc-linux-gnu, GTK+ Version 2.12.11)
of 2008-07-25 on raven, modified by Debian
Windowing system distributor `The X.Org Foundation', version 11.0.10402000
configured using `configure '--build=i486-linux-gnu' '--host=i486-linux-gnu' '--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--localstatedir=/var/lib' '--infodir=/usr/share/info' '--mandir=/usr/share/man' '--with-pop=yes' '--enable-locallisppath=/etc/emacs22:/etc/emacs:/usr/local/share/emacs/22.2/site-lisp:/usr/local/share/emacs/site-lisp:/usr/share/emacs/22.2/site-lisp:/usr/share/emacs/site-lisp:/usr/share/emacs/22.2/leim' '--with-x=yes' '--with-x-toolkit=gtk' '--with-toolkit-scroll-bars' 'build_alias=i486-linux-gnu' 'host_alias=i486-linux-gnu' 'CFLAGS=-DDEBIAN -g -O2' 'LDFLAGS=-g' 'CPPFLAGS=''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en_US.UTF-8
locale-coding-system: utf-8
default-enable-multibyte-characters: t
Major mode: Fundamental
Minor modes in effect:
shell-dirtrack-mode: t
auto-fill-function: do-auto-fill
show-paren-mode: t
savehist-mode: t
icomplete-mode: t
global-hi-lock-mode: t
hi-lock-mode: t
display-time-mode: t
tooltip-mode: t
mouse-wheel-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
unify-8859-on-encoding-mode: t
utf-translate-cjk-mode: t
auto-compression-mode: t
column-number-mode: t
line-number-mode: t
Recent input:
C-x C-s M-x d i f f SPC u DEL C-g C-x o M-? m C-M-v
C-x k RET C-x C-g M-x d i f f RET RET t e r m . RET
C-x o C-v C-v C-v C-v C-v M-< M-x w o m a n RET d i
f f RET C-v C-v C-v M-v C-r i g n o r e C-r C-g C-x
b t e r C-s C-s C-g C-x o M-x C-g C-u M-x d i f f RET
RET t e r C-s RET w <return> C-x o C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n C-n
C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-p C-x o C-x
o M-< C-x k RET C-x o C-u M-x d i f f RET RET t e r
C-s RET DEL w <return> C-x C-g C-u C-g M-x d i f f
RET RET t e r m . RET C-x o C-v C-v C-v C-v C-v M-v
M-v M-v M-v M-v C-x o C-x C-w ~ / . e m / t e r m .
e l <return> C-x b f o RET C-n C-n C-n C-n C-n C-n
C-n C-n C-n C-n M-x r e p o SPC r t RET g r a <backspace>
<backspace> a r b e l e d DEL DEL DEL DEL l e d C-g
Recent messages:
Repeating command 1 other-window
Quit
Repeating command 1 other-window [2 times]
Saving file /home/andy/.emacs.d/term.el...
Wrote /home/andy/.emacs.d/term.el
Making completion list...
Loading emacsbug...done
Quit
Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>: bug#1006; Package emacs.
Acknowledgement sent to Chong Yidong <cyd <at> stupidchicken.com>: Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>.
Message #10 received at 1006 <at> emacsbugs.donarmstrong.com (full text, mbox):
> Ok, I think I found the problem. term uses `binary' as its input coding
> system. After it has examined the input, it inserts the relevant/visible
> parts of it into the buffer. Only at this point does it decode the bytes
> with the appropriate coding system (variable: locale-coding-system). At
> some point it splits the input string to make it fit the width of the
> `terminal'. The problem is that it measures bytes, not characters. So
> the 3-byte character in question in aptitude, which is mostly in the
> last column, gets split across two strings, one holding 1 byte and the
> other 2 bytes. These two strings, when decoded and inserted
> independently, result in what was described as the problem.
Thanks for the analysis. Could you try to write a patch to fix this?
Reply sent to Chong Yidong <cyd <at> stupidchicken.com>: You have taken responsibility.
Notification sent to Andreas Politz <politza <at> fh-trier.de>: bug acknowledged by developer.
Message #15 received at 1006-done <at> emacsbugs.donarmstrong.com (full text, mbox):
>>>> Thanks for the analysis. Could you try to write a patch to fix
>>>> this?
>>>>
>>> I did. It's a followup in the thread on gnu.emacs.bug.
>>
>> Hmm, I don't see your message. Could you please mail it directly to
>> me?
>
> Sure, here it comes:
The patch looks good. I've installed it into the Emacs CVS trunk, with
a few minor cosmetic changes. Thanks very much for debugging and fixing
this.
bug archived. Request was from Debbugs Internal Request <don <at> donarmstrong.com> to internal_control <at> emacsbugs.donarmstrong.com. (Thu, 23 Oct 2008 14:24:03 GMT)
This bug report was last modified 16 years and 298 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.