GNU bug report logs - #4240
23.1.50; C-u doesn't work with Swedish characters
Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>: bug#4240; Package emacs. (Sun, 23 Aug 2009 13:35:04 GMT)
Acknowledgement sent to Deniz Dogan <deniz.a.m.dogan <at> gmail.com>: New bug report received and forwarded. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Sun, 23 Aug 2009 13:35:04 GMT)
Message #5 received at submit <at> emacsbugs.donarmstrong.com (full text, mbox):
Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:
I hit "C-u ä" expecting it to come out as "ääää". Instead it comes out
as "ä\344\344ä". I try "C-u C-u ä" and it comes out as "ä" followed by
fourteen "\344" and then a trailing "ä". This happens no matter which
kind of repetition I'm doing, be it using C-u or using e.g. M-3. It's
always the leading and the trailing character that come out right, all
of the other ones are "broken".
In GNU Emacs 23.1.50.2 (i686-pc-linux-gnu, GTK+ Version 2.16.5)
of 2009-08-13 on stalin
Windowing system distributor `The X.Org Foundation', version 11.0.10603000
configured using `configure '--without-rsvg' '--without-tiff'
'--without-xpm' '--prefix=/home/deniz/usr''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: C
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: en_US.utf8
value of $XMODIFIERS: nil
locale-coding-system: utf-8-unix
default-enable-multibyte-characters: t
Major mode: Lisp Interaction
Minor modes in effect:
tooltip-mode: t
tool-bar-mode: t
mouse-wheel-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
global-auto-composition-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
Recent input:
C-u ä <return> C-u C-u ä <return> M-5 M-0 ä <return>
C-1 C-0 u <return> C-1 C-0 ä M-x r e p o r t - e m
a c s - b u g f <return> <backspace> <return>
Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Load-path shadows:
None found.
Acknowledgement sent to Juri Linkov <juri <at> jurta.org>: Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Sun, 23 Aug 2009 19:00:05 GMT)
Message #10 received at 4240 <at> emacsbugs.donarmstrong.com (full text, mbox):
> I hit "C-u ä" expecting it to come out as "ääää". Instead it comes out
> as "ä\344\344ä". I try "C-u C-u ä" and it comes out as "ä" followed by
> fourteen "\344" and then a trailing "ä". This happens no matter which
> kind of repetition I'm doing, be it using C-u or using e.g. M-3. It's
> always the leading and the trailing character that come out right, all
> of the other ones are "broken".
Please see bug#4037:
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4037
I received no confirmation that my proposed fix is correct.
Maybe the right fix is to reverse negation? It seems logical to check
if a buffer is unibyte before converting from unibyte to multibyte, but
I don't understand what this code was supposed to do.
Index: src/cmds.c
===================================================================
RCS file: /sources/emacs/emacs/src/cmds.c,v
retrieving revision 1.107
diff -u -r1.107 cmds.c
--- src/cmds.c 13 Jul 2009 01:02:51 -0000 1.107
+++ src/cmds.c 10 Aug 2009 22:54:02 -0000
@@ -337,7 +337,7 @@
           /* Add the offset to the character, for Finsert_char.
              We pass internal_self_insert the unmodified character
              because it itself does this offsetting.  */
-          if (! NILP (current_buffer->enable_multibyte_characters))
+          if (NILP (current_buffer->enable_multibyte_characters))
             modified_char = unibyte_char_to_multibyte (modified_char);
 
           XSETFASTINT (n, XFASTINT (n) - 2);
--
Juri Linkov
http://www.jurta.org/emacs/
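The reported symptom (correct first and last characters, broken ones in between) fits the shape of the code around the patched lines: the quoted comment says internal_self_insert receives the unmodified character, and the count is reduced by 2 before the remaining copies are inserted. The following is a minimal C simulation of that control flow, not Emacs source. The names simulate_repeat, convert_like_unibyte and append_char are hypothetical, and the 0x3FFF00 raw-byte offset is an assumption about Emacs 23 internals.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the conversion applied to the bulk copies.
   ASSUMPTION: bytes 0x80..0xFF are mapped into a raw-byte range by
   adding 0x3FFF00; other values pass through unchanged.  */
static int convert_like_unibyte (int c)
{
  return (c >= 0x80 && c <= 0xFF) ? c + 0x3FFF00 : c;
}

/* Append a printable form of C to the NUL-terminated string OUT.
   Raw bytes are shown as octal escapes, ASCII as-is, and other
   characters as two-byte UTF-8 (only valid for code points < 0x800).  */
static void append_char (char *out, size_t size, int c)
{
  size_t len = strlen (out);
  if (c >= 0x3FFF80)
    snprintf (out + len, size - len, "\\%03o", (unsigned) (c - 0x3FFF00));
  else if (c < 0x80)
    snprintf (out + len, size - len, "%c", c);
  else
    snprintf (out + len, size - len, "%c%c",
              0xC0 | (c >> 6), 0x80 | (c & 0x3F));
}

/* Simulate inserting character C with a repeat count of N: the first
   and last copies are inserted unmodified, the N - 2 copies in between
   go through the conversion.  */
static void simulate_repeat (char *out, size_t size, int c, int n)
{
  assert (size > 0);
  out[0] = '\0';
  if (n < 2)
    {
      for (int i = 0; i < n; i++)
        append_char (out, size, c);
      return;
    }
  append_char (out, size, c);
  for (int i = 0; i < n - 2; i++)
    append_char (out, size, convert_like_unibyte (c));
  append_char (out, size, c);
}
```

Under these assumptions, simulate_repeat (buf, sizeof buf, 0xE4, 4) fills buf with ä\344\344ä, and a count of 16 gives fourteen \344 sequences between two ä, as described in the report. An ASCII character such as u is unaffected because the conversion leaves values below 0x80 alone.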
Acknowledgement sent to Eli Zaretskii <eliz <at> gnu.org>: Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Sun, 23 Aug 2009 20:45:04 GMT)
Message #15 received at 4240 <at> emacsbugs.donarmstrong.com (full text, mbox):
> From: Juri Linkov <juri <at> jurta.org>
> Date: Sun, 23 Aug 2009 21:54:04 +0300
> Cc: 4240 <at> emacsbugs.donarmstrong.com
>
> > I hit "C-u ä" expecting it to come out as "ääää". Instead it comes out
> > as "ä\344\344ä". I try "C-u C-u ä" and it comes out as "ä" followed by
> > fourteen "\344" and then a trailing "ä". This happens no matter which
> > kind of repetition I'm doing, be it using C-u or using e.g. M-3. It's
> > always the leading and the trailing character that come out right, all
> > of the other ones are "broken".
>
> Please see bug#4037:
> http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4037
>
> I received no confirmation that my proposed fix is correct.
I think those two lines are not necessary anymore and should be
removed (together with the comments which explain their need). I
think they belong to the old pre-unicode days when raw eight-bit
characters needed such special treatment.
Handa-san, can you please comment on that?
> Maybe the right fix is to reverse negation?
Why, do you see that the code without these two lines doesn't DTRT when
the characters are inserted into a unibyte buffer? If it works in
both cases, that is evidence that I'm right and this code is not
needed anymore.
> It seems logical to check if a buffer is unibyte before converting
> from unibyte to multibyte, but I don't understand what this code was
> supposed to do.
It was supposed to produce a multibyte character from a unibyte one,
by using a special locale-dependent table that mapped, e.g., 8859-1
encoded Latin-1 characters in the range [128..255] to the
corresponding multibyte codepoints of Latin-1 characters in the
internal representation of characters Emacs 22 used. See the Emacs 22
definition of unibyte_char_to_multibyte in src/charset.c.
Nowadays we don't need that, since we have a special range of
multibyte codepoints for representing unibyte characters in multibyte
buffers and strings, and insert-char and the primitives it calls
already DTRT with them. So there should be no need to do anything
special outside insert-char.
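The "special range of multibyte codepoints" can be illustrated with a small C sketch. The helper names are hypothetical, and the constants are assumptions modeled on what Emacs 23's src/character.h is believed to define (raw bytes 0x80..0xFF occupying 0x3FFF80..0x3FFFFF); they should be checked against the actual header before being relied on.

```c
#include <assert.h>
#include <stdbool.h>

/* ASSUMED constants: raw eight-bit bytes live at the very top of the
   character space, offset by 0x3FFF00 from the byte value.  */
enum
{
  RAW_BYTE_OFFSET = 0x3FFF00,
  MAX_CHAR_ASSUMED = 0x3FFFFF
};

/* Map a raw byte in 0x80..0xFF to its character code in the raw-byte
   range (hypothetical helper).  */
static int byte8_to_char (int byte)
{
  assert (byte >= 0x80 && byte <= 0xFF);
  return byte + RAW_BYTE_OFFSET;
}

/* Return true if C is a raw-byte character rather than a Unicode one.  */
static bool char_byte8_p (int c)
{
  return c >= RAW_BYTE_OFFSET + 0x80 && c <= MAX_CHAR_ASSUMED;
}

/* Inverse of byte8_to_char: recover the byte from a raw-byte character.  */
static int char_to_byte8 (int c)
{
  assert (char_byte8_p (c));
  return c - RAW_BYTE_OFFSET;
}
```

With these definitions, the decoded character ä (U+00E4) and the raw byte 0xE4 (octal 344) are distinct values: char_byte8_p (0xE4) is false, while byte8_to_char (0xE4) gives 0x3FFFE4. Applying the byte conversion to an already decoded ä would therefore turn it into the raw byte shown as \344, which is consistent with the reported output.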
Merged 4037 4240. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> emacsbugs.donarmstrong.com. (Wed, 26 Aug 2009 01:25:06 GMT)
Acknowledgement sent to Eli Zaretskii <eliz <at> gnu.org>: Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Wed, 26 Aug 2009 17:15:15 GMT)
Message #22 received at 4240 <at> emacsbugs.donarmstrong.com (full text, mbox):
Ping!
Acknowledgement sent to Stefan Monnier <monnier <at> iro.umontreal.ca>: Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Thu, 27 Aug 2009 05:10:05 GMT)
Message #27 received at 4240 <at> emacsbugs.donarmstrong.com (full text, mbox):
>> > Please see bug#4037:
>> > http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4037
>> > I received no confirmation that my proposed fix is correct.
>> I think those two lines are not necessary anymore and should be
>> removed (together with the comments which explain their need). I
>> think they belong to the old pre-unicode days when raw eight-bit
>> characters needed such special treatment.
I believe you're right. Nowadays, the keyboard-decoding should always
take place before we get to that point.
Stefan
Acknowledgement sent to Kenichi Handa <handa <at> m17n.org>: Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Thu, 27 Aug 2009 06:30:04 GMT)
Message #32 received at 4240 <at> emacsbugs.donarmstrong.com (full text, mbox):
In article <jwvocq14zlk.fsf-monnier+emacsbugreports <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>>> > Please see bug#4037:
>>> > http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4037
>>> > I received no confirmation that my proposed fix is correct.
>>> I think those two lines are not necessary anymore and should be
>>> removed (together with the comments which explain their need). I
>>> think they belong to the old pre-unicode days when raw eight-bit
>>> characters needed such special treatment.
> I believe you're right. Nowadays, the keyboard-decoding should always
> take place before we get to that point.
Sorry for the late response on this matter. Yes, that
unibyte->multibyte conversion is not necessary. I've just
installed a fix.
---
Kenichi Handa
handa <at> m17n.org
Message #33 received at 4240-done <at> emacsbugs.donarmstrong.com (full text, mbox):
> From: Kenichi Handa <handa <at> m17n.org>
> Cc: eliz <at> gnu.org, 4240 <at> emacsbugs.donarmstrong.com, deniz.a.m.dogan <at> gmail.com
> Date: Thu, 27 Aug 2009 15:23:25 +0900
>
> In article <jwvocq14zlk.fsf-monnier+emacsbugreports <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>
> >>> > Please see bug#4037:
> >>> > http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4037
> >>> > I received no confirmation that my proposed fix is correct.
> >>> I think those two lines are not necessary anymore and should be
> >>> removed (together with the comments which explain their need). I
> >>> think they belong to the old pre-unicode days when raw eight-bit
> >>> characters needed such special treatment.
>
> > I believe you're right. Nowadays, the keyboard-decoding should always
> > take place before we get to that point.
>
> Sorry for the late responce on this matter. Yes, that
> unibyte->multibyte conversion is not necessary. I've just
> installed a fix.
Thanks. I'm closing the two related bug reports.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> emacsbugs.donarmstrong.com. (Fri, 25 Sep 2009 14:24:20 GMT)