GNU bug report logs - #4240
23.1.50; C-u doesn't work with Swedish characters

Package: emacs;

Reported by: Deniz Dogan <deniz.a.m.dogan <at> gmail.com>

Date: Sun, 23 Aug 2009 13:35:04 UTC

Severity: normal

Merged with 4037

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 4240 in the body.
You can then email your comments to 4240 AT debbugs.gnu.org in the normal way.

Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4240; Package emacs. (Sun, 23 Aug 2009 13:35:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Deniz Dogan <deniz.a.m.dogan <at> gmail.com>:
New bug report received and forwarded. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Sun, 23 Aug 2009 13:35:04 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Deniz Dogan <deniz.a.m.dogan <at> gmail.com>
To: emacs-pretest-bug <at> gnu.org
Subject: 23.1.50; C-u doesn't work with Swedish characters
Date: Sun, 23 Aug 2009 15:28:58 +0200

Please write in English if possible, because the Emacs maintainers
usually do not have translators to read other languages for them.

Your bug report will be posted to the emacs-pretest-bug <at> gnu.org mailing list.

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

I hit "C-u ä" expecting it to come out as "ääää".  Instead it comes out
as "ä\344\344ä".  I try "C-u C-u ä" and it comes out as "ä" followed by
fourteen "\344" and then a trailing "ä".  This happens no matter which
kind of repetition I'm doing, be it using C-u or using e.g. M-3.  It's
always the leading and the trailing character that come out right, all
of the other ones are "broken".
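
The pattern above can be reproduced with a small C model.  It is only a
sketch of the symptom, not Emacs code: the function name `model_display'
and its parameters are hypothetical, and it assumes that C-u supplies a
count of 4, that C-u C-u supplies 16, and that each broken copy is
displayed as the octal escape of the Latin-1 byte for ä (0xE4, which is
344 in octal).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the reported display, not Emacs source code.
   Writes into OUT the text seen after typing a character with prefix
   count COUNT: a correct first copy, COUNT - 2 octal escapes of the
   Latin-1 byte LATIN1, and a correct last copy.  UTF8 is the UTF-8
   encoding of the character.  Returns the number of broken copies.  */
size_t
model_display (unsigned char latin1, const char *utf8, size_t count,
               char *out, size_t out_size)
{
  size_t used = 0;
  size_t broken = 0;

  assert (count >= 2);
  /* Two good copies, four bytes per escape, and a terminating NUL.  */
  assert (out_size >= 2 * strlen (utf8) + 4 * (count - 2) + 1);

  used += (size_t) snprintf (out + used, out_size - used, "%s", utf8);
  for (size_t i = 0; i < count - 2; i++)
    {
      used += (size_t) snprintf (out + used, out_size - used,
                                 "\\%03o", (unsigned) latin1);
      broken++;
    }
  snprintf (out + used, out_size - used, "%s", utf8);
  return broken;
}
```

With a count of 4 the model yields two escapes between the two good
copies, and with a count of 16 it yields fourteen, matching the counts
given in the report.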

If Emacs crashed, and you have the Emacs process in the gdb debugger,
please include the output from the following gdb commands:
    `bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/home/deniz/usr/share/emacs/23.1.50/etc/DEBUG for instructions.


In GNU Emacs 23.1.50.2 (i686-pc-linux-gnu, GTK+ Version 2.16.5)
 of 2009-08-13 on stalin
Windowing system distributor `The X.Org Foundation', version 11.0.10603000
configured using `configure  '--without-rsvg' '--without-tiff'
'--without-xpm' '--prefix=/home/deniz/usr''

Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: C
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_US.utf8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t

Major mode: Lisp Interaction

Minor modes in effect:
  tooltip-mode: t
  tool-bar-mode: t
  mouse-wheel-mode: t
  menu-bar-mode: t
  file-name-shadow-mode: t
  global-font-lock-mode: t
  font-lock-mode: t
  blink-cursor-mode: t
  global-auto-composition-mode: t
  auto-composition-mode: t
  auto-encryption-mode: t
  auto-compression-mode: t
  line-number-mode: t
  transient-mark-mode: t

Recent input:
C-u ä <return> C-u C-u ä <return> M-5 M-0 ä <return>
C-1 C-0 u <return> C-1 C-0 ä M-x r e p o r t - e m
a c s - b u g f <return> <backspace> <return>

Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.

Load-path shadows:
None found.

Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4240; Package emacs. (Sun, 23 Aug 2009 19:00:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Juri Linkov <juri <at> jurta.org>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Sun, 23 Aug 2009 19:00:05 GMT) Full text and rfc822 format available.

Message #10 received at 4240 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Juri Linkov <juri <at> jurta.org>
To: Deniz Dogan <deniz.a.m.dogan <at> gmail.com>
Cc: 4240 <at> debbugs.gnu.org
Subject: Re: bug#4240: 23.1.50; C-u doesn't work with Swedish characters
Date: Sun, 23 Aug 2009 21:54:04 +0300

> I hit "C-u ä" expecting it to come out as "ääää".  Instead it comes out
> as "ä\344\344ä".  I try "C-u C-u ä" and it comes out as "ä" followed by
> fourteen "\344" and then a trailing "ä".  This happens no matter which
> kind of repetition I'm doing, be it using C-u or using e.g. M-3.  It's
> always the leading and the trailing character that come out right, all
> of the other ones are "broken".

Please see bug#4037:
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4037

I received no confirmation that my proposed fix is correct.

Maybe the right fix is to reverse negation?  It seems logical to check
if a buffer is unibyte before converting from unibyte to multibyte, but
I don't understand what this code was supposed to do.

Index: src/cmds.c
===================================================================
RCS file: /sources/emacs/emacs/src/cmds.c,v
retrieving revision 1.107
diff -u -r1.107 cmds.c
--- src/cmds.c	13 Jul 2009 01:02:51 -0000	1.107
+++ src/cmds.c	10 Aug 2009 22:54:02 -0000
@@ -337,7 +337,7 @@
 	/* Add the offset to the character, for Finsert_char.
 	   We pass internal_self_insert the unmodified character
 	   because it itself does this offsetting.  */
-	if (! NILP (current_buffer->enable_multibyte_characters))
+	if (NILP (current_buffer->enable_multibyte_characters))
 	  modified_char = unibyte_char_to_multibyte (modified_char);
 
 	XSETFASTINT (n, XFASTINT (n) - 2);

-- 
Juri Linkov
http://www.jurta.org/emacs/

Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4240; Package emacs. (Sun, 23 Aug 2009 20:45:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Eli Zaretskii <eliz <at> gnu.org>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Sun, 23 Aug 2009 20:45:04 GMT) Full text and rfc822 format available.

Message #15 received at 4240 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Juri Linkov <juri <at> jurta.org>, 4240 <at> debbugs.gnu.org
Cc: deniz.a.m.dogan <at> gmail.com, handa <at> m17n.org
Subject: Re: bug#4240: 23.1.50; C-u doesn't work with Swedish characters
Date: Sun, 23 Aug 2009 23:40:00 +0300

> From: Juri Linkov <juri <at> jurta.org>
> Date: Sun, 23 Aug 2009 21:54:04 +0300
> Cc: 4240 <at> emacsbugs.donarmstrong.com
> 
> > I hit "C-u ä" expecting it to come out as "ääää".  Instead it comes out
> > as "ä\344\344ä".  I try "C-u C-u ä" and it comes out as "ä" followed by
> > fourteen "\344" and then a trailing "ä".  This happens no matter which
> > kind of repetition I'm doing, be it using C-u or using e.g. M-3.  It's
> > always the leading and the trailing character that come out right, all
> > of the other ones are "broken".
> 
> Please see bug#4037:
> http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4037
> 
> I received no confirmation that my proposed fix is correct.

I think those two lines are not necessary anymore and should be
removed (together with the comments which explain their need).  I
think they belong to the old pre-unicode days when raw eight-bit
characters needed such special treatment.

Handa-san, can you please comment on that?

> Maybe the right fix is to reverse negation?

Why, do you see that the code without these two lines doesn't DTRT when
the characters are inserted into a unibyte buffer?  If it works in
both cases, that's evidence that I'm right and this code is not
needed anymore.

> It seems logical to check if a buffer is unibyte before converting
> from unibyte to multibyte, but I don't understand what this code was
> supposed to do.

It was supposed to produce a multibyte character from a unibyte one,
by using a special locale-dependent table that mapped, e.g., 8859-1
encoded Latin-1 characters in the range [128..255] to the
corresponding multibyte codepoints of Latin-1 characters in the
internal representation of characters Emacs 22 used.  See the Emacs 22
definition of unibyte_char_to_multibyte in src/charset.c.

Nowadays we don't need that, since we have a special range of
multibyte codepoints for representing unibyte characters in multibyte
buffers and strings, and insert-char and the primitives it calls
already DTRT with them.  So there should be no need to do anything
special outside insert-char.
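
A rough C sketch of the raw-byte range mentioned above may make this
concrete.  The constants are assumptions modeled on the macros in
Emacs 23's src/character.h (raw bytes 0x80..0xFF occupying codepoints
0x3FFF80..0x3FFFFF); the function names are hypothetical stand-ins, not
the Emacs API.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed constants, modeled on Emacs 23's character.h.  */
#define MAX_UNICODE_LIKE_CHAR 0x3FFF7F  /* last non-raw-byte codepoint */
#define MAX_CHAR              0x3FFFFF  /* last codepoint overall */
#define RAW_BYTE_OFFSET       0x3FFF00  /* raw byte B lives at B + offset */

/* Return true if C lies in the range reserved for raw 8-bit bytes.  */
bool
char_is_raw_byte (int c)
{
  return c > MAX_UNICODE_LIKE_CHAR && c <= MAX_CHAR;
}

/* Map a raw byte in 0x80..0xFF to its reserved multibyte codepoint.  */
int
raw_byte_to_char (int byte)
{
  assert (byte >= 0x80 && byte <= 0xFF);
  return byte + RAW_BYTE_OFFSET;
}

/* Inverse of raw_byte_to_char.  */
int
char_to_raw_byte (int c)
{
  assert (char_is_raw_byte (c));
  return c - RAW_BYTE_OFFSET;
}
```

Under these assumptions, an ä that has already been decoded from the
keyboard arrives as U+00E4.  Passing it through a byte-to-raw-byte
conversion would turn it into 0x3FFFE4, which a multibyte buffer shows
as \344.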

Merged 4037 4240. Request was from Glenn Morris <rgm <at> gnu.org> to control <at> emacsbugs.donarmstrong.com. (Wed, 26 Aug 2009 01:25:06 GMT) Full text and rfc822 format available.

Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4240; Package emacs. (Wed, 26 Aug 2009 17:15:14 GMT) Full text and rfc822 format available.

Acknowledgement sent to Eli Zaretskii <eliz <at> gnu.org>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Wed, 26 Aug 2009 17:15:15 GMT) Full text and rfc822 format available.

Message #22 received at 4240 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: 4240 <at> debbugs.gnu.org, handa <at> m17n.org
Cc: juri <at> jurta.org, deniz.a.m.dogan <at> gmail.com
Subject: Re: bug#4240: 23.1.50; C-u doesn't work with Swedish characters
Date: Wed, 26 Aug 2009 20:08:40 +0300

Ping!

> Date: Sun, 23 Aug 2009 23:40:00 +0300
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: deniz.a.m.dogan <at> gmail.com
> 
> > From: Juri Linkov <juri <at> jurta.org>
> > Date: Sun, 23 Aug 2009 21:54:04 +0300
> > Cc: 4240 <at> emacsbugs.donarmstrong.com
> > 
> > > I hit "C-u ä" expecting it to come out as "ääää".  Instead it comes out
> > > as "ä\344\344ä".  I try "C-u C-u ä" and it comes out as "ä" followed by
> > > fourteen "\344" and then a trailing "ä".  This happens no matter which
> > > kind of repetition I'm doing, be it using C-u or using e.g. M-3.  It's
> > > always the leading and the trailing character that come out right, all
> > > of the other ones are "broken".
> > 
> > Please see bug#4037:
> > http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4037
> > 
> > I received no confirmation that my proposed fix is correct.
> 
> I think those two lines are not necessary anymore and should be
> removed (together with the comments which explain their need).  I
> think they belong to the old pre-unicode days when raw eight-bit
> characters needed such special treatment.
> 
> Handa-san, can you please comment on that?
> 
> > Maybe the right fix is to reverse negation?
> 
> Why, do you see that the code without these two lines doesn't DTRT when
> the characters are inserted into a unibyte buffer?  If it works in
> both cases, that's evidence that I'm right and this code is not
> needed anymore.
> 
> > It seems logical to check if a buffer is unibyte before converting
> > from unibyte to multibyte, but I don't understand what this code was
> > supposed to do.
> 
> It was supposed to produce a multibyte character from a unibyte one,
> by using a special locale-dependent table that mapped, e.g., 8859-1
> encoded Latin-1 characters in the range [128..255] to the
> corresponding multibyte codepoints of Latin-1 characters in the
> internal representation of characters Emacs 22 used.  See the Emacs 22
> definition of unibyte_char_to_multibyte in src/charset.c.
> 
> Nowadays we don't need that, since we have a special range of
> multibyte codepoints for representing unibyte characters in multibyte
> buffers and strings, and insert-char and the primitives it calls
> already DTRT with them.  So there should be no need to do anything
> special outside insert-char.
>

Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4240; Package emacs. (Thu, 27 Aug 2009 05:10:05 GMT) Full text and rfc822 format available.

Acknowledgement sent to Stefan Monnier <monnier <at> iro.umontreal.ca>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Thu, 27 Aug 2009 05:10:05 GMT) Full text and rfc822 format available.

Message #27 received at 4240 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 4240 <at> debbugs.gnu.org, handa <at> m17n.org, deniz.a.m.dogan <at> gmail.com
Subject: Re: bug#4240: 23.1.50; C-u doesn't work with Swedish characters
Date: Thu, 27 Aug 2009 01:04:56 -0400

>> > Please see bug#4037:
>> > http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4037
>> > I received no confirmation that my proposed fix is correct.
>> I think those two lines are not necessary anymore and should be
>> removed (together with the comments which explain their need).  I
>> think they belong to the old pre-unicode days when raw eight-bit
>> characters needed such special treatment.

I believe you're right.  Nowadays, the keyboard-decoding should always
take place before we get to that point.


        Stefan

Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#4240; Package emacs. (Thu, 27 Aug 2009 06:30:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Kenichi Handa <handa <at> m17n.org>:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Thu, 27 Aug 2009 06:30:04 GMT) Full text and rfc822 format available.

Message #32 received at 4240 <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Kenichi Handa <handa <at> m17n.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: eliz <at> gnu.org, 4240 <at> debbugs.gnu.org, deniz.a.m.dogan <at> gmail.com
Subject: Re: bug#4240: 23.1.50; C-u doesn't work with Swedish characters
Date: Thu, 27 Aug 2009 15:23:25 +0900

In article <jwvocq14zlk.fsf-monnier+emacsbugreports <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

>>> > Please see bug#4037:
>>> > http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4037
>>> > I received no confirmation that my proposed fix is correct.
>>> I think those two lines are not necessary anymore and should be
>>> removed (together with the comments which explain their need).  I
>>> think they belong to the old pre-unicode days when raw eight-bit
>>> characters needed such special treatment.

> I believe you're right.  Nowadays, the keyboard-decoding should always
> take place before we get to that point.

Sorry for the late response on this matter.  Yes, that
unibyte->multibyte conversion is not necessary.  I've just
installed a fix.

---
Kenichi Handa
handa <at> m17n.org

Message #33 received at 4240-done <at> emacsbugs.donarmstrong.com (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Kenichi Handa <handa <at> m17n.org>
Cc: monnier <at> iro.umontreal.ca, 4240-done <at> debbugs.gnu.org,
        deniz.a.m.dogan <at> gmail.com, 4037-done <at> debbugs.gnu.org
Subject: Re: bug#4240: 23.1.50; C-u doesn't work with Swedish characters
Date: Fri, 28 Aug 2009 11:52:21 +0300

> From: Kenichi Handa <handa <at> m17n.org>
> Cc: eliz <at> gnu.org, 4240 <at> emacsbugs.donarmstrong.com, deniz.a.m.dogan <at> gmail.com
> Date: Thu, 27 Aug 2009 15:23:25 +0900
> 
> In article <jwvocq14zlk.fsf-monnier+emacsbugreports <at> gnu.org>, Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
> 
> >>> > Please see bug#4037:
> >>> > http://debbugs.gnu.org/cgi/bugreport.cgi?bug=4037
> >>> > I received no confirmation that my proposed fix is correct.
> >>> I think those two lines are not necessary anymore and should be
> >>> removed (together with the comments which explain their need).  I
> >>> think they belong to the old pre-unicode days when raw eight-bit
> >>> characters needed such special treatment.
> 
> > I believe you're right.  Nowadays, the keyboard-decoding should always
> > take place before we get to that point.
> 
> Sorry for the late response on this matter.  Yes, that
> unibyte->multibyte conversion is not necessary.  I've just
> installed a fix.

Thanks.  I'm closing the two related bug reports.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> emacsbugs.donarmstrong.com. (Fri, 25 Sep 2009 14:24:20 GMT) Full text and rfc822 format available.

This bug report was last modified 15 years and 327 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
