GNU bug report logs - #2354
23.0.90; Emacs fails to detect utf-8 encoding with language environment Latin-1
Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>: bug#2354; Package emacs. (Tue, 17 Feb 2009 10:45:02 GMT)
Acknowledgement sent to David Engster <deng <at> randomsample.de>: New bug report received and forwarded. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Tue, 17 Feb 2009 10:45:03 GMT)
Message #5 received at submit <at> emacsbugs.donarmstrong.com (full text, mbox):
This is what I believe to be a regression in CVS Emacs since the
23.0.90 pretest. I'm using a fresh CVS checkout from 2009-02-17,
compiled with 'make bootstrap'.
You can reproduce it as follows:
1. emacs -Q
2. M-x set-language-environment RET Latin-1 RET
3. In some buffer write:
(ucs-insert "2500")
4. Eval it, so that the unicode character is inserted into the buffer.
5. Save the file and choose utf-8 as encoding.
6. Kill the buffer.
7. Load the file you just saved.
Result: Emacs displays "â\224\200" for the unicode character.
Expected behaviour: Emacs should detect the utf-8 encoding and display
the correct character.
Please note that this has worked without problems with the Emacs
23.0.90 pretest, so it must be due to some change(s) since then in CVS.
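The reported result matches what one would expect if the UTF-8 bytes of the saved file were read back as Latin-1. The following Python sketch is only an illustration of that byte-level arithmetic; it does not use Emacs, and the variable names are made up for this example.

```python
# U+2500 (BOX DRAWINGS LIGHT HORIZONTAL), the character from step 3.
ch = "\u2500"

# Step 5: saving as utf-8 writes three bytes.
utf8_bytes = ch.encode("utf-8")
print(" ".join(f"{b:02x}" for b in utf8_bytes))  # e2 94 80

# Step 7, faulty outcome: the same bytes read as Latin-1 become
# three separate characters instead of one.
misread = utf8_bytes.decode("latin-1")
print([f"U+{ord(c):04X}" for c in misread])  # ['U+00E2', 'U+0094', 'U+0080']

# The two trailing bytes written in octal, as in the reported "\224\200".
trailing_octal = [oct(b) for b in utf8_bytes[1:]]
print(trailing_octal)  # ['0o224', '0o200']
```

U+00E2 is "â", and the remaining two bytes have no printable Latin-1 glyph, which is consistent with the "â\224\200" shown in the report.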
In GNU Emacs 23.0.90.1 (i686-pc-linux-gnu, GTK+ Version 2.12.11)
of 2009-02-17 on void
Windowing system distributor `The X.Org Foundation', version 11.0.10402000
configured using `configure '--prefix=/usr/local/emacs''
Important settings:
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: nil
value of $XMODIFIERS: nil
locale-coding-system: nil
default-enable-multibyte-characters: t
Major mode: Lisp Interaction
Minor modes in effect:
tooltip-mode: t
tool-bar-mode: t
mouse-wheel-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
global-auto-composition-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
Recent input:
M-x r e p o <tab> r <tab> C-g M-x s e t - l a n <tab>
<return> L a t i n w <backspace> - w <return> <backspace>
1 <return> M-x r e p o <tab> r <tab> <return>
Recent messages:
For information about GNU Emacs and the GNU system, type C-h C-a.
Making completion list...
Quit
Making completion list...
Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>: bug#2354; Package emacs. (Tue, 17 Feb 2009 16:55:04 GMT)
Acknowledgement sent to Juanma Barranquero <lekktu <at> gmail.com>: Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Tue, 17 Feb 2009 16:55:04 GMT)
Message #10 received at 2354 <at> emacsbugs.donarmstrong.com (full text, mbox):
On Tue, Feb 17, 2009 at 11:35, David Engster <deng <at> randomsample.de> wrote:
> You can reproduce it as follows:
>
> 1. emacs -Q
> 2. M-x set-language-environment RET Latin-1 RET
> 3. In some buffer write:
>
> (ucs-insert "2500")
>
> 4. Eval it, so that the unicode character is inserted into the buffer.
> 5. Save the file and choose utf-8 as encoding.
> 6. Kill the buffer.
> 7. Load the file you just saved.
>
> Result: Emacs displays "â\224\200" for the unicode character.
I cannot reproduce it on Windows with the current trunk. The file's
coding is correctly detected as UTF-8.
Juanma
Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>: bug#2354; Package emacs. (Tue, 17 Feb 2009 18:10:04 GMT)
Acknowledgement sent to David Engster <deng <at> randomsample.de>: Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>. (Tue, 17 Feb 2009 18:10:04 GMT)
Message #15 received at 2354 <at> emacsbugs.donarmstrong.com (full text, mbox):
Juanma Barranquero <lekktu <at> gmail.com> writes:
> On Tue, Feb 17, 2009 at 11:35, David Engster <deng <at> randomsample.de> wrote:
>
>> You can reproduce it as follows:
>>
>> 1. emacs -Q
>> 2. M-x set-language-environment RET Latin-1 RET
>> 3. In some buffer write:
>>
>> (ucs-insert "2500")
>>
>> 4. Eval it, so that the unicode character is inserted into the buffer.
>> 5. Save the file and choose utf-8 as encoding.
>> 6. Kill the buffer.
>> 7. Load the file you just saved.
>>
>> Result: Emacs displays "â\224\200" for the unicode character.
>
> I cannot reproduce it on Windows with the current trunk. The file's
> coding is correctly detected as UTF-8.
Thank you for looking into this. I have now tested this again on a
different machine, also running GNU/Linux (Ubuntu 8.10), with the same
result. FWIW, I think I have tracked this issue down to the following
commit to src/coding.c:
revision 1.413
date: 2009-02-09 01:42:37 +0100; author: handa; state: Exp; lines: +1 -1; commitid: WAhpeD8cqX926HBt;
(detect_coding_charset): Fix previous change.
With revision 1.412 of coding.c, the error disappears for me.
-David
Merged 2354 2497. Request was from Jason Rumney <jasonr <at> gnu.org> to control <at> emacsbugs.donarmstrong.com. (Sat, 28 Feb 2009 01:35:07 GMT)
Reply sent to Eli Zaretskii <eliz <at> gnu.org>: You have taken responsibility. (Sat, 28 Feb 2009 12:30:04 GMT)
Notification sent to David Engster <deng <at> randomsample.de>: bug acknowledged by developer. (Sat, 28 Feb 2009 12:30:04 GMT)
Message #22 received at 2354-done <at> emacsbugs.donarmstrong.com (full text, mbox):
> From: David Engster <deng <at> randomsample.de>
> Date: Fri, 27 Feb 2009 18:46:12 +0100
> Cc: emacs-pretest-bug <at> gnu.org, 2497 <at> emacsbugs.donarmstrong.com
>
> Uwe Siart <uwe.siart <at> tum.de> writes:
> > I'm using the windows port of 23.0.91 on Win2k SP4 and I found that it
> > fails to read utf-8 encoded files correctly. When visiting a file in
> > utf-8 encoding all characters above 255 are screwed up and "C-h C RET"
> > indicates iso-latin1-dos for saving the file. This has not been an
> > issue in 23.0.90.
>
> Maybe this is a duplicate of what I reported in
>
> http://debbugs.gnu.org/cgi/bugreport.cgi?bug=2354
>
> As I write later in that bug report, I think I could track down this
> issue to the change in revision 1.413 of src/coding.c. Maybe you could
> try if the same applies to your problem.
Should be fixed by this change:
2009-02-28 Eli Zaretskii <eliz <at> gnu.org>
* coding.c (detect_coding_charset): Fix change from 2008-10-21.
Also, check iso-latin-*, not only iso-8859-*.
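For readers unfamiliar with why detection order matters here: every byte sequence is a valid Latin-1 string, so a detector that accepts Latin-1 before testing for UTF-8 can never report UTF-8. The Python sketch below illustrates that general point only. It is not the logic of detect_coding_charset in coding.c, which this thread does not show, and guess_encoding is a hypothetical helper invented for the example.

```python
def guess_encoding(data: bytes) -> str:
    """Hypothetical helper: prefer UTF-8 whenever the bytes are valid UTF-8."""
    if all(b < 0x80 for b in data):
        return "ascii"
    try:
        # Strict decoding fails on byte sequences that are not well-formed UTF-8.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Fallback: any byte sequence decodes as Latin-1, so this check
        # must come last or it would shadow the UTF-8 test.
        return "latin-1"


print(guess_encoding(b"\xe2\x94\x80"))  # utf-8 (the bytes of U+2500)
print(guess_encoding(b"caf\xe9"))       # latin-1 (a lone 0xE9 is not valid UTF-8)
print(guess_encoding(b"plain"))         # ascii
```

Such a check is a heuristic: some Latin-1 text happens to be well-formed UTF-8 as well, which is why real detectors also weigh the user's language environment and coding priorities.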
Reply sent to Eli Zaretskii <eliz <at> gnu.org>: You have taken responsibility. (Sat, 28 Feb 2009 12:30:04 GMT)
Notification sent to uwe.siart <at> tum.de: bug acknowledged by developer. (Sat, 28 Feb 2009 12:30:04 GMT)
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> emacsbugs.donarmstrong.com. (Wed, 01 Apr 2009 14:24:09 GMT)
This bug report was last modified 16 years and 87 days ago.