GNU bug report logs - #6971
24.0.50.1: non-ascii chars appear as numbers

Package: emacs

Reported by: Andreas Röhler <andreas.roehler <at> easy-emacs.de>

Date: Thu, 2 Sep 2010 10:15:05 UTC

Severity: normal

Found in version 24.0.50.1

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.


Message #11 received at 6971-done <at> debbugs.gnu.org (full text, mbox):

From: Andreas Röhler <andreas.roehler <at> easy-emacs.de>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 6971-done <at> debbugs.gnu.org
Subject: Re: bug#6974: Emacs doesn't like Swedish ä (on w32)
Date: Sat, 04 Sep 2010 10:29:09 +0200
On 04.09.2010 10:16, Eli Zaretskii wrote:
>> Date: Sat, 04 Sep 2010 09:30:55 +0200
>> From: Andreas Röhler <andreas.roehler <at> easy-emacs.de>
>> CC: bug-gnu-emacs <at> gnu.org
>>
>>> Please post the file as an attachment.
>>>
>>
>> Attached.
>
> Thanks.  Here's your culprit:
>
>>> \240 (autoload 'muse-mode "muse-mode" "" t)
>
> You have literal \240 characters in the file, which are invalid UTF-8
> sequences.
>
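
For illustration, a minimal Python sketch (not part of the original exchange) of why a lone \240 byte is rejected: octal 240 is 0xA0, which UTF-8 only permits as a continuation byte, whereas Latin-1 reads it as a no-break space. The sample line is the one quoted above; the helper name is_valid_utf8 is hypothetical.

```python
# Sample line from the quoted file; \240 is an octal escape for byte 0xA0.
raw = b"\240 (autoload 'muse-mode \"muse-mode\" \"\" t)"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(raw))                  # False: 0xA0 cannot start a sequence
print(repr(raw.decode("latin-1")[0]))      # '\xa0', a no-break space in Latin-1
print("\u00a0".encode("utf-8"))            # b'\xc2\xa0', the valid UTF-8 form
```

A no-break space saved as UTF-8 would appear as the two bytes 0xC2 0xA0, so a bare 0xA0 suggests the text was written or pasted in a single-byte encoding.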

Thanks a lot for your efforts.
The question remains how that could happen,
and why Emacs could not prevent it.

I see two possible causes:

- characters from an auto-save file
- something pasted from the net that had some MS encoding, etc.

If that is right, neither case is that uncommon, so I think Emacs
should find a way to deal with them.


> This file has also other similar problems, like this one:
>
>    Du kannst es nat\365\202\211\205\365\200\210\246\357\275\357\275\274rlich auch unter Linux ausprobieren, z.B.:
>
> I believe the 4th word should have been "natürlich", and the invalid
> long byte sequence instead of ü (which Emacs decodes into some
> Japanese Kanji character that cannot be encoded by UTF-8) is the
> result of saving this file multiple times with an incorrect encoding.
>
> To fix all this corruption, I suggest the following steps:
>
>    1) C-x RET c utf-8 RET C-x C-f befehle.txt RET
>
>    2) M-: (unencodable-char-position (point) (point-max) 'utf-8) RET
>
>    3) Go to the position shown by the previous command, and edit the
>       file to replace invalid bytes with valid characters.
>
>    4) Move point past the corrected portion.
>
>    5) Go back to 2.  When unencodable-char-position returns nil, you
>       are done; save the file.
>
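
The loop in steps 2 to 5 can be mirrored outside Emacs as a rough byte-level analogy. The Python sketch below is an assumption-laden illustration, not what unencodable-char-position does internally: Emacs inspects buffer characters, while this scans raw file bytes. The function name invalid_utf8_offsets and the sample strings are hypothetical; the first sample reuses the first four bytes of the corrupted run quoted above.

```python
from typing import List


def invalid_utf8_offsets(data: bytes) -> List[int]:
    """Collect offsets of bytes that break UTF-8 decoding (hypothetical helper)."""
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            # Analogous to step 2: look for the next problem from here on.
            data[pos:].decode("utf-8")
            break  # Analogous to "returns nil": nothing invalid remains.
        except UnicodeDecodeError as err:
            bad = pos + err.start  # err.start is relative to the slice
            offsets.append(bad)    # Step 3 would fix the byte at this offset.
            pos = bad + 1          # Step 4: move past it, then repeat (step 5).
    return offsets


# First four bytes of the corrupted run, embedded in the surrounding word.
corrupted = b"nat\365\202\211\205rlich"
print(invalid_utf8_offsets(corrupted))

# A correctly encoded "natürlich" reports nothing.
print(invalid_utf8_offsets("natürlich".encode("utf-8")))
```

Reporting one offset per bad byte keeps the sketch simple; a real cleanup tool would more likely group adjacent bad bytes into a single run to replace.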
> I'm closing bug #6971 with this message, since there's no Emacs bug
> here.
>

Hm,

as I haven't seen that error for a long time, I still suspect Emacs 24 is
doing something less clever than 23 did.

Andreas




This bug report was last modified 14 years and 264 days ago.
