GNU bug report logs - #2497
23.0.91; Fails to read UTF-8 on Win2k


Package: emacs

Reported by: uwe.siart <at> tum.de

Date: Fri, 27 Feb 2009 14:20:02 UTC

Severity: normal

Merged with 2354

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.


From: David Engster <deng <at> randomsample.de>
To: uwe.siart <at> tum.de
Cc: 2497 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: bug#2497: 23.0.91; Fails to read UTF-8 on Win2k
Date: Sat, 28 Feb 2009 11:14:16 +0100

Uwe Siart <uwe.siart <at> tum.de> writes:
> Stefan Monnier <monnier <at> iro.umontreal.ca> writes:
>
>> The guessing shouldn't give priority to buffer-file-coding-system.
>> Instead we have set-coding-system-priority for that. And IIUC utf-8
>> should always have a pretty high priority since false positives are
>> fairly rare. So this still looks like a real bug.
>
> Here I would like to note that I never had false positives in the past
> (before 23.0.91) but I do have false positives now. Therefore I'm
> inclined to call it a bug.

I second this - this has worked for years without problems, and suddenly
it fails to detect UTF-8 with a Latin-1 environment.

I have confirmed once again that this behaviour can be traced back to
the following change in detect_coding_charset in coding.c (revision 1.413):

--- coding.c    7 Feb 2009 10:49:39 -0000       1.412
+++ coding.c    9 Feb 2009 00:42:37 -0000       1.413
@@ -5101,7 +5101,7 @@
   valids = AREF (attrs, coding_attr_charset_valids);
   name = CODING_ID_NAME (coding->id);
   if (VECTORP (Vlatin_extra_code_table)
-      && strcmp ((char *) SDATA (SYMBOL_NAME (name)), "iso-8859-"))
+      && strcmp ((char *) SDATA (SYMBOL_NAME (name)), "iso-8859-") == 0)
     check_latin_extra = 1;
   if (! NILP (CODING_ATTR_ASCII_COMPAT (attrs)))
     src += head_ascii;

I'm inclined to say that this change is wrong, since strcmp returns 0
only if the two strings are exactly equal. Here, though, the coding
system name ("iso-8859-1" in my case) is compared to the string
"iso-8859-", so strcmp returns a nonzero value, the "== 0" test fails,
and check_latin_extra is not set.
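
The difference can be seen in a small standalone sketch. It assumes that
the intent of the test was to match any coding system whose name starts
with "iso-8859-"; that intent is a guess, not something the diff states.
The three helper functions are hypothetical and do not exist in coding.c;
they only mirror the two revisions of the condition and a prefix
comparison with strncmp.

```c
#include <string.h>

/* Hypothetical helpers for illustration only; none of these exist in
   coding.c.  Each one returns 1 if check_latin_extra would be set for
   the given coding system name, and 0 otherwise.  */

/* Condition as in revision 1.412: true whenever NAME differs from the
   exact string "iso-8859-", i.e. for practically every real name.  */
static int
test_rev_1_412 (const char *name)
{
  return strcmp (name, "iso-8859-") != 0;
}

/* Condition as in revision 1.413: true only when NAME is exactly
   "iso-8859-", which no complete coding system name such as
   "iso-8859-1" can be.  */
static int
test_rev_1_413 (const char *name)
{
  return strcmp (name, "iso-8859-") == 0;
}

/* Assumed intent: true when NAME begins with "iso-8859-".  Only the
   first strlen (prefix) characters are compared.  */
static int
test_prefix (const char *name)
{
  static const char prefix[] = "iso-8859-";
  return strncmp (name, prefix, strlen (prefix)) == 0;
}
```

With "iso-8859-1" as the name, test_rev_1_413 yields 0 while
test_rev_1_412 and test_prefix yield 1. With "utf-8", test_rev_1_412
still yields 1, whereas test_prefix yields 0, so only the prefix
comparison distinguishes the two names.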

-David

This bug report was last modified 16 years and 140 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
