GNU bug report logs - #19393
25.0.50; Emacs cannot determine coding system of ISO-8859 encoded files
Reported by: Tassilo Horn <tsdh <at> gnu.org>
Date: Tue, 16 Dec 2014 15:22:02 UTC
Severity: normal
Found in version 25.0.50
Done: Lars Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Message #11 received at 19393 <at> debbugs.gnu.org (full text, mbox):
> Date: Tue, 16 Dec 2014 18:05:38 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 19393 <at> debbugs.gnu.org
>
> > From: Tassilo Horn <tsdh <at> gnu.org>
> > Date: Tue, 16 Dec 2014 16:21:10 +0100
> >
> > ftp://ftp.fu-berlin.de/pub/misc/movies/database/movies.list.gz
> >
> > which contains all movies known to the international movie database
> > (IMDb.com). When I open that file using "emacs -Q movies.list.gz" (or
> > unzip it first) and then do M-x describe-coding-system I can see that it
> > is "t -- raw-text-unix". As a result of this, the last movie in that
> > file is displayed as "\374\347 (2012) 2012".
> >
> > However, according to the `file' command, the file is plain ISO-8859.
>
> Looks like some kind of bug, although with such a large file, it's not
> easy to be sure.
Actually, I don't think this is a bug. There are ISO-8859-15
characters in that file that are not part of ISO-8859-1, so Emacs will
not detect that encoding unless either (a) your locale dictates that
encoding, or (b) you change the preferences to prefer ISO-8859-15.
This is so with any 8-bit encoding -- Emacs cannot easily distinguish
between them, and needs some guidance.
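
The ambiguity between 8-bit encodings can be illustrated outside Emacs. The following is a minimal Python sketch, not a description of Emacs's detection code: it takes the two bytes quoted in the report (\374\347 in octal) and decodes them under three ISO-8859 variants. The codec names and the variables `raw`, `decoded`, and `differing` are choices made for this illustration only.

```python
# Raw bytes of the last title as shown in the report: "\374\347 (2012)".
raw = bytes([0o374, 0o347]) + b" (2012)"

# Candidate 8-bit encodings (chosen for illustration; not a list Emacs uses).
candidates = ["iso8859-1", "iso8859-15", "iso8859-9"]

# Each codec assigns a character to every byte value, so decoding
# cannot fail and gives no hint about which encoding was intended.
decoded = {name: raw.decode(name) for name in candidates}
for name, text in decoded.items():
    print(name, repr(text))

# Byte values at which ISO-8859-1 and ISO-8859-15 disagree.
differing = [
    b for b in range(256)
    if bytes([b]).decode("iso8859-1") != bytes([b]).decode("iso8859-15")
]
print([hex(b) for b in differing])
```

All three codecs accept the input, and for these particular bytes they even agree on the text. ISO-8859-1 and ISO-8859-15 differ only at a handful of byte values, and nothing in the byte stream itself says which reading is meant, so a decoder needs outside guidance such as the locale or an explicit preference.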
This bug report was last modified 4 years and 334 days ago.