GNU bug report logs - #68971
Innocent file renders crazy


Package: emacs

Reported by: Dan Jacobson <jidanni <at> jidanni.org>

Date: Wed, 7 Feb 2024 14:19:01 UTC

Severity: normal

Tags: notabug

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.



Message #18 received at 68971-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Dan Jacobson <jidanni <at> jidanni.org>
Cc: 68971-done <at> debbugs.gnu.org
Subject: Re: bug#68971: Innocent file renders crazy
Date: Thu, 08 Feb 2024 08:03:01 +0200
> From: Dan Jacobson <jidanni <at> jidanni.org>
> Cc: 68971 <at> debbugs.gnu.org
> Date: Thu, 08 Feb 2024 05:46:35 +0800
> 
> OK, you are entirely right. It is all the file's fault and not emacs's.
> 
> But on the other hand I wouldn't get far telling the Google Chrome team
> they should stop overriding charset declarations just to make things
> render good.
> 
> In the end it's the emacs users who end up not being able to read the
> document.
> 
> Maybe have some warning "wrong charset detected, proceed? [y,n,(a)utofix...]"

How can Emacs know, up front, that the charset is wrong?  In general,
when a file claims some specific charset or encoding, Emacs believes
that and obeys.  The "gibberish" is in the eyes of the beholder; Emacs
doesn't really understand human-readable text, and so doesn't know
whether what it presents is legible text or garbage caused by wrong
decoding.
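To illustrate why a decoder cannot tell on its own, here is a minimal Python sketch (not Emacs code). The sample bytes are a hypothetical stand-in for the contents of a mislabeled file. Decoding plain ASCII bytes as UTF-16 raises no error; it just yields different, perfectly valid characters:

```python
# Hypothetical sample: plain ASCII bytes, as a mislabeled file might contain.
raw = b"metadata"

# Decoding as UTF-16 (little-endian) raises no error: each pair of bytes
# is simply taken as one 16-bit code unit.
wrong = raw.decode("utf-16-le")
right = raw.decode("ascii")

# Eight ASCII characters collapse into four unrelated code points.
print(len(right), len(wrong))
print([hex(ord(ch)) for ch in wrong])
```

Both decodings succeed, so nothing at the byte level signals which one a human reader would consider legible.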

> Else well, all the other users in the room are proceeding with their
> homework assignment, except Ralph, who uses emacs, which has gibberish
> on its screen, with no warnings.

What I did when I saw gibberish was to visit the file literally (as in
"M-x find-file-literally"), then, when I saw it was plain ASCII,
looked at its preamble, where I saw UTF-16, which explained why "C-x C-f"
shows gibberish.  So when something like this happens, my suggestion
is:

  . M-x find-file-literally
  . look at the literal display: if it is readable, you can just
    proceed with your home assignment
  . alternatively, force Emacs to visit with the correct encoding, as
    in "C-x RET c utf-8 RET C-x C-f metadata.html RET"

The "utf-8" part above was a guess, based on looking at the file when
visited literally; you may need to guess again if the results are not
good enough.  See the node "Text Coding" in the Emacs user manual for
more about these facilities.
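The steps above can be sketched outside Emacs as well. The following Python fragment is only a rough analogue under stated assumptions: the function names, the sample preamble, and the "all bytes are printable ASCII" test are hypothetical illustrations, not anything Emacs does internally. It reads the raw bytes (the "literal" view), and if they are plain ASCII it decodes them with a guessed encoding instead of the declared one:

```python
def looks_like_plain_ascii(raw: bytes) -> bool:
    """Rough stand-in for eyeballing the literal view: every byte is
    printable 7-bit ASCII or common whitespace (so no NULs, no BOM)."""
    return all(b in (9, 10, 13) or 32 <= b < 127 for b in raw)


def decode_with_guess(raw: bytes, declared: str, guess: str = "utf-8") -> str:
    """Use the guessed encoding when the bytes are plain ASCII;
    otherwise trust the declared encoding."""
    if looks_like_plain_ascii(raw):
        return raw.decode(guess)
    return raw.decode(declared)


# Hypothetical file contents: ASCII bytes whose preamble claims UTF-16.
raw = b'<?xml version="1.0" encoding="UTF-16"?>\n<p>homework</p>\n'
text = decode_with_guess(raw, declared="utf-16")
print(text)
```

As with the "utf-8" guess in the key sequence above, the fallback encoding here is a guess; a different guess may be needed if the result is still not readable.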

And with that, I'm closing this bug.




This bug report was last modified 1 year and 133 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.