GNU bug report logs - #35507
Gnus mojibakifies UTF-8 text/x-patch attachments from Thunderbird
Reported by: Paul Eggert <eggert <at> cs.ucla.edu>
Date: Tue, 30 Apr 2019 19:22:02 UTC
Severity: minor
Tags: fixed
Found in version 27
Done: "Basil L. Contovounesios" <contovob <at> tcd.ie>
Bug is archived. No further changes may be made.
Message #56 received at submit <at> debbugs.gnu.org (full text, mbox):
On May 2, 2019 10:17:51 AM GMT+03:00, Andy Moreton <andrewjmoreton <at> gmail.com> wrote:
> On Wed 01 May 2019, Noam Postavsky wrote:
>
> > Eli Zaretskii <eliz <at> gnu.org> writes:
> >
> >>> From: Andy Moreton <andrewjmoreton <at> gmail.com>
> >>> Date: Wed, 01 May 2019 17:42:18 +0100
> >>>
> >>> + (mm-decode-string text 'utf-8))))
> >>
> >> As I said, I'm not sure we should do this, let alone unconditionally
> >> force UTF-8 here, but if we must, why not use decode-coding-string?
> >> Do we really need the mm-* stuff?
> >> Do we really need the mm-* stuff?
> >
> > As far as I can tell, the mm-* version is useful for handling stuff like
> > "UTF-8" as the charset argument (which might be useful if we extract it
> > from the "Content-Type: text/plain; charset=UTF-8" header). If passing
> > 'utf-8, then it's just the same as calling decode-coding-string.
>
> OK, in that case we could indeed just call decode-coding-string.
>
> > For a default if we don't find a charset header, I guess `undecided'
> > would make more sense, right? After all, Emacs already has the coding
> > detection machinery, may as well use it.
>
> Please re-read the original bug report: the problem is with malformed
> messages that do not contain a charset field in the Content-Type
> header.
>
> The one-liner patch changes the default for inline display in the
> Gnus article buffer to assume UTF-8 when nothing is specified, rather
> than just inserting the text without decoding it.
>
> That should result in text that actually is UTF-8 being displayed
> correctly, and no change to plain ASCII. For anything else, the user can
> use the `gnus-mime-view-part-as-charset' command to override the
> default.
>
> AndyM
Using 'undecided' doesn't disable decoding; it just means Emacs will try to detect the correct encoding by looking at the text (not at the charset header). In a UTF-8 locale, we will guess UTF-8 anyway, unless we see invalid sequences.
So yes, I think Noam is right, and 'undecided' is a better alternative here.
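To illustrate the difference between the two defaults discussed above, here is a minimal sketch. It is not the patch that was applied for this bug. The helper name `sketch-decode-part' is hypothetical and not part of Gnus; `decode-coding-string', `encode-coding-string' and `last-coding-system-used' are standard Emacs facilities. What `undecided' detects depends on the bytes and on the user's locale and coding priorities, so no particular result is claimed for that case.

```elisp
;; Hypothetical helper (not part of Gnus): decode the raw bytes of a MIME
;; part whose Content-Type header carries no charset parameter.
(defun sketch-decode-part (raw &optional coding)
  "Decode unibyte string RAW using CODING, defaulting to `undecided'."
  (decode-coding-string raw (or coding 'undecided)))

;; Simulate an undecoded attachment: the UTF-8 bytes of "naïve".
(let ((raw (encode-coding-string "na\u00efve" 'utf-8)))
  (list
   ;; Forcing UTF-8, as the one-liner patch proposes.
   (sketch-decode-part raw 'utf-8)
   ;; Letting Emacs detect the encoding from the bytes themselves.
   (sketch-decode-part raw)
   ;; After decoding, this variable records the coding system that the
   ;; most recent decoding actually used.
   last-coding-system-used))
```

With plain ASCII input both calls return the same text, which matches the observation in the thread that ASCII attachments are unaffected by either choice.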
This bug report was last modified 6 years and 81 days ago.