GNU bug report logs - #20623
XML and HTML files with encoding/charset="utf-8" declaration lose BOM; Coding system is reset from utf-8-with-signature to utf-8 on save
Reported by: Simon Ledergerber <sledergerber <at> gmx.net>
Date: Thu, 21 May 2015 18:53:02 UTC
Severity: normal
Found in version 26.1
Fixed in version 26.2
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
Message #118 received at 20623 <at> debbugs.gnu.org (full text, mbox):
> From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
> Cc: rgm <at> gnu.org, a.s <at> realize.ch, 20623 <at> debbugs.gnu.org, sledergerber <at> gmx.net
> Date: Sat, 11 Aug 2018 20:04:05 -0400
>
> You say that the code I wrote is not needed to make sure an existing
> latin-1-mac setting isn't overwritten by a latin-1 guess. I expect this
> is indeed true (otherwise I think we'd have had bug-reports about it),
> but I don't know where that is handled.
It is handled inside select-safe-coding-system, which first invokes
find-auto-coding to decide which encoding is appropriate (and as part
of that, looks at XML or HTML charset information declared by the
text), and then, if the encoding it got doesn't specify the EOL
conversion, it uses the EOL conversion from the buffer's encoding or
from the appropriate defaults.
Since XML/HTML charset tags never specify the EOL conversion, it
follows that Emacs will never override the EOL conversion of the
buffer; it will only use the declared charset for "text conversion".
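As a rough illustration of the rule described above, here is a toy model in Python. It is not the Emacs implementation: the function names (split_coding, coding_for_save), the string-based representation of coding systems, and the "unix" fallback are all assumptions made for this sketch. It only shows the idea that the declared charset supplies the text conversion while the EOL conversion is inherited from the buffer's encoding or from a default.

```python
# Toy model only; names and behavior here are illustrative assumptions,
# not Emacs source. Coding systems are plain strings with an optional
# EOL suffix, loosely mirroring names such as "utf-8-unix".
EOL_SUFFIXES = ("unix", "dos", "mac")


def split_coding(name):
    """Split 'latin-1-mac' into ('latin-1', 'mac'); EOL is None if absent."""
    for eol in EOL_SUFFIXES:
        suffix = "-" + eol
        if name.endswith(suffix):
            return name[: -len(suffix)], eol
    return name, None


def coding_for_save(declared, buffer_coding, default_eol="unix"):
    """Combine the charset declared in the text with an inherited EOL type."""
    text, eol = split_coding(declared)
    if eol is None:
        # An XML/HTML charset declaration carries no EOL information,
        # so take the EOL conversion from the buffer's current encoding.
        _, eol = split_coding(buffer_coding)
    if eol is None:
        # Neither source specifies an EOL type: fall back to a default.
        eol = default_eol
    return f"{text}-{eol}"


print(coding_for_save("latin-1", "latin-1-mac"))
print(coding_for_save("utf-8", "latin-1-dos"))
```

In this model a buffer already using latin-1-mac keeps its "mac" EOL conversion when the text merely declares latin-1, which is the situation asked about in the quoted message.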
I hope this answers your question.
This bug report was last modified 6 years and 279 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.