GNU bug report logs - #20623
XML and HTML files with encoding/charset="utf-8" declaration lose BOM; Coding system is reset from utf-8-with-signature to utf-8 on save

Package: emacs

Reported by: Simon Ledergerber <sledergerber <at> gmx.net>

Date: Thu, 21 May 2015 18:53:02 UTC

Severity: normal

Found in version 26.1

Fixed in version 26.2

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

Message #8 received at 20623 <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Simon Ledergerber <sledergerber <at> gmx.net>
Cc: 20623 <at> debbugs.gnu.org
Subject: Re: bug#20623: XML and HTML files with encoding/charset="utf-8" declaration lose BOM; Coding system is reset from utf-8-with-signature to utf-8 on save
Date: Thu, 21 May 2015 22:48:31 +0300

> Date: Thu, 21 May 2015 20:50:58 +0200
> From: Simon Ledergerber <sledergerber <at> gmx.net>
> 
> When I was editing XHTML and HTML files, I wanted to make sure the BOM 
> was written out to the file in order to make it easier for the browser 
> to detect the UTF-8 encoding. Therefore I changed the coding system for 
> the file buffer to utf-8-with-signature-dos (since I am working on a 
> Windows System) before saving the file.
> 
> After some time I was surprised that the browser (IE11) didn't 
> report UTF-8 as the file's encoding. Checking a hexdump of my 
> (X)HTML file, I saw that the BOM was indeed missing.
> 
> Apparently, when a "UTF-8" string appears in the <meta charset="utf-8"> 
> (even if commented out, see below) or <?xml version="1.0" 
> encoding="utf-8"?> declaration, Emacs switches the file coding system to 
> utf-8 when it saves the file, even if utf-8-with-signature was 
> specified explicitly before. This looks like a bug to me, because there 
> is then no way to restore the BOM using Emacs.
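
The hexdump check described above can be reproduced with a few lines of Python. This is only a sketch for verifying a saved file from outside Emacs; the helper name `has_utf8_bom` is hypothetical and not part of the report.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 signature (bytes EF BB BF)."""
    with open(path, "rb") as f:
        # codecs.BOM_UTF8 is the three-byte sequence b"\xef\xbb\xbf".
        return f.read(3) == codecs.BOM_UTF8
```

Calling `has_utf8_bom("index.html")` (the file name is an assumption) after saving would show whether the signature survived the save.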

What would you expect Emacs to do instead?  It just obeys the stated
encoding, which says nothing about the BOM.  How can Emacs know when
to use utf-8 and when utf-8-with-signature?
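
The point that the declaration says nothing about the BOM can be illustrated outside Emacs. The sketch below uses Python's `utf-8` and `utf-8-sig` codecs as rough analogues of the Emacs coding systems utf-8 and utf-8-with-signature; that mapping is an assumption made for illustration, and the code does not show how Emacs itself decides.

```python
import codecs

# The same document text, with the same encoding declaration.
DOC = '<?xml version="1.0" encoding="utf-8"?>\n<root/>\n'

without_bom = DOC.encode("utf-8")      # no signature written
with_bom = DOC.encode("utf-8-sig")     # signature (BOM) prepended

# The two byte strings differ only by the three-byte signature.
assert with_bom == codecs.BOM_UTF8 + without_bom

# Both decode to identical text, so the declaration inside the file
# cannot tell a reader (or an editor) which variant was intended.
assert without_bom.decode("utf-8-sig") == DOC
assert with_bom.decode("utf-8-sig") == DOC
```

Because both byte sequences carry an identical `encoding="utf-8"` declaration, a tool that picks the coding system from the declaration alone has no information about whether a signature is wanted.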




This bug report was last modified 6 years and 279 days ago.
