GNU bug report logs - #20623
XML and HTML files with encoding/charset="utf-8" declaration lose BOM; Coding system is reset from utf-8-with-signature to utf-8 on save


Package: emacs

Reported by: Simon Ledergerber <sledergerber <at> gmx.net>

Date: Thu, 21 May 2015 18:53:02 UTC

Severity: normal

Found in version 26.1

Fixed in version 26.2

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.


Message #112 received at 20623 <at> debbugs.gnu.org (full text, mbox):

From: Vincent Lefevre <vincent <at> vinc17.net>
To: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
Cc: rgm <at> gnu.org, Eli Zaretskii <eliz <at> gnu.org>, a.s <at> realize.ch,
 20623 <at> debbugs.gnu.org, sledergerber <at> gmx.net
Subject: Re: bug#20623: XML and HTML files with encoding/charset="utf-8"
 declaration lose BOM; Coding system is reset from utf-8-with-signature to
 utf-8 on save
Date: Sun, 12 Aug 2018 02:58:53 +0200

On 2018-08-11 20:11:49 -0400, Stefan Monnier wrote:
> >> Please provide the details, including the use case, if possible.  I'm
> >> still in the dark regarding the importance of the BOM in UTF-8 encoded
> >> HTML stuff.
> >   https://bugzilla.mozilla.org/show_bug.cgi?id=1422889
> 
> I don't see any data loss there.

Because the data loss is not there; it is in Emacs. What the Mozilla bug
shows is that the presence or absence of a BOM matters and yields very
different behavior.

-- 
Vincent Lefèvre <vincent <at> vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

This bug report was last modified 6 years and 333 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.
