GNU bug report logs - #20623
XML and HTML files with encoding/charset="utf-8" declaration lose BOM; Coding system is reset from utf-8-with-signature to utf-8 on save


Package: emacs

Reported by: Simon Ledergerber <sledergerber <at> gmx.net>

Date: Thu, 21 May 2015 18:53:02 UTC

Severity: normal

Found in version 26.1

Fixed in version 26.2

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.



Message #85 received at 20623-done <at> debbugs.gnu.org (full text, mbox):

From: Eli Zaretskii <eliz <at> gnu.org>
To: Vincent Lefevre <vincent <at> vinc17.net>
Cc: rgm <at> gnu.org, a.s <at> realize.ch, monnier <at> iro.umontreal.ca,
 20623-done <at> debbugs.gnu.org, sledergerber <at> gmx.net
Subject: Re: bug#20623: XML and HTML files with encoding/charset="utf-8"
 declaration lose BOM; Coding system is reset from utf-8-with-signature to
 utf-8 on save
Date: Sat, 11 Aug 2018 12:15:31 +0300

> Date: Wed, 8 Aug 2018 11:47:48 +0200
> From: Vincent Lefevre <vincent <at> vinc17.net>
> Cc: Glenn Morris <rgm <at> gnu.org>, Simon Ledergerber <sledergerber <at> gmx.net>,
> 	Eli Zaretskii <eliz <at> gnu.org>, Alain Schneble <a.s <at> realize.ch>,
> 	20623 <at> debbugs.gnu.org
> 
> On 2017-12-04 12:38:57 -0500, Stefan Monnier wrote:
> > > Now reported with "fix this or get removed from the distribution"
> > > severity at <https://bugs.debian.org/883434>.
> > 
> > I'm curious to see if the OP's "grave" severity settings will stick.
> > "Grave" is defined in https://www.debian.org/Bugs/Developer#severities as:
> > 
> >     makes the package in question unusable or mostly so, or causes data
> >     loss, or introduces a security hole allowing access to the accounts
> >     of users who use the package.
> > 
> > The only part that could arguably apply is "causes data loss", but even
> > that is stretching the meaning of those words, I think.
> 
> Actually there's the issue that the coding system (in Emacs sense)
> is changed, but also the fact that this change is invisible to the
> user (mainly because the BOM is usually not visible), which makes
> the issue even worse. Basically, this is invisible data corruption.
> Even though only three bytes are removed, this introduces breakage in
> other applications, and it can take the user a long time to find
> the cause.
> 
> Emacs should not change the coding system when not needed, and when
> it needs to, it must make sure to have a confirmation from the user.

I agree with the last paragraph, so I've now fixed the remaining issue
of this bug (with HTML files) on the emacs-26 branch.

However, I would respectfully request that in the future bug reports
be accurate and fair in the assigned severity, and in particular make
sure that the severity matches the actual behavior as judged
objectively.

In this case, I cannot but express my extreme surprise to see such a
minor issue described as "grave".  The alleged data loss is minor, if
it exists at all (the BOM is not data that matters to the user, nor data
whose loss is hard to repair).  The unspecified "breakage in
other applications" cannot be considered without the missing details,
but in general I'd be surprised to hear about modern applications
(browsers?) that really need a BOM in UTF-8 encoded HTML files to the
degree that the lack of BOM causes them to "break" in some way; if
they do, it could arguably be a bug in those applications.

Bottom line: artificially and unreasonably increasing the severity
level doesn't help the motivation to fix the bug, and if anything, has
the opposite effect: the source of the bug report may be dismissed as
not serious.  I'm sure we don't want that, certainly not for bugs reported
by Debian.

Thanks.
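
As an illustration of the byte-level difference discussed in this thread, here is a minimal sketch in Python. It is not part of the Emacs fix and does not reflect how Emacs handles coding systems internally. The helper name `has_utf8_bom` and the sample XML declaration are assumptions made for illustration; `codecs.BOM_UTF8` and the `utf-8-sig` codec come from the Python standard library.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if data starts with the UTF-8 signature (bytes EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# Hypothetical sample content: an XML declaration naming utf-8.
text = '<?xml version="1.0" encoding="utf-8"?>\n'

# "utf-8-sig" prepends the signature when encoding; plain "utf-8" does not.
# This roughly mirrors utf-8-with-signature versus utf-8 in Emacs terms.
with_bom = text.encode("utf-8-sig")
without_bom = text.encode("utf-8")

# The two encodings differ only by the leading signature bytes.
bom_length = len(with_bom) - len(without_bom)
```

Running a check like this on a file's raw bytes before and after saving would show whether the signature was silently dropped, which is otherwise hard to notice because most editors and viewers do not display it.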




This bug report was last modified 6 years and 279 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.