GNU bug report logs -
#60750
29.0.60; encode-coding-char fails for utf-8-auto coding system
Reported by: Robert Pluim <rpluim <at> gmail.com>
Date: Thu, 12 Jan 2023 09:09:02 UTC
Severity: normal
Found in version 29.0.60
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
>>>>> On Thu, 12 Jan 2023 14:32:52 +0200, Eli Zaretskii <eliz <at> gnu.org> said:
Eli> Actually, the doc string is clear:
Eli> If the value is a cons cell, on decoding, check the first two bytes.
Eli> If they are 0xFE 0xFF, use the car part coding system of the value.
Eli> If they are 0xFF 0xFE, use the cdr part coding system of the value.
Eli> Otherwise, treat them as bytes for a normal character. On encoding,
Eli> produce BOM bytes according to the value of ‘:endian’.
Eli> Note the last sentence: it should unconditionally produce the BOM on
Eli> encoding. Which is what we do in your scenario.
Ah, I misread that as "depending on the value of ':endian'".
One minor nit, the description for ':endian' says:
    `:endian'
    VALUE must be `big' or `little' specifying big-endian and
    little-endian respectively. The default value is `big'.
    This attribute is meaningful only when `:coding-type' is `utf-16'.
That last sentence seems untrue, as ':endian' is also meaningful for
'utf-8-auto'.
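
For readers who want the quoted doc string made concrete, here is a rough
sketch of the rule it describes, written in Python rather than Emacs Lisp.
It is an illustration only, not how Emacs implements coding systems: the
helper names `decode_with_bom_check` and `encode_with_bom` are hypothetical,
and falling back to big-endian when no BOM is present is an assumption made
for the example.

```python
import codecs

# Hypothetical helpers mirroring the quoted doc string; not Emacs code.
BOM_BE = codecs.BOM_UTF16_BE  # the bytes 0xFE 0xFF
BOM_LE = codecs.BOM_UTF16_LE  # the bytes 0xFF 0xFE


def decode_with_bom_check(data: bytes, default: str = "utf-16-be") -> str:
    """Choose the decoder from the first two bytes of the input."""
    if data[:2] == BOM_BE:
        return data[2:].decode("utf-16-be")
    if data[:2] == BOM_LE:
        return data[2:].decode("utf-16-le")
    # No BOM: the first two bytes are ordinary character data.
    # Using big-endian here is an assumption for this sketch.
    return data.decode(default)


def encode_with_bom(text: str, endian: str = "big") -> bytes:
    """Always emit a BOM; `endian` only selects which one is written."""
    if endian == "big":
        return BOM_BE + text.encode("utf-16-be")
    if endian == "little":
        return BOM_LE + text.encode("utf-16-le")
    raise ValueError("endian must be 'big' or 'little'")
```

The point the sketch tries to capture is the one Eli makes: on encoding, the
endian setting selects which BOM is written, not whether one is written.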
>> (Iʼm willing to be told that buffer-file-coding-system shouldnʼt be
>> 'utf-8-auto, but I never set that explicitly as far as I know 😀)
Eli> Who does set utf-8-auto? where did you originally bump into this?
Eli> This is an obscure coding-system, and the fix to make it work as
Eli> documented will produce an incompatible change in behavior. So before
Eli> I decide whether to make the change and on what branch, I'd like to
Eli> know how in the world did you encounter this.
Itʼs entirely my own fault:
The file where I noticed this is shared between a GNU/Linux and a
macOS machine, which means I foolishly added the following a year ago,
even though itʼs unnecessary (perhaps I was thinking I was going to be
sharing it with a Windows machine?):
;; -*- lexical-binding: t; coding: utf-8-auto; -*-
I think that means we can leave the code as it is.
Robert
--
This bug report was last modified 2 years and 189 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.