GNU bug report logs - #60750
29.0.60; encode-coding-char fails for utf-8-auto coding system
Reported by: Robert Pluim <rpluim <at> gmail.com>
Date: Thu, 12 Jan 2023 09:09:02 UTC
Severity: normal
Found in version 29.0.60
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
Your bug report
#60750: 29.0.60; encode-coding-char fails for utf-8-auto coding system
which was filed against the emacs package, has been closed.
The explanation is attached below, along with your original report.
If you require more details, please reply to 60750 <at> debbugs.gnu.org.
--
60750: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=60750
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 60750 <at> debbugs.gnu.org
> Date: Thu, 12 Jan 2023 15:28:49 +0100
>
> >> I think that means we can leave the code as it is.
>
> Eli> ??? "As it is" means this coding-system behaves contrary to
> Eli> documentation: it should produce BOM on encoding. Leaving it as is
> Eli> doesn't sound TRT, so I'd like to have this fixed. From your
> Eli> description, it sounds like you bumped into this by mistake, and I see
> Eli> only one other use of it -- in the test suite. So I'm inclined to
> Eli> installing this on the emacs-29 release branch.
>
> Oh, I thought you were proposing *not* to fix it at all, since itʼs
> such an obscure coding system. I have no opinion on where a fix should
> go: Iʼm not going to be using that coding system again.
OK. So I've installed the fix on the emacs-29 branch, and I'm boldly
closing this bug.
[Message part 3 (message/rfc822, inline)]
src/emacs -Q
M-x toggle-debug-on-error
M-: (setq buffer-file-coding-system 'utf-8-auto)
C-b
C-u C-x =
=>
Debugger entered--Lisp error: (args-out-of-range "))" 3 1)
  encode-coding-char(41 utf-8-auto ascii)
  describe-char(189)
  what-cursor-position((4))
This is because utf-8-auto has a non-nil :bom property:
(define-coding-system 'utf-8-auto
  "UTF-8 (auto-detect signature (BOM))"
  :coding-type 'utf-8
  :mnemonic ?U
  :charset-list '(unicode)
  :bom '(utf-8-with-signature . utf-8))
and `encode-coding-char' does this:
        ;; We also need to exclude the leading 2 or 3 bytes if they
        ;; come from a BOM.
        (setq i0
              (if bom-p
                  (cond
                   ((eq (coding-system-type coding-system) 'utf-8)
                    3)
                   ((eq (coding-system-type coding-system) 'utf-16)
                    2)
                   (t 0))
                0))
        (substring enc2 i0 i2)))))
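To make the failure concrete, here is a rough model of that index arithmetic, written in Python purely for illustration. It is not the Emacs implementation and not the fix that was installed. The values `b"))"`, 3 and 1 are taken from the backtrace above; the names `skip_by_property` and `skip_if_present` are hypothetical, and `skip_if_present` is just one conceivable defensive variant.

```python
import codecs

# Assumed BOM sizes in bytes, mirroring the 3 and 2 in the Lisp snippet.
BOM_SKIP = {"utf-8": 3, "utf-16": 2}


def substring(data: bytes, start: int, end: int) -> bytes:
    """Bounds-checked slice, loosely imitating Lisp `substring`."""
    if not (0 <= start <= end <= len(data)):
        raise IndexError(f"args-out-of-range {data!r} {start} {end}")
    return data[start:end]


def skip_by_property(enc2: bytes, coding_type: str, bom_p: bool, i2: int) -> bytes:
    """Model of the snippet: skip a fixed BOM length whenever bom_p is set,
    without checking that the encoded bytes really start with a BOM."""
    i0 = BOM_SKIP.get(coding_type, 0) if bom_p else 0
    return substring(enc2, i0, i2)


def skip_if_present(enc2: bytes, i2: int) -> bytes:
    """Hypothetical variant: skip the UTF-8 BOM only if it is actually there."""
    i0 = len(codecs.BOM_UTF8) if enc2.startswith(codecs.BOM_UTF8) else 0
    return substring(enc2, i0, i2)


# The encoder emitted no BOM, but bom_p is true: start 3 exceeds end 1.
try:
    skip_by_property(b"))", "utf-8", True, 1)
except IndexError as err:
    print(err)

# When a BOM really is present, the fixed skip lands on the character.
print(skip_by_property(codecs.BOM_UTF8 + b"))", "utf-8", True, 4))

# The presence check copes with the BOM-less case.
print(skip_if_present(b"))", 1))
```

The model only shows why a non-nil :bom property combined with BOM-less output leads to an out-of-range start index. Whether the encoder should have emitted a BOM in the first place is a separate question.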
Iʼm not sure if this needs fixing, but it was surprising, and the
docstring of `define-coding-system' didnʼt make it clear to me whether
a BOM should have been produced here or not. (Iʼm willing to be told
that buffer-file-coding-system shouldnʼt be 'utf-8-auto, but I never
set that explicitly as far as I know 😀)
Thanks
Robert
In GNU Emacs 29.0.60 (build 14, x86_64-pc-linux-gnu, GTK+ Version
3.24.24, cairo version 1.16.0) of 2023-01-12 built on rltb
Repository revision: f4f30ff4c44dcfdf780f1981aa541af713f2805f
Repository branch: emacs-29
System Description: Debian GNU/Linux 11 (bullseye)
Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES NOTIFY
INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XINPUT2 XPM GTK3 ZLIB
This bug report was last modified 2 years and 189 days ago.