GNU bug report logs - #60750
29.0.60; encode-coding-char fails for utf-8-auto coding system

Package: emacs;

Reported by: Robert Pluim <rpluim <at> gmail.com>

Date: Thu, 12 Jan 2023 09:09:02 UTC

Severity: normal

Found in version 29.0.60

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#60750: closed (29.0.60; encode-coding-char fails for
 utf-8-auto coding system)
Date: Thu, 12 Jan 2023 14:40:01 +0000
[Message part 1 (text/plain, inline)]
Your message dated Thu, 12 Jan 2023 16:39:07 +0200
with message-id <835ydbbywk.fsf <at> gnu.org>
and subject line Re: bug#60750: 29.0.60; encode-coding-char fails for utf-8-auto coding system
has caused the debbugs.gnu.org bug report #60750,
regarding 29.0.60; encode-coding-char fails for utf-8-auto coding system
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
60750: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=60750
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Robert Pluim <rpluim <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 29.0.60; encode-coding-char fails for utf-8-auto coding system
Date: Thu, 12 Jan 2023 10:08:31 +0100
src/emacs -Q
M-x toggle-debug-on-error
M-: (setq buffer-file-coding-system 'utf-8-auto)
C-b
C-u C-x =

=>
Debugger entered--Lisp error: (args-out-of-range "))" 3 1)
  encode-coding-char(41 utf-8-auto ascii)
  describe-char(189)
  what-cursor-position((4))

This is because utf-8-auto has a non-nil :bom property:

(define-coding-system 'utf-8-auto
  "UTF-8 (auto-detect signature (BOM))"
  :coding-type 'utf-8
  :mnemonic ?U
  :charset-list '(unicode)
  :bom '(utf-8-with-signature . utf-8))

and `encode-coding-char' does this:

        ;; We also need to exclude the leading 2 or 3 bytes if they
        ;; come from a BOM.
        (setq i0
              (if bom-p
                  (cond
                   ((eq (coding-system-type coding-system) 'utf-8)
                    3)
                   ((eq (coding-system-type coding-system) 'utf-16)
                    2)
                   (t 0))
                0))
        (substring enc2 i0 i2)))))
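
The mechanism can be sketched in isolation. This is only an illustration under stated assumptions: `my/bom-skip' is a hypothetical helper, not part of Emacs, that mirrors the quoted `cond'; the `substring' call reuses the string and indices from the backtrace above.

```elisp
;; Hypothetical helper (not part of Emacs) mirroring the `cond' quoted
;; above: how many leading bytes would be skipped as a presumed BOM.
(defun my/bom-skip (coding-system)
  "Return the number of leading bytes skipped for CODING-SYSTEM."
  (if (coding-system-get coding-system :bom)
      (cond ((eq (coding-system-type coding-system) 'utf-8) 3)
            ((eq (coding-system-type coding-system) 'utf-16) 2)
            (t 0))
    0))

;; 3 whenever :bom is non-nil, as in the definition quoted above.
(my/bom-skip 'utf-8-auto)

;; `substring' rejects a start index that exceeds the end index.
(condition-case err
    (substring "))" 3 1)
  (args-out-of-range err))
;; => (args-out-of-range "))" 3 1), the same error data as in the backtrace
```

If the encoded bytes carry no BOM, a start offset of 3 can overshoot the end offset, which is what the error data suggests happened here.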

I'm not sure if this needs fixing, but it was surprising, and the
docstring of `define-coding-system' didn't make it clear to me whether
a BOM should have been produced here or not. (I'm willing to be told
that buffer-file-coding-system shouldn't be 'utf-8-auto, but I never
set that explicitly as far as I know 😀)

Thanks

Robert

In GNU Emacs 29.0.60 (build 14, x86_64-pc-linux-gnu, GTK+ Version
 3.24.24, cairo version 1.16.0) of 2023-01-12 built on rltb
Repository revision: f4f30ff4c44dcfdf780f1981aa541af713f2805f
Repository branch: emacs-29
System Description: Debian GNU/Linux 11 (bullseye)

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES NOTIFY
INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XINPUT2 XPM GTK3 ZLIB


[Message part 3 (message/rfc822, inline)]
From: Eli Zaretskii <eliz <at> gnu.org>
To: Robert Pluim <rpluim <at> gmail.com>
Cc: 60750-done <at> debbugs.gnu.org
Subject: Re: bug#60750: 29.0.60; encode-coding-char fails for utf-8-auto
 coding system
Date: Thu, 12 Jan 2023 16:39:07 +0200
> From: Robert Pluim <rpluim <at> gmail.com>
> Cc: 60750 <at> debbugs.gnu.org
> Date: Thu, 12 Jan 2023 15:28:49 +0100
> 
>     >> I think that means we can leave the code as it is.
> 
>     Eli> ??? "As it is" means this coding-system behaves contrary to
>     Eli> documentation: it should produce BOM on encoding.  Leaving it as is
>     Eli> doesn't sound TRT, so I'd like to have this fixed.  From your
>     Eli> description, it sounds like you bumped into this by mistake, and I see
>     Eli> only one other use of it -- in the test suite.  So I'm inclined to
>     Eli> installing this on the emacs-29 release branch.
> 
> Oh, I thought you were proposing *not* to fix it at all, since it's
> such an obscure coding system. I have no opinion on where a fix should
> go: I'm not going to be using that coding system again.

OK.  So I've installed the fix on the emacs-29 branch, and I'm boldly
closing this bug.
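
One way to check whether a given build emits the BOM on encoding is to inspect the first three encoded bytes. This is a minimal sketch: `my/starts-with-utf8-bom-p' is a hypothetical helper, not part of Emacs, and the result for `utf-8-auto' is deliberately not stated because it depends on whether the build includes the fix.

```elisp
;; Hypothetical helper (not part of Emacs): test whether a unibyte
;; string begins with the UTF-8 signature bytes EF BB BF.
(defun my/starts-with-utf8-bom-p (bytes)
  "Return non-nil if BYTES begins with the UTF-8 BOM."
  (and (>= (length bytes) 3)
       (= (aref bytes 0) #xEF)
       (= (aref bytes 1) #xBB)
       (= (aref bytes 2) #xBF)))

;; Reference points: one coding system that always writes the signature
;; and one that never does.
(my/starts-with-utf8-bom-p (encode-coding-string "a" 'utf-8-with-signature)) ; non-nil
(my/starts-with-utf8-bom-p (encode-coding-string "a" 'utf-8))                ; nil

;; The case discussed in this report; the result depends on the build.
(my/starts-with-utf8-bom-p (encode-coding-string "a" 'utf-8-auto))
```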


This bug report was last modified 2 years and 188 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.