From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 12 04:08:47 2023
From: Robert Pluim
To: bug-gnu-emacs@gnu.org
Subject: 29.0.60; encode-coding-char fails for utf-8-auto coding system
Date: Thu, 12 Jan 2023 10:08:31 +0100
Message-ID: <87zgaof7cg.fsf@gmail.com>

src/emacs -Q
M-x toggle-debug-on-error
M-: (setq buffer-file-coding-system 'utf-8-auto)
C-b
C-u C-x =

=>
Debugger entered--Lisp error: (args-out-of-range "))" 3 1)
  encode-coding-char(41 utf-8-auto ascii)
  describe-char(189)
  what-cursor-position((4))

This is because utf-8-auto has a non-nil :bom property:

(define-coding-system 'utf-8-auto
  "UTF-8 (auto-detect signature (BOM))"
  :coding-type 'utf-8
  :mnemonic ?U
  :charset-list '(unicode)
  :bom '(utf-8-with-signature . utf-8))

and `encode-coding-char' does this:

          ;; We also need to exclude the leading 2 or 3 bytes if they
          ;; come from a BOM.
          (setq i0
                (if bom-p
                    (cond
                     ((eq (coding-system-type coding-system) 'utf-8)
                      3)
                     ((eq (coding-system-type coding-system) 'utf-16)
                      2)
                     (t 0))
                  0))
          (substring enc2 i0 i2)))))

Iʼm not sure if this needs fixing, but it was surprising, and the
docstring of `define-coding-system' didnʼt make it clear to me whether
a BOM should have been produced here or not.

(Iʼm willing to be told that buffer-file-coding-system shouldnʼt be
'utf-8-auto, but I never set that explicitly as far as I know 😀)

Thanks

Robert

In GNU Emacs 29.0.60 (build 14, x86_64-pc-linux-gnu, GTK+ Version
 3.24.24, cairo version 1.16.0) of 2023-01-12 built on rltb
Repository revision: f4f30ff4c44dcfdf780f1981aa541af713f2805f
Repository branch: emacs-29
System Description: Debian GNU/Linux 11 (bullseye)

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG
JSON LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES NOTIFY
INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3 THREADS TIFF
TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XINPUT2 XPM GTK3 ZLIB

From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 12 07:33:04 2023
From: Eli Zaretskii
To: Robert Pluim
Cc: 60750@debbugs.gnu.org
Subject: Re: bug#60750: 29.0.60; encode-coding-char fails for utf-8-auto coding system
In-Reply-To: <87zgaof7cg.fsf@gmail.com> (message from Robert Pluim on Thu, 12 Jan 2023 10:08:31 +0100)
Date: Thu, 12 Jan 2023 14:32:52 +0200
Message-Id: <83fscgaq6j.fsf@gnu.org>

> From: Robert Pluim
> Date: Thu, 12 Jan 2023 10:08:31 +0100
>
>
> src/emacs -Q
> M-x toggle-debug-on-error
> M-: (setq buffer-file-coding-system 'utf-8-auto)
> C-b
> C-u C-x =
>
> =>
> Debugger entered--Lisp error: (args-out-of-range "))" 3 1)
>   encode-coding-char(41 utf-8-auto ascii)
>   describe-char(189)
>   what-cursor-position((4))
>
> This is because utf-8-auto has a non-nil :bom property:
>
> (define-coding-system 'utf-8-auto
>   "UTF-8 (auto-detect signature (BOM))"
>   :coding-type 'utf-8
>   :mnemonic ?U
>   :charset-list '(unicode)
>   :bom '(utf-8-with-signature . utf-8))

Right. This is a very old bug in encoding with utf-8 family of
encoding which has a :bom property that is a cons cell.

The fix is simple, but I wonder what will this break out there. So:

> Iʼm not sure if this needs fixing, but it was surprising, and the
> docstring of `define-coding-system' didnʼt make it clear to me whether
> a BOM should have been produced here or not.

Actually, the doc string is clear:

  If the value is a cons cell, on decoding, check the first two bytes.
  If they are 0xFE 0xFF, use the car part coding system of the value.
  If they are 0xFF 0xFE, use the cdr part coding system of the value.
  Otherwise, treat them as bytes for a normal character. On encoding,
  produce BOM bytes according to the value of ‘:endian’.

Note the last sentence: it should unconditionally produce the BOM on
encoding. Which is what we do in your scenario.

> (Iʼm willing to be told that buffer-file-coding-system shouldnʼt be
> 'utf-8-auto, but I never set that explicitly as far as I know 😀)

Who does set utf-8-auto? where did you originally bump into this?
This is an obscure coding-system, and the fix to make it work as
documented will produce an incompatible change in behavior. So before
I decide whether to make the change and on what branch, I'd like to
know how in the world did you encounter this.

Thanks.

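[Editorial sketch, not part of the original exchange.] The `:bom' values discussed in the message above can be inspected directly. The following is a minimal Emacs Lisp sketch; `my-bom-report' is a hypothetical helper name, not an Emacs function. It reads the `:bom' property of a coding system and lists the bytes produced when a one-character string is encoded with it. No result is stated for `utf-8-auto', because that depends on whether the Emacs in use includes the fix discussed in this thread.

```elisp
;; Sketch only: `my-bom-report' is a hypothetical helper, not part of Emacs.
(defun my-bom-report (coding-system)
  "Return a plist describing how CODING-SYSTEM treats the BOM.
The :bom entry is the coding system's `:bom' property.  The :bytes
entry lists the bytes produced when encoding a single right paren."
  (list :bom (coding-system-get coding-system :bom)
        :bytes (string-to-list
                (encode-coding-string ")" coding-system))))

;; Plain UTF-8 writes no signature: the only byte is 41, the paren.
(my-bom-report 'utf-8)

;; UTF-8 with signature prepends EF BB BF: bytes are (239 187 191 41).
(my-bom-report 'utf-8-with-signature)

;; For `utf-8-auto' the :bom entry is the cons cell quoted in the report;
;; the bytes depend on the Emacs version, so none are claimed here.
(my-bom-report 'utf-8-auto)
```

Evaluating the three calls with `M-:' or in `*scratch*' shows whether a given Emacs build emits the three signature bytes for `utf-8-auto'.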
From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 12 08:44:40 2023
From: Robert Pluim
To: Eli Zaretskii
Cc: 60750@debbugs.gnu.org
Subject: Re: bug#60750: 29.0.60; encode-coding-char fails for utf-8-auto coding system
In-Reply-To: <83fscgaq6j.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 12 Jan 2023 14:32:52 +0200")
Date: Thu, 12 Jan 2023 14:44:29 +0100
Message-ID: <871qnzg94y.fsf@gmail.com>

>>>>> On Thu, 12 Jan 2023 14:32:52 +0200, Eli Zaretskii said:

    Eli> Actually, the doc string is clear:

    Eli>   If the value is a cons cell, on decoding, check the first two bytes.
    Eli>   If they are 0xFE 0xFF, use the car part coding system of the value.
    Eli>   If they are 0xFF 0xFE, use the cdr part coding system of the value.
    Eli>   Otherwise, treat them as bytes for a normal character. On encoding,
    Eli>   produce BOM bytes according to the value of ‘:endian’.

    Eli> Note the last sentence: it should unconditionally produce the BOM on
    Eli> encoding. Which is what we do in your scenario.

Ah, I misread that as "depending on the value of ':endian'"

One minor nit, the description for ':endian' says:

    `:endian'

    VALUE must be `big' or `little' specifying big-endian and
    little-endian respectively. The default value is `big'.

    This attribute is meaningful only when `:coding-type' is `utf-16'.

That last sentence seems untrue, as ':endian' is meaningful for
'utf-8-auto'

    >> (Iʼm willing to be told that buffer-file-coding-system shouldnʼt be
    >> 'utf-8-auto, but I never set that explicitly as far as I know 😀)

    Eli> Who does set utf-8-auto? where did you originally bump into this?
    Eli> This is an obscure coding-system, and the fix to make it work as
    Eli> documented will produce an incompatible change in behavior. So before
    Eli> I decide whether to make the change and on what branch, I'd like to
    Eli> know how in the world did you encounter this.

Itʼs entirely my own fault:

The file where I noticed this is shared between a GNU/Linux and a
macOS machine, which means I foolishly added the following a year ago,
even though itʼs unnecessary (perhaps I was thinking I was going to be
sharing it with a Windows machine?):

;; -*- lexical-binding: t; coding: utf-8-auto; -*-

I think that means we can leave the code as it is.

Robert
-- 

From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 12 09:04:18 2023
From: Eli Zaretskii
To: Robert Pluim
Cc: 60750@debbugs.gnu.org
Subject: Re: bug#60750: 29.0.60; encode-coding-char fails for utf-8-auto coding system
In-Reply-To: <871qnzg94y.fsf@gmail.com> (message from Robert Pluim on Thu, 12 Jan 2023 14:44:29 +0100)
Date: Thu, 12 Jan 2023 16:04:07 +0200
Message-Id: <83bkn3c0iw.fsf@gnu.org>

> From: Robert Pluim
> Cc: 60750@debbugs.gnu.org
> Date: Thu, 12 Jan 2023 14:44:29 +0100
>
> One minor nit, the description for ':endian' says:
>
>     `:endian'
>
>     VALUE must be `big' or `little' specifying big-endian and
>     little-endian respectively. The default value is `big'.
>
>     This attribute is meaningful only when `:coding-type' is `utf-16'.
>
> That last sentence seems untrue, as ':endian' is meaningful for
> 'utf-8-auto'

That depends on what you mean by "meaningful". What it wants to say
is that it's meaningless to change the value of this property for any
coding-system other than UTF-16.

>     Eli> Who does set utf-8-auto? where did you originally bump into this?
>     Eli> This is an obscure coding-system, and the fix to make it work as
>     Eli> documented will produce an incompatible change in behavior. So before
>     Eli> I decide whether to make the change and on what branch, I'd like to
>     Eli> know how in the world did you encounter this.
>
> Itʼs entirely my own fault:
>
> The file where I noticed this is shared between a GNU/Linux and a
> macOS machine, which means I foolishly added the following a year ago,
> even though itʼs unnecessary (perhaps I was thinking I was going to be
> sharing it with a Windows machine?):
>
> ;; -*- lexical-binding: t; coding: utf-8-auto; -*-

So you thought the "-auto" part was about the EOL format?

> I think that means we can leave the code as it is.

??? "As it is" means this coding-system behaves contrary to
documentation: it should produce BOM on encoding. Leaving it as is
doesn't sound TRT, so I'd like to have this fixed. From your
description, it sounds like you bumped into this by mistake, and I see
only one other use of it -- in the test suite. So I'm inclined to
installing this on the emacs-29 release branch.

From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 12 09:28:58 2023
From: Robert Pluim
To: Eli Zaretskii
Cc: 60750@debbugs.gnu.org
Subject: Re: bug#60750: 29.0.60; encode-coding-char fails for utf-8-auto coding system
In-Reply-To: <83bkn3c0iw.fsf@gnu.org> (Eli Zaretskii's message of "Thu, 12 Jan 2023 16:04:07 +0200")
Date: Thu, 12 Jan 2023 15:28:49 +0100
Message-ID: <87wn5resim.fsf@gmail.com>

>>>>> On Thu, 12 Jan 2023 16:04:07 +0200, Eli Zaretskii said:

    >> From: Robert Pluim
    >> Cc: 60750@debbugs.gnu.org
    >> Date: Thu, 12 Jan 2023 14:44:29 +0100
    >>
    >> One minor nit, the description for ':endian' says:
    >>
    >> `:endian'
    >>
    >> VALUE must be `big' or `little' specifying big-endian and
    >> little-endian respectively. The default value is `big'.
    >>
    >> This attribute is meaningful only when `:coding-type' is `utf-16'.
    >>
    >> That last sentence seems untrue, as ':endian' is meaningful for
    >> 'utf-8-auto'

    Eli> That depends on what you mean by "meaningful". What it wants to say
    Eli> is that it's meaningless to change the value of this property for any
    Eli> coding-system other than UTF-16.

OK

    Eli> Who does set utf-8-auto? where did you originally bump into this?
    Eli> This is an obscure coding-system, and the fix to make it work as
    Eli> documented will produce an incompatible change in behavior. So before
    Eli> I decide whether to make the change and on what branch, I'd like to
    Eli> know how in the world did you encounter this.
    >>
    >> Itʼs entirely my own fault:
    >>
    >> The file where I noticed this is shared between a GNU/Linux and a
    >> macOS machine, which means I foolishly added the following a year ago,
    >> even though itʼs unnecessary (perhaps I was thinking I was going to be
    >> sharing it with a Windows machine?):
    >>
    >> ;; -*- lexical-binding: t; coding: utf-8-auto; -*-

    Eli> So you thought the "-auto" part was about the EOL format?

yes. Iʼm having a reading incomprehension day, obviously (just like a
year ago when I made the change originally).

    >> I think that means we can leave the code as it is.

    Eli> ??? "As it is" means this coding-system behaves contrary to
    Eli> documentation: it should produce BOM on encoding. Leaving it as is
    Eli> doesn't sound TRT, so I'd like to have this fixed. From your
    Eli> description, it sounds like you bumped into this by mistake, and I see
    Eli> only one other use of it -- in the test suite. So I'm inclined to
    Eli> installing this on the emacs-29 release branch.

Oh, I thought you were proposing *not* to fix it at all, since itʼs
such an obscure coding system. I have no opinion on where a fix should
go: Iʼm not going to be using that coding system again.

Robert
-- 

From debbugs-submit-bounces@debbugs.gnu.org Thu Jan 12 09:39:17 2023
From: Eli Zaretskii
To: Robert Pluim
Cc: 60750-done@debbugs.gnu.org
Subject: Re: bug#60750: 29.0.60; encode-coding-char fails for utf-8-auto coding system
In-Reply-To: <87wn5resim.fsf@gmail.com> (message from Robert Pluim on Thu, 12 Jan 2023 15:28:49 +0100)
Date: Thu, 12 Jan 2023 16:39:07 +0200
Message-Id: <835ydbbywk.fsf@gnu.org>

> From: Robert Pluim
> Cc: 60750@debbugs.gnu.org
> Date: Thu, 12 Jan 2023 15:28:49 +0100
>
>     >> I think that means we can leave the code as it is.
>
>     Eli> ??? "As it is" means this coding-system behaves contrary to
>     Eli> documentation: it should produce BOM on encoding. Leaving it as is
>     Eli> doesn't sound TRT, so I'd like to have this fixed. From your
>     Eli> description, it sounds like you bumped into this by mistake, and I see
>     Eli> only one other use of it -- in the test suite. So I'm inclined to
>     Eli> installing this on the emacs-29 release branch.
>
> Oh, I thought you were proposing *not* to fix it at all, since itʼs
> such an obscure coding system. I have no opinion on where a fix should
> go: Iʼm not going to be using that coding system again.

OK. So I've installed the fix on the emacs-29 branch, and I'm boldly
closing this bug.

From unknown Sat Aug 16 21:21:01 2025
From: Debbugs Internal Request
To: internal_control@debbugs.gnu.org
Subject: Internal Control
Date: Fri, 10 Feb 2023 12:24:07 +0000

bug archived.
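
[Editorial sketch, not part of the original exchange.] The backtrace in the original report shows `substring' being asked to skip 3 bytes of the 2-byte string "))", that is, a fixed BOM length was dropped although the encoded string contained no BOM. As an illustration only, a BOM can be removed defensively by checking for its bytes first. This is not the change that was installed on the emacs-29 branch, and `my-strip-utf-8-bom' is a hypothetical name.

```elisp
;; Illustration only: `my-strip-utf-8-bom' is a hypothetical helper and
;; is NOT the fix that was installed for bug#60750.
(defun my-strip-utf-8-bom (bytes)
  "Return the unibyte string BYTES without a leading UTF-8 BOM.
If BYTES does not start with the BOM bytes EF BB BF, return it unchanged."
  (if (string-prefix-p "\357\273\277" bytes)
      (substring bytes 3)
    bytes))

;; A BOM is present here, so the three signature bytes are removed.
(my-strip-utf-8-bom (encode-coding-string ")" 'utf-8-with-signature))

;; No BOM is present here, so the string is returned as is and no
;; error is signaled.
(my-strip-utf-8-bom (encode-coding-string "))" 'utf-8))

;; The unguarded call from the backtrace signals `args-out-of-range';
;; `condition-case' catches it and returns the error symbol.
(condition-case err
    (substring "))" 3 1)
  (args-out-of-range (car err)))
```

Checking for the actual bytes, rather than trusting the `:bom' property alone, keeps the helper safe regardless of whether the encoder emitted a signature.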