From unknown Tue Jun 17 22:01:21 2025
From: Juri Linkov
To: 38587@debbugs.gnu.org
Subject: bug#38587: base64-decode-region breaks encoding
Date: Fri, 13 Dec 2019 01:55:56 +0200
Message-ID: <87blsdhzeb.fsf@mail.linkov.net>
Organization: LINKOV.NET
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (x86_64-pc-linux-gnu)
X-GNU-PR-Message: report 38587
X-GNU-PR-Package: emacs

0. emacs -Q
1. insert a non-ASCII char, e.g. ä
2. select the region around the char
3. M-x base64-encode-region
4. select the region around the encoded text
5. M-x base64-decode-region

results in a broken text.
IOW, base64-encode-region and base64-decode-region
are not reversible, whereas their string counterparts are:

(base64-decode-string (base64-encode-string "ä"))
=> "\344"

(the latter expression returns the right result, but inserts broken text too)

From unknown Tue Jun 17 22:01:21 2025
From: Lars Ingebrigtsen
To: Juri Linkov
Cc: 38587@debbugs.gnu.org
Subject: bug#38587: base64-decode-region breaks encoding
Date: Fri, 13 Dec 2019 03:52:46 +0100
Message-ID: <87pngtndhd.fsf@gnus.org>
In-Reply-To: <87blsdhzeb.fsf@mail.linkov.net> (Juri Linkov's message of "Fri, 13 Dec 2019 01:55:56 +0200")
References: <87blsdhzeb.fsf@mail.linkov.net>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

Juri Linkov writes:

> 0. emacs -Q
> 1. insert a non-ASCII char, e.g. ä
> 2. select the region around the char
> 3. M-x base64-encode-region
> 4. select the region around the encoded text
> 5. M-x base64-decode-region
>
> results in a broken text.  IOW, base64-encode-region and base64-decode-region
> are not reversible, whereas their string counterparts are:
>
> (base64-decode-string (base64-encode-string "ä"))
> => "\344"

Well, that's not really reversible, either.

> (the latter expression returns the right result, but inserts broken text too)

None of these functions work on multibyte text (by design), but I see
the doc strings don't mention this.  (The manual does.)

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

From unknown Tue Jun 17 22:01:21 2025
From: Eli Zaretskii
To: Lars Ingebrigtsen
Cc: 38587@debbugs.gnu.org, juri@linkov.net
Subject: bug#38587: base64-decode-region breaks encoding
Date: Fri, 13 Dec 2019 09:12:54 +0200
Message-Id: <83mubw8zrd.fsf@gnu.org>
In-reply-to: <87pngtndhd.fsf@gnus.org> (message from Lars Ingebrigtsen on Fri, 13 Dec 2019 03:52:46 +0100)
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org>

> From: Lars Ingebrigtsen
> Date: Fri, 13 Dec 2019 03:52:46 +0100
> Cc: 38587@debbugs.gnu.org
>
> > (base64-decode-string (base64-encode-string "ä"))
> > => "\344"
>
> Well, that's not really reversible, either.
>
> > (the latter expression returns the right result, but inserts broken text too)
>
> None of these functions work on multibyte text (by design)

Right.

> but I see the doc strings don't mention this.  (The manual does.)

Let's say that in the doc strings as well.

It is not easy to come up with the right text, btw, because saying
just "region must be unibyte" is inaccurate; see the source of the
implementation for the details.  That's why the ELisp manual also
doesn't say anything simple in this respect.

From unknown Tue Jun 17 22:01:21 2025
From: Juri Linkov
To: Lars Ingebrigtsen
Cc: 38587@debbugs.gnu.org
Subject: bug#38587: base64-decode-region breaks encoding
Date: Sun, 15 Dec 2019 01:31:38 +0200
Message-ID: <87v9qieb6t.fsf@mail.linkov.net>
In-Reply-To: <87pngtndhd.fsf@gnus.org> (Lars Ingebrigtsen's message of "Fri, 13 Dec 2019 03:52:46 +0100")
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org>
Organization: LINKOV.NET
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (x86_64-pc-linux-gnu)

>> 0. emacs -Q
>> 1. insert a non-ASCII char, e.g. ä
>> 2. select the region around the char
>> 3. M-x base64-encode-region
>> 4. select the region around the encoded text
>> 5. M-x base64-decode-region
>>
>> results in a broken text.  IOW, base64-encode-region and base64-decode-region
>> are not reversible, whereas their string counterparts are:
>>
>> (base64-decode-string (base64-encode-string "ä"))
>> => "\344"
>
> Well, that's not really reversible, either.

But when it's known that the source string was in UTF-8,
shouldn't it be reversible?  What is needed for that?

Maybe an additional CODING arg for base64-decode-region?

Or it would be enough to use the coding system of the
output buffer?

From unknown Tue Jun 17 22:01:21 2025
From: Andreas Schwab
To: Juri Linkov
Cc: Lars Ingebrigtsen, 38587@debbugs.gnu.org
Subject: bug#38587: base64-decode-region breaks encoding
Date: Sun, 15 Dec 2019 09:56:18 +0100
Message-ID: <87eex66k7h.fsf@hase.home>
In-Reply-To: <87v9qieb6t.fsf@mail.linkov.net> (Juri Linkov's message of "Sun, 15 Dec 2019 01:31:38 +0200")
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

On Dec 15 2019, Juri Linkov wrote:

> Maybe an additional CODING arg for base64-decode-region?

BASE64 is defined on a sequence of bytes.  It doesn't make sense to
apply it to characters.

The input of base64-encode-region needs to be encoded into bytes and the
output of base64-decode-region needs to be decoded into characters.  If
you do that, you get a fully reversible operation.

> Or it would be enough to use the coding system of the
> output buffer?

The coding system of the output buffer has nothing to do with the coding
of the data produced by base64-decode-region, just like
process-coding-system is independent from the coding system of the
process buffer.

Andreas.
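
The two-step recipe described above (encode characters into bytes before base64-encoding, decode bytes into characters after base64-decoding) could be sketched roughly as follows.  This is an illustration only, not code from any of the thread participants: the function names are hypothetical, and UTF-8 is simply assumed as the coding system.

```elisp
;; Sketch only.  The my-... names are hypothetical helpers, and
;; UTF-8 is an assumed choice of coding system.

(defun my-base64-encode-region-utf-8 (beg end)
  "Encode the text between BEG and END as UTF-8 bytes, then as base64."
  (interactive "r")
  (save-restriction
    (narrow-to-region beg end)
    ;; Step 1: characters -> bytes.
    (encode-coding-region (point-min) (point-max) 'utf-8)
    ;; Step 2: bytes -> base64 text (non-nil third arg: no line breaks).
    (base64-encode-region (point-min) (point-max) t)))

(defun my-base64-decode-region-utf-8 (beg end)
  "Decode base64 between BEG and END, then decode the bytes as UTF-8."
  (interactive "r")
  (save-restriction
    (narrow-to-region beg end)
    ;; Step 1: base64 text -> bytes.
    (base64-decode-region (point-min) (point-max))
    ;; Step 2: bytes -> characters.
    (decode-coding-region (point-min) (point-max) 'utf-8)))
```

With both helpers loaded, selecting a region, running the first command and then the second on the resulting base64 text should give back the original characters.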

--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

From unknown Tue Jun 17 22:01:21 2025
From: Eli Zaretskii
To: Juri Linkov
Cc: larsi@gnus.org, 38587@debbugs.gnu.org
Subject: bug#38587: base64-decode-region breaks encoding
Date: Sun, 15 Dec 2019 17:26:54 +0200
Message-Id: <83tv61624h.fsf@gnu.org>
In-reply-to: <87v9qieb6t.fsf@mail.linkov.net> (message from Juri Linkov on Sun, 15 Dec 2019 01:31:38 +0200)
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net>

> From: Juri Linkov
> Date: Sun, 15 Dec 2019 01:31:38 +0200
> Cc: 38587@debbugs.gnu.org
>
> But when it's known that the source string was in UTF-8,
> shouldn't it be reversible?  What is needed for that?

The source string is not in UTF-8, it is in internal Emacs
representation of strings.

From unknown Tue Jun 17 22:01:21 2025
From: Juri Linkov
To: Andreas Schwab
Cc: Lars Ingebrigtsen, 38587@debbugs.gnu.org
Subject: bug#38587: base64-decode-region breaks encoding
Date: Mon, 16 Dec 2019 00:40:55 +0200
Message-ID: <87zhft9rl4.fsf@mail.linkov.net>
In-Reply-To: <87eex66k7h.fsf@hase.home> (Andreas Schwab's message of "Sun, 15 Dec 2019 09:56:18 +0100")
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <87eex66k7h.fsf@hase.home>
Organization: LINKOV.NET
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (x86_64-pc-linux-gnu)

>> Maybe an additional CODING arg for base64-decode-region?
>
> BASE64 is defined on a sequence of bytes.  It doesn't make sense to
> apply it to characters.

But isn't UTF-8 a multibyte encoding represented by a sequence of bytes
(e.g. when saved to a file)?  Then why couldn't base64-encode-region
use the buffer's coding to convert the region to a sequence of bytes?

Also, why does base64-encode-region accept the region's characters only
from the charsets ‘eight-bit-control’ and ‘eight-bit-graphic’,
but not other UTF-8 characters?

> The input of base64-encode-region needs to be encoded into bytes and the
> output of base64-decode-region needs to be decoded into characters.  If
> you do that, you get a fully reversible operation.

I guess base64-encode-region already encodes the region into bytes,
but only partially: it signals an error on some characters,
and I don't understand why it can't encode all of them.

>> Or it would be enough to use the coding system of the
>> output buffer?
>
> The coding system of the output buffer has nothing to do with the coding
> of the data produced by base64-decode-region, just like
> process-coding-system is independent from the coding system of the
> process buffer.

It's understandable that the coding system of the output buffer
is not necessarily the same as the one expected from the output of
base64-decode-region.  But is it still possible to tell
base64-decode-region about the expected output coding system?
Maybe using a prefix arg: C-u M-x base64-decode-region
could ask for a coding, defaulting to the buffer's coding.

For example, in Ruby

  require 'base64'
  Base64.decode64(Base64.encode64("☃"))
  => "\xE2\x98\x83"

indeed outputs raw bytes, not a UTF-8 string.  But it's possible
to force the encoding with:

  Base64.decode64(Base64.encode64("☃")).force_encoding('UTF-8')
  => "☃"

Is there an equivalent of force_encoding('UTF-8') in Emacs?

After base64-decode-region, I tried calling this on its output:

  (decode-coding-region (point-min) (point-max) 'binary)

but it doesn't work, and neither does this:

  (encode-coding-region (point-min) (point-max) 'utf-8)

Also this doesn't work on the string output:

  (decode-coding-string (base64-decode-string (base64-encode-string "ä")) 'utf-8)

Maybe I'm doing something wrong?
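
For the string case, a possible round trip might look like the sketch below.  It is not taken from the thread: the function name is hypothetical and UTF-8 is an assumed choice.  The difference from the last attempt above is that the text is first turned into UTF-8 bytes with encode-coding-string; the "\344" result shown in the original report is the single byte 0xE4, which is not valid UTF-8 on its own, so decoding it as UTF-8 cannot recover "ä".  Here decode-coding-string plays roughly the role that force_encoding('UTF-8') plays in the Ruby example.

```elisp
;; Sketch only.  `my-base64-utf-8-roundtrip' is a hypothetical name;
;; UTF-8 is an assumed coding system.
(defun my-base64-utf-8-roundtrip (text)
  "Return (BASE64 BYTES ROUND-TRIP-OK) for TEXT, going through UTF-8."
  (let* ((bytes (encode-coding-string text 'utf-8)) ; characters -> bytes
         (b64   (base64-encode-string bytes))       ; bytes -> base64
         (raw   (base64-decode-string b64))         ; base64 -> bytes
         (back  (decode-coding-string raw 'utf-8))) ; bytes -> characters
    (list b64 (string-to-list raw) (equal back text))))

(my-base64-utf-8-roundtrip "\u2603")    ; U+2603 is the snowman
;; => ("4piD" (226 152 131) t)
```

The middle element lists the same three bytes that the Ruby example prints as "\xE2\x98\x83".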
From unknown Tue Jun 17 22:01:21 2025
Subject: bug#38587: base64-decode-region breaks encoding
From: Juri Linkov
Organization: LINKOV.NET
To: Eli Zaretskii
Cc: larsi@gnus.org, 38587@debbugs.gnu.org
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <83tv61624h.fsf@gnu.org>
Date: Mon, 16 Dec 2019 00:41:48 +0200
In-Reply-To: <83tv61624h.fsf@gnu.org> (Eli Zaretskii's message of "Sun, 15 Dec 2019 17:26:54 +0200")
Message-ID: <87y2vd9rjn.fsf@mail.linkov.net>

>> But when it's known that the source string was in UTF-8,
>> shouldn't it be reversible?  What is needed for that?
>
> The source string is not in UTF-8, it is in internal Emacs
> representation of strings.

Is internal Emacs representation compatible with UTF-8?
From unknown Tue Jun 17 22:01:21 2025
Subject: bug#38587: base64-decode-region breaks encoding
To: Juri Linkov
Cc: larsi@gnus.org, 38587@debbugs.gnu.org
Date: Mon, 16 Dec 2019 05:28:55 +0200
Message-Id: <83h82154p4.fsf@gnu.org>
From: Eli Zaretskii
In-reply-to: <87y2vd9rjn.fsf@mail.linkov.net> (message from Juri Linkov on Mon, 16 Dec 2019 00:41:48 +0200)
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <83tv61624h.fsf@gnu.org> <87y2vd9rjn.fsf@mail.linkov.net>

> From: Juri Linkov
> Cc: larsi@gnus.org, 38587@debbugs.gnu.org
> Date: Mon, 16 Dec 2019 00:41:48 +0200
>
> >> But when it's known that the source string was in UTF-8,
> >> shouldn't it be reversible?  What is needed for that?
> >
> > The source string is not in UTF-8, it is in internal Emacs
> > representation of strings.
>
> Is internal Emacs representation compatible with UTF-8?

It is (currently).  But it isn't UTF-8, strictly speaking.  In
particular, raw bytes are represented there by 2-byte sequences that
aren't valid UTF-8.

From unknown Tue Jun 17 22:01:21 2025
Subject: bug#38587: base64-decode-region breaks encoding
To: Juri Linkov
Cc: larsi@gnus.org, schwab@linux-m68k.org, 38587@debbugs.gnu.org
Date: Mon, 16 Dec 2019 17:58:29 +0200
Message-Id: <835zig5kka.fsf@gnu.org>
From: Eli Zaretskii
In-reply-to: <87zhft9rl4.fsf@mail.linkov.net> (message from Juri Linkov on Mon, 16 Dec 2019 00:40:55 +0200)
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <87eex66k7h.fsf@hase.home> <87zhft9rl4.fsf@mail.linkov.net>

> From: Juri Linkov
> Date: Mon, 16 Dec 2019 00:40:55 +0200
> Cc: Lars Ingebrigtsen, 38587@debbugs.gnu.org
>
> > BASE64 is defined on a sequence of bytes.  It doesn't make sense to
> > apply it to characters.
>
> But isn't UTF-8 a multibyte encoding represented by a sequence of bytes
> (e.g. when saved to a file)?

When saved to a file, yes.

> Then why base64-encode-region couldn't use the buffer's coding
> to convert the region to a sequence of bytes?

Because it isn't guaranteed that the buffer's encoding is indeed the
right one for this job.

> Also why base64-encode-region accepts region's characters
> only from the charsets ‘eight-bit-control’ and ‘eight-bit-graphic’,
> but not other UTF-8 characters?

Because it wants raw bytes, and only eight-bit charsets fit that
condition.  Eight-bit charset is the charset of raw bytes in a
multibyte buffer or string.  (base64-encode-region can also work on
unibyte buffers and strings, in which case "charset" of such "text"
has no meaning.)

> > The input of base64-encode-region needs to be encoded into bytes and the
> > output of base64-decode-region needs to be decoded into characters.  If
> > you do that, you get a full reversible operation.
>
> I guess base64-encode-region already encodes the region into bytes,
> but only partially - it signals an error on some characters,
> I don't understand why it can't encode all of them.

Once again, because it wants to process only raw bytes.

> But is it still possible to tell base64-decode-region
> about the expected output coding system?  Maybe using
> a prefix arg: C-u M-x base64-decode-region could ask
> for a coding, defaulting to the buffer's coding.

If we want to make such a change, then "C-x RET c" is a better prefix
command, as it is consistent with other commands that accept
coding-system overrides.

> Is there an equivalent of force_encoding('UTF-8') in Emacs?

"C-x RET c utf-8 RET M-x SOME-COMMAND RET"

> Also this doesn't work on the string output:
>
> (decode-coding-string (base64-decode-string (base64-encode-string "ä"))
>                       'utf-8)

It will work if you encode "ä" first:

  (decode-coding-string (base64-decode-string
                         (base64-encode-string
                          (encode-coding-string "ä" 'utf-8)))
                        'utf-8)

From unknown Tue Jun 17 22:01:21 2025
Subject: bug#38587: base64-decode-region breaks encoding
To: Juri Linkov
Cc: Lars Ingebrigtsen, 38587@debbugs.gnu.org
From: Andreas Schwab
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <87eex66k7h.fsf@hase.home> <87zhft9rl4.fsf@mail.linkov.net>
Date: Mon, 16 Dec 2019 17:18:37 +0100
In-Reply-To: <87zhft9rl4.fsf@mail.linkov.net> (Juri Linkov's message of "Mon, 16 Dec 2019 00:40:55 +0200")

On Dec 16 2019, Juri Linkov wrote:

> Also this doesn't work on the string output:
>
> (decode-coding-string (base64-decode-string (base64-encode-string "ä"))
>                       'utf-8)

This works as expected, as the string to be decoded is not a valid UTF-8
sequence.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."
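To see why a lone byte cannot decode as UTF-8, here is a minimal sketch.
It assumes that the un-encoded "ä" reached base64 as the single Latin-1
byte #xE4, which is what the reply above implies; this may differ between
Emacs versions.  The name my-a-umlaut-byte-report is a hypothetical helper
for illustration, not an Emacs function.

```elisp
;; Hypothetical helper: compare the bytes of "ä" under two coding systems.
(defun my-a-umlaut-byte-report ()
  "Return the Latin-1 and UTF-8 bytes of \"ä\" and whether each decodes as UTF-8."
  (let ((latin1-bytes (encode-coding-string "ä" 'iso-8859-1))
        (utf8-bytes (encode-coding-string "ä" 'utf-8)))
    (list (append latin1-bytes nil)     ; bytes as a list of integers
          (append utf8-bytes nil)
          ;; The two UTF-8 bytes decode back to the character.
          (equal (decode-coding-string utf8-bytes 'utf-8) "ä")
          ;; The single byte is not valid UTF-8; it stays a raw byte.
          (equal (decode-coding-string latin1-bytes 'utf-8) "ä"))))

(my-a-umlaut-byte-report)   ; => ((228) (195 164) t nil)
```

Only the two-byte sequence 195 164 round-trips through the utf-8 coding
system, which is why encoding the text to UTF-8 before base64 matters.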
From unknown Tue Jun 17 22:01:21 2025
Subject: bug#38587: base64-decode-region breaks encoding
From: Juri Linkov
Organization: LINKOV.NET
To: Eli Zaretskii
Cc: larsi@gnus.org, schwab@linux-m68k.org, 38587@debbugs.gnu.org
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <87eex66k7h.fsf@hase.home> <87zhft9rl4.fsf@mail.linkov.net> <835zig5kka.fsf@gnu.org>
Date: Mon, 16 Dec 2019 23:51:48 +0200
In-Reply-To: <835zig5kka.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 16 Dec 2019 17:58:29 +0200")
Message-ID: <87r214giaz.fsf@mail.linkov.net>

>> But is it still possible to tell base64-decode-region
>> about the expected output coding system?  Maybe using
>> a prefix arg: C-u M-x base64-decode-region could ask
>> for a coding, defaulting to the buffer's coding.
>
> If we want to make such a change, then "C-x RET c" is a better prefix
> command, as it is consistent with other commands that accept
> coding-system overrides.
>
>> Is there an equivalent of force_encoding('UTF-8') in Emacs?
>
> "C-x RET c utf-8 RET M-x SOME-COMMAND RET"

I see that 'C-x RET c' just sets coding-system-for-read and
coding-system-for-write for the next command, so could
base64-decode-region get coding from these variables?

> It will work if you encode "ä" first:
>
>   (decode-coding-string (base64-decode-string
>                          (base64-encode-string
>                           (encode-coding-string "ä" 'utf-8)))
>                         'utf-8)

Thanks, this works for strings.

My real need was to find a way to decode base64 regions
that were encoded with UTF-8 coding.

First I tried to find such post-processing that would
recover "broken" characters inserted by base64-decode-region.
It seems these characters represent bytes that are parts
of the UTF-8 characters encoded in the UTF-8 buffer
using eight-bit charset.  I failed to find such functions
that would convert the result of base64-decode-region
to UTF-8 characters in the UTF-8 buffer.

So I wrote a replacement of base64-decode-region:

  (defun base64-decode-utf8-region (beg end)
    (interactive "r")
    (replace-region-contents
     beg end
     (lambda ()
       (decode-coding-string
        (base64-decode-string
         (buffer-substring beg end))
        (or coding-system-for-write 'utf-8)))))

But the question remains: is it possible to do the same
in a simpler way without the need to write a new command?

From unknown Tue Jun 17 22:01:21 2025
Subject: bug#38587: base64-decode-region breaks encoding
To: Juri Linkov
Cc: larsi@gnus.org, schwab@linux-m68k.org, 38587@debbugs.gnu.org
Date: Tue, 17 Dec 2019 18:04:16 +0200
Message-Id: <83k16v3pmn.fsf@gnu.org>
From: Eli Zaretskii
In-reply-to: <87r214giaz.fsf@mail.linkov.net> (message from Juri Linkov on Mon, 16 Dec 2019 23:51:48 +0200)
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <87eex66k7h.fsf@hase.home> <87zhft9rl4.fsf@mail.linkov.net> <835zig5kka.fsf@gnu.org> <87r214giaz.fsf@mail.linkov.net>

> From: Juri Linkov
> Cc: schwab@linux-m68k.org, larsi@gnus.org, 38587@debbugs.gnu.org
> Date: Mon, 16 Dec 2019 23:51:48 +0200
>
> >> Is there an equivalent of force_encoding('UTF-8') in Emacs?
> >
> > "C-x RET c utf-8 RET M-x SOME-COMMAND RET"
>
> I see that 'C-x RET c' just sets coding-system-for-read and
> coding-system-for-write for the next command, so could
> base64-decode-region get coding from these variables?

Yes, just access the variable and use the value.

> >   (decode-coding-string (base64-decode-string
> >                          (base64-encode-string
> >                           (encode-coding-string "ä" 'utf-8)))
> >                         'utf-8)
>
> Thanks, this works for strings.
>
> My real need was to find a way to decode base64 regions
> that were encoded with UTF-8 coding.

Then you need just base64-decode-region followed by
decode-coding-region.  Assuming that I understand what you mean,
i.e. that the region you want to decode includes only ASCII characters
and raw bytes (otherwise it is not correct to say that it is "encoded
with UTF-8").

> First I tried to find such post-processing that would
> recover "broken" characters inserted by base64-decode-region.
> It seems these characters represent bytes that are parts
> of the UTF-8 characters encoded in the UTF-8 buffer
> using eight-bit charset.  I failed to find such functions
> that would convert the result of base64-decode-region
> to UTF-8 characters in the UTF-8 buffer.

decode-coding-region should be what you want.  It decodes raw bytes
(a.k.a. "eight-bit charset") into characters.

> So I wrote a replacement of base64-decode-region:
>
>   (defun base64-decode-utf8-region (beg end)
>     (interactive "r")
>     (replace-region-contents
>      beg end
>      (lambda ()
>        (decode-coding-string
>         (base64-decode-string
>          (buffer-substring beg end))
>         (or coding-system-for-write 'utf-8)))))
>
> But the question remains: is it possible to do the same
> in a simpler way without the need to write a new command?

Yes, see above.  In particular, decode-coding-region already knows how
to replace the region with the decoded text.
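The two-step recipe above (base64-decode-region, then
decode-coding-region) might be wrapped up roughly as follows.  This is
only a sketch: the command name my-base64-decode-region-as-text and its
optional CODING argument are hypothetical, and falling back to
coding-system-for-read and then to utf-8 is an assumption based on the
"C-x RET c" suggestion in this thread.

```elisp
;; Hypothetical command; not part of Emacs.
(defun my-base64-decode-region-as-text (beg end &optional coding)
  "Base64-decode the region from BEG to END, then decode the bytes as text.
CODING defaults to `coding-system-for-read' (as set by C-x RET c),
falling back to utf-8."
  (interactive "r")
  (let ((coding (or coding coding-system-for-read 'utf-8)))
    (save-excursion
      (save-restriction
        ;; Narrowing keeps the second step confined to the decoded bytes,
        ;; because the region shrinks after base64 decoding.
        (narrow-to-region beg end)
        (base64-decode-region (point-min) (point-max))
        (decode-coding-region (point-min) (point-max) coding)))))

;; Usage: only the base64 text between the angle brackets is converted.
(with-temp-buffer
  (insert "<4piD>")                     ; "4piD" is base64 of E2 98 83
  (my-base64-decode-region-as-text 2 6 'utf-8)
  (buffer-string))                      ; => "<☃>"
```

Narrowing is used here instead of reusing the original END position,
since END no longer marks the end of the decoded data once the region
has been replaced by fewer bytes.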
From unknown Tue Jun 17 22:01:21 2025
Subject: bug#38587: base64-decode-region breaks encoding
To: Juri Linkov
Cc: Andreas Schwab, 38587@debbugs.gnu.org
From: Lars Ingebrigtsen
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <87eex66k7h.fsf@hase.home> <87zhft9rl4.fsf@mail.linkov.net>
Date: Tue, 17 Dec 2019 17:27:06 +0100
In-Reply-To: <87zhft9rl4.fsf@mail.linkov.net> (Juri Linkov's message of "Mon, 16 Dec 2019 00:40:55 +0200")
Message-ID: <87fthivrxh.fsf@gnus.org>

Juri Linkov writes:

> But is it still possible to tell base64-decode-region
> about the expected output coding system?

Anything is possible, but it doesn't make sense to complicate a function
like that in this manner.  These functions perform a transformation from
one set of octets to a different set of octets, and they have nothing to
do with characters.

We have a bunch of functions in Emacs that work on bytes, and not on
characters.  The way to use them is always (assuming you're starting
with something that is text) to use encode-coding-region first, and
(going in the opposite direction), if you want to end up with something
that is text afterwards, you have to call decode-coding-region
afterwards.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

From unknown Tue Jun 17 22:01:21 2025
Subject: bug#38587: base64-decode-region breaks encoding
From: Juri Linkov
Organization: LINKOV.NET
To: Eli Zaretskii
Cc: larsi@gnus.org, schwab@linux-m68k.org, 38587@debbugs.gnu.org
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <87eex66k7h.fsf@hase.home> <87zhft9rl4.fsf@mail.linkov.net> <835zig5kka.fsf@gnu.org> <87r214giaz.fsf@mail.linkov.net> <83k16v3pmn.fsf@gnu.org>
Date: Wed, 18 Dec 2019 01:10:06 +0200
In-Reply-To: <83k16v3pmn.fsf@gnu.org> (Eli Zaretskii's message of "Tue, 17 Dec 2019 18:04:16 +0200")
Message-ID: <87o8w6zgz5.fsf@mail.linkov.net>

tags 38587 wontfix
close 38587 27.0.50
quit

> decode-coding-region should be what you want.  It decodes raw bytes
> (a.k.a. "eight-bit charset") into characters.

Thanks, I'm using this advice.

  (advice-add 'base64-decode-region :after
              (lambda (beg end &optional _base64url)
                (decode-coding-region beg end buffer-file-coding-system))
              '((name . base64-decode-region-with-buffer-coding)))

So I'm closing this.  Not sure what could be added to documentation.
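The encode-first, decode-last pattern recommended in this thread can be
sketched as a full round trip on a region.  The function name
my-base64-region-round-trip is hypothetical and exists only for this
illustration; the temp buffer and the choice of coding system are
assumptions, not something the thread prescribes.

```elisp
;; Hypothetical helper; not part of Emacs.
(defun my-base64-region-round-trip (text coding)
  "Return TEXT after encoding with CODING, base64, and the reverse steps."
  (with-temp-buffer
    (insert text)
    ;; Text -> raw bytes, so base64 only ever sees bytes.
    (encode-coding-region (point-min) (point-max) coding)
    ;; Raw bytes -> base64 (third argument suppresses line breaks).
    (base64-encode-region (point-min) (point-max) t)
    ;; base64 -> raw bytes.
    (base64-decode-region (point-min) (point-max))
    ;; Raw bytes -> text again.
    (decode-coding-region (point-min) (point-max) coding)
    (buffer-string)))

(my-base64-region-round-trip "ä ☃" 'utf-8)   ; => "ä ☃"
```

The same coding system has to be used on both ends; base64 itself carries
no information about which one was chosen.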
From unknown Tue Jun 17 22:01:21 2025 X-Loop: help-debbugs@gnu.org Subject: bug#38587: base64-decode-region breaks encoding Resent-From: Lars Ingebrigtsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 24 Dec 2019 15:38:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 38587 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: wontfix To: Juri Linkov Cc: Eli Zaretskii , schwab@linux-m68k.org, 38587@debbugs.gnu.org Received: via spool by 38587-submit@debbugs.gnu.org id=B38587.157720184730759 (code B ref 38587); Tue, 24 Dec 2019 15:38:02 +0000 Received: (at 38587) by debbugs.gnu.org; 24 Dec 2019 15:37:27 +0000 Received: from localhost ([127.0.0.1]:52963 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ijmFT-000802-KO for submit@debbugs.gnu.org; Tue, 24 Dec 2019 10:37:27 -0500 Received: from quimby.gnus.org ([95.216.78.240]:43728) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ijmFS-0007zq-IH for 38587@debbugs.gnu.org; Tue, 24 Dec 2019 10:37:27 -0500 Received: from 77.16.52.139.tmi.telenormobil.no ([77.16.52.139] helo=sandy) by quimby.gnus.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1ijmFI-0006vL-L4; Tue, 24 Dec 2019 16:37:19 +0100 From: Lars Ingebrigtsen References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <87eex66k7h.fsf@hase.home> <87zhft9rl4.fsf@mail.linkov.net> <835zig5kka.fsf@gnu.org> <87r214giaz.fsf@mail.linkov.net> <83k16v3pmn.fsf@gnu.org> <87o8w6zgz5.fsf@mail.linkov.net> Date: Tue, 24 Dec 2019 16:37:15 +0100 In-Reply-To: <87o8w6zgz5.fsf@mail.linkov.net> (Juri Linkov's message of "Wed, 18 Dec 2019 01:10:06 +0200") Message-ID: <87r20tenv8.fsf@gnus.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT 
identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. Content preview: Juri Linkov writes: > Thanks, I'm using this advice. > > (advice-add 'base64-decode-region :after > (lambda (beg end &optional _base64url) > (decode-coding-region beg end buffer-file-coding-system)) > '((name . base64-de [...] Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: linkov.net] -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP 0.0 TVD_RCVD_IP Message was received from an IP address -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

Juri Linkov writes:

> Thanks, I'm using this advice.
>
> (advice-add 'base64-decode-region :after
>             (lambda (beg end &optional _base64url)
>               (decode-coding-region beg end buffer-file-coding-system))
>             '((name . base64-decode-region-with-buffer-coding)))

I think in many cases this will work fine, but you will probably have
Emacs double-decode a lot of data, as the other code in Emacs will
normally call decode-coding-region (if you want to end up with text),
and you'll destroy the bits of your Emacs that decode base64 data into
images and the like.  (If I read the advice correctly -- I'm not very
familiar with advising functions.)

> So I'm closing this.  Not sure what could be added to documentation.

Something equivalent to what the manual has to say about it would be
nice.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no

From unknown Tue Jun 17 22:01:21 2025 X-Loop: help-debbugs@gnu.org Subject: bug#38587: base64-decode-region breaks encoding Resent-From: Lars Ingebrigtsen Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 24 Dec 2019 16:14:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 38587 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: wontfix To: Juri Linkov Cc: Eli Zaretskii , schwab@linux-m68k.org, 38587@debbugs.gnu.org Received: via spool by 38587-submit@debbugs.gnu.org id=B38587.15772040371723 (code B ref 38587); Tue, 24 Dec 2019 16:14:02 +0000 Received: (at 38587) by debbugs.gnu.org; 24 Dec 2019 16:13:57 +0000 Received: from localhost ([127.0.0.1]:53002 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ijmon-0000Rj-2E for submit@debbugs.gnu.org; Tue, 24 Dec 2019 11:13:57 -0500 Received: from quimby.gnus.org ([95.216.78.240]:44162) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ijmol-0000RX-TC for 38587@debbugs.gnu.org; Tue, 24 Dec 2019 11:13:56 -0500 Received: from 77.16.52.139.tmi.telenormobil.no ([77.16.52.139] helo=sandy) by quimby.gnus.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1ijmoc-0007DU-Ft; Tue, 24 Dec 2019 17:13:48 +0100 From: Lars Ingebrigtsen References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <87eex66k7h.fsf@hase.home> <87zhft9rl4.fsf@mail.linkov.net> <835zig5kka.fsf@gnu.org> <87r214giaz.fsf@mail.linkov.net> <83k16v3pmn.fsf@gnu.org> <87o8w6zgz5.fsf@mail.linkov.net> <87r20tenv8.fsf@gnus.org> Date: Tue, 24 Dec 2019 17:13:45 +0100 In-Reply-To: <87r20tenv8.fsf@gnus.org> (Lars Ingebrigtsen's message of "Tue, 24 Dec 2019 16:37:15 +0100") Message-ID:
<87a77hem6e.fsf@gnus.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Report: Spam detection software, running on the system "quimby.gnus.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. Content preview: Lars Ingebrigtsen writes: > Something equivalent to what the manual has to say about it would be > nice. I've now added a slightly vague version what the manual says here to the doc strings, but I think it should be factually correct. Content analysis details: (-2.9 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- 0.0 URIBL_BLOCKED ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: gnus.org] -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP 0.0 TVD_RCVD_IP Message was received from an IP address -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] X-Spam-Score: 0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-)

Lars Ingebrigtsen writes:

> Something equivalent to what the manual has to say about it would be
> nice.

I've now added a slightly vague version of what the manual says here to
the doc strings, but I think it should be factually correct.

I also noticed that the encode-coding-region/decode-coding-region doc
strings do not actually say the most important thing -- whether
"encoding" goes from bytes to text or the other way around, so I've now
made this extremely explicit, because I think there's a lot of confusion
in the area, and "encode" and "decode" in themselves don't actually mean
anything.

--
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no
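
As a small illustration of the direction question raised above: in Emacs, "decoding" goes from bytes to characters and "encoding" goes from characters to bytes, while base64 decoding only ever yields bytes. The sketch below uses the string variants of the functions; the name my-coding-direction-demo is made up for this example, and utf-8 is assumed as the coding system.

```elisp
;; Hypothetical demo function, not part of Emacs.
(defun my-coding-direction-demo ()
  "Return a list showing which way each conversion goes."
  (let* ((bytes (base64-decode-string "w6k="))        ; base64 -> raw bytes
         (text  (decode-coding-string bytes 'utf-8))   ; bytes -> characters
         (again (encode-coding-string text 'utf-8)))   ; characters -> bytes
    (list (string-to-list bytes)     ; the two bytes #xC3 #xA9
          (string-to-list text)      ; the single character #xE9
          (equal bytes again))))     ; encoding gives back the same bytes

(my-coding-direction-demo)
;; => ((195 169) (233) t)
```

The middle step is the one the thread is about: without the decode-coding-string call, the result of base64 decoding stays a sequence of raw bytes rather than text.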