From unknown Sat Aug 09 01:11:24 2025 X-Loop: help-debbugs@gnu.org Subject: bug#68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters. Resent-From: Eshel Yaron Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 16 Jan 2024 13:26:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: report 68508 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: 68508@debbugs.gnu.org X-Debbugs-Original-To: bug-gnu-emacs@gnu.org Received: via spool by submit@debbugs.gnu.org id=B.17054115029333 (code B ref -1); Tue, 16 Jan 2024 13:26:02 +0000 Received: (at submit) by debbugs.gnu.org; 16 Jan 2024 13:25:02 +0000 Received: from localhost ([127.0.0.1]:48136 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rPjR7-0002QI-Ia for submit@debbugs.gnu.org; Tue, 16 Jan 2024 08:25:02 -0500 Received: from lists.gnu.org ([2001:470:142::17]:57910) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rPjR5-0002PP-UH for submit@debbugs.gnu.org; Tue, 16 Jan 2024 08:25:00 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rPjQr-0001qO-P2 for bug-gnu-emacs@gnu.org; Tue, 16 Jan 2024 08:24:46 -0500 Received: from mail.eshelyaron.com ([107.175.124.16] helo=eshelyaron.com) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rPjQq-0003B5-2c for bug-gnu-emacs@gnu.org; Tue, 16 Jan 2024 08:24:45 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=eshelyaron.com; s=mail; t=1705411482; bh=BWDmJeXL3mIhepGxnw21xQ7nUSyoFpSh1at1RAFG7F8=; h=From:To:Subject:Date:From; b=oZ4tDIAPoZp0Kr7HMmN3hcJWKRTDH/b4CmKNF8lGfptsdebaX+9Rsims95Kzsa1tj RBNmaWOxZ02OAdi24iYSwtyM/DJkt90QgYWnXaW1kNpMu6z6vTvaSCI2JnjvRMb+6+ PwevQ8e3nvCtL1itF8IMqsGBu+MgLKsf/kT3zEPoqupEeSEs9KLuCqadNarKggZ1Ce 
41VOAg6OfNJgM1nRTXoyEvf2716JOCl45UZxsrOVu+G76UkvSicHae9aQ3TOXW+SXt 4S1NOuVcCPW3Q5yTk5K9PnEthzsvpmfM6MGiOSjDCCr2xxjJlwGybk3M59Im8nI48u vpKQfSizvJRbg== From: Eshel Yaron X-Hashcash: 1:20:240116:bug-gnu-emacs@gnu.org::HrVEa/zHsm5/RWZj:1atS Date: Tue, 16 Jan 2024 14:24:40 +0100 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Received-SPF: pass client-ip=107.175.124.16; envelope-from=me@eshelyaron.com; helo=eshelyaron.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Score: 0.9 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -0.1 (/) --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Tags: patch This makes `dom-print` encode HTML reserved characters that occur in string elements of the DOM, to ensure the validity of the result. For example, put the following in `foo.html`: --8<---------------cut here---------------start------------->8--- Add =E2=80=98&lt;div class=3D"default"&gt; &lt;/div&gt;=E2=80=99 tags around the fontified body. --8<---------------cut here---------------end--------------->8--- (Fragment from https://www.gnu.org/software/emacs/manual/html_mono/htmlfontify.html) Open that file in Emacs and say `M-: (require 'dom)` and then `(dom-print (libxml-parse-html-region))` in the HTML buffer. This produces invalid HTML since `libxml-parse-html-region` correctly decodes HTML entities, but `dom-print` doesn't encode (without this patch). 
--=-=-= Content-Type: text/patch Content-Disposition: attachment; filename=0001-dom-print-Use-HTML-entities-for-reserved-characters.patch >From 259c0138623c352acc7bcd79a1fda42ec606a0cf Mon Sep 17 00:00:00 2001 From: Eshel Yaron Date: Fri, 5 Jan 2024 16:40:44 +0100 Subject: [PATCH] ; (dom-print): Use HTML entities for reserved characters. --- lisp/dom.el | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lisp/dom.el b/lisp/dom.el index f7043ba8252..b329379fdc3 100644 --- a/lisp/dom.el +++ b/lisp/dom.el @@ -288,7 +288,7 @@ dom-print (insert ">") (dolist (child children) (if (stringp child) - (insert child) + (insert (url-insert-entities-in-string child)) (setq non-text t) (when pretty (insert "\n" (make-string (+ column 2) ?\s))) -- 2.42.0 --=-=-=-- From unknown Sat Aug 09 01:11:24 2025 X-Loop: help-debbugs@gnu.org Subject: bug#68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters. Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 16 Jan 2024 13:48:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 68508 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eshel Yaron Cc: 68508@debbugs.gnu.org Received: via spool by 68508-submit@debbugs.gnu.org id=B68508.170541287132163 (code B ref 68508); Tue, 16 Jan 2024 13:48:02 +0000 Received: (at 68508) by debbugs.gnu.org; 16 Jan 2024 13:47:51 +0000 Received: from localhost ([127.0.0.1]:48169 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rPjnD-0008Mf-Bg for submit@debbugs.gnu.org; Tue, 16 Jan 2024 08:47:51 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:38138) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rPjnA-0008Lq-QN for 68508@debbugs.gnu.org; Tue, 16 Jan 2024 08:47:50 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from 
) id 1rPjn4-00007I-HA; Tue, 16 Jan 2024 08:47:42 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=MIME-version:References:Subject:In-Reply-To:To:From: Date; bh=Yu4l3cR79Ujp1dCVeLGTtRmZmix7+4nopg++CVdS6r8=; b=lpUb/3Qr34ENDRtRbn8d tAh5XS5ivY4E6W6fA//BBYiTG/9uF+RfEoFcjMzrgy8D3D9SgE9OyKffM5sMJ9DTiEYe3PXdvjy+l EgSn0fxaS2JHKDo7n8srglzgAxILPym4o5/cI205uTU4/RpSNj+YUG8X7CtHup/OjPbGdY1koA1am 1GwN1eFQfLPWk9CtZie/Zu8orE1F+xP3Iq5b0CVYkqpibaFYuLaAbCoC99TDjLFMWT1yGZaG+GVla owuKs2ckKuY5Tc1mEsM+1rf+EAUgOF8a1cWTRpyJvALZlGmlOgJkifgEaWlEEuSgBPN1sbsvwj7DW IZAwBxzyl55rIQ==; Date: Tue, 16 Jan 2024 15:47:30 +0200 Message-Id: <837ck9crv1.fsf@gnu.org> From: Eli Zaretskii In-Reply-To: (bug-gnu-emacs@gnu.org) References: MIME-version: 1.0 Content-type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Spam-Score: -2.3 (--) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > Date: Tue, 16 Jan 2024 14:24:40 +0100 > From: Eshel Yaron via "Bug reports for GNU Emacs, > the Swiss army knife of text editors" > > This makes `dom-print` encode HTML reserved characters that occur in > string elements of the DOM, to ensure the validity of the result. > > For example, put the following in `foo.html`: > > --8<---------------cut here---------------start------------->8--- > > Add ‘&lt;div class="default"&gt; &lt;/div&gt;’ tags around the fontified body. > > --8<---------------cut here---------------end--------------->8--- > (Fragment from https://www.gnu.org/software/emacs/manual/html_mono/htmlfontify.html) > > Open that file in Emacs and say `M-: (require 'dom)` and then > `(dom-print (libxml-parse-html-region))` in the HTML buffer. 
This > produces invalid HTML since `libxml-parse-html-region` correctly decodes > HTML entities, but `dom-print` doesn't encode (without this patch). Thanks, but could you please also add tests for this? From unknown Sat Aug 09 01:11:24 2025 X-Loop: help-debbugs@gnu.org Subject: bug#68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters. Resent-From: Eshel Yaron Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Tue, 16 Jan 2024 16:30:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 68508 X-GNU-PR-Package: emacs X-GNU-PR-Keywords: patch To: Eli Zaretskii Cc: 68508@debbugs.gnu.org Received: via spool by 68508-submit@debbugs.gnu.org id=B68508.170542255923517 (code B ref 68508); Tue, 16 Jan 2024 16:30:02 +0000 Received: (at 68508) by debbugs.gnu.org; 16 Jan 2024 16:29:19 +0000 Received: from localhost ([127.0.0.1]:49551 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rPmJS-00067E-L2 for submit@debbugs.gnu.org; Tue, 16 Jan 2024 11:29:18 -0500 Received: from mail.eshelyaron.com ([107.175.124.16]:32908 helo=eshelyaron.com) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rPmJQ-000676-1i for 68508@debbugs.gnu.org; Tue, 16 Jan 2024 11:29:17 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=eshelyaron.com; s=mail; t=1705422554; bh=Npq7LYejLn14g+vyCAgJKyUcK+jMduKopQRSs4MH5rI=; h=From:To:Cc:Subject:In-Reply-To:References:Date:From; b=WLZbF7dIuD7+gFwMRfRO05JZySJdimcbTMjjnSUEvXQdhNU2Sb9qnn2BlRXhREoBD jMR5FZvu83cAzBAlfNBfkGk1wvnTNKYEUSQUDdsyaMf9O/6DkR16XcVHU058eHdf0X 7CDaf7FNbnfqTgSGtdS2b3J5+LM06q1LP6IHOUEOLJa2s6rKHpM7Oysbc5y+sgZg3p UOnVPFyx/6g2YeWvxmfPxkpZPWuga+p3nqAE6TBwZUvfXfLY3bOfpt9Fev1DnVBTmi 5sO/UMOvlQGj6o9Ia2DUCxy7jJR35oWA4URIhMqTKpA0ggh5PNEM/Nv1oYWn4n1F59 2+g1UAUXcnbdw== From: Eshel Yaron In-Reply-To: <837ck9crv1.fsf@gnu.org> (Eli Zaretskii's message of "Tue, 16 Jan 2024 15:47:30 +0200") References: 
<837ck9crv1.fsf@gnu.org> X-Hashcash: 1:20:240116:eliz@gnu.org::islpTNJAJAN3zb1j:cto X-Hashcash: 1:20:240116:68508@debbugs.gnu.org::HiP52M1Ee0dtB/Hb:604j Date: Tue, 16 Jan 2024 17:29:12 +0100 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Spam-Score: -0.0 (/) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -1.0 (-) --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Eli Zaretskii writes: >> Date: Tue, 16 Jan 2024 14:24:40 +0100 >> From: Eshel Yaron via "Bug reports for GNU Emacs, >> the Swiss army knife of text editors" >> >> This makes `dom-print` encode HTML reserved characters that occur in >> string elements of the DOM, to ensure the validity of the result. >> >> For example, put the following in `foo.html`: >> >> --8<---------------cut here---------------start------------->8--- >> >> Add =E2=80=98&lt;div class=3D"default"&gt; &lt;/div&gt;=E2=80=99 tags around the fontified body. >> >> --8<---------------cut here---------------end--------------->8--- >> (Fragment from https://www.gnu.org/software/emacs/manual/html_mono/htmlfontify.html) >> >> Open that file in Emacs and say `M-: (require 'dom)` and then >> `(dom-print (libxml-parse-html-region))` in the HTML buffer. This >> produces invalid HTML since `libxml-parse-html-region` correctly decodes >> HTML entities, but `dom-print` doesn't encode (without this patch). > > Thanks, but could you please also add tests for this? Sure, I've added a test to dom-tests.el in the updated patch below. 
--=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=v2-0001-Use-HTML-entities-for-reserved-characters-in-dom-.patch >From 8d60074053ee1ebc04fc3fda417d53ddc5a4fac9 Mon Sep 17 00:00:00 2001 From: Eshel Yaron Date: Fri, 5 Jan 2024 16:40:44 +0100 Subject: [PATCH v2] ; Use HTML entities for reserved characters in 'dom-print' * lisp/dom.el (dom-print): Encode HTML reserved characters in strings. * test/lisp/dom-tests.el (dom-tests-print): New test. (Bug#68508) --- lisp/dom.el | 2 +- test/lisp/dom-tests.el | 10 ++++++++++ 2 files changed, 11 insertions(+), 1 deletion(-) diff --git a/lisp/dom.el b/lisp/dom.el index f7043ba8252..b329379fdc3 100644 --- a/lisp/dom.el +++ b/lisp/dom.el @@ -288,7 +288,7 @@ dom-print (insert ">") (dolist (child children) (if (stringp child) - (insert child) + (insert (url-insert-entities-in-string child)) (setq non-text t) (when pretty (insert "\n" (make-string (+ column 2) ?\s))) diff --git a/test/lisp/dom-tests.el b/test/lisp/dom-tests.el index 8cbfb9ad9df..a4e913541bf 100644 --- a/test/lisp/dom-tests.el +++ b/test/lisp/dom-tests.el @@ -209,6 +209,16 @@ dom-tests-pp (dom-pp node t) (should (equal (buffer-string) "(\"foo\" nil)"))))) +(ert-deftest dom-tests-print () + "Test that `dom-print' correctly encodes HTML reserved characters." + (with-temp-buffer + (dom-print '(samp ((class . "samp")) "<div class=\"default\"> </div>"))
+ (should (equal + (buffer-string) + (concat "<samp class=\"samp\">" + "&lt;div class=&quot;default&quot;&gt; &lt;/div&gt;" + "</samp>"))))) + (ert-deftest dom-test-search () (let ((dom '(a nil (b nil (c nil))))) (should (equal (dom-search dom (lambda (d) (eq (dom-tag d) 'a))) -- 2.42.0 --=-=-=-- From unknown Sat Aug 09 01:11:24 2025 MIME-Version: 1.0 X-Mailer: MIME-tools 5.505 (Entity 5.505) X-Loop: help-debbugs@gnu.org From: help-debbugs@gnu.org (GNU bug Tracking System) To: Eshel Yaron Subject: bug#68508: closed (Re: bug#68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters.) Message-ID: References: <83frystk7k.fsf@gnu.org> X-Gnu-PR-Message: they-closed 68508 X-Gnu-PR-Package: emacs X-Gnu-PR-Keywords: patch Reply-To: 68508@debbugs.gnu.org Date: Sat, 20 Jan 2024 09:43:02 +0000 Content-Type: multipart/mixed; boundary="----------=_1705743782-6825-1" This is a multi-part message in MIME format... ------------=_1705743782-6825-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Your bug report #68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters. which was filed against the emacs package, has been closed. The explanation is attached below, along with your original report. If you require more details, please reply to 68508@debbugs.gnu.org. 
--=20 68508: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=3D68508 GNU Bug Tracking System Contact help-debbugs@gnu.org with problems ------------=_1705743782-6825-1 Content-Type: message/rfc822 Content-Disposition: inline Content-Transfer-Encoding: 7bit Received: (at 68508-done) by debbugs.gnu.org; 20 Jan 2024 09:42:37 +0000 Received: from localhost ([127.0.0.1]:60869 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rR7s5-0001lS-5s for submit@debbugs.gnu.org; Sat, 20 Jan 2024 04:42:37 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:43884) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1rR7s3-0001lD-5r for 68508-done@debbugs.gnu.org; Sat, 20 Jan 2024 04:42:35 -0500 Received: from fencepost.gnu.org ([2001:470:142:3::e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rR7ru-000108-I0; Sat, 20 Jan 2024 04:42:26 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=IGv43dayNMsM4OF7imoLWbVlhkwxSLkce3DEqTpPnuE=; b=c/3w7n64fkTm fMTjiNTOYbyXLzByzFnquVIlp/CC2w9B4CGDp1hoWQRASuPyabGeA8Nr7scrZYBNdLg1UT/qLiD1E LaAK3UHllSdrWRtDlBEMTS7jrDf+3iv7hYK3NSy7ux6s8uNdEV3SVqrZzDi1a7nTluPEhCJrs18uc qvXcMmrJ3vIQXNK1YAoVRC/L51AxNaW9d2O4YM+z0XupObQSIrzXQ2pyWsQJ0V35iJHZQEwxEE2WX o3jplZd2XDLTzUhEG938e2UuxVfsBwrQDIIMBpOPKOqQ/f/O5mQac6FCabB4kZ+dRjHwgZQI10SIV p3ORpt1RlZvW/+EmC4+XZg==; Date: Sat, 20 Jan 2024 11:42:07 +0200 Message-Id: <83frystk7k.fsf@gnu.org> From: Eli Zaretskii To: Eshel Yaron In-Reply-To: (message from Eshel Yaron on Tue, 16 Jan 2024 17:29:12 +0100) Subject: Re: bug#68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters. 
References: <837ck9crv1.fsf@gnu.org> X-Spam-Score: -2.3 (--) X-Debbugs-Envelope-To: 68508-done Cc: 68508-done@debbugs.gnu.org X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: debbugs-submit-bounces@debbugs.gnu.org Sender: "Debbugs-submit" X-Spam-Score: -3.3 (---) > From: Eshel Yaron > Cc: 68508@debbugs.gnu.org > Date: Tue, 16 Jan 2024 17:29:12 +0100 > > Eli Zaretskii writes: > > > Thanks, but could you please also add tests for this? > > Sure, I've added a test to dom-tests.el in the updated patch below. Thanks, installed on master, and closing the bug. ------------=_1705743782-6825-1--