GNU bug report logs - #68508
[PATCH] ; (dom-print): Use HTML entities for reserved characters.


Package: emacs

Reported by: Eshel Yaron <me <at> eshelyaron.com>

Date: Tue, 16 Jan 2024 13:26:02 UTC

Severity: normal

Tags: patch

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 68508 in the body.
You can then email your comments to 68508 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#68508; Package emacs. (Tue, 16 Jan 2024 13:26:02 GMT)

Acknowledgement sent to Eshel Yaron <me <at> eshelyaron.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 16 Jan 2024 13:26:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Eshel Yaron <me <at> eshelyaron.com>
To: bug-gnu-emacs <at> gnu.org
Subject: [PATCH] ; (dom-print): Use HTML entities for reserved characters.
Date: Tue, 16 Jan 2024 14:24:40 +0100
Tags: patch

This makes `dom-print` encode HTML reserved characters that occur in
string elements of the DOM, to ensure the validity of the result.

For example, put the following in `foo.html`:

--8<---------------cut here---------------start------------->8---
<html><body>
Add ‘<samp class="samp">&lt;div class="default"&gt; &lt;/div&gt;</samp>’ tags around the fontified body.
</body></html>
--8<---------------cut here---------------end--------------->8---
(Fragment from https://www.gnu.org/software/emacs/manual/html_mono/htmlfontify.html)

Open that file in Emacs and say `M-: (require 'dom)` and then
`(dom-print (libxml-parse-html-region))` in the HTML buffer.  This
produces invalid HTML since `libxml-parse-html-region` correctly decodes
HTML entities, but `dom-print` doesn't encode (without this patch).
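
To illustrate the round trip described above, here is a minimal Emacs Lisp sketch of escaping reserved characters in a DOM text node.  The helper `my-dom-escape-text` and the variable `my-dom-example` are hypothetical names used only for illustration; this is not the code from the attached patch, and it assumes that escaping `&`, `<` and `>` is enough for text nodes.

```elisp
(require 'dom)

;; Hypothetical helper for illustration; not taken from the patch.
(defun my-dom-escape-text (string)
  "Return STRING with the HTML reserved characters &, < and > escaped."
  (replace-regexp-in-string
   "[&<>]"
   (lambda (char)
     (pcase char
       ("&" "&amp;")
       ("<" "&lt;")
       (">" "&gt;")))
   ;; FIXEDCASE and LITERAL are both t, so the replacement text is
   ;; inserted exactly as returned by the lambda.
   string t t))

;; A DOM node shaped like the one the parser returns for the <samp>
;; element above: the entities in its text child are already decoded.
(defvar my-dom-example
  (dom-node 'samp '((class . "samp")) "<div class=\"default\"> </div>"))

;; Re-encode the text child before it is written back out as HTML.
(my-dom-escape-text (car (dom-children my-dom-example)))
;; => "&lt;div class=\"default\"&gt; &lt;/div&gt;"
```

Inserting the escaped string rather than the raw one means the output can be parsed back into the same DOM.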



[0001-dom-print-Use-HTML-entities-for-reserved-characters.patch (text/patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68508; Package emacs. (Tue, 16 Jan 2024 13:48:02 GMT)

Message #8 received at 68508 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Eshel Yaron <me <at> eshelyaron.com>
Cc: 68508 <at> debbugs.gnu.org
Subject: Re: bug#68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters.
Date: Tue, 16 Jan 2024 15:47:30 +0200
> Date: Tue, 16 Jan 2024 14:24:40 +0100
> From:  Eshel Yaron via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs <at> gnu.org>
> 
> This makes `dom-print` encode HTML reserved characters that occur in
> string elements of the DOM, to ensure the validity of the result.
> 
> For example, put the following in `foo.html`:
> 
> --8<---------------cut here---------------start------------->8---
> <html><body>
> Add ‘<samp class="samp">&lt;div class="default"&gt; &lt;/div&gt;</samp>’ tags around the fontified body.
> </body></html>
> --8<---------------cut here---------------end--------------->8---
> (Fragment from https://www.gnu.org/software/emacs/manual/html_mono/htmlfontify.html)
> 
> Open that file in Emacs and say `M-: (require 'dom)` and then
> `(dom-print (libxml-parse-html-region))` in the HTML buffer.  This
> produces invalid HTML since `libxml-parse-html-region` correctly decodes
> HTML entities, but `dom-print` doesn't encode (without this patch).

Thanks, but could you please also add tests for this?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#68508; Package emacs. (Tue, 16 Jan 2024 16:30:02 GMT)

Message #11 received at 68508 <at> debbugs.gnu.org:

From: Eshel Yaron <me <at> eshelyaron.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 68508 <at> debbugs.gnu.org
Subject: Re: bug#68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters.
Date: Tue, 16 Jan 2024 17:29:12 +0100
Eli Zaretskii <eliz <at> gnu.org> writes:

> Thanks, but could you please also add tests for this?

Sure, I've added a test to dom-tests.el in the updated patch below.

[v2-0001-Use-HTML-entities-for-reserved-characters-in-dom-.patch (text/x-patch, attachment)]

Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Sat, 20 Jan 2024 09:43:02 GMT)

Notification sent to Eshel Yaron <me <at> eshelyaron.com>:
Bug acknowledged by developer. (Sat, 20 Jan 2024 09:43:02 GMT)

Message #16 received at 68508-done <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Eshel Yaron <me <at> eshelyaron.com>
Cc: 68508-done <at> debbugs.gnu.org
Subject: Re: bug#68508: [PATCH] ; (dom-print): Use HTML entities for reserved characters.
Date: Sat, 20 Jan 2024 11:42:07 +0200
> From: Eshel Yaron <me <at> eshelyaron.com>
> Cc: 68508 <at> debbugs.gnu.org
> Date: Tue, 16 Jan 2024 17:29:12 +0100
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > Thanks, but could you please also add tests for this?
> 
> Sure, I've added a test to dom-tests.el in the updated patch below.

Thanks, installed on master, and closing the bug.




Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 17 Feb 2024 12:24:05 GMT)
