GNU bug report logs - #77299
eww-auto-rename-buffer 'title interaction with eww-readable-urls
To reply to this bug, email your comments to 77299 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Thu, 27 Mar 2025 04:48:02 GMT)
Acknowledgement sent to Keith Amidon <camalot <at> picnicpark.org>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Thu, 27 Mar 2025 04:48:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
I'm trying to use eww with:
  (setq eww-auto-rename-buffer 'title
        eww-readable-urls '((".*" . t)))
I'm expecting that this will result in buffer names reflecting
the titles of the HTML pages. However, what I actually see is
that all buffers get named "Untitled # EWW", made unique by
uniquify.
This seems like a bug to me.
I believe it arises because the title element of eww-data is set
to the empty string by eww--before-browse and then set to the
actual title when shr-insert-document calls the eww-tag-title
function via the shr-external-rendering-functions set in
eww-display-document.
I think that in the readable case, the content passed to
shr-insert-document does not include the title element so
it isn't available from rendering that content.
It appears to me that eww-readable preserves the title
in the eww-data plist by resetting it after calling
eww--before-browse. And in fact, if I set eww-readable-urls to
nil and go to a site, the buffer is properly named based on
the title. If I then call eww-readable, it remains named
based on the title.
The default readability based on eww-readable-urls appears
to happen in eww-display-html, which itself calls
eww-display-document. The full document hasn't been rendered
at this point so there is no existing title to preserve.
I don't think there is a simple fix for this. The fundamental
problem is that eww only gets the title by rendering the
original document and in the case of default readable URLs
it never renders the original document. I thought it might be
possible to hack eww-score-readability to include the title
in the readable version but it appears that it is omitting the
entire head section currently, so it would need fairly
extensive changes to selectively pass through the title
without any other head elements.
Help with this would be appreciated. Thanks! --- Keith
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Thu, 27 Mar 2025 08:23:03 GMT)
Message #8 received at 77299 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 26 Mar 2025 11:04:49 -0700
> From: Keith Amidon <camalot <at> picnicpark.org>
>
> I'm trying to use eww with:
>
>   (setq eww-auto-rename-buffer 'title
>         eww-readable-urls '((".*" . t)))
>
> I'm expecting that this will result in buffer names reflecting
> the titles of the HTML pages. However, what I actually see is
> that all buffers get named "Untitled # EWW", made unique by
> uniquify.
>
> This seems like a bug to me.
>
> I believe it arises because the title element of eww-data is set
> to the empty string by eww--before-browse and then set to the
> actual title when shr-insert-document calls the eww-tag-title
> function via the shr-external-rendering-functions set in
> eww-display-document.
>
> I think that in the readable case, the content passed to
> shr-insert-document does not include the title element so
> it isn't available from rendering that content.
>
> It appears to me that eww-readable preserves the title
> in the eww-data plist by resetting it after calling
> eww--before-browse. And in fact, if I set eww-readable-urls to
> nil and go to a site, the buffer is properly named based on
> the title. If I then call eww-readable, it remains named
> based on the title.
>
> The default readability based on eww-readable-urls appears
> to happen in eww-display-html, which itself calls
> eww-display-document. The full document hasn't been rendered
> at this point so there is no existing title to preserve.
>
> I don't think there is a simple fix for this. The fundamental
> problem is that eww only gets the title by rendering the
> original document and in the case of default readable URLs
> it never renders the original document. I thought it might be
> possible to hack eww-score-readability to include the title
> in the readable version but it appears that it is omitting the
> entire head section currently, so it would need fairly
> extensive changes to selectively pass through the title
> without any other head elements.
I agree with your analysis: it's a design problem, which prevents
shr-insert-document from calling eww-tag-title, and thus the title
remains the empty string. IOW, eww-readable-urls is currently
incompatible with eww-auto-rename-buffer, and moreover, it causes the
URL to be shown as "untitled", even if eww-auto-rename-buffer is not
used.
Hopefully, someone who knows this code and how shr.el works will be
able to come up with a solution. One possible way is to find the
title before calling shr-insert-document, and then manually set
the :title property of eww-data when shr-insert-document returns.
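
To make the suggestion above concrete, here is a minimal sketch of pulling
the title out of an already-parsed DOM and storing it in a plist shaped like
eww-data. It is not code from eww.el: the my/ function names are
hypothetical, and it assumes the DOM uses the usual (TAG ATTRS . CHILDREN)
node layout that dom.el works with.

```elisp
;;; -*- lexical-binding: t -*-
(require 'dom)
(require 'seq)
(require 'subr-x)

;; Hypothetical helper, not part of eww.el.
(defun my/dom-title (dom)
  "Return the trimmed text of the first <title> node in DOM, or nil."
  (when-let* ((node (car (dom-by-tag dom 'title))))
    (string-trim
     ;; Join only the direct text children of the <title> node.
     (mapconcat #'identity
                (seq-filter #'stringp (dom-children node))
                ""))))

;; Hypothetical helper: DATA stands in for a plist like `eww-data'.
(defun my/plist-with-title (data dom)
  "Return DATA with :title set from DOM when DOM has a non-empty title."
  (let ((title (my/dom-title dom)))
    (if (and title (not (string-empty-p title)))
        (plist-put data :title title)
      data)))
```

A caller would compute the title from the full document before handing the
reduced "readable" node to the renderer, then apply it afterwards; where
exactly that belongs in eww is left open here.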
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Tue, 13 May 2025 14:28:01 GMT)
Message #11 received at 77299 <at> debbugs.gnu.org (full text, mbox):
On 3/27/25 01:22, Eli Zaretskii wrote:
>> Date: Wed, 26 Mar 2025 11:04:49 -0700
>> From: Keith Amidon <camalot <at> picnicpark.org>
>>
>> I'm trying to use eww with:
>>
>>   (setq eww-auto-rename-buffer 'title
>>         eww-readable-urls '((".*" . t)))
>>
>> I'm expecting that this will result in buffer names reflecting
>> the titles of the HTML pages. However, what I actually see is
>> that all buffers get named "Untitled # EWW", made unique by
>> uniquify.
>>
>> This seems like a bug to me.
>>
>> I believe it arises because the title element of eww-data is set
>> to the empty string by eww--before-browse and then set to the
>> actual title when shr-insert-document calls the eww-tag-title
>> function via the shr-external-rendering-functions set in
>> eww-display-document.
>>
>> I think that in the readable case, the content passed to
>> shr-insert-document does not include the title element so
>> it isn't available from rendering that content.
>>
>> It appears to me that eww-readable preserves the title
>> in the eww-data plist by resetting it after calling
>> eww--before-browse. And in fact, if I set eww-readable-urls to
>> nil and go to a site, the buffer is properly named based on
>> the title. If I then call eww-readable, it remains named
>> based on the title.
>>
>> The default readability based on eww-readable-urls appears
>> to happen in eww-display-html, which itself calls
>> eww-display-document. The full document hasn't been rendered
>> at this point so there is no existing title to preserve.
>>
>> I don't think there is a simple fix for this. The fundamental
>> problem is that eww only gets the title by rendering the
>> original document and in the case of default readable URLs
>> it never renders the original document. I thought it might be
>> possible to hack eww-score-readability to include the title
>> in the readable version but it appears that it is omitting the
>> entire head section currently, so it would need fairly
>> extensive changes to selectively pass through the title
>> without any other head elements.
> I agree with your analysis: it's a design problem, which prevents
> shr-insert-document from calling eww-tag-title, and thus the title
> remains the empty string. IOW, eww-readable-urls is currently
> incompatible with eww-auto-rename-buffer, and moreover, it causes the
> URL to be shown as "untitled", even if eww-auto-rename-buffer is not
> used.
>
> Hopefully, someone who knows this code and how shr.el works will be
> able to come up with a solution. One possible way is to find the
> title before calling shr-insert-document, and then manually set
> the :title property of eww-data when shr-insert-document returns.
>
Sorry it took so long for me to get back to looking into this more
and thus for the long quote above to re-establish context. I have
found that I seem to be able to get eww-auto-rename-buffer 'title,
history titles, and org link capture to work with default readable
URLs in eww-readable-urls by redefining eww-display-html to:
  (defun eww-display-html (charset url &optional document point buffer)
    (let ((source (buffer-substring (point) (point-max))))
      (with-current-buffer buffer
        (plist-put eww-data :source source)))
    (eww-display-document
     (or document
         (eww-document-base
          url (eww--parse-html-region (point) (point-max) charset)))
     point buffer)
    (and (null document)
         (eww-default-readable-p url)
         (with-current-buffer buffer
           (eww-readable 1))))
This is somewhat less efficient than the prior implementation in that
the document gets displayed twice for default readable URLs, but it is
no worse than not having default readability and manually toggling
readability after the page is rendered.
I haven't noticed any downsides to this redefinition yet, but I've
only been playing around with it for a morning so far. Given it
reduces the conflict between eww-readable-urls and multiple other
features, it seems worth considering.
--- Keith
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Mon, 09 Jun 2025 13:40:04 GMT)
Message #14 received at 77299 <at> debbugs.gnu.org (full text, mbox):
A quick update on this: I've been using the version of eww-display-html
quoted below regularly since I last commented on the bug about a month
ago and think it is working fine. If the approach seems acceptable, I'd
be happy to prepare a patch to send to emacs-devel. --- Keith
On 5/13/25 07:27, Keith Amidon wrote:
> Sorry it took so long for me to get back to looking into this more
> and thus for the long quote above to re-establish context. I have
> found that I seem to be able to get eww-auto-rename-buffer 'title,
> history titles, and org link capture to work with default readable
> URLs in eww-readable-urls by redefining eww-display-html to:
>   (defun eww-display-html (charset url &optional document point buffer)
>     (let ((source (buffer-substring (point) (point-max))))
>       (with-current-buffer buffer
>         (plist-put eww-data :source source)))
>     (eww-display-document
>      (or document
>          (eww-document-base
>           url (eww--parse-html-region (point) (point-max) charset)))
>      point buffer)
>     (and (null document)
>          (eww-default-readable-p url)
>          (with-current-buffer buffer
>            (eww-readable 1))))
> This is somewhat less efficient than the prior implementation in that
> the document gets displayed twice for default readable URLs, but it is
> no worse than not having default readability and manually toggling
> readability after the page is rendered.
>
> I haven't noticed any downsides to this redefinition yet, but I've
> only been playing around with it for a morning so far. Given it
> reduces the conflict between eww-readable-urls and multiple other
> features, it seems worth considering.
>
> --- Keith
>
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Wed, 11 Jun 2025 11:41:05 GMT)
Message #17 received at 77299 <at> debbugs.gnu.org (full text, mbox):
> Date: Mon, 9 Jun 2025 06:38:53 -0700
> From: Keith Amidon <camalot <at> picnicpark.org>
> Cc: 77299 <at> debbugs.gnu.org
>
> A quick update on this: I've been using the version of eww-display-html quoted below regularly since I last
> commented on the bug about a month ago and think it is working fine. If the approach seems acceptable, I'd
> be happy to prepare a patch to send to emacs-devel. --- Keith
Please do post a patch, and thanks.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Wed, 11 Jun 2025 17:12:02 GMT)
Message #20 received at 77299 <at> debbugs.gnu.org (full text, mbox):
On 6/9/2025 6:38 AM, Keith Amidon wrote:
> A quick update on this: I've been using the version of eww-display-html
> quoted below regularly since I last commented on the bug about a month
> ago and think it is working fine. If the approach seems acceptable, I'd
> be happy to prepare a patch to send to emacs-devel. --- Keith
I think I'd prefer a solution that doesn't require a workaround like
this. If we could get a fix that resolves this issue without the
workaround, I think that would be best. The code would be more
maintainable, and we'd likely have fewer problems in this area in the
future.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Wed, 11 Jun 2025 19:18:02 GMT)
Message #23 received at 77299 <at> debbugs.gnu.org (full text, mbox):
On 6/11/2025 10:11 AM, Jim Porter wrote:
> I think I'd prefer a solution that doesn't require a workaround like
> this. If we could get a fix that resolves this issue without the
> workaround, I think that would be best.
That was a bit redundant. Serves me right to try and reply while I'm
distracted. But you get the idea...
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Wed, 11 Jun 2025 19:31:01 GMT)
Message #26 received at 77299 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 11 Jun 2025 10:11:03 -0700
> Cc: 77299 <at> debbugs.gnu.org
> From: Jim Porter <jporterbugs <at> gmail.com>
>
> On 6/9/2025 6:38 AM, Keith Amidon wrote:
> > A quick update on this: I've been using the version of eww-display-html
> > quoted below regularly since I last commented on the bug about a month
> > ago and think it is working fine. If the approach seems acceptable, I'd
> > be happy to prepare a patch to send to emacs-devel. --- Keith
>
> I think I'd prefer a solution that doesn't require a workaround like
> this.
What workaround are you alluding to?
> If we could get a fix that resolves this issue without the
> workaround, I think that would be best.
Feel free to suggest a way if you see it. Both Keith and myself
looked at the code and concluded that it would be impossible without
completely rewriting this functionality. See
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=77299#8
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Wed, 11 Jun 2025 22:18:03 GMT)
Message #29 received at 77299 <at> debbugs.gnu.org (full text, mbox):
On 6/11/2025 12:30 PM, Eli Zaretskii wrote:
>> Date: Wed, 11 Jun 2025 10:11:03 -0700
>> Cc: 77299 <at> debbugs.gnu.org
>> From: Jim Porter <jporterbugs <at> gmail.com>
>>
>> On 6/9/2025 6:38 AM, Keith Amidon wrote:
>>> A quick update on this: I've been using the version of eww-display-html
>>> quoted below regularly since I last commented on the bug about a month
>>> ago and think it is working fine. If the approach seems acceptable, I'd
>>> be happy to prepare a patch to send to emacs-devel. --- Keith
>>
>> I think I'd prefer a solution that doesn't require a workaround like
>> this.
>
> What workaround are you alluding to?
Rendering the original document and immediately re-rendering the
"readable" form of the document.
>> If we could get a fix that resolves this issue without the
>> workaround, I think that would be best.
>
> Feel free to suggest a way if you see it. Both Keith and myself
> looked at the code and concluded that it would be impossible without
> completely rewriting this functionality. See
> https://debbugs.gnu.org/cgi/bugreport.cgi?bug=77299#8
I think we could change how we return the readable form of the document.
Currently, it returns the DOM node of the section we want to render, but
we could instead return a new document with the <title> node plus the
DOM node we want to render (ditto for <link> nodes, since we use those
for things like 'eww-next-url').
Since scoring the readability of the document requires iterating over
every DOM node, we could just collect the extra nodes we care about
(like <title>) while iterating. Then it'll be easy to include those
nodes in the readable form.
Doing it this way should have the benefit that we don't start network
requests for images that won't be shown in the readable form of the page.
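
As a rough sketch of the idea described above (not the eventual patch): walk
the DOM once, keep the <title> and <link> nodes, and wrap them together with
the chosen node in a fresh document. The my/ function names are hypothetical,
and MAIN stands in for whatever node the readability scoring selects; the
scoring itself is out of scope here.

```elisp
;;; -*- lexical-binding: t -*-
(require 'dom)

;; Hypothetical helper, not part of eww.el.
(defun my/collect-nodes (dom tags)
  "Return the nodes in DOM whose tag is in TAGS, in document order."
  (unless (stringp dom)
    (append (and (memq (dom-tag dom) tags) (list dom))
            (mapcan (lambda (child) (my/collect-nodes child tags))
                    (dom-children dom)))))

;; Hypothetical helper: MAIN is the node picked by readability scoring.
(defun my/readable-document (dom main)
  "Return a new document with DOM's <title> and <link> nodes plus MAIN."
  `(html nil
         (head nil ,@(my/collect-nodes dom '(title link)))
         (body nil ,main)))
```

Because the result is a complete document with a <head>, rendering it would
reach the <title> node in the normal way, so no separate bookkeeping for the
title would be needed.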
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Thu, 12 Jun 2025 06:52:02 GMT)
Message #32 received at 77299 <at> debbugs.gnu.org (full text, mbox):
> Date: Wed, 11 Jun 2025 15:17:21 -0700
> Cc: 77299 <at> debbugs.gnu.org, camalot <at> picnicpark.org
> From: Jim Porter <jporterbugs <at> gmail.com>
>
> On 6/11/2025 12:30 PM, Eli Zaretskii wrote:
> >> Date: Wed, 11 Jun 2025 10:11:03 -0700
> >> Cc: 77299 <at> debbugs.gnu.org
> >> From: Jim Porter <jporterbugs <at> gmail.com>
> >>
> >> On 6/9/2025 6:38 AM, Keith Amidon wrote:
> >>> A quick update on this: I've been using the version of eww-display-html
> >>> quoted below regularly since I last commented on the bug about a month
> >>> ago and think it is working fine. If the approach seems acceptable, I'd
> >>> be happy to prepare a patch to send to emacs-devel. --- Keith
> >>
> >> I think I'd prefer a solution that doesn't require a workaround like
> >> this.
> >
> > What workaround are you alluding to?
>
> Rendering the original document and immediately re-rendering the
> "readable" form of the document.
>
> >> If we could get a fix that resolves this issue without the
> >> workaround, I think that would be best.
> >
> > Feel free to suggest a way if you see it. Both Keith and myself
> > looked at the code and concluded that it would be impossible without
> > completely rewriting this functionality. See
> > https://debbugs.gnu.org/cgi/bugreport.cgi?bug=77299#8
>
> I think we could change how we return the readable form of the document.
> Currently, it returns the DOM node of the section we want to render, but
> we could instead return a new document with the <title> node plus the
> DOM node we want to render (ditto for <link> nodes, since we use those
> for things like 'eww-next-url').
>
> Since scoring the readability of the document requires iterating over
> every DOM node, we could just collect the extra nodes we care about
> (like <title>) while iterating. Then it'll be easy to include those
> nodes in the readable form.
>
> Doing it this way should have the benefit that we don't start network
> requests for images that won't be shown in the readable form of the page.
Thanks. That'd be okay, but unless we have such a reimplementation
soon, I intend to install the simpler patch proposed by Keith (when he
posts it).
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Thu, 12 Jun 2025 12:39:01 GMT)
Message #35 received at 77299 <at> debbugs.gnu.org (full text, mbox):
On 6/11/25 15:17, Jim Porter wrote:
> I think we could change how we return the readable form of the
> document. Currently, it returns the DOM node of the section we want to
> render, but we could instead return a new document with the <title>
> node plus the DOM node we want to render (ditto for <link> nodes,
> since we use those for things like 'eww-next-url').
>
> Since scoring the readability of the document requires iterating over
> every DOM node, we could just collect the extra nodes we care about
> (like <title>) while iterating. Then it'll be easy to include those
> nodes in the readable form.
>
> Doing it this way should have the benefit that we don't start network
> requests for images that won't be shown in the readable form of the page.
I had originally thought of trying to do it more-or-less like this but
it seemed like it would be a quite large change to code I wasn't that
familiar with so I went with something more contained to get things
working. It might take me a while, but I can give this approach a second,
more sustained try and see how it looks. The points about image loading
and <link> elements are good ones. Solving those within the existing
approach, where we save even more information from the original
"non-readable" rendering of the page and possibly conditionally modify the
renderer to avoid loading images if we're just going to switch to the
"readable" rendering automatically, makes a somewhat messy situation even
messier.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Thu, 12 Jun 2025 15:55:02 GMT)
Message #38 received at 77299 <at> debbugs.gnu.org (full text, mbox):
On 6/12/2025 5:38 AM, Keith Amidon wrote:
> I had originally thought of trying to do it more-or-less like this but
> it seemed like it would be a quite large change to code I wasn't that
> familiar with so I went with something more contained to get things
> working. It might take me a while, but I can give this approach a second,
> more sustained try and see how it looks.
I wrote the 'eww-readable-urls' feature (though not the original
readable mode code), so I have a fairly good idea of how things work in
this area. If you're interested in working on a patch, I'm happy to
provide some guidance, or if you just want it to work right, I could
write the patch.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Thu, 12 Jun 2025 16:48:01 GMT)
Message #41 received at 77299 <at> debbugs.gnu.org (full text, mbox):
On 6/12/25 08:54, Jim Porter wrote:
> I wrote the 'eww-readable-urls' feature (though not the original
> readable mode code), so I have a fairly good idea of how things work
> in this area. If you're interested in working on a patch, I'm happy to
> provide some guidance, or if you just want it to work right, I could
> write the patch.
If you have the time and interest to write the patch based on your
superior knowledge of the code, Jim, that would be great. I'm motivated to
have this work because I find that using eww-readable-urls makes eww much
more useful for my day-to-day usage (thanks for contributing it!) but I
also heavily use org links, which need the document title to work well.
My current personal workaround is working but I'd like to get a solution
upstream and stop carrying it. If you come up with something you'd like
help testing I'd be happy to help with that. If you can't get to it for
a while I'll make time to poke at it.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Tue, 17 Jun 2025 16:34:02 GMT)
Message #44 received at 77299 <at> debbugs.gnu.org (full text, mbox):
On 6/12/2025 9:47 AM, Keith Amidon wrote:
> If you have the time and interest to write the patch based on your
> superior knowledge of the code, Jim, that would be great. I'm motivated to
> have this work because I find that using eww-readable-urls makes eww much
> more useful for my day-to-day usage (thanks for contributing it!) but I
> also heavily use org links, which need the document title to work well.
It took a bit longer than I thought to do this, but after a few failed
attempts, I think this patch is pretty close (aside from the couple of
small FIXMEs in there).
As a bonus, this implementation is about 4x faster than the old one,
since it does all the work in a single pass over the DOM. For large
pages, simply computing the readable DOM can take a while; doing it for
the Wikipedia page for "Sun" takes a full second. Here are the stats
from calling the relevant code before and after (sum of 20 iterations):
BEFORE: 20.635833s (15.510534s in 270 GCs)
AFTER: 4.320405s (3.041672s in 55 GCs)
We could probably make this even faster by doing something more
efficient than '(length (split-string FOO))', but one thing at a time.
That's how the code was before, so I didn't change that bit.
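
For illustration only, one way to count words without building the
intermediate list that '(length (split-string FOO))' allocates is to scan the
string for runs of non-whitespace. This is a sketch under that assumption,
not necessarily what any patch for this bug does; my/count-words is a
hypothetical name, and the character class mirrors the default separators
that split-string uses.

```elisp
;;; -*- lexical-binding: t -*-

;; Hypothetical helper, not part of eww.el.
(defun my/count-words (string)
  "Return the number of whitespace-separated words in STRING.
Unlike (length (split-string STRING)), this does not allocate a
list of substrings."
  (let ((start 0)
        (count 0))
    ;; Each match is one maximal run of non-whitespace characters.
    (while (string-match "[^ \f\t\n\r\v]+" string start)
      (setq count (1+ count)
            start (match-end 0)))
    count))
```

Whether this is actually faster on real pages would need measuring with
something like benchmark-run.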
> My current personal workaround is working but I'd like to get a solution
> upstream and stop carrying it. If you come up with something you'd like
> help testing I'd be happy to help with that. If you can't get to it for
> a while I'll make time to poke at it.
If the attached change is too risky for the release branch, I'm open to
the idea of doing something like your workaround there, maybe only when
'eww-auto-rename-buffer' is set to 'title'. That way, people without
that setting don't suffer any performance penalty.
[0001-When-making-a-readable-page-in-EWW-include-the-title.patch (text/plain, attachment)]
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Tue, 17 Jun 2025 23:55:01 GMT)
Message #47 received at 77299 <at> debbugs.gnu.org (full text, mbox):
On 6/17/2025 9:33 AM, Jim Porter wrote:
> As a bonus, this implementation is about 4x faster than the old one,
> since it does all the work in a single pass over the DOM. For large
> pages, simply computing the readable DOM can take a while; doing it for
> the Wikipedia page for "Sun" takes a full second. Here are the stats
> from calling the relevant code before and after (sum of 20 iterations):
>
> BEFORE: 20.635833s (15.510534s in 270 GCs)
> AFTER: 4.320405s (3.041672s in 55 GCs)
>
> We could probably make this even faster by doing something more
> efficient than '(length (split-string FOO))', but one thing at a time.
> That's how the code was before, so I didn't change that bit.
I poked around at this and managed to get it down to 0.54s (still the
sum of 20 runs), so it's now 38x faster. I'll work on cleaning this up
into a real patch and submit it as a separate bug, since that's a pretty
significant improvement.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#77299; Package emacs. (Wed, 18 Jun 2025 03:01:04 GMT)
Message #50 received at 77299 <at> debbugs.gnu.org (full text, mbox):
Thanks for all this work, Jim, both the header-preserving changes and the
optimization. I haven't gotten a chance to study the diff you provided
earlier in detail or try it out yet, but I did read it carefully enough
to get a good sense of how it is supposed to work and how it compares with
the old code. It looks really nice. I'll try to make time to try it out ASAP.
This bug report was last modified today.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.