GNU bug report logs - #75585
30.0.92; eww does not use proper file names for downloaded webpages

Package: emacs

Reported by: Anush V <j <at> gnu.org>

Date: Wed, 15 Jan 2025 14:58:01 UTC

Severity: normal

Found in version 30.0.92

Done: Eli Zaretskii <eliz <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 75585 in the body.
You can then email your comments to 75585 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#75585; Package emacs. (Wed, 15 Jan 2025 14:58:01 GMT)

Acknowledgement sent to Anush V <j <at> gnu.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 15 Jan 2025 14:58:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Anush V <j <at> gnu.org>
To: bug-gnu-emacs <at> gnu.org
Subject: 30.0.92; eww does not use proper file names for downloaded webpages
Date: Wed, 15 Jan 2025 09:57:25 -0500

Hello Maintainers,

I believe eww is not using proper names for downloaded webpages.

Expected Behavior: Downloaded webpages should have a filename
reflecting the full url with an .html extension.

Observed Bug:

1. When I download the page https://www.gnu.org/, eww downloads the
file without an .html extension, naming it simply as !.

2. When I download https://www.gnu.org/home.html, eww names the file
'home.html'. While the extension is correct, shouldn't the filename
reflect the full URL to avoid conflicts when downloading 'home.html'
from different sites?

Steps to Reproduce:

1. emacs --no-init
2. M-x eww
3. https://www.gnu.org/   ;; eww prompt
4. d                      ;; Downloads file !
5. G
6. https://www.gnu.org/home.html ;; eww prompt
7. d                      ;; Downloads file home.html

Please let me know if my expectation regarding the filename &
extension is incorrect.

Thank you for your time and attention.

* * *
In GNU Emacs 30.0.92 (build 1, x86_64-pc-linux-gnu, GTK+ Version
3.24.41, cairo version 1.18.0)
Windowing system distributor 'The X.Org Foundation', version 11.0.12101014
System Description: Guix System

Configured using:
 'configure
 CONFIG_SHELL=/gnu/store/6nqyia3ra10sgd1ppzk2047ncbzjwhff-bash-minimal-5.1.16/bin/bash
 SHELL=/gnu/store/6nqyia3ra10sgd1ppzk2047ncbzjwhff-bash-minimal-5.1.16/bin/bash --prefix=/gnu/store/ml6xyl3py6hqfdps2sypdi7s212y7k02-emacs-next-30.0.92-0.881d593 --enable-fast-install --with-cairo --with-modules --with-native-compilation=aot --disable-build-details'

Configured features:
ACL CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ
JPEG LCMS2 LIBOTF LIBSELINUX LIBSYSTEMD LIBXML2 M17N_FLT MODULES
NATIVE_COMP NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP SOUND SQLITE3
THREADS TIFF TOOLKIT_SCROLL_BARS TREE_SITTER X11 XDBE XIM XINPUT2 XPM
GTK3 ZLIB

--
Regards,
Anush V




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75585; Package emacs. (Thu, 16 Jan 2025 15:40:02 GMT)

Message #8 received at 75585 <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Anush V <j <at> gnu.org>
Cc: 75585 <at> debbugs.gnu.org
Subject: Re: bug#75585: 30.0.92;
 eww does not use proper file names for downloaded webpages
Date: Thu, 16 Jan 2025 17:39:33 +0200

> From: Anush V <j <at> gnu.org>
> Date: Wed, 15 Jan 2025 09:57:25 -0500
> 
> Hello Maintainers,
> 
> I believe eww is not using proper names for downloaded webpages.
> 
> Expected Behavior: Downloaded webpages should have a filename
> reflecting the full url with an .html extension.

That's not what eww-download does.  It downloads the pages to the
directory specified by eww-download-directory, by default
"~/Downloads".

> 1. When I download the page https://www.gnu.org/, eww downloads the
> file without an .html extension, naming it simply as !.

This page has no name.  We invent some name, in this case "!".  Apart
from documenting this, why is that a problem?

> 2. When I download https://www.gnu.org/home.html, eww names the file
> 'home.html'. While the extension is correct, shouldn't the filename
> reflect the full URL to avoid conflicts when downloading 'home.html'
> from different sites?

eww-download detects conflicts and makes the downloaded name unique,
see eww-make-unique-file-name.  This seems to be a deliberate design
decision, and I can't say it sounds wrong to me.

So, given that we augment the documentation to make these aspects
clear, do you still think there's a bug here?
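
The conflict handling described above can be pictured with a small
sketch.  The Python below is not the eww source; it only mimics the
behavior reported in this thread (a placeholder name "!" when the URL
has no file part, and a fresh name when the file already exists).  The
function name make_unique_file_name, the "!" prefix for dot-files, and
the "(1)" numbering scheme are assumptions made for illustration.

```python
import os


def make_unique_file_name(name: str, directory: str) -> str:
    """Return a path inside `directory` that does not exist yet."""
    # A URL such as https://www.gnu.org/ has no last path component,
    # so a placeholder name is invented (the thread reports "!").
    if not name:
        name = "!"
    elif name.startswith("."):
        # Assumption: avoid creating hidden dot-files.
        name = "!" + name

    # Split "home.html" into stem "home" and suffix ".html".
    stem, suffix = os.path.splitext(name)

    candidate = name
    count = 1
    # Assumed numbering scheme: home.html, home(1).html, home(2).html, ...
    while os.path.exists(os.path.join(directory, candidate)):
        candidate = f"{stem}({count}){suffix}"
        count += 1
    return os.path.join(directory, candidate)
```

Under these assumptions, asking twice for "home.html" in the same
directory gives home.html first and home(1).html once the first file
exists, so pages with the same name from different sites do not
overwrite each other.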




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#75585; Package emacs. (Fri, 17 Jan 2025 14:59:01 GMT)

Message #11 received at 75585 <at> debbugs.gnu.org:

From: Anush V <j <at> gnu.org>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 75585 <at> debbugs.gnu.org
Subject: Re: bug#75585: 30.0.92; eww does not use proper file names for
 downloaded webpages
Date: Fri, 17 Jan 2025 09:57:00 -0500

> From: Eli Zaretskii <eliz <at> gnu.org>
> Date: Thu, 16 Jan 2025 17:39:33 +0200
>
>> From: Anush V <j <at> gnu.org>
>> Date: Wed, 15 Jan 2025 09:57:25 -0500
>>
>> Hello Maintainers,
>>
>> I believe eww is not using proper names for downloaded webpages.
>>
>> Expected Behavior: Downloaded webpages should have a filename
>> reflecting the full url with an .html extension.
>
> That's not what eww-download does.  It downloads the pages to the
> directory specified by eww-download-directory, by default
> "~/Downloads".

Thank you for clarifying.

>> 1. When I download the page https://www.gnu.org/, eww downloads the
>> file without an .html extension, naming it simply as !.
>
> This page has no name.  We invent some name, in this case "!".  Apart
> from documenting this, why is that a problem?

Yes, documenting this should help.

>> 2. When I download https://www.gnu.org/home.html, eww names the file
>> 'home.html'. While the extension is correct, shouldn't the filename
>> reflect the full URL to avoid conflicts when downloading 'home.html'
>> from different sites?

Sure, I wasn't clear about how eww-download works.

> eww-download detects conflicts and makes the downloaded name unique,
> see eww-make-unique-file-name.  This seems to be a deliberate design
> decision, and I can't say it sounds wrong to me.
>
> So, given that we augment the documentation to make these aspects
> clear, do you still think there's a bug here?

Just adding to the documentation should be sufficient.

My use case was to download webpages for offline reading.  I came
across eww-download (downloads a URL) and eww-open-file (renders an
HTML file only if the file has an .html extension).  I thought I could
download webpages using eww-download and then read them offline using
eww-open-file.
Regards,
Anush




Reply sent to Eli Zaretskii <eliz <at> gnu.org>:
You have taken responsibility. (Sat, 18 Jan 2025 11:00:02 GMT)

Notification sent to Anush V <j <at> gnu.org>:
bug acknowledged by developer. (Sat, 18 Jan 2025 11:00:02 GMT)

Message #16 received at 75585-done <at> debbugs.gnu.org:

From: Eli Zaretskii <eliz <at> gnu.org>
To: Anush V <j <at> gnu.org>
Cc: 75585-done <at> debbugs.gnu.org
Subject: Re: bug#75585: 30.0.92; eww does not use proper file names for
 downloaded webpages
Date: Sat, 18 Jan 2025 12:58:55 +0200

[Please use Reply All to reply, to keep the bug tracker CC'ed.]

> From: Anush V <j <at> gnu.org>
> Date: Fri, 17 Jan 2025 09:40:12 -0500
> 
> > From: Eli Zaretskii <eliz <at> gnu.org>
> > Date: Thu, 16 Jan 2025 17:39:33 +0200
> >
> >> From: Anush V <j <at> gnu.org>
> >> Date: Wed, 15 Jan 2025 09:57:25 -0500
> >>
> >> Hello Maintainers,
> >>
> >> I believe eww is not using proper names for downloaded webpages.
> >>
> >> Expected Behavior: Downloaded webpages should have a filename
> >> reflecting the full url with an .html extension.
> >
> > That's not what eww-download does.  It downloads the pages to the
> > directory specified by eww-download-directory, by default
> > "~/Downloads".
> 
> Thank you for clarifying.
> 
> >> 1. When I download the page https://www.gnu.org/, eww downloads the
> >> file without an .html extension, naming it simply as !.
> >
> > This page has no name.  We invent some name, in this case "!".  Apart
> > from documenting this, why is that a problem?
> 
> Yes, documenting this should help.
> 
> >> 2. When I download https://www.gnu.org/home.html, eww names the file
> >> 'home.html'. While the extension is correct, shouldn't the filename
> >> reflect the full URL to avoid conflicts when downloading 'home.html'
> >> from different sites?
> >
> > eww-download detects conflicts and makes the downloaded name unique,
> > see eww-make-unique-file-name.  This seems to be a deliberate design
> > decision, and I can't say it sounds wrong to me.
> 
> Sure, I wasn't clear about how eww-download works.
> 
> > So, given that we augment the documentation to make these aspects
> > clear, do you still think there's a bug here?
> 
> Just adding to the documentation should be sufficient.

OK, so I've now done that, and I'm therefore closing this bug.

> My use case was to download interesting webpages (from different
> websites) for reading offline.  I came across eww-download (downloads
> a URL) and eww-open-file (renders an HTML file only if the file has
> an .html extension).  I thought I could download using eww-download
> and then read offline using eww-open-file.

You can do that: eww-download shows the actual file name under which
it saved the Web page in the echo area.
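
For the offline-reading use case, one possible approach is to rename
the saved file afterwards so that its name reflects the whole URL and
ends in .html.  The Python sketch below is only an illustration of such
a naming scheme; url_to_file_name is a hypothetical helper, not part of
eww or Emacs, and the choice of "_" as a separator is an assumption.

```python
import re
from urllib.parse import urlsplit


def url_to_file_name(url: str) -> str:
    """Derive a file name from the full URL, ending in .html."""
    parts = urlsplit(url)
    # Combine host and path so that pages from different sites differ.
    raw = parts.netloc + parts.path
    if parts.query:
        raw += "?" + parts.query
    # Assumption: replace every run of characters other than letters,
    # digits, dots, and hyphens with a single underscore.
    safe = re.sub(r"[^A-Za-z0-9.-]+", "_", raw).strip("_") or "index"
    # Add the extension only when the name does not already carry one.
    if not safe.lower().endswith((".html", ".htm")):
        safe += ".html"
    return safe


print(url_to_file_name("https://www.gnu.org/"))
print(url_to_file_name("https://www.gnu.org/home.html"))
```

With this scheme the two URLs from the report map to www.gnu.org.html
and www.gnu.org_home.html, so a "home.html" from another site would
get a different name.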




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 15 Feb 2025 12:24:11 GMT)

This bug report was last modified 120 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.