GNU bug report logs -
#58302
29.0.50; browse-url-emacs is extremely slow (and I think always has been?)
To reply to this bug, email your comments to 58302 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#58302; Package emacs. (Wed, 05 Oct 2022 11:08:02 GMT)
Acknowledgement sent to Phil Sainty <psainty <at> orcon.net.nz>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Wed, 05 Oct 2022 11:08:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
browse-url-emacs has always been inexplicably slow for me,
since at least Emacs 24 (but maybe just 'always').
I've just done some basic benchmarking with:
(benchmark-run (save-selected-window
                 (browse-url-emacs "http://www.example.com")))
;; If I delete the "www.example.com" buffer after each attempt, this
;; call takes nearly 3 seconds:
(2.819470404 1 0.046148434000000016)
(2.669529036 0 0.0)
(2.837350438 1 0.02421054800000011)
;; If I retain the "www.example.com" buffer, then each retry takes
;; <0.5 seconds:
(0.374270464 0 0.0)
(0.428719681 0 0.0)
(0.476068586 1 0.044311528999999794)
;; Whereas benchmarking only `url-retrieve-synchronously':
(benchmark-run (url-retrieve-synchronously "http://www.example.com"))
;; This takes <0.25 seconds.
(0.172364234 0 0.0)
(0.24314511600000002 0 0.0)
(0.19534228 0 0.0)
;; Deleting the network connection via M-x list-processes between
;; attempts adds about 0.25 seconds for all sets of benchmarks, so the
;; network connection time is not a factor.
;; eww is also fast:
(benchmark-run (save-window-excursion
                 (let ((eww-retrieve-command 'sync))
                   (eww "http://www.example.com"))))
(0.197008672 0 0.0)
(0.25098304499999996 0 0.0)
(0.28419454299999997 0 0.0)
So something to do with browse-url-emacs is taking an additional 2.5s
on top of the basic URL request -- unless the buffer already exists,
in which case it's much faster (albeit still twice as slow as the
other options).
Presumably this could be improved?
If I benchmark-progn the final (funcall func url) in browse-url-emacs
I can see that this is where all the time is spent. func is set to
`find-file-other-window'; so this is equivalent:
(benchmark-run
  (save-selected-window
    (let ((file-name-handler-alist
           (cons (cons url-handler-regexp 'url-file-handler)
                 file-name-handler-alist)))
      (find-file-noselect "http://www.example.com")))
  (kill-buffer "www.example.com"))
If the buffer already existed, *Messages* says:
Contacting host: www.example.com:80
(0.408067329 0 0.0)
If the buffer did not already exist, *Messages* says:
Contacting host: www.example.com:80 [2 times]
File exists, but cannot be read
(2.617302471 0 0.0)
(The "[2 times]" is not on account of a previous test;
they are both generated by this single call.)
Benchmarking the end of `find-file-noselect' like so:
(benchmark-progn
  (find-file-noselect-1 buf filename nowarn
                        rawfile truename number))
Gives me:
Elapsed time: 1.876737s (0.057295s in 1 GCs)  ;; find-file-noselect-1
(2.680159853 1 0.057295379000000146)          ;; the overall benchmark-run
And that's as far as I've gone.
-Phil
In GNU Emacs 29.0.50 (build 4, x86_64-pc-linux-gnu, X toolkit, cairo
version 1.15.10, Xaw scroll bars)
of 2022-07-15 built on phil-lp
Repository revision: 00eb894a56d63fad3573a53dd57c323289711512
Repository branch: master
Windowing system distributor 'The X.Org Foundation', version
11.0.12008000
System Description: Ubuntu 18.04.6 LTS
Configured using:
'configure --prefix=/home/phil/emacs/trunk/usr/local
--with-x-toolkit=lucid --without-sound
'--program-transform-name=s/^ctags$/ctags_emacs/''
Configured features:
CAIRO DBUS FREETYPE GIF GLIB GMP GNUTLS GPM GSETTINGS HARFBUZZ JPEG JSON
LCMS2 LIBXML2 MODULES NOTIFY INOTIFY PDUMPER PNG RSVG SECCOMP THREADS
TIFF TOOLKIT_SCROLL_BARS WEBP X11 XDBE XIM XINPUT2 XPM LUCID ZLIB
Important settings:
value of $LC_MONETARY: en_NZ.UTF-8
value of $LC_NUMERIC: en_NZ.UTF-8
value of $LC_TIME: en_NZ.UTF-8
value of $LANG: en_GB.UTF-8
value of $XMODIFIERS: @im=ibus
locale-coding-system: utf-8-unix
Major mode: Lisp Interaction
Minor modes in effect:
savehist-mode: t
windmove-mode: t
winner-mode: t
tooltip-mode: t
global-eldoc-mode: t
eldoc-mode: t
show-paren-mode: t
electric-indent-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
line-number-mode: t
indent-tabs-mode: t
transient-mark-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
Load-path shadows:
None found.
Features:
(shadow sort mail-extr emacsbug mule-util display-line-numbers dcl-mode
tempo mm-archive message sendmail yank-media rfc822 mml mml-sec epa
derived epg rfc6068 epg-config gnus-util mailabbrev gmm-utils mailheader
mm-decode mm-bodies mm-encode url-dav parse-time iso8601 cl-extra
cl-print debug backtrace find-func benchmark shortdoc url-about
url-handlers thingatpt help-fns radix-tree help-mode dabbrev time-date
textsec uni-scripts idna-mapping ucs-normalize uni-confusable
textsec-check mail-utils gnutls network-stream url-http mail-parse
rfc2231 rfc2047 rfc2045 mm-util ietf-drums mail-prsvr url-gw nsm
url-cache url-auth shr text-property-search pixel-fill kinsoku url-file
url-dired puny svg xml dom browse-url url url-proxy url-privacy
url-expand url-methods url-history url-cookie generate-lisp-file
url-domsuf url-util url-parse auth-source cl-seq eieio eieio-core
cl-macs password-cache json subr-x map byte-opt gv bytecomp byte-compile
cconv url-vars mailcap savehist windmove winner ring dired-aux
cl-loaddefs cl-lib dired dired-loaddefs advice rmc iso-transl tooltip
eldoc paren electric uniquify ediff-hook vc-hooks lisp-float-type
elisp-mode mwheel term/x-win x-win term/common-win x-dnd tool-bar dnd
fontset image regexp-opt fringe tabulated-list replace newcomment
text-mode lisp-mode prog-mode register page tab-bar menu-bar rfn-eshadow
isearch easymenu timer select scroll-bar mouse jit-lock font-lock syntax
font-core term/tty-colors frame minibuffer nadvice seq simple cl-generic
indonesian philippine cham georgian utf-8-lang misc-lang vietnamese
tibetan thai tai-viet lao korean japanese eucjp-ms cp51932 hebrew greek
romanian slovak czech european ethiopic indian cyrillic chinese
composite emoji-zwj charscript charprop case-table epa-hook
jka-cmpr-hook help abbrev obarray oclosure cl-preloaded button loaddefs
faces cus-face macroexp files window text-properties overlay sha1 md5
base64 format env code-pages mule custom widget keymap
hashtable-print-readable backquote threads dbusbind inotify lcms2
dynamic-setting system-font-setting font-render-setting cairo x-toolkit
xinput2 x multi-tty make-network-process emacs)
Memory information:
((conses 16 210718 27584)
(symbols 48 9278 0)
(strings 32 47429 2202)
(string-bytes 1 1030861)
(vectors 16 40826)
(vector-slots 8 595360 31429)
(floats 8 105 82)
(intervals 56 1185 0)
(buffers 992 38))
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#58302; Package emacs. (Thu, 06 Oct 2022 12:54:01 GMT)
Message #8 received at 58302 <at> debbugs.gnu.org (full text, mbox):
Phil Sainty <psainty <at> orcon.net.nz> writes:
> If the buffer did not already exist, *Messages* says:
> Contacting host: www.example.com:80 [2 times]
> File exists, but cannot be read
> (2.617302471 0 0.0)
I can reproduce this, too. I think it's likely that the delay is coming
from the error message (which is a misleading error message). There's
probably a "sleep-for 2" after displaying the error message?
There's a bug report in debbugs somewhere about fixing the error
message, but I can't find it now.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#58302; Package emacs. (Thu, 06 Oct 2022 23:26:01 GMT)
Message #11 received at 58302 <at> debbugs.gnu.org (full text, mbox):
On 2022-10-07 01:53, Lars Ingebrigtsen wrote:
> Phil Sainty <psainty <at> orcon.net.nz> writes:
>> If the buffer did not already exist, *Messages* says:
>> File exists, but cannot be read
>
> I can reproduce this, too. I think it's likely that the delay is
> coming
> from the error message (which is a misleading error message). There's
> probably a "sleep-for 2" after displaying the error message?
>
> There's a bug report in debbugs somewhere about fixing the error
> message, but I can't find it now.
Perhaps https://debbugs.gnu.org/cgi/bugreport.cgi?bug=42431 ?
There are also a handful of other hits at:
https://debbugs.gnu.org/cgi/search.cgi?phrase=%22File+exists%2C+but+cannot+be+read%22&search=search
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#58302; Package emacs. (Fri, 07 Oct 2022 02:48:02 GMT)
Message #14 received at 58302 <at> debbugs.gnu.org (full text, mbox):
On 2022-10-07 01:53, Lars Ingebrigtsen wrote:
> I can reproduce this, too. I think it's likely that the delay is
> coming
> from the error message (which is a misleading error message). There's
> probably a "sleep-for 2" after displaying the error message?
True, `after-find-file' does this:
(or not-serious (sit-for 1 t))
And indeed that accounts for 1s of the ~2.5s delay; so this is
a significant factor, yet seemingly not the only issue.
With that commented out I now get:
Elapsed time: 0.835925s (0.039518s in 1 GCs)  ;; find-file-noselect-1
(1.7317769889999999 1 0.03951770099999996)    ;; the overall benchmark-run
Instead of the former:
Elapsed time: 1.876737s (0.057295s in 1 GCs)  ;; find-file-noselect-1
(2.680159853 1 0.057295379000000146)          ;; the overall benchmark-run
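[Editorial sketch] The same measurement can probably be taken without editing files.el, by neutralising `sit-for' for the duration of the benchmark. The macro name below is hypothetical and not part of Emacs, and the sketch assumes the pause is reached through the `sit-for' function cell; a pause made through the `sleep-for' primitive would not be affected.

```elisp
;;; -*- lexical-binding: t -*-
(require 'benchmark)
(require 'cl-lib)

;; Hypothetical helper, for measurement only: it is not part of Emacs.
(defmacro my-benchmark-without-pauses (&rest body)
  "Run BODY once under `benchmark-run' with `sit-for' made a no-op.
The original definition of `sit-for' is restored when BODY exits."
  `(cl-letf (((symbol-function 'sit-for) (lambda (&rest _) t)))
     (benchmark-run 1 ,@body)))
```

It would be used as (my-benchmark-without-pauses (browse-url-emacs "http://www.example.com")), and the result has the same (ELAPSED GC-COUNT GC-ELAPSED) shape as the figures above.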
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#58302; Package emacs. (Fri, 07 Oct 2022 11:49:02 GMT)
Message #17 received at 58302 <at> debbugs.gnu.org (full text, mbox):
Phil Sainty <psainty <at> orcon.net.nz> writes:
>> coming
>> from the error message (which is a misleading error message). There's
>> probably a "sleep-for 2" after displaying the error message?
>> There's a bug report in debbugs somewhere about fixing the error
>> message, but I can't find it now.
>
> Perhaps https://debbugs.gnu.org/cgi/bugreport.cgi?bug=42431 ?
Yes, that's the one I was thinking about, and it was apparently fixed at
the time? But this looks like pretty much the same problem, but with a
different code path...
Phil Sainty <psainty <at> orcon.net.nz> writes:
> True, `after-find-file' does this:
>
> (or not-serious (sit-for 1 t))
>
> And indeed that accounts for 1s of the ~2.5s delay; so this is
> a significant factor, yet seemingly not the only issue.
>
> With that commented out I now get:
>
> Elapsed time: 0.835925s (0.039518s in 1 GCs) ;; find-file-noselect-1
> (1.7317769889999999 1 0.03951770099999996) ;; the overall
> benchmark-run
>
> Instead of the former:
>
> Elapsed time: 1.876737s (0.057295s in 1 GCs) ;; find-file-noselect-1
> (2.680159853 1 0.057295379000000146) ;; the overall
> benchmark-run
Perhaps there's something else that also wants to sleep a bit after a
file error... In any case, I think the real fix is to not signal an
error here, because that's wrong.
I haven't looked at this code in a while, though.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#58302; Package emacs. (Wed, 12 Oct 2022 10:27:02 GMT)
Message #20 received at 58302 <at> debbugs.gnu.org (full text, mbox):
On 2022-10-07 15:47, Phil Sainty wrote:
> (or not-serious (sit-for 1 t))
With that commented out, I tried to do some profiling like this:
(progn
  (profiler-start 'cpu)
  (browse-url-emacs "http://www.example.com")
  (profiler-report)
  (profiler-stop)
  (profiler-reset)
  (kill-buffer "www.example.com"))
The results were perplexing in their variability -- all I can
suggest is that you run that code multiple times, and C-u RET
to expand the full profile after each run, and see whether you
also observe a variety of fairly different outcomes.
Here's one example where we can see `url-retrieve-synchronously'
being called 4 times; but other times it was called 2-3 times,
and the profile looked rather different.
23 69% - browse-url-emacs
23 69% - find-file-other-window
23 69% - find-file-noselect
17 51% - find-file-noselect-1
8 24% - after-find-file
8 24% - if
4 12% - let*
4 12% - cond
4 12% - and
4 12% - file-exists-p
4 12% - url-file-handler
4 12% - apply
4 12% - url-file-exists-p
4 12% - url-http-file-exists-p
4 12% - url-http-head
4 12% - url-retrieve-synchronously
4 12% - accept-process-output
4 12% - url-http-generic-filter
4 12% - url-http-wait-for-headers-change-function
4 12% mail-fetch-field
4 12% - run-hooks
4 12% - vc-refresh-state
4 12% - vc-backend
4 12% - vc-file-getprop
4 12% - expand-file-name
4 12% url-file-handler
6 18% - insert-file-contents
6 18% - url-file-handler
6 18% - apply
6 18% - url-insert-file-contents
4 12% url-retrieve-synchronously
2 6% - url-insert-buffer-contents
2 6% - url-insert
2 6% - mm-dissect-buffer
2 6% - mm-dissect-singlepart
2 6% - mm-copy-to-buffer
2 6% generate-new-buffer
3 9% - file-readable-p
3 9% - url-file-handler
3 9% - apply
3 9% - url-file-exists-p
3 9% - url-http-file-exists-p
3 9% - url-http-head
3 9% - url-retrieve-synchronously
3 9% - url-retrieve
3 9% - url-retrieve-internal
3 9% url-http
6 18% - file-attributes
6 18% - url-file-handler
6 18% - apply
6 18% - url-file-attributes
6 18% - url-http-file-attributes
6 18% - url-http-head-file-attributes
6 18% - url-http-head
6 18% - url-retrieve-synchronously
6 18% - url-retrieve
6 18% - url-retrieve-internal
6 18% - url-http
6 18% generate-new-buffer
10 30% Automatic GC
I'm not very familiar with the ins and outs of these code paths,
but my first impression is that we've initiated an operation which
needs to deal with a particular URL and if we were to make a high-
level binding to indicate that we were doing this, we could then
cache and re-use the results of those network requests for the
extent of that binding.
3 of the 4 `url-retrieve-synchronously' calls above are from
`url-http-head'; twice on account of `url-file-exists-p', and another
from `url-file-attributes'.
I see the following in the code:
(defun url-http-head (url)
  (let ((url-request-method "HEAD")
        (url-request-data nil))
    (url-retrieve-synchronously url)))

(defun url-http-file-exists-p (url)
  (let ((buffer (url-http-head url)))
    ...))

(defalias 'url-http-file-readable-p 'url-http-file-exists-p)

(defun url-http-head-file-attributes (url &optional _id-format)
  (let ((buffer (url-http-head url)))
    ...))

(defun url-http-file-attributes (url &optional id-format)
  (if (url-dav-supported-p url)
      (url-dav-file-attributes url id-format)
    (url-http-head-file-attributes url id-format)))
In principle, I don't see why we couldn't be re-using the buffer
returned by the first call `url-http-head' in each of the
subsequent calls.
Furthermore, we could *probably* flag the fact that we are 100%
intending to request the entire file later on in the command,
and use that information to just do a GET request instead of a
HEAD request in the first place -- the resulting buffer for which
can then *also* be re-used by the eventual `url-insert-file-contents'
call.
I think `url-http-head' itself should only ever do a HEAD request,
but `url-http-head-file-attributes' and `url-http-file-exists-p'
could conditionally use the full GET buffer.
-Phil
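[Editorial sketch] The "high-level binding" idea above could look roughly like the following. Every `my-' name is hypothetical and nothing here exists in url-http.el. The sketch caches the *result* of a probe rather than the response buffer, on the assumption that callers may dispose of that buffer once they have read the status from it.

```elisp
;;; -*- lexical-binding: t -*-
(require 'cl-lib)
(require 'url-parse)

(defvar my-url-probe-cache nil
  "Hypothetical cache: a hash table while a cached operation runs, else nil.")

(defmacro my-with-url-probe-cache (&rest body)
  "Run BODY with a fresh probe cache that lives only for BODY's extent."
  `(let ((my-url-probe-cache (make-hash-table :test #'equal)))
     ,@body))

(defun my-url-probe-cached (key thunk)
  "Return the cached value for KEY, calling THUNK on the first request only.
With no active cache, THUNK is simply called every time."
  (if (null my-url-probe-cache)
      (funcall thunk)
    (let ((hit (gethash key my-url-probe-cache 'my--miss)))
      (if (eq hit 'my--miss)
          ;; `puthash' returns the stored value, so nil results are cached too.
          (puthash key (funcall thunk) my-url-probe-cache)
        hit))))

(defun my-url-http-file-exists-p-cached (orig-fn url)
  "Around-advice sketch: answer repeated existence probes from the cache."
  (my-url-probe-cached
   (list 'exists (if (stringp url) url (url-recreate-url url)))
   (lambda () (funcall orig-fn url))))
```

To try it, one would install the advice with (advice-add 'url-http-file-exists-p :around #'my-url-http-file-exists-p-cached) and wrap the command in `my-with-url-probe-cache'. Outside that wrapper the advice is transparent, so stale answers cannot outlive a single operation.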
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#58302; Package emacs. (Wed, 12 Oct 2022 11:05:01 GMT)
Message #23 received at 58302 <at> debbugs.gnu.org (full text, mbox):
Phil Sainty <psainty <at> orcon.net.nz> writes:
> I'm not very familiar with the ins and outs of these code paths,
> but my first impression is that we've initiated an operation which
> needs to deal with a particular URL and if we were to make a high-
> level binding to indicate that we were doing this, we could then
> cache and re-use the results of those network requests for the
> extent of that binding.
[excellent analysis elided]
I think the conclusion here is that using the file-name-handler-alist
stuff for this is the absolutely pessimal way to implement
`browse-url-emacs'.
It should be pretty easy to rewrite browse-url-emacs to just call
`url-retrieve-synchronously' explicitly, and then display the resulting
data -- and it should be much, much faster.
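[Editorial sketch] A minimal illustration of that direct approach, assuming a plain HTTP response whose headers end at the first blank line. The `my-' functions are hypothetical and this is not the change that was eventually committed. It does no charset decoding and no MIME handling, and the resulting buffer is not visiting a file.

```elisp
;;; -*- lexical-binding: t -*-
(require 'url)

(defun my-url-response-body (response-buffer)
  "Return the text after the first blank line in RESPONSE-BUFFER.
Returns the empty string if no blank line is found."
  (with-current-buffer response-buffer
    (save-excursion
      (goto-char (point-min))
      (if (re-search-forward "^\r?\n" nil t)
          (buffer-substring-no-properties (point) (point-max))
        ""))))

(defun my-browse-url-fetch (url)
  "Fetch URL with one request and show the raw body in a new buffer."
  (let ((response (url-retrieve-synchronously url)))
    (unless response
      (error "No response for %s" url))
    (let ((body (unwind-protect
                    (my-url-response-body response)
                  ;; The response buffer is ours to clean up.
                  (kill-buffer response)))
          (buf (generate-new-buffer url)))
      (with-current-buffer buf
        (insert body)
        (goto-char (point-min)))
      (pop-to-buffer buf))))
```

The point of the sketch is only that a single `url-retrieve-synchronously' call replaces the series of probes made through the file-name handler.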
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#58302; Package emacs. (Wed, 12 Oct 2022 11:29:02 GMT)
Message #26 received at 58302 <at> debbugs.gnu.org (full text, mbox):
On 2022-10-13 00:03, Lars Ingebrigtsen wrote:
> I think the conclusion here is that using the file-name-handler-alist
> stuff for this is the absolutely pessimal way to implement
> `browse-url-emacs'.
>
> It should be pretty easy to rewrite browse-url-emacs to just call
> `url-retrieve-synchronously' explicitly, and then display the resulting
> data -- and it should be much, much faster.
Undoubtedly so; but making the existing approach more efficient might
also bring the same benefits to other functionality?
E.g.:
(url-handler-mode 1)
(trace-function 'url-retrieve-synchronously "*trace-output*"
                (lambda () (format " [%s]" url-request-method)))
(find-file "http://www.example.com")
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t t)) [OPTIONS]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[OPTIONS]
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t nil)) [HEAD]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[HEAD]
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t t)) [OPTIONS]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[OPTIONS]
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t nil)) [HEAD]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[HEAD]
======================================================================
1 -> (url-retrieve-synchronously "http://www.example.com") [nil]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[nil]
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t t)) [HEAD]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[HEAD]
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t t)) [HEAD]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[HEAD]
======================================================================
1 -> (url-retrieve-synchronously #s(url "http" nil nil "www.example.com"
nil "" nil nil t nil t t)) [HEAD]
1 <- url-retrieve-synchronously: #<buffer *http www.example.com:80*>
[HEAD]
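[Editorial sketch] To summarise a trace like the one above as a tally per request method, a small before-advice can count calls instead of printing them. The `my-' names are hypothetical. A nil `url-request-method' is counted as "GET", which is the method the URL library uses when none is set.

```elisp
;;; -*- lexical-binding: t -*-
(require 'cl-lib)
(require 'url-vars)

(defvar my-url-request-counts nil
  "Hypothetical tally: alist of (METHOD . COUNT) for observed requests.")

(defun my-url-count-request (&rest _args)
  "Record one request under the current `url-request-method'."
  (let* ((method (or url-request-method "GET"))
         (cell (assoc method my-url-request-counts)))
    (if cell
        (cl-incf (cdr cell))
      (push (cons method 1) my-url-request-counts))))
```

Installed with (advice-add 'url-retrieve-synchronously :before #'my-url-count-request) and reset with (setq my-url-request-counts nil) before each run, it leaves the tally in `my-url-request-counts' once the command returns.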
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#58302; Package emacs. (Wed, 12 Oct 2022 11:34:02 GMT)
Message #29 received at 58302 <at> debbugs.gnu.org (full text, mbox):
Phil Sainty <psainty <at> orcon.net.nz> writes:
> Undoubtedly so; but making the existing approach more efficient might
> also bring the same benefits to other functionality?
>
> E.g.:
>
> (url-handler-mode 1)
>
> (trace-function 'url-retrieve-synchronously "*trace-output*"
> (lambda () (format " [%s]" url-request-method)))
>
> (find-file "http://www.example.com")
That's true, but I kinda feel that this stuff is something that nobody
uses -- it's a fun trick, but you can't do much with it. That is, you
can load "http://www.example.com" as a file, but you can't save it, so...
It's a fun toy and a demonstration of how you can hook into the Emacs
file machinery. But if you're writing code that's actually going to be
used (like browse-url-emacs), it's the worst way.
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#58302; Package emacs. (Thu, 13 Oct 2022 08:02:01 GMT)
Message #32 received at 58302 <at> debbugs.gnu.org (full text, mbox):
I've now fixed the spurious error message in the middle of this, so it
should be faster now.
It's still inefficient, but I'm not sure that it's worth fixing...
This bug report was last modified 2 years and 307 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.