GNU bug report logs - #17958
SHR: base handling broken (shr-parse-base, shr-expand-url)
Reported by: Ivan Shmakov <ivan <at> siamics.net>
Date: Sun, 6 Jul 2014 18:47:01 UTC
Severity: normal
Tags: fixed, patch
Fixed in version 25.1
Done: Lars Magne Ingebrigtsen <larsi <at> gnus.org>
Bug is archived. No further changes may be made.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#17958; Package emacs. (Sun, 06 Jul 2014 18:47:01 GMT)
Acknowledgement sent to Ivan Shmakov <ivan <at> siamics.net>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 06 Jul 2014 18:47:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Package: emacs
As evidenced with the form at [1], EWW currently (as of
36634f669f2c) mishandles the case where an HTML form uses
‘method="POST"’ but specifies no ‘action’ attribute. Namely,
instead of interpreting missing ‘action’ as meaning “this very
same URI”, EWW uses the URI with the ‘path’ component discarded.
Granted, missing ‘action’ is special-cased for GET forms:
    (if (cdr (assq :action form))
        (shr-expand-url (cdr (assq :action form))
                        eww-current-url)
      eww-current-url)
While POST forms get no such treatment:
    (eww-browse-url (shr-expand-url (cdr (assq :action form))
                                    eww-current-url)))
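One way to give both methods the same treatment is to factor the check into a helper. The following is only a sketch: `my-eww-form-action-url` is a hypothetical name, not a function in eww.el, and FORM is assumed to be the alist that the excerpts above query with `assq`.

```elisp
(require 'shr)

(defun my-eww-form-action-url (form current-url)
  "Return the URL that FORM should be submitted to.
FORM is an alist that may contain an :action entry; CURRENT-URL is
the URL of the page holding the form.  Hypothetical helper, for
illustration only."
  (let ((action (cdr (assq :action form))))
    (if action
        ;; Resolve the action against the page URL, as the GET branch does.
        (shr-expand-url action current-url)
      ;; No action attribute: submit to the page's own URL.
      current-url)))

;; A POST form without an action resolves to the page itself.
(my-eww-form-action-url '((:method . "POST"))
                        "http://example.com/welcome/")
;; ⇒ "http://example.com/welcome/"
```

With such a helper the GET and POST branches could share one code path, independently of how shr-expand-url treats a nil argument.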
However, I believe that the real culprit is shr-expand-url,
which mishandles the nil ‘uri’ case:
    (mapcar (lambda (x) (shr-expand-url x "http://example.com/welcome/"))
            '("hello" "/world" nil))
    ;; ⇒
    ("http://example.com/welcome/hello"
     "http://example.com/world"
     "http://example.com")
My expectation for the last result would be the ‘base’ argument
unchanged (i. e., http://example.com/welcome/.)
Thus, I suggest changing shr-expand-url to return not the 0th
element of the (parsed) ‘base’ (see below), but the 3rd.
    (cond ((or (not url)
               (not base)
               (string-match "\\`[a-z]*:" url))
           ;; Absolute URL.
           (or url (car base)))
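For comparison, the generic resolver in url-expand.el can serve as a reference for the expected results. The sketch below wraps `url-expand-file-name` in a hypothetical helper, `my-resolve-url`, which treats a nil or empty reference as the base itself. It illustrates the expected behaviour only and is not the change proposed for shr.el.

```elisp
(require 'url-expand)

(defun my-resolve-url (url base)
  "Resolve URL against BASE; a nil or empty URL yields BASE unchanged.
Hypothetical helper, for illustration only."
  (if (or (null url) (string= url ""))
      ;; Nothing to resolve: the reference denotes the base document.
      base
    ;; Otherwise defer to the generic resolver from url-expand.el.
    (url-expand-file-name url base)))

(mapcar (lambda (x) (my-resolve-url x "http://example.com/welcome/"))
        '("hello" "/world" nil))
```

By construction, the last element of the resulting list is the base argument unchanged, which is the behaviour argued for above.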
[1] https://tools.wmflabs.org/guc/?user=2001:db8:1337::cafe
--
FSF associate member #7257 http://boycottsystemd.org/
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#17958; Package emacs. (Thu, 14 Aug 2014 18:51:02 GMT)
Message #8 received at 17958 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
retitle 17958 SHR: base handling broken (shr-parse-base, shr-expand-url)
tag 17958 + patch
thanks
>>>>> Ivan Shmakov <ivan <at> siamics.net> writes:
[…]
> However, I believe that the real culprit is shr-expand-url, which
> mishandles the nil ‘uri’ case:
>     (mapcar (lambda (x) (shr-expand-url x "http://example.com/welcome/"))
>             '("hello" "/world" nil))
>     ;; ⇒
>     ("http://example.com/welcome/hello"
>      "http://example.com/world"
>      "http://example.com")
> My expectation for the last result would be the ‘base’ argument
> unchanged (i. e., http://example.com/welcome/.)
> Thus, I suggest changing shr-expand-url to return not the 0th element
> of the (parsed) ‘base’ (see below), but the 3rd.
>     (cond ((or (not url)
>                (not base)
>                (string-match "\\`[a-z]*:" url))
>            ;; Absolute URL.
>            (or url (car base)))
> [1] https://tools.wmflabs.org/guc/?user=2001:db8:1337::cafe
As it seems, there’s one more issue with SHR “base” handling.
Namely, the <base href="" /> URI may actually itself be
relative, and SHR fails to handle that properly. As per [2]:
To set the frozen base URL, resolve the value of the element's href
content attribute relative to the Document's fallback base URL; if
this is successful, set the frozen base URL to the resulting
absolute URL, otherwise, set the frozen base URL to the fallback
base URL.
The SHR behavior doesn’t match the above. Consider, e. g.:
    (let ((shr-base (shr-parse-base "http://example.org/")))
      (shr-tag-base '((:href . "/relative")))
      shr-base)
    ;; ⇒
    ("" "/" nil "/relative")
With the patch MIMEd (which also fixes the issue described in my
initial bug report), it instead gives what I deem to be the
correct result:
    (let ((shr-base (shr-parse-base "http://example.org/")))
      (shr-tag-base '((:href . "/relative")))
      shr-base)
    ;; ⇒
    ("http://example.org" "/" "http" "http://example.org/relative")
For proper compliance to the specification, SHR should also
ignore all the <base /> elements but the first one, but I guess
that may be fixed separately.
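The two rules discussed here, resolving the href against the fallback base URL and honouring only the first <base /> element, can be sketched as follows. `my-frozen-base-url` is a hypothetical helper, and representing the elements as a plain list of href strings is an assumption made for brevity; shr.el itself works on the parsed document.

```elisp
(require 'url-expand)

(defun my-frozen-base-url (hrefs fallback)
  "Return the base URL for a document whose own URL is FALLBACK.
HREFS lists the href values of its <base /> elements in document
order; only the first one is used.  Hypothetical helper, for
illustration only."
  (let ((href (car hrefs)))
    (if (or (null href) (string= href ""))
        ;; No usable <base />: keep the fallback base URL.
        fallback
      ;; The href may itself be relative, so resolve it against FALLBACK.
      (url-expand-file-name href fallback))))

(my-frozen-base-url '("/relative" "/ignored") "http://example.org/")
;; ⇒ "http://example.org/relative"
```

The result corresponds to the last element of the patched shr-base value shown above; the second href is never consulted.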
The relative <base /> URIs appear, e. g., on the Internet
Wayback Machine archive pages, when the original page uses the
<base /> element.
[2] http://www.w3.org/TR/html5/document-metadata.html#the-base-element
--
FSF associate member #7257 http://boycottsystemd.org/ … 3013 B6A0 230E 334A
[Message part 2 (text/x-diff, inline)]
--- a/lisp/net/shr.el
+++ b/lisp/net/shr.el
@@ -574,6 +574,8 @@ size, and full-buffer size."
   ;; Always chop off anchors.
   (when (string-match "#.*" url)
     (setq url (substring url 0 (match-beginning 0))))
+  ;; NB: <base href="" > URI may itself be relative to the document’s URI
+  (setq url (shr-expand-url url))
   (let* ((parsed (url-generic-parse-url url))
          (local (url-filename parsed)))
     (setf (url-filename parsed) "")
@@ -592,6 +594,7 @@ size, and full-buffer size."
 (defun shr-expand-url (url &optional base)
   (setq base
         (if base
+            ;; shr-parse-base should never call this with non-nil base!
             (shr-parse-base base)
           ;; Bound by the parser.
           shr-base))
@@ -600,8 +603,8 @@ size, and full-buffer size."
   (cond ((or (not url)
              (not base)
              (string-match "\\`[a-z]*:" url))
-         ;; Absolute URL.
-         (or url (car base)))
+         ;; Absolute or empty URI
+         (or url (nth 3 base)))
         ((eq (aref url 0) ?/)
          (if (and (> (length url) 1)
                   (eq (aref url 1) ?/))
Changed bug title to 'SHR: base handling broken (shr-parse-base, shr-expand-url)' from 'eww-submit mishandles the POST method, no action forms '
Request was from Ivan Shmakov <ivan <at> siamics.net> to control <at> debbugs.gnu.org. (Thu, 14 Aug 2014 18:51:02 GMT)
Added tag(s) patch.
Request was from Ivan Shmakov <ivan <at> siamics.net> to control <at> debbugs.gnu.org. (Thu, 14 Aug 2014 18:51:02 GMT)
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#17958; Package emacs. (Tue, 04 Nov 2014 16:45:02 GMT)
Message #15 received at 17958 <at> debbugs.gnu.org (full text, mbox):
On Thu, 14 Aug 2014 18:50:20 +0000 Ivan Shmakov <ivan <at> siamics.net> wrote:
>> Thus, I suggest changing shr-expand-url to return not the 0th element
>> of the (parsed) ‘base’ (see below), but the 3rd.
...
IS> With the patch MIMEd (which also fixes the issue described in my
IS> initial bug report), it instead gives what I deem to be the
IS> correct result:
IS>     (let ((shr-base (shr-parse-base "http://example.org/")))
IS>       (shr-tag-base '((:href . "/relative")))
IS>       shr-base)
IS>     ;; ⇒
IS>     ("http://example.org" "/" "http" "http://example.org/relative")
This seems reasonable to me as far as usability goes, but Lars will have to
review the patch for correctness.
Ted
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#17958; Package emacs. (Thu, 13 Nov 2014 18:43:02 GMT)
Message #18 received at 17958 <at> debbugs.gnu.org (full text, mbox):
Ivan Shmakov <ivan <at> siamics.net> writes:
>     (let ((shr-base (shr-parse-base "http://example.org/")))
>       (shr-tag-base '((:href . "/relative")))
>       shr-base)
>     ;; ⇒
>     ("" "/" nil "/relative")
>
> With the patch MIMEd (which also fixes the issue described in my
> initial bug report), it instead gives what I deem to be the
> correct result:
>
>     (let ((shr-base (shr-parse-base "http://example.org/")))
>       (shr-tag-base '((:href . "/relative")))
>       shr-base)
>     ;; ⇒
>     ("http://example.org" "/" "http" "http://example.org/relative")
Thanks; applied.
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
Added tag(s) fixed.
Request was from Lars Magne Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Thu, 13 Nov 2014 18:43:02 GMT)
bug marked as fixed in version 25.1, send any further explanations to
17958 <at> debbugs.gnu.org and Ivan Shmakov <ivan <at> siamics.net>
Request was from Lars Magne Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Thu, 13 Nov 2014 18:43:03 GMT)
bug archived.
Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 12 Dec 2014 12:24:06 GMT)