GNU bug report logs - #16220
url-http.el: Not conforming to HTTP spec

Previous Next

Package: emacs;

Reported by: Jarosław Rzeszótko <sztywny <at> gmail.com>

Date: Sun, 22 Dec 2013 20:53:01 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 16220 in the body.
You can then email your comments to 16220 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Sun, 22 Dec 2013 20:53:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jarosław Rzeszótko <sztywny <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 22 Dec 2013 20:53:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jarosław Rzeszótko <sztywny <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: url-http.el: Not conforming to HTTP spec
Date: Sun, 22 Dec 2013 21:52:22 +0100
[Message part 1 (text/plain, inline)]
Hi,

At the end of every HTTP request to be made with url-http.el and containing
a body, an unnecessary "\r\n" is appended, and additionally those two
characters are not used in the calculation of the Content-Length header.
This normally would not matter, because a carefully build server will
anyway only read "Content-Length" bytes from the body and ignore the final
CRLF, but Emacs additionally defaults to using Connection: keep-alive,
which results in the TCP traffic for what was meant to be a single request,
being interpreted as two separate HTTP requests, the first one being
roughly the intended one, and the other one consisting only of CRLF. In
particular, I am using the HTTP server from net.http in Go language. That
keepalive is enabled by default is strange, especially given how the
variable that controls this is described:

(defvar url-http-attempt-keepalives t
  "Whether to use a single TCP connection multiple times in HTTP.
This is only useful when debugging the HTTP subsystem.  Setting to
nil will explicitly close the connection to the server after every
request.")

Those issues have been somewhat discussed here, but it seems the people
discussing unfortunately don't understand HTTP:

https://groups.google.com/forum/#!msg/gnu.emacs.bug/SF4P7gVI6IQ/SExtWzutKI4J

Please just compare this discussion to
http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html

If you don't go into any fancy things like chunked encoding etc., an HTTP
request is a:

- sequence of headers, each header followed with a newline
- a newline terminating the sequence of headers
- an optional request body

The request body will be read by the server only if one of the headers is a
Content-Length header, and the value of the header is exactly the number of
bytes that is being sent, there is no CRLF terminating the body.  So if
there is no body, the request ends with two CRLFs, if there is a body, it
just ends with [Content-Length] bytes of raw data. There is no possibility
a proper HTTP server could be confused by a request being terminated with
two CRLFs, if the request is otherwise correct. I think there must have
been some confusion as to the reason of the original problem, that was then
turned into this "fix".

For reference, my code is roughly this, and as mentioned I am using
net.http from the Go language on the other end:

(defun server-command (command)
  (let* ((url-request-method "POST")
         (url-request-extra-headers '(("Content-Type" .
"application/x-www-form-urlencoded")))
         (url-request-data (concat (url-hexify-string "command") "="
(url-hexify-string (json-encode command))))
         (result-buffer (url-retrieve-synchronously server-url))
         (result
          (with-current-buffer result-buffer
            (goto-char (point-min))
            (delete-region (point-min) (search-forward "\n\n"))
            (buffer-string))))
    (kill-buffer result-buffer)
    result))

Normally I get from this function the contents of the first, "correct"
response body from the server, but if I run it a few times in quick
succession I additonally get the string "HTTP/1.1 400 Bad Request" at the
end, which is actually the second HTTP response showing up at random in the
buffer (altough it's consistently sent by the server every time, as I can
see in a sniffer).

Cheers,
Jarosław Rzeszótko
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Sun, 22 Dec 2013 21:56:02 GMT) Full text and rfc822 format available.

Message #8 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jarosław Rzeszótko <sztywny <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: Re: url-http.el: Not conforming to HTTP spec
Date: Sun, 22 Dec 2013 22:55:07 +0100
[Message part 1 (text/plain, inline)]
Hi,

To turn this into a concrete proposal: I suggest this part in url-http.el
(starting line 356 in trunk):

;; End request
"\r\n"
;; Any data
url-http-data
;; If `url-http-data' is nil, avoid two CRLFs (Bug#8931).
(if url-http-data "\r\n")))

Should read simply:

;; End request
"\r\n"
;; Any data
url-http-data))

I believe that the person who originally introduced the additional newline
most likely was confused by some issue in mediawiki.el itself, a broken
HTTP server, a broken PHP script, or something of this kind, and obscured
the real bug instead of fixing it:

http://bzr.savannah.gnu.org/lh/emacs/trunk/revision/100681
http://article.gmane.org/gmane.emacs.diffs/105629

I have confirmed that HTTPS connections work just fine after removing the
new line, which is unsurprising, because this is what valid HTTP looks
like. In fact, anyone can, using https://posttestserver.com/, easily check
that the use case used as motivation for the original change, works just
fine after removing that additional newline as I suggest here:

(let ((url-request-method "POST")
      (url-request-data "action=login"))
  (url-retrieve-synchronously "https://posttestserver.com/post.php"))

Futhermore url-http-attempt-keepalives should be nil as default, or better
yet should be completely removed, as true keepalive connections are anyway
not currently supported on the Emacs side, are they?

Cheers,
Jarosław Rzeszótko


2013/12/22 Jarosław Rzeszótko <sztywny <at> gmail.com>

> Hi,
>
> At the end of every HTTP request to be made with url-http.el and
> containing a body, an unnecessary "\r\n" is appended, and additionally
> those two characters are not used in the calculation of the Content-Length
> header. This normally would not matter, because a carefully build server
> will anyway only read "Content-Length" bytes from the body and ignore the
> final CRLF, but Emacs additionally defaults to using Connection:
> keep-alive, which results in the TCP traffic for what was meant to be a
> single request, being interpreted as two separate HTTP requests, the first
> one being roughly the intended one, and the other one consisting only of
> CRLF. In particular, I am using the HTTP server from net.http in Go
> language. That keepalive is enabled by default is strange, especially given
> how the variable that controls this is described:
>
> (defvar url-http-attempt-keepalives t
>   "Whether to use a single TCP connection multiple times in HTTP.
> This is only useful when debugging the HTTP subsystem.  Setting to
> nil will explicitly close the connection to the server after every
> request.")
>
> Those issues have been somewhat discussed here, but it seems the people
> discussing unfortunately don't understand HTTP:
>
>
> https://groups.google.com/forum/#!msg/gnu.emacs.bug/SF4P7gVI6IQ/SExtWzutKI4J
>
> Please just compare this discussion to
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html
>
> If you don't go into any fancy things like chunked encoding etc., an HTTP
> request is a:
>
> - sequence of headers, each header followed with a newline
> - a newline terminating the sequence of headers
> - an optional request body
>
> The request body will be read by the server only if one of the headers is
> a Content-Length header, and the value of the header is exactly the number
> of bytes that is being sent, there is no CRLF terminating the body.  So if
> there is no body, the request ends with two CRLFs, if there is a body, it
> just ends with [Content-Length] bytes of raw data. There is no possibility
> a proper HTTP server could be confused by a request being terminated with
> two CRLFs, if the request is otherwise correct. I think there must have
> been some confusion as to the reason of the original problem, that was then
> turned into this "fix".
>
> For reference, my code is roughly this, and as mentioned I am using
> net.http from the Go language on the other end:
>
> (defun server-command (command)
>   (let* ((url-request-method "POST")
>          (url-request-extra-headers '(("Content-Type" .
> "application/x-www-form-urlencoded")))
>          (url-request-data (concat (url-hexify-string "command") "="
> (url-hexify-string (json-encode command))))
>          (result-buffer (url-retrieve-synchronously server-url))
>          (result
>           (with-current-buffer result-buffer
>             (goto-char (point-min))
>             (delete-region (point-min) (search-forward "\n\n"))
>             (buffer-string))))
>     (kill-buffer result-buffer)
>     result))
>
> Normally I get from this function the contents of the first, "correct"
> response body from the server, but if I run it a few times in quick
> succession I additonally get the string "HTTP/1.1 400 Bad Request" at the
> end, which is actually the second HTTP response showing up at random in the
> buffer (altough it's consistently sent by the server every time, as I can
> see in a sniffer).
>
> Cheers,
> Jarosław Rzeszótko
>
[Message part 2 (text/html, inline)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Sun, 22 Dec 2013 22:33:02 GMT) Full text and rfc822 format available.

Message #11 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Ted Zlatanov <tzz <at> lifelogs.com>
To: Jarosław Rzeszótko <sztywny <at> gmail.com>
Cc: 16220 <at> debbugs.gnu.org
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Sun, 22 Dec 2013 17:33:41 -0500
On Sun, 22 Dec 2013 22:55:07 +0100 Jarosław Rzeszótko <sztywny <at> gmail.com> wrote: 

JR> To turn this into a concrete proposal: I suggest this part in url-http.el
JR> (starting line 356 in trunk):

JR> ;; End request
JR> "\r\n"
JR> ;; Any data
JR> url-http-data
JR> ;; If `url-http-data' is nil, avoid two CRLFs (Bug#8931).
JR> (if url-http-data "\r\n")))

JR> Should read simply:

JR> ;; End request
JR> "\r\n"
JR> ;; Any data
JR> url-http-data))
...
JR> Futhermore url-http-attempt-keepalives should be nil as default, or better
JR> yet should be completely removed, as true keepalive connections are anyway
JR> not currently supported on the Emacs side, are they?

Jarosław,

can you please check if your fix corrects
https://github.com/milkypostman/melpa/issues/1193 which seems somewhat related?

I don't know much about this area but your fix could help this annoying
issue as well, if it's valid.

Ted




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Mon, 23 Dec 2013 06:53:01 GMT) Full text and rfc822 format available.

Message #14 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Jarosław Rzeszótko <sztywny <at> gmail.com>
To: Ted Zlatanov <tzz <at> lifelogs.com>
Cc: 16220 <at> debbugs.gnu.org
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Mon, 23 Dec 2013 07:51:57 +0100
Hi,

No, this does not seem directly related - I can't reproduce the GH
error neither with or without my fix, and from the discussion at the
end of the GH issue, it seems it has been fixed by the elpa guys by
adjusting server configuration.

As for the validity of the fix, I found the passage in the spec the
addresses this issue directly (once again from
http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html):

"Certain buggy HTTP/1.0 client implementations generate extra CRLF's
after a POST request. To restate what is explicitly forbidden by the
BNF, an HTTP/1.1 client MUST NOT preface or follow a request with an
extra CRLF."

The BNF this refers to is this:

generic-message = start-line
                  *(message-header CRLF)
                  CRLF
                  [ message-body ]
start-line      = Request-Line | Status-Line

And finally:

"When a Content-Length is given in a message where a message-body is
allowed, its field value MUST exactly match the number of OCTETs in
the message-body. HTTP/1.1 user agents MUST notify the user when an
invalid length is received and detected."

I hope this is enough of a proof that the extra newline is a bug.

Cheers,
Jarosław Rzeszótko

2013/12/22 Ted Zlatanov <tzz <at> lifelogs.com>:
> On Sun, 22 Dec 2013 22:55:07 +0100 Jarosław Rzeszótko <sztywny <at> gmail.com> wrote:
>
> JR> To turn this into a concrete proposal: I suggest this part in url-http.el
> JR> (starting line 356 in trunk):
>
> JR> ;; End request
> JR> "\r\n"
> JR> ;; Any data
> JR> url-http-data
> JR> ;; If `url-http-data' is nil, avoid two CRLFs (Bug#8931).
> JR> (if url-http-data "\r\n")))
>
> JR> Should read simply:
>
> JR> ;; End request
> JR> "\r\n"
> JR> ;; Any data
> JR> url-http-data))
> ...
> JR> Futhermore url-http-attempt-keepalives should be nil as default, or better
> JR> yet should be completely removed, as true keepalive connections are anyway
> JR> not currently supported on the Emacs side, are they?
>
> Jarosław,
>
> can you please check if your fix corrects
> https://github.com/milkypostman/melpa/issues/1193 which seems somewhat related?
>
> I don't know much about this area but your fix could help this annoying
> issue as well, if it's valid.
>
> Ted




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Mon, 23 Dec 2013 13:08:01 GMT) Full text and rfc822 format available.

Message #17 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Ted Zlatanov <tzz <at> lifelogs.com>
To: Jarosław Rzeszótko <sztywny <at> gmail.com>
Cc: 16220 <at> debbugs.gnu.org
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Mon, 23 Dec 2013 08:08:42 -0500
On Mon, 23 Dec 2013 07:51:57 +0100 Jarosław Rzeszótko <sztywny <at> gmail.com> wrote: 

JR> I hope this is enough of a proof that the extra newline is a bug.

I was already convinced :)

JR> 2013/12/22 Ted Zlatanov <tzz <at> lifelogs.com>:
>> On Sun, 22 Dec 2013 22:55:07 +0100 Jarosław Rzeszótko <sztywny <at> gmail.com> wrote:
>> 
JR> To turn this into a concrete proposal: I suggest this part in url-http.el
JR> (starting line 356 in trunk):
>> 
JR> ;; End request
JR> "\r\n"
JR> ;; Any data
JR> url-http-data
JR> ;; If `url-http-data' is nil, avoid two CRLFs (Bug#8931).
JR> (if url-http-data "\r\n")))
>> 
JR> Should read simply:
>> 
JR> ;; End request
JR> "\r\n"
JR> ;; Any data
JR> url-http-data))
>> ...

I am OK with this fix, if anyone else wants to look it over and commit.

JR> Futhermore url-http-attempt-keepalives should be nil as default, or better
JR> yet should be completely removed, as true keepalive connections are anyway
JR> not currently supported on the Emacs side, are they?

Not AFAIK.

Ted




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Tue, 24 Dec 2013 07:57:01 GMT) Full text and rfc822 format available.

Message #20 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Jarosław Rzeszótko <sztywny <at> gmail.com>
Cc: 16220 <at> debbugs.gnu.org
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Tue, 24 Dec 2013 08:50:11 +0100
Jarosław Rzeszótko <sztywny <at> gmail.com> writes:

> I believe that the person who originally introduced the additional
> newline most likely was confused by some issue in mediawiki.el itself,
> a broken HTTP server, a broken PHP script, or something of this kind,
> and obscured the real bug instead of fixing it:

Well, that's possible.  If there's a significant number of these broken
HTTP servers, then including the workaround is the correct thing to do.

I don't know whether that's the case, though.

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Tue, 24 Dec 2013 08:29:02 GMT) Full text and rfc822 format available.

Message #23 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Jarosław Rzeszótko <sztywny <at> gmail.com>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 16220 <at> debbugs.gnu.org
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Tue, 24 Dec 2013 09:28:08 +0100
Hi,

There are no "real" HTTP servers that require this, as evidenced by
the fact that none of the popular HTTP clients does this, just have a
look at curl source code or the http library of your favourite
programming language. Frankly, this is broken so obviously I feel
stupid further discussing it.

Cheers,
Jarosław Rzeszótko

2013/12/24 Lars Ingebrigtsen <larsi <at> gnus.org>:
> Jarosław Rzeszótko <sztywny <at> gmail.com> writes:
>
>> I believe that the person who originally introduced the additional
>> newline most likely was confused by some issue in mediawiki.el itself,
>> a broken HTTP server, a broken PHP script, or something of this kind,
>> and obscured the real bug instead of fixing it:
>
> Well, that's possible.  If there's a significant number of these broken
> HTTP servers, then including the workaround is the correct thing to do.
>
> I don't know whether that's the case, though.
>
> --
> (domestic pets only, the antidote for overdose, milk.)
>   bloggy blog http://lars.ingebrigtsen.no/




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Tue, 24 Dec 2013 14:44:02 GMT) Full text and rfc822 format available.

Message #26 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
To: Jarosław Rzeszótko <sztywny <at> gmail.com>
Cc: 16220 <at> debbugs.gnu.org
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Tue, 24 Dec 2013 09:43:17 -0500
> At the end of every HTTP request to be made with url-http.el and containing
> a body, an unnecessary "\r\n" is appended, and additionally those two
> characters are not used in the calculation of the Content-Length header.

Looks like an old workaround.  Could someone get rid of this?


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Tue, 24 Dec 2013 16:32:02 GMT) Full text and rfc822 format available.

Message #29 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Jarosław Rzeszótko <sztywny <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 16220 <at> debbugs.gnu.org
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Tue, 24 Dec 2013 17:31:57 +0100
Hi,

There are more issues besides just the extra "\r\n", the url-http.el
library is super confusing. I read the documentation for url-retrieve,
which makes doing a HTTP request look really straightforward, and then
I spent several hours finding out what the hell happens. For example,
the documentation for url-retrieve says:

"The variables `url-request-data', `url-request-method' and
`url-request-extra-headers' can be dynamically bound around the
request; dynamic binding of other variables doesn't necessarily take
effect."

The problem is that url-http.el sets a lot of headers by default that
can not be overwritten in any other way then dynamically overshadowing
some variables. For example, all connections are keep-alive by
default, which is confusing in itself already, futhermore you can't
just pass "Connection" . "close' in url-request-extra-headers, because
url-http.el joins the default and "extra" headers instead of merging
the "extra" ones and having them overwrite the default ones. So try do
something like this:

(let ((url-request-method "GET")
      (url-request-extra-headers '(("Connection" . "close"))))
  (url-retrieve-synchronously "http://www.google.com/"))

And what is sent is this:

GET / HTTP/1.1
Connection: keep-alive
...
Connection: close

Which again isn't valid HTTP and the behaviour of the HTTP server in
this case is undefined and implementation specific. The only way to
workaround this is doing this:

(let ((url-http-attempt-keepalives nil)
      (url-request-method "GET")
      (url-request-extra-headers '(("Connection" . "close"))))
  (url-retrieve-synchronously "http://www.google.com/"))

Which is only clear by reading url-http.el where again it is described
in a very confusing way:

(defvar url-http-attempt-keepalives t
  "Whether to use a single TCP connection multiple times in HTTP.
This is only useful when debugging the HTTP subsystem.  Setting to
nil will explicitly close the connection to the server after every
request.")

This is all the more irritating so many of the headers are set by
default using the variables url-vars.el. Why those things are at all
variables is a mystery to me. Then there are messages that appear in
the minibuffer and there is no way to disable them. In the end it is
much easier to do HTTP requests manually using make-network-process
then it is with the url library, it really isn't a good general
purpose HTTP component, there are too many completely opaque things
happening behind the scenes, like the hashmap of persistent tcp
connections that is maintained behind the scenes. Didn't anyone else
run into problems with it?

Cheers,
Jarosław Rzeszótko


2013/12/24 Stefan Monnier <monnier <at> iro.umontreal.ca>:
>> At the end of every HTTP request to be made with url-http.el and containing
>> a body, an unnecessary "\r\n" is appended, and additionally those two
>> characters are not used in the calculation of the Content-Length header.
>
> Looks like an old workaround.  Could someone get rid of this?
>
>
>         Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Thu, 02 Jan 2014 02:22:02 GMT) Full text and rfc822 format available.

Message #32 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Jarosław Rzeszótko <sztywny <at> gmail.com>
Cc: 16220 <at> debbugs.gnu.org
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Wed, 01 Jan 2014 21:21:04 -0500
> The problem is that url-http.el sets a lot of headers by default that
> can not be overwritten in any other way then dynamically overshadowing
> some variables.

Indeed, this is ugly.  Improvements welcome.

> For example, all connections are keep-alive by
> default, which is confusing in itself already,

Not sure why it should be a problem.

> (let ((url-request-method "GET")
>       (url-request-extra-headers '(("Connection" . "close"))))
>   (url-retrieve-synchronously "http://www.google.com/"))
> And what is sent is this:
> GET / HTTP/1.1
> Connection: keep-alive
> ...
> Connection: close

> Which again isn't valid HTTP and the behaviour of the HTTP server in
> this case is undefined and implementation specific. The only way to
> workaround this is doing this:

> (let ((url-http-attempt-keepalives nil)
>       (url-request-method "GET")
>       (url-request-extra-headers '(("Connection" . "close"))))
>   (url-retrieve-synchronously "http://www.google.com/"))

Yuck!  We can probably fix this fairly easily by letting
url-request-extra-headers override (rather than just add to)
other headers.

> This is all the more irritating so many of the headers are set by
> default using the variables url-vars.el. Why those things are at all
> variables is a mystery to me.

Probably partly historical evolution (there was no place to add new
"parameters", so adding dynamic vars was an easy way to add more control
without breaking existing code).

> In the end it is much easier to do HTTP requests manually using
> make-network-process then it is with the url library,

I think that's misleading: the URL library is supposed to deal with
things like proxies and redirections, which "manual requests via
make-network-process" probably won't handle.

> Didn't anyone else run into problems with it?

Apparently not yet.  But I agree that the API might deserve a redesign
(IIRC another problem is in the way headers in the answer are returned
to the caller, which does not work consistently across different kinds
of URLs (ftp, http, file, imap, ...)).


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Fri, 03 Jan 2014 18:07:01 GMT) Full text and rfc822 format available.

Message #35 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Jarosław Rzeszótko <sztywny <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 16220 <at> debbugs.gnu.org
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Fri, 3 Jan 2014 19:06:22 +0100
[Message part 1 (text/plain, inline)]
I attach a patch that removes the extra "\r\n", adds a function to
merge two alists, and uses it to merge the extra-headers on top of the
default-headers. I added a few basic tests, too.

Cheers,
Jarosław Rzeszótko

2014/1/2 Stefan Monnier <monnier <at> iro.umontreal.ca>:
>> The problem is that url-http.el sets a lot of headers by default that
>> can not be overwritten in any other way then dynamically overshadowing
>> some variables.
>
> Indeed, this is ugly.  Improvements welcome.
>
>> For example, all connections are keep-alive by
>> default, which is confusing in itself already,
>
> Not sure why it should be a problem.
>
>> (let ((url-request-method "GET")
>>       (url-request-extra-headers '(("Connection" . "close"))))
>>   (url-retrieve-synchronously "http://www.google.com/"))
>> And what is sent is this:
>> GET / HTTP/1.1
>> Connection: keep-alive
>> ...
>> Connection: close
>
>> Which again isn't valid HTTP and the behaviour of the HTTP server in
>> this case is undefined and implementation specific. The only way to
>> workaround this is doing this:
>
>> (let ((url-http-attempt-keepalives nil)
>>       (url-request-method "GET")
>>       (url-request-extra-headers '(("Connection" . "close"))))
>>   (url-retrieve-synchronously "http://www.google.com/"))
>
> Yuck!  We can probably fix this fairly easily by letting
> url-request-extra-headers override (rather than just add to)
> other headers.
>
>> This is all the more irritating so many of the headers are set by
>> default using the variables url-vars.el. Why those things are at all
>> variables is a mystery to me.
>
> Probably partly historical evolution (there was no place to add new
> "parameters", so adding dynamic vars was an easy way to add more control
> without breaking existing code).
>
>> In the end it is much easier to do HTTP requests manually using
>> make-network-process then it is with the url library,
>
> I think that's misleading: the URL library is supposed to deal with
> things like proxies and redirections, which "manual requests via
> make-network-process" probably won't handle.
>
>> Didn't anyone else run into problems with it?
>
> Apparently not yet.  But I agree that the API might deserve a redesign
> (IIRC another problem is in the way headers in the answer are returned
> to the caller, which does not work consistently across different kinds
> of URLs (ftp, http, file, imap, ...)).
>
>
>         Stefan
[url.patch (text/x-patch, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Sun, 05 Jan 2014 09:58:02 GMT) Full text and rfc822 format available.

Message #38 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Lars Magne Ingebrigtsen <larsi <at> gnus.org>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: Jarosław Rzeszótko <sztywny <at> gmail.com>,
 16220 <at> debbugs.gnu.org
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Sun, 05 Jan 2014 10:57:05 +0100
Stefan Monnier <monnier <at> iro.umontreal.ca> writes:

> Probably partly historical evolution (there was no place to add new
> "parameters", so adding dynamic vars was an easy way to add more control
> without breaking existing code).

Yes.

This summer, I started rewriting url-retrieve and friends in the way
that was discussed on emacs-devel a ... couple years back:

(with-url "http://fsf.org" :timeout 10
                           :concurrency 5
                           :request-method "POST"
                           :headers '(("Foo" . "Bar"))
  (message "The result was: %s" (buffer-string)))

but I kinda stopped before I really got started, because I couldn't
figure out how to make this work in the non-lexical binding case.  And
having this work only with lexical binding seemed kinda meh.

You wouldn't happen to have any ideas in that area?  >"?
  
-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Sun, 05 Jan 2014 13:26:01 GMT) Full text and rfc822 format available.

Message #41 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Jarosław Rzeszótko <sztywny <at> gmail.com>
To: Lars Magne Ingebrigtsen <larsi <at> gnus.org>
Cc: 16220 <at> debbugs.gnu.org, Stefan Monnier <monnier <at> iro.umontreal.ca>
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Sun, 5 Jan 2014 14:25:09 +0100
Hi,

2014/1/5 Lars Magne Ingebrigtsen <larsi <at> gnus.org>:
> This summer, I started rewriting url-retrieve and friends in the way
> that was discussed on emacs-devel a ... couple years back:
>
> (with-url "http://fsf.org" :timeout 10
>                            :concurrency 5
>                            :request-method "POST"
>                            :headers '(("Foo" . "Bar"))
>   (message "The result was: %s" (buffer-string)))
>
> but I kinda stopped before I really got started, because I couldn't
> figure out how to make this work in the non-lexical binding case.  And
> having this work only with lexical binding seemed kinda meh.
>
> You wouldn't happen to have any ideas in that area?  >"?

I think a natural way of representing the request if this was to be
designed today would be to use alists all the way, perhaps with
optional arguments for adjusting how precisely the request itself
should be sent and handled:

(http "http://www.fsf.org/xyz/zyx"
        '((method . "GET")
          (path . "/")
          (headers . (("Accept-Encoding" . "UTF-8")))
          (form . (("param1" . "value1"))))
         :timeout 10)

This breaks compatibility of course though, as do any sensible changes
I can think of. So meanwhile I would be happy if the patch got
accepted, it would at least make the library at all usable to me, even
if not the easiest to use.

Cheers,
Jarosław Rzeszótko




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Mon, 06 Jan 2014 15:07:01 GMT) Full text and rfc822 format available.

Message #44 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Lars Magne Ingebrigtsen <larsi <at> gnus.org>
Cc: Jarosław Rzeszótko <sztywny <at> gmail.com>,
 16220 <at> debbugs.gnu.org
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Mon, 06 Jan 2014 10:06:50 -0500
> but I kinda stopped before I really got started, because I couldn't
> figure out how to make this work in the non-lexical binding case.  And
> having this work only with lexical binding seemed kinda meh.

I think it's perfectly OK to have it require lexical-binding.
The macro can signal an error if lexical-binding is nil.


        Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Tue, 07 Jan 2014 19:31:02 GMT) Full text and rfc822 format available.

Message #47 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Jarosław Rzeszótko <sztywny <at> gmail.com>
To: Stefan Monnier <monnier <at> iro.umontreal.ca>
Cc: 16220 <at> debbugs.gnu.org, Lars Magne Ingebrigtsen <larsi <at> gnus.org>
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Tue, 7 Jan 2014 20:30:26 +0100
Hi,

Is there any chance my patch gets committed or at least that the extra
"\r\n" finally gets removed?

Cheers,
Jarosław Rzeszótko

2014/1/6 Stefan Monnier <monnier <at> iro.umontreal.ca>:
>> but I kinda stopped before I really got started, because I couldn't
>> figure out how to make this work in the non-lexical binding case.  And
>> having this work only with lexical binding seemed kinda meh.
>
> I think it's perfectly OK to have it require lexical-binding.
> The macro can signal an error if lexical-binding is nil.
>
>
>         Stefan




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#16220; Package emacs. (Wed, 08 Jan 2014 18:31:02 GMT) Full text and rfc822 format available.

Message #50 received at 16220 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca>
To: Jarosław Rzeszótko <sztywny <at> gmail.com>
Cc: 16220 <at> debbugs.gnu.org, Lars Magne Ingebrigtsen <larsi <at> gnus.org>
Subject: Re: bug#16220: url-http.el: Not conforming to HTTP spec
Date: Wed, 08 Jan 2014 13:29:54 -0500
> Is there any chance my patch gets committed or at least that the extra
> "\r\n" finally gets removed?

I installed the patch below.
Thank you very much,


        Stefan


=== modified file 'lisp/url/ChangeLog'
--- lisp/url/ChangeLog	2014-01-01 07:43:34 +0000
+++ lisp/url/ChangeLog	2014-01-08 18:05:12 +0000
@@ -1,3 +1,8 @@
+2014-01-08  Jarosław Rzeszótko  <sztywny <at> gmail.com>
+
+	* url-http.el (url-http-create-request): Don't add extra \r\n after
+	http data (bug#16220).
+
 2013-12-28  Glenn Morris  <rgm <at> gnu.org>
 
 	* url-history.el (url-history-track):

=== modified file 'lisp/url/url-http.el'
--- lisp/url/url-http.el	2014-01-01 07:43:34 +0000
+++ lisp/url/url-http.el	2014-01-08 18:03:55 +0000
@@ -356,9 +356,7 @@
              ;; End request
              "\r\n"
              ;; Any data
-             url-http-data
-	     ;; If `url-http-data' is nil, avoid two CRLFs (Bug#8931).
-	     (if url-http-data "\r\n")))
+             url-http-data))
            ""))
     (url-http-debug "Request is: \n%s" request)
     request))





Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Sat, 18 Jan 2014 22:36:02 GMT) Full text and rfc822 format available.

Notification sent to Jarosław Rzeszótko <sztywny <at> gmail.com>:
bug acknowledged by developer. (Sat, 18 Jan 2014 22:36:03 GMT) Full text and rfc822 format available.

Message #55 received at 16220-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: 16220-done <at> debbugs.gnu.org
Subject: Re: url-http.el: Not conforming to HTTP spec
Date: Sat, 18 Jan 2014 14:35:34 -0800
Marking the bug as done since Stefan installed a patch.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 16 Feb 2014 12:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 11 years and 122 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.