From unknown Sun Sep 07 16:50:41 2025
From: Lars Magne Ingebrigtsen
To: William Xu
Cc: 7017@debbugs.gnu.org
Subject: bug#7017: Suggestion: (url-retrieve-internal) hexify multibyte URL string first
Date: Tue, 10 Apr 2012 13:22:34 +0200
In-Reply-To: (William Xu's message of "Thu, 01 Jul 2010 02:31:05 +0800")

William Xu writes:

> Feeding the same url to `wget', it would first hexify it, then download
> it successfully. I suggest we do the same in url-retrieve, like this:
>
> (url-retrieve-internal): Hexify multibyte URL string first when necessary.

Thanks; applied to the Emacs trunk.

-- 
(domestic pets only, the antidote for overdose, milk.)
  bloggy blog http://lars.ingebrigtsen.no/

From debbugs-submit-bounces@debbugs.gnu.org Tue Apr 10 07:23:51 2012
From: Lars Magne Ingebrigtsen
To: control@debbugs.gnu.org
Subject: control message for bug #7017
Date: Tue, 10 Apr 2012 13:22:38 +0200

tags 7017 fixed
close 7017 24.2

From unknown Sun Sep 07 16:50:41 2025
From: Seth Mason
To: 7017@debbugs.gnu.org
Subject: bug#7017: url-retrieve seems busted
Date: Mon, 07 May 2012 14:51:29 -0700
Message-ID: <87zk9jacda.fsf@edgecast.com>

If you put the following in a buffer and eval it, you'll get a 404:

;; http://httpbin.org/get?x=1
;; eval this buffer
(url-retrieve (buffer-substring-no-properties 4 30)
              (lambda (&rest args) (switch-to-buffer (current-buffer))))

If you curl/wget the same URL, it'll work fine.

If you look at the request, it's going to "/get%3fx%3d1". It seems to me
that the URL is getting improperly encoded for multibyte strings.

From unknown Sun Sep 07 16:50:41 2025
From: Chong Yidong
To: Seth Mason
Cc: 7017@debbugs.gnu.org
Subject: bug#7017: url-retrieve seems busted
Date: Tue, 08 May 2012 12:52:01 +0800
Message-ID: <87obpz9swe.fsf@gnu.org>
In-Reply-To: <87zk9jacda.fsf@edgecast.com> (Seth Mason's message of "Mon, 07 May 2012 14:51:29 -0700")
References: <87zk9jacda.fsf@edgecast.com>

Seth Mason writes:

> If you put the following in a buffer and eval it, you'll get a 404:
>
> ;; http://httpbin.org/get?x=1
> ;; eval this buffer
> (url-retrieve (buffer-substring-no-properties 4 30) (lambda (&rest
> args) (switch-to-buffer (current-buffer))))
>
> If you curl/wget the same URL, it'll work fine.
>
> If you look at the request, it's going to "/get%3fx%3d1". It seems to me
> that the URL is getting improperly encoded for multibyte strings.

Thanks for pointing this out.

Applying url-hexify-string on the entire URL, as the previous patch did,
is wrong. We mustn't hexify reserved characters that are being used in
their special role. Unfortunately, figuring out when those characters
are being used in their special role requires an implementation of
RFC2396, which I don't think we currently have in Emacs.

Alternatively, the following not-strictly-correct hack excludes reserved
characters from hexification.

=== modified file 'lisp/url/url.el'
*** lisp/url/url.el 2012-04-26 12:43:28 +0000
--- lisp/url/url.el 2012-05-08 04:46:45 +0000
***************
*** 180,188 ****
    (url-gc-dead-buffers)
    (if (stringp url)
        (set-text-properties 0 (length url) nil url))
    (when (multibyte-string-p url)
!     (let ((url-unreserved-chars (append '(?: ?/) url-unreserved-chars)))
        (setq url (url-hexify-string url))))
    (if (not (vectorp url))
        (setq url (url-generic-parse-url url)))
    (if (not (functionp callback))
--- 180,193 ----
    (url-gc-dead-buffers)
    (if (stringp url)
        (set-text-properties 0 (length url) nil url))
+ 
    (when (multibyte-string-p url)
!     (let* ((reserved-chars '(?! ?# ?$ ?& ?' ?( ?) ?* ?+ ?, ?/ ?: ?\;
!                              ?= ?? ?@ ?[ ?]))
!            (url-unreserved-chars (append reserved-chars
!                                          url-unreserved-chars)))
        (setq url (url-hexify-string url))))
+ 
    (if (not (vectorp url))
        (setq url (url-generic-parse-url url)))
    (if (not (functionp callback))

From unknown Sun Sep 07 16:50:41 2025
From: Chong Yidong
To: Seth Mason
Cc: 7017@debbugs.gnu.org
Subject: bug#7017: url-retrieve seems busted
Date: Tue, 08 May 2012 13:25:10 +0800
Message-ID: <87k40n9rd5.fsf@gnu.org>
In-Reply-To: <87obpz9swe.fsf@gnu.org> (Chong Yidong's message of "Tue, 08 May 2012 12:52:01 +0800")
References: <87zk9jacda.fsf@edgecast.com> <87obpz9swe.fsf@gnu.org>

Chong Yidong writes:

> Applying url-hexify-string on the entire URL, as the previous patch did,
> is wrong. We mustn't hexify reserved characters that are being used in
> their special role. Unfortunately, figuring out when those characters
> are being used in their special role requires an implementation of
> RFC2396, which I don't think we currently have in Emacs.

Actually, I think we could use url-generic-parse-url for this.

From unknown Sun Sep 07 16:50:41 2025
From: Chong Yidong
To: Seth Mason
Cc: 7017@debbugs.gnu.org
Subject: bug#7017: url-retrieve seems busted
Date: Wed, 09 May 2012 16:34:53 +0800
Message-ID: <87d36dg3bm.fsf@gnu.org>
In-Reply-To: <87k40n9rd5.fsf@gnu.org> (Chong Yidong's message of "Tue, 08 May 2012 13:25:10 +0800")
References: <87zk9jacda.fsf@edgecast.com> <87obpz9swe.fsf@gnu.org> <87k40n9rd5.fsf@gnu.org>

Chong Yidong writes:

> Chong Yidong writes:
>
>> Applying url-hexify-string on the entire URL, as the previous patch did,
>> is wrong. We mustn't hexify reserved characters that are being used in
>> their special role. Unfortunately, figuring out when those characters
>> are being used in their special role requires an implementation of
>> RFC2396, which I don't think we currently have in Emacs.
>
> Actually, I think we could use url-generic-parse-url for this.

Fixed in trunk (revision 108172).

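The difference between hexifying a whole URL and hexifying its parts can be seen with a minimal sketch. It assumes an Emacs recent enough to provide `url-encode-url' in url-util.el (Emacs 24.3 or later). The thread does not say how revision 108172 implemented the fix, so this only illustrates the idea and does not describe that change.

```elisp
(require 'url-util)   ; url-hexify-string, url-encode-url
(require 'url-parse)  ; url-generic-parse-url and the url-* accessors

(let ((url "http://httpbin.org/get?x=1"))
  ;; Hexifying the whole string also escapes the delimiters ":", "/",
  ;; "?" and "=", so the server no longer sees a query string.
  (message "whole-string:  %s" (url-hexify-string url))

  ;; Parsing first tells us which characters act as delimiters.
  (let ((parsed (url-generic-parse-url url)))
    (message "host: %s  path+query: %s"
             (url-host parsed) (url-filename parsed)))

  ;; url-encode-url escapes each component separately and leaves the
  ;; reserved characters that delimit the components alone.
  (message "per-component: %s" (url-encode-url url)))
```

Evaluating the form prints three lines to *Messages*. Only the first should show the "?" and "=" percent-escaped, which matches the "/get%3fx%3d1" request reported earlier in the thread.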
From debbugs-submit-bounces@debbugs.gnu.org Wed May 09 04:37:24 2012
From: Chong Yidong
To: control@debbugs.gnu.org
Subject: close 7017
Date: Wed, 09 May 2012 16:35:05 +0800
Message-ID: <87bolx69c6.fsf@gnu.org>

close 7017
thanks