GNU bug report logs - #572
thing-at-point 'url gets confused if url has paren


Package: emacs

Reported by: xah lee <xah <at> xahlee.org>

Date: Fri, 18 Jul 2008 12:50:03 UTC

Severity: normal

Done: joakim <at> verona.se

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 572 in the body.
You can then email your comments to 572 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#572; Package emacs.

Acknowledgement sent to xah lee <xah <at> xahlee.org>:
New bug report received and forwarded. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>.

Message #5 received at submit <at> emacsbugs.donarmstrong.com:

From: xah lee <xah <at> xahlee.org>
To: bug-gnu-emacs <at> gnu.org
Subject: thing-at-point 'url gets confused if url has paren
Date: Fri, 18 Jul 2008 05:41:38 -0700

(thing-at-point 'url) gets confused if the url contains a
parenthesis such as in
http://en.wikipedia.org/wiki/Oz_(programming_language)

Note that, according to
 http://en.wikipedia.org/wiki/Percent-encoding

parentheses in a URI do not necessarily need to be percent-encoded,
depending on the context in which the URI is used. Quote: «When a character
from the reserved set (a "reserved character") has special meaning (a
"reserved purpose") in a certain context, and a URI scheme says that
it is necessary to use that character for some other purpose, then
the character must be percent-encoded.»

But anyhow, practically speaking, URIs often contain parens. E.g.
Wikipedia has lots of articles whose URLs contain parens; browsers
show them as parens, and they are often copied and pasted
as is into editors.

The above should be the complete bug description.
The following is supplementary to this bug report.

--------------------------------------------
Here's some sample code:

(defun wrap-url ()
  "Make the URL at cursor point into an HTML link.

If there is a region, use the region as the URL instead.

This function is an interface wrapper for `wrap-url-string'.
See that function for detail."
  (interactive)
  (let (bds p1 p2 url)
    ;; Use the active region if there is one; otherwise ask
    ;; thing-at-point for the bounds of the URL under the cursor.
    (if (and transient-mark-mode mark-active)
        (progn
          (setq p1 (region-beginning))
          (setq p2 (region-end)))
      (progn
        (setq bds (bounds-of-thing-at-point 'url))
        (setq p1 (car bds))
        (setq p2 (cdr bds))))

    ;; Replace the URL text with its HTML link form.
    (setq url (buffer-substring-no-properties p1 p2))
    (delete-region p1 p2)
    (goto-char p1)
    (insert (wrap-url-string url))))

The error from the above code when the cursor is on the following line:
http://en.wikipedia.org/wiki/Oz_(programming_language)

is:
setq: Wrong type argument: integer-or-marker-p, nil

presumably because the boundary p1 or p2 is nil rather than an integer or marker.
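
Independently of the underlying bug, a caller can avoid this signal by checking what `bounds-of-thing-at-point' returns before using it. The following is only a minimal sketch and not part of the original report: `my-wrap-url-safely' is an illustrative name, and `my-wrap-url-string' is a hypothetical stand-in for `wrap-url-string', whose definition is not included here.

```elisp
(require 'thingatpt)

;; Hypothetical stand-in: the report does not include `wrap-url-string',
;; so this assumes it simply builds an anchor element.
(defun my-wrap-url-string (url)
  "Return URL wrapped in an HTML anchor element."
  (format "<a href=\"%s\">%s</a>" url url))

(defun my-wrap-url-safely ()
  "Replace the region or the URL at point with an HTML link.
Only show a message when no URL bounds are found."
  (interactive)
  (let ((bds (if (and transient-mark-mode mark-active)
                 (cons (region-beginning) (region-end))
               ;; This returns nil when no URL is recognized at point.
               (bounds-of-thing-at-point 'url))))
    (if (null bds)
        (message "No URL found at point")
      (let ((url (buffer-substring-no-properties (car bds) (cdr bds))))
        (delete-region (car bds) (cdr bds))
        (goto-char (car bds))
        (insert (my-wrap-url-string url))))))
```

This only turns the error into a message; a URL with parens is still not recognized until the regexp itself is fixed.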

-----------------

In GNU Emacs 22.2.1 (powerpc-apple-darwin8.11.0, Carbon Version 1.6.0)
 of 2008-04-05 on g5.tokyo.stp.isas.jaxa.jp
Windowing system distributor `Apple Inc.', version 10.4.11

  Xah
∑ http://xahlee.org/

☄


Information forwarded to bug-submit-list <at> lists.donarmstrong.com, Emacs Bugs <bug-gnu-emacs <at> gnu.org>:
bug#572; Package emacs.

Acknowledgement sent to joakim <at> verona.se:
Extra info received and forwarded to list. Copy sent to Emacs Bugs <bug-gnu-emacs <at> gnu.org>.

Message #10 received at 572 <at> emacsbugs.donarmstrong.com:

From: joakim <at> verona.se
To: 572 <at> debbugs.gnu.org
Subject: patch which seems to fix it
Date: Sun, 03 Aug 2008 23:43:01 +0200

Removing ( and ) from the excluded characters in
thing-at-point-url-path-regexp seems to fix this.

The OP's assessment that () are valid in a URL is correct.


=== modified file 'lisp/thingatpt.el'
*** lisp/thingatpt.el	2008-05-06 13:57:18 +0000
--- lisp/thingatpt.el	2008-08-03 21:38:46 +0000
***************
*** 208,214 ****
  	 (goto-char (point-min)))))
  
  (defvar thing-at-point-url-path-regexp
!   "[^]\t\n \"'()<>[^`{}]*[^]\t\n \"'()<>[^`{}.,;]+"
    "A regular expression probably matching the host and filename or e-mail part of a URL.")
  
  (defvar thing-at-point-short-url-regexp
--- 208,214 ----
  	 (goto-char (point-min)))))
  
  (defvar thing-at-point-url-path-regexp
!   "[^]\t\n \"'<>[^`{}]*[^]\t\n \"'<>[^`{}.,;]+"
    "A regular expression probably matching the host and filename or e-mail part of a URL.")
  
  (defvar thing-at-point-short-url-regexp
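
To illustrate what the patch changes, the two regexp values can be tried against the path part of the reported URL. This is a sketch under stated assumptions: the names `my-old-url-path-regexp', `my-new-url-path-regexp' and `my-match-url-path' are made up for the example, and it exercises only the path regexp on its own, not the full `thing-at-point' URL machinery.

```elisp
;; The two values of `thing-at-point-url-path-regexp' from the patch,
;; bound to illustrative names so the real variable is left untouched.
(defvar my-old-url-path-regexp
  "[^]\t\n \"'()<>[^`{}]*[^]\t\n \"'()<>[^`{}.,;]+"
  "Path regexp before the patch: ( and ) are excluded.")

(defvar my-new-url-path-regexp
  "[^]\t\n \"'<>[^`{}]*[^]\t\n \"'<>[^`{}.,;]+"
  "Path regexp after the patch: ( and ) are allowed.")

(defun my-match-url-path (regexp string)
  "Return the prefix of STRING matched by REGEXP, or nil if none."
  ;; "\\`" anchors the match at the start of STRING.
  (when (string-match (concat "\\`" regexp) string)
    (match-string 0 string)))

;; Before the patch the match stops in front of the opening paren.
(my-match-url-path my-old-url-path-regexp
                   "en.wikipedia.org/wiki/Oz_(programming_language)")
;; => "en.wikipedia.org/wiki/Oz_"

;; After the patch the whole path is matched.
(my-match-url-path my-new-url-path-regexp
                   "en.wikipedia.org/wiki/Oz_(programming_language)")
;; => "en.wikipedia.org/wiki/Oz_(programming_language)"
```

One trade-off follows from the regexp alone: for a URL written inside parentheses in running text, the patched path regexp also matches the closing paren.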


-- 
Joakim Verona




Reply sent to joakim <at> verona.se:
You have taken responsibility.

Notification sent to xah lee <xah <at> xahlee.org>:
bug acknowledged by developer.

Message #15 received at 572-done <at> emacsbugs.donarmstrong.com:

From: joakim <at> verona.se
To: 572-done <at> debbugs.gnu.org
Subject: committed a fix
Date: Thu, 07 Aug 2008 21:17:09 +0200

Thanks for the thorough bug report!
-- 
Joakim Verona




bug archived. Request was from Debbugs Internal Request <don <at> donarmstrong.com> to internal_control <at> emacsbugs.donarmstrong.com. (Fri, 05 Sep 2008 14:24:03 GMT)

This bug report was last modified 16 years and 290 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.