GNU bug report logs - #61519
30.0.50; thing-at-point misdetects emails with numerals in user part

Package: emacs;

Reported by: Aaron Madlon-Kay <aaron <at> madlon-kay.com>

Date: Tue, 14 Feb 2023 23:05:01 UTC

Severity: normal

Tags: fixed

Found in version 30.0.50

Fixed in version 30.1

Done: Robert Pluim <rpluim <at> gmail.com>

Bug is archived. No further changes may be made.

Full log


Message #8 received at 61519 <at> debbugs.gnu.org (full text, mbox):

From: Robert Pluim <rpluim <at> gmail.com>
To: Aaron Madlon-Kay <aaron <at> madlon-kay.com>
Cc: 61519 <at> debbugs.gnu.org
Subject: Re: bug#61519: 30.0.50; thing-at-point misdetects emails with
 numerals in user part
Date: Wed, 15 Feb 2023 12:15:48 +0100
>>>>> On Wed, 15 Feb 2023 08:04:26 +0900, Aaron Madlon-Kay <aaron <at> madlon-kay.com> said:

    Aaron> 1. Launch Emacs with `emacs -Q`
    Aaron> 2. Enter an email address with a numeral in the user part, like
    Aaron>    foo0bar@example.com
    Aaron> 3. With point inside the domain part of the email address, evaluate
    Aaron>    `(thing-at-point 'email)`
    Aaron> 4. Result will be `bar@example.com` (expected `foo0bar@example.com`)

    Aaron> The cause of this is the implementation of `thing-at-point-looking-at'
    Aaron> where it backs up one character at a time to find the start of the
    Aaron> email. The value for `thing-at-point-email-regexp' allows numbers in
    Aaron> the user part only from the *second* character, so as the function
    Aaron> backs up it will mistakenly find the `0` in `0bar@example.com` to be
    Aaron> outside of the email address.
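
The point about the first character can be checked in isolation. The sketch below is only an illustration: `demo-old-email-regexp` holds the pre-fix value of `thing-at-point-email-regexp' as quoted in the patch in this report, and `demo-matches-at-start-p' is a hypothetical helper (not part of thingatpt.el) that mimics `looking-at' with point at the start of a string.

```elisp
;; Pre-fix value of `thing-at-point-email-regexp', copied from this report.
(defconst demo-old-email-regexp
  "<?[-+_.~a-zA-Z][-+_.~:a-zA-Z0-9]*@[-.a-zA-Z0-9]+>?")

;; Hypothetical helper: anchor REGEXP at the start of STRING, which is
;; roughly what `looking-at' does at point in a buffer.
(defun demo-matches-at-start-p (regexp string)
  "Return t if REGEXP matches STRING starting at its first character."
  (let ((case-fold-search nil))
    (and (string-match-p (concat "\\`\\(?:" regexp "\\)") string) t)))

(demo-matches-at-start-p demo-old-email-regexp "bar@example.com")   ; => t
(demo-matches-at-start-p demo-old-email-regexp "0bar@example.com")  ; => nil
```

Because the match fails as soon as the candidate start is the digit, a scan that backs up one character at a time stops there and reports `bar@example.com`.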

That regexp has a few other issues, but breaking out the full RFC 822
parser for this would be overkill. Could you try the following patch?

Robert
-- 

diff --git i/lisp/thingatpt.el w/lisp/thingatpt.el
index 9363a474cb5..f3367290dee 100644
--- i/lisp/thingatpt.el
+++ w/lisp/thingatpt.el
@@ -645,7 +645,7 @@ thing-at-point-looking-at
 
 ;;   Email addresses
 (defvar thing-at-point-email-regexp
-  "<?[-+_.~a-zA-Z][-+_.~:a-zA-Z0-9]*@[-.a-zA-Z0-9]+>?"
+  "<?[-+_~a-zA-Z0-9][-+_.~:a-zA-Z0-9]*@[-a-zA-Z0-9]+[-.a-zA-Z0-9]*>?"
   "A regular expression probably matching an email address.
 This does not match the real name portion, only the address, optionally
 with angle brackets.")
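
To see the effect of the patch without applying it, the following sketch compares the two regexp values from the diff. It is a simplified model only: `demo-email-by-backing-up' is a hypothetical function that searches backward from the end of the text and then backs up one character at a time, as described in the report. It is not the actual `thing-at-point-looking-at' implementation, which also handles the position of point and search bounds.

```elisp
;; The two values of `thing-at-point-email-regexp' from the diff above.
(defconst demo-old-email-regexp
  "<?[-+_.~a-zA-Z][-+_.~:a-zA-Z0-9]*@[-.a-zA-Z0-9]+>?")
(defconst demo-new-email-regexp
  "<?[-+_~a-zA-Z0-9][-+_.~:a-zA-Z0-9]*@[-a-zA-Z0-9]+[-.a-zA-Z0-9]*>?")

;; Hypothetical, simplified model of the backing-up strategy.
(defun demo-email-by-backing-up (regexp text)
  "Return the match for REGEXP in TEXT found by backing up char by char."
  (with-temp-buffer
    (insert text)
    (goto-char (point-max))
    (let ((case-fold-search nil)
          match)
      ;; Find the last position at which REGEXP matches.
      (when (re-search-backward regexp nil t)
        (setq match (point))
        ;; Move the start back while REGEXP still matches at point.
        (while (and (not (bobp))
                    (progn (backward-char 1) (looking-at regexp))
                    (setq match (point))))
        (goto-char match)
        (looking-at regexp)
        (match-string 0)))))

(demo-email-by-backing-up demo-old-email-regexp "foo0bar@example.com")
;; => "bar@example.com"
(demo-email-by-backing-up demo-new-email-regexp "foo0bar@example.com")
;; => "foo0bar@example.com"
```

With the new value a digit is accepted as the first character of the user part, so the backward scan no longer stops at the `0`.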




This bug report was last modified 2 years and 156 days ago.
