GNU bug report logs - #32372
[PATCH] Add "uuid" to thing-at-point.el

Previous Next

Package: emacs;

Reported by: Raimon Grau <raimon <at> konghq.com>

Date: Sun, 5 Aug 2018 23:14:02 UTC

Severity: wishlist

Tags: fixed, patch

Fixed in version 27.1

Done: Noam Postavsky <npostavs <at> gmail.com>

Bug is archived. No further changes may be made.

Full log


Message #13 received at control <at> debbugs.gnu.org (full text, mbox):

From: Noam Postavsky <npostavs <at> gmail.com>
To: Raimon Grau <raimon <at> konghq.com>
Cc: 32372 <at> debbugs.gnu.org
Subject: Re: bug#32372: [PATCH] Add "uuid" to thing-at-point.el
Date: Sun, 05 Aug 2018 22:31:43 -0400
severity 32372 wishlist
quit

Raimon Grau <raimon <at> konghq.com> writes:

> Subject: [PATCH] Add uuid as allowed thingatpt symbol
>
> * lisp/thingatpt.el (thing-at-point-uuid-regexp): Add regexp for uuid.

I guess you should mention something about the ops as well here.  Though
it's not 100% clear what kind of format you should use for those.  Maybe
just (top-level): Add 'bounds-of-thing-at-point' operation for 'uuid'.

> +;; UUID
> +
> +(defvar thing-at-point-uuid-regexp
> +  (rx (and bow

Using rx is okay, I think.  There was some discussion about it on
emacs-devel a little time ago, with most people saying the increased
verbosity made them not want to use it, but I kind of like it myself.
However, Stefan made the point that `and' is potentially a bit
confusing, because it could be misread as intersection.  It's better to
use one of the synonyms `seq' or `:'.

> +           (or
> +            "00000000-0000-0000-0000-000000000000"
> +            (and
> +             (repeat 8 hex-digit) "-"
> +             (repeat 4 hex-digit) "-"
> +             (or "1" "2" "3" "4" "5")
> +             (repeat 3 hex-digit) "-"
> +             (or "8" "9" "a" "b" "A" "B")
> +             (repeat 3 hex-digit) "-"
> +             (repeat 12 hex-digit)))
> +           eow))
> +  "A regular expression matching a UUID from versions 1 to 5.
> +
> +  More info on uuid's format in
> +  https://tools.ietf.org/html/rfc4122." )

So, in that RFC I see this grammar

      UUID                   = time-low "-" time-mid "-"
                               time-high-and-version "-"
                               clock-seq-and-reserved
                               clock-seq-low "-" node
      time-low               = 4hexOctet
      time-mid               = 2hexOctet
      time-high-and-version  = 2hexOctet
      clock-seq-and-reserved = hexOctet
      clock-seq-low          = hexOctet
      node                   = 6hexOctet
      hexOctet               = hexDigit hexDigit
      hexDigit =
            "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9" /
            "a" / "b" / "c" / "d" / "e" / "f" /
            "A" / "B" / "C" / "D" / "E" / "F"

It looks like you crafted a regexp which is a tighter match for just the
UUID versions currently in use.  I think we're better off with the
looser definition though, that way it will continue to be correct even
as new versions come out.

Furthermore, I would guess a human user is going to be surprised if
(thing-at-point 'uuid) picks up this

    12345678-1234-1234-8123-123456789012

but not this:

    12345678-1234-1234-5123-123456789012


> +(put 'uuid 'thing-at-point
> +     (lambda ()
> +       (let ((boundary-pair (bounds-of-thing-at-point 'uuid)))
> +         (if boundary-pair
> +             (buffer-substring-no-properties
> +              (car boundary-pair) (cdr boundary-pair))))))

I think this isn't needed, because the `thing-at-point' function already
does this for you:

  (let ((text
         (if (get thing 'thing-at-point)
             (funcall (get thing 'thing-at-point))
           (let ((bounds (bounds-of-thing-at-point thing)))
             (when bounds
               (buffer-substring (car bounds) (cdr bounds)))))))




This bug report was last modified 6 years and 278 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.