GNU bug report logs - #67523
check-declare doesn't account for shorthands

Previous Next

Package: emacs;

Reported by: Joseph Turner <joseph <at> ushin.org>

Date: Wed, 29 Nov 2023 09:13:02 UTC

Severity: wishlist

To reply to this bug, email your comments to 67523 AT debbugs.gnu.org.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to adam <at> alphapapa.net, jonas <at> bernoul.li, joaotavora <at> gmail.com, bug-gnu-emacs <at> gnu.org:
bug#67523; Package emacs. (Wed, 29 Nov 2023 09:13:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Joseph Turner <joseph <at> ushin.org>:
New bug report received and forwarded. Copy sent to adam <at> alphapapa.net, jonas <at> bernoul.li, joaotavora <at> gmail.com, bug-gnu-emacs <at> gnu.org. (Wed, 29 Nov 2023 09:13:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Joseph Turner <joseph <at> ushin.org>
To: Emacs Bugs Mailing List <bug-gnu-emacs <at> gnu.org>
Subject: check-declare doesn't account for shorthands
Date: Wed, 29 Nov 2023 00:03:50 -0800
On Emacs 29.1, when running `check-declare-file' on a file with
`declare-function' forms, I get

Warning (check-declare): said ‘some-nice-string-utils-foobar’ was defined in
some-nice-string-utils.el: function not found

The problem is that `check-declare-verify' attempts to search for the
full symbol name using this regular expression:

(setq re (format (if cflag
                     "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
                   "^[ \t]*(\\(fset[ \t]+'\\|\
cl-def\\(?:generic\\|method\\|un\\)\\|\
def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
\\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
ine-overloadable-function\\)\\)\
[ \t]*%s\\([ \t;]+\\|$\\)")
                 (regexp-opt (mapcar 'cadr fnlist) t)))
(while (re-search-forward re nil t)
  ...
  )

where (mapcar 'cadr fnlist) is the full symbol name.

Since the full symbol name never appears in the file in which it was
defined, re-search-forward never finds it, and so "function not found".

A potential solution could be to convert the longhand symbol into its
shorthand form and pass that into re-search-forward.  This is tricky
since there may be multiple different shorthands which could yield the
same longhand form.  It might be more feasible to run re-search-forward
on a known common suffix portion of the symbol name, then with point on
the suspected definition, run `intern-soft' to get the full symbol name.

A workaround is to not use shorthands in function definitions.

Thoughts?

Joseph




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67523; Package emacs. (Wed, 29 Nov 2023 09:58:01 GMT) Full text and rfc822 format available.

Message #8 received at 67523 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Joseph Turner <joseph <at> ushin.org>
Cc: 67523 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: bug#67523: check-declare doesn't account for shorthands
Date: Wed, 29 Nov 2023 09:56:32 +0000
On Wed, Nov 29, 2023 at 9:12 AM Joseph Turner <joseph <at> ushin.org> wrote:

> A potential solution could be to convert the longhand symbol into its
> shorthand form and pass that into re-search-forward.  This is tricky
> since there may be multiple different shorthands which could yield the
> same longhand form.  It might be more feasible to run re-search-forward
> on a known common suffix portion of the symbol name, then with point on
> the suspected definition, run `intern-soft' to get the full symbol name.

No, this is brittle.  Check-declare, if it's to be useful (is it?)
is probably meant to be as precise as possible.

> A workaround is to not use shorthands in function definitions.

That's letting the bad guys win :-)

> Thoughts?

As usual, my thoughts are that tools that read Lisp code
should use the Lisp reader, not regular expressions.

Here, check-declare should just walk the whole file.
When it finds a a symbol atom that matches a definition
form,  look in the next atom and check if it matches the probe.
If you don't want to intern all the symbols in the you can read
to a separate obarray I think.

João




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67523; Package emacs. (Wed, 29 Nov 2023 10:37:01 GMT) Full text and rfc822 format available.

Message #11 received at 67523 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Joseph Turner <joseph <at> ushin.org>
Cc: 67523 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: bug#67523: check-declare doesn't account for shorthands
Date: Wed, 29 Nov 2023 10:35:34 +0000
On Wed, Nov 29, 2023 at 9:56 AM João Távora <joaotavora <at> gmail.com> wrote:
>
> On Wed, Nov 29, 2023 at 9:12 AM Joseph Turner <joseph <at> ushin.org> wrote:
>
> > A potential solution could be to convert the longhand symbol into its
> > shorthand form and pass that into re-search-forward.  This is tricky
> > since there may be multiple different shorthands which could yield the
> > same longhand form.  It might be more feasible to run re-search-forward
> > on a known common suffix portion of the symbol name, then with point on
> > the suspected definition, run `intern-soft' to get the full symbol name.
>
> No, this is brittle.  Check-declare, if it's to be useful (is it?)
> is probably meant to be as precise as possible.
>
> > A workaround is to not use shorthands in function definitions.
>
> That's letting the bad guys win :-)
>
> > Thoughts?
>
> As usual, my thoughts are that tools that read Lisp code
> should use the Lisp reader, not regular expressions.
>
> Here, check-declare should just walk the whole file.

Or maybe just this 100% guaranteed untested patch would work:

diff --git a/lisp/emacs-lisp/check-declare.el b/lisp/emacs-lisp/check-declare.el
index c887d95210c..00440276643 100644
--- a/lisp/emacs-lisp/check-declare.el
+++ b/lisp/emacs-lisp/check-declare.el
@@ -145,6 +145,7 @@ check-declare-verify
     (if (file-regular-p fnfile)
         (with-temp-buffer
           (insert-file-contents fnfile)
+          (hack-local-variables) ;; for shorthands
           ;; defsubst's don't _have_ to be known at compile time.
           (setq re (format (if cflag
                                "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
@@ -158,7 +159,7 @@ check-declare-verify
                            (regexp-opt (mapcar 'cadr fnlist) t)))
           (while (re-search-forward re nil t)
             (skip-chars-forward " \t\n")
-            (setq fn (match-string 2)
+            (setq fn (symbol-name (car (read-from-string (match-string 2))))
                   type (match-string 1)
                   ;; (min . max) for a fixed number of arguments, or
                   ;; arglists with optional elements.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67523; Package emacs. (Wed, 29 Nov 2023 11:13:02 GMT) Full text and rfc822 format available.

Message #14 received at 67523 <at> debbugs.gnu.org (full text, mbox):

From: João Távora <joaotavora <at> gmail.com>
To: Joseph Turner <joseph <at> ushin.org>
Cc: 67523 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: bug#67523: check-declare doesn't account for shorthands
Date: Wed, 29 Nov 2023 11:12:11 +0000
On Wed, Nov 29, 2023 at 10:35 AM João Távora <joaotavora <at> gmail.com> wrote:
>
> On Wed, Nov 29, 2023 at 9:56 AM João Távora <joaotavora <at> gmail.com> wrote:
> >
> > On Wed, Nov 29, 2023 at 9:12 AM Joseph Turner <joseph <at> ushin.org> wrote:
> >
> > > A potential solution could be to convert the longhand symbol into its
> > > shorthand form and pass that into re-search-forward.  This is tricky
> > > since there may be multiple different shorthands which could yield the
> > > same longhand form.  It might be more feasible to run re-search-forward
> > > on a known common suffix portion of the symbol name, then with point on
> > > the suspected definition, run `intern-soft' to get the full symbol name.
> >
> > No, this is brittle.  Check-declare, if it's to be useful (is it?)
> > is probably meant to be as precise as possible.
> >
> > > A workaround is to not use shorthands in function definitions.
> >
> > That's letting the bad guys win :-)
> >
> > > Thoughts?
> >
> > As usual, my thoughts are that tools that read Lisp code
> > should use the Lisp reader, not regular expressions.
> >
> > Here, check-declare should just walk the whole file.
>
> Or maybe just this 100% guaranteed untested patch would work:

Sorry, that was 100% untested indeed.  This patch seems to work:

diff --git a/lisp/emacs-lisp/check-declare.el b/lisp/emacs-lisp/check-declare.el
index c887d95210c..bc3844ca9be 100644
--- a/lisp/emacs-lisp/check-declare.el
+++ b/lisp/emacs-lisp/check-declare.el
@@ -145,21 +145,26 @@ check-declare-verify
     (if (file-regular-p fnfile)
         (with-temp-buffer
           (insert-file-contents fnfile)
+          (unless cflag
+            ;; for syntax and shorthands
+            (lisp-data-mode)
+            (hack-local-variables))
           ;; defsubst's don't _have_ to be known at compile time.
-          (setq re (format (if cflag
-                               "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
-                             "^[ \t]*(\\(fset[ \t]+'\\|\
+          (setq re (if cflag
+                       (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
+                               (regexp-opt (mapcar 'cadr fnlist) t))
+                     "^[ \t]*(\\(fset[ \t]+'\\|\
 cl-def\\(?:generic\\|method\\|un\\)\\|\
 def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
 ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
 \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
 ine-overloadable-function\\)\\)\
-[ \t]*%s\\([ \t;]+\\|$\\)")
-                           (regexp-opt (mapcar 'cadr fnlist) t)))
+[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
           (while (re-search-forward re nil t)
             (skip-chars-forward " \t\n")
-            (setq fn (match-string 2)
-                  type (match-string 1)
+            (setq fn (symbol-name (car (read-from-string (match-string 2)))))
+            (when (member fn (mapcar 'cadr fnlist))
+            (setq type (match-string 1)
                   ;; (min . max) for a fixed number of arguments, or
                   ;; arglists with optional elements.
                   ;; (min) for arglists with &rest.
@@ -202,7 +207,7 @@ check-declare-verify
                             (t
                              'err))
                   ;; alist of functions and arglist signatures.
-                  siglist (cons (cons fn sig) siglist)))))
+                  siglist (cons (cons fn sig) siglist))))))
     (dolist (e fnlist)
       (setq arglist (nth 2 e)
             type




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#67523; Package emacs. (Sun, 10 Dec 2023 10:59:02 GMT) Full text and rfc822 format available.

Message #17 received at 67523 <at> debbugs.gnu.org (full text, mbox):

From: Joseph Turner <joseph <at> ushin.org>
To: João Távora <joaotavora <at> gmail.com>
Cc: 67523 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>,
 Jonas Bernoulli <jonas <at> bernoul.li>
Subject: Re: bug#67523: check-declare doesn't account for shorthands
Date: Sun, 10 Dec 2023 02:57:50 -0800
João Távora <joaotavora <at> gmail.com> writes:

> On Wed, Nov 29, 2023 at 10:35 AM João Távora <joaotavora <at> gmail.com> wrote:
>>
>> On Wed, Nov 29, 2023 at 9:56 AM João Távora <joaotavora <at> gmail.com> wrote:
>> >
>> > On Wed, Nov 29, 2023 at 9:12 AM Joseph Turner <joseph <at> ushin.org> wrote:
>> >
>> > > A potential solution could be to convert the longhand symbol into its
>> > > shorthand form and pass that into re-search-forward.  This is tricky
>> > > since there may be multiple different shorthands which could yield the
>> > > same longhand form.  It might be more feasible to run re-search-forward
>> > > on a known common suffix portion of the symbol name, then with point on
>> > > the suspected definition, run `intern-soft' to get the full symbol name.
>> >
>> > No, this is brittle.  Check-declare, if it's to be useful (is it?)
>> > is probably meant to be as precise as possible.
>> >
>> > > A workaround is to not use shorthands in function definitions.
>> >
>> > That's letting the bad guys win :-)
>> >
>> > > Thoughts?
>> >
>> > As usual, my thoughts are that tools that read Lisp code
>> > should use the Lisp reader, not regular expressions.
>> >
>> > Here, check-declare should just walk the whole file.
>>
>> Or maybe just this 100% guaranteed untested patch would work:
>
> Sorry, that was 100% untested indeed.  This patch seems to work:
>
> diff --git a/lisp/emacs-lisp/check-declare.el b/lisp/emacs-lisp/check-declare.el
> index c887d95210c..bc3844ca9be 100644
> --- a/lisp/emacs-lisp/check-declare.el
> +++ b/lisp/emacs-lisp/check-declare.el
> @@ -145,21 +145,26 @@ check-declare-verify
>      (if (file-regular-p fnfile)
>          (with-temp-buffer
>            (insert-file-contents fnfile)
> +          (unless cflag
> +            ;; for syntax and shorthands
> +            (lisp-data-mode)
> +            (hack-local-variables))
>            ;; defsubst's don't _have_ to be known at compile time.
> -          (setq re (format (if cflag
> -                               "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
> -                             "^[ \t]*(\\(fset[ \t]+'\\|\
> +          (setq re (if cflag
> +                       (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\""
> +                               (regexp-opt (mapcar 'cadr fnlist) t))
> +                     "^[ \t]*(\\(fset[ \t]+'\\|\
>  cl-def\\(?:generic\\|method\\|un\\)\\|\
>  def\\(?:un\\|subst\\|foo\\|method\\|class\\|\
>  ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\
>  \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\
>  ine-overloadable-function\\)\\)\
> -[ \t]*%s\\([ \t;]+\\|$\\)")
> -                           (regexp-opt (mapcar 'cadr fnlist) t)))
> +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)"))
>            (while (re-search-forward re nil t)
>              (skip-chars-forward " \t\n")
> -            (setq fn (match-string 2)
> -                  type (match-string 1)
> +            (setq fn (symbol-name (car (read-from-string (match-string 2)))))
> +            (when (member fn (mapcar 'cadr fnlist))
> +            (setq type (match-string 1)
>                    ;; (min . max) for a fixed number of arguments, or
>                    ;; arglists with optional elements.
>                    ;; (min) for arglists with &rest.
> @@ -202,7 +207,7 @@ check-declare-verify
>                              (t
>                               'err))
>                    ;; alist of functions and arglist signatures.
> -                  siglist (cons (cons fn sig) siglist)))))
> +                  siglist (cons (cons fn sig) siglist))))))
>      (dolist (e fnlist)
>        (setq arglist (nth 2 e)
>              type

IIUC, this patch is being merged in response to bug#67390.




Severity set to 'wishlist' from 'normal' Request was from Stefan Kangas <stefankangas <at> gmail.com> to control <at> debbugs.gnu.org. (Wed, 27 Dec 2023 21:53:02 GMT) Full text and rfc822 format available.

This bug report was last modified 1 year and 174 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.