Package: emacs;
Reported by: Jonas Bernoulli <jonas <at> bernoul.li>
Date: Wed, 22 Nov 2023 22:19:01 UTC
Severity: normal
View this message in rfc822 format
From: Joseph Turner <joseph <at> ushin.org> To: João Távora <joaotavora <at> gmail.com> Cc: 67390 <at> debbugs.gnu.org, Adam Porter <adam <at> alphapapa.net>, Eli Zaretskii <eliz <at> gnu.org>, Jonas Bernoulli <jonas <at> bernoul.li> Subject: bug#67390: 28; shorthands-font-lock-shorthands assumes shorthand uses same separator Date: Sat, 09 Dec 2023 10:50:16 -0800
Hi João! Thanks for your patience - preparing for EmacsConf was a blast, and now I'm on a plane to go visit my grandmother! João Távora <joaotavora <at> gmail.com> writes: > Hi all, > > I've been working on all these shorthand-related issues over the last > two days and I have reasonably short fixes for all of them. > > For this particular issue (bug#67309), I've opted to > use Joseph's patch with very slight adjustments, as it's the > only one that guarantees correct behaviour and doesn't seem > to impact performance. > > The other issues are: > > bug#63480 (loaddefs-gen.el doesn't know about shorthands) > bug#67325 (prefix discovery i.e. register-definition-prefixes) > bug#67523 (check-declare.el doesn't know about shorthands) > > I have all this in 6 commits in the bugfix/shorthand-fixes branch. > > Here's the full patch minus whitespace changes. If there are > no comments I'll push in a few days' time. > > João > > diff --git a/doc/lispref/symbols.texi b/doc/lispref/symbols.texi > index 1f3b677d7fb..18e80311177 100644 > --- a/doc/lispref/symbols.texi > +++ b/doc/lispref/symbols.texi > @@ -761,6 +761,23 @@ Shorthands > ;; End: > @end example > > +Note that if you have two shorthands in the same file where one is the > +prefix of the other, the longer shorthand will be attempted first. > +This happens regardless of the order you specify shorthands in the > +local variables section of your file. > + > +@example > +'( > + t//foo ; reads to 'my-tricks--foo', not 'my-tricks-/foo' > + t/foo ; reads to 'my-tricks-foo' > + ) > + > +;; Local Variables: > +;; read-symbol-shorthands: (("t/" . "my-tricks-") > +;; ("t//" . "my-tricks--") > +;; End: > +@end example > + > @subsection Exceptions Clear and concise. > There are two exceptions to rules governing Shorthand transformations: > diff --git a/lisp/emacs-lisp/check-declare.el b/lisp/emacs-lisp/check-declare.el > index c887d95210c..b19aedf314d 100644 > --- a/lisp/emacs-lisp/check-declare.el > +++ b/lisp/emacs-lisp/check-declare.el > @@ -145,21 +145,26 @@ check-declare-verify > (if (file-regular-p fnfile) > (with-temp-buffer > (insert-file-contents fnfile) > + (unless cflag > + ;; If in Elisp, ensure syntax and shorthands available > + (set-syntax-table emacs-lisp-mode-syntax-table) > + (let (enable-local-variables) (hack-local-variables))) > ;; defsubst's don't _have_ to be known at compile time. > - (setq re (format (if cflag > - "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\"" > + (setq re (if cflag > + (format "^[ \t]*\\(DEFUN\\)[ \t]*([ \t]*\"%s\"" > + (regexp-opt (mapcar 'cadr fnlist) t)) > "^[ \t]*(\\(fset[ \t]+'\\|\ > cl-def\\(?:generic\\|method\\|un\\)\\|\ > def\\(?:un\\|subst\\|foo\\|method\\|class\\|\ > ine-\\(?:derived\\|generic\\|\\(?:global\\(?:ized\\)?-\\)?minor\\)-mode\\|\ > \\(?:ine-obsolete-function-\\)?alias[ \t]+'\\|\ > ine-overloadable-function\\)\\)\ > -[ \t]*%s\\([ \t;]+\\|$\\)") > - (regexp-opt (mapcar 'cadr fnlist) t))) > +[ \t]*\\(\\(?:\\sw\\|\\s_\\)+\\)\\([ \t;]+\\|$\\)")) Would you explain what this regexp is intended to match? > (while (re-search-forward re nil t) > (skip-chars-forward " \t\n") > - (setq fn (match-string 2) > - type (match-string 1) > + (setq fn (symbol-name (car (read-from-string (match-string 2))))) > + (when (member fn (mapcar 'cadr fnlist)) > + (setq type (match-string 1) > ;; (min . max) for a fixed number of arguments, or > ;; arglists with optional elements. > ;; (min) for arglists with &rest. > @@ -202,7 +207,7 @@ check-declare-verify > (t > 'err)) > ;; alist of functions and arglist signatures. > - siglist (cons (cons fn sig) siglist))))) > + siglist (cons (cons fn sig) siglist)))))) > (dolist (e fnlist) > (setq arglist (nth 2 e) > type On my machine, this patch removes some of the check-declare "function not found" errors, but not all. For example, with hyperdrive-lib.el: (check-declare-file "~/.local/src/hyperdrive.el/hyperdrive-lib.el") Before this patch, the "*Check Declarations Warnings*" buffer shows: --8<---------------cut here---------------start------------->8--- ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not found ■ hyperdrive-lib.el:1283:Warning (check-declare): said ‘h/org--link-goto’ was defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-org.el: function not found ■ hyperdrive-lib.el:45:Warning (check-declare): said ‘h/dir-mode’ was defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found ■ hyperdrive-lib.el:1069:Warning (check-declare): said ‘h/dir--entry-at-point’ was defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found --8<---------------cut here---------------end--------------->8--- and after your patch: --8<---------------cut here---------------start------------->8--- ■ hyperdrive-lib.el:44:Warning (check-declare): said ‘h/mode’ was defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive.el: function not found ■ hyperdrive-lib.el:508:Warning (check-declare): said ‘h/history’ was defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-history.el: function not found ■ hyperdrive-lib.el:1332:Warning (check-declare): said ‘h/dir-handler’ was defined in ../../../.emacs.d/elpa/hyperdrive/hyperdrive-dir.el: function not found --8<---------------cut here---------------end--------------->8--- Are you able to reproduce this on your machine? > diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el > index 04bea4723a2..e8093200bec 100644 > --- a/lisp/emacs-lisp/loaddefs-gen.el > +++ b/lisp/emacs-lisp/loaddefs-gen.el > @@ -378,6 +378,7 @@ loaddefs-generate--parse-file > (let ((defs nil) > (load-name (loaddefs-generate--file-load-name file main-outfile)) > (compute-prefixes t) > + read-symbol-shorthands > local-outfile inhibit-autoloads) > (with-temp-buffer > (insert-file-contents file) > @@ -399,7 +400,19 @@ loaddefs-generate--parse-file > (setq inhibit-autoloads (read (current-buffer))))) > (save-excursion > (when (re-search-forward "autoload-compute-prefixes: *" nil t) > - (setq compute-prefixes (read (current-buffer)))))) > + (setq compute-prefixes (read (current-buffer))))) > + (save-excursion > + ;; since we're "open-coding" we have to repeat more > + ;; complicated logic in `hack-local-variables'. > + (when (re-search-forward "read-symbol-shorthands: *" nil t) > + (let* ((commentless (replace-regexp-in-string > + "\n\\s-*;+" "" > + (buffer-substring (point) (point-max)))) > + (unsorted-shorthands (car (read-from-string commentless)))) > + (setq read-symbol-shorthands > + (sort unsorted-shorthands > + (lambda (sh1 sh2) > + (> (length (car sh1)) (length (car sh2)))))))))) IIUC, the intention here is to jump to a final "Local Variables" declaration at the end of the file, then remove ";;", then read in the uncommented value of `read-symbol-shorthands'. Since `read-from-string' just reads one expression, the above hunk works when there are more local variables after read-symbol-shorthands: ;; Local Variables: ;; read-symbol-shorthands: (("bc-" . "breadcrumb-")) ;; autoload-compute-prefixes: nil ;; End: But if the read-symbol-shorthands declaration comes at the top, as in... -*- read-symbol-shorthands: (("bc-" . "breadcrumb-")); -*- ...then this form will allocate two strings almost as long as the file. Here's an alternative hack attempting to uncomment and read the minimum: diff --git a/lisp/emacs-lisp/loaddefs-gen.el b/lisp/emacs-lisp/loaddefs-gen.el index e8093200bec..406e4b28f1f 100644 --- a/lisp/emacs-lisp/loaddefs-gen.el +++ b/lisp/emacs-lisp/loaddefs-gen.el @@ -404,10 +404,13 @@ don't include." (save-excursion ;; since we're "open-coding" we have to repeat more ;; complicated logic in `hack-local-variables'. - (when (re-search-forward "read-symbol-shorthands: *" nil t) - (let* ((commentless (replace-regexp-in-string + (when-let ((beg + (re-search-forward "read-symbol-shorthands: *" nil t))) + ;; `read-symbol-shorthands' alist ends with two parens. + (let* ((end (re-search-forward ")[;\n\s]*)")) + (commentless (replace-regexp-in-string "\n\\s-*;+" "" - (buffer-substring (point) (point-max)))) + (buffer-substring beg end))) (unsorted-shorthands (car (read-from-string commentless)))) (setq read-symbol-shorthands (sort unsorted-shorthands > ;; We always return the package version (even for pre-dumped > ;; files). > @@ -486,7 +499,11 @@ loaddefs-generate--compute-prefixes > (while (re-search-forward > "^(\\(def[^ \t\n]+\\)[ \t\n]+['(]*\\([^' ()\"\n]+\\)[\n \t]" nil t) > (unless (member (match-string 1) autoload-ignored-definitions) > - (let ((name (match-string-no-properties 2))) > + (let* ((name (match-string-no-properties 2)) > + ;; Consider `read-symbol-shorthands'. > + (probe (let ((obarray (obarray-make))) > + (car (read-from-string name))))) > + (setq name (symbol-name probe)) > (when (save-excursion > (goto-char (match-beginning 0)) > (or (bobp) > diff --git a/lisp/emacs-lisp/shorthands.el b/lisp/emacs-lisp/shorthands.el > index b0665a55695..69b562e3c7e 100644 > --- a/lisp/emacs-lisp/shorthands.el > +++ b/lisp/emacs-lisp/shorthands.el > @@ -52,38 +52,26 @@ elisp-shorthand-font-lock-face > :version "28.1" > :group 'font-lock-faces) > > -(defun shorthands--mismatch-from-end (str1 str2) > - "Tell index of first mismatch in STR1 and STR2, from end. > -The index is a valid 0-based index on STR1. Returns nil if STR1 > -equals STR2. Return 0 if STR1 is a suffix of STR2." > - (cl-loop with l1 = (length str1) with l2 = (length str2) > - for i from 1 > - for i1 = (- l1 i) for i2 = (- l2 i) > - while (eq (aref str1 i1) (aref str2 i2)) > - if (zerop i2) return (if (zerop i1) nil i1) > - if (zerop i1) return 0 > - finally (return i1))) > - > (defun shorthands-font-lock-shorthands (limit) > + "Font lock until LIMIT considering `read-symbol-shorthands'." > (when read-symbol-shorthands > (while (re-search-forward > (concat "\\_<\\(" (rx lisp-mode-symbol) "\\)\\_>") > limit t) > (let* ((existing (get-text-property (match-beginning 1) 'face)) > + (print-name (match-string 1)) > (probe (and (not (memq existing '(font-lock-comment-face > font-lock-string-face))) > - (intern-soft (match-string 1)))) > - (sname (and probe (symbol-name probe))) > - (mismatch (and sname (shorthands--mismatch-from-end > - (match-string 1) sname))) > - (guess (and mismatch (1+ mismatch)))) > - (when guess > - (when (and (< guess (1- (length (match-string 1)))) > - ;; In bug#67390 we allow other separators > - (eq (char-syntax (aref (match-string 1) guess)) ?_)) > - (setq guess (1+ guess))) > + (intern-soft print-name))) > + (symbol-name (and probe (symbol-name probe))) > + (prefix (and symbol-name > + (not (string-equal print-name symbol-name)) > + (car (assoc print-name > + read-symbol-shorthands > + #'string-prefix-p))))) > + (when prefix > (add-face-text-property (match-beginning 1) > - (+ (match-beginning 1) guess) > + (+ (match-beginning 1) (length prefix)) > 'elisp-shorthand-font-lock-face)))))) Works well. let-binding `symbol-name' and `print-name' is good improvement. > (font-lock-add-keywords 'emacs-lisp-mode > '((shorthands-font-lock-shorthands)) t) > diff --git a/lisp/files.el b/lisp/files.el > index 1cdcec23b11..b266d0727ec 100644 > --- a/lisp/files.el > +++ b/lisp/files.el > @@ -3735,7 +3735,8 @@ before-hack-local-variables-hook > This hook is called only if there is at least one file-local > variable to set.") > > -(defvar permanently-enabled-local-variables '(lexical-binding) > +(defvar permanently-enabled-local-variables > + '(lexical-binding read-symbol-shorthands) > "A list of file-local variables that are always enabled. > This overrides any `enable-local-variables' setting.") > > @@ -4171,6 +4172,13 @@ hack-local-variables--find-variables > ;; to use 'thisbuf's name in the > ;; warning message. > (or (buffer-file-name thisbuf) "")))))) > + ((eq var 'read-symbol-shorthands) > + ;; Sort automatically by shorthand length > + ;; descending > + (setq val (sort val > + (lambda (sh1 sh2) (> > (length (car sh1)) > + > (length (car sh2)))))) > + (push (cons 'read-symbol-shorthands val) result)) > ((and (eq var 'mode) handle-mode)) > (t > (ignore-errors Good catch. I agree that longer shorthands should be applied first. ----- A couple typo nits on the commit message of "Improve shorthands-font-lock-shorthands (bug#67390)": - h//thingy ; hilits "//" reads to 'hyperdrive--thingy' + h//thingy ; hilits "h//" reads to 'hyperdrive--thingy' - Co-authored-by: João Távora <joaotavora <at> gmail.com> + Co-authored-by: Joseph Turner <joseph <at> breatheoutbreathe.in> Thank you! Joseph
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.