GNU bug report logs -
#53749
29.0.50; [PATCH] Xref backend for TeX buffers
Previous Next
Reported by: David Fussner <dfussner <at> googlemail.com>
Date: Thu, 3 Feb 2022 15:10:02 UTC
Severity: normal
Tags: patch
Found in version 29.0.50
Fixed in version 31.1
Done: Stefan Kangas <stefankangas <at> gmail.com>
Bug is archived. No further changes may be made.
Full log
Message #209 received at 53749 <at> debbugs.gnu.org (full text, mbox):
Hi,
Apparently I'm the `tex-mode.el` guy, so I tried to take a look.
> diff --git a/lisp/textmodes/tex-mode.el b/lisp/textmodes/tex-mode.el
> index 97c950267c6..d990a2dbfa9 100644
> --- a/lisp/textmodes/tex-mode.el
> +++ b/lisp/textmodes/tex-mode.el
> @@ -695,7 +696,25 @@ tex-verbatim-environments
> ("\\\\\\(?:end\\|begin\\) *\\({[^\n{}]*}\\)"
> (1 (ignore
> (tex-env-mark (match-beginning 0)
> - (match-beginning 1) (match-end 1))))))))
> + (match-beginning 1) (match-end 1)))))
> + ;; The next two rules change the syntax of `:' and `_' in expl3
> + ;; constructs, so that `tex-font-lock-suscript' can fontify them
> + ;; more accurately.
> + ((concat "\\(\\(?:[\\\\[:space:]{]_\\|"
> + "[\\\\{[:space:]][^][_[:space:][:cntrl:][:digit:]\\\\{}()/=]+\\)"
> + "\\(?:_+\\(?:[^][[:space:][:cntrl:][:digit:]:\\\\{}()/#_=]+\\|"
> + "#+[1-9]\\)\\)+\\)\\([:_]?\\)")
Can you add in the comment some URL pointing to some relevant expl3
documentation which "explains" why the above regexp makes sense?
Also I don't clearly see how the above regexp distinguishes expl3 code
from "normal" LaTeX code, so the comment should say something about it.
Side note: I'd avoid [:space:] whose exact meaning is rarely quite what
we need.
Side note: backslash doesn't need to be backslashed in [...].
> + (1 (ignore
> + (let* ((expr (buffer-substring-no-properties (match-beginning 1)
> + (match-end 1)))
> + (list (seq-positions expr ?_)))
> + (dolist (pos list)
> + (put-text-property (+ pos (match-beginning 1))
> + (1+ (+ pos (match-beginning 1)))
> + 'syntax-table (string-to-syntax "_"))))))
> + (2 "_"))
> + ("\\\\[[:alpha:]]+\\(:\\)[[:alpha:][:space:]\n]"
> + (1 "_")))))
Currently we "skip" inappropriate underscores via
`tex-font-lock-match-suscript` and/or by adding a particular `face` text
property rather than via `syntax-table/propertize`.
For algorithmic reasons, it's better to minimize the work done in
`syntax-propertize-function` as much as possible (font-lock is more lazy
than `syntax-propertize`), so I recommend you try and moving the above
to font-lock rules.
> +(defvar tex-esc-and-group-chars '(?\\ ?{ ?})
> + "The current TeX escape and grouping characters.
I recommend you backslash escape the { and } above (although it's not
indispensable, `emacs-lisp-mode` will parse the code better).
More importantly, the docstring doesn't explain what this list
means/does. E.g. does the order matter? Can it be longer than 3 elements?
From the current docstring I can't guess what would be the consequence
of adding/removing elements to/from this list.
> +;; Populate `semantic-symref-filepattern-alist' for the in-tree modes;
> +;; AUCTeX is doing the same for its modes.
> +(defvar semantic-symref-filepattern-alist)
> +(with-eval-after-load 'semantic/symref/grep
> + (push '(latex-mode "*.[tT]e[xX]" "*.ltx" "*.sty" "*.cl[so]"
> + "*.bbl" "*.drv" "*.hva")
> + semantic-symref-filepattern-alist)
> + (push '(plain-tex-mode "*.[tT]e[xX]" "*.ins")
> + semantic-symref-filepattern-alist)
> + (push '(doctex-mode "*.dtx") semantic-symref-filepattern-alist))
We know `semantic-symref-filepattern-alist` will exist when
`semantic/symref/grep` is loaded, but not before, so I'd put the
`defvar` inside the `with-eval-after-load`.
> +;; Setup AUCTeX modes (for testing purposes only).
> +
> +(add-hook 'TeX-mode-hook #'tex-set-auctex-xref-backend)
> +
> +(defun tex-set-auctex-xref-backend ()
> + (add-hook 'xref-backend-functions #'tex--xref-backend nil t))
I assume this will be sent to AUCTeX and is not meant to be in
`tex-mode.el`, right?
> +;; `xref-find-references' currently may need this when called from a
> +;; latex-mode buffer in order to search files or buffers with a .tex
> +;; suffix (including the buffer from which it has been called). We
> +;; append it to `auto-mode-alist' so as not to interfere with the usual
> +;; mode-setting apparatus. Changes here and in AUCTeX should soon
> +;; render it unnecessary.
> +(add-to-list 'auto-mode-alist '("\\.[tT]e[xX]\\'" . latex-mode) t)
Maybe I have not followed the whole discussion closely enough, but at
least to me the above "soon" is very unclear.
I'll assume that this code will be removed before we install the patch.
If not, please explain in the comment why this specific hack is needed
and how it works.
> +(cl-defmethod xref-backend-references ((_backend (eql 'tex-etags)) identifier)
> + "Find references of IDENTIFIER in TeX buffers and files."
> + (require 'semantic/symref/grep)
> + (let (bufs texbufs
> + (mode major-mode))
> + (dolist (buf (buffer-list))
> + (if (eq (buffer-local-value 'major-mode buf) mode)
> + (push buf bufs)
> + (when (string-match-p ".*\\.[tT]e[xX]" (buffer-name buf))
> + (push buf texbufs))))
> + (unless (seq-set-equal-p tex--buffers-list bufs)
> + (let* ((amalist (tex--collect-file-extensions))
> + (extlist (alist-get mode semantic-symref-filepattern-alist))
> + (extlist-new (seq-uniq
> + (seq-union amalist extlist #'string-match-p))))
After sinking the `defvar` above, you'll need to add a new `defvar` for
`semantic-symref-filepattern-alist` just after the `require`.
> + (setq-local syntax-propertize-function
> + (eval
> + `(tex-xref-syntax-function
> + ,identifier ,beg ,end)))
Why do we need to change `syntax-propertize-function` and why do we need
`eval`?
> + (setq syntax-propertize--done 0)
This is not sufficient. You want to `syntax-ppss-flush-cache`.
Stefan
This bug report was last modified 243 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.