GNU bug report logs - #78612
imenu list generation failing on some items of similar quality


Package: emacs

Reported by: mdnorton <mdnorton <at> proton.me>

Date: Wed, 28 May 2025 01:46:03 UTC

Severity: normal

Done: Eli Zaretskii <eliz <at> gnu.org>



Message #14 received at 78612 <at> debbugs.gnu.org (full text, mbox):

From: mdnorton <mdnorton <at> proton.me>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: 78612 <at> debbugs.gnu.org
Subject: Re: bug#78612: imenu list generation failing on some items of similar
 quality
Date: Thu, 29 May 2025 17:57:17 +0000
Yes, the latest version seems to do what is necessary.  That solves the initial issue, and I suppose imenu is just deeply reliant on a good regex for predictable behavior.  Thanks for considering this problem.



On Thursday, May 29th, 2025 at 8:16 AM, Eli Zaretskii <eliz <at> gnu.org> wrote:

> [Please use Reply All to reply, to keep the bug tracker CC'ed.]
> 
> > Date: Thu, 29 May 2025 12:42:07 +0000
> > From: mdnorton <mdnorton <at> proton.me>
> > 
> > Thank you for looking into this. I performed the steps as indicated (with my own variation -- I called the file foo2.m for example). I do NOT get your results.
> > 
> > Attached is a snapshot of the imenu entry that shows the results; you can see only the last 3 entries.
> > 
> > At this point, I do think this is probably an interaction, though I don't know exactly how much to attribute to imenu and how much to matlab-mode. While John Ciolfi and I were swapping debug details on this issue in the Emacs-MATLAB-Mode GitHub repo (https://github.com/mathworks/Emacs-MATLAB-Mode/issues/42), he did note that the particularly gnarly regex had a backtracking problem due to the way MATLAB handles line continuations (a three-character sequence of periods, "...") and that he'd gone with a somewhat simpler one that doesn't handle every case. I'm still at the same commit, as I've not had a chance to re-pull the repository and experiment.
> > 
> > Look-backs in a regex state machine are kind of notorious for going off the rails. However, should a malfunctioning regex be able to eliminate entries in imenu? I suppose that's the bug that I believe might be in imenu. Perhaps imenu is entirely at the mercy of the imenu-generic-expression regex, and if that regex has problems, then imenu's behavior is undefined. If that's the case, then that's the way it's been written and it's up to the package developer to create a better-working regex. However, my speculation (because I am not great at Elisp and have not dug into the imenu code) is that imenu steps serially through the buffer applying the regex, adding list items as it finds them. So, should a malfunctioning regex be capable of removing PRIOR list items that imenu has discovered (assuming my speculation on how it works is correct)? How should imenu behave when the regex has issues?
> > 
> > In any event, I will update my clone of the Emacs-MATLAB-Mode repository, and then I'll have John's newest, simpler regex, which will work for my purposes (I don't use the crazy cases of line continuation he was trying to cover). So maybe this is just a case where imenu is at the mercy of a regex, and gracefully recovering from that is difficult or impossible. I just figured I'd report it in case there was a fundamental behavior issue; if it's a case that the code cannot anticipate and recover from, and that's accepted, then that's alright. I know it's impossible to foolproof everything!
> > 
> > Details about this particular regex at this SHA below:
> > 
> > The SHA that I'm at is eea387. Just for reference, these are the contents of imenu-generic-expression for this commit. Note that there are literal ^M characters in there, so those have been transcribed to the string "^M", which isn't really the same thing but is required for the email environment.
> > 
> > Value in #<buffer foo2.m>
> > ((nil
> > "^[[:blank:]]function\\>\\(?:\\(?:[]\\[a-zA-Z0-9_,[:blank:]]\\(?:\\.\\.\\.[[:blank:]]\\(?:%[^^M\n]\\)?^M?\n\\)?\\)+[[:blank:]]=\\)?\\(?:[[:blank:]]\\(?:\\.\\.\\.[[:blank:]]\\(?:%[^^M\n]\\)?^M?\n\\)?\\)[\\.[:space:]\n^M]\\([a-zA-Z][a-zA-Z0-9_]+\\)"
> > 1))
> > 
> > And here is the bit of matlab.el at this commit that creates this, because it's a bit easier to read than the string regex above (though even this form is pretty rough). There are literal ^M's here too; however, they just show up as whitespace in the Elisp.
> > 
> > ;; -----------------
> > ;; | Imenu support |
> > ;; -----------------
> > ;; Example functions we match, f0, f1, f2, f3, f4, f5, F6, g4
> > ;; function f0
> > ;; function...
> > ;; a = f1
> > ;; function f2
> > ;; function x = ...
> > ;; f3
> > ;; function [a, ...
> > ;; b ] ...
> > ;; = ...
> > ;; f4(c)
> > ;; function a = F6
> > ;; function [ ...
> > ;; a, ... % comment for a
> > ;; b ... % comment for b
> > ;; ] ...
> > ;; = ...
> > ;; g4(c)
> > ;;
> > (defvar matlab-imenu-generic-expression
> >   ;; Using concat to increase indentation and improve readability
> >   `(,(list nil (concat
> >                 "^[[:blank:]]*"
> >                 "function\\>"
> > 
> >                 ;; Optional return args, function ARGS = NAME. Capture the 'ARGS ='
> >                 (concat "\\(?:"
> > 
> >                         ;; ARGS can span multiple lines
> >                         (concat "\\(?:"
> >                                 ;; valid ARGS chars: "[" "]" variables "," space, tab
> >                                 "[]\\[a-zA-Z0-9_,[:blank:]]*"
> >                                 ;; Optional continue to next line "..." or "... % comment"
> >                                 "\\(?:" matlab--ellipsis-to-eol-re "\\)?"
> >                                 "\\)+")
> > 
> >                         ;; ARGS must be preceded by the assignment operator, "="
> >                         "[[:blank:]]*="
> > 
> >                         "\\)?")
> > 
> >                 ;; Optional space/tabs or '...' continuation
> >                 (concat "\\(?:"
> >                         "[[:blank:]]"
> >                         "\\(?:" matlab--ellipsis-to-eol-re "\\)?"
> >                         "\\)")
> > 
> >                 "[\\.[:space:]\n\r]*"
> >                 "\\([a-zA-Z][a-zA-Z0-9_]+\\)" ;; function NAME
> >                 )
> >          1))
> >   "Regexp to find function names in *.m files for `imenu'.")
> 
> 
> I used the latest commit from the 'default' branch of the repository,
> which is one commit ahead of yours. So maybe the recent changes to
> the package solved this problem. Please try the latest Git.
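
[Editorial aside on the backtracking discussion above.] The ARGS part of the quoted regex puts a starred character class inside a group that is itself repeated with "+". A rough sketch of why that shape can be costly for a backtracking matcher: when the "=" that must follow ARGS is absent, a run of n argument characters can be divided among the group's iterations in 2^(n-1) ways, and in the worst case the matcher may try many of them before giving up. The snippet below uses Python's re module purely as a stand-in for the Emacs regex engine; the helper name chunkings, the pattern NESTED, and the sample strings are hypothetical and much simpler than the real matlab.el expression.

```python
import re


def chunkings(s):
    """Yield every way to cut s into consecutive non-empty chunks."""
    if not s:
        yield []
        return
    for i in range(1, len(s) + 1):
        for rest in chunkings(s[i:]):
            yield [s[:i]] + rest


# Simplified stand-in (an assumption, not the matlab.el regex) for the shape
# "starred character class inside a group repeated with +", followed by "=".
NESTED = re.compile(r"function\b(?:[a-z, ]*(?:\.\.\.[ ]*\n)?)+=")

args = "a, b, c"  # 7 characters, and no "=" follows them
ways = sum(1 for _ in chunkings(args))
print(ways)  # 2 ** (7 - 1) = 64 possible splits of the run

# Without "=", the overall match fails after the alternatives are exhausted.
no_match = NESTED.match("function " + args)
print(no_match)

# With "=", the first attempt succeeds and no extra work is needed.
matched = NESTED.match("function a = f1").group(0)
print(matched)
```

The inputs here are deliberately tiny so the failing case returns immediately; the point is only the growth rate of the number of splits, not a measurement of either engine.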




This bug report was last modified 20 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.