Package: emacs;
Reported by: Spencer Baugh <sbaugh <at> janestreet.com>
Date: Mon, 18 Aug 2025 18:54:01 UTC
Severity: normal
Tags: patch
Done: Dmitry Gutov <dmitry <at> gutov.dev>
Message #55 received at 79265 <at> debbugs.gnu.org (full text, mbox):
From: Spencer Baugh <sbaugh <at> janestreet.com> To: Stefan Monnier <monnier <at> iro.umontreal.ca> Cc: Dmitry Gutov <dmitry <at> gutov.dev>, 79265 <at> debbugs.gnu.org Subject: Re: bug#79265: [PATCH] Treat point more consistently in PCM completion, [PATCH] Treat point more consistently in PCM completion, [PATCH] Refactor completion-pcm--merge-completions Date: Mon, 01 Sep 2025 11:56:30 -0400
[Message part 1 (text/plain, inline)]
Stefan Monnier <monnier <at> iro.umontreal.ca> writes: >>> Attached is a patch (which applies on master) which does most of the >>> rest of the refactoring that I wanted to do; it makes the dolist over >>> the pattern into a simple mapcan over the pattern, with no state >>> preserved between elements of the pattern. IMO, this is much easier to >>> follow (and I think it would make the various bugs I've fixed in the >>> last couple years in merge-completions, easier to catch). What do you >>> think? > > I still don't understand what this is trying to do. It still seems more > costly since it allocates more strings and conses onto > `pattern-and-comps` than the previous code did with `ccs` (the > "corresponding" variable). My goal at this point is just to make this code simpler and easier to understand, since I keep finding bugs in it. But that certainly should not be done in a way that makes performance worse, and you're right that it may currently do that. > Also, can we just revert the last patch (and add the test(s) that we > found in the mean time) so as to make it easier to review patches? Certainly, the attached patch does that. (I would have just done the revert directly but I don't have the ability to push to the emacs repo) >> --- a/lisp/minibuffer.el >> +++ b/lisp/minibuffer.el >> @@ -4249,7 +4249,11 @@ completion-pcm--pattern->regex >> (mapconcat >> (lambda (x) >> (cond >> - ((stringp x) (regexp-quote x)) >> + ((stringp x) >> + (let ((re (regexp-quote x))) >> + (if (eq group 'group-all) >> + (concat "\\(" re "\\)") >> + re))) >> (t >> (let ((re (if (eq x 'any-delim) >> (concat completion-pcm--delim-wild-regex "*?") > > I think we should aim for a patch which doesn't use that many subgroups. Well - just as a note, this patch uses only one subgroup for each sequence of wildcards, which reduces the number of subgroups for some kinds of pattern. >> @@ -4622,6 +4626,19 @@ completion--common-suffix >> "Return the common suffix of the strings STRS." >> (nreverse (try-completion "" (mapcar #'reverse strs)))) >> >> +(defun completion-pcm--group-pattern (pattern) >> + "Group together adjacent wildcards in PATTERN." >> + (let (ret) >> + (dolist (elem pattern) >> + (cond >> + ((or (stringp elem) (eq elem 'any-delim)) >> + (push elem ret)) >> + ((consp (car-safe ret)) >> + (setf (car ret) (append (car ret) (list elem)))) >> + (t >> + (push (list elem) ret)))) >> + (nreverse ret))) > > Why is `any-delim` not considered a "wildcard"? Because it's matched by a different regex which only matches delimiters. The idea here is to have one regex subgroup for each stretch of wildcards; i.e., for the pattern '("foo" any star point) we have the regex "foo(.*)" instead of "foo(.*)(.*)(.*)". But any-delim needs its own regex subgroup since its matched by a different regex. > Maybe a better grouping would be to turn > > ("a" "c" any "b" point) > > into something like > > (("a" "c" any) ("b" point)) > > or > > (("ac" any) ("b" point)) > > where each sublist will get its own regexp subgroup and will have its > corresponding `try-completion`. That's a great idea, and it matches up with things I was already thinking of doing. I'll try that in the next version of this patch. >> + ;; Then for each of those non-constant elements, extract the >> + ;; commonality between them. > > AFAICT this includes "constant" elements, so either I'm missing > something or the comment is off. > >> @@ -4774,7 +4773,7 @@ completion-pcm--merge-try >> (equal (completion-pcm--pattern->string pattern) (car all))) >> t) >> (t >> - (let* ((mergedpat (completion-pcm--merge-completions all pattern)) >> + (let* ((mergedpat (nreverse (completion-pcm--merge-completions all pattern))) >> ;; `mergedpat' is in reverse order. Place new point (by >> ;; order of preference) either at the old point, or at >> ;; the last place where there's something to choose, or > > Shouldn't that require an update to the subsequent comment? > Or is it that `completion-pcm--merge-completions` now doesn't return the > merged pat in reverse order, so we need an explicit `nreverse` to > recover the previous reserve-order merged pat? Yes, the return value of completion-pcm--merge-completions is no longer in reverse order. The attached patch just reverts my original broken change and adds tests. After landing that, I'll open a new bug if I want to further work on a patch like this.
[0001-Revert-Avoid-duplicating-strings-in-pcm-merge-comple.patch (text/x-patch, inline)]
From 2bbb3e31467449be25286af5f03078c3fbc5ac8b Mon Sep 17 00:00:00 2001 From: Spencer Baugh <sbaugh <at> janestreet.com> Date: Mon, 1 Sep 2025 11:43:25 -0400 Subject: [PATCH] Revert "Avoid duplicating strings in pcm--merge-completions" Revert "Avoid duplicating strings in pcm--merge-completions", commit b511c38bba5354ff21c697e4d27279bf73e4d3cf. It broke existing behavior, now covered by tests adding in this commit. * lisp/minibuffer.el (completion-pcm--merge-completions): * test/lisp/minibuffer-tests.el (completion-pcm-test-anydelim): (completion-pcm-bug4219): --- lisp/minibuffer.el | 39 +++++++++++++++++++++-------------- test/lisp/minibuffer-tests.el | 15 ++++++++++++++ 2 files changed, 39 insertions(+), 15 deletions(-) diff --git a/lisp/minibuffer.el b/lisp/minibuffer.el index 1c3c4bd0681..6617aee2d5a 100644 --- a/lisp/minibuffer.el +++ b/lisp/minibuffer.el @@ -4639,35 +4639,38 @@ completion-pcm--merge-completions (cond ((null (cdr strs)) (list (car strs))) (t - (let ((re (concat - (completion-pcm--pattern->regex pattern 'group) - ;; The implicit trailing `any' is greedy. - "\\([^z-a]*\\)")) + (let ((re (completion-pcm--pattern->regex pattern 'group)) (ccs ())) ;Chopped completions. - ;; First match each string against PATTERN as a regex and extract - ;; the text matched by each wildcard. + ;; First chop each string into the parts corresponding to each + ;; non-constant element of `pattern', using regexp-matching. (let ((case-fold-search completion-ignore-case)) (dolist (str strs) (unless (string-match re str) (error "Internal error: %s doesn't match %s" str re)) (let ((chopped ()) + (last 0) (i 1) next) - (while (setq next (match-string i str)) - (push next chopped) + (while (setq next (match-end i)) + (push (substring str last next) chopped) + (setq last next) (setq i (1+ i))) + ;; Add the text corresponding to the implicit trailing `any'. + (push (substring str last) chopped) (push (nreverse chopped) ccs)))) - ;; Then for each of those wildcards, extract the commonality between them. + ;; Then for each of those non-constant elements, extract the + ;; commonality between them. (let ((res ()) + (fixed "") ;; Accumulate each stretch of wildcards, and process them as a unit. (wildcards ())) ;; Make the implicit trailing `any' explicit. (dolist (elem (append pattern '(any))) (if (stringp elem) (progn - (push elem res) + (setq fixed (concat fixed elem)) (setq wildcards nil)) (let ((comps ())) (push elem wildcards) @@ -4678,13 +4681,18 @@ completion-pcm--merge-completions ;; different capitalizations in different parts. ;; In practice, it doesn't seem to make any difference. (setq ccs (nreverse ccs)) - (let* ((prefix (try-completion "" comps)) - (unique (or (and (eq prefix t) (setq prefix "")) + ;; FIXED is a prefix of all of COMPS. Try to grow that prefix. + (let* ((prefix (try-completion fixed comps)) + (unique (or (and (eq prefix t) (setq prefix fixed)) (and (stringp prefix) ;; If PREFIX is equal to all of COMPS, ;; then PREFIX is a unique completion. (seq-every-p - (lambda (comp) (= (length prefix) (length comp))) + ;; PREFIX is still a prefix of all of + ;; COMPS, so if COMP is the same length, + ;; they're equal. + (lambda (comp) + (= (length prefix) (length comp))) comps))))) ;; If there's only one completion, `elem' is not useful ;; any more: it can only match the empty string. @@ -4699,7 +4707,7 @@ completion-pcm--merge-completions ;; `prefix' only wants to include the fixed part before the ;; wildcard, not the result of growing that fixed part. (when (seq-some (lambda (elem) (eq elem 'prefix)) wildcards) - (setq prefix "")) + (setq prefix fixed)) (push prefix res) ;; Push all the wildcards in this stretch, to preserve `point' and ;; `star' wildcards before ELEM. @@ -4723,7 +4731,8 @@ completion-pcm--merge-completions (unless (equal suffix "") (push suffix res)))) ;; We pushed these wildcards on RES, so we're done with them. - (setq wildcards nil)))))) + (setq wildcards nil)) + (setq fixed ""))))) ;; We return it in reverse order. res))))) diff --git a/test/lisp/minibuffer-tests.el b/test/lisp/minibuffer-tests.el index 99753f31330..de1a98c8189 100644 --- a/test/lisp/minibuffer-tests.el +++ b/test/lisp/minibuffer-tests.el @@ -332,6 +332,21 @@ completion-pcm-test-8 "" '("fooxbar" "fooybar") nil 0) '("foobar" . 3)))) +(ert-deftest completion-pcm-test-anydelim () + ;; After each delimiter is a special wildcard which matches any + ;; sequence of delimiters. + (should (equal (completion-pcm-try-completion + "-x" '("-_.x" "-__x") nil 2) + '("-_x" . 3)))) + +(ert-deftest completion-pcm-bug4219 () + ;; With `completion-ignore-case', try-completion should change the + ;; case of existing text when the completions have different casing. + (should (equal + (let ((completion-ignore-case t)) + (completion-pcm-try-completion "a" '("ABC" "ABD") nil 1)) + '("AB" . 2)))) + (ert-deftest completion-substring-test-1 () ;; One third of a match! (should (equal -- 2.43.7
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.