GNU bug report logs - #71179
[PATCH] In rgrep, check matching files before excluding files

Previous Next

Package: emacs;

Reported by: Spencer Baugh <sbaugh <at> janestreet.com>

Date: Fri, 24 May 2024 20:15:02 UTC

Severity: normal

Tags: patch

Fixed in version 30.1

Done: Stefan Kangas <stefankangas <at> gmail.com>

Bug is archived. No further changes may be made.

Full log


Message #65 received at 71179 <at> debbugs.gnu.org (full text, mbox):

From: Spencer Baugh <sbaugh <at> janestreet.com>
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Dmitry Gutov <dmitry <at> gutov.dev>, 71179 <at> debbugs.gnu.org
Subject: Re: bug#71179: [PATCH] In rgrep, check matching files before
 excluding files
Date: Sun, 26 May 2024 09:42:10 -0400
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes:

>> Date: Sat, 25 May 2024 17:47:49 +0300
>> Cc: 71179 <at> debbugs.gnu.org
>> From: Dmitry Gutov <dmitry <at> gutov.dev>
>> 
>> On 25/05/2024 17:13, Eli Zaretskii wrote:
>> > Anyway, I suggested an approach that should leave everyone happy.  Why
>> > are we still arguing?
>> 
>> New option has the usual problems:
>> 
>> * What to call it.
>> * General proliferation of new options makes it harder to find each one.
>> * Several days of arguing whether we can make the new behavior the 
>> default one.
>
> I already said that I'm okay with making the new behavior the default,
> provided that we document the way of getting back old behavior.
>
> The new option can be a defvar, from where I stand, if that makes
> things easier.  Just don't make it an internal variable.

OK, here's the patch, now with an option.

[0001-In-rgrep-check-matching-files-before-excluding-files.patch (text/x-patch, inline)]
From 231813e5825e1f3a63c24ad6063eb73de7b2b361 Mon Sep 17 00:00:00 2001
From: Spencer Baugh <sbaugh <at> janestreet.com>
Date: Sun, 26 May 2024 09:26:09 -0400
Subject: [PATCH] In rgrep, check matching files before excluding files

There are a lot of excluding globs, and checking them all is expensive.
The files glob (i.e. the glob for files we actually want) is usually
just one or two entries, so it's quite fast to check.

If find checks the files glob first and then the excluding glob, it has
to do much less checking (since the files glob will substantially narrow
down the set of files on its own), and find performance is much better.

In my benchmarking, this takes (rgrep "foo" "*.el" "~/src/emacs/trunk/")
from ~410ms to ~130ms.

Further optimizations are possible now that the ignores and matched
files are in the same <F> argument which can be rearranged more easily
without compatibility issues; I'll do those optimizations in later
commits.

* lisp/progmodes/grep.el (rgrep-find-ignores-in-<f>): Add.
* lisp/progmodes/grep.el (rgrep-default-command): Check
rgrep-find-ignores-in-<f> and move the excluded files glob to part of
the "files" argument.
---
 lisp/progmodes/grep.el | 97 +++++++++++++++++++++++++-----------------
 1 file changed, 58 insertions(+), 39 deletions(-)

diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index ce54c57aabc..c0b48fb17e0 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -214,6 +214,21 @@ grep-find-template
   :set #'grep-apply-setting
   :version "22.1")
 
+(defvar rgrep-find-ignores-in-<f> t
+  "If nil, when `rgrep' expands `grep-find-template', file ignores go in <X>.
+
+By default, the <X> placeholder contains find options for affecting the
+directory list, and the <F> placeholder contains the find options which
+affect which files are matched, both `grep-find-ignored-files' and the
+FILES argument to `rgrep'.
+
+This separation allows the two sources of file matching in <F> to be
+optimized together into a set of options which are overall faster for
+\"find\" to evaluate.
+
+If nil, <X> contains ignores both for directories and files, and <F>
+contains only the FILES argument.  This is the old behavior.")
+
 (defvar grep-quoting-style nil
   "Whether to use POSIX-like shell argument quoting.")
 
@@ -1372,45 +1387,49 @@ rgrep-find-ignored-directories
 
 (defun rgrep-default-command (regexp files dir)
   "Compute the command for \\[rgrep] to use by default."
-  (require 'find-dired)      ; for `find-name-arg'
-  (grep-expand-template
-   grep-find-template
-   regexp
-   (concat (shell-quote-argument "(" grep-quoting-style)
-           " " find-name-arg " "
-           (mapconcat
-            (lambda (x) (shell-quote-argument x grep-quoting-style))
-            (split-string files)
-            (concat " -o " find-name-arg " "))
-           " "
-           (shell-quote-argument ")" grep-quoting-style))
-   dir
-   (concat
-    (when-let ((ignored-dirs (rgrep-find-ignored-directories dir)))
-      (concat "-type d "
-              (shell-quote-argument "(" grep-quoting-style)
-              ;; we should use shell-quote-argument here
-              " -path "
-              (mapconcat
-               (lambda (d)
-                 (shell-quote-argument (concat "*/" d) grep-quoting-style))
-               ignored-dirs
-               " -o -path ")
-              " "
-              (shell-quote-argument ")" grep-quoting-style)
-              " -prune -o "))
-    (when-let ((ignored-files (grep-find-ignored-files dir)))
-      (concat (shell-quote-argument "!" grep-quoting-style) " -type d "
-              (shell-quote-argument "(" grep-quoting-style)
-              ;; we should use shell-quote-argument here
-              " -name "
-              (mapconcat
-               (lambda (ignore) (shell-quote-argument ignore grep-quoting-style))
-               ignored-files
-               " -o -name ")
-              " "
-              (shell-quote-argument ")" grep-quoting-style)
-              " -prune -o ")))))
+  (require 'find-dired)                 ; for `find-name-arg'
+  (let ((ignored-files-arg
+         (when-let ((ignored-files (grep-find-ignored-files dir)))
+           (concat (shell-quote-argument "(" grep-quoting-style)
+                   ;; we should use shell-quote-argument here
+                   " -name "
+                   (mapconcat
+                    (lambda (ignore) (shell-quote-argument ignore grep-quoting-style))
+                    ignored-files
+                    " -o -name ")
+                   " " (shell-quote-argument ")" grep-quoting-style)))))
+    (grep-expand-template
+     grep-find-template
+     regexp
+     (concat (shell-quote-argument "(" grep-quoting-style)
+             " " find-name-arg " "
+             (mapconcat
+              (lambda (x) (shell-quote-argument x grep-quoting-style))
+              (split-string files)
+              (concat " -o " find-name-arg " "))
+             " "
+             (shell-quote-argument ")" grep-quoting-style)
+             (when (and rgrep-find-ignores-in-<f> ignored-files-arg)
+               (concat " " (shell-quote-argument "!" grep-quoting-style) " " ignored-files-arg)))
+     dir
+     (concat
+      (when-let ((ignored-dirs (rgrep-find-ignored-directories dir)))
+        (concat "-type d "
+                (shell-quote-argument "(" grep-quoting-style)
+                ;; we should use shell-quote-argument here
+                " -path "
+                (mapconcat
+                 (lambda (d)
+                   (shell-quote-argument (concat "*/" d) grep-quoting-style))
+                 ignored-dirs
+                 " -o -path ")
+                " "
+                (shell-quote-argument ")" grep-quoting-style)
+                " -prune -o "))
+      (when (and (not rgrep-find-ignores-in-<f>) ignored-files-arg)
+        (concat (shell-quote-argument "!" grep-quoting-style) " -type d "
+                ignored-files-arg
+                " -prune -o "))))))
 
 (defun grep-find-toggle-abbreviation ()
   "Toggle showing the hidden part of rgrep/lgrep/zrgrep command line."
-- 
2.39.3


This bug report was last modified 1 year and 40 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.