GNU bug report logs - #64735
29.0.92; find invocations are ~15x slower because of ignores


Package: emacs;

Reported by: Spencer Baugh <sbaugh <at> janestreet.com>

Date: Wed, 19 Jul 2023 21:17:02 UTC

Severity: normal

Found in version 29.0.92



Message #245 received at 64735 <at> debbugs.gnu.org (full text, mbox):

From: sbaugh <at> catern.com
To: Eli Zaretskii <eliz <at> gnu.org>
Cc: Spencer Baugh <sbaugh <at> janestreet.com>, yantar92 <at> posteo.net, rms <at> gnu.org,
 dmitry <at> gutov.dev, michael.albinus <at> gmx.de, 64735 <at> debbugs.gnu.org
Subject: Re: bug#64735: 29.0.92; find invocations are ~15x slower because of
 ignores
Date: Sat, 22 Jul 2023 10:38:37 +0000 (UTC)

Eli Zaretskii <eliz <at> gnu.org> writes:
>> From: Spencer Baugh <sbaugh <at> janestreet.com>
>> Cc: Michael Albinus <michael.albinus <at> gmx.de>,  dmitry <at> gutov.dev,
>>    yantar92 <at> posteo.net,  64735 <at> debbugs.gnu.org, Richard Stallman
>>   <rms <at> gnu.org>
>> Date: Fri, 21 Jul 2023 15:33:13 -0400
>> 
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>> > The first idea that comes to mind is to reimplement
>> > directory-files-recursively in C, modeled on how Find does that.
>> 
>> If someone was thinking of doing that, they would be better off
>> responding to RMS's earlier request for C programmers to optimize this
>> behavior in find.
>
> No, the first step is to use in Emacs what Find does today, because it
> will already be a significant speedup.

Why bother?  directory-files-recursively is a rarely used API, as you
have mentioned before in this thread.

And there is a way to speed it up that gives a performance boost no
other approach can match: use find instead of
directory-files-recursively, and operate on files as find prints
them.  Since this runs the directory traversal in parallel with Emacs,
it has a speed advantage that is impossible to match in
directory-files-recursively.

We can fall back to directory-files-recursively when find is not
available.
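
As a rough illustration of that streaming approach (in Python rather
than Emacs Lisp, and only a sketch): the hypothetical helper iter_files
below starts find as a subprocess and yields each path as soon as find
prints it, and falls back to an in-process walk when no find is on
PATH.  The helper name, the "-type f -print0" arguments, and the
assumption of a POSIX-style find that supports -print0 are
illustrative choices, not anything taken from Emacs.

```python
import os
import shutil
import subprocess
from typing import Iterator


def iter_files(root: str) -> Iterator[str]:
    """Yield regular files under root as soon as each one is known."""
    find = shutil.which("find")
    if find is None:
        # Fallback: traverse in-process, analogous to falling back to
        # directory-files-recursively when find is unavailable.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Mirror "-type f": regular files only, no symlinks.
                if os.path.isfile(path) and not os.path.islink(path):
                    yield path
        return

    # find walks the tree in its own process; its output is consumed
    # here while the traversal is still running.
    proc = subprocess.Popen(
        [find, root, "-type", "f", "-print0"],
        stdout=subprocess.PIPE,
    )
    assert proc.stdout is not None
    pending = b""
    try:
        while True:
            chunk = proc.stdout.read1(65536)
            if not chunk:
                break
            pending += chunk
            # Every complete name ends in NUL; keep the unfinished tail.
            *complete, pending = pending.split(b"\0")
            for raw in complete:
                yield os.fsdecode(raw)
    finally:
        proc.stdout.close()
        proc.wait()
```

A caller can start work on the first path while find is still
descending into later directories, which is the overlap described
above.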

> Optimizing the case of a long
> list of omissions should come later, as it is a minor optimization.

This seems wrong.  directory-files-recursively is rarely used, while
rgrep is a very popular command, and this problem with find makes rgrep
~10x slower by default.  How in any world is that a minor optimization?
Most Emacs users will never realize that they can speed up rgrep
massively by setting grep-find-ignored-files to nil.  Indeed, no one
realized that until I just pointed it out.  In my experience, they just
stop using rgrep in favor of other third-party packages like ripgrep,
because "grep is slow".
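
To make the cost of the ignores concrete, here is a simplified sketch
of how a list of ignore globs turns into find arguments.  It is not the
exact command rgrep generates; find_argv is a hypothetical helper, and
the patterns in the usage note are only examples.  The point is that
each pattern adds one more -name test, and a file that matches none of
them is checked against all of them.

```python
from typing import List, Sequence


def find_argv(root: str, ignored: Sequence[str]) -> List[str]:
    """Build a find argument list that skips files matching any glob."""
    argv = ["find", root, "-type", "f"]
    if ignored:
        # ! ( -name P1 -o -name P2 ... ): one -name test per pattern.
        argv += ["!", "("]
        for index, pattern in enumerate(ignored):
            if index > 0:
                argv.append("-o")
            argv += ["-name", pattern]
        argv.append(")")
    argv.append("-print")
    return argv
```

With an empty list, find_argv(".", []) gives just
["find", ".", "-type", "f", "-print"], while find_argv(".", ["*.o", "*~"])
adds a two-pattern "! ( ... )" group; a longer ignore list grows the
expression by one -name test per entry.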




This bug report was last modified 1 year and 274 days ago.


