GNU bug report logs - #64735
29.0.92; find invocations are ~15x slower because of ignores

Previous Next

Package: emacs;

Reported by: Spencer Baugh <sbaugh <at> janestreet.com>

Date: Wed, 19 Jul 2023 21:17:02 UTC

Severity: normal

Found in version 29.0.92

Full log


Message #32 received at 64735 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: Spencer Baugh <sbaugh <at> janestreet.com>, 64735 <at> debbugs.gnu.org
Subject: Re: bug#64735: 29.0.92; find invocations are ~15x slower because of
 ignores
Date: Thu, 20 Jul 2023 18:57:00 +0300
On 20/07/2023 18:42, Ihor Radchenko wrote:
> Dmitry Gutov <dmitry <at> gutov.dev> writes:
> 
>>>> ... Last I checked, Lisp-native file
>>>> listing was simply slower than 'find'.
>>>
>>> Could it be changed?
>>> In my tests, I was able to improve performance of the built-in
>>> `directory-files-recursively' simply by disabling
>>> `file-name-handler-alist' around its call.
>>
>> Then it won't work with Tramp, right? I think it's pretty nifty that
>> project-find-regexp and dired-do-find-regexp work over Tramp.
> 
> Sure. It might also be optimized. Without trying to convince find devs
> to do something about regexp handling.
> 
> And things are not as horrible as 15x slowdown in find.

We haven't compared to the "optimized regexps" solution in find, though.

>>> See https://yhetil.org/emacs-devel/87cz0p2xlc.fsf <at> localhost/
>>> (the thread also continues off-list, and it looks like there is a lot of
>>> room for improvement in this area)
>>
>> Does it get close enough to the performance of 'find' this way?
> 
> Comparable:
> 
> (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (directory-files-recursively "/home/yantar92/.data" ""))))
> ;; Elapsed time: 0.633713s
> (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (let ((file-name-handler-alist)) (directory-files-recursively "/home/yantar92/.data" "")))))
> ;; Elapsed time: 0.324341s
> ;; time find /home/yantar92/.data >/dev/null
> ;; real	0m0.129s
> ;; user	0m0.017s
> ;; sys	0m0.111s

Still like 2.5x slower, then? That's significant.

>> Also note that processing all matches in Lisp, with many ignores
>> entries, will incur the proportional overhead in Lisp. Which might be
>> relatively slow as well.
> 
> Not significant.
> I tried to unwrap recursion in `directory-files-recursively' and tried
> to play around with regexp matching of the file list itself - no
> significant impact compared to `file-name-handler-alist'.

I suppose that can make sense, if find's slowdown is due to it issuing 
repeated 'stat' calls for every match.

> I am pretty sure that Emacs's native file routines can be optimized to
> the level of find.

I don't know, the GNU tools are often ridiculously optimized. At least 
certain file paths.




This bug report was last modified 1 year and 274 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.