GNU bug report logs - #64735
29.0.92; find invocations are ~15x slower because of ignores
Message #32 received at 64735 <at> debbugs.gnu.org:
On 20/07/2023 18:42, Ihor Radchenko wrote:
> Dmitry Gutov <dmitry <at> gutov.dev> writes:
>
>>>> ... Last I checked, Lisp-native file
>>>> listing was simply slower than 'find'.
>>>
>>> Could it be changed?
>>> In my tests, I was able to improve performance of the built-in
>>> `directory-files-recursively' simply by disabling
>>> `file-name-handler-alist' around its call.
>>
>> Then it won't work with Tramp, right? I think it's pretty nifty that
>> project-find-regexp and dired-do-find-regexp work over Tramp.
>
> Sure. It might also be optimized. Without trying to convince find devs
> to do something about regexp handling.
>
> And things are not as horrible as 15x slowdown in find.
We haven't compared to the "optimized regexps" solution in find, though.
>>> See https://yhetil.org/emacs-devel/87cz0p2xlc.fsf <at> localhost/
>>> (the thread also continues off-list, and it looks like there is a lot of
>>> room for improvement in this area)
>>
>> Does it get close enough to the performance of 'find' this way?
>
> Comparable:
>
> (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (directory-files-recursively "/home/yantar92/.data" ""))))
> ;; Elapsed time: 0.633713s
> (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (let ((file-name-handler-alist)) (directory-files-recursively "/home/yantar92/.data" "")))))
> ;; Elapsed time: 0.324341s
> ;; time find /home/yantar92/.data >/dev/null
> ;; real 0m0.129s
> ;; user 0m0.017s
> ;; sys 0m0.111s
Still like 2.5x slower, then? That's significant.
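
For anyone who wants to repeat the two Lisp timings quoted above on a different tree, here is a minimal sketch. `my/time-listing' is a hypothetical helper name, not something that exists in Emacs; it only wraps `benchmark-run' and `directory-files-recursively'. Note that binding `file-name-handler-alist' to nil also switches off Tramp, so this is a measurement aid rather than a proposed fix.

```elisp
;; Sketch only: `my/time-listing' is a hypothetical helper, not part of Emacs.
(require 'benchmark)

(defun my/time-listing (dir &optional bypass-handlers)
  "Return the seconds taken to list DIR recursively.
With BYPASS-HANDLERS non-nil, bind `file-name-handler-alist' to nil
for the duration of the call.  This disables Tramp and every other
file name handler, so it only makes sense for local directories."
  (let ((gc-cons-threshold most-positive-fixnum) ; keep GC out of the timing
        (file-name-handler-alist
         (if bypass-handlers nil file-name-handler-alist)))
    ;; `benchmark-run' returns (ELAPSED GC-COUNT GC-ELAPSED).
    (car (benchmark-run 1
           (directory-files-recursively dir "")))))
```

Calling `(my/time-listing "~/some/dir")' and `(my/time-listing "~/some/dir" t)' back to back gives the two numbers to compare against `time find ~/some/dir >/dev/null'; the directory here is a placeholder.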
>> Also note that processing all matches in Lisp, with many ignores
>> entries, will incur the proportional overhead in Lisp. Which might be
>> relatively slow as well.
>
> Not significant.
> I tried to unwrap recursion in `directory-files-recursively' and tried
> to play around with regexp matching of the file list itself - no
> significant impact compared to `file-name-handler-alist'.
I suppose that can make sense, if find's slowdown is due to it issuing
repeated 'stat' calls for every match.
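
To make the "ignores handled in Lisp" variant concrete, here is a rough sketch, assuming the ignores are already available as a list of Emacs regexps. `my/list-files-ignoring' is a hypothetical name, and this is not how project.el does it today (it passes the ignores to 'find'). It lists everything first and filters afterwards with one combined regexp, so the per-file cost in Lisp is a single `string-match-p' call.

```elisp
;; Sketch only: `my/list-files-ignoring' is a hypothetical helper.
(require 'seq)

(defun my/list-files-ignoring (dir ignore-regexps)
  "Return files under DIR whose names match none of IGNORE-REGEXPS.
IGNORE-REGEXPS is a list of Emacs regexps matched against the full
file name.  The regexps are joined into one alternation so each file
is tested with a single match."
  (let ((files (directory-files-recursively dir "")))
    (if (null ignore-regexps)
        files
      (let ((combined (mapconcat (lambda (re) (concat "\\(?:" re "\\)"))
                                 ignore-regexps
                                 "\\|")))
        (seq-remove (lambda (file) (string-match-p combined file))
                    files)))))
```

Wrapping the call in `benchmark-run' with a long ignore list would show how much the filtering step itself costs on a given tree.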
> I am pretty sure that Emacs's native file routines can be optimized to
> the level of find.
I don't know, the GNU tools are often ridiculously optimized. At least
certain file paths.
This bug report was last modified 1 year and 274 days ago.