GNU bug report logs -
#64735
29.0.92; find invocations are ~15x slower because of ignores
Previous Next
Full log
Message #32 received at 64735 <at> debbugs.gnu.org (full text, mbox):
On 20/07/2023 18:42, Ihor Radchenko wrote:
> Dmitry Gutov <dmitry <at> gutov.dev> writes:
>
>>>> ... Last I checked, Lisp-native file
>>>> listing was simply slower than 'find'.
>>>
>>> Could it be changed?
>>> In my tests, I was able to improve performance of the built-in
>>> `directory-files-recursively' simply by disabling
>>> `file-name-handler-alist' around its call.
>>
>> Then it won't work with Tramp, right? I think it's pretty nifty that
>> project-find-regexp and dired-do-find-regexp work over Tramp.
>
> Sure. It might also be optimized. Without trying to convince find devs
> to do something about regexp handling.
>
> And things are not as horrible as 15x slowdown in find.
We haven't compared to the "optimized regexps" solution in find, though.
>>> See https://yhetil.org/emacs-devel/87cz0p2xlc.fsf <at> localhost/
>>> (the thread also continues off-list, and it looks like there is a lot of
>>> room for improvement in this area)
>>
>> Does it get close enough to the performance of 'find' this way?
>
> Comparable:
>
> (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (directory-files-recursively "/home/yantar92/.data" ""))))
> ;; Elapsed time: 0.633713s
> (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (let ((file-name-handler-alist)) (directory-files-recursively "/home/yantar92/.data" "")))))
> ;; Elapsed time: 0.324341s
> ;; time find /home/yantar92/.data >/dev/null
> ;; real 0m0.129s
> ;; user 0m0.017s
> ;; sys 0m0.111s
Still like 2.5x slower, then? That's significant.
>> Also note that processing all matches in Lisp, with many ignores
>> entries, will incur the proportional overhead in Lisp. Which might be
>> relatively slow as well.
>
> Not significant.
> I tried to unwrap recursion in `directory-files-recursively' and tried
> to play around with regexp matching of the file list itself - no
> significant impact compared to `file-name-handler-alist'.
I suppose that can make sense, if find's slowdown is due to it issuing
repeated 'stat' calls for every match.
> I am pretty sure that Emacs's native file routines can be optimized to
> the level of find.
I don't know, the GNU tools are often ridiculously optimized. At least
certain file paths.
This bug report was last modified 16 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.