GNU bug report logs - #64735
29.0.92; find invocations are ~15x slower because of ignores

Previous Next

Package: emacs;

Reported by: Spencer Baugh <sbaugh <at> janestreet.com>

Date: Wed, 19 Jul 2023 21:17:02 UTC

Severity: normal

Found in version 29.0.92

Full log


Message #32 received at 64735 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dmitry <at> gutov.dev>
To: Ihor Radchenko <yantar92 <at> posteo.net>
Cc: Spencer Baugh <sbaugh <at> janestreet.com>, 64735 <at> debbugs.gnu.org
Subject: Re: bug#64735: 29.0.92; find invocations are ~15x slower because of
 ignores
Date: Thu, 20 Jul 2023 18:57:00 +0300
On 20/07/2023 18:42, Ihor Radchenko wrote:
> Dmitry Gutov <dmitry <at> gutov.dev> writes:
> 
>>>> ... Last I checked, Lisp-native file
>>>> listing was simply slower than 'find'.
>>>
>>> Could it be changed?
>>> In my tests, I was able to improve performance of the built-in
>>> `directory-files-recursively' simply by disabling
>>> `file-name-handler-alist' around its call.
>>
>> Then it won't work with Tramp, right? I think it's pretty nifty that
>> project-find-regexp and dired-do-find-regexp work over Tramp.
> 
> Sure. It might also be optimized. Without trying to convince find devs
> to do something about regexp handling.
> 
> And things are not as horrible as 15x slowdown in find.

We haven't compared to the "optimized regexps" solution in find, though.

>>> See https://yhetil.org/emacs-devel/87cz0p2xlc.fsf <at> localhost/
>>> (the thread also continues off-list, and it looks like there is a lot of
>>> room for improvement in this area)
>>
>> Does it get close enough to the performance of 'find' this way?
> 
> Comparable:
> 
> (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (directory-files-recursively "/home/yantar92/.data" ""))))
> ;; Elapsed time: 0.633713s
> (ignore (let ((gc-cons-threshold most-positive-fixnum)) (benchmark-progn (let ((file-name-handler-alist)) (directory-files-recursively "/home/yantar92/.data" "")))))
> ;; Elapsed time: 0.324341s
> ;; time find /home/yantar92/.data >/dev/null
> ;; real	0m0.129s
> ;; user	0m0.017s
> ;; sys	0m0.111s

Still like 2.5x slower, then? That's significant.

>> Also note that processing all matches in Lisp, with many ignores
>> entries, will incur the proportional overhead in Lisp. Which might be
>> relatively slow as well.
> 
> Not significant.
> I tried to unwrap recursion in `directory-files-recursively' and tried
> to play around with regexp matching of the file list itself - no
> significant impact compared to `file-name-handler-alist'.

I suppose that can make sense, if find's slowdown is due to it issuing 
repeated 'stat' calls for every match.

> I am pretty sure that Emacs's native file routines can be optimized to
> the level of find.

I don't know, the GNU tools are often ridiculously optimized. At least 
certain file paths.




This bug report was last modified 16 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.