GNU bug report logs - #44210
28.0.50; project.el failed to work after customizing find-program to fd


Package: emacs;

Reported by: Zhiwei Chen <condy0919 <at> gmail.com>

Date: Sun, 25 Oct 2020 11:27:01 UTC

Severity: normal

Found in version 28.0.50

To reply to this bug, email your comments to 44210 AT debbugs.gnu.org.


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#44210; Package emacs. (Sun, 25 Oct 2020 11:27:02 GMT)

Acknowledgement sent to Zhiwei Chen <condy0919 <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 25 Oct 2020 11:27:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Zhiwei Chen <condy0919 <at> gmail.com>
To: bug-gnu-emacs <at> gnu.org
Subject: 28.0.50; project.el failed to work after customizing find-program
 to fd
Date: Sun, 25 Oct 2020 19:26:40 +0800

The arguments passed to `find-program' in the function
`project--files-in-directory' are hard-coded, which prevents customizing
`find-program' to another tool such as fd.

`counsel-file-jump' uses `find-program' and provides
`counsel-file-jump-args', which I think is a better approach.

-- 
Zhiwei Chen




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44210; Package emacs. (Mon, 26 Oct 2020 22:38:02 GMT)

Message #8 received at 44210 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Zhiwei Chen <condy0919 <at> gmail.com>, 44210 <at> debbugs.gnu.org
Subject: Re: bug#44210: 28.0.50; project.el failed to work after customizing
 find-program to fd
Date: Tue, 27 Oct 2020 00:37:10 +0200

Hi!

On 25.10.2020 13:26, Zhiwei Chen wrote:
> The arguments passed to `find-program' in the function
> `project--files-in-directory' are hard-coded, which prevents customizing
> `find-program' to another tool such as fd.

The arguments are not hardcoded (they are constructed dynamically), but
the format is: it is the one that 'find' expects.

'fd' uses a different argument format, both for the "globs to search
for" and for the list of ignores. I wish grep.el had a better mechanism
that let users choose the tool for listing files in a directory (and the
search tool, and so on).

> `counsel-file-jump' uses `find-program' and provides
> `counsel-file-jump-args', which I think is a better approach.

A variable with a flat list of args won't do here, because we actually 
have to turn two other lists (FILES and IGNORES) into appropriate arguments.

What you could do is put full :override advice on
'project--files-in-directory', which would construct a proper command
line for 'fd' based on these args, then call it and pipe the output
through 'project--remote-file-names' (like 'project--files-in-directory'
currently does). Then benchmark both and post the results here.

If the result offers meaningfully better performance while honoring
all ignores, we'll see what we can do to accommodate 'fd'.
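
To illustrate the difference in argument format, here is a minimal shell sketch of how a list of ignores might be turned into an 'fd' command line. The helper name fd_list_args is hypothetical. Mapping each ignore to an --exclude glob is an assumption and may not match the semantics of project.el's IGNORES exactly. The --hidden and --no-ignore flags are added on the assumption that the listing should not be filtered by fd's defaults (which skip hidden files and honor .gitignore).

```shell
# Hypothetical helper: print, one argument per line, an fd invocation that
# lists regular files under DIR and skips every IGNORE glob.
# Usage: fd_list_args DIR [IGNORE...]
fd_list_args() {
  dir=$1
  shift
  # Regular files only, NUL-separated output, no default filtering.
  printf '%s\n' --type f --print0 --hidden --no-ignore
  # Assumption: each ignore maps to one --exclude glob.
  for pattern in "$@"; do
    printf '%s\n' --exclude "$pattern"
  done
  # "." is a regex that matches every file name; DIR is the search root.
  printf '%s\n' . "$dir"
}
```

With 'find', the same ignores would instead become a chain of -path/-prune expressions, which is why a flat list of extra arguments is not enough here.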




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44210; Package emacs. (Sun, 10 Jan 2021 03:32:02 GMT)

Message #11 received at 44210 <at> debbugs.gnu.org:

From: Zhiwei Chen <chenzhiwei03 <at> kuaishou.com>
To: "44210 <at> debbugs.gnu.org" <44210 <at> debbugs.gnu.org>
Cc: "44210 <at> debbugs.gnu.org" <44210 <at> debbugs.gnu.org>,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#44210: 28.0.50; project.el failed to work after customizing
 find-program to fd
Date: Sun, 10 Jan 2021 03:31:11 +0000

Sorry for the late reply; here are the benchmark results.

The result is promising: 'fd' is about 3.4x faster than 'find' (9.40s vs. 2.76s for 5 runs).

(benchmark 5 '(project--files-in-directory "~/Workspace/llvm-project" '(".git")))
;;=> "Elapsed time: 9.401258s (0.097027s in 1 GCs)"

(benchmark 5 '(project--files-in-directory-fd "~/Workspace/llvm-project" '(".git")))
;;=> "Elapsed time: 2.759160s (0.105133s in 1 GCs)"

Here `project--files-in-directory' is the original version in project.el, and `project--files-in-directory-fd' is a copy of it modified to use 'fd'.

The definition of `project--files-in-directory-fd' follows:

(defun project--files-in-directory-fd (dir ignores &optional files)
  (require 'find-dired)
  (require 'xref)
  (defvar find-name-arg)
  (let* ((default-directory dir)
         ;; Make sure ~/ etc. in local directory name is
         ;; expanded and not left for the shell command
         ;; to interpret.
         (localdir (file-local-name (expand-file-name dir)))
         (command (format "%s . %s %s --type f %s --print0"
                          "fd"
                          ;; In case DIR is a symlink.
                          (file-name-as-directory localdir)
                          ""
                          (if files
                              (concat (shell-quote-argument "(")
                                      " " find-name-arg " "
                                      (mapconcat
                                       #'shell-quote-argument
                                       (split-string files)
                                       (concat " -o " find-name-arg " "))
                                      " "
                                      (shell-quote-argument ")"))
                            ""))))
    (message command)
    (project--remote-file-names
     (sort (split-string (shell-command-to-string command) "\0" t)
           #'string<))))

--
Zhiwei Chen


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44210; Package emacs. (Sun, 10 Jan 2021 03:38:01 GMT)

Message #14 received at 44210 <at> debbugs.gnu.org:

From: Zhiwei Chen <chenzhiwei03 <at> kuaishou.com>
To: "44210 <at> debbugs.gnu.org" <44210 <at> debbugs.gnu.org>
Cc: "condy0919 <at> gmail.com" <condy0919 <at> gmail.com>,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#44210: 28.0.50; project.el failed to work after customizing
 find-program to fd
Date: Sun, 10 Jan 2021 03:37:18 +0000

+myself

--
Zhiwei Chen





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44210; Package emacs. (Sun, 10 Jan 2021 17:49:01 GMT)

Message #17 received at 44210 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Zhiwei Chen <chenzhiwei03 <at> kuaishou.com>,
 "44210 <at> debbugs.gnu.org" <44210 <at> debbugs.gnu.org>
Cc: "condy0919 <at> gmail.com" <condy0919 <at> gmail.com>
Subject: Re: bug#44210: 28.0.50; project.el failed to work after customizing
 find-program to fd
Date: Sun, 10 Jan 2021 19:48:32 +0200

Hi!

On 10.01.2021 05:37, Zhiwei Chen wrote:
> (defun project--files-in-directory-fd (dir ignores &optional files)
>    (require 'find-dired)
>    (require 'xref)
>    (defvar find-name-arg)
>    (let* ((default-directory dir)
>           ;; Make sure ~/ etc. in local directory name is
>           ;; expanded and not left for the shell command
>           ;; to interpret.
>           (localdir (file-local-name (expand-file-name dir)))
>           (command (format "%s . %s %s --type f %s --print0"
>                            "fd"
>                            ;; In case DIR is a symlink.
>                            (file-name-as-directory localdir)
>                            ""
>                            (if files
>                                (concat (shell-quote-argument "(")
>                                        " " find-name-arg " "
>                                        (mapconcat
>                                         #'shell-quote-argument
>                                         (split-string files)
>                                         (concat " -o " find-name-arg " "))
>                                        " "
>                                        (shell-quote-argument ")"))
>                              ""))))
>      (message command)
>      (project--remote-file-names
>       (sort (split-string (shell-command-to-string command) "\0" t)
>             #'string<))))

That code doesn't seem to handle the IGNORES argument at all, which
could make the comparison unbalanced, though I don't know whether it
does in this example (with just one ignored dir). You could try passing
no ignores to both of them.

It's weird, though. I have just tried both functions, and there was no
perceptible performance difference (in a different project, though:
gecko-dev).

What are the versions of said programs on your machine? Mine:

$ find --version
find (GNU findutils) 4.7.0

$ fdfind --version
fd 7.4.0




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44210; Package emacs. (Mon, 11 Jan 2021 13:05:02 GMT)

Message #20 received at 44210 <at> debbugs.gnu.org:

From: Zhiwei Chen <chenzhiwei03 <at> kuaishou.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: "44210 <at> debbugs.gnu.org" <44210 <at> debbugs.gnu.org>,
 "condy0919 <at> gmail.com" <condy0919 <at> gmail.com>
Subject: Re: bug#44210: 28.0.50; project.el failed to work after customizing
 find-program to fd
Date: Mon, 11 Jan 2021 13:04:26 +0000

I benchmarked it again on Linux, with find 4.7.0 and fd 8.2.1.

The page cache was cleared before each benchmark.

> sudo sysctl -w vm.drop_caches=3

> cd llvm-project

> sudo sysctl -w vm.drop_caches=3
> time fd > /tmp/fd_output
1.04s user 4.11s system 522% cpu 0.987 total

> sudo sysctl -w vm.drop_caches=3
> time find > /tmp/find_output
0.06s user 0.20s system 7% cpu 3.354 total

Since ‘fd’ is a multi-threaded program, the CPU percent is > 100%.
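
One caveat when reading these timings: with no options, 'fd' skips hidden files and honors .gitignore, while 'find' lists everything, so the two commands may not be doing the same amount of work. Below is a minimal sketch of a sanity check that counts what each tool reports. The helper names count_nul_entries and compare_listings are hypothetical, and the 'fd' part only runs if an executable named fd is on PATH.

```shell
# Count NUL-terminated entries read from standard input.
count_nul_entries() {
  tr -cd '\000' | wc -c | tr -d ' '
}

# Print how many regular files each tool reports under DIR.
# Usage: compare_listings DIR
compare_listings() {
  dir=$1
  printf 'find: %s\n' \
    "$(find "$dir" -type f -print0 | count_nul_entries)"
  if command -v fd >/dev/null 2>&1; then
    # Defaults: hidden files and .gitignore'd files are skipped.
    printf 'fd (defaults): %s\n' \
      "$(fd --type f --print0 . "$dir" | count_nul_entries)"
    # Closer to what find does: no default filtering.
    printf 'fd (--hidden --no-ignore): %s\n' \
      "$(fd --type f --hidden --no-ignore --print0 . "$dir" | count_nul_entries)"
  fi
}
```

If the counts differ, timing the unfiltered 'fd' variant against 'find -type f' would give a more like-for-like comparison.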

--
Zhiwei Chen


Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44210; Package emacs. (Mon, 18 Jan 2021 01:16:01 GMT)

Message #23 received at 44210 <at> debbugs.gnu.org:

From: Zhiwei Chen <condy0919 <at> gmail.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>
Cc: "44210 <at> debbugs.gnu.org" <44210 <at> debbugs.gnu.org>,
 Zhiwei Chen <chenzhiwei03 <at> kuaishou.com>
Subject: Re: bug#44210: 28.0.50; project.el failed to work after customizing
 find-program to fd
Date: Mon, 18 Jan 2021 09:15:03 +0800

I think I replied to the wrong thread, so I am forwarding it again.

> I benchmarked it again on Linux, with find 4.7.0 and fd 8.2.1.
> 
> The page cache was cleared before each benchmark.
> 
> > sudo sysctl -w vm.drop_caches=3
> 
> > cd llvm-project
> 
> > sudo sysctl -w vm.drop_caches=3
> > time fd > /tmp/fd_output
> 1.04s user 4.11s system 522% cpu 0.987 total
> 
> > sudo sysctl -w vm.drop_caches=3
> > time find > /tmp/find_output
> 0.06s user 0.20s system 7% cpu 3.354 total
> 
> Since ‘fd’ is a multi-threaded program, the CPU percent is > 100%.

-- 
Zhiwei Chen




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#44210; Package emacs. (Mon, 18 Jan 2021 03:10:02 GMT)

Message #26 received at 44210 <at> debbugs.gnu.org:

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Zhiwei Chen <condy0919 <at> gmail.com>
Cc: "44210 <at> debbugs.gnu.org" <44210 <at> debbugs.gnu.org>,
 Zhiwei Chen <chenzhiwei03 <at> kuaishou.com>
Subject: Re: bug#44210: 28.0.50; project.el failed to work after customizing
 find-program to fd
Date: Mon, 18 Jan 2021 05:09:12 +0200

Hi!

You didn't address my complaint about the ignored IGNORES argument. I 
was going to explain that, but that email got sidetracked, sorry.

In any case, I can't reproduce your results even with the latest fd.

I don't have an LLVM checkout, though, just some other projects like 
Linux kernel and gecko-dev. And 'find' is consistently 2x as fast here.

In any case, I can believe that fd is going to be faster on some 
systems. To make it an "official" option, someone will need to write a 
version of project--files-in-directory that uses fd but honors the 
IGNORES argument, as well as FILES. Preferably with some tests. Then we 
can make the program used switchable.
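
As a rough illustration of what "switchable" could mean at the command level, here is a minimal shell sketch that picks the first available lister from a preference list. The helper name first_available is hypothetical. The candidate names are taken from this thread (the binary is called fd on some systems and fdfind on others), and falling back to find is an assumption about the desired default.

```shell
# Print the first candidate that resolves to a command, or fail if none does.
# Usage: first_available CANDIDATE...
first_available() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Example: prefer fd under either name, otherwise fall back to find.
# lister=$(first_available fd fdfind find)
```

Whichever program is chosen, its argument builder still has to translate FILES and IGNORES into that program's own syntax.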






This bug report was last modified 4 years and 151 days ago.



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.