GNU bug report logs -
#44210
28.0.50; project.el failed to work after customizing find-program to fd
To reply to this bug, email your comments to 44210 AT debbugs.gnu.org.
Report forwarded to bug-gnu-emacs <at> gnu.org: bug#44210; Package emacs. (Sun, 25 Oct 2020 11:27:02 GMT)
Acknowledgement sent to Zhiwei Chen <condy0919 <at> gmail.com>: New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Sun, 25 Oct 2020 11:27:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
The arguments of `find-program' in the function
`project--files-in-directory' are hard-coded, which prevents customizing
`find-program' to a tool with a different argument syntax.
`counsel-file-jump' uses `find-program' and provides
`counsel-file-jump-args', which I think is a better approach.
--
Zhiwei Chen
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44210; Package emacs. (Mon, 26 Oct 2020 22:38:02 GMT)
Message #8 received at 44210 <at> debbugs.gnu.org (full text, mbox):
Hi!
On 25.10.2020 13:26, Zhiwei Chen wrote:
> The arguments of `find-program' in function
> `project--files-in-directory' is hard coded, which disallows customizing
> `find-program' in some means.
The arguments are not hardcoded (they are constructed dynamically), but
the format is: it is the one expected by 'find'.
'fd' uses a different argument format, both for the "globs to search
for" and for the list of ignores. I wish we had a better mechanism in
grep.el that let users choose more flexibly which tool lists the
files in a dir (and which search tool to use, and so on).
> `counsel-file-jump` uses `find-program' and provides
> `counsel-file-jump-args' which I thought is better.
A variable with a flat list of args won't do here, because we actually
have to turn two other lists (FILES and IGNORES) into appropriate arguments.
What you could do is write a full :override advice on
'project--files-in-directory' which would construct a proper command
line for 'fd' based on these args, then call it and pipe the output
through 'project--remote-file-names' (like
'project--files-in-directory' currently does). Then benchmark both
versions and post the results here.
If the result offers a meaningfully better performance, while honoring
all ignores, we'll see what we can do to accommodate 'fd'.
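The command construction described above could look roughly like the following sketch. It is not from the thread: the function name `my/project--fd-command' is hypothetical, and it assumes that each IGNORES entry is a glob that fd's --exclude option accepts and that using only the first glob in FILES is good enough, since fd takes a single search pattern. Because fd skips hidden and VCS-ignored files by default, the sketch passes --hidden and --no-ignore so that IGNORES is the only filter, as it is with 'find'.

```elisp
;; Sketch only: `my/project--fd-command' is a hypothetical helper,
;; not part of project.el.
(require 'subr-x) ; `string-join' and `string-empty-p' on older Emacs

(defun my/project--fd-command (dir ignores &optional files)
  "Return a shell command string that lists files under DIR with fd.
IGNORES is a list of globs, each passed to fd via --exclude (assumed
to be in a form fd accepts).  FILES, if non-nil, is a space-separated
string of globs; only the first one is used, because fd takes a
single search pattern."
  (let* ((localdir (file-local-name (expand-file-name dir)))
         (excludes (mapconcat
                    (lambda (ignore)
                      (concat "--exclude " (shell-quote-argument ignore)))
                    ignores " "))
         (pattern (if files
                      (concat "--glob "
                              (shell-quote-argument
                               (car (split-string files))))
                    ;; "." is a regexp that matches every file name.
                    ".")))
    (string-join
     (delq nil
           (list "fd" "--type f" "--print0"
                 ;; fd hides dotfiles and VCS-ignored files by default;
                 ;; turn that off so IGNORES alone decides what is skipped.
                 "--hidden" "--no-ignore"
                 (unless (string-empty-p excludes) excludes)
                 pattern
                 (shell-quote-argument (file-name-as-directory localdir))))
     " ")))
```

The resulting string could then be run with `shell-command-to-string', split on "\0", and passed through 'project--remote-file-names', as 'project--files-in-directory' does. On systems where the binary is installed as `fdfind', the program name would need to change.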
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44210; Package emacs. (Sun, 10 Jan 2021 03:32:02 GMT)
Message #11 received at 44210 <at> debbugs.gnu.org (full text, mbox):
Sorry for the late reply; here are the benchmark stats.
The result is promising: ‘fd’ is about 3x faster than ‘find’.
(benchmark 5 '(project--files-in-directory "~/Workspace/llvm-project" '(".git")))
;;=> "Elapsed time: 9.401258s (0.097027s in 1 GCs)"
(benchmark 5 '(project--files-in-directory-fd "~/Workspace/llvm-project" '(".git")))
;;=> "Elapsed time: 2.759160s (0.105133s in 1 GCs)"
Here `project--files-in-directory' is the original version in project.el, and `project--files-in-directory-fd' is a copy of it modified to use ‘fd’.
The definition of `project--files-in-directory-fd' follows:
(defun project--files-in-directory-fd (dir ignores &optional files)
  (require 'find-dired)
  (require 'xref)
  (defvar find-name-arg)
  (let* ((default-directory dir)
         ;; Make sure ~/ etc. in local directory name is
         ;; expanded and not left for the shell command
         ;; to interpret.
         (localdir (file-local-name (expand-file-name dir)))
         (command (format "%s . %s %s --type f %s --print0"
                          "fd"
                          ;; In case DIR is a symlink.
                          (file-name-as-directory localdir)
                          ""
                          (if files
                              (concat (shell-quote-argument "(")
                                      " " find-name-arg " "
                                      (mapconcat
                                       #'shell-quote-argument
                                       (split-string files)
                                       (concat " -o " find-name-arg " "))
                                      " "
                                      (shell-quote-argument ")"))
                            ""))))
    (message command)
    (project--remote-file-names
     (sort (split-string (shell-command-to-string command) "\0" t)
           #'string<))))
--
Zhiwei Chen
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44210; Package emacs. (Sun, 10 Jan 2021 03:38:01 GMT)
Message #14 received at 44210 <at> debbugs.gnu.org (full text, mbox):
+myself
--
Zhiwei Chen
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44210; Package emacs. (Sun, 10 Jan 2021 17:49:01 GMT)
Message #17 received at 44210 <at> debbugs.gnu.org (full text, mbox):
Hi!
On 10.01.2021 05:37, Zhiwei Chen wrote:
> (defun project--files-in-directory-fd (dir ignores &optional files)
> (require 'find-dired)
> (require 'xref)
> (defvar find-name-arg)
> (let* ((default-directory dir)
> ;; Make sure ~/ etc. in local directory name is
> ;; expanded and not left for the shell command
> ;; to interpret.
> (localdir (file-local-name (expand-file-name dir)))
> (command (format "%s . %s %s --type f %s --print0"
> "fd"
> ;; In case DIR is a symlink.
> (file-name-as-directory localdir)
> ""
> (if files
> (concat (shell-quote-argument "(")
> " " find-name-arg " "
> (mapconcat
> #'shell-quote-argument
> (split-string files)
> (concat " -o " find-name-arg " "))
> " "
> (shell-quote-argument ")"))
> ""))))
> (message command)
> (project--remote-file-names
> (sort (split-string (shell-command-to-string command) "\0" t)
> #'string<))))
That code doesn't seem to handle the IGNORES argument at all, which
could lead to an imbalanced comparison, though I don't know whether it
does in this example (with just one ignored dir). But you could try passing
no ignores to both of them.
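One way to guard against an imbalanced comparison is to check that both functions return the same file list before comparing timings. The helper below is a hypothetical sketch (the names `my/list-difference' and `my/compare-listers' are not from project.el); it times each function once with `benchmark-elapse' and reports the files that only one of them returned.

```elisp
;; Sketch only: both helper names are hypothetical.
(require 'benchmark)
(require 'seq)

(defun my/list-difference (a b)
  "Return the elements of list A that are not in list B.
Elements are compared with `equal'; a hash table keeps this fast
for large file lists."
  (let ((seen (make-hash-table :test #'equal)))
    (dolist (x b)
      (puthash x t seen))
    (seq-remove (lambda (x) (gethash x seen)) a)))

(defun my/compare-listers (fn-a fn-b dir ignores)
  "Call FN-A and FN-B with DIR and IGNORES, once each.
Return a plist with the elapsed seconds of each call and the files
that appear in only one of the two results."
  (let* (res-a
         res-b
         (time-a (benchmark-elapse
                   (setq res-a (funcall fn-a dir ignores))))
         (time-b (benchmark-elapse
                   (setq res-b (funcall fn-b dir ignores)))))
    (list :time-a time-a
          :time-b time-b
          :only-in-a (my/list-difference res-a res-b)
          :only-in-b (my/list-difference res-b res-a))))
```

For the two functions in this thread, the call would be something like (my/compare-listers #'project--files-in-directory #'project--files-in-directory-fd "~/Workspace/llvm-project" nil). Non-empty :only-in-a or :only-in-b lists would mean the two timings are not measuring the same work.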
It's weird, though. I have just tried both functions, and there was no
perceptible performance difference (in a different project, though; in
gecko-dev).
What are the versions of said programs on your machine? Mine:
$ find --version
find (GNU findutils) 4.7.0
$ fdfind --version
fd 7.4.0
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44210; Package emacs. (Mon, 11 Jan 2021 13:05:02 GMT)
Message #20 received at 44210 <at> debbugs.gnu.org (full text, mbox):
I benchmarked it again on Linux, with find 4.7.0 and fd 8.2.1.
The page cache was cleared before each benchmark.
> sudo sysctl -w vm.drop_caches=3
> cd llvm-project
> sudo sysctl -w vm.drop_caches=3
> time fd > /tmp/fd_output
1.04s user 4.11s system 522% cpu 0.987 total
> sudo sysctl -w vm.drop_caches=3
> time find > /tmp/find_output
0.06s user 0.20s system 7% cpu 3.354 total
Since ‘fd’ is a multi-threaded program, the CPU percent is > 100%.
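The CPU figure printed by `time' is (user + system) / real, so a value above 100% means the work was spread over several cores. A small sketch of that arithmetic with the numbers above (the function name `my/cpu-percent' is hypothetical):

```elisp
;; Sketch only: `my/cpu-percent' is a hypothetical helper.
(defun my/cpu-percent (user system real)
  "Return CPU utilisation in percent from `time' output.
USER, SYSTEM and REAL are times in seconds."
  (* 100.0 (/ (+ user system) real)))

(my/cpu-percent 1.04 4.11 0.987) ; fd:   about 522
(my/cpu-percent 0.06 0.20 3.354) ; find: about 7.8

;; Wall-clock ratio between the two runs.
(/ 3.354 0.987)                  ; about 3.4
```

So in this run fd finished about 3.4 times sooner in wall-clock terms while using roughly 20 times more CPU time (5.15s against 0.26s).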
--
Zhiwei Chen
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44210; Package emacs. (Mon, 18 Jan 2021 01:16:01 GMT)
Message #23 received at 44210 <at> debbugs.gnu.org (full text, mbox):
I think I replied to the wrong thread, so I am forwarding it again.
> I benchmark it again on linux, where find is of 4.7.0 version and fd is of 8.2.1 version.
>
> Make sure the page cache is cleared before each benchmark.
>
> > sudo sysctl -w vm.drop_caches=3
>
> > cd llvm-project
>
> > sudo sysctl -w vm.drop_caches=3
> > time fd > /tmp/fd_output
> 1.04s user 4.11s system 522% cpu 0.987 total
>
> > sudo sysctl -w vm.drop_caches=3
> > time find > /tmp/find_output
> 0.06s user 0.20s system 7% cpu 3.354 total
>
> Since ‘fd’ is a multi-threaded program, the CPU percent is > 100%.
--
Zhiwei Chen
Information forwarded to bug-gnu-emacs <at> gnu.org: bug#44210; Package emacs. (Mon, 18 Jan 2021 03:10:02 GMT)
Message #26 received at 44210 <at> debbugs.gnu.org (full text, mbox):
Hi!
You didn't address my complaint about the ignored IGNORES argument. I
was going to explain that, but that email got sidetracked, sorry.
In any case, I can't reproduce your results even with the latest fd.
I don't have an LLVM checkout, though, just some other projects like
the Linux kernel and gecko-dev. And 'find' is consistently 2x as fast here.
In any case, I can believe that fd is going to be faster on some
systems. To make it an "official" option, someone will need to write a
version of project--files-in-directory that uses fd but honors the
IGNORES argument, as well as FILES. Preferably with some tests. Then we
can make the program used switchable.
On 18.01.2021 03:15, Zhiwei Chen wrote:
>
> I think I replied to the wrong thread, so forwarded it again.
>
>> I benchmark it again on linux, where find is of 4.7.0 version and fd is of 8.2.1 version.
>>
>> Make sure the page cache is cleared before each benchmark.
>>
>>> sudo sysctl -w vm.drop_caches=3
>>
>>> cd llvm-project
>>
>>> sudo sysctl -w vm.drop_caches=3
>>> time fd > /tmp/fd_output
>> 1.04s user 4.11s system 522% cpu 0.987 total
>>
>>> sudo sysctl -w vm.drop_caches=3
>>> time find > /tmp/find_output
>> 0.06s user 0.20s system 7% cpu 3.354 total
>>
>> Since ‘fd’ is a multi-threaded program, the CPU percent is > 100%.
>
This bug report was last modified 4 years and 151 days ago.