GNU bug report logs -
#73046
29.4; Emacs 100% CPU usage for several seconds when opening dired buffer over TRAMP
Reported by: "Suhail Singh" <suhailsingh247 <at> gmail.com>
Date: Thu, 5 Sep 2024 14:56:01 UTC
Severity: normal
Found in version 29.4
Fixed in version 31.1
Done: Michael Albinus <michael.albinus <at> gmx.de>
Bug is archived. No further changes may be made.
Message #47 received at 73046 <at> debbugs.gnu.org (full text, mbox):
Eli Zaretskii <eliz <at> gnu.org> writes:
> So if it takes 4 to 5 sec to move a 20KB file, then how much stuff
> needs to be moved for the Dired listing? What does the below show, if
> you run it on that remote machine?
>
> $ ls -al | wc
Given that the contents of the remote directory can be exactly
replicated (apart from metadata such as mtime etc) by the below code
(that I shared in another response), you should be able to answer the
question trivially:
#+name: perf-fix/tramp/font-lock-on-dired/reproducer
#+begin_src sh
cd /tmp
rm -rf /tmp/test
mkdir /tmp/test
cd /tmp/test
for i in $(seq -w 0 21); do
    mkdir -p src/"$i"
done
mkdir -p links
cd links
for i in $(seq -w 0 21); do
    ln -sf /tmp/test/src/"$i"
done
#+end_src
For the record, the output is as follows:
#+begin_src sh :dir /ssh:${affected-host}:/tmp
cd /tmp/test/links
ls -al | wc
#+end_src
#+RESULTS:
: 25 262 1599
> It needs to show around 40KB to explain 10 sec of delay.
I don't understand your reasoning. In fact, if the output of ls -al were
indeed around 40KB I would have been very surprised. The time taken to
transfer the 20KB file included initial connection costs, whereas
TRAMP would presumably be reusing the same connection but sending
multiple small requests. I don't see how one can be compared to
the other, other than to say (generally) that when the connection is slow
both workflows take longer (which is what we observe).
Note that disabling font-lock reduced the response delay considerably.
That means the delay is not due to transferring the information contained in
`ls --dired', which is fast (relatively speaking), but to
the additional checks that Michael mentioned:
>>> It seems to be related to font-locking, indeed. See variable
>>> `dired-font-lock-keywords'. It specifies face recognition running basic
>>> file operations. For example, ";; Broken Symbolic link" calls
>>> `file-truename' and `file-exists-p', while "Symbolic link to a directory"
>>> and ";; Symbolic link to a non-directory" invoke `file-truename' and
>>> `file-directory-p'.
I did some further investigation; summarizing findings below:
A. On Host-A, the network connection is fairly slow, such that transferring a
20KB file takes ~5s. On Host-B, the network connection is fairly
fast.
B. On Host-A, the time taken to refresh a dired buffer containing 22
subdirectories (/tmp/test/src as in the above code snippet) is 0.70-0.75s
with font-lock enabled, and about the same with font-lock disabled.
These times exclude the time taken to establish the initial
connection over TRAMP.
C. On Host-A, the time taken to refresh a dired buffer containing 22
symlinks (each symlink pointing to a directory, i.e., /tmp/test/links
in the above code snippet) is 0.70-0.75s with font-lock disabled.
With font-lock enabled the time taken is ~13-14s and the CPU is at
100%. These times exclude the time taken to establish the initial
connection over TRAMP.
D. On Host-B, the time taken to display dired buffer for /tmp/test/links
with font-lock enabled is ~2s greater than when font-lock is
disabled. When /tmp/test/links contains 100 symlinks to directories
(instead of 22), the time taken when font-lock is enabled is ~6s
greater than when font-lock is disabled.
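As a rough back-of-the-envelope check, the figures in B-D can be turned into an approximate font-lock overhead per symlink. The sketch below assumes the overhead scales linearly with the number of symlinks (the measurements suggest this but do not prove it), and `per_symlink_overhead` is a hypothetical helper name, not something from TRAMP or dired:

```shell
# Hypothetical helper: (time with font-lock - time without) / number of symlinks.
# Assumes the overhead is spread evenly across the symlinks.
per_symlink_overhead() {
    awk -v with="$1" -v without="$2" -v n="$3" \
        'BEGIN { printf "%.2f\n", (with - without) / n }'
}

# Host-A: ~13s with font-lock vs ~0.75s without, 22 symlinks.
per_symlink_overhead 13 0.75 22    # prints 0.56

# Host-B: only the differences are known (~2s for 22, ~6s for 100),
# so the "without" argument is passed as 0.
per_symlink_overhead 2 0 22        # prints 0.09
per_symlink_overhead 6 0 100       # prints 0.06
```

On these numbers the slow connection pays roughly half a second per symlink, which would be consistent with each symlink triggering a few remote round trips, though the measurements alone do not establish that.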
Given above, I conclude:
1. The issue is present when there are symlinks to directories.
2. The issue is worse when there is a greater number of symlinks to
directories.
3. The issue is worse when the connection is slower. However, it is
still observable when the connection is faster - if you're having
difficulty reproducing, increase the number of symlinks to
directories in /tmp/test/links above.
4. Given that when the connection is slower, not only is the time taken for
font-locking greater, but the CPU is at 100%, I suspect that the
relevant code is doing some kind of busy-waiting.
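To make it easier to vary the number of symlinks, the reproducer can be parameterized. This is only a sketch: `make_symlink_fixture` and its arguments are hypothetical names introduced here, and it assumes a POSIX shell with `seq` available on the remote host.

```shell
# Hypothetical helper (not part of the original reproducer).
# Usage: make_symlink_fixture BASE COUNT
# BASE must be an absolute path; it is deleted and recreated.
make_symlink_fixture() {
    base=$1
    count=$2
    # Refuse to run without an explicit base and a positive count,
    # since BASE is removed first.
    [ -n "$base" ] && [ "$count" -gt 0 ] || return 1
    rm -rf "$base"
    mkdir -p "$base/src" "$base/links"
    # seq -w zero-pads the names so they sort in numeric order.
    for i in $(seq -w 0 $((count - 1))); do
        mkdir -p "$base/src/$i"
        ln -s "$base/src/$i" "$base/links/$i"
    done
}
```

For example, `make_symlink_fixture /tmp/test 100` would recreate the layout used in finding D, with 100 symlinks to directories under /tmp/test/links.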
The above observations seem consistent with Michael's comments above
regarding font-lock checks for "Broken Symbolic link" and "Symbolic link to
a directory". As such, if Michael's proposal below is implemented I
believe it would be an adequate fix to the issue:
>>> I believe it would be helpful to suppress these checks via a user
>>> option. And no, the checks shouldn't be suppressed for remote
>>> directories in general, on a fast connection they are valuable.
I hope that clarifies things, and gives you sufficient information to be
able to reproduce the issue.
--
Suhail
This bug report was last modified 294 days ago.