GNU bug report logs - #66117
30.0.50; `find-buffer-visiting' is slow when opening large number of buffers
Reported by: Ihor Radchenko <yantar92 <at> posteo.net>
Date: Wed, 20 Sep 2023 08:53:02 UTC
Severity: minor
Found in version 30.0.50
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
Message #53 received at 66117 <at> debbugs.gnu.org:
Eli Zaretskii <eliz <at> gnu.org> writes:
>> I feel that I am still missing where `buffer-file-name' is set when
>> opening file via C-x C-f. Debugger showed something weird in my testing.
>
> With local files, it seems like insert-file-contents sets it. So
> maybe we should record the file name in the cache in bset_filename.
Thanks for the pointer!
AFAIU, the relevant code is
  if (NILP (handler))
    {
      current_buffer->modtime = mtime;
      current_buffer->modtime_size = st.st_size;
      bset_filename (current_buffer, orig_filename);
However, it looks like file handlers are responsible for setting the
filename. So,

- `tramp-handle-insert-file-contents'
- `tramp-archive-handle-insert-file-contents'
- `ange-ftp-insert-file-contents'
- `jka-compr-insert-file-contents'
- `mm-url-insert-file-contents'
- `epa-file-insert-file-contents'

may also need to handle the caching, as would all the third-party
handlers.
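
The concern about handlers can be illustrated with a small Python model
(not Emacs code). It assumes a hypothetical `Buffer' class with one
shared setter, `set_file_name', that updates the cache, and shows that a
writer assigning the field directly leaves the cache stale. Whether any
given Emacs handler behaves like the direct writer is not something this
sketch establishes.

```python
class Buffer:
    """Hypothetical buffer whose file name can be set in two ways."""

    def __init__(self, name, cache):
        self.name = name
        self.file_name = None
        self._cache = cache

    def set_file_name(self, file_name):
        """Shared setter: updates the field and the cache together."""
        if self.file_name is not None:
            self._cache.pop(self.file_name, None)
        self.file_name = file_name
        if file_name is not None:
            self._cache[file_name] = self


cache = {}

# A writer that goes through the shared setter keeps the cache in sync.
local = Buffer("notes.org", cache)
local.set_file_name("/home/user/notes.org")

# A writer that assigns the field directly (as a handler might) does not.
remote = Buffer("remote.org", cache)
remote.file_name = "/ssh:host:/remote.org"
```

After this runs, the cache knows about the first buffer but not the
second, which is exactly the kind of gap a complete cache cannot
tolerate.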
>> Just to make sure that we are on the same page: the cache I am proposing
>> should be complete - if a buffer is missing from the cache, we should be
>> sure that there is no matching buffer.
>
> Since we will keep buffer-list (we must), even with this cache
> available, we can always leave the current code that scans the buffer
> list if the name is not in the cache. This way, we don't need to
> worry to have all the buffers in the cache, only those which are
> looked for frequently and need the efficiency.
I need to elaborate then.
The problem Org faces happens when we open a file that is not yet open
in Emacs. The FILENAME in question is then missing from the buffer list,
and `find-buffer-visiting' must (1) traverse every buffer in
`get-file-buffer'; (2) traverse every buffer again, checking
`buffer-file-truename' values; (3) traverse every buffer yet again,
checking `buffer-file-number'. This is the worst-case scenario for the
current code: no buffer visits the given file name, so all the checks
fail.
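
As a rough illustration of this worst case, here is a minimal Python
model (not Emacs code) of a three-pass lookup. The `Buffer' record,
`find_buffer_visiting_model' and the `examined' counter are hypothetical
names; the real function does more work per buffer (predicate calls,
file attribute checks) that this sketch leaves out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Buffer:
    """Hypothetical stand-in for a buffer's file-related fields."""
    file_name: Optional[str]
    file_truename: Optional[str]
    file_number: Optional[int]


def find_buffer_visiting_model(buffers, name, truename, number):
    """Return (buffer_or_None, buffers_examined) for a three-pass lookup."""
    examined = 0
    # Pass 1: models `get-file-buffer', comparing visited file names.
    for buf in buffers:
        examined += 1
        if buf.file_name == name:
            return buf, examined
    # Pass 2: a second full scan, modelled here as a truename comparison.
    for buf in buffers:
        examined += 1
        if buf.file_name is not None and buf.file_truename == truename:
            return buf, examined
    # Pass 3: a third full scan, comparing file numbers.
    for buf in buffers:
        examined += 1
        if buf.file_name is not None and buf.file_number == number:
            return buf, examined
    return None, examined


buffers = [Buffer(f"/org/{i}.org", f"/org/{i}.org", i) for i in range(100)]

# Best case: the first buffer matches in the first pass.
hit, hit_cost = find_buffer_visiting_model(
    buffers, "/org/0.org", "/org/0.org", 0)

# Worst case: the file is not open, so all three passes run to the end.
miss, miss_cost = find_buffer_visiting_model(
    buffers, "/org/new.org", "/org/new.org", 999)
```

With 100 buffers, the hit examines 1 buffer and the miss examines 300:
a lookup for a file that is not open pays for three full traversals.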
To address the above scenario, it is not enough to cache _some_ buffer
names: a not-yet-open FILENAME will be missing from the cache, and we
will still have to go through the above process, which is slow.
What is needed is a _complete_ cache, so that the fact that FILENAME is
missing there means that no buffer associated with FILENAME is open in
Emacs.
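
To make the distinction concrete, here is a minimal Python sketch (a
model, not a proposed Emacs patch). `BufferRegistry' and its methods are
hypothetical names. It assumes at most one buffer per file name and
indexes only the file name, whereas the discussion here also involves
`buffer-file-truename' and `buffer-file-number'.

```python
class BufferRegistry:
    """Hypothetical registry keeping a file-name -> buffer index."""

    def __init__(self):
        self.buffers = []        # models the buffer list
        self.by_file_name = {}   # the cache; must mirror self.buffers
        self.scans = 0           # counts full traversals of the list

    def visit(self, buffer_name, file_name):
        """Register a buffer in the list and in the cache together."""
        self.buffers.append((buffer_name, file_name))
        self.by_file_name[file_name] = buffer_name

    def kill(self, buffer_name):
        """Remove a buffer from the list and from the cache together."""
        for entry in list(self.buffers):
            if entry[0] == buffer_name:
                self.buffers.remove(entry)
                self.by_file_name.pop(entry[1], None)

    def lookup_complete(self, file_name):
        """Complete cache: a miss is authoritative, so no scan is needed."""
        return self.by_file_name.get(file_name)

    def lookup_partial(self, file_name):
        """Cache as a hint only: a miss falls back to scanning the list."""
        cached = self.by_file_name.get(file_name)
        if cached is not None:
            return cached
        self.scans += 1
        for buffer_name, visited in self.buffers:
            if visited == file_name:
                return buffer_name
        return None


registry = BufferRegistry()
for i in range(500):
    registry.visit(f"{i}.org", f"/org/{i}.org")

complete_result = registry.lookup_complete("/org/new.org")
scans_after_complete = registry.scans

partial_result = registry.lookup_partial("/org/new.org")
scans_after_partial = registry.scans
```

Both lookups report that no buffer visits the file, but only the
fallback variant had to traverse the buffer list to say so. The price of
the complete variant is that every code path that sets or clears a file
name must go through `visit' or `kill'.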
>> `find-buffer-visiting' explicitly checks for `buffer-file-truename'.
>> So, if the cache does not account for `buffer-file-truename', there will
>> be divergence between the existing code and when using the cache.
>>
>> Same argument for `buffer-file-number'
>
> As I said, we could have hash-tables for these as well, if that is
> needed. But I'd like to see the profiles that indicate we do need
> them.
I hope that the above clarified why I want to cache everything.
>> Most of the time was taken by `find-buffer-visiting'. Replacing
>> `find-buffer-visiting' with `get-file-buffer' in certain (not all)
>> places reduced the total runtime by 30%.
>
> So you are saying that 30% of file-visiting buffers are not found by
> get-file-buffer? Or is the 30% increase due to file names for which
> there's no corresponding buffer? If so, does the benchmark indeed
> look for so many buffers that don't exist?
The rough code flow for the profile I attached to the initial message
is as follows. For each of the 500 files used to build the agenda:
(1) check whether the file is open in Emacs via `find-buffer-visiting',
and open it if not; (2) search the file for matching headings to add to
the agenda.
The total CPU time spent building the agenda from a fresh Emacs
decreased by 1/3 (~10 seconds) after replacing calls to
`find-buffer-visiting' with `get-file-buffer'. And this did not replace
every call to `find-buffer-visiting' (in particular,
`find-file-noselect' by itself also calls `find-buffer-visiting'); I
replaced no more than half of the calls. I estimate that over half of
the 30 seconds spent building the agenda went to repeatedly searching
over all the buffers.
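
A back-of-the-envelope count shows why the repeated scans add up. The
Python sketch below assumes, hypothetically, that Emacs starts with no
file-visiting buffers, that each file triggers exactly one failed lookup
before it is opened, and that a failed lookup traverses the whole buffer
list a fixed number of times. It counts buffers examined, not seconds,
and `buffers_examined' is an illustrative name.

```python
def buffers_examined(n_files, passes_per_miss=3, initial_buffers=0):
    """Total buffers examined while opening n_files one after another."""
    total = 0
    open_buffers = initial_buffers
    for _ in range(n_files):
        # The lookup fails, so every pass walks all currently open buffers.
        total += passes_per_miss * open_buffers
        # The file is then opened, adding one buffer to the list.
        open_buffers += 1
    return total


total_three_pass = buffers_examined(500)
total_one_pass = buffers_examined(500, passes_per_miss=1)
print(total_three_pass, total_one_pass)  # prints "374250 124750"
```

The count grows quadratically with the number of files, so doubling the
agenda file list roughly quadruples the lookup work under these
assumptions.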
--
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
This bug report was last modified 1 year and 135 days ago.