Package: emacs;
Reported by: Spencer Baugh <sbaugh <at> janestreet.com>
Date: Wed, 13 Mar 2024 15:03:01 UTC
Severity: normal
Tags: patch
Done: Eli Zaretskii <eliz <at> gnu.org>
Bug is archived. No further changes may be made.
View this message in rfc822 format
From: sbaugh <at> catern.com To: Eli Zaretskii <eliz <at> gnu.org> Cc: Spencer Baugh <sbaugh <at> janestreet.com>, 69775 <at> debbugs.gnu.org Subject: bug#69775: [PATCH] Use regexp-opt in dired-omit-regexp Date: Sat, 16 Mar 2024 17:15:52 +0000 (UTC)
[Message part 1 (text/plain, inline)]
Eli Zaretskii <eliz <at> gnu.org> writes: >> From: Spencer Baugh <sbaugh <at> janestreet.com> >> Date: Wed, 13 Mar 2024 11:01:05 -0400 >> >> In my benchmarking, for large dired buffers, using regexp-opt provides >> around a 3x speedup in omitting. > > Can you show a recipe for such benchmarking? I'd like to try that on > my systems. > > Also, what is the slowdown in the (improbable, but possible) case > where dired-omit-extensions change for each call of dired-omit-regexp? Yes, run the following after applying the patch: (require 'dired) (require 'dired-x) (require 'cl-lib) (defun dired-omit-regexp-old () (concat (if dired-omit-files (concat "\\(" dired-omit-files "\\)") "") (if (and dired-omit-files dired-omit-extensions) "\\|" "") (if dired-omit-extensions (concat ".";; a non-extension part should exist "\\(" (mapconcat 'regexp-quote dired-omit-extensions "\\|") "\\)$") ""))) (defun my-do-omit (mode) (let ((regexp (cl-case mode (new (dired-omit-regexp)) (old (dired-omit-regexp-old)) (new-uncached (let ((dired-omit--extension-regexp-cache nil)) (dired-omit-regexp))) (t (error "Bad mode %s" mode))))) (dired-mark-if (let ((fn (dired-get-filename nil t))) (and fn (string-match-p regexp fn))) nil))) (defun my-bench-omit (nfiles ntimes) (let ((default-directory (expand-file-name "test-dired-list"))) (make-directory default-directory t) (dolist (file (directory-files "." t "test-file")) (delete-file file)) (dotimes (i nfiles) (write-region "" nil (format "test-file%s" i) nil 'nomessage nil 'excl)) (let ((dired-omit-mode nil)) (with-current-buffer (let ((inhibit-message t)) (dired-noselect default-directory)) (revert-buffer) (message "files %s, ntimes %s: new %s old %s new-uncached %s" nfiles ntimes (car (benchmark-call (lambda () (my-do-omit 'new)) ntimes)) (car (benchmark-call (lambda () (my-do-omit 'old)) ntimes)) (car (benchmark-call (lambda () (my-do-omit 'new-uncached)) ntimes))) )))) (my-bench-omit 1 100) (my-bench-omit 10 100) (my-bench-omit 100 100) (my-bench-omit 1000 100) (my-bench-omit 10000 100) For me, I get: $ ./src/emacs -Q --batch -l ../emacs-29/bench-omit.elc files 1, ntimes 100: new 0.008839979999999999 old 0.018162129 new-uncached 0.031399762 files 10, ntimes 100: new 0.012037615 old 0.040232355000000004 new-uncached 0.037990543 files 100, ntimes 100: new 0.07368538100000001 old 0.314905271 new-uncached 0.10006527300000001 files 1000, ntimes 100: new 0.669103498 old 3.076339984 new-uncached 0.693134644 files 10000, ntimes 100: new 6.336211434 old 30.926320486 new-uncached 6.442762152999999 So the performance improvement is quite substantial for large directories. new-uncached is the performance if dired-omit-extensions changes on each call of dired-omit-regexp. For a directory of 1 file, the overhead of recomputing regexp-opt every time makes the performance perhaps 2x-3x worse, but around 10 files the performance improvement from regexp-opt exceeds the overhead, and above that the uncached version still outperforms the old version substantially. If dired-omit-extensions doesn't change every time, the performance is improved even for directories of 1 file. >> regexp-opt takes around 5 milliseconds, so to avoid slowing down >> omitting in small dired buffers we cache the return value. >> >> Since omitting is now 3x faster, increase dired-omit-size-limit by 3x. >> >> * lisp/dired-x.el (dired-omit--extension-regexp-cache): Add. >> (dired-omit-regexp): Use regexp-opt. >> (dired-omit-size-limit): Increase, since omitting is now faster. > > I'm okay with these changes, but: > > . the change in the default value of dired-omit-size-limit should be > called out in NEWS > . please document this variable in the dired-x.texi manual, where we > document all the other variables relevant to dired-omit mode. > . the doc string of dired-omit-size-limit is embarrassingly > unhelpful, so bonus points for fixing that as well > > Thanks. Certainly, updated patch attached.
[0001-Use-regexp-opt-in-dired-omit-regexp.patch (text/x-patch, inline)]
From ca0ff9d40b85b281e25b998528c1e1a71b9e70c5 Mon Sep 17 00:00:00 2001 From: Spencer Baugh <sbaugh <at> catern.com> Date: Sat, 16 Mar 2024 17:11:24 +0000 Subject: [PATCH] Use regexp-opt in dired-omit-regexp In my benchmarking, for large dired buffers, using regexp-opt provides around a 3x speedup in omitting. regexp-opt takes around 5 milliseconds, so to avoid slowing down omitting in small dired buffers we cache the return value. Since omitting is now 3x faster, increase dired-omit-size-limit by 3x. Also, document dired-omit-size-limit better. * doc/misc/dired-x.texi (Omitting Variables): Document dired-omit-size-limit. * etc/NEWS: Announce increase of dired-omit-size-limit. * lisp/dired-x.el (dired-omit--extension-regexp-cache): Add. (dired-omit-regexp): Use regexp-opt. (bug#69775) (dired-omit-size-limit): Increase and improve docs. --- doc/misc/dired-x.texi | 8 ++++++++ etc/NEWS | 6 ++++++ lisp/dired-x.el | 26 ++++++++++++++++++++------ 3 files changed, 34 insertions(+), 6 deletions(-) diff --git a/doc/misc/dired-x.texi b/doc/misc/dired-x.texi index 4cad016a0f6..66045c5f759 100644 --- a/doc/misc/dired-x.texi +++ b/doc/misc/dired-x.texi @@ -346,6 +346,14 @@ Omitting Variables match the file name relative to the buffer's top-level directory. @end defvar +@defvar dired-omit-size-limit +If non-@code{nil}, omitting will be skipped if the directory listing +exceeds this size in bytes. Since omitting can be slow for very large +directories, this avoids having to wait before seeing the directory. +This variable is ignored when @code{dired-omit-mode} is called +interactively, such as by @code{C-x M-o}, so you can still enable +omitting in the directory after the initial display. + @cindex omitting additional files @defvar dired-omit-marker-char Temporary marker used by Dired to implement omitting. Should never be used diff --git a/etc/NEWS b/etc/NEWS index b4a1c887f2e..fdface7aa0c 100644 --- a/etc/NEWS +++ b/etc/NEWS @@ -668,6 +668,12 @@ marked or clicked on files according to the OS conventions. For example, on systems supporting XDG, this runs 'xdg-open' on the files. +*** The default value of 'dired-omit-size-limit' has increased. +After performance improvements to omitting in large directories, the new +default value is 300k, up from 100k. This means 'dired-omit-mode' will +omit files in directories whose directory listing is up to 300 kilobytes +in size. + +++ *** 'dired-listing-switches' handles connection-local values if exist. This allows to customize different switches for different remote machines. diff --git a/lisp/dired-x.el b/lisp/dired-x.el index 62fdd916e69..d7d15028489 100644 --- a/lisp/dired-x.el +++ b/lisp/dired-x.el @@ -77,12 +77,17 @@ dired-vm-read-only-folders (other :tag "non-writable only" if-file-read-only)) :group 'dired-x) -(defcustom dired-omit-size-limit 100000 - "Maximum size for the \"omitting\" feature. +(defcustom dired-omit-size-limit 300000 + "Maximum buffer size for `dired-omit-mode'. + +Omitting will be skipped if the directory listing exceeds this size in +bytes. This variable is ignored when `dired-omit-mode' is called +interactively. + If nil, there is no maximum size." :type '(choice (const :tag "no maximum" nil) integer) :group 'dired-x - :version "29.1") + :version "30.1") (defcustom dired-omit-case-fold 'filesystem "Determine whether \"omitting\" patterns are case-sensitive. @@ -506,14 +511,23 @@ dired-omit-expunge (re-search-forward dired-re-mark nil t)))) count))) +(defvar dired-omit--extension-regexp-cache + nil + "A cache of `regexp-opt' applied to `dired-omit-extensions'. + +This is a cons whose car is a list of strings and whose cdr is a +regexp produced by `regexp-opt'.") + (defun dired-omit-regexp () + (unless (equal dired-omit-extensions (car dired-omit--extension-regexp-cache)) + (setq dired-omit--extension-regexp-cache + (cons dired-omit-extensions (regexp-opt dired-omit-extensions)))) (concat (if dired-omit-files (concat "\\(" dired-omit-files "\\)") "") (if (and dired-omit-files dired-omit-extensions) "\\|" "") (if dired-omit-extensions (concat ".";; a non-extension part should exist - "\\(" - (mapconcat 'regexp-quote dired-omit-extensions "\\|") - "\\)$") + (cdr dired-omit--extension-regexp-cache) + "$") ""))) ;; Returns t if any work was done, nil otherwise. -- 2.44.0
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.