#63861 - [PATCH] pp.el: New "pretty printing" code

GNU bug report logs - #63861
[PATCH] pp.el: New "pretty printing" code

Package: emacs;

Reported by: Stefan Monnier <monnier <at> iro.umontreal.ca>

Date: Fri, 2 Jun 2023 22:52:02 UTC

Severity: normal

Tags: patch

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.

Message #77 received at 63861 <at> debbugs.gnu.org (full text, mbox):

From: Stefan Monnier <monnier <at> iro.umontreal.ca> To: Thierry Volpiatto <thievol <at> posteo.net> Cc: 63861 <at> debbugs.gnu.org Subject: Re: bug#63861: [PATCH] pp.el: New "pretty printing" code Date: Thu, 08 Jun 2023 18:35:05 -0400

>> I find these results (mine) quite odd: they suggest that my `pp-region` >> is *faster* than the old `pp-buffer` for `load-history` and `bookmark-alist` >> data, which I definitely did not expect (and don't know how to explain >> either). I've just redone my tests a bit differently, added `pp-emacs-lisp-code`, and also introduced a var to control whether to activate the `lisp-ppss` patch or not. I also fixed my `foo.el` file: its content was accidentally already pretty printed rather than being on a single line, which totally changes the behavior of `pp-region` and `pp-buffer`). For reference: % (cd ~/tmp; l foo.el test*.el) -rw------- 1 monnier monnier 1125154 8 jun 11:20 test-load-history.el -rw------- 1 monnier monnier 163258 8 jun 11:20 test-bookmark-alist.el -rw-r--r-- 1 monnier monnier 77101 8 jun 17:20 foo.el % Here's the code I used to run the test: for f in ~/tmp/foo.el ~/tmp/test-bookmark-alist.el ~/tmp/test-load-history.el; do for ppss in nil t; do for v in '(pp-buffer)' '(pp-region (point-min) (point-max))' '(tv/pp-region (point-min) (point-max))' '(let ((s (read (current-buffer)))) (erase-buffer) (pp-emacs-lisp-code s))'; do src/emacs -Q --batch -l ~/tmp/describe-variable --eval "(with-temp-buffer (emacs-lisp-mode) (insert-file-contents \"$f\") (setq pp-buffer-use-pp-region nil lisp--faster-ppss $ppss) (message \"%S %S %S %S\" (file-name-nondirectory \"$f\") (benchmark-run $v) '$v '$ppss))"&; done; done; done So, by file, from fastest to slowest: foo.el (0.859482743 0 0.0) (pp-buffer) t foo.el (0.890402623 0 0.0) (pp-buffer) nil foo.el (4.62344853 4 1.7225397670000002) (tv/pp-region (point-min) (point-max)) t foo.el (4.687414465 4 1.7116580980000002) (tv/pp-region (point-min) (point-max)) nil foo.el (7.932661181 1 0.3435169600000001) (pp-region (point-min) (point-max)) t foo.el (196.183345212 1 0.618591124) (pp-region (point-min) (point-max)) nil foo.el (2997.739238575 505 105.82851685700001) (let ((s (read (current-buffer)))) (erase-buffer) (pp-emacs-lisp-code s)) t Here we see that my `pp-region` code is slower than `pp-buffer` by a factor ~10x: I'm not very happy about it, but this `foo.el` file was selected because it was the worst case I had come across (tho that was before I found the `lisp-ppss` patch). The last element in each line is whether we activated the `lisp-ppss` patch. As we can see here, the `lisp-ppss` patch makes an enormous difference (~24x) for my code, but not for `pp-buffer` (because it relies on `lisp-indent-region` rather than `lisp-indent-line`) and not for `tv/pp-region` either (because it doesn't indent at all). We also see that `pp-emacs-lisp-code` is *much* slower. I don't include other results for this function in this email because they're still running :-) test-bookmark-alist.el (13.237991019999999 6 2.403892035) (tv/pp-region (point-min) (point-max)) nil test-bookmark-alist.el (14.853056353 6 2.705511935) (tv/pp-region (point-min) (point-max)) t test-bookmark-alist.el (28.059658418 5 2.005039257) (pp-region (point-min) (point-max)) t test-bookmark-alist.el (180.390204026 5 2.1182066760000002) (pp-region (point-min) (point-max)) nil test-bookmark-alist.el (265.95429676599997 10 4.445954908) (pp-buffer) t test-bookmark-alist.el (268.975666886 10 3.6774180120000004) (pp-buffer) nil Here we see that my `pp-region` code can be faster (even significantly so) than `pp-buffer`. I'm not sure why, but I'll take the win :-) We also see that the faster `lisp-ppss` again makes an important difference in the performance of `pp-region` (~8x), tho the effect is not as drastic as in the case of `foo.el`. test-load-history.el (235.134082197 8 4.440112806999999) (tv/pp-region (point-min) (point-max)) nil test-load-history.el (235.873981253 8 4.416064476) (tv/pp-region (point-min) (point-max)) t test-load-history.el (506.770548196 31 9.706665932) (pp-region (point-min) (point-max)) t test-load-history.el (701.991875274 43 12.390212449) (pp-buffer) t test-load-history.el (710.843618646 43 12.156289354) (pp-buffer) nil test-load-history.el (1419.268184021 36 9.260999640000001) (pp-region (point-min) (point-max)) nil Here again, we see that `pp-region` is competitive with `pp-buffer` and the `lisp-ppss` patch speeds it up significantly (~3x). Another thing we see in those tests is that `pp-region` (with the `lisp-ppss` patch) is ~2x slower than `tv/pp-region`, whereas the performance differential with `pp-buffer` varies a lot more. Also if we compare the time it takes to the size of the file, we see: 77101B / 7.932661181s = 9719 B/s 163258B / 28.059658418s = 5818 B/s 1125154B / 506.770548196s = 2220 B/s `pp-region`s performance is not quite linear in the size of the file :-( Interestingly, the same holds for `tv/pp-region`: 77101B / 4.62344853s = 16676 B/s 163258B / 13.237991019s = 12332 B/s 1125154B / 235.134082197s = 4785 B/s even though it works in a fundamentally very different way (which, to my naive eye should result in a linear performance), so maybe the slowdown here is due to something external (such as the fact that various operations on buffers get slower as the buffer gets bigger). > hmm, don't know, I ran pp-buffer with M-: from the test-load-history.el or the > test-bookmark-alist.el buffer. May be using emacs --batch makes a > difference? I don't see any significant performance difference between batch and non-batch :-( > is the data really printed in such case? Yes, definitely. > More or less the code using pp-region takes between 42 to 48s and the one > with old pp-buffer around 6s. I wonder why we see such wildly different performance. In my tests on your `test-bookmark-alist.el` I basically see the reverse ratio! > Also sorry about your last patch I tested it too fast, it is indeed > slightly faster, but not much: 42 vs 46s. This is also perplexing. In my tests, the patch has a very significant impact on the performance of `pp-region`. Are you sure the patch is used (`lisp-mode.el` is preloaded, so you need to re-dump Emacs, or otherwise manually force-reload `lisp-mode.elc` into your Emacs session)? FWIW, I'm running my tests on Emacs's `master` branch with the native ELisp compiler enabled (tho I don't see much difference in performance on these tests when running my other Emacs build without the native compiler) on an i386 Debian testing system. >> And do I understand correctly that `tv/pp-region` does not indent its >> output? > No, it does indent, see function tv/pp which use pp-to-string which use pp-buffer > and pp-buffer indent the whole sexp at the end. AFAICT `tv/pp` uses `pp-to-string` only on "atomic" values (i.e. not lists, vectors, or hash-tables), so there's usually not much to indent in there. What I see in the output after calling `tv/pp-region` are non-indented lists. >> What was the reason for this choice? > Because indentation is done at the end by pp-buffer. When I use `describe-variable` with your code, the printed value is indeed indented, but that code uses only `pp-buffer` and not `tv/pp-region` (i.e. `tv/describe-variable` does not call `tv/pp-region`, neither directly nor indirectly). > PS (unrelated to pp-region): About the old pp-buffer, there is > a difficult to find bug where the same operation is running twice > (newline is added, then removed, then added again and then the loop > continue) > > You can see it by edebugging pp-buffer on a simple alist like this: > > (defvar bar '((1 . "un") (2 . "deux") (3 . "trois") (4 . "quatre") (5 . "cinq") (6 . "six"))) That might be part of the poor performance we see on `test-bookmark-alist.el`? Stefan

This bug report was last modified 2 years and 28 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #63861 [PATCH] pp.el: New "pretty printing" code

GNU bug report logs - #63861
[PATCH] pp.el: New "pretty printing" code