Package: emacs;
Reported by: Okamsn <okamsn <at> protonmail.com>
Date: Mon, 23 Sep 2024 01:35:01 UTC
Severity: wishlist
Done: Stefan Monnier <monnier <at> iro.umontreal.ca>
Bug is archived. No further changes may be made.
View this message in rfc822 format
From: Philip Kaludercic <philipk <at> posteo.net> To: Okamsn <okamsn <at> protonmail.com> Cc: michael_heerdegen <at> web.de, 73431 <at> debbugs.gnu.org, nicolas <at> petton.fr, monnier <at> iro.umontreal.ca Subject: bug#73431: Add `setf` support for `stream.el` in ELPA Date: Sat, 05 Oct 2024 09:14:01 +0000
Okamsn <okamsn <at> protonmail.com> writes: > Philip Kaludercic wrote: >> Okamsn <okamsn <at> protonmail.com> writes: >> >>> Philip Kaludercic wrote: >>>> Okamsn <okamsn <at> protonmail.com> writes: >>>>> Michael Heerdegen wrote: >>>>>> Does changing the internal representation of streams have an effect >>>>>> on the speed of the run code? >>>>> >>>>> I think that it does make it slower. I am trying to test it, and I think >>>>> that making records is slower than making cons cells. I think that >>>>> accessing the rest of the stream takes longer because the accessors >>>>> created by `cl-defstruct` always perform type checking. It seems to take >>>>> about twice as long when compared to naively using `car` and `cdr`. >>>>> >>>>> Do you think that it would be better to disable the type checking in the >>>>> accessors? If so, would you please share how to do that? The manual >>>>> talks about using `(optimize (safety 0))` in a declare form, but it also >>>>> seems to say that it cannot be done locally for just the `cl-defstruct` >>>>> usage. If it cannot be done, do think it makes sense to use >>>>> `make-record` directly, along with custom function to replace the >>>>> generated accessors? >>>> >>>> It the overhead noticeable, or just measurable? >>> >>> I’m not sure what counts as “noticeable”. >> >> I'd say that real-world code that uses stream.el gets slower. For me >> the main value of synthetic benchmarks is only the speed difference in >> orders of magnitude and in the number of GC interrupts. >> >>> ... > >> >>> You can see that the struct-based run times are about twice as long. I >>> think this is from the extra work done by the accessors. For example, >>> the type-checking is run multiple times when accessing the “first” and >>> “rest” slots, because the accessors are also used in `stream--force`. >> >> Type checking isn't always that bad; Do you see an (easy) way to avoid >> type checking from running multiple times? > > Please see the attached file. By setting `safety` to 0 and explicitly > checking only once in `stream--force`, we can avoid the multiple checks > when evaluating an unevaluated stream. That helps when going through the > stream the first time, but that still leaves multiple calls to > `stream--force` when iterating. I don't have any ideas for the latter. > > Setting safety to 0 is about 15% faster than having the accessors do > type checking, but it still isn't as fast as the current cons-based > implementation. From what I have tried, iterating through nested arrays > seems slower than iterating through a normal list. > > From 98ffcbe184fdbc403afdf5b1c48e77525dc1d476 Mon Sep 17 00:00:00 2001 > From: Earl Hyatt <okamsn <at> protonmail.com> > Date: Sat, 28 Sep 2024 15:09:10 -0400 > Subject: [PATCH v2] Change 'stream.el' to use structs instead of cons cells. > Update features. > > * stream.el (stream): Define the structure using 'cl-defstruct'. Set > safety to 0 using `cl-declaim` to avoid checking the type of the argument > to `stream--force` multiple times. Instead, explicitly check a single time > in `stream--force`, which must be used inside the public functions anyway. > > * stream.el (stream-make): Change to use new structure constructor > 'stream--make-stream'. > > * stream.el (stream--force, stream-first, stream-rest) > (stream-empty, stream-empty-p): Redefine to use structure slots. Move > to be closer to the structure definition. > > * stream.el (stream-first, stream-rest): Signal an error when trying to use > these functions as places for 'setf'. > > * stream.el (stream--fresh-identifier, stream--evald-identifier): > Remove now unused definitions. > > * stream.el (stream): Add a method that accepts a stream, returning it > unmodified. This makes mapping across multiple sequences easier. > > * stream.el (stream): Add a method that accepts an array and which does not > create sub-sequences of the array, unlike the implementation for generic > sequences. This is a bit faster and is a good example of a custom updater > function. > > * stream.el (stream--generalizer, cl-generic-generalizers): Remove > these specializers from the old, cons-based implementation. > > * stream.el (seq-elt): Signal an error when trying to use this function as a > place for 'setf'. > > * stream.el (seq-concatenate): Add methods that did not work as expected with > the generic implementation. > > * tests/stream-tests.el (stream-seq-concatenate-test, stream-seq-mapcat-test) > (stream-array-test): Add tests for these features. > > * tests/stream-tests.el: Test that evaluation is delayed for seq-drop-while > using deftest-for-delayed-evaluation. > --- > stream.el | 213 +++++++++++++++++++++++++++++------------- > tests/stream-tests.el | 26 ++++++ > 2 files changed, 176 insertions(+), 63 deletions(-) > > diff --git a/stream.el b/stream.el > index 7135ee0..23b8700 100644 > --- a/stream.el > +++ b/stream.el > @@ -66,36 +66,135 @@ > (eval-when-compile (require 'cl-lib)) > (require 'seq) > > -(eval-and-compile > - (defconst stream--fresh-identifier '--stream-fresh-- > - "Symbol internally used to streams whose head was not evaluated.") > - (defconst stream--evald-identifier '--stream-evald-- > - "Symbol internally used to streams whose head was evaluated.")) > +;; Set safety to 0 to avoid checking the type of the argument multiple times > +;; within `stream--force', which is used frequently. > +(cl-declaim (optimize (safety 0))) > +(cl-defstruct (stream (:constructor stream--make-stream) > + (:predicate streamp) > + :named) > + > + "A lazily evaluated sequence, compatible with the `seq' library's functions." > + > + (evaluated--internal > + nil > + :type boolean > + :documentation "Whether the head and tail of the stream are accessible. > + > +This value is set to t via the function `stream--force' after it > +calls the updater function.") > + > + (first--internal > + nil > + :type (or t null) Isn't this type just t? Not a proof, but this doesn't signal: (mapatoms (lambda (sym) (when (and (boundp sym) (not (cl-typep (symbol-value sym) '(or t null)))) (throw 'fail sym)))) > + :documentation "The first element of the stream.") > + > + (rest--internal > + nil > + :type (or stream null) > + :documentation "The rest of the stream, which is itself a stream.") > + > + (empty--internal > + nil > + :type boolean > + :documentation "Whether the evaluated stream is empty. > + > +A stream is empty if the updater function returns nil when > +`stream--force' evaluates the stream.") > + > + (updater--internal > + nil > + :type (or function null) > + :documentation "Function that returns the head and tail of the stream when called. > + > +The updater function returns the head and tail in a cons cell. > +If it returns nil, then the stream is empty and `empty--internal' is > +set to t. After this function is called, assuming no errors were signaled, > +`evaluated--internal' is set to t. > + > +In the case of the canonical empty stream (see the variable `stream-empty'), > +this slot is nil.")) > + > +(defun stream--force (stream) > + "Evaluate and return the STREAM. > + > +If the output of the updater function is nil, then STREAM is > +marked as empty. Otherwise, the output of the updater function > +is used to set the head and the tail of the stream." > + ;; Check explicitly so that we can avoid checking > + ;; in accessors by setting safety to 0 via `cl-declaim'. > + (unless (streamp stream) > + (signal 'wrong-type-argument (list 'stream stream))) If you are already using cl-lib, you could also make use of `cl-check-type'. > + (if (stream-evaluated--internal stream) > + stream > + (pcase (funcall (stream-updater--internal stream)) > + (`(,head . ,tail) > + (setf (stream-first--internal stream) head > + (stream-rest--internal stream) tail)) > + ((pred null) > + (setf (stream-empty--internal stream) t)) > + (bad-output > + (error "Bad output from stream updater: %s" > + bad-output))) > + (setf (stream-evaluated--internal stream) t) > + stream)) > > (defmacro stream-make (&rest body) > "Return a stream built from BODY. > -BODY must return nil or a cons cell whose cdr is itself a > -stream." > - (declare (debug t)) > - `(cons ',stream--fresh-identifier (lambda () ,@body))) > > -(defun stream--force (stream) Did you change the order of the definitions? > - "Evaluate and return the first cons cell of STREAM. > -That value is the one passed to `stream-make'." > - (cond > - ((eq (car-safe stream) stream--evald-identifier) > - (cdr stream)) > - ((eq (car-safe stream) stream--fresh-identifier) > - (prog1 (setf (cdr stream) (funcall (cdr stream))) > - ;; identifier is only updated when forcing didn't exit nonlocally > - (setf (car stream) stream--evald-identifier))) > - (t (signal 'wrong-type-argument (list 'streamp stream))))) > +BODY must return a cons cell whose car would be the head of a > +stream and whose cdr would be the tail of a stream. The cdr must > +be a stream itself in order to be a valid tail. Alternatively, > +BODY may return nil, in which case the stream is marked empty > +when the stream is evaluated." > + (declare (debug t)) > + `(stream--make-stream :evaluated--internal nil > + :updater--internal (lambda () ,@body))) > > (defmacro stream-cons (first rest) > "Return a stream built from the cons of FIRST and REST. > -FIRST and REST are forms and REST must return a stream." > + > +FIRST and REST are forms. REST must return a stream." > (declare (debug t)) > `(stream-make (cons ,first ,rest))) > + > +(defconst stream-empty > + (stream--make-stream :evaluated--internal t > + :first--internal nil > + :rest--internal nil > + :empty--internal t > + :updater--internal nil) > + "The empty stream.") > + > +(defun stream-empty () > + "Return the empty stream." > + stream-empty) This definition also appears unchanged? Please try to keep the diff as small as possible. > + > +(defun stream-empty-p (stream) > + "Return non-nil if STREAM is empty, nil otherwise." > + (stream-empty--internal (stream--force stream))) > + > +(defun stream-first (stream) > + "Return the first element of STREAM, evaluating if necessary. > + > +If STREAM is empty, return nil." > + (stream-first--internal (stream--force stream))) > + > +(defun \(setf\ stream-first\) (_store _stream) > + "Signal an error when trying to use `setf' on the head of a stream." > + (error "Streams are not mutable")) This should also be part of a second patch. > +(defun stream-rest (stream) > + "Return the tail of STREAM, evaluating if necessary. > + > +If STREAM is empty, return the canonical empty stream." > + (if (stream-empty-p stream) > + stream-empty > + (stream-rest--internal (stream--force stream)))) > + > +(defun \(setf\ stream-rest\) (_store _stream) > + "Signal an error when trying to use `setf' on the tail of a stream." > + (error "Streams are not mutable")) > + > > > ;;; Convenient functions for creating streams > @@ -103,6 +202,10 @@ (defmacro stream-cons (first rest) > (cl-defgeneric stream (src) > "Return a new stream from SRC.") > > +(cl-defmethod stream ((stream stream)) > + "Return STREAM unmodified." > + stream) > + > (cl-defmethod stream ((seq sequence)) > "Return a stream built from the sequence SEQ. > SEQ can be a list, vector or string." > @@ -112,6 +215,24 @@ (cl-defmethod stream ((seq sequence)) > (seq-elt seq 0) > (stream (seq-subseq seq 1))))) > > +(cl-defmethod stream ((array array)) > + "Return a stream built from the array ARRAY." > + (let ((len (length array))) > + (if (= len 0) > + (stream-empty) > + ;; This approach could avoid one level of indirection by setting > + ;; `stream-updater--internal' directly, but using `funcall' makes for a > + ;; good example of how to use a custom updater function using the public > + ;; interface. > + (let ((idx 0)) > + (cl-labels ((updater () > + (if (< idx len) > + (prog1 (cons (aref array idx) > + (stream-make (funcall #'updater))) > + (setq idx (1+ idx))) > + nil))) > + (stream-make (funcall #'updater))))))) > + > (cl-defmethod stream ((list list)) > "Return a stream built from the list LIST." > (if (null list) > @@ -190,33 +311,6 @@ (defun stream-range (&optional start end step) > (stream-range (+ start step) end step)))) > > > -(defun streamp (stream) > - "Return non-nil if STREAM is a stream, nil otherwise." > - (let ((car (car-safe stream))) > - (or (eq car stream--fresh-identifier) > - (eq car stream--evald-identifier)))) > - > -(defconst stream-empty (cons stream--evald-identifier nil) > - "The empty stream.") > - > -(defun stream-empty () > - "Return the empty stream." > - stream-empty) > - > -(defun stream-empty-p (stream) > - "Return non-nil if STREAM is empty, nil otherwise." > - (null (cdr (stream--force stream)))) > - > -(defun stream-first (stream) > - "Return the first element of STREAM. > -Return nil if STREAM is empty." > - (car (stream--force stream))) > - > -(defun stream-rest (stream) > - "Return a stream of all but the first element of STREAM." > - (or (cdr (stream--force stream)) > - (stream-empty))) > - > (defun stream-append (&rest streams) > "Concatenate the STREAMS. > Requesting elements from the resulting stream will request the > @@ -240,22 +334,7 @@ (defmacro stream-pop (stream) > `(prog1 > (stream-first ,stream) > (setq ,stream (stream-rest ,stream)))) > - > > -;;; cl-generic support for streams > - > -(cl-generic-define-generalizer stream--generalizer > - 11 > - (lambda (name &rest _) > - `(when (streamp ,name) > - 'stream)) > - (lambda (tag &rest _) > - (when (eq tag 'stream) > - '(stream)))) > - > -(cl-defmethod cl-generic-generalizers ((_specializer (eql stream))) > - "Support for `stream' specializers." > - (list stream--generalizer)) > > > ;;; Implementation of seq.el generic functions > @@ -273,6 +352,9 @@ (cl-defmethod seq-elt ((stream stream) n) > (setq n (1- n))) > (stream-first stream)) > > +(cl-defmethod (setf seq-elt) (_store (_sequence stream) _n) > + (error "Streams are not mutable")) > + > (cl-defmethod seq-length ((stream stream)) > "Return the length of STREAM. > This function will eagerly consume the entire stream." > @@ -417,6 +499,11 @@ (defmacro stream-delay (expr) > (cl-defmethod seq-copy ((stream stream)) > "Return a shallow copy of STREAM." > (stream-delay stream)) > + > +(cl-defmethod seq-concatenate ((_type (eql stream)) &rest sequences) > + "Make a stream which concatenates each sequence in SEQUENCES." > + (apply #'stream-append (mapcar #'stream sequences))) > + > > > ;;; More stream operations > diff --git a/tests/stream-tests.el b/tests/stream-tests.el > index ba304f1..defc544 100644 > --- a/tests/stream-tests.el > +++ b/tests/stream-tests.el > @@ -212,6 +212,27 @@ (ert-deftest stream-delay-test () > (and (equal res1 5) > (equal res2 5))))) > > +(ert-deftest stream-seq-concatenate-test () > + (should (streamp (seq-concatenate 'stream (list 1 2) (vector 3 4) (stream (list 5 6))))) > + (should (equal '(1 2 3 4 5 6) > + (seq-into (seq-concatenate 'stream > + (list 1 2) > + (vector 3 4) > + (stream (list 5 6))) > + 'list)))) > + > +(ert-deftest stream-seq-mapcat-test () > + (should (streamp (seq-mapcat #'stream (list (list 1 2) > + (vector 3 4) > + (stream (list 5 6))) > + 'stream))) > + (should (equal '(1 2 3 4 5 6) > + (seq-into (seq-mapcat #'stream (list (list 1 2) > + (vector 3 4) > + (stream (list 5 6))) > + 'stream) > + 'list)))) > + > (ert-deftest stream-seq-copy-test () > (should (streamp (seq-copy (stream-range)))) > (should (= 0 (stream-first (seq-copy (stream-range))))) > @@ -234,6 +255,10 @@ (ert-deftest stream-list-test () > (dolist (list '(nil '(1 2 3) '(a . b))) > (should (equal list (seq-into (stream list) 'list))))) > > +(ert-deftest stream-array-test () > + (dolist (arr (list "cat" [0 1 2])) > + (should (equal arr (seq-into (stream arr) (type-of arr)))))) > + > (ert-deftest stream-seq-subseq-test () > (should (stream-empty-p (seq-subseq (stream-range 2 10) 0 0))) > (should (= (stream-first (seq-subseq (stream-range 2 10) 0 3)) 2)) > @@ -296,6 +321,7 @@ (deftest-for-delayed-evaluation (stream-append (make-delayed-test-stream) (make > (deftest-for-delayed-evaluation (seq-take (make-delayed-test-stream) 2)) > (deftest-for-delayed-evaluation (seq-drop (make-delayed-test-stream) 2)) > (deftest-for-delayed-evaluation (seq-take-while #'numberp (make-delayed-test-stream))) > +(deftest-for-delayed-evaluation (seq-drop-while #'numberp (make-delayed-test-stream))) > (deftest-for-delayed-evaluation (seq-map #'identity (make-delayed-test-stream))) > (deftest-for-delayed-evaluation (seq-mapn #'cons > (make-delayed-test-stream) -- Philip Kaludercic on siskin
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.