GNU bug report logs - #73431
Add `setf` support for `stream.el` in ELPA

Previous Next

Package: emacs;

Reported by: Okamsn <okamsn <at> protonmail.com>

Date: Mon, 23 Sep 2024 01:35:01 UTC

Severity: wishlist

Done: Stefan Monnier <monnier <at> iro.umontreal.ca>

Bug is archived. No further changes may be made.

Full log


Message #88 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Philip Kaludercic <philipk <at> posteo.net>
To: Okamsn <okamsn <at> protonmail.com>
Cc: Michael Heerdegen <michael_heerdegen <at> web.de>, "Okamsn via \"Bug reports
 for GNU Emacs, the Swiss army knife of text editors\"" <bug-gnu-emacs <at> gnu.org>,
 Nicolas Petton <nicolas <at> petton.fr>, Stefan Monnier <monnier <at> iro.umontreal.ca>,
 73431 <at> debbugs.gnu.org
Subject: Re: bug#73431: Add `setf` support for `stream.el` in ELPA
Date: Sat, 05 Oct 2024 09:14:01 +0000
Okamsn <okamsn <at> protonmail.com> writes:

> Philip Kaludercic wrote:
>> Okamsn <okamsn <at> protonmail.com> writes:
>> 
>>> Philip Kaludercic wrote:
>>>> Okamsn <okamsn <at> protonmail.com> writes:
>>>>> Michael Heerdegen wrote:
>>>>>> Does changing the internal representation of streams have an effect
>>>>>> on the speed of the run code?
>>>>>
>>>>> I think that it does make it slower. I am trying to test it, and I think
>>>>> that making records is slower than making cons cells. I think that
>>>>> accessing the rest of the stream takes longer because the accessors
>>>>> created by `cl-defstruct` always perform type checking. It seems to take
>>>>> about twice as long when compared to naively using `car` and `cdr`.
>>>>>
>>>>> Do you think that it would be better to disable the type checking in the
>>>>> accessors? If so, would you please share how to do that? The manual
>>>>> talks about using `(optimize (safety 0))` in a declare form, but it also
>>>>> seems to say that it cannot be done locally for just the `cl-defstruct`
>>>>> usage. If it cannot be done, do think it makes sense to use
>>>>> `make-record` directly, along with custom function to replace the
>>>>> generated accessors?
>>>>
>>>> It the overhead noticeable, or just measurable?
>>>
>>> I’m not sure what counts as “noticeable”.
>> 
>> I'd say that real-world code that uses stream.el gets slower.  For me
>> the main value of synthetic benchmarks is only the speed difference in
>> orders of magnitude and in the number of GC interrupts.
>> 
>>> ...
>  >>
>>> You can see that the struct-based run times are about twice as long.  I
>>> think this is from the extra work done by the accessors.  For example,
>>> the type-checking is run multiple times when accessing the “first” and
>>> “rest” slots, because the accessors are also used in `stream--force`.
>> 
>> Type checking isn't always that bad; Do you see an (easy) way to avoid
>> type checking from running multiple times?
>
> Please see the attached file. By setting `safety` to 0 and explicitly 
> checking only once in `stream--force`, we can avoid the multiple checks 
> when evaluating an unevaluated stream. That helps when going through the 
> stream the first time, but that still leaves multiple calls to 
> `stream--force` when iterating. I don't have any ideas for the latter.
>
> Setting safety to 0 is about 15% faster than having the accessors do 
> type checking, but it still isn't as fast as the current cons-based 
> implementation.  From what I have tried, iterating through nested arrays 
> seems slower than iterating through a normal list.
>
> From 98ffcbe184fdbc403afdf5b1c48e77525dc1d476 Mon Sep 17 00:00:00 2001
> From: Earl Hyatt <okamsn <at> protonmail.com>
> Date: Sat, 28 Sep 2024 15:09:10 -0400
> Subject: [PATCH v2] Change 'stream.el' to use structs instead of cons cells. 
>  Update features.
>
> * stream.el (stream): Define the structure using 'cl-defstruct'.  Set
> safety to 0 using `cl-declaim` to avoid checking the type of the argument
> to `stream--force` multiple times.  Instead, explicitly check a single time
> in `stream--force`, which must be used inside the public functions anyway.
>
> * stream.el (stream-make): Change to use new structure constructor
> 'stream--make-stream'.
>
> * stream.el (stream--force, stream-first, stream-rest)
> (stream-empty, stream-empty-p): Redefine to use structure slots.  Move
> to be closer to the structure definition.
>
> * stream.el (stream-first, stream-rest): Signal an error when trying to use
> these functions as places for 'setf'.
>
> * stream.el (stream--fresh-identifier, stream--evald-identifier):
> Remove now unused definitions.
>
> * stream.el (stream): Add a method that accepts a stream, returning it
> unmodified.  This makes mapping across multiple sequences easier.
>
> * stream.el (stream): Add a method that accepts an array and which does not
> create sub-sequences of the array, unlike the implementation for generic
> sequences.  This is a bit faster and is a good example of a custom updater
> function.
>
> * stream.el (stream--generalizer, cl-generic-generalizers): Remove
> these specializers from the old, cons-based implementation.
>
> * stream.el (seq-elt): Signal an error when trying to use this function as a
> place for 'setf'.
>
> * stream.el (seq-concatenate): Add methods that did not work as expected with
> the generic implementation.
>
> * tests/stream-tests.el (stream-seq-concatenate-test, stream-seq-mapcat-test)
> (stream-array-test): Add tests for these features.
>
> * tests/stream-tests.el: Test that evaluation is delayed for seq-drop-while
> using deftest-for-delayed-evaluation.
> ---
>  stream.el             | 213 +++++++++++++++++++++++++++++-------------
>  tests/stream-tests.el |  26 ++++++
>  2 files changed, 176 insertions(+), 63 deletions(-)
>
> diff --git a/stream.el b/stream.el
> index 7135ee0..23b8700 100644
> --- a/stream.el
> +++ b/stream.el
> @@ -66,36 +66,135 @@
>  (eval-when-compile (require 'cl-lib))
>  (require 'seq)
>  
> -(eval-and-compile
> -  (defconst stream--fresh-identifier '--stream-fresh--
> -    "Symbol internally used to streams whose head was not evaluated.")
> -  (defconst stream--evald-identifier '--stream-evald--
> -    "Symbol internally used to streams whose head was evaluated."))
> +;; Set safety to 0 to avoid checking the type of the argument multiple times
> +;; within `stream--force', which is used frequently.
> +(cl-declaim (optimize (safety 0)))



> +(cl-defstruct (stream (:constructor stream--make-stream)
> +                      (:predicate streamp)
> +                      :named)
> +
> +  "A lazily evaluated sequence, compatible with the `seq' library's functions."
> +
> +  (evaluated--internal
> +   nil
> +   :type boolean
> +   :documentation "Whether the head and tail of the stream are accessible.
> +
> +This value is set to t via the function `stream--force' after it
> +calls the updater function.")
> +
> +  (first--internal
> +   nil
> +   :type (or t null)

Isn't this type just t?  Not a proof, but this doesn't signal:

(mapatoms
 (lambda (sym)
   (when (and (boundp sym)
	      (not (cl-typep (symbol-value sym) '(or t null))))
     (throw 'fail sym))))

> +   :documentation "The first element of the stream.")
> +
> +  (rest--internal
> +   nil
> +   :type (or stream null)
> +   :documentation "The rest of the stream, which is itself a stream.")
> +
> +  (empty--internal
> +   nil
> +   :type boolean
> +   :documentation "Whether the evaluated stream is empty.
> +
> +A stream is empty if the updater function returns nil when
> +`stream--force' evaluates the stream.")
> +
> +  (updater--internal
> +   nil
> +   :type (or function null)
> +   :documentation "Function that returns the head and tail of the stream when called.
> +
> +The updater function returns the head and tail in a cons cell.
> +If it returns nil, then the stream is empty and `empty--internal' is
> +set to t.  After this function is called, assuming no errors were signaled,
> +`evaluated--internal' is set to t.
> +
> +In the case of the canonical empty stream (see the variable `stream-empty'),
> +this slot is nil."))
> +
> +(defun stream--force (stream)
> +  "Evaluate and return the STREAM.
> +
> +If the output of the updater function is nil, then STREAM is
> +marked as empty.  Otherwise, the output of the updater function
> +is used to set the head and the tail of the stream."
> +  ;; Check explicitly so that we can avoid checking
> +  ;; in accessors by setting safety to 0 via `cl-declaim'.
> +  (unless (streamp stream)
> +    (signal 'wrong-type-argument (list 'stream stream)))

If you are already using cl-lib, you could also make use of `cl-check-type'.

> +  (if (stream-evaluated--internal stream)
> +      stream
> +    (pcase (funcall (stream-updater--internal stream))
> +      (`(,head . ,tail)
> +       (setf (stream-first--internal stream) head
> +             (stream-rest--internal stream) tail))
> +      ((pred null)
> +       (setf (stream-empty--internal stream) t))
> +      (bad-output
> +       (error "Bad output from stream updater: %s"
> +              bad-output)))
> +    (setf (stream-evaluated--internal stream) t)
> +    stream))
>  
>  (defmacro stream-make (&rest body)
>    "Return a stream built from BODY.
> -BODY must return nil or a cons cell whose cdr is itself a
> -stream."
> -  (declare (debug t))
> -  `(cons ',stream--fresh-identifier (lambda () ,@body)))
>  
> -(defun stream--force (stream)

Did you change the order of the definitions?

> -  "Evaluate and return the first cons cell of STREAM.
> -That value is the one passed to `stream-make'."
> -  (cond
> -   ((eq (car-safe stream) stream--evald-identifier)
> -    (cdr stream))
> -   ((eq (car-safe stream) stream--fresh-identifier)
> -    (prog1 (setf (cdr stream) (funcall (cdr stream)))
> -      ;; identifier is only updated when forcing didn't exit nonlocally
> -      (setf (car stream) stream--evald-identifier)))
> -   (t (signal 'wrong-type-argument (list 'streamp stream)))))
> +BODY must return a cons cell whose car would be the head of a
> +stream and whose cdr would be the tail of a stream.  The cdr must
> +be a stream itself in order to be a valid tail.  Alternatively,
> +BODY may return nil, in which case the stream is marked empty
> +when the stream is evaluated."
> +  (declare (debug t))
> +  `(stream--make-stream :evaluated--internal nil
> +                        :updater--internal (lambda () ,@body)))
>  
>  (defmacro stream-cons (first rest)
>    "Return a stream built from the cons of FIRST and REST.
> -FIRST and REST are forms and REST must return a stream."
> +
> +FIRST and REST are forms.  REST must return a stream."
>    (declare (debug t))
>    `(stream-make (cons ,first ,rest)))
> +
> +(defconst stream-empty
> +  (stream--make-stream :evaluated--internal t
> +                       :first--internal nil
> +                       :rest--internal nil
> +                       :empty--internal t
> +                       :updater--internal nil)
> +  "The empty stream.")
> +
> +(defun stream-empty ()
> +  "Return the empty stream."
> +  stream-empty)

This definition also appears unchanged?  Please try to keep the diff as
small as possible.

> +
> +(defun stream-empty-p (stream)
> +  "Return non-nil if STREAM is empty, nil otherwise."
> +  (stream-empty--internal (stream--force stream)))
> +
> +(defun stream-first (stream)
> +  "Return the first element of STREAM, evaluating if necessary.
> +
> +If STREAM is empty, return nil."
> +  (stream-first--internal (stream--force stream)))
> +
> +(defun \(setf\ stream-first\) (_store _stream)
> +  "Signal an error when trying to use `setf' on the head of a stream."
> +  (error "Streams are not mutable"))

This should also be part of a second patch.

> +(defun stream-rest (stream)
> +  "Return the tail of STREAM, evaluating if necessary.
> +
> +If STREAM is empty, return the canonical empty stream."
> +  (if (stream-empty-p stream)
> +      stream-empty
> +    (stream-rest--internal (stream--force stream))))
> +
> +(defun \(setf\ stream-rest\) (_store _stream)
> +  "Signal an error when trying to use `setf' on the tail of a stream."
> +  (error "Streams are not mutable"))
> +
>  
>  
>  ;;; Convenient functions for creating streams
> @@ -103,6 +202,10 @@ (defmacro stream-cons (first rest)
>  (cl-defgeneric stream (src)
>    "Return a new stream from SRC.")
>  
> +(cl-defmethod stream ((stream stream))
> +  "Return STREAM unmodified."
> +  stream)
> +
>  (cl-defmethod stream ((seq sequence))
>    "Return a stream built from the sequence SEQ.
>  SEQ can be a list, vector or string."
> @@ -112,6 +215,24 @@ (cl-defmethod stream ((seq sequence))
>       (seq-elt seq 0)
>       (stream (seq-subseq seq 1)))))
>  
> +(cl-defmethod stream ((array array))
> +  "Return a stream built from the array ARRAY."
> +  (let ((len (length array)))
> +    (if (= len 0)
> +        (stream-empty)
> +      ;; This approach could avoid one level of indirection by setting
> +      ;; `stream-updater--internal' directly, but using `funcall' makes for a
> +      ;; good example of how to use a custom updater function using the public
> +      ;; interface.
> +      (let ((idx 0))
> +        (cl-labels ((updater ()
> +                      (if (< idx len)
> +                          (prog1 (cons (aref array idx)
> +                                       (stream-make (funcall #'updater)))
> +                            (setq idx (1+ idx)))
> +                        nil)))
> +          (stream-make (funcall #'updater)))))))
> +
>  (cl-defmethod stream ((list list))
>    "Return a stream built from the list LIST."
>    (if (null list)
> @@ -190,33 +311,6 @@ (defun stream-range (&optional start end step)
>       (stream-range (+ start step) end step))))
>  
>  
> -(defun streamp (stream)
> -  "Return non-nil if STREAM is a stream, nil otherwise."
> -  (let ((car (car-safe stream)))
> -    (or (eq car stream--fresh-identifier)
> -        (eq car stream--evald-identifier))))
> -
> -(defconst stream-empty (cons stream--evald-identifier nil)
> -  "The empty stream.")
> -
> -(defun stream-empty ()
> -  "Return the empty stream."
> -  stream-empty)
> -
> -(defun stream-empty-p (stream)
> -  "Return non-nil if STREAM is empty, nil otherwise."
> -  (null (cdr (stream--force stream))))
> -
> -(defun stream-first (stream)
> -  "Return the first element of STREAM.
> -Return nil if STREAM is empty."
> -  (car (stream--force stream)))
> -
> -(defun stream-rest (stream)
> -  "Return a stream of all but the first element of STREAM."
> -  (or (cdr (stream--force stream))
> -      (stream-empty)))
> -
>  (defun stream-append (&rest streams)
>    "Concatenate the STREAMS.
>  Requesting elements from the resulting stream will request the
> @@ -240,22 +334,7 @@ (defmacro stream-pop (stream)
>    `(prog1
>         (stream-first ,stream)
>       (setq ,stream (stream-rest ,stream))))
> -
>  
> -;;; cl-generic support for streams
> -
> -(cl-generic-define-generalizer stream--generalizer
> -  11
> -  (lambda (name &rest _)
> -    `(when (streamp ,name)
> -       'stream))
> -  (lambda (tag &rest _)
> -    (when (eq tag 'stream)
> -      '(stream))))
> -
> -(cl-defmethod cl-generic-generalizers ((_specializer (eql stream)))
> -  "Support for `stream' specializers."
> -  (list stream--generalizer))
>  
>  
>  ;;; Implementation of seq.el generic functions
> @@ -273,6 +352,9 @@ (cl-defmethod seq-elt ((stream stream) n)
>      (setq n (1- n)))
>    (stream-first stream))
>  
> +(cl-defmethod (setf seq-elt) (_store (_sequence stream) _n)
> +  (error "Streams are not mutable"))
> +
>  (cl-defmethod seq-length ((stream stream))
>    "Return the length of STREAM.
>  This function will eagerly consume the entire stream."
> @@ -417,6 +499,11 @@ (defmacro stream-delay (expr)
>  (cl-defmethod seq-copy ((stream stream))
>    "Return a shallow copy of STREAM."
>    (stream-delay stream))
> +
> +(cl-defmethod seq-concatenate ((_type (eql stream)) &rest sequences)
> +  "Make a stream which concatenates each sequence in SEQUENCES."
> +  (apply #'stream-append (mapcar #'stream sequences)))
> +
>  
>  
>  ;;; More stream operations
> diff --git a/tests/stream-tests.el b/tests/stream-tests.el
> index ba304f1..defc544 100644
> --- a/tests/stream-tests.el
> +++ b/tests/stream-tests.el
> @@ -212,6 +212,27 @@ (ert-deftest stream-delay-test ()
>              (and (equal res1 5)
>                   (equal res2 5)))))
>  
> +(ert-deftest stream-seq-concatenate-test ()
> +  (should (streamp (seq-concatenate 'stream (list 1 2) (vector 3 4) (stream (list 5 6)))))
> +  (should (equal '(1 2 3 4 5 6)
> +                 (seq-into (seq-concatenate 'stream
> +                                            (list 1 2)
> +                                            (vector 3 4)
> +                                            (stream (list 5 6)))
> +                           'list))))
> +
> +(ert-deftest stream-seq-mapcat-test ()
> +  (should (streamp (seq-mapcat #'stream (list (list 1 2)
> +                                              (vector 3 4)
> +                                              (stream (list 5 6)))
> +                               'stream)))
> +  (should (equal '(1 2 3 4 5 6)
> +                 (seq-into (seq-mapcat #'stream (list (list 1 2)
> +                                                      (vector 3 4)
> +                                                      (stream (list 5 6)))
> +                                       'stream)
> +                           'list))))
> +
>  (ert-deftest stream-seq-copy-test ()
>    (should (streamp (seq-copy (stream-range))))
>    (should (= 0 (stream-first (seq-copy (stream-range)))))
> @@ -234,6 +255,10 @@ (ert-deftest stream-list-test ()
>    (dolist (list '(nil '(1 2 3) '(a . b)))
>      (should (equal list (seq-into (stream list) 'list)))))
>  
> +(ert-deftest stream-array-test ()
> +  (dolist (arr (list "cat" [0 1 2]))
> +    (should (equal arr (seq-into (stream arr) (type-of arr))))))
> +
>  (ert-deftest stream-seq-subseq-test ()
>    (should (stream-empty-p (seq-subseq (stream-range 2 10) 0 0)))
>    (should (= (stream-first (seq-subseq (stream-range 2 10) 0 3)) 2))
> @@ -296,6 +321,7 @@ (deftest-for-delayed-evaluation (stream-append  (make-delayed-test-stream) (make
>  (deftest-for-delayed-evaluation (seq-take (make-delayed-test-stream) 2))
>  (deftest-for-delayed-evaluation (seq-drop (make-delayed-test-stream) 2))
>  (deftest-for-delayed-evaluation (seq-take-while #'numberp (make-delayed-test-stream)))
> +(deftest-for-delayed-evaluation (seq-drop-while #'numberp (make-delayed-test-stream)))
>  (deftest-for-delayed-evaluation (seq-map #'identity (make-delayed-test-stream)))
>  (deftest-for-delayed-evaluation (seq-mapn #'cons
>                                            (make-delayed-test-stream)

-- 
	Philip Kaludercic on siskin




This bug report was last modified 264 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.