GNU bug report logs - #56521
Add 'take' list operation [PATCH]

Previous Next

Package: emacs;

Reported by: Mattias Engdegård <mattiase <at> acm.org>

Date: Tue, 12 Jul 2022 17:07:02 UTC

Severity: wishlist

Tags: patch

Done: Mattias Engdegård <mattiase <at> acm.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 56521 in the body.
You can then email your comments to 56521 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#56521; Package emacs. (Tue, 12 Jul 2022 17:07:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Mattias Engdegård <mattiase <at> acm.org>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Tue, 12 Jul 2022 17:07:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: bug-gnu-emacs <at> gnu.org
Subject: Add 'take' list operation [PATCH]
Date: Tue, 12 Jul 2022 19:05:54 +0200
[Message part 1 (text/plain, inline)]
A basic list operation that is often useful is 'take', that extracts the N first elements of a list. The attached patch adds 'take' as a built-in function.

We already have its complement, `drop`, under the name `nthcdr`. (I wouldn't mind adding `drop` as an alias but it's not very important.)

How would it compare to existing alternatives, or to a pure Lisp implementation? Here are relative timings (smaller is better), sans GC.
N/M means taking N elements from a list of M.

|            | 5/10 | 9999/10000 | 1/10 |  1/0 |
|------------+------+------------+------+------|
| take       | 1.00 |       1.00 | 1.00 | 1.00 |
|------------+------+------------+------+------|
| seq-take   | 4.90 |       3.45 | 5.82 | 4.95 |
| -take      | 4.01 |       4.92 | 2.86 | 1.54 |
|------------+------+------------+------+------|
| fwd        | 2.89 |       3.76 | 1.77 | 1.25 |
| rev        | 2.90 |       3.45 | 2.06 | 1.38 |
| cl-loop    | 3.18 |       3.82 | 2.21 | 1.65 |
| butlast    | 3.58 |       1.73 | 6.59 | 2.33 |
|------------+------+------------+------+------|
| ntake (el) | 0.80 |       0.20 | 1.51 | 1.40 |
| ntake (C)  | 0.48 |       0.41 | 0.83 | 0.99 |

`seq-take` and `-take` are existing implementations from seq.el and dash.el respectively.  `fwd`, `rev`, `cl-loop` and `butlast` are 
alternative implementations in elisp. `ntake` are elisp and C implementations of a destructive take operation

A common replacement is using `butlast` but as we see that is very inefficient (worse than the table suggests since GC costs are not included). It also doesn't work for circular or dotted lists.

The C implementation would, of course, be used to implement `seq-take` for lists and speed up existing code. 

Adding `ntake` as a C primitive is not quite as beneficial since the elisp implementation does fairly well. (I'm not sure why Elisp is actually faster than C in the 9999/10000 case; maybe I made a mistake, or it has something to do with `nthcdr` having its own bytecode). I included `ntake` in the patch for reference and because it's Lisp tradition to have faster destructive 'n-' variants of list functions.

The patch is just a draft; there will be tests and manual text.
There are details that can be debated. For example, we don't error on negative N. It might be useful to allow the compiler to partially evaluate calls where N<2 or LIST=nil.

[take.diff (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56521; Package emacs. (Tue, 12 Jul 2022 22:52:01 GMT) Full text and rfc822 format available.

Message #8 received at 56521 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Mattias Engdegård <mattiase <at> acm.org>
Cc: 56521 <at> debbugs.gnu.org
Subject: Re: bug#56521: Add 'take' list operation [PATCH]
Date: Wed, 13 Jul 2022 00:51:28 +0200
Mattias Engdegård <mattiase <at> acm.org> writes:

> A basic list operation that is often useful is 'take', that extracts
> the N first elements of a list. The attached patch adds 'take' as a
> built-in function.

I'm not against adding this, but I'm not sure of the utility.  That is,
if you have performance critical code, you don't use `take', because it
generates much garbage -- instead you loop over a list, taking the
elements a few at a time explicitly.  And if you don't have performance
critical code, then why not use `seq-take'?

So a fast `take' seems to me to land in that uncanny zone of "you don't
really need that", but there may, of course, be usage scenarios that I'm
not aware of.

> I included `ntake` in the patch for reference and because it's Lisp
> tradition to have faster destructive 'n-' variants of list functions.

I think `ntake' would be a more popular function than `take', really --
because shortening a list is efficient, and something that's not
uncommon to do, and it's a bother to write

  (setcdr (nthcdr 10 list) nil)

especially since that may bug out if the list is shorter.  But as you
say, the Lisp version of ntake may well be faster than the C version, so
perhaps just stick it in subr.el instead?

But I'm not sure about the `ntake' name.  😀  It's natural in one way,
but since the main point is that the list is shortened, then perhaps
`list-shorten' (or something like that) is more appropriate.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Severity set to 'wishlist' from 'normal' Request was from Stefan Kangas <stefan <at> marxist.se> to control <at> debbugs.gnu.org. (Wed, 13 Jul 2022 11:47:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56521; Package emacs. (Wed, 13 Jul 2022 13:19:02 GMT) Full text and rfc822 format available.

Message #13 received at 56521 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 56521 <at> debbugs.gnu.org
Subject: Re: bug#56521: Add 'take' list operation [PATCH]
Date: Wed, 13 Jul 2022 15:18:41 +0200
[Message part 1 (text/plain, inline)]
13 juli 2022 kl. 00.51 skrev Lars Ingebrigtsen <larsi <at> gnus.org>:

> I'm not against adding this, but I'm not sure of the utility.  That is,
> if you have performance critical code, you don't use `take', because it
> generates much garbage -- instead you loop over a list, taking the
> elements a few at a time explicitly.  And if you don't have performance
> critical code, then why not use `seq-take'?

Thank you, but it is definitely not the case that performance is either of utmost importance or not at all. In fact, most code falls between those extremes. In particular, for elaborate packages such as Magit, Org, Calc, the byte compiler, many programming modes etc, managing complexity and correctness matter most but performance isn't unimportant.

Another way of looking at it is that we add `take` for convenience, and implement it in C for performance.

And don't underestimate the utility of `take` itself -- it's far from always possible or desired to do immediate work on the extracted prefix. I keep seeing it reimplemented (badly) and have done so several times myself (also badly).

It's also useful in conjunction with drop (nthcdr) since (append (take N L) (drop N L)) = L for all N and L.
For example, (take N (drop M L)) returns a sublist; (append (take N L) (drop M L)) removes a sublist. The primitives are easy to reason about.

> But as you
> say, the Lisp version of ntake may well be faster than the C version, so
> perhaps just stick it in subr.el instead?

After a more careful look, that anomaly was probably just a measurement fluke. On the whole the C implementation is reliably faster, although naturally the difference isn't nearly as big as for `take`.

> But I'm not sure about the `ntake' name.

It follows an established convention and makes it easy for the user to see the relationship with `take` as well as understand how they differ.
(We have practically no list primitive starting with 'list-'; those are commands that list something.)

If you prefer we could name it `destructive-take` or `take-destructively` but these are annoyingly long, lack precedence, and are not really better.

Attached is a more complete patch, now with tests and documentation updates.

[0001-Add-take-and-ntake-bug-56521.patch (application/octet-stream, attachment)]

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#56521; Package emacs. (Sun, 17 Jul 2022 16:01:02 GMT) Full text and rfc822 format available.

Message #16 received at 56521 <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 56521 <at> debbugs.gnu.org
Subject: Re: bug#56521: Add 'take' list operation [PATCH]
Date: Sun, 17 Jul 2022 18:00:17 +0200
Not much argument here; pushed to master. A couple of notes:

I went for the C implementation of `ntake` because I already had the code, and because an elisp implementation would be slower than `take` for short lists or small N, and there very point of `ntake` is speed.

`nthcdr` has some elaborate code for handling bignum arguments which makes some kind of sense since it's defined as an iteration of `cdr`. Currently, `take` and `ntake` require fixnums because bignums didn't seem to be worth the trouble.

Before closing the bug, I'll make the new code pay their way by adding some calls. `seq-take` is first in line for a makeover.





Reply sent to Mattias Engdegård <mattiase <at> acm.org>:
You have taken responsibility. (Mon, 18 Jul 2022 12:52:02 GMT) Full text and rfc822 format available.

Notification sent to Mattias Engdegård <mattiase <at> acm.org>:
bug acknowledged by developer. (Mon, 18 Jul 2022 12:52:02 GMT) Full text and rfc822 format available.

Message #21 received at 56521-done <at> debbugs.gnu.org (full text, mbox):

From: Mattias Engdegård <mattiase <at> acm.org>
To: Lars Ingebrigtsen <larsi <at> gnus.org>
Cc: 56521-done <at> debbugs.gnu.org
Subject: Re: bug#56521: Add 'take' list operation [PATCH]
Date: Mon, 18 Jul 2022 14:50:52 +0200
> `seq-take` is first in line for a makeover.

Done. Naturally there are many more instances of code implementing `take` and `ntake` in creative ways but those will have to wait.





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 16 Aug 2022 11:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 2 years and 305 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.