GNU bug report logs - #50928
remove-dups

Previous Next

Package: emacs;

Reported by: Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp>

Date: Fri, 1 Oct 2021 03:25:01 UTC

Severity: wishlist

Tags: notabug

Done: Lars Ingebrigtsen <larsi <at> gnus.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 50928 in the body.
You can then email your comments to 50928 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-gnu-emacs <at> gnu.org:
bug#50928; Package emacs. (Fri, 01 Oct 2021 03:25:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp>:
New bug report received and forwarded. Copy sent to bug-gnu-emacs <at> gnu.org. (Fri, 01 Oct 2021 03:25:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp>
To: bug-gnu-emacs <at> gnu.org
Subject: remove-dups
Date: Fri, 1 Oct 2021 12:23:18 +0900
I wanted to delete duplicated items from a list non-destructively.
It took me a while to find out how to do so.

(cl-remove-duplicates list :test 'equal)
(delete-dups (copy-sequence list))

I think it is handy to have something like below in subr.el.
Too obvious?

#+begin_src emacs-lisp
(defun remove-dups (list)
  "Remove 'equal' duplicates from LIST non-destructively.
Note that `delete-dups' deletes duplicates destructively."
  (delete-dups (copy-sequence list)))
#+end_src





Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50928; Package emacs. (Fri, 01 Oct 2021 12:46:01 GMT) Full text and rfc822 format available.

Message #8 received at 50928 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp>
Cc: 50928 <at> debbugs.gnu.org
Subject: Re: bug#50928: remove-dups
Date: Fri, 01 Oct 2021 14:45:39 +0200
Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp> writes:

> I wanted to delete duplicated items from a list non-destructively.
> It took me a while to find out how to do so.
>
> (cl-remove-duplicates list :test 'equal)
> (delete-dups (copy-sequence list))
>
> I think it is handy to have something like below in subr.el.
> Too obvious?
>
> #+begin_src emacs-lisp
> (defun remove-dups (list)
>   "Remove 'equal' duplicates from LIST non-destructively.
> Note that `delete-dups' deletes duplicates destructively."
>   (delete-dups (copy-sequence list)))
> #+end_src

This is basically seq-uniq:

---
seq-uniq is an autoloaded compiled Lisp function in ‘seq.el’.

(seq-uniq SEQUENCE &optional TESTFN)
---

The seq library has a pretty full set of sequence functions, some of
which overlaps with the older functions like `delete-dups'.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Added tag(s) notabug. Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 01 Oct 2021 12:46:02 GMT) Full text and rfc822 format available.

bug closed, send any further explanations to 50928 <at> debbugs.gnu.org and Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp> Request was from Lars Ingebrigtsen <larsi <at> gnus.org> to control <at> debbugs.gnu.org. (Fri, 01 Oct 2021 12:46:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50928; Package emacs. (Fri, 01 Oct 2021 13:17:01 GMT) Full text and rfc822 format available.

Message #15 received at 50928 <at> debbugs.gnu.org (full text, mbox):

From: Dmitry Gutov <dgutov <at> yandex.ru>
To: Lars Ingebrigtsen <larsi <at> gnus.org>,
 Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp>
Cc: 50928 <at> debbugs.gnu.org
Subject: Re: bug#50928: remove-dups
Date: Fri, 1 Oct 2021 16:16:33 +0300
On 01.10.2021 15:45, Lars Ingebrigtsen wrote:
> This is basically seq-uniq:
> 
> ---
> seq-uniq is an autoloaded compiled Lisp function in ‘seq.el’.
> 
> (seq-uniq SEQUENCE &optional TESTFN)
> ---
> 
> The seq library has a pretty full set of sequence functions, some of
> which overlaps with the older functions like `delete-dups'.

seq-uniq is O(N^2), though, so it's going to be less efficient than

  (delete-dups (copy-sequence list))

I think the idea was to add specialized faster implementations for 
different data types, but that hasn't happened, so far.

And its signature (accepting testfn) might make the obvious optimization 
which delete-dups uses (caching the "known" set in a hash table) not 
feasible. Maybe we could use a hash table for a limited set of testfn's.




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50928; Package emacs. (Fri, 01 Oct 2021 17:03:01 GMT) Full text and rfc822 format available.

Message #18 received at 50928 <at> debbugs.gnu.org (full text, mbox):

From: Drew Adams <drew.adams <at> oracle.com>
To: Dmitry Gutov <dgutov <at> yandex.ru>, Lars Ingebrigtsen <larsi <at> gnus.org>, Tak
 Kunihiro <tkk <at> misasa.okayama-u.ac.jp>
Cc: "50928 <at> debbugs.gnu.org" <50928 <at> debbugs.gnu.org>
Subject: RE: [External] : bug#50928: remove-dups
Date: Fri, 1 Oct 2021 17:02:08 +0000
FWIW, I use this.  I don't recall whether I borrowed
it from somewhere or just wrote it from scratch.

(defun my-remove-dups (sequence &optional test)
  "Copy of SEQUENCE with duplicate elements removed.
Optional arg TEST is the test function.  If nil, test with `equal'.
See `make-hash-table' for possible values of TEST."
  (setq test  (or test  #'equal))
  (let ((htable  (make-hash-table :test test)))
    (loop 
     for elt in sequence
     unless (gethash elt htable)
     do     (puthash elt elt htable)
     finally return (loop for i being the hash-values in htable collect i))))

Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50928; Package emacs. (Fri, 01 Oct 2021 17:33:02 GMT) Full text and rfc822 format available.

Message #21 received at 50928 <at> debbugs.gnu.org (full text, mbox):

From: Thierry Volpiatto <thievol <at> posteo.net>
To: Drew Adams <drew.adams <at> oracle.com>
Cc: Lars Ingebrigtsen <larsi <at> gnus.org>,
 Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp>,
 "50928 <at> debbugs.gnu.org" <50928 <at> debbugs.gnu.org>,
 Dmitry Gutov <dgutov <at> yandex.ru>
Subject: Re: bug#50928: [External] : bug#50928: remove-dups
Date: Fri, 01 Oct 2021 17:31:53 +0000
Drew Adams <drew.adams <at> oracle.com> writes:

> FWIW, I use this.  I don't recall whether I borrowed
> it from somewhere or just wrote it from scratch.
>
> (defun my-remove-dups (sequence &optional test)
>   "Copy of SEQUENCE with duplicate elements removed.
> Optional arg TEST is the test function.  If nil, test with `equal'.
> See `make-hash-table' for possible values of TEST."
>   (setq test  (or test  #'equal))
>   (let ((htable  (make-hash-table :test test)))
>     (loop 
>      for elt in sequence
>      unless (gethash elt htable)
       collect (puthash elt elt htable))))

Looks like a old version of `helm-fast-remove-dups`, no need to loop
again in hash-table and using cl-loop is better.

-- 
Thierry




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50928; Package emacs. (Sun, 03 Oct 2021 23:43:02 GMT) Full text and rfc822 format available.

Message #24 received at 50928 <at> debbugs.gnu.org (full text, mbox):

From: Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp>
To: larsi <at> gnus.org
Cc: thievol <at> posteo.net, tkk <at> misasa.okayama-u.ac.jp, 50928 <at> debbugs.gnu.org,
 drew.adams <at> oracle.com, dgutov <at> yandex.ru
Subject: Re: bug#50928: remove-dups
Date: Mon, 04 Oct 2021 08:42:14 +0900 (JST)
> Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp> writes:
> 
>> I wanted to delete duplicated items from a list non-destructively.
>> It took me a while to find out how to do so.
>>
>> (cl-remove-duplicates list :test 'equal)
>> (delete-dups (copy-sequence list))
>>
>> I think it is handy to have something like below in subr.el.
>> Too obvious?
>>
>> (defun remove-dups (list)
>>   "Remove 'equal' duplicates from LIST non-destructively.
>> Note that `delete-dups' deletes duplicates destructively."
>>   (delete-dups (copy-sequence list)))
> 
> This is basically seq-uniq:

Thank you to let me know.  Now I can find its existence in (info
"(elisp) Sequence Functions").  I wonder how I could have reached to
the function by myself.

How did you find it? (apropos-documentation "duplicate")?




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50928; Package emacs. (Mon, 04 Oct 2021 09:30:02 GMT) Full text and rfc822 format available.

Message #27 received at 50928 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp>
Cc: thievol <at> posteo.net, 50928 <at> debbugs.gnu.org, drew.adams <at> oracle.com,
 dgutov <at> yandex.ru
Subject: Re: bug#50928: remove-dups
Date: Mon, 04 Oct 2021 11:29:10 +0200
Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp> writes:

> Thank you to let me know.  Now I can find its existence in (info
> "(elisp) Sequence Functions").  I wonder how I could have reached to
> the function by myself.
>
> How did you find it? (apropos-documentation "duplicate")?

I just...  knew about seq.el.  The cross-referencing between the older
sequence functions and seq.el is rather lacking -- basically all these
older functions should probably reference something in seq.el in their
doc strings.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50928; Package emacs. (Tue, 05 Oct 2021 03:04:01 GMT) Full text and rfc822 format available.

Message #30 received at 50928 <at> debbugs.gnu.org (full text, mbox):

From: Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp>
To: larsi <at> gnus.org
Cc: thievol <at> posteo.net, tkk <at> misasa.okayama-u.ac.jp, 50928 <at> debbugs.gnu.org,
 drew.adams <at> oracle.com, dgutov <at> yandex.ru
Subject: Re: bug#50928: remove-dups
Date: Tue, 05 Oct 2021 12:03:11 +0900 (JST)
>> Now I can find its existence in (info
>> "(elisp) Sequence Functions").  I wonder how I could have reached to
>> the function by myself.
>>
>> How did you find it? (apropos-documentation "duplicate")?
> 
> I just...  knew about seq.el.  The cross-referencing between the older
> sequence functions and seq.el is rather lacking -- basically all these
> older functions should probably reference something in seq.el in their
> doc strings.

How about something like below?

commit xxx
Author: yyy
Date:   zzz

    Add references to a newer function `seq-uniq' in seq.el
    
    * lisp/subr.el (delete-dups):
    * doc/lispref/lists.texi (Sets And Lists):
    * doc/lispref/lists.texi (delete-dups):  Refer to `seq-uniq' (bug#50928).

diff --git a/lisp/subr.el b/lisp/subr.el
index e4819c4b2b..228d2e0c22 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -696,7 +696,7 @@ delete-dups
   "Destructively remove `equal' duplicates from LIST.
 Store the result in LIST and return it.  LIST must be a proper list.
 Of several `equal' occurrences of an element in LIST, the first
-one is kept."
+one is kept.  See `seq-uniq' for non-destructive operation."
   (let ((l (length list)))
     (if (> l 100)
         (let ((hash (make-hash-table :test #'equal :size l))

diff --git a/doc/lispref/lists.texi b/doc/lispref/lists.texi
index 75641256b6..66c556ecd0 100644
--- a/doc/lispref/lists.texi
+++ b/doc/lispref/lists.texi
@@ -1227,13 +1227,13 @@ Sets And Lists
 @cindex lists as sets
 @cindex sets
 
-  A list can represent an unordered mathematical set---simply consider a
-value an element of a set if it appears in the list, and ignore the
-order of the list.  To form the union of two sets, use @code{append} (as
-long as you don't mind having duplicate elements).  You can remove
-@code{equal} duplicates using @code{delete-dups}.  Other useful
-functions for sets include @code{memq} and @code{delq}, and their
-@code{equal} versions, @code{member} and @code{delete}.
+  A list can represent an unordered mathematical set---simply consider
+a value an element of a set if it appears in the list, and ignore the
+order of the list.  To form the union of two sets, use @code{append}
+(as long as you don't mind having duplicate elements).  You can remove
+@code{equal} duplicates using @code{delete-dups} or @code{seq-uniq}.
+Other useful functions for sets include @code{memq} and @code{delq},
+and their @code{equal} versions, @code{member} and @code{delete}.
 
 @cindex CL note---lack @code{union}, @code{intersection}
 @quotation
@@ -1489,7 +1489,8 @@ Sets And Lists
 This function destructively removes all @code{equal} duplicates from
 @var{list}, stores the result in @var{list} and returns it.  Of
 several @code{equal} occurrences of an element in @var{list},
-@code{delete-dups} keeps the first one.
+@code{delete-dups} keeps the first one.  See @code{seq-uniq} for
+non-destructive operation.
 @end defun
 
   See also the function @code{add-to-list}, in @ref{List Variables},




Information forwarded to bug-gnu-emacs <at> gnu.org:
bug#50928; Package emacs. (Tue, 05 Oct 2021 07:13:02 GMT) Full text and rfc822 format available.

Message #33 received at 50928 <at> debbugs.gnu.org (full text, mbox):

From: Lars Ingebrigtsen <larsi <at> gnus.org>
To: Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp>
Cc: thievol <at> posteo.net, 50928 <at> debbugs.gnu.org, drew.adams <at> oracle.com,
 dgutov <at> yandex.ru
Subject: Re: bug#50928: remove-dups
Date: Tue, 05 Oct 2021 09:11:51 +0200
Tak Kunihiro <tkk <at> misasa.okayama-u.ac.jp> writes:

> How about something like below?

Thanks; applied to Emacs 28.

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 02 Nov 2021 11:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 291 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.