GNU bug report logs - #13032
24.3.50; Request: Provide a `delete-duplicate-lines' command
Reported by: Dani Moncayo <dmoncayo <at> gmail.com>
Date: Thu, 29 Nov 2012 19:26:01 UTC
Severity: wishlist
Found in version 24.3.50
Done: Juri Linkov <juri <at> jurta.org>
Bug is archived. No further changes may be made.
Message #41 received at 13032 <at> debbugs.gnu.org:
>>>> C-u M-| awk -- '!a[$0]++' RET
>
> But I agree that it would be even better if `delete-duplicate-lines'
> did TRT even when the lines are not sorted. (I've just tested this
> feature in MS-Excel, and it works that way: it doesn't require that the
> lines be sorted beforehand.)
Actually I use a slightly different command:
C-u M-| tac | awk -- '!a[$0]++' | tac RET
because I need to keep the last duplicate line instead of the first.
The first `tac' reverses the lines, awk removes the duplicates keeping the
first occurrence, and the second `tac' reverses the lines back, so the net
effect is to keep the last duplicate.
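To make the difference concrete, here is a minimal sketch of the two pipelines run on a small sample. The function names `keep_first' and `keep_last' are made up for illustration, and `tac' is assumed to be the GNU coreutils one:

```shell
# Hypothetical helper names; each reads lines on stdin.
keep_first() { awk -- '!a[$0]++'; }              # first occurrence survives
keep_last()  { tac | awk -- '!a[$0]++' | tac; }  # last occurrence survives

printf 'a\nb\na\nc\nb\n' | keep_first   # prints a, b, c
printf 'a\nb\na\nc\nb\n' | keep_last    # prints a, c, b
```

Both keep exactly one copy of each line; they differ only in which copy stays, and therefore in the order of the surviving lines.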
So for `delete-duplicate-lines' to be useful in this case, it should also
support a reverse search that keeps the last duplicate.
You can see this limitation described in the docstrings of various functions at
http://emacswiki.org/emacs/DuplicateLines
as "keeping first occurrence", so these functions are of no help here.
Adding one argument to keep either the first or the last duplicate and
another to delete only adjacent lines, using the same algorithm as awk and
the same calling interface as `flush-lines', gives the following small
function. It can be called with `C-u' to keep the last duplicate line,
and with `C-u C-u' to delete only adjacent duplicate lines:
(defun delete-duplicate-lines (rstart rend &optional reverse adjacent interactive)
  "Delete duplicate lines in the region between RSTART and REND.
If REVERSE is nil, search and delete duplicates forward keeping the first
occurrence of duplicate lines. If REVERSE is non-nil, search and delete
duplicates backward keeping the last occurrence of duplicate lines.
If ADJACENT is non-nil, delete repeated lines only if they are adjacent.
If INTERACTIVE is non-nil, report the number of deleted lines."
  (interactive
   (progn
     (barf-if-buffer-read-only)
     (list (region-beginning) (region-end)
           (equal current-prefix-arg '(4))
           (equal current-prefix-arg '(16))
           t)))
  ;; LINES records every line seen so far.  It is a plain `equal' table:
  ;; with `:weakness 'key' the freshly created line strings are referenced
  ;; only by the table, so entries could be garbage-collected mid-loop.
  (let ((lines (unless adjacent (make-hash-table :test 'equal)))
        line prev-line
        (count 0)
        (rstart (copy-marker rstart))
        (rend (copy-marker rend)))
    (save-excursion
      (goto-char (if reverse rend rstart))
      (if (and reverse (bolp)) (forward-char -1))
      (while (if reverse
                 (and (> (point) rstart) (not (bobp)))
               (and (< (point) rend) (not (eobp))))
        (setq line (buffer-substring-no-properties
                    (line-beginning-position) (line-end-position)))
        (if (if adjacent (equal line prev-line) (gethash line lines))
            ;; Duplicate: delete the whole line including its newline.
            (progn
              (delete-region (progn (forward-line 0) (point))
                             (progn (forward-line 1) (point)))
              (if reverse (forward-line -1))
              (setq count (1+ count)))
          ;; First sighting: remember the line and move on.
          (if adjacent (setq prev-line line) (puthash line t lines))
          (forward-line (if reverse -1 1)))))
    (set-marker rstart nil)
    (set-marker rend nil)
    (when interactive
      (message "Deleted %d %sduplicate line%s%s"
               count
               (if adjacent "adjacent " "")
               (if (= count 1) "" "s")
               (if reverse " backward" "")))
    count))
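As a rough shell analogue of the ADJACENT argument (a sketch only, not part of the proposed function): the default mode behaves like the awk one-liner above, while the adjacent-only mode behaves like plain `uniq'. The helper names `dedupe_all' and `dedupe_adjacent' are made up for illustration:

```shell
# Hypothetical helper names; each reads lines on stdin.
dedupe_all()      { awk -- '!a[$0]++'; }   # remembers every line seen so far
dedupe_adjacent() { uniq; }                # compares only with the previous line

printf 'a\na\nb\na\n' | dedupe_all        # prints a, b
printf 'a\na\nb\na\n' | dedupe_adjacent   # prints a, b, a
```

The adjacent-only variant needs no table of seen lines, which is why the function skips the hash table when ADJACENT is non-nil.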
This bug report was last modified 12 years and 173 days ago.