GNU bug report logs - #7068
Feature request: uniq --field-separator="SEP" --consider-fields="a, b, c" --ignore-fields="x, y, z"
To reply to this bug, email your comments to 7068 AT debbugs.gnu.org.
Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#7068; Package coreutils. (Sat, 18 Sep 2010 23:03:02 GMT)
Acknowledgement sent to Stefan Nowak <p.org <at> gmx.at>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 18 Sep 2010 23:03:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Hello developers!
CURRENT SYNTAX:
http://www.gnu.org/software/coreutils/manual/html_node/uniq-invocation.html
--skip-fields=n Skip n fields on each line before checking for
uniqueness. Use a null string for comparison if a line has fewer than
n fields. Fields are sequences of non-space non-tab characters that
are separated from each other by at least one space or tab.
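To illustrate the current behaviour described above, here is a small sketch using only the existing -f (--skip-fields) option. The sample data is made up for illustration. The second command shows why blank-separated fields are a problem when the column data itself contains spaces.

```shell
# Two-column input, columns separated by a single space.
# -f 1 skips the first field, so lines 1 and 2 compare equal.
printf 'a x\nb x\nc y\n' | uniq -f 1
# prints:
#   a x
#   c y

# Columns separated by TAB, but the data contains spaces.
# -f 1 skips only the word "folder", so " a<TAB>file 1" is compared
# with " b<TAB>file 1" and both lines are kept.
printf 'folder a\tfile 1\nfolder b\tfile 1\n' | uniq -f 1
```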
--- FEATURE REQUEST #1 ---
--field-separator="SEP", -F
EXAMPLE:
Scenario: Imagine a filesystem listing. Because of its hierarchical
nature, all entries are unique. Now I want to ignore the filepath
prefix (split fields at the separator given by -F and skip the first
field with -f), consider only the basename, and see how many instances
of it exist, and where (print all duplicate instances with -D).
Input:
folder a<TAB>file 1
folder b<TAB>file 1
folder b<TAB>file 2
folder c<TAB>file 3
Commandline:
cat sample.txt | guniq -D -F "\t" -f 1
Output:
folder a<TAB>file 1
folder b<TAB>file 1
BENEFIT: If you can define the separator character (e.g. TAB), then
your column data may contain any character other than SEP; for
example, a column could then contain SPACE characters.
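Until such an option exists, the requested behaviour can be approximated with awk. The following is only a sketch, not a coreutils feature: the function name dups_skipping_first_field is hypothetical, and it assumes TAB as the separator and that, like uniq -D, only adjacent lines are compared.

```shell
# Hypothetical helper (not part of coreutils): print every line that belongs
# to a run of adjacent lines sharing the same key, where the key is the line
# with its first TAB-separated field removed.
dups_skipping_first_field() {
  awk -F '\t' '
    {
      # Key: fields 2..NF joined by TAB; empty when the line has one field.
      key = ""
      for (i = 2; i <= NF; i++) key = key (i > 2 ? "\t" : "") $i
    }
    NR > 1 && key == prev_key {
      if (!printed) print prev_line   # first member of the run, printed once
      print
      printed = 1
    }
    NR == 1 || key != prev_key { printed = 0 }
    { prev_key = key; prev_line = $0 }
  '
}

# Usage with the sample input from this report:
printf 'folder a\tfile 1\nfolder b\tfile 1\nfolder b\tfile 2\nfolder c\tfile 3\n' |
  dups_skipping_first_field
```

With the sample input this prints the two lines whose second column is "file 1".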
--- FEATURE REQUEST #2 ---
--consider-fields=a[,b,c, ...] Build the comparison string of a line
from these field(s).
--ignore-fields=x[,y,z,...] Build the comparison string of a line
by excluding these field(s).
EXAMPLE:
Input:
folder a<TAB>file 1<TAB>suffixA
folder b<TAB>file 1<TAB>suffixB
folder b<TAB>file 2<TAB>suffixA
folder c<TAB>file 3<TAB>suffixA
Commandline:
cat sample.txt | guniq -D -F "\t" --consider-fields="2"
Equivalent to:
cat sample.txt | guniq -D -F "\t" --ignore-fields="1,3"
Output:
folder a<TAB>file 1<TAB>suffixA
folder b<TAB>file 1<TAB>suffixB
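As a rough sketch of what --consider-fields could do, the following awk function builds the comparison string from a list of field numbers. The name dups_by_fields is hypothetical, TAB is assumed as the separator, and only the "consider" variant is covered; as with uniq -D, only adjacent lines are compared.

```shell
# Hypothetical helper (not a coreutils feature): emulate the proposed
#   uniq -D -F "\t" --consider-fields=LIST
# LIST is a comma-separated list of 1-based field numbers, e.g. "2" or "1,3".
dups_by_fields() {
  awk -F '\t' -v list="$1" '
    BEGIN { n = split(list, want, ",") }
    {
      # Key: the requested fields joined by TAB; missing fields are empty.
      key = ""
      for (i = 1; i <= n; i++) key = key "\t" $(want[i] + 0)
    }
    NR > 1 && key == prev_key {
      if (!printed) print prev_line   # first member of the run, printed once
      print
      printed = 1
    }
    NR == 1 || key != prev_key { printed = 0 }
    { prev_key = key; prev_line = $0 }
  '
}

# Usage with the sample input from this report:
printf 'folder a\tfile 1\tsuffixA\nfolder b\tfile 1\tsuffixB\nfolder b\tfile 2\tsuffixA\nfolder c\tfile 3\tsuffixA\n' |
  dups_by_fields 2
```

Called with "2", this prints the two lines whose second column is "file 1".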
WORKAROUND MEANWHILE: Insert a RegEx find/replace step in the pipe
before uniq that moves all the data to be ignored in the comparison to
the front of each line, and then use --skip-fields.
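One possible shape of this workaround is sketched below. It is an assumption-laden illustration, not a recommended recipe: it assumes GNU uniq (for -D), assumes the bytes \001 and \002 never occur in the data, and uses a hypothetical function name. The whole original line is encoded into a single blank-free field at the front, the key (here field 2) is appended, uniq -f 1 compares only the key, and the original line is decoded afterwards.

```shell
# Hypothetical helper: find adjacent duplicates by TAB-separated field 2
# using the real uniq. Assumes GNU uniq and that \001 and \002 do not
# appear in the input.
dups_via_skip_fields() {
  awk -F '\t' '
    {
      enc = $0
      gsub(/ /, "\001", enc)    # hide spaces so the line is one uniq field
      gsub(/\t/, "\002", enc)   # hide TABs as well
      print enc "\t" $2         # encoded original first, then the key
    }
  ' |
    uniq -D -f 1 |              # skip the encoded line, compare the key only
    cut -f 1 |                  # drop the key again
    tr '\001\002' ' \t'         # restore spaces and TABs
}

# Usage with the sample input from this report:
printf 'folder a\tfile 1\tsuffixA\nfolder b\tfile 1\tsuffixB\nfolder b\tfile 2\tsuffixA\nfolder c\tfile 3\tsuffixA\n' |
  dups_via_skip_fields
```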
BENEFIT: Of course it would be much more convenient to work with the
data as-is, and have the functions --consider-fields and
--ignore-fields.
Regards, Stefan Nowak
This bug report was last modified 14 years and 279 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.