From unknown Sat Sep 13 00:10:59 2025
From: bug#7068 <7068@debbugs.gnu.org>
To: bug#7068 <7068@debbugs.gnu.org>
Subject: Status: Feature request: uniq --field-separator="SEP"
--consider-fields="a, b, c" --ignore-fields="x, y, z"
Reply-To: bug#7068 <7068@debbugs.gnu.org>
Date: Sat, 13 Sep 2025 07:10:59 +0000
retitle 7068 Feature request: uniq --field-separator="SEP" --consider-fields="a, b, c" --ignore-fields="x, y, z"
reassign 7068 coreutils
submitter 7068 Stefan Nowak
severity 7068 wishlist
thanks
From debbugs-submit-bounces@debbugs.gnu.org Sat Sep 18 19:02:58 2010
Message-Id: <016CD1D6-E98C-449C-BAAB-EB698D0C2B0F@gmx.at>
From: Stefan Nowak
To: bug-coreutils@gnu.org
Subject: Feature request: uniq --field-separator="SEP" --consider-fields="a, b,
c" --ignore-fields="x, y, z"
Date: Sun, 19 Sep 2010 00:44:17 +0200
Hello developers!
CURRENT SYNTAX:
http://www.gnu.org/software/coreutils/manual/html_node/uniq-invocation.html
--skip-fields=n Skip n fields on each line before checking for
uniqueness. Use a null string for comparison if a line has fewer than
n fields. Fields are sequences of non-space non-tab characters that
are separated from each other by at least one space or tab.
--- FEATURE REQUEST #1 ---
--field-separator="SEP", -F
EXAMPLE:
Scenario: Imagine a filesystem listing. Because of its hierarchical
nature, all entries are unique. Now I want to ignore the filepath
prefix (split the fields with -F, skip the leading field(s) with -f),
consider only the basename, and see how many instances of it exist and
where (print all duplicate instances with -D).
Input (<TAB> marks a literal tab character):
folder a<TAB>file 1
folder b<TAB>file 1
folder b<TAB>file 2
folder c<TAB>file 3
Commandline:
cat sample.txt | guniq -D -F "\t" -f 1
Output:
folder a<TAB>file 1
folder b<TAB>file 1
BENEFIT: If you can define the separator character (e.g. TAB), your
column data may contain any character other than SEP; for example, a
column could then contain SPACE characters.
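For illustration, the requested behaviour can be approximated today with
awk. This is only a sketch, not part of coreutils: the function name
uniq_dups_skip_fields is hypothetical, and it assumes the sample lines
are tab-separated (for example "folder a", a tab, then "file 1"). Like
uniq -D, it compares adjacent lines only.

```sh
# Hypothetical helper approximating the proposed `uniq -D -F SEP -f N`.
# $1 = field separator as awk understands it (e.g. '\t'), $2 = fields to skip.
# Prints every line of each run of adjacent lines whose remaining fields match.
uniq_dups_skip_fields() {
  awk -F"$1" -v skip="$2" '
    {
      # Build the comparison key from the fields after the skipped ones.
      # A line with too few fields yields an empty key.
      key = ""
      for (i = skip + 1; i <= NF; i++) key = key FS $i

      if (NR > 1 && key == prev) {
        # Same key as the previous line: emit the held first line of the run once.
        if (pending) { print held; pending = 0 }
        print
      } else {
        # First line of a new run: hold it until a duplicate follows.
        held = $0
        pending = 1
      }
      prev = key
    }'
}

# Usage with the sample data from the scenario above (tab-separated).
printf 'folder a\tfile 1\nfolder b\tfile 1\nfolder b\tfile 2\nfolder c\tfile 3\n' |
  uniq_dups_skip_fields '\t' 1
```

With that input, only the two lines whose second field is "file 1" are
printed, which is the output the proposed command line should give.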
--- FEATURE SUGGESTION #2 ---
--consider-fields=a[,b,c,...] Build the comparison string of a line
from these field(s).
--ignore-fields=x[,y,z,...] Build the comparison string of a line
by excluding these field(s).
EXAMPLE:
Input (<TAB> marks a literal tab character):
folder a<TAB>file 1<TAB>suffixA
folder b<TAB>file 1<TAB>suffixB
folder b<TAB>file 2<TAB>suffixA
folder c<TAB>file 3<TAB>suffixA
Commandline:
cat sample.txt | guniq -D -F "\t" --consider-fields="2"
Equivalent to:
cat sample.txt | guniq -D -F "\t" --ignore-fields="1,3"
Output:
folder a<TAB>file 1<TAB>suffixA
folder b<TAB>file 1<TAB>suffixB
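As a rough sketch of what --consider-fields could do, the selection of
fields can again be imitated with awk. The function name
uniq_dups_consider_fields is hypothetical and not an existing coreutils
interface; the sample lines are assumed to be tab-separated with three
fields each. Only adjacent lines are compared, as with uniq -D. The
--ignore-fields variant would build the key from the complement of the
listed fields instead.

```sh
# Hypothetical helper approximating the proposed
# `uniq -D -F SEP --consider-fields=LIST`.
# $1 = field separator as awk understands it (e.g. '\t'),
# $2 = comma-separated list of 1-based field numbers to compare on.
uniq_dups_consider_fields() {
  awk -F"$1" -v fields="$2" '
    BEGIN { n = split(fields, want, ",") }
    {
      # Build the comparison key from the requested fields only.
      key = ""
      for (i = 1; i <= n; i++) key = key FS $(want[i] + 0)

      if (NR > 1 && key == prev) {
        # Same key as the previous line: emit the held first line of the run once.
        if (pending) { print held; pending = 0 }
        print
      } else {
        # First line of a new run: hold it until a duplicate follows.
        held = $0
        pending = 1
      }
      prev = key
    }'
}

# Usage with the three-column sample data (tab-separated), comparing on field 2.
printf 'folder a\tfile 1\tsuffixA\nfolder b\tfile 1\tsuffixB\nfolder b\tfile 2\tsuffixA\nfolder c\tfile 3\tsuffixA\n' |
  uniq_dups_consider_fields '\t' 2
```

Here the first two lines share the second field "file 1" and are the
only ones printed.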
WORKAROUND MEANWHILE: Insert a regex find/replace step in the pipe
before uniq that moves all the data to be ignored in the comparison to
the front of each line, then skip it with --skip-fields.
BENEFIT: Of course it would be much more convenient to work with the
data as-is, and have the options --consider-fields and
--ignore-fields.
Regards, Stefan Nowak