From unknown Sun Jun 22 11:33:11 2025
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: MIME-tools 5.509 (Entity 5.509)
Content-Type: text/plain; charset=utf-8
From: bug#14555 <14555@debbugs.gnu.org>
To: bug#14555 <14555@debbugs.gnu.org>
Subject: Status: Facing Some problem in uniq command
Reply-To: bug#14555 <14555@debbugs.gnu.org>
Date: Sun, 22 Jun 2025 18:33:11 +0000

retitle 14555 Facing Some problem in uniq command
reassign 14555 coreutils
submitter 14555 Shahid Hussain
severity 14555 normal
tag 14555 moreinfo
thanks

From shnx88@gmail.com Tue Jun 4 17:37:27 2013
Received: (at submit) by debbugs.gnu.org; 4 Jun 2013 16:20:26 +0000
From: Shahid Hussain
Subject: Facing Some problem in uniq command
To: bug-coreutils@gnu.org
Date: Tue, 4 Jun 2013 17:37:27 +0530
X-Debbugs-Envelope-To: submit

I have a file (named 'a')which contains following data.
"";
8003
8004
8005
8010
9040
9041
9042
8336
8336
8337
8338
8338
8339
8340
8341
9000
9216
9217
9218
9219
9220
9221
9232
9233
9234
9248
9249
9250
9251
9264
9265
9280
9296
9281
9297
9001
9226
9040
9040
15008
9041
9042
15009
15010
6169
6170
18000
18000

*************************************************
And Below is the commands i am executing along with its output with comments.
[ussc@lab211 config]$ uniq -d a
8336
8338
//Displaying one duplicate entry But so many duplicate entries are there in file
[ussc@lab211 config]$ uniq -D a
8336
8336
8338
8338
//Displaying only two duplicate entry But so many duplicate entries are there in file
[ussc@lab211 config]$ uniq -c a
      1 "";
      1 8003
      1 8004
      1 8005
      1 8010
      1 9040
      1 9041
      1 9042
      2 8336
      1 8337
      2 8338
      1 8339
      1 8340
      1 8341
      1 9000
      1 9216
      1 9217
      1 9218
      1 9219
      1 9220
      1 9221
      1 9232
      1 9233
      1 9234
      1 9248
      1 9249
      1 9250
      1 9251
      1 9264
      1 9265
      1 9280
      1 9296
      1 9281
      1 9297
      1 9001
      1 9226
      1 9040
      1 9040
      1 15008
      1 9041
      1 9042
      1 15009
      1 15010
      1 6169
      1 6170
      1 18000
      1 18000
//Observe last line which is repeated with its previous line (some other entries are also there)but uniq command not able to find it.

Please check it once and let me know if i am wrong.

Thanks and Regards,
Shahid Hussain
Bangalore.

From debbugs-submit-bounces@debbugs.gnu.org Tue Jun 04 12:45:57 2013
Message-ID: <51AE193A.8050805@draigBrady.com>
Date: Tue, 04 Jun 2013 17:43:38 +0100
From: Pádraig Brady
To: Shahid Hussain
Subject: Re: bug#14555: Facing Some problem in uniq command
Content-Type: text/plain; charset=ISO-8859-1
Cc: 14555@debbugs.gnu.org

On 06/04/2013 01:07 PM, Shahid Hussain wrote:
> I have a file (named 'a')which contains following data.
> "";
> 8003
> 8004
> 8005
> 8010
> 9040
> 9041
> 9042
> 8336
> 8336
> 8337
> 8338
> 8338
> 8339
> 8340
> 8341
> 9000
> 9216
> 9217
> 9218
> 9219
> 9220
> 9221
> 9232
> 9233
> 9234
> 9248
> 9249
> 9250
> 9251
> 9264
> 9265
> 9280
> 9296
> 9281
> 9297
> 9001
> 9226
> 9040
> 9040
> 15008
> 9041
> 9042
> 15009
> 15010
> 6169
> 6170
> 18000
> 18000
>
> *************************************************
> And Below is the commands i am executing along with its output with
> comments.
> [ussc@lab211 config]$ uniq -d a
> 8336
> 8338
> //Displaying one duplicate entry But so many duplicate entries are there in
> file
> [ussc@lab211 config]$ uniq -D a
> 8336
> 8336
> 8338
> 8338
> //Displaying only two duplicate entry But so many duplicate entries are
> there in file
> [ussc@lab211 config]$ uniq -c a
>       1 "";
>       1 8003
>       1 8004
>       1 8005
>       1 8010
>       1 9040
>       1 9041
>       1 9042
>       2 8336
>       1 8337
>       2 8338
>       1 8339
>       1 8340
>       1 8341
>       1 9000
>       1 9216
>       1 9217
>       1 9218
>       1 9219
>       1 9220
>       1 9221
>       1 9232
>       1 9233
>       1 9234
>       1 9248
>       1 9249
>       1 9250
>       1 9251
>       1 9264
>       1 9265
>       1 9280
>       1 9296
>       1 9281
>       1 9297
>       1 9001
>       1 9226
>       1 9040
>       1 9040
>       1 15008
>       1 9041
>       1 9042
>       1 15009
>       1 15010
>       1 6169
>       1 6170
>       1 18000
>       1 18000
> //Observe last line which is repeated with its previous line (some other
> entries are also there)but uniq command not able to find it.
>
>
>
> Please check it once and let me know if i am wrong.
> Thanks and Regards,
> Shahid Hussain
> Bangalore.
>
>

Note 9041 is also repeated but you won't see that until you sort first,
though that's not your specific issue here.

Perhaps you have mixed \n and \r\n line endings or something?
This might be informative?

  tail -n2 a | od -Ax -tx1z -v

thanks,
Pádraig.

From debbugs-submit-bounces@debbugs.gnu.org Tue Jun 04 13:26:08 2013
Message-ID: <51AE1621.3000609@redhat.com>
Date: Tue, 04 Jun 2013 10:30:25 -0600
From: Eric Blake
Organization: Red Hat, Inc.
To: Shahid Hussain
Cc: GNU bug tracker automated control server, 14555@debbugs.gnu.org
Subject: Re: bug#14555: Facing Some problem in uniq command
Content-Type: text/plain; charset=UTF-8

tag 14555 moreinfo
thanks

On 06/04/2013 06:07 AM, Shahid Hussain wrote:
> I have a file (named 'a')which contains following data.

> 9041
> 9042
> 8336
...

> 9041

Ouch. Your file is not sorted. Therefore, 9041 is NOT unique when run
through 'uniq', which only compares adjacent lines.

> And Below is the commands i am executing along with its output with
> comments.
> [ussc@lab211 config]$ uniq -d a
> 8336
> 8338

I get different results when copying and pasting from your email:
$ uniq -d a
8336
8338
9040
18000
$ uniq --version | head -n1
uniq (GNU coreutils) 8.17

Could it be you are using an older version of coreutils, and we have
fixed a bug in the meantime for how unique behaves when presented an
unsorted file?

>       1 18000
>       1 18000
> //Observe last line which is repeated with its previous line (some other
> entries are also there)but uniq command not able to find it.

One other possibility: Are you sure the whitespace is identical on every
line? Or could you have trailing whitespace on one line but not the
other (such as a carriage return), so that the lines really are not
unique even though they appeared unique? If so, that would explain why
_my_ uniq run counted 18000 as a duplicate, if the act of sending the
email and then me copying and pasting into a file munged the whitespace
differences away.

While I suspect that there is no bug in coreutils, I need more
information from you to confirm that claim, so I'm leaving the bug open
for now.

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 05 01:19:56 2013
In-Reply-To: <51AE1621.3000609@redhat.com>
References: <51AE1621.3000609@redhat.com>
Date: Wed, 5 Jun 2013 10:47:43 +0530
Subject: Re: bug#14555: Facing Some problem in uniq command
From: Shahid Hussain
To: Eric Blake
Cc: GNU bug tracker automated control server, 14555@debbugs.gnu.org
Content-Type: multipart/alternative; boundary=089e0122f84c438b8504de6151e4

--089e0122f84c438b8504de6151e4
Content-Type: text/plain; charset=ISO-8859-1

Hi,

Appreciate your quick reply. What exactly i m doing is there are so many
files in my product which contains some data in "name = value" format.
By using some pattern i m extracting only "value" field from all files and
redirecting the output to one temporarily file as i do not want any value
to be repeated in any file. And here i m applying uniq command to this
temporary file (by pipe lining sort [sort |uniq -c tempFile]) But i am
unable to get expected result.

But as you have told whitespace also should be identical at every line so
this might be the problem in my case. Because when i displayed content of
file using cat command and manually copied the same data to another file
and then tried uniq with sort command it works fine.

So it is fine for me but it would be too better if there could be an option
in uniq command to work fine even if whitespace is not identical :).

Lot of thanks,
shahid hussain

On Tue, Jun 4, 2013 at 10:00 PM, Eric Blake wrote:

> tag 14555 moreinfo
> thanks
>
> On 06/04/2013 06:07 AM, Shahid Hussain wrote:
> > I have a file (named 'a')which contains following data.
>
> > 9041
> > 9042
> > 8336
> ...
>
> > 9041
>
> Ouch. Your file is not sorted. Therefore, 9041 is NOT unique when run
> through 'uniq', which only compares adjacent lines.
>
> > And Below is the commands i am executing along with its output with
> > comments.
> > [ussc@lab211 config]$ uniq -d a
> > 8336
> > 8338
>
> I get different results when copying and pasting from your email:
> $ uniq -d a
> 8336
> 8338
> 9040
> 18000
> $ uniq --version | head -n1
> uniq (GNU coreutils) 8.17
>
> Could it be you are using an older version of coreutils, and we have
> fixed a bug in the meantime for how unique behaves when presented an
> unsorted file?
>
> >       1 18000
> >       1 18000
> > //Observe last line which is repeated with its previous line (some other
> > entries are also there)but uniq command not able to find it.
>
> One other possibility: Are you sure the whitespace is identical on every
> line?
> Or could you have trailing whitespace on one line but not the
> other (such as a carriage return), so that the lines really are not
> unique even though they appeared unique? If so, that would explain why
> _my_ uniq run counted 18000 as a duplicate, if the act of sending the
> email and then me copying and pasting into a file munged the whitespace
> differences away.
>
> While I suspect that there is no bug in coreutils, I need more
> information from you to confirm that claim, so I'm leaving the bug open
> for now.
>
> --
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
>

--089e0122f84c438b8504de6151e4
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
--089e0122f84c438b8504de6151e4--

From debbugs-submit-bounces@debbugs.gnu.org Wed Jun 05 11:08:50 2013
Date: Wed, 5 Jun 2013 09:06:38 -0600
From: Bob Proulx
To: Shahid Hussain
Cc: 14555@debbugs.gnu.org
Subject: Re: bug#14555: Facing Some problem in uniq command
Message-ID: <20130605150638.GA12710@hysteria.proulx.com>
References: <51AE1621.3000609@redhat.com>
Reply-To: 14555@debbugs.gnu.org
Content-Type: text/plain; charset=us-ascii

Shahid Hussain wrote:
> Appreciate your quick reply. What exactly i m doing is there are so many
> files in my product which contains some data in "name = value" format. By
> using some pattern i m extracting only "value" field from all files and
> redirecting the output to one temporarily file as i do not want any value
> to be repeated in any file.
> And here i m applying uniq command to this
> temporary file (by pipe lining sort [sort |uniq -c tempFile]) But i am
> unable to get expected result.

It might be better if in your script you set:

  #!/bin/sh
  LC_ALL=C
  export LC_ALL
  ...
  sort | uniq
  ...

That will force a standard sort order everywhere in your script.

> But as you have told whitespace also should be identical at every line so
> this might be the problem in my case. Because when i displayed content of
> file using cat command and manually copied the same data to another file
> and then tried uniq with sort command it works fine.

Without knowing enough about your data a quick and dirty hack to clean
up whitespace might be to pass it through awk.

  awk '{print$1}' somefile1 | sort | uniq ...

Since awk splits on whitespace this will only print the first field
and any whitespace or additional anything will be discarded.

> So it is fine for me but it would be too better if there could be an option
> in uniq command to work fine even if whitespace is not identical :).

No. The way is not to use an option. The way is to prepare the data
without whitespace differences. You have the option of using tools
like awk to split on whitespace while preparing the data. Preparing
the data to avoid whitespace differences is the right option to use.
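
A minimal sketch of the data preparation described above, assuming each
value is a single whitespace-free token and that stray carriage returns or
trailing blanks are what make otherwise equal lines differ. The function
name find_duplicates and the sample input are hypothetical and do not come
from this thread:

```shell
#!/bin/sh
# Illustrative only: find_duplicates is a hypothetical helper name.
# Force byte-wise collation so sort and uniq agree on line order.
LC_ALL=C
export LC_ALL

find_duplicates() {
    # tr deletes carriage returns left over from \r\n line endings.
    # awk skips blank lines and prints only the first field, which
    # discards leading and trailing blanks.
    # sort makes equal values adjacent, so uniq -d can report them.
    tr -d '\r' | awk 'NF { print $1 }' | sort | uniq -d
}

# Sample input: 18000 appears with both \r\n and \n endings, and one
# 9041 carries a trailing space.
printf '9041\n18000\r\n9042\n9041 \n18000\n' | find_duplicates
```

With this sample input the pipeline reports 18000 and 9041. Note that uniq
reads the sorted stream only when it is given no file operand; a command
such as "sort | uniq -c tempFile" makes uniq read tempFile directly and
ignore the pipe.
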
Bob

From debbugs-submit-bounces@debbugs.gnu.org Tue Oct 23 18:41:17 2018
Subject: Re: bug#14555: Facing Some problem in uniq command
To: 14555@debbugs.gnu.org, Shahid Hussain
References: <51AE1621.3000609@redhat.com> <20130605150638.GA12710@hysteria.proulx.com>
From: Assaf Gordon
Date: Tue, 23 Oct 2018 16:41:08 -0600
In-Reply-To: <20130605150638.GA12710@hysteria.proulx.com>
Content-Type: text/plain; charset=utf-8; format=flowed

close 14555
stop

(triaging old bugs)

On 05/06/13 09:06 AM, Bob Proulx wrote:
> Shahid Hussain wrote:
>> Appreciate your quick reply. What exactly i m doing is there are so many
>> files in my product which contains some data in "name = value" format. By
>> using some pattern i m extracting only "value" field from all files and
>> redirecting the output to one temporarily file as i do not want any value
>> to be repeated in any file.
>> And here i m applying uniq command to this
>> temporary file (by pipe lining sort [sort |uniq -c tempFile]) But i am
>> unable to get expected result.
>
> It might be better if in your script you set:
>
>   #!/bin/sh
>   LC_ALL=C
>   export LC_ALL
>   ...
>   sort | uniq
>   ...
>
> That will force a standard sort order everywhere in your script.
>
>> But as you have told whitespace also should be identical at every line so
>> this might be the problem in my case. Because when i displayed content of
>> file using cat command and manually copied the same data to another file
>> and then tried uniq with sort command it works fine.
>
> Without knowing enough about your data a quick and dirty hack to clean
> up whitespace might be to pass it through awk.
>
>   awk '{print$1}' somefile1 | sort | uniq ...
>
> Since awk splits on whitespace this will only print the first field
> and any whitespace or additional anything will be discarded.
>
>> So it is fine for me but it would be too better if there could be an option
>> in uniq command to work fine even if whitespace is not identical :).
>
> No. The way is not to use an option. The way is to prepare the data
> without whitespace differences. You have the option of using tools
> like awk to split on whitespace while preparing the data. Preparing
> the data to avoid whitespace differences is the right option to use.
>

With no further comments in 5 years, I'm closing this bug.
-assaf

From debbugs-submit-bounces@debbugs.gnu.org Tue Oct 30 00:29:50 2018
To: control@debbugs.gnu.org
From: Assaf Gordon
Message-ID: <29a1af5d-c362-3ac8-ab6e-9394775de058@gmail.com>
Date: Mon, 29 Oct 2018 22:29:39 -0600
Content-Type: text/plain; charset=utf-8; format=flowed

close 14555
stop

From unknown Sun Jun 22 11:33:11 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Tue, 27 Nov 2018 12:24:09 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator