GNU bug report logs - #14555
Facing Some problem in uniq command

Package: coreutils

Reported by: Shahid Hussain <shnx88 <at> gmail.com>

Date: Tue, 4 Jun 2013 16:21:02 UTC

Severity: normal

Tags: moreinfo

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

From: Bob Proulx <bob <at> proulx.com>
To: Shahid Hussain <shnx88 <at> gmail.com>
Cc: 14555 <at> debbugs.gnu.org
Subject: bug#14555: Facing Some problem in uniq command
Date: Wed, 5 Jun 2013 09:06:38 -0600

Shahid Hussain wrote:
> I appreciate your quick reply. What I am doing is this: my product has many
> files that contain data in "name =  value" format. Using a pattern, I extract
> only the "value" field from all the files and redirect the output to one
> temporary file, because I do not want any value to be repeated in any file.
> I then apply the uniq command to this temporary file (by pipelining with
> sort [sort |uniq -c tempFile]), but I am unable to get the expected result.

It might be better to set the following in your script:

  #!/bin/sh
  LC_ALL=C
  export LC_ALL
  ...
  sort | uniq
  ...

That will force a standard sort order everywhere in your script.
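
As a rough sketch of the effect, assuming some made-up sample values in
place of your real data: in the C locale, lines are compared byte by
byte, so "Apple" and "apple" are different lines and the uppercase one
sorts first.

```shell
#!/bin/sh
# Force the C locale so that sort and uniq compare lines byte by byte.
LC_ALL=C
export LC_ALL

# Hypothetical sample values standing in for the extracted data.
result=$(printf '%s\n' banana Apple apple banana Apple | sort | uniq -c)

# Each output line is a count followed by the value it counts.
printf '%s\n' "$result"
```

Because sort and uniq run under the same exported locale, identical
lines always end up adjacent before uniq counts them.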

> But as you said, whitespace must also be identical on every line, so
> this might be the problem in my case. When I displayed the content of the
> file using the cat command, manually copied the same data to another file,
> and then tried uniq with sort, it worked fine.

Without knowing more about your data, a quick and dirty hack to clean
up whitespace might be to pass it through awk:

  awk '{print$1}' somefile1 | sort | uniq ...

Since awk splits on whitespace, this will print only the first field;
surrounding whitespace and any additional fields will be discarded.
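
To see the difference this makes, here is a small sketch. The sample()
function and its contents are hypothetical; they only imitate values
that differ in leading or trailing whitespace.

```shell
#!/bin/sh
LC_ALL=C
export LC_ALL

# Hypothetical input: the same values with different surrounding whitespace.
# printf expands \t to a tab and \n to a newline.
sample() {
  printf 'alpha\n  alpha\nalpha \t\nbeta\nbeta  \n'
}

# Without cleanup, lines that differ only in whitespace stay distinct.
before=$(sample | sort | uniq | wc -l | tr -d ' ')

# awk splits on runs of blanks, so $1 carries no surrounding whitespace.
after=$(sample | awk '{print $1}' | sort | uniq | wc -l | tr -d ' ')

echo "distinct lines before cleanup: $before"
echo "distinct lines after cleanup:  $after"
```

Note that printing only $1 also drops everything after the first blank
inside a value, so this suits single-word values only.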

> So this is fine for me, but it would be even better if the uniq command had
> an option to work even when whitespace is not identical :).

No.  An option is not the way.  The way is to prepare the data without
whitespace differences, for example by using tools like awk to split
on whitespace.  Preparing the data this way is the right approach.
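
One possible way to prepare "name = value" data, as a sketch only: the
config_lines() function and its contents are invented for illustration,
and the approach assumes each line holds exactly one "=" with the value
after it.

```shell
#!/bin/sh
LC_ALL=C
export LC_ALL

# Hypothetical "name = value" lines with inconsistent spacing.
config_lines() {
  printf 'color = red\nshade=red  \ntint =\tred\nsize = large\n'
}

# Split each line on "=", then strip leading and trailing blanks
# (spaces and tabs) from the value before sorting and counting.
values=$(config_lines |
  awk -F'=' '{ v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); print v }' |
  sort | uniq -c)

printf '%s\n' "$values"
```

Unlike printing only the first field, this keeps blanks inside a value,
but a value that itself contains "=" would be cut short.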

Bob




This bug report was last modified 6 years and 257 days ago.

