GNU bug report logs - #14555
Facing Some problem in uniq command
Reported by: Shahid Hussain <shnx88 <at> gmail.com>
Date: Tue, 4 Jun 2013 16:21:02 UTC
Severity: normal
Tags: moreinfo
Done: Assaf Gordon <assafgordon <at> gmail.com>
Bug is archived. No further changes may be made.
Message #22 received at 14555 <at> debbugs.gnu.org (full text, mbox):
close 14555
stop
(triaging old bugs)
On 05/06/13 09:06 AM, Bob Proulx wrote:
> Shahid Hussain wrote:
>> I appreciate your quick reply. Here is exactly what I am doing: many
>> files in my product contain data in "name = value" format. Using a
>> pattern, I extract only the "value" field from all the files and
>> redirect the output to one temporary file, because I do not want any
>> value to be repeated in any file. I then apply the uniq command to this
>> temporary file (by pipelining sort [sort |uniq -c tempFile]), but I am
>> unable to get the expected result.
>
> It might be better if in your script you set:
>
> #!/bin/sh
> LC_ALL=C
> export LC_ALL
> ...
> sort | uniq
> ...
>
> That will force a standard sort order everywhere in your script.
>
>> But as you have said, whitespace must also be identical on every line, so
>> this might be the problem in my case: when I displayed the content of the
>> file using the cat command, manually copied the same data to another file,
>> and then tried uniq with sort, it worked fine.
>
> Without knowing enough about your data a quick and dirty hack to clean
> up whitespace might be to pass it through awk.
>
> awk '{print$1}' somefile1 | sort | uniq ...
>
> Since awk splits on whitespace, this will print only the first field;
> any surrounding whitespace and any additional fields will be discarded.
>
>> So that is fine for me, but it would be even better if the uniq command
>> had an option to work even when whitespace is not identical :).
>
> No. The way is not to use an option. The way is to prepare the data
> without whitespace differences. You have the option of using tools
> like awk to split on whitespace while preparing the data. Preparing
> the data to avoid whitespace differences is the right option to use.
>
With no further comments in 5 years, I'm closing this bug.
-assaf
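
The advice quoted above can be illustrated with a small sketch. It is not taken from the reporter's actual script: the sample values and the temporary file are hypothetical, chosen only to show how lines that differ in whitespace alone are counted as distinct by uniq, and how taking the first awk field before sorting collapses them. Note also that uniq reads a file operand instead of standard input, so if the command was run literally as "sort | uniq -c tempFile", uniq may have been reading the unsorted file directly.

```shell
#!/bin/sh
# Force byte-wise collation so sort and uniq agree on ordering and equality.
LC_ALL=C
export LC_ALL

# Hypothetical sample data: two values, padded with stray spaces and a tab.
tmpfile=$(mktemp)
printf 'alpha\nbeta \n alpha\nbeta\nalpha\t\n' > "$tmpfile"

# Without cleanup, every line differs in whitespace, so none are merged.
raw_count=$(sort "$tmpfile" | uniq | wc -l | tr -d ' ')

# awk splits on runs of blanks, so $1 is the value without padding.
clean_count=$(awk '{print $1}' "$tmpfile" | sort | uniq | wc -l | tr -d ' ')

rm -f "$tmpfile"

echo "distinct lines before cleanup: $raw_count"
echo "distinct lines after cleanup:  $clean_count"
```

This assumes each value is a single word. If values can contain internal spaces, printing only the first field would truncate them, and trimming leading and trailing whitespace with another tool would be a safer choice.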
This bug report was last modified 6 years and 257 days ago.