GNU bug report logs - #24527
Problem while sorting comma separated values using sort command.
Reported by: Jash Dave <jashdave23 <at> gmail.com>
Date: Sat, 24 Sep 2016 13:13:02 UTC
Severity: normal
Tags: notabug
Done: Assaf Gordon <assafgordon <at> gmail.com>
Bug is archived. No further changes may be made.
tag 24527 notabug
close 24527
stop
On 09/24/2016 11:44 AM, Jash Dave wrote:
> There is a problem when sorting comma-separated entries (specifically
> numbers). Even when the separator is set to a comma, sort reads the
> number columns that follow as part of the key, and doesn't treat the
> comma as a separator between them.
>
> If I use the command:
> sort -t"," -k1 -n Example.csv
>
> Example.csv :
> 1,100,a,1,a
> 4,1000,d,4,c
> 3,1002,c,3,c
> 22,10,a,2,b
>
> Output:
> 1,100,a,1,a
> 22,10,a,2,b
> 3,1002,c,3,c
> 4,1000,d,4,c
>
>
> Expected:
> 1,100,a,1,a
> 3,1002,c,3,c
> 4,1000,d,4,c
> 22,10,a,2,b
>
> But it works with column 2 or 4, since there are no following numbers.
When having trouble with sort, usually the --debug option helps:
$ sort --debug -t"," -k1 -n
sort: using ‘en_US.UTF-8’ sorting rules
sort: key 1 is numeric and spans multiple fields
1,100,a,1,a
4,1000,d,4,c
3,1002,c,3,c
22,10,a,2,b
1,100,a,1,a
______
___________
22,10,a,2,b
______
___________
3,1002,c,3,c
_______
____________
4,1000,d,4,c
_______
____________
Aha, the key spans multiple fields: -k1 without an end position
runs from field 1 to the end of the line.
As you want to sort on the first field only, you need to
tell sort to do so by ending the key at field 1:
$ sort --debug -t"," -k1,1 -n
sort: using ‘en_US.UTF-8’ sorting rules
1,100,a,1,a
4,1000,d,4,c
3,1002,c,3,c
22,10,a,2,b
1,100,a,1,a
_
___________
3,1002,c,3,c
_
____________
4,1000,d,4,c
_
____________
22,10,a,2,b
__
___________
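The same result can be checked without --debug. The snippet below is a
minimal sketch, not part of the original exchange: it feeds the sample
rows from the report to sort on standard input, and the variable names
(input, sorted) are purely illustrative.

```shell
# Sample rows from the report, fed on standard input instead of Example.csv.
input='1,100,a,1,a
4,1000,d,4,c
3,1002,c,3,c
22,10,a,2,b'

# -t, sets the field separator; -k1,1 starts and ends the key at field 1;
# -n compares that key numerically.
sorted=$(printf '%s\n' "$input" | sort -t, -k1,1 -n)
printf '%s\n' "$sorted"
```

Because the key stops at the end of field 1, the numbers in field 2 no
longer take part in the comparison.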
Therefore, I'm marking this as not a bug in sort.
Thanks & have a nice day,
Berny