GNU bug report logs - #6972
bug in sorting floats

Package: coreutils;

Reported by: saddy <sadmail <at> gmx.de>

Date: Thu, 2 Sep 2010 15:48:01 UTC

Severity: normal

Tags: notabug

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.


Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#6972; Package coreutils. (Thu, 02 Sep 2010 15:48:01 GMT)

Acknowledgement sent to saddy <sadmail <at> gmx.de>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Thu, 02 Sep 2010 15:48:01 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: saddy <sadmail <at> gmx.de>
To: bug-coreutils <at> gnu.org
Subject: bug in sorting floats
Date: Thu, 02 Sep 2010 11:25:10 +0200
Hi, I want to sort a file by the 6th column:

test.txt

AvrBs2<- Fer2 ->  XopE1 0.0844155844155844 ((1.1e+03)*(1.1e+03))
AvrBs2<- UPF0302 ->  XopE1 0 ((40)*(5.2e+02))
AvrBs1<- Calc_CGRP_IAPP ->  XopH 0.1 ((2.6e+02)*(4.2e+02))
AvrBs2<- DUF3248 ->  XopE1 0.0476190476190476 ((7.6e+02)*(5.7e+02))
AvrBs2<- MIG-14_Wnt-bd ->  XopE1 0 ((3.6e+03)*(3.6e+03))
XopX<- Zw10 ->  XopJ1 0.0218487394957983 ((1.6e+02)*(7e+02))
XopX<- Zw10 ->  XopJ2 0.019327731092437 ((1.6e+02)*(1.7e+03))
AvrBs1<- 3H ->  XopJ4 0.153061224489796 ((3.4e+02)*(25))
AvrBs2<- Ubiq_cyt_C_chap ->  XopE1 0 ((7.6e+02)*(1.8e+03))

There's no sort option to do that correctly:

saddy <at> lapsdy:~$ sort -k6g test.txt
AvrBs1<- 3H ->  XopJ4 0.153061224489796 ((3.4e+02)*(25))
AvrBs1<- Calc_CGRP_IAPP ->  XopH 0.1 ((2.6e+02)*(4.2e+02))
AvrBs2<- DUF3248 ->  XopE1 0.0476190476190476 ((7.6e+02)*(5.7e+02))
AvrBs2<- Fer2 ->  XopE1 0.0844155844155844 ((1.1e+03)*(1.1e+03))
AvrBs2<- MIG-14_Wnt-bd ->  XopE1 0 ((3.6e+03)*(3.6e+03))
AvrBs2<- Ubiq_cyt_C_chap ->  XopE1 0 ((7.6e+02)*(1.8e+03))
AvrBs2<- UPF0302 ->  XopE1 0 ((40)*(5.2e+02))
XopX<- Zw10 ->  XopJ1 0.0218487394957983 ((1.6e+02)*(7e+02))
XopX<- Zw10 ->  XopJ2 0.019327731092437 ((1.6e+02)*(1.7e+03))

saddy <at> lapsdy:~$ sort -k6n test.txt
AvrBs2<- MIG-14_Wnt-bd ->  XopE1 0 ((3.6e+03)*(3.6e+03))
AvrBs2<- Ubiq_cyt_C_chap ->  XopE1 0 ((7.6e+02)*(1.8e+03))
AvrBs2<- UPF0302 ->  XopE1 0 ((40)*(5.2e+02))
AvrBs1<- Calc_CGRP_IAPP ->  XopH 0.1 ((2.6e+02)*(4.2e+02))
XopX<- Zw10 ->  XopJ2 0.019327731092437 ((1.6e+02)*(1.7e+03))
AvrBs1<- 3H ->  XopJ4 0.153061224489796 ((3.4e+02)*(25))
XopX<- Zw10 ->  XopJ1 0.0218487394957983 ((1.6e+02)*(7e+02))
AvrBs2<- DUF3248 ->  XopE1 0.0476190476190476 ((7.6e+02)*(5.7e+02))
AvrBs2<- Fer2 ->  XopE1 0.0844155844155844 ((1.1e+03)*(1.1e+03))

saddy <at> lapsdy:~$ sort -k6 test.txt
XopX<- Zw10 ->  XopJ2 0.019327731092437 ((1.6e+02)*(1.7e+03))
XopX<- Zw10 ->  XopJ1 0.0218487394957983 ((1.6e+02)*(7e+02))
AvrBs2<- DUF3248 ->  XopE1 0.0476190476190476 ((7.6e+02)*(5.7e+02))
AvrBs2<- Fer2 ->  XopE1 0.0844155844155844 ((1.1e+03)*(1.1e+03))
AvrBs1<- Calc_CGRP_IAPP ->  XopH 0.1 ((2.6e+02)*(4.2e+02))
AvrBs1<- 3H ->  XopJ4 0.153061224489796 ((3.4e+02)*(25))
AvrBs2<- MIG-14_Wnt-bd ->  XopE1 0 ((3.6e+03)*(3.6e+03))
AvrBs2<- UPF0302 ->  XopE1 0 ((40)*(5.2e+02))
AvrBs2<- Ubiq_cyt_C_chap ->  XopE1 0 ((7.6e+02)*(1.8e+03))


Best regards





Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#6972; Package coreutils. (Thu, 02 Sep 2010 15:49:02 GMT)

Message #8 received at submit <at> debbugs.gnu.org:

From: saddy <sadmail <at> gmx.de>
To: bug-coreutils <at> gnu.org
Subject: Re: bug in sorting floats
Date: Thu, 02 Sep 2010 11:40:10 +0200
Seems to be fixed since version 8.5.

saddy <at> lapsdy:~$ sort --version
sort (GNU coreutils) 7.4








Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#6972; Package coreutils. (Thu, 02 Sep 2010 17:34:02 GMT)

Message #11 received at 6972 <at> debbugs.gnu.org:

From: Eric Blake <eblake <at> redhat.com>
To: saddy <sadmail <at> gmx.de>
Cc: 6972 <at> debbugs.gnu.org
Subject: Re: bug#6972: bug in sorting floats
Date: Thu, 02 Sep 2010 11:35:19 -0600
On 09/02/2010 03:25 AM, saddy wrote:
> Hi, I want to sort a file by the 6th column:
>
> test.txt
>
> AvrBs2<- Fer2 -> XopE1 0.0844155844155844 ((1.1e+03)*(1.1e+03))
> AvrBs2<- UPF0302 -> XopE1 0 ((40)*(5.2e+02))
> AvrBs1<- Calc_CGRP_IAPP -> XopH 0.1 ((2.6e+02)*(4.2e+02))
> AvrBs2<- DUF3248 -> XopE1 0.0476190476190476 ((7.6e+02)*(5.7e+02))
> AvrBs2<- MIG-14_Wnt-bd -> XopE1 0 ((3.6e+03)*(3.6e+03))
> XopX<- Zw10 -> XopJ1 0.0218487394957983 ((1.6e+02)*(7e+02))
> XopX<- Zw10 -> XopJ2 0.019327731092437 ((1.6e+02)*(1.7e+03))
> AvrBs1<- 3H -> XopJ4 0.153061224489796 ((3.4e+02)*(25))
> AvrBs2<- Ubiq_cyt_C_chap -> XopE1 0 ((7.6e+02)*(1.8e+03))

Thanks for the report.  However, I can't tell what you are trying to do,
so I can't say whether the issue is a problem in your usage of sort
(likely) or a bug in sort itself (less likely, but it has been known to
happen).  Exactly what order are you expecting?  Column 6 is not a
floating point number, but an arithmetic expression, so asking to sort
by column 6 doesn't really make sense to me.  The output of the new
(post-8.5) sort --debug feature is telling:

$ sort --debug -k6g test.txt
sort: using `en_US.UTF-8' sorting rules
sort: key 1 is numeric and spans multiple fields
AvrBs1<- 3H ->  XopJ4 0.153061224489796 ((3.4e+02)*(25))
                                        ^ no match for key
________________________________________________________
...

Did you mean sorting by column 5, as in 0.153061224489796, in which case 
you should be using -k5,5g?
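
A minimal sketch of the -k5,5g suggestion above. The three input lines are a reduced sample taken from the report, in the space-collapsed layout where the decimal number is field 5; the variable name `sorted` and the use of LC_ALL=C (to pin '.' as the decimal point) are assumptions made for illustration.

```shell
# Reduced sample from the report; field 5 holds the plain decimal number.
# -k5,5g restricts the key to field 5 and compares it as a general number.
# LC_ALL=C is an assumption here: it pins '.' as the decimal point.
sorted=$(printf '%s\n' \
  'AvrBs1<- 3H -> XopJ4 0.153061224489796 ((3.4e+02)*(25))' \
  'AvrBs2<- UPF0302 -> XopE1 0 ((40)*(5.2e+02))' \
  'XopX<- Zw10 -> XopJ2 0.019327731092437 ((1.6e+02)*(1.7e+03))' |
  LC_ALL=C sort -k5,5g)

printf '%s\n' "$sorted"
```

Writing the key as -k5,5 rather than -k5 matters: without the end position, the key runs from field 5 to the end of the line.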

Or are you actually trying to sort by the value that would be computed
if you performed floating point arithmetic on the expressions, such as
((3.4e+02)*(25)) sorting equivalent to 8500?  Sort does not support
performing math on a sort column, and I don't think that is a feature
that anyone has ever been clamoring for; to sort by a computed value,
you'd have to run the computation independently.
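
One way to "run the computation independently" is a decorate, sort, undecorate pipeline. This is only a sketch, not something proposed in the thread: it assumes every expression has exactly the form ((a)*(b)) as in the sample data, uses awk to compute the product, and introduces the illustrative variable name `by_product`.

```shell
# Decorate: prefix each line with the product computed from field 6.
# Sort:     order by that prefix as a general number.
# Undecorate: drop the prefix again.
# Assumes field 6 always looks like ((a)*(b)); other shapes are not handled.
by_product=$(printf '%s\n' \
  'AvrBs2<- UPF0302 -> XopE1 0 ((40)*(5.2e+02))' \
  'XopX<- Zw10 -> XopJ2 0.019327731092437 ((1.6e+02)*(1.7e+03))' \
  'AvrBs1<- 3H -> XopJ4 0.153061224489796 ((3.4e+02)*(25))' |
  LC_ALL=C awk '{
    e = $6
    gsub(/[()]/, "", e)          # "((3.4e+02)*(25))" becomes "3.4e+02*25"
    split(e, f, "[*]")           # f[1] and f[2] are the two factors
    printf "%.10g\t%s\n", f[1] * f[2], $0
  }' |
  LC_ALL=C sort -k1,1g |
  cut -f2-)

printf '%s\n' "$by_product"
```

For the sample above the computed keys are 8500, 20800 and 272000, so the 3H line comes first.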

-- 
Eric Blake   eblake <at> redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#6972; Package coreutils. (Thu, 02 Sep 2010 18:09:02 GMT)

Message #14 received at 6972 <at> debbugs.gnu.org:

From: Eric Blake <eblake <at> redhat.com>
To: saddy <sadmail <at> gmx.de>, 6972 <at> debbugs.gnu.org
Subject: Re: bug#6972: bug in sorting floats
Date: Thu, 02 Sep 2010 12:10:26 -0600
[forwarding on your mail to the list, to close this out]

On 09/02/2010 11:55 AM, saddy wrote:
> Thanks for your reply. In my last mail some spaces were lost, so
> now it is indeed column 5 to sort by.

The spaces were gone before your original mail hit my inbox.  From the 
mail headers of your original mail, I see you use Thunderbird:

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US;
	rv:1.9.1.11) Gecko/20100713 Thunderbird/3.0.6

which has a known issue where pasting text or replying to a prior email 
can result in Tbird eating spaces prior to words starting with <, >, and 
&.  That's probably the explanation.

> I've found out that it was my fault. My locale is de_DE.UTF-8,
> so sort expects ',' as the decimal separator. With an en locale it
> works fine.

Ah, an issue with locales.

And yes, the new 'sort --debug' tries to make locale issues obvious, 
since that's the first thing it outputs:

>> $ sort --debug -k6g test.txt
>> sort: using `en_US.UTF-8' sorting rules
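
On versions of sort without --debug, the decimal separator in effect can be checked with the POSIX locale utility, and a single invocation can be pinned to the C locale. The following is a sketch under those assumptions; the variable names are illustrative, and the three numbers are taken from the sample data in the report.

```shell
# Decimal-point character in the C locale, which is always ".".
c_point=$(LC_ALL=C locale decimal_point)
# Decimal-point character under the current environment
# (this would be "," under de_DE.UTF-8).
cur_point=$(locale decimal_point)
printf 'C locale: %s  current locale: %s\n' "$c_point" "$cur_point"

# Pin only this one sort invocation to the C locale so that "0.1" is
# parsed with '.' as the decimal point, whatever the user's locale is.
pinned=$(printf '0.1\n0.0844155844155844\n0\n' | LC_ALL=C sort -g)
printf '%s\n' "$pinned"
```

Setting LC_ALL on the command line affects only that command; the rest of the session keeps the user's locale.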

Glad to know it's not a bug.

-- 
Eric Blake   eblake <at> redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#6972; Package coreutils. (Sun, 07 Aug 2011 16:27:02 GMT)

Message #17 received at 6972 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Eric Blake <eblake <at> redhat.com>
Cc: 6972 <at> debbugs.gnu.org, saddy <sadmail <at> gmx.de>
Subject: Re: bug#6972: bug in sorting floats
Date: Sun, 07 Aug 2011 18:25:24 +0200
tags 6972 + notabug
close 6972
thanks

Eric Blake wrote:
> [forwarding on your mail to the list, to close this out]
...
>
> Glad to know it's not a bug.

Thanks.  Marking as "notabug" and closing.




Added tag(s) notabug. Request was from Jim Meyering <jim <at> meyering.net> to control <at> debbugs.gnu.org. (Sun, 07 Aug 2011 16:30:03 GMT)

bug closed, send any further explanations to 6972 <at> debbugs.gnu.org and saddy <sadmail <at> gmx.de> Request was from Jim Meyering <jim <at> meyering.net> to control <at> debbugs.gnu.org. (Sun, 07 Aug 2011 16:30:03 GMT)

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 05 Sep 2011 11:24:04 GMT)
