GNU bug report logs -
#9334
sort bug
Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#9334; Package coreutils. (Sat, 20 Aug 2011 20:28:01 GMT)
Acknowledgement sent to "ROGER GRAYDON CHRISTMAN" <dvl <at> psu.edu>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Sat, 20 Aug 2011 20:28:01 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
First: some version information:
sort (GNU coreutils) 8.4
I run a series of pipes, and after piping into 'sort -n', I see this:
1 12
1 4
5 16
9 20
The first column sorted correctly, numerically, but the second did not.
I do not have sufficient data to determine whether the second column
is sorted lexicographically, or simply ignored.
Roger Christman
Computer Science and Engineering
Pennsylvania State University
Reply sent to Bob Proulx <bob <at> proulx.com>: You have taken responsibility. (Mon, 22 Aug 2011 01:59:01 GMT)
Notification sent to "ROGER GRAYDON CHRISTMAN" <dvl <at> psu.edu>: bug acknowledged by developer. (Mon, 22 Aug 2011 01:59:01 GMT)
Message #10 received at 9334-done <at> debbugs.gnu.org (full text, mbox):
tags 9334 + notabug
thanks
ROGER GRAYDON CHRISTMAN wrote:
> First: some version information:
> sort (GNU coreutils) 8.4
Thanks!
> I run a series of pipes, and after piping into 'sort -n', I see this:
> 1 12
> 1 4
> 5 16
> 9 20
>
> The first column sorted correctly, numerically, but the second did not.
> I do not have sufficient data to determine whether the second column
> is sorted lexicographically, or simply ignored.
Thanks for the report, but what you are seeing is not a bug in sort;
it is a consequence of how sort was invoked. The sort criteria are
insufficiently qualified. Try this:
    sort -n -k1,1 -k2,2
Or my preference:
    sort -k1,1n -k2,2n
The reasoning is as found in the sort documentation:
    A pair of lines is compared as follows: `sort' compares each pair of
    fields, in the order specified on the command line, according to the
    associated ordering options, until a difference is found or no fields
    are left. If no key fields are specified, `sort' uses a default key of
    the entire line. Finally, as a last resort when all keys compare
    equal, `sort' compares entire lines as if no ordering options other
    than `--reverse' (`-r') were specified. The `--stable' (`-s') option
    disables this "last-resort comparison" so that lines in which all
    fields compare equal are left in their original relative order. The
    `--unique' (`-u') option also disables the last-resort comparison.
    ...
    `-n'
    `--numeric-sort'
    `--sort=numeric'
        Sort numerically. The number begins each line and consists of
        optional blanks, an optional `-' sign, and zero or more digits
        possibly separated by thousands separators, optionally followed by
        a decimal-point character and zero or more digits. An empty
        number is treated as `0'. ...
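Since no fields are specified, sort is using a default key of the
entire line. With -n, only the number at the start of that line is
compared numerically, and lines whose leading numbers tie fall through
to the last-resort comparison of the entire lines. Since you care about
sorting on fields, you should include sort field options.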
Bob
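As a minimal sketch of the difference described in the reply above, the two invocations can be compared on the reporter's four lines. The shuffled input order, the variable names, and the use of LC_ALL=C are assumptions made for illustration; they do not come from the report.

```shell
# The reporter's four lines, deliberately shuffled (illustrative input).
data='9 20
1 4
5 16
1 12'

# Global -n with no -k: the key is the whole line, but only the leading
# number is parsed. Lines that tie on it fall back to a byte-wise
# comparison of the entire line, so "1 12" lands before "1 4".
whole_line=$(printf '%s\n' "$data" | LC_ALL=C sort -n)

# Explicit numeric keys: compare field 1, then field 2, both as numbers.
by_field=$(printf '%s\n' "$data" | LC_ALL=C sort -k1,1n -k2,2n)

printf 'sort -n:\n%s\n\nsort -k1,1n -k2,2n:\n%s\n' "$whole_line" "$by_field"
```

Only the order of the two lines that start with 1 differs between the two results.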
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#9334; Package coreutils. (Mon, 22 Aug 2011 04:14:02 GMT)
Message #13 received at 9334 <at> debbugs.gnu.org (full text, mbox):
On Sunday, August 21, 2011, Bob Proulx <bob <at> proulx.com> wrote:
>> I run a series of pipes, and after piping into 'sort -n', I see this:
>> 1 12
>> 1 4
>> 5 16
>> 9 20
Out of curiosity, what does the output mean in this case? "Two lines,
starting with the number one, in their original order"; "two lines,
starting with the number one, also containing the strings '12' and '4',
and sorted lexicographically on those"; or something else entirely?
--
Aaron Davies
aaron.davies <at> gmail.com
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#9334; Package coreutils. (Mon, 22 Aug 2011 14:43:01 GMT)
Message #16 received at 9334 <at> debbugs.gnu.org (full text, mbox):
Thanks. I guess I misread "uses a default key of the entire line" as
"uses the entire line as keys by default", in which case, if the first
column compared equal, it would compare the second, then the third, etc.
I guess I don't know what "default key of the entire line" means with
respect to -n, since it apparently didn't treat "1 12" as "112" and
"1 4" as "14". I'm curious to find out what this phrase means in this context.
Roger Christman
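The "112 versus 14" reading can be checked directly. The sketch below uses made-up input lines (an assumption, not data from the report), chosen so that treating each line as one concatenated number would give a different order than parsing only the leading number.

```shell
# Illustrative input: leading numbers are 2, 10 and 1.
# Concatenating the digits would instead give 21, 100 and 199.
input='2 1
10 0
1 99'

# With global -n and no -k, sort parses only the number at the start of
# each line; the blank ends the number, so the second column is not part
# of the numeric value.
sorted=$(printf '%s\n' "$input" | LC_ALL=C sort -n)

printf '%s\n' "$sorted"
```

The lines come out ordered by their leading numbers (1, 2, 10), not by the concatenated values (21, 100, 199).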
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#9334; Package coreutils. (Mon, 22 Aug 2011 14:48:01 GMT)
Message #19 received at 9334 <at> debbugs.gnu.org (full text, mbox):
On 08/22/2011 07:47 AM, ROGER GRAYDON CHRISTMAN wrote:
> Thanks. I guess I misinterpreted "uses a default key of the entire line"
> as "uses the entire line as keys by default", in which case if the first column
> was equal, it would compare the second, then the third, etc.
>
> I guess I don't know what "default key of the entire line" means with respect
> to -n,
> since it apparently didn't treat "1 12" as "112" and "1 4" as 14.
> I'm curious to find out what this phrase means in this context.
'sort --debug' is your friend. In the C locale, global -n means 'parse
as much of the prefix of the line as can be treated as a number as the
primary key, then treat the entire line as the secondary key'.
$ printf ' 1 12\n 1  4\n 5 16\n 9 20\n' | LC_ALL=C sort --debug -n
sort: using simple byte comparison
 1  4
 _
_____
 1 12
 _
_____
 5 16
 _
_____
 9 20
 _
_____
--
Eric Blake eblake <at> redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
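The "secondary key" in the explanation above is the last-resort comparison quoted earlier from the sort documentation, which `-s` disables. A minimal sketch follows, assuming a two-line input in which the tied lines arrive in the opposite order from the report; the variable names are illustrative.

```shell
# Two lines whose leading number ties; input order is "1 4", then "1 12".
input='1 4
1 12'

# Default: the tie is broken by comparing the whole lines byte by byte,
# so "1 12" moves ahead of "1 4" (the byte '1' sorts before '4').
last_resort=$(printf '%s\n' "$input" | LC_ALL=C sort -n)

# With -s (--stable) the last-resort comparison is disabled, so lines
# that tie on the numeric key keep their input order.
stable=$(printf '%s\n' "$input" | LC_ALL=C sort -n -s)

printf 'sort -n:\n%s\n\nsort -n -s:\n%s\n' "$last_resort" "$stable"
```

Under these assumptions, the order seen in the original report is the whole-line byte comparison of lines that tie numerically.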
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 20 Sep 2011 11:24:03 GMT)
This bug report was last modified 13 years and 275 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.