GNU bug report logs - #7214
sort --debug maps large old-style field number to 0 in diagnostic

Previous Next

Package: coreutils;

Reported by: Jim Meyering <jim <at> meyering.net>

Date: Thu, 14 Oct 2010 10:03:02 UTC

Severity: normal

Tags: wontfix

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 7214 in the body.
You can then email your comments to 7214 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7214; Package coreutils. (Thu, 14 Oct 2010 10:03:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Jim Meyering <jim <at> meyering.net>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Thu, 14 Oct 2010 10:03:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: bug-coreutils <at> gnu.org
Subject: sort --debug maps large old-style field number to 0 in diagnostic
Date: Thu, 14 Oct 2010 12:06:02 +0200
I noticed that using a field number of SIZE_MAX or larger makes --debug
give an invalid diagnostic:

  $ :|_POSIX2_VERSION=199209 src/sort --debug +$(echo 2^64-1|bc).4 -1.2
  src/sort: using simple byte comparison
  src/sort: obsolescent key `+0 -2' used; consider `-k 1,2' instead
  src/sort: leading blanks are significant in key 1; consider also specifying `b'




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7214; Package coreutils. (Thu, 14 Oct 2010 12:18:01 GMT) Full text and rfc822 format available.

Message #8 received at 7214 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: 7214 <at> debbugs.gnu.org
Subject: Re: bug#7214: sort --debug maps large old-style field number to 0
	in	diagnostic
Date: Thu, 14 Oct 2010 13:19:31 +0100
On 14/10/10 11:06, Jim Meyering wrote:
> I noticed that using a field number of SIZE_MAX or larger makes --debug
> give an invalid diagnostic:
> 
>   $ :|_POSIX2_VERSION=199209 src/sort --debug +$(echo 2^64-1|bc).4 -1.2
>   src/sort: using simple byte comparison
>   src/sort: obsolescent key `+0 -2' used; consider `-k 1,2' instead
>   src/sort: leading blanks are significant in key 1; consider also specifying `b'

I'd nearly call that corner case a feature,
as it informs you the passed count has been wrapped.
I.E. the obsolete syntax has a 1-less narrower range
and this is informing you of that fact.

printf "1 2\n" | src/sort -sb --debug +$((2**32-1)) -1
src/sort: using `en_IE.UTF-8' sorting rules
src/sort: obsolescent key `+0 -1' used; consider `-k 1,1' instead
1 2
_


It does mean though, that overflows on the start field
for obsolete syntax do sort the data, while overflows
with the -k syntax do not. We could detect that for
the old syntax, and map overflows to SIZE_MAX-1?

I suppose we could also add a debug warning for when
we do overflow, something along the lines of:

$ src/sort -sb --debug -k$((2**32)) /dev/null
src/sort: 4294967296 is too large, using 4294967295

@@ -3882,6 +3882,8 @@ parse_field_count (char const *string, size_t *val, char const *msgid)
     case LONGINT_OVERFLOW:
     case LONGINT_OVERFLOW | LONGINT_INVALID_SUFFIX_CHAR:
       *val = SIZE_MAX;
+      if (debug) /* Note --debug must come before keys to diagnose this.  */
+        error (0, 0, _("%" PRIuMAX " is too large, using %zu"), n, *val);
       break;

cheers,
Pádraig.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7214; Package coreutils. (Thu, 14 Oct 2010 12:41:01 GMT) Full text and rfc822 format available.

Message #11 received at 7214 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 7214 <at> debbugs.gnu.org
Subject: Re: bug#7214: sort --debug maps large old-style field number to 0
	in	diagnostic
Date: Thu, 14 Oct 2010 14:44:18 +0200
Pádraig Brady wrote:
> On 14/10/10 11:06, Jim Meyering wrote:
>> I noticed that using a field number of SIZE_MAX or larger makes --debug
>> give an invalid diagnostic:
>>
>>   $ :|_POSIX2_VERSION=199209 src/sort --debug +$(echo 2^64-1|bc).4 -1.2
>>   src/sort: using simple byte comparison
>>   src/sort: obsolescent key `+0 -2' used; consider `-k 1,2' instead
>>   src/sort: leading blanks are significant in key 1; consider also specifying `b'
>
> I'd nearly call that corner case a feature,
> as it informs you the passed count has been wrapped.
> I.E. the obsolete syntax has a 1-less narrower range
> and this is informing you of that fact.
>
> printf "1 2\n" | src/sort -sb --debug +$((2**32-1)) -1
> src/sort: using `en_IE.UTF-8' sorting rules
> src/sort: obsolescent key `+0 -1' used; consider `-k 1,1' instead
> 1 2
> _
>
>
> It does mean though, that overflows on the start field
> for obsolete syntax do sort the data, while overflows
> with the -k syntax do not. We could detect that for
> the old syntax, and map overflows to SIZE_MAX-1?
>
> I suppose we could also add a debug warning for when
> we do overflow, something along the lines of:
>
> $ src/sort -sb --debug -k$((2**32)) /dev/null
> src/sort: 4294967296 is too large, using 4294967295
>
> @@ -3882,6 +3882,8 @@ parse_field_count (char const *string, size_t *val, char const *msgid)
>      case LONGINT_OVERFLOW:
>      case LONGINT_OVERFLOW | LONGINT_INVALID_SUFFIX_CHAR:
>        *val = SIZE_MAX;
> +      if (debug) /* Note --debug must come before keys to diagnose this.  */
> +        error (0, 0, _("%" PRIuMAX " is too large, using %zu"), n, *val);

That does sound like an improvement (that would require comment changes),
but considering it's only with an outrageously large field number
and when using obsolescent key-specifying syntax, it's probably
not worth the trouble.  But if you want to add the above, that's
fine, too.

Note however, that would not change the mapping from a
field number of exactly SIZE_MAX to "0" in the diagnostic.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7214; Package coreutils. (Thu, 14 Oct 2010 13:14:02 GMT) Full text and rfc822 format available.

Message #14 received at 7214 <at> debbugs.gnu.org (full text, mbox):

From: Pádraig Brady <P <at> draigBrady.com>
To: Jim Meyering <jim <at> meyering.net>
Cc: 7214 <at> debbugs.gnu.org
Subject: Re: bug#7214: sort --debug maps large old-style field number to 0
	in	diagnostic
Date: Thu, 14 Oct 2010 14:14:52 +0100
On 14/10/10 13:44, Jim Meyering wrote:
> Pádraig Brady wrote:
>> On 14/10/10 11:06, Jim Meyering wrote:
>>> I noticed that using a field number of SIZE_MAX or larger makes --debug
>>> give an invalid diagnostic:
>>>
>>>   $ :|_POSIX2_VERSION=199209 src/sort --debug +$(echo 2^64-1|bc).4 -1.2
>>>   src/sort: using simple byte comparison
>>>   src/sort: obsolescent key `+0 -2' used; consider `-k 1,2' instead
>>>   src/sort: leading blanks are significant in key 1; consider also specifying `b'
>>
>> I'd nearly call that corner case a feature,
>> as it informs you the passed count has been wrapped.
>> I.E. the obsolete syntax has a 1-less narrower range
>> and this is informing you of that fact.
>>
>> printf "1 2\n" | src/sort -sb --debug +$((2**32-1)) -1
>> src/sort: using `en_IE.UTF-8' sorting rules
>> src/sort: obsolescent key `+0 -1' used; consider `-k 1,1' instead
>> 1 2
>> _
>>
>>
>> It does mean though, that overflows on the start field
>> for obsolete syntax do sort the data, while overflows
>> with the -k syntax do not. We could detect that for
>> the old syntax, and map overflows to SIZE_MAX-1?
>>
>> I suppose we could also add a debug warning for when
>> we do overflow, something along the lines of:
>>
>> $ src/sort -sb --debug -k$((2**32)) /dev/null
>> src/sort: 4294967296 is too large, using 4294967295
>>
>> @@ -3882,6 +3882,8 @@ parse_field_count (char const *string, size_t *val, char const *msgid)
>>      case LONGINT_OVERFLOW:
>>      case LONGINT_OVERFLOW | LONGINT_INVALID_SUFFIX_CHAR:
>>        *val = SIZE_MAX;
>> +      if (debug) /* Note --debug must come before keys to diagnose this.  */
>> +        error (0, 0, _("%" PRIuMAX " is too large, using %zu"), n, *val);
> 
> That does sound like an improvement (that would require comment changes),
> but considering it's only with an outrageously large field number
> and when using obsolescent key-specifying syntax, it's probably
> not worth the trouble.  But if you want to add the above, that's
> fine, too.
> 
> Note however, that would not change the mapping from a
> field number of exactly SIZE_MAX to "0" in the diagnostic.
> 

I'll try and fix it up for next release




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7214; Package coreutils. (Thu, 14 Oct 2010 13:40:04 GMT) Full text and rfc822 format available.

Message #17 received at 7214 <at> debbugs.gnu.org (full text, mbox):

From: Eric Blake <eblake <at> redhat.com>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: Jim Meyering <jim <at> meyering.net>, 7214 <at> debbugs.gnu.org
Subject: Re: bug#7214: sort --debug maps large old-style field number to 0
	in	diagnostic
Date: Thu, 14 Oct 2010 07:43:30 -0600
On 10/14/2010 06:19 AM, Pádraig Brady wrote:
> @@ -3882,6 +3882,8 @@ parse_field_count (char const *string, size_t *val, char const *msgid)
>       case LONGINT_OVERFLOW:
>       case LONGINT_OVERFLOW | LONGINT_INVALID_SUFFIX_CHAR:
>         *val = SIZE_MAX;
> +      if (debug) /* Note --debug must come before keys to diagnose this.  */
> +        error (0, 0, _("%" PRIuMAX " is too large, using %zu"), n, *val);
>         break;

Rather than warn during option parsing, what if you instead set a bool 
variable overflow_detected, and warn only after option parsing is 
completed?  The warning won't be quite as specific:

_("At least one key range overflowed")

but given that it's a corner case already, it means you would then have 
warning for:

sort +$(((1<<31)-1)) --debug

Just a thought; I'm okay with whatever you eventually commit on this front.

-- 
Eric Blake   eblake <at> redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7214; Package coreutils. (Thu, 14 Oct 2010 16:46:01 GMT) Full text and rfc822 format available.

Message #20 received at 7214 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>
Cc: Pádraig Brady <P <at> draigBrady.com>, 7214 <at> debbugs.gnu.org
Subject: Re: bug#7214: sort --debug maps large old-style field number to 0
	in	diagnostic
Date: Thu, 14 Oct 2010 09:48:52 -0700
On 10/14/10 05:44, Jim Meyering wrote:
>>        *val = SIZE_MAX;
>> > +      if (debug) /* Note --debug must come before keys to diagnose this.  */
>> > +        error (0, 0, _("%" PRIuMAX " is too large, using %zu"), n, *val);
> That does sound like an improvement (that would require comment changes),

No, it's better to silently treat large field numbers as if they were infinity.

'sort' already does that in other cases (e.g., specify_nthreads),
and so does 'join' (e.g., string_to_join_field), and we should
be doing that whenever possible.

The idea is that, since 'sort' can't possibly represent a line that
contains more than SIZE_MAX fields, it's correct to treat very large
field numbers, even numbers that provoke LONGINT_OVERFLOW, as if they
were SIZE_MAX.  GNU code should not contain arbitrary limits: so
when it's correct to substitute a representable number for a
number that's so large that it's unrepresentable, we should do that
instead of reporting an overflow and failing.




Information forwarded to bug-coreutils <at> gnu.org:
bug#7214; Package coreutils. (Mon, 15 Oct 2018 17:28:02 GMT) Full text and rfc822 format available.

Message #23 received at 7214 <at> debbugs.gnu.org (full text, mbox):

From: Assaf Gordon <assafgordon <at> gmail.com>
Cc: 7214 <at> debbugs.gnu.org
Subject: Re: bug#7214: sort --debug maps large old-style field number to 0 in
 diagnostic
Date: Mon, 15 Oct 2018 11:27:28 -0600
tags 7214 wontfix
close 7214
stop

(triaging old bugs)

On 14/10/10 10:48 AM, Paul Eggert wrote:
> On 10/14/10 05:44, Jim Meyering wrote:
>>>         *val = SIZE_MAX;
>>>> +      if (debug) /* Note --debug must come before keys to diagnose this.  */
>>>> +        error (0, 0, _("%" PRIuMAX " is too large, using %zu"), n, *val);
>> That does sound like an improvement (that would require comment changes),
> 
> No, it's better to silently treat large field numbers as if they were infinity.
> 

8 years later, the change has not been committed - so closing as "wontfix".

-assaf






Added tag(s) wontfix. Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Mon, 15 Oct 2018 17:28:02 GMT) Full text and rfc822 format available.

bug closed, send any further explanations to 7214 <at> debbugs.gnu.org and Jim Meyering <jim <at> meyering.net> Request was from Assaf Gordon <assafgordon <at> gmail.com> to control <at> debbugs.gnu.org. (Mon, 15 Oct 2018 17:28:02 GMT) Full text and rfc822 format available.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 13 Nov 2018 12:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 6 years and 276 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.