GNU bug report logs - #49217
'shuf' returns nothing if the low range number is higher by 1 than the high number


Package: coreutils;

Reported by: F8ER F8ER <the.f8er <at> gmail.com>

Date: Fri, 25 Jun 2021 04:14:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 49217 in the body.
You can then email your comments to 49217 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-coreutils <at> gnu.org:
bug#49217; Package coreutils. (Fri, 25 Jun 2021 04:14:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to F8ER F8ER <the.f8er <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Fri, 25 Jun 2021 04:14:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: F8ER F8ER <the.f8er <at> gmail.com>
To: bug-coreutils <at> gnu.org
Subject: 'shuf' returns nothing if the low range number is higher by 1 than
 the high number
Date: Fri, 25 Jun 2021 01:46:30 +0200
Dear GNU Developers,


Thank you very much for the project.


It seems that if the low range number is higher by 1 than the high number,
the program returns nothing (with exit code = 0), while 102-100 results in
an error and 100-100 returns 100 as expected.

For example, `shuf -i 101-100 -n 1` returns nothing with the exit code
= 0 (unexpected).


Expected (normal):

`shuf -i 100-101 -n 1` returns either 100 or 101 with an exit code = 0.

`shuf -i 100-100 -n 1` returns 100 with an exit code = 0.

`shuf -i 102-100 -n 1` results in an error with an exit code = 1.


Is that an expected behavior?


Stay safe!


Best and kind regards




Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Fri, 25 Jun 2021 04:20:02 GMT) Full text and rfc822 format available.

Notification sent to F8ER F8ER <the.f8er <at> gmail.com>:
bug acknowledged by developer. (Fri, 25 Jun 2021 04:20:02 GMT) Full text and rfc822 format available.

Message #10 received at 49217-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: F8ER F8ER <the.f8er <at> gmail.com>
Cc: 49217-done <at> debbugs.gnu.org
Subject: Re: bug#49217: 'shuf' returns nothing if the low range number is
 higher by 1 than the high number
Date: Thu, 24 Jun 2021 21:19:36 -0700
On 6/24/21 4:46 PM, F8ER F8ER wrote:
> For example, `shuf -i 101-100 -n 1` returns nothing with the exit code
> = 0 (unexpected).

Actually, it's the expected behavior. It's the same behavior as 'shuf -n 
1 </dev/null'. The '-n 1' option does not mean "output exactly 1 line"; 
it means "output at most 1 line".
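
A minimal sketch of the "at most" semantics, assuming a GNU `shuf` is on
`PATH`. The helper name `count_lines` is hypothetical and only serves the
illustration:

```shell
# count_lines CMD...: run CMD and print how many lines it wrote to stdout.
# (Hypothetical helper, not part of coreutils.)
count_lines() {
    "$@" | wc -l | tr -d ' '
}

# Empty input: -n 1 is an upper bound, so nothing is printed.
empty=$(count_lines shuf -n 1 </dev/null)

# Three input lines: -n 1 caps the output at one line.
capped=$(printf 'a\nb\nc\n' | count_lines shuf -n 1)

# Two input lines: -n 5 cannot produce more lines than the input has.
short=$(printf 'a\nb\n' | count_lines shuf -n 5)

echo "empty=$empty capped=$capped short=$short"
```

In all three cases shuf exits with status 0; the line count alone tells
whether anything was selected.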




Information forwarded to bug-coreutils <at> gnu.org:
bug#49217; Package coreutils. (Fri, 25 Jun 2021 06:51:01 GMT) Full text and rfc822 format available.

Message #13 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Erik Auerswald <auerswal <at> unix-ag.uni-kl.de>
To: bug-coreutils <at> gnu.org
Cc: 49217 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu, the.f8er <at> gmail.com
Subject: Re: bug#49217: 'shuf' returns nothing if the low range number is
 higher by 1 than the high number
Date: Fri, 25 Jun 2021 08:49:51 +0200
Hi,

On Thu, Jun 24, 2021 at 09:19:36PM -0700, Paul Eggert wrote:
> On 6/24/21 4:46 PM, F8ER F8ER wrote:
> >For example, `shuf -i 101-100 -n 1` returns nothing with the exit code
> >= 0 (unexpected).
> 
> Actually, it's the expected behavior. It's the same behavior as
> 'shuf -n 1 </dev/null'. The '-n 1' option does not mean "output
> exactly 1 line"; it means "output at most 1 line".

I think the reported issue is with producing no error with LO==HI+1,
but producing an error with LO<HI+1:

    $ shuf -i 3-0 ; echo %exit code $?
    shuf: invalid input range: ‘3-0’
    %exit code 1
    $ shuf -i 2-0 ; echo %exit code $?
    shuf: invalid input range: ‘2-0’
    %exit code 1
    $ shuf -i 1-0 ; echo %exit code $?
    %exit code 0

This looks inconsistent and possibly not exactly as intended.

I have taken this example with an older "shuf" from my system, not the
current upstream version:

    $ shuf --version | head -n1
    shuf (GNU coreutils) 8.28

This is just intended to hopefully clear up a possible misunderstanding,
not to confirm or refute whether the current "shuf" version behaves in
the same way.

HTH, HAND,
Erik
-- 
Object-oriented design is the roman numerals of computing.
                        -- Rob Pike




Information forwarded to bug-coreutils <at> gnu.org:
bug#49217; Package coreutils. (Fri, 25 Jun 2021 06:51:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-coreutils <at> gnu.org:
bug#49217; Package coreutils. (Fri, 25 Jun 2021 06:55:02 GMT) Full text and rfc822 format available.

Message #19 received at 49217 <at> debbugs.gnu.org (full text, mbox):

From: Erik Auerswald <auerswal <at> unix-ag.uni-kl.de>
To: 49217 <at> debbugs.gnu.org
Subject: Re: bug#49217: 'shuf' returns nothing if the low range number is
 higher by 1 than the high number
Date: Fri, 25 Jun 2021 08:54:43 +0200
Hi,

On Fri, Jun 25, 2021 at 08:49:51AM +0200, Erik Auerswald wrote:
> On Thu, Jun 24, 2021 at 09:19:36PM -0700, Paul Eggert wrote:
> > On 6/24/21 4:46 PM, F8ER F8ER wrote:
> > >For example, `shuf -i 101-100 -n 1` returns nothing with the exit code
> > >= 0 (unexpected).
> > 
> > Actually, it's the expected behavior. It's the same behavior as
> > 'shuf -n 1 </dev/null'. The '-n 1' option does not mean "output
> > exactly 1 line"; it means "output at most 1 line".
> 
> I think the reported issue is with producing no error with LO==HI+1,
> but producing an error with LO<HI+1:
                                ^
                              LO>HI+1

Sorry!

Thanks,
Erik
-- 
Hofstadter's Law: It always takes longer than you expect, even when
                  you take into account Hofstadter's Law.




Information forwarded to bug-coreutils <at> gnu.org:
bug#49217; Package coreutils. (Fri, 25 Jun 2021 13:24:02 GMT) Full text and rfc822 format available.

Message #22 received at 49217 <at> debbugs.gnu.org (full text, mbox):

From: Erik Auerswald <auerswal <at> unix-ag.uni-kl.de>
To: 49217 <at> debbugs.gnu.org
Subject: [PATCH] shuf: fix bug with "-i 1-0"
Date: Fri, 25 Jun 2021 15:23:33 +0200
"shuf -i 1-0" would mistakenly accept the invalid range
without an error message and produce no output.  Other
invalid ranges, e.g., "shuf -i 2-0", would be detected
and produce an error message, non-zero exit code, and
no output.

Bug reported by "F8ER F8ER."

* src/shuf.c (main): Fix bug.
* tests/misc/shuf.sh: Add a test case for the bug.
---
 src/shuf.c         | 2 +-
 tests/misc/shuf.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/shuf.c b/src/shuf.c
index 1af1b533a..91430a88a 100644
--- a/src/shuf.c
+++ b/src/shuf.c
@@ -431,7 +431,7 @@ main (int argc, char **argv)
                                  _("invalid input range"), 0);
 
           n_lines = hi_input - lo_input + 1;
-          invalid |= ((lo_input <= hi_input) == (n_lines == 0));
+          invalid |= (lo_input > hi_input);
           if (invalid)
             die (EXIT_FAILURE, errno, "%s: %s", _("invalid input range"),
                  quote (optarg));
diff --git a/tests/misc/shuf.sh b/tests/misc/shuf.sh
index 892386b3f..2a7cba4d3 100755
--- a/tests/misc/shuf.sh
+++ b/tests/misc/shuf.sh
@@ -95,7 +95,7 @@ test "$c" -eq 3 || { fail=1; echo "Multiple -n failed">&2 ; }
 { shuf -i0-9 -n10 -i8-90 || test $? -ne 1; } &&
   { fail=1; echo "shuf did not detect multiple -i usage.">&2 ; }
 # Test invalid range
-for ARG in '1' 'A' '1-' '1-A'; do
+for ARG in '1' 'A' '1-' '1-A' '1-0' '2-0'; do
     { shuf -i$ARG || test $? -ne 1; } &&
     { fail=1; echo "shuf did not detect erroneous -i$ARG usage.">&2 ; }
 done
-- 
2.17.1




Information forwarded to bug-coreutils <at> gnu.org:
bug#49217; Package coreutils. (Fri, 25 Jun 2021 14:48:02 GMT) Full text and rfc822 format available.

Message #25 received at 49217 <at> debbugs.gnu.org (full text, mbox):

From: Erik Auerswald <auerswal <at> unix-ag.uni-kl.de>
To: 49217 <at> debbugs.gnu.org
Subject: Re: bug#49217: 'shuf' returns nothing if the low range number is
 higher by 1 than the high number
Date: Fri, 25 Jun 2021 16:47:24 +0200
Hi,


On Fri, Jun 25, 2021 at 08:54:43AM +0200, Erik Auerswald wrote:
> On Fri, Jun 25, 2021 at 08:49:51AM +0200, Erik Auerswald wrote:
> > On Thu, Jun 24, 2021 at 09:19:36PM -0700, Paul Eggert wrote:
> > > On 6/24/21 4:46 PM, F8ER F8ER wrote:
> > > >For example, `shuf -i 101-100 -n 1` returns nothing with the exit code
> > > >= 0 (unexpected).
> > > 
> > > Actually, it's the expected behavior. It's the same behavior as
> > > 'shuf -n 1 </dev/null'. The '-n 1' option does not mean "output
> > > exactly 1 line"; it means "output at most 1 line".
> > 
> > I think the reported issue is with producing no error with LO==HI+1,
> > but producing an error with LO<HI+1:
>                                 ^
>                               LO>HI+1

The code seems to intentionally silently ignore LO == HI+1, but not
LO > HI+1.  But this is neither documented nor tested.  This may be
an intentionally interesting way to simulate reading from an empty
file containing no lines between LO and HI.

Please see my previous patch as a suggestion on how to make the code
less surprising.
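
To make the difference concrete, here is a small sketch of the two range
checks from the patch, rewritten as POSIX shell arithmetic.  The function
names `old_check` and `new_check` are made up for illustration, and the
sketch uses signed shell integers, so it does not model the unsigned
overflow case that the C expression also catches:

```shell
# old_check LO HI: model of the existing test in src/shuf.c
#   invalid |= ((lo_input <= hi_input) == (n_lines == 0));
# Prints "invalid" or "ok".
old_check() {
    lo=$1 hi=$2
    n_lines=$(( hi - lo + 1 ))
    if [ $(( (lo <= hi) == (n_lines == 0) )) -eq 1 ]; then
        echo invalid
    else
        echo ok
    fi
}

# new_check LO HI: model of the proposed test
#   invalid |= (lo_input > hi_input);
new_check() {
    lo=$1 hi=$2
    if [ $(( lo > hi )) -eq 1 ]; then
        echo invalid
    else
        echo ok
    fi
}

# Compare both checks on the ranges discussed in this report.
for range in "0 0" "1 0" "2 0"; do
    set -- $range
    echo "$1-$2: old=$(old_check "$1" "$2") new=$(new_check "$1" "$2")"
done
```

The only range on which the two checks disagree is LO == HI+1 (here
"1-0"): the existing test lets it through because n_lines is exactly 0,
while the proposed test rejects every LO > HI.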

I am fine with keeping the current behavior, but then I'd like to
document it and add test cases.  Please let me know if you'd rather
have a documentation change & tests patch than the current code
change & tests patch.

I do think that it would be better to either change the code or the
documentation, and add test cases, than to do nothing.

Thanks,
Erik
-- 
Simplicity is prerequisite for reliability.
                        -- Edsger W. Dijkstra




Information forwarded to bug-coreutils <at> gnu.org:
bug#49217; Package coreutils. (Fri, 25 Jun 2021 16:30:02 GMT) Full text and rfc822 format available.

Message #28 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Erik Auerswald <auerswal <at> unix-ag.uni-kl.de>, bug-coreutils <at> gnu.org,
 49217 <at> debbugs.gnu.org, the.f8er <at> gmail.com
Subject: Re: bug#49217: 'shuf' returns nothing if the low range number is
 higher by 1 than the high number
Date: Fri, 25 Jun 2021 09:29:04 -0700
On 6/24/21 11:49 PM, Erik Auerswald wrote:
>      $ shuf -i 2-0 ; echo %exit code $?
>      shuf: invalid input range: ‘2-0’
>      %exit code 1
>      $ shuf -i 1-0 ; echo %exit code $?
>      %exit code 0
> 
> This looks inconsistent and possibly not exactly as intended.

It's exactly what I intended and there's no inconsistency. When you say 
'shuf -i M-N' you select from a collection of N-M+1 lines. N-M+1 = 0 (no 
input lines) makes sense, but N-M+1 < 0 (negative number of input 
lines?) does not.

> I'd like to document it and add test cases.

Feel free, though we need to reserve the right to extend 'shuf' in the 
future. In other words, not every invocation of 'shuf' that provokes a 
diagnostic now will provoke a diagnostic in the future.




Information forwarded to bug-coreutils <at> gnu.org:
bug#49217; Package coreutils. (Fri, 25 Jun 2021 16:30:03 GMT) Full text and rfc822 format available.

Information forwarded to bug-coreutils <at> gnu.org:
bug#49217; Package coreutils. (Fri, 25 Jun 2021 18:01:02 GMT) Full text and rfc822 format available.

Message #34 received at 49217 <at> debbugs.gnu.org (full text, mbox):

From: Erik Auerswald <auerswal <at> unix-ag.uni-kl.de>
To: 49217 <at> debbugs.gnu.org
Subject: Re: bug#49217: 'shuf' returns nothing if the low range number is
 higher by 1 than the high number
Date: Fri, 25 Jun 2021 20:00:13 +0200
Hi Paul,

On Fri, Jun 25, 2021 at 09:29:04AM -0700, Paul Eggert wrote:
> On 6/24/21 11:49 PM, Erik Auerswald wrote:
> >     $ shuf -i 2-0 ; echo %exit code $?
> >     shuf: invalid input range: ‘2-0’
> >     %exit code 1
> >     $ shuf -i 1-0 ; echo %exit code $?
> >     %exit code 0
> >
> >This looks inconsistent and possibly not exactly as intended.
> 
> It's exactly what I intended and there's no inconsistency. When you
> say 'shuf -i M-N' you select from a collection of N-M+1 lines.

It also specifies the contents of those lines, unless there is less than
one line.

> N-M+1 = 0 (no input lines) makes sense, but N-M+1 < 0 (negative number
> of input lines?) does not.

I do not think that it makes sense to specify the contents of no input
lines.  Perhaps we can agree to disagree on this?

Then the documentation does not describe it that way.  I think that can
lead to confusion.

The documentation describes the option as simulating input "from a file
containing the range of unsigned decimal integers LO...HI, one per line."
From this description it is not obvious that "1-0" is OK, but "2-0"
is not.  In both cases LO > HI, but one is accepted without error,
but the other is not.

I think that "select from a negative number of lines" makes just as much
sense as "select from no lines at all."  Here we seem to disagree, which
is OK with me.

Similarly to "shuf -iLO-HI", "seq FIRST LAST" produces LAST-FIRST+1 lines.
But seq does allow one to ask, to adapt your wording, for a negative
number of lines:

    $ seq 2 0 ; echo %exit code $?
    %exit code 0
    $ seq 1 0 ; echo %exit code $?
    %exit code 0
    $ seq 0 0 ; echo %exit code $?
    0
    %exit code 0

The problem I see is that the intention behind "shuf -i" that can be
gleaned from your implementation, and that you have described above,
is not obvious from the documentation or from similar functionality in
the GNU Core Utilities.

I see three views regarding the case of LO > HI in this thread:

  1. The bug reporter expected LO > HI to always produce an error,
     or possibly to never produce an error.

  2. Your "shuf" implementation sees LO == HI+1 as the one allowed
     possibility to specify no input, based on the HI-LO+1 formula for
     the number of lines to choose from.

  3. The "seq" implementation in the GNU Core Utilities allows LO > HI
     and interprets it as the empty sequence.  I actually like this best.
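
The three views can be put side by side in a short sketch.  The function
`range_lines` and its policy names (`strict`, `shuf`, `seq`) are invented
here for comparison and do not correspond to any real option; `strict`
models only the "always an error" reading of view 1:

```shell
# range_lines POLICY LO HI
# Prints the number of input lines the range LO-HI stands for under POLICY,
# or prints "error" and returns 1 if the policy rejects the range.
range_lines() {
    policy=$1 lo=$2 hi=$3
    n=$(( hi - lo + 1 ))
    case $policy in
        strict)  # view 1: any LO > HI is an error
            [ "$lo" -le "$hi" ] || { echo error; return 1; } ;;
        shuf)    # view 2: only a negative line count is an error
            [ "$n" -ge 0 ] || { echo error; return 1; } ;;
        seq)     # view 3: any LO > HI is the empty sequence
            [ "$n" -ge 0 ] || n=0 ;;
        *)
            echo "unknown policy: $policy" >&2; return 2 ;;
    esac
    echo "$n"
}

# Tabulate the ranges from this thread under each policy.
for policy in strict shuf seq; do
    for range in "0 0" "1 0" "2 0"; do
        set -- $range
        echo "$policy $1-$2: $(range_lines "$policy" "$1" "$2")"
    done
done
```

The policies agree on "0-0" (one line) and differ only once LO > HI,
which is exactly where the documentation is silent.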

Thus I think it is not as clear as you seem to expect that the current
"shuf" behavior is the correct one.
No offense intended!

I do not care deeply which behavior is selected.  I just want to make
it clearer for others, including me, to understand that the current
implementation is as intended.  Adding to the documentation (for users)
and the tests (for developers) seems to be helpful to me.

> >I'd like to document it and add test cases.
> 
> Feel free,

Thanks, I'll think about a wording both simple to understand and including
the special case.  I intend to send a patch to this bug report in a
couple of days.

> though we need to reserve the right to extend 'shuf' in
> the future. In other words, not every invocation of 'shuf' that
> provokes a diagnostic now will provoke a diagnostic in the future.

I like that.

HTH, HAND,
Erik
-- 
Be water, my friend.
                        -- Bruce Lee




Information forwarded to bug-coreutils <at> gnu.org:
bug#49217; Package coreutils. (Sat, 26 Jun 2021 17:02:01 GMT) Full text and rfc822 format available.

Message #37 received at 49217 <at> debbugs.gnu.org (full text, mbox):

From: Erik Auerswald <auerswal <at> unix-ag.uni-kl.de>
To: 49217 <at> debbugs.gnu.org
Subject: [PATCH] doc: clarify valid ranges for shuf -i
Date: Sat, 26 Jun 2021 19:01:46 +0200
* doc/coreutils.texi (shuf invocation): Mention valid and invalid
edge cases for --input-range.
---
 doc/coreutils.texi | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index ea040458e..f59c5e962 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -4978,7 +4978,10 @@ Treat each command-line operand as an input line.
 @opindex --input-range
 @cindex input range to shuffle
 Act as if input came from a file containing the range of unsigned
-decimal integers @var{lo}@dots{}@var{hi}, one per line.
+decimal integers @var{lo}@dots{}@var{hi}, one per line.  If @var{lo} is
+equal to @var{hi}, this is a single line.  If @var{lo} is one bigger than
+@var{hi}, this is accepted as the empty range.  Other cases of @var{lo}
+greater than @var{hi} are rejected as invalid.
 
 @end table
 
-- 
2.17.1




Information forwarded to bug-coreutils <at> gnu.org:
bug#49217; Package coreutils. (Sat, 26 Jun 2021 17:03:02 GMT) Full text and rfc822 format available.

Message #40 received at 49217 <at> debbugs.gnu.org (full text, mbox):

From: Erik Auerswald <auerswal <at> unix-ag.uni-kl.de>
To: 49217 <at> debbugs.gnu.org
Subject: [PATCH] tests: exercise shuf --input-range edge cases
Date: Sat, 26 Jun 2021 19:02:23 +0200
* tests/misc/shuf.sh: Test valid "shuf -i" edge cases that result
in a single line of input, or no line at all.  Test an invalid
range, too.
---
 tests/misc/shuf.sh | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tests/misc/shuf.sh b/tests/misc/shuf.sh
index 892386b3f..83e940ec4 100755
--- a/tests/misc/shuf.sh
+++ b/tests/misc/shuf.sh
@@ -39,6 +39,15 @@ compare in out > /dev/null && { fail=1; echo "not random?" 1>&2; }
 sort -n out > out1
 compare in out1 || { fail=1; echo "not a permutation" 1>&2; }
 
+# Exercise border conditions of shuf's -i option
+# LO == HI gives one line
+echo 1 > in1 || framework_failure_
+shuf -i 1-1 > out || fail=1
+compare in1 out || fail=1
+# LO == HI+1 gives no output
+shuf -i 1-0 > out || fail=1
+compare /dev/null out || fail=1
+
 # Exercize shuf's -r -n 0 options, with no standard input.
 shuf -r -n 0 in <&- >out || fail=1
 compare /dev/null out || fail=1
@@ -95,7 +104,7 @@ test "$c" -eq 3 || { fail=1; echo "Multiple -n failed">&2 ; }
 { shuf -i0-9 -n10 -i8-90 || test $? -ne 1; } &&
   { fail=1; echo "shuf did not detect multiple -i usage.">&2 ; }
 # Test invalid range
-for ARG in '1' 'A' '1-' '1-A'; do
+for ARG in '1' 'A' '1-' '1-A' '3-1'; do
     { shuf -i$ARG || test $? -ne 1; } &&
     { fail=1; echo "shuf did not detect erroneous -i$ARG usage.">&2 ; }
 done
-- 
2.17.1




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 25 Jul 2021 11:24:05 GMT) Full text and rfc822 format available.

This bug report was last modified 3 years and 326 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.