GNU bug report logs - #20511
split : does not account for --numeric-suffixes=FROM in calculation of suffix length?

Package: coreutils

Reported by: Ben Rusholme <rusholme <at> caltech.edu>

Date: Tue, 5 May 2015 20:45:03 UTC

Severity: normal

Done: Pádraig Brady <P <at> draigBrady.com>

Bug is archived. No further changes may be made.

From: Pádraig Brady <P <at> draigBrady.com>
To: Ben Rusholme <rusholme <at> caltech.edu>, 20511 <at> debbugs.gnu.org
Subject: bug#20511: split : does not account for --numeric-suffixes=FROM in calculation of suffix length?
Date: Tue, 05 May 2015 22:58:31 +0100

On 05/05/15 21:42, Ben Rusholme wrote:
> Hi,
> 
> “split” (in the current GNU coreutils 8.23 release) does not account for the optional start index (“split --numeric-suffixes=FROM”) when calculating suffix length.
> 
> I couldn’t find any prior reference to this problem in either the bug tracker or mailing list archive.
> 
> Thanks, Ben
> 
> 
> 
> $ seq 100 >& input.txt
> $ split --numeric-suffixes --number=l/100 input.txt
> $ ls
> input.txt  x06  x13  x20  x27  x34  x41  x48  x55  x62  x69  x76  x83  x90  x97
> x00        x07  x14  x21  x28  x35  x42  x49  x56  x63  x70  x77  x84  x91  x98
> x01        x08  x15  x22  x29  x36  x43  x50  x57  x64  x71  x78  x85  x92  x99
> x02        x09  x16  x23  x30  x37  x44  x51  x58  x65  x72  x79  x86  x93
> x03        x10  x17  x24  x31  x38  x45  x52  x59  x66  x73  x80  x87  x94
> x04        x11  x18  x25  x32  x39  x46  x53  x60  x67  x74  x81  x88  x95
> x05        x12  x19  x26  x33  x40  x47  x54  x61  x68  x75  x82  x89  x96
> 
> 
> $ rm x*
> $ split --numeric-suffixes=1 --number=l/100 input.txt
> split: output file suffixes exhausted
> $ ls
> input.txt  x07  x14  x21  x28  x35  x42  x49  x56  x63  x70  x77  x84  x91  x98
> x01        x08  x15  x22  x29  x36  x43  x50  x57  x64  x71  x78  x85  x92  x99
> x02        x09  x16  x23  x30  x37  x44  x51  x58  x65  x72  x79  x86  x93
> x03        x10  x17  x24  x31  x38  x45  x52  x59  x66  x73  x80  x87  x94
> x04        x11  x18  x25  x32  x39  x46  x53  x60  x67  x74  x81  x88  x95
> x05        x12  x19  x26  x33  x40  x47  x54  x61  x68  x75  x82  x89  x96
> x06        x13  x20  x27  x34  x41  x48  x55  x62  x69  x76  x83  x90  x97
> $ # Should run from x001 to x100!
> 
> 
> $ rm x*
> $ split --numeric-suffixes=1 --number=l/101 input.txt
> $ ls
> input.txt  x008  x016  x024  x032  x040  x048  x056  x064  x072  x080  x088  x096
> x001       x009  x017  x025  x033  x041  x049  x057  x065  x073  x081  x089  x097
> x002       x010  x018  x026  x034  x042  x050  x058  x066  x074  x082  x090  x098
> x003       x011  x019  x027  x035  x043  x051  x059  x067  x075  x083  x091  x099
> x004       x012  x020  x028  x036  x044  x052  x060  x068  x076  x084  x092  x100
> x005       x013  x021  x029  x037  x045  x053  x061  x069  x077  x085  x093  x101
> x006       x014  x022  x030  x038  x046  x054  x062  x070  x078  x086  x094
> x007       x015  x023  x031  x039  x047  x055  x063  x071  x079  x087  x095

The info docs say about the --numeric-suffixes option:

  Note specifying a FROM value also disables the default auto suffix
  length expansion described above, and so you may also want to
  specify ‘-a’ to allow suffixes beyond ‘99’.

Also, specifying a fixed number of files with --number
automatically sets the suffix length based on that number,
i.e. when you specified -nl/101 it bumped the suffix length to 3.

Now you could bump the suffix length based on the start number,
though I don't think we should, as that would affect future
processing (ordering) of the resultant files.  I.e. specifying
a FROM value to --numeric-suffixes should only affect the
start value, rather than the width.

In other words, if you were to split 2 files into 200 parts like:
  split                        --number=l/100 input1.txt
  split --numeric-suffixes=100 --number=l/100 input2.txt
then you really need to specify -a3 to set
the suffix length appropriately.
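
As a rough illustration of the arithmetic involved, the width to pass
to -a could be worked out from the start value and the number of files.
This is only a sketch: suffix_width is a hypothetical helper, not part
of coreutils, and it assumes decimal suffixes and split's default
minimum suffix length of 2.

```shell
# Hypothetical helper (not part of coreutils): print the suffix width
# needed so that every numeric suffix from FROM to FROM+COUNT-1 fits.
suffix_width() {
    from=$1
    count=$2
    last=$((from + count - 1))   # highest suffix value that will be used
    width=${#last}               # number of decimal digits in that value
    if [ "$width" -lt 2 ]; then
        width=2                  # assumed default minimum suffix length
    fi
    echo "$width"
}
```

For the failing case in the report this gives 3, so the call would be
something like:
  split -a "$(suffix_width 1 100)" --numeric-suffixes=1 --number=l/100 input.txt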

We might be able to give an earlier error in this case,
and we should probably clarify the info docs a bit more.
I'll think about it.

cheers,
Pádraig.




This bug report was last modified 10 years and 8 days ago.

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.