GNU bug report logs - #7241
Possible bug on split ?


Package: coreutils;

Reported by: Ulf Zibis <Ulf.Zibis <at> gmx.de>

Date: Mon, 18 Oct 2010 18:05:03 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.


Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7241; Package coreutils. (Mon, 18 Oct 2010 18:05:03 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ulf Zibis <Ulf.Zibis <at> gmx.de>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Mon, 18 Oct 2010 18:05:03 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Ulf Zibis <Ulf.Zibis <at> gmx.de>
To: bug-coreutils <at> gnu.org
Subject: Possible bug on split ?
Date: Mon, 18 Oct 2010 19:19:54 +0200
With split --help I get the information on units like K, KB, M, MB, etc.

split 123m and split 123MB work fine, but split 123mb doesn't, so it seems that the unit
identifiers only partly work with lower-case letters.

IMO this is a bug, or it should be documented more explicitly.

What do you think?

-Ulf





Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7241; Package coreutils. (Mon, 18 Oct 2010 19:51:02 GMT) Full text and rfc822 format available.

Message #8 received at 7241 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Ulf Zibis <Ulf.Zibis <at> gmx.de>
Cc: 7241 <at> debbugs.gnu.org
Subject: Re: bug#7241: Possible bug on split ?
Date: Mon, 18 Oct 2010 12:54:12 -0700
On 10/18/10 10:19, Ulf Zibis wrote:
> IMO this is a bug, or it should be documented more explicitly.

I'd say fix the doc.  Do you have a suggestion for
improving the wording?




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7241; Package coreutils. (Tue, 19 Oct 2010 15:06:01 GMT) Full text and rfc822 format available.

Message #11 received at 7241 <at> debbugs.gnu.org (full text, mbox):

From: Ulf Zibis <Ulf.Zibis <at> gmx.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 7241 <at> debbugs.gnu.org
Subject: Re: bug#7241: Possible bug on split ?
Date: Tue, 19 Oct 2010 16:42:41 +0200
In many FAQs the lower-case spelling is suggested, e.g.:

ntfsclone -s -o - <source> | gzip -c | split -a 3 -b 700m - <destination>

So IMO the lower-case spelling of "MB" should be allowed. Explaining in the docs that lower case
is allowed for "M" but not for "MB" seems difficult to communicate.

-Ulf
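
For readers unfamiliar with the FAQ pipeline above, here is a minimal Python sketch of what its `split -a 3 -b 700m - <destination>` stage does: it cuts the incoming stream into fixed-size pieces and names them with three-letter suffixes (aaa, aab, ...). The function names `suffixes` and `split_stream` are hypothetical, and the sketch assumes a buffered binary stream and enough memory to hold one chunk at a time. It is an illustration, not how split itself is implemented.

```python
import itertools
import string
from pathlib import Path


def suffixes(length):
    """Yield alphabetic suffixes of the given length: 'aaa', 'aab', ..."""
    for letters in itertools.product(string.ascii_lowercase, repeat=length):
        yield "".join(letters)


def split_stream(stream, chunk_bytes, prefix, suffix_len=3):
    """Write successive pieces of `stream` to files named prefix + suffix.

    Assumes `stream` is a buffered binary stream, so read(n) returns n bytes
    unless the end of the input is reached. Each chunk is held in memory
    whole, which is a simplification for the sake of the sketch.
    """
    written = []
    names = suffixes(suffix_len)
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:
            break  # end of input
        path = Path(prefix + next(names))
        path.write_bytes(chunk)
        written.append(path)
    return written
```

Under these assumptions, calling `split_stream(sys.stdin.buffer, 700 * 1024 * 1024, "image.")` would correspond roughly to `split -a 3 -b 700m - image.`.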


On 18.10.2010 21:54, Paul Eggert wrote:
> On 10/18/10 10:19, Ulf Zibis wrote:
>> IMO this is a bug, or it should be documented more explicitly.
> I'd say fix the doc.  Do you have a suggestion for
> improving the wording?
>
>
>
>




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7241; Package coreutils. (Tue, 19 Oct 2010 19:21:02 GMT) Full text and rfc822 format available.

Message #14 received at 7241 <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Ulf Zibis <Ulf.Zibis <at> gmx.de>
Cc: 7241 <at> debbugs.gnu.org
Subject: Re: bug#7241: Possible bug on split ?
Date: Tue, 19 Oct 2010 12:24:04 -0700
On 10/19/10 07:42, Ulf Zibis wrote:
> In many FAQs the lower-case spelling is suggested, e.g.:
> 
> ntfsclone -s -o - <source> | gzip -c | split -a 3 -b 700m - <destination>
> 
> So IMO the lower-case spelling of "MB" should be allowed

But "mb" is SI syntax for "millibit", which
is a very small unit of information.  Having "mb"
be an alias for "megabyte" would be confusing
to those used to the standard notation.  (Having
"mb" be an alias for "megabit", or for "millibyte",
would be bad as well.)

We have to support plain "m" as an alias for "MiB",
because POSIX requires support for plain "m".
But let's not compound POSIX's mistake by supporting
even more usages that are contrary to SI.
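
To make the distinction concrete, the short Python sketch below works out the two byte counts involved: plain "m" (treated as MiB, per the message above) versus "MB" (SI megabytes). The variable names are illustrative only.

```python
MIB = 1024 * 1024   # plain "m" / "M": mebibytes, powers of 1024
MB = 1000 * 1000    # "MB": SI megabytes, powers of 1000

size_700m = 700 * MIB
size_700mb = 700 * MB

print(size_700m)               # 734003200
print(size_700mb)              # 700000000
print(size_700m - size_700mb)  # 34003200
```

So a chunk requested as 700m is about 34 million bytes larger than one requested as 700MB, which is why the two spellings cannot be treated as interchangeable.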




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7241; Package coreutils. (Tue, 19 Oct 2010 21:49:02 GMT) Full text and rfc822 format available.

Message #17 received at 7241 <at> debbugs.gnu.org (full text, mbox):

From: Ulf Zibis <Ulf.Zibis <at> gmx.de>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 7241 <at> debbugs.gnu.org
Subject: Re: bug#7241: Possible bug on split ?
Date: Tue, 19 Oct 2010 23:51:58 +0200
On 19.10.2010 21:24, Paul Eggert wrote:
> On 10/19/10 07:42, Ulf Zibis wrote:
>> In many FAQs the lower-case spelling is suggested, e.g.:
>>
>> ntfsclone -s -o - <source> | gzip -c | split -a 3 -b 700m - <destination>
>>
>> So IMO the lower-case spelling of "MB" should be allowed
> But "mb" is SI syntax for "millibit", which
> is a very small unit of information.  Having "mb"
> be an alias for "megabyte" would be confusing
> to those used to the standard notation.  (Having
> "mb" be an alias for "megabit", or for "millibyte",
> would be bad as well.)
>
> We have to support plain "m" as an alias for "MiB",
> because POSIX requires support for plain "m".
> But let's not compound POSIX's mistake by supporting
> even more usages that are contrary to SI.

OK, so the docs should be more detailed.
BTW: this problem is not restricted to the split command; it concerns almost all GNU coreutils and
more, e.g. ntfsprogs.

-Ulf





Reply sent to Jim Meyering <jim <at> meyering.net>:
You have taken responsibility. (Tue, 31 May 2011 21:37:01 GMT) Full text and rfc822 format available.

Notification sent to Ulf Zibis <Ulf.Zibis <at> gmx.de>:
bug acknowledged by developer. (Tue, 31 May 2011 21:37:01 GMT) Full text and rfc822 format available.

Message #22 received at 7241-done <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Ulf Zibis <Ulf.Zibis <at> gmx.de>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 7241-done <at> debbugs.gnu.org
Subject: Re: bug#7241: Possible bug on split ?
Date: Tue, 31 May 2011 23:36:15 +0200
Ulf Zibis wrote:

> On 19.10.2010 21:24, Paul Eggert wrote:
>> On 10/19/10 07:42, Ulf Zibis wrote:
>>> In many FAQs the lower-case spelling is suggested, e.g.:
>>>
>>> ntfsclone -s -o - <source> | gzip -c | split -a 3 -b 700m - <destination>
>>>
>>> So IMO the lower-case spelling of "MB" should be allowed
>> But "mb" is SI syntax for "millibit", which
>> is a very small unit of information.  Having "mb"
>> be an alias for "megabyte" would be confusing
>> to those used to the standard notation.  (Having
>> "mb" be an alias for "megabit", or for "millibyte",
>> would be bad as well.)
>>
>> We have to support plain "m" as an alias for "MiB",
>> because POSIX requires support for plain "m".
>> But let's not compound POSIX's mistake by supporting
>> even more usages that are contrary to SI.
>
> OK, so the docs should be more detailed.
> BTW: this problem is not restricted to the split command; it concerns
> almost all GNU coreutils and more, e.g. ntfsprogs.

The info documentation is very specific:

`-b SIZE'
`--bytes=SIZE'
     Put SIZE bytes of INPUT into each output file.  SIZE may be, or
     may be an integer optionally followed by, one of the following
     multiplicative suffixes:
          `b'  =>            512 ("blocks")
          `KB' =>           1000 (KiloBytes)
          `K'  =>           1024 (KibiBytes)
          `MB' =>      1000*1000 (MegaBytes)
          `M'  =>      1024*1024 (MebiBytes)
          `GB' => 1000*1000*1000 (GigaBytes)
          `G'  => 1024*1024*1024 (GibiBytes)
     and so on for `T', `P', `E', `Z', and `Y'.

split --help, however, is brief (and split.1 is mechanically
derived from split --help output). That's fine, because it's
more of a quick reference that points to the info documentation.
Note the last line:

    For complete documentation, run: info coreutils 'split invocation'

We'd welcome any specific suggestions for improvement you may have.
Note that while I'm closing this "issue", you're welcome to continue
replying, and comments will still be archived for it and read by people
on this list.
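
The suffix table from the info documentation quoted above can be sketched as a small parser. This is a minimal Python illustration, not the coreutils implementation: the names `SUFFIXES` and `parse_size` are hypothetical, and accepting exactly lower-case "k" and "m" as POSIX-compatibility aliases is an assumption (the thread only confirms "m").

```python
import re

# Prefix letters in the order the info page lists them.
_PREFIXES = "KMGTPEZY"

# Multipliers taken from the quoted table.
SUFFIXES = {"b": 512}
for power, letter in enumerate(_PREFIXES, start=1):
    SUFFIXES[letter] = 1024 ** power         # K, M, G, ...: powers of 1024
    SUFFIXES[letter + "B"] = 1000 ** power   # KB, MB, GB, ...: powers of 1000

# Assumption: only these two lower-case aliases exist, for POSIX compatibility.
SUFFIXES["k"] = SUFFIXES["K"]
SUFFIXES["m"] = SUFFIXES["M"]


def parse_size(text):
    """Return the byte count for a SIZE such as '700m', '123MB' or 'K'."""
    match = re.fullmatch(r"([0-9]*)([A-Za-z]*)", text)
    if match is None or text == "":
        raise ValueError(f"invalid size: {text!r}")
    digits, suffix = match.groups()
    if suffix and suffix not in SUFFIXES:
        raise ValueError(f"invalid suffix in size: {text!r}")
    # A bare suffix counts as 1 unit; a bare number counts as bytes.
    count = int(digits) if digits else 1
    multiplier = SUFFIXES[suffix] if suffix else 1
    return count * multiplier
```

With this table, `parse_size("123m")` and `parse_size("123MB")` both succeed (with different results), while `parse_size("123mb")` raises `ValueError`, matching the behaviour described in the original report.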




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7241; Package coreutils. (Tue, 31 May 2011 22:49:02 GMT) Full text and rfc822 format available.

Message #25 received at 7241 <at> debbugs.gnu.org (full text, mbox):

From: Ulf Zibis <Ulf.Zibis <at> gmx.de>
To: 7241 <at> debbugs.gnu.org
Subject: Re: bug#7241: closed (Re: bug#7241: Possible bug on split ?)
Date: Wed, 01 Jun 2011 00:48:42 +0200
You could write:

`-b SIZE'
`--bytes=SIZE'
     Put SIZE bytes of INPUT into each output file.  SIZE may be, or
     may be an integer optionally followed by, one of the following
     multiplicative suffixes:
          `b'     =>             512 ("blocks")
          `KB'    =>            1000 (KiloBytes)
          `K'     =>            1024 (KibiBytes)
          `MB'    =>       1000*1000 (MegaBytes)
          `M','m' =>       1024*1024 (MebiBytes)
          `GB'    =>  1000*1000*1000 (GigaBytes)
          `G'     =>  1024*1024*1024 (GibiBytes)
     and so on for `T', `P', `E', `Z', and `Y'.


For split --help you could write (I only have the German version):

GRÖßE kann eine der folgenden Abkürzungen sein (oder eine Zahl, die optional
von einer der Abkürzungen gefolgt wird):
KB 1000, K 1024, MB 1000x1000, M or m 1024x1024 und so weiter für G, T, P, E, Z, Y.






Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7241; Package coreutils. (Tue, 31 May 2011 22:57:02 GMT) Full text and rfc822 format available.

Message #28 received at 7241 <at> debbugs.gnu.org (full text, mbox):

From: Ulf Zibis <Ulf.Zibis <at> gmx.de>
To: 7241 <at> debbugs.gnu.org
Subject: Re: bug#7241: closed (Re: bug#7241: Possible bug on split ?)
Date: Wed, 01 Jun 2011 00:56:22 +0200
Correction:

On 01.06.2011 00:48, Ulf Zibis wrote:
>
> For split --help you could write (I only have the german version):
>
> GRÖßE kann eine der folgenden Abkürzungen sein (oder eine Zahl, die optional
> von einer der Abkürzungen gefolgt wird):
> KB 1000, K 1024, MB 1000x1000, M or m 1024x1024 und so weiter für G, T, P, E, Z, Y.
>
KB 1000, K 1024, MB 1000x1000, M oder m 1024x1024 und so weiter für G, T, P, E, Z, Y.
or
KB 1000, K 1024, MB 1000x1000, M|m 1024x1024 und so weiter für G, T, P, E, Z, Y.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7241; Package coreutils. (Wed, 01 Jun 2011 19:31:02 GMT) Full text and rfc822 format available.

Message #31 received at 7241 <at> debbugs.gnu.org (full text, mbox):

From: Jim Meyering <jim <at> meyering.net>
To: Ulf Zibis <Ulf.Zibis <at> gmx.de>
Cc: 7241 <at> debbugs.gnu.org
Subject: Re: bug#7241: closed (Re: bug#7241: Possible bug on split ?)
Date: Wed, 01 Jun 2011 21:30:43 +0200
Ulf Zibis wrote:
> You could write:
>
> `-b SIZE'
> `--bytes=SIZE'
>      Put SIZE bytes of INPUT into each output file.  SIZE may be, or
>      may be an integer optionally followed by, one of the following
>      multiplicative suffixes:
>           `b'     =>             512 ("blocks")
>           `KB'    =>            1000 (KiloBytes)
>           `K'     =>            1024 (KibiBytes)
>           `MB'    =>       1000*1000 (MegaBytes)
>           `M','m' =>       1024*1024 (MebiBytes)

Thanks, but I'd rather not encourage the use of "m", seeing as how
it's irregular.  With no documentation, we have more flexibility if
we decide to deprecate it and eventually remove support for it some day.
If we document it, that's an implicit endorsement and more of a
commitment.

>           `GB'    =>  1000*1000*1000 (GigaBytes)
>           `G'     =>  1024*1024*1024 (GibiBytes)
>      and so on for `T', `P', `E', `Z', and `Y'.
>
>
> For split --help you could write (I only have the german version):
>
> GRÖßE kann eine der folgenden Abkürzungen sein (oder eine Zahl, die optional
> von einer der Abkürzungen gefolgt wird):
> KB 1000, K 1024, MB 1000x1000, M or m 1024x1024 und so weiter für G, T, P, E, Z, Y.




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#7241; Package coreutils. (Thu, 02 Jun 2011 10:58:01 GMT) Full text and rfc822 format available.

Message #34 received at 7241 <at> debbugs.gnu.org (full text, mbox):

From: Ulf Zibis <Ulf.Zibis <at> gmx.de>
To: Jim Meyering <jim <at> meyering.net>
Cc: 7241 <at> debbugs.gnu.org
Subject: Re: bug#7241: closed (Re: bug#7241: Possible bug on split ?)
Date: Thu, 02 Jun 2011 12:57:30 +0200
On 01.06.2011 21:30, Jim Meyering wrote:
> Ulf Zibis wrote:
>> You could write:
>>
>> `-b SIZE'
>> `--bytes=SIZE'
>>       Put SIZE bytes of INPUT into each output file.  SIZE may be, or
>>       may be an integer optionally followed by, one of the following
>>       multiplicative suffixes:
>>            `b'     =>              512 ("blocks")
>>            `KB'    =>             1000 (KiloBytes)
>>            `K'     =>             1024 (KibiBytes)
>>            `MB'    =>        1000*1000 (MegaBytes)
>>            `M','m' =>        1024*1024 (MebiBytes)
> Thanks, but I'd rather not encourage the use of "m", seeing as how
> it's irregular.  With no documentation, we have more flexibility if
> we decide to deprecate it and eventually remove support for it some day.
> If we document it, that's an implicit endorsement and more of a
> commitment.
>
Maybe you could add a note, e.g.:
(Note: for POSIX support, lower-case 'm' is allowed for "MiB", but is not recommended)
or:
(Note: contrary to SI, lower-case 'm' is allowed for "MiB" for POSIX support, but could become
deprecated)

Then users coming from FAQ tips like:
ntfsclone -s -o - <source> | gzip -c | split -a 3 -b 700m - <destination>
would not try other lower-case suffixes, and FAQ authors would know to use 'M' instead.

-Ulf

>>            `GB'    =>   1000*1000*1000 (GigaBytes)
>>            `G'     =>   1024*1024*1024 (GibiBytes)
>>       and so on for `T', `P', `E', `Z', and `Y'.
>>
>>
>> For split --help you could write (I only have the german version):
>>
>> GRÖßE kann eine der folgenden Abkürzungen sein (oder eine Zahl, die optional
>> von einer der Abkürzungen gefolgt wird):
>> KB 1000, K 1024, MB 1000x1000, M or m 1024x1024 und so weiter für G, T, P, E, Z, Y.
>
>
>




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 30 Jun 2011 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 13 years and 358 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.