GNU bug report logs - #47023
df utilility displays G instead of GM as unit size for Gigabytes in power of 1000


Package: coreutils

Reported by: Philippe Bénézech <philippe.benezech <at> laposte.net>

Date: Tue, 9 Mar 2021 16:07:02 UTC

Severity: wishlist

Tags: wontfix

Merged with 18119

Done: Assaf Gordon <assafgordon <at> gmail.com>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 47023 in the body.
You can then email your comments to 47023 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-coreutils <at> gnu.org:
bug#47023; Package coreutils. (Tue, 09 Mar 2021 16:07:02 GMT)

Acknowledgement sent to Philippe Bénézech <philippe.benezech <at> laposte.net>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Tue, 09 Mar 2021 16:07:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Philippe Bénézech <philippe.benezech <at> laposte.net>
To: bug-coreutils <at> gnu.org
Subject: df utilility displays G instead of GM as unit size for Gigabytes in
 power of 1000
Date: Tue, 9 Mar 2021 13:58:54 +0100
Dear maintainer,

I found a reproducible bug in the df utility, as installed in Debian stable:

$ df --version |head -1
df (GNU coreutils) 8.30
$ cat /etc/debian_version
10.8

df displays G instead of GM as the unit suffix for gigabytes in powers of 1000
(but the value is correct):

$ df -BGB /home
Sys. de fichiers blocs de 1GB Utilisé Disponible Uti% Monté sur
/dev/mapper/ssd2        421GB   355GB       45GB  89% /home

$ df -H /home
Sys. de fichiers Taille Utilisé Dispo Uti% Monté sur
/dev/mapper/ssd2   421G    355G   45G  89% /home

Note that the display is fine (no bug) with terabytes, megabytes and
kilobytes in powers of 1000:

$ df -BTB /home
Sys. de fichiers blocs de 1TB Utilisé Disponible Uti% Monté sur
/dev/mapper/ssd2          1TB     1TB        1TB  89% /home

$ df -BMB /home
Sys. de fichiers blocs de 1MB  Utilisé Disponible Uti% Monté sur
/dev/mapper/ssd2     420611MB 354492MB    44682MB  89% /home

$ df -BKB /home
Sys. de fichiers blocs de 1kB     Utilisé Disponible Uti% Monté sur
/dev/mapper/ssd2  420610057kB 354491245kB 44681065kB  89% /home

Best regards
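
The figures in the four -B transcripts above are consistent with df rounding every value up to a whole number of blocks. A minimal sketch of that arithmetic, assuming round-up behaviour and taking the -BKB row as input (the names SIZES_KB, in_blocks and row are hypothetical, chosen only for this illustration):

```python
# Values in units of 1000 bytes, copied from the `df -BKB /home` row above.
SIZES_KB = {"size": 420610057, "used": 354491245, "avail": 44681065}


def in_blocks(kilobytes: int, kb_per_block: int) -> int:
    """Whole blocks needed to hold the value (integer ceiling division)."""
    return (kilobytes + kb_per_block - 1) // kb_per_block


def row(kb_per_block: int, suffix: str) -> list[str]:
    """Format size, used and available with an explicit unit suffix."""
    return [f"{in_blocks(v, kb_per_block)}{suffix}" for v in SIZES_KB.values()]


if __name__ == "__main__":
    print(row(10**3, "MB"))  # block size of 1 MB = 1000 kB
    print(row(10**6, "GB"))  # block size of 1 GB = 1000**2 kB
    print(row(10**9, "TB"))  # block size of 1 TB = 1000**3 kB
```

Rounding up is why every column of the -BTB row reads 1TB even though the filesystem holds well under half a terabyte.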



Information forwarded to bug-coreutils <at> gnu.org:
bug#47023; Package coreutils. (Tue, 09 Mar 2021 19:52:02 GMT)

Message #8 received at 47023 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Philippe Bénézech <philippe.benezech <at> laposte.net>,
 47023 <at> debbugs.gnu.org
Subject: Re: bug#47023: df utilility displays G instead of GM as unit size for
 Gigabytes in power of 1000
Date: Tue, 9 Mar 2021 19:51:45 +0000
unarchive 18119
forcemerge 18119 47023
stop


On 09/03/2021 12:58, Philippe Bénézech via GNU coreutils Bug Reports wrote:
> Dear maintainer,
> 
> I found a reproducible bug in df utility, installed in debian stable
> 
> $ df --version |head -1
> df (GNU coreutils) 8.30
> $ cat /etc/debian_version
> 10.8
> 
> df displays G instead of GM as unit size for Gigabytes in power of 1000
> (but the value is correct)

This is not restricted to G

> $ df -BGB /home
> Sys. de fichiers blocs de 1GB Utilisé Disponible Uti% Monté sur
> /dev/mapper/ssd2        421GB   355GB       45GB  89% /home
> 
> $ df -H /home
> Sys. de fichiers Taille Utilisé Dispo Uti% Monté sur
> /dev/mapper/ssd2   421G    355G   45G  89% /home

In summary, df -H outputs a concise single-letter suffix,
which is indistinguishable from that of df -h.
I agree that's not ideal, as the output can't be
interpreted without the command as context,
i.e. it restricts usage to direct command line usage.
A possible change we could make here would be to use GB, MB etc.
if --si is specified.
But since -h and -H are not really useful outside of direct CLI usage anyway,
I'm 50:50 on changing this.

This was originally discussed at https://bugs.gnu.org/18119

Mentioned there is an option to use the new numfmt functionality
to provide more control and unambiguous output.

BTW the fact that a B suffix implies SI units is awkward in the first place,
which I've documented the reasons for at:
https://www.pixelbeat.org/docs/coreutils-gotchas.html#units

cheers,
Pádraig
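
The ambiguity described in this message can be sketched in a few lines. This is an approximation, not df's actual code: it assumes round-up scaling with one decimal below 10, the byte count is only an estimate derived from the reporter's -BKB row, and the function name human and its si_units parameter are hypothetical. The si_units flag models the "use GB, MB etc. if --si is specified" idea floated above.

```python
PREFIXES = ["", "k", "M", "G", "T", "P", "E"]


def human(nbytes: int, base: int, si_units: bool = False) -> str:
    """Scale a byte count by powers of `base` (1000 or 1024), rounding up.

    With si_units=True and base 1000, a trailing "B" is appended, which is
    the possible change discussed in the thread, not current df behaviour.
    """
    power, unit = 0, 1
    while nbytes >= unit * base and power < len(PREFIXES) - 1:
        unit *= base
        power += 1
    if power == 0:
        return str(nbytes)
    tenths = -(-nbytes * 10 // unit)  # integer ceiling of nbytes * 10 / unit
    if tenths < 100:
        text = f"{tenths // 10}.{tenths % 10}"  # one decimal below 10
    else:
        text = str(-(-nbytes // unit))  # whole units otherwise
    prefix = PREFIXES[power]
    if base == 1024:
        prefix = prefix.upper()  # powers of 1024 use "K" rather than "k"
    if si_units and base == 1000:
        prefix += "B"
    return text + prefix


if __name__ == "__main__":
    size = 420_610_057_000  # assumed byte count, from 420610057kB
    print(human(size, 1000))                 # models df -H
    print(human(size, 1024))                 # models df -h
    print(human(size, 1000, si_units=True))  # models the proposed --si output
```

Both of the first two results end in a bare G although the numbers differ, so a reader who sees only the output cannot tell which scale was used.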




Forcibly Merged 18119 47023. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Tue, 09 Mar 2021 19:52:02 GMT)

Information forwarded to bug-coreutils <at> gnu.org:
bug#47023; Package coreutils. (Tue, 09 Mar 2021 20:37:02 GMT)

Message #13 received at 47023 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Philippe Bénézech <philippe.benezech <at> laposte.net>,
 47023 <at> debbugs.gnu.org
Subject: Re: bug#47023: df utilility displays G instead of GM as unit size for
 Gigabytes in power of 1000
Date: Tue, 9 Mar 2021 12:35:59 -0800
On 3/9/21 4:58 AM, Philippe Bénézech via GNU coreutils Bug Reports wrote:
> 
> df displays G instead of GM as unit size for Gigabytes in power of 1000 
> (but the value is correct)
> 
> $ df -BGB /home
> Sys. de fichiers blocs de 1GB Utilisé Disponible Uti% Monté sur
> /dev/mapper/ssd2        421GB   355GB       45GB  89% /home
> 
> $ df -H /home
> Sys. de fichiers Taille Utilisé Dispo Uti% Monté sur
> /dev/mapper/ssd2   421G    355G   45G  89% /home

I don't see a bug here. First, I assume you meant to write "GB" rather 
than "GM". Second, "df -BGB" is documented to append units (in this 
case, "GB") to the output number, whereas "df -H" is merely documented 
to append a size indication (in this case, "G").




Information forwarded to bug-coreutils <at> gnu.org:
bug#47023; Package coreutils. (Wed, 10 Mar 2021 14:51:01 GMT)

Message #16 received at submit <at> debbugs.gnu.org:

From: Glenn Golden <gdg <at> zplane.com>
To: bug-coreutils <at> gnu.org
Subject: Re: bug#47023: df utilility displays G instead of GM as unit size
 for Gigabytes in power of 1000
Date: Wed, 10 Mar 2021 07:50:14 -0700
Pádraig, Philippe, Paul -

Pádraig Brady [Tue, 9 Mar 2021 19:51:45 +0000]:
> 
> > On 09/03/2021 12:58, Philippe Bénézech via GNU coreutils Bug Reports wrote:
> > Dear maintainer,
> > 
> > I found a reproducible bug in df utility, installed in debian stable
> > 
> > $ df --version |head -1
> > df (GNU coreutils) 8.30
> > $ cat /etc/debian_version
> > 10.8
> > 
> > df displays G instead of GM as unit size for Gigabytes in power of 1000
> > (but the value is correct)
> 
> This is not restricted to G
> 
> > $ df -BGB /home
> > Sys. de fichiers blocs de 1GB Utilisé Disponible Uti% Monté sur
> > /dev/mapper/ssd2        421GB   355GB       45GB  89% /home
> > 
> > $ df -H /home
> > Sys. de fichiers Taille Utilisé Dispo Uti% Monté sur
> > /dev/mapper/ssd2   421G    355G   45G  89% /home
> >

> 
> In summary df -H is outputting with a concise single letter, which is
> indistinguishable from that of df -h.  I agree that's not ideal as the
> output can't be interpreted without the command as context.  I.e. it
> restricts usage to direct command line usage.  A possible change we could
> make here would be to use GB, MB etc.  if --si is specified.  But also -h
> and -H are not really useful outside of direct cli usage, I'm 50:50 on
> changing this.
> 
> This was originally discussed at https://bugs.gnu.org/18119
> 

It was brought up again more recently (Sept. 2020) here:

    https://lists.gnu.org/archive/html/coreutils/2020-09/msg00001.html

The above post provided an extensive background and history of the issue,
and a suggested patchset.

>
> Mentioned there is an option to use the new numfmt functionality
> to provide more control and unambiguous output.
> 
> BTW the fact that a B suffix implies SI units is awkward in the first
> place, which I've documented the reasons for at:
> 
>     https://www.pixelbeat.org/docs/coreutils-gotchas.html#units
> 

Agree 100% with your statements therein. (And your above document is
referenced as [2] in the above-mentioned posting from September).

IMO, it would be a Very Nice Thing if the program behavior of --si were
brought into accordance with your documentation above (and with Section
2.3 of coreutils.info, which says essentially the same things) rather than
having two sets of mutually conflicting documentation co-existing within
coreutils.  The proposed patchset does that.

See above posting for details. It's very long, but it lays out the entire
story from start to finish, with all known back references that I'm aware of.

- Glenn Golden




Information forwarded to bug-coreutils <at> gnu.org:
bug#47023; Package coreutils. (Wed, 10 Mar 2021 21:28:02 GMT)

Message #19 received at submit <at> debbugs.gnu.org:

From: L A Walsh <coreutils <at> tlinx.org>
Cc: Glenn Golden <gdg <at> zplane.com>, Coreutils <bug-coreutils <at> gnu.org>
Subject: Re: bug#47023: df utilility displays G instead of GM as unit size
 for Gigabytes in power of 1000
Date: Wed, 10 Mar 2021 13:27:15 -0800
On 2021/03/10 06:50, Glenn Golden wrote:
> Pádraig, Philippe, Paul -
>
> Pádraig Brady [Tue, 9 Mar 2021 19:51:45 +0000]:
>   
>>> On 09/03/2021 12:58, Philippe Bénézech via GNU coreutils Bug Reports wrote:
>>> Dear maintainer,
>>>
>>> I found a reproducible bug in df utility, installed in debian stable
>>>
>>> $ df --version |head -1
>>> df (GNU coreutils) 8.30
>>> $ cat /etc/debian_version
>>> 10.8
>>>
>>> df displays G instead of GM as unit size for Gigabytes in power of 1000
>>> (but the value is correct)
>>>       
----
The documentation says:

  -h, --human-readable
          print sizes in powers of 1024 (e.g., 1023M)

  -H, --si
          print sizes in powers of 1000 (e.g., 1.1G)

How is this a bug?

If the idea is to print a scaling factor and use the minimum
space necessary (1 byte for the prefix), it seems to be doing
exactly what it is documented to do.

Side rant:
 Using decimal prefixes with a binary unit (1B=2**3 bits)
 defeats the purpose of using a common multiplier for metric.
 Since computers use base-2, similar prefixes should be used.
 Just because the disk-industry bought and paid for the
 ruling to use base-10 doesn't mean that memory comes in
 units of 1-million, 1-billion or 1-trillion bytes or
 that disk space is organized in decimal units.

 I find it amazing that it is the French who most
 often vocalize support for the capitalistic-backed
 decision. 

*sigh*


Second, minor, side rant:
   Would be nice if more attention was paid to fixing
mailers encoding "Pádraig" and "Bénézech" as "P�draig"
and "B�n�zech"
*double sigh*
:-)






Information forwarded to bug-coreutils <at> gnu.org:
bug#47023; Package coreutils. (Wed, 10 Mar 2021 22:10:01 GMT)

Message #22 received at 47023 <at> debbugs.gnu.org:

From: Glenn Golden <gdg <at> zplane.com>
To: L A Walsh <coreutils <at> tlinx.org>
Cc: 47023 <at> debbugs.gnu.org
Subject: Re: bug#47023: df utilility displays G instead of GM as unit size
 for Gigabytes in power of 1000
Date: Wed, 10 Mar 2021 15:09:36 -0700
L A Walsh <coreutils <at> tlinx.org> [2021-03-10 13:27:15 -0800]:
> On 2021/03/10 06:50, Glenn Golden wrote:
> > Pádraig, Philippe, Paul -
> > 
> > Pádraig Brady [Tue, 9 Mar 2021 19:51:45 +0000]:
> > > > On 09/03/2021 12:58, Philippe Bénézech via GNU coreutils Bug Reports wrote:
> > > > Dear maintainer,
> > > > 
> > > > I found a reproducible bug in df utility, installed in debian stable
> > > > 
> > > > $ df --version |head -1
> > > > df (GNU coreutils) 8.30
> > > > $ cat /etc/debian_version
> > > > 10.8
> > > > 
> > > > df displays G instead of GM as unit size for Gigabytes in power of 1000
> > > > (but the value is correct)
> ----
> The documentation says:
> 
>   -h, --human-readable
>           print sizes in powers of 1024 (e.g., 1023M)
> 
>   -H, --si
>           print sizes in powers of 1000 (e.g., 1.1G)
> 
> How is this a bug?
> 
> If the idea is to print a scaling factor and use the minimum
> space necessary (1 byte for the prefix), it seems to be doing
> exactly what it is documented to do.
> 

If you read the referenced post

    https://lists.gnu.org/archive/html/coreutils/2020-09/msg00001.html

you'll understand what the issue is.


>
> Side rant:
>  Using decimal prefixes with a binary unit (1B=2**3 bits)
>  defeats the purpose of using a common multiplier for metric.
>  Since computers use base-2, similar prefixes should be used.
>  Just because the disk-industry bought and paid for the
>  ruling to use base-10 doesn't mean that memory comes in
>  units of 1-million, 1-billion or 1-trillion bytes or
>  that disk space is organized in decimal units.
> 

We've been through all this before. See

    https://lists.gnu.org/archive/html/coreutils/2020-09/msg00007.html


> 
> Second, minor, side rant:
>    Would be nice if more attention was paid to fixing mailers encoding
>    "Pádraig" and "Bénézech" as "P�draig" and "B�n�zech"
>

If you see substitute encodings like that, it strongly suggests the problem
is your MUA, not mine. My posting to the list (the one you quote from above)
shows the accented characters correctly. I'll be happy to send you a screenshot
if you wish.

- Glenn Golden




Information forwarded to bug-coreutils <at> gnu.org:
bug#47023; Package coreutils. (Wed, 10 Mar 2021 22:51:02 GMT)

Message #25 received at submit <at> debbugs.gnu.org:

From: L A Walsh <coreutils <at> tlinx.org>
Cc: Glenn Golden <gdg <at> zplane.com>, Coreutils <bug-coreutils <at> gnu.org>
Subject: Re: bug#47023: df utilility displays G instead of GM as unit size
 for Gigabytes in power of 1000
Date: Wed, 10 Mar 2021 14:50:50 -0800

On 2021/03/10 14:09, Glenn Golden wrote:
>
>> Second, minor, side rant:
>>    Would be nice if more attention was paid to fixing mailers encoding
>>    "Pádraig" and "Bénézech" as "P�draig" and "B�n�zech"
>>
> 
> If you see substitute encodings like that, it strongly suggests the problem
> is your MUA, not mine. 
----
It's yours.

You are using a local 8-bit encoding, whereas everyone else was
using UTF-8.  Your mailer re-encoded their messages into one
of the 8-bit western encodings, whereas most people use UTF-8
these days, so while their original messages with accents came
through just fine in UTF-8, your re-encoding into Western didn't
display properly.







Information forwarded to bug-coreutils <at> gnu.org:
bug#47023; Package coreutils. (Wed, 10 Mar 2021 23:22:02 GMT)

Message #28 received at 47023 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: L A Walsh <coreutils <at> tlinx.org>
Cc: gdg <at> zplane.com, 47023 <at> debbugs.gnu.org
Subject: Re: bug#47023: df utilility displays G instead of GM as unit size for
 Gigabytes in power of 1000
Date: Wed, 10 Mar 2021 15:21:29 -0800
On 3/10/21 2:50 PM, L A Walsh wrote:
> You are using a local 8-bit encoding, whereas everyone else was
> using UTF-8.  Your mailer re-encoded their messages into one
> of the 8-bit western encodings, whereas most people use UTF-8
> these days, so while their original messages with accents came
> through just fine in UTF-8, your re-encoding into Western didn't
> display properly.

Although his email did reencode those names into ISO 8859-1 which is 
more likely to cause problems than cure them these days, it still 
displays well on my MUA (Thunderbird) because its header said 
"Content-Type: text/plain; charset=iso-8859-1". His email is also 
displaying properly in the archive 
<https://debbugs.gnu.org/cgi/bugreport.cgi?bug=47023#16>, as the 
archiving software reencodes those names back into UTF-8 and the web 
page uses "Content-Type: text/html; charset=utf-8" for all the emails.

Possibly your email client is programmed to ignore encodings in incoming 
"Content-Type" lines; that would explain the glitches you saw.
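
The glitch described above can be reproduced in isolation. A minimal sketch, assuming the names were sent as ISO 8859-1 bytes and then read by a client that insists on UTF-8 (the helper name read_message is hypothetical):

```python
def read_message(text: str, sent_as: str, read_as: str) -> str:
    """Encode text the way the sender did, then decode it the way the reader does.

    Bytes that are invalid in the reader's charset become U+FFFD, the
    replacement character shown as a question-mark glyph.
    """
    return text.encode(sent_as).decode(read_as, errors="replace")


if __name__ == "__main__":
    for name in ("Pádraig", "Bénézech"):
        # Reader honours the declared charset=iso-8859-1: name is intact.
        print(read_message(name, "iso-8859-1", "iso-8859-1"))
        # Reader ignores the declaration and assumes UTF-8: accents are lost.
        print(read_message(name, "iso-8859-1", "utf-8"))
```

In ISO 8859-1 each accented letter is a single byte (0xE1 for á, 0xE9 for é). UTF-8 treats those values as the start of a multi-byte sequence, and the ASCII letter that follows cannot continue it, so the decoder substitutes one replacement character per accented letter.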




Information forwarded to bug-coreutils <at> gnu.org:
bug#47023; Package coreutils. (Thu, 11 Mar 2021 04:14:02 GMT)

Message #31 received at submit <at> debbugs.gnu.org:

From: L A Walsh <coreutils <at> tlinx.org>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: gdg <at> zplane.com, Coreutils <bug-coreutils <at> gnu.org>
Subject: Re: bug#47023: df utilility displays G instead of GM as unit size
 for Gigabytes in power of 1000
Date: Wed, 10 Mar 2021 20:13:05 -0800

On 2021/03/10 15:21, Paul Eggert wrote:
> Although his email did reencode those names into ISO 8859-1 which is 
> more likely to cause problems than cure them these days, it still 
> displays well on my MUA (Thunderbird) because its header said 
> "Content-Type: text/plain; charset=iso-8859-1". His email is also 
> displaying properly in the archive 
> <https://debbugs.gnu.org/cgi/bugreport.cgi?bug=47023#16>, as the 
> archiving software reencodes those names back into UTF-8 and the web 
> page uses "Content-Type: text/html; charset=utf-8" for all the emails.
> 
> Possibly your email client is programmed to ignore encodings in incoming 
> "Content-Type" lines; that would explain the glitches you saw.
----
	I also use Tbird. It can adapt to the incoming encoding, but these
days I usually have it set to ignore non-UTF-8 charsets, because
occasionally someone has an old mailer that sends out an 8-bit encoding,
and Tbird would seamlessly adapt and send the reply back out in the same format.

	I didn't want to pass on bad encodings when I replied to such
items to a list, so I changed that default.

	However, what Tbird doesn't handle well is local encodings
used in the 'To' or subject lines since there is no encoding line
for those.  In a text archive it can convert and display both in 
UTF-8 format, but even if I set Tbird to respond in the same encoding,
it won't fix the encodings in the header as the content-type: text/plain
doesn't apply to the headers.

	So while archives and the email text can be fixed, the text in
the headers often can't.  The only thing to do is bring it to their attention
and hope they have the technical skills to set their mailer to use
UTF-8.
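
For context on header encodings: RFC 2047 defines "encoded-words", which let an individual header value carry its own charset label independently of the body's Content-Type. Whether any mailer in this thread emitted them is not shown here. A minimal sketch using Python's standard email.header module (the two helper names are hypothetical):

```python
from email.header import Header, decode_header, make_header


def encode_display_name(name: str, charset: str) -> str:
    """Return an ASCII-only RFC 2047 encoded-word for a header value."""
    return Header(name, charset).encode()


def decode_display_name(raw: str) -> str:
    """Decode a header value that may contain RFC 2047 encoded-words."""
    return str(make_header(decode_header(raw)))


if __name__ == "__main__":
    for charset in ("iso-8859-1", "utf-8"):
        wire = encode_display_name("Pádraig", charset)
        # The wire form is pure ASCII and names its own charset.
        print(charset, wire, decode_display_name(wire))
```

Because the charset travels inside the encoded-word itself, a receiving client can decode the header correctly whichever encoding the sender chose, provided it implements this decoding.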








bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 08 Apr 2021 11:24:04 GMT)



GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.