GNU bug report logs - #47023
df utility displays G instead of GM as unit size for Gigabytes in power of 1000
Report forwarded to bug-coreutils <at> gnu.org: bug#47023; Package coreutils. (Tue, 09 Mar 2021 16:07:02 GMT)
Acknowledgement sent to Philippe Bénézech <philippe.benezech <at> laposte.net>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Tue, 09 Mar 2021 16:07:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
Dear maintainer,
I found a reproducible bug in the df utility, installed on Debian stable
$ df --version |head -1
df (GNU coreutils) 8.30
$ cat /etc/debian_version
10.8
df displays G instead of GM as unit size for Gigabytes in powers of 1000
(but the value is correct)
$ df -BGB /home
Sys. de fichiers blocs de 1GB Utilisé Disponible Uti% Monté sur
/dev/mapper/ssd2        421GB   355GB       45GB  89% /home
$ df -H /home
Sys. de fichiers Taille Utilisé Dispo Uti% Monté sur
/dev/mapper/ssd2   421G    355G   45G  89% /home
As a remark, the display is OK (no bug) with terabytes, megabytes and
kilobytes in powers of 1000
$ df -BTB /home
Sys. de fichiers blocs de 1TB Utilisé Disponible Uti% Monté sur
/dev/mapper/ssd2          1TB     1TB        1TB  89% /home
$ df -BMB /home
Sys. de fichiers blocs de 1MB  Utilisé Disponible Uti% Monté sur
/dev/mapper/ssd2     420611MB 354492MB    44682MB  89% /home
$ df -BKB /home
Sys. de fichiers blocs de 1kB     Utilisé Disponible Uti% Monté sur
/dev/mapper/ssd2  420610057kB 354491245kB 44681065kB  89% /home
Best regards
Information forwarded to bug-coreutils <at> gnu.org: bug#47023; Package coreutils. (Tue, 09 Mar 2021 19:52:02 GMT)
Message #8 received at 47023 <at> debbugs.gnu.org:
unarchive 18119
forcemerge 18119 47023
stop
On 09/03/2021 12:58, Philippe Bénézech via GNU coreutils Bug Reports wrote:
> Dear maintainer,
>
> I found a reproducible bug in the df utility, installed on Debian stable
>
> $ df --version |head -1
> df (GNU coreutils) 8.30
> $ cat /etc/debian_version
> 10.8
>
> df displays G instead of GM as unit size for Gigabytes in power of 1000
> (but the value is correct)
This is not restricted to G
> $ df -BGB /home
> Sys. de fichiers blocs de 1GB Utilisé Disponible Uti% Monté sur
> /dev/mapper/ssd2 421GB 355GB 45GB 89% /home
>
> $ df -H /home
> Sys. de fichiers Taille Utilisé Dispo Uti% Monté sur
> /dev/mapper/ssd2 421G 355G 45G 89% /home
In summary df -H is outputting with a concise single letter,
which is indistinguishable from that of df -h.
I agree that's not ideal as the output can't be
interpreted without the command as context.
I.e. it restricts usage to direct command line usage.
A possible change we could make here would be to use GB, MB etc.
if --si is specified.
But since -h and -H are not really useful outside of direct CLI usage anyway,
I'm 50:50 on changing this.
This was originally discussed at https://bugs.gnu.org/18119
Mentioned there is an option to use the new numfmt functionality
to provide more control and unambiguous output.
BTW, the fact that a B suffix implies SI units is awkward in the first place;
I've documented the reasons at:
https://www.pixelbeat.org/docs/coreutils-gotchas.html#units
cheers,
Pádraig
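As a rough illustration of the numfmt route mentioned above (a sketch only, not the change discussed in this bug): numfmt's `--to=si`, `--to=iec-i` and `--suffix` options can label a raw byte count so that the unit is unambiguous. The byte count below is an assumed round figure loosely based on the 421GB value in the report.

```shell
# Assumed sample value: roughly the filesystem size from the report, in bytes.
bytes=421000000000

# Powers of 1000 with an explicit "B" appended: unambiguous SI-style output.
si=$(numfmt --to=si --suffix=B "$bytes")

# Powers of 1024 with the IEC "i" marker, so it cannot be mistaken for SI.
iec=$(numfmt --to=iec-i --suffix=B "$bytes")

printf '%s\n%s\n' "$si" "$iec"
```

With a reasonably recent coreutils, the same idea can presumably be applied to df itself by asking for raw bytes and post-processing the numeric columns, along the lines of `df -B1 --output=source,size,used,avail /home | numfmt --header --field=2-4 --to=si --suffix=B`.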
Forcibly Merged 18119 47023. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Tue, 09 Mar 2021 19:52:02 GMT)
Information forwarded to bug-coreutils <at> gnu.org: bug#47023; Package coreutils. (Tue, 09 Mar 2021 20:37:02 GMT)
Message #13 received at 47023 <at> debbugs.gnu.org:
On 3/9/21 4:58 AM, Philippe Bénézech via GNU coreutils Bug Reports wrote:
>
> df displays G instead of GM as unit size for Gigabytes in power of 1000
> (but the value is correct)
>
> $ df -BGB /home
> Sys. de fichiers blocs de 1GB Utilisé Disponible Uti% Monté sur
> /dev/mapper/ssd2 421GB 355GB 45GB 89% /home
>
> $ df -H /home
> Sys. de fichiers Taille Utilisé Dispo Uti% Monté sur
> /dev/mapper/ssd2 421G 355G 45G 89% /home
I don't see a bug here. First, I assume you meant to write "GB" rather
than "GM". Second, "df -BGB" is documented to append units (in this
case, "GB") to the output number, whereas "df -H" is merely documented
to append a size indication (in this case, "G").
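A small worked check of the point above, assuming (consistently with the figures quoted in the report) that df rounds sizes up to the next whole block: scaling the reported kB value reproduces the digits printed by `df -BMB` and `df -BGB`, and `df -H` shows the same 421 with only a different suffix. The helper name `ceil_div` is made up for this sketch.

```shell
# Hypothetical helper: integer division that rounds up.
ceil_div() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

size_kb=420610057   # size column from the "df -BKB /home" output in the report

size_mb=$(ceil_div "$size_kb" 1000)      # 1 MB = 1000 kB
size_gb=$(ceil_div "$size_kb" 1000000)   # 1 GB = 1000000 kB

# Same digits as the size column of "df -BMB" and "df -BGB" in the report.
echo "${size_mb}MB ${size_gb}GB"
```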
Information forwarded to bug-coreutils <at> gnu.org: bug#47023; Package coreutils. (Wed, 10 Mar 2021 14:51:01 GMT)
Message #16 received at submit <at> debbugs.gnu.org:
Pádraig, Philippe, Paul -
Pádraig Brady [Tue, 9 Mar 2021 19:51:45 +0000]:
>
> > On 09/03/2021 12:58, Philippe Bénézech via GNU coreutils Bug Reports wrote:
> > Dear maintainer,
> >
> > I found a reproducible bug in the df utility, installed on Debian stable
> >
> > $ df --version |head -1
> > df (GNU coreutils) 8.30
> > $ cat /etc/debian_version
> > 10.8
> >
> > df displays G instead of GM as unit size for Gigabytes in power of 1000
> > (but the value is correct)
>
> This is not restricted to G
>
> > $ df -BGB /home
> > Sys. de fichiers blocs de 1GB Utilisé Disponible Uti% Monté sur
> > /dev/mapper/ssd2 421GB 355GB 45GB 89% /home
> >
> > $ df -H /home
> > Sys. de fichiers Taille Utilisé Dispo Uti% Monté sur
> > /dev/mapper/ssd2 421G 355G 45G 89% /home
> >
>
> In summary df -H is outputting with a concise single letter, which is
> indistinguishable from that of df -h. I agree that's not ideal as the
> output can't be interpreted without the command as context. I.e. it
> restricts usage to direct command line usage. A possible change we could
> make here would be to use GB, MB etc. if --si is specified. But also -h
> and -H are not really useful outside of direct cli usage, I'm 50:50 on
> changing this.
>
> This was originally discussed at https://bugs.gnu.org/18119
>
It was brought up again more recently (Sept. 2020) here:
https://lists.gnu.org/archive/html/coreutils/2020-09/msg00001.html
The above post provided an extensive background and history of the issue,
and a suggested patchset.
>
> Mentioned there is an option to use the new numfmt functionality
> to provide more control and unambiguous output.
>
> BTW the fact that a B suffix implies SI units is awkward in the first
> place, which I've documented the reasons for at:
>
> https://www.pixelbeat.org/docs/coreutils-gotchas.html#units
>
Agree 100% with your statements therein. (And your above document is
referenced as [2] in the above-mentioned posting from September).
Imo, it would be a Very Nice Thing if the program behavior of --si were
brought into accordance with your documentation above (and with Section
2.3 of coreutils.info, which says essentially the same things) rather than
having two sets of mutually conflicting documentation co-existing within
coreutils. The proposed patchset does that.
See the above posting for details. It's very long, but it lays out the entire
story from start to finish, with all the back references that I'm aware of.
- Glenn Golden
Information forwarded to bug-coreutils <at> gnu.org: bug#47023; Package coreutils. (Wed, 10 Mar 2021 21:28:02 GMT)
Message #19 received at submit <at> debbugs.gnu.org:
On 2021/03/10 06:50, Glenn Golden wrote:
> Pádraig, Philippe, Paul -
>
> Pádraig Brady [Tue, 9 Mar 2021 19:51:45 +0000]:
>
>>> On 09/03/2021 12:58, Philippe Bénézech via GNU coreutils Bug Reports wrote:
>>> Dear maintainer,
>>>
>>> I found a reproducible bug in the df utility, installed on Debian stable
>>>
>>> $ df --version |head -1
>>> df (GNU coreutils) 8.30
>>> $ cat /etc/debian_version
>>> 10.8
>>>
>>> df displays G instead of GM as unit size for Gigabytes in power of 1000
>>> (but the value is correct)
>>>
----
The documentation says:
-h, --human-readable
print sizes in powers of 1024 (e.g., 1023M)
-H, --si
print sizes in powers of 1000 (e.g., 1.1G)
How is this a bug?
If the idea is to print a scaling factor and use the minimum
space necessary (1 byte for the prefix), it seems to be doing
exactly what it is documented to do.
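To make the difference between the two documented options concrete, here is a sketch in plain shell arithmetic, with no df involved. The byte count is an assumption (the kB figure from the report multiplied by 1000), and rounding up mirrors what the df outputs quoted in this thread suggest.

```shell
bytes=420610057000                   # assumed: 420610057 kB * 1000

pow1000=$(( 1000 * 1000 * 1000 ))    # what "G" means under -H / --si
pow1024=$(( 1024 * 1024 * 1024 ))    # what "G" means under -h / --human-readable

# Round up to whole units in each base.
g_si=$(( (bytes + pow1000 - 1) / pow1000 ))
g_bin=$(( (bytes + pow1024 - 1) / pow1024 ))

# Both results carry the same one-letter suffix.
echo "-H style: ${g_si}G   -h style: ${g_bin}G"
```

The two numbers differ by about 7% for the same filesystem, yet the suffix alone does not say which base was used.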
Side rant:
Using decimal prefixes with a binary unit (1B=2**3 bits)
defeats the purpose of using a common multiplier for metric.
Since computers use base-2, similar prefixes should be used.
Just because the disk-industry bought and paid for the
ruling to use base-10 doesn't mean that memory comes in
units of 1-million, 1-billion or 1-trillion bytes or
that disk space is organized in decimal units.
I find it amazing that it is the French who most
often vocalize support for the capitalistic-backed
decision.
*sigh*
Second, minor, side rant:
Would be nice if more attention was paid to fixing
mailers encoding "Pádraig" and "Bénézech" as "P�draig"
and "B�n�zech"
*double sigh*
:-)
Information forwarded to bug-coreutils <at> gnu.org: bug#47023; Package coreutils. (Wed, 10 Mar 2021 22:10:01 GMT)
Message #22 received at 47023 <at> debbugs.gnu.org:
L A Walsh <coreutils <at> tlinx.org> [2021-03-10 13:27:15 -0800]:
> On 2021/03/10 06:50, Glenn Golden wrote:
> > Pádraig, Philippe, Paul -
> >
> > Pádraig Brady [Tue, 9 Mar 2021 19:51:45 +0000]:
> > > > On 09/03/2021 12:58, Philippe Bénézech via GNU coreutils Bug Reports wrote:
> > > > Dear maintainer,
> > > >
> > > > I found a reproducible bug in the df utility, installed on Debian stable
> > > >
> > > > $ df --version |head -1
> > > > df (GNU coreutils) 8.30
> > > > $ cat /etc/debian_version
> > > > 10.8
> > > >
> > > > df displays G instead of GM as unit size for Gigabytes in power of 1000
> > > > (but the value is correct)
> ----
> The documentation says:
>
> -h, --human-readable
> print sizes in powers of 1024 (e.g., 1023M)
>
> -H, --si
> print sizes in powers of 1000 (e.g., 1.1G)
>
> How is this a bug?
>
> If the idea is to print a scaling factor and use the minimum
> space necessary (1 byte for the prefix), it seem to be doing
> exactly what it is documented to do.
>
If you read the referenced post
https://lists.gnu.org/archive/html/coreutils/2020-09/msg00001.html
you'll understand what the issue is.
>
> Side rant:
> Using decimal prefixes with a binary unit (1B=2**3 bits)
> defeats the purpose of using a common multiplier for metric.
> Since computers use base-2, similar prefixes should be used.
> Just because the disk-industry bought and paid for the
> ruling to use base-10 doesn't mean that memory comes in
> units of 1-million, 1-billion or 1-trillion bytes or
> that disk space is organized in decimal units.
>
We've been thru all this before. See
https://lists.gnu.org/archive/html/coreutils/2020-09/msg00007.html
>
> Second, minor, side rant:
> Would be nice if more attention was paid to fixing mailers encoding
> "Pádraig" and "Bénézech" as "P�draig" and "B�n�zech"
>
If you see substitute encodings like that, it strongly suggests the problem
is your MUA, not mine. My posting to the list (the one you quote from above)
shows the accented characters correctly. I'll be happy to send you a screenshot
if you wish.
- Glenn Golden
Information forwarded to bug-coreutils <at> gnu.org: bug#47023; Package coreutils. (Wed, 10 Mar 2021 22:51:02 GMT)
Message #25 received at submit <at> debbugs.gnu.org:
On 2021/03/10 14:09, Glenn Golden wrote:
>
>> Second, minor, side rant:
>> Would be nice if more attention was paid to fixing mailers encoding
>> "Pádraig" and "Bénézech" as "P�draig" and "B�n�zech"
>>
>
> If you see substitute encodings like that, it strongly suggests the problem
> is your MUA, not mine.
----
It's yours.
You are using a local 8-bit encoding, whereas everyone else was
using UTF-8, as most people do these days. Their original messages
with accents came through just fine in UTF-8, but your mailer
re-encoded them into one of the 8-bit Western encodings, and that
re-encoding didn't display properly.
Information forwarded to bug-coreutils <at> gnu.org: bug#47023; Package coreutils. (Wed, 10 Mar 2021 23:22:02 GMT)
Message #28 received at 47023 <at> debbugs.gnu.org:
On 3/10/21 2:50 PM, L A Walsh wrote:
> You are using a local 8-bit encoding, whereas everyone else was
> using UTF-8. Your mailer re-encoded their messages into one
> of the 8-bit western encodings, whereas most people use UTF-8
> these days, so while their original messages with accents came
> through just fine in UTF-8, your re-encoding into Western didn't
> display properly.
Although his email did reencode those names into ISO 8859-1 which is
more likely to cause problems than cure them these days, it still
displays well on my MUA (Thunderbird) because its header said
"Content-Type: text/plain; charset=iso-8859-1". His email is also
displaying properly in the archive
<https://debbugs.gnu.org/cgi/bugreport.cgi?bug=47023#16>, as the
archiving software reencodes those names back into UTF-8 and the web
page uses "Content-Type: text/html; charset=utf-8" for all the emails.
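The role of the declared charset can be reproduced in a shell, as a sketch that assumes an `iconv` supporting ISO-8859-1 and UTF-8. The byte 0xE1 is "á" in ISO 8859-1; converting with the declared source charset yields the two-byte UTF-8 sequence 0xC3 0xA1. The function name `latin1_to_utf8` is made up for this example.

```shell
# "Pádraig" as ISO 8859-1 bytes: 0xE1 (octal 341) is "á".
latin1_to_utf8() {
  printf 'P\341draig' | iconv -f ISO-8859-1 -t UTF-8
}

# Honoring the declared charset gives valid UTF-8; dump the bytes in hex.
latin1_to_utf8 | od -An -tx1
```

A lone 0xE1 followed by an ASCII letter is not a valid UTF-8 sequence, so a reader that skipped the charset and assumed UTF-8 would typically show a replacement character at that position.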
Possibly your email client is programmed to ignore encodings in incoming
"Content-Type" lines; that would explain the glitches you saw.
Information forwarded to bug-coreutils <at> gnu.org: bug#47023; Package coreutils. (Thu, 11 Mar 2021 04:14:02 GMT)
Message #31 received at submit <at> debbugs.gnu.org:
On 2021/03/10 15:21, Paul Eggert wrote:
> Although his email did reencode those names into ISO 8859-1 which is
> more likely to cause problems than cure them these days, it still
> displays well on my MUA (Thunderbird) because its header said
> "Content-Type: text/plain; charset=iso-8859-1". His email is also
> displaying properly in the archive
> <https://debbugs.gnu.org/cgi/bugreport.cgi?bug=47023#16>, as the
> archiving software reencodes those names back into UTF-8 and the web
> page uses "Content-Type: text/html; charset=utf-8" for all the emails.
>
> Possibly your email client is programmed to ignore encodings in incoming
> "Content-Type" lines; that would explain the glitches you saw.
----
I also have Tbird. I can make it adapt to the content's encoding, but these
days I usually have it set to ignore non-UTF-8, because occasionally
someone has an old mailer that sends out an 8-bit encoding, and Tbird would
seamlessly adapt and send the reply back out in the same format.
I didn't want to pass on bad encodings when replying to such
items on a list, so I changed that default.
However, what Tbird doesn't handle well is local encodings
used in the 'To' or subject lines, since there is no encoding line
for those. A text archive can convert and display both in
UTF-8, but even if I set Tbird to respond in the same encoding,
it won't fix the encodings in the headers, as the content-type: text/plain
doesn't apply to the headers.
So while archives and the email text can be fixed, the text in
the headers often can't. The only thing to do is bring it to the sender's
attention and hope they have the technical skills to set their mailer to use
UTF-8.
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 08 Apr 2021 11:24:04 GMT)
This bug report was last modified 4 years and 154 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.