GNU bug report logs - #56710
ls vs. stat display of st_size

Package: coreutils

Reported by: Andreas Schwab <schwab <at> linux-m68k.org>

Date: Fri, 22 Jul 2022 20:10:02 UTC

Severity: normal

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 56710 in the body.
You can then email your comments to 56710 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-coreutils <at> gnu.org:
bug#56710; Package coreutils. (Fri, 22 Jul 2022 20:10:02 GMT)

Acknowledgement sent to Andreas Schwab <schwab <at> linux-m68k.org>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Fri, 22 Jul 2022 20:10:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Andreas Schwab <schwab <at> linux-m68k.org>
To: bug-coreutils <at> gnu.org
Subject: ls vs. stat display of st_size
Date: Fri, 22 Jul 2022 22:09:49 +0200
$ ls -l /proc/kcore
-r-------- 1 root root 18446744000862892032 Jun 21 00:00 /proc/kcore
$ stat -c %s /proc/kcore
-72846659584

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."
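
The two outputs in the report are the same 64-bit pattern read two ways: 2^64 - 72846659584 = 18446744000862892032. A minimal sketch of that relationship follows, assuming a 64-bit off_t modeled here as int64_t; the helper names are hypothetical and are not taken from coreutils.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Reinterpret a signed 64-bit size as unsigned.  The conversion is
   well defined in C: the value wraps modulo 2^64.  */
static uint64_t size_as_unsigned(int64_t size)
{
    return (uint64_t) size;
}

/* Print one size value in both renderings, signed first.  */
static void print_both(int64_t size)
{
    printf("signed:   %" PRId64 "\n", size);
    printf("unsigned: %" PRIu64 "\n", size_as_unsigned(size));
}
```

Calling print_both(-72846659584) would show the value stat printed on the first line and the value ls printed on the second.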




Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Fri, 22 Jul 2022 20:53:01 GMT)

Notification sent to Andreas Schwab <schwab <at> linux-m68k.org>:
bug acknowledged by developer. (Fri, 22 Jul 2022 20:53:02 GMT)

Message #10 received at 56710-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Andreas Schwab <schwab <at> linux-m68k.org>
Cc: 56710-done <at> debbugs.gnu.org
Subject: Re: bug#56710: ls vs. stat display of st_size
Date: Fri, 22 Jul 2022 13:52:33 -0700
Thanks for reporting that. I installed the attached.
[0001-stat-c-s-now-prints-unsigned.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#56710; Package coreutils. (Sat, 23 Jul 2022 12:18:01 GMT)

Message #13 received at 56710 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: 56710 <at> debbugs.gnu.org, eggert <at> cs.ucla.edu, schwab <at> linux-m68k.org
Subject: Re: bug#56710: ls vs. stat display of st_size
Date: Sat, 23 Jul 2022 13:17:38 +0100
On 22/07/2022 21:52, Paul Eggert wrote:
> Thanks for reporting that. I installed the attached.

Playing devil's advocate, this takes the stance that
st_size should always be treated as unsigned
(given that stat(1) is a lower level util than ls(1)).

This is only a real consideration for virtual files I think
since off_t is signed, and so impractical for a real file system
to support files > OFF_T_MAX.
In this case /proc/kcore is a virtual file, with the
size representing the VM size (guessing riscv64 in this case).
But other virtual files may set st_size = -1,
to represent an unknown file size, which with the change,
scripts using stat(1) can no longer rely on?
Perhaps the "-1" case could be specialized for this.

BTW I see we've code in cache_fstatat() that assumes
st_size can't have such large values, which contradicts a bit.

BTW assuming that st_size is unsigned, reminds me of this change where
we cast all st_size to unsigned, which also allowed us to enable -Wsign-compare:
https://lists.gnu.org/archive/html/bug-coreutils/2009-01/msg00050.html

cheers,
Pádraig
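
The "-1" special case floated above could look roughly like the sketch below. This is only an illustration of the idea, not the code that was installed in stat(1); format_size is a hypothetical helper, and off_t is modeled as int64_t.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: render a size as unsigned, except that -1 is
   kept verbatim so scripts can still detect an "unknown size" marker.
   Returns the snprintf result (number of characters needed).  */
static int format_size(int64_t size, char *buf, size_t len)
{
    if (size == -1)
        return snprintf(buf, len, "-1");
    return snprintf(buf, len, "%" PRIu64, (uint64_t) size);
}
```

With this rule any other negative value, such as the /proc/kcore size in the report, would still print as a large unsigned number.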




Information forwarded to bug-coreutils <at> gnu.org:
bug#56710; Package coreutils. (Sat, 23 Jul 2022 20:08:02 GMT)

Message #16 received at 56710 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 56710 <at> debbugs.gnu.org, schwab <at> linux-m68k.org
Subject: Re: bug#56710: ls vs. stat display of st_size
Date: Sat, 23 Jul 2022 13:07:28 -0700
On 7/23/22 05:17, Pádraig Brady wrote:

> BTW I see we've code in cache_fstatat() that assumes
> st_size can't have such large values, which contradicts a bit.

Good catch. I installed the first attached patch.


> This is only a real consideration for virtual files I think
> since off_t is signed, and so impractical for a real file system
> to support files > OFF_T_MAX.

Yes, that sounds right.

You've convinced me that 'ls' should switch to the way 'stat' behaves 
rather than vice versa; that's more useful anyway. How about the 
attached second patch, which I haven't installed? (I was actually 
inclined this way originally but got lazy.)
[0001-rm-don-t-assume-st_size-is-nonnegative.patch (text/x-patch, attachment)]
[0002-ls-print-negative-file-sizes-as-negative.patch (text/x-patch, attachment)]

Information forwarded to bug-coreutils <at> gnu.org:
bug#56710; Package coreutils. (Sun, 24 Jul 2022 08:49:02 GMT)

Message #19 received at 56710 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 56710 <at> debbugs.gnu.org, schwab <at> linux-m68k.org
Subject: Re: bug#56710: ls vs. stat display of st_size
Date: Sun, 24 Jul 2022 09:48:11 +0100
On 23/07/2022 21:07, Paul Eggert wrote:
> On 7/23/22 05:17, Pádraig Brady wrote:
> 
>> BTW I see we've code in cache_fstatat() that assumes
>> st_size can't have such large values, which contradicts a bit.
> 
> Good catch. I installed the first attached patch.
> 
> 
>> This is only a real consideration for virtual files I think
>> since off_t is signed, and so impractical for a real file system
>> to support files > OFF_T_MAX.
> 
> Yes, that sounds right.
> 
> You've convinced me that 'ls' should switch to the way 'stat' behaves
> rather than vice versa; that's more useful anyway. How about the
> attached second patch, which I haven't installed? (I was actually
> inclined this way originally but got lazy.)

Well ls(1) was explicitly changed to assuming only positive,
citing POSIX (though I can't see it in POSIX myself):
https://github.com/coreutils/coreutils/commit/67ba4ac01

Also ls(1) can sort by size, which gives a little more
credence to assuming positive only size.

Also ls(1) is a bit higher level, more human facing than stat(1).

For these reasons I would keep ls(1) as is (assuming positive).

As for stat(1), it's now consistent with ls(1) which has some benefit.
It is lower level though, so in my mind it might be better
to output the raw value, especially since it's such an edge case.

So I'd leave ls(1) as is, and I'll leave it up to you
how to handle stat(1) given the above points.

cheers,
Pádraig




Information forwarded to bug-coreutils <at> gnu.org:
bug#56710; Package coreutils. (Sun, 24 Jul 2022 16:19:02 GMT)

Message #22 received at 56710 <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Pádraig Brady <P <at> draigBrady.com>
Cc: 56710 <at> debbugs.gnu.org, schwab <at> linux-m68k.org
Subject: Re: bug#56710: ls vs. stat display of st_size
Date: Sun, 24 Jul 2022 09:18:45 -0700
On 7/24/22 01:48, Pádraig Brady wrote:

> Well ls(1) was explicitly changed to assuming only positive,
> citing POSIX (though I can't see it in POSIX myself):
> https://github.com/coreutils/coreutils/commit/67ba4ac01

I vaguely recall being involved with that decades-old change. The POSIX 
requirement is here:

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ls.html#tag_20_73_10

(look for "%u").


> Also ls(1) can sort by size, which gives a little more
> credence to assuming positive only size.

I don't see why; negative sizes sort just as well as positive ones do.


> For these reasons I would keep ls(1) as is (assuming positive).
> 
> As for stat(1), it's now consistent with ls(1) which has some benefit.
> It is lower level though, so in my mind it might be better
> to output the raw value, especially since it's such an edge case.
> 
> So I'd leave ls(1) as is, and I'll leave it up to you
> how to handle stat(1) given the above points.

Consistency is reasonably important here (as per the original bug 
report), so if those are the choices let's leave things as-is.
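
On the sorting point: both readings of st_size give a well-defined order, and the only difference is where a negative value lands. The comparators below are a sketch for illustration, assuming sizes held as int64_t; they are hypothetical and do not reproduce the comparison code in ls.

```c
#include <stdint.h>
#include <stdlib.h>

/* Order sizes as signed values: negative sizes sort before zero.  */
static int cmp_signed(const void *a, const void *b)
{
    int64_t x = *(const int64_t *) a;
    int64_t y = *(const int64_t *) b;
    return (x > y) - (x < y);
}

/* Order the same bits as unsigned values: a "negative" size is
   treated as a huge number and sorts after every ordinary size.  */
static int cmp_unsigned(const void *a, const void *b)
{
    uint64_t x = (uint64_t) *(const int64_t *) a;
    uint64_t y = (uint64_t) *(const int64_t *) b;
    return (x > y) - (x < y);
}
```

Passing either function to qsort on an array of sizes yields a consistent ascending order; for the sizes 4096, -1, 0 the signed order is -1, 0, 4096 and the unsigned order is 0, 4096, -1.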




Information forwarded to bug-coreutils <at> gnu.org:
bug#56710; Package coreutils. (Sun, 24 Jul 2022 17:15:01 GMT)

Message #25 received at 56710 <at> debbugs.gnu.org:

From: Pádraig Brady <P <at> draigBrady.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 56710 <at> debbugs.gnu.org, schwab <at> linux-m68k.org
Subject: Re: bug#56710: ls vs. stat display of st_size
Date: Sun, 24 Jul 2022 18:14:31 +0100
On 24/07/2022 17:18, Paul Eggert wrote:
> On 7/24/22 01:48, Pádraig Brady wrote:
> 
>> Well ls(1) was explicitly changed to assuming only positive,
>> citing POSIX (though I can't see it in POSIX myself):
>> https://github.com/coreutils/coreutils/commit/67ba4ac01
> 
> I vaguely recall being involved with that decades-old change. The POSIX
> requirement is here:
> 
> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ls.html#tag_20_73_10
> 
> (look for "%u").

Right, that's fairly conclusive for ls.

>> Also ls(1) can sort by size, which gives a little more
>> credence to assuming positive only size.
> 
> I don't see why; negative sizes sort just as well as positive ones do.

Fair enough.

>> For these reasons I would keep ls(1) as is (assuming positive).
>>
>> As for stat(1), it's now consistent with ls(1) which has some benefit.
>> It is lower level though, so in my mind it might be better
>> to output the raw value, especially since it's such an edge case.
>>
>> So I'd leave ls(1) as is, and I'll leave it up to you
>> how to handle stat(1) given the above points.
> 
> Consistency is reasonably important here (as per the original bug
> report), so if those are the choices let's leave things as-is.

Cool.

For reference stat(1) on FreeBSD takes the lower level approach,
outputting signed by default (I presume from looking at the man page),
and allowing the user to override that.  I.e. it defaults
to `stat -f %z` but the user can override to `stat -f %Uz`.
We don't have many letters left to play with but I suppose
we could default to unsigned (as we now are) and support %Is etc.
for signed integer quantities. I'm not suggesting we need this,
just thinking out loud.

cheers,
Pádraig




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Mon, 22 Aug 2022 11:24:04 GMT)

This bug report was last modified 2 years and 354 days ago.

