GNU bug report logs -
#35531
problem with ls in coreutils
Reported by: Viktors Berstis <cugnujm <at> berstis.com>
Date: Wed, 1 May 2019 22:53:01 UTC
Severity: normal
Tags: notabug
Done: Pádraig Brady <P <at> draigBrady.com>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 35531 in the body.
You can then email your comments to 35531 AT debbugs.gnu.org in the normal way.
Report forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Wed, 01 May 2019 22:53:03 GMT)
Acknowledgement sent to Viktors Berstis <cugnujm <at> berstis.com>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Wed, 01 May 2019 22:53:03 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
When running "ls" or "ls -U" on a Windows directory containing 50000
files, ls takes forever. Something seems to be highly inefficient in there.
This is for the 64-bit version, build 4/20/2005 11:41AM. The exe size is
180736 bytes.
Thanks.
- Viktors Berstis
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Thu, 02 May 2019 05:45:02 GMT)
Message #8 received at 35531 <at> debbugs.gnu.org (full text, mbox):
On Thursday, May 2, 2019 12:03:31 AM CEST Viktors Berstis wrote:
> When running "ls" or "ls -U" on a windows directory containing 50000
> files, ls takes forever. Something seems to be highly inefficient in there.
Could you please try it with ls -U -1?
Kamil
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Thu, 02 May 2019 05:46:02 GMT)
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Thu, 02 May 2019 23:18:01 GMT)
Message #14 received at 35531 <at> debbugs.gnu.org (full text, mbox):
My machine has 64 GB of RAM, a 6-core 3.5 GHz processor and fast disks.
The directory in question has 57,600 files in it with a total size of
about 47 GB.
On a freshly booted machine (nothing cached), "dir /on dirname | wc"
takes about 6 seconds. The second time it takes about 2 seconds.
On a freshly booted machine, "ls -U -1 dirname | wc" takes 5 minutes 48
seconds! A second run is about a minute faster.
ls might be doing something akin to opening every file. If I run a
program to actually open and read every file in that directory, the
system seems to cache it all in RAM. Then the ls takes only about 11
seconds.
- Viktors Berstis
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Fri, 03 May 2019 00:12:02 GMT)
Message #17 received at 35531 <at> debbugs.gnu.org (full text, mbox):
It's probably something inside the kernel (e.g., filesystem code).
What does the shell command 'strace -o /tmp/tr -s 128 -T ls -U -1
dirname | wc' say? You can see which system calls are taking the most
time by then running 'sort -t"<" -k2n /tmp/tr'. On my platform (Fedora
29 x86-64 ext4, an older desktop with only disk drives), the hoggiest
syscalls are getdents64, which are as much as 24 ms per call when the
data are not cached, and more like 0.7 ms per call when the data are
cached (each such call retrieves about 1000 directory entries). What do
you see?
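The post-processing step suggested above can be wrapped in a small helper. This is a minimal sketch, assuming a log produced by `strace -T` (which appends each call's elapsed time as `<seconds>` at the end of the line). The function name `slowest_syscalls` and the default of 10 lines are illustrative and not part of strace or coreutils.

```shell
# Print the N slowest system calls recorded in an `strace -T` log.
# Usage: slowest_syscalls TRACE_FILE [COUNT]
slowest_syscalls() {
    trace_file=$1
    count=${2:-10}
    # Split each line on "<" so that field 2 starts with the elapsed time
    # (e.g. "0.024000>"); -n sorts on that leading number, slowest last.
    # Caveat: a line whose arguments also contain "<" will be misplaced.
    LC_ALL=C sort -t'<' -k2n "$trace_file" | tail -n "$count"
}

# Example (Linux only; /tmp/tr and dirname as in the message above):
#   strace -o /tmp/tr -s 128 -T ls -U -1 dirname | wc
#   slowest_syscalls /tmp/tr 5
```

Sorting in ascending order and taking the tail keeps the slowest call on the last line, which matches what the original `sort` command leaves at the bottom of the output.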
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Fri, 03 May 2019 01:19:02 GMT)
Message #20 received at 35531 <at> debbugs.gnu.org (full text, mbox):
I am running coreutils on Windows, not Linux, so strace does not work
there.
The November 10, 1999 version 3.16 of the coreutils "ls" command is
lightning fast on Windows (and on the large directory) but unfortunately
stops at 32K files. The newer version of "ls" built for Windows has the
problem.
By "new" version, I mean the 64-bit build for Windows dated
4/20/2005 at 11:41AM with exe size of 180736 bytes, md5sum
47ba770d80382cbd66ddba13924c1417, version 5.3.0. I didn't see a place
to download a newer binary version to try.
BTW, booting the machine with Ubuntu, ls on that same large directory is
very fast.
- Viktors Berstis
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Fri, 03 May 2019 01:21:01 GMT)
Message #23 received at 35531 <at> debbugs.gnu.org (full text, mbox):
On 5/2/19 5:41 PM, Viktors Berstis wrote:
> The newer version of "ls" built for Windows has the problem.
Ah, then you'll have to talk to whoever built that version, which is not
me (and generally speaking they don't hang out on this mailing list).
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Fri, 03 May 2019 03:44:02 GMT)
Message #26 received at 35531 <at> debbugs.gnu.org (full text, mbox):
I downloaded it from
https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5.3.0.exe/download
The help said "Report bugs to <bug-coreutils <at> gnu.org>" which is what I did.
The build is so old that I suspect none of the original players are around.
Do you know of a Windows binary or Windows source that is newer
anywhere? Thanks.
- Viktors Berstis
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Fri, 03 May 2019 11:16:02 GMT)
Message #29 received at 35531 <at> debbugs.gnu.org (full text, mbox):
On Friday, May 3, 2019 5:43:20 AM CEST Viktors Berstis wrote:
> I downloaded it from
> https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5.3.0.exe/download
> The help said "Report bugs to <bug-coreutils <at> gnu.org>" which is what I did.
> The build is so old that I suspect none of the original players are around.
> Do you know of a windows binary or windows source that is newer
> anywhere? Thanks.
>
> - Viktors Berstis
`ls -U1` will not run significantly faster than `ls` on powerful hardware.
The key difference is that `ls -U1` prints the results continuously as the
list of files is read from file system whereas `ls` will be silent until
the complete list is read. You need to use a new enough version of coreutils
for this to work properly. This optimisation was introduced in coreutils-7.5:
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.0~113
https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v7.5~49
Kamil
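A small way to try the unsorted, one-name-per-line mode described above, assuming a GNU `ls` (the `-U` flag means something different in some other implementations). The scratch directory and file names are made up for illustration, and 1000 files is far too few to show a timing difference; the sketch only shows the invocation.

```shell
# Create a scratch directory with 1000 empty files (illustrative names).
dir=$(mktemp -d)
i=1
while [ "$i" -le 1000 ]; do
    : > "$dir/file$i"
    i=$((i + 1))
done

# -U: do not sort, list entries in directory order; -1: one name per line.
count=$(ls -U -1 "$dir" | wc -l)

# With a streaming ls, a reader such as head can pick up the first
# entry without waiting for the whole directory to be read.
first=$(ls -U -1 "$dir" | head -n 1)

echo "entries: $count"
echo "first entry seen: $first"
rm -rf "$dir"
```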
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Fri, 03 May 2019 15:57:02 GMT)
Message #32 received at 35531 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/html, inline)]
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Fri, 03 May 2019 16:14:02 GMT)
Message #35 received at 35531 <at> debbugs.gnu.org (full text, mbox):
On 5/2/19 8:43 PM, Viktors Berstis wrote:
> I downloaded it from
> https://sourceforge.net/projects/gnuwin32/files/coreutils/5.3.0/coreutils-5.3.0.exe/download
>
> The help said "Report bugs to <bug-coreutils <at> gnu.org>" which is what I
> did.
Whoever built it just copied that line from upstream. If the build has
MS-Windows-specific problems, you'll need to find an MS-Windows person
somewhere who can fix it, or find a better build somewhere. This
bug-reporting system is not the best place to do that; see:
https://www.gnu.org/prep/standards/html_node/System-Portability.html
and look for "Windows".
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Fri, 03 May 2019 16:25:01 GMT)
Message #38 received at 35531 <at> debbugs.gnu.org (full text, mbox):
On Friday, May 3, 2019 5:56:35 PM CEST Viktors Berstis wrote:
> I don't think the problem has anything to do with sorting or -U1.
It was unclear what you meant by "the problem" so I pointed out the only
inefficiency that was immediately obvious to me.
> When ls is taking over 5 minutes for something that should run in a
> couple of seconds, the task manager shows that it is using nearly no
> CPU.... it is doing a lot of "other I/O".
You can try to use some profiling/tracing tools to debug the root cause.
> It doesn't look like the build you referenced is designed to be
> compileable for Windows. Is there one that is? Thanks.
I would suggest building the latest upstream release (coreutils-8.31 now)
from:
https://www.gnu.org/software/coreutils/
Kamil
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Sat, 04 May 2019 01:05:02 GMT)
Message #41 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi
Although this bug report seems to be a problem with the windows port
of ls, it reminded me of an interesting investigation into slow ls
speeds due to colorizing via the LS_COLORS environment variable.
See
https://news.sherlock.stanford.edu/posts/when-setting-an-environment-variable-gives-you-a-40-x-speedup
I thought it an interesting case study.
Regards - PSDE
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Sat, 04 May 2019 05:28:01 GMT)
Message #44 received at submit <at> debbugs.gnu.org (full text, mbox):
On 5/1/2019 3:03 PM, Viktors Berstis wrote:
> When running "ls" or "ls -U" on a windows directory containing 50000
> files, ls takes forever. Something seems to be highly inefficient in there.
>
---
It sounds like you are running ls with no options
(nothing in the environment and no switches on the command line).
Is this the case? If it is, I'm stumped unless whoever
compiled it set it to do some things by default.
Basically, on Windows, anything that you might get away with on
Linux with a stat call takes an 'open' call. That gets
costly. It applies to anything that appends a classifier to the end of
the file name (like ls -F, --classify or --file-type) or that would
display any of the data or size information (ls -l would be right out!).
The only thing 'ls' could display without such a penalty is the file
name. However, that only applies to stock ls, and since we don't know
what options might have been enabled for that 'ls' (including any
default usage of switches such as those mentioned above), it's
hard to say exactly what the problem is.
A suggestion: try installing a minimal snapshot of 'Cygwin'
('cygwin.org') and try env -i /bin/ls on Cygwin's command line
in that directory and see how fast that is. If it is slow, then
something excessively weird is going on, which is the wonder of
closed-source Windows. However, my hunch is that it will be 'fast', but
since I don't know the cause, I can't say whether that would help or not.
One further possibility that I'd think unlikely: the directory could
be very fragmented and take a long time to read (5 minutes?! Really
unlikely; it almost has to be the missing stat call), though the figures
you are stating sound out of bounds for a fragmented directory.
Still, if you grab the 'contig' tool from the sysinternals site (a
windows subsite), it can show you the number of fragments a file
is split into -- and can be used on directories:
/prog/Sysinternals/cmd> contig -a -v .
Contig v1.6 - Makes files contiguous
Copyright (C) 1998-2010 Mark Russinovich
Sysinternals - www.sysinternals.com
------------------------
Processing C:\prog\Sysinternals\cmd:
Scanning file...
Cluster: Length
0: 3
File size: 12288 bytes
C:\prog\Sysinternals\cmd is in 1 fragment
------------------------
Summary:
Number of files processed : 1
Average fragmentation : 1 frags/file
========
Other than those options, I'm not sure what else to suggest to narrow
it down, but thought I'd at least mention a few possibilities.
Good luck!
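The clean-environment check suggested above might look like the following sketch in a Cygwin or Linux shell. It assumes `ls` is installed at `/bin/ls` and is a GNU build (for `-U`); the scratch directory and file names are illustrative.

```shell
# Scratch directory with three empty files (illustrative names).
dir=$(mktemp -d)
: > "$dir/a"
: > "$dir/b"
: > "$dir/c"

# Show how the current shell resolves "ls" (alias, function, or a path),
# which reveals any default switches added by the shell setup.
type ls

# env -i starts the command with an empty environment, so LS_COLORS and
# similar variables have no effect; calling /bin/ls by its full path
# bypasses shell aliases and functions.
clean_count=$(env -i /bin/ls -U -1 "$dir" | wc -l)
echo "entries seen by a clean ls: $clean_count"
rm -rf "$dir"
```

If the clean invocation is fast while the everyday `ls` is slow, the difference most likely comes from default switches or environment settings rather than from the directory itself.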
Information forwarded to bug-coreutils <at> gnu.org: bug#35531; Package coreutils. (Fri, 10 May 2019 10:13:01 GMT)
Message #47 received at 35531 <at> debbugs.gnu.org (full text, mbox):
tag 35531 notabug
close 35531
stop
On 03/05/19 17:01, Peter Edwards wrote:
> Hi
>
> Although this bug report seems to be a problem with the windows port
> of ls, it reminded me of an interesting investigation into slow ls
> speeds due to colorizing via the LS_COLORS environment variable.
>
> See
> https://news.sherlock.stanford.edu/posts/when-setting-an-environment-variable-gives-you-a-40-x-speedup
>
> I thought it an interesting case study.
Thanks for the info.
In summary, to significantly speed up the processing that ls does for
coloring, disable the stat() and getxattr() calls with:
LS_COLORS='ex=00:su=00:sg=00:ca=00:'
A general point though is that colors are for human processing,
and how fast can one process the output from ls :)
I.e., if ls output is being written to a pipe or file, or somewhere else
where speed may be important, coloring is disabled by default anyway.
cheers,
Pádraig
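A minimal sketch of applying the setting above to a single invocation, assuming a GNU `ls` that supports `--color`. The scratch directory and file name are illustrative. Because the output is captured rather than sent to a terminal, `--color=always` is used here to force coloring on; without it the default would leave coloring off, as noted above.

```shell
# Scratch directory with one ordinary file (illustrative name).
dir=$(mktemp -d)
: > "$dir/plain.txt"

# Set LS_COLORS only for this one command. Per the message above, setting
# these four keys to 00 lets ls skip the extra stat() and getxattr()
# lookups it would otherwise make in order to choose a color.
listing=$(LS_COLORS='ex=00:su=00:sg=00:ca=00:' ls --color=always "$dir")
echo "$listing"
rm -rf "$dir"
```

To make the setting persistent, the same assignment can be exported from a shell startup file instead of being prefixed to one command.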
Added tag(s) notabug. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Fri, 10 May 2019 10:13:02 GMT)
bug closed, send any further explanations to 35531 <at> debbugs.gnu.org and Viktors Berstis <cugnujm <at> berstis.com>. Request was from Pádraig Brady <P <at> draigBrady.com> to control <at> debbugs.gnu.org. (Fri, 10 May 2019 10:13:02 GMT)
bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 07 Jun 2019 11:24:06 GMT)
This bug report was last modified 6 years and 7 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.