GNU bug report logs - #8200: cp -lr uses a lot of CPU time.
Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#8200; Package coreutils. (Tue, 08 Mar 2011 05:23:02 GMT)
Acknowledgement sent to Rogier Wolff <R.E.Wolff <at> BitWizard.nl>: New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Tue, 08 Mar 2011 05:23:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
Hi,
In my backup scripts I need a "cp -lr" every day. I make backups of
directories that hold up to millions of files.
When
cp -lr source dest
runs for a while, it becomes CPU limited. Virtual memory is only about
2 MB; "resident" is under 1 MB.
Top reports:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26721 root 20 0 2456 720 468 R 58.0 0.1 65:32.60 cp
2855 root 20 0 2560 936 624 R 40.8 0.1 30:30.52 cp
and I doubt they are halfway through.
I wrote an application I call "cplr" which does the obvious in the
obvious manner, and it doesn't have this problem.
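Doing "the obvious in the obvious manner" for a hardlink copy can be sketched in shell (an illustrative stand-in, not the author's actual cplr code; assumes GNU stat for the link-count check): recreate the directory tree, then hard-link every non-directory.

```shell
set -e
cd "$(mktemp -d)"          # scratch directory for the demo
mkdir -p src/sub
echo hello > src/sub/f
# recreate the directory tree...
(cd src && find . -type d) | while read -r d; do mkdir -p "dst/$d"; done
# ...then hard-link every non-directory into place
(cd src && find . ! -type d) | while read -r f; do ln "src/$f" "dst/$f"; done
stat -c %h src/sub/f       # prints 2: src/sub/f and dst/sub/f share one inode
```

A real implementation would also copy permissions and handle symlinks and unusual filenames; this only shows the mkdir-then-link structure.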
I've run strace and determined that cp is not making system calls
that take much CPU time; most system calls return in microseconds.
The time spent /between/ system calls, however, runs into the hundreds
of milliseconds. You might say: well, that's way less than a
second. Sure. But if you need to do that tens of thousands of times, it
becomes quite significant...
So my question is: why does cp -lr take such ridiculous amounts of CPU
time?
Or, put another way: BUG REPORT: cp -lr takes unnecessary amounts of CPU
time.
Roger.
--
** R.E.Wolff <at> BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
** Delftechpark 26 2628 XH Delft, The Netherlands. KVK: 27239233 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement.
Does it sit on the couch all day? Is it unemployed? Please be specific!
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#8200; Package coreutils. (Tue, 08 Mar 2011 15:06:02 GMT)
Message #8 received at 8200 <at> debbugs.gnu.org (full text, mbox):
Rogier Wolff wrote:
> In my backup scripts I need a "cp -lr" every day. I make backups of
> directories that hold up to millions of files.
>
> When
>
> cp -lr source dest
>
> runs for a while, it becomes CPU limited. Virtual memory is only about
> 2 MB; "resident" is under 1 MB.
Thank you for the bug report.
That sounds like there is a serious problem, somewhere.
If you give us enough information, we'll find the cause.
For starters, what version of cp did you use?
Run cp --version
> Top reports:
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 26721 root 20 0 2456 720 468 R 58.0 0.1 65:32.60 cp
> 2855 root 20 0 2560 936 624 R 40.8 0.1 30:30.52 cp
>
> and I doubt they are half way through.
>
> I wrote an application I call "cplr" which does the obvious in the
> obvious manner, and it doesn't have this problem.
>
> I've run strace and determined that cp is not making system calls
> that take much CPU time; most system calls return in microseconds.
Please give us a sample listing of the syscalls that strace
shows you when you trace one of those long-running cp commands.
A few hundred lines worth would be good.
> The time spent /between/ system calls runs into the hundreds of
> milliseconds. You might say: well, that's way less than a
> second. Sure. But if you need to do that tens of thousands of times, it
> becomes quite significant...
>
>
> So my question is: why does cp -lr take such ridiculous amounts of CPU
> time?
>
> Or, put another way: BUG REPORT: cp -lr takes unnecessary amounts of CPU
> time.
What type of file system are you using, and is it nearly full?
Run this from, e.g., the source directory: df -hT .
Ideally, you'd attach to one of those processes with gdb and step through
the code enough to tell us where it's spending its time, presumably in
coreutils-8.10/src/copy.c. Just running "gdb -p 26721" (where 26721
is the PID of one of your running cp processes) and typing "backtrace"
at the prompt may give us a good clue.
Next best, you would give us access to your system or a copy of your hierarchy.
But we don't even ask for that, because that's rarely feasible.
Next best: you would give us the output of these two commands:
[if you can do this, please respond privately, not to the list]
find YOUR_SRC_DIR -ls | xz -e > src.find.xz
find YOUR_DST_DIR -ls | xz -e > dst.find.xz
[if you don't have xz, install it or use bzip2 -9 instead of "xz -e";
xz is better]
With that, we'd get an idea of hard link counts and max/average
number of entries per directory, name length, etc.
However, most people don't want to share file names like that.
If you can, please put those two compressed files somewhere like
an upload site and reply with links to them.
Otherwise, please give us some statistics describing your
two hierarchies by running these commands:
These give counts of files and directories for each of your source
and destination directories:
find YOUR_SRC_DIR -type f |wc -l
find YOUR_SRC_DIR -type d |wc -l
find YOUR_DST_DIR -type f |wc -l
find YOUR_DST_DIR -type d |wc -l
Print the total number of links for each of those directories:
find YOUR_SRC_DIR -type f -printf '%n\n'|awk '{s += $1} END {printf "%F\n", s}'
find YOUR_DST_DIR -type f -printf '%n\n'|awk '{s += $1} END {printf "%F\n", s}'
Jim
Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org: bug#8200; Package coreutils. (Tue, 08 Mar 2011 16:36:01 GMT)
Message #11 received at 8200 <at> debbugs.gnu.org (full text, mbox):
Hi Jim & others,
Aaargh... It seems the bug has been fixed... Feel free to ignore my
explanation below.
On Tue, Mar 08, 2011 at 04:05:04PM +0100, Jim Meyering wrote:
> For starters, what version of cp did you use?
> Run cp --version
-> cp (GNU coreutils) 8.5
> > Top reports:
> >
> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > 26721 root 20 0 2456 720 468 R 58.0 0.1 65:32.60 cp
> > 2855 root 20 0 2560 936 624 R 40.8 0.1 30:30.52 cp
> >
> > and I doubt they are half way through.
> >
> > I wrote an application I call "cplr" which does the obvious in the
> > obvious manner, and it doesn't have this problem.
> >
> > I've run "strace", and determined that it is not doing systemcalls
> > that take much CPU time. Most system calls return in microseconds.
>
> Please give us a sample listing of the syscalls that strace
> shows you when you trace one of those long-running cp commands.
> A few hundred lines worth would be good.
I ran:
time strace -tttTp 11453 |& head -1000 | awk '{print ($1-t)*1000, $0; t=$1}'
to get the output of 1000 of the process's system calls.
Previously I had omitted the "*1000", which made things harder to read,
and I hadn't noticed that the mkdir calls were the ones taking a
long time...
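For illustration, the same awk delta trick can be applied to a couple of canned -ttt-style timestamp lines (sample values, not from the actual trace): each output line is prefixed with the gap, in milliseconds, since the previous line.

```shell
# Prefix each strace line with the time elapsed since the previous line,
# in milliseconds. The first line's prefix is meaningless (t starts at 0).
printf '%s\n' \
  '1299598743.435264 link(...) = 0 <0.000047>' \
  '1299598743.468658 getdents(...) = 776 <0.000317>' |
awk '{print ($1 - t) * 1000, $0; t = $1}'
# the second output line starts with ~33.39, i.e. a 33 ms gap
```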
I once started my own "cplr" program without any arguments, and it
said:
=> Usage: cplr srcdir dstdir
=> Copy srcdir to dstdir by making hardlinks
=> (Like cp -lR, but without consuming lots of memory)
So the problem we ran into back when I wrote it was that cp was
consuming far too much memory. That has apparently been fixed
in the meantime.
Here is a typical section of the strace output. It is from my own
cplr program, as the cp output has scrolled off my screen and I've
stopped the cp -lr now that the problem has been fixed.
0.0741482 1299598743.435264 link("current/linux-2.6.0-test2-clean/fs/file_table.c", "test2/linux-2.6.0-test2-clean/fs/file_table.c") = 0 <0.000047>
0.133991 1299598743.435398 link("current/linux-2.6.0-test2-clean/fs/read_write.c", "test2/linux-2.6.0-test2-clean/fs/read_write.c") = 0 <0.000036>
0.122786 1299598743.435521 link("current/linux-2.6.0-test2-clean/fs/xattr_acl.c", "test2/linux-2.6.0-test2-clean/fs/xattr_acl.c") = 0 <0.000041>
0.13113 1299598743.435652 link("current/linux-2.6.0-test2-clean/fs/jffs2", "test2/linux-2.6.0-test2-clean/fs/jffs2") = -1 EPERM (Operation not permitted) <0.000046>
0.740051 1299598743.436392 lstat64("current/linux-2.6.0-test2-clean/fs/jffs2", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000015>
0.119925 1299598743.436512 open("current/linux-2.6.0-test2-clean/fs/jffs2/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 6 <0.000022>
0.0889301 1299598743.436601 mkdir("test2/linux-2.6.0-test2-clean/fs/jffs2/", 0777) = 0 <0.031938>
32.057 1299598743.468658 getdents(6, /* 36 entries */, 32768) = 776 <0.000317>
> What type of file system are you using, and is it nearly full?
> Run this from e.g, the source directory: df -hT .
Filesystem Type Size Used Avail Use% Mounted on
/dev/md3 ext3 2.7T 2.4T 190G 93% /backup
> Ideally, you'd attach to one of those processes with gdb and step
> through the code enough to tell us where it's spending its time,
> presumably in coreutils-8.10/src/copy.c. Just running "gdb -p
> 26721" (where 26721 is the PID of one of your running cp processes)
> and typing "backtrace" at the prompt may give us a good clue.
It's spending its time in mkdir; that is visible in the strace output above.
For a sample directory, download the Linux source code and unpack it some
300 times into different subdirs (after unpacking, rename the
resulting tree to linux-1, linux-2, etc.).
(My count comes to 325 copies, but many are 2.4 kernels, so a lot smaller
than current kernels.)
> Next best, you would give us access to your system or a copy of your hierarchy.
> But we don't even ask for that, because that's rarely feasible.
> Next best: you would give us the output of these two commands:
> [if you can do this, please respond privately, not to the list]
>
> find YOUR_SRC_DIR -ls | xz -e > src.find.xz
> find YOUR_DST_DIR -ls | xz -e > dst.find.xz
>
> [if you don't have xz, install it or use bzip2 -9 instead of "xz -e";
> xz is better]
>
> With that, we'd get an idea of hard link counts and max/average
> number of entries per directory, name length, etc.
>
> However, most people don't want to share file names like that.
> If you can, please put those two compressed files somewhere like
> an upload site and reply with links to them.
> Otherwise, please give us some statistics describing your
> two hierarchies by running these commands:
>
> These give counts of files and directories for each of your source
> and destination directories:
The dest dir is created by the cp -lr, so it starts out empty and ends up
with the same number as the source dir. :-)
> find YOUR_SRC_DIR -type f |wc -l
About 4.7 million.
> find YOUR_SRC_DIR -type d |wc -l
About 325000.
> Print the total number of links for each of those directories:
You say links for directories, but your command counts the links on
the files...
> find YOUR_SRC_DIR -type f -printf '%n\n'|awk '{s += $1} END {printf "%F\n", s}'
539 million. So about 100 links to each file on average.
Rogier.
Reply sent to Jim Meyering <jim <at> meyering.net>: You have taken responsibility. (Thu, 10 Mar 2011 17:40:02 GMT)
Notification sent to Rogier Wolff <R.E.Wolff <at> BitWizard.nl>: bug acknowledged by developer. (Thu, 10 Mar 2011 17:40:03 GMT)
Message #16 received at 8200-done <at> debbugs.gnu.org (full text, mbox):
Rogier Wolff wrote:
> Aaargh... It seems the bug has been fixed... Feel free to ignore my
> explanation below.
Thanks. I've marked this ticket as closed.
> On Tue, Mar 08, 2011 at 04:05:04PM +0100, Jim Meyering wrote:
>> For starters, what version of cp did you use?
>> Run cp --version
>
> -> cp (GNU coreutils) 8.5
In v7.0-63-g3ece035, I made this change:
http://git.sv.gnu.org/cgit/coreutils.git/commit/?id=3ece0355d52e41a1
cp: use far less memory in some cases
cp --link was "remembering" many name,dev,inode triples unnecessarily.
...
which would make a big difference in your case.
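A small demo (assuming GNU cp and stat) of what that bookkeeping is for: a plain recursive copy must remember (dev,inode) pairs so that two source names sharing an inode stay linked in the destination, but with --link every destination entry is itself a hard link to the source inode, so the links are preserved with no tracking at all.

```shell
set -e
cd "$(mktemp -d)"   # scratch directory for the demo
mkdir s
echo x > s/a
ln s/a s/b          # two names, one inode: link count 2
cp -lr s d          # d/a and d/b become hard links to that same inode
stat -c %h s/a      # prints 4: s/a, s/b, d/a and d/b all share the inode
```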
Bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 08 Apr 2011 11:24:04 GMT)
This bug report was last modified 14 years and 132 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.