GNU bug report logs - #8200
cp -lr uses a lot of CPU time.


Package: coreutils;

Reported by: Rogier Wolff <R.E.Wolff <at> BitWizard.nl>

Date: Tue, 8 Mar 2011 05:23:02 UTC

Severity: normal

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.



Report forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8200; Package coreutils. (Tue, 08 Mar 2011 05:23:02 GMT)

Acknowledgement sent to Rogier Wolff <R.E.Wolff <at> BitWizard.nl>:
New bug report received and forwarded. Copy sent to bug-coreutils <at> gnu.org. (Tue, 08 Mar 2011 05:23:02 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Rogier Wolff <R.E.Wolff <at> BitWizard.nl>
To: bug-coreutils <at> gnu.org
Subject: cp -lr uses a lot of CPU time. 
Date: Tue, 8 Mar 2011 06:08:40 +0100
Hi, 

In my backup scripts I need a "cp -lr" every day. I make backups of
directories that hold up to millions of files. 

When 

	cp -lr source dest

runs for a while, it becomes CPU limited. Virtual memory is only about
2Mb. "resident" is under 1M.

Top reports: 

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
26721 root      20   0  2456  720  468 R 58.0  0.1  65:32.60 cp
 2855 root      20   0  2560  936  624 R 40.8  0.1  30:30.52 cp

and I doubt they are half way through.

I wrote an application I call "cplr" which does the obvious in the
obvious manner, and it doesn't have this problem. 
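
For context, here is a minimal sketch of that "obvious" approach. It is
only an illustration (not the actual cplr source): try link() on each
entry, and when link() fails with EPERM because the entry is a
directory, mkdir the destination and recurse.

/* cplr-like sketch: hard-link a whole tree (illustrative only). */
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

static void copy_tree(const char *src, const char *dst)
{
    /* Cheapest case first: a hard link works for anything but a directory. */
    if (link(src, dst) == 0)
        return;
    if (errno != EPERM) {           /* link() on a directory fails with EPERM */
        perror(src);
        return;
    }

    struct stat st;
    if (lstat(src, &st) != 0 || !S_ISDIR(st.st_mode)) {
        perror(src);
        return;
    }
    if (mkdir(dst, st.st_mode & 07777) != 0 && errno != EEXIST) {
        perror(dst);
        return;
    }

    DIR *d = opendir(src);
    if (!d) {
        perror(src);
        return;
    }
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        char s[4096], t[4096];      /* fixed buffers keep the sketch short */
        snprintf(s, sizeof s, "%s/%s", src, e->d_name);
        snprintf(t, sizeof t, "%s/%s", dst, e->d_name);
        copy_tree(s, t);            /* recurse: subdirs get mkdir'd, files linked */
    }
    closedir(d);
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "Usage: cplr srcdir dstdir\n");
        return 1;
    }
    copy_tree(argv[1], argv[2]);
    return 0;
}

A plain depth-first walk like this keeps no per-file state, so its
memory use stays flat no matter how many files are in the tree.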

I've run "strace", and determined that it is not doing systemcalls
that take much CPU time. Most system calls return in microseconds.

The time spent /between/ system calls runs up into the hundreds of
milliseconds. You might say: well that's way less than a
second. Sure. But if you need to do that tens of thousands of times it
becomes quite significant....


So my question is: Why does cp -lr take such ridiculous amounts of CPU
time?

Or another way: BUG REPORT: cp -lr takes unnecessary amounts of CPU
time.

	Roger. 

-- 
** R.E.Wolff <at> BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8200; Package coreutils. (Tue, 08 Mar 2011 15:06:02 GMT)

Message #8 received at 8200 <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Rogier Wolff <R.E.Wolff <at> BitWizard.nl>
Cc: 8200 <at> debbugs.gnu.org
Subject: Re: bug#8200: cp -lr uses a lot of CPU time.
Date: Tue, 08 Mar 2011 16:05:04 +0100
Rogier Wolff wrote:
> In my backup scripts I need a "cp -lr" every day. I make backups of
> directories that hold up to millions of files.
>
> When
>
> 	cp -lr source dest
>
> runs for a while, it becomes CPU limited. Virtual memory is only about
> 2Mb. "resident" is under 1M.

Thank you for the bug report.
That sounds like there is a serious problem, somewhere.
If you give us enough information, we'll find the cause.

For starters, what version of cp did you use?
Run cp --version

> Top reports:
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 26721 root      20   0  2456  720  468 R 58.0  0.1  65:32.60 cp
>  2855 root      20   0  2560  936  624 R 40.8  0.1  30:30.52 cp
>
> and I doubt they are half way through.
>
> I wrote an application I call "cplr" which does the obvious in the
> obvious manner, and it doesn't have this problem.
>
> I've run "strace", and determined that it is not doing system calls
> that take much CPU time. Most system calls return in microseconds.

Please give us a sample listing of the syscalls that strace
shows you when you trace one of those long-running cp commands.
A few hundred lines worth would be good.

> The time spent /between/ system calls runs up into the hundreds of
> milliseconds. You might say: well that's way less than a
> second. Sure. But if you need to do that tens of thousands of times it
> becomes quite significant....
>
>
> So my question is: Why does cp -lr take such ridiculous amounts of CPU
> time?
>
> Or another way: BUG REPORT: cp -lr takes unnecessary amounts of CPU
> time.

What type of file system are you using, and is it nearly full?
Run this from, e.g., the source directory:  df -hT .

Ideally, you'd attach to one of those processes with gdb and step through
the code enough to tell us where it's spending its time, presumably in
coreutils-8.10/src/copy.c.  Just running "gdb -p 26721" (where 26721
is the PID of one of your running cp processes) and typing "backtrace"
at the prompt may give us a good clue.

Next best, you would give us access to your system or a copy of your hierarchy.
But we don't even ask for that, because that's rarely feasible.
Next best: you would give us the output of these two commands:
[if you can do this, please respond privately, not to the list]

    find YOUR_SRC_DIR -ls | xz -e > src.find.xz
    find YOUR_DST_DIR -ls | xz -e > dst.find.xz

[if you don't have xz, install it or use bzip2 -9 instead of "xz -e";
 xz is better]

With that, we'd get an idea of hard link counts and max/average
number of entries per directory, name length, etc.

However, most people don't want to share file names like that.
If you can, please put those two compressed files somewhere like
an upload site and reply with links to them.
Otherwise, please give us some statistics describing your
two hierarchies by running these commands:

These give counts of files and directories for each of your source
and destination directories:
    find YOUR_SRC_DIR -type f |wc -l
    find YOUR_SRC_DIR -type d |wc -l
    find YOUR_DST_DIR -type f |wc -l
    find YOUR_DST_DIR -type d |wc -l

Print the total number of links for each of those directories:
    find YOUR_SRC_DIR -type f -printf '%n\n'|awk '{s += $1} END {printf "%F\n", s}'
    find YOUR_DST_DIR -type f -printf '%n\n'|awk '{s += $1} END {printf "%F\n", s}'

Jim




Information forwarded to owner <at> debbugs.gnu.org, bug-coreutils <at> gnu.org:
bug#8200; Package coreutils. (Tue, 08 Mar 2011 16:36:01 GMT)

Message #11 received at 8200 <at> debbugs.gnu.org:

From: Rogier Wolff <R.E.Wolff <at> BitWizard.nl>
To: Jim Meyering <jim <at> meyering.net>
Cc: 8200 <at> debbugs.gnu.org, Rogier Wolff <R.E.Wolff <at> BitWizard.nl>
Subject: Re: bug#8200: cp -lr uses a lot of CPU time.
Date: Tue, 8 Mar 2011 17:35:22 +0100
Hi Jim & others, 

Aaargh... It seems the bug has been fixed... Feel free to ignore my
explanation below.

On Tue, Mar 08, 2011 at 04:05:04PM +0100, Jim Meyering wrote:
> For starters, what version of cp did you use?
> Run cp --version

-> cp (GNU coreutils) 8.5


> > Top reports:
> >
> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > 26721 root      20   0  2456  720  468 R 58.0  0.1  65:32.60 cp
> >  2855 root      20   0  2560  936  624 R 40.8  0.1  30:30.52 cp
> >
> > and I doubt they are half way through.
> >
> > I wrote an application I call "cplr" which does the obvious in the
> > obvious manner, and it doesn't have this problem.
> >
> > I've run "strace", and determined that it is not doing system calls
> > that take much CPU time. Most system calls return in microseconds.
> 
> Please give us a sample listing of the syscalls that strace
> shows you when you trace one of those long-running cp commands.
> A few hundred lines worth would be good.

I ran: 

   time strace -tttTp 11453 |& head -1000 | awk '{print ($1-t)*1000 , $0 ; t=$1;}'

to get the output of 1000 of the process' system calls. 

Previously I had omitted the "*1000", which made things harder to read,
and I hadn't noticed that the mkdir calls were the ones that took a
long time...

My own "cplr" program I started one time without any arguments and it 
said: 

=> Usage: cplr srcdir dstdir
=> Copy srcdir to dstdir by making hardlinks
=> (Like cp -lR, but without consuming lots of memory)

So apparently the problem we ran into when I wrote it was that cp was
consuming far too much memory... This has apparently been fixed in the
meantime.

Here is a typical section of the strace output. It is from my own
cplr program, as the "cp" output has scrolled off my screen and I have
stopped the cp -lr now that the problem has been fixed.

0.0741482 1299598743.435264 link("current/linux-2.6.0-test2-clean/fs/file_table.c", "test2/linux-2.6.0-test2-clean/fs/file_table.c") = 0 <0.000047>
0.133991 1299598743.435398 link("current/linux-2.6.0-test2-clean/fs/read_write.c", "test2/linux-2.6.0-test2-clean/fs/read_write.c") = 0 <0.000036>
0.122786 1299598743.435521 link("current/linux-2.6.0-test2-clean/fs/xattr_acl.c", "test2/linux-2.6.0-test2-clean/fs/xattr_acl.c") = 0 <0.000041>
0.13113 1299598743.435652 link("current/linux-2.6.0-test2-clean/fs/jffs2", "test2/linux-2.6.0-test2-clean/fs/jffs2") = -1 EPERM (Operation not permitted) <0.000046>
0.740051 1299598743.436392 lstat64("current/linux-2.6.0-test2-clean/fs/jffs2", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000015>
0.119925 1299598743.436512 open("current/linux-2.6.0-test2-clean/fs/jffs2/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 6 <0.000022>
0.0889301 1299598743.436601 mkdir("test2/linux-2.6.0-test2-clean/fs/jffs2/", 0777) = 0 <0.031938>
32.057 1299598743.468658 getdents(6, /* 36 entries */, 32768) = 776 <0.000317>


> What type of file system are you using, and is it nearly full?
> Run this from e.g, the source directory:  df -hT .

Filesystem    Type    Size  Used Avail Use% Mounted on
/dev/md3      ext3    2.7T  2.4T  190G  93% /backup


> Ideally, you'd attach to one of those processes with gdb and step
> through the code enough to tell us where it's spending its time,
> presumably in coreutils-8.10/src/copy.c.  Just running "gdb -p
> 26721" (where 26721 is the PID of one of your running cp processes)
> and typing "backtrace" at the prompt may give us a good clue.

It's spending time in mkdir. It's visible from the strace output. 

For a sample directory, download the Linux source code and unpack it
some 300 times into different subdirs (after unpacking, rename the
resulting tree to linux-1, linux-2, etc.).

(My count comes to 325 copies, but many are 2.4 kernels, so a lot
smaller than current kernels.)

> Next best, you would give us access to your system or a copy of your hierarchy.
> But we don't even ask for that, because that's rarely feasible.
> Next best: you would give us the output of these two commands:
> [if you can do this, please respond privately, not to the list]
> 
>     find YOUR_SRC_DIR -ls | xz -e > src.find.xz
>     find YOUR_DST_DIR -ls | xz -e > dst.find.xz
> 
> [if you don't have xz, install it or use bzip2 -9 instead of "xz -e";
>  xz is better]
> 
> With that, we'd get an idea of hard link counts and max/average
> number of entries per directory, name length, etc.
> 
> However, most people don't want to share file names like that.
> If you can, please put those two compressed files somewhere like
> an upload site and reply with links to them.
> Otherwise, please give us some statistics describing your
> two hierarchies by running these commands:
> 
> These give counts of files and directories for each of your source
> and destination directories:

The dest dir is created by the cp -lr, so it starts out empty and ends
up with the same counts as the source dir. :-)

>     find YOUR_SRC_DIR -type f |wc -l

About 4.7 million. 

>     find YOUR_SRC_DIR -type d |wc -l

About 325000. 

> Print the total number of links for each of those directories:

You say links for directories, but your command counts the links on
the files...

>     find YOUR_SRC_DIR -type f -printf '%n\n'|awk '{s += $1} END {printf "%F\n", s}'

539 million. So about 100 links to each file on average.

	Rogier. 

-- 
** R.E.Wolff <at> BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ




Reply sent to Jim Meyering <jim <at> meyering.net>:
You have taken responsibility. (Thu, 10 Mar 2011 17:40:02 GMT)

Notification sent to Rogier Wolff <R.E.Wolff <at> BitWizard.nl>:
bug acknowledged by developer. (Thu, 10 Mar 2011 17:40:03 GMT)

Message #16 received at 8200-done <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Rogier Wolff <R.E.Wolff <at> BitWizard.nl>
Cc: 8200-done <at> debbugs.gnu.org
Subject: Re: bug#8200: cp -lr uses a lot of CPU time.
Date: Thu, 10 Mar 2011 18:39:34 +0100
Rogier Wolff wrote:
> Aaargh... It seems the bug has been fixed... Feel free to ignore my
> explanation below.

Thanks.  I've marked this ticket as closed.

> On Tue, Mar 08, 2011 at 04:05:04PM +0100, Jim Meyering wrote:
>> For starters, what version of cp did you use?
>> Run cp --version
>
> -> cp (GNU coreutils) 8.5

In v7.0-63-g3ece035, I made this change:

    http://git.sv.gnu.org/cgit/coreutils.git/commit/?id=3ece0355d52e41a1
    cp: use far less memory in some cases

    cp --link was "remembering" many name,dev,inode triples unnecessarily.
    ...

which would make a big difference in your case.
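
To illustrate why a copy program keeps such bookkeeping at all, here is
a toy sketch of the technique (my own illustration, not coreutils'
actual data structure): to reproduce hard links, a recursive copy
records a (st_dev, st_ino) key for each source inode it has handled,
together with the destination name it created, so that later directory
entries with the same key become links instead of fresh copies.

/* Illustrative sketch only: a (dev, ino) -> destination-name memo table. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

struct seen {
    dev_t dev;              /* device of the source inode */
    ino_t ino;              /* inode number of the source file */
    char *dst_name;         /* where that inode already exists in the dest tree */
    struct seen *next;
};

static struct seen *table;  /* toy linked list; real code would use a hash table */

/* Return the recorded destination name for (dev, ino), or NULL if unseen. */
static const char *lookup(dev_t dev, ino_t ino)
{
    for (struct seen *p = table; p; p = p->next)
        if (p->dev == dev && p->ino == ino)
            return p->dst_name;
    return NULL;
}

/* Record that (dev, ino) now exists in the destination as dst_name. */
static void remember(dev_t dev, ino_t ino, const char *dst_name)
{
    struct seen *p = malloc(sizeof *p);
    if (!p || !(p->dst_name = strdup(dst_name)))
        abort();
    p->dev = dev;
    p->ino = ino;
    p->next = table;
    table = p;
}

int main(void)
{
    /* Toy usage: the first time inode (dev 1, ino 42) is handled it is
       remembered; on a second encounter the copier would just link() to it. */
    remember(1, 42, "dest/a/file");
    const char *prev = lookup(1, 42);
    printf("second encounter links to: %s\n", prev ? prev : "(copy it)");
    return 0;
}

Each remembered entry costs a small struct plus a copy of a path, which
adds up over millions of entries; with --link the destination already is
a hard link to the same source inode, so for most entries there is
nothing to gain from remembering them, and skipping those entries is,
roughly, what the change above does.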




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 08 Apr 2011 11:24:04 GMT)



