Package: coreutils
Reported by: Rogier Wolff <R.E.Wolff <at> BitWizard.nl>
Date: Tue, 8 Mar 2011 05:23:02 UTC
Severity: normal
Done: Jim Meyering <jim <at> meyering.net>
Bug is archived. No further changes may be made.
Message #11 received at 8200 <at> debbugs.gnu.org (full text, mbox):
From: Rogier Wolff <R.E.Wolff <at> BitWizard.nl>
To: Jim Meyering <jim <at> meyering.net>
Cc: 8200 <at> debbugs.gnu.org, Rogier Wolff <R.E.Wolff <at> BitWizard.nl>
Subject: Re: bug#8200: cp -lr uses a lot of CPU time.
Date: Tue, 8 Mar 2011 17:35:22 +0100
Hi Jim & others,

Aaargh... It seems the bug has been fixed... Feel free to ignore my
explanation below.

On Tue, Mar 08, 2011 at 04:05:04PM +0100, Jim Meyering wrote:
> For starters, what version of cp did you use?
> Run cp --version

-> cp (GNU coreutils) 8.5

> > Top reports:
> >
> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > 26721 root      20   0  2456  720  468 R 58.0  0.1  65:32.60 cp
> >  2855 root      20   0  2560  936  624 R 40.8  0.1  30:30.52 cp
> >
> > and I doubt they are half way through.
> >
> > I wrote an application I call "cplr" which does the obvious in the
> > obvious manner, and it doesn't have this problem.
> >
> > I've run "strace", and determined that it is not doing systemcalls
> > that take much CPU time. Most system calls return in microseconds.
> Please give us a sample listing of the syscalls that strace
> shows you when you trace one of those long-running cp commands.
> A few hundred lines worth would be good.

I ran:

  time strace -tttTp 11453 | & head -1000 | awk '{print ($1-t)*1000 , $0 ; t=$1;}'

to get the output of 1000 of the process' system calls. Previously I
had omitted the "*1000" which made things harder to read, and I hadn't
noticed that the mkdir calls were the calls that took a long time.....

My own "cplr" program I started one time without any arguments and it
said:

=> Usage: cplr srcdir dstdir
=> Copy srcdir to dstdir by making hardlinks
=> (Like cp -lR, but without consuming lots of memory)

So apparently the problem we ran into when I made that was that cp was
consuming much too much memory.... This apparently has been fixed in
the meantime.

Here is a typical section of the strace output. This is from my own
cplr program, as the "cp" has scrolled out of my screen and I've
stopped the cp -lr, as the problem has been fixed.
0.0741482 1299598743.435264 link("current/linux-2.6.0-test2-clean/fs/file_table.c", "test2/linux-2.6.0-test2-clean/fs/file_table.c") = 0 <0.000047>
0.133991 1299598743.435398 link("current/linux-2.6.0-test2-clean/fs/read_write.c", "test2/linux-2.6.0-test2-clean/fs/read_write.c") = 0 <0.000036>
0.122786 1299598743.435521 link("current/linux-2.6.0-test2-clean/fs/xattr_acl.c", "test2/linux-2.6.0-test2-clean/fs/xattr_acl.c") = 0 <0.000041>
0.13113 1299598743.435652 link("current/linux-2.6.0-test2-clean/fs/jffs2", "test2/linux-2.6.0-test2-clean/fs/jffs2") = -1 EPERM (Operation not permitted) <0.000046>
0.740051 1299598743.436392 lstat64("current/linux-2.6.0-test2-clean/fs/jffs2", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000015>
0.119925 1299598743.436512 open("current/linux-2.6.0-test2-clean/fs/jffs2/", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 6 <0.000022>
0.0889301 1299598743.436601 mkdir("test2/linux-2.6.0-test2-clean/fs/jffs2/", 0777) = 0 <0.031938>
32.057 1299598743.468658 getdents(6, /* 36 entries */, 32768) = 776 <0.000317>

> What type of file system are you using, and is it nearly full?
> Run this from e.g, the source directory: df -hT .

Filesystem    Type    Size  Used Avail Use% Mounted on
/dev/md3      ext3    2.7T  2.4T  190G  93% /backup

> Ideally, you'd attach to one of those processes with gdb and step
> through the code enough to tell us where it's spending its time,
> presumably in coreutils-8.10/src/copy.c. Just running "gdb -p
> 26721" (where 26721 is the PID of one of your running cp processes)
> and typing "backtrace" at the prompt may give us a good clue.

It's spending time in mkdir. It's visible from the strace output.

For a sample directory, download the linux source code, unpack it some
300 times into different subdirs (after unpacking, rename the
resulting tree to linux-1 linux-2, etc.) (My count comes to 325
copies, but many are 2.4, so a lot smaller than current kernels).
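
The observation above, that mkdir is the call eating the time, can be checked mechanically rather than by eye. The following is only a sketch, not part of cplr or cp: it assumes lines in the `strace -ttt -T` layout shown in the excerpt (with or without the awk delta column in front), and the names `parse`, `slow_calls`, `time_by_syscall` and the 1 ms threshold are made up for illustration.

```python
import re
from collections import defaultdict

# Syscall name right before the opening parenthesis, and the time spent
# inside the call, which "strace -T" appends as <seconds> at end of line.
CALL_RE = re.compile(r"(\w+)\(.*<(\d+\.\d+)>\s*$")

def parse(lines):
    """Yield (syscall_name, seconds) for every line that looks like a traced call."""
    for line in lines:
        m = CALL_RE.search(line)
        if m:
            yield m.group(1), float(m.group(2))

def slow_calls(lines, threshold=0.001):
    """Return the calls that spent more than `threshold` seconds (assumed cutoff)."""
    return [(name, secs) for name, secs in parse(lines) if secs > threshold]

def time_by_syscall(lines):
    """Sum the time spent per syscall name."""
    totals = defaultdict(float)
    for name, secs in parse(lines):
        totals[name] += secs
    return dict(totals)

# A few lines copied from the excerpt above (awk delta column dropped).
SAMPLE = [
    '1299598743.435264 link("current/linux-2.6.0-test2-clean/fs/file_table.c", "test2/linux-2.6.0-test2-clean/fs/file_table.c") = 0 <0.000047>',
    '1299598743.435652 link("current/linux-2.6.0-test2-clean/fs/jffs2", "test2/linux-2.6.0-test2-clean/fs/jffs2") = -1 EPERM (Operation not permitted) <0.000046>',
    '1299598743.436392 lstat64("current/linux-2.6.0-test2-clean/fs/jffs2", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 <0.000015>',
    '1299598743.436601 mkdir("test2/linux-2.6.0-test2-clean/fs/jffs2/", 0777) = 0 <0.031938>',
    '1299598743.468658 getdents(6, /* 36 entries */, 32768) = 776 <0.000317>',
]

if __name__ == "__main__":
    print(slow_calls(SAMPLE))
    print(time_by_syscall(SAMPLE))
```

On these sample lines only the mkdir call (0.031938 s) exceeds the assumed 1 ms cutoff; every other call is in the tens to hundreds of microseconds.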
> Next best, you would give us access to your system or a copy of your hierarchy.
> But we don't even ask for that, because that's rarely feasible.
> Next best: you would give us the output of these two commands:
> [if you can do this, please respond privately, not to the list]
>
> find YOUR_SRC_DIR -ls | xz -e > src.find.xz
> find YOUR_DST_DIR -ls | xz -e > dst.find.xz
>
> [if you don't have xz, install it or use bzip2 -9 instead of "xz -e";
> xz is better]
>
> With that, we'd get an idea of hard link counts and max/average
> number of entries per directory, name length, etc.
>
> However, most people don't want to share file names like that.
> If you can, please put those two compressed files somewhere like
> an upload site and reply with links to them.
> Otherwise, please give us some statistics describing your
> two hierarchies by running these commands:
>
> These give counts of files and directories for each of your source
> and destination directories:

dest dir is created by the cp -lr. so it starts out empty, and ends up
with the same number as the source dir. :-)

> find YOUR_SRC_DIR -type f |wc -l

About 4.7 million.

> find YOUR_SRC_DIR -type d |wc -l

About 325000.

> Print the total number of links for each of those directories:

You say links for directories, but your command counts the links on
the files...

> find YOUR_SRC_DIR -type f -printf '%n\n'|awk '{s += $1} END {printf "%F\n", s}'

539 million. So about 100 links to each file on average.

	Rogier.

--
** R.E.Wolff <at> BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement.
Does it sit on the couch all day? Is it unemployed? Please be specific!
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ
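
For reference, the three find pipelines quoted above (regular-file count, directory count, and the sum of link counts over regular files) can be gathered in a single pass over the tree. This is a rough sketch only; the function name `link_stats` is made up here, and it assumes the tree fits the usual `os.walk` model (symlinks are not followed, as with find's default behaviour).

```python
import os
import stat

def link_stats(root):
    """Return (regular_files, directories, total_links) for the tree at root.

    Mirrors, approximately:
      find root -type f | wc -l
      find root -type d | wc -l
      find root -type f -printf '%n\n' | awk '{s += $1} END {print s}'
    """
    files = dirs = links = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirs += 1  # every directory visited, including root itself
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(st.st_mode):  # skip symlinks, fifos, devices
                files += 1
                links += st.st_nlink      # hard link count of this inode
    return files, dirs, links
```

Dividing the third number by the first gives the average link count per file name. With the figures reported in this message, 539 million / 4.7 million comes to roughly 115, in line with the "about 100 links to each file" estimate.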