Package: grep;
Reported by: sur-behoffski <sur_behoffski <at> grouse.com.au>
Date: Wed, 4 Mar 2015 03:22:01 UTC
Severity: normal
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
View this message in rfc822 format
From: sur-behoffski <sur_behoffski <at> grouse.com.au> To: 19997 <at> debbugs.gnu.org Subject: bug#19997: Performance differences between Git sources and release tarballs? Date: Tue, 10 Mar 2015 16:29:27 +1030
On 03/04/15 17:48, Paul Eggert wrote: > sur-behoffski wrote: >> Looking at Grep's "configure --help" output, I see that >> "--enable-gcc-warnings" is an option, and, with some experimentation with >> invoking compilation in different environments, it seems that a slew of >> warning options is enabled in the development tree, that are not enabled in >> the release tarball. > > These shouldn't affect grep's runtime performance. > > I don't know why you're observing the performance you do. It could be that you're building 'grep' differently from the way Gentoo builds it. You might try building both grep 2.21 and grep master yourself, with the same build procedure both ways. > Okay, I think I have this issue sorted: Adding "-lpapi" to the link command results in more infrastructure effort before main() is called. If this change is factored out, then your comment looks to be correct: There should be no performance differences between Git/release tarball builds. The practical upshot is that PAPI's valuable for finding resource-heavy portions of code, but then has to be factored out, or possibly added in equally everywhere, when comparisons between programs are being made. For the factoring-out case, you have to adopt a dual-configuration stance: 1. PAPI involved in one-off runs, reporting results, highlighting places where resources are being used, and giving quick feedback for experimental code changes; plus 2. The program without PAPI (and also without any PAPI-associated intrusive monitoring changes whatsoever), built for multiple-repetition full-program timing runs. Here's a quick summary of results, followed by a more detailed look at program activity provided courtesy of "strace". The summary uses "repl", a Lua script that: - Uses "time" on a bash script that repeatedly invokes the program (e.g. 1000 "inner reps"); - Does this whole set multiple times (e.g. 50 "outer reps"); - Uses "nice -12" to reduce likely variation from other system activity; - Sleeps for a short while between outer reps to give other processes a chance, as otherwise "nice -12" might be unduly disruptive; - Crosses off a few of the fastest and slowest times as possibly being aberrations; - Sorts the remaining times; and - Reports the mean time, together with the lowest few (10) runs. cheers, sur-behoffski (Brenton Hoff) Programmer, Grouse Software ------------------------------------------------------------ Summary: "justmain.c" and -lpapi (yeah, I know "-O2 -pipe" doesn't do much; these are included as they are usual for my Gentoo system): $ cat justmain.c int main(void) { return 0; } $ gcc -O2 -pipe -o justmain justmain.c $ gcc -O2 -pipe -o justmain-papi justmain.c -lpapi $ repl 50/1000 ./justmain Cmd: "./justmain" [50/1000] Mean: 0.501; Lowest: 0.495 0.496 0.496 0.496 0.497 0.497 0.497 0.498 0.498 0.499 $ repl 50/1000 ./justmain-papi Cmd: "./justmain-papi" [50/1000] Mean: 0.988; Lowest: 0.964 0.965 0.965 0.968 0.970 0.973 0.974 0.975 0.977 0.978 ------------------------------------------------------------ Comparing the strace output for "justmain" and "justmain-papi" shows the extra effort needed for papi and other associated libraries: $ strace ./justmain execve("./justmain", ["./justmain"], [/* 37 vars */]) = 0 brk(0) = 0x1d40000 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbae97eb000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=144976, ...}) = 0 mmap(NULL, 144976, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fbae97c7000 close(3) = 0 open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300N\2\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=1716576, ...}) = 0 mmap(NULL, 3828824, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fbae9225000 mprotect(0x7fbae93c3000, 2093056, PROT_NONE) = 0 mmap(0x7fbae95c2000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19d000) = 0x7fbae95c2000 mmap(0x7fbae95c8000, 15448, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fbae95c8000 close(3) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbae97c6000 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbae97c5000 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbae97c4000 arch_prctl(ARCH_SET_FS, 0x7fbae97c5700) = 0 mprotect(0x7fbae95c2000, 16384, PROT_READ) = 0 mprotect(0x600000, 4096, PROT_READ) = 0 mprotect(0x7fbae97ec000, 4096, PROT_READ) = 0 munmap(0x7fbae97c7000, 144976) = 0 exit_group(0) = ? +++ exited with 0 +++ $ strace ./justmain-papi execve("./justmain-papi", ["./justmain-papi"], [/* 37 vars */]) = 0 brk(0) = 0x2045000 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2112cf4000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=144976, ...}) = 0 mmap(NULL, 144976, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f2112cd0000 close(3) = 0 open("/usr/lib64/libpapi.so.5.4.0.0", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\353\0\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=356288, ...}) = 0 mmap(NULL, 2456816, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f211287d000 mprotect(0x7f21128ca000, 2093056, PROT_NONE) = 0 mmap(0x7f2112ac9000, 45056, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x4c000) = 0x7f2112ac9000 mmap(0x7f2112ad4000, 3312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f2112ad4000 close(3) = 0 open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300N\2\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=1716576, ...}) = 0 mmap(NULL, 3828824, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f21124d6000 mprotect(0x7f2112674000, 2093056, PROT_NONE) = 0 mmap(0x7f2112873000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19d000) = 0x7f2112873000 mmap(0x7f2112879000, 15448, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f2112879000 close(3) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2112ccf000 open("/usr/lib64/tls/x86_64/libpfm.so.4", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) stat("/usr/lib64/tls/x86_64", 0x7ffd2d32b460) = -1 ENOENT (No such file or directory) open("/usr/lib64/tls/libpfm.so.4", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) stat("/usr/lib64/tls", 0x7ffd2d32b460) = -1 ENOENT (No such file or directory) open("/usr/lib64/x86_64/libpfm.so.4", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) stat("/usr/lib64/x86_64", 0x7ffd2d32b460) = -1 ENOENT (No such file or directory) open("/usr/lib64/libpfm.so.4", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \251\5\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=1054784, ...}) = 0 mmap(NULL, 3155440, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f21121d3000 mprotect(0x7f211226c000, 2097152, PROT_NONE) = 0 mmap(0x7f211246c000, 430080, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x99000) = 0x7f211246c000 mmap(0x7f21124d5000, 1520, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f21124d5000 close(3) = 0 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2112ccd000 arch_prctl(ARCH_SET_FS, 0x7f2112ccd740) = 0 mprotect(0x7f2112873000, 16384, PROT_READ) = 0 mprotect(0x7f211246c000, 385024, PROT_READ) = 0 mprotect(0x7f2112ac9000, 4096, PROT_READ) = 0 mprotect(0x600000, 4096, PROT_READ) = 0 mprotect(0x7f2112cf5000, 4096, PROT_READ) = 0 munmap(0x7f2112cd0000, 144976) = 0 exit_group(0) = ? +++ exited with 0 +++
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.