GNU bug report logs -
#15630
grep 2.14 much slower than 2.5.1
[Message part 1 (text/plain, inline)]
Your bug report
#15630: grep 2.14 much slower than 2.5.1
which was filed against the grep package, has been closed.
The explanation is attached below, along with your original report.
If you require more details, please reply to 15630 <at> debbugs.gnu.org.
--
15630: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=15630
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
On 04/06/2014 03:45 AM, Norihiro Tanaka wrote:
> This bug will be fixed by application of patch#17019 and patch#17034.
Thanks, I applied those patches a week or two ago, and so am closing this bug.
[Message part 3 (message/rfc822, inline)]
[Message part 4 (text/plain, inline)]
I'm running 64-bit builds of grep 2.14 and 2.5.1 on a Red Hat 5.6 box. grep 2.14 is significantly slower than 2.5.1 on a simple regex. The times are:
grep 2.5.1: 4.39user 3.19system 0:07.60elapsed
grep 2.14: 25.92user 2.84system 0:28.76elapsed
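To put the two timings side by side, the slowdown per field can be worked out with a few lines of Python. This is only a sketch over the numbers quoted above; the names `timings` and `slowdown` are illustrative and not part of the report, and the elapsed values assume that `0:07.60` and `0:28.76` mean 7.60 and 28.76 seconds.

```python
# Timings copied from the report, in seconds.
# Assumption: "0:07.60elapsed" is 7.60 s and "0:28.76elapsed" is 28.76 s.
timings = {
    "grep 2.5.1": {"user": 4.39, "system": 3.19, "elapsed": 7.60},
    "grep 2.14": {"user": 25.92, "system": 2.84, "elapsed": 28.76},
}


def slowdown(field):
    """Return how many times slower grep 2.14 is than 2.5.1 for one field."""
    return timings["grep 2.14"][field] / timings["grep 2.5.1"][field]


for field in ("user", "system", "elapsed"):
    print(f"{field}: {slowdown(field):.2f}x")
```

The system time is roughly the same for both builds, so the difference is almost entirely user time.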
The grep command line is -i "<name>.*russia". The file is a large XML file with 101,766,751 lines, around 3.6 GB in size, and there are 14,772 matched lines. Runs are in the C locale, and both grep builds have the default configuration. Here are the callgrind top Ir counts:
grep 2.5.1:
5,985,715,429 kwset.c:kwsexec
833,138,736 dfa.c:dfaexec
360,061,388 ???:memchr
110,119,157 search.c:EGexecute
34,010,204 grep.c:grepfile
32,198,545 ???:__ctype_get_mb_cur_max
11,459,760 grep.c:fillbuf
7,175,377 ???:memmove
3,623,898 grep.c:grepbuf
grep 2.14:
36,717,431,504 dfa.c:dfaexec
15,709,111,428 ???:memchr
12,363,145,663 kwset.c:kwsexec
6,483,204,386 dfasearch.c:EGexecute
14,650,909 ???:memrchr
10,358,230 main.c:fillbuf
7,172,801 ???:memmove
7,162,667 main.c:grepdesc
4,484,004 main.c:grepbuf
1,250,200 ???:__ctype_get_mb_cur_max
and top function call counts:
grep 2.5.1:
kwsexec 1656108
__ctype_get_mb_cur_max 1547396
memchr 1547383
dfaexec 1547383
__ctype_get_mb_cur_max 1547383
__ctype_get_mb_cur_max 1547383
__ctype_get_mb_cur_max 1532611
EGexecute 124962
__ctype_get_mb_cur_max 124962
read 110191
grepbuf 110190
fillbuf 110190
memmove 110189
__ctype_get_mb_cur_max 108725
prtext 14772
prline 14772
grep 2.14:
memchr 101766751
kwsexec 101766751
dfaexec 101766751
EGexecute 124966
__ctype_get_mb_cur_max 124966
__ctype_get_mb_cur_max 124966
read 110195
memrchr 110194
grepbuf 110194
fillbuf 110194
memmove 110193
prtext 14772
prline 14772
Ratios of Ir counts to function call counts:
grep 2.5.1:
dfaexec: 538.42 = 833138736/1547383
kwsexec: 3614.33 = 5985715429/1656108
memchr: 232.69 = 360061388/1547383
grep 2.14:
dfaexec: 360.80 = 36717431504/101766751
kwsexec: 121.49 = 12363145663/101766751
memchr: 154.36 = 15709111428/101766751
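The ratios above are just Ir count divided by call count, so they can be rechecked with a short script. This is a sketch using the figures from the two callgrind listings; the names `counts` and `ir_per_call` are made up for illustration.

```python
# (Ir count, call count) pairs copied from the callgrind listings above.
counts = {
    "grep 2.5.1": {
        "dfaexec": (833_138_736, 1_547_383),
        "kwsexec": (5_985_715_429, 1_656_108),
        "memchr": (360_061_388, 1_547_383),
    },
    "grep 2.14": {
        "dfaexec": (36_717_431_504, 101_766_751),
        "kwsexec": (12_363_145_663, 101_766_751),
        "memchr": (15_709_111_428, 101_766_751),
    },
}


def ir_per_call(ir, calls):
    """Average number of instructions executed per call."""
    return ir / calls


for version, functions in counts.items():
    print(version)
    for name, (ir, calls) in functions.items():
        print(f"  {name}: {ir_per_call(ir, calls):.2f} = {ir}/{calls}")
```

Per call, every function is cheaper in grep 2.14, so the extra cost comes from the number of calls rather than from the cost of each one.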
1. grep 2.14 calls kwsexec, dfaexec and memchr once per line while 2.5.1 makes far fewer calls to those functions
2. grep 2.5.1 calls __ctype_get_mb_cur_max many more times than 2.14 but overall spends less time in the function
3. grep 2.14 calls memrchr while grep 2.5.1 does not
4. grep 2.5.1 generally passes longer chunks to memchr thus reducing the overall time it spends in the function
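Observations 1 and 4 can be made concrete by dividing the number of lines in the file by the number of calls each build makes. The sketch below only reuses the call counts and the line count from this report; `TOTAL_LINES`, `calls` and `lines_per_call` are illustrative names, and "lines per call" is a rough average, not something callgrind reports directly.

```python
# Number of lines in the input file, as stated in the report.
TOTAL_LINES = 101_766_751

# Call counts copied from the "top function call counts" listings.
calls = {
    "grep 2.5.1": {"kwsexec": 1_656_108, "dfaexec": 1_547_383, "memchr": 1_547_383},
    "grep 2.14": {"kwsexec": 101_766_751, "dfaexec": 101_766_751, "memchr": 101_766_751},
}


def lines_per_call(call_count):
    """Average number of input lines covered by one call."""
    return TOTAL_LINES / call_count


for version, functions in calls.items():
    print(version)
    for name, call_count in functions.items():
        print(f"  {name}: {lines_per_call(call_count):.2f} lines per call")
```

For grep 2.14 the result is exactly one line per call for all three functions, while grep 2.5.1 averages dozens of lines per call.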
Is there a runtime option or buildtime configuration for grep 2.14 that could give it comparable performance to grep 2.5.1 for the sort of simple regex in my example?
Zartaj
This bug report was last modified 11 years and 39 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.