GNU bug report logs -
#17230
[PATCH 1/2] grep: may also use Boyer-Moore algorithm for case-insensitive matching
Previous Next
Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Date: Wed, 9 Apr 2014 13:56:07 UTC
Severity: normal
Tags: patch
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 17230 in the body.
You can then email your comments to 17230 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-grep <at> gnu.org
:
bug#17230
; Package
grep
.
(Wed, 09 Apr 2014 13:56:08 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Norihiro Tanaka <noritnk <at> kcn.ne.jp>
:
New bug report received and forwarded. Copy sent to
bug-grep <at> gnu.org
.
(Wed, 09 Apr 2014 13:56:09 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Boyer-Moore algorithm is faster than Beate Commentz-Walter, in especially
worst case, because Galil rule is applicable only for Boyer-Moore
algorithm.
This patch enables to use Boyer-Moore algorithm for case-insensitive
matching.
Norihiro
[patch.txt (text/plain, attachment)]
Information forwarded
to
bug-grep <at> gnu.org
:
bug#17230
; Package
grep
.
(Sun, 20 Apr 2014 08:22:01 GMT)
Full text and
rfc822 format available.
Message #8 received at 17230 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
I simplified the patch by macros to work delta2 in bmexec().
[patch.txt (text/plain, attachment)]
Reply sent
to
Paul Eggert <eggert <at> cs.ucla.edu>
:
You have taken responsibility.
(Wed, 23 Apr 2014 04:22:02 GMT)
Full text and
rfc822 format available.
Notification sent
to
Norihiro Tanaka <noritnk <at> kcn.ne.jp>
:
bug acknowledged by developer.
(Wed, 23 Apr 2014 04:22:03 GMT)
Full text and
rfc822 format available.
Message #13 received at 17230-done <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Thanks much for this performance improvement. I puzzled for a while to
figure out what was going on, and was confused by that tricky macro
expansion. Nowadays functions are typically not significantly slower
than macros, so unless there's a major performance difference I'd prefer
to use functions. I installed the performance improvement patch and
followed up with the attached patch, which uses C functions instead of
macros and which coalesces some of the near-duplicate code. The
followup patch also includes some comments to try to explain the new
functions.
[0001-kwset-simplify-Boyer-Moore-with-unibyte-i.patch (text/plain, attachment)]
Information forwarded
to
bug-grep <at> gnu.org
:
bug#17230
; Package
grep
.
(Wed, 23 Apr 2014 17:52:02 GMT)
Full text and
rfc822 format available.
Message #16 received at 17230-done <at> debbugs.gnu.org (full text, mbox):
Paul, thanks for a lot of reviews and commits. However, it may be wrong.
I ran several tests for the worst case.
$ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 >../k
$ env LANG=C time -p src/grep -i kjjjjjjjjjjjjjjjjjjj ../k
My version:
real 0.74
user 0.43
sys 0.30
Simplified version:
real 2.06
user 1.74
sys 0.31
It's slower than DFA.
$ env LANG=C time -p src/grep -i 'a\|kjjjjjjjjjjjjjjjjjjj' ../k
real 1.26
user 0.96
sys 0.30
I ran the test on Solaris 10 (SPARC) and HP-UX 11v2 (Itanium)
- On HP-UX 11v2
$ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 >../k
$ env LANG=C time -p src/grep -i kjjjjjjjjjjjjjjjjjjj ../k
My version:
real 2.9
user 2.6
sys 0.2
Simplified version:
real 7.5
user 7.3
sys 0.2
Using DFA:
$ env LANG=C time -p src/grep -i 'a\|kjjjjjjjjjjjjjjjjjjj' ../k
real 3.2
user 3.0
sys 0.2
- On Solaris 10
$ env LANG=C time -p src/grep -i kjjjjjjjjjjjjjjjjjjj ../k
My version:
real 7.61
user 6.89
sys 0.71
Simplified version:
real 24.44
user 23.71
sys 0.72
Using DFA:
$ env LANG=C time -p src/grep -i 'a\|kjjjjjjjjjjjjjjjjjjj' ../k
real 8.03
user 7.31
sys 0.71
Norihiro
Information forwarded
to
bug-grep <at> gnu.org
:
bug#17230
; Package
grep
.
(Wed, 23 Apr 2014 20:48:02 GMT)
Full text and
rfc822 format available.
Message #19 received at 17230 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
On 04/23/2014 10:51 AM, Norihiro Tanaka wrote:
> Paul, thanks for a lot of reviews and commits. However, it may be wrong.
> I ran several tests for the worst case.
>
> $ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 >../k
> $ env LANG=C time -p src/grep -i kjjjjjjjjjjjjjjjjjjj ../k
>
> My version:
> real 0.74
> user 0.43
> sys 0.30
>
> Simplified version:
> real 2.06
> user 1.74
> sys 0.31
>
> It's slower than DFA.
>
That's odd, as I'm not observing this slowdown. I'm running on Fedora
20 x86-64 with GCC 4.9.0. The current master
(b0a087b4c7196782d8e76f00c45de1540daf5b91) runs a bit faster on your
benchmark than the branch with your patch
(dd0de0dce596add9098e1fb6e294c190c9b04f18): about 1.167 seconds versus
about 1.230 seconds real time, both in the C locale. I'm not using any
special compiler flags, just the -g -O2 that 'configure' deduces. I'll
attach the exact difference between these two versions, as the two
branches have diverged with time and perhaps something else is affecting
these numbers.
More generally, I don't even understand why your patch would speed
things up. To my mind, it should slow things down a bit, which is what
I'm observing. And (as I mentioned) it causes 'make check' to fail.
Possibly I'm simply not understanding the proposed change.
As far as Solaris and HP-UX goes, I'd like to wait until we've figured
out the GNU situation. Those platforms are lower priority, and the
code's correct for them, it's just a performance issue.
[17230.diff (text/x-patch, attachment)]
Information forwarded
to
bug-grep <at> gnu.org
:
bug#17230
; Package
grep
.
(Thu, 24 Apr 2014 00:00:03 GMT)
Full text and
rfc822 format available.
Message #22 received at 17230 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Paul Eggert wrote:
> That's odd, as I'm not observing this slowdown.
Though I apply 17230.diff and tests, I don't observed slowdown. However,
I cann't find macros `BM_DELTA2_SEARCH' etc in 17230.diff. Could you also
consider them, and run (A) instead of (B)? It means that overheads by
`yes' and `head' should be ignored.
(A)
$ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 >k
$ grep -i jk
(B)
$ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 | grep -i jk
I attach the difference between current master and my version.
Though I don't analyze detail still, don't seem that overhead by check
for `trans' in `tr' function, which is called each time the comparison
of a character, can be ignorable.
Norihiro
[17230.diff (text/plain, attachment)]
Information forwarded
to
bug-grep <at> gnu.org
:
bug#17230
; Package
grep
.
(Thu, 24 Apr 2014 01:50:02 GMT)
Full text and
rfc822 format available.
Message #25 received at 17230 <at> debbugs.gnu.org (full text, mbox):
Norihiro Tanaka wrote:
> Could you also
> consider them, and run (A) instead of (B)? It means that overheads by
> `yes' and `head' should be ignored.
Sure, I did that, using the updated 17230.diff patch, and found that the
patch slowed grep down on my machine. Your benchmark took about 0.58s
real-time with grep master, and about about 0.89s real-time with the
updated patch.
I normally time with an AMD Phenom II X4 910e; this is a 65W Deneb
processor released January 2010. I just now tried a different
processor, an Intel Xeon E5620 under RHEL 6.5; this is an 80W
Westmere-EP processor released March 2010. As with the AMD machine, I
compiled with GCC 4.9.0 with default (-O2) optimizations. The same
benchmark took about 0.40s real-time with the master, and about 0.83s
real-time with the updated patch.
> Though I don't analyze detail still, don't seem that overhead by check
> for `trans' in `tr' function, which is called each time the comparison
> of a character, can be ignorable.
I haven't looked into the details either, but I'd guess that the
compiler and/or branch prediction optimizes most of that stuff away.
Even if it didn't, we could use inlining rather than macroexpansion to
optimize it away by hand, though I'd rather not bother unless the
performance improvement is worthwhile.
By the way, the updated patch does pass 'make check', so my earlier
version of the patch was incorrect and its timings should be ignored.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Thu, 22 May 2014 11:24:04 GMT)
Full text and
rfc822 format available.
This bug report was last modified 11 years and 112 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.