GNU bug report logs - #21696
[PATCH 1/2] grep: improvement of performance of grep -Fw


Package: grep

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Sat, 17 Oct 2015 01:14:02 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 21696 in the body.
You can then email your comments to 21696 AT debbugs.gnu.org in the normal way.



Report forwarded to bug-grep <at> gnu.org:
bug#21696; Package grep. (Sat, 17 Oct 2015 01:14:02 GMT)

Acknowledgement sent to Norihiro Tanaka <noritnk <at> kcn.ne.jp>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Sat, 17 Oct 2015 01:14:03 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: <bug-grep <at> gnu.org>
Subject: [PATCH 1/2] grep: improvement of performance of grep -Fw
Date: Sat, 17 Oct 2015 10:13:36 +0900
I found that grep -Fw is extremely slow, regardless of whether the
locale is multibyte.

$ yes 'abcdefg hijklmn opqrstu vwxyz' | head -100000 >k
$ time -p env LC_ALL=C grep -Fw vwxy k
real 14.03
user 12.51
sys 0.74
$ time -p env LC_ALL=ja_JP.eucJP grep -Fw vwxy k
real 14.29
user 12.67
sys 0.50

$ time -p env LC_ALL=C grep -w vwxy k
real 0.11
user 0.01
sys 0.09
$ time -p env LC_ALL=ja_JP.eucJP grep -w vwxy k
real 0.89
user 0.71
sys 0.15

The first patch fixes the problem.  The second patch switches grep -Fw
to the grep matcher in single-byte locales.

In single-byte locales, the DFA (not regex) is also used for word
matching, and it is very fast, as the results above show.
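
The comparison above can be reproduced, and the two matchers checked for
agreement, with a short script. This is a minimal sketch and not part of the
patches: the temporary file, the variable names, and the use of awk in place
of `yes | head` (to avoid depending on SIGPIPE behavior) are assumptions made
here. In this input, `vwxy` appears on every line only as a prefix of `vwxyz`,
so every line holds a candidate that fails the word-boundary check, which is
presumably the case that stressed the old -Fw code path.

```shell
#!/bin/sh
# Sketch only: file and variable names are hypothetical, not from the patches.

# Build the same input as the report: 100000 identical lines.
tmp=$(mktemp)
awk 'BEGIN { for (i = 0; i < 100000; i++) print "abcdefg hijklmn opqrstu vwxyz" }' > "$tmp"

# 'vwxy' occurs only inside 'vwxyz', so a word search must match no line.
# grep exits with status 1 when nothing matches, hence the '|| true'.
fw_count=$(LC_ALL=C grep -c -Fw vwxy "$tmp") || true
w_count=$(LC_ALL=C grep -c -w vwxy "$tmp") || true

# 'vwxyz' is a whole word on every line, so both searches must match all lines.
fw_full=$(LC_ALL=C grep -c -Fw vwxyz "$tmp")
w_full=$(LC_ALL=C grep -c -w vwxyz "$tmp")

echo "-Fw vwxy: $fw_count  -w vwxy: $w_count"
echo "-Fw vwxyz: $fw_full  -w vwxyz: $w_full"

rm -f "$tmp"
```

Prefixing either grep call with `time -p`, as in the report, shows how a given
grep build compares on the two options.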
[0001-grep-improvement-of-performance-of-grep-Fw.patch (text/plain, attachment)]
[0002-grep-use-grep-matcher-for-grep-Fw-in-single-byte-loc.patch (text/plain, attachment)]

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Sat, 17 Oct 2015 05:49:01 GMT)

Notification sent to Norihiro Tanaka <noritnk <at> kcn.ne.jp>:
bug acknowledged by developer. (Sat, 17 Oct 2015 05:49:02 GMT)

Message #10 received at 21696-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>, 21696-done <at> debbugs.gnu.org
Subject: Re: bug#21696: [PATCH 1/2] grep: improvement of performance of grep -Fw
Date: Fri, 16 Oct 2015 22:48:07 -0700
Thanks for those performance improvements. I installed them, with some minor 
changes to commentary. I also installed a couple of minor tweaks to the code, to 
use memrchr and to simplify the multibyte test. Attached are the revised set of 
patches.
[0001-grep-improve-performance-of-grep-Fw.patch (text/plain, attachment)]
[0002-grep-use-memchr-memrchar.patch (text/plain, attachment)]
[0003-grep-use-grep-matcher-for-grep-Fw-when-unibyte.patch (text/plain, attachment)]
[0004-grep-simplify-previous-change.patch (text/plain, attachment)]
[0005-maint-add-news-item.patch (text/plain, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#21696; Package grep. (Sat, 17 Oct 2015 07:45:02 GMT)

Message #13 received at 21696-done <at> debbugs.gnu.org:

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 21696-done <at> debbugs.gnu.org
Subject: Re: bug#21696: [PATCH 1/2] grep: improvement of performance of grep -Fw
Date: Sat, 17 Oct 2015 16:44:41 +0900
On Fri, 16 Oct 2015 22:48:07 -0700
Paul Eggert <eggert <at> cs.ucla.edu> wrote:

> Thanks for those performance improvements. I installed them, with some minor changes to commentary. I also installed a couple of minor tweaks to the code, to use memrchr and to simplify the multibyte test. Attached are the revised set of patches.

Thanks for the review and rewording, Paul.





Information forwarded to bug-grep <at> gnu.org:
bug#21696; Package grep. (Sat, 17 Oct 2015 15:12:02 GMT)

Message #16 received at 21696-done <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 21696-done <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>
Subject: Re: bug#21696: [PATCH 1/2] grep: improvement of performance of grep -Fw
Date: Sat, 17 Oct 2015 08:11:21 -0700
On Sat, Oct 17, 2015 at 12:44 AM, Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> On Fri, 16 Oct 2015 22:48:07 -0700
> Paul Eggert <eggert <at> cs.ucla.edu> wrote:
>
>> Thanks for those performance improvements. I installed them, with some minor changes to commentary. I also installed a couple of minor tweaks to the code, to use memrchr and to simplify the multibyte test. Attached are the revised set of patches.
>
> Thanks for the review and rewording, Paul.

Thank you both for those fine changes.
I know there are older pending patches from Norihiro Tanaka, and
normally I would have prioritized applying them before the next
release, but now that we know about the false-match fix (echo abc|grep
-E '^b|b$') affecting 2.19..2.21, I would prefer to release the fixed
grep-2.22 ASAP, and integrate those additional changes afterwards.

I would like to make a prerelease snapshot late today or tomorrow, so
if you know of any quick/trivial improvements or anything
bug-fix-related, please let me know soon.
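
The false match mentioned above can be checked directly. This is a minimal
sketch, assuming a POSIX shell; the variable names and the messages it prints
are illustrative only. The input `abc` neither begins nor ends with `b`, so a
correct grep prints nothing and exits with status 1.

```shell
# 'abc' neither begins nor ends with 'b', so a correct grep finds no match
# and exits with status 1. Per the thread, 2.19..2.21 are affected by a
# false match on this command.
status=0
out=$(echo abc | grep -E '^b|b$') || status=$?
if [ "$status" -eq 1 ] && [ -z "$out" ]; then
  echo "no false match"
else
  echo "false match or error (status $status)"
fi
```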




Information forwarded to bug-grep <at> gnu.org:
bug#21696; Package grep. (Sat, 17 Oct 2015 16:55:02 GMT)

Message #19 received at 21696-done <at> debbugs.gnu.org:

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Jim Meyering <jim <at> meyering.net>, Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 21696-done <at> debbugs.gnu.org
Subject: Re: bug#21696: [PATCH 1/2] grep: improvement of performance of grep -Fw
Date: Sat, 17 Oct 2015 09:54:26 -0700
Jim Meyering wrote:
> I would prefer to release the fixed
> grep-2.22 ASAP, and integrate those additional changes afterwards.

Thanks for doing the release. I don't have any quick or trivial patches in my 
pipeline (which also includes a fix for the "binary output matches" problem, and 
reviewing Zev Weiss's multithreading patches).




Information forwarded to bug-grep <at> gnu.org:
bug#21696; Package grep. (Sat, 17 Oct 2015 23:13:02 GMT)

Message #22 received at 21696-done <at> debbugs.gnu.org:

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Jim Meyering <jim <at> meyering.net>
Cc: 21696-done <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>
Subject: Re: bug#21696: [PATCH 1/2] grep: improvement of performance of grep -Fw
Date: Sun, 18 Oct 2015 08:12:04 +0900
On Sat, 17 Oct 2015 08:11:21 -0700
Jim Meyering <jim <at> meyering.net> wrote:

> I would like to make a prerelease snapshot late today or tomorrow, so
> if you know of any quick/trivial improvements or anything
> bug-fix-related, please let me know soon.

Thanks for making the prerelease.  I don't have any quick or trivial
patches either.






Information forwarded to bug-grep <at> gnu.org:
bug#21696; Package grep. (Sun, 18 Oct 2015 04:11:02 GMT)

Message #25 received at 21696-done <at> debbugs.gnu.org:

From: Jim Meyering <jim <at> meyering.net>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 21696-done <at> debbugs.gnu.org, Paul Eggert <eggert <at> cs.ucla.edu>
Subject: FYI: two maint patches [was Re: bug#21696: ...
Date: Sat, 17 Oct 2015 21:10:09 -0700
On Sat, Oct 17, 2015 at 4:12 PM, Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
> On Sat, 17 Oct 2015 08:11:21 -0700
> Jim Meyering <jim <at> meyering.net> wrote:
>
>> I would like to make a prerelease snapshot late today or tomorrow, so
>> if you know of any quick/trivial improvements or anything
>> bug-fix-related, please let me know soon.

FYI, I've just pushed these:
[0001-build-avoid-spurious-bootstrap-failure-involving-pkg.patch (text/x-patch, attachment)]
[0002-gnulib-update-to-latest-also-bootstrap-and-tests-ini.patch (text/x-patch, attachment)]

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sun, 15 Nov 2015 12:24:04 GMT)

This bug report was last modified 9 years and 221 days ago.
