GNU bug report logs - #15191
faster DFA.C state merge

Previous Next

Package: grep;

Reported by: Ivan Yanikov <dobrokot <at> gmail.com>

Date: Mon, 26 Aug 2013 06:58:03 UTC

Severity: wishlist

Tags: moreinfo, patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 15191 in the body.
You can then email your comments to 15191 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-grep <at> gnu.org:
bug#15191; Package grep. (Mon, 26 Aug 2013 06:58:04 GMT) Full text and rfc822 format available.

Acknowledgement sent to Ivan Yanikov <dobrokot <at> gmail.com>:
New bug report received and forwarded. Copy sent to bug-grep <at> gnu.org. (Mon, 26 Aug 2013 06:58:05 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Ivan Yanikov <dobrokot <at> gmail.com>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: bug-grep <at> gnu.org
Subject: Re: faster DFA.C state merge
Date: Mon, 26 Aug 2013 05:26:11 +0400
[Message part 1 (text/plain, inline)]
I have polished the patch (
    faster code for simple cases of merging 2 or 1 sets,
    fixes of code style, like fancy 2+2 indentation,
    important fix: memory leaks under valgrind - I improperly returned array from a function )

New patch available on URL https://gist.github.com/dobrokot/6337339 and included as attach

> One thing that'd help would be a test case that illustrates the need for the patch.

Not sure how to properly to send test-case, and how to reference grep compiled old/new binary. I had put it on URL http://dobrokot.ru/dump/slow_dfa_merge.2013-08-26.tar.gz
I can send it in a attach for a "history", if binary/large attaches are allowed in maillists.
Real regex contains sensitive private data, and it's huge. So I had little obfuscated it and reduce to kilobytes. The runtime difference is not so great as in real example (1 to 100), but still large 
( 2-3 times faster ).

Times for new/old version: 2.3sec / 8.7sec

> I assume you're willing to assign copyright to the FSF for the change?  (I can send you copies of the paperwork, if so.)

You mean, I should somewhere explicitly state, that I am agree with GPL and give FSF rights to distribute and use the code from patch?
If so, I am surely agree with the license, and (proudly) give the permission to use/distribute code from my patch :)

Or you mean some modification of authorship lines in the AUTHORS/THANKS and beginning of dfa.c, which should contain my name now ?

Which paperwork do you mean? Real paper which I should sign with pen and ink?

Thanks for attention, Ivan Yanikov.

On 25.08.2013 11:02, Paul Eggert wrote:
> Ivan Yanikov wrote:
>> How I can properly send/commit it for review?
> You've already done that, sort of, with the pointers
> to the patch.  One thing that'd help would be a test
> case that illustrates the need for the patch.  Also,
> I assume you're willing to assign copyright to the
> FSF for the change?  (I can send you copies of the
> paperwork, if so.)
>

[grep-2.14-faster-dfa-merger.diff (text/plain, attachment)]

Added tag(s) patch. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Sat, 08 Mar 2014 17:52:02 GMT) Full text and rfc822 format available.

Added tag(s) moreinfo. Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Sun, 06 Apr 2014 15:19:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-grep <at> gnu.org:
bug#15191; Package grep. (Tue, 06 May 2014 14:36:01 GMT) Full text and rfc822 format available.

Message #12 received at 15191 <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Ivan Yanikov <dobrokot <at> gmail.com>
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, 15191 <at> debbugs.gnu.org
Subject: bug#15191: faster DFA.C state merge
Date: Tue, 06 May 2014 23:34:53 +0900
[Message part 1 (text/plain, inline)]
I tested below on CentOS 5.10 Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz
GCC 4.1.2.

$ env LANG=C time -p src/grep -f regex.re input_lines.txt
Command exited with non-zero status 1
real 4.81
user 4.40
sys 0.09

I apply an attachment, and tested again.

$ env LANG=C time -p src/grep -f regex.re input_lines.txt
Command exited with non-zero status 1
real 4.63
user 4.46
sys 0.10

Therefore, I don't think that the patch is effective.
[diff.txt (text/plain, attachment)]

Information forwarded to bug-grep <at> gnu.org:
bug#15191; Package grep. (Wed, 07 May 2014 14:52:02 GMT) Full text and rfc822 format available.

Message #15 received at 15191 <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: 15191 <at> debbugs.gnu.org
Cc: Paul Eggert <eggert <at> cs.ucla.edu>
Subject: bug#15191: faster DFA.C state merge
Date: Wed, 07 May 2014 23:51:20 +0900
[Message part 1 (text/plain, inline)]
As far as I examined a cause of slowness, it seems that it's caused by
not state merge but dfamust.

Although I'm suspicious of memory allocations in dfamust, I don't still
have any ideas to fix them.

I tried an attachment, but it became more and more slowdown.
[0001-dfa-efficient-memory-allocation.patch (text/plain, attachment)]

Severity set to 'wishlist' from 'normal' Request was from Paul Eggert <eggert <at> cs.ucla.edu> to control <at> debbugs.gnu.org. (Sat, 10 May 2014 20:49:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-grep <at> gnu.org:
bug#15191; Package grep. (Sat, 17 May 2014 02:04:02 GMT) Full text and rfc822 format available.

Message #20 received at 15191 <at> debbugs.gnu.org (full text, mbox):

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: 15191 <at> debbugs.gnu.org
Cc: Paul Eggert <eggert <at> cs.ucla.edu>, Ivan Yanikov <dobrokot <at> gmail.com>
Subject: Re: bug#15191: faster DFA.C state merge
Date: Sat, 17 May 2014 11:03:13 +0900
[Message part 1 (text/plain, inline)]
I tried it on grep-2.14, and I confirmed that the patch is effective
certainly.  However, I see that it's fixed in bug#17377.

$ env LC_ALL=C time -p ../grep-2.19/src/grep -E -f regex.re input_lines.txt

grep-2.14
  before: real 7.66  user 7.56  sys 0.09
  after : real 2.06  user 1.96  sys 0.09

grep-2.18.146-ebf3
  before: real 1.59  user 1.50  sys 0.08
  after : real 1.99  user 1.86  sys 0.12

If my rewriting of the patch is right, it causes slowdown rather by
building heap.

Norihiro
[dfa_faster_state_merge.diff (text/plain, attachment)]

Reply sent to Paul Eggert <eggert <at> cs.ucla.edu>:
You have taken responsibility. (Sat, 17 May 2014 03:26:02 GMT) Full text and rfc822 format available.

Notification sent to Ivan Yanikov <dobrokot <at> gmail.com>:
bug acknowledged by developer. (Sat, 17 May 2014 03:26:02 GMT) Full text and rfc822 format available.

Message #25 received at 15191-done <at> debbugs.gnu.org (full text, mbox):

From: Paul Eggert <eggert <at> cs.ucla.edu>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>, 15191-done <at> debbugs.gnu.org
Cc: Ivan Yanikov <dobrokot <at> gmail.com>
Subject: Re: bug#15191: faster DFA.C state merge
Date: Fri, 16 May 2014 20:24:54 -0700
Norihiro Tanaka wrote:
> If my rewriting of the patch is right, it causes slowdown rather by
> building heap.

Thanks for checking it.  I'll close this bug report, since it seems to 
have been fixed in a different way.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 14 Jun 2014 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 11 years and 11 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.