GNU bug report logs - #17350
[PATCH] grep: speed up for a case to repeat failure in DFA after success in kwset

Package: grep;

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Sat, 26 Apr 2014 11:27:02 UTC

Severity: normal

Tags: patch

Done: Paul Eggert <eggert <at> cs.ucla.edu>

Bug is archived. No further changes may be made.

From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: Paul Eggert <eggert <at> cs.ucla.edu>
Cc: 17350-done <at> debbugs.gnu.org
Subject: bug#17350: [PATCH] grep: speed up for a case to repeat failure in DFA after success in kwset
Date: Sun, 27 Apr 2014 20:54:09 +0900
Norihiro Tanaka wrote:
> By the way, I noticed another bug introduced by my previous patch.  If
> `kwsm.index < kwset_exact_matches', the DFA does not have to run over
> the whole buffer.

I found an issue in the patch.

If the DFA fails after kwset succeeds, the search does not return to
kwset until it reaches the end of the buffer or finds a match.  As a
result, although some cases speed up, there is also a case that slows
down.

A speeds up, but B slows down.

(A)
  yes abcdabc | head -50000000 >k
  env LC_ALL=C time -p src/grep abcd.bd k

(B)
  yes "abcdabc
    $(yes jjjjjjj | head -99)" | head -50000000 >k
  env LC_ALL=C time -p src/grep abcd.bd k

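In A, kwset doesn't help at all and is actually harmful: every line
contains the keyword, so every kwset hit is followed by a DFA failure.
OTOH, in B, kwset works effectively, because it skips the many lines
that do not contain the keyword.

I considered only case A, but it is also necessary to make sure that B
does not slow down.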

I wrote a patch against master that returns to kwset after checking
about 30 lines with the DFA.  The value `30' is based on test results.

However, I don't like this patch much, since the basis for 30 is weak...
Does anyone have any good ideas?

Norihiro
[patch.txt (text/plain, attachment)]
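
The attachment is not reproduced here.  As a rough illustration of the
heuristic described in the message (and not the code from patch.txt),
the following C sketch switches from a fast keyword search to a slower
full matcher after a keyword hit, and returns to the keyword search once
a fixed number of consecutive lines have failed.  Every name in it
(`hybrid_search', `full_match', `fallback_limit', `struct search_stats')
is hypothetical.  It also assumes a NUL-terminated buffer, uses strstr
as a stand-in for kwset, and uses a hand-written matcher for `abcd.bd'
as a stand-in for the DFA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counters, only here to compare strategies.  */
struct search_stats
{
  size_t keyword_scans;  /* times the fast keyword search was started */
  size_t full_checks;    /* lines examined by the slow full matcher */
};

/* Stand-in for the DFA: true if LINE (LEN bytes, no newline)
   contains a match for the regex abcd.bd.  */
bool
full_match (const char *line, size_t len)
{
  for (size_t i = 0; i + 7 <= len; i++)
    if (memcmp (line + i, "abcd", 4) == 0
        && memcmp (line + i + 5, "bd", 2) == 0)
      return true;
  return false;
}

/* Return the start of the first line in BUF matching abcd.bd, or NULL.
   After a keyword hit, check up to FALLBACK_LIMIT consecutive lines
   with the full matcher, then go back to the keyword search.  */
const char *
hybrid_search (const char *buf, size_t fallback_limit,
               struct search_stats *st)
{
  assert (fallback_limit > 0);
  const char *end = buf + strlen (buf);
  const char *pos = buf;  /* always at the start of a line */

  while (pos < end)
    {
      /* Fast path (stand-in for kwset): find the next keyword hit.  */
      st->keyword_scans++;
      const char *hit = strstr (pos, "abcd");
      if (!hit)
        return NULL;

      /* Back up to the beginning of the line containing the hit.  */
      const char *line = hit;
      while (line > buf && line[-1] != '\n')
        line--;

      /* Slow path: full matcher, line by line, for a bounded run.  */
      for (size_t n = 0; n < fallback_limit && line < end; n++)
        {
          const char *eol = memchr (line, '\n', (size_t) (end - line));
          size_t len = (size_t) ((eol ? eol : end) - line);
          st->full_checks++;
          if (full_match (line, len))
            return line;
          line = eol ? eol + 1 : end;
        }
      pos = line;
    }
  return NULL;
}
```

In this sketch, a `fallback_limit' of 1 goes back to the keyword search
after every failed line, a very large value stays in the full matcher
until the end of the buffer, and 30 is the compromise proposed above.
On input like A the larger limits save keyword scans that always hit;
on input like B they cost extra full-matcher checks on lines that the
keyword search would have skipped.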

This bug report was last modified 11 years and 19 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.