GNU bug report logs -
#32750
[PATCH 2/2] dfa: optmization of alternation in NFA
Previous Next
Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Date: Mon, 17 Sep 2018 14:16:02 UTC
Severity: normal
Tags: patch
Done: Paul Eggert <eggert <at> cs.ucla.edu>
Bug is archived. No further changes may be made.
Full log
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Hi,
Even when similer states exists in alternation, DFA treats them as
separate items. It may complicate the transition in NFA and cause
slowdown. This change assembles the states into one.
For example, ab|ac is changed into a(b|c).
This change speeds-up matching for many branched pattern. For
example, grep speeds-up more than 30x in following case.
seq 10000 | sed 's/$/ abcdefghijklmnopqrstuvwxyz/; s/$/./' >in
time -p env LC_ALL=C grep -vf in in
(before) real 63.43 user 61.67 system 1.65
(after) real 1.64 user 1.61 system 0.03
# If we do not add '.' at last in pattern, not dfa but kwset is used.
grep also speeds-up about 3x in following case.
time -p env LC_ALL=C grep -vf /usr/share/dict/linux.words /usr/share/dict/linux.words
(before) real 2.48 user 2.09 system 0.38
(after) real 7.69 user 6.32 system 1.29
Thanks,
Norihiro
[0001-dfa-simplify-initial-state.patch (text/plain, attachment)]
[0002-dfa-optmization-of-alternation-in-NFA.patch (text/plain, attachment)]
This bug report was last modified 6 years and 251 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.