GNU bug report logs -
#24161
[PATCH 2/2] sed: speed up matching by reguler expression with dfa matcher
Previous Next
Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Date: Fri, 5 Aug 2016 14:05:01 UTC
Severity: normal
Tags: patch
Done: Jim Meyering <jim <at> meyering.net>
Bug is archived. No further changes may be made.
Full log
View this message in rfc822 format
[Message part 1 (text/plain, inline)]
Hi,
We can speeds up sed by using dfa matcher brought from grep. gawk users
it, sed does not uses it yet. It will speed up matching for typical
cases.
$ yes $(printf %040d 0) | head -1000000 >k
Before:
]$ time -p env LC_ALL=C sed/sed -ne /000000000k/p k
real 3.04
user 2.99
sys 0.03
$ time -p env LC_ALL=en_US.utf8 sed/sed -ne /000000000k/p k
real 3.04
user 2.90
sys 0.06
$ time -p env LC_ALL=ja_JP.eucjp sed/sed -ne /000000000k/p k
real 7.09
user 6.77
sys 0.31
After patching:
$ time -p env LC_ALL=C sed/sed -ne /000000000k/p k
real 0.29
user 0.15
sys 0.10
$ time -p env LC_ALL=en_US.utf8 sed/sed -ne /000000000k/p k
real 0.27
user 0.25
sys 0.02
$ time -p env LC_ALL=ja_JP.eucjp sed/sed -ne /000000000k/p k
real 0.33
user 0.29
sys 0.03
I believe that this patch can greatly improve performance of matching by
sed, however I worry about the maintenance as updates for dfa is always
done in grep.
Thanks,
Norihiro
[0002-sed-speed-up-matching-by-reguler-expression-with-dfa.patch (text/plain, attachment)]
This bug report was last modified 8 years and 284 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.