GNU bug report logs - #24161
[PATCH 2/2] sed: speed up matching by regular expression with dfa matcher
Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Date: Fri, 5 Aug 2016 14:05:01 UTC
Severity: normal
Tags: patch
Done: Jim Meyering <jim <at> meyering.net>
Bug is archived. No further changes may be made.
[Message part 1 (text/plain, inline)]
Your message dated Mon, 8 Aug 2016 17:53:04 -0700
with message-id <CA+8g5KGRY_Ya9HnvDQbRKUa-7bM_fc3pNGTOo=ByHaYP=xs3Ag <at> mail.gmail.com>
and subject line Re: bug#24161: [PATCH 2/2] sed: speed up matching by regular expression with dfa matcher
has caused the debbugs.gnu.org bug report #24161,
regarding [PATCH 2/2] sed: speed up matching by regular expression with dfa matcher
to be marked as done.
(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)
--
24161: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=24161
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
[Message part 3 (text/plain, inline)]
Hi,
We can speed up sed by using the dfa matcher brought over from grep. gawk
already uses it, but sed does not use it yet. It will speed up matching in
typical cases.
$ yes $(printf %040d 0) | head -1000000 >k
Before:
$ time -p env LC_ALL=C sed/sed -ne /000000000k/p k
real 3.04
user 2.99
sys 0.03
$ time -p env LC_ALL=en_US.utf8 sed/sed -ne /000000000k/p k
real 3.04
user 2.90
sys 0.06
$ time -p env LC_ALL=ja_JP.eucjp sed/sed -ne /000000000k/p k
real 7.09
user 6.77
sys 0.31
After patching:
$ time -p env LC_ALL=C sed/sed -ne /000000000k/p k
real 0.29
user 0.15
sys 0.10
$ time -p env LC_ALL=en_US.utf8 sed/sed -ne /000000000k/p k
real 0.27
user 0.25
sys 0.02
$ time -p env LC_ALL=ja_JP.eucjp sed/sed -ne /000000000k/p k
real 0.33
user 0.29
sys 0.03
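The measurements above can be repeated with a small helper like the one below. It is only a sketch: the function names and the SED variable are placeholders and not part of the patch, and make_input uses awk to write the same forty-zero lines that the yes/head pipeline in the report produces. It assumes bash, where time is a shell keyword that accepts -p.

```shell
#!/bin/bash
# Sketch for repeating the timings in this report.
# SED, make_input and bench_locale are illustrative names, not part of the patch.

SED=${SED:-sed}   # set SED=sed/sed to time a freshly built binary

# Write N lines of forty zeros to FILE (same content as the
# "yes $(printf %040d 0) | head -1000000" pipeline in the report).
make_input() {
    awk -v n="$1" -v line="$(printf %040d 0)" \
        'BEGIN { for (i = 0; i < n; i++) print line }' > "$2"
}

# Run the non-matching search from the report under one locale.
# The timing from "time -p" goes to stderr; matched lines (none are
# expected, since no input line contains "k") go to stdout.
bench_locale() {
    echo "LC_ALL=$1"
    time -p env LC_ALL="$1" "$SED" -ne /000000000k/p "$2"
}
```

With these functions loaded, "make_input 1000000 k" followed by "for loc in C en_US.utf8 ja_JP.eucjp; do bench_locale "$loc" k; done" runs the three cases; the locales must be installed for the multibyte results to be meaningful.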
I believe that this patch can greatly improve the performance of matching in
sed. However, I worry about maintenance, as updates to dfa are always
made in grep.
Thanks,
Norihiro
[0002-sed-speed-up-matching-by-reguler-expression-with-dfa.patch (text/plain, attachment)]
[Message part 5 (message/rfc822, inline)]
On Mon, Aug 8, 2016 at 4:29 PM, Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
>
> On Sat, 6 Aug 2016 19:20:22 -0700
> Jim Meyering <jim <at> meyering.net> wrote:
>
>> Thanks again.
>> I've revised it as follows and expect to push tomorrow:
>> - remove the abort and comment from dfaerror -- they should not be
>> necessary, given the _Noreturn attribute.
>> - adjusted commit log and NEWS entry, also moving the "Improvements"
>> section to the top
>> - sorted source file names in local.mk (they were not sorted before, either)
>> - added the "make syntax-check"-required mention of sed/dfa.c in
>> po/POTFILES.in
>
> Thanks for adjusting. I agree with all the changes.
Pushed.