GNU bug report logs - #24161
[PATCH 2/2] sed: speed up matching by reguler expression with dfa matcher

Package: sed;

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Fri, 5 Aug 2016 14:05:01 UTC

Severity: normal

Tags: patch

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Jim Meyering <jim <at> meyering.net>
Cc: tracker <at> debbugs.gnu.org
Subject: bug#24161: closed ([PATCH 2/2] sed: speed up matching by reguler
 expression with dfa matcher)
Date: Tue, 09 Aug 2016 00:54:01 +0000
[Message part 1 (text/plain, inline)]
Your message dated Mon, 8 Aug 2016 17:53:04 -0700
with message-id <CA+8g5KGRY_Ya9HnvDQbRKUa-7bM_fc3pNGTOo=ByHaYP=xs3Ag <at> mail.gmail.com>
and subject line Re: bug#24161: [PATCH 2/2] sed: speed up matching by reguler expression with dfa matcher
has caused the debbugs.gnu.org bug report #24161,
regarding [PATCH 2/2] sed: speed up matching by reguler expression with dfa matcher
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs <at> gnu.org.)


-- 
24161: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=24161
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: <bug-sed <at> gnu.org>
Subject: [PATCH 2/2] sed: speed up matching by reguler expression with dfa
 matcher
Date: Fri, 05 Aug 2016 23:03:26 +0900
[Message part 3 (text/plain, inline)]
Hi,

We can speed up sed by using the dfa matcher brought over from grep.  gawk
uses it, but sed does not use it yet.  It will speed up matching for
typical cases.

$ yes $(printf %040d 0) | head -1000000 >k

Before:

$ time -p env LC_ALL=C sed/sed -ne /000000000k/p k
real 3.04
user 2.99
sys 0.03
$ time -p env LC_ALL=en_US.utf8 sed/sed -ne /000000000k/p k
real 3.04
user 2.90
sys 0.06
$ time -p env LC_ALL=ja_JP.eucjp sed/sed -ne /000000000k/p k
real 7.09
user 6.77
sys 0.31

After patching:

$ time -p env LC_ALL=C sed/sed -ne /000000000k/p k
real 0.29
user 0.15
sys 0.10
$ time -p env LC_ALL=en_US.utf8 sed/sed -ne /000000000k/p k
real 0.27
user 0.25
sys 0.02
$ time -p env LC_ALL=ja_JP.eucjp sed/sed -ne /000000000k/p k
real 0.33
user 0.29
sys 0.03
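
The comparison above can be re-run with a small script along the following lines. This is only a sketch, not part of the original report: the `SED` and `COUNT` variables are names introduced here for illustration, and it assumes the three locales are installed (otherwise the timings will not reflect them). It also assumes a shell in which `time -p` is available, such as bash.

```shell
#!/bin/sh
# Sketch of the benchmark from the message above.
# SED and COUNT are illustrative names, not part of the original report.
SED=${SED:-sed}          # e.g. SED=sed/sed to test a freshly built tree
COUNT=${COUNT:-1000000}  # number of input lines

# Generate COUNT lines of forty zeros, as in the original command.
yes "$(printf %040d 0)" | head -n "$COUNT" >k

# The pattern ends in "k", which never occurs in the input, so no line
# matches and nothing is printed; only the matching time is measured.
for loc in C en_US.utf8 ja_JP.eucjp; do
  echo "LC_ALL=$loc"
  time -p env LC_ALL="$loc" "$SED" -ne /000000000k/p k
done
```

Running it once with an unpatched binary and once with a patched one (for example `SED=sed/sed sh bench.sh`, where `bench.sh` is a hypothetical file name) gives the two sets of numbers to compare.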

I believe this patch can greatly improve sed's matching performance;
however, I worry about maintenance, as updates to dfa are always made
in grep.

Thanks,
Norihiro
[0002-sed-speed-up-matching-by-reguler-expression-with-dfa.patch (text/plain, attachment)]
[Message part 5 (message/rfc822, inline)]
From: Jim Meyering <jim <at> meyering.net>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 24161-done <at> debbugs.gnu.org, Assaf Gordon <assafgordon <at> gmail.com>
Subject: Re: bug#24161: [PATCH 2/2] sed: speed up matching by reguler
 expression with dfa matcher
Date: Mon, 8 Aug 2016 17:53:04 -0700
On Mon, Aug 8, 2016 at 4:29 PM, Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
>
> On Sat, 6 Aug 2016 19:20:22 -0700
> Jim Meyering <jim <at> meyering.net> wrote:
>
>> Thanks again.
>> I've revised it as follows and expect to push tomorrow:
>>   - remove the abort and comment from dfaerror -- it should not be
>> necessary, given the _Noreturn attribute.
>>   - adjusted commit log and NEWS entry, also moving the "Improvements"
>> section to the top
>>   - sorted source file names in local.mk (they were not sorted before, either)
>>   - added the "make syntax-check"-required mention of sed/dfa.c in
>> po/POTFILES.in
>
> Thanks for adjusting.  I agree with all the changes.

Pushed.


This bug report was last modified 8 years and 284 days ago.

