GNU bug report logs - #24161
[PATCH 2/2] sed: speed up matching by reguler expression with dfa matcher

Package: sed

Reported by: Norihiro Tanaka <noritnk <at> kcn.ne.jp>

Date: Fri, 5 Aug 2016 14:05:01 UTC

Severity: normal

Tags: patch

Done: Jim Meyering <jim <at> meyering.net>

Bug is archived. No further changes may be made.


From: help-debbugs <at> gnu.org (GNU bug Tracking System)
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Subject: bug#24161: closed (Re: bug#24161: [PATCH 2/2] sed: speed up
 matching by reguler expression with dfa matcher)
Date: Tue, 09 Aug 2016 00:54:02 +0000
[Message part 1 (text/plain, inline)]
Your bug report

#24161: [PATCH 2/2] sed: speed up matching by reguler expression with dfa matcher

which was filed against the sed package, has been closed.

The explanation is attached below, along with your original report.
If you require more details, please reply to 24161 <at> debbugs.gnu.org.

-- 
24161: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=24161
GNU Bug Tracking System
Contact help-debbugs <at> gnu.org with problems
[Message part 2 (message/rfc822, inline)]
From: Jim Meyering <jim <at> meyering.net>
To: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
Cc: 24161-done <at> debbugs.gnu.org, Assaf Gordon <assafgordon <at> gmail.com>
Subject: Re: bug#24161: [PATCH 2/2] sed: speed up matching by reguler
 expression with dfa matcher
Date: Mon, 8 Aug 2016 17:53:04 -0700
On Mon, Aug 8, 2016 at 4:29 PM, Norihiro Tanaka <noritnk <at> kcn.ne.jp> wrote:
>
> On Sat, 6 Aug 2016 19:20:22 -0700
> Jim Meyering <jim <at> meyering.net> wrote:
>
>> Thanks again.
>> I've revised it as follows and expect to push tomorrow:
>>   - remove the abort and comment from dfaerror -- it should not be
>> necessary, given the _Noreturn attribute.
>>   - adjusted commit log and NEWS entry, also moving the "Improvements"
>> section to the top
>>   - sorted source file names in local.mk (they were not sorted before, either)
>>   - added the "make syntax-check"-required mention of sed/dfa.c in
>> po/POTFILES.in
>
> Thanks for adjusting.  I agree with all the changes.

Pushed.

[Message part 3 (message/rfc822, inline)]
From: Norihiro Tanaka <noritnk <at> kcn.ne.jp>
To: <bug-sed <at> gnu.org>
Subject: [PATCH 2/2] sed: speed up matching by reguler expression with dfa
 matcher
Date: Fri, 05 Aug 2016 23:03:26 +0900
[Message part 4 (text/plain, inline)]
Hi,

We can speed up sed by using the dfa matcher brought from grep.  gawk
uses it, but sed does not use it yet.  It will speed up matching for
typical cases.

$ yes $(printf %040d 0) | head -1000000 >k

Before:

$ time -p env LC_ALL=C sed/sed -ne /000000000k/p k
real 3.04
user 2.99
sys 0.03
$ time -p env LC_ALL=en_US.utf8 sed/sed -ne /000000000k/p k
real 3.04
user 2.90
sys 0.06
$ time -p env LC_ALL=ja_JP.eucjp sed/sed -ne /000000000k/p k
real 7.09
user 6.77
sys 0.31

After patching:

$ time -p env LC_ALL=C sed/sed -ne /000000000k/p k
real 0.29
user 0.15
sys 0.10
$ time -p env LC_ALL=en_US.utf8 sed/sed -ne /000000000k/p k
real 0.27
user 0.25
sys 0.02
$ time -p env LC_ALL=ja_JP.eucjp sed/sed -ne /000000000k/p k
real 0.33
user 0.29
sys 0.03
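
As a rough check of the figures above, the speedup per locale can be computed from the reported "real" times. This is only an illustrative sketch: the `speedup` helper is a hypothetical name introduced here, not part of the patch or of sed, and the inputs are simply the numbers quoted in this report.

```shell
# Hypothetical helper (not part of sed or the patch): ratio of the
# "real" time before the patch to the "real" time after it.
speedup() {
  # $1 = seconds before, $2 = seconds after; prints the ratio to one decimal
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", before / after }'
}

# Timings copied from the report above.
speedup 3.04 0.29   # LC_ALL=C            -> 10.5
speedup 3.04 0.27   # LC_ALL=en_US.utf8   -> 11.3
speedup 7.09 0.33   # LC_ALL=ja_JP.eucjp  -> 21.5
```

On these numbers the patched binary is roughly 10 to 11 times faster in the C and UTF-8 locales and about 21 times faster in ja_JP.eucjp, for this one input and pattern.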

I believe that this patch can greatly improve the performance of matching
in sed; however, I worry about maintenance, as updates to dfa are always
made in grep.

Thanks,
Norihiro
[0002-sed-speed-up-matching-by-reguler-expression-with-dfa.patch (text/plain, attachment)]

This bug report was last modified 8 years and 340 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.